The US Dept. of Energy Joint Genome Institute Uses Docker to Deliver Better Science

Background

The mission of the US Dept. of Energy Joint Genome Institute (DOE JGI) is to advance genomics in support of DOE missions related to clean energy generation and environmental characterization and cleanup. Its key mission areas are bioenergy, the carbon cycle, and biogeochemistry.


Challenges

DOE JGI is at the forefront of large-scale, sequence-based science and is responsible for the QA/QC of sequencing data as it is generated by distributed production facilities. With many assemblers in use, and many previous projects that were publication based, the institute faced the challenge that sequencing and assembly results were open to subjectivity and bias. It needed to develop pipelines that address the lack of standardization in continually sequencing and assembling large volumes of genome data, so that the data produced and released to the community is of high quality and meets specific standards. Its existing processes were wasteful and inefficient: assembly software was difficult and time consuming to set up and get running, which made it impossible to objectively compare the quality and performance of the assemblers.


Solution

With Docker containers, DOE JGI was able to improve on the assembler project approach by running benchmarks so that assemblers can be objectively evaluated, and by crowdsourcing the assemblers from the bioinformatics community. Using Linux containers running on Docker, each genome assembler and its associated pipeline was built into a Docker image and hosted on Docker Hub. The benchmarking pipeline pulls the image and runs it against an array of reference data sets. The resulting assembly is then evaluated against the reference sequence using QUAST, a quality assessment tool for genome assemblies, and the assembly metrics and results are posted on the site. By simplifying the technology and automating processes, researchers gained more time for science and improved its quality. Docker containerization provided a standardized pipeline with consistent APIs, and even a catalog, leading to objective comparison of the tools and the results. Researchers can now have a data-driven conversation and easily share assemblers, results, and data with each other, leading to better science.
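
The pull, run, and evaluate loop described above can be sketched roughly as follows. This is an illustrative outline only, not JGI's actual pipeline. The image name, the /input and /output mount points, the contigs.fa output file name, and the Dataset structure are all assumptions made for the example. Only the basic `docker pull`, `docker run --rm -v`, and `quast.py ... -r ... -o ...` invocations are taken from the tools themselves. A Linux host is assumed.

```python
import subprocess
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Callable, List

Command = List[str]


@dataclass(frozen=True)
class Dataset:
    """Hypothetical description of one reference data set."""
    name: str
    reads_dir: PurePosixPath   # directory holding the sequencing reads
    reference: PurePosixPath   # reference genome used by QUAST


def pull_command(image: str) -> Command:
    """Fetch the assembler image from the registry (e.g. Docker Hub)."""
    return ["docker", "pull", image]


def assemble_command(image: str, reads_dir: PurePosixPath,
                     out_dir: PurePosixPath) -> Command:
    """Run the assembler container.

    Assumed container contract (hypothetical): reads are mounted
    read-only at /input and the assembly is written to /output.
    """
    return [
        "docker", "run", "--rm",
        "-v", f"{reads_dir}:/input:ro",
        "-v", f"{out_dir}:/output",
        image,
    ]


def quast_command(contigs: PurePosixPath, reference: PurePosixPath,
                  report_dir: PurePosixPath) -> Command:
    """Evaluate an assembly against its reference sequence with QUAST."""
    return ["quast.py", str(contigs), "-r", str(reference),
            "-o", str(report_dir)]


def benchmark_plan(image: str, datasets: List[Dataset],
                   work_dir: PurePosixPath) -> List[Command]:
    """Build the ordered commands: pull once, then assemble and
    evaluate each reference data set."""
    plan = [pull_command(image)]
    for ds in datasets:
        out_dir = work_dir / ds.name / "assembly"
        report_dir = work_dir / ds.name / "quast"
        plan.append(assemble_command(image, ds.reads_dir, out_dir))
        # "contigs.fa" is an assumed output file name.
        plan.append(quast_command(out_dir / "contigs.fa",
                                  ds.reference, report_dir))
    return plan


def run_plan(plan: List[Command],
             runner: Callable[..., object] = subprocess.run) -> None:
    """Execute each command in order, stopping on the first failure."""
    for cmd in plan:
        runner(cmd, check=True)
```

Building the command list separately from executing it keeps the sketch easy to inspect: the plan can be printed or logged for review before anything is run, and the same plan applies unchanged to any assembler image that follows the assumed mount-point contract.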

