The Microbiome Quality Control project


Teams participating in the MBQC signed up to provide one or more assessment modules. Only two modules were evaluated in the MBQC-baseline phase, although many more (for sample acquisition, extraction, and so forth) will be tested during future scale-up of the project. Each module is designed to interchangeably evaluate one step in a typical human microbiome study. Modules have specifically defined inputs and outputs, so as to be "plug and play" with each other, but are otherwise black boxes - participants choose (and document) how their labs perform each step. All interactions between modules are brokered by the MBQC to ensure that transfers are blinded and that the space of potential sources of variation is well-explored.


Protocol guidelines

  • Upon registering, each team will select how many samples they would like to process, in multiples of 96-sample sets.
  • After registering, each team will submit to the MBQC the lab-specific protocol they plan on following for each selected module.
  • All interactions between modules will be brokered by the MBQC. This includes physical sample transfers and data transfers.
  • After carrying out a module, each team will submit to the MBQC the products (for the pilot, sequencing data or OTUs) for transfer to the following module(s), along with any protocol modifications that occurred during module execution.
  • The MBQC will provide each sample and data product with a unique, anonymized identifier of the form "DZ#####". This identifier will track which teams and protocols have been applied to the sample as it moves between modules. This information will be blinded to participants until the completion of each MBQC phase.
  • All sample types, metadata, and phenotypes will remain blinded to participants until consolidated data release at the end of the phase.
  • The association of participating labs with lab-specific results will remain anonymous unless individual labs choose to unblind their identifier during registration or subsequently.
  • All data produced by any MBQC modules will be made publicly available in conjunction with consortium publications.
  • All protocols submitted to the MBQC will be made publicly available (with full attribution).
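
As a rough illustration of the identifier format described above, the following minimal sketch checks labels against the "DZ#####" pattern. It assumes that each "#" stands for exactly one decimal digit, which the guidelines do not state explicitly; the function names are hypothetical and not part of any MBQC tooling.

```python
import re

# Assumption: each "#" in "DZ#####" is a single decimal digit (five in total).
MBQC_ID_PATTERN = re.compile(r"DZ[0-9]{5}")


def is_mbqc_id(label: str) -> bool:
    """Return True if the label looks like an MBQC identifier, e.g. 'DZ01234'."""
    return MBQC_ID_PATTERN.fullmatch(label.strip()) is not None


def find_invalid_ids(labels):
    """Return the labels that do not match the assumed identifier pattern."""
    return [label for label in labels if not is_mbqc_id(label)]
```

A lab could run a check like this over the identifier column of its sample manifest before submitting products, to catch transcription errors early.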

During the MBQC-baseline phase, protocol requirements are flexible - given the pilot's minimal resources, we expect labs to follow, for the most part, whatever procedures are already in place for sample handling. These must conform to the hard guidelines for module input and output interoperability, however, and must record at least the variables listed for each module below.

MBQC-baseline dates

The MBQC-baseline phase followed this timeline:


  • Registration closed end-of-day September 27th, 2013.
  • Samples were shipped on and around March 28th, 2014.
  • Handling results (generated sequences) were deposited as outputs with the MBQC by May 16th, 2014.
  • Sequences were provided by MBQC as inputs for bioinformatics processing on and around July 31st, 2014.
  • Bioinformatics results (OTU tables) were deposited as outputs with the MBQC by August 31st, 2014.
  • Consolidated results were returned to participants on and around September 19th, 2014.
  • The MBQC Workshop took place in Washington, DC from September 30th-October 1st, 2014.

In addition to our analysis publication, a meeting report from the MBQC Workshop was published in December 2015.

Samples

Inputs: none
Outputs: stool and DNA specimens isolated for shipping in sets of 94 Fisher 1.2 mL Cryovial (Product Code 12567500) tubes

  • All provided samples are OHSR-approved for redistribution and analysis at other sites as non-human-subjects research.
  • All provided samples are diagnostic specimens with no reason to believe that specific pathogens are contained within them, to be handled using IATA universal protection. The nomenclature for non-human subjects IATA universal protection is: "Diagnostic specimens, biological substance, category B UN 3373, human origin."
  • Samples will be shipped in Fisher 1.2 mL Cryovial (Product Code 12567500) tubes in 96-segment cardboard boxes on dry ice.
  • Sample labels will be of the form "DZ#####" and will be accompanied by a 2D barcode for labs with appropriate readers. Sample tracking during handling within labs can use lab-specific identifiers if desired (see Handling).
  • Samples will be accompanied by an Excel manifest with which sample handling must be tracked.
  • Participating labs will be blinded to sample collection methods; sample types to be provided will comprise frozen stool, freeze-dried stool, and extracted DNA.
  • Frozen stool will be shipped dry. Freeze-dried stool will be shipped in 1 mL sterile EB buffer (Qiagen Buffer EB, Qiagen 19086). Extracted DNA will be shipped in 10 ul aliquots of 10 mM Tris.
  • Samples will be shipped in two separate cardboard boxes, one for frozen and freeze-dried stool, one for extracted DNA.
  • The pre-storage transport time (hours/days/etc.) for each sample will be tracked for later analysis.
  • The storage time (days/years/etc.) for each sample will be tracked for later analysis.
  • The freeze/thaw cycling of each sample will be tracked for later analysis.

Handling

Inputs: stool and DNA specimens isolated for shipping in sets of 94 Fisher 1.2 mL Cryovial (Product Code 12567500) tubes
Outputs: demultiplexed Illumina FASTQ files, one (SE) or two (PE) per sample

In order to simplify lab participation during the MBQC-pilot, all sub-modules of sample handling can be carried out using any protocol of choice (e.g. one already in use within an individual lab). We recommend the public protocols listed on this page. However, labs that choose to execute custom protocols must ensure that these protocols are:

  • Submitted to the MBQC (in free-text prose) before participation, to be made publicly available (with attribution).
  • Documented with at least the variables specified below for each sub-module.
  • Resubmitted after participation with any modifications that occurred during execution.

To facilitate recording detailed protocol information for later analysis of inter-lab variability, please use the following template for sample handling variables:

DNA extraction

  • Extraction kit manufacturer, model, and batch.
  • Vortex/homogenization time.
  • Homogenization manufacturer, model, kit, and batch.
  • Extracted DNA concentrations (ng/ul).

16S amplification

  • Tube manufacturer(s) and model.
  • Lab-internal transport processes, time, and sample identifiers.
  • Sequencing barcode nucleotide sequences.
  • 16S primer nucleotide sequences.
  • All reagent manufacturers, kits, and batches.
  • Extracted DNA concentrations (ng/ul).

Sequencing

  • Tube manufacturer(s) and model.
  • Lab-internal transport processes, time, and sample identifiers.
  • Sequencing platform hardware manufacturer and model; note that the MBQC-pilot phase can only accommodate Illumina MiSeq or HiSeq sequencing, although participants are welcome to sequence the same samples using additional platforms.
  • Sequencing platform chemistry, kit(s), batch, and target read length.
  • Sequencing software version and/or date.
  • Sequencing software parameters.

Bioinformatics

Inputs: demultiplexed Illumina FASTQ files, one (SE) or two (PE) per sample
Outputs: one open-reference OTU table in BIOM v1 format using Greengenes 13.5 (May 2013) OTU identifiers, one phylogenetic tree relating the OTUs in Newick format
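
For orientation, BIOM v1 tables are JSON documents, so a basic structural check needs only a JSON parser. The sketch below is a hedged illustration rather than an MBQC tool: the list of required top-level keys reflects our reading of the BIOM 1.0 specification and should be confirmed against it, and the function name, OTU identifiers, and pipeline name are hypothetical placeholders.

```python
import json

# Top-level keys that the BIOM 1.0 (JSON) format is understood to require.
# Assumption: verify this list against the BIOM specification before relying on it.
REQUIRED_KEYS = {
    "id", "format", "format_url", "type", "generated_by", "date",
    "rows", "columns", "matrix_type", "matrix_element_type", "shape", "data",
}


def check_biom_v1(table: dict) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    missing = REQUIRED_KEYS - table.keys()
    if missing:
        return ["missing keys: " + ", ".join(sorted(missing))]
    problems = []
    n_rows, n_cols = table["shape"]
    if table["type"] != "OTU table":
        problems.append("type is not 'OTU table'")
    if len(table["rows"]) != n_rows:
        problems.append("row count does not match shape")
    if len(table["columns"]) != n_cols:
        problems.append("column count does not match shape")
    return problems


# A tiny hand-written table: two placeholder OTUs observed in one sample.
example = json.loads("""
{
  "id": null,
  "format": "Biological Observation Matrix 1.0.0",
  "format_url": "http://biom-format.org",
  "type": "OTU table",
  "generated_by": "hypothetical-pipeline 0.1",
  "date": "2014-08-31T00:00:00",
  "rows": [{"id": "OTU_A", "metadata": null}, {"id": "OTU_B", "metadata": null}],
  "columns": [{"id": "DZ00001", "metadata": null}],
  "matrix_type": "sparse",
  "matrix_element_type": "int",
  "shape": [2, 1],
  "data": [[0, 0, 5], [1, 0, 3]]
}
""")
```

In a real submission the row identifiers would be Greengenes 13.5 OTU identifiers (plus any new OTUs from open-reference picking), and the column identifiers would be the blinded "DZ#####" sample labels.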

To facilitate recording detailed protocol information for later analysis of inter-lab variability, please use the following template for bioinformatics variables:

  • All software providers and versions and/or dates.
  • All software parameters and command line arguments.
  • When appropriate, a control file or batch script including all commands used to perform analysis; alternatively, a log of all command lines executed.
  • If using any reference taxonomy/phylogeny outside of Greengenes for open-reference OTU calling, a complete list of OTU reference sequences, taxonomy, phylogeny, and identifiers.
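
One simple way to keep "a log of all command lines executed" is to route every external command through a small wrapper that records it before running it. The following is a minimal sketch under our own assumptions: the function name `run_logged` and the default log file name `commands.log` are hypothetical, and labs may prefer the logging facilities already built into their pipelines.

```python
import shlex
import subprocess
import sys
from datetime import datetime, timezone


def run_logged(cmd, log_path="commands.log"):
    """Append the command line (with a UTC timestamp) to a log, then run it."""
    stamp = datetime.now(timezone.utc).isoformat()
    with open(log_path, "a", encoding="utf-8") as log:
        # shlex.join quotes arguments so the logged line can be re-run in a shell.
        log.write(f"{stamp}\t{shlex.join(cmd)}\n")
    # check=True raises CalledProcessError if the command exits with an error.
    return subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Example: log and run a trivial command using the current Python interpreter.
    run_logged([sys.executable, "-c", "print('analysis step placeholder')"])
```

The resulting log file, together with the software versions listed above, could be submitted as the record of bioinformatics parameters.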