Faecal genomic DNA was extracted by combined mechanical and chemical lysis using a FastPrep instrument (MP Biomedicals) and the QIAamp Fast DNA Stool Mini Kit (Qiagen), as described previously [18]. The V3-V6 region of the prokaryotic 16S rRNA gene was amplified from total DNA as previously described [7376]. Amplicons were sheared using the Covaris LE220 sonicator (Covaris, Inc., USA) and built into sequencing libraries using the GeneRead DNA Library I Core Kit (Qiagen) according to the manufacturer’s protocol. DNA libraries were multiplexed with 96 indices, pooled, and sequenced on the Illumina HiSeq 2500 using paired-end (2 × 76 bp) sequencing. Sequencing reads were demultiplexed (Illumina bcl2fastq 2.17.1.14 software) and filtered (PF = 0) before conversion to FASTQ format.

The average numbers of quality-passed sequencing reads mapping to the 16S rRNA Greengenes global rRNA database (dated May 2013; greengenes/13_5/99_otus.fasta) are given in Supplementary Table S2.

Reads were trimmed from the 3′ end to remove bases with a quality score ≤ 2, and read pairs shorter than 60 bp after trimming were discarded [7377]. Following read trimming, full-length reconstructions of the 16S rRNA amplicon (V3-V6 region) were produced from the short sequencing reads using the EMIRGE (Expectation Maximisation Iterative Reconstruction of Genes from the Environment) amplicon algorithm [7578]. The analysis method is described in the publication of Ong and colleagues [73], and the updated tools and pipeline are available at: https://github.com/CSB5/GERMS_16S_pipeline. EMIRGE uses 16S rRNA sequences from the SILVA database as templates to guide assembly of the 16S rRNA amplicon sequences [77, 78]. EMIRGE prevents chimeric sequences from mapping. The reconstructed 16S rRNA sequences (at least 99% sequence similarity) were collapsed into OTUs, and Graphmap [79] was used to map these OTUs to the Greengenes global rRNA database (dated May 2013; greengenes/13_5/99_otus.fasta) [80]. OTUs were called at various taxonomic levels of identity (species, genus, family, etc.) [81]. EMIRGE assigns abundance estimates to the reconstructed 16S rRNA sequences. The relative abundance of OTUs was determined for each sample and converted to relative abundances at various taxonomic levels (phylum, family, genus, and species).
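The trimming rule and the final aggregation step described above can be sketched in a few lines of Python. This is only an illustration, not the GERMS_16S_pipeline code: the function names are hypothetical, Phred+33 quality encoding is assumed, and a pair is assumed to be discarded when either mate falls below 60 bp (the text does not say which reading was intended).

```python
from collections import defaultdict


def trim_3prime(seq, qual, max_bad_q=2, offset=33):
    """Strip trailing bases whose Phred score is <= max_bad_q (Phred+33 assumed)."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset <= max_bad_q:
        end -= 1
    return seq[:end], qual[:end]


def keep_pair(read1, read2, min_len=60):
    """Trim both mates; return the trimmed pair, or None if either mate is too short.

    Each read is a (sequence, quality_string) tuple.
    """
    trimmed1 = trim_3prime(*read1)
    trimmed2 = trim_3prime(*read2)
    if len(trimmed1[0]) >= min_len and len(trimmed2[0]) >= min_len:
        return trimmed1, trimmed2
    return None


def collapse_to_rank(otu_abundance, otu_lineage, rank):
    """Sum OTU abundances per taxon at one rank and renormalise to fractions."""
    totals = defaultdict(float)
    for otu, abundance in otu_abundance.items():
        totals[otu_lineage[otu][rank]] += abundance
    grand_total = sum(totals.values())
    return {taxon: value / grand_total for taxon, value in totals.items()}


# Hypothetical toy data: three OTUs, two of which share a genus.
abundance = {"otu1": 0.5, "otu2": 0.25, "otu3": 0.25}
lineage = {
    "otu1": {"genus": "Bacteroides"},
    "otu2": {"genus": "Bacteroides"},
    "otu3": {"genus": "Ruminococcus"},
}
genus_table = collapse_to_rank(abundance, lineage, "genus")
```

In the toy data, the two Bacteroides OTUs are summed into a single genus-level entry, which is the same operation the text describes for phylum, family, genus, and species tables.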

DNA and RNA were extracted with the ZR Fecal DNA MicroPrep and ZR Fecal RNA MicroPrep kits (Zymo Research), respectively, following the manufacturer’s protocols. The homogenization step was performed on a FastPrep-24 instrument (MP Biomedicals). For each extraction, approximately 50–100 mg of feces was used. For RNA extractions, the optional DNase digestion step was performed to eliminate traces of contaminating DNA. Extracted nucleic acids were quantitated with Invitrogen’s Picogreen (DNA) and Ribogreen (RNA) assays prior to next-generation sequencing library preparation.

RNA library preparation was performed according to Illumina’s TruSeq Stranded mRNA protocol with the following modifications: The oligo-dT mRNA purification step was omitted and instead, 200 ng of total RNA were directly added to the Elution2-Frag-Prime step. The PCR amplification step, which selectively enriches for library fragments that have adapters ligated on both ends, was performed according to the manufacturer’s recommendation but the number of amplification cycles was reduced to 12. All libraries were dual barcoded with Illumina’s TruSeq HT RNA barcodes to enable library pooling for sequencing.

DNA library preparation was performed according to Illumina’s TruSeq Nano DNA Sample Preparation protocol. The samples were sheared on a Covaris S220 or E220 to ~ 450 bp, following the manufacturer’s recommendation. All libraries were dual barcoded with Illumina’s TruSeq HT DNA barcodes to enable library pooling for sequencing.

Finished libraries were quantitated using Invitrogen’s Picogreen assay and the average library size was determined on a Bioanalyzer 2100, using a DNA 7500 chip (Agilent). Library concentrations were then normalized to 4 nM and validated by qPCR on a ViiA-7 real-time thermocycler (Applied Biosystems), using the Kapa library quantification kit for Illumina platforms (Kapa Biosystems). DNA libraries were then pooled at equimolar concentrations and sequenced on an Illumina HiSeq2500 sequencer in rapid mode at a read-length of 250 bp paired-end. RNA libraries were also pooled at equimolar concentrations and sequenced on an Illumina HiSeq2500 sequencer in rapid mode at a read-length of 100 bp paired-end.
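Normalizing a library to 4 nM requires converting a fluorometric mass concentration and a mean fragment size into molarity. The sketch below uses the common approximation of 660 g/mol per base pair of double-stranded DNA. The function names and the example values are hypothetical and are not taken from the study.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair of dsDNA (standard approximation)


def library_molarity_nm(conc_ng_per_ul, mean_size_bp):
    """Convert a dsDNA library concentration (ng/uL) to molarity (nM)."""
    return conc_ng_per_ul / (AVG_BP_MASS * mean_size_bp) * 1e6


def diluent_volume_ul(stock_nm, stock_volume_ul, target_nm=4.0):
    """Volume of diluent to add so that the stock reaches target_nm (C1*V1 = C2*V2)."""
    if stock_nm < target_nm:
        raise ValueError("stock is already more dilute than the target")
    return stock_volume_ul * (stock_nm / target_nm - 1.0)


# Hypothetical library: 2.0 ng/uL with a 600 bp mean size from the Bioanalyzer trace.
molarity = library_molarity_nm(2.0, 600)
print(f"{molarity:.2f} nM; add {diluent_volume_ul(molarity, 10.0):.2f} uL to 10 uL of stock")
```

The qPCR validation mentioned above serves as an independent check on this estimate, since fluorometric assays also measure fragments that lack adapters.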

The raw metagenomics and metatranscriptomics Illumina reads were adapter-trimmed and quality-trimmed using cutadapt-1.8.1 [82] with the parameters “-q 20 --trim-n --minimum-length 30 --match-read-wildcards”. Afterwards, for each paired-end dataset, each first read and its corresponding second read (mate) were merged by local assembly using FLASH 1.2.6 [83], with the parameters “-m 10 -M 251 -x 0.25” for the metagenomics dataset and “-m 10 -M 101 -x 0.25” for the metatranscriptomics dataset.
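For readers who want to reproduce this step, the quoted parameters can be assembled into command lines as sketched below. The quality, length, and overlap settings come from the text. The file names are hypothetical, and the adapter sequence (the shared prefix of Illumina TruSeq adapters) is an assumption, since the text does not state which adapter was trimmed. The commands are only built and printed here, not executed.

```python
import shlex

# Assumption: common prefix of Illumina TruSeq adapters; the source does not
# state which adapter sequence was supplied to cutadapt.
ADAPTER = "AGATCGGAAGAGC"

# Maximum overlap (-M) per dataset, as quoted in the text.
MAX_OVERLAP = {"metagenomics": 251, "metatranscriptomics": 101}


def cutadapt_cmd(r1, r2, out1, out2, adapter=ADAPTER):
    """Paired-end cutadapt call with the trimming parameters quoted in the text."""
    return [
        "cutadapt", "-a", adapter, "-A", adapter,
        "-q", "20", "--trim-n", "--minimum-length", "30",
        "--match-read-wildcards",
        "-o", out1, "-p", out2, r1, r2,
    ]


def flash_cmd(r1, r2, prefix, dataset):
    """FLASH merge call; only -M differs between the two dataset types."""
    return [
        "flash", "-m", "10", "-M", str(MAX_OVERLAP[dataset]), "-x", "0.25",
        "-o", prefix, r1, r2,
    ]


# Hypothetical file names for one metagenomics sample.
trim = cutadapt_cmd("s1_R1.fastq", "s1_R2.fastq", "s1_trim_R1.fastq", "s1_trim_R2.fastq")
merge = flash_cmd("s1_trim_R1.fastq", "s1_trim_R2.fastq", "s1", "metagenomics")
print(shlex.join(trim))
print(shlex.join(merge))
```

Each list can be passed to `subprocess.run(..., check=True)` once the tools are installed. The -M values sit one base above the 250 bp and 100 bp read lengths of the two runs.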

The metagenomics dataset was mapped against the hg19 human reference genome with bowtie2 2.2.5 [3] using the “--very-sensitive-local” preset as its sensitivity parameter. Reads that could not be confidently mapped to the human genome, written out via the “--un” switch, were separated and processed as the non-host reads. The non-host reads were then aligned against the NCBI non-redundant protein database (downloaded on 3 January 2016) using Diamond version 0.85 [84] with default parameters. Based on these alignments, the microbial taxonomic classification was determined using the Lowest-Common-Ancestor (LCA) algorithm implemented in MEGAN6 [85] (parameters: maxmatches = 25 minscore = 100 minsupport = 25). The subsequent taxonomy and KEGG/SEED functional annotation were done as part of MEGAN6 processing and were based on MEGAN’s gi2tax-July2016.bin and gi2keggMarch2016.bin database files.
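The idea behind LCA-based classification can be illustrated with a deliberately naive sketch: each read is assigned to the deepest taxon shared by the lineages of all its retained hits. This is not MEGAN6's implementation. It applies only a minimum bit score and a cap on the number of matches, and it omits the minsupport filter, which operates on per-taxon read counts across a whole sample. The function names and the example lineages are hypothetical.

```python
def lowest_common_ancestor(lineages):
    """Return the deepest taxon shared by all root-to-leaf lineages, or None."""
    if not lineages:
        return None
    shared = []
    for ranks in zip(*lineages):  # walk all lineages in parallel from the root
        if len(set(ranks)) != 1:
            break
        shared.append(ranks[0])
    return shared[-1] if shared else None


def assign_read(hits, min_score=100.0, max_matches=25):
    """Assign one read from its hits, given as (bit_score, lineage) tuples."""
    kept = sorted(
        (hit for hit in hits if hit[0] >= min_score),
        key=lambda hit: hit[0],
        reverse=True,
    )[:max_matches]
    return lowest_common_ancestor([lineage for _, lineage in kept])


# Hypothetical hits for a single read.
hits = [
    (250.0, ["root", "Bacteria", "Firmicutes", "Clostridia"]),
    (180.0, ["root", "Bacteria", "Firmicutes", "Bacilli"]),
    (60.0, ["root", "Archaea"]),
]
print(assign_read(hits))                  # low-scoring Archaea hit is filtered out
print(assign_read(hits, min_score=50.0))  # keeping it pushes the assignment to the root
```

The example shows why the score threshold matters: retaining weak hits to distant taxa pushes assignments toward the root and reduces taxonomic resolution.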

The metatranscriptomics dataset was mapped against the hg19 human reference genome with bowtie2 2.2.5 [86] using the “--very-sensitive-local” preset as its sensitivity parameter. The non-human reads were then separated into their ribosomal RNA and non-ribosomal RNA components using sortMeRNA 2.1 [87]. The non-ribosomal reads were then aligned against the NCBI non-redundant protein database (downloaded on 3 January 2016) using Diamond version 0.85 [4] with default parameters. Based on these alignments, the microbial taxonomic classification was determined using the Lowest-Common-Ancestor (LCA) algorithm implemented in MEGAN6 [5] (parameters: maxmatches = 25 minscore = 80 minsupport = 25). The subsequent taxonomy and KEGG/SEED functional annotation were done as part of MEGAN6 processing and were based on MEGAN’s gi2tax-July2016.bin and gi2keggMarch2016.bin database files.

The metagenomics sequencing run produced, on average, 8.6 million paired-end reads per sample, or 4.3 Gb of data per sample. After quality control and removal of host sequences, an average of 6,864,061 (79.7%) paired-end reads were used for downstream metagenomics analysis. At the end of the classification analysis, on average, 88.93% (66.78–94.97%) of the paired-end reads could be classified to the kingdom Bacteria.

The metatranscriptomics sequencing run produced, on average, 11.9 million paired-end reads per sample, or 2.4 Gb of data per sample. After quality control and removal of ribosomal RNA sequences, an average of 1,212,609 (10.12%) paired-end reads were categorized as non-ribosomal RNA reads and used for downstream metatranscriptomics analysis. The analysis showed that, on average, 60.31% (40–75.28%) of the non-ribosomal RNA reads could be classified to the kingdom Bacteria.
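The per-sample data volumes quoted for both runs follow directly from the read counts and read lengths: each read pair contributes two reads of the stated length. A quick check, assuming 1 Gb means 10^9 bases and using the rounded averages given above (the function name is hypothetical):

```python
def yield_gb(read_pairs, read_length_bp):
    """Sequenced bases in Gb (10^9 bases), counting two mates per read pair."""
    return read_pairs * 2 * read_length_bp / 1e9


metagenome_gb = yield_gb(8_600_000, 250)         # 2 x 250 bp run
metatranscriptome_gb = yield_gb(11_900_000, 100)  # 2 x 100 bp run
print(f"{metagenome_gb:.1f} Gb, {metatranscriptome_gb:.1f} Gb")  # prints "4.3 Gb, 2.4 Gb"
```

Percentages recomputed from these rounded averages will not match the reported figures exactly, because those were presumably derived from unrounded per-sample values.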
