Bioinformatic analysis

Nida Amin; Sarah Schwarzkopf; Asako Kinoshita; Johanna Tröscher-Mußotter; Sven Dänicke; Amélia Camarinha-Silva; Korinna Huber; Jana Frahm; Jana Seifert

Concise Method

This method is extracted from research article: Anim Microbiome, Apr 2021

Evolution of rumen and oral microbiota in calves is influenced by age and time of weaning

DOI: 10.1186/s42523-021-00095-3

Illumina amplicon sequencing datasets covering the V1-V2 region of the 16S rRNA gene were analysed with QIIME 2 (2019.10) [49]. The raw paired-end (PE) Illumina sequences (2 × 250 bp) were imported into QIIME 2 using the MultiplexedPairedEndBarcodeInSequence semantic type. The PE sequences were demultiplexed with cutadapt (v2.6) through the q2-cutadapt plugin and its demux-paired command, with the error tolerance raised from its default to 0.2. Residual artificial sequences, namely barcodes, the forward primer (22 bp) and the reverse primer (19 bp), were trimmed with the trim-paired command of the same plugin [50]. Quality filtering and joining of the PE reads were done with the DADA2 pipeline through the q2-dada2 plugin and its denoise-paired command [51]. The trimmed PE sequences were quality filtered by retaining high-quality bases (average quality score above 30), and the PE reads were joined at a mean length of 313 ± 6 bp. Chimeric sequences, non-overlapping regions and singletons were discarded, and FeatureTable[Frequency] and FeatureData[Sequence] QIIME 2 artifacts were generated.

The PE sequences from each sequencing run were processed separately throughout the analysis, so the DADA2 step yielded one FeatureTable[Frequency] and one FeatureData[Sequence] artifact per sequencing run. The filtered FeatureTable[Frequency] artifacts were merged with the qiime feature-table merge command and the FeatureData[Sequence] artifacts with the qiime feature-table merge-seqs command, resulting in a total of 6,141,120 reads, with 23,262 ± 1758 reads (mean ± SEM) per sample. Taxonomic classification was performed with the q2-feature-classifier plugin and its classify-sklearn method, using an sklearn-based taxonomy classifier pre-trained on the SILVA reference database for 16S rRNA (release_132), at the default confidence of 0.7 [52, 53].
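
The per-run merge and the "mean ± SEM reads per sample" summary can be illustrated with a minimal Python sketch. This is not the QIIME 2 implementation: the dictionary layout, the function names `merge_tables` and `depth_summary`, the sample IDs and all counts are hypothetical. SEM is taken here as the sample standard deviation divided by the square root of the number of samples, which is the usual definition but is not spelled out in the text.

```python
from math import sqrt
from statistics import mean, stdev


def merge_tables(tables):
    """Merge per-run feature tables of the form {sample_id: {feature_id: count}}.

    Assumption: a sample belongs to exactly one sequencing run, so a repeated
    sample ID raises an error instead of being summed silently.
    """
    merged = {}
    for table in tables:
        for sample_id, counts in table.items():
            if sample_id in merged:
                raise ValueError(f"sample {sample_id!r} appears in more than one run")
            merged[sample_id] = dict(counts)
    return merged


def depth_summary(table):
    """Return (total reads, mean reads per sample, SEM of reads per sample)."""
    depths = [sum(counts.values()) for counts in table.values()]
    # SEM = sample standard deviation / sqrt(n); undefined for a single sample.
    sem = stdev(depths) / sqrt(len(depths)) if len(depths) > 1 else 0.0
    return sum(depths), mean(depths), sem


# Toy counts, invented for illustration only.
run_a = {"calf01_rumen": {"asv1": 12000, "asv2": 8000}}
run_b = {
    "calf02_swab": {"asv1": 9000, "asv3": 15000},
    "calf03_swab": {"asv2": 22000},
}

merged = merge_tables([run_a, run_b])
total, avg, sem = depth_summary(merged)
print(f"{total} reads, {avg:.0f} ± {sem:.0f} per sample (mean ± SEM)")
```

Reporting SEM rather than the standard deviation describes the precision of the mean depth, not the spread between samples, so the two should not be read interchangeably.
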
Sequences assigned to cyanobacteria and chloroplasts, as well as non-bacterial and unassigned sequences, were removed from the FeatureData[Sequence] and FeatureTable[Frequency] artifacts by taxonomy-based filtering with the q2-taxa plugin in QIIME 2, using the qiime taxa filter-seqs and qiime taxa filter-table commands. All low-read samples (< 5000 reads) were removed from the FeatureTable[Frequency] and FeatureData[Sequence] artifacts with the qiime feature-table filter-samples and qiime feature-table filter-seqs commands. A BIOM feature table (FeatureTable[Frequency] with taxonomy annotations) was produced with the biom add-metadata command in QIIME 2 and later converted into txt format with the biom convert command.

The feature table was then filtered again under strict criteria to remove low-abundance OTUs (≤ 0.2% of total reads per sample). This resulted in a total of 4,741,355 reads, with mean read counts per sample of 17,716 ± 1590 for stomach tubing samples and 21,014 ± 2014 for buccal swab samples (mean ± SEM), and a total of 4906 unique bacterial OTUs. All unique bacterial OTUs were taxonomically reassigned using the RDP database [54] and the naïve Bayesian RDP classifier [55]. The output taxonomy table was filtered according to [56], with a threshold cut-off value defined for each taxonomic level: genus (94.5%), family (86.5%), order (82.0%), class (78.5%) and phylum (75.0%). Taxonomic assignments that fell below these thresholds were omitted.
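
Two of the filtering rules above are simple enough to sketch in Python: dropping samples with fewer than 5000 reads, and trimming a lineage by the rank-specific cut-offs. This is only an illustration under stated assumptions, not the authors' script. The function names, the data layout and the toy values are hypothetical, and the sketch assumes that once a rank fails its cut-off, all deeper ranks are dropped as well. The text describes the cut-offs both as confidence and as sequence identity thresholds, so the sketch treats them simply as a per-rank percentage score.

```python
# Per-rank cut-offs (percent) as quoted in the text, ordered from broad to fine.
RANK_CUTOFFS = [
    ("phylum", 75.0),
    ("class", 78.5),
    ("order", 82.0),
    ("family", 86.5),
    ("genus", 94.5),
]


def drop_low_read_samples(table, min_reads=5000):
    """Keep samples of {sample_id: {otu_id: count}} with at least min_reads reads."""
    return {
        sample_id: counts
        for sample_id, counts in table.items()
        if sum(counts.values()) >= min_reads
    }


def truncate_lineage(scored_lineage):
    """Trim a lineage at the first rank whose score is below its cut-off.

    scored_lineage maps rank -> (taxon name, score in percent).
    Assumption: once a rank fails, all deeper ranks are dropped as well.
    """
    kept = []
    for rank, cutoff in RANK_CUTOFFS:
        if rank not in scored_lineage:
            break
        name, score = scored_lineage[rank]
        if score < cutoff:
            break
        kept.append((rank, name))
    return kept


# Toy inputs, invented for illustration only.
table = {
    "swab_A": {"otu1": 4000, "otu2": 900},   # 4900 reads, below the cut-off
    "swab_B": {"otu1": 3000, "otu2": 2000},  # exactly 5000 reads, kept
}
lineage = {
    "phylum": ("Bacteroidetes", 100.0),
    "class": ("Bacteroidia", 99.0),
    "order": ("Bacteroidales", 97.0),
    "family": ("Prevotellaceae", 90.0),
    "genus": ("Prevotella", 80.0),  # below 94.5, so the genus call is dropped
}

print(sorted(drop_low_read_samples(table)))
print(truncate_lineage(lineage))
```

Stopping at the first failing rank keeps every reported lineage internally consistent: an OTU is never labelled at genus level while its family-level assignment has been rejected.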

Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
