Metagenomic sequencing and analysis

Hong Zheng
Pengtao Xu
Qiaoying Jiang
Qingqing Xu
Yafei Zheng
Junjie Yan
Hui Ji
Jie Ning
Xi Zhang
Chen Li
Limin Zhang
Yuping Li
Xiaokui Li
Weihong Song
Hongchang Gao

In this study, sequencing libraries were generated from approximately 1 μg of DNA per cecal content sample using the NEBNext® Ultra™ DNA Library Prep Kit for Illumina (NEB, USA), and index codes were added to attribute sequences to each sample. In brief, each DNA sample was fragmented by sonication to a size of 350 bp, and the DNA fragments were end-polished, A-tailed, ligated with the full-length adaptor for Illumina sequencing, and then amplified by PCR. PCR products were purified using the AMPure XP system, and the libraries were processed on a cBot Cluster Generation System according to the manufacturer's instructions. Finally, the library preparations were sequenced on an Illumina HiSeq 2500 platform (PE150) at Novogene (Beijing, China), and paired-end reads were generated.

Sequencing adapters and low-quality reads, defined as reads containing more than 40 bases with a quality score lower than 38 or more than 10 N bases, were removed using Readfq (v8, https://github.com/cjfields/readfq). High-quality reads were assembled into scaffolds using SOAPdenovo (v2.04, http://soap.genomics.org.cn/soapdenovo.html). MetaGeneMark (v2.10, http://topaz.gatech.edu/GeneMark/) was used to predict open reading frames (ORFs), and redundancy was removed using CD-HIT (v4.5.8, http://www.bioinformatics.org/cd-hit/), yielding non-redundant unigenes. DIAMOND (v0.9.9.110, https://github.com/bbuchfink/diamond/), a BLAST-compatible aligner, was used to align the unigene sequences against the NR database at NCBI (https://www.ncbi.nlm.nih.gov/). The lowest common ancestor (LCA) algorithm was applied to assign the ORF alignments to taxonomic groups. The functional profile of KEGG orthology (KO) was predicted from the metagenome data with PICRUSt [59]. Subsequently, the predicted KO abundances were grouped into KEGG pathways at levels 1-3.
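The read-filtering thresholds stated above can be illustrated with a minimal Python sketch. It is not the Readfq implementation; it only shows one plausible reading of the criteria. The function names, the Phred+33 quality encoding, and the rule that a pair is discarded when either mate fails are assumptions made for illustration.

```python
from typing import Iterable, Iterator, Tuple

# Thresholds quoted in the text; how they combine is an assumption.
MIN_QUAL = 38            # bases with a score below this count as low quality
MAX_LOW_QUAL_BASES = 40  # a read with more low-quality bases than this is dropped
MAX_N_BASES = 10         # a read with more N bases than this is dropped
PHRED_OFFSET = 33        # assumed Phred+33 (Sanger / Illumina 1.8+) encoding

Read = Tuple[str, str]   # (sequence, quality string)


def passes_filter(seq: str, qual: str) -> bool:
    """Return True if a single read satisfies both thresholds."""
    n_count = seq.upper().count("N")
    low_qual = sum(1 for ch in qual if ord(ch) - PHRED_OFFSET < MIN_QUAL)
    return n_count <= MAX_N_BASES and low_qual <= MAX_LOW_QUAL_BASES


def filter_pairs(pairs: Iterable[Tuple[Read, Read]]) -> Iterator[Tuple[Read, Read]]:
    """Keep a read pair only if both mates pass (assumed pairing rule)."""
    for r1, r2 in pairs:
        if passes_filter(*r1) and passes_filter(*r2):
            yield r1, r2
```

In this sketch, a 150 bp read with 41 bases below Q38 is rejected, while a read with exactly 40 such bases is kept, matching the "more than 40" wording.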
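The LCA assignment step can be sketched in the same way. The source does not specify which LCA implementation or parameters were used, so the following is a simplified illustration only: it assumes each alignment hit of an ORF has already been converted to a root-to-leaf lineage (a list of taxon names), and the function name and example lineages are hypothetical.

```python
from typing import List, Sequence


def lowest_common_ancestor(lineages: Sequence[Sequence[str]]) -> List[str]:
    """Return the longest lineage prefix shared by all hits of one ORF.

    Each lineage is ordered from the root (e.g. kingdom) toward the leaf.
    An empty list means the hits share no taxon, or there were no hits.
    """
    shared: List[str] = []
    # zip stops at the shortest lineage, so ranks missing from any hit
    # are never included in the shared prefix.
    for taxa_at_rank in zip(*lineages):
        if all(taxon == taxa_at_rank[0] for taxon in taxa_at_rank):
            shared.append(taxa_at_rank[0])
        else:
            break
    return shared


# Hypothetical hits for a single ORF, for illustration only.
hits = [
    ["Bacteria", "Firmicutes", "Clostridia"],
    ["Bacteria", "Firmicutes", "Bacilli"],
]
assigned = lowest_common_ancestor(hits)  # ["Bacteria", "Firmicutes"]
```

Here the ORF is assigned at the phylum level because its hits disagree at the class level. Production tools typically add score or identity cutoffs before taking the LCA, and those are omitted in this sketch.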
