This study used a larger set of gut microbiota profiles generated alongside those described in a recent study by Goodrich et al.²⁴, which reported a smaller sample because it considered only complete twin pairs. The processing of faecal samples has been described previously²². Briefly, samples were collected by participants at home and either brought to a clinical visit or posted on ice to the clinical research department, where they were stored at −80 °C. Frozen samples were shipped to Cornell University, where DNA was extracted, the V4 region of the 16S rRNA gene was amplified, and amplicons were sequenced using a multiplexed approach on the Illumina MiSeq platform. Sample reads were demultiplexed and paired ends were merged using a 200 nt minimum overlap.
De novo chimera removal was carried out on the 16S rRNA gene sequences of each sample using UCHIME²⁵. Remaining reads were clustered into de novo operational taxonomic units (OTUs) at 97% identity using SUMACLUST within QIIME version 1.9.0²⁶,²⁷. OTU taxonomy was assigned by aligning representative sequences to the Greengenes v13_8 database using UCLUST in QIIME. All analyses were adjusted for sequencing depth by including sample read count as a covariate. Taxonomic abundances were generated by collapsing OTU counts at the appropriate taxonomic levels and converting them to log-transformed relative abundances. Three alpha diversity metrics, namely the Shannon index, phylogenetic diversity, and raw OTU counts, were calculated using QIIME. Beta diversity was calculated using both weighted and unweighted UniFrac metrics, and principal coordinate analysis of the beta diversity distances was carried out using the vegan package²⁸. The first six axes were chosen to represent beta diversity (Supplementary Fig. 8).
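As an illustration only, the abundance transformation and two of the alpha diversity metrics described above can be sketched in plain Python. This is not the QIIME implementation used in the study. The function names, the pseudocount added before the log transform, the base-10 logarithm, and the base-2 Shannon index are all assumptions made for the example, since the text does not specify them. Phylogenetic diversity and UniFrac are omitted because they require a phylogenetic tree.

```python
import math


def relative_abundances(counts):
    """Convert raw counts for one sample into proportions that sum to 1."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no reads")
    return [c / total for c in counts]


def log_relative_abundances(counts, pseudocount=1e-6):
    """Log10-transform relative abundances.

    The pseudocount (an assumption, not taken from the paper) avoids
    taking log(0) for taxa that are absent from the sample.
    """
    return [math.log10(p + pseudocount) for p in relative_abundances(counts)]


def shannon_index(counts, base=2.0):
    """Shannon index H = -sum(p * log(p)) over taxa with non-zero counts.

    Base 2 is assumed here; other bases only rescale the value.
    """
    return -sum(
        p * math.log(p, base) for p in relative_abundances(counts) if p > 0
    )


def observed_otus(counts):
    """Raw OTU count: the number of OTUs with at least one read."""
    return sum(1 for c in counts if c > 0)


if __name__ == "__main__":
    # Hypothetical OTU counts for a single sample.
    sample = [120, 30, 0, 50]
    print(log_relative_abundances(sample))
    print(shannon_index(sample))
    print(observed_otus(sample))
```

For a sample in which four OTUs are equally abundant, the base-2 Shannon index is 2 (up to floating-point error), which is a convenient sanity check for any implementation.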