To determine the average nucleotide identity (ANI) between Flavonifractor sp. 54 and the OTU41-like sequence present within each sample, WMS reads were aligned to the Flavonifractor sp. 54 genome using bwa mem (v0.7.12-r1039; https://github.com/lh3/bwa) [126]. We retained only paired reads aligned as proper pairs, with a minimum mapping quality of 5 and no more than five nucleotides soft-clipped at the end of the read. We then used the Bcftools (version 1.6; http://samtools.github.io/bcftools/) [127] commands “mpileup”, “call”, and “filter” to determine the number of high-quality reference and SNP calls across all 131 WMS samples. For each sample, Bcftools “mpileup” was first used to generate allele information for each position in the Flavonifractor sp. 54 reference genome; Bcftools “call” (with ploidy set to 1) was then used to call variants, and Bcftools “filter” was used to remove calls with a sum base quality of less than 50. Further, positions were excluded if the depth of high-quality calls (i.e., the sum of the DP4 field) was more than 3 standard deviations above the mean depth of passing calls for the sample. This filter was applied to reduce erroneous SNP calls (which would decrease the estimated ANI) arising from homologous sequences contained in other, more abundant organisms found within each sample.
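The read-level and depth-level filters described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline, and every function name in it is hypothetical. It assumes that the SAM flag bits 0x1 and 0x2 mark paired and properly paired reads, that the soft-clip limit applies to either end of a read, and that the population standard deviation is used for the depth cutoff; the source specifies none of the last two choices.

```python
import re
from statistics import mean, pstdev

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def soft_clips(cigar):
    """Return (leading, trailing) soft-clip lengths, ignoring hard clips."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar) if op != "H"]
    lead = ops[0][0] if ops and ops[0][1] == "S" else 0
    trail = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    return lead, trail

def keep_read(flag, mapq, cigar, min_mapq=5, max_clip=5):
    """Keep properly paired reads with MAPQ >= 5 and <= 5 soft-clipped bases.

    Assumption: the clip limit is checked at both ends of the read.
    """
    proper = (flag & 0x1) != 0 and (flag & 0x2) != 0
    return proper and mapq >= min_mapq and max(soft_clips(cigar)) <= max_clip

def depth_cutoff(depths, n_sd=3.0):
    """Mean depth plus n_sd standard deviations (population SD assumed)."""
    return mean(depths) + n_sd * pstdev(depths)

def drop_high_depth(dp4_per_position):
    """Exclude positions whose summed DP4 depth exceeds the per-sample cutoff.

    dp4_per_position holds one 4-tuple of DP4 counts per passing call.
    """
    depths = [sum(dp4) for dp4 in dp4_per_position]
    cutoff = depth_cutoff(depths)
    return [dp4 for dp4, d in zip(dp4_per_position, depths) if d <= cutoff]
```

In practice the same read filter would more likely be applied with an alignment toolkit directly on the BAM file; the sketch only makes the stated thresholds explicit.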

After applying these filters, the remaining high-confidence, passing SNP calls were tallied per sample. An estimate of average nucleotide identity (ANI) between the Flavonifractor sp. 54 reference genome and the OTU41-like sequence within each sample was calculated from the fraction of passing calls that differ from the reference allele, that is, the number of passing calls differing from the reference divided by the total number of passing calls present within that sample.
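As a worked illustration of this calculation, the sketch below computes the divergent fraction from per-sample call counts. The function name and arguments are hypothetical, and reporting ANI as one minus the divergent fraction is an assumption; the text describes only the fraction itself.

```python
def estimate_ani(n_ref_calls, n_snp_calls):
    """Estimate ANI from counts of passing reference and SNP calls.

    Assumption: ANI is reported as 1 minus the fraction of passing
    calls that differ from the reference allele.
    """
    total = n_ref_calls + n_snp_calls
    if total == 0:
        raise ValueError("no passing calls in this sample")
    divergent_fraction = n_snp_calls / total
    return 1.0 - divergent_fraction
```

For example, a sample with 990 passing reference calls and 10 passing SNP calls has a divergent fraction of 10/1000, giving an estimated ANI of about 0.99 under this assumption.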
