
The GenBank [33] and RefSeq [34] assembly summary reports (downloaded from ftp://ftp.ncbi.nlm.nih.gov/genomes/ASSEMBLY_REPORTS) were used to identify all genome sequences annotated as Alistipes, Bacteroides, Parabacteroides, or Prevotella species. Where identical genomes were deposited in both GenBank and RefSeq, only one entry was retained. A genome was considered further only if depositor-provided protein sequence information was available. This process resulted in an initial set of 246 genomes. From this set, species were eliminated where the genome annotation indicated that the sequenced isolate was not of human origin or, if such annotation was lacking, where the type strain of the species was not of human origin (e.g., Bacteroides barnesiae from chickens; all Prevotella except Prevotella copri and Prevotella stercorea). All Alistipes species retrieved were retained.
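The genus selection and GenBank/RefSeq de-duplication described above can be sketched roughly as follows. This is an illustrative reconstruction, not the authors' actual script: the function names are hypothetical, and the column names (`assembly_accession`, `organism_name`, `gbrs_paired_asm`, `paired_asm_comp`) are assumed from the layout of the NCBI assembly summary files, which may differ between releases. The checks for protein sequence availability and for human origin are not covered.

```python
# Genera of interest, as listed in the text (NCBI spelling assumed).
GENERA = ("Alistipes", "Bacteroides", "Parabacteroides", "Prevotella")


def parse_summary(text):
    """Parse a tab-delimited assembly summary into a list of dicts.

    Assumes the last line starting with '#' holds the column names.
    """
    header, rows = None, []
    for line in text.splitlines():
        if line.startswith("#"):
            header = line.lstrip("# ").split("\t")
        elif line.strip():
            rows.append(dict(zip(header, line.split("\t"))))
    return rows


def select_genomes(genbank_rows, refseq_rows, genera=GENERA):
    """Keep assemblies from the target genera, one entry per identical pair."""

    def wanted(row):
        # The first word of the organism name is taken as the genus.
        return row["organism_name"].split(" ")[0] in genera

    kept = [r for r in refseq_rows if wanted(r)]
    refseq_accessions = {r["assembly_accession"] for r in kept}
    for r in genbank_rows:
        if not wanted(r):
            continue
        # Assumption: a GenBank record is a duplicate when it is flagged as
        # identical to a RefSeq assembly that has already been kept.
        duplicate = (
            r.get("paired_asm_comp") == "identical"
            and r.get("gbrs_paired_asm") in refseq_accessions
        )
        if not duplicate:
            kept.append(r)
    return kept
```

Keeping the RefSeq copy of an identical pair is an arbitrary choice made for this sketch; the text states only that one entry was retained.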

For all retained genomes, 16S ribosomal RNA gene sequences were extracted: either DNA identified as 16S ribosomal RNA in the genome’s Generic Feature Format (.gff) file or, where such annotation was lacking, DNA showing best-hit BLAST homology to a 16S sequence from B. fragilis NCTC 9343. The extracted sequences were compared to the 16S sequence database from the Ribosomal Database Project [35] and to the refseq_rna database at NCBI, either to confirm the species where one was indicated or to assign a putative species ID to genomes identified by the depositor to the genus level only. The final set of genomes comprises 205 genome sequences, representing four genera and 36 species (Table 1).
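As an illustration of the annotation-based extraction step, the sketch below pulls 16S features out of GFF3 text and cuts the corresponding DNA from the genome sequence. It is a minimal example under stated assumptions rather than the procedure actually used: the function names are hypothetical, 16S genes are assumed to be annotated as `rRNA` features whose `product` attribute contains "16S", and sequences are assumed to be already loaded into a dictionary keyed by sequence ID. The BLAST fallback and the database comparisons are not covered.

```python
def find_16s(gff_text):
    """Return (seqid, start, end, strand) for each annotated 16S rRNA feature."""
    hits = []
    for line in gff_text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # GFF3 records have nine tab-delimited columns; column 3 is the type.
        if len(cols) != 9 or cols[2] != "rRNA":
            continue
        attrs = dict(f.split("=", 1) for f in cols[8].split(";") if "=" in f)
        if "16S" in attrs.get("product", ""):
            hits.append((cols[0], int(cols[3]), int(cols[4]), cols[6]))
    return hits


# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def extract(seqs, hit):
    """Cut one feature from seqs, reverse-complementing minus-strand hits."""
    seqid, start, end, strand = hit
    # GFF coordinates are 1-based and inclusive.
    sub = seqs[seqid][start - 1:end]
    return sub.translate(_COMPLEMENT)[::-1] if strand == "-" else sub
```

Restricting the search to `rRNA` features avoids picking up protein-coding genes whose product name merely mentions 16S, such as 16S rRNA methyltransferases.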
