MAG annotation and taxonomic classification.

Dimitri V. Meier, Stefanie Imminger, Osnat Gillor, Dagmar Woebken

First, the MAGs were annotated using the RAST-Tk pipeline (120). The translated ORFs predicted by RAST-Tk were then searched against several databases: diamond blastp against UniProt (121), hmmscan (http://www.hmmer.org/) against Pfam (122), EggNOG-mapper v.1.0.1 (123) against EggNOG (124), and hmmscan against CAZy (125) and MEROPS (126). The results of all annotations were combined into one table, loaded into a structured query language (SQL) database, and searched with SQL queries. Annotations of large subunits of CO dehydrogenase and catalytic subunits of methane monooxygenase were verified by phylogeny. Briefly, the sequences included in the seed alignment of the respective Pfam family, together with further sequences of confirmed function or known classification, were obtained from UniProt. Reference and metagenome-derived sequences were aligned to the Pfam alignment with hmmalign (http://www.hmmer.org/), and phylogenetic trees were calculated with FastTree v.2.1.11 (127), starting from a BIONJ tree (128) and using the Le-Gascuel (LG) substitution model (129) with gamma likelihood optimization.
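The step of loading the combined annotation table into an SQL database and querying it can be illustrated with a minimal sketch. The source does not state which SQL engine, table layout, or queries were used, so everything here is an assumption made for illustration: SQLite (via Python's standard `sqlite3` module), the column names, the helper names `build_annotation_db` and `mags_with_hit`, and the placeholder accessions in `EXAMPLE_ROWS`.

```python
import sqlite3

# Hypothetical combined annotation rows:
# (mag_id, orf_id, source_db, accession, description).
# All identifiers are placeholders, not real database accessions.
EXAMPLE_ROWS = [
    ("MAG_A", "orf_1", "Pfam", "PF_EXAMPLE_1", "placeholder"),
    ("MAG_A", "orf_2", "Pfam", "PF_EXAMPLE_1", "placeholder"),
    ("MAG_A", "orf_2", "UniProt", "UP_EXAMPLE_9", "placeholder"),
    ("MAG_B", "orf_7", "Pfam", "PF_EXAMPLE_1", "placeholder"),
    ("MAG_B", "orf_8", "CAZy", "CZ_EXAMPLE_3", "placeholder"),
]


def build_annotation_db(rows):
    """Load combined annotation rows into an in-memory SQLite database."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE annotations ("
        "mag_id TEXT, orf_id TEXT, source_db TEXT, "
        "accession TEXT, description TEXT)"
    )
    con.executemany("INSERT INTO annotations VALUES (?, ?, ?, ?, ?)", rows)
    con.commit()
    return con


def mags_with_hit(con, source_db, accession):
    """Return (mag_id, number of distinct ORFs) for MAGs with a given hit."""
    cur = con.execute(
        "SELECT mag_id, COUNT(DISTINCT orf_id) FROM annotations "
        "WHERE source_db = ? AND accession = ? "
        "GROUP BY mag_id ORDER BY mag_id",
        (source_db, accession),
    )
    return cur.fetchall()


con = build_annotation_db(EXAMPLE_ROWS)
hits = mags_with_hit(con, "Pfam", "PF_EXAMPLE_1")
print(hits)
```

Keeping one row per ORF and database hit (a "long" table) is one possible layout; it lets a single query combine evidence from several databases for the same ORF.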

The MAGs were taxonomically classified with GTDB-Tk (36) and, where a MAG contained one, by its 16S rRNA gene. The GTDB-Tk classification is based on (i) placement in the phylogenomic tree, (ii) the relative evolutionary divergence (RED) value as established by Parks et al. (35), and (iii) average nucleotide identity to the most closely related genome within a genus. For novel genera, families, and orders, local phylogenetic trees were calculated de novo from the GTDB alignment (35) using FastTree (same settings as described above).
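As an illustration of how MAGs from potentially novel lineages might be flagged before building local trees, the sketch below parses a GTDB-style taxonomy string (seven semicolon-separated ranks with prefixes `d__` to `s__`) and lists the ranks left without a name. The source does not describe how novel lineages were identified; treating an unnamed rank as a candidate novel lineage is an assumption, the helper names `parse_lineage` and `unnamed_ranks` are hypothetical, and the taxon names in `EXAMPLE` are placeholders.

```python
# Rank names paired with the prefixes used in GTDB-style taxonomy strings.
RANKS = [
    ("domain", "d__"), ("phylum", "p__"), ("class", "c__"),
    ("order", "o__"), ("family", "f__"), ("genus", "g__"),
    ("species", "s__"),
]


def parse_lineage(taxonomy):
    """Split a GTDB-style taxonomy string into a {rank: name} dict.

    Ranks without an assigned name map to an empty string.
    """
    fields = [f.strip() for f in taxonomy.split(";")]
    if len(fields) != len(RANKS):
        raise ValueError(f"expected {len(RANKS)} ranks, got {len(fields)}")
    lineage = {}
    for (rank, prefix), field in zip(RANKS, fields):
        if not field.startswith(prefix):
            raise ValueError(f"field {field!r} lacks prefix {prefix!r}")
        lineage[rank] = field[len(prefix):]
    return lineage


def unnamed_ranks(taxonomy):
    """Return the ranks that carry no name in the taxonomy string."""
    return [rank for rank, name in parse_lineage(taxonomy).items() if not name]


# Placeholder lineage: named down to order, unnamed from family onward.
EXAMPLE = "d__Bacteria;p__PhylumX;c__ClassY;o__OrderZ;f__;g__;s__"
print(unnamed_ranks(EXAMPLE))
```

Under this assumption, a MAG whose string is unnamed from family downward would be a candidate for a de novo local tree at the family level.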
