Updated POG database

Tingting Zheng
Jun Li
Yueqiong Ni
Kang Kang
Maria-Anna Misiakou
Lejla Imamovic
Billy K. C. Chow
Anne A. Rode
Peter Bytzer
Morten Sommer
Gianni Panagiotou

Our updated phage orthologous groups (uPOGs) database was built in October 2016 using the same methodology as Kristensen et al. [27], except that we included more recently released and published phage genomes to obtain more POGs. The sequences of 3319 publicly available phage genomes were retrieved from the NCBI Nucleotide database (https://www.ncbi.nlm.nih.gov/nuccore/?term=) using the same keywords as Kristensen et al. [27]. In addition, 759 prophage genomes were downloaded from the ACLAME database [28]. The phage genomes used for uPOGs construction are available on our website (http://147.8.185.62/VirMiner/downloads/phage_genome/). The sequences of 7734 available prokaryotic genomes deposited in NCBI were also downloaded. POGs were constructed using the standard COG-building method [29].

Among the constructed POGs, we further identified virus-specific POGs based on the virus quotient (VQ) [27]. All POGs were searched against the phage genomes and the prokaryotic genomes separately using PSI-BLAST (E value ≤ 0.001, bit score > 40, homologous region ≥ 40 amino acids). Prokaryotic genomes with prophage regions identified by PhiSpy [30] were excluded from the VQ calculation. For each POG, VQ was calculated as the frequency of matches to viral genomes divided by the sum of the frequencies of matches to viral and prokaryotic genomes. POGs with VQ > 0.85 were considered highly virus-specific. We also identified taxon-specific POGs that can be used to detect the presence of specific taxonomic groups. For the POG database developed by Kristensen et al. [27] (POG 2012), results were presented using two different sets of criteria: (a) 100% recall, 100% precision, a VQ of 1.0, and presence in only a single copy per genome; (b) 100% precision and a VQ of 1.0, with no recall threshold. In another study that employed the POG database for human gut microbiome analysis, Waller et al. [18] used the criteria of 100% precision, VQ greater than 0.85, recall greater than 85%, and presence in a single copy per genome. Here we followed the criteria of Waller et al. [18] for taxon signature POG identification. The updated POG database (uPOGs) is available on our website (http://147.8.185.62/VirMiner/downloads/updated_POG_database/). A more detailed description of these methods is available in Kristensen et al. [27].
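
The hit filtering, VQ calculation, and taxon signature criteria described above can be illustrated with a minimal Python sketch. It is not the code used to build uPOGs. The `Hit` record, all function names, and the reading of "frequency of matches" as the fraction of genomes with at least one passing hit are assumptions made for illustration only; the thresholds are the ones quoted in the text.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Thresholds quoted in the text.
MAX_EVALUE = 1e-3       # E value <= 0.001
MIN_BITSCORE = 40.0     # bit score strictly greater than 40
MIN_ALIGNED_AA = 40     # homologous region >= 40 amino acids
VQ_CUTOFF = 0.85        # VQ strictly greater than 0.85
RECALL_CUTOFF = 0.85    # recall strictly greater than 85%


@dataclass(frozen=True)
class Hit:
    """Hypothetical record for one PSI-BLAST match of a POG to a genome."""
    pog_id: str
    genome_id: str
    evalue: float
    bitscore: float
    aligned_aa: int


def passes_filter(hit: Hit) -> bool:
    """Apply the three match thresholds quoted in the text."""
    return (hit.evalue <= MAX_EVALUE
            and hit.bitscore > MIN_BITSCORE
            and hit.aligned_aa >= MIN_ALIGNED_AA)


def match_frequency(hits: Iterable[Hit], pog_id: str, n_genomes: int) -> float:
    """Assumed definition: fraction of genomes with >= 1 passing hit to the POG."""
    if n_genomes == 0:
        return 0.0
    matched = {h.genome_id for h in hits
               if h.pog_id == pog_id and passes_filter(h)}
    return len(matched) / n_genomes


def virus_quotient(viral_freq: float, prok_freq: float) -> Optional[float]:
    """VQ = viral frequency / (viral frequency + prokaryotic frequency).

    Returns None when the POG matches nothing, since VQ is then undefined.
    """
    total = viral_freq + prok_freq
    return viral_freq / total if total > 0 else None


def is_virus_specific(vq: Optional[float]) -> bool:
    """Highly virus-specific POGs have VQ > 0.85."""
    return vq is not None and vq > VQ_CUTOFF


def is_taxon_signature(precision: float, recall: float,
                       vq: Optional[float], single_copy: bool) -> bool:
    """Criteria attributed to Waller et al. [18] in the text."""
    return (precision == 1.0
            and recall > RECALL_CUTOFF
            and is_virus_specific(vq)
            and single_copy)
```

For example, a POG found in 90% of the phage genomes and 10% of the prophage-free prokaryotic genomes would have VQ = 0.9 / (0.9 + 0.1) = 0.9 under this definition and would be classified as highly virus-specific.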
