Phylogenomic and flexible genome analyses.

Alexander B. Chase
Douglas Sweeney
Mitchell N. Muskat
Dulce G. Guillén-Matus
Paul R. Jensen

We reanalyzed 118 Salinispora genomes (17) representing strains isolated from sponges and from globally distributed marine sediments (96% of isolates). Previously, we assigned each strain to one of nine species based on genotypic and phenotypic characteristics (26) (Table S1 in the supplemental material). For each genome, protein-coding regions were predicted and annotated using Prokka v1.13.3 (63), and orthologs shared across all genomes were identified with Roary v3.12.0 (64) using a minimum sequence identity of 85%. The resulting 2,106 potential orthologs were individually aligned using Clustal Omega (65) and screened for complete codon reading frames. The final 2,011 single-copy orthologs were concatenated to infer a core genome phylogeny using RAxML v8.2.10 (66) under the general time-reversible (GTR) model with a gamma distribution of rate heterogeneity and 100 replicates (Fig. 1A). All orthologs not shared by every strain were assigned to the flexible genome. Species-specific flexible genes (defined as genes shared by all strains within a species but not observed in any other species) were functionally annotated with GhostKOALA against the nonredundant set of KEGG genes (67) for all Salinispora species with ≥3 genome sequences.
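
The core, flexible, and species-specific definitions above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `partition_genes`, its arguments, and the input layout (a mapping from each ortholog to the set of strains carrying it, plus a mapping from each strain to its species) are assumptions made for illustration. Such a mapping could, for example, be derived from the `gene_presence_absence.csv` table that Roary writes, but that parsing step is not shown here.

```python
from collections import defaultdict


def partition_genes(presence, species_of, min_genomes=3):
    """Split orthologs into core, flexible, and species-specific sets.

    presence    : dict mapping ortholog name -> set of strains carrying it
                  (hypothetical layout, assumed for this sketch)
    species_of  : dict mapping strain name -> species name
    min_genomes : only species with at least this many genomes are
                  considered for species-specific genes (>= 3 in the text)
    """
    all_strains = set(species_of)

    # Group strains by species.
    members = defaultdict(set)
    for strain, species in species_of.items():
        members[species].add(strain)

    # Core genome: orthologs found in every strain.
    core = {gene for gene, strains in presence.items() if strains >= all_strains}

    # Flexible genome: every ortholog that is not shared by all strains.
    flexible = set(presence) - core

    # Species-specific flexible genes: present in all strains of one species
    # and in no strain of any other species, i.e. the carrier set equals
    # that species' strain set exactly.
    species_specific = {}
    for species, strains in members.items():
        if len(strains) < min_genomes:
            continue
        species_specific[species] = {
            gene for gene in flexible if presence[gene] == strains
        }

    return core, flexible, species_specific
```

Requiring the carrier set to equal the species' strain set exactly captures both halves of the definition at once: a gene missing from one strain of the species fails the test, as does a gene that also appears in another species.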
