Unless otherwise specified, the statistical significance of the association between reported PAgs and computed protein properties (SCL, AP, signal peptide, TMH, and TMB) was calculated using a one-sided Fisher’s exact test, since we were interested only in over-representation of properties in PAgs. For the ad hoc analysis of a specific property (e.g., SCL prediction), the significance of each individual sub-property (e.g., individual SCL locations such as extracellular, cell wall, cytoplasmic membrane, and cytoplasm in G+ bacteria) was further examined by performing a one-vs-rest Fisher’s exact test, and the resulting p-values were adjusted by applying the Bonferroni correction.
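The one-vs-rest testing with Bonferroni adjustment described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the 2x2 table layout (rows: PAgs vs. background proteins; columns: in the category vs. not), and the example counts are all assumptions made for demonstration.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for over-representation.

    Assumed table layout (hypothetical, for illustration):
        [[a, b],   # PAgs:       with property, without property
         [c, d]]   # background: with property, without property
    Returns P(X >= a) under the hypergeometric null distribution.
    """
    row1 = a + b          # number of PAgs
    col1 = a + c          # number of proteins with the property
    n = a + b + c + d     # all proteins
    upper = min(row1, col1)
    numerator = sum(
        comb(col1, k) * comb(n - col1, row1 - k) for k in range(a, upper + 1)
    )
    return min(1.0, numerator / comb(n, row1))


def one_vs_rest_bonferroni(pag_counts, background_counts):
    """Test each category against all others pooled, then Bonferroni-adjust.

    Both arguments map category name -> protein count. Returns a dict of
    category -> (raw p-value, adjusted p-value).
    """
    pag_total = sum(pag_counts.values())
    bg_total = sum(background_counts.values())
    n_tests = len(pag_counts)
    results = {}
    for category, a in pag_counts.items():
        b = pag_total - a                  # PAgs in any other category
        c = background_counts[category]    # background in this category
        d = bg_total - c                   # background in any other category
        p = fisher_exact_greater(a, b, c, d)
        # Bonferroni: multiply by the number of tests, cap at 1.
        results[category] = (p, min(1.0, p * n_tests))
    return results


if __name__ == "__main__":
    # Invented counts, for demonstration only.
    pags = {"extracellular": 30, "cell wall": 12,
            "cytoplasmic membrane": 20, "cytoplasm": 18}
    background = {"extracellular": 150, "cell wall": 90,
                  "cytoplasmic membrane": 900, "cytoplasm": 2400}
    for name, (p, p_adj) in one_vs_rest_bonferroni(pags, background).items():
        print(f"{name}: p={p:.3g}, Bonferroni-adjusted p={p_adj:.3g}")
```

In practice, `scipy.stats.fisher_exact(table, alternative="greater")` computes the same one-sided p-value; the pure-Python version is used here only to keep the sketch dependency-free.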
The over-representation of conserved domains, COG clusters, and GO terms (BP, MF, CC) among Protegen PAgs was tested using Fisher’s exact test, and the resulting p-values were adjusted using the Benjamini–Hochberg–Yekutieli procedure. In addition, the significant (adjusted p ≤ 0.05) GO terms (BP, MF, CC) were visualized in hierarchical format using GOfox (33). Because GO enrichment analysis tends to generate a large list of enriched GO terms, GOfox laid out the GO terms using its internal hierarchical GO structure simplification algorithm (33).
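The multiple-testing adjustment for the enrichment p-values can be sketched as below. The text does not specify the exact variant, so this sketch assumes the Benjamini–Yekutieli form, which scales each p-value by the harmonic sum c(m) = 1 + 1/2 + ... + 1/m in addition to the usual m/rank factor; setting c(m) to 1 would give the plain Benjamini–Hochberg adjustment. The function name and the example p-values are hypothetical.

```python
def benjamini_yekutieli(pvalues):
    """Return Benjamini-Yekutieli adjusted p-values in the input order.

    Assumption: this is the BY variant (valid under arbitrary dependence),
    which may or may not match the exact procedure used in the study.
    """
    m = len(pvalues)
    if m == 0:
        return []
    # Harmonic-sum correction factor c(m) specific to the BY procedure.
    c_m = sum(1.0 / i for i in range(1, m + 1))
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        scaled = pvalues[idx] * m * c_m / rank
        running_min = min(running_min, scaled)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Invented p-values for three hypothetical GO terms.
    raw = {"GO:term_A": 0.0004, "GO:term_B": 0.02, "GO:term_C": 0.3}
    adjusted = benjamini_yekutieli(list(raw.values()))
    for term, p_adj in zip(raw, adjusted):
        flag = "significant" if p_adj <= 0.05 else "not significant"
        print(f"{term}: adjusted p={p_adj:.3g} ({flag})")
```

An equivalent adjustment is available as `statsmodels.stats.multitest.multipletests(pvalues, method="fdr_by")`.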