A sequence similarity network (SSN) was generated using the EFI-EST tool (http://efi.igb.illinois.edu/efi-est/) (Gerlt et al., 2015). The full-length (native) Cgr2 protein sequence was used as the input to generate a network with the 5000 most similar sequences from the UniProtKB protein database. An initial alignment score cutoff of e-value < 10⁻⁶⁶ generated an SSN with 2018 nodes (at 100% identity) and 317,130 edges. The SSN was imported into Cytoscape v3.2.1 and visualized with the ‘Organic layout’ setting. Seven characterized enzymes were present within the network (UniProtKB IDs: fumarate reductases: P83223, P0C278, Q07WU7, Q9Z4P0; urocanate reductase: Q8CVD0; 3-oxosteroid-1-dehydrogenases: P71864, Q7D5C1). The alignment score cutoff was then made more stringent until enzymes with known functions separated into putatively isofunctional clusters, which occurred at an e-value < 10⁻¹³⁰. At this threshold, Cgr2 appears as a singleton. The network shown in Figure 5A was generated with a cutoff of e-value < 10⁻⁵⁰, a threshold at which nearly all protein sequences form one cluster. Multiple sequence alignments were generated in Geneious and visualized in Jalview (ClustalX coloring). To validate that the clusters in the SSN likely contained isofunctional proteins, Cgr2 was aligned with the characterized enzymes and with additional selected proteins from the corresponding clusters of the SSN, and the alignment was analyzed for the presence of conserved active site residues involved in substrate binding, activation, and proton transfer (Leys et al., 1999; Rohman et al., 2013; Bogachev et al., 2012; Knol et al., 2008; Reid et al., 2000).
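The effect of the cutoff on cluster membership can be illustrated with a minimal Python sketch. It is not the EFI-EST implementation: it simply keeps the edges whose e-value is below a chosen threshold and reports the connected components that remain. The function name `cluster_at_cutoff`, the node labels, and all e-values are hypothetical toy values chosen for illustration, not data from this study.

```python
from collections import defaultdict


def cluster_at_cutoff(nodes, edges, max_evalue):
    """Return connected components using only edges with e-value < max_evalue.

    nodes: iterable of node labels
    edges: iterable of (node_a, node_b, evalue) tuples
    """
    # Union-find structure: every node starts as its own cluster.
    parent = {n: n for n in nodes}

    def find(x):
        # Follow parent links to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, evalue in edges:
        if evalue < max_evalue:  # edge survives the cutoff
            parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for n in nodes:
        clusters[find(n)].add(n)
    # Largest clusters first; ties broken alphabetically for stable output.
    return sorted(clusters.values(), key=lambda c: (-len(c), sorted(c)))


# Hypothetical toy network (labels and e-values are invented).
nodes = ["Cgr2", "FrdA_1", "FrdA_2", "UrdA", "KstD_1", "KstD_2"]
edges = [
    ("FrdA_1", "FrdA_2", 1e-180),
    ("KstD_1", "KstD_2", 1e-150),
    ("FrdA_1", "UrdA", 1e-90),
    ("UrdA", "KstD_1", 1e-70),
    ("Cgr2", "UrdA", 1e-68),
]

permissive = cluster_at_cutoff(nodes, edges, 1e-50)   # lenient cutoff
stringent = cluster_at_cutoff(nodes, edges, 1e-130)   # strict cutoff

print(len(permissive), "cluster(s) at e-value < 1e-50")
print(len(stringent), "cluster(s) at e-value < 1e-130")
```

In this toy example the lenient cutoff leaves every sequence in one cluster, whereas the strict cutoff separates the two hypothetical enzyme families and leaves `Cgr2` as a singleton, mirroring the qualitative behavior described above.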