We also evaluate the performance of our method by using temporal holdout validation similar to CAFA27. The temporal holdout approach ensures a more “realistic” scenario where function predictions are evaluated based on recent experimental annotations34. We used GO annotations retrieved from SIFTS56 from two time points, version 2019/06/18 (we refer to this as SIFTS-2019) and version 2020/01/04 (we refer to this as SIFTS-2020), to construct our temporal holdout test set. We form the test set from the PDB chains that did not have any annotations in SIFTS-2019 but gained annotations in SIFTS-2020. To increase the GO term coverage, we focus on the PDB chains with both EXP and IEA evidence codes. We obtain 4072 PDB chains (out of which 3115 have sequences <1200 residues). We use our model (trained on SIFTS-2019 GO annotations) to predict functions of these newly annotated PDB chains. We evaluate our predictions against the annotations from SIFTS-2020. The results for MF-, BP-, and CC-GO terms are shown in Supplementary Fig. 17. We also show a few examples of the PDB chains with correctly predicted MF-GO terms by our method, for which both BLAST and DeepGO are not able to make any significant predictions.
Do you have any questions about this protocol?
Post your question to gather feedback from the community. We will also invite the authors of this article to respond.
Tips for asking effective questions
+ Description
Write a detailed description. Include all information that will help others answer your question including experimental processes, conditions, and relevant images.