Temporal holdout validation

Vladimir Gligorijević; P. Douglas Renfrew; Tomasz Kosciolek; Julia Koehler Leman; Daniel Berenberg; Tommi Vatanen; Chris Chandler; Bryn C. Taylor; Ian M. Fisk; Hera Vlamakis; Ramnik J. Xavier; Rob Knight; Kyunghyun Cho; Richard Bonneau




This method is extracted from research article: Nat Commun, May 2021

Structure-based protein function prediction using graph convolutional networks

DOI: 10.1038/s41467-021-23303-9


We also evaluate the performance of our method using temporal holdout validation similar to CAFA²⁷. The temporal holdout approach provides a more “realistic” scenario in which function predictions are evaluated against recent experimental annotations³⁴. To construct our temporal holdout test set, we use GO annotations retrieved from SIFTS⁵⁶ at two time points: version 2019/06/18 (referred to as SIFTS-2019) and version 2020/01/04 (referred to as SIFTS-2020). The test set consists of the PDB chains that had no annotations in SIFTS-2019 but gained annotations in SIFTS-2020. To increase the GO term coverage, we focus on the PDB chains with both EXP and IEA evidence codes. This yields 4072 PDB chains, of which 3115 have sequences of <1200 residues. We use our model, trained on SIFTS-2019 GO annotations, to predict the functions of these newly annotated PDB chains, and we evaluate the predictions against the annotations from SIFTS-2020. The results for MF-, BP-, and CC-GO terms are shown in Supplementary Fig. 17. We also show a few examples of PDB chains with MF-GO terms correctly predicted by our method, for which neither BLAST nor DeepGO is able to make any significant predictions.
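The selection rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name temporal_holdout, the dictionary layout (chain ID mapped to a set of GO terms), the chain IDs, and the toy annotations are all hypothetical, and evidence-code handling is left out. Applying the <1200-residue figure as a filter is likewise an assumption about how that cutoff is used.

```python
from typing import Dict, Set

MAX_LEN = 1200  # residue cutoff quoted in the text (sequences <1200 residues)


def temporal_holdout(
    old: Dict[str, Set[str]],
    new: Dict[str, Set[str]],
    seq_len: Dict[str, int],
    max_len: int = MAX_LEN,
) -> Dict[str, Set[str]]:
    """Return chains with no GO terms in `old` that gained GO terms in `new`.

    Hypothetical helper: chains with unknown or too-long sequences are skipped.
    """
    test_set: Dict[str, Set[str]] = {}
    for chain, terms in new.items():
        if not terms:
            continue  # still unannotated in the newer snapshot
        if old.get(chain):
            continue  # already annotated in the older snapshot
        if seq_len.get(chain, max_len) >= max_len:
            continue  # unknown length or at/above the cutoff
        test_set[chain] = set(terms)
    return test_set


# Toy snapshots (made-up chain IDs and annotations, for illustration only).
sifts_2019 = {"1ABC-A": {"GO:0003677"}, "2XYZ-B": set()}
sifts_2020 = {
    "1ABC-A": {"GO:0003677", "GO:0005524"},
    "2XYZ-B": {"GO:0005524"},
    "3DEF-C": {"GO:0016787"},
    "4GHI-D": set(),
}
lengths = {"1ABC-A": 300, "2XYZ-B": 450, "3DEF-C": 1500, "4GHI-D": 200}

holdout = temporal_holdout(sifts_2019, sifts_2020, lengths)
print(sorted(holdout))
```

In this toy case only the chain that was unannotated in the older snapshot, annotated in the newer one, and shorter than the cutoff is retained. The retained GO terms from the newer snapshot would then serve as the ground truth against which predictions from a model trained on the older snapshot are scored.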

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
