According to a study by Mayr et al., machine learning predictions on chemical databases are subject to several potential biases [22]. First, compound series bias arises easily because databases contain many highly similar compounds that share common scaffolds [25]. A prediction model optimized for a specific scaffold may therefore overestimate performance and be unsuitable for activity prediction of novel drugs. Second, hyperparameter selection tuned to a specific algorithm is also problematic because it prevents a fair comparison between methods. To mitigate these biases, we adopted nested cluster cross validation for the comparison between the various DNNs [35]. Details of the nested cluster cross validation are described in Supplementary File S-2.
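The idea of nested cluster cross validation can be sketched as follows. This is a minimal illustration, not the authors' actual pipeline: real scaffold clustering (e.g., Bemis-Murcko scaffolds) is replaced here with random cluster labels, and a simple logistic regression stands in for a DNN. The outer loop estimates performance on held-out clusters, while the inner loop selects hyperparameters using only the training clusters, so neither compound series nor hyperparameter choice leaks into the test estimate.

```python
# Sketch of nested cluster cross validation (hypothetical stand-in data).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GroupKFold

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=300, n_features=20, random_state=0)
# Hypothetical cluster labels standing in for scaffold clusters.
clusters = rng.integers(0, 10, size=len(y))

outer = GroupKFold(n_splits=5)
outer_aucs = []
for train_idx, test_idx in outer.split(X, y, groups=clusters):
    # Inner loop: hyperparameter selection on the training clusters only.
    best_c, best_auc = None, -np.inf
    inner = GroupKFold(n_splits=3)
    for c in (0.01, 0.1, 1.0):
        scores = []
        for tr, va in inner.split(X[train_idx], y[train_idx],
                                  groups=clusters[train_idx]):
            model = LogisticRegression(C=c, max_iter=1000)
            model.fit(X[train_idx][tr], y[train_idx][tr])
            scores.append(roc_auc_score(
                y[train_idx][va],
                model.predict_proba(X[train_idx][va])[:, 1]))
        if np.mean(scores) > best_auc:
            best_auc, best_c = np.mean(scores), c
    # Refit with the selected hyperparameter and score on held-out clusters.
    model = LogisticRegression(C=best_c, max_iter=1000)
    model.fit(X[train_idx], y[train_idx])
    outer_aucs.append(roc_auc_score(
        y[test_idx], model.predict_proba(X[test_idx])[:, 1]))

print(round(float(np.mean(outer_aucs)), 3))
```

Because `GroupKFold` never places the same cluster in both training and validation, compounds sharing a scaffold cannot appear on both sides of a split.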
ROC-AUC is one of the most common metrics for evaluating binary classification performance (active versus inactive). However, in multi-task learning, a single ROC curve computed across all targets cannot represent the prediction accuracy for individual targets: some tasks may suffer from low predictive power while others display high accuracy. To detect such skewed performance, we used target-AUC, which is the ROC-AUC for each task averaged over the validation folds [22]. Evaluating the mean and variance of target-AUC across targets represents the activity-prediction performance for the respective targets more precisely. Moreover, the robustness of multi-task architectures can be evaluated by counting how many tasks outperform single-task learning [23].
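The target-AUC computation described above can be sketched as below. The data here are synthetic placeholders (random labels with weakly informative scores), not the study's predictions; the point is only the aggregation: compute ROC-AUC per task within each validation fold, average over folds to get one target-AUC per task, then inspect the mean and variance across tasks.

```python
# Sketch of target-AUC: per-task ROC-AUC averaged over validation folds.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
n_samples, n_tasks, n_folds = 200, 4, 5
# Hypothetical multi-task labels and predicted scores (one column per target).
y_true = rng.integers(0, 2, size=(n_samples, n_tasks))
y_score = 0.3 * y_true + rng.random((n_samples, n_tasks))  # weakly informative
fold_id = rng.integers(0, n_folds, size=n_samples)         # fold assignments

target_auc = []
for t in range(n_tasks):
    fold_aucs = [roc_auc_score(y_true[fold_id == f, t],
                               y_score[fold_id == f, t])
                 for f in range(n_folds)]
    target_auc.append(float(np.mean(fold_aucs)))  # average over folds

print(np.round(target_auc, 3))
print("mean:", round(float(np.mean(target_auc)), 3),
      "std:", round(float(np.std(target_auc)), 3))
```

A large spread in `target_auc` across tasks would reveal exactly the skewed performance that a single pooled ROC-AUC hides.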