Assessing model calibration

Andrea Morger; Fredrik Svensson; Staffan Arvidsson McShane; Niharika Gauraha; Ulf Norinder; Ola Spjuth; Andrea Volkamer

This method is extracted from research article: J Cheminform, Apr 2021

Assessing the calibration in toxicological in vitro models with conformal prediction

DOI: 10.1186/s13321-021-00511-5

In a conformal prediction setting, the observed error rate of the predictions is theoretically guaranteed not to exceed the specified significance level, provided that the calibration and test data are exchangeable. Conversely, a deviation between the two values may indicate data drift, or have another cause such as a test set that is too small. The level of calibration can be visualised in a calibration plot, in which the observed error rate (y-axis) is plotted against the significance level (the desired error rate, x-axis). For valid (well-calibrated) models, the values should lie on the diagonal; departures from the diagonal signal departures from perfect calibration. We also include efficiency in the plot, calculated as the fraction of single-class predictions. These plots, hereafter called calibration and efficiency plots (CEPs), were used in this work to assess model calibration and efficiency (see Fig. 2). As a measure of the level of calibration, we use the root-mean-square deviation (RMSD) between the specified significance level and the observed error rate.
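The quantities described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the dictionary layout of the p-values, the convention that a label enters the prediction set when its p-value is strictly greater than the significance level, and the grid of significance levels are all assumptions made for the example.

```python
import math

def prediction_set(p_values, significance):
    """Return the labels whose p-value exceeds the significance level.

    Assumption: a label is included when p > significance.
    """
    return {label for label, p in p_values.items() if p > significance}

def error_rate(p_value_rows, true_labels, significance):
    """Fraction of prediction sets that do not contain the true label."""
    errors = sum(
        true not in prediction_set(row, significance)
        for row, true in zip(p_value_rows, true_labels)
    )
    return errors / len(true_labels)

def efficiency(p_value_rows, significance):
    """Fraction of prediction sets that contain exactly one label."""
    singles = sum(
        len(prediction_set(row, significance)) == 1 for row in p_value_rows
    )
    return singles / len(p_value_rows)

def calibration_rmsd(p_value_rows, true_labels, significance_levels):
    """RMSD between the significance levels and the observed error rates."""
    squared = [
        (error_rate(p_value_rows, true_labels, s) - s) ** 2
        for s in significance_levels
    ]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical p-values for four compounds (class 0: inactive, class 1: active).
rows = [
    {0: 0.80, 1: 0.05},
    {0: 0.30, 1: 0.40},
    {0: 0.02, 1: 0.90},
    {0: 0.15, 1: 0.10},
]
true = [0, 1, 1, 1]

for s in (0.1, 0.2):
    print(s, error_rate(rows, true, s), efficiency(rows, s))
print(calibration_rmsd(rows, true, [0.1, 0.2]))
```

Plotting `error_rate` and `efficiency` against a dense grid of significance levels would give the two curves of a CEP; with only four compounds the toy numbers here are not meaningful as a calibration check.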

Fig. 2. Calibration and efficiency plot. The dark lines show the mean error rate for the active (dark red) and inactive (dark blue) compounds. For a well-calibrated model, the error rate ideally follows the dashed diagonal line. The light-coloured lines show the mean efficiency, expressed as the fraction of single-label prediction sets, for the active (light red) and inactive (light blue) compounds. The shaded areas indicate the respective standard deviations within the fivefold CV. Class 0: inactive compounds; class 1: active compounds.
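The class-wise curves and shaded bands in such a plot can be derived as in the following sketch. It is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, a prediction counts as an error when the p-value of the true class is not greater than the significance level, and the population standard deviation is used because the source does not say which estimator was applied.

```python
from statistics import mean, pstdev

def class_error_rate(p_value_rows, true_labels, significance, cls):
    """Error rate among compounds whose true label is `cls`.

    Assumption: an error occurs when the p-value of the true class
    is not greater than the significance level.
    """
    members = [row for row, true in zip(p_value_rows, true_labels) if true == cls]
    if not members:
        raise ValueError(f"no compounds with true label {cls!r}")
    errors = sum(row[cls] <= significance for row in members)
    return errors / len(members)

def summarise_folds(fold_values):
    """Mean and (population) standard deviation across CV folds."""
    return mean(fold_values), pstdev(fold_values)

# Hypothetical error rates of the active class at one significance level,
# one value per fold of a fivefold CV.
fold_error_rates = [0.10, 0.12, 0.08, 0.11, 0.09]
centre, spread = summarise_folds(fold_error_rates)
print(centre, spread)
```

Repeating this for every significance level and for each class gives the mean curve (`centre`) and the band (`centre ± spread`) for the active and inactive compounds.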

Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
