Model evaluation system

Catherine L. Lawson, Andriy Kryshtafovych, Paul D. Adams, Pavel V. Afonine, Matthew L. Baker, Benjamin A. Barad, Paul Bond, Tom Burnley, Renzhi Cao, Jianlin Cheng, Grzegorz Chojnowski, Kevin Cowtan, Ken A. Dill, Frank DiMaio, Daniel P. Farrell, James S. Fraser, Mark A. Herzik Jr, Soon Wen Hoh, Jie Hou, Li-Wei Hung, Maxim Igaev, Agnel P. Joseph, Daisuke Kihara, Dilip Kumar, Sumit Mittal, Bohdan Monastyrskyy, Mateusz Olek, Colin M. Palmer, Ardan Patwardhan, Alberto Perez, Jonas Pfab, Grigore D. Pintilie, Jane S. Richardson, Peter B. Rosenthal, Daipayan Sarkar, Luisa U. Schäfer, Michael F. Schmid, Gunnar F. Schröder, Mrinal Shekhar, Dong Si, Abishek Singharoy, Genki Terashi, Thomas C. Terwilliger, Andrea Vaiana, Liguo Wang, Zhe Wang, Stephanie A. Wankowicz, Christopher J. Williams, Martyn Winn, Tianqi Wu

The evaluation system for the 2019 challenge (model-compare.emdataresource.org) was built on the basis of the 2016/2017 Model Challenge system (ref. 11), updated with several additional evaluation measures and analysis tools. Submitted models were evaluated for >70 individual metrics in four tracks: Fit-to-Map, Coordinates-only, Comparison-to-Reference and Comparison-among-Models. A detailed description of the updated infrastructure and of each calculated metric is provided as a help document on the model evaluation system website. Result data are archived at Zenodo (ref. 54). Analysis software versions/websites are listed in the Nature Research Reporting Summary.

For brevity, a representative subset of the metrics from the evaluation website is discussed in this paper. The selected metrics are listed in Table 2 and are further described below. All scores were calculated according to package instructions using default parameters.

The evaluated metrics included several ways to measure the correlation between map and model density, as implemented in TEMPy (refs. 16-18) v.1.1 (CCC, CCC_OV, SMOC, LAP, MI, MI_OV) and the Phenix (ref. 21) v.1.15.2 map_model_cc module (ref. 19) (CCbox, CCpeaks, CCmask). These methods compare the experimental map with a model-derived map produced on the same voxel grid, integrated either over the full map or over selected masked regions. The model-derived map is generated to a specified resolution limit by inverting Fourier terms calculated from coordinates, B factors and atomic scattering factors. Some measures compare density-derived functions instead of the density itself (MI, LAP; ref. 16).
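To illustrate the common core of these scores, the sketch below computes a mean-subtracted cross-correlation between two maps sampled on the same voxel grid, optionally restricted to a mask. It is a minimal schematic of the idea, not the TEMPy or Phenix implementation; the function names are illustrative, and the overlap helper is a simplified stand-in for threshold-based variants such as CCC_OV.

```python
import numpy as np

def cross_correlation(exp_map, model_map, mask=None):
    """Mean-subtracted cross-correlation between two density maps sampled
    on the same voxel grid, optionally restricted to a boolean mask."""
    if mask is not None:
        exp_map, model_map = exp_map[mask], model_map[mask]
    a = exp_map - exp_map.mean()
    b = model_map - model_map.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

def overlap_mask(exp_map, model_map, exp_threshold, model_threshold):
    """Voxels above a density threshold in both maps; a simplified
    stand-in for overlap-restricted variants such as CCC_OV."""
    return (exp_map >= exp_threshold) & (model_map >= model_threshold)
```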

The Q-score (MAPQ v.1.2 (ref. 8) plugin for UCSF Chimera (ref. 38) v.1.11) uses a real-space correlation approach to assess the resolvability of each model atom in the map. Experimental map density is compared to a Gaussian placed at each atom position, omitting regions that overlap with other atoms. The score is calibrated by the reference Gaussian, which is formulated so that a maximum score of 1 is achieved by a well-resolved atom in a map at approximately 1.5 Å resolution. Lower scores (down to −1) are given to atoms as their resolvability and the resolution of the map decrease. The overall Q-score is the average value over all model atoms.
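The following sketch conveys the shape of this calculation: map density is sampled on radial shells around an atom and correlated with a reference Gaussian profile. It is a schematic under simplifying assumptions (no masking of overlapping atoms, an illustrative sigma, grid axes assumed in coordinate order) and is not the MAPQ implementation.

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

def q_score_like(grid, origin, voxel, atom_xyz, sigma=0.6, n_shells=8, max_r=2.0):
    """Schematic Q-score: correlate mean map density on radial shells
    around an atom with a reference Gaussian profile of width sigma."""
    axes = [origin[i] + voxel * np.arange(grid.shape[i]) for i in range(3)]
    interp = RegularGridInterpolator(axes, grid, bounds_error=False, fill_value=0.0)
    radii = np.linspace(0.0, max_r, n_shells)
    rng = np.random.default_rng(0)
    obs = []
    for r in radii:
        v = rng.normal(size=(64, 3))                    # random shell directions
        pts = atom_xyz + r * v / np.linalg.norm(v, axis=1, keepdims=True)
        obs.append(interp(pts).mean())                  # mean density on the shell
    obs = np.asarray(obs)
    ref = np.exp(-radii**2 / (2.0 * sigma**2))          # reference Gaussian profile
    obs = (obs - obs.mean()) / obs.std()
    ref = (ref - ref.mean()) / ref.std()
    return float((obs * ref).mean())                    # Pearson r, in [-1, 1]
```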

Measures based on the map-model FSC curve, atom inclusion and protein sidechain rotamers were also compared. The Phenix map-model FSC is calculated using a soft mask and is evaluated at FSC = 0.5 (ref. 19). REFMAC FSCavg (ref. 13), a module of CCP-EM (ref. 42), integrates the area under the map-model FSC curve up to a specified resolution limit. EMDB Atom Inclusion determines the percentage of atoms inside the map at a specified density threshold (ref. 14). TEMPy ENV is also threshold-based and penalizes unmodeled regions (ref. 16). EMRinger (a module of Phenix) evaluates backbone positioning by measuring map density peak positions for unbranched protein Cγ atoms along ring paths around Cɑ–Cβ bonds (ref. 15).
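For reference, a minimal FSC calculation is sketched below, assuming a cubic grid with identical sampling for both maps. Masking (soft or otherwise) is omitted, so this is not a substitute for the Phenix or REFMAC implementations.

```python
import numpy as np

def fsc_curve(map1, map2, voxel):
    """Fourier shell correlation between two maps on the same cubic grid.
    Returns spatial frequency (1/Å) and FSC per shell; no masking applied."""
    n = map1.shape[0]
    f1, f2 = np.fft.fftn(map1), np.fft.fftn(map2)
    freq = np.fft.fftfreq(n, d=voxel)
    fx, fy, fz = np.meshgrid(freq, freq, freq, indexing="ij")
    r = np.sqrt(fx**2 + fy**2 + fz**2)
    shell = np.minimum((r / freq[1]).astype(int), n // 2)  # shell index per voxel
    num = np.bincount(shell.ravel(), weights=(f1 * np.conj(f2)).real.ravel())
    d1 = np.bincount(shell.ravel(), weights=(np.abs(f1) ** 2).ravel())
    d2 = np.bincount(shell.ravel(), weights=(np.abs(f2) ** 2).ravel())
    fsc = num / np.sqrt(d1 * d2)
    s = freq[1] * np.arange(len(fsc))
    return s[: n // 2], fsc[: n // 2]

# FSC = 0.5 resolution: reciprocal of the frequency where the curve first
# drops below 0.5; an FSCavg-style summary instead averages FSC over the
# shells up to a chosen resolution limit.
```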

Standard measures assessed local configuration (bonds, bond angles, chirality, planarity and dihedral angles; Phenix model statistics module), protein backbone conformation (MolProbity Ramachandran outliers (ref. 20); Phenix molprobity module), sidechain conformations and clashes (MolProbity rotamer outliers and Clashscore (ref. 20); Phenix molprobity module).

New in this challenge round is CaBLAM (ref. 22) (part of MolProbity and available as the Phenix cablam module), which uses two procedures to evaluate protein backbone conformation. In both cases, a pair of virtual dihedrals is evaluated for each protein residue i using the Cɑ positions of residues i − 2 to i + 2. To define CaBLAM outliers, the third parameter is the virtual dihedral between the CO groups flanking residue i. To define Calpha-geometry outliers, the third parameter is the Cɑ virtual angle at i. The residue is then scored according to the frequency of its virtual-parameter triplet in a large set of high-quality models from the PDB (ref. 22).
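A virtual dihedral of this kind is the ordinary four-point dihedral applied to Cɑ positions rather than bonded atoms. The sketch below (standard geometry, not the MolProbity code; inputs are assumed to be NumPy coordinate arrays) shows the calculation:

```python
import numpy as np

def virtual_dihedral(p0, p1, p2, p3):
    """Dihedral angle (degrees) defined by four points, e.g., four
    consecutive Cɑ positions for a CaBLAM-style virtual dihedral."""
    b0, b1, b2 = p1 - p0, p2 - p1, p3 - p2
    b1 = b1 / np.linalg.norm(b1)
    v = b0 - np.dot(b0, b1) * b1        # component of b0 perpendicular to b1
    w = b2 - np.dot(b2, b1) * b1        # component of b2 perpendicular to b1
    return np.degrees(np.arctan2(np.dot(np.cross(b1, v), w), np.dot(v, w)))

# For residue i, the two Cɑ virtual dihedrals use the windows
# Cɑ(i-2)..Cɑ(i+1) and Cɑ(i-1)..Cɑ(i+2); the resulting parameter triplet
# is looked up in frequency tables derived from high-quality PDB models.
```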

To assess the similarity of each model to a reference structure and the similarity among submitted models, we used metrics based on atom superposition (LGA v.04.2019: GDT-TS, GDC and GDC-SC scores (ref. 23)), interatomic distances (LDDT v.1.2 (ref. 24)) and contact area differences (CAD v.1646 (ref. 26)). HBPLUS (ref. 50) was used to calculate nonlocal hydrogen bond precision, defined as the fraction of correctly placed hydrogen bonds between residues separated by more than six positions in sequence (HBPR > 6). DAVIS-QA determines, for each model, the average of its pairwise GDT-TS scores against all other models (ref. 27).
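The DAVIS-QA consensus step reduces to a simple averaging, sketched below under the assumption that an N × N matrix of pairwise GDT-TS scores has already been computed (the function name is illustrative):

```python
import numpy as np

def davis_qa_scores(gdt_ts):
    """DAVIS-QA-style consensus: for each model, the mean pairwise GDT-TS
    against all other models. `gdt_ts` is an N x N symmetric matrix of
    pairwise scores; the diagonal (self-comparison) is excluded."""
    n = gdt_ts.shape[0]
    off_diagonal = gdt_ts * (1.0 - np.eye(n))
    return off_diagonal.sum(axis=1) / (n - 1)
```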

Residue-level visualization tools for comparing the submitted models were also provided for the following metrics: Fit-to-Map: Phenix CCbox, TEMPy SMOC, Q-score, EMRinger and EMDB Atom Inclusion; Comparison-to-Reference: LGA and LDDT; Comparison-among-Models: DAVIS-QA.
