3.3. Estimating the Effects of Mutations on Protein Stability

Feiyang Zhao, Lei Zheng, Alexander Goncearenco, Anna R. Panchenko, Minghui Li

The number of candidate cancer driver mutations can be reduced considerably by determining the functional impact of each mutation on its corresponding protein. Protein stability may directly relate to functional activity, and changes in stability or incorrect folding could be major consequences of pathogenic missense mutations. It was previously shown that missense mutations destabilize tumor suppressors significantly more than SNPs do, but the same effect was not observed for oncogenes [83]. In most cases, missense mutations are deleterious because they decrease the stability of the corresponding protein [67,84]. For example, oncogenic mutations disrupt Casitas B-lineage lymphoma (CBL) function by decreasing the stability of CBL proteins [85]. Six mutations in the tumor suppressor gene phosphatase and tensin homolog (PTEN), found in patients with PHTS-associated cancer, show a global decrease in structural stability and increased dynamics across the domain interface [86]. In other cases, missense mutations may cause disease by increasing the stability of the corresponding protein [87].

Computational methods that accurately predict the effects of variations on protein stability may help to identify functionally important mutations. Typically, the magnitude of a mutation's effect on stability is quantified by the change in unfolding free energy, ∆∆Gfold. The ProTherm database is a collection of thermodynamic parameters for wild-type and mutant proteins [27]. It includes changes in unfolding Gibbs free energy, enthalpy, and heat capacity, among other quantities, which provide important clues for understanding the relationship among structure, stability and function of proteins and their mutants. The database also records the experimental conditions and methods used to measure these data, and it is frequently used as a source of training data for developing the in silico prediction methods discussed in this section (Table 1).
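As a reference for the quantity discussed above, ∆∆Gfold can be written as the difference between the mutant and wild-type unfolding free energies. The notation below is a sketch under an assumed sign convention (unfolding free energy, mutant minus wild type, so that negative values indicate destabilization); databases and tools differ in the convention they use, so the sign should be checked for each source before values are combined.

```latex
\Delta G_{\mathrm{unfold}} = G_{\mathrm{unfolded}} - G_{\mathrm{folded}},
\qquad
\Delta\Delta G_{\mathrm{fold}}
  = \Delta G_{\mathrm{unfold}}^{\mathrm{mut}} - \Delta G_{\mathrm{unfold}}^{\mathrm{wt}}
```

Under this assumed convention, a mutation with ∆∆Gfold < 0 lowers the free energy cost of unfolding and is therefore destabilizing.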

Table 2 lists major computational approaches and tools for predicting quantitative changes in unfolding free energy in response to mutations. They differ in the algorithms used for training models, the procedures used for optimization and sampling of protein conformations, and the terms of their energy functions. Energy functions range from physics-based force fields to knowledge-based potentials and may combine different structure-based or sequence-based physicochemical properties of amino acids. In addition, some methods take into account experimental conditions, such as salt concentration, pH and temperature, which are important for assessing the free energy at near-physiological conditions. For example, FoldX uses an empirical force field to evaluate the effects of mutations on stability, folding and dynamics in proteins and DNA [37]. One of the core functionalities of FoldX is the calculation of the unfolding free energy of a macromolecule based on its 3D structure. Its energy function is parametrized on experimental changes of unfolding free energy. FoldX is distributed as a software package that runs easily on Linux and allows users to process large datasets. It has become a standard tool for predicting the effects of both single and multiple mutations on protein stability. SAAFEC uses a weighted MM-PBSA (Molecular Mechanics Poisson-Boltzmann Surface Area) approach together with various biophysical terms parametrized on thousands of experimental values [38]. Its energy terms are calculated using minimized wild-type and mutant structures. SAAFEC can also add residues that are missing from the 3D structures.
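The idea of an empirical energy function with weighted terms can be illustrated with a minimal sketch. The term names, weights, and values below are hypothetical and chosen only for illustration; they are not the terms or parameters of FoldX, SAAFEC, or any other tool. The sketch assumes a folding-energy-like score in which a positive difference (mutant minus wild type) means the mutant is scored as less stable.

```python
# Hypothetical weights for illustration only; real tools fit their own
# terms and weights to experimental free energy changes.
EXAMPLE_WEIGHTS = {
    "van_der_waals": 0.5,
    "solvation": 1.0,
    "hydrogen_bond": 1.0,
}


def weighted_energy(terms, weights=EXAMPLE_WEIGHTS):
    """Return the weighted sum of energy terms (kcal/mol) for one structure."""
    return sum(weights[name] * value for name, value in terms.items())


def predicted_ddg(wild_type_terms, mutant_terms, weights=EXAMPLE_WEIGHTS):
    """Score difference, mutant minus wild type.

    Assumed convention: positive means the mutant is scored as less stable.
    """
    return weighted_energy(mutant_terms, weights) - weighted_energy(
        wild_type_terms, weights
    )


if __name__ == "__main__":
    # Made-up term values for a wild-type and a mutant structure.
    wild_type = {"van_der_waals": -10.0, "solvation": 4.0, "hydrogen_bond": -6.0}
    mutant = {"van_der_waals": -8.0, "solvation": 4.5, "hydrogen_bond": -5.0}
    print(f"predicted ddG: {predicted_ddg(wild_type, mutant):.2f} kcal/mol")
```

In a parametrized method, the weights would be fitted so that predicted differences reproduce experimental values for a training set, which is why mutations used for training should be excluded when performance is assessed.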

The majority of the above-mentioned methods require the coordinates of protein 3D structures as input. Prediction accuracy can be influenced by different factors, including protein class and structural flexibility, the types of the wild-type and substituted amino acids, and the structural environment of the substituted site. The performance of these predictors was assessed and compared in different studies using datasets of experimentally characterized mutants [88,89,90,91,92]. In the first study [92], the performance of six different methods was evaluated on a large set of 2156 single mutations, and the mutations used for training each model were excluded. The following performance ranking was reported: EGAD > CC/PBSA > I-Mutant2.0 > FoldX > Hunter > Rosetta, with correlation coefficients between predicted and experimental ΔΔG values ranging from 0.59 to 0.26 and standard deviations ranging from 0.95 to 2.32 kcal mol−1. However, the two top-performing servers, EGAD and CC/PBSA, are no longer available. In the second study [91], 11 online stability predictors (CUPSAT, Dmutant, FoldX, I-Mutant2.0, two versions of I-Mutant3.0 (sequence and structure versions), MultiMutate, MUpro, SCide, Scpred, and SRide) were compared in a systematic analysis of 1784 single mutations, excluding those used for training each program. I-Mutant3.0, Dmutant, and FoldX were found to be the most reliable predictors. Furthermore, Kepp evaluated the relative performance of these methods by calculating the stability changes of SOD1 and myoglobin variants [89,90]. Five methods, CUPSAT, I-Mutant2.0, I-Mutant3.0, PoPMuSiC and SDM, were tested on 54 SOD1 mutations. The results showed that PoPMuSiC was the most accurate approach, with a correlation coefficient R ~ 0.5 and a mean absolute error (MAE) ~ 1.0 kcal mol−1, followed by I-Mutant. Kumar et al. extended this study of SOD1 stability changes upon mutation using three different structures and four additional protein stability predictors (PoPMuSiC 3.1, FoldX, mCSM and ENCoM) [88]. Overall, PoPMuSiC and FoldX were found to be the best methods.
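The benchmark statistics quoted above (correlation coefficient and MAE between predicted and experimental ΔΔG, computed after excluding training mutations) can be sketched as follows. This is a minimal illustration, not the procedure of any cited study: the function names, the dictionary-based input format, and the mutation identifiers are all assumptions made for the example.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def mean_absolute_error(predicted, experimental):
    """Mean absolute difference between predicted and experimental values."""
    return sum(abs(p - e) for p, e in zip(predicted, experimental)) / len(predicted)


def evaluate(predictions, experiments, training_ids=frozenset()):
    """Compare predicted and experimental ddG values (kcal/mol).

    predictions, experiments: dicts mapping a mutation id to a ddG value.
    training_ids: mutation ids used to train the predictor; these are
    excluded so that the assessment uses unseen mutations only.
    """
    ids = sorted(
        m for m in predictions if m in experiments and m not in training_ids
    )
    pred = [predictions[m] for m in ids]
    expt = [experiments[m] for m in ids]
    return {
        "n": len(ids),
        "pearson_r": pearson_r(pred, expt),
        "mae": mean_absolute_error(pred, expt),
    }


if __name__ == "__main__":
    # Made-up values with placeholder mutation ids, for illustration only.
    predictions = {"mut1": 1.0, "mut2": 2.0, "mut3": 3.0, "mut4": 10.0}
    experiments = {"mut1": 2.0, "mut2": 4.0, "mut3": 6.0, "mut4": 0.0}
    print(evaluate(predictions, experiments, training_ids={"mut4"}))
```

Both metrics are worth reporting together: a predictor can rank mutations well (high R) while being systematically offset in magnitude (high MAE), and the reverse is also possible.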
