PRODIGY: A Contact-based Predictor of Binding Affinity in Protein-protein Complexes




Aug 2015



Biomolecular interactions between proteins regulate and control almost every biological process in the cell. Understanding these interactions is therefore a crucial step in the investigation of biological systems and in drug design. Many efforts have been devoted to unravelling the principles of protein-protein interactions. Recently, we introduced a simple but robust descriptor of binding affinity based only on structural properties of a protein-protein complex. In Vangone and Bonvin (2015), we demonstrated that the number of contacts at the interface of a protein-protein complex correlates with the experimental binding affinity. Our findings have led to one of the best-performing predictors reported so far (Pearson’s correlation r = 0.73; RMSE = 1.89 kcal mol⁻¹). Despite the importance of the topic, there are surprisingly few online tools for fast and easy prediction of binding affinity. For this reason, we implemented our predictor in the user-friendly PRODIGY web-server. In this protocol, we explain how to use the PRODIGY web-server to predict the affinity of a protein-protein complex from its three-dimensional structure. The PRODIGY server is freely available at:

Keywords: Protein contacts, Buried surface area, Web-server, Prediction, Protein interface, Kd, Protein-protein interactions, PPIs


Interactions between biomolecules regulate and control almost every biological process in the cell. Studying and understanding these interactions is therefore a crucial step in the investigation of biological systems and in drug design. Many efforts have been devoted to unravelling the principles of protein-protein interactions. For this purpose, we introduced a simple but robust descriptor of binding affinity based only on structural properties, mainly intermolecular contacts, of a protein-protein complex (Vangone and Bonvin, 2015). This approach led to the best predictor reported so far. Recently, we implemented our method in the PRODIGY web-server (Xue et al., 2016), an online tool that predicts the binding affinity of a protein-protein complex from its three-dimensional structure. PRODIGY reports the binding affinity both as Gibbs free energy (ΔG, kcal mol⁻¹) and as dissociation constant (Kd, M). PRODIGY predicts the binding affinity using the formula reported in Vangone and Bonvin (2015): it counts the number of Interatomic Contacts (ICs) made at the interface of a protein-protein complex within a 5.5 Å distance threshold, and classifies them according to the polar/apolar/charged character of the interacting amino acids. This information is then combined with properties of the Non-Interacting Surface (NIS), which we have previously shown to influence the binding affinity (Kastritis et al., 2011). For training and testing, we used the binding affinity benchmark of protein-protein complexes published in Kastritis and Bonvin (2010). A recently updated version of this benchmark is described in Vreven et al. (2015).

Further information about the benchmark, the prediction model and its accuracy can be found online on the ‘Dataset’ and ‘Method’ pages of the PRODIGY web-server, respectively.
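To make the contact-counting step described above more concrete, here is a minimal Python sketch. It is not the PRODIGY implementation. It assumes that atoms have already been parsed into (chain, residue name, residue number, x, y, z) tuples, that a residue pair counts as one contact when any two of their atoms lie within 5.5 Å, and that residues are grouped into charged/polar/apolar classes as in the illustrative sets below. The grouping, the function names and the coordinates are all assumptions made for this example; the classification actually used by PRODIGY is given in Vangone and Bonvin (2015).

```python
import math
from collections import Counter

CUTOFF = 5.5  # distance threshold in Å, as stated in the text

# Illustrative residue classes (an assumption, not taken from the protocol).
CHARGED = {"ARG", "LYS", "ASP", "GLU", "HIS"}
POLAR = {"SER", "THR", "ASN", "GLN"}


def residue_class(resname):
    """Return 'charged', 'polar' or 'apolar' for a three-letter residue name."""
    if resname in CHARGED:
        return "charged"
    if resname in POLAR:
        return "polar"
    return "apolar"


def count_contacts(atoms, chains_1, chains_2, cutoff=CUTOFF):
    """Count residue-residue contacts between two interactors, per class pair.

    atoms: iterable of (chain, resname, resnum, x, y, z) tuples.
    chains_1, chains_2: sets of chain IDs defining the two interactors.
    """
    side_1 = [a for a in atoms if a[0] in chains_1]
    side_2 = [a for a in atoms if a[0] in chains_2]

    # A set ensures each residue pair is counted once, however many
    # atom pairs fall within the cutoff.
    contacts = set()
    for c1, n1, i1, *xyz1 in side_1:
        for c2, n2, i2, *xyz2 in side_2:
            if math.dist(xyz1, xyz2) <= cutoff:
                contacts.add(((c1, n1, i1), (c2, n2, i2)))

    tally = Counter()
    for (_, n1, _), (_, n2, _) in contacts:
        pair = "/".join(sorted((residue_class(n1), residue_class(n2))))
        tally[pair] += 1
    return tally


# Toy coordinates (hypothetical), using the chain IDs of the 1E6J example.
atoms = [
    ("P", "THR", 210, 0.0, 0.0, 0.0),
    ("H", "THR", 33, 4.0, 0.0, 0.0),  # 4.0 Å from P THR 210: contact
    ("H", "THR", 33, 5.0, 0.0, 0.0),  # same residue pair, counted once
    ("L", "ASP", 50, 0.0, 5.0, 0.0),  # 5.0 Å from P THR 210: contact
    ("L", "LEU", 94, 0.0, 9.0, 0.0),  # 9.0 Å away: no contact
]
tally = count_contacts(atoms, {"P"}, {"L", "H"})
print(dict(tally))
```

The nested loop is quadratic in the number of atoms, which is acceptable for a sketch; a real implementation would typically use a spatial index to restrict the search to nearby atoms.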


  1. A computer with internet access


  1. A web browser (the PRODIGY server has been tested successfully on Chrome, Firefox and Safari)
    PRODIGY web server address:
  2. Software repositories for running a local version (not described in this protocol) under a Linux or MacOSX operating system:
    1. PRODIGY repository
    2. freeSASA


  1. The software
    1. Technical description
      PRODIGY is made freely available to the scientific community either as standalone software (https://github.com/haddocking/binding_affinity), which can be used locally on a desktop computer, or more conveniently as an online web-server, whose usage is explained in this protocol. The PRODIGY software consists of a collection of Python scripts, a few Perl scripts to handle the online submission, and the open-source tool freeSASA (Mitternacht, 2016), which is used to calculate the solvent accessible surface area with the default NACCESS (Hubbard and Thornton, 1993) parameters for atomic radii.
    2. Data requirement
      1. Input file – mandatory
        The main input required for a binding affinity prediction is a text file containing the atomic coordinates that describe the 3D structure of the protein-protein complex (or ensemble of complexes). These can either be experimental structures, solved e.g., by X-ray crystallography or NMR spectroscopy and obtained from the worldwide Protein Data Bank (wwPDB; Berman et al., 2003), or structures modeled through computational approaches, e.g., by homology modelling or docking. The 3D coordinates should be provided to the server in PDB or mmCIF format.
        The input structure/structures can be provided in different ways:
        1. By uploading a PDB or mmCIF file.
        2. By providing a PDB code for automatic retrieval from the wwPDB.
        3. By uploading an archive file (.tar, .tgz, .zip, .bz2 or .tar.gz) containing multiple PDB/mmCIF files. This option allows the submission of a single file when many structures have to be analyzed (e.g., models derived from docking simulations).
      2. Chains – mandatory
        It is necessary to specify the chain identifiers of the molecules involved in the interaction. If one (or both) of the interacting molecules consists of multiple chains at the interface, all of them have to be provided, separated by commas.
      3. Temperature – optional
        The user can specify at which temperature to perform the calculation of the dissociation constant (Kd). If nothing is specified, PRODIGY will use 25 °C by default.
      4. Job name – optional
        If provided, the job name will be used to identify your run. Otherwise a random label will be assigned.
      5. Email – optional
        If an email is provided, a link to the results will be sent when the job has completed.
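The role of the temperature parameter can be illustrated with a short Python sketch. It assumes the standard thermodynamic relation ΔG = RT ln(Kd), which is consistent with the statement that the temperature affects only the dissociation constant; the function names are hypothetical and this is not code taken from PRODIGY.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal mol^-1 K^-1


def kd_from_dg(dg_kcal_per_mol, temp_celsius=25.0):
    """Convert a binding free energy (kcal/mol) into a dissociation constant (M)."""
    temp_kelvin = temp_celsius + 273.15
    return math.exp(dg_kcal_per_mol / (R_KCAL * temp_kelvin))


def dg_from_kd(kd_molar, temp_celsius=25.0):
    """Convert a dissociation constant (M) into a binding free energy (kcal/mol)."""
    temp_kelvin = temp_celsius + 273.15
    return R_KCAL * temp_kelvin * math.log(kd_molar)


# ΔG of -9.1 kcal/mol, the value predicted for the 1E6J example in this protocol.
kd_25 = kd_from_dg(-9.1, temp_celsius=25.0)
kd_37 = kd_from_dg(-9.1, temp_celsius=37.0)
print(f"Kd at 25 °C: {kd_25:.1e} M")  # prints "Kd at 25 °C: 2.1e-07 M"
print(f"Kd at 37 °C: {kd_37:.1e} M")
```

For a fixed negative ΔG, a higher temperature gives a larger Kd, which is why changing the temperature on the input page changes the reported Kd but not the reported ΔG.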

  2. How to use PRODIGY web-server
    1. Submitting a prediction
      Here we describe the process of submitting a prediction run to the PRODIGY web-server. As an example, we will use the protein-protein complex between an antibody (FAB) and the HIV-1 capsid protein p24, which is available in the Protein Data Bank (PDB) under the accession code ‘1E6J’.
      1. Open an internet browser and go to
      2. Fill in the PRODIGY input page (Figure 1):
        1. Insert the PDB code ‘1E6J’ into the ‘Structure’ box for automatic retrieval from the wwPDB.
        2. The complex 1E6J is made of 3 chains: P (corresponding to HIV-1 capsid protein p24) and L and H (corresponding to the FAB). Considering that in this protocol we want to investigate the binding affinity at the interface between the antibody (chains L + H) and the antigen (chain P), you will need to insert:
          Interactor 1 ID_chain(s): P
          Interactor 2 ID_chain(s): L, H

          Figure 1. Example view of an input page of the PRODIGY web-server.

      3. Personalize your job by defining some (optional) parameters if needed:
        1. In the box related to the temperature, change the 25 °C default if needed (note that this only affects the calculation of the dissociation constant and not the binding affinity ΔG).
        2. Give a name to your prediction run. No spaces or special characters other than ‘-’ or ‘_’ are allowed. For this example we will name our run ‘1E6J_prediction’.
        3. Add your email address to be notified when your job is done and receive the link to the results page.
      4. We are now ready to send the prediction to PRODIGY: Click on the Submit button at the bottom of the page.
      5. A prediction usually takes no more than a few minutes, after which you will be redirected to the result page. If an email address has been provided (see the email option under step 3 above), you will be notified when the prediction is complete and receive a link to the results page. Please note that the results are only stored for 2 weeks.
    2. The result page
      The result page is organized in three sections, reporting different information:
      1. Binding affinity and Kd prediction
        The identifier of your complex, which contains the PDB code of the retrieved file (or the name of the input file you uploaded), is reported together with the predicted ΔG (in kcal mol⁻¹) and Kd (in M) values at the given temperature. In this example, a ΔG of -9.1 kcal mol⁻¹ has been predicted, corresponding to a Kd of 2.1e-07 M at 25 °C.
      2. Prediction details
        1. The number of ICs calculated within a threshold of 5.5 Å and the % NIS, each classified according to the charged/polar/apolar character of the amino acids, are reported. In this example, there are 7 ICs between charged and polar residues and the % NIS of charged atoms is 20.48.
        2. In addition, the full table of ICs (in .txt format) is provided and can be viewed by clicking on the link under ‘Table of the ICs at the interface’. The table has the following format:
          #chain1 #aa1 #res_num1 #chain2 #aa2 #res_num2
          H THR 33 P THR 210
          where the chain ID, residue type and residue number are reported for each of the two interacting residues, one in Protein 1 and one in Protein 2.
      3. Download outputs
        From this collapsible menu, you can download a ready-to-run PyMOL script (.pml) that highlights the interaction interface by displaying and coloring the interacting residues (see Figure 2). You can also download a compressed file (.tgz) containing all the result files.

        Figure 2. A three-dimensional representation of the complex 1E6J with the color-coding of the PRODIGY script (.pml). This script can be downloaded from the PRODIGY output page. Interactor 1 is shown in light pink (chains L and H in this example) and Interactor 2 in light blue (chain P). The interacting residues are shown as sticks, in blue and dark pink for Interactor 1 and Interactor 2, respectively.
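The table of ICs is plain text and is easy to post-process. The following Python sketch assumes the six whitespace-separated columns shown in the result page description (chain, residue type and residue number for each partner) and a header line starting with ‘#’. Only the first data row comes from this protocol; the other rows, the `Contact` tuple and the function name are hypothetical.

```python
from collections import Counter, namedtuple

Contact = namedtuple("Contact", "chain1 aa1 res_num1 chain2 aa2 res_num2")


def parse_ic_table(text):
    """Parse the contacts table into a list of Contact tuples.

    Residue numbers are kept as strings so that numbers carrying an
    insertion code are not rejected.
    """
    contacts = []
    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip the header line
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank or malformed lines
        contacts.append(Contact(*fields))
    return contacts


example_table = """\
#chain1 #aa1 #res_num1 #chain2 #aa2 #res_num2
H\tTHR\t33\tP\tTHR\t210
H\tTYR\t52\tP\tGLU\t212
L\tSER\t91\tP\tALA\t208
"""

contacts = parse_ic_table(example_table)
per_chain = Counter(c.chain1 for c in contacts)
print(len(contacts), dict(per_chain))
```

Tallying by `chain1`, as done here, is one way to see how many contacts each antibody chain contributes to the interface.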

  3. Useful information
    1. Make sure to check and enter the correct chain IDs for the PDB file that you are uploading or retrieving: the chain IDs have to be present in the file and must correspond to the chains that are interacting. In this example, the FAB has two chains, labeled L and H, and both of them interact with the HIV-1 capsid protein, which is labeled as chain P.
    2. PRODIGY can deal with files consisting of an ensemble of structures (as is typical for NMR structures, for example). In the current implementation, only the first model is considered for prediction. If you wish to analyze every model present in such an ensemble, you should split the PDB file into single-model PDB files and submit them all as an archive file. A collection of useful Python scripts for the manipulation of PDB files, such as splitting an ensemble file, renumbering residues, changing chain IDs and so on, can be found in our freely available pdb-tools GitHub repository.
    3. The PRODIGY web-server currently only supports the 20 canonical amino acids.
    4. Information about the web-server input/output, the prediction method and its performance, and the dataset used for training/testing the method can be found online on the Manual, Method and Dataset pages of PRODIGY, respectively. These are reachable through the corresponding tabs located at the top of each page.
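The splitting of an ensemble file mentioned above can be sketched in Python as follows. This is an illustration only and not one of the pdb-tools scripts: it assumes a PDB-format file in which each model is delimited by MODEL and ENDMDL records, and the function names, file prefix and toy coordinates are hypothetical.

```python
import io
import tarfile


def split_models(pdb_text):
    """Return one PDB-format string per MODEL ... ENDMDL block.

    If the text contains no MODEL records, it is returned unchanged
    as a single-element list.
    """
    models, current, in_model = [], [], False
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "MODEL":
            in_model, current = True, []
        elif record == "ENDMDL":
            if in_model:
                models.append("\n".join(current + ["END"]) + "\n")
            in_model = False
        elif in_model:
            current.append(line)
    return models or [pdb_text]


def write_archive(models, archive_path, prefix="model"):
    """Write each model as <prefix>_<n>.pdb into a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for number, model in enumerate(models, start=1):
            data = model.encode("utf-8")
            info = tarfile.TarInfo(name=f"{prefix}_{number}.pdb")
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))


# Toy two-model ensemble (hypothetical coordinates).
ensemble = """\
MODEL        1
ATOM      1  N   THR A   1      11.104   6.134  -6.504  1.00  0.00           N
ENDMDL
MODEL        2
ATOM      1  N   THR A   1      11.639   6.071  -5.147  1.00  0.00           N
ENDMDL
END
"""

models = split_models(ensemble)
print(len(models))  # prints 2
```

Calling `write_archive(models, "ensemble_models.tgz")` would then produce a single archive in one of the formats accepted by the upload option.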

  4. Distribution/Software download
    The PRODIGY web-server is made freely available to the scientific community. The prediction scripts are also available for local setup and usage from our GitHub repository at https://github.com/haddocking/binding_affinity.
    The collection of software developed by the HADDOCK group can be found at:
    The freeSASA software (Mitternacht, 2016) used to calculate the solvent accessible surface area can be downloaded from


To run the PyMOL script (.pml) provided by PRODIGY (see ‘Download outputs’ on the result page), open a PyMOL session, load the structure that you submitted to PRODIGY, and use one of the following options:

  1. From the PyMOL menu bar, choose File → Run and navigate to the directory where the PRODIGY PyMOL script has been saved. Then select the .pml file and click ‘Open’.
  2. In the PyMOL command line, type @ followed by the name of the .pml file. Note that if the PyMOL session was not started in that folder, you will need to type the full path, for example: @/home/my_path/prodigy_pymol_script.pml


This protocol has been adapted from Vangone and Bonvin (2015) and Xue et al. (2016). Anna Vangone was supported by the H2020 Marie Skłodowska-Curie Individual Fellowship MSCA-IF-2015 [BAP-659025].


  1. Berman, H., Henrick, K. and Nakamura, H. (2003). Announcing the worldwide Protein Data Bank. Nat Struct Biol 10(12): 980.
  2. Hubbard, S. J. and Thornton, J. M. (1993). Naccess. Computer Program.
  3. Kastritis, P. L. and Bonvin, A. M. (2010). Are scoring functions in protein-protein docking ready to predict interactomes? Clues from a novel binding affinity benchmark. J Proteome Res 9(5): 2216-2225.
  4. Kastritis, P. L., Moal, I. H., Hwang, H., Weng, Z., Bates, P. A., Bonvin, A. M. and Janin, J. (2011). A structure-based benchmark for protein-protein binding affinity. Protein Sci 20(3): 482-491.
  5. Mitternacht, S. (2016). FreeSASA: An open source C library for solvent accessible surface area calculations. F1000Res 5: 189.
  6. Vangone, A., and Bonvin, A. M. J. J. (2015). Contacts-based prediction of binding affinity in protein-protein complexes. eLife 4: 291.
  7. Vreven, T., Moal, I. H., Vangone, A., Pierce, B. G., Kastritis, P. L., Torchala, M., Chaleil, R., Jimenez-Garcia, B., Bates, P. A., Fernandez-Recio, J., Bonvin, A. M. and Weng, Z. (2015). Updates to the integrated protein-protein interaction benchmarks: Docking benchmark version 5 and affinity benchmark version 2. J Mol Biol 427(19): 3031-3041.
  8. Xue, L. C., Rodrigues, J. P., Kastritis, P. L., Bonvin, A. M. and Vangone, A. (2016). PRODIGY: a web server for predicting the binding affinity of protein-protein complexes. Bioinformatics.


Copyright Vangone and Bonvin. This article is distributed under the terms of the Creative Commons Attribution License (CC BY 4.0).
Citation: Readers should cite both the Bio-protocol article and the original research article where this protocol was used:
  1. Vangone, A. and Bonvin, A. M. (2017). PRODIGY: A Contact-based Predictor of Binding Affinity in Protein-protein Complexes. Bio-protocol 7(3): e2124. DOI: 10.21769/BioProtoc.2124.
  2. Vangone, A., and Bonvin, A. M. J. J. (2015). Contacts-based prediction of binding affinity in protein-protein complexes. eLife 4: 291.