The input of OPUS-Contact consists of four parts. Following TripletRes (Li et al., 2019b), the first three parts are raw co-evolutionary features: the covariance matrix (COV), the precision matrix (PRE) (Li et al., 2019a) and the coupling parameters of the Potts model fitted by pseudolikelihood maximization (PLM) (Ekeberg et al., 2013, 2014). The fourth part is built from 92 1D features: 76 features from the first part of the OPUS-TASS2 input features, plus 1 solvent accessibility feature, 4 torsion-angle features [sin(Φ), cos(Φ), sin(Ψ) and cos(Ψ)] and 11 secondary-structure features (3- and 8-state) predicted by OPUS-TASS2. As in SPOT-1D (Hanson et al., 2019), we use an outer concatenation function to convert the 1D features (L, 92) into 2D features (L, L, 184). Together with the outputs of trRosetta (Yang et al., 2020) (L, L, 100) and CCMpred (Seemayer et al., 2014) (L, L, 1), the fourth part has 285 features in total. COV, PRE, PLM, CCMpred and the trRosetta outputs are all generated from the multiple sequence alignments obtained by DeepMSA (Zhang et al., 2020).
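The shape arithmetic for the fourth input part can be checked with a minimal NumPy sketch. It assumes that outer concatenation pairs the feature vector of residue i with that of residue j, in that order, along the channel axis; the function name `outer_concat` and the zero-filled placeholder arrays are illustrative and are not taken from the OPUS-Contact code.

```python
import numpy as np

def outer_concat(features_1d: np.ndarray) -> np.ndarray:
    """Pair every residue i with every residue j.

    features_1d has shape (L, C); the result has shape (L, L, 2 * C),
    with out[i, j] = concat(features_1d[i], features_1d[j]).
    The (i, j) channel order is an assumption.
    """
    L, C = features_1d.shape
    rows = np.broadcast_to(features_1d[:, None, :], (L, L, C))  # residue i
    cols = np.broadcast_to(features_1d[None, :, :], (L, L, C))  # residue j
    return np.concatenate([rows, cols], axis=-1)

L = 5                                   # toy sequence length
feats_1d = np.random.rand(L, 92)        # stand-in for the 92 1D features
pair_1d = outer_concat(feats_1d)        # (L, L, 184)

# Placeholders with the channel counts stated in the text.
trrosetta_out = np.zeros((L, L, 100))
ccmpred_out = np.zeros((L, L, 1))

fourth_part = np.concatenate([pair_1d, trrosetta_out, ccmpred_out], axis=-1)
print(fourth_part.shape)                # (5, 5, 285)
```

The channel count follows directly: 2 × 92 + 100 + 1 = 285.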
The outputs of OPUS-Contact are identical to those of trRosetta (Yang et al., 2020): the predicted Cβ–Cβ distance, 3 dihedrals (ω, θ12, θ21) and 2 angles (φ12, φ21) between residues 1 and 2. The distance ranges from 2 to 20 Å and is divided into 36 bins of 0.5 Å, plus one bin for the >20 Å case. φ ranges from 0° to 180° and is divided into 12 bins of 15°, plus one bin for the non-contact case. ω and θ range from -180° to 180° and are divided into 24 bins of 15°, plus one bin for the non-contact case.
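The discretization described above can be sketched as a small label-encoding routine. Several details are assumptions rather than statements from the text: the extra bin is placed at index 0, "non-contact" is taken to mean a Cβ–Cβ distance above 20 Å, and values on or outside a range boundary are clamped into the nearest regular bin. The names `bin_index` and `encode_pair` are hypothetical.

```python
import math

def bin_index(value, lo, hi, width, no_contact=False):
    """Return a class index: 0 for the extra bin, 1..n for the regular bins."""
    if no_contact:
        return 0
    n_bins = round((hi - lo) / width)
    k = math.floor((value - lo) / width)
    k = min(max(k, 0), n_bins - 1)      # clamp boundary values (assumption)
    return k + 1

def encode_pair(dist, omega, theta, phi, max_dist=20.0):
    """Encode one residue pair; angles in degrees, distance in angstroms."""
    far = dist > max_dist               # assumed definition of non-contact
    return {
        "dist": bin_index(dist, 2.0, 20.0, 0.5, far),         # 36 + 1 classes
        "omega": bin_index(omega, -180.0, 180.0, 15.0, far),  # 24 + 1 classes
        "theta": bin_index(theta, -180.0, 180.0, 15.0, far),  # 24 + 1 classes
        "phi": bin_index(phi, 0.0, 180.0, 15.0, far),         # 12 + 1 classes
    }

near = encode_pair(dist=8.3, omega=-170.0, theta=95.0, phi=40.0)
far = encode_pair(dist=25.0, omega=10.0, theta=10.0, phi=10.0)
print(near)
print(far)
```

With this layout the distance head has 37 classes, the ω and θ heads 25 each, and the φ head 13.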
The neural network architecture of OPUS-Contact is shown in Supplementary Figure S2. We use a stack of dilated residual-convolutional blocks similar to the 2D feature extraction step in OPUS-TASS2. The four input parts (COV, PRE, PLM and Others) first pass through 41 blocks separately; their outputs are then concatenated and passed through a further 21 blocks.
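The branch-then-merge data flow can be illustrated with a toy NumPy sketch. It shows only the topology: each block is a stand-in residual unit built from a per-position linear map, not a dilated convolution, and the hidden width, the input channel count used for COV, PRE and PLM, the initial projection and the reading that each branch has its own 41 blocks are all assumptions made for illustration. Only the block counts (41 and 21) and the 285 channels of the fourth part come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_block(channels):
    """Stand-in residual block: y = x + relu(x @ W). Not a dilated convolution."""
    w = rng.normal(scale=0.01, size=(channels, channels))
    return lambda x: x + np.maximum(x @ w, 0.0)

def make_stack(n_blocks, channels):
    """Chain n_blocks stand-in residual blocks of equal width."""
    blocks = [make_block(channels) for _ in range(n_blocks)]
    def run(x):
        for block in blocks:
            x = block(x)
        return x
    return run

def project(x, out_channels):
    """Hypothetical per-position projection to a common hidden width."""
    w = rng.normal(scale=0.01, size=(x.shape[-1], out_channels))
    return x @ w

L, hidden = 8, 16                        # toy sizes, not from the paper
inputs = {
    "COV": rng.normal(size=(L, L, 32)),  # 32 channels is a placeholder
    "PRE": rng.normal(size=(L, L, 32)),
    "PLM": rng.normal(size=(L, L, 32)),
    "Others": rng.normal(size=(L, L, 285)),
}

# Each input part goes through its own 41 blocks.
branches = [make_stack(41, hidden)(project(x, hidden)) for x in inputs.values()]

# The branch outputs are concatenated and go through 21 shared blocks.
merged = np.concatenate(branches, axis=-1)
out = make_stack(21, merged.shape[-1])(merged)
print(out.shape)                         # (8, 8, 64)
```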
Like trRosetta (Yang et al., 2020), OPUS-Contact adopts an ensemble strategy: it consists of seven models, and their average is used as the final prediction.
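A minimal sketch of the averaging step follows, assuming that the quantity averaged is the per-bin probability distribution produced by each model; the text does not say whether probabilities or logits are averaged. The random softmax outputs stand in for real model predictions, and `ensemble_average` is a hypothetical helper.

```python
import numpy as np

def ensemble_average(predictions):
    """Average a list of (L, L, n_bins) probability arrays over the models."""
    return np.mean(np.stack(predictions, axis=0), axis=0)

rng = np.random.default_rng(0)
L, n_models, n_bins = 6, 7, 37           # seven models, 37 distance classes

preds = []
for _ in range(n_models):
    logits = rng.normal(size=(L, L, n_bins))              # fake model output
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    preds.append(e / e.sum(axis=-1, keepdims=True))       # softmax over bins

avg = ensemble_average(preds)
print(avg.shape)                         # (6, 6, 37)
```

Because each model's output sums to 1 over the bins, the averaged prediction is again a valid probability distribution for every residue pair.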