Our model contains two main parts: convolutional layers that extract features to produce protein and ligand embedding vectors, and fully connected layers that predict their interaction from the concatenated vectors (Fig. 1).

In more detail, we use one-hot encoding to represent proteins and ligands. The numbers of unique tokens for protein amino acids and ligand SMILES characters are 20 and 64, respectively. Each protein sequence is encoded into a 20 × 1200 matrix: sequences shorter than 1200 residues are zero-padded at the end, whereas longer sequences are truncated to 1200, so that all inputs have the same size. Similarly, each ligand SMILES string is encoded and padded into a 64 × 200 matrix.

The two input matrices are then processed by three CNN blocks, each consisting of two convolutional layers and one pooling layer. In the last two blocks, the inception block [32] is used instead of the VGG block [33]; an inception block combines convolutional kernels of different sizes (1 × 1, 3 × 3 and 5 × 5) with a 3 × 3 max-pooling layer. After feature extraction, the protein and ligand are each embedded into 1024-dimensional vectors, which are concatenated and fed into three dense layers with 512, 64 and 1 units, respectively. A multi-dropout layer follows each dense layer to reduce overfitting: it consists of five units that generate random dropout values, and the final dropout is computed as the weighted mean of these values, which improves performance. We employ the rectified linear unit (ReLU), the sigmoid function and a linear function as the activation functions for the intermediate layers, the classification output layer and the regression output layer, respectively. Finally, the model generates five different values in the last dense layer and combines them into the final output.
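The one-hot encoding and padding step can be sketched as follows. This is a minimal illustration, not the authors' code: the amino-acid alphabet shown here is the standard 20-letter set, and the actual token vocabularies (20 amino acids, 64 SMILES characters) would be built from the training data.

```python
import numpy as np

# Illustrative vocabulary: the 20 standard amino-acid one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {ch: i for i, ch in enumerate(AMINO_ACIDS)}

def one_hot_encode(seq, index, max_len):
    """One-hot encode `seq` into a (vocab_size, max_len) matrix.

    Sequences shorter than `max_len` are zero-padded at the end;
    longer ones are truncated, so all inputs share the same shape.
    """
    mat = np.zeros((len(index), max_len), dtype=np.float32)
    for pos, ch in enumerate(seq[:max_len]):
        mat[index[ch], pos] = 1.0
    return mat

# A 1500-residue toy sequence is truncated to the fixed 20 x 1200 shape.
protein = one_hot_encode("MKT" * 500, AA_INDEX, max_len=1200)
print(protein.shape)  # (20, 1200)
```

The same function applied with a 64-character SMILES vocabulary and `max_len=200` yields the 64 × 200 ligand matrix.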

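One plausible reading of the multi-dropout layer is sketched below: several dropout masks with different rates are applied and their outputs combined by a weighted mean. The rates and uniform weights here are illustrative assumptions, not the paper's values.

```python
import numpy as np

def multi_dropout(x, rates=(0.1, 0.2, 0.3, 0.4, 0.5), weights=None, rng=None):
    """Weighted mean of five dropout outputs (illustrative sketch).

    Each "unit" drops activations at its own rate `p` (the rates and
    uniform weights are assumed for illustration); the final output is
    the weighted mean of the five dropped-out copies of `x`.
    """
    rng = np.random.default_rng() if rng is None else rng
    if weights is None:
        weights = np.full(len(rates), 1.0 / len(rates))
    outs = []
    for p in rates:
        mask = rng.random(x.shape) >= p       # keep with probability 1 - p
        outs.append(x * mask / (1.0 - p))     # inverted-dropout scaling
    return np.tensordot(weights, np.array(outs), axes=1)

out = multi_dropout(np.ones((4, 8)), rng=np.random.default_rng(0))
print(out.shape)  # (4, 8)
```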
Based on the architecture of the single-task model, the multi-task model contains two main parts: shared layers, which learn general hidden features from all data, and task-specific layers, which learn specific weights for each task [34]. Here, we have two tasks: binary classification and regression. The input, feature-extraction and concatenation parts are the same as in the single-task model. The loss functions are binary cross-entropy for classification (Loss1) and the MSE with L2 regularization for regression (Loss2):

Loss1 = −(1/N) Σᵢ₌₁ᴺ [yᵢ log f(xᵢ) + (1 − yᵢ) log(1 − f(xᵢ))]

Loss2 = (1/M) Σᵢ₌₁ᴹ (yᵢ − g(xᵢ))² + λ‖w‖²

where N and M are the numbers of samples and f(·) and g(·) the models for the classification and regression tasks, respectively; xᵢ and yᵢ denote the inputs and labels; ‖w‖² is the L2 regularization term; and λ ≥ 0 adjusts the balance between the empirical risk and the regularization term of the regression task.
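The two losses can be written out directly in NumPy. This is a plain sketch of the standard formulas named in the text; the regularization strength `lam` (λ) is an illustrative value, not the paper's.

```python
import numpy as np

def loss1(y_true, y_pred):
    """Binary cross-entropy for the classification task (Loss1)."""
    eps = 1e-12                                  # avoid log(0)
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return -np.mean(y_true * np.log(y_pred)
                    + (1.0 - y_true) * np.log(1.0 - y_pred))

def loss2(y_true, y_pred, w, lam=0.01):
    """MSE plus L2 regularization for the regression task (Loss2).

    `lam` (lambda >= 0) balances the empirical risk against the squared
    norm of the weights `w`; 0.01 is an assumed value for illustration.
    """
    return np.mean((y_true - y_pred) ** 2) + lam * np.sum(w ** 2)

# Perfect predictions with zero weights give zero regression loss.
print(loss2(np.array([1.0, 2.0]), np.array([1.0, 2.0]), np.zeros(3)))  # 0.0
```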
