Rich DDI Triple Encoder
This protocol is extracted from research article:
Drug-Drug Interaction Predictions via Knowledge Graph and Text Embedding: Instrument Validation Study
JMIR Med Inform, Jun 24, 2021; DOI: 10.2196/28277

The interaction l between 2 drug entities, u and v, in a rich DDI triple (u, l, v) ∈ T can also be represented as a translation in low-dimensional space. We set u, v ∈ R^k and l ∈ R^d. The energy function z_dte(u, l, v) is defined as follows:

where b_2 is a bias constant and M_l ∈ R^(k×d) is the projection matrix. Following the analogous method in the basic triple encoder, the conditional likelihoods of all existing triples are maximized as follows:

Note that, in equation 5, l is the relation representation obtained from the label set l = {n_1, n_2, …}. Its construction is described in detail next.
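The translation idea above can be illustrated with a minimal NumPy sketch. It assumes a TransR-style energy, z_dte(u, l, v) = b_2 − ||u M_l + l − v M_l||, and a softmax over candidate drugs for the conditional likelihood; both forms, the L2 norm, and the function names `energy_dte` and `log_likelihood_tail` are assumptions for illustration, not taken from the text.

```python
import numpy as np

def energy_dte(u, l, v, M_l, b2=0.0):
    """Assumed TransR-style energy: b2 - ||u M_l + l - v M_l||_2."""
    # Both drug vectors (R^k) are projected into the relation space (R^d).
    return b2 - np.linalg.norm(u @ M_l + l - v @ M_l)

def log_likelihood_tail(u, l, tail_index, candidates, M_l, b2=0.0):
    """Assumed form of log P(v | u, l): softmax of energies over candidate drugs."""
    scores = np.array([energy_dte(u, l, v, M_l, b2) for v in candidates])
    scores = scores - scores.max()  # shift for numerical stability
    return scores[tail_index] - np.log(np.exp(scores).sum())

# Toy data: 5 drugs in R^k, one relation in R^d (all values are random).
rng = np.random.default_rng(0)
k, d = 4, 3
M_l = rng.normal(size=(k, d))
drugs = rng.normal(size=(5, k))
u, v = drugs[0], drugs[1]
l = (v - u) @ M_l  # a relation vector that translates u exactly onto v
```

With this choice of l, the triple (u, l, v) reaches the maximum possible energy b_2, so v receives the highest conditional likelihood among the candidates.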

A deep autoencoder is employed to construct the relation representation l ∈ R^d for a rich DDI triple (u, l, v) ∈ T. Specifically, a DDI relation, l, is described by a set of labels l = {n_1, n_2, …} ⊆ L. The corresponding binary vector for l is initialized as s = {s_i}, i = 1, …, |L|, where s_i = 1 if n_i ∈ l, and s_i = 0 otherwise. The deep autoencoder then takes the binary vector s as input and uses the following nonlinear transformation layers to transform the label set into the low-dimensional space R^k:

where f is the activation function and K is the number of layers. Here, h^(i), W^(i), and b^(i) represent the hidden vector, the transformation matrix, and the bias vector in the i-th layer, respectively.
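As a rough sketch of the layers just described, the following NumPy code builds the binary vector s from a label set and applies K layers of the standard form h^(i) = f(W^(i) h^(i−1) + b^(i)) with h^(0) = s. The layer form, the use of tanh in every layer, the random initialization, the layer sizes, and the helper names are assumptions for illustration.

```python
import numpy as np

def labels_to_binary(labels, vocabulary):
    """s_i = 1 if the i-th label of the vocabulary L is in the label set, else 0."""
    return np.array([1.0 if n in labels else 0.0 for n in vocabulary])

def init_layers(sizes, seed=0):
    """One (W, b) pair per layer; sizes = [|L|, ..., d, ..., |L|]."""
    rng = np.random.default_rng(seed)
    return [(rng.normal(0.0, 0.1, size=(n_out, n_in)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward(s, layers):
    """Return the middle hidden vector h^(K/2) and the final output h^(K)."""
    K = len(layers)  # assumed even, so that a middle layer exists
    h, code = s, None
    for i, (W, b) in enumerate(layers, start=1):
        h = np.tanh(W @ h + b)  # assumed: tanh in every layer
        if i == K // 2:
            code = h            # relation representation
    return code, h

# Hypothetical label vocabulary and relation, for illustration only.
vocabulary = ["increase", "decrease", "risk", "serum", "effect", "level"]
s = labels_to_binary({"increase", "risk"}, vocabulary)
layers = init_layers([len(vocabulary), 4, 2, 4, len(vocabulary)])
code, s_hat = forward(s, layers)
```

Here K = 4, so the relation representation is read from the second layer and has dimension 2, while the output has the same length as s.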

There are 2 parts to the autoencoder: an encoder and a decoder. The encoder employs the tanh activation function to obtain the DDI relation representation l = h^(K/2). The decoder decodes the embedding vector of l to obtain a reconstructed vector ŝ. Intuitively, PRD should then minimize the distance ||s − ŝ|| because the reconstructed vector ŝ should be similar to s. However, the number of zero elements in s is usually much larger than the number of nonzero elements because of data sparsity. This biases the decoder toward reconstructing zero elements rather than nonzero elements, which conflicts with our purpose. To overcome this obstacle, different weights are set for different elements, and the following objective function is maximized:

where b_3 is a bias constant, x is a weight vector, and ⊙ denotes the Hadamard product. For x = {x_i}, i = 1, …, |L|, x_i = 1 if s_i = 0, and x_i = β > 1 otherwise. According to equation 8, the corresponding probability can be defined as follows:

where S is the set of binary vectors of all DDI relations. The likelihood of reconstructing the binary vector s of a relation l can be defined as follows:

By maximizing the likelihoods of the encoding and the decoding for all described relations l, the objective function can be defined as follows:
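The effect of the element weights described above can be sketched in NumPy. The score is assumed to take the form b_3 − ||(s − ŝ) ⊙ x||, which matches the definitions of b_3, x, and ⊙ in the text but is not confirmed by it; the function names and the value of β are illustrative.

```python
import numpy as np

def weight_vector(s, beta=5.0):
    """x_i = 1 where s_i = 0, and x_i = beta > 1 where s_i is nonzero."""
    return np.where(s == 0, 1.0, beta)

def reconstruction_score(s, s_hat, beta=5.0, b3=0.0):
    """Assumed form: b3 - ||(s - s_hat) * x||_2, with * the Hadamard product."""
    x = weight_vector(s, beta)
    return b3 - np.linalg.norm((s - s_hat) * x)

s = np.array([1.0, 0.0, 0.0, 0.0])               # sparse label vector
missed_label = np.array([0.0, 0.0, 0.0, 0.0])    # decoder outputs all zeros
spurious_label = np.array([1.0, 1.0, 0.0, 0.0])  # decoder adds one wrong label

score_missed = reconstruction_score(s, missed_label)
score_spurious = reconstruction_score(s, spurious_label)
```

Both reconstructions differ from s in exactly one element, but missing the nonzero element is penalized β times more heavily, so an all-zero output is no longer an attractive solution for the decoder.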



