Feature attention layer

Kyriakos Schwarz, Ahmed Allam, Nicolas Andres Perez Gonzalez, Michael Krauthammer

The feature attention layer is parameterized by a global context vector $c \in \mathbb{R}^d$ with learnable parameters optimized during training. For a set of input vectors $\tilde{G} = \{\tilde{g}_1, \tilde{g}_2, \ldots, \tilde{g}_T\}$ (computed in the preceding layer), attention scores $\psi_t \ \forall t \in [1, \ldots, T]$ are calculated using the pairwise similarity between the context vector $c$ and the set $\tilde{G}$ (Eqs. 13–14). These scores are normalized and used to compute a weighted sum of the $\{\tilde{g}_1, \tilde{g}_2, \ldots, \tilde{g}_T\}$ vectors, generating a new unified vector representation $z \in \mathbb{R}^d$ that is then passed to the classifier layer (Eq. 15).
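A minimal PyTorch sketch of this attention pooling is given below. The text only specifies a "pairwise similarity" between $c$ and each $\tilde{g}_t$, so the dot-product similarity used here is an assumption, and the module name FeatureAttention and tensor shapes are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureAttention(nn.Module):
    """Attention pooling over a set of T input vectors via a learnable
    global context vector c (a sketch, not the authors' exact code)."""

    def __init__(self, dim: int):
        super().__init__()
        # global context vector c in R^d, optimized during training
        self.context = nn.Parameter(torch.randn(dim))

    def forward(self, G: torch.Tensor) -> torch.Tensor:
        # G: (batch, T, d) -- the set {g~_1, ..., g~_T} from the previous layer
        # attention scores psi_t: similarity between c and each g~_t
        # (the dot product is an assumed choice of similarity)
        scores = G @ self.context                      # (batch, T)
        # normalize the scores across the T positions
        alphas = F.softmax(scores, dim=-1)             # (batch, T)
        # weighted sum -> unified representation z in R^d
        z = (alphas.unsqueeze(-1) * G).sum(dim=1)      # (batch, d)
        return z
```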

Classifier layer

The classifier layer calculates a distance (Euclidean or cosine) between the computed representation vectors $(z_a, z_b)$ and then concatenates them with that distance. Subsequently, through an affine transformation, the concatenated feature vector is mapped to the size of the output classes (i.e., presence or absence of interaction). Finally, a softmax function is applied to output the predicted probability distribution over those two classes (i.e., $\hat{y}^{(i)}$ for the $i$th drug pair).
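The sketch below illustrates this classifier step under the same assumptions (PyTorch; the SiameseClassifier name, the choice to append the distance as a single extra scalar feature, and the concatenation order are hypothetical).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseClassifier(nn.Module):
    """Maps a pair of representations (z_a, z_b) plus their distance
    to probabilities over two classes (interaction present/absent)."""

    def __init__(self, dim: int, distance: str = "cosine"):
        super().__init__()
        self.distance = distance
        # affine transformation: [z_a ; z_b ; dist] -> 2 output classes
        self.out = nn.Linear(2 * dim + 1, 2)

    def forward(self, z_a: torch.Tensor, z_b: torch.Tensor) -> torch.Tensor:
        if self.distance == "cosine":
            # cosine distance = 1 - cosine similarity
            dist = 1.0 - F.cosine_similarity(z_a, z_b, dim=-1)
        else:
            # Euclidean distance
            dist = torch.norm(z_a - z_b, p=2, dim=-1)
        # concatenate the two representations with their distance
        feats = torch.cat([z_a, z_b, dist.unsqueeze(-1)], dim=-1)
        # softmax over the two classes -> predicted probabilities y_hat
        return F.softmax(self.out(feats), dim=-1)
```

Appending the distance as an explicit feature, rather than relying on the affine layer to infer it from the concatenated pair, gives the classifier direct access to the similarity signal the Siamese setup is built around.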
