We defined n as a 295,240-dimensional vector of the occurrence frequencies of triangle types in a ligand-binding pocket. Every triangle that occurs in a pocket with edge lengths from 1.0 Å to 15.8 Å is classified as one of the 295,240 triangle types. Using X described in the previous section, we obtained the following.
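Read together with the description that follows (w as a weighted arithmetic mean over X, weighted by n), the relation can be sketched as below. This is a reconstruction under stated assumptions and may not match the authors' exact notation: x_i is assumed to denote the i-th row of X (the MDS coordinates of the i-th triangle type), and N stands for 295,240.

```latex
\mathbf{n} = (n_1, n_2, \ldots, n_N), \qquad
\mathbf{w} = \frac{\sum_{i=1}^{N} n_i \, \mathbf{x}_i}{\sum_{i=1}^{N} n_i}
```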
In the equations above, n_i represents the number of occurrences of the i-th triangle type in the list of triangle types, and w is the vector representing a pocket (Fig. 1). Computing w in this way can be regarded as taking the weighted arithmetic mean over X, weighted by n. To represent a pocket with a reduced vector based on the MDS result, we used the number of dimensions needed to reach a given cumulative contribution ratio, calculated using only the positive eigenvalues (Supplementary Fig. S1). For this study, we set the threshold for the cumulative contribution ratio to 0.98. We define the similarity between two pockets i and j as the cosine similarity between w_i and w_j. The similarity can therefore be obtained simply by calculating the inner product of the normalized w_i and the normalized w_j.
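The three steps described here (weighted mean, choice of dimensionality, cosine similarity) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `pocket_vector`, `n_dims_for_ratio`, and `cosine_similarity` are hypothetical, X is assumed to be stored as one row of MDS coordinates per triangle type, and plain lists are used for clarity where a real 295,240-dimensional calculation would more likely use an array library.

```python
import math


def pocket_vector(n, X):
    """Weighted arithmetic mean of the rows of X, weighted by the counts n.

    n: triangle-type counts for one pocket (assumed layout: one count per row of X).
    X: MDS coordinates, one row per triangle type (assumed layout).
    """
    total = sum(n)
    if total == 0:
        raise ValueError("pocket contains no triangles")
    dims = len(X[0])
    w = [0.0] * dims
    for count, row in zip(n, X):
        if count:  # skip triangle types absent from this pocket
            for k in range(dims):
                w[k] += count * row[k]
    return [value / total for value in w]


def n_dims_for_ratio(eigenvalues, threshold=0.98):
    """Smallest number of leading dimensions whose cumulative contribution
    ratio, computed over positive eigenvalues only, reaches the threshold."""
    positive = sorted((e for e in eigenvalues if e > 0), reverse=True)
    total = sum(positive)
    running = 0.0
    for k, e in enumerate(positive, start=1):
        running += e
        if running / total >= threshold:
            return k
    return len(positive)


def cosine_similarity(w_i, w_j):
    """Inner product of the two vectors after normalizing each to unit length."""
    norm_i = math.sqrt(sum(v * v for v in w_i))
    norm_j = math.sqrt(sum(v * v for v in w_j))
    unit_i = [v / norm_i for v in w_i]
    unit_j = [v / norm_j for v in w_j]
    return sum(a * b for a, b in zip(unit_i, unit_j))


# Toy data: three triangle types embedded in two dimensions.
X_toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w_a = pocket_vector([1, 1, 2], X_toy)
w_b = pocket_vector([3, 0, 0], X_toy)
k = n_dims_for_ratio([5.0, 3.0, 1.5, 0.5, -0.2])
similarity = cosine_similarity(w_a, w_b)
```

With the toy eigenvalues above, the negative value is ignored and the first three positive eigenvalues account for only 0.95 of the positive total, so four dimensions are needed to reach 0.98.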
Fig. 1. Schematic diagram of the vector representation of a ligand-binding pocket. Structural and amino acid information for a ligand-binding pocket is converted into a vector based on the MDS result.