A powerful tool for investigating a number of network properties is represented by graphlets. First of all, let us provide a few useful definitions.
Let G = (V,E) be a graph, where V is the set of nodes, and E is a set of node pairs, i.e., a set of edges. The two nodes paired by an edge are said to be its endpoints. Let S be a subset of V: A graph H = (S, E’) is said to be an induced subgraph of G, if E’ consists of all the edges of E having both their endpoints in S.
A graphlet in G is a small connected induced subgraph of G (Pržulj et al., 2004; Pržulj, 2007). Figure 1 displays all the 30 possible graphlets, with up to five nodes, that can be extracted from a given graph. An orbit of a graphlet consists of different nodes that can be mutually interchanged under a graphlet automorphism. Hence, this relates to the symmetries of the graphlet.
Automorphism orbits 0,1,2, …,72 for the 30 up to five-node graphlets G0,G1,…,G29. Nodes belonging to the same orbit of a graphlet have the same color. The non-redundant orbits we take into account are 0, 1, 2, 4, 6, 8, 9, 10, 11, 12, 13, 15, 18, 19, 22, 24, 25, 27, 29, 30, 31, 32, 33, 34, 35, 36, 37, 39, 40, 41, 42, 43, 45, 46, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 70.
More intuitively, the orbit defines the “topological character (or relevance)” of a node inside a graphlet, since it allows to distinguish between nodes of the graph G “touching” different nodes belonging to the graphlet Gi (in our case, i = 0,1,…,29, see Figure 1).
For example, let us consider the graphlet G1 in Figure 1. There exists an automorphism that exchanges the endpoints of the path, while no automorphism of the graphlet exists, which exchanges an endpoint with the middle vertex. This results in the existence of two orbits, called Orbit 1 and Orbit 2 (see again Figure 1, where different orbits inside a graphlet have different colors).
Hence, a node of the graph G can be described by a vector containing the counts of different kinds of graphlets which such a node “touches,” or equivalently, the topological characteristics (orbits) it plays within these graphlets.
As a first step, one selects a number of working graphlets having up to some prescribed number of nodes. We focus on up to five-node graphlets, namely, on the 30 graphlets G0,G1,…,G29 (see Figure 1), which globally define the 73 orbits O0,O1,…,O72. By leveraging the set of orbits, one can construct vectors collecting different measures associated with the given network. In detail, for each node, the corresponding graphlet degree vector (GDV) is formed by letting its j-th entry equal to the number of times the node “touches” (i.e., belongs to) the orbit O_j. The GDV can be computed by standard software packages, such as ORCA (Hocevar and Demsar, 2014). Many of the entries of the GDV have well-defined topological interpretations. For instance, the first entry is the degree of the associated node, the second entry is the number of induced paths of length two having the node as an endpoint, the third entry is the number of induced paths of lengths two having the node in the middle, and so on.
Some of the 73 orbits are dependent on other orbits previously considered, meaning that their count for any given node can be derived from the count of other orbits: The corresponding information is then redundant. In this case, the orbit can be neglected in the analysis. Focusing on up to five-node graphlets, it turns out that only 56 of the 73 orbits are non-redundant (see Yaveroğlu et al., 2014, Supplementary Material section). The list of the non-redundant orbits is reported in the caption of Figure 1.
Do you have any questions about this protocol?
Post your question to gather feedback from the community. We will also invite the authors of this article to respond.