Few of methods based on graphlets, has been used for comparison of graph models of biological phenomena, however, it has not been used for models expressed in the language of Petri nets theory. Undirected graphlets (for short graphlets) were proposed in Ref.12 in the context of analysis of PPI networks. The general idea of such an analysis was based on a comparison of frequencies of appearance of some subgraphs in an analyzed PPI network with corresponding frequencies in random graphs (some of related research focus on finding induced subgraphs using graphlets, which is not the topic of this paper). The subgraphs whose frequencies were compared in this approach are connected non-isomorphic graphs on n vertices, where n is usually in the range [2, 5]. These graphs are called graphlets12. For there are 30 graphlets (see Fig. 2). The basic measure, called relative graphlet frequency distance, used for comparison of two graphs G and H based on graphlet frequencies is defined in the following way12:
where , is the number of graphlets of i-th type (cf. Fig. 2) in graph G and is the total number of graphlets in graph G and l is the maximal index of a graphlet. the maximal index of an orbit
All undirected graphlets build on 2–5 vertices.
As this measure is not precise enough for a significant number of comparison cases (it is easy to construct networks with exactly the same degree distribution whose structure and function differ substantially14), another one based on the concept of orbit, was proposed14. The idea of orbit follows from the notion of degree distribution. A degree of vertex v is the number of edges incident to this vertex. But an edge is the smallest graphlet, i.e., the only graphlet containing two vertices. This simple observation leads to an extension of the notion of degree distribution since it may be considered a number of graphlets built on greater number of vertices which are incident with vertex v. However, it is important to distinguish various cases of such an incidency, i.e., which vertex of a given graphlet is incident with vertex v. The idea can be easily illustrated on the example of graphlet (see Fig. 3). From topological point of view there is no difference whether vertex is adjacent to vertex v or vertex is adjacent to v, but these two cases differ from the case, where vertex is adjacent to vertex v. So, a set of vertices of a given graphlet can be divided into subsets containing vertices which are equivalent in the already described sense. These subsets are orbits. More formally, an orbit can be defined using the notion of an automorphism—a special case of an isomorphism. Given two graphs and an isomorphism is function , which is a bijection such that for every two vertices edge if and only if edge . An isomorphism of a graph to itself is an automorphism. A set of all automorphisms of given graph (with an operation of superposition) forms a group called an automorphism group of graph G, which is usually denoted by Aut(G). For vertex an automorphism orbit or simply orbit of this vertex is set 14.
Undirected graphlets with orbits shown in different colors.
Vertices and in graph belong to one orbit, while vertex belongs to another orbit (and these are the only orbits of this graph).
So, now we can measure how many vertices in a given graph are incident with graphlets , , where l is the greatest index of a graphlet in the considered set of graphlets. We should distinguish the cases of incidency with various orbits in graphlet . In other words, we can measure how many vertices are incident with orbits , , where m is the greatest index of an orbit. For a given automorphism orbit j, is the sample distribution of the number of nodes in G touching the appropriate graphlet k-times14. However, in this way we obtain a large collection of numbers (degree distributions) characterizing the analyzed graph. They can be arranged in the following matrix15:
It should be noticed that in practice, where finite graphs are considered, the matrix is finite, since it is upper bounded by the number of vertices of the analyzed graph (α is the maximal possible number of orbit occurrences in one vertex).
It would be better to have one number, instead of such a matrix, which could be used in comparisons of graphs. A measure of this type has been proposed and called a GDD agreement (GDDA)14.
First, is scaled:
and then a normalized j-th distribution is calculated:
Next, for graphs G and H a j-th distance based on their normalized distributions is defined:
A value of this distance is in the range [0, 1]. When the value is equal to 0, it means that the two graphs have identical distributions, and the greater the value is, the greater differences between the graphs are.
It is also convenient to consider a j-th agreement between the two graphs:
And on this basis an agreement between graphs G and H can be defined in the following way:
The idea of graphlets can be extended to the case of directed graphs. They are called digraphlets. It is worth considering since Petri nets have a structure of a directed bipartite graph. (When graphlets up to size 5 are used, then and .)
There are 39 directed graphlets for and one directed graphlet on two nodes (see Fig. 5). In these graphlets there are 129 orbits16.
All directed graphlets build on 2, 3 and 4 vertices. The ones in black color can occur in Petri nets.
In this case the directed GDDA (DGDDA) can be defined analogously to the undirected case16:
where d is the maximal index of an orbit in a directed graphlet and agreements are calculated for such orbits.
Do you have any questions about this protocol?
Post your question to gather feedback from the community. We will also invite the authors of this article to respond.