Graphlets

Bartłomiej Szawulak; Piotr Formanowicz


This method is extracted from research article: Sci Rep, Dec 2022

Graphlets in comparison of Petri net-based models of biological systems

DOI: 10.1038/s41598-022-24535-5


A few methods based on graphlets have been used for comparison of graph models of biological phenomena; however, they have not been used for models expressed in the language of Petri net theory. Undirected graphlets (graphlets for short) were proposed in Ref.^¹² in the context of the analysis of PPI networks. The general idea of such an analysis was to compare the frequencies with which certain subgraphs appear in the analyzed PPI network with the corresponding frequencies in random graphs (some related research focuses on finding induced subgraphs using graphlets, which is not the topic of this paper). The subgraphs whose frequencies are compared in this approach are connected non-isomorphic graphs on n vertices, where n is usually in the range [2, 5]. These graphs are called graphlets^¹². For $n \in [2, 5]$ there are 30 graphlets (see Fig. 2). The basic measure used for comparing two graphs G and H based on graphlet frequencies, called the relative graphlet frequency distance, is defined in the following way^¹²:

$$D(G, H) = \sum_{i = 0}^{l} \left| F_{i}(G) - F_{i}(H) \right|$$

where $F_{i}(G) = -\log \frac{N_{i}(G)}{T(G)}$, $N_{i}(G)$ is the number of graphlets of the i-th type (cf. Fig. 2) in graph G, $T(G) = \sum_{i = 0}^{l} N_{i}(G)$ is the total number of graphlets in graph G, and l is the maximal index of a graphlet.
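The distance can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the distance is the sum of absolute differences $|F_{i}(G) - F_{i}(H)|$, uses the natural logarithm, and takes the graphlet counts $N_{i}$ as ready-made lists. The function name `rgf_distance` and the handling of zero counts are assumptions made here.

```python
import math


def rgf_distance(counts_g, counts_h):
    """Relative graphlet frequency distance from two lists of graphlet counts.

    counts_g[i] and counts_h[i] play the role of N_i(G) and N_i(H).
    """
    if len(counts_g) != len(counts_h):
        raise ValueError("both graphs need counts for the same graphlet types")

    def neg_log_frequencies(counts):
        total = sum(counts)  # T(G)
        if total == 0:
            raise ValueError("a graph without any graphlets cannot be compared")
        # Assumption: a graphlet type that never occurs contributes F_i = 0,
        # because log(0) is undefined.
        return [-math.log(n / total) if n > 0 else 0.0 for n in counts]

    f_g = neg_log_frequencies(counts_g)
    f_h = neg_log_frequencies(counts_h)
    return sum(abs(a - b) for a, b in zip(f_g, f_h))
```

Two graphs with the same relative graphlet frequencies get distance 0, even if their absolute counts differ, because each count is divided by the total before taking the logarithm.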

Fig. 2. All undirected graphlets built on 2–5 vertices.

As this measure is not precise enough for a significant number of comparison cases (it is easy to construct networks with exactly the same degree distribution whose structure and function differ substantially^¹⁴), another one, based on the concept of an orbit, was proposed^¹⁴. The idea of an orbit follows from the notion of degree distribution. The degree of vertex v is the number of edges incident to this vertex. But an edge is the smallest graphlet, i.e., the only graphlet containing two vertices. This simple observation leads to an extension of the notion of degree distribution, since one may also count the graphlets built on a greater number of vertices that are incident with vertex v. However, it is important to distinguish the various cases of such an incidence, i.e., which vertex of a given graphlet is incident with vertex v. The idea can be easily illustrated with graphlet $G_{1}$ (see Fig. 3). From a topological point of view there is no difference whether vertex $u_{1}$ or vertex $u_{3}$ is adjacent to vertex v, but these two cases differ from the case where vertex $u_{2}$ is adjacent to vertex v. So, the set of vertices of a given graphlet can be divided into subsets containing vertices that are equivalent in this sense. These subsets are orbits.

More formally, an orbit can be defined using the notion of an automorphism, which is a special case of an isomorphism. Given two graphs $H_{1} = (V_{1}, E_{1})$ and $H_{2} = (V_{2}, E_{2})$, an isomorphism is a function $f : V_{1} \to V_{2}$ that is a bijection such that for every two vertices $v_{1}, v_{2} \in V_{1}$, edge $\{v_{1}, v_{2}\} \in E_{1}$ if and only if edge $\{f(v_{1}), f(v_{2})\} \in E_{2}$. An isomorphism of a graph to itself is an automorphism. The set of all automorphisms of a given graph $G = (V, E)$ (with the operation of composition) forms a group called the automorphism group of graph G, which is usually denoted by $\mathrm{Aut}(G)$. For vertex $v \in V$, the automorphism orbit, or simply orbit, of this vertex is the set $\mathrm{Orb}(v) = \{u \in V : u = f(v), f \in \mathrm{Aut}(G)\}$^¹⁴.
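For small graphs such as graphlets, the definition of an orbit can be applied directly by trying every vertex permutation. The sketch below does this by brute force; the function name `automorphism_orbits` and the vertex labels are illustrative, and the approach is only practical for a handful of vertices.

```python
from itertools import permutations


def automorphism_orbits(vertices, edges):
    """Return the set of automorphism orbits of a small undirected graph."""
    vertices = list(vertices)
    edge_set = {frozenset(e) for e in edges}
    # Orb(v) starts as {v} because the identity is always an automorphism.
    orbit_of = {v: {v} for v in vertices}
    for image in permutations(vertices):
        f = dict(zip(vertices, image))
        mapped = {frozenset((f[u], f[v])) for u, v in edges}
        # f is a bijection, so it is an automorphism exactly when it maps
        # the edge set onto itself.
        if mapped == edge_set:
            for v in vertices:
                orbit_of[v].add(f[v])
    return {frozenset(orbit) for orbit in orbit_of.values()}


# Graphlet G_1: the path u1 - u2 - u3.
g1_orbits = automorphism_orbits(
    ["u1", "u2", "u3"], [("u1", "u2"), ("u2", "u3")]
)
```

For the path $G_{1}$ this yields two orbits, one containing the end vertices $u_{1}$ and $u_{3}$ and one containing the middle vertex $u_{2}$.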

Undirected graphlets with orbits shown in different colors.

Vertices $u_{1}$ and $u_{3}$ in graph $G_{1}$ belong to one orbit, while vertex $u_{2}$ belongs to another orbit (and these are the only orbits of this graph).

So, now we can measure how many vertices in a given graph are incident with graphlets $G_{i}$, $i = 0, 1, \dots, l$, where l is the greatest index of a graphlet in the considered set of graphlets. We should distinguish the cases of incidence with the various orbits in graphlet $G_{i}$. In other words, we can measure how many vertices are incident with orbits $O_{j}$, $j = 0, 1, \dots, m$, where m is the greatest index of an orbit. For a given automorphism orbit j, $d_{G}^{j}(k)$ is the sample distribution of the number of nodes in G touching the appropriate graphlet at orbit j exactly k times^¹⁴. In this way we obtain a large collection of numbers (degree distributions) characterizing the analyzed graph. They can be arranged in a matrix^¹⁵ whose entries are the values $d_{G}^{j}(k)$, indexed by the orbit j and the number of occurrences k.

It should be noted that in practice, where finite graphs are considered, the matrix is finite, since its size is bounded from above by the number of vertices of the analyzed graph (α is the maximal possible number of orbit occurrences at one vertex).
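To make the counting concrete, the following sketch computes, for every vertex, how often it touches the first four orbits, and then builds the distribution $d_{G}^{j}(k)$ for one orbit. It is an illustration under stated assumptions: orbits are numbered in the commonly used way (0 for an edge, 1 for an end of the three-vertex path, 2 for its middle, 3 for a triangle), graphlets are counted as induced subgraphs, and the names `orbit_degrees` and `degree_distribution` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def orbit_degrees(adj):
    """Per-vertex counts for orbits 0-3.

    adj maps each vertex to the set of its neighbours (simple undirected graph).
    """
    degrees = {}
    for v, nbrs in adj.items():
        o0 = len(nbrs)  # orbit 0: ordinary degree
        o2 = o3 = 0
        for a, b in combinations(nbrs, 2):
            if b in adj[a]:
                o3 += 1  # v, a, b form a triangle
            else:
                o2 += 1  # v is the middle of the induced path a - v - b
        # orbit 1: v is an end of an induced path v - u - w
        o1 = sum(1 for u in nbrs for w in adj[u] if w != v and w not in nbrs)
        degrees[v] = (o0, o1, o2, o3)
    return degrees


def degree_distribution(degrees, j):
    """d^j(k): number of vertices touching orbit j exactly k times."""
    return Counter(d[j] for d in degrees.values())


# Triangle a-b-c with a pendant vertex d attached to c.
example = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
example_degrees = orbit_degrees(example)
```

Each call to `degree_distribution` gives one row (or column) of the matrix described above; extending the idea to all 73 orbits requires a dedicated graphlet counter.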

It would be better to have one number, instead of such a matrix, which could be used in comparisons of graphs. A measure of this type has been proposed and called a GDD agreement (GDDA)^¹⁴.

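First, $d_{G}^{j}(k)$ is scaled:

$$S_{G}^{j}(k) = \frac{d_{G}^{j}(k)}{k}$$

and then a normalized j-th distribution is calculated:

$$N_{G}^{j}(k) = \frac{S_{G}^{j}(k)}{T_{G}^{j}}, \qquad T_{G}^{j} = \sum_{k = 1}^{\infty} S_{G}^{j}(k)$$

Next, for graphs G and H a j-th distance based on their normalized distributions is defined:

$$D^{j}(G, H) = \frac{1}{\sqrt{2}} \left( \sum_{k = 1}^{\infty} \left[ N_{G}^{j}(k) - N_{H}^{j}(k) \right]^{2} \right)^{1/2}$$

The value of this distance lies in the range [0, 1]. A value of 0 means that the two graphs have identical j-th distributions, and the greater the value, the greater the difference between the graphs.

It is also convenient to consider a j-th agreement between the two graphs:

$$A^{j}(G, H) = 1 - D^{j}(G, H)$$

On this basis an agreement between graphs G and H can be defined in the following way:

$$A(G, H) = \frac{1}{m + 1} \sum_{j = 0}^{m} A^{j}(G, H)$$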
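The whole chain of steps can be sketched as follows. The code assumes the form of GDDA commonly given in the graphlet literature: each $d^{j}(k)$ is divided by k, the result is normalized to sum to 1, the j-th distance is the Euclidean distance divided by $\sqrt{2}$, and the overall agreement is the arithmetic mean of the per-orbit agreements. The function names and the dictionary representation of a distribution are illustrative choices, not part of the original method description.

```python
import math


def normalized_distribution(dist):
    """dist maps k -> d^j(k); returns k -> N^j(k) for k >= 1."""
    scaled = {k: c / k for k, c in dist.items() if k >= 1 and c > 0}
    total = sum(scaled.values())
    if total == 0:
        return {}  # no vertex touches this orbit
    return {k: s / total for k, s in scaled.items()}


def orbit_agreement(dist_g, dist_h):
    """A^j(G, H) = 1 - D^j(G, H) for a single orbit j."""
    n_g = normalized_distribution(dist_g)
    n_h = normalized_distribution(dist_h)
    keys = set(n_g) | set(n_h)
    squared = sum((n_g.get(k, 0.0) - n_h.get(k, 0.0)) ** 2 for k in keys)
    return 1.0 - math.sqrt(squared) / math.sqrt(2)


def gdd_agreement(dists_g, dists_h):
    """Arithmetic mean of the per-orbit agreements.

    dists_g and dists_h hold one distribution per orbit, in the same order.
    """
    if len(dists_g) != len(dists_h) or not dists_g:
        raise ValueError("need the same non-empty list of orbits for both graphs")
    agreements = [orbit_agreement(g, h) for g, h in zip(dists_g, dists_h)]
    return sum(agreements) / len(agreements)
```

For example, `gdd_agreement([{1: 2}, {1: 2}], [{1: 2}, {2: 4}])` averages a perfect agreement on the first orbit with a complete disagreement on the second.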

When undirected graphlets on up to 5 vertices are used, $l = 29$ and $m = 72$. The idea of graphlets can be extended to directed graphs, in which case they are called digraphlets. This extension is worth considering since Petri nets have the structure of a directed bipartite graph.

There are 39 directed graphlets for $n \in \{3, 4\}$ and one directed graphlet on two nodes (see Fig. 5). In these graphlets there are 129 orbits^¹⁶.
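These counts can be checked by brute-force enumeration. The sketch below counts weakly connected graphs up to isomorphism, assuming that directed graphlets have at most one arc between any pair of vertices (no anti-parallel arcs); under that assumption it reproduces both the 30 undirected graphlets on 2–5 vertices and the 1 + 39 directed graphlets on 2–4 vertices. The function names are illustrative.

```python
from itertools import combinations, permutations, product


def is_weakly_connected(n, edges):
    """True if the graph is connected when arc directions are ignored."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, stack = {0}, [0]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n


def canonical_form(edges, perms, directed):
    """Smallest sorted edge list over all vertex relabelings."""
    forms = []
    for p in perms:
        relabeled = []
        for u, v in edges:
            a, b = p[u], p[v]
            if not directed and a > b:
                a, b = b, a  # undirected edges are stored as (small, large)
            relabeled.append((a, b))
        forms.append(tuple(sorted(relabeled)))
    return min(forms)


def count_graphlets(n, directed):
    """Number of non-isomorphic (weakly) connected graphs on n vertices."""
    pairs = list(combinations(range(n), 2))
    perms = list(permutations(range(n)))
    # Per vertex pair: 0 = no edge, 1 = u -> v (or undirected edge), 2 = v -> u.
    states = (0, 1, 2) if directed else (0, 1)
    classes = set()
    for choice in product(states, repeat=len(pairs)):
        edges = [(u, v) if s == 1 else (v, u)
                 for (u, v), s in zip(pairs, choice) if s != 0]
        if is_weakly_connected(n, edges):
            classes.add(canonical_form(edges, perms, directed))
    return len(classes)


undirected_counts = [count_graphlets(n, directed=False) for n in range(2, 6)]
directed_counts = [count_graphlets(n, directed=True) for n in range(2, 5)]
```

The enumeration is exponential in the number of vertex pairs, so it is only meant as a sanity check for graphlet-sized graphs.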

Fig. 5. All directed graphlets built on 2, 3 and 4 vertices. The ones shown in black can occur in Petri nets.

In this case the directed GDDA (DGDDA) can be defined analogously to the undirected case^¹⁶:

$$A_{d}(G, H) = \frac{1}{d + 1} \sum_{j = 0}^{d} A_{d}^{j}(G, H)$$

where $d = 128$ is the maximal index of an orbit in a directed graphlet and the agreements $A_{d}^{j}(G, H)$, $j = 0, 1, \dots, 128$, are calculated for these orbits.

Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
