Time series modelling and analysis

Alejandra Aranburu-Imatz; Jorge E. Jiménez-Hornero; Ignacio Morales-Cané; Pablo Jesús López-Soto

Concise Method

This method is extracted from research article: Air Qual Atmos Health, Jan 2023

Environmental pollution in North-Eastern Italy and its influence on chronic obstructive pulmonary disease: time series modelling and analysis using visibility graphs

DOI: 10.1007/s11869-023-01310-7

To analyze the quantitative time series data, we used descriptive statistics, including measures of frequency, central tendency, and dispersion. Normality and homogeneity of variance were assessed using the Shapiro–Wilk and Levene tests, respectively. Continuous variables were compared using the t-test, and categorical data using the χ² test. To compare the mean values of continuous variables, we used one-factor ANOVA, and to assess the correlation between quantitative variables, Spearman's rank correlation. To capture the non-linear time effects on the response variable (hospital admissions), we fitted two generalized additive mixed models (GAMMs), one per study area (Belluno and Feltre). All hypothesis tests were two-sided, and results with p < 0.05 (95% confidence level) were considered statistically significant.

A visibility graph (VG) is a mathematical entity made up of nodes, which correspond to the points of the transformed time series, and the edges that connect them. Two nodes at times $t_{a}$ and $t_{b}$, with values of the analyzed variable $y_{a}$ and $y_{b}$, respectively, are linked if every intermediate node at time $t_{c}$ ($t_{a} < t_{c} < t_{b}$) with value $y_{c}$ fulfils the visibility criterion of Eq. (1) (Lacasa et al. 2008):

$$y_{c} < y_{a} + (y_{b} - y_{a}) \frac{t_{c} - t_{a}}{t_{b} - t_{a}} \qquad (1)$$

Two adjacent-in-time nodes are always connected because there are no intermediate nodes between them.
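The linking rule can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the natural visibility criterion in its standard form, $y_{c} < y_{a} + (y_{b} - y_{a})(t_{c} - t_{a})/(t_{b} - t_{a})$, checks every pair of points by brute force, and uses a made-up five-point series. The function name `natural_visibility_edges` is hypothetical.

```python
from itertools import combinations


def natural_visibility_edges(t, y):
    """Return the set of edges (a, b), with a < b, of the natural visibility graph.

    t: strictly increasing time stamps; y: observed values (same length).
    """
    edges = set()
    for a, b in combinations(range(len(y)), 2):
        # (a, b) is an edge when every intermediate point lies strictly below
        # the straight line joining (t[a], y[a]) and (t[b], y[b]).
        # For adjacent points the range is empty, so all() is True.
        if all(
            y[c] < y[a] + (y[b] - y[a]) * (t[c] - t[a]) / (t[b] - t[a])
            for c in range(a + 1, b)
        ):
            edges.add((a, b))
    return edges


# Hypothetical toy series (not data from the study).
t = [0, 1, 2, 3, 4]
y = [3.0, 1.0, 2.5, 0.5, 4.0]
edges = natural_visibility_edges(t, y)
print(sorted(edges))
```

The brute-force check costs O(N³) in the worst case, which is acceptable for short series; faster divide-and-conquer constructions exist but are not shown here.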

The criterion from Eq. (1) corresponds to the so-called natural visibility graph (NVG); however, it is possible to apply other visibility criteria to transform the time series, for example, the one used to construct the horizontal visibility graph (HVG) (Luque et al. 2009), which is simpler to obtain, although there is some information loss from the original time series.
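For comparison, the horizontal criterion can be sketched as follows. The sketch assumes the usual HVG rule of Luque et al. (2009), in which two nodes are linked when every intermediate value is strictly lower than both of them; the function name and the toy series are hypothetical.

```python
def horizontal_visibility_edges(y):
    """Return the set of edges (a, b), with a < b, of the horizontal visibility graph."""
    edges = set()
    n = len(y)
    for a in range(n - 1):
        for b in range(a + 1, n):
            # A horizontal line between y[a] and y[b] must clear every
            # intermediate point, i.e. each y[c] is below min(y[a], y[b]).
            if all(y[c] < min(y[a], y[b]) for c in range(a + 1, b)):
                edges.add((a, b))
    return edges


# Hypothetical toy series with unit time steps (not data from the study).
hvg_edges = horizontal_visibility_edges([1.0, 2.0, 4.0, 3.0])
print(sorted(hvg_edges))
```

In this toy series only consecutive points are linked. Under the natural criterion with unit time steps, the first and third points would also see each other (2.0 lies below the line from 1.0 to 4.0), which illustrates the information the HVG discards.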

Applying visibility criterion (1) yields a symmetric, square $N \times N$ sparse binary adjacency matrix (2), where $N$ is the number of nodes, that represents the VG. Each row contains the information associated with one node: $a_{ij} = 1$ means that nodes $i$ and $j$ have visibility (an edge connects them), and $a_{ij} = 0$ means no visibility. The diagonal elements of this matrix are always zero (hollow matrix) because a node has no visibility with itself, and the elements immediately above and below the diagonal are equal to 1 because a node always has visibility with its adjacent nodes.

$$A = \begin{pmatrix} 0 & 1 & \cdots & a_{1N} \\ 1 & 0 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 1 \\ a_{N1} & \cdots & 1 & 0 \end{pmatrix} \qquad (2)$$
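The structural properties listed above are easy to verify numerically. The sketch below builds a dense adjacency matrix from an edge list and checks symmetry, the zero diagonal, and the ones next to the diagonal. The edge list is a hypothetical five-node visibility graph, and `adjacency_matrix` is an illustrative helper; a sparse representation would be preferable for long series.

```python
def adjacency_matrix(n, edges):
    """Build the n x n binary adjacency matrix of an undirected graph."""
    a = [[0] * n for _ in range(n)]
    for i, j in edges:
        a[i][j] = 1  # visibility is mutual, so the matrix is symmetric
        a[j][i] = 1
    return a


# Hypothetical edge list of a five-node visibility graph.
n = 5
edges = {(0, 1), (1, 2), (2, 3), (3, 4), (0, 2), (0, 4), (2, 4)}
A = adjacency_matrix(n, edges)

is_symmetric = all(A[i][j] == A[j][i] for i in range(n) for j in range(n))
is_hollow = all(A[i][i] == 0 for i in range(n))
neighbours_linked = all(A[i][i + 1] == 1 for i in range(n - 1))
print(is_symmetric, is_hollow, neighbours_linked)
```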

Several centrality measures, or properties of the adjacency matrix, can be defined. Among them, the degree of each node ($k_{i}$) counts the number of nodes connected to it through an edge under the visibility criterion ($k_{i} = \sum_{j} a_{ij}$). This value indicates the relative importance of a node, since the nodes with the highest degrees are normally also those with the highest values of the analyzed variable in the time series. An important characteristic of the VG derived from the degrees is the degree probability distribution $P(k)$, which is calculated for each degree by dividing the number of nodes with that degree by the total number of nodes. $P(k)$ is known to provide information about the nature of the original time series, for example, its periodic, chaotic, random, or multifractal behavior (Lacasa et al. 2008; Mali et al. 2018).
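As a small worked example, the degrees and the degree distribution can be read directly from the adjacency matrix. The matrix below is a hypothetical five-node visibility graph, and the helper names `degrees` and `degree_distribution` are illustrative.

```python
from collections import Counter


def degrees(adj):
    """Degree of each node: k_i = sum_j a_ij (row sums of the adjacency matrix)."""
    return [sum(row) for row in adj]


def degree_distribution(adj):
    """P(k): number of nodes with degree k divided by the total number of nodes."""
    k = degrees(adj)
    counts = Counter(k)
    return {deg: counts[deg] / len(k) for deg in sorted(counts)}


# Hypothetical adjacency matrix of a five-node visibility graph.
A = [
    [0, 1, 1, 0, 1],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 0, 1, 0, 1],
    [1, 0, 1, 1, 0],
]
k = degrees(A)
P = degree_distribution(A)
print(k, P)
```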

Multivariate analysis can be carried out using the multiplex visibility graph (MVG) approach, which is based on multi-layered networks in which each layer is the VG of one of the $M$ time series involved. An MVG is therefore represented by the vector of adjacency matrices of its constituent VGs, i.e., $Ω = \{A^{[1]}, A^{[2]}, \dots, A^{[M]}\}$, where $A^{[α]}$ is the adjacency matrix of the VG corresponding to the time series of variable (layer) $α$ and $a_{ij}^{[α]}$ is the $ij$-element of $A^{[α]}$. Two measurements obtained from an MVG are mainly used to perform multivariate analysis (Lacasa et al. 2015; Nicosia and Latora 2015): the average edge overlap ($ω$) and the interlayer mutual information ($IM$). The former quantifies, on average, the degree of overlap of the edges between any pair of nodes across the different VGs in the MVG (3), while the latter measures the correlation between the degree probability distributions $P(k^{[α]})$ and $P(k^{[β]})$ of the two VGs corresponding to layers $α$ and $β$ of the MVG (4).

$$ω = \frac{\sum_{i} \sum_{j>i} \sum_{α} a_{ij}^{[α]}}{M \sum_{i} \sum_{j>i} \left(1 - δ_{0, \sum_{α} a_{ij}^{[α]}}\right)} \qquad (3)$$

$δ_{0, \sum_{α} a_{ij}^{[α]}}$ is the Kronecker delta, which equals 1 if $\sum_{α} a_{ij}^{[α]}$ is zero and 0 otherwise. The maximum value of $ω$ is 1, meaning that the time profiles of the analyzed series are identical, and its minimum value is $1 / M$, meaning that every edge in the MVG exists in only one layer; therefore, a high value of $ω$ (close to 1) indicates a high correlation among the time series involved.
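A minimal sketch of the average edge overlap follows. It assumes the definition of Lacasa et al. (2015): the number of layer-wise edges, summed over all node pairs, divided by $M$ times the number of pairs linked in at least one layer (the pairs for which the Kronecker delta is 0). The function name and the two example layers are hypothetical.

```python
def average_edge_overlap(layers):
    """Average edge overlap of a multiplex given as a list of adjacency matrices."""
    m = len(layers)
    n = len(layers[0])
    edge_count = 0    # numerator: edges summed over pairs (i < j) and layers
    linked_pairs = 0  # pairs linked in at least one layer (delta = 0)
    for i in range(n):
        for j in range(i + 1, n):
            in_layers = sum(layer[i][j] for layer in layers)
            edge_count += in_layers
            if in_layers > 0:
                linked_pairs += 1
    if linked_pairs == 0:
        raise ValueError("the multiplex has no edges in any layer")
    return edge_count / (m * linked_pairs)


# Two hypothetical five-node layers: a visibility graph and a simple chain.
layer_1 = [
    [0, 1, 1, 0, 1],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 0, 1, 0, 1],
    [1, 0, 1, 1, 0],
]
layer_2 = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
omega = average_edge_overlap([layer_1, layer_2])
print(round(omega, 4))
```

Passing the same layer twice returns 1, and two layers with no edge in common return $1 / M$, matching the bounds stated above.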

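The interlayer mutual information of layers $α$ and $β$ is computed as (Lacasa et al. 2015):

$$IM_{α,β} = \sum_{k^{[α]}} \sum_{k^{[β]}} P(k^{[α]}, k^{[β]}) \log \frac{P(k^{[α]}, k^{[β]})}{P(k^{[α]}) P(k^{[β]})} \qquad (4)$$

$P(k^{[α]}, k^{[β]})$ is the joint probability distribution of having degree $k^{[α]}$ in layer $α$ and degree $k^{[β]}$ in layer $β$ (5):

$$P(k^{[α]}, k^{[β]}) = \frac{N_{k^{[α]}, k^{[β]}}}{N} \qquad (5)$$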

where $N_{k^{[α]}, k^{[β]}}$ is the number of nodes (time instants) that have degree $k^{[α]}$ in layer $α$ and degree $k^{[β]}$ in layer $β$. There is no fixed theoretical upper limit for $IM$; higher values (on the order of 1) indicate a stronger correlation between the degree probability distributions $P(k^{[α]})$ and $P(k^{[β]})$.
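The interlayer mutual information can be sketched from the degree sequences of two layers. The sketch assumes the standard mutual-information form, $\sum P(k^{[α]}, k^{[β]}) \log [P(k^{[α]}, k^{[β]}) / (P(k^{[α]}) P(k^{[β]}))]$, with the joint probability estimated as $N_{k^{[α]}, k^{[β]}} / N$. The natural logarithm is an assumption, since the base is not stated in the text, and the function name and example layers are hypothetical.

```python
from collections import Counter
from math import log


def interlayer_mutual_information(adj_a, adj_b):
    """Mutual information between the degree sequences of two layers."""
    k_a = [sum(row) for row in adj_a]  # degrees in layer alpha
    k_b = [sum(row) for row in adj_b]  # degrees in layer beta
    n = len(k_a)
    n_a = Counter(k_a)                 # nodes per degree in layer alpha
    n_b = Counter(k_b)                 # nodes per degree in layer beta
    n_ab = Counter(zip(k_a, k_b))      # nodes per degree pair at the same time instant
    im = 0.0
    for (ka, kb), count in n_ab.items():
        p_joint = count / n
        # Natural logarithm: an assumption, the base only rescales the result.
        im += p_joint * log(p_joint / ((n_a[ka] / n) * (n_b[kb] / n)))
    return im


# Two hypothetical five-node layers: a visibility graph and a simple chain.
layer_1 = [
    [0, 1, 1, 0, 1],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 0, 1, 0, 1],
    [1, 0, 1, 1, 0],
]
layer_2 = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
im = interlayer_mutual_information(layer_1, layer_2)
print(round(im, 4))
```

Only degree pairs that actually occur contribute to the sum, so terms with zero joint probability never reach the logarithm.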

Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.