Time series modelling and analysis

Alejandra Aranburu-Imatz
Jorge E. Jiménez-Hornero
Ignacio Morales-Cané
Pablo Jesús López-Soto

To analyze the quantitative time series data, we used descriptive statistics (measures of frequency, central tendency, and dispersion). Normality and homogeneity of variance were assessed with the Shapiro–Wilk and Levene tests, respectively. Continuous variables were compared with the t-test and categorical data with the χ² test. Mean values of continuous variables were compared with a one-factor ANOVA, and the correlation between quantitative variables was measured with the Spearman rank correlation. To capture the non-linear time effects on the response variable (hospital admissions), we fitted two generalized additive mixed models (GAMMs), one per study area (Belluno and Feltre). All hypothesis tests were two-sided, and results with p < 0.05 (95% confidence level) were considered statistically significant.

A visibility graph (VG) is a mathematical object made up of nodes, which correspond to the points of the transformed time series, and the edges that connect them. Two nodes at times $t_a$ and $t_b$, with values $y_a$ and $y_b$ of the analyzed variable, respectively, are linked if every intermediate node at time $t_c$ ($t_a < t_c < t_b$) with value $y_c$ fulfils Eq. (1) (visibility criterion). Two nodes that are adjacent in time are always connected because there are no intermediate nodes between them.
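
Eq. (1) is not reproduced in this text. Assuming it is the natural visibility criterion as defined by Lacasa et al. (2008), it can be written with the symbols introduced above as:

```latex
y_c < y_b + (y_a - y_b)\,\frac{t_b - t_c}{t_b - t_a} \qquad (1)
```

Geometrically, the right-hand side is the height at $t_c$ of the straight line joining $(t_a, y_a)$ and $(t_b, y_b)$, so the two nodes are linked when every intermediate point lies strictly below that line.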

The criterion from Eq. (1) corresponds to the so-called natural visibility graph (NVG); however, it is possible to apply other visibility criteria to transform the time series, for example, the one used to construct the horizontal visibility graph (HVG) (Luque et al. 2009), which is simpler to obtain, although there is some information loss from the original time series.
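
As an illustration of the alternative criterion, the following minimal sketch builds the edge list of an HVG. It assumes the horizontal visibility criterion in its usual form (two nodes are linked if every intermediate value is strictly lower than both of them); the function name and the toy series are hypothetical and not taken from the original study.

```python
def horizontal_visibility_edges(y):
    """Return the HVG edges (i, j), i < j, of the series y.

    Assumed criterion: nodes i and j are linked if every
    intermediate value y[c] is strictly below min(y[i], y[j]).
    """
    n = len(y)
    edges = []
    for i in range(n - 1):
        for j in range(i + 1, n):
            # For adjacent nodes the range is empty, so all() is True.
            if all(y[c] < min(y[i], y[j]) for c in range(i + 1, j)):
                edges.append((i, j))
    return edges


# Hypothetical toy series, for illustration only.
series = [1.0, 2.0, 4.0, 1.5]
hvg_edges = horizontal_visibility_edges(series)
print(hvg_edges)
```

Because the horizontal criterion only compares heights and ignores the slope between the two nodes, every HVG edge is also an NVG edge, but not the other way round, which is one way to see the information loss mentioned above.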

The application of visibility criterion (1) leads to a symmetric, square, sparse, binary $N \times N$ adjacency matrix (2) that represents the VG, where $N$ is the number of nodes. Each row contains the information associated with one node: $a_{ij} = 1$ means that nodes $i$ and $j$ have visibility (an edge connects them) and $a_{ij} = 0$ means no visibility. The diagonal elements of this matrix are always zero (hollow matrix) because a node has no visibility with itself, and the elements immediately above and below the diagonal are equal to 1 because a node always has visibility with its adjacent nodes.
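
The construction of the adjacency matrix can be sketched as follows. This is a minimal, unoptimized example that assumes the natural visibility criterion in its standard form (every intermediate point lies strictly below the line joining the two nodes) and strictly increasing sampling times; the function name and the toy series are hypothetical.

```python
def natural_visibility_adjacency(t, y):
    """Return the N x N adjacency matrix (lists of 0/1) of the NVG.

    Assumed criterion for linking nodes i < j: for every i < c < j,
    y[c] < y[j] + (y[i] - y[j]) * (t[j] - t[c]) / (t[j] - t[i]).
    Times t are assumed to be strictly increasing.
    """
    n = len(y)
    a = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        for j in range(i + 1, n):
            visible = all(
                y[c] < y[j] + (y[i] - y[j]) * (t[j] - t[c]) / (t[j] - t[i])
                for c in range(i + 1, j)
            )
            if visible:
                # Visibility is mutual, so the matrix is symmetric.
                a[i][j] = a[j][i] = 1
    return a


# Hypothetical toy series, for illustration only.
t = [0, 1, 2, 3, 4]
y = [3.0, 1.0, 2.5, 0.5, 4.0]
A = natural_visibility_adjacency(t, y)
for row in A:
    print(row)
```

The double loop with an inner scan costs on the order of N³ operations in the worst case, which is acceptable for short series; faster divide-and-conquer constructions exist but are outside the scope of this sketch.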

Several centrality measures can be defined from the adjacency matrix. One of the most important is the degree of each node, $k_i$, which counts the number of nodes connected to it through an edge under the visibility criterion ($k_i = \sum_j a_{ij}$). The degree indicates the relative importance of a node, since the nodes with the highest degrees are normally also those with the highest values of the analyzed variable in the time series. An important characteristic of the VG derived from the degrees is the degree probability distribution $P(k)$, which is obtained for each degree by dividing the number of nodes with that degree by the total number of nodes. The distribution $P(k)$ can provide information about the nature of the original time series, for example its periodic, chaotic, random, or multifractal behavior (Lacasa et al. 2008; Mali et al. 2018).
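
The degree and the degree probability distribution follow directly from the adjacency matrix, as in this minimal sketch. The function names and the 5-node matrix are hypothetical and serve only as an illustration.

```python
from collections import Counter


def degrees(adjacency):
    """Degree of each node: the sum of its row of the adjacency matrix."""
    return [sum(row) for row in adjacency]


def degree_distribution(adjacency):
    """Map each observed degree k to the fraction of nodes with that degree."""
    k = degrees(adjacency)
    n = len(k)
    counts = Counter(k)
    return {deg: counts[deg] / n for deg in sorted(counts)}


# Hypothetical 5-node adjacency matrix (symmetric, zero diagonal).
A = [
    [0, 1, 1, 0, 1],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 0, 1, 0, 1],
    [1, 0, 1, 1, 0],
]
k = degrees(A)
P = degree_distribution(A)
print(k)
print(P)
```

For this matrix the degrees are 3, 2, 4, 2, and 3, so two of the five nodes have degree 2, two have degree 3, and one has degree 4.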

Multivariate analysis can be carried out using the multiplex visibility graph (MVG) approach, based on a multi-layered network in which each layer is the VG of one of the $M$ time series involved. An MVG is therefore represented by the vector of adjacency matrices of its constituent VGs, i.e., $\Omega = \{A^{[1]}, A^{[2]}, \ldots, A^{[M]}\}$, where $A^{[\alpha]}$ is the adjacency matrix of the VG corresponding to the time series of variable (layer) $\alpha$ and $a_{ij}^{[\alpha]}$ is the $ij$-element of that matrix. Two measures obtained from an MVG are mainly used to perform multivariate analysis (Lacasa et al. 2015; Nicosia and Latora 2015): the average edge overlap ($\omega$) and the interlayer mutual information (IM). The former quantifies, on average, the degree of overlap of the edges between any pair of nodes across the different VGs in the MVG (3), while the latter measures the correlation between the degree probability distributions $P(k^{[\alpha]})$ and $P(k^{[\beta]})$ of the two VGs corresponding to layers $\alpha$ and $\beta$ of the MVG (4).
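
Eq. (3) is not reproduced in this text. Assuming the definition of the average edge overlap given by Lacasa et al. (2015), and writing the elements of the layer adjacency matrices as $a_{ij}^{[\alpha]}$, it reads:

```latex
\omega = \frac{\sum_{i} \sum_{j>i} \sum_{\alpha} a_{ij}^{[\alpha]}}
              {M \sum_{i} \sum_{j>i} \left( 1 - \delta_{0,\,\sum_{\alpha} a_{ij}^{[\alpha]}} \right)} \qquad (3)
```

The numerator counts every edge once per layer in which it appears, and the denominator is $M$ times the number of node pairs linked in at least one layer.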

Here $\delta_{0,\sum_{\alpha} a_{ij}^{[\alpha]}}$ is the Kronecker delta, which is equal to 1 if $\sum_{\alpha} a_{ij}^{[\alpha]} = 0$ and 0 otherwise. The maximum value of $\omega$ is 1, reached when every edge is present in all layers, as occurs when the time profiles of the analyzed series are identical. Its minimum value is $1/M$, which means that every edge in the MVG exists in only one layer. Therefore, a high value of $\omega$ (close to 1) indicates a high correlation between the time series involved.
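
Eq. (4), which defines the interlayer mutual information, is not reproduced in this text. Assuming the definition given by Lacasa et al. (2015), with the subscript notation $\mathrm{IM}_{\alpha,\beta}$ introduced here only to identify the pair of layers, it reads:

```latex
\mathrm{IM}_{\alpha,\beta} = \sum_{k^{[\alpha]}} \sum_{k^{[\beta]}}
    P\left(k^{[\alpha]}, k^{[\beta]}\right)
    \log \frac{P\left(k^{[\alpha]}, k^{[\beta]}\right)}
              {P\left(k^{[\alpha]}\right) P\left(k^{[\beta]}\right)} \qquad (4)
```

The base of the logarithm is not stated in this text; it only rescales IM by a constant factor.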

$P(k^{[\alpha]}, k^{[\beta]})$ is the joint probability distribution of having degree $k^{[\alpha]}$ in layer $\alpha$ and degree $k^{[\beta]}$ in layer $\beta$ (5).
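
Eq. (5) is not reproduced in this text. From the definitions given here, and with $N$ the total number of nodes, the joint probability can be written as:

```latex
P\left(k^{[\alpha]}, k^{[\beta]}\right) = \frac{N_{k^{[\alpha]},\,k^{[\beta]}}}{N} \qquad (5)
```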

Here, $N_{k^{[\alpha]},k^{[\beta]}}$ is the number of nodes (time instants) that have degree $k^{[\alpha]}$ in layer $\alpha$ and degree $k^{[\beta]}$ in layer $\beta$. There is no theoretical upper limit for IM, but the higher its value (around 1 or above), the stronger the correlation between the degree probability distributions $P(k^{[\alpha]})$ and $P(k^{[\beta]})$.
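
Both multiplex measures can be computed from the layer adjacency matrices, as in the minimal sketch below. It assumes the definitions of $\omega$ and IM from Lacasa et al. (2015) and uses the natural logarithm, which is an assumption because the base is not stated here; the function names and the two toy layers are hypothetical.

```python
import math
from collections import Counter


def average_edge_overlap(layers):
    """Average edge overlap of a list of M adjacency matrices of equal size."""
    m = len(layers)
    n = len(layers[0])
    total = 0    # each edge counted once per layer in which it appears
    present = 0  # node pairs linked in at least one layer
    for i in range(n):
        for j in range(i + 1, n):
            s = sum(a[i][j] for a in layers)
            total += s
            if s > 0:
                present += 1
    if present == 0:
        raise ValueError("no edges in any layer")
    return total / (m * present)


def interlayer_mutual_information(a_alpha, a_beta):
    """Mutual information (natural log, an assumption) between the
    degree distributions of two layers with the same set of nodes."""
    k_a = [sum(row) for row in a_alpha]
    k_b = [sum(row) for row in a_beta]
    n = len(k_a)
    count_a = Counter(k_a)
    count_b = Counter(k_b)
    count_ab = Counter(zip(k_a, k_b))  # same node, i.e. same time instant
    im = 0.0
    for (ka, kb), c in count_ab.items():
        p_joint = c / n
        im += p_joint * math.log(p_joint / ((count_a[ka] / n) * (count_b[kb] / n)))
    return im


# Hypothetical two-layer example with 5 nodes per layer.
A1 = [
    [0, 1, 1, 0, 1],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 0, 1, 0, 1],
    [1, 0, 1, 1, 0],
]
A2 = [  # only nodes adjacent in time are linked
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
omega = average_edge_overlap([A1, A2])
im = interlayer_mutual_information(A1, A2)
print(omega, im)
```

In this example the first layer has 7 edges and the second has 4, all of them shared with the first, so $\omega = 11/14$; passing the same layer twice gives $\omega = 1$.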
