2.4. Tigramite Causal Discovery

Supat Saetia; Natsue Yoshimura; Yasuharu Koike


This method is extracted from research article: Front Neuroinform, Feb 2021

Constructing Brain Connectivity Model Using Causal Network Reconstruction Approach

DOI: 10.3389/fninf.2021.619557


For connectivity model construction, we propose the use of the Tigramite causal discovery framework (Figure 1D). To obtain causal information from measured variables, some assumptions are needed. This framework focuses on three main assumptions under which the time-series graph represents a causal relation (Runge, 2018).

The first assumption is Causal Sufficiency, which assumes that no unobserved variable exists that influences any pair of variables in our set, either directly or indirectly. We need this assumption because it is impossible to ensure that we have measured all possible variables (Pearl, 2009). The second assumption is the Causal Markov Condition. This condition dictates the relationship between process X and its associated graph G. It implies that once we know the value of a node's parent at time τ, all other variables in the past become irrelevant for predicting the state of the current node (Spirtes et al., 2000). The third assumption is Faithfulness. Faithfulness guarantees that the graph entails all conditional independence relations that are implied by the Markov condition (Spirtes et al., 2000).

In addition, for the Causal Markov Condition to hold, we must assume that there are no instantaneous (contemporaneous) causal effects. It may seem counterintuitive to consider instantaneous effects between dynamical systems, because the physical speed of information transfer, i.e., the speed of light, is finite. The problem arises when the time-series cannot be sampled with sufficient resolution, so that effects faster than the sampling interval appear instantaneous (Runge, 2018).

The causal discovery algorithm used in this framework is PCMCI. This approach was implemented in this framework to address some of the shortcomings of the PC (Peter and Clark) algorithm (Spirtes and Glymour, 2016). The PC algorithm was invented for random variables without assuming a time order (Lauritzen, 1996). Its process consists of several phases where first an undirected graphical model is estimated, then its links are adjusted using a set of logical rules (Spirtes and Glymour, 2016).

Tigramite defines the time-series graph of a stationary multivariate discrete-time stochastic process $\mathbf{X}$ of dimension N as the graph $G = (V \times \mathbb{Z}, E)$, whose nodes are the components in V at each time $t \in \mathbb{Z}$. Variables $X_{t - \tau}^{i}$ and $X_{t}^{j}$ are connected by a lag-specific directed link $X_{t - \tau}^{i} \to X_{t}^{j}$ in $G$ for $\tau > 0$ if and only if

$$X_{t - \tau}^{i} \not\perp\!\!\!\perp X_{t}^{j} \mid \mathbf{X}_{t}^{-} \setminus \{X_{t - \tau}^{i}\}$$

where $\mathbf{X}_{t}^{-} = (\mathbf{X}_{t - 1}, \mathbf{X}_{t - 2}, \dots)$. $\mathbf{X}$, $\mathbf{X}_{t}$, and $\mathbf{X}_{t}^{-}$ are considered as sets of random variables. The symbol $\setminus$ denotes set difference (Runge, 2018).

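Stationarity is assumed for process $\mathbf{X}$. The process is causally stationary over a time index set $T$ if and only if, for all links $X_{t - \tau}^{i} \to X_{t}^{j}$ in the graph (Runge, 2018),

$$X_{t - \tau}^{i} \not\perp\!\!\!\perp X_{t}^{j} \mid \mathbf{X}_{t}^{-} \setminus \{X_{t - \tau}^{i}\} \quad \text{holds for all } t \in T.$$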

The framework constructs a time-series graph of a multivariate stochastic process $\mathbf{X}_{t}$ by evaluating the conditional mutual information (CMI) from a subprocess $X_{t - \tau}$ to $Y_{t}$ for $\tau > 0$, drawing a link when

$$I(X_{t - \tau}; Y_{t} \mid \mathbf{X}_{t}^{-} \setminus \{X_{t - \tau}\}) > 0$$

with infinite past $\mathbf{X}_{t}^{-} = (\mathbf{X}_{t - 1}, \mathbf{X}_{t - 2}, \dots)$. If Y ≠ X, the link $X_{t - \tau} \to Y_{t}$ is considered a coupling or cross-link at lag τ. If Y = X, the link is considered an autodependency or autolink at lag τ (Runge, 2015).

The CMI for multivariate random variables X, Y, Z is defined as

$$I_{X;Y|Z} = \iiint p(x, y, z) \log \frac{p(x, y \mid z)}{p(x \mid z)\, p(y \mid z)} \, dx \, dy \, dz = H_{XZ} + H_{YZ} - H_{Z} - H_{XYZ} \qquad (4)$$

where H denotes the Shannon entropy and densities p(·) are assumed to exist (Runge, 2017). The framework tests the conditional independence hypothesis

$$H_{0}: X \perp\!\!\!\perp Y \mid Z$$

against the general alternative. $I_{X;Y|Z} = 0$ if, and only if, $X \perp\!\!\!\perp Y \mid Z$, provided that densities are well-defined (Runge, 2017). Tigramite generates the null distribution under $H_{0}$ by permutation for the hypothesis tests used in constructing the graph structure. The conditional independence test in this framework is based on the CMI defined in Equation (4); because it is model-free, it can in principle handle non-linear dependencies (Runge, 2018).
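As a rough illustration of the entropy form of the CMI, $H_{XZ} + H_{YZ} - H_{Z} - H_{XYZ}$, and of a permutation-based null distribution, the sketch below works on discrete toy data with a plug-in entropy estimate and shuffles X within each value of Z. This is a simplified stand-in under stated assumptions, not the test Tigramite runs: the function names, the sample size, and the 200 permutations are hypothetical choices made for the example.

```python
import math
import random
from collections import Counter


def entropy(samples):
    """Plug-in Shannon entropy (in nats) of a list of hashable outcomes."""
    n = len(samples)
    return -sum(c / n * math.log(c / n) for c in Counter(samples).values())


def cmi(x, y, z):
    """I(X;Y|Z) = H_XZ + H_YZ - H_Z - H_XYZ for discrete samples."""
    return (entropy(list(zip(x, z))) + entropy(list(zip(y, z)))
            - entropy(list(z)) - entropy(list(zip(x, y, z))))


def permutation_pvalue(x, y, z, n_perm=200, seed=0):
    """Shuffle x within each stratum of z to sample the null X _||_ Y | Z."""
    rng = random.Random(seed)
    observed = cmi(x, y, z)
    strata = {}
    for idx, zv in enumerate(z):
        strata.setdefault(zv, []).append(idx)
    exceed = 0
    for _ in range(n_perm):
        x_perm = list(x)
        for idxs in strata.values():
            vals = [x[i] for i in idxs]
            rng.shuffle(vals)
            for i, v in zip(idxs, vals):
                x_perm[i] = v
        if cmi(x_perm, y, z) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)


# Toy binary data (assumption): y_dep copies x 90% of the time,
# y_ind depends on z only and is therefore independent of x given z.
rng = random.Random(1)
n = 500
z = [rng.randint(0, 1) for _ in range(n)]
x = [rng.randint(0, 1) for _ in range(n)]
y_dep = [xi if rng.random() < 0.9 else 1 - xi for xi in x]
y_ind = [zi if rng.random() < 0.9 else 1 - zi for zi in z]

cmi_dep, p_dep = permutation_pvalue(x, y_dep, z)
cmi_ind, p_ind = permutation_pvalue(x, y_ind, z)
print(f"dependent:   CMI={cmi_dep:.3f}, p={p_dep:.3f}")
print(f"independent: CMI={cmi_ind:.3f}, p={p_ind:.3f}")
```

Shuffling within strata of Z keeps the X-Z relation intact while breaking any X-Y link, which is what the null hypothesis requires. For continuous data the strata would have to be replaced by neighborhoods in Z.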

The framework then measures the information transfer from the past of a process X at times t′ < t to the target variable Y at time t, excluding the common information in the history shared by X and Y. This transfer entropy (TE) is defined as (Runge, 2015)

To overcome the curse of dimensionality of the condition in each term, TE is estimated using decomposed transfer entropy (DTE) (Runge et al., 2012), utilizing the theory of graphical models (Lauritzen, 1996; Eichler, 2012) which implies that

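for a certain finite subset $S_{Y_{t}, X_{t - \tau}} \subset (\mathbf{X}_{t}^{-} \setminus X_{t}^{-}) \cup X_{t - \tau}^{-}$ of the conditions, where $\mathbf{X}_{t}^{-}$ is the past of the whole process and $X_{t}^{-}$ is the past of subprocess X. The suitable set $S_{Y_{t}, X_{t - \tau}}$ can be determined from the constructed time-series graph. The DTE is calculated by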

where τ^* is the smallest chosen τ (Runge et al., 2012).

The conditional independence test needed to compute CMI and TE in Tigramite is CMIknn, which is based on conditional mutual information estimated with the k-nearest neighbor entropy estimator developed by Kraskov et al. (2004):

$$\hat{I}_{X;Y|Z} = \Psi(k) + \frac{1}{n} \sum_{i = 1}^{n} \left[ \Psi(k_{i}^{z}) - \Psi(k_{i}^{xz}) - \Psi(k_{i}^{yz}) \right]$$

with the logarithmic derivative of the Gamma function $\Psi(x) = \frac{d}{dx} \ln \Gamma(x)$ and sample size n. The free parameter k is the number of nearest neighbors in the joint space $X \otimes Y \otimes Z$ around each sample point i, which defines the maximum-norm distance $\epsilon_{i}$. The counts $k_{i}^{xz}$, $k_{i}^{yz}$, and $k_{i}^{z}$ are the numbers of points with a distance smaller than $\epsilon_{i}$ in the subspaces $X \otimes Z$, $Y \otimes Z$, and $Z$, respectively (Runge et al., 2017).
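The counting procedure described above can be sketched in plain Python. The version below is a brute-force illustration, assuming the estimator takes the form $\Psi(k) + \frac{1}{n}\sum_i [\Psi(k_i^{z}) - \Psi(k_i^{xz}) - \Psi(k_i^{yz})]$ and that each count includes the reference point itself. The function names, k = 10, and the Gaussian toy data are hypothetical, and this is not Tigramite's own implementation.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer: psi(n) = -gamma + sum_{m=1}^{n-1} 1/m."""
    return -EULER_GAMMA + sum(1.0 / m for m in range(1, n))


def max_norm(a, b, dims):
    """Maximum-norm distance between points a and b restricted to dims."""
    return max(abs(a[d] - b[d]) for d in dims)


def cmi_knn(points, x_dims, y_dims, z_dims, k=10):
    """Nearest-neighbor CMI estimate over the rows of `points` (O(n^2))."""
    n = len(points)
    all_dims = x_dims + y_dims + z_dims
    xz, yz = x_dims + z_dims, y_dims + z_dims
    total = 0.0
    for i, p in enumerate(points):
        # eps_i: distance to the k-th nearest neighbor in the joint space.
        dists = sorted(max_norm(p, q, all_dims)
                       for j, q in enumerate(points) if j != i)
        eps = dists[k - 1]
        # Count points strictly closer than eps_i in each subspace; the
        # reference point is included, so every count is at least 1.
        k_xz = sum(max_norm(p, q, xz) < eps for q in points)
        k_yz = sum(max_norm(p, q, yz) < eps for q in points)
        k_z = sum(max_norm(p, q, z_dims) < eps for q in points)
        total += digamma_int(k_z) - digamma_int(k_xz) - digamma_int(k_yz)
    return digamma_int(k) + total / n


# Toy data (assumption): columns are (X, Y, Z) with X = Z + noise.
rng = random.Random(0)
dep, ind = [], []
for _ in range(300):
    z = rng.gauss(0, 1)
    x = z + rng.gauss(0, 1)
    dep.append((x, x + 0.3 * rng.gauss(0, 1), z))  # Y depends on X given Z
    ind.append((x, z + rng.gauss(0, 1), z))        # Y depends on Z only

cmi_dep = cmi_knn(dep, [0], [1], [2])
cmi_ind = cmi_knn(ind, [0], [1], [2])
print(f"dependent: {cmi_dep:.3f}  conditionally independent: {cmi_ind:.3f}")
```

The brute-force neighbor search keeps the example dependency-free. A tree-based search would be the usual choice for realistic sample sizes.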

The appropriate maximum time delay τ_max usually depends on the nature of the signal being investigated. We can estimate τ_max by observing how the lagged unconditional dependencies decay. In this study, we observed that the dependencies decay beyond a lag of 15. Within this framework, the significance level α takes the role of a regularization parameter for model selection, since a precise assessment of uncertainty is not possible in iterative hypothesis testing. In our motor task-fMRI application, the algorithm parameters we used are as follows: maximum time lag τ_max = 15 time points and significance level α = 0.01 (Student's t-test).
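One way to inspect how lagged unconditional dependencies decay is sketched below. It assumes Pearson correlation as the dependence measure and a hypothetical cut-off of 0.1 on a synthetic AR(1) series. The source does not state which measure or threshold was used, so both are illustrative only.

```python
import random


def lagged_correlation(x, y, lag):
    """Pearson correlation between x[t - lag] and y[t]."""
    a, b = x[:len(x) - lag], y[lag:]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5


def suggest_tau_max(x, y, max_lag, threshold):
    """Largest lag whose |correlation| still exceeds `threshold` (0 if none)."""
    above = [lag for lag in range(1, max_lag + 1)
             if abs(lagged_correlation(x, y, lag)) > threshold]
    return max(above, default=0)


# Toy AR(1) series (assumption): dependence decays roughly as 0.8 ** lag.
rng = random.Random(0)
series = [0.0]
for _ in range(2999):
    series.append(0.8 * series[-1] + rng.gauss(0, 1))

decay = [lagged_correlation(series, series, lag) for lag in range(1, 31)]
tau_max = suggest_tau_max(series, series, max_lag=30, threshold=0.1)
print("lag-1 dependence:", round(decay[0], 3))
print("suggested tau_max:", tau_max)
```

In a multivariate setting the same scan would be repeated for every pair of variables, and τ_max chosen to cover the slowest decay.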

To quantify causal interactions between subprocesses, the framework proposes a measure I of the linear causal effect (CE) of a perturbation (Runge, 2015)

where Ψ(τ) is computed iteratively from matrix products of the estimated coefficient matrices Φ(τ) (Runge et al., 2015)

The mediated causal effect (MCE) through a component k is the sum over the products of path coefficients only along causal paths through k.

where Ψ^(k)(τ) is computed from Equation (12) with modified path coefficient matrices Φ^(k)(τ) in which all links toward component k are set to zero

which blocks all paths through component k at any lag (Runge et al., 2015).
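The path-based quantities above can be illustrated numerically. The sketch below rests on three assumptions not confirmed by the text here: that Ψ follows the standard vector-autoregressive recursion Ψ(0) = identity, Ψ(τ) = Σ_{s=1..τ} Φ(s) Ψ(τ − s); that the entry Φ_{ji}(s) is the coefficient of $X_{t - s}^{i}$ in the equation for $X_{t}^{j}$; and that the mediated effect through k equals the total effect minus the effect obtained once links toward k are zeroed. The three-variable coefficients and all function names are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[r][m] * b[m][c] for m in range(n)) for c in range(n)]
            for r in range(n)]


def matadd(a, b):
    """Element-wise sum of two matrices stored as nested lists."""
    return [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a, b)]


def psi_matrices(phi, tau_max):
    """Return [Psi(0), ..., Psi(tau_max)] for phi = {lag: N x N matrix}."""
    n = len(next(iter(phi.values())))
    psi = [[[float(r == c) for c in range(n)] for r in range(n)]]
    for tau in range(1, tau_max + 1):
        acc = [[0.0] * n for _ in range(n)]
        for s in range(1, tau + 1):
            if s in phi:
                acc = matadd(acc, matmul(phi[s], psi[tau - s]))
        psi.append(acc)
    return psi


def block_component(phi, k):
    """Copy of phi with every link toward component k (row k) set to zero."""
    return {s: [[0.0] * len(row) if r == k else list(row)
                for r, row in enumerate(m)]
            for s, m in phi.items()}


# Hypothetical model: X0 -> X1 (lag 1), X1 -> X2 (lag 1), X0 -> X2 (lag 2).
phi = {
    1: [[0.0, 0.0, 0.0],
        [0.5, 0.0, 0.0],
        [0.0, 0.4, 0.0]],
    2: [[0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0],
        [0.1, 0.0, 0.0]],
}
psi = psi_matrices(phi, tau_max=2)
psi_blocked = psi_matrices(block_component(phi, k=1), tau_max=2)

ce_0_to_2 = psi[2][2][0]                             # total effect at lag 2
mce_0_to_2_via_1 = ce_0_to_2 - psi_blocked[2][2][0]  # share routed through X1
print(ce_0_to_2, mce_0_to_2_via_1)
```

In this toy model the lag-2 effect of X0 on X2 combines the direct link (0.1) with the indirect path through X1 (0.5 × 0.4), and blocking X1 leaves only the direct part.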

The aggregated causal effect (ACE) and aggregated causal susceptibility (ACS) are measured at the lag with maximum effect (Runge et al., 2015):

The average mediated causal effect (AMCE) is calculated based on causal paths through a given node

where $C_{k}$ is the set of interactions between all non-identical pairs i, j ≠ k at all lags 0 < τ ≤ τ_max where k is an intermediate component (at any lag) and $| C_{k} |$ denotes its cardinality (Runge et al., 2015).
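A small numerical sketch of these aggregate measures follows. It assumes that ACE averages, over all other targets, the maximum absolute causal effect across lags, that ACS does the same over all other sources, and that AMCE is the mean absolute mediated effect over the set $C_{k}$. The normalization by N − 1, the function names, and the toy values are assumptions made for illustration.

```python
def aggregated_effects(ce):
    """ce is a list over lags 1..tau_max of N x N matrices, where
    ce[lag - 1][j][i] is the causal effect of component i on component j."""
    n = len(ce[0])
    # Peak absolute effect of i on j across all lags.
    peak = [[max(abs(m[j][i]) for m in ce) for i in range(n)]
            for j in range(n)]
    ace = [sum(peak[j][i] for j in range(n) if j != i) / (n - 1)
           for i in range(n)]
    acs = [sum(peak[j][i] for i in range(n) if i != j) / (n - 1)
           for j in range(n)]
    return ace, acs


def average_mediated_effect(mce_values):
    """Mean |MCE| over C_k, given {(i, j, tau): mediated effect through k}."""
    return sum(abs(v) for v in mce_values.values()) / len(mce_values)


# Hypothetical causal effects for three components and two lags.
ce = [
    [[0.0, 0.0, 0.0],
     [0.5, 0.0, 0.0],
     [0.0, 0.4, 0.0]],   # lag 1
    [[0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0],
     [0.3, 0.0, 0.0]],   # lag 2
]
ace, acs = aggregated_effects(ce)
amce_1 = average_mediated_effect({(0, 2, 2): 0.2})  # paths through component 1
print(ace, acs, amce_1)
```

Taking the maximum over lags before averaging means each pair contributes its strongest lagged effect, regardless of the lag at which it occurs.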

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
