The LZ complexity [55] and its derivatives are closely related to the Kolmogorov complexity [56,57] and the Shannon entropy. They can be easily computed and have found wide applications in characterizing the randomness of complex data.
To compute the LZ complexity, a numerical sequence has to first be transformed into a symbolic sequence. The most popular approach is to convert the signal into a 0–1 sequence by comparing the signal with a threshold value $T_d$ [14]. That is, whenever the signal is larger than $T_d$, one maps the signal to 1; otherwise, one maps it to 0. One good choice of $T_d$ is the median of the signal [15]. When multiple threshold values are used, one may map the numerical sequence to a multisymbol sequence. Note that if the original numerical sequence is a nonstationary random walk-type process, one should analyze the stationary difference data instead of the original nonstationary data.
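For concreteness, the following minimal Python sketch (the function name `binarize` and its options are our own, not taken from [14,15]) performs this symbolization step, using the median as the threshold and, optionally, differencing random-walk-type data first:

```python
import numpy as np

def binarize(signal, difference_first=False):
    """Map a numerical series to a 0-1 symbolic sequence.

    If the series is a nonstationary random-walk-type process, set
    difference_first=True to symbolize the stationary increments
    instead of the raw data.
    """
    x = np.asarray(signal, dtype=float)
    if difference_first:
        x = np.diff(x)
    threshold = np.median(x)          # a common choice of the threshold
    return (x > threshold).astype(int)

# Example: symbolize the increments of a simulated random walk
walk = np.cumsum(np.random.randn(1000))
symbols = binarize(walk, difference_first=True)
```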
After the symbolic sequence is obtained, it can then be parsed to yield distinct words, and the words can be encoded. Let $L(n)$ denote the length of the encoded sequence for those words, where $n$ is the length of the original sequence. The LZ complexity can be computed as
$$C_{LZ} = \frac{L(n)}{n}. \qquad (2)$$
Note that this is very much in the spirit of the Kolmogorov complexity [56,57].
There exist many different methods for performing the parsing. One popular scheme was proposed by the original authors of the LZ complexity [55]. Another attractive method was proposed by Cover and Thomas [58]. Let $c(n)$ denote the number of words in the parsing of the source sequence by the second scheme. For each word, we can use $\log_2 c(n)$ bits to describe the location of the prefix to the word and one bit to describe the last bit. Then, the total length of the encoded sequence is $L(n) = c(n)\left[\log_2 c(n) + 1\right]$. Equation (2) then becomes
$$C_{LZ} = \frac{c(n)\left[\log_2 c(n) + 1\right]}{n}. \qquad (3)$$
When $n$ is very large, $c(n) \approx n/\log_2 n$ [55,58]. Dropping terms much smaller than $\log_2 n$, we can replace $\log_2 c(n) + 1$ in Equation (3) by $\log_2 n$ and obtain
$$C_{LZ} = \frac{c(n)}{n/\log_2 n}. \qquad (4)$$
This is the functional form of the commonly used definition of the LZ complexity. Unfortunately, $C_{LZ}$ depends on the sequence length $n$. Rapp et al. [59] were among the first to consider normalizing the LZ complexity to make it independent of the sequence length by computational means. This issue was reconsidered by Hu et al. [60] using an analytic approach.
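The parsing and normalization described above can be sketched as follows. This is only an illustrative implementation in the spirit of the sequential (Cover-Thomas-style) parsing, with our own function names; it is not the exact procedure of [59,60]:

```python
import numpy as np

def lz_word_count(symbols):
    """Sequentially parse a symbol sequence into distinct words
    (Cover-Thomas-style incremental parsing) and return c(n)."""
    seq = ''.join(map(str, symbols))
    words, start, length = set(), 0, 1
    while start + length <= len(seq):
        word = seq[start:start + length]
        if word in words:
            length += 1          # extend the word until it is new
        else:
            words.add(word)      # record the new word and move on
            start += length
            length = 1
    return len(words)

def lz_complexity(symbols):
    """Normalized LZ complexity, c(n) / (n / log2 n), cf. Equation (4)."""
    n = len(symbols)
    return lz_word_count(symbols) * np.log2(n) / n

# A long random 0-1 sequence should give a value roughly near 1
bits = np.random.randint(0, 2, 5000)
print(lz_complexity(bits))
```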
PE was introduced in [21] as a convenient means of analyzing a time series. It may be considered a measure from chaos theory, since embedding vectors are used in the analysis. Using the notation of [22], it can be described as follows.
For a given but otherwise arbitrary $i$, the $m$ real values of $X_i = [x(i), x(i+L), \ldots, x(i+(m-1)L)]$ are sorted in ascending order, $x(i+(j_1-1)L) \le x(i+(j_2-1)L) \le \cdots \le x(i+(j_m-1)L)$. When an equality occurs, e.g., $x(i+(j_{i_1}-1)L) = x(i+(j_{i_2}-1)L)$, the quantities $x$ are ordered according to the values of their corresponding $j$'s; that is, if $j_{i_1} < j_{i_2}$, then we write $x(i+(j_{i_1}-1)L) \le x(i+(j_{i_2}-1)L)$. Therefore, the vector $X_i$ is mapped onto a sequence of numbers $(j_1, j_2, \ldots, j_m)$, which is one of the $m!$ permutations of the $m$ distinct symbols $(1, 2, \ldots, m)$. When each such permutation is considered as a symbol, the reconstructed trajectory in the $m$-dimensional space is represented by a symbol sequence. Let the probability for the $K \le m!$ distinct symbols be $P_1, P_2, \ldots, P_K$. Then, the PE, denoted by $E_p(m)$, for the time series $\{x(i), i = 1, 2, \ldots\}$ is defined as
$$E_p(m) = -\sum_{k=1}^{K} P_k \ln P_k.$$
The maximum of $E_p(m)$ is $\ln(m!)$, attained when $P_k = 1/m!$ for all $k$. It is convenient to work with the normalized quantity
$$0 \le E_p = \frac{E_p(m)}{\ln(m!)} \le 1.$$
Thus, $E_p$ gives a measure of the departure of the time series under study from a completely random one: the smaller the value of $E_p$ is, the more structure the time series has.
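A minimal sketch of the PE computation just described is given below; the function name and the use of a stable sort for tie-breaking are our own choices for illustration:

```python
import numpy as np
from math import factorial
from collections import Counter

def permutation_entropy(x, m, L=1, normalize=True):
    """Permutation entropy of the series x with embedding dimension m and
    delay L; if normalize=True, divide by ln(m!) so the result lies in [0, 1]."""
    x = np.asarray(x, dtype=float)
    n_vec = len(x) - (m - 1) * L               # number of embedding vectors
    counts = Counter()
    for i in range(n_vec):
        window = x[i:i + (m - 1) * L + 1:L]
        # A stable argsort breaks ties by temporal order, matching the
        # convention for equal values described above.
        counts[tuple(np.argsort(window, kind='stable'))] += 1
    p = np.array(list(counts.values()), dtype=float) / n_vec
    ent = -np.sum(p * np.log(p))
    return ent / np.log(factorial(m)) if normalize else ent

# A white-noise series should give a value close to 1 (its maximum)
print(permutation_entropy(np.random.randn(3000), m=4, L=1))
```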
To detect interesting dynamical changes in a time series, one can partition a time series into overlapping or nonoverlapping segments of short length, compute the PE from each segment, and examine how the PE changes with the segments [22]. Here, we apply this approach to compute the PE from the minute-to-minute logarithmic yields of the composite indices of SSE and DJIA on each trading day, then check how the PE varies with time.
In this paper, for the daily stock index data of both the USA and China, we employ the same segmentation and fixed values of $m$ and $L$. For the intraday stock index data, since on each day the data for the Chinese market are shorter than those for the USA, we choose the parameter values separately for SSE and for DJIA. Note that the time delay $L$ was always chosen to be 1; this is based on the reasoning that stock data are basically random. As explained in [36], such a choice is not only sufficient but optimal. The selection of $m$ is mainly constrained by the length of each sub-dataset under computation: if the data segment is short, then $m$ cannot be too big. To better cope with the randomness in the data, $m$ should not be too small either. Because the embedding window spans more than one data point, the ordinal patterns it captures make it possible to quantify the correlations in the data to some degree with the PE. This will be discussed in more depth later in the paper. A sketch of this segment-wise computation is given below.
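The following sketch shows how the PE can be computed segment by segment; the 240-minute "day", the parameter values, and all names are placeholders for illustration, not the actual data or settings used in this paper:

```python
import numpy as np
from math import factorial
from collections import Counter

def _pe(x, m, L=1):
    """Normalized permutation entropy (same definition as in the sketch above)."""
    n = len(x) - (m - 1) * L
    counts = Counter(tuple(np.argsort(x[i:i + (m - 1) * L + 1:L], kind='stable'))
                     for i in range(n))
    p = np.array(list(counts.values()), dtype=float) / n
    return -np.sum(p * np.log(p)) / np.log(factorial(m))

def segment_pe(series, seg_len, m, L=1):
    """PE of consecutive nonoverlapping segments of length seg_len."""
    series = np.asarray(series, dtype=float)
    return np.array([_pe(series[k * seg_len:(k + 1) * seg_len], m, L)
                     for k in range(len(series) // seg_len)])

# Synthetic stand-in for minute-by-minute index values (real data not shown):
# 20 "days" of 240 one-minute prices, converted to logarithmic returns.
prices = 3000 + np.cumsum(np.random.randn(20 * 240))
log_returns = np.diff(np.log(prices))
daily_pe = segment_pe(log_returns, seg_len=240, m=4, L=1)
```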
Consider a process $X = \{X(i), i = 1, 2, \ldots\}$, where $X(i) = \sum_{j=1}^{i} x(j)$. Such processes are normally called nonstationary random-walk processes. Its differentiation, denoted as $x(i) = X(i) - X(i-1)$, is a covariance stationary stochastic process, with mean $\mu$, variance $\sigma^2$, and autocorrelation function $r(k)$ of the form [35]
$$r(k) \sim k^{2H-2}, \quad k \to \infty,$$
where $0 < H < 1$ is the Hurst parameter.
When $1/2 < H < 1$, $\sum_{k} r(k) = \infty$, leading to long-range temporal correlation. The process $X$ has a PSD of $1/f^{2H+1}$. A $1/f^{2H+1}$ process cannot be aptly modeled by a Markov process or an ARIMA model [61], since the PSDs of those processes are distinctly different from a power law. To adequately model a $1/f^{2H+1}$ process, a fractional-order process has to be used. A well-known process of this class is the fractional Brownian motion model [34].
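To make these scaling relations concrete, the following sketch generates increments with an approximately power-law PSD by simple Fourier filtering. This is only an illustrative approximation under our own naming, not the fractional Brownian motion construction of [34]:

```python
import numpy as np

def power_law_noise(n, H, seed=None):
    """Generate increments whose PSD decays approximately as 1/f^(2H-1),
    via Fourier filtering of random phases (an approximation only)."""
    rng = np.random.default_rng(seed)
    beta = 2 * H - 1                          # target PSD exponent of the increments
    freqs = np.fft.rfftfreq(n)
    amplitude = np.zeros_like(freqs)
    amplitude[1:] = freqs[1:] ** (-beta / 2)  # |spectrum| ~ f^(-beta/2)
    phases = np.exp(2j * np.pi * rng.random(len(freqs)))
    x = np.fft.irfft(amplitude * phases, n)
    return (x - x.mean()) / x.std()

x = power_law_noise(2 ** 14, H=0.75, seed=1)  # long-range correlated increments
y = np.cumsum(x)                              # the corresponding random-walk process
```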
There are many excellent methods to estimate the Hurst parameter $H$ [36]. One of the most popular is detrended fluctuation analysis (DFA) [62]. This involves constructing a random walk process
$$u(n) = \sum_{i=1}^{n} \left[ x(i) - \bar{x} \right], \quad n = 1, 2, \ldots, N,$$
where $\bar{x}$ is the mean of the series $\{x(i)\}$; dividing the constructed random walk process into nonoverlapping segments; determining the best linear or polynomial fit in each segment as the local trend; computing the variance of the differences between the random walk process and the local trends; and averaging these variances over all the segments. Clearly, DFA may involve discontinuities at the boundaries of adjacent segments. Such discontinuities could be detrimental when the data contain trends [63], nonstationarity [64], or nonlinear oscillatory components such as signs of rhythmic activity [65,66]. Fortunately, this shortcoming can be readily overcome using a method called adaptive fractal analysis (AFA) [38,39]. AFA is based on a nonlinear adaptive multiscale decomposition, which starts by partitioning a time series into segments of length $w = 2n + 1$ points, where neighboring segments overlap by $n + 1$ points. Each segment is then fitted with the best polynomial of order $M$. We denote the fitted polynomials for the $i$-th and $(i+1)$-th segments by $y^{(i)}(l_1)$ and $y^{(i+1)}(l_2)$, respectively, where $l_1, l_2 = 1, \ldots, 2n + 1$. We then define the fitting for the overlapped region as
$$y^{(c)}(l) = w_1\, y^{(i)}(l + n) + w_2\, y^{(i+1)}(l), \quad l = 1, 2, \ldots, n + 1,$$
where the weights $w_1 = \left(1 - \frac{l-1}{n}\right)$ and $w_2 = \frac{l-1}{n}$ can be written as $\left(1 - \frac{d_j}{n}\right)$ for $j = 1, 2$, and where $d_j$ denotes the distance between the point and the centers of $y^{(i)}$ and $y^{(i+1)}$, respectively. This means that the weights decrease linearly with the distance between the point and the center of the segment. Such a weighting ensures symmetry and effectively eliminates any jumps or discontinuities around the boundaries of neighboring segments, and therefore can maximally suppress the effect of complex nonlinear trends on the scaling analysis.
With the above procedure, AFA can be readily described. For an arbitrary window size $w$, we determine, for the random walk process $u(n)$, a global trend $v(n)$, $n = 1, 2, \ldots, N$, where $N$ is the length of the walk. The residual, $u(n) - v(n)$, characterizes fluctuations around the global trend, and its variance yields the Hurst parameter $H$ according to
$$F(w) = \left[ \frac{1}{N} \sum_{n=1}^{N} \left( u(n) - v(n) \right)^2 \right]^{1/2} \sim w^{H}.$$
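As an illustration of the fluctuation analysis described above, here is a minimal DFA sketch; the function name and window grid are our own choices, and AFA would instead subtract the globally smooth trend $v(n)$ built from the overlapping-segment fits:

```python
import numpy as np

def dfa(x, windows, order=1):
    """Detrended fluctuation analysis: return F(w) for each window size w.
    The slope of log F(w) versus log w estimates the Hurst parameter H."""
    x = np.asarray(x, dtype=float)
    u = np.cumsum(x - x.mean())                 # the constructed random walk u(n)
    F = []
    for w in windows:
        t = np.arange(w)
        resid_var = []
        for k in range(len(u) // w):            # nonoverlapping segments
            seg = u[k * w:(k + 1) * w]
            trend = np.polyval(np.polyfit(t, seg, order), t)   # local trend
            resid_var.append(np.mean((seg - trend) ** 2))
        F.append(np.sqrt(np.mean(resid_var)))
    return np.array(F)

# White noise should give H close to 0.5; long-range correlated data give H > 0.5
x = np.random.randn(2 ** 14)
windows = np.unique(np.logspace(1, 3, 20).astype(int))
F = dfa(x, windows)
H_est = np.polyfit(np.log(windows), np.log(F), 1)[0]
print(H_est)
```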