Dynamic Complexity Measures: definition and calculation

This work is a generalization of the López-Ruiz, Mancini and Calbet (LMC) and Shiner, Davison and Landsberg (SDL) complexity measures, considering that the state of a system or process is represented by a dynamical variable during a certain time interval. As the two complexity measures are based on the calculation of informational entropy, an equivalent information source is defined and, as time passes, the individual information associated with the measured parameter is the seed to calculate instantaneous LMC and SDL measures. To show how the methodology works, an example with economic data is presented.


Introduction
The word complexity, in its common-sense meaning, refers to systems that are difficult to describe, design or understand. However, since Kolmogorov presented the concept of computational complexity [7], new ideas have been associated with this word, mainly concerning life sciences [1], relating complexity with information [5].
As a consequence, complexity started to be related to open systems and to the emergence of unexpected behaviors due to nonlinearities [11,13] and, concerning system theory [25], a new meaning was carved, considering that complexity lies halfway between equilibrium and disequilibrium [6].
Considering this idea, López-Ruiz, Mancini and Calbet proposed, in a seminal paper [8], the LMC complexity measure, using the informational entropy [22] to evaluate equilibrium and a quadratic deviation from the uniform distribution to evaluate disequilibrium. However, the LMC measure has been criticized for being inaccurate for some classes of systems described by Markov chains and because it cannot be considered an extensive variable. The modification of the LMC measure proposed by Shiner, Davison and Landsberg is called SDL [23] and presents results qualitatively similar to those obtained by using LMC for the majority of the usual statistical distributions [15].
The main restriction to the LMC and SDL complexity measures is due to Crutchfield, Feldman and Shalizi, who argue that an equilibrium system can be structurally complex [3], but this problem could be solved by weighting order and disorder according to the specific problem to be analyzed [16].
With these ideas in mind, this article presents a systematization of the methodology used in the referred papers, based on LMC and SDL measures, to be applied to temporal series, by defining and calculating dynamical complexity measures.
The procedure, applied to a temporal series representing some organizational or dynamical aspect of a system, provides hints regarding the evolution of its complexity.
As the LMC and SDL dynamical measures are based on informational entropy [12], the first task, described in the next section, is to define an alphabet source, associating a probability distribution to the possible system states.
Following the definition of the probability distribution, a new section defines how dynamical LMC and SDL measures can be calculated at each time, based on the individual information associated with the system state at that time, generating time series for the LMC and SDL measures.
To illustrate the calculation procedure, an example related to an economic time series is presented and, in the same section, a practical discussion about how to divide the range of the values assumed by the system state is given.
The work is closed with a conclusion section, emphasizing that the calculations presented can be applied to any real-valued time series.

Defining source and probability distribution for a temporal series
Considering Shannon's model [22] for an information source, a time series $x(n)$ is considered to be a function from the integers into a real interval, i.e., $x(n) : \mathbb{Z}^+ \to (a, b)$, associating with each time $t_0 + nT$ a real number belonging to $(a, b)$, with $t_0 > 0$ being the initial instant and $T > 0$ an arbitrary period, depending on data availability.
The first step is to divide the interval $(a, b)$ into $N$ sub-intervals. For the sake of simplicity, $N$ is taken as a power of two, $N = 2^k$. At this point, it could be asked how to choose $N$, as there is a trade-off between precision (high values of $N$) and speed of calculation (low values of $N$). This question will not be addressed theoretically; however, in the example section, practical hints about the choice will be presented.
Consequently, the source's alphabet is defined by the intervals $A_1, A_2, \dots, A_N$. Then, a time interval defined by a given $n$ must be chosen and, for the time sequence $t_0, t_0 + T, \dots, t_0 + nT$, the values of the variable $x(n)$ must be read and associated with the intervals $A_i$ containing the respective values.
Therefore, for the whole set $t_0, t_0 + T, \dots, t_0 + nT$, each interval $A_i$ is associated with $x(n)$ a certain number of times $n_i$, which defines a relative frequency $p_i = n_i/(n+1)$, taken as the probability associated with each interval $A_i$.
Following the definition, for each sub-interval $A_i \subset (a, b)$, its individual contribution to the whole informational entropy is given by $S_i = -p_i \log_2 p_i$, and the maximum value of the informational entropy for the whole source, $S_{max} = \log_2 N = k$, can be calculated [22].
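The construction of the source can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the sub-intervals are taken with equal width (the text does not fix their endpoints), values on the boundaries are clamped to the nearest interval, and the function names `interval_index` and `source_statistics`, as well as the short toy series, are hypothetical and not taken from the original work.

```python
import math
from collections import Counter


def interval_index(x, a, b, n_intervals):
    """0-based index of the equal-width sub-interval of (a, b) containing x.

    Equal widths are an assumption of this sketch.
    """
    width = (b - a) / n_intervals
    i = int((x - a) / width)
    return min(max(i, 0), n_intervals - 1)  # clamp boundary values


def source_statistics(series, a, b, n_intervals):
    """Return (symbols, p, S, S_max) for a series sampled at t_0, ..., t_0 + nT."""
    symbols = [interval_index(x, a, b, n_intervals) for x in series]
    counts = Counter(symbols)
    total = len(series)  # n + 1 samples
    p = [counts.get(i, 0) / total for i in range(n_intervals)]
    # Individual contributions S_i = -p_i log2 p_i (taken as 0 for unused intervals)
    S = [-p_i * math.log2(p_i) if p_i > 0 else 0.0 for p_i in p]
    S_max = math.log2(n_intervals)
    return symbols, p, S, S_max


# Toy series (made up for illustration), N = 4 = 2^2 sub-intervals of (0, 1)
series = [0.1, 0.2, 0.6, 0.9, 0.15, 0.3, 0.7, 0.12]
symbols, p, S, S_max = source_statistics(series, 0.0, 1.0, 4)
print(p)      # [0.5, 0.125, 0.25, 0.125]
print(S_max)  # 2.0
```

With eight samples, the first interval is visited four times, so its relative frequency is 4/8 = 0.5, in agreement with $p_i = n_i/(n+1)$.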

Dynamical LMC and SDL
As the source alphabet and individual information were defined, the instantaneous values of $x(n)$ are associated with the respective $S_i$, allowing the calculation of the instantaneous value of the equilibrium (disorder) term, normalized by $S_{max}$:
$$\Delta(n) = \frac{S_i}{S_{max}}. \qquad (1)$$
Combining (1) with the different definitions of the disequilibrium (order) terms, dynamical LMC and SDL measures are defined.

LMC dynamical measure
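As indicated by López-Ruiz, Mancini and Calbet [8], the dynamic disequilibrium (order) term can be calculated as the quadratic deviation of the source probability distribution from the uniform distribution and, consequently, the individual contribution of the interval $A_i$ is:
$$D_{LMC}(n) = \left(p_i - \frac{1}{N}\right)^2.$$
Extending the definition of the LMC measure, the dynamical LMC, calculated at $t_0 + nT$, is given by the product of the equilibrium term (1), denoted $\Delta(n)$, and this disequilibrium term:
$$C_{LMC}(n) = \Delta(n)\, D_{LMC}(n).$$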
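A minimal sketch of the dynamical LMC calculation follows. It assumes that the equilibrium term is the individual entropy normalized by the maximum entropy, $S_i / S_{max}$, that the disequilibrium term is $(p_i - 1/N)^2$, and that the measure is their product; these forms are read from the verbal definitions in the text and should be checked against the original equations. Equal-width sub-intervals, the function name `dynamic_lmc` and the toy series are also assumptions made for illustration.

```python
import math
from collections import Counter


def dynamic_lmc(series, a, b, n_intervals):
    """Instantaneous LMC values, one per sample of the series.

    Assumed forms: delta = S_i / S_max, D = (p_i - 1/N)^2, C = delta * D.
    """
    width = (b - a) / n_intervals  # equal-width partition (assumption)
    symbols = [min(max(int((x - a) / width), 0), n_intervals - 1) for x in series]
    counts = Counter(symbols)
    total = len(series)
    s_max = math.log2(n_intervals)
    values = []
    for i in symbols:
        p_i = counts[i] / total                   # always > 0: interval i was visited
        delta = -p_i * math.log2(p_i) / s_max     # equilibrium (disorder) term
        d_lmc = (p_i - 1.0 / n_intervals) ** 2    # disequilibrium (order) term
        values.append(delta * d_lmc)
    return values


# Toy series (made up for illustration), N = 4 sub-intervals of (0, 1)
series = [0.1, 0.2, 0.6, 0.9, 0.15, 0.3, 0.7, 0.12]
c_lmc = dynamic_lmc(series, 0.0, 1.0, 4)
print(c_lmc)
```

In this toy case the third sample falls in an interval with $p_i = 1/N$, so its disequilibrium term, and therefore its instantaneous LMC value, is zero.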

SDL dynamical measure
As proposed by Shiner, Davison and Landsberg [23], the dynamic disequilibrium (order) term can be calculated as the complement of the dynamic equilibrium term (1), denoted $\Delta(n)$:
$$D_{SDL}(n) = 1 - \Delta(n).$$
Extending the definition of the SDL measure, the dynamical SDL, calculated at $t_0 + nT$, is given by:
$$C_{SDL}(n) = \Delta(n)\,\bigl(1 - \Delta(n)\bigr).$$
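The SDL counterpart can be sketched in the same way. The sketch assumes that the equilibrium term is $S_i / S_{max}$ and that the dynamical SDL value is the product of this term and its complement, which is how the verbal definition above is read here; the equal-width partition, the function name `dynamic_sdl` and the toy series are illustrative assumptions.

```python
import math
from collections import Counter


def dynamic_sdl(series, a, b, n_intervals):
    """Instantaneous SDL values, one per sample of the series.

    Assumed forms: delta = S_i / S_max, C = delta * (1 - delta).
    """
    width = (b - a) / n_intervals  # equal-width partition (assumption)
    symbols = [min(max(int((x - a) / width), 0), n_intervals - 1) for x in series]
    counts = Counter(symbols)
    total = len(series)
    s_max = math.log2(n_intervals)
    values = []
    for i in symbols:
        p_i = counts[i] / total
        delta = -p_i * math.log2(p_i) / s_max  # equilibrium (disorder) term
        values.append(delta * (1.0 - delta))   # disorder times its complement
    return values


# Toy series (made up for illustration), N = 4 sub-intervals of (0, 1)
series = [0.1, 0.2, 0.6, 0.9, 0.15, 0.3, 0.7, 0.12]
c_sdl = dynamic_sdl(series, 0.0, 1.0, 4)
print(c_sdl)
```

Under these assumptions, two intervals with different probabilities can share the same SDL value whenever their individual entropies coincide, as happens here for $p_i = 0.5$ and $p_i = 0.25$.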

Applying the method: practical hints
In this section, the economic series relative to the conversion of currencies studied in [12,20] is taken as an example. It can be observed that, in this case (sixteen-division case), $C_{LMC}(n)$ and $C_{SDL}(n)$ differ only by a scale factor, with the same qualitative time evolution.
Comparing Figs. 2a and 3a, which show $C_{LMC}(n)$ for different range partitions, the overall qualitative aspects of the curves are the same; increasing the number of divisions changes the dynamical range of the measure and introduces some rapid oscillatory variations, similar to noise.
Comparing Figs. 2b and 3b, which show $C_{SDL}(n)$ for different range partitions, the overall qualitative aspects of the curves are also the same, and the noisy aspect due to the increasing number of interval divisions is maintained.
Consequently, from now on, only the LMC measure will be analyzed, since SDL presents the same qualitative dynamical behavior and partition sensitivity.

Range interval partition
By increasing the number of intervals of $x(n)$ and recalculating $C_{LMC}(n)$, the result for a thirty-two partition is shown in Fig. 4a and, for a sixty-four partition, in Fig. 4b.
By analyzing the results from Figs. 2a, 3a, 4a and 4b, it can be observed that, by increasing the number of intervals, the maximum value of $C_{LMC}(n)$ decreases, improving the precision; apparently, however, for this long series the temporal evolution of $C_{LMC}(n)$ maintains its qualitative behavior, mixing noise with accuracy.
To be more precise about how the range interval partition affects the results, $C_{LMC}(n)$ is calculated for the several partitions, but considering a shorter time period for the data. The interval between July and December of 2002 is chosen because, as explained in [12], it is critical concerning the conversion rates in Brazil. It can be observed from these results that, for shorter intervals, the general qualitative characteristics of the time evolution appear independently of the partition. However, as the number of sub-intervals increases, the instantaneous numerical values change but the precision increases, allowing more accurate analysis.
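The kind of partition comparison discussed in this section can be reproduced on any series with a short script. The sketch below uses a synthetic, deterministic series (not the currency data of [12,20]), takes $(a, b)$ as the observed range of the series, and assumes equal-width sub-intervals together with the forms $S_i / S_{max}$ for the equilibrium term and $(p_i - 1/N)^2$ for the disequilibrium term; the names `dynamic_lmc` and `summary` are hypothetical.

```python
import math
from collections import Counter


def dynamic_lmc(series, n_intervals):
    """Instantaneous LMC values using the observed range of the series as (a, b)."""
    a, b = min(series), max(series)
    width = (b - a) / n_intervals  # equal-width partition (assumption)
    symbols = [min(int((x - a) / width), n_intervals - 1) for x in series]
    counts = Counter(symbols)
    total = len(series)
    s_max = math.log2(n_intervals)
    values = []
    for i in symbols:
        p_i = counts[i] / total
        delta = -p_i * math.log2(p_i) / s_max     # assumed equilibrium term
        d_lmc = (p_i - 1.0 / n_intervals) ** 2    # assumed disequilibrium term
        values.append(delta * d_lmc)
    return values


# Synthetic series, used only to exercise the procedure
series = [math.sin(0.05 * n) + 0.3 * math.sin(0.37 * n) for n in range(500)]

summary = {}
for n_intervals in (16, 32, 64):
    c = dynamic_lmc(series, n_intervals)
    summary[n_intervals] = (min(c), max(c))
    print(n_intervals, summary[n_intervals])
```

Restricting `series` to a shorter window before calling `dynamic_lmc` gives the short-interval variant of the comparison; the probabilities are then estimated from that window only.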

Conclusions
A methodology for calculating LMC and SDL dynamical complexity was developed, starting with the construction of a source and a probability distribution, for any temporal series.
It was observed that LMC and SDL measures, in spite of presenting different numerical results, have very similar temporal evolution, related to the variable x(n).
A point that is always an object of discussion is the range interval partition. The choice of the number of sub-intervals is a matter of experience.
Long time intervals are not so sensitive to the increase of the number of divisions; however, for short time intervals, increasing the number of divisions provokes a precision improvement, in spite of the introduction of an apparent noise.
Feldman and Crutchfield [4] proposed a correction for the disequilibrium term, replacing it by the relative entropy with respect to the uniform distribution. Following this line, Shiner, Davison and Landsberg proposed another modification of the LMC measure, replacing the disequilibrium term by the complement of the equilibrium term. This measure is called SDL [23].

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 11 January 2018 | doi:10.20944/preprints201801.0099.v1. © 2018 by the author(s). Distributed under a Creative Commons CC BY license.

Figure 4. Temporal evolution of LMC complexity

Figure 5. LMC temporal evolution for shorter time intervals