On the operational utility of measures of EEG integrated information

Multichannel EEGs were obtained from healthy participants in the no-task eyes-closed condition and in the no-task eyes-open condition (where the alpha component is typically abolished). EEG dynamics in the two conditions were quantified with two related binary Lempel-Ziv measures of the first principal component and with three measures of integrated information, including the more recently proposed integrated synergy. Both integrated information and integrated synergy with model order p = 1 had greater values in the eyes closed condition. If the model order of integrated synergy was determined with the Bayesian Information Criterion, this pattern was reversed and, in common with the other measures, integrated synergy was greater in the eyes open condition. Eyes open versus eyes closed separation was quantified by calculation of the between-condition effect size. Lempel-Ziv complexity of the first principal component showed greater separation than the measures of integrated information. The performance of the integrated information measures investigated here when distinguishing between indisputably different physiological states encourages caution when advocating for their use as measures of consciousness.


Introduction
Integrated information measures of multichannel EEGs have attracted attention, in part, because it has been suggested that they might quantify consciousness [1]. It should be noted, however, that others have argued against this suggestion [2], [3]. Several variant measures of integrated information have been considered [4]. Mediano, et al. [5] have compared six candidate measures with computationally generated data. The quantification of consciousness raises several deep questions including the definition of consciousness itself. We investigate here an objective criterion of the utility of measures of multichannel EEGs: how effective are these measures in discriminating between incontestably different cognitive/physiological states?

In this study we assessed multichannel EEG measures by comparing values obtained in the no-task eyes open and no-task eyes closed conditions. One of the most consistent properties of the EEG is alpha blocking, discovered by Berger in 1924 [6] and confirmed by Adrian and Matthews in 1934 [7]. In most individuals, but certainly not all, a very prominent alpha rhythm (8-13 Hz) is observed in the eyes closed condition. Representative eyes open versus eyes closed spectra are presented in Hartoyo, et al. [8] and Liley and Muthukumaraswamy [9].

A study comparing ten measures calculated from multichannel EEGs in the eyes open and eyes closed state was published by Rapp, et al. [10]. The present study follows

Free-running, no-task, monopolar EEG signals referenced to linked earlobes were obtained in two conditions, eyes closed and eyes open, from Fz, Cz, Pz, Oz, F3, F4, C3, C4, P3 and P4 using an Electrocap. Bipolar recordings of vertical and horizontal eye movements were recorded from electrode sites above and below the right eye and from near the outer canthi of each eye. Artifact corrupted records were removed from the analyses. All EEG impedances were less than 5 kOhm. Signals were amplified, Gain = 18,000, and digitized at 1024 Hz using a twelve-bit digitizer. Continuous artifact-free records were obtained from each subject in the two conditions. Ten thousand point records were used in these calculations. As reported in the introduction, an objective of the study was to determine the effect of alpha content on the resulting dynamical measures. All signals were initially passband filtered with cutoff settings at

Five measures were used in this study. The first was constructed using Lempel-Ziv complexity [11]. Let (V^m_1, V^m_2, ..., V^m_10000) denote the mean normalized time series of the m-th channel (m = 1, ..., 10). These vectors become the columns of a 10,000 x 10 matrix A, which is decomposed as

A = V . D . U^T

where V . D . U^T is the singular value decomposition of A. The singular value decomposition was calculated using the Golub-Reinsch algorithm [12], [13]. D is the diagonal matrix of singular values, D = diag(lambda_1, lambda_2, ..., lambda_10), where we introduce the convention lambda_j >= lambda_(j+1) for all j, and U is the corresponding orthogonal transformation. For these data, the first principal component carries more than 70% of the multichannel signal's variance [14].

The first measure is constructed as follows. The first principal component was partitioned into a binary symbol sequence about the median, and the Lempel-Ziv complexity was calculated [11]; pseudocode is given in Appendix A of [14].
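The construction of the first measure can be sketched in Python (an illustrative sketch, not the authors' implementation: the Kaspar-Schuster exhaustive-history parsing is used for the Lempel-Ziv complexity, and simulated data stand in for an EEG record):

```python
import numpy as np

def lz76_complexity(s: str) -> int:
    """Number of phrases in the Kaspar-Schuster (Lempel-Ziv 1976) parsing."""
    n = len(s)
    c, l, i, k, k_max = 1, 1, 0, 1, 1
    while True:
        if s[i + k - 1] == s[l + k - 1]:
            k += 1
            if l + k > n:              # reached the end while copying
                c += 1
                break
        else:
            k_max = max(k_max, k)
            i += 1
            if i == l:                 # no earlier copy exists: new phrase
                c += 1
                l += k_max
                if l + 1 > n:
                    break
                i, k, k_max = 0, 1, 1
            else:
                k = 1
    return c

# Hypothetical 10-channel record (10,000 samples); real data would be EEG.
rng = np.random.default_rng(0)
data = rng.standard_normal((10000, 10))
A = data - data.mean(axis=0)                       # mean-normalized channels
V, D, Ut = np.linalg.svd(A, full_matrices=False)   # A = V . diag(D) . Ut
pc1 = A @ Ut[0]                                    # first principal component
symbols = ''.join('1' if x > np.median(pc1) else '0' for x in pc1)
complexity = lz76_complexity(symbols)
```

Projecting A onto the first right singular vector (`A @ Ut[0]`) yields the first principal component scaled by its singular value; the median binarization is unaffected by that scale.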

The second measure is nearly identical to the first. In this case the mean normalized time series of each channel was also normalized against the channel's standard deviation before constructing matrix A.

The third measure is one of the earliest measures of central nervous system information integration, proposed by Tononi, et al. [15]. It is constructed by comparing the degree of integration of k-dimensional subsystems with the degree of integration of the N-dimensional parent system. Corr(X^k_j) is the j-th instance of a k x k correlation matrix formed by using k of the N channels. Tononi, et al. [15] define the integration of this subsystem as

I(X^k_j) = -(1/2) ln |Corr(X^k_j)|,

the Gaussian expression for the difference between the sum of the single-channel entropies and the joint entropy. The average integration of the k-dimensional subsystems is denoted by <I(X^k_j)>. The system integration C_N, the third measure for this study, is determined by comparing the integration of the N-dimensional system, I(X^N), against the integration of subsystems of k channels.
We note that Equation 4 of Tononi, et al. [15] uses k/N as the scaling factor of I(X^N). van Putten and Stam [16] argue that (k - 1)/(N - 1) rather than k/N is the appropriate scaling factor, and it is used here:

C_N = Sum over k = 1, ..., N of [ ((k - 1)/(N - 1)) I(X^N) - <I(X^k_j)> ].

Pseudocode for C_N is given in [10]. A related measure of integration based on the Morgera covariance complexity [17] was identified in [10].
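Under the Gaussian assumption, C_N with the (k - 1)/(N - 1) scaling can be sketched directly from a correlation matrix (an illustrative sketch only; the exhaustive subset enumeration is feasible only for small N, and the pseudocode in [10] should be taken as authoritative):

```python
import numpy as np
from itertools import combinations

def integration(corr):
    """Gaussian integration I(X) = -(1/2) ln |Corr(X)| (Tononi, et al.)."""
    return -0.5 * np.linalg.slogdet(corr)[1]

def c_n(corr):
    """C_N with the (k-1)/(N-1) scaling of van Putten and Stam."""
    N = corr.shape[0]
    I_N = integration(corr)
    total = 0.0
    for k in range(1, N + 1):
        # Average integration over all k-channel subsystems.
        avg = np.mean([integration(corr[np.ix_(idx, idx)])
                       for idx in combinations(range(N), k)])
        total += (k - 1) / (N - 1) * I_N - avg
    return total

# Independent channels carry no integration, so C_N vanishes.
print(c_n(np.eye(4)))   # prints 0.0
```

The k = 1 and k = N terms vanish identically, so in practice only the intermediate subsystem sizes contribute.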

The fourth measure examined in this study is another measure of integrated information arising from Integrated Information Theory. Between psi and Phi*, psi is easier to compute for Gaussian processes. We therefore selected integrated synergy [20] for incorporation into this study. A concise mathematical description of integrated synergy is given in the Supplement of this paper.

Measure 5 is again integrated synergy. In this case, however, the model order of the underlying Gaussian autoregressive process is not fixed at p = 1 as in [5] but is determined for each multichannel data set by the Bayesian Information Criterion, BIC.
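Model-order selection by BIC can be sketched with an ordinary-least-squares VAR fit (a sketch assuming the standard form BIC(p) = ln|Sigma_p| + (ln T / T) p d^2 for a d-variate series; this is not the authors' code, and simulated data stand in for EEG):

```python
import numpy as np

def var_bic_order(X, p_max=8):
    """Select the VAR model order for a T x d series X by minimizing BIC."""
    T_full, d = X.shape
    best_p, best_bic = 1, np.inf
    for p in range(1, p_max + 1):
        # Use a common effective sample so all orders are comparable.
        Y = X[p_max:]
        T = Y.shape[0]
        Z = np.hstack([X[p_max - lag:T_full - lag] for lag in range(1, p + 1)])
        B, *_ = np.linalg.lstsq(Z, Y, rcond=None)        # OLS coefficient fit
        resid = Y - Z @ B
        Sigma = resid.T @ resid / T                       # residual covariance
        bic = np.linalg.slogdet(Sigma)[1] + np.log(T) / T * p * d * d
        if bic < best_bic:
            best_p, best_bic = p, bic
    return best_p

# A simulated VAR(1) process should be assigned order 1.
rng = np.random.default_rng(0)
X = np.zeros((2000, 3))
for t in range(1, 2000):
    X[t] = 0.5 * X[t - 1] + rng.standard_normal(3)
print(var_bic_order(X))
```

Because BIC penalizes each additional lag by (ln T / T) d^2, it is a consistent order selector for stationary VAR data of this length.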

The two measures that failed to reject the null hypothesis were C_N with alpha content removed and integrated synergy psi, model order p = 1, also for the case where the alpha band was removed.

Nonparametric correlations between measures were quantified with Kendall's tau.
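Kendall's tau for a pair of measure vectors can be sketched directly from its definition (an O(n^2) illustration of tau-a, assuming no ties; tied data would require the tau-b correction used by standard statistical libraries):

```python
from itertools import combinations

def kendall_tau(x, y):
    """Tau-a: (concordant - discordant pairs) / total pairs, assuming no ties."""
    n = len(x)
    s = 0
    for i, j in combinations(range(n), 2):
        dx, dy = x[i] - x[j], y[i] - y[j]
        s += (dx * dy > 0) - (dx * dy < 0)   # +1 concordant, -1 discordant
    return s / (n * (n - 1) // 2)

print(kendall_tau([1, 2, 3, 4], [10, 20, 30, 40]))   # perfectly concordant: 1.0
```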

The results shown in Tables 5 and 6 ... NINCDS-ADRDA criteria [25]. They found C_N higher in Alzheimer's disease as compared ...

Supplement

Let P be a particular partition of the state vector V = (V_1, ..., V_d) into r state vectors (M^(1), M^(2), ..., M^(r)). Then the integrated synergy psi_1 is defined as the decrease in the mutual information between the most-recent past and present of the system when using a partition of the most-recent past that maximizes a certain notion of non-redundant information between the partition elements and the present,

psi_1 = I(V_{t-1}; V_t) - I_union(M^(1)_{t-1}, ..., M^(r)_{t-1}; V_t),    (A1)

where the union information I_union is the complement of the intersection information V_{t-1} ^ V_t between a particular partition of V_{t-1} and V_t (Mediano, et al.). The intersection information is defined via an inclusion-exclusion principle; for a bipartition,

I_intersect(M^(1)_{t-1}, M^(2)_{t-1}; V_t) = I(M^(1)_{t-1}; V_t) + I(M^(2)_{t-1}; V_t) - I_union(M^(1)_{t-1}, M^(2)_{t-1}; V_t).    (A2)

For a linear Gaussian VAR, the union information reduces to the maximum over the mutual informations between the present state and each single element of the partition,

I_union(M^(1)_{t-1}, ..., M^(r)_{t-1}; V_t) = max over i of I(M^(i)_{t-1}; V_t),    (A3)

(Olbrich, Bertschinger, and Rauh).

As in (Mediano, et al.), we restrict our analysis to equal-sized bipartitions of the state vector. This is done to avoid bias due to differences in integrated information between different-sized partitions.
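Under the Gaussian assumptions used here, psi_1 for a given lag-1 joint covariance can be sketched as follows (an illustration, not the authors' code: mutual informations are computed from log-determinants, the union information is the Gaussian max-MI form (A3), and we assume the minimum over equal bipartitions is the quantity reported):

```python
import numpy as np
from itertools import combinations

def gaussian_mi(cov, ix, iy):
    """I(X;Y) = (1/2) ln( |Sigma_X| |Sigma_Y| / |Sigma_XY| ) for Gaussian blocks."""
    ld = lambda idx: np.linalg.slogdet(cov[np.ix_(idx, idx)])[1]
    return 0.5 * (ld(ix) + ld(iy) - ld(list(ix) + list(iy)))

def psi_1(joint_cov, d):
    """joint_cov is the 2d x 2d covariance of (V_{t-1}, V_t)."""
    past, present = list(range(d)), list(range(d, 2 * d))
    full = gaussian_mi(joint_cov, past, present)
    best = np.inf
    for half in combinations(range(d), d // 2):      # equal-sized bipartitions
        m1 = list(half)
        m2 = [i for i in range(d) if i not in half]
        union = max(gaussian_mi(joint_cov, m1, present),
                    gaussian_mi(joint_cov, m2, present))
        best = min(best, full - union)
    return best

# Two independent AR(1) channels, each with lag-1 correlation 0.5.
G0, G1 = np.eye(2), 0.5 * np.eye(2)
joint = np.block([[G0, G1], [G1.T, G0]])
print(psi_1(joint, d=2))
```

For this independent-channel example the whole-system MI is the sum of the per-channel MIs, while the union information captures only the larger half, so psi_1 is strictly positive.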

The integrated synergy (A1) implicitly assumes that the stochastic process governing the state variable is an order-1 Markov process. That is, it assumes that knowledge of the most recent past is sufficient to predict the future of the time series. There is no reason to assume this a priori, especially for time series derived from electrophysiology.

In principle, one could allow for the entire past of the time series to be necessary for optimal prediction, for example if the process is a general linear stochastic process. In this section, we define this infinite-order integrated synergy as well as an order-p compromise.

Let Z_{a:b} = (Z_a, Z_{a+1}, ..., Z_{b-1}) be the random vector containing the state vector from time points a to b - 1. Then the infinite-order integrated synergy psi_infinity is defined as

psi_infinity = I(V_{-infinity:t}; V_t) - I_union(M^(1)_{-infinity:t}, ..., M^(r)_{-infinity:t}; V_t).

In practice, with finite data available to estimate the time series model, finite orders are necessary. Truncating at a sufficiently large lag p into the past gives the order-p integrated synergy psi_p,

psi_p = I(V_{t-p:t}; V_t) - I_union(M^(1)_{t-p:t}, ..., M^(r)_{t-p:t}; V_t),

where all quantities are defined as in the order-1 case with the inclusion of additional lags.
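The order-p quantities only require stacking additional lags into the past block. A minimal sketch of the lag embedding and the resulting Gaussian estimate of I(V_{t-p:t}; V_t) (illustrative, on simulated data):

```python
import numpy as np

def lag_embed(X, p):
    """Return (past, present): past rows stack the p lags V_{t-1}, ..., V_{t-p}."""
    T, d = X.shape
    past = np.hstack([X[p - lag - 1:T - lag - 1] for lag in range(p)])
    return past, X[p:]

rng = np.random.default_rng(1)
X = rng.standard_normal((5000, 4))          # hypothetical 4-channel series
past, present = lag_embed(X, p=3)
cov = np.cov(np.hstack([past, present]), rowvar=False)
k = past.shape[1]
ld = lambda C: np.linalg.slogdet(C)[1]
mi = 0.5 * (ld(cov[:k, :k]) + ld(cov[k:, k:]) - ld(cov))   # I(V_{t-p:t}; V_t)
```

For white noise the true order-p MI is zero; the sample estimate is a small positive number reflecting estimation bias, which is why finite truncation orders must be chosen with care.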

In general, a vector stochastic process that is VAR(p) overall need not be VAR(p) in any of its components (Lütkepohl), and thus infinite orders may be necessary for