Signal Fluctuations and the Information Transmission Rates in Binary Communication Channels

In the nervous system, information is conveyed by sequences of action potentials (spike trains). As MacKay and McCulloch proposed, spike trains can be represented as bit sequences emitted by Information Sources. Previously, we studied the relations between the Information Transmission Rate (ITR) carried by spikes, their correlations, and their frequencies. Here, we concentrate on how spike fluctuations affect the ITR. We apply the Information Theory Method developed by Shannon and model Information Sources as stationary stochastic processes; specifically, we assume they are two-state Markov processes. As a spike-train fluctuation measure, we consider the Standard Deviation (SD), which, in fact, measures the average fluctuation of spikes around the average spike frequency. We found that the character of the relation between ITR and signal fluctuations strongly depends on the parameter s, which is the sum of the transition probabilities from the no-spike state to the spike state and vice versa. It turned out that for smaller s (s < 1) the quotient ITR/SD has a maximum and can tend to zero, depending on the transition probabilities, while for s large enough (1 < s) the quotient ITR/SD is separated from 0 for each s. Similar behavior was observed when we replaced the Shannon entropy terms in the Markov entropy formula by polynomial approximations. We also show that the quotient of ITR by the Variance behaves in a completely different way. Moreover, for large transition parameter s the Information Transmission Rate divided by SD never decreases to 0. Specifically, for 1 < s < 1.7 the ITR is always above the level of fluctuations, i.e. SD < ITR, independently of the transition probabilities that form this s. We conclude that in a noisier environment, to obtain appropriate reliability and efficiency of transmission, Information Sources with a higher tendency of transition from the no-spike state to the spike state and vice versa should be applied.


Introduction
Information transmission processes in natural environments are usually affected by signal fluctuations due to the presence of noise-generating factors. [1,2] This is especially visible in biological systems, in particular in signal processing in the brain. [3-5] The physical information carriers in the brain are small electrical currents. [6] Specifically, the information is carried by sequences of action potentials, also called spike trains. Assuming some time resolution, MacKay and McCulloch proposed a natural encoding method that associates a binary sequence to each spike train. [7] Thus, the information is represented by a sequence of bits which, from a mathematical point of view, can be treated as a trajectory of some stochastic process. [8,9] In 1948 C. Shannon developed his famous Communication Theory, in which he introduced the concept of information and its quantitative measure. [10] The occurrences of both the input symbols transmitted through a communication channel and the output symbols are described by sequences of random variables; these define stochastic processes and form Information Sources. [8,11] Following this line, the Information Transmission Rate (ITR) is applied to characterize the amount of information transmitted per symbol.
Spike-train Information Sources are often modeled as Poisson point processes. [12,13] On the other hand, it is known that such processes exhibit Markov properties. [14,15] This is because, when describing spike arrival times in these processes, primarily the current time and the time since the last spike are taken into account. [16] The dynamics of complex systems, from financial markets [17] to the neural networks of living beings, [18,19] can be characterized, mostly due to the presence of noise, by fluctuations, variations, or other statistical tools. [20] A natural measure of fluctuations should, in general, reflect oscillations around the mean value of the signal. Therefore, in most systems in physics, economics, and fluid mechanics, fluctuations are most often quantified using the Standard Deviation. [21-23] In this paper, we analyze the relationship between the Information Transmission Rate of signals coming from a time-discrete, two-state Markov Information Source and the fluctuations of these signals. As a spike-train fluctuation measure, we consider the Standard Deviation of the encoded spikes. Moreover, to gain better insight, we also analyze the case in which the ITR is referred to the signal's Variance V instead of its Standard Deviation σ.
Our previous research on the properties of neural coding shows that neural binary coding cannot be captured by straightforward correlations between input and output signals. [24] In [25,26] it was found that a key role in assessing the information sent by Markov-type Information Sources, in dependence on both the firing rate and signal correlations, is played by the jumping (transition) parameter s, which is the sum of the transition probabilities from the no-spike state to the spike state and vice versa. Here, we found that the character of the relation between ITR and signal fluctuations also strongly depends on the parameter s. It turned out that for small s (s < 1) the quotient ITR/σ has a maximum and tends to zero as the transition probability p_1|0 approaches the bounds of its admissible range, while for s large enough the quotient ITR/σ is bounded away from zero. We observed similar behavior of ITR/σ when the Shannon entropy formula was approximated by appropriate polynomials.
On the other hand, we found that when we refer the quotient ITR/σ to σ once more, i.e. when we consider, in fact, the quotient ITR/V, this quotient behaves in a completely different and less regular way. Specifically, we observed that for 1 < s there is some range of the parameter s for which ITR/V has a few local extrema, in contrast to the case of ITR/σ. The paper is organized as follows. In Section 2, we briefly recall the concepts of Shannon Information Theory (entropy, information, binary Information Sources, Information Transmission Rate) and the fluctuation measures (Standard Deviation and Root Mean Square). In Section 3 we analyze the quotients ITR/σ and ITR/V. Section 4 contains the discussion and final conclusions.

Theoretical Background and Methods
To introduce the necessary notation, we briefly recall the basic concepts of Shannon Information Theory, [8,10,11] i.e. Information, Entropy, Information Source, and Information Transmission Rate.

Shannon's Entropy and Information Transmission Rate
Let Z^L be the set of all words of length L, built of symbols (letters) from some finite alphabet Z. Each word w ∈ Z^L can be treated as an encoded message sent by an Information Source Z, being a stationary stochastic process. If P(w) denotes the probability that the word w ∈ Z^L occurs, then the information in the Shannon sense carried by this word is defined as

I(w) := -log_2 P(w).   (1)

This means that less probable events carry more information. Thus, the average information of the random variable Z^L associated with the words of length L is called the Shannon block entropy and is given by

H(Z^L) := -Σ_{w ∈ Z^L} P(w) log_2 P(w).   (2)

The appropriate measure for estimating the transmission efficiency of an Information Source Z is the information transmitted on average by a single symbol, i.e. the ITR [8,11]

ITR^(L)(Z) := (1/L) H(Z^L),   (3)

ITR(Z) := lim_{L→∞} ITR^(L)(Z).   (4)

This limit exists if and only if the stochastic process Z is stationary. [8] In the special case of a two-letter alphabet Z = {0, 1} and words of length L = 1 we introduce the notation

H_2(p) := -p log_2 p - (1 - p) log_2 (1 - p),   (5)

where P(1) = p and P(0) = 1 - p are the associated probabilities. This is, in fact, the formula for the entropy rate of a Bernoulli source. [11] The index 2 in (5) indicates that we take the logarithm with base 2, which means that the information is expressed in bits.
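As an illustration, the binary entropy (5) can be evaluated numerically. The sketch below is ours (the function name is not from the paper) and assumes the standard convention 0 · log 0 = 0.

```python
import math

def binary_entropy(p: float) -> float:
    """H_2(p) from (5): Shannon entropy, in bits, of a binary source
    emitting 1 with probability p, using the convention 0*log(0) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

print(binary_entropy(0.5))  # the fair source attains the maximum of 1 bit/symbol
```

Note that H_2 is symmetric, H_2(p) = H_2(1 - p), and vanishes at the deterministic sources p = 0 and p = 1.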

Information Sources
In general, Information Sources are modeled as stationary stochastic processes. [8,11] The information is represented by trajectories of such processes. Here, to study the relation between the Information Transmission Rate (ITR) and trajectory fluctuations, we consider Information Sources modeled as two-state Markov processes. The trajectories of these processes can be treated as encoded spike trains. [3,9,27] The commonly accepted natural encoding procedure leads to binary sequences. [9,27] Spike trains are, in fact, the main carriers of information. [3,6] As a special case among the Markov processes, we additionally consider the Bernoulli processes.

Information Sources: Markov Processes
We consider a time-discrete, two-state Markov process M, which is defined by a set of conditional probabilities p_j|i describing the transition from state i to state j, where i, j = 0, 1, and by the initial probabilities P_0(0), P_0(1). The Markov transition probability matrix P can be written as

    ( p_0|0   p_0|1 )
P = (               ),   (6)
    ( p_1|0   p_1|1 )

Each of the columns of the transition probability matrix P has to sum to 1 (i.e. it is a stochastic matrix [8]).
The time evolution of the state probabilities is governed by the Master Equation [28]

P_{n+1}(0) = p_0|0 P_n(0) + p_0|1 P_n(1),
P_{n+1}(1) = p_1|0 P_n(0) + p_1|1 P_n(1),   (7)

where n stands for time and P_n(0), P_n(1) are the probabilities of finding the states "0" and "1" at time n, respectively. The stationary solution of (7) is given by

P_eq(0) = p_0|1 / (p_1|0 + p_0|1),  P_eq(1) = p_1|0 / (p_1|0 + p_0|1).   (8)

It is known [8,11] that for the Markov process M the Information Transmission Rate, as defined by (4), is of the following form

ITR_M = H_M = P_eq(0) H(p_1|0) + P_eq(1) H(p_0|1).   (9)

In previous papers, [24-26] where we studied the relation between ITRs and firing rates and compared the ITR for Markov processes with that of the corresponding Bernoulli processes, we introduced a parameter s, which can be interpreted as the tendency of a transition from the no-spike state ("0") to the spike state ("1") and vice versa:

s := p_1|0 + p_0|1.   (10)

Combining (8) and (10), formula (9) takes the form

H_M = [(s - p_1|0) H(p_1|0) + p_1|0 H(p_0|1)] / s.   (11)

It turned out that this parameter plays an essential role in this paper as well. Note that s = 2 - tr P and 0 ≤ s ≤ 2. One can observe that a two-state Markov process is a Bernoulli process if and only if s = 1.
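The stationary distribution (8) and the Markov entropy rate (11) can be sketched in a few lines of code; the helper names below are ours, not from the paper.

```python
import math

def binary_entropy(p: float) -> float:
    """H_2(p): binary Shannon entropy in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def stationary(p10: float, p01: float):
    """Stationary distribution (8) of the two-state Markov process."""
    s = p10 + p01                      # jumping parameter (10)
    return p01 / s, p10 / s            # P_eq(0), P_eq(1)

def markov_itr(p10: float, p01: float) -> float:
    """Entropy rate (11) of the two-state Markov source, in bits per symbol."""
    peq0, peq1 = stationary(p10, p01)
    return peq0 * binary_entropy(p10) + peq1 * binary_entropy(p01)

# For s = 1 the process reduces to a Bernoulli source with p = p_1|0,
# so the Markov entropy rate must coincide with H_2(p_1|0):
print(markov_itr(0.3, 0.7), binary_entropy(0.3))
```

The printed pair illustrates the statement above: for p_1|0 + p_0|1 = 1 the two values coincide.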

Information Sources: the Bernoulli Process case
The Bernoulli processes play a special role among the Markov processes. A Bernoulli process is a stationary stochastic process Z = (Z_i), i = 1, 2, . . ., formed by binary, identically distributed, and independent random variables Z_i. In the case of the encoded spike trains, we assume that the corresponding process (to be more precise, its trajectories) successively takes the value 1 (when a spike has arrived in the bin) or 0 (when a spike has not arrived). We assume that, for a given size of the applied time bin (which depends in turn on the assumed time resolution), spike trains are encoded [29] in such a way that 1 is generated with probability p and 0 is generated with probability q, where q is equal to 1 - p. Following definition (3), the Information Transmission Rate of the Bernoulli process is

ITR(Z) = H_2(p).
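To make the encoding concrete, one can simulate a Bernoulli-encoded spike train and estimate its ITR with the plug-in estimate of H_2. The snippet is our illustrative sketch; the names and the chosen p = 0.2 are ours.

```python
import math
import random

def empirical_itr(bits) -> float:
    """Plug-in estimate of the Bernoulli entropy rate H_2(p) from a 0/1 sequence."""
    p = sum(bits) / len(bits)
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

rng = random.Random(1)
# Encoded spike train: 1 = spike arrived in the bin (prob. p), 0 = no spike.
train = [1 if rng.random() < 0.2 else 0 for _ in range(100_000)]
print(empirical_itr(train))  # close to H_2(0.2), about 0.722 bits/symbol
```

With 10^5 bins the estimate typically agrees with the theoretical value to two decimal places.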

Generalized entropy variants
The form of the entropy H was derived under the assumptions of monotonicity, joint entropy and continuity properties, and the Grouping Axiom. In the classical case of the entropy rate H_M for the Markov process, the terms H(p_1|0) and H(p_0|1) in formula (11) are clearly understood in the Shannon sense (2). To get better insight into the asymptotic behavior of the relations studied in this paper, we additionally consider formula (11) with H replaced by its Taylor approximation (10 terms). We also study the interesting case in which, instead of H, we use the famous unimodal map U(p) = 4p(1 - p), [30] which is, in fact, close to H in the supremum norm (Figure 1). [31] This idea follows the research direction related to generalized concepts of entropy developed, starting from Rényi, [32] by many authors. [33-37] Figure 1 shows the approximation of the entropy in (11) by polynomials: the unimodal map (black dashed line) and the first 10 terms of the Taylor series of H (gray dash-dotted line). The square root of the unimodal map (black dotted line) is also included in this Figure.
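The two polynomial stand-ins for H can be compared with the exact entropy on a grid. The sketch below uses our own naming and assumes the Taylor expansion of H_2 is taken around its symmetric point p = 1/2.

```python
import math

def binary_entropy(p: float) -> float:
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def unimodal(p: float) -> float:
    """Unimodal map U(p) = 4p(1-p), a rough polynomial stand-in for H_2."""
    return 4.0 * p * (1.0 - p)

def taylor_entropy(p: float, terms: int = 10) -> float:
    """First `terms` terms of the series of H_2 around p = 1/2:
    H_2(p) = 1 - (1/ln 2) * sum_{k>=1} (1-2p)^(2k) / (2k(2k-1))."""
    x = 1.0 - 2.0 * p
    partial = sum(x ** (2 * k) / (2 * k * (2 * k - 1)) for k in range(1, terms + 1))
    return 1.0 - partial / math.log(2.0)

grid = [i / 100 for i in range(1, 100)]
sup_taylor = max(abs(binary_entropy(p) - taylor_entropy(p)) for p in grid)
sup_unimodal = max(abs(binary_entropy(p) - unimodal(p)) for p in grid)
print(sup_taylor, sup_unimodal)  # the Taylor error is far smaller on this grid
```

Both approximations match H exactly at p = 1/2; the differences grow toward the ends of the interval, where the series converges slowly and U stays well below H.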

Fluctuations measure
It is commonly accepted that, for a given random variable X, the fluctuations of the values of this random variable around its average can be characterized by the Standard Deviation σ [38]

σ(X) := √( E[(X - E X)²] ),   (12)

where the symbol E denotes the average taken over the probability distribution associated with the values reached by X.
Considering a stochastic process Y = (X_k), k = 1, 2, 3, . . ., where the X_k are random variables each with the same probability distribution as X, the fluctuation of the trajectories of this process can be estimated by the Root Mean Square (RMS). For a given trajectory (x_k), k = 1, . . . , n, the RMS is defined as the square root of the arithmetic mean of the squared deviations, i.e.

RMS := √( (1/n) Σ_{k=1}^n (x_k - x̄_n)² ),   (13)

where x̄_n is the average value, i.e. x̄_n = (1/n) Σ_{k=1}^n x_k. Note that from this formula the form of σ for Markov processes can be derived by using the stationary distribution (8) in formula (12).
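The link between the trajectory-based RMS and the stationary σ obtained from (12) and (8) can be checked by simulation. This is our own sketch, with arbitrarily chosen transition probabilities.

```python
import math
import random

def simulate_markov(p10: float, p01: float, n: int, seed: int = 0):
    """Sample a 0/1 trajectory of the two-state Markov process,
    started from the stationary distribution (8)."""
    rng = random.Random(seed)
    s = p10 + p01
    state = 1 if rng.random() < p10 / s else 0
    traj = []
    for _ in range(n):
        traj.append(state)
        if state == 0:
            state = 1 if rng.random() < p10 else 0
        else:
            state = 0 if rng.random() < p01 else 1
    return traj

def rms(traj) -> float:
    """Root Mean Square of a trajectory around its empirical mean."""
    m = sum(traj) / len(traj)
    return math.sqrt(sum((x - m) ** 2 for x in traj) / len(traj))

p10, p01 = 0.2, 0.4
s = p10 + p01
sigma_stationary = math.sqrt((p01 / s) * (p10 / s))   # sqrt(P_eq(0) * P_eq(1))
sigma_empirical = rms(simulate_markov(p10, p01, 200_000))
print(sigma_stationary, sigma_empirical)  # the two values nearly coincide
```

For long trajectories the empirical RMS converges to the stationary σ, which is the limiting value used throughout the paper.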
The Standard Deviation σ of a random variable depends, in fact, not only on its probability distribution but also on the values taken by this random variable. Here we are interested in bit oscillations, i.e. whether a spike occurs or not. Thus, we limit our considerations to the values 0 and 1.
To get better insight into the relation between ITR and signal/bit fluctuations, we also include an analysis of the quotient ITR/V. This is interesting due to the specific form of the Variance for the Bernoulli process, which leads to interesting observations when, for example, the unimodal map is used to approximate the entropy (5). Moreover, when studying ITR/V we, in fact, refer the quotient ITR/σ once more to σ, since we simply have

ITR/V = (ITR/σ) · (1/σ).   (15)

Results
In this Section, we study the quotients ITR/σ and ITR/V as functions of the transition probability p_1|0 from the no-spike state "0" to the spike state "1" for a fixed parameter s (10). Note that the probability 0 < p_1|0 < 1 and the parameter 0 < s < 2 uniquely determine the transition probability matrix P (6) and, consequently, completely define the Markov process M, provided that the initial probabilities P_0(0), P_0(1) are chosen. Here, to obtain a stationary process, we must take as initial probabilities the stationary probabilities of the form (8).

Information against fluctuations for two-state Markov processes: general case
We start our considerations from the most general form of the two-state Markov process. To analyze the quotients ITR/σ and ITR/V, we first express the Standard Deviation of the Markov process M in terms of the conditional probability p_1|0 and the parameter s.

Standard Deviation in the Markov process case
For a given Markov process M, to evaluate its fluctuation, and specifically to address its long-time behavior, one considers the corresponding stationary probabilities defined by (8). Thus, in the limiting case, the Standard Deviation σ of the Markov process can be taken as

σ_M = √( P_eq(0) · P_eq(1) ).
Fixing the parameter s and expressing σ_M as a function of the conditional probability p_1|0, we arrive at the following formula:

σ_M(p_1|0) = √( p_1|0 (s - p_1|0) ) / s.   (14)

Note that in the case of the Variance we have a polynomial dependence on p_1|0 (keeping in mind that s is fixed):

V_M(p_1|0) = σ_M²(p_1|0) = p_1|0 (s - p_1|0) / s².

Relation between the Information Transmission Rate ITR of a Markov process and its Standard Deviation
Let us start by establishing the relation between the Standard Deviation and the ITR for the Bernoulli process, which in our notation means that s is equal to 1. Making use of the classical inequality ln x ≤ x - 1 (for all x > 0) and a few simple operations, one can come to the inequality

2 log_2 e ≤ ITR(p_1|0) / σ².

To find the relation between the entropy H_M and σ_M in more general cases, one can consider the quotient

Q_σ^{M,s}(p_1|0) := H_M(p_1|0) / σ_M(p_1|0).   (16)

Note that Q_σ^{M,s}(p_1|0) is a symmetric function with respect to the axis p_1|0 = s/2, i.e.

Q_σ^{M,s}(p_1|0) = Q_σ^{M,s}(s - p_1|0).   (17)
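The Bernoulli inequality above (for s = 1 we have ITR = H_2(p) and σ² = p(1 - p)) can be spot-checked numerically; the sketch below uses our own helper names.

```python
import math

def binary_entropy(p: float) -> float:
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

LOG2E = math.log2(math.e)

# Check 2*log2(e) <= H_2(p) / (p*(1-p)) on a fine grid of p values.
holds = all(
    binary_entropy(p) >= 2.0 * LOG2E * p * (1.0 - p)
    for p in (i / 1000.0 for i in range(1, 1000))
)
print(holds)  # True: the bound follows from ln x <= x - 1
```

The derivation is one line: -ln p ≥ 1 - p gives -p ln p ≥ p(1 - p), and symmetrically for 1 - p; summing and converting to base 2 yields the stated bound.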
Substituting (8), (10) and (14) into (16) we obtain

Q_σ^{H,s}(p_1|0) = [P_eq(0) H(p_1|0) + P_eq(1) H(p_0|1)] / σ_M(p_1|0),   (18)

and after simple calculations we have

Q_σ^{H,s}(p_1|0) = [(s - p_1|0) H(p_1|0) + p_1|0 H(s - p_1|0)] / √( p_1|0 (s - p_1|0) ),   (19)

where p_1|0 ∈ (0, s) for s ∈ (0, 1) (case A) and p_1|0 ∈ (s - 1, 1) for s ∈ (1, 2) (case B). One can check that in case A, i.e. for smaller s ∈ (0, 1), for a given fixed s, when p_1|0 tends to the interval bounds 0 or s, the quotient Q_σ^{H,s}(p_1|0) tends to 0, i.e.

lim_{p_1|0 → 0+} Q_σ^{H,s}(p_1|0) = lim_{p_1|0 → s-} Q_σ^{H,s}(p_1|0) = 0.   (20)

By the form of (19) and the symmetry property (17) it is clear that the quotient Q_σ^{H,s}(p_1|0) reaches its maximum at the symmetry point p_1|0 = s/2, and it is equal to

Q_σ^{H,s}(s/2) = 2 H(s/2).   (21)

One can check that in case B, i.e. for s ∈ (1, 2), for a given fixed s, when p_1|0 tends to s - 1 or to 1, the quotient Q_σ^{H,s}(p_1|0) tends to H(s-1)/√(s-1), i.e.

lim_{p_1|0 → (s-1)+} Q_σ^{H,s}(p_1|0) = lim_{p_1|0 → 1-} Q_σ^{H,s}(p_1|0) = H(s-1)/√(s-1).   (22)

Thus, for s ∈ (1, 2) the quotient is separated from zero:

Q_σ^{H,s}(p_1|0) ≥ H(s-1)/√(s-1) > 0.   (23)

The typical runs of Q_σ^{H,s}(p_1|0) for some values of the parameter s are shown in Figure 2. Column A is devoted to lower values of the jumping parameter 0 ≤ s ≤ 1, while column B presents the Q_σ^{H,s} courses for higher values of the jumping parameter 1 < s < 2. Observe that for 1 < s < 2 the curves intersect, contrary to the case 0 ≤ s ≤ 1. This is mostly because the limiting value (22) is not a monotonic function of s, while the maximal value (21) is.
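The symmetry (17) and the value of the maximum, 2H(s/2), can be verified numerically from the quotient (16); a sketch with our own naming and an arbitrarily chosen s = 0.8 in case A:

```python
import math

def binary_entropy(p: float) -> float:
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def q_sigma(p10: float, s: float) -> float:
    """Quotient ITR/sigma for the two-state Markov source, cf. (16)."""
    p01 = s - p10
    itr = ((s - p10) * binary_entropy(p10) + p10 * binary_entropy(p01)) / s
    sigma = math.sqrt(p10 * p01) / s
    return itr / sigma

s = 0.8
grid = [s * i / 1000.0 for i in range(1, 1000)]
values = [q_sigma(p, s) for p in grid]

print(max(values), 2.0 * binary_entropy(s / 2.0))  # maximum sits at p_1|0 = s/2
print(q_sigma(0.2, s), q_sigma(s - 0.2, s))        # symmetry about p_1|0 = s/2
```

On this grid the maximum is attained at the central point p_1|0 = s/2 and equals 2H(s/2), while the values fall toward 0 near both interval bounds.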
Note that, for the approximation of the entropy H by polynomials, specifically by the unimodal map U and by the Taylor series T, the corresponding quotients Q_σ^{U,s} and Q_σ^{T,s} behave similarly as for the Shannon form of H (see Figure 2).

Relation between the Information Transmission Rate ITR of a Markov process and its Variance
To find how the Variance of the trajectories of a Markov Information Source affects the Information Transmission Rate, one should now consider a modified quotient

Q_V^{M,s}(p_1|0) := H_M(p_1|0) / V_M(p_1|0).   (26)

Substituting (8) and (10) into (26) we obtain

Q_V^{H,s}(p_1|0) = s [(s - p_1|0) H(p_1|0) + p_1|0 H(s - p_1|0)] / ( p_1|0 (s - p_1|0) ).   (27)

First, observe that, as in the Standard Deviation case, we have the symmetry property around the value s/2, i.e.

Q_V^{H,s}(p_1|0) = Q_V^{H,s}(s - p_1|0).   (28)
By this symmetry it is clear that Q_V^{H,s}(p_1|0) reaches an extremum at the point p_1|0 = s/2, where it is equal to 4H(s/2). Observe that in case A, i.e. for a given fixed s ∈ (0, 1), when p_1|0 tends to the interval bounds 0 or s, the quotient Q_V^{H,s}(p_1|0), in contrast to Q_σ^{H,s}(p_1|0), tends to infinity, i.e.

lim_{p_1|0 → 0+} Q_V^{H,s}(p_1|0) = lim_{p_1|0 → s-} Q_V^{H,s}(p_1|0) = ∞.   (29)

Thus, it is clear that Q_V^{H,s}(p_1|0) reaches a minimum at the point p_1|0 = s/2. In case B, it turned out that the quotient Q_V^{H,s}(p_1|0) for any fixed s ∈ (1, 2) is bounded both from below and from above. We have

lim_{p_1|0 → (s-1)+} Q_V^{H,s}(p_1|0) = lim_{p_1|0 → 1-} Q_V^{H,s}(p_1|0) = s H(s-1)/(s-1).   (30)

Numerical calculations showed that for parameters s > s_0 the point p_1|0 = s/2 is a minimum, while for s < s_0 there is a maximum at this point; the critical parameter s_0 ≈ 1.33 can be calculated from the condition that the second derivative of Q_V^{H,s} vanishes at p_1|0 = s/2. The typical runs of Q_V^{H,s}(p_1|0) for some values of the parameter s are shown in Figure 3. Panel A (left column) is devoted to lower values of the jumping parameter 0 ≤ s ≤ 1, while panel B presents graphs of Q_V^{H,s}(p_1|0) for higher values of the jumping parameter 1 < s < 2. It turned out that approximating the entropy H by polynomials, namely by the unimodal map and by the Taylor series, leads to completely different behavior of the quotient. Note that for the approximation of H in (2) by the unimodal map, the quotient Q_V^{U,s}(p_1|0) is, for each s, a constant equal to 4s(2 - s), while for the approximation by the Taylor series (10 terms) the quotient Q_V^{T,s}(p_1|0) preserves courses similar to those for H of Shannon form.
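The constancy of the quotient under the unimodal approximation can be confirmed directly: the factor p_1|0 (s - p_1|0) cancels algebraically. A sketch with our own naming and arbitrary sample values:

```python
def unimodal(p: float) -> float:
    """U(p) = 4p(1-p), the polynomial stand-in for the entropy H."""
    return 4.0 * p * (1.0 - p)

def q_var_unimodal(p10: float, s: float) -> float:
    """ITR/Variance with H replaced by the unimodal map U."""
    p01 = s - p10
    peq0, peq1 = p01 / s, p10 / s                 # stationary distribution (8)
    itr = peq0 * unimodal(p10) + peq1 * unimodal(p01)
    return itr / (peq0 * peq1)                    # Variance = P_eq(0) * P_eq(1)

s = 1.4
for p10 in (0.45, 0.6, 0.75, 0.9):
    print(q_var_unimodal(p10, s))  # always 4*s*(2-s) = 3.36, up to rounding
```

Indeed, (s - p)·4p(1 - p) + p·4(s - p)(1 - s + p) = 4p(s - p)(2 - s), so dividing by p(s - p)/s leaves the constant 4s(2 - s).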

Discussion and Conclusions
In this paper, we study the relation between the Information Transmission Rate carried by sequences of bits and the fluctuations of these bits. The sequences come from Information Sources modeled by Markov processes. Our results show that the qualitative and quantitative character of the relation between the Information Transmission Rate and signal-bit fluctuations strongly depends on the jumping parameter s, which we introduced in our previous papers. [25,26] This parameter characterizes the tendency of the process to change state; in some sense, it describes the variability of the signals.
It turned out that, similarly as in our previous papers where we studied the relations between Information Transmission Rates, spike correlations, and spike frequencies, the critical value of s is equal to 1, which corresponds to the Bernoulli process. For small s (s < 1) the quotient ITR/σ can reach 0, while for larger s (s > 1) this quotient is always separated from 0. Specifically, for 1 < s < 1.7 the ITR is always above the level of fluctuations (i.e. σ < ITR), independently of the transition probabilities forming s. This shows the interesting fact that for s large enough the information is never completely lost, independently of the level of fluctuations.
On the other hand, for each 0 < s < 2 the quotient ITR/σ is bounded from above by 2; for each s, its maximum is attained at p_1|0 = s/2, i.e. when p_1|0 = p_0|1. This means that, when comparing ITR to σ, the most effective transmission occurs for symmetric communication channels. Note that the capacity C(s) of such a channel (a binary symmetric channel with crossover probability s/2) is equal to

C(s) = 1 - H(s/2).

It turned out that ITR/σ behaves similarly when the Shannon entropy H is approximated by polynomials, specifically by the unimodal map and its Taylor series.
For better insight, we also referred the ITR to the Variance. We observed that the behavior of ITR/V differs significantly from that of ITR/σ. For each s < 1 the quotient ITR/V can tend to infinity, and it is separated from 0. For 1 < s < 2 it is bounded from above and never reaches 0 for any s. However, it behaves in a more complex way than ITR/σ, having up to 3 local extreme points; this is visible, e.g., for s = 1.3 and s = 1.5. Moreover, approximating the Shannon entropy H by polynomials such as the unimodal map or its Taylor series leads, contrary to the case of ITR/σ, to significant qualitative differences in the behavior of ITR/V. To summarize, the results obtained show that for Markov information sources, regardless of the level of fluctuation, the Information Transmission Rate does not reduce to zero, provided that the transition parameter s is sufficiently large. This means that, to obtain more reliable communication, the spike trains should have a higher tendency of transition from the no-spike state to the spike state and vice versa.
The results are presented in the context of signal processing in the brain, since information transmission is in this case a natural and fundamental phenomenon. However, our results have, in fact, a general character and can be applied to any communication system modeled by two-state Markov processes.