The QBIT Theory of Consciousness

The QBIT theory is an attempt to solve the problem of consciousness based on empirical evidence provided by various scientific disciplines including quantum mechanics, biology, information theory, and thermodynamics. The theory formulates the problem of consciousness in the following four questions, and provides a preliminary answer for each: Question 1: What is the nature of qualia? Answer: A quale is a superdense pack of quantum information encoded in maximally entangled pure states. Question 2: How are qualia generated? Answer: When a pack of quantum information is compressed beyond a certain threshold, a quale is generated. Question 3: Why are qualia subjective? Answer: A quale is subjective because a pack of information encoded in maximally entangled pure states is essentially private and unshareable. Question 4: Why does a quale have a particular meaning? Answer: A pack of information within a cognitive system gradually obtains a particular meaning as it undergoes a progressive process of interpretation performed by an internal model installed in the system. This paper introduces the QBIT theory of consciousness and explains its basic assumptions and conjectures.


Introduction
The problem of consciousness is one of the most difficult problems in biology, which has remained unresolved despite several decades of scientific research. The hard core of the problem of consciousness is in fact the problem of qualia.
Qualia (the plural of quale) refer to subjective conscious experiences such as a red color, a sharp pain, a particular smell, or a specific taste. As an example, when we see a red flower, the redness that we experience is a quale. Our consciousness at any moment consists of several different qualia. In fact, "to be conscious" means "to have qualia", and unconscious perception means "qualia-less perception".
To resolve the problem of consciousness, empirical evidence alone is not sufficient; we also need an appropriate theory to select and put together diverse (and sometimes seemingly unrelated) empirical evidence to reveal a hidden pattern. In this context, the QBIT theory is an attempt toward solving the puzzle of consciousness with pieces of evidence collected from different scientific disciplines including quantum mechanics, biology, information theory, and thermodynamics.
The QBIT theory formulates the problem of consciousness in the following four questions: (1) What is the nature of qualia? (2) How are qualia generated? (3) Why are qualia subjective? (4) Why does a quale have a particular meaning?
In sum, the QBIT theory is based on the following assumptions and conjectures: (1) Consciousness requires Maxwell demon-assisted quantum computation. (2) When information-theoretic certainty within a cognitive system about an external stimulus exceeds a particular level, the system becomes conscious of that stimulus. (3) A quale is a superdense pack of quantum information encoded in maximally entangled pure states. (4) When a pack of quantum information is compressed beyond a certain threshold, a quale is generated. (5) A quale is subjective because a pack of information encoded in maximally entangled pure states is essentially private and unshareable. (6) A pack of information within a cognitive system gradually obtains a particular meaning as it undergoes a progressive process of interpretation performed by an internal model installed in the system.

Consciousness Requires Computation
As Dehaene et al. (2017) nicely argue, although centuries of philosophical dualism have led us to consider consciousness as irreducible to physical interactions, scientific evidence is compatible with the proposition that consciousness arises from nothing more than a particular type of computation. But what is computation, and what kind of computation is required for consciousness? In cognitive science, computation could be regarded as the transformation of one internal representation into another (Sanger 2003; Eliasmith 2010). Here, "internal representation" is defined as a pack of information that stands in for an external stimulus (Clark 1997; Ward and Ward 2009). As Pennartz (2018) argues, it is widely accepted in neuroscience and cognitive science that consciousness requires the formation and transformation of internal representations by the nervous system. In the next section, I will explain how computation could give rise to consciousness.
How Does the Brain Generate Qualia?
To explain how brain computations could give rise to consciousness, I use an oversimplified model of sensory processing. Obviously, the brain operates in a much more complex manner than what is depicted in this model. However, this oversimplified model captures the essence of sensory processing by the brain, and clearly explains the basic idea underlying the QBIT theory of consciousness.
A sensory system contains a hierarchy of computational nodes. At the lowest level of this hierarchy, there is a sensory receptor (node 1 or N1) that converts the energy of an external stimulus into a pack of information. This pack of information is the lowest-level internal representation (representation 1 or R1) that the system creates to represent the stimulus. This internal representation is transmitted up the hierarchy to the next computational node (N2), where the representation undergoes a series of computational operations and, as a consequence, transforms into a higher-level representation (R2). This representation is then transmitted up the hierarchy to the next node (N3), where it is transformed into a representation (R3) that has a higher status than the previous one. This progressive transformation of representations continues until the highest-level internal representation is created at the top of the hierarchy.
Each computational node (for example, N3) receives at least two packs of information: a bottom-up input which is the representation sent forward from the preceding node (N2), and a top-down input which is sent backward from a higher-level computational node (for example, N4). The N3 integrates these packs of information to form a new representation. This new pack of information is compressed by N3, and the compressed representation is then transmitted to N4 for another round of "integration and compression". In the terminology of the QBIT theory, this hierarchical consecutive transformation of representations is called "representation distillation".
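The "integration and compression" scheme just described can be sketched in a few lines of code. This is purely illustrative: the random projections, vector sizes, and placeholder top-down inputs are assumptions of the example, not part of the theory.

```python
import numpy as np

rng = np.random.default_rng(0)

def integrate_and_compress(bottom_up, top_down, out_dim, rng):
    """One node: integrate the bottom-up and top-down inputs, then
    compress the result into a lower-dimensional representation
    (here, via a random linear projection)."""
    integrated = np.concatenate([bottom_up, top_down])
    projection = rng.standard_normal((out_dim, integrated.size))
    return projection @ integrated

# R1: the lowest-level representation produced by the receptor (N1).
r = rng.standard_normal(16)

# Each subsequent node outputs a smaller ("more compressed") vector.
for out_dim in [12, 8, 4]:
    top_down = rng.standard_normal(r.size)  # placeholder prediction
    r = integrate_and_compress(r, top_down, out_dim, rng)

print(r.size)  # 4 — the most compressed, highest-level representation
```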
The whole computation performed by each node is somewhat similar to what is known as "local operations and classical communication" or LOCC. In quantum information theory, LOCC is a method of information processing in which a local operation is performed in a node of a system, and then the result of that operation is communicated classically to another node where another local operation is performed conditioned on the information received. LOCC, and its relation to the QBIT theory, will be discussed later in this paper. Now, let's return to the oversimplified model.
As a representation ascends the hierarchy, its mutual information with the external stimulus that it represents is increased. Mutual information is in a sense the converse of entropy (Little and Sommer 2013). Therefore, the representation generated by the sensory receptor (i.e. R1) has minimal mutual information and maximum entropy, while the representation generated at the top of the hierarchy has maximal mutual information and the least entropy. An idea similar to this has been recently proposed by Gupta and Bahmer (2019). They argue that an increase in mutual information occurs as sensory information is processed successively from lower to higher levels in a cortical hierarchy. They suggest that this gradual increase in mutual information contributes to perception.
Mutual information between two variables (X and Y) is the average reduction in uncertainty about X that results from knowing the value of Y. In the oversimplified model discussed here, X is a pack of energy (i.e. an external stimulus) and Y is a pack of information (i.e. an internal representation). An increase in mutual information is equivalent to an increase in certainty of the sensory system about the external stimulus. In this sense, the QBIT theory suggests that when certainty of a system about an external stimulus exceeds a particular level, the system becomes conscious of that stimulus. To attain such a high level of certainty, quantum information is required. As Hayden (2005) nicely mentions, "with quantum information, it is possible not just to be certain, but to be more than certain." This wonderful effect of quantum information inspires the idea that in order to become conscious, we need to go beyond the limits of classical physics. Consciousness requires quantum phenomena, including entanglement and coherence. These quantum phenomena and their role in the emergence of consciousness will be explained later in this paper. Let's turn back again to the oversimplified model.
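The definition of mutual information given above can be made concrete with a small numerical sketch (the joint distributions are invented for the example): when the representation perfectly tracks the stimulus, knowing Y removes all uncertainty about X; when they are independent, it removes none.

```python
import numpy as np

def mutual_information(joint):
    """I(X;Y) in bits, computed from a joint probability table p(x, y)."""
    joint = np.asarray(joint, dtype=float)
    px = joint.sum(axis=1, keepdims=True)  # marginal p(x)
    py = joint.sum(axis=0, keepdims=True)  # marginal p(y)
    nz = joint > 0
    return float((joint[nz] * np.log2(joint[nz] / (px @ py)[nz])).sum())

# Perfectly correlated stimulus and representation:
# I(X;Y) = H(X) = 1 bit (maximal certainty about the stimulus).
perfect = [[0.5, 0.0], [0.0, 0.5]]
# Independent: the representation tells us nothing, I(X;Y) = 0.
independent = [[0.25, 0.25], [0.25, 0.25]]

print(mutual_information(perfect))      # 1.0
print(mutual_information(independent))  # 0.0
```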
Computations performed at each stage of the hierarchy could be regarded as a kind of "interpretation" that gives a particular meaning to the representation before being sent to the next stage. As a representation ascends the hierarchy, it becomes not only more compressed but also simpler and more meaningful for the system. When the representation becomes compressed beyond a certain level, it transforms into a quale. Therefore, a quale is the most compressed, the simplest, and the most meaningful representation sitting at the top of a hierarchy of internal representations for an external stimulus.
Is there any scientific evidence in support of this oversimplified model of sensory processing? Yes. To some extent, literature on "predictive coding", "the simplicity principle", "Bayesian inference", and "the free-energy principle" supports this model. These are discussed briefly in the following sections.

Predictive Coding
Predictive coding was first developed as a data compression strategy in signal processing (Clark 2013). It is an encoding strategy by which only unpredicted elements of a signal are transmitted to the next stage for further information processing (Williams 2018). In fact, predictive coding compresses a signal (or a representation) by removing the predictable, and hence redundant, elements of that signal (Rao and Ballard 1999).
In a hierarchical model of predictive coding, as described by Rajesh Rao and Dana Ballard (1999), a pack of sensory information in a computational node (for example, the primary visual cortex or V1) is compared against a prediction received from a higher-level computational node (for example, V2). As a result of this comparison, deviations from such predictions (called prediction errors) are identified, and only these elements are fed forward to the next computational node. In this context, the prediction error is the difference between a pack of sensory information and a higher-level prediction that both enter a computational node.
In predictive coding, feedback and feedforward connections allow the serial, reciprocal exchange of predictions and prediction errors (Shipp 2016). Signals (or packs of information) descending the hierarchy via backward connections (i.e. top-down inputs) contain predictions, while signals ascending the hierarchy via forward connections (i.e. bottom-up inputs) contain prediction errors.
In general, a computational node at any given stage attempts to predict the representation (or the pack of information) generated at the stage below. Furthermore, the same computational node also attempts to improve (or update) the representation at the stage above by reporting its errors of prediction (Shipp 2016). As a representation ascends this hierarchy, its errors are gradually minimized. The representation generated at the top of the hierarchy has the fewest prediction errors, and hence is the most accurate prediction that a sensory system has about the associated external stimulus. Little and Sommer (2013) argue that the predictive accuracy of an internal representation could be measured by its mutual information with the sensory input. In this context, mutual information is the amount of information an internal representation contains regarding the associated sensory input. On the basis of these arguments, the QBIT theory suggests that a quale is an internal representation generated at the top of the hierarchy of predictive coding. Therefore, a quale is the most accurate representation, with the fewest prediction errors and maximal mutual information.
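The reciprocal exchange of top-down predictions and bottom-up prediction errors can be sketched in a toy simulation. This is a deliberate oversimplification: the update rule, learning rate, and identity mappings between levels are assumptions of the example, not of the cited models.

```python
import numpy as np

def predictive_coding_step(signal, predictions, lr=0.5):
    """One exchange of predictions (top-down) and errors (bottom-up).
    Each level compares its bottom-up input with the prediction from
    above and forwards only the residual (the prediction error)."""
    errors = []
    bottom_up = signal
    for level, pred in enumerate(predictions):
        err = bottom_up - pred                # prediction error here
        predictions[level] = pred + lr * err  # update toward the input
        errors.append(err)
        bottom_up = predictions[level]        # pass the estimate upward
    return errors

rng = np.random.default_rng(1)
stimulus = rng.standard_normal(8)
preds = [np.zeros(8) for _ in range(3)]  # initial top-down predictions

for _ in range(50):
    errs = predictive_coding_step(stimulus, preds)

# After repeated exchanges, errors shrink at every level and the
# top of the hierarchy holds an accurate estimate of the stimulus.
print(np.abs(errs[-1]).max() < 1e-3)  # True
```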

The Simplicity Principle
The simplicity principle is a powerful unifying principle in cognitive science, capable of explaining a wide range of phenomena including perception as well as learning (Chater and Vitanyi 2003). The simplicity principle states that a primary goal of sensory processing is to create the simplest possible internal representations of external stimuli (Chater 1999). The tendency of a cognitive system to create the simplest possible representations is due to the fact that the simplest representations allow the most accurate predictions and provide the best basis for decision-making, both necessary for survival in a challenging environment.
To create the simplest possible representations, a cognitive system should be endowed with the capacity to compress information. There is a variety of techniques for information compression that a cognitive system (such as the brain) can exploit to maximize simplicity of its internal representations. One of these techniques is the "matching and unification of patterns" as described by Wolff (2016). This kind of information compression is accomplished through a series of computational operations that search a pack of information to find patterns that match each other, and then merge or unify them so that multiple configurations of the same pattern are reduced to one. Wolff (2019) argues that compressing a representation (or a pack of information) via the matching and unification of patterns increases both the simplicity and the explanatory power of that representation. He suggests that this kind of information compression via the matching and unification of patterns is an essential part of perception, cognition, and learning in the human brain.
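The "matching and unification of patterns" can be illustrated with a minimal sketch (a toy dictionary scheme of my own for illustration, not Wolff's actual SP system): matching patterns are merged so that multiple occurrences are reduced to one stored entry plus references.

```python
def unify_patterns(symbols):
    """Merge repeated patterns: each distinct symbol is stored once,
    and every occurrence is reduced to a reference to that entry."""
    table, refs = [], []
    for s in symbols:
        if s not in table:
            table.append(s)          # store each distinct pattern once
        refs.append(table.index(s))  # refer to the unified pattern
    return table, refs

message = ["red", "flower", "red", "flower", "red", "sky"]
table, refs = unify_patterns(message)
print(table)  # ['red', 'flower', 'sky'] — each pattern stored once
print(refs)   # [0, 1, 0, 1, 0, 2] — the original sequence, compressed
```

The original message is recoverable from the table and references, so the compression is lossless while the redundant repetitions have been unified.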
The simplicity principle is closely connected to the concept of "Bayesian inference" (Chater and Vitanyi 2003; Pothos 2007; Chater et al. 2010). Feldman (2016) argues that, in cognitive science, complexity minimization and Bayesian inference are regarded as profoundly intertwined, if not practically the same thing. It is noteworthy that Bayesian inference has a built-in tendency towards representations with fewer parameters (i.e. simpler or lower-dimensional representations) over those with more (Feldman 2009). But what is Bayesian inference, and how is it related to the QBIT theory of consciousness?

Bayesian Inference
Bayesian inference is a statistical method of reasoning in which information already available in a system (i.e. prior knowledge) together with new evidence (i.e. incoming information) are used to generate, test, and update a hypothesis (or a belief) about the hidden causes of an event. Bayesian inference can be realized using a variety of strategies, one of which is the hierarchical predictive coding (Aitchison and Lengyel 2017). Both predictive coding and Bayesian inference agree upon the importance of integrating external inputs with internal signals (i.e. predictions, priors, or hypotheses) (Aitchison and Lengyel 2017).
Predictive coding could be regarded as a kind of hierarchical Bayesian inference, in which, top-down predictions play the role of "empirical priors" (Friston 2013). However, at the top of the hierarchy, there is no top-down prediction, and expectations become "full priors". These expectations are usually associated with instincts and prior beliefs that are selected by evolution as necessary for survival (Friston 2013).
Experimental evidence shows that the visual system uses a hierarchical Bayesian inference to interpret sensory information (Lee and Mumford 2003). This is apparently not restricted just to visual perception. In general, perception could be considered as a kind of hierarchical inference or successive rounds of hypothesis testing and updating (Gregory 1980;Friston et al. 2012).
Hierarchical Bayesian inference gradually minimizes uncertainty in a series of hypotheses about an event. This is achieved by accumulating (or maximizing) Bayesian evidence toward the top of the hierarchy. The QBIT theory suggests that when accumulation of Bayesian evidence (and hence certainty) within a cognitive system about an external stimulus exceeds a particular level, the system becomes conscious of that stimulus. This occurs at the top of the hierarchy of Bayesian inference. In fact, a quale is regarded as a hypothesis (about the hidden cause of a sensory input) for which the system has accumulated the greatest amount of Bayesian evidence.
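The accumulation of Bayesian evidence described above can be sketched numerically (the two hypotheses and their likelihoods are invented for the example): repeated rounds of Bayes' rule drive the belief in the best-supported hypothesis toward certainty.

```python
import numpy as np

def bayes_update(prior, likelihood):
    """One round of hypothesis testing: combine the prior belief with
    the likelihood of new evidence and renormalize (Bayes' rule)."""
    posterior = prior * likelihood
    return posterior / posterior.sum()

# Two hypotheses about the hidden cause of a sensory input.
belief = np.array([0.5, 0.5])

# Likelihood of each successive observation under each hypothesis:
# the evidence consistently favours hypothesis 0.
likelihoods = [np.array([0.8, 0.4])] * 5

for lik in likelihoods:
    belief = bayes_update(belief, lik)

print(belief[0] > 0.95)  # True — certainty about hypothesis 0 grows
```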
Bayesian inference, predictive coding, and the simplicity principle could be regarded as different manifestations of a more fundamental principle called the "free-energy principle". In the next section, this unifying principle and its relation to the QBIT theory are discussed.

Free-Energy Principle
The free-energy principle states that any self-organizing system (such as a biological organism) that is able to resist decay and maintain its integrity over time must constantly minimize its internal entropy by minimizing its variational free energy (Friston 2010). In this context, variational free energy is an information-theoretic analogue of the thermodynamic free energy, and entropy is the long-term average of surprisal (or uncertainty) (Kirchhoff et al. 2018; Ramstead et al. 2018). Therefore, minimizing free energy is equivalent to reducing entropy and uncertainty (Kirchhoff and Froese 2017). Shannon entropy (also called uncertainty) quantifies how much is not known about something (Adami 2016). In other words, entropy is a measure of the amount of information needed to eliminate all uncertainty about a variable (Borst and Theunissen 1999).
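Shannon entropy, as used here, is directly computable; a brief numerical illustration:

```python
import numpy as np

def shannon_entropy(p):
    """H(X) = -sum p log2 p, in bits: the average uncertainty about X."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]  # terms with p = 0 contribute nothing
    return float(-(p * np.log2(p)).sum())

print(shannon_entropy([0.5, 0.5]))  # 1.0 bit — maximal uncertainty
print(shannon_entropy([1.0, 0.0]))  # 0.0 bits — no uncertainty at all
print(shannon_entropy([0.25] * 4))  # 2.0 bits — four equal possibilities
```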
According to the free-energy principle, all biological organisms are forced to generate internal models of their environments (Badcock et al. 2019). They must create hierarchical generative models of the world in order to become capable of minimizing their free energy, and consequently minimizing their internal entropy (Ramstead et al. 2018). Minimizing free energy is roughly equivalent to maximizing the evidence for a model (Badcock et al. 2019). Therefore, an organism must constantly maximize evidence for its generative models of the world through Bayesian inference and active sampling of sensory information (Kanai et al. 2015).
The free-energy principle proposes that adaptive fitness of an organism corresponds to minimization of sensory uncertainty, which is the average of surprisal (Kim 2018). According to this principle, when an organism is stimulated through its sensory receptors, it instantly (and automatically) initiates an attempt to minimize sensory surprisal by means of active inference (Kim 2018).
Variational free energy is roughly equivalent to prediction error (Friston 2013). Therefore, minimizing free energy increases the accuracy of predictions of a cognitive system. Furthermore, minimizing free energy reduces the complexity of accurate predictions (Friston et al. 2016). In fact, free energy can be expressed as complexity minus accuracy (Feldman and Friston 2010). Therefore, minimizing free energy corresponds to minimizing complexity while maximizing accuracy (Friston 2012). Here, "complexity" is used to imply the amount of statistical regularity, and not the amount of information, within a representation (Adami 2002). Statistical regularity is a kind of redundancy (Barlow 1974). Any regular or predictable element of a representation reduces its simplicity. In fact, the degree of simplicity of a representation is inversely related to the amount of statistical regularity it contains (Barlow 1974). Therefore, minimizing the free energy of a representation gives rise to redundancy reduction and hence compression of the representation.
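The decomposition of free energy into complexity minus accuracy can be illustrated for a discrete hypothesis space (a toy example: the prior and likelihoods are invented, and complexity is taken as the KL divergence of the candidate belief from the prior).

```python
import numpy as np

def free_energy(q, prior, log_lik):
    """Variational free energy for a discrete hypothesis space:
    F = complexity - accuracy, where complexity = KL(q || prior)
    and accuracy = E_q[log p(data | h)]."""
    complexity = float((q * np.log(q / prior)).sum())
    accuracy = float((q * log_lik).sum())
    return complexity - accuracy

prior = np.array([0.5, 0.5])
log_lik = np.log(np.array([0.8, 0.2]))  # the data favour hypothesis 0

# The exact Bayesian posterior minimizes F over all candidate beliefs.
posterior = prior * np.exp(log_lik)
posterior /= posterior.sum()

candidates = [np.array([0.5, 0.5]), np.array([0.7, 0.3]), posterior]
values = [free_energy(q, prior, log_lik) for q in candidates]
print(min(values) == values[-1])  # True — the posterior gives the lowest F
```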
A cognitive system (such as the brain) could minimize its variational free energy by recurrent information passing through a hierarchy of computational nodes, so that each node minimizes uncertainty in the incoming information by receiving a prediction (or a prior) and responding to errors in that prediction (Fotopoulou 2013). In fact, the brain attempts to reduce the probability of being surprised by an external stimulus by reducing errors in its representations of that stimulus (Kirchhoff and Froese 2017).
On the basis of all these arguments, it is plausible to suggest that, in the brain, the overall drive of the free-energy principle is to (1) create an internal model of the external world, (2) maximize Bayesian evidence for that model, (3) reduce uncertainty in internal representations, (4) increase the accuracy of internal representations, (5) maximize simplicity of internal representations, and (6) make the internal representations more compressed. Hobson et al. (2014) argue that when the brain reduces complexity, it also reduces its thermodynamic free energy, and hence reduces the work needed to attain that state. In fact, a brain state with minimum complexity is also the state with minimum thermodynamic free energy. In other words, a maximally simple brain state is in an energetic minimum. The QBIT theory of consciousness suggests that a conscious state corresponds to a state with the minimum possible variational free energy. Therefore, in a hierarchy of internal representations, a quale is the representation that is in an energetic minimum.

Consciousness and Meaning
Information on its own has no intrinsic meaning. It is "interpretation" that adds a meaning to information. The same pack of information can have different meanings, depending on how it is interpreted by a system (Orpwood 2007). The QBIT theory suggests that, in a cognitive system, what interprets a pack of information and assigns a particular meaning to it is an internal model that has been installed in the system. This internal model is in fact a hierarchical generative model. A pack of information (i.e. an internal representation) undergoes interpretation at each stage of this hierarchy, and thus gradually becomes more meaningful as it ascends toward the top of the hierarchy. Consistent with this conjecture, Tschechne and Neumann (2014) argue that computations in early and intermediate stages of visual hierarchy transform local representations into more meaningful representations of contours, shapes and surfaces.
For each quale that the brain can generate, there is a specific internal model installed (or encoded) in the brain. To generate a quale, its associated internal model should be activated. However, bottom-up activation of an internal model does not necessarily give rise to generation of a quale, unless its activation is strong enough to reach the top of the hierarchy. If not, activation of the internal model results in a quale-less (or unconscious) perception. Even in the absence of consciousness and sensory inputs, internal models can guide the behavior of a system (Marstaller et al. 2013).
Internal models are created as a cognitive system observes and interacts with its environment for a long enough time. In other words, internal models gradually form as the system repeatedly acts on the environment through its actuators and receives feedback through its sensors (Marstaller et al. 2013). When the environment or the tasks that should be performed to survive in the environment are complex enough, the cognitive system reacts to this challenge by developing internal models (Marstaller et al. 2013). Expectations and needs are two factors that shape internal models of an animal. As animals evolve to behave appropriately and survive in a dynamic environment, internal models of the environment emerge within their nervous systems. Internal models are hierarchical, nonlinear and dynamic. They could be shaped by learning, and become updated during the lifetime of an animal or over the course of evolution (Marstaller et al. 2013).

Information Compression
The QBIT theory assumes that the key to solving the problem of consciousness is the concept of "information compression". Sometimes the phenomenon of consciousness appears so enigmatic that one cannot stop thinking that the emergence of consciousness requires something like magic. According to the QBIT theory, if the emergence of consciousness actually requires magic, this magic is performed by information compression. In nature, we have a good example of the magic of compression: extreme compression of matter creates an enigmatic entity, called the "black hole". Likewise, extreme compression of information might create another enigmatic entity, a quale. The QBIT theory suggests that packing too much quantum information into a small space causes something like a gravitational collapse, giving rise to the creation of a quale. Roughly similar to a black hole (which is a superdense pack of matter), the QBIT theory considers a quale as a superdense pack of quantum information.
The QBIT theory assumes that, for extreme information compression, quantum phenomena (such as entanglement and coherence) are required. Classical physics cannot perform the magic. Consistent with this assumption, it has been shown that entangled quantum states can be compressed much more than what is possible via classical lossless compression (Reif and Chakraborty 2007). Furthermore, quantum entanglement is the most important resource for superdense coding (Bruß et al. 2004).
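Superdense coding, mentioned above, can be sketched with a small linear-algebra simulation of the standard textbook construction (simplified here to bare state vectors): by applying one of four local Pauli operations to a single shared entangled qubit, a sender encodes two classical bits, and the four resulting states are mutually orthogonal, so both bits are recoverable from one joint measurement.

```python
import numpy as np

# Shared resource: the Bell state |Φ+> = (|00> + |11>)/√2.
phi_plus = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)

I = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])   # Pauli-X
Z = np.array([[1.0, 0.0], [0.0, -1.0]])  # Pauli-Z

# The sender encodes two classical bits with one of four local
# operations applied to her single qubit (the first of the pair).
encodings = {"00": I, "01": X, "10": Z, "11": X @ Z}
states = {bits: np.kron(op, I) @ phi_plus for bits, op in encodings.items()}

# The four encoded states form the orthonormal Bell basis, so the
# receiver can decode both bits with a single joint measurement.
gram = np.array([[abs(states[a] @ states[b]) for b in states]
                 for a in states])
print(np.allclose(gram, np.eye(4)))  # True — mutually orthogonal states
```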
A benefit of information compression is a decrease in computation requirements by a factor equal to the compression ratio. Such a decrease might be important for systems in which computing power is limited or too expensive (Bar-Shalom 1972). In fact, any system with limited resources that is located in a challenging environment and must solve complex problems needs to compress information (Kipper 2019).
Information available to our sensory receptors is highly redundant. Information compression via the reduction of redundancy appears to be a major goal of computation in the earliest stages of sensory systems (Becker 1996). For example, lateral inhibition in the retina could be viewed as a process of removing local correlations in retinal input, thus providing a less redundant and hence more compressed representation of that input (Chater 1999). Therefore, a main goal of computation in the retina is to transform the visual input into a statistically independent form as a first step in creating a compressed representation in the cerebral cortex (Atick and Redlich 1992;Olshausen and Field 1996). A computational benefit of information compression for the cerebral cortex is that the transfer and utilization of a huge amount of sensory information would become much easier and less costly. Furthermore, information compression causes a significant reduction in the amount of memory required to store a pack of information.
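The decorrelating effect of lateral inhibition can be sketched with a one-dimensional toy signal (the random-walk input and the center-surround filter are illustrative assumptions, not a retinal model): subtracting each unit's neighbourhood average removes the local correlations and leaves a much smaller residual to transmit.

```python
import numpy as np

rng = np.random.default_rng(0)

# A smooth, highly redundant 1-D "retinal" input: neighbouring
# receptors report strongly correlated values (a random walk).
signal = np.cumsum(rng.standard_normal(200)) / 10

# Lateral inhibition sketched as a center-surround operation: each
# unit subtracts the mean of its two neighbours, removing local
# correlations from the representation.
inhibited = signal[1:-1] - 0.5 * (signal[:-2] + signal[2:])

# The decorrelated residual carries the input in far fewer "active"
# units, i.e. in a less redundant, more compressed form.
print(np.var(inhibited) < np.var(signal))  # True
```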
As Wolff (2019) argues, compressing a pack of information could be considered as a process of reducing informational redundancy and consequently increasing its simplicity, while retaining as much as possible of its non-redundant predictive information. In fact, compression of information is a cognitive ability that allows predicting the future from the past and estimating probabilities. By this cognitive ability, an animal, for example, can predict where food may be found or where there may be dangers. The better and more efficiently an organism can compress information, the more accurate its predictions will be (Vitanyi and Li 2000). All successful predictive systems, including the human brain, could be regarded as approximations of an ideal information compressor (Maguire et al. 2016).
Based on this evidence and these arguments, it seems plausible to suggest that information compression is an important part of cognition (Chater and Vitanyi 2003; Wolff 2016). The QBIT theory suggests that it is also an important and necessary part of consciousness. The idea that consciousness requires information compression is not new. Maguire et al. (2016) as well as Ruffini (2017) have proposed similar ideas previously. Maguire et al. (2016) propose that consciousness can be understood in terms of "data compression", a well-defined concept from computer science that acknowledges and formalizes the limits of objective representation. They suggest that information compression occurs when information is bound (or integrated) together through the identification of shared patterns in a pack of information. Maguire and his colleagues further argue that data compression is not just something that happens when a pack of information is reduced in size. Due to its connection to induction and prediction, information compression can be considered as a process that provides reliable proof of (or evidence for) understanding or comprehension. The higher the level of compression achieved by a system, the better the predictions of the system will be, and the greater the extent to which it can be said that the system has understood the information. This is very similar to the idea proposed by Chaitin (2006) that "compression is comprehension". Ruffini (2017) proposed that consciousness is possible only in computing systems that are capable of creating compressed representations of the external world. He argues that the brain is a model builder and a compressor of information. Ruffini suggests that the brain builds a compressive model and uses it to perform information compression with simplicity as a guiding principle.

Consciousness Requires Quantum Phenomena
The QBIT theory suggests that consciousness requires quantum phenomena, including entanglement and coherence. Quantum entanglement and coherence are real physical resources which are indispensable for certain computational tasks that cannot be performed using classical resources such as energy (Maruyama et al. 2005; Streltsov et al. 2017). However, these resources are very fragile at physiologic temperatures as compared to cryogenic temperatures, since environmental noise increases with temperature, resulting in rapid decoherence and loss of useful entanglement (Marais et al. 2018). In fact, decoherence is a common obstacle for all phenomena that depend on the capacity to preserve and use quantum coherence and entanglement (Viola et al. 1999).
Although coherence and entanglement are so fragile at physiologic temperatures, there is strong evidence that these two quantum phenomena play important roles in certain biological processes, including photosynthesis in plants and bacteria as well as magnetoreception in birds (Engel et al. 2007; Gauger et al. 2011; Lambert et al. 2013; Brookes 2017). Furthermore, there is a growing body of literature in support of the idea that entanglement and coherence are also involved in some aspects of cognition (Wang et al. 2013; Hameroff 2014; Busemeyer and Wang 2015; Surov et al. 2019).
The QBIT theory suggests that quantum entanglement and coherence play an essential role in consciousness. This idea is also the basis of the Orchestrated Objective Reduction (Orch OR) theory of consciousness, developed by Stuart Hameroff and Roger Penrose (2014). The Orch OR theory suggests that, for consciousness to occur in a system, a sufficient amount of material (e.g., microtubules) must be kept in a coherent (or pure) state for a long enough time.
In quantum mechanics, every system has a set of states. A state is called pure if it contains maximal information about the system (Atmanspacher et al. 2002). The Orch OR theory suggests that different states of a tubulin represent information in the brain. The theory considers tubulin bits (and quantum bits, or qubits) as entangled coherent states. These coherent (or pure) states of microtubules in one neuron can extend by entanglement to microtubules in adjacent neurons, potentially extending to brain-wide syncytia (Hameroff and Penrose 2014). In line with this theory, evidence shows that long-lived quantum coherence is possible in microtubules as well as in some other molecules within the brain at physiologic temperatures (Craddock et al. 2014;Weingarten et al. 2016).
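The distinction between pure and mixed states can be made quantitative with the purity Tr(ρ²), which equals 1 exactly for a pure state and falls to 1/d for the maximally mixed state of a d-dimensional system. A minimal NumPy sketch of this textbook calculation (purely illustrative; not a model of microtubules):

```python
import numpy as np

def purity(rho: np.ndarray) -> float:
    """Tr(rho^2): equals 1 for a pure state, 1/d for the maximally mixed state."""
    return float(np.real(np.trace(rho @ rho)))

# Pure state |+> = (|0> + |1>)/sqrt(2), written as a density matrix.
plus = np.array([1.0, 1.0]) / np.sqrt(2.0)
rho_pure = np.outer(plus, plus)

# Maximally mixed single-qubit state: I/2 (a statistical mixture, no coherence).
rho_mixed = np.eye(2) / 2.0

print(purity(rho_pure))   # close to 1.0 (pure)
print(purity(rho_mixed))  # 0.5 = 1/d for d = 2 (maximally mixed)
```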

Maximally Entangled Pure States
The QBIT theory proposes that a quale is a superdense pack of quantum information encoded in maximally entangled pure states. But why maximally entangled pure states?
Maximally entangled pure states are ideal resources for quantum computation, while mixed states are not very useful for this purpose (Horodecki et al. 2009). Some unique effects of quantum computation arise only when maximally entangled pure states are available for use. From a thermodynamic point of view, production of maximally entangled pure states is costly, meaning that it requires consumption of energy and production of entropy. However, for some computational tasks, such as estimating a given parameter with high precision, it is more cost-effective for a system to use maximally entangled pure states than to use already available mixed states (Cirac et al. 1999). Therefore, above a certain level of precision, the cost of computation will be reduced if maximally entangled pure states are used.
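How much entanglement a two-qubit pure state carries can be quantified by the von Neumann entropy of either qubit's reduced state: it is 0 for a product state and reaches its maximum of 1 bit for a maximally entangled (Bell) state. A small illustrative computation via the Schmidt decomposition:

```python
import numpy as np

def entanglement_entropy(state: np.ndarray) -> float:
    """Von Neumann entropy (in bits) of one qubit of a two-qubit pure state,
    computed from the Schmidt coefficients of the 2x2-reshaped amplitudes."""
    schmidt = np.linalg.svd(state.reshape(2, 2), compute_uv=False)
    probs = schmidt**2
    probs = probs[probs > 1e-12]           # drop zeros to avoid log(0)
    return float(-np.sum(probs * np.log2(probs)))

bell = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2.0)   # (|00> + |11>)/sqrt(2)
product = np.array([1.0, 0.0, 0.0, 0.0])               # |00>, unentangled

print(entanglement_entropy(bell))     # maximal: one full bit of entanglement
print(entanglement_entropy(product))  # zero entanglement
```

"Maximally entangled" in the sense used throughout this paper means exactly the first case: the reduced entropy saturates its upper bound.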
Since, on this view, consciousness requires quantum computation with maximally entangled pure states, a conscious agent should be endowed with a mechanism that constantly produces and preserves such states. The QBIT theory proposes that, in the brain, the task of producing and preserving maximally entangled pure states is partly performed by something like a Maxwell demon. In thermodynamics, a Maxwell demon is an entity that couples to a system and improves its thermodynamic efficiency (Zurek 1989). In fact, a Maxwell demon extracts work and removes heat from a target system in a cyclic process. Work extraction and heat removal (a thermodynamic process) is equivalent to converting mixed states to pure states (an information-theoretic process) (Horodecki and Oppenheim 2013). In quantum mechanics, there is a special kind of Maxwell demon which is able to steer a qubit into a purer state (Lebedev et al. 2018). In other words, it can inject pure states into an ongoing quantum computation. This kind of Maxwell demon is in sharp contrast to locally operating classical Maxwell demons. It can purify a target qubit over macroscopic distances on the order of meters, and it tolerates elevated temperatures on the order of a few kelvin. Such a spatial separation between the system and the demon has practical benefits because it prevents undesired heating of the system during the demon's memory erasure. In fact, this particular demon not only purifies a qubit but also makes the environment surrounding the qubit slightly colder. Furthermore, in contrast to the classical demon, this quantum demon utilizes its purity or coherence as a thermodynamic resource.
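The cost of the demon's memory erasure mentioned above is bounded by Landauer's principle: erasing one bit of memory dissipates at least k_B·T·ln 2 of heat into the environment. An illustrative back-of-the-envelope calculation (the choice of 310 K, roughly human body temperature, is my assumption here):

```python
import math

k_B = 1.380649e-23   # Boltzmann constant in J/K (exact SI value)
T = 310.0            # kelvin; roughly human body temperature (assumption)

# Landauer's principle: erasing one bit of memory dissipates at least
# k_B * T * ln(2) of heat into the environment.
landauer_bound = k_B * T * math.log(2)
print(f"minimum heat per erased bit at {T:.0f} K: {landauer_bound:.3e} J")  # ≈ 2.97e-21 J
```

This tiny but nonzero cost is why any demon-like purification mechanism in the brain would have to be powered by metabolic free energy rather than operate for free.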
Although the concept of the Maxwell demon was first introduced in thermodynamics, it gradually found applications in other scientific disciplines, including information theory and biology. In biology, for example, it has been demonstrated that the action of a Maxwell demon is necessary for chemotaxis in Escherichia coli (Tu 2008; Ito and Sagawa 2015). In this case, the Maxwell demon attempts to reduce the effects of environmental noise on the target system. The QBIT theory proposes that consciousness is another biological process in which a Maxwell demon could play important roles. One of these roles is the production and preservation of maximally entangled pure states for quantum computation.
Quantum computation will be more efficient if maximally entangled pure states are used (Kwiat et al. 2001). Furthermore, some computational tasks, such as dense coding, generally require maximally entangled pure states (D'Arrigo et al. 2014). However, due to the effects of decoherence, practically available states are most likely to be nonmaximally entangled, partially mixed (i.e., not pure), or both (Kwiat et al. 2001). To counter this problem, different methods of entanglement distillation as well as state purification have been proposed and realized experimentally. Entanglement distillation is a process that increases entanglement but not purity, whereas state purification increases purity but not entanglement (Kwiat et al. 2001). Entanglement distillation converts a number of less entangled qubits into a smaller number of maximally entangled qubits (Pan et al. 2003), while state purification converts mixed states into maximally coherent (i.e., pure) states (Liu and Zhou 2019).
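Dense coding, mentioned above as a task that requires maximally entangled pure states, can be sketched in a few lines of linear algebra: by sharing one Bell pair, Alice can transmit two classical bits while sending only a single qubit, because her four local Pauli encodings rotate the shared state into four mutually orthogonal Bell states. A minimal NumPy simulation of the ideal (noiseless) case:

```python
import numpy as np

# Single-qubit Pauli operators used by Alice for encoding.
I = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])

bell = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2.0)   # shared pair (|00>+|11>)/sqrt(2)

def encode(b1: int, b2: int) -> np.ndarray:
    """Alice applies Z^b1 X^b2 to her qubit only, then sends that qubit to Bob."""
    op = np.linalg.matrix_power(Z, b1) @ np.linalg.matrix_power(X, b2)
    return np.kron(op, I) @ bell

def decode(state: np.ndarray) -> tuple:
    """Bob measures in the Bell basis: the four encoded states are mutually
    orthogonal, so both bits are recovered with certainty."""
    candidates = [(0, 0), (0, 1), (1, 0), (1, 1)]
    overlaps = [abs(np.vdot(encode(*c), state)) ** 2 for c in candidates]
    return candidates[int(np.argmax(overlaps))]

for bits in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    assert decode(encode(*bits)) == bits   # two classical bits per one sent qubit
```

With a partially mixed or less entangled shared state, the four encodings would no longer be perfectly distinguishable, which is exactly why distillation and purification matter for this task.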
Both entanglement distillation and state purification can be realized by a sequence of "local operations and classical communication" or LOCC (Pan et al. 2003;Horodecki and Piani 2012). Therefore, a sequential series of LOCC can potentially generate maximally entangled pure states (Murao and Vedral 2001).

Why are Qualia Subjective?
A prime feature of qualia is that they are subjective. This means that they are private and unshareable, accessible only to the system that is generating them. Observation or measurement of qualia generated within a system is not possible for any other system.
A quale could be regarded as a private key. In terms of information theory, a private key is a string of bits shared between two parties which has two important features. First, the two copies of the key are perfectly correlated. Second, the key is inaccessible to anyone else (Horodecki et al. 2009). The first feature is due to maximal entanglement. The second feature is due to maximal coherence (or purity), because an eavesdropper who attempts to obtain knowledge about the private key will unavoidably disturb it, introducing a phase error into the system, which destroys purity (Horodecki et al. 2009).
"Entanglement is the quantum equivalent of what is meant by privacy." This nice statement, and the argument behind it, in a paper by Horodecki et al. (2009) provided insight for the QBIT theory to propose that quantum entanglement might be able to explain the subjectivity of consciousness.
Quantum entanglement has limited shareability. In the case of pure states, it can even be absolutely unshareable (Seevinck 2010). All these arguments can be expressed in terms of the monogamy of entanglement. According to the monogamy of entanglement, maximally entangled pure states are not shareable (Doherty 2014;Susskind and Zhao 2018). Since qualia are encoded in maximally entangled pure states, they should be private and unshareable.
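The monogamy constraint invoked here has a precise quantitative form. For any three qubits A, B, and C, the Coffman-Kundu-Wootters inequality bounds how entanglement (measured by the tangle τ, the squared concurrence) can be distributed:

```latex
\tau_{A|B} + \tau_{A|C} \le \tau_{A|BC}
```

If A is maximally entangled with B, the term τ_{A|B} already saturates the right-hand side, forcing τ_{A|C} = 0: qubit A can share no entanglement whatsoever with any third party C. This is the formal sense in which maximally entangled pure states are private and unshareable.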

Consciousness and Quantum Information
The QBIT theory proposes that qualia are quantum information in nature, and that the emergence of qualia requires quantum computation. Most physical phenomena in nature can be formulated and better described in terms of quantum information and computation (Luo 2003). Gravity is a prominent example. Reconciling quantum mechanics with gravity is a hard and yet unresolved problem in physics. Recently, quantum information theory and concepts like entanglement and quantum error correction have come to play a fundamental role in solving this problem. For example, it has been suggested that gravity comes from quantum information (Qi 2018). Furthermore, recent evidence from theoretical physics implies that entangled qubits are not only the origin of gravity, but also the origin of matter and space (Wen 2019). It seems that, at some level, everything reduces to information (Masanes et al. 2013). This inspires the QBIT theory to propose that, at a fundamental level, qualia are quantum information or entangled qubits.
As our knowledge about the nature of quantum information increases, we will gain more insight into the nature of qualia. At present, we know that quantum information is nonlocal. It does not make sense to ask where quantum information is at any given time; it is nonlocally distributed in the entangled state (Susskind and Zhao 2018). Since quantum information is nonlocal, qualia should also be nonlocal. Furthermore, there is some evidence that quantum information is physical (DiVincenzo and Loss 1998). If this turns out to be true, then qualia must also be physical.
In general, information cannot exist without a physical substrate that encodes it (Landauer 1991). Therefore, information that we retain in our brains should also have a physical substrate. This physical substrate is a kind of qubit. But what plays the role of qubits in the brain? There are, at least, two potential candidates: the "tubulin bits" described by Stuart Hameroff and Roger Penrose (2014), and the "neural qubits" described by the physicist Fisher (2015). Fisher (2015) suggests that, in the brain, nuclear spin of a single phosphorus atom residing on a Posner molecule can serve as a qubit, called a "neural qubit". A Posner molecule is a kind of calcium phosphate molecule with a unique chemical structure that can protect phosphorus nuclear spins from decoherence for very long times. Phosphorus nuclear spins in different Posner molecules can become entangled and remain so for relatively long periods of time (Weingarten et al. 2016).
In the brain, the Posner molecule seems to be a promising platform for quantum computations based on phosphorus nuclear spin. The nucleus of a phosphorus atom is an extremely weak magnet. It can be thought of as a compass needle that can point toward either north or south. These north or south positions are equivalent to zeros and ones of binary codes which form the basis of classical computation. In classical computers, information is encoded in zeros and ones, which themselves are represented by different voltages on semiconductors (Adami 2012).

QBIT, Orch OR, and IIT
Two of the most promising and well-developed theories of consciousness are the orchestrated objective reduction (Orch OR) theory (Hameroff and Penrose 2014) and the integrated information theory (Tononi 2008).
The main similarity between the Orch OR theory and the QBIT theory is that both are constructed on the basis of quantum mechanics. Both theories propose that consciousness requires quantum computation, entanglement, and coherence. The Orch OR theory assumes that consciousness necessarily requires collapse of the wavefunction. In fact, this theory is based on a particular interpretation of quantum mechanics that has the concept of "objective collapse of the wavefunction" at its core. Therefore, the Orch OR theory might be incompatible with other interpretations of quantum mechanics, particularly with non-collapse interpretations such as the "many-worlds interpretation". In contrast to the Orch OR theory, the QBIT theory is not based on a specific interpretation of quantum mechanics. The main emphasis of the QBIT theory is not on the objective or subjective collapse of the wavefunction, but on the extreme compression of information. Therefore, the QBIT theory, unlike the Orch OR theory, might be compatible with all of the different interpretations of quantum mechanics.
The integrated information theory (IIT) suggests that consciousness is integrated information (Tononi 2004). According to this theory, consciousness has a quantity as well as a quality. Its quantity is determined by the amount of integrated information generated by a system, while its quality is determined by the set of informational relationships generated within that system.
A similarity between IIT and the QBIT theory is that both theories are constructed on the basic concept of information. In fact, both theories directly connect consciousness to information. As explained by Tononi (2008), the concept of integrated information can in principle be extended to include quantum information because there are interesting parallels between integrated information and principles of quantum mechanics. IIT assumes that quantum entanglement and integrated information are informationally one (Tononi 2008).
IIT states that any system that has integrated information is conscious. This leads to the very counterintuitive consequence that even a simple photodiode is endowed with consciousness. Tononi (2008) argues that "even a binary photo diode is not completely unconscious, but rather enjoys exactly 1 bit of consciousness. Moreover, the photodiode's consciousness has a certain quality to it-the simplest possible quality". According to the QBIT theory, this proposal is not correct. A photodiode, and even a much more complex system (such as a digital computer), is not conscious at all, because it lacks the capacity to compress information beyond the stringent threshold required for the generation of consciousness.

Conclusions
According to the QBIT theory of consciousness, a quale (or a subjective conscious experience) is the end-product of "representation distillation". A quale is a maximally compressed representation that is most meaningful for the brain. It is the simplest, the most accurate, and the most efficient representation that could be generated to represent an external stimulus within the brain. When the brain generates such a representation, its uncertainty about the external stimulus becomes as small as possible.
The QBIT theory of consciousness is in its first stage of development, attempting to absorb relevant evidence from various scientific disciplines. Admittedly, it is not yet a complete and comprehensive theory, but I think it is on the right path toward solving the problem of consciousness.
The focus of this paper is exclusively on the physical (or neural) aspects of consciousness. This does not mean that consciousness has no other aspects or dimensions. Exploring other aspects of consciousness, as performed by many philosophers and psychologists, is just as valuable as exploring its neural correlates. For example, philosopher Charles Sanders Peirce developed the philosophy of synechism and suggested the idea that consciousness has not only a bodily but also a social dimension which originates outside the individual self.
Another aspect of consciousness that deserves exploration is what Polanyi (1965) calls subsidiary awareness. As explained by Polanyi, "the characteristic feature of subsidiary awareness is to have a function, the function of bearing on something at the focus of our attention". He argues that subsidiary awareness is not equivalent to subconscious or preconscious awareness. Furthermore, it is not identical with the fringe of consciousness as described by William James. Polanyi argues that the connection between body and mind is an instance of the link between the subsidiary and the focal in tacit knowledge.
A challenging issue that any theory of consciousness should deal with is the evolutionary origin of consciousness. Briefly, the QBIT theory suggests that consciousness as a process has evolved from associative learning, and that different types of qualia have evolved from ancient forms of feelings. Bronfman et al. (2016) provide evidence to support the idea that unlimited associative learning is the marker of the transition to minimal consciousness during evolution. They argue that unlimited associative learning is the phylogenetically earliest manifestation of consciousness, and the driver of its evolution.
The idea that feelings are the evolutionary origins of qualia has been explored extensively by Langer (1967). She has explained how feelings could evolve within higher organisms to become conscious percepts in human beings (Shelley 1998). For reasons of space, I will not go deeper into these topics. I hope to explore these ideas in a future paper.
Majid Beshkar I was born in Iran in 1980. I studied dentistry in Tehran University of Medical Sciences, and graduated with a doctorate degree in dentistry (DDS) in 2004. During the doctorate program, I became interested in the scientific study of consciousness, and followed consciousness studies as a serious line of research in parallel with dentistry. In 2009, I started a residency program in oral and maxillofacial surgery in Tehran University of Medical Sciences and graduated with a specialty degree in 2014. Currently, I am assistant professor of oral and maxillofacial surgery in Tehran University of Medical Sciences.