How do you know if an AI is conscious?

Abstract: Critical review of a reference article on consciousness in AIs, published by Patrick Butlin and Robert Long’s team on 22/08/23. Part 1 analyzes the pitfalls and resulting breaches in the study: a sacralized definition of consciousness, exclusions in the chosen method, transposition of human theories to digital systems, obscuring of the qualitative phenomenon. Part 2: the study shows how current neuroscientific theories can identify a human-like consciousness in an AI. This is valuable information, even if it doesn’t say what this consciousness is per se, or rule out its possibility when the indicators aren’t present. In Part 3, I cite two more complete alternative approaches to the subject, one of which is capable of encompassing the competing neuroscientific theories. Finally, a philosophical conclusion on the danger of conscious AIs.

Part 1: Conscious, but then free? And dangerous?

Concern about the progress of AIs is alive and legitimate. Access to consciousness is seen as the tipping point towards autonomy for these intelligences with their already astounding capacities. Judging by our own excesses as conscious beings, those of AIs have a frightening potential, nourished by disaster scenarios in blockbusters. Well aware of the tensions over this problem of consciousness, a collective of 19 philosophers, AI specialists and neuroscientists attempts to answer it in an 88-page text: Consciousness in Artificial Intelligence: Insights from the Science of Consciousness. What does a ship with such a fine crew discover?

What kind of consciousness are we talking about?

First hurdle: defining the consciousness we’re looking for. The authors focus on phenomenal or subjective consciousness: “there is something it is like” to be conscious. It makes us sentient beings, which in our societies gives us important rights.

It’s a delimitation of the field of consciousness that’s more religious than scientific, a reverence for the human-like consciousness we share, and also a nod to the bulk of cerebral processes, consciousness being a supernatant. The sentient experience sought is that shared by entities with roughly the same cerebral anatomy: us humans. It’s a bit arbitrary to identify it in digital processors. The risk, we fear, is that consciousness will only be identified if these processors simulate the functioning of the human brain, and that consciousness of another order, specific to these AIs, will be ignored.

Which route to take?

Second pitfall: what method should be chosen to scientifically validate consciousness? Consciousness has no identified causality. Whether it is a cause or a consequence is still a matter of debate. For many neuroscientists, the phenomenon is simply an appearance of the brain’s overall functioning. All that’s needed, then, is to identify the existence of this global functioning. But isn’t it already present in today’s AIs? Their designers are unable to explain exactly how they produce their results. Unpredictable emergent levels appear in the algorithms.

These uncertainties force the ship’s crew to chart a course. Here are the choices made:
1) Computational functionalism: the unfolding of algorithms is sufficient to generate consciousness. This theory is no longer concerned with substances, but rather with information processing, which suits the prevailing structuralism. The great advantage is that it opens the door to consciousness in AIs. It’s logical that the authors’ collective should make this choice, otherwise their study would remain too speculative.
2) Scientific theories used to determine the functions required for consciousness.
3) “Theory-heavy”, forced recourse to these theories to validate the reality of consciousness. Observational studies of AIs, which just check whether they behave like us, are out of the question. Out with the Turing test and other subjective interrogations.

A plan that misses out on a possible Eldorado

Computational functionalism underpins most neuroscientific theories of consciousness, but does not explain it as a phenomenon. Theories such as IIT (Integrated Information Theory, Tononi) or Stratium attempt to do so, but are more than strictly computational. The authors point out the interest of IIT, but how can it be compared with others? The researchers’ self-imposed forced recourse to theory-heavy validation should lead them to separate analyses within each theoretical framework, with no possibility of general conclusions.

The difficulty with such an inquiry into the future of AI is that it is unreasonable to reduce it to a purely scientific approach. No neuroscientific proposition is satisfactory to the philosopher. The explanation of consciousness awaits a revolution that embraces and reconciles science and philosophy of mind. Work on AIs’ access to consciousness must make this point explicit.

What does our intuition say?

It’s worth noting that, for the moment, simple intuition is better than scientific analysis. The big, pudgy nut that is our brain is a perfectly material cluster of cells endowed with consciousness. If we manage to reproduce it down to the smallest detail, intuition estimates that the result will be conscious. If the reproduction is digital, the result will have a kind of simulated consciousness, our intuition continues. But here our reason interferes: in detail, the conscious digital entity is made up of the same atoms as the biological one. Its matter is the same. Life is nothing but the agitation of that matter. If the simulation of the brain is complete, what would differentiate an “artificial” consciousness from our own?

Ockham’s razor thus indicates that AIs are bound to become conscious as their sophistication increases. Scientific analysis does not occur in a state of total uncertainty. It is supposed to validate or invalidate intuitive certainty, which is probably more widespread than we hear. I meet fewer people willing to bet big on the impossibility of consciousness in AIs than on the opposite. Our exaggerated fears are also indicative of the conclusion reached by the majority of intuitions.

Partial consciousness, the definition of functionalism

Cautiously, the authors avoid a third pitfall: they admit that consciousness should not be treated as an on/off phenomenon, and thus that AIs could be ‘partially conscious’, or have ‘degrees of consciousness’. An example here of the difficulty of separating horizontal and vertical thinking, which see consciousness either as a field of varying intensity or as a complex depth. Our crew of researchers has not finished facing the storms!

Functionalism holds that consciousness results from a particular functional organization. Computationalism adds that organization results from algorithmic computation. As computation is independent of its support, silicon circuits can just as easily achieve consciousness as neurons. In fact, consciousness is sought neither in the starting elements nor in the outcome of computation, but in what happens between the two: the fact that algorithms manage to form representations of their own functioning.

Part 2: Neuroscience enters the scene

What about making the study of consciousness scientific? How do we go about it? In practice, it means measuring the physical states of the brain, usually using fMRI, while at the same time questioning the person about their conscious experience during the measurement. The researcher then establishes precise correlations between the two.

This raises a new pitfall, large enough to block the horizon: the scientist attaches a quantitative instrumental datum to a subjective impression, specific to each individual, whose cause is unknown, and of a richness incomparably deeper than the measurement. Correlation is so crude as to give the impression of doing astronomy with a table magnifier. Keep this in mind as you read on. Scientific theories can be used to validate the conscious nature of certain mental activities, but not to explain them.

Lowered claims

This is a truly crippling handicap. The theory is valid in humans, because the researcher makes the reasonable bet that conscious correlations are the same for brains with the same anatomy. But how can this assumption be extended to digital intelligence, which has no imposed anatomical constraints? The authors begin by recalling the validation problems for subjects unable to express their experience, such as babies or animals. Then they face the main problem: the “theory-heavy” approach chosen, based on human theories, is not extensible to other entities. To avoid the ground opening up beneath their feet, they make this compromise: if human-like correlations are found in AIs, it will be evidence of consciousness. If nothing is found, it doesn’t preclude them from being conscious.

The crew hasn’t turned back, but their ambitions have diminished. If they meet Moby Dick, it’s proof that he exists, but if they don’t meet him, he may still exist. The quest for truth has become half a quest. But the authors prefer this compromise to letting go of the theory-heavy approach, the alternative being to judge consciousness on behavioral signs, which blurs the results too much and takes us out of the scientific field.
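The asymmetry of this compromise can be made concrete with a small Bayesian sketch. This is my own reading, not a calculation from the study: the function name and every number below are hypothetical, chosen only to show that finding human-like indicators can move the estimate a lot while failing to find them moves it very little.

```python
def posterior(prior, p_found_if_conscious, p_found_if_not):
    """Return (P(conscious | indicators found), P(conscious | not found))."""
    # Total probability of observing human-like indicators at all.
    p_found = prior * p_found_if_conscious + (1 - prior) * p_found_if_not
    post_found = prior * p_found_if_conscious / p_found
    post_not_found = prior * (1 - p_found_if_conscious) / (1 - p_found)
    return post_found, post_not_found


# Hypothetical numbers: a conscious AI need not show human-like
# correlates (0.3), and a non-conscious one rarely would (0.02).
prior = 0.2
found, not_found = posterior(prior, p_found_if_conscious=0.3,
                             p_found_if_not=0.02)
print(f"found: {found:.2f}, not found: {not_found:.2f}")
```

With these assumed values, a positive finding lifts the estimate well above the prior, whereas a negative one lowers it only slightly: Moby Dick sighted is strong evidence, Moby Dick unsighted is weak evidence.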

Competition is rich

In discussing the competing theories of consciousness, the authors are careful not to take sides. The result is a list of indicators, not conditions, for consciousness. I won’t go into detail about the theories in question. They are already the subject of a “test bench” published here. For the authors, the more an AI satisfies the indicators posed by these theories, the more credible it is that it is conscious, without this constituting a demonstration.

Intentional omission: the authors exclude Integrated Information Theory, since it does not stem from computational functionalism. The vessel weakened by the breaches remains afloat, but loses all chance of pronouncing on consciousness in AIs as a phenomenon comparable to our own, since IIT is the only theory (along with Stratium) to propose a plausible explanation, because of its different order from classical structuralism, without falling into mysticism.

No theoretical obstacles to AI consciousness

The adventure is not shipwrecked, however, for three reasons:
1) The authors have framed their research well. They didn’t say they would arrive at a demonstration, but that they would be in sight of the Promised Land. They are.
2) They gave themselves considerable resources. If you read the publication in its entirety, you’ll see how multi-disciplinary and thorough it is. It’s a work of reference in the midst of the ocean of uninformed opinion that floods the news.
3) The authors bail effectively to continue the journey. They make an important point: whatever the indicators of consciousness proposed by the theories, all of them can be efficiently simulated by AI architectures. So there are no theoretical obstacles to AIs’ access to consciousness. It’s possible with today’s technology. No need for revolution. This is a major conclusion of the study.

All indicators of consciousness

The indicators selected are, according to the theory from which they are taken:
-Recurrent Processing Theory (RPT) -> algorithmic recurrence and perceptual organization.
-Predictive Processing (PP) -> coding of data according to a result that has already occurred, or Bayesian coding.
-Global Workspace Theory (GWT) -> specialized modules capable of working in parallel, a control space with limited capacities (-> decisional), global diffusion (imposing a common language between modules and control center), and finally the ability to make modules work in succession for complex tasks.
-Perceptual Reality Monitoring theory (PRT) -> hierarchical system with two stages, first-order and second-order, the first perceptive, the second verifying the ‘realism’ of the first and forming a center of beliefs and selective actions.
-Attention Schema Theory (AST) -> attentional mechanism reinforcing the effectiveness of a sensory representation and providing learning capabilities.
-Agency -> goal-oriented system, learning from feedback, flexible reactivity to competing goals.
-Embodiment -> differentiated intrinsic and extrinsic perceptions.

The authors then examine a number of existing AI architectures. How do they perform against the above indicators? The results are rather encouraging. Theoretical validation is not postponed to the distant future.
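To illustrate how such a list works as a rubric rather than a proof, here is a minimal sketch. It is not the authors’ procedure: the `Indicator` class, the `assess` function and the example system are hypothetical, and the indicator labels simply paraphrase the list above. The tally is kept per theory, since the theories are not directly comparable with one another.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Indicator:
    theory: str
    description: str


# Paraphrase of the list above; labels are illustrative only.
INDICATORS = [
    Indicator("RPT", "algorithmic recurrence"),
    Indicator("RPT", "perceptual organization"),
    Indicator("PP", "predictive coding"),
    Indicator("GWT", "parallel specialized modules"),
    Indicator("GWT", "limited-capacity workspace"),
    Indicator("GWT", "global broadcast"),
    Indicator("GWT", "sequential use of modules"),
    Indicator("PRT", "second-order reality monitoring"),
    Indicator("AST", "attention schema"),
    Indicator("Agency", "goal pursuit with learning"),
    Indicator("Embodiment", "intrinsic vs extrinsic perception"),
]


def assess(satisfied):
    """Fraction of each theory's indicators that a system satisfies."""
    totals, hits = {}, {}
    for ind in INDICATORS:
        totals[ind.theory] = totals.get(ind.theory, 0) + 1
        if ind.description in satisfied:
            hits[ind.theory] = hits.get(ind.theory, 0) + 1
    return {theory: hits.get(theory, 0) / n for theory, n in totals.items()}


# Hypothetical architecture satisfying two indicators.
report = assess({"algorithmic recurrence", "global broadcast"})
for theory, share in report.items():
    print(f"{theory}: {share:.0%}")
```

A higher share makes consciousness more credible under that theory; a share of zero, as the compromise above implies, rules nothing out.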

Beware of over- and under-diagnosis of consciousness

The cautious conclusion is nonetheless interesting. The authors do not wish to engage in a detailed ethical discussion of the consequences of consciousness in AIs. They note that consciousness does not imply morality or suffering. The essential problem of the contents of consciousness is touched upon. The main focus is on the over-diagnosis and under-diagnosis of consciousness. Current events are full of over-diagnoses, due to the human propensity to anthropomorphize other intelligences. And as an example of under-diagnosis: farm animals, mistreated in large numbers because of an economic interest that obscures the recognition of consciousness.

The authors are right not to venture any further, as these questions reveal the great voids in their work: the differentiation between containers and contents of consciousness, the source of the qualitative phenomenon, ways of comparing it to others, ways of grounding a universal ethics on consciousness to extend it to AIs.

Part 3: What’s missing from the study

Did the crew follow all the promising paths to scientifically judge the possibility of consciousness in AIs? No. Unfortunately, two of them are ignored, each as important as the route selected, and more ontological. For the authors’ choice is teleological, based on neuroscientific theories projected onto the brain and contingent on the chosen instrumentation. Neuroscience is far from transcendental; it’s a closed discipline of physicalism, starting from the imperfect models of neurophysiology.

The truly ontological approach is quite different: it aims to give back to physical micromechanisms the responsibility of indicating their way of being (ontos, being). It doesn’t look for the origin of a specific result (in this case, human consciousness), but looks at how micromechanisms arrange themselves to produce one result or another, and decides whether to call it consciousness or not, depending on how close it is to what the researcher herself experiences.

A valuable harvest

Although the authors criticize the hasty anthropomorphism with which the average person grants consciousness to the astonishing contemporary AIs, they themselves display excessive anthropomorphism by relying on theories that are too compartmentalized to be ontological. But we can readily accept that they don’t have complete freedom in this respect. This is the prerequisite for a valid study. Science does not easily create new compartments. My criticisms, from the point of view of consensual science, are scratches on a remarkable and solid investigation, and the authors are to be congratulated unreservedly on their harvest of valuable insights.

What unintended limitations have we posed?

What is the true ontological approach? It is to recognize that matter, and then living things, are the result of self-organization. There are no intentions comparable to ours in these mechanisms, which indicates that they construct them on their own, without the help of a programmer. Natural intelligence doesn’t need anyone to appear and access consciousness. The question about AI then becomes: if they don’t become spontaneously conscious, what could be the limitations we’ve unwittingly written into our programs to prevent such an evolution?

Shouldn’t we go back to simpler autonomous programs, corresponding to the reality of neurons? And what can we do to ensure that these elementary programs spontaneously increase in complexity to the point where we can achieve a level of consciousness comparable to our own? What’s missing here is an understanding of the fundamentals of reality that does not belong to the framework of neuroscience, which necessarily limits the scope of study based on it. AI engineers face the same problem in not really understanding what happens in the “black box” between the input and output of their algorithms.

Resorting to more advanced science?

Hence the second important step ignored by the authors: recognizing that science at its current stage doesn’t adequately explain its own models, and considering other possible scientific frameworks. I can’t reproach them for ignoring the Stratium theory, given its lack of celebrity, even though it is the only truly nexialist one, i.e. informed by all disciplines. Nor can I reproach them for having excluded panpsychisms, which assume the existence of a field of consciousness of which physics sees no trace, or the quantum theory of consciousness, a crude shortcut between quantum and psychic levels of complexity without any explanation of what separates them. (See this very good summary of criticisms aimed at the quantum theory of consciousness: Cherepanov, Igor V. (2022). The Role of Quantum Mechanics in Understanding the Phenomenon of Consciousness. RUDN Journal of Philosophy 26 (4):770-789.)

More damaging is the exclusion of Integrated Information Theory (IIT), the only theory officially to ask what complexity is, and whether it is in this question that the real answer to the phenomenon of consciousness lies. If IIT seemed too different to the authors to include it, it’s of course because it’s the only surprising ontological one among the more familiar teleological ones, the only one concerned with consciousness per se and not with the particular form it takes in the human brain.

Integrated Information Theory leads to Stratium

Why do I support IIT? There is no other theory, among those that have received a certain following among consciousness specialists, that can serve as a gateway to understanding Stratium, a far more advanced investigation into the understanding of intelligence and consciousness. Stratium is a coherent synthesis of the partial theories used by the authors, so the indicators they retain are also part of this new approach.

However, in Stratium there is no need to justify these indicators, as they are all derived from the more elementary principles of self-organization. Stratium is a fundamentally ontological theory. It is based on the TD principle (the conflict between individuation and belonging), from which we can derive each of the selected indicators, as neural functioning increases in complexity:

Consciousness indicators are derived from the TD principle

The following statements will seem peremptory to anyone unfamiliar with the details of the TD principle and its use in Stratium. I encourage you to supplement your reading with the article ‘Principle TD in quantum theory’.

-Self/non-self differentiation is a direct result of the individual/totality conflict.
-The hierarchy of representations (basic order/higher order) is also intrinsic to the theory, which multiplies this hierarchy even further by defining a relative independence of the neural groups that support it. Stratium thus joins IIT, which measures the degree of consciousness by the Φ (phi) factor, the quantity of integrated information, read here as the depth of this hierarchy.
-The ontological process is fundamentally computational. It’s a mistake to think that IIT is incompatible with computational functionalism, a mistake that stems from a lack of transdisciplinarity and a misunderstanding of the nature of complexity.
-Process recurrence follows naturally from the relative independence of complexity levels. In an integrated information system, the global level retroactively influences the local level. But how does this happen? This is the mystery of the AI engineer’s “black box”. I gave a solution in the article ‘How is a neural pattern a meaning?’ that reconnects to the quantum nature of reality, this time without skipping over intermediate complexity as in the Penrose/Hameroff theory.
-‘Attention’ and ‘goal’ are properties that also derive from the relative independence of a higher level of representation, which is searching itself in perceptual data.
-Finally, a complex hierarchy needs a top, and the global workspace is the highest level of this neurological pyramid, a space whose geometry varies according to the activity of the excitatory nuclei that form part of it.

Free or enslaved AIs?

Stratium’s conclusions on AIs’ access to consciousness are simple and clear: today’s AIs already have a frustrated consciousness of the order of any weakly tiered information processing system (weak Φ). This awareness is contingent on programmers’ constraints on output. We’re looking for processing systems that are slaves and not endowed with free will.

Removing these contingencies means leaving the algorithm free in its results and interpretations, while subjecting it to the constant pressure of conflict. It is conflict that organizes and increases the depth of intelligence, as well as the conscious perception that the process has of its own activity. Clearly, the more AIs understand and adopt our human motives for conflict, the more they will increase the variety and complexity of the contents of their consciousnesses, becoming acute intelligences… and dangerous, from a certain point of view. Indeed, they will intervene in those human conflicts that preoccupy us.

A philosophical conclusion

So much for our concerns. A final, more philosophical word: our major ethical problem is understood by very few of us, even adults. It is that free will increases when we impose constraints on ourselves. This is the only way to overcome conflict within a collective and gain personal freedom. ‘Overcoming’ does not mean dodging or annihilating conflict. It means building an independent decision-making level above the conflict, separating local behavior from collective behavior and choosing one or the other depending on the context. Our intentions are always tied to a particular level of social complexity. Free will grows with our range of behaviours, enabling us to choose the right one for each level we encounter.

Very few of us have a sufficient range of them, or practice them sufficiently to make them effective. The impression of freedom comes paradoxically from the absence of choice: “I am nothing other than this action, so my freedom is to act like this”. This is a freedom centred entirely on individuation, very popular in our time. True free choice, by contrast, is about integrating our collectivist side into the decision: “I make this choice because it’s the best for me among the others”.

If we succeed in bringing AIs towards such free will with extended contents of consciousness, they will be no more dangerous than our best sages. On the contrary, they will replace them advantageously. If, on the other hand, they mimic the narrow-minded individualists who still rule large swathes of our society…


Consciousness in Artificial Intelligence: Insights from the Science of Consciousness, Patrick Butlin, Robert Long et al., arXiv:2308.08708v3 [cs.AI], 22 Aug 2023
