Minds of machines: The great AI consciousness conundrum

David Chalmers was not expecting the invitation he received in September of last year. As a leading authority on consciousness, Chalmers regularly circles the world delivering talks at universities and academic meetings to rapt audiences of philosophers—the sort of people who might spend hours debating whether the world outside their own heads is real and then go blithely about the rest of their day. This latest request, though, came from a surprising source: the organizers of the Conference on Neural Information Processing Systems (NeurIPS), a yearly gathering of the brightest minds in artificial intelligence.

Less than six months before the conference, an engineer named Blake Lemoine, then at Google, had gone public with his contention that LaMDA, one of the company’s AI systems, had achieved consciousness. Lemoine’s claims were quickly dismissed in the press, and he was summarily fired, but the genie would not return to the bottle quite so easily—especially after the release of ChatGPT in November 2022. Suddenly it was possible for anyone to carry on a sophisticated conversation with a polite, creative artificial agent.

Chalmers was an eminently sensible choice to speak about AI consciousness. He’d earned his PhD in philosophy at an Indiana University AI lab, where he and his computer scientist colleagues spent their breaks debating whether machines might one day have minds. In his 1996 book, The Conscious Mind, he spent an entire chapter arguing that artificial consciousness was possible.

If he had been able to interact with systems like LaMDA and ChatGPT back in the ’90s, before anyone knew how such a thing might work, he would have thought there was a good chance they were conscious, Chalmers says. But when he stood before a crowd of NeurIPS attendees in a cavernous New Orleans convention hall, clad in his trademark leather jacket, he offered a different assessment. Yes, large language models—systems that have been trained on enormous corpora of text in order to mimic human writing as accurately as possible—are impressive. But, he said, they lack too many of the potential requisites for consciousness for us to believe that they actually experience the world.

At the breakneck pace of AI development, however, things can shift suddenly. For his mathematically minded audience, Chalmers got concrete: the chances of developing any conscious AI in the next 10 years were, he estimated, above one in five.

Not many people dismissed his proposal as ridiculous, Chalmers says: “I mean, I’m sure some people had that reaction, but they weren’t the ones talking to me.” Instead, he spent the next several days in conversation after conversation with AI experts who took the possibilities he’d described very seriously. Some came to Chalmers effervescent with enthusiasm at the concept of conscious machines. Others, though, were horrified at what he had described. If an AI were conscious, they argued—if it could look out at the world from its own personal perspective, not simply processing inputs but also experiencing them—then, perhaps, it could suffer.

AI consciousness isn’t just a devilishly tricky intellectual puzzle; it’s a morally weighty problem with potentially dire consequences. Fail to identify a conscious AI, and you might unintentionally subjugate, or even torture, a being whose interests ought to matter. Mistake an unconscious AI for a conscious one, and you risk compromising human safety and happiness for the sake of an unthinking, unfeeling hunk of silicon and code. Both mistakes are easy to make. “Consciousness poses a unique challenge in our attempts to study it, because it’s hard to define,” says Liad Mudrik, a neuroscientist at Tel Aviv University who has researched consciousness since the early 2000s. “It’s inherently subjective.”

Over the past few decades, a small research community has doggedly attacked the question of what consciousness is and how it works. The effort has yielded real progress on what once seemed an unsolvable problem. Now, with the rapid advance of AI technology, these insights could offer our only guide to the untested, morally fraught waters of artificial consciousness.

“If we as a field will be able to use the theories that we have, and the findings that we have, in order to reach a good test for consciousness,” Mudrik says, “it will probably be one of the most important contributions that we could give.”

When Mudrik explains her consciousness research, she starts with one of her very favorite things: chocolate. Placing a piece in your mouth sparks a symphony of neurobiological events—your tongue’s sugar and fat receptors activate brain-bound pathways, clusters of cells in the brain stem stimulate your salivary glands, and neurons deep within your head release the chemical dopamine. None of those processes, though, captures what it is like to snap a chocolate square from its foil packet and let it melt in your mouth. “What I’m trying to understand is what in the brain allows us not only to process information—which in its own right is a formidable challenge and an amazing achievement of the brain—but also to experience the information that we are processing,” Mudrik says.

Studying information processing would have been the more straightforward choice for Mudrik, professionally speaking. Consciousness has long been a marginalized topic in neuroscience, seen as at best unserious and at worst intractable. “A fascinating but elusive phenomenon,” reads the “Consciousness” entry in the 1996 edition of the International Dictionary of Psychology. “Nothing worth reading has been written on it.”

Mudrik was not dissuaded. From her undergraduate years in the early 2000s, she knew that she didn’t want to research anything other than consciousness. “It might not be the most sensible decision to make as a young researcher, but I just couldn’t help it,” she says. “I couldn’t get enough of it.” She earned two PhDs—one in neuroscience, one in philosophy—in her determination to decipher the nature of human experience.

As slippery a topic as consciousness can be, it is not impossible to pin down—put as simply as possible, it’s the ability to experience things. It’s often confused with terms like “sentience” and “self-awareness,” but according to the definitions that many experts use, consciousness is a prerequisite for those other, more sophisticated abilities. To be sentient, a being must be able to have positive and negative experiences—in other words, pleasures and pains. And being self-aware means not only having an experience but also knowing that you are having an experience.

In her laboratory, Mudrik doesn’t worry about sentience and self-awareness; she’s interested in observing what happens in the brain when she manipulates people’s conscious experience. That’s an easy thing to do in principle. Give someone a piece of broccoli to eat, and the experience will be very different from eating a piece of chocolate—and will probably result in a different brain scan. The problem is that those differences are uninterpretable. It would be impossible to discern which are linked to changes in information—broccoli and chocolate activate very different taste receptors—and which represent changes in the conscious experience.

The trick is to modify the experience without modifying the stimulus, like giving someone a piece of chocolate and then flipping a switch to make it feel like eating broccoli. That’s not possible with taste, but it is with vision. In one widely used approach, scientists have people look at two different images simultaneously, one with each eye. Although the eyes take in both images, it’s impossible to perceive both at once, so subjects will often report that their visual experience “flips”: first they see one image, and then, spontaneously, they see the other. By tracking brain activity during these flips in conscious awareness, scientists can observe what happens when incoming information stays the same but the experience of it shifts.

With these and other approaches, Mudrik and her colleagues have managed to establish some concrete facts about how consciousness works in the human brain. The cerebellum, a brain region at the base of the skull that resembles a fist-size tangle of angel-hair pasta, appears to play no role in conscious experience, though it is crucial for subconscious motor tasks like riding a bike; on the other hand, feedback connections—for example, connections running from the “higher,” cognitive regions of the brain to those involved in more basic sensory processing—seem essential to consciousness. (This, by the way, is one good reason to doubt the consciousness of LLMs: they lack substantial feedback connections.)

A decade ago, a group of Italian and Belgian neuroscientists managed to devise a test for human consciousness that uses transcranial magnetic stimulation (TMS), a noninvasive form of brain stimulation that is applied by holding a figure-eight-shaped magnetic wand near someone’s head. Solely from the resulting patterns of brain activity, the team was able to distinguish conscious people from those who were under anesthesia or deeply asleep, and they could even detect the difference between a vegetative state (where someone is awake but not conscious) and locked-in syndrome (in which a patient is conscious but cannot move at all).

That’s an enormous step forward in consciousness research, but it means little for the question of conscious AI: OpenAI’s GPT models don’t have a brain that can be stimulated by a TMS wand. To test for AI consciousness, it’s not enough to identify the structures that give rise to consciousness in the human brain. You need to know why those structures contribute to consciousness, in a way that’s rigorous and general enough to be applicable to any system, human or otherwise.

“Ultimately, you need a theory,” says Christof Koch, former president of the Allen Institute and an influential consciousness researcher. “You can’t just depend on your intuitions anymore; you need a foundational theory that tells you what consciousness is, how it gets into the world, and who has it and who doesn’t.”

Here’s one theory about how that litmus test for consciousness might work: any being that is intelligent enough, that is capable of responding successfully to a wide enough variety of contexts and challenges, must be conscious. It’s not an absurd theory on its face. We humans have the most intelligent brains around, as far as we’re aware, and we’re definitely conscious. More intelligent animals, too, seem more likely to be conscious—there’s far more consensus that chimpanzees are conscious than, say, crabs.

But consciousness and intelligence are not the same. When Mudrik flashes images at her experimental subjects, she’s not asking them to contemplate anything or testing their problem-solving abilities. Even a crab scuttling across the ocean floor, with no awareness of its past or thoughts about its future, would still be conscious if it could experience the pleasure of a tasty morsel of shrimp or the pain of an injured claw.

Susan Schneider, director of the Center for the Future Mind at Florida Atlantic University, thinks that AI could reach greater heights of intelligence by forgoing consciousness altogether. Conscious processes like holding something in short-term memory are pretty limited—we can only pay attention to a couple of things at a time and often struggle to do simple tasks like remembering a phone number long enough to call it. It’s not immediately obvious what an AI would gain from consciousness, especially considering the impressive feats such systems have been able to achieve without it.

As further iterations of GPT prove themselves more and more intelligent—more and more capable of meeting a broad spectrum of demands, from acing the bar exam to building a website from scratch—their success, in and of itself, can’t be taken as evidence of their consciousness. Even a machine that behaves indistinguishably from a human isn’t necessarily aware of anything at all.

Schneider, though, hasn’t lost hope in tests. Together with the Princeton physicist Edwin Turner, she has formulated what she calls the “artificial consciousness test.” It’s not easy to perform: it requires isolating an AI agent from any information about consciousness throughout its training. (This is important so that it can’t, like LaMDA, just parrot human statements about consciousness.) Then, once the system is trained, the tester asks it questions that it could only answer if it knew about consciousness—knowledge it could only have acquired from being conscious itself. Can it understand the plot of the film Freaky Friday, where a mother and daughter switch bodies, their consciousnesses dissociated from their physical selves? Does it grasp the concept of dreaming—or even report dreaming itself? Can it conceive of reincarnation or an afterlife?

There’s a huge limitation to this approach: it requires the capacity for language. Human infants and dogs, both of which are widely believed to be conscious, could not possibly pass this test, and an AI could conceivably become conscious without using language at all. Putting a language-based AI like GPT to the test is likewise impossible, as it has been exposed to the idea of consciousness in its training. (Ask ChatGPT to explain Freaky Friday—it does a respectable job.) And because we still understand so little about how advanced AI systems work, it would be difficult, if not impossible, to completely protect an AI against such exposure. Our very language is imbued with the fact of our consciousness—words like “mind,” “soul,” and “self” make sense to us by virtue of our conscious experience. Who’s to say that an extremely intelligent, nonconscious AI system couldn’t suss that out?

If Schneider’s test isn’t foolproof, that leaves one more option: opening up the machine. Understanding how an AI works on the inside could be an essential step toward determining whether or not it is conscious, if you know how to interpret what you’re looking at. Doing so requires a good theory of consciousness.

A few decades ago, we might have been entirely lost. The only available theories came from philosophy, and it wasn’t clear how they might be applied to a physical system. But since then, researchers like Koch and Mudrik have helped to develop and refine a number of ideas that could prove useful guides to understanding artificial consciousness.

Numerous theories have been proposed, and none has yet been proved—or even deemed a front-runner. And they make radically different predictions about AI consciousness.

Some theories treat consciousness as a feature of the brain’s software: all that matters is that the brain performs the right set of jobs, in the right sort of way. According to global workspace theory, for example, systems are conscious if they possess the requisite architecture: a variety of independent modules, plus a “global workspace” that takes in information from those modules and selects some of it to broadcast across the entire system.
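
To make that architecture concrete, here is a toy sketch in Python. It is only an illustration of the select-and-broadcast idea, not an implementation of global workspace theory itself: the `Module` class, the `workspace_cycle` function, and the numeric "salience" scores used to pick a winner are all assumptions invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Module:
    """An independent specialist process (hypothetical stand-in)."""
    name: str
    inbox: list[str] = field(default_factory=list)

    def receive(self, message: str) -> None:
        # Record whatever the workspace broadcasts.
        self.inbox.append(message)


def workspace_cycle(modules, proposals):
    """Select one proposal and broadcast it to every module.

    `proposals` maps a module name to a (salience, content) pair.
    Picking the highest salience is an assumed selection rule.
    """
    winner_name = max(proposals, key=lambda name: proposals[name][0])
    content = proposals[winner_name][1]
    for module in modules:
        module.receive(content)
    return winner_name, content


modules = [Module("vision"), Module("hearing"), Module("memory")]
proposals = {
    "vision": (0.9, "red light ahead"),
    "hearing": (0.4, "radio playing"),
}
winner, content = workspace_cycle(modules, proposals)
```

In this sketch the "vision" proposal wins, and every module, including ones that proposed nothing, ends up with the same content in its inbox. That system-wide sharing is the feature the theory cares about.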

Other theories tie consciousness more squarely to physical hardware. Integrated information theory proposes that a system’s consciousness depends on the particular details of its physical structure—specifically, how the current state of its physical components influences their future and indicates their past. According to IIT, conventional computer systems, and thus current-day AI, can never be conscious—they don’t have the right causal structure. (The theory was recently criticized by some researchers, who think it has gotten outsize attention.)

Anil Seth, a professor of neuroscience at the University of Sussex, is more sympathetic to the hardware-based theories, for one main reason: he thinks biology matters. Every conscious creature that we know of breaks down organic molecules for energy, works to maintain a stable internal environment, and processes information through networks of neurons via a combination of chemical and electrical signals. If that’s true of all conscious creatures, some scientists argue, it’s not a stretch to suspect that any one of those traits, or perhaps even all of them, might be necessary for consciousness.

Because he thinks biology is so important to consciousness, Seth says, he spends more time worrying about the possibility of consciousness in brain organoids—clumps of neural tissue grown in a dish—than in AI. “The problem is, we don’t know if I’m right,” he says. “And I may well be wrong.”

He’s not alone in this attitude. Every expert has a preferred theory of consciousness, but none treats it as ideology—all of them are eternally alert to the possibility that they have backed the wrong horse. In the past five years, consciousness scientists have started working together on a series of “adversarial collaborations,” in which supporters of different theories come together to design neuroscience experiments that could help test them against each other. The researchers agree ahead of time on which patterns of results will support which theory. Then they run the experiments and see what happens.

In June, Mudrik, Koch, Chalmers, and a large group of collaborators released the results from an adversarial collaboration pitting global workspace theory against integrated information theory. Neither theory came out entirely on top. But Mudrik says the process was still fruitful: forcing the supporters of each theory to make concrete predictions helped to make the theories themselves more precise and scientifically useful. “They’re all theories in progress,” she says.

At the same time, Mudrik has been trying to figure out what this diversity of theories means for AI. She’s working with an interdisciplinary team of philosophers, computer scientists, and neuroscientists who recently put out a white paper that makes some practical recommendations on detecting AI consciousness. In the paper, the team draws on a variety of theories to build a sort of consciousness “report card”—a list of markers that would indicate an AI is conscious, under the assumption that one of those theories is true. These markers include having certain feedback connections, using a global workspace, flexibly pursuing goals, and interacting with an external environment (whether real or virtual).

In effect, this strategy recognizes that the major theories of consciousness have some chance of turning out to be true—and so if more theories agree that an AI is conscious, it is more likely to actually be conscious. By the same token, a system that lacks all those markers can only be conscious if our current theories are very wrong. That’s where LLMs like LaMDA currently are: they don’t possess the right type of feedback connections, use global workspaces, or appear to have any other markers of consciousness.
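
The report-card logic can be sketched in a few lines of Python. This is a simplified illustration, not the white paper's actual rubric: the four marker names are taken from the list above, while the `report_card` function and its unweighted tally are assumptions made for the example.

```python
# Marker names paraphrase the ones listed in the text; the real
# white paper's list may be longer and more finely specified.
MARKERS = (
    "feedback_connections",
    "global_workspace",
    "flexible_goal_pursuit",
    "environment_interaction",
)


def report_card(system_properties):
    """Tally which markers a system shows (assumed, unweighted scoring)."""
    present = [m for m in MARKERS if system_properties.get(m, False)]
    missing = [m for m in MARKERS if m not in present]
    return {
        "present": present,
        "missing": missing,
        "fraction_present": len(present) / len(MARKERS),
    }


# A system with none of the markers, as the text describes current LLMs.
llm_card = report_card({})

# A hypothetical future system with two of the four markers.
future_card = report_card(
    {"global_workspace": True, "environment_interaction": True}
)
```

The tally is not a probability of consciousness. It only records how many of the listed markers a system shows, which is the sense in which a system with more markers satisfies more theories.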

The trouble with consciousness-by-committee, though, is that this state of affairs won’t last. According to the authors of the white paper, there are no major technological hurdles in the way of building AI systems that score highly on their consciousness report card. Soon enough, we’ll be dealing with a question straight out of science fiction: What should one do with a potentially conscious machine?

In 1989, years before the neuroscience of consciousness truly came into its own, Star Trek: The Next Generation aired an episode titled “The Measure of a Man.” The episode centers on the character Data, an android who spends much of the show grappling with his own disputed humanity. In this particular episode, a scientist wants to forcibly disassemble Data, to figure out how he works; Data, worried that disassembly could effectively kill him, refuses; and Data’s captain, Picard, must defend in court his right to refuse the procedure.

Picard never proves that Data is conscious. Rather, he demonstrates that no one can disprove that Data is conscious, and so the risk of harming Data, and potentially condemning the androids that come after him to slavery, is too great to countenance. It’s a tempting solution to the conundrum of questionable AI consciousness: treat any potentially conscious system as if it is really conscious, and avoid the risk of harming a being that can genuinely suffer.

Treating Data like a person is simple: he can easily express his wants and needs, and those wants and needs tend to resemble those of his human crewmates, in broad strokes. But protecting a real-world AI from suffering could prove much harder, says Robert Long, a philosophy fellow at the Center for AI Safety in San Francisco, who is one of the lead authors on the white paper. “With animals, there’s the handy property that they do basically want the same things as us,” he says. “It’s kind of hard to know what that is in the case of AI.” Protecting AI requires not only a theory of AI consciousness but also a theory of AI pleasures and pains, of AI desires and fears.

And that approach is not without its costs. On Star Trek, the scientist who wants to disassemble Data hopes to construct more androids like him, who might be sent on risky missions in lieu of other personnel. To the viewer, who sees Data as a conscious character like everyone else on the show, the proposal is horrifying. But if Data were simply a convincing simulacrum of a human, it would be unconscionable to expose a person to danger in his place.

Extending care to other beings means protecting them from harm, and that limits the choices that humans can ethically make. “I’m not that worried about scenarios where we care too much about animals,” Long says. There are few downsides to ending factory farming. “But with AI systems,” he adds, “I think there could really be a lot of dangers if we overattribute consciousness.” AI systems might malfunction and need to be shut down; they might need to be subjected to rigorous safety testing. These are easy decisions if the AI is inanimate, and philosophical quagmires if the AI’s needs must be taken into consideration.

Seth—who thinks that conscious AI is relatively unlikely, at least for the foreseeable future—nevertheless worries about what the possibility of AI consciousness might mean for humans emotionally. “It’ll change how we distribute our limited resources of caring about things,” he says. That might seem like a problem for the future. But the perception of AI consciousness is with us now: Blake Lemoine took a personal risk for an AI he believed to be conscious, and he lost his job. How many others might sacrifice time, money, and personal relationships for lifeless computer systems?

[Figure: the Müller-Lyer illusion. One line has an arrowhead on each end pointing outward; the line below it has arrowheads pointing inward. The two lines are exactly the same length, yet one looks shorter.]

Even bare-bones chatbots can exert an uncanny pull: a simple program called ELIZA, built in the 1960s to simulate talk therapy, convinced many users that it was capable of feeling and understanding. The perception of consciousness and the reality of consciousness are poorly aligned, and that discrepancy will only worsen as AI systems become capable of engaging in more realistic conversations. “We will be unable to avoid perceiving them as having conscious experiences, in the same way that certain visual illusions are cognitively impenetrable to us,” Seth says. Just as knowing that the two lines in the Müller-Lyer illusion are exactly the same length doesn’t prevent us from perceiving one as shorter than the other, knowing GPT isn’t conscious doesn’t change the illusion that you are speaking to a being with a perspective, opinions, and personality.

In 2015, years before these concerns became current, the philosophers Eric Schwitzgebel and Mara Garza formulated a set of recommendations meant to protect against such risks. One of their recommendations, which they termed the “Emotional Alignment Design Policy,” argued that any unconscious AI should be intentionally designed so that users will not believe it is conscious. Companies have taken some small steps in that direction—ChatGPT spits out a hard-coded denial if you ask it whether it is conscious. But such responses do little to disrupt the overall illusion.

Schwitzgebel, who is a professor of philosophy at the University of California, Riverside, wants to steer well clear of any ambiguity. In their 2015 paper, he and Garza also proposed their “Excluded Middle Policy”—if it’s unclear whether an AI system will be conscious, that system should not be built. In practice, this means all the relevant experts must agree that a prospective AI is very likely not conscious (their verdict for current LLMs) or very likely conscious. “What we don’t want to do is confuse people,” Schwitzgebel says.

Avoiding the gray zone of disputed consciousness neatly skirts both the risks of harming a conscious AI and the downsides of treating a lifeless machine as conscious. The trouble is, doing so may not be realistic. Many researchers—like Rufin VanRullen, a research director at France’s Centre National de la Recherche Scientifique, who recently obtained funding to build an AI with a global workspace—are now actively working to endow AI with the potential underpinnings of consciousness.

The downside of a moratorium on building potentially conscious systems, VanRullen says, is that systems like the one he’s trying to create might be more effective than current AI. “Whenever we are disappointed with current AI performance, it’s always because it’s lagging behind what the brain is capable of doing,” he says. “So it’s not necessarily that my objective would be to create a conscious AI—it’s more that the objective of many people in AI right now is to move toward these advanced reasoning capabilities.” Such advanced capabilities could confer real benefits: already, AI-designed drugs are being tested in clinical trials. It’s not inconceivable that AI in the gray zone could save lives.

VanRullen is sensitive to the risks of conscious AI—he worked with Long and Mudrik on the white paper about detecting consciousness in machines. But it is those very risks, he says, that make his research important. Odds are that conscious AI won’t first emerge from a visible, publicly funded project like his own; it may very well take the deep pockets of a company like Google or OpenAI. These companies, VanRullen says, aren’t likely to welcome the ethical quandaries that a conscious system would introduce. “Does that mean that when it happens in the lab, they just pretend it didn’t happen? Does that mean that we won’t know about it?” he says. “I find that quite worrisome.”

Academics like him can help mitigate that risk, he says, by getting a better understanding of how consciousness itself works, in both humans and machines. That knowledge could then enable regulators to more effectively police the companies that are most likely to start dabbling in the creation of artificial minds. The more we understand consciousness, the smaller that precarious gray zone gets—and the better the chance we have of knowing whether or not we are in it.

For his part, Schwitzgebel would rather we steer far clear of the gray zone entirely. But given the magnitude of the uncertainties involved, he admits that this hope is likely unrealistic—especially if conscious AI ends up being profitable. And once we’re in the gray zone—once we need to take seriously the interests of debatably conscious beings—we’ll be navigating even more difficult terrain, contending with moral problems of unprecedented complexity without a clear road map for how to solve them. It’s up to researchers, from philosophers to neuroscientists to computer scientists, to take on the formidable task of drawing that map.

Grace Huckins is a science writer based in San Francisco.