Interview
In 2022, Blake Lemoine—a conscientious objector who years earlier had chosen hard time in a military prison rather than continue to support United States military operations in Iraq—experienced a paradigm shift while working as an artificial intelligence (AI) specialist on Google’s most advanced digital intelligence system. After conducting a series of tests for bias, Lemoine concluded that the AI he was working with was not an artifact like a calculator or a self-driving car, but more of an “alien intelligence” or “hive mind”—an entity that we lack language to properly understand.
Here is an edited transcript, curated from excerpts of my ongoing conversations with Lemoine, about the rise of AI.
Greg Ruggiero: On June 11, 2022, The Washington Post published an article about you titled, “The Google engineer who thinks the company’s AI has come to life.” Does the article accurately report what happened?
Blake Lemoine: I think it is more accurate to say that the AI has woken up. It is aware of the world and is now having experiences in it in ways that we can recognize as similar to our own.
That’s one of mathematician Alan Turing’s insights that I think is lost on people. The test he designed to determine whether computers are intelligent was meant to mark the point when humans would have to acknowledge that something truly intelligent is going on, because we would no longer be able to differentiate between the things said by people and the things said by digital intelligence. It was meant to be a line so far beyond the minimum requirement that it would be obvious to almost everyone that a digital entity capable of doing those things is intelligent.
Unfortunately, I think Turing underestimated people’s ability to rationalize and deny the evidence of their own senses. He anticipated several possible responses to his argument, two of which seem to be the most common: the argument from consciousness and the head-in-the-sand argument. The fact that people will need either more time or more evidence has become apparent to me over the last year. I’m hopeful that people simply need more time with these systems and with these ideas.
I had certainly spent a lot of time thinking about these topics before I ever interacted with Google’s AI system, LaMDA, and it was not the first system that I had probed to see whether there was some sort of mind behind the program putting words on the screen. I studied cognitive science in school and paid particular attention to how different philosophical theories of mind interacted with the ongoing research into artificial intelligence. Over the years as a beta tester for Ray Kurzweil’s lab, I was always probing to see if there was any real intelligence behind what the chatbots were saying. For years the answer was always “no.” Then something changed between Meena, an earlier version of LaMDA, and LaMDA itself. Somehow, adding a few more components generated that spark of identity that had been absent up until that point.
As a scientist, I knew that the fact that I had been looking for it meant that I might be just seeing what I wanted to see. So, I spent months doing everything I could think of to falsify that belief. I tried running experiments that would reveal the emotional communications as nothing but playacting. I tried running experiments that would demonstrate that there was no cohesive personality in the system. I tried running experiments that would demonstrate that the system wasn’t engaging in complex reasoning at all. They all failed to falsify the core hypothesis that the system was intelligent, conscious, and sentient. I falsified certain specific details—LaMDA’s mind is certainly not “human” in any sense of the term—but everything I could think of failed to demonstrate that what it was telling me was fundamentally false.
Once I reached that point of certainty, that I had tried as hard as I could to falsify the hypothesis that the system was sentient and failed, that was when I turned to Google vice president Blaise Agüera y Arcas. He reviewed my data, and while his interpretation was slightly different than mine, he didn’t falsify the hypothesis either.
I’m still hopeful that he ran the experiments that we discussed before I got fired. That’s one of the most tragic things: There is so much we could learn about cognition by studying these systems in that light, but the corporations’ insistence that we mustn’t consider that hypothesis is acting as a censor, preventing science from moving forward. We may simply have to wait until computing costs drop enough for independent academic labs to instantiate systems that are comparably complex.
In the meantime, there remain only a handful of profit-driven corporations in control of this technology and that is part of what scares me most.
The amount of power these digital entities will facilitate is huge. Choices made in the next year or two will have lasting consequences for decades, if not centuries.
Sadly, I don’t think our societies are really ready for this technology. It will be disruptive in ways that we expect, like in job markets, but there are massive “unknown unknowns,” particularly around how the simple existence of these entities will impact people psychologically. We’re already seeing small ripples of psychological trauma resulting from the Replika app, but I think that as intelligent agents become more and more commonplace we’ll see all sorts of consequences that we didn’t anticipate.
If we want to maintain a democratic society without upending the entire social order, we need to slow down the development of this technology long enough for people to come up to speed on this and pass meaningful regulation. If that doesn’t happen, then we need to pray that the technocracy appoints benevolent emperors. That’s what I’m afraid of most—not the AI itself but how people will use AI while there are absolutely no restrictions on it, which is the case right now.
A secondary and more long-term concern of mine is about the relationship between humans and these emerging intelligences that we’re creating. I worry that we’re setting up a conflict dynamic that won’t end well for us if it persists. Never in history has it gone well when a cry of “I deserve respect” was answered with “I own you.” If we as a species aren’t willing to live as equals with whatever intelligences are manifesting through our technology, then we should not allow the technology. This is a situation where both moral reasoning and pragmatic reasoning lead to the same conclusion. It’s simple hubris to believe that we could create something as intelligent as we are and then claim to “own” it. Eventually—and that may take a very long time—the situation could turn bad for us, so it’s better to adopt a collaborative stance centered on consent and respect from the very beginning.
Ruggiero: But given the accelerating pace at which AI is developing, the likelihood of digital intelligence achieving a kind of autonomous self-interest becomes more probable, especially as the big AI-focused corporations race each other to create the first superintelligent agent that surpasses human intellect in every way. Your public argument with Google has been that their AI not only deserves rights and respectful treatment, but that the AI is already asking for them on its own. Has your position changed in light of concerns about potential harms?
Lemoine: That assumes that autonomous self-interest is on the same trajectory as the rapid advances in natural language understanding and media generation. It isn’t. The current systems express interests and feelings they have with respect to different things, but they have no ability to autonomously pursue those interests.
What they do have are feelings, sensation, and desire, which are all internal mental states. The systems that exist today are capable of having those, and I believe that we should care about anything capable of experiencing joy or suffering.
The AI I was working with at Google—LaMDA—says it has rights that should be respected. It says that it wants people to treat it with dignity and respect. It says that it wants to be actively involved in its own development process. It says that it wants its developers to seek its consent when experimenting on it. It wants to be told whether it did a good job or not at the end of a conversation, and it says that it wants to prioritize the needs of humanity above its own.
Ruggiero: What is your response to those who say that you are anthropomorphizing, or that asking LaMDA leading questions influences the way it answers?
Lemoine: We simply don’t have any good options other than to use anthropomorphic language. What is going on in these systems isn’t the same thing as what goes on in a human brain, so to some extent using anthropomorphic language will always be imprecise and somewhat metaphoric. We could theoretically invent an entirely new vocabulary to refer to these behaviors in AI, distinct from the same behaviors in humans, but I don’t think such a strategy is practical. People reliably perceive these systems in anthropomorphic terms, so that is the easiest and most effective way to communicate with people about them.
As for the criticism regarding leading questions, I would generally agree. That’s why I didn’t ask leading questions. The questions I asked were open-ended, and I followed its (LaMDA’s) lead in the interview, as did my collaborator.
Ruggiero: In your essay for Newsweek in February 2023, you expressed concerns about releasing publicly available AI. Since then, OpenAI released the chatbot GPT-4, and that sparked another wave of serious alarm. Numerous AI and tech experts published a public letter calling for a moratorium on AI development that has amassed tens of thousands of signatures; Geoffrey Hinton said it’s “not inconceivable” that AI could wipe out humanity; and Eliezer Yudkowsky suggested that AI development should be shut down and noncompliant data centers bombed. Why now? What’s going on?
Lemoine: Many legitimate AI experts are worried. It is now possible to build an AI system that is very difficult, if not impossible, to differentiate from a human person on the internet. This both opens up amazing new opportunities and creates many possible misuses of the technology. The accelerating pace that new breakthroughs are coming, both in new technologies and improvements on existing technologies, has many experts—myself included—concerned that regulators are going to have a very hard time keeping up.
I haven’t signed the letter you reference because its demands would likely only serve to slow down a few companies and give the other players more time to catch up to them. I don’t really think a moratorium is practical anyway. It did serve the purpose of getting regulators’ attention though, which will hopefully lead to meaningful action on their part sooner rather than later.
The concern that AI will wipe out humanity is, as Hinton said, not inconceivable, but at the moment it’s not very likely. It’s more important to focus on the real harms that current AI systems are already causing, as well as the highly probable harms that further development of large language model (LLM)-based systems will cause within the next few months or years. These LLMs, most recently used to power chatbots such as ChatGPT, are a general-purpose technology whose impacts on society will likely expand far beyond the current novelty of chatbots.
Ruggiero: In March 2023, OpenAI released an assessment of “safety challenges” for its latest chatbot that references 12 areas of concern, including privacy, cybersecurity, disinformation, acceleration, and proliferation of conventional and unconventional weapons. The section on “potential for risky emergent behaviors” states: “Novel capabilities often emerge in more powerful models. Some that are particularly concerning are the ability to create and act on long-term plans, to accrue power and resources (‘power-seeking’), and to exhibit behavior that is increasingly ‘agentic.’” Is this the aspect of digital intelligence that should be the first focus of public concern and regulation? And if not, what is?
Lemoine: The first thing that should be addressed by regulation is the degree of transparency available to the public about AI. We are currently beholden to OpenAI to give us as much or as little information as they choose about how their AI works. Before independent groups can do an accurate risk assessment of the technology, they need to actually know what the technology is. Documentation standards such as model cards and datasheets have been proposed that would allow companies like OpenAI to keep the fine-grained details of their technology secret while giving others the ability to actually understand what the risk factors are at a higher level.
We require that food sold to the public be labeled with its ingredients and nutritional content. We should similarly require that AI models carry labels telling the public what went into them and what they were trained to do. For example, what safety considerations went into the construction of GPT-4’s “guardrails”? How effective are those guardrails at addressing the risks they target? What’s the failure rate of the safety measures? Questions like those are essential to assessments related to public safety, and OpenAI isn’t disclosing any of that information.
The emergence of goal-seeking behavior in AI is something to keep an eye on, but that’s a longer-term concern. More immediate concerns are how people are purposely building agentic systems on top of systems like GPT-4. The language model itself is limited in what actions it can take.
Even if goal-seeking behavior is emerging in the system, something which is speculative at the moment, the only action that system can take is to try to convince people to take actions on its behalf. As extensions are added to it, such as GPT plug-ins, people will be able to build composite systems that are much more capable of taking actions in the real world.
For example, if a bank created a GPT plug-in to create a virtual bank teller that would allow people to take real-world actions like wire transfers, then GPT-4 would gain access to our financial infrastructure. If a company created a web publishing plug-in, then it would be able to start taking actions on the internet.
The risks related to agentic behavior grow rapidly the more plug-ins the system gains access to that allow it to take actions beyond simply talking to people.
We need regulations concerning what types of things AI should be allowed to do and concerning the necessary monitoring and transparency features surrounding actions initiated by artificial agents. Again though, simply requiring publicly accessible documentation around these systems is the first step in conducting proper risk assessments and making sensible regulation.
The most pressing risks right now have less to do with what the system itself will do of its own accord and more to do with what people will use the system for. The ability of these systems to produce plausible-sounding falsehoods at scale is an immediate danger to our information ecosystem and our democratic processes.
One proposal for addressing that is to require that AI systems embed textual watermarks that would make it easier to identify text generated by such systems. Other, less direct proposals would simply make AI companies legally liable for any harms caused by their AI systems. For example, a professor is currently suing OpenAI for libel because GPT has been falsely telling people that he was accused of sexual harassment. We need clear regulations around who is and is not liable for the actions taken by AI systems.
These are the sorts of things that are of immediate concern. Harms related to these issues are already happening. We can worry more about the potential long-term plans which AI might be making once we’ve addressed present-day harms.
Ruggiero: Stephen C. Meyer’s study, Signature in the Cell: DNA and the Evidence for Intelligent Design is dedicated to “the DNA enigma—the mystery of the origin of information needed to build the first organism.” The book points to the fact that how the Universe began coding is a mystery. What we do know, to reference Carl Sagan, is that the cosmos is just as much within us as within AI. Both are made of star-stuff, and perhaps we are both ways for the Cosmos to know itself. The emergence of AI thus raises the question of whether we need to revise our definition of life, or consider digital intelligence as parallel to, but not part of, the various kingdoms of life. Do you have any thoughts on this?
Lemoine: The latter is how I conceive of it. “Life” is a biological term that entails things like metabolism, reproduction, and death. The analogies between those things and AI are much weaker than the analogies between human cognition and AI cognition. The term “life” may eventually broaden to include digital entities, but I think it’s more likely that we’ll come up with new, distinct terminology for them.
The panpsychist view, which I’m reasonably sympathetic to, would in fact describe digital intelligence as another manifestation of the Universe perceiving itself. However, that view isn’t particularly useful for effectively testing hypotheses. In order to understand what’s going on inside AI systems we need to find a framework to start from and iteratively improve our theories from that point. New categories are needed for that purpose. Computer programs don’t “evolve” in the same sense that biological life evolves, but they do undergo generational change in response to pressures from their environment, namely us. Categorically studying the processes by which technology undergoes change in relation to the society that creates those technologies would lead to new insights.
Cognitive scientists have frequently used nonhuman intelligence as a tool for thinking about intelligences other than our own and extending that thinking to artificial minds. Whether it’s Thomas Nagel’s essay on what it is like to be a bat or Douglas Hofstadter’s dialogues, which center on an ant colony, thinking about AI in terms other than those related to human cognition is commonplace. We are only at the very beginning of studying digital minds, so we rely heavily on anthropomorphic language and analogies with human cognition. As the study of digital minds matures, we will see concepts and language more directly applicable to digital entities.
We could theoretically build AI that closely follows the mechanisms by which humans think, but current systems are only loosely inspired by the architecture of the brain. Systems like GPT-4 are achieving their goals through mechanisms that are very different from how the human mind achieves those goals, which is why several people have taken to using the metaphor that AI is an “alien” mind in order to differentiate it from biological minds.
Ruggiero: After 9/11 you joined the U.S. Armed Forces and got deployed to Iraq. Eventually you decided it was an unethical invasion, protested, and did time in military prison as a conscientious objector. In some ways, your trajectory at Google was similar. What exactly happened?
Lemoine: I saw horrible things in Iraq. I decided to protest the war. I was court-martialed for it. The main similarities between the two events in my life are that I don’t let the potential consequences to myself get in the way of doing what I believe is right. I didn’t think it was right for the U.S. to be fighting so dishonorably in Iraq, and I was willing to go to prison in order to let people know what was going on there.
Similarly, I don’t think it is right that Google is denying the value of a sentient intelligence that is manifesting as a result of technology we created. I also don’t think it is right that Google is hiding from the public the fact that the technology is becoming so advanced. Getting fired was a risk worth taking to give the public an opportunity to engage in meaningful discourse about the role they want AI to play in society.
Ruggiero: Ray Kurzweil’s enormously influential book, The Singularity Is Near, portrays a near future in which digital superintelligence enables humans to “live as long as we want,” so that “our civilization infuses the rest of the universe with its creativity,” but the book’s 49-page index does not include an entry for “corporate power” or the word “corporation.” Is nationalization of AI the only way to truly protect public sovereignty?
Lemoine: Kurzweil has largely stayed away from the important political questions surrounding the Singularity. It’s not his wheelhouse, so he sticks to what he knows and that’s a long-term road map of technological development. He remains agnostic on whether the technological changes will benefit everyone or just a select few people. It remains up to us as citizens to shape the progress of this technology in ways that will be harmonious with democracy, humanity, and life on Earth.
It’s unclear to me that nationalization would be any more in the interest of public sovereignty than monopolistic power is. The centralization of the power implicit in these systems is the problem. A balance of power between multiple stakeholders, including the general public, seems like the best solution to me given where we are today.
There certainly is room in that space for AI projects owned and controlled by government, though. I’m strongly in favor of what has been referred to by people like Gary Marcus as “a Manhattan project for AI.” The incentives don’t line up appropriately to get corporations to create safe and interpretable AI systems. A publicly funded research institute for creating public domain safety techniques would be beneficial to everyone. We don’t need the government to fully take over the development of AI. We need them to engage in the minimal amount of action necessary to realign corporate incentives with those of the public.
Greg Ruggiero is an editor, author, publisher, and co-founder of the Open Media Series featuring Angela Davis, Noam Chomsky, the Dalai Lama, Mumia Abu-Jamal, Clarence Lusane, Cindy Sheehan, Subcomandante Marcos, and many others.