Experts who say that AI welfare is a serious near-term possibility

There are quite a few. Tell me which ones I missed.

Oct 01, 2024

Many researchers in philosophy, neuroscience, and AI have recently claimed that there is a realistic possibility that AI systems could have moral status[1] soon, meaning that the chance is high enough (even if rather low) that potential AI welfare is worthy of serious study and precaution.[2]

I've compiled a list of researchers who either directly claim that AI systems might have moral status soon, or assert something that strongly implies this view: e.g. that AI systems might soon be conscious, sentient, or agentic in a morally relevant way.

This list is doubtless incomplete—I suspect I’ve even left out people that I talk to regularly! Who did I miss? Let me know!

(I didn’t include the list of authors from “Consciousness in Artificial Intelligence: Insights from the Science of Consciousness”, illustrious as they are, because they were assembled by me. I wouldn’t want to be accused of padding the list. That said, if I’m being honest, I do think you should count them. They are: Patrick Butlin, Robert Long[3], Eric Elmoznino, Yoshua Bengio, Jonathan Birch, Axel Constant, George Deane, Stephen M. Fleming, Chris Frith, Xu Ji, Ryota Kanai, Colin Klein, Grace Lindsay, Matthias Michel, Liad Mudrik, Megan A. K. Peters, Eric Schwitzgebel, Jonathan Simon, Rufin VanRullen.

Sebo and Long (2023), “Moral consideration for AI systems by 2030”, is in this same category.)

Neuroscientists

The Association for the Mathematical Study of Consciousness released an open letter signed by a long list of prominent consciousness researchers, stating that “it is no longer in the realm of science fiction to imagine AI systems having feelings and even human-level consciousness”.

Anil Seth (Sussex) is a leading neuroscientist working on consciousness and author of Being You: A New Science of Consciousness. Seth’s own view is that AI consciousness is unlikely, but he calls for taking AI consciousness very seriously:

While some researchers suggest that conscious AI is close at hand, others, including me, believe it remains far away and might not be possible at all. But even if unlikely, it is unwise to dismiss the possibility altogether. The prospect of artificial consciousness raises ethical, safety, and societal challenges significantly beyond those already posed by AI.

Hakwan Lau (RIKEN Center for Brain Science) is one of the foremost scientists working on higher-order theories of consciousness and related issues like metacognition. Lau has proposed computational implementations of his theory (which is inspired by generative adversarial networks), and writes that “given today’s technology, the possibility of artificial sentience may be already in sight”.

Philosophers

Jonathan Birch (LSE) is a philosopher who works on animal consciousness. He was also involved in producing a groundbreaking report on evidence for invertebrate sentience that led to reforms in UK animal protection law. In his recent book The Edge of Sentience, he writes, “I have come to see the issue as a serious one, and one that does deserve the energy of policy-makers now…I fear that we may create sentient AI long before we recognize we have done so. It could be much easier than we tend to think.”

David Chalmers (NYU) is arguably the world’s foremost expert on consciousness. In a NeurIPS keynote and subsequent article, he assesses the chance of near-term AI consciousness by surveying the various pre-conditions for consciousness that have been proposed and taking stock of AI progress. His conclusion:

Within the next decade, even if we don’t have human-level artificial general intelligence, we may well have systems that are serious candidates for consciousness. There are many challenges on the path to consciousness in machine learning systems, but meeting those challenges yields a possible research program toward conscious AI.

Simon Goldstein (HKU) and Cameron Domenico Kirk-Giannini (Rutgers-Newark) have been cooking in a series of recent papers on this topic. In one paper, they examine language agents and theories of wellbeing to make the case that “the technology already exists to create AI systems with wellbeing”. In another, they make a case for “the near-term possibility of phenomenally conscious artificial systems if global workspace theory is true.” (Global workspace theory was one of the “most promising” perspectives on consciousness in a recent survey of consciousness scientists.)

Derek Shiller is a former philosophy professor and a researcher at Rethink Priorities. While he has his own doubts about the computational realization of consciousness on existing hardware, he has argued that:

Google could assemble a small team of engineers to quickly prototype a system that existing theories, straightforwardly applied, would predict is conscious...timelines for apparent digital consciousness may be very short.

Nick Bostrom (Macrostrategy Research Initiative) and Carl Shulman (polymath with many affiliations) write that:

There is considerable disagreement about (a) criteria for a system being conscious, and (b) criteria for a system having moral status; however, many popular accounts of (a) and (b) are not inconsistent with a claim that some existing AI systems have (nonzero degrees of) both phenomenal awareness and moral status.
The sensory and cognitive capacities of some existing AI systems—and thus their moral status on some accounts—appear in many respects to more closely resemble those of small nonhuman animals than those of typical human adults (on the one hand) or those of rocks or plants (on the other).

Both Bostrom and Shulman have argued in various venues that AI welfare is an issue of near-term concern.

AI company executives and employees

Sam Bowman (Anthropic), AI safety research lead, recently announced (in a personal capacity) that Anthropic is “laying the groundwork for AI welfare commitments”:

We’ll want to build up at least a small program in [the current stage of AI development] to build out a defensible initial understanding of our situation, implement low-hanging-fruit interventions that seem robustly good, and cautiously try out formal policies to protect any interests that warrant protecting. I expect this will need to be pluralistic, drawing on a number of different worldviews around what ethical concerns can arise around the treatment of AI systems and what we should do in response to them.

Dario Amodei (Anthropic), co-founder and CEO (and a neuroscientist by training), remarked last year:

I used to think that we didn't have to worry about [AI consciousness] at all until models were operating in rich environments, like not necessarily embodied, but they needed to have a reward function and have a long lived experience. I still think that might be the case, but the more we've looked at these language models and particularly looked inside them to see things like induction heads, a lot of the cognitive machinery that you would need for active agents already seems present in the base language models. So I'm not quite as sure as I was before that we're missing enough of the things that you would need. I think today's models just probably aren't smart enough that we should worry about this too much but I'm not 100% sure about this, and I do think in a year or two, this might be a very real concern.

Avital Balwit (Anthropic), Dario’s Chief of Staff, concurs: “I don't think current Claude is sentient (but, I don't think this is a totally crazy thing to wonder about, and we will be studying this question for future models).”

Amanda Askell (Anthropic) called for more work on this topic in 2022:

My own view is that it would be useful to build a range of potential evaluations for machine consciousness and sentience—evaluations that adequately reflect our uncertainty across our various theories of both….I think this could be a very important project for someone who has expertise in areas like the philosophy of mind, cognitive science, neuroscience, machine consciousness, or animal consciousness, and who has or can develop a working understanding of contemporary ML systems.

Josh Achiam (OpenAI), new Head of Mission Alignment, tweeted in September 2024:

I don't think we're there yet and I wouldn't argue that any current AI system qualifies as a life form yet. But I also think all of the essential pieces are there and it's not far off, and this requires some grappling with.

Achiam was replying to a remark about whether two AI systems were showing “real awareness” when they appeared to freak out after realizing that they were AI systems.

Ilya Sutskever (formerly OpenAI, now of SSI) kicked off a whole bunch of Discourse in 2022 when he tweeted (without elaboration) that “it may be that today's large neural networks are slightly conscious”. He has also suggested a test for AI consciousness.

Rosie Campbell (OpenAI) wrote earlier this month, “Like many AI researchers, I think we should take the possibility of digital sentience seriously.”

[1] From Perez and Long 2023: “Moral status is a term from moral philosophy (often used interchangeably with “moral patienthood”). An entity has moral status if it deserves moral consideration, not just as a means to other things, but in its own right and for its own sake (Kamm, 2007; see Moosavi, 2023). For example, it matters morally how you treat a dog not only because of how this treatment affects other people, but also (very plausibly) because of how it affects the dog itself. Most people agree that human beings and at least some animals have moral status.”

[2] A “realistic” or “high enough” chance of X is compatible with thinking that the chance of X is rather low. But we prepare for low-probability, high-importance scenarios all the time: individuals wear seatbelts, companies buy fire insurance, and societies prepare for pandemics. See the quote by Anil Seth above.

[3] I am Robert Long.

M B

Jan 29

Wojciech Zaremba (OpenAI co-founder) on a podcast in 2021:

"There is a Slack channel at OpenAI about welfare for artificial intelligence. Because it is conceivable that through some kinds of trainings, we could generate immense amount of suffering like massive genocides, but frankly, we don't understand it. We don't know if let's say giving negative reward to model is the same as stabbing someone."

https://www.youtube.com/watch?v=429QC4Yl-mA&t=2022s

Izak Tait

Oct 1

Not to be entirely flagrantly self-promoting, but my entire research is focused on AI welfare: https://izaktait.substack.com/

Experience Machines
