AI systems as real-life thought experiments about moral status
Decoupling agency, sentience, and consciousness
[This post grew from some comments I gave at the Workshop on Animal and AI Consciousness put on by the NYU Mind, Ethics, and Policy Program, on Ali Ladak’s recent paper “What would qualify an artificial intelligence for moral standing?”]
When, if ever, will we decide that our interactions with AI systems matter, not only in how they affect us but also in how they affect the AI systems themselves?
An entity - a dog, a person, or even a robot - has moral standing when that entity “counts” directly in its own right. For example, if my family’s house catches fire, that matters only inasmuch as it affects them and me—i.e. it matters instrumentally, but not for the house’s sake. But if a dog catches fire, that matters in its own right, for the dog’s sake.
Three of the most commonly proposed grounds of moral standing are: phenomenal consciousness, sentience, and agency (desires, preferences, goals). Definitions for these terms vary, but for present purposes you can use the following:
Phenomenal consciousness: subjective experience; there is “something that it is like” to be that entity.
Sentience: a specific subset of phenomenal consciousness, subjective experiences with positive or negative valence. Pleasures like bodily pleasure and contentment have positive valence, and displeasures like pain or sadness have negative valence.
Agency: having desires or preferences, goals one is aiming to achieve.
In human beings, and plausibly in many non-human animals, these three features all come together, as a tightly interwoven package:
Sentience linked to agency: As you’ve no doubt noticed, your good or bad experiences (sentience) are usually accompanied by preferences and desires (agency): stubbing your toe feels bad and you don’t want this bad feeling; massages feel good and you want that good feeling. On the flip side, having your preferences and desires (agency) thwarted leads to conscious experiences of frustration, sadness, and anger (sentience); having your preferences and desires satisfied leads to conscious experiences of satisfaction, happiness, and contentment.
Consciousness linked to sentience: It’s plausible that every animal that has subjective experiences in general (phenomenal consciousness) has experiences of pleasure and pain in particular (sentience): pain and pleasure are ancient and crucial biological adaptations, so it’s plausible that whenever consciousness shows up in the tree of life, pleasure and pain are already there to be consciously experienced.
Because these features are usually tied together in this way, philosophers argue about which ones are actually necessary and/or sufficient for moral standing. As Ali Ladak (2023) points out, an entity with fairly obvious moral standing like a mouse, and an entity with an obvious lack of moral standing like a rock, represent cases where these features are all present or all absent. As a result it’s hard to locate exactly which feature(s) make the normative difference between the two entities—even though some philosophers such as Peter Singer (2011) have claimed that the key difference must be sentience:
The capacity for suffering and enjoying things is a prerequisite for having interests at all, a condition that must be satisfied before we can speak of interests in any meaningful way. It would be nonsense to say that it was not in the interests of a stone to be kicked along the road by a schoolboy. A stone does not have interests because it cannot suffer. Nothing that we can do to it could possibly make any difference to its welfare. A mouse, on the other hand, does have an interest in not being tormented, because mice will suffer if they are treated in this way.
Ladak points out that “the thought experiment doesn’t isolate the cause of this intuition as the difference in the entities’ capacities to suffer. While it is true that a rock cannot suffer and a mouse can, it is also true that a rock has no preferences or goals and a mouse does.”
To isolate the relevant difference, we can try to think of more exotic cases between rock and mouse. Various thought experiments ask the reader to consider hypothetical entities that decouple consciousness, sentience, and agency.
1. Consciousness and agency without sentience: Chalmers’s Vulcans
A Vulcan is, in Chalmers’s words, “a conscious creature that experiences no happiness, suffering, pleasure, pain or any positive or negative affective states”, but still has desires and preferences and (per “conscious”) complex experiences. Chalmers argues that Vulcans have moral standing; his intuition is that even without valenced experience, Vulcans matter.
2. Consciousness without agency or sentience: Chalmers’s “more extreme” Vulcans
A ‘more extreme’ Vulcan is, in Chalmers’s words, a Vulcan that is “indifferent to continuing to live or dying”, lacking agency as well. Chalmers holds that even these creatures still have some moral standing, because “more than affective consciousness and desire satisfaction matter” for moral standing.
3. Agency without consciousness or sentience: Kagan’s (2019) alien robots
Imagine that in the distant future we discover on another planet a civilization composed entirely of machines—robots, if you will—that have evolved naturally over the ages. Although made entirely of metal, they reproduce (via some appropriate mechanical process), and so they have families. They are also members of larger social groups—circles of friends, communities, and nations. They have culture (literature, art, music), and they have industry as well as politics. Interestingly enough, however, our best science reveals to us—correctly—that they are not sentient. Although they clearly display agency at a level comparable to our own, they lack qualitative experience: there is nothing that it feels like (‘on the inside’) to be one of these machines. But for all that, of course, they have goals and preferences, they have complex and sophisticated aims, they make plans and they act on them.
Imagine that you are an Earth scientist, eager to learn more about the makeup of these robots. So you capture a small one—very much against its protests—and you are about to cut it open to examine its insides, when another robot, its mother, comes racing up to you, desperately pleading with you to leave it alone. She begs you not to kill it, mixing angry assertions that you have no right to treat her child as though it were a mere thing, with emotional pleas to let it go before you harm it any further. Would it be wrong to dissect the child?
Kagan argues that it would clearly be wrong to dissect the robot child; agency alone, without consciousness or sentience, can be sufficient for moral standing.
Now, if we all had clear reactions to these cases, sharing the reactions of Chalmers and Kagan, it would be clear whether it is consciousness, sentience, and/or agency that we care about. However, philosophers are still deeply divided about the issue and have stubbornly different intuitions about the cases. For my part, I find it hard to have any strong reaction to these cases; I find myself dumbfounded. I think this is in part because, as noted above and by Kagan, “For creatures like us…pleasure and pain are so deeply intertwined with the other qualitative aspects of our mental life that we may find it difficult to fully imagine what it would be like to be such a being”. When I imagine Kagan’s robots I’m never sure that I am successfully “stripping away” consciousness; when I imagine Chalmers’s Vulcans I’m never sure that I am successfully “stripping away” sentience.
Were this dumbfounding merely a philosophical problem, it would be like other merely philosophical problems: of no consequence whatsoever. However, our lack of clarity about this issue could be decision-relevant very soon, if not now. AI systems could be thought experiments come to life: actual, not merely hypothetical, entities in which consciousness, sentience, and agency come apart.
Because AI systems have such a different form of intelligence from biological creatures, future AI systems may decouple features that usually come together in biological organisms. AI systems may occupy very different locations in the “space of possible minds”, in Aaron Sloman’s wonderful phrase—and in doing so, force us to decide how we feel about the moral standing of an actual non-sentient robot, or of an actual conscious AI system that lacks valence and desires. Instead of Vulcans, we might confront a GPT-7 that is conscious but experiences only exotic patterns of text and the feeling of comprehension, without any pleasure/pain or desires. Instead of Kagan’s alien robots, we might confront a descendant of RoboCat that is a complex agent with sophisticated desires and preferences, but no conscious experience whatsoever.
Instead of a philosophical problem, we will have a social and political problem. How will we treat these systems? How should we treat them? Will they matter? (Is there a determinate answer to whether they will matter?)
This means that determining the grounds of moral standing is an example of what Nick Bostrom has called “philosophy with a deadline”, that is, an instance where our wisdom must precede our technology. I’ve indicated that eliciting intuitions about thought experiments alone doesn’t seem very decisive. One potentially promising avenue for progress on this question could be to investigate why, psychologically, we have these intuitive reactions to consciousness, sentience, and agency. Knowing why we have these reactions might help us think about whether we endorse our reactions on reflection. The alternative to making progress on these questions will be that we collectively just wing it, making it up as we go while the world is populated with new and strange AI systems that we have no track record of relating to wisely.
I like this post.
I'm almost sure that Chalmers's Vulcans (agency + consciousness, but no sentience/valence) are an impossible combination, because the essence of consciousness is integration, and valence (as per https://direct.mit.edu/neco/article/33/2/398/95642/Deeply-Felt-Affect-The-Emergence-of-Valence-in) is too important a feature of the agent's behaviour not to be represented in the integrated consciousness. For example, we can afford not to be aware (in our phenomenal consciousness) of the workings of our guts and immune system exactly because we have no agentic control over them; they just work on their own.
Consciousness evolves/develops in service of agency. So, I think the development progression (and, therefore, the permissible combinations) will be:
1. There is only agency. ->
2. There is agency and basal consciousness, which represents nothing much apart from valence, i.e., sentience (cf. https://www.frontiersin.org/articles/10.3389/fpsyg.2018.02714/full). ->
3. There are all three things: agency, basal consciousness/sentience, and more complex access consciousness (integrating visual percepts, auditory percepts, thoughts, and other information).
It's also worth adding here that under minimal physicalism (https://academic.oup.com/nc/article/2021/2/niab013/6334115), a kind of panpsychism which basically says that (functional) representation _is_ consciousness (or awareness, which in minimal physicalism is a synonym of consciousness), sentience is trivialised: there would be very few systems in category 1 but not in category 2, because valence (in relation to one's own agency) is really quite easy to represent, and it's useful, so almost all agents will represent it, and therefore will be _conscious (aware)_ of it.
Or maybe this is my misunderstanding of the terms, and basal consciousness, or minimal physicalism-style awareness-as-consciousness, is "free of qualia" and therefore of suffering _even if valence is represented there_, while "true" valence, i.e., suffering, as well as other qualia (such as redness), only arises in the _dream_ which an agent continuously creates and plays out on its representative screen. Joscha Bach explains qualia in this way: a quale is a virtual quality inside a dream in which the "character" (which represents oneself) is also aware of its own awareness. The state of this dream (i.e., a "frame") is a complex object, so it's harder to represent than just the agent's valence, or video or audio percepts. Thus, under this conceptualisation, the ladder of "permissible" combinations becomes as follows:
1. Just agency
2. Agency + phenomenal/basal or even access consciousness, but without a reflectively aware character in this field of consciousness, which would be required for sentience.
3. Agency + consciousness + reflectively self-aware character within the field of consciousness, which will interpret valence as pleasure or suffering, i.e., will be sentient. (BTW, this leaves unresolved the question of whether the "character" or the "host" is sentient, or both.)
Again, it's possible in principle to imagine an agent that is phenomenally conscious, or even reflectively self-aware within its consciousness, but avoids representing valence and therefore will not suffer; in practice, though, representing valence is useful and not very difficult, so almost all evolved agents (including trained DNNs) will do it. Agents designed intelligently (top-down design) may be strategically deprived of this feature of their character's representation; that will make them less capable, but this capability gap may be offset on the level of supra-system design. Trained DNNs could also in principle be "surgically" modified so that they don't represent their own valence, as per https://arxiv.org/abs/2306.03819.
The most pressing question, therefore, is whether "pure agents" (category 1) have moral standing.
Here: https://www.youtube.com/watch?v=4Z8UPddh0e4&t=46m45s, Michael Levin talks not about "moral standing" but about "what is worthy of forming a spiritual bond with", which is not exactly "moral standing", but is arguably a _stronger_ qualification, so anything that qualifies for spiritual bonding should _definitely_ have moral standing. And to be "worthy of spiritual bonding with", agents should have two qualifications: (1) they should have a "shared fate", or "shared struggle for survival", with us; (2) they should have comparable cognitive light cones (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8988303/), i.e., be neither much shorter-sighted (like ants, whose cognitive light cones probably extend just seconds and meters) nor much farther-sighted (like Gaia, which "thinks" on the timescale of centuries at least, albeit its spatial "cone" is comparable to that of humans).
Thus, Levin suggests that neither consciousness nor sentience is in principle required for spiritual bonding with humans and, therefore, for moral standing in their eyes. Appropriately constructed robots without consciousness or sentience will qualify.
Even though, according to the above view, neither consciousness nor sentience is a _necessary_ qualification for moral standing, these might still be _sufficient_ qualifications. I.e., if ants are conscious and/or sentient, this might be a reason for them to have moral standing, even though they don't qualify for spiritual bonds with (most) people, per Levin.