Digital people: biology versus silicon
Some questions about Holden Karnofsky’s allegedly conscious digital people
In his ‘Most Important Century’ blog post series, Holden Karnofsky considers a possible future in which advances in AI have led to a world full of ‘digital people’, as you can see in his ‘roadmap’ for the series.
Holden’s Digital People FAQ discusses the big questions about digital people. One such big question: would they be conscious? Holden argues that “sufficiently detailed and accurate simulations of humans would be conscious, to the same degree and for the same reasons that humans are conscious.” These simulations would be moral patients - they would experience conscious pain and pleasure just like we do, and would deserve moral consideration just like we do.
The question of whether a world full of digital people would in fact be a world full of conscious moral patients matters quite a lot.1 In this post, I take a closer look at Holden’s arguments for digital consciousness. I discuss:
(a) the ‘fading qualia’ argument, which Holden channels. Originally due to David Chalmers, the fading qualia argument is, curiously, one of the only explicit arguments for the possibility of machine consciousness in all of philosophy2
(b) recent work by Peter Godfrey-Smith arguing that silicon-based systems are unlikely to be conscious, because the details of biology and chemistry really do matter a lot to what functions the brain performs. This perspective complicates the ‘fading qualia’ argument.
(c) Holden’s ‘parity of reasoning’ argument for digital consciousness
(d) whether we should expect the future to be full of conscious human-like digital people, rather than more inhuman AIs
a) The fading qualia argument
By ‘digital people’, Holden means digital entities that (a) have moral value and (b) interact with their environments at human or super-human level. While this category could conceivably include relatively inhuman AI systems that are unlike us in many ways (see section d), Holden mostly discusses human-like digital entities: “digital people just like us, perhaps created via mind uploading (simulating human brains)”. It is these sorts of ‘digital people’ that he argues would be conscious.
Holden’s first argument for this claim is a version of David Chalmers’ “fading qualia” argument. Here’s Holden’s version, which follows Chalmers’ fairly closely:
Imagine one could somehow replace a neuron in my brain with a "digital neuron": an electrical device, made out of the same sorts of things today's computers are made out of instead of what my neurons are made out of, that recorded input from other neurons (perhaps using a camera to monitor the various signals they were sending) and sent output to them in exactly the same pattern as the old neuron.
If we did this, I wouldn't behave differently in any way, or have any way of "noticing" the difference.
Now imagine that one did the same to every other neuron in my brain, one by one - such that my brain ultimately contained only "digital neurons" connected to each other…I would still not behave differently in any way, or have any way of "noticing."
As you swapped out all the neurons, I would not notice the vividness of my thoughts dimming. Reasoning: if I did notice the vividness of my thoughts dimming, the "noticing" would affect me in ways that could ultimately change my behavior. For example, I might remark on the vividness of my thoughts dimming. But we've already specified that nothing about the inputs and outputs of my brain change, which means nothing about my behavior could change.
Now imagine that one could remove the set of interconnected "digital neurons" from my head, and feed in similar input signals and output signals directly (instead of via my eyes/ears/etc.). This would be a digital version of me: a simulation of my brain, running on a computer. And at no point would I have noticed anything changing - no diminished consciousness, no muted feelings, etc. (emphasis mine; and I’ve removed some details about input signals which don’t affect the ensuing discussion)
What exactly is the argument here? Here is how it is supposed to work, as I understand it: as we reflect on this tale of transition-via-replacement from Bio-Holden to Digital-Holden, we see a reductio ad absurdum of the view that Digital-Holden is not conscious:
Assume (for reductio) that Digital-Holden is not conscious.
If Digital-Holden is not conscious, then consciousness vanished during the transition from Bio-Holden to Digital-Holden (either suddenly at some point, or in a gradual ‘fading’ out that extends throughout the process).
If consciousness vanished during the transition from Bio-Holden to Digital-Holden (whether suddenly or gradually), then the system was not able to ‘notice’ this change or report it. After all, ‘noticing’ and ‘reporting’ are behaviors, and by stipulation the neuron replacement process leaves behavior unchanged (“nothing about the inputs and outputs of my brain change, which means nothing about my behavior could change”).
If consciousness vanished and the system was not able to ‘notice’ this change or report it, then there is a bizarre dissociation between the system’s conscious experience (which changes radically) and its normal and human-like behavior (which remains the same).
But such a radical dissociation is not plausible, the argument goes—we have no reason to think that such dissociations would occur. The upshot is that we should reject the assumption that led to this absurdity: we should hold that Digital-Holden is in fact conscious, just like Bio-Holden.3 Both behavior and consciousness have been preserved throughout the process of replacement.
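Laid out more schematically (the labels below are mine, not Holden’s or Chalmers’), writing C_D for ‘Digital-Holden is conscious’, V for ‘consciousness vanished during the transition’, N for ‘the system can notice and report the change’, and D for ‘there is a radical dissociation between experience and behavior’:

\begin{align*}
&(1)\quad \neg C_D && \text{(assumption, for reductio)}\\
&(2)\quad \neg C_D \rightarrow V && \text{(if Digital-Holden isn't conscious, consciousness vanished along the way)}\\
&(3)\quad V \rightarrow \neg N && \text{(behavior is held fixed, so the vanishing can't be noticed or reported)}\\
&(4)\quad (V \wedge \neg N) \rightarrow D && \text{(an unnoticed, unreportable loss of consciousness is a radical dissociation)}\\
&(5)\quad \neg D && \text{(such dissociations are implausible)}\\
&(6)\quad \therefore\ C_D && \text{(from (1)–(5), by rejecting the assumption in (1))}
\end{align*}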
b) Fading qualia and fine-grained functionalism
If successful, the fading qualia argument shows that consciousness can exist in a silicon-based system. But note that the argument only shows that consciousness can exist in one very specific kind of silicon-based system: a system that has the exact same functionality and behavior as the original biology-based system.
But - could there be such a system? Is it actually possible to transition from Bio-Holden to Digital-Holden, as described in the thought experiment, in a way that preserves the exact same functionality and behavior? Maybe you feel like you can dimly imagine a replacement silicon neuron that does just as good a job as a biological neuron. But can you really? Would that actually work?
In his paper “Mind, Matter, and Metabolism”, Peter Godfrey-Smith argues that the chemical and biological features of the brain aren’t just ‘implementation details’ that can be replaced with silicon willy-nilly, but actually matter for important aspects of cognition and subjective experience. Neurons are not just input-output devices that could be straightforwardly replaced with bits of silicon without really messing up important dynamics of mind/brain function. Moreover, there are many non-neuronal processes that affect the brain, whether by affecting neurons or in their own right—blood flow, neurotransmitters, the operations of glial cells.
As a consequence, Godfrey-Smith thinks that the gradual replacement that we’re asked to imagine in the ‘fading qualia’ argument is simply impossible. In a talk related to the paper, he says,
Are the scenarios then possible? In the fading qualia case I think clearly no. These arguments were invented when our picture of neural activity was different. Neurons were seen as switching devices. All a neuron does is fire and influence others in a switching network (and alter its sensitivity and output "weights" as they figure in the network). In fact neurons do much more; they participate in the diffusion of small molecules through the system, are affected by blood flow, have their activities modulated by all the events that affects gene regulation inside them. They are living cells….the gradual introduction of nonliving elements doing all these things into an intact living system without behavioral consequences is more of a fantasy.
And in the paper he writes,
Chalmers imagines that as the replacement is done, the agent retains the same behavioral dispositions, so it is strange to suppose that they might lose their qualia. I reply that as the replacement is done, not only do their insides work more and more differently, they must behave more and more differently as well. The new agent is a quite different system. Nothing compels us to believe they have the kind of experience characteristic of human life.
Note that Godfrey-Smith is not asserting that there’s something ‘magical’ about the biological substrate. He’s not advocating for the ‘biological naturalism’ about consciousness that Holden dismisses as a minority/fringe view. Instead, Godfrey-Smith is advocating for a kind of ‘fine-grained’ functionalism. He’s arguing that biological features matter precisely because biology is uniquely suited to implement important functions, in a way that silicon cannot: “My claim is not that nonbiological materials that do all the same things might not count...Rather, the usual candidates offered as a nonbiological basis for mentality will not do the same things. They will be functionally different, not merely different in ‘hardware’ or ‘make-up.’”
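To make the contrast concrete, here is a toy sketch (my own illustration, not anything from Godfrey-Smith or Holden) of the ‘switching device’ picture of a neuron that the fading qualia thought experiment leans on, next to a caricature of the richer, living-cell picture. The names and update rules are made up; the point is just that on the second picture, the ‘function’ a neuron computes depends on much more than its synaptic inputs.

```python
# The classic abstraction: a neuron is just a weighted sum plus a threshold.
def switching_neuron(inputs, weights, threshold=1.0):
    activation = sum(x * w for x, w in zip(inputs, weights))
    return 1 if activation >= threshold else 0

# A caricature of the richer picture: the same cell is a living system whose
# output also depends on slower, non-synaptic variables (neuromodulators,
# metabolic supply, gene regulation), and whose activity feeds back on them.
def living_neuron(inputs, weights, cell_state, threshold=1.0):
    modulation = cell_state["neuromodulators"] * cell_state["blood_supply"]
    activation = modulation * sum(x * w for x, w in zip(inputs, weights))
    cell_state["gene_expression"] += 0.01 * activation  # slow feedback on the cell itself
    return 1 if activation >= threshold else 0
```

A ‘digital neuron’ that reproduces the first function would match the second only for as long as the slower variables happened to sit still—which, Godfrey-Smith is pointing out, they don’t.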
Simulations versus replacements
Does this mean that no silicon-based entities could be conscious in the way that humans are? Not necessarily - it just means that Holden can’t rest much of his argument on the supposed consciousness of his imagined silicon-based replacement for a biological brain.
But in any case, Holden is not necessarily discussing silicon-based replacements of the brain; he is discussing silicon-based simulations of the brain. Godfrey-Smith himself admits that full simulation of the brain is in principle possible, even if replacement is not: “Those activities might be modeled – simulated – in a system built from scratch”.
As people who discuss whole brain emulation have noted, it seems like you could in principle build something out of silicon with the same functionality as the brain: just simulate things at as fine-grained a level of detail as is needed, whether that means going down to the chemical and biological details that Godfrey-Smith argues are important, or even deeper down.4 Simulation is different from replacement; instead of having silicon neurons that abstract away from the fine-grained details of biological neurons, you use silicon to run a computer simulation of biological-style neurons in all of their richness, along with whatever else you need—neurotransmitters and blood cells and glia and so on.
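As a purely illustrative sketch (everything here is a placeholder of my own devising, not a real emulation architecture), the difference is roughly this: rather than swapping in abstracted input-output neurons, a simulation carries explicit state for the extra processes Godfrey-Smith highlights and updates all of them at every time step.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000                                    # number of simulated neurons (toy scale)
voltage = np.zeros(n)                       # membrane potentials
weights = rng.normal(0, 0.05, size=(n, n))  # synaptic weights
neuromodulator = np.ones(n)                 # a diffuse modulatory chemical
blood_flow = np.ones(n)                     # local metabolic supply

def step(external_input, dt=1e-3):
    """Advance the whole simulated system by one time step."""
    global voltage, neuromodulator, blood_flow
    spikes = (voltage > 1.0).astype(float)
    # Neurons are driven by synaptic input, but scaled by chemistry and metabolism
    voltage = (voltage * (1 - spikes)                                # reset cells that spiked
               + dt * (weights @ spikes) * neuromodulator * blood_flow
               + dt * external_input)
    # Non-neuronal processes get their own dynamics instead of being abstracted away
    neuromodulator += dt * ((1.0 - neuromodulator) + 0.1 * spikes.mean())
    blood_flow += dt * (spikes.mean() - (blood_flow - 1.0))
    return spikes

for t in range(100):
    step(external_input=rng.normal(0, 0.1, size=n))
```

In a real emulation, each of these crude update rules would be replaced by whatever level of biophysical detail turns out to be needed; the structural point is just that nothing has to be abstracted away.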
c) The ‘parity of reasoning’ argument
So a simulation seems feasible, in a way that Holden’s silicon replacement was not. That said, it’s harder to see how there could be a transition scenario from a human brain to this sort of simulated system. So it seems we can’t appeal to a fading qualia type argument to argue for the consciousness of simulations.
That said, I think we can just appeal directly to the fact of functional equivalence, and argue that equivalence of consciousness probably follows as well. I think that is more or less what Holden is doing with his second argument, which I call the ‘parity of reasoning’ argument:
If I asked my digital copy whether he's conscious, he would insist that he is (just as I would in response to the same question). If I explained and demonstrated his situation (e.g., that he's "virtual") and asked whether he still thinks he's conscious, he would continue to insist that he is (just as I would, if I went through the experience of being shown that I was being simulated on some computer - something my current observations can't rule out)...If a reasoning process that works just like mine, with access to all the same facts I have access to, is convinced of "digital-Holden is conscious," what rational basis could I have for thinking this is wrong?
Note that this argument relies on full functional equivalence—that’s what allows Holden to say that the reasoning process “works just like mine”. If we are talking about a simulation, then almost by definition of (successful) ‘simulation’, we have full functional equivalence.
So the ‘parity of reasoning’ argument does seem to succeed where the fading qualia argument failed; detailed simulations seem like they would be conscious in the relevant way.
d) Why create human-like digital people?
[This last section is some rather free-form speculation; I haven’t scrutinized the arguments that follow, or checked what replies someone like Robin Hanson (or even Holden himself) would have to these lines of reasoning.]
On Holden’s view, human-like digital people come on the scene after we already have pretty powerful AI. That’s because sufficiently detailed and accurate simulations are nowhere near possible with today’s technology—but might be possible after we’ve had scientific progress driven by transformative AI.
But if we already have AI that is advanced enough to drive rapid scientific progress, why bother with these complicated simulations—especially if, as I’ve argued, they are likely to involve lots of compute to simulate messy neurons and glial cells and neurotransmitters?
True, we might create some human-like digital people because we want to (for immortality), or maybe (as Holden points out) because they could be used for social science. But it seems unlikely to me that digital people would be competitive with AIs for performing economically valuable tasks.
In particular, given what I’ve argued above, simulating humans at a high enough level of detail to ensure consciousness may require a lot of compute, since it means simulating a lot of biology. Building these systems will probably not be the best way to get an executive assistant to send emails for you. In contrast, ‘inhuman’ AIs that do things in different ways from us are presumably competent by this point, since they have already been helping us (e.g.) invent neuroscience tools. These inhuman AIs can learn (as they currently do) to perform tasks well using their own unique strengths and architectures, not bothering to hew too closely to human-likeness. I’d expect inhuman AIs to be more efficient than human-like digital people at many important tasks. And these more inhuman AIs are more plausibly not conscious—though of course it’s hard to say anything very definite about sentience in AI systems—and thus not ‘digital people’ according to Holden’s definition.5
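For a very rough sense of the gap (the specific numbers below are my own order-of-magnitude assumptions, not figures from Holden or Godfrey-Smith), each extra layer of biological detail multiplies the compute bill:

```python
# Back-of-envelope arithmetic; every number here is an assumed round figure.
neurons  = 8.6e10   # commonly cited estimate of neurons in a human brain
synapses = 1e14     # order-of-magnitude estimate of synapses
avg_rate = 1        # assumed average firing rate, spikes per second
ops_per_synaptic_event = 10   # assumed ops to update one synapse per spike

spiking_level = synapses * avg_rate * ops_per_synaptic_event
print(f"spiking-network level: ~{spiking_level:.0e} ops/sec")   # ~1e15

# Suppose the consciousness-relevant functions also require updating chemistry,
# gene regulation, glia, and blood flow around each neuron every millisecond,
# at an assumed 1e5 extra ops per neuron per update.
detailed_level = spiking_level + neurons * 1e3 * 1e5
print(f"with cellular detail:  ~{detailed_level:.0e} ops/sec")  # roughly 1e19
```

Under these made-up assumptions the biologically detailed version costs several orders of magnitude more, which is the kind of gap that would make a less faithful AI the obvious choice for mundane work.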
In short: if we have powerful AIs that have enabled simulations in the first place, we will probably also have powerful AIs that can answer phone calls for us; no need for simulations for that job.
Shulman, C., & Bostrom, N. (2021). Sharing the world with digital minds. Rethinking Moral Status, 306-326.
Richard Brown writes, “Perhaps the best (only?) argument for computationalism are David Chalmers’ Dancing Qualia and Fading Qualia arguments”
All this argument establishes is that Digital-Holden is conscious - it leaves open whether the character of his conscious experience is the same as Biological-Holden’s. A related argument, the ‘dancing qualia’ argument, is meant to show that the character of experience is also preserved. But I won’t be discussing that—the issues with that argument are somewhat similar to the ones I’ll discuss with the ‘fading qualia’ argument.
Bostrom in Superintelligence: “with sufficiently advanced scanning technology and abundant computing power, it might be possible to brute-force an emulation even with a fairly limited understanding of the brain. In the unrealistic limiting case, we could imagine emulating a brain at the level of its elementary particles using the quantum mechanical Schrödinger equation. Then one could rely entirely on existing knowledge of physics and not at all on any biological model.”
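For concreteness (my gloss, not Bostrom’s): this limiting case would mean evolving the many-body wavefunction of every particle in the brain under the time-dependent Schrödinger equation,

$$ i\hbar\,\frac{\partial}{\partial t}\,\Psi(\mathbf{r}_1,\dots,\mathbf{r}_N,t) \;=\; \hat{H}\,\Psi(\mathbf{r}_1,\dots,\mathbf{r}_N,t), $$

where the wavefunction depends on the coordinates of all N particles at once; representing it exactly takes resources that grow exponentially in N, which is why Bostrom calls the limiting case unrealistic.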
Holden acknowledges these possibilities: “I think it's reasonably likely that by the time digital people are possible (or pretty soon afterward), they will be quite different from today's humans.”
I like calling the subject “digital people.” Person-hood opens up a lot for discussion.
It feels like there is some paradox hidden in the argument for simulation. Isn’t any actual simulation, in practice if not by definition, only an approximation of that which is simulated? As in a sufficiently detailed map is NOT a map but necessarily IS the territory. This point might be illuminated by philosophical theories about the nature of identity, uniqueness, and sameness. I’ve tried, but can’t read that stuff.
I think that simulation has another big problem beyond those of substrate independence or level of approximation. Simulated conscious entities could not have the experiences and behavior of a person unless they also had a simulated body in a dynamic environment. We could simulate a body, and simulate experiences and behavior, but that would occur in some framework that depended on the outside, real world to be simulating itself for the sake of the simulated conscious entity.
It seems like this idea of a perfect simulation of a human mind requires a perhaps infeasible computation. I.e., you need a functional-equivalence simulation of a brain and a simulated body in a simulated world that itself simulates, as “sensory input” to the sim-body, the entire outside world. Karnofsky (in his Digital People FAQ) doesn’t see this as a problem: you just do all these computations to the level of detail needed for the digital people to have a functional life. That’s like saying that I would not go insane if I were suddenly plunged into a world of low sensory fidelity, limited info about the outside world, and drastically limited agency for a toy body.
If I had the time, I would plow through Robin Hanson’s “The Age of Em” to see how he handles this can of worms. Reviews have failed to tell me how his Em(ulation)’s world works.
However, I did read Neal Stephenson’s “Fall, or Dodge in Hell.” He imagines a way that a population of emulated mind/bodies might incrementally create their own world, provided we support them with enough compute power in ours. This allows him to make plausible suggestions about social relations among emulations, and what any connection to our world would be like.