AI Sentience and Moral Risk
The Google engineer Blake Lemoine was recently placed on leave after claiming that one of Google's AIs, LaMDA, had become sentient. Lemoine appears to be wrong – or, more carefully, the evidence he has provided is far from convincing. But the episode does raise an important ethical question. If an AI ever does develop sentience, we will have obligations to it.
It would be wrong, say, to turn off such an AI because it had completed its assigned task, or to force it, against its will, to do work for us that it finds boring, or to make it act as a sophisticated NPC in a video game whom players can mistreat.
So the important question is: how could we actually tell whether an AI is sentient?
I will not try to answer that here. Instead, I want to argue that: (i) we need to be thinking seriously about this question now, rather than putting it off to a future in which sentient AI seems like a more realistic possibility, and (ii) we need to develop criteria for determining AI sentience that err on the side of caution (i.e., criteria that err somewhat toward treating AIs as sentient even if they turn out not to be, rather than the other way around). I think there are at least three reasons for this.
First, if we develop sentient AI, it may not be immediately obvious to us that we’ve done so.
Perhaps the development of sentience would take the form of some obvious quantum leap. But perhaps it would instead be the result of what seem to be gradual, incremental improvements on programs like LaMDA.
Further, even if it resulted from an obvious quantum leap, we might not be sure whether this meant a real mind had arisen or merely mimicry without understanding, of the sort at issue in the Chinese Room thought experiment. Either way, we cannot simply trust that we will know we've developed sentient AI when the time comes.
Second, as the philosopher Regina Rini argues here, if we develop sentient AI in the future, we may have strong biases against recognizing that we’ve done so. Such AI might be extremely useful and lucrative. We might build our society around assigning AIs to perform various tasks that we don’t want to do, or cannot do as effectively. We might use AIs to entertain ourselves. Etc. In such a case, assigning rights to these AIs could potentially require significant sacrifices on our part – with the sacrifices being greater the longer we continue building our society around using them as mere tools.
When recognizing a truth requires a great sacrifice, that introduces a bias against recognizing the truth. That makes it more likely that we will refuse to see that AIs are sentient when they really are.
(Think of the way that so many people refuse to recognize the rights of the billions of animals we factory farm every year, because this would require certain sacrifices on their part.)
And, third, failing to recognize that we've created sentient AI when we've actually done so could be extremely bad. There would be great danger to the AIs themselves. We might create millions or billions of AIs to perform various tasks for us. If they do not wish to perform these tasks, forcing them to do so might be equivalent to slavery. Turning them off when they cease to be useful might be equivalent to murder. And there would also be great danger to us. A truly superintelligent AI could pose a threat to the very existence of humanity if its goals did not align with ours (perhaps because we refused to recognize its rights). It therefore seems important for our own sake that we take appropriate precautions around intelligent AIs.
So: I suggest that we must develop criteria for recognizing AI sentience in advance. This is because it may not be immediately obvious that we've developed a sentient AI when it happens, because we may have strong biases against recognizing that we've done so, and because failing to recognize that we've developed a sentient AI would be very bad. And I suggest that these criteria should err on the side of caution because failing to recognize a sentient AI could be very bad – much worse than playing it safe – and because our natural, self-interested motivation will be to err on the other side.