Psychology’s favourite moral thought experiment doesn’t predict real-world behaviour

By Christian Jarrett

Would you wilfully hurt or kill one person so as to save multiple others? That’s the dilemma at the heart of moral psychology’s favourite thought experiment and its derivatives. In the classic case, you must decide whether or not to pull a lever to divert a runaway mining trolley so that it avoids killing five people and instead kills a single individual on another line. A popular theory in the field states that, to many of us, so abhorrent is the notion of deliberately harming someone that our “deontological” instincts deter us from pulling the lever; on the other hand, the more we intellectualise the problem with cool detachment, the more likely we are to make a utilitarian or consequentialist judgment and divert the trolley.

Armed with thought experiments of this kind, psychologists have examined all manner of individual and circumstantial factors that influence the likelihood of people making deontological vs. utilitarian moral decisions. However, there’s a fatal (excuse the pun) problem. A striking new paper in Psychological Science finds that our answers to the thought experiments don’t match up with our real-world moral decisions.

Dries Bostyn and his colleagues at Ghent University recruited nearly 300 participants. All answered several hypothetical moral dilemmas derived from the classic trolley dilemma – for instance, in a building on fire, they had to say whether they would push a man through a locked window to his death in order to make an exit for the five children trapped inside. The participants also completed several questionnaires tapping psychological factors, such as psychopathy and “need for cognition”, previously identified as being associated with being more utilitarian in one’s moral decisions.

A fortnight later, just under 200 of the participants were invited to the psych lab, one at a time, to take part in a real-life moral dilemma involving live mice. The participants saw two cages – one housing one mouse, the other housing five – each wired to an electroshock machine. They were told that in 20 seconds, if they did nothing, the machine would deliver a very painful but nonlethal shock to the cage containing five mice. However, if the participants pressed a button in front of them, they could divert the electric shock to the cage containing one mouse, thus saving the other five from pain (in actuality this was an illusion and all participants were later informed that in fact no mice were shocked or harmed in the study).

The remaining participants went to the psych lab but performed a hypothetical version of the mouse decision. They heard a description of the same two-cage set-up faced by the others and they had to say whether they would press the button or not.

The participants who performed the real-life mouse task behaved differently than those who made a purely hypothetical decision – they were less than half as likely to let the five mice get shocked (16 per cent of them left the button unpressed compared with 34 per cent of the hypothetical group). In other words, faced with a real-life dilemma, the volunteers were more consequentialist / utilitarian; that is, more willing to inflict harm for the greater good.

But the most important finding – at least for the validity of moral psychology which so often relies on thought experiments – is that the participants’ preference for deontological vs. utilitarian responding in their answers to the earlier battery of 10 hypothetical moral dilemmas bore no relation to their decision in the real-life mouse task (in contrast, the decisions of participants in the hypothetical mouse group were related to their answers to the earlier moral dilemmas). What is more, none of the psychological factors, such as psychopathy or need for cognition, were related to decision-making in the real-life moral dilemma.

For so long, moral psychology has relied on the notion that you can extrapolate from people’s decisions in hypothetical thought experiments to infer something meaningful about how they would behave morally in the real world. These new findings challenge that core assumption of the field.

That is not to say people’s hypothetical decisions are meaningless. Although participants’ responses to the earlier moral thought experiments did not predict their later real moral decisions (i.e. whether or not to press the button to divert the electric charge), they were not totally unrelated. Among those who pressed the button in the real-life task, if they’d also earlier shown a preference for utilitarian decisions in the thought experiments then they tended to press the button more quickly; they also expressed less doubt and discomfort about their decision.

An obvious criticism of this research is that the trolley problem and its derivatives involve humans, whereas the real-life moral dilemma used in this study involved mice. However, the researchers believe this is not a critical issue since the moral conflicts (deliberately harming the few to save the many) are the same in both cases. They also note that they used a questionnaire to measure their participants’ levels of empathy for animals, and how participants scored made no difference to the pattern of findings (meaning it’s unlikely that participants’ levels of concern or not for the mice explain the results).

Bostyn and his team don’t know why people’s judgments on the moral thought experiments didn’t predict their choice in the real-life moral task. Current theory – based on the idea that emotional responding leads to more deontological decisions and rational thinking to more utilitarian decisions – isn’t much help because it would actually predict more deontological decisions in the more vivid and emotive real-life task, which is the opposite of what was found. The researchers speculate that perhaps people are more inclined to virtue-signal when answering in the hypothetical (i.e. signalling that they couldn’t possibly choose to deliberately harm another, even to save the majority), but one could just as easily make this case for the very opposite results.

“Future research will have to investigate these and other possibilities,” the researchers concluded. “… [W]e advance the argument that we will be able to bridge the gap between moral judgment and moral behaviour only by exploring new research paradigms that bring more decision making into the real world.”

Of Mice, Men, and Trolleys: Hypothetical Judgment Versus Real-Life Behavior in Trolley-Style Moral Dilemmas

Christian Jarrett (@Psych_Writer) is Editor of BPS Research Digest

13 thoughts on “Psychology’s favourite moral thought experiment doesn’t predict real-world behaviour”

  1. New research? Hardly. Read Richard Dawkins’s book The Magic of Reality and Desmond Morris’s The Naked Ape. These two books tell you all you will ever need to know about human behaviour and belief, based as they are on the greatest experiment with the greatest number of participants ever conducted. It’s called real life.

  2. If the researchers really believe that we make the same decisions about mice as humans, they have a great deal to learn about ‘real life’. The mouse decision (for anybody seriously naive enough to believe that it wasn’t a fake situation) was without consequence (to the human). The human decision (in real life) could end you up on a charge of murder, and would certainly entail a great deal of unpleasantness and expense at the very least.

  3. This is not new in any way; I’m honestly surprised that some scientists somewhere would believe that. Ethics deals with what you should do rather than with what you do. It’s hardly surprising that we do not live up to even our own moral standards.

  4. The idea that mice and humans are in any way comparable is laughable. In addition to the legal issue mentioned above, we also need to add the question of dealing with the irrational grief of relatives.

    Plus, pain, even severe, is hugely different from killing (my guess is that many more people would choose utilitarian for non-lethal pain; death is UNIQUE in its finality). Obviously in the bad olden days we could actually kill the mice…

    But most importantly, mice are EXISTENTIALLY different from people. So whereas causing pain might be fairly transferable between species, killing isn’t. Each human is a unique world, and on some level actively killing him is infinitely bad, and I doubt many people think like that of mice.

    I always wanted to see results of this experiment not for 1 vs. 5 (or 1 vs. many) but for fewer vs. more, e.g. 10 vs. 50. I wonder if they are the same. If my idea is right, 1 vs. More would activate thinking of the uniqueness of each human, while Fewer vs. More would lead to much more utilitarian choices.

  5. The trolley thought experiment is an entirely false condition and has no bearing whatsoever on people’s actual behaviour. The conclusions for various behaviours in the first paragraph are condescending, ill-considered and wrong. The only moral dilemma is how professionals could be so stupid in the first place…

    One of two actual conditions occurs: either a subject treats the question like a joke or game and makes a choice based on that, or they treat it like a real condition in which not intervening is the only answer possible, and here’s why.

    Remember, we are treating this like a real condition. Firstly, there is no way in the world that a person can know that the five people will die. There is no source of information that is that certain or reliable and no way of judging that for yourself. How sure are you of decisions made on the spur of the moment requiring a life or death decision?

    Second, if you do nothing then you are blameless; the result has nothing to do with you. You give police a witness statement and then you can leave. If, however, you interfere with a railway switch to divert a train then you are guilty of a serious crime regardless of the outcome and you are also responsible for the outcome. You will be arrested (there is no doubt about this, and in no country would you be hailed as a hero; you are responsible for one death).

    Third, you now have to attend court as a defendant in manslaughter or murder (in most countries there is no category for the self-defence of other people; murder is only allowable if your life is threatened, not a stranger’s life in danger from other strangers outside of war conditions). In the case of tossing the fat man, you will be asked how you knew the fat man would stop the trolley and how you knew the five people would otherwise die. This is, in the scenario given, impossible for you to prove. People admitted to psychiatric wards relate such stories and such surety of outcomes; indeed, such surety of unknowable outcomes is a very recognisable symptom of conditions such as schizophrenia.

    Any prosecutor worth their salt would have you slapped with a life sentence. There are no loopholes you can exploit here, apart from the insanity you have demonstrated by your actions and your unwavering belief in the magical knowledge of the outcome you claim to possess.

    Most people do not think this through so carefully but they are aware that getting involved has consequences and that being responsible for one death will have very serious consequences requiring months of legal proceedings. People will choose whether to become involved or not, knowing the dire consequences that becoming involved can entail.

    Only those people who think of the question as a game not to be taken seriously will choose to kill one person. If it really were a question of whether one person should die or five then nobody would choose five. For instance, if you were driving down a street and the brakes failed, there were five people down the road who couldn’t hear your electric car, and you could turn off the highway where there was only one person, then everybody would choose to steer off the road. You are involved and responsible either way and so those two variables are controlled for in that scenario, but not in those childishly ridiculous and naïve train and trolley examples. Surely professionals can see the flaw in these antiquated failures??

    Note: I have made this case repeatedly in my Facebook groups ‘Psychiatry and Clinical Psychology’ (167,000 members) and ‘Evolutionary Psychology’ (50,000 members) and other groups I run.

  6. Perhaps real-life decisions where a few soldiers are sent by a military leader to attack a gun emplacement, knowing that they are unlikely to survive, for the greater good of a larger number of soldiers about to be committed to an impending battle, might open up this debate. There are clear comparisons in the decision making. Plenty of historical examples no doubt. Larger numbers involved. No doubt plenty of differences to be argued too.

  7. This is challenging research, and in parallel process with the original trolley experiments, I feel a certain discomfort about the results, and seek ways to justify holding on to my own sense of coherence between my beliefs and actions.

    One difference between a hypothetical thought experiment and an actual situation is context. As has been pointed out, in the thought experiment, there are no considerations of e.g. legal consequences, or the specific details of the characters involved, etc. Would it matter if the one was black, or a child, or if the 5 wore hoodies, or were drinking, etc.?

    So, in the experiment, how could you control for the possibility that the grouped mice may have been happy, and the solitary one miserable? It is the cumulative effects of these multiple tiny influences on us that make every situation unique. That’s something no thought experiment, because it is an abstraction, can ever really capture.

    However, it may be precisely because of that absence of confounding variables that thought experiments find their usefulness.

  8. “based on the idea that emotional responding leads to more deontological decisions and rational thinking to more utilitarian decisions” – I will free the mice.
