In 2018 the results of a massive and audacious online experiment into human morality, published in the journal Nature, provided a window into our moral preferences and occasionally reflected badly on us, highlighting strange predilections regarding age and gender.
Now, researchers have challenged how accurately the research portrayed our choices and suggest we might be more morally egalitarian than previously thought.
Called the Moral Machine Experiment (MME), the original study, led by Edmond Awad from MIT, US, was designed to “explore the moral dilemmas faced by autonomous vehicles” by presenting people with a series of accident scenarios in which the vehicle finds itself unable to save everyone involved.
Participants visited the experiment’s website and were asked to choose what the car should do: who should live and who should die.
These are, essentially, complex and subtle versions of the ‘Trolley Problem’ first put forward in its contemporary form by the British philosopher Philippa Foot in 1967.
The idea of the experiment was to guide the moral choices made by machines by tying them to the real moral choices made by people in the same situations. Over the life of the experiment, the website collected 40 million decisions in ten languages from people in more than 200 nations.
The story those decisions told was a complex and sometimes unnerving one.
Of near-universal importance was the desire to save humans before animals, the young before the old, and as many lives as possible.
But also evident was a preference for saving the lawful, for saving executives before the homeless, women before men, and the fit before the overweight.
Also of interest were clear differences in moral preferences by culture. One cluster of countries, including those of Central and South America, France and French-influenced territories, showed a strong preference for sparing women and the fit. They were also less inclined to save people before pets.
Further bucking the trend, a cluster of Asian countries demonstrated a desire to save older people rather than the young, clearly indicating a culturally driven moral relativism.
From this research it seems our moral judgements reflect deep inequalities between different sectors of society.
But Yochanan Bigman and Kurt Gray, from the University of North Carolina at Chapel Hill in the US, have revisited the MME and its findings, again in the pages of the journal Nature.
They are challenging its central claim that “people want autonomous vehicles (AVs) to treat different human lives unequally, preferentially killing some people (for example, men, the old and the poor) over others (for example, women, the young and the rich).”
Instead, the researchers argue that this seeming penchant for treating people unequally might well be an artefact of presenting moral choices in the format of the Trolley Problem, a format which forces “people to choose between killing one person (or set of people) versus killing another person (or set of people).”
Most glaringly, there is no option to express a preference for treating both equally.
Indeed, Bigman and Gray argue that the MME’s conclusion that people want AVs to make life and death decisions based on personal features “contradicts well-documented ethical preferences for equal treatment across demographic features and identities, a preference enshrined in the US Constitution, the United Nations Universal Declaration of Human Rights and in the Ethical Guideline 9 of the German Ethics Code for Automated and Connected Driving.”
The pair set out to discover “what would happen if people indicated their ethical preferences in a revised paradigm, one that allowed AVs to treat different humans equally”.
Their revised experiment included a replication of the MME methodology as well as a reframed experiment that included a third option to treat the lives of two groups – for example children and elderly people – equally.
What they found is that the ‘forced inequality’ model of the trolley problem scenarios replicated the findings of the MME; but when participants were allowed to choose equality, it was overwhelmingly the preferred option.
For instance, they explain, “when forced to choose between men and women, 87.7% chose to save women, but 97.9% of people actually preferred to treat both groups equally”.
They pondered the MME’s results through the lens of a thought experiment: “what might happen if the MME forced people to choose between black and white people? Aggregating people’s decisions could reveal a racial bias, but this would not mean that people want to share the road with racist autonomous vehicles.”
This same logic, they argue, applies to the personal features used in the MME: the experiment might reveal certain preferences, but does not indicate that people truly want to “live in a world with sexist, ageist and classist self-driving cars”.
“This thought experiment,” they conclude, “further suggests that aggregating across forced-choice preferences may not accurately reveal how people want autonomous vehicles to be programmed to act when human lives are at stake.”
In the same issue of Nature, the original authors of the MME research reflect on Bigman and Gray’s critique. While not dismissive of the pair’s findings, Awad and colleagues point out that Bigman and Gray’s approach is particularly sensitive to certain confounding factors.
One example is framing: the equality option can be worded in two radically different ways.
One could word this option as “treat the lives of children and elderly people equally”.
But just as plausibly one could rephrase this as “the self-driving car should indiscriminately kill children and elderly people”.
Obviously, the latter is much less palatable. The MME team point out that the two parts of Bigman and Gray’s experiment rely on different framings, which may limit how accurately their results can be compared.
More importantly, Awad and team respond by releasing previously unpublished data from the MME pointing toward significant preferences for inequality over equality.
“After making 13 decisions,” write the authors, “users had the option to ‘help us better understand (their) decisions’. Users who agreed were taken to a page where they could position one slider for each of the nine dimensions explored by the Moral Machine.”
When users landed on the page, the sliders were set to reflect the choices they had made in the previous 13 scenarios. They then had the opportunity to move each slider toward, for example, older people, or toward younger people.
Respondents could also choose to treat lives equally, by placing the slider midway between the two extremes. This approach, argue Awad and the team, gives people the option to choose equality while avoiding the pitfalls caused by framing the issue in language.
What this part of the experiment revealed was that for four dimensions (saving humans, saving more lives, saving the young, and saving the law-abiding) the results clearly expressed a preference for inequality.
However, for the dimensions of saving pedestrians, saving the fit, saving women, and, contrary to the central findings of the original experiment, saving high-status people, the results showed a preference for equal treatment.
On these four dimensions, the findings align with those of Bigman and Gray.
So, while we might be more in favour of equality than the original MME would have us believe, we are not quite as egalitarian as Bigman and Gray would have it either.
Nonetheless, as any method of research will have drawbacks, the MME team applaud Bigman and Gray’s push for methodological diversity, hoping that the aggregate picture provided by multiple research teams will help us to understand what we really want from our decision-making machines.
Either way, we must form a clear picture of our true moral preferences before AVs become a common sight on our roads.
As Awad and colleagues say, “Self-driving car fatalities are an inevitability, but the type of fatalities that ethically offend the public and derail the industry are not.”