Now researchers have reproduced the results of another highly-cited study. Back in 2002, Emily Pronin and colleagues first described the “bias blind spot”, the finding that people believe they are less biased in their judgments and behaviour than the general population – that is, they are “blind” to their own cognitive biases. And while that study kick-started a whole line of related research, no one had attempted to directly replicate the original experiments.
But in a preregistered preprint published recently to ResearchGate, Prasad Chandrashekar, Siu Kit Yeung and colleagues report reproducing the original study, first in a small group of Hong Kong undergraduates, and then in two larger samples of 303 and 621 Americans who completed online surveys.
As the list of failed replications continues to build, psychology’s reproducibility crisis is becoming harder to ignore. Now, in a new paper that seems likely to ruffle a few feathers, researchers suggest that even many apparently successful replications in neuroimaging research could be standing on shaky ground. As the paper’s title bluntly puts it, the way imaging results are currently analysed “allows presenting anything as a replicated finding.”
The provocative argument is put forward by YongWook Hong from Sungkyunkwan University in South Korea and colleagues, in a preprint posted recently to bioRxiv. The fundamental problem, say the researchers, is that scientists conducting neuroimaging research tend to make and test hypotheses with reference to large brain structures. Yet neuroimaging techniques, particularly functional magnetic resonance imaging (fMRI), gather data at a much more fine-grained resolution.
This means that strikingly different patterns of brain activity could produce what appears to be the same result. For example, one lab might find that a face recognition task activates the amygdala (a structure found on each side of the brain that’s involved in emotional processing). Later, another lab apparently replicates this finding, showing activation in the same structure during the same task. But the amygdala contains hundreds of individual “voxels”, the three-dimensional pixels that form the basic unit of fMRI data. So the second lab could have found activity in a completely different part of the amygdala, yet it would appear that they had replicated the original result.
While psychology has been mired in a “replication crisis” recently – based on the failure of contemporary researchers to recreate some of its most cherished findings – there have been pockets of good news for certain sub-disciplines in the field. For instance, some replication efforts in cognitive psychology and experimental philosophy or X-phi have been more successful, suggesting that results in these areas are more robust.
To this more optimistic list we may now add personality psychology, or at least the specific area of research linking the Big Five personality trait scores with various personal and life outcomes, such as higher Neuroticism being associated with poorer mental health and reduced relationship satisfaction; higher trait Conscientiousness being associated with less risk of substance abuse; and stronger Extraversion correlating with leadership roles.
In his new paper that is in press at Psychological Science (and available as a preprint at the Open Science Framework), Christopher Soto at Colby College speculates that perhaps it is the tendency for researchers in personality to use large samples of participants, numbering in the hundreds or thousands, and to use reliable, standardised tests, that is to some extent responsible for the relatively robust results in this area. The new findings “leave us cautiously optimistic about the current state and future prospects of the personality-outcome literature,” Soto writes.
Stereotype threat is a very evocative, disturbing idea: Imagine if simply being reminded that you are a member of a disadvantaged group, and that stereotypes hold that members of your group are bad at certain tasks, led to a self-fulfilling prophecy in which you performed worse on such tasks than you would otherwise.
That’s been the claim of stereotype threat researchers since the concept was first introduced in the mid-1990s, and it’s spread far and wide. But as seems to be the case with so many strong psychological claims of late, in recent years the picture has gotten a bit murkier. “A recent review suggested that stereotype threat has a robust but small-to-medium sized effect on performance,” wrote Alex Fradera here at the BPS Research Digest in 2017, “but a meta-analysis suggests that publication bias may be a problem in this literature, inflating the apparent size of the effect.” Adding to the confusion are some results which seem to run exactly opposite to what the theory would predict, like the one Fradera was reporting on: in that study, female chess players were found to have performed better, not worse, against male opponents.
Now, another study is poised to complicate things yet further. In a paper to be published in the European Journal of Social Psychology, and available as a preprint, a team led by Charlotte Pennington of UWE Bristol recruited female participants to test two mechanisms (reduced effort and working memory disruption) that have been offered to explain the supposed adverse performance effects of gender-related stereotype threat. They also compared different ways of inducing stereotype threat. Interesting questions, you might think, but in all cases the researchers came up empty.
If you Google “holding a warm cup of coffee can” you’ll get a handful of results all telling the same story based on social priming research (essentially the study of how subtle cues affect human thoughts and behavior). “Whether a person is holding a warm cup of coffee can influence his or her views of other people, and a person who has experienced rejection may begin to feel cold,” notes a New York Times blog post, while a Psychology Today article explains that research shows that “holding a warm cup of coffee can make you feel socially closer to those around you.”
These kinds of findings are most often associated with John Bargh, a Yale University professor and one of the godfathers of social priming. In his 2017 book Before You Know It: The Unconscious Reasons We Do What We Do, Bargh goes further, even suggesting – based on social priming studies and a small study that found two hours of “hyperthermia” treatment with an infra lamp helped depressed in-patients – that soup might be able to treat depression. “After all,” he writes, “it turns out that a warm bowl of chicken soup really is good for the soul, as the warmth of the soup helps replace the social warmth that may be missing from the person’s life, as when we are lonely or homesick.” He continues, “These simple home remedies are unlikely to make big profits for the pharmaceutical and psychiatric industries, but if the goal is a broader and more general increase in public mental health, some research into their possible helpfulness could pay big dividends for individuals currently in distress, and for society as a whole.”
Replicating a study isn’t easy. Just knowing how the original was conducted isn’t enough. Just having access to a sample of experimental participants isn’t enough. As psychological researchers have known for a long time, all sorts of subtle cues can affect how individuals respond in experimental settings. A failure to replicate, then, doesn’t always mean that the effect being studied isn’t there – it can simply mean the new study was conducted a bit differently.
Many Labs 2, a project of the Center for Open Science at the University of Virginia, embarked on one of the most ambitious replication efforts in psychology yet – and did so in a way designed to address these sorts of critiques, which have in some cases hampered past efforts. The resultant paper, a preprint of which can be viewed here, is lead-authored by Richard A. Klein of the Université Grenoble Alpes. Klein and his very, very large team – it takes almost four pages of the preprint just to list all the contributors – “conducted preregistered replications of 28 classic and contemporary published findings with protocols that were peer-reviewed in advance to examine variation in effect magnitudes across sample and setting.”
Perhaps no concept has been more important to social psychology in recent years — for good and ill — than “social priming”, or the idea, as the science writer Neuroskeptic once put it, that “subtle cues can exert large, unconscious influences on human behaviour.” This subgenre of research has produced a steady drumbeat of interesting findings, but unfortunately, an increasing number of them are failing to replicate – including modern classics, like the idea that exposure to ageing-related words makes you walk more slowly, or that thinking about money increases your selfishness.
The so-called “Macbeth effect” is another classic example of social priming that gained mainstream recognition and acceptance from psychologists and laypeople alike. The term was first introduced by the psychologists Chen-Bo Zhong and Katie Liljenquist, who reported in a 2006 paper in Science that “a threat to one’s moral purity induces the need to cleanse oneself”.
This claim is such an interesting, provocative example of the connection between body and mind that it’s little wonder it has spread far and wide — there aren’t a lot of social-priming findings with their own Wikipedia page (it was also covered here at the Research Digest). But is it as strong as everyone thinks? For a recent paper in Social Psychology the psychologists Jedediah Siev, Shelby Zuckerman, and Joseph Siev decided to find out by conducting a meta-analysis of the available papers on the Macbeth effect to date.
There’s a popular idea in psychology that among the important factors shaping our honesty and generosity is our belief in the concept of free will. Believe more strongly in free will, so the theory goes, and you will be more inclined to prosocial behavior. Supporting this, studies that have momentarily undermined people’s belief in free will – for instance, by giving them a text to read about genetic determinism, or about how neuroscience shows our decisions are out of conscious control – have found that this increases people’s propensity for cheating and selfishness.
Such an effect seems understandable – after all, the notion that humans can choose whether to behave well or badly is fundamental to how we think about moral responsibility. It’s plausible that if you portray free will as an illusion then you provide people with a ready-made excuse for bad, selfish behavior, thus increasing the temptation for them to act that way.
As ever, however, reality is refusing to conform to a simple, intuitively appealing story. Recent attempts to replicate the influence of changing people’s free will beliefs on their subsequent moral behavior have failed, or have applied only to specific groups of people, but not others.
Now a series of four large studies conducted on Amazon’s survey website, each involving hundreds of people, has failed to find a correlation between people’s beliefs about free will and either their generosity toward charities or their inclination to cheat. Writing up their findings in Social Psychological and Personality Science, Damien Crone and Neil Levy at the University of Melbourne and Macquarie University said “… we believe there is good reason to doubt that free will beliefs have any substantial implications for everyday moral behaviors.”
“Update: On Twitter, some researchers argued, reasonably in my view, that I wasn’t quite sceptical enough in relating these findings. See the update at the end of this post for more details.”
If you wanted a poster child for the replication crisis and the controversy it has unleashed within the field of psychology, it would be hard to do much better than Fritz Strack’s findings. In 1988, the German psychologist and his colleagues published research that appeared to show that if your mouth is forced into a smile, you become a bit happier, and if it’s forced into a frown, you become a bit sadder. He pulled this off by asking volunteers to view a set of cartoons (paper ones, not animated) while holding a pen in their mouth, either with their teeth (forcing their mouth into a smile), or with their lips (forcing a frown), and to then use the pen in this position to rate how amused they were by the cartoons. The smilers were more amused, and the frowners less so – and best of all, they mostly didn’t discern the true purpose of the experiment, eliminating potential placebo-effect explanations.
This basic idea, that our facial expressions can feed back into our psychological state and behavior, goes back at least as far as Darwin and William James, but “facial feedback”, as it is known, had never been demonstrated in such an elegant and rigorous-seeming manner. Over time, this style of experiment was replicated and expanded upon, and soon it came to be considered a true blockbuster, so famous it found its way into psychology textbooks, as well as popular books and articles citing it as an example of the unexpectedly subtle ways our bodies and environments can affect us psychologically. Often, facial feedback has been popularised along the lines of “Maybe you can smile your way to happiness!”, which added an irresistible self-help element that likely helped spread the idea. Either way, it seemed like a genuinely safe and solid psychological finding. That changed rather abruptly in 2016.
Amid all the talk of a “replication crisis” in psychology, here’s a rare good news story – a new project has found that a sub-field of the discipline, known as “experimental philosophy” or X-phi, is producing results that are impressively robust.
The current crisis in psychology was largely precipitated by a mass replication attempt published by the Open Science Collaboration (OSC) project in 2015. Of 100 previously published significant findings, only 39 per cent replicated unambiguously, rising to 47 per cent on more relaxed criteria.