A replication tour de force

In his famous 1974 lecture, Cargo Cult Science, Richard Feynman recalls his experience of suggesting to a psychology student that she should try to repeat a previous experiment before attempting a novel one:

“She was very delighted with this new idea, and went to her professor. And his reply was, no, you cannot do that, because the experiment has already been done and you would be wasting time. This was in about 1947 or so, and it seems to have been the general policy then to not try to repeat psychological experiments, but only to change the conditions and see what happened.”

Despite the popularity of the lecture, few took his comments about lack of replication in psychology seriously – and least of all psychologists. Another 40 years would pass before psychologists turned a critical eye on just how often they bother to replicate each other’s experiments. In 2012, US psychologist Matthew Makel and colleagues surveyed the top 100 psychology journals since 1900 and estimated that for every 1000 papers published, just two sought to closely replicate a previous study. Feynman’s instincts, it seems, were spot on.

Now, after decades of the status quo, psychology is finally coming to terms with the idea that replication is a vital ingredient in the recipe of discovery. The latest issue of the journal Social Psychology reports an impressive 15 papers that attempted to replicate influential findings related to personality and social cognition. Are men really more distressed by infidelity than women? Does pleasant music influence consumer choice? Is there an automatic link between cleanliness and moral judgements?

Several phenomena replicated successfully. An influential finding by Stanley Schachter from 1951 on ‘deviation rejection’ was successfully repeated by Eric Wesselmann and colleagues. Schachter had originally found that individuals whose opinions persistently deviate from a group norm tend to be disempowered by the group and socially isolated. Wesselmann and colleagues replicated the result, though they found the effect to be smaller than originally supposed.

On the other hand, many supposedly ‘classic’ effects could not be found. For instance, there appears to be no evidence that making people feel physically warm promotes social warmth, or that asking people to recall immoral behaviour makes the environment seem darker. Nor was there support for the Romeo and Juliet effect, in which parental opposition to a relationship is said to intensify the couple's love.

The flagship of the special issue is the Many Labs project, a remarkable effort in which 50 psychologists located in 36 labs worldwide collaborated to replicate 13 key findings, across a sample of more than 6000 participants. Ten of the effects replicated successfully.

Adding further credibility to this enterprise, each of the studies reported in the special issue was pre-registered and peer reviewed before the authors collected data. Study pre-registration ensures that researchers adhere to the scientific method and is rapidly emerging as a vital tool for increasing the credibility and reliability of psychological science.

The entire issue is open access and well worth a read. I think Feynman would be glad to see psychology leaving the cargo cult behind and, for that, psychology can be proud too.

– Further reading: A special issue of The Psychologist on issues surrounding replication in psychology.

Post written for the BPS Research Digest by guest host Chris Chambers, senior research fellow in cognitive neuroscience at the School of Psychology, Cardiff University, and contributor to the Guardian psychology blog, Headquarters.

6 thoughts on “A replication tour de force”

  1. For what it's worth, my wife (re)tested the Lake Wobegon Effect as part of her dissertation work (it was not the main point, but as long as you are testing people and you have enough N…). For qualitative comparisons (as opposed to quantitative comparisons, which you'd hope MBA students would be rational about) it came shining through, very unlikely to be a chance result (p < 0.0005, something ridiculous like that). I suppose there is always the possibility that Stanford MBA students would possess rationality and self-esteem combined to a degree of Goldilocksian perfection, but you go with the test subjects that you have, not the ones you want. She compensated them with cookies; Stanford was then (early 90s) being investigated by Paul Biddle for improper use of funds from “overhead” charges on grants, and so a lottery was deemed too risky. There were enough cookies that her husband and every single male grad student in the program finally said, “Thanks, but I think I've had enough”.


  2. Just out of curiosity: how will we ever know whether a replication fails because a) the original research was faulty, b) the replication is faulty, or c) the situation (e.g. current social norms) has changed? Even within replication, replication will be needed.


  3. Hi Lisa, that's an excellent point, and one we stress in our introduction: 'Third, direct replications that produce negative results facilitate the identification of boundary conditions for real effects.' There are some nice examples in the special issue where this is attempted (e.g., Žeželj & Jokić, 2014). For others, we hope this will be attempted in the future. Then, meta-analyses will be a good way to decide between A, B, or C. A replication makes the fact that we don't know whether A, B, or C is true more salient, but it is important to remember that this uncertainty also exists before a replication study is performed.


  4. Why are the papers by Nosek and colleagues placed alongside those of Tversky and Kahneman? I think this has been done to hide the fact that implicit attitude research suffers from problems of validity rather than reliability.


  5. This is why there is really a need for a journal (or a family of journals) devoted to failed replications. Too often failed replications are deemed unworthy of publication.


