In 2015, the world of psychology research was rocked when more than 100 scientists collaborated in an attempt to reproduce the results of 100 studies published in three leading journals and found that in many cases they couldn’t.
“Collectively these results offer a clear conclusion,” the investigators wrote in a paper in the journal Science. “A large portion of replications produced weaker evidence for the original findings despite using materials provided by the original authors, review in advance for methodological fidelity, and high statistical power to detect the original effect sizes.”
The findings were ominous. One of the most important elements of the scientific method is the assumption that if one bunch of scientists repeats the experiments of another, using exactly the same methods, then the results should be identical.
The finding that only between one-third and one-half of the original results in the targeted psychology studies could be reproduced has been dubbed the “replication crisis”, or “reproducibility crisis”, and it continues to generate debate today.
And, it seems, the crisis may be set to spread, with a new study by Australian researchers finding that the practices behind it are also common in the fields of ecology and evolutionary biology.
A team of scientists led by Hannah Fraser from the University of Melbourne surveyed 494 ecologists and 313 evolutionary biologists and asked them about questionable research practices. Such practices – research approaches that groom raw data to make it look more appealing, that is, closer to the results predicted – have been identified as a primary cause of psychology’s replication crisis.
Fraser and colleagues targeted three questionable practices. The first of these was p-hacking, the use of data mining to uncover apparent (or real) correlations that were not included in the original hypothesis. A consequence of p-hacking, wrote researchers from the Australian National University in 2015, is that “nonsignificant results become significant”.
The team also asked about cherry-picking – selecting results that are statistically significant while ignoring others – and a practice known as HARKing (hypothesising after the results are known), which involves formulating a hypothesis after, instead of before, the results are in.
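To see why such practices “make nonsignificant results significant”, consider a minimal simulation. It is not drawn from Fraser’s study – the sample size, number of outcomes and variable names below are illustrative assumptions – but it shows how testing many unplanned outcome variables against pure noise will, by chance alone, push some of them past the conventional p < 0.05 threshold.

```python
# A minimal sketch (illustrative only) of how p-hacking inflates significance:
# test many unplanned outcome variables against pure noise and see how many
# cross the conventional p < 0.05 threshold by chance alone.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_subjects = 30    # sample size per simulated study (arbitrary)
n_outcomes = 20    # number of unplanned outcomes "mined" for effects

predictor = rng.normal(size=n_subjects)               # e.g. a treatment score
outcomes = rng.normal(size=(n_outcomes, n_subjects))  # pure noise, no real effect

significant = []
for i, outcome in enumerate(outcomes):
    r, p = stats.pearsonr(predictor, outcome)         # test each correlation
    if p < 0.05:
        significant.append((i, r, p))

print(f"{len(significant)} of {n_outcomes} noise outcomes reached p < 0.05")
for i, r, p in significant:
    print(f"  outcome {i}: r = {r:+.2f}, p = {p:.3f}")
```

With 20 such tests, roughly one spurious “significant” correlation is expected on average; reporting only that one, as though it had been the hypothesis all along, combines p-hacking, cherry-picking and HARKing in a single move.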
The results, posted on preprint site Open Science Framework (OSF), are disturbing. Across both ecologists and evolutionary biologists, a whopping 64% confessed to cherry-picking. Some 42% indulged in p-hacking, mainly by collecting more data after the results were in. And 51% admitted to reporting unexpected results in ways that made it appear they had predicted them from the start.
The scientists surveyed, Fraser and colleagues relate, offered a raft of reasons to justify their transgressions, including publication bias, pressure, and the need to present a neat, coherent narrative.
“The use of Questionable Research Practices in ecology and evolution research is high enough to be of concern,” Fraser’s team writes, noting that the rates are as high as those found in psychology that catalysed the reproducibility crisis.
They recommend that academic journals adopt more rigorous and transparent editorial regimes to protect against such practices in future.