Doctor Skeptic: The replication problem

Thursday, 18 September 2014

The replication problem

One of the fundamental principles of science is that the results of any experiment should be reproducible. Reproducibility is essential because it means that the results can be relied upon, as they are more likely to be true. Unfortunately, there is little fame in replicating someone else’s study; it is also hard to get such studies funded (because they are not ‘novel’). Consequently, many studies are not repeated and many findings stand alone without verification from separate, independent researchers. This is a problem because often when studies are replicated, they fail to reproduce the original findings.

What am I talking about?

To get the replication/reproduction terminology clear from the start, I will refer to replicating or repeating studies (doing the same research again, preferably independently) and whether or not the replicated study reproduces the same results.

Two problems

From the definitions above, we have two problems: firstly, studies are often not replicated; secondly, when they are replicated, they often fail to reproduce the results of the initial study. Follow me so far?

The good and the bad

Reproduction of previous results is a good thing: it is a verification of the findings of previous research, therefore increasing the probability that those findings are true.
Failure of the replication study to reproduce the original study results decreases the probability that the original findings were true. This is bad for the original researchers, but it is good for everyone else because it is science’s way of detecting errors – of self-correcting.
Failure to replicate studies (at all) is bad for everyone because it means that we are less certain that the original results are true, and we could be holding on to false beliefs for a very long time, screwing up our practice of medicine.
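
To put some rough numbers on the three points above, here is a minimal sketch using Bayes' rule. The figures are my own assumptions for illustration, not taken from any of the studies discussed here: a 10% prior chance that a surprising hypothesis is true, studies with 80% power, and a 5% false positive rate. The function name `update` is also just an illustrative choice.

```python
def update(prior, positive, power=0.8, alpha=0.05):
    """Probability the effect is real after one more study (Bayes' rule).

    prior    -- probability the effect is real before this study
    positive -- True if the study found the effect, False if it did not
    power    -- assumed chance a study detects a real effect
    alpha    -- assumed chance a study 'detects' an effect that is not real
    """
    if positive:
        p_if_true, p_if_false = power, alpha
    else:
        p_if_true, p_if_false = 1 - power, 1 - alpha
    return prior * p_if_true / (prior * p_if_true + (1 - prior) * p_if_false)


prior = 0.10  # assumed prior for a surprising finding

after_original = update(prior, positive=True)           # original positive study
after_success = update(after_original, positive=True)   # replication reproduces it
after_failure = update(after_original, positive=False)  # replication fails

print(f"After the original study:     {after_original:.3f}")  # 0.640
print(f"After a successful repeat:    {after_success:.3f}")   # 0.966
print(f"After a failed repeat:        {after_failure:.3f}")   # 0.272
```

Under these assumed numbers, a single positive study leaves a fair chance that the finding is false, a successful replication makes it very likely to be true, and a failed replication makes it more likely false than true. Without any replication, we stay stuck at the first number.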

Falsifiability

According to Popper, it is a basic tenet of science that any finding / statement / theory must be falsifiable. If something cannot be disproved then, in effect, it cannot be challenged and becomes dogma, not science. A study or theory that stands up to attempts to falsify it is a more robust one.

Reproducibility

Replicating studies goes hand in hand with falsifiability. If nobody is going to repeat a study (and attempt to falsify it) then it doesn’t matter if it is falsifiable or not. Being falsifiable is not enough: theories gain strength from standing up to attempts at falsification, and study findings need to be reproduced if they are to be relied upon.

If they can’t be reproduced, then the conclusions of a study are not only not right, they are not even wrong (Pauli).

Examples

An article in Nature from 2012 reported an attempt by researchers to replicate 53 landmark or breakthrough papers in the field of pre-clinical cancer research. Despite multiple attempts at reproducing the findings in their own lab, they could only reproduce the findings of 6 out of the 53 studies. They even went to the extent of contacting original authors and repeating the studies in slightly different ways.

In one case, when told that the original findings could not be reproduced despite 50 attempts at replicating his research, the author of the original paper said that he had done the experiment 6 times himself and had only produced the interesting findings once. That means that the original researchers could not even reproduce their own findings in their own lab. Yet they only reported the positive study, not the negative ones. Depending on how they did it, that is either reporting bias or publication bias, but either way it is biased.
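
To see why reporting one positive result out of six attempts is a problem, here is a minimal sketch of the arithmetic. It assumes (my assumptions, not details from the paper) that there is no real effect, that each attempt is independent, and that each has the conventional 5% chance of producing a false positive.

```python
def chance_of_any_false_positive(attempts, alpha=0.05):
    """Chance that at least one of several independent attempts comes up
    'positive' purely by chance when there is no real effect.

    alpha is the assumed false positive rate of a single attempt.
    """
    return 1 - (1 - alpha) ** attempts


for attempts in (1, 6, 20):
    risk = chance_of_any_false_positive(attempts)
    print(f"{attempts:2d} attempts: {risk:.1%}")
# Prints:
#  1 attempts: 5.0%
#  6 attempts: 26.5%
# 20 attempts: 64.2%
```

Under these assumptions, running an experiment six times and reporting only the one that 'worked' gives roughly a one in four chance of publishing an effect that does not exist, not the one in twenty that readers are led to expect.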

Worryingly, several of the non-reproducible studies had already spawned new fields of research that were founded on the original findings, yet never validated them through re-testing.

Lamenting the declining success rate of clinical trials and a widening gap between discoveries that were ‘interesting’ and those that were ‘feasible’, a team from Bayer replicated 67 published in-house projects (link here) and could only reproduce the original findings in about 20-25% of them.

Both these papers blame sloppy science by the researchers, and the academic system (which rewards interesting findings).

A 2005 review by Ioannidis looked at the biggest articles (highest number of citations) in three major general medical journals and found that most of them were either not replicated (24%) or were replicated, but the results were not reproduced (no effect - 16%, or lesser effect - 16%).

In a recent review of medical publications testing standards of care, it was found that such studies were more likely to refute the original findings than reaffirm them. Interestingly, this reaffirms Ioannidis’ findings.

Psychology is particularly hard hit by lack of reproducibility, which has sparked such endeavours as the Reproducibility Project (here). This is part of a larger validation project looking to replicate and validate scientific findings (here).

Why is there a lack of repeat studies?

There is a perception that only novel science is rewarding. This may be true, as novel science is rewarded by grant money, promotions, doctorates and fame. Who wants to do a study that someone else has already done, and do it exactly the same way? Well, I do. That is the only way I can find out if those interesting studies that pop up every now and then are true.

The solution

Firstly, scientists need to be more scientific; a lot of this is sloppy science. This is covered in a previous post on Manufactured Significance. The methodology needs to be better, and studies need to have explicit protocols published prior to the research starting.
Secondly, there need to be more journals like PLOS, which publish anything they get, in full and open access, all online, based purely on scientific merit, not newsworthiness or novelty.
Thirdly, funders (grant funders, industry, universities, governments) need to appropriately prioritise replication research, particularly of important findings that have significant clinical and resource implications.
Fourthly, the readers of the research need to be aware of the problem, and they need to equip themselves with the tools for determining scientific validity and start using those tools.

The bottom line

For many reasons, there is a lack of reproduction of scientific findings: firstly because studies are not being repeated, and secondly because when they are, the results are often not reproduced. This knowledge gap is being recognised, but it will only be filled when those who control research output (the scientists themselves and those who control financial and academic rewards) appreciate the importance of repeating other people’s studies.

Other links

The lack of reproducibility in current research was also highlighted in this Economist article from 2013 with examples across many fields of science.
Another study in psychological science from 2015 here.
Another blogger has also covered this topic here, and offered some solutions here.
Organisations are now aware of these problems. The US Office of Research Integrity is a good reference, as are Retraction Watch, the Reproducibility Initiative, and the Committee on Publication Ethics.

9 comments:

Anonymous, 18 September 2014 at 20:08
What about too many reproducibility studies that lead to research waste? There are times when we do not need any more repeated studies. This most often occurs when people are trying to prove that something works when the previous studies have shown that it doesn't. In this instance it is important to consider whether a repeated study is worthwhile and what it would add. One can look to systematic reviews and meta-analyses to determine how large a study would need to be to overturn the results of previous studies.
Dr Skeptic, 18 September 2014 at 20:22
Excellent point. Most of the failure to reproduce/replicate is for studies with positive/surprising results, like antibiotics curing back pain. I think those studies HAVE to be replicated. Also, in my own field, I find that 'negative' research is not taken up or accepted, and therefore a replication study could strengthen the evidence base and be more likely to effect a change in practice.
An example of what you are saying would be PRP injections, where the evidence is overwhelmingly negative, but people keep doing more research, and occasionally studies come up positive. I guess I have less of a problem with that (part of my No-such-thing-as-bad-data philosophy) except, as you say, that it is a waste of resources.
John Cunningham, 18 September 2014 at 20:29
Only 16% of highly cited articles were contradicted later, as reported in the study by Ioannidis. Is that surprising, given the variability in populations, selection criteria, outcome measurements, etc. that occurs in clinical studies, even those which are considered identical by Ioannidis? It's hardly news. I'm surprised it's not more. One of the things about looking at these contradicting studies is often examining the subtle differences between these "identical" studies as a way of determining if these differences are responsible for the outcomes. I often find it's more valuable looking at contradicting studies this way, rather than throwing one's arms up in the air and declaring that all research is flawed.
Anonymous, 18 September 2014 at 23:56
As a layman, this is why I have so little faith in medical practitioners. However, it should be noted that the sloppy science applies not only to medicine but to other areas of daily life. It is almost a regression to belief in magic sometimes. X waved the wand once and it produced this, and this was funded by Y. Never a follow-up to X and his Y-funded study. "It" becomes accepted protocol.
http://solowomenathomeandabroad.blogspot.com/, 21 September 2014 at 03:22
Excellent article! There was a recent NPR segment on the toll these sloppy studies take on patients in clinical trials. http://www.npr.org/blogs/health/2014/09/15/344084239/patients-vulnerable-when-cash-strapped-scientists-cut-corners

Why be skeptical about medicine?

Doctors and skeptics are often critical of alternative medicine and other non-medical healing practices because they are not well supported by scientific evidence. This is appropriate.
What is inappropriate is the acceptance of medical practices (established and new) without a requirement for the same level of scientific support.
The evidence supporting many medical practices is less than many people suppose, and similarly, the harms from medicine are often under-appreciated.
We need to ask the same question of medicine that we would ask the alternative practitioner: what is the evidence? But we need skills to be able to critically appraise that evidence, because unlike (say) homeopathy, medical evidence is based on science. This is part of the problem because for many, being scientifically based is reason enough for a treatment to be accepted as true; assuming that a medical treatment works is our default position. This, and the other biases that creep into medical science on so many levels, at least partly due to our keenness to see it work, are the reasons for looking at medicine with a skeptical eye.
