“Meanwhile, alternate-universe-you that didn’t find a statistically significant result doesn’t publish. The results sit in a file drawer.”
If you’re suffering from depression, have you asked your doctor about reboxetine?
It’s a great antidepressant. I mean, have you seen the data? This meta-analysis concluded that “reboxetine is significantly more effective than placebo in a subgroup of patients with severe depression.” Another meta-analysis showed the same. This one found that the risk of adverse events–side effects like dry mouth, insomnia, etc.–was pretty much the same between reboxetine and placebo.
And these aren’t meta-analyses of just any old studies; you could imagine that a meta-analysis of p-hacked studies infused with researcher allegiance bias could lead to absurd conclusions with tenuous connections to reality. No, these are meta-analyses of mostly double-blind, randomized, placebo-controlled trials (DBRCTs)–the gold standard of clinical trials. Medical doctors get excited over these, because they know that DBRCTs are the study design most likely to dispense accurate, truthful information about which drugs work and which don’t.
In fantasy stories, you have Oracles that will truthfully answer any question you ask. In medical research, you have DBRCTs.
So there’s really robust evidence in the medical literature showing that reboxetine is a great treatment for depression, with few side effects. Seems like a pretty boring success story for a drug…
…Until this meta-analysis by Eyding et al. in 2010 had to come along and ruin everybody’s fun. It turns out that the medical literature on reboxetine was afflicted by publication bias; that is to say, positive trials on reboxetine were selectively published, while negative trials never saw the light of day. In the 13 methodologically sound trials that were analyzed, Eyding et al. found that data on 74% of patients were left unpublished. While the three meta-analyses I cited above–drawing mostly from published data–concluded that reboxetine was an effective treatment for depression, the combined published and unpublished data didn’t show a statistically significant difference between reboxetine and placebo. In addition, the combined published and unpublished data showed that patients on reboxetine reported more adverse events than those on placebo (p < 0.001).
The authors concluded that reboxetine was “an ineffective and potentially harmful antidepressant.” Meanwhile, reboxetine had already been on the market for 13 years and had been prescribed to God knows how many patients.
The story of reboxetine is a lesson in humility. If you thought that all you had to do to get a true answer to a research question was do a good study, then the story of reboxetine says, “Nope, even DBRCTs can be wrong.” If you then thought that all you had to do was compile all the DBRCTs in the literature and see what that compilation says, the story of reboxetine says, “Nope, publication bias, motherfucker.”
And it’s not just reboxetine. The medical literature in general has a publication bias problem; this narrative review finds evidence of publication bias in studies on drugs for bipolar disorder, schizophrenia, panic disorder, Alzheimer’s disease, coronary heart disease, HIV/AIDS, ovarian cancer, multiple myeloma, osteoarthritis…Overall, several analyses have found that biomedical studies with positive, statistically significant results are at least 2x as likely to be published as those with negative, non-significant results.
Publication bias is a big deal, because being able to do a systematic overview of the literature is currently the best way we have to assess the truth of scientific claims. Since individual studies can be biased, p-hacked, and/or statistically underpowered, we need to be able to pick out good studies, leave out bad ones, and combine them in a neat package called a meta-analysis that gives us an overview of what we know on a topic. But meta-analyses can only draw from the literature that’s currently published; you can only analyze what you see. If there’s a bunch of research out there stuck in file drawers that disproportionately disputes whatever the current literature suggests, we need to be able to capture that research in a meta-analysis.
Now that we’ve learned about publication bias, as well as p-hacking, we can combine these two pieces of knowledge to make a prediction. If a) sketchy methodology is often used to get positive findings, and b) positive findings are not only more likely to get published at all, but also more likely to get published in higher-impact journals, then you might predict c) that higher-impact journals publish more studies with sketchy methodology, and therefore have more retractions. Now let’s see what the data suggest:
(Figure taken from Nature News & Comment, here)
This is troubling, since publications in higher-impact journals are, well, higher impact; they tend to have a lot of influence on the beliefs of scientists, as well as of the citizenry at large if the study is impactful enough. So getting these high-impact studies right is pretty important.
And one good way to make sure these high-impact studies are accurate would be to have scientists double-check each other by repeating each other’s experiments–in other words, to have scientists perform replications.
7. Contrary to popular belief, the reasons for this bias towards positive publications seem to have less to do with editorial bias against negative publications than with scientists not writing up negative results in the first place–either because they perceive an editorial bias against negative publications, or because they just aren’t interested.