Me and a dozen other academics all just wrote basically the same thing about Open Science in the Journal of Clinical Epidemiology. After the technical bits, me and Tracey get our tank out. That’s for a reason: publishing academic papers about structural problems in science is a necessary condition for change, but it’s not sufficient. We don’t need any more cohort studies on the global public health problem of publication bias; we need action, of which the AllTrials.net campaign is just one example (and as part of that, we do still need many more audits giving performance figures on individual companies, researchers and institutions, as I explain here). We have a paper coming shortly on the methods and strategies of the AllTrials campaign that I hope will shed a little more light on this, because policy change for public health is a professional activity, not a hobby. Where academics are sneery about implementation, problems go unsolved, and patients are harmed.
Ironically all these papers on Open Science are behind an academic paywall. The full final text of our paper is posted below. If you’re an academic and you’ve ever wondered whether you’re allowed to do this, but felt overwhelmed by complex terms and conditions, you can check every academic journal’s blanket policy very easily here.
And lastly, if you’re in a hurry: the last two paragraphs are the money shot. Enjoy.
Fixing flaws in science must be professionalised.
Ben Goldacre* and Tracey Brown
Journal of Clinical Epidemiology. July 10, 2015. doi:10.1016/j.jclinepi.2015.06.018.
It is heartening to see calls for greater transparency around data and analytic strategies, including in this issue, from such senior academic figures as Robert West. Science currently faces multiple challenges to its credibility. There is an ongoing lack of public trust in science and medicine, often built on weak conspiracy theories about technology such as vaccines(1). At the same time, however, there is clear evidence that we have failed to competently implement the scientific principles we espouse.
The most extreme illustration of this is our slow progress on addressing publication bias. The best current evidence, from dozens of studies on publication bias, shows that only around half of all completed research goes on to be published(2), and studies with positive results are twice as likely to be published(2)(3). Given the centrality of systematic review and meta-analysis to decision-making in medicine, this is an extremely concerning finding. It also exemplifies the perversity of our failure to address structural issues in evidence based medicine: we spend millions of dollars on trials specifically to exclude bias and confounding; but then, at the crucial moment of evidence synthesis, we let all those biases flood back in, by permitting certain results to be withheld.
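The way selective publication lets bias back in at the synthesis stage can be sketched with a small simulation. This is an illustration only, not an analysis from any of the cited studies: the true effect, standard error, and publication probabilities are all made-up assumptions, chosen so that statistically significant positive results are twice as likely to be published as everything else.

```python
import random
from statistics import mean


def simulate_publication_bias(n_studies=20000, true_effect=0.1, std_error=0.2,
                              p_publish_positive=0.8, p_publish_other=0.4,
                              seed=1):
    """Return (mean of all estimates, mean of published estimates).

    Every parameter value here is a hypothetical assumption for illustration.
    All studies share one standard error, so the simple mean equals the
    inverse-variance pooled estimate.
    """
    rng = random.Random(seed)
    all_estimates, published = [], []
    for _ in range(n_studies):
        # Each study gives an unbiased but noisy estimate of the true effect.
        estimate = rng.gauss(true_effect, std_error)
        all_estimates.append(estimate)
        # "Positive" means significantly greater than zero at the 5% level.
        is_positive = estimate / std_error > 1.96
        p_publish = p_publish_positive if is_positive else p_publish_other
        if rng.random() < p_publish:
            published.append(estimate)
    return mean(all_estimates), mean(published)


if __name__ == "__main__":
    all_mean, published_mean = simulate_publication_bias()
    print(f"pooled estimate, all studies:       {all_mean:.3f}")
    print(f"pooled estimate, published studies: {published_mean:.3f}")
```

Each simulated study is individually unbiased; the inflation in the published pool comes entirely from which results are allowed to be seen.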
Progress towards addressing this issue has been slow, with a series of supposed fixes that have often given little more than false reassurance(4). Clearest among these is the FDA Amendments Act 2007, which required all trials after 2008 on currently available treatments with at least one site in the US to post results at clinicaltrials.gov within 12 months of completion. There was widespread celebration that this had fixed the problem of publication bias, but no routine public audit of compliance. When one was finally published, in 2012, it showed that the rate of compliance with this legislation was 22%(5).
It is also worth noting the growing evidence that – despite peer review, and editorial control – academic journals are rather bad places to report the findings of clinical trials. Information on side effects in journal publications is known to be incomplete(6), for example; and primary outcomes of trials are routinely switched out, and replaced with other outcomes(7,8), which in turn exaggerates the apparent benefits of the treatment. Initiatives such as CONSORT(9) have attempted to standardise trial reporting with best practice guidelines, but perhaps most encouraging is the instantiation of such principles in clear proformas for reporting. All trials from any region and era can now post results on clinicaltrials.gov, and completion of simple pre-specified fields is mandated, leaving less opportunity for editorialising and manipulation. A recent cohort study of 600 trials at clinicaltrials.gov found that reporting was significantly more complete on this registry than in the associated academic journal article – where one was even available – for efficacy results, adverse events, serious adverse events, and participant flow(10).
Academic papers appear even less informative when compared with Clinical Study Reports (CSRs). These are lengthy documents, with a stereotyped and pre-specified structure, which are generally only prepared for trials sponsored by the pharmaceutical industry. For many years they were kept from view, but are increasingly being sought for use in systematic reviews, and to double-check analyses. A recent study by the German government’s cost effectiveness agency compared 101 CSRs against the academic papers reporting the same trials, and found that CSRs were significantly more informative, with important information on benefits and harms absent from the academic papers that many would regard as the canonical document on a trial(6). Sadly, progress towards greater transparency on CSRs has been hindered by lobbying and legal action from drug companies(11).
Lastly, there have been encouraging recent calls for greater transparency around individual patient data (IPD) from trials, which offers considerable opportunity for IPD meta-analysis and for checking initial analyses. These calls have, however, been tempered by overstatement of the privacy risks and administrative challenges (even though both have long been managed for the sharing of large datasets of patients’ electronic health records for observational epidemiology research), and by position statements on data sharing that exclude most retrospective data on the treatments that are currently in widespread use(4).
The AllTrials campaign(12) calls for all trials – on all uses, of all currently prescribed interventions – to be registered, with their full methods and results reported, and CSRs shared where they have been produced. This is a simple, clear “ask”, and the campaign has had significant policy impact, with extensive lobbying in the UK and EU, and a US launch to follow in 2015. It is sobering to note that the first robust quantitative evidence demonstrating the presence of publication bias was published in 1986, and was accompanied by calls for full trial registration(13), which have still not been answered.
This is especially concerning, since transparency around the methods and results of whole trials is one of the simplest outstanding structural flaws we face, and there is clear evidence of more subtle and interesting distortions at work throughout the scientific literature. Masicampo and Lalande, for example, examined all the p-values in one year of publications from three highly regarded psychology journals, and found an unusually high prevalence of p-values just below 0.05 (a conventional cut-off for regarding a finding as statistically significant)(14). A similar study examined all p-values in economics journals, and found a “hump” of p-values just below the traditional cut-off, and a “trough” just above(15). It is highly unlikely that these patterns emerged by chance. On the contrary: anyone who has analysed data themselves will be well aware of exactly how a marginal p-value could be improved, with the judicious application of small changes to the analytic strategy. “Perhaps it might look a little better if we split age into quintiles, rather than 5 year bands?”; “Maybe we should sense-check the data again for outliers?”; “If we took that co-variate out of the model things might change? We can always justify it in the discussion.” Whether we like it or not, the evidence suggests that sentences like these echo through the corridors of academia.
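One simple way to look for a “hump” of this kind is a caliper comparison: count the p-values falling in a narrow band just below 0.05 and in an equally narrow band just above, then ask how surprising the imbalance would be if a p-value were equally likely to land on either side. The sketch below is a simplified illustration, not the method used in the cited studies. The band width and the 50/50 null are assumptions, and the 50/50 null is only an approximation, since genuine effects already make smaller p-values somewhat more common.

```python
from math import comb


def caliper_counts(p_values, threshold=0.05, width=0.005):
    """Count p-values in equal-width bands just below and just above threshold.

    The default band width of 0.005 is an arbitrary illustrative choice.
    """
    below = sum(1 for p in p_values if threshold - width <= p < threshold)
    above = sum(1 for p in p_values if threshold <= p < threshold + width)
    return below, above


def binomial_upper_tail(k, n, prob=0.5):
    """Return P(X >= k) for X ~ Binomial(n, prob)."""
    return sum(comb(n, i) * prob**i * (1 - prob)**(n - i)
               for i in range(k, n + 1))


def caliper_test(p_values, threshold=0.05, width=0.005):
    """One-sided test for an excess of p-values just below the threshold."""
    below, above = caliper_counts(p_values, threshold, width)
    return below, above, binomial_upper_tail(below, below + above)


if __name__ == "__main__":
    # Hypothetical p-values, invented purely to exercise the functions.
    example = [0.046, 0.047, 0.048, 0.049, 0.049, 0.051, 0.20, 0.01]
    print(caliper_test(example))
```

A small tail probability is only a prompt for closer inspection: it cannot say which individual analyses were adjusted, or how.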
There is also the related problem of selective outcome reporting within studies, and people finding their hypothesis in their results, a phenomenon which has also been detected through close analysis of large quantities of research. One unusual landmark study inferred power calculations from a large sample of brain imaging studies which were looking for correlations between anatomical and behavioural features: overall, this literature reported almost twice as many positive findings as can plausibly be supported by the number of hypotheses the authors claim to have tested(16). The most likely explanation for this finding is highly concerning: it appears that large numbers of researchers have effectively gone on fishing expeditions, comparing multiple anatomical features against multiple behavioural features, then selectively reported the positive findings, and misleadingly presented their work as if they had only set out to examine that one correlation.
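The logic of comparing observed against plausible numbers of positive findings can be sketched as follows. This is a minimal illustration in the spirit of an excess significance test, and not a reproduction of the cited analysis: it assumes an estimate of statistical power is already available for each study, and the power values and counts in the usage example are hypothetical.

```python
from math import erfc, sqrt


def excess_significance(powers, observed_positive):
    """Compare observed and expected numbers of positive studies.

    powers: estimated power of each study to detect a plausible effect
            (assumed to be given; estimating power is the hard part).
    Returns (expected_positive, chi_square_statistic, p_value), using a
    chi-square test with one degree of freedom.
    """
    n = len(powers)
    expected = sum(powers)
    if not 0 < expected < n:
        raise ValueError("expected count must lie strictly between 0 and n")
    diff_sq = (observed_positive - expected) ** 2
    statistic = diff_sq / expected + diff_sq / (n - expected)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = erfc(sqrt(statistic / 2))
    return expected, statistic, p_value


if __name__ == "__main__":
    # Hypothetical literature: 20 studies, each with 50% power,
    # of which 18 report a positive finding.
    print(excess_significance([0.5] * 20, observed_positive=18))
```

The test is only as good as the power estimates fed into it, which is one reason findings of this kind are best read as a signal about a literature as a whole rather than a verdict on any single study.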
For epidemiology, all this raises important questions. It is clear that there are discretionary decisions made by researchers that can affect the outcomes of research, whether observational studies or randomised trials. The non-publication of entire studies – the crudest form of outcome reporting bias – is in many ways the simplest problem to address, with universal registration and disclosure of results, accompanied by close monitoring of compliance through universities, ethics committees, sponsors, and journals. Addressing the other distortions may prove more challenging. One option is to demand extensive and laborious pre-specification of analytic strategy, with similarly extensive documentation of any deviations. This may help, but may also miss much salient information, as there are so many small decisions (or “researcher degrees of freedom”(17)) in an analysis, and some may not even be thought of before the analysis begins. A more extreme option might be to demand a locked and publicly accessible log of every command and program ever run on a researcher’s installation of their statistics package, providing a cast iron historical record of any attempt to exaggerate a finding through multiple small modifications to the analysis. The Open Science Framework offers tools to facilitate something approaching this(18), although even such extreme approaches are still vulnerable to deliberate evasion. Clearly a trade-off must emerge between what is practical, what is effective, what the culture of science will bear, and perfection: but at the moment, these problems are largely unaddressed, and under-discussed. Meanwhile, the experience of publication bias suggests progress will be slow.
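To make the idea of a “locked” log concrete, one possible approach is a hash chain, in which each entry records a hash of the entry before it, so that any later edit or deletion breaks the chain. The sketch below is purely illustrative and makes no claim about how the Open Science Framework or any statistics package works: the function names, the log format, and the example commands are all hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash, command):
    """Hash one entry together with the hash of the entry before it."""
    payload = json.dumps({"prev": prev_hash, "command": command},
                         sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_command(log, command):
    """Append an analysis command to the log, chaining it to the last entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"command": command,
                "prev": prev_hash,
                "hash": _digest(prev_hash, command)})
    return log


def verify_log(log):
    """Return True only if no entry has been altered, removed, or reordered."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["command"]):
            return False
        prev_hash = entry["hash"]
    return True


if __name__ == "__main__":
    log = []
    # Hypothetical commands, standing in for a real analysis session.
    for cmd in ["load trial.csv", "regress outcome on age", "drop outliers"]:
        append_command(log, cmd)
    print(verify_log(log))
```

A chain like this is only tamper-evident if the latest hash is regularly lodged somewhere the researcher cannot rewrite, and it does nothing about analyses run outside the logged environment, which is exactly the kind of deliberate evasion noted above.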
Is there a way forward? We think so. The flaws we see today, in the structures of evidence based medicine, are a significant public health problem. It is remarkable that we should have identified such widespread problems, with a demonstrable impact on patient care, documented them meticulously, and then left matters to fix themselves. It is as if we had researched the causes of cholera, and then sat proudly on our publications, doing nothing about cleaning the water or saving lives. Yet all too often efforts to improve scientific integrity, and fix the flaws in our implementation of the principles of evidence based medicine, are viewed as a hobby, a side project, subordinate to the more important business of publishing academic papers.
We believe that this is the core opportunity. Fixing structural flaws in science is labour intensive. It requires extensive lobbying of policy makers and professional bodies; close analysis of evidence on flaws and opportunities; engaging the public to exert pressure back on professionals; creating digital infrastructure to support transparency; open, public audit of best and worst practice; and more. If we do not regard this as legitimate professional activity – worthy of grants, salaries, and foreground attention from a reasonable number of trained scientists and medics – then it will not happen. The public, and the patients of the future, may not judge our inaction kindly.
Dr Ben Goldacre BA MA MSc MBBS MRCPsych, Dept Non-Communicable Disease Epidemiology, London School of Hygiene and Tropical Medicine, Keppel St, London WC1E 7HT.
Tracey Brown, Director, Sense About Science, 14a Clerkenwell Green London EC1R 0DP
BG and TB are co-founders of the AllTrials campaign. BG has been supported by grants from the Wellcome Trust, NIHR BRC, and the Arnold Foundation. BG receives income from speaking and writing on problems in science. TB is employed to campaign on science policy issues by Sense About Science, a UK charity.
1. Goldacre B. Bad Science. Fourth Estate Ltd; 2008.
2. Schmucker C, Schell LK, Portalupi S, Oeller P, Cabrera L, Bassler D, et al. Extent of Non-Publication in Cohorts of Studies Approved by Research Ethics Committees or Included in Trial Registries. PLoS ONE. 2014 Dec 23;9(12):e114023.
3. Song F, Parekh S, Hooper L, Loke YK, Ryder J, Sutton AJ, et al. Dissemination and publication of research findings: an updated review of related biases. Health Technol Assess Winch Engl. 2010 Feb;14(8):iii, ix–xi, 1–193.
4. Goldacre B. Commentary on Berlin et al “Bumps and bridges on the road to responsible sharing of clinical trial data.” Clin Trials Lond Engl. 2014 Feb;11(1):15–8.
5. Prayle AP, Hurley MN, Smyth AR. Compliance with mandatory reporting of clinical trial results on ClinicalTrials.gov: cross sectional study. BMJ. 2012;344:d7373.
6. Wieseler B, Wolfram N, McGauran N, Kerekes MF, Vervölgyi V, Kohlepp P, et al. Completeness of Reporting of Patient-Relevant Clinical Trial Outcomes: Comparison of Unpublished Clinical Study Reports with Publicly Available Data. PLoS Med. 2013 Oct 8;10(10):e1001526.
7. Mathieu S, Boutron I, Moher D, Altman DG, Ravaud P. Comparison of Registered and Published Primary Outcomes in Randomized Controlled Trials. JAMA. 2009 Sep 2;302(9):977–84.
8. Boutron I, Dutton S, Ravaud P, Altman DG. Reporting and Interpretation of Randomized Controlled Trials With Statistically Nonsignificant Results for Primary Outcomes. JAMA. 2010 May 26;303(20):2058–64.
9. Schulz KF, Altman DG, Moher D. CONSORT 2010 Statement: updated guidelines for reporting parallel group randomised trials. Trials. 2010 Mar 24;11:32.
10. Riveros C, Dechartres A, Perrodeau E, Haneef R, Boutron I, Ravaud P. Timing and Completeness of Trial Results Posted at ClinicalTrials.gov and Published in Journals. PLoS Med. 2013 Dec 3;10(12):e1001566.
11. Groves T, Godlee F. The European Medicines Agency’s plans for sharing data from clinical trials. BMJ. 2013 May 8;346(may08 1):f2961–f2961.
12. AllTrials. The AllTrials Campaign [Internet]. 2013 [cited 2014 Jan 9]. Available from: www.alltrials.net/
13. Simes RJ. Publication bias: the case for an international registry of clinical trials. J Clin Oncol. 1986 Oct 1;4(10):1529–41.
14. Masicampo EJ, Lalande DR. A peculiar prevalence of p values just below .05. Q J Exp Psychol. 2012 Nov 1;65(11):2271–9.
15. Brodeur A, Lé M, Sangnier M, Zylberberg Y. Star Wars: The Empirics Strike Back [Internet]. Rochester, NY: Social Science Research Network; 2012 Jun [cited 2015 Jun 11]. Report No.: ID 2089580. Available from: papers.ssrn.com/abstract=2089580
16. Ioannidis JPA. Excess Significance Bias in the Literature on Brain Volume Abnormalities. Arch Gen Psychiatry. 2011 Aug 1;68(8):773–80.
17. Simmons JP, Nelson LD, Simonsohn U. False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant [Internet]. Rochester, NY: Social Science Research Network; 2011 May [cited 2015 Feb 11]. Report No.: ID 1850704. Available from: papers.ssrn.com/abstract=1850704
18. Center for Open Science. Open Science Framework [Internet]. 2015 [cited 2015 Feb 11]. Available from: osf.io/