Ben Goldacre, The Guardian, Saturday 24 July 2010
There is a pleasing symmetry in the ropey science you get from different players. When GlaxoSmithKline are confronted with an unflattering meta-analysis summarising the results of all 56 trials on one of their treatments, as we saw last week, their defence is to point at 7 positive trials, exactly as a homeopath would do. Politicians will often find a ray of positive sunshine in a failed policy’s appraisal, and promote that to the sky. Newspapers, similarly, will spin science to fit their political agenda, with surreal consequences (the Telegraph have claimed recently that shopping causes infertility in men, and the Daily Mail reckon housework prevents breast cancer in women).
But does the same thing happen in formal academic research?
Isabelle Boutron and colleagues set out to examine this problem systematically. They took every trial published over one month that had a negative result – 72 in total – and then went through each trial report to look for evidence of “spin”: people trying to present the results in a positive light, or distract the reader from the fact that the trial was negative.
First they looked in the abstracts. These are the brief summaries of the academic paper, and they are widely read, either because people are too busy to read the whole paper, or because they cannot get access to it without a paid subscription (a scandal in itself).
Normally, as you scan hurriedly through an abstract, you’d expect to be told the “effect size” – “0.85 times as many heart attacks in patients on our new super-duper heart drug” – along with an indication of the statistical significance of this result. But in this representative sample of 72 trials with negative results, only 9 gave these figures properly in the abstract, and 28 gave no numerical results for the main outcome of the trial, at all. It gets worse. Only 16 of these negative trials reported the main negative outcome of the trial properly, anywhere, even in the main body of the text.
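To make concrete what those missing numbers look like, here is a minimal sketch of an effect size with its 95% confidence interval – the counts are invented for illustration, not taken from any of the 72 trials:

```python
# A minimal sketch (hypothetical counts, purely illustrative) of the two
# figures an abstract ought to report: the effect size and its uncertainty.
import math

events_drug, n_drug = 34, 1000      # heart attacks on the new drug
events_ctrl, n_ctrl = 40, 1000      # heart attacks on the comparison

risk_drug = events_drug / n_drug
risk_ctrl = events_ctrl / n_ctrl
rr = risk_drug / risk_ctrl          # effect size: risk ratio, here ~0.85

# Approximate 95% confidence interval for the risk ratio (on the log scale)
se_log_rr = math.sqrt(1/events_drug - 1/n_drug + 1/events_ctrl - 1/n_ctrl)
lo = math.exp(math.log(rr) - 1.96 * se_log_rr)
hi = math.exp(math.log(rr) + 1.96 * se_log_rr)

print(f"risk ratio {rr:.2f}, 95% CI {lo:.2f} to {hi:.2f}")
# If the interval spans 1.0, the trial has not shown a statistically
# significant difference - exactly the kind of number that went missing
# from most of these abstracts.
```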
So what was in these trial reports? Spin. Sometimes the researchers found some other positive result in the spreadsheets and pretended that this was what they intended to count as a positive result all along. Sometimes they reported a dodgy subgroup analysis. Sometimes they claimed to have found that their treatment was “non-inferior” to the comparison treatment (when in reality a “non-inferiority” trial requires a bigger sample of people, because you might have missed a true difference simply by chance). Sometimes they just brazenly banged on about how great the treatment was, despite the evidence.
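As a rough back-of-the-envelope illustration of the sample-size point (the event rates, margin, and power below are my own assumptions, not figures from the article), patients per arm scale roughly with one over the square of the difference you need to detect or rule out, and a non-inferiority margin is typically much tighter than the benefit a superiority trial is powered to find:

```python
# A rough sketch, with illustrative numbers only, of why a "non-inferiority"
# claim needs a bigger trial than a superiority claim.
from statistics import NormalDist

def n_per_arm(p, delta, alpha=0.025, power=0.90):
    """Approximate patients per arm for a binary outcome, assuming both arms
    truly share event rate p; delta is the absolute difference to detect
    (superiority) or to rule out (non-inferiority margin)."""
    z = NormalDist().inv_cdf
    return 2 * p * (1 - p) * (z(1 - alpha) + z(power)) ** 2 / delta ** 2

# Superiority trial powered to detect a 5-point drop from a 20% event rate:
print(round(n_per_arm(0.20, 0.05)))   # on the order of 1,300 per arm
# Non-inferiority trial that must rule out being even 2 points worse:
print(round(n_per_arm(0.20, 0.02)))   # on the order of 8,400 per arm
```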
There are lots of things in place to stop this kind of stuff from happening. Trials are supposed to be registered, before they begin, with their protocol described in full, so that highly motivated individuals can go back and check if researchers changed their minds about what constituted a positive result, retrospectively, after the results came in. There are also reporting guidelines, such as CONSORT, which formalise the information that is supposed to appear in any scientific paper resulting from a trial.
But there is no enforcement for any of this, everyone is free to ignore it, and commonly enough – as with newspapers, politicians, and quacks – uncomfortable facts are cheerfully spun away.
DaveChapman said,
July 24, 2010 at 10:48 am
I have often been dismayed at research that cites other studies supporting its findings, only to find that the other studies have so many flaws in their methodology that they are almost meaningless.
T.J. Crowder said,
July 24, 2010 at 12:28 pm
How does that get past peer review?
TwentyMuleTeam said,
July 24, 2010 at 1:29 pm
Once at a week-long seminar on EBM, I watched experienced & very busy MDs search through a plethora of on-line research articles to identify key parts, evaluate them, & attempt to use the information to solve a single tx question: “Do bipolar patients with chronic severe sx do better w/psychiatric case management?” This was just 1 out of 100,000 possible tx questions. It occurred to me, Who has time to do all this, much less monitor quality of research? What kind of alembic produces from 10,000,000 possible research articles at least a bit of reliable information? The amt of vigilance needed to monitor this process seems daunting.
ronmurp said,
July 24, 2010 at 6:48 pm
Someone else who sees problems:
thesciencenetwork.org/programs/beyond-belief-candles-in-the-dark/beatrice-golomb
TheSacredMongoose said,
July 24, 2010 at 9:31 pm
@T.J.Crowder – You could get it past peer review by publishing it in a journal that isn’t peer reviewed…. there’s another thing that should be enforced really, all research should be published somewhere that’s peer reviewed, instead of in some obscure back of beyond type journal.
lorcancoyle said,
July 24, 2010 at 10:53 pm
Slightly OT, prescriptions.blogs.nytimes.com/2010/07/22/federal-sting-slams-gene-tests/?scp=1&sq=dna%20fda%20heart&st=cse, NY Times on the FDA cracking down on gene test kit claims etc.
martha said,
July 24, 2010 at 11:05 pm
Maybe only the ones that put a positive spin on their research get published…
muscleman said,
July 25, 2010 at 2:24 pm
I suspect that Martha has a good handle on it. The other factor, of course, is a combination of career progression and the need to keep getting funding. We need the top journals’ practice of refusing trial reports that are not on the register to spread to other journals. If you couldn’t get published anywhere without being registered then things would change.
AndrewKoster said,
July 25, 2010 at 2:37 pm
@martha: that is a very real possibility. Negative results are too often rejected for not being significant enough for publication. If only positive results get published then you HAVE to put some kind of spin on your research to get your negative result published.
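A toy simulation of the filter martha and AndrewKoster describe (the setup and numbers are illustrative assumptions, not from the article or the comments): when only “significant” results get written up, the published record exaggerates, even when there is nothing real to find.

```python
# Toy simulation: 2,000 two-arm trials of a treatment with no real effect,
# "published" only when p < 0.05. The published subset looks impressive
# even though the true effect is zero throughout.
import random
from statistics import NormalDist, mean, stdev

random.seed(1)
norm = NormalDist()

def one_trial(n=50):
    """Simulate one two-arm trial with no true difference between arms."""
    a = [random.gauss(0, 1) for _ in range(n)]  # treatment arm
    b = [random.gauss(0, 1) for _ in range(n)]  # control arm
    diff = mean(a) - mean(b)
    se = (stdev(a) ** 2 / n + stdev(b) ** 2 / n) ** 0.5
    p = 2 * (1 - norm.cdf(abs(diff) / se))      # approximate two-sided z-test
    return diff, p

trials = [one_trial() for _ in range(2000)]
published = [d for d, p in trials if p < 0.05]  # the only ones that "count"

print(f"{len(published)} of {len(trials)} no-effect trials reached p < 0.05")
print(f"mean |effect| among the 'published': {mean(abs(d) for d in published):.2f}")
print("true effect in every trial: 0.00")
```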
Dudeistan said,
July 26, 2010 at 11:56 am
Bing Crosby used to sing about accentuating the positive. I certainly think we need more ‘accentuating the negative’ when it comes to research. Didn’t Merck allegedly hide research that showed potential cardiotoxicity of Vioxx?
Given the current number of lawsuits against the makers of rosiglitazone, clearly something has gone terribly wrong with this drug’s provenance. Typical that the company would highlight the positive studies when confronted with a meta-analysis.
However as a general point only, and I do not mean to refer to this specific case, we mustn’t lose sight that meta-analysis is not the Holy Grail of research evidence.
For example: BMJ 2010;341:c3515
TheShrink said,
July 26, 2010 at 1:49 pm
Indeed, I am bemused and dismayed by how utterly shameless a company can be in reporting of research.
I’d agree with Martha too: publication bias towards positive findings skews the writing up of research markedly, as the raw “data” is morphed into slickly presented, desirable “information”.
The consequence of results/discussion being at variance with the factual data is of course a trifling nuisance, but clever folk can overcome such obstacles and challenge, no?
nesburrito said,
July 26, 2010 at 3:02 pm
@martha (#6) and AndrewKoster (#7): I think you’ve hit it on the nose. I am a scientist with a drawer full of null data. There is no forum in which I can publish it – not out of any nefarious motives or any attempt to hide anything, but simply because no journal will take it. At the moment, the only way to get any null data past reviewers is to still insist it’s an interesting finding – a skill relying on spin. At least the researchers DID manage to publish the null data, which allows reasonably informed readers to look just at the results. You don’t have to believe the spin.
If we want to see more published data showing ‘no significant effect’, I expect what we need is reform of academic journals rather than changes of heart on the part of the researchers.
Sqk said,
July 26, 2010 at 8:31 pm
Having read points 6-9 above, have a look at bulletpoints 2 and 3 in the advert below:
www.newscientistjobs.com/jobs/job/creative-medical-writer-healthcare-advertising-london-london-1400883736.htm
quietstorm said,
July 26, 2010 at 8:50 pm
Thanks Sqk! Chilling
[bullet 2] Use this understanding to identify the most persuasive and compelling brand story possible from the available data.
“brand story”??? Really?
[bullet 3] Produce clear, accurate, creative and, above all, compelling sales copy appropriate to the given brief.
Are the words “accurate, creative” not mutually exclusive?
quietstorm said,
July 26, 2010 at 9:09 pm
@nesburrito [#9]: I feel like your “drawer results” deserve seeing the light of day, although I understand that it would be an uphill struggle.
“Null” data are interesting! Should I write to all the editors I know and campaign to have more articles published that deal with data which don’t show anything one way or the other? I mean, null data may reveal
(a) standard approaches are insufficient, and a new design of trial/experiment is required,
(b) the phenomenon of interest doesn’t show any relationship with anything.
Nobody seems to like option (b), but I’m pretty sure that sometimes, that’s how nature works. And I’d like to be able to see the data in those cases!
How does this vary from one discipline to another? Does anyone have any advice for getting results published without having to resort to spin? I should state that I work in physical sciences, not medicine, but the problems appear to be the same.
Bloodvassal said,
July 26, 2010 at 9:40 pm
@T.J. Crowder and @TheSacredMongoose: Even in peer reviewed journals. As someone who reviews manuscripts I am continually surprised at what authors try to get past. Increasingly often I find myself writing a sentence to the effect that a claim in the discussion and abstract did not reflect the actual data. In a couple of cases it was the exact opposite of the data. Ironically, they really needn’t try to fudge their interpretation of data as I am far less likely to kick back useful negative data because, like @nesburrito I have enough negative data to fill several manuscripts myself.
However, from the opposite corner, as a publishing scientist, I am under the impression that not all reviewers fully read the manuscript. Working one manuscript down the Impact Factor ladder, I’ve had it rejected five times. That’s fine by me, it happens sometimes, and four of those rejections came with critiques that helped me improve the manuscript. The fifth contained comments that very clearly indicated the reviewers had read the abstract and looked at the figures but failed to read the text. Fine, on to the next journal, but it was very clearly a less informed review than the other four. Peer reviewers refereeing a paper are human, and in that great bell curve of humanity there are crap ones too. Sadly also true, I guarantee, for reviewers of clinical trials papers.
glistering said,
July 27, 2010 at 12:01 am
Funnily enough, the last time I was at a conference I presented what could be politely called ‘inconclusive results’.
I was going over work done by someone 30 years ago in the light of more recent theories and it turned out what with all the supposed advances over that period there was not too much more that could be said. Which I felt was interesting in itself.
I even got a comment that it was refreshing to hear someone present such a result (of course that could just have been damning with faint praise)!
martha said,
July 27, 2010 at 12:05 am
@quietstorm Null results are only interesting if it made sense to ask the question in the first place. There are infinite hypotheses but they are not all worth researching. I think a well designed study is one where any result is an interesting one, but often from my own experience studies are designed such that a positive result is very interesting but a negative result does not tell you much. So I’m not sure if publishing ALL null results is the optimal course of action.
Brightonian said,
July 27, 2010 at 10:36 am
@martha – couldn’t null results be published somewhere that isn’t fussy about interestingness? (arXiv perhaps allows this – I don’t know really as I’ve almost no grasp of the process for publishing science results in general, so I might be talking nonsense.) I’d think this would avoid unnecessary repetition of work that comes up with null results.
AndrewKoster said,
July 27, 2010 at 10:45 am
And… we’re back at funding! Because while I agree that not all null results are interesting, they do all take time (and money) to research. Seeing as how funding in academia is so closely linked to publications, it is then necessary to publish all the research done, including that which results in nothing much interesting.
I also do not entirely agree. However little the data may tell you, at the very minimum it tells you your hypothesis was incorrect. While that is often not enough for a publication in the established journals, your only other option is currently to publish it as a technical report, white paper or some other non-peer-reviewed channel which will probably never be read by anybody outside the institute where the research was done. That means that somebody else somewhere else might have the same idea and repeat all the research simply because he doesn’t know that was already tried: wasting the time and money on this negative result not once, but twice, or maybe many times.
I really think there’s a use for something like a Journal of Negative Results in which it’d be possible to publish your “not terribly interesting null results” without having to spin them. Obviously you’d still have to make the effort of explaining why your hypothesis was worth researching in the first place 😛
Sqk said,
July 27, 2010 at 11:52 am
Quietstorm,
That, unfortunately, is the beauty of it. It’s possible to be extremely creative while also being entirely accurate. It’s a little like the difference between ‘precision’ and ‘accuracy’: “I am 8.00000ft tall” is precise, but not remotely accurate.
Likewise, “…came in showing no signs of distress, trauma, confusion or pain, was not in need of immediate medical attention, is physically complete, stable and expected to make a full recovery” can be applied with technical accuracy to a freshly exhumed, scrubbed and boxed skeleton sat on a desk. (‘Complete’ is defined by context and you can ‘expect’ what you like!)
You could even add “seems unresponsive, sits grinning in silence” but that’s just fleshing it out.
msjhaffey said,
July 27, 2010 at 12:56 pm
After attending the annual Sense About Science lecture recently I blogged about the need for research to be better audited.
The issue is that anyone doing any research wants it to “succeed”. We need an approach that is more effective in eliminating the rosy spectacles.
Jon said,
July 27, 2010 at 1:42 pm
@AndrewKoster
I give you www.jasnh.com – Journal of Articles in Support of the Null Hypothesis
leifdenby said,
July 27, 2010 at 3:19 pm
Not related to this story, but you will all enjoy it. Didn’t know where else to post it:
xkcd.com/765/
Idiomatic said,
July 27, 2010 at 3:53 pm
I’m not a scientist but I have worked for a medical research lab as a fundraiser and as chief exec of a medical research charity. I recall the PI of one research group once saying that the answer “no” is also a good answer in science.
sciencecarol said,
July 27, 2010 at 5:56 pm
As an ex-journal editor myself I totally agree that publishing negative results (unless they are unexpectedly negative) in respectable journals is hard. This is because they don’t get cited so don’t contribute to impact factor.
The day that academics stop judging journals based on impact factor is the day when publishers may consider publishing negative results. This will (presumably) also be the day on which funding organisations start allocating funding on merit rather than on citations and on where scientists have published. That will be a great day for science, and I hope it happens soon – but I won’t hold my breath, as it means someone who actually understands all of the science taking charge of the funding. Until then scientists are doomed to be forever repeating the mistakes made in the past but not published.
chrismccabe said,
July 27, 2010 at 11:07 pm
The Journal of Cerebral Blood Flow and Metabolism has recently started a negative results article section within the journal, which is promising: it is extremely important for negative results to be published, both to prevent others from repeating the work and because a result is a result, whether negative or positive. It is no surprise that academic scientists (not clinicians) try to highlight the positive in their results, as there is enormous pressure to publish in order to obtain future grants/fellowships etc., since our salaries depend on it. Most scientists are employed on temporary contracts with no career structure other than moving from one grant to the next.
aka_pigen said,
July 28, 2010 at 7:41 am
In an ideal world a “journal of negative results”, where you could publish negative results as long as the hypothesis was valid and the results sound, would be a great idea. However, we do not live in an ideal world, and the real world of science is currently obsessed with impact (an assessment of the importance of a research area, also known as spin).
Impact is now one of the major evaluation factors determining whether research grants get funding, particularly from the research councils.
Being scientists we do of course try to measure such things – through journal impact factors (the average number of citations a paper in that journal gets), while researchers have h-indices, which are essentially a measure of the number of times their work gets cited (there is a small sketch of that calculation after this comment). Well, good research gets more citations than bad research, so that should be fine – except that in the real world fashionable but poor science still gets more citations than unfashionable but good quality science. But at the end of the day the only thing that is really wrong with these measures is the importance they are given.
As a researcher would you put the spin on your paper and maybe get it into a journal with a higher impact factor than where it would normally go? Well it will increase your chance of bringing in more grant money so you really cannot afford not to!
Journals will have strategies to improve their impact factor, so they can attract better papers and more prestige, and certainly they do not want an impact factor below the cut-off level some universities have for where their employees can publish. Hence more rejections from journals are along the lines of: nice research, the conclusions are supported by the work, publishable, but not high enough impact for us – sometimes even recommending which journal might publish it. You would expect that from the top journals, but the good, average ones too? Well, they cannot afford not to!
This brings me back to “journal of negative results”. Given that it would be for results unpublishable elsewhere it would be unlikely to be widely read and hence get few citations and have a low impact factor.
So as a researcher what do you prioritise:
The paper that you know you ought to write but really is not going to further your career or the impact assessment for your next grant application which if it is funded means that you can keep your job?
Well, I have a mortgage to pay, so really there is no choice – I have got to prioritise the grant. I cannot afford not to!
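For readers unfamiliar with the citation metric aka_pigen mentions above, here is a minimal sketch of the h-index calculation (the citation counts in the example are made up):

```python
# A researcher has h-index h if h of their papers each have at least h citations.
def h_index(citations):
    """Take a list of citation counts, one per paper, in any order."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank          # at least `rank` papers have >= `rank` citations
        else:
            break
    return h

print(h_index([25, 8, 5, 3, 3, 1, 0]))  # 3: three papers each have >= 3 citations
```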
H. E. Pennypacker said,
July 28, 2010 at 10:11 am
I completely understand the worry about the difficulty of publishing null results. But I don’t think it’s as much of a problem as many seem to think (although I acknowledge that in drug trials it is more important than in most fields, my comments below are written with science in general in mind). A null result is really only interesting in the context of a systematic failure by different labs using different procedures to replicate a previously reported finding. This is because there are a number of reasons why an experiment would fail to deliver a positive result (poor design, poor method, statistical issues etc) which are not informative of the hypothesis in question. In any case, anyone who closely follows their own field knows which findings have been replicated and are therefore reliable, and which ones do not seem to be followed up and may be suspect.
On a more practical note, I don’t think the idea of publishing a lot of null findings is realistic anyway. No-one would read a journal of null findings because we just don’t have the time! Already now researchers struggle to keep up with their reading and have to carefully select which papers to read and which ones to put aside for later. A related problem is the strain it would put on the peer review system. Reviewing takes huge amounts of time, and a lot of journals have a hard time finding reviewers and getting them to return their comments in a reasonable time. If you flooded that limited resource with all the null results people have in their drawers, the entire enterprise of science might grind to a halt.
Impact factors have their problems, but one of the reasons why people care about them is that they are to some degree informative. If something gets published in a journal with a very high impact factor, I can be fairly certain that it’s going to be good, and worth reading (of course there are exceptions to this rule, lots of them). If a paper gets a large number of citations, that’s another sign for me that it’s likely to be important to read. If I make what I think is an important finding, I know it will be worth trying to send it to a top journal because I know everyone reads them. If papers weren’t reviewed partially based on their impact, a lot of really important work would be at risk of getting buried and going unnoticed in places like PLoS ONE (which is an example of a journal that doesn’t review for impact and therefore publishes a mixed bag of decent stuff and rubbish, making people less likely to pay attention to it).
All I’m saying is there is some benefit at least to the current state of affairs. Whether these benefits outweigh the negatives is a matter for debate of course.
Sqk said,
July 28, 2010 at 6:06 pm
Sorry, couldn’t resist:
www.phdcomics.com/comics/archive.php?comicid=1174
If it’s on here already somewhere, apologies. If not, I’m sure you’ll enjoy it and the relevance to the first paragraph.
sağlık said,
August 1, 2010 at 11:42 pm
thank you Ben Goldacre and thanks to The Guardian.