Ben Goldacre, The Guardian, Saturday 7 May 2011
Some of the biggest problems in medicine don’t get written about, because they’re not about eye-catching things like one patient’s valiant struggle: they’re protected from public scrutiny by a wall of tediousness.
Here is one problem that affects millions of people. What if we had rubbish evidence on whether hundreds of common treatments really work, simply because nobody asked the right research question? A paper published this week looks at how much evidence there was for every one of the new drugs approved by the FDA between 2000 and 2010, at the time they were approved.
You might think drugs only get on the market if they’ve been shown to be useful. But “useful” can mean many different things: for FDA approval, for example, you only need trials to show your drug is better than placebo. That’s nice, but with most medical problems, we’ve already got some kind of treatment. We’re not interested in whether your drug is better than nothing. We’re interested in whether it’s better than the best currently available option.
So it turns out that, of all the 197 new drugs approved over the past decade, only 70% had data to show they were better than other treatments (and that’s after you ignore drugs for conditions where there was no current treatment).
But the problems go beyond just using the wrong comparator: most of the trials we rely on to make real world decisions also study drugs on highly unrepresentative, freakishly ideal patients. These patients are younger, with perfect single diagnoses, fewer other health problems, and so on.
This can stretch to absurd extremes. Earlier this year, some researchers from Finland took every patient who’d ever had a hip fracture, and worked out if they would have been eligible for the trials that have been done on fracture-preventing bisphosphonate drugs, which are in wide use.
Starting with all 7,411 fractures, 2,134 patients get excluded straight off, because they’re men, and the trials have been done in women. Then, from the 5,277 remaining, 3,596 get excluded again, because they’re the wrong age: patients in trials had to be between 65 and 79. Then, finally, 609 fracture patients get excluded, because they’ve not got osteoporosis.
This only leaves 1,072 patients. So the data from the trials on these fracture-preventing drugs are only strictly applicable to about one in every seven patients with a fracture: they might still work in the people who’ve been excluded, though that’s not a judgement call you should have to make; and one problem, in particular, is that the size of the benefit might be different in different people.
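That whittling-down is simple arithmetic, and worth checking. Here is a minimal sketch in Python, using only the figures quoted above:

```python
# Exclusion cascade from the Finnish hip-fracture study, using only the
# figures quoted above.

total_fractures = 7411

after_sex = total_fractures - 2134   # men excluded: the trials were done in women
after_age = after_sex - 3596         # wrong age: trial patients had to be 65-79
eligible = after_age - 609           # excluded for not having osteoporosis

print(f"{eligible} of {total_fractures} fracture patients were trial-eligible")
print(f"roughly 1 in {round(total_fractures / eligible)}")
# -> 1072 of 7411 fracture patients were trial-eligible
# -> roughly 1 in 7
```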
To understand why this matters, finally, we need to go through one more study (written by people I work with, though I don’t know if that’s transparency or a boast). The new “coxib” painkiller drugs are sold on the basis that they cause fewer gastrointestinal bleeds than cheap old painkillers like ibuprofen: and coxibs do seem to do this.
But the trials were conducted in ideal patients, who were at much higher risk of having a GI bleed, and this causes problems when you do a cost-benefit analysis. NICE estimated the cost of preventing one bleed, if you use a coxib instead of an older drug, at £20,000. But that’s a huge underestimate, and here’s why: they estimated the number of avoided bleeds from the figures in the trials, where patients were at high risk of bleeds.
If, instead, you look at the real data on people prescribed coxibs, in a database of GP records, the overall number of bleeds among people getting painkillers is much smaller: so the number of bleeds avoided is also smaller, and so the cost of each avoided bleed is higher: £104,000, in fact.
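To make the mechanism explicit: the cost of preventing one bleed is the extra cost of the drug divided by the number of bleeds it actually avoids, and that number scales with the baseline risk of the people taking it. In the sketch below, the drug cost, the relative risk reduction and the trial-patient risk are all invented for illustration; only the two headline costs above come from the real analysis (the risk gap is chosen so the output reproduces them):

```python
# Why the cost of preventing one bleed depends on baseline risk. The extra
# drug cost, the relative risk reduction, and the trial-patient risk below
# are invented for illustration; the 5.2x risk gap is chosen so the output
# reproduces the two headline figures quoted above.

def cost_per_avoided_bleed(extra_cost, baseline_risk, relative_risk_reduction):
    bleeds_avoided_per_patient = baseline_risk * relative_risk_reduction
    return extra_cost / bleeds_avoided_per_patient

extra_cost = 100          # hypothetical extra cost per patient of a coxib (GBP)
rrr = 0.5                 # hypothetical relative risk reduction from the trials

trial_risk = 0.01                    # hypothetical bleed risk in trial patients
real_world_risk = trial_risk / 5.2   # real-world patients bleed far less often

print(round(cost_per_avoided_bleed(extra_cost, trial_risk, rrr)))        # 20000
print(round(cost_per_avoided_bleed(extra_cost, real_world_risk, rrr)))   # 104000
```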
This explanation might make your eyes glaze over. You assume someone else is dealing with it. And that’s why problems like these don’t get fixed.
ipad41001 said,
May 7, 2011 at 11:20 pm
You make some good points, but the reason for perfect patients is to be able to see side effects clearly.
I’d say that there should be more structured and possibly publicly funded research in the post-approval period, not a denial of approval.
Many cars are safe and effective; which is best for a given use is a separate question, up to and including simply walking 🙂
chris lawson said,
May 8, 2011 at 12:04 am
Nice to see you back in the saddle, Ben.
chris lawson said,
May 8, 2011 at 12:15 am
@ipad41001:
The reason for choosing ideal patients in a trial is to make it easier to see an effect. This is a perfectly acceptable research technique because it makes studies a lot cheaper and faster, which is a good thing. But the more extreme the study group, the harder it is to apply the findings to the rest of the population.
havoc_theory said,
May 8, 2011 at 9:29 am
Isn’t that the reason we have Phase IV?
T.J. Crowder said,
May 8, 2011 at 9:38 am
This extremely narrow focus during approvals suggests that, as a minimum-impact change to the process, regulators should always require Phase IV trials and broaden their scope. One can see the rationale for the narrow focus early on; but once a drug has cleared the current safety checks and the quite low bar of effectiveness, Phase IV could assess not only broader safety but broader applicability. What patients got it? Was it beneficial (as far as one can tell), or not? Constantly reassessing and refining knowledge about the drugs we rely on is surely a good thing…although of course, before assuming that, one would want to design a trial to test that assumption, particularly around the cost implications. 😉
elenacmills said,
May 8, 2011 at 9:50 am
I was thinking exactly the same thing as I was reading this. That’s precisely why large-scale observational phase 4 studies are important when looking at the evidence base for a drug. OK, so they don’t tend to be available immediately post-approval, but I think they should have got a mention in Ben’s article, together with investigator-led audits and clinical practice case studies as well. It’s a little biased otherwise, although I agree with the main point re. RCTs not accurately reflecting real-life clinical practice.
chris lawson said,
May 8, 2011 at 11:25 am
Phase IV studies are not usually able to address the problem Ben has written about. The main benefit of Phase IV studies is to identify complications and side effects that were either too rare, or had too long an induction phase, to turn up in standard RCTs. Phase IV studies are great for, say, discovering that flucloxacillin causes cholestatic hepatitis. What Phase IV studies are not good at is working out the rate at which these events occur, because they essentially require a physician to see a clinical event, suspect that it might be caused by a drug the patient is taking, and then get around to reporting the possibility. Even if a serious ADR is discovered in Phase IV research, you can’t tell from report rates how common the event is. It took further, targeted research to determine that hepatitis events occurred at about 1 in 15,000 exposures to flucloxacillin.
Even if Phase IV studies could measure ADR rates directly, Ben’s point remains that many drugs are approved in the first place on the basis of cost-effectiveness calculations based on high-risk study participants which are then extrapolated to all patients who might seek the treatment. And although Ben didn’t mention it in the article, it is extremely difficult, politically speaking, for any government to withdraw a drug from the marketplace (or even from govt subsidy) on the basis of revised cost-effectiveness estimates, so it’s important to get the best possible estimates in the first place.
SimonW said,
May 8, 2011 at 10:00 pm
Since not all patients can tolerate the best treatment, anything safe, effective and better than placebo may be worth permitting.
It is usually possible to make some sort of comparison based on how much better each drug was than placebo.
If we compare with best current treatment we’d need to do the reverse calculation (possibly through several iterations of best treatment) to work out if a new drug has any merit above placebo, and that would be susceptible to confounding factors like improvement in non-medicinal treatments.
So it’s not obvious to this layman that comparison against the best treatment is an unalloyed benefit – unless you happen to be randomised to the control group.
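The “reverse calculation” SimonW describes has a standard form, the adjusted indirect comparison: two drugs, each trialled only against placebo, are compared by differencing their effects on a log scale. A minimal sketch, with invented trial results:

```python
import math

# Adjusted indirect comparison: inferring an A-vs-B effect when each drug has
# only been trialled against placebo. All trial results below are invented.

def indirect_comparison(log_rr_a, se_a, log_rr_b, se_b):
    log_rr_ab = log_rr_a - log_rr_b        # difference of log effects vs placebo
    se_ab = math.sqrt(se_a**2 + se_b**2)   # uncertainties add, they never cancel
    return log_rr_ab, se_ab

# Hypothetical: drug A halves events vs placebo, drug B cuts them by 30%.
log_rr_ab, se_ab = indirect_comparison(math.log(0.5), 0.10, math.log(0.7), 0.12)

print(f"A vs B risk ratio ~ {math.exp(log_rr_ab):.2f}, log-scale SE {se_ab:.2f}")
# -> A vs B risk ratio ~ 0.71, log-scale SE 0.16
# The combined uncertainty is larger than either trial's alone, and the result
# only holds if the two trial populations were comparable - which is exactly
# the confounding SimonW worries about.
```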
slartibartfast said,
May 9, 2011 at 9:33 am
To try and look at this issue from another angle: we have NO certainty that a given treatment will work in a specific patient for a specific condition. Under these circumstances we can rubbish any clinical trial. What we need to know is that a given treatment is more likely to work than no treatment at all, and we need to know what the risks of that treatment may be (given individual variation we cannot be certain as to what those adverse drug reactions may be). Thus for every patient we conduct a mini-trial to test if the treatment is effective and what the complications are. On a day-to-day basis we do not really get caught up in concerns over numbers needed to treat, or in cost-effectiveness – although perhaps we should. Simply put, all these trials really need to do is to demonstrate that the treatment IS more effective than doing nothing. Then we need to determine whether or not we should treat (not just whether we can treat). This, together with the information on potential risks, should be discussed with the patient so that an informed choice is made. It is not up to trials to determine that a medication should be used, it is up to the informed provider and the informed patient.
chris lawson said,
May 9, 2011 at 11:07 am
slartibartfast,
The informed provider is precisely what Ben is arguing for here. What he is saying is not that RCTs are useless, but that when large providers or advisors to providers (like the PBS in Australia, NICE in the UK, or HMOs in the US) make decisions based on cost-effectiveness, then extrapolating cost-effectiveness from a highly selected, high-risk subset of the population will give wonky results and therefore dubious decisions.
slartibartfast said,
May 9, 2011 at 8:55 pm
Chris, I don’t disagree at all with what you have said, nor do I disagree with Ben. Quite the opposite. I whole-heartedly agree with your last statement. The organizations you have mentioned – and others – seem to generate “recipes” rather than “guidelines” – or maybe it is just the “funders” of the treatments that demand “recipes” in their quest for certainty and to manage costs (without apparently realising that following “recipes” that don’t work – or aren’t actually needed – just adds to cost). I just don’t think that enough people are critical enough in their evaluation of information that has been provided, and I think it is informed, critical evaluation that Ben is encouraging (if not demanding). Frankly, I don’t believe that the Medical Profession is critical enough in its evaluation and the outcome is exactly as you have suggested.
Ronlavine said,
May 10, 2011 at 3:52 pm
Interesting post and string of comments.
I believe part of the issue lies in post-approval drug marketing. For instance, “Drug A can cut your risk of a heart attack by 50% with only a 2% chance of side effects.”
Sounds like a reasonable gamble, doesn’t it? But it all depends on what your odds of having a heart attack are to begin with – the drug may only prevent heart attacks in 1% of the people taking it.
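Ronlavine’s point is the difference between relative and absolute risk, and it reduces to a couple of lines of arithmetic. In this sketch the 50% relative reduction is taken from the comment above, and the baseline risks are invented:

```python
# Relative vs absolute risk: a "50% relative reduction" means very different
# things at different baseline risks. The 50% figure comes from the comment
# above; the baseline risks are invented for illustration.

def absolute_benefit(baseline_risk, relative_risk_reduction):
    arr = baseline_risk * relative_risk_reduction   # absolute risk reduction
    return arr, 1 / arr                             # (ARR, number needed to treat)

for baseline in (0.02, 0.20):
    arr, nnt = absolute_benefit(baseline, 0.5)
    print(f"baseline {baseline:.0%}: absolute reduction {arr:.0%}, NNT {nnt:.0f}")
# -> baseline 2%: absolute reduction 1%, NNT 100
# -> baseline 20%: absolute reduction 10%, NNT 10
```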
Veronica said,
May 10, 2011 at 3:52 pm
By stipulating that every drug on the market should have to be better than every other drug, you are setting the hurdles higher than those of any other product. What’s wrong with “me too” if the company thinks there is money in it? Customers can choose between more than one brand of jeans, smartphones, beer etc., why not more than one brand of each prescription drug if the market will stand it?
Ronlavine said,
May 10, 2011 at 4:00 pm
I work in a “functional medicine” field (chiropractic), and the working model is that the procedures I perform help the body to self-regulate more effectively.
If indeed there are methods that restore the body’s intrinsic self-regulation, then there should be a different standard for their evaluation in clinical trials.
It’s not really a double standard – it’s actually a Bayesian analysis that takes into consideration a pre-existing expectation. As an example, I have the expectation that improving the body’s ability to adjust heart rate via its already-designed regulatory channels will have a more benign effect than introducing a non-physiological chemical into the body.
I think that’s a reasonable expectation.
Do you see a practical way to introduce this element of thinking into the standard model of medical research? Or am I way off base here?
Ron
decium said,
May 11, 2011 at 11:08 am
Ronlavine : “I work in a ‘functional medicine’ field (chiropractic)”. Hehehehe. Cheers. I needed a belly laugh.
skyesteve said,
May 12, 2011 at 4:41 pm
As someone who prescribes on a daily basis (and, on that basis, has a personal responsibility for tens of thousands of pounds of NHS prescribing costs) and who also has a personal interest in drawing up local Formularies, I am quite simple in my approach to new drugs.
1. Does the new drug do something new or is it merely another member of the pack?
2. If it does something new is that clinically relevant and worthwhile? (some of the recent anti-diabetics spring to mind as being of dubious value, for example)
3. If it is worthwhile, can the cost and potential side effects be outweighed by the benefit? (I believe firmly we have an obligation in the NHS to do the most for the most rather than spend £300,000 on a new anti-cancer drug that prolongs life by 3 weeks; the arithmetic is sketched below.)
4. If it’s just another pack member is it better than existing ones? If not does it have a better side effect profile than the existing ones? If not is it cheaper than the existing ones? If by now the answer is still “no” then I am not interested.
And I think about number needed to treat every working day.
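For what it’s worth, the arithmetic behind skyesteve’s point 3 is easily made explicit. The figures below are the ones quoted in the comment; the NICE threshold is a widely quoted ballpark, added for context:

```python
# Rough cost-per-life-year arithmetic behind point 3 above. The figures
# (GBP 300,000 for 3 extra weeks of life) are the ones quoted in the comment.

cost = 300_000
extra_weeks = 3

cost_per_life_year = cost / (extra_weeks / 52)
print(f"GBP {cost_per_life_year:,.0f} per life-year gained")
# -> GBP 5,200,000 per life-year gained
# For context, NICE's usual willingness-to-pay threshold is on the order of
# GBP 20,000-30,000 per quality-adjusted life-year.
```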
slartibartfast said,
May 12, 2011 at 9:11 pm
I get your point, Skyesteve, but the biggest number needed to treat that needs to be considered is the ONE sitting in front of you. If that is what you refer to, then I’m with you all the way.
Bruce44 said,
May 13, 2011 at 12:27 pm
“But the problems go beyond just using the wrong comparator”
It’s not the wrong comparator, Ben. It’s exactly the right comparator. That a new drug is beneficial, but not as beneficial as an existing drug, is a good argument for medical practitioners preferring the existing drug. It is no justification whatsoever for the regulator to refuse approval for the new drug.
chris lawson said,
May 14, 2011 at 9:29 am
@ Bruce44,
I don’t think Ben said that new drugs should not be approved for marketing, or that RCT against placebo is worthless. He was talking about the way such data are presented to clinicians and to regulatory groups leading to poor clinical decisions.
chris lawson said,
May 14, 2011 at 9:43 am
@slarti,
I take your point that the person being treated is the most important factor in making medical decisions, but I would take issue with the way you phrased it. “The biggest number needed to treat that needs to be considered is the ONE sitting in front of you.” It’s a snappy sentence, but to me it looks like the sort of statement anti-EBM groups make all the time.
NNT only makes sense as a population measure. Applying that number to an individual is a vital clinical skill (and I’m assuming that’s what you were trying to get across), which is why we teach doctors basic stats and epidemiology even if they never go on to do research or public health.
In terms of clinical effectiveness, the single best intervention I undertake in general practice is vaccination, especially childhood vaccination. Through this one activity I probably save 40-50% of the children I see from dying before their fifth birthday. But the problem is that I can’t tell which of those children have been saved. To make matters worse, about one in a million of the children who receive MMR will develop encephalitis and die or suffer serious lifelong brain injuries. And in those cases we will almost certainly know that the vaccine was the intervention that led to the encephalitis.
I know we’re both in favour of EBM, so I’m not trying to argue against your position, just that particular sentence construction.
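The population nature of NNT that chris lawson describes is easy to see in the arithmetic. In this sketch the event rates are invented; the one-in-a-million harm figure is the one quoted in the comment:

```python
# NNT and NNH are population measures: reciprocals of risk differences
# measured across whole trial populations. Event rates below are invented;
# the one-in-a-million harm figure is the one quoted in the comment above.

def number_needed_to_treat(control_event_rate, treated_event_rate):
    return 1 / (control_event_rate - treated_event_rate)

def number_needed_to_harm(harm_rate):
    return 1 / harm_rate

print(f"NNT: {number_needed_to_treat(0.05, 0.02):.0f}")   # ~33 treated per event prevented
print(f"NNH: {number_needed_to_harm(1e-6):,.0f}")         # 1,000,000 per encephalitis case
# Neither number identifies which individual benefits or is harmed, which is
# why applying them to the one patient in front of you is a clinical judgement.
```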
slartibartfast said,
May 15, 2011 at 4:11 am
@Chris Lawson
I think we have been down this EBM discussion pathway before – as you have mentioned at the end of your post. The reason I have raised this point has much to do with your own comments with regard to NICE etc. and to QOF. It does seem that, on many occasions, what is actually being treated is the guideline or the QOF and not the patient – all arguments to the contrary included. EBM is vital if informed decisions are to be made, but it shouldn’t make the decision for you – if only because it is not your decision to make alone. What I am trying to advocate is true patient-centred care (not some misunderstood registrar interpretation of it – “giving patients what they want” nonsense) and not simple doctor-centred or guideline-centred care. I am certainly not accusing you of practising either of these latter forms of care, but, unfortunately, there are many who do, and hence my statement regarding the ONE in need of treatment. Patient-centred care and EBM are NOT mutually exclusive. Indeed, you cannot practise the former effectively if you are not well-versed in the latter.
As for immunization, it is “modern” medicine’s single biggest contribution to increased life-expectancy – some may say that, on a population level at least, it is the only contribution! Many of our interventions may increase the individual’s life-expectancy but not fundamentally alter the population statistics. Immunization has been so effective at doing what it was intended to do that many seem to have forgotten those benefits.
hypatiagoldennumber said,
May 16, 2011 at 1:53 pm
Dear Ben, could you list (at least) 10 features to look out for in order to avoid “Bad Science”, from scientific misconduct to fraudulent results?
Thanks, and congratulations on your work. It helps a great many people!
chris lawson said,
May 18, 2011 at 8:15 am
@slarti,
It is uncommon, in my opinion, to see someone given inferior treatment because clinical guidelines were applied too rigorously by their treating doctors. It does happen sometimes, and I think it’s very reasonable to raise it. But EBM means evidence *based* medicine, not RCT-Only Medicine, and every EBM textbook and article I have ever read will talk about the difficulties of applying findings from a study to an individual patient and will encourage clinicians to consider the individual needs of the patient when interpreting evidence.
Far more of a problem is sensible guidelines being ignored. Common pitfalls include doctors prescribing too many antibiotics, parents refusing vaccinations for their children, ministers ignoring the advice of expert committees for political expediency, journals accepting articles that do not meet appropriate standards, news services ignoring caveats when reporting new findings, and so on.
I can think of a few examples where EBM was applied far too stringently — say when bisphosphonates were not available to men with severe osteoporosis because all the studies were done in women — but those few examples are overshadowed by the opposite problem, and I actually think these examples are cases where EBM was *not* implemented, and selective evidence was used to provide justification for a non-clinical agenda. (In the bisphosphonate case, it was a matter of saving govt spending.)
slartibartfast said,
May 18, 2011 at 10:34 am
@Chris Lawson
Many patients have multiple medical conditions. Guidelines exist for the treatment of the individual conditions. As a consequence patients may end up on multiple agents and, as a further consequence, the risk of interactions and adverse effects increases. The prescriber may diligently and intelligently apply the guidelines but fail to recognise that what is being treated is a consequence of another treatment or combination of treatments. Each condition is considered and treated on its individual merits in accordance with guidelines. The treatment itself may not be inferior, in fact it may be justified by all the evidence and be considered superior for the given condition, but the outcome for the patient is inferior and harm may result. I would argue that this is not uncommon, but occurs more frequently than we may realise. I would also argue that this sort of situation is far more likely to occur when a patient sees multiple providers, each possibly a specialist in their field delivering “best care” but essentially unaware of other treatments being implemented by another provider.
EBM remains a tool. If you were to argue that a poor workman usually blames his tools, you would find that I would not disagree. The problem isn’t the tool, it is the application.
I like your emphasis on “based” as I think this is the part that is often forgotten. I would also like to think that it is the part that is less likely to be forgotten by the “generalist”, especially one who engages with the patient in patient-centred clinical method.
Your last thought is a point well made and with which I whole-heartedly concur.
There is also an interesting discussion – in my opinion at least – to be had about what is really happening when a provider provides treatment and why. But that is another matter entirely.
glynisrose said,
September 7, 2011 at 1:14 pm
It’s a pity that no-one ever asks the question ‘Does this drug actually work?’ The trouble is that with so-called clinical trials the answer is decided upon first, then the research is done to support the answer. That’s wrong; from experience I can say that it meant I had the wrong medication (which didn’t actually work) for 7 years.