Welcome back to the only home-learning statistics and trial methodology course to feature villains. You will remember the comedy factory of the Equazen fish oil “trials”: those amazing capsules that make your child clever and well behaved. A new, proper trial has now been published looking at whether these fish oil capsules work. The researchers took 75 children with ADHD, aged 8 to 18, randomly split the group in half, and gave each child either genuine fish oil capsules or dummy capsules. They measured symptoms on ADHD rating scales and a Clinical Global Impression scale, and found no difference between the two groups. The fish oil pills did nothing, as in many previous studies, so this trial has not been press-released by the company, nor has it been covered in the media.
The funders of this study, Equazen, will doubtless have been disappointed with a negative result. The authors of the study may have been disappointed too. But there was some light on the horizon. They looked at the data more closely, and found that some children did, in fact, respond: “a subgroup of 26% responded with more than 25% reduction of ADHD symptoms and a drop of CGI scores to the near-normal range.”
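It is worth seeing how easily “responders” appear out of pure noise. Here is a minimal Python sketch with entirely invented numbers (the trial’s patient-level data are not public): each child gets a noisy symptom score at baseline and again at follow-up, with no treatment effect whatsoever, and we count how many happen to cross the “25% reduction” threshold.

    import random

    random.seed(3)
    n = 38  # roughly one arm of a 75-child trial
    responders = 0
    for _ in range(n):
        baseline = random.gauss(40, 8)  # symptom score on an invented scale
        followup = random.gauss(40, 8)  # same distribution: no true effect
        if followup < 0.75 * baseline:  # a ">25% reduction in symptoms"
            responders += 1
    print(f"{responders}/{n} 'responders', with zero real effect")

Run it a few times: a respectable chunk of children “respond” on every run, simply because a score that bounces around will sometimes bounce downwards.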
Subgroup analyses are widely derided in academia, and for very good reasons. Imagine the coins randomly distributed throughout your Christmas pudding. If you x-ray it, and follow a very complex path with your knife, you will be able to carve out one slice with more coins in it than the others: but that means nothing. The coins are still randomly distributed throughout the pudding.
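If you want to play with the pudding yourself, here is a quick Python sketch, again with made-up numbers: scatter the coins at random, x-ray, and only then choose your slice.

    import random

    N_CRUMBS, N_COINS, SLICE = 1000, 50, 100
    coins = set(random.sample(range(N_CRUMBS), N_COINS))
    is_coin = [1 if i in coins else 0 for i in range(N_CRUMBS)]

    # A slice chosen in advance holds 5 coins on average (50 * 100/1000).
    # Choosing the slice after x-raying tells a very different story:
    best = max(sum(is_coin[s:s + SLICE]) for s in range(N_CRUMBS - SLICE + 1))
    print(f"richest slice: {best} coins (a pre-specified slice averages 5)")

A slice chosen before you look holds five coins on average; the slice you choose afterwards reliably holds far more, and it means nothing either way.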
And yet this optimistic overanalysis is seen echoing out from business presentations, throughout the country, every day of the week. “You can see we did pretty poorly overall,” they might say: “but interestingly our national advertising campaign did cause a massive uptick in sales for the Bognor region.”
Interestingly, it turns out that you can show significant benefits, using a subgroup analysis, even in a fake trial, where the intervention consists of doing absolutely nothing whatsoever. Thirty years ago, Lee et al. published the classic cautionary paper on this topic in the journal Circulation: they recruited 1073 patients with coronary artery disease, and randomly allocated them to receive either Treatment 1 or Treatment 2. Both treatments were non-existent, because this was simply a simulation of a trial, but they went through the motions of randomising the patients and collecting follow-up data, to see what they could find in the random noise of patients’ progress.
They were not disappointed. Overall, as expected, there was no difference in survival between the two groups. But in a subgroup of 397 patients (characterized by “three-vessel disease” and “abnormal left ventricular contraction”) the survival of Treatment 1 patients was significantly different from that of Treatment 2 patients. This was entirely by chance.
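You can re-run their trick at home in a few lines of Python. To be clear, everything below is invented for illustration: the covariate names, the 30% death rate, the subgroup definitions. We randomise patients to two identical non-treatments, test every subgroup we can define, and count how often at least one comes up “significant” at p < 0.05.

    import random
    from statistics import NormalDist

    def p_two_prop(d1, n1, d2, n2):
        # two-sided p-value from a two-proportion z-test (normal approximation)
        p = (d1 + d2) / (n1 + n2)
        se = (p * (1 - p) * (1 / n1 + 1 / n2)) ** 0.5
        if se == 0:
            return 1.0
        z = (d1 / n1 - d2 / n2) / se
        return 2 * (1 - NormalDist().cdf(abs(z)))

    COVARIATES = ["vessels", "contraction", "diabetes"]  # invented labels

    def fake_trial(n=1073, risk=0.3):
        # two identical non-treatments: 'died' has the same risk in both arms
        patients = [{"arm": random.random() < 0.5,
                     "died": random.random() < risk,
                     **{c: random.random() < 0.5 for c in COVARIATES}}
                    for _ in range(n)]
        pvals = []
        for cov in COVARIATES:
            for level in (True, False):
                sub = [p for p in patients if p[cov] == level]
                a = [p["died"] for p in sub if p["arm"]]
                b = [p["died"] for p in sub if not p["arm"]]
                pvals.append(p_two_prop(sum(a), len(a), sum(b), len(b)))
        return min(pvals)

    random.seed(7)
    hits = sum(fake_trial() < 0.05 for _ in range(1000))
    print(f"{hits / 1000:.0%} of null trials show a 'significant' subgroup")

With six subgroup tests per trial you would naively expect something like 1 - 0.95^6, or around 26%, of these null trials to throw up a fluke; the tests here share patients, so the exact figure drifts a little, but it stays a long way above the advertised 5%.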
You can also find spurious subgroup effects in real trials, if you do an analysis that’s foolish enough. Close analysis of the ECST trial found that the efficacy of a procedure called endarterectomy depended on which day of the week you were born. Base your clinical decisions on that: I dare you. Furthermore, there is a beautiful, almost linear relationship in this trial’s results between month of birth and clinical outcome: patients born in May and June show a huge benefit; then, as you move through the calendar, there is less and less effect, until by March the operation starts to seem almost harmful. If this had been a biologically plausible variable, like age, this subgroup analysis would have been very hard to ignore.
It goes on. The ISIS-2 trial compared the benefits of aspirin against placebo during a heart attack. Aspirin improves outcomes, but a mischievous subgroup analysis revealed that it is not effective in patients born under the star signs of Libra and Gemini. Should those patients be deprived of treatment? Sometimes subgroup analyses really do have a damaging impact on practice: the CCSG trial found that aspirin was effective in preventing stroke and death in men but not in women, and as a result women were undertreated for a decade, until further trials and overviews showed a benefit.
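The arithmetic behind the star-sign gag deserves a moment of its own. If you test twelve zodiac subgroups, each at the conventional 5% level, and pretend for simplicity that the tests are independent, the chance of at least one spurious “significant” finding is 1 - 0.95^12, or about 46%. In Python:

    for k in (1, 6, 12, 20):
        print(f"{k:>2} subgroups: chance of at least one fluke = {1 - 0.95 ** k:.0%}")

Two lines, and by twenty subgroups you are more likely than not to have something “significant” to press-release.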
And sometimes there can be what we might call proper mischief. The CLASS trial compared a painkiller called celecoxib against two older pills over six months: the new drug showed fewer gastrointestinal complications, and so it was prescribed far more widely. A year later, it emerged that the original intention had been to follow up the patients for over a year. The trial had shown no benefit for celecoxib over that longer period: it was only in the subgroup of results at six months that the drug shone.
You are unlikely to find the answers to complex problems like school performance and behaviour in any pill, whether it’s Ritalin or fish oil, and yet, despite the rather desperate anti-establishment swagger of the $60bn food supplement pill industry, time and again we see them using the exact same tricks as the $600bn pharmaceutical industry. And Equazen, we might finally mention, are wholly owned by the £1.6bn pharmaceutical company Galenica.
References:
I’ll bung up full references to all the comedy stats papers looking at subgroup analysis at 3pm today; sorry, got to sprint now!
simontax said,
June 23, 2009 at 4:30 pm
SteveJG: “However, I’m wondering if there could be some form of independent service that evaluates the statistical significance of media reports”
It’s called education, Steve. Or if you prefer the political form “Education, Education, Education”.
Lifewish said,
July 15, 2009 at 11:59 pm
FYI, the paper on subgroup analysis that Ben mentions is called “Clinical judgment and statistics: Lessons from a simulated randomized trial in coronary artery disease”. It’s available freely online here.
Anyone should be able to get the gist of it, but Googling may be required for non-mathmos to understand some technical terms. Hell, I didn’t know what a Mantel-Haenszel test was until now.
I also like the look of this other paper about statistical errors in medical literature.