Ben Goldacre, The Guardian, Saturday 14 May 2011
Politicians are ignorant about trials, and they’re weird about evidence. It doesn’t need to be this way. In international development work, resources are tight, and people know that good intentions aren’t enough: in fact, good intentions can sometimes do harm. We need to know what works.
In two new books published this month – “More Than Good Intentions” and “Poor Economics” – four academics describe amazing work testing interventions around the world with proper randomised trials. This is something we’ve bizarrely failed to do at home.
Is business training useful? There’s a randomised trial on it in Peru. What about business mentors? In Mexico, they ran a randomised trial. Now think about all the different initiatives in the UK to support small businesses, or to help people find work. Do they work? Nobody knows: without trials, you can have no clear idea.
Randomised trials are our best way to find out if something works: by randomly assigning participants to one intervention or another, and measuring the outcome we’re interested in, we exclude all alternative explanations for any difference between the two groups. If you don’t know which of two reasonable interventions is best, and you want to find out, a trial will tell you.
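The logic of that paragraph can be sketched in a few lines of code — a toy illustration with made-up function names, not a real trial protocol:

```python
import random
import statistics

def run_trial(participants, intervention_a, intervention_b):
    """Toy randomised trial: randomly assign participants to one of
    two interventions, then compare mean outcomes between the arms."""
    assigned = participants[:]
    random.shuffle(assigned)          # random assignment is the key step
    half = len(assigned) // 2
    arm_a, arm_b = assigned[:half], assigned[half:]
    mean_a = statistics.mean(intervention_a(p) for p in arm_a)
    mean_b = statistics.mean(intervention_b(p) for p in arm_b)
    return mean_a, mean_b
```

Because assignment is random, any systematic difference between the two mean outcomes can be attributed to the interventions rather than to who chose which one.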
Microfinance schemes help small producers buy in bulk to make larger profits, and they change lives. But are group-liability loans better, because people default less, so the project is more sustainable? Or do anxieties about shared responsibility restrict recruitment? Some academics ran a trial.
Do free uniforms improve school attendance, especially in pupils who don’t own one at all? Someone ran a trial. Contingent payments improve attendance: but what’s the best time to pay, and how? There’s another trial. What about streaming in Kenyan schools, with high and low ability classes? Do all kids do better? Someone ran a trial. Maybe different strategies to encourage saving work best in different places? Innovations for Poverty Action ran a series of trials to find out, in the Philippines, in Bolivia, in Peru.
I won’t tell you the results, for any of those projects: because this isn’t about good news on what works, or bad news about what doesn’t. What matters is that someone ran a randomised trial and found the answer.
This week the papers and parliament were filled with uninformed wittering on sex education. If the goal is to delay sexual activity, or reduce sexually transmitted infections, and you don’t know what age to start, or what to teach, then stop wittering: define your outcome, randomise schools to different programmes, and you’ll have the answer by the end of next parliament.
Do long prison sentences work? At the moment sentences are hugely variable anyway: so randomise properly and run a trial. Different teaching approaches? Run a trial. Harder exams? Run a trial. Job-seeking support? Run a trial. This isn’t rocket science: the first trial was in the bible. And I’m certainly not saying these are the best UK policy trials you could run. In fact, the most important part of evidence based policy is identifying where there is uncertainty.
So here is my fantasy. We sack the Behavioural Insights Team – all they’ll do is overextrapolate from behavioural economics research – and open a Number Ten Policy Trials Unit instead.
They sit down to write a giant list of unanswered questions, for situations where we don’t know if an intervention works: this will be most of them. Then we filter down to questions where a randomised trial can feasibly be run. Then we do them.
This won’t cost money: it will save money, in unprecedented amounts, by permitting disinvestment in failed interventions, and it will transform the country. It’s efficient, it’s sensible, and it will never happen, because politicians are too ignorant of these simple ideas, too arrogant to have their ideologies questioned, and too scared – let’s be generous – of hard data on their good intentions.
Chandlen said,
May 23, 2011 at 2:21 pm
Faith in randomized trials in policy has been growing. There has long been an aversion to randomized trials in academic policy analysis and evaluation, based on concerns like:
*People are so different, how can you possibly run an effective randomized trial?
*If we’re interested in generalizing the policy to a greater population, analysis about a particular sample doesn’t tell us much.
*Randomized trials are immoral because they deny the control group service (despite the fact that SOMEONE is getting denied service anyway due to insufficient resources).
*The ability to leave the control group will destroy any experiment.
I don’t really agree with these reasons (some don’t matter, and some can be dealt with by an effectively performed experiment), and the climate about it in the policy literature is changing, too. However, there are still a few good criticisms that I think hold water. Not a reason to abandon randomized controlled trials, obviously, but things to keep in mind as to why it won’t work all the time:
*Policy implementation is spotty. Whatever your policy is, when you actually put it into action some people will not be as good at implementing it as others. Some administrators won’t care and others will if it’s a top-down program; some regions may have a political aversion to it and others won’t; and so on. Additionally, this sort of thing is difficult to quantify, so it’s not as if you can just set up a dose/response model and fix everything. What this means is that it’s sometimes difficult to tell what your “treatment” actually IS. If your experiment tells you that it works or doesn’t work, what is that actually telling you?
*Randomized trials work great for some policies and not for others. Specifically, any policy which has systemic change or can only really be randomized at a regional level becomes very difficult to analyze this way. It’s simply very rare that a researcher will have the funds or the political backing to get a reasonable sample size when she has to randomize over regions. This isn’t necessarily a problem (just use randomized trials where they work, don’t where they don’t) but it can mean that as the policy culture changes, it becomes easier for a politician to challenge any possible policy by requiring an experiment. Sounds like a dream, but this also means that any policy requiring a systemic change will hardly ever pass.
Chandlen said,
May 23, 2011 at 2:47 pm
To be clear, I think randomized trials on policy are a very good thing, and agree with the title. I just wanted to point out that it’s not as straightforward as it seems like it should be.
2wanderers said,
May 23, 2011 at 4:24 pm
The problem with politicians is that a lot of them prefer to have no evidence. No evidence lets them implement their ideology without being required to justify it with anything more than an emotional appeal.
We’re seeing this in Canada right now, where one of the most heavily studied drug programmes is a huge (and super-cheap) success. But because it’s a harm reduction programme instead of a criminalisation and enforcement programme, the current government is fighting to shut it down. Harm reduction just doesn’t fit with the “law and order” – read: lock ’em up and throw away the key – ideology of our Conservative government.
wolfkeeper said,
May 23, 2011 at 5:28 pm
Oh sure, it *sounds* like a good idea to do more trials for political decisions.
But where are the trials that show this helps (get votes for politicians)?
;-P
bsekula said,
May 23, 2011 at 8:10 pm
Never going to happen, although it should. Social programs are too valuable and too emotional to ALL politicians.
John Dixon said,
May 23, 2011 at 8:18 pm
Interesting bit about the biblical trial. There’s also an instruction to “Test all things; hold fast what is good” in 1 Thessalonians 5:21 (NKJV). Not that I go about quoting the Bible on a regular basis, obviously!
drowned said,
May 23, 2011 at 8:51 pm
@Ben – randomised trials of policy are tricky for many reasons, particularly isolating the effect of the independent variable and generalising beyond the study population – something not even the (relatively) easy-to-control medical trials get right as often as one might wish. Never mind the lack of political backbone necessary to undertake them and act on the results. And you suggest it could be done in a single parliament, but that presumes no change in policy objectives in that time, and policy is ultimately a normative issue, as it should be. That doesn’t mean RCTs shouldn’t be attempted, but if people push for randomisation only then not much progress will be made. There might be more mileage (kilometerage?) in pursuing alternative study designs or analysis frameworks, such as difference-in-differences or propensity score matching, that attempt control and are less reliant on collecting years of data prospectively.
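For what it’s worth, the core difference-in-differences calculation is only a few lines — a minimal sketch over group means, with no covariates, clustering, or standard errors:

```python
def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Difference-in-differences: the treated group's before/after
    change in mean outcome, minus the control group's change, which
    nets out any trend common to both groups."""
    mean = lambda xs: sum(xs) / len(xs)
    return (mean(treated_post) - mean(treated_pre)) - \
           (mean(control_post) - mean(control_pre))
```

If both groups would have drifted upward by the same amount anyway, that common drift cancels, leaving only the effect attributable to the policy.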
@2wanderers – you mistake a normative issue for a positivist one. If the objective is harm reduction then one policy is appropriate; if the objective is taking drug users off the street another policy is needed. We may agree on harm reduction and therefore accept the evidence, but if Harper’s interest is so-called ‘law and order’ then the evidence you cite isn’t relevant.
zamboni said,
May 24, 2011 at 8:39 am
my prediction is that policy RCTs will be systematically contested. and for good reasons. say you’re randomized-testing sex education in two adjacent towns. what happens if religious fervor sweeps over one of them? or there’s a new charismatic politician in town with a sex-education agenda? or if a shocking story about some teen pregnancy makes it to town news? or if student mobility increases dramatically in one town? how do you control for the effects of such events? it would be like letting your subjects in a drug-related RCT medicate themselves freely during the test.
i submit the reason RCTs are so frequent in some areas of policy (drug testing, perhaps criminology) and not in others is related more to such difficulties than to political incompetence.
botogol said,
May 24, 2011 at 11:53 am
one way we could encourage randomised trials by stealth is to empower local authorities in many areas, and then watch and evaluate the different policy decisions. In the US they generate a lot of comparative data by identifying similar groups in different states using different interventions.
Alas, we are moving in the opposite direction: have different policies and everyone shouts ‘postcode lottery’. Sigh.
Ulrich said,
May 24, 2011 at 12:01 pm
“Do long prison sentences work? At the moment sentences are hugely variable anyway: so randomise properly and run a trial.”
Well, randomizing criminals to longer or shorter sentences is an idea that probably wouldn’t please an ethics committee. (I guess that’s why the linked article reports on a retrospective study using a sample of matched pairs, not an RCT.)
Bob Dowling said,
May 27, 2011 at 8:08 am
How will you stop the courts (bastions of statistical wisdom that they are) from letting everyone in the “losing” half of the trial sue the government for millions?
ferguskane said,
May 27, 2011 at 9:05 pm
@ Zamboni
“…. say you’re randomized-testing for sex education in two adjacent town. what happens if religious fervor sweeps over one of them? or there’s a new charismatic politician in town with a sex-education agenda? or if a shocking story about some teen pregnancy makes it to town news? or if student mobility increases dramatically in one town? how do you control for the effects of such events?….”
Now I’m not an RCT expert, but you’d be a fool to randomise to just two schools in separate towns – although it would make ‘randomisation’ easy! Due to the cluster effects that you have described, that would be a bit like reducing your group n to 1. So to do the above test, you’d want to do something like:
A. randomise at a pupil level (which would have serious contamination problems, but might still be useful if the effect was large enough),
or B. Randomise at a school level and use multiple schools. With the latter, you’d still want to use statistics that account for the clustering (in this case, correlations due to school membership).
“Well-designed” is kind of implicit in Ben’s call for RCTs.
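Option B can be sketched in a few lines (a hypothetical helper; the unit of randomisation is the school, not the pupil):

```python
import random

def cluster_randomise(schools, seed=None):
    """Randomise whole schools (clusters) rather than individual
    pupils, so classmates never land in different arms; the analysis
    must then account for within-school correlation."""
    rng = random.Random(seed)
    shuffled = list(schools)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]   # (treatment, control)
```

The price of cluster randomisation is effective sample size: twenty schools give you twenty units of randomisation, however many thousands of pupils they contain.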
ferguskane said,
May 27, 2011 at 9:09 pm
@Ben. As politicians are ignorant of these issues, and you are feeling ambitious, is there any way to run training for them (do politicians have CPD?!)? It could be done in small groups.
misspiggy said,
May 29, 2011 at 10:15 pm
Working in international development I feel tempted to say ‘I think you’ll find it’s a bit more complicated than that…’
Such is the emphasis on randomised trials in development that it’s becoming difficult to get funding from the big aid donors without evidence from them. In humanitarian crises, where denying people an intervention on a randomised basis would result in their deaths, it’s often not appropriate. And if you conduct trials on whole populations in one location based on their need for the intervention, it becomes very difficult to get comparison data.
If you’re targeting an intervention at particularly excluded groups, such as disadvantaged ethnic minorities, street children, or poor people with disabilities, again you can’t randomise and it is usually impossible to get a big enough sample together to draw out strong enough evidence.
Randomised trials are great for public policy aimed at a national population, or at a large and not-too-badly deprived group of people, but not for the poorest and most excluded people or places. These are the most urgent priorities for international development, as deeply inequitable societies are being shown to hold back stability and economic growth, as is the extreme deprivation found in conflict and crisis affected countries.
The school uniforms study is an interesting example. It is often used to argue that having a school uniforms policy when uniforms are a barrier to education is crazy. Unfortunately, what you actually now see is quite a lot of funding for school uniform projects – donors often like to feel they’re doing something concrete and quick that has been proven ‘to work’, rather than asking complex questions about the settings they are trying to assist.
But yes, many developing-country governments do try to use evidence to inform policy. When they only have access to evidence from other countries, things can go awry. Old evidence on teaching in French to immigrants in urban Canada has been used to support teaching in English in places like rural Southern Sudan – where children have no English books, no English TV or radio, and no one using English around them. Unsurprisingly, the education results are tragically bad, but governments often cling to the only validated evidence they have.
If we want policies that genuinely tackle disadvantage and inequity, it’s key to fund studies in the places where an intervention is likely to be carried out. It’s also vital to find additional ways of testing that don’t rely on randomised trial populations.
lmsava said,
May 31, 2011 at 9:34 am
Having worked on policy evaluation as an external contractor for central government, the biggest problem I came across was possible “ethical implications”. Not allowing some people to participate in whichever intervention you were evaluating could have been considered “denial of treatment”. That sounds ridiculous to people who know the purpose of trials, but civil servants who don’t simply fear the backlash.
The best you can do is a quasi-experimental method using propensity score matching, and even that is a small victory for good research. More often than not, these evaluations are intended to be confirmation of a policy that the government is already committed to on ideological grounds – and believe me, government departments are not averse to making rather substantial changes in the editing process that follows submission of an evaluation. One department fundamentally changed the conclusions of a report that my previous organisation produced, and we subsequently had to go to court to have our name removed from the publication. That department clearly wanted the fig leaf of an independent report by an independent organisation to justify its policy.
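As a sketch of the matching step — a toy nearest-neighbour pairing on an already-estimated propensity score; real evaluations estimate the score with a logistic model and use calipers or matching with replacement:

```python
def match_nearest(treated, controls):
    """Pair each treated unit with the control whose propensity
    score (probability of receiving the intervention) is closest,
    so outcome comparisons are made between similar units.
    Inputs are (unit_id, score) tuples."""
    pairs = []
    for unit_id, score in treated:
        best = min(controls, key=lambda c: abs(c[1] - score))
        pairs.append((unit_id, best[0]))
    return pairs
```

The comparison of outcomes across matched pairs then approximates the treated/control contrast a trial would have delivered, to the extent that the score captures everything that drove selection.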
Politicians are not ignorant of properly conducted policy evaluations. They just don’t like them because they can bring uncomfortable results and any policy that is subsequently abandoned is jumped on by media and opposition as a u-turn or sign of weakness. The juvenile nature of British politics offers no place for high quality evidence.
Not on the subject of RCTs, but look at the fallout from a decent analysis of the ID card policy under the last government. The research contradicted the government so they went to town on the independent researchers from the LSE:
blogs.lse.ac.uk/politicsandpolicy/2010/06/07/how-academic-research-has-impact-but-not-always-what-the-minister-wanted-the-story-of-the-lse-identity-project/
fontwell said,
June 3, 2011 at 5:34 am
I have to agree with the opinion that politics isn’t about backing what works, it’s about backing what you want to work and getting so worked up about it that rational debate is impossible. Until people stop buying the Daily Mail we can be pretty sure that finding the truth is not one of their priorities.
PC49 said,
June 4, 2011 at 10:17 am
@Ben It would appear that social media sites like to place their own interpretation on trial results… When I posted a link to the Kenyan trial about free uniforms improving school attendance onto FB, the quote FB chose to put next to the link was “Providing school uniforms has only a very minor impact on student attendance, and was not a cost effective strategy to promote education in Busia, Kenya.” – the exact opposite of what the trial concluded. Maybe the Behavioural Insights Team are using media sites too?
PC49 said,
June 4, 2011 at 10:45 am
Ah… OK… Maybe I can forgive FB… Someone’s hidden that quote in the metadata on the povertyactionlab webpage. Nothing to do with FB at all – other than blindly copying the metadata, of course!
PC49 said,
June 4, 2011 at 11:49 am
Ah… I think I can forgive povertyactionlab too. If you look at the headline figure in the opening paragraph – “We find that giving a school uniform reduces school absenteeism by 44%” – you’d quite rightly think wow, and that the povertyactionlab comment was unjustified. However, when you poke into the data more thoroughly, you find that baseline school attendance was already at 85%, and providing a free uniform raised this by 6.4 percentage points. True, that is a reduction in absenteeism of roughly 44%, but it isn’t as impressive or as cost-effective as you might think. So, note to self (and any politicians reading this): beware of headline figures. Study all of the trial data before forming your own conclusions.
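The arithmetic is worth spelling out, because the relative and absolute figures diverge so sharply when baseline attendance is already high:

```python
# Figures as quoted above: baseline attendance 85%,
# so baseline absenteeism is 15 percentage points.
baseline_attendance = 0.85
absenteeism = 1 - baseline_attendance        # 0.15
attendance_gain = 0.064                      # +6.4 percentage points

# The headline "reduces absenteeism by 44%" is the relative figure:
relative_reduction = attendance_gain / absenteeism
print(f"absolute gain: {attendance_gain:.1%}, "
      f"relative fall in absenteeism: {relative_reduction:.0%}")
```

With these rounded inputs the relative fall comes out at about 43%, close to the 44% headline: both numbers are true, but only the 6.4-point absolute gain tells you what the money bought.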
WilliamJay said,
June 25, 2011 at 8:21 pm
I’d also add that a proper randomised trial would require several years; the trial on long sentences would, by definition, take a long time.
That of course makes them politically useless, as they don’t fit a four-year electoral cycle. Imagine Ken Clarke standing up in the House of Commons to say: “I am announcing a longitudinal study into our problems with crime. In ten years I will know what to do!” The tabloids would rip him apart: ‘ditherer’, ‘time-waster’, ‘asleep on the job’, ‘do we pay these people to just sit around?’. Anyway, most politicians expect to have made their getaway in a decade’s time.