Here’s our Cabinet Office paper on randomised trials of government policies. Read it.

June 20th, 2012 by Ben Goldacre in evidence based policy | 26 Comments »

I’ve spent a lot of time arguing that government should be more evidence based, and that wherever possible, we should do randomised trials to find out which policy intervention works best. We often have no idea whether the things we do in government actually work or not, and achieve their stated goals. This is a disaster.

So, with my grown up hat on, here’s a Cabinet Office paper I co-wrote with some government people on exactly this topic. We explain why randomised trials of policy are so powerful; we explain exactly how to do them; and we explain how to identify a meaningful policy question that can be explored cheaply in a good quality trial. 

We also show that policy people need to have a little humility, and accept that they don’t necessarily know if their great new idea really will achieve its stated objectives. We do this using examples of policies which should have been great in principle, but turned out to be actively harmful when they were finally tested.

Finally, we address – and demolish – the spurious objections that people often raise against doing trials of policy (like: “surely it’s unfair to withhold a new intervention from half the people in your trial?”).

Trials are used widely in medicine, in business, in international development, and even in web design. The barriers to using them in UK policy are more cultural than practical, and this document will hopefully be a small part of a bigger battle to get better evidence into government.

The paper also describes, for the first time, several fun examples of trials that have been conducted in UK government over just the past year, reporting both positive and negative findings. These trials all test small, modest changes in policy – and ones that are ideologically uncontroversial – because this is the best way to get trials adopted more widely.

What’s more, they’ve all been run by a small group of very smart people working out of the Cabinet Office, who have quietly set up what is effectively a randomised trials unit in government. There are quite a few people in the civil service who seem to be on board for all this, so it will be interesting to see if the idea catches on.

Anyway, I think (I hope!) that the paper is readable and straightforward, like the Ladybird Book of Randomised Policy Trials, and I really hope you’ll enjoy reading it. It’s a good primer on basic research methods, and on how to do a trial properly in any domain, with clear examples taken from the real world of medicine, business, teaching, job centres, web design, and more. The people I wrote it with are a mix of supersmart civil servant policy wonks and academics.

To be clear: this is a long read, and there’s a ton of material in these 30 pages. It’s free to download here:

(I read PDFs like this with iAnnotate on iPad, or more commonly on my Kindle: email the file to your Kindle’s email address as an attachment with “convert” in the subject line and PDFs come out very nicely).

Sorry to be absent, by the way: I’ve been working. There’s some fun stuff coming after the summer. Until then you can find me on twitter and on my scrappy other blog, and here’s my TED talk if you’re procrastinating. But most importantly of all: read our paper, and tell your policy wonk friends about it.


26 Responses

  1. JdeP said,

    June 21, 2012 at 9:55 am

    I am a big fan of Ben’s work, but looking at the new document I am almost immediately struck by a highly and deliberately misleading representation of data.

    Fig. 2 plots the numbers of RCTs undertaken (vertical axis) against the decades of the 20th century (horizontal axis) in the fields of (i) Health and (ii) Social Welfare, Education, Crime and Justice. The former shows a steep upward curve, the latter a shallow curve. The graph is used to demonstrate that RCTs are not much used in the latter field as compared to the former, and to imply that the latter field is somehow falling short of the commendable example set by the field of Health.

    But this is completely misleading: if there were a total of 10,000 decisions made in the former field, all using RCTs, and only 100 decisions made in the latter field **also using RCTs all the time**, we would get a graph like Fig. 2, despite the fact that BOTH fields already use RCTs for 100% of their decision-making.


  2. jamarton said,

    June 21, 2012 at 12:53 pm

    Just some feedback. I did this to read this document:

    “and here for the PDF:

    (I read PDFs like this with iAnnotate on iPad, or more commonly on my Kindle: email the file to your Kindle’s email address as an attachment with “convert” in the subject line and PDFs come out very nicely).”

    In this case, on a Kindle, the PDF loaded with the text in large out-of-order fragments.

  3. UrbanAchieverAndy said,

    June 21, 2012 at 1:32 pm

    Sadly, Ben, surely the big problem is this:

    If government policies were set by pursuing ideas that had been proven to work and ignoring those that don’t, it would leave no room for the ideology-driven farcical b*llocks that currently passes for informed decision-making. It would be the end of “politics” (in a good way) but, for that reason, will never happen.

    You’d have to reverse the entirety of the coalition’s economic policy for one. And what on earth would Gove do? Etcetera.

  4. lukebarnes said,

    June 22, 2012 at 12:49 am

    I agree with JdeP ‘s point about figure 2. It’s a very bad idea to start a report about using evidence to inform decisions with a misleading plot. The plot is unnecessary.

    Other than that, the document is rather good! Figure 3, for example, is great. (Error bars? or would they be a bit confusing?)

    A few typos:
    * Step 5 heading, page 27: robust is misspelled (or is it misspelt?).
    * Page 28, 2nd column: “thees” should be “these”.
    * Page 32, 1st column: Sentence starts “be particularly pertinent …” has no capital at the start, and I’m pretty sure it’s a sentence fragment.

  5. muscleman said,

    June 22, 2012 at 2:53 pm

    I have just downloaded the pdf and added it to my Kobo eReader by simple drag and drop. It displays perfectly as pdf’s seem to on this device.

  6. muscleman said,

    June 22, 2012 at 3:28 pm

    @JdeP, I take your point, though I think you are too harsh. Going to a % plot would be extremely fraught and subject to interpretation bias as to what is a ‘decision’ as well as when a ‘decision’ is properly ‘novel’. I expect absolute numbers of trials are the best that could be done. Also, considering that biomedical spending is less than total government spending, the fact that this subset runs the majority of RCTs is itself damning. Sure, some are run by Big Pharma, but I suspect the point still stands. Governments spend a lot of money, and on things not proven to work. So there is plenty of scope for the second line to overtake the first. In that sense the plot is not misleading.

  7. BrickWall said,

    June 25, 2012 at 2:24 pm

    The general issue re failure to use/apply RCTs in public policy is politicians/democracy/media interpretation.

    Politicians want answers to fit in with their preconceived notions of what the solutions are to their perceptions of what the problems are – both of these are heavily influenced by what the media thinks/dictates and then the whole circus is implemented as best as possible within constraints applied regardless, as Governments are democratically elected to make decisions and derive policies regardless of whether they are efficient/effective (interpretation of which is again subject to media/general public).

    Whereas applying science allows for results to be interpreted and experiments re-configured to allow for other factors etc. the public/political world rarely allows for repeated attempts at what would be perceived as “the same thing”.

    Best arena for these RCTs to be worked through sensibly is through Academia etc. and then bash/fight way through the political arena with results – but then who funds it?

  8. steffan_john said,

    June 25, 2012 at 3:35 pm

    It’s a very good overview of the RCT process, as well as addressing the most obvious and basic criticisms of it. It ignores the main limits and problems of evidence-based policy, but as a basic outline and core document it’s a very well put-together document.

  9. ratm973 said,

    June 26, 2012 at 1:56 am

    I would vote for a party that said they would implement this, it just makes sense, we do it in every other part of serious life, but not in politics.

    To that end…

    Ben Goldacre for Prime Minister

  10. johnlaity said,

    June 26, 2012 at 11:54 am

    Brilliant work !

    Most Policy chases down or reacts to “Black Swan” events, rather than looking to exploit positive events and establish protections against negative events in a quantified fashion.

    Unfortunately, work like this is also overshadowed by Policy Announcements with an appetite for headline grabbing big number announcements.

    (They may be vote winning, but in the long run they kill off engagement with the voter when they fail).

    Used correctly this approach could certainly limit political risk and establish a measured accountability on public investment. Which will grow public engagement.

    With the advent of social media and the internet this approach could also deliver huge cost savings in the long run…

    Policy delivery is effectively research, so it follows that research tools be used to guide development.

    Could this be the death of the phrase “U-Turn” ? LOL

  11. sideshow_nick said,

    June 26, 2012 at 2:00 pm

    @UrbanAchieverAndy there’s still room for ‘ideology-driven farcical b*llocks’ in some sense. Differently aligned parties may be looking to achieve different performance metrics. Crudely, and dependent on the policy area, right-oriented parties may look for policies that minimise cost whereas leftist parties may look to maximise results.

    Working this way would certainly be a huge improvement though.

  12. AnonymousPlease said,

    June 27, 2012 at 12:14 pm

    @JdeP: Theoretically you are correct, but my experience in higher education, and what I have observed of secondary education in both the US and UK, is that most educational policy is set by whims of politicians or by managerial fads supported by shoddy, biased research. Think about it: education arguably has a greater impact on a person’s life than medicine because education is given when the person is young and influences them for the rest of their life. Education even affects a person’s life expectancy. Yet, while there are laws requiring rigorous testing for medical treatments, there is no such legal vetting process for educational techniques. The UK has NICE, which examines the evidence for the effectiveness of medical treatments, but there is no such institute for examining the efficacy of educational practices. The plain and simple truth is that the UK taxpayers spend £90 billion a year on education based on policies that are largely unproven. Thus, I think it is reasonable to say that evidence-based policy-making in education is behind medicine.

  13. Ben said,

    June 28, 2012 at 6:02 pm

    Yet to read it (looking forward to once finals are out of the way next week) – but must admit the first thing that came to mind, reading the title, was this:

  14. Robert Frisbee said,

    June 29, 2012 at 1:15 pm

    Randomized trials are all very well, but they don’t help you decide what _should_ be optimised.

    It seems the first problem we need to sort out is having the country run by people who are in politics to benefit themselves and the huge corporations that lobby them rather than the common good. Giving those without ethics powerful tools is often a bad idea.

    Another problem is that many of the most costly and significant political decisions are all or nothing. For example, whether or not to go to war, adopt another currency, become an independent country, etc. cannot be determined using randomised trials.

    Finally, though randomised trials are potentially useful, the pharmaceutical industry have provided plenty of examples of how they can be misused to sell us expensive drugs that are only marginally better than placebos, and of how statistics in general can be misused to convince huge numbers of people they are sexually dysfunctional.

  15. rjstevens said,

    July 2, 2012 at 9:48 pm


    This work is clearly correct BUT there is a big problem here.

    In practice, Government never defines requirements (goals) before choosing a policy – i.e. there is not a single set of requirements that are documented, verifiable, complete, and published.

    Without requirements, you cannot have a meaningful test – any solution is OK.

    For example, what are the requirements for putting people in prison? for reducing the amount of litter? For school exams? Nobody knows, everybody has a different idea.

    Government bills are effectively designs that are created without a defined set of requirements. The success of their implementation cannot be judged.

    BTW verbal waffle or unverifiable policy documents are not requirements.

  16. bf said,

    July 3, 2012 at 10:41 am

    I heard a Radio 4 programme last year about the government’s Sure Start scheme to assist underprivileged children. Apparently it was proposed to use a control group to evaluate its effectiveness, but this was blocked by civil servants/ministers, apparently because they didn’t want anyone to be able to tell for sure whether it was working or not. I.e. they feared they might be shown to have wasted millions of public money on it.

  17. John Stumbles said,

    July 7, 2012 at 12:06 pm

    I’ve just read Mark Henderson’s Geek Manifesto[1] and I’m delighted to see Ben involved in sneaking evidence-based policy making into government by the back door! I take the point others have made that there’s a loooong way to go but a journey of a thousand miles begins with a single step and all that. And if the mandarins can be accustomed to doing RCTs and showing how they Save Money it’s the thin end of the wedge to opening up higher-profile policies to critical scrutiny.

    [1] not on my Kobo: fscking ebook version of it is DRMed so you can’t get it from a Linux desktop to your kobo. Geek Fail 🙁

  18. jessmadge said,

    July 8, 2012 at 7:06 am

    Nice clear paper Ben and I hope it is widely read. I remember the example in which doctors used to give loads of oxygen to premature babies until someone discovered this practice caused blindness. I have always thought it a pity there are not more people with a scientific education in politics and the civil service.
    The other problem is that politicians want a quick headline, with their eye on polls and the next election – and proper research takes time. So it would be a huge change of mindset to start testing things properly before organising a grand roll-out with fanfare of trumpets.
    I’ve been chairing the governing body of a school in a deprived area for many years and we have suffered an onslaught of policy initiatives and changes of direction. Gove has now engaged warp drive. Instead of trials we have: “if it is old fashioned then it must be good” and “more new (old) things must be better”.

  19. Username said,

    July 27, 2012 at 11:48 am

    I had a look at the Behavioural Insights Team page on the Cabinet Office website, and the whole thing seems to be about promoting a particular kind of economic philosophy (libertarian paternalism to be specific). Are they actually serious about testing policies, or is this just their way of adding a scientific veneer to their ideology (which is essentially faith-based, as all economic ideologies are)?

  20. jimjial said,

    July 31, 2012 at 4:08 pm

    Great paper – hopefully some politicians can be persuaded that RCTs can form a crucial part of policy making. Some comments do seem a little bit over-optimistic about the potential applications. Some comments point out that different people might be pursuing different policy goals. But I think a more important issue is that it is very difficult to come up with a control for many policy decisions. For example, it would be really great if someone could come up with an RCT to help set interest rates. Another big stumbling block is that it takes a lot longer than a parliamentary term to evaluate some of the most important policies. Try telling Michael Gove that his changes to education policy need to be evaluated over a 10 year period before we can have any evidence to support their wholesale introduction!

  21. shaunstar said,

    September 11, 2012 at 3:47 pm

    How about an e-petition to demand that the government use RCTs in policy making?
    Something written succinctly but clearly might capture the public’s attention.

  22. auntbea said,

    September 16, 2012 at 4:06 pm

    May I ask who the intended audience of this piece is? Perhaps it is different here in the US, but as someone in the social science/politics/policy community, everyone I work with either runs experiments, uses quasi-experimental methods, or acknowledges experiments as the gold standard. There must be *at least* fifty organizations, not including universities, that run policy experiments. A large portion of these studies are commissioned by the government, politicians, or political parties.

    As to why politicians don’t always use RCT evidence in their policy making, well, there is an entire political science literature on that. Short version: politicians’ goal, if they are to remain politicians, is to get elected and stay that way. All other concerns are necessarily secondary. Also, political rhetoric != bureaucratic policy.

  23. NorfolkMuse said,

    October 20, 2012 at 8:39 pm

    Evidence base is great. The HoC select committee on Science and Technology took evidence on annual mortality rates for various levels of alcohol consumption. The provider, a Canadian who escaped to California, does not reply to emails. The Chair of that committee has a duty to ensure that evidence is from trusted sources AND peer reviewed.

  24. arewenearlythereyet said,

    January 4, 2013 at 5:04 pm

    Great piece of work Ben that could potentially change the face of politics ‘as we know it’!
    I suspect two of the downsides will be:
    (a) reluctance of political parties to put their necks on the block by having a system in place that would actually show they had made mistakes;
    (b) cynical setting up of RCTs with deliberately weak control groups so that a point can be ‘proven’.
    The use of RCTs in medicine is supported by a strong peer review network and I suspect it may take some time to develop this in other areas.

  25. matt73 said,

    May 5, 2013 at 10:23 am

    RCTs are becoming common in business. Big digital companies such as Google do it. Lots of startups do it. (There is a popular “Lean Startup” movement that favours RCTs).

    The more people see the benefits of RCTs at work, the more people will VOTE for an RCT-friendly government.

  26. Sue Jones said,

    February 12, 2016 at 2:14 pm

    Username said,

    July 27, 2012 at 11:48 am said:

    “I had a look at the Behavioural Insights Team page on the Cabinet Office website, and the whole thing seems to be about promoting a particular kind of economic philosophy (libertarian paternalism to be specific). Are they actually serious about testing policies, or is this just their way of adding a scientific veneer to their ideology (which is essentially faith-based, as all economic ideologies are)?”

    Remarkably astute comment. The BIT have had a hand in the increased use of punitive welfare sanctions (based on manipulating a theoretical cognitive bias called “loss aversion”), in very lucrative privately run welfare to work schemes amongst other things. The Nudge Unit are simply used as a prop for neoliberal ideology and traditional Tory prejudices and dogma.