Teachers! What would evidence based practice look like?

March 15th, 2013 by Ben Goldacre in evidence based policy

I was asked by Michael Gove (Secretary of State for Education) and the Department for Education to look at how to improve the use of evidence in schools. I think there are huge, positive opportunities for teachers here, that go way beyond just doing a few more trials: there is a need for a coherent “information architecture” that supports evidence based practice. I was asked to write something that explains what this would look like, specifically for teachers. Pasted below is the briefing note from DfE press office, and then the text of what I wrote for them, which came out this week. You can also download a PDF from the DfE website here.

If you’re interested, there’s more on evidence based policy in this BBC Radio 4 documentary I did here, and in this Cabinet Office paper on trials in government that I co-authored here, as well as zillions more posts.

There’s a response to my DfE paper from the Education Endowment Foundation here (they’re running over 50 trials in 1400 schools), and a blog post from the Institute of Education here. I’ll post up more when I get a chance.

Hope you like it!

Building evidence into education

Dr Ben Goldacre will set out today how teachers in England have the chance to make teaching a truly evidence-based profession.

Education Secretary Michael Gove asked Dr Goldacre to examine the role of evidence in the education sector.

In a paper to be presented at Bethnal Green Academy, Dr Goldacre will say today that research into “which approaches work best” should be embedded as seamlessly as possible into everyday activity in education.

High-quality research into what works best can improve outcomes, benefitting pupils and increasing teachers’ independence. But Dr Goldacre’s recommendations go beyond simply running more “randomised trials”, or individual research projects. Drawing on comparisons between education and medicine, he said medicine had “leapt forward” by creating a simple infrastructure that supports evidence-based practice, making it easy and commonplace.

Dr Goldacre says that:

– research on what works best should be a routine part of life in education
– teachers should be empowered to participate in research
– myths about randomised trials in education should be addressed, removing barriers to research
– the results of research should be disseminated more efficiently
– resources on research should be available to teachers, enabling them to be critical and thoughtful consumers of evidence
– barriers between teachers and researchers should be removed
– teachers should be driving the research agenda, by identifying questions that need to be answered.

In some of the highest performing education jurisdictions, including Singapore, he explained: “it is almost impossible to rise up the career ladder of teaching, without also doing some work on research in education.”

Dr Goldacre said:

“This is not about telling teachers what to do. It is in fact quite the opposite. This is about empowering teachers to make independent, informed decisions about what works, by generating good quality evidence, and using it thoughtfully.”

“The gains here are potentially huge. Medicine has leapt forward with evidence-based practice. Teachers have the same opportunity to leap forwards and become a truly evidence-based profession. This is a huge prize, waiting to be claimed by teachers.”


Ben Goldacre is a doctor, academic and writer who focuses on problems in science, statistics, and evidence based practice. His first book “Bad Science” sold half a million copies. He is currently a Research Fellow in Epidemiology at London School of Hygiene and Tropical Medicine.


And here’s the paper…


Building evidence into education


I think there is a huge prize waiting to be claimed by teachers. By collecting better evidence about what works best, and establishing a culture where this evidence is used as a matter of routine, we can improve outcomes for children, and increase professional independence.

This is not an unusual idea. Medicine has leapt forward with evidence based practice, because it’s only by conducting “randomised trials” – fair tests, comparing one treatment against another – that we’ve been able to find out what works best. Outcomes for patients have improved as a result, through thousands of tiny steps forward. But these gains haven’t been won simply by doing a few individual trials, on a few single topics, in a few hospitals here and there. A change of culture was also required, with more education about evidence for medics, and whole new systems to run trials as a matter of routine, to identify questions that matter to practitioners, to gather evidence on what works best, and then, crucially, to get it read, understood, and put into practice.

I want to persuade you that this revolution could – and should – happen in education. There are many differences between medicine and teaching, but they also have a lot in common. Both involve craft and personal expertise, learnt over years of experience. Both work best when we learn from the experiences of others, and what worked best for them. Every child is different, of course, and every patient is different too; but we are all similar enough that research can help find out which interventions will work best overall, and which strategies should be tried first, second or third, to help everyone achieve the best outcome.

Before we get that far, though, there is a caveat: I’m a doctor. I know that outsiders often try to tell teachers what they should do, and I’m aware this often ends badly. Because of that, there are two things we should be clear on.

Firstly, evidence based practice isn’t about telling teachers what to do: in fact, quite the opposite. This is about empowering teachers, and setting a profession free from governments, ministers and civil servants who are often overly keen on sending out edicts, insisting that their new idea is the best in town. Nobody in government would tell a doctor what to prescribe, but we all expect doctors to be able to make informed decisions about which treatment is best, using the best currently available evidence. I think teachers could one day be in the same position.

Secondly, doctors didn’t invent evidence based medicine. In fact, quite the opposite is true: just a few decades ago, best medical practice was driven by things like eminence, charisma, and personal experience. We needed the help of statisticians, epidemiologists, information librarians, and experts in trial design to move forwards. Many doctors – especially the most senior ones – fought hard against this, regarding “evidence based medicine” as a challenge to their authority.

In retrospect, we’ve seen that these doctors were wrong. The opportunity to make informed decisions about what works best, using good quality evidence, represents a truer form of professional independence than any senior figure barking out their opinions. A coherent set of systems for evidence based practice listens to people on the front line, to find out where the uncertainties are, and decide which ideas are worth testing. Lastly, crucially, individual judgement isn’t undermined by evidence: if anything, informed judgement is back in the foreground, and hugely improved.

This is the opportunity that I think teachers might want to take up. Because some of these ideas might be new to some readers, I’ll describe the basics of a randomised trial, but after that, I’ll describe the systems and structures that exist to support evidence based practice, which are in many ways more important. There is no need for a world where everyone is suddenly an expert on research, running trials in their classroom tomorrow: what matters is that most people understand the ideas, that we remove the barriers to “fair tests” of what works, and that evidence can be used to improve outcomes.

How randomised trials work.

Where they are feasible, randomised trials are generally the most reliable tool we have for finding out which of two interventions works best. We simply take a group of children, or schools (or patients, or people); we split them into two groups at random; we give one intervention to one group, and the other intervention to the other group; then we measure how each group is doing, to see if one intervention achieved its supposed outcome any better.
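To make the mechanics concrete, here is a minimal sketch in Python of the steps just described: shuffle a list of schools, split it in two, and compare average outcomes between the groups. The function names (`randomise`, `difference_in_means`), the school labels, and the fixed seed are illustrative assumptions rather than anything taken from a real trial; a real study would be designed with a statistician and would also need to report the uncertainty around any difference it finds.

```python
import random
from statistics import mean


def randomise(units, seed=None):
    """Shuffle the units and split them into two groups of (near-)equal size."""
    rng = random.Random(seed)   # a fixed seed makes the allocation reproducible
    shuffled = list(units)      # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def difference_in_means(outcomes, group_a, group_b):
    """Average outcome in group A minus average outcome in group B."""
    return mean(outcomes[u] for u in group_a) - mean(outcomes[u] for u in group_b)


# Hypothetical example: twenty schools, ten allocated to each intervention.
schools = [f"school_{i}" for i in range(20)]
group_a, group_b = randomise(schools, seed=1)
```

Once outcomes have been measured, `difference_in_means(outcomes, group_a, group_b)` gives the raw gap between the two groups, where `outcomes` is a dictionary mapping each school to its measured result.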

This is how medicines are tested, and in most circumstances it would be regarded as dangerous for anyone to use a treatment today, without ensuring that it had been shown to work well in a randomised trial. Trials are not only used in medicine, however, and it is common to find them being used in fields as diverse as web design, retail, government, and development work around the world.

For example, there was a longstanding debate about which of two competing models of “microfinance” schemes was best at getting people out of poverty in India, whilst ensuring that the money was paid back, so it could be re-used in other villages: a randomised trial compared the two models, and established which was best.

At the top of the page at Wikipedia, when they are having a funding drive, you can see the smiling face of Jimmy Wales, the founder, on a fundraising advert. He’s a fairly shy person, and didn’t want his face to be on these banners. But Wikipedia ran a randomised trial, assigning visitors to different adverts: some saw an advert with a child from the developing world (“she could have access to all of human knowledge if you donate…”); some saw an attractive young intern; some saw Jimmy Wales. The adverts with Wales got more clicks and more donations than the rest, so they were used universally.

It’s easy to imagine that there are ways around the inconvenience of randomly assigning people, or schools, to one intervention or another: surely, you might think, we could just look at the people who are already getting one intervention, or another, and simply monitor their outcomes to find out which is the best. But this approach suffers from a serious problem. If you don’t randomise, and just observe what’s happening in classrooms already, then the people getting different interventions might be very different from each other, in ways that are hard to measure.

For example, when you look across the country, children who are taught to read in one particularly strict and specific way at school may perform better on a reading test at age 7, but that doesn’t necessarily mean that the strict, specific reading method was responsible for their better performance. It may just be that schools with more affluent children, or fewer social problems, are more able to get away with using this (imaginary) strict reading method, and their pupils were always going to perform better on reading tests at age 7.
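The imaginary reading method can be simulated to show how misleading the observational comparison is. In the sketch below every number is made up for illustration: affluence drives test scores, more affluent schools are more likely to adopt the method, and the method itself has no effect at all (`true_effect=0.0`). The same simulated schools are compared twice, once by who chose the method and once by a coin flip.

```python
import random
from statistics import mean


def simulate(n_schools=2000, true_effect=0.0, seed=0):
    """Compare an observational gap with a randomised gap on simulated schools."""
    rng = random.Random(seed)
    observational = {True: [], False: []}
    randomised = {True: [], False: []}
    for _ in range(n_schools):
        affluence = rng.gauss(0, 1)
        # Baseline reading score depends on affluence, not on the method.
        score = 50 + 10 * affluence + rng.gauss(0, 5)
        # Observational world: more affluent schools tend to adopt the method.
        adopts = affluence + rng.gauss(0, 1) > 0
        observational[adopts].append(score + (true_effect if adopts else 0.0))
        # Randomised world: a coin flip decides, regardless of affluence.
        assigned = rng.random() < 0.5
        randomised[assigned].append(score + (true_effect if assigned else 0.0))
    obs_gap = mean(observational[True]) - mean(observational[False])
    rct_gap = mean(randomised[True]) - mean(randomised[False])
    return obs_gap, rct_gap


obs_gap, rct_gap = simulate()
```

Under these assumptions the observational gap should come out at something on the order of ten points in favour of a method that does nothing, while the randomised gap should sit close to zero, because the coin flip balances affluence across the two groups.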

This is also a problem when you are rolling out a new policy, and hoping to find out whether it works better than what’s already in place. It is tempting to look at results before and after a new intervention is rolled out, but this can be very misleading, as other factors may have changed at the same time. For example, if you have a “back to work” scheme that is supposed to get people on benefits back into employment, it might get implemented across the country at a time when the economy is picking up anyway, so more people will be finding jobs, and you might be misled into believing that it was your “back to work” scheme that did the job (at best, you’ll be tangled up in some very complex and arbitrary mathematical modelling, trying to discount for the effects of the economy picking up).

Sometimes people hope that running a pilot is a way around this, but this is also a mistake. Pilots are very informative about the practicalities of whether your new intervention can be implemented, but they can be very misleading on the benefits or harms, because the centres that participate in pilots are often different to the centres that don’t. For example, job centres participating in a “back to work” pilot might be less busy, or have more highly motivated staff: their clients were always going to do better, so a pilot in those centres will make the new jobs scheme look better than it really is. Similarly, running a pilot of a fashionable new educational intervention in schools that are already performing well might make the new idea look fantastic, when in reality, the good results have nothing to do with the new intervention.

This is why randomised trials are the best way to find out how well a new intervention works: they ensure that the pupils or schools getting a new intervention are the same as the pupils and schools still getting the old one, because they are all randomly selected from the same pool.

At around this point, most people start to become nervous: surely it’s wrong, for example, to decide what kind of education a child gets, simply at random? This cuts to the core of why we do trials, and why we gather evidence on what works best.

Myths about randomised trials

While there are some situations where trials aren’t appropriate – and where we need to be cautious in interpreting the results – there are also several myths about trials. These myths are sometimes used to prevent trials being done, which slows down progress, and creates harm, by preventing us from finding out what works best. Some people even claim that trials are undesirable, and even completely impossible, in schools: this is a peculiarly local idea, and there have been huge numbers of trials in education in other countries, such as the US. However, the specific myths are worth discussing.

Firstly, people sometimes worry that it is unethical to randomly assign children to one educational intervention or another. Often this is driven by an implicit belief that a new or expensive intervention is always necessarily better. When people believe this, they also worry that it’s wrong to deprive people of the new intervention. It’s important to be clear, before we get to the detail, that a trial doesn’t necessarily involve depriving people of anything, since we can often run a trial where people are randomly assigned to receive the new intervention now, or after a six month wait. But there is a more important reason why trials are ethically acceptable: in reality, before we do a trial, we generally have no idea which of two interventions is best. Furthermore, new things that many people believe in can sometimes turn out, in reality, to be very harmful.

Medicine is littered with examples of this, and it is a frightening reality. For many years, it was common to treat everyone who had a serious head injury with steroids. This made perfect sense on paper: head injuries cause the brain to swell up, which can cause important structures to be crushed inside our rigid skulls; but steroids reduce swelling (this is why you have steroid injections for a swollen knee), so they should improve survival. Nobody ran a trial on this for many years. In fact, it was widely argued that randomising unconscious patients in A&E to have steroids or not would be unethical and unfair, so trials were actively blocked. When a trial was finally conducted, it turned out that steroids actually increased the chances of dying, after a head injury. The new intervention, that made perfect sense on paper, that everyone believed in, was killing people: not in large enough numbers to be immediately obvious, but when the trial was finally done, an extra two people died out of every hundred people given steroids.

There are similar cases from the world of education. The “Scared Straight” programme also made sense on paper: young children were taken into prisons and shown the consequences of a life of crime, in the hope that they would be more law abiding in their own lives. Following the children who participated in this programme into adult life, it seemed they were less likely to commit crimes, when compared with other children. But here, researchers were caught out by the same problem discussed above: the schools – and so the children – who went on the Scared Straight course were different to the children who didn’t. When a randomised trial was finally done, where this error could be accounted for, we found out that the Scared Straight programme – rolled out at great expense, with great enthusiasm, good intentions, and huge optimism – was actively harmful, making children more likely to go to prison in later life.

So we must always be cautious about assuming that things which are new, or expensive, are necessarily always better. But this is just one special case of a broader issue: we should always be clear when we are uncertain about which intervention is best. Right now, there are huge numbers of different interventions used throughout the country – different strategies to reduce absenteeism, or teach arithmetic, or reduce teenage pregnancies, or any number of other things – where there is no evidence to say which of the currently used methods is best. There is arbitrary variation, across the country, across a town, in what strategies and methods are used, and nobody worries that there is an ethical problem with this.

Randomisation, in a trial, adds one simple extra chink to this existing variation: we need a group of schools, teachers, pupils, or parents, who are able to honestly say: “we don’t know which of these two strategies is best, so we don’t mind which we use. We want to find out which is best, and we know it won’t harm us.”

This is a good example of how gathering good evidence requires a culture shift, extending beyond a few individual randomised trials. It requires everyone involved in education to recognise when it’s time to honestly say “we don’t know what’s best here”. This isn’t a counsel of despair: in medicine, and in teaching, we know that most of what we do does some good (if we’re not better than nothing, then we’re all in big trouble!). The real challenge is in identifying what works the best, because when people are deprived of the best, they are harmed too. But this is also a reminder of how inappropriate certainty can be a barrier to progress, especially when there are charismatic people, who claim they know what’s best, even without good evidence.

Medicine suffered hugely with this problem, and as late as the 1970s there were infamous confrontations between people who thought it was important to run fair tests, and “experts”, who were angry at the thought of their expertise being challenged, and their favourite practices being tested. Archie Cochrane was one of the pioneers of evidence based medicine, and in his autobiography, he describes many battles he had with senior doctors, in glorious detail. In 1971, Cochrane was concerned that Coronary Care Units in hospitals might be no better than home care, which was the standard care for a heart attack at the time (we should remember that this was the early days of managing heart attacks, and the results from this study wouldn’t be applicable today). In fact, he was worried that hospital care might involve a lot of risky procedures that could even, conceivably, make outcomes worse for patients overall.

Because of this, Cochrane tried to set up a randomised trial comparing home care against hospital care, against great resistance from the cardiologists. In fact, the doctors running the new specialist units were so vicious about the very notion of running a trial that when one was finally set up, and the first results were collected, Cochrane decided to play a practical joke. These initial results showed that patients in Coronary Care Units did worse than patients sent home; but Cochrane switched the numbers around, to make it look like patients on CCUs did better. He showed the cardiologists these results, which reinforced their belief that it was wrong of Cochrane to even dare to try running a randomised trial of whether their specialist units were helpful. The room erupted:

“They were vociferous in their abuse: ‘Archie,’ they said ‘we always thought you were unethical. You must stop this trial at once.’ … I let them have their say for some time, then apologized and gave them the true results, challenging them to say as vehemently, that coronary care units should be stopped immediately. There was dead silence and I felt rather sick because they were, after all, my medical colleagues.”

Similar confrontations are reported in many new fields, when people try subjecting ideas and practices to fair tests, in randomised trials. But being open and clear about the need for research – when there is no good evidence to help us choose between interventions – is also important because it helps make sure that research is done on relevant questions, meeting the needs of teachers, pupils and parents. When everyone involved in teaching knows a little about how research is done – and what previous research has found – then we can all have a better idea of what questions need to be asked next.

But before we get on to how this can happen, we should first finish the myths about trials. From now on, these are all cases where people overstate the benefits of trials.

For example, sometimes people think that trials can answer everything, or that they are the only form of evidence. This isn’t true, and different methods are useful for answering different questions. Randomised trials are very good at showing that something works; they’re not always so helpful for understanding why it worked (although there are often clues when we can see that an intervention worked well in children with certain characteristics, but not so well in others). “Qualitative” research – such as asking people open questions about their experiences – can help give a better understanding of how and why things worked, or failed, on the ground. This kind of research can also be useful for generating new questions about what works best, to be answered with trials. But qualitative research is very bad for finding out whether an intervention has worked. Sometimes researchers who lack the skills needed to conduct or even understand trials can feel threatened, and campaign hard against them, much like the experts in Archie Cochrane’s story. I think this is a mistake. The trick is to ensure that the right method is used to answer the right questions.

A related issue involves choosing the right outcome to measure. Sometimes people say that trials are impossible, because we can’t capture the intangible benefits that come from education, like making someone a well rounded member of society. It’s true that this outcome can be hard to measure, although that is an argument against any kind of measurement of attainment, and against any kind of quantitative research, not just trials. It’s also, I think, a little far-fetched: there are lots of things we try to improve that are easy to measure, like attendance rates, teenage pregnancy, amount of exercise, performance on specific academic or performance tests, and so on.

However, we should return to the overly exaggerated claims sometimes made in favour of trials, and the need to be a critical consumer of evidence. A further common mistake is to assume that, once an intervention has been shown to be effective in a single trial, then it definitely works, and we should use it everywhere. Again, this isn’t necessarily true. Firstly, all trials need to be run properly: if there are flaws in a trial’s design, then it stops being a fair test of the treatments. But more importantly, we need to think carefully about whether the people in a trial of an intervention are the same as the people we are thinking of using the intervention on.

The Family Nurse Partnership is a programme that is well funded and popular around the world. It was first shown to be effective in a randomised trial in 1977. The trial participants were white mothers in a semirural setting upstate from New York, and people worried at the time that the positive results might have been exceptional, and occurred simply because the specific programme of social support that was offered had suited this population unusually well. In 1988, to check that the findings really were applicable to other settings, the same programme was assessed using a randomised trial in African-American mothers in inner city Memphis, and again found to be effective. In 1994, a third trial was conducted in a large population of Hispanic, African-American, and Caucasian mothers from Denver. After this trial also showed a benefit, people in the US were fairly certain that the programme worked, with fewer childhood injuries, increased maternal employment, improved “school readiness”, and more.

Now, the Family Nurse Partnership programme is being brought to the UK, but the people who originally designed the intervention have insisted that a randomised trial should be run here, to see if it really is effective in the very different setting of the UK. They have specifically stated that they expect to see less dramatic benefits here, because the basic level of support for young families in the UK is much better than that in the US: this means that the difference between people getting the FNP programme, and people getting the normal level of help from society, will be much smaller.

This is just one example of why we need to be thoughtful about whether the results of a trial in one population really are applicable to our own patients or pupils. It’s also an illustration of why we need to make trials part of the everyday routine, so that we can replicate trials, in different settings, instead of blindly assuming we can use results from other countries (or even other schools, if they have radically different populations). It doesn’t mean, however, that we can never trust the results of a trial. This is just another example of why it’s useful to know more about how trials work, and to be a thoughtful consumer of evidence.

Lastly, people sometimes worry that trials are expensive and complicated. This isn’t necessarily true, and it’s important to be clear what the costs of a trial are being compared against. For example, if the choice is between running a trial, and simply charging ahead, implementing an idea that hasn’t been shown to work – one that might be ineffective, wasteful, or even harmful – then it’s clearly worth investing some time and effort in assessing its true impact. If the alternative is doing an “observational” study, which has all the shortcomings described above, then the analysis can be so expensive and complex – not to mention unreliable – that it would have been easier to randomise participants to one intervention or the other in the first place.

But the mechanics and administrative processes for running a trial can also be kept to a minimum with thoughtful design, for example by measuring outcomes using routine classroom data, that was being collected anyway, rather than running a special set of tests. More than anything, though, for trials to be run efficiently, they need to be part of the culture of teaching.

Making evidence part of everyday life.

I’m struck by how much enthusiasm there is for trials and evidence based practice in some parts of teaching: but I’m also struck that much of this enthusiasm dies out before it gets to do good, because the basic structures needed to support evidence based practice are lacking. As a result, a small number of trials are done, but these exist as isolated islands, without enough bridges joining the people and strands of work together. This is nobody’s fault: creating an “information architecture” out of thin air is a big job, and it might take decades. The benefits, though, are potentially huge. Some individual randomised trials from the UK have produced informative results, for example, but these results are then poorly communicated, so they don’t inform and change practice as well as they might.

Because of this, I’ve sketched out the basics of what education would need, as a sector, to embrace evidence based practice in a serious way. The aim – which I hope everyone would share – is to get more research done, involving as many teachers as possible; and to get the results of good quality research disseminated and put into practice. It’s worth being clear, though, that this is a first sketch, and a call to arms. I hope that others will pull it apart and add to it. But I also hope that people will be able to act on it, because structures like these in medicine help capture the best value from the good work – and hard work – that is done all around the country.

Firstly – and most simply – it’s clear that we need better systems for disseminating the findings of research to teachers on the ground. While individual studies are written up in very technical documents, in obscure academic journals, these are rarely read by teachers. And rightly so: most doctors rarely bother to read technical academic journals either. The British Medical Journal has brief summaries of important new research from around the world; and there is a thriving market of people offering accessible summary information on new “what works” research to doctors, nurses, and other healthcare professionals. The US government has spent vast sums of money on two similar websites for teachers: “Doing What Works”, and the “What Works Clearinghouse”. These are large, with good quality resources, and they are written to be relevant to teachers’ needs, rather than dry academic games. While there are some similar resources in the UK, these are often short-lived, and on a smaller scale.

For these kinds of resources to be useful at all, they then need to land with teachers who know the basics of “how we know” what works. While much teacher training has reflected the results of research, this evidence has often been presented as a completed canon of answers. It’s much rarer to find all young teachers being taught the basics of how different types of research are done, and the strengths and weaknesses of each approach on different types of question (although some individual teachers have taught themselves on this topic, to a very high level). Learning the basics of how research works is important, not because every teacher should be a researcher, but because it allows teachers to be critical consumers of the new research findings that will come out during the many decades of their career. It also means that some of the barriers to research, that arise from myths and misunderstandings, can be overcome. In an ideal world, teachers would be taught this in basic teacher training, and it would be reinforced in Continuing Professional Development, alongside summaries of research.

In some parts of the world, it is impossible to rise up the career ladder of teaching without understanding how research can improve practice, and publishing articles in teaching journals. Teachers in Shanghai and Singapore participate in regular “Journal Clubs”, where they discuss a new piece of research, and its strengths and weaknesses, before considering whether they would apply its findings in their own practice. If the answer is no, they share the shortcomings in the study design that they’ve identified, and then describe any better research that they think should be done, on the same question.

This is an important quirk: understanding how research is done also enables teachers to generate new research questions. This, in turn, ensures that the research which gets done addresses the needs of everyday teachers. In medicine, any doctor can feed up a research suggestion to NIHR (the National Institute for Health Research), and there are organisations that maintain lists of what we don’t yet know, fed by clinicians who’ve had to make decisions, without good quality evidence to guide them. But there are also less tangible ways that this feedback can take place.

Familiarity with the basics of how research works also helps teachers get involved in research, and to see through the dangerous myths about trials being actively undesirable, or even “impossible” in education. Here, there is a striking difference with medicine. Many teachers pour their heart and soul into research projects which are supposed to find out whether something worked; but in reality the projects often turn out to be too small, being run by one person in isolation, in only one classroom, and lack the expert support necessary to ensure a robust design. Very few doctors would try and run a quantitative research project alone in their own single practice, without expert support from a statistician, and without help from someone experienced in research design.

In fact, most doctors participate in research by playing a small role in a larger research project which is coordinated, for example, through a research network. Many GPs are happy to help out on a research project: they recruit participants from among their patients; they deliver whichever of two commonly used treatments has been randomly assigned to their patient; and they share medical information for follow-up data. But they get involved by putting their name down with the Primary Care Research Network covering their area. Researchers interested in running a randomised trial in GP patients then go to the Research Network, and find GPs to work with.

This system represents a kind of “dating service” for practitioners and researchers. Creating similar networks in education would help join up the enthusiasm that many teachers have – for research that improves practice – with researchers, who can sometimes struggle to find schools willing to participate in good quality research. This kind of two-way exchange between researchers and teachers also helps the teacher-researchers of the future to learn more about the nuts and bolts of running a trial; and it helps to keep researchers out of their ivory towers, focusing more on what matters most to teachers.

In the background, for academics, there is much more to be said on details. We need, I think, academic funders who listen to teachers, and focus on commissioning research that helps us learn what works best, to improve outcomes. We need academics with quantitative research skills from outside traditional academic education departments – economists, demographers, and more – to come in and share their skills more often, in a multidisciplinary fashion. We need more expert collaboration with Clinical Trials Units, to ensure that common pitfalls in randomised trial design are avoided; we may also need – eventually – Education Trials Units, helping to support good quality research throughout the country.

But just as this issue stretches way beyond a few individual research projects, it also goes way beyond anything that one single player can achieve. We are describing the creation of a whole ecosystem from nothing. Whether or not it happens depends on individual teachers, researchers, heads, politicians, pupils, parents and more. It will take mischievous leaders, unafraid to question orthodoxies by producing good quality evidence; and it will need to land with a community that – at the very least – doesn’t misunderstand evidence based practice, or reject randomised trials out of hand.

If this all sounds like a lot of work, then it should do: it will take a long time. But the gains are huge, and not just in terms of better evidence, and better outcomes for pupils. Right now, there is a wave of enthusiasm for good quality evidence, passing through all corners of government. This is the time to act. Teachers have the opportunity, I believe, to become an evidence based profession, in just one generation: embedding research into everyday practice; making informed decisions independently; and fighting off the odd spectacle of governments telling teachers how to teach, because teachers can use the good quality evidence that they have helped to create, to make their own informed judgements.

There is also a roadmap. While evidence based medicine seems like an obvious idea today – and we would be horrified to hear of doctors using treatments without gathering and using evidence on which works best – in reality these battles were only won in very recent decades. Many eminent doctors fought viciously, as recently as the 1970s, against the very idea of evidence based medicine, seeing it as a challenge to their expertise. The case for change was made by optimistic young practitioners like Archie Cochrane, who saw that good evidence on what works best was worth fighting for.

Now we recognise that being a good doctor, or teacher, or manager, isn’t about robotically following the numerical output of randomised trials; nor is it about ignoring the evidence, and following your hunches and personal experiences instead. We do best by using the right combination of skills to get the best job done.

If you like what I do, and you want me to do more, you can: buy my books Bad Science and Bad Pharma, give them to your friends, put them on your reading list, employ me to do a talk, or tweet this article to your friends. Thanks!

56 Responses

  1. Marcus Hill said,

    May 10, 2013 at 12:12 pm

    There are plenty of longitudinal studies in education that last a number of years. The problem is that any major changes to the education system are, by their nature, driven by government. It’s easy in medicine to make incremental changes to practice and research how best to treat particular ailments. There’s nothing in medicine as all-controlling as the National Curriculum or the public examination system. Education secretaries tinker with these, but major changes do need to be backed by evidence. Ideally, such a change would start with a consultation exercise leading to draft new documents – that’s around two years to do properly. You then need to pilot these with at least two cohorts to fine tune things, and then consider whether the evidence warrants a national roll-out. That’s another three years of work. So, from initial idea to national roll-out you’re looking at a minimum of five years if it’s done properly. The problem, of course, is that there is bound to be a general election in the next five years, and the Secretary of State needs to have his or her grand new scheme in place before that. This is why education is beset by wave after wave of sweeping change, politically motivated and insufficiently researched.

  2. JFB said,

    May 21, 2013 at 12:29 pm

    Both Rebecca Allen’s and Geoff Whitty’s comments on Ben Goldacre’s DfE Analytical Review seem to be saying ‘well of course we need randomised controlled studies in education, I have been saying this all along, however, I have always advocated going much further and using much more sophisticated and nuanced interpretation that applies only to education to say what works for whom and under what conditions’. This humbug is beyond tolerance on several levels. Firstly, during Geoff’s 10 years’ premiership of the leading education research institution, his organisation did not carry out any randomised controlled trials. Similarly, throughout Rebecca’s career, apparently dedicated to this paradigm shift, she has not previously carried one out. So this adherence to RCTs seems a rather rapid conversion, suspiciously coincidental with the recent rise in criticism of the methods, coming from outside the education research establishment of which they are central members. Secondly, their understanding of RCT methodology suggests the topic is ‘Google new’ to them, as they say it’s different in education than it is in medicine because, unlike in medicine, in education we must know what works for whom and under what conditions. Yet in medicine it is also necessary to know what works for whom and under what conditions: for example, a drug supposed to treat Parkinson’s disease (what) should help those with Parkinson’s disease (whom) at this dose at this stage of the disease (what conditions). It is hard to imagine what experiments they are thinking of that don’t specify what, for whom, and in what conditions. Moreover, these comments seem to attempt to claim credit for inventing afresh the idea of what’s called the ‘methods’ section in the write-up of any experiment in any research in any field, taught in the first week of any research methods course (apparently apart from education).
    Their hubris is further compounded by their attempt to imply that they have added substantially to the debate by suggesting that readers of education research need to bear in mind that studies on, for example, 15 year olds’ Maths in average UK schools would apply only to 15 year olds’ Maths in average UK schools, and not to 4 year olds’ music lessons in a Viennese conservatoire. These comments seem to suggest that Rebecca and Geoff think that teachers will not be able to make this inference.
    Further clarifying the recency of their thoughts on the subject, they seem to think it is informative to point out that experiments without theory aren’t very helpful, when any view of the history of experimental science suggests fairly clearly that doing experiments on things you think might work is more helpful than doing experiments on things you don’t think might work. They go on to suggest that a solution to this problem is to ask for expensive qual studies with RCTs, a call likely to put off funders rather than encourage them, at a time when, as Rebecca rightly points out, schools and funders are a million miles away from buying into the need for RCTs. This step is likely to preserve the perennial problem stymieing education research, where whenever asked to research anything, education researchers respond ‘oh it’s very complicated’, interpreted by funders to mean expensive and producing inconclusive results. All this seems to suggest Rebecca and Geoff may have gathered their thoughts on the issue of promoting RCTs rather more hurriedly than they would have us believe.

  3. Ben Goldacre said,

    February 9, 2014 at 7:48 pm

    Hi there

    sorry, I missed most of these comments.

    The issues raised are covered at length in the comments below my Guardian piece on the topic, where I spent about a day engaging with some very confused arguments. I strongly recommend it, whether you’re rabidly against RCTs, or just interested in the reasoning and culture of those who are. I would repeat the exercise here but it’s quite time consuming addressing the same canards.


    It’s genuinely interesting – from a scientific perspective, and from a cultural one – how much special pleading, territoriality, and misunderstanding this issue elicits from people. Some of the misunderstanding is legitimate; some, I would say (and not lightly) is deliberate. Because of that, and the scale of the problem, I am going to do a much larger piece of work on RCTs.

    If any of you have any good references to people making the anti-RCT arguments made here and elsewhere, but in more concrete settings – ideally academic papers, and books – please do send them over, either here or by email ben@badscience.net. They are made commonly in conversation and blog posts, but it’s harder to find people willing to put their names to them formally in print. I’m particularly interested in people arguing:

    – “you can’t know what to measure” whilst defending observational research

    – “randomistas want to do only trials”

    – “trials are unethical”

    – “trials can’t answer questions about the purpose of education” (a bit like complaining an aeroplane can’t make toast)

    and so on.

    For those very interested there was this recent interaction with the head of policy at the ACSL, where the comments are worth reading.


    I don’t know what it is that would make a union take this position on RCTs and evidence.

  4. Ann JOnes said,

    January 21, 2015 at 11:35 pm

    How then do you explain findings like this?


    And GPs equally as qualified as you who are happy to put NLP next to their name?


  5. Jakob said,

    March 20, 2015 at 1:21 pm


    I just found this discussion now, but my own go-to book – as per Ben’s request – on this topic is:


    The central claim here is NOT that RCTs don’t tell us anything of relevance for making policy. The claim, corresponding to the point above about external validity, is that RCTs only answer a small subset of the questions that you need to answer when making a good policy. RCTs will tell you whether an intervention worked THERE, but they will not tell you WHAT worked there, nor whether that something is likely to work in a different context, with the same effect and to the same extent.

  6. Tara said,

    April 2, 2015 at 8:38 am

    Gove has decided to ignore the clear international evidence from such research as the EPPE project and systems in Scandinavia, where their international record is also high but their ideology does not fit into what Gove feels is suitable. He has ignored EYFS practice significantly. Finally, while patients may differ, the variables in how a body responds will still be limited, so there is a more limited number of outcomes given a course of action. In children this is not the case. The number of neurological, behavioural, social and emotional variables in one class makes the comparison with medicine a poor one. What works with one group of patients is far more likely to work with another set, if it works. However, what works with one set of students simply cannot be guaranteed to work with another. I would suggest the money would be best spent reducing class size so that the teacher can ensure they really get to know and understand each pupil and adapt accordingly, not just through differentiation, but through quality time with the pupils, so that they understand each pupil’s thinking methods and barriers to learning.