Lucia de Berk – a martyr to stupidity

April 9th, 2010 by Ben Goldacre in bad science, numerical context, statistics

Ben Goldacre, The Guardian, Saturday 10 April 2010

Lucia de Berk is a Dutch nurse who has spent 6 years in jail on a life sentence for murdering 7 people, in a killing spree that never happened. She will hear about her appeal on Wednesday, and there is now little doubt that she will be let off. The statistical errors in the evidence against her were so crass that they can be explained in one newspaper column. So will the people who jailed her apologise?

The case against Lucia was built on a suspicious pattern: there were 9 incidents on a ward where she worked, and Lucia was present for all of them. This could be suspicious, but it could be a random cluster, best illustrated by the “Texas Sharp Shooter” phenomenon: imagine I am stood blindfolded in front of a wooden barn with a machine gun in each hand, maniacally firing off a thousand bullets into the wall. I remove my blindfold, walk up to the barn, find 3 bullets which are very close together, and carefully paint a target around them. Then I announce that I am an Olympic-standard rifleman.

This is plainly foolish. All across the world, nurses are working on wards, where patients die, and it is inevitable that on one ward, in one hospital, in one town, in one country, somewhere in the world, you will find one nurse who seems to be on a lot when patients die. It’s very unlikely that one particular prespecified person will win the lottery, but it’s inevitable that someone will win: we don’t suspect the winner of rigging the balls.
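To put rough numbers on the lottery point, here is a back-of-envelope sketch in Python. Only the figure of 9 incidents comes from the case as described above; everything else is an illustrative assumption: each nurse is taken to be on duty a third of the time, incidents fall at random, and nurses on different wards are treated as independent of one another.

```python
def p_named_nurse(shift_fraction: float, incidents: int) -> float:
    """Chance that one pre-specified nurse is on duty for every incident."""
    return shift_fraction ** incidents


def p_some_nurse(shift_fraction: float, incidents: int, nurses: int) -> float:
    """Chance that at least one of `nurses` independent nurses is on duty
    for every incident on his or her own ward."""
    p_one = p_named_nurse(shift_fraction, incidents)
    return 1.0 - (1.0 - p_one) ** nurses


if __name__ == "__main__":
    shift_fraction = 1 / 3  # assumed, not a figure from the case
    incidents = 9           # the number of incidents quoted in the article
    for nurses in (1, 10_000, 1_000_000):  # hypothetical population sizes
        print(nurses, p_some_nurse(shift_fraction, incidents, nurses))
```

Under these made-up assumptions a named nurse has about a 1 in 20,000 chance of being present at all nine incidents, yet with a million nurses it is all but certain that somebody, somewhere, will be. The unlikely event and the inevitable event are the same event, seen before and after painting the target.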

And did the idea that there was a killer on the loose make any sense, statistically, for the hospital as a whole? There were 6 deaths over 3 years on one key ward where Lucia supposedly did her murdering. In the 3 preceding years, before Lucia arrived, there were 7 deaths. So the death rate on this ward went down at the precise moment that a serial killer – on a killing spree – moved in.

Even more bizarre was the staggering foolishness by some of the statistical experts used in the court. One, Henk Elffers, a professor of law, combined individual statistical tests by taking p-values – a mathematical expression of statistical significance – and multiplying them together. This bit is for the nerds: you do not just multiply p-values together, you weave them with a clever tool, like maybe ‘Fisher’s method for combination of independent p-values’. If you multiply p-values together, then chance incidents will rapidly appear to be vanishingly unlikely. Let’s say you worked in twenty hospitals, each with a pattern of incidents that is purely random noise: let’s say p=0.5. If you multiply those harmless p-values, of entirely chance findings, you end up with a final p-value of p < 0.000001, falsely implying that the outcome is extremely highly statistically significant. With this mathematical error, by this reasoning, if you change hospitals a lot, you automatically become a suspect.
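For the nerds who want to see the arithmetic, here is a minimal sketch in Python comparing the naive product with Fisher's method on the twenty-hospitals example. It assumes the p-values are independent and all test the same kind of null hypothesis, which is what Fisher's method requires; the function names are my own, and the closed-form tail used here works only because the chi-squared distribution involved always has an even number of degrees of freedom.

```python
import math


def naive_product(p_values):
    """Multiply p-values together (the flawed shortcut described above)."""
    return math.prod(p_values)


def fisher_combined(p_values):
    """Fisher's method for k independent p-values.

    Under the null, -2 * sum(ln p) is chi-squared with 2k degrees of
    freedom. For even degrees of freedom the upper tail has the closed
    form exp(-x/2) * sum_{i<k} (x/2)^i / i!, so no library is needed.
    """
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # the statistic divided by 2
    tail = sum(half_stat ** i / math.factorial(i) for i in range(k))
    return math.exp(-half_stat) * tail


if __name__ == "__main__":
    p_values = [0.5] * 20  # twenty hospitals, pure noise in each
    print("naive product:  ", naive_product(p_values))
    print("Fisher combined:", fisher_combined(p_values))
```

The product comes out just under one in a million, while the Fisher combination comes out at roughly 0.93: entirely unremarkable, as twenty unremarkable results should be. In practice one would probably reach for an existing routine such as `scipy.stats.combine_pvalues` rather than hand-rolling this.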

One statistician – Richard Gill – has held the Dutch courts’ feet to the fire, writing endless papers on these laughable statistical flaws. Alongside the illusory patterns he has identified, there was one firm piece of forensic evidence. Some traces of the drug digoxin were found in one baby who died. The baby had been prescribed digoxin months previously. Three court toxicologists now say the digoxin was not the cause of death.

In fact, even the Dutch state prosecution now accepts that Lucia should be acquitted, and that there was no evidence for an unnatural death in any of the patients, though her convictions for stealing two library books from the hospital library – shamefully and bizarrely – will be upheld. Lucia denies stealing these two library books. Now living with her partner while she awaits the final judgement, Lucia is penniless, denied unemployment benefits because of her unusual status, and paralysed down one side following a stroke which she had, in 2006, aged 44, in the week she was told that her conviction would be upheld. Watch what the Dutch legal system does next, because they owe this woman a great deal.


53 Responses

  1. Tom said,

    April 10, 2010 at 12:09 am

    It’s a sad case, and my heart goes out to this poor woman.

    HOWEVER: In the same way that an unusual confluence of coincidences resulted in this nurse being on duty during nine deaths, is it not also possible that a terrible run of bad luck was responsible for her conviction of the “crime?” Perhaps the court was composed of officials unusually incompetent in interpreting uncertain events, perhaps her defense was unusually ineffective, etc.

    We don’t expect our health care system (or anything else, for that matter) to function without error. Nor can our justice system.

  2. Patrick said,

    April 10, 2010 at 12:36 am

    I agree that we can’t expect the judicial system to be perfect. Just like we don’t expect hospitals to be perfect. Mistakes, deaths, and wrongful convictions will happen. However, we should all hope, and expect, that, when fixable, it will take less than 6 years to rectify those mistakes.

  3. tortorific said,

    April 10, 2010 at 2:14 am

    It’s called the prosecutor’s fallacy; if it’s been abused so much by the courts that it was named after them, then you’d think they would be aware of it. There is a good discussion of it in The Drunkard’s Walk: How Randomness Rules Our Lives by Leonard Mlodinow, which was nominated along with Bad Science for the Royal Society’s science book of the year. In particular there has been a bit of discussion lately about how doctors are struggling with the same statistical problem when explaining the statistical power of tests.

  4. SteveGJ said,

    April 10, 2010 at 2:19 am

    I have to quibble with the wording of the following

    “She will hear about her appeal on Wednesday, and there is now little doubt that she will be let off.”

    She is not being “let off” – what is finally happening after this disgraceful episode is that there is finally some sense of justice. There was never anything to be “let off” from – she is finally to be found innocent of something she should never have been found guilty of in the first place.

    I realise that it wasn’t Ben’s intention, but I’m only too aware that there are people out there who do see these things as being “let off”. The choice of words in this case matters.

  5. milli said,

    April 10, 2010 at 3:19 am

    As an avid reader, you do not steal library books! Unscientific, we love the library too much... but book lovers still steal books.

  6. Quick2kill said,

    April 10, 2010 at 8:27 am

    Very shocking if this case was as you describe. On a stylistic note could you not have provided a hand waved explanation as to what p-values are and why you can’t combine them just by multiplication. My brain is too mashed right now to do it but I feel like it should be possible and IMO that would improve the article. I may just be being bitter though, in my recent attempts to do pop science writing I was told by someone it was too tough and they compared me unfavorably to you :(.

  7. gregpye said,

    April 10, 2010 at 9:41 am

    And what of Henk Elffers (the law professor)? Surely he realised his mistake? Or was told about it? Did he not make any effort to correct it? If he realised the error and was diligent in following it up then I cannot see how it could take 7 years to fix. If he was ethically negligent (I am struggling for the right expression) then he might let it ride, but I cannot see how he would have any credibility. And yet he, or someone with the same name, still appears at the faculty of Law at VU University.

  8. WilliamSatire said,

    April 10, 2010 at 10:01 am

    No smoke with out fire, I say… 😉

  9. jimmymagee said,

    April 10, 2010 at 10:09 am

    As much commenting to clarify my own understanding as anything else, but is it not more correct to say that multiplying p-values *is* correct, but one has to account for the fact that the resulting product is not uniformly distributed (in particular, minus twice the log of the product is chi-squared) the way the original p-values were?

  10. jimmymagee said,

    April 10, 2010 at 10:12 am

    Mind you, I’m not so clear on why the product of the p-values is the thing you should be interested in (rather than the most extreme p-value, say) – it seems intuitive, but I don’t know exactly why…anyone…?

  11. elbuho said,

    April 10, 2010 at 10:40 am

    Thanks for highlighting this mindboggling case. As a resident of Holland, I hope I never get sucked into the Dutch legal system!

  12. gill1109 said,

    April 10, 2010 at 10:56 am

    this comment has been removed at the request of the author following legal threats.

  13. gill1109 said,

    April 10, 2010 at 10:58 am

    BTW about the library books (they were Stephen Kings!!!). The librarian has repeatedly written to the police and the court that the books were not stolen. The library administration was in a state of transition and the statement by a higher administrator that the books were stolen is based on clerical error. But since this offence is not under review, and is minor, it remains legally unalterable. This legal nicety means that Lucia is not eligible for compensation since only part of the conviction has been revised. Dutch judges are very strict, especially the court which would be responsible for determining compensation.

    All along, Lucia’s lawyers had asked for the capital offenses to be on a separate charge sheet from the “economic” crimes. But this was turned down since the stealing proved that Lucia was a liar and a thief, hence more likely to be a serial killer.

    Kafka lives on in Absurdistan-on-the-Rhine-Delta, where the Hague, where this all happens, styles itself as capital city of European, even World Justice!

  14. SteveGJ said,

    April 10, 2010 at 11:57 am


    I suspect you missed the element of satire in Williamsatire’s comment (unless it was noted and the reply is deliberately dead-pan). It all goes to show that irony and web comments are a dangerous mixture.

  15. gill1109 said,

    April 10, 2010 at 12:55 pm

    Thanks SteveGJ! Well, satire or not, it was a splendid opportunity for me to explain why there is indeed in this case (IMHO) a lot of smoke but coming from fires of a rather different nature. I no longer feel so angry with Dutch Judges. And in fact they have learnt from their experiences and are nowadays a whole lot more suspicious of the Public Prosecution. Moreover the meanness of the public prosecution’s statements, probably meant to be some kind of damage-control for their own back yard, does them more damage still in the eyes of the public. Their mean behaviour is a way of shooting themselves in their own foot. The Dutch legal system has already learnt quite a lot from the case of Lucia de Berk. Also the Dutch scientific world and, in particular, the Dutch statistical world.

    What remains is for the medical world to learn. However, as long as they deny any responsibility, that learning process cannot start.

    In my opinion a number of major system-weaknesses, specific to the Dutch situation, were exposed by the case.

    1) The outcome of the first series of court cases was determined essentially by *one* medical specialist, the chef-de-clinique of the Juliana Children’s Hospital who was vulnerable to gossip, who made a couple of wrong diagnoses so that she herself was amazed when children suddenly died, and who had her nerdy brother-in-law further support her statistical analysis of her own thoroughly subjective and biased data-gathering. This same person oversaw the internal hospital investigation in the two weeks between the initial catalyzing event and the reporting to the police of 5 murders and 5 attempted murders. The same person was made hospital coordinator and liaison person for the subsequent police investigation. Two extremely hierarchical and powerful organisations (a large hospital and the public ministry) had to be linked up for a murder investigation, and this went through one person, who already was committed to the outcome.

    It seems to me that the hospital director was responsible for these major errors. An authoritarian manager focussed on processes, not on persons, with a habit of making rapid decisions and never looking back on decisions once made.

    2) In most modern countries where this sort of case arises the first thing that happens is not a police investigation but an independent external medical investigation.

    3) In the UK and in many other modern countries the nursing staff is much better organized and harder to ignore. Florence Nightingale? In NL, nurses have only had a single organisation representing them for a couple of years. They are largely ignored in hospital management decisions and certainly by medical specialists. A colleague of mine was in hospital for 6 weeks with a severe heart condition and took great care to note exactly what medication he was supposed to be having and what he actually got. He was given the wrong pills on 8 occasions. He told this to his heart-surgeon who exclaimed “oh those careless sluts”. Which shocked my colleague to the core, who could see that a dedicated nursing staff was doing an almost impossible job to the very best of ability. Mismanagement and understaffing, mistakes by specialists and pharmacists, contradictory instructions by specialists and their assistants, illegible prescriptions, were the order of the day. All professional medical groupings are represented at a monthly meeting with the minister of health. All except… the nurses.

    So my recommendations are:

    1) Strengthening of the role and prestige of nursing staff in hospitals.

    2) More scientific diagnostic reporting (“differential diagnostics”). In the medical-legal situation the medical specialist must discard his role of the God, who knows the right decision to make and never makes a mistake, (in life and death situations); he must adopt a more humble scientific attitude, admitting that even after post-mortem examination, the cause of death is still not known in 30% of deaths, and that three people a day die in Dutch hospitals because of avoidable medical errors.

    3) External and independent and confidential medical investigations in Lucia scenarios, before calling in the police. Probably this will often need non-Dutch speaking experts and more openness concerning health care in hospitals. The medical community must leave the 19th century; individuals must no longer be afraid to criticise others of higher status. (Staff members of the Juliana Children’s Hospital are *still* strictly forbidden to speak about the case. Dissenting voices in the investigations were quashed from the start).

    4) In the court situation, written scientific expert evidence needs to be got into the public domain as far as possible, so that the scientific methodology used can be openly discussed in the scientific community.

  16. Jessicathejourno said,

    April 10, 2010 at 1:20 pm

    @SteveGJ I suspect you missed that gill1109 was using a flip comment as a lead-in to explain where he or she feels the fire actually was (BTW if the point about the library books theft charge and compensation is accurate I’m even more disgusted and flabbergasted than I already was – I mean, HOLY SHIT).

    The Dutch aren’t usually known for their fine appreciation of irony but you really might give them the benefit of the doubt when the commenter they’re responding to had put ‘satire’ in their name.

  17. pv said,

    April 10, 2010 at 2:08 pm

    How exactly will the State recompense Lucia de Berk? I think that question is equal in importance to all the other questions that may be asked in this case.
    They cannot give her back her life, but when you compare what has happened to her with others who have requested obscene payouts for suffering nothing more than being referred to as dick heads, she certainly deserves some extraordinary offer plus a very public apology.
    And what is being done to penalise those whose errors put her in this dreadful position?

  18. SteveGJ said,

    April 10, 2010 at 2:15 pm


    I did say unless it was deadpan, which would put even more irony into the mix. As for Richard Gill (by birth, British and now a long time resident of the Netherlands), he is very well placed to judge the appreciation, or otherwise, of irony by the Dutch.

  19. aphasia said,

    April 10, 2010 at 7:31 pm

    Thanks for the detailed info Richard. It’s very interesting (and shocking) to hear how the case was dealt with by the hospital authorities. It complements Ben’s article well.

    If this terribly unfortunate woman is denied sufficient compensation due to the absurd matter of 2 library books it will complete a thoroughly disgraceful debacle. Obviously the people responsible must be held accountable.

  20. sjmurdoch said,

    April 10, 2010 at 7:53 pm

    A similar scenario occurs during the investigation of phantom withdrawals. The usual scenario is that a customer finds some money missing from his account, due to a cash machine withdrawal. He reports it to his bank and if the withdrawal happened at a cash machine far away from the customer, the bank is reasonably likely to refund the money.

    The problem occurs when the withdrawal happens near the customer’s address. Here, the banks commonly concludes that the customer probably made the transaction himself. The customer appeals, but the adjudicator who sees this case thinks that the location of the transaction cannot be by chance, so agrees with the bank.

    In fact, there are so many phantom withdrawals going on, that there are going to be a large number that just happen to occur near the victim. Now, these cases are more suspicious than the ones where the victim and disputed withdrawal are far apart, but it is by no means conclusive proof of guilt.

  21. fragmeister said,

    April 10, 2010 at 8:07 pm

    A few years ago I had a friend who was a nurse. Her nickname was Nurse Killpatient because she constantly had anecdotes about patients on her ward who had sadly died. Funny that people die in hospitals.

  22. gf said,

    April 10, 2010 at 8:21 pm

    There is some seriously bad statistics going on here. And I don’t just mean the court case referred to (that’s unfortunately as common as Common Mud).

    You are absolutely right, Ben Goldacre, that multiplying p-values of different events to come up with some new p-value for all those events is stupid. But that’s mostly because p-values are not probabilities of those events.

    Let’s repeat it in capital letters so everyone can follow along (especially the law professors amongst you).


    What are p-values then? P-values — which, by the way, are evil and should be banned from use outside a biohazard facility — are the very specialised probabilities of what you observed or more “extreme” versions thereof UNDER THE NULL HYPOTHESIS. Whatever that null hypothesis might be. [And what is “extreme” depends on the alternative hypothesis]. So it’s kind of a measure of how likely what you saw is a result of random noise under the null hypothesis rather than random noise from some more interesting process. Whatever.

    This unfortunately makes your explanation, Mr. Goldacre, of why you shouldn’t multiply p-values, wrong. If, as you put it, “you worked in twenty hospitals, each with a pattern of incidents that is purely random noise”, then the p-value could be bloody well anything, depending on what happened during your tenure (without mentioning what the null hypothesis actually was).

    To see why, let’s try it with coins. We toss 20 of them. Null hypothesis: all of the coins are fair, independent, all that jazz. You observe 10 heads. What is the p-value of this miraculously balanced phenomenon under our chosen null hypothesis? A great big 1. If you observe the even more miraculous 20 heads instead, your p-value (under the alternative hypothesis that the coins are not fair but without saying whether heads are more likely than tails or vice versa) will be 2x[(0.5)^(20)], which is about 2x(10^(-6)). That’s about as diverse a range as possible.

    So when you wrote “p=0.5”, Mr. Goldacre, you meant the probability of a binary variable being 1 under your null hypothesis is 0.5, not that the p-value of what you observe under that null hypothesis will be 0.5. I told you p-values were evil. Their being evil extends to being nigh-on impossible to combine for different events without going back to scratch and just working it out afresh.
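The two figures in the coin example above can be checked with a short Python sketch. It assumes the same setup as the comment: 20 independent tosses, a null hypothesis that every coin is fair, and a two-sided notion of "extreme" (at least as far from 10 heads as what was observed). The function name is hypothetical.

```python
from math import comb


def two_sided_p_value(heads: int, tosses: int) -> float:
    """Exact two-sided p-value under a fair-coin null hypothesis.

    Adds up the probability of every outcome whose head count is at
    least as far from tosses / 2 as the observed head count.
    """
    observed_gap = abs(heads - tosses / 2)
    extreme_outcomes = sum(
        comb(tosses, k)
        for k in range(tosses + 1)
        if abs(k - tosses / 2) >= observed_gap
    )
    return extreme_outcomes / 2 ** tosses


if __name__ == "__main__":
    print(two_sided_p_value(10, 20))  # the "miraculously balanced" case
    print(two_sided_p_value(20, 20))  # twenty heads in a row
```

Ten heads gives a p-value of exactly 1 and twenty heads gives 2 x 0.5^20, so the same pure-noise process can produce p-values anywhere across that range depending on what happens to be observed.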

    Now to the commenters: here’s my personalised statistics correction service just for you! Gratis!

    Tom: Sometimes a process that can seem random stops being so when you make more observations. Apparent randomness does not justify sloppiness. Maybe the lady was unlucky to get incompetent or malign people working on her case, but that just means we haven’t done enough to reduce the risk of that happening. And as Patrick said, once we find out that something went wrong, whether it was out of our control or not is irrelevant to bloody well sorting it out.

    Tortorific: The Prosecutors Fallacy involves saying that the probability of observations under a hypothesis is the same as the probability of the hypothesis. It’s a result of forgetting (or refusing) to apply Bayes Theorem. Nothing to do with the case here, so far as I can tell. That the legal profession still makes outrageous statistical mistakes despite having a fallacy named after them is well-founded, though… [By the by: Why would the court hire a law professor to give statistical advice? Seriously? I could have done a better job for half the price!]

    Jimmymagee: I hope my explanation of p-values suffices. A small aside: twice the logarithm of the likelihood ratio of the two hypotheses is (approximately, ymmv, etc) chi-squared. Don’t do anything with p-values, ever, except leave them as they are. Actually, don’t do anything with p-values, ever, full stop. Because they’re evil, as I hope I’ve convinced you all by now…

  23. andrevandelft said,

    April 11, 2010 at 12:13 am

    Richard Gill wrote: “individuals must no longer be afraid to criticise others of higher status”

    The importance of this is illustrated by today’s TimesOnline item on the air crash in Russia:

    “Kaczynski … had every reason to believe he was not welcome in Russia. Polish observers said he may have interpreted an order to divert to Moscow as an attempt to sabotage his big day in Katyn, where he was due to attend a mass and give a speech.

    Russian media reports said he had once become angry with a pilot who refused to land in Tbilisi, the Georgian capital, on the grounds that it was unsafe. The same thing may have happened at Smolensk, aviation experts claimed. They suggested he may have pressed the pilot to make at least two attempts to land.”

  24. Colette Iris said,

    April 11, 2010 at 9:01 am

    Hi. Short time reader and first time poster here, registering to also give p-values a kicking. P-values are one of the cheapest ways that drugs reps try and bamboozle doctors. Their favourite phrase is “this study shows that ourdrugalol is significantly better than theirdrugidone”. No it doesn’t! It shows that you found some difference in results and that that difference was unlikely to be due to chance. (I think that’s what it means anyway.) The difference itself is then often completely irrelevant or unimportant, but they use the low p-values to try and make you think there’s a huge real world difference. Still, the sandwiches are nice.

  25. jimmymagee said,

    April 11, 2010 at 8:05 pm


    I guess the problem in your toin cossing example is that you have to assign an ordering to the events? And put the “least extreme” event – 10 heads – on one side. Thus, the p-value IS a probability, just not of particular events, rather of events “as or more extreme” than that in question. And this is why combining them is problematic, because there isn’t a default way to combine orderings on Cartesian products (or something along those lines).

    So, is Ben not justified in his example? He can just adjust it to be “0.5 chance of an incident as or more extreme” occurring. And then multiple such incidents occurring is nothing to write home about, but multiplying the p-values would make it look like it is, because what you’re actually calculating is the probability of ALL the events being as or more extreme than the observed ones (in the right order), whereas what you’re interested in is the probability of the events being “on average” more extreme than the observed (again, difficult to totally order Cartesian products). All assuming the null hypothesis, of course.

    I do thus find the Fisher method a bit odd, as it seems to assume a certain total ordering of the product space, which may not necessarily be the most interesting one. So, assuming the null hypothesis, it will give “significant” p-values the right proportion of the time, but will it correctly identify the right cases? Anyone?

    @gf, again – the likelihood ratio of which two hypotheses?

  26. maus said,

    April 11, 2010 at 9:59 pm

    @9 “No smoke with out fire, I say… ;-)”

    Childishly stupid and inaccurate analogies do not make you appear any wiser, especially on a site devoted to bad science. Is this an active troll or do you actually believe that people will cheer on your use of folksy-isms? Think more, post less.

  27. mikewhit said,

    April 12, 2010 at 9:32 am

    Interesting medical stats PhD project – do a similar (but better founded in theory!) analysis for ALL hospital personnel in Dutch hospitals to demonstrate (or not) that there are potentially many ‘de Berk’ instances.

    Point two.
    Despite this, does the ‘medical system’ need some recognised method of performing _this kind_ of analysis to pick up (or at least, trigger a check on) real wrongdoers?

    Or is this another case of wanting a ‘systems-based’ approach, rather than verified due diligence / best practice in the lower echelons?

  28. fontwell said,

    April 12, 2010 at 10:59 am

    I’m glad something is finally happening with the case, even if it is crazy slow. I remember Ben writing about this in the past and was staggered that it could happen and also that the Dutch system seemed determined to ignore it.

    I think we should fund a statue commemorating this lady and Sally Clark (the cot deaths case) as victims of gross incompetence with basic statistics.

  29. JustAsItSounds said,

    April 12, 2010 at 11:17 am

    @25. maus

    Childishly stupid and inaccurate analogies do not make you appear any wiser, especially on a site devoted to bad science. Is this an active troll or do you actually believe that people will cheer on your use of folksy-isms? Think more, post less.

    The little smiley face thing after the ellipsis clearly indicates tongue firmly planted in cheek. A concise parody of the typical Daily Mail reader attitude towards anyone accused of a crime, regardless of their guilt. You may want to take your own advice about posting and thinking in future.

    In short: Whoosh.

  30. MedsVsTherapy said,

    April 12, 2010 at 3:02 pm

    I guess I am more research-minded than sports-minded: we take a visiting niece to the local pro baseball games when she is in town. It seems like the home team wins all the time on these occasions (like maybe 4 per season). This got me to thinking about the stress that a fan might have if they believe they are the team’s good-luck charm, attending some home games and missing others, with a win each time they attend. Going into a new season, out of 5,000 plus fans each game, and all the season-ticket holders, there will eventually be that one person who attends the winning games, and fails to attend the losing games, for a note-worthy run, before the inevitable. Pro baseball teams quickly end up near .5, so the coin toss analogy is a helpful prototype for this kind of likelihood reasoning. Does the person think: I must attend, or I will be responsible for their loss? Or, once the first loss happens, does the fan start trying to figure out what the fan has done wrong? Worn sandals not sneakers, etc.? Instead of keeping track of the batter’s count, my eyes are roving the stands, wondering which of these individuals has such an unlikely, but eventual, coincidence.

  31. JayScottGreenspan said,

    April 12, 2010 at 3:36 pm


    Try reading more, and typing less.

    “let’s say p=0.5” basically means p could be anything, we don’t know what p is, but let’s just suppose p is 0.5 so we can see how ludicrous multiplying them can be.

  32. NeilHoskins said,

    April 12, 2010 at 4:13 pm

    Thanks for the update, Ben. I’d been wondering about her case ever since you flagged it up.

  33. demirole said,

    April 12, 2010 at 7:31 pm

    I don’t agree with your ‘evil’ statement at all: p-values are a tool to describe the significance of rejecting/trusting a certain hypothesis. Period. Properly used they do their job. I work in astroparticle physics, we use p-values all the time to characterize the significance of a search for sources in the sky. Nothing wrong or evil about that!

    @jimmymagee: likelihood ratio of e.g. “signal” hypothesis to “background”/“null” hypothesis. See the Neyman-Pearson lemma.

  34. Ben Goldacre said,

    April 12, 2010 at 9:24 pm

    this from henk elffers, happy to post below, and i’ve asked if he could send a copy of his report:

    Dear dr. Goldacre,

    It is a pity you did not reach me as I was absent, (though otherwise the time you allowed me for reaction was extremely short as well), so that I could not correct your complete misunderstanding of both my report to the court in the De Berk case and the role it played in evidence. Bad journalism, I say.
    It is sad to observe that your comment was entirely based on hearsay, from a source that should have known better.

    I felt obliged to send a reaction to the editor of the Guardian.

    I enclose the text of it.


    Henk Elffers, senior researcher Netherlands Institute for the Study of Crime and Law Enforcement NSCR


    Dr Ben Goldacre (Guardian of April 10, 2010) gets hold of the wrong end of the stick in his comments on the role of statistics in the murder conviction of the Dutch nurse Lucia de Berk, whose case is now retried. His treatment completely misapprehends the statistical report prepared by me for the court in first instance, and he misunderstands the role of the report in evidence.

    No, my statistical report did not neglect the Texas sharpshooter problem. For that reason, it explicitly investigated the question whether, given the number of nurses and their roster of duty, and given the number of incidents, it is compatible with chance that someone, whichever member of the staff, would be present at all incidents.

    No, for the analysis used, it is not relevant what the rate of occurrence of incidents was in previous periods, as the analysis was conditional on the period under consideration.

    No, the analysis was not done after arbitrarily dividing the data in three portions. I investigated the question whether someone (whichever member of the nursing staff) could have met all incidents in the first hospital by chance (result: very unlikely). Subsequently, I investigated the question whether the person thus identified could have met by chance the incidents as observed (given rosters etc.) in two previous appointments (results, respectively: unlikely and slightly unlikely).

    No, one should not combine these results by means of Fisher’s method, as the first test does not address the same hypothesis as the other two. It is better to state the individual test results as such. As a matter of fact, I did not present the multiplied tail probabilities as a combined significance test, though I have been criticised for not being crystal clear in my way of reporting this part of my results.

    No, the demonstrated incompatibility with chance is not a proof that the accused, though present at all incidents, has had a hand in them. I pointed this out to the court, suggesting several possible alternative explanations. The court, therefore, decided that the statistical argument could not be used for a conviction, as indeed Dutch judges had already ruled in a similar case.

    No, mrs. De Berk has not been convicted on statistical evidence, but on medical evidence. The revision of her process hinges on the fact that new medical experts, reanalysing toxicological evidence, now testify that no unnatural deaths have taken place. Indeed, no statistical evidence is brought forward in the revision proceedings at all. Of course, the statistical analysis as presented loses all relevance if the occurrence of incidents has been recorded incorrectly.

    Henk Elffers (M.Sc. mathematical statistics, Ph.D. psychology of law)

  35. gill1109 said,

    April 12, 2010 at 10:01 pm

    Attaboy Henk! But have you read the verdict where the court copiously copies your verbal argument that correlation does not imply causation and therefore your alternative explanations need to be answered, one by one? Or is that a coincidence that the court *exclusively* considers your alternatives to Lucia being a murderer?

    Of course we should not blame you for this. It was your colleague and collaborator law professor Richard de Mulder who explained your advanced maths to the court and told them that what it meant was that *Lucia* had to explain her presence at the events.

    You did not mention half a dozen other confounders, and for that I must blame the so-called experts for the defense, who certainly did even more damage than you had already done. They agreed that the numbers were amazing but argued that there were different models which led to different probabilities and therefore that no single probability could be given as “the probability that it was not chance”.

    The judges obliged by writing on page 1 of their 80 odd summary of the verdict, that “a statistical probability calculation plays no part in the conviction”. Indeed, no calculation, and they scrupulously avoided words like “statistics” or “chance”. Instead they got medical doctors to swear that so many events in such a short time is totally impossible in normal circumstances. Other medical doctors swore that the event about which they were asked an opinion was an unnatural event. In their written evidence to the court it says that the reason they say this is because Lucia was present, because otherwise they would have considered the event unsurprising.

    The court calls this “incontrovertible medical-scientific evidence that all the deaths were unnatural”. Yes it is incontrovertible because the court has decided it is true. Scientific because a prof. dr. in medicine signed his name to it.

    In one case all but one of six experts thought the event was natural. One thought it was unnatural. The court writes that this is the good expert because he alone could see that the event was unnatural, and writes that medical science has advanced so far that a good expert could see even on the basis of a single A4 of rough notes whether a death was natural or unnatural.

    My friend, you were screwed, but I forgive you, because if it hadn’t been for you, the statisticians would never have got involved in the case (*they* have nothing to lose), and because the statisticians got involved in the case, it got reopened. So if it wasn’t for you, Lucia would have perished in jail long ago and nobody would have cared or noticed.

    I was disappointed that you chose to side with the conservative lawyers rather than the liberal scientists (where you were born and raised), but I guess he who pays the piper calls the tune, and that you value a chair in a law department more than the respect of mathematicians. Who after all have no power whatsoever.

  36. mikewhit said,

    April 13, 2010 at 10:40 am

    C’mon, gill1109, we don’t do ad hominem here …

  37. jimmymagee said,

    April 13, 2010 at 6:41 pm


    Yes, I just wasn’t/amn’t sure what two hypotheses @gf is referring to…

    Is my understanding/lack of understanding of how the Fisher method works/doesn’t work totally off?

  38. quasilobachevski said,

    April 13, 2010 at 9:53 pm


    You seem very angry about Ben’s treatment of p-values. We all have our favourite ways of explaining complicated scientific or mathematical phenomena, and often popular science writers come up short. But given the space restrictions, I think Ben dealt with the issue at hand pretty well.

    Describing specific null and alternative hypotheses might have made pedants happy, but it would have wasted space and I doubt it would have been of much help to anyone who doesn’t already know what a p-value is.

    Also, Ben never asserted that p-values are ‘probabilities of the data’. I can see why you think p-values are evil. They certainly seem to have confused you!

  39. Jerry said,

    April 14, 2010 at 12:38 pm

    Other medical doctors swore that the event about which they were asked an opinion was an unnatural event. In their written evidence to the court it says that the reason they say this is because Lucia was present, because otherwise they would have considered the event unsurprising.

    wtf? that’s a logical fallacy if I ever saw one, and one the judges should’ve thrown out.

  40. hartkp said,

    April 14, 2010 at 1:18 pm

    This morning she was exonerated completely.
    Haven’t found an English link yet, but here’s the story in Dutch.
    Bij monde van voorzitter Van den Heuvel oordeelde het hof dat niet gesteld kan worden dat er moorden hebben plaatsgevonden, laat staan dat De Berk die gepleegd zou hebben
    Speaking through its chair, the court judged that it can’t be stated that murders actually took place, let alone that De Berk would have committed them.

  41. DrJG said,

    April 14, 2010 at 10:26 pm

    I doubt that Henk Elffers reads this blog, but if so, I would appreciate his response to the comments reportedly made by law professor Theo de Roos in 2003:
    “In the Lucia de B. case statistical evidence has been of enormous importance. I do not see how one could have come to a conviction without it”.
    This seems to me to be somewhat at odds with Henk Elffers’ statement above:
    “No, mrs. De Berk has not been convicted on statistical evidence, but on medical evidence.”

    I will risk people invoking Godwin’s Law, but claim the justification of listening to a radio play this afternoon about Nazi architect Albert Speer’s incarceration in Spandau. I looked him up on Wikipedia and found he was about the only one to plead guilty at
    “In political life, there is a responsibility for a man’s own sector. For that he is of course fully responsible. But beyond that there is a collective responsibility when he has been one of the leaders. Who else is to be held responsible for the course of events, if not the closest associates around the Chief of State?”

    Although he denied knowledge of the death camps, he did not regard that as exonerating him:

    “For from that moment on I was inescapably contaminated morally; from fear of discovering something which might have made me turn from my course, I had closed my eyes … Because I failed at that time, I still feel, to this day, responsible for Auschwitz in a wholly personal sense.”

    Henk Elffers, on the other hand, having been a major prosecution witness in a gross miscarriage of justice, blusters, retrospectively self-justifies, and tries to pass the buck to others.

    I can understand a bit of ad hominem on gill1109’s part…

  42. frettled said,

    April 14, 2010 at 10:55 pm

    I heard it on the BBC World Service at 9.30 pm GMT, so this is one story where mainstream media apparently wants to report the exoneration, too. No names, though.

  43. KMBayes said,

    April 15, 2010 at 4:57 pm


    Your answer is a little unkind. Based upon gf’s answer it’s fairly clear to me that he’s much more comfortable with the Bayesian approach to statistics and I’m in full agreement. For those of you with an interest in stats who are unfamiliar with the Bayesian approach, I suggest you get acquainted. It’s the way forward.

  44. Bexley said,

    April 17, 2010 at 12:25 am



    Poor example – Speer pulled a very artful con job at Nuremberg, claiming he didn’t know about the death camps; his balancing act helped him escape the noose.

    On the morning of 6 October 1943 Speer gave a speech at Posen to the Gauleiter. In the afternoon Himmler gave a speech in which he casually mentioned the extermination of Europe’s Jews:

    The brief sentence “The Jews must be exterminated” is easy to pronounce, but the demands on those who have had to put it into practice are the hardest and most difficult in the world … We, you see, were faced with the hard question, “What about the women and children?” … The hard decision had to be taken to have this people disappear from the face of the earth. … By the end of this year, the matter of the Jews will have been dealt with …

    Clearly anyone at Posen listening could hardly claim ignorance about the plight of the Jews. Speer maintained that he left the conference after his morning speech. However there is no evidence of this – moreover Himmler addressed Speer directly during his speech, clearly implying Speer was still there with him.

    Of course none of this has anything to do with party comrade Speer: it wasn’t your doing.

    Speer’s rather lame defence was that Himmler must have forgotten his glasses and mistaken someone else as him!

    Two of Speer’s close friends (two of the Gau) were definitely at the speech making it even more implausible that Speer did not know the content even if his excuses are true and he wasn’t there. Moreover as armaments minister his job would have brought him into contact with all the information he needed. He would have known about the forced labour used in Germany’s factories. And his visits to occupied Poland and Ukraine (especially the factories) should have left him in little doubt as to the nature of the regime he served.

  45. quasilobachevski said,

    April 21, 2010 at 6:08 pm


    There’s no problem with preferring Bayesian statistics – no one denies that the Bayesian point of view has a lot to offer. There is a problem with saying ‘You’re wrong’, as gf basically did to Ben, when s/he really meant ‘I prefer a different point of view.’

    As Ben points out in his book, we are all very poor judges of our own areas of competence.

  46. gf said,

    April 24, 2010 at 9:01 pm

    No anger was involved, and I’m a little sad to see such defensiveness. I really appreciate Ben’s work and am a proud owner of his book.

    However, I did notice some confusion by both him and some commenters which I tried to (gently?) correct. I believe I was very specific about what the mistakes were. For the record, I am a qualified statistician. Not that that will help in a fire…

    P-values *are* pretty confusing things when you look closely, actually, and as KMBayes correctly surmised, I’m uncomfortable with their popularity and wish Bayesian analyses were more mainstream (although I happily acknowledge this is happening, slowly). I have seen them being abused too many times even in the supposedly hallowed “peer-reviewed literature” and just used the opportunity to fulminate against them a little, hopefully educationally.

    JayScottGreenspan, I don’t understand your need to advise me to ‘Try reading more, and typing less’. What did I miss, exactly? You claim that ‘“let’s say p=0.5″ basically means p could be anything, we don’t know what p is, but let’s just suppose p is 0.5 so we can see how ludicrous multiplying them can be’. Fair enough; I was more concerned with Ben’s previous sentence, which was ‘a pattern of incidents that is purely random noise’, which as I tried to explain with my coin-tossing example is nothing to do with p-values. Nothing to get your panties in a twist about, but I felt given my area of competence (which, in the spirit of your valuable if patronising advice that we are poor judges of our own, I refer to my degrees in) it would be remiss not to say something.

    Keep up the good work Ben.
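
    For readers trying to follow the argument about multiplying p-values, here is a small sketch of the arithmetic using the "let's say p=0.5" example quoted above. It compares the raw product of several p-values with Fisher's method, which refers -2 times the sum of the log p-values to a chi-squared distribution with 2k degrees of freedom. The twenty values of 0.5 are purely illustrative, and the sketch assumes independent tests of the same null hypothesis; whether that assumption holds for the three tests in the De Berk report is precisely what Henk Elffers disputes in his letter.

```python
import math

def naive_product(p_values):
    """Multiply the p-values together (the approach being criticised)."""
    return math.prod(p_values)

def fisher_combined(p_values):
    """Combined p-value from Fisher's method for k independent tests.

    Uses the closed form of the chi-squared survival function for an even
    number of degrees of freedom (2k):
        P(X > x) = exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!
    """
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # equals x / 2
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= half_stat / j   # builds (x/2)^j / j! incrementally
        total += term
    return math.exp(-half_stat) * total

# Illustrative input: twenty completely unremarkable results, each p = 0.5.
ps = [0.5] * 20
print(naive_product(ps))    # looks wildly "significant"
print(fisher_combined(ps))  # nowhere near significant
```

    With a single p-value the two functions agree. With many unremarkable ones the raw product shrinks towards zero simply because more numbers below 1 are being multiplied, while the Fisher combination does not.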

  47. quasilobachevski said,

    April 26, 2010 at 7:18 pm


    To me, your original comment was the opposite of a ‘gentle’ correction. Your repeatedly referring to Ben as ‘Mr. Goldacre’ came across (again, to me) as sarcastic and aggressive. (It’s notoriously easy to strike an unintended tone in blog comments threads, of course.)

    Your comment’s unfortunate tone combined with the fact that no one else here thinks that Ben’s statistics were incorrect! We all know (you’re not the only one with a stats qualification here) that strictly speaking one can’t assign p-values without making a pair of hypotheses. But this is a newspaper article, and Ben’s elision of this point seems reasonable, appropriate and not even slightly misleading.

    In short, your original comment came across as aggressive and wrong. You ask JayScottGreenspan what you missed – the answer is context.

  48. pessimizer said,

    May 3, 2010 at 5:21 pm

    I registered, rather than continuing to lurk, in order to thank gf for his/her comment. It reminded me of the things that I had forgotten since college, and made the errors of the prosecutors clearer to me than they were from simply reading the entry. I think that Goldacre did a good job in explaining the problem, but when you’re trying to explain slightly tough mathematical ideas to people who may not have, or may have a rusty, mathematical background, you have to make a choice about how vague you’re going to be so the largest portion of the audience gets the gist of the idea, and the smallest portion leave with an accidentally induced misunderstanding of some mathematical point. The fact is: if you could leave out something in a mathematical explanation and have it remain as accurate, mathematicians wouldn’t have added it in the first place.

    I think that Goldacre made a necessary sacrifice by being vague about what p-values are, other than ‘things that do not directly represent probabilities’ and ‘things that cannot be simply multiplied together.’ His explanation is a superset of yours, which is more accurate. *30% of readers will walk away from the original post with a misinterpretation of the statistical argument. You suggest that the misuse of p-values is such an important issue that it would be better to be more specific – maybe bringing that number down to 20%, but maybe sacrificing a few readers altogether.

    Your comment was aggressive, but only towards p-values and their misuse, and that’s a pretty righteous thing to be aggressive about.

    [*] statistics drawn from my ass.


    Referring to someone respectfully while criticising them isn’t by definition sarcastic. Some people don’t feel comfortable referring to people by their first names that they don’t personally know, especially while criticising them.

    Also: ad populum, ad verecundiam (the only reason gf defended his/her qualification is because it was attacked), bald-faced assertion (we don’t all know anything.)

    I’m not sure what argument from “unfortunate tone” falls under, but you know that it’s not right. I thought the tone was fun. I’ve never seen somebody so pissed off by the mere existence of p-values:)

    Sorry, I just couldn’t let that attack stand as the last comment on the issue.

  49. gill1109 said,

    May 19, 2010 at 10:26 am

    Haga hospitals is threatening legal action against me because of the comments I made on various internet articles and blogs on the case of Lucia de Berk, including here. Apparently these count as “publications” and they are all the more serious because I am a university professor, hence people are bound to believe every word I write. Apparently it would have been OK to post the comments anonymously, and to replace the names of various key persons by their functions (former chef-de-clinique/chief paediatrician at JKZ; former director of JKZ). And of course every statement should be preceded by “apparently/allegedly/it’s my opinion that”. Finally: irony, hyperbole, or understatement are not appreciated.

    It’s my opinion that understanding the personalities of the key persons around which this whole case revolved, is the key to understanding why there was a case at all. Moreover: “tout comprendre c’est tout pardonner” – this understanding exonerates those persons (chef-de-clinique/chief paediatrician at JKZ; director of JKZ) from any blame, of course they could not do otherwise and would do the same again today, if the case happened again; they have both said so, in public! Allegedly (ie, according to the reports of the Committee for Reconsideration of Closed Cases to the Dutch Public Ministry, of Advocate-General Mr. Knigge to the Dutch Supreme Court, and of prof Meulenbelt to the high court at Arnhem) they committed gross errors of professional judgement (both medical and managerial) which caused the whole catastrophe to explode out of control, devastating lives, almost killing Haga’s employee Mrs Lucia de Berk (she lay paralyzed by a stroke on the floor of her cell for 10 hours and was later refused decent medical treatment), ruining the Netherlands’ international reputation for justice and humanity, and costing the Dutch taxpayer millions of Euros.

    Once we know that those couple of people are not to blame, it follows that blame falls on the organisation around them (and above them). It follows that a great deal needs to be learnt about the root causes of the case, so as to prevent some bad luck and some unlucky personality interactions from unleashing a new social nuclear bomb and nuclear winter yet again.

  50. gill1109 said,

    May 19, 2010 at 10:33 am

    Below is an open letter which I sent to the board of Haga hospitals, the Hague, yesterday. (Haga hospitals is the new name of the merged hospitals JKZ, RKZ and Leyenberg at which Lucia worked and where the murder cases were fabricated out of bad luck and gossip.) In it, I request that a scientific investigation is made of the original data on the basis of which Lucia was convicted (first explicitly, later implicitly). Henk Elffers never asked to see the sources of the data he was given by police which they had got from the hospital. No one has ever seen that. Some parts of it have been reconstructed from documents in possession of the defence and it turns out that the data was badly biased indeed. Thus Elffers reported an irrelevant number (1 in 342 million) computed from bad data using an inappropriate model which did not take account of hidden confounders. It turns out that Lucia had more weekend shifts than the average and that incidents often happen at weekends. As everyone knows… But no-one asked and no-one checked.
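
    The weekend-shift point is a standard hidden-confounder effect, and a toy calculation shows how it can inflate apparent coincidence. All figures below are invented for illustration and are not taken from the case data: a ward where incidents are more frequent per weekend shift than per weekday shift, and a nurse who works a larger share of weekends than the ward average.

```python
# Toy illustration of a hidden confounder (shift type). All numbers are
# hypothetical; none come from the actual hospital records.

def pooled_expectation(nurse_shifts, ward_shifts, ward_incidents):
    """Expected incidents on the nurse's shifts if every shift is treated alike."""
    rate = sum(ward_incidents.values()) / sum(ward_shifts.values())
    return rate * sum(nurse_shifts.values())

def stratified_expectation(nurse_shifts, ward_shifts, ward_incidents):
    """Expected incidents when each shift type keeps its own incident rate."""
    return sum(
        nurse_shifts[kind] * ward_incidents[kind] / ward_shifts[kind]
        for kind in nurse_shifts
    )

ward_shifts = {"weekday": 700, "weekend": 300}
ward_incidents = {"weekday": 7, "weekend": 9}    # weekends are riskier per shift
nurse_shifts = {"weekday": 60, "weekend": 60}    # more weekends than average

print(pooled_expectation(nurse_shifts, ward_shifts, ward_incidents))
print(stratified_expectation(nurse_shifts, ward_shifts, ward_incidents))
```

    A model that ignores shift type expects fewer incidents on this nurse's shifts than a model that accounts for it, so her observed count looks more surprising than it should. This only sketches the mechanism; it says nothing about the size of the effect in the real data.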

    Dear board of Haga

    I was yesterday invited to give a major lecture to KNAW (the Royal Dutch Academy of Sciences) on the case of Lucia de Berk in all its ramifications and societal aspects. I would so like to be able to report that a mutually respectful and beneficial collaboration between scientists and Haga Hospitals is now helping to clarify what really happened, and to uncover the lessons that should be learnt for the future.

    Now that the Lucia case is completely closed – in particular, there were no murders or otherwise unnatural deaths at all – I suppose there can no longer be any objection to a thorough multidisciplinary and in particular statistical / epidemiological analysis of medical incidents at JKZ, say between 1995-2005. This would be so valuable for the future, and moreover, in accord with the current insight that sophisticated scientific evidence in the legal context has to be made as publicly available (to scientific inspection) as possible (cf reports of US Academy of Sciences, adopted by many scientific organizations worldwide).

    Thanks to the investigations of Meulenbelt, Tytgat and Aderjan we now know that the nurses at JKZ worked in emergency situations with exemplary professionality, in contrast to the more mundane level of diagnosis and treatment. Moreover their insights into the medical state of the babies in their care was often better than that of specialists or their assistants, though usually not acknowledged as such.

    Unfortunately, 30% of the Dutch population still believe that your former employee Lucia de Berk is a serial killer, and influential circles at the top of your hospital still continue to broadcast the slanderous accusations that “there is so much more against her” and “the whole case is nothing but an out-of-control family feud driven by the jealousy of some family members for the much more successful careers of others”, and “it became an awful media hype, what could 100 professors of statistics or a second rate novelist know about the case?”.

    Despite repeated attempts from 2004 onwards to warn the concerned individuals and organizations that something was terribly wrong with the whole case, your hospital and its senior personnel took no notice whatsoever, but instead intensified attempts to discredit those who had uncovered this particularly inconvenient truth.

    Doctors and nurses in confidence told us of many persons’ deep concern about the case, but no-one dared to speak out. Even a retired medical specialist wouldn’t say anything in public, since that would damage the prospects of his children in medical school. The few medical experts who dared to contradict or criticise findings of some colleagues during the trial were later ostracized by other colleagues for the breach in collegiality.

    Highly confidential inside information about the case was repeatedly leaked from the Public Ministry and from the judiciary (even from the supreme court) to senior employees of your hospital. Those “outsiders” unfortunate enough to be driven by a dedication to justice and truth were subject to vile personal attacks in the media, accused of undermining the foundations of the state, and subjected to phone-taps and ostracism.

    Vile disinformation about your former employee Mrs de Berk was leaked from your hospital to the press. Critical police investigators were taken off the case and critical hospital employees were silenced.

    But that is all in the past now.

    As always, despite this past, I remain hopeful of a mutually fruitful, mutually respectful, and civilized (gentlemanly) collaboration in the future.

    Yours sincerely
    Richard Gill

    Distinguished Lorentz Fellow 2010-2011
    President of Dutch Statistical Society
    Member of KNAW