In All Probability
As a teenager, I had the good fortune to meet Tom Körner. He had written a book, The Pleasures of Counting, that opens with a story about John Snow.
Snow was a 19th-century English physician who painstakingly collected and analyzed vast amounts of data to convincingly argue that cholera spreads through contaminated drinking water, and not, as was once widely believed, from some kind of air pollution.
This struck me as a powerful example of how mathematics, and in particular, statistics, can impact our lives. How many more would have succumbed, had the true cause remained hidden?
Yet today, probability and statistics seem much maligned. Statistics are worse than "damned lies"; they’re "pliable"; one can "prove anything by statistics except the truth"; they are the means to produce "unreliable facts from reliable figures".
How did this happen? I’m sure these famous quotes were composed mainly in jest, and perhaps referred to shady accounting more than actual calculation. But these days, even the mathematics itself seems suspect:
- The Reformation: Can Social Scientists Save Themselves? (May 2014)
- Scientific method: Statistical errors (February 2014)
- The Curse of P-values (November 2013)
- Revised standards for statistical evidence (October 2013)
Statistics is indeed a troubled subject. It turns out some guy named R. A. Fisher is to blame. Fisher had a tragic combination of gifts and flaws that led to today’s erroneous orthodox statistics. (Despite an ever-growing mountain of evidence, Fisher steadfastly refused to believe smoking causes lung cancer. How good could his methods be?)
My undergrad introductory course on probability and statistics followed Fisher’s dogma. As a result, the methods they taught seemed more like black magic than mathematics. But I was convinced that the lecturer only seemed to be teaching superstitions because my understanding was too shallow, and I concluded I must have a poor intuition for the subject.
Years later, and determined to conquer my weakness in this area, I went back to my textbook. And some other books. I discovered the shocking truth: my textbook is wrong. For once, a crazy conspiracy theory was true and They really were corrupting us all with Their false mathematics.
Epilogue
I heard from Fred Ross that this page got posted to Hacker News. He had this to say:
The underlying theory that justifies most inference (Bayesian, minimax, etc.) is decision theory, which is a subset of the theory of games. Savage’s book on the foundations of statistics has a very nice discussion of why this should be. I learned it from Kiefer’s book, which is the only book I know of that starts there. Lehmann or Casella both get to it later in their books.
The justification for p-value is actually the Neyman-Pearson theory of hypothesis testing. The p-value is the critical value of alpha in that framework. I wrote a couple of expository articles for clinicians going through this if you’re interested.
Jaynes was a wonderful thinker, but be aware that a lot of the rational actor theory breaks down when you don’t have a single utility function. That is true of using classes of prior (see the material towards the end of Berger), or in sequential decision problems (look at prospect theory in psychology, where the overall strategy may have a single utility function, but local decisions along the way can’t be described with one). So the claims in the middle of the 20th century for naturalness of Bayesian reasoning haven’t held up well.
My views have since hardened. I disagree with Ross. Any flaws in Jaynes' unfinished work are more than compensated for by David Mackay’s Information Theory, Inference, and Learning Algorithms.
Cox’s theorem is the underlying justification for Bayesian reasoning. In particular, if there are multiple ways to solve a problem, naturally we desire confluence: all roads ought to lead to the same solution. Non-Bayesians view this as unnatural!
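To make "confluence" concrete, here is a minimal sketch. It assumes a toy coin-bias problem that is not part of the original discussion; the hypotheses, the uniform prior and the `update` helper are all illustrative. Two observations are absorbed in either order, or both at once, and every road arrives at the same posterior.

```python
from fractions import Fraction

def update(prior, likelihood):
    """One Bayes step: multiply prior by likelihood, then renormalize."""
    unnorm = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnorm.values())
    return {h: p / total for h, p in unnorm.items()}

# Hypothetical setup: three candidate biases for a coin, uniform prior.
hypotheses = [Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)]
prior = {h: Fraction(1, 3) for h in hypotheses}

heads = {h: h for h in hypotheses}      # P(heads | bias h)
tails = {h: 1 - h for h in hypotheses}  # P(tails | bias h)

# Road 1: see heads first, then tails.
road1 = update(update(prior, heads), tails)
# Road 2: see tails first, then heads.
road2 = update(update(prior, tails), heads)
# Road 3: absorb both observations at once via the joint likelihood.
both = {h: heads[h] * tails[h] for h in hypotheses}
road3 = update(prior, both)

# Exact rational arithmetic, so equality is not a floating-point accident.
assert road1 == road2 == road3
```

Exact fractions are used so that the agreement is a statement about the rules of probability rather than about rounding.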
As for decision theory, see Chapter 36 of Mackay for a one-equation summary. Decision theory builds on Bayesian reasoning; to justify the latter with the former is to put the cart before the horse.
(Appealing to psychology is dubious. There exist many humans, known as sampling theorists or frequentists, who somehow reason without a sound mathematical foundation. Why then should a utility function be universally appropriate?)
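The dependence of decision theory on Bayesian reasoning can be sketched in a few lines. The rule is the usual one: choose the action that maximizes expected utility. The sketch below assumes a discrete set of states; the pump scenario, the probabilities and the utilities are invented for illustration and come from no source. The point is that the posterior must exist before any decision can be made.

```python
from fractions import Fraction

def expected_utility(action, posterior, utility):
    """E[U | a]: sum over states x of P(x) * U(x, a)."""
    return sum(p * utility[(state, action)] for state, p in posterior.items())

def best_action(actions, posterior, utility):
    """Pick the action with the highest expected utility."""
    return max(actions, key=lambda a: expected_utility(a, posterior, utility))

# Hypothetical posterior, i.e. the output of some earlier Bayesian inference.
posterior = {"contaminated": Fraction(3, 10), "clean": Fraction(7, 10)}

# Hypothetical utilities: closing the pump is a small fixed nuisance;
# leaving a contaminated pump open is catastrophic.
utility = {
    ("contaminated", "close_pump"): -1,
    ("clean", "close_pump"): -1,
    ("contaminated", "leave_open"): -100,
    ("clean", "leave_open"): 0,
}

actions = ["close_pump", "leave_open"]
choice = best_action(actions, posterior, utility)
```

Note that `posterior` is an input to `best_action`: the inference comes first and the decision second.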
Chapter 37 demolishes frequentism/sampling theory with flair. Mackay proposes an ingenious compromise. He first observes that "from a selection of statistical methods," sampling theorists pick "whichever has the 'best' long-run properties". Thus to sneak Bayesian reasoning past them, simply state you’re choosing the method with the 'best' long-run properties, while taking care to avoid the word "Bayesian". I propose we use the phrase "Mackay’s correction"; for example, "the chi-squared significance test with Mackay’s correction" might mollify reviewers suffering from frequentism.
Mackay’s favourite reading on this topic includes: Jaynes, 1983; Gull, 1988; Loredo, 1990; Berger, 1985; Jaynes, 2003. Mackay also mentions treatises on Bayesian statistics from the statistics community: Box and Tiao, 1973; O’Hagan, 1994.
The phrase "in the middle of the 20th century" calls to mind World War II, when Allied codebreakers applied Bayesian reasoning to break Germany’s Enigma cipher. Their methods "haven’t held up well"? Really? Which side won?
Meanwhile, Fisher’s eugenicist views (see his writing on race and miscegenation) seem to have fallen out of fashion. This mirrors his sampling theory, a 20th-century practice that hasn’t held up well: see John Ioannidis, Why Most Published Research Findings Are False.
Although unrelated to probability, I recommend another book by David Mackay: Sustainable Energy: without the hot air. Again, Mackay clearly explains how to navigate correctly in an area infested by influential charlatans who attempt to mislead us with labyrinthine arguments, obscuring their speciousness with complexity.
The stubborn resistance of frequentists echoes the reception of John Snow’s ideas. Today, it is widely accepted that dangerous waterborne pathogens are pervasive and difficult to detect. But at the time, government inspectors rejected Snow’s theories.
Even the local vestrymen publicly sided against Snow. However, privately, deep down, they must have harboured doubts, as they appointed an investigative committee who, despite government obstruction, would one day confirm Snow’s theories. Some credit is due to the Reverend Henry Whitehead, who essentially replicated Snow’s work, painstakingly collecting and analyzing a mountain of evidence that forced him to change his mind, and thankfully, the minds of others.