The Drunkard's Walk: How Randomness Rules Our Lives
by Leonard Mlodinow
Why a Positive Result on a Medical Test Doesn't Necessarily Mean You're Sick
A review by Doug Brown
Most people have the wrong idea about randomness. In flipping a coin, for instance, people assume random means that any string of flips will contain roughly equal numbers of heads and tails. However, truly random data contains long runs of all heads and all tails. Furthermore, most people think that if you've gotten four heads in a row you are likelier to get tails on the next flip, that a tails is "due" (this cognitive error is so common it has a name: the Gambler's Fallacy). However, with a fair coin, the probability of getting tails is always 50%. Even if you've gotten 10 heads in a row or 10 tails in a row, the chance of getting tails on the next flip is still 50/50. This and many other dimensions of randomness are the topic of The Drunkard's Walk.
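The claim about streaks is easy to check by simulation. The following is a minimal sketch, not something from the book: it assumes a fair coin modeled with Python's pseudo-random generator, and the function name, seed, and flip count are arbitrary choices made for illustration.

```python
import random

def tails_rate_after_heads_run(n_flips, run_length, seed=0):
    """Fraction of flips that come up tails immediately after a run of
    at least `run_length` consecutive heads (simulated fair coin)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    run = 0          # length of the current run of heads
    followups = 0    # flips that came right after a long-enough run
    tails_after = 0  # how many of those flips were tails
    for _ in range(n_flips):
        is_heads = rng.random() < 0.5
        if run >= run_length:
            followups += 1
            if not is_heads:
                tails_after += 1
        run = run + 1 if is_heads else 0
    return tails_after / followups if followups else float("nan")

rate = tails_rate_after_heads_run(200_000, run_length=4)
print(f"Tails after four straight heads: {rate:.3f}")
```

If the Gambler's Fallacy were right, the printed rate would sit well above 0.5. With a fair coin it should hover near 0.5, no matter how long a run you condition on.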
Throughout the book, Mlodinow traces the history of probability and statistics as mathematical studies. Predictably, the impetus was often gambling; some lord or gentleman would ask his smart friend if particular outcomes were likelier, and the smart friend would come up with a concept that later became part of statistical analysis. Early contributors included names such as Cicero, Galileo, Pascal, and Fermat. Interspersed with these historical vignettes are modern-day examples where these methods and ideas were used (or where the lesson clearly hasn't been learned).
In the 1700s a reverend named Bayes showed that the probability of A given B is not the same as the probability of B given A; relating the two requires knowing how likely A and B are overall. For instance, the probability of having a disease (A) given a positive test result (B) is not determined by the accuracy of the test alone, but by the frequency of the illness weighed against the accuracy of the test. Mlodinow gives an example where he was told the odds were 999 out of 1,000 that he would be dead in 10 years. The reason was that an HIV test had yielded a positive result, and the test returns a false positive only 1 out of 1,000 times. However, this is not the correct analysis; the doctor had confused the chance of a positive result given that Mlodinow was healthy with the chance that he was healthy given a positive result. Mlodinow looked instead at the frequency of HIV among his social group -- white, heterosexual, non-drug-using American males -- which, according to CDC statistics, is 1 in 10,000. If 10,000 men from this group are given the test, on average there will be about 11 positive results: one from the person who actually has HIV, and about 10 that are false positives. Thus, the true odds that Mlodinow had HIV were about 1 in 11, not 999 in 1,000. And sure enough, he's still here.
Another example presented is from the O.J. Simpson trial, where defense attorney Alan Dershowitz exploited people's misunderstanding of Bayes' Theorem to argue against a point the prosecution made. The prosecutors noted that Simpson had abused his wife, thus making him likelier to have eventually killed her. Dershowitz argued that since 4 million women are abused annually but only 1,400 are killed by their abusers, the odds of O.J. having done it were only 1 in 2,500. However, the relevant question is: if a battered woman is murdered, what are the odds that her abuser killed her? The answer is 90%. As Mlodinow points out, this statistic was unfortunately not mentioned in the trial.
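The HIV-test arithmetic above can be checked with Bayes' theorem directly. The sketch below is not from the book. It assumes, as the head count in the review does, that the test catches every true case (a sensitivity of 1), and the function and parameter names are illustrative only.

```python
from fractions import Fraction

def probability_sick_given_positive(prevalence, false_positive_rate,
                                    sensitivity=Fraction(1)):
    """Bayes' theorem: P(sick | positive test).

    prevalence          -- P(sick) in the group being tested
    false_positive_rate -- P(positive | not sick)
    sensitivity         -- P(positive | sick); assumed to be 1 here
    """
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)

# Figures quoted in the review: 1 in 10,000 prevalence,
# 1 in 1,000 false-positive rate.
posterior = probability_sick_given_positive(
    prevalence=Fraction(1, 10_000),
    false_positive_rate=Fraction(1, 1_000),
)
print(posterior, f"= {float(posterior):.1%}")
```

Under these assumptions the result is 1000/10999, roughly 9%, which matches the "1 in 11" figure. Raising the prevalence (say, for a higher-risk group) pushes the answer up sharply, which is the whole point of the example.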
Many email spam filters use Bayesian analysis to consider the probability that an email is spam if it contains certain words like "Viagra" (hence the recent proliferation of spam consisting of Haiku-esque random words, hoping to beat the filters).
One of the most persistent myths regarding randomness comes from the world of sports: the Hot Hand. If a basketball player has made her last several shots, people think she's on a hot streak, and likelier to be successful on her next attempt. However, every analysis of streaks in sports has shown them to be nothing more than the expected strings of hits and misses to be found in any random sample (given the athlete's average success rate). If a batter has a .400 average, over time he'll get a hit in 4 out of every 10 at-bats. Sometimes he'll get many hits in a row, and the announcers will say things like, "Mlodinow's hot this season!" Sometimes he'll miss a bunch in a row, and the announcers will justify it with utterances such as, "Mlodinow's just not on his game today." But overall, the average is .400. There is no such thing as a hot streak or a cold streak, any more than getting 10 heads in a row when flipping a coin means you've got a "hot thumb." Good luck arguing this case against sports fans, though.
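To see how streaky pure chance can look, here is a small simulation sketch. It is an illustration, not an analysis from the book: it assumes every at-bat is an independent event with a fixed .400 chance of a hit, and the season length of 500 at-bats, the seed, and the function names are invented for the example.

```python
import random

def longest_streak(outcomes):
    """Length of the longest run of consecutive True values."""
    best = current = 0
    for hit in outcomes:
        current = current + 1 if hit else 0
        best = max(best, current)
    return best

def simulate_at_bats(n, average=0.400, seed=1):
    """n independent at-bats, each a hit with probability `average`."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [rng.random() < average for _ in range(n)]

season = simulate_at_bats(500)
misses = [not hit for hit in season]
print("Season average:       ", sum(season) / len(season))
print("Longest hitting streak:", longest_streak(season))
print("Longest slump:         ", longest_streak(misses))
```

Changing the seed gives a different "season," and most of them contain runs of hits and misses long enough for an announcer to build a story around, even though nothing in the simulation ever gets hot or cold.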
One misstep Mlodinow makes is in discussing the problem people have had coming up with random number generators. A syndicate that needed numbers for an illegal lottery in the 1920s hit on the idea of using the last five digits of the National Treasury balance. Mlodinow then goes into a discussion of how this runs afoul of a statistical phenomenon called Benford's Law. This came from an observation that in many real-world data sets, numbers are likeliest to start with 1 (about 30%), then 2 (about 18%), and the higher digits are less and less likely to come first. It is an interesting phenomenon, and one that has been used to spot when people have falsified data sets; manufactured data usually fails to conform to Benford's Law. This is all very fascinating, but it has nothing to do with the syndicate's number selection. They were using the last five digits of the Treasury balance, not the first five. Folks would not have been able to use Benford's Law to select likelier numbers in the lottery. Mlodinow got this example from Henk Tijms' textbook Understanding Probability, in a section called "Pitfalls Encountered in Randomizing." The second edition of Tijms uses a different lottery example in this section (the 1970 draft lottery), so it seems Tijms spotted the error, too.
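For readers who want the numbers behind Benford's Law, the usual statement is that a leading digit d turns up with probability log10(1 + 1/d). That formula is the standard one and is not quoted in the review; the function name in this short sketch is made up for illustration.

```python
import math

def benford_first_digit(d):
    """Probability that a number's leading digit is d under Benford's Law."""
    if not 1 <= d <= 9:
        raise ValueError("leading digit must be between 1 and 9")
    return math.log10(1 + 1 / d)

for d in range(1, 10):
    print(d, f"{benford_first_digit(d):.1%}")
```

The first two lines of output reproduce the roughly 30% and 18% figures for 1 and 2. Note that the law describes leading digits only; it makes no such prediction about the last digits, which is why it does not bear on the syndicate's scheme.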
Markets, casinos, the bell curve, how our expectations color our data analysis, and more are covered in The Drunkard's Walk. The mistaken notion that random markets are predictable gets a particularly nice summary; those wanting a more in-depth analysis are encouraged to read Taleb's Fooled by Randomness. People's incomprehension of randomness also led them to complain about the iPod's shuffle feature; if they got two songs in a row from the same album or artist, they assumed it wasn't actually random. So Apple reprogrammed the shuffle feature; as Steve Jobs is quoted as saying, they made it "less random to make it feel more random."
Mlodinow finishes off with a point regarding persistence in the face of failure. Failure happens, and doesn't necessarily mean you're on the wrong path. The reader is presented with examples of famous books that were repeatedly rejected, and of an amusing case where London's Sunday Times sent the first chapters of two Booker Prize-winning books -- under pseudonyms -- to major publishers and agents. All but one agent rejected the works. Success, while not independent of worthiness, is largely a random event. As with rolling two dice trying to get a 12, the more times you try, the greater your chance of eventually succeeding. Don't be discouraged by failure; it's statistically part of the process (and usually the likeliest outcome). As IBM pioneer Thomas Watson amusingly phrased it, "If you want to succeed, double your failure rate." This and many other lessons make The Drunkard's Walk rewarding reading. Mlodinow's background is in physics, but he writes for a general audience, assuming his readers are smart folks who just don't happen to remember a lot from the math courses they may have had. Our brains are strongly predisposed to be fooled by randomness, and The Drunkard's Walk offers many tips for reducing the rate of self-deception.
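The dice analogy above can be made concrete. A pair of fair dice shows 12 on only 1 roll in 36, so the chance of at least one 12 in n rolls is 1 - (35/36)^n. The sketch below works that out exactly; it assumes fair, independent rolls, and the names are illustrative rather than anything from the book.

```python
from fractions import Fraction

P_TWELVE = Fraction(1, 36)  # only 6+6 gives a 12 with two fair dice

def chance_of_at_least_one(tries, p=P_TWELVE):
    """Probability of at least one success in `tries` independent attempts."""
    return 1 - (1 - p) ** tries

for tries in (1, 10, 24, 25, 100):
    print(tries, f"{float(chance_of_at_least_one(tries)):.1%}")
```

A single roll almost always fails, yet by the 25th attempt the odds of having hit a 12 at least once pass even. Each failure along the way was the likeliest outcome of that roll, which is the sense in which persistence pays.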