The Signal and The Noise: Why So Many Predictions Fail but Some Don’t is an intriguing and entertaining look at real-world situations in which massive amounts of data create both opportunities and risks for the way we analyze them. The book is a wonderful read for clinicians, researchers, administrators, and “laypersons” alike.
For a researcher, data are good, but even more data are better. Or are they? In recent years, anesthesiology researchers have reached the “summit” of electronic medical record and administrative data; we now have enough data on infrequent events that we can start developing prediction models. As a community, we have been thrilled to start moving research to the next level, but Mr. Silver suggests that how you interpret the data, and the biases you bring to them, could ultimately hurt your predictions. He states early on that, with all of these “big data” sources available to us from a multitude of areas (not just healthcare), progress will be made, but “how quickly it does and whether we regress in the meantime, will depend on us.” For any clinician, researcher, or administrative leader, this is a sobering admonition to look within and identify his or her own biases.
The book is divided into two parts. The first half explains the problems of prediction and describes how predictions have failed in the past. Mr. Silver describes in detail how the signals in “big data” were not adequately separated from the noise and how predictions went grossly wrong as a result. He starts by drawing the reader into two recent events that every stockholder would rather forget: the housing bubble and the financial markets crisis. Mr. Silver convincingly points out that the preconceived biases of the housing, financial, regulatory, and credit-rating industries prevented them from appreciating signs of impending trouble. His detailed, understandable, yet scientific dissection of the housing bubble and financial crisis stirs strong emotions as one realizes that this dark time in our global economy was not as “unexpected” as many would like us to believe. Several other well-thought-out examples of “big data,” which probably will not evoke such a visceral reaction, are also provided to show how individual biases can alter the detection of the real signal in our ever-expanding world of “big data.” Mr. Silver also highlights the softer side of modeling: all predictions are made in a political, financial, academic, or marketing context and are invariably subject to the biases created by that macro-environment.
The second half of the book focuses on applying Bayes’ theorem in making predictions. When most people read the words “Bayesian statistics,” their minds drift away to a happier place and comprehension ceases. However, Mr. Silver explains Bayes’ theorem in simple terms that will grab your attention. For example, he scientifically and amusingly explores the application of Bayesian statistics to a decidedly non-“big data” problem: the probability that your partner is cheating on you if you return from a business trip and find an unknown pair of underwear. The reader estimates the probability of a cheating partner from three Bayesian inputs: the probability of finding the underwear if your hypothesis (that your partner is cheating) is true, the probability of finding it if your hypothesis is false, and the prior probability, that is, how strongly you would have suspected your partner of cheating before finding the unknown underwear. As any good teacher knows, picking the right example is the best way to teach a method, and this one makes the Bayesian approach clear. Once the theorem has been properly stored in your brain (with an example you will never forget), Mr. Silver shifts tone for the remainder of the book and presents real-world examples of accurate predictions using Bayesian statistics, drawn from chess, poker, climate change, and terror attacks around the world.
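For readers who want to see the arithmetic, the underwear example can be worked through in a few lines of Python. This is only a minimal sketch: the function name `posterior` is our own, and the three input probabilities are assumed, illustrative values rather than figures quoted in this review.

```python
def posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """Bayes' theorem for a yes/no hypothesis.

    prior               -- belief in the hypothesis before seeing the evidence
    p_evidence_if_true  -- chance of seeing the evidence if the hypothesis is true
    p_evidence_if_false -- chance of seeing the evidence if the hypothesis is false
    """
    numerator = p_evidence_if_true * prior
    denominator = numerator + p_evidence_if_false * (1.0 - prior)
    return numerator / denominator


# Assumed, illustrative inputs (not taken from the review):
prior_cheating = 0.04            # suspicion before finding the underwear
p_underwear_if_cheating = 0.50   # chance of the underwear appearing if cheating
p_underwear_if_faithful = 0.05   # chance of an innocent explanation

p_cheating = posterior(prior_cheating,
                       p_underwear_if_cheating,
                       p_underwear_if_faithful)
print(f"Posterior probability of cheating: {p_cheating:.2f}")
```

With these assumed inputs the posterior comes out at roughly 29%: far higher than the 4% prior, yet well short of certainty, because the low prior still weighs heavily on the result.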
At the heart of Bayes’ theorem is a prior probability that is constantly revised as new data arrive. A more sobering example concerns the probability of terror attacks before September 11, 2001. Because the prior probability of an attack was so low, the Bayesian method suggests that the estimated probability of a terror attack of that magnitude was low as well. However, with the knowledge gained after September 11, 2001, the prior probability has changed, and therefore the posterior probability of another terror attack has also increased. Working through the Bayesian reasoning, readers can see for themselves that their own internal biases (i.e., their prior probabilities of an event being true) could greatly affect even sincere efforts at prediction. Mr. Silver is clearly a Bayesian and only briefly touches on the frequentist method. However, even a frequentist would gain insight from his explanations of various predictions.
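The idea of an evolving prior can also be sketched in code: each posterior simply becomes the prior for the next piece of evidence. The example below is a hedged illustration only. The function name `update`, the starting belief, and the likelihoods are all hypothetical values chosen to show the mechanics; they describe a generic rare event and do not come from the book or from this review.

```python
def update(prior, p_evidence_if_true, p_evidence_if_false):
    """One Bayesian update for a yes/no hypothesis; returns the posterior."""
    numerator = p_evidence_if_true * prior
    return numerator / (numerator + p_evidence_if_false * (1.0 - prior))


# Hypothetical starting belief that a rare event is under way.
belief = 0.01

# Hypothetical evidence stream: each item is
# (P(evidence | hypothesis true), P(evidence | hypothesis false)).
evidence_stream = [(0.9, 0.1), (0.9, 0.1), (0.9, 0.1)]

history = [belief]
for p_if_true, p_if_false in evidence_stream:
    # Yesterday's posterior is today's prior.
    belief = update(belief, p_if_true, p_if_false)
    history.append(belief)

for step, value in enumerate(history):
    print(f"after {step} observations: {value:.3f}")
```

Even with a very low starting prior, a few consistent observations move the estimate substantially, which is why the choice of prior, and the willingness to revise it, matters so much.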
This book eloquently highlights the problems that too much “big data” can cause if we, as researchers, do not adequately control our own biases. A computer is great when it comes to statistics and electronic medical records, but a computer can only do so much when making predictions. Ultimately, it is up to humans to decide whether these predictions are plausible and whether the patterns that are seen are the noise or the true signal. Therefore, with the recent explosion of “big data” in anesthesiology, we are tasked with understanding and admitting our own biases and incorporating them into our use of big data, whether we take a Bayesian or a frequentist approach.
In summary, The Signal and The Noise: Why So Many Predictions Fail but Some Don’t is a fascinating account of how “big data” can lead well-meaning, hard-working, educated individuals down the wrong path. Subtle misperceptions have real consequences, and those consequences can be directly attributed to misunderstood biases and the analyses based on those biases. We recommend that every clinician, researcher, and administrator take the time to read this book and consider his or her own biases.