December 23, 2012
N. Silver is no amateur forecaster: he designed a system for forecasting the performance of baseball players and set up a web site predicting election results (he also happens to have played poker at a semi-professional level).
The book is full of insights on the pitfalls that forecasters can fall into. But it also contains a bounty of solutions (notably derived from Bayesian statistics). Effortlessly, N. Silver guides us toward subtle and clever ways to improve our prediction abilities (and recognize our limitations!). Let me just give a very small sample of how the book helps us grasp what should be understood:
* Understanding the difference between a prediction and a forecast, as illustrated by earthquakes.
“A prediction is a definitive and specific statement about when and where an earthquake will strike […] Whereas a forecast is a probabilistic statement, usually over a longer time scale.” (p. 149)
* Understanding what “overfitting” is, i.e. designing a model that explains, data-wise, more than is actually possible or actually exists (a good image of the trait of human nature leading us to make such mistakes is that of recognizing animals in clouds), and the unsound confidence that it triggers (p. 167)
* Understanding that you ignore unknown unknowns (as the phrase was coined by D. Rumsfeld) at your own risk.
“There is a tendency in our planning to confuse the unfamiliar with the improbable […] what looks strange is thought improbable” (p. 419)
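The overfitting point in the list above can be made concrete with a small sketch. It is not taken from the book: the data, the alternating "noise" pattern and the helper names (`fit_line`, `fit_memorizer`, `mse`) are all assumptions chosen for illustration. A model that memorizes every training point explains the noise perfectly, which is exactly why it does worse on new data than a plain straight line.

```python
# Illustrative sketch only: hypothetical data, not an example from the book.

def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns a prediction function."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def fit_memorizer(xs, ys):
    """Overfit model: returns the y of the nearest training point, noise included."""
    points = list(zip(xs, ys))
    return lambda x: min(points, key=lambda p: abs(p[0] - x))[1]

def mse(model, xs, ys):
    """Mean squared error of a model on a data set."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Assumed "true" signal: y = 2x. The noise alternates +1 / -1 so the run is deterministic.
train_x = [float(i) for i in range(10)]
noise = [1.0 if i % 2 == 0 else -1.0 for i in range(10)]
train_y = [2.0 * x + e for x, e in zip(train_x, noise)]

# New, noise-free observations located between the training points.
test_x = [i + 0.25 for i in range(9)]
test_y = [2.0 * x for x in test_x]

line_model = fit_line(train_x, train_y)
memorizer = fit_memorizer(train_x, train_y)

for name, model in [("line", line_model), ("memorizer", memorizer)]:
    print(name,
          "train MSE:", round(mse(model, train_x, train_y), 3),
          "test MSE:", round(mse(model, test_x, test_y), 3))
```

The memorizer's perfect score on the training data is the "unsound confidence" mentioned above: it has fitted the noise, not the signal.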
N. Silver uses a very wide array of topics and references to make his points. He is, most of the time, well versed in these topics, yet he falls prey to his unrealistic ambition of being a true polymath; two factual mistakes I noticed are:
* “not only were Estonians sick of Russians, but Russians were nearly as sick of Estonians, since the satellite republics contributed less to the Soviet economy than they received in subsidy from Moscow.” p. 52
To claim that Soviet-era Estonia received subsidies from Russia (rather than being plundered) is wrong; subsidies may have existed for some republics (such as the “-stan” republics) or countries (such as Cuba), but not for Estonia, the richest and most advanced of the Soviet republics…
* The description of the first 3 moves of the 1st game of the Kasparov – Deep Blue match is mistaken, with one move missing (and figure 9-2, which shows the position, is correspondingly erroneous; the white g-pawn is misplaced) p. 270
Anyhow, these mistakes are minor and do not alter my overall very positive assessment of the book!
April 25, 2015
Drawing on his experience in sports betting, Nate Silver explains the various fields in which Big Data can be used to predict the future: financial markets, meteorology, crime, etc.
The whole book is illustrated with examples, but it does not go into a mathematical or statistical explanation of the data.
Amid the whole flow of information, the hardest part remains knowing whether an event is the effect of another or, on the contrary, the cause of a future event.
Now it's your turn to read it…
July 21, 2014
A book about the difficulty of isolating relevant signals from noise. Drawing on his experience in fields as varied as politics, sports betting, poker, chess and the stock market, the author shows how hard it is to tell whether a signal is real. For my part, I very much liked the tone, which is quite different from that of the Big Data books that promise us almost total illumination from models. Nate Silver shows that data must be combined with human analysis, that some problems cannot be modeled and, quite logically, that the accuracy of predictions has to be measured. On this last point he shows that human nature leads to biases: to get heard when you are unknown, you have to make atypical predictions, and once you have achieved fame, it becomes more pressing to stay with the trend. The author states an interesting theorem in economics: if a forecaster has shown good instincts once, you should know that he will have great difficulty doing so a second time. The author makes the case for Bayesian techniques and calls for taking the educational framework out of the hands of Mr. Fisher, and on this point I can only agree with him. In the world of Big Data, one finds correlations between data, which does not mean causation: just because ice cream sales are correlated with forest fires does not mean that Miko ice cream should be banned.
October 4, 2012
*A full executive summary of this book will be available at newbooksinbrief . wordpress . com, on or before Monday, October 15.
Making decisions based on an assessment of future outcomes is a natural and inescapable part of the human condition. Indeed, as Nate Silver points out, "prediction is indispensable to our lives. Every time we choose a route to work, decide whether to go on a second date, or set money aside for a rainy day, we are making a forecast about how the future will proceed--and how our plans will affect the odds for a favorable outcome" (loc. 285). Over and above these private decisions, prognosticating does, of course, bleed over into the public realm: whole industries, from weather forecasting to sports betting to financial investing, are built on the premise that predictions of future outcomes are not only possible, but can be made reliable. As Silver points out, though, there is a wide discrepancy across industries and also between individuals regarding just how accurate these predictions are. In his new book 'The Signal and the Noise: Why So Many Predictions Fail--but Some Don't' Silver attempts to get to the bottom of all of this prediction-making to uncover what separates the accurate from the misguided.
In doing so, the author first takes us on a journey through financial crashes, political elections, baseball games, weather reports, earthquakes, disease epidemics, sports bets, chess matches, poker tables, and the good ol' American economy, as we explore what goes into a well-made prediction and its opposite. The key teaching of this journey is that wise predictions come out of self-awareness, humility, and attention to detail: lack of self-awareness causes us to make predictions that tell us what we'd like to hear, rather than what is true (or most likely the case); lack of humility causes us to feel more certain than is warranted, leading us to rash decisions; and lack of attention to detail (in conjunction with self-serving bias and rashness) leads us to miss the key variables that make all the difference. Attention to detail is what we need to capture the signal in the noise (the key variable[s] in the sea of data and information that are integral in determining future outcomes), but without self-awareness and humility, we don't even stand a chance.
While self-awareness requires us to make an honest assessment of our particular biases, humility requires us to take a probabilistic approach to our predictions. Specifically, Silver advises a Bayesian approach. Bayes’ theorem has it that when it comes to making a prediction, the most prudent way to proceed is to first come up with an initial probability of a particular event occurring (rather than a black and white prediction of the form ‘I believe x will occur’). Next, we must continually adjust this initial probability as new information filters in.
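The two steps described above (start from an initial probability, then revise it as information arrives) can be sketched in a few lines. This is only an illustration: the starting probability, the likelihood values and the function name `bayes_update` are hypothetical and are not taken from the book.

```python
# Illustrative sketch only: all numbers below are made up for the example.

def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Bayes' theorem: probability of the hypothesis after seeing one piece of evidence."""
    numerator = prior * p_evidence_if_true
    denominator = numerator + (1.0 - prior) * p_evidence_if_false
    return numerator / denominator

# Step 1: an initial probability rather than a black-and-white prediction.
belief = 0.30

# Step 2: each new piece of information is described by how likely it would be
# if the event is going to occur, and how likely if it is not (assumed values).
observations = [
    (0.8, 0.4),  # evidence twice as likely if the event will occur
    (0.7, 0.5),  # mildly supportive evidence
    (0.2, 0.6),  # evidence that points the other way
]

history = [belief]
for p_if_true, p_if_false in observations:
    belief = bayes_update(belief, p_if_true, p_if_false)
    history.append(belief)

print([round(p, 3) for p in history])
```

Each observation moves the estimate up or down in proportion to how much more likely it is under one outcome than the other; the estimate never jumps to certainty.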
The level of certainty that we can place on our initial estimate of the probability of a particular event (and the degree to which we can accurately refine it moving forward) is limited by the complexity of the field in which we are making our prediction, and also by the amount and quality of the information that we have access to. For instance, in a field like baseball, where wins and losses mostly come down to two variables (the skill of the pitchers, and the skill of the hitters), and where there is an enormous wealth of precise data, prediction is relatively straightforward (but still not easy). On the other hand, in a dynamic field such as the American economy, where the outcomes are influenced by an enormous number of variables, and where the interactions between these variables can become incredibly complex (due to things like positive and negative feedback), probabilities become a whole lot more difficult to pin down precisely (though estimating them often remains possible on a general and/or long-term scale).
It is also important to recognize that while additional information can help us no matter what field we are trying to make our prediction in, we must be careful not to think that information can stand on its own. Indeed, additional information (when it is not met with insightful analysis) often does nothing more than draw our attention away from the key variables that truly make a difference. In other words, it creates more noise, which can make it more difficult to identify the signal. It is for this reason that predictive models that rely on statistics and statistics alone are often not very effective (though they do often help a seasoned expert who is able to apply insightful analysis to them).
In the final stage of the book Silver explores how the lessons that he lays out can be applied to such issues as global warming, terrorism and bubbles in financial markets. Unfortunately, each of these fields is a lot noisier than many of us would like to think (thus making them very difficult to predict precisely). Nevertheless, the author argues, within each there are certain signals that can help us make better predictions regarding them, and which should help make the world a safer and more livable place.
If you are hoping that this book will make you a fool-proof prognosticator, you are going to be disappointed. A key tenet of the book is that this is simply not possible (no matter what field you are in). That being said, Silver makes a very strong argument that by applying a few simple principles (and putting in a lot of hard work in identifying key variables) our predictive powers should take a great boost indeed. A podcast discussion of the book will be available shortly after the executive summary.