This post is a look at the book by Philip E. Tetlock and Dan Gardner, Superforecasting: The Art and Science of Prediction. Tetlock is also the author of Expert Political Judgment: How Good Is It? How Can We Know? In Superforecasting, Tetlock blends discussion of the largely popular literature on decision making with his long-running scientific work on the ability of experts and others to predict future events.

In Expert Political Judgment: How Good Is It? How Can We Know? Tetlock found that the average expert did little better than guessing.  He also found that some did better. In Superforecasting he discusses the study of those who did better and how they did it.

He and his colleagues created the Good Judgment Project (GJP) to participate in a forecasting tournament run by the U.S. Intelligence Advanced Research Projects Activity (IARPA). Several groups participated, including part of the U.S. intelligence community. From 2011 through June 2015, participants were asked to put specific odds on the likelihood of certain events, such as whether Assad’s regime in Syria would fall within the next six months. The forecasts were short term, between one month and one year. There were many trials, and there is little ambiguity in measuring the accuracy of the forecasts. It is a rare scientific study of prediction accuracy. The GJP showed superior forecasting skill relative to the other academically sponsored groups and even to the professionals within the intelligence community, who had access to all the resources of the government agencies supporting them.

Rather than provide a summary, I will provide what I found to be the most salient ideas and some of my own commentary as I moved through the book from beginning to end.

Measurement is key to doing just about anything well. Tetlock and Gardner give Bill Gates credit for this recognition.

When you have a well validated statistical algorithm, use it.

Tetlock is a friend of Daniel Kahneman and uses Kahneman to point out some of our human foibles. I will point out a few that he mentions:

  • Intuitive judgment is insensitive to the quality of the evidence.
  • “We are creative confabulators hardwired to invent stories that impose coherence on the world.”
  • Kahneman noted that “declarations of high confidence mainly tell you that an individual has constructed a coherent story in his mind, not necessarily that the story is true.”

Attribute substitution, or as Tetlock calls it, bait and switch, is a common issue with our decision making. His example is avoiding the difficult question, “Should I worry about the shadow in the grass?” and substituting an easy question like, “Have I ever heard of a situation when a person should have worried about the shadow in the grass?” If the answer to the second question is yes, we often make it the answer to the original question. We don’t seem to like being reflective, so we just let System 1 (automatic) hum along. Tetlock calls this the tip-of-your-nose perspective.

Blink-think (see post Dancing with Chance) is a false dichotomy, like System 1 and System 2 (see post Prospect Theory). Tetlock notes this, but he does not give credit to the creator of the cognitive continuum, Kenneth Hammond.

“Whether intuition generates delusion or insight depends on whether or not you work in a world full of valid cues you can unconsciously register for future use.” Tetlock brings up Kahneman and Klein’s joint paper on this (see post Kahneman and Klein). Again, Tetlock does not mention Hogarth’s wicked or kind environments (see post Learning, Feedback and Intuition). Feedback is a big issue. Police officers may think they can detect lies, but they have to wait so long for feedback, and so many other factors intervene, that they tend to be overconfident about their ability. This is Hogarth’s wicked environment.

To measure the accuracy of predictions, Tetlock points to Brier scores. First, predictions need to be calibrated. If, when a weatherman predicts a 40% chance of rain, it rains only 20% of the time, he is miscalibrated and overconfident. If, when he predicts a 20% chance of rain, it rains 40% of the time, he is miscalibrated and underconfident. If, when he predicts a 40% chance of rain, it rains 40% of the time, he is perfectly calibrated. But calibration alone is not what people consider to be great forecasting. Resolution adds certainty: a forecaster with good resolution commits to decisive probabilities, perhaps 80-90% for events that then usually happen, rather than hovering near 50%. How much resolution is achievable depends upon what you are predicting. Saying it will be sunny in Los Angeles is easy most of the year. Brier scores include both calibration and resolution. Perfect is 0. The worst is 2.0, and pure chance is 0.5.
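The three reference points (0, 0.5, and 2.0) can be checked with a few lines of Python. This is a minimal sketch, assuming the original two-category form of the Brier score for yes/no events, which is the form that yields a 0 to 2 range. The function name and the sample forecasts are illustrative, not taken from the book.

```python
def brier_score(forecasts, outcomes):
    """Two-category Brier score for yes/no events (0 is perfect, 2 is worst).

    forecasts: probabilities (0.0 to 1.0) that each event happens.
    outcomes:  1 if the event happened, 0 if it did not.
    """
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("need equally long, non-empty lists")
    total = 0.0
    for p, happened in zip(forecasts, outcomes):
        # Squared error on the "yes" category plus squared error on "no".
        total += (p - happened) ** 2 + ((1 - p) - (1 - happened)) ** 2
    return total / len(forecasts)


perfect = brier_score([1.0, 0.0], [1, 0])   # always certain, always right
hedged = brier_score([0.5, 0.5], [1, 0])    # always says 50/50
worst = brier_score([0.0, 1.0], [1, 0])     # always certain, always wrong
print(perfect, hedged, worst)
```

The perfect forecaster scores 0, the permanent 50/50 hedger scores 0.5, and the forecaster who is always certain and always wrong scores 2.0.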

Hedgehog forecasters provide a tip-of-the-nose perspective. They know one big thing, and everything they say is colored by that one big thing. Thus they tend to be poor forecasters, but they tell tight, simple stories that capture audiences, and they are always confident. Foxes are better forecasters but not so good on TV, with their “if” and “maybe” and “on the other hand”.

Tetlock points out the ideas expressed in Surowiecki’s Wisdom of Crowds and Scott Page’s The Difference: How the Power of Diversity Creates Better Groups, Firms, Schools, and Societies (post Smart Mobs and Diverse Problem Solvers). The main idea is that aggregation can be powerful, but only if the people involved know lots about lots of different things. Page sees diversity as trumping ability. Foxes approach forecasting by seeking information from many sources. Then they aggregate. Tetlock indicates that aggregating within one skull can be difficult.
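One way to see why aggregation helps is the arithmetic identity usually called the diversity prediction theorem: the squared error of the averaged guess equals the average individual squared error minus the spread of the guesses around their average. The sketch below only illustrates that identity. The guesses and the true value are made-up numbers, and the function name is my own.

```python
def crowd_error_breakdown(guesses, truth):
    """Return (crowd error, average individual error, diversity).

    All three are squared-error quantities, so that
    crowd error == average individual error - diversity.
    """
    n = len(guesses)
    crowd_guess = sum(guesses) / n
    crowd_error = (crowd_guess - truth) ** 2
    avg_individual_error = sum((g - truth) ** 2 for g in guesses) / n
    diversity = sum((g - crowd_guess) ** 2 for g in guesses) / n
    return crowd_error, avg_individual_error, diversity


guesses = [40, 55, 70, 85]   # hypothetical individual estimates
truth = 60                   # hypothetical true value
crowd_error, avg_error, diversity = crowd_error_breakdown(guesses, truth)
print(crowd_error, avg_error, diversity)
```

With these numbers the average individual error is 287.5, the diversity is 281.25, and the crowd error is only 6.25. The more the guesses differ from one another, the more of the individual error cancels out in the average.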

Tetlock mentions Michael Mauboussin’s book The Success Equation, which engages the concept of regression to the mean (post Regression to the Mean). In situations that involve more skill, forecasters tend to regress to the mean more slowly, while when there is more chance, the regression is quick. The guy who predicts coin tosses accurately 10 times in a row will likely be back to chance for the next 10.
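The coin-toss case is easy to simulate. The following is a rough sketch, assuming fair coins and pure guessing (no skill at all). The number of players, the seed, and the function name are arbitrary choices of mine.

```python
import random


def followup_of_perfect_guessers(num_players=50_000, tosses=10, seed=0):
    """Find players who call every toss right in round one, then report
    how many such players there were and their average score in round two."""
    rng = random.Random(seed)  # local generator so the run is repeatable
    second_round_scores = []
    for _ in range(num_players):
        # Each call is right with probability 0.5: pure luck, no skill.
        first = sum(rng.random() < 0.5 for _ in range(tosses))
        second = sum(rng.random() < 0.5 for _ in range(tosses))
        if first == tosses:
            second_round_scores.append(second)
    if not second_round_scores:
        return 0, None
    average = sum(second_round_scores) / len(second_round_scores)
    return len(second_round_scores), average


perfect_count, next_average = followup_of_perfect_guessers()
print(perfect_count, next_average)
```

Out of tens of thousands of guessers, a few dozen will look like geniuses after the first ten tosses, yet as a group they average close to 5 out of 10 on the next ten. When skill contributes nothing, regression to the mean is immediate and complete.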

Tetlock coins the verb “Fermi-ize.” It is based on the ideas of Enrico Fermi, a physicist at the University of Chicago who had much to do with the ideas behind the atomic bomb. If you are asked how many piano tuners there are in Chicago, you will be better off breaking the question down into several questions, even though you are just guessing on each one. How many pianos are there in Chicago? How often do they need to be tuned? How long does it take to tune a piano?
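The decomposition can be written out as simple arithmetic. In this sketch every input is a placeholder guess of mine, not a figure from the book, and I have added one sub-question the list above leaves implicit: how many hours a tuner works in a year.

```python
def estimate_piano_tuners(pianos, tunings_per_piano_per_year,
                          hours_per_tuning, hours_worked_per_tuner_per_year):
    """Fermi-style estimate: total tuning hours demanded each year,
    divided by the hours one tuner can supply each year."""
    total_tuning_hours = pianos * tunings_per_piano_per_year * hours_per_tuning
    return total_tuning_hours / hours_worked_per_tuner_per_year


# All four numbers below are rough, hypothetical guesses.
tuners = estimate_piano_tuners(
    pianos=50_000,                          # pianos in the city
    tunings_per_piano_per_year=1,           # how often each is tuned
    hours_per_tuning=2,                     # including travel time
    hours_worked_per_tuner_per_year=1_600,  # one tuner's working year
)
print(tuners)  # 62.5 with these guesses
```

The value of the exercise is less the final number than the fact that each guess is exposed and can be challenged or improved separately.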

This leads directly to the idea of the outside view vs. the inside view. If you are asked to predict whether a particular family owns a pet, the outside view is the percentage of American households that own pets. This is based on Daniel Kahneman’s idea of anchoring. We anchor and adjust our predictions and tend to under-adjust, so we need to start with the outside view.

There are ways to create a more outside view. Writing down a prediction distances it and provides a kind of automatic feedback. Similarly, being asked for a new estimate or waiting a few weeks and providing an estimate can help. Tetlock does not mention it, but this is often called bootstrapping (see post Bootstrapping and post Dialectical Bootstrapping).

Openness to experience, or what Jonathan Baron calls active open-mindedness, is a predictor of who will be a superforecaster. Taking the outside view tends to go with probabilistic thinking instead of “meant to be” fatalistic thinking. Tetlock quotes Kurt Vonnegut: “Why me? Why not me?” Tetlock notes that awareness of irreducible uncertainty is the core of probabilistic thinking. Again, Tetlock does not mention Kenneth Hammond, but you might check the post Irreducible Uncertainty, Inevitable Error, Unavoidable Injustice. Numeracy also seems to be a requirement for probabilistic thinking (see post The Statistics of Health Decision Making-Statistical Illiteracy).

Granularity also predicts accuracy. This is not obvious to me, because I have learned not to give a false impression of accuracy when communicating information. But granularity is not about communicating; it is about making the best prediction that you can make at every level. This becomes important when you start aggregating different questions. In other words, you will tend to make better predictions if you do not round off just because you are not certain. If you believe that 73% is a better prediction than 70%, stick with it.
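The cost of rounding can be worked out directly. This is a small sketch, assuming that 73% really is the true probability of the event and that forecasts are scored with the two-category Brier score (0 is perfect, 2 is worst). The function name is my own.

```python
def expected_brier(true_prob, forecast):
    """Long-run average two-category Brier score for a yes/no event that
    happens with probability true_prob when you always say forecast."""
    score_if_yes = 2 * (1 - forecast) ** 2   # score when the event happens
    score_if_no = 2 * forecast ** 2          # score when it does not
    return true_prob * score_if_yes + (1 - true_prob) * score_if_no


exact = expected_brier(0.73, 0.73)     # stick with 73%
rounded = expected_brier(0.73, 0.70)   # round off to 70%
print(exact, rounded, rounded - exact)
```

The penalty for rounding is small on any one question (the gap equals twice the square of the 0.03 rounding error), but it is always a penalty, and it accumulates across many questions.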

Tetlock wonders about some forecasters’ apparent lack of humility, which makes them seem more like hedgehogs. He concludes that prediction requires a peculiar type of humility: humility in the face of the game, not necessarily in the face of your opponents.

Most of Tetlock’s superforecasters have had no stake in their predictions except for their desire to do a good job. When the stakes are high, e.g., your career depends upon it, you often have “belief perseverance,” where you resist disturbances. The stronger the commitment, the greater the resistance. Tetlock gives credit to Dan Kahan (see post Cultural Cognition & Motivated Reasoning) for his idea that you are reluctant to admit errors when that particular belief block is holding up a tower. Supporters of creating markets to make predictions argue that a financial stake makes you a better predictor. Tetlock did not get that result.

Another issue for predictions is the distribution. Fat tails, which a normal distribution does not have, can change the consequences dramatically. Assuming a normal distribution can create bad predictions.
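To give a sense of scale, the sketch below compares the chance of an outcome more than five standard deviations from the mean under a normal distribution and under one fat-tailed alternative. The choice of a Student t distribution with 3 degrees of freedom is my own illustrative assumption, not something from the book. It is used here because its cumulative distribution has a simple closed form.

```python
import math


def normal_two_sided_tail(k):
    """P(outcome is more than k standard deviations from the mean), normal."""
    return math.erfc(k / math.sqrt(2))


def student_t3_two_sided_tail(k):
    """Same probability for a Student t distribution with 3 degrees of freedom.

    That distribution has standard deviation sqrt(3), so k standard
    deviations is the raw value k * sqrt(3). Its CDF at raw value t is
    0.5 + (u / (1 + u^2) + atan(u)) / pi, where u = t / sqrt(3) = k.
    """
    u = k
    cdf = 0.5 + (u / (1 + u * u) + math.atan(u)) / math.pi
    return 2 * (1 - cdf)


normal_tail = normal_two_sided_tail(5)
fat_tail = student_t3_two_sided_tail(5)
print(normal_tail, fat_tail, fat_tail / normal_tail)
```

Under the normal assumption a five-sigma outcome has a probability of less than one in a million. Under this fat-tailed alternative it is roughly 0.3%, thousands of times more likely, which is the kind of gap that turns a "can't happen" into a "plan for it".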

Tetlock definitely does not believe that forecasting is a fool’s errand.   Improving even short term forecasts or predictions could dramatically improve the human condition.

The key to better predictions, in my interpretation, is to not let intuition comfortably run its parallel constraint satisfaction model (see post Parallel Constraint Satisfaction Theory) without constantly asking the analytical system to keep finding new information to put into the model. Superforecasters may have managed to teach the intuitive system to automatically seek input from the analytic system and never let the model be closed to new input.

Finally, it occurs to me that predicting and forecasting are different from decision making. So what is required to make good predictions is not quite the same as what is required to make good decisions.









