Regression to the Mean

I have mentioned Michael Mauboussin’s book The Success Equation before, but this will be the closest I come to a review.  The title makes it sound like a self-help book, but it is much more substantial.  However, his notes and bibliography somehow miss both Ken Hammond and Robin Hogarth, which frankly seems unlikely. Hogarth’s books Educating Intuition (post Learning, Feedback and Intuition) and Dance with Chance (post Dancing with Chance) have much in common with it.

Mauboussin’s most distinctive contribution, in my view, is to bring Bill James and his successors from baseball into the world of skill, luck, and investment. Mauboussin is also amazingly honest about the luck involved in investment, which is his world. He pretty much says that you cannot be an expert in his field, only experienced. Using sports, especially baseball, makes the book’s ideas much more understandable. That brings us to the idea for this post.  Mauboussin calls it reversion to the mean and Kahneman calls it regression to the mean.  Either way, baseball makes it more understandable.

To oversimplify, Mauboussin suggests that about 30% of an MLB team’s performance is luck, while only about 10% of an NBA team’s performance is luck. Thus the correlation between a team’s performance one year and its performance the next is only about 70% in baseball and 90% in basketball. For that reason, regression to the mean is much more important and visible in baseball.  Baseball statisticians have calculated that you can add 37 wins and 37 losses to your team’s record at any time in the year to get the best guess of its win percentage based on skill.  So if you are a Royals fan and they start off with 10 wins in their first 35 games, you can take heart in knowing that even though their win percentage is .286, the best guess of their true skill is (10 + 37) / (35 + 74) = .431.  Now this is no guarantee.  The Royals have defied statistics before. Since the NBA has less luck involved and the season is shorter, you only need to add 6 wins and 6 losses to the record.
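The padding rule above is simple enough to write down as a short sketch. The function name `regressed_win_pct` and its `pad` parameter are my own labels for illustration, not anything from the book; the padding values 37 and 6 are the ones quoted above.

```python
def regressed_win_pct(wins, games, pad=37):
    """Best guess of skill-based win percentage.

    Adds `pad` wins and `pad` losses to the observed record,
    as described in the text (37 for MLB, 6 for the NBA).
    """
    return (wins + pad) / (games + 2 * pad)


# The Royals example from the text: 10 wins in 35 games.
raw = 10 / 35
estimate = regressed_win_pct(10, 35)
print(f"raw: {raw:.3f}, regressed: {estimate:.3f}")  # raw: 0.286, regressed: 0.431

# The same record with the NBA padding of 6 moves much less toward .500.
print(f"NBA-style padding: {regressed_win_pct(10, 35, pad=6):.3f}")
```

The smaller the padding, the more the observed record is trusted, which is another way of saying that less luck is involved.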

Daniel Kahneman has an example of regression to the mean from his military past that applies to almost everyone.  Pilots in training would be lambasted after a particularly poor landing or praised lavishly after a very good one. And what do you know, the ones given negative reinforcement would usually improve and the ones given positive reinforcement would usually not do so well the next time. The trainers could feel this, too, and soon negative reinforcement dominated. Kahneman noted that his newly learned idea of the power of intermittent positive reinforcement was not so well received. The same thing happens when you yell at your kids on the t-ball field. Hopefully, you don’t, but you can’t criticize fairly until you have stood there. All this is nothing but regression to the mean. When you base your criticism or praise of an extreme performance on a tiny sample, you are most likely wrong. Mauboussin calls this the illusion of feedback.
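A small simulation makes the pilot story concrete. This is only a sketch under assumptions of my own: each landing is modeled as a fixed skill plus independent luck, both normally distributed with equal spread, and the instructor’s feedback has no effect whatsoever. The function name `simulate_landings` and all parameters are hypothetical.

```python
import random
from statistics import mean


def simulate_landings(n_pilots=10_000, skill_sd=1.0, luck_sd=1.0, seed=0):
    """Two landings per pilot; feedback between them changes nothing."""
    rng = random.Random(seed)
    landings = []
    for _ in range(n_pilots):
        skill = rng.gauss(0.0, skill_sd)
        first = skill + rng.gauss(0.0, luck_sd)   # skill plus luck
        second = skill + rng.gauss(0.0, luck_sd)  # same skill, fresh luck
        landings.append((first, second))

    # Rank pilots by their first landing and pick out the extremes.
    landings.sort(key=lambda pair: pair[0])
    tenth = n_pilots // 10
    worst, best = landings[:tenth], landings[-tenth:]
    return {
        "worst_first": mean(f for f, _ in worst),
        "worst_second": mean(s for _, s in worst),
        "best_first": mean(f for f, _ in best),
        "best_second": mean(s for _, s in best),
    }


for name, value in simulate_landings().items():
    print(f"{name}: {value:+.2f}")
```

In this model the pilots with the worst first landings score better on average the second time, and the best ones score worse, even though nobody yelled at or praised anyone.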

Regression to the mean is the flip side of correlation.  If you are playing chess against a grandmaster, skill dominates. His performance in one game is going to correlate very highly with his performance in the next, so there is not going to be much regression to the mean. He is going to beat you every time. Now if you are also a grandmaster, luck will be much more important.
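One way to see the flip side is the standard linear prediction: the expected next result sits between the overall mean and the last result, and the correlation decides how far it is pulled back. The sketch below assumes the 70% and 90% figures quoted earlier can be read as correlation coefficients and that the spread of results is the same from one season to the next; the function name `expected_next` and the .600 team are made up for illustration.

```python
def expected_next(score, league_mean, correlation):
    """Expected next result: league mean plus the correlated share of the gap."""
    return league_mean + correlation * (score - league_mean)


# A hypothetical team with a .600 record in a league that averages .500.
print(f"baseball   (r = 0.7): {expected_next(0.600, 0.500, 0.7):.3f}")  # 0.570
print(f"basketball (r = 0.9): {expected_next(0.600, 0.500, 0.9):.3f}")  # 0.590
```

With a correlation of 1 there is no regression at all, and with a correlation of 0 the best guess is simply the league mean.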

Mauboussin points out this example from baseball.  A batter comes up against a pitcher and the announcer tells you that this batter owns the pitcher, as he is batting .500 in 20 at bats with 6 extra base hits.  The pitcher owns the next batter, who has no hits in eleven at bats.  You will not be surprised that these vignettes are just stories.  Mauboussin calls this the illusion of cause and effect. We are certain that the batter is hitting .500 because he can see the ball perfectly coming out of the pitcher’s hand and has the perfect swing to hit that slider. A player’s overall performance with a large sample size is a much better predictor of specific interactions between hitters and pitchers than any exceptional outcomes from the past.

Mauboussin also points out the illusion of declining variance, which he describes as the idea that regression to the mean implies that the overall variance of results is converging over time.  Actually, regression to the mean does not predict declining variance, but only that an extreme measurement is likely to be followed by one closer to the mean.
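The distinction can be checked with a quick simulation. As a sketch only, assume each result is a fixed skill plus fresh, independent luck, both normally distributed; the function name `spread_over_time` and its parameters are hypothetical.

```python
import random
from statistics import mean, pstdev


def spread_over_time(n=20_000, skill_sd=1.0, luck_sd=1.0, seed=1):
    """Compare the overall spread of two rounds and follow the top 10%."""
    rng = random.Random(seed)
    first, second = [], []
    for _ in range(n):
        skill = rng.gauss(0.0, skill_sd)
        first.append(skill + rng.gauss(0.0, luck_sd))
        second.append(skill + rng.gauss(0.0, luck_sd))

    # Everyone at or above this score was in the top 10% in round one.
    cutoff = sorted(first)[n - n // 10]
    top = [(f, s) for f, s in zip(first, second) if f >= cutoff]
    return {
        "sd_first": pstdev(first),
        "sd_second": pstdev(second),
        "top_first_mean": mean(f for f, _ in top),
        "top_second_mean": mean(s for _, s in top),
    }


for name, value in spread_over_time().items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the standard deviation of the whole group is essentially the same in both rounds, while the top group’s average drops back toward zero. Individuals regress, but the field as a whole does not bunch up.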

Now it is one thing to act like regression to the mean is obvious, but in our everyday lives, we love causes for effects.  Keep in mind that quite often results are not random and there are real causes and effects, but when you get an extreme result, the most likely explanation is plain statistics.  A valid causal explanation needs to overcome that.  The classic example of regression to the mean is well set out below:

[Figure: regression toward the mean]

 
