Category Archives: numeracy

Consistency and Discrimination as Measures of Good Judgment

This post is based on a paper that appeared in Judgment and Decision Making, Vol. 12, No. 4, July 2017, pp. 369–381, “How generalizable is good judgment? A multi-task, multi-benchmark study,” authored by Barbara A. Mellers, Joshua D. Baker, Eva Chen, David R. Mandel, and Philip E. Tetlock. Tetlock is a legend in decision making, and he is probably listed as an author because the paper builds on some of his past work rather than because he was actively involved. Nevertheless, this paper provides an opportunity to go over some of the ideas in Superforecasting and expand upon them. Whoops! I was looking for an image to put on this post and found the one above. Mellers and Tetlock looked married, and they are. I imagine that she deserved more credit in Superforecasting: The Art and Science of Prediction. Even columnist David Brooks, whom I have derided in the past, beat me to that fact.

The authors note that Kenneth Hammond’s correspondence and coherence (Beyond Rationality) are the gold standards by which to evaluate judgment. Correspondence is being empirically correct, while coherence is being logically correct. Human judgment tends to fall short on both, but it has gotten us this far. Hammond often complained that psychological experiments were poorly designed as measures, but he complimented Tetlock on his use of correspondence to judge political forecasting expertise. Experts were found wanting, although they did better when the forecasting environment provided regular, clear feedback and repeated opportunities to learn. According to the authors, Weiss & Shanteau suggested that, at a minimum, good judges (i.e., domain experts) should demonstrate consistency and discrimination in their judgments. In other words, experts should make similar judgments when cases are alike and dissimilar judgments when cases are unalike. Mellers et al. suggest that consistency and discrimination are silver standards that could be useful. (As an aside, I would suggest that Ken Hammond would likely have had little use for these. Coherence is logical consistency and correspondence is empirical discrimination.)

Continue reading



Although I have much respect for Dan Kahan’s work, I have had a little trouble with the Identity-protective Cognition Thesis (ICT). The portion in bold in the quote below from “Motivated Numeracy and Enlightened Self-Government” has never rung true.

On matters like climate change, nuclear waste disposal, the financing of economic stimulus programs, and the like, an ordinary citizen pays no price for forming a perception of fact that is contrary to the best available empirical evidence: That individual’s personal beliefs and related actions—as consumer, voter, or public discussant—are too inconsequential to affect the level of risk he or anyone else faces or the outcome of any public policy debate. However, if he gets the “wrong answer” in relation to the one that is expected of members of his affinity group, the impact could be devastating: the loss of trust among peers, stigmatization within his community, and even the loss of economic opportunities.

Why should Thanksgiving be so painful if it were true? I do not even know what my friends think of these things. Now at some point issues like climate change become so politically tainted that you may avoid talking about them to not antagonize your friends, but that does not change my view. But now Kahan has a better explanation.

Continue reading

Fuzzy-Trace Theory Explains Time Pressure Results

This post is based on a paper: “Intuition and analytic processes in probabilistic reasoning: The role of time pressure,” authored by Sarah Furlan, Franca Agnoli, and Valerie F. Reyna. Valerie Reyna is, of course, the primary creator of fuzzy-trace theory. Reyna’s papers tend to do a good job of summing up the state of the decision making art and fitting in her ideas.

The authors note that although there are many points of disagreement, theorists generally agree that there are heuristic processes (Type 1) that are fast, automatic, unconscious, and require low effort. Many adult judgment biases are considered a consequence of these fast heuristic responses, also called default responses, because they are the first responses that come to mind. Type 1 processes are a central feature of intuitive thinking, requiring little cognitive effort or control. In contrast, analytic (Type 2) processes are considered slow, conscious, deliberate, and effortful, and they place demands on central working memory resources. Furlan, Agnoli, and Reyna assert that Type 2 processes are thought to be related to individual differences in cognitive capacity and Type 1 processes are thought to be independent of cognitive ability, a position challenged by the research presented in their paper. I was surprised by the premise, as set up by typical dual process theories, that intuitive abilities are unrelated to overall intelligence and cognitive abilities.

Continue reading

Hogarth on Simulation

This post is a continuation of the previous blog post Hogarth on Description. Hogarth and Soyer suggest that the information humans use for probabilistic decision making has two distinct sources: description of the particulars of the situations involved and experience of past instances. Most decision aiding has focused on exploring the effects of different problem descriptions, which, as has been shown, is important because human judgments and decisions are so sensitive to different aspects of descriptions. However, this very sensitivity is problematic in that different types of judgments and decisions seem to need different solutions. To find methods with more general application, Hogarth and Soyer suggest exploiting the well-recognized human ability to encode frequency information by building a simulation model that can be used to generate “outcomes” through a process that they call “simulated experience”.

Simulated experience essentially allows a decision maker to live actively through a decision situation as opposed to being presented with a passive description. The authors note that the difference between resolving problems that have been described as opposed to experienced is related to Brunswik’s distinction between the use of cognition and perception. In the former, people can be quite accurate in their responses but they can also make large errors. I note that this is similar to Hammond’s correspondence and coherence. With perception and correspondence, they are unlikely to be highly accurate but errors are likely to be small. Simulation, perception, and correspondence tend to be robust.
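To make the idea of simulated experience a little more concrete, here is a minimal sketch in Python. It is not Hogarth and Soyer's model. The decision (a hypothetical investment with a normally distributed return), the function names, and all of the numbers are my own assumptions for illustration. The point is only that a decision maker can draw many outcomes and see how often a loss occurs instead of reading a description of the distribution.

```python
import random


def simulate_outcomes(n_draws, mean, sd, seed=0):
    """Draw hypothetical outcomes from a normal distribution (an assumed model)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [rng.gauss(mean, sd) for _ in range(n_draws)]


def share_below(outcomes, threshold):
    """Fraction of experienced outcomes that fall below a threshold."""
    return sum(1 for x in outcomes if x < threshold) / len(outcomes)


# Hypothetical investment: average return of 5 with a standard deviation of 10.
outcomes = simulate_outcomes(10_000, mean=5.0, sd=10.0)
loss_share = share_below(outcomes, 0.0)

print(f"First few experienced outcomes: {[round(x, 1) for x in outcomes[:5]]}")
print(f"Share of draws that were losses: {loss_share:.1%}")
```

Under these assumed numbers roughly three draws in ten are losses, which is the kind of frequency information that a description of "mean 5, standard deviation 10" tends to hide.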

Continue reading

Hogarth on Description



This post is based on “Providing information for decision making: Contrasting description and simulation,” Journal of Applied Research in Memory and Cognition 4 (2015) 221–228, written by Robin M. Hogarth and Emre Soyer. Hogarth and Soyer propose that providing information to help people make decisions can be likened to telling stories. First, the provider – or story teller – needs to know what he or she wants to say. Second, it is important to understand characteristics of the audience as this affects how information is interpreted. And third, the provider must match what is said to the needs of the audience. Finally, when it comes to decision making, the provider should not tell the audience what to do. Although Hogarth and Soyer do not mention it, good storytelling draws us into the descriptions so that we can “experience” the story. (see post 2009 Review of Judgment and Decision Making Research)

Hogarth and Soyer state that their interest in this issue was stimulated by a survey they conducted of how economists interpret the results of regression analysis. The economists were given the outcomes of the regression analysis in a typical tabular format, and the questions involved interpreting the probabilistic implications of specific actions given the estimation results. The participants had available all the information necessary to provide correct answers, but in general they failed to do so. They tended to ignore the uncertainty involved in predicting the dependent variable conditional on values of the independent variable. As such, they vastly overestimated the predictive ability of the model. Another group of similar economists who saw only a bivariate scatterplot of the data were accurate in answering the same questions. These economists were not generally blinded by numbers, as some in the population are, but they still needed the visually presented frequency information.
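A small worked example shows why ignoring the residual uncertainty matters. This is a sketch with made-up numbers, not the regression from the Hogarth and Soyer survey. I assume a simple model y = intercept + slope * x + error, with normally distributed errors, and I ignore the uncertainty in the estimated coefficients themselves, which would only widen the spread further.

```python
from statistics import NormalDist

# Hypothetical estimation results (assumptions for illustration only).
intercept = 0.0
slope = 1.0          # estimated effect of x on y
residual_sd = 5.0    # standard deviation of the regression error

x = 2.0
predicted_mean = intercept + slope * x  # point prediction for y at this x

# Probability that an individual outcome y is positive at this x,
# treating y as normal around the point prediction.
p_positive = 1 - NormalDist(mu=predicted_mean, sigma=residual_sd).cdf(0.0)

print(f"Point prediction for y: {predicted_mean:.1f}")
print(f"Probability that y > 0: {p_positive:.1%}")
```

Reading only the coefficient, it is tempting to say that setting x to 2 makes y positive. With this much scatter around the line, the outcome is positive only about two times in three, which is what a scatterplot makes visible at a glance.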

Continue reading


This post is a look at the book by Philip E. Tetlock and Dan Gardner, Superforecasting: The Art and Science of Prediction. Phil Tetlock is also the author of Expert Political Judgment: How Good Is It? How Can We Know? In Superforecasting, Tetlock blends discussion of the largely popular literature on decision making and his long-duration scientific work on the ability of experts and others to predict future events.

In Expert Political Judgment: How Good Is It? How Can We Know? Tetlock found that the average expert did little better than guessing. He also found that some did better. In Superforecasting he discusses the study of those who did better and how they did it.

Continue reading

Cultural Differences are not always Reducible to Individual Differences

This post is based on the paper: “Cultural differences are not always reducible to individual differences,” written by Jinkyung Na, Igor Grossmann, Michael E. W. Varnum, Shinobu Kitayama, Richard Gonzalez, and Richard E. Nisbett, PNAS, April 6, 2010, vol. 107, pp. 6192–6197.

As people, I think that we want to believe that cultural differences can be reduced to individual differences. But is it actually true? The authors studied whether or not cultural constructs can be conceptualized as psychological traits at the individual level.

According to the authors, cultural psychology has placed a heavy emphasis on two constructs: social orientation and cognitive style. These two constructs seem applicable to decision making and make me want to apply them when there are international negotiations going on. Some cultures, such as the United States, are characterized by a social orientation valuing independence: emphasizing uniqueness, having relatively low sensitivity to social cues, and encouraging behaviors that affirm autonomy. In contrast, other cultures, including China, Japan, and Korea, tend to value interdependence: emphasizing harmonious relations with others, promoting sensitivity to social cues, and encouraging behaviors that affirm relatedness to others. Similarly, cultures have been shown to vary along the analytic-holistic dimension in cognitive style. Some cultures are analytic: detaching a focal object from the perceptual field, categorizing objects taxonomically, and ascribing causality to focal actors or objects. Other cultures are holistic: paying attention to the entire perceptual field, especially relations among objects and events, categorizing objects on the basis of their thematic relations, and attributing causality to context.

Continue reading

Medical Decisions–Risk Savvy

This post looks at the medical/health component of decision making as addressed in Gerd Gigerenzer’s new book, Risk Savvy: How to Make Good Decisions. First, Gigerenzer has contributed greatly to improving health decision making. This blog includes three consecutive posts on the Statistics of Health Decision Making based on Gigerenzer’s work.

He points out both the weaknesses of screening tests and the weaknesses in our understanding of their results. We have to overcome our tendency to see linear relationships when they are nonlinear. Doctors are no different. The classic problem is an imperfect screening test for a relatively rare disease. You cannot think in fractions or percentages. You must think in absolute frequencies. Breast cancer screening is one example. Generally, it catches about 90% of breast cancers, and only about 9% of women who do not have breast cancer test positive. So if you have a positive test, chances are you have breast cancer, right? No! You cannot let your intuition get involved, especially when the disease is rarer than the test’s mistakes. If we assume that 10 out of 1000 women have breast cancer, then 90%, or 9, will be detected, but about 90 of the 990 women who do not have the disease will also test positive. Thus only 9 of the 99 who test positive, roughly 9%, actually have breast cancer. I know this, but give me a new disease or a slightly different scenario and let a month pass, and I will still be tempted to shortcut the absolute frequencies and get it wrong.
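The absolute-frequency arithmetic above is easy to redo for a new disease or a different test, so here is a short Python sketch of it. The function name and its parameters are my own; the inputs are the numbers used in this post (1000 women, 1% with breast cancer, 90% detection, 9% false positives). The post rounds the false positives to about 90, while the code keeps the unrounded 89.1.

```python
def screening_counts(population, prevalence, sensitivity, false_positive_rate):
    """Turn percentages into absolute frequencies for a screening test."""
    sick = population * prevalence
    healthy = population - sick
    true_pos = sick * sensitivity              # sick people who test positive
    false_pos = healthy * false_positive_rate  # healthy people who test positive
    # Chance that someone with a positive test actually has the disease.
    ppv = true_pos / (true_pos + false_pos)
    return true_pos, false_pos, ppv


true_pos, false_pos, ppv = screening_counts(
    population=1000, prevalence=0.01, sensitivity=0.90, false_positive_rate=0.09
)

print(f"Women with cancer who test positive: {true_pos:.1f}")
print(f"Women without cancer who test positive: {false_pos:.1f}")
print(f"Chance of cancer given a positive test: {ppv:.1%}")
```

Changing `prevalence` shows how strongly the answer depends on how rare the disease is: the same test looks far more trustworthy when the condition is common.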

Continue reading

Gigerenzer – Risk Savvy

Gerd Gigerenzer has a 2014 book out entitled Risk Savvy: How to Make Good Decisions, which is a refinement of his past books for the popular press. It is a little too facile, but it is worthwhile. Gigerenzer has taught me much, and he will likely continue to do so. He is included in too many posts to provide the links here (you can search for them). My discussion of the book will be divided into two posts. This one will be a general look, while the next post will concentrate on Gigerenzer’s take on medical decision making.

As in many books like this, the notes provide insight. Gigerenzer points out his disagreement with Kahneman over whether heuristics all belong to the unconscious system. As he notes, heuristics, for instance the gaze heuristic, can be used consciously or unconsciously. This has been a major issue in my mind with Kahneman’s System 1 and System 2. Kahneman throws heuristics exclusively into the unconscious system. I also side with Gigerenzer against Kahneman, Ariely, and Thaler on the claim that the unconscious system is associated with bias. As Gigerenzer states: “A system that makes no errors is not intelligent.” He interestingly points out the use of the gaze heuristic by Sully Sullenberger in deciding not to return to LaGuardia, but instead to land in the Hudson River.

Continue reading

Affect Gap

This post is based on the paper: “The Affect Gap in Risky Choice: Affect-Rich Outcomes Attenuate Attention to Probability Information,” authored by Thorsten Pachur, Ralph Hertwig, and Roland Wolkewitz, that appeared in Decision, 2013, Volume 1, No. 1, pp. 64–78. This is a continuation of the affect/emotion theme. It is more of a valence-based idea than Lerner’s Appraisal Tendency Framework. This is more thinking about emotion than actually experiencing it, although the two can come together.

Often risky decisions involve outcomes that can create considerable emotional reactions. Should we travel by plane and tolerate a minimal risk of a fatal terrorist attack or take the car and run the risk of traffic jams and car accidents? How do people make such decisions? Decisions under risk typically obey the principle of the maximization of expectation. The expectation expresses the average of an option’s outcomes, each weighted by its probability. This, of course, underlies expected utility theory and cumulative prospect theory, and these models do a good job in accounting for choices among relatively affect-poor

Continue reading