Consistency and Discrimination as Measures of Good Judgment

This post is based on a paper that appeared in Judgment and Decision Making, Vol. 12, No. 4, July 2017, pp. 369–381, “How generalizable is good judgment? A multi-task, multi-benchmark study,” authored by Barbara A. Mellers, Joshua D. Baker, Eva Chen, David R. Mandel, and Philip E. Tetlock. Tetlock is a legend in decision making, and it is likely that he is an author because the paper builds on some of his past work and not because he was actively involved. Nevertheless, this paper at least provides an opportunity to go over some of the ideas in Superforecasting and expand upon them. Whoops! I was looking for an image to put on this post and found the one above. Mellers and Tetlock looked married, and they are. I imagine that she deserved more credit in Superforecasting: The Art and Science of Prediction. Even columnist David Brooks, whom I have derided in the past, beat me to that fact. (

The authors note that Kenneth Hammond’s correspondence and coherence (Beyond Rationality) are the gold standards by which to evaluate judgment. Correspondence is being empirically correct, while coherence is being logically correct. Human judgment tends to fall short on both, but it has gotten us this far. Hammond often complained that psychological experiments were poorly designed as measures, but he complimented Tetlock on his use of correspondence to judge political forecasting expertise. Experts were found wanting, although they did better when the forecasting environment provided regular, clear feedback and there were repeated opportunities to learn. According to the authors, Weiss & Shanteau suggested that, at a minimum, good judges (i.e., domain experts) should demonstrate consistency and discrimination in their judgments. In other words, experts should make similar judgments when cases are alike and dissimilar judgments when cases are not alike. Mellers et al. suggest that consistency and discrimination are silver standards that could be useful. (As an aside, I would suggest that Ken Hammond would likely have had little use for these. Coherence is logical consistency and correspondence is empirical discrimination.)

Although I have much respect for Dan Kahan’s work, I have had a little trouble with the Identity-protective Cognition Thesis (ICT). The portion in bold in the quote below from “Motivated Numeracy and Enlightened Self-Government” has never rung true.

On matters like climate change, nuclear waste disposal, the financing of economic stimulus programs, and the like, an ordinary citizen pays no price for forming a perception of fact that is contrary to the best available empirical evidence: That individual’s personal beliefs and related actions—as consumer, voter, or public discussant—are too inconsequential to affect the level of risk he or anyone else faces or the outcome of any public policy debate. However, if he gets the “wrong answer” in relation to the one that is expected of members of his affinity group, the impact could be devastating: the loss of trust among peers, stigmatization within his community, and even the loss of economic opportunities.

Why should Thanksgiving be so painful if it were true? I do not even know what my friends think of these things. Now at some point issues like climate change become so politically tainted that you may avoid talking about them to not antagonize your friends, but that does not change my view. But now Kahan has a better explanation.

Fuzzy-Trace Theory Explains Time Pressure Results

This post is based on a paper:  “Intuition and analytic processes in probabilistic reasoning: The role of time pressure,” authored by Sarah Furlan, Franca Agnoli, and Valerie F. Reyna. Valerie Reyna is, of course, the primary creator of fuzzy-trace theory. Reyna’s papers tend to do a good job of summing up the state of the decision making art and fitting in her ideas.

The authors note that although there are many points of disagreement, theorists generally agree that there are heuristic processes (Type 1) that are fast, automatic, unconscious, and require low effort. Many adult judgment biases are considered a consequence of these fast heuristic responses, also called default responses, because they are the first responses that come to mind. Type 1 processes are a central feature of intuitive thinking, requiring little cognitive effort or control. In contrast, analytic (Type 2) processes are considered slow, conscious, deliberate, and effortful, and they place demands on central working memory resources. Furlan, Agnoli, and Reyna assert that Type 2 processes are thought to be related to individual differences in cognitive capacity, while Type 1 processes are thought to be independent of cognitive ability, a position challenged by the research presented in their paper. I was surprised by the premise, built into typical dual process theories, that intuitive abilities are unrelated to overall intelligence and cognitive abilities.

Not that Irrational

This post is based on the paper “The irrational hungry judge effect revisited: Simulations reveal that the magnitude of the effect is overestimated,” written by Andreas Glöckner, which appeared in Judgment and Decision Making, Vol. 11, No. 6, November 2016, pp. 601–610. Danziger, Levav and Avnaim-Pesso (DLA) analyzed 1,112 legal rulings of Israeli parole boards, covering about 40% of the country’s parole requests. They assessed the effect of the serial order in which cases are presented within a ruling session, taking advantage of the fact that the boards work through cases in three sessions per day, separated by a late morning snack and a lunch break. They found that the probability of a favorable decision drops from about 65% to 5% from the first ruling to the last ruling within each session. This is equivalent to an odds ratio of 35. DLA argue that these findings support the view that extraneous factors influence judicial decisions, and they speculate that the effect might be driven by mental depletion. Glöckner notes that the article has attracted attention and that the supposed order effect is widely cited in psychology.
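For readers who want to check the arithmetic behind the odds ratio, here is a minimal sketch. The function names are my own and are not taken from either paper; the only inputs are the rounded 65% and 5% figures quoted above.

```python
def odds(p):
    """Convert a probability into odds (p against 1 - p)."""
    return p / (1.0 - p)


def odds_ratio(p_first, p_last):
    """Ratio of the odds of a favorable ruling at the start of a session
    to the odds at the end of the session."""
    return odds(p_first) / odds(p_last)


# Rounded probabilities quoted in the post: about 65% at the first ruling
# of a session and about 5% at the last.
ratio = odds_ratio(0.65, 0.05)
print(round(ratio, 1))  # 35.3
```

The odds at the start are 0.65/0.35, or about 1.86, and at the end 0.05/0.95, or about 0.053, so the ratio comes to roughly 35, in line with the figure cited.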

Hogarth on Simulation

This post is a continuation of the previous blog post Hogarth on Description. Hogarth and Soyer suggest that the information humans use for probabilistic decision making has two distinct sources: description of the particulars of the situation involved and experience of past instances. Most decision aiding has focused on exploring the effects of different problem descriptions, which, as has been shown, is important because human judgments and decisions are so sensitive to different aspects of descriptions. However, this very sensitivity is problematic in that different types of judgments and decisions seem to need different solutions. To find methods with more general application, Hogarth and Soyer suggest exploiting the well-recognized human ability to encode frequency information, by building a simulation model that can be used to generate “outcomes” through a process that they call “simulated experience”.

Simulated experience essentially allows a decision maker to live actively through a decision situation as opposed to being presented with a passive description. The authors note that the difference between resolving problems that have been described as opposed to experienced is related to Brunswik’s distinction between the use of cognition and perception. In the former, people can be quite accurate in their responses but they can also make large errors. I note that this is similar to Hammond’s correspondence and coherence. With perception and correspondence, they are unlikely to be highly accurate but errors are likely to be small. Simulation, perception, and correspondence tend to be robust.
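To make the idea of generating “outcomes” concrete, here is a rough sketch of what a simulated-experience tool might look like. It is not Hogarth and Soyer’s implementation. The linear model, the parameter values, and the function names are all assumptions chosen purely for illustration.

```python
import random


def simulate_outcomes(x, intercept, slope, noise_sd, n=1000, seed=0):
    """Draw n simulated outcomes for a chosen input x.

    Assumes a hypothetical linear model with normally distributed noise:
    outcome = intercept + slope * x + noise.
    """
    rng = random.Random(seed)
    return [intercept + slope * x + rng.gauss(0.0, noise_sd) for _ in range(n)]


def fraction_above(outcomes, threshold):
    """Share of simulated outcomes that exceed the threshold."""
    return sum(1 for y in outcomes if y > threshold) / len(outcomes)


if __name__ == "__main__":
    # Made-up numbers: the average effect is positive (1.0), but the noise is large.
    outcomes = simulate_outcomes(x=1.0, intercept=0.0, slope=1.0, noise_sd=5.0)
    share = fraction_above(outcomes, 0.0)
    print(f"Share of simulated outcomes above zero: {share:.2f}")
```

Instead of reading a slope and a standard error, the decision maker picks an input, watches many outcomes come back, and sees how often the result lands above or below a level he or she cares about. That is frequency information, which people encode well.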

Hogarth on Description



This post is based on “Providing information for decision making: Contrasting description and simulation,” Journal of Applied Research in Memory and Cognition 4 (2015) 221–228, written by Robin M. Hogarth and Emre Soyer. Hogarth and Soyer propose that providing information to help people make decisions can be likened to telling stories. First, the provider – or story teller – needs to know what he or she wants to say. Second, it is important to understand characteristics of the audience as this affects how information is interpreted. And third, the provider must match what is said to the needs of the audience. Finally, when it comes to decision making, the provider should not tell the audience what to do. Although Hogarth and Soyer do not mention it, good storytelling draws us into the descriptions so that we can “experience” the story. (see post 2009 Review of Judgment and Decision Making Research)

Hogarth and Soyer state that their interest in this issue was stimulated by a survey they conducted of how economists interpret the results of regression analysis. The economists were given the outcomes of the regression analysis in a typical, tabular format, and the questions involved interpreting the probabilistic implications of specific actions given the estimation results. The participants had available all the information necessary to provide correct answers, but in general they failed to do so. They tended to ignore the uncertainty involved in predicting the dependent variable conditional on values of the independent variable. As a result, they vastly overestimated the predictive ability of the model. Another group of similar economists who saw only a bivariate scatterplot of the data were accurate in answering the same questions. These economists were not generally blinded by numbers, as some in the general population are, but they still needed the visually presented frequency information.

Single Strategy Framework and the Process of Changing Weights


This post starts from the conclusion of the previous post that the evidence supports a single strategy framework, looks at Julian Marewski’s criticism, and then piles on with ideas on how weights can be changed in a single strategy framework.

Marewski provided a paper for the special issue of the Journal of Applied Research in Memory and Cognition (2015) on “Modeling and Aiding Intuition in Organizational Decision Making”: “Unveiling the Lady in Black: Modeling and Aiding Intuition,” authored by Ulrich Hoffrage and Julian N. Marewski. The paper gives the parallel constraint satisfaction model a not-so-subtle knock:

By exaggerating and simplifying features or traits, caricatures can aid perceiving the real thing. In reality, both magic costumes and chastity belts are degrees on a continuum. In fact, many theories are neither solely formal or verbal. Glöckner and Betsch’s connectionist model of intuitive decision making, for instance, explicitly rests on both math and verbal assumptions. Indeed, on its own, theorizing at formal or informal levels is neither “good” nor “bad”. Clearly, both levels of description have their own merits and, actually, also their own problems. Both can be interesting, informative, and insightful – like the work presented in the first three papers of this special issue, which we hope you enjoy as much as we do. And both can border re-description and tautology. This can happen when a theory does not attempt to model processes. Examples are mathematical equations with free parameters that carry no explanatory value, but that are given quasi-psychological, marketable labels (e.g., “risk aversion”).

Strategy Selection — Single or Multiple?

This post tries to do a little tying together on a familiar subject. I look at a couple of papers that provide more perspective than typical research papers provide. First is the preliminary dissertation of Anke Söllner. She provides some educated synthesis which my posts need, but rarely get. Two of her papers which are also part of her dissertation are discussed in the posts Automatic Decision Making and Tool Box or Swiss Army Knife? I also look at a planned special issue of the Journal of Behavioral Decision Making to address “Strategy Selection: A Theoretical and Methodological Challenge.”

Söllner’s work is concerned with the question: which framework, multiple strategy or single strategy, best describes multi-attribute decision making? In multi-attribute decision making we have to choose among two or more options. Cues can be consulted, and each cue has some validity in reference to the decision criterion. If the criterion is an objective one (e.g., the quantity of oil), the task is referred to as probabilistic inference, whereas a subjective criterion (e.g., preference for a day trip) characterizes a preferential choice task. The best-known multiple strategy framework is the adaptive toolbox, which includes fast and frugal heuristics as individual strategies. Single strategy frameworks assume that instead of selecting one from several distinct decision strategies, decision makers employ the same uniform decision making mechanism in every situation. The single strategy frameworks include the evidence accumulation model and the connectionist parallel constraint satisfaction model.

This post is a look at the book by Philip E. Tetlock and Dan Gardner, Superforecasting: The Art and Science of Prediction. Phil Tetlock is also the author of Expert Political Judgment: How Good Is It? How Can We Know? In Superforecasting, Tetlock blends discussion of the largely popular literature on decision making with his long-running scientific work on the ability of experts and others to predict future events.

In Expert Political Judgment: How Good Is It? How Can We Know? Tetlock found that the average expert did little better than guessing.  He also found that some did better. In Superforecasting he discusses the study of those who did better and how they did it.

The Mixed Instrumental Controller

This is more or less a continuation of the previous post based on Andy Clark’s “Embodied Prediction,” in T. Metzinger & J. M. Windt (Eds). Open MIND: 7(T). Frankfurt am Main: MIND Group (2015). It further weighs in on the issue of changing strategies or changing weights (see post Revisiting Swiss Army Knife or Adaptive Tool Box). Clark has brought to my attention the terms model-free and model-based, which seem to roughly equate to intuition/system 1 and analysis/system 2 respectively. With this translation, I am helped in trying to tie this into ideas like cognitive niches and parallel constraint satisfaction. Clark in a footnote:

Current thinking about switching between model-free and model based strategies places them squarely in the context of hierarchical inference, through the use of “Bayesian parameter averaging”. This essentially associates model-free schemes with simpler (less complex) lower levels of the hierarchy that may, at times, need to be contextualized by (more complex) higher levels.

