Kind and Wicked Learning Environments

This post is based on the paper “The Two Settings of Kind and Wicked Learning Environments” by Robin M. Hogarth, Tomás Lejarraga, and Emre Soyer, which appeared in Current Directions in Psychological Science, 2015, Vol. 24(5), 379–385. Hogarth originated the idea of kind and wicked learning environments, which he discusses in his book Educating Intuition.

Hogarth et al. state that inference involves two settings: In the first, information is acquired (learning); in the second, it is applied (predictions or choices). Kind learning environments involve close matches between the informational elements in the two settings and are a necessary condition for accurate inferences. Wicked learning environments involve mismatches.

Hogarth (Learning, Feedback, and Intuition) introduces the concept of wicked learning environments with an example. An early 20th century physician in a New York hospital acquired a reputation for accurately diagnosing typhoid fever in its early stages. The physician believed that the appearance of the tongue was highly diagnostic. Hence, his clinical technique included palpating patients’ tongues before making his pessimistic forecasts. Unfortunately, his forecasts were invariably correct because, using only his hands, he was a more effective carrier of the disease than Typhoid Mary.

Hogarth describes wicked domains as situations in which feedback in the form of outcomes of actions or observations is poor, misleading, or even missing. In contrast, in kind learning environments, feedback links outcomes directly to the appropriate actions or judgments and is both accurate and plentiful. In determining when people’s intuitions are likely to be accurate, this framework emphasizes the importance of the conditions under which learning has taken place. Kind learning environments are a necessary condition for accurate intuitive judgments, whereas intuitions acquired in wicked environments are likely to be mistaken.

The Two-Settings Framework
Hogarth et al. conceptualize inference through the lens of probabilistic prediction. One observes a sample, calculates a statistic, and then estimates that statistic in the population or a different sample (as when, e.g., one estimates a mean). The theoretical justification relies on a simple assumption: Samples are randomly drawn from the same underlying population.

This formulation has been critical in judgment and decision-making research. However, the authors contend that it is ill-suited for considering the psychological issues underlying decision making because, instead of one underlying population, people have to deal with two populations or, as they put it, two settings.

In the first setting, people learn about a situation (e.g., how two variables covary). In the second, they take an action or make a prediction using the knowledge acquired in the first. One setting is characterized by learning and the other by choice or prediction. To illustrate, imagine you are a personnel manager who uses a test to select job candidates. This test has been accurate in the past (learning). Thus, for current decisions (predictions), the test can be expected to be accurate when the features of the two settings (past and present) match. For example, are the present candidates similar to those in the past?

Rather than assuming that both situations (e.g., past and present) are random samples from the same underlying population, Hogarth et al. suggest two distinct settings. They refer to the first as L (for learning) and the second as T (for target) and ask how these match. On the left-hand side of Figure 1, they consider six ways in which the elements of information in L and T do or do not match, and these, in turn, allow them to define different task structures for kind and wicked learning environments.

The right-hand side of Figure 1 illustrates the cases on the left using the job-selection scenario. Each scatter plot shows the data experienced by the manager in learning about the relation between test scores and job performance from past applicants (L). Subsequently, this information is used to predict the performance of new candidates (T).

Cases A and B represent kind learning environments. In A, there is a perfect match between the elements of L and T. Case B reflects that the presence of random error means that matches are at best approximate, so the relation between X and Y in the right-hand panel is drawn as an ellipse rather than a straight line.

Cases C through F represent wicked learning environments. In C, L is a subset of T, so there are elements in T that cannot be inferred from L. Examples include survivorship bias, in which data have been systematically restricted by events or actions. In the example on the right, performance data are not available for people scoring low on the test (X < 10) because they were not selected for the job.
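The range restriction in Case C can be illustrated with a small simulation. This is only a sketch under assumed numbers: the score distribution, the slope of 0.5, the noise level, and the helper names `simulate_applicants` and `pearson` are illustrative choices and do not come from the paper. Only the cutoff at X = 10 follows the example above.

```python
import random

def pearson(xs, ys):
    """Pearson correlation between two equally long sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def simulate_applicants(n, seed=0):
    """Hypothetical applicants: (test score, job performance) pairs."""
    rng = random.Random(seed)
    applicants = []
    for _ in range(n):
        score = rng.gauss(10, 3)                      # assumed score distribution
        performance = 0.5 * score + rng.gauss(0, 2)   # assumed relation plus noise
        applicants.append((score, performance))
    return applicants

# T: the full applicant pool the manager must predict for.
target = simulate_applicants(5000)

# L: only those who were hired (X > 10) ever produce performance data.
learning = [(x, y) for x, y in target if x > 10]

r_target = pearson(*zip(*target))
r_learning = pearson(*zip(*learning))

print(f"correlation in full pool (T):     {r_target:.2f}")
print(f"correlation in hired subset (L):  {r_learning:.2f}")
```

Under these assumptions the correlation seen in the hired-only learning sample is weaker than the correlation across the full pool, so a manager who relies on L alone would misjudge how informative the test is in T.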

In D, T is a subset of L. This can occur when the person is unaware that there has been a change in the composition of the reference class between learning and prediction. For example, imagine that the applicant pool changes because the local university has lowered its admission standards, such that there are no highly qualified candidates among the graduates applying for the job. However, the personnel manager does not realize this.

In E, the elements of L and T intersect because of systematic factors, and the ability to predict in T is limited. This case captures self-fulfilling prophecies or so-called treatment effects. In terms of the selection model, those chosen (X > 10) receive special “treatment” that systematically biases job performance positively (e.g., they have excellent mentors). The personnel manager is exposed to a biased learning sample. Case E also captures the conditions of both C and D.
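A rough way to see the treatment effect in Case E is to compare what the manager observes with what performance would have been without the special treatment. The sketch below makes several assumptions that are not in the paper: performance is observed for everyone (to isolate the treatment effect from the selection effect), the mentoring bonus is a fixed amount (`MENTOR_BONUS`), and the score and noise distributions are invented for illustration.

```python
import random

def ols_slope(xs, ys):
    """Least-squares slope of y on x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

rng = random.Random(1)
CUTOFF = 10.0        # selection threshold from the example (X > 10)
MENTOR_BONUS = 3.0   # assumed size of the "treatment" effect

scores, untreated, observed = [], [], []
for _ in range(5000):
    x = rng.gauss(10, 3)                    # assumed score distribution
    base = 0.5 * x + rng.gauss(0, 2)        # performance without any treatment
    bonus = MENTOR_BONUS if x > CUTOFF else 0.0
    scores.append(x)
    untreated.append(base)
    observed.append(base + bonus)           # what the manager actually sees

slope_untreated = ols_slope(scores, untreated)
slope_observed = ols_slope(scores, observed)

print(f"slope without treatment: {slope_untreated:.2f}")
print(f"slope the manager sees:  {slope_observed:.2f}")
```

In this setup the observed slope is steeper than the untreated one: part of the apparent predictive power of the test is produced by the mentoring itself, which is the self-fulfilling element of Case E.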

Finally, in Case F, T and L have no elements in common. In this case, the variable used to predict performance is not related to it (e.g., physical appearance).

Features of Wickedness
A wicked learning environment can emerge as a result of actions taken by the person making the inferences (as in self-fulfilling prophecies, Case E) as well as the characteristics of the environment. For example, a Case C situation could arise if someone were asked to make predictions beyond the range of data observed in the past.

Although discrete in the classification scheme, kindness and wickedness can vary in degree, as along a continuum. For instance, Case A is kinder than B, which is kinder than E or F. But what happens when mismatches are due to random factors? In B, for example, noise attenuates predictive ability. In fact, with much noise, predictive ability could be inherently lower in Case B than in some wicked environments, such as Case C. However, the framework clearly indicates that whereas the underlying cause of mismatch is random in the former, it is systematic in the latter.

The framework deals only with the elements of information in L and T; it does not explain why individuals consider extraneous information or how information is aggregated in making inferences. These issues are important because many errors can be attributed to attention paid to extraneous information and/or to inappropriate aggregation rules, such as aggregating additively when the correct rule is multiplicative.


The concept of the kindness of the learning environment also explains the predictive accuracy of some heuristic decision processes that typically ignore information and involve simple decision rules. Successful heuristics exploit two key features of the environment: how information is aggregated and redundancy (Heuristic and Linear Models). As such, they operate in the intersection of L and T. For example, when people employ the recognition heuristic to select one of two alternatives (On the Use of Recognition in Inferential Decision Making), they base their judgments on information available in memory that happens to be correlated with what they are trying to predict.
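The dependence of the recognition heuristic on the environment can be made concrete with a toy example. The rule below follows the usual description of the heuristic (when exactly one of two alternatives is recognized, choose it); the items, their sizes, and the two recognition sets are entirely hypothetical and chosen only to contrast a case where recognition tracks the criterion with one where it does not.

```python
from itertools import combinations

def recognition_choice(a, b, recognized):
    """Pick the recognized item, or None when the heuristic does not apply."""
    a_known = a in recognized
    b_known = b in recognized
    if a_known and not b_known:
        return a
    if b_known and not a_known:
        return b
    return None  # both or neither recognized: the heuristic is silent

def accuracy(true_size, recognized):
    """Share of applicable pairs in which the recognized item is the larger one."""
    hits = applicable = 0
    for a, b in combinations(true_size, 2):
        choice = recognition_choice(a, b, recognized)
        if choice is None:
            continue
        applicable += 1
        larger = a if true_size[a] > true_size[b] else b
        hits += choice == larger
    return hits / applicable if applicable else None

# Hypothetical items and criterion values (not real data).
true_size = {"A": 900, "B": 700, "C": 300, "D": 100}

# Recognition correlated with size: the large items are the known ones.
aligned = accuracy(true_size, recognized={"A", "B"})

# Recognition anti-correlated with size: only the small items are known.
misaligned = accuracy(true_size, recognized={"C", "D"})

print(aligned, misaligned)
```

The same simple rule is perfectly accurate in the first environment and always wrong in the second, which is the sense in which a heuristic's success depends on what memory happens to be correlated with.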

Matching as a Default
People often use a default strategy that projects a match from L to T. This is likely because:

  1. Inferences often need only to suggest a direction as opposed to providing precise answers.
  2. Even when a person knows that elements are missing from L, it is unclear how to correct for missing observations (Case C) or unrepresentative learning sets (Case D).
  3. Default matching strategies are cognitively simple. Adjusting defaults requires meta-cognitive ability that people may not possess.

If an environment is kind, the necessary conditions for accurate inference exist. Therefore, any errors must be attributed to the person. If wicked, we can identify how error results from task features, although these can also be affected by human actions. In short, the framework facilitates pinpointing the sources of errors (task structure and/or person). Table 1 lists some phenomena viewed from this perspective. For example, consider the “hot stove” effect, the fourth entry. Here, a person’s experience of past outcomes (learning) determines what she selects currently (target), but then the outcome of this biases her subsequent learning.
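The feedback loop in the hot stove effect can be sketched as a simulation. The version here assumes one common reading of the effect: a person keeps choosing a risky option only while her running estimate of its payoff is at least as good as a known alternative, and stops sampling (so stops learning) once it falls below. The payoff distribution, the constants `TRUE_MEAN`, `SAFE_VALUE`, and `TRIALS`, and the stopping rule are illustrative assumptions, not details from the paper.

```python
import random

TRUE_MEAN = 0.5    # assumed average payoff of the risky option
SAFE_VALUE = 0.0   # assumed payoff of the known alternative
TRIALS = 50        # assumed maximum number of choices per person

def final_estimate(rng):
    """Running-mean estimate of the risky option when bad experiences end sampling."""
    total = rng.gauss(TRUE_MEAN, 1.0)   # first experience
    count = 1
    for _ in range(TRIALS - 1):
        if total / count < SAFE_VALUE:
            break                        # avoids the option; the estimate is frozen
        total += rng.gauss(TRUE_MEAN, 1.0)
        count += 1
    return total / count

rng = random.Random(2)
estimates = [final_estimate(rng) for _ in range(2000)]

mean_estimate = sum(estimates) / len(estimates)
share_pessimistic = sum(e < SAFE_VALUE for e in estimates) / len(estimates)

print(f"true mean payoff:          {TRUE_MEAN:.2f}")
print(f"average final estimate:    {mean_estimate:.2f}")
print(f"share who end up avoiding: {share_pessimistic:.2f}")
```

Because unlucky early outcomes cut off further sampling while lucky ones get corrected by later experience, the average final estimate falls below the true mean, even though every individual observation is unbiased.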

Since kind environments are a necessary condition for accurate judgments, the framework suggests deliberately creating kind environments. Indeed, this reasoning motivated Hogarth’s work on simulated experience, in which he engineered kind environments by letting people experience sequential outcomes of probabilistic processes (Hogarth on Simulation) and investigated their ability to make appropriate probabilistic statements.
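The idea of experiencing sequential outcomes can be illustrated with a minimal sketch. This is not Hogarth's experimental procedure; it simply assumes a hypothetical process (normally distributed returns with an invented mean and spread) and compares the frequency of losses a person would experience with the analytic probability.

```python
import random
from math import erf, sqrt

def experience_outcomes(n, mean, sd, seed=0):
    """Draw n sequential outcomes from an assumed normal process."""
    rng = random.Random(seed)
    return [rng.gauss(mean, sd) for _ in range(n)]

MEAN, SD = 5.0, 10.0   # hypothetical return distribution

# What the person "lives through": one outcome after another.
outcomes = experience_outcomes(2000, MEAN, SD)
experienced_loss_rate = sum(o < 0 for o in outcomes) / len(outcomes)

# Analytic probability of a loss, P(return < 0), from the normal CDF.
analytic_loss_rate = 0.5 * (1 + erf((0 - MEAN) / (SD * sqrt(2))))

print(f"experienced share of losses: {experienced_loss_rate:.3f}")
print(f"analytic probability:        {analytic_loss_rate:.3f}")
```

Here the learning setting is built to match the target by construction: the experienced frequencies come from the same process the person will later be asked about, so the two numbers agree closely.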

Hogarth et al. have put together several important ideas here and explained them well. It is definitely an advance on Hogarth’s Educating Intuition, and there is much to continue thinking about. A couple of my errant thoughts: matching as a default goes along with our ability to do well in linear situations and less well in nonlinear ones. It also reminds me of Hammond’s (Beyond Rationality) correspondence and coherence. Correspondence is a matching rationality that is often linear, while coherence is a logical rationality that is often nonlinear. I am not certain whether the gambler’s fallacy, where we get lazy and assume that the next answer is determined by an overall distribution, falls within this framework or is an aggregation issue.