I used material from Ken Hammond's book Human Judgment and Social Policy: Irreducible Uncertainty, Inevitable Error, Unavoidable Injustice in my previous post. In the book he makes a point that was previously lost on me: a key risk is never mentioned in discussions of the 1986 Challenger disaster, the risk of a false negative.
What was the risk of not launching when the launch should have gone ahead? As Hammond writes, that error surely was in everyone's mind: "What if it is discovered that we held back unnecessarily or failed to launch when we should have?" And what would the consequences of that error have been? A congressional uproar? Some managers losing their jobs? Although there is no indication that this kind of error and its consequences were explicitly considered, as Hammond says, there can be little doubt that fear of it was present. The duality of error was ignored, at least explicitly. If the risk of a false negative had been set out, it might have changed the discussion and the decision. (As an aside, considering the 2003 Columbia disaster, it probably would not have changed the discussion. It seems apparent that an honest risk assessment is not always desirable for everyone. NASA probably felt that Congress would not accept the real risk, and Congress probably thought that the public would not accept it.)
Hammond notes that it was the introduction of signal detection theory in the early 1950s that brought the importance of the duality of error home to psychologists. This is likely the first mention of Signal Detection Theory in this blog, and it is probably about time. According to Hammond, signal detection theory (SDT) provides one of the best and most sophisticated approaches to the problem of judgment and decision making in the face of uncertainty and duality of error. It has been applied to such diverse problems as the diagnosis of AIDS, the detection of dangerous flaws in aircraft structure, weather forecasting, and the detection of breast cancer.
The theory was brought to the attention of psychologists in 1954 by an article in the Psychological Review by Tanner and Swets. SDT began with the simple case of one stimulus (a signal) to be discriminated from another (noise). The basic theory is that for any criterion, there will be four kinds of decision outcome: two types of errors, false positives and false negatives, and two types of correct decisions, true positives (hits) and true negatives. Exactly where the positivity criterion is set determines the balance among those proportions. If a low or lenient criterion is set, the proportions of both true-positive and false-positive outcomes will be relatively high and the proportions of both true-negative and false-negative outcomes will be relatively low. The converse is the case when a high or strict criterion is set. Once the numbers become available, we can determine what proportions of false positives and false negatives will be produced, as well as the proportions of true positives and true negatives. And when it is possible to calculate the costs of the errors and the benefits of identifying the true positives and true negatives, we can determine the cost/benefit ratio that follows from a specific action.
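The trade-off described here is easy to make concrete. A minimal sketch, assuming unit-variance Gaussian internal responses for noise and signal; the means and criterion values are illustrative, not from Hammond:

```python
import math

def norm_cdf(x, mu=0.0):
    """Cumulative probability of a unit-variance normal at x."""
    return 0.5 * (1.0 + math.erf((x - mu) / math.sqrt(2.0)))

def outcome_rates(criterion, noise_mean=0.0, signal_mean=2.0):
    """Return (hit, miss, false-positive, correct-rejection) rates
    for one setting of the criterion."""
    hit  = 1.0 - norm_cdf(criterion, mu=signal_mean)  # true positive
    miss = norm_cdf(criterion, mu=signal_mean)        # false negative
    fp   = 1.0 - norm_cdf(criterion, mu=noise_mean)   # false positive
    cr   = norm_cdf(criterion, mu=noise_mean)         # true negative
    return hit, miss, fp, cr

# A lenient criterion raises both the true-positive and false-positive
# rates; a strict criterion lowers both, exactly as described above.
lenient = outcome_rates(criterion=0.5)
strict  = outcome_rates(criterion=1.5)
assert lenient[0] > strict[0] and lenient[2] > strict[2]
```

With real base rates and costs attached to each of the four cells, proportions like these are what the cost/benefit calculation would run on.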
To provide a slightly more detailed description of signal detection theory, I turned to work David Heeger did while he was at Stanford (he is now at New York University). He provides a medical scenario.
He says to imagine that a radiologist is examining a CT scan, looking for evidence of a tumor. Interpreting CT images is hard and it takes a lot of training. Because the task is so hard, there is always some uncertainty as to what is there or not. Either there is a tumor (signal present) or there is not (signal absent). Either the doctor sees a tumor (they respond “yes”) or does not (they respond “no”). There are four possible outcomes: hit (tumor present and doctor says “yes”), false negative (tumor present and doctor says “no”), false positive (tumor absent and doctor says “yes”), and correct rejection (tumor absent and doctor says “no”).
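These four outcomes amount to a two-by-two table of truth against response. A small sketch (the function name is mine, not Heeger's):

```python
def classify(tumor_present, says_yes):
    """Map a (truth, response) pair to its signal-detection outcome."""
    if tumor_present and says_yes:
        return "hit"
    if tumor_present and not says_yes:
        return "false negative"
    if not tumor_present and says_yes:
        return "false positive"
    return "correct rejection"

# Every trial falls into exactly one of the four cells.
assert classify(True, True) == "hit"
assert classify(False, False) == "correct rejection"
```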
There are two main components to the decision-making process: information acquisition and criterion.
Information acquisition: There is information in the CT scan. For example, healthy lungs have a characteristic shape. The presence of a tumor might distort that shape. Tumors may have different image characteristics: brighter or darker, different texture, etc. With proper training a doctor learns what kinds of things to look for, so with more practice they will be able to acquire more (and more reliable) information. Running another test (e.g., MRI) can also be used to acquire more information. Regardless, acquiring more information is good. The effect of information is to increase the likelihood of getting either a hit or a correct rejection, because it increases the separation between the distributions for the two events.
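The claim that more information helps because it widens the separation between the two distributions can be checked directly. A sketch assuming unit-variance Gaussians, with the criterion held halfway between the two means; the separation values are illustrative:

```python
import math

def norm_cdf(x, mu=0.0):
    """Cumulative probability of a unit-variance normal at x."""
    return 0.5 * (1.0 + math.erf((x - mu) / math.sqrt(2.0)))

def accuracy_at_midpoint(separation):
    """Hit and correct-rejection rates with the criterion halfway
    between the noise mean (0) and the signal mean (separation)."""
    criterion = separation / 2.0
    hit = 1.0 - norm_cdf(criterion, mu=separation)
    correct_rejection = norm_cdf(criterion, mu=0.0)
    return hit, correct_rejection

# More training or an extra test widens the separation, and both
# kinds of correct decision become more likely.
low_info  = accuracy_at_midpoint(1.0)
high_info = accuracy_at_midpoint(3.0)
assert high_info[0] > low_info[0] and high_info[1] > low_info[1]
```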
Criterion: In addition to relying on technology/testing to provide information, the medical profession allows doctors to use their own judgment. Different doctors may feel that the different types of errors are not equal. For example, a doctor may feel that missing an opportunity for early diagnosis may mean the difference between life and death. A false positive, on the other hand, may result only in a routine biopsy operation. There are two kinds of noise factors that contribute to the uncertainty: external and internal noise.
External noise: There are many possible sources of external noise. There can be noise factors that are part of the photographic process, a smudge, or a bad spot on the film. Or something in the person's lung that is fine but just looks a bit like a tumor. All of these are examples of external noise. While the doctor makes every effort possible to reduce the external noise, there is little or nothing they can do to reduce internal noise.
Internal noise: Internal noise refers to the fact that neural responses are noisy. Heeger has us suppose that our doctor has a set of tumor-detector neurons and monitors the response of one of these neurons to determine the likelihood that there is a tumor in the image. These hypothetical tumor detectors will give noisy and variable responses. After one glance at a scan of a healthy lung, our hypothetical tumor detectors might fire 10 spikes per second. After a different glance at the same scan and under the same conditions, these neurons might fire 40 spikes per second.
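Heeger's point is that the detector's output varies from glance to glance even when the image does not change. A toy simulation of this; the firing rates and the Gaussian spread are my illustrative assumptions, not values from Heeger:

```python
import random

random.seed(1)  # reproducible illustration

def detector_response(tumor_present):
    """One noisy reading (spikes per second) from a hypothetical
    tumor-detector neuron; means and spread are illustrative."""
    mean = 30.0 if tumor_present else 15.0
    return max(0.0, random.gauss(mean, 10.0))

# Two glances at the same healthy scan give different readings, and a
# healthy-lung reading can occasionally exceed a tumor-present reading.
first  = detector_response(False)
second = detector_response(False)
```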
Internal response: According to Heeger, there is some internal state, reflected by neural activity somewhere in the brain, that determines the doctor’s impression about whether or not a tumor is present. This neural activity might be concentrated in just a few neurons or it might be distributed across a large number of neurons. Wherever or whatever it is, it is the doctor’s internal response.
Figure 1 shows a graph of two internal response curves. The curve on the left is for the noise-alone (healthy lung) trials, and the curve on the right is for the signal-plus-noise (tumor present) trials. The horizontal axis is labeled internal response and the vertical axis is labeled probability. The height of each curve represents how often that level of internal response will occur.
The noise does not go away. The internal response for the signal-plus-noise case is generally greater but there is still a distribution (a spread) of possible responses. Notice also that the curves overlap, that is, the internal response for a noise-alone trial may exceed the internal response for a signal-plus-noise trial.
Suppose that the doctor chooses a low criterion (Figure 3, top), so that they respond "yes" to almost everything. Then they will almost never miss a tumor when it is present and will therefore have a very high hit rate. On the other hand, saying "yes" to almost everything will greatly increase the number of false positives (potentially leading to unnecessary surgeries). Thus there is a clear cost to increasing the number of hits, and that cost is paid in terms of false positives. If the doctor chooses a high criterion (Figure 3, bottom), then they respond "no" to almost everything. They will rarely make a false positive, but they will also miss many real tumors.
Notice that there is no way the doctor can set the criterion to achieve only hits and no false positives. Thus the doctor cannot always be right. They can adjust the kinds of errors they make by manipulating the criterion, the one part of this diagram that is under their control.
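Sweeping the criterion through its whole range makes this impossibility concrete; the resulting pairs of false-positive and hit rates are what signal detection theory plots as an ROC curve. A sketch under the same unit-variance Gaussian assumption, with an illustrative separation between the distributions:

```python
import math

def norm_cdf(x, mu=0.0):
    """Cumulative probability of a unit-variance normal at x."""
    return 0.5 * (1.0 + math.erf((x - mu) / math.sqrt(2.0)))

def roc_point(criterion, separation=1.5):
    """(false-positive rate, hit rate) for one criterion setting."""
    fp  = 1.0 - norm_cdf(criterion, mu=0.0)
    hit = 1.0 - norm_cdf(criterion, mu=separation)
    return fp, hit

# Move the criterion anywhere you like: as long as the two
# distributions overlap, no setting delivers all hits with
# no false positives.
points = [roc_point(c / 10.0) for c in range(-30, 50)]
assert not any(hit > 0.999 and fp < 0.001 for fp, hit in points)
```

Shifting the criterion only slides the doctor along this curve, trading one kind of error for the other; only more information (a wider separation) moves the whole curve closer to perfection.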
Hammond, K. R. (1996). Human Judgment and Social Policy: Irreducible Uncertainty, Inevitable Error, Unavoidable Injustice. New York: Oxford University Press.