Signal Detection Theory Applied

This post examines a couple of applications of Signal Detection Theory (SDT). Both are technically beyond me, but the similarities in the applications seem instructive. In both articles, SDT is used to evaluate questionnaire-type screening tools. This is not a big surprise, since screening is where most of us first saw applications of statistical hypothesis testing, with its false positives and false negatives. One paper looks at BRCA genetic risk screening and the other at depression screening. In both cases, the screening instruments do not claim to be gold standards, only initial screening of the type that an internal medicine doctor might do. In both cases, there is also the idea that purely probability-based instruments are ineffective because of the biases that most people carry with them. One paper uses a fast and frugal decision tree (FFT), and the other uses three risk categories to convey the gist, as in fuzzy trace theory (FTT). This leaves us with two confusingly similar acronyms: FFT and FTT.

In the first paper (Wolfe et al., 2013), the SDT model was used to assess the effectiveness of BRCA Gist, an intelligent tutoring system designed to improve women’s judgments and understanding of genetic risk for breast cancer. Participants were randomly assigned to the BRCA Gist intelligent tutoring system, the National Cancer Institute (NCI) Web pages, or a control group. An important prerequisite for using SDT is the development of stimuli that fall into objectively defined categories, a difficult but not impossible task. Genetic risk level in this study was validated with the Pedigree Assessment Tool (PAT). The PAT estimates genetic risk on the basis of empirically verified risk factors, including family history of breast cancer and ethnicity. Ordinal gist categories were defined using cutoff values for the PAT. Importantly, the cutoff values were also vetted by a nationally recognized medical expert in women’s health and clinical decision-making as defining low-, medium-, and high-risk categories. FTT suggests that the ordinal gist categories “low,” “medium,” and “high” reflect a level of resolution frequently used by laypeople when assessing levels of risk.

On the basis of these defined categories, 12 cases of hypothetical women who varied in genetic risk for breast cancer were developed: four low-risk, four medium-risk, and four high-risk cases. Careful attention was given to the development of tightly controlled and standardized cases. Each case included the following information: name, age, ethnicity, hometown, and family and personal health history. Given that age is a strong, nongenetic predictor of breast cancer, age was equated across the risk categories. The 12 cases were also equated in terms of word length and linguistic complexity.

In this experiment, BRCA Gist and the NCI website both increased women’s ability to discriminate genetic risk for breast cancer, although BRCA Gist supported differentiating among low, medium, and high risk. However, no differences in response bias emerged among any of the groups, suggesting that BRCA Gist and NCI do not appreciably alter how women weight errors (misses vs. false alarms). Thus, participants did not improve accuracy by simply having a more strict or lenient decision criterion. Without a mathematical model, such as SDT, it would have been impossible to isolate these components of risk judgment.
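
To make the distinction between discrimination and response bias concrete, here is a minimal sketch of the textbook yes/no SDT calculation under the equal-variance Gaussian model. It is not the analysis from the paper, which dealt with three risk categories; the function name, the log-linear correction, and all of the counts are my own assumptions for illustration.

```python
from statistics import NormalDist


def sdt_indices(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, criterion_c) for a yes/no task under the
    equal-variance Gaussian SDT model."""
    # Log-linear correction (add 0.5 per cell) keeps the rates away from
    # 0 and 1, where the inverse normal CDF would be infinite.
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf
    d_prime = z(hit_rate) - z(fa_rate)               # discriminability
    criterion_c = -0.5 * (z(hit_rate) + z(fa_rate))  # response bias
    return d_prime, criterion_c


# Hypothetical counts for two groups, not data from the paper.
trained = sdt_indices(hits=40, misses=10, false_alarms=10, correct_rejections=40)
control = sdt_indices(hits=30, misses=20, false_alarms=20, correct_rejections=30)
print(f"trained: d'={trained[0]:.2f}, c={trained[1]:.2f}")
print(f"control: d'={control[0]:.2f}, c={control[1]:.2f}")
```

In these made-up numbers the two groups differ in d′ while c stays near zero for both, which is the kind of pattern described above: better discrimination without a stricter or more lenient criterion. A positive c would indicate a conservative criterion (more misses), a negative c a lenient one (more false alarms).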

In the other paper (Jenny et al., 2013), the SDT model was used to evaluate three models, a fast and frugal tree, a unit-weight model, and a logistic regression model, each of which categorized a respondent as having depressed mood (or not) based on her responses to items from the German version of the Beck Depression Inventory (BDI). The BDI consists of 21 items, each asking the respondent to indicate which of four statements best describes how she has felt in the last 7 days. Using the BDI helped to assure objectively defined categories. The 5 most valid items were chosen from the BDI for the experiment and incorporated into the models. Only 4 items were incorporated into the fast and frugal tree. It is set out below.


The FFT proved to be highly frugal. Across all categorizations, the tree stopped its search after inspecting an average of just 1.3 cues. One reason for this was that the Dresden Predictor Study participants had a very low depression base rate. (When used in a population with a much higher base rate, the FFT inspected an average of 3 cues.) To compare the models, the hit and false alarm rates of each were determined and discriminability was calculated, as measured by the index d′ (d-prime). The SDT framework also allowed the authors to determine the bias (or decision threshold) of each decision model.
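
The frugality figures above (average number of cues inspected) are easier to picture with a toy tree. What follows is only a sketch of how a fast and frugal tree works in general, not the tree from the paper: the cue names, the exit directions, and the respondents are all hypothetical.

```python
# Each node: (cue name, answer that triggers an exit, label assigned on exit).
# Label True means "depressed mood". Cue names and exits are hypothetical.
TREE = [
    ("cue_1", False, False),
    ("cue_2", True, True),
    ("cue_3", False, False),
    ("cue_4", True, True),
]


def fft_classify(answers, tree=TREE):
    """Return (label, cues_inspected) for one respondent."""
    for inspected, (cue, exit_answer, exit_label) in enumerate(tree, start=1):
        if answers[cue] == exit_answer:
            return exit_label, inspected
    # The final cue has two exits: no match there yields the opposite label.
    return (not tree[-1][2]), len(tree)


def mean_cues(respondents):
    """Average number of cues the tree inspects across respondents."""
    return sum(fft_classify(r)[1] for r in respondents) / len(respondents)


# Hypothetical respondents who leave the tree at different depths.
no_symptoms = {"cue_1": False, "cue_2": False, "cue_3": False, "cue_4": False}
two_cues = {"cue_1": True, "cue_2": True, "cue_3": False, "cue_4": False}
all_four = {"cue_1": True, "cue_2": False, "cue_3": True, "cue_4": False}

# Made-up samples: most people exit at the first cue when the base rate is low.
low_base_rate = [no_symptoms] * 8 + [two_cues, all_four]
high_base_rate = [no_symptoms] * 2 + [two_cues] * 4 + [all_four] * 4
print(mean_cues(low_base_rate), mean_cues(high_base_rate))
```

Because every cue but the last has a single exit, a respondent who answers "no" to the first cue is categorized immediately. In a sample where most respondents look like that, the average number of cues inspected stays close to one, and it rises as more respondents pass the first cue.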

In comparing the models, the unit-weight model achieved the highest accuracy, followed by the FFT, the naïve maximization model, and the logistic regression model. Comparing the models in terms of discriminability yielded a similar pattern: the unit-weight model achieved the highest d′, followed by the FFT, the logistic regression model, and the naïve maximization model. Overall, these analyses show that although the FFT inspected only about one fourth of the information inspected by the compensatory models, it was able to compete. The logistic regression model and the unit-weight model were most lenient in categorizing a respondent as having clinically depressed mood, whereas the FFT was more conservative.
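
For contrast with the tree, a unit-weight model is usually described as a simple tally: every cue counts equally, all cues are inspected, and the total is compared with a threshold. The sketch below assumes that description; the cue names, the respondent, and the thresholds are hypothetical and are not taken from the paper.

```python
CUES = ["cue_1", "cue_2", "cue_3", "cue_4", "cue_5"]  # hypothetical item names


def unit_weight_classify(answers, threshold, cues=CUES):
    """Return (label, cues_inspected); every cue carries a weight of one."""
    score = sum(1 for cue in cues if answers[cue])
    # A compensatory model always looks at every cue before deciding.
    return score >= threshold, len(cues)


# One hypothetical respondent who endorses two of the five items.
respondent = {
    "cue_1": True,
    "cue_2": False,
    "cue_3": True,
    "cue_4": False,
    "cue_5": False,
}
lenient = unit_weight_classify(respondent, threshold=2)  # low bar: flags her
strict = unit_weight_classify(respondent, threshold=4)   # high bar: does not
print(lenient, strict)
```

The threshold plays the role of the decision criterion in SDT terms: lowering it makes the model more lenient (more hits, more false alarms), raising it makes the model more conservative, and in both cases the model inspects all five cues.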

It certainly seems that disease screening tools will continue to multiply. Signal detection theory provides one way to examine them before they are widely deployed. Maybe this will help assure them a longer useful life. With the current muddled state of prostate and breast cancer screening, the hard part of applying SDT may be providing those objectively defined categories at the outset.

Wolfe, C., Reyna, V., Widmer, C., Cedillos, E., & Brust-Renck, P. (2013). “A signal detection analysis of gist-based discrimination of genetic breast cancer risk.” Behav Res 45:613–622.

Jenny, M., Pachur, T., Williams, S., Becker, E., & Margraf, J. (2013). “Simple rules for detecting depression.” Journal of Applied Research in Memory and Cognition 2:149–157.