When to Quit

Hiking toward the snow

This post is based on the paper “Multi-attribute utility models as cognitive search engines” by Pantelis P. Analytis, Amit Kothiyal, and Konstantinos Katsikopoulos, which appeared in Judgment and Decision Making, Vol. 9, No. 5, September 2014, pp. 403–419. It does not look at persistence (post Persistence) or delay (post Decision Delay), which matter when you believe that you need more alternatives. Instead, it looks at when to quit your search and choose among the alternatives already available.

In optimal stopping problems, decision makers are assumed to search randomly to learn the utility of alternatives; in contrast, in one-shot multi-attribute utility optimization, decision makers are assumed to have perfect knowledge of utilities. The authors point out that these two contexts are the two ends of a continuum whose middle remains uncharted: How should people search intelligently when they possess imperfect information about the alternatives? They give the example of hiring a new employee when faced with several dozen applications, each listing a candidate’s skills and credentials. You need interviews to determine each candidate’s potential. What is the best way to organize the interview process? First, you need to decide the order in which you will invite candidates. Then, after each interview, you need to decide whether to make an offer to one of the candidates interviewed so far, thus stopping your search. The first problem is an ordering problem and the second a stopping problem. If credentials were fully informative, you would not need interviews at all, and if credentials were worthless, you would invite people for interviews at random.

The paper considered three well-known models for estimating utility: (i) a linear multi-attribute utility model (MLU), one of the cornerstone models and the equivalent of multiple linear regression; (ii) equal weighting of attributes (EW), a special case of MLU in which all decision weights are equal; and (iii) a single-attribute utility heuristic (SA), which is akin to the lexicographic heuristic and the take-the-best heuristic, except that SA resolves ties between alternatives by choosing at random. To measure the performance of the three models, the authors used 12 real-world decision problems: beer aroma, cheese taste, CPU efficiency, fluorescent lamp lifetime, machine productivity, octane quality, olive oil quality, potato taste, red wine quality, white wine quality, seed yield, and tea quality.
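To make the three models concrete, here is a minimal Python sketch of how each one could turn attribute values into a search order. It is an illustration under assumptions, not the authors' code: the function names, the candidate data, and the MLU weights are all hypothetical, the weights are taken as given rather than fitted by regression, and the attributes are assumed to be on comparable scales so that equal weighting is meaningful.

```python
import random

def mlu_scores(alternatives, weights, intercept=0.0):
    """Linear multi-attribute utility: a weighted sum of the attributes.
    The weights are assumed to come from a regression fitted elsewhere."""
    return [intercept + sum(w * x for w, x in zip(weights, alt))
            for alt in alternatives]

def ew_scores(alternatives, directions):
    """Equal weighting: every attribute counts the same.
    Each direction is +1 (more is better) or -1 (less is better)."""
    return [sum(d * x for d, x in zip(directions, alt))
            for alt in alternatives]

def sa_scores(alternatives, attribute, direction=1):
    """Single attribute: rank on one attribute and ignore the rest."""
    return [direction * alt[attribute] for alt in alternatives]

def search_order(scores, rng=random):
    """Indices from highest to lowest score, ties broken at random."""
    keyed = [(-s, rng.random(), i) for i, s in enumerate(scores)]
    return [i for _, _, i in sorted(keyed)]

# Hypothetical candidates described by three attributes on a 0-1 scale.
candidates = [(0.9, 0.2, 0.5), (0.4, 0.8, 0.6), (0.9, 0.1, 0.1)]
weights = (0.6, 0.3, 0.1)  # illustrative MLU weights, not estimated from data

mlu_order = search_order(mlu_scores(candidates, weights))
ew_order = search_order(ew_scores(candidates, (1, 1, 1)))
sa_order = search_order(sa_scores(candidates, attribute=0))
print(mlu_order, ew_order, sa_order)
```

With these made-up numbers the three models disagree about whom to interview first: MLU starts with candidate 0, EW starts with candidate 1, and SA picks at random between candidates 0 and 2 because they tie on the single attribute it looks at.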

The paper assumes that decision makers first estimate the utility of each available alternative and then search the alternatives in order of their estimated utility until expected benefits are outweighed by search costs. MLU performed best on average, but its simplifications, EW and SA, each had regions of superior performance. In the individual environments, the performance of the models varied significantly: EW performed best in four environments and SA in one. In general, performance in the binary choice task, in which a model picks the better of two alternatives, is a good proxy for performance in the full search task. The discrepancies observed can be attributed to the fact that in the full search task the alternatives searched early on contribute disproportionately to the success of a model, whereas in binary choices every possible pairwise choice in the data set contributes equally. The SA model tends to perform well when a simply or cumulatively dominating alternative is present, or when there are high correlations between the single attribute and all other attributes. EW tends to perform well when the variability in cue validities is small or when there are high intercorrelations between all the attributes.
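The difference between the two performance measures can be seen in a small Python sketch. It assumes a simple scoring convention that is not taken from the paper: a pair counts as correct when the model ranks the truly better alternative higher, a tie in the estimates counts as half, and pairs with equal true utility are skipped. The function name and the numbers are hypothetical.

```python
from itertools import combinations

def pairwise_accuracy(estimated, true):
    """Share of pairs in which the alternative with the higher estimate
    is also the one with the higher true utility."""
    correct = 0.0
    total = 0
    for i, j in combinations(range(len(true)), 2):
        if true[i] == true[j]:
            continue  # neither alternative is better, so skip the pair
        total += 1
        if estimated[i] == estimated[j]:
            correct += 0.5  # the model would have to guess
        elif (estimated[i] > estimated[j]) == (true[i] > true[j]):
            correct += 1.0
    return correct / total if total else float("nan")

true_utility = [5, 3, 4, 1]           # the true ranking is 0, 2, 1, 3
top_swapped = [3.5, 3.0, 4.0, 1.0]    # confuses the two best alternatives
bottom_swapped = [4.0, 1.0, 3.0, 2.0] # confuses the two worst alternatives

print(pairwise_accuracy(top_swapped, true_utility))
print(pairwise_accuracy(bottom_swapped, true_utility))
```

Both sets of estimates get five of the six pairs right, so they look identical in binary choice. In an ordered search, however, the first set sends the decision maker to the second-best alternative first, while the second set starts with the best one. That is one way errors near the top of the ranking can matter more in the full search task.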

As long as there is some uncertainty about the exact utility of the alternatives, it may pay to sample some of them to learn their utility. In the Analytis et al. model, the initial preferences guide the search process but are also subject to revision when the true utility of the sampled alternatives is revealed. Their modeling approach implies that only the alternatives that have been sampled by the decision makers stand a chance of being selected. This subset is commonly called the consideration set. The decision makers examine the alternatives in it more closely and finally choose one of them. Consideration-set models have often been found to outperform, in fitting and prediction, discrete choice models in which the decision makers are assumed to consider all the alternatives.

In the stopping policy presented in the paper, the decision maker reevaluates at every search step the returns from sampling the next alternative in line. This task may appear demanding compared with the optimal threshold rule. However, it has the same structure as a simplified version of a signal-detection problem, a kind of task in which humans are known to perform fairly well. Thus, the authors believe that the stopping policy could be psychologically plausible.
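A stopping policy of this kind can be sketched in a few lines of Python. The sketch rests on assumptions of its own rather than on the paper's exact formulation: the true utility of an unsampled alternative is taken to be normally distributed around its estimate with a known standard deviation `sigma`, the search cost `cost` is a fixed amount per sample, and the first alternative is always sampled. The function names and the numbers are hypothetical.

```python
import math

def expected_gain(estimate, best_so_far, sigma):
    """E[max(U - best_so_far, 0)] for U ~ Normal(estimate, sigma^2).
    The normal error model is an assumption of this sketch."""
    if sigma <= 0:
        return max(estimate - best_so_far, 0.0)
    z = (estimate - best_so_far) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return (estimate - best_so_far) * cdf + sigma * pdf

def ordered_search(estimates, true_utilities, sigma, cost):
    """Sample alternatives from highest to lowest estimate and stop once
    the expected gain from the next one falls below the search cost.
    Returns the index of the chosen alternative and the sampled indices."""
    order = sorted(range(len(estimates)),
                   key=lambda i: estimates[i], reverse=True)
    sampled = []
    best_index, best_utility = None, -math.inf
    for i in order:
        # Reevaluate the return from sampling the next alternative in line.
        if sampled and expected_gain(estimates[i], best_utility, sigma) < cost:
            break
        sampled.append(i)  # pay the cost and learn the true utility
        if true_utilities[i] > best_utility:
            best_index, best_utility = i, true_utilities[i]
    return best_index, sampled

estimates = [7.0, 6.5, 5.0, 3.0]       # hypothetical estimated utilities
true_utilities = [6.0, 7.5, 5.5, 2.0]  # revealed only when sampled
print(ordered_search(estimates, true_utilities, sigma=1.0, cost=0.2))
```

In this example the first alternative disappoints slightly, so the second is still worth a look; once it turns out to be better than expected, the expected gain from the third drops well below the cost and the search stops with two alternatives in the consideration set. Because the estimates are visited in descending order and the best utility found never decreases, the expected gain can only shrink, which is why stopping at the first alternative that fails the test is enough.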

The authors did not discuss cases where the decision makers have to pay a search cost to learn additional attributes before moving on to examine further alternatives. Further, as is usual in search problems, they assumed that the decision maker learns the exact utility of an alternative after paying a search cost and examining it. However, there are many dynamic decision-making contexts where the cost is defined internally, as the opportunity cost of consuming an inferior alternative. In these environments information acquisition can be inherently noisy, and decision makers may want to sample the alternatives repeatedly. Such decision-making contexts are commonly referred to as multi-armed bandits. In some environments decision makers may not know the utility weights but rather learn them along the way as they examine new alternatives. This suggests that previous findings on the ecological rationality of choice and inference strategies are also relevant to the search task.