Does interaction matter in collective decision-making?

This post is based on a paper: “Does interaction matter? Testing whether a confidence heuristic can replace interaction in collective decision-making.” The authors are Dan Bang, Riccardo Fusaroli, Kristian Tylén, Karsten Olsen, Peter E. Latham, Jennifer Y.F. Lau, Andreas Roepstorff, Geraint Rees, Chris D. Frith, and Bahador Bahrami. The paper appeared in Consciousness and Cognition 26 (2014) 13–23.

The paper notes a growing interest in the mechanisms underlying the “two-heads-better-than-one” (2HBT1) effect, which refers to the ability of dyads to make more accurate decisions than either of their members. Bahrami’s 2010 study, using a perceptual task in which two observers had to detect a visual target, showed that two heads become better than one by sharing their ‘confidence’ (i.e., an internal estimate of the probability of being correct), thus allowing them to identify who is more likely to be correct in a given situation. This tendency to evaluate the reliability of information by the confidence with which it is expressed has been termed the ‘confidence heuristic’. I do not recall having seen the acronym 2HBT1 before, but it brings to mind the posts Dialectical Bootstrapping, in which one forms one’s own dyad, and Bootstrapping, in which one uses expert judgment, as well as Scott Page’s work discussed in Diversity or Systematic Error? However, this is the first discussion of a confidence heuristic.

In a 2012 study, Koriat asked isolated observers to estimate the degree of confidence in their perceptual decisions. Participants, all of whom had received the same sequence of stimuli, were afterwards paired into virtual dyads so that they matched each other in terms of their ‘reliability’ (i.e., the reliability of their individual decisions about the visual target). To remove individual biases in confidence, their confidence estimates were normalised, so that they shared the same mean and standard deviation, before being submitted to the Maximum Confidence Slating (MCS) algorithm, which selected the decision of the more confident member of the virtual dyad on every trial. Without interaction, the MCS algorithm yielded a robust 2HBT1 effect. In the study at hand, Bang et al. tested whether this algorithm could in practice replace interaction in collective decision-making. Importantly, such a formula for collective choice, if effective, would not be susceptible to the individual biases that may impair interaction, and could readily be used by decision makers, such as jurors, medical doctors or financial investors, who have to combine different opinions in limited time.
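The MCS procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used by Koriat or by Bang et al.: the post does not say how the normalisation was done, so z-scoring each member's confidence ratings is assumed here as one way to give both members the same mean and standard deviation, and ties are arbitrarily resolved in favour of member A. The function names and the data are hypothetical.

```python
from statistics import mean, pstdev


def zscore(values):
    """Rescale ratings to mean 0 and standard deviation 1 (assumed normalisation)."""
    m = mean(values)
    s = pstdev(values)
    if s == 0:
        # A member who always reports the same confidence carries no signal.
        return [0.0 for _ in values]
    return [(v - m) / s for v in values]


def mcs(decisions_a, conf_a, decisions_b, conf_b):
    """Maximum Confidence Slating: on each trial, take the decision of the
    member whose normalised confidence is higher.
    Ties go to member A, which is an arbitrary assumption of this sketch."""
    za, zb = zscore(conf_a), zscore(conf_b)
    return [
        da if ca >= cb else db
        for da, ca, db, cb in zip(decisions_a, za, decisions_b, zb)
    ]


# Hypothetical data: member A habitually reports high confidence (5 or 6),
# member B habitually reports low confidence (1 to 3).
decisions_a = [1, 0, 1, 1]
conf_a = [5, 6, 5, 6]
decisions_b = [0, 1, 0, 0]
conf_b = [1, 3, 2, 2]

joint = mcs(decisions_a, conf_a, decisions_b, conf_b)
print(joint)  # [1, 1, 0, 1]
```

In this made-up example, comparing raw ratings would follow member A on every trial, simply because A uses the upper end of the scale. After normalisation, member B is followed on the second and third trials, where B's confidence is higher relative to B's own usual level.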

In this study, Bang et al. changed the experiment by testing the efficacy of the MCS algorithm without matching dyad members in terms of their reliability, and compared the responses advised by the algorithm with those reached by the dyad members through interaction. The study sought to answer two questions: does the success of the MCS algorithm depend on the similarity of dyad members’ reliabilities, and does the algorithm fare just as well as interacting dyad members?

The experiments found that the fraction of disagreement trials in which the dyad followed member A rather than member B depended on the members’ relative ability to estimate the reliability of their own decisions (so-called metacognitive accuracy), indicating that dyad members used the credibility of each other’s confidence estimates to guide their joint decisions. The experiments also found that interaction was more robust than the MCS algorithm to differences in reliability. For similar dyad members, the decisions reached through interaction were no more accurate than those advised by the MCS algorithm. However, for dissimilar dyad members, the decisions reached through interaction were considerably more accurate than those advised by the MCS algorithm. The algorithm takes each confidence estimate at face value, so it errs whenever the more confident member is the less reliable one; interacting individuals may take such misestimates of confidence into account. For example, one study has shown that mock jurors find witnesses who are confident about erroneous testimony less credible than witnesses who are not confident about it. While models of collective decision-making have identified the ‘arbitration’ of confidence estimates as key to collective performance, the authors’ findings suggest that the ‘weighting’ of confidence estimates is equally important.

The authors note that their results may not carry over to non-perceptual domains. Research has shown that individuals are ‘overconfident’ about the accuracy of their knowledge-based judgements but ‘underconfident’ about the accuracy of their perceptual judgements. These discrepant patterns of confidence have led to the hypothesis that different types of information determine confidence in knowledge and perception. For example, Juslin and Olsson propose a model of confidence in which perceptual judgements are dominated by ‘Thurstonian’ uncertainty (internal noise, such as stochastic variance in the sensory systems), whereas knowledge-based judgements are dominated by ‘Brunswikian’ uncertainty (external noise, such as less-than-perfect correlations between features in the environment).

To summarize, Bang et al. found that, for individuals of nearly equal reliability, the decisions advised by the confidence heuristic were just as accurate as those reached through interaction, but for individuals with different reliabilities, the decisions advised by the confidence heuristic were less accurate than those reached through interaction. Interacting individuals took into account the credibility of each other’s confidence when making their joint decisions, presumably making them less susceptible to situations in which the more confident group member was the less competent one.