This is the first post in quite a while. I have been trying to consolidate and integrate my past inventory of posts into what I am calling papers. This has turned out to be time consuming and difficult, since I really have to write something. As part of that effort, I have been reading *Surfing Uncertainty: Prediction, Action, and the Embodied Mind* by Andy Clark. (I note that this book is not designed for the popular press; it is quite challenging.) Clark refers to the extensive literature on decision making and points out a part of that literature that was unknown to me. On that recommendation, I sought out Samuel J. Gershman and Nathaniel D. Daw, “Perception, Action and Utility: The Tangled Skein” (2012), in M. Rabinovich, K. Friston & P. Varona, editors, *Principles of Brain Dynamics: Global State Interactions*. Cambridge, MA: MIT Press.

Gershman and Daw focus on two aspects of decision theory that have important implications for its implementation in the brain:

1. Decision theory implies a strong form of separation between probabilities and utilities. In particular, the posterior must be computed before (and hence independently of) the expected utility. This assumption is sometimes known as probabilistic sophistication. It means that I can state how much enjoyment I would derive from having a picnic in sunny weather, independently of my belief that it will be sunny tomorrow. This framework supports a sequentially staged view of the problem, with perception guiding evaluation.

2. The mathematics that formalizes decision making under uncertainty, Bayes' theorem, is usually applied with Gaussian or multinomial distributions. Gershman and Daw note that these assumptions are not generally applicable to real-world decision-making tasks, where distributions may not take any convenient parametric form. This means that if the brain is to perform the necessary calculations, it must employ some form of approximation.

Statistical decision theory, to be plausibly implemented in the brain, requires segregated representations of probability and utility, and a mechanism for performing approximate inference.
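To make the two requirements concrete, here is a minimal sketch of the staged computation using the picnic example. All of the numbers, names, and the forecast variable are my own illustrative assumptions, not anything from Gershman and Daw. Stage 1 computes the posterior with no reference to utility, stage 2 weights utilities by that posterior, and the sampling function stands in for "some form of approximation."

```python
import random

# Hypothetical numbers for the picnic example; none come from the source.
prior = {"sunny": 0.5, "rainy": 0.5}        # belief before hearing a forecast
likelihood = {"sunny": 0.8, "rainy": 0.3}   # P(forecast says "sun" | weather)

# Utilities are stated without reference to any belief about the weather.
utility = {
    ("picnic", "sunny"): 10.0, ("picnic", "rainy"): -5.0,
    ("stay_home", "sunny"): 2.0, ("stay_home", "rainy"): 3.0,
}
actions = ["picnic", "stay_home"]

def posterior(prior, likelihood):
    """Stage 1: Bayes' rule, computed with no reference to utility."""
    unnorm = {s: prior[s] * likelihood[s] for s in prior}
    z = sum(unnorm.values())
    return {s: p / z for s, p in unnorm.items()}

def expected_utility(action, post):
    """Stage 2: weight each state's utility by its posterior probability."""
    return sum(post[s] * utility[(action, s)] for s in post)

def sampled_expected_utility(action, post, n=10_000, seed=0):
    """Approximate stage 2 by averaging utility over sampled states."""
    rng = random.Random(seed)
    states = list(post)
    draws = rng.choices(states, weights=[post[s] for s in states], k=n)
    return sum(utility[(action, s)] for s in draws) / n

post = posterior(prior, likelihood)
best = max(actions, key=lambda a: expected_utility(a, post))
```

With only two states the exact sum is trivial, so the sampler is there only to show the shape of an approximation that would still work when the posterior has no convenient parametric form.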

The full story, however, is not so simple. First, abundant evidence from vision indicates that reward modulation occurs at all levels of the visual hierarchy, including V1 and, even earlier, the lateral geniculate nucleus. Gershman and Daw suggest that the idea of the far-downstream lateral intraparietal area (LIP) as a pure representation of posterior state probability is dubious. Other work that varies the rewarding outcomes of actions shows that neurons in LIP are modulated by the probability and amount of reward expected for an action, so their activity is probably better thought of as related to expected utility than to state probability per se. Recall, too, that area LIP is only one synapse downstream from the instantaneous motion energy representation in MT. If LIP already represents expected utility, there seems to be no candidate for an intermediate stage of pure probability representation.

A different source of contrary evidence comes from behavioral economics. The classic Ellsberg paradox revealed preferences in human choice behavior that are not probabilistically sophisticated. The example given by Ellsberg involves drawing a ball from an urn containing 30 red balls and 60 black or yellow balls in an unknown proportion. Subjects are asked to choose between pairs of gambles (A vs. B or C vs. D) drawn from the following set:

Gamble A pays $100 if the ball is red; gamble B pays $100 if it is black; gamble C pays $100 if it is red or yellow; gamble D pays $100 if it is black or yellow.

Experimentally, subjects prefer A over B and D over C. The intuitive reasoning is that in gambles A and D the probability of winning $100 is known (unambiguous), whereas in B and C it is unknown (ambiguous). No subjective probability distribution can produce this pattern of preferences, which is widely regarded as violating the assumption of probability-utility segregation in statistical decision theory. (See the post Allais and Ellsberg Paradoxes.)
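The claim that no subjective distribution fits can be checked in two lines. This is a sketch assuming the standard form of the gambles (A pays $100 on red, B on black, C on red or yellow, D on black or yellow). The symbols are my own notation: $p_R$, $p_B$, $p_Y$ are the subject's subjective probabilities for each color, and $u > 0$ is the utility of $100 relative to winning nothing.

```latex
\begin{align*}
A \succ B &\;\Rightarrow\; p_R\,u > p_B\,u \;\Rightarrow\; p_R > p_B,\\
D \succ C &\;\Rightarrow\; (p_B + p_Y)\,u > (p_R + p_Y)\,u \;\Rightarrow\; p_B > p_R.
\end{align*}
```

The two conclusions contradict each other, so whatever the subject believes about the black-to-yellow proportion, an expected utility maximizer with a single fixed belief cannot hold both preferences.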

Gershman and Daw suggest two ways that the separation between probabilities and utilities might be weakened or abandoned:

A. Decision-making as probabilistic inference

The idea here is that by transforming the utility function appropriately, one can treat it as a probability density function parameterized by the action and hidden state. Consequently, maximizing the “probability” of utility with respect to action, while marginalizing the hidden state, is formally equivalent to maximizing the expected utility. Although this is more or less an algebraic maneuver, it has profound implications for the organization of decision-making circuitry in the brain. The insight is that what appear to be dedicated motivational and valuation circuits may instead be regarded as parallel applications of the same underlying computational mechanisms over effectively different likelihood functions.
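Here is one way the algebraic maneuver might look in practice. This is my own sketch, not the chapter's formulation: I assume the "appropriate transformation" is a simple rescaling of utility into the range 0 to 1, so that it can be read as the probability of a hypothetical binary "reward" event given the action and hidden state. The posterior, utilities, and names are made up for illustration.

```python
# Hypothetical posterior over hidden states and utility table (illustrative only).
post = {"s1": 0.6, "s2": 0.4}
utility = {
    ("a1", "s1"): 4.0, ("a1", "s2"): -2.0,
    ("a2", "s1"): 1.0, ("a2", "s2"): 3.0,
}
actions = ["a1", "a2"]

u_min, u_max = min(utility.values()), max(utility.values())

def reward_likelihood(action, state):
    """Rescaled utility read as P(r = 1 | action, state), a value in [0, 1]."""
    return (utility[(action, state)] - u_min) / (u_max - u_min)

def expected_utility(action):
    """Ordinary decision theory: posterior-weighted utility."""
    return sum(post[s] * utility[(action, s)] for s in post)

def prob_reward(action):
    """Inference view: marginalize the hidden state to get P(r = 1 | action)."""
    return sum(post[s] * reward_likelihood(action, s) for s in post)

best_by_utility = max(actions, key=expected_utility)
best_by_inference = max(actions, key=prob_reward)
```

Because the rescaling is linear with a positive slope, the "probability of reward" is just expected utility shifted and scaled, so both views pick the same action. A nonlinear transformation would not in general preserve that equivalence.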

Karl Friston builds on this foundation to assert a much more provocative concept: that for biologically evolved organisms, the desired equilibrium is by definition just the species’ evolved equilibrium state distribution. The mathematical equivalence rests on the evolutionary argument that hidden states with high prior probability also tend to have high utility. This situation arises through a combination of evolution and ontogenetic development, whereby the brain is immersed in a “statistical bath” that prescribes the landscape of its prior distribution. Because agents who find themselves more often in congenial states are more likely to survive, they inherit (or develop) priors with modes located at the states of highest congeniality. Conversely, states that are surprising given an organism's evolutionary niche, such as being out of water for a fish, are maladaptive and should be avoided. (See post Neuromodulation.)

B. The costs of representation and computation

Probabilistic computations make exorbitant demands on a limited resource, and in a real physiological and psychological sense, these demands incur a cost that debits the utility of action. According to Gershman and Daw, humans are “cognitive misers” who seek to avoid effortful thought at every opportunity, and this effort diminishes the same neural signals that are excited by reward. For instance, one can study whether a rat that has learned to press a lever for food while hungry will continue to do so when full. A full probabilistic representation over outcomes will adjust its expected utilities to the changed outcome value, whereas representing utilities only in expectation can preclude this adjustment and so predicts hapless working for unwanted food. The upshot of many such experiments is that the brain adopts both approaches, depending on circumstances. Which circumstances elicit which approach can be explained by a sort of meta-optimization over the costs (e.g., extra computation) of maintaining the full representation relative to its benefits (better statistical accuracy).
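The lever-pressing experiment can be caricatured in a few lines. This is a toy sketch with numbers I invented (the food probability, effort cost, and utilities are all hypothetical). It contrasts a full representation, which recomputes expected utility from outcome probabilities and current outcome values, with a cached one that stores only the expectations learned while hungry.

```python
# Hypothetical task: pressing the lever yields food with probability 0.9.
outcome_prob = {"press": {"food": 0.9, "nothing": 0.1},
                "rest":  {"food": 0.0, "nothing": 1.0}}
effort_cost = {"press": 1.0, "rest": 0.0}

hungry = {"food": 5.0, "nothing": 0.0}   # outcome utilities while hungry
sated = {"food": 0.0, "nothing": 0.0}    # food is devalued after feeding

def full_values(outcome_utility):
    """Full representation: recompute expected utility from outcome probabilities."""
    return {a: sum(p * outcome_utility[o] for o, p in outcome_prob[a].items())
               - effort_cost[a]
            for a in outcome_prob}

def choose(values):
    """Pick the action with the highest value."""
    return max(values, key=values.get)

# Cached representation: only the expectations learned while hungry are stored.
cached_values = full_values(hungry)

choice_full_sated = choose(full_values(sated))   # re-evaluated with current utilities
choice_cached_sated = choose(cached_values)      # stale expectations, never updated
```

In this toy version the full representation stops pressing once the food is worthless, while the cached one keeps working for it. The price of the full version is the extra computation in `full_values` every time the outcome values change, which is the cost side of the meta-optimization.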