I discovered that I was a celiac a few months ago and accordingly I am on a gluten-free diet. Compared to most conditions discovered in one’s late sixties, celiac disease seems almost inconsequential. However, it fits into the idea of prediction error minimization. In effect, the environment has changed and I need to change my predictions. Bread and beer are now bad. My automatic, intuitive prediction machine has not been getting it right. It is disorienting. I can no longer “See food, eat food.” I can change the environment at home, but in the wider world I need to be aware. My brain needs to dedicate perpetual, and at least for now conscious, effort to this cause. It is almost as if I became instantly even dumber. It makes me more self-absorbed in social settings that involve food. Not known for my social skills, I have been a good listener, but now not so much. On my Dad’s 94th birthday, I ate a big piece of German chocolate cake, enjoyed it thoroughly, and then remembered that it was not allowed. In my particular case, I do not get sick or nauseated when I make such a mistake, so my commitment is always under threat. This demands an even larger share of my brain to stay compliant. My main incentive to comply is those photos of my scalloped small intestine. I note that I was diagnosed after years of trying to figure out my low ferritin levels. (It will be extremely disappointing if I find that my ferritin is still low.)
This post is based on a paper written by Andy Clark, author of Surfing Uncertainty (see the paper Predictive Processing for a fuller treatment): “A nice surprise? Predictive processing and the active pursuit of novelty,” which appeared in Phenomenology and the Cognitive Sciences, pp. 1-14. DOI: 10.1007/s11097-017-9525-z. For me this is a chance to learn how Andy Clark has polished up his arguments since his book. It also strikes me as connected to my recent posts on Curiosity and Creativity.
Clark and Friston (see post The Prediction Machine) depict human brains as devices that minimize prediction error signals: signals that encode the difference between actual and expected sensory stimulation. But we know that we are attracted to the unexpected. We humans often seem to actively seek out surprising events, deliberately courting novel and exciting streams of sensory stimulation. So how does that square with the idea of minimizing prediction error?
Understanding or inferring the intentions, feelings, and beliefs of others is a hallmark of human social cognition, often referred to as having a Theory of Mind (ToM). ToM has been described as the cognitive ability to infer the intentions and beliefs of others through processing of their physical appearance, clothes, and bodily and facial expressions. Of course, the repertoire of hypotheses in our ToM is borrowed from the hypotheses that cause our own behavior.
But how can processing of internal visceral/autonomic information (interoception) contribute to the understanding of others’ intentions? The authors consider interoceptive inference as a special case of active inference. Friston (see post Prediction Error Minimization) has theorized that the goal of the brain is to minimize prediction error and that this can be achieved both by changing predictions to match the observed data and, via action, changing the sensory input to match predictions. When you drop the knife and then catch it with the other hand, you are using active inference.
This post is based on a paper written by Fabienne Picard and Karl Friston, entitled: “Predictions, perceptions, and a sense of self,” that appeared in Neurology® 2014;83:1112–1118. Karl Friston is one of the prime authors of predictive processing and Fabienne Picard is a doctor known for studying epilepsy. The ideas here are not new or even new to this blog, but the paper and specifically the figure below provide a good summary of the ideas of predictive processing. Andy Clark’s Surfing Uncertainty is the place to go if the subject interests you.
This is the first post in quite a while. I have been trying to consolidate and integrate my past inventory of posts into what I am calling papers. This has turned out to be time consuming and difficult since I really have to write something. As part of that effort, I have been reading Surfing Uncertainty: Prediction, Action, and the Embodied Mind by Andy Clark. (I note that this book is not designed for the popular press; it is quite challenging.) Clark refers to the extensive literature on decision making and points out a part of that literature unknown to me. With that recommendation, I sought out “Perception, Action and Utility: The Tangled Skein” (2012) by Samuel J. Gershman and Nathaniel D. Daw, in M. Rabinovich, K. Friston & P. Varona, editors, Principles of Brain Dynamics: Global State Interactions. Cambridge, MA: MIT Press.
Gershman and Daw focus on two aspects of decision theory that have important implications for its implementation in the brain:
1. Decision theory implies a strong form of separation between probabilities and utilities. In particular, the posterior must be computed before (and hence independently of) the expected utility. This assumption is sometimes known as probabilistic sophistication. It means that I can state how much enjoyment I would derive from having a picnic in sunny weather, independently of my belief that it will be sunny tomorrow. This framework supports a sequentially staged view of the problem: perception guiding evaluation.
2. The mathematics that formalizes decision making under uncertainty, Bayesian decision theory, generally relies on Gaussian or multinomial assumptions about distributions. Gershman and Daw note that these assumptions are not generally applicable to real-world decision-making tasks, where distributions may not take any convenient parametric form. This means that if the brain is to perform the necessary calculations, it must employ some form of approximation.
Statistical decision theory, to be plausibly implemented in the brain, requires segregated representations of probability and utility, and a mechanism for performing approximate inference.
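To make the two-stage separation concrete, here is a minimal Python sketch. The weather/picnic framing comes from the probabilistic-sophistication example above, but all of the numbers (priors, likelihoods, utilities) are invented for illustration: the posterior over hidden states is computed first, and utilities enter only afterward.

```python
# Step 1: compute the posterior over hidden states (the weather),
# independently of any utilities (probabilistic sophistication).
# All numbers below are invented for illustration.
prior = {"sunny": 0.6, "rainy": 0.4}
likelihood = {"sunny": 0.7, "rainy": 0.2}   # P(observed cloud cover | weather)
unnorm = {s: prior[s] * likelihood[s] for s in prior}
z = sum(unnorm.values())
posterior = {s: unnorm[s] / z for s in unnorm}

# Step 2: only now bring in utilities and evaluate actions.
utility = {("picnic", "sunny"): 10, ("picnic", "rainy"): -5,
           ("stay_in", "sunny"): 2,  ("stay_in", "rainy"): 2}

def expected_utility(action):
    # Utilities are weighted by the already-computed posterior.
    return sum(posterior[s] * utility[(action, s)] for s in posterior)

best = max(["picnic", "stay_in"], key=expected_utility)
```

Note that the posterior is fixed before `expected_utility` is ever called; changing the utilities would not change it, which is exactly the segregation Gershman and Daw describe.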
The full story, however, is not so simple. First, abundant evidence from vision indicates that reward modulation occurs at all levels of the visual hierarchy, including V1 and, even before that, the lateral geniculate nucleus. Gershman and Daw suggest that the idea of far-downstream LIP (lateral intraparietal area) as a pure representation of posterior state probability is dubious. Indeed, other work varying rewarding outcomes for actions shows that neurons in LIP are modulated by the probability and amount of reward expected for an action, and are probably better thought of as related to expected utility than to state probability per se. Then recall that area LIP is only one synapse downstream from the instantaneous motion-energy representation in MT. If it already represents expected utility, there seems to be no candidate for an intermediate stage of pure probability representation.
A different source of contrary evidence comes from behavioral economics. The classic Ellsberg paradox revealed preferences in human choice behavior that are not probabilistically sophisticated. The example given by Ellsberg involves drawing a ball from an urn containing 30 red balls and 60 black or yellow balls in an unknown proportion. Subjects are asked to choose between pairs of gambles (A vs. B or C vs. D) drawn from the following set:
- A: win $100 if the ball is red
- B: win $100 if the ball is black
- C: win $100 if the ball is red or yellow
- D: win $100 if the ball is black or yellow
Experimentally, subjects prefer A over B and D over C. The intuitive reasoning is that in gambles A and D, the probability of winning $100 is known (unambiguous), whereas in B and C it is unknown (ambiguous). There is no subjective probability distribution that can produce this pattern of preferences. This is widely regarded as violating the assumption of probability-utility segregation in statistical decision theory. (See post Allais and Ellsberg Paradoxes).
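The impossibility is easy to check mechanically. The sketch below (my own illustration, not from the paper) sweeps over every possible subjective belief about the number of black balls and confirms that none of them reproduces both preferences:

```python
# Urn: 30 red balls, 60 black or yellow balls in unknown proportion.
# Win $100 on: A if red, B if black, C if red or yellow, D if black or yellow.
# Question: does any subjective probability for "black" make A > B and D > C?
p_red = 30 / 90
consistent = []
for n_black in range(61):                 # candidate number of black balls
    p_black = n_black / 90
    p_yellow = 60 / 90 - p_black
    a_over_b = p_red > p_black                           # prefer A to B
    d_over_c = (p_black + p_yellow) > (p_red + p_yellow) # prefer D to C
    if a_over_b and d_over_c:
        consistent.append(p_black)

# consistent comes back empty: A > B requires p_black < 1/3,
# while D > C requires p_black > 1/3 (the yellows cancel).
```

Preferring A to B demands that black be judged less probable than 1/3, while preferring D to C demands the opposite, so no single subjective distribution fits the observed pattern.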
Gershman and Daw suggest two ways that the separation between probabilities and utilities might be weakened or abandoned:
A. Decision-making as probabilistic inference
The idea here is that by transforming the utility function appropriately, one can treat it as a probability density function parameterized by the action and hidden state. Consequently, maximizing the “probability” of utility with respect to action, while marginalizing the hidden state, is formally equivalent to maximizing the expected utility. Although this is more or less an algebraic maneuver, it has profound implications for the organization of decision-making circuitry in the brain. The insight is that what appear to be dedicated motivational and valuation circuits may instead be regarded as parallel applications of the same underlying computational mechanisms over effectively different likelihood functions.
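A toy version of the maneuver, with invented states, actions, and numbers: rescale utilities into [0, 1] so they can serve as a pseudo-likelihood P(optimal | action, state), then marginalize out the hidden state. Because the rescaling is monotone and affine, the action that maximizes the "probability" of being optimal is the same one that maximizes expected utility.

```python
# Planning-as-inference sketch: treat (rescaled) utility as a likelihood.
# States s1/s2, actions a1/a2, and all numbers are invented for illustration.
prior = {"s1": 0.7, "s2": 0.3}            # belief over hidden states
utility = {("a1", "s1"): 4.0, ("a1", "s2"): 0.0,
           ("a2", "s1"): 1.0, ("a2", "s2"): 5.0}

# Rescale utilities into [0, 1] so they can play the role of a
# pseudo-likelihood P(optimality = 1 | action, state).
u_min, u_max = min(utility.values()), max(utility.values())
pseudo = {k: (v - u_min) / (u_max - u_min) for k, v in utility.items()}

def expected_utility(a):
    return sum(prior[s] * utility[(a, s)] for s in prior)

def prob_optimal(a):
    # Marginalize the hidden state under the pseudo-likelihood.
    return sum(prior[s] * pseudo[(a, s)] for s in prior)

# Both criteria select the same action: the transform is monotone affine,
# and expectation is linear, so the argmax is preserved.
best_eu = max(["a1", "a2"], key=expected_utility)
best_inf = max(["a1", "a2"], key=prob_optimal)
```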
Karl Friston builds on this foundation to assert a much more provocative concept: that for biologically evolved organisms, the desired equilibrium is by definition just the species’ evolved equilibrium state distribution. The mathematical equivalence rests on the evolutionary argument that hidden states with high prior probability also tend to have high utility. This situation arises through a combination of evolution and ontogenetic development, whereby the brain is immersed in a “statistical bath” that prescribes the landscape of its prior distribution. Because agents who find themselves more often in congenial states are more likely to survive, they inherit (or develop) priors with modes located at the states of highest congeniality. Conversely, states that are surprising given your evolutionary niche, like being out of water for a fish, are maladaptive and should be avoided. (See post Neuromodulation.)
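One way to see the claimed equivalence is to identify utility with log prior probability, so that the most valuable states are, by construction, the least surprising ones. A toy sketch using the fish-out-of-water example (the probabilities are invented, and the identification U(s) = log p(s) is my own shorthand for Friston's argument):

```python
import math

# A fish's evolved niche, as a prior over states (invented numbers).
prior = {"in_water": 0.95, "on_land": 0.05}

# Surprise (surprisal) is -log p(s); identify utility with log p(s).
surprise = {s: -math.log(p) for s, p in prior.items()}
utility = {s: math.log(p) for s, p in prior.items()}

best_state = max(utility, key=utility.get)          # highest utility
least_surprising = min(surprise, key=surprise.get)  # lowest surprise
# The two orderings coincide: utility is just negative surprise,
# so seeking high-utility states and avoiding surprise are one policy.
```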
B. The costs of representation and computation
Probabilistic computations make exorbitant demands on a limited resource, and in a real physiological and psychological sense, these demands incur a cost that debits the utility of action. According to Gershman and Daw, humans are “cognitive misers” who seek to avoid effortful thought at every opportunity, and this effort diminishes the same neural signals that are excited by reward. For instance, one can study whether a rat that has learned to lever press for food while hungry will continue to do so when full; a full probabilistic representation over outcomes will adjust its expected utilities to the changed outcome value, whereas representing utilities only in expectation can preclude this and so predicts haplessly working for unwanted food. The upshot of many such experiments is that the brain adopts both approaches, depending on circumstances. Which approach the circumstances elicit can be explained by a sort of meta-optimization over the costs (e.g. extra computation) of maintaining the full representation relative to its benefits (better statistical accuracy).
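The rat example can be sketched as follows (my own toy numbers, not from the paper): a full, model-based value is recomputed from the outcome distribution each time, so devaluation propagates immediately, whereas a value cached while hungry is stored only in expectation and keeps recommending the lever.

```python
# Devaluation sketch: model-based vs. cached ("in expectation") values.
# All numbers invented for illustration.
outcome_prob = {"press": 0.9, "rest": 0.0}   # P(food | action)
effort = {"press": 1.0, "rest": 0.0}         # cost of performing the action

def model_based_value(action, food_value):
    # Recomputed from the outcome model, so a change in food_value
    # (devaluation) propagates immediately.
    return outcome_prob[action] * food_value - effort[action]

# Values cached while hungry (food worth 10): stored only in expectation,
# with no record of which outcome produced them.
cached = {a: model_based_value(a, 10.0) for a in outcome_prob}

# After devaluation (the rat is full: food is now worth 0):
mb_choice = max(outcome_prob, key=lambda a: model_based_value(a, 0.0))
mf_choice = max(cached, key=cached.get)
# The model-based agent stops pressing; the cached agent keeps
# haplessly working for unwanted food.
```

The meta-optimization Gershman and Daw describe would then weigh the cost of recomputing `model_based_value` on every trial against the statistical accuracy it buys.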
- Egon Brunswik’s Lens model as elucidated by Ken Hammond and examined by Karelaia & Hogarth (see post What has Brunswik’s Lens Model Taught? et al)
- Parallel Constraint Satisfaction model through Andreas Glockner and his colleagues (see post Parallel Constraint Satisfaction Theory et al)
- Surprise Minimization or Free Energy Minimization (see post Prediction Machine et al) as presented by Andy Clark and including the ideas of Karl Friston and others
I continually look for comment on and expansion of these ideas, and I often do this in the laziest of ways: I google them. Recently I seemed to find the last two mentioned on the same page of a philosophy book. That was not actually true, but it did remind me of similarities that I could point out. The idea of a compensatory process where one changes one's belief a little to match the current set of “facts” tracks well with the idea that we can get predictions correct by moving our hand to catch the ball so that it does not have to be thrown perfectly. Both clearly try to match up the environment and ourselves. The Parallel Constraint Satisfaction model minimizes dissonance while the Free Energy model minimizes surprise. Both dissonance and surprise can create instability. The Free Energy model is more universal than the Parallel Constraint Satisfaction model, while for decision making PCS is more precise. The Free Energy model also gives us the idea that heuristic models could fit within process models. All this points out what is obvious to us all: we need the right model for the right job.
This is the second post looking at Karl Friston’s review (“The Fantastic Organ” Brain 2013:136; 1328-1332) of Kandel’s The Age of Insight: the Quest to Understand the Unconscious in Art, Mind, and Brain, from Vienna 1900 to the Present. Kandel looks at how we make inferences about other people, ourselves and our emotional states. He combines the mirror neuron system with reflections in a mirror. Friston suggests that this captures the essence of ‘perspective taking’, which is unpacked in terms of second order representations (representations of representations) as they relate to theory of mind and how artists use reflections. Friston states:
It is self evident that if our brains entail generative models of our world, then much of the brain must be devoted to modelling entities that populate our world; namely, other people. In other words, we spend much of our time generating hypotheses and predictions about the behavior of people—including ourselves. As noted by Kandel ‘the brain also needs a model of itself’ (p. 406).
This post is the first of two that look at a book review written by Karl Friston. Friston is, so far as I can tell, the primary idea man behind embodied cognition (see post Embodied (grounded) Prediction (cognition)). A book review is a chance to read his ideas in a somewhat less formal and easier to understand setting. He reviews The Age of Insight: The Quest to Understand the Unconscious in Art, Mind, and Brain, from Vienna 1900 to the Present by Eric R. Kandel (2012).