Unit 2: the blind men and the elephant

the blind men and the elephant

An elephant, by Itamar Edelman

the blind men and the elephant

Four case studies — four approaches to understanding the brain, and their limitations:

An ambitious broad-scope "computational" theory (J. R. Anderson. ACT: A simple theory of complex cognition. American Psychologist, 51:355-365, 1996).
A single-cell electrophysiology study (C. D. Salzman, K. H. Britten, and W. T. Newsome. Cortical microstimulation influences perceptual judgements of motion direction. Nature, 346:174-177, 1990).
An imaging study (J. V. Haxby, M. I. Gobbini, M. L. Furey, A. Ishai, J. L. Schouten, and P. Pietrini. Distributed and overlapping representations of faces and objects in ventral temporal cortex. Science, 293:2425-2430, 2001).
A "big data" / "deep learning" hack (V. Mnih et al. Human-level control through deep reinforcement learning. Nature, 518:529-533, 2015).

[NOTE: I did not include an example of a purely behavioral study ("experimental psychology") — criticizing something like that would be like shooting fish in a barrel.]

case study #1: ACT-R (John Anderson)

"The ACT-R (Adaptive Control of Thought, Rational) theory rests upon three important components: rational [= "effective + adaptive"] analysis (Anderson, 1990), the distinction between procedural and declarative memory (Anderson, 1976), and a modular structure in which components communicate through buffers."

"The actual substrate of cognition is an interconnected network of neurons. Whether or not this is a significant source of constraint is open to debate."

— Taatgen, N.A. & Anderson, J.R. (2008). ACT-R.
In R. Sun (ed.), Constraints in Cognitive Architectures.
Cambridge University Press, 170-185.

ACT-R: a "simple theory of complex cognition"

"All that there is to intelligence is the simple accrual and tuning of many small units of knowledge that in total produce complex cognition. The whole is no more than the sum of its parts, but it has a lot of parts."

— Anderson, J. R. (1996). ACT: A simple theory of complex cognition.
American Psychologist, 51, 355-365.

ACT-R: a "simple theory of complex cognition"

"All this knowledge creates a serious problem. How does one select the appropriate knowledge in a particular context?"

"ACT-R has developed a two-pass solution for knowledge deployment. An initial parallel activation process identifies the knowledge structures (chunks and productions) that are most likely to be useful in the context, and then those knowledge structures determine performance."

— Anderson, J. R. (1996). ACT: A simple theory of complex cognition.
American Psychologist, 51, 355-365.
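A toy sketch of the two-pass idea quoted above: pass 1 scores every chunk and picks the most active one, pass 2 lets that chunk determine the response. The scoring rule (base level plus a fixed boost per cue shared with the context), the chunk contents, and all names are assumptions made for illustration; this is not ACT-R's actual machinery.

```python
def most_active(chunks, context):
    """Pass 1: score every chunk (conceptually in parallel), return the best one."""
    def score(chunk):
        # Hypothetical scoring: base level plus 1.0 per cue shared with the context.
        overlap = len(chunk["cues"] & context)
        return chunk["base_level"] + 1.0 * overlap
    return max(chunks, key=score)


def respond(chunk):
    """Pass 2: the selected knowledge structure determines performance."""
    return chunk["answer"]


# Hypothetical declarative memory: three arithmetic facts.
chunks = [
    {"cues": {"3", "4", "plus"}, "base_level": 0.5, "answer": "7"},
    {"cues": {"3", "4", "times"}, "base_level": 0.9, "answer": "12"},
    {"cues": {"2", "plus"}, "base_level": 1.2, "answer": "4"},
]
context = {"3", "4", "plus"}
answer = respond(most_active(chunks, context))
```

The multiplication fact has the higher base level, but the context favors the addition fact, which is the one deployed.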

ACT-R as a theory of cognition

"ACT-R is a hybrid architecture in the sense that it has both symbolic and subsymbolic aspects. The symbolic aspects involve declarative chunks and procedural production rules. The declarative chunks are the knowledge-representation units that reside in declarative memory, and the production rules are responsible for the control of cognition. Access to these symbolic structures is determined by a subsymbolic level of neural-like activation quantities."

— Anderson, J. R., & C. Lebiere (2003).
The Newell Test for a theory of cognition.
Behavioral and Brain Sciences 26:587-640.

ACT-R as a theory of cognition

"ACT-R consists of two key memories — a declarative memory and a procedural memory. The system has separate memories for each different type of chunk — for example, addition facts are represented by one type memory, whereas integers are represented by a separate type memory."

— Anderson, J. R., & C. Lebiere (2003).
The Newell Test for a theory of cognition.
Behavioral and Brain Sciences 26:587-640.

ACT-R: a "simple theory of complex cognition"

"Activation [of an item] in ACT-R theory reflects its log posterior odds of being appropriate in the current context. This is calculated as a sum of the log odds that the item has been useful in the past (log prior odds) plus an estimate that it will be useful given the current context (log likelihood ratio). [This is a Bayesian approach, more about which later.]

Thus, the ACT-R claim is that the mind keeps track of general usefulness and combines this with contextual appropriateness to make some inference about what knowledge to make available in the current context. The basic equation is

Activation-Level = Base-level + Contextual-Priming"

— Anderson, J. R. (1996). ACT: A simple theory of complex cognition.
American Psychologist, 51, 355-365.
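A minimal numeric sketch of the equation above, read as Bayes' rule in log-odds form: log posterior odds = log prior odds + log likelihood ratio. The function names and the example numbers are assumptions chosen for illustration, not values from Anderson's paper.

```python
import math


def log_odds(p):
    """Convert a probability into log odds."""
    return math.log(p / (1.0 - p))


def activation(prior_p, likelihood_ratio):
    """Activation = base level (log prior odds) + contextual priming (log likelihood ratio)."""
    base_level = log_odds(prior_p)
    contextual_priming = math.log(likelihood_ratio)
    return base_level + contextual_priming


def posterior_probability(act):
    """Map an activation (log posterior odds) back to a probability."""
    return 1.0 / (1.0 + math.exp(-act))


# Hypothetical item: useful 20% of the time in the past (prior odds 1:4),
# with a current context that is 4 times more likely when the item is needed.
a = activation(prior_p=0.2, likelihood_ratio=4.0)
p = posterior_probability(a)
```

With these numbers the context exactly offsets the low prior: the activation is about zero, which corresponds to even odds that the item is needed.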

ACT-R as a theory of cognition

"In addition to the learning mechanisms that update activation and expected outcome, ACT-R can also learn new chunks and production rules."

"New chunks are learned automatically: Each time a goal is completed or a new percept is encountered, it is added to declarative memory. New production rules are learned by combining existing production rules. The circumstance for learning a new production rule is that two rules fire one after another, with the first rule retrieving a chunk from memory. A new production rule is formed that combines the two into a macro-rule but eliminates the retrieval. Therefore, everything in an ACT-R model (chunks, productions, activations, and utilities) is learnable."

— Anderson, J. R., & C. Lebiere (2003).
The Newell Test for a theory of cognition.
Behavioral and Brain Sciences 26:587-640.
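A toy sketch of the rule-combination step described above: a first rule retrieves a fact, a second rule uses it, and the learned macro-rule has the retrieved value built in, so it no longer consults declarative memory. The dictionary-based memory, the goal format, and all function names are hypothetical; ACT-R's own production syntax is different.

```python
# Hypothetical declarative memory holding addition facts.
ADDITION_FACTS = {(3, 4): 7, (2, 5): 7}


def request_retrieval(goal):
    """Rule 1: given an addition goal, retrieve the matching fact."""
    return ADDITION_FACTS[(goal["a"], goal["b"])]


def report_sum(goal, retrieved_sum):
    """Rule 2: write the retrieved sum into the goal."""
    return {**goal, "answer": retrieved_sum}


def compile_rules(goal):
    """Combine rule 1 and rule 2 into a macro-rule specialized to this goal."""
    baked_in = request_retrieval(goal)  # retrieval happens once, when the rule is formed
    a, b = goal["a"], goal["b"]

    def macro_rule(new_goal):
        if (new_goal["a"], new_goal["b"]) != (a, b):
            return None  # the macro-rule does not match this goal
        return {**new_goal, "answer": baked_in}  # no memory access here

    return macro_rule


goal = {"a": 3, "b": 4}
slow = report_sum(goal, request_retrieval(goal))  # two rule firings plus a retrieval
macro = compile_rules(goal)
ADDITION_FACTS.clear()  # empty the memory: the macro-rule must not depend on it
fast = macro(goal)
```

The macro-rule gives the same result as the two-rule sequence, but only for the specific goal it was formed on.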

ACT-R as a theory of cognition

"It might seem strange that neural computation should just so happen to satisfy the well-formedness constraints required to correspond to the symbolic level of a system like ACT-R. This would indeed be miraculous if the brain started out as an unstructured net that had to organize itself just in response to experience. However, as illustrated in the tentative brain correspondences for ACT-R components [next slide] and in the [preceding] description of ACT-RN, the symbolic structure emerges out of the structure of the brain."

— Anderson, J. R., & C. Lebiere (2003).
The Newell Test for a theory of cognition.
Behavioral and Brain Sciences 26:587-640.

ACT-R architecture

The organization of information in ACT-R. Information in the buffers associated with modules is responded to and changed by production rules.
DLPFC: dorsolateral prefrontal cortex; VLPFC: ventrolateral prefrontal cortex.

— Anderson, J. R., et al. (2004).
An Integrated Theory of the Mind.
Psychological Review 111:1036-1060.

[but consider the octopus]

On the right: octopus central nervous system (Fig. 6B from The cephalopod nervous system: What evolution has made of the molluscan design, B. U. Budelmann, 1995).

[but consider the actual interconnection pattern of primate visual cortical areas]

Felleman, D. J., and D. C. Van Essen (1991).
Distributed hierarchical processing in the primate cerebral cortex.
Cerebral Cortex 1:1-47.

self-critique by Anderson: what ACT-R doesn't do

"Sometimes the suspicion is stated that ACT-R is a general computational system that can be programmed to do anything. To address this issue, we would like to specify four senses in which the system falls short of that."

Limitations: timing; serial bottleneck.
Mechanisms not yet incorporated: e.g., speech.
Domains not yet addressed: e.g., perceptual recognition (as opposed to attention).
Technical issues: e.g., temporal bounds for utility learning [cf. credit assignment problem, due to Minsky].

— Anderson, J. R., & C. Lebiere (2003).
The Newell Test for a theory of cognition.
Behavioral and Brain Sciences 26:587-640.

case study #2: single-cell physiology (W. T. Newsome)

Localized brain stimulation for diagnostic purposes, pioneered by Wilder Penfield, or: how to smell burnt toast that's not there.

the cue chosen for Newsome's study: visual motion

Motion is a powerful cue to 3D structure and to "abstract" features of the moving object:

inducing motion perception by microstimulation of cortical area MT

cortical area MT, which "specializes" in motion processing

microstimulation of area MT: behavioral tasks and neural data

Behavioral tasks:

fixation

saccade (gaze shift)

reaching

The data: parallel recordings of —

cell activity
visual stimulus presence/absence
gaze direction

microstimulation of cortical area MT: stimulus properties

Moving-dots stimuli at varying degrees of coherence:

The behavioral task is to detect and signal the direction of coherent motion. Its difficulty (and therefore the performance) depends on the degree of coherence.
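A minimal sketch of how coherence controls task difficulty: a fraction of the dots (the coherence) moves in the signal direction and the rest move in random directions, so the net motion signal shrinks as coherence drops. This simplifies the real displays; the function names, parameters, and the use of random directions for the noise dots are assumptions made for illustration.

```python
import math
import random


def dot_displacements(n_dots, coherence, direction_deg, step=1.0, rng=None):
    """Return one (dx, dy) displacement per dot for a single frame update."""
    rng = rng or random.Random()
    n_signal = round(n_dots * coherence)  # dots carrying the coherent motion
    displacements = []
    for i in range(n_dots):
        if i < n_signal:
            angle = math.radians(direction_deg)  # signal dot: fixed direction
        else:
            angle = rng.uniform(0.0, 2.0 * math.pi)  # noise dot: random direction
        displacements.append((step * math.cos(angle), step * math.sin(angle)))
    return displacements


def net_motion(displacements):
    """Average displacement across all dots: a crude measure of signal strength."""
    n = len(displacements)
    return (sum(dx for dx, _ in displacements) / n,
            sum(dy for _, dy in displacements) / n)


full = dot_displacements(100, 1.0, 0.0, rng=random.Random(0))
weak = dot_displacements(100, 0.05, 0.0, rng=random.Random(0))
```

At 100% coherence the net motion equals the full step in the signal direction; at 5% only five of the hundred dots carry the signal, and the average is dominated by noise.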

degrees of motion coherence: four examples

[Figure: four panels of moving-dot stimuli at 0%, 5%, 30%, and 100% coherence]

typical responses of an MT cell to motion stimuli


the typical response of a single neuron in area MT

area MT is organized in columns by the neurons' preferred direction of motion

the outcome: inducing perception by cortical microstimulation

From William T. Newsome's lab (Stanford):

(A) spatial layout of fixation point and eye movement targets

(B) tuning curve illustrating the collective responses of neurons in a single column of area MT; data obtained from 100%-coherence trials

(C) behavioral data obtained by electrically stimulating the column whose tuning is shown in (B);
these data are from 0%-coherence trials!

case study #3: fMRI (J. Haxby)

"Schematic diagram illustrating the locations of the fusiform face area (FFA), which also has been implicated in expert visual recognition, and the parahippocampal place area (PPA) on the ventral surface of the right temporal lobe. In most brains, these areas are bilateral."

— J. V. Haxby, M. I. Gobbini, M. L. Furey, A. Ishai, J. L. Schouten, and P. Pietrini (2001).
Distributed and overlapping representations of faces and objects in ventral temporal cortex.
Science 293:2425-2430.

stimulus set

"Examples of stimuli. Subjects performed a one-back repetition detection task in which repetitions of meaningful pictures were different views of the same face or object."

brain response (average across categories)

"The category specificity of patterns of response was analyzed with pairwise contrasts between within-category and between-category correlations. These patterns were normalized to a mean of zero in each voxel across categories by subtracting the mean response across all categories. Brain images shown here are the normalized patterns of response in two axial slices in a single subject. Responses in all object-selective voxels in ventral temporal cortex are shown."

(C) Mean response across all categories relative to a resting baseline.
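A toy sketch of the analysis described in the caption: subtract each voxel's mean response across categories, then check, for every pair of categories, whether the within-category correlation between two independent halves of the data exceeds the between-category correlation. The four-voxel patterns and all names are made up for illustration; they are not data from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length response patterns."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def normalize(patterns):
    """Subtract, in each voxel, the mean response across all categories."""
    cats = list(patterns)
    n_vox = len(patterns[cats[0]])
    means = [sum(patterns[c][v] for c in cats) / len(cats) for v in range(n_vox)]
    return {c: [patterns[c][v] - means[v] for v in range(n_vox)] for c in cats}


def pairwise_accuracy(half_a, half_b):
    """Fraction of pairwise contrasts won by the within-category correlation."""
    a, b = normalize(half_a), normalize(half_b)
    wins = total = 0
    for cat in a:
        within = pearson(a[cat], b[cat])
        for other in a:
            if other == cat:
                continue
            total += 1
            wins += within > pearson(a[cat], b[other])
    return wins / total


# Hypothetical responses of four voxels to three categories, in two data halves.
half_a = {"faces": [2.0, 1.0, 0.5, 0.2],
          "houses": [0.3, 0.6, 1.8, 1.1],
          "chairs": [0.9, 1.4, 0.7, 1.6]}
half_b = {"faces": [1.9, 1.1, 0.4, 0.3],
          "houses": [0.4, 0.5, 1.7, 1.2],
          "chairs": [1.0, 1.3, 0.8, 1.5]}
accuracy = pairwise_accuracy(half_a, half_b)
```

Because the whole pattern across voxels enters the correlation, a category can be identified even from voxels that do not respond maximally to it.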

classification based on multivoxel response

regarding multivoxel response

"All 66 regions were active in at least one domain; 65 (98.5%) were active in two or more domains. [...] The 66 regions were active in an average of 9.09 (SD 2.27) different domains. The 60 regions active in action tasks were also active in an average 7.38 (SD 0.98) non-action domains and 5.5 (SD 0.81) cognitive domains. The 64 regions active in perception tasks were also active in 7.39 (SD 1.87) non-perceptual domains and 5.34 cognitive domains. The 59 regions active in both perception and action tasks were also active in an average of 5.53 (SD 0.80) other domains, and the 7 regions not active in both perception and action tasks were active in an average of 3.00 (SD 1.41) of the cognitive domains. Only one region was active in only cognitive tasks, and that region was active only in memory."

— Anderson, M. L. (2010).
Neural reuse: A fundamental organizational principle of the brain.
Behavioral and Brain Sciences 33:245-266.

case study #4: an ML hack — learning to play Atari (V. Mnih et al.)

Mnih, V., et al. (2015). Human-level control through deep reinforcement learning. Nature 518:529-533.

"games demanding more temporally extended planning strategies still constitute a major challenge for all existing agents [...] (for example, Montezuma's Revenge)."

moral: ??? [the blind men and the elephant]

Four case studies — four approaches to understanding the brain, and their limitations:

An ambitious broad-scope "computational" theory (ACT-R): based on a single top-down assumption.
A single-cell electrophysiology study (microstimulation): neat, but leaves most questions unresolved.
An imaging study (fMRI): pinpointing where in the brain particular operations happen is uninformative; worse, localization of function is a fool's hope.
Throwing machine learning at problems (Deep Learning etc.): even if the hack succeeds, it doesn't reveal much about how the brain works.

Computational Psychology
