Psych/Cogst 4650/6650, Spring 2012: Reinforcement Learning:
Computational and Brain Aspects
-
Instructors: Shimon Edelman, Barbara Finlay
-
Time: Wednesdays 10:10-12:35. Place: 369 Uris Hall.
-
Blackboard site:
HERE;
use it to post questions, suggestions for additional readings, etc.
-
Readings: a zipfile with all the PDFs is available on
Blackboard. For a week-by-week list, see below; for some annotations, see
here.
I. Introduction: the idea of reinforcement learning (RL) in machine
learning and neuroscience
Jan 25:
- Overview of reinforcement learning contrasted with other approaches (Edelman).
- General motor command structure in the brain (Finlay).
- Review topic assignments for the later dates.
Feb 1:
- Deeper into the basal ganglia (Finlay).
- Computational issues in action learning (Edelman).
- Choose presentation subjects and dates.
Readings for the Jan. 25 and Feb. 1 introductory material (required):
-
Woergoetter, F., and B. Porr (2008). Reinforcement learning. Scholarpedia,
3(3):1448.
-
Parent, A. and L.-N. Hazrati (1995). Functional anatomy of the basal
ganglia. I. The cortico-basal ganglia-thalamo-cortical loop. Brain Research
Reviews 20:91-127.
-
Wolpert, D. M., J. Diedrichsen, J. R. Flanagan (2011). Principles of
sensorimotor learning. Nature Reviews Neuroscience 12:739-752.
-
Atallah, H. E., Frank, M. J., & O'Reilly, R. C. (2004). Hippocampus,
cortex, and basal ganglia: Insights from computational models of
complementary learning systems. Neurobiology of Learning and Memory, 82(3),
253-267.
Other non-required readings:
-
Chater, N. (2009). Rational and mechanistic perspectives on reinforcement
learning. Cognition 113:350-364.
-
Sutton, R. and A. G. Barto (1998). Reinforcement Learning (BOOK).
Background reading for basic brain structure review:
-
Purves et al. (2008). The Human Nervous System. Chapter 1 in
Principles of Cognitive Neuroscience, Sinauer. Clear overview of basic
structure and terminology, including the assumed background on neurons and
action potentials.
-
Solari, S. V. H., & Stone, R. A. (2011). Cognitive consilience:
primate non-primary circuits underlying cognition. Frontiers in
Neuroanatomy, 5, 65. doi: 10.3389/fnana.2011.00065
II. Brain Mechanisms and Models of Reinforcement Learning
Feb 8:
Basic basal ganglia circuitry and models
-
Aldridge, J. W., & Berridge, K. C. (1998). Coding of serial order
by neostriatal neurons: A ''natural action'' approach to movement
sequence. Journal of Neuroscience, 18(7), 2777-2787. Also, Aldridge and
Berridge review.
-
Frank, M. J. (2011). Computational models of motivated action
selection in corticostriatal circuits. Current Opinion in Neurobiology,
21(3), 381-386.
-
Graybiel, A. M. (2008). Habits, rituals, and the evaluative
brain. Annual Review of Neuroscience, 31(1), 359-387.
-
Redgrave, P., Vautrelle, N., & Reynolds, J. N. J. (2011).
Functional properties of the basal ganglia's re-entrant loop architecture:
selection and reinforcement. Neuroscience, 198, 138-151.
Additional material, not required (note: Most of these review similar
material, but frame the content in the specific academic perspective
indicated, except Cohen and Frank, which is a different version of Frank
2011):
-
Ashby, F. G., Turner, B. O., & Horvitz, J. C. (2010). Cortical and basal
ganglia contributions to habit learning and automaticity. Trends in
Cognitive Sciences, 14(5), 208-215.
-
Bar-Gad, I., Morris, G., & Bergman, H. (2003). Information processing,
dimensionality reduction and reinforcement learning in the basal
ganglia. Progress in Neurobiology, 71(6), 439-473.
-
Balleine, B. W., Liljeholm, M., & Ostlund, S. B. (2009). The integrative
function of the basal ganglia in instrumental conditioning. Behavioural
Brain Research, 199:43-52.
-
Botvinick, M. M., Niv, Y., & Barto, A. G. (2009). Hierarchically organized
behavior and its neural foundations: A reinforcement learning
perspective. Cognition, 113(3), 262-280.
-
Chakravarthy, V. S., D. Joseph, R. S. Bapi (2010). What do the basal
ganglia do? A modeling perspective. Biol Cybern 103:237-253.
-
Cohen, M. X., & Frank, M. J. (2009). Neurocomputational models of basal
ganglia function in learning, memory and choice. Behavioural Brain
Research, 199:141-156.
Feb 15:
The process of learning and unlearning as insight into models
-
Jin, X., & Costa, R. M. (2010). Start/stop signals emerge in
nigrostriatal circuits during sequence learning. Nature, 466(7305),
457-462.
-
Thorn CA, Atallah H, Howe M, Graybiel AM (2010). Differential dynamics
of activity changes in dorsolateral and dorsomedial striatal loops during
learning. Neuron 66:781-795.
-
Howe, M. W., Atallah, H. E., McCool, A., Gibson, D. J., & Graybiel,
A. M. (2011). Habit learning is associated with major shifts in
frequencies of oscillatory activity and synchronized spike firing in
striatum. Proceedings of the National Academy of Sciences, 108(40),
16801-16806.
Additional:
-
Jin, D. Z., Fujii, N., & Graybiel, A. M. (2009). Neural representation of
time in cortico-basal ganglia circuits. Proceedings of the National Academy
of Sciences, 106(45), 19156-19161.
Feb 22:
Chaining, embedding, interrupting and unlearning (1)
-
Charlesworth, J. D., Tumer, E. C., Warren, T. L., & Brainard,
M. S. (2011). Learning the microstructure of successful
behavior. Nature Neuroscience, 14(3), 373-380. doi: 10.1038/nn.2748
-
Bornstein, A. M., & Daw, N. D. (2011). Multiplicity of control in
the basal ganglia: computational roles of striatal subregions. Current
Opinion in Neurobiology, 21(3), 374-380.
-
Amemori, K.-i., Gibb, L. G., & Graybiel, A. M. (2011). Shifting
responsibly: The importance of striatal modularity to reinforcement
learning in uncertain environments. Frontiers in Human Neuroscience, 5.
Additional:
-
Ito, M., & Doya, K. (2011). Multiple representations and algorithms for
reinforcement learning in the cortico-basal ganglia circuit. Current
Opinion in Neurobiology, 21(3), 368-373.
Feb 29:
Chaining, embedding, interrupting and unlearning (2)
-
Ding, J. B., Guzman, J. N., Peterson, J. D., Goldberg, J. A., &
Surmeier, D. J. (2010). Thalamic gating of corticostriatal signaling by
cholinergic interneurons. Neuron, 67(2), 294-307.
-
Isoda, M., & Hikosaka, O. (2011). Cortico-basal ganglia mechanisms
for overcoming innate, habitual and motivational
behaviors. European Journal of
Neuroscience, 33(11), 2058-2069. doi: 10.1111/j.1460-9568.2011.07698.x
-
Special topic: Michael Anderson, Colloquium Speaker in
Psychology this week. Anderson, M. (2010). Neural re-use as a
fundamental organizational principle of the brain. Behavioral and Brain
Sciences.
Mar 7:
Individual differences in performance of models and subjects, disorders
-
Montague, P. R., Dolan, R. J., Friston, K. J., & Dayan, P. (2012).
Computational psychiatry. Trends in Cognitive Sciences, 16(1), 72-80.
-
Redgrave, P., Rodriguez, M., Smith, Y., Rodriguez-Oroz, M. C.,
Lehericy, S., Bergman, H., . . . Obeso, J. A. (2010). Goal-directed and
habitual control in the basal ganglia: implications for Parkinson's
disease. Nature Reviews Neuroscience, 11(11), 760-772. doi: 10.1038/nrn2915
-
Neiman, T. and Y. Loewenstein (2011). Reinforcement learning in
professional basketball players. Nature Communications 2:569.
Mar 14:
Assigning agency
-
Review Wolpert (under Jan. 25 readings).
-
Gallagher, S. (2012). Multiple aspects in the sense of agency.
New Ideas in Psychology 30:15-31.
-
Wegner, D. (2004). Precis of The illusion of conscious will. Behavioral and
Brain Sciences, 27:649-659 (not commentaries). Whoever signs up for
this should get the book and present some of the experiments reviewed, with
an eye to the statistical assignment of agency, not the consciousness
aspect.
-
Whitham, E. M., Fitzgibbon, S. P., Lewis, T. W., Pope, K. J.,
DeLosAngeles, D., Clark, C. R., . . . Willoughby, J. O. (2011). Visual
experiences during paralysis. Frontiers in Human Neuroscience, 5. doi:
10.3389/fnhum.2011.00160
Mar 21: SPRING BREAK
Mar 28:
Multiple types of reinforcers and gates: Opiates and oxytocin
-
Humphries, M. D., & Prescott, T. J. (2010). The ventral basal
ganglia, a selection mechanism at the crossroads of space, strategy, and
reward. Progress in Neurobiology, 90(4), 385-417.
-
Ross, H. E., & Young, L. J. (2009). Oxytocin and the neural
mechanisms regulating social cognition and affiliative behavior. Frontiers
in Neuroendocrinology, 30(4), 534-547.
-
Depue, R. A., & Collins, P. F. (1999). Neurobiology of the structure of
personality: Dopamine, facilitation of incentive motivation, and
extraversion. Behavioral and Brain Sciences, 22, 491-569.
An example, but not necessarily a "model", of research in this area using
primates and neuroimaging:
-
Chang, S. W. C., Barter, J. W., Ebitz, R. B., Watson, K. K., & Platt,
M. L. (2012). Inhaled oxytocin amplifies both vicarious reinforcement and
self reinforcement in rhesus macaques (Macaca mulatta). Proceedings of the
National Academy of Sciences, 109:959-964.
April 4:
Hierarchical and other control architectures: computation
-
Parr, R. and S. Russell (1997). Reinforcement Learning with Hierarchies of
Machines. Proc. NIPS.
-
Botvinick, M., Y. Niv, A. G. Barto (2009). Hierarchically organized behavior
and its neural foundations: A reinforcement learning perspective. Cognition
113:262-280.
-
Ribas-Fernandes, J. J. F., A. Solway, C. Diuk, J. T. McGuire, A. Barto,
Y. Niv, and M. Botvinick (2011). A Neural Signature of Hierarchical
Reinforcement Learning. Neuron 71:370-379.
-
Barto, A. G. and S. Mahadevan (2003). Recent Advances in Hierarchical
Reinforcement Learning. Discrete Event Dynamic Systems 13:41-77.
-
Vigorito, C. M. and A. G. Barto (2010). Intrinsically Motivated Hierarchical
Skill Learning in Structured Environments. IEEE Transactions on Autonomous
Mental Development 2:132-144.
April 11:
Negative reinforcement and punishment
-
Matsumoto M, Hikosaka O (2009). Two types of dopamine neuron distinctly convey
positive and negative motivational signals. Nature 459:837-841.
-
LeDoux, J. E. (2003). The emotional brain, fear and the amygdala. Cellular
and Molecular Neurobiology, 23.
Additional:
-
Leknes, S., & Tracey, I. (2008). A common neurobiology for pain and
pleasure. Nature Reviews Neuroscience, 9:314-320.
April 18:
The special case of anxiety: resource allocation and control
-
Egner, T., Etkin, A., Gale, S., & Hirsch, J. (2008). Dissociable
neural systems resolve conflict from emotional versus nonemotional
distracters. Cerebral Cortex, 18(6), 1475-1484.
(Also Egner commentary).
Additional:
-
Duncan, J. (2010). The multiple-demand (MD) system of the primate brain:
mental programs for intelligent behaviour. Trends in Cognitive Sciences,
14:172-179.
-
Hurley, M. M., Dennett, D., and Adams, R. (2011). Inside Jokes: Using
humor to reverse-engineer the mind. MIT Press. Chapters 2 and 5 for a
sketch of the argument and an interesting twist on what a "reinforcement"
is.
April 25:
Integrating basal ganglia function into memory and decision-making
-
O'Reilly, R. C. and M. J. Frank (2006). Making Working Memory Work: A
Computational Model of Learning in the Prefrontal Cortex and Basal
Ganglia. Neural Computation 18:283-328.
-
van der Meer MA, Johnson A, Schmitzer-Torbert NC, Redish
AD (2010). Triple dissociation of information processing in dorsal
striatum, ventral striatum, and hippocampus on a learned spatial decision
task. Neuron 67:25-32.
-
McNab, F. and T. Klingberg (2008). Prefrontal cortex and basal ganglia
control access to working memory. Nature Neuroscience 11:103-108.
-
Ullman, M. T. (2006). Is Broca's area part of a basal ganglia
thalamocortical circuit? Cortex 42:480-485.
May 2:
Overview
-
Shmuelof, L., & Krakauer, J. W. (2011). Are we ready for a natural
history of motor learning? Neuron, 72(3), 469-476.
Shimon Edelman <se37 at cornell.edu>
Last modified on Mon Feb 27 14:50:27 2012