Credit assignment in multiple goal embodied visuomotor behavior

The intrinsic complexity of the brain can lead one to set aside issues related to its relationships with the body, but the field of embodied cognition emphasizes that understanding brain function at the system level requires one to address the role of the brain-body interface. It has only recently been appreciated that this interface performs huge amounts of computation that do not have to be repeated by the brain, and thus affords the brain great simplifications in its representations. In effect, the brain’s abstract states can refer to coded representations of the world created by the body. But even if the brain can communicate with the world through abstractions, the severe speed limitations in its neural circuitry mean that vast amounts of indexing must be performed during development so that appropriate behavioral responses can be rapidly accessed. One way this could happen would be if the brain used a decomposition whereby behavioral primitives could be quickly accessed and combined. This realization motivates our study of independent sensorimotor task solvers, which we call modules, in directing behavior. The issue we focus on herein is how an embodied agent can learn to calibrate such individual visuomotor modules while pursuing multiple goals. The biologically plausible standard for module programming is that of reinforcement given during exploration of the environment. However, this formulation contains a substantial issue when sensorimotor modules are used in combination: the credit for their overall performance must be divided amongst them. We show that this problem can be solved and that diverse task combinations are beneficial in learning and not a complication, as usually assumed. Our simulations show that fast algorithms are available that allot credit correctly and are insensitive to measurement noise.
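The credit assignment problem described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes that the agent observes only one global reward per step, that this reward is the noisy sum of the rewards of the currently active modules, and that each active module moves its own estimate toward the global reward minus the other active modules' estimates. The function name `estimate_module_rewards` and the parameters `beta`, `noise_sd`, and `n_steps` are hypothetical.

```python
import random


def estimate_module_rewards(true_rewards, n_steps=5000, beta=0.05,
                            noise_sd=0.1, seed=0):
    """Estimate per-module rewards from a shared global reward (illustrative only).

    Assumption: the global reward is the sum of the active modules' true
    rewards plus Gaussian measurement noise.
    """
    rng = random.Random(seed)
    n = len(true_rewards)
    estimates = [0.0] * n  # each module's running estimate of its own reward

    for _ in range(n_steps):
        # A different random combination of modules is active on each step.
        active = [i for i in range(n) if rng.random() < 0.5]
        if not active:
            continue

        # Only the noisy total is observed, never the individual rewards.
        total = sum(true_rewards[i] for i in active) + rng.gauss(0.0, noise_sd)

        # Shared prediction error: observed total minus summed estimates.
        error = total - sum(estimates[i] for i in active)

        # Equivalent to r_i <- (1 - beta) * r_i + beta * (total - sum of the
        # other active modules' estimates), applied to every active module.
        for i in active:
            estimates[i] += beta * error

    return estimates


if __name__ == "__main__":
    print(estimate_module_rewards([1.0, -0.5, 2.0]))
```

In this sketch the estimates can separate only because the set of active modules varies from step to step. If the same combination were always active, only the sum of its rewards would be identifiable, which is one way to read the abstract's claim that diverse task combinations help learning.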

Metadata
Author:Constantin A. Rothkopf, Dana H. Ballard
URN:urn:nbn:de:hebis:30:3-267472
DOI:http://dx.doi.org/10.3389/fpsyg.2010.00173
ISSN:1664-1078
Parent Title (English):Frontiers in psychology
Publisher:Frontiers Research Foundation
Place of publication:Lausanne
Document Type:Article
Language:English
Date of Publication (online):2010/11/22
Date of first Publication:2010/11/22
Publishing Institution:Univ.-Bibliothek Frankfurt am Main
Release Date:2012/10/11
Tag:credit assignment; learning; modules; reinforcement; reward
Volume:1
Issue:Article 173
Pagenumber:13
First Page:1
Last Page:13
Note:
Copyright © 2010 Rothkopf and Ballard. This is an open-access article subject to an exclusive license agreement between the authors and the Frontiers Research Foundation, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are credited.
Institutes:Frankfurt Institute for Advanced Studies (FIAS)
Dewey Decimal Classification:150 Psychology
Collections:University publications
Licence (German):Creative Commons - Attribution 3.0
