Development of cue integration with reward-mediated learning

This thesis first introduces Bayesian theory in more detail, together with its use in integrating multiple
information sources. I will briefly discuss models and their relation to the dynamics of an environment,
and how to combine multiple alternative models.
Following that, I will discuss the experimental findings on multisensory integration in humans and
animals. I start with psychophysical results from a variety of tasks and setups which show that the brain
uses and combines information from multiple cues. Specifically, the discussion will focus on the finding
that humans integrate this information in a way that is close to the theoretically optimal performance.
Special emphasis will be put on results about the developmental aspects of cue integration, highlighting
experiments showing that children do not perform in line with the Bayesian predictions. This section
also includes a short summary of experiments on how subjects handle multiple alternative environmental
dynamics. I will also discuss neurobiological findings of cells that receive input from multiple receptors,
both in dedicated brain areas and in primary sensory areas.
I will proceed with an overview of existing theories and computational models of multisensory integration.
This will be followed by a discussion of reinforcement learning (RL). First, I will cover the
original theory, including its two main approaches, model-free and model-based reinforcement
learning. The important variables will be introduced, as well as different algorithmic implementations.
Second, a short review of the mapping of those theories onto brain and behaviour will be given. I mention
the most influential papers that showed correlations between the activity in certain brain regions
and RL variables, most prominently between dopaminergic neurons and temporal difference errors. I
will try to motivate why I think that this theory can help to explain the development of near-optimal
cue integration in humans.
The next main chapter will introduce our model, which learns to solve the task of audio-visual orienting.
Many of the results in this section have been published in [Weisswange et al. 2009b, Weisswange
et al. 2011]. The model agent starts without any knowledge of the environment and acts based on predictions
of rewards, which are adapted according to the reward signaling the quality of the performed
action. I will show that after training this model performs similarly to the prediction of a Bayesian
observer. The model can also handle more complex environments in which it has to deal with multiple
possible underlying generating models (that is, perform causal inference). In these experiments I use different
formulations of Bayesian observers for comparison with our model, and find that it is most similar to
the fully optimal observer doing model averaging. Additional experiments using various alterations to
the environment show the ability of the model to react to changes in the input statistics without explicitly
representing probability distributions. I will close the chapter with a discussion of the benefits and
shortcomings of the model.
The thesis continues with a report on an application of the learning algorithm introduced before
to two real-world cue integration tasks on a robotic head. For these tasks our system outperforms a
commonly used approximation to Bayesian inference, reliability-weighted averaging. The approximation
is attractive because of its computational simplicity, but it relies on certain assumptions that are usually
controlled for in a laboratory setting and often do not hold for real-world data. This chapter is
based on the paper [Karaoguz et al. 2011].
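For reference, reliability-weighted averaging as mentioned above can be sketched in a few lines. This is the standard textbook form, not code from the thesis: it assumes that each cue gives an unbiased estimate corrupted by independent Gaussian noise of known variance. The function name and the example numbers are illustrative assumptions.

```python
def reliability_weighted_average(estimates, variances):
    """Combine cue estimates, weighting each by its reliability (1 / variance).

    Assumes independent, unbiased cues with known Gaussian noise variances.
    Returns the combined estimate and its variance.
    """
    if len(estimates) != len(variances) or not estimates:
        raise ValueError("need one variance per estimate, and at least one cue")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")

    reliabilities = [1.0 / v for v in variances]
    total_reliability = sum(reliabilities)

    # Each cue contributes in proportion to its share of the total reliability.
    combined = sum(r * x for r, x in zip(reliabilities, estimates)) / total_reliability
    # The combined variance is never larger than that of the most reliable cue.
    combined_variance = 1.0 / total_reliability
    return combined, combined_variance


# Hypothetical example: a visual cue at 10 degrees (variance 1)
# and an auditory cue at 20 degrees (variance 4).
estimate, variance = reliability_weighted_average([10.0, 20.0], [1.0, 4.0])
```

In this example the combined estimate lies closer to the more reliable visual cue. When the assumptions fail, for instance when the cues do not share a common cause, this simple weighting can give misleading results, which is the limitation the paragraph above refers to.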
Our second modeling approach tries to address the neuronal substrates of the learning process for cue integration. I again use a reward-based training scheme, but this time implemented as a modulation of
synaptic plasticity mechanisms in a recurrent network of binary threshold neurons. I start the chapter
with an additional introduction section to discuss recurrent networks and especially the various forms of
neuronal plasticity that I will use in the model. The performance on a task similar to that of chapter 3 will be presented together with an analysis of the influence of different plasticity mechanisms on it.
Again, the benefits and shortcomings, as well as the general potential of the method, will be discussed.
I will close the thesis with a general conclusion and some ideas about possible future work.

Metadata
Author: Thomas Weißwange
URN: urn:nbn:de:hebis:30:3-251993
Referee: Jochen Triesch, Visvanathan Ramesh
Document Type: Doctoral Thesis
Language: English
Date of Publication (online): 13.06.2012
Year of first Publication: 2012
Publishing Institution: Univ.-Bibliothek Frankfurt am Main
Date of final exam: 04.06.2012
Number of pages: VIII, 120
HeBIS PPN: 30304649X
Institutes: Computer Science; Frankfurt Institute for Advanced Studies
Dewey Decimal Classification: 000 Computer science, information science, general works
Collections: University publications
Licence: Creative Commons - Attribution-NoDerivatives
