Abstract
This article focuses on the automated synthesis of agents in an uncertain environment, working in the setting of Reinforcement Learning and, more precisely, of Partially Observable Markov Decision Processes. The agents (with no model of their environment and no short-term memory) face multiple motivations/goals simultaneously, a problem related to the field of Action Selection. We propose and evaluate various Action Selection architectures. They all combine already known basic behaviors in an adaptive manner, by learning the tuning of the combination, so as to maximize the agent's payoff. The logical continuation of this work is to automate the selection and design of the basic behaviors themselves.
Translated title of the contribution | Study of various adaptative combinations of behaviors |
---|---|
Original language | French |
Pages (from-to) | 311-343 |
Number of pages | 33 |
Journal | Revue d'Intelligence Artificielle |
Volume | 20 |
Issue number | 2-3 |
Publication status | Published - 2006 |