TY - GEN
T1 - Identifying Reusable Early-Life Options
AU - Weber, Aline
AU - Martin, Charles P.
AU - Torresen, Jim
AU - Da Silva, Bruno C.
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/9/30
Y1 - 2019/9/30
AB - We introduce a method for identifying short-duration reusable motor behaviors, which we call early-life options, that allow robots to perform well even in the very early stages of their lives. This is important when agents need to operate in environments where the use of poor-performing policies (such as the random policies with which they are typically initialized) may be catastrophic. Our method augments the original action set of the agent with specially constructed behaviors that maximize performance over a possibly infinite family of related motor tasks. These are akin to primitive reflexes in infant mammals: agents born with our early-life options, even if acting randomly, are capable of producing rudimentary behaviors comparable to those acquired by agents that actively optimize a policy for hundreds of thousands of steps. We also introduce three metrics for identifying useful early-life options and show that they result in behaviors that maximize the option's expected return while minimizing the risk that executing the option will result in extremely poor performance. We evaluate our technique on three simulated robots tasked with learning to walk under different battery consumption constraints and show that even random policies over early-life options are already sufficient to allow the agent to perform similarly to agents trained for hundreds of thousands of steps.
KW - Development of skills in biological systems and robots
KW - Machine Learning methods for robot development
UR - https://www.scopus.com/pages/publications/85073674250
U2 - 10.1109/DEVLRN.2019.8850725
DO - 10.1109/DEVLRN.2019.8850725
M3 - Conference Paper
AN - SCOPUS:85073674250
SN - 978-1-5386-8129-9
T3 - IEEE International Conference on Development and Learning, ICDL
SP - 335
EP - 340
BT - 2019 Joint IEEE 9th International Conference on Development and Learning and Epigenetic Robotics, ICDL-EpiRob 2019
A2 - Aly, Amir
A2 - Bicho, Estela
A2 - Boucenna, Sofiane
A2 - Castro da Silva, Bruno
A2 - Chetouani, Mohamed
A2 - del Pobil, Angel P.
A2 - Diard, Julien
A2 - Doncieux, Stephane
A2 - Goksun, Tilbe
A2 - Grimminger, Angela
A2 - Guerin, Frank
A2 - Hagiwara, Yoshinobu
A2 - Jamone, Lorenzo
A2 - Kalkan, Sinan
A2 - Lara, Bruno
A2 - Moulin-Frier, Clement
A2 - Murata, Shingo
A2 - Nagai, Takayuki
A2 - Nagai, Yukie
A2 - Nomikou, Iris
A2 - Ogino, Masaki
A2 - Oudeyer, Pierre-Yves
A2 - Pereira, Alfredo F.
A2 - Pitti, Alexandre
A2 - Raczaszek-Leonardi, Joanna
A2 - Risi, Sebastian
A2 - Rosman, Benjamin
A2 - Sandamirskaya, Yulia
A2 - Schilling, Malte
A2 - Sciutti, Alessandra
A2 - Shaw, Patricia
A2 - Soltoggio, Andrea
A2 - Spranger, Michael
A2 - Taniguchi, Tadahiro
A2 - Thill, Serge
A2 - Triesch, Jochen
A2 - Ugur, Emre
A2 - Vollmer, Anna-Lisa
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 9th Joint IEEE International Conference on Development and Learning and Epigenetic Robotics, ICDL-EpiRob 2019
Y2 - 19 August 2019 through 22 August 2019
ER -