TY - GEN
T1 - Multilevel Monte-Carlo for Solving POMDPs Online
AU - Hoerger, Marcus
AU - Kurniawati, Hanna
AU - Elfes, Alberto
N1 - Publisher Copyright:
© 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2022
Y1 - 2022
N2 - Planning under partial observability is essential for autonomous robots. A principled way to address such planning problems is the Partially Observable Markov Decision Process (POMDP). Although solving POMDPs is computationally intractable, substantial advancements have been achieved in developing approximate POMDP solvers in the past two decades. However, computing robust solutions for systems with complex dynamics remains challenging. Most online solvers rely on a large number of forward simulations and standard Monte-Carlo methods to compute the expected outcomes of actions the robot can perform. For systems with complex dynamics, e.g., those with non-linear dynamics that admit no closed-form solution, even a single forward simulation can be prohibitively expensive. Of course, this issue is exacerbated for problems with long planning horizons. This paper aims to alleviate the above difficulty. To this end, we propose a new online POMDP solver, called Multilevel POMDP Planner (MLPP), that combines the commonly known Monte-Carlo Tree Search with the concept of Multilevel Monte-Carlo to speed up our capability in generating approximately optimal solutions for POMDPs with complex dynamics. Experiments on four different problems of POMDP-based torque control, navigation and grasping indicate that MLPP substantially outperforms state-of-the-art POMDP solvers.
AB - Planning under partial observability is essential for autonomous robots. A principled way to address such planning problems is the Partially Observable Markov Decision Process (POMDP). Although solving POMDPs is computationally intractable, substantial advancements have been achieved in developing approximate POMDP solvers in the past two decades. However, computing robust solutions for systems with complex dynamics remains challenging. Most online solvers rely on a large number of forward simulations and standard Monte-Carlo methods to compute the expected outcomes of actions the robot can perform. For systems with complex dynamics, e.g., those with non-linear dynamics that admit no closed-form solution, even a single forward simulation can be prohibitively expensive. Of course, this issue is exacerbated for problems with long planning horizons. This paper aims to alleviate the above difficulty. To this end, we propose a new online POMDP solver, called Multilevel POMDP Planner (MLPP), that combines the commonly known Monte-Carlo Tree Search with the concept of Multilevel Monte-Carlo to speed up our capability in generating approximately optimal solutions for POMDPs with complex dynamics. Experiments on four different problems of POMDP-based torque control, navigation and grasping indicate that MLPP substantially outperforms state-of-the-art POMDP solvers.
KW - Monte-Carlo
KW - POMDP
KW - Partially Observable Markov Decision Process
UR - http://www.scopus.com/inward/record.url?scp=85126227065&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-95459-8_11
DO - 10.1007/978-3-030-95459-8_11
M3 - Conference contribution
SN - 9783030954581
T3 - Springer Proceedings in Advanced Robotics
SP - 174
EP - 190
BT - Robotics Research - The 19th International Symposium ISRR
A2 - Asfour, Tamim
A2 - Yoshida, Eiichi
A2 - Park, Jaeheung
A2 - Christensen, Henrik
A2 - Khatib, Oussama
PB - Springer Nature
T2 - 19th International Symposium of Robotics Research, ISRR 2019
Y2 - 6 October 2019 through 10 October 2019
ER -