TY - GEN
T1 - Continuous real time dynamic programming for discrete and continuous state MDPs
AU - Vianna, Luis Gustavo Rocha
AU - Sanner, Scott
AU - De Barros, Leliane Nunes
N1 - Publisher Copyright:
© 2014 IEEE.
PY - 2014/12/12
Y1 - 2014/12/12
N2 - Applications of automated planning under uncertainty are often modelled as a discrete and continuous state Markov Decision Process (DC-MDP). Symbolic Dynamic Programming is the existent exact solution for DC-MDPs that uses the eXtended Algebraic Decision Diagrams (XADDs) to symbolically represent the state value function and that computes a complete state-space policy (which is very costly and limits solution to problems with small size and depth). Real-Time Dynamic Programming (RTDP) is an efficient solution method for discrete state MDPs that provides a partial solution for a known initial state. In this paper we combine the RTDP solution with XADD symbolic representation and computation of the value function to propose the Continuous Real Time Dynamic Programming (CRTDP) algorithm. This novel planner uses heuristic search and symbolic generalisation to efficiently update the value function by regions. We show that using the initial state information greatly reduces the number of regions in the value function, therefore allowing CRTDP to solve DC-MDPs more efficiently than standard symbolic dynamic programming both in time and space required for the solution.
AB - Applications of automated planning under uncertainty are often modelled as a discrete and continuous state Markov Decision Process (DC-MDP). Symbolic Dynamic Programming is the existent exact solution for DC-MDPs that uses the eXtended Algebraic Decision Diagrams (XADDs) to symbolically represent the state value function and that computes a complete state-space policy (which is very costly and limits solution to problems with small size and depth). Real-Time Dynamic Programming (RTDP) is an efficient solution method for discrete state MDPs that provides a partial solution for a known initial state. In this paper we combine the RTDP solution with XADD symbolic representation and computation of the value function to propose the Continuous Real Time Dynamic Programming (CRTDP) algorithm. This novel planner uses heuristic search and symbolic generalisation to efficiently update the value function by regions. We show that using the initial state information greatly reduces the number of regions in the value function, therefore allowing CRTDP to solve DC-MDPs more efficiently than standard symbolic dynamic programming both in time and space required for the solution.
KW - Continuous RTDP
KW - Continuous State Space Planning
KW - Discrete and Continuous State MDPs
KW - Markov Decision Process
KW - Symbolic Dynamic Programming
UR - https://www.scopus.com/pages/publications/84922483670
U2 - 10.1109/BRACIS.2014.34
DO - 10.1109/BRACIS.2014.34
M3 - Conference Paper
T3 - Proceedings - 2014 Brazilian Conference on Intelligent Systems, BRACIS 2014
SP - 134
EP - 139
BT - Proceedings - 2014 Brazilian Conference on Intelligent Systems, BRACIS 2014
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 3rd Brazilian Conference on Intelligent Systems, BRACIS 2014
Y2 - 19 October 2014 through 23 October 2014
ER -