TY - GEN
T1 - Symbolic dynamic programming for continuous state and observation POMDPs
AU - Zamani, Zahra
AU - Sanner, Scott
AU - Poupart, Pascal
AU - Kersting, Kristian
PY - 2012
Y1 - 2012
N2 - Point-based value iteration (PBVI) methods have proven extremely effective for finding (approximately) optimal dynamic programming solutions to partially-observable Markov decision processes (POMDPs) when a set of initial belief states is known. However, no PBVI work has provided exact point-based backups for both continuous state and observation spaces, which we tackle in this paper. Our key insight is that while there may be an infinite number of observations, there are only a finite number of continuous observation partitionings that are relevant for optimal decision-making when a finite, fixed set of reachable belief states is considered. To this end, we make two important contributions: (1) we show how previous exact symbolic dynamic programming solutions for continuous state MDPs can be generalized to continuous state POMDPs with discrete observations, and (2) we show how recently developed symbolic integration methods allow this solution to be extended to PBVI for continuous state and observation POMDPs with potentially correlated, multivariate continuous observation spaces.
UR - http://www.scopus.com/inward/record.url?scp=84877729851&partnerID=8YFLogxK
M3 - Conference contribution
SN - 9781627480031
T3 - Advances in Neural Information Processing Systems
SP - 1394
EP - 1402
BT - Advances in Neural Information Processing Systems 25
T2 - 26th Annual Conference on Neural Information Processing Systems 2012, NIPS 2012
Y2 - 3 December 2012 through 6 December 2012
ER -