Symbolic dynamic programming for continuous state and action MDPs

Zahra Zamani*, Scott Sanner, Cheng Fang

*Corresponding author for this work

    Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

    17 Citations (Scopus)

    Abstract

    Many real-world decision-theoretic planning problems are naturally modeled using both continuous state and action (CSA) spaces, yet little work has provided exact solutions for the case of continuous actions. In this work, we propose a symbolic dynamic programming (SDP) solution to obtain the optimal closed-form value function and policy for CSA-MDPs with multivariate continuous state and actions, discrete noise, piecewise linear dynamics, and piecewise linear (or restricted piecewise quadratic) reward. Our key contribution over previous SDP work is to show how the continuous action maximization step in the dynamic programming backup can be evaluated optimally and symbolically - a task which amounts to symbolic constrained optimization subject to unknown state parameters; we further integrate this technique to work with an efficient and compact data structure for SDP - the extended algebraic decision diagram (XADD). We demonstrate empirical results on a didactic nonlinear planning example and two domains from operations research to show the first automated exact solution to these problems.

    Original languageEnglish
    Title of host publicationAAAI-12 / IAAI-12 - Proceedings of the 26th AAAI Conference on Artificial Intelligence and the 24th Innovative Applications of Artificial Intelligence Conference
    Pages1839-1845
    Number of pages7
    Publication statusPublished - 2012
    Event26th AAAI Conference on Artificial Intelligence and the 24th Innovative Applications of Artificial Intelligence Conference, AAAI-12 / IAAI-12 - Toronto, ON, Canada
    Duration: 22 Jul 201226 Jul 2012

    Publication series

    NameProceedings of the National Conference on Artificial Intelligence
    Volume3

    Conference

    Conference26th AAAI Conference on Artificial Intelligence and the 24th Innovative Applications of Artificial Intelligence Conference, AAAI-12 / IAAI-12
    Country/TerritoryCanada
    CityToronto, ON
    Period22/07/1226/07/12

    Fingerprint

    Dive into the research topics of 'Symbolic dynamic programming for continuous state and action MDPs'. Together they form a unique fingerprint.

    Cite this