RAO: An algorithm for chance-constrained POMDP's

Pedro Santana, Sylvie Thiébaux, Brian Williams

    Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

    60 Citations (Scopus)

    Abstract

    Autonomous agents operating in partially observable stochastic environments often face the problem of optimizing expected performance while bounding the risk of violating safety constraints. Such problems can be modeled as chance-constrained POMDP's (CCPOMDP's). Our first contribution is a systematic derivation of execution risk in POMDP domains, which improves upon how chance constraints are handled in the constrained POMDP literature. Second, we present RAO, a heuristic forward search algorithm producing optimal, deterministic, finite-horizon policies for CCPOMDP's. In addition to the utility heuristic, RAO leverages an admissible execution risk heuristic to quickly detect and prune overly risky policy branches. Third, we demonstrate the usefulness of RAO in two challenging domains of practical interest: power supply restoration and autonomous science agents.

    Original language: English
    Title of host publication: 30th AAAI Conference on Artificial Intelligence, AAAI 2016
    Publisher: AAAI Press
    Pages: 3308-3314
    Number of pages: 7
    ISBN (Electronic): 9781577357605
    Publication status: Published - 2016
    Event: 30th AAAI Conference on Artificial Intelligence, AAAI 2016 - Phoenix, United States
    Duration: 12 Feb 2016 to 17 Feb 2016

    Publication series

    Name: 30th AAAI Conference on Artificial Intelligence, AAAI 2016

    Conference

    Conference: 30th AAAI Conference on Artificial Intelligence, AAAI 2016
    Country/Territory: United States
    City: Phoenix
    Period: 12/02/16 to 17/02/16
