Concurrent probabilistic temporal planning with policy-gradients

Douglas Aberdeen*, Olivier Buffet

*Corresponding author for this work

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

    7 Citations (Scopus)

    Abstract

    We present an any-time concurrent probabilistic temporal planner that includes continuous and discrete uncertainties and metric functions. Our approach is a direct policy search that attempts to optimise a parameterised policy using gradient ascent. Low memory use, function approximation methods, and factorisation of the policy allow us to scale to challenging domains. This Factored Policy Gradient (FPG) Planner also attempts to optimise both steps to goal and the probability of success. We compare the FPG planner to other planners on concurrent probabilistic temporal planning (CPTP) domains, and on simpler but better-studied probabilistic non-temporal domains.
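    As a rough illustration of direct policy search with a factored, parameterised policy, the sketch below trains one independent logistic policy per action with a REINFORCE-style likelihood-ratio update. It is not the FPG planner's implementation: the abstract does not specify the gradient estimator, and the one-step task, the reward, the running-average baseline, and every name in the code (`start_probs`, `toy_reward`, `train`) are assumptions made for this example only.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def start_probs(weights, obs):
    """One independent logistic policy per action: P(start action k | obs)."""
    return [sigmoid(sum(w * o for w, o in zip(w_k, obs))) for w_k in weights]


def toy_reward(decisions):
    # Hypothetical one-step task (an assumption, not from the paper):
    # reward 1 only if action 0 is started and action 1 is not.
    return 1.0 if decisions == [1, 0] else 0.0


def train(episodes=5000, step_size=0.1, seed=0):
    rng = random.Random(seed)
    obs = [1.0]               # a single constant (bias) feature
    weights = [[0.0], [0.0]]  # factored policy: one parameter vector per action
    baseline = 0.0            # running average reward, used to reduce variance
    for _ in range(episodes):
        probs = start_probs(weights, obs)
        # Each action's policy independently decides whether to start it.
        decisions = [1 if rng.random() < p else 0 for p in probs]
        reward = toy_reward(decisions)
        advantage = reward - baseline
        baseline += 0.01 * (reward - baseline)
        for w_k, p, a in zip(weights, probs, decisions):
            # Gradient of the log Bernoulli likelihood w.r.t. w_k is (a - p) * obs.
            for i, o in enumerate(obs):
                w_k[i] += step_size * advantage * (a - p) * o  # gradient ascent
    return weights, obs


if __name__ == "__main__":
    learned_weights, features = train()
    print(start_probs(learned_weights, features))
```

    Because each action keeps its own small parameter vector, memory grows with the number of actions and features rather than with the number of states, which is the kind of scaling benefit the abstract attributes to factorisation.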

    Original language: English
    Title of host publication: ICAPS 2007, 17th International Conference on Automated Planning and Scheduling
    Publisher: Association for the Advancement of Artificial Intelligence, AAAI
    Pages: 10-17
    Number of pages: 8
    ISBN (Print): 9781577353447
    Publication status: Published - 2007
    Event: ICAPS 2007, 17th International Conference on Automated Planning and Scheduling - Providence, RI, United States
    Duration: 22 Sept 2007 to 26 Sept 2007

    Publication series

    Name: ICAPS 2007, 17th International Conference on Automated Planning and Scheduling

    Conference

    Conference: ICAPS 2007, 17th International Conference on Automated Planning and Scheduling
    Country/Territory: United States
    City: Providence, RI
    Period: 22/09/07 to 26/09/07
