Guiding GBFS through Learned Pairwise Rankings

Mingyu Hao, Felipe Trevizan, Sylvie Thiébaux, Patrick Ferber, Jörg Hoffmann

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

2 Citations (Scopus)

Abstract

We propose a new approach based on ranking to learn to guide Greedy Best-First Search (GBFS). As previous ranking approaches, ours is based on the observation that directly learning a heuristic function is overly restrictive, and that GBFS is capable of efficiently finding good plans for a much more flexible class of total quasi-orders over states. In order to learn an optimal ranking function, we introduce a new ranking framework capable of leveraging any neural network regression model and efficiently handling the training data through batching. Compared with previous ranking approaches for planning, ours does not require complex loss functions and allows training on states outside the optimal plan with minimal overhead. Our experiments on the domains of the latest planning competition learning track show that our approach substantially improves the coverage of the underlying neural network models without degrading plan quality.

Original languageEnglish
Title of host publicationProceedings of the 33rd International Joint Conference on Artificial Intelligence, IJCAI 2024
EditorsKate Larson
PublisherInternational Joint Conferences on Artificial Intelligence
Pages6724-6732
Number of pages9
ISBN (Electronic)9781956792041
Publication statusPublished - 2024
Event33rd International Joint Conference on Artificial Intelligence, IJCAI 2024 - Jeju, Korea, Republic of
Duration: 3 Aug 20249 Aug 2024

Publication series

NameIJCAI International Joint Conference on Artificial Intelligence
ISSN (Print)1045-0823

Conference

Conference33rd International Joint Conference on Artificial Intelligence, IJCAI 2024
Country/TerritoryKorea, Republic of
CityJeju
Period3/08/249/08/24

Fingerprint

Dive into the research topics of 'Guiding GBFS through Learned Pairwise Rankings'. Together they form a unique fingerprint.

Cite this