TY - GEN
T1 - Practical linear value-approximation techniques for first-order MDPs
AU - Sanner, Scott
AU - Boutilier, Craig
PY - 2006
Y1 - 2006
N2 - Recent work on approximate linear programming (ALP) techniques for first-order Markov Decision Processes (FOMDPs) represents the value function linearly w.r.t. a set of first-order basis functions and uses linear programming techniques to determine suitable weights. This approach offers the advantage that it does not require simplification of the first-order value function, and allows one to solve FOMDPs independently of a specific domain instantiation. In this paper, we address several questions to enhance the applicability of this work: (1) Can we extend the first-order ALP framework to approximate policy iteration and if so, how do these two algorithms compare? (2) Can we automatically generate basis functions and evaluate their impact on value function quality? (3) How can we decompose intractable problems with universally quantified rewards into tractable subproblems? We propose answers to these questions along with a number of novel optimizations and provide a comparative empirical evaluation on problems from the ICAPS 2004 Probabilistic Planning Competition.
UR - http://www.scopus.com/inward/record.url?scp=77957878103&partnerID=8YFLogxK
M3 - Conference contribution
SN - 0974903922
SN - 9780974903927
T3 - Proceedings of the 22nd Conference on Uncertainty in Artificial Intelligence, UAI 2006
SP - 409
EP - 417
BT - Proceedings of the 22nd Conference on Uncertainty in Artificial Intelligence, UAI 2006
T2 - 22nd Conference on Uncertainty in Artificial Intelligence, UAI 2006
Y2 - 13 July 2006 through 16 July 2006
ER -