Natural actor-critic for road traffic optimisation

Silvia Richter*, Douglas Aberdeen, Jin Yu

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding - Conference contribution (peer-reviewed)

79 Citations (Scopus)

Abstract

Current road-traffic optimisation practice around the world is a combination of hand-tuned policies with a small degree of automatic adaptation. Even state-of-the-art research controllers need good models of the road traffic, which cannot be obtained directly from existing sensors. We use a policy-gradient reinforcement learning approach to directly optimise the traffic signals, mapping currently deployed sensor observations to control signals. Our trained controllers are (theoretically) compatible with the traffic system used in Sydney and many other cities around the world. We apply two policy-gradient methods: (1) the recent natural actor-critic algorithm, and (2) a vanilla policy-gradient algorithm for comparison. Along the way we extend natural actor-critic approaches to work for distributed and online infinite-horizon problems.
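To make the approach summarised above more concrete, here is a minimal sketch of an online natural actor-critic update for a softmax policy over signal phases, in the style of compatible-feature natural-gradient methods. All names, dimensions, step sizes and the stand-in environment are assumptions for illustration; this is not the authors' implementation, their feature set, or the Sydney traffic system.

import numpy as np

# Minimal online natural actor-critic sketch (illustrative assumptions only).
rng = np.random.default_rng(0)

n_features = 8      # size of the sensor-observation feature vector (assumed)
n_actions = 4       # number of selectable signal phases (assumed)

theta = np.zeros((n_actions, n_features))   # policy parameters
w = np.zeros(n_actions * n_features)        # compatible-feature critic weights
v = np.zeros(n_features)                    # linear state-value baseline weights

alpha_actor, alpha_critic, gamma = 0.01, 0.1, 0.99

def softmax_policy(phi):
    """Action probabilities for observation features phi."""
    prefs = theta @ phi
    prefs -= prefs.max()                     # numerical stability
    p = np.exp(prefs)
    return p / p.sum()

def compatible_features(phi, a, probs):
    """Gradient of log pi(a | phi) with respect to theta, flattened."""
    grad = -np.outer(probs, phi)
    grad[a] += phi
    return grad.ravel()

def fake_env_step(phi, a):
    """Stand-in for a traffic simulator: random next observation and a
    purely illustrative reward (negative 'queue length')."""
    next_phi = rng.random(n_features)
    reward = -float(next_phi.sum())
    return next_phi, reward

phi = rng.random(n_features)
for t in range(10_000):
    probs = softmax_policy(phi)
    a = rng.choice(n_actions, p=probs)
    next_phi, r = fake_env_step(phi, a)

    # TD error against the linear state-value baseline
    delta = r + gamma * (v @ next_phi) - (v @ phi)
    v += alpha_critic * delta * phi

    # Advantage critic on compatible features; with compatible function
    # approximation, its weights w estimate the natural policy gradient
    psi = compatible_features(phi, a, probs)
    w += alpha_critic * (delta - w @ psi) * psi

    # Natural-gradient actor update: step along w
    theta += alpha_actor * w.reshape(theta.shape)

    phi = next_phi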

Original language: English
Title of host publication: Advances in Neural Information Processing Systems 19 - Proceedings of the 2006 Conference
Pages: 1169-1176
Number of pages: 8
Publication status: Published - 2007
Externally published: Yes
Event: 20th Annual Conference on Neural Information Processing Systems, NIPS 2006 - Vancouver, BC, Canada
Duration: 4 Dec 2006 - 7 Dec 2006

Publication series

Name: Advances in Neural Information Processing Systems
ISSN (Print): 1049-5258

Conference

Conference: 20th Annual Conference on Neural Information Processing Systems, NIPS 2006
Country/Territory: Canada
City: Vancouver, BC
Period: 4/12/06 - 7/12/06
