Survival-oriented reinforcement learning model: An efficient and robust deep reinforcement learning algorithm for autonomous driving problem

Changkun Ye, Huimin Ma*, Xiaoqin Zhang, Kai Zhang, Shaodi You

*Corresponding author for this work

    Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

    8 Citations (Scopus)

    Abstract

    Using Deep Reinforcement Learning (DRL) algorithms for autonomous driving tasks usually yields unsatisfactory performance, because these algorithms lack robustness and a means of escaping local optima. In this article, we design a Survival-Oriented Reinforcement Learning (SORL) model that tackles these problems by making survival, rather than maximizing total reward, the first priority. In the SORL model, we model the autonomous driving task as a Constrained Markov Decision Process (CMDP) and introduce a Negative-Avoidance Function to learn from previous failures. The SORL model greatly speeds up the training process and improves the robustness of a standard Deep Reinforcement Learning algorithm.
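
    For orientation, a CMDP is commonly written as a reward-maximization problem with a bound on an expected cumulative cost. The block below is the generic textbook form only, not necessarily the formulation used in the paper; the symbols (reward r, cost c, cost budget d, discount factor \gamma, policy \pi) are assumptions introduced here for illustration.

    ```latex
    \max_{\pi} \; \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t) \right]
    \quad \text{subject to} \quad
    \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, c(s_t, a_t) \right] \le d
    ```

    In a driving setting, c could for example flag a collision or leaving the road, so that the constraint expresses the "survival first" priority described in the abstract.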

    Original language: English
    Title of host publication: Image and Graphics - 9th International Conference, ICIG 2017, Revised Selected Papers
    Editors: Xiangwei Kong, Yao Zhao, David Taubman
    Publisher: Springer Verlag
    Pages: 417-429
    Number of pages: 13
    ISBN (Print): 9783319715889
    DOIs
    Publication status: Published - 2017
    Event: 9th International Conference on Image and Graphics, ICIG 2017 - Shanghai, China
    Duration: 13 Sept 2017 to 15 Sept 2017

    Publication series

    Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Volume: 10667 LNCS
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Conference

    Conference: 9th International Conference on Image and Graphics, ICIG 2017
    Country/Territory: China
    City: Shanghai
    Period: 13/09/17 to 15/09/17
