Causal bandits: Learning good interventions via causal inference

Finnian Lattimore, Tor Lattimore, Mark D. Reid

    Research output: Contribution to journal › Conference article (peer-reviewed)


    Abstract

    We study the problem of using causal models to improve the rate at which good interventions can be learned online in a stochastic environment. Our formalism combines multi-arm bandits and causal inference to model a novel type of bandit feedback that is not exploited by existing approaches. We propose a new algorithm that exploits the causal feedback and prove a bound on its simple regret that is strictly better (in all quantities) than algorithms that do not use the additional causal information.
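    For intuition about the kind of feedback the abstract refers to, the sketch below sets up a toy "parallel" causal bandit: independent binary causes X_1, ..., X_N determine a reward Y, and every round reveals the realized values of all causes alongside the reward, so a single round informs estimates for many interventions at once. The graph, the reward model, the variable names, and the observation-only strategy are illustrative assumptions, not the algorithm proposed in the paper.

```python
"""Toy sketch of the causal-feedback idea (illustrative assumptions only,
not the paper's algorithm): N independent binary causes X_1..X_N feed a
reward Y, and every round reveals all X_i alongside Y.  Because the causes
are independent, E[Y | X_i = v] equals E[Y | do(X_i = v)], so purely
observational rounds already estimate the value of every intervention."""

import numpy as np

rng = np.random.default_rng(0)

N = 5                                 # number of binary cause variables (assumed)
q = rng.uniform(0.2, 0.8, size=N)     # P(X_i = 1) under no intervention

def sample_round(action=None):
    """action=None observes; action=(i, v) performs do(X_i = v)."""
    x = rng.binomial(1, q)            # natural values of all causes
    if action is not None:
        i, v = action
        x[i] = v                      # intervention overrides the natural value
    # Reward depends only on X_0 in this toy model (assumption).
    y = rng.binomial(1, 0.9 if x[0] == 1 else 0.3)
    return x, y                       # causal feedback: full assignment + reward

# Estimate E[Y | do(X_i = v)] for all 2N interventions from observational
# rounds alone, exploiting the full (x, y) feedback each round provides.
arms = [(i, v) for i in range(N) for v in (0, 1)]
counts = np.zeros(len(arms))
sums = np.zeros(len(arms))

for _ in range(5000):
    x, y = sample_round()             # observe only: do()
    for k, (i, v) in enumerate(arms):
        if x[i] == v:                 # this round also informs do(X_i = v)
            counts[k] += 1
            sums[k] += y

means = sums / np.maximum(counts, 1)
best = arms[int(np.argmax(means))]
print("estimated best intervention:", best, "value: %.3f" % means.max())
# A standard bandit would have to pull each of the 2N arms directly; here one
# stream of observations informs them all.  Values of X_i that are rare under
# observation would still call for direct interventions, which is the
# trade-off the paper's algorithm and simple-regret bound address.
```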

    Original language: English
    Pages (from-to): 1189-1197
    Number of pages: 9
    Journal: Advances in Neural Information Processing Systems
    Publication status: Published - 2016
    Event: 30th Annual Conference on Neural Information Processing Systems, NIPS 2016 - Barcelona, Spain
    Duration: 5 Dec 2016 - 10 Dec 2016
