Inventory Control with Partially Observable States

Erli Wang*, Hanna Kurniawati, Dirk P. Kroese

*Corresponding author for this work

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

    2 Citations (Scopus)

    Abstract

    Consider a retailer who buys a range of commodities from a wholesaler and sells them to customers. At each time period, the retailer has to decide how much of each type of commodity to purchase, so as to maximize overall profit. This requires a balance between maximizing the amount of high-valued customer demand that can be fulfilled and minimizing storage and delivery costs. Due to inaccuracies in inventory recording, misplaced products, market fluctuation, etc., these purchasing decisions must be made under partial observability of the amount of stocked goods and under uncertainty in the demand. A natural framework for such inventory control problems is the Partially Observable Markov Decision Process (POMDP). Key to a POMDP is that it chooses the best actions with respect to distributions over states, rather than a single state. Finding the optimal solution of a POMDP problem is computationally intractable, but the past decade has seen substantial advances in finding approximately optimal POMDP solutions, and POMDPs have started to become practical for many interesting problems. Despite these advances, approximate POMDP solvers do not perform well on most inventory control problems, due to the massive action space (i.e., purchasing possibilities) of most such problems. Most POMDP-based methods limit the problem to a one-commodity scenario, which is far from reality. In this paper, we apply our recent POMDP-based method (QBASE) to multi-commodity inventory control. QBASE combines Monte Carlo Tree Search with a quantile-statistics-based Monte Carlo method, namely the Cross-Entropy method for optimization, to quickly identify good actions without sweeping through the entire action space. This combination substantially scales up our ability to compute good solutions to POMDPs with extremely large discrete action spaces (on the order of a million discrete actions).
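As a rough illustration of how a Cross-Entropy search can find good order quantities without enumerating every purchasing possibility, the following is a minimal sketch. It is not QBASE: it has no tree search and no belief, and the function names, the parameters (`elite_frac`, `smoothing`) and the toy profit model are all assumptions made for this example.

```python
import random


def cross_entropy_search(score, n_items, max_order, n_samples=200,
                         elite_frac=0.1, smoothing=0.7, n_iters=30, seed=0):
    """Search order vectors in {0..max_order}^n_items for a high score.

    Hypothetical helper: keeps one categorical distribution per commodity,
    samples order vectors, and refits the distributions to the top
    quantile (the "elite" samples) at every iteration.
    """
    rng = random.Random(seed)
    levels = list(range(max_order + 1))
    # Start from a uniform distribution over order sizes for each commodity.
    probs = [[1.0 / len(levels)] * len(levels) for _ in range(n_items)]
    n_elite = max(1, int(elite_frac * n_samples))
    best_action, best_score = None, float("-inf")

    for _ in range(n_iters):
        samples = [
            tuple(rng.choices(levels, weights=probs[i])[0]
                  for i in range(n_items))
            for _ in range(n_samples)
        ]
        scored = sorted(((score(a), a) for a in samples), reverse=True)
        if scored[0][0] > best_score:
            best_score, best_action = scored[0]
        elite = [a for _, a in scored[:n_elite]]
        # Move each distribution toward the elite frequencies (smoothed).
        for i in range(n_items):
            counts = [0] * len(levels)
            for a in elite:
                counts[a[i]] += 1
            for k in range(len(levels)):
                freq = counts[k] / n_elite
                probs[i][k] = smoothing * freq + (1 - smoothing) * probs[i][k]
    return best_action, best_score


# Toy profit model (assumed): fixed demand, per-unit price and holding cost.
PRICE, HOLDING, DEMAND = [5, 1], [1, 2], [7, 4]


def toy_profit(order):
    return sum(p * min(q, d) - h * q
               for q, p, h, d in zip(order, PRICE, HOLDING, DEMAND))


action, value = cross_entropy_search(toy_profit, n_items=2, max_order=10)
print(action, value)
```

In this toy model the second commodity costs more to hold than it earns, so a good order vector buys none of it. In a POMDP setting the score would instead be an estimate of long-term value under the current belief.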
We compare our solution to several commonly used inventory control methods, such as the (s, S) method, and to other state-of-the-art POMDP solvers. The results are promising, as QBASE demonstrates smarter purchasing behavior. For instance, if we combine maximum likelihood with the commonly used (s, S) policy, the resulting policy is very far from optimal when the uncertainty must be represented as a non-unimodal distribution. Furthermore, the state-of-the-art POMDP solver can only generate a sub-optimal policy of always keeping the stocks of all commodities at a relatively high level, to meet as much demand as possible. In contrast, QBASE can generate a better policy, whereby a small number of sales is sacrificed to keep the inventory level of commodities with expensive storage cost and low value as low as possible, which in turn leads to a higher profit.
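To make the remark about maximum likelihood and the (s, S) policy concrete, here is a small sketch. It assumes the common convention of ordering up to S whenever the stock is at or below s; the belief values, thresholds and function names are invented for the example and are not taken from the paper.

```python
def s_S_order(stock, s, S):
    """Order up to S when stock is at or below s; otherwise order nothing."""
    return S - stock if stock <= s else 0


def ml_stock(belief):
    """Return the most likely stock level from a {stock: probability} belief."""
    return max(belief, key=belief.get)


# Assumed bimodal belief: the shelf is either nearly empty or nearly full.
belief = {2: 0.48, 18: 0.52}

estimate = ml_stock(belief)               # picks the slightly larger mode
order = s_S_order(estimate, s=5, S=20)    # so no order is placed
print(estimate, order)
```

Here the policy orders nothing, even though the belief assigns probability 0.48 to a stock of 2 units, in which case the same policy would have ordered 18. Collapsing a multimodal belief to a single point estimate discards exactly the information that a belief-based method is meant to use.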

    Original language: English
    Title of host publication: 23rd International Congress on Modelling and Simulation - Supporting Evidence-Based Decision Making
    Subtitle of host publication: The Role of Modelling and Simulation, MODSIM 2019
    Editors: S. Elsawah
    Publisher: Modelling and Simulation Society of Australia and New Zealand Inc (MSSANZ)
    Pages: 200-206
    Number of pages: 7
    ISBN (Electronic): 9780975840092
    Publication status: Published - 2019
    Event: 23rd International Congress on Modelling and Simulation - Supporting Evidence-Based Decision Making: The Role of Modelling and Simulation, MODSIM 2019 - Canberra, Australia
    Duration: 1 Dec 2019 to 6 Dec 2019

    Publication series

    Name: 23rd International Congress on Modelling and Simulation - Supporting Evidence-Based Decision Making: The Role of Modelling and Simulation, MODSIM 2019

    Conference

    Conference: 23rd International Congress on Modelling and Simulation - Supporting Evidence-Based Decision Making: The Role of Modelling and Simulation, MODSIM 2019
    Country/Territory: Australia
    City: Canberra
    Period: 1/12/19 to 6/12/19
