TY - GEN
T1 - Inventory Control with Partially Observable States
AU - Wang, Erli
AU - Kurniawati, Hanna
AU - Kroese, Dirk P.
N1 - Publisher Copyright:
Copyright © 2019 The Modelling and Simulation Society of Australia and New Zealand Inc. All rights reserved.
PY - 2019
Y1 - 2019
N2 - Consider a retailer who buys a range of commodities from a wholesaler and sells them to customers. At each time period, the retailer has to decide how much of each type of commodity to purchase, so as to maximize overall profit. This requires a balance between maximizing the amount of high-value customer demand that can be fulfilled and minimizing storage and delivery costs. Due to inaccuracies in inventory recording, misplaced products, market fluctuations, etc., these purchasing decisions must be made under partial observability of the amount of stocked goods and under uncertainty in the demand. A natural framework for such inventory control problems is the Partially Observable Markov Decision Process (POMDP). Key to the POMDP framework is that it selects the best actions to perform with respect to distributions over states, rather than a single state. Finding the optimal solution of a POMDP problem is computationally intractable, but the past decade has seen substantial advances in finding approximately optimal POMDP solutions, and POMDPs have started to become practical for many interesting problems. Despite these advances, approximate POMDP solvers do not perform well on most inventory control problems, due to the massive action space (i.e., the set of purchasing possibilities) of most such problems. Most POMDP-based methods limit the problem to a one-commodity scenario, which is far from reality. In this paper, we apply our recent POMDP-based method (QBASE) to multi-commodity inventory control. QBASE combines Monte Carlo Tree Search with a quantile-statistics-based Monte Carlo optimization method, namely the Cross-Entropy method, to quickly identify good actions without sweeping through the entire action space. This enables QBASE to substantially scale up our ability to compute good solutions to POMDPs with extremely large discrete action spaces (on the order of a million discrete actions).
We compare our solution to several commonly used inventory control methods, such as the (s, S) method, and to other state-of-the-art POMDP solvers. The results are promising, as QBASE demonstrates smarter purchasing behavior. For instance, if we combine maximum likelihood estimation with the commonly used (s, S) policy, the resulting policy is very far from optimal when the uncertainty must be represented as a non-unimodal distribution. Furthermore, the state-of-the-art POMDP solver can only generate a sub-optimal policy of always keeping the stocks of all commodities at a relatively high level, so as to meet as many demands as possible. In contrast, QBASE can generate a better policy, whereby a small amount of sales is sacrificed to keep the inventory levels of commodities with expensive storage costs and low value as low as possible, which leads to a higher profit.
AB - Consider a retailer who buys a range of commodities from a wholesaler and sells them to customers. At each time period, the retailer has to decide how much of each type of commodity to purchase, so as to maximize overall profit. This requires a balance between maximizing the amount of high-value customer demand that can be fulfilled and minimizing storage and delivery costs. Due to inaccuracies in inventory recording, misplaced products, market fluctuations, etc., these purchasing decisions must be made under partial observability of the amount of stocked goods and under uncertainty in the demand. A natural framework for such inventory control problems is the Partially Observable Markov Decision Process (POMDP). Key to the POMDP framework is that it selects the best actions to perform with respect to distributions over states, rather than a single state. Finding the optimal solution of a POMDP problem is computationally intractable, but the past decade has seen substantial advances in finding approximately optimal POMDP solutions, and POMDPs have started to become practical for many interesting problems. Despite these advances, approximate POMDP solvers do not perform well on most inventory control problems, due to the massive action space (i.e., the set of purchasing possibilities) of most such problems. Most POMDP-based methods limit the problem to a one-commodity scenario, which is far from reality. In this paper, we apply our recent POMDP-based method (QBASE) to multi-commodity inventory control. QBASE combines Monte Carlo Tree Search with a quantile-statistics-based Monte Carlo optimization method, namely the Cross-Entropy method, to quickly identify good actions without sweeping through the entire action space. This enables QBASE to substantially scale up our ability to compute good solutions to POMDPs with extremely large discrete action spaces (on the order of a million discrete actions).
We compare our solution to several commonly used inventory control methods, such as the (s, S) method, and to other state-of-the-art POMDP solvers. The results are promising, as QBASE demonstrates smarter purchasing behavior. For instance, if we combine maximum likelihood estimation with the commonly used (s, S) policy, the resulting policy is very far from optimal when the uncertainty must be represented as a non-unimodal distribution. Furthermore, the state-of-the-art POMDP solver can only generate a sub-optimal policy of always keeping the stocks of all commodities at a relatively high level, so as to meet as many demands as possible. In contrast, QBASE can generate a better policy, whereby a small amount of sales is sacrificed to keep the inventory levels of commodities with expensive storage costs and low value as low as possible, which leads to a higher profit.
KW - Inventory control problem
KW - Multi-commodity
KW - On-line POMDP solver
KW - Partially observable Markov decision process
UR - http://www.scopus.com/inward/record.url?scp=85086463601&partnerID=8YFLogxK
M3 - Conference contribution
T3 - 23rd International Congress on Modelling and Simulation - Supporting Evidence-Based Decision Making: The Role of Modelling and Simulation, MODSIM 2019
SP - 200
EP - 206
BT - 23rd International Congress on Modelling and Simulation - Supporting Evidence-Based Decision Making
A2 - Elsawah, S.
PB - Modelling and Simulation Society of Australia and New Zealand Inc (MSSANZ)
T2 - 23rd International Congress on Modelling and Simulation - Supporting Evidence-Based Decision Making: The Role of Modelling and Simulation, MODSIM 2019
Y2 - 1 December 2019 through 6 December 2019
ER -