TY - GEN
T1 - Daily rainfall data infilling with a stochastic model
AU - Jin, Huidong
AU - Shao, Quanxi
AU - Crimp, Steven
N1 - Publisher Copyright:
Copyright © 2019 The Modelling and Simulation Society of Australia and New Zealand Inc. All rights reserved.
PY - 2019
Y1 - 2019
N2 - Most models assume complete data without missing values; for example, a complete daily weather time series is used to simulate crop biomass accumulation and production, predict pest risk, and assess climate change impacts. However, historical data series often have missing values or contain only values aggregated over a period of time. For example, daily rainfall amounts are normally recorded by hand on working days, so data for weekends and holidays are sometimes missing and reported as a total for those periods. In addition, some data are still missing even with today's automated weather stations, owing to instrument failure, power outages, operational interruptions and so on. Daily rainfall time series may therefore suffer from several missing-data problems: (1) individual missing days, (2) consecutive missing days (missing segments), and (3) consecutive missing days for which an aggregated total is available. Aggregation of daily data is most common following weekends or holidays. Several methods exist to infill these missing values, such as distributing accumulated rainfall evenly over the accumulation period, spatial interpolation from records of surrounding stations, and climatology. These methods often underestimate the proportion of dry days, i.e., they give more wet days than normal, and smooth out extreme daily rainfall amounts. To infill these data gaps appropriately, we investigate a time-varying stochastic model that simulates daily rainfall time series based on true observations, including aggregated observations. A complete rainfall time series is constructed using a three-state Markov chain model to simulate the occurrence of dry, wet and extremely wet days, whilst rainfall amounts for wet and extremely wet days are modelled using a truncated Gamma distribution and an extended Burr XII distribution, respectively. Smooth changes in state transition probabilities within a year are captured by time-varying model inputs.
The proposed technique can infill missing data with or without aggregated observations. Experiments on three Australian stations from different climatic zones illustrate its superior performance over a de facto operational approach in Australia and the classic climatology method in terms of maintaining daily rainfall characteristics such as dry day proportions, dry day spells, and rainfall amount distributions. For example, average dry day proportions for these three stations are around 80% based on true daily rainfall records, and around 50% for the daily data infilled by our proposed stochastic method. Because the missing daily data show no substantial pattern, these proportions are more reasonable than the roughly 20% obtained for daily data infilled by the de facto operational approach.
KW - Extremes
KW - Markov chain
KW - Missing data
KW - Temporal disaggregation
UR - http://www.scopus.com/inward/record.url?scp=85086445021&partnerID=8YFLogxK
M3 - Conference contribution
T3 - 23rd International Congress on Modelling and Simulation - Supporting Evidence-Based Decision Making: The Role of Modelling and Simulation, MODSIM 2019
SP - 698
EP - 704
BT - 23rd International Congress on Modelling and Simulation - Supporting Evidence-Based Decision Making
A2 - Elsawah, S.
PB - Modelling and Simulation Society of Australia and New Zealand Inc (MSSANZ)
T2 - 23rd International Congress on Modelling and Simulation - Supporting Evidence-Based Decision Making: The Role of Modelling and Simulation, MODSIM 2019
Y2 - 1 December 2019 through 6 December 2019
ER -