Abstract
Sufficient dimension reduction (SDR) is an attractive approach to regression modelling. However, despite its rich literature and growing popularity in applications, surprisingly little research has been done on how to perform SDR for clustered data, such as commonly arise in longitudinal studies. Indeed, current popular SDR methods have been mostly based on a marginal estimating equation approach. In this article, we propose a new approach to SDR for clustered data based on a combination of finite mixture modelling and mixed effects regression. Finite mixture models offer a flexible means of estimating the fixed effects central subspace, based on slicing the space up and probabilistically assigning observations to each slice (mixture component). Dimension reduction is achieved by having the mixing proportions vary only through the sufficient fixed effect predictors. We then incorporate random effects as a natural means of accounting for correlations within clusters. We employ a Monte Carlo expectation–maximisation algorithm to estimate the model parameters and fixed effects central subspace, and discuss methods for associated uncertainty quantification and prediction. Simulation studies demonstrate that our approach performs strongly against both estimating equation methods for estimating the fixed effects central subspace and SDR methods which do not account for within-cluster correlation. Finally, we apply the proposed approach to a data set on air pollutant monitoring across 13 stations in the Eastern United States.
Original language | English |
---|---|
Pages (from-to) | 133-157 |
Number of pages | 25 |
Journal | Australian and New Zealand Journal of Statistics |
Volume | 64 |
Issue number | 2 |
Publication status | Published - Jun 2022 |