TY - JOUR
T1 - Random effects misspecification and its consequences for prediction in generalized linear mixed models
AU - Vu, Quan
AU - Hui, Francis K.C.
AU - Muller, Samuel
AU - Welsh, A. H.
N1 - © 2025 The Author(s)
PY - 2026/1
Y1 - 2026/1
N2 - When fitting generalized linear mixed models, choosing the random effects distribution is an important decision. As random effects are unobserved, misspecification of their distribution is a real possibility. Thus, the consequences of random effects misspecification for point prediction and prediction inference of random effects in generalized linear mixed models need to be investigated. A combination of theory, simulation, and a real application is used to explore the effect of using the common normality assumption for the random effects distribution when the correct specification is a mixture of normal distributions, focusing on the impacts on point prediction, mean squared prediction errors, and prediction intervals. Results show that the level of shrinkage for the predicted random effects can differ greatly under the two random effect distributions, and so is susceptible to misspecification. Also, the unconditional mean squared prediction errors for the random effects are almost always larger under the misspecified normal random effects distribution, while results for the mean squared prediction errors conditional on the random effects are more complicated but remain generally larger under the misspecified distribution (especially when the true random effect is close to the mean of one of the component distributions in the true mixture distribution). Results for prediction intervals indicate that the overall coverage probability is, in contrast, not greatly impacted by misspecification. It is concluded that misspecifying the random effects distribution can affect prediction of random effects, and greater caution is recommended when adopting the normality assumption in generalized linear mixed models.
AB - When fitting generalized linear mixed models, choosing the random effects distribution is an important decision. As random effects are unobserved, misspecification of their distribution is a real possibility. Thus, the consequences of random effects misspecification for point prediction and prediction inference of random effects in generalized linear mixed models need to be investigated. A combination of theory, simulation, and a real application is used to explore the effect of using the common normality assumption for the random effects distribution when the correct specification is a mixture of normal distributions, focusing on the impacts on point prediction, mean squared prediction errors, and prediction intervals. Results show that the level of shrinkage for the predicted random effects can differ greatly under the two random effect distributions, and so is susceptible to misspecification. Also, the unconditional mean squared prediction errors for the random effects are almost always larger under the misspecified normal random effects distribution, while results for the mean squared prediction errors conditional on the random effects are more complicated but remain generally larger under the misspecified distribution (especially when the true random effect is close to the mean of one of the component distributions in the true mixture distribution). Results for prediction intervals indicate that the overall coverage probability is, in contrast, not greatly impacted by misspecification. It is concluded that misspecifying the random effects distribution can affect prediction of random effects, and greater caution is recommended when adopting the normality assumption in generalized linear mixed models.
KW - Clustered data
KW - Empirical best predictor
KW - Longitudinal data
KW - Mean squared error of prediction
KW - Prediction inference
UR - https://www.scopus.com/pages/publications/105011988301
U2 - 10.1016/j.csda.2025.108254
DO - 10.1016/j.csda.2025.108254
M3 - Article
AN - SCOPUS:105011988301
SN - 0167-9473
VL - 213
JO - Computational Statistics and Data Analysis
JF - Computational Statistics and Data Analysis
M1 - 108254
ER -