TY - JOUR
T1 - GEE-Assisted Forward Regression for Spatial Latent Variable Models
AU - Hui, Francis K.C.
N1 - Publisher Copyright: © 2022 American Statistical Association, Institute of Mathematical Statistics, and Interface Foundation of North America.
PY - 2022
Y1 - 2022
AB - Multivariate spatial data, where multiple responses are recorded at a set of spatial locations, are widely collected in many disciplines. One common approach for analyzing such data is spatial generalized linear latent variable models (spatial GLLVMs), where the latent variables are used to model both the spatial correlation between locations and correlations between responses. However, inference such as variable selection for spatial GLLVMs is computationally demanding, as the marginal likelihood involves a high-dimensional and often intractable integral. To overcome this, we propose to use spatial generalized estimating equations (GEEs) to perform fast, GEE-assisted forward regression for spatial GLLVMs. Focusing on counts and nonnegative continuous responses, we use spatial GEEs to build a forward solution path by choosing the candidate variable which maximizes a score statistic at each point on the path. A model is then selected from this path based on a modified score information criterion. The proposed approach is computationally efficient, relying only on GEEs which are quick to update, coupled with a novel theoretical result linking the coefficients from spatial GEEs to those of spatial GLLVMs. We show that the proposed approach can asymptotically identify all truly important nonzero predictors in the underlying spatial GLLVM. Simulations demonstrate that, when the data are generated from a sparse spatial GLLVM, GEE-assisted forward regression performs well at recovering this sparsity, while taking only a fraction of the computation time required to fit just a single (saturated) spatial GLLVM. Supplementary materials for this article are available online.
KW - Factor analysis
KW - Generalized estimating equations
KW - Information criterion
KW - Model selection
KW - Spatial statistics
KW - Variable selection
UR - http://www.scopus.com/inward/record.url?scp=85132669325&partnerID=8YFLogxK
U2 - 10.1080/10618600.2022.2058002
DO - 10.1080/10618600.2022.2058002
M3 - Article
SN - 1061-8600
VL - 31
SP - 1013
EP - 1024
JO - Journal of Computational and Graphical Statistics
JF - Journal of Computational and Graphical Statistics
IS - 4
ER -