Performance of variable and function selection methods for estimating the nonlinear health effects of correlated chemical mixtures: A simulation study

Nina Lazarevic*, Luke D. Knibbs, Peter D. Sly, Adrian G. Barnett

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

13 Citations (Scopus)

Abstract

Statistical methods for identifying harmful chemicals in a correlated mixture often assume linearity in exposure-response relationships. Nonmonotonic relationships are increasingly recognized (eg, for endocrine-disrupting chemicals); however, the impact of nonmonotonicity on exposure selection has not been evaluated. In a simulation study, we assessed the performance of Bayesian kernel machine regression (BKMR), Bayesian additive regression trees (BART), Bayesian structured additive regression with spike-slab priors (BSTARSS), generalized additive models with double penalty (GAMDP) and thin plate shrinkage smoothers (GAMTS), multivariate adaptive regression splines (MARS), and lasso penalized regression. We simulated realistic exposure data based on pregnancy exposure to 17 phthalates and phenols in the US National Health and Nutrition Examination Survey using a multivariate copula. We simulated data sets of size N = 250 and compared methods across 32 scenarios, varying by model size and sparsity, signal-to-noise ratio, correlation structure, and exposure-response relationship shapes. We compared methods in terms of their sensitivity, specificity, and estimation accuracy. In most scenarios, BKMR, BSTARSS, GAMDP, and GAMTS achieved moderate to high sensitivity (0.52-0.98) and specificity (0.21-0.99). BART and MARS achieved high specificity (≥0.90), but low sensitivity in low signal-to-noise ratio scenarios (0.20-0.51). Lasso was highly sensitive (0.71-0.99), except for quadratic relationships (≤0.27). Penalized regression methods that assume linearity, such as lasso, may not be suitable for studies of environmental chemicals hypothesized to have nonmonotonic relationships with outcomes. Instead, BKMR, BSTARSS, GAMDP, and GAMTS are attractive methods for flexibly estimating the shapes of exposure-response relationships and selecting among correlated exposures.
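As a minimal illustration of the selection metrics named in the abstract, the sketch below computes sensitivity and specificity for one simulated data set. It assumes the usual definitions: sensitivity is the share of truly active exposures that a method selects, and specificity is the share of truly inactive exposures that it leaves out. The function name `selection_metrics`, the exposure indices, and the example selection are hypothetical and are not taken from the study.

```python
def selection_metrics(true_active, selected, n_exposures):
    """Return (sensitivity, specificity) of exposure selection.

    true_active : indices of exposures that truly affect the outcome
    selected    : indices of exposures the method selected
    n_exposures : total number of candidate exposures
    """
    true_active = set(true_active)
    selected = set(selected)
    true_inactive = set(range(n_exposures)) - true_active

    true_positives = len(true_active & selected)     # active and selected
    true_negatives = len(true_inactive - selected)   # inactive and not selected

    # Guard against empty groups (e.g. a null scenario with no active exposures).
    sensitivity = true_positives / len(true_active) if true_active else float("nan")
    specificity = true_negatives / len(true_inactive) if true_inactive else float("nan")
    return sensitivity, specificity


# Hypothetical scenario: 17 exposures, 4 of them truly active (indices 0-3).
# The method selects three active exposures and one inactive exposure.
sens, spec = selection_metrics({0, 1, 2, 3}, {0, 1, 2, 10}, n_exposures=17)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

In a simulation study these values would typically be averaged over many replicate data sets per scenario; the averaging scheme here is an assumption rather than a detail stated in the abstract.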

Original language: English
Pages (from-to): 3947-3967
Number of pages: 21
Journal: Statistics in Medicine
Volume: 39
Issue number: 27
Publication status: Published - 30 Nov 2020
Externally published: Yes
