Imputation of Household Survey Data Using Linear Mixed Models

Luise Patricia Lago*, Robert Graham Clark

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

8 Citations (Scopus)

Abstract

Mixed models are regularly used in the analysis of clustered data, but have only recently been used for imputation of missing data. In household surveys where multiple people are selected from each household, imputation of missing values should preserve the structure pertaining to people within households and should not artificially change the apparent intracluster correlation (ICC). This paper focuses on the use of multilevel models for imputation of missing data in household surveys. In particular, the performance of a best linear unbiased predictor for both stochastic and deterministic imputation using a linear mixed model is compared to imputation based on a single-level linear model, both with and without information about household respondents. The evaluation is carried out in the context of imputing hourly wage rate in the Household, Income and Labour Dynamics in Australia (HILDA) Survey. Nonresponse is generated under various assumptions about the missingness mechanism for persons and households, and with low, moderate and high intra-household correlation, to assess the benefits of the multilevel imputation model under different conditions. The mixed model and the single-level model with information about the household respondent lead to clear improvements when the ICC is moderate or high, and when there is informative missingness.
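
The abstract does not give the model equations, so the following is only a minimal Python sketch of what deterministic and stochastic imputation from a best linear unbiased predictor can look like. It assumes a random-intercept model, y_ij = x_ij'beta + u_i + e_ij, with the fixed effects `beta` and the variance components `sigma_u2` (between households) and `sigma_e2` (within households) treated as known. The function names and all parameters are hypothetical and are not taken from the paper.

```python
import numpy as np


def blup_household_effect(residuals, sigma_u2, sigma_e2):
    """Return the BLUP of a household random intercept and its prediction variance.

    residuals: y - x'beta for the responding members of one household.
    Assumes known variance components (an illustrative simplification).
    """
    n = len(residuals)
    if n == 0:
        # No respondents in the household: fall back to the prior mean and variance.
        return 0.0, sigma_u2
    # Shrinkage factor: more respondents or a higher ICC means less shrinkage to zero.
    shrink = sigma_u2 / (sigma_u2 + sigma_e2 / n)
    return shrink * float(np.mean(residuals)), sigma_u2 * (1.0 - shrink)


def impute_household(x_missing, beta, residuals_obs, sigma_u2, sigma_e2, rng=None):
    """Impute missing values for the nonresponding members of one household.

    rng=None gives deterministic imputation (fixed part plus BLUP).
    Passing a numpy Generator gives stochastic imputation: one shared draw of
    the household effect plus independent person-level noise.
    """
    x_missing = np.atleast_2d(x_missing)
    u_hat, u_var = blup_household_effect(residuals_obs, sigma_u2, sigma_e2)
    fixed = x_missing @ beta
    if rng is None:
        return fixed + u_hat
    u_draw = rng.normal(u_hat, np.sqrt(u_var))  # shared within the household
    e_draw = rng.normal(0.0, np.sqrt(sigma_e2), size=fixed.shape)
    return fixed + u_draw + e_draw
```

In this sketch, a single-level model without household information corresponds to setting the household effect to zero for every household. Sharing one draw of the household effect among all imputed members is one way to avoid artificially lowering the apparent ICC under stochastic imputation.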

Original language: English
Pages (from-to): 169-187
Number of pages: 19
Journal: Australian and New Zealand Journal of Statistics
Volume: 57
Issue number: 2
Publication status: Published - 1 Jun 2015
Externally published: Yes
