TY - JOUR
T1 - A multimodel inference approach to categorical variant choice
T2 - Construction, priming and frequency effects on the choice between full and contracted forms of am, are and is
AU - Barth, Danielle
AU - Kapatsinski, Vsevolod
PY - 2014
Y1 - 2014
N2 - The present paper presents a multimodel inference approach to linguistic variation, expanding on prior work by Kuperman and Bresnan (2012). We argue that corpus data often present the analyst with high model selection uncertainty. This uncertainty is inevitable given that language is highly redundant: every feature is predictable from multiple other features. However, uncertainty involved in model selection is ignored by the standard method of selecting the single best model and inferring the effects of the predictors under the assumption that the best model is true. Multimodel inference avoids committing to a single model. Rather, we make predictions based on the entire set of plausible models, with contributions of models weighted by the models' predictive value. We argue that multimodel inference is superior to model selection for both the I-Language goal of inferring the mental grammars that generated the corpus, and the E-Language goal of predicting characteristics of future speech samples from the community represented by the corpus. Applying multimodel inference to the classic problem of English auxiliary contraction, we show that the choice between multimodel inference and model selection matters in practice: the best model may contain predictors that are not significant when the full set of plausible models is considered, and may omit predictors that are significant considering the full set of models. We also contribute to the study of English auxiliary contraction. We document the effects of priming, contextual predictability, and specific syntactic constructions and provide evidence against effects of phonological context.
AB - The present paper presents a multimodel inference approach to linguistic variation, expanding on prior work by Kuperman and Bresnan (2012). We argue that corpus data often present the analyst with high model selection uncertainty. This uncertainty is inevitable given that language is highly redundant: every feature is predictable from multiple other features. However, uncertainty involved in model selection is ignored by the standard method of selecting the single best model and inferring the effects of the predictors under the assumption that the best model is true. Multimodel inference avoids committing to a single model. Rather, we make predictions based on the entire set of plausible models, with contributions of models weighted by the models' predictive value. We argue that multimodel inference is superior to model selection for both the I-Language goal of inferring the mental grammars that generated the corpus, and the E-Language goal of predicting characteristics of future speech samples from the community represented by the corpus. Applying multimodel inference to the classic problem of English auxiliary contraction, we show that the choice between multimodel inference and model selection matters in practice: the best model may contain predictors that are not significant when the full set of plausible models is considered, and may omit predictors that are significant considering the full set of models. We also contribute to the study of English auxiliary contraction. We document the effects of priming, contextual predictability, and specific syntactic constructions and provide evidence against effects of phonological context.
KW - Contraction
KW - English auxiliaries
KW - English copula
KW - Frequency
KW - Grammaticalization
KW - Mixed effects models
KW - Multimodel inference
KW - Reduction
UR - http://www.scopus.com/inward/record.url?scp=84958193783&partnerID=8YFLogxK
U2 - 10.1515/cllt-2014-0022
DO - 10.1515/cllt-2014-0022
M3 - Article
SN - 1613-7027
VL - 2014
SP - 203
EP - 260
JO - Corpus Linguistics and Linguistic Theory
JF - Corpus Linguistics and Linguistic Theory
ER -