An Adaptive, Automatic Multiple-Case Deletion Technique for Detecting Influence in Regression

    Research output: Contribution to journal › Article › peer-review

    9 Citations (Scopus)

    Abstract

    Critical to any regression analysis is the identification of observations that exert a strong influence on the fitted regression model. Traditional regression influence statistics such as Cook's distance and DFFITS, each based on deleting single observations, can fail in the presence of multiple influential observations if these influential observations "mask" one another, or if other effects such as "swamping" occur. Masking refers to the situation where an observation reveals itself as influential only after one or more other observations are deleted. Swamping occurs when points that are not actually outliers/influential are declared to be so because of the effects on the model of other unusual observations. One computationally expensive solution to these problems is the use of influence statistics that delete multiple rather than single observations. In this article, we build on previous work to produce a computationally feasible algorithm for detecting an unknown number of influential observations in the presence of masking. An important difference between our proposed algorithm and existing methods is that we focus on the data that remain after observations are deleted, rather than on the deleted observations themselves. Further, our approach uses a novel confirmatory step designed to provide a secondary assessment of identified observations. Supplementary materials for this article are available online.
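The masking effect described in the abstract can be illustrated with a small numerical sketch. This is not the authors' algorithm; it only contrasts single-case deletion with deleting a pair of observations at once, using the standard Cook-type distance between the full-data and reduced-data coefficient estimates. The toy data, the helper names `fit_ols` and `deletion_distance`, and the choice of NumPy are all assumptions made for illustration.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients for design matrix X and response y."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def deletion_distance(X, y, subset):
    """Cook-type distance for deleting every row index in `subset` at once."""
    n, p = X.shape
    beta_full = fit_ols(X, y)
    resid = y - X @ beta_full
    s2 = resid @ resid / (n - p)          # residual variance from the full fit
    keep = np.setdiff1d(np.arange(n), subset)
    beta_del = fit_ols(X[keep], y[keep])  # refit without the deleted rows
    diff = beta_full - beta_del
    return float(diff @ (X.T @ X) @ diff / (p * s2))

# Toy data (invented for illustration): ten points exactly on y = x,
# plus two identical outliers at (20, 0) that can mask one another.
x = np.concatenate([np.arange(10.0), [20.0, 20.0]])
y = np.concatenate([np.arange(10.0), [0.0, 0.0]])
X = np.column_stack([np.ones_like(x), x])

# Single-case deletion: one distance per observation.
single = [deletion_distance(X, y, [i]) for i in range(len(y))]
# Multiple-case deletion: remove both suspected outliers together.
pair = deletion_distance(X, y, [10, 11])

print("largest single-case distance:", max(single))
print("distance for deleting both outliers:", pair)
```

In this toy example every single-case distance stays below the common rule-of-thumb cutoff of 1, because deleting one outlier leaves its twin in place to keep pulling the fitted line. Deleting both together produces a far larger distance. Checking all subsets in this way grows combinatorially with the subset size, which is the computational cost the abstract refers to.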

    Original language: English
    Pages (from-to): 408-417
    Number of pages: 10
    Journal: Technometrics
    Volume: 57
    Issue number: 3
    Publication status: Published - 3 Jul 2015
