Order-Preserving Nonparametric Regression, with Applications to Conditional Distribution and Quantile Function Estimation

Peter Hall*, Hans Georg Müller

*Corresponding author for this work

    Research output: Contribution to journal › Article › peer-review

    12 Citations (Scopus)

    Abstract

    In some regression problems we observe a "response" $Y_{ti}$ to level $t$ of a "treatment" applied to an individual with level $X_i$ of a given characteristic, where it has been established that response is monotone increasing in the level of the treatment. A related problem arises when estimating conditional distributions, where the raw data are typically independent and identically distributed pairs $(X_i, Z_i)$, and $Y_{ti}$ denotes the proportion of $Z_i$'s that do not exceed $t$. We expect the regression means $g_t(x) = E(Y_{ti} \mid X_i = x)$ to enjoy the same order relation as the responses, that is, $g_s \le g_t$ whenever $s \le t$. This requirement is necessary to obtain bona fide conditional distribution functions, for example. If we estimate $g_t$ by passing a linear smoother through each dataset $\mathcal{X}_t = \{(X_i, Y_{ti}) : 1 \le i \le n\}$, then the order-preserving property is guaranteed if and only if the smoother has nonnegative weights. However, in such cases the estimators generally have high levels of boundary bias. On the other hand, the order-preserving property usually fails for linear estimators with low boundary bias, such as local linear estimators, or kernel estimators employing boundary kernels. This failure is generally most serious at boundaries of the distribution of the explanatory variables, and ironically it is often in just those places that estimation is of greatest interest, because responses there imply constraints on the larger population. In this article we suggest nonlinear, order-invariant estimators for nonparametric regression, and discuss their properties. The resulting estimators are applied to the estimation of conditional distribution functions at endpoints and also changepoints. The availability of bona fide distribution function estimators at endpoints also enables the computation of changepoint diagnostics that are based on differences in a suitable norm between two estimated conditional distribution functions, obtained from data that fall into one-sided bins.
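
The contrast the abstract draws between nonnegative-weight smoothers and local linear smoothers can be illustrated with a small sketch. This is not the authors' order-invariant estimator; it only shows the problem that motivates it. Everything here is an assumption made for illustration: the Epanechnikov kernel, the bandwidth 0.3, the simulated data, the choice of indicator responses 1{Z_i ≤ t}, and the helper names `nw_weights`, `local_linear_weights`, and `cdf_estimate`.

```python
import numpy as np

def epanechnikov(u):
    # Compactly supported kernel; zero outside [-1, 1].
    return 0.75 * np.clip(1.0 - u**2, 0.0, None)

def nw_weights(x_obs, x0, h):
    # Nadaraya-Watson weights: nonnegative and summing to one.
    k = epanechnikov((x_obs - x0) / h)
    return k / k.sum()

def local_linear_weights(x_obs, x0, h):
    # Equivalent-kernel weights of the local linear smoother at x0.
    # They sum to one but may be negative, especially at a boundary.
    d = x_obs - x0
    k = epanechnikov(d / h)
    s0, s1, s2 = k.sum(), (k * d).sum(), (k * d**2).sum()
    return k * (s2 - s1 * d) / (s0 * s2 - s1**2)

def cdf_estimate(weights, z_obs, t_grid):
    # Linear smoother applied to the indicator responses 1{Z_i <= t}.
    indicators = (z_obs[None, :] <= t_grid[:, None]).astype(float)
    return indicators @ weights

# Hypothetical data: design points on [0, 1], estimation at the endpoint x = 0.
rng = np.random.default_rng(0)
x_obs = np.linspace(0.0, 1.0, 51)
z_obs = x_obs + rng.normal(scale=0.5, size=x_obs.size)
t_grid = np.sort(z_obs)  # evaluate at every jump point of the estimate

w_nw = nw_weights(x_obs, 0.0, 0.3)
w_ll = local_linear_weights(x_obs, 0.0, 0.3)
F_nw = cdf_estimate(w_nw, z_obs, t_grid)
F_ll = cdf_estimate(w_ll, z_obs, t_grid)

print("min NW weight:", w_nw.min())
print("min local linear weight:", w_ll.min())
print("NW estimate nondecreasing in t:", bool(np.all(np.diff(F_nw) >= -1e-12)))
print("local linear estimate nondecreasing in t:", bool(np.all(np.diff(F_ll) >= -1e-12)))
```

In this setup each jump of the estimated distribution function at t = Z_i has size equal to the weight on observation i, so any negative weight produces a downward step (or a negative starting value), which is why the local linear version is not a bona fide distribution function at the endpoint.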

    Original language: English
    Pages (from-to): 598-608
    Number of pages: 11
    Journal: Journal of the American Statistical Association
    Volume: 98
    Issue number: 463
    Publication status: Published - Sept 2003
