Bandwidth choice for nonparametric classification

Peter Hall*, Kee Hoon Kang

*Corresponding author for this work

    Research output: Contribution to journal › Article › peer-review

    51 Citations (Scopus)

    Abstract

    It is shown that, for kernel-based classification with univariate distributions and two populations, optimal bandwidth choice has a dichotomous character. If the two densities cross at just one point, where their curvatures have the same signs, then minimum Bayes risk is achieved using bandwidths which are an order of magnitude larger than those which minimize pointwise estimation error. On the other hand, if the curvature signs are different, or if there are multiple crossing points, then bandwidths of conventional size are generally appropriate. The range of different modes of behavior is narrower in multivariate settings. There, the optimal size of bandwidth is generally the same as that which is appropriate for pointwise density estimation. These properties motivate empirical rules for bandwidth choice.
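
    To make the setting concrete, the following is a minimal sketch of a kernel-based classifier for two univariate populations, evaluated over a small grid of bandwidths. It is an illustration only, not the authors' procedure or their empirical bandwidth rule. The Gaussian kernel, the common bandwidth for both samples, the equal prior probabilities, the two normal populations N(-1.5, 1) and N(1.5, 1), the sample sizes, and all function names are assumptions chosen for the example.

```python
import math
import random


def gaussian_kde(sample, h):
    """Return x -> Gaussian kernel density estimate with bandwidth h."""
    n = len(sample)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample)

    return density


def kernel_classifier(sample_f, sample_g, h, prior_f=0.5):
    """Assign x to population 0 (f) or 1 (g) by comparing weighted density estimates."""
    f_hat = gaussian_kde(sample_f, h)
    g_hat = gaussian_kde(sample_g, h)

    def classify(x):
        return 0 if prior_f * f_hat(x) >= (1.0 - prior_f) * g_hat(x) else 1

    return classify


def error_rate(classify, test_f, test_g):
    """Fraction of held-out points assigned to the wrong population."""
    wrong = sum(classify(x) != 0 for x in test_f)
    wrong += sum(classify(x) != 1 for x in test_g)
    return wrong / (len(test_f) + len(test_g))


# Assumed toy populations: two normal densities that cross once, at x = 0.
rng = random.Random(0)
train_f = [rng.gauss(-1.5, 1.0) for _ in range(100)]
train_g = [rng.gauss(1.5, 1.0) for _ in range(100)]
test_f = [rng.gauss(-1.5, 1.0) for _ in range(300)]
test_g = [rng.gauss(1.5, 1.0) for _ in range(300)]

# Empirical misclassification rate for each candidate bandwidth.
errors = {}
for h in (0.05, 0.2, 0.5, 1.0, 2.0):
    clf = kernel_classifier(train_f, train_g, h)
    errors[h] = error_rate(clf, test_f, test_g)
    print(f"h = {h:4.2f}  test error = {errors[h]:.3f}")
```

    Comparing the printed error rates across the grid gives a rough, sample-dependent picture of how classification error responds to bandwidth. A single run on one simulated data set does not by itself demonstrate the asymptotic orders of magnitude described in the abstract.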

    Original language: English
    Pages (from-to): 284-306
    Number of pages: 23
    Journal: Annals of Statistics
    Volume: 33
    Issue number: 1
    Publication status: Published - Feb 2005
