TY - JOUR

T1 - Bootstrapping nonparametric density estimators with empirically chosen bandwidths

AU - Hall, Peter

AU - Kang, Kee Hoon

PY - 2001/10

Y1 - 2001/10

N2 - We examine the way in which empirical bandwidth choice affects distributional properties of nonparametric density estimators. Two bandwidth selection methods are considered in detail: local and global plug-in rules. Particular attention is focussed on whether the accuracy of distributional bootstrap approximations is appreciably influenced by using the resample version ĥ*, rather than the sample version ĥ, of an empirical bandwidth. It is shown theoretically that, in marked contrast to similar problems in more familiar settings, no general first-order theoretical improvement can be expected when using the resampling version. In the case of local plug-in rules, the inability of the bootstrap to accurately reflect biases of the components used to construct the bandwidth selector means that the bootstrap distribution of ĥ* is unable to capture some of the main properties of the distribution of ĥ. If the second derivative component is slightly undersmoothed then some improvements are possible through using ĥ*, but they would be difficult to achieve in practice. On the other hand, for global plug-in methods, both ĥ and ĥ* are such good approximations to an optimal, deterministic bandwidth that the variations of either can be largely ignored, at least at a first-order level. Thus, for quite different reasons in the two cases, the computational burden of varying an empirical bandwidth across resamples is difficult to justify.

AB - We examine the way in which empirical bandwidth choice affects distributional properties of nonparametric density estimators. Two bandwidth selection methods are considered in detail: local and global plug-in rules. Particular attention is focussed on whether the accuracy of distributional bootstrap approximations is appreciably influenced by using the resample version ĥ*, rather than the sample version ĥ, of an empirical bandwidth. It is shown theoretically that, in marked contrast to similar problems in more familiar settings, no general first-order theoretical improvement can be expected when using the resampling version. In the case of local plug-in rules, the inability of the bootstrap to accurately reflect biases of the components used to construct the bandwidth selector means that the bootstrap distribution of ĥ* is unable to capture some of the main properties of the distribution of ĥ. If the second derivative component is slightly undersmoothed then some improvements are possible through using ĥ*, but they would be difficult to achieve in practice. On the other hand, for global plug-in methods, both ĥ and ĥ* are such good approximations to an optimal, deterministic bandwidth that the variations of either can be largely ignored, at least at a first-order level. Thus, for quite different reasons in the two cases, the computational burden of varying an empirical bandwidth across resamples is difficult to justify.

KW - Bootstrap methods

KW - Confidence interval

KW - Edgeworth expansion

KW - Kernel methods

KW - Nonparametric estimation

KW - Plug-in rules

KW - Rate of convergence

KW - Second-order accuracy

KW - Smoothing parameter

UR - http://www.scopus.com/inward/record.url?scp=0035470898&partnerID=8YFLogxK

U2 - 10.1214/aos/1013203461

DO - 10.1214/aos/1013203461

M3 - Article

SN - 0090-5364

VL - 29

SP - 1443

EP - 1468

JO - Annals of Statistics

JF - Annals of Statistics

IS - 5

ER -