TY - JOUR
T1 - On bagging and nonlinear estimation
AU - Friedman, Jerome H.
AU - Hall, Peter
PY - 2007/3/1
Y1 - 2007/3/1
AB - We propose an elementary model for the way in which stochastic perturbations of a statistical objective function, such as a negative log-likelihood, produce excessive nonlinear variation of the resulting estimator. Theory for the model is transparently simple, and is used to provide new insight into the main factors that affect performance of bagging. In particular, it is shown that if the perturbations are sufficiently symmetric then bagging will not significantly increase bias; and if the perturbations also offer opportunities for cancellation then bagging will reduce variance. For the first property it is sufficient that the third derivative of a perturbation vanish locally, and for the second, that second and fourth derivatives have opposite signs. Functions that satisfy these conditions resemble sinusoids. Therefore, our results imply that bagging will reduce the nonlinear variation, as measured by either variance or mean-squared error, produced in an estimator by sinusoid-like, stochastic perturbations of the objective function. Analysis of our simple model also suggests relationships between the results obtained using different with-replacement and without-replacement bagging schemes. We simulate regression trees in settings that are far more complex than those explicitly addressed by the model, and find that these relationships are generally borne out.
KW - Bias
KW - Bootstrap
KW - Half-sampling
KW - Regression tree
KW - Variance reduction
KW - With-replacement sampling
KW - Without-replacement sampling
UR - http://www.scopus.com/inward/record.url?scp=33750494541&partnerID=8YFLogxK
DO - 10.1016/j.jspi.2006.06.002
M3 - Article
SN - 0378-3758
VL - 137
SP - 669
EP - 683
JO - Journal of Statistical Planning and Inference
JF - Journal of Statistical Planning and Inference
IS - 3
ER -