TY - GEN
T1 - AOSO-LogitBoost
T2 - 29th International Conference on Machine Learning, ICML 2012
AU - Sun, Peng
AU - Reid, Mark D.
AU - Zhou, Jie
PY - 2012
Y1 - 2012
N2 - This paper presents an improvement to model learning when using multi-class LogitBoost for classification. Motivated by the statistical view, LogitBoost can be seen as additive tree regression. Two important factors in this setting are: 1) coupled classifier output due to a sum-to-zero constraint, and 2) the dense Hessian matrices that arise when computing tree node split gain and node value fittings. In general, this setting is too complicated for a tractable model learning algorithm. However, too aggressive simplification of the setting may lead to degraded performance. For example, the original LogitBoost is outperformed by ABC-LogitBoost due to the latter's more careful treatment of the above two factors. In this paper we propose techniques to address the two main difficulties of the LogitBoost setting: 1) we adopt a vector tree (i.e., each node value is vector) that enforces a sum-to-zero constraint, and 2) we use an adaptive block coordinate descent that exploits the dense Hessian when computing tree split gain and node values. Higher classification accuracy and faster convergence rates are observed for a range of public data sets when compared to both the original and the ABC-LogitBoost implementations.
AB - This paper presents an improvement to model learning when using multi-class LogitBoost for classification. Motivated by the statistical view, LogitBoost can be seen as additive tree regression. Two important factors in this setting are: 1) coupled classifier output due to a sum-to-zero constraint, and 2) the dense Hessian matrices that arise when computing tree node split gain and node value fittings. In general, this setting is too complicated for a tractable model learning algorithm. However, too aggressive simplification of the setting may lead to degraded performance. For example, the original LogitBoost is outperformed by ABC-LogitBoost due to the latter's more careful treatment of the above two factors. In this paper we propose techniques to address the two main difficulties of the LogitBoost setting: 1) we adopt a vector tree (i.e., each node value is vector) that enforces a sum-to-zero constraint, and 2) we use an adaptive block coordinate descent that exploits the dense Hessian when computing tree split gain and node values. Higher classification accuracy and faster convergence rates are observed for a range of public data sets when compared to both the original and the ABC-LogitBoost implementations.
UR - http://www.scopus.com/inward/record.url?scp=84867130926&partnerID=8YFLogxK
M3 - Conference contribution
SN - 9781450312851
T3 - Proceedings of the 29th International Conference on Machine Learning, ICML 2012
SP - 1087
EP - 1094
BT - Proceedings of the 29th International Conference on Machine Learning, ICML 2012
Y2 - 26 June 2012 through 1 July 2012
ER -