TY - JOUR
T1 - Smoothing multivariate performance measures
AU - Zhang, Xinhua
AU - Saha, Ankan
AU - Vishwanathan, S. V. N.
PY - 2012/12
Y1 - 2012/12
N2 - Optimizing multivariate performance measures is an important task in Machine Learning. Joachims (2005) introduced a Support Vector Method whose underlying optimization problem is commonly solved by cutting plane methods (CPMs) such as SVM-Perf and BMRM. It can be shown that CPMs converge to an ε accurate solution in O(1/(λε)) iterations, where λ is the trade-off parameter between the regularizer and the loss function. Motivated by the impressive convergence rate of CPMs on a number of practical problems, it was conjectured that these rates can be further improved. We disprove this conjecture in this paper by constructing counterexamples. However, surprisingly, we further discover that these problems are not inherently hard, and we develop a novel smoothing strategy, which in conjunction with Nesterov's accelerated gradient method, can find an ε accurate solution in O*(min{1/ε, 1/√(λε)}) iterations. Computationally, our smoothing technique is also particularly advantageous for optimizing multivariate performance scores such as precision/recall break-even point and ROCArea; the cost per iteration remains the same as that of CPMs. Empirical evaluation on some of the largest publicly available data sets shows that our method converges significantly faster than CPMs without sacrificing generalization ability.
KW - Max-margin methods
KW - Multivariate performance measures
KW - Non-smooth optimization
KW - Smoothing
KW - Support vector machines
UR - http://www.scopus.com/inward/record.url?scp=84873417303&partnerID=8YFLogxK
M3 - Article
SN - 1532-4435
VL - 13
SP - 3623
EP - 3680
JO - Journal of Machine Learning Research
JF - Journal of Machine Learning Research
ER -