zbMATH — the first resource for mathematics

Upper bounds and aggregation in bipartite ranking. (English) Zbl 1336.62068
Summary: One main focus of learning theory is to find optimal rates of convergence. In classification, it is possible to obtain optimal fast rates (faster than $$n^{-1/2}$$) in a minimax sense. Moreover, using an aggregation procedure, the algorithms are adaptive to the parameters of the class of distributions. Here, we investigate this issue in the bipartite ranking framework. We design a ranking rule by aggregating estimators of the regression function. We use exponential weights based on the empirical ranking risk. Under several assumptions on the class of distribution, we show that this procedure is adaptive to the margin parameter and smoothness parameter and achieves the same rates as in the classification framework. Moreover, we state a minimax lower bound that establishes the optimality of the aggregation procedure in a specific case.

MSC:
 62F07 Statistical ranking and selection procedures 62C20 Minimax procedures in statistical decision theory 62G08 Nonparametric regression and quantile regression
Keywords:
ranking; aggregation; minimax rates
Full Text:
References:
 [1] S. Agarwal, T. Graepel, R. Herbrich, S. Har-Peled, and D. Roth. Generalization bounds for the Area Under the ROC Curve. The Journal of Machine Learning Research , 6:393-425, 2005. · Zbl 1222.68129 [2] Pierre Alquier and Karim Lounici. Pac-bayesian bounds for sparse regression estimation with exponential weights. EJS , 5:127-145, 2011. · Zbl 1274.62463 [3] J.-Y. Audibert and A. Tsybakov. Fast learning rates for plug-in classifiers. Ann. Statist. , 35(2):608-633, 2007. · Zbl 1118.62041 [4] J. Y. Audibert. Fast learning rates in statistical inference through aggregation. Annals of Statistics , 37:1591-1646, 2009. · Zbl 1360.62167 [5] P. L. Bartlett, M. I. Jordan, and J. D. McAuliffe. Convexity, classification and risk bounds. J. Amer. Statist. Assoc. , 101:138-156, 2006. · Zbl 1118.62330 [6] S. Boucheron, O. Bousquet, and G. Lugosi. Theory of Classification: A Survey of Some Recent Advances. ESAIM: Probability and Statistics , 9:323-375, 2005. · Zbl 1136.62355 [7] Nicolo Cesa-Bianchi and Gabor Lugosi. Prediction, Learning, and Games . Cambridge University Press, 2006. · Zbl 1114.91001 [8] S. Clémençon, G. Lugosi, and N. Vayatis. Ranking and empirical risk minimization of U-statistics. Ann. Statist. , 36(2):844-874, 2008. · Zbl 1181.68160 [9] S. Clémençon and S. Robbiano. Minimax learning rates for bipartite ranking and plug-in rules. In Proceedings of the 28th international Conference on Machine Learning , ICML’11, pages 441-448, 2011. [10] S. Clémençon and N. Vayatis. Tree-based ranking methods. IEEE Transactions on Information Theory , 55(9):4316-4336, 2009. · Zbl 1367.62047 [11] S. Clémençon and N. Vayatis. Overlaying classifiers: a practical approach to optimal scoring. Constructive Approximation , 32(3):619-648, 2010. · Zbl 1200.62063 [12] A. Dalalyan and A. B. Tsybakov. Aggregation by exponential weighting, sharp pac-bayesian bounds and sparsity. Mach. Learn. , 72(1-2):39-61, 2008. [13] D. M. Green and J. A. Swets. Signal detection theory and psychophysics . Wiley, 1966. [14] R. M. Dudley. Uniform Central Limit Theorems . Cambridge University Press, 1999. · Zbl 0951.60033 [15] Y. Freund, R. D. Iyer, R. E. Schapire, and Y. Singer. An efficient boosting algorithm for combining preferences. The Journal of Machine Learning Research , 4:933-969, 2003. · Zbl 1098.68652 [16] A. N. Kolmogorov and V. M. Tikhomirov. $$\epsilon$$-entropy and $$\epsilon$$-capacity of sets in functional spaces. Amer. Math. Soc. Translations Ser. 2, , 17:277-364, 1961. · Zbl 0133.06703 [17] V. Koltchinskii and O. Beznosova. Exponential convergence rates in classification. In Proceedings of COLT’05 , 2005. · Zbl 1137.68546 [18] G. Lecué. Optimal oracle inequality for aggregation of classifiers under low noise condition. In COLT , 2006. [19] G. Lecué. Classification with minimax fast rates for classes of bayes rules with sparse representation. Electronic Journal of Statistics , 2:741-773, 2008. · Zbl 1320.62146 [20] O. V. Lepski, E. Mammen, and V. G. Spokoiny. Optimal spatial adaptation to inhomogeneous smoothness: an approach based on kernel estimates with variable bandwidth selectors. Annals Statistics , 25:929-947, 1997. · Zbl 0885.62044 [21] P. Massart. Some applications of concentration inequalities to statistics. Ann. Fac. Sci. Toulouse Math. , 9:245-303, 2000. · Zbl 0986.62002 [22] P. Massart. Concentration inequalities and model selection . Lecture Notes in Mathematics. Springer, 2006. [23] P. Massart and E. Nédélec. Risk bounds for statistical learning. Ann. Statist. , 34(5), 2006. · Zbl 1108.62007 [24] J.-B Monnier. Classification via local multi-resolution projections. EJS , 6:382-420, 2012. · Zbl 1274.62251 [25] P. Rigollet and A. Tsybakov. Sparse estimation by exponential weighting. Statistical Science , 27:558-575, 2011. · Zbl 1215.62043 [26] C. Rudin. Ranking with a P-Norm Push. In Proceedings of COLT , 2006. · Zbl 1143.68559 [27] N. Srebro, K. Sridharan, and A. Tewari. Smoothness, low noise and fast rates. In Proceedings of NIPS . 2010. [28] A. Tsybakov. Optimal aggregation of classifiers in statistical learning. Ann. Statist. , 32(1):135-166, 2004. · Zbl 1105.62353 [29] T. Zhang. Statistical behavior and consistency of classification methods based on convex risk minimization (with discussion). Annals of Statistics , 32:56-85, 2004. · Zbl 1105.62323
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.