zbMATH — the first resource for mathematics

Spectral method and regularized MLE are both optimal for top-\(K\) ranking. (English) Zbl 1425.62038
Authors’ abstract: This paper is concerned with the problem of top-\(K\) ranking from pairwise comparisons. Given a collection of \(n\) items and a few pairwise comparisons across them, one wishes to identify the set of \(K\) items that receive the highest ranks. To tackle this problem, we adopt the logistic parametric model-the Bradley-Terry-Luce model, where each item is assigned a latent preference score, and where the outcome of each pairwise comparison depends solely on the relative scores of the two items involved. Recent works have made significant progress toward characterizing the performance (e.g., the mean square error for estimating the scores) of several classical methods, including the spectral method and the maximum likelihood estimator (MLE). However, where they stand regarding top-\(K\) ranking remains unsettled.
We demonstrate that under a natural random sampling model, the spectral method alone, or the regularized MLE alone, is minimax optimal in terms of the sample complexity – the number of paired comparisons needed to ensure exact top-\(K\) identification, for the fixed dynamic range regime. This is accomplished via optimal control of the entrywise error of the score estimates. We complement our theoretical studies by numerical experiments, confirming that both methods yield low entrywise errors for estimating the underlying scores. Our theory is established via a novel leave-one-out trick, which proves effective for analyzing both iterative and noniterative procedures. Along the way, we derive an elementary eigenvector perturbation bound for probability transition matrices, which parallels the Davis-Kahan sin \(\Theta\) theorem for symmetric matrices. This also allows us to close the gap between the \(\ell_2\) error upper bound for the spectral method and the minimax lower limit.

62F07 Statistical ranking and selection procedures
62B10 Statistical aspects of information-theoretic topics
62M15 Inference from stochastic processes and spectral analysis
Full Text: DOI Euclid arXiv
[1] Abbe, E., Fan, J., Wang, K. and Zhong, Y. (2017). Entrywise eigenvector analysis of random matrices with low expected rank. ArXiv preprint. Available at ArXiv:1709.09565.
[2] Agarwal, A., Agarwal, S., Assadi, S. and Khanna, S. (2017). Learning with limited rounds of adaptivity: Coin tossing, multi-armed bandits, and ranking from pairwise comparisons. In Conference on Learning Theory 39-75.
[3] Ammar, A. and Shah, D. (2011). Ranking: Compare, don’t score. In 2011 49th Annual Allerton Conference on Communication, Control, and Computing (Allerton) 776-783. DOI:10.1109/Allerton.2011.6120246.
[4] Ammar, A. and Shah, D. (2012). Efficient rank aggregation using partial data. In SIGMETRICS40 355-366. ACM, New York.
[5] Baltrunas, L., Makcinskas, T. and Ricci, F. (2010). Group recommendations with rank aggregation and collaborative filtering. In Proceedings of the Fourth ACM Conference on Recommender Systems. RecSys ’10 119-126. ACM, New York.
[6] Bradley, R. A. and Terry, M. E. (1952). Rank analysis of incomplete block designs. I. The method of paired comparisons. Biometrika39 324-345. · Zbl 0047.12903
[7] Bubeck, S. (2015). Convex optimization: Algorithms and complexity. Found. Trends Mach. Learn.8 231-357. · Zbl 1365.90196
[8] Busa-Fekete, R., Szörényi, B., Weng, P., Cheng, W. and Hüllermeier, E. (2013). Top-\(k\) selection based on adaptive sampling of noisy preferences. In International Conference on Machine Learning.
[9] Chen, Y. and Candes, E. (2016). The projected power method: An efficient algorithm for joint alignment from pairwise differences. Comm. Pure Appl. Math. To appear. · Zbl 06919696
[10] Chen, Y. and Candès, E. J. (2017). Solving random quadratic systems of equations is nearly as easy as solving linear systems. Comm. Pure Appl. Math.70 822-883. · Zbl 1379.90024
[11] Chen, Y. and Suh, C. (2015). Spectral MLE: Top-\(K\) rank aggregation from pairwise comparisons. In International Conference on Machine Learning 371-380.
[12] Chen, X., Bennett, P. N., Collins-Thompson, K. and Horvitz, E. (2013). Pairwise ranking aggregation in a crowdsourced setting. In ACM International Conference on Web Search and Data Mining 193-202. ACM, New York.
[13] Chen, X., Gopi, S., Mao, J. and Schneider, J. (2017). Competitive analysis of the top-\(K\) ranking problem. In Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms 1245-1264. SIAM, Philadelphia, PA. · Zbl 1410.68339
[14] Chen, Y., Chi, Y., Fan, J. and Ma, C. (2018). Gradient descent with random initialization: Fast global convergence for nonconvex phase retrieval. Available at ArXiv:1803.07726. · Zbl 1415.90086
[15] Chen, Y., Fan, J., Ma, C. and Wang, K. (2019). Supplement to “Spectral method and regularized MLE are both optimal for top-\(K\) ranking.” DOI:10.1214/18-AOS1745SUPP.
[16] Chung, F. R. K. (1997). Spectral Graph Theory. CBMS Regional Conference Series in Mathematics92. Published for the Conference Board of the Mathematical Sciences, Washington, DC; by the Amer. Math. Soc., Providence, RI. · Zbl 0867.05046
[17] Davis, C. and Kahan, W. M. (1970). The rotation of eigenvectors by a perturbation. III. SIAM J. Numer. Anal.7 1-46. · Zbl 0198.47201
[18] Dwork, C., Kumar, R., Naor, M. and Sivakumar, D. (2001). Rank aggregation methods for the Web. In International Conference on World Wide Web 613-622.
[19] El Karoui, N. (2018). On the impact of predictor geometry on the performance on high-dimensional ridge-regularized generalized robust regression estimators. Probab. Theory Related Fields170 95-175. · Zbl 1407.62060
[20] Eldridge, J., Belkin, M. and Wang, Y. (2017). Unperturbed: Spectral analysis beyond Davis-Kahan. ArXiv preprint. Available at ArXiv:1706.06516. · Zbl 1406.60014
[21] Fan, J., Wang, W. and Zhong, Y. (2018). An \(\ell _{\infty}\) eigenvector perturbation bound and its application. J. Mach. Learn. Res.18 1-42.
[22] Ford, L. R. Jr. (1957). Solution of a ranking problem from binary comparisons. Amer. Math. Monthly64 28-33.
[23] Hajek, B., Oh, S. and Xu, J. (2014). Minimax-optimal inference from partial rankings. In Neural Information Processing Systems 1475-1483.
[24] Heckel, R., Shah, N. B., Ramchandran, K. and Wainwright, M. J. (2016). Active ranking from pairwise comparisons and when parametric assumptions don’t help. ArXiv preprint. Available at ArXiv:1606.08842.
[25] Hunter, D. R. (2004). MM algorithms for generalized Bradley-Terry models. Ann. Statist.32 384-406. · Zbl 1105.62359
[26] Jamieson, K. G. and Nowak, R. D. (2011). Active ranking using pairwise comparisons. In Neural Information Processing Systems 2240-2248.
[27] Jang, M., Kim, S., Suh, C. and Oh, S. (2016). Top-\(K\) ranking from pairwise comparisons: When spectral ranking is optimal. ArXiv preprint. Available at arXiv:1603.04153.
[28] Javanmard, A. and Montanari, A. (2018). Debiasing the lasso: Optimal sample size for Gaussian designs. Ann. Statist.46 2593-2622. · Zbl 1407.62270
[29] Jiang, X., Lim, L.-H., Yao, Y. and Ye, Y. (2011). Statistical ranking and combinatorial Hodge theory. Math. Program.127 203-244. · Zbl 1210.90142
[30] Keshavan, R. H., Montanari, A. and Oh, S. (2010). Matrix completion from noisy entries. J. Mach. Learn. Res.11 2057-2078. · Zbl 1242.62069
[31] Koltchinskii, V. and Lounici, K. (2016). Asymptotics and concentration bounds for bilinear forms of spectral projectors of sample covariance. Ann. Inst. Henri Poincaré Probab. Stat.52 1976-2013. · Zbl 1353.62053
[32] Koltchinskii, V. and Xia, D. (2016). Perturbation of linear forms of singular vectors under Gaussian noise. In High Dimensional Probability VII. Progress in Probability71 397-423. Springer, Cham. · Zbl 1353.15034
[33] Lu, Y. and Negahban, S. N. (2014). Individualized rank aggregation using nuclear norm regularization. ArXiv preprint. Available at ArXiv:1410.0860.
[34] Luce, R. D. (1959). Individual Choice Behavior: A Theoretical Analysis. Wiley, New York; Chapman & Hall, London. · Zbl 0093.31708
[35] Ma, C., Wang, K., Chi, Y. and Chen, Y. (2017). Implicit regularization in nonconvex statistical estimation: Gradient descent converges linearly for phase retrieval, matrix completion and blind deconvolution. ArXiv preprint. Available at ArXiv:1711.10467.
[36] Negahban, S., Oh, S. and Shah, D. (2017). Rank centrality: Ranking from pairwise comparisons. Oper. Res.65 266-287. · Zbl 1414.91133
[37] Negahban, S., Oh, S., Thekumparampil, K. K. and Xu, J. (2017). Learning from comparisons and choices. ArXiv preprint. Available at ArXiv:1704.07228. · Zbl 1461.68192
[38] Pananjady, A., Mao, C., Muthukumar, V., Wainwright, M. J. and Courtade, T. A. (2017). Worst-case vs average-case design for estimation from fixed pairwise comparisons. ArXiv preprint. Available at ArXiv:1707.06217.
[39] Rajkumar, A. and Agarwal, S. (2014). A statistical convergence perspective of algorithms for rank aggregation from pairwise data. In International Conference on Machine Learning I-118-I-126.
[40] Rajkumar, A. and Agarwal, S. (2016). When can we rank well from comparisons of \(O(n\log n)\) non-actively chosen pairs? In Conference on Learning Theory 1376-1401.
[41] Rohe, K., Chatterjee, S. and Yu, B. (2011). Spectral clustering and the high-dimensional stochastic blockmodel. Ann. Statist.39 1878-1915. · Zbl 1227.62042
[42] Shah, N. B. and Wainwright, M. J. (2015). Simple, robust and optimal ranking from pairwise comparisons. ArXiv preprint. Available at arXiv:1512.08949. · Zbl 06982955
[43] Shah, N. B., Balakrishnan, S., Guntuboyina, A. and Wainwright, M. J. (2017). Stochastically transitive models for pairwise comparisons: Statistical and computational issues. IEEE Trans. Inform. Theory63 934-959. · Zbl 1364.94253
[44] Soufiani, H. A., Chen, W. Z., Parkes, D. C. and Xia, L. (2013). Generalized method-of-moments for rank aggregation. In Proceedings of the 26th International Conference on Neural Information Processing Systems. NIPS’13 2706-2714.
[45] Suh, C., Tan, V. Y. F. and Zhao, R. (2017). Adversarial top-\(K\) ranking. IEEE Trans. Inform. Theory63 2201-2225. · Zbl 1366.94173
[46] Sur, P., Chen, Y. and Candès, E. J. (2017). The likelihood ratio test in high-dimensional logistic regression is asymptotically a rescaled Chi-square. ArXiv preprint. Available at ArXiv:1706.01191.
[47] Tropp, J. A. (2015). An introduction to matrix concentration inequalities. Found. Trends Mach. Learn.8 1-230. · Zbl 1391.15071
[48] Zhong, Y. and Boumal, N. (2017). Near-optimal bounds for phase synchronization. Available at arXiv:1703.06605. · Zbl 1396.90068
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.