Chen, Pinhan; Gao, Chao; Zhang, Anderson Y. Optimal full ranking from pairwise comparisons. (English) Zbl 1539.62064 Ann. Stat. 50, No. 3, 1775-1805 (2022). Summary: We consider the problem of ranking \(n\) players from partial pairwise comparison data under the Bradley-Terry-Luce model. For the first time in the literature, the minimax rate of this ranking problem is derived with respect to the Kendall’s tau distance that measures the difference between two rank vectors by counting the number of inversions. The minimax rate of ranking exhibits a transition between an exponential rate and a polynomial rate depending on the magnitude of the signal-to-noise ratio of the problem. To the best of our knowledge, this phenomenon is unique to full ranking and has not been seen in any other statistical estimation problem. To achieve the minimax rate, we propose a divide-and-conquer ranking algorithm that first divides the \(n\) players into groups of similar skills and then computes local MLE within each group. The optimality of the proposed algorithm is established by a careful approximate independence argument between the two steps. Cited in 1 Document MSC: 62F07 Statistical ranking and selection procedures 62J15 Paired and multiple comparisons; multiple testing Keywords:Bradley-Terry-Luce model; divide-and-conquer; full ranking; minimax risk Software:Quicksort; TrueSkill; trueskill PDFBibTeX XMLCite \textit{P. Chen} et al., Ann. Stat. 50, No. 3, 1775--1805 (2022; Zbl 1539.62064) Full Text: DOI arXiv References: [1] BALTRUNAS, L., MAKCINSKAS, T. and RICCI, F. (2010). Group recommendations with rank aggregation and collaborative filtering. In Proceedings of the Fourth ACM Conference on Recommender Systems 119-126. [2] BEAUDOIN, D. and SWARTZ, T. (2018). A computationally intensive ranking system for paired comparison data. Oper. Res. Perspect. 5 105-112. · doi:10.1016/j.orp.2018.03.002 [3] BOUMAL, N. (2013). On intrinsic Cramér-Rao bounds for Riemannian submanifolds and quotient manifolds. IEEE Trans. Signal Process. 61 1809-1821. · Zbl 1393.94653 · doi:10.1109/TSP.2013.2242068 [4] Bradley, R. A. and Terry, M. E. (1952). Rank analysis of incomplete block designs. I. The method of paired comparisons. Biometrika 39 324-345. · Zbl 0047.12903 · doi:10.2307/2334029 [5] BRAVERMAN, M. and MOSSEL, E. (2008). Noisy sorting without resampling. In Proceedings of the Nineteenth Annual ACM-SIAM Symposium on Discrete Algorithms 268-276. ACM, New York. · Zbl 1192.94077 [6] BRAVERMAN, M. and MOSSEL, E. (2009). Sorting from noisy information. arXiv preprint. Available at arXiv:0910.1191. · Zbl 1192.94077 [7] CAO, D., HE, X., MIAO, L., AN, Y., YANG, C. and HONG, R. (2018). Attentive group recommendation. In The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval 645-654. [8] CAO, Z., QIN, T., LIU, T.-Y., TSAI, M.-F. and LI, H. (2007). Learning to rank: From pairwise approach to listwise approach. In Proceedings of the 24th International Conference on Machine Learning 129-136. [9] CHEN, P., GAO, C. and ZHANG, A. Y. (2022). Partial recovery for top-\(k\) ranking: Optimality of MLE and sub-optimality of spectral method. Ann. Statist. 50 1618-1652. · Zbl 1539.62063 [10] CHEN, P., GAO, C. and ZHANG, A. Y. (2022). Supplement to “Optimal full ranking from pairwise comparisons.” https://doi.org/10.1214/22-AOS2175SUPP [11] CHEN, X., GOPI, S., MAO, J. and SCHNEIDER, J. (2017). Competitive analysis of the top-\(K\) ranking problem. In Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms 1245-1264. SIAM, Philadelphia, PA. · Zbl 1410.68339 · doi:10.1137/1.9781611974782.81 [12] Chen, Y., Fan, J., Ma, C. and Wang, K. (2019). Spectral method and regularized MLE are both optimal for top-\(K\) ranking. Ann. Statist. 47 2204-2235. · Zbl 1425.62038 · doi:10.1214/18-AOS1745 [13] CHEN, Y. and SUH, C. (2015). Spectral mle: Top-k rank aggregation from pairwise comparisons. In International Conference on Machine Learning 371-380. [14] CHOO, E. U. and WEDLEY, W. C. (2004). A common framework for deriving preference values from pairwise comparison matrices. Comput. Oper. Res. 31 893-908. · Zbl 1043.62063 [15] COLLIER, O. and DALALYAN, A. (2013). Permutation estimation and minimax matching thresholds. · Zbl 1360.62262 [16] COLLIER, O. and DALALYAN, A. S. (2016). Minimax rates in permutation estimation for feature matching. J. Mach. Learn. Res. 17 Paper No. 6, 31. · Zbl 1360.62262 [17] COSSOCK, D. and ZHANG, T. (2006). Subset ranking using regression. In Learning Theory. Lecture Notes in Computer Science 4005 605-619. Springer, Berlin. · Zbl 1143.68527 · doi:10.1007/11776420_44 [18] CSATÓ, L. (2013). Ranking by pairwise comparisons for Swiss-system tournaments. CEJOR Cent. Eur. J. Oper. Res. 21 783-803. · Zbl 1339.91033 · doi:10.1007/s10100-012-0261-8 [19] DIACONIS, P. and GRAHAM, R. L. (1977). Spearman’s footrule as a measure of disarray. J. Roy. Statist. Soc. Ser. B 39 262-268. · Zbl 0375.62045 [20] DWORK, C., KUMAR, R., NAOR, M. and SIVAKUMAR, D. (2001). Rank aggregation methods for the web. In Proceedings of the 10th International Conference on World Wide Web 613-622. [21] Erdős, P. and Rényi, A. (1960). On the evolution of random graphs. Magy. Tud. Akad. Mat. Kut. Intéz. Közl. 5 17-61. · Zbl 0103.16301 [22] GAO, C. (2017). Phase transitions in approximate ranking. arXiv preprint. Available at arXiv:1711.11189. [23] GAO, C. and ZHANG, A. Y. (2019). Iterative algorithm for discrete structure recovery. arXiv preprint. Available at arXiv:1911.01018. [24] HERBRICH, R., MINKA, T. and GRAEPEL, T. (2007). TrueSkill: A Bayesian skill rating system. In Advances in Neural Information Processing Systems 569-576. [25] Hunter, D. R. (2004). MM algorithms for generalized Bradley-Terry models. Ann. Statist. 32 384-406. · Zbl 1105.62359 · doi:10.1214/aos/1079120141 [26] JADBABAIE, A., MAKUR, A. and SHAH, D. (2020). Estimation of skill distributions. arXiv preprint. Available at arXiv:2006.08189. [27] JANG, M., KIM, S., SUH, C. and OH, S. (2016). Top-\(k\) ranking from pairwise comparisons: When spectral ranking is optimal. ArXiv preprint. Available at arXiv:1603.04153. [28] JANG, M., KIM, S., SUH, C. and OH, S. (2017). Optimal sample complexity of m-wise data for top-k ranking. In Advances in Neural Information Processing Systems 1686-1696. [29] Jones, M. C., Marron, J. S. and Sheather, S. J. (1996). A brief survey of bandwidth selection for density estimation. J. Amer. Statist. Assoc. 91 401-407. · Zbl 0873.62040 · doi:10.2307/2291420 [30] KATAJAINEN, J. and TRÄFF, J. L. (1997). A meticulous analysis of mergesort programs. In Algorithms and Complexity (Rome, 1997). Lecture Notes in Computer Science 1203 217-228. Springer, Berlin. · doi:10.1007/3-540-62592-5_74 [31] KNUTH, D. E. (1997). The Art of Computer Programming. Vol. 1: Fundamental Algorithms, 3rd ed. Addison-Wesley, Reading, MA. · Zbl 0895.68055 [32] LIU, T.-Y. (2011). Learning to Rank for Information Retrieval. Springer, Berlin. · Zbl 1227.68002 [33] LOUVIERE, J. J., HENSHER, D. A. and SWAIT, J. D. (2000). Stated Choice Methods: Analysis and Applications. Cambridge Univ. Press, Cambridge. · Zbl 0992.91002 · doi:10.1017/CBO9780511753831 [34] Luce, R. D. (1959). Individual Choice Behavior: A Theoretical Analysis. Wiley, New York. · Zbl 0093.31708 [35] LUCE, R. D. (1977). The choice axiom after twenty years. J. Math. Psych. 15 215-233. · Zbl 0357.92033 · doi:10.1016/0022-2496(77)90032-3 [36] MANSKI, C. F. (1977). The structure of random utility models. Theory and Decision 8 229-254. · Zbl 0375.90005 · doi:10.1007/BF00133443 [37] MAO, C., WEED, J. and RIGOLLET, P. (2018). Minimax rates and efficient algorithms for noisy sorting. In Algorithmic Learning Theory. Proc. Mach. Learn. Res. (PMLR) 83 27. · Zbl 1407.62066 [38] MCFADDEN, D. (1973). Conditional logit analysis of qualitative choice behavior. [39] MCFADDEN, D. and TRAIN, K. (2000). Mixed MNL models for discrete response. J. Appl. Econometrics 15 447-470. [40] MINKA, T., CLEVEN, R. and ZAYKOV, Y. (2018). Trueskill 2: An improved Bayesian skill rating system. [41] MOTEGI, S. and MASUDA, N. (2012). A network-based dynamical ranking system for competitive sports. Sci. Rep. 2 904. · doi:10.1038/srep00904 [42] NEGAHBAN, S., OH, S. and SHAH, D. (2017). Rank centrality: Ranking from pairwise comparisons. Oper. Res. 65 266-287. · Zbl 1414.91133 · doi:10.1287/opre.2016.1534 [43] PANANJADY, A., MAO, C., MUTHUKUMAR, V., WAINWRIGHT, M. J. and COURTADE, T. A. (2020). Worst-case versus average-case design for estimation from partial pairwise comparisons. Ann. Statist. 48 1072-1097. · Zbl 1452.62561 · doi:10.1214/19-AOS1838 [44] PANANJADY, A., WAINWRIGHT, M. J. and COURTADE, T. A. (2016). Linear regression with an unknown permutation: Statistical and computational limits. In 2016 54th Annual Allerton Conference on Communication, Control, and Computing (Allerton) 417-424. IEEE. [45] Plackett, R. L. (1975). The analysis of permutations. J. R. Stat. Soc. Ser. C. Appl. Stat. 24 193-202. · doi:10.2307/2346567 [46] SAATY, T. L. (1990). Decision Making for Leaders: The Analytic Hierarchy Process for Decisions in a Complex World. RWS Publications. [47] SEDGEWICK, R. (1978). Implementing quicksort programs. Commun. ACM 21 847-857. · Zbl 0386.68058 [48] SHA, L., LUCEY, P., YUE, Y., CARR, P., ROHLF, C. and MATTHEWS, I. (2016). Chalkboarding: A new spatiotemporal query paradigm for sports play retrieval. In Proceedings of the 21st International Conference on Intelligent User Interfaces 336-347. [49] SHAH, N., BALAKRISHNAN, S., GUNTUBOYINA, A. and WAINWRIGHT, M. (2016). Stochastically transitive models for pairwise comparisons: Statistical and computational issues. In International Conference on Machine Learning 11-20. · Zbl 1364.94253 [50] SHAH, N. B. and WAINWRIGHT, M. J. (2017). Simple, robust and optimal ranking from pairwise comparisons. J. Mach. Learn. Res. 18 Paper No. 199, 38. · Zbl 1473.62078 [51] THURSTONE, L. L. (1927). A law of comparative judgment. Psychol. Rev. 34 273 This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.