
The ranking lasso and its application to sport tournaments. (English) Zbl 1257.62020

Summary: Ranking a set of alternatives on the basis of a series of paired comparisons is a problem that arises in many settings. A popular example is ranking contestants in sport tournaments. For this purpose, paired comparison models such as the Bradley-Terry model of R. A. Bradley and M. E. Terry [Biometrika 39, 324–345 (1952; Zbl 0047.12903)] are often used. This paper proposes fitting paired comparison models with a lasso-type procedure that forces contestants with similar abilities into the same group. The benefits of the proposed method are easier interpretation of the rankings and a significant improvement in the quality of predictions relative to standard maximum likelihood fitting. Numerical aspects of the proposed method are discussed in detail. The methodology is illustrated by ranking the teams of the National Football League 2010-2011 and the American College Hockey Men's Division I 2009-2010.
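
The idea in the summary can be made concrete with a small sketch. The summary does not state the penalty or the fitting algorithm, so everything below is an illustrative assumption: abilities are placed on the log scale, the penalty is an unweighted sum of absolute pairwise ability differences (a common fusion-type choice, not necessarily the one used in the paper), and the function names `win_prob`, `log_likelihood`, `fusion_penalty`, `penalized_objective` and `groups` are hypothetical. The code only evaluates the objective and reads groups off a given ability vector; it does not implement the optimization.

```python
import math
from itertools import combinations


def win_prob(mu_i, mu_j):
    """Bradley-Terry probability that contestant i beats contestant j.

    Abilities are on the log scale, so only the difference mu_i - mu_j matters.
    """
    return 1.0 / (1.0 + math.exp(-(mu_i - mu_j)))


def log_likelihood(mu, games):
    """Log-likelihood of a list of (winner_index, loser_index) results."""
    return sum(math.log(win_prob(mu[w], mu[l])) for w, l in games)


def fusion_penalty(mu):
    """Unweighted sum of |mu_i - mu_j| over all pairs (an assumed penalty form)."""
    return sum(abs(mu[i] - mu[j]) for i, j in combinations(range(len(mu)), 2))


def penalized_objective(mu, games, lam):
    """Negative log-likelihood plus lam times the fusion penalty (to be minimized)."""
    return -log_likelihood(mu, games) + lam * fusion_penalty(mu)


def groups(mu, tol=1e-8):
    """Group contestants whose abilities coincide up to tol, strongest group first."""
    order = sorted(range(len(mu)), key=lambda k: -mu[k])
    out = [[order[0]]]
    for k in order[1:]:
        if abs(mu[k] - mu[out[-1][-1]]) <= tol:
            out[-1].append(k)
        else:
            out.append([k])
    return out


# Toy data: contestant 0 beats 2, contestant 1 beats 2, and 0 and 1 split two games.
games = [(0, 2), (1, 2), (0, 1), (1, 0)]
mu = [0.5, 0.5, -1.0]  # contestants 0 and 1 share the same ability
print(groups(mu))
print(penalized_objective(mu, games, lam=0.1))
```

An L1 penalty on pairwise differences can shrink some differences exactly to zero, which is what would place contestants with similar abilities in the same group; a larger `lam` would be expected to produce fewer, larger groups.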

MSC:

62F07 Statistical ranking and selection procedures
62J15 Paired and multiple comparisons; multiple testing
65C60 Computational problems in statistics (MSC2010)
62P99 Applications of statistics

Citations:

Zbl 0047.12903

References:

[1] Agresti, A. (2010). Analysis of Ordinal Categorical Data, 2nd ed. Wiley, Hoboken, NJ. · Zbl 1263.62007
[2] Böckenholt, U. (2006). Thurstonian-based analyses: Past, present, and future utilities. Psychometrika 71 615-629. · Zbl 1306.62382 · doi:10.1007/s11336-006-1598-5
[3] Bondell, H. D. and Reich, B. J. (2009). Simultaneous factor selection and collapsing levels in ANOVA. Biometrics 65 169-177. · Zbl 1159.62048 · doi:10.1111/j.1541-0420.2008.01061.x
[4] Bradley, R. A. and Terry, M. E. (1952). Rank analysis of incomplete block designs. I. The method of paired comparisons. Biometrika 39 324-345. · Zbl 0047.12903
[5] Candes, E. and Tao, T. (2007). The Dantzig selector: Statistical estimation when \(p\) is much larger than \(n\). Ann. Statist. 35 2313-2351. · Zbl 1139.62019 · doi:10.1214/009053606000001523
[6] Cattelan, M. (2012). Models for paired comparison data: A review with emphasis on dependent data. Statistical Science 27 412-433. · Zbl 1331.62368
[7] Chatterjee, A. and Lahiri, S. N. (2011). Bootstrapping lasso estimators. J. Amer. Statist. Assoc. 106 608-625. · Zbl 1232.62088 · doi:10.1198/jasa.2011.tm10159
[8] Chen, J. and Chen, Z. (2008). Extended Bayesian information criteria for model selection with large model spaces. Biometrika 95 759-771. · Zbl 1437.62415 · doi:10.1093/biomet/asn034
[9] Donoho, D. L. (1995). De-noising by soft-thresholding. IEEE Trans. Inform. Theory 41 613-627. · Zbl 0820.62002 · doi:10.1109/18.382009
[10] Efron, B. (1987). Better bootstrap confidence intervals. J. Amer. Statist. Assoc. 82 171-185. · Zbl 0622.62039 · doi:10.2307/2289144
[11] Efron, B., Hastie, T., Johnstone, I. and Tibshirani, R. (2004). Least angle regression. Ann. Statist. 32 407-499. · Zbl 1091.62054 · doi:10.1214/009053604000000067
[12] Fahrmeir, L. and Tutz, G. (1994). Dynamic stochastic models for time-dependent ordered paired comparison systems. J. Amer. Statist. Assoc. 89 1438-1449. · Zbl 0809.62088 · doi:10.2307/2291005
[13] Fan, J. and Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. J. Amer. Statist. Assoc. 96 1348-1360. · Zbl 1073.62547 · doi:10.1198/016214501753382273
[14] Gertheiss, J. and Tutz, G. (2010). Sparse modeling of categorial explanatory variables. Ann. Appl. Stat. 4 2150-2180. · Zbl 1220.62092 · doi:10.1214/10-AOAS355
[15] Glickman, M. E. (1999). Parameter estimation in large dynamic paired comparison experiments. Applied Statistics 48 377-394. · Zbl 0939.62071 · doi:10.1111/1467-9876.00159
[16] Glickman, M. E. (2001). Dynamic paired comparison models with stochastic variances. J. Appl. Stat. 28 673-689. · Zbl 0991.62048 · doi:10.1080/02664760120059219
[17] Guo, J., Levina, E., Michailidis, G. and Zhu, J. (2010). Pairwise variable selection for high-dimensional model-based clustering. Biometrics 66 793-804. · Zbl 1203.62190 · doi:10.1111/j.1541-0420.2009.01341.x
[18] Hestenes, M. R. (1969). Multiplier and gradient methods. J. Optim. Theory Appl. 4 303-320. · Zbl 0174.20705 · doi:10.1007/BF00927673
[19] Joe, H. (1990). Extended use of paired comparison models, with application to chess rankings. J. Roy. Statist. Soc. Ser. C 39 85-93. · Zbl 0707.62149 · doi:10.2307/2347814
[20] Knorr-Held, L. (2000). Dynamic rating of sports teams. The Statistician 49 261-276.
[21] Lian, H. (2010). A simple and efficient algorithm for fused lasso signal approximator with convex loss function. Available at arXiv:1005.5085.
[22] Mease, D. (2003). A penalized maximum likelihood approach for the ranking of college football teams independent of victory margins. Amer. Statist. 57 241-248. · Zbl 05680548 · doi:10.1198/0003130032396
[23] Nocedal, J. and Wright, S. J. (2006). Numerical Optimization, 2nd ed. Springer, New York. · Zbl 1104.65059
[24] Powell, M. J. D. (1969). A method for nonlinear constraints in minimization problems. In Optimization (Sympos., Univ. Keele, Keele, 1968) 283-298. Academic Press, London. · Zbl 0194.47701
[25] R Development Core Team. (2012). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. Available at .
[26] She, Y. (2010). Sparse regression with exact clustering. Electron. J. Stat. 4 1055-1096. · Zbl 1329.62327 · doi:10.1214/10-EJS578
[27] Stern, H. S. (2004). Statistics and the college football championship. Amer. Statist. 58 179-195. · Zbl 05680584 · doi:10.1198/000313004X2098
[28] Thurstone, L. L. (1927). A law of comparative judgment. Psychological Review 34 273-286.
[29] Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. J. Roy. Statist. Soc. Ser. B 58 267-288. · Zbl 0850.62538
[30] Tibshirani, R. J. and Taylor, J. (2011). The solution path of the generalized lasso. Ann. Statist. 39 1335-1371. · Zbl 1234.62107 · doi:10.1214/11-AOS878
[31] Tibshirani, R., Saunders, M., Rosset, S., Zhu, J. and Knight, K. (2005). Sparsity and smoothness via the fused lasso. J. R. Stat. Soc. Ser. B Stat. Methodol. 67 91-108. · Zbl 1060.62049 · doi:10.1111/j.1467-9868.2005.00490.x
[32] Turner, H. and Firth, D. (2012). Bradley-Terry models in R: The BradleyTerry2 package. Journal of Statistical Software 48 1-21.
[33] Ye, G.-B. and Xie, X. (2011). Split Bregman method for large scale fused Lasso. Comput. Statist. Data Anal. 55 1552-1569. · Zbl 1328.65048
[34] Zou, H. (2006). The adaptive lasso and its oracle properties. J. Amer. Statist. Assoc. 101 1418-1429. · Zbl 1171.62326 · doi:10.1198/016214506000000735
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.