zbMATH — the first resource for mathematics

Modeling outcomes of soccer matches. (English) Zbl 07024628
Mach. Learn. 108, No. 1, 77-95 (2019); correction ibid. 108, No. 2, 377-378 (2019).
Summary: We compare various extensions of the Bradley-Terry model and a hierarchical Poisson log-linear model in terms of their performance in predicting the outcome of soccer matches (win, draw, or loss). The parameters of the Bradley-Terry extensions are estimated by maximizing the log-likelihood, or an appropriately penalized version of it, while the posterior densities of the parameters of the hierarchical Poisson log-linear model are approximated using integrated nested Laplace approximations. The prediction performance of the various modeling approaches is assessed using a novel, context-specific framework for temporal validation that is found to deliver accurate estimates of the test error. The direct modeling of outcomes via the various Bradley-Terry extensions and the modeling of match scores using the hierarchical Poisson log-linear model demonstrate similar behavior in terms of predictive performance.
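
The kind of three-outcome model the summary describes can be sketched with one well-known Bradley-Terry extension, the tie model of Davidson (1970), which appears in the reference list below. This is a minimal illustration only: the record does not say which extensions the paper actually uses, and the function names, the log-strength parameterization, and the match encoding ("H", "D", "A") are assumptions made for this example. It shows how outcome probabilities and the log-likelihood to be maximized would be computed, not how the authors fit or penalize their models.

```python
import math


def davidson_probabilities(strength_home, strength_away, tie_param):
    """Return (P(home win), P(draw), P(away win)) under Davidson's tie model.

    Strengths are on the log scale (an assumption of this sketch);
    tie_param >= 0 controls how much probability mass goes to draws.
    """
    pi_home = math.exp(strength_home)
    pi_away = math.exp(strength_away)
    tie_mass = tie_param * math.sqrt(pi_home * pi_away)
    total = pi_home + pi_away + tie_mass
    return pi_home / total, tie_mass / total, pi_away / total


def log_likelihood(matches, strengths, tie_param):
    """Sum the log-probabilities of the observed outcomes.

    matches: iterable of (home_team, away_team, outcome), where outcome
    is "H", "D", or "A" (hypothetical encoding for this example).
    strengths: dict mapping team name to log-strength.
    """
    index = {"H": 0, "D": 1, "A": 2}
    total = 0.0
    for home, away, outcome in matches:
        probs = davidson_probabilities(strengths[home], strengths[away], tie_param)
        total += math.log(probs[index[outcome]])
    return total
```

Maximizing `log_likelihood` over the strengths and the tie parameter (for example with a bound-constrained optimizer) would give maximum-likelihood estimates for this particular model; a penalty on the strengths could be subtracted to obtain a penalized variant.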

MSC:
68T05 Learning and adaptive systems in artificial intelligence
Software:
gamair; R-INLA; LBFGS-B
References:
[1] Agresti, A. (2015). Foundations of linear and generalized linear models. Hoboken: Wiley. · Zbl 1309.62001
[2] Baio, G.; Blangiardo, M., Bayesian hierarchical model for the prediction of football results, Journal of Applied Statistics, 37, 253-264, (2010)
[3] Berrar, D., Dubitzky, W., Davis, J., & Lopes, P. (2017). Machine learning for Soccer. Retrieved from osf.io/ftuva.
[4] Bradley, RA; Terry, ME, Rank analysis of incomplete block designs: I. The method of paired comparisons, Biometrika, 39, 502-537, (1952)
[5] Byrd, RH; Lu, P.; Nocedal, J.; Zhu, C., A limited memory algorithm for bound constrained optimization, SIAM Journal on Scientific Computing, 16, 1190, (1995) · Zbl 0836.65080
[6] Cattelan, M.; Varin, C.; Firth, D., Dynamic Bradley-Terry modelling of sports tournaments, Journal of the Royal Statistical Society: Series C (Applied Statistics), 62, 135-150, (2013)
[7] Davidson, RR, On extending the Bradley-Terry model to accommodate ties in paired comparison experiments, Journal of the American Statistical Association, 65, 317, (1970)
[8] DerSimonian, R.; Laird, N., Meta-analysis in clinical trials, Controlled Clinical Trials, 7, 177, (1986)
[9] Dietterich, T. G. (2000). Ensemble methods in machine learning (pp. 1-15). Berlin: Springer.
[10] Dixon, MJ; Coles, SG, Modelling association football scores and inefficiencies in the football betting market, Applied Statistics, 46, 265, (1997)
[11] Efron, B. (1982). The jackknife, the bootstrap and other resampling plans. New Delhi: SIAM. · Zbl 0496.62036
[12] Epstein, ES, A scoring system for probability forecasts of ranked categories, Journal of Applied Meteorology, 8, 985-987, (1969)
[13] Firth, D. (2005). Bradley-Terry Models in R. Journal of Statistical Software, 12(1)
[14] Gneiting, T.; Raftery, AE, Strictly proper scoring rules, prediction, and estimation, Journal of the American Statistical Association, 102, 359-378, (2007) · Zbl 1284.62093
[15] Golub, GH; Heath, M.; Wahba, G., Generalized cross-validation as a method for choosing a good ridge parameter, Technometrics, 21, 215, (1979) · Zbl 0461.62059
[16] Karlis, D.; Ntzoufras, I., Analysis of sports data by using bivariate Poisson models, Journal of the Royal Statistical Society D, 52, 381-393, (2003)
[17] Király, F.J., & Qian, Z. (2017). Modelling competitive sports: Bradley-Terry-Élo models for supervised and on-line learning of paired competition outcomes (pp. 1-53). arXiv:1701.08055.
[18] Lindgren, F.; Rue, H., Bayesian spatial modelling with R-INLA, Journal of Statistical Software, Articles, 63, 1-25, (2015)
[19] Maher, MJ, Modelling association football scores, Statistica Neerlandica, 36, 109-118, (1982)
[20] Murphy, AH, On the ranked probability score, Journal of Applied Meteorology, 8, 988-989, (1969)
[21] Rue, H.; Martino, S.; Chopin, N., Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations, Journal of the Royal Statistical Society B, 71, 319-392, (2009) · Zbl 1248.62156
[22] Wahba, G. (1990). Spline models for observational data. Society for Industrial and Applied Mathematics. · Zbl 0813.62001
[23] Wikipedia (2018). UEFA coefficient — Wikipedia, the free encyclopedia. http://en.wikipedia.org/w/index.php?title=UEFA%20coefficient&oldid=819064849. Accessed February 09 2018.
[24] Wood, SN, Thin plate regression splines, Journal of the Royal Statistical Society. Series B: Statistical Methodology, 65, 95, (2003) · Zbl 1063.62059
[25] Wood, S. N. (2006). Generalized additive models: An introduction with R. Boca Raton: CRC Press. · Zbl 1087.62082
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.