Dirichlet process mixture models for insurance loss data. (English) Zbl 1416.91188

Summary: In the recent insurance literature, a variety of finite-dimensional parametric models have been proposed for analyzing the hump-shaped, heavy-tailed, and highly skewed loss data often encountered in applications. These parametric models are relatively simple, but they lack flexibility in the sense that an actuary analyzing a new data set cannot be sure that any one of them will be appropriate. As a consequence, the actuary must make a non-trivial choice among a collection of candidate models, which exposes the analysis to various model misspecification biases. In this paper, we argue that, at least in cases where prediction of future insurance losses is the ultimate goal, there is reason to consider a single but more flexible nonparametric model. We focus here on Dirichlet process mixture models, and we reanalyze several of the standard insurance data sets to support our claim that model misspecification biases can be avoided by taking a nonparametric approach, at little to no cost compared to existing parametric approaches.


91B30 Risk theory, insurance (MSC2010)
62P05 Applications of statistics to actuarial sciences and financial mathematics
62G07 Density estimation

