Robust fitting for generalized additive models for location, scale and shape. (English) Zbl 1461.62008

Summary: The validity of estimation and smoothing parameter selection for the wide class of generalized additive models for location, scale and shape (GAMLSS) relies on the correct specification of a likelihood function. Deviations from this assumption are known to mislead any likelihood-based inference and can hinder penalization schemes meant to ensure some degree of smoothness for nonlinear effects. We propose a general approach to achieve robustness in fitting GAMLSSs by limiting the contribution of observations with low log-likelihood values. Robust selection of the smoothing parameters can be carried out either by minimizing information criteria that arise naturally from the robustified likelihood or via an extended Fellner-Schall method. The latter allows for automatic smoothing parameter selection and is particularly advantageous in applications with multiple smoothing parameters. We also address the challenge of tuning robust estimators for models with nonlinear effects by proposing a novel median downweighting proportion criterion. This enables a fair comparison with existing robust estimators for the special case of generalized additive models, where our estimator competes favorably. The overall good performance of our proposal is illustrated by further simulations in the GAMLSS setting and by an application to functional magnetic resonance brain imaging using bivariate smoothing splines.
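
The idea of limiting the contribution of observations with low log-likelihood values can be illustrated with a minimal sketch. It is not the authors' implementation: the bounding function rho_c(z) = log(1 + exp(z + c)) - log(1 + exp(c)), the tuning constant c = 3, the unit-variance normal model, the toy data, and the names normal_loglik, weight and robust_location are all assumptions made here for illustration. Differentiating rho_c gives each observation a weight in (0, 1) that tends to zero as its log-likelihood becomes very negative, so the location estimate reduces to an iteratively reweighted mean.

```python
import math
from statistics import mean, median


def normal_loglik(y, mu, sigma=1.0):
    """Log-density of N(mu, sigma^2) evaluated at y."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - 0.5 * ((y - mu) / sigma) ** 2


def weight(loglik, c):
    """Derivative of the assumed bound rho_c at loglik: a logistic sigmoid of (loglik + c)."""
    t = loglik + c
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)  # underflows to 0.0 for very low log-likelihoods
    return e / (1.0 + e)


def robust_location(ys, c=3.0, iters=100, tol=1e-10):
    """Fixed-point iteration for the weighted score equation sum_i w_i * (y_i - mu) = 0."""
    mu = median(ys)  # robust starting value
    ws = [1.0] * len(ys)
    for _ in range(iters):
        ws = [weight(normal_loglik(y, mu), c) for y in ys]
        new_mu = sum(w * y for w, y in zip(ws, ys)) / sum(ws)
        converged = abs(new_mu - mu) < tol
        mu = new_mu
        if converged:
            break
    return mu, ws


# Toy data (made up): six values near zero and one gross outlier.
ys = [-0.8, -0.3, 0.1, 0.2, 0.4, 0.9, 25.0]
mu_hat, ws = robust_location(ys)
print("sample mean:", mean(ys))
print("robustified estimate:", mu_hat)
print("weight of the outlier:", ws[-1])
```

In this toy setting the outlier receives a weight that is numerically zero, so the robustified estimate stays close to the bulk of the data while the sample mean is pulled toward the outlier. Larger values of c downweight less and bring the estimate closer to maximum likelihood.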

MSC:

62-08 Computational methods for problems pertaining to statistics
62G08 Nonparametric regression and quantile regression

Software:

R; GAMLSS; gamair; GJRM; trust; mpr

References:

[1] Alimadad, A.; Salibian-Barrera, M., An outlier-robust fit for generalized additive models with applications to disease outbreak detection, J. Am. Stat. Assoc., 106, 494, 719-731 (2011) · Zbl 1232.62142 · doi:10.1198/jasa.2011.tm09654
[2] Beyerlein, A.; Fahrmeir, L.; Mansmann, U.; Toschke, AM, Alternative regression models to assess increase in childhood BMI, BMC Med. Res. Methodol., 8, 1, 59 (2008) · doi:10.1186/1471-2288-8-59
[3] Burke, K.; MacKenzie, G., Multi-parameter regression survival modeling: an alternative to proportional hazards, Biometrics, 73, 2, 678-686 (2017) · Zbl 1372.62056 · doi:10.1111/biom.12625
[4] Cantoni, E.; Ronchetti, EM, Resistant selection of the smoothing parameter for smoothing splines, Stat. Comput., 11, 2, 141-146 (2001) · doi:10.1023/A:1008975231866
[5] Cantoni, E.; Ronchetti, EM, Robust inference for generalized linear models, J. Am. Stat. Assoc., 96, 455, 1022-1030 (2001) · Zbl 1072.62610 · doi:10.1198/016214501753209004
[6] Cole, TJ; Stanojevic, S.; Stocks, J.; Coates, AL; Hankinson, JL; Wade, AM, Age- and size-related reference ranges: a case study of spirometry through childhood and adulthood, Stat. Med., 28, 5, 880-898 (2009) · doi:10.1002/sim.3504
[7] Conn, AR; Gould, NIM; Toint, PL, Trust Region Methods (2000), Philadelphia: Society for Industrial and Applied Mathematics · Zbl 0958.65071 · doi:10.1137/1.9780898719857
[8] Craven, P.; Wahba, G., Smoothing noisy data with spline functions, Numer. Math., 31, 4, 377-403 (1979) · Zbl 0377.65007 · doi:10.1007/BF01404567
[9] Croux, C.; Gijbels, I.; Prosdocimi, I., Robust estimation of mean and dispersion functions in extended generalized additive models, Biometrics, 68, 1, 31-44 (2012) · Zbl 1241.62108 · doi:10.1111/j.1541-0420.2011.01630.x
[10] De Castro, M.; Cancho, VG; Rodrigues, J., A hands-on approach for fitting long-term survival models under the GAMLSS framework, Comput. Methods Programs Biomed., 97, 2, 168-177 (2010) · doi:10.1016/j.cmpb.2009.08.002
[11] Eguchi, S., Kano, Y.: Robustifing maximum likelihood estimation by Psi-divergence. In: Research Memorandum 802. Institute of Statistical Mathematics (ISM), Tokyo (2001)
[12] Field, C.; Smith, B., Robust estimation: a weighted maximum likelihood approach, Int. Stat. Rev., 62, 3, 405-424 (1994) · Zbl 0825.62428 · doi:10.2307/1403770
[13] Geyer, C.J.: Trust: trust region optimization. R package version 0.1-6. http://CRAN.R-project.org/package=trust (2015)
[14] Glasbey, CA; Khondoker, MR, Efficiency of functional regression estimators for combining multiple laser scans of cDNA microarrays, Biomet. J., 51, 1, 45-55 (2009) · Zbl 1442.62379 · doi:10.1002/bimj.200710444
[15] Groll, A., Hambuckers, J., Kneib, T., Umlauf, N.: LASSO-type penalization in the framework of generalized additive models for location, scale and shape. In: Working Papers 2018-2016, Faculty of Economics and Statistics. University of Innsbruck (2018) · Zbl 1496.62119
[16] Hambuckers, J.; Groll, A.; Kneib, T., Understanding the economic determinants of the severity of operational losses: a regularized generalized pareto regression approach, J. Appl. Econom., 33, 898-935 (2018) · doi:10.1002/jae.2638
[17] Hampel, FR, The influence curve and its role in robust estimation, J. Am. Stat. Assoc., 69, 346, 383-393 (1974) · Zbl 0305.62031 · doi:10.1080/01621459.1974.10482962
[18] Hampel, FR; Ronchetti, EM; Rousseeuw, PJ; Stahel, WA, Robust Statistics: The Approach Based on Influence Functions (1986), New York: Wiley · Zbl 0593.62027
[19] Hastie, TJ; Tibshirani, RJ, Generalized Additive Models (1990), New York: Chapman & Hall/CRC · Zbl 0747.62061
[20] Huber, PJ; Ronchetti, EM, Robust Statistics (2009), New York: Wiley · Zbl 1276.62022 · doi:10.1002/9780470434697
[21] Konishi, S.; Kitagawa, G., Generalised information criteria in model selection, Biometrika, 83, 4, 875-890 (1996) · Zbl 0883.62004 · doi:10.1093/biomet/83.4.875
[22] Landau, S.; Ellison-Wright, IC; Bullmore, ET, Tests for a difference in timing of physiological response between two brain regions measured by using functional magnetic resonance imaging, J. R. Stat. Soc. Ser. C, 53, 1, 63-82 (2003) · Zbl 1111.62352 · doi:10.1111/j.0035-9254.2003.04844.x
[23] Lang, S.; Umlauf, N.; Wechselberger, P.; Harttgen, K.; Kneib, T., Multilevel structured additive regression, Stat. Comput., 24, 2, 223-238 (2014) · Zbl 1325.62179 · doi:10.1007/s11222-012-9366-0
[24] Marra, G., Radice, R.: GJRM: generalised joint regression modelling. R package version 0.2-3. http://CRAN.R-project.org/package=GJRM (2020)
[25] Marra, G.; Radice, R.; Bärnighausen, T.; Wood, SN; McGovern, ME, A simultaneous equation approach to estimating HIV prevalence with non-ignorable missing responses, J. Am. Stat. Assoc., 112, 518, 484-496 (2017) · doi:10.1080/01621459.2016.1224713
[26] Marra, G.; Wood, SN, Coverage properties of confidence intervals for generalized additive model components, Scand. J. Stat., 39, 53-74 (2012) · Zbl 1246.62058 · doi:10.1111/j.1467-9469.2011.00760.x
[27] Mayr, A.; Fenske, N.; Hofner, B.; Kneib, T.; Schmid, M., Generalized additive models for location, scale and shape for high-dimensional data: a flexible approach based on boosting, J. R. Stat. Soc. Ser. C, 61, 3, 403-427 (2012) · doi:10.1111/j.1467-9876.2011.01033.x
[28] Nocedal, J.; Wright, SJ, Numerical Optimization (2006), New York: Springer · Zbl 1104.65059
[29] Pan, J.; Mackenzie, G., On modelling mean-covariance structures in longitudinal studies, Biometrika, 90, 1, 239-244 (2003) · Zbl 1039.62068 · doi:10.1093/biomet/90.1.239
[30] R Core Team: R: A language and environment for statistical computing. In: R Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org/ (2020)
[31] Rigby, RA; Stasinopoulos, DM, Generalized additive models for location, scale and shape, J. R. Stat. Soc. Ser. C, 54, 507-554 (2005) · Zbl 1490.62201 · doi:10.1111/j.1467-9876.2005.00510.x
[32] Rigby, RA; Stasinopoulos, MD; Heller, GZ; De Bastiani, F., Distributions for Modeling Location, Scale, and Shape: Using GAMLSS in R (2019), Boca Raton: Chapman & Hall/CRC · doi:10.1201/9780429298547
[33] Rudge, J.; Gilchrist, R., Excess winter morbidity among older people at risk of cold homes: a population-based study in a London borough, J. Publ. Health, 27, 4, 353-358 (2005) · doi:10.1093/pubmed/fdi051
[34] Stasinopoulos, MD; Rigby, RA; De Bastiani, F., GAMLSS: a distributional regression approach, Stat. Model., 18, 3-4, 248-273 (2018) · Zbl 07289508 · doi:10.1177/1471082X18759144
[35] Stasinopoulos, MD; Rigby, RA; Heller, GZ; Voudouris, V.; De Bastiani, F., Flexible Regression and Smoothing: Using GAMLSS in R (2017), Boca Raton: Chapman & Hall/CRC · doi:10.1201/b21973
[36] Stasinopoulos, M., Rigby, B.: GAMLSS: generalised additive models for location scale and shape. R package version 5.1-7. http://CRAN.R-project.org/package=gamlss (2020)
[37] Vatter, T.; Chavez-Demoulin, V., Generalized additive models for conditional dependence structures, J. Multivar. Anal., 141, 147-167 (2015) · Zbl 1328.62390 · doi:10.1016/j.jmva.2015.07.003
[38] Wong, RKW; Yao, F.; Lee, TCM, Robust estimation for generalized additive models, J. Comput. Graph. Stat., 23, 1, 270-289 (2014) · doi:10.1080/10618600.2012.756816
[39] Wood, SN, Generalized Additive Models: An Introduction with R (2017), Boca Raton: Chapman & Hall/CRC · Zbl 1368.62004 · doi:10.1201/9781315370279
[40] Wood, SN; Fasiolo, M., A generalized Fellner-Schall method for smoothing parameter optimization with application to Tweedie location, scale and shape models, Biometrics, 73, 4, 1071-1081 (2017) · Zbl 1405.62216 · doi:10.1111/biom.12666