
zbMATH — the first resource for mathematics

Generalized additive models for location, scale and shape (with discussion). (English) Zbl 05188697
Summary: A general class of statistical models for a univariate response variable is presented which we call the generalized additive model for location, scale and shape (GAMLSS). The model assumes independent observations of the response variable y given the parameters, the explanatory variables and the values of the random effects. The distribution for the response variable in the GAMLSS can be selected from a very general family of distributions including highly skew or kurtotic continuous and discrete distributions. The systematic part of the model is expanded to allow modelling not only of the mean (or location) but also of the other parameters of the distribution of y, as parametric and/or additive nonparametric (smooth) functions of explanatory variables and/or random-effects terms. Maximum (penalized) likelihood estimation is used to fit the (non)parametric models. A Newton-Raphson or Fisher scoring algorithm is used to maximize the (penalized) likelihood. The additive terms in the model are fitted by using a backfitting algorithm. Censored data are easily incorporated into the framework. Five data sets from different fields of application are analysed to emphasize the generality of the GAMLSS class of models.
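As a rough illustration of the idea in the summary (modelling the scale as well as the location of y, and maximizing the likelihood by Fisher scoring), the sketch below fits a normal response whose mean and log standard deviation are both linear in a single covariate. It is a minimal sketch under stated assumptions, not the authors' algorithm: the normal distribution, the identity and log links, the purely parametric predictors (no smooth terms, random effects, penalty or backfitting) and the names `wls` and `fit_normal_location_scale` are all choices made here for illustration.

```python
import numpy as np


def wls(X, z, w):
    """Weighted least squares: minimise sum(w * (z - X @ coef)**2)."""
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], z * sw, rcond=None)
    return coef


def fit_normal_location_scale(x, y, n_iter=50, tol=1e-6):
    """Fit y ~ Normal(mu, sigma) with mu = b0 + b1*x and log(sigma) = g0 + g1*x.

    Illustrative only: alternates a Fisher scoring update for the location
    coefficients with one for the scale coefficients until the log-likelihood
    stops changing.
    """
    X = np.column_stack([np.ones_like(x), x])

    # Starting values: ordinary least squares for mu, constant sigma.
    beta = wls(X, y, np.ones_like(y))
    gamma = np.array([np.log(np.std(y - X @ beta)), 0.0])

    prev_loglik = -np.inf
    loglik = prev_loglik
    for _ in range(n_iter):
        # Location step: with an identity link the Fisher scoring working
        # response is y itself and the weights are 1 / sigma^2.
        sigma = np.exp(X @ gamma)
        beta = wls(X, y, 1.0 / sigma**2)
        mu = X @ beta

        # Scale step: with a log link the score with respect to log(sigma) is
        # (y - mu)^2 / sigma^2 - 1 and the expected information is 2.
        eta = X @ gamma
        score = (y - mu) ** 2 / sigma**2 - 1.0
        gamma = wls(X, eta + score / 2.0, np.full_like(y, 2.0))

        sigma = np.exp(X @ gamma)
        loglik = np.sum(
            -0.5 * np.log(2.0 * np.pi) - np.log(sigma) - 0.5 * ((y - mu) / sigma) ** 2
        )
        if abs(loglik - prev_loglik) < tol:
            break
        prev_loglik = loglik

    return beta, gamma, loglik


if __name__ == "__main__":
    # Simulated data (hypothetical values chosen only for this demonstration).
    rng = np.random.default_rng(1)
    x = rng.uniform(0.0, 3.0, 2000)
    y = 1.0 + 2.0 * x + np.exp(-1.0 + 0.5 * x) * rng.standard_normal(2000)
    beta, gamma, loglik = fit_normal_location_scale(x, y)
    print("location coefficients:", beta)
    print("log-scale coefficients:", gamma)
```

In the full GAMLSS class described above, the same alternating scheme would also have to handle further distribution parameters, other response distributions, and penalized smooth or random-effects terms fitted by backfitting; none of that is attempted here.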

MSC:
62-XX Statistics