
Specification tests for the response distribution in generalized linear models. (English) Zbl 1304.65046

Summary: Goodness-of-fit tests are proposed for independent observations that come from the same family of distributions but with different parameters. The most popular such setting is that of generalized linear models (GLMs), where the mean of the distribution varies with the regressors. In the proposed procedures, the data are transformed to normality using suitable estimators of the parameters involved, after which any test for normality for i.i.d. data may be applied. The suggested method is fully general: it may be applied to arbitrary laws with continuous or discrete distribution functions, provided that an efficient method of estimation exists for the parameters. We investigate by Monte Carlo the relative performance of classical tests based on the empirical distribution function in comparison with a corresponding test that utilizes the empirical characteristic function instead. Standard measures of goodness-of-fit often used in the context of GLMs are also included in the comparison. The paper concludes with several real-data examples.
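
The transform-to-normality idea in the summary can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: an exponential response with a log link and one regressor, Fisher scoring as the estimator, the probability integral transform z_i = Phi^{-1}(F(y_i; mu_i_hat)) as the map to normality, and the Epps-Pulley statistic (in the form with unit smoothing parameter, computed on standardized scores) as the characteristic-function-based normality test. All function names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def fit_exponential_glm(x, y, iters=50, tol=1e-10):
    """Fisher scoring for y_i ~ Exponential(mean mu_i), log(mu_i) = b0 + b1*x_i.

    Assumed model for illustration; with a log link the expected
    information is X'X, so each step solves a fixed 2x2 system.
    """
    n = len(y)
    b0, b1 = math.log(sum(y) / n), 0.0
    sx, sxx = sum(x), sum(v * v for v in x)
    det = n * sxx - sx * sx
    for _ in range(iters):
        # Score residuals y_i / mu_i - 1 at the current estimate.
        r = [yi / math.exp(b0 + b1 * xi) - 1.0 for xi, yi in zip(x, y)]
        g0 = sum(r)
        g1 = sum(ri * xi for ri, xi in zip(r, x))
        d0 = (sxx * g0 - sx * g1) / det
        d1 = (-sx * g0 + n * g1) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


def to_normal_scores(x, y, b0, b1, eps=1e-12):
    """Map each y_i to Phi^{-1}(F(y_i; mu_i_hat)) with F the exponential CDF."""
    std = NormalDist()
    z = []
    for xi, yi in zip(x, y):
        mu = math.exp(b0 + b1 * xi)
        u = 1.0 - math.exp(-yi / mu)
        u = min(max(u, eps), 1.0 - eps)  # keep u strictly inside (0, 1)
        z.append(std.inv_cdf(u))
    return z


def epps_pulley(z):
    """Epps-Pulley normality statistic on standardized scores (O(n^2))."""
    n = len(z)
    m = sum(z) / n
    s = math.sqrt(sum((v - m) ** 2 for v in z) / n)
    w = [(v - m) / s for v in z]
    double = sum(math.exp(-0.5 * (a - b) ** 2) for a in w for b in w)
    single = sum(math.exp(-0.25 * a * a) for a in w)
    return double / n - math.sqrt(2.0) * single + n / math.sqrt(3.0)


# Simulated data from the assumed model (true b0 = 0.5, b1 = 1.0).
rng = random.Random(0)
n_obs = 200
x_sim = [rng.random() for _ in range(n_obs)]
y_sim = [rng.expovariate(1.0 / math.exp(0.5 + 1.0 * xi)) for xi in x_sim]

b0_hat, b1_hat = fit_exponential_glm(x_sim, y_sim)
z_scores = to_normal_scores(x_sim, y_sim, b0_hat, b1_hat)
stat = epps_pulley(z_scores)
print(b0_hat, b1_hat, stat)
```

Large values of the statistic speak against the assumed response distribution. Because the parameters are estimated, critical values would have to be calibrated, for example by a parametric bootstrap; that choice is an assumption here and is not taken from the summary. For discrete responses the transform would need a randomized version of F, which this sketch does not cover.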

MSC:

62-08 Computational methods for problems pertaining to statistics
62J12 Generalized linear models (logistic models)
