Empirical process of residuals for high-dimensional linear models. (English) Zbl 0853.62042

Summary: We give a stochastic expansion for the empirical distribution function \(\widehat F_n\) of residuals in a \(p\)-dimensional linear model. This expansion holds for \(p\) increasing with \(n\). It shows that, for high-dimensional linear models, \(\widehat F_n\) depends strongly on the chosen estimator \(\widehat \theta\) of the parameter \(\theta\) of the linear model. In particular, if one uses a maximum likelihood (ML) estimator \(\widehat \theta_{ML}\) that is motivated by a wrongly specified error distribution function \(G\), then \(\widehat F_n\) is biased toward \(G\). For \(p^2/n \to \infty\), this bias effect is of larger order than the stochastic fluctuations of the empirical process. Hence, the statistical analysis may just reproduce the assumptions imposed.
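
The bias toward \(G\) described in the summary can be illustrated with a small simulation. The following is only a sketch under assumptions that are not taken from the paper: a Gaussian random design, uniform true errors, least squares as the estimator (the ML estimator under a Gaussian \(G\)), and sample excess kurtosis as a crude measure of how Gaussian the residuals look. The function names `excess_kurtosis` and `simulate` are hypothetical.

```python
import numpy as np


def excess_kurtosis(x):
    """Sample excess kurtosis: near 0 for Gaussian data, near -1.2 for uniform data."""
    z = (x - x.mean()) / x.std()
    return float(np.mean(z ** 4) - 3.0)


def simulate(n, p, seed=0):
    """Return (excess kurtosis of true errors, excess kurtosis of residuals).

    Assumptions made for illustration only: Gaussian design matrix,
    uniform errors on [-1, 1], and a least-squares fit, which is the
    ML estimator when the error distribution G is taken to be Gaussian.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))        # hypothetical random design
    theta = np.zeros(p)                    # true parameter; residuals do not depend on it
    eps = rng.uniform(-1.0, 1.0, size=n)   # true errors are not Gaussian
    y = X @ theta + eps
    theta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ theta_hat
    return excess_kurtosis(eps), excess_kurtosis(resid)


low_dim = simulate(n=1000, p=5)      # p small relative to n
high_dim = simulate(n=1000, p=500)   # p comparable to n

print("p=5:   errors %.2f, residuals %.2f" % low_dim)
print("p=500: errors %.2f, residuals %.2f" % high_dim)
```

With few regressors the residuals should retain roughly the kurtosis of the uniform errors, while with many regressors their kurtosis should move toward the Gaussian value 0, consistent with a bias of \(\widehat F_n\) toward the assumed \(G\). The sketch does not reproduce the stochastic expansion itself.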

MSC:

62G30 Order statistics; empirical distribution functions
62J05 Linear regression; mixed models
62J20 Diagnostics, and linear inference and regression
