zbMATH — the first resource for mathematics

Uniform convergence of the empirical cumulative distribution function under informative selection from a finite population. (English) Zbl 1329.62053
Summary: Consider informative selection of a sample from a finite population. Responses are realized as independent and identically distributed (i.i.d.) random variables with a probability density function (p.d.f.) \(f\), referred to as the superpopulation model. The selection is informative in the sense that the sample responses, given that they were selected, are not i.i.d. \(f\). In general, the informative selection mechanism may induce dependence among the selected observations. The impact of such dependence on the empirical cumulative distribution function (c.d.f.) is studied. An asymptotic framework and weak conditions on the informative selection mechanism are developed under which the (unweighted) empirical c.d.f. converges uniformly, in \(L_2\) and almost surely, to a weighted version of the superpopulation c.d.f. This yields an analogue of the Glivenko-Cantelli theorem. A series of examples, motivated by real problems in surveys and other observational studies, shows that the conditions are verifiable for specified designs.
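As a rough illustration of the convergence described in the summary, the following Python sketch simulates one simple informative design and compares the unweighted empirical c.d.f. of the selected responses with two candidate limits. Everything specific here is an assumption made for the example and is not taken from the paper: the Exponential(1) superpopulation, Poisson (independent Bernoulli) selection with inclusion probability \(1 - e^{-y}\), and the hypothetical function name `simulate_sup_distances`. The sketch also assumes the weighted limit has the usual weighted-distribution form, with density proportional to the inclusion probability times \(f\), which for this choice works out to the c.d.f. \((1 - e^{-y})^2\).

```python
import math
import random


def simulate_sup_distances(population_size, seed=0):
    """Return (n, sup-distance to superpopulation c.d.f., sup-distance to weighted c.d.f.).

    Illustrative assumptions (not from the paper): Exponential(1) responses and
    Poisson sampling with inclusion probability 1 - exp(-y).
    """
    rng = random.Random(seed)

    # Superpopulation model: i.i.d. Exponential(1) responses for the finite population.
    population = [rng.expovariate(1.0) for _ in range(population_size)]

    # Informative selection: larger responses are more likely to be selected.
    sample = sorted(y for y in population if rng.random() < 1.0 - math.exp(-y))
    n = len(sample)

    def sup_distance(cdf):
        # Sup distance between the empirical c.d.f. (a step function) and cdf,
        # checked just before and at each jump point.
        d = 0.0
        for i, y in enumerate(sample):
            c = cdf(y)
            d = max(d, abs((i + 1) / n - c), abs(i / n - c))
        return d

    def superpopulation_cdf(y):
        return 1.0 - math.exp(-y)

    def weighted_cdf(y):
        # Assumed limit: density proportional to (1 - exp(-y)) * exp(-y).
        return (1.0 - math.exp(-y)) ** 2

    return n, sup_distance(superpopulation_cdf), sup_distance(weighted_cdf)


if __name__ == "__main__":
    for size in (1_000, 10_000, 100_000):
        n, d_super, d_weighted = simulate_sup_distances(size, seed=1)
        print(size, n, round(d_super, 4), round(d_weighted, 4))
```

Under these assumptions, the distance to the weighted c.d.f. should shrink as the population grows, while the distance to the superpopulation c.d.f. should stay near \(\sup_u |u^2 - u| = 1/4\). Poisson sampling is the simplest case because the selections are independent; the paper's conditions are aimed at designs where selection induces dependence, which this sketch does not attempt to reproduce.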

62D05 Sampling theory, sample surveys
60F25 \(L^p\)-limit theorems
62G20 Asymptotic properties of nonparametric inference