# zbMATH — the first resource for mathematics

Uniform convergence of the empirical cumulative distribution function under informative selection from a finite population. (English) Zbl 1329.62053
Summary: Consider informative selection of a sample from a finite population. Responses are realized as independent and identically distributed (i.i.d.) random variables with a probability density function (p.d.f.) $$f$$, referred to as the superpopulation model. The selection is informative in the sense that the sample responses, given that they were selected, are not i.i.d. $$f$$. In general, the informative selection mechanism may induce dependence among the selected observations. The impact of such dependence on the empirical cumulative distribution function (c.d.f.) is studied. An asymptotic framework and weak conditions on the informative selection mechanism are developed under which the (unweighted) empirical c.d.f. converges uniformly, in $$L_2$$ and almost surely, to a weighted version of the superpopulation c.d.f. This yields an analogue of the Glivenko-Cantelli theorem. A series of examples, motivated by real problems in surveys and other observational studies, shows that the conditions are verifiable for specified designs.
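The convergence described in the summary can be illustrated with a small simulation. This is a minimal sketch under assumptions that are not taken from the paper: the superpopulation model is taken to be Exponential(1), so $$F(y) = 1 - e^{-y}$$, and selection is independent (Poisson) sampling in which a unit with response $$y$$ is included with probability $$F(y)$$. Under these assumptions the selected responses have density proportional to $$F(y) f(y)$$, so the weighted limit is $$F(y)^2$$. Independent selection is the simplest case and induces no dependence among the selected observations, so it does not exercise the dependence that the paper's conditions are designed to handle. The function names are hypothetical.

```python
import math
import random


def select_informative_sample(population_size, rng):
    """Draw a finite population and return the sorted selected responses.

    Assumed superpopulation model: Y_i i.i.d. Exponential(1).
    Assumed selection mechanism: unit i is included independently with
    probability F(y_i) = 1 - exp(-y_i), so large responses are over-represented.
    """
    responses = [rng.expovariate(1.0) for _ in range(population_size)]
    selected = [y for y in responses if rng.random() < 1.0 - math.exp(-y)]
    return sorted(selected)


def sup_distance(sorted_sample, cdf):
    """Sup distance between the unweighted empirical c.d.f. and `cdf`.

    The empirical c.d.f. jumps at each observation, so both sides of every
    jump are checked.
    """
    n = len(sorted_sample)
    if n == 0:
        raise ValueError("empty sample")
    distance = 0.0
    for k, y in enumerate(sorted_sample):
        target = cdf(y)
        distance = max(distance, abs((k + 1) / n - target), abs(k / n - target))
    return distance


def superpopulation_cdf(y):
    # F(y) for the assumed Exponential(1) model.
    return 1.0 - math.exp(-y)


def weighted_cdf(y):
    # Limit under the assumed selection mechanism: F(y)**2.
    return superpopulation_cdf(y) ** 2


if __name__ == "__main__":
    rng = random.Random(2024)
    for population_size in (1_000, 10_000, 100_000):
        sample = select_informative_sample(population_size, rng)
        print(
            population_size,
            len(sample),
            round(sup_distance(sample, weighted_cdf), 4),
            round(sup_distance(sample, superpopulation_cdf), 4),
        )
```

As the population grows, the distance to the weighted c.d.f. should shrink toward zero, while the distance to the superpopulation c.d.f. should stay near $$\sup_y |F(y)^2 - F(y)| = 1/4$$. In this toy setting, ignoring the informative selection leaves a bias that does not vanish.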

##### MSC:
- 62D05 Sampling theory, sample surveys
- 60F25 $$L^p$$-limit theorems
- 62G20 Asymptotic properties of nonparametric inference