Nonparametric Bayes modeling with sample survey weights. (English) Zbl 1384.62031

Summary: In population studies, it is standard to sample data via designs in which the population is divided into strata, with the different strata assigned different probabilities of inclusion. Although there have been some proposals for incorporating sample survey weights into Bayesian analyses, existing methods either require complex models or ignore the stratified design underlying the survey weights. We propose a simple approach based on modeling the distribution of the selected sample as a mixture, with the mixture weights appropriately adjusted, while accounting for uncertainty in the adjustment. We focus for simplicity on Dirichlet process mixtures, but the proposed approach can be applied more broadly. We sketch a simple Markov chain Monte Carlo algorithm for computation, and assess the approach via simulations and an application.
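
The idea of adjusting mixture weights by survey weights can be illustrated with a minimal sketch. It is one plausible reading of the summary, not the authors' algorithm: the function names, the rescaling of survey weights to sum to the sample size, the truncation to a fixed number of components, and the Dirichlet draw used to express uncertainty are all assumptions made for illustration.

```python
import numpy as np


def weighted_component_mass(alloc, survey_w, n_comp):
    """Sum rescaled survey weights within each mixture component.

    alloc    : component index (0..n_comp-1) of each sampled unit
    survey_w : survey weight of each unit (e.g. inverse inclusion probability)
    """
    alloc = np.asarray(alloc)
    survey_w = np.asarray(survey_w, dtype=float)
    # Assumption: rescale the weights so they sum to the sample size.
    w = survey_w * len(survey_w) / survey_w.sum()
    return np.bincount(alloc, weights=w, minlength=n_comp)


def draw_adjusted_weights(alloc, survey_w, n_comp, prior=1.0, rng=None):
    """Draw population-level mixture weights from a Dirichlet distribution.

    Assumption: a symmetric Dirichlet(prior / n_comp) prior is updated with
    the weighted component masses, so the draw reflects uncertainty in the
    adjustment rather than returning a single point estimate.
    """
    rng = np.random.default_rng() if rng is None else rng
    mass = weighted_component_mass(alloc, survey_w, n_comp)
    return rng.dirichlet(prior / n_comp + mass)


# Toy data: three units from an oversampled stratum (weight 1) fall in
# component 0; one unit from an undersampled stratum (weight 9) falls in
# component 1.
alloc = np.array([0, 0, 0, 1])
survey_w = np.array([1.0, 1.0, 1.0, 9.0])

mass = weighted_component_mass(alloc, survey_w, n_comp=2)
draw = draw_adjusted_weights(alloc, survey_w, n_comp=2,
                             rng=np.random.default_rng(0))
```

In this toy case the unweighted sample puts three quarters of its units in component 0, whereas the weighted masses favor component 1, which is the kind of correction for unequal inclusion probabilities that the summary describes.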


62D05 Sampling theory, sample surveys
62G07 Density estimation
62F15 Bayesian inference

