Sample selection when a multivariate set of size measures is available. (English) Zbl 1427.62008

Summary: Designing a \(\pi\)ps random sample from a finite population when multivariate auxiliary variables are available involves two main issues: defining a selection probability for each population unit as a function of the whole set of auxiliary variables, and determining the sample size required to achieve a fixed precision level for each auxiliary variable. These precision levels are usually expressed as a set of upper limits on the coefficients of variation of the estimates. A strategy based on a convex linear combination of the univariate selection probabilities is suggested to address these issues jointly. The weights of the linear combination are chosen so that the sample sizes needed to reach each constrained error level are the same. The procedure is applied to design a \(\pi\)ps sampling scheme for the monthly slaughtering survey conducted by the Italian Institute of Statistics (Istat). The results show that this strategy yields an appreciable gain in the efficiency of the design. The resulting selection probabilities do not spend excessive and unnecessary effort on some auxiliary variables at the expense of others, which would then have errors that are too high. Such an imbalance may occur, by contrast, when simple summary statistics (average, maximum, etc.) are used to reduce the multivariate problem to a known univariate one. For a given set of precision levels, the procedure achieves a sample size much lower than the one used by Istat, which was obtained through a multivariate stratification of the frame.
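
The weighting idea in the summary can be illustrated with a minimal sketch for two auxiliary variables. It is not the authors' procedure: the variance formula is an assumption (the with-replacement PPS approximation, under which the required sample size has a closed form), the function names `required_n` and `balance_weights` are hypothetical, the data are simulated, and the constraint that inclusion probabilities must not exceed 1 is ignored. With weight w on the first variable, the required size for that variable decreases in w and the required size for the second increases, so a bisection on w locates the point where the two sizes coincide.

```python
import random


def required_n(x, p, cv_target):
    """Sample size needed to reach cv_target for the estimated total of x.

    ASSUMPTION: with-replacement PPS variance, so that
    CV^2 = (sum(x_i^2 / p_i) / T^2 - 1) / n, with T the total of x.
    """
    total = sum(x)
    rel_var_unit = sum(xi * xi / pi for xi, pi in zip(x, p)) / total**2 - 1.0
    return rel_var_unit / cv_target**2


def balance_weights(x1, x2, cv1, cv2, tol=1e-10):
    """Find w in [0, 1] so that p = w*p1 + (1-w)*p2 needs equal n for both."""
    t1, t2 = sum(x1), sum(x2)
    p1 = [v / t1 for v in x1]  # univariate selection probabilities, variable 1
    p2 = [v / t2 for v in x2]  # univariate selection probabilities, variable 2

    def combined(w):
        # Convex linear combination of the two univariate probability vectors.
        return [w * a + (1.0 - w) * b for a, b in zip(p1, p2)]

    def gap(w):
        p = combined(w)
        return required_n(x1, p, cv1) - required_n(x2, p, cv2)

    # gap(0) >= 0 and gap(1) <= 0, and gap is decreasing in w: bisect.
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if gap(mid) > 0:
            lo = mid  # variable 1 still needs the larger sample: raise w
        else:
            hi = mid
    w = (lo + hi) / 2.0
    p = combined(w)
    return w, p, required_n(x1, p, cv1), required_n(x2, p, cv2)


# Simulated, strictly positive size measures (illustrative data only).
random.seed(0)
x1 = [random.lognormvariate(0, 1) for _ in range(500)]
x2 = [a * random.uniform(0.5, 1.5) + random.lognormvariate(0, 1) for a in x1]

w, p, n1, n2 = balance_weights(x1, x2, cv1=0.05, cv2=0.05)
print(f"weight on variable 1: {w:.4f}; required n: {n1:.2f} and {n2:.2f}")
```

In this sketch the common sample size is never larger than the size that either variable would require if the selection probabilities were built from the other variable alone, which is the kind of imbalance the summary attributes to simple univariate reductions.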

MSC:

62D05 Sampling theory, sample surveys
62P20 Applications of statistics to economics
