
New important developments in small area estimation. (English) Zbl 1332.62038

Summary: The problem of small area estimation (SAE) is how to produce reliable estimates of characteristics of interest such as means, counts, quantiles, etc., for areas or domains for which only small samples or no samples are available, and how to assess their precision. The purpose of this paper is to review and discuss some of the new important developments in small area estimation methods. J. N. K. Rao [Small area estimation. With a foreword by Graham Kalton. Hoboken, NJ: Wiley (2003; Zbl 1026.62003)] wrote a very comprehensive book, which covers all the main developments in this topic up to that time. A few review papers have been written since 2003, but they are limited in scope. Hence, the focus of this review is on new developments in the last 7–8 years, but to make the review more self-contained, I also briefly mention some of the older developments. The review covers both design-based and model-dependent methods, with the latter further classified into frequentist and Bayesian methods. The style of the paper is similar to that of my previous review on SAE published in 2002, explaining the new problems investigated and describing the proposed solutions, but without dwelling on theoretical details, which can be found in the original articles. I hope that this paper will be useful both to researchers who would like to learn more about the research carried out in SAE and to practitioners who might be interested in applying the new methods.

MSC:

62D05 Sampling theory, sample surveys
62-02 Research exposition (monographs, survey articles) pertaining to statistics

Citations:

Zbl 1026.62003

References:

[1] Battese, G. E., Harter, R. M. and Fuller, W. A. (1988). An error components model for prediction of county crop area using survey and satellite data. J. Amer. Statist. Assoc. 83 28-36.
[2] Bayarri, M. J. and Castellanos, M. E. (2007). Bayesian checking of the second levels of hierarchical models. Statist. Sci. 22 322-343. · Zbl 1246.62029 · doi:10.1214/07-STS235
[3] Bell, W. R. and Huang, E. T. (2006). Using the \(t\)-distribution to deal with outliers in small area estimation. In Proceedings of Statistics Canada Symposium on Methodological Issues in Measuring Population Health. Statistics Canada, Ottawa, Canada.
[4] Chambers, R., Chandra, H. and Tzavidis, N. (2011). On bias-robust mean squared error estimation for pseudo-linear small area estimators. Survey Methodology 37 153-170.
[5] Chambers, R. and Tzavidis, N. (2006). \(M\)-quantile models for small area estimation. Biometrika 93 255-268. · Zbl 1153.62004 · doi:10.1093/biomet/93.2.255
[6] Chandra, H. and Chambers, R. (2009). Multipurpose small area estimation. Journal of Official Statistics 25 379-395.
[7] Chatterjee, S., Lahiri, P. and Li, H. (2008). Parametric bootstrap approximation to the distribution of EBLUP and related prediction intervals in linear mixed models. Ann. Statist. 36 1221-1245. · Zbl 1360.62378 · doi:10.1214/07-AOS512
[8] Chaudhuri, S. and Ghosh, M. (2011). Empirical likelihood for small area estimation. Biometrika 98 473-480. · Zbl 1215.62031 · doi:10.1093/biomet/asr004
[9] Chen, S. and Lahiri, P. (2002). On mean squared prediction error estimation in small area estimation problems. In Proceedings of the Survey Research Methods Section 473-477. American Statistical Association, Alexandria, VA.
[10] Das, K., Jiang, J. and Rao, J. N. K. (2004). Mean squared error of empirical predictor. Ann. Statist. 32 818-840. · Zbl 1092.62063 · doi:10.1214/009053604000000201
[11] Datta, G. S. (2009). Model-based approach to small area estimation. In Sample Surveys: Inference and Analysis (D. Pfeffermann and C. R. Rao, eds.). Handbook of Statistics 29B 251-288. North-Holland, Amsterdam.
[12] Datta, G. S., Hall, P. and Mandal, A. (2011). Model selection by testing for the presence of small-area effects, and application to area-level data. J. Amer. Statist. Assoc. 106 362-374. · Zbl 1396.62023 · doi:10.1198/jasa.2011.tm10036
[13] Datta, G. S. and Lahiri, P. (2000). A unified measure of uncertainty of estimated best linear unbiased predictors in small area estimation problems. Statist. Sinica 10 613-627. · Zbl 1054.62566
[14] Datta, G. S., Rao, J. N. K. and Smith, D. D. (2005). On measuring the variability of small area estimators under a basic area level model. Biometrika 92 183-196. · Zbl 1068.62027 · doi:10.1093/biomet/92.1.183
[15] Datta, G. S., Rao, J. N. K. and Torabi, M. (2010). Pseudo-empirical Bayes estimation of small area means under a nested error linear regression model with functional measurement errors. J. Statist. Plann. Inference 140 2952-2962. · Zbl 1207.62011 · doi:10.1016/j.jspi.2010.03.046
[16] Datta, G. S., Ghosh, M., Steorts, R. and Maples, J. (2011). Bayesian benchmarking with applications to small area estimation. Test 20 574-588. · Zbl 1274.62197 · doi:10.1007/s11749-010-0218-y
[17] Dey, D. K., Gelfand, A. E., Swartz, T. B. and Vlachos, A. K. (1998). A simulation-intensive approach for checking hierarchical models. Test 7 325-346. · Zbl 0935.62082 · doi:10.1007/BF02565116
[18] Estevao, V. M. and Särndal, C. E. (2004). Borrowing strength is not the best technique within a wide class of design-consistent domain estimators. Journal of Official Statistics 20 645-669.
[19] Estevao, V. M. and Särndal, C. E. (2006). Survey estimates by calibration on complex auxiliary information. International Statistical Review 74 127-147.
[20] Falorsi, P. D. and Righi, P. (2008). A balanced sampling approach for multi-way stratification designs for small area estimation. Survey Methodology 34 223-234.
[21] Fay, R. E. and Herriot, R. A. (1979). Estimates of income for small places: An application of James-Stein procedures to census data. J. Amer. Statist. Assoc. 74 269-277. · doi:10.1080/01621459.1979.10482505
[22] Ganesh, N. and Lahiri, P. (2008). A new class of average moment matching priors. Biometrika 95 514-520. · Zbl 1437.62466 · doi:10.1093/biomet/asn008
[23] Ghosh, M., Maiti, T. and Roy, A. (2008). Influence functions and robust Bayes and empirical Bayes small area estimation. Biometrika 95 573-585. · Zbl 1437.62471 · doi:10.1093/biomet/asn030
[24] Ghosh, M. and Rao, J. N. K. (1994). Small area estimation: An appraisal (with discussion). Statist. Sci. 9 65-93. · Zbl 0955.62538 · doi:10.1214/ss/1177010647
[25] Ghosh, M., Sinha, K. and Kim, D. (2006). Empirical and hierarchical Bayesian estimation in finite population sampling under structural measurement error models. Scand. J. Statist. 33 591-608. · Zbl 1114.62011 · doi:10.1111/j.1467-9469.2006.00492.x
[26] Ghosh, M. and Sinha, K. (2007). Empirical Bayes estimation in finite population sampling under functional measurement error models. J. Statist. Plann. Inference 137 2759-2773. · Zbl 1207.62012 · doi:10.1016/j.jspi.2006.08.008
[27] Ghosh, M., Natarajan, K., Stroud, T. W. F. and Carlin, B. P. (1998). Generalized linear models for small-area estimation. J. Amer. Statist. Assoc. 93 273-282. · Zbl 0906.62068 · doi:10.2307/2669623
[28] Gurka, M. J. (2006). Selecting the best linear mixed model under REML. Amer. Statist. 60 19-26. · doi:10.1198/000313006X90396
[29] Hall, P. and Maiti, T. (2006). On parametric bootstrap methods for small area prediction. J. R. Stat. Soc. Ser. B Stat. Methodol. 68 221-238. · Zbl 1100.62039 · doi:10.1111/j.1467-9868.2006.00541.x
[30] Huang, E. T. and Bell, W. R. (2006). Using the \(t\)-distribution in small area estimation: An application to SAIPE state poverty models. In Proceedings of the Survey Research Methods Section 3142-3149. American Statistical Association, Alexandria, VA.
[31] Jiang, J., Lahiri, P. and Wan, S. M. (2002). A unified jackknife theory for empirical best prediction with \(M\)-estimation. Ann. Statist. 30 1782-1810. · Zbl 1020.62025 · doi:10.1214/aos/1043351257
[32] Jiang, J. and Lahiri, P. (2006a). Estimation of finite population domain means: A model-assisted empirical best prediction approach. J. Amer. Statist. Assoc. 101 301-311. · Zbl 1118.62303 · doi:10.1198/016214505000000790
[33] Jiang, J. and Lahiri, P. (2006b). Mixed model prediction and small area estimation. Test 15 1-96. · Zbl 1149.62320 · doi:10.1007/BF02595419
[34] Jiang, J., Nguyen, T. and Rao, J. S. (2010). Fence method for non-parametric small area estimation. Survey Methodology 36 3-11.
[35] Jiang, J., Nguyen, T. and Rao, J. S. (2011). Best predictive small area estimation. J. Amer. Statist. Assoc. 106 732-745. · Zbl 05965230 · doi:10.1198/jasa.2011.tm10221
[36] Jiang, J., Rao, J. S., Gu, Z. and Nguyen, T. (2008). Fence methods for mixed model selection. Ann. Statist. 36 1669-1692. · Zbl 1142.62047 · doi:10.1214/07-AOS517
[37] Judkins, D. R. and Liu, J. (2000). Correcting the bias in the range of a statistic across small areas. Journal of Official Statistics 16 1-13.
[38] Kott, P. S. (2009). Calibration weighting: Combining probability samples and linear prediction models. In Sample Surveys: Inference and Analysis (D. Pfeffermann and C. R. Rao, eds.). Handbook of Statistics 29B 55-82. North-Holland, Amsterdam.
[39] Lehtonen, R., Särndal, C. E. and Veijanen, A. (2003). The effect of model choice in estimation for domains, including small domains. Survey Methodology 29 33-44.
[40] Lehtonen, R., Särndal, C. E. and Veijanen, A. (2005). Does the model matter? Comparing model-assisted and model-dependent estimators of class frequencies for domains. Statistics in Transition 7 649-673.
[41] Lehtonen, R. and Veijanen, A. (2009). Design-based methods of estimation for domains and small areas. In Sample Surveys: Inference and Analysis (D. Pfeffermann and C. R. Rao, eds.). Handbook of Statistics 29B 219-249. North-Holland, Amsterdam.
[42] Lohr, S. L. and Rao, J. N. K. (2009). Jackknife estimation of mean squared error of small area predictors in nonlinear mixed models. Biometrika 96 457-468. · Zbl 1255.62084 · doi:10.1093/biomet/asp003
[43] MacGibbon, B. and Tomberlin, T. J. (1989). Small area estimates of proportions via empirical Bayes techniques. Survey Methodology 15 237-252.
[44] Malec, D., Davis, W. W. and Cao, X. (1999). Model-based small area estimates of overweight prevalence using sample selection adjustment. Stat. Med. 18 3189-3200.
[45] Malinovsky, Y. and Rinott, Y. (2010). Prediction of ordered random effects in a simple small area model. Statist. Sinica 20 697-714. · Zbl 1187.62007
[46] Mohadjer, L., Rao, J. N. K., Liu, B., Krenzke, T. and Van De Kerckhove, W. (2007). Hierarchical Bayes small area estimates of adult literacy using unmatched sampling and linking models. In Proceedings of the Survey Research Methods Section 3203-3210. American Statistical Association, Alexandria, VA.
[47] Molina, I. and Rao, J. N. K. (2010). Small area estimation of poverty indicators. Canad. J. Statist. 38 369-385. · Zbl 1235.62140 · doi:10.1002/cjs.10051
[48] Nandram, B. and Choi, J. W. (2010). A Bayesian analysis of body mass index data from small domains under nonignorable nonresponse and selection. J. Amer. Statist. Assoc. 105 120-135. · Zbl 1397.62463 · doi:10.1198/jasa.2009.ap08443
[49] Nandram, B. and Sayit, H. (2011). A Bayesian analysis of small area probabilities under a constraint. Survey Methodology 37 137-152.
[50] Opsomer, J. D., Claeskens, G., Ranalli, M. G., Kauermann, G. and Breidt, F. J. (2008). Non-parametric small area estimation using penalized spline regression. J. R. Stat. Soc. Ser. B Stat. Methodol. 70 265-286. · Zbl 05563354 · doi:10.1111/j.1467-9868.2007.00635.x
[51] Pan, Z. and Lin, D. Y. (2005). Goodness-of-fit methods for generalized linear mixed models. Biometrics 61 1000-1009. · Zbl 1087.62081 · doi:10.1111/j.1541-0420.2005.00365.x
[52] Pfeffermann, D. (2002). Small area estimation: New developments and directions. International Statistical Review 70 125-143. · Zbl 1211.62022 · doi:10.1111/j.1751-5823.2002.tb00352.x
[53] Pfeffermann, D. and Correa, S. (2012). Empirical bootstrap bias correction and estimation of prediction mean square error in small area estimation. Biometrika 99 457-472. · Zbl 1239.62025 · doi:10.1093/biomet/ass010
[54] Pfeffermann, D. and Sverchkov, M. (2007). Small-area estimation under informative probability sampling of areas and within the selected areas. J. Amer. Statist. Assoc. 102 1427-1439. · Zbl 1333.62023 · doi:10.1198/016214507000001094
[55] Pfeffermann, D., Terryn, B. and Moura, F. A. S. (2008). Small area estimation under a two-part random effects model with application to estimation of literacy in developing countries. Survey Methodology 34 235-249.
[56] Pfeffermann, D. and Tiller, R. (2006). Small-area estimation with state-space models subject to benchmark constraints. J. Amer. Statist. Assoc. 101 1387-1397. · Zbl 1171.62346 · doi:10.1198/016214506000000593
[57] Prasad, N. G. N. and Rao, J. N. K. (1990). The estimation of the mean squared error of small-area estimators. J. Amer. Statist. Assoc. 85 163-171. · Zbl 0719.62064 · doi:10.2307/2289539
[58] Rao, J. N. K. (2003). Small Area Estimation. Wiley, Hoboken, NJ. · Zbl 1026.62003
[59] Rao, J. N. K. (2005). Inferential issues in small area estimation: Some new developments. Statistics in Transition 7 513-526.
[60] Rao, J. N. K. (2008). Some methods for small area estimation. Rivista Internazionale di Scienze Sociali 4 387-406.
[61] Rao, J. N. K., Sinha, S. K. and Roknossadati, M. (2009). Robust small area estimation using penalized spline mixed models. In Proceedings of the Survey Research Methods Section 145-153. American Statistical Association, Alexandria, VA.
[62] Sinha, S. K. and Rao, J. N. K. (2009). Robust small area estimation. Canad. J. Statist. 37 381-399. · Zbl 1177.62076 · doi:10.1002/cjs.10029
[63] Torabi, M., Datta, G. S. and Rao, J. N. K. (2009). Empirical Bayes estimation of small area means under a nested error linear regression model with measurement errors in the covariates. Scand. J. Stat. 36 355-368. · Zbl 1197.62081 · doi:10.1111/j.1467-9469.2008.00623.x
[64] Torabi, M. and Rao, J. N. K. (2008). Small area estimation under a two-level model. Survey Methodology 34 11-17.
[65] Tzavidis, N., Marchetti, S. and Chambers, R. (2010). Robust estimation of small-area means and quantiles. Aust. N. Z. J. Stat. 52 167-186. · Zbl 1337.62065 · doi:10.1111/j.1467-842X.2010.00572.x
[66] Ugarte, M. D., Militino, A. F. and Goicoa, T. (2009). Benchmarked estimates in small areas using linear mixed models with restrictions. TEST 18 342-364. · Zbl 1203.62021 · doi:10.1007/s11749-008-0094-x
[67] Vaida, F. and Blanchard, S. (2005). Conditional Akaike information for mixed-effects models. Biometrika 92 351-370. · Zbl 1094.62077 · doi:10.1093/biomet/92.2.351
[68] Wang, J., Fuller, W. A. and Qu, Y. (2008). Small area estimation under a restriction. Survey Methodology 34 29-36.
[69] Wright, D. L., Stern, H. S. and Cressie, N. (2003). Loss functions for estimation of extrema with an application to disease mapping. Canad. J. Statist. 31 251-266. · Zbl 1042.62029 · doi:10.2307/3316085
[70] Yan, G. and Sedransk, J. (2007). Bayesian diagnostic techniques for detecting hierarchical structure. Bayesian Anal. 2 735-760. · Zbl 1331.62067 · doi:10.1214/07-BA230
[71] Yan, G. and Sedransk, J. (2010). A note on Bayesian residuals as a hierarchical model diagnostic technique. Statist. Papers 51 1-10. · Zbl 1247.62100 · doi:10.1007/s00362-007-0111-2
[72] Ybarra, L. M. R. and Lohr, S. L. (2008). Small area estimation when auxiliary information is measured with error. Biometrika 95 919-931. · Zbl 1437.62666 · doi:10.1093/biomet/asn048
[73] You, Y. and Rao, J. N. K. (2002). A pseudo-empirical best linear unbiased prediction approach to small area estimation using survey weights. Canad. J. Statist. 30 431-439. · Zbl 1018.62008 · doi:10.2307/3316146
[74] Zhang, L. C. (2009). Estimates for small area compositions subjected to informative missing data. Survey Methodology 35 191-201.
[75] Zhang, L. C. and Chambers, R. L. (2004). Small area estimates for cross-classifications. J. R. Stat. Soc. Ser. B Stat. Methodol. 66 479-496. · Zbl 1062.62129 · doi:10.1111/j.1369-7412.2004.05266.x