Estimation of sums of random variables: examples and information bounds. (English) Zbl 1086.62035

Summary: This paper concerns the estimation of sums of functions of observable and unobservable variables. Lower bounds for the asymptotic variance and a convolution theorem are derived in general finite- and infinite-dimensional models. An explicit relationship is established between efficient influence functions for the estimation of sums of variables and the estimation of their means. Certain “plug-in” estimators are proved to be asymptotically efficient in finite-dimensional models, while the “\(u,v\)” estimators of H. Robbins [Statistical Decision Theory and Related Topics 4, 4th Purdue Symp., West Lafayette/Indiana 1986, Vol.1, 265–269 (1988; Zbl 0685.62032)] are proved to be efficient in infinite-dimensional mixture models. Examples include certain species, network and data confidentiality problems.
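
As a rough illustration of what a "\(u,v\)"-type estimator of a sum can look like, the sketch below assumes a Poisson mixture, which the summary does not specify: each unobservable rate theta_i generates an observable count X_i ~ Poisson(theta_i). It relies on the standard Poisson identity E[theta v(X)] = E[X v(X - 1)], so the sum of theta_i v(X_i) can be estimated by the sum of X_i v(X_i - 1). The function names, the choice of v, and the simulated rates are all hypothetical, and the paper's own models and estimators may differ.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw one Poisson(rate) variate with Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def uv_estimate(counts, v):
    """Estimate sum_i theta_i * v(X_i) by sum_i X_i * v(X_i - 1).

    Assumes X_i | theta_i ~ Poisson(theta_i); terms with X_i = 0 contribute 0.
    """
    return sum(x * v(x - 1) for x in counts if x > 0)


def unseen(x):
    """Hypothetical choice of v: indicator that a unit was not observed."""
    return 1.0 if x == 0 else 0.0


# Simulated data (illustrative only): rates are hidden, counts are observed.
rng = random.Random(0)
rates = [rng.uniform(0.1, 2.0) for _ in range(5000)]
counts = [sample_poisson(r, rng) for r in rates]

# Target: total rate carried by units with zero observed count.
target = sum(r * unseen(x) for r, x in zip(rates, counts))
estimate = uv_estimate(counts, unseen)
print(target, estimate)
```

With this choice of v the estimate reduces to the number of units observed exactly once, which is the kind of quantity that appears in species-type problems.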

MSC:

62F12 Asymptotic properties of parametric estimators
62G05 Nonparametric estimation
62P99 Applications of statistics
62F10 Point estimation
62G20 Asymptotic properties of nonparametric inference
62F15 Bayesian inference
62P10 Applications of statistics to biology and medical sciences; meta analysis

Citations:

Zbl 0685.62032

References:

[1] Benedetti, R. and Franconi, L. (1998). Statistical and technological solutions for controlled data dissemination. In Pre-proceedings of New Techniques and Technologies for Statistics, Sorrento 1 225–232.
[2] Bethlehem, J., Keller, W. and Pannekoek, J. (1990). Disclosure control of microdata. J. Amer. Statist. Assoc. 85 38–45.
[3] Bickel, P. J., Klaassen, C. A. J., Ritov, Y. and Wellner, J. A. (1993). Efficient and Adaptive Estimation for Semiparametric Models. Johns Hopkins Univ. Press, Baltimore. · Zbl 0786.62001
[4] Bunge, J. and Fitzpatrick, M. (1993). Estimating the number of species: A review. J. Amer. Statist. Assoc. 88 364–373.
[5] Chao, A. (1984). Nonparametric estimation of the number of classes in a population. Scand. J. Statist. 11 265–270.
[6] Chao, A. and Bunge, J. (2002). Estimating the number of species in a stochastic abundance model. Biometrics 58 531–539. · Zbl 1210.62225 · doi:10.1111/j.0006-341X.2002.00531.x
[7] Clauset, A. and Moore, C. (2003). Traceroute sampling makes random graphs appear to have power law degree.
[8] Coates, A., Hero, A., Nowak, R. and Yu, B. (2002). Internet tomography. IEEE Signal Processing Magazine 19 (3) 47–65.
[9] Darroch, J. N. and Ratcliff, D. (1980). A note on capture–recapture estimation. Biometrics 36 149–153. · Zbl 0437.62100 · doi:10.2307/2530505
[10] Duncan, G. T. and Pearson, R. W. (1991). Enhancing access to microdata while protecting confidentiality: Prospects for the future (with discussion). Statist. Sci. 6 219–239.
[11] Engen, S. (1974). On species frequency models. Biometrika 61 263–270. · Zbl 0281.62062 · doi:10.1093/biomet/61.2.263
[12] Faloutsos, M., Faloutsos, P. and Faloutsos, C. (1999). On power-law relationships of the Internet topology. In Proc. ACM SIGCOMM 1999 251–262. ACM Press, New York. · Zbl 0889.68050
[13] Fisher, R. A., Corbet, A. S. and Williams, C. B. (1943). The relation between the number of species and the number of individuals in a random sample of an animal population. J. Animal Ecology 12 42–58.
[14] Good, I. J. (1953). The population frequencies of species and the estimation of population parameters. Biometrika 40 237–264. · Zbl 0051.37103 · doi:10.1093/biomet/40.3-4.237
[15] Govindan, R. and Tangmunarunkit, H. (2000). Heuristics for Internet map discovery. In Proc. IEEE INFOCOM 2000 3 1371–1380. IEEE Press, New York.
[16] Lakhina, A., Byers, J., Crovella, M. and Xie, P. (2003). Sampling biases in IP topology measurements. In Proc. IEEE INFOCOM 2003 1 332–341. IEEE Press, New York.
[17] Pfanzagl, J. (with the assistance of W. Wefelmeyer) (1982). Contributions to a General Asymptotic Statistical Theory. Lecture Notes in Statist. 13. Springer, New York. · Zbl 0512.62001
[18] Polettini, S. and Seri, G. (2003). Guidelines for the protection of social micro-data using individual risk methodology. Application within \(\mu\)-Argus version 3.2, CASC Project Deliverable No. 1.2-D3. Available at neon.vb.cbs.nl/casc/deliv/12D3_guidelines.pdf.
[19] Rao, C. R. (1971). Some comments on the logarithmic series distribution in the analysis of insect trap data. In Statistical Ecology (G. P. Patil, E. C. Pielou and W. E. Waters, eds.) 1 131–142. Pennsylvania State Univ. Press, University Park.
[20] Rieder, H. (2000). One-sided confidence about functionals over tangent cones. Available at www.uni-bayreuth.de/departments/math/org/mathe7/RIEDER/pubs/cc.pdf.
[21] Rinott, Y. (2003). On models for statistical disclosure risk estimation. Working paper no. 16, Joint ECE/Eurostat Work Session on Data Confidentiality, Luxemburg, 2003. Available at www.unece.org/stats/documents/2003/04/confidentiality/wp.16.e.pdf.
[22] Robbins, H. (1977). Prediction and estimation for the compound Poisson distribution. Proc. Natl. Acad. Sci. U.S.A. 74 2670–2671. · Zbl 0359.62014 · doi:10.1073/pnas.74.7.2670
[23] Robbins, H. (1980). An empirical Bayes estimation problem. Proc. Natl. Acad. Sci. U.S.A. 77 6988–6989. · Zbl 0456.62029 · doi:10.1073/pnas.77.12.6988
[24] Robbins, H. (1988). The \(u,v\) method of estimation. In Statistical Decision Theory and Related Topics IV (S. S. Gupta and J. O. Berger, eds.) 1 265–270. Springer, New York. · Zbl 0685.62032
[25] Robbins, H. and Zhang, C.-H. (1988). Estimating a treatment effect under biased sampling. Proc. Natl. Acad. Sci. U.S.A. 85 3670–3672. · Zbl 0639.62098 · doi:10.1073/pnas.85.11.3670
[26] Robbins, H. and Zhang, C.-H. (1989). Estimating the superiority of a drug to a placebo when all and only those patients at risk are treated with the drug. Proc. Natl. Acad. Sci. U.S.A. 86 3003–3005. · Zbl 0665.62108 · doi:10.1073/pnas.86.9.3003
[27] Robbins, H. and Zhang, C.-H. (1991). Estimating a multiplicative treatment effect under biased allocation. Biometrika 78 349–354. · Zbl 0735.62031 · doi:10.1093/biomet/78.2.349
[28] Robbins, H. and Zhang, C.-H. (2000). Efficiency of the \(u,v\) method of estimation. Proc. Natl. Acad. Sci. U.S.A. 97 12,976–12,979. · Zbl 0968.62030 · doi:10.1073/pnas.97.24.12976
[29] Sampford, M. R. (1955). The truncated negative binomial distribution. Biometrika 42 58–69. · Zbl 0065.12703 · doi:10.1093/biomet/42.1-2.58
[30] Spring, N., Mahajan, R. and Wetherall, D. (2002). Measuring ISP topologies with rocketfuel. In Proc. ACM SIGCOMM 2002 133–145. ACM Press, New York.
[31] van der Vaart, A. W. (1998). Asymptotic Statistics. Cambridge Univ. Press. · Zbl 0910.62001 · doi:10.1017/CBO9780511802256
[32] Vardi, Y. (1996). Network tomography: Estimating source-destination traffic intensities from link data. J. Amer. Statist. Assoc. 91 365–377. · Zbl 0871.62103 · doi:10.2307/2291416