Normal approximation and smoothness for sums of means of lattice-valued random variables. (English) Zbl 1291.62046

Let \( \hat{\theta}\) be a given statistic for which a central limit theorem applies. To obtain estimates for the difference between the exact distribution of \(T=(\hat{\theta} - E[ \hat{\theta}])/\sqrt{\mathrm{Var} ( \hat{\theta})}\) and the standard normal distribution \(\Phi (x)\), several expansions for the distribution of \(T\) are available. An important class of these expansions is given by Edgeworth expansions, which are expansions of the form \[ P (T \leq x) = \Phi (x) + \sum_{j=0}^{r} \frac{p_j(x)}{n^{j/2}} \, \phi (x) + O(n^{-(r+1)/2}), \qquad r \geq 0, \] where \(p_0 (x) \equiv 0\), \(\phi (x)\) is the derivative of \(\Phi (x)\), and for \(j \geq 1\), \(p_j(x)\) are polynomials whose coefficients depend on the cumulants of \(\hat{\theta} - E[ \hat{\theta}]\). For examples, see [P. Hall, The bootstrap and Edgeworth expansion. Springer Series in Statistics. New York etc.: Springer-Verlag. (1992; Zbl 0744.62026)], and [X. H. Zhou, C. M. Li and Z. Yang, “Improving interval estimation of binomial proportions”, Phil. Trans. Roy. Soc. Ser. A 366, 2405–2418 (2001)].
In this article the authors investigate first-order Edgeworth expansions for sums of independent means of independent lattice-valued random variables. Sums or differences of binomial proportions are special cases of the problem under investigation. Let \(\{X_{j1}, X_{j2}, \dots , X_{jn_j}\}\), \(j=1,2,\dots , k\), \(k\geq 2\), be \(k\) independent samples of independent lattice-valued random variables, with \(E[|X_{j1}|^3]< +\infty\). Put \(\mu_j = E(X_{j1})\), \(\sigma_j^2 = \mathrm{Var} (X_{j1})\), \(\overline{X}_j = n_j^{-1} \sum_i X_{ji}\), and \(S = \sum_{j=1}^k \overline{X}_j \). Under these assumptions one would expect \(S\) to have a first-order Edgeworth expansion of the form \[ P\left(\frac{S - E(S)}{\sqrt{\mathrm{Var} (S)}}\leq x \right) = \Phi (x) + \frac{\beta (1-x^2) \phi (x) }{6 \sqrt{n}} + \frac{d_n(x) \phi (x)}{\sqrt{n}} + o(n^{-1/2}), \] where \(n = n_1 + \dots + n_k\), \[ \beta = \beta (n) = \frac{\sqrt{n} \, E[(S - E(S))^3]}{{(\mathrm{Var} (S))^{3/2}}}, \] and \(d_n\) is a term, discontinuous in general, that is needed when dealing with lattice distributions; see [C.-G. Esseen, Acta Math. 77, 1–125 (1945; Zbl 0060.28705)]. The terms \(d_n\) are often referred to as continuity corrections.
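As an illustration of the quantities just defined, the following minimal Python sketch computes \(\beta\) and the smooth part \(\Phi (x) + \beta (1-x^2) \phi (x) / (6 \sqrt{n})\) of the expansion for the special case of a sum of binomial proportions. It assumes Bernoulli observations with success probabilities \(p_j\), so that \(\mathrm{Var} (\overline{X}_j) = p_j(1-p_j)/n_j\) and the third central moment of \(\overline{X}_j\) is \(p_j(1-p_j)(1-2p_j)/n_j^2\). The function names are hypothetical and do not come from the paper, and the continuity correction \(d_n\) is deliberately left out.

```python
import math


def normal_pdf(x):
    """Standard normal density phi(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    """Standard normal distribution function Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def beta_coefficient(ns, ps):
    """beta = sqrt(n) * E[(S - E S)^3] / Var(S)^(3/2) for S = sum of sample proportions.

    Assumption (not from the paper): sample j consists of n_j Bernoulli(p_j) variables.
    By independence, variances and third central moments of the means add up.
    """
    n = sum(ns)
    var_s = sum(p * (1.0 - p) / m for m, p in zip(ns, ps))
    mu3_s = sum(p * (1.0 - p) * (1.0 - 2.0 * p) / m**2 for m, p in zip(ns, ps))
    return math.sqrt(n) * mu3_s / var_s**1.5


def edgeworth_cdf(x, beta, n):
    """Smooth first-order approximation, without the continuity correction d_n."""
    return normal_cdf(x) + beta * (1.0 - x * x) * normal_pdf(x) / (6.0 * math.sqrt(n))


if __name__ == "__main__":
    ns, ps = [30, 50], [0.2, 0.7]
    b = beta_coefficient(ns, ps)
    for x in (-1.5, 0.0, 1.5):
        print(x, normal_cdf(x), edgeworth_cdf(x, b, sum(ns)))
```

When all \(p_j = 1/2\) the third moments vanish, so \(\beta = 0\) and the smooth part reduces to \(\Phi (x)\). The skewness term also vanishes at \(x = \pm 1\) for any \(\beta\).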
The authors investigate the distribution of \(S\), and describe a methodology and conditions under which continuity corrections are not needed for this multi-sample problem. Specifically, suppose that the sample sizes \(n_1, n_2, \dots , n_k\) are changing in such a way that the corresponding sequence of values of \(n\) is strictly increasing, and that \[ \min_{1 \leq j \leq k} \liminf_{n\rightarrow +\infty} \frac{n_j}{n} > 0. \] Let \(e_j\) denote the span of the distribution of \(X_{j1}\), and for every \(1 \leq j_1 < j_2 \leq k\) put \(\rho_{j_1j_2} = (e_{j_2}n_{j_1})/ (e_{j_1}n_{j_2})\). The authors prove that if for at least one of the \(\rho_{j_1j_2}\), \[ \lim_{n\rightarrow +\infty} \sqrt{n} \, | \sin (l \rho_{j_1j_2} \pi)| = +\infty \qquad \text{ for every positive integer \(l\), } \] then \[ P\left(\frac{S - E(S)}{\sqrt{\mathrm{Var} (S)}}\leq x \right) = \Phi (x) + \frac{\beta (1-x^2) \phi (x) }{6 \sqrt{n}} + o(n^{-1/2}) \] holds uniformly in \(x\). The authors also give conditions under which a continuity correction \(d_n\) is needed, and for the case \(k=2\), \(d_n\) is derived. Extensions to problems where distributions are estimated using the bootstrap are also given.
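To make the condition on \(\rho_{j_1j_2}\) concrete, the short Python sketch below evaluates \(\sqrt{n} \, | \sin (l \rho_{12} \pi)|\) for \(k=2\) and the first few \(l\) at one pair of sample sizes. The function names and the default spans \(e_1 = e_2 = 1\) (as for Bernoulli variables taking values in \(\{0,1\}\)) are assumptions made for illustration. A finite evaluation of this kind cannot establish the limit in the theorem; it only shows how the quantity behaves at given sample sizes.

```python
import math


def rho(n1, n2, e1=1.0, e2=1.0):
    """rho_12 = (e_2 * n_1) / (e_1 * n_2), where e_j is the span of sample j."""
    return (e2 * n1) / (e1 * n2)


def condition_values(n1, n2, e1=1.0, e2=1.0, max_l=5):
    """Return sqrt(n) * |sin(l * rho_12 * pi)| for l = 1, ..., max_l, with n = n1 + n2."""
    n = n1 + n2
    r = rho(n1, n2, e1, e2)
    return [math.sqrt(n) * abs(math.sin(l * r * math.pi)) for l in range(1, max_l + 1)]


if __name__ == "__main__":
    # Equal sample sizes: rho = 1, so every value is zero up to rounding error.
    print(condition_values(50, 50))
    # rho = 3/5: the value for l = 5 is zero up to rounding error.
    print(condition_values(30, 50))
```

If \(\rho_{12}\) stays equal to a fixed rational number \(a/b\), the sine vanishes whenever \(l\) is a multiple of \(b\), so the displayed condition cannot hold for that pair of samples.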

MSC:

62E17 Approximations to statistical distributions (nonasymptotic)
60F05 Central limit and other weak theorems
62F40 Bootstrap, jackknife and other resampling methods

References:

[1] Agresti, A. and Caffo, B. (2000). Simple and effective confidence intervals for proportions and differences of proportions result from adding two successes and two failures. Amer. Statist. 54 280-288. · Zbl 1250.62016
[2] Borkowf, C.B. (2006). Constructing binomial confidence intervals and near nominal coverage by adding a single imaginary failure or success. Stat. Med. 25 3679-3695.
[3] Brown, L. and Li, X. (2005). Confidence intervals for two sample binomial distribution. J. Statist. Plann. Inference 130 359-375. · Zbl 1087.62041
[4] Brown, L.D., Cai, T.T. and DasGupta, A. (2001). Interval estimation for a binomial proportion. Statist. Sci. 16 101-133. With comments and a rejoinder by the authors. · Zbl 1059.62533
[5] Brown, L.D., Cai, T.T. and DasGupta, A. (2002). Confidence intervals for a binomial proportion and asymptotic expansions. Ann. Statist. 30 160-201. · Zbl 1012.62026
[6] Clopper, C.J. and Pearson, E.S. (1934). The use of confidence or fiducial limits illustrated in the case of the binomial. Biometrika 26 404-413. · JFM 60.1175.02
[7] Duffy, D. and Santer, T.J. (1987). Confidence intervals for a binomial parameter based on multistage tests. Biometrics 43 81-94. · Zbl 0657.62091
[8] Einsiedler, M. and Ward, T. (2011). Ergodic Theory with a View Towards Number Theory. Graduate Texts in Mathematics 259. London: Springer London Ltd. · Zbl 1206.37001
[9] Esseen, C.G. (1945). Fourier analysis of distribution functions. A mathematical study of the Laplace-Gaussian law. Acta Math. 77 1-125. · Zbl 0060.28705
[10] Griffiths, M. (2004). Formulae for the convergents to some irrationals. Math. Gazette 88 28-38.
[11] Hall, P. (1982). Improving the normal approximation when constructing one-sided confidence intervals for binomial or Poisson parameters. Biometrika 69 647-652. · Zbl 0493.62036
[12] Hall, P. (1987). On the bootstrap and continuity correction. J. Roy. Statist. Soc. Ser. B 49 82-89. · Zbl 0629.62045
[13] Kuipers, L. and Niederreiter, H. (1974). Uniform Distribution of Sequences. New York: Wiley-Interscience [John Wiley & Sons]. · Zbl 0281.10001
[14] Lee, J.J., Serachitopol, D.M. and Brown, B.W. (1997). Likelihood-weighted confidence intervals for the difference of two binomial proportions. Biometrical J. 39 387-407. · Zbl 1063.62533
[15] LeVeque, W.J. (1956). Topics in Number Theory. Vol. 1. Reading, MA: Addison-Wesley. · Zbl 0070.03804
[16] Price, R.M. and Bonett, D.G. (2004). An improved confidence interval for a linear function of binomial proportions. Comput. Statist. Data Anal. 45 449-456. · Zbl 1429.62102
[17] Ribenboim, P. (2000). My Numbers, My Friends: Popular Lectures on Number Theory. New York: Springer. · Zbl 0947.11001
[18] Roth, K.F. (1955). Rational approximations to algebraic numbers. Mathematika 2 1-20. Corrigendum 168. · Zbl 0064.28501
[19] Roths, S.A. and Tebbs, J.M. (2006). Revisiting Beal’s confidence intervals for the difference of two binomial proportions. Comm. Statist. Theory Methods 35 1593-1609. · Zbl 1105.62036
[20] Singh, K. (1981). On the asymptotic accuracy of Efron’s bootstrap. Ann. Statist. 9 1187-1195. · Zbl 0494.62048
[21] Sterne, T.E. (1954). Some remarks on confidence or fiducial limits. Biometrika 41 275-278. · Zbl 0055.12807
[22] Wang, H. (2010). The monotone boundary property and the full coverage property of confidence intervals for a binomial proportion. J. Statist. Plann. Inference 140 495-501. · Zbl 1177.62040
[23] Zhou, X.H., Li, C.M. and Yang, Z. (2001). Improving interval estimation of binomial proportions. Phil. Trans. Roy. Soc. Ser. A 366 2405-2418.
[24] Zieliński, W. (2010). The shortest Clopper-Pearson confidence interval for binomial probability. Comm. Statist. Simulation Comput. 39 188-193. · Zbl 1182.62063