Stein’s method for the Poisson-Dirichlet distribution and the Ewens sampling formula, with applications to Wright-Fisher models. (English) Zbl 1476.60025

Summary: We provide a general theorem bounding the error in the approximation of a random measure of interest (for example, the empirical population measure of types in a Wright-Fisher model) by a Dirichlet process, which is a measure having Poisson-Dirichlet distributed atoms with i.i.d. labels from a diffuse distribution. The implicit metric of the approximation theorem captures the sizes and locations of the masses, and so also yields bounds on the error in approximating the masses of the measure of interest by the Poisson-Dirichlet distribution. We apply the result to bound the error in the approximation of the stationary distribution of types in the finite Wright-Fisher model with infinite-alleles mutation structure (not necessarily parent independent) by the Poisson-Dirichlet distribution. An important consequence of our result is an explicit upper bound on the total variation distance between the random partition generated by sampling from a finite Wright-Fisher stationary distribution and the Ewens sampling formula. The bound is small if the sample size \(n\) is much smaller than \(N^{1/6}\log (N)^{-1/2}\), where \(N\) is the total population size. Our analysis requires a result of separate interest, giving an explicit bound on the second moment of the number of types of a finite Wright-Fisher stationary distribution. The general approximation result follows from a new development of Stein's method for the Dirichlet process, which is obtained by viewing the Dirichlet process as the stationary distribution of a Fleming-Viot process and then applying Barbour's generator approach.
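
Illustration (not part of the paper): a minimal Python sketch of the two quantities named in the summary. It assumes the standard form of the Ewens sampling formula, in which a partition of \(n\) sampled individuals with \(a_j\) blocks of size \(j\) has probability \(\frac{n!}{\theta(\theta+1)\cdots(\theta+n-1)}\prod_{j=1}^{n}\frac{(\theta/j)^{a_j}}{a_j!}\), and it reads \(\log (N)^{-1/2}\) as \((\log N)^{-1/2}\). The mutation parameter `theta` and the function names `ewens_probability` and `sample_size_scale` are hypothetical, introduced only for this example. The sketch does not compute the paper's total variation bound.

```python
import math


def ewens_probability(counts, theta):
    """Ewens sampling formula probability of a partition.

    counts[j-1] is the number of blocks (types) of size j in the sample;
    theta > 0 is the mutation parameter (an assumption of this sketch).
    """
    n = sum(j * a for j, a in enumerate(counts, start=1))
    # Rising factorial theta * (theta + 1) * ... * (theta + n - 1).
    rising = 1.0
    for i in range(n):
        rising *= theta + i
    prob = math.factorial(n) / rising
    for j, a in enumerate(counts, start=1):
        prob *= (theta / j) ** a / math.factorial(a)
    return prob


def sample_size_scale(N):
    """N^(1/6) * (log N)^(-1/2): the scale below which the sample size n
    must stay for the bound in the summary to be small."""
    return N ** (1 / 6) / math.sqrt(math.log(N))


# All partitions of a sample of size 3, written as (a_1, a_2, a_3).
partitions_of_3 = [(3, 0, 0), (1, 1, 0), (0, 0, 1)]
total = sum(ewens_probability(p, theta=1.0) for p in partitions_of_3)
print(total)                      # the probabilities sum to 1 up to rounding
print(sample_size_scale(10**6))   # scale for a population of one million
```

The scale grows very slowly with \(N\): even for a population of one million it is a small single-digit number, so the stated condition on \(n\) is restrictive for realistic population sizes.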

MSC:

60E05 Probability distributions: general theory
92D25 Population dynamics (general)
60B10 Convergence of probability measures
60J25 Continuous-time Markov processes on general state spaces

References:

[1] Aldous, D. J. (1985). Exchangeability and related topics. In École d’été de Probabilités de Saint-Flour, XIII—1983. Lecture Notes in Math. 1117 1-198. Springer, Berlin. · Zbl 0562.60042 · doi:10.1007/BFb0099421
[2] Arratia, R., Barbour, A. D. and Tavaré, S. (2003). Logarithmic Combinatorial Structures: A Probabilistic Approach. EMS Monographs in Mathematics. European Mathematical Society (EMS), Zürich. · Zbl 1040.60001 · doi:10.4171/000
[3] Barbour, A. D. (1988). Stein’s method and Poisson process convergence. J. Appl. Probab. 25A 175-184. · Zbl 0661.60034 · doi:10.1017/s0021900200040341
[4] Barbour, A. D. (1990). Stein’s method for diffusion approximations. Probab. Theory Related Fields 84 297-322. · Zbl 0665.60008 · doi:10.1007/BF01197887
[5] Barbour, A. D., Holst, L. and Janson, S. (1992). Poisson Approximation. Oxford Studies in Probability 2. Oxford Univ. Press, New York. · Zbl 0746.60002
[6] Bartroff, J., Goldstein, L. and Işlak, Ü. (2018). Bounded size biased couplings, log concave distributions and concentration of measure for occupancy models. Bernoulli 24 3283-3317. · Zbl 1407.60032 · doi:10.3150/17-BEJ961
[7] Baxendale, P. (2011). T. E. Harris’s contributions to recurrent Markov processes and stochastic flows. Ann. Probab. 39 417-428. · Zbl 1213.60116 · doi:10.1214/10-AOP594
[8] Bhaskar, A., Clark, A. G. and Song, Y. S. (2014). Distortion of genealogical properties when the sample is very large. Proc. Natl. Acad. Sci. USA 111 2385-2390.
[9] Bourguin, S. and Campese, S. C. (2019). Approximation of Hilbert-valued Gaussian measures on Dirichlet structures. Preprint. Available at arXiv:1905.05127. · Zbl 1480.46057
[10] Chatterjee, S. (2014). A short survey of Stein’s method. In Proceedings of the International Congress of Mathematicians—Seoul 2014, Vol. IV 1-24. Kyung Moon Sa, Seoul. · Zbl 1373.60052
[11] Chatterjee, S., Diaconis, P. and Meckes, E. (2005). Exchangeable pairs and Poisson approximation. Probab. Surv. 2 64-106. · Zbl 1189.60072 · doi:10.1214/154957805100000096
[12] Chatterjee, S., Fulman, J. and Röllin, A. (2011). Exponential approximation by Stein’s method and spectral graph theory. ALEA Lat. Am. J. Probab. Math. Stat. 8 197-223. · Zbl 1276.60010
[13] Chatterjee, S. and Meckes, E. (2008). Multivariate normal approximation using exchangeable pairs. ALEA Lat. Am. J. Probab. Math. Stat. 4 257-283. · Zbl 1162.60310
[14] Chatterjee, S. and Shao, Q.-M. (2011). Nonnormal approximation by Stein’s method of exchangeable pairs with application to the Curie-Weiss model. Ann. Appl. Probab. 21 464-483. · Zbl 1216.60018 · doi:10.1214/10-AAP712
[15] Chen, L. H. Y., Goldstein, L. and Shao, Q.-M. (2011). Normal Approximation by Stein’s Method. Probability and Its Applications (New York). Springer, Heidelberg. · Zbl 1213.62027 · doi:10.1007/978-3-642-15007-4
[16] Chen, L. H. Y. and Xia, A. (2004). Stein’s method, Palm theory and Poisson process approximation. Ann. Probab. 32 2545-2569. · Zbl 1057.60051 · doi:10.1214/009117904000000027
[17] Dalal, A. and Schmutz, E. (2002). Compositions of random functions on a finite set. Electron. J. Combin. 9 Research Paper 26. · Zbl 0994.60003
[18] Dawson, D. A. and Hochberg, K. J. (1982). Wandering random measures in the Fleming-Viot model. Ann. Probab. 10 554-580. · Zbl 0492.60045
[19] Döbler, C. (2015). Stein’s method of exchangeable pairs for the beta distribution and generalizations. Electron. J. Probab. 20 Art. ID 109. · Zbl 1328.60064 · doi:10.1214/EJP.v20-3933
[20] Ethier, S. N. (1990). The infinitely-many-neutral-alleles diffusion model with ages. Adv. in Appl. Probab. 22 1-24. · Zbl 0699.92012 · doi:10.2307/1427594
[21] Ethier, S. N. and Griffiths, R. C. (1993). The transition function of a Fleming-Viot process. Ann. Probab. 21 1571-1590. · Zbl 0778.60038
[22] Ethier, S. N. and Kurtz, T. G. (1981). The infinitely-many-neutral-alleles diffusion model. Adv. in Appl. Probab. 13 429-452. · Zbl 0483.60076 · doi:10.2307/1426779
[23] Ethier, S. N. and Kurtz, T. G. (1986). Markov Processes: Characterization and Convergence. Wiley Series in Probability and Mathematical Statistics: Probability and Mathematical Statistics. Wiley, New York. · Zbl 0592.60049 · doi:10.1002/9780470316658
[24] Ethier, S. N. and Kurtz, T. G. (1993). Fleming-Viot processes in population genetics. SIAM J. Control Optim. 31 345-386. · Zbl 0774.60045 · doi:10.1137/0331019
[25] Ethier, S. N. and Kurtz, T. G. (1994). Convergence to Fleming-Viot processes in the weak atomic topology. Stochastic Process. Appl. 54 1-27. · Zbl 0817.60029 · doi:10.1016/0304-4149(94)00006-9
[26] Ewens, W. J. (2004). Mathematical Population Genetics. I: Theoretical Introduction, 2nd ed. Interdisciplinary Applied Mathematics 27. Springer, New York. · Zbl 1060.92046 · doi:10.1007/978-0-387-21822-9
[27] Feng, S. (2010). The Poisson-Dirichlet Distribution and Related Topics: Models and asymptotic behaviors. Probability and Its Applications (New York). Springer, Heidelberg. · Zbl 1214.60001 · doi:10.1007/978-3-642-11194-5
[28] Fill, J. A. (2002). On compositions of random functions on a finite set. Preprint.
[29] Fleming, W. H. and Viot, M. (1979). Some measure-valued Markov processes in population genetics theory. Indiana Univ. Math. J. 28 817-843. · Zbl 0444.60064 · doi:10.1512/iumj.1979.28.28058
[30] Fu, Y.-X. (2006). Exact coalescent for the Wright-Fisher model. Theor. Popul. Biol. 69 385-394. · Zbl 1120.92028
[31] Fulman, J. and Ross, N. (2013). Exponential approximation and Stein’s method of exchangeable pairs. ALEA Lat. Am. J. Probab. Math. Stat. 10 1-13. · Zbl 1277.60015
[32] Gan, H. L., Röllin, A. and Ross, N. (2017). Dirichlet approximation of equilibrium distributions in Cannings models with mutation. Adv. in Appl. Probab. 49 927-959. · Zbl 1429.92104 · doi:10.1017/apr.2017.27
[33] Ghosal, S. (2010). The Dirichlet process, related priors and posterior asymptotics. In Bayesian Nonparametrics. Camb. Ser. Stat. Probab. Math. 28 35-79. Cambridge Univ. Press, Cambridge.
[34] Gorham, J., Duncan, A. B., Vollmer, S. J. and Mackey, L. (2019). Measuring sample quality with diffusions. Ann. Appl. Probab. 29 2884-2928. · Zbl 1439.60073 · doi:10.1214/19-AAP1467
[35] Götze, F. (1991). On the rate of convergence in the multivariate CLT. Ann. Probab. 19 724-739. · Zbl 0729.62051
[36] Karlin, S. and McGregor, J. (1967). The number of mutant forms maintained in a population. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Vol. 4: Biology and Problems of Health 415-438. Univ. California Press, Berkeley, CA.
[37] Hitczenko, P. and Pemantle, R. (2005). Central limit theorem for the size of the range of a renewal process. Statist. Probab. Lett. 72 249-264. · Zbl 1069.60024 · doi:10.1016/j.spl.2004.12.011
[38] Kallenberg, O. (2017). Random Measures, Theory and Applications. Probability Theory and Stochastic Modelling 77. Springer, Cham. · Zbl 1376.60003 · doi:10.1007/978-3-319-41598-7
[39] Kasprzak, M. J. (2017a). Diffusion approximations via Stein’s method and time changes. Preprint. Available at arXiv:1701.07633.
[40] Kasprzak, M. J. (2017b). Multivariate functional approximations via Stein’s method of exchangeable pairs. Preprint. Available at arXiv:1710.09263.
[41] Kasprzak, M. J. (2020). Stein’s method for multivariate Brownian approximations of sums under dependence. Stochastic Process. Appl. 130 4927-4967. · Zbl 1445.60058 · doi:10.1016/j.spa.2020.02.006
[42] Kingman, J. F. C. (1975). Random discrete distribution. J. Roy. Statist. Soc. Ser. B 37 1-22. · Zbl 0331.62019
[43] Lessard, S. (2007). An exact sampling formula for the Wright-Fisher model and a solution to a conjecture about the finite-island model. Genetics 177 1249-1254.
[44] Lessard, S. (2010). Recurrence equations for the probability distribution of sample configurations in exact population genetics models. J. Appl. Probab. 47 732-751. · Zbl 1204.92052 · doi:10.1239/jap/1285335406
[45] McSweeney, J. K. and Pittel, B. G. (2008). Expected coalescence time for a nonuniform allocation process. Adv. in Appl. Probab. 40 1002-1032. · Zbl 1165.60006
[46] Meyn, S. P. and Tweedie, R. L. (1993). Markov Chains and Stochastic Stability. Communications and Control Engineering Series. Springer, London. · Zbl 0925.60001 · doi:10.1007/978-1-4471-3267-7
[47] Möhle, M. (2004). The time back to the most recent common ancestor in exchangeable population models. Adv. in Appl. Probab. 36 78-97. · Zbl 1042.60054 · doi:10.1239/aap/1077134465
[48] Petrov, L. A. (2009). A two-parameter family of infinite-dimensional diffusions on the Kingman simplex. Funktsional. Anal. i Prilozhen. 43 45-66. · Zbl 1204.60076 · doi:10.1007/s10688-009-0036-8
[49] Pitman, J. (2006). Combinatorial Stochastic Processes. Lecture Notes in Math. 1875. Springer, Berlin. · Zbl 1103.60004
[50] Reinert, G. (2005). Three general approaches to Stein’s method. In An Introduction to Stein’s Method. Lect. Notes Ser. Inst. Math. Sci. Natl. Univ. Singap. 4 183-221. Singapore Univ. Press, Singapore. · doi:10.1142/9789812567680_0004
[51] Reinert, G. and Röllin, A. (2009). Multivariate normal approximation with Stein’s method of exchangeable pairs under a general linearity condition. Ann. Probab. 37 2150-2173. · Zbl 1200.62010 · doi:10.1214/09-AOP467
[52] Rinott, Y. and Rotar, V. (1997). On coupling constructions and rates in the CLT for dependent summands with applications to the antivoter model and weighted \(U\)-statistics. Ann. Appl. Probab. 7 1080-1105. · Zbl 0890.60019 · doi:10.1214/aoap/1043862425
[53] Röllin, A. (2007). Translated Poisson approximation using exchangeable pair couplings. Ann. Appl. Probab. 17 1596-1614. · Zbl 1143.60020 · doi:10.1214/105051607000000258
[54] Röllin, A. (2008). A note on the exchangeability condition in Stein’s method. Statist. Probab. Lett. 78 1800-1806. · Zbl 1147.62017 · doi:10.1016/j.spl.2008.01.043
[55] Ross, N. (2011). Fundamentals of Stein’s method. Probab. Surv. 8 210-293. · Zbl 1245.60033 · doi:10.1214/11-PS182
[56] Stein, C. (1972). A bound for the error in the normal approximation to the distribution of a sum of dependent random variables. In Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability (Univ. California, Berkeley, Calif., 1970/1971), Vol. II: Probability Theory 583-602. Univ. California Press, Berkeley, CA. · Zbl 0278.60026
[57] Stein, C. (1986). Approximate Computation of Expectations. Institute of Mathematical Statistics Lecture Notes—Monograph Series 7. IMS, Hayward, CA. · Zbl 0721.60016
[58] Wright, S. (1949). Adaptation and selection. In Genetics, Paleontology and Evolution 365-389. Princeton Univ. Press, Princeton, NJ.