Markov chains for Monte Carlo tests of genetic equilibrium in multidimensional contingency tables. (English) Zbl 0871.62094

Summary: Hardy-Weinberg equilibrium and linkage equilibrium are fundamental concepts in population genetics. In practice, testing linkage equilibrium in haplotype data is equivalent to testing independence in a large, sparse, multidimensional contingency table. Testing Hardy-Weinberg and linkage equilibrium simultaneously on multilocus genotype data introduces the additional complications of missing information and symmetry constraints on marginal probabilities. To avoid unreliable large-sample approximations for sparse contingency tables, one can use exact tests like Fisher’s classical test that conditions on observed marginal totals. Unfortunately, computing \(p\)-values for exact tests is often infeasible because of the large number of tables consistent with the marginal totals of an observed table.
We develop here Markov chains for sampling from the appropriate conditional distributions for testing genetic equilibrium. These chains compare favorably with a parallel, independent-sampling method that we present. For \(n\) haplotype observations on \(J\) loci, the Markov chains converge to their stationary distributions in \([(J-1)n \ln n]/2 +O(n)\) steps and can be an efficient tool for estimating \(p\)-values. Our theoretical treatment of these results involves strong stationary stopping times, order statistics, large deviations and the embedding of Poisson processes. We include some general results on the application of strong stationary times to bounding the precision and bias of sample average estimators.
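As a rough illustration of the idea in the summary, the sketch below estimates a Monte Carlo \(p\)-value for linkage equilibrium in haplotype data with a simple swap chain: each step exchanges the alleles of two randomly chosen haplotypes at one randomly chosen locus, which leaves the single-locus allele counts (the marginal totals) unchanged. This is an assumption-laden sketch, not necessarily the chain analyzed in the paper. The function names, the choice of test statistic (the sum of \(c \ln c\) over haplotype counts \(c\), which equals the likelihood-ratio statistic up to constants once the margins are fixed), and the use of the leading term \([(J-1)n \ln n]/2\) as both burn-in and thinning interval are all illustrative choices made here.

```python
import math
import random
from collections import Counter


def leading_order_steps(n, J):
    """Leading term (J-1) * n * ln(n) / 2 of the step count quoted in the summary."""
    return math.ceil((J - 1) * n * math.log(n) / 2)


def swap_step(haps, rng):
    """Swap the alleles of two randomly chosen haplotypes at one random locus.

    Locus 0 is held fixed: shuffling the other J-1 columns already reaches
    every table with the observed single-locus allele counts.
    """
    locus = rng.randrange(1, len(haps[0]))
    i, k = rng.randrange(len(haps)), rng.randrange(len(haps))
    haps[i][locus], haps[k][locus] = haps[k][locus], haps[i][locus]


def statistic(haps):
    """Sum of c*ln(c) over haplotype counts; large values suggest disequilibrium."""
    counts = Counter(tuple(h) for h in haps)
    return sum(c * math.log(c) for c in counts.values())


def monte_carlo_p_value(haps, n_samples=200, seed=0):
    """Estimate the conditional p-value from dependent samples of the swap chain."""
    rng = random.Random(seed)
    state = [list(h) for h in haps]  # work on a copy of the data
    n, J = len(state), len(state[0])
    observed = statistic(state)
    gap = leading_order_steps(n, J)  # heuristic burn-in and thinning interval
    hits = 0
    for _ in range(n_samples):
        for _ in range(gap):
            swap_step(state, rng)
        if statistic(state) >= observed - 1e-12:  # tolerance for float round-off
            hits += 1
    # Count the observed table among the samples so the estimate is never zero.
    return (hits + 1) / (n_samples + 1)
```

For example, `monte_carlo_p_value([[0, 0]] * 20 + [[1, 1]] * 20)` uses 40 two-locus haplotypes in complete association and should return a small \(p\)-value, whereas data generated with independent loci should typically not. The sketch needs at least two loci and two haplotypes.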

MSC:

62P10 Applications of statistics to biology and medical sciences; meta analysis
92D10 Genetics and epigenetics
65C05 Monte Carlo methods
62H17 Contingency tables

Software:

AS 144
