×

A fast MCMC algorithm for the uniform sampling of binary matrices with fixed margins. (English) Zbl 1439.62060

Summary: Uniform sampling of binary matrix with fixed margins is an important and difficult problem in statistics, computer science, ecology and so on. The well-known swap algorithm would be inefficient when the size of the matrix becomes large or when the matrix is too sparse/dense. Here we propose the Rectangle Loop algorithm, a Markov chain Monte Carlo algorithm to sample binary matrices with fixed margins uniformly. Theoretically the Rectangle Loop algorithm is better than the swap algorithm in Peskun’s order. Empirical studies also demonstrate that the Rectangle Loop algorithm is remarkably more efficient than the swap algorithm.

MSC:

62D05 Sampling theory, sample surveys
62-08 Computational methods for problems pertaining to statistics
62H12 Estimation in multivariate analysis
65C05 Monte Carlo methods

References:

[1] Julian Besag and Peter Clifford. Generalized Monte Carlo significance tests., Biometrika, 76(4):633-642, 1989. · Zbl 0679.62033 · doi:10.1093/biomet/76.4.633
[2] Corrie Jacobien Carstens and Pieter Kleer. Speeding up switch Markov chains for sampling bipartite graphs with given degree sequence. In, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2018. · Zbl 1522.60060
[3] Yuguo Chen, Persi Diaconis, Susan P Holmes, and Jun S Liu. Sequential Monte Carlo methods for statistical analysis of tables., Journal of the American Statistical Association, 100(469):109-120, 2005. · Zbl 1117.62310 · doi:10.1198/016214504000001303
[4] Persi Diaconis and Anil Gangolli. Rectangular arrays with fixed margins. In, Discrete probability and algorithms, pages 15-41. Springer, 1995. · Zbl 0839.05005 · doi:10.1007/978-1-4612-0801-3
[5] Persi Diaconis, Bernd Sturmfels, et al. Algebraic algorithms for sampling from conditional distributions., The Annals of statistics, 26(1):363-397, 1998. · Zbl 0952.62088 · doi:10.1214/aos/1030563990
[6] David Gale et al. A theorem on flows in networks., Pacific J. Math, 7(2) :1073-1082, 1957. · Zbl 0087.16303 · doi:10.2140/pjm.1957.7.1073
[7] Matthew T Harrison and Jeffrey W Miller. Importance sampling for weighted binary random matrices with specified margins., arXiv preprint arXiv:1301.3928, 2013.
[8] RB Holmes and LK Jones. On uniform generation of two-way tables with fixed margins and the conditional volume test of diaconis and efron., The Annals of Statistics, pages 64-68, 1996. · Zbl 0888.62060 · doi:10.1214/aos/1033066199
[9] Ravi Kannan, Prasad Tetali, and Santosh Vempala. Simple Markov-chain algorithms for generating bipartite graphs and tournaments., Random Structures & Algorithms, 14(4):293-308, 1999. · Zbl 0933.05145 · doi:10.1002/(SICI)1098-2418(199907)14:4<293::AID-RSA1>3.0.CO;2-G
[10] István Miklós and János Podani. Randomization of presence-absence matrices: comments and new algorithms., Ecology, 85(1):86-92, 2004.
[11] Peter H Peskun. Optimum Monte-Carlo sampling using Markov chains., Biometrika, 60(3):607-612, 1973. · Zbl 0271.62041 · doi:10.1093/biomet/60.3.607
[12] A Ramachandra Rao, Rabindranath Jana, and Suraj Bandyopadhyay. A Markov chain Monte Carlo method for generating random (0, 1)-matrices with given marginals., Sankhyā: The Indian Journal of Statistics, Series A, pages 225-242, 1996. · Zbl 0901.60047
[13] Steffen Rechner. Markov chain Monte Carlo algorithms for the uniform sampling of combinatorial objects., 2018.
[14] Alan Roberts and Lewis Stone. Island-sharing by archipelago species., Oecologia, 83(4):560-567, 1990.
[15] Herbert J Ryser. Combinatorial properties of matrices of zeros and ones. In, Classic Papers in Combinatorics, pages 269-275. Springer, 2009.
[16] Tom AB Snijders. Enumeration and simulation methods for 0-1 matrices with given marginals., Psychometrika, 56(3):397-417, 1991. · Zbl 0850.05002 · doi:10.1007/BF02294482
[17] Giovanni Strona, Domenico Nappo, Francesco Boccacci, Simone Fattorini, and Jesus San-Miguel-Ayanz. A fast and unbiased procedure to randomize ecological binary matrices with fixed row and column totals., Nature communications, 5 :4114, 2014.
[18] Norman D Verhelst. · Zbl 1284.62757 · doi:10.1007/s11336-008-9062-3
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.