## Consistent estimation of the spectrum of trace class data augmentation algorithms.(English)Zbl 1428.62409

Summary: Markov chain Monte Carlo is widely used in a variety of scientific applications to generate approximate samples from intractable distributions. A thorough understanding of the convergence and mixing properties of these Markov chains can be obtained by studying the spectrum of the associated Markov operator. While several methods to bound/estimate the second largest eigenvalue are available in the literature, very few general techniques for consistent estimation of the entire spectrum have been proposed. Existing methods for this purpose require the Markov transition density to be available in closed form, which is often not true in practice, especially in modern statistical applications. In this paper, we propose a novel method to consistently estimate the entire spectrum of a general class of Markov chains arising from a popular and widely used statistical approach known as Data Augmentation. The transition densities of these Markov chains can often only be expressed as intractable integrals. We illustrate the applicability of our method using real and simulated data.

### MSC:

 62M15 Inference from stochastic processes and spectral analysis 65C05 Monte Carlo methods 60J10 Markov chains (discrete-time Markov processes on discrete state spaces) 60J22 Computational methods in Markov chains 62M05 Markov processes: estimation; hidden Markov models

### Software:

reshape; Rcpp; BayesLogit; Gibbsit; tsbridge; ggplot2; bootlib
Full Text:

### References:

 [1] Adamczak, R. and Bednorz, W. (2015). Some remarks on MCMC estimation of spectra of integral operators. Bernoulli21 2073-2092. · Zbl 1350.60027 · doi:10.3150/14-BEJ635 [2] Albert, J.H. and Chib, S. (1993). Bayesian analysis of binary and polychotomous response data. J. Amer. Statist. Assoc.88 669-679. · Zbl 0774.62031 · doi:10.1080/01621459.1993.10476321 [3] Asmussen, S. and Glynn, P.W. (2011). A new proof of convergence of MCMC via the ergodic theorem. Statist. Probab. Lett.81 1482-1485. · Zbl 1232.65015 · doi:10.1016/j.spl.2011.05.004 [4] Athreya, K.B. and Atuncar, G.S. (1998). Kernel estimation for real-valued Markov chains. Sankhya, Ser. A60 1-17. · Zbl 0977.62093 [5] Bennett, C.H. (1976). Efficient estimation of free energy differences from Monte Carlo data. J. Comput. Phys.22 245-268. [6] Chakraborty, S. and Khare, K. (2017). Convergence properties of Gibbs samplers for Bayesian probit regression with proper priors. Electron. J. Stat.11 177-210. · Zbl 1366.60093 · doi:10.1214/16-EJS1219 [7] Chakraborty, S. and Khare, K. (2019). Supplement to “Consistent estimation of the spectrum of trace class data augmentation algorithms.” DOI:10.3150/19-BEJ1112SUPP. · Zbl 1428.62409 [8] Choi, H.M. and Hobert, J.P. (2013). The Polya-gamma Gibbs sampler for Bayesian logistic regression is uniformly ergodic. Electron. J. Stat.7 2054-2064. · Zbl 1349.60123 · doi:10.1214/13-EJS837 [9] Choi, H.M. and Román, J.C. (2017). Analysis of Polya-Gamma Gibbs sampler for Bayesian logistic analysis of variance. Electron. J. Stat.11 326-337. · Zbl 1356.60117 · doi:10.1214/17-EJS1227 [10] Conway, J.B. (1990). A Course in Functional Analysis, 2nd ed. Graduate Texts in Mathematics96. New York: Springer. · Zbl 0706.46003 [11] Davison, A.C. and Hinkley, D.V. (1997). Bootstrap Methods and Their Application. Cambridge Series in Statistical and Probabilistic Mathematics1. Cambridge: Cambridge Univ. Press. · Zbl 0886.62001 [12] Diaconis, P., Khare, K. and Saloff-Coste, L. (2008). Gibbs sampling, exponential families and orthogonal polynomials. Statist. Sci.23 151-178. · Zbl 1327.62058 · doi:10.1214/07-STS252 [13] Diaconis, P. and Saloff-Coste, L. (1993). Comparison techniques for random walk on finite groups. Ann. Probab.21 2131-2156. · Zbl 0790.60011 · doi:10.1214/aop/1176989013 [14] Diaconis, P. and Saloff-Coste, L. (1996). Nash inequalities for finite Markov chains. J. Theoret. Probab.9 459-510. · Zbl 0870.60064 · doi:10.1007/BF02214660 [15] Diaconis, P. and Stroock, D. (1991). Geometric bounds for eigenvalues of Markov chains. Ann. Appl. Probab.1 36-61. · Zbl 0731.60061 · doi:10.1214/aoap/1177005980 [16] Eddelbuettel, D. (2013). Seamless R and $$C ++$$ Integration with Rcpp. New York: Springer. · Zbl 1283.62001 [17] François, O. (2000). Geometric inequalities for the eigenvalues of concentrated Markov chains. J. Appl. Probab.37 15-28. · Zbl 0961.60067 [18] Frühwirth-Schnatter, S. (2001). Markov chain Monte Carlo estimation of classical and dynamic switching and mixture models. J. Amer. Statist. Assoc.96 194-209. · Zbl 1015.62022 · doi:10.1198/016214501750333063 [19] Garren, S.T. and Smith, R.L. (2000). Estimating the second largest eigenvalue of a Markov transition matrix. Bernoulli6 215-242. · Zbl 0976.62081 · doi:10.2307/3318575 [20] Hobert, J.P., Roy, V. and Robert, C.P. (2011). Improving the convergence properties of the data augmentation algorithm with an application to Bayesian mixture modeling. Statist. Sci.26 332-351. · Zbl 1246.60095 · doi:10.1214/11-STS365 [21] Hoffman, A.J. and Wielandt, H.W. (1953). The variation of the spectrum of a normal matrix. Duke Math. J.20 37-39. · Zbl 0051.00903 · doi:10.1215/S0012-7094-53-02004-3 [22] Jones, G.L. and Hobert, J.P. (2001). Honest exploration of intractable probability distributions via Markov chain Monte Carlo. Statist. Sci.16 312-334. · Zbl 1127.60309 · doi:10.1214/ss/1015346317 [23] Jörgens, K. (1982). Linear Integral Operators. Surveys and Reference Works in Mathematics7. Boston, MA-London: Pitman. · Zbl 0499.47029 [24] Khare, K. and Hobert, J.P. (2011). A spectral analytic comparison of trace-class data augmentation algorithms and their sandwich variants. Ann. Statist.39 2585-2606. · Zbl 1259.60081 · doi:10.1214/11-AOS916 [25] Khare, K. and Zhou, H. (2009). Rates of convergence of some multivariate Markov chains with polynomial eigenfunctions. Ann. Appl. Probab.19 737-777. · Zbl 1171.60016 · doi:10.1214/08-AAP562 [26] Koltchinskii, V. and Giné, E. (2000). Random matrix approximation of spectra of integral operators. Bernoulli6 113-167. · Zbl 0949.60078 · doi:10.2307/3318636 [27] Lawler, G.F. and Sokal, A.D. (1988). Bounds on the $$L^2$$ spectrum for Markov chains and Markov processes: A generalization of Cheeger’s inequality. Trans. Amer. Math. Soc.309 557-580. · Zbl 0716.60073 [28] Liu, J.S., Wong, W.H. and Kong, A. (1994). Covariance structure of the Gibbs sampler with applications to the comparisons of estimators and augmentation schemes. Biometrika81 27-40. · Zbl 0811.62080 · doi:10.1093/biomet/81.1.27 [29] Meng, X.-L. and Wong, W.H. (1996). Simulating ratios of normalizing constants via a simple identity: A theoretical exploration. Statist. Sinica6 831-860. · Zbl 0857.62017 [30] Pal, S., Khare, K. and Hobert, J.P. (2017). Trace class Markov chains for Bayesian inference with generalized double Pareto shrinkage priors. Scand. J. Stat.44 307-323. · Zbl 1422.62243 [31] Polson, N.G., Scott, J.G. and Windle, J. (2013). Bayesian inference for logistic models using Pólya-Gamma latent variables. J. Amer. Statist. Assoc.108 1339-1349. · Zbl 1283.62055 · doi:10.1080/01621459.2013.829001 [32] Qin, Q. and Hobert, J.P. (2018). Trace-class Monte Carlo Markov chains for Bayesian multivariate linear regression with non-Gaussian errors. J. Multivariate Anal.166 335-345. · Zbl 1426.62206 · doi:10.1016/j.jmva.2018.03.012 [33] Qin, Q., Hobert, J.P. and Khare, K. (2017). Estimating the spectral gap of a trace-class Markov operator. Preprint. Available at arXiv:1704.00850. · Zbl 1466.60139 · doi:10.1214/19-EJS1563 [34] Raftery, A.E. and Lewis, S. (1992). How many iterations in the Gibbs sampler. Bayesian Stat.4 763-773. [35] Retherford, J.R. (1993). Hilbert Space: Compact Operators and the Trace Theorem. London Mathematical Society Student Texts27. Cambridge: Cambridge Univ. Press. · Zbl 0783.47031 [36] Rosenthal, J.S. (1995). Minorization conditions and convergence rates for Markov chain Monte Carlo. J. Amer. Statist. Assoc.90 558-566. · Zbl 0824.60077 · doi:10.1080/01621459.1995.10476548 [37] Roy, V. (2012). Convergence rates for MCMC algorithms for a robust Bayesian binary regression model. Electron. J. Stat.6 2463-2485. · Zbl 1295.60089 · doi:10.1214/12-EJS756 [38] Saloff-Coste, L. (2004). Total variation lower bounds for finite Markov chains: Wilson’s lemma. In Random Walks and Geometry 515-532. Berlin: de Gruyter. · Zbl 1067.60066 [39] Sinclair, A. and Jerrum, M. (1989). Approximate counting, uniform generation and rapidly mixing Markov chains. Inform. and Comput.82 93-133. · Zbl 0668.05060 · doi:10.1016/0890-5401(89)90067-9 [40] Wickham, H. (2007). Reshaping data with the reshape package. J. Stat. Softw.21 1-20. [41] Wickham, H. (2016). ggplot2: Elegant Graphics for Data Analysis. New York: Springer. · Zbl 1397.62006 [42] Yuen, W.K. (2000). Applications of geometric bounds to the convergence rate of Markov chains on $$\mathbf{R}^n$$ . Stochastic Process. Appl.87 1-23. · Zbl 1045.60073
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.