Efficient data augmentation techniques for some classes of state space models. (English) Zbl 07708430

Summary: Data augmentation improves the convergence of iterative algorithms, such as the EM algorithm and the Gibbs sampler, by introducing carefully designed latent variables. In this article, we first propose a data augmentation scheme for the first-order autoregression plus noise model, in which optimal values of the working parameters introduced for recentering and rescaling the latent states can be derived analytically by minimizing the fraction of missing information in the EM algorithm. The proposed data augmentation scheme is then used to design efficient Markov chain Monte Carlo (MCMC) algorithms for Bayesian inference in some non-Gaussian and nonlinear state space models, via a mixture-of-normals approximation coupled with a block-specific reparametrization strategy. Applications to simulated and benchmark real data sets indicate that the proposed MCMC sampler can yield improvements in simulation efficiency over centering, noncentering, and even the ancillarity-sufficiency interweaving strategy.
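The recentering and rescaling of latent states described above can be illustrated with a minimal sketch. This is not the paper's implementation: the working parameters `w1` (recentering) and `w2` (rescaling) below are hypothetical names for the weights that interpolate between the centered (`w1 = w2 = 0`) and fully noncentered (`w1 = w2 = 1`) parametrizations of an AR(1)-plus-noise model.

```python
import numpy as np

def simulate_ar1_plus_noise(T, mu, phi, sigma_eta, sigma_eps, rng):
    """Simulate y_t = x_t + eps_t, where the latent states follow the AR(1)
    recursion x_t = mu + phi * (x_{t-1} - mu) + eta_t."""
    x = np.empty(T)
    # Draw the initial state from the stationary distribution of the AR(1) process.
    x[0] = mu + rng.normal(0.0, sigma_eta / np.sqrt(1.0 - phi**2))
    for t in range(1, T):
        x[t] = mu + phi * (x[t - 1] - mu) + rng.normal(0.0, sigma_eta)
    y = x + rng.normal(0.0, sigma_eps, size=T)
    return y, x

def reparametrize(x, mu, sigma_eta, w1, w2):
    """Partially noncentered states: x_tilde = (x - w1 * mu) / sigma_eta**w2.
    w1 = w2 = 0 recovers the centered states; w1 = w2 = 1 the noncentered ones."""
    return (x - w1 * mu) / sigma_eta**w2

def inverse_reparametrize(x_tilde, mu, sigma_eta, w1, w2):
    """Map the working states back to the original (centered) parametrization."""
    return sigma_eta**w2 * x_tilde + w1 * mu
```

In a Gibbs or EM setting, the sampler (or E-step) would operate on the transformed states `x_tilde` while `w1` and `w2` are tuned, per block, to reduce the fraction of missing information; the round trip through `inverse_reparametrize` recovers the original states exactly.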


62-XX Statistics


CODA; R; astsa; TSA; nimble; Optim; Stan; NUTS; Julia
Full Text: DOI arXiv


[1] ABANTO-VALLE, C. A. and DEY, D. K. (2014). State space mixed models for binary responses with scale mixture of normal distributions links. Comput. Statist. Data Anal. 71 274-287. · Zbl 1471.62007 · doi:10.1016/j.csda.2013.01.009
[2] ALMEIDA, C. and CZADO, C. (2012). Efficient Bayesian inference for stochastic time-varying copula models. Comput. Statist. Data Anal. 56 1511-1527. · Zbl 1243.62031 · doi:10.1016/j.csda.2011.08.015
[3] Andrieu, C., Doucet, A. and Holenstein, R. (2010). Particle Markov chain Monte Carlo methods. J. R. Stat. Soc. Ser. B. Stat. Methodol. 72 269-342. · Zbl 1411.65020 · doi:10.1111/j.1467-9868.2009.00736.x
[4] BASS, M. R. and SAHU, S. K. (2017). A comparison of centring parameterisations of Gaussian process-based models for Bayesian computation using MCMC. Stat. Comput. 27 1491-1512. · Zbl 1384.62078 · doi:10.1007/s11222-016-9700-z
[5] BAUWENS, L. and VEREDAS, D. (2004). The stochastic conditional duration model: A latent variable model for the analysis of financial durations. J. Econometrics 119 381-412. · Zbl 1282.91236 · doi:10.1016/S0304-4076(03)00201-X
[6] BERLINET, A. and ROLAND, C. (2009). Parabolic acceleration of the EM algorithm. Stat. Comput. 19 35-47. · doi:10.1007/s11222-008-9067-x
[7] Bezanson, J., Edelman, A., Karpinski, S. and Shah, V. B. (2017). Julia: a fresh approach to numerical computing. SIAM Rev. 59 65-98. · Zbl 1356.68030 · doi:10.1137/141000671
[8] BITTO, A. and FRÜHWIRTH-SCHNATTER, S. (2019). Achieving shrinkage in a time-varying parameter model framework. J. Econometrics 210 75-97. · Zbl 1452.62216 · doi:10.1016/j.jeconom.2018.11.006
[9] Blei, D. M., Kucukelbir, A. and McAuliffe, J. D. (2017). Variational inference: A review for statisticians. J. Amer. Statist. Assoc. 112 859-877. · doi:10.1080/01621459.2017.1285773
[10] CARTER, C. K. and KOHN, R. (1997). Semiparametric Bayesian inference for time series with mixed spectra. J. Roy. Statist. Soc. Ser. B 59 255-268. · Zbl 0889.62078 · doi:10.1111/1467-9868.00067
[11] CARVALHO, C. M., JOHANNES, M. S., LOPES, H. F. and POLSON, N. G. (2010). Particle learning and smoothing. Statist. Sci. 25 88-106. · Zbl 1328.62541 · doi:10.1214/10-STS325
[12] CHRISTENSEN, O. F., ROBERTS, G. O. and SKÖLD, M. (2006). Robust Markov chain Monte Carlo methods for spatial generalized linear mixed models. J. Comput. Graph. Statist. 15 1-17. · doi:10.1198/106186006X100470
[13] CRYER, J. D. and CHAN, K. S. (2008). Time Series Analysis with Applications in R. Springer, New York. · Zbl 1137.62366
[14] de Valpine, P., Turek, D., Paciorek, C. J., Anderson-Bergman, C., Temple Lang, D. and Bodik, R. (2017). Programming with models: Writing statistical algorithms for general model structures with NIMBLE. J. Comput. Graph. Statist. 26 403-413. · doi:10.1080/10618600.2016.1172487
[15] Dempster, A. P., Laird, N. M. and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Statist. Soc. Ser. B 39 1-38. · Zbl 0364.62022
[16] DOUCET, A., DE FREITAS, N. and GORDON, N. (2012). Sequential Monte Carlo Methods in Practice. Springer, New York. · Zbl 0967.00022
[17] ENGLE, R. F. and RUSSELL, J. R. (1998). Autoregressive conditional duration: A new model for irregularly spaced transaction data. Econometrica 66 1127-1162. · Zbl 1055.62571 · doi:10.2307/2999632
[18] FEARNHEAD, P. and MELIGKOTSIDOU, L. (2016). Augmentation schemes for particle MCMC. Stat. Comput. 26 1293-1306. · Zbl 1356.65025 · doi:10.1007/s11222-015-9603-4
[19] FENG, D., JIANG, G. J. and SONG, P. X. K. (2004). Stochastic conditional duration models with “leverage effect” for financial transaction data. J. Financ. Econ. 2 390-421. · doi:10.1093/jjfinec/nbh016
[20] FRÜHWIRTH-SCHNATTER, S. (2004). Efficient Bayesian parameter estimation. In State Space and Unobserved Component Models 123-151. Cambridge Univ. Press, Cambridge. · Zbl 05280144 · doi:10.1017/CBO9780511617010.008
[21] FRÜHWIRTH-SCHNATTER, S. and FRÜHWIRTH, R. (2007). Auxiliary mixture sampling with applications to logistic models. Comput. Statist. Data Anal. 51 3509-3528. · Zbl 1161.62387 · doi:10.1016/j.csda.2006.10.006
[22] FRÜHWIRTH-SCHNATTER, S. and WAGNER, H. (2010). Stochastic model specification search for Gaussian and partial non-Gaussian state space models. J. Econometrics 154 85-100. · Zbl 1431.62373 · doi:10.1016/j.jeconom.2009.07.003
[23] GELFAND, A. E., SAHU, S. K. and CARLIN, B. P. (1995). Efficient parameterisations for normal linear mixed models. Biometrika 82 479-488. · Zbl 0832.62064 · doi:10.1093/biomet/82.3.479
[24] GELFAND, A. E., SAHU, S. K. and CARLIN, B. P. (1996). Efficient parametrizations for generalized linear mixed models. In Bayesian Statistics, 5 (Alicante, 1994) (J. M. Bernardo, J. O. Berger, A. P. Dawid and A. F. M. Smith, eds.). Oxford Sci. Publ. 165-180. Oxford Univ. Press, New York.
[25] GEMAN, S. and GEMAN, D. (1984). Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans. Pattern Anal. Mach. Intell. PAMI-6 721-741. · Zbl 0573.62030
[26] Girolami, M. and Calderhead, B. (2011). Riemann manifold Langevin and Hamiltonian Monte Carlo methods. J. R. Stat. Soc. Ser. B. Stat. Methodol. 73 123-214. · Zbl 1411.62071 · doi:10.1111/j.1467-9868.2010.00765.x
[27] GOPLERUD, M. (2022). Fast and accurate estimation of non-nested binomial hierarchical models using variational inference. Bayesian Anal. 17 623-650. · Zbl 07809913 · doi:10.1214/21-BA1266
[28] HARVEY, A., RUIZ, E. and SHEPHARD, N. (1994). Multivariate stochastic variance models. Rev. Econ. Stud. 61 247-264. · Zbl 0805.90026
[29] HENDERSON, N. C. and VARADHAN, R. (2019). Damped Anderson acceleration with restarts and monotonicity control for accelerating EM and EM-like algorithms. J. Comput. Graph. Statist. 28 834-846. · Zbl 07499030 · doi:10.1080/10618600.2019.1594835
[30] Hoffman, M. D. and Gelman, A. (2014). The no-U-turn sampler: Adaptively setting path lengths in Hamiltonian Monte Carlo. J. Mach. Learn. Res. 15 1593-1623. · Zbl 1319.60150
[31] HOSSZEJNI, D. and KASTNER, G. (2021). Modeling univariate and multivariate stochastic volatility in R with stochvol and factorstochvol. J. Stat. Softw. 100 1-34.
[32] JAMSHIDIAN, M. and JENNRICH, R. I. (1997). Acceleration of the EM algorithm by using quasi-Newton methods. J. Roy. Statist. Soc. Ser. B 59 569-587. · Zbl 0889.62042 · doi:10.1111/1467-9868.00083
[33] Kalman, R. E. (1960). A new approach to linear filtering and prediction problems. J. Basic Eng. 82 35-45.
[34] KANTAS, N., DOUCET, A., SINGH, S. S., MACIEJOWSKI, J. and CHOPIN, N. (2015). On particle methods for parameter estimation in state-space models. Statist. Sci. 30 328-351. · Zbl 1332.62096 · doi:10.1214/14-STS511
[35] KASTNER, G. and FRÜHWIRTH-SCHNATTER, S. (2014). Ancillarity-sufficiency interweaving strategy (ASIS) for boosting MCMC estimation of stochastic volatility models. Comput. Statist. Data Anal. 76 408-423. · Zbl 1506.62094 · doi:10.1016/j.csda.2013.01.002
[36] KASTNER, G., FRÜHWIRTH-SCHNATTER, S. and LOPES, H. F. (2017). Efficient Bayesian inference for multivariate factor stochastic volatility models. J. Comput. Graph. Statist. 26 905-917. · doi:10.1080/10618600.2017.1322091
[37] KIM, S., SHEPHARD, N. and CHIB, S. (1998). Stochastic volatility: Likelihood inference and comparison with ARCH models. Rev. Econ. Stud. 65 361-393. · Zbl 0910.90067
[38] KLEPPE, T. S. (2019). Dynamically rescaled Hamiltonian Monte Carlo for Bayesian hierarchical models. J. Comput. Graph. Statist. 28 493-507. · Zbl 07499072 · doi:10.1080/10618600.2019.1584901
[39] KREUZER, A. and CZADO, C. (2020). Efficient Bayesian inference for nonlinear state space models with univariate autoregressive state equation. J. Comput. Graph. Statist. 29 523-534. · Zbl 07499294 · doi:10.1080/10618600.2020.1725523
[40] KROESE, D. P. and CHAN, J. C. C. (2014). Statistical Modeling and Computation. Springer, New York. · Zbl 1280.62008 · doi:10.1007/978-1-4614-8775-3
[41] LI, M. and SCHARTH, M. (2022). Leverage, asymmetry, and heavy tails in the high-dimensional factor stochastic volatility model. J. Bus. Econom. Statist. 40 285-301. · doi:10.1080/07350015.2020.1806853
[42] LIU, C., RUBIN, D. B. and WU, Y. N. (1998). Parameter expansion to accelerate EM: The PX-EM algorithm. Biometrika 85 755-770. · Zbl 0921.62071 · doi:10.1093/biomet/85.4.755
[43] Meng, X.-L. and Rubin, D. B. (1993). Maximum likelihood estimation via the ECM algorithm: A general framework. Biometrika 80 267-278. · Zbl 0778.62022 · doi:10.1093/biomet/80.2.267
[44] MENG, X.-L. and VAN DYK, D. (1997). The EM algorithm—An old folk-song sung to a fast new tune. J. Roy. Statist. Soc. Ser. B 59 511-567. · Zbl 1090.62518 · doi:10.1111/1467-9868.00082
[45] MENG, X.-L. and VAN DYK, D. (1998). Fast EM-type implementations for mixed effects models. J. R. Stat. Soc. Ser. B. Stat. Methodol. 60 559-578. · Zbl 0909.62073 · doi:10.1111/1467-9868.00140
[46] MICHAUD, N., DE VALPINE, P., TUREK, D., PACIOREK, C. J. and NGUYEN, D. (2020). Sequential Monte Carlo methods in the nimble R package.
[47] MOGENSEN, P. K. and RISETH, A. N. (2018). Optim: A mathematical optimization package for Julia. J. Open Sour. Softw. 3 615. · doi:10.21105/joss.00615
[48] NEAL, R. M. (2011). MCMC using Hamiltonian dynamics. In Handbook of Markov Chain Monte Carlo (S. Brooks, A. Gelman, G. Jones and X.-L. Meng, eds.). Chapman & Hall/CRC Handb. Mod. Stat. Methods 113-162. CRC Press, Boca Raton, FL. · Zbl 1229.65018
[49] OLSSON, R. K. and HANSEN, L. K. (2006). Linear state-space models for blind source separation. J. Mach. Learn. Res. 7 2585-2602. · Zbl 1222.94025
[50] OMORI, Y., CHIB, S., SHEPHARD, N. and NAKAJIMA, J. (2007). Stochastic volatility with leverage: Fast and efficient likelihood inference. J. Econometrics 140 425-449. · Zbl 1247.91207 · doi:10.1016/j.jeconom.2006.07.008
[51] ORMEROD, J. T. and WAND, M. P. (2010). Explaining variational approximations. Amer. Statist. 64 140-153. · Zbl 1200.65007 · doi:10.1198/tast.2010.09058
[52] OSMUNDSEN, K. K., KLEPPE, T. S. and LIESENFELD, R. (2021). Importance sampling-based transport map Hamiltonian Monte Carlo for Bayesian hierarchical models. J. Comput. Graph. Statist. 30 906-919. · Zbl 07499926 · doi:10.1080/10618600.2021.1923519
[53] PAL, A. and PRAKASH, P. (2017). Practical Time Series Analysis. Packt Publishing, Birmingham, Mumbai.
[54] PAPASPILIOPOULOS, O., ROBERTS, G. O. and SKÖLD, M. (2003). Non-centered parameterizations for hierarchical models and data augmentation. In Bayesian Statistics, 7 (Tenerife, 2002) (J. M. Bernardo, M. J. Bayarri, J. O. Berger, A. P. Dawid, D. Heckerman, A. F. M. Smith and M. West, eds.) 307-326. Oxford Univ. Press, New York.
[55] PAPASPILIOPOULOS, O., ROBERTS, G. O. and SKÖLD, M. (2007). A general framework for the parametrization of hierarchical models. Statist. Sci. 22 59-73. · Zbl 1246.62195 · doi:10.1214/088342307000000014
[56] PITT, M. K. and SHEPHARD, N. (1999a). Analytic convergence rates and parameterization issues for the Gibbs sampler applied to state space models. J. Time Series Anal. 20 63-85. · doi:10.1111/1467-9892.00126
[57] PITT, M. K. and SHEPHARD, N. (1999b). Filtering via simulation: Auxiliary particle filters. J. Amer. Statist. Assoc. 94 590-599. · Zbl 1072.62639 · doi:10.2307/2670179
[58] PITT, M. K. and SHEPHARD, N. (1999c). Time-varying covariances: A factor stochastic volatility approach. In Bayesian Statistics, 6 (Alcoceber, 1998) (J. M. Bernardo, J. O. Berger, A. P. Dawid and A. F. M. Smith, eds.) 547-570. Oxford Univ. Press, New York. · Zbl 0956.62107
[59] Plummer, M., Best, N., Cowles, K. and Vines, K. (2006). CODA: Convergence diagnosis and output analysis for MCMC. R News 6 7-11.
[60] Roberts, G. O. and Sahu, S. K. (1997). Updating schemes, correlation structure, blocking and parameterization for the Gibbs sampler. J. Roy. Statist. Soc. Ser. B 59 291-317. · Zbl 0886.62083 · doi:10.1111/1467-9868.00070
[61] SAÂDAOUI, F. (2010). Acceleration of the EM algorithm via extrapolation methods: Review, comparison and new methods. Comput. Statist. Data Anal. 54 750-766. · Zbl 1464.62150 · doi:10.1016/j.csda.2008.11.011
[62] SAHU, S. K. and ROBERTS, G. O. (1999). On convergence of the EM algorithm and the Gibbs sampler. Stat. Comput. 9 55-64.
[63] SHUMWAY, R. H. and STOFFER, D. S. (1982). An approach to time series smoothing and forecasting using the EM algorithm. J. Time Series Anal. 3 253-264. · Zbl 0502.62085
[64] Shumway, R. H. and Stoffer, D. S. (2017). Time Series Analysis and Its Applications: With R Examples, 4th ed. Springer Texts in Statistics. Springer, Cham. · Zbl 1367.62004 · doi:10.1007/978-3-319-52452-8
[65] SIMPSON, M., NIEMI, J. and ROY, V. (2017). Interweaving Markov chain Monte Carlo strategies for efficient estimation of dynamic linear models. J. Comput. Graph. Statist. 26 152-159. · doi:10.1080/10618600.2015.1105748
[66] STRICKLAND, C. M., FORBES, C. S. and MARTIN, G. M. (2006). Bayesian analysis of the stochastic conditional duration model. Comput. Statist. Data Anal. 50 2247-2267. · Zbl 1445.62241 · doi:10.1016/j.csda.2005.07.005
[67] TAK, H., YOU, K., GHOSH, S. K., SU, B. and KELLY, J. (2020). Data transforming augmentation for heteroscedastic models. J. Comput. Graph. Statist. 29 659-667. · Zbl 07499304 · doi:10.1080/10618600.2019.1704295
[68] TAN, L. S. L. (2019). Explicit inverse of tridiagonal matrix with applications in autoregressive modelling. IMA J. Appl. Math. 84 679-695. · Zbl 1469.65079 · doi:10.1093/imamat/hxz010
[69] TAN, L. S. L. (2021). Use of model reparametrization to improve variational Bayes. J. R. Stat. Soc. Ser. B. Stat. Methodol. 83 30-57. · Zbl 07555255 · doi:10.1111/rssb.12399
[70] TAN, L. S. L. (2023). Supplement to “Efficient data augmentation techniques for some classes of state space models.” https://doi.org/10.1214/22-STS867SUPP
[71] TAN, L. S. L. and NOTT, D. J. (2013). Variational inference for generalized linear mixed models using partially noncentered parametrizations. Statist. Sci. 28 168-188. · Zbl 1331.62167 · doi:10.1214/13-sts418
[72] TAN, S. L. and NOTT, D. J. (2014). Variational approximation for mixtures of linear mixed models. J. Comput. Graph. Statist. 23 564-585. · doi:10.1080/10618600.2012.761138
[73] TAN, M., TIAN, G.-L., FANG, H.-B. and NG, K. W. (2007). A fast EM algorithm for quadratic optimization subject to convex constraints. Statist. Sinica 17 945-964. · Zbl 1133.62019
[74] TAYLOR, S. J. (1982). Financial returns modelled by the product of two stochastic processes—A study of daily sugar prices, 1961-79. In Time Series Analysis: Theory and Practice, Vol. 1 (O. D. Anderson, ed.) 203-226. Elsevier, North-Holland, Amsterdam.
[75] STAN DEVELOPMENT TEAM (2019). Stan modeling language users guide and reference manual. Version 2.28.
[76] R Core Team (2020). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.
[77] TSAY, R. S. (2010). Analysis of Financial Time Series, 3rd ed. Wiley Series in Probability and Statistics. Wiley, Hoboken, NJ. · Zbl 1209.91004 · doi:10.1002/9780470644560
[78] VAN DYK, D. A. and MENG, X.-L. (2001). The art of data augmentation. J. Comput. Graph. Statist. 10 1-50. · doi:10.1198/10618600152418584
[79] Wu, C.-F. J. (1983). On the convergence properties of the EM algorithm. Ann. Statist. 11 95-103. · Zbl 0517.62035 · doi:10.1214/aos/1176346060
[80] YANG, B., STROUD, J. R. and HUERTA, G. (2018). Sequential Monte Carlo smoothing with parameter estimation. Bayesian Anal. 13 1133-1157. · Zbl 1407.62343 · doi:10.1214/17-BA1088
[81] YU, Y. and MENG, X.-L. (2011). To center or not to center: That is not the question—An ancillarity-sufficiency interweaving strategy (ASIS) for boosting MCMC efficiency. J. Comput. Graph. Statist. 20 531-570. · doi:10.1198/jcgs.2011.203main
[82] ZANELLA, G. and ROBERTS, G. (2021). Multilevel linear models, Gibbs samplers and multigrid decompositions. Bayesian Anal. 16 1308-1390. · Zbl 07808155 · doi:10.1214/20-BA1242
[83] ZHOU, H., ALEXANDER, D. and LANGE, K. (2011). A quasi-Newton acceleration for high-dimensional optimization algorithms. Stat. Comput. 21 261-273. · Zbl 1284.90095 · doi:10.1007/s11222-009-9166-3
[84] ZHOU, L. and TANG, Y. (2021). Linearly preconditioned nonlinear conjugate gradient acceleration of the PX-EM algorithm. Comput. Statist. Data Anal. 155 Paper No. 107056, 13 pp · Zbl 1510.62090 · doi:10.1016/j.csda.2020.107056