Multiscale sparse microcanonical models.

*(English)* Zbl 1426.62111 Summary: We study approximations of non-Gaussian stationary processes having long range correlations with microcanonical models. These models are conditioned by the empirical value of an energy vector, evaluated on a single realization. Asymptotic properties of maximum entropy microcanonical and macrocanonical processes and their convergence to Gibbs measures are reviewed. We show that the Jacobian of the energy vector controls the entropy rate of microcanonical processes. Sampling maximum entropy processes through MCMC algorithms requires too many operations when the number of constraints is large. We define microcanonical gradient descent processes by transporting a maximum entropy measure with a gradient descent algorithm which enforces the energy conditions. Convergence and symmetries are analyzed. Approximations of non-Gaussian processes with long range interactions are defined with multiscale energy vectors computed with wavelet and scattering transforms. Sparsity properties are captured with \(\mathbf{l}^1\) norms. Approximations of Gaussian, Ising and point processes are studied, as well as image and audio texture synthesis.
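
As a rough illustration of the microcanonical gradient descent idea described in the summary, the sketch below draws a sample from Gaussian white noise and moves it by gradient descent until its energy vector matches the one measured on a single observed realization. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy energy vector `energy` (a normalized \(\mathbf{l}^1\) norm and a normalized \(\mathbf{l}^2\) energy, with no wavelet or scattering transform), the Laplace-distributed observation, the step size `c`, and the function names.

```python
import numpy as np

def energy(x):
    """Toy energy vector (an assumption): normalized l1 norm and l2 energy."""
    return np.array([np.mean(np.abs(x)), np.mean(x ** 2)])

def loss(x, y):
    """Half squared distance between the energy of x and the target y."""
    return 0.5 * np.sum((energy(x) - y) ** 2)

def microcanonical_gradient_descent(y, d, n_steps=2000, c=0.05, seed=0):
    """Transport a white-noise sample toward the set {x : energy(x) = y}."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(d)  # initial sample: Gaussian white noise
    for _ in range(n_steps):
        r = energy(x) - y  # residual of the energy conditions
        # The gradient of the loss is (1/d) * (r[0] * sign(x) + 2 * r[1] * x),
        # i.e. the transposed Jacobian of the energy applied to the residual.
        # The step size is taken as c * d, so the 1/d factor cancels.
        grad_times_d = r[0] * np.sign(x) + 2.0 * r[1] * x
        x = x - c * grad_times_d
    return x

# Single observed realization (hypothetical data: sparse-ish Laplace noise).
x_obs = np.random.default_rng(1).laplace(size=1024)
y = energy(x_obs)

# Same seed as inside the function, so x0 is the descent's starting point.
x0 = np.random.default_rng(0).standard_normal(1024)
x_syn = microcanonical_gradient_descent(y, d=1024)

loss_start = loss(x0, y)
loss_end = loss(x_syn, y)
print(loss_start, loss_end)
```

Running the descent from many independent white-noise draws (different `seed` values) would give many samples of the transported measure; this toy version only checks that a single sample ends up close to satisfying the energy conditions.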

##### MSC:

62G07 | Density estimation |

53A07 | Higher-dimensional and -codimensional surfaces in Euclidean and related \(n\)-spaces |

62M45 | Neural nets and related approaches to inference from stochastic processes |

82B20 | Lattice systems (Ising, dimer, Potts, etc.) and systems on graphs arising in equilibrium statistical mechanics |

82B28 | Renormalization group methods in equilibrium statistical mechanics |

##### References:

[1] | P.-A. Absil, R. Mahony, and B. Andrews, Convergence of the iterates of descent methods for analytic cost functions, SIAM J. Optim., 16 (2005), no. 2, 531-547. Zbl 1092.90036 MR 2197994 · Zbl 1092.90036 |

[2] | J. Andén and S. Mallat, Deep scattering spectrum, IEEE Trans. Signal Process., 62 (2014), no. 16, 4114-4128. Zbl 1394.94040 MR 3260414 · Zbl 1394.94040 |

[3] | L. Barbet, M. Dambrine, A. Daniilidis, and L. Rifford, Sard theorems for Lipschitz functions and applications in optimization, Israel J. Math., 212 (2016), no. 2, 757-790. Zbl 1353.58005 MR 3505402 · Zbl 1353.58005 |

[4] | F. Barthe, O. Guédon, S. Mendelson, and A. Naor, A probabilistic approach to the geometry of \(\ell_p^n\)-ball, Ann. Probab., 33 (2005), no. 2, 480-513. Zbl 1071.60010 MR 2123199 · Zbl 1071.60010 |

[5] | G. Battle, Wavelets and renormalization, Series in Approximations and Decompositions, 10, World Scientific Publishing Co., Inc., River Edge, NJ, 1999. Zbl 0949.65145 MR 1688691 · Zbl 0949.65145 |

[6] | M. Betancourt, A conceptual introduction to Hamiltonian Monte Carlo, 2017. arXiv:1701.02434 |

[7] | E. Borel, Sur les principes de la théorie cinétique des gaz, Ann. Sci. École Norm. Sup. (3), 23 (1906), 9-33. Zbl 37.0944.01 MR 1509063 |

[8] | P. Brémaud, L. Massoulié, and A. Ridolfi, Power spectra of random spike fields and related processes, Adv. in Appl. Probab., 37 (2005), no. 4, 1116-1146. Zbl 1102.60030 MR 2193999 |

[9] | J. Bruna and S. Mallat, Audio texture synthesis with scattering moments, 2013. arXiv:1311.0407 |

[10] | J. Bruna and S. Mallat, Invariant scattering convolution networks, IEEE Trans. Pattern Anal. Mach. Intell., 35 (2013), no. 8, 1872-1886. |

[11] | J. Bruna, S. Mallat, E. Bacry, and J.-F. Muzy, Intermittent process analysis with scattering moments, Ann. Statist., 43 (2015), no. 1, 323-351. Zbl 1308.62168 MR 3311862 · Zbl 1308.62168 |

[12] | J. V. Burke, A. S. Lewis, and M. L. Overton, A robust gradient sampling algorithm for nonsmooth, nonconvex optimization, SIAM J. Optim., 15 (2005), no. 3, 751-779. Zbl 1078.65048 MR 2142859 · Zbl 1078.65048 |

[13] | S. Chatterjee, A note about the uniform distribution on the intersection of a simplex and a sphere, J. Topol. Anal., 9 (2017), no. 4, 717-738. Zbl 1379.35288 MR 3684622 · Zbl 1379.35288 |

[14] | L. Chizat and F. Bach, On the global convergence of gradient descent for overparameterized models using optimal transport, in Advances in Neural Information Processing Systems 31, S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett (eds.), 3036-3046, Curran Associates, Inc., 2018. |

[15] | M. Creutz, Microcanonical Monte Carlo simulation, Phys. Rev. Lett., 50 (1983), no. 19, 1411-1414. MR 701663 |

[16] | A. Dembo and O. Zeitouni, Large deviations techniques and applications, Jones and Bartlett Publishers, Boston, MA, 1993. Zbl 0793.60030 MR 1202429 · Zbl 0793.60030 |

[17] | J. Deuschel, D. Stroock, and H. Zessin, Microcanonical distributions for lattice gases, Comm. Math. Phys., 139 (1991), no. 1, 83-101. Zbl 0727.60025 MR 1116411 · Zbl 0727.60025 |

[18] | P. Diaconis and D. Freedman, A dozen de Finetti-style results in search of a theory, Ann. Inst. H. Poincaré Probab. Statist., 23 (1987), no. 2, suppl., 397-423. Zbl 0619.60039 MR 898502 · Zbl 0619.60039 |

[19] | M. Donsker and S. Varadhan, Large deviations for stationary Gaussian processes, Comm. Math. Phys., 97 (1985), no. 1-2, 187-210. Zbl 0646.60030 MR 782966 · Zbl 0657.60036 |

[20] | R. Ellis, Entropy, large deviations, and statistical mechanics, Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], 271, Springer-Verlag, New York, 1985. Zbl 0566.60097 MR 793553 |

[21] | B. Galerne, Y. Gousseau, and J.-M. Morel, Random phase textures: theory and synthesis, IEEE Trans. Image Process., 20 (2011), no. 1, 257-267. Zbl 1372.94086 MR 2789729 · Zbl 1372.94086 |

[22] | I. Gallagher, L. Saint-Raymond, and B. Texier, From Newton to Boltzmann: hard spheres and short-range potentials, Zürich Lectures in Advanced Mathematics, European Mathematical Society (EMS), Zürich, 2013. Zbl 1315.82001 MR 3157048 |

[23] | L. Gatys, A. S. Ecker, and M. Bethge, Texture synthesis using convolutional neural networks, in Advances in Neural Information Processing Systems 28, C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett (eds.), 262-270, Curran Associates, Inc., 2015. |

[24] | S. Geman and D. Geman, Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images, IEEE Trans. Pattern Anal. Mach. Intell., 6 (1984), no. 6, 721-741. · Zbl 0573.62030 |

[25] | H.-O. Georgii, Gibbs measures and phase transitions. Second edition, De Gruyter Studies in Mathematics, 9, Walter de Gruyter & Co., Berlin, 2011. Zbl 1225.60001 MR 2807681 |

[26] | D. J. Heeger and J. R. Bergen, Pyramid-based texture analysis/synthesis, in Proceedings of the 22nd Annual Conference on Computer Graphics and Interactive Techniques, 229-238, ACM Press, 1995. |

[27] | E. T. Jaynes, Information theory and statistical mechanics, Phys. Rev. (2), 106 (1957), no. 4, 620-630. Zbl 0084.43701 MR 87305 · Zbl 0084.43701 |

[28] | P. Kopietz, L. Bartosch, and F. Schütz, Introduction to the functional renormalization group, Lecture Notes in Physics, 798, Springer-Verlag, Berlin, 2010. Zbl 1196.82001 MR 2641839 |

[29] | P. Kopietz, L. Bartosch, and F. Schütz, Mean-field theory and the Gaussian approximation, in Introduction to the functional renormalization group, Lecture Notes in Physics, 798, Springer-Verlag, Berlin, 2010. |

[30] | O. E. Lanford III, Time evolution of large classical systems, in Dynamical systems, theory and applications (Rencontres, Battelle Res. Inst., Seattle, Wash., 1974), 1-111, Lecture Notes in Phys., 38, Springer, Berlin, 1975. Zbl 0329.70011 MR 479206 |

[31] | J. D. Lee, M. Simchowitz, M. I. Jordan, and B. Recht, Gradient descent only converges to minimizers, in Conference on Learning Theory, 1246-1257, 2016. |

[32] | D. A. Levin and Y. Peres, Markov chains and mixing times. With contributions by Elizabeth L. Wilmer and a chapter on “Coupling from the past” by James G. Propp and David B. Wilson. Second edition, American Mathematical Society, Providence, RI, 2017. Zbl 1390.60001 MR 3726904 |

[33] | S. Mallat, Group invariant scattering, Comm. Pure Appl. Math., 65 (2012), no. 10, 1331-1398. Zbl 1282.47009 MR 2957703 · Zbl 1282.47009 |

[34] | S. Mallat, S. Zhang, and G. Rochette, Phase harmonics and correlation invariants in convolutional neural networks, 2018. arXiv:1810.12136 |

[35] | J. H. McDermott and E. P. Simoncelli, Sound texture perception via statistics of the auditory periphery: evidence from sound synthesis, Neuron, 71 (2011), no. 5, 926-940. |

[36] | Y. Meyer, Wavelets and operators. Translated from the 1990 French original by D. H. Salinger, Cambridge Studies in Advanced Mathematics, 37, Cambridge University Press, Cambridge, 1992. Zbl 0776.42019 MR 1228209 |

[37] | L. Onsager, Crystal statistics. I. A two-dimensional model with an order-disorder transition, Phys. Rev. (2), 65 (1944), no. 3-4, 117-149. Zbl 0060.46001 MR 10315 · Zbl 0060.46001 |

[38] | I. Panageas and G. Piliouras, Gradient descent converges to minimizers: The case of non-isolated critical points, 2016. arXiv:1605.00405 · Zbl 1402.90210 |

[39] | J. Portilla and E. P. Simoncelli, A parametric texture model based on joint statistics of complex wavelet coefficients, Int. J. Comput. Vis., 40 (2000), no. 1, 49-70. Zbl 1012.68698 · Zbl 1012.68698 |

[40] | G. M. Rotskoff and E. Vanden-Eijnden, Neural networks as interacting particle systems: Asymptotic convexity of the loss landscape and universal scaling of the approximation error, 2018. arXiv:1805.00915 |

[41] | E. P. Simoncelli and B. A. Olshausen, Natural image statistics and neural representation, Annual Review of Neuroscience, 24 (2001), no. 1, 1193-1216. |

[42] | A. Sokol, Advanced Probability, Lecture Notes, 2013. |

[43] | D. W. Stroock and O. Zeitouni, Microcanonical distributions, Gibbs states, and the equivalence of ensembles, Random walks, Brownian motion, and interacting particle systems, 399-424, Progr. Probab., 28, Birkhäuser Boston, Boston, MA, 1991. Zbl 0733.00027 MR 1146461 · Zbl 0745.60105 |

[44] | A. Tagliani, Hamburger moment problem and maximum entropy: On the existence conditions, Appl. Math. Comput., 231 (2014), 111-116. Zbl 06892872 MR 3174016 · Zbl 1410.44005 |

[45] | M. J. Wainwright and M. I. Jordan, Graphical models, exponential families, and variational inference, Foundations and Trends® in Machine Learning, 1 (2008), no. 1-2, 1-305. · Zbl 1193.62107 |

[46] | Y. Wang, W. Yin, and J. Zeng, Global convergence of ADMM in nonconvex nonsmooth optimization, J. Sci. Comput., 78 (2019), no. 1, 29-63. MR 3902876 · Zbl 07042437 |

[47] | M. Welling, Herding dynamical weights to learn, in Proceedings of the 26th Annual International Conference on Machine Learning, 1121-1128, ACM Press, 2009. |

[48] | S. Zhang and S. Mallat, Wavelet phase harmonic covariance models of stationary processes, 2019. |

[49] | S. C. Zhu, Y. Wu, and D. Mumford, Filters, random fields and maximum entropy (FRAME): Towards a unified theory for texture modeling, Int. J. Comput. Vis., 27 (1998), no. 2, 107-126. |

This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.