Gaussian processes with multidimensional distribution inputs via optimal transport and Hilbertian embedding. (English) Zbl 1448.60085

Summary: In this work, we propose a way to construct Gaussian processes indexed by multidimensional distributions. More precisely, we tackle the problem of defining positive definite kernels between multivariate distributions by combining notions of optimal transport with Hilbert space embeddings. Besides presenting a characterization of radial positive definite and strictly positive definite kernels on general Hilbert spaces, we investigate the statistical properties of our theoretical and empirical kernels, focusing in particular on consistency and on the special case of Gaussian distributions. A wide range of applications is presented, using both simulations and real data.
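The general idea of building a positive definite kernel on distributions through a Hilbert space embedding can be sketched in the simpler one-dimensional setting. This is not the paper's multidimensional construction: it assumes univariate empirical distributions, for which the quadratic Wasserstein distance equals the L2 distance between quantile functions, so a squared-exponential radial kernel applied to that distance is positive definite. The function names, grid size, and length scale below are illustrative choices, not taken from the paper.

```python
import numpy as np

def quantile_embedding(samples, n_grid=200):
    """Embed a 1-D empirical distribution as its quantile function on a fixed grid.

    Assumption: univariate samples. The quantile function lives in L2(0, 1),
    which plays the role of the Hilbert space here.
    """
    levels = (np.arange(n_grid) + 0.5) / n_grid  # midpoints of (0, 1)
    return np.quantile(np.asarray(samples, dtype=float), levels)

def gram_matrix(sample_sets, length_scale=1.0, n_grid=200):
    """Gram matrix of a squared-exponential kernel on the embedded distributions."""
    emb = np.stack([quantile_embedding(s, n_grid) for s in sample_sets])
    diff = emb[:, None, :] - emb[None, :, :]
    # Mean over the grid approximates the squared L2(0, 1) distance between
    # quantile functions, i.e. the squared 2-Wasserstein distance in 1-D.
    sq_dist = np.mean(diff ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * length_scale ** 2))

# Hypothetical inputs: four sample sets drawn from different Gaussians.
rng = np.random.default_rng(0)
params = [(0.0, 1.0), (0.5, 1.0), (2.0, 0.5), (-1.0, 2.0)]
sample_sets = [rng.normal(loc=m, scale=s, size=500) for m, s in params]

K = gram_matrix(sample_sets)
min_eig = np.linalg.eigvalsh(K).min()
print(K.round(3))
print("smallest eigenvalue:", min_eig)
```

The Gram matrix `K` could then serve as the covariance matrix of a Gaussian process indexed by these distributions; the smallest eigenvalue is printed only as a numerical sanity check of positive definiteness.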


60G15 Gaussian processes

