zbMATH — the first resource for mathematics

Optimal transport: fast probabilistic approximation with exact solvers. (English) Zbl 1441.62216
Summary: We propose a simple subsampling scheme for fast randomized approximate computation of optimal transport distances on finite spaces. This scheme operates on a random subset of the full data and can use any exact algorithm as a black-box back-end, including state-of-the-art solvers and entropically penalized versions. It is based on averaging the exact distances between empirical measures generated from independent samples from the original measures and can easily be tuned towards higher accuracy or shorter computation times. To this end, we give non-asymptotic deviation bounds for its accuracy in the case of discrete optimal transport problems. In particular, we show that in many important instances, including images (2D-histograms), the approximation error is independent of the size of the full problem. We present numerical experiments that demonstrate that a very good approximation in typical applications can be obtained in a computation time that is several orders of magnitude smaller than what is required for exact computation of the full problem.
62L20 Stochastic approximation
62P30 Applications of statistics in engineering and industry; control charts
62-08 Computational methods for problems pertaining to statistics
90B06 Transportation, logistics and supply chain management
Full Text: Link
[1] Martial Agueh and Guillaume Carlier. Barycenters in the Wasserstein space.SIAM J. Math. Anal., 43(2):904-924, 2011.
[2] Jason Altschuler, Jonathan Weed, and Philippe Rigollet. Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration. InAdvances in Neural Information Processing Systems, pages 1964-1974, 2017.
[3] Dimitri P. Bertsekas. Auction algorithms for network flow problems: A tutorial introduction.Computational Optimization and Applications, 1(1):7-66, 1992.
[4] Emmanuel Boissard and Thibaut Le Gouic. On the mean speed of convergence of empirical and occupation measures in Wasserstein distance.Ann. Inst. H. Poincar´e Probab. Statist., 50(2): 539-563, 2014.
[5] Fran¸cois Bolley and C´edric Villani. Weighted Csisz´ar-Kullback-Pinsker inequalities and applications to transportation inequalities.Annales de La Facult´e Des Sciences de Toulouse: Math´ematiques, 14(3):331-352, 2005.
[6] Nicolas Bonneel, Julien Rabin, Gabriel Peyr´e, and Hanspeter Pfister. Sliced and Radon Wasserstein barycenters of measures.Journal of Mathematical Imaging and Vision, 51(1):22-45, 2015.
[7] Olivier Bousquet, Sylvain Gelly, Ilya Tolstikhin, Carl-Johann Simon-Gabriel, and Bernhard Schoelkopf. From optimal transport to generative modeling: the VEGAN cookbook. 2017. URL https://arxiv.org/abs/1705.07642.
[8] Hector Corrada Bravo and Stefan Theussl.Rcplex: R interface to cplex, 2016.URLhttps: //CRAN.R-project.org/package=Rcplex. R package version 0.3-3.
[9] Marco Cuturi. Sinkhorn distances: Lightspeed computation of optimal transport. InAdvances in Neural Information Processing Systems, pages 2292-2300, 2013.
[10] Marco Cuturi and Arnaud Doucet. Fast computation of Wasserstein barycenters. InInternational Conference on Machine Learning, pages 685-693, 2014.
[11] Pavel Dvurechensky, Alexander Gasnikov, and Alexey Kroshnin. Computational optimal transport: Complexity by accelerated gradient descent is better than by Sinkhorn’s algorithm. InInternational Conference on Machine Learning, pages 1367-1376. 2018.
[12] Steven N. Evans and Frederick A. Matsen. The phylogenetic Kantorovich-Rubinstein metric for environmental sequence samples.J. R. Stat. Soc. B, 74(3):569-592, 2012.
[13] Nicolas Fournier and Arnaud Guillin. On the rate of convergence in Wasserstein distance of the empirical measure.Probab. Theory Relat. Fields, 162(3-4):707-738, 2015.
[14] Carsten Gottschlich and Dominic Schuhmacher. The Shortlist method for fast computation of the earth mover’s distance and finding optimal solutions to transportation problems.PLoS ONE, 9 (10):e110214, 2014.
[15] Nathael Gozlan and Christian L´eonard. A large deviation approach to some transportation cost inequalities.Probab. Theory Relat. Fields, 139(1-2):235-283, 2007.
[16] Philippe Heinrich and Jonas Kahn.Strong identifiability and optimal minimax rates for finite mixture estimation.Ann. Stat., 46(6A):2844-2870, 2018.
[17] Leonid Vitaliyevich Kantorovich. On the translocation of masses.(Dokl.) Acad. Sci. URSS 37, 3: 199-201, 1942.
[18] Benoˆıt R. Kloeckner. A geometric study of Wasserstein spaces: Ultrametrics.Mathematika, 61(1): 162-178, 2015.
[19] Tianyi Lin, Nhat Ho, and Michael I Jordan. On efficient optimal transport: An analysis of greedy and accelerated mirror descent algorithms. 2019. URLhttps://arxiv.org/abs/1901.06482.
[20] Haibin Ling and Kazunori Okada. An efficient earth mover’s distance algorithm for robust histogram comparison.IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(5):840-853, 2007.
[21] Jialin Liu, Wotao Yin, Wuchen Li, and Yat Tin Chow. Multilevel optimal transport: A fast approximation of Wasserstein-1 distances. 2018. URLhttps://arxiv.org/abs/1810.00118.
[22] David G. Luenberger and Yinyu Ye.Linear and Nonlinear Programming. Springer, New York, 2008.
[23] Axel Munk and Claudia Czado. Nonparametric validation of similar distributions and assessment of goodness of fit.J. R. Stat. Soc. B, 60(1):223-241, 1998.
[24] Kangyu Ni, Xavier Bresson, Tony Chan, and Selim Esedoglu. Local histogram based segmentation using the Wasserstein distance.International Journal of Computer Vision, 84(1):97-111, 2009.
[25] Ofir Pele and Michael Werman. Fast and robust earth mover’s distances. InIEEE 12th International Conference on Computer Vision, pages 460-467, 2009.
[26] Svetlozar T. Rachev and Ludger R¨uschendorf.Mass Transportation Problems, Volume 1: Theory. Springer, New York, 1998.
[27] Yossi Rubner, Carlo Tomasi, and Leonidas J. Guibas. The earth mover’s distance as a metric for image retrieval.International Journal of Computer Vision, 40(2):99-121, 2000.
[28] Brian E. Ruttenberg, Gabriel Luna, Geoffrey P. Lewis, Steven K. Fisher, and Ambuj K. Singh. Quantifying spatial relationships from whole retinal images.Bioinformatics, 29(7):940-946, 2013.
[29] Bernhard Schmitzer. A sparse multi-scale algorithm for dense optimal transport.Journal of Mathematical Imaging and Vision, 56(2):238-259, 2016.
[30] J¨orn Schrieber, Dominic Schuhmacher, and Carsten Gottschlich. DOTmark — a benchmark for discrete optimal transport.IEEE Access, 5:271-282, 2016. doi: 10.1109/ACCESS.2016.2639065.
[31] Dominic Schuhmacher, Carsten Gottschlich, and Bjoern Baehre. R-package transport: Optimal transport in various forms, 2014. URLhttps://cran.r-project.org/package=transport. R package version 0.6-3.
[32] Shai Shalev-Shwartz, Ohad Shamir, Nathan Srebro, and Karthik Sridharan. Learnability, stability and uniform convergence.Journal of Machine Learning Research, 11(Oct):2635-2670, 2010.
[33] Sameer Shirdhonkar and David W. Jacobs. Approximate earth mover’s distance in linear time. In IEEE Conference on Computer Vision and Pattern Recognition, pages 1-8, 2008.
[34] Max Sommerfeld and Axel Munk. Inference for empirical Wasserstein distances on finite spaces.J. R. Stat. Soc. B, 80(1):219-238, 2018.
[35] Carla Tameling and Axel Munk. Computational strategies for statistical inference based on empirical optimal transport. In2018 IEEE Data Science Workshop, pages 175-179. IEEE, 2018.
[36] Jean-Louis Verger-Gaugry. Covering a ball with smaller equal balls inRn.Discrete & Computational Geometry, 33(1):143-155, 2005.
[37] C´edric Villani.Optimal Transport: Old and New. Springer, New York, 2008.
[38] Jianguo Zhang, Marcin Marsza lek, Svetlana Lazebnik, and Cordelia Schmid.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.