×

Estimation of Monge matrices. (English) Zbl 1460.62082

Summary: Monge matrices and their permuted versions known as pre-Monge matrices naturally appear in many domains across science and engineering. While the rich structural properties of such matrices have long been leveraged for algorithmic purposes, little is known about their impact on statistical estimation. In this work, we propose to view this structure as a shape constraint and study the problem of estimating a Monge matrix subject to additive random noise. More specifically, we establish the minimax rates of estimation of Monge and pre-Monge matrices. In the case of pre-Monge matrices, the minimax-optimal least-squares estimator is not efficiently computable, and we propose two efficient estimators and establish their rates of convergence. Our theoretical findings are supported by numerical experiments.

MSC:

62H12 Estimation in multivariate analysis
62F30 Parametric inference under constraints
62-08 Computational methods for problems pertaining to statistics

Software:

SCS; ECOS; GitHub; UNLocBoX
PDFBibTeX XMLCite
Full Text: DOI arXiv Euclid

References:

[1] Aggarwal, A., Klawe, M.M., Moran, S., Shor, P. and Wilber, R. (1987). Geometric applications of a matrix-searching algorithm. Algorithmica 2 195-208. · Zbl 0642.68078 · doi:10.1007/BF01840359
[2] Airoldi, E.M., Costa, T.B. and Chan, S.H. (2013). Stochastic blockmodel approximation of a graphon: Theory and consistent estimation. Adv. Neural Inf. Process. Syst. 26 692-700.
[3] Atkins, J.E., Boman, E.G. and Hendrickson, B. (1999). A spectral algorithm for seriation and the consecutive ones problem. SIAM J. Comput. 28 297-310. · Zbl 0930.05064 · doi:10.1137/S0097539795285771
[4] Berthet, Q. and Rigollet, P. (2013). Complexity theoretic lower bounds for sparse principal component detection. In COLT 2013 - The 26th Conference on Learning Theory (S. Shalev-Shwartz and I. Steinwart, eds.). JMLR W&CP 30 1046-1066.
[5] Blei, R., Gao, F. and Li, W.V. (2007). Metric entropy of high dimensional distributions. Proc. Amer. Math. Soc. 135 4009-4018. · Zbl 1147.46018 · doi:10.1090/S0002-9939-07-08935-6
[6] Borgs, C., Chayes, J.T., Cohn, H. and Ganguly, S. (2015). Consistent nonparametric estimation for heavy-tailed sparse graphs. Preprint. Available at arXiv:1508.06675.
[7] Boyle, J.P. and Dykstra, R.L. (1986). A method for finding projections onto the intersection of convex sets in Hilbert spaces. In Advances in Order Restricted Statistical Inference (Iowa City, Iowa, 1985). Lect. Notes Stat. 37 28-47. Berlin: Springer. · Zbl 0603.49024
[8] Braverman, M. and Mossel, E. (2008). Noisy sorting without resampling. In Proceedings of the Nineteenth Annual ACM-SIAM Symposium on Discrete Algorithms 268-276. New York: ACM. · Zbl 1192.94077
[9] Burkard, R.E. (2007). Monge properties, discrete convexity and applications. European J. Oper. Res. 176 1-14. · Zbl 1137.90579 · doi:10.1016/j.ejor.2005.04.050
[10] Burkard, R.E., Deĭneko, V.G., van Dal, R., van der Veen, J.A.A. and Woeginger, G.J. (1998). Well-solvable special cases of the traveling salesman problem: A survey. SIAM Rev. 40 496-546. · Zbl 1052.90597 · doi:10.1137/S0036144596297514
[11] Burkard, R.E., Deĭneko, V.G. and Woeginger, G.J. (1999). The travelling salesman problem on permuted Monge matrices. J. Comb. Optim. 2 333-350. · Zbl 0955.90113 · doi:10.1023/A:1009768317347
[12] Burkard, R.E., Klinz, B. and Rudolf, R. (1996). Perspectives of Monge properties in optimization. Discrete Appl. Math. 70 95-161. · Zbl 0856.90091 · doi:10.1016/0166-218X(95)00103-X
[13] Cechlárová, K. and Szabó, P. (1990). On the Monge property of matrices. Discrete Math. 81 123-128. · Zbl 0726.06010
[14] Çela, E., Deineko, V. and Woeginger, G.J. (2018). New special cases of the quadratic assignment problem with diagonally structured coefficient matrices. European J. Oper. Res. 267 818-834. · Zbl 1403.90471 · doi:10.1016/j.ejor.2017.12.024
[15] Chan, S. and Airoldi, E. (2014). A consistent histogram estimator for exchangeable graph models. In International Conference on Machine Learning 208-216.
[16] Chatterjee, S. (2014). A new perspective on least squares under convex constraint. Ann. Statist. 42 2340-2381. · Zbl 1302.62053 · doi:10.1214/14-AOS1254
[17] Chatterjee, S. (2015). Matrix estimation by universal singular value thresholding. Ann. Statist. 43 177-214. · Zbl 1308.62038 · doi:10.1214/14-AOS1272
[18] Chatterjee, S., Guntuboyina, A. and Sen, B. (2018). On matrix estimation under monotonicity constraints. Bernoulli 24 1072-1100. · Zbl 1419.62135 · doi:10.3150/16-BEJ865
[19] Chatterjee, S. and Mukherjee, S. (2019). Estimation in tournaments and graphs under monotonicity constraints. IEEE Trans. Inf. Theory 65 3525-3539. · Zbl 1432.62160 · doi:10.1109/TIT.2019.2893911
[20] Collier, O. and Dalalyan, A.S. (2016). Minimax rates in permutation estimation for feature matching. J. Mach. Learn. Res. 17 Paper No. 6, 31. · Zbl 1360.62262
[21] Combettes, P.L. and Pesquet, J.-C. (2011). Proximal splitting methods in signal processing. In Fixed-Point Algorithms for Inverse Problems in Science and Engineering. Springer Optim. Appl. 49 185-212. New York: Springer. · Zbl 1242.90160
[22] Deineko, V., Rudolf, R. and Woeginger, G.J. (1996). On the recognition of permuted Supnick and incomplete Monge matrices. Acta Inform. 33 559-569. · Zbl 0858.68042 · doi:10.1007/s002360050058
[23] Deineko, V.G. and Filonenko, V.L. (1979). On the reconstruction of specially structured matrices. Aktualnyje Problemy EVM, programmirovanije, Dnepropetrovsk, DGU. (in Russian).
[24] Deutsch, F. and Hundal, H. (1994). The rate of convergence of Dykstra’s cyclic projections algorithm: The polyhedral case. Numer. Funct. Anal. Optim. 15 537-565. · Zbl 0807.41019 · doi:10.1080/01630569408816580
[25] Ding, J., Ma, Z., Wu, Y. and Xu, J. (2018). Efficient random graph matching via degree profiles. Preprint. Available at arXiv:1811.07821.
[26] Domahidi, A., Chu, E. and ECOS, S.B. (2013). An SOCP solver for embedded systems. In European Control Conference (ECC) 3071-3076.
[27] Fallat, S., Lauritzen, S., Sadeghi, K., Uhler, C., Wermuth, N. and Zwiernik, P. (2017). Total positivity in Markov structures. Ann. Statist. 45 1152-1184. · Zbl 1414.60010 · doi:10.1214/16-AOS1478
[28] Fang, B., Guntuboyina, A. and Sen, B. (2019). Multivariate extensions of isotonic regression and total variation denoising via entire monotonicity and Hardy-Krause variation. Preprint. Available at arXiv:1903.01395.
[29] Flammarion, N., Mao, C. and Rigollet, P. (2019). Optimal rates of statistical seriation. Bernoulli 25 623-653. · Zbl 1442.62084 · doi:10.3150/17-BEJ1000
[30] Fogel, F., d’Aspremont, A. and Vojnovic, M. (2014). Serialrank: Spectral ranking using seriation. In Advances in Neural Information Processing Systems 27 (Z. Ghahramani, M. Welling, C. Cortes, N. Lawrence and K. Weinberger, eds.) 900-908. Curran Associates,.
[31] Fogel, F., Jenatton, R., Bach, F. and d’Aspremont, A. (2013). Convex relaxations for permutation problems. In Advances in Neural Information Processing Systems 26 (C. Burges, L. Bottou, M. Welling, Z. Ghahramani and K. Weinberger, eds.) 1016-1024. Curran Associates.
[32] Fortuin, C.M., Kasteleyn, P.W. and Ginibre, J. (1971). Correlation inequalities on some partially ordered sets. Comm. Math. Phys. 22 89-103. · Zbl 0346.06011 · doi:10.1007/BF01651330
[33] Gao, C., Lu, Y. and Zhou, H.H. (2015). Rate-optimal graphon estimation. Ann. Statist. 43 2624-2652. · Zbl 1332.60050 · doi:10.1214/15-AOS1354
[34] Gao, F. (2013). Bracketing entropy of high dimensional distributions. In High Dimensional Probability VI. Progress in Probability 66 3-17. Basel: Birkhäuser/Springer. · Zbl 1270.41006
[35] Gavish, M. and Donoho, D.L. (2014). The optimal hard threshold for singular values is \(4/\sqrt{3} \). IEEE Trans. Inf. Theory 60 5040-5053. · Zbl 1360.94071 · doi:10.1109/TIT.2014.2323359
[36] Hitchcock, F.L. (1941). The distribution of a product from several sources to numerous localities. J. Math. Phys. Mass. Inst. Tech. 20 224-230. · Zbl 0026.33904 · doi:10.1002/sapm1941201224
[37] Hoffman, A.J. (1963). On simple linear programming problems. In Proc. Sympos. Pure Math., Vol. VII 317-327. Providence, RI: Amer. Math. Soc. · Zbl 0171.17801
[38] Hütter, J.-C., Mao, C., Rigollet, P. and Robeva, E. (2020). Optimal rates for estimation of two-dimensional totally positive distributions. In preparation. · Zbl 1446.62105
[39] Hütter, J.-C., Mao, C., Rigollet, P. and Robeva, E. (2020). Supplement to “Estimation of Monge Matrices.” https://doi.org/10.3150/20-BEJ1215SUPP
[40] Hütter, J.-C. and Rigollet, P. (2016). Optimal rates for total variation denoising. In Conference on Learning Theory 1115-1146.
[41] Johnstone, I.M. (2017). Gaussian estimation: Sequence and wavelet models. Unpublished manuscript.
[42] Juditsky, A., Rigollet, P. and Tsybakov, A.B. (2008). Learning by mirror averaging. Ann. Statist. 36 2183-2206. · Zbl 1274.62288 · doi:10.1214/07-AOS546
[43] Karlin, S. and Rinott, Y. (1980). Classes of orderings of measures and related correlation inequalities. I. Multivariate totally positive distributions. J. Multivariate Anal. 10 467-498. · Zbl 0469.60006 · doi:10.1016/0047-259X(80)90065-2
[44] Karlin, S. and Rinott, Y. (1980). Classes of orderings of measures and related correlation inequalities. II. Multivariate reverse rule distributions. J. Multivariate Anal. 10 499-516. · Zbl 0469.60007 · doi:10.1016/0047-259X(80)90066-4
[45] Kendall, D.G. (1969). Incidence matrices, interval graphs and seriation in archeology. Pacific J. Math. 28 565-570. · Zbl 0185.03301 · doi:10.2140/pjm.1969.28.565
[46] Kendall, D.G. (1970). A mathematical approach to seriation. Philos. Trans. R. Soc. Lond. Ser. A, Math. Phys. Sci. 269 125-134.
[47] Klopp, O., Tsybakov, A.B. and Verzelen, N. (2017). Oracle inequalities for network models and sparse graphon estimation. Ann. Statist. 45 316-354. · Zbl 1367.62090 · doi:10.1214/16-AOS1454
[48] Lauritzen, S., Uhler, C. and Zwiernik, P. (2019). Maximum likelihood estimation in Gaussian models under total positivity. Ann. Statist. 47 1835-1863. · Zbl 1467.62092 · doi:10.1214/17-AOS1668
[49] Livi, L. and Rizzi, A. (2013). The graph matching problem. PAA Pattern Anal. Appl. 16 253-283. · Zbl 1284.68470 · doi:10.1007/s10044-012-0284-8
[50] Ma, Z. and Wu, Y. (2015). Computational barriers in minimax submatrix detection. Ann. Statist. 43 1089-1116. · Zbl 1328.62354 · doi:10.1214/14-AOS1300
[51] Mao, C., Pananjady, A. and Wainwright, M.J. (2018). Breaking the \(1/\sqrt{n}\) barrier: Faster rates for permutation-based models in polynomial time. Preprint. Available at arXiv:1802.09963.
[52] Mao, C., Weed, J. and Rigollet, P. (2018). Minimax rates and efficient algorithms for noisy sorting. In AProceedings of the 28th International Conference on Algorithmic Learning Theory. · Zbl 1407.62066
[53] Massart, P. (2007). Concentration Inequalities and Model Selection: Ecole D’Eté de Probabilités de Saint-Flour XXXIII - 2003. Number No. 1896 in Ecole D’Eté de Probabilités de Saint-Flour. Berlin: Springer. · Zbl 1170.60006
[54] Monge, G. (1781). Mémoire sur la théorie des déblais et des remblais. Histoire de L’Académie Royale des Sciences de Paris.
[55] O’Donoghue, B., Chu, E., Parikh, N. and Boyd, S. (2016). Conic optimization via operator splitting and homogeneous self-dual embedding. J. Optim. Theory Appl. 169 1042-1068. · Zbl 1342.90136 · doi:10.1007/s10957-016-0892-3
[56] O’Donoghue, B., Chu, E., Parikh, N. and SCS, S.B. (2017). Splitting conic solver, version 2.0.2. https://github.com/cvxgrp/scs.
[57] Pananjady, A., Mao, C., Muthukumar, V., Wainwright, M.J. and Courtade, T.A. (2017). Worst-case vs average-case design for estimation from fixed pairwise comparisons. Preprint. Available at arXiv:1707.06217. · Zbl 1452.62561
[58] Park, J.K. (1991). The monge array: An abstraction and its applications. Ph.D. Thesis, Massachusetts Institute of Technology.
[59] Park, J.K. (1991). A special case of the \(n\)-vertex traveling-salesman problem that can be solved in \(O(n)\) time. Inform. Process. Lett. 40 247-254. · Zbl 0743.68079 · doi:10.1016/0020-0190(91)90118-2
[60] Pferschy, U., Rudolf, R. and Woeginger, G.J. (1994). Monge matrices make maximization manageable. Oper. Res. Lett. 16 245-254. · Zbl 0822.90112 · doi:10.1016/0167-6377(94)90037-X
[61] Queyranne, M., Spieksma, F. and Tardella, F. (1998). A general class of greedily solvable linear programs. Math. Oper. Res. 23 892-908. · Zbl 0977.90045 · doi:10.1287/moor.23.4.892
[62] Robeva, E., Sturmfels, B., Tran, N. and Uhler, C. (2018). Maximum likelihood estimation for totally positive log-concave densities. Preprint. Available at arXiv:1806.10120.
[63] Rudolf, R. (1994). Recognition of \(d\)-dimensional Monge arrays. Discrete Appl. Math. 52 71-82. · Zbl 0819.90060 · doi:10.1016/0166-218X(92)00189-S
[64] Rudolf, R. and Woeginger, G.J. (1995). The cone of Monge matrices: Extremal rays and applications. Math. Methods Oper. Res. 42 161-168. · Zbl 0843.90101 · doi:10.1007/BF01415751
[65] Shah, D. and Lee, C. Reducing crowdsourcing to graphon estimation, statistically. In Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics (A. Storkey and F. Perez-Cruz, eds.). Proceedings of Machine Learning Research 84 1741-1750. Playa Blanca, Lanzarote, Canary Islands, 09-11 Apr 2018. PMLR.
[66] Shah, N.B., Balakrishnan, S., Guntuboyina, A. and Wainwright, M.J. (2017). Stochastically transitive models for pairwise comparisons: Statistical and computational issues. IEEE Trans. Inf. Theory 63 934-959. · Zbl 1364.94253 · doi:10.1109/TIT.2016.2634418
[67] Shah, N.B., Balakrishnan, S. and Wainwright, M.J. (2016). A permutation-based model for crowd labeling: Optimal estimation and robustness. Preprint. Available at arXiv:1606.09632.
[68] Shah, N.B., Balakrishnan, S. and Wainwright, M.J. (2019). Feeling the bern: Adaptive estimators for Bernoulli probabilities of pairwise comparisons. IEEE Trans. Inf. Theory 65 4854-5874. · Zbl 1432.62234 · doi:10.1109/TIT.2019.2903249
[69] Strang, G. (2007). Computational Science and Engineering. Wellesley, MA: Wellesley-Cambridge Press. · Zbl 1194.65001
[70] Wang, Y.-X., Sharpnack, J., Smola, A.J. and Tibshirani, R.J. (2015). Trend filtering on graphs. Artificial Intelligence and Statistics 1042-1050.
[71] Wellner, J.A. (2003). Gaussian white noise models: Some results for monotone functions. In Crossing Boundaries: Statistical Essays in Honor of Jack Hall. Institute of Mathematical Statistics Lecture Notes - Monograph Series 43 87-104. Beachwood, OH: IMS. · Zbl 1255.62100
[72] Wolfe, P.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.