×

Iterative algorithm for discrete structure recovery. (English) Zbl 1486.62058

Summary: We propose a general modeling and algorithmic framework for discrete structure recovery that can be applied to a wide range of problems. Under this framework, we are able to study the recovery of clustering labels, ranks of players, signs of regression coefficients, cyclic shifts and even group elements from a unified perspective. A simple iterative algorithm is proposed for discrete structure recovery, which generalizes methods including Lloyd’s algorithm and the power method. A linear convergence result for the proposed algorithm is established in this paper under appropriate abstract conditions on stochastic errors and initialization. We illustrate our general theory by applying it on several representative problems: (1) clustering in Gaussian mixture model, (2) approximate ranking, (3) sign recovery in compressed sensing, (4) multireference alignment and (5) group synchronization, and show that minimax rate is achieved in each case.

MSC:

62F07 Statistical ranking and selection procedures
62J05 Linear regression; mixed models
62H30 Classification and discrimination; cluster analysis (statistical aspects)
94A12 Signal theory (characterization, reconstruction, filtering, etc.)
PDFBibTeX XMLCite
Full Text: DOI arXiv

References:

[1] Abbe, E., Bandeira, A. S. and Hall, G. (2016). Exact recovery in the stochastic block model. IEEE Trans. Inf. Theory 62 471-487. · Zbl 1359.94047 · doi:10.1109/TIT.2015.2490670
[2] ABBE, E., BENDORY, T., LEEB, W., PEREIRA, J. M., SHARON, N. and SINGER, A. (2019). Multireference alignment is easier with an aperiodic translation distribution. IEEE Trans. Inf. Theory 65 3565-3584. · Zbl 1432.94018 · doi:10.1109/TIT.2018.2889674
[3] Abbe, E., Fan, J., Wang, K. and Zhong, Y. (2020). Entrywise eigenvector analysis of random matrices with low expected rank. Ann. Statist. 48 1452-1474. · Zbl 1450.62066 · doi:10.1214/19-AOS1854
[4] ABBE, E., MASSOULIÉ, L., MONTANARI, A., SLY, A. and SRIVASTAVA, N. (2018). Group synchronization on grids. Math. Stat. Learn. 1 227-256. · Zbl 1426.62165 · doi:10.4171/msl/6
[5] AERON, S., SALIGRAMA, V. and ZHAO, M. (2010). Information theoretic bounds for compressed sensing. IEEE Trans. Inf. Theory 56 5111-5130. · Zbl 1366.94179 · doi:10.1109/TIT.2010.2059891
[6] AGUERREBERE, C., DELBRACIO, M., BARTESAGHI, A. and SAPIRO, G. (2016). Fundamental limits in multi-image alignment. IEEE Trans. Signal Process. 64 5707-5722. · Zbl 1414.94008 · doi:10.1109/TSP.2016.2600517
[7] ALOISE, D., DESHPANDE, A., HANSEN, P. and POPAT, P. (2009). NP-hardness of Euclidean sum-of-squares clustering. Mach. Learn. 75 245-248. · Zbl 1378.68047
[8] ARTHUR, D. and VASSILVITSKII, S. (2007). \(k\)-means++: The advantages of careful seeding. In Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms 1027-1035. ACM, New York. · Zbl 1302.68273
[9] AWASTHI, P. and SHEFFET, O. (2012). Improved spectral-norm bounds for clustering. In Approximation, Randomization, and Combinatorial Optimization. Lecture Notes in Computer Science 7408 37-49. Springer, Heidelberg. · Zbl 1358.68220 · doi:10.1007/978-3-642-32512-0_4
[10] BAGARIA, V., DING, J., TSE, D., WU, Y. and XU, J. (2020). Hidden Hamiltonian cycle recovery via linear programming. Oper. Res. 68 53-70. · Zbl 1445.90007 · doi:10.1287/opre.2019.1886
[11] Balakrishnan, S., Wainwright, M. J. and Yu, B. (2017). Statistical guarantees for the EM algorithm: From population to sample-based analysis. Ann. Statist. 45 77-120. · Zbl 1367.62052 · doi:10.1214/16-AOS1435
[12] BANDEIRA, A. S., BOUMAL, N. and SINGER, A. (2017). Tightness of the maximum likelihood semidefinite relaxation for angular synchronization. Math. Program. 163 145-167. · Zbl 1365.90188 · doi:10.1007/s10107-016-1059-6
[13] BANDEIRA, A. S., CHARIKAR, M., SINGER, A. and ZHU, A. (2014). Multireference alignment using semidefinite programming. In ITCS’14—Proceedings of the 2014 Conference on Innovations in Theoretical Computer Science 459-470. ACM, New York. · Zbl 1364.94108
[14] BANDEIRA, A. S., NILES-WEED, J. and RIGOLLET, P. (2019). Optimal rates of estimation for multi-reference alignment. Math. Stat. Learn. 2 25-75. · Zbl 1437.62227 · doi:10.4171/msl/11
[15] Bellec, P. C., Lecué, G. and Tsybakov, A. B. (2018). Slope meets Lasso: Improved oracle bounds and optimality. Ann. Statist. 46 3603-3642. · Zbl 1405.62056 · doi:10.1214/17-AOS1670
[16] BENDORY, T., BOUMAL, N., MA, C., ZHAO, Z. and SINGER, A. (2018). Bispectrum inversion with application to multireference alignment. IEEE Trans. Signal Process. 66 1037-1050. · Zbl 1414.94066 · doi:10.1109/TSP.2017.2775591
[17] Blumensath, T. and Davies, M. E. (2009). Iterative hard thresholding for compressed sensing. Appl. Comput. Harmon. Anal. 27 265-274. · Zbl 1174.94008 · doi:10.1016/j.acha.2009.04.002
[18] Bogdan, M., van den Berg, E., Sabatti, C., Su, W. and Candès, E. J. (2015). SLOPE—adaptive variable selection via convex optimization. Ann. Appl. Stat. 9 1103-1140. · Zbl 1454.62212 · doi:10.1214/15-AOAS842
[19] Bradley, R. A. and Terry, M. E. (1952). Rank analysis of incomplete block designs. I. The method of paired comparisons. Biometrika 39 324-345. · Zbl 0047.12903 · doi:10.2307/2334029
[20] BRAVERMAN, M. and MOSSEL, E. (2008). Noisy sorting without resampling. In Proceedings of the Nineteenth Annual ACM-SIAM Symposium on Discrete Algorithms 268-276. ACM, New York. · Zbl 1192.94077
[21] BRODER, A. Z., FRIEZE, A. M. and SHAMIR, E. (1994). Finding hidden Hamiltonian cycles. Random Structures Algorithms 5 395-410. · Zbl 0809.05064 · doi:10.1002/rsa.3240050303
[22] BUTUCEA, C., NDAOUD, M., STEPANOVA, N. A. and TSYBAKOV, A. B. (2018). Variable selection with Hamming loss. Ann. Statist. 46 1837-1875. · Zbl 1414.62126 · doi:10.1214/17-AOS1572
[23] Chen, Y. and Candès, E. J. (2018). The projected power method: An efficient algorithm for joint alignment from pairwise differences. Comm. Pure Appl. Math. 71 1648-1714. · Zbl 1480.90199 · doi:10.1002/cpa.21760
[24] CHEN, Y., GUIBAS, L. J. and HUANG, Q.-X. (2014). Near-optimal joint object matching via convex relaxation. Preprint. Available at arXiv:1402.1473.
[25] COLLIER, O. and DALALYAN, A. S. (2016). Minimax rates in permutation estimation for feature matching. J. Mach. Learn. Res. 17 Paper No. 6, 31 pp. · Zbl 1360.62262
[26] CONTE, D., FOGGIA, P., SANSONE, C. and VENTO, M. (2004). Thirty years of graph matching in pattern recognition. Int. J. Pattern Recognit. Artif. Intell. 18 265-298.
[27] DASGUPTA, S. (2008). The Hardness of k-Means Clustering. Department of Computer Science and Engineering, Univ. California.
[28] DASKALAKIS, C., TZAMOS, C. and ZAMPETAKIS, M. (2016). Ten steps of EM suffice for mixtures of two Gaussians. Preprint. Available at arXiv:1609.00368.
[29] DAWID, A. P. and SKENE, A. M. (1979). Maximum likelihood estimation of observer error-rates using the em algorithm. J. R. Stat. Soc. Ser. C. Appl. Stat. 28 20-28.
[30] Dempster, A. P., Laird, N. M. and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Statist. Soc. Ser. B 39 1-38. · Zbl 0364.62022
[31] DERUMIGNY, A. (2018). Improved bounds for square-root Lasso and square-root slope. Electron. J. Stat. 12 741-766. · Zbl 1473.62132 · doi:10.1214/18-EJS1410
[32] DING, J., MA, Z., WU, Y. and XU, J. (2021). Efficient random graph matching via degree profiles. Probab. Theory Related Fields 179 29-115. · Zbl 1460.05171 · doi:10.1007/s00440-020-00997-4
[33] DWIVEDI, R., HO, N., KHAMARU, K., WAINWRIGHT, M. J., JORDAN, M. I. and YU, B. (2020). Singularity, misspecification and the convergence rate of EM. Ann. Statist. 48 3161-3182. · Zbl 1462.62382 · doi:10.1214/19-AOS1924
[34] FEI, Y. and CHEN, Y. (2019). Exponential error rates of SDP for block models: Beyond Grothendieck’s inequality. IEEE Trans. Inf. Theory 65 551-571. · Zbl 1432.90102 · doi:10.1109/TIT.2018.2839677
[35] FEI, Y. and CHEN, Y. (2020). Achieving the Bayes error rate in synchronization and block models by SDP, robustly. IEEE Trans. Inf. Theory 66 3929-3953. · Zbl 1448.90071 · doi:10.1109/TIT.2020.2966438
[36] FLETCHER, A. K., RANGAN, S. and GOYAL, V. K. (2009). Necessary and sufficient conditions for sparsity pattern recovery. IEEE Trans. Inf. Theory 55 5758-5772. · Zbl 1367.94090 · doi:10.1109/TIT.2009.2032726
[37] FOUCART, S. (2011). Hard thresholding pursuit: An algorithm for compressive sensing. SIAM J. Numer. Anal. 49 2543-2563. · Zbl 1242.65060 · doi:10.1137/100806278
[38] FRANK, J. (2006). Three-Dimensional Electron Microscopy of Macromolecular Assemblies: Visualization of Biological Molecules in Their Native State. Oxford Univ. Press, London.
[39] FRIEDRICH, F., KEMPE, A., LIEBSCHER, V. and WINKLER, G. (2008). Complexity penalized \(M\)-estimation: Fast computation. J. Comput. Graph. Statist. 17 201-224. · doi:10.1198/106186008X285591
[40] GAO, C. (2017). Phase transitions in approximate ranking. Preprint. Available at arXiv:1711.11189.
[41] GAO, C., LU, Y. and ZHOU, D. (2016). Exact exponent in optimal rates for crowdsourcing. In International Conference on Machine Learning 603-611.
[42] GAO, C. and MA, Z. (2021). Minimax rates in network analysis: Graphon estimation, community detection and hypothesis testing. Statist. Sci. 36 16-33. · Zbl 07368217 · doi:10.1214/19-STS736
[43] GAO, C., MA, Z., ZHANG, A. Y. and ZHOU, H. H. (2017). Achieving optimal misclassification proportion in stochastic block models. J. Mach. Learn. Res. 18 Paper No. 60, 45 pp. · Zbl 1440.62244
[44] Gao, C., Ma, Z., Zhang, A. Y. and Zhou, H. H. (2018). Community detection in degree-corrected block models. Ann. Statist. 46 2153-2185. · Zbl 1408.62116 · doi:10.1214/17-AOS1615
[45] Gao, C., van der Vaart, A. W. and Zhou, H. H. (2020). A general framework for Bayes structured linear models. Ann. Statist. 48 2848-2878. · Zbl 1471.62241 · doi:10.1214/19-AOS1909
[46] GAO, C. and ZHANG, A. Y. (2022). Supplement to “Iterative algorithm for discrete structure recovery.” https://doi.org/10.1214/21-AOS2140SUPP
[47] Giraud, C. and Verzelen, N. (2018). Partial recovery bounds for clustering with the relaxed \(K\)-means. Math. Stat. Learn. 1 317-374. · Zbl 1426.62186
[48] Girvan, M. and Newman, M. E. J. (2002). Community structure in social and biological networks. Proc. Natl. Acad. Sci. USA 99 7821-7826. · Zbl 1032.91716 · doi:10.1073/pnas.122653799
[49] Hajek, B., Wu, Y. and Xu, J. (2016). Achieving exact cluster recovery threshold via semidefinite programming. IEEE Trans. Inf. Theory 62 2788-2797. · Zbl 1359.94222 · doi:10.1109/TIT.2016.2546280
[50] Hajek, B., Wu, Y. and Xu, J. (2016). Achieving exact cluster recovery threshold via semidefinite programming: Extensions. IEEE Trans. Inf. Theory 62 5918-5937. · Zbl 1359.94951 · doi:10.1109/TIT.2016.2594812
[51] Hartigan, J. A. (1975). Clustering Algorithms. Wiley Series in Probability and Mathematical Statistics. Wiley, New York. · Zbl 0372.62040
[52] JI, P. and JIN, J. (2012). UPS delivers optimal phase diagram in high-dimensional variable selection. Ann. Statist. 40 73-103. · Zbl 1246.62160 · doi:10.1214/11-AOS947
[53] KANUNGO, T., MOUNT, D. M., NETANYAHU, N. S., PIATKO, C. D., SILVERMAN, R. and WU, A. Y. (2004). A local search approximation algorithm for \(k\)-means clustering. Comput. Geom. 28 89-112. · Zbl 1077.68109 · doi:10.1016/j.comgeo.2004.03.003
[54] Kumar, A. and Kannan, R. (2010). Clustering with spectral norm and the \(k\)-means algorithm. In 2010 IEEE 51st Annual Symposium on Foundations of Computer Science—FOCS 2010 299-308. IEEE Computer Soc., Los Alamitos, CA.
[55] KUMAR, A., SABHARWAL, Y. and SEN, S. (2004). A simple linear time \[(1+\epsilon )\]-approximation algorithm for \(k\)-means clustering in any dimensions. In Annual Symposium on Foundations of Computer Science 45 454-462. IEEE Computer Society Press.
[56] LERMAN, G. and SHI, Y. (2019). Robust group synchronization via cycle-edge message passing. Preprint. Available at arXiv:1912.11347.
[57] LESKOVEC, J., LANG, K. J. and MAHONEY, M. (2010). Empirical comparison of algorithms for network community detection. In Proceedings of the 19th International Conference on World Wide Web 631-640. ACM, New York.
[58] LING, S. (2020). Solving orthogonal group synchronization via convex and low-rank optimization: Tightness and landscape analysis. Preprint. Available at arXiv:2006.00902.
[59] LING, S. (2020). Near-optimal performance bounds for orthogonal and permutation group synchronization via spectral methods. Preprint. Available at arXiv:2008.05341.
[60] Lloyd, S. P. (1982). Least squares quantization in PCM. IEEE Trans. Inf. Theory 28 129-137. · Zbl 0504.94015 · doi:10.1109/TIT.1982.1056489
[61] LOUNICI, K. (2008). Sup-norm convergence rate and sign concentration property of Lasso and Dantzig estimators. Electron. J. Stat. 2 90-102. · Zbl 1306.62155 · doi:10.1214/08-EJS177
[62] LU, Y. and ZHOU, H. H. (2016). Statistical and computational guarantees of lloyd’s algorithm and its variants. Preprint. Available at arXiv:1612.02099.
[63] LUCE, R. D. (2012). Individual Choice Behavior: A Theoretical Analysis. Wiley, New York.
[64] MAHAJAN, M., NIMBHORKAR, P. and VARADARAJAN, K. (2012). The planar \(k\)-means problem is NP-hard. Theoret. Comput. Sci. 442 13-21. · Zbl 1260.68158 · doi:10.1016/j.tcs.2010.05.034
[65] MAO, C., WEED, J. and RIGOLLET, P. (2018). Minimax rates and efficient algorithms for noisy sorting. In Algorithmic Learning Theory 2018. Proc. Mach. Learn. Res. (PMLR) 83 27. Proceedings of Machine Learning Research PMLR. · Zbl 1407.62066
[66] Meinshausen, N. and Bühlmann, P. (2006). High-dimensional graphs and variable selection with the lasso. Ann. Statist. 34 1436-1462. · Zbl 1113.62082 · doi:10.1214/009053606000000281
[67] Montanari, A. and Sen, S. (2016). Semidefinite programs on sparse random graphs and their application to community detection. In STOC’16—Proceedings of the 48th Annual ACM SIGACT Symposium on Theory of Computing 814-827. ACM, New York. · Zbl 1376.90043 · doi:10.1145/2897518.2897548
[68] MONTEILLER, P., CLAICI, S., CHIEN, E., MIRZAZADEH, F., SOLOMON, J. M. and YUROCHKIN, M. (2019). Alleviating label switching with optimal transport. In Advances in Neural Information Processing Systems 13634-13644.
[69] Mossel, E., Neeman, J. and Sly, A. (2014). Consistency thresholds for binary symmetric block models. Preprint. Available at arXiv:1407.1591. · Zbl 1321.05242
[70] MOSSEL, E., NEEMAN, J. and SLY, A. (2018). A proof of the block model threshold conjecture. Combinatorica 38 665-708. · Zbl 1424.05272 · doi:10.1007/s00493-016-3238-8
[71] NDAOUD, M. (2018). Sharp optimal recovery in the two Gaussian mixture model. Preprint. Available at arXiv:1812.08078.
[72] NDAOUD, M. and TSYBAKOV, A. B. (2020). Optimal variable selection and adaptive noisy compressed sensing. IEEE Trans. Inf. Theory 66 2517-2532. · Zbl 1448.94081 · doi:10.1109/TIT.2020.2965738
[73] PACHAURI, D., KONDOR, R. and SINGH, V. (2013). Solving the multi-way matching problem by permutation synchronization. In Advances in Neural Information Processing Systems 1860-1868.
[74] PANANJADY, A., WAINWRIGHT, M. J. and COURTADE, T. A. (2018). Linear regression with shuffled data: Statistical and computational limits of permutation recovery. IEEE Trans. Inf. Theory 64 3286-3300. · Zbl 1395.62204 · doi:10.1109/TIT.2017.2776217
[75] PERRY, A., WEED, J., BANDEIRA, A. S., RIGOLLET, P. and SINGER, A. (2019). The sample complexity of multireference alignment. SIAM J. Math. Data Sci. 1 497-517. · Zbl 1499.92047 · doi:10.1137/18M1214317
[76] PERRY, A., WEIN, A. S., BANDEIRA, A. S. and MOITRA, A. (2016). Optimality and sub-optimality of pca for spiked random matrices and synchronization. Preprint. Available at arXiv:1609.05573. · Zbl 1404.62065
[77] RAD, K. R. (2011). Nearly sharp sufficient conditions on exact sparsity pattern recovery. IEEE Trans. Inf. Theory 57 4672-4679. · Zbl 1365.62203 · doi:10.1109/TIT.2011.2145670
[78] SALIGRAMA, V. and ZHAO, M. (2011). Thresholded basis pursuit: LP algorithm for order-wise optimal support recovery for sparse and approximately sparse signals from noisy random measurements. IEEE Trans. Inf. Theory 57 1567-1586. · Zbl 1366.94124 · doi:10.1109/TIT.2011.2104512
[79] SIGWORTH, F. J. (1998). A maximum-likelihood approach to single-particle image refinement. J. Struct. Biol. 122 328-339.
[80] Singer, A. (2011). Angular synchronization by eigenvectors and semidefinite programming. Appl. Comput. Harmon. Anal. 30 20-36. · Zbl 1206.90116 · doi:10.1016/j.acha.2010.02.001
[81] SU, W. and CANDÈS, E. (2016). SLOPE is adaptive to unknown sparsity and asymptotically minimax. Ann. Statist. 44 1038-1068. · Zbl 1338.62032 · doi:10.1214/15-AOS1397
[82] Wainwright, M. J. (2009). Information-theoretic limits on sparsity recovery in the high-dimensional and noisy setting. IEEE Trans. Inf. Theory 55 5728-5741. · Zbl 1367.94106 · doi:10.1109/TIT.2009.2032816
[83] WANG, W., WAINWRIGHT, M. J. and RAMCHANDRAN, K. (2010). Information-theoretic limits on sparse signal recovery: Dense versus sparse measurement matrices. IEEE Trans. Inf. Theory 56 2967-2979. · Zbl 1366.94130 · doi:10.1109/TIT.2010.2046199
[84] Wasserman, L. and Roeder, K. (2009). High-dimensional variable selection. Ann. Statist. 37 2178-2201. · Zbl 1173.62054 · doi:10.1214/08-AOS646
[85] Wu, C.-F. J. (1983). On the convergence properties of the EM algorithm. Ann. Statist. 11 95-103. · Zbl 0517.62035 · doi:10.1214/aos/1176346060
[86] WU, Y. and ZHOU, H. H. (2019). Randomly initialized EM algorithm for two-component Gaussian mixture achieves near optimality in \[\sqrt{n}\] iterations. Preprint. Available at arXiv:1908.10935. · Zbl 1493.62350
[87] Xu, J., Hsu, D. J. and Maleki, A. (2016). Global analysis of expectation maximization for mixtures of two Gaussians. In Advances in Neural Information Processing Systems 2676-2684.
[88] XU, J., HSU, D. J. and MALEKI, A. (2018). Benefits of over-parameterization with EM. In Advances in Neural Information Processing Systems 10662-10672.
[89] YAN, J., CHO, M., ZHA, H., YANG, X. and CHU, S. M. (2015). Multi-graph matching via affinity optimization with graduated consistency regularization. IEEE Trans. Pattern Anal. Mach. Intell. 38 1228-1242.
[90] Yun, S.-Y. and Proutiere, A. (2014). Accurate community detection in the stochastic block model via spectral algorithms. Preprint. Available at arXiv:1412.7335.
[91] Zhang, A. Y. and Zhou, H. H. (2016). Minimax rates of community detection in stochastic block models. Ann. Statist. 44 2252-2280. · Zbl 1355.60125 · doi:10.1214/15-AOS1428
[92] Zhang, A. Y. and Zhou, H. H. (2020). Theoretical and computational guarantees of mean field variational inference for community detection. Ann. Statist. 48 2575-2598. · Zbl 1462.62221 · doi:10.1214/19-AOS1898
[93] Zhao, P. and Yu, B. (2006). On model selection consistency of Lasso. J. Mach. Learn. Res. 7 2541-2563. · Zbl 1222.62008
[94] Zhong, Y. and Boumal, N. (2018). Near-optimal bounds for phase synchronization. SIAM J. Optim. 28 989-1016. · Zbl 1396.90068 · doi:10.1137/17M1122025
[95] ZHOU, X., ZHU, M. and DANIILIDIS, K. (2015). Multi-image matching via fast alternating minimization. In Proceedings of the IEEE International Conference on Computer Vision 4032-4040
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.