×

Total variation based community detection using a nonlinear optimization approach. (English) Zbl 1444.91175

Summary: Maximizing the modularity of a network is a successful tool to identify an important community of nodes. However, this combinatorial optimization problem is known to be NP-complete. Inspired by recent nonlinear modularity eigenvector approaches, we introduce the modularity total variation \(TV_Q\) and show that its box-constrained global maximum coincides with the maximum of the original discrete modularity function. Thus we describe a new nonlinear optimization approach to solve the equivalent problem leading to a community detection strategy based on \(TV_Q\). The proposed approach relies on the use of a fast first-order method that embeds a tailored active-set strategy. We report extensive numerical comparisons with standard matrix-based approaches and the generalized RatioDCA approach for nonlinear modularity eigenvectors, showing that our new method compares favorably with state-of-the-art alternatives.

MSC:

91D30 Social networks; opinion dynamics
65K10 Numerical optimization and variational techniques
91C20 Clustering in the social and behavioral sciences
PDFBibTeX XMLCite
Full Text: DOI arXiv

References:

[1] E. Abbe, Community detection and stochastic block models: Recent developments, J. Mach. Learn. Res., 18 (2017), pp. 6446-6531. · Zbl 1403.62110
[2] R. Andreani, E. G. Birgin, J. M. Martínez, and M. L. Schuverdt, Second-order negative-curvature methods for box-constrained and general constrained optimization, Comput. Optim. Appl., 45 (2010), pp. 209-236. · Zbl 1187.90265
[3] F. Bach, Learning with submodular functions: A convex optimization perspective, Found. Trends Mach. Learn., 6 (2013), pp. 145-373. · Zbl 1280.68001
[4] V. Batagelj and A. Mrvar, Pajek Datasets Collection, http://vlado.fmf.uni-lj.si/pub/networks/data/, 2006. · Zbl 1054.68564
[5] A. Beck and M. Teboulle, A fast iterative shrinkage-thresholding algorithm for linear inverse problems, SIAM J. Imaging Sci., 2 (2009), pp. 183-202. · Zbl 1175.94009
[6] E. G. Birgin and J. M. Martínez, Large-scale active-set box-constrained optimization method with spectral projected gradients, Comput. Optim. Appl., 23 (2002), pp. 101-125. · Zbl 1031.90012
[7] V. D. Blondel, J.-L. Guillaume, R. Lambiotte, and E. Lefebvre, Fast unfolding of communities in large networks, J. Stat. Mech. Theory Exp., 2008 (2008), P10008. · Zbl 1459.91130
[8] Z. M. Boyd, E. Bae, X.-C. Tai, and A. L. Bertozzi, Simplified energy landscape for modularity using total variation, SIAM J. Appl. Math., 78 (2018), pp. 2439-2464. · Zbl 1398.65133
[9] U. Brandes, D. Delling, M. Gaertler, R. Görke, M. Hoefer, Z. Nikoloski, and D. Wagner, On finding graph clusterings with maximum modularity, in Proceedings of the International Workshop on Graph-Theoretic Concepts in Computer Science, Springer, New York, 2007, pp. 121-132. · Zbl 1141.68519
[10] X. Bresson, T. Laurent, D. Uminsky, and J. Von Brecht, Multiclass total variation clustering, in Proceedings of NIPS, 2013, pp. 1421-1429.
[11] T. Bühler and M. Hein, Spectral clustering based on the graph \(p\)-Laplacian, in Proceedings of the 26th Annual International Conference on Machine Learning (ICML), 2009, pp. 81-88.
[12] A. Chambolle and T. Pock, A first-order primal-dual algorithm for convex problems with applications to imaging, J. Math. Imaging Vision, 40 (2011), pp. 120-145. · Zbl 1255.68217
[13] E. Cho, S. A. Myers, and J. Leskovec, Friendship and mobility: User movement in location-based social networks, in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2011.
[14] F. R. K. Chung and L. Lu, Complex Graphs and Networks, CBMS Reg. Conf. Ser. Math. 107, American Mathematical Society, Providence, RI, 2006. · Zbl 1114.90071
[15] M. Cinelli, L. Peel, A. Iovanella, and J.-C. Delvenne, Network Constraints on the Mixing Patterns of Binary Node Metadata, preprint, https://arxiv.org/abs/1908.04588, 2019.
[16] A. Clauset, M. E. J. Newman, and C. Moore, Finding community structure in very large networks, Phys. Rev. E, 70 (2004), 066111.
[17] A. Cristofari, M. De Santis, S. Lucidi, and F. Rinaldi, A two-stage active-set algorithm for bound-constrained optimization, J. Optim. Theory Appl., 172 (2017), pp. 369-401. · Zbl 1398.90170
[18] F. E. Curtis, Z. Han, and D. P. Robinson, A globally convergent primal-dual active-set framework for large-scale convex quadratic optimization, Comput. Optim. Appl., 60 (2015), pp. 311-341. · Zbl 1309.90072
[19] M. De Santis, G. Di Pillo, and S. Lucidi, An active set feasible method for large-scale minimization problems with bound constraints, Comput. Optim. Appl., 53 (2012), pp. 395-423. · Zbl 1284.90075
[20] G. Di Pillo and L. Grippo, A class of continuously differentiable exact penalty function algorithms for nonlinear programming problems, in System Modelling and Optimization, Springer, New York, 1984, pp. 246-256. · Zbl 0545.90085
[21] J. Duch and A. Arenas, Community detection in complex networks using extremal optimization, Phys. Rev. E, 72 (2005), 027104.
[22] F. Facchinei and S. Lucidi, Quadratically and superlinearly convergent algorithms for the solution of inequality constrained minimization problems, J. Optim. Theory Appl., 85 (1995), pp. 265-289. · Zbl 0830.90125
[23] D. Fasino and F. Tudisco, An algebraic analysis of the graph modularity, SIAM J. Matrix Anal. Appl., 35 (2014), pp. 997-1018. · Zbl 1304.05110
[24] D. Fasino and F. Tudisco, Generalized modularity matrices, Linear Algebra Appl., 502 (2016), pp. 327-345. · Zbl 1335.05108
[25] H. J. Ferreau, C. Kirches, A. Potschka, H. G. Bock, and M. Diehl, qpoases: A parametric active-set algorithm for quadratic programming, Math. Program. Comput., 6 (2014), pp. 327-363. · Zbl 1302.90146
[26] R. Fletcher, On the Barzilai-Borwein method, in Optimization and Control with Applications, Springer, New York, 2005, pp. 235-256. · Zbl 1118.90318
[27] S. Fortunato, Community detection in graphs, Phys. Rep., 486 (2010), pp. 75-174.
[28] S. Fortunato and D. Hric, Community detection in networks: A user guide, Phys. Rep., 659 (2016), pp. 1-44.
[29] B. H. Good, Y.-A. de Montjoye, and A. Clauset, Performance of modularity maximization in practical contexts, Phys. Rev. E, 81 (2010), 046106.
[30] L. Grippo, F. Lampariello, and S. Lucidi, A nonmonotone line search technique for Newton’s method, SIAM J. Numer. Anal., 23 (1986), pp. 707-716. · Zbl 0616.65067
[31] L. Grippo, F. Lampariello, and S. Lucidi, A class of nonmonotone stabilization methods in unconstrained optimization, Numer. Math., 59 (1991), pp. 779-805. · Zbl 0724.90060
[32] A. Grosso, M. Locatelli, and F. Schoen, A population-based approach for hard global optimization problems based on dissimilarity measures, Math. Program., 110 (2007), pp. 373-404. · Zbl 1116.90084
[33] R. Guimera, M. Sales-Pardo, and L. A. N. Amaral, Modularity from fluctuations in random graphs and complex networks, Phys. Rev. E, 70 (2004), p. 025101.
[34] W. W. Hager and H. Zhang, A new active set algorithm for box constrained optimization, SIAM J. Optim., 17 (2006), pp. 526-557. · Zbl 1165.90570
[35] L. H. Hartwell, J. J. Hopfield, S. Leibler, and A. W. Murray, From molecular to modular cell biology, Nature, 402 (1999), pp. C47-C52.
[36] M. Hein and T. Bühler, An inverse power method for nonlinear eigenproblems with applications in \(1\)-spectral clustering and sparse PCA, in Proceedings of NIPS, 2010, pp. 847-855.
[37] M. Hein and S. Setzer, Beyond spectral clustering-tight relaxations of balanced graph cuts, in Proceedings of NIPS, 2011, pp. 2366-2374.
[38] H. Hu, T. Laurent, M. A. Porter, and A. L. Bertozzi, A method based on total variation for network modularity optimization using the MBO scheme, SIAM J. Appl. Math., 73 (2013), pp. 2224-2246. · Zbl 1284.62377
[39] A. Lancichinetti and S. Fortunato, Community detection algorithms: A comparative analysis, Phys. Rev. E, 80 (2009), 056117.
[40] R. H. Leary, Global optimization on funneling landscapes, J. Global Optim., 18 (2000), pp. 367-383. · Zbl 0986.90038
[41] J. Leskovec, J. Kleinberg, and C. Faloutsos, Graphs over time: Densification laws, shrinking diameters and possible explanations, in Proceedings of the SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2005.
[42] J. Leskovec, J. Kleinberg, and C. Faloutsos, Graph evolution: Densification and shrinking diameters, ACM Trans. Knowledge Discovery from Data, 2 (2007).
[43] P. Mercado, A. Gautier, F. Tudisco, and M. Hein, The power mean Laplacian for multilayer graph clustering, in Proceedings of the 21st International Conference on Artificial Intelligence and Statistics (AISTATS), Proc. Mach. Learn. Res., 2018, pp. 1828-1838.
[44] P. Mercado, F. Tudisco, and M. Hein, Clustering signed networks with the geometric mean of Laplacians, in Proceedings of NIPS, 2016.
[45] M. E. J. Newman, Finding community structure in networks using the eigenvectors of matrices, Phys. Rev. E, 74 (2006), 036104.
[46] M. E. J. Newman, Networks: An Introduction, Oxford University Press, Oxford, 2010. · Zbl 1195.94003
[47] M. E. J. Newman and M. Girvan, Finding and evaluating community structure in networks, Phys. Rev. E, 69 (2004), 026113.
[48] J. Nocedal and S. J. Wright, Sequential Quadratic Programming, Springer, New York, 2006.
[49] M. Ozer, N. Kim, and H. Davulcu, Community detection in political Twitter networks using nonnegative matrix factorization methods, in Proceedings of the 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, IEEE Press, Piscataway, NJ, 2016, pp. 81-88.
[50] M. A. Porter, J.-P. Onnela, and P. J. Mucha, Communities in networks, Not. Amer. Math. Soc., 56 (2009), pp. 1082-1097. · Zbl 1188.05142
[51] M. Rosvall, D. Axelsson, and C. T. Bergstrom, The map equation, Eur. Phys. J. Special Topics, 178 (2009), pp. 13-23.
[52] I. Safro, P. Sanders, and C. Schulz, Advanced coarsening schemes for graph partitioning, J. Exper. Algorithmics, 19 (2015), art. 2.2. · Zbl 1347.68355
[53] S. E. Schaeffer, Graph clustering, Comput. Sci. Rev., 1 (2007), pp. 27-64. · Zbl 1302.68237
[54] H.-W. Shen and X.-Q. Cheng, Spectral methods for the detection of network community structure: A comparative analysis, J. Stat. Mech. Theory Exp., 2010 (2010), P10020.
[55] V. A. Traag, P. Van Dooren, and Y. Nesterov, Narrow scope for resolution-limit-free community detection, Phys. Rev. E, 84 (2011), 016114.
[56] F. Tudisco and M. Hein, A nodal domain theorem and a higher-order Cheeger inequality for the graph \(p\)-Laplacian, J. Spectr. Theory, 8 (2018), pp. 883-908. · Zbl 1491.05126
[57] F. Tudisco and D. J. Higham, A nonlinear spectral method for core-periphery detection in networks, SIAM J. Math. Data Sci., (2019), pp. 269-292. · Zbl 1499.05402
[58] F. Tudisco, P. Mercado, and M. Hein, Community detection in networks via nonlinear modularity eigenvectors, SIAM J. Appl. Math., 78 (2018), pp. 2393-2419. · Zbl 1397.90089
[59] P. Zhang and C. Moore, Scalable detection of statistically significant communities and hierarchies, using message passing for modularity, Proc. Natl. Acad. Sci. USA, 111 (2014), pp. 18144-18149.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.