×

Mixing local and global information for community detection in large networks. (English) Zbl 1311.68133

Summary: Clustering networks play a key role in many scientific fields, from Biology to Sociology and Computer Science. Some clustering approaches are called global because they exploit knowledge about the whole network topology. Vice versa, so-called local methods require only a partial knowledge of the network topology. Global approaches yield accurate results but do not scale well on large networks; local approaches, vice versa, are less accurate but computationally fast. We propose CONCLUDE (COmplex Network CLUster DEtection), a new clustering method that couples the accuracy of global approaches with the scalability of local methods. CONCLUDE generates random, non-backtracking walks of finite length to compute the importance of each edge in keeping the network connected, i.e., its edge centrality. Edge centralities allow for mapping vertices onto points of a Euclidean space and compute all-pairs distances between vertices; those distances are then used to partition the network into clusters.

MSC:

68T05 Learning and adaptive systems in artificial intelligence
05C82 Small world graphs, complex networks (graph-theoretic aspects)
91D30 Social networks; opinion dynamics
PDFBibTeX XMLCite
Full Text: DOI arXiv

References:

[2] Arenas, A.; Fernandez, A.; Gomez, S., Analysis of the structure of complex networks at different resolution levels, New J. Phys., 10, 053039 (2008)
[3] Berry, J.; Hendrickson, B.; LaViolette, R.; Phillips, C., Tolerating the community detection resolution limit with edge weighting, Phys. Rev. E, 83, 5, 056119 (2011)
[4] Blondel, V.; Guillaume, J.; Lambiotte, R.; Lefebvre, E., Fast unfolding of communities in large networks, J. Stat. Mech., 2008, P10008 (2008) · Zbl 1459.91130
[5] Brandes, U., A faster algorithm for betweenness centrality, J. Math. Soc., 25, 2, 163-177 (2001) · Zbl 1051.91088
[7] Chávez, E.; Navarro, G.; Baeza-Yates, R.; Marroquín, J., Searching in metric spaces, ACM Comput. Surveys (CSUR), 33, 3, 273-321 (2001)
[8] Clauset, A.; Newman, M.; Moore, C., Finding community structure in very large networks, Phys. Rev. E, 70, 6, 066111 (2004)
[9] Danon, L.; Díaz-Guilera, A.; Duch, J.; Arenas, A., Comparing community structure identification, J. Stat. Mech., 2005, P09008 (2005)
[10] DasGupta, B.; Desai, D., On the complexity of Newmanʼs community finding approach for biological and social networks, J. Comput. System Sci., 79, 1, 50-67 (2013) · Zbl 1258.05118
[11] De Meo, P.; Ferrara, E.; Fiumara, G.; Provetti, A., Enhancing community detection using a network weighting strategy, Inform. Sci., 222, 648-668 (2013) · Zbl 1293.91164
[12] De Meo, P.; Ferrara, E.; Fiumara, G.; Ricciardello, A., A novel measure of edge centrality in social networks, Knowledge-Based Systems, 30, 136-150 (2012)
[13] Donetti, L.; Muñoz, M., Detecting network communities: a new systematic and efficient algorithm, J. Stat. Mech. Theory Exp., 2004, 10, P10012 (2004) · Zbl 1073.82596
[14] Duch, J.; Arenas, A., Community detection in complex networks using extremal optimization, Phys. Rev. E, 72, 2, 027104 (2005)
[15] Ferrara, E., Community structure discovery in facebook, Int. J. Soc. Network Mining, 1, 1, 67-90 (2012)
[16] Ferrara, E., A large-scale community structure analysis in facebook, EPJ Data Sci., 1, 1 (2012)
[17] Fortunato, S., Community detection in graphs, Phys. Rep., 486, 75-174 (2010)
[18] Fortunato, S.; Barthélemy, M., Resolution limit in community detection, Proc. Nat. Acad. Sci., 104, 1, 36 (2007)
[19] Fortunato, S.; Latora, V.; Marchiori, M., Method to find community structures based on information centrality, Phys. Rev. E, 70, 5, 056104 (2004)
[20] Friedkin, N., Horizons of observability and limits of informal control in organizations, Social Forces, 62, 1, 55-77 (1983)
[21] Girvan, M.; Newman, M., Community structure in social and biological networks, Proc. Nat. Acad. Sci., 99, 12, 7821 (2002) · Zbl 1032.91716
[22] Gjoka, M.; Kurant, M.; Butts, C.; Markopoulou, A., Walking in facebook: a case study of unbiased sampling of OSNs, (Proceedings of the 29th Conference on Information Communications (2010), IEEE Press), 2498-2506
[23] Good, B.; de Montjoye, Y.; Clauset, A., Performance of modularity maximization in practical contexts, Phys. Rev. E, 81, 4, 046106 (2010)
[24] Granovetter, M., The strength of weak ties, Amer. J. Sociol., 1360-1380 (1973)
[26] Guimera, R.; Amaral, L. N., Functional cartography of complex metabolic networks, Nature, 433, 7028, 895-900 (2005)
[27] Hoeffding, W., Probability inequalities for sums of bounded random variables, J. Amer. Statist. Assoc., 13-30 (1963) · Zbl 0127.10602
[28] Kempe, D.; McSherry, F., A decentralized algorithm for spectral analysis, J. Comput. System Sci., 74, 1, 70-83 (2008) · Zbl 1131.68074
[29] Lancichinetti, A.; Radicchi, F., Benchmark graphs for testing community detection algorithms, Phys. Rev. E, 78, 4, 046110 (2008)
[30] Lancichinetti, A.; Radicchi, F.; Ramasco, J., Finding statistically significant communities in networks, PloS One, 6, 4, e18961 (2011)
[31] Leskovec, J.; Faloutsos, C., Sampling from large graphs, (Proc. of the 12th International Conference on Knowledge Discovery and Data Mining (2006), ACM), 631-636
[32] Li, Z.; Zhang, S.; Wang, R.; Zhang, X.; Chen, L., Quantitative function for community detection, Phys. Rev. E, 77, 3, 036109 (2008)
[33] Meunier, D.; Lambiotte, R.; Fornito, A.; Ersche, K.; Bullmore, E., Hierarchical modularity in human brain functional networks, Front. Neuroinform., 3 (2009)
[34] Meyer, C., Matrix Analysis and Applied Linear Algebra. Solutions Manual, Miscellaneous Titles in Appl. Math. Ser., vol. 71 (2000), Society for Industrial Mathematics
[35] Muff, S.; Rao, F.; Caflisch, A., Local modularity measure for network clusterizations, Phys. Rev. E, 72, 5, 056107 (2005)
[36] Nepusz, T.; Yu, H.; Paccanaro, A., Detecting overlapping protein complexes in protein-protein interaction networks, Nature Methods, 9, 471-472 (2012)
[37] Newman, M., Fast algorithm for detecting community structure in networks, Phys. Rev. E, 69, 6, 066133 (2004)
[38] Newman, M., A measure of betweenness centrality based on random walks, Social Networks, 27, 1, 39-54 (2005)
[39] Newman, M.; Girvan, M., Finding and evaluating community structure in networks, Phys. Rev. E, 69, 2, 26113 (2004)
[40] Ng, A.; Jordan, M.; Weiss, Y., On Spectral Clustering: Analysis and an Algorithm, Adv. Neural Inf. Process. Syst., vol. 14 (2001)
[41] Palla, G.; Derényi, I.; Farkas, I.; Vicsek, T., Uncovering the overlapping community structure of complex networks in nature and society, Nature, 435, 9 (2005)
[42] Raghavan, U.; Albert, R.; Kumara, S., Near linear time algorithm to detect community structures in large-scale networks, Phys. Rev. E, 76, 3, 036106 (2007)
[43] Raphael, Y., Approximate shortest paths in weighted graphs, J. Comput. System Sci., 78, 2, 632-637 (2012) · Zbl 1237.68248
[44] Shi, J.; Malik, J., Normalized cuts and image segmentation, IEEE Trans. Pattern Anal. Machine Intell., 22, 8, 888-905 (2000)
[45] Viswanath, B.; Mislove, A.; Cha, M.; Gummadi, K. P., On the evolution of user interaction in facebook, (Proc. of the 2nd Workshop on Social Networks (2009), ACM)
[46] Von Mering, C.; Zdobnov, E.; Tsoka, S.; Ciccarelli, F.; Pereira-Leal, J.; Ouzounis, C.; Bork, P., Genome evolution reveals biochemical networks and functional modules, Proc. Nat. Acad. Sci., 100, 26, 15428 (2003)
[47] Wasserman, S.; Faust, K., Social Network Analysis: Methods and Applications (1994), Cambridge Univ. Press
[48] Weiss, Y., Segmentation using eigenvectors: a unifying view, (Proceedings of the Seventh IEEE International Conference on Computer Vision, vol. 2. Proceedings of the Seventh IEEE International Conference on Computer Vision, vol. 2, 1999 (1999), IEEE), 975-982
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.