Link prediction via controlling the leading eigenvector. (English) Zbl 07426891

Summary: Link prediction is a fundamental challenge in network science. Among various methods, similarity-based algorithms are popular for their simplicity, interpretability, high efficiency and good performance. In this paper, we show that the most elementary local similarity index Common Neighbor (CN) can be linearly decomposed by eigenvectors of the adjacency matrix of the target network, with each eigenvector’s contribution being proportional to the square of the corresponding eigenvalue. As in many real networks, there is a huge gap between the largest eigenvalue and the second largest eigenvalue, the CN index is thus dominated by the leading eigenvector and much useful information contained in other eigenvectors may be overlooked. Accordingly, we propose a parameter-free algorithm that ensures the contributions of the leading eigenvector and the secondary eigenvector the same. Extensive experiments on real networks demonstrate that the prediction performance of the proposed algorithm is remarkably better than well-performed local similarity indices in the literature. A further proposed algorithm that can adjust the contribution of leading eigenvector shows the superiority over state-of-the-art algorithms with tunable parameters for its competitive accuracy and lower computational complexity.


05Cxx Graph theory
68Txx Artificial intelligence
91Dxx Mathematical sociology (including anthropology)


node2vec; Pajek; KONECT
Full Text: DOI arXiv


[1] Gosak, M.; Markovič, R.; Dolenšek, J.; Rupnik, M. S.; Marhl, M.; Stožer, A.; Perc, M., Network science of biological systems at different scales: a review, Phys. Life Rev., 24, 118-135 (2018)
[2] Barabási, A.-L.; Gulbahce, N.; Loscalzo, J., Network medicine: a network-based approach to human disease, Nat. Rev. Genet., 12, 56-68 (2011)
[3] Barabási, A.-L.; Oltvai, Z. N., Network biology: understanding the cell’s functional organization, Nat. Rev. Genet., 5, 101-113 (2004)
[4] Jackson, M. O., Networks in the understanding of economic behaviors, J. Econ. Perspect., 28, 3-22 (2014)
[5] Schweitzer, F.; Fagiolo, G.; Sornette, D.; Vega-Redondo, F.; Vespignani, A.; White, D. R., Economic networks: the new challenges, Science, 325, 422-425 (2009) · Zbl 1226.91057
[6] Gao, J.; Zhang, Y.-C.; Zhou, T., Computational socioeconomics, Phys. Rep., 817, 1-104 (2019)
[7] Lü, L.; Medo, M.; Yeung, C. H.; Zhang, Y.-C.; Zhang, Z.-K.; Zhou, T., Recommender systems, Phys. Rep., 519, 1-49 (2012)
[8] Kossinets, G.; Watts, D. J., Empirical analysis of an evolving social network, Science, 311, 88-90 (2006) · Zbl 1226.91055
[9] Hu, J.; Zhang, Q.-M.; Zhou, T., Segregation in religion networks, EPJ Data Sci., 8, 6 (2019)
[10] Zanin, M.; Papo, D.; Sousa, P. A.; Menasalvas, E.; Nicchi, A.; Kubik, E.; Boccaletti, S., Combining complex networks and data mining: why and how, Phys. Rep., 635, 1-44 (2016)
[11] Stanisz, T.; Kwapień, J.; Drożdż, S., Linguistic data mining with complex networks: astylometric-oriented approach, Inf. Sci., 482, 301-320 (2019)
[12] Wu, X.; Zhu, X.; Wu, G.-Q.; Ding, W., Data mining with big data, IEEE Trans. Knowl. Data Eng., 26, 97-107 (2013)
[13] Lü, L.; Zhou, T., Link prediction in complex networks: a survey, Physica A, 390, 1150-1170 (2011)
[14] T. Zhou, Progresses and challenges in link prediction, arXiv preprint arXiv:2102.11472 (2021).
[15] Ding, H.; Takigawa, I.; Mamitsuka, H.; Zhu, S., Similarity-based machine learning methods for predicting drug-target interactions: a brief review, Briefings Bioinf., 15, 734-747 (2014)
[16] Martinčić-Ipšić, S.; Močibob, E.; Perc, M., Link prediction on twitter, PLoS ONE, 12, e0181079 (2017)
[17] Wang, W.-Q.; Zhang, Q.-M.; Zhou, T., Evaluating network models: a likelihood analysis, EPL, 98, 28004 (2012)
[18] Zhang, Q.-M.; Xu, X.-K.; Zhu, Y.-X.; Zhou, T., Measuring multiple evolution mechanisms of complex networks, Sci. Rep., 5, 10350 (2015)
[19] Lü, L.; Pan, L.; Zhou, T.; Zhang, Y.-C.; Stanley, H. E., Toward link predictability of complex networks, Proc. Natl. Acad. Sci. U.S.A., 112, 2325-2330 (2015) · Zbl 1355.94107
[20] Xian, X.; Wu, T.; Qiao, S.; Wang, X.-Z.; Wang, W.; Liu, Y., NetSRE: link predictability measuring and regulating, Knowl. Based Syst., 196, 105800 (2020)
[21] Liben-Nowell, D.; Kleinberg, J., The link-prediction problem for social networks, J. Am. Soc. Inf. Sci. Technol., 58, 1019-1031 (2007)
[22] Zhou, T.; Lü, L.; Zhang, Y.-C., Predicting missing links via local information, Eur. Phys. J. B, 71, 623-630 (2009) · Zbl 1188.05143
[23] Cannistraci, C. V.; Alanis-Lobato, G.; Ravasi, T., From link-prediction in brain connectomes and protein interactomes to the local-community-paradigm in complex networks, Sci. Rep., 3, 1613 (2013)
[24] Neville, J.; Jensen, D., Relational dependency networks., J. Mach. Learn. Res., 8, 653-692 (2007) · Zbl 1222.68274
[25] Yu, K.; Chu, W.; Yu, S.; Tresp, V.; Xu, Z., Stochastic relational models for discriminative link prediction, Proceedings of the 19th International Conference on Neural Information Processing Systems, 1553-1560 (2006), MIT Press, Cambridge
[26] Clauset, A.; Moore, C.; Newman, M. E.J., Hierarchical structure and the prediction of missing links in networks, Nature, 453, 98-101 (2008)
[27] Guimerà, R.; Sales-Pardo, M., Missing and spurious interactions and the reconstruction of complex networks, Proc. Natl. Acad. Sci. U.S.A., 106, 22073-22078 (2009)
[28] Pan, L.; Zhou, T.; Lü, L.; Hu, C.-K., Predicting missing links and identifying spurious links via likelihood analysis, Sci. Rep., 6, 22955 (2016)
[29] Grover, A.; Leskovec, J., node2vec: scalable feature learning for networks, Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 855-864 (2016), ACM Press, New York
[30] Tang, J.; Qu, M.; Wang, M.; Zhang, M.; Yan, J.; Mei, Q., Line: large-scale information network embedding, Proceedings of the 24th International Conference on World Wide Web, 1067-1077 (2015), ACM Press, New York
[31] Wang, D.; Cui, P.; Zhu, W., Structural deep network embedding, Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1225-1234 (2016), ACM Press, New York
[32] Pech, R.; Hao, D.; Pan, L.; Cheng, H.; Zhou, T., Link prediction via matrix completion, EPL, 117, 38002 (2017)
[33] Benson, A. R.; Abebe, R.; Schaub, M. T.; Jadbabaie, A.; Kleinberg, J., Simplicial closure and higher-order link prediction, Proc. Natl. Acad. Sci. U.S.A., 115, E11221-E11230 (2018)
[34] Ghasemian, A.; Hosseinmardi, H.; Galstyan, A.; Airoldi, E. M.; Clauset, A., Stacking models for nearly optimal link prediction in complex networks, Proc. Natl. Acad. Sci. U.S.A., 117, 23393-23400 (2020)
[35] Jalili, M.; Orouskhani, Y.; Asgari, M.; Alipourfard, N.; Perc, M., Link prediction in multiplex online social networks, R. Soc. Open Sci., 4, 160863 (2017)
[36] Farkas, I. J.; Derényi, I.; Barabási, A.-L.; Vicsek, T., Spectra of “real-world” graphs: beyond the semicircle law, Phys. Rev. E, 64, 026704 (2001)
[37] Lee, Y.-L.; Zhou, T., Collaborative filtering approach to link prediction, Physica A, 578, 126107 (2021)
[38] Kunegis, J., KONECT: the Koblenz network collection, Proceedings of the 22nd International Conference on World Wide Web, 1343-1350 (2013), ACM Press, New York
[39] Watts, D. J.; Strogatz, S. H., Collective dynamics of small-world networks, Nature, 393, 440-442 (1998) · Zbl 1368.05139
[40] Newman, M. E.J., Assortative mixing in networks, Phys. Rev. Lett., 89, 208701 (2002)
[41] Rossi, R.; Ahmed, N., The network data repository with interactive graph analytics and visualization, Proceedings of the 29 th AAAI Conference on Artificial Intelligence, 4292-4293 (2015), AAAI Press, California
[42] (http://www.grouplens.org/node/73)
[43] Kovács, I. A.; Luck, K.; Spirohn, K.; Wang, Y.; Pollis, C.; Schlabach, S.; Bian, W.; Kim, D.-K.; Kishore, N.; Hao, T., Network-based prediction of protein interactions, Nat. Commun., 10, 1240 (2019)
[44] (http://vlado.fmf.uni-lj.si/pub/networks/data/)
[45] Dohleman, B. S., Exploratory social network analysis with Pajek: review, Psychometrika, 71, 605-606 (2006)
[46] Hanley, J. A.; McNeil, B. J., The meaning and use of the area under a receiver operating characteristic (ROC) curve, Radiology, 143, 29-36 (1982)
[47] Provost, F.; Fawcett, T., Robust classification for imprecise environments, Mach. Learn., 42, 203-231 (2001) · Zbl 0969.68126
[48] Lichtenwalter, R. N.; Lussier, J. T.; Chawla, N. V., New perspectives and methods in link prediction, Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 243-252 (2010), ACM Press, New York
[49] Yang, Y.; Lichtenwalter, R. N.; Chawla, N. V., Evaluating link prediction methods, Knowl. Inf. Syst., 45, 751-782 (2015)
[50] Adamic, L. A.; Adar, E., Friends and neighbors on the web, Soc. Netw., 25, 211-230 (2003)
[51] Ou, Q.; Jin, Y.-D.; Zhou, T.; Wang, B.-H.; Yin, B.-Q., Power-law strength-degree correlation from resource-allocation dynamics on weighted networks, Phys. Rev. E, 75, 021102 (2007)
[52] Maslov, S.; Sneppen, K., Specificity and stability in topology of protein networks, Science, 296, 910-913 (2002)
[53] Barabási, A.-L.; Albert, R., Emergence of scaling in random networks, Science, 286, 509-512 (1999) · Zbl 1226.05223
[54] Katz, L., A new status index derived from sociometric analysis, Psychometrika, 18, 39-43 (1953) · Zbl 0053.27606
[55] Lü, L.; Jin, C.-H.; Zhou, T., Similarity index based on local paths for link prediction of complex networks, Phys. Rev. E, 80, 046122 (2009)
[56] Pech, R.; Hao, D.; Lee, Y.-L.; Yuan, Y.; Zhou, T., Link prediction via linear optimization, Physica A, 528, 121319 (2019)
[57] Calvetti, D.; Reichel, L.; Sorensen, D. C., An implicitly restarted Lanczos method for large symmetric eigenvalue problems, Electron. Trans. Numer. Anal., 2, 1-21 (1994) · Zbl 0809.65030
[58] Mann, H. B.; Whitney, D. R., On a test of whether one of two random variables is stochastically larger than the other, Annals. Math. Stat., 18, 50-60 (1947) · Zbl 0041.26103
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.