Vertex nomination via seeded graph matching.

*(English)*Zbl 07260677Summary: Consider two networks on overlapping, nonidentical vertex sets. Given vertices of interest (VOIs) in the first network, we seek to identify the corresponding vertices, if any exist, in the second network. While in moderately sized networks graph matching methods can be applied directly to recover the missing correspondences, herein we present a principled methodology appropriate for situations in which the networks are too large/noisy for brute-force graph matching. Our methodology identifies vertices in a local neighborhood of the VOIs in the first network that have verifiable corresponding vertices in the second network. Leveraging these known correspondences, referred to as seeds, we match the induced subgraphs in each network generated by the neighborhoods of these verified seeds, and rank the vertices of the second network in terms of the most likely matches to the original VOIs. We demonstrate the applicability of our methodology through simulations and real data examples.

##### Keywords:

graph inference; graph matching; graph mining; seeded graph matching; stochastic block model; vertex nomination##### Software:

MI-GRAAL
PDF
BibTeX
XML
Cite

\textit{H. G. Patsolic} et al., Stat. Anal. Data Min. 13, No. 3, 229--244 (2020; Zbl 07260677)

Full Text:
DOI

##### References:

[1] | E. M. Airoldi et al.,Mixed membership stochastic blockmodels, J Mach Learn Res 9 (2008), 1981-2014. · Zbl 1225.68143 |

[2] | A. Berg, T. Berg, and J. Malik,Shape matching and object recognition using low distortion correspondences, inIEEE Conference on Computer Vision and Pattern Recognition, IEEE, San Diego, CA, Vol 2005, 2005, 26-33. |

[3] | S. Bergsma and B. Van Durme,Using conceptual class attributes to characterize social media users, in Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics. Long Papers, Association for Computational Linguistics, Sofia, Vol 1, 2013, 710-720. |

[4] | J. R. Beveridge,Local search algorithms for geometric object recognition: Optimal correspondence and pose, Univ. of Massachusetts at Amherst, Amherst, 1994. |

[5] | D. Conte et al.,Thirty years of graph matching in pattern recognition, Int J Pattern Recogn Artif Intell 18 (2004), 265-298. |

[6] | G. Coppersmith,Vertex nomination, WIREs Comput Stat 6 (2014), 144-153. |

[7] | G. Coppersmith and C. Priebe,Vertex nomination via content and context, 2012, arXiv preprint arXiv:1201.4118. |

[8] | L. Devroye, L. Györfi, and G. Lugosi,A probabilistic theory of pattern recognition, Vol 31, Springer Science & Business Media, Berlin, 2013. · Zbl 0853.68150 |

[9] | C. Fabiana, M. Garetto, and E. Leonardi,De-anonymizing scale-free social networks by percolation graph matching, in2015 IEEE Conference on Computer Communications (INFOCOM), IEEE, New York, 2015, 1571-1579. |

[10] | M. Fiori et al.,Robust multimodal graph matching: Sparse coding meets graph matching, Adv Neur Inform Process Syst 26 (2013), 127-135. |

[11] | D. Fishkind et al.,Seeded graph matching, Pattern Recogn 87 (2019), 203-215. |

[12] | D. Fishkind et al.,Vertex nomination schemes for membership prediction, Ann Appl Stat 9 (2015), 1510-1532. · Zbl 1454.62180 |

[13] | P. Foggia, G. Perncannella, and M. Vento,Graph matching and learning in pattern recognition in the last 10 years, Int J Pattern Recogn Artif Intell 28 (2014). https://doi.org/10.1142/ S0218001414500013. |

[14] | M. Frank and P. Wolfe,An algorithm for quadratic programming, Naval Res Log Q 3 (1956), 95-110. https://doi.org/10.1002/nav. 3800030109. |

[15] | K. Fukunage and P. M. Narendra,A branch and bound algorithm for computing k-nearest neighbors, IEEE Trans Comput (1975), 750-753. · Zbl 0307.68069 |

[16] | M. Gönen and E. Alpaydin,Localized multiple kernel learning, in Proceedings of the 25th International Conference on Machine Learning. ICML ’08, ACM, New York, 2008, 352-359. https://doi. org/10.1145/1390156.1390201. · Zbl 1254.68204 |

[17] | J. Ham, D. D. Lee, and L. K. Saul,Semisupervised alignment of manifolds, AISTATS 120 (2005), 120-127. |

[18] | P. W. Holland, K. Laskey, and S. Leinhardt,Stochastic blockmodels: First steps, Soc Netw 5 (1983), 109-137. |

[19] | Z. Huang, W. Chung, and H. Chen,A graph model for e-commerce recommender systems, J Am Soc Inform Sci Technol 55 (2004), 259-274. |

[20] | S. Ji et al.,On the relative de-anonymizability of graph data: Quantification and evaluation, inIEEE INFOCOM 2016—The 35th Annual IEEE International Conference on Computer Communications, IEEE, New York, 2016, 1-9. |

[21] | E. Kazemi, S. H. Hassani, and M. Grossglauser,Growing a graph matching from a handful of seeds, Proc VLDB Endow 8 (2015), 1010-1021. |

[22] | E. Kazemi, L. Yartseva, and M. Grossglauser,When can two unlabeled networks be aligned under partial overlap?in2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton), IEEE, New York, 2015, 33-42. |

[23] | J. M. Keller, M. R. Gray, and J. A. Givens,A fuzzy k-nearest neighbor algorithm, IEEE Trans Syst Man Cybern 4 (1985), 580-585. |

[24] | O. Kuchaiev and N. Pržulj,Integrative network alignment reveals large regions of global network similarity in yeast and human, Bioinformatics 27 (2011), 1390-1396. |

[25] | H. Kuhn,The Hungarian method for the assignment problem, Naval Res Log Q 2 (1955), 83-97. https://doi.org/10.1002/nav. 3800020109. |

[26] | M. Leordeanu, R. Sukthankar, and M. Hebert,Unsupervised learning for graph matching, Int J Comput Vision 96 (2012), 28-45. · Zbl 1235.68274 |

[27] | J. Leskovec and A. Krevl.SNAP datasets:Stanford large network dataset collection, 2014, November 28, 2018, available at http:// snap.stanford.edu/data. |

[28] | L. Li and W. Campbell,Matching community structure across online social networks, Networks NIPS (2015). https://stanford. edu/ jugander/NetworksNIPS2015/. |

[29] | V. Lyzinski et al.,Graph matching: Relax at your own risk, IEEE Trans Pattern Anal Mach Intell 38 (2016), 60-73. |

[30] | V. Lyzinski, D. Fishkind, and C. Priebe,Seeded graph matching for correlated Erdös-Rènyi graphs, J Mach Learn Res 15 (2014), 3693-3720. · Zbl 1310.68166 |

[31] | V. Lyzinski et al.,On the consistency of the likelihood maximization vertex nomination scheme: Bridging the gap between maximum likelihood estimation and graph matching, J Mach Learn Res 17 (2016), 1-34. · Zbl 1392.62189 |

[32] | V. Lyzinski, K. Levin, and C. E. Priebe,On consistent vertex nomination schemes, J Mach Learn Res 20 (2019), 1-39. · Zbl 07064049 |

[33] | V. Lyzinski et al.,Spectral clustering for divide-and-conquer graph matching, Parall Comput 47 (2015), 70-87. |

[34] | D. Marchette, C. E. Priebe, and G. Coppersmith,Vertex nomination via attributed random dot product graphs, inProceedings of the 57th ISI World Statistics Congress, International Statistical Institute, Dahlgren, VA, Vol 6, 2011, 16. |

[35] | R. Mastrandrea, J. Fournet, and A. Barrat,Contact patterns in a high school: A comparison between data collected using wearable sensors, contact diaries and friendship surveys, PLoS One 10 (2015), e0136497. https://doi.org/10.1371/journal.pone. 0136497. |

[36] | T. Milenkovic et al.,Optimal network alignment with graphlet degree vectors, Cancer Inform 9 (2010), 121. |

[37] | M. Muja and D. G. Lowe,Fast approximate nearest neighbors with automatic algorithm configuration, VISAPP 2 (2009), 331-340. |

[38] | S. C. Olhede and P. J. Wolfe,Network histograms and universality of block model approximation, Proc Natl Acad Sci USA 111 (2014), 14722-14727. |

[39] | L. Page et al.,The pagerank citation ranking: Bringing order to the web, Stanford InfoLab, Stanford, CA, 1999. |

[40] | P. Pedarsani, D. R. Figueiredo, and M. Grossglauser,A Bayesian method for matching two similar graphs without seeds, in51st Annual Allerton Conference on Communication, |

[41] | C. E. Priebe et al.,Scan statistics on enron graphs, Comput Math Org Theory 11 (2005), 229-247. · Zbl 1086.68562 |

[42] | C. E. Priebe et al., On a two-truths phenomenon in spectral graph clustering, 116 (2019), 5995-6000. https://doi.org/10. 1073/pnas.1814462116. · Zbl 1431.62269 |

[43] | P. Rastogi, V. Lyzinski, and B. Van Durme.Vertex nomination on the cold start knowledge graph. Human Language Technology Center of Excellence: Technical report, 2017. |

[44] | K. Rohe, S. Chatterjee, and B. Yu,Spectral clustering and the high-dimensional stochastic blockmodel, Ann Stat 39 (2011), 1878-1915. · Zbl 1227.62042 |

[45] | P. Sen et al.,Collective classification in network data, AI Magaz 29 (2008), 93-93. |

[46] | P. Sermanet et al.Overfeat:Integrated recognition,localization and detection using convolutional networks, 2013. arXiv preprint arXiv:1312.6229. |

[47] | C. J. Stone,Consistent nonparametric regression, Ann Stat 5 (1977), 595-620. · Zbl 0366.62051 |

[48] | M. Sun, M. Tang, and C. E. Priebe,A comparison of graph embedding methods for vertex nomination, in2012 IEEE ICMLA, Boca Raton, FL, Vol 1, 2012, 398-403. |

[49] | D. L. Sussman et al.,A consistent adjacency spectral embedding for stochastic blockmodel graphs, J Am Stat Assoc 107 (2012), 1119-1128. · Zbl 1443.62188 |

[50] | S. Suwan, D. S. Lee, and C. E. Priebe,Bayesian vertex nomination using content and context, WIREs Comput Stat 7 (2015), 400-416. |

[51] | P. P. Talukdar and F. Pereira,Experiments in graph-based semi-supervised learning methods for class-instance acquisition, inProceedings of the 48th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, Stroudsburg, PA, 2010, 1473-1481. |

[52] | J. Vogelstein et al.,Fast approximate quadratic programming for graph matching, PLoS One 10 (2015), e0121002. |

[53] | H. Wang et al.,Locality statistics for anomaly detection in time series of graphs, IEEE Trans Signal Process 62 (2013), 703-717. · Zbl 1394.94790 |

[54] | L. Wiskott et al.,Face recognition by elastic bunch graph matching, IEEE Trans Pattern Anal Mach Intell 19 (1997), 775-779. |

[55] | J. Yan et al.,Joint optimization for consistent multiple graph matching, inProceedings of the IEEE International Conference on Computer Vision, IEEE, New York, 2013, 1649-1656. |

[56] | S. J. Young and E. R. Scheinerman,Random dot product graph models for social networks, inInternational Workshop on Algorithms and Models for the Web-Graph, Springer, Berlin, 2007, 138-149. · Zbl 1136.05322 |

[57] | M. Zaslavskiy, F. Bach, and J.-P. Vert,Global alignment of protein-protein interaction networks by graph matching methods, Bioinformatics 25 (2009), i259-i1267. |

[58] | R. Zass and A. Shashua,Probabilistic graph and hypergraph matching, inIEEE Conference on Computer Vision and Pattern Recognition. CVPR 2008, IEEE, New York, 2008, 2008, 1-8. |

[59] | F. Zhou and F. De la Torre,Factorized graph matching, in2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, New York, 2012, 127-134. |

[60] | She is currently a doctoral candidate in the Applied Math |

[61] | From 1985 to 1994, he worked as a mathematician and scientist in the U. |

This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.