zbMATH — the first resource for mathematics

The journey of graph kernels through two decades. (English) Zbl 1382.68191
Summary: In the real world all events are connected. There is a hidden network of dependencies that governs behavior of natural processes. Without much argument it can be said that, of all the known data-structures, graphs are naturally suitable to model such information. But to learn to use graph data structure is a tedious job as most operations on graphs are computationally expensive, so exploring fast machine learning techniques for graph data has been an active area of research and a family of algorithms called kernel based approaches has been famous among researchers of the machine learning domain. With the help of support vector machines, kernel based methods work very well for learning with Gaussian processes. In this survey we will explore various kernels that operate on graph representations. Starting from the basics of kernel based learning we will travel through the history of graph kernels from its first appearance to discussion of current state of the art techniques in practice.

68T05 Learning and adaptive systems in artificial intelligence
68P05 Data structures
68R10 Graph theory (including graph drawing) in computer science
68-02 Research exposition (monographs, survey articles) pertaining to computer science
Full Text: DOI
[1] Dipert, R. R., The mathematical structure of the world: the world as graph, J. Philos., 94, 7, 329-358, (1997)
[2] Bunke, H., Graph-based tools for data mining and machine learning, (International Workshop on Machine Learning and Data Mining in Pattern Recognition, (2003), Springer), 7-19 · Zbl 1029.68564
[3] Conte, D.; Foggia, P.; Sansone, C.; Vento, M., Thirty years of graph matching in pattern recognition, Int. J. Pattern Recognit. Artif. Intell., 18, 03, 265-298, (2004)
[4] Gärtner, T., A survey of kernels for structured data, ACM SIGKDD Explor. Newslett., 5, 1, 49-58, (2003)
[5] Vishwanathan, S. V.N.; Schraudolph, N. N.; Kondor, R.; Borgwardt, K. M., Graph kernels, J. Mach. Learn. Res., 11, Apr, 1201-1242, (2010) · Zbl 1242.05112
[6] Shervashidze, N., Scalable Graph Kernels, (2012), Universität Tübingen, (Ph.D. thesis)
[7] Borgwardt, K. M., Graph Kernels, (2007), lmu, (Ph.D. thesis)
[8] Aronszajn, N., Theory of reproducing kernels, Trans. Amer. Math. Soc., 68, 3, 337-404, (1950) · Zbl 0037.20701
[9] B. Scholkopf, A.J. Smola, Learning With Kernels: Support Vector Machines, Regularization, Optimization, and Beyond, 2002, arXiv: arXiv:1011.1669v3, http://dx.doi.org/10.1198/jasa.2003.s269.
[10] Tsochantaridis, I.; Joachims, T.; Hofmann, T.; Altun, Y.; Org, A.-C., Large margin methods for structured and interdependent output variables, J. Mach. Learn. Res., 6, 1453-1484, (2005) · Zbl 1222.68321
[11] Vapnik, V.; Lerner, a., Pattern recognition using generalized portrait method, Autom. Remote Control, 24, 774-780, (1963), doi:citeulike-article-id:619639
[12] Bunke, H., Graph matching : theoretical foundations, algorithms, and applications, Algorithmica, 2000, 2, 82-88, (2000)
[13] Bunke, H.; Foggia, P.; Guidobaldi, C.; Sansone, C.; Vento, M., A comparison of algorithms for maximum common subgraph on randomly connected graphs, (Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR), (2002), Springer), 123-132 · Zbl 1073.68700
[14] M.R. Garey, D.S. Johnson, A Guide to the Theory of NP-Completeness, 1979, http://dx.doi.org/10.1137/1024022. · Zbl 0411.68039
[15] McKay, B., Nauty User’s Guide (version 2.4), 1-70, (2007), Computer Science Dept., Australian National University
[16] Neuhaus, M.; Bunke, H., Edit distance-based kernel functions for structural pattern classification, Pattern Recognit., 39, 10, 1852-1863, (2006) · Zbl 1096.68140
[17] Bunke, H.; Shearer, K., A graph distance metric based on the maximal common subgraph, Pattern Recognit. Lett., 19, 3, 255-259, (1998) · Zbl 0905.68128
[18] Fernández, M. L.; Valiente, G., A graph distance metric combining maximum common subgraph and minimum common supergraph, Pattern Recognit. Lett., 22, 6-7, 753-758, (2001) · Zbl 1010.68889
[19] Koch, I., Enumerating all connected maximal common subgraphs in two graphs, Theoret. Comput. Sci., 250, 12, 1-30, (2001) · Zbl 0952.68105
[20] Bron, C.; Kerbosch, J., Algorithm 457: finding all cliques of an undirected graph, Commun. ACM, 16, 9, 575-577, (1973), arXiv:citation.cfm?doid=362342.362367 · Zbl 0261.68018
[21] Liang, H.; Ward, W. F., PGC-1alpha: a key regulator of energy metabolism, Adv. Physiol. Educ., 30, 4, 145-151, (2006)
[22] Bunke, H.; Allermann, G., Inexact graph matching for structural pattern recognition, Pattern Recognit. Lett., 1, 4, 245-253, (1983) · Zbl 0509.68100
[23] H. Bunke, Recognition of cursive roman handwriting - past, present and future, in: Proceedings of the International Conference on Document Analysis and Recognition, ICDAR, Vol. 2003-Janua, 2003, pp. 448-459, http://dx.doi.org/10.1109/ICDAR.2003.1227707.
[24] Chung, F. R.K.; Chung-Graham, F.; Chung, F. R.K.; Chung-Graham, F.; Chung, F. R.K., Spectral graph theory, ACM SIGACT News, 92, 92, 14, (1997) · Zbl 0872.05052
[25] R. Todeschini, V. Consonni, Handbook of Molecular Descriptors, New York, Vol. 11, 2000, p. 688, http://dx.doi.org/10.1002/9783527613106.
[26] Wiener, H., Correlation of heats of isomerization, and differences in heats of vaporization of isomers, among the paraffin hydrocarbons, J. Amer. Chem. Soc., 69, 11, 2636-2638, (1947)
[27] Köbler, J., On Graph Isomorphism for Restricted Graph Classes, 241-256, (2006), Springer Berlin Heidelberg · Zbl 1145.68435
[28] Neuhaus, M.; Bunke, H., Self-organizing maps for learning the edit costs in graph matching, IEEE Trans. Syst. Man Cybern. B, 35, 3, 503-514, (2005)
[29] Neuhaus, M.; Bunke, H., Automatic learning of cost functions for graph edit distance, Inform. Sci., 177, 1, 239-247, (2007) · Zbl 1142.68492
[30] Caelli, T.; Caetano, T., Graphical models for graph matching: approximate models and optimal algorithms, Pattern Recognit. Lett., 26, 3, 339-346, (2005)
[31] S. Kramer, L. De Raedt, C. Helma, Molecular feature mining in HIV data, in: Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD ’01, 2001, pp. 136-143, http://dx.doi.org/10.1145/502512.502533.
[32] Deshpande, M.; Kuramochi, M.; Wale, N.; Karypis, G., Frequent substructure-based approaches for classifying chemical compounds, IEEE Trans. Knowl. Data Eng., 17, 8, 1036-1050, (2005)
[33] H. Cheng, X. Yan, J. Han, C.-W. Hsu, Discriminative frequent pattern analysis for effective classification, in: Proceedings of the 23rd IEEE International Conference on Data Engineering, ICDE 2007, 2007, pp. 716-725, http://dx.doi.org/10.1109/ICDE.2007.367917.
[34] T. Jaakkola, M. Diekhans, D. Haussler, Using the Fisher kernel method to detect remote protein homologies, in: International Conference on Intelligent Systems for Molecular Biology, ISMB, 1999, pp. 149-158.
[35] Eddy, S. R., Hidden Markov models, Curr. Opin. Struct. Biol., 6, 3, 361-365, (1996)
[36] T. Jaakkola, D. Haussler, Probabilistic kernel regression models, in: Proceedings of the 1999 Conference on AI and Statistics, 1999, p. 9.
[37] Jaakkola, T. S.; Haussler, D., Exploiting generative models in discriminative classifiers, Adv. Neural Inf. Process. Syst., 487-493, (1999)
[38] A. Kashima, Hisashi Tsunda, Koji Inokuchi, Marginalized Kernels between labeled graphs, in: ICML, (2002), 2003, pp. 321-328.
[39] Lafferty, J.; Lafferty, J.; Lebanon, G.; Lebanon, G., Information diffusion kernels, Adv. Neural Inf. Process. Syst., 391-398, (2003)
[40] D. Haussler, Convolution Kernels on discrete structures, in: Technical Report UCSCRL9910 UC, 23 (1), 1999, 1-38, arXiv: arXiv:1403.8129v1, http://dx.doi.org/10.1007/s10863-011-9338-7.
[41] Gärtner, T., A survey of kernels for structured data, ACM SIGKDD Explor. Newslett., 5, 1, 49, (2003)
[42] T. Gärtner, P. Flach, S. Wrobel, T. Gärtner, On graph Kernels: Hardness results and efficient alternatives, in: Proceedings of the 16th Annual Conference on Computational Learning Theory and 7th Kernel Workshop, 2003, pp. 129-143, http://dx.doi.org/10.1007/b12006. · Zbl 1274.68312
[43] U. Kang, H. Tong, J. Sun, Fast random walk graph Kernel, in: Proceedings of the 2012 SIAM International Conference on Data Mining, 2012, pp. 828-838, http://dx.doi.org/10.1137/1.9781611972825.71.
[44] P. Mahé, N. Ueda, T. Akutsu, J.-L. Perret, J.-P. Vert, Extensions of marginalized graph kernels, in: Proceedings, Twenty-First International Conference on Machine Learning, ICML 2004, 2004, pp. 552-559, arXiv: arXiv:1011.1669v3, http://dx.doi.org/10.1145/1015330.1015446.
[45] K.M. Borgwardt, H.P. Kriegel, Shortest-path kernels on graphs, in: Proceedings - IEEE International Conference on Data Mining, ICDM, 2005, pp. 74-81, http://dx.doi.org/10.1109/ICDM.2005.132.
[46] Imrich, W.; Klavzar, S., Product Graphs, (2000), Wiley
[47] Nocedal, J.; Wright, S. J., Numerical Optimization, Vol. 43, 164-175, (1999), arXiv:NIHMS150003
[48] Vishwanathan, S.; Schraudolph, N.; Kondor, R.; Borgwardt, K., Graph kenrels, J. Mach. Learn. Res., 11, 1201-1242, (2010), arXiv:0807.0093 · Zbl 1242.05112
[49] Golub, G. H.; Van Loan, C. F., Matrix Computations, (1996), arXiv: arXiv:1011.1669v3 · Zbl 0865.65009
[50] Feragen, A.; Kasenburg, N.; Petersen, J.; de Bruijne, M.; Borgwardt, K., Scalable kernels for graphs with continuous attributes, (Advances in Neural Information Processing Systems, (2013)), 216-224
[51] T. Horváth, T. Gärtner, S. Wrobel, Cyclic pattern kernels for predictive graph mining, in: Proceedings of the 2004 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD ’04, 2004, p. 158, http://dx.doi.org/10.1145/1014052.1014072.
[52] N. Shervashidze, S.V.N. Vishwanathan, T.H. Petri, K. Mehlhorn, K.M. Borgwardt, Efficient graphlet Kernels for large graph comparison, in: Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics, 2009, pp. 488-495, http://dx.doi.org/
[53] H. Frohlich, J.K. Wegner, F. Sieker, A. Zell, H. Fröhlich, J.K. Wegner, F. Sieker, A. Zell, Optimal assignment Kernels for attributed molecular graphs, in: Proceedings of the 22nd International Conference on Machine Learning, 2005, pp. 225-232, http://dx.doi.org/10.1145/1102351.1102380.
[54] Costa, F.; Grave, K. D., Fast neighborhood subgraph pairwise distance kernel, Computer, 54, v, 255-262, (2003)
[55] N. Wale, G. Karypis, Comparison of descriptor spaces for chemical compound retrieval and classification, in: Proceedings - IEEE International Conference on Data Mining, ICDM, 2006, pp. 678-689, http://dx.doi.org/10.1109/ICDM.2006.39.
[56] S. Menchetti, F. Costa, P. Frasconi, Weighted decomposition kernels, in: Proceedings of the 22nd International Conference on Machine Learning, 2005, pp. 585-592, http://dx.doi.org/10.1145/1102351.1102425.
[57] Schietgat, L.; Ramon, J.; Bruynooghe, M.; Blockeel, H., (An Efficiently Computable Graph-Based Metric for the Classification of Small Molecules, Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 5255, (2008)), 197-209, LNAI
[58] J. Ramon, T. Gärtner, Expressivity versus efficiency of graph kernels, in: Proceedings of the First International Workshop on Mining Graphs, Trees and Sequences, 2003, pp. 65-74.
[59] Mahé, P.; Vert, J. P., Graph kernels based on tree patterns for molecules, Mach. Learn., 75, 1, 3-35, (2009), arXiv:0609024v1
[60] Shervashidze, N.; Schweitzer, P.; Leeuwen, V.; Jan, E.; Mehlhorn, K.; Borgwardt, K., Weisfeiler-lehman graph kernels, J. Mach. Learn. Res., 12, 2539-2561, (2011) · Zbl 1280.68194
[61] N. Shervashidze, K.M. Borgwardt, Fast subtree kernels on graphs, in: 23rd Annual Conference on Neural Information Processing Systems, 2009, pp. 1660-1668.
[62] Neumann, M.; Garnett, R.; Bauckhage, C.; Kersting, K., Propagation kernels: efficient graph kernels from propagated information, Mach. Learn., 102, 2, 209-245, (2016) · Zbl 1357.68178
[63] P. Yanardag, W. Lafayette, S.V.N. Vishwanathan, Deep graph kernels, in: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2015, pp. 1365-1374, http://dx.doi.org/10.1145/2783258.2783417.
[64] Schmidhuber, J., Deep learning in neural networks: an overview, Neural Netw., 61, 85-117, (2015)
[65] A. Narayanan, M. Chandramohan, R. Venkatesan, L. Chen, Y. Liu, S. Jaiswal, graph2vec: Learning distributed representations of graphs, 2017, arXiv preprint arXiv:1707.05005.
[66] T.N. Kipf, M. Welling, Semi-supervised classification with graph convolutional networks, 2016, arXiv preprint arXiv:1609.02907.
[67] Sowa, J. F., Principles of Semantic Networks: Explorations in the Representation of Knowledge, (2014), Morgan Kaufmann
[68] A. Singhal, Introducing the knowledge graph: things, not strings, Official Google Blog, 2012.
[69] Gasteiger, J.; Engel, T., Chemoinformatics: A textbook, (Computational, (2003)), 680
[70] Debnath, A. K.; Lopez de Compadre, R. L.; Debnath, G.; Shusterman, A. J.; Hansch, C., Structure-activity relationship of mutagenic aromatic and heteroaromatic nitro compounds. correlation with molecular orbital energies and hydrophobicity., J. Med. Chem., 34, 2, 786-797, (1991)
[71] Toivonen, H.; Srinivasan, A.; King, R. D.; Kramer, S.; Helma, C., Statistical evaluation of the predictive toxicology challenge 2000-2001, Bioinformatics, 19, 10, 1183-1193, (2003)
[72] Berman, H. M., The protein data bank, Nucleic Acids Res., 28, 1, 235-242, (2000)
[73] Xenarios, I.; Salwinski, L.; Duan, X. J.; Higney, P.; Kim, S.-M.; Eisenberg, D., DIP, the database of interacting proteins: a research tool for studying cellular networks of protein interactions, Nucleic Acids Res., 30, 1, 303-305, (2002)
[74] Kanehisa, M., Chapter: the KEGG database, Nature Biotechnol., 22, 8, 1017-1019, (2004)
[75] Davidson, E. H.; Rast, J. P.; Oliveri, P.; Ransick, A.; Calestani, C.; Yuh, C.-H.; Minokawa, T.; Amore, G.; Hinman, V.; Arenas-Mena, C.; Otim, O.; Brown, C. T.; Livi, C. B.; Lee, P. Y.; Revilla, R.; Rust, A. G.; Pan, Z. J.; Schilstra, M. J.; Clarke, P. J.C.; Arnone, M. I.; Rowen, L.; Cameron, R. A.; McClay, D. R.; Hood, L.; Bolouri, H., A genomic regulatory network for development, Science, 295, 5560, 1669-1678, (2002)
[76] Huson, D. H.; Bryant, D., Application of phylogenetic networks in evolutionary studies, Mol. Biol. Evol., 23, 2, 254-267, (2006)
[77] Whisstock, J. C.; Lesk, A. M., Prediction of protein function from protein sequence and structure, Q. Rev. Biophys., 36, 3, 307-340, (2003)
[78] Wasserman, S.; Faust, K., Social network analysis : methods and applications, American Ethnologist, Vol. 24, 219-220, (1994), arXiv: arXiv:1011.1669v3
[79] L. Page, Method for node ranking in a linked database, US Patent 7058628, 1 (12), 1998, pp. 1-13, arXiv: arXiv:1208.5721, http://dx.doi.org/10.1016/j.(73).
[80] Deutsch, A.; Fernandez, M.; Florescu, D.; Levy, A.; Suciu, D., A query language for XML, Comput. Netw., 31, 11-16, 1155-1169, (1999)
[81] M. Weis, F. Naumann, Detecting duplicates in complex XML data, in: Proceedings - International Conference on Data Engineering, Vol. 2006, 2006, p. 109, http://dx.doi.org/10.1109/ICDE.2006.49.
[82] M. Collins, N. Duffy, Convolution Kernels for natural language, in: Proceedings of Neural Information Processing Systems 2001 - Advances in Neural Information Processing Systems 14, 2002, pp. 625-632, http://dx.doi.org/
[83] Das, N.; Ghosh, S.; Gonçalves, T.; Quaresma, P., Comparison of different graph distance metrics for semantic text based classification, Polibits, 49, 51-58, (2014)
[84] Das, N.; Ghosh, S.; Gonçalves, T.; Quaresma, P., Using graphs and semantic information to improve text classifiers, (International Conference on Natural Language Processing, (2014), Springer), 324-336
[85] Beck, D.; Cohn, T.; Hardmeier, C.; Specia, L., Learning structural kernels for natural language processing, Trans. Assoc. Comput. Linguist., 3, 461-473, (2015)
[86] Aldea, E.; Atif, J.; Bloch, I., Image classification using marginalized kernels for graphs, (Graph-Based Representations in Pattern Recognition, (2007)), 103-113 · Zbl 1182.68143
[87] F.R. Bach, Graph kernels between point clouds, in: Proceedings of the 25th International Conference on Machine Learning, 2008, pp. 25-32, arXiv: arXiv:0712.3402v1, http://dx.doi.org/10.1145/1390156.1390160.
[88] Harchaoui, Z.; Bach, F., Image classification with segmentation graph kernels, (IEEE Conference on Computer Vision and Pattern Recognition, 2007, CVPR’07, (2007), IEEE), 1-8
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.