×

zbMATH — the first resource for mathematics

A unifying view of explicit and implicit feature maps of graph kernels. (English) Zbl 1458.68170
Summary: Non-linear kernel methods can be approximated by fast linear ones using suitable explicit feature maps allowing their application to large scale problems. We investigate how convolution kernels for structured data are composed from base kernels and construct corresponding feature maps. On this basis we propose exact and approximative feature maps for widely used graph kernels based on the kernel trick. We analyze for which kernels and graph properties computation by explicit feature maps is feasible and actually more efficient. In particular, we derive approximative, explicit feature maps for state-of-the-art kernels supporting real-valued attributes including the GraphHopper and graph invariant kernels. In extensive experiments we show that our approaches often achieve a classification accuracy close to the exact methods based on the kernel trick, but require only a fraction of their running time. Moreover, we propose and analyze algorithms for computing random walk, shortest-path and subgraph matching kernels by explicit and implicit feature maps. Our theoretical results are confirmed experimentally by observing a phase transition when comparing running time with respect to label diversity, walk lengths and subgraph size, respectively.
MSC:
68T05 Learning and adaptive systems in artificial intelligence
05C81 Random walks on graphs
05C85 Graph algorithms (graph-theoretic aspects)
68W40 Analysis of algorithms
PDF BibTeX XML Cite
Full Text: DOI
References:
[1] Andoni A (2009) Nearest neighbor search: the old, the new, and the impossible. Ph.D. thesis, MIT
[2] Bai L, Rossi L, Zhang Z, Hancock ER (2015) An aligned subtree kernel for weighted graphs. In: Proceedings of the thirty-second international conference on machine learning, pp 30-39
[3] Borgwardt KM, Kriegel HP (2005) Shortest-path kernels on graphs. In: Proceedings of the fifth IEEE international conference on data mining, pp 74-81
[4] Borgwardt, KM; Ong, CS; Schönauer, S.; Vishwanathan, SVN; Smola, AJ; Kriegel, HP, Protein function prediction via graph kernels, Bioinformatics, 21, Suppl 1, i47-i56 (2005)
[5] Bruna J, Zaremba W, Szlam A, LeCun Y (2014) Spectral networks and deep locally connected networks on graphs. In: International conference on learning representations
[6] Chang CC, Lin CJ (2011) LIBSVM: a library for support vector machines. ACM Trans Intell Syst Technol 2:27:1-27:27. Software available at http://www.csie.ntu.edu.tw/ cjlin/libsvm
[7] Defferrard M, Bresson X, Vandergheynst P (2016) Convolutional neural networks on graphs with fast localized spectral filtering. In: Advances in neural information processing systems, pp 3844-3852
[8] Duvenaud DK, Maclaurin D, Iparraguirre J, Bombarell R, Hirzel T, Aspuru-Guzik A, Adams RP (2015) Convolutional networks on graphs for learning molecular fingerprints. In: Advances in neural information processing systems, pp 2224-2232
[9] Feragen A, Kasenburg N, Petersen J, Bruijne MD, Borgwardt K (2013) Scalable kernels for graphs with continuous attributes. In: Burges C, Bottou L, Welling M, Ghahramani Z, Weinberger K (eds) Advances in neural information processing systems, pp 216-224. Erratum available at http://image.diku.dk/aasa/papers/graphkernels_nips_erratum.pdf
[10] Fout A, Byrd J, Shariat B, Ben-Hur A (2017) Protein interface prediction using graph convolutional networks. In: Advances in neural information processing systems, pp 6533-6542
[11] Fröhlich H, Wegner JK, Sieker F, Zell A (2005) Optimal assignment kernels for attributed molecular graphs. In: Proceedings of the 22nd international conference on machine learning. ACM, New York, NY, USA, ICML ’05, pp 225-232
[12] Gärtner T, Flach P, Wrobel S (2003) On graph kernels: hardness results and efficient alternatives. In: Learning theory and kernel machines, Lecture Notes in Computer Science, vol 2777. Springer, pp 129-143 · Zbl 1274.68312
[13] Ghosh, S.; Das, N.; Gonçalves, T.; Quaresma, P.; Kundu, M., The journey of graph kernels through two decades, Comput Sci Rev, 27, 88-111 (2018) · Zbl 1382.68191
[14] Gilmer J, Schoenholz SS, Riley PF, Vinyals O, Dahl GE (2017) Neural message passing for quantum chemistry. In: 33rd International conference on machine learning
[15] Hamilton WL, Ying R, Leskovec J (2017a) Inductive representation learning on large graphs. In: Advances in neural information processing systems, pp 1025-1035
[16] Hamilton, WL; Ying, R.; Leskovec, J., Representation learning on graphs: methods and applications, IEEE Data Eng Bull, 40, 3, 52-74 (2017)
[17] Harchaoui Z, Bach F (2007) Image classification with segmentation graph kernels. In: IEEE conference on computer vision and pattern recognition
[18] Haussler D (1999) Convolution kernels on discrete structures. Tech. Rep. UCSC-CRL-99-10, University of California, Santa Cruz, CA, USA
[19] Hido S, Kashima H (2009) A linear-time graph kernel. In: The ninth IEEE international conference on data mining, pp 179-188
[20] Horváth T, Gärtner T, Wrobel S (2004) Cyclic pattern kernels for predictive graph mining. In: Proceedings of the twelfth ACM SIGKDD international conference on knowledge discovery and data mining, pp 158-167
[21] Joachims T (2006) Training linear SVMs in linear time. In: Proceedings of the twelfth ACM SIGKDD international conference on knowledge discovery and data mining, pp 217-226
[22] Johansson FD, Dubhashi D (2015) Learning with similarity functions on graphs using matchings of geometric embeddings. In: Proceedings of the 21st ACM SIGKDD international conference on knowledge discovery and data mining. ACM, New York, NY, USA, KDD ’15, pp 467-476
[23] Kang U, Tong H, Sun J (2012) Fast random walk graph kernel. In: Proceedings of the 2012 SIAM international conference on data mining, pp 828-838
[24] Kar P, Karnick H (2012) Random feature maps for dot product kernels. In: Proceedings of the fifteenth international conference on artificial intelligence and statistics, AISTATS 2012, La Palma, Canary Islands, April 21-23, 2012, pp 583-591
[25] Kashima H, Tsuda K, Inokuchi A (2003) Marginalized kernels between labeled graphs. In: Proceedings of the twentieth international conference on machine learning, pp 321-328
[26] Kersting K, Kriege NM, Morris C, Mutzel P, Neumann M (2016) Benchmark data sets for graph kernels. http://graphkernels.cs.tu-dortmund.de
[27] Kipf TN, Welling M (2017) Semi-supervised classification with graph convolutional networks. In: International conference on learning representations
[28] Kondor R, Pan H (2016) The multiscale Laplacian graph kernel. In: Advances in neural information processing systems, pp 2982-2990
[29] Kriege N, Mutzel P (2012) Subgraph matching kernels for attributed graphs. In: Proceedings of the 29th international conference on machine learning. http://www.icml.cc/Omnipress
[30] Kriege N, Neumann M, Kersting K, Mutzel M (2014) Explicit versus implicit graph feature maps: a computational phase transition for walk kernels. In: 2014 IEEE international conference on data mining, pp 881-886
[31] Kriege NM, Giscard PL, Wilson R (2016) On valid optimal assignment kernels and applications to graph classification. In: Advances in neural information processing systems. Curran Associates, Inc., pp 1623-1631
[32] Kriege NM, Johansson FD, Morris C (2019) A survey on graph kernels. CoRR. arXiv:1903.11835
[33] Mahé P, Ueda N, Akutsu T, Perret JL, Vert JP (2004) Extensions of marginalized graph kernels. In: Proceedings of the twenty-first international conference on machine learning, p 70
[34] Martino GDS, Navarin N, Sperduti A (2012) A tree-based kernel for graphs. In: Proceedings of the 2012 SIAM international conference on data mining. SIAM/Omnipress, pp 975-986
[35] Martino, GDS; Navarin, N.; Sperduti, A., Tree-based kernel for graphs with continuous attributes, IEEE Trans Neural Netw Learn Syst, 29, 7, 3270-3276 (2018)
[36] Merkwirth, C.; Lengauer, T., Automatic generation of complementary descriptors with molecular graph networks, J Chem Inf Model, 45, 5, 1159-1168 (2005)
[37] Morris C, Kriege NM, Kersting K, Mutzel P (2016) Faster kernels for graphs with continuous attributes via hashing. In: Bonchi F, Domingo-Ferrer J (eds) IEEE international conference on data mining (ICDM)
[38] Narayanan A, Chandramohan M, Chen L, Liu Y, Saminathan S (2016) subgraph2vec: learning distributed representations of rooted sub-graphs from large graphs. In: Workshop on mining and learning with graphs. arXiv:1606.08928
[39] Neumann, M.; Garnett, R.; Bauckhage, C.; Kersting, K., Propagation kernels: efficient graph kernels from propagated information, Mach Learn, 102, 2, 209-245 (2016) · Zbl 1357.68178
[40] Nikolentzos G, Meladianos P, Vazirgiannis M (2017) Matching node embeddings for graph similarity. In: AAAI. AAAI Press, pp 2429-2435
[41] Nikolentzos G, Meladianos P, Limnios S, Vazirgiannis M (2018) A degeneracy framework for graph similarity. In: IJCAI, pp 2595-2601. http://www.ijcai.org
[42] Orsini F, Frasconi P, De Raedt L (2015) Graph invariant kernels. In: Proceedings of the twenty-fourth international joint conference on artificial intelligence, pp 3756-3762
[43] Pham N, Pagh R (2013) Fast and scalable polynomial kernels via explicit feature maps. In: Proceedings of the 19th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, New York, NY, USA, KDD ’13, pp 239-247. 10.1145/2487575.2487591
[44] Rahimi A, Recht B (2008) Random features for large-scale kernel machines. In: Advances in neural information processing systems, pp 1177-1184
[45] Ramon J, Gärtner T (2003) Expressivity versus efficiency of graph kernels. In: First international workshop on mining graphs, trees and sequences
[46] Scarselli, F.; Gori, M.; Tsoi, AC; Hagenbuchner, M.; Monfardini, G., The graph neural network model, Trans Neural Netw, 20, 1, 61-80 (2009)
[47] Schiavinato M, Gasparetto A, Torsello A (2015) Transitive assignment kernels for structural classification. In: Feragen A, Pelillo M, Loog M (eds) Similarity-based pattern recognition: third international workshop, SIMBAD 2015, Copenhagen, Denmark, October 12-14, 2015. Springer International Publishing, Cham, pp 146-159
[48] Schütt K, Kindermans PJ, Sauceda HE, Chmiela S, Tkatchenko A, Müller KR (2017) SchNet: a continuous-filter convolutional neural network for modeling quantum interactions. In: Advances in neural information processing systems, pp 992-1002
[49] Shawe-Taylor, J.; Cristianini, N., Kernel methods for pattern analysis (2004), New York: Cambridge University Press, New York
[50] Shervashidze N, Borgwardt K (2009) Fast subtree kernels on graphs. In: Bengio Y, Schuurmans D, Lafferty J, Williams CKI, Culotta A (eds) Advances in neural information processing systems, vol 22, pp 1660-1668
[51] Shervashidze N, Vishwanathan S, Petri TH, Mehlhorn K, Borgwardt KM (2009) Efficient graphlet kernels for large graph comparison. In: 12th International conference on artificial intelligence and statistics
[52] Shervashidze, N.; Schweitzer, P.; van Leeuwen, EJ; Mehlhorn, K.; Borgwardt, KM, Weisfeiler-Lehman graph kernels, J Mach Learn Res, 12, 2539-2561 (2011) · Zbl 1280.68194
[53] Shi, Q.; Petterson, J.; Dror, G.; Langford, J.; Smola, A.; Vishwanathan, S., Hash kernels for structured data, J Mach Learn Res, 10, 2615-2637 (2009) · Zbl 1235.68188
[54] Shin, K.; Kuboyama, T., A generalization of Haussler’s convolution kernel—mapping kernel and its application to tree kernels, J Comput Sci Technol, 25, 1040-1054 (2010)
[55] Su Y, Han F, Harang RE, Yan X (2016) A fast kernel for attributed graphs. In: Proceedings of the 2016 SIAM international conference on data mining
[56] Sugiyama M, Borgwardt KM (2015) Halting in random walk kernels. In: Advances in neural information processing systems, pp 1630-1638
[57] Swamidass, SJ; Chen, J.; Bruand, J.; Phung, P.; Ralaivola, L.; Baldi, P., Kernels for small molecules and the prediction of mutagenicity, toxicity and anti-cancer activity, Bioinformatics, 21, Suppl 1, i359-i368 (2005)
[58] Vedaldi, A.; Zisserman, A., Efficient additive kernels via explicit feature maps, IEEE Trans Pattern Anal Mach Intell, 34, 3, 480-492 (2012)
[59] Vert JP (2008) The optimal assignment kernel is not positive definite. CoRR abs/0801.4061
[60] Vishwanathan, SVN; Schraudolph, NN; Kondor, RI; Borgwardt, KM, Graph kernels, J Mach Learn Res, 11, 1201-1242 (2010) · Zbl 1242.05112
[61] Wale, N.; Watson, IA; Karypis, G., Comparison of descriptor spaces for chemical compound retrieval and classification, Knowl Inf Syst, 14, 3, 347-375 (2008)
[62] Yanardag P, Vishwanathan SVN (2015) Deep graph kernels. In: 21st ACM SIGKDD international conference on knowledge discovery and data mining, pp 1365-1374
[63] Ying R, He R, Chen K, Eksombatchai P, Hamilton WL, Leskovec J (2018a) Graph convolutional neural networks for web-scale recommender systems. In: ACM SIGKDD international conference on knowledge discovery & data mining
[64] Ying R, You J, Morris C, Ren X, Hamilton WL, Leskovec J (2018b) Hierarchical graph representation learning with differentiable pooling. In: Advances in neural information processing systems
[65] Zhang M, Cui Z, Neumann M, Chen Y (2018a) An end-to-end deep learning architecture for graph classification. In: AAAI conference on artificial intelligence
[66] Zhang, Y.; Wang, L.; Wang, L., A comprehensive evaluation of graph kernels for unattributed graphs, Entropy, 20, 12, 984 (2018)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.