Propagation kernels: efficient graph kernels from propagated information.

*(English)*Zbl 1357.68178Summary: We introduce propagation kernels, a general graph-kernel framework for efficiently measuring the similarity of structured data. Propagation kernels are based on monitoring how information spreads through a set of given graphs. They leverage early-stage distributions from propagation schemes such as random walks to capture structural information encoded in node labels, attributes, and edge information. This has two benefits. First, off-the-shelf propagation schemes can be used to naturally construct kernels for many graph types, including labeled, partially labeled, unlabeled, directed, and attributed graphs. Second, by leveraging existing efficient and informative propagation schemes, propagation kernels can be considerably faster than state-of-the-art approaches without sacrificing predictive performance. We will also show that if the graphs at hand have a regular structure, for instance when modeling image or video data, one can exploit this regularity to scale the kernel computation to large databases of graphs with thousands of nodes. We support our contributions by exhaustive experiments on a number of real-world graphs from a variety of application domains.

##### MSC:

68T05 | Learning and adaptive systems in artificial intelligence |

05C81 | Random walks on graphs |

68R10 | Graph theory (including graph drawing) in computer science |

94A17 | Measures of information, entropy |

##### Keywords:

learning with graphs; graph kernels; random walks; locality sensitive hashing; convolutions
PDF
BibTeX
XML
Cite

\textit{M. Neumann} et al., Mach. Learn. 102, No. 2, 209--245 (2016; Zbl 1357.68178)

Full Text:
DOI

##### References:

[1] | Bauckhage, C., & Kersting, K. (2013). Efficient information theoretic clustering on discrete lattices (2013). arxiv:1310.7114. |

[2] | Borgwardt, K., & Kriegel, H. P. (2005). Shortest-path kernels on graphs. In Proceedings of international conference on data mining (ICDM-05), pp. 74-81. |

[3] | Datar, M., & Indyk, P. (2004). Locality-sensitive hashing scheme based on \(p\)-stable distributions. In Proceedings of the 20th annual symposium on computational geometry (SCG-2004), pp. 253-262. · Zbl 1373.68193 |

[4] | Debnath, A; Compadre, RL; Debnath, G; Schusterman, A; Hansch, C, Structure-activity relationship of mutagenic aromatic and heteroaromatic nitro compounds. correlation with molecular orbital energies and hydrophobicity, Journal of Medicinal Chemistry, 34, 786-797, (1991) |

[5] | Desrosiers, C., & Karypis, G. (2009). Within-network classification using local structure similarity. In Proceedings of the European conference on machine learning and knowledge discovery in databases (ECML/PKDD-09), pp. 260-275. |

[6] | Dobson, PD; Doig, AJ, Distinguishing enzyme structures from non-enzymes without alignments, Journal of Molecular Biology, 330, 771-783, (2003) |

[7] | Feragen, A; Kasenburg, N; Petersen, J; Bruijne, M; Borgwardt, KM, Scalable kernels for graphs with continuous attributes, Advances in Neural Information Processing Systems, 26, 216-224, (2013) |

[8] | Gallagher, B., Tong, H., Eliassi-Rad, T., & Faloutsos, C. (2008). Using ghost edges for classification in sparsely labeled networks. In Proceedings of the 14th ACM SIGKDD international conference on knowledge discovery and data mining (KDD-08), pp. 256-264. |

[9] | Gärtner, T., Flach, P. A., & Wrobel, S. (2003). On graph kernels: Hardness results and efficient alternatives. In Proceedings of computational learning theory and kernel machines (COLT-03), pp. 129-143. · Zbl 1274.68312 |

[10] | Gersho, A., & Gray, R. (1991). Vector quantization and signal compression. Norwell, MA: Kluwer Academic Publishers. · Zbl 0782.94001 |

[11] | Haralick, RM; Shanmugam, K; Dinstein, IH, Textural features for image classification, IEEE Transactions on Systems, Man and Cybernetics, SMC-3, 610-621, (1973) |

[12] | Harchaoui, Z., & Bach, F. (2007). Image classification with segmentation graph kernels. In CVPR. IEEE Computer Society. |

[13] | Haussler, D. (1999). Convolution kernels on discrete structures. Tech. rep., Department of Computer Science, University of California, Santa Cruz. |

[14] | Hido, S., & Kashima, H. (2009). A linear-time graph kernel. In Proceedings of the 9th IEEE international conference on data mining (ICDM-09), pp. 179-188. |

[15] | Horváth, T., Gärtner, T., & Wrobel, S. (2004). Cyclic pattern kernels for predictive graph mining. In Proceedings of knowledge discovery in databases (KDD-04), pp. 158-167. |

[16] | Hwang, T., & Kuang, R. (2010). A heterogeneous label propagation algorithm for disease gene discovery. In Proceedings of the SIAM international conference on data mining (SDM-10), pp. 583-594. |

[17] | Jaakkola, T; Haussler, D, Exploiting generative models in discriminative classifiers, Advances in Neural Information Processing Systems, 11, 487-493, (1998) |

[18] | Jähne, B. (2005). Digital Image Processing (6th ed.). Berlin: Springer. |

[19] | Jebara, T; Kondor, R; Howard, A, Probability product kernels, Journal of Machine Learning Research, 5, 819-844, (2004) · Zbl 1222.68226 |

[20] | Kashima, H., Tsuda, K., & Inokuchi, A. (2003). Marginalized kernels between labeled graphs. In Proceedings of the 20th international conference on machine learning (ICML-03), pp. 321-328. |

[21] | Kersting, K., Mladenov, M., Garnett, R., & Grohe, M. (2014). Power iterated color refinement. In Proceedings of the 28th AAAI conference on artificial intelligence (AAAI-14), pp. 1904-1910. · Zbl 1242.05112 |

[22] | Kondor, R., & Jebara, T. (2003). A kernel between sets of vectors. In Proceedings of the twentieth international conference on machine learning (ICML-03), pp. 361-368. |

[23] | Kondor, R. I., & Lafferty, J. D. (2002). Diffusion kernels on graphs and other discrete input spaces. In Proceedings of the nineteenth international conference on machine learning (ICML-02), pp. 315-322. |

[24] | Kriege, N., & Mutzel, P. (2012). Subgraph matching kernels for attributed graphs. In Proceedings of the 29th international conference on machine learning (ICML-12). |

[25] | Lafferty, J; Lebanon, G, Information diffusion kernels, Advances in Neural Information Processing Systems, 22, 375-382, (2002) |

[26] | Lin, F., & Cohen, W. W. (2010). Power iteration clustering. In Proceedings of the 27th international conference on machine learning (ICML-10), pp. 655-662. |

[27] | Lovász, L; Miklós, D (ed.); Sós, VT (ed.); Szőnyi, T (ed.), Random walks on graphs: A survey, No. 2, 353-398, (1996), Budapest |

[28] | Mahé, P; Vert, JP, Graph kernels based on tree patterns for molecules, Machine Learning, 75, 3-35, (2009) |

[29] | Moreno, P., Ho, P., & Vasconcelos, N. (2003). A Kullback-Leibler divergence based kernel for SVM classification in multimedia applications. Advances in Neural Information Processing Systems, it 23(NIPS-03), 1385-1392. |

[30] | Neumann, M., Garnett, R., & Kersting, K. (2013). Coinciding walk kernels: Parallel absorbing random walks for learning with graphs and few labels. In Asian conference on machine learning (ACML-13), pp. 357-372. |

[31] | Neumann, M., Hallau, L., Klatt, B., Kersting, K., & Bauckhage, C. (2014). Erosion band features for cell phone image based plant disease classification. In Proceedings of the 22nd international conference on pattern recognition (ICPR-14), pp. 3315-3320. |

[32] | Neumann, M., Moreno, P., Antanas, L., Garnett, R., & Kersting, K. (2013). Graph kernels for object category prediction in task-dependent robot grasping. In Eleventh workshop on mining and learning with graphs (MLG-13), Chicago, Illinois. |

[33] | Neumann, M., Patricia, N., Garnett, R., & Kersting, K. (2012). Efficient graph kernels by randomization. In European conference on machine learning and knowledge discovery in databases (ECML/PKDD-12), pp. 378-393. |

[34] | Ojala, T; Pietikäinen, M; Mäenpää, T, Multiresolution gray-scale and rotation invariant texture classification with local binary patterns, IEEE Transactions on Pattern Analysis and Machine Intelligence, 24, 971-987, (2002) |

[35] | Ramon, J., & Gärtner, T. (2003). Expressivity versus efficiency of graph kernels. In Proceedings of the 1st international workshop on mining graphs, trees and sequences, pp. 65-74. · Zbl 1280.68194 |

[36] | Schomburg, I; Chang, A; Ebeling, C; Gremse, M; Heldt, C; Huhn, G; Schomburg, D, Brenda, the enzyme database: updates and major new developments, Nucleic Acids Research, 32, 431-433, (2004) |

[37] | Shervashidze, N; Schweitzer, P; Leeuwen, E; Mehlhorn, K; Borgwardt, K, Weisfeiler-lehman graph kernels, Journal of Machine Learning Research, 12, 2539-2561, (2011) · Zbl 1280.68194 |

[38] | Shervashidze, N; Vishwanathan, S; Petri, T; Mehlhorn, K; Borgwardt, K, Efficient graphlet kernels for large graph comparison, Journal of Machine Learning Research—Proceedings Track, 5, 488-495, (2009) |

[39] | Shi, Q; Petterson, J; Dror, G; Langford, J; Smola, AJ; Vishwanathan, SVN, Hash kernels for structured data, Journal of Machine Learning Research, 10, 2615-2637, (2009) · Zbl 1235.68188 |

[40] | Szummer, M; Jaakkola, T, Partially labeled classification with Markov random walks, Advances in Neural Information Processing Systems, 15, 945-952, (2001) |

[41] | Valkealahti, K; Oja, E, Reduced multidimensional co-occurrence histograms in texture classification, IEEE Transactions on Pattern Analysis and Machine Intelligence, 20, 90-94, (1998) |

[42] | Vishwanathan, S; Schraudolph, N; Kondor, R; Borgwardt, K, Graph kernels, Journal of Machine Learning Research, 11, 1201-1242, (2010) · Zbl 1242.05112 |

[43] | Wale, N., & Karypis, G. (2006). Comparison of descriptor spaces for chemical compound retrieval and classification. In Proceedings of the international conference on data mining (ICDM-06) (pp. 678-689). Silver Spring, MD: IEEE Computer Society. |

[44] | Winn, J. M., Criminisi, A., & Minka, T. P. (2005). Object categorization by learned universal visual dictionary. In 10th IEEE international conference on computer vision (ICCV-05), pp. 1800-1807. |

[45] | Wu, XM; Li, Z; So, AMC; Wright, J; Chang, SF, Learning with partially absorbing random walks, Advances in Neural Information Processing Systems, 26, 3086-3094, (2012) |

[46] | Zhou, D; Bousquet, O; Lal, TN; Weston, J; Schölkopf, B, Learning with local and global consistency, Advances in Neural Information Processing Systems, 17, 321-328, (2003) |

[47] | Zhu, X., Ghahramani, Z., & Lafferty, J. D. (2003). Semi-supervised learning using Gaussian fields and harmonic functions. In Proceedings of the twentieth international conference on machine learning (ICML-03), pp. 912-919. |

This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.