×

zbMATH — the first resource for mathematics

Treelet kernel incorporating cyclic, stereo and inter pattern information in chemoinformatics. (English) Zbl 1373.68319
Summary: Chemoinformatics is a research field concerned with the study of physical or biological molecular properties through computer science’s research fields such as machine learning and graph theory. From this point of view, graph kernels provide a nice framework which allows to naturally combine machine learning and graph theory techniques. Graph kernels based on bags of patterns have proven their efficiency on several problems both in terms of accuracy and computational time. Treelet kernel is a graph kernel based on a bag of small subtrees. We propose in this paper several extensions of this kernel devoted to chemoinformatics problems. These extensions aim to weight each pattern according to its influence, to include the comparison of non-isomorphic patterns, to include stereo information and finally to explicitly encode cyclic information into kernel computation.

MSC:
68T05 Learning and adaptive systems in artificial intelligence
05C90 Applications of graph theory
68T10 Pattern recognition, speech recognition
Software:
SimpleMKL; AFGen
PDF BibTeX Cite
Full Text: DOI
References:
[1] Johnson, M. A.; Maggiora, G. M., Concepts and applications of molecular similarity, (1990), John Wiley & Sons New York
[2] Cherqaoui, D.; Villemin, D.; Mesbah, A.; Cense, J. M.; Kvasnicka, V., Use of a neural network to determine the normal boiling points of acyclic ethers, peroxides, acetals and their sulfur analogues, J. Chem. Soc. Faraday Trans., 90, 2015-2019, (1994)
[3] Caelli, T.; Kosinov, S., An eigenspace projection clustering method for inexact graph matching, IEEE Trans. Pattern Anal. Mach. Intell., 26, 515-519, (2004)
[4] P. Foggia, G. Percannella, M. Vento, Graph matching and learning in pattern recognition in the last 10 years, Int. J. Pattern Recognit. Artif. Intell. 28 (1) (2014).
[5] G. Poezevara, B. Cuissart, B. Crémilleux, Discovering emerging graph patterns from chemicals, in: ISMIS׳2009, Lecture Notes in Computer Science, Prague, 2009, pp. 45-55.
[6] Riesen, K.; Bunke, H., Graph classification based on vector space embedding, Int. J. Pattern Recognit. Artif. Intell., 23, 6, 1053-1081, (2009)
[7] Dattorro, J., Convex optimization and Euclidean distance geometry, (2005), Meboo Publishing USA
[8] Kashima, H.; Tsuda, K.; Inokuchi, A., Kernels for graphs, 155-170, (2004), MIT Press Cambridge, (Chapter 7)
[9] Mahé, P.; Vert, J.-P., Graph kernels based on tree patterns for molecules, Mach. Learn., 75, 1, 3-35, (2008)
[10] N. Shervashidze, K.M. Borgwardt, Fast subtree kernels on graphs, in: Advances in Neural Information Processing Systems, 2009, pp. 1660-1668.
[11] Gaüzère, B.; Brun, L.; Villemin, D., Two new graphs kernels in chemoinformatics, Pattern Recognit. Lett., 33, 15, 2038-2047, (2012)
[12] N. Shervaszide, Scalable Graph Kernels (Ph.D. thesis), Universität Tübingen, 2012.
[13] T. Horváth, T. Gärtner, S. Wrobel, Cyclic pattern kernels for predictive graph mining, in: KDD׳2004, ACM Press, Seattle, Washington, USA, 2004, p. 158.
[14] Brown, J.; Urata, T.; Tamura, T.; Arai, M. A.; Kawabata, T.; Akutsu, T., Compound analysis via graph kernels incorporating chirality, J. Bioinf. Comput. Biol., 8, 1, 63-81, (2010)
[15] B. Gaüzére, L. Brun, D. Villemin, Relevant cycle hypergraph representation for molecules, in: GBR׳2013, Springer, Berlin, Heidelberg, 2013, pp. 111-120. · Zbl 1382.92266
[16] P.-A. Grenier, L. Brun, D. Villemin, Treelet kernel incorporating chiral information, in: GBR׳2013, Springer, Berlin, 2013, pp. 132-141. · Zbl 1382.92267
[17] Gönen, M.; Alpaydın, E., Multiple kernel learning algorithms, J. Mach. Learn. Res., 12, 2211-2268, (2011) · Zbl 1280.68167
[18] D. Haussler, Convolution Kernels on Discrete Structures, Technical Report, Department of Computer Science, University of California, Santa Cruz, 1999.
[19] Wale, N.; Watson, I.; Karypis, G., Comparison of descriptor spaces for chemical compound retrieval and classification, Knowl. Inf. Syst., 14, 3, 347-375, (2008)
[20] Morgan, H. L., The generation of a unique machine description for chemical structures-a technique developed at chemical abstracts service, J. Chem. Doc., 5, 2, 107-113, (1965)
[21] Rakotomamonjy, A.; Bach, F.; Canu, S.; Grandvalet, Y., Simplemkl, J. Mach. Learn. Res., 9, 2491-2521, (2008) · Zbl 1225.68208
[22] M. Varma, D. Ray, Learning the discriminative power-invariance trade-off, in: ICCV׳2007, IEEE, Rio de Janeiro, 2007, pp. 1-8.
[23] F. Yger, R.A., Wavelet kernel learning, Pattern Recognit. 44(10-11) (2011) 2614-2629.
[24] Neuhaus, M.; Bunke, H., Bridging the gap between graph edit distance and kernel machines, (2007), World Scientific Publishing Co., Inc. River Edge, NJ, USA · Zbl 1140.68473
[25] S. Fankhauser, K. Riesen, H. Bunke, Speeding up graph edit distance computation through fast bipartite matching, in: GBR׳2011, vol. 6658, 2011, pp. 102-111. · Zbl 1381.05068
[26] L. Brun, B. Gaüzére, S. Fourey, Relationships Between Graph edit Distance and Maximal Common Unlabeled Subgraph, Technical Report, GREYC, 〈http://hal.archives-ouvertes.fr/hal-00714879〉, 2012.
[27] Bunke, H., Error correcting graph matchingon the influence of the underlying cost function, IEEE Trans. Pattern Anal. Mach. Intell., 21, 9, 917-922, (1999)
[28] Zhang, K.; Statman, R.; Shasha, D., On the editing distance between unordered labeled trees, Inf. Process. Lett., 42, 3, 133-139, (1992) · Zbl 0780.68070
[29] P.-A. Grenier, L. Brun, D. Villemin, Incorporating Chirality Within the Graph Kernel Framework, Technical Report, CNRS UMR 6072 GREYC, 〈http://hal.archives-ouvertes.fr/hal-00809066/〉, 2013. · Zbl 1382.92267
[30] T. Horváth, Cyclic pattern kernels revisited, in: KDD׳2005, vol. 3518, 2005, pp. 791-801.
[31] Vismara, P., Union of all the minimum cycle bases of a graph, Electron. J. Comb., 4, 1, 73-87, (1997)
[32] Berge, C., Graphs and Hypergraphs, vol. 6, (1976), Elsevier Amsterdam · Zbl 0483.05029
[33] F. Suard, A. Rakotomamonjy, A. Bensrhair, Kernel on bag of paths for measuring similarity of shapes, in: European Symposium on Artificial Neural Networks, 2002, pp. 355-360.
[34] Toivonen, H.; Srinivasan, A.; King, R. D.; Kramer, S.; Helma, C., Statistical evaluation of the predictive toxicology challenge 2000-2001, Bioinformatics, 19, 10, 1183-1193, (2003)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.