Efficient semi-supervised learning on locally informative multiple graphs. (English) Zbl 1227.68093

Summary: We address an issue of semi-supervised learning on multiple graphs, over which informative subgraphs are distributed. One application under this setting can be found in molecular biology, where different types of gene networks are generated depending upon experiments. Here an important problem is to annotate unknown genes by using functionally known genes, which connect to unknown genes in gene networks, in which informative parts vary over networks. We present a powerful, time-efficient approach for this problem by combining soft spectral clustering with label propagation for multiple graphs. We demonstrate the effectiveness and efficiency of our approach using both synthetic and real biological datasets.


68T05 Learning and adaptive systems in artificial intelligence
68R10 Graph theory (including graph drawing) in computer science
Full Text: DOI


[1] Weston, J.; Leslie, C.; Ie, E.; Zhou, D.; Elisseeff, A.; Noble, W.S., Semi-supervised protein classification using cluster kernels, Bioinformatics, 21, 15, 3241-3247, (2005)
[2] Fujino, A.; Ueda, N.; Saito, K., Semisupervised learning for a hybrid generative/discriminative classifier based on the maximum entropy principle, IEEE transactions on pattern analysis and machine intelligence, 30, 3, 424-437, (2008)
[3] Bao, B.-K.; Nia, B.; Mua, Y.; Yana, S., Efficient region-aware large graph construction towards scalable multi-label propagation, Pattern recognition, 44, 3, 598-606, (2011)
[4] O. Chapelle, B. Schölkopf, A. Zien (Eds.), Introduction to Semi-Supervised Learning, MIT Press, Cambridge, MA, 2006, pp. 1-12 (Chapter 1).
[5] Kulis, B.; Basu, S.; Dhillon, I.S.; Mooney, R.J., Semi-supervised graph clustering: a kernel approach, Machine learning, 74, 1, 1-22, (2009)
[6] Yin, X.; Chen, S.; Hu, E.; Zhang, D., Semi-supervised clustering with metric learning: an adaptive kernel method, Pattern recognition, 43, 4, 1320-1333, (2010) · Zbl 1192.68623
[7] Lee, I.; Date, S.; Adai, A.; Marcotte, E., A probabilistic functional network of yeast genes, Science, 306, 5701, 1555, (2004)
[8] Tsuda, K.; Shin, H.; Schölkopf, B., Fast protein classification with multiple networks, Bioinformatics, 21, 2, 59-65, (2005)
[9] Shiga, M.; Takigawa, I.; Mamitsuka, H., Annotating gene function by combining expression data with a modular gene network, Bioinformatics, 23, 13, i468-i478, (2007)
[10] Karaoz, U.; Murali, T.; Letovsky, S.; Zheng, Y.; Ding, C.; Cantor, C.; Kasif, S., Whole-genome annotation by using evidence integration in functional-linkage networks, Proceedings of the national Academy of sciences, 101, 9, 2888, (2004)
[11] Maciag, K.; Altschuler, S.J.; Slack, M.D.; Krogan, N.J.; Emili, A.; Greenblatt, J.F.; Maniatis, T.; Wu, L.F., Systems-level analyses identify extensive coupling among gene expression machines, Molecular systems biology, 2, (2006), 2006.0003
[12] Sharan, R.; Ulitsky, I.; Shamir, R., Network-based prediction of protein function, Molecular systems biology, 3, 88, (2007), Epub
[13] Mostafavi, S.; Ray, D.; Warde-Farley, D.; Grouios, C.; Morris, Q., Genemania: a real-time multiple association network integration algorithm for predicting gene function, Genome biology, 9, Suppl 1, S4, (2008)
[14] Xi, W.; Zhang, B.; Chen, Z.; Lu, Y.; Yan, S.; Ma, W.-Y.; Fox, E.A., Link fusion: a unified link analysis framework for multi-type interrelated data objects, (), 319-327
[15] Zhang, T.; Popescul, A.; Dom, B., Linear prediction models with graph regularization for web-page categorization, (), 821-826
[16] Ling, X.; Dai, W.; Xue, G.-R.; Yang, Q.; Yu, Y., Spectral domain-transfer learning, (), 488-496
[17] Kato, T.; Kashima, H.; Sugiyama, M., Robust label propagation on multiple networks, IEEE transactions on neural networks, 20, 1, 35-44, (2009)
[18] Lanckriet, G.R.G.; Cristianini, N.; Bartlett, P.; Ghaoui, L.E.; Jordan, M.I., Learning the kernel matrix with semidefinite programming, Journal of machine learning research, 5, 27-72, (2004) · Zbl 1222.68241
[19] Zhu, X.; Kandola, J.; Ghahramani, Z.; Lafferty, J., Nonparametric transforms of graph kernels for semi-supervised learning, (), 1641-1648
[20] Argyriou, A.; Herbster, M.; Pontil, M., Combining graph Laplacians for semi-supervised learning, (), 67-74
[21] Lewis, D.P.; Jebara, T.; Noble, W.S., Nonstationary kernel combination, (), 553-560
[22] Ong, C.S.; Smola, A.J.; Williamson, T.C., Learning the kernel with hyperkernels, Journal of machine learning research, 6, 1043-1071, (2005) · Zbl 1222.68277
[23] Gönen, M.; Alpaydin, E., Localized multiple kernel learning, (), 352-359
[24] Filippone, M.; Camastra, F.; Masulli, F.; Rovetta, S., A survey of kernel and spectral methods for clustering, Pattern recognition, 41, 1, 176-190, (2008) · Zbl 1122.68530
[25] Banerjee, A.; Dhillon, I.S.; Ghosh, J.; Sra, S., Clustering on the unit hypersphere using von mises – fisher distributions, Journal of machine learning research, 6, 1345-1382, (2005) · Zbl 1190.62116
[26] Zhou, D.; Bousquet, O.; Lal, T.N.; Weston, J.; Schölkopf, B., Learning with local and global consistency, (), 321-328
[27] Kondor, R.I.; Lafferty, J.D., Diffusion kernels on graphs and other discrete input spaces, (), 315-322
[28] Smola, A.J.; Kondor, R.I., Kernels and regularization on graphs, (), 144-158 · Zbl 1274.68351
[29] Hand, D.J.; Till, R.J., A simple generalization of the area under the ROC curve for multiple class classification problems, Machine learning, 45, 171-186, (2001) · Zbl 1007.68180
[30] Mamitsuka, H., Selecting features in microarray classification using ROC curves, Pattern recognition, 39, 12, 2393-2404, (2006) · Zbl 1103.68774
[31] Güldener, U., CYGD: the comprehensive yeast genome database, Nucleic acids research, 33, Database issue, D364-D368, (2005)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.