
A new learning paradigm: learning using privileged information. (English) Zbl 1335.68212

Summary: In the afterword to the second edition of the book [Estimation of dependences based on empirical data. With a new afterword of the 2006 ed.: Empirical inference science. 2nd ed., reprint of 1982 ed. New York, NY: Springer (2006; Zbl 1118.62002)] by V. Vapnik, an advanced learning paradigm called learning using hidden information (LUHI) was introduced. This afterword also suggested an extension of the SVM method (the so-called \(\mathrm{SVM}_\gamma+\) method) to implement algorithms which address the LUHI paradigm [loc. cit., Sections 2.4.2 and 2.5.3 of the afterword]. See also [V. Vapnik et al., “Learning using hidden information: master class learning”, in: Proceedings of NATO workshop on mining massive data sets for security. Amsterdam: IOS Press. 3–14 (2008; doi:10.3233/978-1-58603-898-4-3); “Learning using hidden information (learning with teacher)”, in: Proceedings of international joint conference on neural networks. Los Alamitos: IEEE Computer Society. 3188–3195 (2009; doi:10.1109/IJCNN.2009.5178760)] for further development of the algorithms.
In contrast to the existing machine learning paradigm, in which a teacher does not play an important role, the advanced learning paradigm incorporates elements of human teaching. In the new paradigm, along with examples, a teacher can provide students with hidden information contained in explanations, comments, comparisons, and so on.
This paper discusses the details of the new paradigm and the corresponding algorithms, introduces some new algorithms, considers several specific forms of privileged information, demonstrates the superiority of the new learning paradigm over the classical one when solving practical problems, and discusses general questions related to the new ideas.
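The defining feature of the paradigm is that privileged information is available only at training time, never at test time. The following is a minimal pure-Python sketch of that data setup, not the paper's \(\mathrm{SVM}_\gamma+\) solver: each training example is a triplet \((x, x^*, y)\), and here the hypothetical privileged feature `x_star` (e.g. a teacher's difficulty rating) is turned into a per-example update weight, loosely analogous to how SVM+ models slack variables in the privileged space.

```python
# Toy illustration of the LUPI setting (an assumption-laden sketch, not the
# SVM_gamma+ algorithm from the paper). Training data are triplets
# (x, x_star, y); at prediction time only x is available.

def train_perceptron(triplets, epochs=20, lr=0.1):
    """Weighted perceptron: the privileged scalar x_star scales each update."""
    dim = len(triplets[0][0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        for x, x_star, y in triplets:
            weight = 1.0 / (1.0 + x_star)   # easy examples (small x_star) push harder
            margin = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * margin <= 0:             # misclassified: weighted update
                w = [wi + lr * weight * y * xi for wi, xi in zip(w, x)]
                b += lr * weight * y
    return w, b

def predict(model, x):
    """Test-time prediction uses only the ordinary features x."""
    w, b = model
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

# Hypothetical training triplets: (features, privileged difficulty score, label).
train = [([1.0, 2.0], 0.1, 1), ([2.0, 1.0], 0.2, 1),
         ([-1.0, -2.0], 0.1, -1), ([-2.0, -1.0], 0.3, -1)]
model = train_perceptron(train)
print(predict(model, [1.5, 1.5]), predict(model, [-1.5, -1.5]))  # → 1 -1
```

The design point being illustrated is only the asymmetry of information: `x_star` influences training but is absent from `predict`, which is the structural signature of the LUPI/LUHI setting.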

MSC:

68T05 Learning and adaptive systems in artificial intelligence

Citations:

Zbl 1118.62002

Software:

MAMMOTH
Full Text: DOI

References:

[1] Berman, H. M.; Westbrook, J.; Feng, Z.; Gilliland, G.; Bhat, T. N.; Weissig, H., The protein data bank, Nucleic Acids Research, 28, 235-242 (2000)
[2] Boser, B. E.; Guyon, I. M.; Vapnik, V. N., A training algorithm for optimal margin classifiers, (Proceedings of the fifth annual workshop on computational learning theory, Vol. 5 (1992)), 144-152
[3] Casdagli, M., Nonlinear prediction of chaotic time series, Physica D, 35, 335-356 (1989) · Zbl 0671.62099
[4] Chechik, G.; Tishby, N., Extracting relevant structures with side information, (Advances in neural information processing systems 2002, Vol. 16 (2002)), 857-864
[5] Cortes, C.; Vapnik, V. N., Support vector networks, Machine Learning, 20, 273-297 (1995) · Zbl 0831.68098
[6] Kuang, R.; Le, E.; Wang, K.; Wang, K.; Siddiqi, M.; Freund, Y., Profile-based string kernels for remote homology detection and motif extraction, Journal of Bioinformatics and Computational Biology, 3, 527-550 (2005)
[7] Liao, L.; Noble, W. S., Combining pairwise sequence similarity and support vector machines for remote protein homology detection, Journal of Computational Biology, 10, 6, 857-868 (2003)
[8] Mukherjee, S.; Osuna, E.; Girosi, F., Nonlinear prediction of chaotic time series using support vector machines, Neural Networks for Signal Processing, 511-520 (1997)
[9] Murzin, A. G.; Brenner, S. E.; Hubbard, T.; Chothia, C., SCOP: A structural classification of proteins database for investigation of sequences and structures, Journal of Molecular Biology, 247, 536-540 (1995)
[10] Ortiz, A. R.; Strauss, C. E.M.; Olmea, O., MAMMOTH (matching molecular models obtained from theory): An automated method for model comparison, Protein Science, 11, 2606-2621 (2002)
[11] Platt, J. C., Sequential minimal optimization: A fast algorithm for training support vector machines, (Schölkopf, B.; Burges, C.; Smola, A., Advances in kernel methods: Support vector learning (1998), MIT Press)
[12] Steinwart, I., Support vector machines are universally consistent, Journal of Complexity, 18, 3, 768-791 (2002) · Zbl 1030.68074
[13] Steinwart, I.; Scovel, C., When do support vector machines learn fast?, (The sixteenth international symposium on mathematical theory of networks and systems (MTNS) (2004))
[14] Tsybakov, A. B., Optimal aggregation of classifiers in statistical learning, Annals of Statistics, 32, 1, 135-166 (2004) · Zbl 1105.62353
[15] Vapnik, V.; Vashist, A.; Pavlovitch, N., Learning using hidden information: Master class learning, (Proceedings of NATO workshop on mining massive data sets for security (2008), IOS Press), 3-14
[16] Vapnik, V.; Vashist, A.; Pavlovitch, N., Learning using hidden information (learning with teacher), (Proceedings of international joint conference on neural networks (2009)), 3188-3195
[17] Vapnik, V. N., Estimation of dependences based on empirical data, (Empirical inference science (2006), Springer) · Zbl 0746.68075
[18] Vapnik, V. N., The nature of statistical learning theory (1995), Springer-Verlag · Zbl 0934.62009
[19] Vapnik, V. N., Statistical learning theory (1998), Wiley-Interscience · Zbl 0934.62009
[20] Xing, E. P.; Ng, A. Y.; Jordan, M. I.; Russell, S., Distance metric learning with application to clustering with side-information, (Advances in neural information processing systems, Vol. 16 (2002)), 521-528
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.