zbMATH — the first resource for mathematics

One-trial correction of legacy AI systems and stochastic separation theorems. (English) Zbl 1448.68369
Summary: We consider the problem of efficient “on the fly” tuning of existing, or legacy, Artificial Intelligence (AI) systems. The legacy AI systems are allowed to be of arbitrary class, albeit the data they use for computing interim or final decision responses should possess an underlying structure of a high-dimensional topological real vector space. The tuning method that we propose enables dealing with errors without the need to re-train the system. Instead of re-training, a simple cascade of perceptron nodes is added to the legacy system. The added cascade modulates the legacy AI system’s decisions. If applied repeatedly, the process results in a network of modulating rules that “dress up” and improve the performance of existing AI systems. The mathematical rationale behind the method is based on the fundamental property of measure concentration in high-dimensional spaces. The method is illustrated with an example of fine-tuning a deep convolutional network that has been pre-trained to detect pedestrians in images.
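The correction mechanism described in the summary rests on stochastic separation: in a high-dimensional space, a single erroneous sample is, with overwhelming probability, linearly separable from a large set of correctly processed samples by a single perceptron node. The following minimal sketch (an illustration of the phenomenon, not the authors' implementation; the uniform-ball data model, dimensions, and all variable names are assumptions) builds such a one-node corrector in one trial and checks that it fires on the error sample while leaving the correct samples untouched.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 2000, 1000  # feature dimension, number of correctly handled samples


def sample_ball(num, dim, rng):
    """Draw `num` points uniformly from the unit ball in `dim` dimensions."""
    v = rng.standard_normal((num, dim))
    v /= np.linalg.norm(v, axis=1, keepdims=True)   # uniform directions
    r = rng.random(num) ** (1.0 / dim)              # radii for uniform volume
    return v * r[:, None]


# Feature vectors the legacy system produces: many correct ones and
# one misclassified sample to be corrected.
correct = sample_ball(n, d, rng)
error = sample_ball(1, d, rng)[0]

# One-trial corrector: a single perceptron node whose weight vector points
# at the error sample (a simple Fisher-type discriminant), with the
# threshold placed halfway between the origin and the error sample.
w = error / np.linalg.norm(error)
theta = 0.5 * np.dot(w, error)

fires_on_error = np.dot(w, error) > theta       # node reacts to the error
false_alarms = int(np.sum(correct @ w > theta)) # correct samples affected
print(fires_on_error, false_alarms)
```

Because projections of the correct samples onto any fixed direction concentrate near zero (with spread of order $1/\sqrt{d}$), the node separates the error from all correct samples with no false alarms with overwhelming probability. Whenever the node fires, the legacy decision for that input is overridden; repeating the construction for each newly observed error yields the cascade of modulating rules described in the summary.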
MSC:
68T01 General topics in artificial intelligence
60D05 Geometric probability and stochastic geometry
60E15 Inequalities; stochastic orderings
68T05 Learning and adaptive systems in artificial intelligence
68T07 Artificial neural networks and deep learning
68T09 Computational aspects of data analysis and big data
Software:
LIBLINEAR
References:
[1] Anderson, J.; Belkin, M.; Goyal, N.; Rademacher, L.; Voss, J., The more, the merrier: the blessing of dimensionality for learning large Gaussian mixtures, J. Mach. Learn. Res.: Workshop and Conference Proceedings, 35, 1-30 (2014)
[2] Ball, K., An elementary introduction to modern convex geometry, Flavors Geom., 31, 1-58 (1997) · Zbl 0901.52002
[3] Bennett, K., Legacy systems: coping with success, IEEE Softw., 12, 1, 19-23 (1995)
[4] Bisbal, J.; Lawless, D.; Wu, B.; Grimson, J., Legacy information systems: issues and directions, IEEE Softw., 16, 5, 103-111 (1999)
[5] Brahma, P. P.; Wu, D.; She, Y., Why deep learning works: a manifold disentanglement perspective, IEEE Trans. Neural Netw. Learn. Syst., 27, 10, 1997-2008 (2016)
[8] Dalal, N.; Triggs, B., Histograms of oriented gradients for human detection, Proc. of the IEEE Conference on Computer Vision and Pattern Recognition, 886-893 (2005)
[9] Ess, A.; Leibe, B.; Schindler, K.; van Gool, L., A mobile vision system for robust multi-person tracking, Proc. of the IEEE Conference on Computer Vision and Pattern Recognition, 1-8 (2008)
[10] Fan, R.-E.; Chang, K.-W.; Hsieh, C.-J.; Wang, X.-R.; Lin, C.-J., LIBLINEAR: a library for large linear classification, J. Mach. Learn. Res., 9, 1871-1874 (2008) · Zbl 1225.68175
[11] Fricker, R. J., False positives are statistically inevitable, Science, 351, 569-570 (2016)
[12] Gear, C.; Kevrekidis, I., Constraint-defined manifolds: a legacy code approach to low-dimensional computation, J. Sci. Comput., 25, 1, 17-28 (2005) · Zbl 1203.37005
[13] Gibbs, J., Elementary Principles in Statistical Mechanics, Developed With Especial Reference to the Rational Foundation of Thermodynamics (1960 (1902)), Dover Publications: Dover Publications New York
[14] Glorot, X.; Bengio, Y., Understanding the difficulty of training deep feedforward neural networks, Proc. of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS), 9, 249-256 (2010)
[15] Gorban, A., Order-disorder separation: geometric revision, Physica A, 374, 85-102 (2007)
[16] Gorban, A.; Tyukin, I.; Prokhorov, D.; Sofeikov, K., Approximation with random bases: Pro et Contra, Inf. Sci., 364-365, 129-145 (2016)
[18] Gromov, M., Metric Structures for Riemannian and non-Riemannian Spaces. With Appendices by M. Katz, P. Pansu, S. Semmes. Translated from the French by Sean Michael Bates (1999), Birkhäuser: Birkhäuser Boston, MA · Zbl 0953.53002
[19] Gromov, M., Isoperimetry of waists and concentration of maps, GAFA, Geom. Funct. Anal., 13, 178-215 (2003) · Zbl 1044.46057
[20] Halmos, P., Finite Dimensional Vector Spaces, Undergraduate Texts in Mathematics (1974), Springer
[21] Hansen, L. K.; Salamon, P., Neural network ensembles, IEEE Trans. Pattern Anal. Mach. Intell., 12, 10, 993-1001 (1990)
[22] He, K.; Zhang, X.; Ren, S.; Sun, J., Deep residual learning for image recognition, Proc. of the IEEE Conference on Computer Vision and Pattern Recognition, 770-778 (2016)
[23] Ho, T. K., Random decision forests, Proc. of the 3rd International Conference on Document Analysis and Recognition, 993-1001 (1995)
[24] Ho, T. K., The random subspace method for constructing decision forests, IEEE Trans. Pattern Anal. Mach. Intell., 20, 8, 832-844 (1998)
[25] Ison, M.; Quian Quiroga, R.; Fried, I., Rapid encoding of new memories by individual neurons in the human brain, Neuron, 87, 1, 220-230 (2015)
[26] Jackson, D., Stopping rules in principal components analysis: a comparison of heuristical and statistical approaches, Ecology, 74, 8, 2204-2214 (1993)
[28] Johnson, W. B.; Lindenstrauss, J., Extensions of Lipschitz mappings into a Hilbert space, Contemp. Math., 26, 1, 189-206 (1984) · Zbl 0539.46017
[29] Krein, M.; Milman, D., On extreme points of regular convex sets, Studia Math., 9, 133-138 (1940) · Zbl 0063.03360
[30] Krizhevsky, A.; Sutskever, I.; Hinton, G., Imagenet classification with deep convolutional neural networks, (Pereira, F.; Burges, C. J.C.; Bottou, L.; Weinberger, K. Q., Advances in Neural Information Processing Systems 25 (2012), Curran Associates, Inc.), 1097-1105
[31] Kuznetsova, A.; Hwang, S.; Rosenhahn, B.; Sigal, L., Expanding object detectors horizon: incremental learning framework for object detection in videos, Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 28-36 (2015)
[32] Lévy, P., Problèmes Concrets d’analyse Fonctionnelle (1951), Gauthier-Villars: Gauthier-Villars Paris · Zbl 0043.32302
[33] Macarthur, R., On the relative abundance of bird species, Proc. Natl. Acad. Sci., 43, 3, 293-295 (1957)
[34] Milman, V. D.; Schechtman, G., Asymptotic theory of finite dimensional normed spaces: Isoperimetric inequalities in Riemannian manifolds, Lecture Notes in Mathematics, 1200 (2009), Springer
[35] Misra, I.; Shrivastava, A.; Hebert, M., Semi-supervised learning for object detectors from video, Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 3594-3602 (2015)
[36] Nguyen, A.; Yosinski, J.; Clune, J., Deep neural networks are easily fooled: High confidence predictions for unrecognizable images, Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 427-436 (2015)
[37] Prest, A.; Leistner, C.; Civera, J.; Schmid, C.; Ferrari, V., Learning object class detectors from weakly annotated video, Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 3282-3289 (2012)
[38] Quian Quiroga, R., Concept cells: the building blocks of declarative memory functions, Nat. Rev. Neurosci., 13, 8, 587-597 (2012)
[39] Quian Quiroga, R.; Reddy, L.; Kreiman, G.; Koch, C.; Fried, I., Invariant visual representation by single neurons in the human brain, Nature, 435, 7045, 1102-1107 (2005)
[40] Rudin, W., Functional Analysis, International Series in Pure and Applied Mathematics (1991), McGraw-Hill · Zbl 0867.46001
[41] Russakovsky, O.; Deng, J.; Su, H.; Krause, J.; Satheesh, S.; Ma, S.; Huang, Z.; Karpathy, A.; Khosla, A.; Bernstein, M.; Berg, A. C.; Fei-Fei, L., Imagenet large scale visual recognition challenge, Int. J. Comput. Vis., 1-42 (2014)
[42] Scardapane, S.; Wang, D., Randomness in neural networks: an overview, Wiley Interdiscip. Rev., 7, 2, e1200 (2017)
[43] Schaefer, H., Topological Vector Spaces (1999), Springer: Springer New York · Zbl 0983.46002
[44] Simonyan, K.; Zisserman, A., Very deep convolutional networks for large-scale image recognition, International Conference on Learning Representations (2015)
[45] Szegedy, C.; Zaremba, W.; Sutskever, I.; Bruna, J.; Erhan, D.; Goodfellow, I. J.; Fergus, R., Intriguing properties of neural networks, Proc. of International Conference on Learning Representations (ICLR) (2014)
[47] Vapnik, V., The Nature of Statistical Learning Theory (2000), Springer-Verlag · Zbl 0934.62009
[48] Vapnik, V.; Chapelle, O., Bounds on error expectation for support vector machines, Neural Comput., 12, 9, 2013-2036 (2000)
[49] Viskontas, I.; Quian Quiroga, R.; Fried, I., Human medial temporal lobe neurons respond preferentially to personally relevant images, Proc. Nat. Acad. Sci., 106, 50, 21329-21334 (2009)
[50] Wang, D., Editorial: randomized algorithms for training neural networks, Inf. Sci., 364-365, 126-128 (2016)
[51] Zheng, S.; Song, Y.; Leung, T.; Goodfellow, I., Improving the robustness of deep neural networks via stability training, Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.