Networks and the best approximation property. (English) Zbl 0714.94029

Networks can be regarded as approximation schemes. It is well known that multilayer networks of the perceptron type can approximate continuous functions arbitrarily well. A similar result is proven for networks derived from regularization theory, including radial basis function networks. From the point of view of approximation theory, however, the ability to approximate continuous functions arbitrarily well is not sufficient to characterize a good approximation scheme. More critical is the best approximation property, that is, the existence of an element at minimum distance from the function to be approximated. In this paper it is shown that multilayer perceptron networks of the type used in backpropagation do not have this property. For regularization networks (in particular radial basis function networks), existence and uniqueness of the best approximation follow immediately from the linearity of the theory.
Reviewer: F. Girosi
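The contrast drawn in the review can be illustrated numerically. The following is a minimal sketch under stated assumptions, not the construction used in the paper: the grid, the target functions, the Gaussian width, the fixed centers and all function names are hypothetical choices made for illustration. The first part treats a radial basis function network with fixed centers as a linear scheme, so a least-squares best approximation on the grid can be computed directly. The second part uses a two-unit sigmoid network whose output approaches the derivative of the sigmoid only as the outer weights grow without bound, which is one way an infimum can fail to be attained.

```python
import numpy as np


def sigmoid(x):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + np.exp(-x))


def rbf_design(x, centers, width):
    """Gaussian basis functions with FIXED centers and width (assumed values)."""
    return np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2.0 * width ** 2))


def best_rbf_fit(x, y, centers, width):
    """Least-squares best approximation from the span of the fixed basis.

    The approximants form a finite-dimensional linear subspace, so a
    minimizer exists; its coefficients are unique when the design matrix
    has full column rank.
    """
    design = rbf_design(x, centers, width)
    coeffs, _, rank, _ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs, rank


def two_unit_sigmoid_error(x, h):
    """Sup-norm error on the grid between sigmoid'(x) and the two-unit
    network (sigmoid(x + h) - sigmoid(x)) / h, whose outer weights are +-1/h."""
    target = sigmoid(x) * (1.0 - sigmoid(x))
    approx = (sigmoid(x + h) - sigmoid(x)) / h
    return np.max(np.abs(approx - target))


# Part 1: linear scheme, best approximation exists and is computable.
x = np.linspace(-3.0, 3.0, 200)
y = np.sin(x)  # illustrative target only
centers = np.linspace(-3.0, 3.0, 7)
coeffs, rank = best_rbf_fit(x, y, centers, width=1.0)
design = rbf_design(x, centers, 1.0)
residual = np.linalg.norm(design @ coeffs - y)

# Part 2: nonlinear scheme, error shrinks only as the weights blow up.
steps = [1.0, 0.1, 0.01]
errors = [two_unit_sigmoid_error(x, h) for h in steps]
outer_weights = [1.0 / h for h in steps]

for h, err, w in zip(steps, errors, outer_weights):
    print(f"h={h:<5} outer weight={w:<6} sup error={err:.5f}")
```

In the second part the error tends to zero as `h` shrinks, yet the derivative of the sigmoid is not itself a finite combination of two sigmoid units, so no choice of finite weights reaches distance zero. In the first part, moving the coefficients away from the least-squares solution cannot reduce the residual.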


94C99 Circuits, networks

