Multilayer feedforward networks are universal approximators. (English) Zbl 1383.92015

Summary: This paper rigorously establishes that standard multilayer feedforward networks with as few as one hidden layer using arbitrary squashing functions are capable of approximating any Borel measurable function from one finite-dimensional space to another to any desired degree of accuracy, provided sufficiently many hidden units are available. In this sense, multilayer feedforward networks are a class of universal approximators.
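
As a rough numerical illustration of the statement in the summary (not the paper's proof technique, which is an existence argument), the following sketch fits a network with a single hidden layer of logistic units to a smooth target on [-1, 1] and reports the largest error on a grid for several hidden-layer sizes. The logistic function is one example of a squashing function. The target sin(3πx), the random choice of hidden weights, the least-squares fit of the output weights, and all function names are assumptions made for this example only.

```python
import numpy as np


def logistic(z):
    """Logistic squashing function: nondecreasing, with limits 0 and 1."""
    return 1.0 / (1.0 + np.exp(-z))


def fit_one_hidden_layer(x, y, n_hidden, rng):
    """Fit y ~ sum_j beta_j * logistic(w_j * x + b_j) + beta_0.

    Assumption for illustration: hidden weights w_j and biases b_j are
    drawn at random and kept fixed; only the output weights beta are
    fitted, by ordinary least squares.
    """
    w = rng.normal(scale=10.0, size=n_hidden)
    b = rng.uniform(-10.0, 10.0, size=n_hidden)
    hidden = logistic(np.outer(x, w) + b)            # (n_samples, n_hidden)
    design = np.column_stack([hidden, np.ones_like(x)])  # add output bias
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return design @ beta


def max_error(n_hidden, seed=0):
    """Largest absolute error on a grid of 400 points in [-1, 1]."""
    rng = np.random.default_rng(seed)
    x = np.linspace(-1.0, 1.0, 400)
    y = np.sin(3.0 * np.pi * x)                      # hypothetical target
    y_hat = fit_one_hidden_layer(x, y, n_hidden, rng)
    return float(np.max(np.abs(y - y_hat)))


if __name__ == "__main__":
    for n in (2, 10, 50):
        print(f"{n:3d} hidden units: max grid error = {max_error(n):.4f}")
```

The error is measured only on the fitting grid, so this shows the trend described in the summary (more hidden units allow a closer fit) rather than a uniform bound. The theorem itself guarantees that suitable weights exist; it does not say how to find them or how many units a given accuracy requires.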


92B20 Neural networks for/in biological studies, artificial life and related topics
41A05 Interpolation in approximation theory

