Deep ReLU networks and high-order finite element methods. (English) Zbl 1452.65354

Summary: Approximation rate bounds for emulations of real-valued functions on intervals by deep neural networks (DNNs) are established. The approximation results are given for DNNs based on ReLU activation functions. The approximation error is measured with respect to Sobolev norms. It is shown that ReLU DNNs allow for essentially the same approximation rates as nonlinear, variable-order, free-knot (or so-called “\(hp\)-adaptive”) spline approximations and spectral approximations, for a wide range of Sobolev and Besov spaces. In particular, exponential convergence rates in terms of the DNN size for univariate, piecewise Gevrey functions with point singularities are established. Combined with recent results on ReLU DNN approximation of rational, oscillatory, and high-dimensional functions, this corroborates that continuous, piecewise affine ReLU DNNs afford algebraic and exponential convergence rate bounds which are comparable to “best in class” schemes for several important function classes of high and infinite smoothness. Using composition of DNNs, we also prove that radial-like functions obtained as compositions of the above with the Euclidean norm and, possibly, anisotropic affine changes of co-ordinates can be emulated at exponential rate in terms of the DNN size and depth without the curse of dimensionality.
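The exponential rates mentioned in the summary can be made concrete with a small illustration. The following is a minimal sketch, not the construction of the paper under review: it uses the well-known sawtooth-composition emulation of \(x^2\) on \([0,1]\) commonly attributed to Yarotsky (cf. [55] in the reference list). The function names `relu`, `hat`, `square_approx` and `max_error` are hypothetical and chosen only for this example. Each composition step adds a fixed number of ReLU units, while the sup-norm error drops by a factor of four, so the error decays exponentially in the network size.

```python
def relu(x):
    """ReLU activation: max(0, x)."""
    return max(0.0, x)


def hat(x):
    """Hat function on [0, 1] built from three ReLU units.

    It vanishes at x = 0 and x = 1 and equals 1 at x = 0.5.
    """
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5) + 2.0 * relu(x - 1.0)


def square_approx(x, m):
    """Continuous, piecewise affine approximation of x**2 on [0, 1].

    Uses m nested hat functions (an illustrative depth parameter); the result
    is the linear interpolant of x**2 on 2**m uniform subintervals of [0, 1].
    """
    approx = x
    g = x
    for s in range(1, m + 1):
        g = hat(g)               # s-fold composition: sawtooth with 2**(s-1) teeth
        approx -= g / 4.0 ** s   # correct the interpolant at the next dyadic level
    return approx


def max_error(m, samples=4097):
    """Maximum error of square_approx(., m) on a uniform dyadic grid of [0, 1]."""
    grid = [i / (samples - 1) for i in range(samples)]
    return max(abs(square_approx(x, m) - x * x) for x in grid)


if __name__ == "__main__":
    for m in range(1, 7):
        # The error of linear interpolation of x**2 on cells of width 2**-m
        # is 2**(-2*m - 2).
        print(m, max_error(m), 2.0 ** (-2 * m - 2))
```

In this sketch the number of ReLU units grows linearly in `m`, whereas the error bound \(2^{-2m-2}\) shrinks geometrically; this is the mechanism behind exponential-in-size rate bounds of the kind the summary refers to, stated here only for the single function \(x^2\).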


65N30 Finite element, Rayleigh-Ritz and Galerkin methods for boundary value problems involving PDEs
65D07 Numerical computation using splines
65N12 Stability and convergence of numerical methods for boundary value problems involving PDEs
41A25 Rate of convergence, degree of approximation
41A46 Approximation by arbitrary nonlinear expressions; widths and entropy
35B65 Smoothness and regularity of solutions to PDEs
35R02 PDEs on graphs and networks (ramified or polygonal spaces)
68T07 Artificial neural networks and deep learning
92B20 Neural networks for/in biological studies, artificial life and related topics




[1] Barron, A. R., Complexity regularization with application to artificial neural networks, in Nonparametric Functional Estimation and Related Topics (NATO Advanced Science Institutes Series C: Mathematical and Physical Sciences, Vol. 335) (Kluwer Academic Publishers, Dordrecht, 1991), pp. 561-576. · Zbl 0739.62001
[2] Barron, A. R., Universal approximation bounds for superpositions of a sigmoidal function, IEEE Trans. Inform. Theory, 39(3) (1993) 930-945. · Zbl 0818.68126
[3] Beck, C., Becker, S., Grohs, P., Jaafari, N. and Jentzen, A., Solving stochastic differential equations and Kolmogorov equations by means of deep learning, Technical Report (2018), arXiv:1806.00421.
[4] Beck, C., E, W. and Jentzen, A., Machine learning approximation algorithms for high-dimensional fully nonlinear partial differential equations and second-order backward stochastic differential equations, J. Nonlinear Sci. 29(4) (2019) 1563. · Zbl 1442.91116
[5] Bölcskei, H., Grohs, P., Kutyniok, G. and Petersen, P., Optimal approximation with sparsely connected deep neural networks, SIAM J. Math. Data Sci. 1(1) (2019) 8-45. · Zbl 1499.41029
[6] Cheng, G. and Shcherbakov, V., Anisotropic radial basis function methods for continental size ice sheet simulations, J. Comput. Phys. 372 (2018) 161-177. · Zbl 1415.76497
[7] Chernov, A., von Petersdorff, T. and Schwab, Ch., Exponential convergence of \(hp\) quadrature for integral operators with Gevrey kernels, ESAIM Math. Mod. Num. Anal. 45 (2011) 387-422. · Zbl 1269.65143
[8] Chui, C. K., Li, X. and Mhaskar, H. N., Neural networks for localized approximation, Math. Comput. 63(208) (1994) 607-623. · Zbl 0806.41020
[9] Chui, C. K., Lin, S.-B. and Zhou, D.-X., Construction of neural networks for realization of localized deep learning, Front. Appl. Math. Statist. 4 (2018).
[10] Chui, C. K., Lin, S.-B. and Zhou, D.-X., Deep neural networks for rotation-invariance approximation and learning, Anal. Appl. 17(5) (2019) 737-772. · Zbl 1423.68378
[11] Chui, C. K. and Mhaskar, H. N., Deep nets for local manifold learning, Front. Appl. Math. Statist. 4 (2018).
[12] Dahmen, W. and Scherer, K., Best approximation by piecewise polynomials with variable knots and degrees, J. Approx. Theory 26(1) (1979) 1-13. · Zbl 0407.41010
[13] Devaud, D., \(hp\)-approximation of linear parabolic evolution problems in \(H^{1 / 2}\), PhD thesis, ETH Zürich, Dissertation 24393 (2017).
[14] DeVore, R. A. and Lorentz, G. G., Constructive Approximation, Vol. 303 (Springer-Verlag, Berlin, 1993). · Zbl 0797.41016
[15] E, W. and Wang, Q., Exponential convergence of the deep neural network approximation for analytic functions, Sci. China Math. 61(10) (2018) 1733-1740. · Zbl 1475.65007
[16] E, W. and Yu, B., The Deep Ritz method: A deep learning-based numerical algorithm for solving variational problems, Commun. Math. Stat. 6(1) (2018) 1-12. · Zbl 1392.35306
[17] Elbrächter, D., Grohs, P., Jentzen, A. and Schwab, Ch., DNN expression rate analysis of high-dimensional PDEs: Application to option pricing, Technical Report (2018).
[18] Erdélyi, A., Magnus, W., Oberhettinger, F. and Tricomi, F. G., Higher Transcendental Functions. II (Robert E. Krieger Publishing Co., Inc., Melbourne, 1981), Based on notes left by Harry Bateman, Reprint of the 1953 original. · Zbl 0052.29502
[19] Grohs, P., Perekrestenko, D., Elbrächter, D. and Bölcskei, H., Deep neural network approximation theory, Technical Report (2019), arXiv:1901.02220.
[20] Grohs, P., Wiatowski, T. and Bölcskei, H., Deep convolutional neural networks on cartoon functions, Technical Report 2016-25, Seminar for Applied Mathematics, ETH Zürich (2016).
[21] Gui, W. and Babuška, I., The \(h\), \(p\) and \(h\)-\(p\) versions of the finite element method in 1 dimension. II. The error analysis of the \(h\)- and \(h\)-\(p\) versions, Numer. Math. 49(6) (1986) 613-657. · Zbl 0614.65089
[22] Han, J., Jentzen, A. and E, W., Solving high-dimensional partial differential equations using deep learning, Proc. Natl. Acad. Sci. USA 115(34) (2018) 8505-8510. · Zbl 1416.35137
[23] He, J., Li, L., Xu, J. and Zheng, C., ReLU deep neural networks and linear finite elements, arXiv:1807.03973 (2018).
[24] Hornik, K., Stinchcombe, M. and White, H., Multilayer feedforward networks are universal approximators, Neural Netw. 2(5) (1989) 359-366. · Zbl 1383.92015
[25] LeCun, Y., Bengio, Y. and Hinton, G., Deep learning, Nature 521(7553) (2015) 436-444.
[26] Liang, S. and Srikant, R., Why deep neural networks for function approximation?, in Proc. of ICLR 2017 (2017), pp. 1-17.
[27] Lin, S.-B., Generalization and expressivity for deep nets, IEEE Trans. Neural Netw. Learn. Syst. 30(5) (2019) 1392-1406.
[28] McCane, B. and Szymanski, L., Deep radial kernel networks: Approximating radially symmetric functions with deep networks, Technical Report (2017), arXiv:1703.03470.
[29] McCane, B. and Szymanski, L., Efficiency of deep networks for radially symmetric functions, Neurocomputing, 313 (2018) 119-124.
[30] Melenk, J. and Schwab, Ch., \(hp\) FEM for reaction-diffusion equations. I. Robust exponential convergence, SIAM J. Numer. Anal. 35(4) (1998) 1520-1557. · Zbl 0972.65093
[31] Melenk, J. and Schwab, Ch., Analytic regularity for a singularly perturbed problem, SIAM J. Math. Anal. 30(2) (1999) 379-400. · Zbl 1023.35009
[32] Mhaskar, H. N., Approximation properties of a multilayered feedforward artificial neural network, Adv. Comput. Math. 1(1) (1993) 61-80. · Zbl 0824.41011
[33] Mhaskar, H. N., Neural networks for optimal approximation of smooth and analytic functions, Neural Comput. 8 (1996) 164-177.
[34] Montanelli, H. and Du, Q., New error bounds for deep ReLU networks using sparse grids, SIAM J. Math. Data Sci. 1(1) (2019) 78-92. · Zbl 1513.68054
[35] Opschoor, J. A. A., Petersen, P. and Schwab, Ch., Deep ReLU networks and multivariate high-order finite element methods (2020), in preparation.
[36] Opschoor, J. A. A., Schwab, Ch. and Zech, J., Exponential ReLU DNN expression of holomorphic maps in high dimension, Technical Report 2019-35, Seminar for Applied Mathematics, ETH Zürich, Switzerland (2019).
[37] Oswald, P., On the degree of nonlinear spline approximation in Besov-Sobolev spaces, J. Approx. Theory 61(2) (1990) 131-157. · Zbl 0742.41017
[38] Pang, G., Lu, L. and Karniadakis, G. E., fPINNs: Fractional physics-informed neural networks, Technical Report (2018), arXiv:1811.08967. · Zbl 1420.35459
[39] Petersen, P. and Voigtlaender, F., Optimal approximation of piecewise smooth functions using deep ReLU neural networks, Neural Netw. 108 (2018) 296-330. · Zbl 1434.68516
[40] Petrushev, P. P., Direct and converse theorems for spline and rational approximation and Besov spaces, in Function Spaces and Applications (Springer, 1988), pp. 363-377. · Zbl 0663.41012
[41] Pinkus, A., Approximation theory of the MLP model in neural networks, in Acta Numerica, Vol. 8 (Cambridge University Press, Cambridge, 1999), pp. 143-195. · Zbl 0959.68109
[42] Robbins, H., A Remark on Stirling’s Formula, Amer. Math. Mon. 62(1) (1955) 26-29. · Zbl 0068.05404
[43] Rolnick, D. and Tegmark, M., The power of deeper networks for expressing natural functions, in 6th Int. Conf. Learning Representations, Vancouver, BC, Canada (2018).
[44] Scherer, K., On optimal global error bounds obtained by scaled local error estimates, Numer. Math. 36(2) (1980/81) 151-176. · Zbl 0495.65006
[45] Schötzau, D. and Schwab, Ch., Exponential convergence of \(hp\)-FEM for elliptic problems in polyhedra: Mixed boundary conditions and anisotropic polynomial degrees, Found. Comput. Math. 18(3) (2018) 595-660. · Zbl 1402.65166
[46] Schwab, Ch., \(p\)- and \(hp\)-Finite Element Methods (Oxford University Press, New York, 1998). · Zbl 0910.73003
[47] Schwab, Ch. and Suri, M., The \(p\) and \(hp\) versions of the finite element method for problems with boundary layers, Math. Comp. 65(216) (1996) 1403-1429. · Zbl 0853.65115
[48] Schwab, Ch. and Zech, J., Deep learning in high dimension: Neural network expression rates for generalized polynomial chaos expansions in UQ, Anal. Appl. (Singap.), 17(1) (2019) 19-55. · Zbl 1478.68309
[49] Shaham, U., Cloninger, A. and Coifman, R. R., Provable approximation properties for deep neural networks, Appl. Comput. Harmon. Anal. 44(3) (2018) 537-557. · Zbl 1390.68553
[50] Sirignano, J. and Spiliopoulos, K., DGM: A deep learning algorithm for solving partial differential equations, J. Comput. Phys. 375 (2018) 1339-1364. · Zbl 1416.65394
[51] Telgarsky, M., Neural networks and rational functions, in Proc. 34th Int. Conf. Machine Learning, Sydney, Australia (2017).
[52] Triebel, H., Interpolation Theory, Function Spaces, Differential Operators, 2nd edn. (Johann Ambrosius Barth, Heidelberg, 1995). · Zbl 0830.46028
[53] Triebel, H., Theory of Function Spaces (Springer, Basel, 2010). · Zbl 1235.46002
[54] Wendland, H., Scattered Data Approximation, Vol. 17 (Cambridge University Press, Cambridge, 2005). · Zbl 1075.65021
[55] Yarotsky, D., Error bounds for approximations with deep ReLU networks, Neural Netw. 94 (2017) 103-114. · Zbl 1429.68260
[56] Yarotsky, D., Optimal approximation of continuous functions by very deep ReLU networks, in Proc. 31st Conf. Learning Theory, Proc. Machine Learning Research, Vol. 75 (PMLR, 06-09 Jul 2018), pp. 639-649.