zbMATH — the first resource for mathematics

A multiscale neural network based on hierarchical nested bases. (English) Zbl 07096701
Summary: In recent years, deep learning has led to impressive results in many fields. In this paper, we introduce a multiscale artificial neural network for high-dimensional nonlinear maps based on the idea of hierarchical nested bases in the fast multipole method and the \(\mathcal{H}^2\)-matrices. This approach allows us to efficiently approximate discretized nonlinear maps arising from partial differential equations or integral equations. It also naturally extends our recent work based on the generalization of hierarchical matrices (Fan et al. arXiv:1807.01883), but with a reduced number of parameters. In particular, the number of parameters of the neural network grows linearly with the dimension of the parameter space of the discretized PDE. We demonstrate the properties of the architecture by approximating the solution maps of the nonlinear Schrödinger equation, the radiative transfer equation, and the Kohn-Sham map.
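The multiscale idea summarized above can be illustrated with a minimal sketch, not the authors' code: the input is restricted to a hierarchy of coarser grids, a small map acts at each scale, and the per-scale results are interpolated back to the fine grid and summed. The averaging/duplication transfer operators and the dense per-level blocks below are illustrative assumptions standing in for the trained nested-basis components.

```python
import numpy as np

rng = np.random.default_rng(0)

def restrict(v):
    """Coarsen by averaging adjacent pairs (assumes even length)."""
    return 0.5 * (v[0::2] + v[1::2])

def interpolate(v):
    """Refine by duplicating each entry."""
    return np.repeat(v, 2)

def multiscale_apply(v, weights):
    """Apply one small map per scale and sum the contributions.

    weights[l] is an (n_l, n_l) matrix acting on the level-l restriction
    of the input; it stands in for a trained nested-basis block.
    """
    out = np.zeros_like(v)
    current = v
    for W in weights:
        contrib = W @ current              # per-scale map
        while contrib.shape[0] < v.shape[0]:
            contrib = interpolate(contrib)  # back to the finest grid
        out += contrib
        current = restrict(current)         # next coarser level
    return out

n, levels = 16, 3
weights = [rng.standard_normal((n >> l, n >> l)) * 0.1 for l in range(levels)]
v = rng.standard_normal(n)
u = multiscale_apply(v, weights)
print(u.shape)  # (16,)
```

In the paper's architecture the per-level blocks are replaced by locally connected layers with shared structure across levels, which is what keeps the parameter count linear in the problem dimension; the sketch uses dense blocks only for brevity.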

MSC:
65 Numerical analysis
37 Dynamical systems and ergodic theory
References:
[1] Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., Devin, M., Ghemawat, S., Irving, G., Isard, M., et al.: Tensorflow: A system for large-scale machine learning. In: OSDI, vol. 16, pp. 265-283. USENIX Association (2016)
[2] Anglin, JR; Ketterle, W., Bose-Einstein condensation of atomic gases, Nature, 416, 211, (2002)
[3] Araya-Polo, M.; Jennings, J.; Adler, A.; Dahlke, T., Deep-learning tomography, Lead. Edge, 37, 58-66, (2018)
[4] Badrinarayanan, V.; Kendall, A.; Cipolla, R., SegNet: A deep convolutional encoder-decoder architecture for image segmentation, IEEE Trans. Pattern Anal. Mach. Intell., 39, 2481-2495, (2017)
[5] Bao, W.; Du, Q., Computing the ground state solution of Bose-Einstein condensates by a normalized gradient flow, SIAM J. Sci. Comput., 25, 1674-1697, (2004) · Zbl 1061.82025
[6] Beck, C., E, W., Jentzen, A.: Machine learning approximation algorithms for high-dimensional fully nonlinear partial differential equations and second-order backward stochastic differential equations. arXiv:1709.05963 (2017)
[7] Berg, J., Nyström, K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. arXiv:1711.06464 (2017)
[8] Börm, S.; Grasedyck, L.; Hackbusch, W., Introduction to hierarchical matrices with applications, Eng. Anal. Bound. Elem., 27, 405-422, (2003) · Zbl 1035.65042
[9] Bruna, J.; Mallat, S., Invariant scattering convolution networks, IEEE Trans. Pattern Anal. Mach. Intell., 35, 1872-1886, (2013)
[10] Chan, S.; Elsheikh, AH, A machine learning approach for efficient uncertainty quantification using multiscale methods, J. Comput. Phys., 354, 493-511, (2018) · Zbl 1380.65331
[11] Chaudhari, P., Oberman, A., Osher, S., Soatto, S., Carlier, G.: Partial differential equations for training deep neural networks. In: 2017 51st Asilomar Conference on Signals, Systems, and Computers, pp. 1627-1631 (2017)
[12] Chen, LC; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, AL, DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs, IEEE Trans. Pattern Anal. Mach. Intell., 40, 834-848, (2018)
[13] Chollet, F., et al.: Keras. https://keras.io (2015). Accessed April 30, 2018
[14] Cohen, N., Sharir, O., Shashua, A.: On the expressive power of deep learning: a tensor analysis. arXiv:1509.05009 (2018)
[15] E, W.; Han, J.; Jentzen, A., Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations, Commun. Math. Stat., 5, 349-380, (2017) · Zbl 1382.65016
[16] Fan, Y.; An, J.; Ying, L., Fast algorithms for integral formulations of steady-state radiative transfer equation, J. Comput. Phys., 380, 191-211, (2019)
[17] Fan, Y., Lin, L., Ying, L., Zepeda-Núñez, L.: A multiscale neural network based on hierarchical matrices. arXiv:1807.01883 (2018)
[18] Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press, Cambridge (2016) · Zbl 1373.68009
[19] Greengard, L.; Rokhlin, V., A fast algorithm for particle simulations, J. Comput. Phys., 73, 325-348, (1987) · Zbl 0629.65005
[20] Hackbusch, W., A sparse matrix arithmetic based on \(\cal{H}\)-matrices. Part I: introduction to \(\cal{H}\)-matrices, Computing, 62, 89-108, (1999) · Zbl 0927.65063
[21] Hackbusch, W.; Khoromskij, BN, A sparse \(\cal{H}\)-matrix arithmetic: general complexity estimates, J. Comput. Appl. Math., 125, 479-501, (2000) · Zbl 0977.65036
[22] Hackbusch, W., Khoromskij, B.N., Sauter, S.: On \(\cal{H}^2\)-Matrices. Lectures on Applied Mathematics. Springer, Berlin (2000)
[23] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770-778 (2016)
[24] Hinton, G.; Deng, L.; Yu, D.; Dahl, GE; Mohamed, Ar; Jaitly, N.; Senior, A.; Vanhoucke, V.; Nguyen, P.; Sainath, TN; Kingsbury, B., Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups, IEEE Signal Process. Mag., 29, 82-97, (2012)
[25] Hohenberg, P.; Kohn, W., Inhomogeneous electron gas, Phys. Rev., 136, b864, (1964)
[26] Hornik, K., Approximation capabilities of multilayer feedforward networks, Neural Netw., 4, 251-257, (1991)
[27] Khoo, Y., Lu, J., Ying, L.: Solving parametric PDE problems with artificial neural networks. arXiv:1707.03351 (2017)
[28] Khrulkov, V., Novikov, A., Oseledets, I.: Expressive power of recurrent neural networks. arXiv:1711.00811 (2018) · Zbl 06856780
[29] Klose, AD; Netz, U.; Beuthan, J.; Hielscher, AH, Optical tomography using the time-independent equation of radiative transfer—part 1: forward model, J. Quant. Spectrosc. Radiat. Transf., 72, 691-713, (2002)
[30] Koch, R.; Becker, R., Evaluation of quadrature schemes for the discrete ordinates method, J. Quant. Spectrosc. Radiat. Transf., 84, 423-435, (2004)
[31] Kohn, W.; Sham, LJ, Self-consistent equations including exchange and correlation effects, Phys. Rev., 140, a1133, (1965)
[32] Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Proceedings of the 25th International Conference on Neural Information Processing Systems—Volume 1, NIPS’12, pp. 1097-1105, USA, Curran Associates Inc (2012)
[33] LeCun, Y.; Bengio, Y.; Hinton, G., Deep learning, Nature, 521, 436-444, (2015)
[34] Leung, MKK; Xiong, HY; Lee, LJ; Frey, BJ, Deep learning of the tissue-regulated splicing code, Bioinformatics, 30, i121-i129, (2014)
[35] Li, Y., Cheng, X., Lu, J.: Butterfly-Net: Optimal function representation based on convolutional neural networks. arXiv:1805.07451 (2018)
[36] Lin, L.; Lu, J.; Ying, L., Fast construction of hierarchical matrix representation from matrix-vector multiplication, J. Comput. Phys., 230, 4071-4087, (2011) · Zbl 1218.65038
[37] Litjens, G.; Kooi, T.; Bejnordi, BE; Setio, AAA; Ciompi, F.; Ghafoorian, M.; Laak, JAWM; Ginneken, B.; Sánchez, CI, A survey on deep learning in medical image analysis, Med. Image Anal., 42, 60-88, (2017)
[38] Ma, J.; Sheridan, RP; Liaw, A.; Dahl, GE; Svetnik, V., Deep neural nets as a method for quantitative structure-activity relationships, J. Chem. Inf. Model., 55, 263-274, (2015)
[39] Marshak, A., Davis, A.: 3D Radiative Transfer in Cloudy Atmospheres. Springer, Berlin (2005)
[40] Mhaskar, H., Liao, Q., Poggio, T.: Learning functions: when is deep better than shallow. arXiv:1603.00988 (2018)
[41] Paschalis, P.; Giokaris, ND; Karabarbounis, A.; Loudos, G.; Maintas, D.; Papanicolas, C.; Spanoudaki, V.; Tsoumpas, C.; Stiliaris, E., Tomographic image reconstruction using artificial neural networks, Nucl. Instrum. Methods Phys. Res. Sect. A Accel. Spectrom. Detect. Assoc. Equip., 527, 211-215, (2004)
[42] Pitaevskii, L., Vortex lines in an imperfect Bose gas, Sov. Phys. JETP, 13, 451-454, (1961)
[43] Pomraning, G.C.: The Equations of Radiation Hydrodynamics. Courier Corporation, Chelmsford (1973)
[44] Raissi, M.; Karniadakis, GE, Hidden physics models: machine learning of nonlinear partial differential equations, J. Comput. Phys., 357, 125-141, (2018) · Zbl 1381.68248
[45] Ren, K., Zhang, R., Zhong, Y.: A fast algorithm for radiative transport in isotropic media. arXiv:1610.00835 (2016)
[46] Ronneberger, O.; Fischer, P.; Brox, T.; Navab, N. (ed.); Hornegger, J. (ed.); Wells, WM (ed.); Frangi, AF (ed.), U-Net: Convolutional networks for biomedical image segmentation, 234-241, (2015), Cham
[47] Rudd, K.; Muro, GD; Ferrari, S., A constrained backpropagation approach for the adaptive solution of partial differential equations, IEEE Trans. Neural Netw. Learn. Syst., 25, 571-584, (2014)
[48] Sarikaya, R.; Hinton, GE; Deoras, A., Application of deep belief networks for natural language understanding, IEEE/ACM Trans. Audio Speech Lang. Process., 22, 778-784, (2014)
[49] Schmidhuber, J., Deep learning in neural networks: an overview, Neural Netw., 61, 85-117, (2015)
[50] Silver, D.; Huang, A.; Maddison, CJ; Guez, LSA; Driessche, GVD; Schrittwieser, J.; Antonoglou, I.; Panneershelvam, V.; Lanctot, M., Mastering the game of Go with deep neural networks and tree search, Nature, 529, 484-489, (2016)
[51] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. Computing Research Repository (CoRR). arXiv:1409.1556 (2014)
[52] Socher, R., Bengio, Y., Manning, C.D.: Deep learning for NLP (without magic). In: The 50th Annual Meeting of the Association for Computational Linguistics, Tutorial Abstracts, vol. 5 (2012)
[53] Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. arXiv:1708.07469 (2018) · Zbl 1416.65394
[54] Sutskever, I.; Vinyals, O.; Le, QV; Ghahramani, Z. (ed.); Welling, M. (ed.); Cortes, C. (ed.); Lawrence, ND (ed.); Weinberger, KQ (ed.), Sequence to sequence learning with neural networks, No. 27, 3104-3112, (2014), New York
[55] Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A.: Going deeper with convolutions. Computing Research Repository (CoRR). arXiv:1409.4842 (2014)
[56] Dozat, T.: Incorporating Nesterov momentum into Adam. http://cs229.stanford.edu/proj2015/054_report.pdf (2015)
[57] Trefethen, L.: Spectral Methods in MATLAB. Society for Industrial and Applied Mathematics, Philadelphia (2000) · Zbl 0953.68643
[58] Tyrtyshnikov, E., Mosaic-skeleton approximations, Calcolo, 33, 47-57, (1998) · Zbl 0906.65048
[59] Ulyanov, D., Vedaldi, A., Lempitsky, V.: Deep image prior. arXiv:1711.10925 (2018)
[60] Wang, T., Wu, D.J., Coates, A., Ng, A.Y.: End-to-end text recognition with convolutional neural networks. In: Pattern Recognition (ICPR), 2012 21st International Conference on Pattern Recognition (ICPR2012), pp. 3304-3308 (2012)
[61] Wang, Y., Siu, C.W., Chung, E.T., Efendiev, Y., Wang, M.: Deep multiscale model learning. arXiv:1806.04830 (2018)
[62] Xiong, HY; etal., The human splicing code reveals new insights into the genetic determinants of disease, Science, 347, 1254806, (2015)
[63] Zeiler, MD; Fergus, R., Visualizing and understanding convolutional networks, 818-833, (2014), Cham
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.