Simulator-free solution of high-dimensional stochastic elliptic partial differential equations using deep neural networks. (English) Zbl 1453.65021

Summary: Stochastic partial differential equations (SPDEs) are ubiquitous in engineering and the computational sciences. The stochasticity arises as a consequence of uncertainty in input parameters, constitutive relations, initial/boundary conditions, etc. Because of these functional uncertainties, the stochastic parameter space is often high-dimensional, requiring hundreds, or even thousands, of parameters to describe it. This poses an insurmountable challenge to response-surface modeling, since the number of forward model evaluations needed to construct an accurate surrogate grows exponentially with the dimension of the uncertain parameter space, a phenomenon referred to as the curse of dimensionality. State-of-the-art methods for high-dimensional uncertainty propagation seek to alleviate the curse of dimensionality by performing dimensionality reduction in the uncertain parameter space. However, one still needs to perform forward model evaluations that potentially carry a very high computational burden. We propose a novel methodology for high-dimensional uncertainty propagation of elliptic SPDEs which lifts the requirement for a deterministic forward solver. Our approach is as follows. We parameterize the solution of the elliptic SPDE using a deep residual network (ResNet). In a departure from the traditional squared-residual (SR) loss function for training the ResNet, we introduce a physics-informed loss function derived from variational principles. Specifically, our loss function is the expectation of the energy functional of the PDE over the stochastic variables. We demonstrate our solver-free approach on several examples in which the elliptic SPDE is subjected to different types of high-dimensional input uncertainties, and we solve high-dimensional uncertainty propagation and inverse problems.
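The variational loss described above can be illustrated on a toy problem. The following is a minimal, self-contained sketch (not the authors' code; all names and the specific coefficient model are illustrative assumptions): for the 1-D elliptic model problem -(a(xi) u')' = f on (0,1) with u(0) = u(1) = 0, f = 1, and a random coefficient a(xi) = 1 + 0.5 xi with xi ~ U(0,1), the loss is the expected energy E_xi[ integral of (1/2) a (u')^2 - f u dx ]. The exact stochastic solution minimizes this expectation, so it attains a lower loss than any candidate that ignores xi; this minimization property is what the ResNet parameterization is trained against in place of a squared-residual loss.

```python
# Illustrative sketch of the expected-energy loss for -(a u')' = 1 on (0,1),
# u(0) = u(1) = 0. Candidates have the form u(x) = c * x * (1 - x); the exact
# minimizer for each realization of a is c = 1 / (2a).
import random

def energy(a, c, n=200):
    """Energy J(u) = int 1/2 * a * u'^2 - u dx for u = c*x*(1-x), trapezoid rule."""
    h = 1.0 / n
    total = 0.0
    for i in range(n + 1):
        x = i * h
        du = c * (1.0 - 2.0 * x)           # u'(x)
        u = c * x * (1.0 - x)              # u(x)
        integrand = 0.5 * a * du * du - u  # f = 1
        w = 0.5 if i in (0, n) else 1.0    # trapezoidal end-point weights
        total += w * integrand * h
    return total

def expected_energy(candidate, n_mc=2000, seed=0):
    """Monte Carlo estimate of the loss E_xi[J(u(., xi))]."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(n_mc):
        a = 1.0 + 0.5 * rng.random()       # sample random coefficient a(xi)
        acc += energy(a, candidate(a))
    return acc / n_mc

exact = lambda a: 1.0 / (2.0 * a)  # responds to each realization of a
naive = lambda a: 0.5              # deterministic guess that ignores xi
print(expected_energy(exact), expected_energy(naive))
```

Because J(u) is minimized realization-by-realization at c = 1/(2a), the stochastic candidate `exact` yields a strictly lower expected energy than `naive`; a surrogate network minimizing this loss is driven toward the xi-dependent solution without any forward-solver calls.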


65C30 Numerical solutions to stochastic differential and integral equations
68T07 Artificial neural networks and deep learning
60H15 Stochastic partial differential equations (aspects of stochastic analysis)
35R60 PDEs with randomness, stochastic partial differential equations

