Deep UQ: learning deep neural network surrogate models for high dimensional uncertainty quantification. (English) Zbl 1419.68084
Summary: State-of-the-art computer codes for simulating real physical systems are often characterized by a vast number of input parameters. Performing uncertainty quantification (UQ) tasks with Monte Carlo (MC) methods is almost always infeasible because of the need to perform hundreds of thousands, or even millions, of forward model evaluations in order to obtain convergent statistics. One thus tries to construct a cheap-to-evaluate surrogate model to replace the forward model solver. For systems with a large number of input parameters, one has to address the curse of dimensionality through suitable dimensionality reduction techniques. A popular class of dimensionality reduction methods comprises those that attempt to recover a low-dimensional representation of the high-dimensional feature space. However, such methods often tend to overestimate the intrinsic dimensionality of the input feature space. In this work, we demonstrate the use of deep neural networks (DNNs) to construct surrogate models for numerical simulators. We parameterize the structure of the DNN in a manner that lends the DNN surrogate the interpretation of recovering a low-dimensional nonlinear manifold. The model response is a parameterized nonlinear function of the low-dimensional projections of the input. We think of this low-dimensional manifold as a nonlinear generalization of the notion of the active subspace. Our approach is demonstrated on a problem of uncertainty propagation in a stochastic elliptic partial differential equation (SPDE) with an uncertain diffusion coefficient. We deviate from traditional formulations of the SPDE problem by lifting the assumption of fixed length scales of the uncertain diffusion field. Instead, we attempt to solve the more challenging problem of learning a map between an arbitrary snapshot of the diffusion field and the response.
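The architecture the summary describes, a projection of the high-dimensional input onto a low-dimensional manifold followed by a parameterized nonlinear map to the response, can be made concrete with a small sketch. The following is a minimal illustration in PyTorch, not the authors' implementation: the latent dimension, layer widths, activation choice, and training data below are hypothetical placeholders.

```python
import torch
import torch.nn as nn

class DeepUQSurrogate(nn.Module):
    """Sketch of a DNN surrogate with an explicit low-dimensional bottleneck.

    The first (encoder) layer projects the D-dimensional input onto an
    h-dimensional latent space, an analogue of an active subspace; the
    remaining layers form a parameterized nonlinear link function mapping
    the projections to the scalar model response.
    """
    def __init__(self, input_dim: int, latent_dim: int, hidden: int = 64):
        super().__init__()
        # Linear projection onto the low-dimensional coordinates.
        self.encoder = nn.Linear(input_dim, latent_dim, bias=False)
        # Nonlinear map from the projections to the response.
        self.link = nn.Sequential(
            nn.Linear(latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.link(self.encoder(x))

# Hypothetical usage: fit the surrogate on (input snapshot, response) pairs
# gathered from a limited number of forward-model runs.
model = DeepUQSurrogate(input_dim=1024, latent_dim=10)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = torch.randn(128, 1024), torch.randn(128, 1)  # placeholder data
for _ in range(100):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    opt.step()
```

With a purely linear encoder, the learned row space of `encoder.weight` plays the role of an active subspace; replacing the encoder with a stack of nonlinear layers yields the nonlinear-manifold generalization the summary refers to.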

MSC:
68T05 Learning and adaptive systems in artificial intelligence
35R60 PDEs with randomness, stochastic partial differential equations
65N99 Numerical methods for partial differential equations, boundary value problems