
Learning and meta-learning of stochastic advection-diffusion-reaction systems from sparse measurements. (English) Zbl 07441294

Summary: Physics-informed neural networks (PINNs) were recently proposed in [M. Raissi et al., J. Comput. Phys. 378, 686–707 (2019; Zbl 1415.68175)] as an alternative way to solve partial differential equations (PDEs). A neural network (NN) represents the solution, a PDE-induced NN is coupled to the solution NN, and all differential operators are treated using automatic differentiation. Here, we first employ the standard PINN and a stochastic version, sPINN, to solve forward and inverse problems governed by a nonlinear advection-diffusion-reaction (ADR) equation, assuming we have some sparse measurements of the concentration field at random or pre-selected locations. Subsequently, we attempt to optimise the hyper-parameters of sPINN by using the Bayesian optimisation method (meta-learning) and compare the results with the empirically selected hyper-parameters of sPINN. In particular, in the first part, for solving the inverse deterministic ADR problem, we assume that we only have a few high-fidelity measurements, whereas the rest of the data is of lower fidelity. Hence, the PINN is trained using a composite multi-fidelity network, first introduced in [X. Meng and G. E. Karniadakis, J. Comput. Phys. 401, Article ID 109020, 15 p. (2020; Zbl 1454.76006)], that learns the correlations between the multi-fidelity data and predicts the unknown values of diffusivity, transport velocity and two reaction constants as well as the concentration field. For the stochastic ADR, we employ a Karhunen-Loève (KL) expansion to represent the stochastic diffusivity, and arbitrary polynomial chaos (aPC) to represent the stochastic solution. Correspondingly, we design multiple NNs: one represents the mean of the solution and separate NNs learn each aPC mode, whereas one NN represents the mean of the diffusivity and another learns all modes of its KL expansion. For the inverse problem, in addition to the stochastic diffusivity and concentration fields, we also aim to obtain the (unknown) deterministic values of transport velocity and reaction constants. The available data correspond to 7 spatial points for the diffusivity and 20 space-time points for the solution, both sampled 2000 times. We obtain good accuracy for the deterministic parameters, of the order of 1–2%, and excellent accuracy for the mean and variance of the stochastic fields, better than three digits of accuracy. In the second part, we consider the previous stochastic inverse problem and use Bayesian optimisation to find five hyper-parameters of sPINN, namely the width, depth and learning rate of the two NNs for learning the modes. We obtain much deeper and wider optimal NNs compared to the manual tuning, leading to even better accuracy, i.e., errors of less than 1% for the deterministic values and about an order of magnitude smaller for the stochastic fields.
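For orientation, the following is a minimal sketch of the inverse-problem PINN workflow the summary describes, written in PyTorch. Everything concrete in it is an illustrative assumption rather than the paper's setup: the equation is a generic 1D nonlinear ADR form u_t + v u_x = D u_xx + k1 u - k2 u^2, the scalars v, D, k1, k2 stand in for the transport velocity, diffusivity and reaction constants, and the measurements are random placeholders for the sparse concentration data.

# Minimal PINN sketch for an inverse ADR problem (illustrative assumptions:
# the 1D equation u_t + v u_x = D u_xx + k1 u - k2 u^2, network sizes, and
# placeholder data are not the paper's exact setup).
import torch
import torch.nn as nn

torch.manual_seed(0)

class MLP(nn.Module):
    def __init__(self, width=32, depth=4):
        super().__init__()
        layers, dim = [], 2                      # inputs: (x, t)
        for _ in range(depth):
            layers += [nn.Linear(dim, width), nn.Tanh()]
            dim = width
        layers += [nn.Linear(dim, 1)]            # output: u(x, t)
        self.net = nn.Sequential(*layers)

    def forward(self, x, t):
        return self.net(torch.cat([x, t], dim=1))

u_net = MLP()
# Unknown PDE parameters, learned jointly with the solution network.
v  = torch.nn.Parameter(torch.tensor(0.5))
D  = torch.nn.Parameter(torch.tensor(0.1))
k1 = torch.nn.Parameter(torch.tensor(0.5))
k2 = torch.nn.Parameter(torch.tensor(0.5))

def pde_residual(x, t):
    # All derivatives come from automatic differentiation, as in PINNs.
    x = x.requires_grad_(True); t = t.requires_grad_(True)
    u = u_net(x, t)
    u_t  = torch.autograd.grad(u, t, torch.ones_like(u), create_graph=True)[0]
    u_x  = torch.autograd.grad(u, x, torch.ones_like(u), create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x, x, torch.ones_like(u_x), create_graph=True)[0]
    return u_t + v * u_x - D * u_xx - k1 * u + k2 * u**2

# Sparse measurements (random placeholders in lieu of real observations)
# and collocation points where the PDE residual is penalised.
x_d = torch.rand(20, 1); t_d = torch.rand(20, 1); u_d = torch.rand(20, 1)
x_c = torch.rand(200, 1); t_c = torch.rand(200, 1)

opt = torch.optim.Adam(list(u_net.parameters()) + [v, D, k1, k2], lr=1e-3)
for step in range(2000):
    opt.zero_grad()
    # Composite loss: data misfit at measurement points + PDE residual.
    loss = ((u_net(x_d, t_d) - u_d)**2).mean() + (pde_residual(x_c, t_c)**2).mean()
    loss.backward()
    opt.step()

The stochastic version (sPINN) described in the summary extends this pattern by attaching additional networks for the mean and the aPC modes of the solution and for the mean and KL modes of the diffusivity; the Bayesian-optimisation step then treats the widths, depths and learning rate of the mode networks as the hyper-parameter search space.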

MSC:

68T05 Learning and adaptive systems in artificial intelligence
35R30 Inverse problems for PDEs

References:

[1] Barajas-Solano, D. A. & Tartakovsky, A. M. (2019) Approximate Bayesian model inversion for PDEs with heterogeneous and state-dependent coefficients. J. Comput. Phys. 395, 247-262. · Zbl 1453.62410
[2] Bergstra, J. S., Bardenet, R., Bengio, Y. & Kégl, B. (2011) Algorithms for hyper-parameter optimization. In: Advances in Neural Information Processing Systems, pp. 2546-2554.
[3] Chen, T. Q., Rubanova, Y., Bettencourt, J. & Duvenaud, D. K. (2018) Neural ordinary differential equations. In: Advances in Neural Information Processing Systems, pp. 6571-6583.
[4] Chaudhari, P., Oberman, A., Osher, S., Soatto, S. & Carlier, G. (2018) Deep relaxation: partial differential equations for optimizing deep neural networks. Res. Math. Sci. 5(3), 30. · Zbl 1427.82032
[5] Falkner, S., Klein, A. & Hutter, F. (2018) BOHB: robust and efficient hyperparameter optimization at scale. arXiv preprint arXiv:1807.01774.
[6] Finn, C., Abbeel, P. & Levine, S. (2017) Model-agnostic meta-learning for fast adaptation of deep networks. In: Proceedings of the 34th International Conference on Machine Learning - Volume 70, JMLR.org, pp. 1126-1135.
[7] Han, J., Jentzen, A. & Weinan, E. (2018) Solving high-dimensional partial differential equations using deep learning. Proc. Natl. Acad. Sci. 115, 8505-8510. · Zbl 1416.35137
[8] He, Y., Lin, J., Liu, Z., Wang, H., Li, L.-J. & Han, S. (2018) AMC: AutoML for model compression and acceleration on mobile devices. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 784-800.
[9] Jaafra, Y., Laurent, J. L., Deruyver, A. & Naceur, M. S. (2018) A review of meta-reinforcement learning for deep neural networks architecture search. arXiv preprint arXiv:1812.07995.
[10] Li, L., Jamieson, K., Desalvo, G., Rostamizadeh, A. & Talwalkar, A. (2018) Hyperband: a novel bandit-based approach to hyperparameter optimization. J. Mach. Learn. Res. 18, 1-51. · Zbl 1468.68204
[11] Li, Y. & Osher, S. (2009) Coordinate descent optimization for l1 minimization with application to compressed sensing; a greedy algorithm. Inverse Problems and Imaging 3, 487-503. · Zbl 1188.90196
[12] Meng, X. & Karniadakis, G. E. (2020) A composite neural network that learns from multi-fidelity data: application to function approximation and inverse PDE problems. J. Comput. Phys. 401, 109020. · Zbl 1454.76006
[13] Mitchell, M. (1998) An Introduction to Genetic Algorithms, MIT Press. · Zbl 0906.68113
[14] Pang, G., Lu, L. & Karniadakis, G. E. (2019) fPINNs: fractional physics-informed neural networks. SIAM J. Sci. Comput. 41, A2603-A2626. · Zbl 1420.35459
[15] Pang, G., Yang, L. & Karniadakis, G. E. (2019) Neural-net-induced Gaussian process regression for function approximation and PDE solution. J. Comput. Phys. 384, 270-288. · Zbl 1451.68242
[16] Paulson, J. A., Buehler, E. A. & Mesbah, A. (2017) Arbitrary polynomial chaos for uncertainty propagation of correlated random variables in dynamic systems. IFAC-PapersOnLine 50, 3548-3553.
[17] Qin, T., Wu, K. & Xiu, D. (2019) Data driven governing equations approximation using deep neural networks. J. Comput. Phys. 395, 620-635. · Zbl 1455.65125
[18] Raissi, M., Perdikaris, P. & Karniadakis, G. E. (2019) Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 378, 686-707. · Zbl 1415.68175
[19] Raissi, M., Perdikaris, P. & Karniadakis, G. E. (2018) Numerical Gaussian processes for time-dependent and nonlinear partial differential equations. SIAM J. Sci. Comput. 40, A172-A198. · Zbl 1386.65030
[20] Sirignano, J. & Spiliopoulos, K. (2018) DGM: a deep learning algorithm for solving partial differential equations. J. Comput. Phys. 375, 1339-1364. · Zbl 1416.65394
[21] Snoek, J., Larochelle, H. & Adams, R. P. (2012) Practical Bayesian optimization of machine learning algorithms. Adv. Neural Inf. Process. Syst. 25, 2951-2959.
[22] Snoek, J., Rippel, O., Swersky, K., Kiros, R., Satish, N., Sundaram, N., Patwary, M., Prabhat, M. & Adams, R. (2015) Scalable Bayesian optimization using deep neural networks. Int. Conf. Mach. Learn. 37, 2171-2180.
[23] Tartakovsky, A. M., Marrero, C. O., Tartakovsky, D. & Barajas-Solano, D. (2018) Learning parameters and constitutive relationships with physics informed deep neural networks. arXiv preprint arXiv:1808.03398.
[24] Wan, X. & Karniadakis, G. E. (2006) Multi-element generalized polynomial chaos for arbitrary probability measures. SIAM J. Sci. Comput. 28, 901-928. · Zbl 1128.65009
[25] Zhang, D., Lu, L., Guo, L. & Karniadakis, G. E. (2019) Quantifying total uncertainty in physics-informed neural networks for solving forward and inverse stochastic problems. J. Comput. Phys. 397, 108850. · Zbl 1454.65008