Collocation based training of neural ordinary differential equations. (English) Zbl 07413881

Summary: The predictive power of machine learning models often exceeds that of mechanistic modeling approaches. However, purely data-driven models without any mechanistic basis are often hard to interpret, and predictive power by itself can be a poor metric by which to judge different methods. In this work, we focus on the relatively new modeling technique of neural ordinary differential equations (neural ODEs). We discuss how they relate to machine learning and mechanistic models, with the potential to narrow the gulf between these two frameworks: they constitute a class of hybrid models that integrate ideas from data-driven and dynamical systems approaches. Training neural ODEs as representations of dynamical systems data has its own specific demands, and here we propose a collocation scheme as a fast and efficient training strategy. This alleviates the need for costly ODE solvers during training. We illustrate the advantages that collocation approaches offer, as well as their robustness to the qualitative features of a dynamical system and to the quantity and quality of observational data. We focus on systems that exemplify some of the hallmarks of complex dynamical systems encountered in systems biology, and we map out how these methods can be used in the analysis of mathematical models of cellular and physiological processes.
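
The summary does not spell out the collocation scheme, so the following is only a minimal sketch of one common two-stage variant, not the authors' implementation. It assumes that (1) noisy observations are smoothed by local quadratic kernel regression to estimate the state and its time derivative, and (2) a network f_theta is then fitted so that f_theta(u) matches the estimated derivative, with no ODE solver in the training loop. The logistic test system, the bandwidth, the noise level, and the random-feature network (fixed random hidden layer, output layer solved by ridge regression) are all illustrative assumptions, as are the function names `collocate` and `fit_random_feature_net`.

```python
import numpy as np


def collocate(t_obs, u_obs, t_eval, bandwidth):
    """Stage 1: local quadratic kernel regression.

    Returns smoothed state estimates and derivative estimates at t_eval.
    """
    u_hat = np.empty_like(t_eval)
    du_hat = np.empty_like(t_eval)
    for i, t0 in enumerate(t_eval):
        dt = t_obs - t0
        # Square root of Gaussian kernel weights, for weighted least squares.
        sw = np.exp(-0.25 * (dt / bandwidth) ** 2)
        X = np.stack([np.ones_like(dt), dt, dt ** 2], axis=1)
        beta, *_ = np.linalg.lstsq(X * sw[:, None], u_obs * sw, rcond=None)
        u_hat[i], du_hat[i] = beta[0], beta[1]  # local value and local slope
    return u_hat, du_hat


def fit_random_feature_net(u, du, n_hidden=20, ridge=1e-6, seed=0):
    """Stage 2: fit f_theta(u) ~ du/dt without calling an ODE solver.

    Assumption: a one-hidden-layer tanh network whose hidden weights stay
    at their random initial values; only the output layer is trained,
    in closed form, by ridge regression.
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(0.0, 2.0, size=n_hidden)
    b = rng.normal(0.0, 1.0, size=n_hidden)

    def features(x):
        x = np.atleast_1d(x)
        hidden = np.tanh(np.outer(x, W) + b)
        return np.column_stack([hidden, np.ones(len(x))])  # plus output bias

    H = features(u)
    A = H.T @ H + ridge * np.eye(H.shape[1])
    c = np.linalg.solve(A, H.T @ du)
    return lambda x: features(x) @ c


# Illustrative data: logistic growth du/dt = u (1 - u), which has a
# closed-form solution, so no solver is needed to generate observations.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 10.0, 101)
u0 = 0.1
u_true = 1.0 / (1.0 + (1.0 / u0 - 1.0) * np.exp(-t))
u_noisy = u_true + rng.normal(0.0, 0.005, size=t.shape)

u_hat, du_hat = collocate(t, u_noisy, t, bandwidth=0.3)
f_theta = fit_random_feature_net(u_hat, du_hat)

# Compare the learned vector field with the true one on a grid of states.
u_grid = np.linspace(0.2, 0.9, 8)
err = np.mean(np.abs(f_theta(u_grid) - u_grid * (1.0 - u_grid)))
print("mean absolute error of learned vector field:", err)
```

In this sketch the training cost is one least-squares solve per collocation point plus one linear solve for the output layer. A full neural ODE would normally replace the closed-form fit with gradient-based training of all layers against the same derivative targets.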


68-XX Computer science
34-XX Ordinary differential equations

