Mining gold from implicit models to improve likelihood-free inference.

*(English)* Zbl 07321276

Summary: Simulators often provide the best description of real-world phenomena. However, the probability density that they implicitly define is often intractable, leading to challenging inverse problems for inference. Recently, a number of techniques have been introduced in which a surrogate for the intractable density is learned, including normalizing flows and density ratio estimators. We show that additional information that characterizes the latent process can often be extracted from simulators and used to augment the training data for these surrogate models. We introduce several loss functions that leverage these augmented data and demonstrate that these techniques can improve sample efficiency and quality of inference.
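The idea of augmenting training data with information about the latent process can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy simulator (`simulate`, with latent `z ~ N(theta, 1)` and observation `x ~ N(z, 1)`), the helper names, and the use of a plain linear least-squares fit in place of a neural surrogate. The simulator exposes its latent variable, so the "joint score" (the derivative of `log p(x, z | theta)` with respect to `theta`) can be computed per sample; regressing it on `x` with a squared-error loss then approximates the score of the marginal density `p(x | theta)`, which in this toy case is known in closed form as `(x - theta) / 2`.

```python
import numpy as np

def simulate(theta, n, rng):
    """Hypothetical toy simulator: latent z ~ N(theta, 1), observed x ~ N(z, 1)."""
    z = rng.normal(theta, 1.0, size=n)
    x = rng.normal(z, 1.0)
    return x, z

def joint_score(z, theta):
    """d/dtheta log p(x, z | theta); only the latent step depends on theta."""
    return z - theta

def fit_score(theta=1.0, n=200_000, seed=0):
    """Regress the joint score on x with a squared-error loss (linear model)."""
    rng = np.random.default_rng(seed)
    x, z = simulate(theta, n, rng)
    t_joint = joint_score(z, theta)  # augmented data extracted from the simulator
    slope, intercept = np.polyfit(x, t_joint, deg=1)  # least-squares fit
    return slope, intercept

slope, intercept = fit_score()
# For this toy model the exact marginal score is (x - theta) / 2,
# so the fit should land near slope 0.5 and intercept -theta / 2.
print(slope, intercept)
```

A linear model suffices here only because the conditional mean of the joint score given `x` happens to be linear for this Gaussian toy; for a realistic simulator a more flexible regressor would be needed.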

##### MSC:

62G05 | Nonparametric estimation |

58C15 | Implicit function theorems; global Newton methods on manifolds |

*J. Brehmer* et al., Proc. Natl. Acad. Sci. USA 117, No. 10, 5242-5249 (2020; Zbl 07321276)

##### References:

[1] | D. B. Rubin, Bayesianly justifiable and relevant frequency calculations for the applied statistician. Ann. Stat. 12, 1151-1172 (1984). · Zbl 0555.62010 |

[2] | M. A. Beaumont, W. Zhang, D. J. Balding, Approximate Bayesian computation in population genetics. Genetics 162, 2025-2035 (2002). |

[3] | P. Marjoram, J. Molitor, V. Plagnol, S. Tavaré, Markov chain Monte Carlo without likelihoods. Proc. Natl. Acad. Sci. U.S.A. 100, 15324-15328 (2003). |

[4] | S. A. Sisson, Y. Fan, M. M. Tanaka, Sequential Monte Carlo without likelihoods. Proc. Natl. Acad. Sci. U.S.A. 104, 1760-1765 (2007). · Zbl 1160.65005 |

[5] | S. A. Sisson, Y. Fan, M. Beaumont, Handbook of Approximate Bayesian Computation (Chapman and Hall/CRC, 2018). · Zbl 1416.62005 |

[6] | J. Alsing, B. Wandelt, S. Feeney, Massive optimal data compression and density estimation for scalable, likelihood-free inference in cosmology. Mon. Not. R. Astron. Soc. 477, 2874-2885 (2018). |

[7] | T. Charnock, G. Lavaux, B. D. Wandelt, Automatic physical inference with information maximizing neural networks. Phys. Rev. D 97, 083004 (2018). |

[8] | P. J. Diggle, R. J. Gratton, Monte Carlo methods of inference for implicit statistical models. J. R. Stat. Soc. 46, 193-212 (1984). · Zbl 0561.62035 |

[9] | I. J. Goodfellow et al., Generative adversarial networks. https://arxiv.org/abs/1406.2661 (10 June 2014). |

[10] | K. Cranmer, J. Pavez, G. Louppe, Approximating likelihood ratios with calibrated discriminative classifiers. https://arxiv.org/abs/1506.02169 (6 June 2015). |

[11] | K. Cranmer, G. Louppe, Unifying generative models and exact likelihood-free inference with conditional bijections. Zenodo, 10.5281/zenodo.198541 (2016). |

[12] | G. Louppe, K. Cranmer, J. Pavez, carl: A likelihood-free inference toolbox. J. Open Source Softw. 1, 11 (2016). |

[13] | S. Mohamed, B. Lakshminarayanan, Learning in implicit generative models. https://arxiv.org/abs/1610.03483 (11 October 2016). |

[14] | M. U. Gutmann, R. Dutta, S. Kaski, J. Corander, Likelihood-free inference via classification. Stat. Comput. 28, 411-425 (2017). · Zbl 1384.62089 |

[15] | T. Dinev, M. U. Gutmann, Dynamic likelihood-free inference via ratio estimation (DIRE). arXiv:1810.09899 (23 October 2018). |

[16] | J. Hermans, V. Begy, G. Louppe, Likelihood-free MCMC with approximate likelihood ratios. https://arxiv.org/abs/1903.04057v1 (10 March 2019). |

[17] | D. Tran, R. Ranganath, D. Blei, “Hierarchical implicit models and likelihood-free variational inference” in Advances in Neural Information Processing Systems, I. Guyon et al., Eds. (Curran Associates, Inc., 2017), vol. 30, pp. 5523-5533. |

[18] | L. Dinh, D. Krueger, Y. Bengio, NICE: Non-linear independent components estimation. https://arxiv.org/abs/1410.8516 (30 October 2014). |

[19] | D. Jimenez Rezende, S. Mohamed, Variational inference with normalizing flows. https://arxiv.org/abs/1505.05770v5 (21 May 2015). |

[20] | L. Dinh, J. Sohl-Dickstein, S. Bengio, Density estimation using Real NVP. https://arxiv.org/abs/1605.08803 (27 May 2016). |

[21] | G. Papamakarios, T. Pavlakou, I. Murray, Masked autoregressive flow for density estimation. https://arxiv.org/abs/1705.07057 (19 May 2017). |

[22] | C.-W. Huang, D. Krueger, A. Lacoste, A. Courville, Neural autoregressive flows. https://arxiv.org/abs/1804.00779 (3 April 2018). |

[23] | G. Papamakarios, D. C. Sterratt, I. Murray, Sequential neural likelihood: Fast likelihood-free inference with autoregressive flows. https://arxiv.org/abs/1805.07226 (18 May 2018). |

[24] | T. Q. Chen, Y. Rubanova, J. Bettencourt, D. K. Duvenaud, Neural ordinary differential equations. http://arxiv.org/abs/1806.07366 (19 June 2018). |

[25] | D. P. Kingma, P. Dhariwal, Glow: Generative flow with invertible 1x1 convolutions. arXiv:1807.03039 (9 July 2018). |

[26] | W. Grathwohl, R. T. Q. Chen, J. Bettencourt, I. Sutskever, D. Duvenaud, FFJORD: Free-form continuous dynamics for scalable reversible generative models. https://arxiv.org/abs/1810.01367 (2 October 2018). |

[27] | M. Germain, K. Gregor, I. Murray, H. Larochelle, MADE: Masked autoencoder for distribution estimation. https://arxiv.org/abs/1502.03509 (12 February 2015). |

[28] | B. Uria, M.-A. Côté, K. Gregor, I. Murray, H. Larochelle, Neural autoregressive distribution estimation. https://arxiv.org/abs/1605.02226 (7 May 2016). · Zbl 1433.68393 |

[29] | A. van den Oord et al., WaveNet: A generative model for raw audio. https://arxiv.org/abs/1609.03499 (12 September 2016). |

[30] | A. van den Oord et al., Conditional image generation with PixelCNN decoders. https://arxiv.org/abs/1606.05328 (16 June 2016). |

[31] | A. van den Oord, N. Kalchbrenner, K. Kavukcuoglu, Pixel recurrent neural networks. https://arxiv.org/abs/1601.06759 (25 January 2016). |

[32] | Y. Fan, D. J. Nott, S. A. Sisson, Approximate Bayesian computation via regression density estimation. https://arxiv.org/abs/1212.1479 (6 December 2012). |

[33] | G. Papamakarios, I. Murray, “Fast ε-free inference of simulation models with Bayesian conditional density estimation” in Advances in Neural Information Processing Systems, D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, R. Garnett, Eds. (MIT Press, Cambridge, MA, 2016), pp. 1028-1036. |

[34] | B. Paige, F. Wood, Inference networks for sequential Monte Carlo in graphical models. https://arxiv.org/abs/1602.06701v2 (22 February 2016). |

[35] | R. Dutta, J. Corander, S. Kaski, M. U. Gutmann, Likelihood-free inference by ratio estimation. http://export.arxiv.org/abs/1611.10242 (30 November 2016). · Zbl 1384.62089 |

[36] | G. Louppe, K. Cranmer, Adversarial variational optimization of non-differentiable simulators. https://arxiv.org/abs/1707.07113 (22 July 2017). |

[37] | J.-M. Lueckmann et al., Flexible statistical inference for mechanistic models of neural dynamics. arXiv:1711.01861 (6 November 2017). |

[38] | J.-M. Lueckmann, G. Bassetto, T. Karaletsos, J. H. Macke, Likelihood-free inference with emulator networks. arXiv:1805.09294 (23 May 2018). |

[39] | J. Neyman, E. S. Pearson, K. Pearson, IX. On the problem of the most efficient tests of statistical hypotheses. Philos. Trans. R. Soc. A 231, 289-337 (1933). · JFM 59.1163.02 |

[40] | S. S. Wilks, The large-sample distribution of the likelihood ratio for testing composite hypotheses. Ann. Math. Stat. 9, 60-62 (1938). · Zbl 0018.32003 |

[41] | E. Meeds, R. Leenders, M. Welling, Hamiltonian ABC. arXiv:1503.01916 (6 March 2015). |

[42] | M. M. Graham, A. J. Storkey, Asymptotically exact inference in differentiable generative models. Electron. J. Stat. 11, 5105-5164 (2017). · Zbl 1380.65025 |

[43] | F. Wood, J. W. van de Meent, V. Mansinghka, “A new approach to probabilistic programming inference” in Proceedings of the 17th International Conference on Artificial Intelligence and Statistics, S. Kaski, J. Corander, Eds. (Proceedings of Machine Learning Research, 2014), pp. 1024-1032. |

[44] | T. Anh Le, A. Gunes Baydin, F. Wood, “Inference compilation and universal probabilistic programming” in Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS), Volume 54 of Proceedings of Machine Learning Research, A. Singh, J. Zhu, Eds. (PMLR, Fort Lauderdale, FL, 2017), pp. 1338-1348. |

[45] | K. Cranmer, J. Brehmer, G. Louppe, The frontier of simulation-based inference. https://arxiv.org/abs/1911.01429v1 (4 November 2019). |

[46] | D. S. Greenberg, M. Nonnenmacher, J. H. Macke, Automatic posterior transformation for likelihood-free inference. arXiv:1905.07488 (17 May 2019). |

[47] | J. Brehmer, K. Cranmer, G. Louppe, J. Pavez, Constraining effective field theories with machine learning. Phys. Rev. Lett. 121, 111801 (2018). |

[48] | J. Brehmer, K. Cranmer, G. Louppe, J. Pavez, A guide to constraining effective field theories with machine learning. Phys. Rev. D 98, 052004 (2018). |

[49] | R. J. Williams, Simple statistical gradient-following algorithms for connectionist reinforcement learning. Mach. Learning 8, 229-256 (1992). · Zbl 0772.68076 |

[50] | J. Brehmer, K. Cranmer, G. Louppe, J. Pavez, Code repository for the generalized Galton board example in the paper “Mining gold from implicit models to improve likelihood-free inference.” GitHub. http://github.com/johannbrehmer/simulator-mining-example. Deposited 3 December 2019. |

[51] | J. Brehmer, F. Kling, I. Espejo, K. Cranmer, MadMiner: Machine learning-based inference for particle physics. Comput. Softw. Big Sci. 4, 3 (2019). |

[52] | J. Brehmer, S. Mishra-Sharma, J. Hermans, G. Louppe, K. Cranmer, Mining for Dark Matter Substructure: Inferring subhalo population properties from strong lenses with machine learning. Astrophys. J. 886, 49 (2019). |

[53] | PPX Developers, Probabilistic Programming eXecution protocol (PPX). GitHub. http://github.com/probprog/ppx. Accessed 6 February 2020. |

[54] | Participants of the Likelihood-Free Inference Meeting at the Flatiron Institute 2019, Code repository for the automatic calculation of joint score and joint likelihood ratio with Pyro. GitHub. https://github.com/LFITaskForce/benchmark. Accessed 6 February 2020. |

[55] | E. Bingham et al., Pyro: Deep universal probabilistic programming. J. Mach. Learn. Res. 20, 1-6 (2019). · Zbl 07049747 |

[56] | P. Baldi, K. Cranmer, T. Faucett, P. Sadowski, D. Whiteson, Parameterized neural networks for high-energy physics. Eur. Phys. J. C 76, 235 (2016). |

[57] | J. Alsing, B. Wandelt, Generalized massive optimal data compression. Mon. Not. R. Astron. Soc. 476, L60-L64 (2018). |

[58] | J. Alsing, B. Wandelt, Nuisance hardened data compression for fast likelihood-free inference. Mon. Not. R. Astron. Soc. 488, 5093-5103 (2019). |

[59] | A. J. Lotka, Analytical note on certain rhythmic relations in organic systems. Proc. Natl. Acad. Sci. U.S.A. 6, 410-415 (1920). |

[60] | A. J. Lotka, Undamped oscillations derived from the law of mass action. J. Am. Chem. Soc. 42, 1595-1599 (1920). |

[61] | D. T. Gillespie, A general method for numerically simulating the stochastic time evolution of coupled chemical reactions. J. Comput. Phys. 22, 403-434 (1976). |

[62] | G. Papamakarios, T. Pavlakou, I. Murray, Code repository for paper “masked autoregressive flow for density estimation.” GitHub. http://github.com/gpapamak/maf. Accessed 6 February 2020. |

[63] | J. Brehmer, K. Cranmer, G. Louppe, J. Pavez, Code repository for the Lotka-Volterra example in the paper “Mining gold from implicit models to improve likelihood-free inference.” GitHub. http://github.com/johannbrehmer/goldmine. Deposited 6 October 2018. |

[64] | J. Alwall et al., The automated computation of tree-level and next-to-leading order differential cross sections, and their matching to parton shower simulations. J. High Energy Phys. 07, 079 (2014). · Zbl 1402.81011 |

[65] | K. Cranmer, T. Plehn, Maximum significance at the LHC and Higgs decays to muons. Eur. Phys. J. C 51, 415-420 (2007). |

[66] | T. Plehn, P. Schichtel, D. Wiegand, Where boosted significances come from. Phys. Rev. D 89, 054002 (2014). |

[67] | F. Kling, T. Plehn, P. Schichtel, Maximizing the significance in Higgs boson pair analyses. Phys. Rev. D 95, 035026 (2017). |

[68] | J. Brehmer, K. Cranmer, G. Louppe, J. Pavez, Code repository for the paper “Constraining effective field theories with machine learning.” GitHub. https://github.com/johannbrehmer/higgs_inference. Deposited 28 February 2019. |

[69] | E. Bingham et al., Pyro: Deep probabilistic programming. GitHub. https://github.com/uber/pyro. Accessed 6 February 2020. · Zbl 07049747 |

[70] | D. Tran et al., Deep probabilistic programming. arXiv:1701.03757 (13 January 2017). |

[71] | N. Siddharth et al., “Learning disentangled representations with semi-supervised deep generative models” in Advances in Neural Information Processing Systems, I. Guyon et al., Eds. (Curran Associates, Inc., 2017), vol. 30, pp. 5927-5937. |

[72] | A. Gelman, D. Lee, J. Guo, Stan: A probabilistic programming language for Bayesian inference and optimization. J. Educ. Behav. Stat. 40, 530-543 (2015). |

This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.