Calibrating expert assessments using hierarchical Gaussian process models. (English) Zbl 1475.62124

Summary: Expert assessments are routinely used to inform management and other decision making. However, these assessments often contain considerable biases and uncertainties, so they should be calibrated whenever possible. Moreover, coherently combining multiple expert assessments into one estimate is a long-standing problem in statistics, since modeling expert knowledge is often difficult. Here, we present a hierarchical Bayesian model for expert calibration in the task of estimating a continuous univariate parameter. The model allows experts’ biases to vary as a function of the true value of the parameter and according to the expert’s background. We follow the fully Bayesian approach (the so-called supra-Bayesian approach) and model experts’ bias functions explicitly using hierarchical Gaussian processes. We show how to use calibration data to infer the experts’ observation models through these bias functions and to calculate bias-corrected posterior distributions for an unknown system parameter of interest. We demonstrate and test our model and methods with simulated data and a real case study on data-limited fisheries stock assessment. The case study results show that experts’ biases vary with the true value of the system parameter and that calibrating the expert assessments improves inference compared with using uncalibrated expert assessments or a vague uniform guess. Moreover, the bias functions in the real case study reveal important differences in reliability between experts. The model and methods presented here can also be applied straightforwardly to applications other than our case study.
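
The calibration idea in the summary can be illustrated with a deliberately simplified sketch. It is not the authors' hierarchical model: it assumes a single expert, an additive observation model y = θ + b(θ) + ε with a zero-mean Gaussian process prior on the bias function b, fixed (not inferred) hyperparameters, a uniform prior for θ on a grid, and a plug-in use of the Gaussian process predictive mean and variance. All function names and numbers below are hypothetical.

```python
import numpy as np

def sq_exp_kernel(a, b, magnitude=1.0, length_scale=1.0):
    """Squared-exponential covariance between 1-D input arrays a and b."""
    d = a[:, None] - b[None, :]
    return magnitude**2 * np.exp(-0.5 * (d / length_scale) ** 2)

def fit_bias_gp(theta_cal, y_cal, noise_sd, **kern):
    """Condition a zero-mean GP on the observed biases y - theta (assumed model)."""
    n = len(theta_cal)
    K = sq_exp_kernel(theta_cal, theta_cal, **kern) + noise_sd**2 * np.eye(n)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_cal - theta_cal))
    return L, alpha

def predict_bias(theta_new, theta_cal, L, alpha, **kern):
    """GP predictive mean and variance of the bias at candidate true values."""
    Ks = sq_exp_kernel(theta_new, theta_cal, **kern)
    mean = Ks @ alpha
    v = np.linalg.solve(L, Ks.T)
    prior_var = sq_exp_kernel(theta_new, theta_new, **kern).diagonal()
    var = np.maximum(prior_var - np.sum(v**2, axis=0), 0.0)
    return mean, var

def calibrated_posterior(y_star, grid, theta_cal, y_cal, noise_sd, **kern):
    """Grid posterior of theta given a new assessment y_star, uniform prior."""
    L, alpha = fit_bias_gp(theta_cal, y_cal, noise_sd, **kern)
    m, v = predict_bias(grid, theta_cal, L, alpha, **kern)
    sd = np.sqrt(v + noise_sd**2)
    # Gaussian log-likelihood of y_star under y = theta + b(theta) + eps
    log_lik = -0.5 * ((y_star - grid - m) / sd) ** 2 - np.log(sd)
    w = np.exp(log_lik - log_lik.max())
    return w / w.sum()

# Hypothetical calibration data: an expert who overestimates by about 0.2.
theta_cal = np.linspace(0.1, 0.9, 9)
y_cal = theta_cal + 0.2
grid = np.linspace(0.0, 1.0, 201)
post = calibrated_posterior(0.7, grid, theta_cal, y_cal,
                            noise_sd=0.05, magnitude=0.5, length_scale=0.5)
print("bias-corrected posterior mean:", float(np.sum(grid * post)))
```

In this toy setting the posterior mass moves away from the raw assessment toward lower values, because the calibration data suggest that the expert overestimates. A fully Bayesian treatment along the lines of the summary would also infer the hyperparameters and share information across experts, which this sketch does not attempt.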

MSC:

62F15 Bayesian inference
62P12 Applications of statistics to environmental and related topics
60G15 Gaussian processes

Software:

NUTS; BayesDA; Stan

References:

[1] Albert, I., Donnet, S., Guihenneuc-Jouyaux, C., Low-Choy, S., Mengersen, K., and Rousseau, J. (2012). “Combining Expert Opinions in Prior Elicitation.” Bayesian Analysis, 7(3): 503-532. · Zbl 1330.62105 · doi:10.1214/12-BA717
[2] Berkson, J. and Thorson, J. T. (2014). “The determination of data-poor catch limits in the United States: is there a better way?” ICES Journal of Marine Science, 72(1): 237-242.
[3] Burgman, M. (2005). Risks and Decisions for Conservation and Environmental Management. Cambridge University Press.
[4] Burgman, M., Carr, A., Godden, L., Gregory, R., McBride, M., Flander, L., and Maguire, L. (2011). “Redefining expertise and improving ecological judgment.” Conservation Letters, 4(2): 81-87.
[5] Chrysafi, A., Cope, J., and Kuparinen, A. (2019). “Eliciting expert knowledge to inform stock status for data-limited stock assessments.” Marine Policy, 101: 167-176.
[6] Chrysafi, A. and Kuparinen, A. (2015). “Assessing abundance of populations with limited data: Lessons learned from data-poor fisheries stock assessment.” Environmental Reviews, 24(1): 25-38.
[7] Clemen, R. T. and Lichtendahl, K. C. (2002). “Debiasing expert overconfidence: A Bayesian calibration model.” Working paper, Duke University.
[8] González-Laxe, F. (2005). “The precautionary principle in fisheries management.” Marine Policy, 29: 495-505.
[9] Cooke, R. M. and Goossens, L. L. (2008). “TU Delft expert judgment data base.” Reliability Engineering & System Safety, 93(5): 657-674. Expert Judgement.
[10] Cope, J. M. (2013). “Implementing a statistical catch-at-age model (Stock Synthesis) as a tool for deriving overfishing limits in data-limited situations.” Fisheries Research, 142: 3-14.
[11] Costello, C., Ovando, D., Hilborn, R., Gaines, S. D., Deschenes, O., and Lester, S. E. (2012). “Status and solutions for the world’s unassessed fisheries.” Science, 338: 517-520.
[12] Daan, N., Gislason, H., Pope, J. G., and Rice, J. C. (2011). “Apocalypse in world fisheries? The reports of their death are greatly exaggerated.” ICES Journal of Marine Science, 68(7): 1375-1378.
[13] de Little, S. C., Casas-Mulet, R., Patulny, L., Wand, J., Miller, K. A., Fidler, F., Stewardson, M. J., and Webb, J. A. (2018). “Minimising biases in expert elicitations to inform environmental management: Case studies from environmental flows in Australia.” Environmental Modelling and Software, 100: 146-158.
[14] Dias, L. C., Morton, A., and Quigley, J. (2018). Elicitation. Springer International Publishing. · Zbl 1386.90005
[15] Dick, E. J. and MacCall, A. D. (2011). “Depletion-Based Stock Reduction Analysis: A catch-based method for determining sustainable yields for data-poor fish stocks.” Fisheries Research, 110(2): 331-341.
[16] Dietrich, F. and List, C. (2014). Probabilistic opinion pooling. Oxford University Press. · Zbl 1392.91052 · doi:10.1007/s00355-017-1034-z
[17] Farr, C., Ruggeri, F., and Mengersen, K. (2018). “Prior and Posterior Linear Pooling for Combining Expert Opinions: Uses and impact on Bayesian networks.” Entropy, 20(3): 209.
[18] Food and Agriculture Organization of the United Nations (1995). “Code of Conduct for Responsible Fisheries.”
[19] French, S. (1980). “Updating of Belief in the Light of Someone Else’s Opinion.” Journal of the Royal Statistical Society. Series A (General), 143(1): 43-48. · Zbl 0432.62004 · doi:10.2307/2981768
[20] French, S. (2011). “Aggregating expert judgement.” Revista de la Real Academia de Ciencias Exactas, Fisicas y Naturales. Serie A. Matematicas, 105(1): 181-206.
[21] Froese, R., Demirel, N., Coro, G., Kleisner, K. M., and Winker, H. (2017). “Estimating fisheries reference points from catch and resilience.” Fish and Fisheries, 18(3): 506-526.
[22] Garthwaite, P. H., Kadane, J. B., and O’Hagan, A. (2005). “Statistical Methods for Eliciting Probability Distributions.” Journal of the American Statistical Association, 100(470): 680-701. · Zbl 1117.62340 · doi:10.1198/016214505000000105
[23] Gelfand, A. E., Mallick, B. K., and Dey, D. K. (1995). “Modeling Expert Opinion Arising As a Partial Probabilistic Specification.” Journal of the American Statistical Association, 90(430): 598-604. · Zbl 0826.62007 · doi:10.1080/01621459.1995.10476552
[24] Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., and Rubin, D. B. (2013). Bayesian data analysis. CRC press. · Zbl 1279.62004
[25] Genest, C. and Schervish, M. J. (1985). “Modeling Expert Judgements for Bayesian Updating.” The Annals of Statistics, 13(3): 1198-1212. · Zbl 0609.62007 · doi:10.1214/aos/1176349664
[26] Geromont, H. F. and Butterworth, D. S. (2015). “A Review of assessment methods and the development of management procedures for data-poor fisheries.” FAO report, FAO.
[27] Gneiting, T., Balabdaoui, F., and Raftery, A. E. (2007). “Probabilistic forecasts, calibration and sharpness.” Journal of the Royal Statistical Society: Series B (Statistical Methodology), 69(2): 243-268. · Zbl 1120.62074 · doi:10.1111/j.1467-9868.2007.00587.x
[28] Griffiths, S. P., Kuhnert, P. M., Venables, W. N., and Blaber, S. J. (2007). “Estimating abundance of pelagic fishes using gillnet catch data in data-limited fisheries: A Bayesian approach.” Canadian Journal of Fisheries and Aquatic Sciences, 64(7): 1019-1033.
[29] Hartley, D. and French, S. (2018). “Elicitation and Calibration: A Bayesian Perspective.” In Dias, L. C., Morton, A., and Quigley, J. (eds.), Elicitation: The Science and Art of Structuring Judgement, 119-140. Springer International Publishing. · Zbl 1386.90005
[30] Hilborn, R., Maguire, J., Parma, A. M., and Rosenberg, A. A. (2001). “The precautionary approach and risk management: can they increase the probability of success in fisheries.” Canadian Journal of Fisheries and Aquatic Sciences, 58: 99-107.
[31] Hoffman, M. D. and Gelman, A. (2014). “The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo.” Journal of Machine Learning Research, 15: 1593-1623. · Zbl 1319.60150
[32] Kennedy, M. C. and O’Hagan, A. (2001). “Bayesian calibration of computer models.” Journal of the Royal Statistical Society: Series B (Statistical Methodology), 63(3): 425-464. · Zbl 1007.62021 · doi:10.1111/1467-9868.00294
[33] Kuhnert, P. M., Hayes, K., Martin, T. G., and McBride, M. F. (2009). “Expert opinion in statistical models.” 18th World IMACS Congress and MODSIM09 International Congress on Modelling and Simulation, 4264-4268.
[34] Kuhnert, P. M., Martin, T. G., and Griffiths, S. P. (2010). “A guide to eliciting and using expert knowledge in Bayesian ecological models.” Ecology Letters, 13(7): 900-914.
[35] Kuparinen, A., Mäntyniemi, S., Hutchings, J., and Kuikka, S. (2012). “Increasing biological realism of fisheries stock assessment: towards hierarchical Bayesian methods.” Environmental Reviews, 20: 135-151.
[36] Kynn, M. (2008). “The ‘heuristics and biases’ bias in expert elicitation.” Journal of the Royal Statistical Society: Series A (Statistics in Society), 171(1): 239-264.
[37] Landquist, H., Norman, J., Lindhe, A., Norberg, T., Hassellöv, I., Lindgren, J. F., and Rosen, L. (2017). “Expert elicitation for deriving input data for probabilistic risk assessment of shipwrecks.” Marine Pollution Bulletin, 125: 399-415.
[38] Lindley, D. V. (1982). “The Improvement of Probability Judgements.” Journal of the Royal Statistical Society. Series A (General), 145(1): 117-126. · Zbl 0479.62005 · doi:10.2307/2981425
[39] Lindley, D. V. (1983). “Reconciliation of Probability Distributions.” Operations Research, 31(5): 866-880. · Zbl 0529.90062 · doi:10.1287/opre.31.5.866
[40] Lindley, D. V. and Singpurwalla, N. D. (1986). “Reliability (and fault tree) analysis using expert opinions.” Journal of the American Statistical Association, 81(393): 87-90. · Zbl 0594.62108 · doi:10.1080/01621459.1986.10478241
[41] Lindley, D. V., Tversky, A., and Brown, R. V. (1979). “On the Reconciliation of Probability Assessments.” Journal of the Royal Statistical Society. Series A (General), 142(2): 146-180. · Zbl 0427.62003 · doi:10.2307/2345078
[42] Low-Choy, S., O’Leary, R., and Mengersen, K. (2009). “Elicitation by design in ecology: using expert opinion to inform priors for Bayesian statistical models.” Ecology, 90(1): 265-277.
[43] Magnusson, A. and Hilborn, R. (2007). “What makes fisheries data informative?” Fish and Fisheries, 8(4): 337-358.
[44] Mäntyniemi, S., Haapasaari, P., Kuikka, S., Parmanne, R., Lehtiniemi, M., and Kaitaranta, J. (2013). “Incorporating stakeholders’ knowledge to stock assessment: Central Baltic herring.” Canadian Journal of Fisheries and Aquatic Sciences, 70(4): 591-599.
[45] McConway, K. (1981). “Marginalization and Linear Opinion Pools.” Journal of the American Statistical Association, 76: 410-414. · Zbl 0455.90004 · doi:10.1080/01621459.1981.10477661
[46] Meissa, B., Gascuel, D., and Rivot, E. (2013). “Assessing stocks in data-poor African fisheries: a case study on the white grouper Epinephelus aeneus of Mauritania.” African Journal of Marine Science, 35: 253-267.
[47] Methot, R. D. and Wetzel, C. R. (2013). “Stock synthesis: A biological and statistical framework for fish stock assessment and fishery management.” Fisheries Research, 142: 86-99.
[48] Morgan, M. G. (2014). “Use (and abuse) of expert elicitation in support of decision making for public policy.” Proceedings of the National Academy of Sciences, 111(20): 7176-7184.
[49] Morris, P. A. (1974). “Decision Analysis Expert Use.” Management Science, 20(9): 1233-1241. · Zbl 0317.90002 · doi:10.1287/mnsc.20.9.1233
[50] Nevalainen, M., Helle, I., and Vanhatalo, J. (2018). “Estimating the acute impacts of Arctic marine oil spills using expert elicitation.” Marine Pollution Bulletin, 131: 782-792.
[51] Newman, D., Berkson, J., and Suatoni, L. (2015). “Current methods for setting catch limits for data-limited fish stocks in the United States.” Fisheries Research, 164: 86-93.
[52] O’Hagan, A., Buck, C. E., Daneshkhah, A., Eiser, J. R., Garthwaite, P. H., Jenkinson, D. J., Oakley, J. E., and Rakow, T. (2006). Uncertain Judgements: Eliciting Experts’ Probabilities. John Wiley & Sons. · Zbl 1269.62009
[53] O’Hagan, A. and Oakley, J. E. (2004). “Probability is perfect, but we can’t elicit it perfectly.” Reliability Engineering & System Safety, 85(1): 239-248.
[54] Roman, H. A., Walker, K. D., Walsh, T. L., Conner, L., Richmond, H. M., Hubbell, B. J., and Kinney, P. L. (2008). “Expert Judgment Assessment of the Mortality Impact of Changes in Ambient Fine Particulate Matter in the U.S.” Environmental Science & Technology, 42(7): 2268-2274.
[55] Salas, S., Chuenpagdee, R., Seijo, J., and Charles, A. (2007). “Challenges in the assessment and management of small-scale fisheries in Latin America and the Caribbean.” Fisheries Research, 87: 5-16.
[56] Speirs-Bridge, A., Fidler, F., McBride, M., Flander, L., Cumming, G., and Burgman, M. (2010). “Reducing Overconfidence in the Interval Judgements of Experts.” Risk Analysis, 30: 512-523.
[57] Stan Development Team (2016). “Stan: A C++ Library for Probability and Sampling, Version 2.9.0.” URL http://mc-stan.org/.
[58] Tversky, A. and Kahneman, D. (1974). “Judgment under uncertainty: Heuristics and Biases.” Science, 185(4157): 1124-1131.
[59] Usher, W. and Strachan, N. (2013). “An expert elicitation of climate, energy and economic uncertainties.” Energy Policy, 61: 811-821.
[60] Uusitalo, L., Kuikka, S., and Romakkaniemi, A. (2005). “Estimation of Atlantic salmon smolt carrying capacity of rivers using expert knowledge.” ICES Journal of Marine Science: Journal du Conseil, 62(4): 708-722.
[61] Vanhatalo, J. and Vehtari, A. (2007). “Sparse Log Gaussian Processes via MCMC for Spatial Epidemiology.” JMLR Workshop and Conference Proceedings, 1: 73-89.
[62] Vehtari, A. and Ojanen, J. (2012). “A survey of Bayesian predictive methods for model assessment, selection and comparison.” Statistics Surveys, 6: 141-228. · Zbl 1302.62011 · doi:10.1214/12-SS102
[63] Williams, C. K. and Rasmussen, C. E. (2006). Gaussian processes for machine learning. MIT Press. · Zbl 1177.68165
[64] Wilson, E. C., Usher-Smith, J. A., Emery, J., Corrie, P. G., and Walter, F. M. (2018). “Expert elicitation of multinomial probabilities for decision-analytic modelling: An application to rates of disease progression in undiagnosed and untreated melanoma.” Value in Health, in press.
[65] Zickfeld, K.