In search of good probability assessors: an experimental comparison of elicitation rules for confidence judgments. (English) Zbl 1378.91067

Summary: In this paper, we use an experimental design to compare the performance of elicitation rules for subjective beliefs. Unlike previous work, in which elicited beliefs are compared to an objective benchmark, we consider a purely subjective belief framework (confidence in one's own performance in a cognitive task and a perceptual task). The performance of the different elicitation rules is assessed according to the accuracy of stated beliefs in predicting success. We measure this accuracy using two main factors: calibration and discrimination. For each of them, we propose two statistical indexes, and we compare the rules' performance on each measure. The matching probability method provides more accurate beliefs in terms of discrimination, while the quadratic scoring rule reduces overconfidence, and the free rule, a simple rule with no incentives, also succeeds in eliciting accurate beliefs. Nevertheless, the matching probability method appears to be the best mechanism for eliciting beliefs, owing to its performance in terms of calibration and discrimination, its ability to elicit consistent beliefs across measures and across tasks, and its empirical and theoretical properties.
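To make the two accuracy notions concrete, the following minimal Python sketch computes one common calibration measure (mean confidence minus success rate, i.e. overconfidence) and one common discrimination measure (the area under the ROC curve), together with a quadratic scoring rule payoff for a single trial. The summary does not say which indexes the paper uses, so these choices, the function names, and the sample data are illustrative assumptions rather than the authors' method.

```python
from typing import Sequence


def quadratic_score(confidence: float, success: bool) -> float:
    """Quadratic (Brier-type) score for one trial, scaled so that 1 is best.

    Assumption: payoff = 1 - (outcome - confidence)^2; the payoff scale used
    in the experiment is not given in the summary.
    """
    outcome = 1.0 if success else 0.0
    return 1.0 - (outcome - confidence) ** 2


def overconfidence(confidences: Sequence[float], successes: Sequence[bool]) -> float:
    """Calibration index: mean stated confidence minus observed success rate.

    Positive values indicate overconfidence, negative values underconfidence.
    """
    n = len(confidences)
    if n == 0 or n != len(successes):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(confidences) / n - sum(1 for s in successes if s) / n


def auroc(confidences: Sequence[float], successes: Sequence[bool]) -> float:
    """Discrimination index: probability that a randomly chosen successful
    trial carries higher confidence than a randomly chosen failed trial
    (ties count one half)."""
    hits = [c for c, s in zip(confidences, successes) if s]
    misses = [c for c, s in zip(confidences, successes) if not s]
    if not hits or not misses:
        raise ValueError("need at least one success and one failure")
    wins = sum(
        1.0 if h > m else 0.5 if h == m else 0.0
        for h in hits
        for m in misses
    )
    return wins / (len(hits) * len(misses))


# Hypothetical data: stated confidence and success on four trials.
stated = [0.9, 0.8, 0.6, 0.7]
correct = [True, True, False, False]

bias = overconfidence(stated, correct)   # mean confidence 0.75 vs. success rate 0.5
separation = auroc(stated, correct)      # every success outranks every failure
```

In this toy data set the subject discriminates perfectly between successes and failures yet remains overconfident on average, which illustrates why the two factors are assessed separately.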

MSC:

91B06 Decision theory
91A90 Experimental studies
62P15 Applications of statistics to psychology
