Why imitate, and if so, how? A boundedly rational approach to multi-armed bandits. (English) Zbl 0895.90003

Summary: Individuals in a finite population repeatedly choose among actions yielding uncertain payoffs. Between choices, each individual observes the action and realized outcome of one other individual. We restrict our search to learning rules with limited memory that increase expected payoffs regardless of the distribution underlying their realizations. It is shown that the rule that outperforms all others imitates the action of an observed individual whose realized outcome is better than one's own, with a probability proportional to the difference in these realizations. When each individual uses this best rule, the aggregate population behavior is approximated by the replicator dynamic.
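The imitation rule described in the summary can be illustrated with a minimal sketch. This is not the authors' code: the function names, the two-armed Bernoulli bandit, its success probabilities, and the normalization of the switching probability by the payoff range `hi - lo` are all assumptions made for illustration. The summary states only that the switching probability is proportional to the payoff difference.

```python
import random


def proportional_imitation_step(actions, draw_payoff, lo, hi, rng):
    """One round: everyone plays, observes one other individual, may imitate.

    Assumes payoffs lie in [lo, hi] with hi > lo and len(actions) >= 2.
    """
    n = len(actions)
    payoffs = [draw_payoff(a, rng) for a in actions]
    new_actions = list(actions)
    for i in range(n):
        # Sample one *other* individual uniformly at random.
        j = rng.randrange(n - 1)
        if j >= i:
            j += 1
        gap = payoffs[j] - payoffs[i]
        # Imitate only a better realized outcome, with probability
        # proportional to the difference (assumed scaling: payoff range).
        if gap > 0 and rng.random() < gap / (hi - lo):
            new_actions[i] = actions[j]
    return new_actions


def simulate(n=200, rounds=300, seed=0):
    """Return the final share of action "B" in a hypothetical two-armed bandit."""
    rng = random.Random(seed)
    # Hypothetical Bernoulli arms; these probabilities are illustrative only.
    success_prob = {"A": 0.4, "B": 0.6}

    def draw_payoff(action, rng):
        return 1.0 if rng.random() < success_prob[action] else 0.0

    # Start with the population split evenly between the two actions.
    actions = ["A"] * (n // 2) + ["B"] * (n - n // 2)
    for _ in range(rounds):
        actions = proportional_imitation_step(actions, draw_payoff, 0.0, 1.0, rng)
    return actions.count("B") / n


if __name__ == "__main__":
    print("final share of action B:", simulate())
```

Each individual needs only its own last payoff and one observed action and payoff pair, which matches the limited-memory restriction in the summary. The sketch does not check the replicator-dynamic approximation; it only shows the mechanics of a single revision round.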

MSC:

91B06 Decision theory
91E40 Memory and learning in psychology
91B62 Economic growth models
91A15 Stochastic games, stochastic differential games
92D25 Population dynamics (general)
