×

Learning in network contexts: experimental results from simulations. (English) Zbl 1028.91519

Summary: The authors describe the results of simulation experiments performed on a suite of learning algorithms. We focus on games in network contexts. These are contexts in which (1) agents have very limited information about the game and (2) play can be extremely as asynchronous. There are many proposed learning algorithms in the literature. They choose a small sampling of such algorithms and use numerical simulation to explore the nature of asymptotic play. In particular, they explore the extent to which the asymptotic play depends on three factors: limited information, asynchronous play, and the degree of responsiveness of the learning algorithm.

MSC:

91A28 Signaling and communication in game theory
68T05 Learning and adaptive systems in artificial intelligence
91A90 Experimental studies
PDFBibTeX XMLCite
Full Text: DOI Link

References:

[1] Auer, P.; Cesa-Bianchi, N.; Freund, Y.; Schapire, R., Gambling in a Rigged Casino: The Adversarial Multi-armed Bandit Problem, Proceedings of the 36th Annual Symposium on Foundations of Computer Science (1995), Assoc. Comput. Mach: Assoc. Comput. Mach New York, p. 322-331 · Zbl 0938.68920
[2] Banos, A., On Pseudo Games, Ann. Math. Statist., 39, 1932-1945 (1968) · Zbl 0222.90056
[3] Bernheim, B. D., Rationalizable Strategic Behavior, Econometrica, 52, 1007-1028 (1984) · Zbl 0552.90098
[4] Blackwell, D., An Analog of the Minimax Theorem for Vector Payoffs, Pacific J. Math., 6, 1-8 (1956) · Zbl 0074.34403
[5] Borgers, T, and, Sarin, R. 1995, Learning through Reinforcement and Replicator Dynamics, Mimeo.; Borgers, T, and, Sarin, R. 1995, Learning through Reinforcement and Replicator Dynamics, Mimeo.
[6] Borodin, A.; El-Yaniv, R., Online Computation and Competitive Analysis (1998), Cambridge Univ. Press: Cambridge Univ. Press Cambridge · Zbl 0931.68015
[7] Chen, Y. 1997, Asynchronicity and Learning in Cost-Sharing Mechanisms, Mimeo.; Chen, Y. 1997, Asynchronicity and Learning in Cost-Sharing Mechanisms, Mimeo.
[8] Cover, T., Universal Portfolios, Math. Finance, 1, 1-29 (1991) · Zbl 0900.90052
[9] Erev, I, and, Roth, A. 1996, On the Need for Low Rationality Cognitive Game Theory: Reinforcement Learning in Experimental Games with Unique Mixed Strategy Equilibria, Mimeo.; Erev, I, and, Roth, A. 1996, On the Need for Low Rationality Cognitive Game Theory: Reinforcement Learning in Experimental Games with Unique Mixed Strategy Equilibria, Mimeo.
[10] Foster, D.; Vohra, R., A Randomization Rule for Selecting Forecasts, Oper. Res., 41, 704-709 (1993) · Zbl 0785.62091
[11] Foster, D, and, Vohra, R. 1995, Calibrated Learning and Correlated Equilibrium, Preprint.; Foster, D, and, Vohra, R. 1995, Calibrated Learning and Correlated Equilibrium, Preprint. · Zbl 0894.90188
[12] Foster, D.; Vohra, R., Regret in the On-Line Decision Problem, Games Econ. Behav., 21, 40-55 (1997)
[13] Freund, Y.; Schapire, R., Game Theory, On-Line Prediction, and Boosting, Proceedings of the 9th Annual Conference on Computational Learning Theory (1996), Assoc. Comput. Math: Assoc. Comput. Math New York, p. 325-332
[14] Friedman, E. J., Dynamics and Rationality in Ordered Externality Games, Games Econ. Behav., 16, 65-76 (1996) · Zbl 0864.90137
[15] Friedman, E. 1998, Learnability in a Class of Non-Atomic Games Arising on the Internet, Mimeo.; Friedman, E. 1998, Learnability in a Class of Non-Atomic Games Arising on the Internet, Mimeo.
[16] Friedman, E, and, Shenker, S. 1996, Synchronous and Asynchronous Learning by Responsive Learning Automata, Mimeo.; Friedman, E, and, Shenker, S. 1996, Synchronous and Asynchronous Learning by Responsive Learning Automata, Mimeo.
[17] Friedman, E, and, Shenker, S. 1997, Learning and Implementation on the Internet, Mimeo.; Friedman, E, and, Shenker, S. 1997, Learning and Implementation on the Internet, Mimeo.
[18] Fudenberg, D.; Levine, D., Reputation and Equilibrium Selection in Games with a Patient Player, Econometrica, 57, 759-778 (1989) · Zbl 0693.90105
[19] Fudenberg, D, and, Levine, D. 1995a, Conditional Universal Consistency, Mimeo.; Fudenberg, D, and, Levine, D. 1995a, Conditional Universal Consistency, Mimeo.
[20] Fudenberg, D.; Levine, D., Consistency and Cautious Fictitious Play, J. Econ. Dynam. Control, 19, 1065-1089 (1995) · Zbl 0900.90423
[21] Fudenberg, D.; Levine, D., The Theory of Learning in Games (1998), MIT Press: MIT Press Cambridge
[22] Gittens, J., Multi-armed Bandit Allocation Indices (1989), Wiley: Wiley New York
[23] Greenwald, A. 1999, Learning to Play Network Games, Ph.D. thesis, Courant Institute of Mathematical Sciences, New York University, New York.; Greenwald, A. 1999, Learning to Play Network Games, Ph.D. thesis, Courant Institute of Mathematical Sciences, New York University, New York.
[24] Greenwald, A. R.; Kephart, J. O., Shopbots and Pricebots, Lecture Notes on Artificial Intelligence: Proceedings of the IJCAI Workshop on Agent-Mediated Electronic Commerce (2000), Springer-Verlag: Springer-Verlag Berlin/New York
[25] Hannan, J., Approximation to Bayes Risk in Repeated Plays, (Dresher, M.; Tucker, A. W.; Wolfe, P., Contributions to the Theory of Games (1957), Princeton Univ. Press: Princeton Univ. Press Princeton), 97-139 · Zbl 0078.32804
[26] Hart, S, and, Mas-Colell, A. 1997, A Simple Adaptive Procedure Leading to Correlated Equilibrium, Technical Report, Center for Rationality and Interactive Decision Theory.; Hart, S, and, Mas-Colell, A. 1997, A Simple Adaptive Procedure Leading to Correlated Equilibrium, Technical Report, Center for Rationality and Interactive Decision Theory. · Zbl 1020.91003
[27] Huberman, B. A.; Glance, N. S., Evolutionary Games and Computer Simulations, Proc. Nat. Acad. Sci. U.S.A., 90, 7716-7718 (1993) · Zbl 0800.92168
[28] Van Huyck, J, Battalio, R, and, Rankin, F. 1996, Selection Dynamics and Adaptive Behavior without Much Information, Mimeo.; Van Huyck, J, Battalio, R, and, Rankin, F. 1996, Selection Dynamics and Adaptive Behavior without Much Information, Mimeo. · Zbl 1121.91310
[29] Kaelbling, L.; Littman, M.; Moore, A., Reinforcement Learning: A Survey, Artificial Intelligence Res., 4, 237-285 (1996)
[30] Kalai, E.; Lehrer, E., Rational Learning Leads to Nash Equilibrium, Econometrica, 61, 1019-1045 (1993) · Zbl 0793.90106
[31] Littman, M., Markov Games as a Framework for Multi-agent Reinforcement Learning, Proceedings of Eleventh International Conference on Machine Learning (1994), Morgan Kaufmann: Morgan Kaufmann San Mateo, p. 157-163
[32] Lukose, R, and, Huberman, B. 1997, A Methodology for Managing Risk in Electronic Transactions over the Internet, presented at; Lukose, R, and, Huberman, B. 1997, A Methodology for Managing Risk in Electronic Transactions over the Internet, presented at
[33] Megiddo, N., On Repeated Games with Incomplete Information Played by Non-Bayesian Players, Int. J. Game Theory, 9, 157-167 (1986) · Zbl 0441.90118
[34] Milgrom, P.; Roberts, J., Adaptive and Sophisticated Learning in Normal Form Games, Games Econ. Behav., 3, 82-100 (1991) · Zbl 0751.90093
[35] Mookherjee, D.; Sopher, B., Learning and Decision Costs in Experimental Constant Sum Games, Games Econ. Behav., 19, 97-132 (1996) · Zbl 0872.90153
[36] Narendra, K.; Thathachar, M. A.L., Learning Automata: A Survey, IEEE Trans. Systems Man Cybernet, SMC-4, 323-334 (1974) · Zbl 0279.68067
[37] Pearce, D., Rationalizable Strategic Behavior and the Problem of Perfection, Econometrica, 52, 1029-1050 (1984) · Zbl 0552.90097
[38] Rosenthal, R., A Note on the Robustness of Equilibria with Respect to Commitment Opportunities, Games and Econ. Behav., 3, 237-243 (1991) · Zbl 0825.90806
[39] Roth, A.; Erev, I., Learning in Extensive Form Games: Experimental Data and Simple Dynamic Models in the Intermediate Term, Games Econ. Behav., 8, 164-212 (1995) · Zbl 0833.90144
[40] Sandholm, T.; Crites, R., Multiagent Reinforcement Learning in the Iterated Prisoners’ Dilemma, Special Issue on the Prisoners’ Dilemma. Special Issue on the Prisoners’ Dilemma, Biosystems, 37, 147-166 (1995)
[41] Shapley, L. S., A Value for \(n\)-Person Games, (Kuhn, H.; Tucker, A., Contributions to the Theory of Games (1953), Princeton Univ. Press: Princeton Univ. Press Princeton), 307-317 · Zbl 0050.14404
[42] Shenker, S., Making Greed Work in Networks: A Game-Theoretic Analysis of Switch Service Disciplines, IEEE/ACM Trans. Networking, 3, 819-831 (1995)
[43] Shoham, Y, and, Tennenholtz, M. 1993, Co-learning and the Evolution of Social Activity, Mimeo.; Shoham, Y, and, Tennenholtz, M. 1993, Co-learning and the Evolution of Social Activity, Mimeo.
[44] Stone, P, and, Veloso, M. 1997, Multiagent Systems: A Survey from a Machine Learning Perspective, Technical Report, 193, Carnegie Mellon University.; Stone, P, and, Veloso, M. 1997, Multiagent Systems: A Survey from a Machine Learning Perspective, Technical Report, 193, Carnegie Mellon University.
[45] Sutton, R.; Barto, A., Reinforcement Learning: An Introduction (1998), MIT Press: MIT Press Cambridge
[46] Watkins, C. 1989, Learning from Delayed Rewards. Ph.D. thesis, Cambridge University, Cambridge, UK.; Watkins, C. 1989, Learning from Delayed Rewards. Ph.D. thesis, Cambridge University, Cambridge, UK.
[47] Watson, J., A Reputation Refinement without Equilibrium, Econometrica, 61, 199-206 (1993) · Zbl 0778.90105
[48] Wellman, M.; Hu, J., Conjectural Equilibrium in Multi-agent Learning, Special Issue on Multi-agent Learning. Special Issue on Multi-agent Learning, Mach. Learning, 33, 179-200 (1998) · Zbl 0913.68172
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.