Learning Models in Economic Experiments: Candidate of Sciences dissertation and abstract, VAK RF specialty 08.00.01, Grigory Vitalievich Chernov

  • Grigory Vitalievich Chernov
  • Candidate of Sciences
  • 2022, National Research University Higher School of Economics
  • VAK RF specialty: 08.00.01
  • 128 pages

Chernov, Grigory Vitalievich. Learning Models in Economic Experiments: Candidate of Sciences dissertation, 08.00.01, Economic Theory. National Research University Higher School of Economics, 2022. 128 pp.

Table of contents of the dissertation (Grigory Vitalievich Chernov, Candidate of Sciences)

Introduction

Problem description. Modern economics is increasingly becoming an experimental science. Laboratory and field experiments, natural experiments, and quasi-experiments are everywhere in the economics literature. Experimental results have shed light on a wide range of questions, from individual rationality, heuristics, and biases to the efficiency and implementation of public policy programs. One of the main goals of experimental research is to improve our understanding of the mechanisms and reasons behind particular economic decisions. Yet progress in this direction has so far been limited. On the one hand, human decisions result from complicated, multi-dimensional processes, most of which remain unobservable or unidentifiable to the observer. On the other hand, human decision-makers are known to be boundedly rational, with a limited ability to figure out their best strategies even in relatively simple strategic environments, let alone decode the strategic intentions of the other player(s). At the heart of all these problems lies the gradual accumulation of knowledge in the process of interaction, i.e. strategic learning.

The study of strategic learning is the main topic of this thesis. The work is threefold: we describe the modeling approaches common in this theory, motivate the problem of experimental comparisons between the theories, and prepare the ground for new models with an example of such an experimental comparison.

As we will see in the first chapter, there is a huge pool of prospective models that fall into several large groups, each with particular features: some models track realized payoffs, while others look only at the behavior of the opponent, ignoring that the opponent may be doing the same. There is also hybridization between these models, resulting in "a model zoo". The problem, then, is to decide which particular model, or which flavor of a general model type, to fit to our data. Which models are more useful or more theoretically sound than others, and which are observationally equivalent and can be safely substituted by simpler models: these are open scientific questions. We propose to start answering those questions by finding testable assumptions that distinguish classes of models and their specifications. We show that this can be done by simulations, because the resulting empirical distributions reflect both the finite-sample and the asymptotic properties of experimental samples.

We then discuss the empirical problem common to model comparisons: we have no analytical and general way to compare models and to establish whether one model explains behavior better than another, or even that it is better under clearly defined conditions (a specific experimental game). Moreover, our econometric tools of comparison are lacking in this context: even when we can say that one model fits the data "better", it is hard to specify by how much and how robust this comparison is, even for a different realization of the same game. The second chapter covers the Formal Theory Approach (FTA) to the experimental problem: how can we identify and estimate the model within the laboratory bounds.

Finally, constraints on possible experimental designs are underappreciated by econometric theorists (Basse and Bojinov, 2020). Econometricians tend to assume asymptotic properties. This may be a fair assumption when it can be addressed by simply assembling a larger sample, but not all dimensions are equally expandable (Salmon, 2001). Experiments are naturally constrained by the length of the experimental session and, to a lesser extent, by the number of observations of a specific player after a specific history. We cannot expect our subject to play for an infinite time, and there are only two observations of two players in each round of a two-player experimental game; this dimension cannot "go to infinity". We can, however, add more subjects, select a longer or shorter experimental session, and usually change the character of the game or add new features to the design. One such feature is introduced in the third chapter: instead of a player who can react to her opponent in a sophisticated manner, we put our experimental subjects up against a robot with a known Data Generating Process (DGP). A robot player enables us to check more specific hypotheses about our subjects by fine-tuning the robot's DGP, an option unavailable with a human player.
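To make the "known DGP" idea concrete, here is a minimal sketch of a stationary robot opponent in Rock-Paper-Scissors; the specific reaction rule and the default noise level are illustrative assumptions, not the exact design used in the experiment.

```python
import random

ACTIONS = ["rock", "paper", "scissors"]
BEATS = {"rock": "paper", "paper": "scissors", "scissors": "rock"}  # BEATS[a] beats a

def robot_move(opponent_last, noise=0.2, rng=random):
    """A stationary robot DGP: best-respond to the opponent's previous action,
    except that with probability `noise` (an experimental treatment knob)
    play uniformly at random instead."""
    if opponent_last is None or rng.random() < noise:
        return rng.choice(ACTIONS)
    return BEATS[opponent_last]
```

Setting `noise` per treatment is exactly the kind of fine-tuning that is impossible with a human opponent.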

Thus the chapters of the dissertation cover all experimental stages. First, we specify what models of learning we want to test on our human subjects, what our experimental constraints (the sample dimensions) are, and, if we want to test specific hypotheses using a robot, how the robot should behave. Next, the foundational work from the second chapter does the heavy lifting: it shows us whether we can distinguish the models given this experiment. Finally, once confident that this design is effective and efficient, we run the experiment itself. We conclude by discussing the results of the experiments: clear evidence that a specific extension of the basic models of learning tracks human behavior demonstrably better than the baseline.

Objectives of the research. Learning the opponent's strategy in a repeated game and optimally reacting to it requires time, and more complex strategies require more time to learn. Thus, proper understanding and modeling of this process (both theoretical and empirical) are of utmost importance for game theory and economics in general. Such learning may take too long, with the player losing most of the time along the way. To insure against this potential loss, many learning models only learn "the latest empirical frequencies of actions" and quickly adapt to them. This has two consequences: (a) they resist crude manipulation by an opponent and win at least as much as if they had known the opponent's action frequencies in advance (see (Hannan et al., 1957) for details); (b) they cannot learn complex strategies and therefore cannot react optimally to them. This makes the models flexible, able to play "without losing too much" against any type of opponent. At the same time, it remains unclear whether they can learn the optimal response even to a simple pattern that human subjects easily detect.
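A learner of the kind described above, one that tracks only "the latest empirical frequencies of actions", can be sketched as follows; the Rock-Paper-Scissors payoff matrix is an illustrative choice of ours:

```python
from collections import Counter

# Illustrative Rock-Paper-Scissors payoffs: PAYOFF[mine][theirs]
PAYOFF = {
    "rock":     {"rock": 0,  "paper": -1, "scissors": 1},
    "paper":    {"rock": 1,  "paper": 0,  "scissors": -1},
    "scissors": {"rock": -1, "paper": 1,  "scissors": 0},
}

def frequency_best_response(opponent_history, payoff=PAYOFF):
    """Best-respond to the empirical frequencies of the opponent's past actions
    (the fictitious-play idea): beliefs are just normalized action counts,
    with no model of the opponent's *conditional* behavior."""
    counts = Counter(opponent_history)
    total = sum(counts.values())
    expected = {
        a: sum(n / total * payoff[a][b] for b, n in counts.items())
        for a in payoff
    }
    return max(expected, key=expected.get)
```

Because only marginal frequencies enter the belief, such a learner cannot exploit a conditional pattern, e.g. an opponent cycling rock, paper, scissors, whose marginal frequencies look uniform.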

As a consequence, we formulate our main research question as follows: how can we check, in a laboratory experiment with a repeated game, that the subjects can recognize patterns? Can we distinguish between strategy learning, where a human player tries to recognize the contingent action plan of her opponent, and action-based learning, which may produce complex behavior out of just a simple history of actions?

Thus the main aim of the research is to conduct an experimental and structural econometric assessment of the participants' adaptive response to a strategy of fixed complexity in a repeated game. This problem was divided into the following tasks.

• Analyze the existing theoretical and empirical approaches to the classification of learning models

• Construct game-theoretic learning models that are able to process simple regularities in the opponent's played sequence of actions and react to them, taking into account the consequences of current play for the opponent's future actions.

• Analyze whether popular learning models can be distinguished in an experiment. For that purpose:

— Create a synthetic dataset to test the performance of the maximum likelihood estimator under various conditions

— Formulate the criteria and the procedure for testing models through simulations

— Test the procedure on popular learning models

• Based on the conducted analysis, formulate criteria applicable to the experimental design that allow the models to be correctly identified on the data

• Develop a laboratory experiment design that meets the above criteria and allows identification of the subjects' type of learning

• Conduct simulations according to the previously developed procedure and check that the criteria are met

• Run the developed experiment and obtain structural econometric estimates of the developed learning models

• Determine, using the structural estimates, which particular class of learners the participants belong to

• Find which model type predicts the subjects' behavior better
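The simulation tasks above follow a standard parameter-recovery logic: generate data from a model with known parameters, re-estimate by maximum likelihood, and check whether the truth is recovered. A minimal sketch, using a one-parameter two-action reinforcement model and a grid-search MLE purely for illustration (neither is the thesis's actual specification):

```python
import math
import random

def simulate(phi, lam, n_rounds, reward, rng):
    """Synthetic subject from a two-action reinforcement model: attractions
    decay by phi each round, the chosen action is reinforced by its payoff,
    and choices are logit with precision lam."""
    A = [0.0, 0.0]
    data = []
    for _ in range(n_rounds):
        p1 = 1.0 / (1.0 + math.exp(-lam * (A[1] - A[0])))
        a = 1 if rng.random() < p1 else 0
        r = reward(a)
        data.append((a, r))
        A = [phi * A[0], phi * A[1]]
        A[a] += r
    return data

def log_likelihood(phi, lam, data):
    """Likelihood of the observed choices under the same model."""
    A, ll = [0.0, 0.0], 0.0
    for a, r in data:
        p1 = 1.0 / (1.0 + math.exp(-lam * (A[1] - A[0])))
        ll += math.log(p1 if a == 1 else 1.0 - p1)
        A = [phi * A[0], phi * A[1]]
        A[a] += r
    return ll

def recover_phi(data, lam, grid):
    """Grid-search MLE: in a well-identified design the maximizer should sit
    near the phi that generated the data; a flat profile signals weak identification."""
    return max(grid, key=lambda phi: log_likelihood(phi, lam, data))
```

Repeating this over many synthetic datasets, and comparing the distribution of `recover_phi` to the true parameter, is the kind of evidence the simulation-based criteria build on.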

Brief literature review. The theory of learning in games originated in the Cournot model and is nowadays a well-developed theory ((Fudenberg and Levine, 1998); (Young, 2004)). However, its development is hindered by underdeveloped methods at the intersection of experimental inference and microeconometrics. While each is quite developed and sophisticated on its own, their intersection requires special conditions: advanced experimental designs and attention to finite samples.

The complexity of this problem is illustrated by several relatively recent works on model selection and testing in learning. Popular learning models were first tested on 2x2 games by (McKelvey and Palfrey, 2001), who found that the models fit experimental data extremely poorly on some types of games, such as coordination games. A series of tournaments (starting with (Arifovic et al., 2006)) tested the potential difference between data generated by the models and by human subjects. Time after time, the models did not follow the dynamics humans show. Later, the literature began to rethink simple goodness of fit as a criterion, and researchers started to experiment not only with the composition of the model pool but also with the goodness-of-fit criteria themselves. In the tournaments of ((Erev et al., 2007); (Erev et al., 2010)), the authors experimented with out-of-sample predictions and compared different samples by using aggregated choices in one sample as a predictor for another. A somewhat different approach was taken by (Mathevet and Romero, 2012), namely a theory of predictive metrics in a game based on average payoffs (started by (Selten, 1998) but not developed further until (Mathevet and Romero, 2012)). All these papers test a pool of models on multiple datasets, but instead of balancing context against accuracy they prioritize only one of the two. Simple models can be generalized to most, but not all, contexts, and in the remaining contexts they perform abysmally. Complex models may fit well in every context separately, one by one, but do not generalize across them. The consensus today is to move towards the accumulation of large datasets and the development of specific criteria ((Fudenberg et al., 2020); (Fudenberg et al., 2019)).

In our view, however, the accumulation of data may not be sufficient. For example, (Salmon, 2001) shows on 500 synthetic datasets that the common methods do not provide correct statistical inference. This problem has long been discussed in the econometric literature as "weak identification" (Lewbel, 2019) and is aptly described by ((Morton and Williams, 2010), p. 202) as: "Inspired-By Evaluations of Formal Theory Predictions: When a researcher evaluates a formal theory prediction using a Rubin Causality Model-based approach and assumes consistency with all model imposed assumptions but does not explicitly investigate whether it holds or not." We are aware of only one recent work that tries to find an analytical solution to this problem, in the case of a linear dynamic model (Bojinov et al., 2020). We take a similar approach, namely finding a simulation-based solution that provides an experiment planning tool.

Main findings. We present a class of learning models that avoids losing too much against an arbitrary opponent and, at the same time, can learn simple conditional strategies of the "win-stay-lose-shift" type (e.g., a memory of at most k, automaton implementability, etc. (Nachbar, 2005)). This class can be extended to repeated games with an interval action space (e.g. setting a price in an oligopoly).
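For concreteness, "win-stay-lose-shift" is a memory-one conditional strategy; a minimal sketch (action labels and the payoff threshold are illustrative):

```python
import random

def win_stay_lose_shift(last_action, last_payoff, actions, rng=random):
    """A memory-one conditional strategy of the kind the proposed class can
    learn: repeat the previous action after a win, otherwise switch to a
    different action at random."""
    if last_action is None:            # first round: nothing to condition on
        return rng.choice(actions)
    if last_payoff > 0:                # "win": stay
        return last_action
    return rng.choice([a for a in actions if a != last_action])  # "lose": shift
```

A strategy like this is trivial for a human to describe, yet invisible to a learner that only tracks the opponent's marginal action frequencies.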

Based on (Salmon, 2001), direct identification testing of a learning model was carried out using the example of hybrid EWA learning (Camerer and Ho, 1999). Salmon's earlier results are reproduced and extended; the resulting estimates confirm and elaborate his finding that point identification in a realistic experimental setting is problematic for EWA. Indices were developed to assess the identifiability of a learning model. It is shown that if we consider only the basic representatives included in the model (individual points of the hybrid model), then we can point-identify them. We describe the formal distinction between action-based and strategy-based learners. We developed empirically based criteria to test whether participants can be classified as action-based or strategy-based learners. We found experimental conditions meeting such criteria and proposed a specific experimental design involving a controlled robot opponent with a pre-programmed strategy. Specific algorithms for the strategy-based class are proposed and formalized as well. A pool of models has been selected for comparison, including representatives of belief-based learning, reinforcement learning, action-based learning, and strategy-based learning. The identifiability of the proposed model pool has been tested.

An experiment satisfying all the necessary properties was conducted.

Experimental evidence confirms that: (a) many people are capable of defeating our simple pre-programmed artificial opponent; (b) this usually happens within 30-60 rounds, depending on the noise level; (c) when a subject's behavior shows learning, they can often explain what they have learned, typically in belief-based terms; (d) among the three sub-parts of the best-response strategy, one part is more easily learned.

Contribution. The contribution of this dissertation begins with a survey chapter that reviews and reevaluates the existing classifications of learning models and the properties of models that follow from them. In addition to covering approaches specific to this literature and the convergence analysis traditional for such reviews¹, this section deals with the cognitive aspects of learning and their representation in models, as well as the connection between the theoretical conceptualization of different properties of learning models and the issues of their empirical testing and comparison. The latter is discussed in more detail at the beginning of the second chapter.

Further, in chapter 2, we present common approaches to evaluating those models, discuss their weaknesses, and propose a new way to avoid the largest pitfalls of the existing studies (namely the horse-race approach). In particular, we scrupulously discuss the issue of "weak identification" (Lewbel, 2019) for learning models under the same sample constraints as (Bojinov et al., 2020). From the general criteria provided by (Matzkin, 2005, 2007) we derive a numerical criterion called the "simulation ratio" (SM), specifically for simulations with learning models, and test its performance in the (Salmon, 2001) setting. Salmon (2001) starts from the fact that the EWA learning model is not correctly identified through the statistical criteria and routines used in the original papers. We go further and show that, for a given sample size, the EWA learning model cannot be identified at all by any test based on information derived from the likelihood function. The proposed procedure and numerical SM indicator are thus available to evaluate any arbitrary set of learning models (both nested models, like the EWA family, and different models) and are used by us to evaluate the identifiability of strategic learning models in Chapter 3.
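For reference, the EWA update under discussion can be written compactly. The sketch below follows the standard formulation of (Camerer and Ho, 1999); variable names are ours:

```python
def ewa_update(A, N, chosen, payoffs, phi, delta, rho):
    """One round of Experience-Weighted Attraction (Camerer and Ho, 1999).
    A[j]: attraction of action j; N: experience weight;
    payoffs[j]: the payoff action j earned (or would have earned) this round.
    delta weights forgone payoffs: delta=0 reinforces only the chosen action
    (reinforcement learning), delta=1 updates all actions (belief learning)."""
    N_new = rho * N + 1.0
    A_new = [
        (phi * N * A[j] + (delta + (1.0 - delta) * (j == chosen)) * payoffs[j]) / N_new
        for j in range(len(A))
    ]
    return A_new, N_new
```

The special cases delta = 0 and delta = 1 correspond to the "individual points" of the hybrid mentioned above, which remain point-identifiable even when the blended parameters are not.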

The work also develops the approach of repeated strategies in learning. Although the approach proposed by (Hanaki, 2004) has already been tested by (Ioannou and Romero, 2014) in 2-by-2 games, its use in games with a larger action space has been limited, both by the necessity of "model training"² and by the computational limitations of the subjects' resources. In previous studies (e.g. in Axelrod's tournament), the implementation of repeated strategies has been viewed through the prism of evolutionary theory, and such strategies are understood as an exhaustive, indivisible prescription for different situations. For example, the Tit-for-tat strategy for the prisoner's dilemma can be taken as an instruction for how to act in two different situations: when the opponent cooperates and when she does not. This is why the learning models of (Ioannou and Romero, 2014) use repeated strategies as the basis and work with strategies in their entirety. We

¹ We should mention the works (Marimon, 1996; Fudenberg and Levine, 1998, 2009, 2016). Erev and Haruvy (2013) outline a view of this theory that is close to experimental and behavioral economics, and the article (Nachbar, 2020) most succinctly describes the main results in the field.

² I.e., there is no element of online learning: the model has to play against itself before it can predict people.

show, however, that dividing repeated strategies into small component parts (which we call "elementary strategies") has a number of comparative advantages in terms of modeling.

Firstly, conceptually, it allows the learner to construct a complex pattern "on the fly" during the learning process, which considerably simplifies the computation and can even, as we show at the end of the first chapter, be implemented in games with a continuous action space.
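One way to picture this "on the fly" construction: reinforce each elementary (condition, action) rule independently and let the rules that pay off compose the pattern. The sketch below is our illustration of the idea, not the thesis's formal model:

```python
import random
from collections import defaultdict

class ElementaryStrategyLearner:
    """Each 'elementary strategy' is a single (condition -> action) rule,
    e.g. 'after the opponent played rock, play paper'. Rules are reinforced
    independently, so a complex repeated strategy is assembled piecewise
    from whichever small rules happen to succeed."""

    def __init__(self, actions, rng=random):
        self.actions = actions
        self.rng = rng
        self.score = defaultdict(float)      # (condition, action) -> reinforcement

    def choose(self, condition):
        best = max(self.actions, key=lambda a: self.score[(condition, a)])
        if self.score[(condition, best)] <= 0.0:
            return self.rng.choice(self.actions)   # no rule has proven itself yet
        return best

    def update(self, condition, action, payoff):
        self.score[(condition, action)] += payoff
```

Because each rule is learned separately, the model can master one component of the opponent's strategy while still exploring the others, which is the behavioral pattern reported in the experimental findings.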

Secondly, model comparison may not be trivial, and the simpler conceptualization of repeated strategies is easier to test empirically. For example, consider two deliberately simplified models for the Battle of the Sexes game: a strategic model and an action-based model. The strategic model knows how to choose between alternating on even periods and alternating on odd periods. The action-based model repeats the opponent's action if it was successful (imitation) and randomizes if it was not. If actual subjects, some time after the start of the repeated Battle of the Sexes game, do coordinate in practice, then these two models would be observationally equivalent as explanations.

Exploring this problem in general, in chapter 2 we show how the conceptualization of repeated strategies through "elementary strategies" can be empirically separated from the action-based class of algorithms through experimental intervention. This is possible if the experimenter "freezes" the behavior of one of the players in the pair and equips it with a strategy of fixed complexity. In the laboratory experiment of Chapter 3, we achieve this through the use of a robot opponent (which does not imply that a robot is needed in practice: through the experiment we investigate the properties of the human learner, and they are identical inside and outside the laboratory). We use noise in the actions of the robot in order to break the cycles of winnings into which a player could fall "accidentally" in some rounds. We observe an increase in the frequency of successful "elementary strategies" across periods of the game (from early to late), as well as the verbalization of these strategies in the post-experimental questionnaire. The behavior of the participants demonstrates a correct adaptive response; moreover, the results of the experiment can be interpreted more broadly: "elementary strategies" are kept in participants' memory as separate, irreducible elements.

Finally, the registered data on learning dynamics obtained in this work, the methodological analysis of the human reaction to a robot opponent, and the conceptual revision of how players behave in the Rock-Paper-Scissors game (Wang et al., 2014) are valuable in their own right.

List of author's original articles

• Chernov G. V. How to Learn to Defeat Noisy Robot in Rock-Paper-Scissors Game: An Exploratory Study // HSE Economic Journal. 2020. Vol. 24. No. 4. P. 503-538.

• Chernov G. V., Susin I. S. Models of learning in games: the review // Journal of the New Economic Association. 2019.

• Chernov G. V., Susin I. S. Heuristics Recognition and Learning in Rock-paper-scissors Game: Experimental Study // Russian Journal of Economic Theory. 2018. Vol. 15. No. 3.

• Chernov G., Cheparuhin S., Susin I. Evaluation of Econometric Models of Adaptive Learning by Predictive Measures // SSRN. Series "Working Papers".

The candidate also presented work on the topic of the thesis at the following international conferences:

• XXI April International Scientific Conference on problems of development of economy and society (Moscow). Presentation: Identification and predictive power of learning models in economic experiments

• The workshop "Causality in the Social Sciences II", Germany. Presentation: "Conditional Learning in Non-Transitive Game: An Exploratory Study"

• iCare 6th International Conference on Applied Research in Economics. Presentation: Heuristics recognition and learning in Rock-Paper-Scissors game: experimental study


Introduction to the dissertation (part of the abstract) on the topic "Learning Models in Economic Experiments"

Annotation

The subject of this dissertation is learning in experimental games, from both theoretical and empirical perspectives. The main objective of this work is to develop a generalized concept of strategy-based learning and to test its application, including empirical identification, in a class of simple experimental games. The experiment supports the viability of the methodological study and shows an improvement in explanatory power with the new class of models.

The first chapter analyzes the central ideas and the current state of the economic theory of learning in games. Within the framework of game theory, learning can be seen both as an alternative to equilibrium analysis and as a way to investigate the nature of equilibrium concept(s). Outside of this framework, learning in games (starting from the classical Cournot dynamics) sheds new light on economic interactions, poses interesting theoretical and non-trivial econometric problems, and can be studied experimentally. Learning in games connects economics with other (sometimes unexpected) scientific disciplines: biology, the philosophy of rationality, and computer science. The first chapter examines in detail why there are so many learning models, which properties are crucial in a dynamic context, and what the criteria for the "goodness" of these models are. At the end of the chapter, a classification of learner models based on their crucial properties is presented.

The second chapter is devoted to the question of why it is so hard to study learning even in a laboratory setting, outlining several theoretical and practical concerns (such as the limited length of an experimental session). In particular, the simulations of (Salmon, 2001) show that, in cross-model (or "blind") testing of several models, the data generated by those models do not correspond to the estimated parameters. Thus, even when the real data generating process is known, we cannot distinguish correct models from incorrect ones by looking at the estimates. However, we demonstrate that part of these problems can be resolved through simulations and experimental design. We also present a simulation-based toolbox for testing weak identification on any particular experimental sample.

The third chapter studies learning in a strategic environment using experimental data from the Rock-Paper-Scissors game. In a repeated game framework, we explore the response of human subjects to the uncertain behavior of a strategically sophisticated opponent. We model this opponent as a robot that plays a stationary strategy with superimposed noise varying across four experimental treatments. Using experimental data from 85 subjects playing against such a stationary robot for 100 periods, we show that humans can decode its strategies, on average outperforming the random response to such a robot by 17%. Further, we show that the human ability to recognize such strategies decreases with exogenous noise in the behavior of the robot. We then fit the learning data to classical Reinforcement Learning (RL) and Fictitious Play (FP) models and show that the classic action-based approach to learning is inferior to the strategy-based one. We adapt the criteria from the second chapter and implement specific algorithms for the strategy-based class of learning from the first chapter in a 3x3 game. We also show, using a combination of experimental and post-experimental survey data, that human participants are better at learning separate components of the opponent's strategy than at recognizing the strategy as a whole. This decomposition offers a shorter and more intuitive way to figure out their own best response. We build a strategic extension of the classical learning models accounting for this behavioral fact and calibrate its practical application to our experimental data.



Bibliography

(1) S. Andersen, G. W. Harrison, M. I. Lau, and E. E. Rutstrom. Eliciting risk and time preferences. Econometrica, 76(3):583-618, 2008.

(2) J. Andreoni and C. Sprenger. Estimating time preferences from convex budgets. American Economic Review, 102(7):33-56, 2012.

(3) J. Arifovic and J. Ledyard. Scaling up learning models in public good games. Journal of Public Economic Theory, 6(2):203-238, 2004.

(4) J. Arifovic, R. McKelvey, and S. Pevnitskaya. An initial implementation of the turing tournament to learning in repeated two-person games. Games and Economic Behavior, 57(1):93-122, 2006.

(5) R. J. Aumann. Correlated equilibrium as an expression of bayesian rationality. Econometrica: Journal of the Econometric Society, pages 1-18, 1987.

(6) G. Basse and I. Bojinov. A general theory of identification. preprint, arXiv, 2020.

(7) A. Beggs. On the convergence of reinforcement learning. Journal of economic theory, 122:1-36, 2005.

(8) C. Bellemare, S. Kruger, and A. Van Soest. Measuring inequity aversion in a heterogeneous population using experimental decisions and subjective probabilities. Econometrica, 76(4):815-839, 2008.

(9) M. Benaim and M. Hirsch. Mixed equilibria and dynamical systems arising from fictitious play in perturbed games. Games and Economic Behavior, 29:36-72, 1999.

(10) D. Bergemann and J. Valimaki. Bandit problems. The New Palgrave Dictionary of Economics, 1(8):336-340, 2008.

(11) I. Bojinov, A. Rambachan, and N. Shephard. Panel experiments and dynamic causal effects: A finite population perspective. preprint, arXiv, 2020.

(12) J. Bracht and H. Ichimura. Identification of a general learning model on experimental game data. Working paper, Hebrew University of Jerusalem, 2001.

(13) A. Brandenburger. Strategic and structural uncertainty in games. In R. Zeckhauser, R. Keeney, and J. Sebenius, editors, Wise Choices: Games, Decisions, and Negotiations, pages 221-232. Boston: Harvard Business School Press, 1996.

(14) G. W. Brown. Iterative solution of games by fictitious play. In T. C. Koopmans, editor, Activity Analysis of Production and Allocation, page 376. New York: Wiley, 1951.

(15) S. Bubeck and N. Cesa-Bianchi. Regret analysis of stochastic and nonstochastic multi-armed bandit problems. Foundations and Trends in Machine Learning, 5(1):1-122, 2012.

(16) R. R. Bush and F. Mosteller. Stochastic models for learning. John Wiley & Sons, Inc., 1955.

(17) C. Camerer and H. Ho. Experience-weighted attraction learning in normal form games. Econometrica, 67(4):827-874, 1999.

(18) C. F. Camerer, A. Dreber, E. Forsell, T.-H. Ho, J. Huber, M. Johannesson, M. Kirchler, J. Almenberg, A. Altmejd, T. Chan, et al. Evaluating replicability of laboratory experiments in economics. Science, 351(6280):1433-1436, 2016.

(19) C. F. Camerer, A. Dreber, F. Holzmeister, T.-H. Ho, J. Huber, M. Johannesson, M. Kirchler, G. Nave, B. A. Nosek, T. Pfeiffer, et al. Evaluating the replicability of social science experiments in nature and science between 2010 and 2015. Nature Human Behaviour, 2(9):637-644, 2018.

(20) T. N. Cason and D. Friedman. Learning in a laboratory market with random supply and demand. Experimental Economics, 2(1):77-98, 1999.

(21) A. C. Chang and P. Li. A preanalysis plan to replicate sixty economics research papers that worked half of the time. American Economic Review, 107(5):60-64, 2017.

(22) G. V. Chernov. How to learn to defeat noisy robot in rock-paper-scissors game:an exploratory study. HSE Economic Journal, 24(4):503-538, 2020.

(23) Y. Cheung and D. Friedman. Individual learning in normal form games: Some laboratory results. Games and Economic Behavior, 19:46-76, 1997.

(24) T. Chmura, S. J. Goerg, and R. Selten. Learning in experimental 2x2 games. Games and Economic Behavior, 76(1):44-73, 2012.

(25) A. Cournot. Researches into the Mathematical Principles of the Theory of Wealth. Translated by N. Bacon. New York: Kelley, 1960.

(26) E. Czibor, D. Jimenez-Gomez, and J. A. List. The dozen things experimental economists should do (more of). Southern Economic Journal, 86(2):371-432, 2019.

(27) D. L. Chen, M. Schonger, and C. Wickens. oTree: An open-source platform for laboratory, online, and field experiments. Journal of Behavioral and Experimental Finance, 9:88-97, 2016.

(28) P. Duersch, A. Kolb, and J. Oechssler. Rage against the machines: how subjects play against learning algorithms. Economic Theory, 43(3):407-430, 2010.

(29) P. Duersch, J. Oechssler, and B. C. Schipper. Unbeatable imitation. Games and Economic Behavior, 76(1):88-96, 2012.

(30) H. Ebbinghaus. Memory: a contribution to experimental psychology. Annals of neurosciences, 20 (4):155, 1885/1974.

(31) I. Erev and E. Haruvy. Generality and the role of descriptive learning models. Journal of Mathematical Psychology, 49(5):357-371, 2005.

(32) I. Erev and E. Haruvy. Learning and the economics of small decisions. In The handbook of experimental economics Vol. 2, pages 638-716. Princeton University Press, 2013.

(33) I. Erev and A. Roth. Predicting how people play games: Reinforcement learning in experimental games with unique, mixed strategy equilibria. American Economic Review, 88:848-881, 1998.

(34) I. Erev, D. Gopher, R. Itkin, and Y. Greenshpan. Toward a generalization of signal detection theory to n-person games: The example of a two-person safety problem. Journal of Mathematical Psychology, 39(4):360-375, 1995.

(35) I. Erev, A. Roth, R. L. Slonim, and G. Barron. Learning and equilibrium as useful approximations: Accuracy of prediction on randomly selected constant sum games. Economic Theory, 33(1):29-51, 2007.

(36) I. Erev, E. Ert, A. E. Roth, E. Haruvy, S. M. Herzog, R. Hau, R. Hertwig, T. Stewart, R. West, and C. Lebiere. A choice prediction competition: Choices from experience and from description. Journal of Behavioral Decision Making, 23(1):15-47, 2010.

(37) W. K. Estes. Probability learning. In Categories of human learning, pages 89-128. Elsevier, 1964.

(38) D. Foster and H. P. Young. On the impossibility of predicting the behavior of rational agents. Proceedings of the National Academy of Sciences of the USA, 98(22):12848-12853, 2001.

(39) D. Foster and P. Young. Regret testing: Learning to play nash equilibrium without knowing you have an opponent. Theoretical Economics, 1(3):341-367, 2006.

(40) D. P. Foster and R. Vohra. Asymptotic calibration. Biometrika, 85:379-390, 1998.

(41) D. P. Foster and R. V. Vohra. Calibrated learning and correlated equilibrium. Games and Economic Behavior, 21(1-2):40-55, 1997.

(42) D. P. Foster and H. P. Young. Learning, hypothesis testing, and Nash equilibrium. Games and Economic Behavior, 45(1):73-96, 2003.

(43) D. Fudenberg and D. Levine. Steady state learning and Nash equilibrium. Econometrica, 61:547-573, 1993.

(44) D. Fudenberg and D. Levine. The theory of learning in games, volume 2. MIT Press, 1998.

(45) D. Fudenberg and D. Levine. Learning and equilibrium. Annual Review of Economics, 1(1):385-420, 2009.

(46) D. Fudenberg and D. Levine. Whither game theory? Towards a theory of learning in games. Journal of Economic Perspectives, 30(4):151-170, 2016.

(47) D. Fudenberg and J. Tirole. Noncooperative game theory for industrial organization: an introduction and overview. Handbook of Industrial Organization, 1:259-327, 1989.

(48) D. Fudenberg and J. Tirole. Game Theory. MIT Press, Cambridge, Massachusetts, 1991.

(49) D. Fudenberg, J. Kleinberg, A. Liang, and S. Mullainathan. Measuring the Completeness of Theories. arXiv preprint, 2019.

(50) D. Fudenberg, W. Gao, and A. Liang. How flexible is that functional form? Quantifying the Restrictiveness of Theories. arXiv preprint, 2020.

(51) A. Gelman and J. Hill. Data analysis using regression and multilevel/hierarchical models, volume 1. Cambridge University Press, New York, NY, USA, 2007.

(52) J. C. Gittins. Bandit processes and dynamic allocation indices. Journal of the Royal Statistical Society, Series B (Methodological), pages 148-177, 1979.

(53) S. Goyal and M. Janssen. Can we rationally learn to coordinate? Theory and Decision, 40(1):29-49, 1996.

(54) A. W. Gregory and M. R. Veall. Formulating Wald tests of nonlinear restrictions. Econometrica, pages 1465-1468, 1985.

(55) W. Güth, R. Schmittberger, and B. Schwarze. An experimental analysis of ultimatum bargaining. Journal of Economic Behavior and Organization, 3:367-388, 1982.

(56) D. S. Hamermesh. Viewpoint: Replication in economics. Canadian Journal of Economics, 40(3): 715-733, 2007.

(57) N. Hanaki. Action learning versus strategy learning. Complexity, 9(5):41-50, 2004.

(58) J. Hannan. Approximation to Bayes risk in repeated play. In M. Dresher, A. W. Tucker, and P. Wolfe, editors, Contributions to the Theory of Games, volume 3, pages 97-139. Princeton University Press, Princeton, 1957.

(59) D. W. Harless and C. Camerer. The predictive utility of generalized expected utility theories. Econometrica, 62(6):1251-1289, 1994.

(60) S. Hart and A. Mas-Colell. A general class of adaptive strategies. Journal of Economic Theory, 98: 26-54, 2001.

(61) S. Hart and A. Mas-Colell. Uncoupled dynamics do not lead to Nash equilibrium. American Economic Review, 93:1830-1836, 2003.

(62) J. D. Hey and C. Orme. Investigating generalizations of expected utility theory using experimental data. Econometrica, 62(6):1291-1326, 1994.

(63) T. Ho, C. F. Camerer, and J. Chong. Self-tuning experience weighted attraction learning in games. Journal of Economic Theory, 133:177-198, 2007.

(64) J. Hofbauer and E. Hopkins. Learning in perturbed asymmetric games. Games and Economic Behavior, 52(1):133-152, 2005.

(65) E. Hopkins. Two competing models of how people learn in games. Econometrica, 70(6):2141-2166, 2002.

(66) C. A. Ioannou and J. Romero. A generalized approach to belief learning in repeated games. Games and Economic Behavior, 87:178-203, 2014.

(67) D. Kahneman and A. Tversky. On the interpretation of intuitive probability: A reply to Jonathan Cohen. Elsevier Science, 1979.

(68) S. M. Kakade and D. P. Foster. Deterministic calibration and Nash equilibrium. Journal of Computer and System Sciences, 74(1):115-130, 2008.

(69) E. Kalai and E. Lehrer. Rational learning leads to Nash equilibrium. Econometrica, 61:1019-1045, 1993.

(70) A. Lewbel. The identification zoo: Meanings of identification in econometrics. Journal of Economic Literature, 57(4):835-903, 2019.

(71) R. Marimon. Learning from learning in economics. In Advances in Economic Theory: The 7th World Congress. Cambridge University Press, 1996.

(72) L. Mathevet and J. Romero. Predictive repeated game theory: Measures and experiments. 2012.

(73) R. L. Matzkin. Identification of consumers' preferences when their choices are unobservable. Economic Theory, 26:423-443, 2005.

(74) R. L. Matzkin. Nonparametric identification. In J. Heckman and E. Leamer, editors, Handbook of Econometrics, volume 6B, pages 5307-5368. Elsevier, Amsterdam, 2007.

(75) R. D. McKelvey and T. R. Palfrey. Playing in the dark: Information, learning, and coordination in repeated games. California Institute of Technology, California, 2001.

(76) K. Miyasawa. On the convergence of the learning process in a 2 x 2 non-zero-sum game. Economic Research Program, Princeton University, Research Memorandum No. 33, 1961.

(77) P. Moffatt. Experimetrics. Palgrave Macmillan, London, 2014.

(78) R. Morton and K. C. Williams. Experimental political science and the study of causality: From nature to the lab. Cambridge University Press, Cambridge, 2010.

(79) J. M. J. Murre and J. Dros. Replication and analysis of Ebbinghaus' forgetting curve. PLOS ONE, 10(7), 2015. doi: 10.1371/journal.pone.0120644.

(80) J. Nachbar. Learning in games. In Complex Social and Behavioral Systems: Game Theory and Agent-Based Models, pages 485-498. Springer, New York, 2020.

(81) J. H. Nachbar. Evolutionary selection dynamics in games: Convergence and limit properties. International Journal of Game Theory, 19(1):59-89, 1990.

(82) J. H. Nachbar. Beliefs in repeated games. Econometrica, 73(2):459-480, 2005.

(83) R. Nagel. Unraveling in guessing games: An experimental study. American Economic Review, 85(5):1313-1326, 1995.

(84) J. A. Nevin. Behavioral momentum and the partial reinforcement effect. Psychological Bulletin, 103:44-56, 1988.

(85) I. Nevo and I. Erev. On Surprise, Change, and the Effect of Recent Outcomes. Frontiers in Cognitive Science, 2012.

(86) J. Robinson. An iterative method of solving a game. Annals of Mathematics, 54:296-301, 1951.

(87) J. Romero and Y. Rosokha. Constructing strategies in the indefinitely repeated prisoner's dilemma game. European Economic Review, 104:185-219, 2018.

(88) A. E. Roth and I. Erev. Learning in extensive-form games: Experimental data and simple dynamic models in the intermediate term. Games and Economic Behavior, 8:164-212, 1995.

(89) M. Rothschild. A two-armed bandit theory of market pricing. Journal of Economic Theory, 9: 185-202, 1974.

(90) A. Sadrieh. The alternating double auction market: A game theoretic and experimental investigation, volume 466. Springer Science & Business Media, 1998.

(91) T. C. Salmon. An evaluation of econometric models of adaptive learning. Econometrica, 69(6): 1597-1628, 2001.

(92) J. D. Sargan. Identification and lack of identification. Econometrica, 51:1605-1633, 1983.

(93) L. J. Savage. The Foundations of Statistics. New York, Wiley, 1954.

(94) R. Selten. Axiomatic characterization of the quadratic scoring rule. Experimental Economics, 1(1): 43-61, 1998.

(95) R. Selten. Learning direction theory and impulse balance equilibrium. In Economics Lab, pages 147-154. Routledge, 2004.

(96) R. Selten and J. Buchta. Experimental sealed bid first price auctions with directly observed bid functions. In D. Budescu, I. Erev, and R. Zwick, editors, Games and Human Behavior: Essays in Honor of Amnon Rapoport, pages 79-104. Lawrence Erlbaum Associates, Mahwah, N.J., 1999.

(97) R. Selten and R. Stoecker. End behavior in sequences of finite prisoner's dilemma supergames: A learning theory approach. Journal of Economic Behavior & Organization, 7(1):47-70, 1986.

(98) R. Selten, K. Abbink, and R. Cox. Learning direction theory and the winner's curse. Experimental Economics, 8(1):5-20, 2005.

(99) R. Selten, T. Chmura, T. Pitz, S. Kube, and M. Schreckenberg. Commuters' route choice behaviour. Games and Economic Behavior, 58(2):394-406, 2007.

(100) L. S. Shapley. Some topics in two-person games. In M. Dresher, L. S. Shapley, and A. W. Tucker, editors, Advances in Game Theory, pages 1-28. Princeton University Press, 1964.

(101) L. Spiliopoulos. Pattern recognition and subjective belief learning in a repeated constant-sum game. Games and Economic Behavior, 75:921-935, 2012.

(102) D. O. Stahl. Evolution of smart_n players. Games and Economic Behavior, 5(4):604-617, 1993.

(103) D. O. Stahl. A survey of rule learning in normal-form games. In Cognitive Processes and Economic Behavior, pages 43-62. Taylor and Francis, 2012.

(104) P. Suppes and R. C. Atkinson. Markov Learning Models for Multiperson Interactions. Stanford University Press, Stanford, C.A., 1960.

(105) R. S. Sutton and A. G. Barto. Reinforcement learning: An introduction. 2nd edition. MIT Press, 2018.

(106) E. L. Thorndike. Animal Intelligence. Macmillan, New York, 1911.

(107) E. L. Thorndike. The law of effect. American Journal of Psychology, 39:212-222, 1927.

(108) E. Van Damme. Stability and perfection of Nash equilibria, volume 339. Springer-Verlag, Berlin, 1991.

(109) Z. Wang, B. Xu, and H. Zhou. Social cycling and conditional responses in the rock-paper-scissors game. Scientific Reports, 4(1):1-7, 2014.

(110) J. B. Watson and G. A. Kimble. Behaviorism. Routledge, 2017.

(111) N. T. Wilcox. Theories of learning in games and heterogeneity bias. Econometrica, 74(5):1271-1292, 2006.

(112) P. Wilson. The misuse of the Vuong test for non-nested models to test for zero-inflation. Economics Letters, 127:51-53, 2015.

(113) E. Xie. Monetary payoff and utility function in adaptive learning models. Bank of Canada Staff Working Paper No. 2019-50, 2019.

(114) H. P. Young. Strategic learning and its limits. Oxford University Press, Oxford, 2004.
