Found 20 similar articles (search time: 0 ms)
1.
Self-tuning experience weighted attraction learning in games (total citations: 2, self-citations: 0, citations by others: 2)
Self-tuning experience weighted attraction (EWA) is a one-parameter theory of learning in games. It addresses a criticism that an earlier model (EWA) has too many parameters, by fixing some parameters at plausible values and replacing others with functions of experience so that they no longer need to be estimated. Consequently, it is econometrically simpler than the popular weighted fictitious play and reinforcement learning models. The functions of experience which replace free parameters “self-tune” over time, adjusting in a way that selects a sensible learning rule to capture subjects’ choice dynamics. For instance, the self-tuning EWA model can turn from a weighted fictitious play into an averaging reinforcement learning as subjects equilibrate and learn to ignore inferior foregone payoffs. The theory was tested on seven different games, and compared to the earlier parametric EWA model and a one-parameter stochastic equilibrium theory (QRE). Self-tuning EWA does as well as EWA in predicting behavior in new games, even though it has fewer parameters, and fits reliably better than the QRE equilibrium benchmark.
2.
We explore how learning to play strategically in one signaling game promotes strategic play in a related signaling game. Following
convergence to a pooling equilibrium, payoffs are changed to only support separating equilibria. More strategic play is observed
following the change in payoffs than for inexperienced subjects in control sessions, contrary to the prediction of a fictitious
play learning model. Introducing a growing proportion of sophisticated learners, subjects who anticipate responders’ behavior
following the change in payoffs, enables the model to capture the positive cross-game learning observed in the data.
Research support from the National Science Foundation grant number SBR9809538 is gratefully acknowledged. We have received
research support from Jo Ducey, Guillaume Frechette, Steve Lehrer, and Carol Kraker Stockman. We have benefitted from comments
of Eric Bettinger, John Ham, Jim Rebeitzer, Bob Slonim and seminar participants at Case Western Reserve University, Ohio State
University, the University of Mississippi, the University of Illinois, and Purdue University. The usual caveat applies.
3.
John B. Van Huyck, Raymond C. Battalio, Frederick W. Rankin 《Experimental Economics》2007,10(3):205-220
This paper reports an experiment designed to detect the influence of strategic uncertainty on behavior in order statistic
coordination games, which arise when a player’s best response is an order statistic of the cohort’s action combination. Unlike
previous experiments using order statistic coordination games, the new experiment holds the payoff function constant and only
changes cohort size and order statistic.
Electronic Supplementary Material The online version of this article () contains supplementary material, which is available to authorized users. Related research available at
4.
Strategic stability and uniqueness in signaling games (total citations: 1, self-citations: 0, citations by others: 1)
A class of signaling games is studied in which a unique Universally Divine equilibrium outcome exists. We identify a monotonicity property under which a variation of Universal Divinity is generically equivalent to strategic stability. Further assumptions guarantee the existence of a unique Universally Divine outcome.
5.
Ernan Haruvy, Dale O. Stahl 《Games and Economic Behavior》2012,74(1):208-221
Rule learning posits that decision makers, rather than choosing over actions, choose over behavioral rules with different levels of sophistication. Rules are reinforced over time based on their historically observed payoffs in a given game. Past works on rule learning have shown that when playing a single game over a number of rounds, players can learn to form sophisticated beliefs about others. Here we are interested in learning that occurs between games where the set of actions is not directly comparable from one game to the next. We study a sequence of ten thrice-played dissimilar games. Using experimental data, we find that our rule learning model captures the ability of players to learn to reason across games. However, this learning appears different from within-game rule learning as previously documented. The main adjustment in sophistication occurs by switching from non-belief-based strategies to belief-based strategies. The sophistication of the beliefs themselves increases only slightly over time.
6.
Summary. This paper studies adaptive learning in extensive form games and provides conditions for convergence points of adaptive learning
to be sequential equilibria. Precisely, we present a set of conditions on learning sequences such that an assessment is a
sequential equilibrium if and only if there is a learning sequence fulfilling the conditions, which leads to the assessment.
Received: November 5, 1996; revised version: May 28, 1997
7.
Luis R. Izquierdo, Segismundo S. Izquierdo, Nicholas M. Gotts, J. Gary Polhill 《Games and Economic Behavior》2007,61(2):259-276
Reinforcement learners tend to repeat actions that led to satisfactory outcomes in the past, and avoid choices that resulted in unsatisfactory experiences. This behavior is one of the most widespread adaptation mechanisms in nature. In this paper we fully characterize the dynamics of one of the best known stochastic models of reinforcement learning [Bush, R., Mosteller, F., 1955. Stochastic Models of Learning. Wiley & Sons, New York] for 2-player 2-strategy games. We also provide some extensions for more general games and for a wider class of learning algorithms. Specifically, it is shown that the transient dynamics of Bush and Mosteller's model can be substantially different from its asymptotic behavior. It is also demonstrated that in general, and in sharp contrast to other reinforcement learning models in the literature, the asymptotic dynamics of Bush and Mosteller's model cannot be approximated using the continuous time limit version of its expected motion.
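The abstract above gives no update equations, so the following is only a minimal sketch of an aspiration-based, linear-operator rule in the Bush and Mosteller style for a 2-player 2-strategy game. The function names, the normalisation of the stimulus, the shared aspiration level, and the payoff table in the usage note are all illustrative assumptions, not details taken from the paper.

```python
import random


def stimulus(payoff, aspiration, max_abs_gap):
    """Normalised stimulus in [-1, 1]; positive when the payoff exceeds the aspiration.

    Assumes max_abs_gap > 0, i.e. at least one payoff differs from the aspiration.
    """
    return (payoff - aspiration) / max_abs_gap


def bm_update(p, s, learning_rate):
    """Return the new probability of the action just taken, given stimulus s.

    A satisfactory outcome (s >= 0) moves p towards 1; an unsatisfactory
    outcome (s < 0) moves p towards 0.
    """
    if s >= 0:
        return p + learning_rate * s * (1.0 - p)
    return p + learning_rate * s * p


def simulate(payoffs, aspiration, learning_rate, rounds, seed=0):
    """Simulate two learners; returns each player's probability of playing action 0.

    payoffs[(a1, a2)] is a hypothetical table mapping the action pair
    (each action is 0 or 1) to (payoff to player 1, payoff to player 2).
    """
    rng = random.Random(seed)
    max_gap = max(abs(v - aspiration) for pair in payoffs.values() for v in pair)
    probs = [0.5, 0.5]
    for _ in range(rounds):
        actions = [0 if rng.random() < p else 1 for p in probs]
        outcome = payoffs[(actions[0], actions[1])]
        for i in range(2):
            s = stimulus(outcome[i], aspiration, max_gap)
            # Work with the probability of the action that was actually chosen.
            p_chosen = probs[i] if actions[i] == 0 else 1.0 - probs[i]
            p_chosen = bm_update(p_chosen, s, learning_rate)
            probs[i] = p_chosen if actions[i] == 0 else 1.0 - p_chosen
    return probs
```

For instance, calling `simulate` with a Prisoner's Dilemma table such as `{(0, 0): (3, 3), (0, 1): (0, 4), (1, 0): (4, 0), (1, 1): (1, 1)}`, an aspiration of 2 and a learning rate of 0.5 traces one sample path of the process; different seeds can produce quite different short-run paths.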
8.
Olga Shurchkov 《Experimental Economics》2013,16(3):313-334
Coordination problems are ubiquitous in social and economic life. Political mass demonstrations, the decision whether to join a speculative currency attack, investment in a risky venture, and capital flight from a particular country are all characterized by coordination problems. Furthermore, all these events have a dynamic nature which has been largely omitted from previous experimental studies. Here I use a two-stage variant of a dynamic global game to study experimentally how the arrival of information in a dynamic setting affects the relative aggressiveness of speculators. In the first stage, subjects exhibit excess aggressiveness, which appears to be driven by beliefs about others’ actions rather than an intrinsic taste for attacking. However, following a failed first-stage attack, subjects learn to be less aggressive in the second stage. On the other hand, the arrival of new, more precise information after a failed attack leads to an increase in subjects’ aggressiveness. Beliefs, again, play a crucial role in explaining how the arrival of information affects attacking behavior.
9.
Alejandro M. Manelli 《Economic Theory》1996,7(2):323-335
Summary. For a class of infinite signaling games, the perfect Bayesian equilibrium strategies of finite approximating games converge to equilibrium strategies of the infinite game. This proves the existence of perfect Bayesian equilibrium for that class of games. It is well known that in general, equilibria may not exist in infinite signaling games. I am very grateful to Karl Iorio with whom I derived most of the results in this paper. I am solely responsible for any remaining errors. I am also grateful to Robert Anderson, Debra Aron, Eddie Dekel, Raymond Deneckere, Michael Kirscheneiter, Steven Matthews, Roger Myerson, Daniel Vincent and Robert Weber for comments on previous drafts of this paper.
10.
Leading-by-example and signaling in voluntary contribution games: an experimental study (total citations: 2, self-citations: 0, citations by others: 2)
We report experimental results on the effect of leadership in a voluntary contribution game. Consistent with recent theories
we find that leading-by-example increases contributions and earnings in an environment where a leader has private information
about the returns from contributing (Hermalin in Am Econ Rev 88:1188–1206, 1998; Vesterlund in J Public Econ 87:627–657, 2003).
In contrast the ability to lead-by-example has no effect on total contributions and earnings when such returns are commonly
known. In our environment the success of leadership therefore appears to be driven by signaling rather than by nonpecuniary
factors such as reciprocity.
This paper was started while the authors were visiting the Harvard Business School during the fall of 2000. We are grateful
for their hospitality and financial support. Vesterlund acknowledges support from the National Science Foundation and Potters
from the Royal Netherlands’ Academy of Arts and Sciences. We thank Henrik Orzen for assistance in conducting the experiment.
We also thank David Cooper and an anonymous referee who helped us improve the paper. Finally we thank Chris Anderson, Jim
Andreoni, John Duffy, Simon Gaechter, Ernan Haruvy, Muriel Niederle, Jack Ochs, Elke Renner, Al Roth, participants at ESA-meetings
(Barcelona, 2001), the Leadership and Social Interactions Workshop (Lyon, 2003), SITE (Stanford, 2004) and seminar participants
at Alabama, CMU, Duke, Keele, Maryland, Nottingham, NYU, Pittsburgh, OSU, and York for valuable comments.
11.
In this paper we apply the concept of preference conjecture equilibrium introduced in Perea (2005) to signaling games and show its relation to sequential equilibrium. We introduce the concept of minimum revision equilibrium and show how this can be interpreted as a refinement of sequential equilibrium.
12.
We report the results of experiments designed to test the impact of social status on learning in a coordination game. In the
experiment, all subjects observe the play of an agent who either has high status or low status. In one treatment the agent
is another player in the game; in the other the agent is a simulated player. Status is assigned within the experiment based
on answers to a trivia quiz. The coordination game has two equilibria: one is payoff-dominant but risky, and the other is
risk-dominant. The latter is most commonly chosen in experiments where there is no coordination device. We find that a commonly
observed agent enhances coordination on the payoff-dominant equilibrium more often when the agent has high status.
13.
Summary. Tacit coordination in large groups is studied in an iterated market entry game with complete information and multiple market capacities that are varied randomly from period to period. In each period, each player must decide independently whether to enter any of the markets, and if entering, which of the two markets to enter. Across symmetric and asymmetric markets, we find remarkable coordination on the aggregate level, which is accounted for by the Nash equilibrium, together with considerable individual differences in frequency of entry and decision rules. With experience, the decisions of most players converge to decision rules with cutoff values on the combined market capacity that determine whether or not to enter but not which of the two markets to enter. This latter decision is determined probabilistically by the differential market capacities. The aggregate and individual results are accounted for quite well by a reinforcement-based learning model that combines deterministic and probabilistic elements.
14.
We report experiments studying mixed strategy Nash equilibria that are theoretically stable or unstable under learning. The Time Average Shapley Polygon (TASP) predicts behavior in the unstable case. We study two versions of Rock-Paper-Scissors that include a fourth strategy, Dumb. The unique Nash equilibrium is identical in the two games, but the predicted frequency of Dumb is much higher in the game where the NE is stable. Consistent with TASP, the observed frequency of Dumb is lower and play is further from Nash in the high payoff unstable treatment. However, Dumb is played too frequently in all treatments.
15.
16.
In repeated games, subgame-perfect equilibria involving threats of punishment may be implausible if punishing one player hurts the other(s). If players can renegotiate after a defection, such a punishment may not be carried out. We explore a solution concept that recognizes this fact, and show that in many games the prospect of renegotiation strictly limits the cooperative outcomes that can be sustained. We characterize those outcomes in general, and in the prisoner's dilemma, Cournot and Bertrand duopolies, and an advertising game in particular.
17.
Lorenzo Rocco 《International Review of Economics》2007,54(2):225-247
This paper's purpose is twofold. First, it offers a critical review of the proofs of existence of pure strategy Nash Equilibria
in nonatomic games. In particular, it focuses on the alternative ways of formalizing the critical assumption of anonymity.
Second, the paper proves the existence of pure strategy Nash Equilibria by relaxing anonymity and allowing instead for “limited
anonymity” (i.e. players’ decisions depend on the average strategy of a finite number of players’ subsets and not on the average
strategy of the whole set of players). (JEL: C72, C79)
18.
Summary. Experimental games typically involve subjects playing the same game a number of times. In the absence of perfect rationality by all players, the subjects may use the behavior of their opponents in early rounds to learn about the extent of irrationality in the population they face. This makes the problem of finding the Bayes-Nash equilibrium of the experimental game much more complicated than finding the game-theoretic solution to the ideal game without irrationality. We propose and implement a computationally intensive algorithm for finding the equilibria of complicated games with irrationality via the minimization of an appropriate multi-variate function. We propose two hypotheses about how agents learn when playing experimental games. The first posits that they tend to learn about each opponent as they play it repeatedly, but do not learn about the population parameters through their observations of random opponents (myopic learning). The second posits that both types of learning take place (sequential learning). We introduce a computationally intensive sequential procedure to decide on the informational value of conducting additional experiments. With the help of that procedure, we decided after 12 experiments that our original model of irrationality was unsatisfactory for the purpose of discriminating between our two hypotheses. We changed our models, allowing for two different types of irrationality, reanalyzed the old data, and conducted 7 more experiments. The new model successfully discriminated between our two hypotheses about learning. After only 7 more experiments, our approximately optimal stopping rule led us to stop sampling and accept the model where both types of learning occur. We acknowledge the financial support from NSF grant #SES9011828 to the California Institute of Technology.
We also acknowledge the able research assistance of Mark Fey, Lynell Jackson and Jeffrey Prisbrey in setting up the experiments, recruiting subjects and running the experiments. We acknowledge the help of the Jet Propulsion Laboratory and its staff members for giving us access to their Cray XMP/18, and subsequently their Cray YMP2E/116.
19.
Commitment and observability in games (total citations: 2, self-citations: 0, citations by others: 2)
20.
Individuals belonging to two large populations are repeatedly randomly matched to play a cyclic game such as Matching Pennies. Between matching rounds, individuals sometimes change their strategy after observing a finite sample of other outcomes within their population. Individuals from the same population follow the same behavioral rule. In the resulting discrete time dynamics the unique Nash equilibrium is unstable. However, for sample sizes greater than one, we present an imitation rule where long run play cycles closely around the equilibrium.