Found 20 similar documents; search took 84 ms
1.
Self-tuning experience weighted attraction learning in games (total citations: 2; self-citations: 0; citations by others: 2)
Self-tuning experience weighted attraction (EWA) is a one-parameter theory of learning in games. It addresses a criticism that an earlier model (EWA) has too many parameters, by fixing some parameters at plausible values and replacing others with functions of experience so that they no longer need to be estimated. Consequently, it is econometrically simpler than the popular weighted fictitious play and reinforcement learning models. The functions of experience which replace free parameters “self-tune” over time, adjusting in a way that selects a sensible learning rule to capture subjects’ choice dynamics. For instance, the self-tuning EWA model can turn from a weighted fictitious play into an averaging reinforcement learning as subjects equilibrate and learn to ignore inferior foregone payoffs. The theory was tested on seven different games, and compared to the earlier parametric EWA model and a one-parameter stochastic equilibrium theory (QRE). Self-tuning EWA does as well as EWA in predicting behavior in new games, even though it has fewer parameters, and fits reliably better than the QRE equilibrium benchmark.
2.
Ying Chen 《Economics Letters》2012,114(3):343-345
I find in two classes of sender-receiver games that the receiver’s equilibrium payoff is not increasing in the informativeness of a public signal because the sender may transmit less information when the public signal is more informative.
3.
Aya Kaya 《Games and Economic Behavior》2009,66(2):841-854
I analyze a class of repeated signaling games in which the informed player's type is persistent and the history of actions is perfectly observable. In this context, a large class of possibly complex sequences of signals can be supported as the separating equilibrium actions of the “strong type” of the informed player. I characterize the set of such sequences. I also characterize the sequences of signals in least cost separating equilibria (LCSE) of these games. In doing this, I introduce a state variable that can be interpreted as a measure of reputation. This gives the optimization problem characterizing the LCSE a recursive structure. I show that, in general, the equilibrium path sequences of signals have a simple structure. The shapes of the optimal sequences depend critically on the relative concavities of the payoff functions of different types, which measure the relative preferences towards payoff smoothing.
5.
We explore how learning to play strategically in one signaling game promotes strategic play in a related signaling game. Following
convergence to a pooling equilibrium, payoffs are changed to only support separating equilibria. More strategic play is observed
following the change in payoffs than for inexperienced subjects in control sessions, contrary to the prediction of a fictitious
play learning model. Introducing a growing proportion of sophisticated learners, subjects who anticipate responders’ behavior
following the change in payoffs, enables the model to capture the positive cross-game learning observed in the data.
Research support from the National Science Foundation grant number SBR9809538 is gratefully acknowledged. We have received
research support from Jo Ducey, Guillaume Frechette, Steve Lehrer, and Carol Kraker Stockman. We have benefitted from comments
of Eric Bettinger, John Ham, Jim Rebeitzer, Bob Slonim and seminar participants at Case Western Reserve University, Ohio State
University, the University of Mississippi, the University of Illinois, and Purdue University. The usual caveat applies.
6.
John B. Van Huyck Raymond C. Battalio Frederick W. Rankin 《Experimental Economics》2007,10(3):205-220
This paper reports an experiment designed to detect the influence of strategic uncertainty on behavior in order statistic
coordination games, which arise when a player’s best response is an order statistic of the cohort’s action combination. Unlike
previous experiments using order statistic coordination games, the new experiment holds the payoff function constant and only
changes cohort size and order statistic.
Electronic Supplementary Material The online version of this article () contains supplementary material, which is available to authorized users. Related research available at
7.
Strategic stability and uniqueness in signaling games (total citations: 1; self-citations: 0; citations by others: 1)
A class of signaling games is studied in which a unique Universally Divine equilibrium outcome exists. We identify a monotonicity property under which a variation of Universal Divinity is generically equivalent to strategic stability. Further assumptions guarantee the existence of a unique Universally Divine outcome.
8.
《Games and Economic Behavior》2003,42(1):25-47
This article studies situations in which agents do not initially know the effect of their decisions, but learn from experience the payoffs induced by their choices and their opponents'. We characterize equilibrium payoffs in terms of simple strategies in which an exploration phase is followed by a payoff acquisition phase.
9.
Ernan Haruvy Dale O. Stahl 《Games and Economic Behavior》2012,74(1):208-221
Rule learning posits that decision makers, rather than choosing over actions, choose over behavioral rules with different levels of sophistication. Rules are reinforced over time based on their historically observed payoffs in a given game. Past works on rule learning have shown that when playing a single game over a number of rounds, players can learn to form sophisticated beliefs about others. Here we are interested in learning that occurs between games where the set of actions is not directly comparable from one game to the next. We study a sequence of ten thrice-played dissimilar games. Using experimental data, we find that our rule learning model captures the ability of players to learn to reason across games. However, this learning appears different from within-game rule learning as previously documented. The main adjustment in sophistication occurs by switching from non-belief-based strategies to belief-based strategies. The sophistication of the beliefs themselves increases only slightly over time.
10.
《Games and Economic Behavior》2010,69(2):716-730
Psychologists have long recognized two kinds of learning: one that is relatively shallow and domain-specific; and another that is deeper, producing generalizable insights that transfer across domains. The game theory literature has only recently considered this distinction, and the conditions that stimulate the latter kind of “meaningful” learning in games are still unclear. Three experiments demonstrate that one kind of meaningful learning (acquisition of iterated dominance) occurs in the absence of any feedback. We demonstrate that such feedback-free meaningful learning transfers to new strategically similar games, and that such transfer does not typically occur when initial games are played with feedback. The effects of withholding feedback are similar to, and substitutable with, those produced by requiring players to explain their behavior, a method commonly employed in psychology to increase deliberation. This similarity suggests that withholding feedback encourages deeper thinking about the game in a manner similar to such self-explanation.
11.
12.
The paper explores the implications of melioration learning, an empirically significant variant of reinforcement learning, for game theory. We show that in games with invariable pay-offs melioration learning converges to Nash equilibria in a way similar to the replicator dynamics. Since melioration learning is known to deviate from optimizing behavior when an action’s rewards decrease with increasing relative frequency of that action, we also investigate an example of a game with frequency-dependent pay-offs. Interactive melioration learning is then still appropriately described by the replicator dynamics, but it indeed deviates from rational choice behavior in such a game.
13.
Luis R. Izquierdo Segismundo S. Izquierdo Nicholas M. Gotts J. Gary Polhill 《Games and Economic Behavior》2007,61(2):259-276
Reinforcement learners tend to repeat actions that led to satisfactory outcomes in the past, and avoid choices that resulted in unsatisfactory experiences. This behavior is one of the most widespread adaptation mechanisms in nature. In this paper we fully characterize the dynamics of one of the best known stochastic models of reinforcement learning [Bush, R., Mosteller, F., 1955. Stochastic Models of Learning. Wiley & Sons, New York] for 2-player 2-strategy games. We also provide some extensions for more general games and for a wider class of learning algorithms. Specifically, it is shown that the transient dynamics of Bush and Mosteller's model can be substantially different from its asymptotic behavior. It is also demonstrated that in general, and in sharp contrast to other reinforcement learning models in the literature, the asymptotic dynamics of Bush and Mosteller's model cannot be approximated using the continuous time limit version of its expected motion.
14.
Summary. This paper studies adaptive learning in extensive form games and provides conditions for convergence points of adaptive learning
to be sequential equilibria. Precisely, we present a set of conditions on learning sequences such that an assessment is a
sequential equilibrium if and only if there is a learning sequence fulfilling the conditions, which leads to the assessment.
Received: November 5, 1996; revised version: May 28, 1997
15.
《Games and Economic Behavior》2008,62(2):259-276
Reinforcement learners tend to repeat actions that led to satisfactory outcomes in the past, and avoid choices that resulted in unsatisfactory experiences. This behavior is one of the most widespread adaptation mechanisms in nature. In this paper we fully characterize the dynamics of one of the best known stochastic models of reinforcement learning [Bush, R., Mosteller, F., 1955. Stochastic Models of Learning. Wiley & Sons, New York] for 2-player 2-strategy games. We also provide some extensions for more general games and for a wider class of learning algorithms. Specifically, it is shown that the transient dynamics of Bush and Mosteller's model can be substantially different from its asymptotic behavior. It is also demonstrated that in general, and in sharp contrast to other reinforcement learning models in the literature, the asymptotic dynamics of Bush and Mosteller's model cannot be approximated using the continuous time limit version of its expected motion.
16.
This paper reports an experiment using Stag Hunt games with different payoff ranges. Specifically, in one treatment for half of the games the payoff dominant and the risk dominant equilibrium coincide and for the other half of the games they conflict; in the second treatment they always conflict. The experiment provides evidence that the payoff range experienced by the participant influences the likelihood of efficient conventions emerging. In particular, experiencing games where payoff dominance and risk dominance coincide appears to make payoff dominance more attractive in games in which they conflict. In the experiment, we also observe conditional behavior emerging with experience. We develop a model of conditional expectations to explain these stylized facts that depends crucially on the assumption that after a brief learning period participants categorize their experience using the same relative bandwidth in both treatments even though the range of experience is twice as large in treatment 1 as it is in treatment 2. The assumption cannot be rejected by the data. The analysis provides a formal example in which increasing experienced diversity by changing the way similar experiences are categorized increases the likelihood of efficient conventions emerging in communities playing similar Stag Hunt games.
17.
A deterministic learning model applied to a game with multiple equilibria produces distinct basins of attraction for those
equilibria. In symmetric two-by-two games, basins of attraction are invariant to a wide range of learning rules including
best response dynamics, replicator dynamics, and fictitious play. In this paper, we construct a class of three-by-three symmetric
games for which the overlap in the basins of attraction under best response learning and replicator dynamics is arbitrarily
small. We then derive necessary and sufficient conditions on payoffs for these two learning rules to create basins of attraction
with vanishing overlap. The necessary condition requires that with probability one the initial best response is not an equilibrium
to the game. The existence of parasitic or misleading actions allows subtle differences in the learning rules to accumulate.
18.
This paper experimentally compares the impact of the presence of strategic substitutes (GSS) and complements (GSC) on players’ ability to successfully play equilibrium strategies. By exploiting a simple property of the ordering on strategy spaces, our design allows us to isolate these effects by avoiding other confounding factors that are present in more complex settings, such as market games. We find that the presence of strategic complementarities significantly improves the rate of Nash play, but that this effect is driven mainly by early rounds of play. This suggests that GSS may be more difficult to learn initially, but that given sufficient time, the theoretically supported globally stable equilibrium offers a good prediction in both settings. We also show that increasing the degree of substitutability or complementarity does not significantly improve the rate of Nash play in either setting, which builds on the findings of previous studies.
19.
Cycling in a stochastic learning algorithm for normal form games (total citations: 2; self-citations: 0; citations by others: 2)
Martin Posch 《Journal of Evolutionary Economics》1997,7(2):193-207
In this paper we study a stochastic learning model for 2×2 normal form games that are played repeatedly. The main emphasis
is put on the emergence of cycles. We assume that the players have neither information about the payoff matrix of their opponent
nor about their own. At every round each player can only observe his or her action and the payoff he or she receives. We prove
that the learning algorithm, which is modeled by an urn scheme proposed by Arthur (1993), leads with positive probability
to a cycling of strategy profiles if the game has a mixed Nash equilibrium. In case there are strict Nash equilibria, the
learning process converges a.s. to the set of Nash equilibria.
20.
Olga Shurchkov 《Experimental Economics》2013,16(3):313-334
Coordination problems are ubiquitous in social and economic life. Political mass demonstrations, the decision whether to join a speculative currency attack, investment in a risky venture, and capital flight from a particular country are all characterized by coordination problems. Furthermore, all these events have a dynamic nature which has been largely omitted from previous experimental studies. Here I use a two-stage variant of a dynamic global game to study experimentally how the arrival of information in a dynamic setting affects the relative aggressiveness of speculators. In the first stage, subjects exhibit excess aggressiveness, which appears to be driven by beliefs about others’ actions rather than an intrinsic taste for attacking. However, following a failed first-stage attack, subjects learn to be less aggressive in the second stage. On the other hand, the arrival of new, more precise information after a failed attack leads to an increase in subjects’ aggressiveness. Beliefs, again, play a crucial role in explaining how the arrival of information affects attacking behavior.
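
Illustrative note: entries 12 and 17 above refer to replicator dynamics and best response dynamics, and entry 17 states that in symmetric two-by-two games the basins of attraction do not depend on which of these rules is used. The following is a minimal sketch of that statement, not code from any of the listed papers. The Stag Hunt payoff matrix, the Euler step size `dt`, the step count, and the function names are all assumptions chosen for illustration.

```python
import numpy as np

def replicator_step(x, A, dt=0.01):
    """One Euler step of the replicator dynamics for population shares x."""
    fitness = A @ x              # expected payoff of each pure strategy
    average = x @ fitness        # population-average payoff
    return x + dt * x * (fitness - average)

def best_response_step(x, A, dt=0.01):
    """One Euler step of continuous-time best response dynamics."""
    br = np.zeros_like(x)
    br[np.argmax(A @ x)] = 1.0   # pure best response to the current state
    return x + dt * (br - x)

def run(step, x0, A, n_steps=5000):
    """Iterate a learning rule from the initial state x0."""
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        x = step(x, A)
    return x

# Hypothetical symmetric Stag Hunt (strategy 0 = stag, 1 = hare).
# With these payoffs the mixed equilibrium puts share 0.75 on stag,
# which separates the two basins of attraction.
A = np.array([[4.0, 0.0],
              [3.0, 3.0]])

for x0 in ([0.8, 0.2], [0.7, 0.3]):
    print(x0, run(replicator_step, x0, A), run(best_response_step, x0, A))
```

Under these assumed payoffs, an initial stag share above 0.75 leads both rules toward the all-stag state and a share below 0.75 leads both toward all-hare, so the two basins coincide. The three-by-three construction in entry 17, where the basins nearly separate, is not reproduced here.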