Similar documents
Found 20 similar documents (search time: 62 ms)
1.
We report results from an experiment in which humans repeatedly play one of two games against a computer program that follows either a reinforcement or an experience-weighted attraction learning algorithm. Our experiment shows these learning algorithms detect exploitable opportunities more sensitively than humans. Also, learning algorithms respond to detected payoff-increasing opportunities systematically; however, the responses are too weak to improve the algorithms' payoffs. Human play against various decision maker types does not vary significantly. These factors lead to a strong linear relationship between the humans' and algorithms' action choice proportions that is suggestive of the algorithms' best response correspondences.

2.
3.
Which one should I imitate?   Total citations: 1 (self-citations: 0, citations by others: 1)
An individual repeatedly faces the same decision. Payoffs generated by his available actions are noisy. This individual forgets earlier experience and is limited to learning by observing the current success of two other individuals. We select among behavioral rules that lead an infinite population of identical individuals in the long run to the expected payoff maximizing action. Optimal behavior requires learning by imitation. The second most successful in a sample must sometimes be imitated although the most successful will always be imitated with a higher probability. Inertia is optimal. When each individual follows an optimal rule, then the population evolves according to an aggregate monotone dynamic.

4.
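Accurately estimating a user's click-through rate, and ranking items by that probability for the user to choose from, is of great importance in recommender systems. Machine learning models commonly used in recommender systems, such as factorization machines, generally consider only the probability that a user selects a single item and ignore the interactions among candidate items, whereas discrete choice models treat the candidate set of items as a whole. This paper proposes using a deep learning model to improve the discrete choice model. The model uses network structures such as a relative feature layer and an attention mechanism to help the deep learning model compare features across items. The results show that the deep learning model incorporating the discrete choice model outperforms models such as gradient boosting decision trees and factorization machines.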

5.
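Tian Hui, 《企业经济》 (Enterprise Economy), 2012, (2): 110-113
At present, the mobility of enterprise employees is the subject of considerable controversy. The main view holds that employee mobility causes losses to employers and violates contractual consent, and should therefore be restricted. The analysis in this paper shows that mobility is a rational choice made by employees in the labor market, a process of weighing multiple influencing factors. Employee mobility can improve the efficiency of the labor market and the quality of person-job matching, and is mutually beneficial. Promoting employee mobility regulated by the market mechanism itself therefore benefits both society as a whole and the employees themselves.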

6.
We study school choice markets where the non-strategy-proof Boston mechanism is used to assign students to schools. Inspired by previous field and experimental evidence, we analyze a type of behavior called priority-driven: students have a common ranking over the schools and then give a bonus in their submitted preferences to those schools for which they have high priority. We first prove that under this behavior, there is a unique stable and efficient matching, which is the outcome of the Boston mechanism. Second, we show that the three most prominent mechanisms in school choice (Boston, deferred acceptance, and top trading cycles) coincide when students’ submitted preferences are priority-driven. Finally, we run some computational simulations to show that the assumption of priority-driven preferences can be relaxed by introducing an idiosyncratic preference component, and our qualitative results carry over to a more general model of preferences.

7.
An Economist's Perspective on Probability Matching   Total citations: 3 (self-citations: 0, citations by others: 3)
The experimental phenomenon known as 'probability matching' is often offered as evidence in support of adaptive learning models and against the idea that people maximise their expected utility. Recent interest in dynamic-based equilibrium theories means the term re-appears in Economics. However, there seem to be conflicting views on what is actually meant by the term and about the validity of the data.
The purpose of this paper is therefore threefold: First, to introduce today's readers to what is meant by probability matching, and in particular to clarify which aspects of this phenomenon challenge the utility-maximisation hypothesis. Second, to familiarise the reader with the different theoretical approaches to behaviour in such circumstances, and to focus on the differences in predictions between these theories in light of recent advances. Third, to provide a comprehensive survey of repeated, binary choice experiments.
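To make the contrast with utility maximisation concrete, here is a minimal sketch under assumptions that are not taken from the abstract: a repeated binary prediction task in which one outcome occurs independently with a fixed probability `p`, and a payoff of 1 for each correct prediction. The function names are hypothetical. A matcher predicts the likelier outcome with probability `p`, while a maximiser always predicts it.

```python
def matching_accuracy(p: float) -> float:
    """Expected hit rate when predictions are randomised to match outcome frequencies."""
    # Correct when both prediction and outcome are A (p * p) or both are B ((1-p) * (1-p)).
    return p * p + (1 - p) * (1 - p)


def maximizing_accuracy(p: float) -> float:
    """Expected hit rate when the more likely outcome is always predicted."""
    return max(p, 1 - p)


if __name__ == "__main__":
    for p in (0.5, 0.7, 0.9):
        print(p, round(matching_accuracy(p), 4), maximizing_accuracy(p))
```

Under these assumptions matching never beats maximising, and the two coincide only at `p = 0.5` or when the outcome is certain, which is why persistent matching is read as a challenge to expected-utility maximisation.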

8.
Decision analysis using targets instead of utility functions   Total citations: 5 (self-citations: 0, citations by others: 5)
A common precept of decision analysis under uncertainty is the choice of an action which maximizes the expected value of a utility function. Savage's (1954) axioms for subjective expected utility provide a normative foundation for this principle of choice. This paper shows that the same set of axioms implies that one should select an action which maximizes the probability of meeting an uncertain target. This suggests a new perspective and an alternate target-based language for decision analysis. We explore the implications and the advantages of this target-based approach for both individual and group decision-making.
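One way to read the equivalence is sketched below, under assumptions that are not stated in the abstract: the utility function $u$ is nondecreasing, right-continuous, and normalized to range over $[0,1]$, so it can be treated as the distribution function of a random target $T$ that is independent of the action's outcome $X$. The symbols $T$ and $X$ are introduced here for illustration only.

```latex
u(x) = \Pr(T \le x)
\quad\Longrightarrow\quad
\mathbb{E}\,[u(X)] = \mathbb{E}\,\big[\Pr(T \le X \mid X)\big] = \Pr(T \le X).
```

Under these assumptions, ranking actions by expected utility is the same as ranking them by the probability that the outcome meets the target.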

9.
10.
Facing the challenge of attracting consumers and winning market share amid the proliferation of TV stations and channels, traditional TV stations often devise marketing strategies. However, how to evaluate the effectiveness of different strategies and select the best one is a key issue. This study proposes a way to resolve this problem. We develop an innovative structural model to simulate the dynamic choices consumers make under two interactive behaviors: learning and forgetting. Learning behavior refers to updating programme quality assessment by using experience, while forgetting behavior prevents the use of previous experience. Bayesian rules are employed to model learning behavior, and they are extended by incorporating an exponential decay function to measure the effect of forgetting behavior. The structural model is tested and validated by using Hong Kong television viewing data. The empirical results show that when modeling consumer choice decisions, considering learning and forgetting behavior significantly improves the performance of the model in regard to rating prediction and marketing strategy evaluation. Five cases are simulated to show how the model is used to evaluate marketing strategies. Managerial implications are then discussed to guide the decision-making of traditional TV broadcasters and advertisers.

11.
We investigate the finite sample properties of a large number of estimators for the average treatment effect on the treated that are suitable when adjustment for observed covariates is required, like inverse probability weighting, kernel and other variants of matching, as well as different parametric models. The simulation design used is based on real data usually employed for the evaluation of labour market programmes in Germany. We vary several dimensions of the design that are of practical importance, like sample size, the type of the outcome variable, and aspects of the selection process. We find that trimming individual observations with too much weight as well as the choice of tuning parameters are important for all estimators. A conclusion from our simulations is that a particular radius matching estimator combined with regression performs best overall, in particular when robustness to misspecifications of the propensity score and different types of outcome variables is considered an important property.

12.
In electronic marketplaces automated and dynamic pricing is becoming increasingly popular. Agents that perform this task can improve themselves by learning from past observations, possibly using reinforcement learning techniques. Co-learning of several adaptive agents against each other may lead to unforeseen results and increasingly dynamic behavior of the market. In this article we shed some light on price developments arising from a simple price adaptation strategy. Furthermore, we examine several adaptive pricing strategies and their learning behavior in a co-learning scenario with different levels of competition. Q-learning manages to learn best-reply strategies well, but is expensive to train.

13.
Q-learning agents in a Cournot oligopoly model   Total citations: 1 (self-citations: 1, citations by others: 0)
Q-learning is a reinforcement learning model from the field of artificial intelligence. We study the use of Q-learning for modeling the learning behavior of firms in repeated Cournot oligopoly games. Based on computer simulations, we show that Q-learning firms generally learn to collude with each other, although full collusion usually does not emerge. We also present some analytical results. These results provide insight into the underlying mechanism that causes collusive behavior to emerge. Q-learning is one of the few learning models available that can explain the emergence of collusive behavior in settings in which there is no punishment mechanism and no possibility for explicit communication between firms.
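The kind of simulation described can be sketched as follows. This is an illustrative setup, not the model from the abstract: it assumes two firms, a linear inverse demand curve, a constant marginal cost, a small quantity grid, stateless Q-learning (one Q-value per quantity, no discounting), and epsilon-greedy exploration. All names and parameter values are hypothetical, and the sketch makes no claim about whether collusion emerges.

```python
import random


def profit(q_own: float, q_other: float, a: float = 10.0, b: float = 1.0, cost: float = 1.0) -> float:
    """Per-round profit under the assumed inverse demand P = a - b * (q_own + q_other)."""
    price = max(a - b * (q_own + q_other), 0.0)
    return (price - cost) * q_own


def simulate(quantities, rounds=5000, alpha=0.1, epsilon=0.1, seed=0):
    """Run two stateless Q-learners against each other; return each firm's greedy quantity."""
    rng = random.Random(seed)
    n = len(quantities)
    q_values = [[0.0] * n for _ in range(2)]  # one row of Q-values per firm
    for _ in range(rounds):
        choices = []
        for firm in range(2):
            if rng.random() < epsilon:
                choices.append(rng.randrange(n))      # explore a random quantity
            else:
                row = q_values[firm]
                choices.append(row.index(max(row)))   # exploit the current best estimate
        for firm in range(2):
            own = quantities[choices[firm]]
            other = quantities[choices[1 - firm]]
            reward = profit(own, other)
            old = q_values[firm][choices[firm]]
            # Move the estimate for the chosen quantity toward the realised profit.
            q_values[firm][choices[firm]] = old + alpha * (reward - old)
    return [quantities[row.index(max(row))] for row in q_values]


if __name__ == "__main__":
    grid = [1.5, 2.25, 3.0, 3.75]
    print(simulate(grid))
```

With these assumed parameters the Cournot-Nash quantity is 3.0 per firm and the joint-profit-maximising quantity is 2.25 per firm, so the printed quantities can be compared against both benchmarks across different seeds.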

14.
Revealed preference theory on the choice of lotteries   Total citations: 1 (self-citations: 0, citations by others: 1)
The choice behavior of a decision-maker is said to be consistent with expected utility maximization if there exists a utility function defined on the set of prizes such that the decision-maker chooses lotteries with the highest expected utility. We present a revealed preference characterization of choice behavior that is consistent with expected utility maximization. A necessary and sufficient condition for expected utility maximization is that there does not exist a way to compound lotteries such that the probability distribution over the final prizes generated by the chosen lotteries of each observation is equal to that generated by the rejected lotteries of each observation. Our result is quite general and can be applied to any compact set of prizes and any choice correspondence.

15.
This paper introduces a learning algorithm that allows for imitation in recursive dynamic games. The Kiyotaki–Wright model of money is a well-known example of such decision environments. In this context, learning by experience has been studied before. Here, we introduce imitation as an additional channel for learning. In numerical simulations, we observe that the presence of imitation either speeds up social convergence to the theoretical Markov–Nash equilibrium or leads every agent of the same type to the same mode of suboptimal behavior. We observe an increase in the probability of convergence to equilibrium, as the incentives for optimal play become more pronounced.

16.
This paper identifies the globally stable conditions under which an individual facing the same choice many successive times learns to behave as prescribed by the expected‐utility model. The analysis starts from the relevant behavioural models suggested by psychology, updating probability estimations and outcome preferences according to the learning models suggested by neuroscience, in a manner analogous to Bayesian updating. The search context is derived from experimental economics, whereas the learning framework is borrowed from theoretical economics. Analytical results show that the expected‐utility model explains real behaviours in the long run whenever bad events are more likely than good events.

17.
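While bike sharing brings convenience to urban travel, it also faces the problem of unbalanced resource distribution. For the bike repositioning problem in an environment where the distribution of bikes changes dynamically, a real-time dispatching policy framework based on reinforcement learning is proposed. A model of the bike-sharing repositioning problem oriented toward reinforcement learning is constructed and solved with the deep deterministic policy gradient (DDPG) algorithm to obtain a real-time dispatching policy. Based on actual bike distribution data, an environment interaction simulator for the dispatching process is built. Finally, large-scale data experiments are carried out in the simulator using reinforcement learning. The results show that the dispatching policy obtained by the algorithm improves system performance and outperforms existing methods.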

18.
A digital mechanism is defined as an iterative procedure in which bidders select an action, from a finite set, in each iteration. When bidders have continuous valuations and make strategic reports, we show that any ex post implementation of the Vickrey choice rule via such a mechanism needs infinitely many iterations for almost all realizations of the bidders’ valuations. Thus, when valuations are drawn from a continuous probability distribution, the Vickrey choice rule can only be used at the expense of a running time that is infinite with probability one. This infeasibility result holds even in the case of two bidders and when the Vickrey choice rule is only required to be established with probability one. Establishing the efficient allocation when the n bidders report truthfully contrasts starkly with the previous setting: a bisection procedure has a finite running time almost always, and the expected number of reports is equal to 2n. Using a Groves payment scheme rather than Vickrey’s second price payment scheme somewhat mitigates the problem. We provide an example mechanism with a Groves payment scheme, in which the running time of the mechanism in equilibrium is finite with probability 1/2.

19.
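Yuan Yanhua, Xu Ying, Chen Honghai, 《价值工程》 (Value Engineering), 2011, 30(15): 230-231
By analyzing examples in which probabilistic and statistical methods are applied to explain human behavior and cognitive activity, this paper illustrates how such methods describe objective phenomena and help in understanding human cognitive behavior. It concludes that the probabilistic-statistical method is deductive reasoning based on induction, whose essence is the inductive reasoning of epistemology.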

20.
We study the learning behavior of a population of buyers and a population of sellers whose members are repeatedly randomly matched to engage in a sealed bid double auction. The agents are assumed to be boundedly rational and choose their strategies by imitating successful behavior and adding innovations triggered by random errors or communication with other agents. This process is modelled by a two-population genetic algorithm. A general characterization of the equilibria in mixed population distributions is given and it is shown analytically that only one-price equilibria are attractive for the GA dynamics. Simulation results confirm these findings and imply that, in cases with random initialization, with high probability the gain of trade is equally split between buyers and sellers.
