Regret minimization in repeated matrix games with variable stage duration期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

按检索

Regret minimization in repeated matrix games with variable stage duration

Authors:	Shie Mannor Nahum Shimkin

Institution:	^aDepartment of Electrical and Computer Engineering, McGill University, 3480 University Street, Montreal, PQ, Canada;^bDepartment of Electrical Engineering, Technion, Israel Institute of Technology, Haifa 32000, Israel

Abstract:	Regret minimization in repeated matrix games has been extensively studied ever since Hannan's seminal paper Hannan, J., 1957. Approximation to Bayes risk in repeated play. In: Dresher, M., Tucker, A.W., Wolfe, P. (Eds.), Contributions to the Theory of Games, vol. III. Ann. of Math. Stud., vol. 39, Princeton Univ. Press, Princeton, NJ, pp. 97–193]. Several classes of no-regret strategies now exist; such strategies secure a long-term average payoff as high as could be obtained by the fixed action that is best, in hindsight, against the observed action sequence of the opponent. We consider an extension of this framework to repeated games with variable stage duration, where the duration of each stage may depend on actions of both players, and the performance measure of interest is the average payoff per unit time. We start by showing that no-regret strategies, in the above sense, do not exist in general. Consequently, we consider two classes of adaptive strategies, one based on Blackwell's approachability theorem and the other on calibrated play, and examine their performance guarantees. We further provide sufficient conditions for existence of no-regret strategies in this model.

Keywords:	No-regret strategies Regret minimization Hannan consistency Best-response envelope Repeated matrix games Variable duration games Approachability Calibrated play
本文献已被 ScienceDirect 等数据库收录！

设为首页 | 免责声明 | 关于勤云 | 加入收藏