The QLBS Q-Learner goes NuQLear: fitted Q iteration, inverse RL, and option portfolios
Authors: Igor Halperin
Affiliation: Department of Financial Engineering, NYU Tandon School of Engineering, New York, NY, USA. Email: igor.halperin@nyu.edu
Abstract: The QLBS model is a discrete-time option hedging and pricing model based on Dynamic Programming (DP) and Reinforcement Learning (RL). It combines the well-known Q-Learning method of RL with the Black–Scholes(–Merton) (BSM) model's idea of reducing option pricing and hedging to the optimal rebalancing of a dynamic replicating portfolio for the option, made of a stock and cash. Here we expand on several NuQLear (Numerical Q-Learning) topics within the QLBS model. First, we investigate the performance of Fitted Q Iteration as an RL (data-driven) solution of the model, and benchmark it against a DP (model-based) solution, as well as against the BSM model. Second, we develop an Inverse Reinforcement Learning (IRL) setting for the model, in which we observe only prices and actions (re-hedges) taken by a trader, but not rewards. Third, we outline how the QLBS model can price portfolios of options, rather than a single option in isolation, thus providing its own data-driven and model-independent solution to the (in)famous volatility smile problem of the Black–Scholes model.
Keywords: Black–Scholes model; Reinforcement learning; Q-learning; Inverse reinforcement learning; Hedging risk
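The abstract's first contribution uses Fitted Q Iteration: from a batch of observed transitions, repeatedly regress Q(s, a) onto the Bellman target r + γ max_a' Q(s', a'). A minimal tabular sketch of that iteration on a toy MDP (not the QLBS model itself; the state space, rewards, and sample sizes here are hypothetical placeholders standing in for market data):

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions, gamma = 3, 2, 0.9

# Synthetic environment used only to generate a batch of transitions,
# analogous to historical prices and re-hedges observed off-line.
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # transition probs
R = rng.normal(size=(n_states, n_actions))                        # mean rewards

transitions = []
for _ in range(5000):
    s = int(rng.integers(n_states))
    a = int(rng.integers(n_actions))
    s_next = int(rng.choice(n_states, p=P[s, a]))
    r = R[s, a] + 0.1 * rng.normal()
    transitions.append((s, a, r, s_next))

# Fitted Q Iteration: each sweep re-fits Q(s, a) to the empirical mean of the
# Bellman targets r + gamma * max_a' Q(s', a') over the fixed batch.
Q = np.zeros((n_states, n_actions))
for _ in range(50):
    targets_sum = np.zeros_like(Q)
    counts = np.zeros_like(Q)
    for s, a, r, s_next in transitions:
        targets_sum[s, a] += r + gamma * Q[s_next].max()
        counts[s, a] += 1
    Q = np.where(counts > 0, targets_sum / np.maximum(counts, 1), Q)

greedy_policy = Q.argmax(axis=1)  # data-driven optimal action per state
print(greedy_policy)
```

In the QLBS setting, the regression step would use basis functions of the stock price rather than a table, and the action would be the hedge ratio, but the fixed-batch Bellman-target structure is the same.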