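Research on Dynamic Bike Repositioning Problem Based on Deep Reinforcement Learning
Cite this article: ZHANG Jiantong, HE Yulin. Research on Dynamic Bike Repositioning Problem Based on Deep Reinforcement Learning[J]. Shanghai Management Science, 2021(2): 81-86.
Authors: ZHANG Jiantong  HE Yulin
Affiliation: School of Economics and Management, Tongji University
Funding: National Natural Science Foundation of China, "Data-Driven Research on the Bike-Sharing Repositioning Problem" (71971156).
Abstract: While shared bikes bring convenience to urban travel, they also suffer from an unbalanced distribution of resources. For the bike repositioning problem in an environment where the bike distribution changes dynamically, a real-time scheduling strategy framework based on reinforcement learning is proposed. A reinforcement-learning-oriented model of the bike repositioning problem is constructed and solved with the deep deterministic policy gradient (DDPG) algorithm to obtain a real-time scheduling strategy. Based on actual bike distribution data, a simulator of the environment interaction during scheduling is built. Finally, large-scale data experiments are carried out in the simulator using reinforcement learning; the results show that the scheduling strategy obtained by the algorithm improves system performance and outperforms existing methods.

Keywords: bike repositioning problem  deep reinforcement learning  Mobike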

Research on Dynamic Bike Repositioning Problem Based on Deep Reinforcement Learning
ZHANG Jiantong, HE Yulin. Research on Dynamic Bike Repositioning Problem Based on Deep Reinforcement Learning[J]. Shanghai Management Science, 2021(2): 81-86.
Authors:ZHANG Jiantong  HE Yulin
Institution:(School of Economics and Management,Tongji University,Shanghai 200092,China)
Abstract: While bike sharing brings convenience to urban travel, it also faces the problem of an unbalanced distribution of shared bike resources. A real-time scheduling strategy structure based on reinforcement learning is proposed to solve the repositioning problem of shared bikes under a dynamically changing bike distribution. In this paper, a model of the bike repositioning problem for reinforcement learning is built and solved by the deep deterministic policy gradient (DDPG) algorithm to obtain a real-time scheduling strategy. Based on the actual distribution data of shared bikes, an environment interaction simulator is constructed for the scheduling process. A large-scale data experiment using reinforcement learning is carried out in the simulator. The experimental results show that the repositioning strategy obtained by the algorithm can significantly improve the performance of the system and that the algorithm performs better than existing methods.
Keywords:bike repositioning problem  deep reinforcement learning  Mobike
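
The abstract describes an agent that interacts with a simulated environment to learn a repositioning strategy. As a rough illustration of that interaction loop only, the following minimal Python sketch uses a toy environment; it is not the authors' model or simulator. The class `ToyBikeEnv`, the function `greedy_action`, the region counts, the demand rates, and the reward (trips served per step) are all hypothetical, and a simple greedy heuristic stands in for the learned DDPG policy.

```python
import random


class ToyBikeEnv:
    """Hypothetical toy simulator: each region holds bikes and receives random demand."""

    def __init__(self, bikes, demand_rates, seed=0):
        self.initial = list(bikes)              # starting bikes per region (assumed)
        self.demand_rates = list(demand_rates)  # max demand per region per step (assumed)
        self.rng = random.Random(seed)
        self.bikes = list(bikes)

    def reset(self):
        """Restore the initial distribution and return it as the state."""
        self.bikes = list(self.initial)
        return tuple(self.bikes)

    def step(self, action):
        """Apply action (src, dst, k): move up to k bikes, then simulate demand."""
        src, dst, k = action
        moved = max(0, min(k, self.bikes[src]))
        self.bikes[src] -= moved
        self.bikes[dst] += moved

        served = 0
        returns = [0] * len(self.bikes)
        for i, rate in enumerate(self.demand_rates):
            demand = self.rng.randint(0, rate)
            trips = min(demand, self.bikes[i])  # unmet demand is lost
            self.bikes[i] -= trips
            served += trips
            for _ in range(trips):              # each rented bike ends in a random region
                returns[self.rng.randrange(len(self.bikes))] += 1
        for i, r in enumerate(returns):
            self.bikes[i] += r
        return tuple(self.bikes), served        # reward = number of trips served


def greedy_action(state, demand_rates, k=2):
    """Stand-in policy: move k bikes from the largest surplus to the largest deficit."""
    surplus = [b - r for b, r in zip(state, demand_rates)]
    src = max(range(len(state)), key=lambda i: surplus[i])
    dst = min(range(len(state)), key=lambda i: surplus[i])
    return (src, dst, k)


if __name__ == "__main__":
    env = ToyBikeEnv(bikes=[5, 0, 3], demand_rates=[2, 4, 1], seed=1)
    state = env.reset()
    total = 0
    for _ in range(20):
        state, reward = env.step(greedy_action(state, env.demand_rates))
        total += reward
    print("trips served over 20 steps:", total)
```

In a DDPG setting the action would typically be continuous and produced by an actor network trained against a critic; the discrete tuple action here is a simplification chosen to keep the sketch short.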
This article is indexed in VIP (维普) and other databases.