To optimize traffic speeds upstream and downstream of an expressway bottleneck, two reward functions were proposed, one based on the velocity variation per unit distance and one based on Savitzky-Golay (SG) convolution smoothing, and two Q-learning models of speed harmonization were established. An integrated simulation platform combining Excel-VBA, VISSIM, and MATLAB was used to evaluate the reward functions. The results show that the reward function based on SG convolution smoothing effectively relieves stop-and-go traffic upstream of the bottleneck and reduces the amplitude of speed fluctuations. The Q-learning speed harmonization model can suggest the optimal real-time speed according to the traffic state.
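As a rough illustration of how SG smoothing and a Q-learning update could fit together, the following Python sketch may help. It is not the paper's model: the reward definition, the state and action encoding, and the names `sg_smooth`, `reward`, and `q_update` are all assumptions made for this example. Only the 5-point quadratic SG coefficients and the standard tabular Q-learning update rule are established facts.

```python
# Savitzky-Golay convolution coefficients for a 5-point window, quadratic fit.
SG5 = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def sg_smooth(speeds):
    """Smooth interior points by SG convolution; the two points at each edge are left unchanged."""
    out = list(speeds)
    for i in range(2, len(speeds) - 2):
        out[i] = sum(c * speeds[i + k - 2] for k, c in enumerate(SG5))
    return out


def reward(speeds):
    """Hypothetical reward: negative mean absolute gap between raw and smoothed speeds.

    A strongly oscillating (stop-and-go) profile deviates more from its
    smoothed version and therefore receives a lower reward.
    """
    smoothed = sg_smooth(speeds)
    return -sum(abs(a - b) for a, b in zip(speeds, smoothed)) / len(speeds)


def q_update(q, state, action, r, next_state, actions, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update; q is a dict keyed by (state, action)."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (r + gamma * best_next - old)
    return q[(state, action)]


if __name__ == "__main__":
    # Assumed action set: candidate advisory speeds in km/h.
    actions = (60, 80, 100)
    q_table = {}
    stop_and_go = [60, 40, 60, 40, 60, 40, 60]
    r = reward(stop_and_go)
    q_update(q_table, "congested", 60, r, "congested", actions)
    print(r, q_table)
```

In this sketch the states are plain labels and the actions are advisory speed limits; a real study would derive both from detector data in the simulation.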
Cite this article
LIU Yuan-yuan, LU Shou-feng, LIU Xiao-liang, et al. Research on Q-Learning model of speed harmonization[J]. Journal of Transport Science and Engineering, 2021, 37(2): 98-104.