Urban Traffic Signal Control Method Oriented to Partially Observable Information

Affiliation:

School of Traffic and Transportation Engineering, Changsha University of Science and Technology

Fund Project:

National Natural Science Foundation of China (General Program, Key Program, Major Program)

    Abstract:

    [Purpose] Existing deep learning-based traffic signal control methods generally rely on fully observable traffic state information. To address this limitation, this study constructs an intelligent traffic signal control model suited to sparse observation environments. [Methods] First, the traffic states of floating cars serve as the core observational input. Each approach lane of the intersection is spatially discretized into fixed grid cells, and a three-dimensional tensor state representation is built from a vehicle position matrix, a speed matrix, and a lane congestion matrix. Next, a continuous reward mechanism based on vehicle congestion level is introduced into the policy design; combined with a fixed minimum green time and phase-switching rules, it guides the agent to select signal phases dynamically in different states, with the objective of minimizing delay. Finally, two experimental scenarios, a single intersection and a multi-intersection network, are built, and the method is tested against other algorithms under different traffic conditions. [Results] The proposed method is compared with fixed-time control, Deep Q-Network (DQN), and Proximal Policy Optimization (PPO). Under different traffic loads, and particularly at low floating-car penetration rates, the proposed model converges faster and more stably, and it significantly reduces average delay, average queue length, and average travel time, markedly improving overall traffic efficiency. [Conclusions] The results validate the effectiveness and robustness of the proposed method under partially observable information. The model maintains good control performance in low-penetration-rate environments, offering a feasible technical path and theoretical support for the intelligent and coordinated development of urban traffic signal control when real-world data are limited.
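The state representation and control rules summarized in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact definitions, so everything below is an assumption made for illustration: the function names (`build_state`, `congestion_reward`, `select_phase`), the cell length, the speed normalization, the congestion measure (one minus normalized speed in occupied cells), and the reward (negative mean congestion over observed cells) are hypothetical and are not taken from the paper.

```python
import numpy as np


def build_state(vehicles, n_lanes, lane_length, cell_length, v_max):
    """Build a (3, n_lanes, n_cells) tensor from observed floating cars only.

    vehicles: iterable of (lane_index, distance_to_stop_line_m, speed_mps).
    Channel 0: position (1 if a floating car occupies the cell).
    Channel 1: mean speed in the cell, normalized by v_max.
    Channel 2: congestion, assumed here to be 1 - normalized speed.
    """
    n_cells = int(np.ceil(lane_length / cell_length))
    position = np.zeros((n_lanes, n_cells), dtype=np.float32)
    speed = np.zeros((n_lanes, n_cells), dtype=np.float32)
    count = np.zeros((n_lanes, n_cells), dtype=np.float32)

    for lane, dist, v in vehicles:
        # Skip observations that fall outside the discretized approach.
        if not (0 <= lane < n_lanes) or not (0.0 <= dist < lane_length):
            continue
        cell = int(dist // cell_length)
        position[lane, cell] = 1.0
        speed[lane, cell] += v
        count[lane, cell] += 1.0

    occupied = count > 0
    # Average the speeds of cars sharing a cell, then normalize.
    speed[occupied] = speed[occupied] / count[occupied] / v_max
    congestion = np.where(occupied, 1.0 - np.clip(speed, 0.0, 1.0), 0.0)
    return np.stack([position, speed, congestion.astype(np.float32)], axis=0)


def congestion_reward(state):
    """Continuous reward (assumed form): negative mean congestion of observed cells."""
    position, _, congestion = state
    n_observed = float(position.sum())
    if n_observed == 0.0:
        return 0.0  # no floating car observed, no signal to learn from
    return -float(congestion.sum()) / n_observed


def select_phase(current_phase, elapsed_green, preferred_phase, min_green=10.0):
    """Hold the current phase until the minimum green time has elapsed."""
    if elapsed_green < min_green:
        return current_phase
    return preferred_phase


if __name__ == "__main__":
    # Three observed floating cars on a two-lane, 50 m approach with 5 m cells.
    observed = [(0, 3.0, 0.0), (0, 12.0, 10.0), (1, 49.0, 5.0)]
    state = build_state(observed, n_lanes=2, lane_length=50.0,
                        cell_length=5.0, v_max=10.0)
    print(state.shape)
    print(congestion_reward(state))
    print(select_phase(current_phase=0, elapsed_green=4.0, preferred_phase=2))
```

In this sketch, cells without an observed floating car are indistinguishable from empty road, which is the partial-observability issue the abstract targets; how the paper's model handles that ambiguity is not stated in the abstract.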

History
  • Received: 2025-11-07
  • Last revised: 2025-12-10
  • Accepted: 2025-12-11