Journal of Xidian University, 2023, Vol. 50, Issue (6): 161-171. doi: 10.19665/j.issn1001-2400.20230309

• Information and Communication Engineering & Computer Science and Technology •

HCPP: a data-efficient hierarchical car-following method

WANG Xu1, SHANG Erke2, MIAO Qiguang1, DAI Bin2, LIU Yang3

  1. School of Computer Science, Xidian University, Xi'an 710071, China
  2. National Defense Science and Technology Innovation Research Institute, Academy of Military Sciences, Beijing 100071, China
  3. School of Intelligent Science, National University of Defense Technology, Changsha 410073, China
  • Received: 2022-11-23 Online: 2023-12-20 Published: 2024-01-22
  • Corresponding author: LIU Yang (born 1994), male, doctoral student at National University of Defense Technology, E-mail: bjly17@sina.com
  • About the authors: WANG Xu (born 1993), male, doctoral student at Xidian University, E-mail: xuwangxw@foxmail.com; SHANG Erke (born 1984), male, associate research fellow, E-mail: erke1984@qq.com; DAI Bin (born 1975), male, research fellow, E-mail: ibindai@163.com
  • Funding:
    Key Report of the Major Program of the National Natural Science Foundation of China (61790565); National Natural Science Foundation of China (62272364); National Natural Science Foundation of China (61902296); Key Research and Development Program of Shaanxi Province (2020LSFP3-15); Research Project of the Guangxi Key Laboratory of Trusted Software (KX202061); Key Research and Development Special Project of the Qingdao Science and Technology Plan (21-1-2-18-xx)

Abstract:

Existing car-following control methods based on control theory require models of the speed and distance of the two cars; they generalize poorly and struggle to achieve stable, smooth control. To address this problem, we propose a data-efficient hierarchical car-following control method that does not depend on a car kinematic model. The upper layer builds a dataset from car coordinates, speed, and the perception results of other onboard sensors, and trains a deep reinforcement learning model on it, which avoids reliance on prior knowledge and removes the need for training in the real world. During training, samples are drawn at random from the dataset, which improves data utilization. The lower layer controls the car's acceleration and angular velocity in real time with a proportional-integral-derivative (PID) controller, which avoids the control jitter caused by unstable deep reinforcement learning policies and makes control smoother. To verify the algorithm's performance, both simulation and real-car experiments are conducted. The results show that the algorithm keeps the distance between the following car and the target car within a safe and reasonable range at all times. Comparative experiments show that the algorithm achieves more stable, smoother, and safer car-following control in both the lateral and longitudinal directions.
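As a rough illustration of the lower layer described in the abstract, the sketch below shows a textbook discrete PID loop tracking a target handed down by an upper layer. It is not the authors' implementation: the `PID` class, the `track` helper, the gains, the time step, the point-mass speed model, and the constant 10 m/s target are all assumptions made for illustration, and only the longitudinal (acceleration) channel is shown.

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Textbook discrete PID controller (gains are illustrative, not from the paper)."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, target: float, measured: float, dt: float) -> float:
        """Return the control command for one time step of length dt."""
        error = target - measured
        self.integral += error * dt                      # accumulate the I term
        derivative = (error - self.prev_error) / dt      # finite-difference D term
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def track(targets, pid, dt=0.1, speed=0.0):
    """Drive a toy point-mass speed toward each target; return the speed trace."""
    trace = []
    for target in targets:
        accel = pid.step(target, speed, dt)  # lower layer: acceleration command
        speed += accel * dt                  # assumed point-mass model, not a real car
        trace.append(speed)
    return trace


# Hypothetical upper-layer output: a constant 10 m/s target speed,
# standing in for whatever a learned policy would emit.
trace = track([10.0] * 200, PID(kp=2.0, ki=1.0, kd=0.1))
```

Because the PID loop filters the error over time, a noisy sequence of upper-layer targets would reach the actuator as a smoother command than if each target were applied directly, which is the motivation the abstract gives for the split.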

Key words: car-following control, deep reinforcement learning, offline training, hierarchical control, proportional-integral-derivative (PID) control

CLC number:

  • TP242.6