Abstract: To address direct conflicts between autonomous vehicles and human-driven vehicles at intersections, a behavior decision model for autonomous vehicles is built, and deep reinforcement learning is used to train the vehicle to pass through road intersections, so that it can make decisions autonomously and respond quickly in complex scenarios. A comparison with the non-dominated sorting genetic algorithm II (NSGA-II) verifies the stability of the autonomous vehicle. The simulation results show that the behavior decision-making method based on the deep deterministic policy gradient (DDPG) algorithm yields a better output speed and keeps the throttle and brake values changing smoothly, which effectively addresses the safety and comfort of autonomous vehicles.
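As a rough illustration of the continuous control described in the abstract, the sketch below shows two generic ingredients of a DDPG-style controller: mapping one continuous actor output to throttle and brake values, and the soft (Polyak) target-network update that DDPG uses to keep learning stable. The function names `split_action` and `soft_update`, the single-action encoding in [-1, 1], and the value of `tau` are assumptions made for illustration; the abstract does not specify the paper's own action encoding or hyperparameters.

```python
from typing import List, Tuple


def split_action(a: float) -> Tuple[float, float]:
    """Map one continuous action in [-1, 1] to (throttle, brake) in [0, 1].

    Assumed encoding (not taken from the paper): positive values mean
    throttle, negative values mean brake, so the two are never applied
    at the same time.
    """
    a = max(-1.0, min(1.0, a))  # clip the actor output to the valid range
    throttle = max(a, 0.0)
    brake = max(-a, 0.0)
    return throttle, brake


def soft_update(target: List[float], online: List[float], tau: float = 0.005) -> List[float]:
    """Polyak averaging of target-network weights, as used in DDPG.

    Each target weight moves a fraction tau toward the online weight,
    so the target network changes slowly between updates.
    """
    return [(1.0 - tau) * t + tau * w for t, w in zip(target, online)]


if __name__ == "__main__":
    print(split_action(0.4))    # mild acceleration
    print(split_action(-2.0))   # out-of-range output is clipped to full brake
    print(soft_update([0.0, 1.0], [1.0, 1.0], tau=0.5))
```

Because the actor outputs a continuous value rather than a discrete choice, small changes in the observed state produce small changes in throttle or brake, which is one plausible reason a DDPG-style policy can yield smooth pedal commands.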
徐泽洲, 曲大义, 洪家乐, 宋晓晨. 智能网联汽车自动驾驶行为决策方法研究[J]. 复杂系统与复杂性科学, 2021, 18(3): 88-94.
XU Zezhou, QU Dayi, HONG Jiale, SONG Xiaochen. Research on Decision-making Method for Autonomous Driving Behavior of Connected and Automated Vehicle. Complex Systems and Complexity Science, 2021, 18(3): 88-94.