[1] Hillel A B, Lerner R, Levi D, et al. Recent progress in road and lane detection: a survey[J]. Machine Vision and Applications, 2014,25(3):727-745.
doi: 10.1007/s00138-011-0404-2
[2] Gao H, Cheng B, Wang J, et al. Object classification using CNN-based fusion of vision and LIDAR in autonomous vehicle environment[J]. IEEE Transactions on Industrial Informatics, 2018,14(9):4224-4231.
doi: 10.1109/TII.9424
[3] Schwarting W, Alonso-Mora J, Rus D. Planning and decision-making for autonomous vehicles[J]. Annual Review of Control, Robotics, and Autonomous Systems, 2018,1(7):187-210.
doi: 10.1146/annurev-control-060117-105157
[4] Bae I, Moon J, Cha J, et al. Integrated lateral and longitudinal control system for autonomous vehicles[C]. Qingdao: International Conference on Intelligent Transportation Systems, 2014.
[5] Sutton R S, Barto A G. Reinforcement learning: An introduction[M]. Cambridge: MIT Press, 2018.
[6] Leonard J, How J, Teller S, et al. A perception-driven autonomous urban vehicle[J]. Journal of Field Robotics, 2008,25(10):727-774.
doi: 10.1002/rob.v25:10
[7] Montemerlo M, Becker J, Bhat S, et al. Junior: The Stanford entry in the Urban Challenge[J]. Journal of Field Robotics, 2008,25(9):569-597.
doi: 10.1002/rob.v25:9
[8] Urmson C, Anhalt J, Bagnell D, et al. Autonomous driving in urban environments: Boss and the Urban Challenge[J]. Journal of Field Robotics, 2008,25(8):425-466.
doi: 10.1002/rob.v25:8
[9] Bacha A, Bauman C, Faruque R, et al. Odin: Team VictorTango's entry in the DARPA Urban Challenge[J]. Journal of Field Robotics, 2008,25(8):467-492.
doi: 10.1002/rob.v25:8
[10] Zheng R, Liu C, Guo Q. A decision-making method for autonomous vehicles based on simulation and reinforcement learning[C]. Tianjin: International Conference on Machine Learning and Cybernetics, 2013.
[11] Gao Z, Sun T, Xiao H, et al. Decision-making method for vehicle longitudinal automatic driving based on reinforcement Q-learning[J]. International Journal of Advanced Robotic Systems, 2019,16(3):141-172.
[12] Mnih V, Kavukcuoglu K, Silver D, et al. Playing Atari with deep reinforcement learning[EB/OL]. (2013-12-09)[2019-12-20]. https://arxiv.org/abs/1312.5602.
[13] Miao Ran, Li Feifei, Chen Qiu. Scene recognition method based on convolutional neural network and multi-scale space coding[J]. Electronic Science and Technology, 2020,33(12):54-58,74.
[14] Cheng Junhua, Zeng Guohui, Liu Jin, et al. Research on complex background classification method based on deep reinforcement learning[J]. Electronic Science and Technology, 2020,33(12):59-66.
[15] Mnih V, Kavukcuoglu K, Silver D, et al. Human-level control through deep reinforcement learning[J]. Nature, 2015,518(7540):529-533.
doi: 10.1038/nature14236
[16] Silver D, Huang A, Maddison C J, et al. Mastering the game of Go with deep neural networks and tree search[J]. Nature, 2016,529(7587):484-489.
doi: 10.1038/nature16961
pmid: 26819042
[17] Lillicrap T P, Hunt J J, et al. Continuous control with deep reinforcement learning[J]. Computer Science, 2015,8(6):A187-199.
[18] Wolf P, Hubschneider C, Weber M, et al. Learning how to drive in a real world simulation with deep Q-networks[C]. Los Angeles: IEEE Intelligent Vehicles Symposium, 2017.
[19] Chae H, Kang C M, Kim B D, et al. Autonomous braking system via deep reinforcement learning[C]. Yokohama: The Twentieth IEEE International Conference on Intelligent Transportation Systems, 2017.
[20] Sallab A E L, Abdou M, Perot E, et al. Deep reinforcement learning framework for autonomous driving[J]. Electronic Imaging, 2017(19):70-76.
[21] Kendall A, Hawke J, Janz D, et al. Learning to drive in a day[C]. Montreal: International Conference on Robotics and Automation, 2019.
[22] Ye Y, Zhang X, Sun J. Automated vehicle's behavior decision making using deep reinforcement learning and high-fidelity simulation environment[J]. Transportation Research Part C: Emerging Technologies, 2019,107(19):155-170.
doi: 10.1016/j.trc.2019.08.011
[23] Al-Emran M. Hierarchical reinforcement learning: a survey[J]. International Journal of Computing and Digital Systems, 2015,4(2):137-143.
doi: 10.12785/ijcds/040207
[24] Vezhnevets A S, Osindero S, Schaul T, et al. Feudal networks for hierarchical reinforcement learning[C]. Sydney: Proceedings of the Thirty-fourth International Conference on Machine Learning, 2017.
[25] Nachum O, Gu S S, Lee H, et al. Data-efficient hierarchical reinforcement learning[C]. Montreal: Advances in Neural Information Processing Systems, 2018.
[26] Paxton C, Raman V, Hager G D, et al. Combining neural networks and tree search for task and motion planning in challenging environments[C]. Vancouver: IEEE/RSJ International Conference on Intelligent Robots and Systems, 2017.
[27] Nosrati M S, Abolfathi E A, Elmahgiubi M, et al. Towards practical hierarchical reinforcement learning for multi-lane autonomous driving[C]. Montreal: The Thirty-second Conference on Neural Information Processing Systems, 2018.
[28] Shani G, Pineau J, Kaplow R. A survey of point-based POMDP solvers[J]. Autonomous Agents and Multi-Agent Systems, 2013,27(1):1-51.
doi: 10.1007/s10458-012-9200-2
[29] Bai H, Hsu D, Lee W S. Integrated perception and planning in the continuous space: A POMDP approach[J]. The International Journal of Robotics Research, 2014,33(9):1288-1302.
doi: 10.1177/0278364914528255
[30] Brechtel S, Gindele T, Dillmann R. Solving continuous POMDPs: Value iteration with incremental learning of an efficient space representation[C]. Karlsruhe: International Conference on Machine Learning, 2013.
[31] Wei J, Dolan J M, Snider J M, et al. A point-based MDP for robust single-lane autonomous driving behavior under uncertainties[C]. Shanghai: IEEE International Conference on Robotics and Automation, 2011.
[32] Ulbrich S, Maurer M. Probabilistic online POMDP decision making for lane changes in fully automated driving[C]. The Hague: International IEEE Conference on Intelligent Transportation Systems, 2013.
[33] Brechtel S, Gindele T, Dillmann R. Probabilistic decision-making under uncertainty for autonomous driving using continuous POMDPs[C]. Qingdao: International IEEE Conference on Intelligent Transportation Systems, 2014.
[34] Bandyopadhyay T, Won K S, Frazzoli E, et al. Intention-aware motion planning[M]. Berlin: Algorithmic Foundations of Robotics X, 2013.
[35] Bai H, Cai S, Ye N, et al. Intention-aware online POMDP planning for autonomous driving in a crowd[C]. Seattle: IEEE International Conference on Robotics and Automation, 2015.
[36] Liu W, Kim S W, Pendleton S, et al. Situation-aware decision making for autonomous driving on urban road using online POMDP[C]. Seoul: IEEE Intelligent Vehicles Symposium, 2015.
[37] Song W, Xiong G, Chen H. Intention-aware autonomous driving decision-making in an uncontrolled intersection[J]. Mathematical Problems in Engineering, 2016,31(2):71-87.