Sponsored by: Shaanxi Society of Automotive Engineering
ISSN 1671-7988  CN 61-1394/TH
Founded: 1976

Automobile Applied Technology ›› 2025, Vol. 50 ›› Issue (1): 25-30. DOI: 10.16638/j.cnki.1671-7988.2025.001.005

• Intelligent Connected Vehicle •

A Behavioral Decision-Making Method Based on Improved Deep Reinforcement Learning Algorithms

JIA Ruihao   

  1. School of Automobile, Chang'an University
  • Published: 2025-01-09
  • Contact: JIA Ruihao

  • About the author: JIA Ruihao (1999-), male, master's degree candidate; research interest: vehicle operation engineering. E-mail: jia_rh@chd.edu.cn

Abstract: To address the problems of low driving efficiency, slow convergence, and low decision success rate that traditional deep reinforcement learning algorithms exhibit simultaneously in autonomous driving decision-making tasks, owing to poor exploration strategies during training, a decision-making method based on a dueling double deep Q-network combined with expert evaluation is proposed. An offline expert model and an online model are constructed, with an adaptive balance factor introduced between them; a prioritized experience replay mechanism with an adaptive importance-sampling coefficient is introduced, and the online model is built on the basis of the dueling deep Q-network; a reward function that accounts for driving efficiency, safety, and comfort is designed. The results show that, compared with D3QN and PERD3QN, the proposed algorithm improves convergence speed by 25.93% and 20.00%, raises the decision success rate by 3.19% and 2.77%, reduces the average number of steps by 6.40% and 0.14%, and increases the average speed by 7.46% and 0.42%, respectively.
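To make two ideas named in the abstract concrete, the following is a minimal PyTorch sketch (not the paper's code) of (1) a dueling Q-network head, as used in D3QN-style models, and (2) selecting actions from a convex combination of an offline expert's action values and the online model's, weighted by an adaptive balance factor. Names such as `blended_action` and the linear decay schedule are illustrative assumptions, not details from the paper.

```python
import torch
import torch.nn as nn

class DuelingQNet(nn.Module):
    """Dueling architecture: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    def __init__(self, state_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.feature = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)              # state-value stream V(s)
        self.advantage = nn.Linear(hidden, n_actions)  # advantage stream A(s, a)

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        h = self.feature(state)
        v, a = self.value(h), self.advantage(h)
        return v + a - a.mean(dim=-1, keepdim=True)

def blended_action(online_q: torch.Tensor,
                   expert_q: torch.Tensor,
                   balance_factor: float) -> torch.Tensor:
    """Pick actions from a mix of expert and online action values.

    `balance_factor` in [0, 1] is assumed to be annealed from ~1 (trust the
    offline expert early in training) toward 0 (trust the online model) --
    one plausible reading of the paper's "adaptive balance factor".
    """
    q = balance_factor * expert_q + (1.0 - balance_factor) * online_q
    return q.argmax(dim=-1)

# Usage sketch: the balance factor decays linearly with the training step.
if __name__ == "__main__":
    net = DuelingQNet(state_dim=8, n_actions=5)
    state = torch.randn(4, 8)             # batch of 4 states
    expert_q = torch.randn(4, 5)          # stand-in for expert action scores
    step, decay = 1000, 1e-4
    beta = max(0.0, 1.0 - decay * step)   # hypothetical linear schedule
    actions = blended_action(net(state), expert_q, beta)
    print(actions)
```

The same kind of schedule could plausibly drive the adaptive importance-sampling coefficient of the prioritized replay mechanism the abstract mentions, annealing it toward 1 to debias the prioritized sampling as training progresses; the paper's exact formulation is not reproduced here.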

Key words: autonomous driving; behavioral decision; deep reinforcement learning; imitation learning; improved DQN algorithm
