Journal of Xidian University, 2021, Vol. 48, Issue (1): 160-167. doi: 10.19665/j.issn1001-2400.2021.01.018

Optimal method for the generation of the attack path based on the Q-learning decision

LI Teng, CAO Shijie, YIN Siwei, WEI Dawei, MA Xindi, MA Jianfeng

  1. School of Network and Information Security, Xidian University, Xi’an 710071, China
  • Received: 2020-08-18 Online: 2021-02-20 Published: 2021-02-03


This paper proposes a dynamic method for finding the optimal attack path based on the Q-learning algorithm from machine learning, with the aim of improving the efficiency and adaptability of attack path generation. Building on Q-learning and drawing on network connectivity and partitioning, the method first reduces the network topology by deleting unreachable paths. It then uses machine learning to simulate a hacker's attacks: the simulated attacker combines states and actions and improves its adaptability and decision-making through continued learning, so that the optimal attack path is generated efficiently. Experiments show that, in an environment with an IDS alarm device, the simulated attacker can learn the state-value table (Q table) of the Q-learning method, and that traversing the Q table yields the optimal attack path sequence from the source host to the destination host, which verifies the validity and accuracy of the model and algorithm. In addition, analyzing host reachability in advance greatly reduces the number of redundant nodes, which is a considerable advantage in large network topologies.
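
The pipeline described in the abstract (reachability-based pruning, Q-learning by a simulated attacker, then reading the path off the Q table) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy topology `network`, the function names, the reward values (+100 for reaching the target, -50 for stepping onto an IDS-monitored host, -1 per ordinary hop), and the hyperparameters. The sketch also assumes the target is reachable from the source.

```python
import random
from collections import deque


def prune_unreachable(graph, source, target):
    """Keep only hosts that lie on some path from source to target."""
    def reach(start, adj):
        # Breadth-first search returning every node reachable from start.
        seen = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in adj.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen

    reverse = {}
    for node, nbrs in graph.items():
        for nxt in nbrs:
            reverse.setdefault(nxt, []).append(node)
    keep = reach(source, graph) & reach(target, reverse)
    return {n: [m for m in graph.get(n, ()) if m in keep] for n in keep}


def train_q_table(graph, source, target, monitored, episodes=2000,
                  alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning; states are hosts, actions are next hops."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s, nbrs in graph.items() for a in nbrs}
    for _ in range(episodes):
        state = source
        for _ in range(50):  # step cap per episode (assumed)
            if state == target:
                break
            actions = graph[state]
            if rng.random() < epsilon:
                action = rng.choice(actions)  # explore
            else:
                action = max(actions, key=lambda a: q[(state, a)])  # exploit
            # Assumed reward model, not the paper's.
            if action == target:
                reward, future = 100.0, 0.0  # target is terminal
            else:
                reward = -50.0 if action in monitored else -1.0
                future = max((q[(action, a)] for a in graph[action]), default=0.0)
            q[(state, action)] += alpha * (reward + gamma * future - q[(state, action)])
            state = action
    return q


def greedy_path(q, graph, source, target):
    """Traverse the Q table greedily from source to target."""
    path, state = [source], source
    while state != target and len(path) <= len(graph):
        state = max(graph[state], key=lambda a: q[(state, a)])
        path.append(state)
    return path


# Hypothetical topology: E is a dead end, G cannot be reached from A.
network = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"],
           "D": ["F"], "E": [], "F": [], "G": ["F"]}
pruned = prune_unreachable(network, "A", "F")
q_table = train_q_table(pruned, "A", "F", monitored={"B"})
path = greedy_path(q_table, pruned, "A", "F")
print(path)
```

In this toy setting, pruning removes hosts `E` and `G` before any learning takes place, and the learned Q table steers the greedy traversal around the monitored host `B`, giving the path `A`, `C`, `D`, `F`.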

Key words: attack graph, network security, reinforcement learning, optimization algorithm, Q-learning

CLC Number: TP309