Journal of Xidian University ›› 2023, Vol. 50 ›› Issue (2): 138-146. doi: 10.19665/j.issn1001-2400.2023.02.014

• Cyberspace Security & Others •

App traffic identification under ShadowSocksR proxy with machine learning

GUO Gang, YANG Chao, CHEN Mingzhe, MA Jianfeng

  1. School of Cyber Engineering, Xidian University, Xi’an 710071, China
  • Received: 2022-05-26 Online: 2023-04-20 Published: 2023-05-12

Abstract:

An app traffic identification scheme based on machine learning under the ShadowSocksR (SSR) proxy is proposed. Its purpose is to identify which app generated a given portion of the ShadowSocksR proxy traffic produced by a smartphone. The scheme consists of three steps: traffic preprocessing, feature extraction, and model construction. First, the set of packets corresponding to the ShadowSocksR traffic generated by the smartphone is divided into fine-grained flow data groups according to the arrival time interval and the source and destination IP addresses and ports. Flow data groups containing few packets are then filtered out, in order to remove noise traffic generated by background apps or the smartphone operating system that would interfere with traffic identification. Next, from the filtered set of flow data groups, the statistical and distribution features of packet length, time statistical features, packet frequency features, packet filtering ratio features, and combined features of the preceding and following flows are extracted to form a feature matrix. This matrix is input into a machine learning algorithm to obtain an app traffic identification model. For ShadowSocksR traffic that needs to be identified, the feature matrix is obtained through the same processing steps and then input into the app traffic identification model to obtain the identification results. Experimental results show that the method reaches an accuracy of more than 97% for app traffic identification under the ShadowSocksR proxy.
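The preprocessing and feature extraction steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the packet tuple layout, the function names, the `max_gap` and `min_packets` thresholds, and the particular features computed are all assumptions made for illustration, since the abstract does not give concrete values or definitions. For simplicity the sketch also treats each direction of a connection as its own group.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical packet record (an assumption, not from the paper):
# (timestamp_s, src_ip, src_port, dst_ip, dst_port, length_bytes)

def split_into_groups(packets, max_gap=1.0):
    """Group packets by (src_ip, src_port, dst_ip, dst_port), then cut each
    group wherever the inter-arrival gap exceeds max_gap seconds."""
    by_key = defaultdict(list)
    for pkt in sorted(packets, key=lambda p: p[0]):
        by_key[pkt[1:5]].append(pkt)
    groups = []
    for pkts in by_key.values():
        current = [pkts[0]]
        for prev, pkt in zip(pkts, pkts[1:]):
            if pkt[0] - prev[0] > max_gap:
                groups.append(current)  # gap too large: close the group
                current = []
            current.append(pkt)
        groups.append(current)
    return groups

def filter_small_groups(groups, min_packets=3):
    """Drop groups with fewer than min_packets packets (treated as noise)."""
    return [g for g in groups if len(g) >= min_packets]

def group_features(group):
    """Compute a few illustrative length, time, and frequency features."""
    lengths = [p[5] for p in group]
    times = [p[0] for p in group]
    gaps = [b - a for a, b in zip(times, times[1:])]
    duration = times[-1] - times[0]
    return {
        "n_packets": len(group),
        "len_mean": mean(lengths),
        "len_std": pstdev(lengths),
        "len_min": min(lengths),
        "len_max": max(lengths),
        "duration": duration,
        "gap_mean": mean(gaps) if gaps else 0.0,
        "pkt_rate": len(group) / duration if duration > 0 else 0.0,
    }

def build_feature_matrix(packets, max_gap=1.0, min_packets=3):
    """Return one feature row per kept group, plus the kept/total ratio."""
    groups = split_into_groups(packets, max_gap)
    kept = filter_small_groups(groups, min_packets)
    ratio = len(kept) / len(groups) if groups else 0.0
    rows = [dict(group_features(g), kept_ratio=ratio) for g in kept]
    return rows, ratio
```

The rows returned by `build_feature_matrix` could then be converted to a numeric array and passed, together with app labels, to any standard classifier. The abstract does not name the learning algorithm used, so none is assumed here.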

Key words: ShadowSocksR, smartphone app traffic identification, machine learning

CLC Number: 

  • TP309