Journal of Xidian University ›› 2021, Vol. 48 ›› Issue (1): 149-159. doi: 10.19665/j.issn1001-2400.2021.01.017


TargetedFool: an algorithm for achieving targeted attacks

ZHANG Hua, GAO Haoran, YANG Xingguo, LI Wenmin, GAO Fei, WEN Qiaoyan

  1. State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China
  • Received: 2020-11-02 Online: 2021-02-20 Published: 2021-02-03
  • Contact: GAO Haoran
  • About the authors: ZHANG Hua (b. 1978), female, associate professor, Ph.D., E-mail: zhanghua_288@bupt.edu.cn | YANG Xingguo (b. 1997), male, master's student at Beijing University of Posts and Telecommunications, E-mail: yangxingg@bupt.edu.cn | LI Wenmin (b. 1983), female, associate professor, Ph.D., E-mail: liwenmin02@outlook.com | GAO Fei (b. 1980), male, professor, Ph.D., E-mail: gaof@bupt.edu.cn | WEN Qiaoyan (b. 1959), female, professor, Ph.D., E-mail: wqy@bupt.edu.cn
  • Supported by:
    National Natural Science Foundation of China (62072051); National Natural Science Foundation of China (61672110); National Natural Science Foundation of China (61671082); National Natural Science Foundation of China (61976024); National Natural Science Foundation of China (61972048); Fundamental Research Funds for the Central Universities (2019XD-A01)

Abstract:

With the development of artificial intelligence technology, deep neural networks (DNNs) are widely used in fields such as face recognition, voice recognition, image recognition, and autonomous driving. Experiments in recent years have shown that slight perturbations can cause DNNs to misclassify their inputs, so achieving a specific attack effect within a limited time is one of the focuses of research on adversarial attacks. The DeepFool algorithm is widely used in machine learning platforms such as cleverhans, but targeted attacks based on DeepFool have received little study. Existing targeted attack algorithms take a long time to generate perturbations, and the perturbations are easy for the human eye to observe. To address these two problems, this paper proposes TargetedFool, an algorithm based on DeepFool that generates targeted adversarial examples on typical convolutional neural networks (CNNs). Extensive experimental results show that TargetedFool can carry out targeted attacks on MNIST, CIFAR-10, and ImageNet. On ImageNet, it achieves a 99.8% deception success rate in an average time of 2.84 s. In addition, this paper analyzes why attack algorithms based on DeepFool cannot generate targeted universal adversarial perturbations.
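The abstract does not give the steps of TargetedFool, so the following is only a minimal sketch of the general idea behind a targeted, DeepFool-style attack, shown on a linear multiclass classifier where the decision boundaries are exact hyperplanes. At each iteration the input is moved just across the boundary between its current label and the chosen target class. The function names (`scores`, `argmax`, `targeted_linear_attack`) and the parameters `overshoot`, `margin`, and `max_iter` are assumptions made for illustration; they are not taken from the paper, and applying the overshoot at every step is a simplification.

```python
from typing import List

Vector = List[float]


def scores(W: List[Vector], b: Vector, x: Vector) -> Vector:
    """Class scores W x + b of a linear classifier (one row of W per class)."""
    return [sum(w_i * x_i for w_i, x_i in zip(row, x)) + b_k
            for row, b_k in zip(W, b)]


def argmax(values: Vector) -> int:
    """Index of the largest value (first index on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def targeted_linear_attack(W: List[Vector], b: Vector, x: Vector, target: int,
                           overshoot: float = 0.02, margin: float = 1e-4,
                           max_iter: int = 50) -> Vector:
    """Illustrative targeted attack on a linear classifier (hypothetical helper).

    Repeatedly projects the input across the hyperplane separating the
    current label from `target`, with a small overshoot so the point ends
    up strictly on the target side. Convergence is not guaranteed in
    general, so the caller should check the final label.
    """
    x_adv = list(x)
    for _ in range(max_iter):
        f = scores(W, b, x_adv)
        label = argmax(f)
        if label == target:
            break
        # Normal of the boundary between the current label and the target.
        w_diff = [w_t - w_k for w_t, w_k in zip(W[target], W[label])]
        norm_sq = sum(v * v for v in w_diff)
        if norm_sq == 0.0:
            break  # identical weight rows: no direction separates the two classes
        # Score gap to close; `margin` avoids a zero-length step on ties.
        gap = f[label] - f[target] + margin
        step = (1.0 + overshoot) * gap / norm_sq
        x_adv = [x_i + step * w_i for x_i, w_i in zip(x_adv, w_diff)]
    return x_adv


if __name__ == "__main__":
    # Toy 3-class classifier in 2D, chosen only for demonstration.
    W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
    b = [0.0, 0.0, 0.0]
    x = [2.0, 0.0]
    x_adv = targeted_linear_attack(W, b, x, target=1)
    print("original label:", argmax(scores(W, b, x)))
    print("adversarial label:", argmax(scores(W, b, x_adv)))
```

For a deep network the hyperplane normal would be replaced by a gradient-based local linearization of the score difference, which is the step DeepFool-style methods iterate.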

Key words: deep neural network, deep learning, targeted attack, adversarial examples

CLC number:

  • TP301.6