Enhanced defense algorithm against deep adversarial example attacks

doi:10.19665/j.issn1001-2400.2021.06.004

Abstract

Although deep learning has achieved great success in many applications, deep neural networks (DNNs) are vulnerable to adversarial samples that carry imperceptible perturbations, which greatly degrade the robustness and performance of DNNs. Existing denoising algorithms against adversarial samples also destroy information in clean samples, which reduces CNN classification accuracy. To overcome this weakness, this paper presents a novel enhanced denoising algorithm for adversarial samples, ID+HIR (Input Denoising and Hidden Information Restoring). ID+HIR consists of an enhanced input denoiser and a hidden lossy-information restorer based on the theory of convex hulls. The algorithm first trains a denoiser at the input layer of the model. The input of the denoiser is the concatenation of clean and adversarial samples, so that the denoiser removes adversarial perturbations while avoiding the forgetting of clean samples. Because the denoiser also destroys information contained in the clean samples, a restorer is then trained at a hidden layer of the model. The input of the restorer is a convex combination of the hidden vectors of the clean and adversarial samples, and the restorer is expected to remap samples that lie in an incorrect classification space back to the correct one, yielding a more robust model. Extensive comparative experiments on several standard benchmark datasets show that the proposed denoiser and restorer effectively improve model robustness and that ID+HIR outperforms competitive baselines.
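The two inputs described in the abstract can be illustrated with a minimal sketch. It assumes samples and hidden vectors are plain lists of floats. The function names, the toy values, and the way the mixing weight `lam` is drawn are hypothetical, since the abstract does not specify them. The sketch does not reproduce the paper's networks or losses.

```python
import random


def concat_batches(clean_batch, adv_batch):
    """Build a denoiser training batch by concatenating clean and
    adversarial samples along the batch axis (hypothetical helper)."""
    return list(clean_batch) + list(adv_batch)


def convex_combination(h_clean, h_adv, lam):
    """Return lam * h_clean + (1 - lam) * h_adv element-wise.

    With 0 <= lam <= 1 the result lies on the line segment between the
    two hidden vectors, i.e. inside their convex hull.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(h_clean) != len(h_adv):
        raise ValueError("hidden vectors must have the same length")
    return [lam * c + (1.0 - lam) * a for c, a in zip(h_clean, h_adv)]


# Toy hidden vectors; real ones would come from a hidden layer of the model.
h_clean = [1.0, 0.0, 2.0]
h_adv = [3.0, 4.0, 0.0]

# Assumption: lam is drawn uniformly from [0, 1]; the abstract does not
# say how the combination weight is chosen.
lam = random.Random(0).uniform(0.0, 1.0)
restorer_input = convex_combination(h_clean, h_adv, lam)

# Denoiser batch: two clean samples followed by two adversarial samples.
denoiser_batch = concat_batches([[0.1, 0.2], [0.3, 0.4]],
                                [[0.1, 0.3], [0.2, 0.4]])
```

Setting `lam = 1.0` returns the clean hidden vector and `lam = 0.0` the adversarial one. Intermediate values give the points between them that a restorer of this kind would be trained on.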

Key words: deep learning, adversarial samples, input denoising, hidden information restoring

CLC Number: TP183

LIU Jiawei, ZHANG Wenhui, KOU Xiaoli, LI Yanni. Harnessing adversarial examples via input denoising and hidden information restoring[J]. Journal of Xidian University, 2021, 48(6): 23-31.

Figures/Tables 6

References 29

[1]	JOSHI A, MUKHERJEE A, SARKAR S, et al. Semantic Adversarial Attacks:Parametric Transformations That Fool Deep Classifiers[C]// Proceeding of the IEEE/CVF International Conference on Computer Vision (ICCV).Piscataway:IEEE, 2019:4773-4783.
[2]	JIA R, LIANG P. Adversarial Examples for Evaluating Reading Comprehension Systems[C]// Proceeding of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP).Stroudsburg:ACL, 2017:2021-2031.
[3]	SZEGEDY C, ZAREMBA W, SUTSKEVER I, et al. Intriguing Properties of Neural Networks[C]// Proceeding of the 2nd International Conference on Learning Representations (ICLR).La Jolla:ICLR, 2014:1-10.
[4]	MADRY A, MAKELOV A, SCHMIDT L, et al. Towards Deep Learning Models Resistant to Adversarial Attacks[C]// Proceeding of the 6th International Conference on Learning Representations (ICLR).La Jolla:ICLR, 2018:1-28.
[5]	SHAFAHI A, HUANG W R, STUDER C, et al. Are Adversarial Examples Inevitable[C]// Proceeding of the 7th International Conference on Learning Representations (ICLR).La Jolla:ICLR, 2019:1-17.
[6]	SHAFAHI A, NAJIBI M, GHIASI A, et al. Adversarial Training for Free![C]// Proceedings of the 33rd International Conference on Neural Information Processing Systems (NIPS).New York:ACM, 2019:3358-3369.
[7]	WANG Y, ZOU D, YI J, et al. Improving Adversarial Robustness Requires Revisiting Misclassified Examples[C]// Proceeding of the 7th International Conference on Learning Representations (ICLR).La Jolla:ICLR, 2019:1-14.
[8]	ZHENG H, ZHANG Z, GU J, et al. Efficient Adversarial Training With Transferable Adversarial Examples[C]// Proceeding of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).Piscataway:IEEE, 2020:1181-1190.
[9]	DING G W, SHARMA Y, LUI K Y C, et al. MMA Training:Direct Input Space Margin Maximization through Adversarial Training[C]// Proceeding of the 7th International Conference on Learning Representations (ICLR).La Jolla:ICLR, 2019:1-28.
[10]	JIA X, WEI X, CAO X, et al. ComDefend:An Efficient Image Compression Model to Defend Adversarial Examples[C]// Proceeding of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).Piscataway:IEEE, 2019:6084-6092.
[11]	LIAO F, LIANG M, DONG Y, et al. Defense Against Adversarial Attacks Using High-Level Representation Guided Denoiser[C]// Proceeding of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).Piscataway:IEEE, 2018:1778-1787.
[12]	SONG Y, KIM T, NOWOZIN S, et al. PixelDefend:Leveraging Generative Models to Understand and Defend against Adversarial Examples[C]// Proceeding of the 6th International Conference on Learning Representations (ICLR).La Jolla:ICLR, 2018:1-20.
[13]	GU S, RIGAZIO L. Towards Deep Neural Network Architectures Robust to Adversarial Examples[C]// Proceeding of the 3rd International Conference on Learning Representations (ICLR).La Jolla:ICLR, 2015:1-9.
[14]	XU W, EVANS D, QI Y. Feature Squeezing:Detecting Adversarial Examples in Deep Neural Networks[C]// Proceeding of the 25th Annual Network and Distributed System Security Symposium (NDSS), 2018.
[15]	GUO C, RANA M, CISSE M, et al. Countering Adversarial Images using Input Transformations[C]// Proceeding of the 6th International Conference on Learning Representations (ICLR).La Jolla:ICLR, 2018:1-12.
[16]	NASEER M, KHAN S, HAYAT M, et al. A Self-Supervised Approach for Adversarial Robustness[C]// Proceeding of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).Piscataway:IEEE, 2020:262-271.
[17]	ZHANG Shudong, GAO Haichang, CAO Xiwen, et al. Adaptive Fast and Targeted Adversarial Attack for Speech Recognition[J]. Journal of Xidian University, 2021, 48(1):168-175. (in Chinese)
[18]	XIE C, WU Y, VAN DER MAATEN L, et al. Feature Denoising for Improving Adversarial Robustness[C]// Proceeding of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).Piscataway:IEEE, 2019:501-509.
[19]	SALMAN H, SUN M, YANG G, et al. Denoised Smoothing:A Provable Defense for Pretrained Classifiers[C]// Proceeding of the 33rd Neural Information Processing Systems (NIPS).New York:ACM, 2020:21945-21957.
[20]	JEONG J, SHIN J. Consistency Regularization for Certified Robustness of Smoothed Classifiers[C]// Proceeding of the 33rd Neural Information Processing Systems (NIPS).New York:ACM, 2020:6-12.
[21]	RONNEBERGER O, FISCHER P, BROX T. U-Net:Convolutional Networks for Biomedical Image Segmentation[C]// Proceeding of the 18th International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI).Heidelberg:Springer, 2015:234-241.
[22]	ZHANG H, CISSE M, DAUPHIN Y N, et al. Mixup:Beyond Empirical Risk Minimization[C]// Proceeding of the 6th International Conference on Learning Representations (ICLR).La Jolla:ICLR, 2018:1-13.
[23]	CHEN Kaizhou. Optimization Computation Methods[M]. Xi'an: Northwest Telecommunication Engineering Institute Press, 1985: 22. (in Chinese)
[24]	HE K, ZHANG X, REN S, et al. Deep Residual Learning for Image Recognition[C]// Proceeding of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).Piscataway:IEEE, 2016:770-778.
[25]	GOODFELLOW I J, SHLENS J, SZEGEDY C. Explaining and Harnessing Adversarial Examples[C]// Proceeding of the 3rd International Conference on Learning Representations (ICLR).La Jolla:ICLR, 2015:1-11.
[26]	CARLINI N, WAGNER D. Towards Evaluating the Robustness of Neural Networks[C]// Proceeding of the 2017 IEEE Symposium on Security and Privacy (SP).Piscataway:IEEE, 2017:39-57.
[27]	KOS J, FISCHER I, SONG D. Adversarial Examples for Generative Models[C]// Proceeding of the 2018 IEEE Security and Privacy Workshops (SPW).Piscataway:IEEE, 2018:36-42.
[28]	LIU S, DENG W. Very Deep Convolutional Neural Network Based Image Classification Using Small Training Sample Size[C]// Proceeding of the 3rd IAPR Asian Conference on Pattern Recognition (ACPR).Piscataway:IEEE, 2015:730-734.
[29]	WONG E, RICE L, KOLTER J Z. Fast is Better Than Free:Revisiting Adversarial Training[C]// Proceeding of the 8th International Conference on Learning Representations (ICLR).La Jolla:ICLR, 2020:1-17.

Defense algorithm	Clean, VGG16	Clean, ResNet18	FGSM (ε=4/8/16), VGG16	FGSM (ε=4/8/16), ResNet18	CW (ε=4/8/16), VGG16	CW (ε=4/8/16), ResNet18	BIM (ε=4/8/16), VGG16	BIM (ε=4/8/16), ResNet18
No defense	92.6	93.2	80.3/64.6/46.1	82.8/67.6/45.2	83.5/77.0/69.3	84.5/76.2/70.3	82.7/57.1/30.9	83.2/55.1/31.5
NRP[16] (2020)	79.4	78.7	76.7/74.9/72.2	74.8/72.5/68.1	78.5/78.5/77.5	77.1/76.7/75.8	77.4/74.8/71.4	75.6/72.5/68.3
Fast AT[29] (2020)	81.5	83.8	88.1/78.7/56.0	89.0/79.1/55.0	91.8/90.8/87.8	91.2/90.7/87.4	90.1/90.8/87.1	91.0/90.7/87.5
ComDefend[10] (2019)	85.2	87.1	82.7/78.8/70.8	82.2/77.3/69.9	84.5/84.5/82.6	84.6/84.5/81.9	83.2/80.3/76.6	84.0/79.8/74.9
FS[14] (2018)	91.0	92.0	82.5/65.3/48.8	83.4/66.7/49.1	89.5/88.1/83.3	89.7/88.9/82.7	85.2/62.2/41.6	86.3/66.9/43.9
JPEG[15] (q=25) (2018)	71.0	74.1	66.6/63.0/54.3	66.2/64.1/55.0	70.1/69.8/68.2	69.9/69.9/68.5	67.5/64.2/60.2	68.2/63.1/59.4
JPEG[15] (q=50)	80.2	82.3	74.4/68.2/51.0	73.5/69.2/51.8	79.0/77.8/75.1	79.2/77.2/74.4	75.7/70.5/63.3	76.3/71.2/64.3
JPEG[15] (q=75)	85.9	84.9	78.4/68.5/47.5	79.2/69.7/48.1	84.8/83.0/78.5	85.9/84.2/78.8	80.5/72.9/58.8	81.0/73.4/58.9
TVM[15] (2018)	88.8	86.9	81.5/76.7/52.7	80.5/75.2/51.8	87.0/86.2/81.8	87.8/86.6/81.0	83.1/74.0/60.3	82.2/73.8/59.7
ID+HIR (Ours)	91.2	92.1	89.7/86.8/73.8	90.0/86.2/75.4	90.8/90.9/90.6	91.2/90.8/90.2	90.4/88.7/82.6	90.2/88.0/83.1

Defense algorithm	Clean, VGG16	Clean, ResNet18	FGSM (ε=4/8/16), VGG16	FGSM (ε=4/8/16), ResNet18	CW (ε=4/8/16), VGG16	CW (ε=4/8/16), ResNet18	BIM (ε=4/8/16), VGG16	BIM (ε=4/8/16), ResNet18
No defense	72.6	76.4	52.1/45.3/39.6	51.3/42.4/37.9	63.2/61.4/54.8	62.0/59.3/51.3	53.4/50.1/29.3	52.6/48.4/28.2
NRP[16] (2020)	65.2	65.4	65.4/62.1/61.2	69.2/68.5/68.3	68.1/68.7/66.3	71.8/70.9/70.3	69.3/67.7/59.3	72.1/70.7/66.4
Fast AT[29] (2020)	65.8	70.1	70.2/62.3/55.1	74.1/70.2/65.2	72.1/70.9/69.2	74.2/74.1/73.6	72.0/70.3/68.2	75.5/73.7/72.9
ComDefend[10] (2019)	68.7	71.4	66.2/61.4/60.2	70.2/69.8/67.2	70.0/69.7/65.2	73.8/71.2/69.1	68.2/66.3/60.1	73.2/71.7/67.5
FS[14] (2018)	71.3	74.1	67.2/57.1/54.2	54.7/44.5/40.1	70.2/70.1/70.5	70.4/67.2/65.6	68.3/62.5/57.3	71.3/69.5/64.1
JPEG[15] (q=25) (2018)	55.4	55.7	47.2/42.3/39.1	49.2/41.3/39.1	57.2/57.1/56.80.5	56.1/54.1/50.4	57.7/55.4/50.0	58.2/56.2/51.2
JPEG[15] (q=50)	58.8	60.5	49.2/43.7/38.29.1	51.5/44.7/42.7	64.2/63.4/62.7	65.2/62.6/62.1	66.0/61.6/58.2	65.1/62.4/59.7
JPEG[15] (q=75)	68.2	71.8	58.9/47.2/45.2	55.2/47.5/45.4	69.3/68.9/67.2	68.3/67.3/63.5	67.3/58.2/45.5	63.3/59.2/47.5
TVM[15] (2018)	69.1	72.3	61.7/51.2/49.7	57.1/49.3/51.6	70.1/68.2/67.7	69.8/68.2/63.1	67.9/61.3/50.2	64.2/60.8/51.7
ID+HIR (Ours)	71.8	74.8	71.2/68.3/63.7	75.5/72.3/68.7	72.2/71.2/70.6	74.9/74.7/74.2	71.5/69.8/64.5	75.5/73.1/69.3.1

Defense algorithm	FGSM (ε=4/8/16), CIFAR-10	FGSM (ε=4/8/16), SVHN	FGSM (ε=4/8/16), MNIST	CW (ε=4/8/16), CIFAR-10	CW (ε=4/8/16), SVHN	CW (ε=4/8/16), MNIST
No defense	52.7/44.0/34.0	52.1/40.0/27.4	-/-/41.2	17.1/14.2/9.0	37.1/17.2/9.5	-/-/74.3
ID	86.7/90.5/92.2	92.6/95.3/96.1	-/-/98.9	81.8/82.1/73.2	95.8/93.5/94.6	-/-/98.0
ID+HIR	88.3/91.5/92.2	93.5/96.2/96.4	-/-/99.1	85.0/85.2/90.0	96.2/94.8/95.7	-/-/98.7
