Journal of Xidian University ›› 2020, Vol. 47 ›› Issue (6): 148-157. doi: 10.19665/j.issn1001-2400.2020.06.021
CHENG Lei, WANG Yue, TIAN Chunna
Received: 2019-12-10
Online: 2020-12-20
Published: 2021-01-06
Contact: TIAN Chunna
About the author: CHENG Lei (born 1993), male, master's student at Xidian University.
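Abstract:
Traditional convolutional neural network structures do not make full use of their strong feature learning and feature representation capabilities, so an improved feature extraction network is proposed for video object tracking. Building on the traditional feature extraction network, the method introduces an attention mechanism in residual-network form together with a feature fusion strategy. In addition, a loss function based on the region overlap ratio is introduced in the training stage of the network model, so that the model localizes the target more accurately. Experimental results show that the improved algorithm can track the target accurately over long periods, and that the method generalizes and can serve as a reference for other deep-learning-based tracking algorithms.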
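The abstract does not spell out the overlap-based loss, but the variant name GOTURN_giou in the result tables suggests generalized IoU (GIoU). The following is a minimal sketch under that assumption, not the authors' implementation: boxes are assumed to be axis-aligned `(x1, y1, x2, y2)` tuples with positive area, and the function names `giou` and `giou_loss` are hypothetical.

```python
def giou(box_a, box_b):
    """Generalized IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumes x1 < x2 and y1 < y2 for both boxes (positive area).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection rectangle; width/height clamp to zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = area_a + area_b - inter
    iou = inter / union

    # Smallest axis-aligned box enclosing both inputs.
    enclose_w = max(ax2, bx2) - min(ax1, bx1)
    enclose_h = max(ay2, by2) - min(ay1, by1)
    enclose = enclose_w * enclose_h

    # GIoU penalizes the empty part of the enclosing box, so it stays
    # informative even when the boxes do not overlap at all.
    return iou - (enclose - union) / enclose


def giou_loss(pred_box, target_box):
    """Loss in [0, 2]: 0 for a perfect match, larger for worse overlap."""
    return 1.0 - giou(pred_box, target_box)
```

Unlike a plain IoU loss, this quantity still varies with the distance between two disjoint boxes, which is the usual motivation for using it in bounding-box regression.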
CHENG Lei, WANG Yue, TIAN Chunna. Residual attention mechanism for visual tracking[J]. Journal of Xidian University, 2020, 47(6): 148-157.
Table 3
Accuracy results on the VOT2016 dataset (%)
Algorithm | Unassigned | Camera motion | Illumination change | Motion change | Occlusion | Scale change | Mean | Weighted mean |
---|---|---|---|---|---|---|---|---|
GOTURN | 50.78 | 47.58 | 56.91 | 43.97 | 40.04 | 46.49 | 47.63 | 47.27 |
GOTURN_concat | 49.85 | 48.42 | 61.65 | 41.56 | 38.00 | 49.24 | 48.12 | 47.33 |
GOTURN_atten | 51.08 | 50.71 | 64.45 | 43.99 | 36.29 | 52.49 | 49.84 | 49.23 |
GOTURN_giou | 50.99 | 49.34 | 62.13 | 44.27 | 40.69 | 47.23 | 49.11 | 48.24 |
GOTURN_all | 50.81 | 50.51 | 64.95 | 45.30 | 40.48 | 53.53 | 50.93 | 49.87 |
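
The mean column in the accuracy table appears to be the unweighted arithmetic mean of the six per-attribute columns. The sketch below checks that reading against the reported figures; the values are copied from the table, and the helper name `attribute_mean` is hypothetical. The weights behind the weighted-mean column are not stated here, so that column is not reproduced.

```python
# Per-attribute accuracy (%) copied from the table, in column order:
# unassigned, camera motion, illumination change, motion change,
# occlusion, scale change.
per_attribute = {
    "GOTURN":        [50.78, 47.58, 56.91, 43.97, 40.04, 46.49],
    "GOTURN_concat": [49.85, 48.42, 61.65, 41.56, 38.00, 49.24],
    "GOTURN_atten":  [51.08, 50.71, 64.45, 43.99, 36.29, 52.49],
    "GOTURN_giou":   [50.99, 49.34, 62.13, 44.27, 40.69, 47.23],
    "GOTURN_all":    [50.81, 50.51, 64.95, 45.30, 40.48, 53.53],
}

# Mean column as reported in the table.
reported_mean = {
    "GOTURN": 47.63,
    "GOTURN_concat": 48.12,
    "GOTURN_atten": 49.84,
    "GOTURN_giou": 49.11,
    "GOTURN_all": 50.93,
}


def attribute_mean(values):
    """Unweighted arithmetic mean of the per-attribute scores."""
    return sum(values) / len(values)


for name, values in per_attribute.items():
    computed = attribute_mean(values)
    print(f"{name}: computed {computed:.3f}, reported {reported_mean[name]:.2f}")
```

Each computed mean agrees with the reported one to within rounding at two decimal places.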
Table 4
Robustness results on the VOT2016 dataset
Algorithm | Unassigned | Camera motion | Illumination change | Motion change | Occlusion | Scale change | Mean | Weighted mean |
---|---|---|---|---|---|---|---|---|
GOTURN | 32.0000 | 58.0000 | 4.0000 | 45.0000 | 26.0000 | 19.0000 | 30.6667 | 38.0153 |
GOTURN_concat | 30.0000 | 60.0000 | 2.0000 | 49.0000 | 26.0000 | 20.0000 | 31.1667 | 38.8867 |
GOTURN_atten | 34.0000 | 51.0000 | 2.0000 | 35.0000 | 22.0000 | 22.0000 | 27.6667 | 34.8899 |
GOTURN_giou | 38.0000 | 70.0000 | 2.0000 | 47.0000 | 24.0000 | 22.0000 | 33.8333 | 43.5964 |
GOTURN_all | 27.0000 | 55.0000 | 4.0000 | 44.0000 | 24.0000 | 23.0000 | 29.6667 | 36.5500 |
Table 5
Comparison of tracking results (VOT2016)
Algorithm | Accuracy (%) | Robustness (failures) |
---|---|---|
DFT | 44.56 | 59.6116 |
DAT | 45.82 | 28.3533 |
EBT | 45.29 | 15.1935 |
STRUCK2014 | 44.52 | 56.1027 |
KCF2014 | 48.88 | 38.0820 |
ACT | 43.72 | 42.6031 |
DPT | 48.46 | 31.9389 |
MLDF | 48.73 | 15.0437 |
GOTURN | 47.27 | 38.0153 |
GOTURN_atten | 49.23 | 34.8899 |
GOTURN_concat | 47.33 | 38.8867 |
GOTURN_giou | 48.24 | 43.5964 |
GOTURN_all | 49.87 | 36.1468 |