Interframe target regression network for vehicle detection in UAV video

doi:10.19665/j.issn1001-2400.2021.04.020

Abstract

UAV video offers a flexible viewpoint, continuous coverage and a wide monitoring scope, but it also poses problems such as crowded targets and strong motion noise, which make target detection difficult. To address these problems, this paper proposes a video vehicle detection algorithm based on an interframe target regression network. First, because vehicles in UAV video are often crowded, soft non-maximum suppression (Soft-NMS) is adopted as the detection-box merging strategy of FCOS, yielding a single-frame vehicle detector. Second, a single-frame detector applied directly to video is easily disturbed by motion noise, so the confidence for the same target fluctuates from frame to frame; an interframe target regression network is designed to deal with this. Exploiting the continuity of interframe motion, the network fuses the target features of several adjacent frames and matches the fused features against the target features of the current frame to output prediction results. Finally, the predictions are corrected with the single-frame detection results to improve detection performance. The proposed algorithm reaches an average precision (AP@0.5) of 47.42%, about 2 and 5 percentage points higher than FCOS and FGFA respectively. Experimental results show that it outperforms FCOS and FGFA and has better robustness and generalization.
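The Soft-NMS box-merging step named in the abstract can be illustrated with a minimal sketch of the Gaussian variant from reference [12]. This is not the authors' implementation: the function names, the `sigma=0.5` decay parameter and the `score_thresh` cut-off are assumptions chosen for illustration, and the abstract does not say which Soft-NMS variant or parameters were used.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS (hypothetical helper, parameters are assumptions).

    Instead of deleting boxes that overlap the current best box, their
    scores are decayed by exp(-IoU^2 / sigma). Returns a list of
    (box_index, decayed_score) pairs in the order the boxes were selected.
    """
    scores = list(scores)  # work on a copy so the caller's list is untouched
    remaining = list(range(len(boxes)))
    kept = []
    while remaining:
        best = max(remaining, key=lambda i: scores[i])
        if scores[best] < score_thresh:
            break  # every remaining box is below the cut-off
        kept.append((best, scores[best]))
        remaining.remove(best)
        for i in remaining:
            overlap = iou(boxes[best], boxes[i])
            scores[i] *= math.exp(-(overlap ** 2) / sigma)
    return kept


if __name__ == "__main__":
    # Two heavily overlapping vehicles plus one isolated vehicle.
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    scores = [0.9, 0.8, 0.7]
    for index, score in soft_nms(boxes, scores):
        print(index, round(score, 3))
```

In this toy input the second box overlaps the first with an IoU of about 0.68. Hard NMS with a typical 0.5 threshold would discard it, whereas Soft-NMS keeps it with a reduced score, which is the behaviour that helps when vehicles are densely packed.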

Key words: UAV video, vehicle detection, interframe movements, fusion feature, interframe target regression

CLC Number: TP391.4

ZHANG Zhi, ZHENG Jin. Interframe target regression network for vehicle detection in UAV video[J]. Journal of Xidian University, 2021, 48(4): 151-158.

Figures/Tables 6

References 18

[1]	LI Q, MOU L, XU Q, et al. R3-Net:a Deep Network for Multioriented Vehicle Detection in Aerial Images and Videos[J]. IEEE Transactions on Geoscience and Remote Sensing, 2019, 57(7):5028-5042. doi: 10.1109/TGRS.36
[2]	REN S, HE K, GIRSHICK R, et al. Faster R-CNN:Towards Real-Time Object Detection with Region Proposal Networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6):1137-1149. doi: 10.1109/TPAMI.2016.2577031
[3]	XIE X, YANG W, CAO G, et al. Real-Time Vehicle Detection from UAV Imagery[C]//Proceedings of the 2018 IEEE Fourth International Conference on Multimedia Big Data(BigMM).Piscataway:IEEE, 2018:1-5.
[4]	LIU W, ANGUELOV D, ERHAN D, et al. SSD:Single Shot Multibox Detector[C]//Proceedings of the 14th European Conference on Computer Vision Amsterdam.Berlin:Springer, 2016:21-37.
[5]	LIN T Y, GOYAL P, GIRSHICK R, et al. Focal Loss for Dense Object Detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 42(2):318-327. doi: 10.1109/TPAMI.34
[6]	VAN ETTEN A. You Only Look Twice:Rapid Multi-Scale Object Detection In Satellite Imagery[EB/OL].[2018-05-24].https://arxiv.org/abs/1805.09512 .
[7]	CUI Yanpeng, WANG Yuanhao, HU Jianwei. Detection Method for a Dynamic Small Target Using the Improved YOLOv3[J]. Journal of Xidian University, 2020, 47(3):1-7. (in Chinese)
[8]	REDMON J, FARHADI A. YOLOv3:an Incremental Improvement[EB/OL].[2018-04-08].https://arxiv.org/abs/1804.02767 .
[9]	KANG K, LI H. T-CNN:Tubelets with Convolutional Neural Networks for Object Detection from Videos[EB/OL].[2017-08-03].https://arxiv.org/abs/1604.02532 .
[10]	ZHU X. Towards High Performance Video Object Detection[C]//Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition.Piscataway:IEEE, 2018:7210-7218.
[11]	WANG S, ZHOU Y, YAN J, et al. Fully Motion-Aware Network for Video Object Detection[C]//Proceedings of the European Conference on Computer Vision.Munich:Springer, 2018:557-573.
[12]	BODLA N, SINGH B, CHELLAPPA R, et al. Soft-NMS:Improving Object Detection with One Line of Code[C]//Proceedings of the 2017 IEEE International Conference on Computer Vision(ICCV).Piscataway:IEEE, 2017:5562-5570.
[13]	TIAN Z, SHEN C, CHEN H, et al. FCOS:Fully Convolutional One-Stage Object Detection[C]//Proceedings of the 2019 IEEE International Conference on Computer Vision(ICCV).Piscataway:IEEE, 2019:9626-9635.
[14]	HE K, ZHANG X, REN S, et al. Deep Residual Learning for Image Recognition[C]//Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition(CVPR).Piscataway:IEEE, 2016:770-778.
[15]	LIN T, DOLLAR P, GIRSHICK R, et al. Feature Pyramid Networks for Object Detection[C]//Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition(CVPR).Piscataway:IEEE, 2017:936-944.
[16]	ZHU P, WEN L, BIAN X, et al. Vision Meets Drones:a Challenge[EB/OL].[2018-04-23].https://arxiv.org/abs/1804.07437v2 .
[17]	HELD D, THRUN S, SAVARESE S. Learning to Track at 100 FPS with Deep Regression Networks[C]//Proceedings of the European Conference on Computer Vision Amsterdam.Berlin:Springer, 2016:749-765.
[18]	RAZAKARIVONY S, JURIE F. Vehicle Detection in Aerial Imagery:a Small Target Detection Benchmark[J]. Journal of Visual Communication and Image Representation, 2016, 34:187-203.

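Method	AP@0.5 (%)
FCOS	45.52
FCOS+Soft-NMS (single-frame UAV image detector)	46.38
FCOS+Soft-NMS + interframe target regression network + 1 fused frame	47.30
FCOS+Soft-NMS + interframe target regression network + 5 fused frames	47.42
FCOS+Soft-NMS + interframe target regression network + 10 fused frames	47.10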

Method	AP@0.5 (%)
Faster R-CNN^[2] (ResNet101+FPN)	44.82
RetinaNet^[5] (ResNet101+FPN)	44.25
YOLOv3^[8]	36.12
FCOS^[13] (ResNet101+FPN)	45.52
FGFA^[10] (video detector)	42.83
Proposed algorithm	47.42
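The improvements quoted in the abstract can be checked against the comparison table with a few lines of arithmetic. The sketch below only copies the AP@0.5 values from that table; the dictionary keys are shortened labels chosen here for illustration.

```python
# AP@0.5 values (%) copied from the comparison table; keys are shortened labels.
ap = {
    "Faster R-CNN": 44.82,
    "RetinaNet": 44.25,
    "YOLOv3": 36.12,
    "FCOS": 45.52,
    "FGFA": 42.83,
    "Proposed": 47.42,
}

# Absolute gain of the proposed algorithm over each baseline, in percentage points.
gains = {
    name: round(ap["Proposed"] - value, 2)
    for name, value in ap.items()
    if name != "Proposed"
}

if __name__ == "__main__":
    for name, gain in gains.items():
        print(f"{name}: +{gain:.2f} points")
```

The gains over FCOS and FGFA come out at 1.90 and 4.59 percentage points, which the abstract rounds to 2% and 5%.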
