Journal of Xidian University ›› 2022, Vol. 49 ›› Issue (3): 160-170. doi: 10.19665/j.issn1001-2400.2022.03.018

• Computer Science and Technology & Artificial Intelligence •

Feature enhanced single-stage remote sensing image object detection model

WANG Xili, LIANG Min, LIU Tao

  1. School of Computer Science, Shaanxi Normal University, Xi’an 710119, China
  • Received: 2021-01-27  Revised: 2021-11-24  Online: 2022-06-20  Published: 2022-07-04

Abstract:

Purpose: Convolutional neural networks have substantially improved remote sensing object detection. However, scene complexity and the diversity of target sizes and shapes remain challenging for this task. This paper therefore studies a deep model for detecting objects of different sizes in complex scenes. Methods: Feature pyramids are effective for detecting objects of different sizes, but passing features layer by layer can lose information within the pyramid. This paper therefore proposes a feature pyramid network with shortcut connections, which enhances the semantic and detailed information in each feature layer of the pyramid. In addition, using spatial attention weights to strengthen likely target regions is an effective way to improve the detection rate, and it helps with object detection in complex scenes. However, existing spatial attention also strengthens imprecise predictions, which may interfere with the final results. To address this, the paper proposes an anchor-based spatial attention module that mainly strengthens the feature regions more likely to produce accurate predictions. The feature pyramid network with shortcut connections and the anchor-based spatial attention module are embedded into RetinaNet to form an end-to-end, feature-enhanced, single-stage remote sensing object detection model, FENet (Feature Enhanced Network). Results: Experiments show that FENet achieves an mAP 1.78% higher than that of the FAN model on the UCAS-AOD remote sensing dataset and 1.48% higher than that of the FAN model on the RSOD dataset, and that its mAP is superior to that of the comparable models. In addition, the test time of FENet for an 800×800-pixel image on a single Titan X GPU is 0.058 s. Conclusions: The experimental results show that the proposed model effectively enhances object feature extraction and thus improves detection performance.
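The idea of adding shortcut connections to a top-down feature pyramid can be illustrated with a small sketch. The abstract does not specify how the shortcuts are wired or how features are fused, so everything below is an assumption made for illustration: the function names (`upsample`, `add`, `build_pyramid`), nearest-neighbour upsampling, element-wise summation, and the choice to send every non-adjacent coarser backbone map directly to each finer level. It is not the FENet implementation.

```python
# Illustrative sketch only: a top-down feature pyramid on tiny 2-D maps,
# with optional shortcut connections. The fusion rule (element-wise sum)
# and the shortcut wiring are assumptions, not the paper's design.

def upsample(fmap, factor):
    """Nearest-neighbour upsampling of a 2-D list by an integer factor."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out

def add(a, b):
    """Element-wise sum of two equally sized 2-D lists."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def build_pyramid(backbone, shortcuts=True):
    """Fuse backbone maps ordered fine -> coarse (each half the previous size)."""
    n = len(backbone)
    fused = [None] * n
    fused[-1] = backbone[-1]  # the coarsest level has nothing above it
    for i in range(n - 2, -1, -1):
        # Layer-by-layer path: add the fused map from one level up.
        out = add(backbone[i], upsample(fused[i + 1], 2))
        if shortcuts:
            # Hypothetical shortcut: non-adjacent coarser backbone maps
            # are upsampled straight to this level's resolution.
            for j in range(i + 2, n):
                out = add(out, upsample(backbone[j], 2 ** (j - i)))
        fused[i] = out
    return fused

# Three toy levels: 4x4, 2x2 and 1x1 maps with constant values.
levels = [
    [[1.0] * 4 for _ in range(4)],
    [[10.0] * 2 for _ in range(2)],
    [[100.0]],
]
plain = build_pyramid(levels, shortcuts=False)
short = build_pyramid(levels, shortcuts=True)
print(plain[0][0][0], short[0][0][0])
```

In this toy setting the finest level receives the coarsest map twice when shortcuts are on: once through the layer-by-layer path and once directly. That is one simple way to picture how a shortcut could counteract information lost during layer-by-layer transfer; a real network would use learned convolutions rather than raw sums.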

Key words: remote sensing image, feature pyramids, spatial attention, anchor box, single-stage object detection

CLC Number: 

  • TP391.4