Underwater Occlusion Target Detection Algorithm Based on Attention Mechanism

doi:10.16180/j.cnki.issn1007-7820.2023.05.010

Abstract

To address foreground occlusion and background blur in underwater target detection, an underwater target detection algorithm based on the attention mechanism is proposed. First, image enhancement algorithms are applied to improve image quality. Then, building on the similarity function of the non-local neural network, a concatenation similarity function with logical reasoning capability is fused in to strengthen the network's representation of global context features. In addition, the improved non-local neural network is combined with triplet attention to compensate for the channel features lost by the non-local neural network. Finally, a dilated convolution module replaces the pooling operation in triplet attention to reduce the loss of fine-grained information. Experiments on the dataset provided by the 2020 National Underwater Target Detection Contest show that the proposed algorithm raises the detection accuracy (mAP) of the baseline method from 65.66% to 68.55%, which demonstrates its effectiveness.

Key words: deep learning, convolutional neural network, underwater target detection, occluded object detection, attention mechanism, similarity function, dilated convolution, Faster R-CNN

CLC Number: TN957.52

SHI Jianke, QIAO Meiying, LI Bingfeng, ZHAO Yan. Underwater Occlusion Target Detection Algorithm Based on Attention Mechanism[J]. Electronic Science and Technology, 2023, 36(5): 62-70.

Table 1. The structure of the backbone network

Layer	Output size	Structure
Conv1	300×300×64	k=7×7, s=2
Maxpool	150×150×64	k=3×3, s=2
Layer1	150×150×256	[1×1, 64; 3×3, 64; 1×1, 256] ×3
Layer2	75×75×512	[1×1, 128; 3×3, 128; 1×1, 512] ×4
Layer3	38×38×1024	[1×1, 256; 3×3, 256; 1×1, 1024] ×6
Layer4	19×19×2048	[1×1, 512; 3×3, 512; 1×1, 2048] ×3
Attention	19×19×2048	Proposed
Avgpool	13×13×2048	k=7×7
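
The spatial sizes in Table 1 can be reproduced with the standard output-size formula for convolution and pooling layers. The sketch below is only a consistency check under stated assumptions: a 600×600 input and the usual ResNet-50 paddings and strides are assumed here, since neither is given in the table, and `out_size` is a helper name introduced for illustration.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (floor mode)."""
    return (n + 2 * padding - kernel) // stride + 1


# Assumption: 600x600 input image (not stated in Table 1).
# Assumption: paddings and strides follow the usual ResNet-50 layout.
size = 600
trace = {}

size = out_size(size, kernel=7, stride=2, padding=3)  # 7x7 conv, s=2
trace["Conv1"] = size
size = out_size(size, kernel=3, stride=2, padding=1)  # 3x3 max pool, s=2
trace["Maxpool"] = size
trace["Layer1"] = size                                # stride 1, size unchanged
for name in ("Layer2", "Layer3", "Layer4"):
    # Each later stage downsamples once with a stride-2 3x3 convolution.
    size = out_size(size, kernel=3, stride=2, padding=1)
    trace[name] = size
# 7x7 average pooling with stride 1 and no padding (assumed).
trace["Avgpool"] = out_size(size, kernel=7, stride=1, padding=0)

for name, value in trace.items():
    print(f"{name}: {value}x{value}")
```

Under these assumptions every stage matches the output size listed in Table 1, including the 13×13 map after the 7×7 average pooling.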

Occlusion level	Number of images	Proportion/%
No occlusion	647	12.94
Slight occlusion	2054	41.08
Moderate occlusion	1775	35.50
Severe occlusion	524	10.48

Type	Sea urchin	Scallop	Starfish	Sea cucumber	mAP
Figure 5(a)	75.28	57.54	76.10	57.35	66.57
Figure 5(b)	74.92	58.50	75.87	58.27	66.89
Figure 5(c)	77.14	60.25	77.60	59.20	68.55
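
The mAP column above appears to be the unweighted mean of the four per-class average precisions. That reading is an assumption rather than something the page states, and the short check below simply tests it against the three rows; `mean_ap` is a helper name introduced for illustration.

```python
def mean_ap(per_class_ap):
    """Unweighted mean of per-class average precisions (assumed mAP definition)."""
    return sum(per_class_ap) / len(per_class_ap)


# Per-class AP in the column order of the table
# (sea urchin, scallop, starfish, sea cucumber) and the reported mAP.
rows = {
    "5(a)": ([75.28, 57.54, 76.10, 57.35], 66.57),
    "5(b)": ([74.92, 58.50, 75.87, 58.27], 66.89),
    "5(c)": ([77.14, 60.25, 77.60, 59.20], 68.55),
}

for name, (aps, reported) in rows.items():
    computed = mean_ap(aps)
    # A gap below 0.01 is explained by rounding to two decimals.
    print(f"{name}: computed {computed:.4f}, reported {reported:.2f}, "
          f"gap {abs(computed - reported):.4f}")
```

All three rows agree with the reported values to within rounding, which supports the assumed definition.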

Method	Sea urchin	Scallop	Starfish	Sea cucumber	mAP
Z-pool	76.55	58.46	76.21	58.68	67.47
DCM	77.14	60.25	77.60	59.20	68.55
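
For context on the comparison above, Z-pool in triplet attention collapses one dimension of a feature tensor to two maps by stacking the max-pooled and average-pooled values along it. The sketch below shows that operation on a small nested list. The internal design of the dilated convolution module (DCM) is not given on this page, so it is not reproduced; only the well-known effective extent of a dilated kernel is included. Both function names are introduced here for illustration.

```python
def z_pool(x):
    """Z-pool over the first axis of x, a nested list shaped [C][H][W].

    Returns [2][H][W]: the channel-wise maximum map and the channel-wise mean map.
    """
    channels, height, width = len(x), len(x[0]), len(x[0][0])
    max_map = [
        [max(x[c][i][j] for c in range(channels)) for j in range(width)]
        for i in range(height)
    ]
    mean_map = [
        [sum(x[c][i][j] for c in range(channels)) / channels for j in range(width)]
        for i in range(height)
    ]
    return [max_map, mean_map]


def dilated_extent(kernel, dilation):
    """Effective spatial extent of a kernel with the given dilation rate."""
    return kernel + (kernel - 1) * (dilation - 1)


features = [
    [[1, 2], [3, 4]],  # channel 0
    [[5, 0], [1, 8]],  # channel 1
]
pooled = z_pool(features)
print(pooled[0])  # channel-wise maximum
print(pooled[1])  # channel-wise mean
print(dilated_extent(3, 2))
```

Z-pool keeps only two statistics per position, which is one plausible reason a learned dilated convolution could retain more fine-grained information, as the abstract argues.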

Method	Sea urchin	Scallop	Starfish	Sea cucumber	Mean miss rate
Baseline	48.20	49.52	44.89	54.36	49.25
ResNet-50+NNNet	46.72	54.25	44.33	50.26	48.75
ResNet-50+TA	47.02	50.41	45.68	48.66	47.75
Proposed	45.96	47.88	45.47	49.30	46.50

Similarity function	Sea urchin	Scallop	Starfish	Sea cucumber	mAP
Gaussian	74.72	58.10	73.58	55.55	65.49
Embedded Gaussian	75.10	57.02	74.17	56.35	65.66
Dot Product	75.42	56.24	74.03	57.25	65.74
Concatenation	76.30	57.15	74.43	57.34	66.31
Proposed	77.14	60.25	77.60	59.20	68.55
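
For reference, the four named similarity functions in the table above have standard definitions in the original non-local neural network formulation (Wang et al., 2018). They are restated below in that paper's notation, where $\theta$, $\phi$ and $g$ are learned embeddings, $w_f$ is a learned weight vector, and $N$ is the number of positions. How the proposed method fuses the concatenation form with the base similarity function is not specified on this page, so no formula is given for the "Proposed" row.

```latex
% Generic non-local operation
y_i = \frac{1}{\mathcal{C}(x)} \sum_{\forall j} f(x_i, x_j)\, g(x_j)

% Gaussian
f(x_i, x_j) = e^{x_i^{\top} x_j}, \qquad
\mathcal{C}(x) = \sum_{\forall j} f(x_i, x_j)

% Embedded Gaussian
f(x_i, x_j) = e^{\theta(x_i)^{\top} \phi(x_j)}, \qquad
\mathcal{C}(x) = \sum_{\forall j} f(x_i, x_j)

% Dot product
f(x_i, x_j) = \theta(x_i)^{\top} \phi(x_j), \qquad
\mathcal{C}(x) = N

% Concatenation
f(x_i, x_j) = \mathrm{ReLU}\!\left( w_f^{\top} \left[ \theta(x_i), \phi(x_j) \right] \right), \qquad
\mathcal{C}(x) = N
```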

Method	Backbone	Sea urchin/%	Scallop/%	Starfish/%	Sea cucumber/%	mAP/%	Frame rate/(frames·s^-1)
YOLOv3	DarkNet-53	70.42	50.04	71.55	52.70	61.18	27
SSD300	VGG-16	71.72	52.30	72.54	54.72	62.82	22
SSD512	VGG-16	73.20	54.81	73.10	56.37	64.37	18
Faster R-CNN	VGG-16	73.35	54.03	72.52	55.32	63.81	13
Faster R-CNN	ResNet-50	75.10	57.02	74.17	56.35	65.66	12
Faster R-CNN	Proposed	77.14	60.25	77.60	59.20	68.55	10

Method	Sea urchin/%	Scallop/%	Starfish/%	Sea cucumber/%	mAP/%	Frame rate/(frames·s^-1)
Baseline	75.10	57.02	74.17	56.35	65.66	12
FRANet	74.58	59.86	75.67	55.28	66.35	20
ResNet-50+SENet	73.51	60.10	76.20	58.55	67.09	10
ResNet-50+CBAM	71.34	59.56	72.74	56.36	65.00	9
ResNet-50+NNNet	75.85	58.26	73.35	57.17	66.16	10
ResNet-50+TA	74.24	57.52	74.94	57.55	66.06	12
ResNet-50+NNNet+TA	76.23	58.75	76.25	58.68	67.47	7
Proposed	77.14	60.25	77.60	59.20	68.55	10
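
The accuracy and speed trade-off in the table above is easier to read as differences from the baseline. The sketch below copies the mAP and frame-rate columns and prints each variant's change; the dictionary layout and the name `delta_vs_baseline` are introduced here for illustration only.

```python
# (mAP in %, frame rate in frames per second), copied from the table above.
results = {
    "Baseline": (65.66, 12),
    "FRANet": (66.35, 20),
    "ResNet-50+SENet": (67.09, 10),
    "ResNet-50+CBAM": (65.00, 9),
    "ResNet-50+NNNet": (66.16, 10),
    "ResNet-50+TA": (66.06, 12),
    "ResNet-50+NNNet+TA": (67.47, 7),
    "Proposed": (68.55, 10),
}


def delta_vs_baseline(table, baseline="Baseline"):
    """Return {method: (mAP change in points, frame-rate change)} vs. the baseline."""
    base_map, base_fps = table[baseline]
    return {
        name: (m_ap - base_map, fps - base_fps)
        for name, (m_ap, fps) in table.items()
        if name != baseline
    }


deltas = delta_vs_baseline(results)
for name, (d_map, d_fps) in deltas.items():
    print(f"{name}: mAP {d_map:+.2f} points, frame rate {d_fps:+d} frames/s")
```

Read this way, the proposed method has the largest mAP gain in the table at a cost of 2 frames per second relative to the baseline, while ResNet-50+CBAM is the only variant that falls below the baseline mAP.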