Journal of Xidian University ›› 2022, Vol. 49 ›› Issue (4): 118-126. doi: 10.19665/j.issn1001-2400.2022.04.014

• Computer Science and Technology •

Multi-scale salient object detection network combining an attention mechanism

LIU Di, GUO Jichang, WANG Yudong, ZHANG Yi

  1. School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China
  • Received: 2021-05-14 Online: 2022-08-20 Published: 2022-08-15
  • Contact: Jichang GUO
  • About the authors: LIU Di (born 1997), female, master's student at Tianjin University, E-mail: liudi@tju.edu.cn | WANG Yudong (born 1995), male, doctoral student at Tianjin University, E-mail: yudongwang@tju.edu.cn | ZHANG Yi (born 1997), male, doctoral student at Tianjin University, E-mail: zhangyi123@tju.edu.cn
  • Supported by:
    National Natural Science Foundation of China (61771334)

Abstract:

Most existing salient object detection methods are disturbed by complex image backgrounds, and their results often show uneven brightness and blurred edges. To address these problems, a multi-scale salient object detection network combining an attention mechanism is proposed. First, the network is built on an encoder-decoder architecture and fuses multi-scale features by connecting the features of adjacent layers during encoding and decoding, so that salient objects of different scales in the image can be captured. Second, an attention mechanism is integrated into the network to focus on the spatial and channel information of the features, with the aim of obtaining uniform, complete detection results with clearer edges. Finally, a parallel multi-branch structure, the Context Feature Extraction Module, is placed between the encoder and the decoder to extract features under different receptive fields and further improve detection performance. Experimental results on the ECSSD salient object detection dataset show that, compared with the comparison networks, the mean absolute error (MAE) and the F-measure are improved by at least 10% and 0.7%, respectively. The proposed network not only locates salient objects accurately and highlights them uniformly, but also predicts their edges precisely against complex backgrounds.
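The two evaluation metrics named in the abstract, MAE and F-measure, can be illustrated with a minimal sketch. It is not the authors' evaluation code. The function names `mae` and `f_measure`, the fixed binarization threshold of 0.5, and the weight `beta2 = 0.3` (a value commonly used in salient object detection work) are assumptions made here, since the abstract does not state them. The 2x2 maps are toy values, not data from the paper.

```python
def mae(pred, gt):
    """Mean absolute error between a saliency map and its ground truth.

    Both inputs are 2D lists of equal shape with values in [0, 1].
    Lower is better.
    """
    total = 0.0
    count = 0
    for p_row, g_row in zip(pred, gt):
        for p, g in zip(p_row, g_row):
            total += abs(p - g)
            count += 1
    return total / count


def f_measure(pred, gt, threshold=0.5, beta2=0.3):
    """Weighted F-measure of a binarized saliency map.

    threshold and beta2 are assumed values, not taken from the paper.
    Higher is better.
    """
    tp = fp = fn = 0
    for p_row, g_row in zip(pred, gt):
        for p, g in zip(p_row, g_row):
            predicted = p >= threshold   # pixel predicted as salient
            actual = g >= 0.5            # pixel labeled as salient
            if predicted and actual:
                tp += 1
            elif predicted:
                fp += 1
            elif actual:
                fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return (1 + beta2) * precision * recall / (beta2 * precision + recall)


# Toy 2x2 example (hypothetical values, for illustration only).
toy_pred = [[0.9, 0.2], [0.6, 0.1]]
toy_gt = [[1.0, 0.0], [0.0, 0.0]]
print(mae(toy_pred, toy_gt))
print(f_measure(toy_pred, toy_gt))
```

MAE is an error, so a lower value is better, while a higher F-measure is better. The abstract does not say whether the reported 10% and 0.7% are relative or absolute changes, so the sketch makes no attempt to reproduce them.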

Key words: salient object detection, attention mechanism, multi-scale feature fusion, deep learning, image processing

CLC Number:

  • TP391.4