Journal of Xidian University ›› 2019, Vol. 46 ›› Issue (6): 52-59. doi: 10.19665/j.issn1001-2400.2019.06.008

Algorithm for segmentation of smoke using the improved DeeplabV3 network

WANG Ziyi1, SU Yuting1, LIU Yanyan2, ZHANG Wei3

  1. School of Electrical Automation and Information Engineering, Tianjin University, Tianjin 300072, China
  2. School of Electronic Information and Optical Engineering, Nankai University, Tianjin 300071, China
  3. School of Microelectronics, Tianjin University, Tianjin 300072, China
  • Received: 2019-06-07 Online: 2019-12-20 Published: 2019-12-21
  • About the author: WANG Ziyi (born 1993), female, master's student at Tianjin University. E-mail: 1247218542@qq.com
  • Supported by:
    Competitive Selection Project of the Technology Research Program of the Ministry of Public Security (2016JSYJD04-O3); Technology Research Program of the Ministry of Public Security (2017JSYJC35)

Abstract:

Most existing smoke detection methods rely on hand-crafted features and often fail to segment the smoke regions in video images accurately. To address this, the paper proposes an improved DeeplabV3 smoke segmentation algorithm. A feature refinement module is added after the base encoder network to weaken the gridding effect caused by dilated (atrous) convolutions. Because smoke is a non-rigid target whose scale and shape vary widely, deformable convolution is introduced into the Atrous Spatial Pyramid Pooling (ASPP) module so that the network can better learn smoke deformation. A channel attention decoder module is proposed to further recover the spatial details of smoke. On the smoke image data set, the improved model takes about 71.73 ms on average to predict one image and reaches a Mean Pixel Accuracy (MPA) of about 97.78% and a Mean Intersection over Union (MIoU) of about 91.21%, which are 0.56% and 2.17% higher, respectively, than those of the DeeplabV3 model, making it better suited to smoke segmentation. Tests on public smoke videos show that the model achieves a higher detection rate than existing video-based smoke detection algorithms and that it is of some research significance and practical value.

Key words: image processing, smoke detection, semantic segmentation, deformable convolution, attention mechanism, deep learning
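
The abstract reports results as Mean Pixel Accuracy (MPA) and Mean Intersection over Union (MIoU). The page does not say how these were computed, so the following is only a minimal sketch using the usual confusion-matrix definitions of the two metrics. The function names and the toy labels (0 = background, 1 = smoke) are illustrative assumptions, not taken from the paper.

```python
# Sketch of the standard MPA / MIoU definitions (assumed, not the paper's code).

def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_pixel_accuracy(cm):
    """Mean over classes of (correctly labelled pixels) / (ground-truth pixels)."""
    accs = []
    for i, row in enumerate(cm):
        total = sum(row)
        if total:  # skip classes absent from the ground truth
            accs.append(row[i] / total)
    return sum(accs) / len(accs)


def mean_iou(cm):
    """Mean over classes of TP / (TP + FP + FN)."""
    n = len(cm)
    ious = []
    for i in range(n):
        tp = cm[i][i]
        fn = sum(cm[i]) - tp
        fp = sum(cm[r][i] for r in range(n)) - tp
        union = tp + fp + fn
        if union:  # skip classes that never appear
            ious.append(tp / union)
    return sum(ious) / len(ious)


# Hypothetical flattened masks: 0 = background, 1 = smoke.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 0]

cm = confusion_matrix(y_true, y_pred, num_classes=2)
print(cm)                       # [[3, 1], [1, 3]]
print(mean_pixel_accuracy(cm))  # 0.75
print(mean_iou(cm))             # 0.6
```

In the toy example each class has three correct pixels, one missed pixel and one false alarm, so per-class accuracy is 3/4 and per-class IoU is 3/5. MIoU penalises false positives as well as misses, which is why it is lower than MPA here, in line with the gap between the two figures reported in the abstract.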

CLC number:

  • TP391