Electronic Science and Technology (电子科技) ›› 2024, Vol. 37 ›› Issue (4): 38-46. doi: 10.16180/j.cnki.issn1007-7820.2024.04.006


Research on Multiclass Garbage Classification Algorithm Based on Improved MobileNet Network

LIANG Chenye, ZHANG Xuanxiong

  1. School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
  • Received: 2022-11-11 Online: 2024-04-15 Published: 2024-04-19
  • About the authors: LIANG Chenye (1998-), male, master's student. Research interest: machine vision.
    ZHANG Xuanxiong (1963-), male, PhD, professor. Research interest: micro-electro-mechanical systems.
  • Supported by:
    National Natural Science Foundation of China (62276167)

Abstract:

In view of the large amount of garbage and the fact that a single image often contains multiple garbage objects, this study proposes a garbage detection and classification algorithm based on an improved MobileNet network. The MobileNet network is integrated into the YOLOv5 (You Only Look Once v5) object detection algorithm. A Convolutional Block Attention Module (CBAM) is introduced in the backbone to filter meaningful information, and a vision Transformer is used to aggregate image features. A weighted bidirectional feature pyramid network is added to distinguish the contributions of different features, and an Efficient Channel Attention (ECA) module combines the image features and passes them to the prediction layer. Finally, to obtain better performance when garbage targets occlude one another, the soft-NMS (soft Non-Maximum Suppression) method is applied, and the Alpha-IoU (Alpha-Intersection over Union) loss function is used for prediction on the extracted features. Experimental results show that the proposed method can locate and recognize multi-target, multi-category garbage. The mAP (mean Average Precision) reaches 90.31%, which is 4.95% higher than that of the YOLOv5 network, and the processing time is shortened by about 2.4 s. Compared with the Faster R-CNN (Region-based Convolutional Neural Network) algorithm integrated with ResNet (Residual Network), the proposed algorithm improves processing efficiency while maintaining accuracy.
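The abstract names soft-NMS and the Alpha-IoU loss without giving their formulas. The following is a minimal pure-Python sketch of the commonly published forms of both, not the authors' implementation: the Gaussian score decay for soft-NMS and the basic loss 1 - IoU^alpha. The function names, the (x1, y1, x2, y2) box format, and the defaults `alpha=3.0`, `sigma=0.5`, and `score_thresh=0.001` are assumptions chosen for illustration; the paper's actual settings are not stated here.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def alpha_iou_loss(pred, target, alpha=3.0):
    """Basic Alpha-IoU loss, 1 - IoU**alpha (alpha=3.0 is an assumed default)."""
    return 1.0 - iou(pred, target) ** alpha


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian soft-NMS: decay the scores of overlapping boxes instead of
    discarding them outright, which helps when objects occlude one another.

    Returns a list of (box_index, decayed_score) in selection order.
    """
    remaining = list(enumerate(scores))
    kept = []
    while remaining:
        # Select the box with the highest current score.
        best_pos = max(range(len(remaining)), key=lambda k: remaining[k][1])
        best_idx, best_score = remaining.pop(best_pos)
        kept.append((best_idx, best_score))
        # Decay every remaining score by its overlap with the selected box.
        decayed = []
        for idx, score in remaining:
            overlap = iou(boxes[best_idx], boxes[idx])
            score *= math.exp(-(overlap ** 2) / sigma)
            if score >= score_thresh:  # drop boxes whose score has collapsed
                decayed.append((idx, score))
        remaining = decayed
    return kept


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = soft_nms(boxes, scores)
print([i for i, _ in kept])
```

In this toy input the second box overlaps the first heavily, so its score is decayed rather than removed; hard NMS with a typical threshold would have deleted it. That retained, down-weighted detection is the behavior that makes soft-NMS attractive for occluded targets.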

Key words: garbage classification, object detection, vision Transformer, MobileNet, image recognition, feature integration, data augmentation, average accuracy

CLC Number:

  • TP391.4