Electronic Science and Technology, 2023, Vol. 36, Issue 2: 73-80. doi: 10.16180/j.cnki.issn1007-7820.2023.02.011
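CHENG Changwen, CHEN Wei, CHEN Jinhong, YIN Zhong
Received:
2021-08-29
Publication date:
2023-02-15
Released online:
2023-01-17
About the authors:
CHENG Changwen (b. 1997), male, master's student. Research interest: image processing. | CHEN Wei (b. 1964), female, associate professor. Research interests: image processing and pattern recognition. | CHEN Jinhong (b. 1996), male, master's student. Research interest: image processing. | YIN Zhong (b. 1988), male, associate professor. Research interest: deep learning based on EEG signals.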
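Abstract:
Existing YOLO object detection models perform multi-object detection using the one-stage approach. They fall short on two-class detection and are computationally expensive at inference time. To improve the accuracy and efficiency of two-class mask-wearing detection during the COVID-19 outbreak, this paper proposes a YOLO-based real-time method for two-class mask-wearing detection. The model's feed-forward input stage is improved: the data augmentation step is optimized and adaptive image scaling is added, which raises detection accuracy and efficiency for two-class and small-object detection. Adaptive anchor boxes are added and the activation function is replaced, which lowers the computational cost and therefore improves detection efficiency. The optimized Neck and the added Focus structure strengthen feature fusion and reduce the parameter count, which speeds up the method. Experimental results show that, compared with YOLOv4, the proposed method improves F1 by 0.33% and mAP by 0.71% on the paper's dataset, and detection efficiency also improves markedly in the same experimental environment.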
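
The abstract names adaptive image scaling but does not give its details. The sketch below illustrates one common way such scaling is done (letterbox resizing with minimal padding), and it is not taken from the paper: the function name `letterbox_shape`, the 640-pixel target size, and the stride of 32 are all assumptions chosen for illustration.

```python
def letterbox_shape(width, height, target=640, stride=32):
    """Return (new_w, new_h, pad_w, pad_h) for minimal-padding letterbox scaling.

    Assumed convention (not stated in the abstract): the image is resized so
    that its longer side equals `target`, then padded only as much as needed
    to make each side a multiple of `stride`.
    """
    # One scale factor for both axes, so the aspect ratio is preserved.
    ratio = min(target / width, target / height)
    new_w, new_h = round(width * ratio), round(height * ratio)

    # Padding a full square would use target - new_w and target - new_h.
    # Taking the remainder modulo the stride keeps only the padding needed
    # to reach the next multiple of the stride.
    pad_w = (target - new_w) % stride
    pad_h = (target - new_h) % stride
    return new_w, new_h, pad_w, pad_h


if __name__ == "__main__":
    print(letterbox_shape(1280, 720))  # (640, 360, 0, 24)
```

For a 1280x720 frame this gives a 640x360 resized image plus 24 rows of padding, so the network input is 640x384 rather than a fully padded 640x640. Fewer padded pixels mean less computation per frame, which is the kind of efficiency gain the abstract attributes to this step.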
CHENG Changwen, CHEN Wei, CHEN Jinhong, YIN Zhong. YOLO-Improve Detection Method of Real-Time Mask Wearing[J]. Electronic Science and Technology, 2023, 36(2): 73-80.
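Table 1. Adjustable data augmentation parameters
Parameter | Value | Description |
---|---|---|
hsv_h | 0.015 | Image HSV hue augmentation (fraction) |
hsv_s | 0.7 | Image HSV saturation augmentation (fraction) |
hsv_v | 0.6 | Image HSV value (brightness) augmentation (fraction) |
degrees | 1.0 | Image rotation (+/- deg) |
translate | 0.1 | Image translation (+/- fraction) |
scale | 0.6 | Image scale (+/- gain) |
shear | 1.0 | Image shear (+/- deg) |
perspective | 0.0 | Image perspective (+/- fraction), range 0-0.001 |
flipud | 0.01 | Image up-down flip (probability) |
fliplr | 0.5 | Image left-right flip (probability) |
mixup | 0.2 | Image mixup (probability) |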
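
Table 1 lists parameter values but not how they are used. The following is a minimal sketch of how such settings could be turned into one random augmentation draw per training image. It assumes the common convention that "+/-" entries are symmetric ranges, that the HSV entries scale a multiplicative gain around 1, and that the flip and mixup entries are probabilities. The names `HYP` and `sample_augmentation` are hypothetical, and `perspective` is omitted because its value in the table is 0.0.

```python
import random

# Values copied from Table 1; the dictionary name is illustrative.
HYP = {
    "hsv_h": 0.015, "hsv_s": 0.7, "hsv_v": 0.6,
    "degrees": 1.0, "translate": 0.1, "scale": 0.6, "shear": 1.0,
    "flipud": 0.01, "fliplr": 0.5, "mixup": 0.2,
}


def sample_augmentation(hyp, rng):
    """Draw one random set of augmentation settings from the ranges in `hyp`."""
    return {
        # Multiplicative gains for the H, S and V channels, centred on 1.
        "hsv_gain": tuple(
            1 + rng.uniform(-1, 1) * hyp[key] for key in ("hsv_h", "hsv_s", "hsv_v")
        ),
        # Symmetric geometric jitter: degrees, fraction of image size, degrees.
        "degrees": rng.uniform(-hyp["degrees"], hyp["degrees"]),
        "translate": rng.uniform(-hyp["translate"], hyp["translate"]),
        "shear": rng.uniform(-hyp["shear"], hyp["shear"]),
        # Scale factor drawn around 1 (0.4 to 1.6 for a gain of 0.6).
        "scale": rng.uniform(1 - hyp["scale"], 1 + hyp["scale"]),
        # Each of these is applied with the listed probability.
        "flipud": rng.random() < hyp["flipud"],
        "fliplr": rng.random() < hyp["fliplr"],
        "mixup": rng.random() < hyp["mixup"],
    }


if __name__ == "__main__":
    print(sample_augmentation(HYP, random.Random(0)))
```

Under these assumptions, roughly half of the training images are mirrored left to right, about one in a hundred is flipped vertically, and hue shifts stay within 1.5% while saturation and brightness may vary much more widely.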