Electronic Science and Technology ›› 2023, Vol. 36 ›› Issue (8): 35-42. doi: 10.16180/j.cnki.issn1007-7820.2023.08.006


A Lightweight Crowd Detection Network for Dense Scenes

PAN Hao1, LIU Xiang1, ZHAO Jingwen1, ZHANG Xing2,3

  1. School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
  2. School of Management, Shanghai University of Engineering Science, Shanghai 201620, China
  3. School of Automotive and Traffic Engineering, Jiangsu University, Zhenjiang 212023, China
  • Received: 2022-03-19 Online: 2023-08-15 Published: 2023-08-14
  • About the authors: PAN Hao (b. 1996), male, master's student. Research interests: pedestrian detection and pedestrian tracking. | LIU Xiang (b. 1972), male, Ph.D., associate professor. Research interests: medical imaging and multi-agent robot collaboration, intelligent analysis and high-speed retrieval of video big data, defect detection and high-precision measurement. | ZHAO Jingwen (b. 1992), female, Ph.D., lecturer. Research interests: biomedical image processing and analysis.
  • Supported by:
    China University Industry-University-Research Innovation Fund (2021FNB02001); Science and Technology Innovation Project of the Ministry of Culture (2015KJCXXM19)

Abstract:

To address occlusion in pedestrian detection in dense scenes, this study proposes SC-YOLOv4, a crowd detection network based on YOLO (You Only Look Once). Building on the CSPNet (Cross Stage Partial Network) structure of YOLOv4 and drawing on ShuffleNetv2, the ordinary convolution structure is improved and the original residual modules are replaced with Shuffle Modules, yielding a backbone based on S-CSPDarkNet53 (Shuffle CSPDarkNet53) that preserves accuracy while reducing the number of network parameters. While retaining the original PANet (Path Aggregation Network) structure, a center-point prediction module is designed: the original three output feature layers switch to a center-point-based prediction method, in which target center points are regressed and the training loss is computed on them, and the original NMS (Non-Maximum Suppression) operation is discarded, further improving detection accuracy under occlusion. Experimental results on the CrowdHuman dataset show that YOLOv4 with the S-CSPDarkNet53 backbone has significantly fewer parameters than the original network and a detection speed 5.2 frame·s⁻¹ higher, while the final SC-YOLOv4 network is 4.9 frame·s⁻¹ faster than YOLOv4.

Key words: crowd detection, YOLO, Shuffle Module, center point detection, dense crowd, CrowdHuman, CSPNet, YOLOv4
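
The abstract does not describe the internals of the Shuffle Module, but the ShuffleNetv2 design it draws on relies on a channel shuffle step that interleaves channels from different groups so that information can mix across branches. The following is a minimal sketch of that reordering only, written in plain Python on a list of channel labels rather than on tensors. The function name `channel_shuffle` and its parameters are illustrative assumptions, not taken from the paper.

```python
def channel_shuffle(channels, groups):
    """Interleave channels across groups (illustrative sketch, not the paper's code).

    Conceptually: view the channel list as a (groups, per_group) grid,
    transpose it to (per_group, groups), then flatten it again.
    """
    n = len(channels)
    if groups <= 0 or n % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = n // groups
    # Walk the grid column by column, which is the transposed read order.
    return [
        channels[g * per_group + i]
        for i in range(per_group)
        for g in range(groups)
    ]


if __name__ == "__main__":
    # Six channels in two groups: [0, 1, 2] and [3, 4, 5].
    print(channel_shuffle(list(range(6)), groups=2))
```

In a real network the same reordering would be applied along the channel axis of a feature map; the list version is used here only to make the index pattern visible.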

CLC Number:

  • TP391.41
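
The abstract states that center-point prediction replaces NMS but does not give the decoding procedure. A common approach in center-point detectors, assumed here purely for illustration, is to keep every heatmap cell that scores above a threshold and is the maximum of its 3×3 neighbourhood, so that each object yields one peak without box-overlap suppression. The function `find_center_peaks`, the threshold value, and the toy heatmap below are hypothetical and not taken from the paper.

```python
def find_center_peaks(heatmap, threshold=0.5):
    """Return (row, col, score) for local maxima in a 2D score grid.

    Illustrative assumption: a cell is a detected center if its score is at
    least `threshold` and no cell in its 3x3 neighbourhood scores higher.
    """
    rows, cols = len(heatmap), len(heatmap[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            score = heatmap[r][c]
            if score < threshold:
                continue
            # Collect the 3x3 neighbourhood, clipped at the grid border.
            neighbours = [
                heatmap[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            if score >= max(neighbours):
                peaks.append((r, c, score))
    return peaks


if __name__ == "__main__":
    toy_heatmap = [
        [0.1, 0.2, 0.1, 0.0],
        [0.2, 0.9, 0.3, 0.1],
        [0.1, 0.3, 0.2, 0.8],
        [0.0, 0.1, 0.6, 0.7],
    ]
    for peak in find_center_peaks(toy_heatmap):
        print(peak)
```

Because two heavily overlapping people still produce two distinct center peaks, this style of decoding avoids the case where NMS discards a valid box for overlapping too much with a neighbour, which is the occlusion failure the abstract targets.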