Journal of Xidian University ›› 2021, Vol. 48 ›› Issue (5): 68-77. doi: 10.19665/j.issn1001-2400.2021.05.009

Human body detection algorithm in complex monitoring scenes

ZHANG Shuwei, LI Junmin

  1. School of Mathematics and Statistics, Xidian University, Xi’an 710071, China
  • Received: 2021-04-26 Online: 2021-10-20 Published: 2021-11-09
  • About the authors: ZHANG Shuwei (born 1995), male, master's student at Xidian University, E-mail: mathematics_zhang@163.com; LI Junmin (born 1965), male, professor, Ph.D., E-mail: jmli@mail.xidian.edu.cn
  • Supported by:
    National Natural Science Foundation of China (61573013)

Abstract:

In industrial video surveillance scenes, complex backgrounds, varied human postures, and occlusion cause existing human body detection algorithms to suffer from low accuracy and weak model generalization. To address these problems, we design a feature fusion method and a feature map generation strategy for the detection network, based on the feature image pyramid and multi-scale receptive field theory. Building on a lightweight feature image pyramid and combining optimization methods such as data augmentation, an anchor box matching strategy, and an occlusion loss function, we propose EFIPNet, a human body detection algorithm based on a deep neural network. To verify the effectiveness of EFIPNet, we build four diverse human body datasets of complex surveillance scenes, covering 50 common human behaviors in total. Validation shows that the proposed network effectively improves the detection accuracy for human targets and achieves accurate, real-time human body detection in complex monitoring scenes. In addition, an ablation study analyzes how each of the main modules in the network affects the performance of the detection model. On the Person dataset, compared with the Single Shot MultiBox Detector (SSD) algorithm, EFIPNet improves the detection accuracy of human targets by 4.34% while maintaining a detection speed of 45 frames per second.

Key words: deep neural network, video surveillance, feature fusion, feature map generation strategy, human body detection

CLC Number:

  • TP18
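
The abstract names an anchor box matching strategy but does not describe it. As background only, the sketch below shows the generic IoU-based matching commonly used in SSD-style detectors. It is not EFIPNet's actual strategy. The function names `iou` and `match_anchors`, the `(x_min, y_min, x_max, y_max)` box layout, and the 0.5 threshold are all assumptions made for illustration.

```python
from typing import List, Sequence

Box = Sequence[float]  # assumed layout: (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_anchors(anchors: List[Box], gt_boxes: List[Box],
                  threshold: float = 0.5) -> List[int]:
    """For each anchor, return the index of its matched ground-truth box,
    or -1 if the anchor is treated as background.

    Hypothetical SSD-style rule, not taken from the paper.
    """
    matches = [-1] * len(anchors)
    if not anchors or not gt_boxes:
        return matches

    # Step 1: an anchor is positive if its best IoU reaches the threshold.
    for i, anchor in enumerate(anchors):
        overlaps = [iou(anchor, gt) for gt in gt_boxes]
        best_gt = max(range(len(gt_boxes)), key=overlaps.__getitem__)
        if overlaps[best_gt] >= threshold:
            matches[i] = best_gt

    # Step 2: every ground-truth box keeps its single best anchor, even
    # below the threshold, so that no person is left without an anchor.
    for j, gt in enumerate(gt_boxes):
        best_anchor = max(range(len(anchors)),
                          key=lambda i: iou(anchors[i], gt))
        if iou(anchors[best_anchor], gt) > 0:
            matches[best_anchor] = j
    return matches


anchors = [(0, 0, 10, 10), (20, 20, 30, 30), (0, 0, 4, 4)]
gt_boxes = [(1, 1, 10, 10), (22, 22, 40, 40)]
print(match_anchors(anchors, gt_boxes))  # prints [0, 1, -1]
```

In this toy input the second anchor overlaps its ground-truth box with an IoU of only about 0.18, so it is matched by the second step alone. That fallback step matters for small or partly occluded people, which are the cases the abstract highlights.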