Electronic Science and Technology ›› 2022, Vol. 35 ›› Issue (4): 14-19. doi: 10.16180/j.cnki.issn1007-7820.2022.04.003

VSLAM for Indoor Dynamic Scenes

Hongjun SAN1, Wanglin WANG1, Jiupeng CHEN1, Feiya XIE2, Yangyang XU1, Jia CHEN1

  1. Faculty of Mechanical and Electrical Engineering, Kunming University of Science and Technology, Kunming 650500, Yunnan, China
  2. No. 78098 Unit of PLA, Meishan 620031, Sichuan, China
  • Received: 2021-05-07 Online: 2022-04-15 Published: 2022-04-15
  • About the authors: Hongjun SAN (b. 1976), male, Ph.D., associate professor. Research interest: parallel robots. | Wanglin WANG (b. 1998), male, master's student. Research interest: visual SLAM. | Jiupeng CHEN (b. 1993), male, Ph.D., lecturer. Research interest: robotics technology and applications.
  • Supported by:
    National Key R&D Projects (2017YFC1702503); Major Special Project of Yunnan Provincial S&T Department (202002AC080001)

Abstract:

Traditional VSLAM algorithms assume a static scene. In indoor dynamic scenes their localization accuracy degrades, and the 3D sparse point cloud map suffers from problems such as mismatched dynamic feature points. This study improves the ORB-SLAM2 framework by combining it with Mask R-CNN for semantic segmentation of images: feature points located on dynamic objects are removed, the camera pose is optimized, and a static 3D sparse point cloud map is obtained. Experimental results on the public TUM dataset show that ORB-SLAM2 combined with Mask R-CNN effectively improves the pose estimation accuracy of intelligent mobile robots. The root mean square error (RMSE) of the absolute trajectory is reduced by up to 96.3%, the RMSE of the relative translation is reduced by up to 41.2%, and the relative rotation error is also clearly reduced. Compared with ORB-SLAM2, the proposed method more accurately builds a 3D sparse point cloud map free of interference from feature points on dynamic objects.

Key words: VSLAM, indoor dynamic scene, Mask R-CNN, semantic segmentation, accuracy of pose estimation, ORB-SLAM2, TUM dataset, 3D sparse point cloud map
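
The core idea in the abstract, discarding feature points that fall on segmented dynamic objects, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a per-pixel boolean mask has already been produced by a segmentation network such as Mask R-CNN, and the names `filter_static_keypoints` and `rmse_reduction_percent` are hypothetical helpers introduced only for illustration. The second helper shows how a relative RMSE reduction of the kind quoted above is computed; the numbers in the usage lines are toy values, not results from the paper.

```python
import numpy as np


def filter_static_keypoints(keypoints, dynamic_mask):
    """Keep only keypoints whose pixel is not labeled as dynamic.

    keypoints: array-like of shape (N, 2) holding (x, y) pixel coordinates.
    dynamic_mask: boolean array of shape (H, W), True on dynamic objects
        (assumed to come from a segmentation network).
    """
    keypoints = np.asarray(keypoints, dtype=float).reshape(-1, 2)
    height, width = dynamic_mask.shape
    # Round to the nearest pixel and clamp to the image bounds.
    cols = np.clip(np.rint(keypoints[:, 0]).astype(int), 0, width - 1)
    rows = np.clip(np.rint(keypoints[:, 1]).astype(int), 0, height - 1)
    keep = ~dynamic_mask[rows, cols]
    return keypoints[keep]


def rmse_reduction_percent(baseline_rmse, improved_rmse):
    """Relative reduction of an error metric, in percent."""
    return 100.0 * (baseline_rmse - improved_rmse) / baseline_rmse


# Toy example: a 4x6 image with a dynamic region at rows 1-2, columns 2-4.
mask = np.zeros((4, 6), dtype=bool)
mask[1:3, 2:5] = True
points = [(0, 0), (3, 1), (5, 3), (2.4, 2.2)]
static_points = filter_static_keypoints(points, mask)
print(static_points)                     # keypoints outside the dynamic region
print(rmse_reduction_percent(2.0, 0.5))  # toy error values
```

In a full pipeline the surviving keypoints would then be passed on to matching and pose optimization; a real system would typically also dilate the mask slightly so that points on object boundaries are rejected.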

CLC Number:

  • TP242.6