Submitted: 29 May 2023
Posted: 30 May 2023
Abstract
Keywords:
1. Introduction
- A new stereo SLAM system, built on the ORB-SLAM2 framework and combined with a deep learning method, is proposed to reduce the impact of dynamic objects on camera pose and trajectory estimation. A semantic segmentation network is used in the data preprocessing stage to filter out features that belong to moving objects.
- A novel moving object detection method is presented to reduce the influence of moving targets on camera pose and trajectory estimation. It calculates the likelihood that each keyframe point belongs to dynamic content and distinguishes dynamic from static objects in the scene.
- To the best of our knowledge, this is the first use of the semantic segmentation network ENet [12], which is well suited to urban scenes, to enhance the performance of a visual SLAM system. This makes our system more robust and practical in highly dynamic, complex city streets and gives it a degree of practical engineering value.
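The preprocessing idea in the first contribution, discarding features that fall on moving-object classes, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `filter_static_keypoints`, the label ids in `DYNAMIC_LABELS`, and the array layout are all assumptions made for illustration, and the per-pixel label map is taken as already produced by some segmentation network.

```python
import numpy as np

# Hypothetical class ids treated as potentially moving (assumed for
# illustration; the real ids depend on the segmentation model's label set).
DYNAMIC_LABELS = {11, 12, 13}


def filter_static_keypoints(keypoints, label_map, dynamic_labels=DYNAMIC_LABELS):
    """Keep only keypoints whose pixel does not lie on a dynamic class.

    keypoints: sequence of (x, y) pixel coordinates, shape (N, 2)
    label_map: integer array of per-pixel class ids, shape (H, W)
    """
    kps = np.asarray(keypoints, dtype=float).reshape(-1, 2)
    h, w = label_map.shape
    # Round to the nearest pixel and clamp to the image bounds.
    cols = np.clip(np.rint(kps[:, 0]).astype(int), 0, w - 1)
    rows = np.clip(np.rint(kps[:, 1]).astype(int), 0, h - 1)
    is_dynamic = np.isin(label_map[rows, cols], list(dynamic_labels))
    return kps[~is_dynamic]


if __name__ == "__main__":
    # Toy 4x6 label map with one "dynamic" region in the middle.
    labels = np.zeros((4, 6), dtype=int)
    labels[1:3, 2:4] = 13
    points = [[0, 0], [2, 1], [3, 2], [5, 3]]
    print(filter_static_keypoints(points, labels))
```

In this sketch the two keypoints inside the labelled region are dropped and the remaining ones would be passed on to tracking. A mask lookup like this removes every feature on a movable class, including parked vehicles, which is presumably why the paper pairs segmentation with a separate moving object detection step.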
2. Related Work
3. System Description
3.1. Overview

3.2. Semantic Segmentation
3.3. Moving Object Detection

3.4. Outliers Removal

4. Experiments
4.1. Evaluation using KITTI Benchmark Dataset



4.2. Time analysis
| Metric | ORB-SLAM2 | OpenVSLAM | OurSLAM |
|---|---|---|---|
| Mean [ms/frame] | 65.35 | 55.46 | 68.56 |
| Median [ms/frame] | 66.43 | 56.25 | 69.21 |
4.3. Evaluation Test in Real Environment
| Item | ORB-SLAM2 RMSE | ORB-SLAM2 STD | OurSLAM RMSE | OurSLAM STD | Improvement RMSE | Improvement STD |
|---|---|---|---|---|---|---|
| ATE | 20.387 | 10.021 | 4.632 | 2.987 | 77.28% | 70.19% |
| RPE | 0.203 | 0.324 | 0.085 | 0.039 | 58.13% | 87.96% |
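The improvement percentages in the table above appear to follow the usual relative error reduction, (baseline − ours) / baseline × 100. The sketch below recomputes them from the reported ORB-SLAM2 and OurSLAM values under that assumption; the function name `improvement` and the dictionary layout are illustrative only.

```python
def improvement(baseline: float, ours: float) -> float:
    """Relative error reduction in percent, assuming
    (baseline - ours) / baseline * 100."""
    return (baseline - ours) / baseline * 100.0


# (ORB-SLAM2, OurSLAM) pairs copied from the table above.
reported = {
    "ATE RMSE": (20.387, 4.632),
    "ATE STD": (10.021, 2.987),
    "RPE RMSE": (0.203, 0.085),
    "RPE STD": (0.324, 0.039),
}

if __name__ == "__main__":
    for name, (baseline, ours) in reported.items():
        print(f"{name}: {improvement(baseline, ours):.2f}%")
```

Rounded to two decimals, this yields 77.28%, 70.19%, 58.13%, and 87.96%, matching the tabulated values.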

5. Conclusion
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Davison, A.; Reid, I.; Molton, N.; Stasse, O. MonoSLAM: Real-time single camera SLAM. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 6, pp. 1052–1067, 2007.
- Neira, J.; Davison, A.; Leonard, J. Guest editorial, special issue in visual SLAM. IEEE Transactions on Robotics, vol. 24, no. 5, pp. 929–931, 2008.
- Klein, G.; Murray, D. Parallel tracking and mapping for small AR workspaces. 2007 6th IEEE and ACM International Symposium on Mixed and Augmented Reality, 2007.
- Forster, C.; Pizzoli, M.; Scaramuzza, D. SVO: Fast semi-direct monocular visual odometry. IEEE International Conference on Robotics and Automation, 2014.
- Engel, J.; Schöps, T.; Cremers, D. LSD-SLAM: Large-scale direct monocular SLAM. European Conference on Computer Vision, Springer, pp. 834–849, 2014.
- Mur-Artal, R.; Montiel, J. M. M.; Tardos, J. D. ORB-SLAM: A Versatile and Accurate Monocular SLAM System. IEEE Transactions on Robotics, vol. 31, no. 5, pp. 1147–1163, 2015. [CrossRef]
- Mur-Artal, R.; Tardos, J. D. ORB-SLAM2: an open-source SLAM system for monocular, stereo and RGB-D cameras. IEEE Transactions on Robotics, vol. 33, no. 5, pp. 1255–1262, 2017. [CrossRef]
- Campos, C.; Elvira, R.; Rodriguez, J.; Montiel, J.; Tardós, J. ORB-SLAM3: An Accurate Open-Source Library for Visual, Visual–Inertial, and Multimap SLAM. IEEE Transactions on Robotics, vol. 37, no. 6, 2021. [CrossRef]
- Panchpor, A. A.; Shue, S.; Conrad, J. M. A survey of methods for mobile robot localization and mapping in dynamic indoor environments. 2018 Conference on Signal Processing and Communication Engineering Systems, 2018. [CrossRef]
- Geiger, A.; Lenz, P.; Stiller, C.; Urtasun, R. Vision meets Robotics: The KITTI Dataset. The International Journal of Robotics Research, vol. 32, no. 11, pp. 1231–1237, 2013.
- Bescos, B.; Facil, J. M.; Civera, J.; Neira, J. DynaSLAM: Tracking, mapping, and inpainting in dynamic scenes. IEEE Robotics and Automation Letters, vol. 3, no. 4, pp. 4076–4083, 2018.
- Paszke, A.; Chaurasia, A.; Kim, S.; Culurciello, E. ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation. 2016.
- Davison, A. Mobile robot navigation using active vision. Ph.D. dissertation, University of Oxford, U.K., 1998.
- Davison, A. J.; Murray, D. W. Simultaneous localization and map building using active vision. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 865–880, 2002. [CrossRef]
- Davison, A.; Kita, N. 3-D simultaneous localisation and map building using active vision for a robot moving on undulating terrain. IEEE Conference on Computer Vision and Pattern Recognition, vol. 1, pp. 384–391, 2001.
- Iocchi, L.; Konolige, K.; Bajracharya, M. Visually realistic mapping of a planar environment with stereo. International Symposium on Experimental Robotics, pp. 521–532.
- Se, S.; Lowe, D.; Little, J. Mobile robot localization and mapping with uncertainty using scale-invariant visual landmarks. International Journal of Robotics Research, vol. 21, no. 8, pp. 735–758, 2002.
- Jung, I.; Lacroix, S. High resolution terrain mapping using low altitude aerial stereo imagery. 9th International Conference on Computer Vision, Nice, France, vol. 2, pp. 946–951, 2003.
- Hygounenc, E.; Jung, I.; Soueres, P.; Lacroix, S. The autonomous blimp project of LAAS-CNRS: Achievements in flight control and terrain mapping. International Journal of Robotics Research, vol. 23, no. 4, pp. 473–511, 2004. [CrossRef]
- Saez, J.; Escolano, F.; Penalver, A. First Steps towards Stereo-based 6DOF SLAM for the visually impaired. IEEE Conference on Computer Vision and Pattern Recognition-Workshops, Washington, vol. 3, pp. 23–23, 2005.
- Sim, R.; Elinas, P.; Griffin, M.; Little, J. Vision-based SLAM using the Rao–Blackwellised particle filter. International Joint Conference on Artificial Intelligence-Workshop Reason, Uncertainty Robot. Edinburgh, U.K., pp. 9–16, 2005.
- Sim, R.; Elinas, P.; Little, J. A study of the Rao–Blackwellised particle filter for efficient and accurate vision-based SLAM. International Journal of Computer Vision, vol. 74, no. 3, pp. 303–318, 2007. [CrossRef]
- Paz, L. M.; Piniés, P.; Tardós, J. D.; Neira, J. Large-Scale 6-DOF SLAM With Stereo-in-Hand. IEEE Transactions on Robotics, vol. 24, no. 5, pp. 946–957, 2008. [CrossRef]
- Lin, K. H.; Wang, C. C. Stereo-based simultaneous localization, mapping and moving object tracking. 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, Taipei, Taiwan, 2010.
- Kawewong, A.; Tongprasit, N.; Tangruamsub, S.; Hasegawa, O. Online and Incremental Appearance-based SLAM in Highly Dynamic Environments. The International Journal of Robotics Research, vol. 30, no. 1, pp. 33–55, 2011. [CrossRef]
- Alcantarilla, P. F.; Yebes, J. J.; Almazán, J.; Bergasa, L. M. On combining visual SLAM and dense scene flow to increase the robustness of localization and mapping in dynamic environments. IEEE International Conference on Robotics and Automation, pp. 1290–1297, 2012.
- Kaess, M.; Ni, K.; Dellaert, F. Flow separation for fast and robust stereo odometry. IEEE International Conference on Robotics and Automation, pp. 3539–3544, 2009.
- Karlsson, N.; Di Bernardo, E.; Ostrowski, J.; Goncalves, L.; Pirjanian, P.; Munich, M. E. The vSLAM algorithm for robust localization and mapping. IEEE International Conference on Robotics and Automation, pp. 24–29, 2005.
- Zou, D.; Tan, P. CoSLAM: Collaborative Visual SLAM in Dynamic Environments. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 2, pp. 354–366, 2013.
- Fan, Y.; Han, H.; Tang, Y.; Zhi, T. Dynamic objects elimination in SLAM based on image fusion. Pattern Recognition Letters, vol. 127, no. 1, pp. 191–201, 2019. [CrossRef]
- He, K.; Gkioxari, G.; Dollar, P.; Girshick, R. Mask R-CNN. IEEE International Conference on Computer Vision, pp. 2961–2969, 2017.
- Sun, T.; Sun, Y.; Liu, M.; Yeung, D. Y. Movable-Object-Aware Visual SLAM via Weakly Supervised Semantic Segmentation. 2019.
- Balntas, V.; Riba, E.; Ponsa, D.; Mikolajczyk, K. Learning local feature descriptors with triplets and shallow convolutional neural networks. British Machine Vision Conference, vol. 1, pp. 3, 2016.
- Kang, R.; Shi, J.; Li, X.; Liu, Y. DF-SLAM: A Deep-Learning Enhanced Visual SLAM System based on Deep Local Features. 2019.
- Li, P.; Qin, T.; Shen, S. Stereo Vision-based Semantic 3D Object and Ego-motion Tracking for Autonomous Driving. European Conference on Computer Vision, 2018.
- Ai, Y.; Rui, T.; Lu, M.; Fu, L.; Liu, S.; Wang, S. DDL-SLAM: A robust RGB-D SLAM in dynamic environments combined with Deep Learning. IEEE Access, vol. 8, pp. 162335–162342, 2020. [CrossRef]
- Yu, C.; Liu, Z. X.; Liu, X. J.; Xie, F. G.; Yang, Y.; Wei, Q.; Qiao, F. DS-SLAM: A Semantic Visual SLAM towards Dynamic Environments. 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2018.
- Cordts, M.; Omran, M.; Ramos, S.; Rehfeld, T.; Enzweiler, M.; Benenson, R.; Franke, U.; Roth, S.; Schiele, B. The Cityscapes Dataset for Semantic Urban Scene Understanding. IEEE Conference on Computer Vision and Pattern Recognition, 2016. [CrossRef]
- Zhao, Y.; Shi, H.; Chen, X.; Li, X.; Wang, C. An overview of object detection and tracking. 2015 IEEE International Conference on Information and Automation, 2015.
- Lucas, B. D.; Kanade, T. An Iterative Image Registration Technique with an Application to Stereo Vision. 7th International Joint Conference on Artificial Intelligence, 1981.
- Shi, J.; Tomasi, C. Good Features to Track. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 593–600, 1994.
- Sumikura, S.; Shibuya, M.; Sakurada, K. OpenVSLAM: A Versatile Visual SLAM Framework. 27th ACM International Conference on Multimedia, pp. 2292–2295, 2019.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).