Submitted: 30 May 2023
Posted: 30 May 2023
Abstract
Keywords:
1. Introduction
- A feature classification method using the YOLOv5 [14] object detection algorithm is proposed in the front-end, which divides feature points into three categories: absolute static points, absolute dynamic points, and temporary static points. Dynamic factors of the temporary static features are then calculated from the IMU pre-integration prior constraint and the epipolar constraint, and the temporary static features are reclassified according to these dynamic factors.
- A robust bundle adjustment (BA) optimization method based on dynamic factors is proposed in the back-end. The more dynamic an object is, the lower the weights of its features; the less dynamic it is, the higher the weights.
- Extensive experiments are carried out on public datasets (TUM, KITTI, and VIODE) and on our own dataset. The experimental results demonstrate the accuracy and robustness of the proposed D-VINS.
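
The reweighting idea in the contributions above can be illustrated with a minimal sketch. It is not the formulation used in D-VINS: the epipolar distance is the standard point-to-epipolar-line distance, while `dynamic_weight`, its `sigma` parameter, and the toy fundamental matrix `F` are assumptions introduced here only for illustration.

```python
import numpy as np

def epipolar_distance(F, p1, p2):
    """Distance in pixels from p2 to the epipolar line F @ p1.

    p1 and p2 are (u, v) pixel coordinates of one feature in two frames.
    """
    x1 = np.array([p1[0], p1[1], 1.0])
    x2 = np.array([p2[0], p2[1], 1.0])
    line = F @ x1  # epipolar line (a, b, c) in the second image
    return abs(x2 @ line) / np.hypot(line[0], line[1])

def dynamic_weight(distance, sigma=1.0):
    """Hypothetical weight: 1 for a consistent feature, approaching 0 as it looks more dynamic."""
    return 1.0 / (1.0 + (distance / sigma) ** 2)

# Toy fundamental matrix (assumption): pure translation along x with
# identity intrinsics, so every epipolar line is the horizontal line v = const.
F = np.array([[0.0, 0.0, 0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]])

d_static = epipolar_distance(F, (10.0, 20.0), (15.0, 20.0))  # stays on its epipolar line
d_moving = epipolar_distance(F, (10.0, 20.0), (15.0, 23.0))  # 3 px off the line
w_static = dynamic_weight(d_static)
w_moving = dynamic_weight(d_moving)
```

In this toy setup the feature that stays on its epipolar line keeps full weight, while the one that drifts 3 px off the line is down-weighted, which mirrors the behavior described for the back-end.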
2. Related Work
2.1. Geometry Based Dynamic SLAM
2.2. Deep Learning Based Dynamic SLAM
3. Methods
3.1. Dynamic Object Classification
3.1.1. Semantic Label Incremental Updating with Bayes’ Rule

3.1.2. Feature Points Motion State Classification
3.2. Features Dynamics Check with IMU Prior and Epipolar Constraint
3.2.1. Dynamic Factor of Reprojection Error Based on IMU Prior Constraint
3.2.2. Dynamic Factor of Epipolar Constraint

3.3. Dynamic Adaptive Bundle Adjustment
3.3.1. Conventional Bundle Adjustment Optimization
3.3.2. Dynamic Adaptive Cost Function with Dynamic Factors
4. Experimental Results
4.1. TUM RGB-D, VIODE and KITTI Dataset Evaluation
4.1.1. TUM RGB-D Dataset
4.1.2. KITTI Dataset
4.1.3. VIODE Dataset
4.2. Data Collecting Equipment and Real Environment Dataset Experiments
4.2.1. Data Collection Devices and Real Datasets
- The 5_SLAM_country_dynamic_loop_1 sequence was collected in a village in Xiangyin County, Yueyang City, Hunan Province, in a relatively open environment. A pedestrian and a child are present in the image throughout, moving in synchronization with the camera. The start and end points of the sequence are close to each other, but there is no loop closure to detect the drift.
- The 14_SLAM_car_road_1 sequence was collected on a street in Xiangyin County, Yueyang City, Hunan Province, in an open environment that is challenging for stereo visual localization and causes severe drift. The rural roads are narrow and carry many vehicles, and villagers gather in the middle of the road. Pedestrians and vehicles move in complex patterns and occupy a large part of the field of view, which makes positioning difficult.
- The 18_SLAM_car_road_2 sequence was collected in an urban environment with wider roads, more vehicles, and more pedestrians than the rural streets of sequence 14. It is suitable for evaluating dynamic-object rejection algorithms. The main data types include GNSS raw data, IMU data, LiDAR point cloud data, and stereo color image data. The ground-truth trajectory is obtained with GNSS RTK.
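
As a rough illustration of how an estimated trajectory might be scored against an RTK ground-truth trajectory, the sketch below computes an absolute trajectory error (RMSE) after a rigid alignment. It assumes the two trajectories are already time-associated and expressed in metres in local frames; the function names `align_rigid` and `ate_rmse` are hypothetical and are not taken from the D-VINS evaluation pipeline.

```python
import numpy as np

def align_rigid(est, gt):
    """Least-squares rotation R and translation t mapping est onto gt (Kabsch, no scale)."""
    mu_e, mu_g = est.mean(axis=0), gt.mean(axis=0)
    H = (est - mu_e).T @ (gt - mu_g)          # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))    # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_g - R @ mu_e
    return R, t

def ate_rmse(est, gt):
    """Root-mean-square position error after rigid alignment; est and gt are (N, 3)."""
    R, t = align_rigid(est, gt)
    aligned = est @ R.T + t
    err = np.linalg.norm(aligned - gt, axis=1)
    return float(np.sqrt(np.mean(err ** 2)))

# Toy check: an estimate that differs from the ground truth only by a
# rigid transform should give an error close to zero.
gt = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 1.0, 0.0],
               [0.0, 1.0, 0.5], [0.0, 2.0, 1.0]])
Rz = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
est = gt @ Rz.T + np.array([5.0, -2.0, 1.0])
ate_rigid_copy = ate_rmse(est, gt)
```

Any residual left after alignment reflects drift or local inconsistency rather than the arbitrary choice of starting frame.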
4.2.2. Feature Classification Results in Real Dataset
4.2.3. Trajectories Results in Real Dataset
5. Discussion
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Abaspur Kazerouni, I.; Fitzgerald, L.; Dooly, G.; Toal, D. A Survey of State-of-the-Art on Visual SLAM. Expert Systems with Applications 2022, 205, 117734. [CrossRef]
- Covolan, J.P.M.; Sementille, A.C.; Sanches, S.R.R. A Mapping of Visual SLAM Algorithms and Their Applications in Augmented Reality. In Proceedings of the 2020 22nd Symposium on Virtual and Augmented Reality (SVR); November 2020; pp. 20–29.
- Tourani, A.; Bavle, H.; Sanchez-Lopez, J.L.; Voos, H. Visual SLAM: What Are the Current Trends and What to Expect? Sensors 2022, 22, 9297. [CrossRef]
- Tourani, A.; Bavle, H.; Sanchez-Lopez, J.L.; Voos, H. Visual SLAM: What Are the Current Trends and What to Expect? Sensors 2022, 22, 9297. [CrossRef]
- Chen, C.; Zhu, H.; Li, M.; You, S. A Review of Visual-Inertial Simultaneous Localization and Mapping from Filtering-Based and Optimization-Based Perspectives. Robotics 2018, 7, 45. [CrossRef]
- Mourikis, A.I.; Roumeliotis, S.I. A Multi-State Constraint Kalman Filter for Vision-Aided Inertial Navigation. In Proceedings of the 2007 IEEE International Conference on Robotics and Automation; IEEE: Rome, Italy, April 2007; pp. 3565–3572.
- Qin, T.; Li, P.; Shen, S. VINS-Mono: A Robust and Versatile Monocular Visual-Inertial State Estimator. IEEE Trans. Robot. 2018, 34, 1004–1020. [CrossRef]
- Campos, C.; Elvira, R.; Rodríguez, J.J.G.; Montiel, J.M.M.; Tardós, J.D. ORB-SLAM3: An Accurate Open-Source Library for Visual, Visual–Inertial, and Multimap SLAM. IEEE Transactions on Robotics 2021, 37, 1874–1890. [CrossRef]
- von Stumberg, L.; Cremers, D. DM-VIO: Delayed Marginalization Visual-Inertial Odometry. IEEE Robot. Autom. Lett. 2022, 7, 1408–1415. [CrossRef]
- Sun, K.; Mohta, K.; Pfrommer, B.; Watterson, M.; Liu, S.; Mulgaonkar, Y.; Taylor, C.J.; Kumar, V. Robust Stereo Visual Inertial Odometry for Fast Autonomous Flight. IEEE Robotics and Automation Letters 2018, 3, 965–972. [CrossRef]
- Qin, T.; Cao, S.; Pan, J.; Shen, S. A General Optimization-Based Framework for Global Pose Estimation with Multiple Sensors 2019.
- Mur-Artal, R.; Tardós, J.D. ORB-SLAM2: An Open-Source SLAM System for Monocular, Stereo, and RGB-D Cameras. IEEE Transactions on Robotics 2017, 33, 1255–1262. [CrossRef]
- Wang, C.-C.; Thorpe, C.; Thrun, S. Online Simultaneous Localization and Mapping with Detection and Tracking of Moving Objects: Theory and Results from a Ground Vehicle in Crowded Urban Areas. In Proceedings of the 2003 IEEE International Conference on Robotics and Automation (Cat. No.03CH37422); September 2003; Vol. 1, pp. 842–849.
- Jocher, G.; Chaurasia, A.; Stoken, A.; Borovec, J.; NanoCode012; Kwon, Y.; Michael, K.; TaoXie; Fang, J.; imyhxy; et al. Ultralytics/Yolov5: V7.0 - YOLOv5 SOTA Realtime Instance Segmentation 2022. [CrossRef]
- Fischler, M.A.; Bolles, R.C. Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography. In Readings in Computer Vision; Elsevier: Amsterdam, The Netherlands, 1987; pp. 726–740.
- Yan, L.; Hu, X.; Zhao, L.; Chen, Y.; Wei, P.; Xie, H. DGS-SLAM: A Fast and Robust RGBD SLAM in Dynamic Environments Combined by Geometric and Semantic Information. Remote Sens. 2022, 14, 795. [CrossRef]
- Song, S.; Lim, H.; Lee, A.J.; Myung, H. DynaVINS: A Visual-Inertial SLAM for Dynamic Environments. IEEE Robot. Autom. Lett. 2022. [CrossRef]
- Zhang, C.; Zhang, R.; Jin, S.; Yi, X. PFD-SLAM: A New RGB-D SLAM for Dynamic Indoor Environments Based on Non-Prior Semantic Segmentation. Remote Sens. 2022, 14, 2445. [CrossRef]
- Bian, J.; Lin, W.-Y.; Matsushita, Y.; Yeung, S.-K.; Nguyen, T.-D.; Cheng, M.-M. GMS: Grid-Based Motion Statistics for Fast, Ultra-Robust Feature Correspondence. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); July 2017; pp. 2828–2837.
- Huang, J.; Yang, S.; Zhao, Z.; Lai, Y.-K.; Hu, S. ClusterSLAM: A SLAM Backend for Simultaneous Rigid Body Clustering and Motion Estimation. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV); October 2019; pp. 5874–5883.
- Bescos, B.; Fácil, J.M.; Civera, J.; Neira, J. DynaSLAM: Tracking, Mapping and Inpainting in Dynamic Scenes. IEEE Robot. Autom. Lett. 2018, 3, 4076–4083. [CrossRef]
- Xiao, L.; Wang, J.; Qiu, X.; Rong, Z.; Zou, X. Dynamic-SLAM: Semantic Monocular Visual Localization and Mapping Based on Deep Learning in Dynamic Environment. Robotics and Autonomous Systems 2019, 117, 1–16. [CrossRef]
- Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.-Y.; Berg, A.C. SSD: Single Shot MultiBox Detector. In Proceedings of the Computer Vision – ECCV 2016; Leibe, B., Matas, J., Sebe, N., Welling, M., Eds.; Springer International Publishing: Cham, 2016; pp. 21–37.
- Yu, C.; Liu, Z.; Liu, X.; Xie, F.; Yang, Y.; Wei, Q.; Fei, Q. DS-SLAM: A Semantic Visual SLAM towards Dynamic Environments. 2018. [CrossRef]
- Lucas, B.D.; Kanade, T. An Iterative Image Registration Technique with an Application to Stereo Vision. Proceedings of the DARPA Image Understanding Workshop, Washington, DC, USA, April 1981; pp. 674–679.
- Liu, J.; Li, X.; Liu, Y.; Chen, H. Dynamic-VINS: RGB-D Inertial Odometry for a Resource-Restricted Robot in Dynamic Environments. IEEE Robotics and Automation Letters 2022, 7, 9573–9580. [CrossRef]
- Wu, W.; Guo, L.; Gao, H.; You, Z.; Liu, Y.; Chen, Z. YOLO-SLAM: A Semantic SLAM System towards Dynamic Environment with Geometric Constraint. Neural Comput & Applic 2022, 34, 6011–6026. [CrossRef]
- Cheng, S.; Sun, C.; Zhang, S.; Zhang, D. SG-SLAM: A Real-Time RGB-D Visual SLAM Toward Dynamic Scenes With Semantic and Geometric Information. IEEE Transactions on Instrumentation and Measurement 2023, 72, 1–12. [CrossRef]
- Lin, T.-Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft COCO: Common Objects in Context. In Proceedings of the Computer Vision – ECCV 2014; Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T., Eds.; Springer International Publishing: Cham, 2014; pp. 740–755.
- Shafi, O.; Rai, C.; Sen, R.; Ananthanarayanan, G. Demystifying TensorRT: Characterizing Neural Network Inference Engine on Nvidia Edge Devices. In Proceedings of the 2021 IEEE International Symposium on Workload Characterization (IISWC); November 2021; pp. 226–237.
- Wang, Q.; Yan, C.; Tan, R.; Feng, Y.; Sun, Y.; Liu, Y. 3D-CALI: Automatic Calibration for Camera and LiDAR Using 3D Checkerboard. Measurement 2022, 203, 111971. [CrossRef]
- Rehder, J.; Nikolic, J.; Schneider, T.; Hinzmann, T.; Siegwart, R. Extending Kalibr: Calibrating the Extrinsics of Multiple IMUs and of Individual Axes. In Proceedings of the 2016 IEEE International Conference on Robotics and Automation (ICRA); IEEE: Stockholm, Sweden, May 2016; pp. 4304–4311.
- Sturm, J.; Engelhard, N.; Endres, F.; Burgard, W.; Cremers, D. A Benchmark for the Evaluation of RGB-D SLAM Systems. In Proceedings of the 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems; October 2012; pp. 573–580.
- Geiger, A.; Lenz, P.; Stiller, C.; Urtasun, R. Vision Meets Robotics: The KITTI Dataset. The International Journal of Robotics Research 2013, 32, 1231–1237. [CrossRef]
- Minoda, K.; Schilling, F.; Wüest, V.; Floreano, D.; Yairi, T. VIODE: A Simulated Dataset to Address the Challenges of Visual-Inertial Odometry in Dynamic Environments. IEEE Robot. Autom. Lett. 2021, 6, 1343–1350. [CrossRef]

| Sequences | ORB-SLAM2 ATE | ORB-SLAM2 RPE | D-VINS* ATE | D-VINS* RPE | Improvement ATE | Improvement RPE |
|---|---|---|---|---|---|---|
| fr3_sitting_static | 0.0116 | 0.0152 | 0.008 | 0.0114 | 31.03% | 25.00% |
| fr3_sitting_xyz | 0.0133 | 0.0199 | 0.0153 | 0.0179 | - | 10.05% |
| fr3_sitting_halfsphere | 0.0336 | 0.0124 | 0.0252 | 0.0122 | 25.00% | 1.61% |
| fr3_walking_static | 0.4121 | 0.0299 | 0.0069 | 0.0101 | 98.32% | 66.22% |
| fr3_walking_xyz | 0.8856 | 0.1255 | 0.0155 | 0.0182 | 98.24% | 85.49% |
| fr3_walking_rpy | 0.5987 | 0.0528 | 0.0422 | 0.0432 | 92.95% | 18.18% |
| fr3_walking_halfsphere | 0.4227 | 0.0338 | 0.0216 | 0.0234 | 94.89% | 30.77% |

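
The Improvement columns in the TUM table appear consistent with a relative error reduction with respect to ORB-SLAM2. The short sketch below reproduces one entry under that assumption; the function name `improvement_percent` is hypothetical.

```python
def improvement_percent(baseline, proposed):
    """Relative error reduction of `proposed` with respect to `baseline`, in percent."""
    return (baseline - proposed) / baseline * 100.0

# fr3_sitting_static, ATE: ORB-SLAM2 = 0.0116, D-VINS* = 0.008
sitting_static_ate = improvement_percent(0.0116, 0.008)
print(f"{sitting_static_ate:.2f}%")  # prints 31.03%

# fr3_sitting_xyz, ATE: D-VINS* is slightly worse, so the value is negative
sitting_xyz_ate = improvement_percent(0.0133, 0.0153)
```

A negative value, as for the ATE on fr3_sitting_xyz, corresponds to the entry shown as "-" in the table.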
| Sequences | VINS-Fusion | DynaVINS | D-VINS |
|---|---|---|---|
| KITTI 05 | 1.913 | 12.4668 | 1.7631 |
| KITTI 07 | 2.1927 | 3.8006 | 2.1100 |

| Scenes | Sequences | VINS-Fusion | DynaVINS | D-VINS |
|---|---|---|---|---|
| Parking_lot | 0_none | 0.0774 | 0.0595 | 0.0538 |
| Parking_lot | 1_low | 0.1126 | 0.0826 | 0.0472 |
| Parking_lot | 2_mid | 0.1174 | 0.0630 | 0.0396 |
| Parking_lot | 3_high | 0.1998 | 0.0982 | 0.0664 |
| City_day | 0_none | 0.1041 | 0.1391 | 0.0882 |
| City_day | 1_low | 0.2043 | 0.0748 | 0.0912 |
| City_day | 2_mid | 0.2319 | 0.0520 | 0.0864 |
| City_day | 3_high | 0.3135 | 0.0743 | 0.0835 |
| City_night | 0_none | 0.2624 | 0.1801 | 0.1561 |
| City_night | 1_low | 0.5665 | 0.1413 | 0.1221 |
| City_night | 2_mid | 0.3862 | 0.1192 | 0.1395 |
| City_night | 3_high | 0.7611 | 0.1519 | 0.1566 |

| Sequences | VINS-Fusion | DynaVINS | D-VINS |
|---|---|---|---|
| 5_SLAM_dynamic_loop_1 | 0.657039 | 2.145493 | 0.654882 |
| 14_SLAM_car_road_1 | 37.31964 | - | 27.60877 |
| 18_SLAM_car_road_2 | 299.7889 | - | 151.2075 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).