Submitted: 11 April 2025
Posted: 11 April 2025
Abstract
Keywords:
1. Introduction
2. Related Work
3. Methodology
3.1. Model Architecture
3.2. SLAM and SfM Integration
3.3. Image Pair Construction
3.4. Rotation and Overlap Detection
3.5. Coarse Structure-from-Motion
3.6. Iterative Refinement
3.7. Parallel Bundle Adjustment
3.8. Feature Extraction and Matching
3.9. Final 3D Reconstruction
3.10. Loss Function
3.11. Data Preprocessing
4. Evaluation Metrics
4.1. Reprojection Error (RPE)
4.2. Mean Absolute Trajectory Error (ATE)
4.3. Precision and Recall for Feature Matching
5. Experiment Results
6. Conclusion
References
- Yang, X.; Ming, Y.; Cui, Z.; Calway, A. FD-SLAM: 3-D reconstruction using features and dense matching. In Proceedings of the 2022 International Conference on Robotics and Automation (ICRA); IEEE, 2022; pp. 8040–8046.
- Lu, J. Enhancing Chatbot User Satisfaction: A Machine Learning Approach Integrating Decision Tree, TF-IDF, and BERTopic. Preprints 2024.
- Zhong, Y.; Sun, Z.; Sun, Y.; Luo, S.; Wang, Y.; Zhang, W. Multi-view 3D Reconstruction from Video with Transformer. In Proceedings of the 2022 IEEE International Conference on Image Processing (ICIP); IEEE, 2022; pp. 1661–1665.
- Yang, L.; Zhu, Z.; Lin, X.; Nong, J.; Liang, Y. Long-Range Grouping Transformer for Multi-View 3D Reconstruction. In Proceedings of the IEEE/CVF International Conference on Computer Vision; 2023; pp. 18257–18267.
- Li, S. Harnessing Multimodal Data and Mult-Recall Strategies for Enhanced Product Recommendation in E-Commerce. Preprints 2024.
- Hoshi, S.; Ito, K.; Aoki, T. Accurate and robust image correspondence for structure-from-motion and its application to multi-view stereo. In Proceedings of the 2022 IEEE International Conference on Image Processing (ICIP); IEEE, 2022; pp. 2626–2630.
- Liu, J.; Ji, P.; Bansal, N.; Cai, C.; Yan, Q.; Huang, X.; Xu, Y. PlaneMVS: 3D plane reconstruction from multi-view stereo. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2022; pp. 8665–8675.
- Lu, J. Optimizing E-Commerce with Multi-Objective Recommendations Using Ensemble Learning. Preprints 2024.
- Shan, Y.; Xiao, J.; Liu, L.; Wang, Y.; Yu, D.; Zhang, W. A Coarse-to-Fine Transformer-Based Network for 3D Reconstruction from Non-Overlapping Multi-View Images. Remote Sensing 2024, 16, 901.
- Shi, W.; Liu, Z.; Li, Y.; Wen, Y.; Liu, Y. A Transformer-based Network for Multi-view 3D Mesh Generation. In Proceedings of the 2023 IEEE Smart World Congress (SWC); IEEE, 2023; pp. 1–8.
- Li, S.; Zhou, X.; Wu, Z.; Long, Y.; Shen, Y. Strategic Deductive Reasoning in Large Language Models: A Dual-Agent Approach. Preprints 2024.



| Method | Precision | Recall | RPE | ATE (m) |
|---|---|---|---|---|
| SPSG | 0.482 | 0.510 | 1.23 | 0.53 |
| SPSG + LoFTR | 0.526 | 0.558 | 1.15 | 0.45 |
| SPSG + DKMv3 | 0.594 | 0.602 | 1.05 | 0.39 |
| Ours (Coarse-to-Fine) | 0.628 | 0.640 | 0.93 | 0.33 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).