Submitted:
27 August 2025
Posted:
27 August 2025
Abstract
Keywords:
1. Introduction
2. Related Work
3. Materials and Methods
3.1. ACR-TD3
3.2. Reward Functions
4. Training
5. Experiments
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest




| Parameter | Value |
| --- | --- |
| Learning Rate | 0.0005 |
| Discount Factor | 0.97 |
| Soft Target Update Parameter | 0.001 |
| Batch Size | 256 |
| Buffer Size | 2e6 |

| Method | Env1 | Env2 | Env3 |
| --- | --- | --- | --- |
| DDPG | 96.00% | 95.33% | 89.67% |
| TD3 | 96.00% | 94.00% | 90.33% |
| SAC | 96.67% | 95.00% | 88.00% |
| ACR-TD3 | 99.00% | 98.00% | 96.67% |

| Method | Env1 | Env2 | Env3 |
| --- | --- | --- | --- |
| DDPG | 4.476 | 4.553 | 4.785 |
| TD3 | 4.531 | 4.557 | 4.768 |
| SAC | 4.508 | 4.989 | 5.014 |
| ACR-TD3 | 4.307 | 4.452 | 4.547 |
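
To make the roles of the listed training parameters concrete, the following minimal sketch shows where the discount factor and the soft target update parameter would enter a standard TD3-style update. It assumes the textbook TD3 formulation (clipped double-Q target and Polyak averaging); the ACR-TD3 modifications are not described in this section and are not reproduced here. The names `TrainingConfig`, `soft_update`, and `td_target` are hypothetical, and plain Python lists stand in for network weights.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Values copied from the parameter table; field names are illustrative."""
    learning_rate: float = 0.0005
    discount_factor: float = 0.97
    soft_update_tau: float = 0.001
    batch_size: int = 256
    buffer_size: int = 2_000_000  # 2e6 transitions


def soft_update(target, online, tau):
    """Polyak averaging: target <- tau * online + (1 - tau) * target."""
    return [tau * w + (1.0 - tau) * t for t, w in zip(target, online)]


def td_target(reward, next_q1, next_q2, done, gamma):
    """Clipped double-Q target of standard TD3 (assumed, not ACR-TD3 specific).

    Uses the smaller of the two target-critic estimates and drops the
    bootstrap term when the episode has terminated.
    """
    return reward + gamma * (1.0 - float(done)) * min(next_q1, next_q2)


cfg = TrainingConfig()

# Toy "weights": the target moves only 0.1% of the way toward the online values.
target = soft_update([0.0, 1.0], [1.0, 1.0], cfg.soft_update_tau)
print(target)

# Bootstrapped target for a non-terminal transition.
print(td_target(1.0, 2.0, 3.0, False, cfg.discount_factor))
```

With a soft target update parameter of 0.001, the target networks track the online networks very slowly, which is a common way to stabilise the bootstrapped targets; the discount factor of 0.97 weights the bootstrapped value slightly below the immediate reward.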
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).