Preprint Article. Version 1. Preserved in Portico. This version is not peer-reviewed.

A Reinforcement Learning Approach to Dynamic Trajectory Optimization with Consideration of Imbalanced Sub-Goals in Self-Driving Vehicles

Version 1: Received: 15 May 2024 / Approved: 16 May 2024 / Online: 16 May 2024 (09:58:18 CEST)

How to cite: Kim, Y.-J.; Ahn, W.-J.; Jang, S.-H.; Lim, M.-T.; Pae, D.-S. A Reinforcement Learning Approach to Dynamic Trajectory Optimization with Consideration of Imbalanced Sub-Goals in Self-Driving Vehicles. Preprints 2024, 2024051085. https://doi.org/10.20944/preprints202405.1085.v1

Abstract

Goal-conditioned reinforcement learning (RL) holds promise for addressing intricate control challenges by enabling agents to learn and execute desired skills through separate decision modules. However, the required skills occur irregularly, which poses a significant challenge to effective learning. In this paper, we demonstrate the detrimental effects of this imbalanced skill (sub-goal) distribution and propose a novel training approach, Classified Experience Replay (CER), designed to mitigate it. We show that applying CER to conventional RL methods significantly improves the performance of the RL agent. Driving is one such task: the sub-goals it requires occur with biased frequency, and our study shows that the proposed method improves training outcomes in this setting. In addition, we introduce a specialized framework tailored for self-driving tasks on highways, which integrates Model Predictive Control (MPC) into our RL trajectory optimization training paradigm. Combining CER with this framework substantially improves trajectory optimization for RL agents operating in highway environments.
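The abstract does not spell out how CER organizes stored experience, so the following is only a minimal sketch of one plausible reading: transitions are filed into a separate buffer per sub-goal class, and each training batch draws evenly across classes so that rare sub-goals are not drowned out by frequent ones. The class name `ClassifiedReplayBuffer`, its parameters, and the sub-goal labels `lane_keeping` and `lane_change` are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter, defaultdict, deque


class ClassifiedReplayBuffer:
    """Illustrative per-sub-goal replay buffer (an assumption, not the paper's code)."""

    def __init__(self, capacity_per_class=10_000, seed=None):
        # One bounded FIFO buffer per sub-goal class; old transitions are evicted first.
        self._buffers = defaultdict(lambda: deque(maxlen=capacity_per_class))
        self._rng = random.Random(seed)

    def add(self, sub_goal, transition):
        """Store a transition under the sub-goal class it was collected for."""
        self._buffers[sub_goal].append(transition)

    def sample(self, batch_size):
        """Draw a batch by cycling round-robin over the non-empty classes."""
        classes = [g for g, buf in self._buffers.items() if buf]
        if not classes:
            raise ValueError("cannot sample from an empty buffer")
        batch = []
        for i in range(batch_size):
            goal = classes[i % len(classes)]
            batch.append((goal, self._rng.choice(self._buffers[goal])))
        return batch


# Usage: a 95/5 imbalance in collected data still yields a balanced batch.
buf = ClassifiedReplayBuffer(seed=0)
for t in range(95):
    buf.add("lane_keeping", t)   # frequent sub-goal (hypothetical label)
for t in range(5):
    buf.add("lane_change", t)    # rare sub-goal (hypothetical label)

batch = buf.sample(8)
counts = Counter(goal for goal, _ in batch)
print(counts["lane_keeping"], counts["lane_change"])
```

A uniform replay buffer would return the rare sub-goal in roughly 5% of samples here; sampling per class keeps its share of each batch fixed regardless of how often it occurs during data collection. Whether CER balances classes equally or by some other weighting is not stated in the abstract.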

Keywords

reinforcement learning; experience replay; self-driving; trajectory optimization

Subject

Engineering, Control and Systems Engineering
