ARTICLE | doi:10.20944/preprints202005.0181.v1
Subject: Computer Science And Mathematics, Mathematics Keywords: Reinforcement learning; Cartpole; Q Learning; Mathematical Modeling
Online: 10 May 2020 (18:02:43 CEST)
The prevalence of differential equations as a mathematical technique has refined the fields of control theory and constrained optimization due to the newfound ability to accurately model chaotic, unbalanced systems. However, in recent research, systems are increasingly more nonlinear and difficult to model using Differential Equations only. Thus, a newer technique is to use policy iteration and Reinforcement Learning, techniques that center around an action and reward sequence for a controller. Reinforcement Learning (RL) can be applied to control theory problems since a system can robustly apply RL in a dynamic environment such as the cartpole system (an inverted pendulum). This solution successfully avoids use of PID or other dynamics optimization systems, in favor of a more robust, reward-based control mechanism. This paper applies RL and Q-Learning to the classic cartpole problem, while also discussing the mathematical background and differential equations which are used to model the aforementioned system.