Preprint Article, Version 1 (preserved in Portico). This version is not peer-reviewed.

Attitude Control of Highly Maneuverable Aircraft Using an Improved Q-learning

Version 1 : Received: 22 October 2022 / Approved: 24 October 2022 / Online: 24 October 2022 (10:24:33 CEST)

How to cite: Zahmatkesh, M.; Emami, S.A.; Banazadeh, A.; Castaldi, P. Attitude Control of Highly Maneuverable Aircraft Using an Improved Q-learning. Preprints 2022, 2022100360. https://doi.org/10.20944/preprints202210.0360.v1

Abstract

Attitude control of a novel regional truss-braced wing aircraft with low stability characteristics is addressed in this paper using Reinforcement Learning (RL). In recent years, RL has been increasingly employed in challenging applications, particularly autonomous flight control. However, a significant predicament confronting discrete RL algorithms is the dimension limitation of the state-action table, together with the difficulty of defining the elements of the RL environment. To address these issues, a detailed mathematical model of the mentioned aircraft is first developed to shape an RL environment. Subsequently, Q-learning, the most prevalent discrete RL algorithm, is implemented in both the Markov Decision Process (MDP) and Partially Observable Markov Decision Process (POMDP) frameworks to control the longitudinal mode of the air vehicle. To eliminate the residual fluctuations that result from discrete action selection, and simultaneously track variable pitch angles, a Fuzzy Action Assignment (FAA) method is proposed to generate continuous control commands from the trained Q-table. Accordingly, it is shown that by defining an accurate reward function and observing all crucial states (which is equivalent to satisfying the Markov property), the performance of the introduced control system surpasses that of a well-tuned Proportional-Integral-Derivative (PID) controller.
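The two core ingredients named in the abstract can be sketched briefly: the standard tabular Q-learning update, and a fuzzy blending step that turns the discrete greedy actions of a trained Q-table into a single continuous command. The snippet below is a minimal illustration with invented function names and a generic membership-weighted average; it is not the authors' implementation, and the exact FAA weighting scheme used in the paper may differ.

```python
import numpy as np

def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """Standard tabular Q-learning update (Watkins' Q-learning)."""
    Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])

def fuzzy_action(Q, memberships, actions):
    """Blend discrete actions into one continuous control command.

    memberships[s] is the fuzzy membership of the current continuous
    state in discrete state s; actions[a] is the control value of
    discrete action a. Each discrete state votes for its greedy
    action, weighted by its membership (an illustrative FAA-style
    scheme, assumed here for the sketch).
    """
    best = np.array([actions[int(np.argmax(Q[s]))]
                     for s in range(len(memberships))])
    w = np.asarray(memberships, dtype=float)
    return float(np.dot(w, best) / w.sum())
```

For instance, if the current pitch state lies halfway between two discrete states whose greedy elevator deflections are 0 and 1, the blended command is 0.5 rather than chattering between the two discrete values.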

Keywords

Reinforcement Learning, Q-learning, Fuzzy Q-learning, Attitude Control, Truss-braced Wing, Flight Control

Subject

Engineering, Control and Systems Engineering
