Preprint | Review | Version 2 | Preserved in Portico | This version is not peer-reviewed

Reinforcement Learning: Theory and Applications in HEMS

Version 1: Received: 3 August 2022 / Approved: 5 August 2022 / Online: 5 August 2022 (04:43:42 CEST)
Version 2: Received: 31 August 2022 / Approved: 1 September 2022 / Online: 1 September 2022 (04:27:12 CEST)

A peer-reviewed article of this Preprint also exists.

Al-Ani, O.; Das, S. Reinforcement Learning: Theory and Applications in HEMS. Energies 2022, 15, 6392.

Abstract

This article is motivated by the steep rise of reinforcement learning (RL) in energy applications and the growing penetration of home automation in recent years. It surveys the use of RL in various home energy management system (HEMS) applications, with a focus on deep neural network (DNN) models in RL. The article first provides an overview of reinforcement learning, followed by a discussion of state-of-the-art value, policy, and actor–critic methods in deep reinforcement learning (DRL). To make the published literature in reinforcement learning more accessible to the HEMS community, verbal descriptions are accompanied by explanatory figures as well as mathematical expressions using standard machine learning terminology. Next, a detailed survey of how reinforcement learning is used in different HEMS domains is presented. The survey also considers which reinforcement learning algorithms are used in each HEMS application, and it suggests that research in this direction is still in its infancy. Lastly, the article proposes four performance metrics to evaluate RL methods.

Keywords

HEMS; Reinforcement Learning; Deep Neural Network; Q-Value; Policy Gradient; Natural Gradient; Actor-Critic; Residential; Commercial; Academic

Subject

Computer Science and Mathematics, Artificial Intelligence and Machine Learning

Comments (1)

Comment 1
Received: 1 September 2022
Commenter: Sanjoy Das
Commenter's Conflict of Interests: Author
Comment: All major changes were made in response to the comments of three anonymous reviewers. The authors would like to thank Reviewer-1 and Reviewer-2 for their suggestions, all of which helped improve the article.

Based on multiple reviewers' comments, the abstract was changed to better reflect the contents of the article.

Based on Reviewer-1's suggestions, Section 8 ("Conclusion") and Figure 13 were added. A paragraph on what reinforcement learning approach to use for specific HEMS applications was included in Section 8. 

Based on Reviewer-2's suggestions, a paragraph on the use of Wi-Fi in HEMS was added in Section 2.1 with new references (cf. [56-58]). Additionally, as per Reviewer-2's suggestion, a brief outline of MPC (model predictive control) was added in Section 8, along with new references ([224, 225]).

Based on Reviewer-3's suggestions, the description of exploration and exploitation was further elaborated in Section 4, and the abstract was modified. Reviewer-3 wanted more (unspecified) articles to be cited. A few more references that had some relevance to the subject were included to address this request.

Reviewer-3 also suggested that the article must include a description of performance evaluation metrics and model accuracy of reinforcement learning algorithms. Since metrics for performance evaluation and model accuracy are an open research question, this suggestion could not be implemented. In its place, a set of four metrics was suggested by the authors. The detailed description is provided in Section 8 ("Conclusion"). Specifically, see equations 44-47, and the new Figure 17. Additionally, reference [226] was cited.

Reviewer-3 suggested that two new references be included. However, these were not included in the list of references, as the authors felt that "Hunger Game Optimization" and "Humpback Whale Optimization" had no relevance to Reinforcement Learning or HEMS. For the benefit of readers interested in such metaheuristics, the two references that Reviewer-3 wanted, but which were not included in the article, are provided below.
[1] AbuShanab, WS; Elaziz, MA; Ghandourah, EI; Moustafa, EB; Elsheikh, AH: A new fine-tuned random vector functional link model using Hunger games search optimizer for modeling friction stir welding process of polymeric materials. Journal of Materials Research and Technology, 14, 2021, 1482-1493. 
[2] Moustafa, EB; Hammad, AH; Elsheikh, AH: A new optimized artificial neural network model to predict thermal efficiency and water yield of tubular solar still. Case Studies in Thermal Engineering, 30, 2022, 101750.