Version 1: Received: 8 June 2021 / Approved: 9 June 2021 / Online: 9 June 2021 (10:46:39 CEST)
How to cite:
Mastromichalakis, S. SigmoReLU: An Improvement Activation Function by Combining Sigmoid and ReLU. Preprints 2021, 2021060252. https://doi.org/10.20944/preprints202106.0252.v1
APA Style
Mastromichalakis, S. (2021). SigmoReLU: An Improvement Activation Function by Combining Sigmoid and ReLU. Preprints. https://doi.org/10.20944/preprints202106.0252.v1
Chicago/Turabian Style
Mastromichalakis, S. 2021. "SigmoReLU: An Improvement Activation Function by Combining Sigmoid and ReLU." Preprints. https://doi.org/10.20944/preprints202106.0252.v1
Abstract
Two of the most common activation functions (AFs) in deep neural network (DNN) training are Sigmoid and ReLU. Sigmoid tended to be more popular in previous decades, but it suffers from the well-known vanishing gradient problem. ReLU addresses this problem by using a gradient of zero (rather than tiny values) for negative inputs and a gradient of 1 for all positive inputs. Although it largely resolves vanishing gradients, it introduces a new issue: dying neurons in the zero-valued region. Recent improvements move in a similar direction, proposing variations of the AF such as Leaky ReLU (LReLU) while remaining within the same unresolved gradient problems. In this paper, combining Sigmoid and ReLU in one single function is proposed as a way to take the advantages of both. The experimental results demonstrate that using ReLU's gradient behavior on positive inputs and Sigmoid's gradient behavior on negative inputs significantly improves the training performance of neural networks on image classification of diseases such as COVID-19, and on text and tabular classification tasks, across five different datasets.
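A minimal sketch of the combination the abstract describes, assuming a piecewise form (ReLU branch for positive inputs, Sigmoid branch for non-positive inputs); the paper's exact formula may differ, and the function name `sigmorelu` here is illustrative:

```python
import numpy as np

def sigmoid(x):
    """Standard logistic sigmoid."""
    return 1.0 / (1.0 + np.exp(-x))

def sigmorelu(x):
    """Hypothetical piecewise blend: identity (ReLU) for x > 0, sigmoid for x <= 0.

    The negative branch keeps a nonzero gradient sigmoid(x) * (1 - sigmoid(x)),
    avoiding ReLU's dying-neuron region; the positive branch keeps ReLU's
    constant gradient of 1, avoiding Sigmoid's vanishing gradients.
    """
    x = np.asarray(x, dtype=float)
    return np.where(x > 0, x, sigmoid(x))
```

In a Keras model, such a function could be wrapped as a custom activation (e.g. via `tf.keras.layers.Activation` with a callable), though the original implementation details are not shown on this page.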
Keywords
Activation Function Combination; dying/vanishing gradients; ReLU; Sigmoid; Neural Networks; Keras; Medical Image Classification
Subject
Engineering, Control and Systems Engineering
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.