ARTICLE | doi:10.20944/preprints202309.0058.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: convolutional neural networks; ensembles; fusion
Online: 4 September 2023 (03:51:24 CEST)
In computer vision and image analysis, Convolutional Neural Networks (CNNs) and other deep learning models are at the forefront of research and development. These models have proven highly effective in computer vision tasks. One technique that has gained prominence in recent years is the construction of ensembles of deep CNNs, typically built by combining multiple pre-trained CNNs into a more powerful and robust network. The purpose of this study is to evaluate the effectiveness of building CNN ensembles by combining several advanced techniques. Tested here are CNN ensembles constructed by replacing ReLU layers with different activation functions, employing various data augmentation techniques, and utilizing several algorithms, including some novel ones, that perturb network weights. Experiments performed across many data sets representing different tasks demonstrate that our proposed methods for building deep ensembles produce superior results. All the resources required to replicate our experiments are available at https://github.com/LorisNanni.
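The core ensemble mechanism described above can be sketched in a few lines of PyTorch: copies of a base network (a tiny stand-in here for a pretrained CNN such as ResNet50) have their ReLU layers swapped for other activations, and the members' softmax outputs are fused by averaging (sum rule). The toy network, the activation list, and the fusion rule are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

def replace_relu(model: nn.Module, act_factory):
    # Recursively swap every ReLU layer for a freshly built activation.
    for name, child in model.named_children():
        if isinstance(child, nn.ReLU):
            setattr(model, name, act_factory())
        else:
            replace_relu(child, act_factory)
    return model

def make_net():
    # Toy stand-in for a pretrained CNN; in practice each member would
    # start from the same pretrained weights and be fine-tuned.
    return nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 3))

torch.manual_seed(0)
activations = [nn.ReLU, nn.LeakyReLU, nn.ELU, nn.SiLU]  # illustrative choices
ensemble = [replace_relu(make_net(), act) for act in activations]

x = torch.randn(4, 8)  # a batch of 4 feature vectors
with torch.no_grad():
    # Sum rule: average the softmax outputs of the member networks.
    probs = torch.stack([net(x).softmax(dim=1) for net in ensemble]).mean(dim=0)
preds = probs.argmax(dim=1)
```

Because each member injects a different nonlinearity, their errors tend to decorrelate, which is what makes the averaged prediction more robust than any single network.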
ARTICLE | doi:10.20944/preprints202306.0552.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Convolutional Neural Network; dolphin whistle; ensemble; spectrogram classification
Online: 7 June 2023 (12:54:44 CEST)
To effectively preserve marine environments and manage endangered species, it is necessary to employ efficient, precise, and scalable solutions for environmental monitoring. Ecoacoustics provides several benefits as it enables non-intrusive, prolonged sampling of environmental sounds, making it a promising tool for conducting biodiversity surveys. However, analyzing and interpreting acoustic data can be time-consuming and often demands substantial human supervision. This challenge can be addressed by harnessing contemporary methods for automated audio signal analysis, which have exhibited remarkable performance due to advancements in deep learning research. This paper presents an investigation into the development of an automatic computerized system to detect dolphin whistles. The proposed method utilizes a fusion of various ResNet50 networks integrated with data augmentation techniques. Through extensive experiments conducted on a publicly available benchmark, our findings demonstrate that our ensemble yields significant performance enhancements across all evaluated metrics. The MATLAB/PyTorch source code is freely available at: https://github.com/LorisNanni/
ARTICLE | doi:10.20944/preprints202212.0498.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: face detection; depth map; deep learning; filtering
Online: 27 December 2022 (01:49:45 CET)
Face detection is an important problem in computer vision because it enables a wide range of applications, such as facial recognition and analysis of human behavior. The problem is challenging because of the large variations in facial appearance across different individuals and different lighting and pose conditions. One way to detect faces is to utilize a highly advanced face detection method, such as RetinaFace, which uses deep learning techniques to achieve high accuracy on various datasets. However, even the best face detectors can produce false positives, which can lead to incorrect or unreliable results. In this paper, we propose a method for reducing false positives in face detection by using information from a depth map. A depth map is a two-dimensional representation of the distance of objects in an image from the camera. By using the depth information, the proposed method is able to better differentiate between true faces and false positives. We evaluate the method on a combined dataset of 549 images containing a total of 614 upright frontal faces. The results show that the proposed method is able to significantly reduce the number of false positives without sacrificing the overall detection rate. This indicates that the use of depth information can be a useful tool for improving face detection performance.
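A minimal sketch of this kind of depth-based filtering: a detection is kept only if its depth patch shows some relief and lies within a plausible distance of the camera. The statistics and thresholds below (patch standard deviation, mean depth) are hypothetical illustrations, not the paper's actual criteria.

```python
import numpy as np

def filter_by_depth(boxes, depth_map, min_std=0.05, max_depth=3.0):
    """Reject face detections whose depth patch looks implausible.

    boxes: list of (x, y, w, h) detections; depth_map: HxW array in metres.
    A flat patch (a face on a poster or screen) or a very distant patch is
    likely a false positive. Thresholds are illustrative, not tuned values.
    """
    kept = []
    for (x, y, w, h) in boxes:
        patch = depth_map[y:y + h, x:x + w]
        if patch.size == 0:
            continue
        # Real faces show depth relief (nose vs. cheeks) and sit near the camera.
        if patch.std() >= min_std and patch.mean() <= max_depth:
            kept.append((x, y, w, h))
    return kept

# Synthetic example: one "bumpy" region (face-like) and one flat region.
rng = np.random.default_rng(0)
depth = np.full((100, 100), 2.0)
depth[10:40, 10:40] += 0.3 * rng.random((30, 30))   # depth relief
boxes = [(10, 10, 30, 30), (60, 60, 30, 30)]        # (x, y, w, h)
kept = filter_by_depth(boxes, depth)
```

Here only the first box survives; the second, sitting on a perfectly flat patch, is discarded as a likely false positive.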
ARTICLE | doi:10.20944/preprints202111.0047.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Data augmentation; Deep Learning; Convolutional Neural Networks; Ensemble.
Online: 2 November 2021 (11:18:23 CET)
Convolutional Neural Networks (CNNs) have gained prominence in the research literature on image classification over the last decade. One shortcoming of CNNs, however, is their lack of generalizability and tendency to overfit when presented with small training sets. Augmentation directly confronts this problem by generating new data points that provide additional information. In this paper, we investigate the performance of more than ten different sets of data augmentation methods, with two novel approaches proposed here: one based on the Discrete Wavelet Transform and the other on the Constant-Q Gabor transform. Pretrained ResNet50 networks are fine-tuned on each augmentation method. Combinations of these networks are evaluated and compared across three benchmark data sets of images representing diverse problems and collected by instruments that capture information at different scales: a virus data set, a bark data set, and a LIGO glitches data set. Experiments demonstrate the superiority of this approach. The best ensemble proposed in this work achieves state-of-the-art performance across all three data sets. This result shows that varying data augmentation is a feasible way to build an ensemble of classifiers for image classification (code available at https://github.com/LorisNanni).
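One plausible reading of wavelet-based augmentation, sketched with a hand-rolled one-level 2-D Haar transform in NumPy: perturb the detail sub-bands while keeping the low-pass approximation, then reconstruct, yielding a new image that preserves coarse structure. The choice of Haar, the multiplicative noise, and the scale parameter are assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np

def haar2d(img):
    # One level of the 2-D Haar wavelet transform (even-sized input).
    a = (img[0::2, :] + img[1::2, :]) / 2.0   # row averages
    d = (img[0::2, :] - img[1::2, :]) / 2.0   # row differences
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0      # approximation
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0      # horizontal detail
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0      # vertical detail
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0      # diagonal detail
    return ll, lh, hl, hh

def ihaar2d(ll, lh, hl, hh):
    # Exact inverse of haar2d.
    a = np.empty((ll.shape[0], ll.shape[1] * 2))
    d = np.empty_like(a)
    a[:, 0::2] = ll + lh; a[:, 1::2] = ll - lh
    d[:, 0::2] = hl + hh; d[:, 1::2] = hl - hh
    img = np.empty((a.shape[0] * 2, a.shape[1]))
    img[0::2, :] = a + d; img[1::2, :] = a - d
    return img

def dwt_augment(img, rng, scale=0.1):
    # Keep the low-pass band, jitter the detail bands, reconstruct.
    ll, lh, hl, hh = haar2d(img)
    lh = lh * (1 + scale * rng.standard_normal(lh.shape))
    hl = hl * (1 + scale * rng.standard_normal(hl.shape))
    hh = hh * (1 + scale * rng.standard_normal(hh.shape))
    return ihaar2d(ll, lh, hl, hh)

rng = np.random.default_rng(0)
img = rng.random((8, 8))
aug = dwt_augment(img, rng)
```

Applying a different random perturbation on each epoch gives the network many variants of each training image that share the same coarse content.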
ARTICLE | doi:10.20944/preprints202103.0180.v1
Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: convolutional neural networks; activation functions; biomedical classification; ensembles; MeLU variants
Online: 5 March 2021 (10:05:38 CET)
Recently, much attention has been devoted to finding highly efficient and powerful activation functions for CNN layers. Because activation functions inject different nonlinearities between layers that affect performance, varying them is one method for building robust ensembles of CNNs. The objective of this study is to examine the performance of CNN ensembles made with different activation functions, including six new ones presented here: 2D Mexican ReLU, TanELU, MeLU+GaLU, Symmetric MeLU, Symmetric GaLU, and Flexible MeLU. The highest performing ensemble was built with CNNs having different activation layers that randomly replaced the standard ReLU. A comprehensive evaluation of the proposed approach was conducted across fifteen biomedical data sets representing various classification tasks. The proposed method was tested on two basic CNN architectures: Vgg16 and ResNet50. Results demonstrate the superior performance of this approach. The MATLAB source code for this study will be available at https://github.com/LorisNanni.
ARTICLE | doi:10.20944/preprints202010.0526.v1
Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: audio classification; dissimilarity space; siamese network; ensemble of classifiers; pattern recognition; animal audio
Online: 26 October 2020 (13:57:01 CET)
The classifier system proposed in this work combines the dissimilarity spaces produced by a set of Siamese neural networks (SNNs), designed using four different backbones, with different clustering techniques for training SVMs for automated animal audio classification. The system is evaluated on two animal audio datasets: one of cat and one of bird vocalizations. Different clustering methods reduce the spectrograms in the dataset to a set of centroids that generate (in both a supervised and unsupervised fashion) the dissimilarity space through the Siamese networks. In addition to feeding the SNNs with spectrograms, additional experiments process the spectrograms using the Heterogeneous Auto-Similarities of Characteristics. Once the dissimilarity spaces are computed, a vector space representation of each pattern is generated and used to train a Support Vector Machine (SVM) that classifies a spectrogram by its dissimilarity vector. Results demonstrate that the proposed approach performs competitively (without ad-hoc optimization of the clustering methods) on both animal vocalization datasets. To further demonstrate the power of the proposed system, the best stand-alone approach is also evaluated on the challenging Dataset for Environmental Sound Classification (ESC50). The MATLAB code used in this study is available at https://github.com/LorisNanni.
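The dissimilarity-space pipeline can be sketched as follows, with an identity map standing in for the trained Siamese embedding, and scikit-learn's KMeans and SVC standing in for the clustering and SVM stages. The toy two-class data and the number of centroids are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def embed(x):
    # Stand-in for the Siamese branch that maps a spectrogram to a
    # feature vector; the real system uses a trained SNN backbone.
    return x

def dissimilarity_space(X, centroids):
    # Represent each pattern by its distances to the k centroids.
    e = embed(X)
    return np.linalg.norm(e[:, None, :] - centroids[None, :, :], axis=2)

rng = np.random.default_rng(0)
# Two toy "spectrogram" classes as separated point clouds in feature space.
X = np.vstack([rng.normal(0, 1, (50, 16)), rng.normal(3, 1, (50, 16))])
y = np.array([0] * 50 + [1] * 50)

k = 8  # number of centroids (prototypes); an arbitrary illustrative choice
centroids = KMeans(n_clusters=k, n_init=10, random_state=0).fit(embed(X)).cluster_centers_
D = dissimilarity_space(X, centroids)   # 100 x k dissimilarity representation
clf = SVC(kernel="rbf").fit(D, y)       # SVM trained on dissimilarity vectors
acc = clf.score(D, y)                   # training accuracy on the toy data
```

The key idea is that classification happens in the k-dimensional space of distances to prototypes rather than in the raw feature space, so the SVM's input size depends on the number of centroids, not on the spectrogram size.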
ARTICLE | doi:10.20944/preprints202302.0396.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Convolutional Neural Network; Ensemble Learning; Transfer Learning; Fine-tuning; Plankton Classification; foraminifera
Online: 23 February 2023 (03:37:23 CET)
This paper presents a study of an automated system for identifying planktic foraminifera at the species level. The system uses a combination of deep learning methods, specifically Convolutional Neural Networks (CNNs), to analyze digital images of foraminifera taken at different illumination angles. The dataset is composed of 1437 groups of sixteen grayscale images, one group for each foraminifer, that are then converted to RGB images with various processing methods. These RGB images are fed into a set of CNNs, organized in an Ensemble Learning (EL) environment. The ensemble is built by training different networks using different approaches for creating the RGB images. The study finds that an ensemble of CNN models trained on different RGB images improves the system's performance compared to other state-of-the-art approaches. The proposed system was also found to outperform human experts in classification accuracy.
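One simple way to picture the grayscale-to-RGB conversion step: pick three of the sixteen illumination shots and assign them to the R, G, and B channels. The index triple and the normalization below are hypothetical illustrations; the paper combines the sixteen images using several different processing methods.

```python
import numpy as np

def illum_triplet_to_rgb(stack, idx=(0, 5, 10)):
    """Build one RGB image from a 16-image illumination stack.

    stack: (16, H, W) grayscale shots of one foraminifer under different
    illumination angles. This conversion assigns three of the shots to the
    R, G, and B channels; the index triple is an arbitrary illustrative
    choice, one of many possible groupings.
    """
    rgb = np.stack([stack[i] for i in idx], axis=-1)
    # Normalise to [0, 255] uint8 for a standard CNN input pipeline.
    rgb = (rgb - rgb.min()) / (np.ptp(rgb) + 1e-8) * 255.0
    return rgb.astype(np.uint8)

rng = np.random.default_rng(0)
stack = rng.random((16, 64, 64))        # synthetic illumination stack
rgb = illum_triplet_to_rgb(stack)
```

Training one CNN per conversion method and fusing their outputs is what turns these alternative RGB encodings into an ensemble.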
ARTICLE | doi:10.20944/preprints202210.0224.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: multilabel; ensemble; incorporating multiple clustering centers; gated recurrent neural networks; temporal convolutional neural networks; long short-term memory
Online: 17 October 2022 (04:06:31 CEST)
Multilabel learning goes beyond standard supervised learning models by associating a sample with more than one class label. Among the many techniques developed in the last decade to handle multilabel learning, the best approaches are those harnessing the power of ensembles and deep learners. This work proposes merging both methods by combining a set of gated recurrent units, temporal convolutional neural networks, and long short-term memory networks trained with variants of the Adam optimization approach. We examine many Adam variants, each fundamentally based on the difference between present and past gradients, with step size adjusted for each parameter. We also combine Incorporating Multiple Clustering Centers with an ensemble of bootstrap-aggregated decision trees, which is shown to further boost classification performance. In addition, we provide an ablation study assessing the performance improvement that each module of our ensemble produces. Multiple experiments on a large set of datasets representing a wide variety of multilabel tasks demonstrate the robustness of our best ensemble, which is shown to outperform the state-of-the-art. The MATLAB code for generating the best ensembles in the experimental section will be made available at https://github.com/LorisNanni.
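An Adam variant driven by the difference between present and past gradients can be sketched as follows: a diffGrad-style update written from the general description above (the exact variants used in the paper may differ). Each parameter's Adam step is scaled by a sigmoid "friction" term computed from how much its gradient has changed.

```python
import numpy as np

def diffgrad_step(theta, grad, state, lr=0.01, b1=0.9, b2=0.999, eps=1e-8):
    """One update of a diffGrad-style Adam variant (NumPy sketch).

    On top of Adam's first and second moments, each parameter's step is
    scaled by xi = sigmoid(|g_t - g_{t-1}|): parameters whose gradient is
    changing fast take near-full Adam steps, while parameters with a
    stable gradient are slowed down.
    """
    state["t"] += 1
    t = state["t"]
    m = state["m"] = b1 * state["m"] + (1 - b1) * grad
    v = state["v"] = b2 * state["v"] + (1 - b2) * grad ** 2
    xi = 1.0 / (1.0 + np.exp(-np.abs(grad - state["g_prev"])))  # friction term
    state["g_prev"] = grad.copy()
    m_hat = m / (1 - b1 ** t)   # bias-corrected moments, as in Adam
    v_hat = v / (1 - b2 ** t)
    return theta - lr * xi * m_hat / (np.sqrt(v_hat) + eps)

# Toy usage: minimise f(theta) = sum(theta^2), whose gradient is 2*theta.
theta = np.array([1.0, -2.0])
state = {"t": 0, "m": np.zeros(2), "v": np.zeros(2), "g_prev": np.zeros(2)}
for _ in range(200):
    theta = diffgrad_step(theta, 2 * theta, state, lr=0.05)
```

Because the friction term is per-parameter, the effective step size is adjusted coordinate by coordinate, which is the property the abstract highlights.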
ARTICLE | doi:10.20944/preprints202108.0094.v1
Subject: Computer Science And Mathematics, Discrete Mathematics And Combinatorics Keywords: Siamese networks; Ensemble of classifiers; Loss function; Discrete cosine transform
Online: 3 August 2021 (15:49:22 CEST)
In this paper, we examine two strategies for boosting the performance of ensembles of Siamese networks (SNNs) for image classification using two loss functions (Triplet and Binary Cross Entropy) and two methods for building the dissimilarity spaces (FULLY and DEEPER). With FULLY, the distance between a pattern and a prototype is calculated by comparing two images using the fully connected layer of the Siamese network. With DEEPER, each pattern is described using a deeper layer combined with dimensionality reduction. The basic design of the SNNs takes advantage of supervised k-means clustering for building the dissimilarity spaces that train a set of support vector machines, which are then combined by sum rule for a final decision. The robustness and versatility of this approach are demonstrated on several cross-domain image data sets, including a portrait data set, two bioimage data sets, and two animal vocalization data sets. Results show that the strategies employed in this work to increase the performance of dissimilarity image classification using SNNs are closing the gap with standalone CNNs. Moreover, when our best system is combined with an ensemble of CNNs, the resulting performance is superior to that of the CNN ensemble alone, demonstrating that our new strategy extracts additional information.