ARTICLE | doi:10.20944/preprints201811.0579.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Deep learning, Cognitive, LSTM, Neural network, Ngrams
Online: 26 November 2018 (10:06:05 CET)
Cognitive neuroscience is the study of how the human brain functions on tasks like decision making, language, perception and reasoning. Deep learning is a class of machine learning algorithms that use neural networks, which are designed to model the responses of neurons in the human brain; learning can be supervised or unsupervised. Ngram token models are used extensively in language prediction: ngrams are probabilistic models used to predict the next word or token. They are statistical models of word or token sequences, called Language Models (LMs), and are essential in creating language prediction models. We explore a broader sandbox ecosystem enabling AI, specifically deep learning applications on unstructured content on the web.
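The next-word prediction that ngram language models perform can be sketched with a minimal bigram model; the toy corpus below is a hypothetical stand-in, since real LMs are trained on far larger text collections.

```python
import collections

# Hypothetical toy corpus; real language models use massive text datasets.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count bigram occurrences: counts[w1][w2] = number of times w2 follows w1.
counts = collections.defaultdict(collections.Counter)
for w1, w2 in zip(corpus, corpus[1:]):
    counts[w1][w2] += 1

def predict_next(word):
    """Return the most probable next word under the bigram model."""
    following = counts[word]
    total = sum(following.values())
    # P(w2 | w1) = count(w1, w2) / count(w1, *)
    probs = {w: c / total for w, c in following.items()}
    return max(probs, key=probs.get)

print(predict_next("the"))  # "cat" follows "the" most often in this corpus
```

Higher-order ngrams condition on longer histories in the same way, trading data sparsity for context.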
ARTICLE | doi:10.20944/preprints202201.0465.v1
Subject: Computer Science And Mathematics, Data Structures, Algorithms And Complexity Keywords: genetic algorithm; deep neural network; hidden layer; optimal architecture; intrusion detection
Online: 31 January 2022 (13:26:18 CET)
Computer network attacks are evolving in parallel with the evolution of hardware and neural network architectures. Despite major advancements in Network Intrusion Detection System (NIDS) technology, most implementations still depend on signature-based intrusion detection, which cannot identify unknown attacks. Deep learning can help a NIDS detect novel threats thanks to its strong generalization ability, and the deep neural network’s architecture has a significant impact on the model’s results. We propose a genetic algorithm-based model to find the optimal number of hidden layers and the number of neurons in each layer of the deep neural network (DNN) architecture for the network intrusion detection binary classification problem. Experimental results demonstrate that the proposed DNN architecture outperforms classical machine learning algorithms at a lower computational cost.
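The genetic search over layer counts and widths can be sketched as follows. The fitness function here is a synthetic stand-in that prefers a particular shape; in the paper, fitness would be the validation performance of the trained NIDS model, which is too expensive to reproduce here.

```python
import random

random.seed(0)

# Stand-in fitness: rewards genomes close to an assumed target shape.
# In the actual method this would be the trained DNN's validation score.
def fitness(genome):
    target = [64, 64, 64]
    penalty = abs(len(genome) - len(target)) * 100
    penalty += sum(abs(a - b) for a, b in zip(genome, target))
    return -penalty  # higher is better

def mutate(genome):
    g = genome[:]
    op = random.random()
    if op < 0.3 and len(g) < 6:
        g.append(random.choice([16, 32, 64, 128]))   # add a hidden layer
    elif op < 0.5 and len(g) > 1:
        g.pop(random.randrange(len(g)))              # remove a layer
    else:
        i = random.randrange(len(g))
        g[i] = random.choice([16, 32, 64, 128])      # resize a layer
    return g

def crossover(a, b):
    cut = random.randint(1, min(len(a), len(b)))
    return a[:cut] + b[cut:]

# Evolve a small population of architectures with elitist selection.
pop = [[random.choice([16, 32, 64, 128])] for _ in range(20)]
for _ in range(60):
    pop.sort(key=fitness, reverse=True)
    parents = pop[:10]
    pop = parents + [mutate(crossover(random.choice(parents),
                                      random.choice(parents)))
                     for _ in range(10)]

best = max(pop, key=fitness)
print(best)  # a genome listing the neurons per hidden layer
```

Each genome encodes both the depth and the per-layer width, so the search covers exactly the two architectural choices the paper optimizes.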
ARTICLE | doi:10.20944/preprints202005.0347.v1
Subject: Engineering, Mechanical Engineering Keywords: deep learning; maximum mean discrepancy; gearbox; fault detection
Online: 22 May 2020 (05:21:56 CEST)
In the past years, various intelligent machine learning and deep learning algorithms have been developed and widely applied for gearbox fault detection and diagnosis. However, the real-time application of these intelligent algorithms has been limited, mainly because a model developed using data from one machine or one operating condition suffers serious diagnosis performance degradation when applied to another machine, or to the same machine under a different operating condition. The reason for poor model generalization is the distribution discrepancy between the training and testing data. This paper proposes to address this issue using a deep learning based cross-domain adaptation approach for gearbox fault diagnosis. Labelled data from the training dataset and unlabelled data from the testing dataset are used to achieve the cross-domain adaptation task. A deep convolutional neural network (CNN) is used as the main architecture. Maximum mean discrepancy is used as a measure to minimize the distribution distance between the labelled training data and unlabelled testing data. The study proposes to reduce the discrepancy between the two domains in multiple layers of the designed CNN to adapt the representations learned from the training data for application to the testing data. The proposed approach is evaluated using experimental data from a gearbox under significant speed variation and multiple health conditions. Appropriate benchmarking against both traditional machine learning methods and other domain adaptation methods demonstrates the superiority of the proposed method.
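The maximum mean discrepancy term that the approach minimizes between domains can be sketched with a Gaussian kernel; the bandwidth and the synthetic feature batches below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    # Pairwise squared distances between rows of a and b.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def mmd2(x, y, sigma=1.0):
    """Biased squared MMD estimate: E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)]."""
    return (gaussian_kernel(x, x, sigma).mean()
            + gaussian_kernel(y, y, sigma).mean()
            - 2 * gaussian_kernel(x, y, sigma).mean())

rng = np.random.default_rng(0)
# Two batches from the same distribution vs. a mean-shifted "target domain".
same = mmd2(rng.normal(0, 1, (200, 4)), rng.normal(0, 1, (200, 4)))
shifted = mmd2(rng.normal(0, 1, (200, 4)), rng.normal(2, 1, (200, 4)))
print(same, shifted)  # the shifted pair shows a much larger discrepancy
```

In the paper's setting this quantity is computed on intermediate CNN features from both domains and added to the training loss, pulling the two feature distributions together.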
ARTICLE | doi:10.20944/preprints201811.0546.v4
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Convolutional Neural Network (CNN), Deep learning, Architecture, Applications
Online: 14 February 2019 (10:01:31 CET)
With the rise of the Artificial Neural Network (ANN), machine learning has taken a forceful turn in recent times. One of the most notable kinds of ANN design is the Convolutional Neural Network (CNN), a technology that combines artificial neural networks with up-to-date deep learning strategies. In deep learning, the Convolutional Neural Network is at the center of spectacular advances. This artificial neural network has been applied to several image recognition tasks for decades and has attracted the attention of researchers in many countries in recent years, as the CNN has shown promising performance in several computer vision and machine learning tasks. This paper describes the underlying architecture and various applications of the Convolutional Neural Network.
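The core operation of the CNN architecture the paper surveys is the convolution of an image with a learned kernel; a minimal single-channel version (no padding or stride, for illustration only) looks like this:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution of one channel with one filter."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Each output pixel is the dot product of a patch with the kernel.
            out[i, j] = (image[i:i + kh, j:j + kw] * kernel).sum()
    return out

# A gradient kernel responds where intensity changes from left to right.
image = np.zeros((5, 6))
image[:, 3:] = 1.0                 # bright right half
edge = np.array([[-1.0, 1.0]])     # simple horizontal-gradient filter
response = conv2d(image, edge)
print(response.max())              # strongest response sits on the edge
```

Real CNN layers stack many such filters over many channels and learn the kernel values by backpropagation, but the sliding dot product above is the building block.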
ARTICLE | doi:10.20944/preprints202304.0141.v1
Subject: Environmental And Earth Sciences, Remote Sensing Keywords: plume rise; deep learning; plume cloud recognition
Online: 10 April 2023 (04:11:22 CEST)
Estimating plume cloud height is essential for various applications, such as global climate models. Smokestack plume rise is the constant height at which the plume cloud is carried downwind as its momentum dissipates and the plume cloud and the ambient temperatures equalize. Although different parameterizations are used in most air-quality models to predict the plume rise, they have been unable to estimate it properly. This paper proposes a novel framework to monitor smokestack plume clouds and make long-term, real-time measurements of the plume rise. For this purpose, a three-stage framework is developed based on Deep Convolutional Neural Networks (DCNNs). In the first stage, an improved Mask R-CNN, called Deep Plume Rise Network (DPRNet), is applied to recognize the plume cloud. Then, image processing analysis and least squares theory are respectively used to detect the plume cloud’s boundaries and fit an asymptotic model into their centerlines. The y-component coordinate of this model’s critical point is considered the plume rise. In the last stage, a geometric transformation phase converts image measurements into real-life ones. A wide range of images with different atmospheric conditions, including day, night, and cloudy/foggy, have been selected for the DPRNet training algorithm. Obtained results show that the proposed method outperforms widely-used networks in smoke border detection and recognition.
ARTICLE | doi:10.20944/preprints202005.0455.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: pattern recognition; deep convolutional neural network; Brahmi script; CNN
Online: 28 May 2020 (07:33:32 CEST)
Significant progress has been made in pattern recognition technology. However, one obstacle that has not yet been overcome is the recognition of words in the Brahmi script, specifically the identification of characters, compound characters, and words. This study proposes a deep convolutional neural network (DCNN) with dropout to recognize Brahmi words, and a series of experiments is performed on a standard Brahmi dataset. The method was systematically tested on an accessible Brahmi image database, achieving a 92.47% recognition rate with dropout, which is among the best results reported in the literature for the same task.
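The dropout regularization the study relies on can be sketched as inverted dropout: each unit is zeroed with probability p during training and the survivors are rescaled so the expected activation is unchanged (p = 0.5 here is an assumed rate, not the paper's setting).

```python
import numpy as np

def dropout(activations, p=0.5, rng=None):
    """Inverted dropout: zero units with probability p, rescale survivors."""
    if rng is None:
        rng = np.random.default_rng()
    mask = rng.random(activations.shape) >= p   # keep with probability 1 - p
    return activations * mask / (1.0 - p)       # rescale so E[output] = input

rng = np.random.default_rng(0)
a = np.ones((10000,))
d = dropout(a, p=0.5, rng=rng)
print(d.mean())  # close to 1.0: the expected activation is preserved
```

At test time the layer is simply an identity, since the rescaling during training already accounts for the dropped units.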
ARTICLE | doi:10.20944/preprints202309.1273.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Facial Emotion Detection; Deep Learning; Classification; Neural Architecture Search Network
Online: 20 September 2023 (03:36:23 CEST)
Facial emotion detection is a challenging task that deals with emotion recognition. It has applications in various domains, such as behavior analysis, surveillance systems and human-computer interaction (HCI). Numerous studies have been implemented to detect emotions, using both classical machine learning algorithms and advanced deep learning algorithms. For machine learning algorithms, hand-crafted features need to be extracted, which is a tedious task requiring human effort, whereas deep learning models employ automated feature extraction from samples. Therefore, in this study we propose a novel and efficient deep learning model based on a Neural Architecture Search Network utilizing superior artificial networks such as an RNN and child networks. We performed training on the FER 2013 dataset comprising seven classes: happy, angry, neutral, sad, surprise, fear, and disgust. Furthermore, we analyzed the robustness of the proposed model on the CK+ dataset and compared it with existing techniques. Owing to the use of reinforcement learning in the network, the most representative features are extracted from the sample network; it extracts all key features without losing key information. Our proposed model is based on a one-stage classifier and performs efficient classification. Our technique outperformed existing models, attaining an accuracy of 98.14%, recall of 97.57%, and precision of 97.84%.
BRIEF REPORT | doi:10.20944/preprints202207.0419.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: computer vision; deep learning; CoughNet model
Online: 27 July 2022 (10:01:54 CEST)
This work addresses two key problems in identifying people infected with COVID-19: first, identification accuracy is not high enough; second, present identification methods such as nucleic acid testing are expensive in many countries. Methods: I therefore designed a fast, deep learning-based identification method for COVID-19 patients. After the model (CoughNet) learns more than 6,000 cough spectrograms from both COVID-19 patients and healthy people, its accuracy in separating COVID-19 patients from healthy people exceeds 99% on the test set. Structure: This paper is divided into three parts: the first introduces the background and research status; the second introduces the research methods; the third describes the specific experimental process.
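The spectrogram representation that turns a cough recording into an image-like model input can be sketched with a short-time Fourier transform; the pure tone and window settings below are illustrative assumptions standing in for real cough audio.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Power spectrogram: windowed frames -> one-sided FFT power per frame."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)   # a 440 Hz stand-in for a cough recording
spec = spectrogram(tone)
peak_bin = spec.mean(axis=0).argmax()
print(peak_bin * sr / 256)           # frequency of the strongest FFT bin
```

Stacking these per-frame spectra over time yields the 2-D time-frequency image that a CNN-style model such as CoughNet can classify.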
ARTICLE | doi:10.20944/preprints201912.0252.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: time series; deep learning; convolutional neural network; recurrence plot; financial market prediction
Online: 19 December 2019 (07:39:54 CET)
An application of a deep convolutional neural network and recurrence plots for financial market movement prediction is presented. Though its information is challenging and subjective to interpret, the pattern formed by a recurrence plot provides useful insight into the dynamical system. We used recurrence plots of seven financial time series to train a deep neural network for financial market movement prediction. Our approach is tested on our dataset and achieves an average classification accuracy of 53.25%. The result suggests that a well-trained deep convolutional neural network can learn from a recurrence plot and predict the direction of a financial market.
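A thresholded recurrence plot marks every pair of time points whose states lie closer than a tolerance eps, turning a 1-D series into a 2-D image a CNN can consume; the sine series and eps value below are illustrative assumptions.

```python
import numpy as np

def recurrence_plot(series, eps=0.1):
    """R[i, j] = 1 when |x_i - x_j| < eps, else 0."""
    dist = np.abs(series[:, None] - series[None, :])
    return (dist < eps).astype(np.uint8)

t = np.linspace(0, 4 * np.pi, 200)
rp = recurrence_plot(np.sin(t), eps=0.1)
print(rp.shape)  # (200, 200); periodic dynamics appear as diagonal bands
```

The plot is symmetric with an all-ones main diagonal (every state recurs with itself), and periodic structure in the series shows up as parallel diagonal lines.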
ARTICLE | doi:10.20944/preprints202006.0368.v1
Subject: Business, Economics And Management, Finance Keywords: Fraud Detection; Recurrent Neural Network; PaySim; Financial Transactions; Deep Learning
Online: 30 June 2020 (11:34:34 CEST)
Online transactions are becoming more popular at a time when the globe faces a previously unknown disease, COVID-19. Authorities have asked people to use cashless transactions as far as possible, although in practice this is not feasible for every transaction. Since the number of cashless transactions has been increasing during the COVID-19 lockdown period, fraudulent transactions have been increasing rapidly as well. Fraud can be analysed by examining the series of a customer's previous transactions: banks and other transaction authorities normally warn their customers when a transaction deviates from the available patterns, treating it as possibly fraudulent. For fraud detection during COVID-19, banks and credit card companies apply various methods such as data mining, decision trees, rule-based mining, neural networks, fuzzy clustering and other machine learning approaches, which try to find customers' normal usage patterns based on their past activities. The objective of this paper is to find such fraudulent transactions in this unmanageable situation. Digital payment schemes are often threatened by fraudulent activities, and detecting fraudulent transactions during money transfer may save customers from financial loss. This paper focuses on mobile-based money transactions for fraud detection, and a Deep Learning (DL) framework is suggested that monitors and detects fraudulent activities. By implementing and applying a recurrent neural network to the PaySim-generated synthetic financial dataset, deceptive transactions are identified. The proposed method can detect deceptive transactions with an accuracy of 99.87%, an F1-score of 0.99 and an MSE of 0.01.
ARTICLE | doi:10.20944/preprints202106.0613.v1
Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: LRTI; URTI; Asthma; Cough Classification; Respiratory Pathology Classification; MFCCs; BiLSTM; Deep Neural Networks
Online: 25 June 2021 (09:45:00 CEST)
Intelligent systems are transforming the world, as well as our healthcare system. We propose a deep learning-based cough sound classification model that can distinguish between children with healthy coughs and those with pathological coughs such as asthma, upper respiratory tract infection (URTI), and lower respiratory tract infection (LRTI). To train a deep neural network model, we collected a new dataset of cough sounds, labelled with clinicians' diagnoses. The chosen model is a bidirectional long short-term memory network (BiLSTM) based on Mel Frequency Cepstral Coefficient (MFCC) features. When trained to classify two classes of coughs -- healthy or pathological (in general or belonging to a specific respiratory pathology) -- the model reaches accuracy exceeding 84% against the labels provided by the physicians' diagnoses. To classify a subject's respiratory pathology condition, results from multiple cough epochs per subject were combined, and the resulting prediction accuracy exceeds 91% for all three respiratory pathologies. However, when the model is trained to discriminate among the four classes of coughs, overall accuracy drops: one class of pathological coughs is often misclassified as another. If one considers healthy coughs classified as healthy and pathological coughs classified as having some kind of pathology, the overall accuracy of the four-class model is above 84%. A longitudinal study of the MFCC feature space, comparing pathological and recovered coughs collected from the same subjects, revealed that pathological coughs, irrespective of the underlying condition, occupy the same feature space, making them harder to differentiate using MFCC features alone.
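The MFCC features feeding the BiLSTM can be sketched for a single frame: power spectrum, triangular mel filterbank, log, then a type-II DCT. Filter counts, frame sizes and the test tone are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def hz_to_mel(f):
    return 2595 * np.log10(1 + f / 700)

def mel_to_hz(m):
    return 700 * (10 ** (m / 2595) - 1)

def mfcc_frame(frame, sr, n_filters=26, n_coeffs=13):
    """MFCCs of one windowed audio frame (compact sketch)."""
    power = np.abs(np.fft.rfft(frame * np.hanning(len(frame)))) ** 2
    # Triangular mel filterbank between 0 Hz and Nyquist.
    mel_pts = np.linspace(0, hz_to_mel(sr / 2), n_filters + 2)
    bins = np.floor((len(frame) + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_filters, len(power)))
    for i in range(n_filters):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        if c > l:
            fbank[i, l:c] = np.linspace(0, 1, c - l, endpoint=False)
        if r > c:
            fbank[i, c:r] = np.linspace(1, 0, r - c, endpoint=False)
    logmel = np.log(fbank @ power + 1e-10)
    # Type-II DCT projects log-mel energies onto cosine bases.
    n = np.arange(n_filters)
    dct = np.cos(np.pi * np.outer(np.arange(n_coeffs),
                                  (2 * n + 1) / (2 * n_filters)))
    return dct @ logmel

sr = 8000
t = np.arange(400) / sr
coeffs = mfcc_frame(np.sin(2 * np.pi * 300 * t), sr)
print(coeffs.shape)  # (13,) -- one MFCC vector per frame
```

A full utterance yields one such vector per frame, and the resulting sequence is what a BiLSTM processes in both temporal directions.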
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: waste classification; transfer learning; deep learning; recognition classification
Online: 23 February 2020 (14:01:01 CET)
Using machine learning or deep learning to solve the problem of garbage recognition and classification is an important application of computer vision, but owing to incomplete garbage datasets and the poor performance of complex network models on smart terminal devices, existing garbage classification models perform poorly. This paper presents a waste classification and identification method based on transfer learning and a lightweight neural network. The lightweight network MobileNetV2 is transferred and rebuilt; the reconstructed network is used for feature extraction, and the extracted features are fed into an SVM to identify six types of garbage. The model was trained and verified using 2,527 labelled garbage images from the TrashNet dataset, ultimately achieving a classification accuracy of 98.4%. This shows that the method can effectively improve classification accuracy and speed, overcome the problems of weak data and sparse labels, and mitigate the over-fitting that small datasets encounter in deep learning, making the model robust.
ARTICLE | doi:10.20944/preprints202209.0060.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: Autonomous Driving; Deep Learning; LIDAR Data; Wavelets; 3D Object Detection
Online: 5 September 2022 (13:03:00 CEST)
3D object detection is crucial for autonomous driving to understand the driving environment. Since the pooling operation causes information loss in the standard CNN, we have designed a wavelet multiresolution analysis-based 3D object detection network without a pooling operation. Additionally, instead of using a single filter like the standard convolution, we use the lower-frequency and higher-frequency coefficients as filters. These filters capture more relevant parts than a single filter, enlarging the receptive field. The model comprises a discrete wavelet transform (DWT) and an inverse wavelet transform (IWT) with skip connections to encourage feature reuse for the contracting and expanding layers. The IWT enriches the feature representation by fully recovering the details lost during the downsampling operation. Element-wise summation is used for the skip connections to decrease the computational burden. We train the model for the Haar and Daubechies (Db4) wavelets. The two-level wavelet decomposition result shows that we can build a lightweight model without losing significant performance. The experimental results on the KITTI BEV and 3D evaluation benchmarks show that our model outperforms the PointPillars base model by up to 14% while reducing the number of trainable parameters. Code will be released.
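The lossless downsampling property the paper exploits in place of pooling can be demonstrated with a one-level 1-D Haar analysis/synthesis pair: the DWT splits a signal into low- and high-frequency coefficients and the IWT recovers it exactly.

```python
import numpy as np

def haar_dwt(x):
    """One-level Haar DWT of an even-length signal."""
    s2 = np.sqrt(2)
    approx = (x[0::2] + x[1::2]) / s2   # low-pass (approximation) branch
    detail = (x[0::2] - x[1::2]) / s2   # high-pass (detail) branch
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse Haar transform: perfect reconstruction from both branches."""
    s2 = np.sqrt(2)
    x = np.empty(2 * len(approx))
    x[0::2] = (approx + detail) / s2
    x[1::2] = (approx - detail) / s2
    return x

rng = np.random.default_rng(0)
x = rng.normal(size=64)
a, d = haar_dwt(x)
print(np.allclose(haar_idwt(a, d), x))  # True: no information is lost
```

Max pooling, by contrast, is not invertible: it discards three of every four values in a 2x2 window, which is exactly the loss the DWT/IWT pair avoids.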
REVIEW | doi:10.20944/preprints202104.0202.v1
Subject: Computer Science And Mathematics, Mathematical And Computational Biology Keywords: Spiking Neural Network (SNN); Biological Inspiration; Deep Learning; Neuromorphic Computing
Online: 7 April 2021 (12:13:16 CEST)
Recent advancements in deep learning have elevated the multifaceted nature of applications in this field. Artificial neural networks are now a genuinely old technique in the vast area of computer science; the principal ideas and models are more than fifty years old. In this modern computing era, however, scientists have introduced 3rd-generation intelligent models. In the biological neuron, membrane ion channels control the flow of ions across the membrane by opening and closing in response to voltage changes caused by intrinsic currents and externally conducted signals. The comprehensive 3rd-generation Spiking Neural Network (SNN) narrows the distance between deep learning, machine learning, and neuroscience in a biologically-inspired manner, connecting neuroscience and machine learning to establish efficient high-level computing. Spiking neural networks operate using spikes, which are discrete events that occur at points in time, as opposed to continuous values. This paper is a review of the biologically-inspired spiking neural network and its applications in different areas. The author presents a brief introduction to SNNs, covering their mathematical structure, applications, and implementation, along with an overview of machine learning, deep learning, and reinforcement learning. This review can help artificial intelligence researchers gain a compact intuition of spiking neural networks.
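The discrete, event-based computation described above can be sketched with a leaky integrate-and-fire (LIF) neuron, the standard SNN unit: the membrane potential leaks toward rest, integrates input current, and emits a spike on crossing a threshold. All constants are illustrative, not drawn from the review.

```python
import numpy as np

def lif(current, dt=1.0, tau=20.0, v_rest=0.0, v_thresh=1.0):
    """Simulate a leaky integrate-and-fire neuron over an input current trace."""
    v = v_rest
    spikes = []
    for inp in current:
        v += dt * (-(v - v_rest) + inp) / tau   # leaky integration step
        if v >= v_thresh:                       # threshold crossing
            spikes.append(1)                    # emit a discrete spike
            v = v_rest                          # reset the membrane potential
        else:
            spikes.append(0)
    return np.array(spikes)

spikes = lif(np.full(200, 1.5))  # constant supra-threshold drive
print(spikes.sum())              # the neuron fires at a regular rate
```

Stronger input drives the membrane to threshold faster, so information is carried by spike timing and rate rather than by continuous activation values.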
ARTICLE | doi:10.20944/preprints202103.0220.v1
Subject: Environmental And Earth Sciences, Atmospheric Science And Meteorology Keywords: Convolutional Neural Network; Deep Learning; Environmental Monitoring
Online: 8 March 2021 (13:37:58 CET)
Accurately mapping individual tree species in densely forested environments is crucial to forest inventory. When only RGB images are considered, this is a challenging task for many automatic photogrammetry processes, mainly because the spectral similarity between species in RGB scenes hinders most automatic methods. State-of-the-art deep learning methods could be capable of identifying tree species at an attractive cost, accuracy, and computational load in RGB images. This paper presents a deep learning-based approach to detect an important multi-use species of palm tree (Mauritia flexuosa; i.e., Buriti) in aerial RGB imagery. In South America, this palm tree is essential for many indigenous and local communities because of its characteristics. The species is also a valuable indicator of water resources, a further benefit of mapping its location. The method is based on a Convolutional Neural Network (CNN) to identify and geolocate singular tree species in a high-complexity forest environment, and considers the likelihood of every pixel in the image being recognized as a possible tree by implementing a confidence map feature extraction. This study compares the performance of the proposed method against state-of-the-art object detection networks, using images from a dataset composed of 1,394 airborne scenes in which 5,334 palm trees were manually labeled. The results returned a mean absolute error (MAE) of 0.75 trees and an F1-measure of 86.9%, better than both Faster R-CNN and RetinaNet under equal experimental conditions. The proposed network detected the palm trees quickly, with an image detection time of 0.073 seconds and a standard deviation of 0.002 on the GPU.
In conclusion, the presented method deals efficiently with a high-density forest scenario, can accurately map the location of a single species like the M. flexuosa palm tree, and may be useful for future frameworks.
ARTICLE | doi:10.20944/preprints202301.0148.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Uncertainty quantification; Deep learning, Alzheimer; MRI; MCD; Classification
Online: 9 January 2023 (06:58:45 CET)
One of the most common forms of dementia is Alzheimer’s disease (AD), which leads to progressive mental deterioration. Unfortunately, there is no definitive diagnosis or cure that can stop the condition from progressing. The diagnosis is often performed based on the clinical history and neuropsychological data, including magnetic resonance imaging (MRI). Deep neural network (DNN) algorithms are gaining popularity for medical diagnosis, and have been used widely for the analysis of MRI data. DNNs can automatically extract hidden features from thousands of training images; however, they cannot judge how confident they are about their predictions. To use DNNs in safety-critical applications such as medical diagnosis, uncertainty quantification of DNN predictions is crucial. For this purpose, Monte Carlo dropout (MCD) has been widely used; however, it may lead to overconfident and miscalibrated results. This paper proposes a framework in which the MCD algorithm’s hyper-parameters are optimized during training using Bayesian optimization, for the first time. The conducted optimization assigns high predictive entropy to erroneous predictions, making it possible to recognize risky predictions. The proposed framework is applied to AD diagnosis, which has not been done before. We compare our method with existing methods in the literature on different uncertainty quantification criteria. The results of comprehensive experiments on the Kaggle dataset, using a deep model pre-trained on the ImageNet dataset, show that the proposed algorithm quantifies uncertainty much better than the existing methods.
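The MCD uncertainty estimate can be sketched as follows: T stochastic forward passes produce T softmax vectors, and the predictive entropy of their mean flags risky predictions. The tiny noisy "logit functions" below are stand-ins for the pre-trained dropout network, which is an assumption of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def predictive_entropy(logit_fn, T=100):
    """Entropy of the mean softmax over T stochastic forward passes."""
    probs = np.mean([softmax(logit_fn()) for _ in range(T)], axis=0)
    return -(probs * np.log(probs + 1e-12)).sum()

# Confident case: dropout noise barely moves a strongly peaked logit vector.
confident = predictive_entropy(lambda: np.array([5.0, 0.0, 0.0])
                               + rng.normal(0, 0.1, 3))
# Uncertain case: the stochastic passes disagree wildly.
uncertain = predictive_entropy(lambda: rng.normal(0, 5.0, 3))
print(confident, uncertain)  # high entropy marks the risky prediction
```

The paper's contribution is to tune the dropout hyper-parameters by Bayesian optimization so that this entropy is high precisely on the erroneous predictions.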
ARTICLE | doi:10.20944/preprints201607.0085.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: CNN; Deep Learning; AlexNet; VGGNet; Texture Descriptor; Garment Categories; 13 Garment Trend Identification; Design Classification for Garments.
Online: 27 July 2016 (15:39:53 CEST)
Automatic identification of garment design classes for recommending fashion trends is important nowadays because of the rapid growth of online shopping. By learning the properties of images efficiently, a machine can classify them more accurately. Several methods based on hand-engineered feature coding exist for identifying garment design classes, but most of the time they do not achieve good results. Recently, Deep Convolutional Neural Networks (CNNs) have shown better performance on various object recognition tasks. A deep CNN uses multiple levels of representation and abstraction that help a machine understand different types of data (images, sound, and text) more accurately. In this paper, we apply deep CNNs to identifying garment design classes. To evaluate performance, we used two well-known CNN models, AlexNet and VGGNet, on two different datasets. We also propose a new CNN model based on AlexNet that beats the existing state of the art by a significant margin.
ARTICLE | doi:10.20944/preprints201811.0612.v1
Subject: Environmental And Earth Sciences, Geophysics And Geology Keywords: geophysical signal processing; pattern recognition; temporal convolutional neural networks; seismology; deep learning; nuclear treaty monitoring
Online: 29 November 2018 (03:37:48 CET)
The detection of seismic events at regional and teleseismic distances is critical to Nuclear Treaty Monitoring. Traditionally, detecting regional and teleseismic events has required the use of an expensive multi-instrument seismic array; however in this work, we present DeepPick, a novel seismic detection algorithm capable of array-like performance from a single trace. We achieve this directly, by training our single-trace detector against labeled events from an array catalog, and by utilizing a deep temporal convolutional neural network. The training data consists of all arrivals in the International Seismological Centre Catalog for seven seismic arrays over a five year window from 1 Jan 2010 to 1 Jan 2015, yielding a total training set of 608,362 detections. The test set consists of the same seven arrays over a one year window from 1 Jan 2015 to 1 Jan 2016. We report our results by training the algorithm on six of the arrays and testing it on the seventh, so as to demonstrate the transportability and generalization of the technique to new stations. Detection performance against this test set is outstanding. Fixing a type-I error rate of 1%, the algorithm achieves an overall recall rate of 73% on the 141,095 array beam picks in the test set, yielding 102,394 correct detections. This is more than 4 times the 23,259 detections found in the analyst-reviewed single-trace catalogs over the same period, and represents an 8dB improvement in detector sensitivity over current methods. These results demonstrate the potential of our algorithm to significantly enhance the effectiveness of the global treaty monitoring network.
ARTICLE | doi:10.20944/preprints202105.0636.v1
Subject: Engineering, Automotive Engineering Keywords: cultural heritage; environment; deep learning; artificial intelligence; neural network.
Online: 26 May 2021 (13:06:34 CEST)
This work aims to contribute to a better understanding of the use of public street spaces. (1) Background: With a multidisciplinary approach, the objective of this work is to propose an experimental method that is reproducible on a large scale. (2) Study area: The applied methodology uses artificial intelligence to analyze Google Street View (GSV) images at street level. (3) Method: The purpose is to validate a methodology that allows the use (pedestrians and cars) of some squares in Rome belonging to different historical periods to be characterized and quantified. (4) Results: Through machine vision techniques, typical of artificial intelligence and based on convolutional neural networks, a historical reading of some selected squares is proposed with the aim of interpreting the dynamics of use and identifying some ongoing critical issues. (5) Conclusions: This work validated the usefulness of a method that applies artificial intelligence to the analysis of GSV images at street level.
ARTICLE | doi:10.20944/preprints202302.0299.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: deepfake detection; CNN; deep neural network; computer vision; scale invariant feature transform; histogram of oriented gradients
Online: 17 February 2023 (06:51:37 CET)
Deepfakes are manipulated or altered images or videos created using deep learning models with high levels of photorealism. The two popular methods of producing a deepfake are based on either convolutional neural networks (CNNs) or autoencoders. Deepfakes created using CNNs show comparatively higher levels of realism, yet they often leave artifacts and distortions in the generated media that can be detected using machine learning and deep learning algorithms. In recent years, there has been an influx of periocular image and video data because of the increased usage of face masks. When masks are worn, much of what is used for facial recognition is hidden, leaving only the periocular region visible to an observer. This loss of vital information makes media easier to misidentify and deepfakes less likely to be identified as fake. In this work, feature extraction methods such as the Scale-Invariant Feature Transform (SIFT), Histogram of Oriented Gradients (HOG), and a CNN are used to train an ensemble deep learning model to detect deepfakes in videos on a frame-by-frame level based on the periocular region. Our proposed model distinguishes original and manipulated images with accuracies around 98.9 percent, improving on previous work by combining SIFT and HOG for deepfake detection in convolutional neural networks.
Subject: Computer Science And Mathematics, Computer Science Keywords: Indoor Localization; Sensor Fusion; Multimodal Deep Neural Network; Multimodal Sensing; WiFi Fingerprinting; Pedestrian Dead Reckoning
Online: 13 October 2021 (12:14:39 CEST)
Many engineered approaches have been proposed over the years for solving the hard problem of performing indoor localisation using smartphone sensors. However, specialising these solutions for difficult edge cases remains challenging. Here we propose an end-to-end hybrid multimodal deep neural network localisation system, MM-Loc, relying on zero hand-engineered features, learning them automatically from data instead. This is achieved by using modality-specific neural networks to extract preliminary features from each sensing modality, which are then combined by cross-modality neural structures. We show that our choice of modality-specific neural architectures is capable of estimating the location with good accuracy independently. But for better accuracy, a multimodal neural network fusing the features of early modality-specific representations is a better proposition. Our proposed MM-Loc solution is tested on cross-modality samples characterised by different sampling rates and data representation (inertial sensors, magnetic and WiFi signals), outperforming traditional approaches for location estimation. MM-Loc elegantly trains directly from data unlike conventional indoor positioning systems, which rely on human intuition.
ARTICLE | doi:10.20944/preprints201907.0121.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Artificial Neural Networks; Deep Learning; Generative Neural Networks; Incremental Learning; Novelty detection; Catastrophic Interference
Online: 8 July 2019 (14:29:28 CEST)
Deep learning models belong to the family of artificial neural networks and, as such, suffer from catastrophic interference when they learn sequentially. In addition, most of these models have a rigid architecture that prevents the incremental learning of new classes. To overcome these drawbacks, in this article we propose the Self-Improving Generative Artificial Neural Network (SIGANN), an end-to-end deep neural network system that eases the catastrophic forgetting problem when learning new classes. In this method, we introduce a novelty detection model to automatically detect samples of new classes, and an adversarial autoencoder is used to produce samples of previous classes. The system consists of three main modules: a classifier module implemented using a deep convolutional neural network, a generator module based on an adversarial autoencoder, and a novelty detection module implemented using an OpenMax activation function. Using the EMNIST data set, the model was trained incrementally, starting with a small set of classes. The simulation results show that SIGANN retains previous knowledge with gradual forgetting across each learning sequence. Moreover, SIGANN can detect new classes that are hidden in the data and, therefore, proceed with incremental class learning.
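The novelty-detection role of the OpenMax module can be illustrated with a deliberately simplified stand-in: reject a sample as a candidate new class when its best softmax probability is low. Note this is an assumption-laden sketch; the real OpenMax calibrates per-class Weibull models on activation distances rather than using a fixed cut-off.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())                # shift for numerical stability
    return e / e.sum()

def detect_novelty(logits, threshold=0.7):
    """Simplified stand-in for an OpenMax-style check: a sample whose best
    class probability falls below `threshold` is flagged as a candidate
    new class instead of being assigned a known label."""
    p = softmax(np.asarray(logits, dtype=float))
    best = int(np.argmax(p))
    return ("unknown", None) if p[best] < threshold else ("known", best)

known = detect_novelty([8.0, 0.1, 0.2])    # confident, peaked logits
unknown = detect_novelty([1.0, 0.9, 1.1])  # flat, ambiguous logits
```

Samples flagged "unknown" would then seed the incremental learning of a new class.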
ARTICLE | doi:10.20944/preprints201910.0056.v1
Subject: Engineering, Control And Systems Engineering Keywords: Fusarium head blight disease; color imaging; deep neural network
Online: 6 October 2019 (04:11:58 CEST)
Fusarium head blight (FHB) disease is extensively distributed worldwide. This disease damages grain quality and reduces yield. Detecting this disease in a high-throughput manner is crucial to planters and breeders. Our study focused on developing a method for processing wheat color images and accurately detecting diseased areas using deep learning and image processing techniques. The color images of wheat at the milky stage were collected and processed to construct datasets, which were used to retrain a deep convolutional neural network model using transfer learning. Testing results showed that the model can detect spikes, and the coefficient of determination between the manual spike count and the detected count was 0.80. The model was assessed, and the mean average precision for the testing dataset was 0.9201. On the basis of the spike detection results, a new color feature was applied to obtain the gray image of each spike. Then, a modified region growing algorithm was implemented to segment and detect the diseased areas of each spike. Results show that the region growing algorithm performs better than K-means and Otsu's method in segmenting FHB disease. Overall, this study demonstrates that deep learning techniques enable the accurate detection of FHB in wheat using color images, and the proposed method can effectively detect spikes and diseased areas, thereby improving the efficiency of FHB detection.
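For readers unfamiliar with the segmentation step, a basic region-growing pass looks like the sketch below: starting from a seed pixel, absorb 4-connected neighbours whose intensity stays within a tolerance of the seed value. This is the textbook algorithm, not the paper's modified variant.

```python
import numpy as np
from collections import deque

def region_grow(gray, seed, tol=20):
    """Minimal 4-connected region growing on a grayscale image: grow from
    `seed`, absorbing neighbours whose intensity differs from the seed
    value by at most `tol`."""
    h, w = gray.shape
    seed_val = float(gray[seed])
    mask = np.zeros((h, w), dtype=bool)
    mask[seed] = True
    q = deque([seed])
    while q:
        r, c = q.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < h and 0 <= nc < w and not mask[nr, nc] \
               and abs(float(gray[nr, nc]) - seed_val) <= tol:
                mask[nr, nc] = True
                q.append((nr, nc))
    return mask

# Bright 3x3 "lesion" on a dark background: only the lesion is grown.
img = np.zeros((6, 6), dtype=np.uint8)
img[1:4, 1:4] = 200
m = region_grow(img, (2, 2), tol=30)
```

Unlike global thresholding (K-means, Otsu), the grown region is spatially connected, which suits compact diseased areas on a spike.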
REVIEW | doi:10.20944/preprints202102.0340.v1
Subject: Computer Science And Mathematics, Security Systems Keywords: Cybersecurity; Deep Learning; Artificial Neural Network; Artificial Intelligence; Cyber-Attacks; Cybersecurity Analytics; Cyber Threat Intelligence
Online: 16 February 2021 (15:31:02 CET)
Deep learning (DL), which originated from artificial neural networks (ANNs), is one of the major technologies enabling today's smart cybersecurity systems and policies to function in an intelligent manner. Popular deep learning techniques, such as the Multi-layer Perceptron (MLP), Convolutional Neural Network (CNN or ConvNet), Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM), Self-organizing Map (SOM), Auto-Encoder (AE), Restricted Boltzmann Machine (RBM), Deep Belief Network (DBN), Generative Adversarial Network (GAN), Deep Transfer Learning (DTL), and Deep Reinforcement Learning (DRL), as well as their ensembles and hybrids, can be used to tackle diverse cybersecurity issues intelligently. In this paper, we present a comprehensive overview of these neural network and deep learning techniques in light of today's diverse needs. We also discuss their applicability to various cybersecurity tasks such as intrusion detection, identification of malware or botnets, phishing, prediction of cyber-attacks such as denial of service (DoS), and detection of fraud or cyber-anomalies. Finally, we highlight several research issues and future directions within the scope of our study. Overall, the ultimate goal of this paper is to serve as a reference point and guideline for academia and professionals in the cyber industries, especially from the deep learning point of view.
ARTICLE | doi:10.20944/preprints201807.0086.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: vibration measurement; frequency prediction; deep learning; convolutional neural network; photogrammetry; computer vision; non-contact measurement
Online: 5 July 2018 (08:31:00 CEST)
Vibration measurement serves as the basis for various engineering practices such as natural frequency or resonant frequency estimation. As image acquisition devices become cheaper and faster, vibration measurement and frequency estimation through image sequence analysis continue to receive increasing attention. In the conventional photogrammetry and optical methods of frequency measurement, vibration signals are first extracted before implementing the vibration frequency analysis algorithm. In this work, we demonstrated that frequency prediction can be achieved using a single feed-forward convolutional neural network. The proposed method is verified using a vibration signal generator and excitation system, and the result obtained was compared with that of an industrial contact vibrometer in a real application. Our experimental results demonstrate that the proposed method can achieve acceptable prediction accuracy even in unfavorable field conditions.
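The conventional baseline this abstract contrasts with — extract the vibration signal, then run a frequency analysis — can be sketched with an FFT peak pick. The CNN replaces this two-step pipeline with an end-to-end prediction; the sampling rate and test signal below are illustrative.

```python
import numpy as np

def fft_peak_frequency(signal, fs):
    """Conventional frequency estimation: take the magnitude spectrum of
    the extracted vibration signal and return the dominant frequency."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    spectrum[0] = 0.0                      # ignore the DC component
    return freqs[int(np.argmax(spectrum))]

rng = np.random.default_rng(0)
fs = 1000.0                                # sampling rate, Hz (illustrative)
t = np.arange(0, 1.0, 1.0 / fs)
vib = np.sin(2 * np.pi * 50.0 * t) + 0.1 * rng.standard_normal(t.size)
f_hat = fft_peak_frequency(vib, fs)        # close to the true 50 Hz
```

One second of data gives 1 Hz bin resolution; the learned model is instead limited only by its training distribution.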
ARTICLE | doi:10.20944/preprints201808.0130.v1
Subject: Engineering, Mechanical Engineering Keywords: SHM; Electromechanical Impedance; Piezoelectricity; Intelligent Fault Diagnosis; Machine Learning; CNN; Deep Learning
Online: 6 August 2018 (21:51:53 CEST)
Convolutional Neural Network (CNN) applications have recently emerged in Structural Health Monitoring (SHM) systems, focusing mostly on vibration analysis. However, the SHM literature clearly shows a lack of applications combining PZT (Lead Zirconate Titanate) based methods with CNNs. Likewise, applications using CNNs along with the Electromechanical Impedance (EMI) technique in SHM systems are rare. To encourage this combination, an innovative SHM solution combining EMI-PZT and CNN is presented here. To accomplish this, the EMI signature is split into several parts, and the Euclidean distances among them are computed to form an RGB (red, green and blue) frame. As a result, we introduce a dataset formed from the EMI-PZT signals of 720 frames, encompassing four types of structural conditions for each PZT. In a case study, the CNN-based method was experimentally evaluated using three PZTs glued onto an aluminum plate. The results reveal effective pattern classification, yielding a 100% hit rate that outperforms other SHM approaches. Furthermore, the method needs only a small dataset for training the CNN, providing several advantages for industrial applications.
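The signal-to-image encoding described above can be sketched as follows: split the EMI signature into parts, build the pairwise Euclidean-distance matrix, and rescale it to pixel range for one channel of the frame. The part count and channel mapping here are illustrative assumptions.

```python
import numpy as np

def emi_to_frame(signature, n_parts=12):
    """Split an EMI signature into parts and turn the pairwise Euclidean
    distances between parts into an 8-bit image channel for the CNN."""
    parts = np.array_split(np.asarray(signature, dtype=float), n_parts)
    m = min(len(p) for p in parts)
    parts = [p[:m] for p in parts]         # equalise part lengths
    d = np.array([[np.linalg.norm(a - b) for b in parts] for a in parts])
    if d.max() > 0:
        d = 255.0 * d / d.max()            # rescale to pixel range
    return d.astype(np.uint8)

frame = emi_to_frame(np.sin(np.linspace(0, 20, 600)))
```

The resulting matrix is symmetric with a zero diagonal; structural damage changes the signature and hence the texture of the frame.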
ARTICLE | doi:10.20944/preprints202007.0209.v1
Subject: Engineering, Control And Systems Engineering Keywords: Deep learning; Head Related Transfer Function (HRTF); Restoration; Ambisonics; Spatial Audio; Spherical harmonic; Audio signal processing; Denoising; Auto-Encoder; Neural Network
Online: 10 July 2020 (08:58:11 CEST)
Spherical harmonic (SH) interpolation is a commonly used method to spatially up-sample sparse Head Related Transfer Function (HRTF) datasets to denser HRTF datasets. However, depending on the number of sparse HRTF measurements and SH order, this process can introduce distortions in high frequency representation of the HRTFs. This paper investigates whether it is possible to restore some of the distorted high frequency HRTF components using machine learning algorithms. A combination of Convolutional Auto-Encoder (CAE) and Denoising Auto-Encoder (DAE) models is proposed to restore the high frequency distortion in SH interpolated HRTFs. Results are evaluated using both Perceptual Spectral Difference (PSD) and localisation prediction models, both of which demonstrate significant improvement after the restoration process.
ARTICLE | doi:10.20944/preprints201812.0211.v1
Subject: Computer Science And Mathematics, Information Systems Keywords: VHR image; building roof; segmentation; GF2; deep convolution neural network
Online: 18 December 2018 (04:07:47 CET)
This paper presents a novel approach for semantic segmentation of building roofs in dense urban environments with a Deep Convolutional Neural Network (DCNN), using imagery acquired by a Chinese Very High Resolution (VHR) satellite mission, GaoFen-2 (GF-2). To provide an operational end-to-end workflow for accurate building roof mapping with feature extraction as well as image segmentation, a fully convolutional DCNN with both convolutional and deconvolutional layers is designed to perform the VHR image analysis for pixel labeling. Given the diverse urban patterns and building styles across large areas, sample image data sets of building roofs and non-building roofs were collected over different metropolitan regions in China. We selected typical cities with dense urban environments in each metropolitan region as study areas for collecting training and test samples. A high-performance cluster with GPU-mounted workstations was employed to perform the model training and optimization. With the building roof samples collected over different cities, a predictive model with multiple NN layers was developed for building roof labeling. Validation of the building roof map shows that the overall accuracy (OA) and mean Intersection Over Union (mIoU) of the DCNN-based segmentation are 94.67% and 0.85, respectively, while CRF-refined segmentation achieved an OA of 94.69% and an mIoU of 0.83. The results suggest that the proposed approach is a promising solution for building roof mapping with VHR images over large areas across different urban and building patterns. With the operational acquisition of GF-2 VHR imagery, we expect to develop an automated pipeline for operational built-up area monitoring and timely updates of building roof maps over large areas.
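The two reported metrics, OA and mIoU, are straightforward to compute from paired label maps; the sketch below shows the standard definitions on a toy binary example.

```python
import numpy as np

def oa_and_miou(y_true, y_pred, n_classes=2):
    """Overall accuracy (fraction of correctly labelled pixels) and mean
    intersection-over-union averaged over the classes present."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    oa = float((y_true == y_pred).mean())
    ious = []
    for c in range(n_classes):
        inter = np.sum((y_true == c) & (y_pred == c))
        union = np.sum((y_true == c) | (y_pred == c))
        if union > 0:
            ious.append(inter / union)
    return oa, float(np.mean(ious))

# Toy label maps, 8 pixels, 7 of which agree.
oa, miou = oa_and_miou([0, 0, 1, 1, 0, 1, 1, 1], [0, 0, 1, 1, 0, 1, 1, 0])
```

mIoU penalises class imbalance that OA can hide, which is why roof-mapping papers typically report both.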
Subject: Engineering, Automotive Engineering Keywords: traffic engineering; traffic incident detection; CNN-XGBoost; Convolution Neural Network; Deep Learning
Online: 15 April 2020 (14:13:35 CEST)
Accurate and efficient traffic incident detection methods can effectively alleviate traffic congestion caused by traffic incidents, prevent secondary accidents, and improve the safety of urban road traffic. Aiming at the problems that traditional machine learning detection methods cannot fully extract the parameter characteristics of traffic flow and are not suitable for multi-dimensional, non-linear, massive data, we propose a new traffic incident detection method, CNN-XGBoost, which combines the respective advantages of the Convolutional Neural Network (CNN) and Extreme Gradient Boosting (XGBoost). First, we preprocess the original freeway traffic incident detection data set by constructing an initial variable set, normalizing the data, balancing the classes, and reorganizing the dimensions. Second, we use the CNN to automatically extract deep features from the detection data and use XGBoost as a classifier on the extracted features for expressway traffic incident detection. Finally, we carry out simulation experiments on CNN-XGBoost using a data set from microwave detectors on the Hangzhou expressway in China. The experimental results show that, compared with XGBoost, CNN, Support Vector Machine (SVM), Gradient Boosting Decision Tree (GBDT) and other methods, CNN-XGBoost effectively improves the accuracy of expressway traffic incident detection and has better generalization ability.
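Two of the preprocessing steps named above can be sketched directly. Min-max normalisation is standard; the balancing method is not specified in the abstract, so random undersampling of the majority class below is an illustrative assumption.

```python
import numpy as np

def minmax_normalise(X):
    """Rescale each traffic-flow feature column into [0, 1]."""
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    return (X - lo) / np.where(hi > lo, hi - lo, 1.0)

def undersample(X, y, seed=0):
    """One possible balancing choice: randomly undersample the majority
    class so incident (1) and non-incident (0) samples are equal."""
    rng = np.random.default_rng(seed)
    idx0, idx1 = np.flatnonzero(y == 0), np.flatnonzero(y == 1)
    keep = min(len(idx0), len(idx1))
    sel = np.sort(np.concatenate([rng.choice(idx0, keep, replace=False),
                                  rng.choice(idx1, keep, replace=False)]))
    return X[sel], y[sel]

Xn = minmax_normalise([[10.0, 0.2], [30.0, 0.8], [20.0, 0.5]])
Xb, yb = undersample(np.arange(12).reshape(6, 2),
                     np.array([0, 0, 0, 0, 1, 1]))
```

After these steps the samples would be reshaped into the grid layout the CNN expects before feature extraction.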
ARTICLE | doi:10.20944/preprints202209.0190.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: green coffee bean; lightweight framework; deep convolutional neural network; explainable model; random optimization
Online: 14 September 2022 (04:04:05 CEST)
In recent years, the demand for coffee has increased tremendously. During production, green coffee beans are traditionally screened manually for defective beans before they are packed into coffee bean packages; however, this method is not only time-consuming but also increases the rate of human error due to fatigue. Therefore, this paper proposes a lightweight deep convolutional neural network (LDCNN) for a green coffee bean quality detection system, combining depthwise separable convolution (DSC), squeeze-and-excite (SE) blocks, skip blocks, and other components. To compensate for the limited parameter capacity of the lightweight model during training, rectified Adam (RA), lookahead (LA), and gradient centralization (GC) were included to improve efficiency; the model was also deployed on an embedded system. Finally, the local interpretable model-agnostic explanations (LIME) method was employed to explain the model's predictions. The experimental results indicate that the model reaches an accuracy of 98.38% and an F1 score as high as 98.24% when detecting the quality of green coffee beans, achieving high accuracy with lower computing time and fewer parameters. Moreover, the interpretability analysis verified that the lightweight model in this work is reliable, giving screening personnel a basis for understanding its judgments and thereby improving the classification and prediction of the model.
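The squeeze-and-excite block named among the LDCNN's components can be sketched in numpy: pool each channel globally, pass the channel statistics through a small bottleneck, and rescale the channels by sigmoid gates. The weights below are random placeholders, not the trained model's.

```python
import numpy as np

def squeeze_excite(feature_map, w1, w2):
    """Minimal squeeze-and-excite: global-average-pool per channel,
    bottleneck MLP, then channel-wise sigmoid gating of the features."""
    squeeze = feature_map.mean(axis=(0, 1))            # (C,) channel stats
    hidden = np.maximum(squeeze @ w1, 0.0)             # ReLU bottleneck
    gates = 1.0 / (1.0 + np.exp(-(hidden @ w2)))       # sigmoid in (0, 1)
    return feature_map * gates, gates

rng = np.random.default_rng(0)
fmap = rng.random((8, 8, 4))                           # H x W x C features
out, g = squeeze_excite(fmap,
                        rng.standard_normal((4, 2)),   # squeeze weights
                        rng.standard_normal((2, 4)))   # excite weights
```

The bottleneck keeps the block cheap, which is why SE blocks pair well with depthwise separable convolutions in lightweight models.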
ARTICLE | doi:10.20944/preprints202304.0996.v1
Subject: Biology And Life Sciences, Biology And Biotechnology Keywords: Convolutional Neural Network; Deep Learning; Photoplethysmography; Respiratory Rate; Time Series
Online: 26 April 2023 (13:17:24 CEST)
Respiratory rate is an important biomarker that indicates changes in the clinical condition of critically ill patients, so a surveillance tool that can accurately monitor the changing respiratory rate in real time is needed. After investigating various machine learning models, we propose a new model for real-time respiratory rate estimation from the photoplethysmogram. A new photoplethysmogram-driven respiratory rate dataset (StMary) was collected in the surgical intensive care unit of a tertiary referral hospital using a photoplethysmogram signal collector. For 50 patients and 50 healthy volunteers, a 2-minute photoplethysmogram was collected twice per subject. To evaluate a subject's respiratory rate, the signal was input into the deep neural network model we built; the dataset was split into training, validation, and testing sets, and 4-fold cross validation was used. Our deep neural network model, trained on StMary and the two public datasets (BIDMC and CapnoBase) individually or on selectively merged datasets, showed low error rates in respiratory rate measurement. The model trained on StMary achieved a low mean absolute error (1.0273±0.8965), and the model trained on all three datasets (CapnoBase, BIDMC and StMary) achieved a lower error rate (1.7359±1.6724) than the model trained on CapnoBase and BIDMC alone (1.9480±1.6751). We verified the performance of the model in estimating respiratory rate from the photoplethysmogram, and our dataset can serve as clinical research data supporting artificial intelligence models that estimate respiratory rate and tools that test whether monitoring functions work properly.
ARTICLE | doi:10.20944/preprints202211.0437.v3
Subject: Engineering, Civil Engineering Keywords: deep neural network; long short-term memory; suspended sediment; discharge
Online: 16 December 2022 (08:08:08 CET)
The dynamics of suspended sediment involves inherent non-linearity and complexity as a result of the presence of both spatial variability of the basin characteristics and temporal climatic patterns. As a result of this complexity, the conventional sediment rating curve (SRC) and other empirical methods produce inaccurate predictions. Deep neural networks (DNNs) have emerged as one of the advanced modeling techniques capable of addressing inherent non-linearity in hydrological processes over the last few decades. DNN algorithms are used to perform predictive analysis and investigate the interdependencies among the most pivotal water quantity and quality parameters i.e., discharge, suspended sediment concentration (SSC), and turbidity. In this study, the Long short-term memory (LSTM) algorithm of DNNs is used to model the discharge-suspended sediment relationship for the Stony Clove Creek. The simulations were run using primary data on discharge, SSC and turbidity. For the development of the DNN models and examining the effects of input vectors, combinations of different input vectors (namely discharge, and SSC) for the current and previous days are considered. Furthermore, a suitable modelling approach with an appropriate model input structure is suggested based on model performance indices for the training and testing phases. The performance of developed models is assessed using statistical indices such as root mean square error (RMSE), mean absolute error (MAE), and coefficient of determination (R2). Statistically, the performance of DNN-based models in simulating the daily SSC performed well with observed sediment concentration series data. The study demonstrates the suitability of the DNN approach for simulation and estimation of daily SSC, opening up new research avenues for applying hybrid soft computing models in hydrology.
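The three performance indices used to assess the sediment models have standard definitions, shown below on a toy observed/simulated pair.

```python
import math

def rmse_mae_r2(obs, sim):
    """Root mean square error, mean absolute error, and coefficient of
    determination (R^2) between observed and simulated series."""
    n = len(obs)
    mean_obs = sum(obs) / n
    ss_res = sum((o - s) ** 2 for o, s in zip(obs, sim))
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(o - s) for o, s in zip(obs, sim)) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return rmse, mae, 1.0 - ss_res / ss_tot

# Toy daily SSC series: observed vs. simulated.
rmse, mae, r2 = rmse_mae_r2([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
```

R^2 compares the residual error against the variance of the observations, so a value near 1 means the model explains most of the observed variability.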
REVIEW | doi:10.20944/preprints202206.0167.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: deep learning; convolutional neural network; brain tumor classification; clinical application
Online: 13 June 2022 (04:57:42 CEST)
Deep learning has shown remarkable results in every field, especially the biomedical field, due to its ability to exploit large-scale datasets. A convolutional neural network (CNN) is a widely used deep learning approach for medical imaging problems. Over the past few years, many studies have focused on CNN-based techniques for brain tumor diagnosis. There are, however, still some critical challenges that CNNs face on the way to clinical application. This study presents a comprehensive review of the current literature on CNN architectures for brain tumor classification. We compare the key achievements in the performance evaluation metrics of the applied classification algorithms. In addition, this review assesses the clinical effectiveness of the included studies to elaborate on the limitations of, and directions for, future work in this area. No review focusing on the clinical effectiveness of previous works in this field has been published. We believe that this study has the potential to elevate the application of CNN-based deep learning methods in clinical practice and can also serve as a quick reference for biomedical researchers interested in this field.
ARTICLE | doi:10.20944/preprints202302.0086.v2
Subject: Engineering, Civil Engineering Keywords: Deep neural network; long short-term memory; water quality; discharge; stream-water
Online: 17 April 2023 (07:21:31 CEST)
Multivariate predictive analysis of stream-water (SW) parameters (discharge, water level, temperature, dissolved oxygen, pH, turbidity, and specific conductance) is a pivotal task in water resource management during an era of rapid climate change. The highly dynamic and evolving nature of meteorological and climatic features has a significant impact on the temporal distribution of the SW variables, making SW variable forecasting even more complicated for diverse water-related issues. To predict the SW variables, various physics-based numerical models are used with numerous hydrologic parameters, and extensive lab-based investigation and calibration are required to reduce the uncertainty involved in those parameters. In the age of data-informed analysis and prediction, however, several deep learning algorithms have shown satisfactory performance in dealing with sequential data. In this research, comprehensive exploratory data analysis (EDA) and feature engineering were performed to prepare the dataset and obtain the best performance from the predictive model. A Long Short-Term Memory (LSTM) neural network regression model was trained on several years of daily data to predict the SW variables up to one week ahead of time (lead time) with satisfactory performance. The performance of the proposed model was found to be highly adequate through comparison of the predicted data with observed data, visualization of the error distributions, and a set of error metrics. Higher performance was achieved by increasing the number of epochs and tuning the hyperparameters. With proper feature engineering and optimization, this model can be transferred to other locations for univariate predictive analysis and can potentially be used for real-time SW variable prediction.
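The lead-time setup implies a windowing step during feature engineering: each training sample pairs a block of past daily observations with the target value some days ahead. The window sizes below are illustrative, not the paper's.

```python
import numpy as np

def make_windows(series, lookback, lead):
    """Pair `lookback` consecutive daily observations with the target
    value `lead` days ahead, producing (samples, lookback) inputs for a
    sequence model such as an LSTM."""
    X, y = [], []
    for i in range(len(series) - lookback - lead + 1):
        X.append(series[i:i + lookback])
        y.append(series[i + lookback + lead - 1])
    return np.array(X), np.array(y)

daily = np.arange(30, dtype=float)         # stand-in for one SW variable
X, y = make_windows(daily, lookback=10, lead=7)   # predict one week ahead
```

Each row of X would be further stacked with the other SW variables to form the multivariate input sequence.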
ARTICLE | doi:10.20944/preprints202309.0702.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Potato Leaf Diseases Detection; Classification; Transfer learning; Convolution Neural Network; Deep learning; Pre-trained Models
Online: 12 September 2023 (02:52:13 CEST)
Agriculture is one of the indispensable fields for the survival of mankind, and potatoes play a significant role in it. The quality and quantity of potatoes are significantly impacted by several diseases, such as early blight and late blight, and manual interpretation of these leaf diseases is time-consuming and inconvenient. Fortunately, leaf appearance can be used to detect diseases in potato plants, and productivity rises considerably if infections are detected early. Different image processing and machine learning methods are used for early recognition of these diseases from leaf images so that product losses are decreased significantly. Owing to its remarkable performance, the CNN is the most popular deep learning method for recognizing leaf diseases from images. Several pretrained deep learning models, such as VGG16, ResNet50, InceptionV3, MobileNetV2, and Xception, together with a deep learning model developed using a CNN, are employed for potato leaf disease classification and recognition on the same dataset. The transfer learning technique is applied to the pretrained models, and data augmentation is applied to the proposed CNN model for potato disease classification from leaf samples. Compared to the pretrained models, the proposed CNN model offers the lowest loss and highest accuracy for potato leaf disease detection while using fewer parameters and layers, achieving the best performance with a test accuracy of 99.33% compared to the other pretrained models used in the diagnosis of potato leaf disease.
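The data augmentation applied to the proposed CNN can be sketched with simple label-preserving transforms; the exact transform set used in the paper is not specified, so random flips and 90-degree rotations below are an illustrative assumption.

```python
import numpy as np

def augment(img, seed=0):
    """Label-preserving augmentation for leaf images: random horizontal
    and vertical flips plus a random 90-degree rotation. Every output
    contains exactly the same pixels as the input, rearranged."""
    rng = np.random.default_rng(seed)
    out = np.asarray(img)
    if rng.random() < 0.5:
        out = out[:, ::-1]                 # horizontal flip
    if rng.random() < 0.5:
        out = out[::-1, :]                 # vertical flip
    return np.rot90(out, k=int(rng.integers(0, 4)))

leaf = np.arange(16).reshape(4, 4)         # stand-in for a leaf image
aug = augment(leaf)
```

Applying such transforms on the fly effectively multiplies the training set, which helps the small custom CNN compete with the large pretrained models.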
ARTICLE | doi:10.20944/preprints201812.0258.v1
Subject: Chemistry And Materials Science, Surfaces, Coatings And Films Keywords: copper; polymer coatings; polyvinyl alcohol; silver nanoparticles; deep learning; CNN
Online: 21 December 2018 (07:51:06 CET)
To design effective protective coatings against corrosion, polyvinyl alcohol (PVA), both alone and as a composite with silver nanoparticles (nAg/PVA), was electrodeposited on copper surfaces employing electrochemical techniques such as linear potentiometry and cyclic voltammetry. A new paradigm was used to distinguish the features of the coatings: a deep Convolutional Neural Network (CNN) was implemented to automatically and hierarchically extract discriminative characteristics from optical microscopy images. The main arguments for a CNN implementation in the surface science of materials are the following: artificial intelligence techniques can be successfully applied to learn differences between surface coatings; given their popularity for image processing, CNNs can model images related to the problem of coatings; and deep learning is able to extract features that distinguish material surfaces. To provide an overview of the copper surface, the CNN was applied to microscope slides (CNN@microscopy) and inherently learnt distinctive characteristics for each class of surface morphology. This CNN-based analysis of surface morphology, free of human interference, was used in our study to extract the similarities and differences between unprotected and protected surfaces and to establish how well PVA and nAg/PVA retard copper corrosion.
ARTICLE | doi:10.20944/preprints202001.0283.v3
Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: Autonomous vehicle; Self-driving; Real Driving Behavior; Deep Neural Network; LSV-DNN
Online: 30 November 2020 (11:16:54 CET)
Considering the significant advancements in autonomous vehicle technology, research in this field is of great interest. To drive vehicles autonomously, control of the steering angle, throttle, and brakes must be learned. The behavioral cloning method is used to imitate human driving behavior: we created a dataset of driving on different routes and in different conditions, and the designed model produces the outputs used to control the vehicle. In this paper, Learning of Self-driving Vehicles Based on Real Driving Behavior Using Deep Neural Network Techniques (LSV-DNN) is proposed. We designed a convolutional network that uses real driving data obtained through the vehicle's camera and computer. The driver's response is recorded in different situations, and by converting the real driving video to images and transferring the data to an Excel file, obstacle detection is carried out with high accuracy and speed using version 3 of the YOLO algorithm. This way, the network learns the driver's response to obstacles in different locations; it is trained with the YOLO version 3 detections and outputs the steering angle and the amounts of brake, gas, and vehicle acceleration. LSV-DNN is evaluated here via extensive simulations carried out in a Python and TensorFlow environment. We evaluated the network error using the loss function and, by comparison with other methods that were conducted on simulator data, obtained good performance results for the designed network on the KITTI benchmark data, data collected using a private vehicle, and the data we collected ourselves.
Subject: Engineering, Automotive Engineering Keywords: virtual sensor; automotive control; active suspension; vehicle state estimation; neural networks; deep learning; long-short term memory; sequence regression
Online: 24 September 2021 (12:42:07 CEST)
With the automotive industry moving towards automated driving, sensing is an increasingly important enabling technology. Virtual sensors allow data fusion from various vehicle sensors and provide a prediction for a quantity that is hard or too expensive to measure directly, or that demands continuous detection. In this paper, virtual sensing is discussed for vehicle suspension control, where information about the relative velocity of the unsprung mass at each vehicle corner is required. The corresponding goal can be identified as a regression task with multi-channel sequence input. The hypothesis is that the state-of-the-art Bidirectional Long Short-Term Memory (BiLSTM) method can solve it. In this paper, a virtual sensor is proposed and developed by training a neural network model. Simulations were performed using an experimentally validated full-vehicle model in IPG CarMaker; these provided the reference data used for neural network (NN) training. An extensive dataset covering 26 scenarios was used to obtain training, validation, and testing data. Bayesian search was used to select the best neural network structure, with root mean square error as the metric. The best network is made of 167 BiLSTM units, 256 fully connected hidden units, and 4 output units. Error histograms and spectral analysis of the predicted signal against the reference signal are presented. The results demonstrate the good applicability of neural-network-based virtual sensors for estimating vehicle unsprung mass relative velocity.
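The core idea of the bidirectional layer can be shown with plain tanh RNN cells for brevity (the paper uses LSTM cells): run the sequence forwards and backwards and concatenate both final states, so the regression head sees past and future context. Dimensions and weights below are illustrative.

```python
import numpy as np

def simple_rnn(seq, w_in, w_rec):
    """One tanh RNN pass over a sequence; returns the final hidden state."""
    h = np.zeros(w_rec.shape[0])
    for x in seq:
        h = np.tanh(x @ w_in + h @ w_rec)
    return h

def bidirectional_encode(seq, w_in, w_rec):
    """Bidirectional encoding: process the sequence in both directions
    and concatenate the two final states (shared weights here for
    simplicity; real BiLSTMs use separate forward/backward weights)."""
    fwd = simple_rnn(seq, w_in, w_rec)
    bwd = simple_rnn(seq[::-1], w_in, w_rec)
    return np.concatenate([fwd, bwd])

rng = np.random.default_rng(1)
seq = rng.standard_normal((20, 3))          # 20 time steps, 3 sensor inputs
code = bidirectional_encode(seq,
                            rng.standard_normal((3, 8)) * 0.3,
                            rng.standard_normal((8, 8)) * 0.3)
```

A fully connected head (256 units in the paper's best network) would map this code to the unsprung-mass relative velocity.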
ARTICLE | doi:10.20944/preprints202309.1174.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Design Right Infringement; Deep Learning; Ensemble Learning; Image Classification; Object Detection; Large-Scale Detection System
Online: 19 September 2023 (03:03:46 CEST)
This paper presents a two-stage hierarchical neural network that uses image classification and object detection algorithms as key building blocks of a system that automatically detects potential design right infringement. The network is trained to return the Top-N original design right records that most resemble an input image of a counterfeit. Design rights specify the unique aesthetic characteristics of a product, and because trends change rapidly, new design rights are continuously generated. This work proposes the Ensemble Neural Network (ENN), an artificial neural network model that aims to deal with a large amount of counterfeit data and design right records that are frequently added and deleted. First, we performed image classification and object detection learning per design right using existing models with a proven track record of high accuracy. These distributed models form the backbone of the ENN and yield intermediate results aggregated by a master neural network, a deep residual network paired with a fully connected network. This ensemble layer is trained to determine the sub-models that return the best result for a given input image of a product. In the final stage, the ENN model multiplies the inferred similarity coefficients by the weighted input vectors produced by the individual sub-models to assess the similarity between the test input image and existing product design rights and detect any sign of violation. Given 84 design rights and sample product images taken meticulously under various conditions, our ENN model achieved average Top-1 and Top-3 accuracies of 98.409% and 99.460%, respectively. Upon introducing new design rights data, a partial update of the inference model was done an order of magnitude faster than with a single model, and ENN maintained a high level of accuracy as it scaled out to handle more design rights.
Therefore, the ENN model is expected to offer practical help to inspectors in the field, such as customs officers at the border who deal with a swarm of products.
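The reported Top-1/Top-3 figures follow the usual Top-N convention, sketched below: a query counts as correct if the true design-right record appears among the first N returned candidates. The record identifiers are made up for illustration.

```python
def top_n_accuracy(ranked_lists, true_ids, n):
    """Fraction of queries whose true record appears in the first `n`
    entries of the model's ranked candidate list."""
    hits = sum(1 for ranked, t in zip(ranked_lists, true_ids)
               if t in ranked[:n])
    return hits / len(true_ids)

# Three queries, each with a ranked candidate list (hypothetical IDs).
preds = [["d07", "d12", "d03"],
         ["d42", "d07", "d19"],
         ["d05", "d01", "d02"]]
truth = ["d07", "d19", "d09"]
top1 = top_n_accuracy(preds, truth, 1)   # only the first query hits
top3 = top_n_accuracy(preds, truth, 3)   # the second query now hits too
```

Top-3 is the more forgiving metric, which suits an inspection workflow where a human reviews the shortlist.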
ARTICLE | doi:10.20944/preprints202309.0480.v1
Subject: Engineering, Marine Engineering Keywords: Flow Control Fin (FCF); Deep Neural Network (DNN); Transfer Learning (TL); containership; viscous resistance coefficients; wake flow distributions
Online: 7 September 2023 (08:36:58 CEST)
In this study, deep neural network (DNN) and transfer learning (TL) techniques were employed to predict the viscous resistance and wake distribution based on the positions of flow control fins (FCFs) applied to containerships of various sizes. Both methods utilized data collected through Computational Fluid Dynamics (CFD) analysis. The FCF position and hull-form information were used as input data, and the output data included viscous resistance coefficients and components of propeller axial velocity. The base DNN model was trained and validated using a source dataset from a 1000 TEU containership, and a grid search cross-validation technique was employed to optimize its hyperparameters. Transfer learning was then applied to predict the viscous resistance and wake distribution for containerships of varying sizes. To enhance prediction accuracy with a limited dataset, the learning rate was optimized. Transfer learning involves retraining and reconfiguring the base DNN model, and its accuracy was verified using the fine-tuning method of the learning model. The results of this study can provide hull designers of containerships with performance evaluation information by predicting the wake distribution without relying on CFD analysis.
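The grid search cross-validation used to tune the base model can be sketched generically. This is a toy illustration, not the paper's pipeline: a closed-form ridge regression stands in for the DNN, and the penalty grid replaces the real hyperparameters.

```python
import numpy as np

def kfold_grid_search(X, y, alphas, k=3):
    """Toy grid-search cross-validation: for each candidate hyperparameter,
    average the validation MSE over k folds and keep the best setting."""
    idx = np.arange(len(y))
    folds = np.array_split(idx, k)
    best_alpha, best_mse = None, np.inf
    for a in alphas:
        fold_mse = []
        for f in folds:
            train = np.setdiff1d(idx, f)
            Xt, yt = X[train], y[train]
            # closed-form ridge fit on the training folds
            w = np.linalg.solve(Xt.T @ Xt + a * np.eye(X.shape[1]), Xt.T @ yt)
            fold_mse.append(np.mean((X[f] @ w - y[f]) ** 2))
        if np.mean(fold_mse) < best_mse:
            best_alpha, best_mse = a, np.mean(fold_mse)
    return best_alpha, best_mse

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
y = X @ np.array([1.0, -2.0])            # noiseless linear target
alpha, mse = kfold_grid_search(X, y, alphas=[1e-6, 10.0])
```

On noiseless data the weakest penalty wins, which is the expected behavior of the selection loop.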
ARTICLE | doi:10.20944/preprints202210.0059.v1
Subject: Engineering, Control And Systems Engineering Keywords: Artificial Intelligence; Cybersecurity; Remote Control; Fake Signals; Replay Attack; Deep Learning; ResNet50; Transfer Learning
Online: 6 October 2022 (09:16:56 CEST)
Keyless systems have replaced the old-fashioned method of inserting a physical key into the keyhole to, for example, unlock the door, because physical keys are inconvenient and easy for threat actors to exploit. Keyless systems use radio frequency (RF) technology as an interface to transmit signals from the key fob to the vehicle. However, keyless systems are susceptible to compromise by a threat actor who intercepts the transmitted signal and performs a replay attack. In this paper, we propose a transfer learning-based model to identify replay attacks launched against remote keyless controlled vehicles. Specifically, the system uses a pre-trained ResNet50 deep neural network to classify the wireless remote signals used to lock or unlock the doors of a remotely controlled vehicle system. The signals are classified into three classes: real signal, fake signal with high gain, and fake signal with low gain. We trained our model for 100 epochs (3,800 iterations) on KeFRA 2022, a modern dataset. The model recorded a final validation accuracy of 99.71% and a final validation loss of 0.29% at a low inference time of 50 ms using the SGD solver. The experimental evaluation demonstrated the strong performance of the proposed model.
ARTICLE | doi:10.20944/preprints202103.0302.v2
Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: Searaser; Flow-3D; Prediction; Long short term memory; deep neural network; Root mean error.
Online: 13 April 2021 (09:51:25 CEST)
Accurate forecasts of ocean wave energy can not only reduce investment costs but are also essential for the management and operation of electrical power. This paper presents an innovative approach based on Long Short-Term Memory (LSTM) networks to predict the power generation of an economical wave energy converter named "Searaser". The data for the analysis were obtained by collecting experimental data from another study and data extracted from a numerical simulation of Searaser. The simulation was performed with the Flow-3D software, which is highly capable of analyzing fluid-solid interactions. The relation between wind speed and output power has not been established in previous studies and needs to be investigated; therefore, in this study, wind speed and output power are related using an LSTM method. Moreover, the results show that the LSTM network can predict power as a function of wave height more accurately and faster than the numerical solution. The network outputs show good agreement with the reference data, with a mean root mean square error of 0.49 for the LSTM method. Furthermore, a mathematical relation between the generated power and the wave height was derived by fitting a power function to the LSTM results.
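The closing step, fitting a power function to the predicted data, can be sketched as a least-squares fit in log-log space. The data below are synthetic placeholders, not the paper's results.

```python
import numpy as np

def fit_power_law(h, p):
    """Fit p = a * h**b by linear least squares on log(p) vs log(h),
    the standard way to fit a power function to positive data."""
    b, log_a = np.polyfit(np.log(h), np.log(p), 1)   # slope = exponent
    return np.exp(log_a), b

h = np.array([1.0, 2.0, 3.0, 4.0])   # hypothetical wave heights
p = 2.5 * h ** 1.8                   # synthetic, exactly power-law power values
a, b = fit_power_law(h, p)
```

On exact power-law data the fit recovers the generating coefficients, which is a quick sanity check before applying it to noisy model outputs.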
ARTICLE | doi:10.20944/preprints202306.0942.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: deep learning; partial discharge; convolutional neural network; medium voltage switchgear; air-insulated switchgear; autoencoder; long short-term memory
Online: 13 June 2023 (13:55:13 CEST)
The correct classification of defects originating from partial discharges (PD) in medium-voltage (MV) switchgears with air insulation (AIS) remains a challenging research topic for scientists worldwide. In this article, the authors simulated four possible defects occurring in the power industry, including one that is a simultaneous combination of two common ones. In addition, the correctness of the algorithm was checked by adding a classification class without any fault. The measurement signals were recorded with TEV sensors. The effectiveness of various hybrid-connected neural networks was tested and discussed: GoogleNet and SqueezeNet based on spectrograms, SAE with FNN, 2D-CNN with LSTM, and a hybrid AE combined with CNN and LSTM. The highest effectiveness, approximately 97%, was demonstrated by the GoogleNet and SqueezeNet networks. The research results are expected to form the basis for the development of a universal and wireless capacitive sensor for monitoring the level of PD in switchgears.
ARTICLE | doi:10.20944/preprints202009.0516.v1
Subject: Computer Science And Mathematics, Computational Mathematics Keywords: gully erosion susceptibility; deep learning neural network; particle swarm optimization; Shiran watershed
Online: 22 September 2020 (09:48:07 CEST)
This study evaluates a new approach to modeling gully erosion susceptibility based on a deep learning neural network (DLNN) model and an ensemble of the particle swarm optimization (PSO) algorithm with DLNN (PSO-DLNN), comparing these approaches with common artificial neural network (ANN) and support vector machine (SVM) models in the Shiran watershed, Iran. For this purpose, 13 independent variables affecting gully erosion susceptibility in the study area were prepared: altitude, slope, aspect, plan curvature, profile curvature, drainage density, distance from river, land use, soil, lithology, rainfall, stream power index (SPI), and topographic wetness index (TWI). In addition, 132 gully erosion locations were identified during field visits. The data for modeling were divided into training (70%) and testing (30%) categories. Receiver operating characteristic (ROC) parameters, including sensitivity, specificity, negative predictive value (NPV), positive predictive value (PPV), and area under the curve (AUC), were used to evaluate the performance of the models. The results show that the AUC value on the testing dataset for PSO-DLNN is 0.89, indicating excellent accuracy. The remaining models also achieved good accuracy, close to that of the PSO-DLNN model; the AUC values on the testing datasets for DLNN, SVM, and ANN are 0.87, 0.85, and 0.84, respectively. The PSO algorithm updated and optimized the weights of the DLNN model and, as a result, increased the model's efficiency in predicting gully erosion susceptibility. Therefore, it can be concluded that the DLNN model and its ensemble with the PSO algorithm can serve as a novel and practical method for predicting gully erosion susceptibility, helping planners and managers to manage and reduce the risk of this phenomenon.
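The PSO update loop that tunes the DLNN weights can be sketched in miniature. This is a bare-bones illustration, not the authors' implementation: it minimizes a toy objective, and the inertia/acceleration constants are conventional defaults, not the paper's settings.

```python
import numpy as np

def pso_minimize(f, dim, n_particles=20, iters=100, seed=0):
    """Minimal particle swarm optimization: particles track their personal
    best and the global best, and velocities pull them toward both."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5, 5, (n_particles, dim))   # candidate solutions (e.g. weights)
    v = np.zeros_like(x)                         # velocities
    pbest = x.copy()
    pbest_val = np.array([f(p) for p in x])
    gbest = pbest[pbest_val.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        v = 0.7 * v + 1.5 * r1 * (pbest - x) + 1.5 * r2 * (gbest - x)
        x = x + v
        vals = np.array([f(p) for p in x])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        gbest = pbest[pbest_val.argmin()].copy()
    return gbest, pbest_val.min()

# Toy objective: a quadratic bowl with its minimum at (3, 3).
best, val = pso_minimize(lambda w: np.sum((w - 3.0) ** 2), dim=2)
```

In the paper's setting, `f` would be the DLNN's training loss evaluated at a candidate weight vector.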
ARTICLE | doi:10.20944/preprints202209.0398.v1
Subject: Engineering, Civil Engineering Keywords: river discharge; hydro informatics; water resource; data-driven; deep learning; LSTM
Online: 26 September 2022 (11:30:24 CEST)
River flow prediction is a pivotal task in water resource management in an era of rapid climate change. The highly dynamic and evolving nature of climatic variables, e.g., precipitation, has a significant impact on the temporal distribution of river discharge, making discharge forecasting even more complicated for diverse water-related issues such as flood prediction and irrigation planning. To predict discharge, various physics-based numerical models are employed using numerous hydrologic parameters, and extensive lab-based investigation and calibration are required to reduce the uncertainty involved in those parameters. However, in the age of data-driven predictions, several deep learning algorithms have shown satisfactory performance in dealing with sequential data. In this research, a Long Short-Term Memory (LSTM) neural network regression model is trained on over 80 years of daily data to forecast the discharge time series up to 3 days ahead. The performance of the model is found to be satisfactory through comparison of the predicted data with the observed data, visualization of the error distribution, and a Root Mean Squared Error (RMSE) value of 0.09. Higher performance is achieved by increasing the number of epochs and tuning the hyperparameters. This model can be transferred to other locations with proper feature engineering and optimization to perform univariate predictive analysis and can potentially be used for real-time river discharge prediction.
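Turning a long daily series into supervised (window, target) pairs for an n-days-ahead forecast is the standard preprocessing step for such a model. The sketch below uses placeholder window sizes, not the paper's configuration.

```python
import numpy as np

def make_windows(series, lookback, horizon):
    """Build (input window, target) pairs: each window of `lookback` days
    is paired with the value `horizon` days after the window ends."""
    X, y = [], []
    for t in range(len(series) - lookback - horizon + 1):
        X.append(series[t:t + lookback])
        y.append(series[t + lookback + horizon - 1])
    return np.array(X), np.array(y)

series = np.arange(10.0)             # stand-in for decades of daily discharge
X, y = make_windows(series, lookback=4, horizon=3)
# X[0] = [0, 1, 2, 3] is paired with y[0] = 6, three days past the window
```

The resulting arrays feed directly into any sequence regressor, LSTM or otherwise.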
ARTICLE | doi:10.20944/preprints202209.0231.v1
Subject: Computer Science And Mathematics, Probability And Statistics Keywords: neural networks; regularization; deep networks
Online: 15 September 2022 (13:06:13 CEST)
Numerous approaches address over-fitting in neural networks: imposing a penalty on the parameters of the network (L1, L2, etc.); changing the network stochastically (drop-out, Gaussian noise, etc.); or transforming the input data (batch normalization, etc.). In contrast, we aim to ensure that a minimum amount of supporting evidence is present when fitting the model parameters to the training data. At the single-neuron level, this is equivalent to ensuring that both sides of the separating hyperplane (for a standard artificial neuron) have a minimum number of data points, noting that these points need not belong to the same class for the inner layers. We first benchmark the results of this approach on the standard Fashion-MNIST dataset, comparing it to various regularization techniques. Interestingly, we note that by nudging each neuron to divide, at least in part, its input data, the resulting networks make use of every neuron, avoiding hyperplanes that lie completely on one side of their input data (which is equivalent to feeding a constant into the next layers). To illustrate this point, we study the prevalence of saturated nodes throughout training, showing that neurons are activated more frequently and earlier in training when using this regularization approach. A direct consequence of the improved neuron activation is that deep networks become easier to train. This is crucially important when the network topology is not known a priori and fitting often remains stuck in a suboptimal local minimum. We demonstrate this property by training networks of increasing depth (and constant width): most regularization approaches result in increasingly frequent training failures (over different random seeds), whilst the proposed evidence-based regularization significantly outperforms them in its ability to train deep networks.
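The single-neuron condition can be made concrete by counting training points on each side of the neuron's hyperplane. The exact penalty form below is illustrative only; the paper's formulation may differ.

```python
import numpy as np

def evidence_penalty(X, w, b, min_support=2):
    """Count how far a neuron falls short of having `min_support` training
    points on each side of its hyperplane w.x + b = 0; zero means the
    neuron divides its input data with enough evidence on both sides."""
    side = X @ w + b > 0
    pos, neg = side.sum(), (~side).sum()
    return max(0, min_support - pos) + max(0, min_support - neg)

X = np.array([[0.0], [1.0], [2.0], [3.0]])       # four 1-D training points
short = evidence_penalty(X, w=np.array([1.0]), b=-2.5)    # split at x = 2.5
balanced = evidence_penalty(X, w=np.array([1.0]), b=-1.5) # split at x = 1.5
```

The split at 2.5 leaves only one point above the hyperplane, so it is short by one; the split at 1.5 has two points on each side and incurs no penalty.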
ARTICLE | doi:10.20944/preprints202306.2086.v1
Subject: Engineering, Industrial And Manufacturing Engineering Keywords: laser beam butt welding (LBW); Joint-gap formation; AE-analysis; Non-destructive testing (NDT); Deep learning; Audible to Ultrasonic sensors
Online: 29 June 2023 (10:14:05 CEST)
This study explored the feasibility of using airborne acoustic emission in laser beam butt welding to develop an automated classification system based on neural networks. The focus was on monitoring the formation of joint gaps during the welding process. To simulate various sizes of butt joint gaps, controlled welding experiments were conducted, and the emitted acoustic signals were captured using audible-to-ultrasonic microphones. To implement an automated monitoring system, a method based on short-time Fourier transformation was developed to extract audio features, and a convolutional neural network architecture with data augmentation was utilized. The results demonstrated that this non-destructive and non-invasive approach was highly effective in detecting joint gap formation, achieving an accuracy of 98%. Furthermore, the system exhibited promising potential for low-latency monitoring of the welding process. The classification accuracy for various gap sizes reached up to 90%, providing valuable insights for accurately characterizing and categorizing joint gaps. Additionally, increasing the quantity of training data with quality annotations could further improve the classifier's performance, suggesting room for future enhancement.
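The short-time Fourier feature extraction mentioned above can be sketched with a plain FFT. Frame length, hop, and the test tone are placeholders, not the study's acquisition settings.

```python
import numpy as np

def stft_features(signal, frame_len=64, hop=32):
    """Magnitude spectrogram via a short-time Fourier transform: slide a
    Hann-windowed frame along the signal and take the FFT magnitude of
    each frame, yielding the kind of 2-D input a CNN classifier consumes."""
    window = np.hanning(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len] * window
        frames.append(np.abs(np.fft.rfft(frame)))
    return np.array(frames)          # shape: (n_frames, frame_len // 2 + 1)

t = np.arange(512) / 512.0
spec = stft_features(np.sin(2 * np.pi * 32 * t))   # pure 32-cycle test tone
```

For this tone, 4 cycles fit in each 64-sample frame, so every frame's spectrum peaks at frequency bin 4, an easy way to verify the transform before feeding real weld audio.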
ARTICLE | doi:10.20944/preprints202109.0130.v1
Subject: Environmental And Earth Sciences, Environmental Science Keywords: machine learning; deep learning; calibration; air quality; low-cost sensors; exposure assessment
Online: 7 September 2021 (14:24:56 CEST)
Although commercially available low-cost air quality sensors have low accuracy, such sensor systems are being used to collect data for the regulation of PM2.5 emissions caused by industrial activities and to estimate personal exposure to PM2.5. In this work, to solve the accuracy problem of low-cost PM sensors, we developed a new PM2.5 calibration model by combining a deep neural network (DNN) optimized for the calibration problem with an LSTM optimized for time-dependent characteristics. First, two datasets were generated to test the accuracy and generalization performance of the PM2.5 calibration machine learning (ML) model. PM2.5 concentrations, temperature, and humidity were sampled for a sufficiently long time by a low-cost sensor and a gravimetric PM2.5 measuring instrument. The proposed model was compared with a benchmark (multiple linear regression model) and the raw low-cost sensor results. In terms of root mean square error (RMSE) for PM2.5 concentrations, the proposed model reduced the error by 41-60% compared to the raw data of the low-cost sensor, and by 30-51% compared to the benchmark model. The R2 values of the ML model, the MLR model, and the raw data were 93%, 80%, and 59%, respectively. The developed model also showed consistent calibration performance when calibrated with new sensors in different locations. Low-cost sensors combined with the ML model can not only improve on the calibration performance of the benchmark but can also be applied to sensor monitoring systems for various epidemiologic investigations and regulatory decisions.
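The multiple-linear-regression benchmark can be sketched as a least-squares fit of the reference PM2.5 on the raw sensor reading, temperature, and humidity. Variable names and the synthetic coefficients are illustrative, not the study's data.

```python
import numpy as np

def calibrate_mlr(raw_pm, temp, rh, reference):
    """Benchmark-style calibration: fit reference PM2.5 as a linear
    combination of the raw sensor reading, temperature, and humidity."""
    X = np.column_stack([np.ones_like(raw_pm), raw_pm, temp, rh])
    coef, *_ = np.linalg.lstsq(X, reference, rcond=None)
    return coef                      # [intercept, pm, temp, rh] coefficients

raw  = np.array([10.0, 20.0, 30.0, 40.0])
temp = np.array([15.0, 25.0, 20.0, 30.0])
rh   = np.array([40.0, 50.0, 45.0, 70.0])
ref  = 2.0 + 0.8 * raw + 0.1 * temp - 0.05 * rh   # synthetic "gravimetric" truth
coef = calibrate_mlr(raw, temp, rh, ref)
```

On this exactly linear synthetic data the fit recovers the generating coefficients; the paper's DNN-LSTM model replaces this linear map with a nonlinear, time-aware one.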
ARTICLE | doi:10.20944/preprints202309.1202.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: speech emotion recognition; deep learning; Deep Belief Network; deep neural network; Convolutional Neural Network; LSTM; attention mechanism
Online: 19 September 2023 (08:24:22 CEST)
Speech Emotion Recognition (SER) is an interesting and difficult problem. In this paper, we address it through the implementation of deep learning networks. We have designed and implemented six different deep learning networks: a Deep Belief Network (DBN), a simple deep neural network (SDNN), an LSTM network (LSTM), an LSTM network with an attention mechanism (LSTM-ATN), a convolutional neural network (CNN), and a convolutional neural network with an attention mechanism (CNN-ATN). Beyond solving the SER problem, our aim was to test the impact of the attention mechanism on the results. Dropout and batch normalization techniques are also used to improve the generalization ability of the models (preventing overfitting) and to speed up the training process. The Surrey Audio-Visual Expressed Emotion (SAVEE) database and the Ryerson Audio-Visual Database (RAVDESS) were used for training and evaluating our models. The results showed that the networks with the attention mechanism outperformed the others. Furthermore, CNN-ATN was the best among the tested networks, achieving an accuracy of 74% on SAVEE and 77% on RAVDESS, exceeding existing state-of-the-art systems on the same datasets.
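The attention mechanism appended to the LSTM and CNN models can be sketched as softmax-weighted pooling over a sequence of hidden states. This is a generic dot-product attention sketch with made-up numbers; the paper's exact attention variant may differ.

```python
import numpy as np

def attention_pool(H, q):
    """Dot-product attention over hidden states H (T x d): score each
    timestep against a query vector q, softmax the scores, and return
    the weighted sum (context vector) plus the attention weights."""
    scores = H @ q
    scores = scores - scores.max()                  # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ H, weights

H = np.array([[1.0, 0.0], [0.0, 1.0], [10.0, 0.0]])   # three hidden states
ctx, w = attention_pool(H, q=np.array([1.0, 0.0]))
# the third state scores highest, so its weight dominates the context vector
```

In the SER models, `q` would be learned jointly with the network, letting it emphasize the emotionally salient frames of an utterance.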
ARTICLE | doi:10.20944/preprints201903.0039.v2
Subject: Engineering, Control And Systems Engineering Keywords: Handwritten digit recognition; Convolutional Neural Network (CNN); Deep learning; MNIST dataset; Epochs; Hidden Layers; Stochastic Gradient Descent; Backpropagation
Online: 20 September 2019 (10:12:26 CEST)
In recent times, with the rise of the Artificial Neural Network (ANN), deep learning has brought a dramatic twist to the field of machine learning by making it more like Artificial Intelligence (AI). Deep learning is used remarkably widely because of its diverse range of applications, such as surveillance, health, medicine, sports, robotics, and drones. Within deep learning, the Convolutional Neural Network (CNN) is at the center of spectacular advances that mix Artificial Neural Networks (ANN) with up-to-date deep learning strategies. It has been used broadly in pattern recognition, sentence classification, speech recognition, face recognition, text categorization, document analysis, scene recognition, and handwritten digit recognition. The goal of this paper is to observe how the accuracy of a CNN classifying handwritten digits varies with different numbers of hidden layers and epochs, and to compare the resulting accuracies. For this performance evaluation of the CNN, we performed our experiment using the Modified National Institute of Standards and Technology (MNIST) dataset. The network is trained using stochastic gradient descent and the backpropagation algorithm.
ARTICLE | doi:10.20944/preprints202203.0288.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: computer-aided detection; convolutional neural network; COVID-19; deep learning; image classification
Online: 22 March 2022 (02:19:50 CET)
Chest radiography is one of the critical tools for early detection and subsequent evaluation of the incidence of lung diseases. This study presents a real-world implementation of a convolutional neural network (CNN) based Carebot Covid app to detect COVID-19 from chest X-ray (CXR) images. Our proposed model takes the form of a simple and intuitive application, and the CNN can be deployed as a STOW-RS prediction endpoint for direct implementation into DICOM viewers. The results of this study show that the deep learning model, based on DenseNet and ResNet architectures, can detect SARS-CoV-2 from CXR images with a precision of 0.981, a recall of 0.962, and an AP of 0.993.
ARTICLE | doi:10.20944/preprints202304.0022.v1
Subject: Engineering, Aerospace Engineering Keywords: non-destructive testing; deep learning; automated defect recognition (ADR); semantic segmentation; digital X-ray radiography
Online: 3 April 2023 (10:25:31 CEST)
In response to the growing inspection demand exerted by process automation in component manufacturing, non-destructive testing (NDT) continues to explore automated approaches that utilize deep learning algorithms for defect identification, including within digital X-ray radiography images. This necessitates a thorough understanding of how image quality parameters affect the performance of these deep learning models. This study investigates the influence of two image quality parameters, namely Signal-to-Noise Ratio (SNR) and Contrast-to-Noise Ratio (CNR), on the performance of a U-Net deep learning segmentation model. Input images were acquired with varying combinations of exposure factors, such as kilovoltage, milliamperage, and exposure time, which altered the resulting quality. The data were sorted into 5 different datasets according to their measured SNR and CNR values, and the deep learning model was trained 5 distinct times, utilizing a unique dataset for each training session. Training the model with high CNR values yielded an Intersection over Union (IoU) of 0.9594 on test data of the same category, but this dropped to 0.5875 when tested on lower-CNR test data. The results emphasize the importance of balancing the training dataset according to the investigated quality parameters to enhance the performance of deep learning segmentation models in NDT radiography applications.
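The IoU metric used to score the segmentation model is straightforward to compute on binary masks. The 2x2 masks below are toy examples; the edge-case convention for two empty masks is an assumption, not the study's.

```python
import numpy as np

def iou(pred, target):
    """Intersection over Union for binary segmentation masks:
    overlapping pixels divided by pixels covered by either mask."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return inter / union if union else 1.0   # two empty masks count as a match

a = np.array([[1, 1], [0, 0]])
b = np.array([[1, 0], [1, 0]])
# intersection = 1 pixel, union = 3 pixels, so IoU = 1/3
score = iou(a, b)
```

An IoU drop like the reported 0.9594 to 0.5875 therefore directly reflects shrinking overlap between predicted and true defect regions on the lower-CNR images.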
ARTICLE | doi:10.20944/preprints202210.0448.v1
Subject: Arts And Humanities, Art Keywords: computational creativity; deep learning; feature extraction; image analysis; machine perception; painting classification; residual networks; transfer learning
Online: 28 October 2022 (09:37:03 CEST)
With the increasing availability of large digitized fine art collections, automated analysis and classification of paintings is becoming an interesting area of research. However, due to domain specificity, implicit subjectivity, and pervasive nuances that vaguely separate art movements, analyzing art using machine learning techniques poses significant challenges. Residual networks, or variants thereof, are one of the most popular tools for image classification tasks and can extract relevant features for well-defined classes. In this case study, we focus on the classification of a selected painting, 'Portrait of the Painter Charles Bruni' by Johann Kupetzky, and analyze the performance of the proposed classifier. We show that the features extracted during residual network training can be useful for image retrieval within search systems in online art collections.
REVIEW | doi:10.20944/preprints202104.0421.v1
Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: non-intrusive load monitoring; load disaggregation; NILM; review; deep learning; deep neural networks; machine learning
Online: 15 April 2021 (15:05:09 CEST)
This paper reviews non-intrusive load monitoring (NILM) approaches that employ deep neural networks to disaggregate appliances from low frequency data, i.e. data with sampling rates lower than the AC base frequency. We first review the many degrees of freedom of these approaches, what has already been done in literature, and compile the main characteristics of the reviewed publications in an extensive overview table. The second part of the paper discusses selected aspects of the literature and corresponding research gaps. In particular, we do a performance comparison with respect to reported MAE and F1-scores and observe different recurring elements in the best performing approaches, namely data sampling intervals below 10 s, a large field of view, the usage of GAN losses, multi-task learning, and post-processing. Subsequently, multiple input features, multi-task learning and related research gaps are discussed, the need for comparative studies is highlighted, and finally, missing elements for a successful deployment of NILM approaches based on deep neural networks are pointed out. We conclude the review with an outlook on possible future scenarios.
ARTICLE | doi:10.20944/preprints202102.0318.v3
Subject: Medicine And Pharmacology, Immunology And Allergy Keywords: Machine Learning; Artificial Intelligence; Androgen Receptor; Random Forest; Deep Neural Network; Convolutional
Online: 24 February 2021 (13:14:01 CET)
Substances that can modify the androgen receptor pathway in humans and animals are entering the environment and food chain, with a proven ability to disrupt hormonal systems, leading to toxicity and adverse effects on reproduction, brain development, and prostate cancer, among others. State-of-the-art databases with experimental data on human, chimp, and rat effects of chemicals have been used to build machine learning classifiers and regressors and to evaluate them on independent sets. Different featurizations, algorithms, and protein structures lead to different results, with deep neural networks (DNNs) on user-defined, physicochemically relevant features developed for this work outperforming graph convolutional, random forest, and large featurizations. The results show that these user-provided structure-, ligand-, and statistically based features and specific DNNs provided the best results, as determined by AUC (0.87), MCC (0.47), and other metrics, as well as by their interpretability and the chemical meaning of the descriptors/features. In addition, the same features in the DNN method performed better than in a multivariate logistic model: validation MCC = 0.468 and training MCC = 0.868 for the present work, compared to evaluation set MCC = 0.2036 and training set MCC = 0.5364 for the multivariate logistic regression on the full, unbalanced set. Techniques of this type may improve AR and toxicity description and prediction, improving the assessment and design of compounds. Source code and data are available at https://github.com/AlfonsoTGarcia-Sosa/ML
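The MCC values reported above are computed from confusion-matrix counts, and because MCC balances all four cells it is well suited to the unbalanced sets mentioned. A minimal sketch (toy counts, not the paper's):

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts:
    +1 for perfect prediction, 0 for chance level, -1 for total disagreement."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

perfect = mcc(50, 50, 0, 0)   # a perfect classifier scores 1.0
chance = mcc(25, 25, 25, 25)  # chance-level predictions score 0.0
```

Unlike accuracy, a classifier that always predicts the majority class gets an MCC of 0 (the denominator vanishes), which is why the metric is informative on unbalanced toxicity data.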
ARTICLE | doi:10.20944/preprints202108.0011.v1
Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: Transformer; spike; neural decoding; CNN; RNN; LSTM; deep learning; information; neuroscience
Online: 2 August 2021 (09:51:43 CEST)
Neural decoding from spiking activity is an essential tool for understanding the information encoded in population neurons, especially in applications like the brain-computer interface (BCI). Various quantitative methods have been proposed and have shown advantages under different scenarios. From the machine learning perspective, the decoding task maps high-dimensional spatial and temporal neuronal activity to low-dimensional physical quantities (e.g., velocity, position). Because of the complex interactions and abundant dynamics among neural circuits, good decoding algorithms usually have the capability of capturing flexible spatiotemporal structures embedded in the input feature space. Recently, Transformer-based models have been widely used in processing natural language and images due to their superior performance in handling long-range and global dependencies. Hence, in this work we examine the potential applications of Transformers in neural decoding and introduce two Transformer-based models. Besides adapting the Transformer to neuronal data, we also propose a data augmentation method for overcoming the data shortage issue. We test our models on three experimental datasets, and their performance is comparable to previous state-of-the-art (SOTA) RNN-based methods. In addition, Transformer-based models show increased decoding performance when the input sequences are longer, while LSTM-based models deteriorate quickly. Our research suggests that Transformer-based models are important additions to the existing neural decoding solutions, especially for large datasets with long temporal dependencies.
ARTICLE | doi:10.20944/preprints202009.0524.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: COVID-19; chest X-ray images; deep convolutional neural network; COV-MCNet; deep learning
Online: 23 September 2020 (03:31:30 CEST)
The COVID-19 pandemic has made the quick identification and screening of COVID-19 patients even more difficult for medical specialists. A significant study is therefore necessary for detecting COVID-19 cases using an automated diagnosis method, which can aid in controlling the spread of the virus. This paper proposes a Deep Convolutional Neural Network-based multi-classification approach (COV-MCNet) using eight different pre-trained architectures (VGG16, VGG19, ResNet50V2, DenseNet201, InceptionV3, MobileNet, InceptionResNetV2, and Xception), which are trained and tested on X-ray images of COVID-19, Normal, Viral Pneumonia, and Bacterial Pneumonia. The results for the 3-class task (Normal vs. COVID-19 vs. Viral Pneumonia) showed that the ResNet50V2 model provides the highest classification performance (accuracy: 95.83%, precision: 96.12%, recall: 96.11%, F1-score: 96.11%, specificity: 97.84%) compared to the rest of the models. The results for the 4-class task (Normal vs. COVID-19 vs. Viral Pneumonia vs. Bacterial Pneumonia) demonstrated that the pre-trained DenseNet201 model provides the highest classification performance (accuracy: 92.54%, precision: 93.05%, recall: 92.81%, F1-score: 92.83%, specificity: 97.47%). Notably, the ResNet50V2 (3-class) and DenseNet201 (4-class) models in the proposed COV-MCNet framework showed higher accuracy than the other six models. This indicates that the designed system can produce promising results for detecting COVID-19 cases as more data becomes available. The proposed multi-classification network (COV-MCNet) significantly speeds up the existing radiology-based method, which will help the medical community and clinical specialists in the early diagnosis of COVID-19 cases during this pandemic.
ARTICLE | doi:10.20944/preprints202105.0117.v2
Subject: Computer Science And Mathematics, Probability And Statistics Keywords: decision trees; deep feed-forward network; neural trees; consistency; optimal rate of convergence
Online: 9 November 2021 (16:54:30 CET)
Decision tree algorithms have been among the most popular algorithms for interpretable (transparent) machine learning since the early 1980s. On the other hand, deep learning methods have boosted the capacity of machine learning algorithms and are now being used for non-trivial applications in various applied domains. However, training a fully connected deep feed-forward network by gradient-descent backpropagation is slow and requires arbitrary choices regarding the number of hidden units and layers. In this paper, we propose near-optimal neural regression trees, intended to be much faster than deep feed-forward networks and not to require the number of hidden units in the hidden layers of the neural network to be specified in advance. The key idea is to construct a decision tree and then simulate the decision tree with a neural network. This work builds a mathematical formulation of neural trees to gain the complementary benefits of both sparse optimal decision trees and neural trees. We propose near-optimal sparse neural trees (NSNT), which are shown to be asymptotically consistent and robust in nature. Additionally, the proposed NSNT model obtains a fast rate of convergence, which is near-optimal up to some logarithmic factor. We comprehensively benchmark the proposed method on a sample of 80 datasets (40 classification and 40 regression datasets) from the UCI machine learning repository and establish that the proposed method is likely to outperform the current state-of-the-art methods (random forest, XGBoost, optimal classification tree, and near-optimal nonlinear trees) for the majority of the datasets.
ARTICLE | doi:10.20944/preprints202109.0285.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: remote sensing; deep learning; image classification
Online: 16 September 2021 (13:38:55 CEST)
Autonomous image recognition has numerous potential applications in planetary science and geology. For instance, the ability to classify images of rocks would give geologists immediate feedback without having to bring samples back to the laboratory. Planetary rovers could also classify rocks in remote places, and even on other planets, without human intervention. Shu et al. classified 9 different types of rock images using a Support Vector Machine (SVM) with image features extracted autonomously, achieving a test accuracy of 96.71%. In this research, Convolutional Neural Networks (CNN) have been used to classify the same set of rock images. Results show that a 3-layer network obtains an average accuracy of 99.60% across 10 trials on the test set. A version of Self-taught Learning was also implemented to demonstrate the generalizability of the features extracted by the CNN. Finally, one model was deployed on a mobile device to demonstrate practicality and portability. The deployed model achieves perfect classification accuracy on the test set while taking only 0.068 seconds to make a prediction, equivalent to about 14 frames per second.
ARTICLE | doi:10.20944/preprints202301.0208.v1
Subject: Physical Sciences, Biophysics Keywords: Deep belief network; Diabetes; Prediction; Risk Factors; Deep Learning
Online: 12 January 2023 (03:54:15 CET)
Diabetes mellitus is a prevalent life-threatening disease, and patients may already have begun suffering from diabetes-related conditions such as heart attack, stroke, hypertension, blurry vision, blindness, foot ulcer, amputation, kidney damage, and other organ failures before diagnosis. Early detection can help reduce the fatality of this disease. Deep learning models have proven very useful in disease detection and computer-aided diagnosis. In this work, we proposed a deep unsupervised machine learning model for early detection of diabetes using voting ensemble feature selection and deep belief neural networks (DBN). The dataset was obtained from an online repository containing responses of prediagnosed patients to direct questionnaires administered in Sylhet Diabetes Hospital in Sylhet, Bangladesh. The dataset was preprocessed, and features were reduced using the ensemble feature selector. The DBN model was pretrained and tuned to obtain optimal performance. The model was also compared with models without multiple hidden layers. The DBN performed at its relative best with F1-measure, precision and recall of 1.00, 0.92 and 1.00 respectively. We conclude that DBN is a useful tool for unsupervised early prediction of Type II diabetes mellitus.
ARTICLE | doi:10.20944/preprints202309.0356.v1
Subject: Engineering, Transportation Science And Technology Keywords: online shopping trip; offline shopping trips; deep neural network model; e-commerce and transportation; factors affecting shopping trip choice; sustainable development
Online: 6 September 2023 (14:28:39 CEST)
This study investigates the factors influencing the choice of online and offline shopping trips and their impacts on urban transportation, environment, and economy in Tehran, Iran. A questionnaire survey was conducted to collect data from 1,000 active e-commerce users who made successful orders in both online and offline services in the last 20 days of 2021 in areas 2 and 5 of Tehran. A deep neural network model was developed to estimate the type of shopping trip based on 10 indicators, such as age, gender, car ownership, delivery cost, product price, etc. The performance of the model was compared with three other algorithms: MLP, decision tree, and KNN. The results showed that the deep neural network model had the highest accuracy of 95.63%. The most important factors affecting the choice of shopping trips were delivery cost, delivery time, and product price. This study provides insights for transportation planners, e-commerce managers, and policymakers to design effective strategies for reducing transportation costs, pollutant emissions, urban traffic, and increasing user satisfaction and sustainable development.
ARTICLE | doi:10.20944/preprints201910.0376.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: artificial neural network; deep learning; LSTM; speech processing
Online: 31 October 2019 (16:40:30 CET)
Speech signals are degraded in real-life environments as a result of background noise or other factors. The processing of such signals for voice recognition and voice analysis systems presents important challenges. One of the conditions that makes quality difficult to handle in those systems is reverberation, produced by sound wave reflections that travel from the source to the microphone in multiple directions. To enhance signals in such adverse conditions, several deep learning-based methods have been proposed and proven to be effective. Recently, recurrent neural networks, especially those with long short-term memory (LSTM), have presented surprising results in tasks related to time-dependent processing of signals, such as speech. One of the most challenging aspects of LSTM networks is the high computational cost of the training procedure, which has limited extended experimentation in several cases. In this work, we present a proposal to evaluate hybrid neural network models for learning different reverberation conditions without any previous information. The results show that some combinations of LSTM and perceptron layers produce better results than pure LSTM networks, given a fixed number of layers. The evaluation has been made based on quality measurements of the signal's spectrum, training time of the networks, and statistical validation of results. The results help affirm that hybrid networks represent an important solution for speech signal enhancement, with advantages in efficiency but without a significant drop in quality.
ARTICLE | doi:10.20944/preprints202308.0047.v1
Subject: Physical Sciences, Astronomy And Astrophysics Keywords: image classification; astronomy; asteroids; convolutional neural network; deep learning
Online: 1 August 2023 (11:08:14 CEST)
Near Earth Asteroids represent potential threats to human life because their trajectories may bring them in the proximity of the Earth. Monitoring these objects could help predict future impact events, but such efforts are hindered by the large numbers of objects that pass through the Earth’s vicinity. There is also the problem of distinguishing asteroids from other objects in the night sky, which implies sifting through large sets of telescope image data. Within this context, we believe that employing machine learning techniques could greatly improve the detection process by sorting out the most likely asteroid candidates to be reviewed by human experts. At the moment, the use of machine learning techniques is still limited in the field of astronomy, and the main goal of the present paper is to study the effectiveness of deep CNNs for the classification of astronomical objects, asteroids in this particular case, by comparing some of the well-known deep convolutional neural networks, including InceptionV3, Xception, InceptionResNetV2 and ResNet152V2. We have applied transfer learning and fine-tuning on these pre-existing deep convolutional networks, and from the results we have obtained one can see the potential of using deep convolutional neural networks in the process of asteroid classification. The InceptionV3 model has the best results in the asteroid class, meaning that by using it, we lose the fewest valid asteroids.
REVIEW | doi:10.20944/preprints202202.0050.v1
Subject: Engineering, Bioengineering Keywords: Neuroprosthetics; Brain Computer Interface; Neural Implants; Deep Brain Stimulation
Online: 3 February 2022 (11:06:15 CET)
Recent progress in microfabrication techniques has allowed the rapid development of neural implants. They are increasingly regarded as effective tools for clinical practice, especially to treat traumatic and neurodegenerative disorders. Microelectrode arrays have already been used in numerous neural interface devices. Almost all neural implants have been developed based on the BCI (Brain Computer Interface) system; when a BCI system is invasive, it is referred to as a BMI, or Brain Machine Interface. BMIs hold promise for neurorehabilitation of motor and sensory function, cognitive state evaluation, and treatment of neurological disorders. This article presents a directed overview of the field of neural implants. The aim of this review is to give a brief introduction to neural prosthetics, their exciting applications in treating neurological disorders, and an in-depth discussion of their functionality. BCI systems and their different types, their functionality, their pros and cons, how other neural implants have been developed, and their present status are covered. Different possibilities and the possible future of deep brain stimulation (DBS), Neuralink, and motor and sensory neural prosthetics are further discussed.
ARTICLE | doi:10.20944/preprints202301.0579.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: hybrid modeling; deep neural networks; deep learning; SBML; systems biology; computational modeling
Online: 31 January 2023 (08:51:13 CET)
In this paper we propose a computational framework that merges mechanistic modeling with deep neural networks obeying the Systems Biology Markup Language (SBML) standard. Over the last 20 years, the systems biology community has developed a large number of mechanistic models in SBML that are currently stored in public databases. With the proposed framework, existing SBML mechanistic models may be upgraded to hybrid systems through the incorporation of deep neural networks into the model core, using a freely available Python tool. The so-formed hybrid mechanistic/neural network models are trained with a deep learning algorithm based on the adaptive moment estimation method (ADAM), stochastic regularization and semidirect sensitivity equations. The trained hybrid models are encoded in SBML and stored back in model databases, where they can be further analyzed as regular SBML models. The application of this approach is illustrated with three well-known case studies: the threonine synthesis model in Escherichia coli, the P58IPK signal transduction model, and the yeast glycolytic oscillations model. The proposed framework is expected to greatly facilitate the widespread use of hybrid modeling techniques for systems biology applications.
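The ADAM update rule mentioned above is the standard one; a minimal scalar sketch follows, illustrative only and not the framework's implementation (the toy objective and learning rate are assumed values):

```python
import math

def adam_step(theta, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One ADAM update for a list of parameters `theta` with gradient `grad`.
    m, v are running first/second moment estimates; t is the step count."""
    new_m = [b1 * mi + (1 - b1) * gi for mi, gi in zip(m, grad)]
    new_v = [b2 * vi + (1 - b2) * gi * gi for vi, gi in zip(v, grad)]
    m_hat = [mi / (1 - b1 ** t) for mi in new_m]   # bias correction
    v_hat = [vi / (1 - b2 ** t) for vi in new_v]
    new_theta = [p - lr * mh / (math.sqrt(vh) + eps)
                 for p, mh, vh in zip(theta, m_hat, v_hat)]
    return new_theta, new_m, new_v

# Minimising the toy objective f(x) = x^2 (gradient 2x) from x = 1.0:
theta, m, v = [1.0], [0.0], [0.0]
for t in range(1, 2001):
    grad = [2 * theta[0]]
    theta, m, v = adam_step(theta, grad, m, v, t)
```

In the framework, the gradient would come from the sensitivity equations of the hybrid model rather than a closed-form derivative.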
ARTICLE | doi:10.20944/preprints202307.0724.v1
Subject: Environmental And Earth Sciences, Remote Sensing Keywords: crop type recognition; deep learning; crowdsourcing; street-level imagery
Online: 12 July 2023 (04:40:51 CEST)
The creation of crop-type maps from satellite data has proven challenging, often impeded by a lack of accurate in-situ data. This paper aims to demonstrate a method for crop-type (i.e., maize, wheat, and other) recognition based on Convolutional Neural Networks using a bottom-up approach. We trained the model with a highly accurate dataset of crowdsourced labelled street-level imagery. Classification results achieved an AUC of 0.87 for wheat, 0.85 for maize and 0.73 for other. Given that wheat and maize are the two most common food crops globally, combined with an ever-increasing amount of available street-level imagery, this approach could help address the need for improved crop-type monitoring globally. Challenges remain in addressing the noisy aspect of street-level imagery (e.g., buildings, hedgerows, automobiles), where a variety of different objects tend to restrict the view and confound the algorithms.
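The AUC figures reported above equal the probability that a randomly chosen positive example outranks a randomly chosen negative one; a minimal rank-based sketch (the labels and scores are hypothetical):

```python
def auc(labels, scores):
    """Empirical AUC: fraction of (positive, negative) pairs where the
    positive example receives the higher score (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Perfect separation of positives from negatives gives 1.0.
print(auc([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1]))  # -> 1.0
```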
ARTICLE | doi:10.20944/preprints202304.1088.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: deep learning; image aesthetics assessment; image enhancement
Online: 28 April 2023 (03:15:16 CEST)
Image aesthetic assessment (IAA) with neural attention has made significant progress due to its effectiveness in object recognition. Current studies have shown that the features learned by convolutional neural networks (CNN) at different learning stages capture meaningful information. Shallow features contain the low-level information of images, while deep features perceive image semantics and themes. Inspired by this, we propose a visual enhancement network with feature fusion (FF-VEN). It consists of two sub-modules, the visual enhancement module (VE module) and the shallow and deep feature fusion module (SDFF module). The former uses an adaptive filter in the spatial domain to simulate human eyes according to the region of interest (ROI) extracted by neural feedback. The latter not only extracts the shallow and deep features via transverse connections but also uses a feature fusion unit (FFU) to fuse the pooled features together with the aim of information contribution maximization. Experiments on the standard AVA and Photo.net datasets show the effectiveness of FF-VEN.
Subject: Engineering, Control And Systems Engineering Keywords: deep learning; signal detection; wideband spectrogram; centerline
Online: 12 May 2020 (12:50:41 CEST)
Wideband signal detection is an important problem in wireless communication. With the rapid development of deep learning (DL) technology, DL-based methods have been applied to wireless technology with clear benefits. In this paper, we propose a novel neural network for multi-type signal detection that can locate signals and recognize signal types in wideband spectrograms. Our network uses keypoint estimation to locate the rough centerline of each signal region and identify its class. Then, several regressions predict properties such as the local offset and the border offsets of the bounding box, which are combined for a finer localization. Experimental results demonstrate that our method performs more accurately than other DL-based object detection methods previously employed for the same detection task. Moreover, our method runs considerably faster than existing methods and dispenses with anchor generation, which makes it more favorable for real-time applications.
REVIEW | doi:10.20944/preprints202104.0739.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Deep neural network; survey; document images; review paper; deep learning; performance evaluation; page object detection, graphical page objects; document image analysis; page segmentation
Online: 28 April 2021 (10:17:49 CEST)
In any document, graphical elements like tables, figures, and formulas contain essential information. The processing and interpretation of such information require specialized algorithms. Off-the-shelf OCR components cannot process this information reliably. Therefore, an essential step in document analysis pipelines is to detect these graphical components. It leads to a high-level conceptual understanding of the documents that makes digitization of documents viable. Since the advent of deep learning, the performance of deep learning-based object detection has improved manyfold. In this work, we outline and summarize the deep learning approaches for detecting graphical page objects in document images. We discuss the most relevant deep learning-based approaches and the state of the art in graphical page object detection in document images. This work provides a comprehensive understanding of the current state-of-the-art and related challenges. Furthermore, we discuss leading datasets along with their quantitative evaluation. Moreover, we briefly discuss promising directions that can be utilized for further improvements.
ARTICLE | doi:10.20944/preprints202211.0130.v1
Subject: Computer Science And Mathematics, Other Keywords: IoT; localization; LoRaWAN; Deep Learning
Online: 8 November 2022 (01:06:12 CET)
In the field of low power wireless networks, many researchers are putting their efforts into positioning methodologies such as fingerprinting in dense urban areas. This paper presents an experimental study aimed at quantifying the mean location estimation error in densely urbanized areas. Using a dataset made available by the University of Antwerp, a neural network was implemented with the aim of providing the position of the end-devices. In this way it was possible to measure the mean location estimation error in an area with high urban density. The results obtained show a localization error for the end-device of less than 150 meters. This result would make it possible to use fingerprinting instead of alternative, energy-consuming methodologies such as GPS in IoT (Internet of Things) applications where battery life is the primary requirement to be met.
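For intuition, fingerprinting reduces to matching an observed signal-strength vector against a stored radio map. The sketch below uses a simple nearest-fingerprint rule rather than the paper's neural network, and all RSSI values and positions are hypothetical:

```python
def locate(rssi, fingerprints):
    """Nearest-fingerprint localization: return the position of the stored
    RSSI fingerprint closest (Euclidean) to the observed vector."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return min(fingerprints, key=lambda fp: dist(rssi, fp[0]))[1]

# Hypothetical radio map: (RSSI per gateway in dBm, (x, y) position in metres).
radio_map = [
    ((-70, -85, -90), (0.0, 0.0)),
    ((-90, -70, -85), (100.0, 0.0)),
    ((-85, -90, -70), (0.0, 100.0)),
]
print(locate((-72, -84, -91), radio_map))  # -> (0.0, 0.0)
```

A learned model replaces this lookup with a regression from the RSSI vector to coordinates, which generalizes between fingerprinted locations.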
ARTICLE | doi:10.20944/preprints202002.0180.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: deep learning; neural attention; loans; loan origination; machine learning
Online: 14 February 2020 (02:45:01 CET)
In this paper we address the problem of understanding why a deep learning model decides that an individual is eligible for a loan or not. We propose a novel approach for inferring which attributes matter most for the decision in each specific individual case. Specifically, we leverage concepts from neural attention to devise a novel feature-wise attention mechanism. As we show, using real-world datasets, our approach offers unique insights into the importance of various features by producing a decision explanation for each specific loan case. At the same time, we observe that our novel mechanism generates decisions which are much closer to those of human experts, compared to existing competitors.
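The feature-wise attention idea — softmax-normalized relevance scores rescaling each input attribute — can be sketched as follows. The attribute names and scores here are hypothetical; this is not the authors' architecture, in which the scores would be produced by a learned sub-network:

```python
import math

def feature_attention(x, scores):
    """Feature-wise attention: a softmax over per-feature relevance scores
    yields weights that rescale each input attribute and also serve as
    a per-case explanation of which attributes mattered."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    weighted = [w * xi for w, xi in zip(weights, x)]
    return weights, weighted

# Hypothetical loan attributes [income, debt ratio, age] with assumed
# relevance scores favouring the debt ratio.
weights, _ = feature_attention([55.0, 0.4, 31.0], [0.1, 2.0, 0.3])
```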
ARTICLE | doi:10.20944/preprints202304.0804.v1
Subject: Computer Science And Mathematics, Robotics Keywords: registration; point clouds; urban scene; deep learning
Online: 23 April 2023 (13:59:52 CEST)
Urban scene point clouds pose significant challenges for registration due to their large data volume, repetitive scenes, and dynamic objects. In this paper, we propose PCRMLP, a model for urban scene point cloud registration that achieves registration performance comparable to prior learning-based methods. Compared to previous works, which focus on extracting features and estimating correspondence, the model estimates the transformation implicitly from concrete instances. An instance-level urban scene representation method is introduced to extract instance descriptors via semantic segmentation and DBSCAN, which enables the model to obtain robust instance features, filter dynamic objects, and estimate the transformation in a more logical manner. Then a lightweight network consisting of MLPs is employed to obtain the transformation in an encoder-decoder manner. We validate the approach on the KITTI dataset. Experimental results demonstrate that PCRMLP can obtain a satisfactory coarse transformation from instance descriptors in just 0.0028 s. With a subsequent ICP refinement module, the proposed method achieves higher registration accuracy and computational efficiency than prior learning-based works.
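DBSCAN, used above to group segmented points into instances, can be sketched in a minimal form (the points and parameters are toy values, not the paper's settings):

```python
def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: returns a cluster id per point (-1 = noise)."""
    def neighbours(i):
        return [j for j, q in enumerate(points)
                if sum((a - b) ** 2 for a, b in zip(points[i], q)) <= eps ** 2]

    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbours(i)
        if len(nbrs) < min_pts:
            labels[i] = -1            # noise (may be claimed later as border)
            continue
        labels[i] = cluster
        seeds = list(nbrs)
        while seeds:
            j = seeds.pop()
            if labels[j] == -1:
                labels[j] = cluster   # border point joins the cluster
            if labels[j] is not None:
                continue
            labels[j] = cluster
            jn = neighbours(j)
            if len(jn) >= min_pts:    # core point: expand further
                seeds.extend(jn)
        cluster += 1
    return labels

# Two tight groups and one outlier.
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
print(dbscan(pts, eps=2.0, min_pts=2))  # -> [0, 0, 0, 1, 1, 1, -1]
```

Each resulting cluster would then be summarized into an instance descriptor for the MLP-based registration stage.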
ARTICLE | doi:10.20944/preprints202301.0075.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: EMG; optimization; genetic algorithm; deep learning
Online: 4 January 2023 (09:21:39 CET)
Hand gesture recognition has many valuable applications in engineering and health care. This study proposes a novel model which can accurately distinguish hand gestures using forearm muscles' surface electromyogram (sEMG) signals. A convolutional neural network (CNN), whose hyperparameters impact the final model's accuracy, was employed in the recognition stage. The number of convolutional layers, kernels per layer, and neurons in the dense layer were selected for optimization, while the remaining parameters, such as the learning rate, batch size, and number of epochs, were chosen based on trial and error and prior knowledge. The optimal values for the selected hyperparameters were obtained using a genetic algorithm to achieve maximum recognition accuracy. The UC2018 Dual-Myo database was used for training and testing the model based on EMG signals characterizing the activity of eight different hand gestures. The final structure of the model consisted of two convolutional layers with 131 and 28 kernels, a dense layer with 111 neurons, and a softmax layer with eight neurons. Upon optimizing the hyperparameters using the genetic algorithm, the accuracy of the proposed model increased from 91.86% to 96.4% at best and 95.3% on average in real-time applications, and 99.6% in an offline mode. Future work is warranted towards improving the architecture and the computational cost.
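The genetic search over integer hyperparameters can be sketched with a toy fitness function standing in for validation accuracy. The population size, mutation rate, bounds, and target below are assumed values for illustration, not those of the study (where each fitness evaluation would train a CNN):

```python
import random

random.seed(0)

def evolve(fitness, bounds, pop_size=20, generations=30, mut_rate=0.3):
    """Minimal genetic algorithm over integer hyperparameters.
    bounds: (lo, hi) per gene, e.g. kernels per layer, dense-layer neurons."""
    def rand_ind():
        return [random.randint(lo, hi) for lo, hi in bounds]

    pop = [rand_ind() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]           # elitist selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, len(bounds))
            child = a[:cut] + b[cut:]            # one-point crossover
            for g, (lo, hi) in enumerate(bounds):
                if random.random() < mut_rate:   # mutation
                    child[g] = random.randint(lo, hi)
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

# Toy stand-in for validation accuracy: peaks at (128, 32, 100).
target = (128, 32, 100)
toy_fitness = lambda ind: -sum(abs(g - t) for g, t in zip(ind, target))
best = evolve(toy_fitness, bounds=[(1, 256), (1, 64), (1, 200)])
```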
ARTICLE | doi:10.20944/preprints202307.0848.v1
Subject: Computer Science And Mathematics, Mathematical And Computational Biology Keywords: Root-related proteins; Deep learning; Graph convolutional network; Multi-head attention; Network embedding
Online: 12 July 2023 (12:41:34 CEST)
The root system plays an irreplaceable role in plant growth, and its improvement can increase crop productivity. However, this system remains poorly understood, and the underlying mechanisms have not been fully uncovered. Investigating proteins related to the root system is an important means of completing this task. Previously, the scarcity of known root-related proteins made it impossible to adopt machine learning methods for designing efficient models for the discovery of novel root-related proteins. Recently, a public database of root-related proteins was established, and machine learning methods can now be applied in this field. In this study, we propose a machine learning based model, named Graph-Root, for the identification of root-related proteins. Features derived from protein sequences and a protein network were extracted: the former were processed by a graph convolutional neural network and multi-head attention, while the latter captured the linkage between proteins. These features were fed into a fully connected layer to make predictions. Five-fold cross-validation and independent tests suggest its good performance; it also outperformed the only previous model, SVM-Root. Furthermore, the importance of each feature type and component in the proposed model was investigated.
ARTICLE | doi:10.20944/preprints201811.0400.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Super-Resolution; Deep-learning; Generative Adversarial Networks; CMOS sensors
Online: 16 November 2018 (10:44:26 CET)
Complementary Metal-Oxide-Semiconductor (CMOS) is a typical image sensor that has a wide range of applications. However, considering the limitations of weather conditions and hardware cost, it is hard to capture high-resolution images with a CMOS sensor. Recently, Super-Resolution (SR) techniques for image restoration have been gaining attention due to their excellent performance. Owing to their powerful learning ability, Generative Adversarial Networks (GANs) have achieved great success. In this paper, we propose Advanced Generative Adversarial Networks (AGAN) to efficiently address these issues: 1) we design a Laplacian pyramid framework as a pre-trained module, which is beneficial for providing multi-scale features for our input; 2) at each feature block, a convolutional skip-connections network, which may contain some latent information, is significant for the generative model to reconstruct a plausible-looking image; 3) considering that edge details usually play an important role in image generation, a novel perceptual loss function is defined to train and seek optimal parameters. The approach is effective in achieving excellent and compelling quality on images captured by CMOS sensors. Quantitative and qualitative evaluations demonstrate that our algorithm not only takes full advantage of Convolutional Neural Networks (CNNs) to improve image quality, but also performs better than previous GAN algorithms for the super-resolution task.
ARTICLE | doi:10.20944/preprints202008.0113.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Scene classification; Deep Learning; Convolutional Neural Networks; Feature learning
Online: 5 August 2020 (06:19:27 CEST)
State-of-the-art remote sensing scene classification methods employ different Convolutional Neural Network architectures for achieving very high classification performance. A trait shared by the majority of these methods is that the class associated with each example is ascertained by examining the activations of the last fully connected layer, and the networks are trained to minimize the cross-entropy between predictions extracted from this layer and ground-truth annotations. In this work, we extend this paradigm by introducing an additional output branch which maps the inputs to low dimensional representations, effectively extracting additional feature representations of the inputs. The proposed model imposes additional distance constraints on these representations with respect to identified class representatives, in addition to the traditional categorical cross-entropy between predictions and ground-truth. By extending the typical cross-entropy loss function with a distance learning function, our proposed approach achieves significant gains across a wide set of benchmark datasets in terms of classification, while providing additional evidence related to class membership and classification confidence.
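The extended objective — categorical cross-entropy plus a distance term pulling the low-dimensional embedding towards its class representative — can be sketched as follows. The logits, embeddings, representatives, and the weighting `alpha` are hypothetical, not the paper's values:

```python
import math

def combined_loss(logits, label, embedding, class_reps, alpha=0.5):
    """Cross-entropy on the softmax predictions plus a term penalising the
    distance between the embedding and its class representative."""
    m = max(logits)                               # stable log-softmax
    exps = [math.exp(z - m) for z in logits]
    log_prob = (logits[label] - m) - math.log(sum(exps))
    ce = -log_prob
    rep = class_reps[label]
    dist = sum((e - r) ** 2 for e, r in zip(embedding, rep)) ** 0.5
    return ce + alpha * dist

# Hypothetical 3-class example with 2-D representations.
reps = [(0.0, 0.0), (1.0, 1.0), (-1.0, 1.0)]
loss = combined_loss([2.0, 0.5, -1.0], 0, (0.1, -0.2), reps)
```

An embedding that coincides with its class representative removes the distance penalty, which is what drives samples of the same class to cluster.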
ARTICLE | doi:10.20944/preprints202304.0203.v4
Subject: Engineering, Electrical And Electronic Engineering Keywords: Electric Vehicles; Battery Management System; Lithium-ion batteries; Deep Learning
Online: 19 April 2023 (03:34:32 CEST)
This paper presents an improved SOC estimation method for lithium-ion batteries in Electric Vehicles using a Bayesian-optimized feedforward network. This Bayesian-optimized neural network method attempts to minimize a scalar objective function by extracting hyperparameters (hidden neurons in both layers) using a surrogate model. The hyperparameters are then built, and data samples are trained and validated. The performance of the proposed deep learning neural network is evaluated. Two reasonably sized data samples are extracted from the Panasonic 18650PF Li-ion Mendeley datasets and used for training and validation. RNN and LSTM neural network algorithms offer the common core property of retaining past information and/or hidden states for better SOC estimation. However, the distinguishing feature of this proposed method is the inclusion of Bayesian optimization, which chooses the optimal number of neurons in the two hidden layers. Analysis of the results shows that the Bayesian-optimized feedforward algorithm has the lowest average MAPE (0.20%) and is the best selection compared with the other five deep learning algorithms. In the last quarter of the charge gauge, where range anxiety is severe, the feedforward network with Bayesian optimization is still the best selection (with a MAPE of 0.64%).
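MAPE, the error metric quoted above, is straightforward to compute; a minimal sketch with hypothetical SOC values:

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent."""
    return 100.0 * sum(abs((a - p) / a)
                       for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical SOC trace (%) and its estimates.
error = mape([80.0, 60.0, 40.0], [79.8, 60.3, 39.9])
```

Note that MAPE inflates errors at low actual SOC, which is why accuracy in the last quarter of the gauge is reported separately.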
ARTICLE | doi:10.20944/preprints202305.1522.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Recommendation system; Contrast learning; Deep Learning
Online: 22 May 2023 (11:55:55 CEST)
Modelling both long and short-term user interests from historical data is crucial for accurate recommendations. However, unifying these metrics across multiple application domains can be challenging, and existing approaches often rely on complex, intertwined models which can be difficult to interpret. To address this issue, we propose a lightweight, plug-and-play interest enhancement module that fuses interest vectors from two independent models. After analyzing the dataset, we identify deviations in the recommendation performance of long and short-term interest models. To compensate for these differences, we use feature enhancement and loss correction during training. In the fusion process, we explicitly split long-term interest features with longer duration into multiple local features. We then use a shared attention mechanism to fuse multiple local features with short-term interest features to obtain interaction features. To correct for bias between models, we introduce a comparison learning task that monitors the similarity between local features, short-term features, and interaction features. This adaptively reduces the distance between similar features. Our proposed module combines and compares multiple independent long-term and short-term interest models on multiple domain datasets. As a result, it not only accelerates the convergence of the models but also achieves outstanding performance in challenging recommendation scenarios.
ARTICLE | doi:10.20944/preprints202309.1081.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: dual mixing attention; UAV re-identification; deep learning
Online: 18 September 2023 (07:30:54 CEST)
Vehicle re-identification research under surveillance cameras has yielded impressive results. However, Unmanned Aerial Vehicle (UAV)-based vehicle re-identification (ReID) remains challenging due to the high flexibility of UAV shooting, mainly the complicated shooting angles, occlusions, low discrimination of top-down features, and significant changes in vehicle scale. To address this, we propose a novel Dual Mixing Attention Network (DMANet) to extract discriminative features robust to variations in viewpoint. Specifically, we present a plug-and-play Dual Mixing Attention Module (DMAM) to capture pixel-level pairwise relationships and channel dependencies, where DMAM is composed of Spatial Mixing Attention (SMA) and Channel Mixing Attention (CMA). First, the original feature is divided along the spatial and channel dimensions to obtain multiple subspaces. Then, a learnable weight is applied to capture the dependencies between local features in the mixing space. Finally, the features extracted from all subspaces are aggregated to promote comprehensive feature interaction. Moreover, the DMAM can be readily inserted into backbone networks at any depth to improve vehicle discrimination. The experiments show that the proposed structure achieves better performance than representative methods in the UAV-based vehicle ReID task. Our code and models will be publicly released.
ARTICLE | doi:10.20944/preprints201804.0286.v1
Subject: Business, Economics And Management, Finance Keywords: electricity price forecasting; deep learning; gated recurrent units; long short term memory; artificial intelligence, turkish day-ahead market
Online: 23 April 2018 (11:38:27 CEST)
Accurate electricity price forecasting has become a substantial requirement since the liberalization of the electricity markets. Due to the challenging nature of electricity prices, which includes high volatility, sharp price spikes and seasonality, various types of electricity price forecasting models still compete and cannot outperform each other consistently. Neural networks have been successfully used in machine learning problems, and Recurrent Neural Networks (RNNs) have been proposed to address time-dependent learning problems. In particular, Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks are tailor-made for time series price estimation. In this paper, we propose to use Gated Recurrent Units as a new technique for electricity price forecasting. We have trained a variety of algorithms with a rolling 3-year window and compared the results with the RNNs. In our experiments, 3-layered GRUs outperformed all other neural network structures and state-of-the-art statistical techniques in a statistically significant manner in the Turkish day-ahead market.
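The GRU recurrence underlying the proposed forecaster follows the standard gate equations; a scalar sketch for intuition (the weights and input sequence are arbitrary illustrative values, not trained parameters):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gru_cell(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU step with scalar weights: the update gate z decides how much
    of the state to rewrite, the reset gate r gates the old state inside
    the candidate."""
    z = sigmoid(Wz * x + Uz * h)               # update gate
    r = sigmoid(Wr * x + Ur * h)               # reset gate
    h_cand = math.tanh(Wh * x + Uh * (r * h))  # candidate state
    return (1 - z) * h + z * h_cand            # interpolate old and new

# Fold a short hypothetical (normalized) price sequence into a hidden state.
h = 0.0
for x in [0.2, 0.5, -0.1]:
    h = gru_cell(x, h, 1.0, 0.5, 1.0, 0.5, 1.0, 0.5)
```

With only two gates instead of LSTM's three, the GRU has fewer parameters per unit, which is one reason it trains faster on price series of this kind.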
ARTICLE | doi:10.20944/preprints202207.0056.v1
Subject: Computer Science And Mathematics, Information Systems Keywords: deep learning; convolutional neural networks; classification; machine learning; IoT
Online: 5 July 2022 (04:22:49 CEST)
Human actions in videos are three-dimensional (3D) signals that capture the spatiotemporal structure of human behavior. 3D convolutional neural networks (CNNs) are a promising means of exploiting this structure, but they have not yet matched the performance of their well-established two-dimensional (2D) counterparts on still photographs: spatiotemporal fusion makes 3D CNNs difficult to train and has prevented them from achieving remarkable results. In this paper, we implement a hybrid deep learning architecture that combines STIP and 3D CNN features to effectively enhance performance on 3D videos. The training produces more detailed and deeper mappings in each round of space-time fusion, and the trained model further improves results on complicated evaluations. The implemented model is applied to video classification. An intelligent 3D network protocol for multimedia data classification using deep learning is introduced to further capture the space-time associations in human activities. The well-known UCF101 dataset is used to evaluate the performance of the proposed hybrid technique, which substantially outperforms the initial 3D CNNs. Compared with state-of-the-art frameworks from the literature for action recognition on UCF101, the results reach an accuracy of 95%.
ARTICLE | doi:10.20944/preprints202009.0416.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: adversarial learning; deep cross-modal hashing; self-attention mechanism
Online: 18 September 2020 (04:16:58 CEST)
Deep cross-modal hashing networks have recently received increasing interest due to their superior query efficiency and low storage cost. However, most existing methods pay less attention to the hash-representation learning stage, which means the semantic information of the data is not fully exploited. Furthermore, they may neglect the high-ranking relevance and consistency of hash codes. To solve these problems, we propose a Self-Attention and Adversary Guided Hashing Network (SAAGHN). Specifically, it employs a self-attention mechanism in the hash-representation learning stage to extract rich semantic relevance information. Meanwhile, to preserve the invariance of the hash codes, adversarial learning is adopted in the hash-code learning stage. In addition, to generate higher-ranking hash codes and avoid early local minima, a new batch semi-hard cosine triplet loss and a cosine quantization loss are proposed. Extensive experiments on two benchmark datasets show that SAAGHN outperforms other baselines and achieves state-of-the-art performance.
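The core of a cosine triplet loss can be sketched in a few lines. This is a hedged illustration of the general idea only: it omits the batch semi-hard mining and the quantization term that the paper adds, and the vectors and margin are toy values.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def cosine_triplet_loss(anchor, pos, neg, margin=0.3):
    # penalize triplets where the negative is not at least `margin`
    # less similar to the anchor than the positive is
    return max(0.0, cosine(anchor, neg) - cosine(anchor, pos) + margin)

easy = cosine_triplet_loss([1.0, 0.0], [0.9, 0.1], [0.0, 1.0])   # well-separated negative
hard = cosine_triplet_loss([1.0, 0.0], [0.9, 0.1], [0.5, 0.5])   # near-miss negative
```

Well-separated triplets contribute zero loss; "semi-hard" mining, as the name suggests, would restrict training to near-miss negatives like the second case, where the loss is small but non-zero.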
ARTICLE | doi:10.20944/preprints202211.0488.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Capsule network; differential features; deep learning; micro-expression recognition; spatiotemporal features
Online: 28 November 2022 (03:12:12 CET)
Micro-expression (ME) is one of the key psychological stress reactions: a modest, spontaneous facial mechanism. Because of its precision and unpredictability with regard to psychological manifestations, ME has significant applicability in a variety of psychology-related sectors. Nevertheless, current micro-expression recognition (MER) algorithms have poor accuracy, only a limited quantity of ME data is available, and this research problem has not been thoroughly investigated. We therefore present a deep learning approach based on a spatiotemporal capsule network (STCP-Net). STCP-Net has four components: a jitter reduction module, a differential feature extraction module, an STCP module, and a fully connected layer. The first two modules are designed to extract diverse differential features more precisely and to limit the influence of head jitter. The STCP module extracts spatiotemporal features layer by layer, taking the temporal and spatial connections between features into account. We run extensive trials on the CASME II dataset using Leave-One-Subject-Out (LOSO) cross-validation. The results and analysis demonstrate that the algorithm is innovative and efficient.
ARTICLE | doi:10.20944/preprints202309.0696.v1
Subject: Engineering, Transportation Science And Technology Keywords: online shopping trip; offline shopping trips; gray wolf optimization; deep neural network model; e-commerce and transportation; factors affecting shopping trip choice; sustainable development
Online: 12 September 2023 (02:50:20 CEST)
Online and offline shopping trips have different impacts on various aspects of urban life, such as e-commerce, transportation systems, and sustainability, so it is important to evaluate the factors that influence the choice between them. We use a hybrid machine learning model that combines a gray wolf optimization algorithm with a deep convolutional neural network to estimate shopping-trip choice, based on a survey of 1,000 active e-commerce users who placed successful orders in both online and offline services during the last 20 days of 2021 in areas 2 and 5 of Tehran. The gray wolf optimization algorithm performs feature selection and hyperparameter tuning for the deep convolutional neural network, a powerful deep learning model for image recognition and classification. The results show that our model achieves an accuracy of 97.81% with an MSE of 0.325 by selecting seven out of ten features. The most important features are delivery cost, delivery time, product price, and car ownership. In addition, while the proposed algorithm reached an accuracy of 97.81%, the single deep learning model, MLP neural network, decision tree, and KNN models achieved 95.63%, 90.0%, 86.49%, and 80.16%, respectively.
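The search strategy behind gray wolf optimization can be sketched on a toy continuous problem. This is a generic, hedged illustration of the standard GWO update (wolves moving relative to the three best solutions, alpha, beta, and delta, with a decaying exploration factor); the paper applies a variant of this idea to binary feature selection and hyperparameter tuning, which is not reproduced here.

```python
import random

def gwo_minimize(f, dim, n_wolves=10, iters=100, lo=-5.0, hi=5.0, seed=0):
    # gray wolf optimizer sketch: each wolf moves toward positions suggested
    # by the three current leaders; exploration factor `a` decays to zero
    rng = random.Random(seed)
    wolves = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_wolves)]
    for t in range(iters):
        wolves.sort(key=f)
        leaders = [w[:] for w in wolves[:3]]        # copies of alpha, beta, delta
        a = 2.0 * (1.0 - t / iters)
        for w in wolves:
            for d in range(dim):
                x = 0.0
                for leader in leaders:
                    A = a * (2.0 * rng.random() - 1.0)
                    C = 2.0 * rng.random()
                    x += leader[d] - A * abs(C * leader[d] - w[d])
                w[d] = min(hi, max(lo, x / 3.0))    # average the three suggestions
    return min(wolves, key=f)

# minimize a toy sphere function as a stand-in for the validation-error objective
best = gwo_minimize(lambda v: sum(x * x for x in v), dim=3)
```

In the paper's setting, each wolf position would encode a candidate feature subset and network hyperparameters, and `f` would be the model's validation error rather than a closed-form function.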
ARTICLE | doi:10.20944/preprints202308.1665.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: intrusion detection; internet of things; Deep Learning; AutoEncoder; network security
Online: 24 August 2023 (02:59:32 CEST)
With the continuous development of network technology, complex network systems generate massive amounts of unbalanced attack traffic. Due to the severe imbalance between normal and attack samples, as well as among different types of attack samples, intrusion detection systems suffer from low detection rates for rare-class attack data. In this paper, we propose a geometric synthetic minority oversampling technique based on an optimized kernel density estimation algorithm. This method generates diverse rare-class attack data by learning the distribution of the rare-class samples while maintaining similarity with the original sample features. The balanced data are then fed into a feature extraction module built on multiple denoising autoencoders, which reduces information redundancy in high-dimensional data and improves detection performance for unknown attacks. Subsequently, a soft-voting ensemble learning technique performs multi-class anomaly detection on the balanced, dimensionally reduced data. Finally, an intrusion detection system is constructed from the data preprocessing, imbalance handling, feature extraction, and anomaly detection modules and validated on the NSL-KDD and N-BaIoT datasets. Comparative experiments against baseline models and other state-of-the-art methods demonstrate that the proposed system improves the detection rate for rare-class attack data. Furthermore, it achieves a good overall detection rate on the Internet of Things dataset (N-BaIoT), indicating strong applicability.
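The soft-voting step can be sketched independently of the rest of the pipeline: each base classifier emits a class-probability vector, the vectors are averaged, and the class with the highest mean probability wins. The three classifiers and their probabilities below are toy values, not outputs from the paper's models.

```python
def soft_vote(prob_lists):
    # average the class-probability vectors of several base classifiers,
    # then pick the class with the highest mean probability
    n_clf = len(prob_lists)
    n_cls = len(prob_lists[0])
    avg = [sum(p[c] for p in prob_lists) / n_clf for c in range(n_cls)]
    return max(range(n_cls), key=avg.__getitem__), avg

# three toy classifiers scoring classes [normal, common_attack, rare_attack]
pred, avg = soft_vote([
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.2, 0.4, 0.4],
])
```

Soft voting tends to help rare classes precisely because moderate confidence from several classifiers can outweigh one classifier's strong vote for the majority class, as in this example.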
BRIEF REPORT | doi:10.20944/preprints202305.0768.v1
Subject: Environmental And Earth Sciences, Environmental Science Keywords: Climate; Contiguous United States; Deep Neural Network; Land Cover; Large Wildfire
Online: 10 May 2023 (14:46:12 CEST)
Over the last several decades, large wildfires have become increasingly common across the United States, disproportionately impacting forest health and function, human well-being, and the economy. Here, we examine the severity of large wildfires across the Contiguous United States over the past decade (2011-2020) using a wide array of meteorological, vegetational, and topographical features in a deep neural network model. A total of 4,538 wildfire incidents were used in the analysis, covering 87,305 square miles of burned area. We observed the highest numbers of large wildfires in California, Texas, and Idaho, with lightning causing 43% of these incidents. Importantly, the results indicate that the severity of wildfire occurrences is highly correlated with the climatological forcings, land cover, location, and elevation of the ecosystem. Overall, the results may serve as a useful guide for managing landscapes under changing climate and disturbance regimes.
ARTICLE | doi:10.20944/preprints201905.0228.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Deep learning, LSTM, Machine learning, Post-filtering, Signal processing, Speech Synthesis
Online: 17 May 2019 (16:16:53 CEST)
Several researchers have contemplated deep learning-based post-filters to increase the quality of statistical parametric speech synthesis; these post-filters map the synthetic speech to the natural speech, considering the different parameters separately and trying to reduce the gap between them. Long Short-Term Memory (LSTM) neural networks have been applied successfully for this purpose, but there are still many aspects to improve in the results and in the process itself. In this paper, we introduce a new pre-training approach for the LSTM with the objective of enhancing the quality of the synthesized speech, particularly in the spectrum, in a more efficient manner. Our approach begins with an auto-associative training of one LSTM network, which is then used as an initialization for the post-filters. We show the advantages of this initialization for enhancing the Mel-frequency cepstral parameters of synthetic speech. Results show that, in most cases, the initialization achieves better enhancement of the statistical parametric speech spectrum than the common random initialization of the networks.
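The pretrain-then-finetune idea can be demonstrated at toy scale with a one-weight linear "network" standing in for the LSTM: first train it auto-associatively (input maps to itself), then use that weight as the starting point for the actual mapping task. All data and the 0.8 scaling factor are invented for illustration; none of this is the paper's model.

```python
import random

def train(w, data, targets, lr=0.1, epochs=200):
    # gradient descent on squared error for a one-weight linear "network"
    for _ in range(epochs):
        for x, y in zip(data, targets):
            w -= lr * (w * x - y) * x
    return w

rng = random.Random(1)
synthetic = [rng.uniform(-1.0, 1.0) for _ in range(50)]   # stand-in "synthetic speech"
natural = [0.8 * x for x in synthetic]                    # stand-in "natural speech"

w_pre = train(0.0, synthetic, synthetic)    # auto-associative pre-training: map input to itself
w_post = train(w_pre, synthetic, natural)   # fine-tune the pre-trained weight as the post-filter
```

The auto-associative phase drives the weight toward the identity mapping, so fine-tuning starts from a point that already passes the signal through cleanly, which is the intuition behind using it as an initialization.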
ARTICLE | doi:10.20944/preprints201810.0494.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: unsupervised training; features learning; deep learning; time series forecasting
Online: 22 October 2018 (12:24:43 CEST)
A continuous Deep Belief Network (cDBN) with two hidden layers is proposed in this paper, addressing the weak feature-learning ability of such networks when dealing with continuous data. In cDBN, the input data are trained in an unsupervised way using continuous versions of the transfer functions, contrastive divergence is applied in the hidden-layer training process to raise convergence speed, an improved dropout strategy is implemented during unsupervised training to learn features by reducing co-adaptation between the units, and the network is then fine-tuned using the back-propagation algorithm. Besides, the hyper-parameters are analysed through a stability analysis to ensure that the network can find the optimum. Finally, experiments on the Lorenz chaotic series, the CATS benchmark, and real-world tasks such as CO2 and wastewater parameter forecasting show that cDBN offers higher accuracy, a simpler structure, and faster convergence than other methods.
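The contrastive divergence step used in layer-wise DBN training can be sketched for a tiny restricted Boltzmann machine. This is a bare CD-1 illustration with sigmoid units and no bias terms; it is not the paper's continuous-transfer-function variant, and the 4-visible/3-hidden sizes and input vector are arbitrary.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def cd1_update(W, v, rng, lr=0.1):
    """One CD-1 weight update for a tiny bias-free RBM on [0,1] data."""
    n_v, n_h = len(v), len(W[0])
    # positive phase: hidden probabilities driven by the data
    h = [sigmoid(sum(v[i] * W[i][j] for i in range(n_v))) for j in range(n_h)]
    h_s = [1.0 if rng.random() < p else 0.0 for p in h]     # sample hidden states
    # negative phase: one-step reconstruction of the visible layer
    v_rec = [sigmoid(sum(h_s[j] * W[i][j] for j in range(n_h))) for i in range(n_v)]
    h_rec = [sigmoid(sum(v_rec[i] * W[i][j] for i in range(n_v))) for j in range(n_h)]
    # approximate gradient: data correlations minus reconstruction correlations
    for i in range(n_v):
        for j in range(n_h):
            W[i][j] += lr * (v[i] * h[j] - v_rec[i] * h_rec[j])
    return W

rng = random.Random(0)
W = [[rng.gauss(0.0, 0.1) for _ in range(3)] for _ in range(4)]   # 4 visible, 3 hidden
for _ in range(100):
    W = cd1_update(W, [0.9, 0.1, 0.8, 0.2], rng)
```

The appeal of CD-1 is visible here: only one reconstruction step is needed per update, instead of running the Gibbs chain to equilibrium.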
ARTICLE | doi:10.20944/preprints201805.0276.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: blood pressure; oscillometric measurement; statistical analysis; normality; confidence interval; deep belief networks
Online: 21 May 2018 (12:54:26 CEST)
Oscillometric blood pressure (BP) devices currently estimate a single point but do not identify fluctuations in BP or distinguish them from variations in response to physiological properties. In this paper, to analyze BP normality based on oscillometric measurements, we use statistical approaches including kurtosis, skewness, Kolmogorov-Smirnov, and correlation tests. Then, to mitigate uncertainties, we use a deep neural network (DNN) to determine the confidence limits (CLs) of BP measurements based on their normality. First, we perform statistical tests to verify the normality of the BP measurements for individual subjects, and we validate that the distribution of the BP estimates fits the Gaussian distribution very well. We also use a rank test in the DNN regression model to demonstrate the independence of the artificial SBP and DBP estimations. The proposed DNN regression model decreases the standard deviation of error (SDE), the mean error (ME), and the mean absolute error (MAE), and reduces the uncertainty of the CLs. Overall, the proposed methodology provides accurate BP estimations and reduces the uncertainties associated with the CLs and SDEs based on the DNN regression estimator.
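The skewness and kurtosis checks can be sketched with plain moment-based estimators. This is a generic illustration on an invented symmetric sample; the paper's exact test statistics and corrections may differ.

```python
def skewness_kurtosis(xs):
    # moment-based sample skewness and (non-excess) kurtosis
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2

# a perfectly symmetric toy sample: skewness is exactly zero
s, k = skewness_kurtosis([-2.0, -1.0, 0.0, 1.0, 2.0])
```

For a Gaussian sample one expects skewness near 0 and kurtosis near 3, so large deviations in either statistic flag the BP measurements whose normality assumption is doubtful.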
ARTICLE | doi:10.20944/preprints202308.0739.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Deep learning; fault diagnosis; adaptive activation function; pumping unit
Online: 9 August 2023 (07:22:09 CEST)
Due to the complex underground environment, pumping units are prone to numerous failures. The indicator diagrams of different faults are similar to a certain degree, which produces hard-to-distinguish samples. As the number of samples increases, manual diagnosis becomes difficult, which decreases the accuracy of fault diagnosis. To judge the fault type accurately and quickly, we propose an improved adaptive activation function and apply it to five types of neural networks. The adaptive activation function improves the negative semi-axis slope of the ReLU activation function by incorporating a gated channel conversion unit, improving the performance of the deep learning model. The proposed adaptive activation function is compared with traditional activation functions on the fault diagnosis data set and a public data set. The results show that the activation function has better nonlinearity and can improve both the generalization performance of deep learning models and the accuracy of fault diagnosis. In addition, the proposed adaptive activation function can be readily embedded in other neural networks.
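The idea of an adaptive negative-axis slope can be shown with a minimal sketch in the spirit of PReLU. Here a scalar "gate" value modulates the slope through a sigmoid; the paper's gated channel conversion unit computes this modulation per channel from the feature maps, which is not reproduced here, and the gate values below are arbitrary.

```python
import math

def adaptive_relu(x, gate):
    # ReLU whose negative-axis slope is set by a learned gate value (toy version)
    alpha = 1.0 / (1.0 + math.exp(-gate))   # squash the gate into (0, 1)
    return x if x > 0.0 else alpha * x

pos = adaptive_relu(2.0, gate=-1.2)   # positive inputs pass through unchanged
neg = adaptive_relu(-3.0, gate=0.0)   # gate = 0 gives slope 0.5 on the negative axis
```

Unlike plain ReLU, negative inputs keep a non-zero (and trainable) gradient, which is the property the abstract credits for the improved generalization.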
REVIEW | doi:10.20944/preprints202011.0152.v1
Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: EEG signal recognition; machine learning in EEG; neural networks in EEG; dry electrode EEG; deep learning EEG
Online: 3 November 2020 (14:07:29 CET)
In the last decade, unprecedented progress in the development of neural networks has influenced dozens of industries, among them signal processing for electroencephalography (EEG). Although electroencephalography appeared in the first half of the 20th century, its physical principles of operation have not changed to this day. The signal processing techniques in this area, however, have progressed significantly thanks to neural networks: over the past five years, more than 1,000 publications on the use of machine learning for EEG have appeared in popular libraries. The many different neural network models complicate the process of understanding the real situation in this area. In this manuscript, we provide the most comprehensive overview of research that uses neural networks for EEG signal processing.
ARTICLE | doi:10.20944/preprints202107.0699.v1
Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: adaptive computing; dynamic deep neural structure; adpative convolution; dynamic training
Online: 30 July 2021 (12:25:45 CEST)
The colossal depth of deep neural networks sometimes suffers from ineffective backpropagation of the gradients through all their layers, whereas the strong performance of shallower multilayer structures proves their ability to strengthen the gradient signals in the early stages of training, which are easily backpropagated for global loss corrections. Shallow neural structures are always a good starting point for encouraging sturdy feature characteristics of the input. In this research, a shallow deep-neural structure called PrimeNet is proposed. PrimeNet aims to dynamically identify and encourage quality visual indicators from the input to be used by the subsequent deep network layers, and to increase the gradient signals in the lower stages of the training pipeline. In addition, layerwise training is performed with locally generated errors, which means the gradient is not backpropagated to previous layers and the hidden-layer weights are updated during the forward pass, making the structure a backpropagation-free variant. PrimeNet has obtained state-of-the-art results on various image datasets, attaining the dual objective of (1) a compact, dynamic deep neural structure that (2) eliminates the problem of backward locking. The PrimeNet unit is proposed as an alternative to traditional convolution and dense blocks for faster and more memory-efficient training, outperforming previously reported results for adaptive methods in parallel and multilayer deep neural systems.
ARTICLE | doi:10.20944/preprints202005.0430.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: Activity Context Sensing; Smartphones; Deep Convolutional Neural Networks; Smart devices
Online: 26 May 2020 (11:33:55 CEST)
With the widespread embedding of sensing capabilities in mobile devices, there has been unprecedented development of context-aware solutions, allowing the proliferation of various intelligent applications such as remote health and lifestyle monitoring and intelligent personalized services. However, activity context recognition based on multivariate time series signals obtained from mobile devices in unconstrained conditions is naturally prone to class-imbalance problems: recognition models tend to predict the classes with the majority of samples whilst ignoring the classes with the fewest, resulting in poor generalization. To address this problem, we propose to augment the time series signals from inertial sensors with signals from ambient sensing to train deep convolutional neural network (DCNN) models. A DCNN captures the local dependency and scale invariance of these combined sensor signals. Consequently, we developed one DCNN model using only inertial sensor signals and another that combines signals from both inertial and ambient sensors, to investigate the class-imbalance problem by improving the performance of the recognition model. Evaluation and analysis of the proposed system on data with imbalanced classes show that it achieves better recognition accuracy when data from inertial sensors are combined with ambient sensor data, such as environmental noise level and illumination, with an overall accuracy improvement of 5.3%.
ARTICLE | doi:10.20944/preprints202308.0712.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: bot; CNN; cyberattack; deep-learning; malware; NLP; phishing; social networks; spam
Online: 9 August 2023 (08:57:50 CEST)
Social networks have captured the attention of many people worldwide. However, these services have also attracted a considerable number of malicious users who aim to compromise the digital assets of other members, using messages as an attack vector to execute different variants of cyberattacks against them. This work therefore presents an approach based on natural language processing tools and a convolutional neural network architecture to detect and classify four types of cyberattack in social network messages: malware, phishing, spam, and one whose purpose is to deceive a user into spreading malicious messages to other users, identified in this work as a bot attack. One notable feature of this work is that it analyzes textual content without depending on characteristics of a specific social network, making its analysis independent of particular data sources. Finally, the approach was tested on real data, demonstrating its results in two stages. The first stage detects whether any of the four cyberattacks is present in a message, obtaining an accuracy of 0.91. Once a message is detected as a cyberattack, the second stage classifies it into one of the four types, achieving an accuracy of 0.82.
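The two-stage detect-then-classify design can be sketched as a small pipeline. The keyword-based detector and classifier below are toy stand-ins for the paper's CNN models, and the keyword list and labels are invented for illustration.

```python
def classify_message(text, detect, classify):
    # stage 1: decide whether the message is an attack at all;
    # stage 2: only then assign one of the attack types
    if not detect(text):
        return "benign"
    return classify(text)

# toy keyword-based stand-ins for the two CNN stages
ATTACK_WORDS = {"free", "click", "login", "download"}

def toy_detect(text):
    return any(w in text.lower().split() for w in ATTACK_WORDS)

def toy_classify(text):
    return "phishing" if "login" in text.lower() else "spam"

verdict = classify_message("Please login here to verify your account",
                           toy_detect, toy_classify)
```

Splitting the problem this way lets the binary detector absorb the easy benign/malicious decision, so the second model only has to discriminate among the attack types, matching the two accuracy figures reported per stage.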
TECHNICAL NOTE | doi:10.20944/preprints202009.0678.v1
Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: multi-frame super resolution; wide activation super resolution; 3D convolutional neural network; deep learning
Online: 27 September 2020 (11:54:56 CEST)
The small satellite market continues to grow year after year, with a compound annual growth rate of 17% estimated for the period 2020-2025. Low-cost satellites can send vast numbers of images to be post-processed on the ground to improve their quality and extract detailed information. In this domain lies the resolution enhancement task, where a low-resolution image is automatically converted to a higher resolution. Deep learning approaches to Super-Resolution (SR) have reached the state of the art in multiple benchmarks; however, most have been studied in a single-frame fashion. With satellite imagery, multi-frame images can be obtained under different conditions, giving the possibility to add more information per image and improve the final analysis. In this context, we developed a model that recently topped the European Space Agency's Multi-Frame Super-Resolution (MFSR) competition and applied it to the PROBA-V dataset of multi-frame satellite images. The model is based on proven methods that worked on 2D images, tweaked to work in 3D: the Wide Activation Super-Resolution (WDSR) family. We show that a simple 3D CNN residual architecture with WDSR blocks and a frame permutation technique as data augmentation can achieve better scores than more complex models. Moreover, the model requires few hardware resources for both training and evaluation, so it can be run directly on a personal laptop.
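The frame permutation augmentation exploits the fact that a multi-frame super-resolution model should be largely insensitive to the order of its low-resolution inputs, so shuffled orderings of the same stack are cheap extra training samples. A minimal sketch, with placeholder frame labels standing in for image arrays:

```python
import random

def frame_permutations(frames, k, seed=0):
    # data augmentation: k shuffled orderings of the same low-res frame stack
    rng = random.Random(seed)
    out = []
    for _ in range(k):
        stack = list(frames)   # copy, so the original ordering is preserved
        rng.shuffle(stack)
        out.append(stack)
    return out

augmented = frame_permutations(["lr0", "lr1", "lr2", "lr3"], k=3)
```

Each augmented stack contains exactly the same frames, only reordered, so the high-resolution target stays unchanged across all k permuted copies.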
ARTICLE | doi:10.20944/preprints202206.0043.v1
Subject: Medicine And Pharmacology, Orthopedics And Sports Medicine Keywords: deep learning; lumbarnet; lumbar spine; spondylolisthesis; u-net
Online: 3 June 2022 (10:15:57 CEST)
Spondylolisthesis, a common spinal condition, is the backward or forward displacement of one vertebra relative to the vertebra below, caused by a vertebra deviating from the smooth curvature of a normal spine. Aging-related illnesses such as degenerative spondylolisthesis are especially burdensome on social welfare and health-care systems in an aging society, and on radiologists and clinical physicians in particular. We therefore propose a computer-aided diagnosis algorithm, named LumbarNet, for vertebral slippage detection on clinical X-ray images. By combining i) the P-grade, ii) a piecewise slope detection scheme, and iii) a dynamic shift detection routine, LumbarNet is specialized for analyzing complex structural patterns in lumbar spine X-ray images and outcompetes other U-Net-based methods. Extensive experiments on lumbar spine X-ray images from standard clinical practice showed that LumbarNet achieved a mean intersection-over-union value of 0.88 in vertebral region detection and an accuracy of 88.83% in vertebral slippage detection.
ARTICLE | doi:10.20944/preprints202306.2220.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: infrared and visible image fusion; transformer; deep learning; residual dense block
Online: 30 June 2023 (10:40:08 CEST)
Infrared and visible image fusion technologies characterize the same scene through diverse modalities. However, most existing deep learning-based fusion methods are designed as symmetric networks, which ignore the differences between the modal images and lose source-image information during feature extraction. In this paper, we propose a new fusion framework tailored to the different characteristics of infrared and visible images. Specifically, we design a dual-stream asymmetric network (DSA-Net) with two different feature extraction networks that extract infrared and visible feature maps respectively. The transformer architecture is introduced in the infrared feature extraction branch, forcing the network to focus on the local features of infrared images while still capturing their contextual information. The visible feature extraction branch uses residual dense blocks to fully extract the rich background and texture detail of visible images. In this way, the framework provides better infrared targets and visible details for the fused image. Experimental results on multiple datasets indicate that DSA-Net outperforms state-of-the-art methods in both qualitative and quantitative evaluations. In addition, we apply the fusion results to a target detection task, which indirectly demonstrates the fusion performance of our method.
ARTICLE | doi:10.20944/preprints202003.0035.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: meta-learning; lie group; machine learning; deep learning; convolutional neural network
Online: 3 March 2020 (11:09:53 CET)
Deep learning has achieved many successes in many fields, but when training samples are extremely limited, it often underfits or overfits the few available samples. Meta-learning was proposed to address the difficulties of few-shot learning and fast adaptation. A meta-learner acquires common knowledge by training on a large set of tasks sampled from a certain data distribution, equipping it to generalize when facing unseen new tasks. Because of the limited samples, most approaches use only shallow neural networks to avoid overfitting and to reduce the difficulty of training, which wastes much extra information when adapting to unseen tasks. Gradient descent in Euclidean space also makes the meta-learner's updates inaccurate. These issues make it hard for many meta-learning models to extract features from samples and update network parameters. In this paper, we propose a novel method that uses a multi-stage joint training approach to ease the bottleneck in the adaptation process. To accelerate the adaptation procedure, we also constrain the network to the Stiefel manifold, so the meta-learner can perform more stable gradient descent in a limited number of steps. Experiments on mini-ImageNet show that our method reaches better accuracy under the 5-way 1-shot and 5-way 5-shot conditions.
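A weight matrix lies on the Stiefel manifold when its rows (or columns) are orthonormal, and one simple way to keep an updated matrix on the manifold is to re-orthonormalize it after each gradient step. The Gram-Schmidt sketch below illustrates that projection; the paper's actual Riemannian update rule may use a different retraction.

```python
def orthonormalize(rows):
    # Gram-Schmidt: project updated weight rows back to orthonormality,
    # i.e. back onto the Stiefel manifold
    basis = []
    for v in rows:
        w = list(v)
        for b in basis:
            dot = sum(x * y for x, y in zip(w, b))
            w = [x - dot * y for x, y in zip(w, b)]   # remove component along b
        norm = sum(x * x for x in w) ** 0.5
        basis.append([x / norm for x in w])
    return basis

Q = orthonormalize([[3.0, 1.0], [2.0, 2.0]])   # toy post-gradient-step weights
```

Orthonormal weights keep the layer norm-preserving, which is what makes the constrained gradient steps more stable than unconstrained Euclidean updates.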
ARTICLE | doi:10.20944/preprints201810.0756.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Dermoscopic Image, Skin Lesion, Melanoma, Simulation, Generative Adversarial Networks, Deep Learning
Online: 1 November 2018 (17:54:12 CET)
Automated skin lesion analysis is one of the trending fields that has gained attention among dermatologists and healthcare practitioners. Skin lesion restoration is an essential preprocessing step for lesion enhancement toward accurate automated analysis and diagnosis. Digital hair removal is a non-invasive method for image enhancement that solves the hair-occlusion artefact in previously captured images, and several methods have been proposed for hair delineation and removal. However, manual annotation is one of the main challenges that hinders the validation of these methods on large numbers of images or on benchmark datasets for comparison purposes. In the presented work, we propose a realistic hair simulator based on context-aware image synthesis, using image-to-image translation via conditional generative adversarial networks, to generate different hair occlusions in skin images along with a ground-truth mask of hair locations. Besides, we explore three loss functions, the L1-norm, the L2-norm, and the structural similarity index (SSIM), to maximise synthesis quality. To quantitatively evaluate the realism of the image synthesis, t-SNE feature mapping and the Bland-Altman test are employed as objective metrics. Experimental results show the superior performance of our proposed method compared to previous hair synthesis methods, with plausible colours and preservation of the integrity of the lesion texture.
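The SSIM term among the explored losses can be sketched in its simplest global form. This is a hedged simplification: it treats a whole patch as one window, whereas standard SSIM averages over Gaussian-weighted local windows, and the stability constants below are arbitrary small values.

```python
def ssim(x, y, c1=1e-4, c2=9e-4):
    # global structural similarity between two equal-length pixel lists
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx * mx + my * my + c1) * (vx + vy + c2))

patch = [0.2, 0.4, 0.6, 0.8]
same = ssim(patch, patch)                        # identical patches score 1
shifted = ssim(patch, [p + 0.3 for p in patch])  # brightness shift lowers the score
```

Unlike the L1 and L2 losses, SSIM compares luminance, contrast, and structure jointly, which is why it is a natural candidate for preserving lesion texture in the synthesized images.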
ARTICLE | doi:10.20944/preprints201908.0068.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: deep learning; convolutional neural networks (CNN); transfer learning; class activation mapping (CAM); building defects; structural-health monitoring
Online: 6 August 2019 (04:18:29 CEST)
Clients are increasingly looking for fast and effective means to quickly and frequently survey and communicate the condition of their buildings, so that essential repair and maintenance work can be done proactively and in a timely manner before it becomes too dangerous and expensive. Traditional methods for this type of work commonly involve engaging building surveyors to undertake a condition assessment, a lengthy site inspection that produces a systematic record of the physical condition of the building elements, including estimates of the immediate and projected long-term costs of renewal, repair, and maintenance. Current asset condition assessment procedures are extremely time-consuming, laborious, and expensive, and pose health and safety threats to surveyors, particularly at height and at roof level, which are difficult to access. We propose a method for automated detection and localisation of key building defects from images using deep learning and convolutional neural networks. The proposed model is based on a pre-trained VGG-16 classifier with Class Activation Mapping (CAM) for object localisation. The model has proven robust, accurately detecting and localising mould growth, stains, and paint deterioration defects arising from dampness in buildings. The approach is being developed with the potential to scale up to automated real-time detection of defects and deterioration in buildings using mobile devices and drones.
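The CAM localisation step is just a weighted sum: each feature map of the final convolutional layer is scaled by the fully-connected weight that the predicted class assigns to that channel, and the scaled maps are summed into a heatmap. The 2x2 maps and class weights below are toy values for illustration, not VGG-16 activations.

```python
def class_activation_map(feature_maps, class_weights):
    # CAM: weight each final-conv channel by the class's FC weight and sum
    h, w = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * w for _ in range(h)]
    for fmap, wt in zip(feature_maps, class_weights):
        for i in range(h):
            for j in range(w):
                cam[i][j] += wt * fmap[i][j]
    return cam

maps = [
    [[1.0, 0.0], [0.0, 0.0]],   # channel 0 fires top-left
    [[0.0, 0.0], [0.0, 1.0]],   # channel 1 fires bottom-right
]
cam = class_activation_map(maps, [0.9, 0.1])   # toy weights for one defect class
```

High values in the resulting heatmap mark the image regions most responsible for the class score, which is how the model localises a defect without any bounding-box supervision.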