ARTICLE | doi:10.20944/preprints202209.0231.v1
Subject: Computer Science And Mathematics, Probability And Statistics Keywords: neural networks; regularization; deep networks
Online: 15 September 2022 (13:06:13 CEST)
Numerous approaches address over-fitting in neural networks: by imposing a penalty on the parameters of the network (L1, L2, etc.); by changing the network stochastically (drop-out, Gaussian noise, etc.); or by transforming the input data (batch normalization, etc.). In contrast, we aim to ensure that a minimum amount of supporting evidence is present when fitting the model parameters to the training data. At the single-neuron level, this is equivalent to ensuring that both sides of the separating hyperplane (for a standard artificial neuron) contain a minimum number of data points, noting that these points need not belong to the same class for the inner layers. We first benchmark the results of this approach on the standard Fashion-MNIST dataset, comparing it to various regularization techniques. Interestingly, we note that by nudging each neuron to divide, at least in part, its input data, the resulting networks make use of every neuron, avoiding a hyperplane that lies completely on one side of its input data (which is equivalent to feeding a constant into the next layers). To illustrate this point, we study the prevalence of saturated nodes throughout training, showing that neurons are activated more frequently and earlier in training when using this regularization approach. A direct consequence of the improved neuron activation is that deep networks become easier to train. This is crucially important when the network topology is not known a priori and fitting often remains stuck in suboptimal local minima. We demonstrate this property by training networks of increasing depth (and constant width): most regularization approaches result in increasingly frequent training failures (over different random seeds), whilst the proposed evidence-based regularization significantly outperforms them in its ability to train deep networks.
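The abstract does not give the exact form of the penalty, but the idea of requiring supporting evidence on both sides of each neuron's hyperplane can be sketched as a differentiable regularization term. The threshold, sharpness, and function name below are illustrative assumptions, not the authors' implementation.

```python
import torch

def evidence_penalty(pre_activations, min_frac=0.1, sharpness=10.0):
    """Hypothetical sketch: softly penalize neurons whose hyperplane leaves
    fewer than `min_frac` of the batch on either side (Wx + b > 0 vs. < 0)."""
    # pre_activations: (batch, neurons), i.e. Wx + b before the nonlinearity
    frac_positive = torch.sigmoid(sharpness * pre_activations).mean(dim=0)
    frac_negative = 1.0 - frac_positive
    deficit = torch.relu(min_frac - torch.minimum(frac_positive, frac_negative))
    return deficit.sum()  # add to the task loss, scaled by a regularization weight
```

A neuron whose pre-activations all fall on one side of zero (a saturated, effectively constant unit) receives the maximum deficit, which matches the abstract's observation about avoiding saturated nodes.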
ARTICLE | doi:10.20944/preprints202102.0318.v3
Subject: Medicine And Pharmacology, Immunology And Allergy Keywords: Machine Learning; Artificial Intelligence; Androgen Receptor; Random Forest; Deep Neural Network; Convolutional
Online: 24 February 2021 (13:14:01 CET)
Substances that can modify the androgen receptor pathway in humans and animals are entering the environment and food chain, with the proven ability to disrupt hormonal systems, leading to toxicity and adverse effects on reproduction, brain development, and prostate cancer, among others. State-of-the-art databases with experimental data on human, chimp, and rat effects of chemicals have been used to build machine learning classifiers and regressors and evaluate them on independent sets. Different featurizations, algorithms, and protein structures lead to different results, with deep neural networks (DNNs) on user-defined physicochemically relevant features developed for this work outperforming graph convolutional, random forest, and large featurizations. The results show that these user-provided structure-, ligand-, and statistically-based features and specific DNNs provided the best results as determined by AUC (0.87), MCC (0.47), and other metrics, and by their interpretability and the chemical meaning of the descriptors/features. In addition, the same features in the DNN method performed better than in a multivariate logistic model: validation MCC = 0.468 and training MCC = 0.868 for the present work compared to evaluation set MCC = 0.2036 and training set MCC = 0.5364 for the multivariate logistic regression on the full, unbalanced set. Techniques of this type may improve AR and toxicity description and prediction, improving the assessment and design of compounds. Source code and data are available at https://github.com/AlfonsoTGarcia-Sosa/ML
ARTICLE | doi:10.20944/preprints202309.1202.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: speech emotion recognition; deep learning; Deep Belief Network; deep neural network; Convolutional Neural Network; LSTM; attention mechanism
Online: 19 September 2023 (08:24:22 CEST)
Speech Emotion Recognition (SER) is an interesting and difficult problem to handle. In this paper, we address it through the implementation of deep learning networks. We have designed and implemented six different deep learning networks: a Deep Belief Network (DBN), a simple deep neural network (SDNN), an LSTM network (LSTM), an LSTM network with the addition of an attention mechanism (LSTM-ATN), a Convolutional Neural Network (CNN), and a Convolutional Neural Network with the addition of an attention mechanism (CNN-ATN), with the aim, apart from solving the SER problem, of testing the impact of the attention mechanism on the results. Dropout and Batch Normalization techniques are also used to improve the generalization ability (prevention of overfitting) of the models as well as to speed up the training process. The Surrey Audio-Visual Expressed Emotion (SAVEE) database and the Ryerson Audio-Visual Database (RAVDESS) were used for training and evaluation of our models. The results showed that the networks with the addition of the attention mechanism performed better than the others. Furthermore, they showed that CNN-ATN was the best among the tested networks, achieving an accuracy of 74% for the SAVEE dataset and 77% for the RAVDESS dataset, and exceeding existing state-of-the-art systems for the same datasets.
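As a rough illustration of the LSTM-with-attention variant described above, the following Keras sketch adds a self-attention layer over the LSTM time steps together with the Dropout and Batch Normalization mentioned in the abstract; the feature dimensions, layer sizes, and number of emotion classes are assumptions rather than the authors' configuration.

```python
import tensorflow as tf

n_mfcc, timesteps, n_classes = 40, 100, 7   # assumed feature/label dimensions
inputs = tf.keras.Input(shape=(timesteps, n_mfcc))
x = tf.keras.layers.LSTM(128, return_sequences=True)(inputs)
x = tf.keras.layers.BatchNormalization()(x)
# simple self-attention over the LSTM time steps
attn = tf.keras.layers.Attention()([x, x])
x = tf.keras.layers.GlobalAveragePooling1D()(attn)
x = tf.keras.layers.Dropout(0.3)(x)
outputs = tf.keras.layers.Dense(n_classes, activation="softmax")(x)
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
```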
REVIEW | doi:10.20944/preprints202104.0421.v1
Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: non-intrusive load monitoring; load disaggregation; NILM; review; deep learning; deep neural networks; machine learning
Online: 15 April 2021 (15:05:09 CEST)
This paper reviews non-intrusive load monitoring (NILM) approaches that employ deep neural networks to disaggregate appliances from low frequency data, i.e. data with sampling rates lower than the AC base frequency. We first review the many degrees of freedom of these approaches, what has already been done in the literature, and compile the main characteristics of the reviewed publications in an extensive overview table. The second part of the paper discusses selected aspects of the literature and corresponding research gaps. In particular, we perform a performance comparison with respect to reported MAE and F1-scores and observe different recurring elements in the best performing approaches, namely data sampling intervals below 10 s, a large field of view, the usage of GAN losses, multi-task learning, and post-processing. Subsequently, multiple input features, multi-task learning and related research gaps are discussed, the need for comparative studies is highlighted, and finally, missing elements for a successful deployment of NILM approaches based on deep neural networks are pointed out. We conclude the review with an outlook on possible future scenarios.
ARTICLE | doi:10.20944/preprints202308.0047.v1
Subject: Physical Sciences, Astronomy And Astrophysics Keywords: image classification; astronomy; asteroids; convolutional neural network; deep learning
Online: 1 August 2023 (11:08:14 CEST)
Near Earth Asteroids represent potential threats to human life because their trajectories may bring them into the proximity of the Earth. Monitoring these objects could help predict future impact events, but such efforts are hindered by the large numbers of objects that pass through the Earth's vicinity. Additionally, there is the problem of distinguishing asteroids from other objects in the night sky, which implies sifting through large sets of telescope image data. Within this context, we believe that employing machine learning techniques could greatly improve the detection process by sorting out the most likely asteroid candidates to be reviewed by human experts. At the moment, the use of machine learning techniques is still limited in the field of astronomy, and the main goal of the present paper is to study the effectiveness of deep CNNs for the classification of astronomical objects, asteroids in this particular case, by comparing some of the well-known deep convolutional neural networks, including InceptionV3, Xception, InceptionResNetV2, and ResNet152V2. We have applied transfer learning and fine-tuning to these pre-existing deep convolutional networks, and the results we have obtained show the potential of using deep convolutional neural networks in the process of asteroid classification. The InceptionV3 model has the best results in the asteroid class, meaning that by using it, we lose the smallest number of valid asteroids.
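A minimal sketch of the transfer learning and fine-tuning workflow mentioned above, using InceptionV3 in Keras; the input size, two-class setup (asteroid vs. non-asteroid), and optimizer settings are illustrative assumptions.

```python
import tensorflow as tf

# transfer learning: start from ImageNet weights and freeze the convolutional base
base = tf.keras.applications.InceptionV3(weights="imagenet", include_top=False,
                                         input_shape=(299, 299, 3))
base.trainable = False
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(2, activation="softmax"),  # asteroid vs. non-asteroid (assumed)
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
# fine-tuning: later unfreeze the top of the base and re-compile with a lower learning rate
```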
ARTICLE | doi:10.20944/preprints202109.0285.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: remote sensing; deep learning; image classification
Online: 16 September 2021 (13:38:55 CEST)
Autonomous image recognition has numerous potential applications in the field of planetary science and geology. For instance, having the ability to classify images of rocks would allow geologists to have immediate feedback without having to bring samples back to the laboratory. Also, planetary rovers could classify rocks in remote places and even on other planets without needing human intervention. Shu et al. classified 9 different types of rock images using a Support Vector Machine (SVM) with image features extracted autonomously. Through this method, the authors achieved a test accuracy of 96.71%. In this research, Convolutional Neural Networks (CNNs) have been used to classify the same set of rock images. Results show that a 3-layer network obtains an average accuracy of 99.60% across 10 trials on the test set. A version of Self-taught Learning was also implemented to prove the generalizability of the features extracted by the CNN. Finally, one model has been chosen to be deployed on a mobile device to demonstrate practicality and portability. The deployed model achieves a perfect classification accuracy on the test set, while taking only 0.068 seconds to make a prediction, equivalent to about 14 frames per second.
ARTICLE | doi:10.20944/preprints202308.0712.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: bot; CNN; cyberattack; deep-learning; malware; NLP; phishing; social networks; spam
Online: 9 August 2023 (08:57:50 CEST)
Social networks have captured the attention of many people worldwide. However, these services have also attracted a considerable number of malicious users whose purpose is to compromise the digital assets of other members by using messages as an attack vector to execute different variants of cyberattacks against them. Therefore, this work presents an approach based on Natural Language Processing tools and a Convolutional Neural Network architecture to detect and classify, in social network messages, four types of cyberattack: malware, phishing, spam, and one whose purpose is deceiving the user into spreading malicious messages to other users, which in this work is identified as a bot attack. One notable feature of this work is that it analyzes textual content without depending on any characteristics of a specific social network, making its analysis independent of particular data sources. Finally, this work was tested on real data, demonstrating its results in two stages. The first detects the existence of any of the four cyberattacks within the message, obtaining an accuracy value of 0.91. After a message is detected as a cyberattack, the next stage classifies it into one of the four types of cyberattack, achieving an accuracy value of 0.82.
ARTICLE | doi:10.20944/preprints201910.0376.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: artificial neural network; deep learning; LSTM; speech processing
Online: 31 October 2019 (16:40:30 CET)
Speech signals are degraded in real-life environments as a product of background noise and other factors. The processing of such signals for voice recognition and voice analysis systems presents important challenges. One of the conditions that makes adverse quality difficult to handle in those systems is reverberation, produced by sound wave reflections that travel from the source to the microphone in multiple directions. To enhance signals in such adverse conditions, several deep learning-based methods have been proposed and proven to be effective. Recently, recurrent neural networks, especially those with long short-term memory (LSTM), have presented surprising results in tasks related to time-dependent processing of signals, such as speech. One of the most challenging aspects of LSTM networks is the high computational cost of the training procedure, which has limited extended experimentation in several cases. In this work, we present a proposal to evaluate hybrid models of neural networks for learning different reverberation conditions without any previous information. The results show that some combinations of LSTM and perceptron layers produce good results in comparison to those from pure LSTM networks, given a fixed number of layers. The evaluation has been made based on quality measurements of the signal's spectrum, the training time of the networks, and statistical validation of the results. The results help to affirm that hybrid networks represent an important solution for speech signal enhancement, with advantages in efficiency but without a significant drop in quality.
ARTICLE | doi:10.20944/preprints202009.0524.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: COVID-19; chest X-ray images; deep convolutional neural network; COV-MCNet; deep learning
Online: 23 September 2020 (03:31:30 CEST)
The COVID-19 pandemic has made the quick identification and screening of COVID-19 patients even more difficult for medical specialists. Therefore, a significant study is necessary for detecting COVID-19 cases using an automated diagnosis method, which can aid in controlling the spread of the virus. In this paper, the study suggests a Deep Convolutional Neural Network-based multi-classification approach (COV-MCNet) using eight different pre-trained architectures, namely VGG16, VGG19, ResNet50V2, DenseNet201, InceptionV3, MobileNet, InceptionResNetV2, and Xception, which are trained and tested on X-ray images of COVID-19, Normal, Viral Pneumonia, and Bacterial Pneumonia. The results from the 3-class task (Normal vs. COVID-19 vs. Viral Pneumonia) showed that the ResNet50V2 model provides the highest classification performance (accuracy: 95.83%, precision: 96.12%, recall: 96.11%, F1-score: 96.11%, specificity: 97.84%) compared to the rest of the models. The results from the 4-class task (Normal vs. COVID-19 vs. Viral Pneumonia vs. Bacterial Pneumonia) demonstrated that the pre-trained model DenseNet201 provides the highest classification performance (accuracy: 92.54%, precision: 93.05%, recall: 92.81%, F1-score: 92.83%, specificity: 97.47%). Notably, the ResNet50V2 (3-class) and DenseNet201 (4-class) models in the proposed COV-MCNet framework showed higher accuracy than the other six models. This indicates that the designed system can produce promising results for detecting COVID-19 cases as more data become available. The proposed multi-classification network (COV-MCNet) significantly speeds up the existing radiology-based method, which will be helpful to the medical community and clinical specialists for early diagnosis of COVID-19 cases during this pandemic.
ARTICLE | doi:10.20944/preprints201804.0286.v1
Subject: Business, Economics And Management, Finance Keywords: electricity price forecasting; deep learning; gated recurrent units; long short term memory; artificial intelligence; Turkish day-ahead market
Online: 23 April 2018 (11:38:27 CEST)
Accurate electricity price forecasting has become a substantial requirement since the liberalization of the electricity markets. Due to the challenging nature of electricity prices, which include high volatility, sharp price spikes, and seasonality, various types of electricity price forecasting models still compete and cannot outperform each other consistently. Neural networks have been successfully used in machine learning problems, and Recurrent Neural Networks (RNNs) have been proposed to address time-dependent learning problems. In particular, Long Short Term Memory (LSTM) and Gated Recurrent Units (GRU) are tailor-made for time series price estimation. In this paper, we propose to use Gated Recurrent Units as a new technique for electricity price forecasting. We have trained a variety of algorithms with a rolling 3-year window and compared the results with the RNNs. In our experiments, 3-layered GRUs outperformed all other neural network structures and state-of-the-art statistical techniques in a statistically significant manner in the Turkish day-ahead market.
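A minimal sketch of a 3-layer GRU forecaster of the kind described above, in Keras; the input window length and the 24-hour output horizon are assumptions rather than the paper's exact setup (which uses a rolling 3-year training window).

```python
import tensorflow as tf

window = 72  # assumed: 72 hourly prices as the input sequence
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(window, 1)),
    tf.keras.layers.GRU(64, return_sequences=True),
    tf.keras.layers.GRU(64, return_sequences=True),
    tf.keras.layers.GRU(64),
    tf.keras.layers.Dense(24),  # assumed target: the next day's 24 hourly prices
])
model.compile(optimizer="adam", loss="mae")
```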
ARTICLE | doi:10.20944/preprints202310.0505.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: predictive maintenance; convolutional neural network; deep learning; vibration
Online: 9 October 2023 (11:37:52 CEST)
All kinds of vessels consist of dozens of complex machines with rotating parts and electric motors that operate continuously in a harsh environment with excess temperature and humidity, vibration, fatigue, and load. A breakdown or malfunction in one of these machines can significantly impact the vessel's operation and safety and, consequently, the safety of the crew and the environment. To maintain operational efficiency and seaworthiness, the shipping industry invests substantial resources in preventive maintenance and repairs. This research presents the economic and technical benefits of predictive maintenance over traditional preventive maintenance and repair-by-replacement approaches in the maritime domain. By leveraging modern technology and Artificial Intelligence, we can analyze the real-time operating conditions of machinery, enabling early detection of potential damage and allowing for effective planning of future maintenance and repair activities. In this paper, we propose and develop a Convolutional Neural Network that is fed with raw vibration measurements acquired in a laboratory environment from the ball bearings of a motor. We then investigate whether the proposed network can accurately detect the functional state of the ball bearings and categorize any failures present, contributing to improved maintenance practices in the shipping industry.
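A minimal sketch of a 1D convolutional network fed with raw vibration segments, as described above; the segment length, filter sizes, and number of fault classes are assumptions, not the authors' architecture.

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(2048, 1)),           # assumed raw vibration segment length
    tf.keras.layers.Conv1D(32, 64, strides=8, activation="relu"),  # wide first kernel over the raw signal
    tf.keras.layers.MaxPooling1D(2),
    tf.keras.layers.Conv1D(64, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(4, activation="softmax"),    # assumed: healthy state plus three bearing fault types
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
```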
ARTICLE | doi:10.20944/preprints202307.0724.v1
Subject: Environmental And Earth Sciences, Remote Sensing Keywords: crop type recognition; deep learning; crowdsourcing; street-level imagery
Online: 12 July 2023 (04:40:51 CEST)
The creation of crop-type maps from satellite data has proven challenging, often impeded by a lack of accurate in-situ data. This paper aims to demonstrate a method for crop-type (i.e., maize, wheat, and other) recognition based on Convolutional Neural Networks using a bottom-up approach. We trained the model with a highly accurate dataset of crowdsourced, labelled street-level imagery. Classification results achieved an AUC of 0.87 for wheat, 0.85 for maize, and 0.73 for other. Given that wheat and maize are the two most common food crops globally, combined with an ever-increasing amount of available street-level imagery, this approach could help address the need for improved crop-type monitoring globally. Challenges remain in addressing the noisy aspects of street-level imagery (i.e., buildings, hedgerows, automobiles, etc.), where a variety of different objects tend to restrict the view and confound the algorithms.
REVIEW | doi:10.20944/preprints202104.0739.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Deep neural network; survey; document images; review paper; deep learning; performance evaluation; page object detection; graphical page objects; document image analysis; page segmentation
Online: 28 April 2021 (10:17:49 CEST)
In any document, graphical elements like tables, figures, and formulas contain essential information. The processing and interpretation of such information require specialized algorithms. Off-the-shelf OCR components cannot process this information reliably. Therefore, an essential step in document analysis pipelines is to detect these graphical components. This leads to a high-level conceptual understanding of the documents that makes digitization viable. Since the advent of deep learning, the performance of deep learning-based object detection has improved manyfold. In this work, we outline and summarize the deep learning approaches for detecting graphical page objects in document images. We discuss the most relevant deep learning-based approaches and the state of the art in graphical page object detection in document images. This work provides a comprehensive understanding of the current state of the art and related challenges. Furthermore, we discuss the leading datasets along with their quantitative evaluation. Finally, we briefly discuss promising directions that can be utilized for further improvements.
ARTICLE | doi:10.20944/preprints202301.0208.v1
Subject: Physical Sciences, Biophysics Keywords: Deep belief network; Diabetes; Prediction; Risk Factors; Deep Learning
Online: 12 January 2023 (03:54:15 CET)
Diabetes mellitus is a prevalent, life-threatening disease, and patients may gradually start suffering from other diabetes-related conditions such as heart attack, stroke, hypertension, blurry vision, blindness, foot ulcer, amputation, kidney damage, and other organ failures before diagnosis. Early detection can help reduce the fatality of this disease. Deep learning models have proven very useful in disease detection and computer-aided diagnosis. In this work, we propose a deep unsupervised machine learning model for early detection of diabetes using voting ensemble feature selection and deep belief neural networks (DBN). The dataset was obtained from an online repository containing responses of pre-diagnosed patients to direct questionnaires administered in Sylhet Diabetes Hospital in Sylhet, Bangladesh. The dataset was preprocessed, and features were reduced using the ensemble feature selector. The DBN model was pretrained and tuned to obtain optimal performance. The model was also compared with other models without multiple hidden layers. The DBN performed at its relative best with an F1-measure, precision, and recall of 1.00, 0.92, and 1.00, respectively. We conclude that the DBN is a useful tool for the unsupervised early prediction of Type II diabetes mellitus.
REVIEW | doi:10.20944/preprints202011.0152.v1
Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: EEG signal recognition; machine learning in EEG; neural networks in EEG; dry electrode EEG; deep learning EEG
Online: 3 November 2020 (14:07:29 CET)
In the last decade, unprecedented progress in the development of neural networks has influenced dozens of different industries, among which is signal processing for electroencephalography (EEG). Even though electroencephalography appeared in the first half of the 20th century, its physical principles of operation have not changed to this day. However, the signal processing techniques in this area have progressed significantly thanks to the use of neural networks. Evidence of this is that, over the past 5 years, more than 1000 publications on the topic of using machine learning for EEG have appeared in popular libraries. The many different models of neural networks complicate the process of understanding the real situation in this area. In this manuscript, we provide a comprehensive overview of research in which neural networks have been used for EEG signal processing.
ARTICLE | doi:10.20944/preprints202304.0203.v4
Subject: Engineering, Electrical And Electronic Engineering Keywords: Electric Vehicles; Battery Management System; Lithium-ion batteries; Deep Learning
Online: 19 April 2023 (03:34:32 CEST)
This paper presents an improved SOC estimation method for lithium-ion batteries in Electric Vehicles using a Bayesian optimized feedforward network. This innovative Bayesian optimized neural network method attempts to minimize a scalar objective function by extracting hyperparameters (hidden neurons in both layers) using a surrogate model. Furthermore, the hyperparameters are built, and the data samples are trained and validated. The performance of the proposed deep learning neural network is evaluated. Two reasonably sized data samples are extracted from the Panasonic 18650PF Li-ion Mendeley datasets and used for training and validation. RNN and LSTM neural network algorithms offer the common core property of retaining past information and/or hidden states for better SOC estimation. However, the distinguishing feature of this proposed method is the inclusion of Bayesian optimization, which chooses the optimal number of hidden neurons in the two layers. Analysis of the results shows that the Bayesian optimized feedforward algorithm has the lowest average MAPE (0.20%) and is the best selection compared with the average MAPE of the other five deep learning algorithms. In the last quarter of the fuel gauge, where fuel anxiety is severe, the feedforward with Bayesian Optimization algorithm is still the best selection (with a MAPE of 0.64%).
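The following sketch illustrates the general idea of using Bayesian optimization with a surrogate model to choose the two hidden layer sizes of a feedforward SOC regressor; the library (scikit-optimize), search ranges, and placeholder data are assumptions, not the authors' implementation.

```python
import numpy as np
from skopt import gp_minimize
from skopt.space import Integer
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 3))   # placeholder features (e.g. voltage, current, temperature)
y_train = rng.uniform(size=200)       # placeholder SOC targets in [0, 1]

def objective(params):
    h1, h2 = params
    net = MLPRegressor(hidden_layer_sizes=(h1, h2), max_iter=500, random_state=0)
    # minimize the cross-validated mean absolute error; the Gaussian-process surrogate
    # inside gp_minimize decides which (h1, h2) to try next
    return -cross_val_score(net, X_train, y_train,
                            scoring="neg_mean_absolute_error", cv=3).mean()

result = gp_minimize(objective, [Integer(8, 128), Integer(8, 128)], n_calls=25, random_state=0)
best_h1, best_h2 = result.x  # optimal numbers of hidden neurons for the two layers
```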
ARTICLE | doi:10.20944/preprints202008.0113.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Scene classification; Deep Learning; Convolutional Neural Networks; Feature learning
Online: 5 August 2020 (06:19:27 CEST)
State-of-the-art remote sensing scene classification methods employ different Convolutional Neural Network architectures for achieving very high classification performance. A trait shared by the majority of these methods is that the class associated with each example is ascertained by examining the activations of the last fully connected layer, and the networks are trained to minimize the cross-entropy between predictions extracted from this layer and ground-truth annotations. In this work, we extend this paradigm by introducing an additional output branch which maps the inputs to low dimensional representations, effectively extracting additional feature representations of the inputs. The proposed model imposes additional distance constraints on these representations with respect to identified class representatives, in addition to the traditional categorical cross-entropy between predictions and ground-truth. By extending the typical cross-entropy loss function with a distance learning function, our proposed approach achieves significant gains across a wide set of benchmark datasets in terms of classification, while providing additional evidence related to class membership and classification confidence.
BRIEF REPORT | doi:10.20944/preprints202305.0768.v1
Subject: Environmental And Earth Sciences, Environmental Science Keywords: Climate; Contiguous United States; Deep Neural Network; Land Cover; Large Wildfire
Online: 10 May 2023 (14:46:12 CEST)
Over the last several decades, large wildfires have become increasingly common across the United States, causing disproportionate impacts on forest health and function, human well-being, and the economy. Here, we examine the severity of large wildfires across the Contiguous United States over the past decade (2011-2020) using a wide array of meteorological, vegetational, and topographical features in a Deep Neural Network model. A total of 4,538 wildfire incidents were used in the analysis, covering 87,305 square miles of burned area. We observed the highest number of large wildfires in California, Texas, and Idaho, with lightning causing 43% of these incidents. Importantly, the results indicate that the severity of wildfire occurrences is highly correlated with the climatological forcings, land cover, location, and elevation of the ecosystem. Overall, the results may serve as a useful guide in managing landscapes under changing climate and disturbance regimes.
ARTICLE | doi:10.20944/preprints201905.0228.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Deep learning, LSTM, Machine learning, Post-filtering, Signal processing, Speech Synthesis
Online: 17 May 2019 (16:16:53 CEST)
Several researchers have contemplated deep learning-based post-filters to increase the quality of statistical parametric speech synthesis; these perform a mapping of the synthetic speech to the natural speech, considering the different parameters separately and trying to reduce the gap between them. Long Short-term Memory (LSTM) Neural Networks have been applied successfully for this purpose, but there are still many aspects to improve in the results and in the process itself. In this paper, we introduce a new pre-training approach for the LSTM, with the objective of enhancing the quality of the synthesized speech, particularly in the spectrum, in a more efficient manner. Our approach begins with an auto-associative training of one LSTM network, which is then used as an initialization for the post-filters. We show the advantages of this initialization for enhancing the Mel-Frequency Cepstral parameters of synthetic speech. Results show that the initialization achieves better results in enhancing the statistical parametric speech spectrum in most cases when compared to the common random initialization approach.
ARTICLE | doi:10.20944/preprints201908.0068.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: deep learning; convolutional neural networks (CNN); transfer learning; class activation mapping (CAM); building defects; structural-health monitoring
Online: 6 August 2019 (04:18:29 CEST)
Clients are increasingly looking for fast and effective means to quickly and frequently survey and communicate the condition of their buildings so that essential repairs and maintenance work can be done in a proactive and timely manner before it becomes too dangerous and expensive. Traditional methods for this type of work commonly comprise engaging building surveyors to undertake a condition assessment, which involves a lengthy site inspection to produce a systematic recording of the physical condition of the building elements, including cost estimates of immediate and projected long-term costs of renewal, repair, and maintenance of the building. Current asset condition assessment procedures are extensively time-consuming, laborious, and expensive, and pose health and safety threats to surveyors, particularly at height and roof levels, which are difficult to access. We propose a method for automated detection and localisation of key building defects from images using deep learning and convolutional neural networks. The proposed model is based on a pre-trained VGG-16 classifier with Class Activation Mapping (CAM) for object localisation. The model has proven to be robust and able to accurately detect and localise mould growth, stains, and paint deterioration defects arising from dampness in buildings. The approach is being developed with the potential to scale up to support automated detection of defects and deterioration of buildings in real time using mobile devices and drones.
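A minimal sketch of the VGG-16 plus Class Activation Mapping idea described above; the defect class list, the global-average-pooling head, and the layer sizes are assumptions, and the authors' classifier head may differ.

```python
import numpy as np
import tensorflow as tf

# VGG-16 backbone with a global-average-pooling head so that CAMs can be computed
base = tf.keras.applications.VGG16(weights="imagenet", include_top=False,
                                   input_shape=(224, 224, 3))
features = base.output                                        # (7, 7, 512) feature maps
gap = tf.keras.layers.GlobalAveragePooling2D()(features)
outputs = tf.keras.layers.Dense(4, activation="softmax")(gap)  # assumed classes: mould / stain / paint deterioration / normal
model = tf.keras.Model(base.input, outputs)

def class_activation_map(image, class_index):
    """Project the dense-layer weights for one class back onto the last conv feature maps
    to obtain a coarse localisation heat map (image must already be preprocessed)."""
    conv_model = tf.keras.Model(model.input, features)
    fmap = conv_model.predict(image[None, ...], verbose=0)[0]          # (7, 7, 512)
    class_weights = model.layers[-1].get_weights()[0][:, class_index]  # (512,)
    return np.maximum(fmap @ class_weights, 0)
```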
REVIEW | doi:10.20944/preprints202310.1655.v1
Subject: Medicine And Pharmacology, Oncology And Oncogenics Keywords: Graph Neural Network; GNN; Deep Learning; Cancer; Oncology; Graphical Model; Bayesian Network
Online: 26 October 2023 (03:33:36 CEST)
Next-generation cancer and oncology research needs to take full advantage of the multi-modal structured, or graph, information, with the graph datatypes ranging from molecular structures to spatially resolved imaging and digital pathology to biological networks to knowledge graphs. Graph Neural Networks (GNNs) efficiently combine the graph structure representations with the high predictive performance of deep learning, especially on the large multi-modal datasets. In this review article, we survey the landscape of recent (2020-present) GNN applications in the context of cancer and oncology research, and delineate six currently predominant research areas. Subsequently, we identify the most promising directions for future research. We compare GNNs with graphical models and "non-structured" deep learning, and devise the guidelines for cancer and oncology researchers or physician-scientists asking the question of whether they should adopt the GNN methodology in their research pipelines.
ARTICLE | doi:10.20944/preprints201805.0276.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: blood pressure; oscillometric measurement; statistical analysis; normality; confidence interval; deep belief networks
Online: 21 May 2018 (12:54:26 CEST)
Oscillometric blood pressure (BP) devices currently estimate a single point but do not identify fluctuations in BP or distinguish them from variations in response to physiological properties. In this paper, to analyze BP normality based on oscillometric measurements, we use statistical approaches including kurtosis, skewness, Kolmogorov-Smirnov, and correlation tests. Then, to mitigate uncertainties, we use a deep neural network (DNN) to determine the confidence limits (CLs) of BP measurements based on their normality. The proposed DNN regression model decreases the standard deviation of error (SDE) of the mean error (ME) and the mean absolute error (MAE) and reduces the uncertainty of the CLs and SDEs of the proposed technique. We validate the normality of the BP estimation distribution, which fits the Gaussian distribution very well. We use a rank test in the DNN regression model to demonstrate the independence of the artificial SBP and DBP estimations. First, we perform statistical tests to verify the normality of the BP measurements for individual subjects. The proposed methodology provides accurate BP estimations and reduces the uncertainties associated with the CLs and SDEs based on the DNN regression estimator.
ARTICLE | doi:10.20944/preprints202301.0579.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: hybrid modeling; deep neural networks; deep learning; SBML; systems biology; computational modeling
Online: 31 January 2023 (08:51:13 CET)
In this paper we propose a computational framework that merges mechanistic modeling with deep neural networks obeying the Systems Biology Markup Language (SBML) standard. Over the last 20 years, the systems biology community has developed a large number of mechanistic models in SBML that are currently stored in public databases. With the proposed framework, existing SBML mechanistic models may be upgraded to hybrid systems through the incorporation of deep neural networks into the model core, using a freely available Python tool. The so-formed hybrid mechanistic/neural network models are trained with a deep learning algorithm based on the adaptive moment estimation method (ADAM), stochastic regularization, and semidirect sensitivity equations. The trained hybrid models are encoded in SBML and stored back in model databases, where they can be further analyzed as regular SBML models. The application of this approach is illustrated with three well-known case studies: the threonine synthesis model in Escherichia coli, the P58IPK signal transduction model, and the yeast glycolytic oscillations model. The proposed framework is expected to greatly facilitate the widespread use of hybrid modeling techniques for systems biology applications.
ARTICLE | doi:10.20944/preprints202208.0197.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: Deep neural networks; Adversarial Attacks; Poisoning; Backdoors; Trojans; Taxonomy; Ontology; Knowledge Base; Explainable AI; Green AI
Online: 10 August 2022 (09:39:07 CEST)
Deep neural networks (DNN) have successfully delivered a cutting-edge performance in several fields. With the broader deployment of DNN models on critical applications, the security of DNNs becomes an active and yet nascent area. Attacks against DNNs can have catastrophic results, according to recent studies. Poisoning attacks, including backdoor and Trojan attacks, are one of the growing threats against DNNs. Having a wide-angle view of these evolving threats is essential to better understand the security issues. In this regard, creating a semantic model and a knowledge graph for poisoning attacks can reveal the relationships between attacks across intricate data to enhance the security knowledge landscape. In this paper, we propose a DNN Poisoning Attacks Ontology (DNNPAO) that would enhance knowledge sharing and enable further advancements in the field. To do so, we have performed a systematic review of the relevant literature to identify the current state. We collected 28,469 papers from IEEE, ScienceDirect, Web of Science, and Scopus databases, and from these papers, 712 research papers were screened in a rigorous process, and 55 poisoning attacks in DNNs were identified and classified. We extracted a taxonomy of the poisoning attacks as a scheme to develop DNNPAO. Subsequently, we used DNNPAO as a framework to create a knowledge base. Our findings open new lines of research within the field of AI security.
TECHNICAL NOTE | doi:10.20944/preprints202009.0678.v1
Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: multi-frame super resolution; wide activation super resolution; 3D convolutional neural network; deep learning
Online: 27 September 2020 (11:54:56 CEST)
The small satellite market continues to grow year after year; a compound annual growth rate of 17% is estimated for the period between 2020 and 2025. Low-cost satellites can send a vast amount of images to be post-processed on the ground to improve their quality and extract detailed information. In this domain lies the resolution enhancement task, where a low-resolution image is automatically converted to a higher resolution. Deep learning approaches to Super-Resolution (SR) have reached the state of the art in multiple benchmarks; however, most of them were studied in a single-frame fashion. With satellite imagery, multi-frame images can be obtained under different conditions, giving the possibility to add more information per image and improve the final analysis. In this context, we developed, and applied to the PROBA-V dataset of multi-frame satellite images, a model that recently topped the European Space Agency's Multi-frame Super Resolution (MFSR) competition. The model is based on proven methods that worked on 2D images, tweaked to work in 3D: the Wide Activation Super Resolution (WDSR) family. We show that with a simple 3D CNN residual architecture with WDSR blocks and a frame-permutation technique as data augmentation, better scores can be achieved than with more complex models. Moreover, the model requires few hardware resources, both for training and evaluation, so it can be applied directly from a personal laptop.
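As a rough illustration of the wide-activation residual blocks adapted to 3D mentioned above, the sketch below expands the channel count before the ReLU and then projects back; the filter counts, expansion factor, and input dimensions are assumptions, not the competition model.

```python
import tensorflow as tf

def wdsr_block_3d(x, filters=32, expansion=4):
    """Sketch of a wide-activation residual block with 3D convolutions:
    widen the channels before the nonlinearity, then project back and add."""
    skip = x
    x = tf.keras.layers.Conv3D(filters * expansion, 3, padding="same", activation="relu")(x)
    x = tf.keras.layers.Conv3D(filters, 3, padding="same")(x)
    return tf.keras.layers.Add()([skip, x])

# usage: stack blocks on a (frames, height, width, channels) feature volume
inputs = tf.keras.Input(shape=(9, 128, 128, 32))   # e.g. 9 low-resolution frames (assumed)
outputs = wdsr_block_3d(wdsr_block_3d(inputs))
model = tf.keras.Model(inputs, outputs)
```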
ARTICLE | doi:10.20944/preprints202310.0547.v2
Subject: Engineering, Electrical And Electronic Engineering Keywords: Artificial intelligence; AI images; photographs; PRNU; ELA; CNN; deep learning
Online: 30 October 2023 (16:14:36 CET)
Generative AI has gained enormous interest nowadays due to new applications like ChatGPT, DALL·E, Stable Diffusion, and Deep Fake. In particular, DALL·E, Stable Diffusion, and others (Adobe Firefly, ImagineArt, ...) are able to create images from a text prompt and are also able to recreate real photographs. Due to this fact, intense research has arisen to create new image forensics applications able to distinguish between real captured images and videos and artificial ones. Detecting forgeries made with Deep Fake is one of the most researched issues. This paper is about another kind of forgery detection: the purpose of this research is to detect photo-realistic AI-created images versus real photos coming from a physical camera. For this purpose, techniques that perform a pixel-level feature extraction are used. The first is Photo Response Non-Uniformity (PRNU). PRNU is a special noise due to imperfections in the camera sensor that is used for source camera identification. The underlying idea is that AI images will have a different PRNU pattern. The second is Error Level Analysis (ELA). This is another type of feature extraction, traditionally used for detecting image editing. In fact, ELA is used nowadays by photographers to manually detect AI-created images. Both kinds of features are used to train Convolutional Neural Networks to differentiate between AI images and real photographs. Good results are obtained, achieving accuracy rates over 95%. Both extraction methods are carefully assessed by computing precision/recall and F1-score measurements.
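Of the two feature extractors mentioned, Error Level Analysis is straightforward to sketch: re-save the image as JPEG at a known quality and take the pixel-wise difference. The quality setting, temporary file name, and normalization below are assumptions; the PRNU pipeline is more involved and is omitted here.

```python
from PIL import Image, ImageChops
import numpy as np

def error_level_analysis(path, quality=90):
    """Re-save the image as JPEG at a known quality and return the pixel-wise
    difference; edited or synthetically generated regions tend to show a
    different error pattern (simplified sketch of the ELA feature)."""
    original = Image.open(path).convert("RGB")
    original.save("_ela_tmp.jpg", "JPEG", quality=quality)
    resaved = Image.open("_ela_tmp.jpg")
    diff = ImageChops.difference(original, resaved)
    return np.asarray(diff, dtype=np.float32) / 255.0  # normalised ELA map fed to the CNN
```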
ARTICLE | doi:10.20944/preprints202305.0282.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: Artificial intelligence; Neural Networks; Deep learning; Multitasking learning; Solar photovoltaic; Smart grids; Multiple Electrical Disturbances; Power quality
Online: 5 May 2023 (03:12:18 CEST)
Electrical power quality is one of the main elements in power generation systems. At the same time, it is one of the most significant challenges regarding stability and reliability. Architectures of this type use different switching devices, different kinds of power generators, and non-linear loads for different industrial processes. As a result, there is a need to classify and analyze Power Quality Disturbances (PQD) to prevent and analyze the degradation of system reliability caused by their non-linear and non-stationary oscillatory nature. This paper presents a Novel Multitasking Deep Neural Network (MDL) for the Classification and Analysis of Multiple Electrical Disturbances. The characteristics are extracted with a specialized and adaptive methodology for non-stationary signals, Empirical Mode Decomposition (EMD). The methodology's design, development, and various performance tests are carried out with 28 different difficulty levels, covering severity, disturbance duration time, and noise in the 20 dB to 60 dB signal range. The MDL was developed with a data set diverse in difficulty and noise, comprising 4500 records of different samples of multiple electrical disturbances. The analysis and classification methodology has an average accuracy of 95% with multiple disturbances. In addition, it achieves an average accuracy of 90% in analyzing signal aspects important for studying electrical power quality, such as crest factor, per-unit voltage analysis, Short-Term Flicker Perceptibility (Pst), and Total Harmonic Distortion (THD), among others.
Subject: Engineering, Bioengineering Keywords: AI; deep-learning; neural-networks; graph neural-networks; cheminformatics; molecular property; machine-learning; computational chemistry; lipophilicity; solubility
Online: 1 October 2021 (14:29:01 CEST)
The accurate prediction of molecular properties such as lipophilicity and aqueous solubility is of great importance and poses challenges in several stages of the drug discovery pipeline. Machine learning methods like graph-based neural networks (GNNs) have shown exceptionally good performance in predicting these properties. In this work, we introduce a novel GNN architecture, called the directed edge graph isomorphism network (D-GIN). It is composed of two distinct sub-architectures (D-MPNN, GIN) and achieves an improvement in accuracy over its sub-architectures by employing various learning and featurization strategies. We argue that combining models with different key aspects helps make graph neural networks deeper and simultaneously increases their predictive power. Furthermore, we address current limitations in the assessment of deep-learning models, namely the comparison of single-training-run performance metrics, and offer a more robust solution.
Subject: Computer Science And Mathematics, Computer Science Keywords: Indoor Localization; Sensor Fusion; Multimodal Deep Neural Network; Multimodal Sensing; WiFi Fingerprinting; Pedestrian Dead Reckoning
Online: 13 October 2021 (12:14:39 CEST)
Many engineered approaches have been proposed over the years for solving the hard problem of performing indoor localisation using smartphone sensors. However, specialising these solutions for difficult edge cases remains challenging. Here we propose an end-to-end hybrid multimodal deep neural network localisation system, MM-Loc, relying on zero hand-engineered features, learning them automatically from data instead. This is achieved by using modality-specific neural networks to extract preliminary features from each sensing modality, which are then combined by cross-modality neural structures. We show that our choice of modality-specific neural architectures is capable of estimating the location with good accuracy independently. But for better accuracy, a multimodal neural network fusing the features of early modality-specific representations is a better proposition. Our proposed MM-Loc solution is tested on cross-modality samples characterised by different sampling rates and data representation (inertial sensors, magnetic and WiFi signals), outperforming traditional approaches for location estimation. MM-Loc elegantly trains directly from data unlike conventional indoor positioning systems, which rely on human intuition.
ARTICLE | doi:10.20944/preprints202005.0430.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: Activity Context Sensing; Smartphones; Deep Convolutional Neural Networks; Smart devices
Online: 26 May 2020 (11:33:55 CEST)
With the widespread embedding of sensing capabilities in mobile devices, there has been unprecedented development of context-aware solutions. This allows the proliferation of various intelligent applications, such as those for remote health and lifestyle monitoring, intelligent personalized services, etc. However, activity context recognition based on multivariate time series signals obtained from mobile devices in unconstrained conditions is naturally prone to class imbalance problems. This means that recognition models tend to predict classes with the majority number of samples whilst ignoring classes with the least number of samples, resulting in poor generalization. To address this problem, we propose to augment the time series signals from inertial sensors with signals from ambient sensing to train deep convolutional neural network (DCNN) models. DCNNs provide the characteristics that capture local dependency and scale invariance of these combined sensor signals. Consequently, we developed a DCNN model using only inertial sensor signals and then developed another model that combined signals from both inertial and ambient sensors, aiming to investigate the class imbalance problem by improving the performance of the recognition model. Evaluation and analysis of the proposed system using data with imbalanced classes show that the system achieved better recognition accuracy when data from inertial sensors are combined with those from ambient sensors, such as environmental noise level and illumination, with an overall improvement of 5.3% in accuracy.
Subject: Chemistry And Materials Science, Biomaterials Keywords: Microscopy Image Segmentation; Deep Learning; Data Augmentation; Synthetic Training Data; Parametric Models
Online: 1 March 2021 (13:07:00 CET)
The analysis of microscopy images has always been an important yet time-consuming process in materials science. Convolutional Neural Networks (CNNs) have been used very successfully for a number of tasks, such as image segmentation. However, training a CNN requires a large amount of hand-annotated data, which can be a problem for materials science data. We present a procedure to generate synthetic data based on ad-hoc parametric data modelling for enhancing the generalization of trained neural network models. Especially in situations where it is not possible to gather a lot of data, such an approach is beneficial and may make it possible to train a neural network reasonably well. Furthermore, we show that targeted data generation by adaptively sampling the parameter space of the generative models gives superior results compared to generating random data points.
ARTICLE | doi:10.20944/preprints201809.0361.v3
Subject: Environmental And Earth Sciences, Atmospheric Science And Meteorology Keywords: deep learning; convolutional neural networks; polar mesocyclones; satellite data processing; pattern recognition
Online: 29 October 2018 (10:16:49 CET)
Polar mesocyclones (MCs) are small marine atmospheric vortices. The class of intense MCs, called polar lows, is accompanied by extremely strong surface winds and heat fluxes, thus largely influencing deep ocean water formation in the polar regions. Accurate detection of polar mesocyclones in high-resolution satellite data, while challenging, is a time-consuming task when performed manually. Existing algorithms for the automatic detection of polar mesocyclones are based on the conventional analysis of cloudiness patterns and involve different empirically defined thresholds of geophysical variables. As a result, various detection methods typically reveal very different results when applied to a single dataset. We develop a conceptually novel approach for the detection of MCs based on the use of deep convolutional neural networks (DCNNs). As a first step, we demonstrate that a DCNN model is capable of performing binary classification of 500x500 km patches of satellite images regarding the presence of MC patterns. The training dataset is based on the reference database of MCs manually tracked in the Southern Hemisphere from satellite mosaics. We use a subset of this database with MC diameters falling in the range of 200-400 km. This dataset is further used for testing several different DCNN setups: a DCNN built from scratch, a DCNN based on VGG16 pre-trained weights engaging the Transfer Learning technique, and a DCNN based on VGG16 with the Fine Tuning technique. Each of these networks is further applied to both infrared (IR) and a combination of infrared and water vapor (IR+WV) satellite imagery. The best skill (97% in terms of the binary classification accuracy score) is achieved with a model that averages the estimates of an ensemble of different DCNNs. The algorithm can be further extended to an automatic identification and tracking numerical scheme and applied to other atmospheric phenomena characterized by a distinct signature in satellite imagery.
ARTICLE | doi:10.20944/preprints202308.0739.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Deep learning; fault diagnosis; adaptive activation function; pumping unit
Online: 9 August 2023 (07:22:09 CEST)
Due to the complex underground environment, pumping machines are prone to numerous failures. The indicator diagrams of different faults are similar to a certain degree, which produces indistinguishable samples. As the number of samples increases, manual diagnosis becomes difficult, which decreases the accuracy of fault diagnosis. To judge the fault type accurately and quickly, we propose an improved adaptive activation function and apply it to five types of neural networks. The adaptive activation function improves the negative semi-axis slope of the ReLU activation function by combining it with the gated channel transformation unit to improve the performance of the deep learning model. The proposed adaptive activation function is compared with traditional activation functions on the fault diagnosis data set and a public data set. The results show that the activation function has better nonlinearity and can improve the generalization performance of the deep learning model and the accuracy of fault diagnosis. In addition, the proposed adaptive activation function can also be well embedded in other neural networks.
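The abstract describes an adaptive activation that modifies the negative semi-axis slope of ReLU in combination with a gated channel transformation unit; the sketch below shows only a learnable per-channel negative slope, as a hypothetical illustration rather than the authors' exact formulation.

```python
import torch
import torch.nn as nn

class AdaptiveReLU(nn.Module):
    """ReLU with a learnable slope on the negative semi-axis (per channel).
    Sketch only: the paper additionally gates this slope with a channel
    conversion unit, which is omitted here."""
    def __init__(self, channels):
        super().__init__()
        self.slope = nn.Parameter(torch.zeros(channels))  # starts as a standard ReLU

    def forward(self, x):
        # x: (batch, channels, ...) feature maps
        slope = self.slope.view(1, -1, *([1] * (x.dim() - 2)))
        return torch.where(x >= 0, x, slope * x)
```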
ARTICLE | doi:10.20944/preprints202304.0645.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Lip Reading; Multiclass Classification; Turkish Lip Reading Dataset; Deep Learning; Convolutional Neural Networks; Lip Detection
Online: 20 April 2023 (10:07:48 CEST)
Automated lip reading is a research problem that has developed considerably in recent years. In some cases, lip reading is evaluated both visually and audibly. A lip reading model can be used for detecting specific words in images from security cameras, but audio-visual databases cannot be used in this situation, since the audio of the pronounced word cannot always be obtained. In this study, we collected a new Turkish dataset containing only images. The new dataset is produced from YouTube videos, which constitute an uncontrolled environment. For this reason, the images have difficult parameters in terms of environmental factors such as light, angle, and color, and the personal characteristics of the face. Despite the different features on the human face, such as moustache, beard, and make-up, the visual speech recognition problem was addressed on 10 classes, including single words and two-word phrases, using Convolutional Neural Networks (CNNs) without any intervention on the data. The proposed study, using only visual data, obtained an automated visual speech recognition model with a deep learning approach. In addition, since this study uses only visual data, the computational cost and resource usage are lower than in multi-modal studies. It is also the first known study to address the lip reading problem with a deep learning algorithm using a new dataset belonging to the Ural-Altaic languages.
REVIEW | doi:10.20944/preprints202110.0135.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: convolutional neural networks (CNNs); deep learning; computer-aided diagnosis; colorectal polyps; colorectal cancer; colonoscopy
Online: 8 October 2021 (10:50:53 CEST)
As a relatively high percentage of adenoma polyps are missed, a computer-aided diagnosis (CAD) tool based on deep learning can aid the endoscopist in diagnosing colorectal polyps or colorectal cancer in order to decrease the polyp miss rate and prevent colorectal cancer mortality. The Convolutional Neural Network (CNN) is a deep learning method that has achieved better results in detecting and segmenting specific objects in images in the last decade than conventional models such as regression, support vector machines, or artificial neural networks. In recent years, based on studies in medical imaging, CNN models have acquired promising results in detecting masses and lesions in various body organs, including colorectal polyps. In this review, the structure and architecture of CNN models and how colonoscopy images are processed as input and converted to output are explained in detail. In most primary studies conducted in the colorectal polyp detection and classification field, the CNN model has been regarded as a black box, since the calculations performed at the different layers in the model training process have not been clarified precisely. Furthermore, I discuss the differences between CNN and conventional models, inspect how to train the CNN model for diagnosing colorectal polyps or cancer, and evaluate model performance after the training process.
ARTICLE | doi:10.20944/preprints202105.0429.v1
Subject: Medicine And Pharmacology, Other Keywords: Acute lymphoblastic leukemia; Deep convolutional neural networks; Ensemble image classifiers; C-NMC-2019 dataset.
Online: 19 May 2021 (07:42:23 CEST)
Although automated Acute Lymphoblastic Leukemia (ALL) detection is essential, it is challenging due to the morphological correlation between malignant and normal cells. The traditional ALL classification strategy is arduous, time-consuming, often suffers from inter-observer variations, and necessitates experienced pathologists. This article automates the ALL detection task, employing deep Convolutional Neural Networks (CNNs). We explore a weighted ensemble of deep CNNs to recommend a better ALL cell classifier. The weights are estimated from the ensemble candidates' corresponding metrics, such as accuracy, F1-score, AUC, and kappa values. Various data augmentations and pre-processing steps are incorporated for achieving a better generalization of the network. We train and evaluate the proposed model utilizing the publicly available C-NMC-2019 ALL dataset. Our proposed weighted ensemble model achieved a weighted F1-score of 88.6%, a balanced accuracy of 86.2%, and an AUC of 0.941 on the preliminary test set. The qualitative results displaying the gradient class activation maps confirm that the introduced model has a concentrated learned region. In contrast, the ensemble candidate models, such as Xception, VGG-16, DenseNet-121, MobileNet, and InceptionResNet-V2, separately produce coarse and scattered learned areas for most example cases. Since the proposed ensemble yields a better result for the aimed task, it can be applied in other domains of medical diagnostic applications.
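A minimal sketch of the metric-weighted ensemble idea described above: each candidate CNN's predicted class probabilities are combined with weights proportional to its validation metrics. The simple sum-to-one normalization is an assumption; the paper derives weights from accuracy, F1-score, AUC, and kappa.

```python
import numpy as np

def metric_weighted_ensemble(probabilities, metrics):
    """Combine per-model class probabilities using weights derived from each
    candidate's validation metrics (simplified sketch of the weighting idea).
    probabilities: list of (n_samples, n_classes) arrays, one per candidate model
    metrics: one scalar score per candidate (e.g. validation F1 or kappa)"""
    weights = np.asarray(metrics, dtype=float)
    weights = weights / weights.sum()          # normalise the weights to sum to one
    stacked = np.stack(probabilities, axis=0)  # (n_models, n_samples, n_classes)
    return np.tensordot(weights, stacked, axes=1)  # weighted average, (n_samples, n_classes)
```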
ARTICLE | doi:10.20944/preprints201809.0481.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: Brain-Computer Interfaces, spectrogram-based convolutional neural network model(pCNN), Deep Learning, EEG, LSTM, RCNN
Online: 25 September 2018 (08:58:34 CEST)
Non-invasive, electroencephalography (EEG)-based brain-computer interfaces (BCIs) on motor imagery movements translate the subject's motor intention into control signals by classifying the EEG patterns caused by different imagination tasks, e.g., hand movements. This type of BCI has been widely studied and used as an alternative mode of communication and environmental control for disabled patients, such as those suffering from a brainstem stroke or a spinal cord injury (SCI). Notwithstanding the success of traditional machine learning methods in classifying EEG signals, these methods still rely on hand-crafted features. The extraction of such features is a difficult task due to the high non-stationarity of EEG signals, which is a major cause of the stagnating progress in classification performance. Remarkable advances in deep learning methods allow end-to-end learning without any feature engineering, which could benefit BCI motor imagery applications. We developed three deep learning models: 1) a long short-term memory (LSTM) network; 2) a proposed spectrogram-based convolutional neural network model (pCNN); and 3) a recurrent convolutional neural network (RCNN), for decoding motor imagery movements directly from raw EEG signals without (manual) feature engineering. Results were evaluated on our own, publicly available, EEG data collected from 20 subjects and on the existing 2b EEG dataset from "BCI Competition IV". Overall, better classification performance was achieved with the deep learning models compared to state-of-the-art machine learning techniques, which could chart a route ahead for developing new robust techniques for EEG signal decoding. We underpin this point by demonstrating the successful real-time control of a robotic arm using our CNN-based BCI.
ARTICLE | doi:10.20944/preprints202309.1806.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Arabic air writing recognition; machine learning; OCR; recognition; deep learning
Online: 27 September 2023 (05:20:21 CEST)
Air-written Arabic letter recognition is a challenging problem that has received comparatively little attention over the past decades compared to widely studied languages such as English. To fill this gap, we propose a strong model that brings together machine learning (ML) and optical character recognition (OCR) methods. The model applies several ML methods (i.e., Neural Networks (NN), Random Forest (RF), K-Nearest Neighbors (KNN), and Support Vector Machine (SVM)) together with deep convolutional neural networks (CNNs) such as VGG16, VGG19, and SqueezeNet for effective feature extraction. Our study utilizes the AHAWP dataset, which consists of varied writing styles and variations in hand signs, to train and evaluate the model. Preprocessing steps are applied to improve data quality by reducing bias. In addition, OCR methods are combined into our model to separate individual letters from continuous air-written gestures and refine recognition results. The results of this study show that the proposed model achieved a peak accuracy of 88.8% using NN with VGG16 features. This study presents an inclusive approach that combines ML, deep CNNs, and OCR methods to address Arabic air-writing recognition.
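The general pipeline described above, a pretrained CNN used as a feature extractor feeding a classical classifier, can be sketched as follows; the input shapes, the random stand-in data, and the MLP settings are assumptions for illustration rather than the authors' configuration.

```python
import numpy as np
from tensorflow.keras.applications import VGG16
from tensorflow.keras.applications.vgg16 import preprocess_input
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

# Pretrained VGG16 as a fixed feature extractor (global-average-pooled features).
extractor = VGG16(weights="imagenet", include_top=False,
                  pooling="avg", input_shape=(224, 224, 3))

def extract_features(images):
    """images: float array (N, 224, 224, 3) of letter images; returns (N, 512)."""
    return extractor.predict(preprocess_input(images), verbose=0)

# Hypothetical stand-in data: replace with AHAWP letter images and labels.
images = np.random.rand(40, 224, 224, 3) * 255.0
labels = np.random.randint(0, 4, size=40)

X = extract_features(images)
X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.25, random_state=0)

clf = MLPClassifier(hidden_layer_sizes=(256,), max_iter=500, random_state=0)
clf.fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))
```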
ARTICLE | doi:10.20944/preprints202002.0334.v1
Subject: Environmental And Earth Sciences, Remote Sensing Keywords: deep learning; drone imagery; hyperspectral image classiﬁcation; tree species classification; 3D convolutional neural networks
Online: 24 February 2020 (01:13:13 CET)
Interest in drone solutions in forestry applications is growing. Using drones, datasets can be captured flexibly and at high spatial and temporal resolutions when needed. In forestry applications, fundamental tasks include the detection of individual trees, tree species classification, biomass estimation, etc. Deep Neural Networks (DNNs) have shown superior results compared with conventional machine learning methods such as the Multi-Layer Perceptron (MLP) when the input data are large. The objective of this research was to investigate 3D convolutional neural networks (3D-CNN) to classify three major tree species in a boreal forest: pine, spruce, and birch. The proposed 3D-CNN models were employed to classify tree species in a test site in Finland. The classifiers were trained with a dataset of 3039 manually labelled trees, and the accuracies were then assessed using independent datasets of 803 records. To find the most efficient feature combination, we compared the performance of 3D-CNN models trained with hyperspectral (HS) channels, RGB channels, and the canopy height model (CHM), separately and combined. It is demonstrated that the proposed 3D-CNN model with RGB and HS layers produces the highest classification accuracy. The producer accuracies of the best 3D-CNN classifier on the test dataset were 99.6%, 94.8%, and 97.4% for pines, spruces, and birches, respectively. The best 3D-CNN classifier produced ~5% better classification accuracy than the MLP with all layers. Our results suggest that the proposed method provides excellent classification results with acceptable performance metrics for HS datasets. Our results show that the pine class was detectable in most layers; spruce was most detectable in the RGB data, while birch was most detectable in the HS layers. Furthermore, the RGB datasets provide acceptable results for many low-accuracy applications.
ARTICLE | doi:10.20944/preprints202307.0014.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: deep learning; unbalanced dataset; augmentation; multiclass classification; metrics boosting method; sota algorithm; visual transformer; ResNet; Xception; Inception
Online: 3 July 2023 (08:25:13 CEST)
One of the critical problems in multiclass classification tasks is dataset imbalance. This is especially true when using contemporary pre-trained neural networks, where, in effect, only the last layers of the neural network are retrained. Therefore, large datasets with highly unbalanced classes are not well suited to model training, since using such a dataset leads to overfitting and, accordingly, poor metrics on test and validation datasets. In this paper, the sensitivity to dataset imbalance of Xception, ViT-384, ViT-224, VGG19, ResNet34, ResNet50, ResNet101, Inception_v3, DenseNet201, DenseNet161, and DeIT was studied using a highly imbalanced dataset of 20,971 images sorted into 7 classes. It is shown that the best metrics were obtained when using a cropped dataset with augmentation of missing images in classes up to 15% of the initial number. In this way, the metrics can be increased by 2-6% compared to the metrics of the models on the initial unbalanced dataset. Moreover, the classification metrics of the rare classes also improved significantly: the TruePositive value can be increased by 0.3 or more. As a result, the best approach to training the considered networks on an initially unbalanced dataset was formulated.
ARTICLE | doi:10.20944/preprints202302.0070.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: deep learning; aerial imagery; precision agriculture; plant detection; domain adaptation; unsupervised learning; self-supervision; adversarial learning; domain shift; tropical crops
Online: 3 February 2023 (10:14:09 CET)
This paper presents a novel approach for accurate counting and localization of tropical plants in aerial images that is able to work in new visual domains in which the available data are not labeled. Our approach uses deep learning and domain adaptation, designed to handle the domain shift between training and test data, which is a common challenge in agricultural applications. The method uses a source dataset with annotated plants and a target dataset without annotations, and adapts a model trained on the source dataset to the target dataset using unsupervised domain alignment and pseudolabeling. The experimental results show the effectiveness of this approach for plant counting in aerial images of pineapples under significant domain shift, achieving a reduction of up to 97% in the counting error compared to the supervised baseline.
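The pseudolabeling step mentioned above can be illustrated with a generic, confidence-thresholded selection loop; this sketch assumes a classification-style model and a hypothetical target-domain dataloader, whereas the paper works with a counting/localization network, so it is only a schematic of the idea.

```python
import torch

def generate_pseudolabels(model, target_loader, threshold=0.9, device="cpu"):
    """Keep only target samples whose predicted class probability exceeds a threshold.

    Generic pseudolabeling step for unsupervised domain adaptation; the actual
    detector, dataloader, and threshold used in the paper may differ.
    """
    model.eval()
    kept_inputs, kept_labels = [], []
    with torch.no_grad():
        for inputs, _ in target_loader:        # target labels are unavailable/ignored
            probs = torch.softmax(model(inputs.to(device)), dim=1)
            conf, pred = probs.max(dim=1)
            mask = conf >= threshold
            kept_inputs.append(inputs[mask.cpu()])
            kept_labels.append(pred[mask].cpu())
    return torch.cat(kept_inputs), torch.cat(kept_labels)

# Usage: the confident (input, pseudo-label) pairs are mixed with the labeled
# source data and the model is fine-tuned on the combined set.
```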
ARTICLE | doi:10.20944/preprints202009.0458.v1
Subject: Biology And Life Sciences, Agricultural Science And Agronomy Keywords: machine learning; deep leaning; physiological maturity; computer vision; plant breeding; Phenology; Glycine max (L.) Merr.
Online: 19 September 2020 (10:08:43 CEST)
Soybean maturity is a trait of critical importance for the development of new soybean cultivars; nevertheless, its characterization based on visual ratings has many challenges. Unmanned aerial vehicle (UAV) imagery-based high-throughput phenotyping methodologies have been proposed as an alternative to the traditional visual ratings of pod senescence. However, the lack of scalable and accurate methods to extract the desired information from the images remains a significant bottleneck in breeding programs. The objective of this study was to develop an image-based high-throughput phenotyping system for evaluating soybean maturity in breeding programs. Images were acquired twice a week, starting when the earlier lines began maturation until the latest ones were mature. Two complementary convolutional neural networks (CNNs) were developed to predict the maturity date: the first uses a single image date, and the second uses the five best image dates identified by the first model. The proposed CNN architecture was validated using more than 15,000 ground truth observations from five trials, including data from three growing seasons and two countries. The trained model showed good generalization capability, with a root mean squared error lower than two days in four out of five trials. Four methods of estimating prediction uncertainty showed potential for identifying different sources of error in the maturity date predictions. The architecture solves limitations of previous research and can be used at scale in commercial breeding programs.
ARTICLE | doi:10.20944/preprints201905.0244.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Barrett's esophagus; deep learning; volumetric laser endomicroscopy; optical coherence tomography; classification; esophageal adenocarcinoma; glands; machine learning
Online: 20 May 2019 (11:50:09 CEST)
Barrett's esophagus (BE) is a known precursor of esophageal adenocarcinoma (EAC). Patients with BE undergo regular surveillance to detect early stages of EAC. Volumetric laser endomicroscopy (VLE) is a novel technology capable of imaging the inner tissue layers of the esophagus over a 6-cm length scan. However, interpretation of full VLE scans is still a challenge for human observers. In this work, we train an ensemble of deep convolutional neural networks to detect neoplasia in BE patients, using a dataset of images acquired with VLE in a multicenter study. We achieve an AUC of 0.96 on the unseen test dataset and compare our results with previous work on VLE analysis. Our method for detecting neoplasia in BE patients facilitates future advances in patient treatment and provides clinicians with new assisting solutions to process and better understand VLE data.
ARTICLE | doi:10.20944/preprints202108.0272.v1
Subject: Engineering, Industrial And Manufacturing Engineering Keywords: Remaining Useful Life; Deep Neural Network; Convolutional Neural Network; Genetic Optimization; Neural Network Optimization; Support Vector Regression; Depth Maps; Normal Maps; 3D Point Clouds.
Online: 12 August 2021 (10:40:23 CEST)
In the current industrial landscape, increasingly pervaded by technological innovations, the adoption of optimized strategies for asset management is becoming a critical key success factor. Among the various strategies available, the “Prognostics and Health Management” strategy is able to support maintenance management decisions more accurately, through continuous monitoring of equipment health and “Remaining Useful Life” forecasting. In the present study, Convolutional Neural Network-based Deep Neural Network techniques are investigated for the Remaining Useful Life prediction of a punch tool, whose degradation is caused by working surface deformations during the machining process. Surface deformation is determined using a 3D scanning sensor capable of returning point clouds with micrometric accuracy during the operation of the punching machine, avoiding both downtime and human intervention. The 3D point clouds thus obtained are transformed into bidimensional image-type maps, i.e., maps of depths and normal vectors, to fully exploit the potential of convolutional neural networks for extracting features. Such maps are then processed by comparing 15 genetically optimized architectures with the transfer learning of 19 pre-trained models, using a classic machine learning approach, i.e., Support Vector Regression, as a benchmark. The achieved results clearly show that, in this specific case, optimized architectures provide performance far superior (MAPE=0.058) to that of transfer learning which, instead, remains at a lower or slightly higher level (MAPE=0.416) than Support Vector Regression (MAPE=0.857).
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Car Detection; Convolutional Neural Networks; Deep Learning; Faster R-CNN; Unmanned Aerial Vehicles; You Only Look Once (Yolo).
Online: 12 March 2020 (08:57:09 CET)
In this paper, we address the problem of car detection from aerial images using Convolutional Neural Networks (CNN). This problem presents additional challenges as compared to car (or any object) detection from ground images because features of vehicles from aerial images are more difficult to discern. To investigate this issue, we assess the performance of two state-of-the-art CNN algorithms, namely Faster R-CNN, which is the most popular region-based algorithm, and YOLOv3, which is known to be the fastest detection algorithm. We analyze two datasets with different characteristics to check the impact of various factors, such as UAV's altitude, camera resolution, and object size. The objective of this work is to conduct a robust comparison between these two cutting-edge algorithms. By using a variety of metrics, we show that YOLOv3 yields better performance in most configurations, except that it exhibits a lower recall and less confident detections when object sizes and scales in the testing dataset differ largely from those in the training dataset.
ARTICLE | doi:10.20944/preprints201910.0195.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: car detection; convolutional neural networks; deep learning; you only look once (yolo); faster r-cnn; unmanned aerial vehicles
Online: 17 October 2019 (12:29:29 CEST)
In this paper, we address the problem of car detection from aerial images using Convolutional Neural Networks (CNN). This problem presents additional challenges as compared to car (or any object) detection from ground images because features of vehicles from aerial images are more difficult to discern. To investigate this issue, we assess the performance of two state-of-the-art CNN algorithms, namely Faster R-CNN, which is the most popular region-based algorithm, and YOLOv3, which is known to be the fastest detection algorithm. We analyze two datasets with different characteristics to check the impact of various factors, such as UAV’s altitude, camera resolution, and object size. The objective of this work is to conduct a robust comparison between these two cutting-edge algorithms. By using a variety of metrics, we show that none of the two algorithms outperforms the other in all cases.
ARTICLE | doi:10.20944/preprints202210.0477.v1
Subject: Computer Science And Mathematics, Analysis Keywords: High Throughput Plant Phenotyping; Deep Neural Network; Flower Detection; Temporal Phenotypes; Benchmark Dataset; Flower Status Report
Online: 31 October 2022 (10:00:24 CET)
A phenotype is the composite of an observable expression of a genome for traits in a given environment. The trajectories of phenotypes computed from an image sequence and timing of important events in a plant’s life cycle can be viewed as temporal phenotypes and indicative of the plant’s growth pattern and vigor. In this paper, we introduce a novel method called FlowerPhenoNet which uses deep neural networks for detecting flowers from multiview image sequences for high throughput temporal plant phenotyping analysis. Following flower detection, a set of novel flower-based phenotypes are computed, e.g., the day of emergence of the first flower in a plant’s life cycle, the total number of flowers present in the plant at a given time, the highest number of flowers bloomed in the plant, growth trajectory of a flower and the blooming trajectory of a plant. To develop a new algorithm and facilitate performance evaluation based on experimental analysis, a benchmark dataset is indispensable. Thus, we introduce a benchmark dataset called FlowerPheno which comprises image sequences of three flowering plant species, e.g., sunflower, coleus, and canna, captured by a visible light camera in a high throughput plant phenotyping platform from multiple view angles. The experimental analyses on the FlowerPheno dataset demonstrate the efficacy of the FlowerPhenoNet.
ARTICLE | doi:10.20944/preprints201706.0012.v3
Subject: Engineering, Control And Systems Engineering Keywords: deep convolutional neural networks; road segmentation; conditional random fields; landscape metrics; satellite images; aerial images; THEOS
Online: 5 June 2017 (06:39:54 CEST)
Object segmentation on remotely-sensed images, i.e., aerial (very high resolution, VHR) images and satellite (high resolution, HR) images, has been applied to many application domains, especially road extraction, in which the segmented objects serve as a mandatory layer in geospatial databases. Several attempts at applying deep convolutional neural networks (DCNNs) to extract roads from remote sensing images have been made; however, the accuracy is still limited. In this paper, we present an enhanced DCNN framework specifically tailored for road extraction on remote sensing images by applying landscape metrics (LMs) and conditional random fields (CRFs). To improve the DCNN, a modern activation function, the exponential linear unit (ELU), is employed in our network, resulting in more, and more accurate, extracted roads. To further reduce falsely classified road objects, a solution based on the adoption of LMs is proposed. Finally, to sharpen the extracted roads, a CRF method is added to our framework. The experiments were conducted on the Massachusetts road aerial imagery and THEOS satellite imagery datasets. The results showed that our proposed framework outperformed SegNet, the state-of-the-art object segmentation technique for all kinds of remote sensing imagery, in most cases in terms of precision, recall, and F1.
ARTICLE | doi:10.20944/preprints202309.0642.v1
Subject: Computer Science And Mathematics, Mathematics Keywords: physics-informed neural networks; deep neural networks; lattice Boltzmann method; fluid mechanics; inverse problem; PDEs
Online: 11 September 2023 (07:35:30 CEST)
Physics-Informed Neural Networks (PINNs) improve the efficiency of data utilization by combining physical principles with neural network algorithms and ensure that the predictions are consistent and stable with respect to the physical laws. PINNs open up a new approach to addressing inverse problems in fluid mechanics. Based on the single-relaxation-time lattice Boltzmann method (SRT-LBM) with the Bhatnagar-Gross-Krook (BGK) collision operator, the PINN-SRT-LBM model is proposed in this paper for solving inverse problems in fluid mechanics. The PINN-SRT-LBM model consists of three components. The first component involves a deep neural network that predicts the equilibrium control equations in the different discrete velocity directions within SRT-LBM. The second component employs another deep neural network to predict the non-equilibrium control equations, enabling inference of the fluid's non-equilibrium characteristics. The third component, a physics-informed function, translates the outputs of the first two networks into physical information. By minimizing the residuals of the physical partial differential equations (PDEs), the physics-informed function infers relevant macroscopic quantities of the flow. The model includes two sub-models applicable to different dimensions, named PINN-SRT-LBM-I and PINN-SRT-LBM-II according to the construction of the physics-informed function. The innovation of this work is the introduction of SRT-LBM and discrete velocity models as physical drivers into the neural network through the interpretation function. Therefore, PINN-SRT-LBM allows the neural network to handle inverse problems of various dimensions and focus on problem-specific solving. Experimental results confirm the accurate prediction of flow information at different Reynolds numbers within the computational domain. Relying on the PINN-SRT-LBM models, inverse problems in fluid mechanics can be solved efficiently.
ARTICLE | doi:10.20944/preprints202204.0163.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: artificial intelligence; deep learning; image-to-image translation; dual-energy computed tomography; pulmonary embolism; emergency radiology
Online: 18 April 2022 (09:45:00 CEST)
Detector-based spectral CT offers the possibility of obtaining spectral information from which discrete acquisitions at different energy levels can be derived, yielding so-called virtual monoenergetic images (VMI). In this study, we aimed to develop a jointly optimized deep learning framework based on dual-energy CT pulmonary angiography (DE-CTPA) data to generate synthetic monoenergetic images (SMI) for improving automatic pulmonary embolism (PE) detection in single-energy CTPA scans. For this purpose, we used two data sets: our institutional DE-CTPA data set D1 comprising polyenergetic arterial series and the corresponding VMI at low-energy levels (40 keV) with 7,892 image pairs, and a 10% subset of the 2020 RSNA Pulmonary Embolism Detection Challenge data set D2, which consisted of 161,253 polyenergetic images with dichotomous slice-wise annotations (PE/no PE). We trained a fully convolutional encoder-decoder on D1 to generate SMI from single-energy CTPA scans of D2, which were then fed into a ResNet50 network for training of the downstream PE classification task. The quantitative results on the reconstruction ability of our framework revealed high-quality visual SMI predictions with reconstruction results of 0.984 ± 0.002 (structural similarity) and 41.706 ± 0.547 dB (peak-signal-to-noise ratio). PE classification resulted in an AUC of 0.84 for our model, which achieved improved performance compared to other naive approaches with AUCs up to 0.81. Our study stresses the role of using joint optimization strategies for deep learning algorithms to improve automatic PE detection. The proposed pipeline may prove to be beneficial for computer-aided detection systems and could help rescue CTPA studies with suboptimal opacification of the pulmonary arteries from single-energy CT scanners.
ARTICLE | doi:10.20944/preprints201811.0579.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Deep learning, Cognitive, LSTM, Neural network, Ngrams
Online: 26 November 2018 (10:06:05 CET)
Cognitive neuroscience is the study of how the human brain functions on tasks like decision making, language, perception, and reasoning. Deep learning is a class of machine learning algorithms that use neural networks, designed to model the responses of neurons in the human brain; learning can be supervised or unsupervised. Ngram token models are used extensively in language prediction: they are probabilistic models for predicting the next word or token, forming a statistical model of word or token sequences, and are called Language Models (LMs). Ngrams are essential for creating language prediction models. We explore a broader sandbox ecosystem enabling AI, specifically deep learning applications on unstructured content from the web.
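As a minimal illustration of the ngram language models mentioned above, the following sketch builds a bigram model with simple relative-frequency estimates over a toy corpus; the corpus and the smoothing-free estimation are illustrative assumptions.

```python
from collections import Counter, defaultdict

def train_bigram_lm(corpus):
    """Count bigrams and estimate P(next_word | word) by relative frequency."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.lower().split()
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, word):
    """Return the most probable next token and its estimated probability."""
    following = counts.get(word)
    if not following:
        return None, 0.0
    nxt, c = following.most_common(1)[0]
    return nxt, c / sum(following.values())

corpus = ["the brain processes language",
          "the brain supports decision making",
          "deep learning models the brain"]
lm = train_bigram_lm(corpus)
print(predict_next(lm, "the"))   # e.g. ('brain', 1.0)
```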
ARTICLE | doi:10.20944/preprints202210.0418.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: depth estimation; optimization; deep learning
Online: 27 October 2022 (03:25:23 CEST)
Deep learning, specifically the supervised approach, has proved to be a breakthrough in depth prediction. However, the generalization ability of deep networks is still limited, and they cannot maintain satisfactory performance on some inputs. Addressing a similar problem in the segmentation field, a scheme (f-BRS) has been proposed to refine predictions at inference time. f-BRS adapts an intermediate activation function to each input by using user clicks as sparse labels. Given the similarity between user clicks and sparse depth maps, this paper aims at extending the application of f-BRS to depth prediction. Our experiments show that f-BRS, fused with a depth estimation baseline, gets trapped in local optima and fails to improve the network predictions. To resolve that, we propose a double-stage adaptive refinement scheme (DARS). In the first stage, a Delaunay-based correction module significantly improves the depth generated by a baseline network. In the second stage, a particle swarm optimizer (PSO) refines the estimate by fine-tuning the f-BRS parameters, that is, the scales and biases. DARS is evaluated on an outdoor benchmark, KITTI, and an indoor benchmark, NYUv2, while for both the network is pre-trained on KITTI. The proposed scheme outperformed rival methods on both datasets.
REVIEW | doi:10.20944/preprints202212.0191.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: machine learning; deep learning; generative models
Online: 12 December 2022 (04:05:39 CET)
Over the past decade, research in the field of Deep Learning has brought about novel improvements in image generation and feature learning; one such example being a Generative Adversarial Network. However, these improvements have been coupled with an increasing demand on mathematical literacy and previous knowledge in the field. Therefore, in this literature review, I seek to introduce Generative Adversarial Networks (GANs) to a broader audience by explaining their background and intuition at a more foundational level. I begin by discussing the mathematical background of this architecture, specifically topics in linear algebra and probability theory. I then proceed to introduce GANs in a more theoretical framework, along with some of the literature on GANs, including their architectural improvements and image-generation capabilities. Finally, I cover state-of-the-art image generation through style-based methods, as well as their implications on society.
ARTICLE | doi:10.20944/preprints202202.0015.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Deep learning; Machine learning
Online: 1 February 2022 (13:34:28 CET)
We study brain segmentation by dividing the brain into multiple tissues. Given the availability of deep learning for brain segmentation, machine learning can be efficiently exploited to expedite the segmentation process in clinical practice. To accomplish the segmentation, an MRI-to-tissue translation using generative adversarial networks (GANs) is proposed. For the brain tissues, white matter (WM), gray matter (GM), and cerebrospinal fluid (CSF) are segmented. Empirical results show that the proposed model significantly improves segmentation compared to state-of-the-art results. Furthermore, the Dice coefficient (DC) metric is used to evaluate model performance.
ARTICLE | doi:10.20944/preprints202108.0011.v1
Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: Transformer; spike; neural decoding; CNN; RNN; LSTM; deep learning; information; neuroscience
Online: 2 August 2021 (09:51:43 CEST)
Neural decoding from spiking activity is an essential tool for understanding the information encoded in population neurons, especially in applications like brain-computer interfaces (BCIs). Various quantitative methods have been proposed and have shown advantages under different scenarios. From the machine learning perspective, the decoding task is to map the high-dimensional spatial and temporal neuronal activity to low-dimensional physical quantities (e.g., velocity, position). Because of the complex interactions and abundant dynamics among neural circuits, good decoding algorithms usually have the capability of capturing flexible spatiotemporal structures embedded in the input feature space. Recently, Transformer-based models have been widely used in processing natural language and images due to their superior performance in handling long-range and global dependencies. Hence, in this work we examine the potential applications of Transformers in neural decoding and introduce two Transformer-based models. Besides adapting the Transformer to neuronal data, we also propose a data augmentation method for overcoming the data shortage issue. We test our models on three experimental datasets, and their performance is comparable to the previous state-of-the-art (SOTA) RNN-based methods. In addition, Transformer-based models show increased decoding performance when the input sequences are longer, while LSTM-based models deteriorate quickly. Our research suggests that Transformer-based models are important additions to the existing neural decoding solutions, especially for large datasets with long temporal dependencies.
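A minimal sketch of a Transformer-based spike decoder in the spirit described above: binned spike counts are embedded, passed through a small Transformer encoder, and read out as a two-dimensional velocity. The layer sizes, the last-time-bin readout, and the input dimensions are illustrative assumptions, not the two models introduced in the paper.

```python
import torch
import torch.nn as nn

class SpikeTransformerDecoder(nn.Module):
    """Map a sequence of binned spike counts (batch, time, neurons) to 2-D velocity."""
    def __init__(self, n_neurons=96, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Linear(n_neurons, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads,
                                           dim_feedforward=128, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.readout = nn.Linear(d_model, 2)   # x/y velocity

    def forward(self, spikes):
        h = self.encoder(self.embed(spikes))
        return self.readout(h[:, -1])          # decode from the last time bin

model = SpikeTransformerDecoder()
spikes = torch.randn(8, 20, 96)                # (batch, time bins, neurons)
print(model(spikes).shape)                     # torch.Size([8, 2])
```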
ARTICLE | doi:10.20944/preprints202310.0572.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: deep learning; ensembles; segmentation; transformers
Online: 10 October 2023 (08:33:19 CEST)
To identify objects in images, a complex set of skills is needed that includes understanding the context and being able to determine the borders of objects. In computer vision, this task is known as semantic segmentation and it involves categorizing each pixel in an image. It is crucial in many real-world situations: for autonomous vehicles, it enables the identification of objects in the surrounding area; in medical diagnosis, it enhances the ability to detect dangerous pathologies early, thereby reducing the risk of serious consequences. In this study, we compare the performance of various ensembles of convolutional and transformer neural networks. Ensembles can be created, e.g., by varying the loss function, the data augmentation method, or the learning rate strategy. Our proposed ensemble, which is based on the simple average rule, demonstrates exceptional performance on several datasets. All the resources used in this study are available online at the following GitHub repository: https://github.com/LorisNanni.
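The simple average rule used for fusion can be sketched directly: per-model softmax maps are averaged and the arg-max class is taken per pixel. The number of members and the random probability maps below are placeholders for illustration.

```python
import numpy as np

def average_rule_ensemble(prob_maps):
    """Fuse per-model softmax maps of shape (H, W, C) by simple averaging,
    then take the arg-max class per pixel."""
    fused = np.mean(np.stack(prob_maps, axis=0), axis=0)
    return fused.argmax(axis=-1)

# Three hypothetical members (e.g. trained with different loss functions).
rng = np.random.default_rng(1)
members = [rng.dirichlet(np.ones(4), size=(64, 64)) for _ in range(3)]
segmentation = average_rule_ensemble(members)
print(segmentation.shape)   # (64, 64)
```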
ARTICLE | doi:10.20944/preprints202105.0117.v2
Subject: Computer Science And Mathematics, Probability And Statistics Keywords: decision trees; deep feed-forward network; neural trees; consistency; optimal rate of convergence
Online: 9 November 2021 (16:54:30 CET)
Decision tree algorithms have been among the most popular algorithms for interpretable (transparent) machine learning since the early 1980s. On the other hand, deep learning methods have boosted the capacity of machine learning algorithms and are now being used for non-trivial applications in various applied domains. But training a fully-connected deep feed-forward network by gradient-descent backpropagation is slow and requires arbitrary choices regarding the number of hidden units and layers. In this paper, we propose near-optimal neural regression trees, intending to make them much faster than deep feed-forward networks and such that it is not essential to specify the number of hidden units in the hidden layers of the neural network in advance. The key idea is to construct a decision tree and then simulate the decision tree with a neural network. This work aims to build a mathematical formulation of neural trees and gain the complementary benefits of both sparse optimal decision trees and neural trees. We propose near-optimal sparse neural trees (NSNT), which are shown to be asymptotically consistent and robust in nature. Additionally, the proposed NSNT model obtains a fast rate of convergence, which is near-optimal up to some logarithmic factor. We comprehensively benchmark the proposed method on a sample of 80 datasets (40 classification datasets and 40 regression datasets) from the UCI machine learning repository. We establish that the proposed method is likely to outperform the current state-of-the-art methods (random forest, XGBoost, optimal classification tree, and near-optimal nonlinear trees) for the majority of the datasets.
REVIEW | doi:10.20944/preprints201909.0304.v1
Subject: Medicine And Pharmacology, Oncology And Oncogenics Keywords: cancer; complexity; machine learning; deep learning; fluid dynamics; turbulence; chaos
Online: 27 September 2019 (07:24:43 CEST)
Cancers remain the leading cause of disease-related pediatric death in North America. The emerging field of complex systems has redefined cancer networks as a computational system with intractable algorithmic complexity. Herein, a tumor and its heterogeneous phenotypes are discussed as dynamical systems having multiple strange attractors. Machine learning, network science, and algorithmic information dynamics are discussed as current tools for cancer network reconstruction. Deep learning architectures and computational fluid models are proposed for better forecasting gene expression patterns in cancer ecosystems. Cancer cell decision-making is investigated within the framework of complex systems and complexity theory.
ARTICLE | doi:10.20944/preprints202304.0141.v1
Subject: Environmental And Earth Sciences, Remote Sensing Keywords: plume rise; deep learning; plume cloud recognition
Online: 10 April 2023 (04:11:22 CEST)
Estimating plume cloud height is essential for various applications, such as global climate models. Smokestack plume rise is the constant height at which the plume cloud is carried downwind as its momentum dissipates and the plume cloud and the ambient temperatures equalize. Although different parameterizations are used in most air-quality models to predict the plume rise, they have been unable to estimate it properly. This paper proposes a novel framework to monitor smokestack plume clouds and make long-term, real-time measurements of the plume rise. For this purpose, a three-stage framework is developed based on Deep Convolutional Neural Networks (DCNNs). In the first stage, an improved Mask R-CNN, called Deep Plume Rise Network (DPRNet), is applied to recognize the plume cloud. Then, image processing analysis and least squares theory are respectively used to detect the plume cloud’s boundaries and fit an asymptotic model into their centerlines. The y-component coordinate of this model’s critical point is considered the plume rise. In the last stage, a geometric transformation phase converts image measurements into real-life ones. A wide range of images with different atmospheric conditions, including day, night, and cloudy/foggy, have been selected for the DPRNet training algorithm. Obtained results show that the proposed method outperforms widely-used networks in smoke border detection and recognition.
ARTICLE | doi:10.20944/preprints202005.0347.v1
Subject: Engineering, Mechanical Engineering Keywords: deep learning; maximum mean discrepancy; gearbox; fault detection
Online: 22 May 2020 (05:21:56 CEST)
In the past years, various intelligent machine learning and deep learning algorithms have been developed and widely applied for gearbox fault detection and diagnosis. However, the real-time application of these intelligent algorithms has been limited, mainly because a model developed using data from one machine or one operating condition suffers serious diagnosis performance degradation when applied to another machine, or to the same machine under a different operating condition. The reason for the poor model generalization is the distribution discrepancy between the training and testing data. This paper proposes to address this issue using a deep learning based cross-domain adaptation approach for gearbox fault diagnosis. Labeled data from the training dataset and unlabeled data from the testing dataset are used to achieve the cross-domain adaptation task. A deep convolutional neural network (CNN) is used as the main architecture. Maximum mean discrepancy is used as a measure to minimize the distribution distance between the labeled training data and the unlabeled testing data. The study proposes to reduce the discrepancy between the two domains in multiple layers of the designed CNN to adapt the learned representations from the training data to be applied to the testing data. The proposed approach is evaluated using experimental data from a gearbox under significant speed variation and multiple health conditions. An appropriate benchmarking with both traditional machine learning methods and other domain adaptation methods demonstrates the superiority of the proposed method.
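A minimal sketch of a Gaussian-kernel maximum mean discrepancy term of the kind used to align the two domains; the bandwidth, the feature sizes, and the single-kernel choice are illustrative assumptions (multi-kernel variants are also common), and the random tensors stand in for features taken from a CNN layer.

```python
import torch

def gaussian_mmd(x, y, sigma=1.0):
    """Squared maximum mean discrepancy between two batches of features
    (n, d) and (m, d) under a Gaussian kernel with bandwidth sigma."""
    def kernel(a, b):
        dist = torch.cdist(a, b) ** 2
        return torch.exp(-dist / (2 * sigma ** 2))
    return kernel(x, x).mean() + kernel(y, y).mean() - 2 * kernel(x, y).mean()

# Hypothetical features from the same CNN layer for source and target batches.
source_feat = torch.randn(32, 128)
target_feat = torch.randn(32, 128) + 0.5
print(gaussian_mmd(source_feat, target_feat).item())

# In training, this term is added (for several layers) to the supervised loss
# on the labeled source data to align the two domains.
```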
ARTICLE | doi:10.20944/preprints201811.0546.v4
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Convolutional Neural Network (CNN), Deep learning, Architecture, Applications
Online: 14 February 2019 (10:01:31 CET)
With the rise of Artificial Neural Networks (ANNs), machine learning has advanced rapidly in recent times. One of the most successful kinds of ANN design is the Convolutional Neural Network (CNN), a technology that combines artificial neural networks with up-to-date deep learning strategies. In deep learning, the Convolutional Neural Network is at the center of spectacular advances. This type of artificial neural network has been applied to image recognition tasks for decades and has attracted the attention of researchers in many countries in recent years, as CNNs have shown promising performance in several computer vision and machine learning tasks. This paper describes the underlying architecture and various applications of the Convolutional Neural Network.
BRIEF REPORT | doi:10.20944/preprints202207.0419.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: computer vision; deep learning; CoughNet model
Online: 27 July 2022 (10:01:54 CEST)
This work addresses two key problems in identifying people infected with COVID-19: first, identification accuracy is not high enough; second, current identification methods such as nucleic acid testing are expensive in many countries. Methods: A fast, deep learning-based identification method for COVID-19 patients was therefore designed. After the model (CoughNet) learns more than 6,000 cough spectrograms from both COVID-19 patients and normal people, the accuracy of identifying COVID-19 patients versus normal people is higher than 99% on the test set. Structure: This paper is divided into three parts: the first part introduces the main background and research status; the second part introduces the research methods; the third part describes the specific experimental procedure.
REVIEW | doi:10.20944/preprints202202.0050.v1
Subject: Engineering, Bioengineering Keywords: Neuroprosthetics; Brain Computer Interface; Neural Implants; Deep Brain Stimulation
Online: 3 February 2022 (11:06:15 CET)
Recent progress in microfabrication techniques has allowed the rapid development of neural implants. They are increasingly regarded as effective tools for clinical practice, especially for treating traumatic and neurodegenerative disorders. Microelectrode arrays have already been used in numerous neural interface devices. Almost all neural implants have been developed based on the BCI (Brain-Computer Interface) paradigm; when a BCI system is invasive, it is referred to as a BMI, or Brain-Machine Interface. BMIs hold promise for neurorehabilitation of motor and sensory function, cognitive state evaluation, and treatment of neurological disorders. A directed overview of the field of neural implants is given in this article. The aim of this review is to provide a brief introduction to neural prosthetics and their exciting applications in treating neurological disorders, together with a deeper discussion of their functionality. BCI systems and their different types, their functionality, their pros and cons, how other neural implants have developed, and their present status are covered. Different possibilities and the possible future of deep brain stimulation (DBS), Neuralink, and motor and sensory neural prosthetics are further discussed.
ARTICLE | doi:10.20944/preprints202005.0455.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: pattern recognition; deep convolutional neural network; Brahmi script; CNN
Online: 28 May 2020 (07:33:32 CEST)
Significant progress has been made in pattern recognition technology. However, one obstacle that has not yet been overcome is the recognition of words in the Brahmi script, specifically the identification of characters, compound characters, and words. This study proposes a deep convolutional neural network (DCNN) with dropout to recognize Brahmi words, and a series of experiments is performed on a standard Brahmi dataset. The method was systematically tested on an accessible Brahmi image database, achieving a recognition rate of 92.47% with the dropout-regularized CNN, which is among the best compared with the results reported in the literature for the same task.
ARTICLE | doi:10.3390/sci2010013
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: deep learning; computer vision; Cycle-GAN; image reconstruction
Online: 12 March 2020 (00:00:00 CET)
In this paper, our goal is to perform a virtual restoration of an ancient coin from its image. The present work is the first to propose this problem, and it is motivated by two key promising applications. The first of these emerges from the recently recognised dependence of automatic image-based coin type matching on the condition of the imaged coins; the algorithm introduced herein could be used as a pre-processing step aimed at overcoming the aforementioned weakness. The second application concerns the utility, both to professional and hobby numismatists, of being able to visualise and study an ancient coin in a state closer to its original (minted) appearance. To address the conceptual problem at hand, we introduce a framework which comprises a deep learning based method using Generative Adversarial Networks, capable of learning the range of appearance variation of different semantic elements artistically depicted on coins, and a complementary algorithm used to collect, correctly label, and prepare for processing a large number of images (here 100,000) of ancient coins needed to facilitate the training of the aforementioned learning method. Empirical evaluation performed on a withheld subset of the data demonstrates extremely promising performance of the proposed methodology and shows that our algorithm correctly learns the spectra of appearance variation across different semantic elements and, despite the enormous variability present, reconstructs the missing (damaged) detail while matching the surrounding semantic content and artistic style.
ARTICLE | doi:10.20944/preprints202309.1273.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Facial Emotion Detection; Deep Learning; Classification; Neural Architecture Search Network
Online: 20 September 2023 (03:36:23 CEST)
Facial emotion detection is a challenging task that deals with emotion recognition. It has applications in various domains, such as behavior analysis, surveillance systems, and human-computer interaction (HCI). Numerous studies have been conducted to detect emotions, using both classical machine learning algorithms and advanced deep learning algorithms. For machine learning algorithms, hand-crafted features need to be extracted, which is a tedious task that requires human effort, whereas deep learning models extract features automatically from the samples. Therefore, in this study we propose a novel and efficient deep learning model based on a Neural Architecture Search Network, utilizing artificial networks such as an RNN and child networks. We performed training utilizing the FER 2013 dataset comprising seven classes: happy, angry, neutral, sad, surprise, fear, and disgust. Furthermore, we analyzed the robustness of the proposed model on the CK+ dataset and compared it with existing techniques. Due to the use of reinforcement learning in the network, the most representative features are extracted from the sample network; it extracts all key features without losing key information. Our proposed model is based on a one-stage classifier and performs efficient classification. Our technique outperformed existing models, attaining an accuracy of 98.14%, a recall of 97.57%, and a precision of 97.84%.
ARTICLE | doi:10.20944/preprints202103.0220.v1
Subject: Environmental And Earth Sciences, Atmospheric Science And Meteorology Keywords: Convolutional Neural Network; Deep Learning; Environmental Monitoring
Online: 8 March 2021 (13:37:58 CET)
Accurately mapping individual tree species in densely forested environments is crucial to forest inventory. When considering only RGB images, this is a challenging task for many automatic photogrammetry processes. The main reason for that is the spectral similarity between species in RGB scenes, which can be a hindrance for most automatic methods. State-of-the-art deep learning methods could be capable of identifying tree species with an attractive cost, accuracy, and computational load in RGB images. This paper presents a deep learning-based approach to detect an important multi-use species of palm tree (Mauritia flexuosa; i.e., Buriti) in aerial RGB imagery. In South America, this palm tree is essential for many indigenous and local communities because of its characteristics. The species is also a valuable indicator of water resources, which is a benefit of mapping its location. The method is based on a Convolutional Neural Network (CNN) to identify and geolocate singular tree species in a high-complexity forest environment, and considers the likelihood of every pixel in the image being recognized as a possible tree by implementing confidence map feature extraction. This study compares the performance of the proposed method against state-of-the-art object detection networks. For this, images from a dataset composed of 1,394 airborne scenes, in which 5,334 palm trees were manually labeled, were used. The results returned a mean absolute error (MAE) of 0.75 trees and an F1-measure of 86.9%. These results are better than both Faster R-CNN and RetinaNet under equal experimental conditions. The proposed network provided fast detection of the palm trees, with a detection time of 0.073 seconds per image and a standard deviation of 0.002 using the GPU. In conclusion, the presented method is efficient in dealing with a high-density forest scenario, can accurately map the location of single species like the M. flexuosa palm tree, and may be useful for future frameworks.
ARTICLE | doi:10.20944/preprints202003.0096.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Deep learning; Energy demand; Temporal convolutional network; Time series forecasting
Online: 5 March 2020 (15:02:37 CET)
Modern energy systems collect high volumes of data that can provide valuable information about energy consumption. Electric companies can now use historical data to make informed decisions on energy production by forecasting the expected demand. Many deep learning models have been proposed to deal with these types of time series forecasting problems. Deep neural networks, such as recurrent or convolutional networks, can automatically capture complex patterns in time series data and provide accurate predictions. In particular, Temporal Convolutional Networks (TCNs) are a specialised architecture that has advantages over recurrent networks for forecasting tasks. TCNs are able to extract long-term patterns using dilated causal convolutions and residual blocks, and can also be more efficient in terms of computation time. In this work, we propose a TCN-based deep learning model to improve the predictive performance in energy demand forecasting. Two energy-related time series with data from Spain have been studied: the national electric demand, and the power demand at charging stations for electric vehicles. An extensive experimental study has been conducted, involving more than 1900 models with different architectures and parametrisations. The TCN proposal outperforms the forecasting accuracy of Long Short-Term Memory (LSTM) recurrent networks, which are considered the state-of-the-art in the field.
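The dilated causal convolutions at the core of a TCN can be sketched as follows; the channel count, kernel size, and dilation schedule are illustrative assumptions rather than the architecture tuned in the paper.

```python
import torch
import torch.nn as nn

class CausalConv1d(nn.Module):
    """Dilated causal convolution: output at time t depends only on inputs up to t."""
    def __init__(self, channels, kernel_size=3, dilation=1):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation           # left padding only
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)

    def forward(self, x):                                 # x: (batch, channels, time)
        x = nn.functional.pad(x, (self.pad, 0))
        return self.conv(x)

# Stacking blocks with dilations 1, 2, 4, ... grows the receptive field
# exponentially, which is how a TCN captures long-term demand patterns.
block = nn.Sequential(CausalConv1d(16, dilation=1), nn.ReLU(),
                      CausalConv1d(16, dilation=2), nn.ReLU(),
                      CausalConv1d(16, dilation=4))
series = torch.randn(8, 16, 168)      # e.g. one week of hourly demand features
print(block(series).shape)            # torch.Size([8, 16, 168])
```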
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: waste classification; transfer learning; deep learning; recognition classification
Online: 23 February 2020 (14:01:01 CET)
Using machine learning or deep learning to solve the problem of garbage recognition and classification is an important application in computer vision. However, because garbage datasets are incomplete and complex network models perform poorly on smart terminal devices, existing garbage classification models are not effective. This paper presents a waste classification and identification method based on transfer learning and a lightweight neural network. The lightweight MobileNetV2 network is transferred and rebuilt; the reconstructed network is used for feature extraction, and the extracted features are fed into an SVM to identify 6 types of garbage. The model was trained and verified using the 2,527 labeled garbage images in the TrashNet dataset, ultimately achieving a classification accuracy of 98.4%, which shows that the method can effectively improve classification accuracy and training time, cope with weak data and scarce labels, and mitigate the over-fitting encountered with small datasets in deep learning, making the model robust.
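A minimal sketch of the transfer-learning pipeline described above: a pretrained MobileNetV2 backbone provides pooled features that are fed to an SVM. The random stand-in data, the SVM hyper-parameters, and the absence of any rebuilding of the backbone are simplifying assumptions, not the paper's configuration.

```python
import numpy as np
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input
from sklearn.svm import SVC

# Pretrained MobileNetV2 (no classification head) as the transferred extractor.
backbone = MobileNetV2(weights="imagenet", include_top=False,
                       pooling="avg", input_shape=(224, 224, 3))

def features(images):
    """images: (N, 224, 224, 3) in [0, 255]; returns pooled 1280-d features."""
    return backbone.predict(preprocess_input(images), verbose=0)

# Hypothetical stand-in for the TrashNet images and their 6 class labels.
X_img = np.random.rand(30, 224, 224, 3) * 255.0
y = np.random.randint(0, 6, size=30)

svm = SVC(kernel="rbf", C=10.0)
svm.fit(features(X_img), y)
print(svm.predict(features(X_img[:5])))
```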
ARTICLE | doi:10.20944/preprints202002.0180.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: deep learning; neural attention; loans; loan origination; machine learning
Online: 14 February 2020 (02:45:01 CET)
In this paper we address the problem of understanding why a deep learning model decides that an individual is or is not eligible for a loan. We propose a novel approach for inferring which attributes matter the most for the decision in each specific individual case. Specifically, we leverage concepts from neural attention to devise a novel feature-wise attention mechanism. As we show using real-world datasets, our approach offers unique insights into the importance of various features by producing a decision explanation for each specific loan case. At the same time, we observe that our novel mechanism generates decisions that are much closer to the decisions made by human experts compared with existing competitors.
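A minimal sketch of a feature-wise attention mechanism in the spirit described above: a small network produces a per-applicant weight for each input attribute, and the weighted features drive the decision, so the weights double as a per-case explanation. The layer sizes, the softmax normalization, and the toy inputs are illustrative assumptions, not the authors' exact design.

```python
import torch
import torch.nn as nn

class FeatureWiseAttention(nn.Module):
    """Produce a per-applicant weight for each input attribute, then classify."""
    def __init__(self, n_features, hidden=32):
        super().__init__()
        self.attn = nn.Sequential(nn.Linear(n_features, hidden), nn.Tanh(),
                                  nn.Linear(hidden, n_features))
        self.classifier = nn.Sequential(nn.Linear(n_features, hidden), nn.ReLU(),
                                        nn.Linear(hidden, 1))

    def forward(self, x):                        # x: (batch, n_features)
        weights = torch.softmax(self.attn(x), dim=1)
        logits = self.classifier(weights * x)
        return torch.sigmoid(logits), weights    # decision prob. + explanation

model = FeatureWiseAttention(n_features=10)
applicants = torch.randn(4, 10)                  # hypothetical loan attributes
prob, importance = model(applicants)
print(prob.shape, importance.shape)              # (4, 1) (4, 10)
```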
ARTICLE | doi:10.20944/preprints202006.0368.v1
Subject: Business, Economics And Management, Finance Keywords: Fraud Detection; Recurrent Neural Network; PaySim; Financial Transactions; Deep Learning
Online: 30 June 2020 (11:34:34 CEST)
Online transactions are becoming more popular in the present situation, where the globe is facing the COVID-19 pandemic. Authorities in many countries have asked people to use cashless transactions as far as possible, although in practice this is not feasible for every transaction. Since the number of cashless transactions has been increasing during the lockdown period due to COVID-19, fraudulent transactions are also increasing rapidly. Fraud can be analysed by examining the series of a customer's previous transactions. Normally, banks or other transaction authorities warn their customers when a transaction deviates from the available patterns, since it is possibly fraudulent. For fraud detection during COVID-19, banks and credit card companies apply various methods such as data mining, decision trees, rule-based mining, neural networks, fuzzy clustering, and machine learning, all of which try to identify customers' normal usage patterns based on past activity. The objective of this paper is to find such fraudulent transactions in this difficult situation. Digital payment schemes are often threatened by fraudulent activities, and detecting fraudulent transactions during money transfers may save customers from financial loss. This paper focuses on mobile-based money transactions for fraud detection. A Deep Learning (DL) framework is proposed that monitors and detects fraudulent activities. By implementing and applying a recurrent neural network on the PaySim-generated synthetic financial dataset, deceptive transactions are identified. The proposed method is capable of detecting deceptive transactions with an accuracy of 99.87%, an F1-score of 0.99, and an MSE of 0.01.
ARTICLE | doi:10.20944/preprints202301.0148.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Uncertainty quantification; Deep learning, Alzheimer; MRI; MCD; Classification
Online: 9 January 2023 (06:58:45 CET)
One of the most common forms of dementia is Alzheimer's disease (AD), which leads to progressive mental deterioration. Unfortunately, there is no definitive diagnosis or cure that can stop the condition from progressing. The diagnosis is often performed based on the clinical history and neuropsychological data, including magnetic resonance imaging (MRI). Deep neural network (DNN) algorithms are gaining popularity for medical diagnosis and have been used widely for the analysis of MRI data. DNNs can extract hidden features from thousands of training images automatically. However, they cannot judge how confident they are about their predictions. To use DNNs in safety-critical applications such as medical diagnosis, uncertainty quantification of DNN predictions is crucial. For this purpose, Monte Carlo dropout (MCD) has been widely used; however, it may lead to overconfident and miscalibrated results. This paper proposes a framework in which the MCD algorithm's hyper-parameters are optimized during training using Bayesian optimization for the first time. The conducted optimization leads to assigning high predictive entropy to erroneous predictions, making it possible to recognize risky predictions. The proposed framework is used for AD diagnosis, which has not been done before. We compare our method with some existing methods in the literature based on different uncertainty quantification criteria. The results of comprehensive experiments on the Kaggle dataset, using a deep model pre-trained on the ImageNet dataset, show that the proposed algorithm can quantify uncertainty much better than the existing methods.
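The Monte Carlo dropout step itself can be sketched as follows: dropout layers are kept stochastic at inference, several forward passes are averaged, and the predictive entropy of the averaged probabilities flags risky predictions. The toy network, the number of samples, and the dropout rate are illustrative assumptions; the paper's contribution of tuning the MCD hyper-parameters with Bayesian optimization is not shown here.

```python
import torch
import torch.nn as nn

def mc_dropout_predict(model, x, n_samples=30):
    """Run n_samples stochastic forward passes with dropout kept active and
    return the mean class probabilities plus the predictive entropy."""
    model.eval()
    for m in model.modules():                  # keep dropout layers stochastic
        if isinstance(m, nn.Dropout):
            m.train()
    with torch.no_grad():
        probs = torch.stack([torch.softmax(model(x), dim=1)
                             for _ in range(n_samples)])
    mean_p = probs.mean(dim=0)
    entropy = -(mean_p * torch.log(mean_p + 1e-12)).sum(dim=1)
    return mean_p, entropy

# Toy classifier standing in for the pre-trained AD/normal model.
net = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Dropout(p=0.3), nn.Linear(32, 2))
x = torch.randn(5, 64)
mean_p, entropy = mc_dropout_predict(net, x)
print(entropy)   # high entropy flags predictions that should be treated as risky
```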
ARTICLE | doi:10.20944/preprints201912.0252.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: time series; deep learning; convolutional neural network; recurrence plot; financial market prediction
Online: 19 December 2019 (07:39:54 CET)
An application of deep convolutional neural networks and recurrence plots for financial market movement prediction is presented. Though it is challenging and subjective to interpret its information, the pattern formed by a recurrence plot provides useful insight into the dynamical system. We used recurrence plots of seven financial time series to train a deep neural network for financial market movement prediction. Our approach was tested on our dataset and achieved an average classification accuracy of 53.25%. The result suggests that a well-trained deep convolutional neural network can learn a recurrence plot and predict a financial market direction.
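A minimal construction of a recurrence plot for a scalar series is sketched below; the threshold rule and the synthetic return series are illustrative assumptions, and the paper may use a different embedding or thresholding before feeding the plots to the CNN.

```python
import numpy as np

def recurrence_plot(series, eps=None):
    """Binary recurrence matrix R[i, j] = 1 when |x_i - x_j| < eps."""
    x = np.asarray(series, dtype=float)
    dist = np.abs(x[:, None] - x[None, :])
    if eps is None:
        eps = 0.1 * dist.max()   # simple default threshold for illustration
    return (dist < eps).astype(np.uint8)

# Example: recurrence plot of a short synthetic series, ready to feed a CNN.
returns = np.sin(np.linspace(0, 6 * np.pi, 64)) + 0.05 * np.random.randn(64)
rp = recurrence_plot(returns)
print(rp.shape, rp.mean())   # (64, 64), fraction of recurrent point pairs
```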
ARTICLE | doi:10.20944/preprints202201.0465.v1
Subject: Computer Science And Mathematics, Data Structures, Algorithms And Complexity Keywords: genetic algorithm; deep neural network; hidden layer; optimal architecture; intrusion detection
Online: 31 January 2022 (13:26:18 CET)
Computer network attacks are evolving in parallel with the evolution of hardware and neural network architectures. Despite major advancements in Network Intrusion Detection System (NIDS) technology, most implementations still depend on signature-based intrusion detection systems, which cannot identify unknown attacks. Deep learning can help NIDS detect novel threats since it has a strong generalization ability. The deep neural network's architecture has a significant impact on the model's results. We propose a genetic algorithm-based model to find the optimal number of hidden layers and the number of neurons in each layer of the deep neural network (DNN) architecture for the network intrusion detection binary classification problem. Experimental results demonstrate that the proposed DNN architecture shows better performance than classical machine learning algorithms at a lower computational cost.
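A toy version of the search procedure might look like the following: a genome is the list of hidden-layer widths, mutation adds, removes, or resizes layers, and selection keeps the best candidates. The fitness function here is a placeholder; in the paper it would be the validation performance of the trained intrusion detection DNN.

```python
# Sketch: a tiny genetic algorithm over DNN architectures.
import random

def fitness(genome):                 # placeholder: train/evaluate an NIDS DNN here
    return -abs(len(genome) - 3) - sum(abs(w - 64) for w in genome) / 1000

def mutate(genome):
    g = list(genome)
    if random.random() < 0.3 and len(g) < 6:
        g.append(random.choice([16, 32, 64, 128]))       # add a hidden layer
    if random.random() < 0.3 and len(g) > 1:
        g.pop(random.randrange(len(g)))                   # drop a hidden layer
    i = random.randrange(len(g))
    g[i] = random.choice([16, 32, 64, 128])               # resize one layer
    return g

population = [[random.choice([16, 32, 64, 128])] for _ in range(10)]
for generation in range(20):
    population.sort(key=fitness, reverse=True)
    parents = population[:5]                               # elitist selection
    population = parents + [mutate(random.choice(parents)) for _ in range(5)]

print("best architecture (hidden-layer widths):", max(population, key=fitness))
```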
ARTICLE | doi:10.20944/preprints202209.0060.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: Autonomous Driving; Deep Learning; LIDAR Data; Wavelets; 3D Object Detection
Online: 5 September 2022 (13:03:00 CEST)
3D object detection is crucial for autonomous driving to understand the driving environment. Since the pooling operation causes information loss in a standard CNN, we have designed a wavelet multiresolution analysis-based 3D object detection network without a pooling operation. Additionally, instead of using a single filter like the standard convolution, we use the lower-frequency and higher-frequency coefficients as filters. These filters capture more relevant parts than a single filter, enlarging the receptive field. The model comprises a discrete wavelet transform (DWT) and an inverse wavelet transform (IWT) with skip connections to encourage feature reuse between the contracting and expanding layers. The IWT enriches the feature representation by fully recovering the details lost during the downsampling operation. Element-wise summation is used for the skip connections to decrease the computational burden. We train the model for the Haar and Daubechies (Db4) wavelets. The two-level wavelet decomposition result shows that we can build a lightweight model without losing significant performance. The experimental results on the KITTI BEV and 3D evaluation benchmarks show our model outperforms the PointPillars base model by up to 14% while reducing the number of trainable parameters. Code will be released.
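The wavelet building block can be sketched with one level of the 2-D Haar transform and its exact inverse, which is the property exploited above to replace pooling without information loss; real feature maps would be processed per channel, and the Db4 case is not shown.

```python
# Sketch: one level of the 2-D Haar DWT and its inverse on a single 2-D array.
import numpy as np

def haar_dwt2(x):
    a = (x[0::2, :] + x[1::2, :]) / 2            # vertical average
    d = (x[0::2, :] - x[1::2, :]) / 2            # vertical difference
    ll = (a[:, 0::2] + a[:, 1::2]) / 2           # then horizontal
    lh = (a[:, 0::2] - a[:, 1::2]) / 2
    hl = (d[:, 0::2] + d[:, 1::2]) / 2
    hh = (d[:, 0::2] - d[:, 1::2]) / 2
    return ll, lh, hl, hh                        # low- and high-frequency bands

def haar_iwt2(ll, lh, hl, hh):
    a = np.empty((ll.shape[0], ll.shape[1] * 2)); d = np.empty_like(a)
    a[:, 0::2], a[:, 1::2] = ll + lh, ll - lh
    d[:, 0::2], d[:, 1::2] = hl + hh, hl - hh
    x = np.empty((a.shape[0] * 2, a.shape[1]))
    x[0::2, :], x[1::2, :] = a + d, a - d
    return x                                     # lossless reconstruction

x = np.random.rand(8, 8)
assert np.allclose(haar_iwt2(*haar_dwt2(x)), x)  # no information is lost
```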
REVIEW | doi:10.20944/preprints202104.0202.v1
Subject: Computer Science And Mathematics, Mathematical And Computational Biology Keywords: Spiking Neural Network (SNN); Biological Inspiration; Deep Learning; Neuromorphic Computing
Online: 7 April 2021 (12:13:16 CEST)
Recent advances in deep learning have elevated the multifaceted nature of applications in this field. Artificial neural networks are now a genuinely old technique in the vast area of computer science; the principal ideas and models are more than fifty years old. However, in the modern computing era, third-generation intelligent models have been introduced. In a biological neuron, membrane ion channels control the flow of ions across the membrane by opening and closing in response to voltage changes caused by intrinsic currents and externally conducted signals. The third-generation Spiking Neural Network (SNN) is narrowing the distance between deep learning, machine learning, and neuroscience in a biologically inspired manner; it also connects neuroscience and machine learning to establish high-level, efficient computing. Spiking neural networks operate using spikes, which are discrete events that occur at points in time, rather than continuous values. This paper is a review of the biologically inspired spiking neural network and its applications in different areas. The author aims to present a brief introduction to SNNs, incorporating the mathematical structure, applications, and implementation of SNNs. The paper also presents an overview of machine learning, deep learning, and reinforcement learning. This review can help advanced artificial intelligence researchers gain a compact intuition of spiking neural networks.
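A leaky integrate-and-fire neuron, shown below, illustrates the spike-based behaviour described above: input current is integrated with a leak, and a spike is emitted only when the membrane potential crosses a threshold. Parameter values are illustrative.

```python
# Sketch: a leaky integrate-and-fire (LIF) neuron, the discrete, spike-based
# unit that distinguishes SNNs from continuous-valued artificial neurons.
import numpy as np

def lif_neuron(input_current, dt=1.0, tau=20.0, v_thresh=1.0, v_reset=0.0):
    v, spikes = 0.0, []
    for i in input_current:
        v += dt / tau * (-v + i)          # leaky integration of the input current
        if v >= v_thresh:                 # threshold crossing emits a spike
            spikes.append(1)
            v = v_reset                   # membrane potential resets
        else:
            spikes.append(0)
    return np.array(spikes)

spike_train = lif_neuron(np.full(100, 1.5))    # constant drive -> periodic spiking
print(spike_train.sum(), "spikes in 100 time steps")
```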
ARTICLE | doi:10.20944/preprints202106.0613.v1
Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: LRTI; URTI; Asthma; Cough Classification; Respiratory Pathology Classification; MFCCs; BiLSTM; Deep Neural Networks
Online: 25 June 2021 (09:45:00 CEST)
Intelligent systems are transforming the world, as well as our healthcare system. We propose a deep learning-based cough sound classification model that can distinguish between children with healthy coughs and pathological coughs caused by asthma, upper respiratory tract infection (URTI), or lower respiratory tract infection (LRTI). In order to train a deep neural network model, we collected a new dataset of cough sounds, labelled with the clinicians' diagnoses. The chosen model is a bidirectional long short-term memory network (BiLSTM) based on Mel Frequency Cepstral Coefficient (MFCC) features. When trained to classify two classes of coughs -- healthy or pathological (in general or belonging to a specific respiratory pathology) -- the resulting model reaches an accuracy exceeding 84% against the label provided by the physicians' diagnosis. In order to classify a subject's respiratory pathology, the results of multiple cough epochs per subject were combined; the resulting prediction accuracy exceeds 91% for all three respiratory pathologies. However, when the model is trained to discriminate among all four classes of coughs, overall accuracy drops: one class of pathological cough is often misclassified as another. If a healthy cough classified as healthy and a pathological cough classified as any pathology are both counted as correct, the overall accuracy of the four-class model is above 84%. A longitudinal study of the MFCC feature space comparing pathological and recovered coughs collected from the same subjects revealed that pathological coughs, irrespective of the underlying condition, occupy the same feature space, making them harder to differentiate using MFCC features alone.
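A minimal sketch of the feature/model pairing described above: MFCCs extracted with librosa from a cough recording are fed to a PyTorch BiLSTM classifier. The synthetic input, feature count, and layer sizes are assumptions; the paper's dataset and training procedure are not reproduced.

```python
# Sketch: MFCC features from a cough clip fed to a BiLSTM classifier.
import numpy as np
import librosa
import torch
import torch.nn as nn

y = np.random.randn(2 * 16000).astype(np.float32)       # stand-in for a 2 s cough clip
mfcc = librosa.feature.mfcc(y=y, sr=16000, n_mfcc=13)    # (13, n_frames)
x = torch.tensor(mfcc.T, dtype=torch.float32)[None]      # (1, n_frames, 13)

class CoughBiLSTM(nn.Module):
    def __init__(self, n_mfcc=13, hidden=64, n_classes=2):
        super().__init__()
        self.rnn = nn.LSTM(n_mfcc, hidden, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * hidden, n_classes)        # healthy vs pathological

    def forward(self, x):
        out, _ = self.rnn(x)
        return self.fc(out[:, -1])                        # last time step, both directions

logits = CoughBiLSTM()(x)
```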
ARTICLE | doi:10.20944/preprints202105.0636.v1
Subject: Engineering, Automotive Engineering Keywords: cultural heritage; environment; deep learning; artificial intelligence; neural network.
Online: 26 May 2021 (13:06:34 CEST)
This work aims to contribute to a better understanding of the use of public street spaces. (1) Background: With a multidisciplinary approach, the objective of this work is to propose an experimental method that is reproducible on a large scale. (2) Study area: The applied methodology uses artificial intelligence to analyze Google Street View (GSV) images at street level. (3) Method: The purpose is to validate a methodology that makes it possible to characterize and quantify the use (pedestrians and cars) of some squares in Rome belonging to different historical periods. (4) Results: Through the use of machine vision techniques, typical of artificial intelligence and based on convolutional neural networks, a historical reading of some selected squares is proposed with the aim of interpreting the dynamics of use and identifying some critical issues in progress. (5) Conclusions: This work validated the usefulness of a method based on artificial intelligence for the analysis of GSV images at street level.
ARTICLE | doi:10.20944/preprints201910.0056.v1
Subject: Engineering, Control And Systems Engineering Keywords: Fusarium head blight disease; color imaging; deep neural network
Online: 6 October 2019 (04:11:58 CEST)
Fusarium head blight (FHB) disease is extensively distributed worldwide. This disease damages grain quality and reduces yield. The detection of this disease in a high throughput way is crucial to planters and breeders. Our study focused on developing a method for processing wheat color images and accurately detecting disease areas using deep learning and image processing techniques. The color images of wheat at the milky stage were collected and processed to construct datasets, which were used to retrain a deep convolutional neural network model using transfer learning. Testing results showed that the model can detect spikes, and the coefficient of determination of the number of spikes between the manual count and the detection was 0.80. The model was assessed, and the mean average precision for the testing dataset was 0.9201. On the basis of the results of spike detection, a new color feature was applied to obtain the gray image of each spike. Then, a modified region growing algorithm was implemented to segment and detect the diseased areas of each spike. Results show that the region growing algorithm performs better than K-means and Otsu’s method in segmenting the FHB disease. Overall, this study demonstrates that deep learning techniques enable the accurate detection of FHB in wheat using color images, and the proposed method can effectively detect spikes and diseased areas, thereby improving the efficiency of FHB detection.
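The segmentation step can be illustrated with a plain intensity-based region growing routine like the one below; the seed point, similarity threshold, and 4-connectivity are illustrative choices, not the paper's modified algorithm.

```python
# Sketch: intensity-based region growing on a grayscale spike image.
import numpy as np
from collections import deque

def region_grow(img, seed, thresh=0.1):
    h, w = img.shape
    mask = np.zeros((h, w), dtype=bool)
    queue = deque([seed])
    mask[seed] = True
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):        # 4-connectivity
            rr, cc = r + dr, c + dc
            if 0 <= rr < h and 0 <= cc < w and not mask[rr, cc]:
                if abs(img[rr, cc] - img[seed]) <= thresh:        # similar to the seed
                    mask[rr, cc] = True
                    queue.append((rr, cc))
    return mask

img = np.random.rand(64, 64)                  # stand-in for a gray spike image
diseased = region_grow(img, seed=(32, 32), thresh=0.2)
```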
ARTICLE | doi:10.20944/preprints202107.0699.v1
Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: adaptive computing; dynamic deep neural structure; adaptive convolution; dynamic training
Online: 30 July 2021 (12:25:45 CEST)
The colossal depths of deep neural networks sometimes suffer from ineffective backpropagation of gradients through all their layers, whereas the strong performance of shallower multilayer neural structures proves their ability to increase the gradient signals in the early stages of training, which are easily backpropagated for global loss corrections. Shallow neural structures are always a good starting point for encouraging sturdy feature characteristics of the input. In this research, a shallow deep neural structure called PrimeNet is proposed. PrimeNet aims to dynamically identify and encourage quality visual indicators from the input to be used by the subsequent deep network layers, and to increase the gradient signals in the lower stages of the training pipeline. In addition, layerwise training is performed with the help of locally generated errors, which means the gradient is not backpropagated to previous layers and the hidden layer weights are updated during the forward pass, making this structure a backpropagation-free variant. PrimeNet has obtained state-of-the-art results on various image datasets, attaining the dual objective of (1) a compact dynamic deep neural structure which (2) eliminates the problem of backward locking. The PrimeNet unit is proposed as an alternative to traditional convolution and dense blocks for faster and more memory-efficient training, outperforming previously reported results aimed at adaptive methods for parallel and multilayer deep neural systems.
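The local-error idea can be sketched as follows: each block has its own auxiliary head and optimizer, the local loss is backpropagated only within the block, and the activation handed to the next block is detached. Layer sizes and the local loss are assumptions, not the PrimeNet design.

```python
# Sketch: layerwise training with locally generated errors. No gradient flows
# backwards across blocks, so there is no backward locking.
import torch
import torch.nn as nn

blocks = nn.ModuleList([nn.Sequential(nn.Linear(784, 256), nn.ReLU()),
                        nn.Sequential(nn.Linear(256, 128), nn.ReLU())])
aux_heads = nn.ModuleList([nn.Linear(256, 10), nn.Linear(128, 10)])
opts = [torch.optim.SGD(list(b.parameters()) + list(h.parameters()), lr=0.01)
        for b, h in zip(blocks, aux_heads)]

x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
h = x
for block, head, opt in zip(blocks, aux_heads, opts):
    h = block(h)
    local_loss = nn.functional.cross_entropy(head(h), y)   # locally generated error
    opt.zero_grad()
    local_loss.backward()                                   # stays inside the block
    opt.step()
    h = h.detach()                                          # next block sees no gradient path
```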
ARTICLE | doi:10.20944/preprints202309.0476.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: Joint information and energy relaying; energy harvesting; deep deterministic policy gradient
Online: 7 September 2023 (04:48:03 CEST)
Wireless energy harvesting (EH) communication has long been considered a sustainable networking solution. However, it has been limited in efficiency, which has been a major obstacle. Recently, strategies such as energy relaying and borrowing have been explored to overcome these difficulties and provide long-range wireless sensor connectivity. In this article, we examine the reliability of the wireless-powered communication network by maximizing the net bit rate. To accomplish our goal, we focus on enhancing the performance of hybrid access points and information sources by optimizing their transmit power. Additionally, we aim to maximize the use of harvested energy by energy-harvesting relays for both information transmission and energy relaying. However, this optimization problem is complex, as it involves non-convex variables and requires combinatorial optimization of the relay selection indicators for decode-and-forward (DF) relaying. To simplify this problem, we utilize the Markov decision process and a deep reinforcement learning framework based on the deep deterministic policy gradient algorithm. This approach enables us to tackle an intractable problem that conventional convex optimization techniques would struggle to solve in complex environments. The proposed algorithm significantly improves the end-to-end net bit rate of the smart energy borrowing and relaying EH system by 13.22%, 27.57%, and 14.12% compared to the benchmark algorithm based on borrowing energy with an adaptive reward for Quadrature Phase Shift Keying, 8-PSK, and 16-Quadrature Amplitude Modulation schemes, respectively.
ARTICLE | doi:10.20944/preprints201811.0612.v1
Subject: Environmental And Earth Sciences, Geophysics And Geology Keywords: geophysical signal processing; pattern recognition; temporal convolutional neural networks; seismology; deep learning; nuclear treaty monitoring
Online: 29 November 2018 (03:37:48 CET)
The detection of seismic events at regional and teleseismic distances is critical to Nuclear Treaty Monitoring. Traditionally, detecting regional and teleseismic events has required the use of an expensive multi-instrument seismic array; however in this work, we present DeepPick, a novel seismic detection algorithm capable of array-like performance from a single trace. We achieve this directly, by training our single-trace detector against labeled events from an array catalog, and by utilizing a deep temporal convolutional neural network. The training data consists of all arrivals in the International Seismological Centre Catalog for seven seismic arrays over a five year window from 1 Jan 2010 to 1 Jan 2015, yielding a total training set of 608,362 detections. The test set consists of the same seven arrays over a one year window from 1 Jan 2015 to 1 Jan 2016. We report our results by training the algorithm on six of the arrays and testing it on the seventh, so as to demonstrate the transportability and generalization of the technique to new stations. Detection performance against this test set is outstanding. Fixing a type-I error rate of 1%, the algorithm achieves an overall recall rate of 73% on the 141,095 array beam picks in the test set, yielding 102,394 correct detections. This is more than 4 times the 23,259 detections found in the analyst-reviewed single-trace catalogs over the same period, and represents an 8dB improvement in detector sensitivity over current methods. These results demonstrate the potential of our algorithm to significantly enhance the effectiveness of the global treaty monitoring network.
ARTICLE | doi:10.20944/preprints201812.0211.v1
Subject: Computer Science And Mathematics, Information Systems Keywords: VHR image; building roof; segmentation; GF2; deep convolution neural network
Online: 18 December 2018 (04:07:47 CET)
This paper presents a novel approach for semantic segmentation of building roofs in dense urban environments with a Deep Convolutional Neural Network (DCNN), using imagery acquired by a Chinese Very High Resolution (VHR) satellite mission, i.e. GaoFen-2 (GF-2). To provide an operational end-to-end workflow for accurate building roof mapping with feature extraction as well as image segmentation, a fully convolutional DCNN with both convolutional and deconvolutional layers is designed to perform the VHR image analysis for labeling pixels. Owing to the diverse urban patterns and building styles across large areas, sample image datasets of building roof and non-building roof are collected over different metropolitan regions in China. We selected typical cities with dense urban environments in each metropolitan region as study areas for collecting training and test samples. A high-performance cluster with GPU-mounted workstations is employed to perform the model training and optimization. With the building roof samples collected over different cities, the predictive model with multiple NN layers is developed for building roof labeling. The validation of the building roof map shows that the overall accuracy (OA) and mean Intersection over Union (mIoU) of the DCNN-based segmentation are 94.67% and 0.85, respectively, while the CRF-refined segmentation achieved an OA of 94.69% and an mIoU of 0.83. The results suggest that the proposed approach is a promising solution for building roof mapping with VHR images over large areas across different urban and building patterns. With the operational acquisition of GF-2 VHR imagery, it is expected that an automated pipeline can be developed for operational built-up area monitoring and timely updates of building roof maps over large areas.
ARTICLE | doi:10.20944/preprints201907.0121.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Artificial Neural Networks; Deep Learning; Generative Neural Networks; Incremental Learning; Novelty detection; Catastrophic Interference
Online: 8 July 2019 (14:29:28 CEST)
Deep learning models are part of the family of artificial neural networks and, as such, they suffer from catastrophic interference when they learn sequentially. In addition, most of these models have a rigid architecture which prevents the incremental learning of new classes. To overcome these drawbacks, in this article we propose the Self-Improving Generative Artificial Neural Network (SIGANN), a type of end-to-end deep neural network system which is able to ease the catastrophic forgetting problem when learning new classes. In this method, we introduce a novelty detection model to automatically detect samples of new classes; moreover, an adversarial autoencoder is used to produce samples of previous classes. The system consists of three main modules: a classifier module implemented using a deep convolutional neural network, a generator module based on an adversarial autoencoder, and a novelty detection module implemented using an OpenMax activation function. Using the EMNIST data set, the model was trained incrementally, starting with a small set of classes. The results of the simulation show that SIGANN is able to retain previous knowledge with only gradual forgetting over each learning sequence. Moreover, SIGANN can detect new classes that are hidden in the data and, therefore, proceed with incremental class learning.
ARTICLE | doi:10.20944/preprints201607.0085.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: CNN; Deep Learning; AlexNet; VGGNet; Texture Descriptor; Garment Categories; 13 Garment Trend Identification; Design Classification for Garments.
Online: 27 July 2016 (15:39:53 CEST)
Automatic garment design class identification for recommending fashion trends is important nowadays because of the rapid growth of online shopping. By learning the properties of images efficiently, a machine can achieve better classification accuracy. Several methods based on hand-engineered feature coding exist for identifying garment design classes, but most of the time those methods do not achieve strong results. Recently, deep Convolutional Neural Networks (CNNs) have shown better performance on various object recognition tasks. A deep CNN uses multiple levels of representation and abstraction that help a machine understand types of data (images, sound, and text) more accurately. In this paper, we apply deep CNNs to identifying garment design classes. To evaluate performance, we used two well-known CNN models, AlexNet and VGGNet, on two different datasets. We also propose a new CNN model based on AlexNet that outperforms the existing state of the art by a significant margin.
ARTICLE | doi:10.20944/preprints201808.0130.v1
Subject: Engineering, Mechanical Engineering Keywords: SHM; Electromechanical Impedance; Piezoelectricity; Intelligent Fault Diagnosis; Machine Learning; CNN; Deep Learning
Online: 6 August 2018 (21:51:53 CEST)
Convolutional Neural Network (CNN) applications have recently emerged in Structural Health Monitoring (SHM) systems, focusing mostly on vibration analysis. However, the SHM literature clearly shows a lack of applications combining PZT (Lead Zirconate Titanate) based methods with CNNs. Likewise, applications using CNNs along with the Electromechanical Impedance (EMI) technique in SHM systems are rare. To encourage this combination, an innovative SHM solution combining EMI-PZT and CNN is presented here. To accomplish this, the EMI signature is split into several parts, and the Euclidean distances among them are computed to form an RGB (red, green and blue) frame. As a result, we introduce a dataset formed from the EMI-PZT signals of 720 frames, encompassing a total of 4 types of structural conditions for each PZT. In a case study, the CNN-based method was experimentally evaluated using three PZTs glued onto an aluminum plate. The results reveal effective pattern classification, yielding a 100% hit rate which outperforms other SHM approaches. Furthermore, the method needs only a small dataset for training the CNN, providing several advantages for industrial applications.
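The frame-building step described above can be sketched as follows: the EMI signature is cut into equal segments, pairwise Euclidean distances between segments form a matrix, and that matrix is expanded to a three-channel (RGB-like) image for the CNN. The segment count and the channel mapping are assumptions; the paper's exact construction may differ.

```python
# Sketch: EMI signature -> segment-distance matrix -> 3-channel frame.
import numpy as np

def emi_to_frame(signature, n_segments=30):
    segs = np.array_split(signature, n_segments)
    length = min(len(s) for s in segs)
    segs = np.stack([s[:length] for s in segs])                    # (N, length)
    dists = np.linalg.norm(segs[:, None, :] - segs[None, :, :], axis=-1)
    rng = dists.max() - dists.min()
    dists = (dists - dists.min()) / (rng + 1e-12)                  # scale to [0, 1]
    return np.repeat(dists[..., None], 3, axis=-1)                 # (N, N, 3) frame

frame = emi_to_frame(np.random.rand(3000))     # toy impedance signature
print(frame.shape)                             # (30, 30, 3)
```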
ARTICLE | doi:10.20944/preprints202302.0299.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: deepfake detection; CNN; deep neural network; computer vision; scale invariant feature transform; histogram of oriented gradients
Online: 17 February 2023 (06:51:37 CET)
Deepfakes are manipulated or altered images or video created using deep learning models with high levels of photorealism. The two popular methods of producing a deepfake are based on either convolutional neural networks (CNNs) or autoencoders. Deepfakes created using CNNs show comparatively higher levels of realism, yet often leave artifacts and distortions in the generated media that can be detected using machine learning and deep learning algorithms. In recent years, there has been an influx of periocular image and video data because of the increased usage of face masks. By wearing masks, much of what is used for facial recognition is hidden, leaving only the periocular region visible to an observer. This loss of vital information leads to easier misidentification of media, making deepfakes less likely to be identified as fake. In this work, feature extraction methods such as the Scale-Invariant Feature Transform (SIFT), Histogram of Oriented Gradients (HOG), and CNNs are used to train an ensemble deep learning model to detect deepfakes in videos on a frame-by-frame level based on the periocular region. Our proposed model is able to distinguish original and manipulated images with accuracies around 98.9 percent, which improves on previous works by combining SIFT and HOG for deepfake detection in convolutional neural networks.
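A minimal sketch of the handcrafted feature extraction mentioned above, using OpenCV's SIFT and HOG on a periocular crop; the crop, descriptor parameters, and downstream ensemble are illustrative assumptions.

```python
# Sketch: SIFT and HOG descriptors from a periocular crop (OpenCV >= 4.4).
import cv2
import numpy as np

eye_region = np.random.randint(0, 256, (64, 64), dtype=np.uint8)    # stand-in crop

sift = cv2.SIFT_create()
keypoints, sift_desc = sift.detectAndCompute(eye_region, None)       # (n_kp, 128) or None

hog = cv2.HOGDescriptor((64, 64), (16, 16), (8, 8), (8, 8), 9)       # one window = whole crop
hog_desc = hog.compute(eye_region)                                    # fixed-length vector

print(0 if sift_desc is None else sift_desc.shape[0], "SIFT keypoints,",
      hog_desc.shape[0], "HOG values")
```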
ARTICLE | doi:10.20944/preprints202304.0996.v1
Subject: Biology And Life Sciences, Biology And Biotechnology Keywords: Convolutional Neural Network; Deep Learning; Photoplethysmography; Respiratory Rate; Time Series
Online: 26 April 2023 (13:17:24 CEST)
Respiratory rate is an important biomarker that indicates changes in the clinical condition of critically ill patients, so a surveillance tool that can accurately monitor the changing respiratory rate in real time is needed. By investigating various machine learning models, we propose a new model for real-time respiratory rate estimation using the photoplethysmogram. A new photoplethysmogram-driven respiratory rate dataset (StMary) was collected from the surgical intensive care unit of a tertiary referral hospital using a photoplethysmogram signal collector. For 50 patients and 50 healthy volunteers, a 2-minute photoplethysmogram was collected twice for each subject. To evaluate a subject's respiratory rate, the signal was input into the deep neural network model we built; the dataset was split into training, validation, and testing sets, and 4-fold cross-validation was applied. Our deep neural network model, trained with StMary and two public datasets (BIDMC and CapnoBase) individually or as a selectively merged dataset, showed a low error rate in respiratory rate measurement. The model trained with StMary showed a low mean absolute error (1.0273±0.8965), and the model trained with all three datasets (CapnoBase, BIDMC, and StMary) showed a lower error rate (1.7359±1.6724) than the model trained with CapnoBase and BIDMC alone (1.9480±1.6751). We verified the performance of a model estimating respiratory rate from the photoplethysmogram, and our dataset can contribute as clinical research data supporting artificial intelligence models that evaluate respiratory rate and surveillance tools, for testing whether their monitoring function works properly.
ARTICLE | doi:10.20944/preprints202211.0437.v3
Subject: Engineering, Civil Engineering Keywords: deep neural network; long short-term memory; suspended sediment; discharge
Online: 16 December 2022 (08:08:08 CET)
The dynamics of suspended sediment involves inherent non-linearity and complexity as a result of the presence of both spatial variability of the basin characteristics and temporal climatic patterns. As a result of this complexity, the conventional sediment rating curve (SRC) and other empirical methods produce inaccurate predictions. Deep neural networks (DNNs) have emerged over the last few decades as one of the advanced modeling techniques capable of addressing inherent non-linearity in hydrological processes. DNN algorithms are used to perform predictive analysis and investigate the interdependencies among the most pivotal water quantity and quality parameters, i.e., discharge, suspended sediment concentration (SSC), and turbidity. In this study, the long short-term memory (LSTM) algorithm of DNNs is used to model the discharge-suspended sediment relationship for the Stony Clove Creek. The simulations were run using primary data on discharge, SSC, and turbidity. To develop the DNN models and examine the effects of input vectors, combinations of different input vectors (namely discharge and SSC) for the current and previous days are considered. Furthermore, a suitable modelling approach with an appropriate model input structure is suggested based on model performance indices for the training and testing phases. The performance of the developed models is assessed using statistical indices such as root mean square error (RMSE), mean absolute error (MAE), and coefficient of determination (R2). Statistically, the DNN-based models performed well in simulating the daily SSC against the observed sediment concentration series. The study demonstrates the suitability of the DNN approach for simulation and estimation of daily SSC, opening up new research avenues for applying hybrid soft computing models in hydrology.
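A compact sketch of the lagged-input idea: an LSTM maps a short window of past discharge (Q) and SSC values to the next day's SSC. The window length and layer sizes are illustrative, not the study's tuned configuration.

```python
# Sketch: LSTM regression from lagged (Q, SSC) inputs to next-day SSC.
import torch
import torch.nn as nn

class SSCLSTM(nn.Module):
    def __init__(self, n_inputs=2, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(n_inputs, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)

    def forward(self, x):                  # x: (batch, lag_days, [Q, SSC])
        _, (h, _) = self.lstm(x)
        return self.out(h[-1])             # next-day SSC estimate

model = SSCLSTM()
window = torch.randn(16, 3, 2)             # 16 samples, 3 lag days, Q and SSC
pred = model(window)                       # (16, 1)
```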
REVIEW | doi:10.20944/preprints202206.0167.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: deep learning; convolutional neural network; brain tumor classification; clinical application
Online: 13 June 2022 (04:57:42 CEST)
Deep learning has shown remarkable results in every field, especially in the biomedical field, due to its ability to exploit large-scale datasets. A convolutional neural network (CNN) is a widely used deep learning approach to solve medical imaging problems. Over the past few years, many studies have focused on CNN-based techniques for brain tumor diagnosis. There are, however, still some critical challenges that CNNs face toward clinical application. This study presents a comprehensive review of current literature on CNN architectures for brain tumor classification. We compare the key achievements in the performance evaluation metrics of the applied classification algorithms. In addition, this review assesses the clinical effectiveness of the included studies to elaborate on the limitations and directions of this area for future work. No review focusing on the clinical effectiveness of previous works in this field has been published. We believe that this study has the potential to elevate the application of CNN-based deep learning methods in clinical practice and can also serve as a quick reference for biomedical researchers who are interested in this field.
REVIEW | doi:10.20944/preprints202102.0340.v1
Subject: Computer Science And Mathematics, Security Systems Keywords: Cybersecurity; Deep Learning; Artificial Neural Network; Artificial Intelligence; Cyber-Attacks; Cybersecurity Analytics; Cyber Threat Intelligence
Online: 16 February 2021 (15:31:02 CET)
Deep learning (DL), which originated from artificial neural networks (ANNs), is one of the major technologies enabling today's smart cybersecurity systems and policies to function in an intelligent manner. Popular deep learning techniques, such as Multi-layer Perceptron (MLP), Convolutional Neural Network (CNN or ConvNet), Recurrent Neural Network (RNN) or Long Short-Term Memory (LSTM), Self-organizing Map (SOM), Auto-Encoder (AE), Restricted Boltzmann Machine (RBM), Deep Belief Networks (DBN), Generative Adversarial Network (GAN), Deep Transfer Learning (DTL or Deep TL), Deep Reinforcement Learning (DRL or Deep RL), or their ensembles and hybrid approaches can be used to intelligently tackle diverse cybersecurity issues. In this paper, we aim to present a comprehensive overview from the perspective of these neural networks and deep learning techniques according to today's diverse needs. We also discuss the applicability of these techniques in various cybersecurity tasks such as intrusion detection, identification of malware or botnets, phishing, predicting cyber-attacks, e.g. denial of service (DoS), fraud detection or cyber-anomalies, etc. Finally, we highlight several research issues and future directions within the scope of our study in the field. Overall, the ultimate goal of this paper is to serve as a reference point and guideline for academia and professionals in the cyber industry, especially from the deep learning point of view.
ARTICLE | doi:10.20944/preprints201807.0086.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: vibration measurement; frequency prediction; deep learning; convolutional neural network; photogrammetry; computer vision; non-contact measurement
Online: 5 July 2018 (08:31:00 CEST)
Vibration measurement serves as the basis for various engineering practices such as natural frequency or resonant frequency estimation. As image acquisition devices become cheaper and faster, vibration measurement and frequency estimation through image sequence analysis continue to receive increasing attention. In the conventional photogrammetry and optical methods of frequency measurement, vibration signals are first extracted before implementing the vibration frequency analysis algorithm. In this work, we demonstrated that frequency prediction can be achieved using a single feed-forward convolutional neural network. The proposed method is verified using a vibration signal generator and excitation system, and the result obtained was compared with that of an industrial contact vibrometer in a real application. Our experimental results demonstrate that the proposed method can achieve acceptable prediction accuracy even in unfavorable field conditions.
Subject: Engineering, Automotive Engineering Keywords: traffic engineering; traffic incident detection; CNN-XGBoost; Convolution Neural Network; Deep Learning
Online: 15 April 2020 (14:13:35 CEST)
Accurate and efficient traffic incident detection methods can effectively alleviate traffic congestion caused by traffic incidents, prevent secondary accidents, and improve the safety of urban road traffic. Aiming at the problems that traditional machine learning event detection methods cannot fully extract the parameter characteristics of traffic flow and are not suitable for multi-dimensional, non-linear, massive data, we propose a new traffic incident detection method (CNN-XGBoost). This method combines the respective advantages of the Convolution Neural Network (CNN) and Extreme Gradient Boosting (XGBoost). Firstly, we preprocess the original freeway traffic incident detection dataset by constructing the initial variable set, normalizing the data, balancing the data, and reorganizing the dimensions. Secondly, we use the CNN to automatically extract deep features from the event detection data and use XGBoost as a classifier on the extracted features for expressway traffic incident detection. Finally, we carry out simulation experiments on CNN-XGBoost using the dataset of Hangzhou expressway microwave detectors in China. The experimental results show that, compared with XGBoost, CNN, Support Vector Machine (SVM), Gradient Boosting Decision Tree (GBDT), and other methods, the CNN-XGBoost method can effectively improve the accuracy of expressway traffic incident detection and has better generalization ability.
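The two-stage design can be sketched as below: a small CNN turns the reorganized traffic-flow data into a feature vector, and XGBoost classifies incident versus non-incident on those features. Input shape, network, and hyper-parameters are illustrative assumptions.

```python
# Sketch: CNN feature extraction followed by an XGBoost classifier.
import numpy as np
import torch
import torch.nn as nn
from xgboost import XGBClassifier

cnn = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                    nn.AdaptiveAvgPool2d(4), nn.Flatten())         # -> 8*4*4 = 128 features

def extract_features(x):                 # x: (n, 1, H, W) traffic-flow "images"
    with torch.no_grad():
        return cnn(torch.tensor(x, dtype=torch.float32)).numpy()

X = np.random.rand(200, 1, 12, 12)       # toy detector data
y = np.random.randint(0, 2, 200)         # incident labels
clf = XGBClassifier(n_estimators=100, max_depth=4)
clf.fit(extract_features(X), y)
print(clf.predict(extract_features(X[:5])))
```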
ARTICLE | doi:10.20944/preprints202307.0053.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: Wireless Sensor Network (WSN); Low-Energy Adaptive Clustering Hierarchy (LEACH); Sensor Nodes (SNs); Deep Learning (DL); Artificial Neural Networks (ANNs)
Online: 3 July 2023 (10:52:07 CEST)
Applications for Wireless Sensor Networks (WSNs) range from monitoring the environment to automating factories. However, sustained and effective functioning is made more difficult by the limited energy supplies of Sensor Nodes (SNs), for which optimization is the main issue. With the aim of increasing the lifespan of a WSN by decreasing its energy consumption, the Low-Energy Adaptive Clustering Hierarchy (LEACH) protocol combined with a Deep Learning (DL) algorithm is analyzed in this paper. LEACH is a hierarchical mechanism that elects Cluster Heads (CHs) and regularly rotates their positions in order to distribute energy use effectively across nodes. A Deep Learning method is then used to further improve energy optimization. In many applications, Deep Learning methods such as Artificial Neural Networks (ANNs) have proven very useful. Using this method, WSNs can make more efficient decisions that reduce energy consumption. Data aggregation, duty cycling, and transmission protocols may all be optimized through the Deep Learning model's ability to recognize patterns and forecast network behavior. This results in lower energy consumption, a longer network lifespan, and better overall performance.
ARTICLE | doi:10.20944/preprints201812.0258.v1
Subject: Chemistry And Materials Science, Surfaces, Coatings And Films Keywords: copper; polymer coatings; polyvinyl alcohol; silver nanoparticles; deep learning; CNN
Online: 21 December 2018 (07:51:06 CET)
In order to design effective protective coatings against corrosion, polyvinyl alcohol (PVA), both as a pure compound and as a composite with silver nanoparticles (nAg/PVA), was electrodeposited on a copper surface employing electrochemical techniques such as linear potentiometry and cyclic voltammetry. A new paradigm was used to distinguish the features of the coatings: a deep Convolutional Neural Network (CNN) was implemented to automatically and hierarchically extract discriminative characteristics from optical microscopy images. The main arguments for a CNN implementation in the surface science of materials are the following: artificial intelligence techniques can be successfully applied to learn differences between surface coatings; given their popularity for image processing, CNNs can model images related to the problem of coatings; and deep learning is able to extract features that distinguish material surfaces. To provide an overview of the copper surface, the CNN was applied on microscope slides (CNN@microscopy) and inherently learnt distinctive characteristics for each class of surface morphology. In our study, CNN-based assessment of the material surface morphology, without human interference, was successfully used to extract the similarities and differences between unprotected and protected surfaces and to establish the performance of PVA and nAg/PVA in retarding copper corrosion.
ARTICLE | doi:10.20944/preprints202310.1519.v1
Subject: Environmental And Earth Sciences, Remote Sensing Keywords: Deep Learning; Implicit Neural Representation; Sea Surface Temperature; Super Resolution; Satellite Retrieval Climate Data; Temporal Information
Online: 24 October 2023 (13:26:54 CEST)
Accurate climate data at fine spatial resolution are essential for scientific research and the development and planning of crucial social systems, such as energy and agriculture. Among them, sea surface temperature plays a critical role, as the associated El Niño-Southern Oscillation (ENSO) is considered a significant signal of the global interannual climate system. In this paper, we propose an implicit neural representation-based interpolation method with temporal information (T_INRI) to reconstruct climate data at high spatial resolution, with sea surface temperature as the research object. Traditional deep learning models for generating high-resolution climate data are only applicable to fixed resolution enhancement scales. In contrast, the proposed T_INRI method is not limited to the enhancement scale provided during the training process, and its results indicate that it can enhance low-resolution input by an arbitrary scale. Additionally, we discuss the impact of temporal information on the generation of high-resolution climate data, specifically, which month the low-resolution sea surface temperature data is from. Our experimental results indicate that T_INRI is advantageous over traditional interpolation methods under different enhancement scales, and the temporal information can improve T_INRI performance for different calendar months. We also examined the potential capability of T_INRI in recovering missing grid values. These results demonstrate that the proposed T_INRI is a promising method for generating high-resolution climate data and has significant implications for climate research and related applications.
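The core of an implicit neural representation for gridded SST can be sketched as a coordinate MLP that maps continuous (lat, lon) positions plus a month encoding to a temperature value, so the field can be queried at any resolution. The architecture and encoding below are illustrative, not the T_INRI design.

```python
# Sketch: a coordinate MLP as an implicit neural representation of an SST field.
import math
import torch
import torch.nn as nn

class SSTField(nn.Module):
    def __init__(self, hidden=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(4, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1))

    def forward(self, lat, lon, month):
        m = 2 * math.pi * month / 12.0                         # cyclic month encoding
        coords = torch.stack([lat, lon, torch.sin(m), torch.cos(m)], dim=-1)
        return self.net(coords)                                # SST at arbitrary coordinates

field = SSTField()
# query a finer grid than the training data simply by asking for more points
lat = torch.linspace(-1, 1, 200).repeat_interleave(200)
lon = torch.linspace(-1, 1, 200).repeat(200)
sst = field(lat, lon, torch.full_like(lat, 7.0))               # July field, (40000, 1)
```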
ARTICLE | doi:10.20944/preprints202006.0056.v1
Subject: Chemistry And Materials Science, Metals, Alloys And Metallurgy Keywords: Microstructure Modelling; Representative Volume Elements; DP-steel; Machine Learning; Deep Learning; Wasserstein GAN
Online: 5 June 2020 (14:37:20 CEST)
For the generation of representative volume elements, a statistical description of the relevant parameters is necessary. These parameters usually describe the geometric structure of a single grain. Commonly, parameters like area, aspect ratio, and the slope of the grain relative to the rolling direction are applied. However, usually simple distribution functions like the log-normal or gamma distribution are used, and these do not take the interdependencies between the microstructural parameters into account. To fully describe a metallic microstructure, these interdependencies between the individual parameters need to be accounted for. To accomplish this representation, a machine learning approach was applied in this study. By implementing a Wasserstein generative adversarial network, the distributions as well as the interdependencies could be described accurately. A validation scheme was applied to verify the excellent match between the microstructure input data and the synthetically generated output data.
ARTICLE | doi:10.20944/preprints202311.1286.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: ransomware; malware classification; deep learning; few-shot learning; entropy features; transfer learning
Online: 21 November 2023 (10:20:15 CET)
Ransomware attacks have rapidly proliferated, inflicting severe financial damages on businesses and individuals. Machine learning approaches to automate ransomware detection have shown promise but grapple with challenges like limited training data. This study introduces a novel deep learning model for few-shot ransomware classification. The model employs entropy features derived directly from malware binaries coupled with a twin neural network architecture utilizing transfer learning. Tests on over 1000 samples across 11 families demonstrate a weighted F1-score of 85.8%, surpassing existing methods. The approach mitigates biases in limited training data and preserves intricacies lost in image-based features. It exhibits precise classification capabilities even with sparse samples of new ransomware variants. The research highlights the potential of entropy-driven deep learning to equip defenses against emerging zero-day ransomware strains.
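The entropy features can be sketched as Shannon entropy computed over sliding windows of a binary's bytes, yielding a profile vector per sample; window size and stride are illustrative, and the twin-network classifier is not shown.

```python
# Sketch: sliding-window byte entropy profile of a binary.
import numpy as np

def entropy_profile(raw_bytes, window=2048, stride=1024):
    data = np.frombuffer(raw_bytes, dtype=np.uint8)
    feats = []
    for start in range(0, max(len(data) - window, 0) + 1, stride):
        counts = np.bincount(data[start:start + window], minlength=256)
        p = counts / counts.sum()
        p = p[p > 0]
        feats.append(-(p * np.log2(p)).sum())      # window entropy in bits per byte
    return np.array(feats)

profile = entropy_profile(np.random.bytes(100_000))   # stand-in for a malware binary
print(profile.shape, profile.mean())
```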
ARTICLE | doi:10.20944/preprints202302.0086.v2
Subject: Engineering, Civil Engineering Keywords: Deep neural network; long short-term memory; water quality; discharge; stream-water
Online: 17 April 2023 (07:21:31 CEST)
Multivariate predictive analysis of Stream-Water (SW) parameters (discharge, water level, temperature, dissolved oxygen, pH, turbidity, and specific conductance) is a pivotal task in the field of water resource management during the era of rapid climate change. The highly dynamic and evolving nature of meteorological and climatic features has a significant impact on the temporal distribution of the SW variables in recent days, making SW variable forecasting even more complicated for diversified water-related issues. To predict the SW variables, various physics-based numerical models are used with numerous hydrologic parameters. Extensive lab-based investigation and calibration are required to reduce the uncertainty involved in those parameters. However, in the age of data-informed analysis and prediction, several deep learning algorithms have shown satisfactory performance in dealing with sequential data. In this research, a comprehensive Exploratory Data Analysis (EDA) and feature engineering were performed to prepare the dataset and obtain the best performance of the predictive model. A Long Short-Term Memory (LSTM) neural network regression model is trained on several years of daily data to predict the SW variables up to one week ahead of time (lead time) with satisfactory performance. The performance of the proposed model is found to be highly adequate through comparison of the predicted data with the observed data, visualization of the distribution of the errors, and a set of error metrics. Higher performance is achieved through an increase in the number of epochs and hyperparameter tuning. This model can be transferred to other locations with proper feature engineering and optimization to perform univariate predictive analysis and can potentially be used for real-time SW variable prediction.
ARTICLE | doi:10.20944/preprints202209.0190.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: green coffee bean; lightweight framework; deep convolutional neural network; explainable model; random optimization
Online: 14 September 2022 (04:04:05 CEST)
In recent years, the demand for coffee has increased tremendously. During production, green coffee beans are traditionally screened manually for defective beans before they are packed; however, this method is not only time-consuming but also increases the rate of human error due to fatigue. Therefore, this paper proposes a lightweight deep convolutional neural network (LDCNN) for a green coffee bean quality detection system, combining depthwise separable convolution (DSC), squeeze-and-excite blocks (SE blocks), skip blocks, and other components. To counteract the limitations imposed by the lightweight model's low parameter count during training, rectified Adam (RA), lookahead (LA), and gradient centralization (GC) were included to improve efficiency; the model was also deployed in an embedded system. Finally, the local interpretable model-agnostic explanations (LIME) model was employed to explain the model's predictions. The experimental results indicated that the accuracy rate of the model could reach up to 98.38% and the F1 score could be as high as 98.24% when detecting the quality of green coffee beans. Hence, the model achieves higher accuracy with lower computing time and fewer parameters. Moreover, the interpretable model verified that the lightweight model in this work is reliable, providing a basis for screening personnel to understand its judgments through interpretability, thereby improving the classification and prediction of the model.
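The two building blocks named above, depthwise separable convolution (DSC) and the squeeze-and-excite (SE) block, can be sketched in their commonly used form; channel counts and the reduction ratio are illustrative, not the LDCNN configuration.

```python
# Sketch: depthwise separable convolution and a squeeze-and-excite block.
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, c_in, c_out):
        super().__init__()
        self.depthwise = nn.Conv2d(c_in, c_in, 3, padding=1, groups=c_in)   # per-channel 3x3
        self.pointwise = nn.Conv2d(c_in, c_out, 1)                           # 1x1 channel mixing

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

class SEBlock(nn.Module):
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(channels, channels // reduction), nn.ReLU(),
                                nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):
        w = self.fc(x.mean(dim=(2, 3)))            # squeeze: global average pooling
        return x * w[:, :, None, None]             # excite: per-channel reweighting

x = torch.randn(2, 16, 32, 32)
out = SEBlock(32)(DepthwiseSeparableConv(16, 32)(x))   # (2, 32, 32, 32)
```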