ARTICLE | doi:10.20944/preprints202102.0318.v3
Subject: Medicine And Pharmacology, Immunology And Allergy Keywords: Machine Learning; Artificial Intelligence; Androgen Receptor; Random Forest; Deep Neural Network; Convolutional
Online: 24 February 2021 (13:14:01 CET)
Substances that can modify the androgen receptor pathway in humans and animals are entering the environment and food chain, with a proven ability to disrupt hormonal systems, leading to toxicity and adverse effects on reproduction, brain development, and prostate cancer, among others. State-of-the-art databases with experimental data on the effects of chemicals on humans, chimps, and rats have been used to build machine learning classifiers and regressors and evaluate them on independent sets. Different featurizations, algorithms, and protein structures lead to different results, with deep neural networks (DNNs) on user-defined, physicochemically relevant features developed for this work outperforming graph convolutional networks, random forests, and large featurizations. The results show that these user-provided structure-, ligand-, and statistically-based features and specific DNNs provided the best results as determined by AUC (0.87), MCC (0.47), and other metrics, as well as by the interpretability and chemical meaning of the descriptors/features. In addition, the same features in the DNN method performed better than in a multivariate logistic model: validation MCC = 0.468 and training MCC = 0.868 for the present work, compared to evaluation set MCC = 0.2036 and training set MCC = 0.5364 for multivariate logistic regression on the full, unbalanced set. Techniques of this type may improve AR and toxicity description and prediction, improving the assessment and design of compounds. Source code and data are available at https://github.com/AlfonsoTGarcia-Sosa/ML
Subject: Biology And Life Sciences, Animal Science, Veterinary Science And Zoology Keywords: convolutional neural networks; horse emotion recognition; horse emotion
Online: 7 June 2021 (12:42:05 CEST)
Creating intelligent systems capable of recognizing emotions is a difficult task, especially when looking at emotions in animals. This paper describes the process of designing a proof-of-concept system to recognize emotions in horses. The system consists of two elements, a detector and a model. The detector is a fast region-based convolutional neural network that detects horses in an image. The model is a convolutional neural network that predicts the emotions of those horses. These two elements were trained on multiple images of horses until they achieved high accuracy in their tasks. A total of 400 images of horses were collected and labeled to train both the detector and the model, while 80 were used to validate the system. Once the two components were validated, they were combined into a testable system that detects equine emotions based on established behavioral ethograms indicating emotional affect through head, neck, ear, muzzle, and eye position. The system showed an accuracy of between 69% and 74% on the validation set, demonstrating that it is possible to predict emotions in animals using autonomous intelligent systems. Such a system has multiple applications, including further studies in the growing field of animal emotions as well as in the veterinary field, to determine the physical welfare of horses or other livestock.
BRIEF REPORT | doi:10.20944/preprints201902.0257.v2
Subject: Engineering, Control And Systems Engineering Keywords: convolutional neural networks; pattern recognition; machine learning
Online: 12 March 2019 (10:18:12 CET)
This paper presents a study and implementation of a convolutional neural network to identify and recognize humpback whale specimens from the unique patterns of their tails. Starting from a dataset composed of images of whale tails, all phases of the process of creating and training a neural network are detailed, from the analysis and pre-processing of images to the elaboration of predictions, using the TensorFlow and Keras frameworks. Possible alternatives for tackling this problem are also discussed, along with the complications that arose during the development of this work.
ARTICLE | doi:10.20944/preprints202309.0058.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: convolutional neural networks; ensembles; fusion
Online: 4 September 2023 (03:51:24 CEST)
In computer vision and image analysis, Convolutional Neural Networks (CNNs) and other deep learning models are at the forefront of research and development. These advanced models have proven to be highly effective in tasks related to computer vision. One technique that has gained prominence in recent years is the construction of ensembles using deep CNNs. These ensembles typically involve combining multiple pre-trained CNNs to create a more powerful and robust network. The purpose of this study is to evaluate the effectiveness of building CNN ensembles by combining several advanced techniques. Tested here are CNN ensembles constructed by replacing ReLU layers with different activation functions, employing various data augmentation techniques, and utilizing several algorithms, including some novel ones, that perturb network weights. Experiments across many data sets representing different tasks demonstrate that our proposed methods for building deep ensembles produce superior results. All the resources required to replicate our experiments are available at https://github.com/LorisNanni.
ARTICLE | doi:10.20944/preprints202104.0501.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: convolutional neural networks; dilated neural networks; optimality
Online: 19 April 2021 (15:00:30 CEST)
One of the most effective image processing techniques is the use of convolutional neural networks, in which intensity values at grid points in the vicinity of each point are combined. To speed up computations, researchers have developed a dilated version of this technique, in which only some points are processed. It turns out that the most efficient case is when we select points from a sub-grid. In this paper, we explain this empirical efficiency by proving that the sub-grid is indeed optimal, in some reasonable sense. To be more precise, we prove that for all reasonable optimality criteria, the optimal subset of the original grid is either a sub-grid or a sub-grid-like set.
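The sub-grid idea can be illustrated with a minimal NumPy sketch (not from the paper): a kernel applied with dilation d only ever reads image values whose coordinates lie on a sub-grid of spacing d around the center point.

```python
import numpy as np

def dilated_conv2d_point(image, kernel, center, dilation):
    """Apply a (2k+1)x(2k+1) kernel at one grid point, sampling
    neighbours on a sub-grid with the given dilation (spacing)."""
    k = kernel.shape[0] // 2
    r, c = center
    total = 0.0
    for i in range(-k, k + 1):
        for j in range(-k, k + 1):
            total += kernel[i + k, j + k] * image[r + i * dilation, c + j * dilation]
    return total

image = np.arange(49, dtype=float).reshape(7, 7)   # linear ramp image
kernel = np.ones((3, 3)) / 9.0                      # simple averaging kernel
dense = dilated_conv2d_point(image, kernel, (3, 3), dilation=1)
dilated = dilated_conv2d_point(image, kernel, (3, 3), dilation=2)
# With dilation=2 only every second row/column is touched: a sub-grid.
print(dense, dilated)
```

On a linear ramp, symmetric averaging returns the center value for either spacing; the dilated version simply covers a wider neighbourhood with the same number of reads.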
ARTICLE | doi:10.20944/preprints202308.0047.v1
Subject: Physical Sciences, Astronomy And Astrophysics Keywords: image classification; astronomy; asteroids; convolutional neural network; deep learning
Online: 1 August 2023 (11:08:14 CEST)
Near-Earth Asteroids represent potential threats to human life because their trajectories may bring them into the proximity of the Earth. Monitoring these objects could help predict future impact events, but such efforts are hindered by the large number of objects that pass through the Earth’s vicinity. Additionally, there is the problem of distinguishing asteroids from other objects in the night sky, which implies sifting through large sets of telescope image data. Within this context, we believe that employing machine learning techniques could greatly improve the detection process by sorting out the most likely asteroid candidates to be reviewed by human experts. At the moment, the use of machine learning techniques is still limited in the field of astronomy, and the main goal of the present paper is to study the effectiveness of deep CNNs for the classification of astronomical objects, asteroids in this particular case, by comparing some well-known deep convolutional neural networks, including InceptionV3, Xception, InceptionResNetV2, and ResNet152V2. We have applied transfer learning and fine-tuning to these pre-existing deep convolutional networks, and from the results obtained one can see the potential of deep convolutional neural networks in the process of asteroid classification. The InceptionV3 model has the best results in the asteroid class, meaning that by using it, we lose the fewest valid asteroids.
ARTICLE | doi:10.20944/preprints202101.0579.v2
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: Network Interpretation; Image Classification; Convolutional Neural Network; Integrated Gradient
Online: 22 November 2021 (14:06:52 CET)
A convolutional neural network (CNN) is sometimes understood as a black box in the sense that while it can approximate any function, studying its structure will not give us any insights into the nature of the function being approximated. In other terms, the discriminative ability does not reveal much about the latent representation of a network. This research aims to establish a framework for interpreting CNNs by profiling them in terms of interpretable visual concepts and verifying them by means of Integrated Gradients. We also ask the question, "Do different input classes have a relationship, or are they unrelated?" For instance, could there be an overlapping set of highly active neurons that identify different classes? Could there be a set of neurons that are useful for one input class but misleading for a different one? Intuition answers these questions positively, implying the existence of a structured set of neurons inclined to a particular class. Knowing this structure has significant value; it provides a principled way of identifying redundancies across classes. Here, the interpretability profiling has been done by evaluating the correspondence between individual hidden neurons and a set of human-understandable visual semantic concepts. We also propose an integrated-gradient-based class-specific relevance mapping approach that takes into account the spatial position of the region of interest in the input image. Our relevance score verifies the interpretability scores in terms of neurons tuned to a particular concept/class. Further, we perform network ablation and measure the performance of the network based on our approach.
ARTICLE | doi:10.20944/preprints201901.0319.v1
Subject: Chemistry And Materials Science, Nanotechnology Keywords: cascaded neural networks; memristor crossbar; convolutional neural networks
Online: 31 January 2019 (06:54:33 CET)
Multiply-accumulate calculation using memristor crossbar arrays is an important method for realizing neuromorphic computing. However, memristor array fabrication technology is still immature, and it is difficult to fabricate large-scale arrays with high yield, which restricts the development of memristor-based neuromorphic computing technology. Cascading small-scale arrays to achieve the neuromorphic computational ability of large-scale arrays is therefore of great significance for promoting the application of memristor-based neuromorphic computing. To address this issue, we present a memristor-based cascaded framework with basic computation units; several neural network processing units can be cascaded in this way to improve the processing capability on a dataset. In addition, we introduce a split method to reduce the load on the input terminals. Compared with VGGNet and GoogLeNet, the proposed cascaded framework achieves 93.54% Fashion-MNIST accuracy with 4.15M parameters. Extensive experiments with Ti/AlOx/TaOx/Pt devices we fabricated show that the circuit simulation results still provide high recognition accuracy, and the recognition accuracy loss after circuit simulation can be kept at around 0.26%.
ARTICLE | doi:10.20944/preprints202002.0231.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: Convolutional Neural Networks; ensemble of classifiers; activation functions; image classification; skin detection
Online: 17 February 2020 (01:50:08 CET)
In recent years, the field of deep learning has achieved considerable success in pattern recognition, image segmentation, and many other classification fields. There are many studies and practical applications of deep learning for image, video, or text classification. In this study, we suggest a method for changing the architecture of the best-performing CNN models with the aim of designing new models to be used as stand-alone networks or as components of an ensemble. We propose to replace each activation layer of a CNN (usually a ReLU layer) with a different activation function stochastically drawn from a set of activation functions: in this way, the resulting CNN has a different set of activation layers.
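As a sketch of the stochastic replacement idea (the pool of functions below is illustrative, not the paper's exact set), each activation slot in a network can be assigned a function drawn at random, so that repeated draws yield differently activated network variants for an ensemble:

```python
import random
import numpy as np

# A small pool of candidate activation functions (assumed set).
ACTIVATIONS = {
    "relu": lambda x: np.maximum(x, 0.0),
    "leaky_relu": lambda x: np.where(x > 0, x, 0.01 * x),
    "elu": lambda x: np.where(x > 0, x, np.expm1(x)),
    "tanh": np.tanh,
}

def randomize_activations(layer_names, seed=None):
    """Assign every activation slot a function name drawn at random
    from the pool, yielding one stochastic variant of the network."""
    rng = random.Random(seed)
    return {name: rng.choice(sorted(ACTIVATIONS)) for name in layer_names}

# Each draw produces a differently activated network; an ensemble is
# built from several such variants.
variant_a = randomize_activations(["act1", "act2", "act3"], seed=0)
variant_b = randomize_activations(["act1", "act2", "act3"], seed=1)
print(variant_a, variant_b)
```

In a real framework the chosen function names would be used to rebuild the corresponding activation layers before training each ensemble member.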
ARTICLE | doi:10.20944/preprints202303.0221.v1
Subject: Computer Science And Mathematics, Computer Networks And Communications Keywords: polyp segmentation; computer vision; ensemble; transformers; convolutional neural networks
Online: 13 March 2023 (07:31:25 CET)
In the realm of computer vision, semantic segmentation is the task of recognizing objects in images at the pixel level. This is done by performing a classification of each pixel. The task is complex and requires sophisticated skills and knowledge about the context to identify objects’ boundaries. The importance of semantic segmentation in many domains is undisputed. In medical diagnostics, it simplifies the early detection of pathologies, thus mitigating the possible consequences. In this work, we provide a review of the literature on deep ensemble learning models for polyp segmentation and we develop new ensembles based on convolutional neural networks and transformers. The development of an effective ensemble entails ensuring diversity between its components. To this end, we combine different models (HarDNet-MSEG, Polyp-PVT, and HSNet) trained with different data augmentation techniques, optimization methods, and learning rates, which we experimentally demonstrate to be useful to form a better ensemble. Most importantly, we introduce a new method to obtain the segmentation mask which is more suitable for combining transformers in an ensemble. In our extensive experimental evaluation, the proposed ensembles exhibit state-of-the-art performance.
ARTICLE | doi:10.20944/preprints201910.0137.v1
Subject: Computer Science And Mathematics, Mathematics Keywords: topology optimization; convolutional neural network; high-resolution
Online: 12 October 2019 (03:56:19 CEST)
Topology optimization is a pioneering design method that can provide various candidate structures with high mechanical properties. However, high resolution of the optimized structures is highly desirable yet normally leads to a computationally intractable problem, especially for the well-known Solid Isotropic Material with Penalization (SIMP) method. In this paper, an efficient, high-resolution topology optimization method is proposed based on the Super-Resolution Convolutional Neural Network (SRCNN) technique in the framework of SIMP. The SRCNN involves four processes: refining, patch extraction and representation, non-linear mapping, and reconstruction. High computational efficiency is achieved by a pooling strategy that balances the number of finite element analyses (FEA) against the output mesh resolution in the optimization process. To further reduce the high computational cost of 3D topology optimization problems, a combined treatment using a 2D SRCNN is built as another speed-up strategy. A number of typical examples show that the high-resolution topology optimization method adopting the SRCNN has excellent applicability and high efficiency for 2D and 3D problems with arbitrary boundary conditions, any design domain shape, and varied loads.
ARTICLE | doi:10.20944/preprints202302.0396.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Convolutional Neural Network; Ensemble Learning; Transfer Learning; Fine-tuning; Plankton Classification; foraminifera
Online: 23 February 2023 (03:37:23 CET)
This paper presents a study of an automated system for identifying planktic foraminifera at the species level. The system uses a combination of deep learning methods, specifically Convolutional Neural Networks (CNNs), to analyze digital images of foraminifera taken at different illumination angles. The dataset is composed of 1437 groups of sixteen grayscale images, one group for each foraminifer, that are then converted to RGB images with various processing methods. These RGB images are fed into a set of CNNs, organized in an Ensemble Learning (EL) environment. The ensemble is built by training different networks using different approaches for creating the RGB images. The study finds that an ensemble of CNN models trained on different RGB images improves the system's performance compared to other state-of-the-art approaches. The proposed system was also found to outperform human experts in classification accuracy.
ARTICLE | doi:10.20944/preprints202308.1719.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: left ventricle segmentation; fully convolutional networks; U-NET; inception modules; medical image segmentation
Online: 24 August 2023 (07:20:53 CEST)
The automatic diagnosis of cardiovascular diseases has received much attention in the deep learning field. In this context, the segmentation of the left ventricle endocardium constitutes a major task in diagnosing heart conditions such as heart failure and hypertrophic cardiomyopathy. The objective of this paper is to propose a "deep convolutional network" for segmenting the internal cavity of the left ventricle (endocardium) in MRI images. In particular, we design an improved UNET model that incorporates additional inception modules for efficiently segmenting the internal cavity of the left ventricle. Our approach has been validated on the Sunnybrook Cardiac Data (SCD) dataset and has shown promising results in terms of precision. More specifically, the improved UNET largely outperforms the baseline UNET model and many existing state-of-the-art methods.
ARTICLE | doi:10.20944/preprints202008.0113.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Scene classification; Deep Learning; Convolutional Neural Networks; Feature learning
Online: 5 August 2020 (06:19:27 CEST)
State-of-the-art remote sensing scene classification methods employ different Convolutional Neural Network architectures to achieve very high classification performance. A trait shared by the majority of these methods is that the class associated with each example is ascertained by examining the activations of the last fully connected layer, and the networks are trained to minimize the cross-entropy between predictions extracted from this layer and ground-truth annotations. In this work, we extend this paradigm by introducing an additional output branch which maps the inputs to low-dimensional representations, effectively extracting additional feature representations of the inputs. The proposed model imposes additional distance constraints on these representations with respect to identified class representatives, in addition to the traditional categorical cross-entropy between predictions and ground-truth. By extending the typical cross-entropy loss function with a distance learning function, our proposed approach achieves significant gains in classification across a wide set of benchmark datasets, while providing additional evidence related to class membership and classification confidence.
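The extended loss can be sketched in NumPy as the usual categorical cross-entropy plus a weighted squared distance between the auxiliary low-dimensional representation and its class representative (the variable names and the weighting factor `alpha` are illustrative assumptions, not the paper's exact formulation):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def combined_loss(logits, embedding, label, class_reps, alpha=0.5):
    """Categorical cross-entropy on the logits plus a distance penalty
    pulling the low-dimensional embedding toward its class
    representative (alpha weights the distance term)."""
    probs = softmax(logits)
    ce = -np.log(probs[label])
    dist = np.sum((embedding - class_reps[label]) ** 2)
    return ce + alpha * dist

class_reps = {0: np.array([0.0, 0.0]), 1: np.array([1.0, 1.0])}
loss_near = combined_loss(np.array([2.0, 0.1]), np.array([0.1, 0.0]), 0, class_reps)
loss_far  = combined_loss(np.array([2.0, 0.1]), np.array([2.0, 2.0]), 0, class_reps)
# Same prediction, but an embedding far from its class representative
# is penalized more.
print(loss_near, loss_far)
```

At inference time, the distance of an input's embedding to the class representatives is what provides the extra evidence of class membership the abstract mentions.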
ARTICLE | doi:10.20944/preprints202305.1490.v1
Subject: Engineering, Civil Engineering Keywords: Surrogate Model; Convolutional Neural Network; Physics-Informed Neural Networks; Elliptic PDE; FEM
Online: 22 May 2023 (09:48:22 CEST)
This study explores what role artificial intelligence techniques could play in future numerical analysis. In this paper, a convolutional neural network technique based on a modified loss function is proposed as a surrogate for the finite element method (FEM). Several surrogate-based physics-informed neural networks (PINNs) are developed to solve representative boundary value problems based on elliptic partial differential equations (PDEs). Results from the proposed surrogate-based approach are in good agreement with those from conventional FEM. It is found that modification of the loss function can improve the prediction accuracy of the neural network. The results indicate that, to some extent, such artificial intelligence techniques could serve as a surrogate for conventional numerical analysis.
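A minimal sketch of a physics-informed loss, assuming a 1D elliptic model problem -u'' = f with Dirichlet boundary conditions (the finite-difference residual and names here are illustrative, not the paper's formulation):

```python
import numpy as np

def pde_residual_loss(u, x, f, bc=(0.0, 0.0)):
    """Loss for -u''(x) = f(x): mean squared PDE residual at interior
    points (central finite differences) plus a penalty enforcing the
    boundary conditions -- the physics-informed modification of the
    loss function."""
    h = x[1] - x[0]
    u_xx = (u[2:] - 2 * u[1:-1] + u[:-2]) / h**2
    residual = -u_xx - f(x[1:-1])
    bc_term = (u[0] - bc[0]) ** 2 + (u[-1] - bc[1]) ** 2
    return np.mean(residual**2) + bc_term

x = np.linspace(0.0, 1.0, 101)
f = lambda t: 2.0 * np.ones_like(t)
u_exact = x * (1.0 - x)        # solves -u'' = 2 with u(0) = u(1) = 0
u_wrong = np.sin(np.pi * x)    # satisfies the BCs but not the PDE
print(pde_residual_loss(u_exact, x, f), pde_residual_loss(u_wrong, x, f))
```

Minimizing such a loss over the parameters of a neural network surrogate drives its output toward the PDE solution without a labeled training set.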
ARTICLE | doi:10.20944/preprints202304.1061.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: driving a car; driving behavior; electrooculography; convolutional neural networks
Online: 27 April 2023 (08:14:38 CEST)
To drive safely, the driver must be aware of the surroundings, pay attention to road traffic, and be ready to adapt to new circumstances. Most studies on driving safety focus on detecting anomalies in driver behavior and monitoring the cognitive capabilities of drivers. In our study, we propose a classifier for basic activities in driving a car, based on an approach that could also be applied to the recognition of basic activities of daily life, that is, using electrooculographic (EOG) signals and a one-dimensional convolutional neural network (1D CNN). Our classifier achieved an accuracy of 80% for the 16 primary and secondary activities. The accuracy for primary driving activities, including crossroads, parking, and roundabouts, was 97.9%, 96.8%, 97.4%, and 99.5%, respectively. The F1 score for secondary driving activities (0.99) was higher than for primary driving activities (0.93–0.94). Furthermore, using the same algorithm, it was possible to distinguish four secondary activities related to activities of daily life from secondary activities when driving a car.
TECHNICAL NOTE | doi:10.20944/preprints201811.0529.v1
Subject: Environmental And Earth Sciences, Remote Sensing Keywords: Calving Front; Image Segmentation; U-Net; Convolutional Neural Network; Machine Learning; Greenland
Online: 21 November 2018 (14:05:00 CET)
The continuous and precise mapping of glacier calving fronts is essential for monitoring and understanding rapid glacier changes in Antarctica and Greenland, which have the potential to contribute significantly to sea level rise within the current century. This effort has been mostly restricted to the slow and painstaking manual digitization of calving front positions in thousands of satellite imagery products. Here, we have developed a machine learning toolkit to robustly and automatically detect glacier calving front margins in satellite imagery. The toolkit is based on semantic image segmentation using Convolutional Neural Networks (CNNs) with a modified U-Net architecture to isolate the calving fronts from satellite images after being trained on a dataset of images and their corresponding manually determined calving fronts. As a case study, we train our neural network on a varied set of Landsat images with lowered resolutions from Jakobshavn, Sverdrup, and Kangerlussuaq glaciers, Greenland, and test it on novel images from Helheim glacier, Greenland, to evaluate the performance of the approach. The neural network is able to identify the calving front in new images with a mean deviation of 96.3 m from the true fronts, equivalent to 1.97 pixels on average, while the corresponding error for manually determined fronts on images of the same resolution is 92.5 m. We find that the trained neural network significantly outperforms common edge detection techniques and can be used to continuously map out calving-ice fronts with a variety of data products.
ARTICLE | doi:10.20944/preprints202309.1202.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: speech emotion recognition; deep learning; Deep Belief Network; deep neural network; Convolutional Neural Network; LSTM; attention mechanism
Online: 19 September 2023 (08:24:22 CEST)
Speech Emotion Recognition (SER) is an interesting and difficult problem to handle. In this paper, we address it through the implementation of deep learning networks. We have designed and implemented six different deep learning networks: a Deep Belief Network (DBN), a simple deep neural network (SDNN), an LSTM network (LSTM), an LSTM network with the addition of an attention mechanism (LSTM-ATN), a convolutional neural network (CNN), and a convolutional neural network with the addition of an attention mechanism (CNN-ATN), with the aim, beyond solving the SER problem, of testing the impact of the attention mechanism on the results. Dropout and batch normalization techniques are also used to improve the generalization ability of the models (preventing overfitting) as well as to speed up the training process. The Surrey Audio-Visual Expressed Emotion (SAVEE) database and the Ryerson Audio-Visual Database (RAVDESS) were used for training and evaluation of our models. The results showed that the networks with the attention mechanism did better than the others. Furthermore, CNN-ATN was the best among the tested networks, achieving an accuracy of 74% for the SAVEE and 77% for the RAVDESS dataset, exceeding existing state-of-the-art systems for the same datasets.
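The attention mechanism layered on top of the recurrent or convolutional features can be sketched as a learned weighted pooling over timesteps; this is a generic sketch of the mechanism in NumPy, not the paper's exact implementation:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def attention_pool(hidden_states, w):
    """Attention over a sequence of hidden states (T x D): score each
    timestep with a learned vector w, softmax the scores, and return
    the weighted sum as a fixed-size utterance representation."""
    scores = hidden_states @ w          # (T,) one score per timestep
    weights = softmax(scores)           # attention weights sum to 1
    context = weights @ hidden_states   # (D,) weighted pooling
    return context, weights

T, D = 5, 4
rng = np.random.default_rng(0)
H = rng.normal(size=(T, D))             # stand-in for LSTM/CNN outputs
w = rng.normal(size=D)                  # stand-in for learned weights
context, weights = attention_pool(H, w)
print(weights.sum(), context.shape)
```

The pooled `context` vector, rather than only the final hidden state, is then fed to the classification layers, which is what lets the network emphasize emotionally salient frames.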
ARTICLE | doi:10.20944/preprints201908.0068.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: deep learning; convolutional neural networks (CNN); transfer learning; class activation mapping (CAM); building defects; structural-health monitoring
Online: 6 August 2019 (04:18:29 CEST)
Clients are increasingly looking for fast and effective means to quickly and frequently survey and communicate the condition of their buildings so that essential repairs and maintenance can be done proactively and in a timely manner, before they become too dangerous and expensive. Traditional methods for this type of work commonly involve engaging building surveyors to undertake a condition assessment, a lengthy site inspection that produces a systematic record of the physical condition of the building elements, including cost estimates of immediate and projected long-term costs of renewal, repair, and maintenance. Current asset condition assessment procedures are extensively time-consuming, laborious, and expensive, and pose health and safety threats to surveyors, particularly at height and at roof level, which are difficult to access. We propose a method for automated detection and localisation of key building defects from images using deep learning and convolutional neural networks. The proposed model is based on a pre-trained VGG-16 classifier with Class Activation Mapping (CAM) for object localisation. The model has proven to be robust and able to accurately detect and localise mould growth, stains, and paint deterioration defects arising from dampness in buildings. The approach is being developed with the potential to scale up to support automated detection of defects and deterioration of buildings in real time using mobile devices and drones.
ARTICLE | doi:10.20944/preprints202011.0527.v1
Subject: Engineering, Aerospace Engineering Keywords: Aircraft Maintenance Inspection; Anomaly Detection; Defect Inspection; Convolutional Neural Networks; Mask R-CNN; Generative Adversarial Networks; Image Augmentation
Online: 20 November 2020 (09:16:13 CET)
Convolutional Neural Networks combined with autonomous drones are increasingly seen as enablers for partially automating the aircraft maintenance visual inspection process. Such an innovative concept can have a significant impact on aircraft operations. By supporting aircraft maintenance engineers in detecting and classifying a wide range of defects, the time spent on inspection can be significantly reduced. Examples of defects that can be automatically detected include aircraft dents, paint defects, cracks and holes, and lightning strike damage. Additionally, this concept could also increase the accuracy of damage detection and reduce the number of aircraft inspection incidents related to human factors like fatigue and time pressure. In our previous work, we applied a recent Convolutional Neural Network architecture known as Mask R-CNN to detect aircraft dents. Mask R-CNN was chosen because it enables the detection of multiple objects in an image while simultaneously generating a segmentation mask for each instance. The previously obtained F1 and F2 scores were 62.67% and 59.35%, respectively. This paper extends the previous work by applying different techniques to improve and experimentally evaluate prediction performance. The approaches used include (1) balancing the original dataset by adding images without dents; (2) increasing data homogeneity by focusing on wing images only; (3) exploring the potential of three augmentation techniques, namely flipping, rotating, and blurring, in improving model performance; and (4) using a pre-classifier in combination with Mask R-CNN. The results show that a hybrid approach combining Mask R-CNN and augmentation techniques leads to improved performance, with an F1 score of 67.50% and an F2 score of 66.37%.
TECHNICAL NOTE | doi:10.20944/preprints202009.0678.v1
Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: multi-frame super resolution; wide activation super resolution; 3D convolutional neural network; deep learning
Online: 27 September 2020 (11:54:56 CEST)
The small satellite market continues to grow year after year; a compound annual growth rate of 17% is estimated for the period between 2020 and 2025. Low-cost satellites can send vast numbers of images to be post-processed on the ground to improve quality and extract detailed information. In this domain lies the resolution enhancement task, where a low-resolution image is automatically converted to a higher resolution. Deep learning approaches to Super-Resolution (SR) have reached the state of the art on multiple benchmarks; however, most of them were studied in a single-frame fashion. With satellite imagery, multi-frame images can be obtained under different conditions, making it possible to add more information per image and improve the final analysis. In this context, we developed a model that recently topped the European Space Agency’s Multi-frame Super Resolution (MFSR) competition and applied it to the PROBA-V dataset of multi-frame satellite images. The model is based on proven methods that worked on 2D images, tweaked to work in 3D: the Wide Activation Super Resolution (WDSR) family. We show that a simple 3D CNN residual architecture with WDSR blocks and a frame permutation technique as data augmentation can achieve better scores than more complex models. Moreover, the model requires few hardware resources, both for training and evaluation, so it can be run directly on a personal laptop.
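The frame permutation augmentation can be sketched in NumPy (an illustrative reading of the technique, assuming the per-scene low-resolution frames are stacked along the first axis):

```python
import numpy as np

def permute_frames(lr_stack, rng):
    """Frame-permutation augmentation for multi-frame super resolution:
    shuffle the order of the low-resolution frames. The scene content
    is unchanged, so the high-resolution target stays the same while
    the network sees a new input ordering."""
    idx = rng.permutation(lr_stack.shape[0])
    return lr_stack[idx]

rng = np.random.default_rng(42)
stack = np.arange(2 * 3 * 3).reshape(2, 3, 3)   # 2 frames of 3x3 pixels
augmented = permute_frames(stack, rng)
# Same multiset of frames, possibly in a different order.
print(augmented.shape)
```

Because the target is order-invariant, each permutation is effectively a free extra training sample.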
ARTICLE | doi:10.20944/preprints202009.0524.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: COVID-19; chest X-ray images; deep convolutional neural network; COV-MCNet; deep learning
Online: 23 September 2020 (03:31:30 CEST)
The COVID-19 pandemic has made the quick identification and screening of COVID-19 patients even more difficult for medical specialists. Therefore, a significant study is necessary for detecting COVID-19 cases using an automated diagnosis method, which can aid in controlling the spread of the virus. In this paper, we propose a Deep Convolutional Neural Network-based multi-classification approach (COV-MCNet) using eight different pre-trained architectures (VGG16, VGG19, ResNet50V2, DenseNet201, InceptionV3, MobileNet, InceptionResNetV2, and Xception), trained and tested on X-ray images of COVID-19, Normal, Viral Pneumonia, and Bacterial Pneumonia. The results for the 3-class setting (Normal vs. COVID-19 vs. Viral Pneumonia) showed that the ResNet50V2 model provides the highest classification performance (accuracy: 95.83%, precision: 96.12%, recall: 96.11%, F1-score: 96.11%, specificity: 97.84%) compared to the rest of the models. The results for the 4-class setting (Normal vs. COVID-19 vs. Viral Pneumonia vs. Bacterial Pneumonia) showed that the pre-trained DenseNet201 model provides the highest classification performance (accuracy: 92.54%, precision: 93.05%, recall: 92.81%, F1-score: 92.83%, specificity: 97.47%). Notably, the ResNet50V2 (3-class) and DenseNet201 (4-class) models in the proposed COV-MCNet framework showed higher accuracy than the other six models. This indicates that the designed system can produce promising results for detecting COVID-19 cases as more data become available. The proposed multi-classification network (COV-MCNet) significantly speeds up the existing radiology-based method, which will be helpful to the medical community and clinical specialists for early diagnosis of COVID-19 cases during this pandemic.
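The per-class metrics reported above (precision, recall, F1, specificity) can all be derived from a multi-class confusion matrix; the sketch below uses made-up counts, not the paper's data:

```python
import numpy as np

def per_class_metrics(cm, cls):
    """Precision, recall, F1, and specificity for one class from a
    multi-class confusion matrix (rows = true, cols = predicted)."""
    tp = cm[cls, cls]
    fp = cm[:, cls].sum() - tp          # other classes predicted as cls
    fn = cm[cls, :].sum() - tp          # cls predicted as other classes
    tn = cm.sum() - tp - fp - fn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    specificity = tn / (tn + fp)
    return precision, recall, f1, specificity

# Illustrative 3-class confusion matrix (hypothetical counts).
cm = np.array([[95, 3, 2],
               [4, 90, 6],
               [1, 5, 94]])
p, r, f1, spec = per_class_metrics(cm, 0)
print(round(p, 3), round(r, 3), round(f1, 3), round(spec, 3))
```

Averaging these per-class values over all classes gives the macro-averaged figures typically quoted in multi-class medical imaging studies.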
ARTICLE | doi:10.20944/preprints202005.0430.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: Activity Context Sensing; Smartphones; Deep Convolutional Neural Networks; Smart devices
Online: 26 May 2020 (11:33:55 CEST)
With the widespread adoption of embedded sensing capabilities in mobile devices, there has been unprecedented development of context-aware solutions. This has allowed the proliferation of various intelligent applications, such as those for remote health and lifestyle monitoring and intelligent personalized services. However, activity context recognition based on multivariate time series signals obtained from mobile devices in unconstrained conditions is naturally prone to class imbalance problems: recognition models tend to predict the classes with the most samples while ignoring the classes with the fewest, resulting in poor generalization. To address this problem, we propose to augment the time series signals from inertial sensors with signals from ambient sensing to train deep convolutional neural network (DCNN) models, since DCNNs capture the local dependency and scale invariance of these combined sensor signals. We therefore developed one DCNN model using only inertial sensor signals and another that combines signals from both inertial and ambient sensors, in order to investigate whether the combination mitigates the class imbalance problem by improving the performance of the recognition model. Evaluation and analysis of the proposed system on data with imbalanced classes show that the system achieved better recognition accuracy when data from inertial sensors were combined with ambient measurements such as environmental noise level and illumination, with an overall improvement of 5.3% in accuracy.
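The augmentation step described above amounts to concatenating the ambient measurements as extra input channels alongside the inertial ones. A minimal numpy sketch, with an assumed window size and channel layout (the paper's exact dimensions are not specified here):

```python
import numpy as np

rng = np.random.default_rng(0)
window = 128  # samples per activity window (assumed value)

# Inertial signals: e.g. 3-axis accelerometer + 3-axis gyroscope.
inertial = rng.normal(size=(window, 6))
# Ambient signals: e.g. environmental noise level and illumination.
ambient = rng.normal(size=(window, 2))

# Channel-wise concatenation gives the combined DCNN input window.
combined = np.concatenate([inertial, ambient], axis=1)
print(combined.shape)  # (128, 8)
```

A 1D convolutional layer then slides over the time axis of this 8-channel window.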
ARTICLE | doi:10.20944/preprints201809.0361.v3
Subject: Environmental And Earth Sciences, Atmospheric Science And Meteorology Keywords: deep learning; convolutional neural networks; polar mesocyclones; satellite data processing; pattern recognition
Online: 29 October 2018 (10:16:49 CET)
Polar mesocyclones (MCs) are small marine atmospheric vortices. The intense class of MCs, called polar lows, is accompanied by extremely strong surface winds and heat fluxes and thus largely influences deep ocean water formation in the polar regions. Accurate detection of polar mesocyclones in high-resolution satellite data is challenging and time-consuming when performed manually. Existing algorithms for the automatic detection of polar mesocyclones are based on conventional analysis of cloudiness patterns and involve various empirically defined thresholds of geophysical variables; as a result, different detection methods typically yield very different results when applied to a single dataset. We develop a conceptually novel approach for the detection of MCs based on deep convolutional neural networks (DCNNs). As a first step, we demonstrate that a DCNN model is capable of binary classification of 500x500 km patches of satellite images with respect to the presence of MC patterns. The training dataset is based on a reference database of MCs manually tracked in the Southern Hemisphere from satellite mosaics; we use a subset of this database with MC diameters in the range of 200-400 km. This dataset is used for testing several different DCNN setups: a DCNN built “from scratch”, a DCNN based on VGG16 pre-trained weights using the transfer learning technique, and a DCNN based on VGG16 with fine tuning. Each of these networks is applied to both infrared (IR) and combined infrared and water vapor (IR+WV) satellite imagery. The best skill (97% in terms of binary classification accuracy) is achieved by a model that averages the estimates of the ensemble of different DCNNs.
The algorithm can be further extended to an automatic identification and tracking scheme and applied to other atmospheric phenomena characterized by a distinct signature in satellite imagery.
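The best-performing configuration averages the estimates of the DCNN ensemble; a minimal numpy sketch with illustrative probabilities (not the paper's values):

```python
import numpy as np

# Probabilities of "MC present" from three hypothetical DCNN setups
# (from-scratch, VGG16 transfer learning, VGG16 fine-tuned) on 4 patches.
p_scratch  = np.array([0.90, 0.20, 0.55, 0.70])
p_transfer = np.array([0.85, 0.10, 0.40, 0.80])
p_finetune = np.array([0.95, 0.15, 0.60, 0.75])

# Ensemble averaging: the mean of the individual estimates,
# thresholded at 0.5 for the binary decision.
p_ensemble = np.mean([p_scratch, p_transfer, p_finetune], axis=0)
labels = (p_ensemble >= 0.5).astype(int)
print(labels)  # → [1 0 1 1]
```

Averaging smooths out disagreements between setups, which is why the ensemble outperforms any single DCNN here.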
ARTICLE | doi:10.20944/preprints201801.0019.v1
Subject: Computer Science And Mathematics, Analysis Keywords: high resolution remote sensing image; convolutional neural networks; full convolution networks; Bayesian convolutional neural networks; building extraction; conditional probability density function
Online: 3 January 2018 (04:46:44 CET)
When extracting buildings from high-resolution remote sensing images with meter/sub-meter accuracy, tree shade and road interference are the main factors reducing extraction accuracy. We propose a Bayesian Convolutional Neural Network (BCNET) model based on standard fully convolutional networks (FCN) to solve these problems. First, buildings with no shade (or with shade artificially removed) are taken as Sample A, woodland as Sample B, and roads as Sample C, and three sample libraries are set up. Each library is learned separately to obtain its own feature vector set; these sets are modeled with Gaussian mixtures to estimate the conditional probability density functions of the noise objects and roofs. The standard FCN is then improved in two respects: (1) atrous convolution is introduced, and (2) the conditional probability density function is used as the activation function of the last convolution layer. Experiments on unmanned aerial vehicle (UAV) imagery show that the BCNET model can effectively eliminate the influence of trees and roads, and building extraction accuracy can reach 97%.
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: microcombs; optical neural networks; neuromorphic computing, artificial intelligence; Kerr microcombs; convolutional neural network
Online: 16 November 2020 (13:30:14 CET)
Convolutional neural networks (CNNs), inspired by biological visual cortex systems, are a powerful category of artificial neural networks that can extract the hierarchical features of raw data to greatly reduce network parametric complexity and enhance predictive accuracy. They are of significant interest for machine learning tasks such as computer vision, speech recognition, playing board games, and medical diagnosis [1-7]. Optical neural networks offer the promise of dramatically accelerating computing speed to overcome the inherent bandwidth bottleneck of electronics. Here, we demonstrate a universal optical vector convolutional accelerator operating beyond 10 Tera-FLOPS (floating point operations per second), generating convolutions of images of 250,000 pixels with 8-bit resolution for 10 kernels simultaneously, enough for facial image recognition. We then use the same hardware to sequentially form a deep optical CNN with ten output neurons, achieving successful recognition of the full set of ten digits in 900-pixel handwritten digit images with 88% accuracy. Our results are based on simultaneously interleaving temporal, wavelength, and spatial dimensions enabled by an integrated microcomb source. This approach is scalable and trainable for much more complex networks, for demanding applications such as unmanned vehicles and real-time video recognition.
ARTICLE | doi:10.20944/preprints202304.0645.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Lip Reading; Multiclass Classification; Turkish Lip Reading Dataset; Deep Learning; Convolutional Neural Networks; Lip Detection
Online: 20 April 2023 (10:07:48 CEST)
Automated lip reading is a research problem that has developed considerably in recent years. In some settings, lip reading is evaluated both visually and audibly; however, a lip-reading model can also be used to detect specific words in footage from security cameras, where audio-visual databases cannot be used because the sound of the pronounced word is not available. In this study, we collected a new Turkish dataset containing images only. The dataset was produced from YouTube videos, an uncontrolled environment, so the images vary in difficult parameters such as light, angle, color, and the personal characteristics of the face. Despite differing facial features such as mustaches, beards, and make-up, a visual speech recognition model was developed on 10 classes, including single words and two-word phrases, using Convolutional Neural Networks (CNN) without any intervention on the data. The proposed study thus obtains an automated visual speech recognition model from only-visual data with a deep learning approach. In addition, since this study uses only visual data, its computational cost and resource usage are lower than in multi-modal studies. It is also the first known study to address the lip reading problem with a deep learning algorithm using a new dataset belonging to the Ural-Altaic languages.
REVIEW | doi:10.20944/preprints202110.0135.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: convolutional neural networks (CNNs); deep learning; computer-aided diagnosis; colorectal polyps; colorectal cancer; colonoscopy
Online: 8 October 2021 (10:50:53 CEST)
As a relatively high percentage of adenoma polyps are missed, a computer-aided diagnosis (CAD) tool based on deep learning can aid the endoscopist in diagnosing colorectal polyps or colorectal cancer, decreasing the polyp miss rate and preventing colorectal cancer mortality. The Convolutional Neural Network (CNN) is a deep learning method that, over the last decade, has achieved better results in detecting and segmenting specific objects in images than conventional models such as regression, support vector machines, or artificial neural networks. In recent years, CNN models have also achieved promising results in detecting masses and lesions in various body organs in medical imaging studies, including colorectal polyps. In this review, the structure and architecture of CNN models, and how colonoscopy images are processed as input and converted to output, are explained in detail. In most primary studies in the colorectal polyp detection and classification field, the CNN model has been regarded as a black box, since the calculations performed at the different layers during training have not been clarified precisely. Furthermore, I discuss the differences between CNN and conventional models, examine how to train a CNN model for diagnosing colorectal polyps or cancer, and evaluate model performance after the training process.
ARTICLE | doi:10.20944/preprints202105.0429.v1
Subject: Medicine And Pharmacology, Other Keywords: Acute lymphoblastic leukemia; Deep convolutional neural networks; Ensemble image classifiers; C-NMC-2019 dataset.
Online: 19 May 2021 (07:42:23 CEST)
Although automated Acute Lymphoblastic Leukemia (ALL) detection is essential, it is challenging due to the morphological similarity between malignant and normal cells. The traditional ALL classification strategy is arduous and time-consuming, often suffers from inter-observer variation, and requires experienced pathologists. This article automates the ALL detection task using deep Convolutional Neural Networks (CNNs). We explore a weighted ensemble of deep CNNs to obtain a better ALL cell classifier, with the weights estimated from the ensemble candidates' corresponding metrics, such as accuracy, F1-score, AUC, and kappa values. Various data augmentation and pre-processing steps are incorporated to achieve better generalization of the network. We train and evaluate the proposed model on the publicly available C-NMC-2019 ALL dataset. The proposed weighted ensemble model achieved a weighted F1-score of 88.6%, a balanced accuracy of 86.2%, and an AUC of 0.941 on the preliminary test set. Qualitative results displaying the gradient class activation maps confirm that the introduced model has a concentrated learned region, whereas the ensemble candidate models, such as Xception, VGG-16, DenseNet-121, MobileNet, and InceptionResNet-V2, individually produce coarse and scattered learned areas in most example cases. Since the proposed ensemble yields better results for the target task, it can be applied in other domains of medical diagnostic applications.
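As one way to picture the weighting scheme, each candidate's class probabilities can be combined in proportion to its validation metrics. The exact formula for turning metrics into weights is an assumption here (a normalized mean of the metrics), not taken from the paper, and the numbers are illustrative:

```python
import numpy as np

# Ensemble candidates' validation metrics (illustrative values):
# each row is [accuracy, F1, AUC, kappa] for one CNN.
metrics = np.array([
    [0.85, 0.84, 0.91, 0.70],  # e.g. Xception
    [0.82, 0.81, 0.89, 0.65],  # e.g. VGG-16
    [0.87, 0.86, 0.93, 0.74],  # e.g. DenseNet-121
])

# Assumed weighting scheme: weight each model by its mean metric,
# normalized so the weights sum to 1.
w = metrics.mean(axis=1)
w = w / w.sum()

# Per-model class probabilities for one cell image (ALL vs. normal).
probs = np.array([[0.70, 0.30],
                  [0.55, 0.45],
                  [0.80, 0.20]])

ensemble_prob = w @ probs  # weighted average of the candidates' outputs
print(ensemble_prob.round(3))  # → [0.686 0.314]
```

Models with stronger validation metrics thus pull the ensemble output toward their own predictions.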
ARTICLE | doi:10.20944/preprints201809.0481.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: Brain-Computer Interfaces, spectrogram-based convolutional neural network model (pCNN), Deep Learning, EEG, LSTM, RCNN
Online: 25 September 2018 (08:58:34 CEST)
Non-invasive, electroencephalography (EEG)-based brain-computer interfaces (BCIs) for motor imagery translate the subject’s motor intention into control signals by classifying the EEG patterns caused by different imagination tasks, e.g. hand movements. This type of BCI has been widely studied and used as an alternative mode of communication and environmental control for disabled patients, such as those suffering from a brainstem stroke or a spinal cord injury (SCI). Notwithstanding the success of traditional machine learning methods in classifying EEG signals, these methods still rely on hand-crafted features, whose extraction is difficult due to the high non-stationarity of EEG signals and is a major cause of the stagnating progress in classification performance. Remarkable advances in deep learning methods allow end-to-end learning without any feature engineering, which could benefit BCI motor imagery applications. We developed three deep learning models for decoding motor imagery movements directly from raw EEG signals without manual feature engineering: 1) a long short-term memory network (LSTM); 2) a proposed spectrogram-based convolutional neural network model (pCNN); and 3) a recurrent convolutional neural network (RCNN). Results were evaluated on our own publicly available EEG data collected from 20 subjects and on the existing 2b EEG dataset from the "BCI Competition IV". Overall, better classification performance was achieved with the deep learning models than with state-of-the-art machine learning techniques, which could chart a route ahead for developing new robust techniques for EEG signal decoding. We underpin this point by demonstrating the successful real-time control of a robotic arm using our CNN-based BCI.
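The spectrogram-based pCNN operates on a time-frequency representation of the raw EEG. A minimal numpy sketch of building such a spectrogram with a short-time FFT on a synthetic one-channel signal (the sampling rate and window parameters are assumptions, not the paper's):

```python
import numpy as np

fs = 250                      # assumed EEG sampling rate (Hz)
t = np.arange(0, 4, 1 / fs)   # 4 s of signal
# Synthetic one-channel "EEG": a 10 Hz mu-rhythm plus noise.
rng = np.random.default_rng(3)
sig = np.sin(2 * np.pi * 10 * t) + 0.5 * rng.normal(size=t.size)

# Short-time FFT: 1 s Hann windows with 50% overlap give a
# time-frequency image of the kind a spectrogram-based CNN consumes.
win, hop = fs, fs // 2
frames = [sig[i:i + win] for i in range(0, sig.size - win + 1, hop)]
spec = np.abs(np.fft.rfft(np.array(frames) * np.hanning(win), axis=1))

print(spec.shape)  # (7, 126): time frames x frequency bins
```

The 10 Hz oscillation shows up as a bright row of the spectrogram, which is exactly the kind of spatially local pattern a CNN kernel can pick up.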
ARTICLE | doi:10.20944/preprints202002.0334.v1
Subject: Environmental And Earth Sciences, Remote Sensing Keywords: deep learning; drone imagery; hyperspectral image classiﬁcation; tree species classification; 3D convolutional neural networks
Online: 24 February 2020 (01:13:13 CET)
Interest in drone solutions for forestry applications is growing. Using drones, datasets can be captured flexibly and at high spatial and temporal resolution when needed. Fundamental tasks in forestry applications include the detection of individual trees, tree species classification, and biomass estimation. Deep Neural Networks (DNN) have shown superior results compared with conventional machine learning methods such as the Multi-Layer Perceptron (MLP) on large input datasets. The objective of this research was to investigate 3D convolutional neural networks (3D-CNN) for classifying three major tree species in a boreal forest: pine, spruce, and birch. The proposed 3D-CNN models were employed to classify tree species at a test site in Finland. The classifiers were trained on a dataset of 3039 manually labelled trees, and accuracies were then assessed on an independent dataset of 803 records. To find the most efficient feature combination, we compared the performance of 3D-CNN models trained with hyperspectral (HS) channels, RGB channels, and the canopy height model (CHM), separately and combined. The proposed 3D-CNN model with RGB and HS layers produced the highest classification accuracy: the producer accuracies of the best 3D-CNN classifier on the test dataset were 99.6%, 94.8%, and 97.4% for pines, spruces, and birches, respectively. The best 3D-CNN classifier produced ~5% better classification accuracy than the MLP with all layers. Our results suggest that the proposed method provides excellent classification results with acceptable performance metrics for HS datasets. The pine class was detectable in most layers; spruce was most detectable in the RGB data, while birch was most detectable in the HS layers. Furthermore, the RGB datasets alone provide acceptable results for many low-accuracy applications.
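At the core of a 3D-CNN is a convolution that slides jointly over the spectral and spatial dimensions of the input cube. A naive numpy sketch with an assumed cube size and a single random 3x3x3 filter (real implementations use optimized library kernels and learned weights):

```python
import numpy as np

rng = np.random.default_rng(4)

# One tree sample: a 16x16 spatial window with 32 hyperspectral bands,
# treated as a (bands, height, width) volume; sizes are assumed.
cube = rng.normal(size=(32, 16, 16))
kernel = rng.normal(size=(3, 3, 3))  # a single 3x3x3 filter

# Naive valid-mode 3D convolution (cross-correlation): the filter is
# applied across spectral and spatial dimensions alike.
D, H, W = [s - 2 for s in cube.shape]
out = np.zeros((D, H, W))
for d in range(D):
    for h in range(H):
        for w in range(W):
            out[d, h, w] = np.sum(cube[d:d + 3, h:h + 3, w:w + 3] * kernel)

print(out.shape)  # (30, 14, 14)
```

Because the filter also spans the band axis, a 3D-CNN learns joint spectral-spatial features rather than treating each band independently.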
ARTICLE | doi:10.20944/preprints201906.0270.v2
Subject: Environmental And Earth Sciences, Remote Sensing Keywords: Land cover mapping; Convolutional neural networks; UNET; Sentinel-2
Online: 9 August 2019 (11:54:37 CEST)
The Sentinel-2 satellite mission offers high-resolution multispectral time series image data, enabling the production of detailed land cover maps globally. At this scale, the trade-off between processing time and result quality is a central design decision. Currently, this machine learning task is usually performed using pixelwise classification methods. The radical shift of the computer vision field away from hand-engineered image features and towards more automation through representation learning comes with many promises, including higher-quality results and less engineering effort. In this paper we assess fully convolutional neural network architectures as replacements for a Random Forest classifier in an operational context for the production of high-resolution land cover maps from Sentinel-2 time series at the country scale. Our contributions include a framework for working with Sentinel-2 L2A time series image data, an adaptation of the U-Net model for dealing with sparse annotation data while maintaining high-resolution output, and an analysis of the results in the context of operational production of land cover maps.
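One common way to adapt dense-prediction training to sparse annotation data is to mask unlabelled pixels out of the loss; the sketch below illustrates that idea in numpy (the paper's exact scheme may differ, and the probabilities and label layout are synthetic):

```python
import numpy as np

rng = np.random.default_rng(1)

# Predicted class probabilities for a 4x4 patch, 3 land cover classes.
logits = rng.normal(size=(4, 4, 3))
probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)

# Sparse annotations: -1 marks unlabelled pixels, which must not
# contribute to the training loss.
labels = np.array([[0, -1, -1, 2],
                   [-1, 1, -1, -1],
                   [2, -1, 0, -1],
                   [-1, -1, -1, 1]])

mask = labels >= 0
# Cross-entropy computed only at labelled pixels.
ce = -np.log(probs[mask, labels[mask]])
loss = ce.mean()
print(mask.sum(), round(float(loss), 3))
```

The network still predicts every pixel at full resolution; only the gradient signal is restricted to annotated locations.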
ARTICLE | doi:10.20944/preprints202304.0320.v1
Subject: Computer Science And Mathematics, Mathematical And Computational Biology Keywords: Ovarian Tumours; UNet; Convolutional Neural Networks; VGG 16; DenseNet; ResNet; Dice score; Jaccard score
Online: 13 April 2023 (10:50:53 CEST)
Despite advances in treatment and research on ovarian cancer, the difficulty of detecting tumors at earlier stages remains a major cause of patient mortality. Deep learning algorithms applied to CT scan images of the ovarian region can serve as a diagnostic tool. The images go through a series of pre-processing techniques, and the tumor is then segmented using the UNet model. Instances are classified into two categories, benign and malignant tumors, using deep learning models such as CNN, ResNet, DenseNet, Inception-ResNet, VGG16, and Xception, along with machine learning models such as Random Forest, Gradient Boosting, AdaBoost, and XGBoost. DenseNet 121 emerges as the best model on this dataset, obtaining an accuracy of 95.7%, even after optimization is applied to the machine learning models. The current work demonstrates a comparison of multiple CNN architectures with each other and with common machine learning algorithms, with and without optimization techniques applied.
ARTICLE | doi:10.20944/preprints202108.0272.v1
Subject: Engineering, Industrial And Manufacturing Engineering Keywords: Remaining Useful Life; Deep Neural Network; Convolutional Neural Network; Genetic Optimization; Neural Network Optimization; Support Vector Regression; Depth Maps; Normal Maps; 3D Point Clouds.
Online: 12 August 2021 (10:40:23 CEST)
In the current industrial landscape, increasingly pervaded by technological innovations, the adoption of optimized strategies for asset management is becoming a critical success factor. Among the various strategies available, the “Prognostics and Health Management” strategy is able to support maintenance management decisions more accurately, through continuous monitoring of equipment health and “Remaining Useful Life” forecasting. In the present study, Convolutional Neural Network-based Deep Neural Network techniques are investigated for the Remaining Useful Life prediction of a punch tool, whose degradation is caused by working surface deformations during the machining process. Surface deformation is determined using a 3D scanning sensor capable of returning point clouds with micrometric accuracy during the operation of the punching machine, avoiding both downtime and human intervention. The 3D point clouds thus obtained are transformed into bidimensional image-type maps, i.e., maps of depths and normal vectors, to fully exploit the potential of convolutional neural networks for extracting features. Such maps are then processed by comparing 15 genetically optimized architectures with the transfer learning of 19 pre-trained models, using a classic machine learning approach, Support Vector Regression, as a benchmark. The achieved results clearly show that, in this specific case, the optimized architectures provide performance (MAPE = 0.058) far superior to that of transfer learning, which instead achieves a level (MAPE = 0.416) between the optimized architectures and Support Vector Regression (MAPE = 0.857).
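The depth-map construction can be pictured as rasterising the 3D point cloud onto a regular 2D grid. A minimal numpy sketch with synthetic points and an assumed grid size, keeping the maximum z per cell (one simple projection choice; the paper additionally derives normal-vector maps):

```python
import numpy as np

# Tiny synthetic point cloud over the punch working surface:
# x, y in mm on a 10x10 mm area, z = depth/deformation. Illustrative only.
rng = np.random.default_rng(2)
pts = rng.uniform(low=[0, 0, -0.1], high=[10, 10, 0.1], size=(500, 3))

# Rasterise into an 8x8 depth map; each cell keeps the maximum z it sees.
H = W = 8
depth = np.full((H, W), -np.inf)  # cells no point falls into stay -inf
ix = np.clip((pts[:, 0] / 10 * W).astype(int), 0, W - 1)
iy = np.clip((pts[:, 1] / 10 * H).astype(int), 0, H - 1)
np.maximum.at(depth, (iy, ix), pts[:, 2])  # unbuffered per-cell maximum

print(depth.shape)  # (8, 8)
```

The resulting image-type map can then be fed to a standard 2D CNN, which is the point of the transformation.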
ARTICLE | doi:10.20944/preprints201706.0012.v3
Subject: Engineering, Control And Systems Engineering Keywords: deep convolutional neural networks; road segmentation; conditional random fields; landscape metrics; satellite images; aerial images; THEOS
Online: 5 June 2017 (06:39:54 CEST)
Object segmentation of remotely sensed images, both aerial (very high resolution, VHR) and satellite (high resolution, HR) images, has been applied to many application domains, especially road extraction, in which the segmented objects serve as a mandatory layer in geospatial databases. Several attempts to apply deep convolutional neural networks (DCNN) to extract roads from remote sensing images have been made; however, the accuracy is still limited. In this paper, we present an enhanced DCNN framework specifically tailored for road extraction from remote sensing images, applying landscape metrics (LMs) and conditional random fields (CRFs). To improve the DCNN, a modern activation function, the exponential linear unit (ELU), is employed in our network, resulting in more, and more accurate, extracted roads. To further reduce falsely classified road objects, a solution based on LMs is proposed. Finally, to sharpen the extracted roads, a CRF method is added to our framework. Experiments were conducted on the Massachusetts road aerial imagery and THEOS satellite imagery datasets. The results show that our proposed framework outperformed SegNet, a state-of-the-art object segmentation technique, on both kinds of remote sensing imagery in most cases in terms of precision, recall, and F1.
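The ELU activation adopted here is simple to state: identity for positive inputs and a smooth exponential saturation for negative ones. A minimal numpy sketch:

```python
import numpy as np

def elu(x, alpha=1.0):
    # ELU: x for x > 0, alpha * (exp(x) - 1) for x <= 0.
    # The negative branch saturates at -alpha, pushing mean activations
    # toward zero, which helps gradient flow compared to ReLU.
    return np.where(x > 0, x, alpha * np.expm1(x))

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(elu(x).round(3))
```

Swapping ReLU for ELU is a drop-in change at each convolutional layer of the network.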
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Car Detection; Convolutional Neural Networks; Deep Learning; Faster R-CNN; Unmanned Aerial Vehicles; You Only Look Once (Yolo).
Online: 12 March 2020 (08:57:09 CET)
In this paper, we address the problem of car detection from aerial images using Convolutional Neural Networks (CNN). This problem presents additional challenges as compared to car (or any object) detection from ground images because features of vehicles from aerial images are more difficult to discern. To investigate this issue, we assess the performance of two state-of-the-art CNN algorithms, namely Faster R-CNN, which is the most popular region-based algorithm, and YOLOv3, which is known to be the fastest detection algorithm. We analyze two datasets with different characteristics to check the impact of various factors, such as UAV's altitude, camera resolution, and object size. The objective of this work is to conduct a robust comparison between these two cutting-edge algorithms. By using a variety of metrics, we show that YOLOv3 yields better performance in most configurations, except that it exhibits a lower recall and less confident detections when object sizes and scales in the testing dataset differ largely from those in the training dataset.
ARTICLE | doi:10.20944/preprints201910.0195.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: car detection; convolutional neural networks; deep learning; you only look once (yolo); faster r-cnn; unmanned aerial vehicles
Online: 17 October 2019 (12:29:29 CEST)
In this paper, we address the problem of car detection from aerial images using Convolutional Neural Networks (CNN). This problem presents additional challenges as compared to car (or any object) detection from ground images because features of vehicles from aerial images are more difficult to discern. To investigate this issue, we assess the performance of two state-of-the-art CNN algorithms, namely Faster R-CNN, which is the most popular region-based algorithm, and YOLOv3, which is known to be the fastest detection algorithm. We analyze two datasets with different characteristics to check the impact of various factors, such as UAV’s altitude, camera resolution, and object size. The objective of this work is to conduct a robust comparison between these two cutting-edge algorithms. By using a variety of metrics, we show that none of the two algorithms outperforms the other in all cases.
ARTICLE | doi:10.20944/preprints201808.0112.v2
Subject: Computer Science And Mathematics, Computational Mathematics Keywords: remote sensing; image classification; fully connected conditional random fields (FC-CRF); convolutional neural networks (CNN)
Online: 28 November 2018 (07:11:42 CET)
The interpretation of land use and land cover (LULC) is an important issue in the fields of high-resolution remote sensing (RS) image processing and land resource management. Fully training a new or existing convolutional neural network (CNN) architecture for LULC classification requires a large number of remote sensing images; thus, fine-tuning a pre-trained CNN for LULC detection is required. To improve the classification accuracy for high-resolution remote sensing images, it is also necessary to use additional feature descriptors and to adopt a classifier for post-processing. A fully connected conditional random field (FC-CRF), which uses the fine-tuned CNN layers, spectral features, and fully connected pairwise potentials, is proposed for the classification of high-resolution remote sensing images. First, an existing CNN model is adopted, its parameters are fine-tuned on training datasets, and the probabilities of image pixels belonging to each class type are calculated. Second, the spectral features and the digital surface model (DSM) are considered and, combined with a support vector machine (SVM) classifier, the probabilities of belonging to each LULC class type are determined; combined with the probabilities obtained from the fine-tuned CNN, new feature descriptors are built. Finally, the FC-CRF is introduced to produce the classification results, where the unary potentials are obtained from the new feature descriptors and the SVM classifier, and the pairwise potentials from the three-band RS imagery and DSM. Experimental results show that the proposed classification scheme achieves good performance, with a total accuracy of about 85%.
ARTICLE | doi:10.20944/preprints201711.0053.v3
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: ultrasound; b-mode; skeletal muscle; fascicle orientation; pennation angle; fiber orientation; fiber tract; fascicle tract; convolutional neural network; deconvolutional neural network
Online: 19 January 2018 (14:05:16 CET)
Direct measurement of strain within muscle is important for understanding muscle function in health and disease, but current technology (kinematics, dynamometry, electromyography) provides limited ability to measure it. Regional fiber orientation and length are related to active/passive strain within muscle, and ultrasound imaging currently provides the only non-invasive means of observing regional fiber orientation within muscle during dynamic tasks. Previous attempts to automatically estimate fiber orientation from ultrasound are not adequate: they often require manual region selection and feature engineering, provide low-resolution estimates (one angle per muscle), and rarely attempt deep muscles. Here, we propose deconvolutional neural networks (DCNN) for estimating fiber orientation at the pixel level. Dynamic ultrasound image sequences of the calf muscles were acquired (25 Hz) from 8 healthy volunteers (4 male, ages 25-36, median 30). A combination of expert annotation and interpolation/extrapolation provided labels of regional fiber orientation for each image. We then trained DCNNs, both with and without dropout, using leave-one-out cross-validation. Our results demonstrate robust estimation of regional fiber orientation with approximately 3° error, an improvement on previous methods. The methods presented here provide new potential to study muscle in disease and health.
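The leave-one-out cross-validation over the 8 volunteers can be sketched as follows (the subject identifiers are hypothetical): each fold holds one subject out for testing and trains on the remaining seven.

```python
# Leave-one-subject-out cross-validation over 8 volunteers.
subjects = [f"S{i}" for i in range(1, 9)]

folds = []
for held_out in subjects:
    train = [s for s in subjects if s != held_out]  # 7 training subjects
    folds.append((train, held_out))                 # one fold per subject

print(len(folds), folds[0][1], len(folds[0][0]))  # → 8 S1 7
```

Splitting by subject, rather than by frame, ensures the reported ~3° error reflects generalization to unseen people.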
ARTICLE | doi:10.20944/preprints202305.0319.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: hyperspectral images; convolutional neural networks; graph convolutional networks; feature fusion
Online: 5 May 2023 (07:40:07 CEST)
Convolutional neural networks (CNN) have attracted much attention as a commonly used method for hyperspectral image (HSI) classification in recent years. However, CNNs can only be applied to Euclidean data and, because of their local feature extraction, have limitations in modeling relationships: each pixel of a hyperspectral image contains a set of spectral bands that are correlated and interact with each other, and methods designed for Euclidean data cannot effectively capture these correlations. In contrast, the graph convolutional network (GCN) can be used on non-Euclidean data, but it usually leads to over-smoothing and ignores local detail features, owing to the superpixel segmentation required to reduce the computational effort. To overcome these problems, we constructed a fusion network based on GCN and CNN that contains two branches: a graph convolutional network based on superpixel segmentation and a convolutional network with an added attention mechanism. The graph convolutional branch extracts structural features and captures the relationships between nodes, while the convolutional branch extracts detailed features in local fine regions. Since the features extracted by the two branches differ, classification performance can be improved by fusing these complementary features. To validate the proposed algorithm, experiments were conducted on three widely used datasets, namely Indian Pines, Pavia University, and Salinas, obtaining overall accuracies of 98.78% on Indian Pines and 98.99% and 98.69% on the other two datasets. The results show that the proposed fusion network can obtain richer features and achieve high classification accuracy.
ARTICLE | doi:10.20944/preprints202004.0271.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: Keratoconus; smartphone; cornea; convolutional neural network
Online: 16 April 2020 (12:38:42 CEST)
Smartphones have become a promising platform for disease diagnosis and remote health care applications due to their ubiquity. Here, a novel convolutional neural network method for detecting keratoconus that is wholly implemented on a smartphone is proposed. The proposed method detects all stages of keratoconus with an overall accuracy above 72.9%; preliminary results indicate detection rates of 90%, 83%, 64%, and 52% for the severe, advanced, moderate, and mild stages of the disease, respectively.
ARTICLE | doi:10.20944/preprints201811.0583.v1
Subject: Computer Science And Mathematics, Data Structures, Algorithms And Complexity Keywords: Station logo; Convolutional Neural Network; Detection
Online: 26 November 2018 (10:57:17 CET)
A station logo is the means by which a TV station claims copyright; identifying the logo enables analysis and understanding of the video and helps ensure that the broadcast TV signal is not illegally tampered with. In this paper, we design a station logo detection method based on a Convolutional Neural Network, exploiting characteristics of station logos such as their small scale-to-height ratio variation and relatively fixed position. First, video samples are collected, filtered, framed, labeled, and processed to realize preprocessing and feature extraction of the logo data. Then, the training and test sample data are divided proportionally to train the logo detection model. Finally, the samples are tested to evaluate the trained model in practice. Simulation experiments demonstrate the validity of the method.
ARTICLE | doi:10.20944/preprints201902.0203.v1
Subject: Biology And Life Sciences, Plant Sciences Keywords: Northern Corn Leaf Blight (Exserohilum); Gray Leaf Spot (Cerospora); Common Rust (Puccinia sorghi); Convolutional Neural Networks (CNN); Neuroph Studio
Online: 21 February 2019 (13:04:05 CET)
Plant leaf diseases can damage plants' leaves to the extent that the plants collapse and die completely. These diseases may drastically reduce the supply of vegetables and fruits to the market and weaken the agricultural economy. In the literature, various laboratory methods of plant leaf disease detection have been used; these methods are time consuming and cannot cover large areas. This study applies the principles of Convolutional Neural Networks (CNNs) to model a network for image recognition and classification of these diseases. Neuroph was used to train a CNN that recognized and classified images of maize leaf diseases collected with a smartphone camera. The novel training approach and the methodology used expedite a quick and easy implementation of the system in practice. The developed model was able to distinguish three types of maize leaf disease from healthy leaves. The Northern Corn Leaf Blight (Exserohilum), Common Rust (Puccinia sorghi), and Gray Leaf Spot (Cerospora) diseases were chosen for this study because they affect most maize fields in Southern Africa.
ARTICLE | doi:10.20944/preprints201911.0019.v1
Subject: Engineering, Control And Systems Engineering Keywords: community detection; social network; convolutional neural network; auto-encoder
Online: 3 November 2019 (15:51:34 CET)
With the fast development of the mobile Internet, online social network platforms for making friends, sharing information, and similar purposes have grown rapidly. On these platforms, relationships among users form social networks, and the literature shows that such networks have community structure. Through the study of community structure, the characteristics and functions of network structure and the dynamical evolution mechanism of networks can be used to predict user behaviour and control information dissemination. Therefore, this study proposes a deep community detection method comprising (1) a matrix reconstruction method, (2) a spatial feature extraction method, and (3) a community detection method. The original adjacency matrix of the social network is reconstructed based on opinion leaders and nearer neighbors to obtain a spatial proximity matrix. The spatial eigenvectors of the reconstructed adjacency matrix are then extracted by an auto-encoder based on a convolutional neural network to improve modularity. In experiments, four open datasets of practical social networks were selected to evaluate the proposed method, and the results show that it obtained higher modularity than other methods. The proposed deep community detection method can therefore effectively detect high-quality communities in social networks.
ARTICLE | doi:10.20944/preprints202307.1795.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: Convolutional Neural Network; Network Scaling; Evolutionary Computation
Online: 26 July 2023 (10:19:29 CEST)
Convolutional Neural Networks (CNNs) are largely hand-crafted, which leads to inefficiency in the constructed networks. Various algorithms have been proposed to address this issue, but the inefficiencies resulting from human intervention have not been eliminated. Our proposed EvolveNet algorithm is a task-agnostic evolutionary search algorithm that can find optimal depth and width scales automatically and efficiently. The optimal configurations are not found by grid search but are evolved from an existing network. This eliminates inefficiencies that emanate from hand-crafting, thus reducing the drop in accuracy. The proposed algorithm is a framework for searching through a large space of subnetworks until a suitable configuration is found. Extensive experiments on the ImageNet dataset demonstrate the superiority of the proposed method, which outperforms state-of-the-art methods.
ARTICLE | doi:10.20944/preprints201811.0546.v4
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Convolutional Neural Network (CNN), Deep learning, Architecture, Applications
Online: 14 February 2019 (10:01:31 CET)
With the rise of the Artificial Neural Network (ANN), machine learning has taken a forceful turn in recent times. One of the most remarkable kinds of ANN design is the Convolutional Neural Network (CNN), a technology that combines artificial neural networks with up-to-date deep learning strategies. In deep learning, the CNN is at the center of spectacular advances. This type of artificial neural network has been applied to image recognition tasks for decades and has attracted the attention of researchers in many countries in recent years, as CNNs have shown promising performance in many computer vision and machine learning tasks. This paper describes the underlying architecture and various applications of the Convolutional Neural Network.
REVIEW | doi:10.20944/preprints202208.0313.v3
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Convolutional Neural Network; domain; natural language processing; computer vision; semantic parsing
Online: 18 August 2022 (07:39:33 CEST)
The convolutional neural network (CNN), a class of artificial neural network (ANN), is attracting the interest of researchers across research domains. CNNs were invented for computer vision, but they have also proven useful for semantic parsing, sentence modeling, and other natural language processing tasks. In this paper, we discuss the basics of CNN models and their scope, to provide a reference and baseline for researchers interested in using CNN models in their own work.
ARTICLE | doi:10.20944/preprints202005.0455.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: pattern recognition; deep convolutional neural network; Brahmi script; CNN
Online: 28 May 2020 (07:33:32 CEST)
Significant progress has been made in pattern recognition technology. However, one obstacle that has not yet been overcome is the recognition of words in the Brahmi script, specifically the identification of characters, compound characters, and words. This study proposes a deep convolutional neural network (DCNN) with dropout to recognize Brahmi words, and a series of experiments is performed on a standard Brahmi dataset. The method was systematically tested on an accessible Brahmi image database, achieving a 92.47% recognition rate with the dropout-regularized CNN, which is among the best results reported in the literature for this task.
ARTICLE | doi:10.20944/preprints202103.0220.v1
Subject: Environmental And Earth Sciences, Atmospheric Science And Meteorology Keywords: Convolutional Neural Network; Deep Learning; Environmental Monitoring
Online: 8 March 2021 (13:37:58 CET)
Accurately mapping individual tree species in densely forested environments is crucial to forest inventory. When only RGB images are considered, this is a challenging task for many automatic photogrammetry processes, mainly because of the spectral similarity between species in RGB scenes, which hinders most automatic methods. State-of-the-art deep learning methods can identify tree species in RGB images with attractive cost, accuracy, and computational load. This paper presents a deep learning-based approach to detect an important multi-use palm species (Mauritia flexuosa, i.e., Buriti) in aerial RGB imagery. In South America, this palm is essential to many indigenous and local communities because of its characteristics, and it is also a valuable indicator of water resources, which makes mapping its location beneficial. The method is based on a Convolutional Neural Network (CNN) that identifies and geolocates individual trees of the species in a high-complexity forest environment, and it considers the likelihood of every pixel in the image being recognized as a possible tree by implementing confidence-map feature extraction. This study compares the performance of the proposed method against state-of-the-art object detection networks using a dataset of 1,394 airborne scenes in which 5,334 palm trees were manually labeled. The results returned a mean absolute error (MAE) of 0.75 trees and an F1-measure of 86.9%, better than both Faster R-CNN and RetinaNet under equal experimental conditions. The proposed network detects the palm trees quickly, with a detection time of 0.073 seconds per image and a standard deviation of 0.002 on the GPU.
In conclusion, the presented method handles high-density forest scenarios efficiently, can accurately map the location of single species such as the M. flexuosa palm, and may be useful for future frameworks.
REVIEW | doi:10.20944/preprints202309.1149.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: neural networks; machine learning; convolutional neural networks; computational complexity; ANN performance
Online: 19 September 2023 (03:53:36 CEST)
In recent years, neural networks have been increasingly deployed in various fields to learn complex patterns and make accurate predictions. However, designing an effective neural network model is a challenging task that requires careful consideration of many factors, including the architecture, optimization method, and regularization technique. This paper aims to provide a comprehensive overview of state-of-the-art artificial neural network (ANN) generation and to highlight key challenges and opportunities in machine learning applications. It offers a critical analysis of current neural network design methodologies, focusing on the strengths and weaknesses of different approaches. It also explores the use of different learning approaches, including convolutional neural networks (CNNs), deep neural networks (DNNs), and recurrent neural networks (RNNs), in image recognition, natural language processing, and time series analysis. In addition, it discusses the benefits of choosing ideal values for the different components of an ANN, such as the number of input/output layers, the number of hidden layers, the activation function type, the number of epochs, and the model type, which help improve model performance and generalization. Furthermore, it identifies common pitfalls and limitations of existing design methodologies, such as overfitting, lack of interpretability, and computational complexity. Finally, it proposes directions for future research, such as developing more efficient and interpretable neural network architectures, improving the scalability of training algorithms, and exploring the potential of new paradigms such as spiking neural networks, quantum neural networks, and neuromorphic computing.
ARTICLE | doi:10.20944/preprints202103.0754.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: paraphrase generation; syntax information; Graph Convolutional Network; sequence-to-sequence
Online: 31 March 2021 (07:57:56 CEST)
Paraphrase generation is an important yet challenging task in NLP. Neural network-based approaches have achieved remarkable success in sequence-to-sequence (seq2seq) learning. Previous paraphrase generation work generally ignores syntactic information regardless of its availability, with the assumption that neural nets could learn such linguistic knowledge implicitly. In this work we make an endeavor to probe into the efficacy of explicit syntactic information for the task of paraphrase generation. Syntactic information can appear in the form of dependency trees, which can be easily acquired from off-the-shelf syntactic parsers. Such tree structures can be conveniently encoded via graph convolutional networks (GCNs) to obtain more meaningful sentence representations, which can improve generated paraphrases. Through extensive experiments on four paraphrase datasets with different sizes and genres, we demonstrate the utility of syntactic information in neural paraphrase generation under the framework of seq2seq modeling. Specifically, our GCN-enhanced models consistently outperform their syntax-agnostic counterparts in multiple evaluation metrics.
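The GCN encoding mentioned here typically follows the standard propagation rule, ReLU(row-normalised (A + I) · H · W), applied over the dependency tree's adjacency matrix. A minimal pure-Python sketch of one such layer (an illustration of the standard rule, not the authors' exact model):

```python
def gcn_layer(adj, feats, weight):
    """One GCN propagation step: ReLU(row_norm(A + I) @ H @ W).

    adj: n x n 0/1 adjacency, feats: n x d node features, weight: d x d' matrix.
    """
    n = len(adj)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adj[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    # Row-normalise so aggregation is an average over the neighbourhood.
    norm = [[a_hat[i][j] / sum(a_hat[i]) for j in range(n)] for i in range(n)]
    d = len(feats[0])
    # Aggregate neighbour features: norm @ feats.
    agg = [[sum(norm[i][k] * feats[k][j] for k in range(n)) for j in range(d)]
           for i in range(n)]
    # Linear transform plus ReLU nonlinearity: relu(agg @ weight).
    return [[max(0.0, sum(agg[i][k] * weight[k][j] for k in range(d)))
             for j in range(len(weight[0]))] for i in range(n)]
```

Stacking two or three such layers over a sentence's dependency-tree adjacency yields word representations that encode syntactic neighbourhoods, which the seq2seq decoder can then consume.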
ARTICLE | doi:10.20944/preprints202003.0096.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Deep learning; Energy demand; Temporal convolutional network; Time series forecasting
Online: 5 March 2020 (15:02:37 CET)
Modern energy systems collect high volumes of data that can provide valuable information about energy consumption. Electric companies can now use historical data to make informed decisions on energy production by forecasting the expected demand. Many deep learning models have been proposed to deal with these types of time series forecasting problems. Deep neural networks, such as recurrent or convolutional networks, can automatically capture complex patterns in time series data and provide accurate predictions. In particular, Temporal Convolutional Networks (TCN) are a specialised architecture that has advantages over recurrent networks for forecasting tasks. TCNs are able to extract long-term patterns using dilated causal convolutions and residual blocks, and can also be more efficient in terms of computation time. In this work, we propose a TCN-based deep learning model to improve the predictive performance in energy demand forecasting. Two energy-related time series with data from Spain have been studied: the national electric demand, and the power demand at charging stations for electric vehicles. An extensive experimental study has been conducted, involving more than 1900 models with different architectures and parametrisations. The TCN proposal outperforms the forecasting accuracy of Long Short-Term Memory (LSTM) recurrent networks, which are considered the state-of-the-art in the field.
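The dilated causal convolution at the heart of a TCN is easy to state: the output at time t only sees inputs at t, t-d, t-2d, and so on, so no future information leaks in, and stacking layers with growing dilation widens the receptive field rapidly. A minimal sketch (one filter, no residual block; values are illustrative, not from the paper):

```python
def dilated_causal_conv(x, kernel, dilation):
    """y[t] = sum_k kernel[k] * x[t - k*dilation]; out-of-range taps are zero.

    kernel[0] is applied to the current step, so the output never uses the future.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # implicit left zero-padding keeps the conv causal
                acc += w * x[idx]
        out.append(acc)
    return out

# Stacking layers with dilations 1, 2, 4, ... gives a receptive field of
# 1 + (len(kernel) - 1) * sum(dilations) time steps.
```

This is why a TCN can capture long-term demand patterns with few layers and, unlike an LSTM, computes every output position in parallel.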
ARTICLE | doi:10.20944/preprints202007.0650.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: Myocarditis; Diagnosis; Convolutional Neural Network; Cardiac MRI; prediction
Online: 26 July 2020 (17:44:05 CEST)
Myocarditis is an inflammation of the middle layer of the heart wall, caused by viral infection, that can affect the heart muscle and its electrical system. It remains one of the most challenging diagnoses in cardiology, and it is the prime cause of unexpected death in approximately 20% of adults under 40 years of age. Cardiac MRI (CMR) is considered the noninvasive gold-standard diagnostic tool for suspected myocarditis and plays an indispensable role in diagnosing various cardiac diseases. However, the performance of CMR depends heavily on the clinical presentation and on non-specific features such as chest pain, arrhythmia, and heart failure. In addition, other imaging factors such as artifacts, technical errors, pulse sequence, acquisition parameters, contrast agent dose, and, more importantly, qualitative visual interpretation can affect the diagnosis. This paper introduces a new deep learning-based model, Convolutional Neural Network-Clustering (CNN-KCL), for the early and accurate diagnosis of myocarditis. To the best of our knowledge, a convolutional neural network has never before been used for this purpose. In this study, we used data from 47 subjects at Tehran's Omid Hospital, comprising 10,425 samples in total. Our results demonstrate that CNN-KCL achieves 92.3% prediction accuracy for myocarditis diagnosis, which is significantly better than results reported in previous studies.
ARTICLE | doi:10.20944/preprints202005.0493.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: Convolutional Neural Networks; Dental Diagnosis; Image Recognition; Diabetic Retinopathy detection
Online: 31 May 2020 (18:55:43 CEST)
Retinopathy is a human eye disease that causes changes in the retinal blood vessels, leading to bleeding, fluid leakage, and vision impairment. Its symptoms include blurred vision, changes in color perception, red spots, and eye pain. In this paper, a new methodology based on Convolutional Neural Networks (CNNs) is developed and proposed to diagnose and decide on the presence of retinopathy. The CNN model is trained on images of eyes with and without retinopathy. The performance of the proposed model is compared with the related methods DREAM, KNN, GD-CNN, and SVM. Experimental results show that the proposed CNN performs better.
ARTICLE | doi:10.20944/preprints201912.0252.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: time series; deep learning; convolutional neural network; recurrence plot; financial market prediction
Online: 19 December 2019 (07:39:54 CET)
An application of a deep convolutional neural network and recurrence plots for financial market movement prediction is presented. Though it is challenging and subjective to interpret, the pattern formed by a recurrence plot provides useful insight into the underlying dynamical system. We used recurrence plots of seven financial time series to train a deep neural network for financial market movement prediction. Tested on our dataset, the approach achieved an average classification accuracy of 53.25%. The result suggests that a well-trained deep convolutional neural network can learn from a recurrence plot and predict the direction of a financial market.
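A recurrence plot is a simple construction: entry R[i][j] is 1 when states x_i and x_j lie within a threshold eps of each other. A minimal sketch for a scalar series (the threshold and series here are illustrative; the paper builds plots from seven financial series):

```python
def recurrence_plot(series, eps):
    """Binary recurrence matrix: R[i][j] = 1 iff |x_i - x_j| <= eps."""
    n = len(series)
    return [[1 if abs(series[i] - series[j]) <= eps else 0 for j in range(n)]
            for i in range(n)]

# The resulting n x n binary image is what the CNN consumes as its input.
```

Turning a time series into this image-like matrix is what makes a 2-D convolutional network applicable to a 1-D forecasting problem.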
ARTICLE | doi:10.20944/preprints202103.0180.v1
Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: convolutional neural networks; activation functions; biomedical classification; ensembles; MeLU variants
Online: 5 March 2021 (10:05:38 CET)
Recently, much attention has been devoted to finding highly efficient and powerful activation functions for CNN layers. Because activation functions inject different nonlinearities between layers that affect performance, varying them is one method for building robust ensembles of CNNs. The objective of this study is to examine the performance of CNN ensembles made with different activation functions, including six new ones presented here: 2D Mexican ReLU, TanELU, MeLU+GaLU, Symmetric MeLU, Symmetric GaLU, and Flexible MeLU. The highest performing ensemble was built with CNNs having different activation layers that randomly replaced the standard ReLU. A comprehensive evaluation of the proposed approach was conducted across fifteen biomedical data sets representing various classification tasks. The proposed method was tested on two basic CNN architectures: Vgg16 and ResNet50. Results demonstrate the superiority in performance of this approach. The MATLAB source code for this study will be available at https://github.com/LorisNanni.
ARTICLE | doi:10.20944/preprints202306.0623.v1
Subject: Engineering, Bioengineering Keywords: Epilepsy; Electroencephalogram; Convolutional neural networks; Brain signal integral; Brain signal derivative
Online: 8 June 2023 (10:13:28 CEST)
Epilepsy is a neurological disorder that affects approximately 1% of the world's population. To diagnose epilepsy and estimate its occurrence, recorded brain activity is analyzed by a neurologist, a process that is not only time-consuming and tedious but also occasionally subject to human error. Therefore, in recent decades, researchers have sought to design and build automated methods for diagnosing epilepsy and estimating its occurrence. Accordingly, the present study proposes two novel methods based on brain signals and a convolutional neural network (CNN), implemented with a sequential three-layer structure. Numerous experiments were performed, and the accuracy of the developed methods reached 95% without feedback and 97% with feedback. The proposed methods proved more accurate than previous techniques and could serve as a physician's assistant in clinical practice.
ARTICLE | doi:10.20944/preprints202210.0224.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: multilabel; ensemble; incorporating multiple clustering centers; gated recurrent neural networks; temporal convolutional neural networks; long short-term memory
Online: 17 October 2022 (04:06:31 CEST)
Multilabel learning goes beyond standard supervised learning models by associating a sample with more than one class label. Among the many techniques developed in the last decade to handle multilabel learning, the best approaches are those harnessing the power of ensembles and deep learners. This work merges both methods by combining a set of gated recurrent units, temporal convolutional neural networks, and long short-term memory networks trained with variants of the Adam optimization approach. We examine many Adam variants, each fundamentally based on the difference between present and past gradients, with the step size adjusted for each parameter. We also combine Incorporating Multiple Clustering Centers with a bootstrap-aggregated decision tree ensemble, which is shown to further boost classification performance. In addition, we provide an ablation study assessing the performance improvement that each module of our ensemble produces. Multiple experiments on a large set of datasets representing a wide variety of multilabel tasks demonstrate the robustness of our best ensemble, which is shown to outperform the state-of-the-art. The MATLAB code for generating the best ensembles in the experimental section will be made available at https://github.com/LorisNanni.
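The Adam variants examined all share the base update, which keeps exponential moving averages of the gradient and its square and rescales each parameter's step. A single-parameter sketch of the vanilla update (the paper's variants modify how present and past gradients are combined; the hyperparameter values below are the common defaults, not taken from the paper):

```python
def adam_step(param, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One vanilla Adam update for a scalar parameter (t counts from 1)."""
    m = b1 * m + (1 - b1) * grad             # first-moment (mean) estimate
    v = b2 * v + (1 - b2) * grad * grad      # second-moment (uncentred variance)
    m_hat = m / (1 - b1 ** t)                # bias correction for warm-up steps
    v_hat = v / (1 - b2 ** t)
    param -= lr * m_hat / (v_hat ** 0.5 + eps)   # per-parameter adaptive step
    return param, m, v
```

At t = 1 with gradient 1.0, the bias-corrected moments are both 1.0, so the parameter moves by almost exactly one learning rate; variants differ mainly in how m is formed from present and past gradients.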
ARTICLE | doi:10.20944/preprints201812.0296.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: staircase recognition; Convolutional Neural Networks (CNN); re-configurable robot; contour detection
Online: 25 December 2018 (05:33:12 CET)
Multi-floor environments are usually ignored when designing autonomous robots for indoor cleaning applications. However, for efficient operation in such environments, the ability of a robotic platform to traverse staircases is crucial, and staircase detection and localization are highly important for planning that traversal. This paper describes a deep learning approach to staircase detection and localization using Convolutional Neural Networks (CNNs) within the Robot Operating System (ROS). We use an object detection network to detect staircases in images. We also localize these staircases using a contour detection algorithm that finds the target point, a point close to the center of the first step, and the angle of approach to it. Experiments are performed on images captured on different types of staircases from different viewpoints and angles. Results show that the approach is very accurate in identifying the presence of a staircase in the working environment and also locates the target point with good accuracy.
ARTICLE | doi:10.20944/preprints202201.0068.v1
Subject: Engineering, Mechanical Engineering Keywords: Simulated annealing; Wavelet packet transform; Convolutional neural network
Online: 6 January 2022 (10:27:40 CET)
Bearings are widely used in various types of electrical machinery and equipment; as core components, their failures often cause serious consequences. At present, most parameter-tuning methods are still manual. Manual adjustment is susceptible to prior knowledge, easily falls into local optima rather than the global optimum, and requires substantial resources. Therefore, this paper proposes a new bearing fault diagnosis method based on the wavelet packet transform and a convolutional neural network optimized by a simulated annealing algorithm. The experimental results show that, compared with traditional bearing fault diagnosis methods, the proposed method is more accurate in feature extraction and fault classification. At the same time, in contrast to traditional manual tuning of artificial neural networks, this paper introduces the simulated annealing algorithm to adjust the network parameters automatically, thereby obtaining an adaptive bearing fault diagnosis method. To verify the method's effectiveness, it was tested on the Case Western Reserve University bearing database and compared with traditional intelligent bearing fault diagnosis methods. The results show that the proposed method performs well in bearing fault diagnosis and provides a new way of thinking about parameter tuning and fault classification algorithms in this field.
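Simulated annealing accepts worse candidate settings with probability exp(-delta/T), where the temperature T decays over time; this is what lets it escape the local optima that manual tuning falls into. A generic sketch (the cost function and neighbour move below are placeholders, not the paper's network-tuning setup):

```python
import math
import random

def simulated_annealing(cost, init, neighbor, t0=1.0, cooling=0.95,
                        steps=200, seed=0):
    """Minimise `cost` by randomised local search with a cooling temperature."""
    rng = random.Random(seed)
    current = best = init
    t = t0
    for _ in range(steps):
        candidate = neighbor(current, rng)
        delta = cost(candidate) - cost(current)
        # Always accept improvements; accept worse moves with prob exp(-delta/t).
        if delta < 0 or rng.random() < math.exp(-delta / t):
            current = candidate
            if cost(current) < cost(best):
                best = current
        t *= cooling  # geometric cooling schedule
    return best

# Placeholder tuning problem: find x minimising (x - 3)^2, starting from 0.
best = simulated_annealing(lambda x: (x - 3.0) ** 2, 0.0,
                           lambda x, r: x + r.uniform(-1.0, 1.0))
```

In the paper's setting the "state" would be a vector of network hyperparameters and the cost a validation error, but the accept/cool loop is the same.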
ARTICLE | doi:10.20944/preprints202304.0996.v1
Subject: Biology And Life Sciences, Biology And Biotechnology Keywords: Convolutional Neural Network; Deep Learning; Photoplethysmography; Respiratory Rate; Time Series
Online: 26 April 2023 (13:17:24 CEST)
Respiratory rate is an important biomarker that indicates changes in the clinical condition of critically ill patients, so a surveillance tool that can accurately monitor changing respiratory rate in real time is needed. After investigating various machine learning models, we propose a new model for real-time respiratory rate estimation from the photoplethysmogram. A new photoplethysmogram-driven respiratory rate dataset (StMary) was collected in the surgical intensive care unit of a tertiary referral hospital using a photoplethysmogram signal collector. For 50 patients and 50 healthy volunteers, a 2-minute photoplethysmogram was collected twice per subject. To estimate each subject's respiratory rate, the signal was input into the deep neural network model we built; the dataset was split into training, validation, and test sets, and 4-fold cross-validation was used. Our deep neural network model, trained on StMary and the two public datasets (BIDMC and CapnoBase) individually or on selectively merged datasets, showed a low error rate in respiratory rate measurement. The model trained on StMary achieved a low mean absolute error (1.0273±0.8965), and the model trained on all three datasets (CapnoBase, BIDMC, and StMary) achieved a lower error rate (1.7359±1.6724) than the model trained on CapnoBase and BIDMC alone (1.9480±1.6751). These results verify the performance of the model in estimating respiratory rate from the photoplethysmogram, and our dataset can serve as clinical research data supporting artificial intelligence models that estimate respiratory rate and tests of whether surveillance tools' monitoring functions work properly.
REVIEW | doi:10.20944/preprints202206.0167.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: deep learning; convolutional neural network; brain tumor classification; clinical application
Online: 13 June 2022 (04:57:42 CEST)
Deep learning has shown remarkable results in every field, especially the biomedical field, due to its ability to exploit large-scale datasets. A convolutional neural network (CNN) is a widely used deep learning approach for solving medical imaging problems. Over the past few years, many studies have focused on CNN-based techniques for brain tumor diagnosis. There are, however, still some critical challenges that CNNs face on the way to clinical application. This study presents a comprehensive review of the current literature on CNN architectures for brain tumor classification. We compare the key achievements in the performance evaluation metrics of the applied classification algorithms. In addition, this review assesses the clinical effectiveness of the included studies to elaborate on the limitations of and directions for future work in this area. No review focusing on the clinical effectiveness of previous works in this field has been published. We believe that this study has the potential to elevate the application of CNN-based deep learning methods in clinical practice and can also serve as a quick reference for biomedical researchers interested in this field.
Subject: Environmental And Earth Sciences, Remote Sensing Keywords: precipitation downscaling; convolutional neural networks; long short term memory networks; hydrological simulation
Online: 2 April 2019 (12:37:11 CEST)
Precipitation downscaling is widely employed to enhance the resolution and accuracy of precipitation products from general circulation models (GCMs). In this study, we propose a novel statistical downscaling method to improve the resolution and accuracy of GCM precipitation prediction for monsoon regions. We develop a deep neural network composed of convolutional and Long Short-Term Memory (LSTM) recurrent modules to estimate precipitation from well-resolved atmospheric dynamical fields. The proposed model is compared against the GCM precipitation product and classical downscaling methods in the Xiangjiang River Basin in South China. Results show considerable improvement over the ECMWF-Interim reanalysis precipitation, and the model outperforms benchmark downscaling approaches, including (1) quantile mapping, (2) support vector machines, and (3) convolutional neural networks. To test the robustness of the model and its applicability to practical forecasting, we apply the trained network to precipitation prediction forced by retrospective forecasts from the ECMWF model. Compared to the ECMWF precipitation forecast, our model makes better use of the resolved dynamical fields and gives more accurate precipitation predictions at lead times from 1 day up to 2 weeks. This superiority decreases with forecast lead time, as the GCM's skill in predicting atmospheric dynamics is diminished by chaotic effects. Finally, we build a distributed hydrological model and force it with different sources of precipitation input. Hydrological simulation forced with the neural network precipitation estimates shows a significant advantage over simulation forced with the original ERA-Interim precipitation (the NSE value increases from 0.06 to 0.64), and its performance is only slightly worse than simulation forced with observed precipitation (NSE = 0.82). This further proves the value of the proposed downscaling method and suggests its potential for hydrological forecasting.
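The Nash-Sutcliffe efficiency (NSE) quoted at the end compares simulation error against the variance of the observations: NSE = 1 means a perfect match, while NSE = 0 means the simulation is no better than always predicting the observed mean. A minimal sketch of the metric (sample values are illustrative only):

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - sum((o-s)^2) / sum((o-mean(o))^2)."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    var = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / var

# A perfect simulation scores 1.0; predicting the mean everywhere scores 0.0,
# so the jump from 0.06 to 0.64 reported above is a substantial improvement.
```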
ARTICLE | doi:10.20944/preprints201811.0612.v1
Subject: Environmental And Earth Sciences, Geophysics And Geology Keywords: geophysical signal processing; pattern recognition; temporal convolutional neural networks; seismology; deep learning; nuclear treaty monitoring
Online: 29 November 2018 (03:37:48 CET)
The detection of seismic events at regional and teleseismic distances is critical to nuclear treaty monitoring. Traditionally, detecting regional and teleseismic events has required an expensive multi-instrument seismic array; however, in this work we present DeepPick, a novel seismic detection algorithm capable of array-like performance from a single trace. We achieve this directly by training our single-trace detector against labeled events from an array catalog and by utilizing a deep temporal convolutional neural network. The training data consist of all arrivals in the International Seismological Centre catalog for seven seismic arrays over a five-year window from 1 Jan 2010 to 1 Jan 2015, yielding a total training set of 608,362 detections. The test set consists of the same seven arrays over a one-year window from 1 Jan 2015 to 1 Jan 2016. We report our results by training the algorithm on six of the arrays and testing it on the seventh, so as to demonstrate the transportability and generalization of the technique to new stations. Detection performance against this test set is strong: fixing a type-I error rate of 1%, the algorithm achieves an overall recall of 73% on the 141,095 array beam picks in the test set, yielding 102,394 correct detections. This is more than 4 times the 23,259 detections found in the analyst-reviewed single-trace catalogs over the same period, and represents an 8 dB improvement in detector sensitivity over current methods. These results demonstrate the potential of our algorithm to significantly enhance the effectiveness of the global treaty monitoring network.
ARTICLE | doi:10.20944/preprints202309.1681.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: convolutional neural network; Fusarium wilt; transfer learning; ResNet-50; banana crop
Online: 25 September 2023 (11:29:36 CEST)
During the 1950s, the Gros Michel banana cultivar was nearly wiped out by the incurable Fusarium wilt, also known as Panama disease. Originating in Southeast Asia, Fusarium wilt is a banana pandemic that has been threatening the multi-billion-dollar banana industry worldwide. The disease is caused by a fungus that spreads rapidly through the soil and into the roots of banana plants. Currently, the only way to stop the spread of this disease is for farmers to manually inspect and remove infected plants as quickly as possible, which is a time-consuming process. The main purpose of this study is to build a deep convolutional neural network (CNN) using a transfer learning approach to rapidly identify Fusarium wilt infections on banana crop leaves. We chose the ResNet50 architecture as the base CNN model for our transfer learning approach owing to its remarkable performance in image classification, demonstrated by its victory in the ImageNet competition. After initial training and fine-tuning on a data set of 300 healthy and diseased images, the CNN model achieved a near-perfect accuracy of 0.99; ResNet50's distinctive residual block structure could be the reason behind these results. To evaluate the model, 500 test images, consisting of 250 diseased and 250 healthy banana leaf images, were classified. The deep CNN model achieved an accuracy of 0.98 and an F1 score of 0.98 by correctly identifying the class of 492 of the 500 images. These results show that this DCNN model outperforms existing models, such as the deep CNN model of Sangeetha et al. (2023), by at least 0.07 in accuracy and is a viable option for identifying Fusarium wilt in banana crops.
ARTICLE | doi:10.20944/preprints202208.0029.v1
Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: heme distortion; pocket conformation; convolutional neural network; machine learning
Online: 2 August 2022 (03:20:13 CEST)
Heme proteins serve diverse and pivotal biological functions; clarifying the mechanisms underlying these functions is therefore a crucial scientific topic. Distortion of the heme porphyrin is one of the key factors regulating the chemical properties of heme. Here, we constructed convolutional neural network models for predicting heme distortion from the tertiary structure of the heme-binding pocket to examine their correlation. For saddling, ruffling, doming, and waving distortions, the experimentally determined and predicted values were closely correlated. Furthermore, we assessed the correlation between the cavity shape and molecular structure of heme and demonstrated that hemes in protein pockets with similar structures exhibit near-identical structures, indicating that heme distortion is regulated through the protein environment. These findings indicate that the tertiary structure of the heme-binding pocket regulates the distortion of the heme porphyrin, thereby controlling the chemical properties of heme relevant to protein function; this implies a structure–function correlation in heme proteins.
ARTICLE | doi:10.20944/preprints201910.0319.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: convolutional networks; satellite imagery; predictive modeling; disease density; urban housing; developing country
Online: 28 October 2019 (11:41:25 CET)
The rapid increase in digital data, coupled with advances in deep learning algorithms, is opening unprecedented opportunities for incorporating multiple data sources into models of the spatial dynamics of human infectious diseases. We used convolutional neural networks (CNNs) in conjunction with satellite imagery-based urban housing and socio-economic data to predict disease density in a developing-country setting. We explored both single-input (unimodal) and multiple-input (multimodal) network architectures for this purpose. We achieved a maximum test set accuracy of 81.6% using a single-input CNN model built with one convolutional layer and trained on housing image data. However, this fairly good performance was biased in favor of specific disease density classes due to an unbalanced data set, despite our use of methods to address the problem. These results suggest that CNNs are promising for modeling the spatial dynamics of human infectious diseases, especially in a developing-country setting, and that urban housing signals extracted from satellite imagery are suitable for this purpose in the same context.
ARTICLE | doi:10.20944/preprints202305.1228.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: grape; Appearance quality; Classification; Convolutional neural network; Transfer learning; Support vector machine
Online: 17 May 2023 (10:28:16 CEST)
Grapes are a globally popular fruit, with grape cultivation worldwide second only to citrus. This article addresses the low efficiency and accuracy of traditional manual grading of the external appearance of red grapes and proposes a small-sample red grape external appearance grading model based on transfer learning with convolutional neural networks (CNNs). Initially, the CNN transfer learning method was used to transfer the AlexNet, VGG16, GoogleNet, InceptionV3, and ResNet50 network models, pre-trained on the ImageNet image dataset, to the red grape image grading task. By comparing the classification performance of these five CNN models of different network depths after fine-tuning, ResNet50 with a learning rate of 0.001 and 10 training iterations was determined to be the best feature extractor for red grape images. Moreover, given the small number of red grape image samples in this study, the features output by different convolutional layers of the ResNet50 feature extractor were analyzed layer by layer to determine the effect of the deep features extracted by each convolutional layer on SVM classification performance. This analysis yielded a ResNet50+SVM red grape external appearance grading model based on the optimal ResNet50 feature extraction strategy. Experimental data showed that the classification model constructed using the feature parameters extracted from the 10th node of the ResNet50 network achieved an accuracy of 95.08% for red grape grading. These results provide a reference for the online grading of red grape clusters based on external appearance quality and have guiding significance for the quality and efficiency of grape industry circulation and production.
ARTICLE | doi:10.20944/preprints202211.0094.v1
Subject: Engineering, Mechanical Engineering Keywords: Bearing fault feature extraction; Blind deconvolution (BD); Multi-task optimization; Convolutional neural network
Online: 4 November 2022 (13:41:46 CET)
Blind deconvolution (BD) is an effective method for pre-processing vibration signals to assist in bearing fault diagnosis. Currently, most BD methods design an optimization criterion and use frequency- or time-domain information independently to optimize a deconvolution filter, which recovers the weak periodic impulses related to incipient faults. However, random noise interference may cause the optimizer to overfit: time-domain-based BD methods tend to extract a fault-unrelated single-peak impulse, while frequency-domain-based BD methods tend to retain the maximum-energy frequency component and thereby lose the fault-related harmonic frequency components. To address this issue, we propose a hybrid criterion that combines kurtosis for time-domain optimization with the $G-l_1/l_2$ norm for the frequency domain. Because these two criteria are monotonically increasing and decreasing, respectively, they mutually constrain each other to avoid overfitting. We then design a multi-task one-dimensional convolutional neural network with time and frequency branches to achieve an optimal solution for this hybrid criterion; the multi-task network optimizes both domains simultaneously. Experimental results show that our proposed method outperforms other state-of-the-art methods.
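As an illustration of the two ingredients of such a hybrid criterion, a minimal sketch of sample kurtosis (a time-domain impulsiveness measure) and an $l_1/l_2$-style sparsity ratio for a magnitude spectrum follows; the exact $G-l_1/l_2$ norm used by the authors is not defined in the abstract, so the second function is only a generic stand-in:

```python
import math

def kurtosis(x):
    """Normalized fourth moment; larger for impulsive (fault-like) signals."""
    n = len(x)
    mu = sum(x) / n
    m2 = sum((v - mu) ** 2 for v in x) / n
    m4 = sum((v - mu) ** 4 for v in x) / n
    return m4 / m2 ** 2

def l1_l2_ratio(spectrum_mag):
    """l1/l2 norm ratio of a magnitude spectrum; smaller values indicate
    sparser spectra, i.e., energy concentrated in a few harmonics."""
    l1 = sum(spectrum_mag)
    l2 = math.sqrt(sum(v * v for v in spectrum_mag))
    return l1 / l2
```

An impulse train scores a high kurtosis, while a spectrum dominated by a few harmonic lines scores a low $l_1/l_2$ ratio; optimizing both simultaneously is what constrains each criterion against the failure modes described above.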
ARTICLE | doi:10.20944/preprints201807.0086.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: vibration measurement; frequency prediction; deep learning; convolutional neural network; photogrammetry; computer vision; non-contact measurement
Online: 5 July 2018 (08:31:00 CEST)
Vibration measurement serves as the basis for various engineering practices such as natural frequency or resonant frequency estimation. As image acquisition devices become cheaper and faster, vibration measurement and frequency estimation through image sequence analysis continue to receive increasing attention. In the conventional photogrammetry and optical methods of frequency measurement, vibration signals are first extracted before implementing the vibration frequency analysis algorithm. In this work, we demonstrated that frequency prediction can be achieved using a single feed-forward convolutional neural network. The proposed method is verified using a vibration signal generator and excitation system, and the result obtained was compared with that of an industrial contact vibrometer in a real application. Our experimental results demonstrate that the proposed method can achieve acceptable prediction accuracy even in unfavorable field conditions.
ARTICLE | doi:10.20944/preprints202212.0010.v1
Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: structure–function correlation; active site conformation; convolutional neural network; machine learning
Online: 1 December 2022 (04:11:37 CET)
Structure–function relationships in proteins are a crucial scientific topic. Heme proteins have diverse and pivotal biological functions; clarifying their structure–function correlation is therefore significant for understanding their functional mechanisms and is informative for various fields of science. In this study, we constructed convolutional neural network models that predict protein function from the tertiary structures of the heme-binding sites (active sites) of heme proteins to examine the structure–function correlation. As a result, we successfully classified oxygen-binding proteins (OB), oxidoreductases (OR), proteins with both functions (OB–OR), and electron transport proteins (ET) with high accuracy. Although the misclassification rate between OR and ET was high, the rates between OB and ET and between OB and OR were almost zero, indicating that the prediction model works well between protein groups with very different functions. However, predicting the function of proteins modified with amino acid mutation(s) remains a challenge. Our findings indicate a structure–function correlation in the active site of heme proteins. This study is expected to be applicable to the prediction of more detailed protein functions, such as catalytic reactions.
ARTICLE | doi:10.20944/preprints202209.0190.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: green coffee bean; lightweight framework; deep convolutional neural network; explainable model; random optimization
Online: 14 September 2022 (04:04:05 CEST)
In recent years, the demand for coffee has increased tremendously. During production, green coffee beans are traditionally screened manually for defective beans before being packed; however, this method is not only time-consuming but also prone to human error due to fatigue. Therefore, this paper proposes a lightweight deep convolutional neural network (LDCNN) for a green coffee bean quality detection system, combining depthwise separable convolution (DSC), squeeze-and-excitation blocks (SE blocks), skip blocks, and other components. To counteract the training difficulties arising from the lightweight model's small parameter count, rectified Adam (RA), lookahead (LA), and gradient centralization (GC) were included to improve efficiency; the model was also deployed on an embedded system. Finally, the local interpretable model-agnostic explanations (LIME) model was employed to explain the model's predictions. The experimental results indicated that the model's accuracy could reach up to 98.38% and its F1 score as high as 98.24% when detecting the quality of green coffee beans, while requiring less computing time and fewer parameters. Moreover, the interpretability analysis verified that the lightweight model in this work is reliable, providing screening personnel with a basis for understanding its judgments and thereby improving the model's classification and prediction.
ARTICLE | doi:10.20944/preprints202210.0112.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: ARIMA; convolutional neural network; Kalman filter; passenger flow; transportation; short-term prediction; stochastic model
Online: 10 October 2022 (03:05:34 CEST)
Passenger flow prediction is very significant for transportation sustainability, owing to the traffic congestion encountered by road users traveling to offices, schools, or markets at the start and close of each day. This problem is characteristic of the transportation system of the Federal University of Technology Minna, Nigeria. However, the prevailing technique of passenger flow estimation is non-parametric, depends on fixed planning, and is easily affected by noise. In this research, we propose a hybrid intelligent passenger frequency prediction model using the Auto-Regressive Integrated Moving Average (ARIMA) linear model, a Convolutional Neural Network (CNN), and the Kalman Filter Algorithm (KFA). The frequency of passenger arrivals at the bus terminals is obtained and enumerated through closed-circuit television (CCTV) and modeled using the Markovian Queueing Systems Model (MQSM). The ARIMA model was used for learning and prediction, and its results were compared with those of the combined CNN-KFA technique. The autocorrelation function (ACF) and partial autocorrelation function (PACF) were used to examine the stationarity of data with different features. The performance of the models in describing the short-term passenger flow frequency at each terminal was analyzed and evaluated using the Mean Absolute Percentage Error (MAPE) and Mean Squared Error (MSE). The CNN-Kalman filter model was fitted to the short-term series, and its MAPE values are below 10%. The MSE shows that the CNN-Kalman filter model has the overall best performance, outperforming the ARIMA model 83.33% of the time, and provides high forecasting accuracy.
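The two error measures used to compare the models are standard; a minimal sketch of their definitions (not the authors' code) is:

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (actual values must be non-zero)."""
    n = len(actual)
    return 100.0 / n * sum(abs((a - p) / a) for a, p in zip(actual, predicted))

def mse(actual, predicted):
    """Mean Squared Error."""
    n = len(actual)
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n
```

A MAPE below 10%, as reported for the CNN-Kalman filter model, means predicted arrival frequencies deviate from observed ones by less than a tenth on average.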
ARTICLE | doi:10.20944/preprints202110.0375.v1
Subject: Medicine And Pharmacology, Neuroscience And Neurology Keywords: Brain-Computer Interface (BCI); Convolutional neural network (CNN); Electroencephalogram (EEG); Explainable artificial intelligence (XAI)
Online: 26 October 2021 (11:45:00 CEST)
Functional connectivity (FC) is a potential candidate that can increase the performance of brain-computer interfaces (BCIs) in the elderly because of its compensatory role in neural circuits. However, it is difficult to decode FC by current machine learning techniques because of a lack of its physiological understanding. To investigate the suitability of FC in BCI for the elderly, we propose the decoding of lower- and higher-order FCs using a convolutional neural network (CNN) in six cognitive-motor tasks. The layer-wise relevance propagation (LRP) method describes how age-related changes in FCs impact BCI applications for the elderly compared to younger adults. Seventeen younger (24.5±2.7 years) and twelve older (72.5±3.2 years) adults were recruited to perform tasks related to hand-force control with or without mental calculation. CNN yielded a six-class classification accuracy of 75.3% in the elderly, exceeding the 70.7% accuracy for the younger adults. In the elderly, the proposed method increases the classification accuracy by 88.3% compared to the filter-bank common spatial pattern (FBCSP). LRP results revealed that both lower- and higher-order FCs were dominantly overactivated in the prefrontal lobe depending on task type. These findings suggest a promising application of multi-order FC with deep learning on BCI systems for the elderly.
ARTICLE | doi:10.20944/preprints202306.1435.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: Application-Specific Integrated Circuits; Approximate Multiplier; CMOS; Convolutional Neural Network; Depthwise Separable Convolution; Processing Element
Online: 21 June 2023 (03:49:43 CEST)
For Convolutional Neural Networks (CNNs), the Depthwise Separable CNN (DSCNN) is a preferred architecture for Application-Specific Integrated Circuit (ASIC) implementation on edge devices, and it can benefit from the multi-mode approximate multiplier proposed in this work. The proposed approximate multiplier uses two 4-bit multiplication operations to implement a 12-bit multiplication by reusing the same multiplier array. With this approximate multiplier, sequential multiplication operations are pipelined in a modified DSCNN to fully utilize the processing element (PE) array in the convolutional layer. This approximate DSCNN (A-DSCNN) was implemented in a TSMC 40-nm CMOS process with a supply voltage of 0.9 V. At a clock frequency of 200 MHz, the design achieves 4.78 GOPs/mW while occupying a 1.24 mm × 1.24 mm silicon area. Compared to a conventional DSCNN implemented in a similar process node, the chip area and power consumption were reduced by 53% and 25%, while the throughput was improved by 17%.
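The abstract does not detail the multiplier's truncation scheme, but the general idea of approximating a wide multiplication with narrow-segment products can be sketched as follows; keeping only the 4 most significant bits of each operand is an illustrative assumption, not the authors' design:

```python
def top4(x):
    """Reduce a non-negative integer to its 4 most significant bits.
    Returns (segment, shift) such that segment << shift approximates x."""
    if x < 16:
        return x, 0
    shift = x.bit_length() - 4
    return x >> shift, shift

def approx_mul(a, b):
    """Approximate a wide multiplication using a single 4x4-bit product."""
    sa, ka = top4(a)
    sb, kb = top4(b)
    return (sa * sb) << (ka + kb)
```

Schemes of this kind trade a bounded relative error for a much smaller multiplier array, which is why they suit quantized CNN inference on edge ASICs.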
ARTICLE | doi:10.20944/preprints202011.0508.v2
Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: Deep Learning; Convolutional Neural Network; IoT Networks; Cyber-attack detection; Cyber-attack Classification
Online: 17 December 2020 (12:14:00 CET)
With the rapid expansion of intelligent resource-constrained devices and high-speed communication technologies, the Internet of Things (IoT) has earned wide recognition as the primary standard for low-power lossy networks (LLNs). Nevertheless, IoT infrastructures are vulnerable to cyber-attacks due to the constraints on the computation, storage, and communication capacity of endpoint devices. On the one hand, the majority of newly developed cyber-attacks are formed by slightly mutating formerly established ones, producing attacks that tend to be treated as normal traffic by the IoT network. On the other hand, coupling deep learning techniques with the cybersecurity field has become a recent inclination of many security applications due to their impressive performance. In this paper, we provide a comprehensive development of a new intelligent and autonomous deep-learning-based detection and classification system for cyber-attacks in IoT communication networks, leveraging the power of convolutional neural networks and abbreviated IoT-IDCS-CNN. The proposed IoT-IDCS-CNN makes use of high-performance computing on CUDA-based Nvidia GPUs and parallel processing on high-speed multi-core Intel i9 CPUs. In particular, the proposed system is composed of three subsystems: a feature engineering subsystem, a feature learning subsystem, and a traffic classification subsystem, all of which were developed, verified, integrated, and validated in this research. To evaluate the developed system, we employed the NSL-KDD dataset, which includes all the key attacks in IoT computing. The simulation results demonstrated more than 99.3% and 98.2% cyber-attack classification accuracy for the binary-class classifier (normal vs. anomaly) and the multi-class classifier (five categories), respectively. The proposed system was validated using k-fold cross-validation and evaluated using the confusion matrix parameters (TN, TP, FN, FP) along with other classification performance metrics, including precision, recall, F1-score, and false alarm rate. The test and evaluation results of the IoT-IDCS-CNN system outperformed many recent machine-learning-based IDCS systems in the same area of study.
ARTICLE | doi:10.20944/preprints202308.1351.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: modular neural networks; convolutional neural networks; recurrent neural networks; rational choice theory; price fluctuations; sentiment analysis; Forex prediction
Online: 18 August 2023 (09:36:17 CEST)
Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) have been utilised to forecast the foreign exchange market (Forex). However, such models usually exhibit unstable behaviour, as data perturbations often degrade the functionality of the entire network due to its monolithic architecture. Hence, this study proposes a novel neuroscience-informed modular network, applied to closing prices and sentiments retrieved from the Yahoo Finance and Twitter APIs, that aims to anticipate price fluctuations in the Euro to British Pound Sterling (EUR/GBP) pair better than monolithic methods. The proposed model is based on a new modular CNN that replaces pooling layers with orthogonal-kernel-initialised RNNs coupled with Monte Carlo Dropout (MCD), namely MCoRNNMCD. It combines two modules: i) a convolutional simple RNN and ii) a convolutional Gated Recurrent Unit (GRU), where orthogonality and MCD are added to reduce overfitting and assess each module's uncertainty. These parallel feature extraction modules concatenate their outputs into a final three-layer Artificial Neural Network (ANN) decision-making module. A comprehensive comparison using objective evaluation metrics such as the Mean Square Error (MSE) showed that the proposed MCoRNNMCD-ANN outperformed single CNN, LSTM, and GRU models and the state-of-the-art hybrid BiCuDNNLSTM, CLSTM, CNN-LSTM, and LSTM-GRU models in forecasting hourly EUR/GBP closing price fluctuations.
ARTICLE | doi:10.20944/preprints202308.1442.v1
Subject: Physical Sciences, Optics And Photonics Keywords: optical neural network; convolutional neural network; free-space optics; optical computer; smart pixels
Online: 22 August 2023 (03:47:05 CEST)
A scalable optical convolutional neural network (SOCNN) based on free-space optics and Koehler illumination was proposed to address the limitations of the previous 4f correlator system. Unlike Abbe illumination, Koehler illumination provides more uniform illumination and reduces crosstalk. SOCNN allows for scaling up of the input array and the use of incoherent light sources. Hence, the problems associated with 4f correlator systems can be avoided. We analyzed the limitations in scaling the kernel size and parallel throughput and found that SOCNN can offer a multilayer convolutional neural network with massive optical parallelism.
ARTICLE | doi:10.20944/preprints201905.0231.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: Optical Music Recognition; historical document analysis; Medieval manuscripts; neume notation; fully convolutional neural networks
Online: 20 May 2019 (08:45:34 CEST)
Even today, the automatic digitisation of scanned documents in general, and especially the automatic optical music recognition (OMR) of historical manuscripts, remains an enormous challenge, since both handwritten musical symbols and text have to be identified. This paper focuses on the Medieval so-called square notation developed in the 11th-12th century, which is already composed of staff lines, staves, clefs, accidentals, and neumes, the last of which are, roughly speaking, connected single notes. The aim is to develop an algorithm that captures both neume and pitch, i.e., the melody information that can be used to reconstruct the original writing. Our pipeline is similar to the standard OMR approach and comprises a novel staff line and symbol detection algorithm based on deep Fully Convolutional Networks (FCNs), which perform pixel-based predictions for either staff lines or symbols and their respective types. The staff line detection then combines the extracted lines into staves and yields an F1-score of over 99% for detecting both lines and complete staves. For music symbol detection we choose a novel approach that skips the step of identifying neumes and instead directly predicts note components (NCs) and their respective affiliation to a neume; the algorithm also detects clefs and accidentals. Our algorithm recognises these symbols with an F1-score of over 96% if the type is ignored, and predicts the true symbol sequence of a staff with a diplomatic symbol accuracy rate (dSAR) of about 87%. If only the NCs (without their respective connection to a neume), clefs, and accidentals are of interest, the algorithm reaches a harmonic symbol accuracy rate (hSAR) of approximately 90%.
ARTICLE | doi:10.20944/preprints202111.0230.v1
Subject: Engineering, Automotive Engineering Keywords: Convolutional neural network; Driver drowsiness; ECG signal; Heart rate variability; Wavelet scalogram
Online: 12 November 2021 (15:01:50 CET)
Driver drowsiness is one of the leading causes of traffic accidents. This paper proposes a new method for classifying driver drowsiness using deep convolutional neural networks trained on wavelet scalogram images of electrocardiogram (ECG) signals. Three drowsiness classes were defined based on video observation of driving tests performed in a simulator in manual and automated modes. The Bayesian optimization method is employed to optimize the hyperparameters of the designed neural networks, such as the learning rate and the number of neurons in every layer. To benchmark the deep network method, Heart Rate Variability (HRV) data are derived from the ECG signals, features are extracted from these data, and random forest and k-nearest neighbors (KNN) classifiers are used as two traditional methods to classify the drowsiness levels. Results show that the trained deep network achieves balanced accuracies of about 77% and 79% in the manual and automated modes, respectively, whereas the best balanced accuracies obtained with the traditional methods are about 62% and 64%. We conclude that deep networks working with wavelet scalogram images of ECG signals significantly outperform KNN and random forest classifiers trained on HRV-based features.
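Balanced accuracy, the metric reported above, is the mean of per-class recalls; a minimal generic sketch (not the authors' code) is:

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recalls; robust to class imbalance,
    unlike plain accuracy."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        idx = [i for i, t in enumerate(y_true) if t == c]
        hits = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)
```

With three drowsiness classes of unequal size, this metric prevents a classifier that favors the majority class from appearing deceptively accurate.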
ARTICLE | doi:10.20944/preprints202203.0288.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: computer-aided detection; convolutional neural network; COVID-19; deep learning; image classification
Online: 22 March 2022 (02:19:50 CET)
Chest radiography is one of the critical tools for early detection and subsequent evaluation of the incidence of lung diseases. This study presents a real-world implementation of a convolutional neural network (CNN)-based Carebot Covid app to detect COVID-19 from chest X-ray (CXR) images. Our proposed model takes the form of a simple and intuitive application, and the CNN can be deployed as a STOW-RS prediction endpoint for direct integration into DICOM viewers. The results of this study show that the deep learning model, based on DenseNet and ResNet architectures, can detect SARS-CoV-2 from CXR images with a precision of 0.981, a recall of 0.962, and an AP of 0.993.
ARTICLE | doi:10.20944/preprints202308.0825.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: max-pooling; convolutional neural network (CNN); FPGA; rank tracking based max-pooling (RTB-MAXP); cascaded maximum based max-pooling (CMB-MAXP)
Online: 10 August 2023 (05:48:50 CEST)
This paper proposes two max-pooling engines, named the RTB-MAXP engine and the CMB-MAXP engine, with a scalable window size parameter for FPGA-based convolutional neural network (CNN) implementation. The max-pooling operation for a CNN can be decomposed into two stages, i.e., a horizontal-axis max-pooling operation and a vertical-axis max-pooling operation. These two one-dimensional max-pooling operations are performed by tracking the rank of the values within the window in the RTB-MAXP engine and by cascading maximum operations on the values in the CMB-MAXP engine. Both the RTB-MAXP engine and the CMB-MAXP engine were implemented in VHSIC Hardware Description Language (VHDL) and verified by simulation. They have been employed and tested in our CNN accelerator targeting the YOLOv4-CSP-S-Leaky CNN model for object detection.
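The two-stage decomposition described above can be sketched in Python (a generic illustration of the separable structure, not the authors' VHDL):

```python
def maxpool1d(row, k):
    """Non-overlapping 1-D max-pooling with window (and stride) k."""
    return [max(row[i:i + k]) for i in range(0, len(row) - k + 1, k)]

def maxpool2d_separable(img, k):
    """2-D max-pooling decomposed into a horizontal pass followed by a
    vertical pass, mirroring the two-stage structure described above."""
    horiz = [maxpool1d(r, k) for r in img]        # horizontal-axis stage
    cols = list(zip(*horiz))                      # transpose
    vert = [maxpool1d(list(c), k) for c in cols]  # vertical-axis stage
    return [list(r) for r in zip(*vert)]          # transpose back
```

Because max is associative, the separable result equals direct 2-D max-pooling, while each hardware engine only ever needs a one-dimensional window.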
ARTICLE | doi:10.20944/preprints202306.0942.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: deep learning; partial discharge; convolutional neural network; medium voltage switchgear; air-insulated switchgear; autoencoder; long short-term memory
Online: 13 June 2023 (13:55:13 CEST)
The correct classification of defects originating from partial discharges (PD) in medium-voltage (MV) air-insulated switchgear (AIS) remains a challenging research topic for scientists worldwide. In this article, the authors simulated four possible defects occurring in the power industry, including one that is a simultaneous combination of two common ones. In addition, the correctness of the algorithm was checked by adding a fault-free classification class. The measurement signals were recorded with TEV sensors. The effectiveness of various hybrid-connected neural networks was tested and discussed: GoogleNet and SqueezeNet based on spectrograms, SAE with FNN, 2D-CNN with LSTM, and a hybrid AE combined with CNN and LSTM. The highest effectiveness, approximately 97%, was demonstrated by the GoogleNet and SqueezeNet networks. The research results are expected to form the basis for the development of a universal and wireless capacitive sensor for monitoring the level of PD in switchgear.
ARTICLE | doi:10.20944/preprints202010.0502.v1
Subject: Environmental And Earth Sciences, Atmospheric Science And Meteorology Keywords: Statistical downscaling; Generative Adversarial Network; Combination of Errors; Convolutional Neural Network; multi-scale structural similarity index; Wasserstein GAN
Online: 25 October 2020 (19:33:49 CET)
Despite numerous studies in statistical downscaling methodology, there remains a lack of methods that can downscale precipitation modeled in global climate models to regional-level high-resolution gridded precipitation. This paper reports a novel downscaling method using a Generative Adversarial Network (GAN), CliGAN, which can downscale large-scale annual maximum precipitation given by simulations of multiple atmosphere-ocean global climate models (AOGCMs) from the Coupled Model Intercomparison Project 6 (CMIP6) to regional-level gridded annual maximum precipitation data. The framework utilizes a convolutional encoder-dense decoder network to create the generative network and a similar network to create the critic network, and the model is trained using an adversarial approach. The critic uses the Wasserstein distance loss function, and the generator is trained using a combination of the adversarial Wasserstein distance loss, a structural loss based on the multi-scale structural similarity index (MSSIM), and a content loss based on the Nash-Sutcliffe model efficiency (NS). The MSSIM index allowed us to gain insight into the model's regional characteristics and shows that relying exclusively on point-based error functions, widely used in statistical downscaling, may not be enough to reliably simulate regional precipitation characteristics. Further use of structural loss functions within CNN-based downscaling methods may lead to higher-quality downscaled climate model products.
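A minimal sketch of such a combined generator loss, assuming unit weights, a single-scale whole-image SSIM in place of the multi-scale MSSIM, and a scalar critic score (all simplifications of the paper's setup):

```python
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe model efficiency: 1 is a perfect match."""
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def ssim_global(x, y, c1=1e-4, c2=9e-4):
    """Single-scale, whole-image SSIM (stand-in for the multi-scale MSSIM)."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx**2 + my**2 + c1) * (vx + vy + c2))

def generator_loss(critic_score_fake, obs, sim, w_adv=1.0, w_struct=1.0, w_content=1.0):
    """Adversarial (Wasserstein) + structural (SSIM) + content (NSE) terms."""
    return (w_adv * -critic_score_fake            # generator wants a high critic score
            + w_struct * (1.0 - ssim_global(obs, sim))
            + w_content * (1.0 - nse(obs, sim)))

rng = np.random.default_rng(0)
obs = rng.random((8, 8))
# a perfect downscaled field with a neutral critic score incurs (near-)zero loss
loss_perfect = generator_loss(critic_score_fake=0.0, obs=obs, sim=obs.copy())
```

The structural and content terms are written as (1 - similarity) so that, like the adversarial term, lower is better.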
ARTICLE | doi:10.20944/preprints201903.0039.v2
Subject: Engineering, Control And Systems Engineering Keywords: Handwritten digit recognition; Convolutional Neural Network (CNN); Deep learning; MNIST dataset; Epochs; Hidden Layers; Stochastic Gradient Descent; Backpropagation
Online: 20 September 2019 (10:12:26 CEST)
In recent times, with the rise of Artificial Neural Networks (ANNs), deep learning has brought a dramatic twist to the field of machine learning, pushing it closer to Artificial Intelligence (AI). Deep learning is used remarkably widely because of its diverse range of applications, such as surveillance, health, medicine, sports, robotics, and drones. In deep learning, the Convolutional Neural Network (CNN) is at the center of spectacular advances that mix Artificial Neural Networks with up-to-date deep learning strategies. It has been used broadly in pattern recognition, sentence classification, speech recognition, face recognition, text categorization, document analysis, scene recognition, and handwritten digit recognition. The goal of this paper is to observe how the accuracy of a CNN classifying handwritten digits varies with the number of hidden layers and epochs, and to compare the resulting accuracies. For this performance evaluation of the CNN, we performed our experiments using the Modified National Institute of Standards and Technology (MNIST) dataset. The network is trained using stochastic gradient descent and the backpropagation algorithm.
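The stochastic gradient descent and backpropagation machinery mentioned above can be illustrated on a toy model; the single logistic unit below stands in for the paper's CNN, and the task, learning rate, and epoch count are illustrative choices, not the paper's configuration:

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy binary task: the label is 1 when the sum of the two inputs is positive.
X = rng.normal(size=(200, 2))
y = (X.sum(axis=1) > 0).astype(float)

w, b, lr = np.zeros(2), 0.0, 0.5
for epoch in range(50):
    for i in rng.permutation(len(X)):               # stochastic: one sample at a time
        p = 1.0 / (1.0 + np.exp(-(X[i] @ w + b)))   # forward pass (sigmoid)
        grad = p - y[i]                             # backprop through sigmoid + cross-entropy
        w -= lr * grad * X[i]                       # gradient-descent weight update
        b -= lr * grad

acc = (((X @ w + b) > 0).astype(float) == y).mean()
```

In the full CNN the same update rule is applied layer by layer, with backpropagation supplying the gradient for every weight.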
ARTICLE | doi:10.20944/preprints201808.0034.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: android; malware; convolutional neural network
Online: 2 August 2018 (06:12:48 CEST)
The Android platform has reached roughly an eighty percent share of the smartphone market, and according to the aforementioned reports this popularity has made it attackers' primary target. With a growing amount of private data stored on smartphones and weak security defenses, attackers can use multiple ways to attack users' devices (e.g., using different coding styles to confuse malware-detection software). Existing Android malware detection methods use multiple features, such as security-sensitive API calls, system calls, control-flow structures, and data-flow information, and then apply machine learning to decide whether an app is malware. Each of these features captures a distinct property of an app and has its own limitations; that is to say, it may suit some specific attacks but not others. Most current malware detection methods use only one of the aforementioned features and rely mainly on static code analysis, so code obfuscation and zero-day attacks can cause such feature extraction to yield wrong judgments. It is therefore necessary to design an effective analysis technique to detect malware. In this paper, we use word-importance values extracted from the APK: because of code obfuscation, some malware authors only rename variables, which defeats generic static analysis. We feed these importance values through our proposed method to generate an image, and finally use a convolutional neural network to decide whether the APK file is malware.
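A hedged sketch of the importance-to-image idea: per-word importance scores (TF-IDF-like weights, here random stand-ins) are padded and reshaped into a grey-scale image a CNN could consume. The 16x16 size and the max-scaling are assumptions, not the paper's configuration:

```python
import numpy as np

def importance_to_image(scores, side=16):
    """Pad/truncate a vector of per-word importance scores and reshape it
    into a square grey-scale 'image' suitable as CNN input."""
    v = np.zeros(side * side)
    n = min(len(scores), side * side)
    v[:n] = scores[:n]
    if v.max() > 0:
        v = v / v.max()          # scale to [0, 1] pixel intensities
    return v.reshape(side, side)

# stand-in importance scores for ~300 words extracted from an APK
scores = np.abs(np.random.default_rng(6).normal(size=300))
img = importance_to_image(scores)
```

Because the scores depend on which words occur rather than what they are named, renaming variables alone does not change the resulting image much, which is the intuition behind the method.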
ARTICLE | doi:10.20944/preprints202308.0683.v1
Subject: Public Health And Healthcare, Public Health And Health Services Keywords: heart disease; arrhythmia detection; ECG; convolutional neural network; deep learning; convolutional blocks; identity blocks
Online: 9 August 2023 (07:28:33 CEST)
Arrhythmia is a cardiac condition characterized by an irregular heart rhythm that hinders the proper circulation of blood, posing a severe risk to individuals' lives. Globally, arrhythmias are recognized as a significant health concern, accounting for nearly 12 percent of all deaths. As a result, there has been a growing focus on utilizing artificial intelligence for the detection and classification of abnormal heartbeats. In recent years, self-operated heartbeat detection research has gained popularity due to its cost-effectiveness and potential for expediting therapy for individuals at risk of arrhythmias. However, building an efficient automatic heartbeat monitoring approach for arrhythmia identification and classification comes with several significant challenges: addressing data quality issues, determining the range for heartbeat segmentation, managing data imbalance, handling intra- and inter-patient variation, distinguishing supraventricular irregular heartbeats from regular heartbeats, and ensuring model interpretability. In this study, we propose the Reseek-Arrhythmia model, which leverages deep learning techniques to automatically detect and classify heart arrhythmia diseases. The model combines convolutional blocks and identity blocks, along with essential components such as convolution layers, batch normalization layers, and activation layers. To train and evaluate the model, we utilized the MIT-BIH and PTB datasets. Remarkably, the proposed model achieves accuracies of 99.35% and 93.50%, with acceptable losses of 0.688 and 0.2564, on the two datasets respectively.
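An identity block of the kind the model combines can be sketched as follows; dense layers stand in for the convolution layers, and the inference-style batch normalization without learned scale/shift is a simplification of the paper's architecture:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def batch_norm(x, eps=1e-5):
    """Per-feature normalisation (inference-style, no learned scale/shift)."""
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

def identity_block(x, w1, w2):
    """Two weight layers with BN + ReLU, plus the skip connection that
    lets the block fall back to the identity mapping."""
    out = relu(batch_norm(x @ w1))
    out = batch_norm(out @ w2)
    return relu(out + x)          # residual shortcut: add the input back

rng = np.random.default_rng(1)
x = rng.normal(size=(4, 8))       # a small batch of 8-dimensional features
w1 = rng.normal(size=(8, 8)) * 0.1
w2 = rng.normal(size=(8, 8)) * 0.1
y = identity_block(x, w1, w2)
```

The skip connection is what makes identity blocks easy to stack: if the weight layers learn nothing useful, the block still passes its input through.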
ARTICLE | doi:10.20944/preprints201807.0119.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Convolutional Neural Network,Single Shot Detector, Regional Convolutional Neural Network, Machine Learning, Visualization-Localization
Online: 6 July 2018 (14:38:52 CEST)
The emerging use of visualization techniques in pathology and microbiology has been accelerated by machine learning (ML) approaches to image preprocessing, classification, and feature extraction in an increasingly complex series of datasets. Modern Convolutional Neural Network (CNN) architectures have developed into an umbrella of vast image reinforcement and recognition methods, including combined classification-localization of single- and multi-object featured images. As a subtype of neural network, a CNN creates a rapid order of complexity by initially detecting borderlines, edges, and colours in images for dataset construction, eventually becoming capable of mapping intricate objects and conformities. This paper investigates the disparities between TensorFlow object detection APIs, specifically the Single Shot Detector (SSD) MobileNet V1 and the Faster RCNN Inception V2 model, to sample computational trade-offs between accuracy/precision and real-time visualization capabilities. The scenario of rapid ML medical image analysis is theoretically framed in regions with limited access to pathology and disease prevention departments (e.g., developing and impoverished countries). Dark-field microscopy datasets of an initial 62 XML-JPG annotated training files were processed under Malaria and Syphilis classes. Model training was halted as soon as loss values regularized and converged.
ARTICLE | doi:10.20944/preprints202307.1049.v1
Subject: Environmental And Earth Sciences, Remote Sensing Keywords: Invariant Graph Convolutional Network (GCN); Convolutional Neural Network (CNN); Binary quantization; Hyperspectral image (HSI) classification
Online: 17 July 2023 (14:09:49 CEST)
Hyperspectral image and LiDAR image fusion plays a crucial role in remote sensing by capturing spatial relationships and modeling semantic information for accurate classification and recognition. However, existing methods, like Graph Convolutional Networks (GCNs), face challenges in constructing effective graph structures due to variations in local semantic information and limited receptiveness to large-scale contextual structures. To overcome these limitations, we propose an invariant attribute-driven binary bi-branch classification (IABC) method, a unified network that combines a binary Convolutional Neural Network (CNN) and a GCN with invariant attributes. Our approach utilizes a joint detection framework that can simultaneously learn features from small-scale regular regions and large-scale irregular regions, resulting in an enhanced structured representation of HSI and LiDAR images in the spectral-spatial domain. This approach not only improves the accuracy of classification and recognition but also reduces storage requirements and enables real-time decision-making, which is crucial for effectively processing large-scale remote sensing data. Extensive experiments demonstrate the superior performance of our proposed method in hyperspectral image analysis tasks. The combination of CNNs and GCNs allows for accurate modeling of spatial relationships and effective construction of graph structures, and the integration of binary quantization enhances computational efficiency, enabling real-time processing of large-scale data. Our approach therefore presents a promising opportunity for advancing remote sensing applications using deep learning techniques.
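Binary quantization of network weights can be sketched in the XNOR-Net style, keeping only signs plus one scaling factor; whether IABC uses exactly this scheme is an assumption:

```python
import numpy as np

def binarize(w):
    """XNOR-Net-style weight binarisation: keep only the signs plus a single
    scaling factor alpha = mean(|w|) (per-filter scaling is a common refinement)."""
    alpha = np.abs(w).mean()
    return alpha * np.sign(w), alpha

rng = np.random.default_rng(7)
w = rng.normal(size=(3, 3, 16))          # a stand-in 3x3 conv kernel stack
w_bin, alpha = binarize(w)

# Every binarised weight is +alpha or -alpha, so it fits in 1 bit plus one float,
# which is where the storage and compute savings come from.
approx_error = np.linalg.norm(w - w_bin)
```

With weights restricted to two values, multiply-accumulates reduce to sign flips and additions, which is what enables the real-time processing claimed above.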
ARTICLE | doi:10.20944/preprints202307.0326.v1
Subject: Engineering, Mechanical Engineering Keywords: Contact Fatigue; Feature Extraction; Health Index; Degradation Prediction; Temporal Convolutional Network; Convolutional Auto-Encoder Network
Online: 5 July 2023 (14:04:06 CEST)
To accurately predict performance degradation trends, a prediction method based on multi-domain features and a temporal convolutional network (TCN) is proposed. First, a high-dimensional feature set is constructed across multiple domains of the vibration signals, and comprehensive evaluation indicators are used to preliminarily screen performance degradation indexes with good sensitivity and strong trends. Second, the kernel principal component analysis (KPCA) method is adopted to eliminate redundant information between the multi-domain features, and a health index (HI) is constructed based on a convolutional auto-encoder (CAE) network. Third, a TCN-based performance degradation trend prediction model is constructed, and direct multi-step prediction is used to predict the performance degradation trend of the monitored object. The validity of the proposed method is verified on public bearing data, and the method is successfully applied to degradation trend prediction of a rolling contact fatigue specimen. The results show that KPCA reduces the feature set from 14 dimensions to 4 while retaining 98.33% of the information in the original feature set. Furthermore, the CAE-based HI construction method is effective: the constructed HI truly reflects the degradation process of the rolling contact fatigue specimen and has clear advantages over two commonly used HI construction methods, the auto-encoding (AE) network and the Gaussian mixture model (GMM). Finally, the TCN-based prediction model accurately predicts the performance degradation of the specimen, with a root mean square error of 0.0146 and a mean absolute error of 0.0105, outperforming prediction models based on the long short-term memory (LSTM) network and the gated recurrent unit (GRU).
This method has general significance and may be extended to performance degradation prediction of other mechanical equipment and parts.
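The variance-retention step can be illustrated with a linear PCA stand-in for KPCA; the synthetic 14-feature data with 4 latent factors below mirrors the reported 14-to-4 reduction but is not the paper's data:

```python
import numpy as np

def pca_retained(X, target=0.9833):
    """Pick the smallest number of principal components whose cumulative
    explained-variance ratio reaches the target (linear stand-in for KPCA)."""
    Xc = X - X.mean(axis=0)
    # eigenvalues of the covariance matrix, largest first
    vals = np.linalg.eigvalsh(np.cov(Xc, rowvar=False))[::-1]
    ratio = np.cumsum(vals) / vals.sum()
    k = int(np.argmax(ratio >= target) + 1)
    return k, float(ratio[k - 1])

rng = np.random.default_rng(3)
# 14 raw features driven by only 4 independent latent factors plus small noise
latent = rng.normal(size=(300, 4))
mix = rng.normal(size=(4, 14))
X = latent @ mix + 0.01 * rng.normal(size=(300, 14))
k, retained = pca_retained(X)
```

Because only 4 latent factors generate the data, a handful of components captures essentially all the variance, matching the spirit of the 14-to-4 reduction.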
ARTICLE | doi:10.20944/preprints202305.2163.v1
Subject: Engineering, Bioengineering Keywords: chickpea; convolutional neural network; transfer learning; classification
Online: 31 May 2023 (03:32:49 CEST)
Chickpea is one of the most widely consumed pulses globally because of its high protein content. The morphological features of chickpea seeds, such as colour and texture, are observable and play a major role in classifying different chickpea varieties. This process is often carried out by human experts, and is time-consuming, inaccurate, and expensive. The objective of this study was to design an automated chickpea classifier using an RGB colour-image-based model that considers the morphological features of the chickpea seed. As part of the data acquisition process, five hundred and fifty images were collected per variety for four varieties of chickpea (CDC-Alma, CDC-Consul, CDC-Cory, and CDC-Orion) using an industrial RGB camera and a mobile phone camera. Three CNN-based models, NasNet-A (mobile), MobileNetV3 (small), and EfficientNetB0, were evaluated using a transfer learning-based approach. The classification accuracy was 97%, 99%, and 98% for the NasNet-A (mobile), MobileNetV3 (small), and EfficientNetB0 models, respectively. The MobileNetV3 model was selected for further deployment on Android mobile and Raspberry Pi 4 devices based on its higher accuracy and lightweight architecture. The classification accuracy for the four chickpea varieties was 100% when the MobileNetV3 model was deployed on both the Android mobile and Raspberry Pi 4 platforms.
ARTICLE | doi:10.20944/preprints202211.0226.v1
Subject: Computer Science And Mathematics, Analysis Keywords: deep learning; convolutional neural networks; remote sensing
Online: 14 November 2022 (01:20:07 CET)
Deep Learning is an extremely important research topic in Earth Observation. Current use-cases range from semantic image segmentation and object detection to more common computer vision problems such as object identification. Earth Observation is an excellent source of different types of problems and data for Machine Learning in general and Deep Learning in particular, and it can be argued that both fields of research will benefit greatly from this recent trend. In this paper we take several state-of-the-art Deep Learning network topologies and provide a detailed analysis of their performance in semantic image segmentation for building footprint detection. The dataset used comprises high-resolution images depicting urban scenes. We focused on single-model performance on simple RGB images. In most situations, several methods are applied to increase prediction accuracy in deep learning, such as ensembling, alternating between optimisers during training, and using pretrained weights to bootstrap new models. These methods, although effective, are not indicative of single-model performance. Instead, in this paper, we present different variations of these state-of-the-art topologies and study how these variations affect both training convergence and out-of-sample, single-model performance.
ARTICLE | doi:10.20944/preprints202111.0186.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: Explainable AI; Convolutional Neural Network; Network Compression
Online: 9 November 2021 (15:03:27 CET)
Model understanding is critical in many domains, particularly those involved in high-stakes decisions, e.g., medicine, criminal justice, and autonomous driving. Explainable AI (XAI) methods are essential for working with black-box models such as Convolutional Neural Networks. This paper evaluates the explainability of the Deep Neural Network (DNN) traffic sign classifier from the Programmable Systems for Intelligence in Automobiles (PRYSTINE) project. The resulting explanations were then used to compress the PRYSTINE CNN classifier by pruning its vague (low-impact) kernels, and the precision of the classifier was evaluated under different pruning scenarios. The proposed methodology was realised by creating the original traffic sign and traffic light classification and explanation code. First, the status of the network's kernels was evaluated for explainability: a post-hoc, local, meaningful-perturbation-based forward explanation method was integrated into the model to evaluate the status of each kernel, enabling high- and low-impact kernels in the CNN to be distinguished. Second, the vague kernels of the last layer before the fully connected layer were excluded by withdrawing them from the network. Third, the network's precision was evaluated at different kernel compression levels. It is shown that with this XAI-based approach to kernel compression, pruning 5% of the kernels leads to only a 1% loss in traffic sign and traffic light classification precision. The proposed methodology is crucial where execution time and processing capacity prevail.
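The perturbation-based probe can be sketched as follows: each kernel's feature map is zeroed in turn and the drop in a model score is recorded; kernels whose removal barely changes the score are the "vague" candidates for pruning. The toy score function and the 25% pruning fraction are assumptions for illustration:

```python
import numpy as np

def kernel_importance(feature_maps, score_fn):
    """Score each kernel by how much the model output drops when its
    feature map is zeroed out (a simple perturbation-based probe)."""
    base = score_fn(feature_maps)
    drops = []
    for k in range(feature_maps.shape[0]):
        perturbed = feature_maps.copy()
        perturbed[k] = 0.0                 # "withdraw" kernel k
        drops.append(base - score_fn(perturbed))
    return np.array(drops)

rng = np.random.default_rng(5)
maps = rng.normal(size=(20, 8, 8))         # 20 kernels' feature maps
weights = np.zeros(20)
weights[:3] = 1.0                          # in this toy model only 3 kernels matter
score = lambda m: float((weights[:, None, None] * m).sum())

drops = kernel_importance(maps, score)
vague = np.argsort(np.abs(drops))[: int(0.25 * len(drops))]   # low-impact kernels
assert not set(vague) & {0, 1, 2}          # the pruned 25% excludes impactful kernels
```

The same ranking, applied to the real classifier's last convolutional layer, is what allows 5% of kernels to be pruned at only a 1% precision cost.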
ARTICLE | doi:10.20944/preprints202007.0379.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Transfer Learning; Convolutional Neural Networks; Emotion Recognition
Online: 17 July 2020 (13:58:18 CEST)
The paper presents the first research on mouth-based Emotion Recognition (ER) adopting a Transfer Learning (TL) approach. Transfer Learning is paramount for mouth-based ER because few data sets are available, and most of them include emotional expressions simulated by actors instead of adopting a real-world categorization. With TL we can use less training data than training a whole network from scratch, thus more efficiently fine-tuning the network with emotional data and improving the convolutional neural network's accuracy in the desired domain. The proposed approach aims at improving Emotion Recognition dynamically, taking into account not only new scenarios but also situations modified with respect to the initial training phase, because the image of the mouth can be available even when the whole face is visible only from an unfavourable perspective. Typical applications include automated supervision of bedridden critical patients in a healthcare management environment, or portable applications supporting disabled users who have difficulties seeing or recognizing facial emotions. This work takes advantage of previous preliminary work on mouth-based emotion recognition using CNN deep learning, and has the further benefit of testing and comparing a set of networks on large data sets for face-based emotion recognition that are well known in the literature. The final result is not directly comparable with work on full-face ER, but it highlights the significance of the mouth in emotion recognition, obtaining consistent performance in the visual emotion recognition domain.
ARTICLE | doi:10.20944/preprints201812.0090.v3
Subject: Engineering, Control And Systems Engineering Keywords: deep convolutional neural networks; multi-class segmentation; global convolutional network; channel attention; transfer learning; ISPRS Vaihingen; Landsat-8
Online: 4 January 2019 (11:47:42 CET)
In the remote sensing domain, it is crucial to perform semantic segmentation of raster images into classes such as river, building, and forest. A deep convolutional encoder-decoder (DCED) network is the state-of-the-art semantic segmentation method for remotely sensed images. However, its accuracy is still limited, since the network is not designed for remotely sensed images and the training data in this domain are deficient. In this paper, we propose a novel CNN for semantic segmentation of remote sensing corpora with three main contributions. First, we propose applying a recent CNN called a global convolutional network (GCN), since it can capture different resolutions by extracting multi-scale features from different stages of the network; we further enhance the network by giving its backbone a larger number of layers, which is suitable for medium-resolution remotely sensed images. Second, "channel attention" is introduced into our network to select the most discriminative filters (features). Third, "domain-specific transfer learning" is introduced to alleviate the data scarcity issue by utilizing other remotely sensed corpora with different resolutions as pre-training data. The experiments were conducted on two datasets: (i) medium-resolution data collected from the Landsat-8 satellite and (ii) very high resolution data from the ISPRS Vaihingen Challenge Dataset. The results show that our networks outperformed DCED in terms of F1 score by 17.48% and 2.49% on the medium and very high resolution corpora, respectively.
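The "channel attention" mechanism can be sketched in the squeeze-and-excitation style, in which channels are re-weighted by a gate computed from their global statistics; whether the paper uses exactly this formulation is an assumption:

```python
import numpy as np

def channel_attention(feat, w1, w2):
    """Squeeze-and-excitation-style channel attention: global-average-pool
    each channel, pass through a small bottleneck, and re-weight the channels."""
    squeeze = feat.mean(axis=(1, 2))                 # (C,) global channel context
    hidden = np.maximum(squeeze @ w1, 0.0)           # channel reduction + ReLU
    scale = 1.0 / (1.0 + np.exp(-(hidden @ w2)))     # sigmoid gate in (0, 1)
    return feat * scale[:, None, None]               # emphasise useful filters

rng = np.random.default_rng(2)
feat = rng.normal(size=(32, 16, 16))                 # C x H x W feature map
w1 = rng.normal(size=(32, 8)) * 0.1                  # reduction ratio 4
w2 = rng.normal(size=(8, 32)) * 0.1
out = channel_attention(feat, w1, w2)
```

Channels whose gate value is near 1 pass through almost unchanged, while channels with low gates are suppressed, which is how the most discriminative filters are selected.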
ARTICLE | doi:10.20944/preprints202308.0321.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: speaker identification; convolutional neural network; dung beetle optimizer
Online: 3 August 2023 (09:35:50 CEST)
Speaker recognition methods based on convolutional neural networks (CNNs) have been widely used in the security field and in smart wearable devices. However, the traditional CNN has a large number of hyperparameters that are difficult to determine, which makes the model prone to falling into local optima or even failing to converge during training. Intelligent algorithms such as particle swarm optimization and genetic algorithms have been used to address these problems, but they perform poorly compared with emerging meta-heuristic algorithms. In this study, the dung beetle optimized convolutional neural network (DBO-CNN) is proposed to identify speakers, helping to find suitable hyperparameters for training. Tests on a dataset of 50 people demonstrated that the accuracy of the model was significantly improved by this approach. Compared with the traditional CNN and CNNs optimized by other intelligent algorithms, the accuracy of DBO-CNN increased by 0.6% to 4.8%, reaching 98.3%.
ARTICLE | doi:10.20944/preprints202306.0552.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Convolutional Neural Network; dolphin whistle; ensemble; spectrogram classification
Online: 7 June 2023 (12:54:44 CEST)
To effectively preserve marine environments and manage endangered species, it is necessary to employ efficient, precise, and scalable solutions for environmental monitoring. Ecoacoustics provides several benefits, as it enables non-intrusive, prolonged sampling of environmental sounds, making it a promising tool for conducting biodiversity surveys. However, analyzing and interpreting acoustic data can be time-consuming and often demands substantial human supervision. This challenge can be addressed by harnessing contemporary methods for automated audio signal analysis, which have exhibited remarkable performance thanks to advancements in deep learning research. This paper presents an investigation into developing an automatic computerized system to detect dolphin whistles. The proposed method utilizes a fusion of several ResNet50 networks integrated with data augmentation techniques. Through extensive experiments conducted on a publicly available benchmark, our findings demonstrate that the ensemble yields significant performance enhancements across all evaluated metrics. The MATLAB/PyTorch source code is freely available at: https://github.com/LorisNanni/
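The ensemble fusion can be sketched as posterior averaging over member networks; the random logits below merely stand in for outputs of the ResNet50 members on spectrogram windows:

```python
import numpy as np

def softmax(z):
    """Row-wise softmax turning logits into class posteriors."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Stand-in logits from three ensemble members on 5 spectrogram windows,
# 2 classes (whistle / no whistle).
rng = np.random.default_rng(4)
logits = rng.normal(size=(3, 5, 2))

probs = softmax(logits)
ensemble = probs.mean(axis=0)      # score-level fusion: average the posteriors
pred = ensemble.argmax(axis=1)     # final whistle / no-whistle decision
```

Averaging posteriors rather than hard votes lets confident members outweigh uncertain ones, which is one common reason such ensembles improve every metric at once.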
ARTICLE | doi:10.20944/preprints202306.0430.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Commonsense question answering; Knowledge Graph; Graph Convolutional Network
Online: 6 June 2023 (09:45:47 CEST)
Existing Knowledge Graph (KG) models for commonsense question answering present two challenges: (i) existing methods retrieve question-related entities from the knowledge graph, which may pull in noisy and irrelevant nodes, and (ii) they lack an interaction representation between questions and graph entities. In this paper, we propose a novel Retrieval-augmented Knowledge Graph (RAKG) model, which solves the above issues through two key innovations. First, we leverage the density matrix to make the model reason along the corrected knowledge path and extract an enhanced knowledge graph subgraph. Second, we fuse the representations of questions and graph entities through a bidirectional attention strategy, in which the two representations are fused and updated by a Graph Convolutional Network (GCN). To evaluate the performance of our method, we conduct experiments on two widely used benchmark datasets, CommonsenseQA and OpenBookQA. A case study gives insight into the finding that the augmented subgraph provides reasoning along the corrected knowledge path for question answering.
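A single GCN propagation step of the kind used to update entity representations can be sketched as follows; the symmetric normalization is the standard form with self-loops, and the toy path graph is illustrative, not a retrieved subgraph:

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN propagation step: symmetrically normalised adjacency
    (with self-loops) times node features times a weight matrix, then ReLU."""
    A_hat = A + np.eye(A.shape[0])               # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(d ** -0.5)
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)

# Tiny subgraph: 4 entities connected as a path 0-1-2-3
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
H = np.eye(4)                                    # one-hot entity features
W = np.full((4, 2), 0.5)                         # toy weights, 2 output dims
H1 = gcn_layer(A, H, W)
```

Each propagation step mixes a node's features with its neighbours', so stacking layers lets information flow along the knowledge path.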
ARTICLE | doi:10.20944/preprints202306.0260.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: chromosome classification; convolutional neural networks; ensemble; data augmentation
Online: 5 June 2023 (08:02:04 CEST)
Object classification is a crucial task in deep learning, which involves the identification and categorization of objects in images or videos. Although humans can easily recognize common objects, such as cars, animals, or plants, performing this task on a large scale can be time-consuming and error-prone. Automating the process with neural networks can therefore save time and effort while achieving higher accuracy. Our study focuses on the classification step of human chromosome karyotyping, an important medical procedure that helps diagnose genetic disorders. Traditionally, this task is performed manually by expert cytologists, a time-consuming process that requires specialized medical skills, so automating it through deep learning can be immensely useful. To accomplish this, we implemented and adapted existing preprocessing and data augmentation techniques to prepare the chromosome images for classification. We used ResNet-50 convolutional neural networks and an ensemble approach to classify the chromosomes, obtaining state-of-the-art performance on the tested dataset.
COMMUNICATION | doi:10.20944/preprints202209.0041.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Deep Learning; Convolutional Neural Networks; Medical Image Segmentation
Online: 5 September 2022 (03:12:55 CEST)
Convolutional neural network architectures have become increasingly complex, yet in recent years this complexity has yielded only slow performance improvements on well-known benchmark datasets. In this research, we analyze the true need for such complexity. We introduce G-Net light, a lightweight modified GoogleNet with an improved filter count per layer to reduce feature overlap and complexity. Additionally, by limiting the number of pooling layers in the proposed architecture, we exploit skip connections to minimize the loss of spatial information. The proposed architecture is evaluated on three publicly available retinal vessel segmentation datasets. The proposed G-Net light outperforms other vessel segmentation architectures while reducing the number of trainable parameters.