ARTICLE | doi:10.20944/preprints202102.0569.v1
Online: 25 February 2021 (10:00:59 CET)
Potholes on roads pose a major threat to motorists and autonomous vehicles. Driving over a pothole has the potential to cause serious damage to a vehicle, which in turn may result in fatal accidents. Currently, many pothole detection methods exist. However, these methods do not utilize deep learning techniques to detect a pothole in real-time, determine the location thereof and display its location on a map. The success of determining an effective pothole detection method, which includes the aforementioned deep learning techniques, is dependent on acquiring a large amount of data, including images of potholes. Once adequate data had been gathered, the images were processed and annotated. The next step was to determine which deep learning algorithms could be utilized. Three different models, including Faster R-CNN, SSD and YOLOv3 were trained on the custom dataset containing images of potholes to determine which network produces the best results for real-time detection. It was revealed that YOLOv3 produced the most accurate results and performed the best in real-time, with an average detection time of only 0.836s per image. The final results revealed that a real-time pothole detection system, integrated with a cloud and maps service, can be created to allow drivers to avoid potholes.
ARTICLE | doi:10.20944/preprints202010.0060.v1
Subject: Engineering, Automotive Engineering Keywords: Fast R-CNN; R-CNN; NDT; X-ray; transfer learning
Online: 5 October 2020 (10:40:49 CEST)
To ensure safety in aircraft flight, we aim to use deep learning methods for nondestructive examination with multiple defect detection paradigms applied to X-ray image inspection. The Fast Region-based Convolutional Neural Network (Fast R-CNN) driven model seeks to augment and improve existing automated Non-Destructive Testing (NDT) diagnosis. Within the context of X-ray screening, the limited number and insufficient variety of X-ray aeronautics engine defect data samples pose a further problem when training a model to perform multiple detections accurately. To overcome this issue, we employ transfer learning, a deep learning paradigm, to tackle both single and multiple detection. Overall, the AE-RTISNet retrained on 8 types of defect detection achieves more than 90% accuracy. The Caffe software framework was used to build networks for tracking detection over multiple Fast R-CNNs. We consider that AE-RTISNet provides better results than the more traditional multiple Fast R-CNN approaches and is simpler to translate to C++ code and install on the Jetson™ TX2 embedded computer. With the LMDB format, all input images are 640 × 480 pixels. The approach achieves 0.9 mean average precision (mAP) on the 8-class material defect classification problem and requires approximately 100 microseconds.
REVIEW | doi:10.20944/preprints202107.0375.v1
Online: 16 July 2021 (14:31:17 CEST)
Edge AI accelerators have been emerging as a solution for near-customer applications in areas such as unmanned aerial vehicles (UAVs), image recognition sensors, wearable devices, robotics, and remote sensing satellites. These applications must not only meet performance targets but also strict reliability and resilience constraints, due to operation in harsh and hostile environments. Numerous architectures have been proposed in research articles, but not all of them include full specifications; most compare their architecture only with existing CPUs, GPUs, or other reference research, so the reported performance results are not comprehensive. Thus, this work lists three key specification features, namely computation ability, power consumption, and area size, of prior-art edge AI accelerators and CGRA accelerators from the past few years, in order to define and evaluate low-power, ultra-small edge AI accelerators. We present evaluation results showing the trend in edge AI accelerator design with respect to key performance metrics, to guide designers on the actual capability of existing edge AI accelerators and to provide future design directions and trends for other applications with challenging constraints.
ARTICLE | doi:10.20944/preprints202104.0653.v1
Online: 26 April 2021 (10:55:00 CEST)
The aim of this paper is to use deep learning tools to adapt pre-trained object detection models and improve the accuracy of non-destructive testing (NDT) in civil aviation maintenance. First, the paper classifies object defects for NDT, such as cracks and undercut. It then surveys how innovative deep learning methods are used to improve defect detection performance and inference capability, increasing the accuracy and efficiency of automatic identification; this enhances the future safety and reliability of the aircraft fuselage by marking hidden cracks and solving challenges that manual inspection cannot identify. Second, a mainstream technique, the YOLOv4 neural network, is mapped onto GPU core operators to speed up the recognition of defect images and is applied to the non-destructive inspection process of A-, C- and D-level aircraft maintenance, fully validating the deep learning model's defect detection capability. The attention-based YOLOv4 algorithm is improved by applying a one-stage attention mechanism to YOLOv4, thereby improving the accuracy of the model. Finally, the improved attention-based YOLOv4 is proposed for object detection NDT via deep learning, to effectively improve and shorten the anomaly inspection process for automatic detection sensor systems.
ARTICLE | doi:10.20944/preprints201703.0061.v1
Online: 13 March 2017 (08:31:22 CET)
This paper presents a novel CNN-based architecture, referred to as Q-Net, to learn local feature descriptors that are useful for matching image patches from two different spectral bands. Given correctly matched and non-matching cross-spectral image pairs, a quadruplet network is trained to map input image patches to a common Euclidean space, regardless of the input spectral band. Our approach is inspired by the recent success of triplet networks in the visible spectrum, but adapted for cross-spectral scenarios, where for each matching pair there are always two possible non-matching patches, one for each spectrum. Experimental evaluations on a public cross-spectral VIS-NIR dataset show that the proposed approach improves on the state of the art. Moreover, the proposed technique can also be used in mono-spectral settings, obtaining performance similar to triplet network descriptors while requiring less training data.
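As a rough, hypothetical sketch of the idea behind a quadruplet objective (not the paper's exact loss; the margin value and any weighting are assumptions here), the matching pair (a, p) is pushed closer than each of the two spectrum-specific non-matching patches (n1, n2) by a hinge term:

```python
import numpy as np

def quadruplet_loss(a, p, n1, n2, margin=1.0):
    """Hinge loss pushing the matching distance d(a, p) below both
    non-matching distances d(a, n1) and d(a, n2) by a margin.
    a, p, n1, n2 are embedding vectors in the shared Euclidean space."""
    d = lambda x, y: float(np.linalg.norm(x - y))
    return (max(0.0, margin + d(a, p) - d(a, n1)) +
            max(0.0, margin + d(a, p) - d(a, n2)))

# A well-separated quadruplet incurs zero loss.
a  = np.array([0.0, 0.0])
p  = np.array([0.1, 0.0])   # close match in the other spectrum
n1 = np.array([5.0, 0.0])   # far non-match, spectrum 1
n2 = np.array([0.0, 5.0])   # far non-match, spectrum 2
loss = quadruplet_loss(a, p, n1, n2)
```

In training, gradients of such a loss would pull matched patches together in the shared space regardless of which band they came from.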
ARTICLE | doi:10.20944/preprints202207.0308.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: micro-video classification; 3D CNN; multi-modal
Online: 21 July 2022 (03:09:34 CEST)
With the popularity of the Internet, people are exposed to more and more micro-videos, and a huge amount of micro-video data has emerged. Micro-videos have gradually become the Internet content preferred by the public, and a large number of micro-video apps, such as TikTok and Kwai, have also emerged. Intelligent classification and mining of micro-videos can greatly enhance the user experience and improve business operation efficiency. Through deep intelligent analysis and mining, important information can be extracted from micro-videos to provide a basis for video beautification, content appreciation, video recommendation, content search, and more. In the past, content understanding for short videos often relied on manual annotation, but in recent years, with the great success of deep convolutional neural networks in image recognition, short-video content understanding based on these networks has gradually developed. Nowadays, most recognition algorithms extract the feature representation of each frame independently and then fuse them. However, this loses some low-level semantic features, which prevents the algorithm from accurately distinguishing the category of the video. At present, deep-learning-based micro-video recognition algorithms have surpassed the iDT algorithm, making such traditional methods fade from view. In this paper, for the micro-video classification task, a new network model is proposed that concatenates the features of each modality into overall per-modality features through the network, and then fuses the modal features with an attention mechanism to obtain the whole micro-video feature, which is used for classification. To verify the effectiveness of the proposed algorithm, experiments are conducted on a public dataset and demonstrate the effectiveness of our model.
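The attention-based fusion step can be sketched as follows. This is a minimal illustration of weighting per-modality feature vectors by learned attention scores, not the paper's actual network; the scoring vector `w` stands in for trained parameters:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_fuse(modal_feats, w):
    """Fuse per-modality feature vectors (e.g., visual, audio, text) into
    one micro-video representation. `w` is a placeholder for the learned
    attention scoring parameters."""
    F = np.stack(modal_feats)   # (num_modalities, dim)
    scores = F @ w              # one relevance score per modality
    alpha = softmax(scores)     # attention weights, sum to 1
    return alpha @ F            # weighted sum = fused feature

visual = np.array([1.0, 0.0, 0.0])
audio  = np.array([0.0, 1.0, 0.0])
text   = np.array([0.0, 0.0, 1.0])
w = np.array([10.0, 0.0, 0.0])  # toy scoring that strongly favours visual
fused = attention_fuse([visual, audio, text], w)
```

The fused vector is then what a downstream classifier head would consume.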
ARTICLE | doi:10.20944/preprints202204.0033.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: Biometrics; Face spoofing; CNN; BS; ResNet-50
Online: 6 April 2022 (07:55:23 CEST)
Currently, face recognition technologies are the most widely used methods for verifying an individual's identity. Nevertheless, as face recognition has increased in popularity, it has raised concerns about face spoofing attacks, in which a photo or video of an authorized person's face is used to gain access to services. Based on a combination of Background Subtraction (BS) and Convolutional Neural Networks (CNN), as well as an ensemble of classifiers, we propose an efficient and more robust face spoof detection algorithm. This algorithm includes a Fully Connected (FC) classifier with a Majority Vote (MV) algorithm, which handles different face spoof attacks (e.g., printed photo and replayed video). By including a majority vote to determine whether the input video is genuine or not, the proposed method significantly enhances the performance of the Face Anti-Spoofing (FAS) system. For evaluation, we considered the MSU MFSD, REPLAY-ATTACK, and CASIA-FASD databases. The results obtained by our proposed approach are better than those of state-of-the-art methods. On the REPLAY-ATTACK database, we were able to attain a Half Total Error Rate (HTER) of 0.62% and an Equal Error Rate (EER) of 0.58%. It was possible to attain an EER of 0% on both the CASIA-FASD and the MSU MFSD databases.
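The video-level majority vote described above can be sketched in a few lines; this is a generic illustration of the voting rule, assuming per-frame labels from the frame-level classifier:

```python
from collections import Counter

def majority_vote(frame_labels):
    """Decide whether a video is genuine or a spoof from per-frame
    classifier outputs ('genuine' / 'spoof'): the most frequent
    frame-level label wins."""
    label, _count = Counter(frame_labels).most_common(1)[0]
    return label

# A few noisy frame predictions still yield the right video-level call.
frames = ["genuine", "spoof", "genuine", "genuine", "spoof"]
decision = majority_vote(frames)
```

Aggregating over frames is what makes the decision robust to isolated frame-level misclassifications.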
CONCEPT PAPER | doi:10.20944/preprints202207.0294.v1
Subject: Mathematics & Computer Science, Other Keywords: CNN; brain tumor; GLCM; segmentation; superpixel; spectral clustering
Online: 20 July 2022 (05:28:57 CEST)
Extensive growth in the volume of irregular brain cells is known as a brain tumor. The human brain is surrounded by a stiff skull, and various problems arise when a tumor grows inside this restricted space. Malignant and benign are the two main categories of brain tumor; as either grows, it pressurizes the skull from inside. Such a tumor harms the brain and may also be life-threatening. Brain tumors are divided into two kinds: primary and secondary. Brain tumor detection techniques have various phases. In this paper, a comparative study of tumor detection by a CNN with a GLCM approach and by superpixel-based spectral clustering is carried out. This work uses metrics such as accuracy, sensitivity and specificity to compare the two techniques.
ARTICLE | doi:10.20944/preprints202005.0151.v3
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: deep learning; CNN; DenseNet; COVID-19; transfer learning
Online: 18 February 2022 (14:44:55 CET)
COVID-19 has a severe risk of spreading rapidly, the quick identification of which is essential. In this regard, chest radiology images have proven to be a practical screening approach for COVID-19 aﬀected patients. This study proposes a deep learning-based approach using Densenet-121 to detect COVID-19 patients eﬀectively. We have trained and tested our model on the COVIDx dataset and performed both 2-class and 3-class classification, achieving 96.49% and 93.71% accuracy, respectively. By successfully utilizing transfer learning, we achieve comparable performance to the state-of-the-art method while using 15x fewer model parameters. Moreover, we performed an interpretability analysis using Grad-CAM to highlight the most significant image regions at test time. Finally, we developed a website that takes chest radiology images as input and detects the presence of COVID-19 or pneumonia and a heatmap highlighting the infected regions. Source code for reproducing results and model weights are available.
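The Grad-CAM step used for the interpretability analysis combines the last convolutional layer's activations with the gradients of the class score. A minimal sketch of that combination (on toy arrays, not the DenseNet-121 pipeline itself):

```python
import numpy as np

def grad_cam(feature_maps, gradients):
    """Grad-CAM heatmap: per-channel weights are the spatially averaged
    gradients of the class score; the map is the ReLU of the weighted
    sum of feature maps. Both inputs have shape (channels, H, W)."""
    weights = gradients.mean(axis=(1, 2))              # (channels,)
    cam = np.tensordot(weights, feature_maps, axes=1)  # (H, W)
    return np.maximum(cam, 0.0)                        # ReLU keeps positives

# Toy example: channel 0 activates in the top-left corner and the class
# score depends only on channel 0, so the heatmap highlights that corner.
fmap = np.zeros((2, 4, 4)); fmap[0, 0, 0] = 1.0
grad = np.zeros((2, 4, 4)); grad[0] = 1.0
heatmap = grad_cam(fmap, grad)
```

In the actual system, such a heatmap is upsampled and overlaid on the chest radiograph to highlight the infected regions.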
REVIEW | doi:10.20944/preprints202202.0123.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: CNN; ANN; robotics; machine learning; artificial intelligence
Online: 8 February 2022 (15:42:51 CET)
Self-Driving Vehicles, or Autonomous Driving (AD), have emerged as a prime field of research in Artificial Intelligence and Machine Learning of late. A substantial share of existing vehicles might be supplanted by self-driving vehicles within the next few decades. While AD may appear relatively easy, it is in fact quite the contrary, owing to the involvement of, and coordination amongst, various kinds of systems. Numerous research studies are being conducted on the various stages of these AD systems. While some find the staged AD pipeline beneficial, others rely mostly on Computer Vision. This paper attempts to summarise recent developments in Autonomous Vehicle architecture. Although some remain sceptical about the pragmatic use of AD as an alternative to existing vehicles, the plethora of research and experiments being conducted suggests the opposite. Indeed, there are many challenges to implementing AD in the real world, but the significant progress made in the last couple of years indicates general acceptance of AD in the coming years.
ARTICLE | doi:10.20944/preprints202105.0303.v1
Subject: Mathematics & Computer Science, Algebra & Number Theory Keywords: Emotion detection; CNN; VGG16; Education; Transfer learning; Engagement
Online: 13 May 2021 (13:59:44 CEST)
There is a crucial need for advancement in the online educational system due to the unexpected, forced migration of classroom activities to a fully remote format caused by the coronavirus pandemic. Moreover, online education is the future, and its infrastructure needs to be improved for an effective teaching-learning process. One of the major concerns with the current video-call-based online classroom system is student engagement analysis. Teachers are often concerned about whether the students can absorb the teaching in a novel format. Such analysis was done involuntarily in the offline mode but is difficult in an online environment. This research presents an autonomous system for analyzing students' engagement in class by detecting the emotions they exhibit. This is done by capturing the video feed of the students and passing the detected faces to an emotion detection model. The emotion detection model in the proposed architecture was designed by fine-tuning the pre-trained VGG16 image classifier. Lastly, the average student engagement index is calculated. The system showed considerable performance, supporting its reliable use in real time and giving this research future scope.
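The final averaging step can be sketched as follows. The emotion-to-engagement weights below are purely hypothetical placeholders; the actual mapping used by the system is a design choice not specified in the abstract:

```python
import numpy as np

# Hypothetical weights mapping each detected emotion to an engagement score.
ENGAGEMENT_WEIGHT = {
    "happiness": 1.0, "surprise": 0.9, "neutrality": 0.6,
    "sadness": 0.3, "fear": 0.3, "anger": 0.2, "disgust": 0.2,
}

def engagement_index(detected_emotions):
    """Average engagement over all faces detected in a class snapshot."""
    scores = [ENGAGEMENT_WEIGHT[e] for e in detected_emotions]
    return float(np.mean(scores))

idx = engagement_index(["happiness", "neutrality", "happiness", "sadness"])
```

Averaging per-face scores into one class-level index is what lets a teacher read engagement at a glance rather than per student.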
ARTICLE | doi:10.20944/preprints201811.0546.v4
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: Convolutional Neural Network (CNN), Deep learning, Architecture, Applications
Online: 14 February 2019 (10:01:31 CET)
With the rise of the Artificial Neural Network (ANN), machine learning has taken a forceful turn in recent times. One of the most notable kinds of ANN design is the Convolutional Neural Network (CNN), a technology that mixes artificial neural networks with up-to-date deep learning strategies. In deep learning, the CNN is at the center of spectacular advances. This artificial neural network has been applied to several image recognition tasks for decades and has attracted the attention of researchers in many countries in recent years, as the CNN has shown promising performance in many computer vision and machine learning tasks. This paper describes the underlying architecture and various applications of the Convolutional Neural Network.
ARTICLE | doi:10.20944/preprints202106.0194.v1
Subject: Engineering, Automotive Engineering Keywords: cracks; wavelets; multiresolution analysis; ultrasound imaging; deep learning; CNN
Online: 7 June 2021 (15:53:18 CEST)
In this paper, we propose a new methodology for monitoring cracks in concrete structures. This approach is based on a multi-resolution analysis of a sample or specimen of the studied material subjected to several types of solicitation. The image obtained by ultrasonic investigation and processed by a dedicated wavelet is analyzed at several scales in order to detect internal cracks and crack initiation. The ultimate goal of this work is to propose an automatic crack-type identification scheme based on convolutional neural networks (CNN). In this context, crack propagation can be monitored without access to the concrete surface, and the goal is to detect cracks before they become visible on that surface. The key idea enabling such performance is the combination of two major data analysis tools: wavelets and deep learning. This original procedure reaches a high accuracy, close to 0.90. In this work, we have also implemented another approach for automatic detection of external cracks by deep learning from publicly available datasets.
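To illustrate why a wavelet decomposition exposes crack signatures, here is one level of the simplest discrete wavelet transform (Haar) on a toy 1-D signal; the paper's "dedicated wavelet" and 2-D multi-resolution pipeline are more elaborate, so this is only a sketch of the principle:

```python
import numpy as np

def haar_dwt_1d(signal):
    """One level of the Haar discrete wavelet transform: approximation
    (low-pass) and detail (high-pass) coefficients. A sharp jump in the
    signal, such as an echo from an internal discontinuity, produces a
    large detail coefficient at the corresponding position."""
    s = np.asarray(signal, dtype=float)
    approx = (s[0::2] + s[1::2]) / np.sqrt(2)
    detail = (s[0::2] - s[1::2]) / np.sqrt(2)
    return approx, detail

# Smooth signal with one discontinuity inside the second sample pair.
sig = np.array([1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0, 5.0])
approx, detail = haar_dwt_1d(sig)
```

Repeating the transform on the approximation coefficients yields the multi-scale view analyzed by the CNN.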
ARTICLE | doi:10.20944/preprints202005.0455.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: pattern recognition; deep convolutional neural network; Brahmi script; CNN
Online: 28 May 2020 (07:33:32 CEST)
Significant progress has been made in pattern recognition technology. However, one obstacle that has not yet been overcome is the recognition of words in the Brahmi script, specifically the identification of characters, compound characters, and words. This study proposes a deep convolutional neural network (DCNN) with dropout to recognize Brahmi words, and a series of experiments is performed on a standard Brahmi dataset. The method was systematically tested on an accessible Brahmi image database, achieving a 92.47% recognition rate with the dropout CNN, which is among the best compared with results reported in the literature for the same task.
ARTICLE | doi:10.20944/preprints202105.0424.v1
Subject: Mathematics & Computer Science, Algebra & Number Theory Keywords: Convolutional Neural Network (CNN); Emotion Recognition; Facial Expression; Classification; Accuracy
Online: 18 May 2021 (11:34:19 CEST)
Emotion recognition is defined as identifying human emotion and is directly related to fields such as human-computer interfaces, human emotional processing, irrational analysis, medical diagnostics, data-driven animation, human-robot communication and many more. The purpose of this study is to propose a new facial emotion recognition model using a convolutional neural network. Our proposed model, "ConvNet", detects seven specific emotions from image data: anger, disgust, fear, happiness, neutrality, sadness, and surprise. This research focuses on the model's training accuracy within a small number of epochs, so that a real-time schema can easily fit the model and sense emotions. Furthermore, this work examines a person's mental or emotional state through behavioral aspects. To train the CNN model we use the FER2013 database, and we test the system's success by identifying facial expressions in real time. ConvNet consists of four convolutional layers together with two fully connected layers. The experimental results show that ConvNet achieves 96% training accuracy, much better than current existing models, and a validation accuracy of 65% to 70% (across the different datasets used for experiments), a higher classification accuracy than other existing models. We have made all materials publicly accessible for the research community at: https://github.com/Tanoy004/Emotion-recognition-through-CNN.
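A quick way to reason about a four-conv architecture is the standard output-size arithmetic. The kernel sizes and pooling schedule below are assumptions for illustration (the abstract only states "four convolutional layers and two fully connected layers"); FER2013 images are 48 × 48:

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution (or pooling) layer:
    (size + 2*pad - kernel) // stride + 1."""
    return (size + 2 * pad - kernel) // stride + 1

# Hypothetical plan: four 3x3 'same'-padded convs, each followed by 2x2 max-pool.
size = 48
for _ in range(4):
    size = conv_out(size, kernel=3, pad=1)            # conv keeps 'same' size
    size = conv_out(size, kernel=2, stride=2)         # pooling halves it
flattened = size * size   # per-channel units entering the fully connected layers
```

Under these assumptions the spatial map shrinks 48 → 24 → 12 → 6 → 3 before flattening, which keeps the two FC layers small enough for real-time use.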
ARTICLE | doi:10.20944/preprints202009.0323.v1
Subject: Medicine & Pharmacology, Nursing & Health Studies Keywords: COVID-19; lockdown; CNN; DLNN; GRU; mental anxiety; hybrid approach
Online: 15 September 2020 (02:56:33 CEST)
COVID-19 and the new concept of lockdown have changed the social life of all classes of people. Children only partially grasp the changes to daily life, and the situation has weighed on their free minds. They are under a new type of restriction imposed on them by their parents. Normally they prefer playing with their friends to studying and are always waiting for holidays. They have heard a new jargon term, lockdown, under which everything stands still. Only rarely do they see people on the roads, and few vehicles are moving. A peculiar thing happens now: they sit in front of a computer to attend the virtual classes taken by their teachers. This also happens when there is no lockdown, since COVID-19 still affects people. The environment is totally changed, and they do not get proper answers from their parents about the situation. This study attempts to examine the mental state of children in West Bengal, India. Several families, mostly in rural areas but also in urban areas, were surveyed for responses over the period from April 2020 to July 2020. This paper attempts to predict the stress, depression and anxiety faced by children during COVID-19. A Deep Learning Neural Network (DLNN) based method is applied to understand the stress, depression and anxiety levels amongst the children. A hybrid DLNN is presented that combines a convolutional layer and a Gated Recurrent Unit (GRU) to predict the mental health of children. The model obtains an accuracy of 89.57% in determining children's mental anxiety.
ARTICLE | doi:10.20944/preprints201908.0289.v1
Subject: Earth Sciences, Geoinformatics Keywords: drone video; human action recognition; CNN; Support vector machine (SVM)
Online: 28 August 2019 (03:52:22 CEST)
Recognition of human interaction in unconstrained videos taken from cameras and remote sensing platforms such as drones is a challenging problem. This study presents a method to resolve issues of motion blur, poor video quality, occlusions, differences in body structure or size, and high computation or memory requirements. It contributes to improved recognition of human interaction during disasters such as earthquakes and floods, utilizing drone videos for rescue and emergency management. We used a Support Vector Machine (SVM) to classify the high-level, stationary features obtained from a Convolutional Neural Network (CNN) in key-frames from videos. We extracted conceptual features by employing the CNN to recognize objects in the first and last images of a video. The proposed method captures the context of a scene, which is significant in determining human behaviour in the videos. The method requires no person detection, tracking, or large number of image instances. It was tested on the University of Central Florida (UCF Sports Action) and Olympic Sports videos, which were taken from ground platforms. In addition, camera drone video was captured at the Southwest Jiaotong University (SWJTU) Sports Centre and incorporated to test the developed method. This study achieved an acceptable performance with an accuracy of 90.42%, an improvement of more than 4.92% over existing methods.
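The pipeline above (CNN features from first and last key-frames, then an SVM decision) can be sketched generically. The concatenation and the linear decision rule are illustrative; `w` and `b` stand in for parameters a real SVM would learn from training videos, and actual CNN descriptors are far higher-dimensional:

```python
import numpy as np

def video_descriptor(first_frame_feat, last_frame_feat):
    """Concatenate CNN features of the first and last key-frames into a
    single descriptor for the whole clip."""
    return np.concatenate([first_frame_feat, last_frame_feat])

def linear_svm_predict(x, w, b):
    """Decision rule of a trained linear SVM: sign of w.x + b."""
    return 1 if float(w @ x) + b >= 0 else -1

f_first = np.array([0.2, 0.8])   # toy CNN features, first key-frame
f_last  = np.array([0.9, 0.1])   # toy CNN features, last key-frame
x = video_descriptor(f_first, f_last)
w = np.array([1.0, -1.0, 1.0, -1.0]); b = 0.0   # placeholder SVM parameters
pred = linear_svm_predict(x, w, b)
```

Keeping only two key-frames per clip is what removes the need for tracking and many image instances.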
ARTICLE | doi:10.20944/preprints201812.0258.v1
Subject: Materials Science, Surfaces, Coatings & Films Keywords: copper; polymer coatings; polyvinyl alcohol; silver nanoparticles; deep learning; CNN
Online: 21 December 2018 (07:51:06 CET)
In order to design effective protective coatings against corrosion, the polyvinyl alcohol (PVA) as compound and composite with silver nanoparticles (nAg/PVA) were electrodeposited on copper surface employing electrochemical techniques such as linear potentiometry and cyclic voltammetry. A new paradigm was used to distinguish the features of coatings, i.e., a Deep Convolutional Neural Network (CNN) was implemented to automatically and hierarchically extract the discriminative characteristics from the information given by optical microscopy images. The main arguments that invoke a CNN implementation in the surface science of materials are the following: artificial intelligence techniques can be successfully applied to learn differences between surface coatings; based on their popularity for image processing, CNN can model images related to the problem of coatings; deep learning is able to extract the features that are distinguishable between material surfaces. To provide an overview of the copper surface, CNN was applied on microscope slides (CNN@microscopy) and inherently learnt distinctive characteristics for each class of surface morphology. The material surface morphology controlled by CNN without the interference of the human factor was successfully conducted, in our study, to extract the similarities/differences between unprotected and protected surfaces to establish the PVA and nAg/PVA performance to retard the copper corrosion.
ARTICLE | doi:10.20944/preprints202108.0011.v1
Subject: Mathematics & Computer Science, Algebra & Number Theory Keywords: Transformer; spike; neural decoding; CNN; RNN; LSTM; deep learning; information; neuroscience
Online: 2 August 2021 (09:51:43 CEST)
Neural decoding from spiking activity is an essential tool for understanding the information encoded in population neurons, especially in applications like brain-computer interfaces (BCI). Various quantitative methods have been proposed and have shown respective superiorities under different scenarios. From the machine learning perspective, the decoding task is to map high-dimensional spatial and temporal neuronal activity to low-dimensional physical quantities (e.g., velocity, position). Because of the complex interactions and abundant dynamics among neural circuits, good decoding algorithms usually have the capability of capturing flexible spatiotemporal structures embedded in the input feature space. Recently, Transformer-based models have become widely used in processing natural language and images due to their superior performance in handling long-range and global dependencies. Hence, in this work we examine the potential applications of Transformers in neural decoding and introduce two Transformer-based models. Besides adapting the Transformer to neuronal data, we also propose a data augmentation method for overcoming the data shortage issue. We test our models on three experimental datasets, and their performances are comparable to the previous state-of-the-art (SOTA) RNN-based methods. In addition, Transformer-based models show increased decoding performance when the input sequences are longer, while LSTM-based models deteriorate quickly. Our research suggests that Transformer-based models are important additions to the existing neural decoding solutions, especially for large datasets with long temporal dependencies.
ARTICLE | doi:10.20944/preprints201812.0296.v1
Subject: Engineering, Electrical & Electronic Engineering Keywords: staircase recognition; Convolutional Neural Networks (CNN); re-configurable robot; contour detection
Online: 25 December 2018 (05:33:12 CET)
Multi-floor environments are usually ignored while designing an autonomous robot for indoor cleaning applications. However, for efficient operation in such environments, the ability of a robotic platform to traverse staircases is crucial, and staircase detection and localization are highly important for planning the traversal. This paper describes a Robot Operating System (ROS)-based deep learning approach using Convolutional Neural Networks (CNNs) for staircase detection and localization. We use an object detection network to detect staircases in images. We also localize these staircases using a contour detection algorithm to determine the target point, a point close to the center of the first step, and the angle of approach to that point. Experiments are performed with data obtained from images captured on different types of staircases at different viewpoints/angles. Results show that the approach is very accurate in identifying the presence of a staircase in the working environment and is also able to locate the target point with good accuracy.
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: breast cancer; mitosis detection; CNN; Stain-normalization; CRF; multi-scale feature
Online: 9 November 2017 (10:05:53 CET)
Accurate assessment of the degree of breast cancer deterioration plays a crucial role in making a medical plan, and an important basis for this assessment is the number of mitoses in a given area of the pathological image. We utilize a deep multi-scale fused fully convolutional neural network (MFF-CNN) combined with a conditional random field (CRF) to detect mitoses in hematoxylin-and-eosin-stained histology images. We analyze the characteristics of mitosis detection, scale invariance and sparsity, as well as its difficulties: a small amount of data, inconsistent image staining, and unbalanced sample classes. The mitosis detection model is designed on this basis. In this paper, a tissue-based staining equalization method is used and, to establish an effective training sample set, we select training samples using a CNN. A mitosis detection model fusing multi-level and multi-scale features and context information was designed, and a corresponding training strategy was devised to reduce over-fitting. As preliminarily validated on the public 2014 ICPR MITOSIS data, our method achieves better detection accuracy than previously recorded for this dataset.
ARTICLE | doi:10.20944/preprints202108.0279.v1
Keywords: Glaucoma; Diabetic Retinopathy; Convolution Neural Network (CNN); Vision Loss; Blindness; Machine Learning
Online: 12 August 2021 (15:36:51 CEST)
In the last few decades, glaucoma has become the second biggest cause of irreversible vision loss. Because of its asymptomatic progression, it is often not diagnosed until a relatively late stage. To stop severe damage, glaucoma needs to be detected in its early stages. Surprisingly, diabetes can also be a major cause of glaucoma. In the modern era, artificial intelligence has made great progress in the medical image processing field. Image analysis based on machine learning has had huge success in diagnosing glaucoma without misdiagnosis. The aim of this paper is to create an automated process that can detect glaucoma and diabetic retinopathy. Various machine learning models are used, and their results are presented.
ARTICLE | doi:10.20944/preprints202011.0571.v1
Subject: Engineering, Electrical & Electronic Engineering Keywords: 1D-CNN; fault diagnosis; rolling bearing; vibration signal; single load; different load
Online: 23 November 2020 (09:22:43 CET)
The diagnosis of a rolling bearing for monitoring its status is critical for maintaining industrial equipment that uses rolling bearings. Traditional methods of diagnosing rolling-bearing faults have low identification accuracy and need artificial feature extraction to enhance accuracy. The 1D-CNN method can not only diagnose bearing faults accurately but also overcome the shortcomings of the traditional methods. Unlike other machine learning and deep learning models, the 1D-CNN method does not need pre-processing of the one-dimensional vibration data of the rolling bearing. It thus enhances processing speed, and its network structure can be designed reasonably for small sample data sets. This study proposes and tests a 1D-CNN method for diagnosing rolling bearings. By introducing the dropout operation, the method obtains high accuracy and improves generalization ability. The experimental results show an average accuracy of 99.52% under a single load and 98.26% under different loads.
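The dropout operation credited above with the improved generalisation can be illustrated with a short NumPy sketch (a toy illustration with an assumed keep probability, not the authors' 1D-CNN implementation):

```python
import numpy as np

def dropout(x, p_drop=0.5, training=True, rng=None):
    """Inverted dropout: zero units with probability p_drop and
    rescale the survivors so the expected activation is unchanged."""
    if not training or p_drop == 0.0:
        return x  # at inference time dropout is a no-op
    rng = rng or np.random.default_rng(0)
    mask = rng.random(x.shape) >= p_drop
    return x * mask / (1.0 - p_drop)

# A 1-D vibration feature vector passing through dropout during training.
features = np.ones(8)
out = dropout(features, p_drop=0.5)
```

At inference time the function simply passes activations through, which is why the trained network's predictions stay deterministic.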
Subject: Engineering, Automotive Engineering Keywords: traffic engineering; traffic incident detection; CNN-XGBoost; Convolution Neural Network; Deep Learning
Online: 15 April 2020 (14:13:35 CEST)
Accurate and efficient traffic incident detection methods can effectively alleviate traffic congestion caused by incidents, prevent secondary accidents and improve the safety of urban road traffic. Aiming at the problems that traditional machine learning incident detection methods cannot fully extract the parameter characteristics of traffic flow and are not suitable for multi-dimensional, non-linear massive data, we propose a new traffic incident detection method, CNN-XGBoost. This method combines the respective advantages of the Convolutional Neural Network (CNN) and Extreme Gradient Boosting (XGBoost). First, we preprocess the original freeway traffic incident detection data set by constructing the initial variable set, normalizing the data, balancing the classes and reorganizing the dimensions. Second, we use the CNN to automatically extract the deep features of the incident detection data, and use XGBoost as a classifier on the extracted features for expressway traffic incident detection. Finally, we use data from microwave detectors on a Hangzhou expressway in China to carry out simulation experiments on CNN-XGBoost. The experimental results show that, compared with XGBoost, CNN, Support Vector Machine (SVM), Gradient Boosting Decision Tree (GBDT) and other methods, CNN-XGBoost effectively improves the accuracy of expressway traffic incident detection and has better generalization ability.
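Two of the preprocessing steps named above, data normalization and balance processing, can be sketched roughly as follows; the function name, variable names and the oversampling strategy are illustrative, not taken from the paper:

```python
import numpy as np

def preprocess(X, y, rng=None):
    """Toy preprocessing: min-max normalisation of each traffic-flow
    parameter, then random oversampling of the rarer (incident) class
    so both classes appear equally often in the training set."""
    rng = rng or np.random.default_rng(0)
    # 1. Min-max normalisation, column by column.
    span = X.max(axis=0) - X.min(axis=0)
    Xn = (X - X.min(axis=0)) / np.where(span == 0, 1, span)
    # 2. Balance: oversample each class up to the majority count.
    classes, counts = np.unique(y, return_counts=True)
    n_max = counts.max()
    idx = []
    for c in classes:
        c_idx = np.flatnonzero(y == c)
        extra = rng.choice(c_idx, n_max - len(c_idx), replace=True)
        idx.extend(c_idx)
        idx.extend(extra)
    idx = np.array(idx)
    return Xn[idx], y[idx]

X = np.array([[0., 10.], [5., 20.], [10., 30.], [2., 15.]])
y = np.array([0, 0, 0, 1])          # incidents (1) are rare
Xb, yb = preprocess(X, y)
```

The balanced, normalised matrix would then feed the CNN feature extractor, whose outputs go to the XGBoost classifier.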
ARTICLE | doi:10.20944/preprints201811.0314.v1
Subject: Engineering, Biomedical & Chemical Engineering Keywords: High-speed video-endoscopy, laryngeal image processing, glottis delineation, Machine Learning, CNN
Online: 13 November 2018 (12:57:10 CET)
Detection of the region of interest (ROI) is a critical step in laryngeal image analysis for the delineation of the glottis contour. The process can improve both the computational efficiency and the accuracy of the image segmentation task, which facilitates subsequent analysis and characterization of vocal fold vibration as it correlates with voice quality and pathology. This study aims to develop machine learning based approaches for automatic detection of the ROI in glottis image sequences captured by high-speed video-endoscopy (HSV), a clinical laryngeal imaging modality. In particular, we first applied the support vector machine (SVM) method using the histogram of oriented gradients (HOG) feature descriptor, and second, trained a convolutional neural network (CNN) model for this task. The two approaches are compared in terms of recognition accuracy and computation time.
Subject: Engineering, Electrical & Electronic Engineering Keywords: pest recognition; Tangerine; advanced deep learning; minimum classification error; Inception Module; CNN
Online: 7 November 2018 (13:09:30 CET)
To improve tangerine crop yield, the work of recognizing and then disposing of specific pests is becoming increasingly important. Recognition is based on features extracted from images collected from websites and outdoors. Traditional recognition methods and earlier deep learning methods, such as KNN (k-nearest neighbors) and AlexNet, have been shown to be insufficiently accurate. In this paper, we exploit four advanced deep learning architectures to classify 10 citrus pests. The experimental results show that Inception-ResNet-V3 obtains the minimum classification error.
ARTICLE | doi:10.20944/preprints201808.0130.v1
Subject: Engineering, Mechanical Engineering Keywords: SHM; Electromechanical Impedance; Piezoelectricity; Intelligent Fault Diagnosis; Machine Learning; CNN; Deep Learning
Online: 6 August 2018 (21:51:53 CEST)
Preliminary Convolutional Neural Network (CNN) applications have recently emerged in Structural Health Monitoring (SHM) systems, focusing mostly on vibration analysis. However, the SHM literature clearly shows a lack of applications combining PZT (Lead Zirconate Titanate) based methods with CNNs. Likewise, applications using CNNs along with the Electromechanical Impedance (EMI) technique in SHM systems are rare. To encourage this combination, an innovative SHM solution combining EMI-PZT and CNN is presented here. To accomplish this, the EMI signature is split into several parts, and the Euclidean distances among them are computed to form an RGB (red, green and blue) frame. As a result, we introduce a dataset of 720 frames formed from the EMI-PZT signals, encompassing 4 types of structural conditions for each PZT. In a case study, the CNN-based method was experimentally evaluated using three PZTs glued onto an aluminum plate. The results reveal effective pattern classification, yielding a 100% hit rate that outperforms other SHM approaches. Furthermore, the method needs only a small dataset for training the CNN, providing several advantages for industrial applications.
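The frame-construction step described above (splitting the EMI signature and computing pairwise Euclidean distances among the parts) might look roughly like this in NumPy; how the distance matrix maps to the R, G and B channels is our assumption (here it is simply replicated):

```python
import numpy as np

def emi_to_frame(signature, n_parts=6):
    """Split an EMI signature into equal parts and build the matrix of
    pairwise Euclidean distances between the parts; stacking it into
    three channels gives an RGB-like frame a CNN can consume."""
    parts = np.array_split(signature, n_parts)
    parts = np.stack([p[: len(parts[-1])] for p in parts])  # equal lengths
    # Pairwise Euclidean distances between signature segments.
    diff = parts[:, None, :] - parts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Normalise to [0, 1] and replicate into three channels.
    dist = dist / dist.max() if dist.max() > 0 else dist
    return np.stack([dist] * 3, axis=-1)

frame = emi_to_frame(np.sin(np.linspace(0, 20, 600)))
```

The diagonal of each channel is zero by construction (a segment's distance to itself), which gives every frame the same anchor pattern regardless of structural condition.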
ARTICLE | doi:10.20944/preprints202209.0025.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: object detection; semi-supervised learning; Mask R-CNN; floor-plan images; computer vision
Online: 1 September 2022 (15:16:43 CEST)
Research on object detection using semi-supervised methods has been growing in the past few years. We examine the intersection of these two areas for floor-plan objects, with the objective of detecting objects more accurately with less labelled data. Floor-plan objects include different furniture items with multiple types of the same class, and this high inter-class similarity impacts the performance of prior methods. In this paper, we present a Mask R-CNN based semi-supervised approach that provides pixel-to-pixel alignment to generate individual annotation masks for each class to mine the inter-class similarity. The semi-supervised approach has a student-teacher network that pulls information from the teacher network and feeds it to the student network. The teacher network uses unlabeled data to form pseudo-boxes, and the student network trains on both the unlabeled data with these pseudo-boxes and the labelled data with ground truth. It learns representations of furniture items by combining labelled and unlabeled data. With a Mask R-CNN detector and a ResNet-101 backbone, the proposed approach achieves mAP of 98.8%, 99.7%, and 99.8% with only 1%, 5% and 10% labelled data, respectively. Our experiments affirm the efficiency of the proposed approach, which outperforms its fully supervised counterpart using only 10% of the labels.
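The teacher-to-student hand-off described above hinges on turning confident teacher predictions into pseudo-boxes; a minimal sketch (the threshold value, data layout and class names are assumptions, not the paper's settings):

```python
def select_pseudo_boxes(teacher_detections, score_thresh=0.9):
    """Keep only confident teacher predictions as pseudo ground truth
    for the student, a standard step in student-teacher detection."""
    return [(box, label) for box, label, score in teacher_detections
            if score >= score_thresh]

# Hypothetical teacher output on an unlabeled floor-plan image:
# (x1, y1, x2, y2), class label, confidence score.
detections = [((10, 10, 50, 60), "chair", 0.97),
              ((70, 20, 120, 90), "table", 0.55),
              ((30, 80, 55, 120), "sofa", 0.93)]
pseudo = select_pseudo_boxes(detections)  # keeps the chair and the sofa
```

Low-confidence boxes are discarded so that label noise does not propagate into the student's training signal.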
ARTICLE | doi:10.20944/preprints202107.0154.v1
Subject: Engineering, Automotive Engineering Keywords: microcracking; concrete; feature detection; damage detection; structural health monitoring; CNN based damage classification
Online: 6 July 2021 (13:34:21 CEST)
High costs for the repair of concrete structures can be avoided if damage is detected at an early stage of degradation and precautionary maintenance measures are applied. To this end, we use numerical wave propagation simulations to identify simulated damage in concrete using convolutional neural networks (CNNs). Damage in concrete subjected to compression is modeled at the mesoscale using the discrete element method. Ultrasonic wave propagation simulations on the damaged concrete specimens are performed using the rotated staggered finite-difference grid method. The simulated ultrasonic signals are used to train a CNN-based classifier capable of classifying three different damage stages (microcrack initiation, microcrack growth, and microcrack coalescence leading to macrocracks). The performance of the classifier is improved by refining the dataset via an analysis of the averaged envelope of the signal. The classifier using the refined dataset has an overall accuracy of 90%.
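The signal envelope used above to refine the dataset is commonly computed from the analytic signal; a NumPy sketch (an FFT-based Hilbert transform applied to a synthetic decaying burst, not the authors' simulation output):

```python
import numpy as np

def envelope(signal):
    """Signal envelope via the analytic signal: zero out negative
    frequencies (doubling positive ones), inverse-transform, and take
    the magnitude. Matches the standard Hilbert-transform recipe."""
    n = len(signal)
    spectrum = np.fft.fft(signal)
    h = np.zeros(n)
    h[0] = 1
    if n % 2 == 0:
        h[n // 2] = 1
        h[1:n // 2] = 2
    else:
        h[1:(n + 1) // 2] = 2
    analytic = np.fft.ifft(spectrum * h)
    return np.abs(analytic)

t = np.linspace(0, 1, 1024, endpoint=False)
sig = np.sin(2 * np.pi * 50 * t) * np.exp(-3 * t)  # decaying ultrasonic burst
env = envelope(sig)
```

For this burst the envelope closely tracks the exponential decay term, which is the kind of shape information the averaged-envelope analysis exploits.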
ARTICLE | doi:10.20944/preprints202001.0149.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: Optical Music Recognition; Historical Document Analysis; Medieval manuscripts; neume notation; CNN; LSTM; CTC
Online: 15 January 2020 (12:11:25 CET)
The automatic recognition of scanned Medieval manuscripts still represents a challenge due to degradation and non-standard layouts or notations. This paper focuses on the Medieval square notation developed around the 11th century, which is composed of staff lines, clefs, accidentals, and neumes, which are basically connected single notes. We present a novel approach to automatic transcription by applying CNN/LSTM networks trained with the segmentation-free CTC loss function, which considerably facilitates ground-truth production. For evaluation, we use three different manuscripts and achieve a dSAR of 86.0% on the most difficult book and 92.2% on the cleanest one. To further improve the results, we apply a neume dictionary during decoding, which yields a relative improvement of about 5%.
ARTICLE | doi:10.20944/preprints202110.0375.v1
Subject: Engineering, Biomedical & Chemical Engineering Keywords: Brain-Computer Interface (BCI), Convolutional neural network (CNN), Electroencephalogram (EEG), Explainable artificial intelligence (XAI)
Online: 26 October 2021 (11:45:00 CEST)
Functional connectivity (FC) is a potential candidate that can increase the performance of brain-computer interfaces (BCIs) in the elderly because of its compensatory role in neural circuits. However, it is difficult to decode FC by current machine learning techniques because of a lack of its physiological understanding. To investigate the suitability of FC in BCI for the elderly, we propose the decoding of lower- and higher-order FCs using a convolutional neural network (CNN) in six cognitive-motor tasks. The layer-wise relevance propagation (LRP) method describes how age-related changes in FCs impact BCI applications for the elderly compared to younger adults. Seventeen younger (24.5±2.7 years) and twelve older (72.5±3.2 years) adults were recruited to perform tasks related to hand-force control with or without mental calculation. CNN yielded a six-class classification accuracy of 75.3% in the elderly, exceeding the 70.7% accuracy for the younger adults. In the elderly, the proposed method increases the classification accuracy by 88.3% compared to the filter-bank common spatial pattern (FBCSP). LRP results revealed that both lower- and higher-order FCs were dominantly overactivated in the prefrontal lobe depending on task type. These findings suggest a promising application of multi-order FC with deep learning on BCI systems for the elderly.
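A common lower-order FC estimate is the channel-wise correlation matrix of an EEG epoch; the sketch below shows that computation on synthetic data (the paper's exact definitions of lower- and higher-order FC may differ):

```python
import numpy as np

def lower_order_fc(eeg):
    """Lower-order functional connectivity as the channel-by-channel
    Pearson correlation matrix of an EEG epoch (channels x samples).
    One common FC estimate, used here purely as an illustration."""
    return np.corrcoef(eeg)

rng = np.random.default_rng(0)
epoch = rng.standard_normal((8, 500))   # 8 channels, 500 samples
fc = lower_order_fc(epoch)              # symmetric 8 x 8 matrix
```

A CNN then consumes such matrices (or higher-order variants built on top of them) as image-like inputs for task classification.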
ARTICLE | doi:10.20944/preprints201901.0281.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: Wind Energy; Wind Turbine; Drone Inspection; Damage Detection; Deep Learning; Convolutional Neural Network (CNN)
Online: 28 January 2019 (15:50:04 CET)
Timely detection of surface damages on wind turbine blades is imperative for minimising downtime and avoiding possible catastrophic structural failures. With recent advances in drone technology, a large number of high-resolution images of wind turbines are routinely acquired and subsequently analysed by experts to identify imminent damages. Automated analysis of these inspection images with the help of machine learning algorithms can reduce the inspection cost, thereby reducing the overall maintenance cost arising from the manual labour involved. In this work, we develop a deep learning based automated damage suggestion system for subsequent analysis of drone inspection images. Experimental results demonstrate that the proposed approach could achieve almost human level precision in terms of suggested damage location and types on wind turbine blades. We further demonstrate that for relatively small training sets advanced data augmentation during deep learning training can better generalise the trained model providing a significant gain in precision.
ARTICLE | doi:10.20944/preprints201812.0067.v1
Subject: Earth Sciences, Environmental Sciences Keywords: built-up area; classification; Landsat 8-OLI; feature engineering; feature learning; CNN; accuracy evaluation
Online: 5 December 2018 (12:06:34 CET)
Detailed built-up area information is valuable for mapping complex urban environments. Although a large number of built-up area classification algorithms have been developed, they are rarely tested from the perspective of feature engineering and feature learning. We therefore launched a unique investigation to provide a full test of OLI imagery for 15-m resolution built-up area classification in 2015 in Beijing, China. Training a classifier requires many sample points, so we propose a method based on the ESA's 38-meter global built-up area data of 2014, OpenStreetMap and MOD13Q1-NDVI to achieve rapid and automatic generation of a large number of sample points. Our aim is to examine the influence of a single pixel versus an image patch under traditional feature engineering and modern feature learning strategies. In feature engineering, we consider spectra, shape and texture as the input features, and SVM, random forest (RF) and AdaBoost as the classification algorithms. In feature learning, the convolutional neural network (CNN) is used as the classification algorithm. In total, 26 built-up land cover maps were produced. Experimental results show that: (1) the approaches based on feature learning are generally better than those based on feature engineering in terms of classification accuracy, and the performance of ensemble classifiers, e.g., RF, is comparable to that of the CNN; the two-dimensional CNN and the 7-neighborhood RF have the highest classification accuracy, at nearly 91%; (2) overall, the classification effect and accuracy based on image patches are better than those based on single pixels, and features that highlight the information of the target category (for example, PanTex and EMBI) help improve classification accuracy.
ARTICLE | doi:10.20944/preprints202208.0345.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: Deep learning; solar radiation forecasting; model prediction; solar energy; multi climates data; generalizability; sustainability; Long Short-Term Memory (LSTM); Gated Recurrent Unit (GRU); Convolutional Neural Network (CNN); Hybrid CNN-Bidirectional LSTM; LSTM Autoencoder
Online: 18 August 2022 (10:59:18 CEST)
The sustainability of the planet and its inhabitants is in dire danger and is among the highest priorities on global agendas such as the Sustainable Development Goals (SDGs) of the United Nations (UN). Solar energy -- among other clean, renewable, and sustainable energies -- is seen as essential for environmental, social, and economic sustainability. Predicting solar energy accurately is critical to increasing reliability and stability, and reducing the risks and costs of the energy systems and markets. Researchers have come a long way in developing cutting-edge solar energy forecasting methods. However, these methods are far from optimal in terms of their accuracies, generalizability, benchmarking, and other requirements. Particularly, no single method performs well across all climates and weathers due to the large variations in meteorological data. This paper proposes SENERGY (an acronym for Sustainable Energy), a novel deep learning-based auto-selective approach and tool that, instead of generalising a specific model for all climates, predicts the best performing deep learning model for GHI forecasting in terms of forecasting error. The approach is based on carefully devised deep learning methods and feature sets through an extensive analysis of deep learning forecasting and classification methods using ten meteorological datasets from three continents. We analyse the tool in great detail through a range of metrics and methods for performance analysis, visualization, and comparison of solar forecasting methods. SENERGY outperforms existing methods in all performance metrics including Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), Forecast Skills (FS), Relative Forecasting Error, and the normalised versions of these metrics. 
The proposed auto-selective approach can be extended to other research problems, such as wind energy forecasting, and to predicting forecasting models based on criteria other than the minimum forecasting error used in this paper, such as the energy required or the speed of model execution, different input features, different optimisations of the same models, or other user preferences.
ARTICLE | doi:10.20944/preprints201808.0112.v2
Subject: Mathematics & Computer Science, Computational Mathematics Keywords: remote sensing; image classification; fully connected conditional random fields (FC-CRF); convolutional neural networks (CNN)
Online: 28 November 2018 (07:11:42 CET)
The interpretation of land use and land cover (LULC) is an important issue in high-resolution remote sensing (RS) image processing and land resource management. Fully training a new or existing convolutional neural network (CNN) architecture for LULC classification requires a large number of remote sensing images; thus, fine-tuning a pre-trained CNN for LULC detection is required. To improve classification accuracy for high-resolution remote sensing images, it is necessary to use additional feature descriptors and to adopt a classifier for post-processing. A fully connected conditional random field (FC-CRF), using the fine-tuned CNN layers, spectral features, and fully connected pairwise potentials, is proposed for classifying high-resolution remote sensing images. First, an existing CNN model is adopted and its parameters are fine-tuned on the training datasets; the probabilities that image pixels belong to each class are then calculated. Second, the spectral features and the digital surface model (DSM), combined with a support vector machine (SVM) classifier, yield another set of per-class probabilities. Combined with the probabilities obtained from the fine-tuned CNN, new feature descriptors are built. Finally, the FC-CRF produces the classification results, where the unary potentials are derived from the new feature descriptors and the SVM classifier, and the pairwise potentials are derived from the three-band RS imagery and the DSM. Experimental results show that the proposed classification scheme achieves good performance, with a total accuracy of about 85%.
ARTICLE | doi:10.20944/preprints202105.0605.v1
Subject: Mathematics & Computer Science, Algebra & Number Theory Keywords: deep learning; computed tomography; image classification; COVID-19; medical image analysis; pneumonia; CNN, LSTM, medical diagnosis
Online: 25 May 2021 (10:32:29 CEST)
Advancements in deep learning and the availability of medical imaging data have led to the use of CNN-based architectures in disease diagnostic assistance systems. In spite of the abundant use of reverse transcription-polymerase chain reaction (RT-PCR) based tests in COVID-19 diagnosis, CT images offer an applicable supplement with their high sensitivity rates. Here, we study the classification of COVID-19 pneumonia (CP) and non-COVID-19 pneumonia (NCP) in chest CT scans using efficient deep learning methods that can be readily implemented by any hospital. We report our deep network framework design, which encompasses Convolutional Neural Network (CNN) and bidirectional Long Short Term Memory (biLSTM) architectures. Our study achieved high specificity (CP: 98.3%, NCP: 96.2%, Healthy: 89.3%) and high sensitivity (CP: 84.0%, NCP: 93.9%, Healthy: 94.9%) in classifying COVID-19 pneumonia, non-COVID-19 pneumonia and healthy patients. Next, we provide visual explanations for the CNN predictions with gradient-weighted class activation mapping (Grad-CAM). The results provided model explainability by showing that Ground Glass Opacities (GGO), indicators of COVID-19 pneumonia, were captured by our CNN network. Finally, we have implemented our approach in three hospitals, proving its compatibility and efficiency.
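The Grad-CAM step mentioned above combines the last convolutional layer's activations with spatially pooled gradients; a minimal NumPy sketch on stand-in arrays (not the paper's trained network):

```python
import numpy as np

def grad_cam(activations, gradients):
    """Grad-CAM heat map from the last conv layer: weight each feature
    map by its spatially pooled gradient, sum over channels, then
    apply ReLU and normalise to [0, 1]."""
    weights = gradients.mean(axis=(1, 2))            # (channels,)
    cam = np.tensordot(weights, activations, axes=1) # (H, W)
    cam = np.maximum(cam, 0)                         # ReLU
    return cam / cam.max() if cam.max() > 0 else cam

rng = np.random.default_rng(0)
acts = rng.random((16, 7, 7))        # stand-in feature maps
grads = rng.standard_normal((16, 7, 7))  # stand-in class-score gradients
heatmap = grad_cam(acts, grads)
```

Upsampled to the CT slice's resolution and overlaid on it, such a heat map is what highlights the GGO regions driving the CP prediction.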
ARTICLE | doi:10.20944/preprints202110.0089.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: Object Detection; Cascade Mask R-CNN; Floor Plan Images; Deep Learning; Transfer Learning; Dataset Augmentation; Computer Vision
Online: 5 October 2021 (15:09:26 CEST)
Object detection is one of the most critical tasks in computer vision; it comprises identifying and localizing an object in an image. Architectural floor plans represent the layout of buildings and apartments. Floor plans consist of walls, windows, stairs, and other furniture objects. While recognizing floor plan objects is straightforward for humans, automatically processing floor plans and recognizing their objects is a challenging problem. In this work, we investigate the performance of the recently introduced Cascade Mask R-CNN network for object detection in floor plan images. Furthermore, we experimentally establish that deformable convolution works better than conventional convolution in the proposed framework. Identifying objects in floor plan images is also challenging due to the variety of floor plans and objects. We faced a problem in training our network because of the lack of publicly available datasets: currently available public datasets do not have enough images to train deep neural networks efficiently. To address this issue, we introduce SFPI, a novel synthetic floor plan dataset consisting of 10000 images. Our proposed method conveniently surpasses the previous state-of-the-art results on the SESYD dataset and sets impressive baseline results on the proposed SFPI dataset. The dataset can be downloaded from SFPI Dataset Link. We believe the novel dataset will enable researchers to advance research in this domain further.
ARTICLE | doi:10.20944/preprints202107.0165.v1
Subject: Mathematics & Computer Science, Algebra & Number Theory Keywords: Formula detection; Cascade Mask R-CNN; Mathematical expression detection; document image analysis; deep neural networks; computer vision.
Online: 6 July 2021 (17:42:24 CEST)
This paper presents a novel architecture for detecting mathematical formulas in document images, which is an important step for reliable information extraction in several domains. Recently, Cascade Mask R-CNN networks have been introduced to solve object detection in computer vision. In this paper, we suggest a couple of modifications to the existing Cascade Mask R-CNN architecture: First, the proposed network uses deformable convolutions instead of conventional convolutions in the backbone network to spot areas of interest better. Second, it uses a dual backbone of ResNeXt-101, having composite connections at the parallel stages. Finally, our proposed network is end-to-end trainable. We evaluate the proposed approach on the ICDAR-2017 POD and Marmot datasets. The proposed approach demonstrates state-of-the-art performance on ICDAR-2017 POD at a higher IoU threshold with an f1-score of 0.917, reducing the relative error by 7.8%. Moreover, we accomplished correct detection accuracy of 81.3% on embedded formulas on the Marmot dataset, which results in a relative error reduction of 30%.
ARTICLE | doi:10.20944/preprints201908.0068.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: deep learning; convolutional neural networks (CNN); transfer learning; class activation mapping (CAM); building defects; structural-health monitoring
Online: 6 August 2019 (04:18:29 CEST)
Clients are increasingly looking for fast and effective means to quickly and frequently survey and communicate the condition of their buildings, so that essential repairs and maintenance can be done proactively and in a timely manner before they become too dangerous and expensive. Traditional methods for this type of work commonly involve engaging building surveyors to undertake a condition assessment: a lengthy site inspection that produces a systematic record of the physical condition of the building elements, including estimates of immediate and projected long-term costs of renewal, repair and maintenance. Current asset condition assessment procedures are extensively time-consuming, laborious and expensive, and pose health and safety threats to surveyors, particularly at height and at roof level, which are difficult to access. We propose a method for automated detection and localisation of key building defects from images using deep learning and convolutional neural networks. The proposed model is based on a pre-trained VGG-16 classifier with Class Activation Mapping (CAM) for object localisation. The model has proven to be robust, accurately detecting and localising mould growth, stains, and paint deterioration defects arising from dampness in buildings. The approach is being developed with the potential to scale up to automated detection of defects and deterioration of buildings in real time using mobile devices and drones.
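The CAM localisation step named above weights the final convolutional feature maps by the target class's weights from the global-average-pooling classifier head; a minimal NumPy sketch on stand-in arrays (the array sizes merely resemble a VGG-16-style head and are our assumption):

```python
import numpy as np

def class_activation_map(feature_maps, class_weights):
    """CAM localisation: a defect-class score map is the weighted sum
    of the final conv feature maps, using that class's weights from
    the global-average-pooling classifier head; rescaled to [0, 1]."""
    cam = np.tensordot(class_weights, feature_maps, axes=1)  # (H, W)
    cam -= cam.min()
    return cam / cam.max() if cam.max() > 0 else cam

rng = np.random.default_rng(1)
fmaps = rng.random((512, 14, 14))      # stand-in final conv output
w_mould = rng.standard_normal(512)     # stand-in weights for a "mould" class
cam = class_activation_map(fmaps, w_mould)
```

Unlike Grad-CAM, plain CAM needs no backward pass: the localisation comes directly from the classifier's own weights, which is why it pairs naturally with a GAP head on VGG-16.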
ARTICLE | doi:10.20944/preprints201607.0085.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: CNN; Deep Learning; AlexNet; VGGNet; Texture Descriptor; Garment Categories; 13 Garment Trend Identification; Design Classification for Garments.
Online: 27 July 2016 (15:39:53 CEST)
Automatic identification of garment design classes for recommending fashion trends is important nowadays because of the rapid growth of online shopping. By learning the properties of images efficiently, a machine can achieve better classification accuracy. Several methods based on hand-engineered feature coding exist for identifying garment design classes, but most of the time they do not achieve good results. Recently, deep Convolutional Neural Networks (CNNs) have shown better performance on various object recognition tasks. A deep CNN uses multiple levels of representation and abstraction that help a machine understand different types of data (images, sound, and text) more accurately. In this paper, we apply deep CNNs to identifying garment design classes. To evaluate performance, we used two well-known CNN models, AlexNet and VGGNet, on two different datasets. We also propose a new CNN model based on AlexNet that outperforms the existing state-of-the-art by a significant margin.
ARTICLE | doi:10.20944/preprints202111.0078.v1
Subject: Engineering, Electrical & Electronic Engineering Keywords: GRaVN; machine learning; convolutional neural networks; CNN; raman spectroscopy; analogue missions; planetary science; random undersampling; random oversampling; CanMoon
Online: 3 November 2021 (09:24:38 CET)
During planetary exploration mission operations, one of the key responsibilities of the instrument teams is to determine data viability for subsequent analysis. During the 2019 CanMoon Lunar Sample Return Analogue Mission, the Lead Raman Specialist manually examined each spectrum to provide quality assurance/validation. This non-trivial process requires years of experience to complete accurately. Given the proven efficacy of Convolutional Neural Networks (CNNs) in classification tasks, and the increased use of automation and control loops on planetary space platforms for navigation and science targeting, an opportunity presents itself to approach this validation problem using CNNs. We present the Generalised Raman Validation Network (GRaVN), a neural network focused specifically on extracting the generalised structure of Raman spectra for quality assurance/validation. This work demonstrates the viability of using a CNN in validation activities for Raman spectroscopy. Using only two hidden layers, we developed a configuration that provided good accuracy on a manually curated dataset. This indicates that such a system could be useful as part of an autonomous control loop during planetary exploration activities.
DATASET | doi:10.20944/preprints202012.0047.v1
Subject: Engineering, Automotive Engineering Keywords: eye tracking dataset; gaze tracking dataset; iris tracking dataset; CNN for eye-tracking; neural networks for eye-tracking
Online: 2 December 2020 (08:00:46 CET)
In recent years many different deep neural networks have been developed, but due to their large number of layers, training them requires a long time and large datasets. It is popular today to use pre-trained deep neural networks for various tasks, even simple ones for which such deep networks are not required. Well-known deep networks such as YOLOv3 and SSD are intended for tracking and monitoring various objects; their weights are therefore heavy, and their accuracy on a specific task is low. Eye-tracking tasks need to detect only one object, an iris, in a given area, so it is logical to use a neural network dedicated to this task. The problem, however, is the lack of suitable datasets for training such a model. In this manuscript, we present a dataset that is suitable for training custom convolutional neural network models for eye-tracking tasks. Using it, each user can independently pre-train convolutional neural network models for eye-tracking. The dataset contains 10,000 annotated eye images at a resolution of 416 by 416 pixels, and the annotation table gives the coordinates and radius of the eye for each image. This manuscript can be considered a guide for preparing datasets for eye-tracking devices.
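An annotation table of per-image coordinates and radius, as described above, can be parsed with the standard library alone; the column names and file names below are illustrative, so check the dataset's own header:

```python
import csv
import io

# A few rows in the general format the annotation table describes:
# image file, centre coordinates (x, y) and radius, for 416x416 images.
annotation_csv = """filename,x,y,radius
eye_0001.png,208,195,41
eye_0002.png,190,210,38
"""

def load_annotations(text):
    """Parse the annotation table into per-image circles
    (centre x, centre y, radius), keyed by image file name."""
    rows = csv.DictReader(io.StringIO(text))
    return {r["filename"]: (int(r["x"]), int(r["y"]), int(r["radius"]))
            for r in rows}

labels = load_annotations(annotation_csv)
```

These circles can then be converted into whatever target encoding a custom CNN expects, e.g. a heat map or a regression triple per image.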
ARTICLE | doi:10.20944/preprints202011.0527.v1
Subject: Keywords: Aircraft Maintenance Inspection; Anomaly Detection; Defect Inspection; Convolutional Neural Networks; Mask R-CNN; Generative Adversarial Networks; Image Augmentation
Online: 20 November 2020 (09:16:13 CET)
Convolutional Neural Networks combined with autonomous drones are increasingly seen as enablers of partially automating the aircraft maintenance visual inspection process. Such an innovative concept can have a significant impact on aircraft operations. By supporting aircraft maintenance engineers in detecting and classifying a wide range of defects, the time spent on inspection can be significantly reduced. Examples of defects that can be automatically detected include aircraft dents, paint defects, cracks and holes, and lightning strike damage. Additionally, this concept could also increase the accuracy of damage detection and reduce the number of aircraft inspection incidents related to human factors like fatigue and time pressure. In our previous work, we applied a recent Convolutional Neural Network architecture known as Mask R-CNN to detect aircraft dents. Mask R-CNN was chosen because it enables the detection of multiple objects in an image while simultaneously generating a segmentation mask for each instance. The previously obtained F1 and F2 scores were 62.67% and 59.35%, respectively. This paper extends the previous work by applying different techniques to improve and evaluate prediction performance experimentally. The approaches used include (1) balancing the original dataset by adding images without dents; (2) increasing data homogeneity by focusing on wing images only; (3) exploring the potential of three augmentation techniques, namely flipping, rotating, and blurring, in improving model performance; and (4) using a pre-classifier in combination with Mask R-CNN. The results show that a hybrid approach combining Mask R-CNN and augmentation techniques leads to improved performance, with an F1 score of 67.50% and an F2 score of 66.37%.
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: Car Detection; Convolutional Neural Networks; Deep Learning; Faster R-CNN; Unmanned Aerial Vehicles; You Only Look Once (Yolo).
Online: 12 March 2020 (08:57:09 CET)
In this paper, we address the problem of car detection from aerial images using Convolutional Neural Networks (CNN). This problem presents additional challenges as compared to car (or any object) detection from ground images because features of vehicles from aerial images are more difficult to discern. To investigate this issue, we assess the performance of two state-of-the-art CNN algorithms, namely Faster R-CNN, which is the most popular region-based algorithm, and YOLOv3, which is known to be the fastest detection algorithm. We analyze two datasets with different characteristics to check the impact of various factors, such as UAV's altitude, camera resolution, and object size. The objective of this work is to conduct a robust comparison between these two cutting-edge algorithms. By using a variety of metrics, we show that YOLOv3 yields better performance in most configurations, except that it exhibits a lower recall and less confident detections when object sizes and scales in the testing dataset differ largely from those in the training dataset.
ARTICLE | doi:10.20944/preprints201910.0195.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: car detection; convolutional neural networks; deep learning; you only look once (yolo); faster r-cnn; unmanned aerial vehicles
Online: 17 October 2019 (12:29:29 CEST)
In this paper, we address the problem of car detection from aerial images using Convolutional Neural Networks (CNN). This problem presents additional challenges as compared to car (or any object) detection from ground images because features of vehicles from aerial images are more difficult to discern. To investigate this issue, we assess the performance of two state-of-the-art CNN algorithms, namely Faster R-CNN, which is the most popular region-based algorithm, and YOLOv3, which is known to be the fastest detection algorithm. We analyze two datasets with different characteristics to check the impact of various factors, such as UAV’s altitude, camera resolution, and object size. The objective of this work is to conduct a robust comparison between these two cutting-edge algorithms. By using a variety of metrics, we show that none of the two algorithms outperforms the other in all cases.
ARTICLE | doi:10.20944/preprints201903.0039.v2
Subject: Engineering, Control & Systems Engineering Keywords: Handwritten digit recognition; Convolutional Neural Network (CNN); Deep learning; MNIST dataset; Epochs; Hidden Layers; Stochastic Gradient Descent; Backpropagation
Online: 20 September 2019 (10:12:26 CEST)
In recent times, with the rise of the Artificial Neural Network (ANN), deep learning has brought a dramatic twist to the field of machine learning by making it more like Artificial Intelligence (AI). Deep learning is used remarkably across a vast range of fields because of its diverse applications, such as surveillance, health, medicine, sports, robotics, and drones. In deep learning, the Convolutional Neural Network (CNN) is at the center of spectacular advances that mix Artificial Neural Networks (ANN) with up-to-date deep learning strategies. It has been used broadly in pattern recognition, sentence classification, speech recognition, face recognition, text categorization, document analysis, scene recognition, and handwritten digit recognition. The goal of this paper is to observe how the accuracy of a CNN classifying handwritten digits varies with the number of hidden layers and epochs, and to compare the resulting accuracies. For this performance evaluation of CNN, we performed our experiment using the Modified National Institute of Standards and Technology (MNIST) dataset. The network is trained using stochastic gradient descent and the backpropagation algorithm.
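The training procedure mentioned (gradient descent with backpropagation, and a configurable number of hidden layers) can be illustrated with a toy NumPy multilayer perceptron. This is a sketch only: the study itself trains CNNs on MNIST, all names here are assumptions, and full-batch updates are used for brevity where the paper uses stochastic ones.

```python
import numpy as np

def train_mlp(X, y, hidden=(32,), epochs=20, lr=0.1, seed=0):
    """Minimal sigmoid MLP trained by gradient descent with backpropagation.

    `hidden` controls the number and width of hidden layers, mirroring the
    paper's experiment of varying hidden layers and epochs. Returns the
    per-epoch mean-squared-error losses.
    """
    rng = np.random.default_rng(seed)
    sizes = [X.shape[1], *hidden, 1]
    W = [rng.normal(0, 0.5, (a, b)) for a, b in zip(sizes, sizes[1:])]
    b = [np.zeros(n) for n in sizes[1:]]
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    losses = []
    for _ in range(epochs):
        # forward pass, caching activations for backprop
        acts = [X]
        for Wl, bl in zip(W, b):
            acts.append(sig(acts[-1] @ Wl + bl))
        out = acts[-1].ravel()
        losses.append(float(np.mean((out - y) ** 2)))
        # backward pass: MSE gradient propagated through the sigmoids
        delta = ((out - y)[:, None] * 2 / len(y)) * acts[-1] * (1 - acts[-1])
        for l in reversed(range(len(W))):
            gW = acts[l].T @ delta
            gb = delta.sum(axis=0)
            delta = (delta @ W[l].T) * acts[l] * (1 - acts[l])
            W[l] -= lr * gW
            b[l] -= lr * gb
    return losses
```

On a simple separable toy problem the loss should fall over the epochs, which is the qualitative behaviour the paper measures at MNIST scale.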
ARTICLE | doi:10.20944/preprints202101.0534.v1
Subject: Mathematics & Computer Science, Algebra & Number Theory Keywords: fruit occlusion; deep learning; machine vision; yield estimation; fruit count; neural network; CNN; tree crop; Mangifera indica; MLP; canopy
Online: 26 January 2021 (11:29:49 CET)
Imaging systems mounted on ground vehicles are used to image fruit tree canopies for estimation of fruit load, but frequently need correction for fruit occluded by branches, foliage or other fruit. This can be achieved using an orchard ‘occlusion factor’, estimated from a manual count of fruit load on a sample of trees (referred to as the reference method). It was hypothesised that canopy images could hold information related to the number of occluded fruit. Five approaches to correcting for occluded fruit based on canopy images were compared using data from three mango orchards in two seasons. However, no attributes correlated to the number of hidden fruit were identified. Several image features obtained through segmentation of fruit and canopy areas, such as the proportion of fruit that were partly occluded, were used in training random forest and multi-layered perceptron (MLP) models for estimation of a correction factor per tree. In another approach, deep learning convolutional neural networks (CNNs) were trained directly against harvest fruit count on trees. The supervised machine learning methods for direct estimation of fruit load per tree delivered an improved prediction outcome over the reference method for data of the season/orchard from which training data was acquired. For a set of 2017 season tree images (n = 98 trees), an R2 of 0.98 was achieved for the correlation between the number of fruit predicted by a random forest model and the ground-truth fruit count on the trees, compared to an R2 of 0.68 for the reference method. The best prediction of whole-orchard (n = 880 trees) fruit load, in the season of the training data, was achieved by the MLP model, with an error to packhouse count of 1.6% compared to the reference method error of 13.6%. However, the performance of these models on new season data (test set images) was at best equivalent and generally poorer than the reference method.
This result indicates that training on one season of data was insufficient for the development of a robust model. This outcome was attributed to variability in tree architecture and foliage density between seasons and between orchards, such that the characteristics of the canopy visible from the interrow that relate to the proportion of hidden fruit are not consistent. Training of these models across several seasons and orchards is recommended.
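The reference method described above reduces to simple arithmetic: an occlusion factor estimated from a sample of manually counted trees, then used to scale machine-vision counts up to a true fruit load. A minimal sketch; the function and variable names are assumptions, not the paper's notation.

```python
def occlusion_factor(machine_counts, manual_counts):
    """Orchard occlusion factor (reference method).

    Ratio of fruit visible to the imaging system to fruit counted
    manually on the same sample of trees; illustrative arithmetic only.
    """
    return sum(machine_counts) / sum(manual_counts)

def corrected_load(machine_count, factor):
    """Scale a machine-vision count up to an estimated true fruit load."""
    return machine_count / factor
```

For example, if the imaging system sees 80 and 90 fruit on two sample trees that manual counts place at 100 and 120, the factor is 170/220, and a new tree imaged at 85 fruit is corrected to an estimated 110.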
ARTICLE | doi:10.20944/preprints202003.0284.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: deep learning; composite hybrid feature selection; machine learning; stack hybrid classification; CT-image; MPEG7 edge histogram feature extraction; CNN
Online: 18 March 2020 (08:32:46 CET)
The paper presents an analysis of Corona Virus Disease based on a probabilistic model. It involves a technique for classification and prediction by recognizing the typical and diagnostically most important CT image features relating to Corona Virus. The main contribution of the research is predicting the probability of recurrence in no-recurrence (first-time detection) cases when applying our proposed approach for feature extraction. A combination of conventional statistical and machine learning tools is applied for feature extraction from CT images through four image filters in combination with the proposed composite hybrid feature selection (CHFS). The selected features were classified by the stack hybrid classification system (SHC). An experimental study with real data demonstrates the feasibility and potential of the proposed approach for the said cause.
ARTICLE | doi:10.20944/preprints201902.0203.v1
Subject: Engineering, Other Keywords: Northern Corn Leaf Blight (Exserohilum); Gray Leaf Spot (Cerospora); Common Rust (Puccinia sorghi); Convolutional Neural Networks (CNN); Neuroph Studio
Online: 21 February 2019 (13:04:05 CET)
Plant leaf diseases can affect a plant's leaves to the extent that the plant collapses and dies completely. These diseases may drastically reduce the supply of vegetables and fruits to the market, resulting in a weak agricultural economy. In the literature, different laboratory methods of plant leaf disease detection have been used. These methods were time consuming and could not cover large areas for the detection of leaf diseases. This study builds on the principles of Convolutional Neural Networks (CNN) to model a network for image recognition and classification of these diseases. Neuroph was used to train a CNN that recognized and classified images of maize leaf diseases collected with a smartphone camera. The novel training approach and the methodology used expedite a quick and easy implementation of the system in practice. The developed model was able to recognize 3 different types of maize leaf diseases, as distinct from healthy leaves. The Northern Corn Leaf Blight (Exserohilum), Common Rust (Puccinia sorghi) and Gray Leaf Spot (Cerospora) diseases were chosen for this study as they affect most parts of Southern Africa's maize fields.
REVIEW | doi:10.20944/preprints202101.0426.v1
Subject: Mathematics & Computer Science, Algebra & Number Theory Keywords: deep learning; machine learning; ischemic stroke; demyelinating disease; image processing; computer aided diagnostics; brain MRI; CNN; White Matter Hyperintensities; VOSViewer
Online: 21 January 2021 (14:55:05 CET)
Medical brain image analysis is a necessary step in Computer Assisted/Aided Diagnosis (CAD) systems. Advancements in both hardware and software in the past few years have led to improved segmentation and classification of various diseases. In the present work, we review the published literature on systems and algorithms that allow for the classification, identification, and detection of White Matter Hyperintensities (WMHs) in brain MRI images, specifically in cases of ischemic stroke and demyelinating diseases. For the selection criteria, we used bibliometric networks. Out of a total of 140 documents, we selected 38 articles that deal with the main objectives of this study. Based on the analysis and discussion of the reviewed documents, there is constant growth in the research and proposal of new deep learning models for achieving the highest accuracy and reliability in the segmentation of ischemic and demyelinating lesions. Models with high indicators (Dice Score, DSC: 0.99) were found, however with little practical application due to the use of small datasets and a lack of reproducibility. Therefore, the main conclusion is to establish multidisciplinary research groups to overcome the gap between CAD developments and their complete utilization in the clinical environment.
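The Dice Score (DSC) cited above is the standard overlap metric for comparing a predicted segmentation mask against a ground-truth mask. A minimal NumPy sketch of its computation:

```python
import numpy as np

def dice_score(pred, target):
    """Dice similarity coefficient for binary segmentation masks:
    2 * |pred AND target| / (|pred| + |target|)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    return 2 * intersection / (pred.sum() + target.sum())
```

Identical masks score 1.0 and disjoint masks score 0.0, which is why a reported DSC of 0.99 indicates near-perfect lesion overlap on the evaluated data.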
ARTICLE | doi:10.20944/preprints202103.0189.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: Flying Social Robot; Autonomous Unmanned Aerial Vehicle (UAV); Emotion Recognition; Convolution Neural Network (CNN); Virtual Reality (VR); Unity; MATLAB/Simulink; Python
Online: 5 March 2021 (11:52:50 CET)
This work is part of an ongoing research project to develop an unmanned flying social robot to monitor dependants at home in order to detect the person’s state and bring the necessary assistance. In this sense, this paper focuses on the description of a virtual reality (VR) simulation platform for the monitoring of an avatar in a virtual home by a rotary-wing autonomous unmanned aerial vehicle (UAV). This platform is based on a distributed architecture composed of three modules communicating through the Message Queue Telemetry Transport (MQTT) protocol: the UAV Simulator implemented in MATLAB/Simulink, the VR Visualiser developed in Unity, and the new emotion recognition (ER) System developed in Python. Using a face detection algorithm and a convolutional neural network (CNN), the ER System is able to detect the person’s face in the image captured by the UAV’s on-board camera and classify the emotion among seven possible ones (surprise, fear, happiness, sadness, disgust, anger or neutral expression). The experimental results demonstrate the correct integration of this new computer vision module within the VR platform, as well as the good performance of the designed CNN, with an F1-score (the harmonic mean of the model's precision and recall) of around 85%. The developed emotion detection system can be used in the future implementation of the assistance UAV that monitors dependent people in a real environment, since the methodology used is valid for images of real people.
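The reported F1-score combines precision and recall as their harmonic mean. A small sketch computing all three from raw counts (function and parameter names are illustrative):

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and their harmonic mean (F1) from raw counts
    of true positives, false positives, and false negatives.

    Illustrative helper only; the paper reports ~85% F1 for the
    seven-class emotion CNN.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

For multi-class problems like the seven emotions here, the same computation is typically done per class and then averaged.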
ARTICLE | doi:10.20944/preprints202003.0299.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Data Mining; Alzheimer’s Dementia; Composite Hybrid Feature Selection; Machine learning; stack Hybrid Classification; AI; MRI; Neuroimaging; MPEG7 edge histogram feature extraction; CNN
Online: 19 March 2020 (11:25:01 CET)
Alzheimer's disease (AD) detection plays an essential role in global health care due to frequent misdiagnosis, the many clinical features AD shares with other types of dementia, the cost of monitoring the progression of the disease over time by magnetic resonance imaging (MRI), and the potential for human error in manual reading. This paper presents a comparative study of the performance of data mining techniques on two datasets of clinical and neuroimaging tests for AD. In the first stage, our proposed model applies the clinical medical dataset to a composite hybrid feature selection (CHFS) to extract new features and select the best ones by eliminating obscure features. In parallel, a novel hybrid feature extraction, combining three batch edge detection algorithms and texture features from the MRI image dataset, is optimized with a fuzzy 64-bin histogram. In the second stage, we applied the clinical dataset to a stacked hybrid classification (SHC) model that combines JRip and random forest classifiers with six model evaluations as meta-classifiers individually to improve the prediction of clinical diagnosis. At the same stage, we improved the classification accuracy on the neuroimaging (MRI) dataset by applying a convolutional neural network (CNN), compared with traditional classifiers running on features extracted from the images. The authors collected a clinical dataset of 426 subjects (1229 potential patient samples) from oasis.org and an MRI dataset from a kaggle.com benchmark with a total of around ~5000 images, each segregated by severity of Alzheimer's. The datasets were evaluated using the explorer set of the Weka data mining software for analysis purposes.
The experiments show that the proposed CHFS feature-selection model effectively reduced the false-negative rate with a relatively high overall accuracy: the stacked hybrid classification with a support vector machine (SVM) as meta-classifier reached 96.50%, compared to 68.83% in previous results on the clinical dataset, and compared to 80.21% for the CNN classification model on the MRI image dataset. The results showed the superiority of our CHFS model in predicting Alzheimer's disease more accurately at an early stage with the clinical medical dataset than with the neuroimaging (MRI) dataset. The proposed model was able to accurately classify clinical Alzheimer's samples at a low cost compared with the MRI-CNN image model at the early stage, and the SHC model gives a good indication of a high classification rate for MRI images.
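The stacking idea behind the SHC model (base classifiers whose outputs feed a meta-classifier) can be sketched in NumPy with deliberately simple stand-ins: decision stumps as base learners and a least-squares meta-model. The paper itself combines JRip and random forest under various meta-classifiers in Weka; everything below is an assumption for illustration.

```python
import numpy as np

def fit_stack(X, y):
    """Toy stacked classifier: one decision stump per feature as base
    learners, plus a least-squares meta-model over their outputs."""
    thresholds = X.mean(axis=0)                  # stump split points
    base = (X > thresholds).astype(float)        # base-learner predictions
    A = np.hstack([base, np.ones((len(X), 1))])  # add bias column
    w, *_ = np.linalg.lstsq(A, y, rcond=None)    # meta-model weights
    return thresholds, w

def predict_stack(X, thresholds, w):
    """Run the base stumps, then let the meta-model combine their votes."""
    base = (X > thresholds).astype(float)
    A = np.hstack([base, np.ones((len(X), 1))])
    return (A @ w > 0.5).astype(int)
```

The design point is that the meta-model learns how much to trust each base learner, rather than averaging them uniformly; in practice one would fit the meta-model on held-out base predictions to avoid leakage.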
ARTICLE | doi:10.20944/preprints202111.0112.v1
Subject: Medicine & Pharmacology, Other Keywords: forensic medicine; forensic dentistry; forensic anthropology; 3D CNN; AI; deep learning; biological age determination; sex determination; 3D cephalometric; AI face estimation; growth prediction
Online: 5 November 2021 (10:00:56 CET)
Three-dimensional convolutional neural networks (3D CNN), as a type of artificial intelligence (AI), are powerful in image processing and recognition, using deep learning to perform generative and descriptive tasks. The advantage of the CNN compared to its predecessors is that it automatically detects the important features without any human supervision. 3D CNNs are used to extract features in three dimensions, where the input is a 3D volume or a sequence of 2D pictures, e.g., slices in a cone-beam computed tomography (CBCT) scan. The main aim of this article was to bridge interdisciplinary cooperation between forensic medical experts and deep learning engineers, with emphasis on activating clinical forensic experts in the field who may have only basic knowledge of advanced artificial intelligence techniques but an interest in implementing them to advance forensic research further. This paper introduces a novel workflow for 3D CNN analysis of full-head CBCT scans. The authors explore and present the 3D CNN method for forensic research design concepts from five perspectives: (1) sex determination, (2) biological age estimation, (3) 3D cephalometric landmark annotation, (4) growth vector prediction, and (5) facial soft-tissue estimation from the skull and vice versa. In conclusion, 3D CNN application can be a watershed moment in forensic medicine, leading to unprecedented improvement of forensic analysis workflows based on 3D neural networks.
ARTICLE | doi:10.20944/preprints202110.0359.v2
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: brain; pituitary adenoma; Dysembryoplastic neuroepithelial tumor; DNET; Ganglioglioma; deep learning; digital pathology; convolutional neural network; computer vision; machine learning; CNN
Online: 26 October 2021 (14:10:11 CEST)
Background: Processing whole-slide images (WSI) to train neural networks can be intricate and laborious. We developed an open-source library covering recurrent tasks in the processing of WSI and in evaluating the performance of the trained networks on classification tasks. Methods: Two histopathology use-cases were selected. First, we aimed to train a CNN to distinguish H&E-stained slides obtained from neuropathologically classified low-grade epilepsy-associated dysembryoplastic neuroepithelial tumor (DNET) and ganglioglioma (GG). In the second project, we trained a convolutional neural network (CNN) to predict the hormone expression of pituitary adenomas from hematoxylin and eosin (H&E) stained slides alone. With the same approach, we also addressed the prediction of clinically silent corticotroph adenoma. We included four clinico-pathological disease conditions in a multilabel approach. Results: Our best performing CNN achieved an area under the curve (AUC) of the receiver operating characteristic (ROC) of 0.97 for corticotroph adenoma, 0.86 for silent corticotroph adenoma and 0.98 for gonadotroph adenoma. Our DNET-GG classifier achieved an AUC of 1.00 for the ROC curve. All scores were calculated with the help of our library on per-case predictions. Conclusions: Our comprehensive library is most helpful to standardize the workflow and minimize the burden of training CNNs. It is also compatible with fastai. Indeed, our new CNNs reliably extracted neuropathologically relevant information from the H&E staining alone. This approach will supplement the clinico-pathological diagnosis of brain tumors, which is currently based on cost-intensive microscopic examination and variable panels of immunohistochemical stainings.
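The per-label AUC values above are areas under ROC curves. The metric can be computed directly from score ranks via the Mann-Whitney formulation; the sketch below assumes no tied scores, whereas the authors' library (and e.g. scikit-learn) handles ties.

```python
import numpy as np

def roc_auc(y_true, scores):
    """ROC AUC via the rank-sum (Mann-Whitney U) formulation.

    Equals the probability that a randomly chosen positive case is
    scored higher than a randomly chosen negative case. Assumes no
    tied scores; illustrative only.
    """
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    ranks = np.empty(len(scores))
    ranks[scores.argsort()] = np.arange(1, len(scores) + 1)
    n_pos = int(y_true.sum())
    n_neg = len(y_true) - n_pos
    # sum of positive ranks minus its minimum possible value, normalised
    return (ranks[y_true == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
```

An AUC of 1.00, as for the DNET-GG classifier, means every positive case was ranked above every negative case.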
REVIEW | doi:10.20944/preprints201805.0484.v1
Subject: Engineering, Electrical & Electronic Engineering Keywords: deep learning; deep convolutional neural networks; dcnn; convolutional neural networks; cnn; robot learning; transfer learning; robotic grasping; robotic grasp detection; human-robot collaboration
Online: 31 May 2018 (17:27:23 CEST)
In order for robots to attain more general-purpose utility, grasping is a necessary skill to master. Such general-purpose robots may use their perception abilities to visually identify grasps for a given object. A grasp describes how a robotic end-effector can be arranged on top of an object to securely grab it between the fingers of the robotic gripper and successfully lift it without slippage. Traditionally, grasp detection requires expert human knowledge to analytically form the task-specific algorithm, but this is an arduous and time-consuming approach. During the last five years, deep learning methods have enabled significant advancements in robotic vision, natural language processing, and automated driving applications. The successful results of these methods have driven robotics researchers to explore the application of deep learning methods to task-generalised robotic applications. This paper reviews the current state-of-the-art with regard to the application of deep learning methods to generalised robotic grasping and discusses how each element of the deep learning approach has improved the overall performance of robotic grasp detection. A number of the most promising approaches are evaluated, and the most successful for grasp detection is identified as the one-shot detection method. The availability of suitable volumes of appropriate training data is identified as a major obstacle to effective utilisation of deep learning approaches, and the use of transfer learning techniques is identified as a potential mechanism to address this. Finally, current trends in the field and future potential research directions are discussed.
ARTICLE | doi:10.20944/preprints202109.0059.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: table detection; table recognition; cascade Mask R-CNN; atrous convolution; recursive feature pyramid networks; document image analysis; deep neural networks; computer vision; object detection.
Online: 3 September 2021 (11:05:10 CEST)
Table detection is a preliminary step in extracting reliable information from tables in scanned document images. We present CasTabDetectoRS, a novel end-to-end trainable table detection framework that operates on Cascade Mask R-CNN, including a Recursive Feature Pyramid network and Switchable Atrous Convolution in the existing backbone architecture. By utilizing the comparatively lightweight ResNet-50 backbone, this paper demonstrates that superior results are attainable without relying on pre- and post-processing methods, heavier backbone networks (ResNet-101, ResNeXt-152), or memory-intensive deformable convolutions. We evaluate the proposed approach on five different publicly available table detection datasets. Our CasTabDetectoRS outperforms the previous state-of-the-art results on four datasets (ICDAR-19, TableBank, UNLV, and Marmot) and accomplishes comparable results on ICDAR-17 POD. Compared with previous state-of-the-art results, we obtain significant relative error reductions of 56.36%, 20%, 4.5%, and 3.5% on the ICDAR-19, TableBank, UNLV, and Marmot datasets, respectively. Furthermore, this paper sets a new benchmark by performing exhaustive cross-dataset evaluations to exhibit the generalization capabilities of the proposed method.
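The relative error reductions quoted above follow from comparing the residual error (1 - score) before and after. A one-line sketch, assuming scores on a 0-1 scale; the exact score the authors reduce (e.g. F1 or mAP) is not restated here.

```python
def relative_error_reduction(prev_score, new_score):
    """Fraction of the previous error (1 - score) removed by the new model.

    Illustrative helper: e.g. improving a score from 0.90 to 0.95
    halves the remaining error, a 50% relative error reduction.
    """
    return ((1 - prev_score) - (1 - new_score)) / (1 - prev_score)
```

This framing explains why a modest absolute gain on an already-strong benchmark can correspond to a large relative error reduction such as 56.36%.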
ARTICLE | doi:10.20944/preprints202007.0634.v1
Subject: Engineering, Electrical & Electronic Engineering Keywords: CVD rehabilitation; Local muscular endurance exercises; Exercise-based rehabilitation; Deep Learning; AlexNet; CNN; SVM; kNN; RF; MLP; PCA; multi-class classification; INSIGHT-LME dataset
Online: 26 July 2020 (15:21:08 CEST)
Exercise-based cardiac rehabilitation requires patients to perform a set of prescribed exercises a specific number of times. Local muscular endurance (LME) exercises are an important part of the rehabilitation program. Automatic exercise recognition and repetition counting from wearable sensor data is an important technology to enable patients to perform exercises independently in remote settings, e.g. their own home. In this paper we first report on a comparison of traditional approaches to exercise recognition and repetition counting (supervised machine learning and peak detection on inertial sensing signals, respectively) with more recent machine learning approaches, specifically Convolutional Neural Networks (CNNs). We investigated two different types of CNN: one using the AlexNet architecture, the other operating on a time-series array representation. We found that the performance of the CNN-based approaches was better than that of the traditional approaches. For the exercise recognition task, the AlexNet-based single CNN model outperformed the other methods with an overall F1-score of 97.18%. For exercise repetition counting, again the AlexNet-based single CNN model outperformed the other methods by correctly counting repetitions in 90% of the performed exercise sets within an error of ±1. To the best of our knowledge, our approach of using a single CNN method for both recognition and repetition counting is novel. In addition to reporting our findings, we also make the dataset we created, the INSIGHT-LME dataset, publicly available to encourage further research.
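The traditional peak-detection baseline mentioned above counts one repetition per prominent local maximum in a 1-D inertial signal. A minimal sketch; the crude prominence test against the global minimum is an assumption, not the paper's exact method.

```python
import numpy as np

def count_repetitions(signal, min_prominence=0.5):
    """Count repetitions as prominent local maxima in a 1-D sensor signal.

    A sample counts as a peak if it exceeds its left neighbour, is at
    least its right neighbour, and rises `min_prominence` above the
    signal's minimum (a simple stand-in for true peak prominence).
    """
    signal = np.asarray(signal, dtype=float)
    floor = signal.min()
    count = 0
    for i in range(1, len(signal) - 1):
        is_peak = signal[i] > signal[i - 1] and signal[i] >= signal[i + 1]
        if is_peak and signal[i] - floor >= min_prominence:
            count += 1
    return count
```

In practice one would smooth the signal and tune the prominence threshold per exercise, which is exactly the brittleness that motivates the CNN-based counter in the paper.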
REVIEW | doi:10.20944/preprints202005.0234.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: SIRD; Twitter; GHSI; Pre-symptomatic; EHR; Contact tracing; On-line survey; qRT-PCR; X-ray; CT/HRCT; CNN; Autoencoder; Drug affinity; CPI; and Inflation.
Online: 14 May 2020 (11:25:57 CEST)
The world is now experiencing a major health calamity due to the coronavirus disease (COVID-19) pandemic, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The foremost challenge facing the scientific community is to explore the growth and transmission capability of the virus. The use of artificial intelligence (AI), such as deep learning, in (i) rapid disease detection from X-ray/computerized tomography (CT)/high-resolution computed tomography (HRCT) images, (ii) accurate prediction of the epidemic patterns and their saturation throughout the globe, (iii) identification of the epicenter in each country/state and forecasting the disease from social networking data, (iv) prediction of drug-protein interactions for repurposing drugs, and (v) socio-economic impact and prediction of future relapses, has attracted much attention. In the present manuscript, we describe the role of various AI-based technologies for rapid and efficient detection from CT images, complementing quantitative real-time polymerase chain reaction (qRT-PCR) and immunodiagnostic assays. AI-based technologies to anticipate the current pandemic pattern, the possibility of future relapses and the socio-economic impact are also discussed. We inspect how the virus transmits depending on different factors, such as population density and mobility, among others. We depict how AI-based mobile apps for contact tracing and surveys can prevent transmission. A modified deep learning technique can assess the affinity of the most probable drugs to treat COVID-19. Here a few effective antiviral drugs, such as Geneticin, Avermectin B1, and Ancriviroc, among others, have been reported with appropriate validation from previous investigations.
ARTICLE | doi:10.20944/preprints202107.0277.v1
Subject: Medicine & Pharmacology, Oncology & Oncogenics Keywords: Cervical cancer; Pap smear test; whole slide image (WSI); feature pyramid network (FPN); global context aware (GCA); region based convolutional neural networks (R-CNN); Region Proposal Network (RPN).
Online: 12 July 2021 (23:05:34 CEST)
Cervical cancer is a worldwide public health problem with a high rate of illness and mortality among women. In this study, we proposed a novel framework based on the Faster RCNN-FPN architecture for the detection of abnormal cervical cells in cytology images from cancer screening tests. We extended the Faster RCNN-FPN model by infusing deformable convolution layers into the feature pyramid network (FPN) to improve scalability. Furthermore, we introduced a global context aware module alongside the Region Proposal Network (RPN) to enhance the spatial correlation between the background and the foreground. Extensive experiments with the proposed deformable and global context aware (DGCA) RCNN were carried out using the cervical image dataset of the “Digital Human Body" Vision Challenge from the Alibaba Cloud TianChi Company. Performance evaluation based on the mean average precision (mAP) and receiver operating characteristic (ROC) curve has demonstrated considerable advantages of the proposed framework. In particular, when combined with tagging of the negative image samples using traditional computer-vision techniques, a 6-9% increase in mAP has been achieved. The proposed DGCA-RCNN model has the potential to become a clinically useful AI tool for automated detection of cervical cancer cells in whole slide images of Pap smears.
REVIEW | doi:10.20944/preprints201908.0152.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: deep learning; machine learning model; convolutional neural networks (CNN); recurrent neural networks (RNN); denoising autoencoder (DAE); deep belief networks (DBNs); long short-term memory (LSTM); review; survey; state of the art
Online: 13 August 2019 (09:32:09 CEST)
Deep learning (DL) algorithms have recently emerged from machine learning and soft computing techniques. Since then, several DL algorithms have been introduced to the scientific community and applied in various application domains. Today the usage of DL has become essential due to its intelligence, efficient learning, accuracy and robustness in model building. However, a comprehensive list of DL algorithms has not yet been presented in the scientific literature. This paper provides a list of the most popular DL algorithms, along with their application domains.