ARTICLE | doi:10.20944/preprints201904.0095.v2
Subject: Computer Science And Mathematics, Information Systems Keywords: EMR; SVM; Classification; Clustering
Online: 12 April 2019 (20:53:16 CEST)
Lately, the Critical Pathway (CP) of the Electronic Medical Record (EMR) has been used as a treatment guideline in public hospitals. We propose a healthcare promotion service that uses disease patterns together with lifestyle risk factors. We classify historical patient data by disease code and lifestyle risk factor (hypertension, diabetes, smoking, overweight, excessive alcohol intake, and low physical activity) to derive the lifestyle risk factors. We then build clusters of disease codes with lifestyle risk factors from the historical data in the EMR's electronic discharge summaries. Based on the resulting disease patterns with lifestyle risk, we provide a healthcare recommendation service. This makes it possible to build a medical help desk that supports people as they check into a public hospital: it explains the treatment procedure, the desired clinical treatment method for the healthcare promotion service by disease code, and how to improve their health. We evaluate the proposed system experimentally on datasets collected at a medical center and report the results.
ARTICLE | doi:10.20944/preprints202105.0441.v1
Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: emotion recognition; MLP; SVM; RAVDESS
Online: 19 May 2021 (12:53:55 CEST)
Herein, we compare the performance of SVM and MLP in emotion recognition using the speech and song channels of the RAVDESS dataset. We extracted various audio features and identified the optimal scaling strategy and hyperparameters for our models. To increase sample size, we performed audio data augmentation and addressed data imbalance using SMOTE. Our data indicate that the optimised SVM outperforms the MLP, with an accuracy of 82% compared to 75%. Following data augmentation, the performance of both algorithms was identical at ~79%; however, overfitting was evident for the SVM. Our final exploration indicated that SVM and MLP performed similarly, and both yielded lower accuracy for the speech channel than for the song channel. Our findings suggest that both SVM and MLP are powerful classifiers for emotion recognition in a vocal-dependent manner.
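The abstract names SMOTE for handling class imbalance but gives no details. As a rough illustration only, the core idea (interpolating between a minority sample and one of its nearest minority neighbours) can be sketched as follows. The function name `smote`, the parameter defaults, and the toy points are assumptions made for this sketch and do not come from the paper.

```python
import random

def smote(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points from a list of minority-class vectors.

    Sketch of the SMOTE idea; assumes at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Toy minority class (hypothetical 2-D feature vectors).
minority_points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(minority_points, n_new=10, k=2)
```

Because every synthetic point lies on a segment between two existing minority samples, the new data stay inside the region already covered by that class.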
ARTICLE | doi:10.20944/preprints201703.0156.v2
Subject: Engineering, Electrical And Electronic Engineering Keywords: CBIR; Late Fusion; SVM; BOVW
Online: 21 March 2017 (03:49:41 CET)
One of the challenges in Content-Based Image Retrieval (CBIR) is to reduce the semantic gap between low-level features and high-level semantic concepts. In CBIR, images are represented in a feature space, and retrieval performance depends on the selected feature representation. Late fusion, also known as visual words integration, is applied to enhance the performance of image retrieval. Recent advances in image retrieval have shifted the focus of research towards binary descriptors, as they are reported to be computationally efficient. In this paper, we investigate the late fusion of Fast Retina Keypoint (FREAK) and Scale Invariant Feature Transform (SIFT). This combination of a binary and a local descriptor is selected because, among binary descriptors, FREAK has shown good results in classification-based problems, while SIFT is robust to translation, scaling, rotation, and small distortions. The late fusion of FREAK and SIFT combines the strengths of both feature descriptors for effective image retrieval. Experimental results and comparisons show that the proposed late fusion enhances image retrieval performance.
ARTICLE | doi:10.20944/preprints202201.0415.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Drought tolerance index; Stress tolerance index; MLP; SVM; MLP-GA; SVM-GA; Genetic Algorithm
Online: 27 January 2022 (11:21:14 CET)
Maize (Zea mays subsp. mays) is a staple food crop worldwide. In this study, multi-layer perceptron (MLP), support vector machine (SVM), genetic algorithm-based multi-layer perceptron (MLP-GA), and genetic algorithm-based support vector machine (SVM-GA) hybrid artificial intelligence algorithms were used to predict drought tolerance and stress tolerance indices in teosinte maize lines. The gamma test technique was applied to determine efficient input and output vectors. The potential of the developed models was evaluated based on statistical indices and graphical representation. Based on the lowest values of gamma and standard error, the gamma test identified day of anthesis (DOA), day of silking (DOS), yield index (YI), and gross yield per plant (GYP) as the efficient input vector combination for both the drought-tolerant index (DTI) and the stress-tolerant index (STI). The MLP, SVM, MLP-GA, and SVM-GA algorithms were compared based on statistical indices and visual interpretation, and all were satisfactory for predicting the drought-tolerant index and stress-tolerant index in maize. The genetic algorithm-based hybrid models (MLP-GA and SVM-GA) were found to predict both indices better. In particular, the SVM-GA model has the highest potential to forecast the DTI and STI in maize compared with the MLP, SVM, and MLP-GA models.
ARTICLE | doi:10.20944/preprints202308.2102.v1
Subject: Biology And Life Sciences, Neuroscience And Neurology Keywords: BCI, rt-fMRI, MI, DWGC, svm
Online: 31 August 2023 (09:43:14 CEST)
This article presents a method for extracting neural signal features to identify the imagination of left and right hand grasping movements. A functional magnetic resonance imaging (fMRI) experiment is employed to identify four brain regions with significant activations during motor imagery(MI) and the effective connections between these regions of interest (ROIs) were calculated using Dynamic Window-level Granger Causality (DWGC). Then, a real-time fMRI(rt-fMRI) classification system for left and right hand MI is developed using the Open-NFT platform. The experimental results show that incorporating effective connections can enhance the average accuracy of real-time three-class classification (rest, left hand and right hand) by 3% in comparison to traditional multivoxel pattern classification analysis(MVPA). Moreover, it significantly improves classification accuracy during the initial stage of MI tasks while reducing the latency effects in real-time decoding. The study suggests that the effective connections obtained through the DWGC method serve as valuable features for real-time decoding of MI using fMRI. Moreover, they exhibit higher sensitivity to changes in brain states. This research offers theoretical support and technical guidance for extracting neural signal features in the context of fMRI-based studies.
ARTICLE | doi:10.20944/preprints202311.0773.v1
Online: 13 November 2023 (08:47:54 CET)
Recently, significant progress has been made in developing computer-aided diagnosis (CAD) systems for identifying glaucoma abnormalities using fundus images. Despite their drawbacks, feature extraction methods such as wavelets and their variations, along with classifiers like support vector machines (SVM), are frequently employed in such systems. This paper introduces a practical and enhanced system for detecting glaucoma in fundus images. The system addresses the challenges encountered by existing models in the recent literature. First, we employ contrast limited adaptive histogram equalization (CLAHE) to enhance the visualization of the input fundus images. Then, the discrete ripplet-II transform (DR2T) with a degree of 2 is used for feature extraction. Subsequently, the golden jackal optimization (GJO) algorithm is employed to select the optimal features and reduce the dimension of the extracted feature vector. In the classification stage, the least squares support vector machine (LS-SVM) with three kernels, namely linear, polynomial, and radial basis function (RBF), is used to classify fundus images as glaucoma or healthy. The proposed method is validated against current state-of-the-art models on two standard datasets, G1020 and ORIGA. The experimental results demonstrate that our best approach, DR2T+GJO+LS-SVM-RBF, obtains classification accuracies of 93.38% and 97.31% on the G1020 and ORIGA datasets, respectively, with fewer features. It establishes a more concise network structure when contrasted with traditional classifiers.
Subject: Engineering, Electrical And Electronic Engineering Keywords: coronavirus; COVID-19; diagnosis; deep features; SVM
Online: 22 April 2020 (05:58:22 CEST)
Detecting coronavirus disease (COVID-19) is now a critical task for medical practitioners. The coronavirus spreads quickly between people and is approaching 100,000 people worldwide. Consequently, it is essential to identify infected people so that the spread can be prevented. In this paper, a methodology based on deep features plus a support vector machine (SVM) is suggested for detecting coronavirus-infected patients using X-ray images. For classification, an SVM is used instead of a deep learning based classifier, as the latter needs a large dataset for training and validation. The deep features from the fully connected layer of a CNN model are extracted and fed to the SVM for classification. The SVM separates the corona-affected X-ray images from the others. The methodology covers three categories of X-ray images: COVID-19, pneumonia, and normal. The method helps medical practitioners distinguish among COVID-19 patients, pneumonia patients, and healthy people. The SVM is evaluated for detection of COVID-19 using the deep features of 13 different CNN models. The SVM produced the best results using the deep features of ResNet50. The classification model, i.e. ResNet50 plus SVM, achieved accuracy, sensitivity, FPR, and F1 score of 95.33%, 95.33%, 2.33%, and 95.34%, respectively, for detection of COVID-19 (ignoring SARS, MERS, and ARDS). Again, the highest accuracy achieved by ResNet50 plus SVM is 98.66%. The results are based on the X-ray images available in the repositories of GitHub and Kaggle. As the dataset contains only hundreds of images, classification based on SVM is more robust than the transfer learning approach. A comparative analysis of other traditional classification methods is also carried out. The traditional methods are local binary patterns (LBP) plus SVM, histogram of oriented gradients (HOG) plus SVM, and Gray Level Co-occurrence Matrix (GLCM) plus SVM. Among the traditional image classification methods, LBP plus SVM achieved 93.4% accuracy.
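Accuracy, sensitivity, FPR, and F1 score of the kind reported above all follow from confusion-matrix counts. The following is a minimal sketch, assuming a binary one-vs-rest view of the COVID-19 class; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return accuracy, sensitivity, FPR and F1 from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    fpr = fp / (fp + tn)           # false positive rate = 1 - specificity
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"accuracy": accuracy, "sensitivity": sensitivity, "fpr": fpr, "f1": f1}

# Hypothetical counts, for illustration only.
metrics = binary_metrics(tp=45, fp=5, tn=95, fn=5)
```

For a three-class problem such as COVID-19 / pneumonia / normal, the same function can be applied once per class by treating that class as positive and the other two as negative.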
ARTICLE | doi:10.20944/preprints201909.0082.v1
Subject: Environmental And Earth Sciences, Remote Sensing Keywords: Mexico City; subsidence; InSAR; GPS; PSI; SVM
Online: 7 September 2019 (01:19:04 CEST)
This study presents an analysis of subsidence rates and their effects on Mexico City. Mexico City is well known for its subsidence, the result of many years of excess water withdrawal. This study addresses the problem by integrating Interferometric Synthetic Aperture Radar (InSAR), Continuous Global Positioning System (CGPS), and optical remote sensing data. Fifty-two ENVISAT-ASAR images, data from nine GPS stations, and one Landsat ETM+ image of the Mexico City area were analyzed to provide a better understanding of the subsidence rates and their effects on Mexico City's community. The InSAR methods used include differential interferometry and Persistent Scatterer Interferometry (PSI) to monitor the existing subsidence in the Mexico City area. The InSAR data cover the temporal baseline from 2002 until June 2010, and the GPS data cover the temporal baseline from 1998 until 2012. The maximum annual change of 352 mm in the Line Of Sight (LOS) direction agrees with previous geodetic studies. InSAR data were compared with CGPS data over the same time interval. The findings reveal a high correlation (up to 0.98) between the two independent geodetic methods. We also applied Support Vector Machine (SVM) analysis to the Landsat ETM+ image to classify Mexico City's populated areas by density. This classification was used to compare the subsidence rates with buildings in populated areas. This integrated study shows that the fastest subsidence zone (i.e., areas subsiding faster than 100 mm/yr) in the above-mentioned temporal baseline occurs in the high and sparsely populated areas.
ARTICLE | doi:10.20944/preprints201803.0128.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: moments invariants; ZM; PZM; OFMM; SVM; PSO
Online: 16 March 2018 (05:26:31 CET)
This paper applies orthogonal moments (OM), such as Zernike Moments (ZM), Pseudo-Zernike Moments (PZM), and Orthogonal Fourier-Mellin Moments (OFMM), to the analysis of melanoma images. The moment invariants may vary with respect to geometric variations. For the analysis, one hundred random melanoma images and one hundred non-melanoma images were taken from a database of 570 melanoma images and 250 non-melanoma images, respectively. Orthogonal moments were computed by varying the phase angle from 10° to 40° in steps of 10° for the orders 2, 4, 8, 16, 32, 64, 128, and 256. The Particle Swarm Optimization (PSO) technique was used to select the optimal OMs. This set of optimal OMs was then used to classify melanoma images. A Support Vector Machine (SVM) was used for classification, achieving a sensitivity of 88.78%.
ARTICLE | doi:10.20944/preprints202304.1034.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: ITSC fault; traction motor; fault diagnosis; apFFT; SVM
Online: 27 April 2023 (04:29:04 CEST)
The traction motors of EMUs (electric multiple units) are powered by converters. The PWM (pulse width modulation) voltage increases the voltage stress borne by the motor insulation system, making ITSC (inter-turn short-circuit) faults more prominent. This article proposes an index based on short-circuit thermal power to evaluate the degree of non-metallic ITSC faults. The apFFT (all-phase FFT) time-shift phase difference correction with double Hanning windows is used to calculate the fundamental frequency of the traction motor's ZSVC (zero sequence voltage component) and the fundamental amplitudes of the ZSVC and the three-phase currents. These five parameters are used as fault features to train the SVM (support vector machine) fault diagnosis model. The SVM hyper-parameters C and g are optimized by K-CV (K-fold cross-validation) and grid search. Experimental verification was carried out on the EMU electric traction simulation experimental platform. According to the non-metallic degree index proposed in this article, the experimental samples were divided into three categories: normal, incipient fault, and serious fault samples. The ITSC fault diagnosis accuracy was 100% on the training data set and 93.33% on the test data set. There was no misclassification between normal and serious ITSC fault samples.
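Tuning C and g by K-fold cross-validation combined with grid search can be sketched generically as below. This is an illustration under stated assumptions, not the authors' implementation: `k_fold_indices`, `grid_search_cv`, and the callback `fit_and_score` are hypothetical names, SVM training is left to the caller, and `toy_score` is a stand-in that trains nothing.

```python
from itertools import product

def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n_samples // k + (1 if i < n_samples % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def grid_search_cv(fit_and_score, param_grid, n_samples, k=5):
    """Return (best_params, best_mean_score) over all parameter combinations.

    fit_and_score(params, train_idx, test_idx) is supplied by the caller; it
    should train a model on train_idx and return its score on test_idx.
    """
    folds = k_fold_indices(n_samples, k)
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        scores = []
        for i, test_idx in enumerate(folds):
            # Train on every fold except the i-th, which is held out for testing.
            train_idx = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
            scores.append(fit_and_score(params, train_idx, test_idx))
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score

# Toy stand-in scorer (hypothetical): peaks at C=10, g=0.1. A real scorer would
# train an SVM on train_idx and return its accuracy on test_idx.
def toy_score(params, train_idx, test_idx):
    return 1.0 - abs(params["C"] - 10) / 100 - abs(params["g"] - 0.1)

best_params, best_score = grid_search_cv(
    toy_score, {"C": [1, 10, 100], "g": [0.01, 0.1, 1.0]}, n_samples=23, k=5
)
```

The folds here are contiguous, so the samples should be shuffled beforehand if they are ordered by class.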
ARTICLE | doi:10.20944/preprints202206.0163.v1
Subject: Engineering, Civil Engineering Keywords: MARS; SVM; RF; rainfall; runoff; rainfall-runoff modelling
Online: 13 June 2022 (03:29:36 CEST)
Nowadays, great attention is paid to the study of runoff and its fluctuation over space and time. There is a crucial need for a good soil and water management system to overcome the challenges of water scarcity and other adverse natural events such as floods and landslides. Rainfall-runoff modeling is an appropriate approach for runoff prediction, making it possible to take preventive measures to avoid damage caused by natural hazards such as floods. In the present study, several data-driven models, namely multiple linear regression (MLR), multivariate adaptive regression splines (MARS), support vector machine (SVM), and random forest (RF), were used for rainfall-runoff prediction in the Gola watershed, located in the south-eastern part of Uttarakhand. The performance of the models was evaluated based on the coefficient of determination (R2), root mean square error (RMSE), Nash-Sutcliffe efficiency (NSE), and percent bias (PBIAS) indices. In addition to the numerical comparison, the models were evaluated graphically using line diagrams, scatter plots, violin plots, relative error plots, and Taylor diagrams (TD). The comparison results revealed that the heuristic methods gave higher accuracy than the MLR model. Among the machine learning models, the RF (RMSE (m3/s), R2, NSE, and PBIAS (%) = 6.31, 0.96, 0.94, and -0.20 during the training period, respectively, and 5.53, 0.95, 0.92, and -0.20 during the testing period, respectively) surpassed the MARS, SVM, and MLR models in forecasting daily runoff for all case studies. Among all four models, the RF model performed best in both the training and testing periods. In summary, the RF model is best-in-class and shows strong potential for runoff prediction in the Gola watershed.
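The four indices used above can be computed from paired observed and simulated runoff series. The sketch below is illustrative: the function name `runoff_metrics` and the example series are hypothetical, R2 is taken as the squared Pearson correlation, and PBIAS uses the sign convention 100 × Σ(observed − simulated) / Σ(observed). The paper's own conventions are not stated in the abstract and may differ.

```python
import math

def runoff_metrics(observed, simulated):
    """Return RMSE, R2, NSE and PBIAS (%) for two equal-length series."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_sim = sum(simulated) / n
    sq_err = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    ss_obs = sum((o - mean_obs) ** 2 for o in observed)
    ss_sim = sum((s - mean_sim) ** 2 for s in simulated)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(observed, simulated))

    rmse = math.sqrt(sq_err / n)
    r2 = cov ** 2 / (ss_obs * ss_sim)   # squared Pearson correlation
    nse = 1.0 - sq_err / ss_obs         # 1 means a perfect fit
    pbias = 100.0 * sum(o - s for o, s in zip(observed, simulated)) / sum(observed)
    return {"rmse": rmse, "r2": r2, "nse": nse, "pbias": pbias}

# Hypothetical daily runoff values in m3/s; the simulation is biased high by 1.
result = runoff_metrics([2.0, 4.0, 6.0, 8.0], [3.0, 5.0, 7.0, 9.0])
```

In this toy case R2 equals 1 even though every prediction is off by 1 m3/s, which is why NSE and PBIAS are reported alongside it.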
ARTICLE | doi:10.20944/preprints202007.0628.v2
Subject: Engineering, Energy And Fuel Technology Keywords: DFIG; SVM; VC; Wind Turbine (WT); parameter uncertainty
Online: 16 June 2021 (12:04:39 CEST)
This paper presents a super-twisting algorithm (STA) direct power control (DPC) scheme for controlling the active and reactive powers of a grid-connected DFIG. Simulations of a 5 kW DFIG are presented to validate the effectiveness and robustness of the proposed approach in the presence of uncertainties, compared with vector control (VC). The proposed controller schemes with fixed gains are effective in reducing the ripple of active and reactive powers and in suppressing sliding-mode chattering, and the effects of parametric uncertainties do not affect system performance.
This paper presents a comparative study of two approaches for the direct power control (DPC) of a doubly-fed induction generator (DFIG) based wind energy conversion system (WECS): Vector Control (VC) and Sliding Mode Control (SMC). Simulations of a 5 kW DFIG in the presence of various uncertainties were carried out to evaluate the capability and robustness of the proposed control scheme. The SMC strategy is the most appropriate scheme, offering the best combination of reduced power ripple and diminished steady-state error, while machine parameter variations do not change the system performance.
Subject: Medicine And Pharmacology, Pulmonary And Respiratory Medicine Keywords: COVID-19; Cough; Signal processing; STFT; MFCC; SVM
Online: 13 January 2021 (11:09:03 CET)
Sound signals from the respiratory system are important indicators of human health. Early diagnosis of respiratory tract diseases is of great importance, as delayed diagnosis can have irreversible effects on human health. Such diagnosis has been made possible by machine learning and signal processing analysis. The coronavirus epidemic, which is deeply shaking the whole world today, has revealed the importance of this issue even more. During the coronavirus pandemic, researchers have focused on differentiating its symptoms from those of similar diseases such as the normal flu or influenza. Among these symptoms, the difference in cough sound plays a distinctive role in the proposed study. Several pioneering studies have shown that almost two-thirds of people who get corona have a dry cough. Information from cough-based studies therefore constitutes the main framework of our study, which is built on machine learning algorithms. Clinical data collected under the supervision of doctors in a reliable environment was used as the dataset. This dataset consists of 16 subjects suspected of having the coronavirus, with a specific patient demographic. Using the polymerase chain reaction (PCR) test, the suspected subjects were divided into negative and positive groups. The negative and positive labels represent patients with a non-COVID cough and with a COVID-19 cough, respectively. The salient features of the cough data are revealed using a 3D plot, or waterfall representation, of the signal frequency spectrum. In this way, COVID-19 coughs can be differentiated from other coughs by applying effective feature extraction and classification techniques. Power Spectral Density (PSD) based on the Short Time Fourier Transform (STFT) and Mel Frequency Cepstral Coefficients (MFCC) were chosen as the feature extraction methods. Finally, among the classification techniques, the Support Vector Machine (SVM) algorithm was applied to the processed signals to identify and classify COVID-19 coughs. COVID-19 coughs were identified with 95.86% classification accuracy using the RBF kernel function of the SVM and the MFCC method. In other words, COVID-19 coughs were diagnosed with 98.6% sensitivity and 91.7% specificity.
ARTICLE | doi:10.20944/preprints202009.0216.v1
Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: Bitcoin; SVM; linear mixed models; word embedding; ELMo
Online: 10 September 2020 (04:02:53 CEST)
Introduced in 2009, Bitcoin has demonstrated a huge potential as the world’s first digital currency and has been widely used as a financial investment. Our research aims to uncover the relationship between Bitcoin prices and people’s sentiments about Bitcoin on social media. Among various social media platforms, micro-blogging is one of the most popular. Millions of people use micro-blogging platforms to exchange ideas, broadcast views, and to provide opinions on different topics related to politics, culture, science, and technology. This makes them a potentially rich source of data for sentiment analysis. Therefore we chose one of the busiest micro-blogging platforms, Twitter, to perform sentiment analysis on Bitcoin. We used ELMo embedding model to convert Bitcoin-related tweets into a vector form and SVM classifier to divide the tweets into three sentiment categories - positive, negative, and neutral. We then used the sentiment data to find its relation with Bitcoin price fluctuation using the linear mixed model.
ARTICLE | doi:10.20944/preprints202006.0048.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: Pattern Recognition; Feature extraction; SVM; HOG; Zonal density
Online: 5 June 2020 (14:03:45 CEST)
Significant progress has been made in pattern recognition technology. However, one obstacle that has not yet been overcome is the recognition of words in the Brahmi script, specifically of characters, compound characters, and words, because of their complex structure. For this kind of complex pattern recognition problem, it is always difficult to decide which feature extraction method and classifier would be the best choice. Moreover, different feature extraction methods and classifiers offer complementary information about the patterns to be classified. Therefore, combining feature extraction methods and classifiers in an intelligent way can be beneficial compared with using any single feature extraction method. This study proposes combining HOG and zonal density features with an SVM to recognize Brahmi words. In this paper, the information provided by structural and statistical features is combined using an SVM classifier for word-level script recognition from images of Brahmi words. The Brahmi word dataset contains 6,475 training and 536 testing images of Brahmi words in 170 classes, and the database is made freely available. The word samples from this database are classified based on the confidence scores provided by the support vector machine (SVM) classifier, while HOG and zonal density are used to extract the features of Brahmi words. The maximum accuracy achieved by the system is 95.17%, which is better than that of previously proposed approaches.
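The abstract does not define zonal density. One common reading, assumed here, is the fraction of foreground (ink) pixels in each cell of a regular grid laid over a binarized word image. The sketch below follows that assumption; the function name `zonal_density`, the grid size, and the toy image are hypothetical, and the HOG part of the feature vector is omitted.

```python
def zonal_density(image, zones_y, zones_x):
    """Return foreground fractions per zone, in row-major zone order.

    image is a list of rows of 0/1 pixels; assumes the image has at least
    zones_y rows and zones_x columns so that no zone is empty.
    """
    height, width = len(image), len(image[0])
    features = []
    for zy in range(zones_y):
        y0 = zy * height // zones_y
        y1 = (zy + 1) * height // zones_y
        for zx in range(zones_x):
            x0 = zx * width // zones_x
            x1 = (zx + 1) * width // zones_x
            area = (y1 - y0) * (x1 - x0)
            ink = sum(image[y][x] for y in range(y0, y1) for x in range(x0, x1))
            features.append(ink / area)
    return features

# Toy 4x4 binary image split into a 2x2 grid of zones.
toy_image = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
densities = zonal_density(toy_image, zones_y=2, zones_x=2)
```

The resulting vector could be concatenated with a HOG descriptor before being passed to a classifier, which is one plausible way to combine the two feature types named in the abstract.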
ARTICLE | doi:10.20944/preprints202305.0489.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: Covid-19; KNN; SVM; Fractional Fourier transform; Feature Extraction
Online: 8 May 2023 (09:12:58 CEST)
Covid-19 is a lung disease caused by a virus of the Coronavirus family. With extraordinary prevalence and death rates, it has spread quickly to every country in the world. Thus, achieving peaks and outlines and curing different types of relapses is extremely important. Given the worldwide prevalence of the Coronavirus and the participation of physicians in all countries, information has been gathered regarding the properties of the virus, its diverse types, and the means of analyzing it. Numerous approaches have been used to identify this evolving virus. Examining the patient's lungs and chest through a CT scan is generally considered the most accurate and accepted method. For feature extraction, the fractional Fourier transform (FrFT), one of the time-frequency domain transformations, has been applied. The proposed method was applied to a database of 2481 CT images. After all images are resized to equal dimensions and non-lung areas are removed, multiple combination windows are used to reduce the number of features extracted from the images. In this paper, KNN and SVM classification achieved accuracies of 99.84% and 99.90%, respectively.
ARTICLE | doi:10.20944/preprints202104.0146.v1
Subject: Environmental And Earth Sciences, Oceanography Keywords: Salt Marshes, Google Earth Engine, SVM, Distribution, China’s coast
Online: 5 April 2021 (14:28:19 CEST)
Based on the Google Earth Engine (GEE) cloud platform, this study selected Landsat 5/8 and Sentinel-2 remote sensing images and used the Support Vector Machine (SVM) classification method to classify 35 years of intertidal salt marshes in China, and verified the classification results with field surveys. Finally, the causes and patterns of change in salt marsh species and area were discussed and analyzed in combination with various driving factors. The main results of the study are as follows. The main types of salt marsh plants in China include Phragmites australis, Spartina alterniflora, Suaeda salsa, Scirpus mariqueter, Tamarix chinensis, Cyperus malaccensis, and Sesuvium portulacastrum. The salt marsh classification results indicated areas of 166999.32 ha in 1985, 172893.87 ha in 1990, 174952.29 ha in 1995, 125567.51 ha in 2000, 93257.97 ha in 2005, 102539.04 ha in 2010, 96302.92 ha in 2015, and 115722.75 ha in 2019. The main driving factors of salt marsh change from 1985 to 2015 are reclamation, mudflat aquaculture, climate change, coastal zone erosion, invasion of alien species, and natural competition and succession among salt marsh species. The results can be used to quantitatively analyze salt marsh carbon storage in space and time, and provide data support for the protection of salt marsh wetlands, the restoration of ecological functions, and the implementation of "carbon neutral".
ARTICLE | doi:10.20944/preprints201909.0308.v1
Subject: Medicine And Pharmacology, Other Keywords: Alzheimer’s disease; emphasis learning; multi-modal classification; svm; pca
Online: 27 September 2019 (10:26:34 CEST)
A method for classification is introduced in this article and tested on the ADNI database to diagnose Alzheimer’s disease (AD). Tuning a classifier to obtain better results is a complicated problem, and it becomes more complicated when the model’s accuracy or other performance measurements must exceed 90%. In this study, we tried and succeeded in finding a method to solve this problem. The final feature set can also be used for clustering, because the feature set produced by the proposed method is strengthened. In recent years, much work has been done to develop computer-aided diagnosis (CAD) systems for Alzheimer’s disease. Most of these recently developed systems concentrated on extracting and combining features from MRI, PET, CSF, and …; in this article, we attempted to do the same and used one more technique to increase classification performance. The purpose of this article is to find and produce the best features for solving three binary classification problems: AD vs. Normal Control (NC), Mild Cognitive Impairment (MCI) vs. NC, and MCI vs. AD. Experiments show that the proposed method achieves accuracies of 98.81%, 81.61%, and 81.40% for the AD vs. NC, MCI vs. NC, and AD vs. MCI classification problems, respectively. Using this method thus substantially increased performance on the three binary problems.
ARTICLE | doi:10.20944/preprints202311.0692.v1
Subject: Medicine And Pharmacology, Pharmacy Keywords: chemometrics model; PAT; NIR; DoE; dissolution analysis; ANN; PLS; SVM
Online: 10 November 2023 (10:20:01 CET)
The pharmaceutical industry is making significant strides in enhancing process comprehension through the development of concepts like Quality by Design (QbD) and Process Analytical Technology (PAT). This has shifted practice from traditional offline testing methods to real-time estimation of product quality. The dissolution characteristics of pharmaceutical tablets play a crucial role in maximizing the release of medications and their bioavailability. One factor that can affect dissolution is the blending procedure. Inadequate mixing can lead to patches of concentrated active components or excipients within the tablet, resulting in inconsistent dissolution behavior. This study investigated the impact of blending time and speed on the dissolution behavior of Amlodipine tablets using near-infrared (NIR) spectroscopy and multivariate modeling. NIR spectra were collected for Amlodipine tablets produced under various blending conditions using a 2-level central composite design. Dissolution profiles were analyzed using a USP dissolution device. Multivariate analysis techniques, including principal component analysis (PCA), partial least squares (PLS), Support Vector Machine (SVM), and Artificial Neural Networks (ANN), were applied to the collected NIR spectra. The findings demonstrated that blending speed and time had a significant influence on the dissolution properties of Amlodipine tablets. Blending at faster speeds and for shorter durations resulted in excessive shear and insufficient mixing, ultimately reducing drug release. The multivariate models constructed using ANN outperformed SVM and PLS in predicting dissolution profiles based on NIR spectra. This research highlights the effectiveness of NIR spectroscopy and multivariate modeling in optimizing tablet dissolution. These advancements enable continuous manufacturing of high-quality pharmaceutical products.
ARTICLE | doi:10.20944/preprints202208.0109.v1
Subject: Computer Science And Mathematics, Data Structures, Algorithms And Complexity Keywords: speech emotion recognition; affective computing; data augmentations; wav2vec 2.0; SVM
Online: 4 August 2022 (14:09:21 CEST)
Data augmentation techniques recently gained more adoption in speech processing, including speech emotion recognition. Although more data tends to be more effective, there may be a trade-off in which more data will not provide a better model. This paper reports experiments on investigating the effects of data augmentation in speech emotion recognition. The investigation aims at finding the most useful type of data augmentation and the number of data augmentations for speech emotion recognition. The experiments are conducted on the Japanese Twitter-based emotional speech corpus. The results show that for speaker-independent data, two data augmentations with glottal source extraction and silence removal exhibited the best performance among others, even with more data augmentation techniques. For the text-independent data (including speaker and text-independent), more data augmentations tend to improve speech emotion recognition performances. The results highlight the trade-off between the number of data augmentation and the performance of speech emotion recognition showing the necessity to choose a proper data augmentation technique for a specific application.
ARTICLE | doi:10.20944/preprints202106.0137.v1
Subject: Medicine And Pharmacology, Neuroscience And Neurology Keywords: Multiple sclerosis, Machine Learning, precision, Decision tree, Art Fuzzy, SVM
Online: 4 June 2021 (10:59:51 CEST)
Multiple sclerosis (MS) is a debilitating disease of the brain and spinal cord (central nervous system). In MS, the immune system attacks the protective sheath (myelin) that covers the nerve fibers, causing communication problems between the brain and the rest of the body. Eventually, the disease can cause permanent nerve damage. The signs and symptoms of MS vary widely and depend on the extent of the nerve damage and which nerves are affected. Some people with severe MS may lose the ability to walk independently or at all, while others may experience a long recovery period without any new symptoms. Most people with MS have a relapsing-remitting illness. They experience periods of new symptoms or recurrences that occur over days or weeks and usually improve somewhat or completely. Following these recurrences, there are periods of recovery that can last for months or even years. In this project, we used machine learning methods to predict and classify the different stages of multiple sclerosis, and we evaluated the precision and accuracy of these methods. To calculate accuracy, precision, recall, and F-score, we used several methods, namely Art Fuzzy, SVM, and decision tree, to compare the classes two by two. To improve the results, we used adaptive fuzzy optimization with two options: a genetic algorithm and particle swarm optimization.
ARTICLE | doi:10.20944/preprints202009.0699.v1
Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: SVM; MRMR; Bootstrap; Genes; Gene Expression; Biological Relevance; Subject Classification
Online: 29 September 2020 (09:09:52 CEST)
Selection of biologically relevant genes from high dimensional expression data is a key research problem in gene expression genomics. Most of the available gene selection methods are either based on relevancy or redundancy measure, which are usually adjudged through post selection classification accuracy. Through these methods the ranking of genes was done on a single high-dimensional expression data, which leads to the selection of spuriously associated and redundant genes. Hence, we developed a statistical approach through combining Support Vector Machine with Maximum Relevance and Minimum Redundancy under a sound statistical setup for the selection of biologically relevant genes. Here, the genes are selected through statistical significance values computed using a non-parametric test statistic under a bootstrap based subject sampling model. Further, a systematic and rigorous evaluation of the proposed approach with nine existing competitive methods was carried on six different real crop gene expression datasets. This performance analysis was carried out under three comparison settings, i.e. subject classification, biological relevant criteria based on quantitative trait loci, and gene ontology. Our analytical results showed that the proposed approach selects genes that are more biologically relevant as compared to the existing methods. Moreover, the proposed approach was also found to be better with respect to the competitive existing methods. The proposed statistical approach provides a framework for combining filter, and wrapper methods of gene selection.
ARTICLE | doi:10.20944/preprints201908.0289.v1
Subject: Environmental And Earth Sciences, Remote Sensing Keywords: drone video; human action recognition; CNN; Support vector machine (SVM)
Online: 28 August 2019 (03:52:22 CEST)
Recognition of the human interaction on the unconstrained videos taken from cameras and remote sensing platforms like a drone is a challenging problem. This study presents a method to resolve issues of motion blur, poor quality of videos, occlusions, the difference in body structure or size, and high computation or memory requirement. This study contributes to the improvement of recognition of human interaction during disasters such as an earthquake and flood utilizing drone videos for rescue and emergency management. We used Support Vector Machine (SVM) to classify the high-level and stationary features obtained from Convolutional Neural Network (CNN) in key-frames from videos. We extracted conceptual features by employing CNN to recognize objects from first and last images from a video. The proposed method demonstrated the context of a scene, which is significant in determining the behaviour of human in the videos. In this method, we do not require person detection, tracking, and many instances of images. The proposed method was tested for the University of Central Florida (UCF Sports Action), Olympic Sports videos. These videos were taken from the ground platform. Besides, camera drone video was captured from Southwest Jiaotong University (SWJTU) Sports Centre and incorporated to test the developed method in this study. This study accomplished an acceptable performance with an accuracy of 90.42%, which has indicated improvement of more than 4.92% as compared to the existing methods.
ARTICLE | doi:10.20944/preprints201906.0195.v1
Subject: Computer Science And Mathematics, Information Systems Keywords: adaptive bilateral; marker watershed; PSO; fuzzy C-mean; GLCM; SVM
Online: 20 June 2019 (09:22:05 CEST)
Recently, medical image processing has been used extensively in several areas. Early detection and treatment of disease depend on finding abnormalities in the image. A number of segmentation methods are available for detecting lung nodules in computed tomography (CT) images. The main result of this paper concerns the early detection of lung nodules: pre-processing uses the top-hat transform together with median and adaptive bilateral filters; the two filtering methods were compared, and the adaptive bilateral filter proved to be the suitable method for CT images. The proposed segmentation technique uses a novel strip method in which the image is split into 3, 4, 5, or 6 strips. A marker-watershed method based on PSO and fuzzy C-means clustering is proposed. First, the input image is denoised and smoothed, the filtered image is divided using the strip method, and the image is then segmented by the marker-watershed method. Second, the enhanced PSO technique is used to locate more accurate values for the cluster centers of fuzzy C-means clustering. Finally, using the accurate center values and the enhanced objective function, the small regions of the segmented object are clustered by fuzzy C-means. The segmentation algorithm presented in this paper achieves a 95% accuracy rate in detecting lung nodules when the strip count is 5.
ARTICLE | doi:10.20944/preprints202310.1957.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: machine learning; hybrid models; svm; LSTM; GRU; Stock market; CNN; Tensorflow
Online: 31 October 2023 (02:59:40 CET)
Stock market prediction is a challenging task, because market fluctuations make prices volatile and hard to predict. In this research, we explored the use of hybrid ensemble algorithms to improve predictive accuracy in stock market forecasting. Our research comprises the construction and evaluation of diverse hybrid models using different algorithms. The methodology presented in the paper involves comprehensive data preparation, feature engineering, and model normalization. Among the hybrid models evaluated, one stands out distinctly: the LSTM (long short-term memory network) + GRU (gated recurrent unit) + Conv1D (one-dimensional convolutional layer) hybrid. It illustrates the potential to improve decision-making tools for investors and financial analysts in stock market analytics. The integration of algorithms underscores the effectiveness of hybrid modeling and invites further exploration in predictive modeling. This model achieves a Mean Absolute Error (MAE) of 0.95, a Mean Squared Error (MSE) of 2.1222, a Root Mean Squared Error (RMSE) of 1.52, and an R-squared (R2) score of 0.9982.
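The four error metrics named in this abstract have standard definitions, which can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name `regression_metrics` and the sample values in the usage note are hypothetical and are not taken from the paper.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R^2 for two equal-length numeric sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n          # mean absolute error
    mse = sum(e * e for e in errors) / n           # mean squared error
    rmse = math.sqrt(mse)                          # root of the MSE
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    ss_res = sum(e * e for e in errors)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    r2 = 1.0 - ss_res / ss_tot                     # coefficient of determination
    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}
```

For example, `regression_metrics([1, 2, 3, 4], [1, 2, 3, 6])` yields an MAE of 0.5 and an MSE of 1.0. By definition, RMSE is always the square root of MSE computed on the same data.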
ARTICLE | doi:10.20944/preprints202004.0316.v2
Subject: Environmental And Earth Sciences, Environmental Science Keywords: Precision farming; Early crop-type mapping; Sentinel-2; Random Forest; SVM
Online: 17 January 2022 (10:54:10 CET)
Crop-type mapping is an important intermediate step for cost-effective crop management at the field level, as an overview of all fields with a particular crop type can be used for monitoring or yield forecasting, for instance. Our study used a data set with 2400 fields and corresponding satellite observations from the federal state of Bavaria, Germany. The study classified corn, winter wheat, winter barley, sugar beet, potato, and winter rapeseed as the main crops grown in Upper Bavaria. We additionally experimented with a rejection class "Other", which summarised further crop types. Corresponding Sentinel-2 data included the normalised difference vegetation index (NDVI) and raw bands from 2016 to 2018 for each selected field. The influence of raw bands compared to NDVI was analysed and the classification algorithms, i.e. support vector machine (SVM) and random forest (RF), were compared. The study showed that the use of an index should be critically questioned and that raw bands provided a wider spectral bandwidth, which significantly improved the mapping of crop types. The results underline the use of RF with raw bands and achieved overall accuracies (OA) of up to 92%. We also predicted crop types in an unknown year with significantly different weather conditions and several months before the end of the growing season. Thus, the influence of climate anomalies and the accuracy depending on the time of prediction were assessed. The crop types of a test site and year without labels could be determined with an OA of up to 86%. The results demonstrate the usefulness of the proof-of-concept and its readiness for use in real applications.
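For readers unfamiliar with the index compared against raw bands here, NDVI is computed per pixel from the near-infrared and red reflectances as (NIR - Red) / (NIR + Red). The sketch below is only an illustration of that formula; for Sentinel-2 the NIR and red inputs are usually taken from bands 8 and 4, and the function name and reflectance values are hypothetical.

```python
def ndvi(nir, red):
    """Normalised difference vegetation index for one pixel.

    nir and red are surface reflectances (for Sentinel-2, typically
    band 8 and band 4). Returns 0.0 when both reflectances are zero
    to avoid dividing by zero.
    """
    denominator = nir + red
    if denominator == 0:
        return 0.0
    return (nir - red) / denominator


# Hypothetical reflectance pairs (nir, red) for a few pixels.
pixels = [(0.5, 0.1), (0.3, 0.3), (0.05, 0.2)]
values = [ndvi(nir, red) for nir, red in pixels]
```

Because the index collapses two bands into one number, it discards information that a classifier could otherwise use, which is consistent with the abstract's finding that raw bands improved crop-type mapping.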
ARTICLE | doi:10.20944/preprints202112.0307.v1
Subject: Engineering, Civil Engineering Keywords: Road safety; Safety management; Road transportation; GMDH; GOA-SVM; Machine learning
Online: 20 December 2021 (10:37:05 CET)
Evaluating road safety is critical for successful safety management in road transport systems, yet safety management is challenging because of the dynamic nature of the problem and the large number of parameters that affect road safety. Therefore, evaluating and analysing the important contributing factors that affect the number of crashes plays a key role in improving road safety. For this purpose, two machine learning algorithms, the group method of data handling (GMDH)-type neural network and a combination of support vector machine (SVM) and the grasshopper optimization algorithm (GOA), are employed to evaluate the number of vehicles involved in an accident based on seven factors affecting transport safety: Daylight (DL), Weekday (W), Type of accident (TA), Location (L), Speed limit (SL), Average speed (AS) and Annual average daily traffic (AADT) of rural roads of Cosenza in southern Italy. In this study, 564 data sets of rural areas were investigated and the relevant parameters were measured. In the next stage, several models were developed to investigate the parameters affecting the safety management of road transportation in rural areas. The results demonstrated that "Average speed" has the highest and "Weekday" the lowest level of importance in the investigated rural area. Finally, although the results of both algorithms were the same, the GOA-SVM model showed a better degree of accuracy and robustness than the GMDH model.
ARTICLE | doi:10.20944/preprints201705.0098.v1
Subject: Environmental And Earth Sciences, Environmental Science Keywords: rule-based classification model; wetland remote sensing; SVM; TC-Wetness; China
Online: 11 May 2017 (08:03:34 CEST)
Wetlands are among the most bio-diverse and highest productivity ecosystems on earth, making their monitoring a high priority to conservation, protection and management interests. Although visual interpretation of satellite images is generally precise for monitoring wetlands, recent works have emphasized computerized classification methods because of the reduction in analyst time. However, it is difficult to automatically identify wetland solely based on spectral characteristics due to the complexity of wetland ecosystems. The ability to extract wetland information rapidly and accurately is the basis and the key to wetland mapping at a large scale. Here we propose an operational method to map China wetlands based on Landsat TM data and ancillary data. On the basis of theoretical analysis of wetland automatic classification, we developed a revised multi-layer wetland classification scheme and a rule-based classification model. In the latter, supervised classification (SVM and decision tree) and unsupervised classification (ISODATA) methods were tested. Four Landsat TM images, representing various wetland eco-regions in China (i.e. the Sanjiang Plain in the northeast China, the North China Plain, the Zoige Plateau in the southwest China and the Pearl River Estuary in southeast China), were automatically classified. The overall classification accuracies were 86.57%, 96.00%, 84.51% and 88.30%, respectively, which we considered to be satisfactory accuracy. Our results indicate that issues such as the resolution of geographic data and the understanding of wetland samples should be carefully addressed in the future.
ARTICLE | doi:10.20944/preprints202309.1219.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Radial Fourier signatures; SVM; Machine Learning; skin lesions; texture descriptors; image processing
Online: 19 September 2023 (11:54:59 CEST)
Eight types of skin lesion were analyzed using Artificial Intelligence algorithms: basal cell carcinoma (BCC), squamous cell carcinoma (SCC), melanoma (MEL), actinic keratosis (AK), benign keratosis (BKL), dermatofibromas (DF), melanocytic nevi (NV), and vascular lesions (VASC). This manuscript presents the possibility of using concatenated signatures (instead of images) obtained from different integral transforms, such as Fourier, Mellin, and Hilbert, to classify skin lesions. Eleven other Artificial Intelligence models were applied so that the eight skin lesions could be classified by analyzing the particular signatures of each lesion. The database was randomly divided 80%–20% into training and test image datasets, respectively. The reported metrics are accuracy, sensitivity, specificity, and precision. Each case was repeated 30 times to avoid bias, according to the central limit theorem, and the mean ± standard deviation was reported. Although all the results were very satisfactory, the best average mark for the eight lesions analyzed was obtained using the Subspace KNN model, for which the test metrics were 99.98% accuracy, 99.96% sensitivity, 99.99% specificity, and 99.95% precision.
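The evaluation protocol described above (a random 80%–20% split repeated 30 times, reporting mean ± standard deviation) can be sketched as follows. This is an illustrative outline only: the function `repeated_holdout` and its `evaluate` callback are hypothetical names, and the callback stands in for whatever model training and scoring the authors actually performed.

```python
import random
import statistics


def repeated_holdout(items, evaluate, repeats=30, train_fraction=0.8, seed=0):
    """Repeat a random train/test split and summarise the scores.

    evaluate(train, test) is a user-supplied callable (an assumption of
    this sketch) that fits a model on train and returns a score on test.
    Returns (mean, sample standard deviation) over all repeats.
    """
    rng = random.Random(seed)            # fixed seed for reproducibility
    scores = []
    for _ in range(repeats):
        shuffled = list(items)
        rng.shuffle(shuffled)            # new random partition each repeat
        cut = int(len(shuffled) * train_fraction)
        train, test = shuffled[:cut], shuffled[cut:]
        scores.append(evaluate(train, test))
    return statistics.mean(scores), statistics.stdev(scores)
```

Reporting the spread alongside the mean shows how sensitive a metric is to the particular random partition, which a single split cannot reveal.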
ARTICLE | doi:10.20944/preprints202303.0021.v1
Subject: Engineering, Civil Engineering Keywords: Soil dynamic; Cyclic simple shear; Damping ratio; Sand particle shape; ANN; SVM
Online: 1 March 2023 (10:56:01 CET)
This paper reports on a series of dynamic simple shear tests conducted to investigate the influence of particle shape on the damping ratio of dry sand. Sand samples were subjected to cyclic simple shear tests to evaluate their cyclic behavior. The particle shape was quantified using three shape parameters: roundness, sphericity, and regularity. The sand samples were subjected to twelve different scenarios with varying vertical stresses and cyclic stress ratios (CSR), in both constant and controlled stress states. Each scenario involved five cyclic tests, using the same sand that was reconstructed from its previous cyclic test. After each cyclic test, hysteresis loops were created to determine the damping ratio. The results showed that the shape of the sand particles changed during cyclic loading, becoming more rounded and spherical, which resulted in an increase in damping ratio. Moreover, the paper presents two artificial intelligence models, an artificial neural network (ANN) and a support vector machine (SVM), which were developed to predict the effect of grain shape on the damping ratio. The models were found to be effective in predicting the damping ratio based on the shape of the grain, vertical stress, CSR, and number of loading cycles. Furthermore, a parameter analysis was conducted to rank the input parameters: vertical stress and regularity were found to be the most important, while CSR was the least important. Overall, this study contributes to a better understanding of the relationship between particle shape and damping ratio, which could have practical implications for geotechnical engineering applications.
ARTICLE | doi:10.20944/preprints201908.0225.v1
Subject: Environmental And Earth Sciences, Remote Sensing Keywords: water bodies; satellite images; vector data; SVM; positive and negative buffering; polygons
Online: 21 August 2019 (10:30:16 CEST)
The technique of obtaining information or data about any feature or object from afar, called in technical parlance as remote sensing, has proven extremely useful in diverse fields. In the ecological sphere, especially, remote sensing has enabled collection of data or information about large swaths of areas or landscapes. Even then, in remote sensing the task of identifying and monitoring of different water reservoirs has proved a tough one. This is mainly because getting correct appraisals about the spread and boundaries of the area under study and the contours of any water surfaces lodged therein becomes a factor of utmost importance. Identification of water reservoirs is rendered even tougher because of presence of cloud in satellite images, which becomes the largest source of error in identification of water surfaces. To overcome this glitch, the method of the shape matching approach for analysis of cloudy images in reference to cloud-free images of water surfaces with the help of vector data processing, is recommended. It includes the database of water bodies in vector format, which is a complex polygon structure. This analysis highlights three steps: First, the creation of vector database for the analysis; second, simplification of multi-scale vector polygon features; and third, the matching of reference and target water bodies database within defined distance tolerance. This feature matching approach provides matching of one to many and many to many features. It also gives the corrected images that are free of clouds.
ARTICLE | doi:10.20944/preprints202103.0434.v1
Subject: Engineering, Automotive Engineering Keywords: WiFi sounder; CSI; MIMO; indoor location estimation; array signal processing; machine learning; SVM
Online: 17 March 2021 (10:57:38 CET)
In recent years, propagation channel characteristics have been used effectively for applications such as motion sensing and position detection, and channel sounding methods that are easy to use with low-cost devices have attracted a great deal of attention. This paper presents a device-free indoor location estimation method using spatio-temporal features of radio propagation channels, measured with a 2.4-GHz band 3-by-3 MIMO channel sounder developed using commodity wireless LANs. The measurement results demonstrated reasonable performance of the proposed method with a small number of antennas.
ARTICLE | doi:10.20944/preprints202306.1619.v1
Subject: Engineering, Other Keywords: Machine learning; SVM; ANN; Fracture porosity prediction; Anisotropy; Well logging; Shear waves; Image logs.
Online: 22 June 2023 (12:15:56 CEST)
The purpose of this work is to compare two fracture prediction models with real-world data. The pure Artificial Neural Network (ANN) model emphasizes regression analysis, while the hybrid model (SVM-ANN) combines regression with classification analysis by Support Vector Machine. The results were subsequently tested against logging data by combining the Machine Learning approach with advanced logging tools. In this context, we used electrical image logs and the dipole acoustic tool, which together allowed the distinction of 404 open fractures and 231 closed fractures and, consequently, the estimation of fracture porosity. The results are then fed into two machine-learning algorithms. Pure Artificial Neural Networks and hybrid models are used to establish comprehensive results, which are subsequently tested to check the accuracy of the models. The outputs obtained from the two methods demonstrate that the hybridized model has a lower Root Mean Square Error (RMSE) than pure ANN. The results of our approach strongly suggest that incorporating hybridized machine learning algorithms in fracture porosity estimations can contribute to the development of more trustworthy static reservoir models in simulation programs. Finally, the combination of Machine Learning (ML) and well-log analysis permits reliable estimation of fracture porosity in the Ahnet field in Algeria, where, in many places, advanced logging data is absent and costly.
ARTICLE | doi:10.20944/preprints202303.0335.v1
Subject: Medicine And Pharmacology, Internal Medicine Keywords: NLP; NLU; Twitter; Sentiment Analysis; Opinion Mining; Nigeria; Election; Machine Learning; BERT; LSTM; SVM
Online: 20 March 2023 (02:52:52 CET)
Election outcomes have been predicted in the past with the help of various state-of-the-art language models. Sentiment analysis helps in establishing the opinions of the public about a particular subject, a popular experiment known as opinion mining. Twitter has grown in popularity and proven to be a key tool in mining people’s sentiments concerning elections and other trending subjects of interest. The outcome of the just concluded Presidential election in Nigeria shifts the focus to the Lagos State governorship election. In this study, we propose a Bidirectional Encoder Representations from Transformers (BERT) model for the sentiment analysis of the governorship election in Lagos State, Nigeria, using Twitter data. A total of 800,000 personal and public tweets concerning the three prominent contesting candidates were scraped from Twitter using carefully selected search queries. The tweets were preprocessed to remove noise and inconsistencies. The preprocessed tweets were passed into the pretrained and fine-tuned BERT model. The result was analyzed to establish the sentiments of the public about the candidates. The social networks of the candidates were also analyzed. Parameter tuning yielded different results with different learning rates (LR). Results showed that a learning rate of 1e-7 gave the best performance, and that accuracy increased both with smaller learning rates and with larger epoch sizes.
ARTICLE | doi:10.20944/preprints202210.0238.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: NLP; NLU; Twitter; Sentiment Analysis; Opinion Mining; Nigeria; Election; Machine Learning; BERT; LSTM; SVM
Online: 17 October 2022 (12:01:42 CEST)
Introduction: Social media platforms such as Facebook, LinkedIn, and Twitter, among others, have been used as a tool for staging protests, opinion polls, and campaign strategy, as a medium of agitation, and as a place to express interests, especially during elections. Past studies have established people’s opinions on elections using social media posts. The advent of state-of-the-art algorithms for unstructured text processing implies tremendous progress in natural language processing and understanding. Aim: In this work, a Natural Language framework is designed to understand the Nigeria 2023 presidential election based on public opinion using a Twitter dataset. Methods: Raw datasets of 2,059,113 records with 18 dimensions concerning discourse around the Nigeria 2023 elections were collected from Twitter. Sentiment analysis was performed on the preprocessed dataset using three different machine learning models, namely the Long Short-Term Memory (LSTM) Recurrent Neural Network, Bidirectional Encoder Representations from Transformers (BERT), and Linear Support Vector Classifier (LSVC) models. Personal tweet analysis of the three candidates provided insight into their campaign strategies and personalities, while public tweet analysis established the public’s opinion about them. The performance of the models was also compared using accuracy, recall, false positive rate, precision and F-measure. Results: The LSTM model gave an accuracy, precision, recall, AUC and F-measure of 88%, 82.7%, 87.2%, 87.6% and 82.9% respectively; the BERT model gave an accuracy, precision, recall, AUC and F-measure of 94%, 88.5%, 92.5%, 94.7% and 91.7% respectively; while the LSVC model gave an accuracy, precision, recall, AUC and F-measure of 73%, 81.4%, 76.4%, 81.2% and 79.2% respectively. Conclusion: The experimental results show that sentiment analysis and other Natural Language Processing tasks can aid in the understanding of the social media space. Results also revealed the leverage of each aspirant towards winning the election. We conclude that sentiment analysis can form a general basis for generating insights for elections and modeling election outcomes.
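The comparison metrics listed in the Methods (accuracy, recall, false positive rate, precision, F-measure) all derive from confusion-matrix counts. The sketch below shows the standard definitions for the binary case; the function name and the counts used in the usage note are hypothetical, and a multi-class sentiment task would typically average these per class.

```python
def _ratio(numerator, denominator):
    """Safe division: return 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Standard classification metrics from binary confusion-matrix counts."""
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)               # also called sensitivity
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "false_positive_rate": _ratio(fp, fp + tn),
        # F-measure (F1): harmonic mean of precision and recall
        "f_measure": _ratio(2 * precision * recall, precision + recall),
    }
```

For instance, `binary_metrics(tp=40, fp=10, tn=45, fn=5)` gives an accuracy of 0.85 and a precision of 0.8. AUC, which the Results also report, cannot be computed from a single confusion matrix because it needs the classifier's scores across thresholds.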
ARTICLE | doi:10.20944/preprints202108.0366.v2
Subject: Medicine And Pharmacology, Oncology And Oncogenics Keywords: Lung adenocarcinoma; PD-1 inhibitor; LASSO analysis and SVM-RFE; Immune cell infiltration; TCGA
Online: 25 August 2021 (09:22:41 CEST)
Recently, PD-1 inhibitors have been widely used in clinical trials and shown to improve outcomes in various cancers. However, PD-1/PD-L1 inhibitors showed a low response rate and were effective for only a small number of cancer patients. Thus, it is important to identify key genes that can enhance the PD-1/PD-L1 response in order to promote immunotherapy. Here, we used ssGSEA and unsupervised clustering analysis to identify three clusters with different immune cell infiltration status, prognosis, and biological action. Cluster C showed a better survival rate, high immune cell infiltration, and a better immunotherapy effect, and was enriched in a variety of immune active pathways, including T and B cell signal receptors. In addition, it contained more of immune subtypes C2 and C3. Further, we used WGCNA analysis to confirm the genes correlated with cluster C. The red module was highly correlated with cluster C and comprised 111 genes, which were enriched in a variety of immune-related pathways. To pick candidate genes in SD/PD and CR/PR patients, we used the Least Absolute Shrinkage and Selection Operator (LASSO) and SVM-RFE algorithms. In conclusion, our LASSO and SVM-RFE based research identified targets associated with better prognosis, activated immune-related pathways, and better immunotherapy response. KLRC3 was identified as the key gene associated with an efficient response to immunotherapy, greater efficacy, and better prognosis.
ARTICLE | doi:10.20944/preprints202106.0633.v1
Subject: Engineering, Automotive Engineering Keywords: Conditional temporal moments; Optimizable support vector machine (SVM); Gearbox fault diagnosis; Vibration analysis.
Online: 28 June 2021 (09:57:04 CEST)
Fault diagnosis of the gearbox is a decisive part of modern industry, used to find gearbox defects such as cracked, chipped, or broken gear teeth. However, the nonstationary properties of the vibration signal and the low energy of small faults can make this procedure very challenging. Many techniques have previously been developed for gearbox condition monitoring, but most of them rely on conventional approaches such as time-domain or frequency-domain analysis, which are not suitable for nonstationary vibration signals. Thus, this paper presents a novel gearbox fault diagnosis technique using conditional temporal moments and an optimizable support vector machine (SVM). This work also presents an integrated feature extraction technique that combines standard features, i.e., statistical and spectral features, with moment features. The impact of the four conditional temporal moments for each gearbox condition is also presented. This work shows that the proposed method successfully classifies and categorizes gearbox faults at an early stage.
Subject: Environmental And Earth Sciences, Geophysics And Geology Keywords: gold deposit; alteration information; ASTER image; support vector machine (SVM); principal component analysis (PCA)
Online: 22 October 2019 (04:26:18 CEST)
Dayaoshan, an important metal ore producing area in China, faces resource depletion due to long-term exploitation. In this paper, a remote sensing method is used to delineate favorable metallogenic areas and find new ore points for Gulong. First, vegetation interference was removed using a mixed pixel decomposition method with hyperplane and genetic algorithm (GA) optimization. Second, altered mineral distribution information was extracted based on principal component analysis (PCA) and the support vector machine (SVM) method. Third, the favorable areas for gold mining in Gulong were delineated using an SVM model optimized by the ant colony algorithm (ACA) to remove false altered minerals. Finally, a field survey verified that the extracted alteration mineralization information is correct and effective. The results show that the mineral alteration extraction method proposed in this paper has guiding significance for metallogenic prediction by remote sensing.
ARTICLE | doi:10.20944/preprints201903.0122.v1
Subject: Environmental And Earth Sciences, Remote Sensing Keywords: Classification, SVM Classifier, ML Classifier, Supervised and Unsupervised Classification, Object-based Classification, Multispectral Data
Online: 11 March 2019 (09:01:44 CET)
This paper focuses on the crucial role that remote sensing plays in identifying land features. Remotely collected data provides information in the spectral, spatial, temporal and radiometric domains, with each domain having its own resolution. Diverse sectors such as hydrology, geology, agriculture, land cover mapping, forestry, urban development and planning, oceanography and others use and rely on information gathered remotely from different sensors. In the present study, IRS LISS IV multispectral data is used for land cover mapping. Classifying high-resolution land cover imagery through manual digitizing, however, is time-consuming and costly. Therefore, this paper proposes performing the classification with computer algorithms. These classifications fall into three classes: supervised, unsupervised, and object-based classification. For supervised classification, two approaches are used for land cover classification of the high-resolution LISS-IV multispectral image: Maximum Likelihood (ML) and Support Vector Machine (SVM). Finally, the paper proposes a step-by-step procedure for optical image classification. This paper concludes that in optical data classification, SVM classification gives a better result than the ML classification technique.
ARTICLE | doi:10.20944/preprints202310.1467.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: body-worn sensors; multi layer classifier; random forest; kernel fisher discriminant analysis; SVM; stepwise regression
Online: 23 October 2023 (16:18:56 CEST)
This study presents a research plan that utilizes data obtained from wearable devices to identify human activities and gain insights into human behavior. We developed a model capable of classifying activities similar to human behavior and evaluated the effectiveness and generalization capabilities of this model. The data underwent initial preprocessing, including standardization and normalization. Additionally, recognizing the inherent similarities between human activity behaviors, we introduced a multi-layer classifier model. The first layer is a random forest model based on stepwise regression, which may encounter reduced accuracy for similar activities. The second layer employs a Support Vector Machine (SVM) model based on Kernel Fisher Discriminant Analysis (KFDA). KFDA is used to reduce the dimensionality of data points with potential confusion, followed by SVM for classification. The model was experimentally evaluated and applied to four benchmark datasets: UCI DSA, UCI HAR, WISDM, and IM-WSHA. The experimental results demonstrate that our approach achieved recognition accuracies of 99.71%, 98.71%, 99.12%, and 97.6% on these datasets, indicating excellent recognition performance. Furthermore, to assess the model's generalization ability, we performed K-fold cross-validation on the random forest model and utilized ROC curves for the SVM classifier. The results indicate that our multi-layer classifier model exhibits robust generalization capabilities.
ARTICLE | doi:10.20944/preprints202307.1306.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Recurrent Neural Network(RNN); Support Vector Machine (SVM); Kernel-Adatron algorithm (KA); Euler-Cauchy Algorithm
Online: 19 July 2023 (08:06:13 CEST)
When implementing SVMs, two major problems are encountered: (a) the number of local minima increases exponentially with the number of samples and (b) the computer storage required by a regular quadratic programming solver increases exponentially as the problem size expands. The Kernel-Adatron family of algorithms has gained attention lately because it can handle very large classification and regression problems. However, these methods treat different types of samples (noise, border, and core) in the same manner, which causes searches in unpromising areas and increases the number of iterations. In this work, we introduce a hybrid method to overcome these shortcomings, namely Optimal Recurrent Neural Network Density Based Support Vector Machine (Opt-RNN-DBSVM). This method consists of four steps: (a) characterization of different samples, (b) elimination of samples with a low probability of being a support vector, (c) construction of an appropriate recurrent neural network based on an original energy function, and (d) solution of the system of differential equations, managing the dynamics of the RNN, using the Euler-Cauchy method involving an optimal time step. The RNN remembers the regions explored during the search process thanks to its recurrent architecture. We demonstrated that RNN-SVM converges to feasible support vectors and that Opt-RNN-DBSVM has a very low time complexity compared to RNN-SVM with constant time step and to KAs-SVM. Several experiments were performed on academic data sets. We used several classification performance measures to compare Opt-RNN-DBSVM to different classification methods, and the results obtained show the good performance of the proposed method.
ARTICLE | doi:10.20944/preprints202104.0183.v1
Subject: Computer Science And Mathematics, Computer Networks And Communications Keywords: Intrusion detection systems; machine learning; NSL-KDD; feature selection; classification model; SBDS, ABDS, Snort, SVM
Online: 6 April 2021 (17:59:47 CEST)
Cloud computing is an emerging area that provides on-demand computing resources and services through the internet. It is a fast and efficient technique but is prone to severe security attacks. In this paper, the authors propose a Network Intrusion Detection System (NIDS) to detect attacks at the front end and back end when a heavy flow of data packets passes through a cloud environment. In our framework, we used a signature-based detection system for identifying the intruder and an anomaly-based detection system for detecting network attacks. The NIDS sensors were placed in a collaborative manner to prevent attacks and to update the knowledge bases. The authors used a supervised learning model to detect abnormal packet behavior in network traffic. Models were trained and tested on the dataset and compared in terms of precision, recall, accuracy and model build time to select the best machine-learning model for intruder detection and to improve computational time and performance.
ARTICLE | doi:10.20944/preprints202308.0939.v1
Subject: Engineering, Marine Engineering Keywords: Underwater acoustic sensor network (UASN); intelligent routing protocol; support vector machine (SVM); packet delivery ratio (PDR)
Online: 14 August 2023 (09:57:46 CEST)
Underwater acoustic sensor network (UASN) plays a crucial role in collecting real-time data from remote areas of the ocean. However, deploying the UASN is a challenging problem due to the harsh environment and high deployment cost. Therefore, it is essential to design an appropriate routing protocol to effectively address the issues of routing void, packet delivery delay, and energy utilization. In this paper, an adaptive support-vector-machine-based routing (ASVMR) protocol is proposed for the UASN to prolong the network lifetime and reduce the end-to-end packet delivery delay. The proposed protocol employs a distributed routing approach that dynamically optimizes the routing path in real-time by considering four types of node state information. Moreover, the ASVMR protocol establishes a "routing vector" spanning from the current node to the sink node, and selects a suitable pipe radius according to the packet delivery ratio (PDR). In addition, the ASVMR protocol incorporates future states of sensor nodes into the decision-making process, along with the adoption of a waiting time mechanism and routing void recovery mechanism. Extensive simulation results demonstrate that the proposed ASVMR protocol performs well, in terms of the PDR, the hop count, the end-to-end delay, and the energy efficiency in dynamic underwater environments.
ARTICLE | doi:10.20944/preprints202210.0426.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: novelty-class; one online-Class SVM (OCSVM); memory dump; Malware; Principal Component Analysis (PCA); dimensionality reduction
Online: 27 October 2022 (08:17:43 CEST)
Malware complexity is rapidly increasing, causing catastrophic impacts on computer systems. Memory dump malware is gaining increased attention due to its ability to expose plaintext passwords or key encryption files. This paper presents an enhanced classification model based on One class SVM (OCSVM) classifier that can identify any deviation from the normal memory dump file patterns and detect it as malware. The proposed model integrates OCSVM and Principal Component Analysis (PCA) for increased model sensitivity and efficiency. An up-to-date dataset known as “MALMEMANALYSIS-2022” was utilized during the evaluation phase of this study. The accuracy achieved by the traditional one-class classification (TOCC) model was 55%, compared to 99.4% in the one-class classification with PCA (OCC-PCA) model. Such results have confirmed the increased performance achieved by the proposed model.
ARTICLE | doi:10.20944/preprints202005.0451.v1
Subject: Computer Science And Mathematics, Applied Mathematics Keywords: Bilateral Line Local Binary Patterns; Facial matrix; Statistical subspace; Face recognition; Calibrated SVM model; Ensemble learning
Online: 27 May 2020 (12:07:19 CEST)
Local binary pattern is a visual descriptor that can be used as a powerful feature extractor for texture classification. In this paper, a novel representation for face recognition is proposed, called Bilateral Line Local Binary Patterns (BL-LBP). This scheme is an extension of Line Local Binary Patterns descriptors in the statistical learning subspace. The present bilateral descriptors are fused with an ensemble learning of calibrated SVM models. The performance of this scheme is evaluated using 5 standard face databases. It is found to be robust against illumination variation, diverse facial expressions and head pose variations, and its recognition accuracy reaches 98 percent, running on a mobile device with a processing speed of 63 ms per face. Results suggest that our proposed method can be very useful for vision systems that have limited resources and where the computational cost is critical.
ARTICLE | doi:10.20944/preprints202305.0042.v1
Subject: Environmental And Earth Sciences, Remote Sensing Keywords: Support Vector Machine (SVM); Worldview2; Satellite Imagery; Iterative Dichotomiser 3 (ID3); Burn Extent; Burn Severity; Biomass Consumption
Online: 2 May 2023 (02:23:31 CEST)
Through the use of machine learning algorithms like the Support Vector Machine, it has been shown that burn extent can be accurately mapped from hyperspatial drone imagery in both grasslands and forests. Despite these successes, hyperspatial imagery must be acquired via drones, requiring large amounts of time and resources to capture areas much smaller than the large catastrophic fires which result in the majority of the lands burned each year by wildland fires. To overcome this difficulty, high spatial resolution satellite imagery from Worldview2 can be substituted for hyperspatial drone imagery, allowing for larger regions of images to be acquired more easily and efficiently. Additionally, Worldview2 trades spatial resolution for spectral resolution and extent, capturing images in 8 multispectral bands as opposed to 3 band imagery in the visible spectra. This research examines the utility of each of the 8 bands observed in Worldview2 imagery using an Iterative Dichotomiser 3 decision tree, then uses these bands to map burn extent and biomass consumption. Several classifications of burn extent and biomass consumption are produced and compared based on the bands used as inputs. The results show that using Worldview2 imagery to map burn extent and biomass consumption results in highly accurate maps, with slight improvements when additional bands are added.
ARTICLE | doi:10.20944/preprints202012.0054.v1
Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: Patterns recognition; Machine learning; Hereditary Ataxia diseases; K-Nearest Neighbors; Multi Layer Perceptron; Ensemble Classification Trees; SVM.
Online: 2 December 2020 (09:33:15 CET)
The analysis of progressive gait impairment in patients with neurological diseases such as Hereditary Ataxias (HA) has been carried out using gait data collected with movement sensors. This research focuses on finding the minimum number of gait features required to recognize HA patients efficiently and less intrusively, based on data collected with iPhone movement sensors placed on the ankles of 14 HA patients and 14 healthy people. A twofold proposal is made: first, a local minimum prominent peak criterion to find the starting point of each stride and obtain a 10-stride window from which 56 spatial-temporal features are derived; second, a search strategy based on the Hill Climbing algorithm to reduce the number of gait features and sensors. The main finding was that, with two gait patterns, a classification accuracy of 96% was achieved using the K-Nearest Neighbors (KNN) and Multi-Layer Perceptron (MLP) algorithms; in addition, MLP required only right-ankle sensor patterns, which also reduces intrusiveness.
ARTICLE | doi:10.20944/preprints201910.0349.v2
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: hybrid machine learning; extreme learning machine (ELM); radial basis function (RBF); breast cancer; support vector machine (SVM)
Online: 24 February 2020 (04:10:49 CET)
Mammography is the most common method for the detection of breast cancer, yet it is associated with high cost and many side effects. Machine learning prediction as an alternative method has shown promising results. This paper presents a method based on a multilayer fuzzy expert system for the detection of breast cancer using an extreme learning machine (ELM) classification model integrated with a radial basis function (RBF) kernel, called ELM-RBF, considering the Wisconsin dataset. The performance of the proposed model is further compared with a linear-SVM model. The proposed model outperforms the linear-SVM model with RMSE, R2, and MAPE equal to 0.1719, 0.9374 and 0.0539, respectively. Furthermore, both models are studied in terms of accuracy, precision, sensitivity, specificity, validation, true positive rate (TPR), and false-negative rate (FNR). On these criteria, the ELM-RBF model performs better than the SVM model.
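As an illustration of the three error measures quoted in this abstract, the following is a minimal sketch of RMSE, R2 and MAPE using their standard textbook definitions. The function name `regression_metrics` and the toy values are hypothetical and are not taken from the paper, which does not state how its figures were computed.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (RMSE, R2, MAPE) using the standard definitions.

    Hypothetical helper for illustration only. MAPE is returned as a
    fraction (not a percentage) and is undefined if any true value is 0.
    """
    n = len(y_true)
    mean_true = sum(y_true) / n
    # Residual and total sums of squares.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    rmse = math.sqrt(ss_res / n)
    r2 = 1.0 - ss_res / ss_tot
    mape = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n
    return rmse, r2, mape


# Toy values chosen only to exercise the function.
rmse, r2, mape = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
print(f"RMSE={rmse:.4f}  R2={r2:.4f}  MAPE={mape:.4f}")
```

Lower RMSE and MAPE and a higher R2 indicate a closer fit, which is the sense in which the abstract compares the two models.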
ARTICLE | doi:10.20944/preprints202307.0679.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: SQL injection attacks; Recurrent neural network (RNN) autoencoder; ANN; CNN; Decision Tree; Naïve Bayes; SVM; Random Forest; Logistic Regression
Online: 11 July 2023 (10:53:24 CEST)
SQL injection attacks are one of the most common types of attacks on web applications. These attacks exploit vulnerabilities in the application’s database access mechanisms, allowing attackers to execute unauthorized SQL queries. In this study, we propose an architecture for detecting SQL injection attacks using a recurrent neural network (RNN) autoencoder. The proposed architecture was trained on a publicly available dataset of SQL injection attacks and then compared with several other machine learning models, including ANN, CNN, Decision Tree, Naïve Bayes, SVM, Random Forest, and Logistic Regression. The experimental results showed that the proposed approach achieved an accuracy of 94% and an F1 score of 92%, which demonstrates its effectiveness in detecting SQL injection attacks in comparison with the other models covered in the study.
ARTICLE | doi:10.20944/preprints202107.0638.v1
Subject: Biology And Life Sciences, Plant Sciences Keywords: Image Processing; Automated Plant Diseases Detection; Histogram Oriented Gradient (HOG); Local Binary Pattern (LBP); Support Vector Machine (SVM)
Online: 28 July 2021 (17:18:04 CEST)
Plants play the most important part on Earth. Every organ of a plant plays a vital role in the ecological field as well as the medicinal field. There are, however, many plant species, and different plants have different diseases. It is therefore necessary to identify plants and their diseases to prevent loss, but identifying them manually is very time-consuming. In this research, an automatic plant and plant disease detection system is proposed. For experimental purposes, high-quality leaf images are used for training and testing. To detect the healthy and diseased areas in a leaf, region-based and color-based thresholding techniques were used. For feature selection, the Histogram Oriented Gradient (HOG) and Local Binary Pattern (LBP) methods were applied. Finally, two-class and multi-class Support Vector Machines (SVM) were used for classification. It is observed that both feature selection processes with SVM give 99% accuracy. Finally, a graphical user interface was created to make the automated system accessible to all users.
ARTICLE | doi:10.20944/preprints202311.0248.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Skin cancer; Contourlet Transform (CT); Particle Swarm Optimization (PSO); Support Vector Machine (SVM); Random Forest (RF); Neural Network (NN)
Online: 6 November 2023 (01:19:45 CET)
In recent years, computer-aided analysis techniques have emerged as valuable tools in assisting dermatologists by providing objective and efficient analysis of skin cancer images. This paper utilizes the combination of the Contourlet Transform (CT) and Local Binary Pattern (LBP) techniques for accurately recognizing borders, contrast changes, and shapes of skin cancer images. These results often contain many features, leading to high computational costs and potential over-fitting issues. Hence, we applied Particle Swarm Optimization (PSO) to select the most informative and discriminating features, reducing the dimensionality while retaining important information for accurate classification. After reducing the feature set with PSO, we applied these sets to Machine learning classification algorithms: Support Vector Machine (SVM), Random Forest (RF), and Neural Networks (NN). The results show that SVM has the lowest time complexity of 0.0458 seconds, followed by the Neural Network at 0.08730 seconds, and the Random Forest model has the highest time complexity of 0.1622 seconds. The SVM and Neural Network models are faster to train than the Random Forest model, making them more suitable for real-time or latency-sensitive applications. We also compared our proposed model with the state-of-the-art models and obtained the accuracy of 86.9%, which is the highest among the models.
ARTICLE | doi:10.20944/preprints202111.0345.v1
Subject: Medicine And Pharmacology, Psychiatry And Mental Health Keywords: brain-computer interface (BCI); electroencephalography (EEG); stress state recognition; feature selection; particle swarm optimization (PSO); mRMR; SVM; DEAP; SEED
Online: 19 November 2021 (11:01:19 CET)
Mental stress state recognition using electroencephalogram (EEG) signals for real-life applications needs a conventional wearable device. This requires an efficient number of EEG channels and an optimal feature set. The main objective of the study is to identify an optimal feature subset that can best discriminate mental stress states while enhancing the overall performance. Thus, multi-domain feature extraction methods were employed, namely, time domain, frequency domain, time-frequency domain, and network connectivity features, to form a large feature vector space. To avoid the computational complexity of high dimensional space, a hybrid feature selection (FS) method of minimum Redundancy Maximum Relevance with Particle Swarm Optimization and Support Vector Machine (mRMR-PSO-SVM) is proposed to remove noise, redundant, and irrelevant features and keep the optimal feature subset. The performance of the proposed method is evaluated and verified using four datasets, namely EDMSS, DEAP, SEED, and EDPMSC. To further consolidate, the effectiveness of the proposed method is compared with that of the state-of-the-art heuristic methods. The proposed model has significantly reduced the features vector space by an average of 70% in comparison to the state-of-the-art methods while significantly increasing overall detection performance.
ARTICLE | doi:10.20944/preprints202004.0503.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: UWB; NLOS identification; multi-path detection; NLOS and MP discrimination; machine learning; SVM; random forest; multilayer perceptron; LOS; DWM1000; indoor localization
Online: 29 April 2020 (10:29:54 CEST)
In Ultra-wideband (UWB)-based wireless ranging or distance measurement, differentiation between line-of-sight (LOS), non-line-of-sight (NLOS), and multi-path (MP) conditions is important for precise indoor localization. This is because the accuracy of the reported measured distance in UWB ranging systems is directly affected by the measurement conditions (LOS, NLOS or MP). However, the major contributions in the literature only address the binary classification between LOS and NLOS in UWB ranging systems, and the MP condition is usually ignored. In fact, the MP condition also has a significant impact on the ranging errors of the UWB compared to direct LOS measurement results, although the magnitudes of the errors in MP conditions are generally lower than in completely blocked NLOS scenarios. This paper addresses machine learning techniques for identification of the three classes (LOS, NLOS, and MP) in the UWB indoor localization system using an experimental data-set. The data-set was collected under different conditions and in different scenarios in indoor environments. Using the collected real measurement data, we compare the performance of three machine learning (ML) classifiers: support vector machine (SVM), random forest (RF) based on an ensemble learning method, and multilayer perceptron (MLP) based on a deep artificial neural network. The results show that applying ML methods in UWB ranging systems is effective in identifying the three classes. Specifically, the overall accuracy reaches up to 91.9% in the best-case scenario and 72.9% in the worst-case scenario. The F1-score is 0.92 in the best-case and 0.69 in the worst-case scenario. For reproducible results and further exploration, we (will) provide the publicly accessible experimental research data discussed in this paper at PUB - Publications at Bielefeld University. The evaluations of the three classifiers are conducted using the open-source Python machine learning library scikit-learn.
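To make the two reported measures concrete for a three-class problem, here is a minimal pure-Python sketch of overall accuracy and a macro-averaged F1-score over the LOS, NLOS and MP labels. The abstract does not say which F1 averaging was used, so macro averaging is an assumption, and the function names and toy labels are hypothetical. scikit-learn's `accuracy_score` and `f1_score(..., average="macro")` compute the same quantities.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 scores (macro averaging)."""
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted samples contributes an F1 of 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels, not data from the paper.
y_true = ["LOS", "LOS", "NLOS", "NLOS", "MP", "MP"]
y_pred = ["LOS", "LOS", "NLOS", "MP", "MP", "MP"]
print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred, ["LOS", "NLOS", "MP"]))
```

Macro averaging weights each class equally, so a classifier that neglects the MP class is penalized even if MP samples are rare.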
ARTICLE | doi:10.20944/preprints202007.0634.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: CVD rehabilitation; Local muscular endurance exercises; Exercise-based rehabilitation; Deep Learning; AlexNet; CNN; SVM; kNN; RF; MLP; PCA; multi-class classification; INSIGHT-LME dataset
Online: 26 July 2020 (15:21:08 CEST)
Exercise-based cardiac rehabilitation requires patients to perform a set of prescribed exercises a specific number of times. Local muscular endurance (LME) exercises are an important part of the rehabilitation program. Automatic exercise recognition and repetition counting from wearable sensor data is an important technology to enable patients to perform exercises independently in remote settings, e.g. their own home. In this paper we first report on a comparison of traditional approaches to exercise recognition and repetition counting, corresponding to supervised machine learning and peak detection from inertial sensing signals respectively, with more recent machine learning approaches, specifically Convolutional Neural Networks (CNNs). We investigated two different types of CNN: one using the AlexNet architecture, the other using a time-series array. We found that the performance of the CNN-based approaches was better than that of the traditional approaches. For the exercise recognition task, the AlexNet-based single CNN model outperformed other methods with an overall F1-score of 97.18%. For exercise repetition counting, the AlexNet-based single CNN model again outperformed other methods, correctly counting repetitions in 90% of the performed exercise sets within an error of ±1. To the best of our knowledge, our approach of using a single CNN method for both recognition and repetition counting is novel. In addition to reporting our findings, we also make the dataset we created, the INSIGHT-LME dataset, publicly available to encourage further research.
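The traditional repetition-counting baseline mentioned in this abstract, peak detection on an inertial signal, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: the function name, the `min_height` and `min_gap` parameters, and the synthetic sine signal are all hypothetical.

```python
import math


def count_repetitions(signal, min_height, min_gap):
    """Count local maxima that are at least `min_height` high and at
    least `min_gap` samples after the previously accepted peak.

    Hypothetical baseline: one accepted peak is taken as one repetition.
    """
    count = 0
    last_peak = -min_gap  # lets the first qualifying peak be accepted
    for i in range(1, len(signal) - 1):
        is_peak = signal[i - 1] < signal[i] >= signal[i + 1]
        if is_peak and signal[i] >= min_height and i - last_peak >= min_gap:
            count += 1
            last_peak = i
    return count


# Synthetic stand-in for one accelerometer axis: 5 cycles, 20 samples each.
signal = [math.sin(2 * math.pi * i / 20) for i in range(100)]
reps = count_repetitions(signal, min_height=0.5, min_gap=10)
print(reps)
```

The height threshold suppresses small noise peaks and the minimum gap prevents one repetition from being counted twice; both would need tuning per exercise on real sensor data.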
ARTICLE | doi:10.20944/preprints202310.2071.v1
Subject: Engineering, Energy And Fuel Technology Keywords: artificial neural network (ANN); Support Vector Machine (SVM); Support Vector Regression (SVR); Lightweight Gradient Boosting Machines (Light GBM); Machine Learning; Solar Irradiance (SI); solar forecasting
Online: 31 October 2023 (12:30:45 CET)
ARTICLE | doi:10.20944/preprints202307.1441.v1
Subject: Biology And Life Sciences, Food Science And Technology Keywords: DSC melting profile; Orthogonal Partial Least Squares Discriminant Analysis (OPLS-DA); Artificial neural networks (ANN); Multiple Linear Regression (MLR); MARS; SVM; Food fraud; Oils adulteration
Online: 20 July 2023 (13:56:40 CEST)
Flaxseed oil is one of the best sources of n-3 fatty acids, thus its adulteration with refined oils can lead to a reduction in its nutritional value and overall quality. The purpose of this study was to use the differential scanning calorimetry (DSC) technique to detect adulteration of cold-pressed flaxseed oil with refined rapeseed oil (RP). Based on the melting phase transition curve, parameters such as peak temperature (T), peak height (h), and percentage of area (P) were determined for pure and adulterated flaxseed oils with RP concentrations of 5, 10, 20, 30, and 50% (w/w). Significant linear correlations (p ≤ 0.05) between the RP concentration and all DSC parameters were observed, except for h1. In order to assess the usefulness of the DSC technique for detecting adulteration, three chemometric approaches were compared: 1) classification models (Linear Discriminant Analysis (LDA), Multivariate Adaptive Regression Splines (MARS), Support Vector Machine (SVM), Artificial Neural Networks (ANNs)); 2) regression models (Multiple Linear Regression (MLR), MARS, SVM, ANNs, PLS); and 3) a combined model of Orthogonal Partial Least Squares Discriminant Analysis (OPLS-DA). The LDA model achieved the highest accuracy in classifying the samples, 99.5%, followed by ANN > SVM > MARS. Among the regression models, the ANN model showed the highest correlation between observed and predicted values (R = 0.996), while the goodness of fit of the other models ranked MARS > SVM > MLR. Comparing the OPLS-DA and PLS methods, higher values of R2X(cum) = 0.986 and Q2 = 0.973 were observed with the PLS model than with OPLS-DA. These results demonstrate the usefulness of the DSC technique combined with chemometrics for predicting the adulteration of cold-pressed flaxseed oil with refined rapeseed oil.
ARTICLE | doi:10.20944/preprints201811.0293.v1
Subject: Engineering, Energy And Fuel Technology Keywords: machine learning (ML); artificial neural network (ANN); bagging decision tree (BDT); Support Vector Machines (SVM); no free lunch theorem (NFLT); hyperparameter optimisation; model comparison; heat meter
Online: 13 November 2018 (04:41:07 CET)
Heat meters are used to calculate the energy consumed in central heating systems. This article develops a method for predicting the failure of a heat meter in the next settlement period. Predicting failures is essential to coordinate the replacement of heat meters and to avoid inaccurate readings, incorrect billing and additional costs. The reliability analysis of heat meters was based on historical data collected over many years. Three independent machine learning models were proposed and applied to predict meter failures. The efficiency of the models was confirmed and compared using selected metrics. Hyperparameter optimisation was successfully applied to each of the models. The article shows that device diagnostics does not have to rely only on newly collected information; existing big data sets can also be used.
ARTICLE | doi:10.20944/preprints201907.0319.v1
Subject: Engineering, Energy And Fuel Technology Keywords: heat meter; district heating; fault detection; predictive maintenance; Machine Learning (ML); Artificial Neural Network (ANN); Bagging Decision Tree (BDT); Support Vector Machines (SVM); hyperparameter optimisation; ensemble model
Online: 28 July 2019 (16:26:47 CEST)
The need to increase the energy efficiency of buildings and to use local renewable heat sources means that heat meters are now used not only to calculate consumed energy but also for the active management of central heating systems. Increasing the reading frequency and using measurement data to control the heating system raise the requirements for the reliability of heat meters. The aim of the research is to analyse a large set of meters in a real network and predict their faults to avoid inaccurate readings, incorrect billing, heating system disruption and unnecessary maintenance. The reliability analysis of heat meters, based on historical data collected over several years, shows some regularities which cannot be easily described by physics-based models. The failure rate is almost constant and does depend on the past but is a non-linear combination of state variables. To predict meters' failures in the next settlement period, three independent machine learning models are implemented and compared using selected metrics, because even the high performance of a single model (87% True Positive for Neural Network) may be insufficient to make a maintenance decision. Additionally, hyperparameter optimisation boosts the models' performance by a few percent. Finally, the three improved models are used to build an ensemble classifier which outperforms the individual models. The proposed procedure ensures high fault detection efficiency (>95%) while keeping overfitting to a minimum. The methodology is universal and can be used to study the reliability and predict faults of other types of meters and of different objects with a constant failure rate.
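One common way to combine three classifiers into an ensemble is hard majority voting, sketched below. The abstract does not state how its ensemble combines the three models, so the voting rule here is an assumption, and the function name and prediction vectors are hypothetical.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one list by hard majority vote.

    `predictions_per_model` holds one list of labels per model, all of
    equal length. With three models and binary labels no tie can occur.
    """
    combined = []
    for votes in zip(*predictions_per_model):
        # most_common(1) returns [(label, count)] for the top label.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Hypothetical outputs (1 = predicted failure) of three models on four meters.
ann_pred = [1, 0, 1, 1]
bdt_pred = [1, 1, 0, 1]
svm_pred = [0, 0, 1, 1]
print(majority_vote([ann_pred, bdt_pred, svm_pred]))
```

Voting helps when the individual models make different mistakes, since an error by one model can be outvoted by the other two.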
REVIEW | doi:10.20944/preprints202303.0066.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Fourth Industrial Revolution (4IR); Machine Learning (ML); Precision Agriculture; Support Vector Machine (SVM); Artificial Neural Network (ANN); k-Nearest Neighbour (k-NN); Fuzzy Classification; Global Navigation and Satellite System (GNSS)
Online: 3 March 2023 (09:28:23 CET)
The globe, and more particularly the economically developed regions of the world, is currently in the era of the Fourth Industrial Revolution (4IR). Conversely, the economically developing regions of the world, and more particularly the African continent, have not yet fully passed through the Third Industrial Revolution (3IR) wave, and their economies are still heavily dependent on agriculture. At the same time, global food insecurity is worsening every year because of the exponential growth of the global human population, which continuously heightens food demand in both quantity and quality. This justifies the focus on digitizing agricultural practices to improve farm yield, meet the steep food demand, and stabilize the economies of the African continent and of countries like India whose economies are mainly dependent on agriculture. The tools at our disposal for digitizing farming practices include space technology, and the Global Navigation and Satellite System (GNSS) in particular, machine learning (ML), precision agriculture, and communication systems such as the Internet of Things (IoT) and Information and Communication Technologies (ICT). The most pressing challenges in farming include the monitoring of diseases, pests, weeds and nutrient deficiencies in crops, as early detection translates into swift and timely corrective action and hence more yield at the end of a farming cycle. Vast opportunities for further research in precision agriculture still exist, such as the lack of focus on real-time monitoring and real-time corrective action.
ARTICLE | doi:10.20944/preprints202308.0528.v1
Subject: Engineering, Bioengineering Keywords: Speech Imagery; Mental Task; Machine Learning; Feature Extraction; Common Spatial Pattern (CSP); Filter Bank Common Spatial Pattern (FBCSP); Brain-Computer Interface (BCI); Principal Components Analysis (PCA); Feature Selection; Channel Selection; Mutual Information; Lagrange Formula; Deep Learning; SVM Classifier
Online: 7 August 2023 (10:23:13 CEST)
Brain signal processing is now widely used in brain-computer interface (BCI) applications. Most researchers focus on developing new methods or on improving existing models to identify the optimum standalone feature set. Our research focuses on four ideas: one introduces a future communication model, and the others improve existing models or methods. 1) A new communication imagery model that replaces speech imagery with a mental task: because speech imagery is very difficult, and it is impossible to imagine the sound of every character in every language, we introduce a new mental task model, called lip-sync imagery, that can be used for all characters in all languages. This paper implements lip-sync imagery for two sounds, characters or letters. 2) New combination signals: selecting an unsuitable frequency domain can lead to inefficient feature extraction, so domain selection is important for processing. Combining limited frequency ranges is a preliminary step toward creating a fragmentary continuous frequency. For the first model, two s intervals of 4 Hz were examined and tested as filter banks. The primary purpose is to identify the combination of 4 Hz filter banks (4 Hz being the width of each filter bank) from the 4 Hz to 40 Hz frequency domain that, as a new combination signal (8 Hz), yields efficient features by increasing the distinctive patterns and decreasing the similar patterns of brain activity. 3) A new bond graph classifier to supplement the SVM classifier: the performance of a linear SVM decreases on very noisy data, so we introduce a new linear bond graph classifier to supplement the linear SVM on noisy data. 4) A deep formula recognition model: it converts the data of the first layer into a formula model (formula extraction model). The main goal is to reduce the noise in the coefficients of the formulas in the subsequent layers. The output of the last layer is the coefficients selected by different functions in different layers. Finally, the classifier extracts the root interval of the formulas, and the diagnosis is based on the root interval. Results were obtained for all four ideas and range from 55% to 98%: the lowest result, 55%, is for the deep formula detection model, and the highest, 98%, is for the new combination signals.
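The filter-bank combination in idea 2 can be illustrated with a minimal sketch. It assumes that the 4 Hz to 40 Hz range is split into contiguous, non-overlapping 4 Hz bands and that a "combination signal" is any pair of such bands (8 Hz in total); the abstract does not spell out either point, and the function names `filter_banks` and `band_pairs` are hypothetical.

```python
from itertools import combinations

def filter_banks(low_hz=4, high_hz=40, width_hz=4):
    """Return contiguous (start, end) bands of width_hz covering low_hz..high_hz.

    Assumption: bands do not overlap; the abstract does not say so explicitly.
    """
    return [(f, f + width_hz) for f in range(low_hz, high_hz, width_hz)]

def band_pairs(banks):
    """Enumerate every unordered pair of bands as a candidate combination signal."""
    return list(combinations(banks, 2))

banks = filter_banks()
pairs = band_pairs(banks)
print(len(banks), "bands,", len(pairs), "candidate pairs")
```

Under these assumptions each candidate pair would then be band-pass filtered, passed to feature extraction, and scored by classification performance; that search step is not shown here.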
ARTICLE | doi:10.20944/preprints201810.0073.v1
Subject: Medicine And Pharmacology, Other Keywords: Classification; F-score; Gray-Level Co-occurrence Matrix (GLCM); Gray-Level Run-Length Matrix (GLRLM); Hepatocellular Carcinoma (HCC); Liver Cancer; Liver Abscess; Image Texture; Sequential Backward Selection (SBS); Sequential Forward Selection (SFS); Support Vector Machine (SVM); Ultrasound Image
Online: 4 October 2018 (14:01:42 CEST)
This paper discusses computer-aided diagnosis (CAD) for distinguishing Hepatocellular Carcinoma (HCC), the most common type of liver cancer, from Liver Abscess, based on ultrasound image texture features and a Support Vector Machine (SVM) classifier. From 79 cases of liver disease, 44 of HCC and 35 of liver abscess, this research extracts 96 Gray-Level Co-occurrence Matrix (GLCM) and Gray-Level Run-Length Matrix (GLRLM) features from the regions of interest (ROIs) in ultrasound images. Three feature selection models, i) Sequential Forward Selection, ii) Sequential Backward Selection, and iii) F-score, are adopted to select the features used to identify these liver diseases. Finally, the developed system classifies HCC and liver abscess by SVM with an accuracy of 88.875%. The proposed methods can provide diagnostic assistance in distinguishing these two kinds of liver disease with a CAD system.
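As an illustration of the F-score feature selection step, the following is a minimal sketch, not the authors' implementation. It assumes "F-score" means the two-class Fisher-style score commonly used to rank features before SVM training (between-class separation of the means divided by the sum of within-class sample variances); the abstract does not give the formula. The names `f_score` and `rank_features` and the toy data are hypothetical.

```python
from statistics import mean, variance

def f_score(pos, neg):
    """Two-class F-score of one feature.

    pos, neg: values of the feature in the positive and negative class.
    Assumed definition: squared distances of the class means from the
    overall mean, divided by the sum of the class sample variances.
    """
    overall = mean(pos + neg)
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    within = variance(pos) + variance(neg)
    return between / within if within > 0 else float("inf")

def rank_features(X, y):
    """Return feature indices ordered from most to least discriminative."""
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append((f_score(pos, neg), j))
    return [j for _, j in sorted(scores, reverse=True)]

# Toy data: feature 0 separates the classes, feature 1 does not.
X = [[1.0, 5.0], [1.2, 3.0], [0.9, 4.0], [3.0, 4.5], [3.1, 3.5], [2.9, 4.0]]
y = [1, 1, 1, 0, 0, 0]
print(rank_features(X, y))
```

In a pipeline like the one described, the top-ranked texture features would be kept and passed to the SVM; how many to keep is a tuning choice the abstract does not specify.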
REVIEW | doi:10.20944/preprints201905.0175.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: demand prediction; energy systems; machine learning; artificial neural network (ANN); support vector machines (SVM); neuro-fuzzy; ANFIS; wavelet neural network (WNN); big data; decision tree (DT); ensemble learning; hybrid models; data science; deep learning; renewable energies; energy informatics; prediction; forecasting; energy demand
Online: 14 May 2019 (14:00:40 CEST)
Electricity demand prediction is vital for energy production management and proper exploitation of the present resources. Recently, several novel machine learning (ML) models have been employed for electricity demand prediction to estimate future energy requirements. The main objective of this study is to review the various ML models applied to electricity demand prediction. Through a novel search and taxonomy, the most relevant original research articles in the field are identified and further classified according to the ML modeling technique, prediction type, and application area. A comprehensive review of the literature identifies the major ML models and their applications and discusses the evaluation of their performance. This paper further discusses the trend and the performance of the ML models. As a result, this research reports an outstanding rise in the accuracy, robustness, precision and generalization ability of prediction models that use hybrid and ensemble ML algorithms.