ARTICLE | doi:10.20944/preprints201705.0035.v1
Subject: Earth Sciences, Geology Keywords: landslide; classifier ensemble; instance based learning; Rotation Forest; GIS; Vietnam
Online: 4 May 2017 (08:25:12 CEST)
This study proposes a novel hybrid machine learning approach for modeling rainfall-induced shallow landslides. The proposed approach combines an instance-based learning algorithm (k-NN) with Rotation Forest (RF), state-of-the-art machine learning techniques that have seldom been explored for landslide modeling. The Lang Son city area (Vietnam) was selected as a case study. A spatial database for the study area was constructed and then used to build and evaluate the hybrid model. Performance of the model was assessed using the Receiver Operating Characteristic (ROC) curve, the area under the ROC curve (AUC), success rate and prediction rate, and several statistical evaluation metrics. The results showed that the model performs well on both the training data (AUC = 0.948) and the validation data (AUC = 0.848). The results were compared with those obtained from other soft computing techniques, i.e. Random Forest, J48 Decision Trees, and Multilayer Perceptron Neural Networks. Overall, the proposed model performs better than these methods. Therefore, the proposed model is a promising tool for landslide modeling. The research results can be highly useful for land use planning and management in landslide-prone areas.
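The AUC values quoted in the abstract above can be read as the probability that a randomly chosen landslide location receives a higher susceptibility score than a randomly chosen non-landslide location. The following is a minimal sketch of that computation, not the authors' evaluation code; the function name `roc_auc` and the toy labels and scores are assumptions made for illustration.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 class labels (1 = landslide, hypothetical coding)
    scores: iterable of model scores, higher meaning more susceptible
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (made-up data): 3 of the 4 positive/negative pairs are ordered correctly.
example_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

The pairwise form is quadratic in the number of samples, which is acceptable for illustration; a rank-based formulation would be preferable for large datasets.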
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: Unsupervised anomalous sound detection; classification-based model; Outlier classifier; ID classifier
Online: 17 August 2021 (08:36:44 CEST)
The task of unsupervised anomalous sound detection (ASD) is challenging because anomalous sounds must be detected in a large audio database without any annotated anomalous training data. Many unsupervised methods have been proposed, but previous works have confirmed that classification-based models far exceed the unsupervised models in ASD. In this paper, we adopt two classification-based anomaly detection models: (1) an Outlier classifier, which distinguishes anomalous sounds or outliers from normal sounds; (2) an ID classifier, which identifies anomalies using both the confidence of classification and the similarity of hidden embeddings. We conduct experiments on task 2 of the DCASE 2020 challenge, and our ensemble method achieves an averaged area under the curve (AUC) of 95.82% and an averaged partial AUC (pAUC) of 92.32%, which outperforms the state-of-the-art models.
ARTICLE | doi:10.20944/preprints201903.0122.v1
Subject: Earth Sciences, Geoinformatics Keywords: Classification, SVM Classifier, ML Classifier, Supervised and Unsupervised Classification, Object-based Classification, Multispectral Data
Online: 11 March 2019 (09:01:44 CET)
This paper focuses on the crucial role that remote sensing plays in identifying land features. Remotely collected data provide information in the spectral, spatial, temporal and radiometric domains, with each domain having its own resolution. Diverse sectors such as hydrology, geology, agriculture, land cover mapping, forestry, urban development and planning, oceanography and others use and rely on information gathered remotely from different sensors. In the present study, IRS LISS-IV multispectral data are used for land cover mapping. Classifying high-resolution land cover imagery through manual digitizing, however, is time-consuming and too costly. Therefore, this paper proposes performing the classification with computer algorithms. These classifications fall into three classes: supervised, unsupervised, and object-based classification. For supervised classification, two approaches are used for land cover classification of the high-resolution LISS-IV multispectral image: Maximum Likelihood (ML) and Support Vector Machine (SVM). Finally, the paper proposes a step-by-step procedure for optical image classification. This paper concludes that, in optical data classification, SVM classification gives a better result than the ML classification technique.
ARTICLE | doi:10.20944/preprints202106.0602.v1
Subject: Engineering, Automotive Engineering Keywords: classifier; coffee beans; efficiency; specific energy; sieves
Online: 24 June 2021 (11:22:48 CEST)
Nowadays, some coffee production centers still classify beans manually, which requires a very long time, a lot of labor, and high operational costs. Therefore, the purpose of this research was to design and evaluate the performance of a coffee bean classifier that can accelerate the bean classification process. The classifier consists of three main parts, namely the frame, the driving force, and the sieves. Research parameters include classifier work capacity, power, specific energy, classification distribution and effectiveness, and efficiency. The results showed that the best operating conditions of the coffee bean classifier were found at a rotational speed of 91.07 rpm and a 16° sieve angle, with a classifier working capacity of 38.27 kg/h; the distribution of the beans retained was 56.77% in the first sieve, 28.12% in the second sieve, and 15.11% in the third sieve. The best efficiency of the classifier was found at a rotational speed of 91.07 rpm and a sieve angle of 16°. This classifier is simple in design, easy to operate, and can sort coffee beans into three classes, namely small, medium, and large.
ARTICLE | doi:10.20944/preprints202201.0259.v2
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: image classifier; image part; quick learning; feature overlap; positional context
Online: 11 April 2022 (10:17:57 CEST)
This paper describes an image processing method that makes use of image parts instead of neural parts. Neural networks excel at image or pattern recognition and they do this by constructing complex networks of weighted values that can cover the complexity of the pattern data. These features however are integrated holistically into the network, which means that they can be difficult to use in an individual sense. A different method might scan individual images and use a more local method to try to recognise the features in it. This paper suggests such a method, where a trick during the scan process can not only recognise separate image parts, as features, but it can also produce an overlap between the parts. It is therefore able to produce image parts with real meaning and also place them into a positional context. Tests show that it can be quite accurate, on some handwritten digit datasets, but not as accurate as a neural network, for example. The fact that it offers an explainable interface could make it interesting however. It also fits well with an earlier cognitive model, and an ensemble-hierarchy structure in particular.
ARTICLE | doi:10.20944/preprints202003.0036.v1
Subject: Engineering, Biomedical & Chemical Engineering Keywords: ECG feature selection; heartbeat classification; arrhythmia detection; random forest classifier
Online: 3 March 2020 (11:12:20 CET)
Finding an optimal combination of features and classifier is still an open problem in the development of automatic heartbeat classification systems, especially when applications that involve resource-constrained devices are considered. In this paper, a novel study of the selection of informative features and the use of a random forest classifier while following the recommendations of the Association for the Advancement of Medical Instrumentation (AAMI) and an inter-patient division of datasets is presented. Features were selected using a filter method based on the mutual information ranking criterion on the training set. Results showed that normalized R-R intervals and features relative to the width of the QRS complex are the most discriminative among those considered. The best results achieved on the MIT-BIH Arrhythmia Database were an overall accuracy of 96.14% and F1-scores of 97.97%, 73.06%, and 90.85% in the classification of normal beats, supraventricular ectopic beats, and ventricular ectopic beats respectively. In comparison with other state of the art approaches tested under similar constraints, this work represents one of the highest performances reported to date while relying on a very small feature vector.
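The filter method mentioned in the abstract above ranks features by their mutual information with the class label, computed on the training set only. The sketch below illustrates that ranking criterion under a simplifying assumption: features are taken to be already discretized into bins. The function names and the toy feature names (`rr_interval_bin`, `noise_bin`) are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information I(X;Y) in bits between two discrete sequences."""
    n = len(xs)
    px = Counter(xs)            # marginal counts of X
    py = Counter(ys)            # marginal counts of Y
    pxy = Counter(zip(xs, ys))  # joint counts of (X, Y)
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def rank_features(columns, labels):
    """Return feature names sorted from most to least informative."""
    scores = {name: mutual_information(vals, labels) for name, vals in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy training set (made-up values): one feature mirrors the label, one is unrelated.
train_labels = [0, 0, 1, 1]
train_columns = {
    "noise_bin": [0, 1, 0, 1],
    "rr_interval_bin": [0, 0, 1, 1],
}
ranking = rank_features(train_columns, train_labels)
```

Keeping only the top-ranked entries of `ranking` would give a small feature vector, which is the design goal the abstract describes for resource-constrained devices.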
ARTICLE | doi:10.20944/preprints201906.0169.v1
Subject: Engineering, Electrical & Electronic Engineering Keywords: non-intrusive load monitoring; load disaggregation; linear classifier; demand response
Online: 18 June 2019 (06:06:23 CEST)
Non-intrusive load monitoring (NILM) is a core technology for demand response (DR) and energy conservation services. Traditional NILM methods are rarely combined with practical applications, and most studies aim to decompose the whole load of a household, which leads to low identification accuracy. In this paper, an NILM approach based on multi-feature integrated classification (MFIC) is explored, which combines non-electrical features such as ON/OFF duration, usage frequency of appliances, and usage period to improve load differentiability. The implementation of the MFIC algorithm is consistent with the traditional event-based method. The uniqueness of our algorithm is that it designs an event detector based on steady-state segmentation and a linear discriminant classifier group based on multi-feature global similarity. Simulation results using an open-access dataset demonstrate the effectiveness and high accuracy of the MFIC algorithm, with state-of-the-art NILM methods as benchmarks.
ARTICLE | doi:10.20944/preprints202106.0205.v1
Subject: Behavioral Sciences, Applied Psychology Keywords: Mental Health; Machine Learning; Modern Imaging Techniques; Psychogenic fever; rheumatic; Classifier
Online: 8 June 2021 (09:49:42 CEST)
The brain, the most complex object known in the universe, uses only a few watts of power. Mimicking it would require a nuclear power plant, yet this powerhouse controls the human body single-handedly. Surprisingly, “On the left side, nothing is right and on the right side there is nothing left”. Typically, the brain has two lateral halves, the left hemisphere and the right hemisphere, which work distinctly. The left hemisphere is inclined towards logic; the right hemisphere is the root of imagination combined with critical thinking. In situations like the current COVID-19 pandemic, it is the right half that tends to dominate the processing. This gives rise to mental stress and anxiety, thus aggravating existing medical conditions. Considering this pattern, a survey was conducted in the Durg district of Chhattisgarh, one of the hardest-hit epicentres of the COVID-19 second wave in India. The survey revealed that women of all age groups (10-25, 26-40, above 40) were largely right-brained, i.e. showed dominance of the right over the left hemisphere. Being more imaginative and creative thinkers, they are more likely to suffer from mental health issues than males. The aim of this research is to improve the mental wellbeing of citizens in such threatening conditions. Awareness is essential to prevent this situation, and some stress-relieving games have also been created.
ARTICLE | doi:10.20944/preprints202102.0147.v1
Subject: Medicine & Pharmacology, Allergology Keywords: Pseudomonas; antimicrobial; QSAR; chemical descriptors; machine-learning; KNN; support vector classifier; AdaBoost
Online: 4 February 2021 (22:04:37 CET)
Pseudomonas aeruginosa is a Gram-negative bacillus included among the six "ESKAPE" microbial species with an outstanding ability to "escape" currently used antibiotics, and developing new antibiotics against it is of the highest priority. Whereas minimum inhibitory concentration (MIC) values against Pseudomonas aeruginosa have been used previously for QSAR model development, disk diffusion results (inhibition zones) have apparently not been used for this purpose in the literature, so we decided to explore their use. We developed multiple QSAR models using several machine learning algorithms (Support Vector Classifier, K Nearest Neighbors, Random Forest Classifier, Decision Tree Classifier, AdaBoost Classifier, Logistic Regression, and Naive Bayes Classifier). The main descriptors used in building the models belonged to the families of adjacency matrix, constitutional descriptors, first highest eigenvalue of Burden matrix, centered Moreau-Broto autocorrelation, and averaged and centered Moreau-Broto autocorrelation descriptors. A total of 32 models were built, of which 28 were selected and stacked to create a meta-model. In terms of balanced accuracy, the best performance was provided by the KNN, SVM and AdaBoost algorithms, but the ensemble method had slightly superior results in nested cross-validation.
ARTICLE | doi:10.20944/preprints202006.0333.v1
Subject: Keywords: Lung Cancer Prediction; Neural Network; Cross-validation; Gradient Boosting Classifier; Automated tool
Online: 28 June 2020 (09:56:30 CEST)
Lung cancer, also known as lung carcinoma, is a malignant tumor characterized by uncontrolled cell growth in the lung tissue. It is one of the most prominent causes of death all over the world. Early detection of this disease can help medical care units as well as physicians provide countermeasures to patients. The objective of this paper is to propose an automated tool that takes influential causes of lung cancer as input and detects patients with a higher probability of being affected by this disease. A neural network classifier accompanied by a cross-validation technique is proposed in this paper as a predictive tool. This proposed method is then compared with a baseline classifier, the Gradient Boosting Classifier, in order to justify the prediction performance.
ARTICLE | doi:10.20944/preprints201801.0290.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: ontology; conceptual model; natural language processing; engineering design; fuzzy hierarchical classifier; clustering
Online: 31 January 2018 (02:44:53 CET)
Software engineers all over the world independently solve many similar problems. Under these conditions, the reuse of code, or even better of architecture, becomes a pressing issue. In this paper, a two-phase approach to determining the functional and structural similarities of software projects is proposed. This approach combines two methods of artificial intelligence: natural language processing techniques and a novel method for comparing software projects based on an ontological representation of their architecture, automatically obtained from the projects' source code. Additionally, several similarity metrics are proposed to estimate the similarity between projects.
ARTICLE | doi:10.20944/preprints201709.0084.v1
Subject: Engineering, Control & Systems Engineering Keywords: Passive Sonar; Target Detection; Adaptive Threshold; Bayesian Classifier; K-Mean; Particle Filter
Online: 18 September 2017 (17:04:13 CEST)
This paper presents the results of an experimental investigation of target detection with passive sonar in the Persian Gulf. Detecting sounds propagated in water is one of the basic challenges for researchers in the sonar field. This challenge becomes more complex in shallow water (such as the Persian Gulf) and with low-noise vessels. Generally, in passive sonar, targets are detected using the sonar equation (with a constant threshold), which increases the detection error in shallow water. The purpose of this study is to propose a new method for detecting targets in passive sonar using an adaptive threshold. In this method, the target signal (sound) is processed in the time and frequency domains. Bayesian classification is used for classifying, and the prior distribution is estimated by the Maximum Likelihood algorithm. Finally, the target is detected by combining the detection points in both domains using an LMS adaptive filter. The results show that the proposed method improves the true detection rate by about 27% compared with the best of the other detection methods.
ARTICLE | doi:10.20944/preprints202001.0205.v1
Subject: Behavioral Sciences, Other Keywords: itch; scratch; automated real-time detection; machine-learning based image classifier; image sharpness
Online: 19 January 2020 (03:13:48 CET)
A 'little brother' of pain, itch is an unpleasant sensation that creates a specific urge to scratch. To date, various machine-learning based image classifiers (MBICs) have been proposed for quantitative analysis of itch-induced scratch behaviour of laboratory animals in an automated, non-invasive, inexpensive and real-time manner. In spite of MBICs' advantages, the overall performance (accuracy, sensitivity and specificity) of current MBIC approaches remains inconsistent, with values varying from ~50% to ~99%, and the underlying reasons have yet to be investigated further, both computationally and experimentally. To look into the variation in the performance of MBICs in automated detection of itch-induced scratch, this article focuses on the experimental data recording step and reports for the first time that MBICs' overall performance is inextricably linked to the sharpness of the experimentally recorded video of laboratory animal scratch behaviour. This article furthermore demonstrates for the first time that a linear correlation exists between video sharpness and the overall performance (accuracy and specificity, but not sensitivity) of MBICs, and highlights the primary role of experimental data recording in rapid, accurate and consistent quantitative assessment of laboratory animal itch.
ARTICLE | doi:10.20944/preprints202010.0616.v1
Subject: Mathematics & Computer Science, Algebra & Number Theory Keywords: Spike-and-wave; Generalized Gaussian distribution; EEG; Morlet wavelet; k-nearest neighbors classifier; Epilepsy
Online: 29 October 2020 (14:05:54 CET)
Spike-and-wave discharge (SWD) pattern detection in electroencephalography (EEG) signals is a key signal processing problem. It is particularly important for overcoming time-consuming, difficult, and error-prone manual analysis of long-term EEG recordings. This paper presents a new SWD detection method with low computational complexity that can be easily trained with data from standard medical protocols. Specifically, EEG signals are divided into time segments to which the Morlet 1-D decomposition is applied. The generalized Gaussian distribution (GGD) statistical model is fitted to the resulting wavelet coefficients. A k-nearest neighbors (k-NN) self-supervised classifier is trained using the GGD parameters to detect the spike-and-wave pattern. Experiments were conducted using 106 spike-and-wave signals and 106 non-spike-and-wave signals for training and another 96 annotated EEG segments from six human subjects for testing. The proposed SWD classification methodology achieved 95% sensitivity (true positive rate), 87% specificity (true negative rate), and 92% accuracy. These results pave the way for new research into the causes underlying so-called absence epilepsy in long-term EEG recordings.
ARTICLE | doi:10.20944/preprints201809.0146.v1
Subject: Earth Sciences, Geoinformatics Keywords: Fuzzy c-Means (FCM) Classifier, Similarity and Dissimilarity measures, Distance, Fuzzy Error Matrix (FERM)
Online: 8 September 2018 (01:46:24 CEST)
In this study, the fuzzy c-means (FCM) classifier has been studied with nine other similarity and dissimilarity measures: Manhattan distance, chessboard distance, Bray-Curtis distance, Canberra, Cosine distance, correlation distance, mean absolute difference, median absolute difference and normalised squared Euclidean distance. Both single and composite modes were used with a varying weight constant (m) and at different α-cuts. The two best single norms obtained were combined to study the effect of composite norms on the datasets used. An image-to-image accuracy check was conducted to assess the accuracy of the classified images. Fuzzy Error Matrix (FERM) was applied to assess the accuracy of the outcomes for a Landsat-8 dataset with respect to the Formosat-2 dataset. In conclusion, the FCM classifier with the Cosine norm performed better than with the conventional Euclidean norm. However, because the FCM classifier cannot handle noise properly, the classification accuracy was around 75%.
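To make the role of the distance measure and the weight constant m in the abstract above concrete, the following sketch computes standard fuzzy c-means membership values for a single pixel, with cosine distance plugged in as the norm. It is an illustration under stated assumptions, not the study's implementation: the function names, the two-band pixel, and the class centers are hypothetical, and α-cuts and composite norms are not covered.

```python
import math


def cosine_distance(a, b):
    """Cosine dissimilarity: 1 minus the cosine of the angle between a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def fcm_memberships(pixel, centers, m=2.0, dist=cosine_distance):
    """Standard FCM membership of one pixel in each class.

    u_i = 1 / sum_j (d_i / d_j) ** (2 / (m - 1)), where d_i is the distance
    from the pixel to center i and m > 1 is the weight constant (fuzzifier).
    """
    d = [dist(pixel, c) for c in centers]
    # A pixel that coincides with one or more centers belongs fully to them.
    zeros = [i for i, v in enumerate(d) if v <= 1e-12]
    if zeros:
        return [1.0 / len(zeros) if i in zeros else 0.0 for i in range(len(d))]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** power for d_j in d) for d_i in d]


# Hypothetical two-band pixel halfway (in angle) between two class centers.
memberships = fcm_memberships([1.0, 1.0], [[1.0, 0.0], [0.0, 1.0]])
```

Swapping `dist` for another measure (for example a Manhattan distance function) changes only the distances fed into the same membership formula, which is how a comparison of norms like the one described above can be organized.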
ARTICLE | doi:10.20944/preprints201801.0097.v1
Subject: Engineering, Electrical & Electronic Engineering Keywords: deep learning; automatic modulation classification; classifier fusion; convolutional neural network; long short-term memory
Online: 11 January 2018 (04:47:00 CET)
Deep learning has recently attracted much attention due to its excellent performance in processing audio, image, and video data. However, few studies are devoted to the field of automatic modulation classification (AMC). AMC is one of the most well-known research topics in communication signal recognition and remains challenging for traditional methods due to complex disturbance from other sources. This paper proposes a heterogeneous deep model fusion (HDMF) method to solve the problem in a unified framework. The contributions include: 1) The convolutional neural network (CNN) and long short-term memory (LSTM) are combined in two different ways without prior knowledge involved; 2) A large database, including eleven types of single-carrier modulation signals with various noises as well as a fading channel, is collected with various signal-to-noise ratios (SNRs) based on a real geographical environment; and 3) Experimental results demonstrate that HDMF is highly capable of coping with the AMC problem and achieves much better performance than the independent networks. The source code and the database will be publicly available.
ARTICLE | doi:10.20944/preprints202008.0206.v1
Subject: Life Sciences, Microbiology Keywords: tongue microbiome; salivary microbiome; amplicon sequence variant (ASV); operational taxonomical unit (OTU); denoising; DADA2; taxonomic classifier
Online: 8 August 2020 (09:29:46 CEST)
The bacterial composition of oral samples has traditionally been determined by PCR amplicon sequencing of 16S rRNA genes. Recent amplicon sequence variant (ASV)-based analyses of 16S rRNA genes differ from those based on operational taxonomic unit (OTU) clustering in the way they deal with sequences having potential errors. However, little information is available on their application in oral microbiome studies. Here, we conducted ASV-based analysis of oral microbiome samples using QIIME 2. We investigated the optimal parameters for sequence denoising using DADA2 and found that trimming the first 20 nucleotides from the 5′-end of both paired reads avoided excessive sequence loss during chimera removal. Truncating reads at positions 240–245 allowed the removal of low-quality sequences while maintaining sufficient length to merge matching paired ends. Taxonomic assignment, using the naïve Bayes classifier trained with the V3-V4 region of reference 16S rRNA sequences in the extended human oral microbiome database (eHOMD), resulted in bacterial compositions similar to those of OTU-based analyses. Contrary to OTU-based clustering, ASV-based analysis showed that taxonomic abundance at the genus or species level did not differ significantly in tongue microbiomes, regardless of brushing. QIIME 2 can, therefore, be a standard pipeline for ASV-based analysis of oral microbiomes.
ARTICLE | doi:10.20944/preprints202112.0264.v1
Subject: Medicine & Pharmacology, Clinical Neurology Keywords: concussion; mild traumatic brain injury; working memory; long-term cognitive outcome; support vector machine classifier; personalized prediction
Online: 16 December 2021 (10:24:08 CET)
Concussion, also known as mild traumatic brain injury (mTBI), commonly causes transient neurocognitive symptoms, but in some cases it causes cognitive impairment, including working memory (WM) deficit, which can be long-lasting and impede a patient’s return to work. The predictors of long-term cognitive outcomes following mTBI remain unclear because abnormality is often absent in structural imaging findings. The purpose of the study was to determine whether machine learning-based models using functional magnetic resonance imaging (fMRI) biomarkers and demographic or neuropsychological measures at baseline could effectively predict 1-year cognitive outcomes of concussion. We conducted a prospective, observational study of patients with mTBI, enrolled between September 2015 and August 2020, who were compared with demographically matched healthy controls. Baseline assessments were collected within the first week of injury, and follow-ups were conducted at 6 weeks, 3 months, 6 months, and 1 year. Potential demographic, neuropsychological, and fMRI features were selected according to the significance of their correlation with the estimated changes in WM ability. The support vector machine classifier was trained using these potential features and the estimated changes in WM between the predefined time periods. Patients demonstrated significant cognitive recovery at the third month, followed by worsened performance after 6 months, which persisted until 1 year after concussion. Approximately half of the patients experienced prolonged cognitive impairment at the 1-year follow-up. Satisfactory predictions were achieved for patients whose WM function did not recover at 3 months (accuracy=87.5%), 6 months (accuracy=83.3%), and 1 year (accuracy=83.3%), and for patients who performed worse at the 1-year follow-up than at the baseline assessment (accuracy=83.3%).
This study demonstrated the feasibility of personalized prediction for long-term postconcussive WM outcomes based on baseline fMRI and demographic features, opening a new avenue for early rehabilitation intervention in selected individuals with possible poor long-term cognitive outcomes.
ARTICLE | doi:10.20944/preprints202009.0257.v1
Subject: Keywords: Face Detection; Kohonen Self-Organizing Feature Map(K-SOM); Skin Color Segmentation; K-Nearest Neighbour (KNN) Classifier
Online: 11 September 2020 (12:10:28 CEST)
In today's world it is very important to maintain the security of information and manage its risks. Biometric-based techniques are very useful for these problems. Among the several kinds of biometric-based techniques, face detection is one of the most complex and most important. A human face changes over time due to age and other factors, and a person also has many expressions. The appearance of a face also changes with the lighting conditions or the angle of the input device. As a result, the face may not be detected properly. In this paper, a method is proposed that can detect human faces efficiently, both automatically and manually. In manual mode, a user can select the input faces offered by the system according to their choice. In automated mode, the system detects all possible face areas using the Kohonen Self-Organizing Feature Map technique. This technique reduces the complex color image to a vector-quantized image with the desired colors. A color segmentation technique is then used to detect the possible face skin areas in the vector-quantized image. Next, the Histogram of Oriented Gradients technique is used to extract features from the faces, and a K-Nearest Neighbour classifier is used to compare the face images detected by the two modes. The automated method produced better accuracy than the manual method.
ARTICLE | doi:10.20944/preprints201910.0148.v1
Subject: Engineering, Electrical & Electronic Engineering Keywords: static synchronous compensator (STATCOM); discrete wavelet transform (DWT); multi-layer perceptron neural network (MLP); Bayes and Naive Bayes (NB) classifier
Online: 13 October 2019 (16:22:41 CEST)
This paper presents a methodology to detect and identify the type of fault that occurs in a transmission line with a shunt-connected static synchronous compensator (STATCOM), using a combination of Discrete Wavelet Transform (DWT) and a Naive Bayes classifier. To study this, the network model is designed using MATLAB/Simulink. Different faults, namely Line to Ground (LG), Line to Line (LL), Double Line to Ground (LLG) and three-phase (LLLG) faults, are applied at different zones of the system with and without STATCOM, considering the effect of varying fault resistance. The three-phase fault current waveforms obtained are decomposed into several levels using the Daubechies db4 mother wavelet to extract features such as standard deviation and energy values. The extracted features are used to train Multi-Layer Perceptron Neural Network (MLP), Bayes and Naive Bayes (NB) classifiers to classify the type of fault that occurs in the system. The results reveal that the proposed NB classifier outperforms the MLP and Bayes classifiers in terms of accuracy rate, misclassification rate, kappa statistics, mean absolute error (MAE), root mean square error (RMSE), relative absolute error (RAE) and root-relative square error (RRSE).
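The feature extraction step described in the abstract above (multi-level wavelet decomposition, then standard deviation and energy per level) can be sketched as follows. Note the substitution: the paper uses the Daubechies db4 wavelet, whereas this self-contained sketch uses the simpler Haar wavelet purely to keep the example short, so the numeric values would differ from the paper's. The function names and the test waveform are hypothetical.

```python
import math
import statistics


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail).

    Assumes an even-length input; a trailing odd sample would be ignored.
    """
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    return approx, detail


def wavelet_features(signal, levels=3):
    """Return [(std, energy), ...] of the detail coefficients at each level.

    Haar stands in for db4 here (an assumption of this sketch). The signal
    length should be divisible by 2 ** levels.
    """
    features = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        energy = sum(c * c for c in detail)  # sum of squared coefficients
        std = statistics.pstdev(detail)      # population standard deviation
        features.append((std, energy))
    return features


# Hypothetical current waveform: a fast alternating pattern of 8 samples.
fault_features = wavelet_features([1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0], levels=2)
```

In a pipeline like the one described, one such feature vector per phase current would be passed to the classifier; that wiring is left out here.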
ARTICLE | doi:10.20944/preprints202008.0330.v1
Subject: Keywords: Skin Detection; Color Space Model; Aggregated Channel Features (ACF) Detector; Histogram Oriented Gradient (HOG) Features Detection; Bootstrap Aggregation Decision Tree Classifier; Spot Detection
Online: 15 August 2020 (03:28:51 CEST)
The human face and its parts are highly significant because they reveal a person’s true identity. They play an important role in various biometric applications such as crowd analysis, human tracking, photography, and cosmetic surgery. Many techniques are available to detect a facial image. Among them, skin detection is the most popular. The aim of this paper is to first identify the person from a facial image and then check for any spot present on the detected person. The first step is to detect the maximum skin region based on a combination of the RGB and HSV color space models. The next step is to verify the human skin areas through a machine learning approach. The Aggregated Channel Features (ACF) detector is used to identify the different facial parts such as eye pairs, nose, and mouth. A bootstrap aggregation decision tree classifier is applied to classify the person’s identity based on Histogram of Oriented Gradients (HOG) feature values. The experimental results show that the proposed method gives an average accuracy of 97%.
ARTICLE | doi:10.20944/preprints201806.0188.v1
Subject: Earth Sciences, Geoinformatics Keywords: minimum noise fraction (MNF) transformation; object-based image analysis (OBIA); APEX hyperspectral imagery; Random forest (RF) classifier; multiresolution segmentation (MRS); tree species classification
Online: 12 June 2018 (10:55:07 CEST)
Tree species composition is a key element for biodiversity and sustainable forest management, and hyperspectral data provide detailed spectral information that can be used for tree species classification. There are two main challenges in using hyperspectral imagery: a) the Hughes phenomenon, meaning that as the number of bands in hyperspectral imagery increases, the number of required classification samples increases exponentially, and b) in a more complex environment, such as riparian mixed forest, focusing on spectral variability per pixel may not be adequate for distinguishing tree species. Therefore, the focus of this study is to assess spectral-spatial dimensionality reduction of airborne hyperspectral imagery using minimum noise fraction (MNF) transformation and object-based image analysis (OBIA). Airborne Prism Experiment (APEX) hyperspectral imagery was used. The study area was a riparian mixed forest located along the Salzach river, and six tree species, Picea abies, Populus (canadensis and balsamifera), Fraxinus excelsior, Alnus incana, and Salix alba, were selected. A machine learning algorithm, random forest (RF), was used to train and apply a prediction model for classification. A pixel-level classification was also performed using the spectrally dimensionality-reduced APEX data. According to a confusion matrix, the object-level classification of MNF-derived components achieved an overall accuracy of 85% and a kappa coefficient of 0.805. Producer’s accuracy per class varied from 80% for Fraxinus excelsior, Alnus incana, and Populus canadensis to 90% for Salix alba and Picea abies. Comparing the results with the pixel-level classification showed better performance for the object-level classification (the pixel-level classification achieved an overall accuracy of 63% and a kappa coefficient of 0.559). Per-class performance of the pixel-based classification varied from 45% for Alnus incana to 80% for Picea abies.
In general, spectral-spatial complexity reduction using MNF transformation and object-level classification yielded statistically satisfactory results.
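The overall accuracy and kappa coefficient reported in the abstract above are both derived from a confusion matrix. The sketch below shows the standard definitions (observed agreement, and Cohen's kappa as agreement corrected for chance). The function name and the 2x2 matrix are made up for illustration; they are not the study's six-class matrix and do not reproduce its figures.

```python
def accuracy_and_kappa(cm):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    cm[i][j] = number of samples of reference class i assigned to class j.
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(cm[i][i] for i in range(k)) / total
    # Chance agreement: sum over classes of (row total * column total) / total^2.
    expected = sum(
        sum(cm[i]) * sum(row[i] for row in cm) for i in range(k)
    ) / (total * total)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical two-class matrix with 100 validation samples.
overall_accuracy, kappa = accuracy_and_kappa([[45, 5], [10, 40]])
```

Kappa is lower than overall accuracy here because part of the observed agreement would be expected by chance alone, which is why the two figures are usually reported together.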