ARTICLE | doi:10.20944/preprints202008.0336.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: image processing; image classification; computer vision; expert systems; amber gemstones
Online: 15 August 2020 (04:39:11 CEST)
The article describes a classification solution for amber stones. The problem of classifying amber has long been known among jewelers and artisans of amber art. Existing solutions can classify amber pieces by color, but the need to classify by shape and texture has not yet been met. The proposed solution classifies the gemstones according to shape. Amber is a particularly challenging object because its form is difficult to define unambiguously. Data for the amber experiments was gathered from amber art craftsmen. In the proposed solution, amber form can be classified into 10 different classes (7 of which were chosen during the experiment).
REVIEW | doi:10.20944/preprints202105.0127.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: Image Acquisition, Image preprocessing, Image enhancement, beatboxing, segmentation
Online: 7 May 2021 (09:09:14 CEST)
Human beatboxing is a vocal art that uses the speech organs to produce vocal drum sounds and imitate musical instruments. Beatbox sound classification is a current challenge with applications in automatic database annotation and music-information retrieval. In this study, a large-vocabulary human-beatbox sound recognition system was developed by adapting the Kaldi toolbox, a widely used tool for automatic speech recognition. The corpus consisted of eighty boxemes, recorded repeatedly by two beatboxers. The sounds were annotated and transcribed for the system by means of a beatbox-specific morphographic writing system (Vocal Grammatics). Image processing techniques play a vital role in image acquisition, pre-processing, clustering, segmentation, and classification for many kinds of images, such as fruit, medical, vehicle, and digital text images. In this study, the images are cleaned of unwanted noise and enhanced with techniques such as contrast-limited adaptive histogram equalization, Laplacian and Haar filtering, unsharp masking, sharpening, high-boost filtering, and color-model transformations. Clustering algorithms are then used to group the data logically and extract patterns for analysis and decision-making; regions are segmented using binary, K-means, and Otsu segmentation algorithms; and the images are classified with Support Vector Machine (SVM) and K-Nearest Neighbour (KNN) classifiers, producing good results.
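As an illustrative sketch of one of the segmentation steps named above, Otsu's method can be implemented in pure NumPy by choosing the threshold that maximizes between-class variance of the grayscale histogram. The toy image and variable names are illustrative only, not code from the reviewed study:

```python
import numpy as np

def otsu_threshold(gray):
    """Return the Otsu threshold maximizing between-class variance
    for a uint8 grayscale image."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0, w1 = prob[:t].sum(), prob[t:].sum()
        if w0 == 0 or w1 == 0:
            continue  # all mass on one side: variance undefined
        mu0 = (np.arange(t) * prob[:t]).sum() / w0
        mu1 = (np.arange(t, 256) * prob[t:]).sum() / w1
        var = w0 * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

# Bimodal toy image: dark background with a bright square
img = np.full((64, 64), 40, dtype=np.uint8)
img[16:48, 16:48] = 200
t = otsu_threshold(img)
mask = img >= t  # binary segmentation of the bright region
```

Any threshold between the two histogram modes maximizes the variance here; the loop returns the first such value.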
ARTICLE | doi:10.20944/preprints202112.0140.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: Image Recognition; Preference Net
Online: 8 December 2021 (14:43:39 CET)
Accuracy and computational cost are the main challenges for deep neural networks in image recognition. This paper proposes an efficient reduction of ranking to binary classification using a new feed-forward network and feature selection based on ranking the image pixels. Preference net (PN) is a novel deep ranking learning approach based on the Preference Neural Network (PNN), which uses a new ranking objective function and a positive smooth staircase (PSS) activation function to accelerate the ranking of image pixels. PN has a new type of weighted kernel based on Spearman rank correlation, instead of convolution, to build the feature matrix. The PN employs multiple kernels of different sizes to partially rank the image pixels in order to find the best feature sequence. PN consists of multiple PNNs that share an output layer, with a separate PNN for each ranker kernel. The output results are converted to classification accuracy using a score function. Using a weighted-average ensemble of the PN models for each kernel, PN shows promising results compared to the latest deep learning (DL) networks on the CIFAR-10 and Fashion-MNIST datasets, in terms of both accuracy and lower computational cost.
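The Spearman-correlation kernel described above rests on a standard identity: for vectors without tied values, Spearman correlation is the Pearson correlation of the rank vectors. A minimal NumPy version (no tie handling; illustrative only, not the paper's implementation):

```python
import numpy as np

def ranks(x):
    """Rank positions of x (0-based); assumes no tied values."""
    order = np.argsort(x)
    r = np.empty(len(x), dtype=float)
    r[order] = np.arange(len(x))
    return r

def spearman(a, b):
    """Spearman correlation = Pearson correlation of the ranks."""
    ra, rb = ranks(a) - ranks(a).mean(), ranks(b) - ranks(b).mean()
    return float((ra @ rb) / np.sqrt((ra @ ra) * (rb @ rb)))

# A patch whose pixels are ordered exactly like the kernel scores rho = 1
patch = np.array([2.0, 9.0, 5.0, 7.0])
kernel = np.array([0.1, 0.9, 0.4, 0.6])
rho = spearman(patch, kernel)
```

A ranking-based kernel like this responds to the ordering of pixel intensities in a patch rather than to their magnitudes, which is the property the PN kernels exploit.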
ARTICLE | doi:10.20944/preprints202308.0070.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Deep Neural Networks; Hyperbolic Geometry; Image Classification; Lorentz Model; Graphs
Online: 2 August 2023 (02:08:21 CEST)
Deep neural networks with powerful auto-optimization tools have been widely applied in various research fields, e.g., NLP and computer vision. However, existing neural network architectures are typically constructed with different inductive biases, i.e., preconceptions expected to shrink the parameter search space during training, reduce computational cost, or introduce expert knowledge into the network design. As an alternative, the Multilayer Perceptron (MLP) provides much greater freedom for exploration, has a lower inductive bias than convolutional neural networks (CNNs), and offers good flexibility in learning complex patterns. Even so, such architectures are commonly built in flat Euclidean space, which is not necessarily the optimal space for all data, and is especially ill-suited to modeling hierarchical correlations. Hyperbolic neural networks (HNNs) have gained attention for their ability to capture hierarchical structures present in complex data types such as graphs. Recently, there has been increasing interest in extending HNNs to computer vision tasks, motivated by the observation that images possess rich hierarchical relations. However, this is generally done by employing a Euclidean backbone to learn higher-level semantic representations and only incorporating a hyperbolic classifier for classification, which, we argue, does not make full use of the advantages of hyperbolic space. Motivated by the resurgence of the attention-free Multilayer Perceptron (MLP), in this paper we extend it to non-Euclidean space and propose a novel architecture, named Hyperbolic Res-MLP (HR-MLP), that leverages fully hyperbolic layers to learn feature embeddings and perform image classification in an end-to-end fashion.
With the help of the proposed Lorentz cross-patch and cross-channel layers, we can perform operations directly in the hyperbolic domain with fewer parameters, making the model faster to train and giving comparatively better performance than its Euclidean counterpart. Experiments on CIFAR10, CIFAR100, and MiniImageNet demonstrate comparable or superior performance relative to Euclidean baselines. Our code is available at https://github.com/Ahmad-Omar-Ahsan/HR-MLP
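For readers unfamiliar with the Lorentz model the paper builds on, the following NumPy sketch shows the Minkowski inner product and the exponential map at the hyperboloid origin (curvature -1). This is standard hyperbolic-geometry machinery, not code from the HR-MLP repository:

```python
import numpy as np

def lorentz_inner(x, y):
    """Minkowski inner product: -x0*y0 + <x_rest, y_rest>.
    Points on the hyperboloid satisfy <x, x>_L = -1, x0 > 0."""
    return -x[0] * y[0] + np.dot(x[1:], y[1:])

def exp_map_origin(v_rest):
    """Map a Euclidean tangent vector at the origin (1, 0, ..., 0)
    onto the Lorentz model via cosh/sinh."""
    n = np.linalg.norm(v_rest)
    if n == 0:
        return np.concatenate(([1.0], np.zeros_like(v_rest)))
    return np.concatenate(([np.cosh(n)], np.sinh(n) * v_rest / n))

# Embed a 2D Euclidean feature vector into the hyperboloid
x = exp_map_origin(np.array([0.3, -0.4]))
```

The identity cosh² - sinh² = 1 guarantees the mapped point lies on the hyperboloid, which is why fully hyperbolic layers can operate on such coordinates without leaving the manifold.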
ARTICLE | doi:10.20944/preprints202310.2003.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Image classification; Computer vision; Transfer learning; Image database; Plant nutrition; Leaf analysis
Online: 31 October 2023 (08:13:17 CET)
Computer vision is a powerful technology that has enabled solutions in various fields by analyzing the visual attributes of images. One field that has taken advantage of computer vision is agricultural automation, which promotes high-quality crop production. The nutritional status of a crop is a crucial factor in determining its productivity. This status is mediated by approximately 14 chemical elements acquired by the plant, and their determination plays a pivotal role in farm management. To address the timely identification of nutritional disorders, this study focuses on classifying three levels of phosphorus deficiency through individual leaf analysis. The methodological steps are: (1) generating a database of laboratory-grown maize plants raised under total phosphorus deficiency, medium deficiency, and full nutrition, using different capture devices; (2) processing the images with state-of-the-art transfer learning architectures (i.e., VGG16, ResNet50, GoogLeNet, DenseNet201, and MobileNetV2); and (3) evaluating the classification performance of the models on the created database. The results show that the VGG16 model achieves superior performance, with 98% classification accuracy, while the other studied architectures also demonstrate competitive performance as state-of-the-art automatic leaf deficiency detection tools. The proposed method can be a starting point for fine-tuning machine vision-based solutions tailored to real-time monitoring of crop nutritional status.
ARTICLE | doi:10.20944/preprints202109.0285.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: remote sensing; deep learning; image classification
Online: 16 September 2021 (13:38:55 CEST)
Autonomous image recognition has numerous potential applications in planetary science and geology. For instance, the ability to classify images of rocks would give geologists immediate feedback without having to bring samples back to the laboratory, and planetary rovers could classify rocks in remote places, even on other planets, without human intervention. Shu et al. classified 9 different types of rock images using a Support Vector Machine (SVM) with autonomously extracted image features, achieving a test accuracy of 96.71%. In this research, Convolutional Neural Networks (CNNs) have been used to classify the same set of rock images. Results show that a 3-layer network obtains an average accuracy of 99.60% across 10 trials on the test set. A version of Self-taught Learning was also implemented to demonstrate the generalizability of the features extracted by the CNN. Finally, one model was deployed on a mobile device to demonstrate practicality and portability. The deployed model achieves perfect classification accuracy on the test set while taking only 0.068 seconds per prediction, equivalent to about 14 frames per second.
ARTICLE | doi:10.20944/preprints202308.0047.v1
Subject: Physical Sciences, Astronomy And Astrophysics Keywords: image classification; astronomy; asteroids; convolutional neural network; deep learning
Online: 1 August 2023 (11:08:14 CEST)
Near Earth Asteroids represent potential threats to human life because their trajectories may bring them into the proximity of the Earth. Monitoring these objects could help predict future impact events, but such efforts are hindered by the large number of objects that pass through the Earth's vicinity. There is also the problem of distinguishing asteroids from other objects in the night sky, which implies sifting through large sets of telescope image data. Within this context, we believe that employing machine learning techniques could greatly improve the detection process by sorting out the most likely asteroid candidates for review by human experts. At the moment, the use of machine learning techniques is still limited in the field of astronomy. The main goal of the present paper is to study the effectiveness of deep CNNs for the classification of astronomical objects, asteroids in this particular case, by comparing several well-known deep convolutional neural networks, including InceptionV3, Xception, InceptionResNetV2, and ResNet152V2. We applied transfer learning and fine-tuning to these pre-existing deep convolutional networks, and the results show the potential of deep convolutional neural networks for asteroid classification. The InceptionV3 model has the best results on the asteroid class, meaning that by using it we lose the fewest valid asteroids.
ARTICLE | doi:10.20944/preprints202306.1679.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: long-tailed image classification; contrastive learning; data augmentation
Online: 23 June 2023 (12:17:21 CEST)
Common long-tailed classification methods do not use the semantic features of images' original label text, and the gap in classification accuracy between majority and minority classes is large. To address this, the proposed long-tailed image classification method based on enhanced contrastive visual-language learning trains head-class and tail-class samples separately, pre-trains on text-image information, and uses an enhanced momentum contrastive loss function together with RandAugment augmentation to improve learning on tail-class samples. On the ImageNet-LT long-tailed dataset, the method improves overall accuracy, tail-class accuracy, middle-class accuracy, and F1 score by 3.4%, 7.6%, 3.5%, and 11.2%, respectively, compared to the BALLAD method, and reduces the accuracy gap between head and tail classes by 1.6%. The results of three comparative experiments indicate that the method improves tail-class performance and narrows the accuracy difference between majority and minority classes.
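The paper's exact loss is not reproduced here, but one common re-weighting used in long-tailed training, the class-balanced "effective number" weights of Cui et al., illustrates how tail classes can be up-weighted relative to head classes. The class counts and the beta value below are made up for illustration:

```python
import numpy as np

def class_balanced_weights(counts, beta=0.999):
    """Per-class loss weights 1 / E_n, where E_n = (1 - beta^n) / (1 - beta)
    is the 'effective number' of samples; normalized to mean 1."""
    counts = np.asarray(counts, dtype=float)
    effective = (1.0 - np.power(beta, counts)) / (1.0 - beta)
    w = 1.0 / effective
    return w * len(counts) / w.sum()

# Long-tailed class counts: one head class, progressively rarer tails
w = class_balanced_weights([5000, 500, 50, 5])
```

Rarer classes receive strictly larger weights, which is the basic mechanism by which such schemes shrink the head-tail accuracy gap.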
ARTICLE | doi:10.20944/preprints202007.0591.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Bacterial -Viral Pneumonia; COVID-19; X-ray Image; Deep Learning; Convolution Neural Network
Online: 24 July 2020 (14:02:07 CEST)
The paper demonstrates an analysis of Coronavirus Disease based on a CNN probabilistic model. It involves a technique for classification and prediction that recognizes the typical and diagnostically most important CT image features relating to the coronavirus. The main contribution of the research is predicting the probability of recurrence in no-recurrence (first-time detection) cases by applying our proposed convolutional neural network structure. The study is validated on 2,002 chest X-ray images: 60 confirmed positive COVID-19 cases and 650 bacterial, 412 viral, and 880 normal X-ray images. The proposed CNN is compared with traditional classifiers using the proposed CHFS feature-extraction model. The experimental study on real data demonstrates the feasibility and potential of the proposed approach for the said cause, with the proposed CNN structure achieving 98.20% accuracy on potential COVID-19 cases, comparing favorably with traditional classifiers.
ARTICLE | doi:10.20944/preprints202103.0408.v1
Subject: Engineering, Automotive Engineering Keywords: Auto encoder; IoT; Image encryption; Artificial Neural Network; Machine Learning
Online: 16 March 2021 (09:32:11 CET)
Machine learning has transformed the health care system, which transmits medical data through IoT sensors, so it is very important to encrypt these data to protect patients. Encrypting medical images is time-consuming from a performance perspective; hence the use of an autoencoder is essential. In this work, an autoencoder is used to compress each image into a vector prior to the encryption process; to recover the image, the data passes through the decryption function and the decoder. In the proposed work, various experiments are carried out over the hyperparameters to achieve the best classification outcome. The findings demonstrate that the combination of Mean Squared Logarithmic Error as the loss function, AdaGrad as the optimizer, two layers for the encoder (and the reverse for the decoder), and ReLU as the activation function generates the best autoencoder results. The combination of Mean Squared Error (loss function), RMSProp (optimizer), three layers for the encoder (and the reverse for the decoder), and ReLU (activation function) gives the best classification result. All experiments with different hyperparameters ran very close to one another, even when changing the number of layers, with running times between 9 and 16 seconds per epoch.
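The Mean Squared Logarithmic Error loss mentioned above can be sketched in a few lines of NumPy, following the common Keras-style definition on log1p-transformed values:

```python
import numpy as np

def msle(y_true, y_pred):
    """Mean Squared Logarithmic Error: mean((log(1+t) - log(1+p))^2).
    Penalizes relative rather than absolute reconstruction error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((np.log1p(y_true) - np.log1p(y_pred)) ** 2))

# Perfect reconstruction gives zero loss
zero_loss = msle([0.0, 0.5, 1.0], [0.0, 0.5, 1.0])
```

Because of the logarithm, MSLE tolerates large errors on large pixel values better than plain MSE, which is one reason it can behave differently as an autoencoder loss.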
ARTICLE | doi:10.20944/preprints201911.0218.v1
Subject: Environmental And Earth Sciences, Environmental Science Keywords: Landsat; Google Earth; water index; unsupervised image classification; supervised image classification; Kappa coefficient
Online: 19 November 2019 (03:10:17 CET)
To address three important issues in the extraction of water features from Landsat imagery, i.e., the selection of water indexes, the choice of classification algorithms, and the collection of ground-truth data for accuracy assessment, this study applied four sets (ultra-blue, blue, green, and red light based) of five water indexes (NDWI, MNDWI, MNDWI2, AWEIns, and AWEIs) combined with three image classification methods (zero-water-index threshold, Otsu, and kNN) to 24 selected lakes across the globe to extract water features from Landsat-8 OLI imagery. The 1,440 (4 × 5 × 3 × 24) image classification results were compared, by computing Kappa coefficients, with water features extracted from high-resolution Google Earth images with the same (or ±1 day) acquisition dates. Results show that the kNN method outperforms the Otsu method, which in turn outperforms the zero-water-index threshold method. If computational cost is not an issue, the kNN method combined with the ultra-blue light based AWEIns is the best method for extracting water features from Landsat imagery, as it produced the highest Kappa coefficients; if computational cost is taken into account, the Otsu method is a good choice. AWEIns and AWEIs are better than NDWI, MNDWI, and MNDWI2. AWEIns works better than AWEIs under the Otsu method, and the average rank of classification accuracy, from high to low, is the ultra-blue, blue, green, and red light based AWEIns.
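For reference, the water indexes compared above follow standard band-ratio formulations from the literature (NDWI of McFeeters, MNDWI of Xu, and the AWEI variants of Feyisa et al.). The sketch below uses those published coefficients with toy reflectance values; it is not the study's own code, and the study additionally substitutes other bands (ultra-blue, blue, red) for green:

```python
import numpy as np

def ndwi(green, nir):
    """McFeeters NDWI: (G - NIR) / (G + NIR)."""
    return (green - nir) / (green + nir)

def mndwi(green, swir1):
    """Xu MNDWI: (G - SWIR1) / (G + SWIR1)."""
    return (green - swir1) / (green + swir1)

def awei_nsh(green, nir, swir1, swir2):
    """AWEI (no shadow), Feyisa et al. coefficients."""
    return 4.0 * (green - swir1) - (0.25 * nir + 2.75 * swir2)

def awei_sh(blue, green, nir, swir1, swir2):
    """AWEI (shadow), Feyisa et al. coefficients."""
    return blue + 2.5 * green - 1.5 * (nir + swir1) - 0.25 * swir2

# Water reflects in green but absorbs NIR, so a zero threshold on the
# index separates water (index > 0) from land (index < 0).
green = np.array([0.10, 0.03])   # [water pixel, land pixel]
nir = np.array([0.02, 0.20])
water_mask = ndwi(green, nir) > 0
```

The zero-threshold classifier in the study is exactly this `index > 0` test; Otsu and kNN replace the fixed zero with a data-driven decision boundary.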
ARTICLE | doi:10.20944/preprints202306.0616.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Crop Disease Classification; Crop Disease Dataset; Image Augmentation
Online: 8 June 2023 (09:47:54 CEST)
Crop disease classification has always been a critical and persistent problem in the agricultural and forestry sciences, where we often lack enough samples to know the distribution of real-world data; making full use of the existing data is therefore our starting point. To address this problem, this paper proposes a supervised image augmentation method, Negative Contrast, which uses contrast images of existing disease samples, with the disease areas removed, as negative samples for augmentation when samples are relatively scarce. Numerous experiments show that several classical models using this augmentation improve at disease classification for four crops (rice, wheat, corn, and soybean), with a maximum accuracy improvement of 30.8%. In addition, comparative analysis of attention heat maps shows that models using Negative Contrast focus more accurately and intensely on the disease regions of interest, and thus generalize better in real-world disease classification. Our dataset and code can be found at https://www.kaggle.com/datasets/w970704112/corn-wheat-rice-soybean and https://github.com/hiter0/contrastaug
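A minimal sketch of the negative-sample idea, assuming a binary disease mask is available: the diseased region is replaced by the mean color of the healthy area. This naive fill is a deliberately simple stand-in for whatever removal procedure the authors actually use:

```python
import numpy as np

def make_negative(image, disease_mask):
    """Build a 'negative' sample by replacing the diseased region with
    the mean color of the healthy pixels (naive inpainting)."""
    neg = image.astype(float).copy()
    healthy = ~disease_mask
    fill = neg[healthy].mean(axis=0)   # mean color over healthy pixels
    neg[disease_mask] = fill
    return neg.astype(image.dtype)

# Toy leaf: dark healthy tissue with a bright "lesion" patch
img = np.zeros((8, 8, 3), dtype=np.uint8)
img[2:4, 2:4] = 255
mask = np.zeros((8, 8), dtype=bool)
mask[2:4, 2:4] = True
neg = make_negative(img, mask)   # lesion erased, leaf otherwise intact
```

Pairing such a negative with the original image forces the model to attend to the lesion itself, since everything else is identical between the two samples.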
COMMUNICATION | doi:10.20944/preprints202207.0450.v1
Subject: Environmental And Earth Sciences, Oceanography Keywords: SAR image; ship wake; deep learning; synthetic dataset
Online: 29 July 2022 (05:51:03 CEST)
The classification of vessel types in SAR imagery is of crucial importance for maritime applications. However, the ability to use real SAR imagery for deep learning classification is limited by the general scarcity of such data and the labor-intensive nature of labeling it. Simulating SAR images can overcome these limitations, allowing the generation of an effectively unlimited number of datasets. In this contribution, we present a synthetic SAR imagery dataset with ship wakes comprising 46,080 images of ten different real vessel models. The simulation parameters cover 16 ship heading directions, 6 ship velocities, 8 wind directions, 2 wind velocities, and 3 incidence angles. In addition, we extensively investigate classification performance for noise-free, noisy, and denoised ship wake scenes. We utilize the standard AlexNet architecture, trained from scratch, and conduct Bayesian optimization to determine hyperparameters for the best classification performance. Results demonstrate that the classification of vessel types based on their SAR signatures is highly effective, with maximum accuracies of 96.16%, 92.7%, and 93.59% when training on noise-free, noisy, and denoised datasets, respectively. We therefore conclude that the best strategy in practical applications is to train convolutional neural networks on denoised SAR datasets. The results show that the versatility of the SAR simulator can open up new horizons in the application of machine learning to a variety of SAR platforms.
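The dataset size quoted above is simply the full factorial of the simulation parameters, which is easy to verify:

```python
# Full factorial of the simulation parameters listed in the abstract
headings, velocities, wind_dirs = 16, 6, 8
wind_vels, incidences, vessels = 2, 3, 10
n_images = headings * velocities * wind_dirs * wind_vels * incidences * vessels
```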
REVIEW | doi:10.20944/preprints202309.1820.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Breast Cancer; Deep Learning Methods; Image Classification; GAN; Transfer Learning; Lifelong Learning
Online: 27 September 2023 (05:17:09 CEST)
Breast cancer is a common malignant tumour, and studies have shown that early and accurate detection is crucial for patients. With the maturing of medical imaging and the development of deep learning, significant progress has been made in breast cancer classification, greatly improving its accuracy and efficiency. This review focuses on deep learning, transfer learning, GANs, and lifelong learning, elaborating and summarising their important roles in breast cancer detection. It also examines the dataset and labeling issues involved in breast cancer classification. In conclusion, at the end of the article we look at future directions for breast cancer classification research, including cross-domain transfer learning, multimodal data fusion, model interpretability, and lifelong learning, and explore how to provide personalized treatment plans for patients.
ARTICLE | doi:10.20944/preprints202210.0092.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: complex network; neural network architecture; isotropic architecture; image classification
Online: 8 October 2022 (04:04:47 CEST)
Although neural network architectures are critical to performance, how the structural characteristics of a neural network affect its performance has still not been fully explored. We map neural network architectures to directed acyclic graphs and find that incoherence, a structural characteristic measuring the order of directed acyclic graphs, is a good indicator of the performance of the corresponding neural networks. We therefore propose a deep isotropic neural network architecture built by folding a chain of identical blocks and then connecting the blocks with skip connections at different distances. Our models, named FoldNet, have two distinguishing features compared with traditional residual neural networks. First, the distances between block pairs connected by skip connections increase from a constant one to specially selected different values, which leads to more incoherent graphs, lets the network explore larger receptive fields, and thus enhances its multi-scale representation ability. Second, the number of direct paths increases from one to multiple, which yields a larger proportion of shorter paths and thus improves the direct propagation of information through the entire network. Image classification results on the CIFAR-10 and Tiny ImageNet benchmarks suggest that our new architecture performs better than traditional residual neural networks.
ARTICLE | doi:10.20944/preprints202102.0083.v1
Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: SAR image classification; Spiking Neural Network(SNN); unsupervised learning
Online: 2 February 2021 (10:35:38 CET)
Recent neuroscience research shows that nerve information in the brain is encoded not only by spatial information but also by temporal (spike-timing) information. Spiking neural networks based on pulse-frequency coding play a very important role in processing brain signals, especially complicated spatio-temporal information. In this paper, an unsupervised learning algorithm for bilayer feedforward spiking neural networks based on spike-timing-dependent plasticity (STDP) competition is proposed and applied, for the first time, to SAR image classification on MSTAR. The SNN learns autonomously from the input values without any labeled signal, and the overall classification accuracy on SAR targets reached 80.8%. The experimental results show that the algorithm adopts synaptic neurons and a network structure with stronger biological plausibility, and can classify targets in SAR images. Meanwhile, the feature-map extraction ability of the neurons is visualized through the generative property of the SNN, a beneficial first attempt at applying brain-like neural networks to SAR image interpretation.
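For context, a generic pair-based STDP update (not necessarily the paper's exact competitive rule) potentiates a synapse when the presynaptic spike precedes the postsynaptic one, and depresses it otherwise, with exponentially decaying influence of the spike-time difference:

```python
import numpy as np

def stdp_update(w, dt, a_plus=0.01, a_minus=0.012, tau=20.0):
    """Pair-based STDP: dt = t_post - t_pre in ms.
    dt > 0 (pre before post) -> potentiation; dt <= 0 -> depression.
    The weight is clipped to [0, 1]."""
    if dt > 0:
        w = w + a_plus * np.exp(-dt / tau)
    else:
        w = w - a_minus * np.exp(dt / tau)
    return float(np.clip(w, 0.0, 1.0))

w_pot = stdp_update(0.5, dt=5.0)    # causal pairing  -> weight grows
w_dep = stdp_update(0.5, dt=-5.0)   # acausal pairing -> weight shrinks
```

Combined with lateral competition between output neurons, repeated updates of this form let synapses specialize on recurring input patterns without any labels, which is the unsupervised mechanism the abstract describes.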
Subject: Environmental And Earth Sciences, Oceanography Keywords: breaking waves; optical flow; convolutional neural networks; image classification
Online: 11 October 2021 (15:49:36 CEST)
The use of convolutional neural networks (CNNs) in image classification has become the standard approach to computer vision problems. Here we apply pre-trained networks to classify images of non-breaking, plunging breaking, and spilling breaking waves. The CNNs are used as basic feature extractors, and a classifier is then trained on top of these networks. The dynamic nature of breaking waves is exploited by using image sequences to gain extra information and improve the classification results. We also see improved classification performance when using pre-computed image features such as the optical flow between image pairs. The inclusion of this dynamic information improves the discrimination between breaking wave classes. We also provide corrections to the methodology of the article from which the data originates, to achieve a more accurate assessment of performance.
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: aerial scene classification; remote-sensing image classification; few-shot learning; meta-learning
Online: 15 December 2020 (13:21:49 CET)
CNN-based methods have dominated the field of aerial scene classification for the past few years. While achieving remarkable success, they suffer from excessive parameters and notoriously rely on large amounts of training data. In this work, we introduce few-shot learning to the aerial scene classification problem. Few-shot learning aims to learn, on a base set, a model that can quickly adapt to unseen categories in a novel set using only a few labeled samples. To this end, we propose a meta-learning method for few-shot classification of aerial scene images. First, we train a feature extractor on all base categories to learn a representation of the inputs. Then, in the meta-training stage, the classifier is optimized in a metric space using cosine distance with a learnable scale parameter. Finally, in the meta-testing stage, the query sample from an unseen category is predicted by the adapted classifier given a few support samples. We conduct extensive experiments on two challenging datasets, NWPU-RESISC45 and RSD46-WHU, and the experimental results show that our method yields state-of-the-art performance. Furthermore, several ablation experiments investigate the effects of dataset scale, the impact of different metrics, and the number of support shots; the results confirm that our model is especially effective in few-shot settings.
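The cosine classifier with a learnable scale described in the meta-training stage can be sketched as scaled cosine similarity between query features and per-class prototypes; the scale value and toy vectors below are illustrative, not the paper's settings:

```python
import numpy as np

def cosine_logits(features, prototypes, scale=10.0):
    """Scaled cosine similarity between query features (n, d) and class
    prototypes (c, d), as commonly used in metric-based few-shot heads."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    return scale * f @ p.T

# One prototype per class (e.g., the mean of its support embeddings)
protos = np.array([[1.0, 0.0], [0.0, 1.0]])
query = np.array([[0.9, 0.1]])
pred = int(np.argmax(cosine_logits(query, protos), axis=1)[0])
```

Normalizing both sides removes feature magnitude from the decision, and the learnable scale controls how sharp the resulting softmax over classes is.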
ARTICLE | doi:10.20944/preprints202309.1441.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: fine grain image recognition; Inception-V3; reinforcement complementary learning; complementary learning; inter-class gap
Online: 21 September 2023 (08:54:27 CEST)
The objects of fine-grained image categories (e.g., bird species) are the various subclasses of a broader category. Because the differences between subclasses are very subtle, and most of them are concentrated in multiple local areas, fine-grained image recognition is very challenging. At the same time, some fine-grained networks tend to focus on a single region when judging the target category, missing other auxiliary regional features. To this end, Inception-V3 is used as the backbone network, and an enhanced, complementary fine-grained image classification network is designed. While adopting reinforcement learning to obtain more detailed fine-grained image features, the complementary network obtains the complementary discriminative area of the target through attention erasure, increasing the network's perception of the overall target. Finally, experiments are conducted on three open datasets: CUB-200-2011, FGVC-Aircraft, and Stanford Dogs. The experimental results show that the proposed model performs better.
ARTICLE | doi:10.20944/preprints202106.0634.v1
Subject: Computer Science And Mathematics, Mathematics Keywords: Hyperspectral image; HSI; PCA; K-means clustering; unsupervised; classification; bands; satellite; ROSIS; AVIRIS
Online: 28 June 2021 (10:01:41 CEST)
The visualization of hyperspectral images on display devices with RGB colour channels is quite difficult due to the high dimensionality of these images. Thus, principal component analysis has been used as a dimensionality-reduction algorithm that limits information loss by creating uncorrelated features. To classify regions in the hyperspectral images, K-means clustering has been used to form clusters/regions. These two algorithms have been applied to three datasets imaged by the AVIRIS and ROSIS sensors.
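A minimal NumPy sketch of the PCA-then-K-means pipeline on a flattened hyperspectral cube, using synthetic data rather than the AVIRIS/ROSIS scenes used in the paper:

```python
import numpy as np

def pca(X, k):
    """Project zero-mean data onto the top-k principal components via SVD."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

def kmeans(X, k, iters=50, seed=0):
    """Plain Lloyd's algorithm with random data-point initialization."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return labels

# Synthetic cube: 10x10 pixels, 50 bands, two spectrally distinct regions
cube = np.random.default_rng(1).normal(size=(10, 10, 50))
cube[:5] += 6.0                        # shift the top half in every band
pixels = pca(cube.reshape(-1, 50), k=3)  # (H*W, bands) -> (H*W, 3)
labels = kmeans(pixels, k=2)
```

Flattening the cube to a pixels-by-bands matrix is what lets standard PCA and K-means treat each pixel's spectrum as one sample.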
ARTICLE | doi:10.20944/preprints202311.1420.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: Image Classification; Complex-valued Neural Network; FPGA Implementation; CVNN on FPGA
Online: 22 November 2023 (15:14:18 CET)
This research explores a novel approach to image classification by deploying a complex-valued neural network (CVNN) on a field-programmable gate array (FPGA), specifically for classifying 2D images transformed into polar form. The aim is to address the limitations of existing neural network models in energy and resource efficiency by exploring FPGA-based hardware acceleration in conjunction with advanced neural network architectures such as CVNNs. The methodological innovation lies in the Cartesian-to-polar transformation of 2D images, which effectively reduces the input data volume required for neural network processing. Subsequent efforts focused on constructing a CVNN model optimized for FPGA implementation, emphasizing computational efficiency and overall performance. The experimental findings provide empirical evidence for the efficacy of the image classification system developed in this study. One of the developed models, CVNN_128, achieves an accuracy of 88.3% with an inference time of just 1.6 ms and a power consumption of 4.66 mW when classifying the MNIST test set, consisting of 10,000 frames. While there is a slight concession in accuracy compared to recent FPGA implementations that achieve 94.43%, our model significantly excels in classification speed and power efficiency, surpassing existing models by more than a factor of 100. In conclusion, the paper demonstrates the substantial advantages of FPGA implementations of CVNNs for image classification tasks, particularly where speed, resources, and power consumption are critical. The study's reproducible results and corresponding code are available on GitHub at https://github.com/mahmad2005/CVNNonFPGA
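The Cartesian-to-polar preprocessing step can be sketched as nearest-neighbour resampling of the image onto an (r, theta) grid around the image centre; the grid sizes below are illustrative, not the paper's configuration:

```python
import numpy as np

def to_polar(img, n_r=16, n_theta=32):
    """Resample a grayscale image onto an (r, theta) grid by
    nearest-neighbour lookup around the image centre."""
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    rs = np.linspace(0, min(cy, cx), n_r)
    ts = np.linspace(0, 2 * np.pi, n_theta, endpoint=False)
    r, t = np.meshgrid(rs, ts, indexing="ij")
    ys = np.clip(np.round(cy + r * np.sin(t)).astype(int), 0, h - 1)
    xs = np.clip(np.round(cx + r * np.cos(t)).astype(int), 0, w - 1)
    return img[ys, xs]  # shape (n_r, n_theta)

img = np.arange(28 * 28, dtype=float).reshape(28, 28)  # MNIST-sized toy
polar = to_polar(img)
```

An (r, theta) grid coarser than the original raster is one way such a transform can reduce the input volume fed to the network, as the abstract describes.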
ARTICLE | doi:10.20944/preprints202204.0163.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: artificial intelligence; deep learning; image-to-image translation; dual-energy computed tomography; pulmonary embolism; emergency radiology
Online: 18 April 2022 (09:45:00 CEST)
Detector-based spectral CT offers the possibility of obtaining spectral information from which discrete acquisitions at different energy levels can be derived, yielding so-called virtual monoenergetic images (VMI). In this study, we aimed to develop a jointly optimized deep learning framework based on dual-energy CT pulmonary angiography (DE-CTPA) data to generate synthetic monoenergetic images (SMI) for improving automatic pulmonary embolism (PE) detection in single-energy CTPA scans. For this purpose, we used two data sets: our institutional DE-CTPA data set D1 comprising polyenergetic arterial series and the corresponding VMI at low-energy levels (40 keV) with 7,892 image pairs, and a 10% subset of the 2020 RSNA Pulmonary Embolism Detection Challenge data set D2, which consisted of 161,253 polyenergetic images with dichotomous slice-wise annotations (PE/no PE). We trained a fully convolutional encoder-decoder on D1 to generate SMI from single-energy CTPA scans of D2, which were then fed into a ResNet50 network for training of the downstream PE classification task. The quantitative results on the reconstruction ability of our framework revealed high-quality visual SMI predictions with reconstruction results of 0.984 ± 0.002 (structural similarity) and 41.706 ± 0.547 dB (peak-signal-to-noise ratio). PE classification resulted in an AUC of 0.84 for our model, which achieved improved performance compared to other naive approaches with AUCs up to 0.81. Our study stresses the role of using joint optimization strategies for deep learning algorithms to improve automatic PE detection. The proposed pipeline may prove to be beneficial for computer-aided detection systems and could help rescue CTPA studies with suboptimal opacification of the pulmonary arteries from single-energy CT scanners.
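For reference, the peak signal-to-noise ratio used (alongside structural similarity) to score the SMI reconstructions is a simple function of the mean squared error. A minimal sketch, assuming images scaled to a known data range:

```python
import numpy as np

def psnr(ref, test, data_range=1.0):
    """Peak signal-to-noise ratio in dB between a reference and a test image."""
    mse = np.mean((np.asarray(ref, float) - np.asarray(test, float)) ** 2)
    if mse == 0:
        return float("inf")          # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)

ref = np.zeros((8, 8))
noisy = ref + 0.01                   # uniform error of 0.01 -> MSE = 1e-4
print(round(psnr(ref, noisy), 1))    # -> 40.0
```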
ARTICLE | doi:10.20944/preprints202202.0058.v2
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: Document Image Classification; Corruption Robustness; Robustness to Distortions; Model Robustness
Online: 14 June 2022 (08:43:57 CEST)
Deep neural networks have been extensively researched in the field of document image classification to improve classification performance and have shown excellent results. However, there is little research in this area that addresses the question of how well these models would perform in a real-world environment, where the data the models are confronted with often exhibits various types of noise or distortion. In this work, we present two separate benchmark datasets, namely RVL-CDIP-D and Tobacco3482-D, to evaluate the robustness of existing state-of-the-art document image classifiers to different types of data distortions that are commonly encountered in the real world. The proposed benchmarks are generated by inserting 21 different types of data distortions with varying severity levels into the well-known document datasets RVL-CDIP and Tobacco3482, respectively, which are then used to quantitatively evaluate the impact of the different distortion types on the performance of latest document image classifiers. In doing so, we show that while the higher accuracy models also exhibit relatively higher robustness, they still severely underperform on some specific distortions, with their classification accuracies dropping from ~90% to as low as ~40% in some cases. We also show that some of these high accuracy models perform even worse than the baseline AlexNet model in the presence of distortions, with the relative decline in their accuracy sometimes reaching as high as 300-450% that of AlexNet. The proposed robustness benchmarks are made available to the community and may aid future research in this area.
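The 21 distortion types are not enumerated in this abstract; as one representative example, additive Gaussian noise with graded severity (in the spirit of ImageNet-C-style corruption benchmarks) can be sketched as follows. The severity-to-sigma mapping is an illustrative assumption, not the paper's.

```python
import numpy as np

def add_gaussian_noise(img, severity=1, seed=0):
    """Additive Gaussian noise at graded severity levels, one representative
    distortion of the kind inserted to build corrupted benchmark sets."""
    sigma = (0.02, 0.05, 0.08, 0.12, 0.18)[severity - 1]   # illustrative levels
    rng = np.random.default_rng(seed)
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)

doc = np.full((8, 8), 0.9)                 # mostly-white document patch
mild = add_gaussian_noise(doc, severity=1)
harsh = add_gaussian_noise(doc, severity=5)
```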
ARTICLE | doi:10.20944/preprints202309.0575.v1
Subject: Environmental And Earth Sciences, Geochemistry And Petrology Keywords: Self-supervised; Pretrained Model; Transfer learning; Metric Learning; Transformer; Mask AutoEncoder; Hyperspectral Image Classification
Online: 8 September 2023 (07:42:04 CEST)
"Finding fresh water in the ocean of data" is a challenge that all deep learning domains struggle with, especially hyperspectral image analysis. As hyperspectral remote sensing technology advances by leaps and bounds, increasing amounts of hyperspectral images (HSIs) are becoming available. In practice, however, these unlabeled HSIs cannot be used to drive a supervised learning task because of extremely expensive labeling costs and unknown regions. Although learning-based methods have achieved remarkable performance thanks to their superior ability to represent features, they are, at a cost, complex, inflexible, and difficult to adapt for transfer learning. In this paper, we propose the "Instructional Mask AutoEncoder" (IMAE), a simple and powerful self-supervised learner for HSI classification that uses a transformer-based mask autoencoder to extract the general features of HSIs through a self-reconstruction pretext task. Moreover, we use metric learning to build an instructor that directs the model's focus to the regions of the input that interest humans, alleviating defects of transformer-based models such as local attention distraction, lack of inductive bias, and the need for tremendous amounts of training data. In the downstream forward pass, instead of global average pooling, we employ a learnable aggregation to put the tokens to full use. The obtained results illustrate that our method effectively accelerates convergence and improves performance on the downstream task.
ARTICLE | doi:10.20944/preprints202009.0566.v1
Subject: Engineering, Automotive Engineering Keywords: transportation mode classification; vulnerable road users; recurrence plots; computer vision; image classification system
Online: 24 September 2020 (04:41:32 CEST)
As the Autonomous Vehicle (AV) industry rapidly advances, classification of non-motorized (vulnerable) road users (VRUs) becomes essential to ensure their safety and the smooth operation of road applications. Typical practice in non-motorized road user classification usually requires substantial training time and ignores the temporal evolution and behavior of the signal. In this research effort, we attempt to detect VRUs with high accuracy by proposing a novel framework that uses Deep Transfer Learning, which saves training time and cost, to classify images constructed from Recurrence Quantification Analysis (RQA) that reflect the temporal dynamics and behavior of the signal. Recurrence Plots (RPs) were constructed from low-power smartphone sensors without using GPS data. The resulting RPs were used as inputs for different pre-trained Convolutional Neural Network (CNN) classifiers: 227×227 images for AlexNet and SqueezeNet, and 224×224 images for VGG16 and VGG19. Results show that the classification accuracy of Convolutional Neural Network Transfer Learning (CNN-TL) reaches 98.70%, 98.62%, 98.71%, and 98.71% for AlexNet, SqueezeNet, VGG16, and VGG19, respectively. The results of the proposed framework outperform others in the literature (to the best of our knowledge) and show that CNN-TL is promising for VRU classification. Because of its relative straightforwardness, its ability to be generalized and transferred, and its potentially high accuracy, we anticipate that this framework may be able to solve various signal classification problems.
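A recurrence plot is straightforward to construct: threshold the pairwise distances between the signal's states. The sketch below operates on a raw 1D signal and omits the phase-space embedding (delay, dimension) that RQA normally applies; the threshold heuristic is an assumption.

```python
import numpy as np

def recurrence_plot(x, eps=None):
    """Binary recurrence plot of a 1D signal.

    RP[i, j] = 1 when |x_i - x_j| <= eps, i.e. the trajectory revisits a
    previous state; the resulting image encodes the temporal dynamics that
    the framework feeds to pretrained CNN classifiers.
    """
    x = np.asarray(x, dtype=float)
    d = np.abs(x[:, None] - x[None, :])   # pairwise distance matrix
    if eps is None:
        eps = 0.1 * d.max()               # common heuristic threshold
    return (d <= eps).astype(np.uint8)

sig = np.sin(np.linspace(0, 4 * np.pi, 64))   # periodic toy signal
rp = recurrence_plot(sig)                     # diagonal-line structure
```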
ARTICLE | doi:10.20944/preprints201912.0059.v2
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: hyperspectral image classification; deep learning; channel-wise attention mechanism; spatial-wise attention mechanism
Online: 12 February 2020 (05:40:08 CET)
In recent years, researchers have paid increasing attention to hyperspectral image (HSI) classification using deep learning methods. To improve accuracy and reduce the number of training samples required, we propose a double-branch dual-attention mechanism network (DBDA) for HSI classification in this paper. Two branches are designed in DBDA to capture the plentiful spectral and spatial features contained in HSI. Furthermore, a channel attention block and a spatial attention block are applied to these two branches respectively, which enables DBDA to refine and optimize the extracted feature maps. A series of experiments on four hyperspectral datasets shows that the proposed framework outperforms state-of-the-art algorithms, especially when training samples are severely limited.
ARTICLE | doi:10.20944/preprints202307.1483.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Artificial intelligence; deep learning; transfer learning; image classification; fresco
Online: 21 July 2023 (08:17:41 CEST)
The unique characteristics of frescoes on overseas Chinese buildings attest to the integration and shared history of Chinese and Western cultures. Careful analysis and preservation of overseas Chinese frescoes can support the sustainable development of culture and history. This research adopts artificial-intelligence-based image analysis and proposes a ResNet-34 model and method that integrates transfer learning. This deep learning model can identify and classify the origin of the frescoes and can effectively deal with problems such as the small number of fresco images on overseas Chinese buildings, poor image quality, difficulty of feature extraction, and similar patterns, text, and styles. The experimental results show that the training process of the proposed model is stable. On the constructed Jiangmen and Haikou fresco (JHD) datasets, the final accuracy is 98.41% and the recall is 98.53%. These evaluation metrics are superior to those of classic models such as AlexNet, GoogLeNet, and VGGNet. The model thus has strong generalization ability, is not prone to overfitting, and can effectively identify and classify the cultural connotations and regions of frescoes.
ARTICLE | doi:10.20944/preprints201712.0057.v1
Subject: Environmental And Earth Sciences, Other Keywords: dimension reduction; feature extraction; hyperspectral image; weighted feature space; low rank representation; spectral clustering
Online: 11 December 2017 (06:55:22 CET)
Containing hundreds of spectral bands (features), hyperspectral images (HSIs) are highly capable of discriminating land cover classes. Traditional HSI data processing methods assign the same importance to all bands in the original feature space (OFS), while different spectral bands play different roles in identifying samples of different classes. In order to explore the relative importance of each feature, we learn a weighting matrix and obtain the relative weighted feature space (RWFS) as an enriched feature space for HSI data analysis in this paper. To overcome the difficulty of limited labeled samples, a common situation in HSI data analysis, we extend our method to a semi-supervised framework. To transfer available knowledge to unlabeled samples, we employ graph-based clustering, where low rank representation (LRR) defines the similarity function for the graph. After constructing the RWFS, any dimension reduction method and classification algorithm can be employed in it. The experimental results on two well-known HSI data sets show that several dimension reduction algorithms perform better in the new weighted feature space.
ARTICLE | doi:10.20944/preprints201810.0073.v1
Subject: Medicine And Pharmacology, Other Keywords: Classification; F-score; Gray-Level Co-occurrence Matrix (GLCM); Gray-Level Run-Length Matrix (GLRLM); Hepatocellular Carcinoma (HCC); Liver Cancer; Liver Abscess; Image Texture; Sequential Backward Selection (SBS); Sequential Forward Selection (SFS); Support Vector Machine (SVM); Ultrasound Image.
Online: 4 October 2018 (14:01:42 CEST)
This paper discusses computer-aided diagnosis (CAD) classification between Hepatocellular Carcinoma (HCC), the most common type of liver cancer, and liver abscess, based on ultrasound image texture features and a Support Vector Machine (SVM) classifier. From 79 cases of liver disease, with 44 cases of HCC and 35 cases of liver abscess, this research extracts 96 Gray-Level Co-occurrence Matrix (GLCM) and Gray-Level Run-Length Matrix (GLRLM) features from the regions of interest (ROIs) in ultrasound images. Three feature selection models, i) Sequential Forward Selection, ii) Sequential Backward Selection, and iii) F-score, are adopted to determine the features that identify these liver diseases. Finally, the developed system can classify HCC and liver abscess by SVM with an accuracy of 88.875%. The proposed methods can provide diagnostic assistance in distinguishing the two kinds of liver disease using a CAD system.
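A GLCM counts how often pairs of gray levels co-occur at a fixed spatial offset; texture features such as contrast and energy are then moments of that matrix. A minimal sketch for a single horizontal offset (the two features shown are illustrative, not the paper's full set of 96):

```python
import numpy as np

def glcm(img, levels=4, dx=1, dy=0):
    """Gray-level co-occurrence matrix for one offset, normalized to sum 1."""
    img = np.asarray(img)
    m = np.zeros((levels, levels), dtype=float)
    h, w = img.shape
    for y in range(h - dy):
        for x in range(w - dx):
            m[img[y, x], img[y + dy, x + dx]] += 1   # count the level pair
    return m / m.sum()

def glcm_features(m):
    """Two classic Haralick-style texture features from a GLCM."""
    i, j = np.indices(m.shape)
    contrast = np.sum(m * (i - j) ** 2)
    energy = np.sum(m ** 2)          # a.k.a. angular second moment
    return contrast, energy

img = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [2, 2, 3, 3],
                [2, 2, 3, 3]])
m = glcm(img)
```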
ARTICLE | doi:10.20944/preprints202307.1049.v1
Subject: Environmental And Earth Sciences, Remote Sensing Keywords: Invariant Graph Convolutional Network (GCN); Convolutional Neural Network (CNN); Binary quantization; Hyperspectral image (HSI) classification
Online: 17 July 2023 (14:09:49 CEST)
Hyperspectral image and LiDAR image fusion plays a crucial role in remote sensing by capturing spatial relationships and modeling semantic information for accurate classification and recognition. However, existing methods, like Graph Convolutional Networks (GCNs), face challenges in constructing effective graph structures due to variations in local semantic information and limited receptiveness to large-scale contextual structures. To overcome these limitations, we propose an invariant attribute-driven binary bi-branch classification (IABC) method, a unified network that combines a binary Convolutional Neural Network (CNN) and a GCN with invariant attributes. Our approach utilizes a joint detection framework that can simultaneously learn features from small-scale regular regions and large-scale irregular regions, resulting in an enhanced structured representation of HSI and LiDAR images in the spectral-spatial domain. This approach not only improves the accuracy of classification and recognition but also reduces storage requirements and enables real-time decision-making, which is crucial for effectively processing large-scale remote sensing data. Extensive experiments demonstrate the superior performance of our proposed method in hyperspectral image analysis tasks. The combination of CNNs and GCNs allows for accurate modeling of spatial relationships and effective construction of graph structures. Furthermore, the integration of binary quantization enhances computational efficiency, enabling real-time processing of large-scale data. Therefore, our approach presents a promising opportunity for advancing remote sensing applications using deep learning techniques.
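Binary quantization in this context typically means keeping only the sign of each weight plus a real-valued scale, as in BinaryConnect/XNOR-Net; the abstract does not state the authors' exact scheme, so the per-row scaling below is an assumption.

```python
import numpy as np

def binarize_weights(W):
    """1-bit weight quantization in the BinaryConnect/XNOR-Net style:
    keep only sign(W) plus one real-valued scale per row (filter),
    alpha = mean(|W|), so that W is approximated by alpha * sign(W)."""
    alpha = np.mean(np.abs(W), axis=1, keepdims=True)   # per-row scale
    return alpha * np.sign(W), alpha

W = np.array([[0.5, -1.5],
              [2.0, -2.0]])
Wq, alpha = binarize_weights(W)   # 1 bit per weight + one scale per row
```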
ARTICLE | doi:10.20944/preprints202212.0082.v1
Subject: Computer Science And Mathematics, Computational Mathematics Keywords: K-means clustering algorithm; Elbow method; Silhouette technique; Kneedle Algorithm; Image Segmentation; Convolutional Neural Network.
Online: 6 December 2022 (01:30:03 CET)
The agricultural sector plays a significant role in Palestine's economy. However, the production of this sector is affected by various plant diseases, specifically leaf diseases. Automatic leaf disease detection is essential for early diagnosis and for maintaining the overall health of fields. Image segmentation techniques, clustering, and deep learning are often used to detect diseased leaves. This study proposes a novel hybrid approach based on image classification. The hybrid approach combines the k-means clustering algorithm with a Convolutional Neural Network (CNN), where k-means detects the leaf's infected area and the CNN then identifies the disease. We used the PlantVillage dataset for experimental verification, as it contains several crops with different kinds of challenging diseases. We also examined the selection of the optimal k value using the Silhouette coefficient, the Elbow method, and the Kneedle algorithm. The Silhouette technique was analyzed using three distance metrics: Euclidean, Manhattan, and Cosine. Its scores for the three metrics were low, near zero, and failed to produce the optimal k value. Moreover, the Elbow method was complicated to use for image segmentation in terms of executing and visualizing the k value in its graph plot. Based on the verification results, the Kneedle algorithm chose the optimal k value most consistently and showed superiority over the other approaches. Therefore, the processed images were segmented with the k-means clustering algorithm using a Kneedle-based k value. Finally, a CNN was trained to classify the type of disease by analyzing and testing leaf images. The hybrid model achieved a high accuracy of 93.79% in disease identification, confirming the proposed model's robustness.
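The Kneedle idea can be sketched compactly: normalize the inertia-vs-k curve, flip it so it increases, and take the k with the largest vertical distance from the diagonal. This is a simplified version (no smoothing or sensitivity parameter), with a toy elbow curve as input:

```python
import numpy as np

def knee_point(x, y):
    """Simplified Kneedle: find the knee of a decreasing curve, e.g.
    k-means inertia vs. k.  Normalize both axes to [0, 1], flip y so the
    curve increases, and return the x where the flipped curve is farthest
    above the diagonal."""
    x = np.asarray(x, float)
    y = np.asarray(y, float)
    xn = (x - x.min()) / (x.max() - x.min())
    yn = (y - y.min()) / (y.max() - y.min())
    diff = (1.0 - yn) - xn            # flipped curve minus the diagonal
    return x[int(np.argmax(diff))]

ks = np.arange(1, 9)
inertia = np.array([100, 40, 15, 8, 6, 5, 4.5, 4])   # toy elbow curve
print(knee_point(ks, inertia))   # -> 3
```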
ARTICLE | doi:10.20944/preprints202309.1219.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Radial Fourier signatures; SVM; Machine Learning; skin lesions; texture descriptors; image processing
Online: 19 September 2023 (11:54:59 CEST)
Eight skin lesions were analyzed using Artificial Intelligence algorithms: basal cell carcinoma (BCC), squamous cell carcinoma (SCC), melanoma (MEL), actinic keratosis (AK), benign keratosis (BKL), dermatofibromas (DF), melanocytic nevi (NV), and vascular lesions (VASC). This manuscript presents the possibility of using concatenated signatures (instead of images) obtained from different integral transforms, such as Fourier, Mellin, and Hilbert, to classify skin lesions. Eleven Artificial Intelligence models were applied so that the eight skin lesions could be classified by analyzing each lesion's particular signature. The database was randomly divided 80%/20% into training and test image sets, respectively. The metrics reported are accuracy, sensitivity, specificity, and precision. Each case was repeated 30 times to avoid bias, in keeping with the central limit theorem, and the average and ± standard deviation are reported. Although all the results were very satisfactory, the best average result for the eight lesions was obtained with the Subspace KNN model, whose test metrics were 99.98% accuracy, 99.96% sensitivity, 99.99% specificity, and 99.95% precision.
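One common way to turn a 2D Fourier transform into a compact 1D signature is to bin the centered magnitude spectrum over concentric rings. Whether this matches the authors' exact concatenated signatures is an assumption, but it conveys the idea of replacing an image with a short transform-domain vector:

```python
import numpy as np

def radial_fourier_signature(img, n_bins=8):
    """1D signature from the centered |FFT| of an image, binned by radius.
    Summing the magnitude spectrum over concentric rings yields a compact
    descriptor that a classifier (e.g. SVM or KNN) can use instead of the
    full image."""
    spec = np.abs(np.fft.fftshift(np.fft.fft2(img)))
    h, w = img.shape
    y, x = np.indices((h, w))
    r = np.hypot(y - h / 2.0, x - w / 2.0)          # radius of each bin cell
    edges = np.linspace(0.0, r.max() + 1e-9, n_bins + 1)
    return np.array([spec[(r >= lo) & (r < hi)].sum()
                     for lo, hi in zip(edges[:-1], edges[1:])])

img = np.zeros((32, 32))
img[12:20, 12:20] = 1.0          # toy "lesion" patch
sig = radial_fourier_signature(img)
```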
ARTICLE | doi:10.20944/preprints202201.0352.v1
Subject: Environmental And Earth Sciences, Remote Sensing Keywords: Per-pixel classification confidence; spatial pattern; image classification; accuracy assessment; interpolation method
Online: 24 January 2022 (11:53:46 CET)
Obtaining classification confidence at the pixel level is a challenging task for accuracy assessment in remote sensing image classification. Among the various methods for estimating per-pixel classification confidence, interpolation-based methods have drawn special attention in the literature. Even though they are widely recognized, their usefulness has not been rigorously evaluated. This paper conducts a comprehensive evaluation of three interpolation-based methods: the local error matrix method, the bootstrap method, and the geostatistical method. We applied each of the three methods to three representative datasets with different spatial resolutions, spectral bands, and numbers of classes. We then derived the estimated and true classification confidence and compared the results using both exploratory data analysis (bi-histogram) and statistical analysis (Willmott's d and binned classification quality). The results indicate that the three interpolation methods provide interesting insights into various aspects of estimating per-pixel classification confidence. Unfortunately, interpolation assumes that classification confidence varies smoothly across space, which is usually not true in practice. In other words, interpolation-based methods have limited practical use.
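As a generic stand-in for the interpolators evaluated (not one of the paper's three methods), inverse-distance weighting shows the basic mechanics: each pixel's confidence is a distance-weighted average of the confidence observed at nearby reference samples.

```python
import numpy as np

def idw_confidence(sample_xy, sample_conf, pixel_xy, power=2.0):
    """Inverse-distance-weighted interpolation of classification confidence
    from reference sample locations to arbitrary pixel locations."""
    # pairwise distances: (n_pixels, n_samples)
    d = np.linalg.norm(pixel_xy[:, None, :] - sample_xy[None, :, :], axis=2)
    w = 1.0 / np.maximum(d, 1e-12) ** power     # guard the zero-distance case
    return (w * sample_conf).sum(axis=1) / w.sum(axis=1)

samples = np.array([[0.0, 0.0], [10.0, 10.0]])    # reference sample locations
conf = np.array([1.0, 0.0])                        # confidence at the samples
pixels = np.array([[0.0, 0.0], [5.0, 5.0], [10.0, 10.0]])
est = idw_confidence(samples, conf, pixels)        # ~[1.0, 0.5, 0.0]
```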
ARTICLE | doi:10.20944/preprints202201.0367.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Artificial Intelligence; Deep Learning; Image Classification; Machine Learning; Predictive Models; Small Datasets; Supervised Learning
Online: 25 January 2022 (08:24:17 CET)
One of the most important challenges in the Machine and Deep Learning areas today is to build good models using small datasets, because sometimes it is not possible to have large ones. Several techniques have been proposed in the literature to address this challenge. This paper aims at studying the different available Deep Learning techniques and performing a thorough experimentation to analyze which technique or combination thereof improves the performance and effectiveness of the models. A complete comparison with classical Machine Learning techniques was carried out, to contrast the results obtained using both techniques when working with small datasets. Thirteen algorithms were implemented and trained using three different small datasets (MNIST, Fashion MNIST, and CIFAR-10). Each experiment was evaluated using a well-established set of metrics (Accuracy, Precision, Recall, F1, and the Matthews correlation coefficient). The experimentation allowed concluding that it is possible to find a technique or combination of them to mitigate a lack of data, but this depends on the nature of the dataset, the amount of data, and the metrics used to evaluate them.
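Of the metrics listed, the Matthews correlation coefficient is the least standard; for the binary case it is computed directly from the confusion matrix, as the sketch below shows.

```python
import numpy as np

def matthews_corrcoef_binary(y_true, y_pred):
    """Matthews correlation coefficient from the binary confusion matrix:
        MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    denom = np.sqrt(float((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    return (tp * tn - fp * fn) / denom if denom else 0.0

y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]     # TP=3, TN=3, FP=1, FN=1
mcc = matthews_corrcoef_binary(y_true, y_pred)   # -> 0.5
```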
ARTICLE | doi:10.20944/preprints202302.0408.v1
Subject: Physical Sciences, Optics And Photonics Keywords: imaging; deblurring; deep learning; image classification; Lucy-Richardson algorithm; holography; aberrations; diffraction; incoherent optics; smart phone
Online: 23 February 2023 (09:49:26 CET)
Pattern recognition techniques form the heart of most, if not all, incoherent linear shift-invariant systems. When an object is recorded using a camera, the object information is sampled by the point spread function (PSF) of the system, which replaces every object point with the PSF on the sensor. The PSF is a sharp, Kronecker-delta-like function when the numerical aperture (NA) is large and there are no aberrations. When the NA is small and the system has aberrations, the PSF appears blurred. In that case, if the PSF is known, then the object information can be obtained by scanning the PSF over the recorded object intensity pattern and looking for pattern-matching conditions through a mathematical process called correlation. In this study, a recently developed deconvolution method, the Lucy-Richardson-Rosen algorithm (LR2A), has been implemented to computationally refocus images recorded in the presence of spatio-spectral aberrations. The performance of LR2A was compared against the Lucy-Richardson algorithm and non-linear reconstruction. LR2A exhibits superior deconvolution capability even in extreme cases of spatio-spectral aberration and blur. Experimental results of deblurring a picture captured using high-resolution smartphone cameras are presented. LR2A was also implemented to significantly improve the performance of widely used deep convolutional neural networks for image classification.
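For context, the classic Richardson-Lucy iteration that LR2A builds on multiplies the current estimate by the back-projected ratio of the data to the re-blurred estimate. A 1D sketch with circular convolution (LR2A itself adds further steps not shown here; the signal and PSF are toy examples):

```python
import numpy as np

def lucy_richardson(blurred, psf, n_iter=100):
    """Standard Richardson-Lucy deconvolution in 1D, circular convolution:
        est <- est * (psf_adj (*) (blurred / (psf (*) est)))."""
    conv = lambda a, b: np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))
    psf_adj = np.roll(psf[::-1], 1)   # adjoint (flipped) kernel, circular
    est = np.full_like(blurred, blurred.mean())   # flat positive start
    for _ in range(n_iter):
        ratio = blurred / np.maximum(conv(est, psf), 1e-12)
        est = est * conv(ratio, psf_adj)
    return est

n = 32
truth = np.zeros(n)
truth[10], truth[20] = 1.0, 0.5                  # two point sources
psf = np.zeros(n)
psf[:3] = 1.0 / 3.0                              # width-3 box blur
blurred = np.maximum(                            # clip FFT round-off
    np.real(np.fft.ifft(np.fft.fft(truth) * np.fft.fft(psf))), 0.0)
restored = lucy_richardson(blurred, psf, n_iter=200)
```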
ARTICLE | doi:10.20944/preprints202002.0334.v1
Subject: Environmental And Earth Sciences, Remote Sensing Keywords: deep learning; drone imagery; hyperspectral image classification; tree species classification; 3D convolutional neural networks
Online: 24 February 2020 (01:13:13 CET)
Interest in drone solutions for forestry applications is growing. Using drones, datasets can be captured flexibly and at high spatial and temporal resolution whenever needed. Fundamental tasks in forestry applications include the detection of individual trees, tree species classification, biomass estimation, etc. Deep Neural Networks (DNNs) have shown superior results compared with conventional machine learning methods such as the Multi-Layer Perceptron (MLP) when input data are large. The objective of this research was to investigate 3D convolutional neural networks (3D-CNNs) for classifying three major tree species in a boreal forest: pine, spruce, and birch. The proposed 3D-CNN models were employed to classify tree species at a test site in Finland. The classifiers were trained with a dataset of 3039 manually labelled trees, and accuracies were then assessed on an independent dataset of 803 records. To find the most efficient feature combination, we compared the performance of 3D-CNN models trained on hyperspectral (HS) channels, RGB channels, and the canopy height model (CHM), separately and combined. The proposed 3D-CNN model with RGB and HS layers produced the highest classification accuracy. The producer accuracies of the best 3D-CNN classifier on the test dataset were 99.6%, 94.8%, and 97.4% for pines, spruces, and birches, respectively. The best 3D-CNN classifier produced ~5% higher classification accuracy than the MLP with all layers. Our results suggest that the proposed method provides excellent classification results with acceptable performance metrics for HS datasets. The pine class was detectable in most layers; spruce was most detectable in the RGB data, while birch was most detectable in the HS layers. Furthermore, the RGB datasets provide acceptable results for many lower-accuracy applications.
ARTICLE | doi:10.20944/preprints202209.0169.v1
Subject: Environmental And Earth Sciences, Remote Sensing Keywords: Synthetic Aperture Radar (SAR); Optical image (Sentinel 2); Random Forest (RF); CART; GEE
Online: 13 September 2022 (10:06:14 CEST)
Observing cultivated crops and other forms of land use is an important environmental and economic concern for agricultural land management and crop classification. Crop categorization provides significant crop management data, ensuring food security and informing agricultural policy. Remote sensing data, especially the publicly available Sentinel-1 and Sentinel-2 data, have been used effectively for crop mapping and classification in cloudy places because of their high spatial and temporal resolution. This study aimed to improve crop type classification by combining Sentinel-1 Synthetic Aperture Radar (SAR) data and Sentinel-2 Multispectral Instrument (MSI) data. In the study, Random Forest (RF) and Classification and Regression Trees (CART) classifiers were used to classify grain crops (barley and wheat). The classification results based on the combination of Sentinel-2 and Sentinel-1 data indicated an overall accuracy (OA) of 93% and a kappa coefficient (K) of 0.896 for RF, and 89.15% and 0.84 for the CART classifier. It is suggested to employ a mix of radar and optical data to attain the highest classification accuracy, since doing so improves the likelihood that details will be observed compared with a single-sensor classification technique and yields more accurate results.
ARTICLE | doi:10.20944/preprints202309.1174.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Design Right Infringement; Deep Learning; Ensemble Learning; Image Classification; Object Detection; Large-Scale Detection System
Online: 19 September 2023 (03:03:46 CEST)
This paper presents a two-stage hierarchical neural network using image classification and object detection algorithms as key building blocks for a system that automatically detects potential design right infringement. This neural network is trained to return the Top-N original design right records that most resemble the input image of a counterfeit. Design rights specify the unique aesthetic characteristics of a product. Due to rapidly changing trends, new design rights are continuously generated. This work proposes an Ensemble Neural Network (ENN), an artificial neural network model that aims to deal with a large amount of counterfeit data and design right records that are frequently added and deleted. First, we performed image classification and object detection learning per design right using existing models with a proven track record of high accuracy. The distributed models form the backbone of the ENN and yield intermediate results aggregated at a master neural network. This master neural network is a deep residual network paired with a fully connected network. This ensemble layer is trained to determine the sub-models that return the best result for a given input image of a product. In the final stage, the ENN model multiplies the inferred similarity coefficients with the weighted input vectors produced by the individual sub-models to assess the similarity between the test input image and the existing product design rights and detect any sign of violation. Given 84 design rights and sample product images taken meticulously under various conditions, our ENN model achieved average Top-1 and Top-3 accuracies of 98.409% and 99.460%, respectively. Upon introducing new design rights data, a partial update of the inference model was done an order of magnitude faster than for a single model. ENN maintained a high level of accuracy as it scaled out to handle more design rights.
Therefore, the ENN model is expected to offer practical help to the inspectors in the field, such as the customs at the border that deal with a swarm of products.
ARTICLE | doi:10.20944/preprints202311.1888.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: deep learning; convolutional neural networks; image analysis; machine learning; ResNetV50; skin lesions; VGG16; cancer; computer vision
Online: 29 November 2023 (11:33:49 CET)
Skin cancer is one of the widespread diseases that typically develop on the skin due to continuous exposure to sunlight. Although cancer can appear on any part of the human body, skin cancer accounts for over half of all reported cancer occurrences worldwide. There are substantial obstacles to the precise diagnosis and classification of skin lesions because of the morphological variety and indistinguishable characteristics across skin malignancies. Recently, Deep Learning models have been used in the field of image-based lesion diagnosis and have demonstrated diagnostic efficiency on par with that of dermatologists. To increase classification efficiency and accuracy for skin lesions, a cutting-edge multi-layer deep Convolutional Neural Network (CNN) termed SkinLesNet has been built in this study. The ResNetV50 and VGG16 models have been carefully compared to review the performance of the proposed model. The dataset used in this study, PAD-UFES-20, contains 1314 samples in total and includes three common forms of skin lesions. The proposed approach, SkinLesNet, significantly outperforms the well-known comparison models under the given conditions.
ARTICLE | doi:10.20944/preprints202108.0325.v1
Subject: Environmental And Earth Sciences, Environmental Science Keywords: Multi-granularity encoding neural networks (MGNNE); feature extraction; multilayer perceptron (MLP); Principal component analysis (PCA); Remote Sensing image classification; LCLU.
Online: 16 August 2021 (11:28:21 CEST)
Deep learning is the state-of-the-art machine learning approach for classification. Earlier work shows that deep convolutional neural networks have succeeded brilliantly in different applications involving image or video data, including recognizing and characterizing the remotely sensed surface of the earth to exploit land cover and land use (LCLU). First, this article summarizes emerging remote sensing applications and challenges for deep learning methods. Second, we propose four approaches to learning efficient and effective CNNs that transfer image representations learned on the ImageNet dataset to the recognition of LCLU datasets. We use VGG16, Inception-ResNet-V2, Inception-V3, and DenseNet201 models, pre-trained on ImageNet, to extract features from the EACC dataset. For feature selection we apply principal component analysis (PCA) to improve accuracy and speed up the model. We train a multi-layer perceptron (MLP) as the classifier. Lastly, we apply the multi-granularity encoding ensemble model. We achieve an overall accuracy of 92.3% on the nine-class classification problem. This work will help remote sensing scientists understand deep learning tools and apply them to large-scale remote sensing challenges.
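The PCA step amounts to centering the extracted feature matrix and projecting it onto its top singular vectors. A minimal numpy sketch (the component count and toy feature matrix are illustrative, not the paper's settings):

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project feature vectors onto their top principal components via SVD,
    the dimensionality-reduction step applied to CNN features before an MLP."""
    Xc = X - X.mean(axis=0)                           # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)  # rows of Vt = components
    return Xc @ Vt[:n_components].T

rng = np.random.default_rng(0)
# 100 samples of 3 features with very different variances
X = rng.normal(size=(100, 3)) @ np.diag([10.0, 1.0, 0.1])
Z = pca_reduce(X, 2)    # keep the 2 highest-variance directions
```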
ARTICLE | doi:10.20944/preprints202310.0166.v1
Subject: Environmental And Earth Sciences, Remote Sensing Keywords: PolSAR image classification; Wishart-based complex matrix network; multi-feature network; Double-channel convolution network; Edge-preserving MRF
Online: 3 October 2023 (15:23:55 CEST)
Deep learning methods have been widely used in PolSAR image classification. To learn the polarimetric information, many deep learning methods expect to learn high-level semantic features from original PolSAR data. However, original data alone cannot capture multiple scattering features and complex structures for extremely heterogeneous terrain objects. In addition, deep learning methods often cause edge confusion due to the high-level features. To overcome these shortcomings, we propose a double-channel CNN combined with an edge-preserving MRF model (DCCNN-MRF) for PolSAR image classification. Firstly, to combine complex matrix data and multiple scattering features, a double-channel convolution network (DCCNN) is developed, which consists of a Wishart-based complex matrix subnetwork and a multi-feature subnetwork. The Wishart-based complex matrix network can learn the statistical characteristics and channel correlation well, and the multi-feature network can learn high-level semantic features well. Then, a unified network framework is designed to fuse the two kinds of features, enhancing advantageous features and reducing redundant ones. Finally, an edge-preserving MRF model is designed to combine with the DCCNN network. In the MRF model, a sketch map-based edge energy function is designed by defining an adaptive weighted neighborhood for edge pixels. Experiments are conducted on four real PolSAR data sets with different sensors and bands. Experimental results demonstrate the effectiveness of the proposed DCCNN-MRF method.
ARTICLE | doi:10.20944/preprints201808.0112.v2
Subject: Computer Science And Mathematics, Computational Mathematics Keywords: remote sensing; image classification; fully connected conditional random fields (FC-CRF); convolutional neural networks (CNN)
Online: 28 November 2018 (07:11:42 CET)
The interpretation of land use and land cover (LULC) is an important issue in the fields of high-resolution remote sensing (RS) image processing and land resource management. Fully training a new or existing convolutional neural network (CNN) architecture for LULC classification requires a large amount of remote sensing images. Thus, fine-tuning a pre-trained CNN for LULC detection is required. To improve classification accuracy for high-resolution remote sensing images, it is necessary to use an additional feature descriptor and to adopt a classifier for post-processing. A fully connected conditional random field (FC-CRF), which uses the fine-tuned CNN layers, spectral features, and fully connected pairwise potentials, is proposed for the classification of high-resolution remote sensing images. First, an existing CNN model is adopted, and its parameters are fine-tuned on training datasets. The probabilities of image pixels belonging to each class type are then calculated. Second, we consider the spectral features and the digital surface model (DSM) and, combined with a support vector machine (SVM) classifier, determine the probabilities of belonging to each LULC class type. Combined with the probabilities produced by the fine-tuned CNN, new feature descriptors are built. Finally, FC-CRF is introduced to produce the classification results, where the unary potentials are derived from the new feature descriptors and the SVM classifier, and the pairwise potentials from the three-band RS imagery and the DSM. Experimental results show that the proposed classification scheme achieves good performance, with a total accuracy of about 85%.
ARTICLE | doi:10.20944/preprints201807.0516.v1
Subject: Environmental And Earth Sciences, Remote Sensing Keywords: land-cover classification; very high spatial resolution remote sensing image; adaptive majority vote; post-classification.
Online: 26 July 2018 (15:05:16 CEST)
Land-cover classification that uses very-high-resolution (VHR) remote sensing images is a topic of considerable interest. Although many classification methods have been developed, there is still room for improvement in the accuracy and usability of classification systems. In this paper, a novel post-processing approach based on a dual-adaptive majority voting strategy (D-AMVS) is proposed for improving the performance of initial classification maps. D-AMVS defines a strategy for refining each label of a classified map that is obtained by different classification methods from the same original image, and for fusing the different refined classification maps to generate a final classification result. The proposed D-AMVS contains three main blocks. 1) An adaptive region is generated by gradually extending the region around a central pixel based on two predefined parameters (T1 and T2), in order to utilize the spatial features of ground targets in a VHR image. 2) For each classified map, the label of the central pixel is refined according to the majority voting rule within the adaptive region. This is defined as adaptive majority voting (AMV). Each initial classified map is refined in this manner pixel by pixel. 3) Finally, the refined classified maps are used to generate a final classification map, and the label of the central pixel in the final classification map is determined by applying AMV again. Each entire classified map is scanned and refined pixel by pixel based on the proposed D-AMVS. The accuracy of the proposed D-AMVS approach is investigated using two remote sensing images with high spatial resolutions of 1.0 m and 1.3 m, respectively. Compared with the classical majority voting method and a relatively new post-processing method called the general post-classification framework, the proposed D-AMVS can achieve a land-cover classification map with less noise and higher classification accuracy.
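The majority-voting refinement in step 2) can be sketched as follows; note that this toy version uses a fixed square window, whereas the paper's adaptive region grows around the central pixel according to the parameters T1 and T2:

```python
import numpy as np

def majority_vote_refine(labels, radius=1):
    """Refine each pixel's label by majority vote inside a square
    window (a fixed-size stand-in for the paper's adaptive region)."""
    h, w = labels.shape
    refined = labels.copy()
    for i in range(h):
        for j in range(w):
            win = labels[max(0, i - radius):i + radius + 1,
                         max(0, j - radius):j + radius + 1]
            refined[i, j] = np.bincount(win.ravel()).argmax()
    return refined

noisy = np.zeros((5, 5), dtype=int)
noisy[2, 2] = 1                            # one isolated "salt" label
print(majority_vote_refine(noisy).sum())   # 0: the lone label is voted away
```

Applying the same vote a second time to the stack of refined maps gives the flavor of the dual (D-AMVS) stage.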
ARTICLE | doi:10.20944/preprints202307.1043.v2
Subject: Environmental And Earth Sciences, Remote Sensing Keywords: Image classification; Land use/land cover mapping; Accuracy assessment; Landsat-8; Sentinel-2
Online: 14 August 2023 (09:01:24 CEST)
Satellite-based data classification performance remains a challenge for the research community in the field of land use/land cover mapping. Here we investigated supervised per-pixel classification performance under different scenarios, based on single and seasonal multispectral data combinations from different sensors (Landsat-8 OLI and Sentinel-2 MSI). In the case of Landsat, seasonal spectral indices (EVI and NDMI) were included. A typical Mediterranean watershed with a complex landscape comprising various forest and wetland ecosystems, crops, artificial surfaces, and lake water was selected to test our approach. All available geospatial data from national databases (Forest Map, LPIS, Natura2000 habitats, cadastral parcels, etc.) were used as ancillary data for classification training and validation. We examined and compared the performance of ML, RF, KNN, and SVM classifiers under different scenarios for land use/land cover mapping, according to the Copernicus Land Cover (CLC2018) nomenclature. In total, eight land use/land cover classes were identified in Landsat-8 OLI and nine in Sentinel-2a MSI, for an acceptable overall accuracy over 85%. A comparison of the overall classification accuracies shows that Sentinel-2a accuracy was slightly higher than Landsat-8 (96.68% vs. 93.02%). Respectively, the best-performing algorithm was ML for Sentinel-2, while for Landsat-8 it was KNN. However, the machine-learning algorithms produced similar results regardless of the type of sensor. We conclude that the best classification performance was achieved using seasonal multispectral data. Future research should be oriented towards integrating time-series multispectral data of different sensors and geospatial ancillary data for land use/land cover mapping.
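The overall accuracies quoted above (e.g. 96.68% vs. 93.02%) are computed from confusion matrices; as a reminder, overall accuracy is simply the trace of the confusion matrix over the total count. A minimal sketch with hypothetical numbers, not the paper's:

```python
import numpy as np

def overall_accuracy(confusion):
    """Overall accuracy: correctly classified samples (the diagonal)
    divided by all classified samples."""
    return np.trace(confusion) / confusion.sum()

cm = np.array([[45, 5],
               [5, 45]])     # hypothetical two-class confusion matrix
print(overall_accuracy(cm))  # 0.9
```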
ARTICLE | doi:10.20944/preprints201703.0134.v1
Subject: Environmental And Earth Sciences, Remote Sensing Keywords: spatial-spectral feature; very high spatial resolution image; classification; Tobler’s First Law of Geography
Online: 17 March 2017 (05:06:12 CET)
Aerial image classification has become popular and has attracted extensive research efforts in recent decades. The main challenge lies in its very high spatial resolution but relatively insufficient spectral information. To this end, spatial-spectral feature extraction is a popular strategy for classification. However, parameter determination for such feature extraction is usually time-consuming and depends excessively on experience. In this paper, an automatic spatial feature extraction approach based on image raster and segmental vector data cross-analysis is proposed for the classification of very high spatial resolution (VHSR) aerial imagery. First, multi-resolution segmentation is used to generate strongly homogeneous image objects and extract the corresponding vectors. Then, to automatically explore the region of a ground target, two rules, derived from Tobler's First Law of Geography (TFL) and the topological relationships of the vector data, are integrated to constrain the extension of a region around a central object. Third, the shape and size of the extended region are described. A final classification map is achieved through a supervised classifier using shape, size, and spectral features. Experiments on three real VHSR aerial images (0.1 to 0.32 m) are conducted to evaluate the effectiveness and robustness of the proposed approach. Comparisons to state-of-the-art methods demonstrate the superiority of the proposed method in VHSR image classification.
ARTICLE | doi:10.20944/preprints202301.0162.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Content-based image classification; Data curation and preparation; Convolutional neural networks (CNN); Deep learning; Artificial intelligence (AI)
Online: 9 January 2023 (10:59:31 CET)
Background: MR image classification in datasets collected from multiple sources is complicated by inconsistent and missing DICOM metadata. Therefore, we aimed to establish a method for the efficient automatic classification of MR brain sequences. Methods: Deep convolutional neural networks (DCNN) were trained as one-vs-all classifiers to differentiate between six classes: T1 weighted (w), contrast-enhanced T1w, T2w, T2w-FLAIR, ADC, and SWI. Each classifier yields a probability, allowing threshold-based and relative probability assignment while excluding images with low probability (label: unknown, an open-set recognition problem). Data from three high-grade glioma (HGG) cohorts were assessed; C1 (320 patients, 20101 MRI images) was used for training, while C2 (197, 11333) and C3 (256, 3522) were used for testing. Two raters manually checked images through an interactive labeling tool. Finally, MR-Class' added value was evaluated via radiomics models' performance for progression-free survival (PFS) prediction in C2, utilizing the concordance index (C-I). Results: Approximately 10% annotation errors were observed in each cohort between the DICOM series descriptions and the derived labels. MR-Class accuracy was 96.7% [95% CI: 95.8, 97.3] for C2 and 94.4% [93.6, 96.1] for C3. 620 images were misclassified; manual assessment of those frequently showed motion artifacts or alterations of anatomy by large tumors. Implementation of MR-Class increased the PFS model C-I by 14.6% on average compared to a model trained without MR-Class. Conclusions: We provide a DCNN-based method for sequence classification of brain MR images and demonstrate its usability in two independent HGG datasets.
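The threshold-based, open-set assignment described in the Methods can be sketched as follows; the sequence names follow the six classes above, while the probabilities and the 0.5 threshold are illustrative values, not those of MR-Class:

```python
import numpy as np

def assign_label(probs, classes, threshold=0.5):
    """Open-set assignment from one-vs-all probabilities: return the
    most probable class, or "unknown" when no classifier is confident
    (the threshold here is illustrative)."""
    best = int(np.argmax(probs))
    return classes[best] if probs[best] >= threshold else "unknown"

classes = ["T1w", "cT1w", "T2w", "T2w-FLAIR", "ADC", "SWI"]
print(assign_label([0.05, 0.90, 0.10, 0.20, 0.00, 0.10], classes))  # cT1w
print(assign_label([0.20, 0.30, 0.10, 0.20, 0.10, 0.20], classes))  # unknown
```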
ARTICLE | doi:10.20944/preprints201806.0188.v1
Subject: Environmental And Earth Sciences, Remote Sensing Keywords: minimum noise fraction (MNF) transformation; object-based image analysis (OBIA); APEX hyperspectral imagery; Random forest (RF) classifier; multiresolution segmentation (MRS); tree species classification
Online: 12 June 2018 (10:55:07 CEST)
Tree species composition is an important key element for biodiversity and sustainable forest management, and hyperspectral data provide detailed spectral information that can be used for tree species classification. There are two main challenges in using hyperspectral imagery: a) the Hughes phenomenon, meaning that as the number of bands in hyperspectral imagery increases, the number of required classification samples increases exponentially, and b) in a more complex environment, such as a riparian mixed forest, focusing on spectral variability per pixel may not be adequate for distinguishing tree species. Therefore, the focus of this study is to assess spectral-spatial dimensionality reduction of airborne hyperspectral imagery by using minimum noise fraction (MNF) transformation and object-based image analysis (OBIA). Airborne prism experiment (APEX) hyperspectral imagery was used. The study area was a riparian mixed forest located along the Salzach river, and six tree species, including Picea abies, Populus (canadensis and balsamifera), Fraxinus excelsior, Alnus incana, and Salix alba, were selected. The machine learning algorithm random forest (RF) was used to train and apply a prediction model for classification. Using the spectrally reduced APEX imagery, a pixel-level classification was also performed. According to the confusion matrix, the object-level classification of MNF-derived components achieved an overall accuracy of 85% and a kappa coefficient of 0.805. The performance of the classes according to producer's accuracy varied from 80% for Fraxinus excelsior, Alnus incana, and Populus canadensis to 90% for Salix alba and Picea abies. Comparing these results to the pixel-level classification showed a better performance of object-level classification (an overall accuracy of 63% and a kappa coefficient of 0.559 were achieved for pixel-level classification). The performance of the classes using pixel-based classification varied from 45% for Alnus incana to 80% for Picea abies. In general, spectral-spatial complexity reduction using MNF transformation and object-level classification yielded statistically satisfactory results.
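For reference, the kappa coefficients reported above measure agreement beyond chance; a minimal sketch of Cohen's kappa from a confusion matrix, with hypothetical counts:

```python
import numpy as np

def cohens_kappa(confusion):
    """Cohen's kappa: observed agreement corrected for the agreement
    expected by chance from the row/column marginals."""
    n = confusion.sum()
    p_observed = np.trace(confusion) / n
    p_expected = (confusion.sum(axis=0) @ confusion.sum(axis=1)) / n**2
    return (p_observed - p_expected) / (1 - p_expected)

cm = np.array([[40, 10],
               [10, 40]])    # hypothetical two-class confusion matrix
print(cohens_kappa(cm))      # accuracy 0.8 against chance level 0.5 -> ~0.6
```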
ARTICLE | doi:10.20944/preprints202102.0189.v1
Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: image quality assessment; image databases; superpixels; color image; color space; image quality measures
Online: 8 February 2021 (11:11:47 CET)
Objective Image Quality Assessment (IQA) measures are playing an increasingly important role in the evaluation of digital image quality. New IQA indices are expected to be strongly correlated with subjective observer evaluations expressed by MOS/DMOS scores. One such recently proposed index is the SuperPixel-based SIMilarity (SPSIM) index, which uses superpixel patches instead of the rectangular pixel grid. In this paper, three modifications of the SPSIM index are proposed. For this purpose, the color space used by SPSIM was changed, and the way SPSIM determines similarity maps was modified using methods derived from the algorithm for computing the MDSI index. The third modification was a combination of the first two. These three new quality indices were used in the assessment process. The experimental results obtained on many color images from five image databases demonstrated the advantages of the proposed SPSIM modifications.
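The similarity maps mentioned above are built from a pointwise, SSIM-style ratio of feature maps; a generic sketch of that ratio (the constant c and the sample values are illustrative, not SPSIM's actual parameters):

```python
import numpy as np

def similarity_map(f1, f2, c=0.01):
    """Pointwise similarity of two feature maps: equals 1 where the
    features agree and decays towards 0 as they diverge (c stabilizes
    the ratio near zero)."""
    return (2 * f1 * f2 + c) / (f1**2 + f2**2 + c)

g = np.array([0.2, 0.5, 0.9])
print(similarity_map(g, g))           # identical features -> all ones
```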
ARTICLE | doi:10.20944/preprints202007.0686.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: document scanning; whiteboard capture; image enhancement; image alignment; image registration; image quality assessment
Online: 28 July 2020 (14:03:51 CEST)
The move from paper to online is not only necessary for remote working, it is also significantly more sustainable. This trend has seen a rising need for high-quality digitization of content from pages and whiteboards to sharable online material. But capturing this information is not always easy, nor are the results always satisfactory. Available scanning apps vary in their usability and do not always produce clean results, retaining surface imperfections from the page or whiteboard in their output images. CleanPage, a novel smartphone-based document and whiteboard scanning system, is presented. CleanPage requires one button-tap to capture, identify, crop, and clean an image of a page or whiteboard. Unlike equivalent systems, no user intervention is required during processing, and the result is a high-contrast, low-noise image with a clean homogenous background. Results are presented for a selection of scenarios showing the versatility of the design. CleanPage is compared with two market-leading scanning apps using two testing approaches: real paper scans and ground-truth comparisons. These comparisons are achieved by a new testing methodology that allows scans to be compared to unscanned counterparts by using synthesized images. Real paper scans are tested using image quality measures. An evaluation of standard image quality assessments is included in this work, and a novel quality measure for scanned images is proposed and validated. The user experience for each scanning app is assessed, showing CleanPage to be faster and easier to use.
ARTICLE | doi:10.20944/preprints202010.0323.v1
Subject: Engineering, Automotive Engineering Keywords: Image segmentation; sonar image; ocean engineering; morphological image processing
Online: 15 October 2020 (13:10:41 CEST)
Segmenting sonar images has remained a difficult problem for years; most are noisy images with inevitable blur after noise reduction. To address this problem, a fast segmentation algorithm is proposed on the basis of the gray-value characteristics of sonar images. The algorithm has the advantage of requiring no calculation of segmentation thresholds. It follows these steps: first, calculate the gray matrix of the fuzzy image background. After adjusting the gray value, segment the image into the background region, a buffer region, and target regions. After filtering, reset the pixels with gray values lower than 255 to binarize the image and eliminate most artifacts. Finally, remove the remaining noise from the image by means of morphological image processing. Simulation results on several sonar images show that the algorithm can segment fuzzy sonar images quickly and effectively, without the problem of incomplete target shapes, demonstrating that the method is stable and feasible.
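The final morphological clean-up step can be sketched with a binary opening, which removes isolated noise pixels while preserving the target shape (the mask below is a toy example, not real sonar data):

```python
import numpy as np
from scipy import ndimage

# Morphological opening (erosion then dilation) removes small bright
# artifacts from a binary segmentation while keeping larger targets.
mask = np.zeros((9, 9), dtype=bool)
mask[2:7, 2:7] = True        # the sonar target (5x5 block)
mask[0, 8] = True            # an isolated noise pixel
cleaned = ndimage.binary_opening(mask, structure=np.ones((3, 3), bool))
print(int(cleaned.sum()))    # 25: only the 5x5 target survives
```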
ARTICLE | doi:10.20944/preprints201711.0193.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: computational intelligence; quantum hybrid intelligent systems; quantum machine learning; medical image processing; disease diagnosis; Fuzzy k-NN; Quantum-behaved PSO; cervical smear images; cancer detection
Online: 30 November 2017 (07:21:00 CET)
A quantum hybrid (QH) intelligent approach that blends the adaptive search capability of the quantum-behaved particle swarm optimisation (QPSO) method with the intuitionistic rationality of the traditional fuzzy k-nearest neighbours (Fuzzy k-NN) algorithm (known simply as the Q-Fuzzy approach) is proposed for efficient feature selection and classification of cells in cervical smear (CS) images. From an initial multitude of seventeen (17) features describing the geometry, colour, and texture of the CS images, the QPSO stage of our proposed technique is used to select the best subset of features (i.e. global best particles), a pruned-down collection of seven (7) features. Using a dataset of almost 1000 images, performance evaluation of our proposed Q-Fuzzy approach assesses the impact of our feature selection on classification accuracy by way of three experimental scenarios that are compared alongside two other approaches: the All-features approach (i.e. classification without prior feature selection) and another hybrid technique combining the standard PSO algorithm with the Fuzzy k-NN technique (the P-Fuzzy approach). In the first and second scenarios, we further divided the assessment criteria into classification accuracy based on the choice of best features and accuracy across the different categories of cervical cells. In the third scenario, we introduced new QH hybrid techniques, i.e. QPSO combined with other supervised learning methods, and compared their classification accuracy alongside our proposed Q-Fuzzy approach. Furthermore, we employed statistical approaches to establish qualitative agreement with regard to the feature selection in scenarios 1 and 3. The synergy between QPSO and Fuzzy k-NN in the proposed Q-Fuzzy approach marginally improves classification accuracy, as manifested in the reduction in the number of cell features, which is crucial for effective cervical cancer detection and diagnosis.
REVIEW | doi:10.20944/preprints202306.1179.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: image forensics; image forgery detection; robust image watermarking; deep learning
Online: 16 June 2023 (11:07:50 CEST)
Digital images have become an important carrier for people to access information in the information age. However, with the development of technology, digital images are vulnerable to illegal access and tampering, to the extent that they pose a serious threat to personal privacy, social order, and national security. Therefore, image forensic techniques have become an important research topic in the field of multimedia information security. In recent years, deep learning technology has been widely applied in the field of image forensics, and the performance achieved has significantly exceeded that of conventional forensic algorithms. This survey compares the state-of-the-art image forensic techniques based on deep learning in recent years. The image forensic techniques are divided into passive and active forensics. In passive forensics, forgery detection techniques are reviewed, and the basic framework, evaluation metrics, and commonly used datasets for forgery detection are presented. The performance, advantages, and disadvantages of existing methods are also compared and analyzed according to different types of detection. In active forensics, robust image watermarking techniques are overviewed, and the evaluation metrics and basic framework of robust watermarking techniques are presented. The technical characteristics and performance of existing methods are analyzed based on the different types of attacks on images. Finally, future research directions and conclusions are given to provide useful suggestions for people in image forensics and related research fields.
ARTICLE | doi:10.20944/preprints201703.0086.v1
Subject: Engineering, Control And Systems Engineering Keywords: image enhancement; image fusion; color space; edge detector; underwater image
Online: 14 March 2017 (17:52:48 CET)
In order to improve contrast and restore color for underwater images captured by camera sensors without suffering from insufficient detail and color cast, a fusion algorithm for image enhancement in different color spaces based on contrast limited adaptive histogram equalization (CLAHE) is proposed in this article. The original color image is first converted from the RGB color space to two different special color spaces: YIQ and HSI. The color space conversion from RGB to YIQ is a linear transformation, while the RGB to HSI conversion is nonlinear. Then, the algorithm separately applies CLAHE in the YIQ and HSI color spaces to obtain two different enhanced images. The luminance component (Y) in the YIQ color space and the intensity component (I) in the HSI color space are enhanced with the CLAHE algorithm. CLAHE has two key parameters, Block Size and Clip Limit, which mainly control the quality of the CLAHE-enhanced image. After that, the YIQ and HSI enhanced images are respectively converted back to RGB. When the red, green, and blue components are not coherent in the YIQ-RGB or HSI-RGB images, the three components have to be harmonized with the CLAHE algorithm in RGB space. Finally, with a 4-direction Sobel edge detector in the bounded general logarithm ratio operation, a self-adaptive weight selection nonlinear image enhancement is carried out to fuse the YIQ-RGB and HSI-RGB images together into the final fused image. The enhancement fusion algorithm has two key factors, the average of the Sobel edge detector and the fusion coefficient, and these two factors determine the effect of the enhancement fusion algorithm. A series of evaluation metrics such as mean, contrast, entropy, colorfulness metric (CM), mean square error (MSE), and peak signal-to-noise ratio (PSNR) are used to assess the proposed enhancement algorithm. The experimental results show that the proposed algorithm provides more detail enhancement and higher values of colorfulness restoration compared to other existing image enhancement algorithms. The proposed algorithm can effectively suppress noise interference and improve the quality of underwater images.
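The YIQ branch of such a pipeline can be sketched as below; note that a plain global histogram equalization stands in for CLAHE here (CLAHE additionally tiles the image and clips the histogram before equalizing), and the low-contrast input image is random stand-in data:

```python
import numpy as np

# RGB -> YIQ is the linear NTSC transform used for the first branch.
RGB2YIQ = np.array([[0.299, 0.587, 0.114],
                    [0.596, -0.274, -0.322],
                    [0.211, -0.523, 0.312]])

def equalize(channel):
    """Global histogram equalization of a uint8 channel (a simplified
    stand-in for CLAHE)."""
    hist = np.bincount(channel.ravel(), minlength=256)
    cdf = hist.cumsum() / channel.size
    return (cdf[channel] * 255).astype(np.uint8)

rng = np.random.default_rng(1)
rgb = rng.integers(40, 90, size=(64, 64, 3), dtype=np.uint8)  # low contrast
yiq = rgb.astype(float) @ RGB2YIQ.T
y = yiq[..., 0].clip(0, 255).astype(np.uint8)
yiq[..., 0] = equalize(y)              # enhance only the luminance (Y)
print(y.max() - y.min(), "->", int(yiq[..., 0].max() - yiq[..., 0].min()))
```

The inverse YIQ transform (the matrix inverse of RGB2YIQ) would then map the enhanced image back to RGB for the fusion stage.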
REVIEW | doi:10.20944/preprints202307.0585.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: Underwater image analysis; Underwater image restoration; Underwater image enhancement; Underwater datasets; Underwater image quality evaluation
Online: 10 July 2023 (10:06:22 CEST)
In recent years, underwater exploration for deep-sea resource utilization and development has attracted considerable interest. In an underwater environment, the obtained images and videos undergo several types of quality degradation resulting from light absorption and scattering, low contrast, color deviation, blurred details, and nonuniform illumination. Therefore, the restoration and enhancement of degraded images and videos are critical. Numerous techniques from image processing, pattern recognition, and computer vision have been proposed for image restoration and enhancement, but many challenges remain. This survey presents a comparison of the most prominent approaches in underwater image processing and analysis. It also gives an overview of the underwater environment with a broad classification into enhancement and restoration techniques, and introduces the main reasons for underwater image degradation in addition to the underwater image model. The existing underwater image analysis techniques, methods, datasets, and evaluation metrics are presented in detail. Furthermore, the existing limitations are analyzed and classified into image-related and environment-related categories. In addition, performance is validated on images from the UIEB dataset for qualitative, quantitative, and computational time assessment. Areas in which underwater images have recently been applied are briefly discussed. Finally, recommendations for future research are provided and the conclusion is presented.
ARTICLE | doi:10.20944/preprints201902.0089.v3
Subject: Computer Science And Mathematics, Probability And Statistics Keywords: Digital image processing, color image, grayscale image, histogram equalization, histogram specification, image enhancement, RGB channel
Online: 11 February 2019 (10:42:57 CET)
This paper has two major parts. In the first part, histogram equalization for image enhancement was implemented without using the built-in function in MATLAB. First, a color image of a rat was chosen and transformed into a grayscale image. After this conversion, histogram equalization was applied to the grayscale image. Histogram equalization was then applied to each RGB channel of the same image to observe its effect on each channel. Finally, histogram equalization was applied to the color image of the rat itself. In the second part, the histogram of another color image of a rat was introduced as the desired histogram for the grayscale image from part 1, and histogram specification was applied to the original color image.
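The histogram specification step from the second part maps each source grey level to the template grey level with the nearest cumulative probability; a minimal numpy sketch on toy 2x2 images (not the rat photographs used in the paper):

```python
import numpy as np

def match_histogram(source, template):
    """Histogram specification: remap source grey levels so that their
    distribution follows the template image's histogram."""
    s_vals, s_counts = np.unique(source.ravel(), return_counts=True)
    t_vals, t_counts = np.unique(template.ravel(), return_counts=True)
    s_cdf = np.cumsum(s_counts) / source.size
    t_cdf = np.cumsum(t_counts) / template.size
    # For each source level, pick the template level at the nearest CDF.
    mapped = np.interp(s_cdf, t_cdf, t_vals)
    lut = dict(zip(s_vals, mapped))
    return np.vectorize(lut.get)(source)

src = np.array([[0, 1], [2, 3]])
ref = np.array([[10, 20], [30, 40]])
print(match_histogram(src, ref))      # grey levels remapped to 10..40
```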
ARTICLE | doi:10.20944/preprints201811.0565.v1
Subject: Computer Science And Mathematics, Probability And Statistics Keywords: Digital image processing, color image, grayscale image, histogram equalization, histogram specification, image enhancement, RGB channel
Online: 23 November 2018 (14:17:13 CET)
This paper has two major parts. In the first part, histogram equalization for image enhancement was implemented without using the built-in function in MATLAB. First, a color image of a rat was chosen and transformed into a grayscale image. After this conversion, histogram equalization was applied to the grayscale image. Histogram equalization was then applied to each RGB channel of the same image to observe its effect on each channel. Finally, histogram equalization was applied to the color image of the rat itself. In the second part, the histogram of another color image of a rat was introduced as the desired histogram for the grayscale image from part 1, and histogram specification was applied to the original color image.
ARTICLE | doi:10.20944/preprints202309.2177.v1
Subject: Engineering, Mechanical Engineering Keywords: particle image velocimetry; OpenPIV; python; image processing
Online: 30 September 2023 (09:59:14 CEST)
Particle Image Velocimetry (PIV) is a widely used experimental technique for measuring flow. In recent years, open-source PIV software has become more popular as it offers researchers and practitioners enhanced computational capabilities. Software development for graphical processing unit (GPU) architectures requires careful algorithm design and data structure selection for optimal performance. PIV software optimized for central processing units (CPUs) offers an alternative to specialized GPU software. In the present work, an improved algorithm for the OpenPIV-Python software is presented and implemented under a traditional CPU framework. The Python language was selected due to its versatility and widespread adoption. The algorithm was also tested on a supercomputing cluster, a workstation, and Google Colaboratory during the development phase. Using a known velocity field, the algorithm precisely captured the time-averaged flow, instantaneous velocity fields, and vortices.
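The core of a PIV algorithm is locating the cross-correlation peak between two interrogation windows; a compact numpy/FFT sketch (window size and the synthetic particle shift are illustrative, not OpenPIV-Python's actual implementation):

```python
import numpy as np

def piv_displacement(window_a, window_b):
    """Estimate the integer-pixel shift between two interrogation
    windows via FFT-based cross-correlation."""
    fa = np.fft.rfft2(window_a - window_a.mean())
    fb = np.fft.rfft2(window_b - window_b.mean())
    corr = np.fft.irfft2(fa.conj() * fb, s=window_a.shape)
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Wrap peaks past the half-way point to negative displacements.
    shift = [p - s if p > s // 2 else p for p, s in zip(peak, window_a.shape)]
    return tuple(shift)

rng = np.random.default_rng(2)
a = rng.random((32, 32))
b = np.roll(a, (3, 5), axis=(0, 1))   # "particles" moved 3 px down, 5 px right
print(piv_displacement(a, b))         # (3, 5): the imposed shift is recovered
```

A full PIV pass would repeat this over a grid of windows and refine the peak to sub-pixel precision.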
ARTICLE | doi:10.20944/preprints202304.1088.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: deep learning; image aesthetics assessment; image enhancement
Online: 28 April 2023 (03:15:16 CEST)
Image aesthetic assessment (IAA) with neural attention has made significant progress due to its effectiveness in object recognition. Current studies have shown that the features learned by convolutional neural networks (CNN) at different learning stages convey meaningful information. Shallow features contain the low-level information of images, while deep features perceive image semantics and themes. Inspired by this, we propose a visual enhancement network with feature fusion (FF-VEN). It consists of two sub-modules, the visual enhancement module (VE module) and the shallow and deep feature fusion module (SDFF module). The former uses an adaptive filter in the spatial domain to simulate human eyes according to the region of interest (ROI) extracted by neural feedback. The latter not only extracts the shallow and deep features via transverse connections, but also uses a feature fusion unit (FFU) to fuse the pooled features together with the aim of maximizing information contribution. Experiments on the standard AVA dataset and the Photo.net dataset show the effectiveness of FF-VEN.
ARTICLE | doi:10.20944/preprints202310.0838.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: neural networks; image denoising; image processing; denoising algorithms
Online: 13 October 2023 (04:19:29 CEST)
Image denoising has been one of the important problems in the field of computer vision, and it has wide practical value in many applications, such as medical image processing, image enhancement, and computational photography. Traditional image denoising methods are usually based on hand-designed features and filters, but these methods perform poorly under complex noise and image structures. In recent years, the rapid development of neural network technology has revolutionized the image denoising task. This paper introduces background knowledge about neural networks and image denoising, explores the impact of neural networks on image denoising, and explains how neural networks can be used to denoise images. It also summarises other image denoising methods and finally points out the challenges and problems currently faced by image denoising. Some possible new development directions are proposed to provide new solutions for image denoising researchers and to promote the development of the field.
ARTICLE | doi:10.20944/preprints202306.0081.v1
Subject: Engineering, Bioengineering Keywords: Deep Learning; Image Synthesis; Image Generation; Machine Learning; Medical Imaging; CT to MRI; Synthetic MRI; Stroke; Image-to-image Translation
Online: 1 June 2023 (11:30:09 CEST)
CT scans are currently the most common imaging modality used for suspected stroke patients due to their short acquisition time and wide availability. However, MRI offers superior tissue contrast and image quality. In this study, eight deep learning models are developed, trained, and tested using a dataset of 181 CT/MR pairs from stroke patients. The resultant synthetic MRIs generated by these models are compared through a variety of qualitative and quantitative methods. The synthetic MRIs generated by a 3D UNet model consistently demonstrated superior performance across all methods of evaluation. Overall, the generation of synthetic MRIs from CT scans using the methods described in this paper produces realistic MRIs that can guide the registration of CT scans to MRI atlases. The synthetic MRIs enable the segmentation of white matter, gray matter, and cerebrospinal fluid using algorithms designed for MRIs, exhibiting a high degree of similarity to true MRIs.
ARTICLE | doi:10.20944/preprints202309.0946.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: adversarial attacks; artificial neural networks; robustness; image filtering; convolutional neural networks; image recognition; image distortion
Online: 14 September 2023 (08:31:30 CEST)
In this paper, we continue the research cycle on the properties of convolutional neural network-based image recognition systems and ways to improve their noise immunity and robustness. Currently, a popular research area related to artificial neural networks is adversarial attacks. The effect of an adversarial attack on an image is barely perceptible to the human eye, yet it drastically reduces the neural network's accuracy. Image perception by a machine is highly dependent on the propagation of high-frequency distortions throughout the network. At the same time, a human efficiently ignores high-frequency distortions, perceiving the shape of objects as a whole. The approach proposed in this paper can improve image recognition accuracy in the presence of high-frequency distortions, in particular those caused by adversarial attacks. The proposed technique makes it possible to align the logic of an artificial neural network with that of a human, for whom high-frequency distortions are not decisive in object recognition.
ARTICLE | doi:10.20944/preprints202306.0736.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: image denoising; image deblurring; salt&pepper noise; nonlinear diffusion.
Online: 12 June 2023 (02:18:59 CEST)
An algorithm for the treatment of images affected by both blurring and salt&pepper noise is proposed, with a cost only proportional to the number of pixels. The methodology uses a discretization scheme for the Laplace operator multiplied by a suitable nonlinear term depending on the gradient. Although this approach resembles a diffusion-type algorithm, only one step of the procedure is applied, leading to significant time savings. The procedure is successfully tested on some standard black-and-white natural images.
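The one-step idea described above can be sketched as follows. The paper's exact nonlinear term is not given in the abstract, so a Perona-Malik-style gradient-dependent conductance is used here as an illustrative stand-in.

```python
import numpy as np

def one_step_diffusion(img, dt=0.2, k=30.0):
    """Single explicit step of gradient-modulated Laplacian smoothing.

    Illustrative sketch only: the conductance g = 1 / (1 + (|grad|/k)^2)
    is an assumption standing in for the paper's nonlinear term.
    The cost is proportional to the number of pixels.
    """
    f = img.astype(float)
    p = np.pad(f, 1, mode='edge')
    # 5-point discrete Laplacian with replicated borders
    lap = p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4.0 * f
    # central-difference gradient magnitude squared
    gx = (p[1:-1, 2:] - p[1:-1, :-2]) / 2.0
    gy = (p[2:, 1:-1] - p[:-2, 1:-1]) / 2.0
    g = 1.0 / (1.0 + (gx**2 + gy**2) / k**2)
    # one explicit Euler step -- no iteration, unlike full diffusion schemes
    return f + dt * g * lap
```

Because only one step is taken, there is no iteration loop and thus no repeated traversal of the image, which is where the time savings come from.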
ARTICLE | doi:10.20944/preprints202108.0286.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: Image enhancement; DCT-Domain Perceived Contrast; Perceptual Image Quality
Online: 13 August 2021 (08:31:37 CEST)
This paper develops a detail image-signal enhancement method that makes images perceived as clearer and more resolved, and so is more effective for higher-resolution displays. We observe that enhancing local variant signals makes images more vivid, and that revealing the granular signals harmonically embedded on the local variant signals makes images more resolved. Based on this observation, we develop a method that not only emphasizes the local variant signals by scaling up the frequency energy in accordance with human visual perception, but also strengthens the granular signals by embedding the alpha-rooting enhanced frequency components. The proposed energy scaling method emphasizes the detail signals in texture images and rarely boosts noisy signals in plain images. In addition, to avoid local ringing artifacts, the proposed method adjusts the enhancement direction to be parallel to the underlying image signal direction. Subjective and objective quality evaluations verified that the developed method makes images perceived as clearer and more highly resolved.
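Alpha-rooting, the ingredient the method embeds for granular detail, can be illustrated in a few lines. This is a generic sketch, not the paper's DCT-domain scheme: NumPy's 2-D FFT stands in for the DCT, and the value of alpha is a free parameter.

```python
import numpy as np

def alpha_rooting(img, alpha=0.9):
    """Generic alpha-rooting enhancement sketch: each frequency
    coefficient keeps its phase while its magnitude is raised to the
    power alpha < 1, which relatively boosts weaker high-frequency
    detail.  Done here on the FFT rather than the paper's DCT."""
    F = np.fft.fft2(img.astype(float))
    mag, phase = np.abs(F), np.angle(F)
    F_enh = (mag ** alpha) * np.exp(1j * phase)
    return np.real(np.fft.ifft2(F_enh))
```

With alpha = 1 the image is returned unchanged; values slightly below 1 compress the dynamic range of the spectrum, lifting weak components relative to strong ones.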
ARTICLE | doi:10.20944/preprints202101.0345.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: image processing; low resolution image; crack detection; user algorithm
Online: 18 January 2021 (14:26:38 CET)
Abstract: Imaging devices of less than 300,000 pixels are mostly used for sewage conduit exploration due to the small scale of the survey industry in Korea. In particular, devices of less than 100,000 pixels are still widely used, and the environment for image processing is very poor. Since the sewage conduit images covered in this study have a very low resolution (240 × 320 = 76,800 pixels), it is very difficult to detect cracks. Because the resolution of most sewer conduit images in Korea is very low, this problem of low resolution was selected as the subject of study. Cracks were detected through a total of six steps, including enhancing the crack in Step 2, finding the optimal threshold value in Step 3, and applying an algorithm to detect cracks in Step 5. Cracks were effectively detected by the optimal parameters in Steps 2 and 3 and the user algorithm in Step 5. Despite the very low resolution, the cracked images showed 96.4% detection accuracy, and the non-cracked images showed 94.5% accuracy. Moreover, the analysis was also excellent in quality. It is believed that the findings of this study can be effectively used for crack detection in low-resolution images.
ARTICLE | doi:10.20944/preprints201810.0393.v1
Subject: Engineering, Control And Systems Engineering Keywords: image analysis; Turin Shroud; body-image formation; energy propagation
Online: 18 October 2018 (03:55:21 CEST)
Recent studies on the image of the Turin Shroud (TS) suggest it could have been formed through a not well-identified mechanism of energy radiation. In order to fill some gaps in the understanding of this imaging process, a reverse engineering method has been applied, allowing some possible mechanisms to be excluded. The image formation of a human face wrapped in a cloth has been simulated using ad-hoc developed software. Different kinds of radiation, depending on different parameters, have been simulated, each connected with accredited hypotheses. On the basis of the comparison between the different images produced by the software and the TS Face, useful information both about the kind of radiation and the cloth-wrapping conditions has been obtained. The effect of image distortion of a cloth wrapped around a face has also been discussed, by defining the best laws of radiation and of their attenuation with distance. A Lambertian law is not compatible with the TS image. A vertical radiation shows a problem in reproducing the required resolution. A radiation perpendicular to the emitting surface, like that produced by an electric field, appears promising to explain the TS Face.
ARTICLE | doi:10.20944/preprints201705.0028.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: monocular image; image segment; SIFT; depth measurement; convex hull
Online: 3 May 2017 (09:19:59 CEST)
Recovering the depth information of objects from two-dimensional images is one of the most important and basic problems in the computer vision field. In view of the shortcomings of existing methods of depth estimation, a novel approach based on SIFT (the Scale Invariant Feature Transform) is presented in this paper. The approach can estimate the depths of objects in two images captured by an un-calibrated ordinary monocular camera. In this approach, first, the first image is captured. With all camera parameters unchanged, the second image is acquired after moving the camera a distance d along the optical axis. Then image segmentation and SIFT feature extraction are performed on the two images separately, and objects in the images are matched. Lastly, an object's depth can be computed from the lengths of a pair of straight line segments. To ensure that the most appropriate pair of straight line segments is chosen, and to reduce the computation, the theory of convex hulls and the knowledge of triangle similarity are employed. The experimental results show our approach is effective and practical.
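The triangle-similarity step can be made concrete with the usual pinhole-camera relations; the function below is an illustrative sketch (the names l1, l2, d follow the description above, but the paper's exact formulation may differ).

```python
def depth_from_axial_motion(l1, l2, d):
    """Depth estimate from two monocular images taken before and after
    moving the camera a distance d forward along the optical axis.

    For a pinhole camera, a segment of true length L at depth Z projects
    to l1 = f*L/Z; after moving forward by d it projects to
    l2 = f*L/(Z - d).  Eliminating f*L gives Z = d*l2 / (l2 - l1).
    Illustrative sketch under ideal pinhole assumptions.
    """
    if l2 <= l1:
        raise ValueError("moving toward the object must enlarge the segment")
    return d * l2 / (l2 - l1)
```

For example, a segment that measures 10 pixels at first and 12.5 pixels after the camera advances 2 units corresponds to an initial depth of 10 units.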
ARTICLE | doi:10.20944/preprints201611.0057.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: multi-focus image, image fusion, region mosaic, contrast pyramid
Online: 10 November 2016 (07:34:22 CET)
This paper proposes a new approach for multi-focus image fusion based on Region Mosaicing on Contrast Pyramids (REMCP). A density-based region growing method is developed to construct a focused region mask for multi-focus images. The segmented focused region mask is decomposed into a mask pyramid, which is then used for supervised region mosaicking on a contrast pyramid. In this way, the focus measurement and the continuity of focused regions are incorporated, and pixel-level pyramid fusion is improved to the region level. Objective and subjective experiments show that the proposed REMCP is more robust to noise than the compared algorithms and can fully preserve the focus information of the multi-focus images while reducing distortions in the fused images.
ARTICLE | doi:10.20944/preprints201811.0566.v2
Subject: Computer Science And Mathematics, Probability And Statistics Keywords: Color image, grayscale image, motion blurring, random noise, inverse filtering, Wiener filtering, restoration of an image
Online: 5 February 2019 (16:13:14 CET)
In this paper, at first, a color image of a car is taken. Then the image is transformed into a grayscale image. After that, the motion blurring effect is applied to that image according to the image degradation model described in equation 3. The blurring effect can be controlled by the a and b components of the model. Then random noise is added to the image via MATLAB programming. Many methods can restore a noisy and motion-blurred image; in this paper, inverse filtering and Wiener filtering in particular are implemented for the restoration purpose. Consequently, both the motion-blurred and the noisy motion-blurred images are restored via inverse filtering as well as Wiener filtering techniques, and a comparison is made among them.
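A minimal frequency-domain sketch of the Wiener restoration step is shown below, assuming a known blur kernel and a scalar noise-to-signal ratio; the paper's MATLAB implementation and its degradation model (equation 3) are not reproduced here.

```python
import numpy as np

def wiener_deconvolve(blurred, psf, nsr=0.01):
    """Frequency-domain Wiener filter sketch.  H is the FFT of the blur
    kernel (zero-padded to the image size); nsr approximates the
    noise-to-signal power ratio.  Setting nsr = 0 reduces this to
    inverse filtering, which amplifies noise wherever |H| is small."""
    H = np.fft.fft2(psf, s=blurred.shape)
    G = np.fft.fft2(blurred)
    # Wiener estimate: F = G * conj(H) / (|H|^2 + NSR)
    F_hat = G * np.conj(H) / (np.abs(H) ** 2 + nsr)
    return np.real(np.fft.ifft2(F_hat))
```

The nsr term is what distinguishes Wiener from inverse filtering: it regularizes the division at frequencies where the blur wipes out signal, which is why Wiener filtering typically handles the noisy motion-blurred case better.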
ARTICLE | doi:10.20944/preprints202201.0259.v2
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: image classifier; image part; quick learning; feature overlap; positional context
Online: 11 April 2022 (10:17:57 CEST)
This paper describes an image processing method that makes use of image parts instead of neural parts. Neural networks excel at image or pattern recognition, and they do this by constructing complex networks of weighted values that can cover the complexity of the pattern data. These features, however, are integrated holistically into the network, which means that they can be difficult to use in an individual sense. A different method might scan individual images and use a more local method to try to recognise the features in them. This paper suggests such a method, where a trick during the scan process can not only recognise separate image parts as features, but also produce an overlap between the parts. It is therefore able to produce image parts with real meaning and also place them into a positional context. Tests show that it can be quite accurate on some handwritten digit datasets, but not as accurate as a neural network, for example. The fact that it offers an explainable interface could make it interesting, however. It also fits well with an earlier cognitive model, and an ensemble-hierarchy structure in particular.
ARTICLE | doi:10.20944/preprints202006.0117.v1
Subject: Medicine And Pharmacology, Other Keywords: Image Noise Removal; Image Enhancement; MFNR; Speckle noise; Median Filter
Online: 9 June 2020 (05:00:26 CEST)
Speckle noise is one of the most difficult noises to remove, especially in medical applications. It is a nuisance in ultrasound imaging, which is used in about half of all medical screening systems. Thus, noise removal is an important step in these systems, enabling reliable, automated, and potentially low-cost systems. Herein, a generalized approach, MFNR (Multi-Frame Noise Removal), is used, which is a complete noise removal system based on KDE (Kernel Density Estimation). Any given type of noise can be removed if its probability density function (PDF) is known; herein, we extracted the PDF parameters using KDE. Noise removal and detail preservation are not contrary to each other, as they are in single-frame noise removal methods. Our results showed practically complete noise removal using the MFNR algorithm compared to standard noise removal tools. The Peak Signal to Noise Ratio (PSNR) was used as a comparison metric. This paper is an extension of our previous paper, where the MFNR algorithm was shown to be a general-purpose complete noise removal tool for all types of noise.
ARTICLE | doi:10.20944/preprints202002.0125.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: image inpainting; image completion; attention; pyramid structure loss; deep learning
Online: 10 February 2020 (10:16:37 CET)
This paper develops a multi-task learning framework that attempts to incorporate image structure knowledge to assist image inpainting, which is not well explored in previous works. The primary idea is to train a shared generator to simultaneously complete the corrupted image and the corresponding structures --- edge and gradient --- thus implicitly encouraging the generator to exploit relevant structure knowledge while inpainting. In the meantime, we also introduce a structure embedding scheme to explicitly embed the learned structure features into the inpainting process, thus providing possible preconditions for image completion. Specifically, a novel pyramid structure loss is proposed to supervise structure learning and embedding. Moreover, an attention mechanism is developed to further exploit the recurrent structures and patterns in the image to refine the generated structures and contents. Through multi-task learning and structure embedding, together with attention, our framework takes advantage of the structure knowledge and outperforms several state-of-the-art methods on benchmark datasets quantitatively and qualitatively.
ARTICLE | doi:10.20944/preprints201906.0248.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: image segmentation; neutrosophic information; Shannon entropy; gray level image threshold
Online: 25 June 2019 (08:48:22 CEST)
This article presents a new method of segmenting grayscale images by minimizing Shannon's neutrosophic entropy. For the proposed segmentation method, the neutrosophic information components, i.e., the degree of truth, the degree of neutrality and the degree of falsity, are defined taking into account membership in the segmented regions and, at the same time, in the separation threshold area. The principle of the method is simple and easy to understand, and it can lead to multiple thresholds. The efficacy of the method is illustrated using some test gray-level images. The experimental results show that the proposed method has good performance for segmentation with optimal gray-level thresholds.
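The neutrosophic entropy criterion itself is not given in the abstract. As a simpler relative, the classic Shannon maximum-entropy (Kapur) threshold search over a gray-level histogram can be sketched as follows; it illustrates entropy-driven threshold selection, not the paper's neutrosophic formulation.

```python
import math

def kapur_threshold(hist):
    """Classic maximum-entropy (Kapur) threshold for a grayscale
    histogram: pick the bin index t that maximizes the sum of Shannon
    entropies of the two class-conditional distributions.  Illustrative
    baseline only."""
    total = sum(hist)
    p = [h / total for h in hist]
    best_t, best_h = 0, -1.0
    for t in range(1, len(p)):
        w0 = sum(p[:t]); w1 = 1.0 - w0
        if w0 <= 0 or w1 <= 0:
            continue
        h0 = -sum(q / w0 * math.log(q / w0) for q in p[:t] if q > 0)
        h1 = -sum(q / w1 * math.log(q / w1) for q in p[t:] if q > 0)
        if h0 + h1 > best_h:
            best_t, best_h = t, h0 + h1
    return best_t
```

Running the same search over sub-ranges of the histogram is one standard way such criteria are extended to the multiple thresholds the article mentions.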
ARTICLE | doi:10.20944/preprints201904.0078.v1
Subject: Social Sciences, Psychology Keywords: forest recreation; forest landscape; landscape image; landscape image sketching technique
Online: 8 April 2019 (09:08:30 CEST)
The landscape image is the bridge of communication between people and forests, and the entry point of the supply-side reform of forest tourism products. The research collected a total of 140 forest landscape image drawings from non-art-major graduate students by random sampling during April and May 2018, and constructed a conceptual model of the forest landscape image by utilizing the landscape image sketching technique. The results showed that: (1) In regard to linguistic knowledge, the natural landscape elements (for instance, herbaceous plants, terrain, creatures, water and sky) and the broad-leaf forest objectively reflected not only the real forest landscape and the local native vegetation, but also the variation of forest species, which received little attention. (2) From the perspective of spatial view, the sideways view indicated that graduate students preferred to view forests at a moderate distance from the outside, and few looked at forests from within. (3) In the view of self-orientation, the objective landscape indicated that graduate students preferred to depict forest landscapes without realizing they could interact with the environment. (4) On the aspect of social meaning, the scenic view and forest structure indicated that graduate students preferred rural forest landscapes, with no significant special interests in forests otherwise. In conclusion, (1) the forest is thought of as a feature of people's life world and of rural scenes around homes, not as an objective perception of the forest. (2) The forest is regarded as an important habitat for animals and a limited resource for people's life, production and recreation needs, into which people will go only to meet such needs. (3) The natural values of forests, like ecology and aesthetics, get more attention, while the social values of forests, like life, production and culture, receive rather little attention.
ARTICLE | doi:10.20944/preprints202006.0091.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: Breast Cancer Screening; Digital Image Elasto Tomography (DIET); Image Noise Removal, Image Enhancement; Multiple Frame Noise Removal (MFNR)
Online: 7 June 2020 (14:53:34 CEST)
Breast cancer is a leading cause of death among women. Conventional screening methods, such as mammography and ultrasound diagnosis, are expensive and have significant limitations. Digital Image Elasto Tomography (DIET) is a new noninvasive breast cancer screening system that has the potential to be a low-cost and reliable breast cancer screening tool. It is based on modal analysis of the breast mass and stereographic 3D image analysis to detect stiffer abnormal tissues. However, camera sensor noise, especially Gaussian noise, is a major source of Optical Flow (OF) error in this approach to tumor detection. This work studies the performance of different conventional filters, including the standard Gaussian filter, in removing this noise to produce more robust screening results. A radical approach, Multiple Frame Noise Removal (MFNR), is proposed for use in this type of medical image processing instead of a Gaussian filter or other typical image noise removal tools. It is a multiple-frame noise removal method in which the Probability Density Function (PDF) of the noise is extracted from multiple images by characterizing the same pixel positions across them. The noise becomes deterministic, and hence easily removed. The proposed algorithm was applied to a data set from 10 phantom breast tests with a prototype DIET system, and 10 in-vivo samples from healthy women. Comparisons were made to an optimal Gaussian filter form that is commonly used. The reduction in OF error on these digitally imaged data sets was used to compare performance. Refinement of the images for medical applications requires higher PSNR, which was successfully achieved by using the MFNR algorithm. In this study, the algorithm was used to improve the imaging results of a DIET system. The conventional wisdom that noise removal and detail preservation are contrasting effects is thus shown not to hold for this multi-frame approach.
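The multi-frame principle can be sketched very simply: with several registered frames of the same scene, each pixel position yields samples of the noise distribution around the true value. The paper estimates this PDF with kernel density estimation; a per-pixel median is used below as a simplified stand-in.

```python
import numpy as np

def multi_frame_denoise(frames):
    """Simplified multi-frame noise removal sketch: stack registered
    frames and take the per-pixel median, a robust estimate of the true
    value when the noise distribution is roughly symmetric.  (The paper
    instead characterizes the per-pixel noise PDF via KDE; the median is
    an illustrative substitute.)"""
    stack = np.stack([f.astype(float) for f in frames], axis=0)
    return np.median(stack, axis=0)
```

Because every pixel is estimated from its own samples rather than from spatial neighbours, no spatial smoothing is applied, which is why detail preservation does not trade off against noise removal here.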
ARTICLE | doi:10.20944/preprints202310.1144.v1
Subject: Biology And Life Sciences, Life Sciences Keywords: image denoising; filtering methods; biomedical image denoising; healthcare; adaptive filtering methods
Online: 18 October 2023 (09:18:36 CEST)
In this paper, filtering methods for biomedical image denoising are described comprehensively. First, the background of biomedical image denoising is introduced: denoising biomedical images is challenging because different imaging modes have different noise characteristics, and noise levels can vary greatly depending on the specific application. Second, the paper describes the important role of biomedical image denoising in medical care, since biomedical images directly affect the patient's diagnosis, treatment plan and the overall quality of the medical care service. Then filtering methods are introduced in detail, describing the core concepts and related features of linear, nonlinear and frequency-domain filtering, with a focus on adaptive filtering: its characteristics, conditions of use, common algorithms and advantages. Next, the filtering methods used for biomedical image denoising are introduced, covering the core concepts of the Gaussian filter, the median filter, total variation denoising and the Wiener filter. The challenges encountered by filtering methods are then described, such as the accurate selection of filters and the balance between noise reduction and preservation of image detail. Finally, applications of filtering methods in other fields, such as audio processing and speech recognition, are mentioned.
In summary, this paper comprehensively expounds the filtering methods for biomedical image denoising, the relationship between medical image denoising and medical care, and the challenges encountered by filtering methods.
REVIEW | doi:10.20944/preprints202309.0223.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: deep learning; medical images; image registration; medical image analysis; survey; review
Online: 5 September 2023 (03:51:29 CEST)
Image registration (IR) is a process that deforms images to align them with respect to a reference space, making it easier for medical practitioners to examine various medical images in a standardized reference frame, such as having the same rotation and scale. This document introduces image registration using a simple numeric example. It provides a definition of image registration along with a space-oriented symbolic representation. This review covers various aspects of image transformations, including affine, deformable, invertible, and bidirectional transformations, as well as medical image registration algorithms such as Voxelmorph, Demons, SyN, Iterative Closest Point, and SynthMorph. It also explores atlas-based registration and multistage image registration techniques, including coarse-fine and pyramid approaches. Furthermore, this survey paper discusses medical image registration taxonomies, datasets, evaluation measures, such as correlation-based metrics, segmentation-based metrics, processing time, and model size. It also explores applications in image-guided surgery, motion tracking, and tumor diagnosis. Finally, the document addresses future research directions, including the further development of transformers.
ARTICLE | doi:10.20944/preprints202105.0408.v1
Subject: Engineering, Automotive Engineering Keywords: UAV Images; Monoscopic Mapping; Stereoscopic Plotting; Image Overlap; Optimal Image Selection
Online: 18 May 2021 (10:10:07 CEST)
Recently, the mapping industry has been focusing on the possibility of large-scale mapping from unmanned aerial vehicles (UAVs), owing to advantages such as easy operation and cost reduction. In order to produce large-scale maps from UAV images, it is important to obtain precise orientation parameters. For this, various techniques have been developed and are included in most commercial UAV image processing software. For mapping, it is equally important to select images that can cover a region of interest (ROI) with the fewest possible images. Otherwise, to map the ROI, one may have to handle too many images, and commercial software neither provides the information needed to select images nor explicitly explains how to select images for mapping. For these reasons, stereo mapping of UAV images in particular is time consuming and costly. In order to solve these problems, this study proposes a method to select images intelligently. We can select a minimum number of image pairs to cover the ROI with the fewest possible images, and we can also select optimal image pairs to cover the ROI with the most accurate stereo pairs. We group images by strips and generate the initial image pairs. We then apply an intelligent scheme to iteratively select optimal image pairs from the start to the end of an image strip. According to the results of the experiment, the number of images selected is greatly reduced by applying the proposed optimal image-composition algorithm. The selected image pairs produce a dense 3D point cloud over the ROI without any holes. For stereoscopic plotting, the selected image pairs successfully mapped the ROI on a digital photogrammetric workstation (DPW), and a digital map covering the ROI was generated. The proposed method should contribute to time and cost reductions in UAV mapping.
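The minimal-pair selection idea can be sketched with a one-dimensional interval model: the ROI is an interval along a strip, each stereo pair is the interval it can map, and a greedy sweep picks the pair reaching furthest past the covered position. This interval model and the function below are illustrative assumptions, not the paper's actual algorithm.

```python
def select_minimal_pairs(pairs, roi):
    """Greedy minimal interval cover: pairs is a list of (start, end)
    intervals each stereo pair can map along a strip, roi is the
    (start, end) interval to cover.  At each step the pair that extends
    coverage furthest is chosen.  Hypothetical 1-D sketch of the
    'fewest possible images' selection."""
    start, end = roi
    covered = start
    chosen = []
    while covered < end:
        # pairs that touch the current frontier and extend past it
        candidates = [p for p in pairs if p[0] <= covered and p[1] > covered]
        if not candidates:
            raise ValueError("ROI cannot be covered by the given pairs")
        best = max(candidates, key=lambda p: p[1])
        chosen.append(best)
        covered = best[1]
    return chosen
```

The greedy choice is optimal for 1-D interval cover, which mirrors the paper's goal of covering the ROI with the fewest pairs per strip.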
REVIEW | doi:10.20944/preprints202012.0479.v1
Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: Image classification; Texture image analysis; Discriminant features; Combination methods; texture operators
Online: 18 December 2020 (16:21:50 CET)
In many image processing and computer vision applications, the main aim is to describe image contents, so different visual properties such as color, texture and shape are extracted to that end. In this respect, texture information plays an important role in image description and visual pattern classification. Texture refers to a specific local distribution of intensities that is repeated throughout the image. To date, different operators or descriptors have been proposed to analyze texture characteristics. In multi-object images, single texture operators usually do not provide accurate results, so in many cases combinations of texture operators are used to achieve more discriminant features. In this paper, some combination methods are surveyed to analyze the effect of combined texture features on image content description. In the results section, the related methods are compared in terms of accuracy and computational complexity.
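As one concrete example of a texture operator whose outputs can be combined, the 8-neighbour local binary pattern (LBP) histogram can be computed as below; concatenating such histograms from several operators is one simple combination scheme. LBP is chosen here for illustration and is an assumption, not an operator singled out by the survey.

```python
import numpy as np

def lbp_histogram(img):
    """8-neighbour local binary pattern: each interior pixel is encoded
    by which of its 8 neighbours are >= the centre (one bit each), and
    the image is described by the 256-bin histogram of these codes."""
    f = img.astype(int)
    centre = f[1:-1, 1:-1]
    codes = np.zeros_like(centre)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        nb = f[1 + dy : f.shape[0] - 1 + dy, 1 + dx : f.shape[1] - 1 + dx]
        codes |= (nb >= centre).astype(int) << bit
    hist, _ = np.histogram(codes, bins=256, range=(0, 256))
    return hist
```

A combined feature vector is then simply the concatenation of the histograms produced by different operators (e.g. LBP at several radii, or LBP plus co-occurrence statistics).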
ARTICLE | doi:10.20944/preprints202005.0167.v1
Subject: Computer Science And Mathematics, Applied Mathematics Keywords: neutrosophic information; Onicescu information energy; image segmentation; gray level image threshold
Online: 10 May 2020 (14:41:04 CEST)
This article presents a method of segmenting images with gray levels that uses Onicescu's information energy calculated in the context of the neutrosophic theory. Starting from the information energy calculation for complete neutrosophic information, it is shown how to extend its calculation for incomplete and inconsistent neutrosophic information. The segmentation method is based on calculation of thresholds for separating the gray levels using the local maximum points of the Onicescu information energy.
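Onicescu's informational energy of a distribution is E(p) = sum of p_i squared. A much-simplified, non-neutrosophic sketch of an energy-driven threshold search is shown below; the paper's use of incomplete and inconsistent neutrosophic information and of local maxima is more elaborate than this illustration.

```python
def energy_threshold(hist):
    """Illustrative threshold search driven by Onicescu's informational
    energy E(p) = sum_i p_i^2: for each candidate threshold t the two
    class-conditional distributions are formed and their energies are
    summed; the maximizing t is returned.  Purer (more concentrated)
    classes have higher energy.  Simplified stand-in for the paper's
    neutrosophic criterion."""
    total = sum(hist)
    p = [h / total for h in hist]
    best_t, best_e = 1, -1.0
    for t in range(1, len(p)):
        w0, w1 = sum(p[:t]), sum(p[t:])
        if w0 <= 0 or w1 <= 0:
            continue
        e0 = sum((q / w0) ** 2 for q in p[:t])
        e1 = sum((q / w1) ** 2 for q in p[t:])
        if e0 + e1 > best_e:
            best_t, best_e = t, e0 + e1
    return best_t
```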
ARTICLE | doi:10.20944/preprints202303.0326.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: document image processing; deskew; Hough Line Transform; image rectification; machine learning; OCR; document orientation; image preprocessing; computer vision; AI
Online: 17 March 2023 (13:25:06 CET)
Document deskewing is a fundamental problem in document image processing. Existing methods have limitations: Hough Line Transformation can leave images deskewed upside down; deep learning models require huge amounts of human labour and computational resources and still fail to correct orientation while deskewing; and OCR-based methods struggle to read text when it is tilted. In this paper, we propose a novel, simple, cost-effective deep learning method for fixing both the skew and the orientation of documents. Our approach reduces the search space for the machine learning model to predicting whether an image is upside down or not, avoiding the huge search space of predicting an angle between 0 and 360 degrees. We fine-tuned a MobileNetV2 model, pre-trained on ImageNet, using only 200 images and achieved good results. This method is useful for automation tasks, such as data extraction using OCR technology, and can greatly reduce manual labour.
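As a non-learning baseline for the same task, classical projection-profile skew estimation can be sketched as follows (the paper itself combines Hough lines with a fine-tuned MobileNetV2; the angle range and step below are arbitrary illustrative choices, and note that this baseline cannot distinguish upside-down pages).

```python
import numpy as np

def estimate_skew(binary, angles=np.arange(-5, 5.25, 0.25)):
    """Projection-profile skew estimation: for each candidate angle the
    ink pixels are rotated and accumulated into row bins; the angle
    whose row profile has maximal variance (sharpest text lines) is
    returned, in degrees."""
    ys, xs = np.nonzero(binary)
    best_angle, best_score = 0.0, -1.0
    for a in angles:
        t = np.deg2rad(a)
        rows = np.round(ys * np.cos(t) - xs * np.sin(t)).astype(int)
        rows -= rows.min()
        profile = np.bincount(rows)
        score = profile.var()
        if score > best_score:
            best_angle, best_score = float(a), score
    return best_angle
```

Such a profile search handles small skews well but is symmetric under 180-degree rotation, which is precisely the orientation ambiguity the paper's classifier is designed to resolve.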
ARTICLE | doi:10.20944/preprints202307.1395.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: Fractional order differential operator; Fractional order integral operator; Image enhancement; Image denoising
Online: 20 July 2023 (10:42:06 CEST)
The theory of fractional calculus extends the order of classical calculus from integer to non-integer. As a new engineering tool, it has led to important research achievements in many fields, including image processing. This paper mainly studies the application of fractional calculus theory to image enhancement and denoising, including the basic theory of fractional calculus and its amplitude-frequency characteristics, the application of the fractional differential operator to image enhancement, and the application of the fractional integral operator to image denoising. The experimental results show that fractional calculus theory offers particular advantages in image enhancement and denoising. Compared with existing integer-order image enhancement operators, the fractional differential operator can more effectively enhance the "weak edge" and "strong texture" details of an image. Compared to traditional denoising methods, the fractional-order integral denoising operator can not only improve the signal-to-noise ratio of the image, but also better preserve detailed information such as edges and textures.
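Fractional differential masks are commonly built from the Grunwald-Letnikov coefficients w_k = (-1)^k C(v, k), where v is the fractional order; whether the paper uses exactly this construction is not stated, so the sketch below is illustrative.

```python
def gl_coefficients(v, n):
    """First n Grunwald-Letnikov coefficients w_k = (-1)^k * C(v, k),
    computed with the recurrence w_k = w_{k-1} * (1 - (v + 1) / k).
    Convolving a signal with these weights approximates its v-th order
    derivative; for v = 1 this reproduces the classic first difference
    [1, -1, 0, ...]."""
    w = [1.0]
    for k in range(1, n):
        w.append(w[-1] * (1.0 - (v + 1.0) / k))
    return w
```

For non-integer v the coefficients decay slowly instead of truncating, which is why a fractional mask responds to both sharp edges and smoother texture variations, matching the "weak edge" and "strong texture" behaviour described above.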
ARTICLE | doi:10.20944/preprints202304.0723.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: no-reference image quality assessment; multitask learning; image restoration; multi-level features.
Online: 21 April 2023 (10:52:35 CEST)
When image quality is evaluated, the human visual system (HVS) infers the details in the image through its internal generative mechanism. In this process, the HVS integrates both local and global information of the image, utilizes contextual information to restore the original image information, and compares it with the distorted image information to evaluate image quality. Inspired by this mechanism, a no-reference image quality assessment method based on a multitask image restoration network is proposed. The multitask image restoration network generates a pseudo-reference image as the main task and produces a structural similarity index measure (SSIM) map as an auxiliary task. By mutually promoting the two tasks, a higher-quality pseudo-reference image is generated. In addition, when predicting the image quality score, both the quality restoration features and the difference features between the distorted and reference images are used, thereby fully utilizing the information from the pseudo-reference image. To enable the model to focus on both global and local features, a multi-scale feature fusion module is proposed. Experimental results demonstrate that the proposed method achieves excellent performance on both synthetically and authentically distorted databases.
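The auxiliary SSIM-map target can be sketched with a simple block-wise SSIM computation. This is a coarse approximation on non-overlapping blocks, not the paper's network-internal formulation; the constants follow the usual SSIM convention for images in [0, 1].

```python
import numpy as np

def ssim_map(x, y, block=8, c1=0.01 ** 2, c2=0.03 ** 2):
    """Block-wise SSIM map between two images x, y with values in [0, 1]."""
    h, w = x.shape
    out = np.zeros((h // block, w // block))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            a = x[i * block:(i + 1) * block, j * block:(j + 1) * block]
            b = y[i * block:(i + 1) * block, j * block:(j + 1) * block]
            mu_a, mu_b = a.mean(), b.mean()
            va, vb = a.var(), b.var()
            cov = ((a - mu_a) * (b - mu_b)).mean()
            # Luminance x contrast/structure terms, stabilized by c1, c2.
            out[i, j] = ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / \
                        ((mu_a ** 2 + mu_b ** 2 + c1) * (va + vb + c2))
    return out
```

Identical images score exactly 1 in every block, and any distortion lowers the map's mean, which is what makes it usable as a supervision signal.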
TECHNICAL NOTE | doi:10.20944/preprints202203.0095.v1
Subject: Engineering, Control And Systems Engineering Keywords: pre-processing; image transformation; image enhancement; geometric correction; radiometric correction; Satellite Imagery
Online: 7 March 2022 (09:43:08 CET)
Over the past few years, various algorithms have been developed to extract features from high-resolution satellite imagery, and several complex algorithms have been developed to classify these extracted features. However, these algorithms lack the critical refinement stages that process the data in the preliminary phase. Various satellite sensors have been launched, such as LISS-3, IKONOS, QuickBird and WorldView. Before classification and extraction of semantic data, high-resolution imagery must be refined, and the refinement process involves several steps of interaction with the data. These steps are the pre-processing algorithms presented in this paper. Pre-processing involves geometric correction, radiometric correction, noise removal, image enhancement and related operations, which increase the accuracy of the data. Applications of this pre-processing include meteorology, hydrology, soil science, forestry and physical planning. This paper also provides a brief description of the local maximum likelihood method, the fuzzy method, the stretch method and other pre-processing methods used before classifying and extracting features from the image.
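As a concrete example of the radiometric enhancement step discussed above, a percentile-based linear contrast stretch is a standard pre-processing operation for satellite bands. This is a generic sketch, not an algorithm taken from the note itself.

```python
import numpy as np

def percentile_stretch(band, lo=2, hi=98):
    """Linear contrast stretch: map the lo-th..hi-th percentile range of a
    band onto [0, 1], clipping outliers (clouds, shadows, sensor noise)."""
    a, b = np.percentile(band, [lo, hi])
    out = (band.astype(float) - a) / (b - a)
    return np.clip(out, 0.0, 1.0)
```

Clipping the extreme 2% at each end prevents a few saturated pixels from compressing the dynamic range of the whole scene.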
ARTICLE | doi:10.20944/preprints201705.0027.v2
Subject: Social Sciences, Geography, Planning And Development Keywords: remote sensing; image registration; multiple image features; different viewpoint; non-rigid distortion
Online: 13 June 2017 (09:52:10 CEST)
Remote sensing image registration plays an important role in military and civilian fields, such as natural disaster damage assessment, military damage assessment and ground target identification. However, due to ground relief variations and imaging viewpoint changes, non-rigid geometric distortion occurs between remote sensing images with different viewpoints, which further increases the difficulty of remote sensing image registration. To address this problem, we propose a multi-viewpoint remote sensing image registration method with the following contributions. (i) A finite mixture model based on multiple features is constructed to deal with different types of image features. (ii) Three features are combined and substituted into the mixture model to form a feature complementation: the Euclidean distance and shape context are used to measure the similarity of geometric structure, and the SIFT (scale-invariant feature transform) distance, which carries the intensity information, is used to measure the scale-space extrema. (iii) To prevent an ill-posed problem, a geometric constraint term is introduced into the L2E-based energy function so that the non-rigid transformation is better behaved. We evaluated the performance of the proposed method on three series of remote sensing images obtained from an unmanned aerial vehicle (UAV) and Google Earth, and compared it with five state-of-the-art methods; our method shows the best alignments in most cases.
ARTICLE | doi:10.20944/preprints202108.0392.v1
Subject: Engineering, Control And Systems Engineering Keywords: image quality assessment; real-time image processing; image functions adaptation; convolutional neural network; face alignment; deep neural network; random forest
Online: 18 August 2021 (17:06:02 CEST)
In recent years, data providers have been generating and streaming large numbers of images. In particular, processing images that contain faces has received great attention due to its numerous applications, such as entertainment and social media apps. The enormous number of images shared on these applications presents serious challenges and requires massive computing resources to ensure efficient data processing. However, in real application scenarios images are subject to a wide range of distortions during processing, transmission, sharing, or a combination of many factors. There is therefore a need to guarantee acceptable delivery quality, even though some distorted images do not have access to their original version. In this paper, we present a framework developed to estimate image quality while processing a large number of images in real time. Our quality evaluation is measured using an integration of a deep network with random forests. In addition, a face alignment metric is used to assess the facial features. Experiments were conducted on two artificially distorted benchmark datasets, LIVE and TID2013. We show that our proposed approach outperforms state-of-the-art methods, achieving a Pearson Correlation Coefficient (PCC) and a Spearman Rank-Order Correlation Coefficient (SROCC) with subjective human scores of almost 0.942 and 0.931 while reducing the processing time from 4.8 ms to 1.8 ms.
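The two agreement measures reported, PCC and SROCC, have direct definitions: SROCC is simply PCC applied to ranks. A minimal sketch (without tie handling in the rank step):

```python
import numpy as np

def pcc(a, b):
    """Pearson correlation coefficient between two score vectors."""
    a = np.asarray(a, float); b = np.asarray(b, float)
    a = a - a.mean(); b = b - b.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

def srocc(a, b):
    """Spearman rank-order correlation: PCC of the ranks (ties not averaged)."""
    rank = lambda v: np.argsort(np.argsort(v)).astype(float)
    return pcc(rank(a), rank(b))
```

SROCC rewards any monotonic agreement with human scores, while PCC additionally requires linearity, which is why both are usually reported together.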
ARTICLE | doi:10.20944/preprints202311.0161.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: deep learning; skin cancer; image augmentation; GAN; geometric augmentation; image classification; interpretable technique
Online: 2 November 2023 (10:52:57 CET)
This research paper presents a deep learning approach to early detection of skin cancer using image augmentation techniques. The authors propose a two-stage image augmentation technique that involves the use of geometric augmentation and generative adversarial network (GAN) to classify skin lesions as either benign or malignant. This research utilized the public HAM10000 dataset to test the proposed model. Several pre-trained models of CNN were employed, namely Xception, Inceptionv3, Resnet152v2, EfficientnetB7, InceptionresnetV2, and VGG19. Our approach achieved accuracy, precision, recall, and F1-score of 96.90%, 97.07%, 96.87%, 96.97%, respectively, which is higher than the performance achieved by other state-of-the-art methods. The paper also discusses the use of SHapley Additive exPlanations (SHAP), an interpretable technique for skin cancer diagnosis, which can help clinicians understand the reasoning behind the diagnosis and improve trust in the system. Overall, the proposed method presents a promising approach to automated skin cancer detection that could improve patient outcomes and reduce healthcare costs.
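The first, geometric stage of the two-stage augmentation can be sketched with simple flips and 90° rotations; the GAN stage is not reproduced here, and `geometric_augment` is an illustrative name rather than the authors' API.

```python
import numpy as np

def geometric_augment(img):
    """Yield simple label-preserving geometric variants of a lesion image:
    the original, horizontal/vertical flips, and 90/180/270-degree rotations."""
    yield img
    yield np.fliplr(img)   # horizontal flip
    yield np.flipud(img)   # vertical flip
    for k in (1, 2, 3):
        yield np.rot90(img, k)
```

Because skin lesions have no canonical orientation, these transforms multiply the effective training set without changing the benign/malignant label.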
ARTICLE | doi:10.20944/preprints202310.0524.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Deep learning; image representation learning; self-supervised learning; masked image modeling; contrastive learning
Online: 9 October 2023 (12:52:30 CEST)
Self-supervised learning is a method that learns general representations from unlabeled data. Masked image modeling (MIM), one of the generative self-supervised learning methods, has drawn attention by showing state-of-the-art performance on various downstream tasks, though it exhibits poor linear separability resulting from its token-level approach. In this paper, we propose a contrastive learning-based multi-view masked autoencoder for MIM, exploiting an image-level approach by learning common features from two different augmented views. We strengthen MIM by learning long-range global patterns through the contrastive loss. Our framework adopts a simple encoder-decoder architecture, learning rich and general representations through a simple process: 1) two different views are generated from an input image with random masking, and through the contrastive loss we learn the semantic distance between the representations produced by the encoder. Applying a high mask ratio, 80%, acts as strong augmentation and alleviates the representation-collapse problem. 2) With the reconstruction loss, the decoder learns to reconstruct the original image from the masked image. We assess our framework through several experiments on benchmark datasets for image classification, object detection, and semantic segmentation. We achieve 84.3% fine-tuning accuracy on ImageNet-1K classification and 76.7% in linear probing, exceeding previous studies, and show promising results on other downstream tasks. Experimental results demonstrate that our work can learn rich and general image representations by applying contrastive loss to masked image modeling.
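The two central ingredients above, 80% random patch masking and an image-level contrastive loss, can be sketched on toy embeddings. This is an InfoNCE-style loss on already-computed view embeddings; the encoder-decoder itself is not reproduced, and the function names are illustrative.

```python
import numpy as np

def random_mask(n_patches, ratio=0.8, rng=None):
    """Boolean mask over patch tokens; True = masked (hidden from the encoder)."""
    if rng is None:
        rng = np.random.default_rng(0)
    n_mask = int(n_patches * ratio)
    mask = np.zeros(n_patches, dtype=bool)
    mask[rng.permutation(n_patches)[:n_mask]] = True
    return mask

def info_nce(z1, z2, tau=0.1):
    """Contrastive (InfoNCE) loss between two batches of view embeddings;
    row i of z1 and row i of z2 are the positive pair."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / tau                       # pairwise cosine similarities
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    p = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    return float(-np.log(np.diag(p)).mean())       # diagonal = positive pairs
```

Matched views yield a near-zero loss, while mismatched pairings are heavily penalized, which is the pressure that separates image-level representations.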
REVIEW | doi:10.20944/preprints202309.2137.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Medical image analysis, Medical image data, Deep learning, Computer vision techniques, Optimisation methods
Online: 30 September 2023 (17:58:32 CEST)
Medical image analysis is an important branch in the field of medicine, which mainly uses image processing and analysis techniques to interpret and diagnose medical image data. Medical image data helps doctors to effectively observe and diagnose patients' body structures, tissues and lesions. Medical image analysis has been an important research area in the medical field, and it is important for disease diagnosis, treatment planning, and condition monitoring. In recent years, the rapid development of deep learning and computer vision technologies has contributed greatly to the automation, multimodal data fusion, real-time application, and accuracy improvement of medical image analysis. In addition, the development of deep learning has given rise to some new research areas in medical image analysis, such as Generative Adversarial Networks (GANs) for synthetic medical images, self-supervised learning for unsupervised feature learning, and neural network interpretability. In this paper, we will introduce some optimisation methods for medical images which are effective in improving the accuracy, efficiency and reliability of medical image analysis.
ARTICLE | doi:10.20944/preprints202306.0922.v1
Subject: Computer Science And Mathematics, Signal Processing Keywords: Multimodality medical image; Image fusion; Sparse representation (SR); Kronecker criterion; Activity level measure
Online: 13 June 2023 (10:09:15 CEST)
Multimodal medical image fusion is a fundamental but challenging problem in the fields of brain science research and brain disease diagnosis, and it is difficult for sparse representation (SR)-based fusion to characterize the activity level with a single measurement and no loss of effective information. In this paper, the Kronecker-criterion-based SR framework is applied to medical image fusion with a patch-based activity level integrating salient features from multiple domains. Inspired by the formation process of the vision system, the spatial saliency is characterized by textural contrast (TC), composed of luminance and orientation contrasts, to promote more highlighted texture information in the fusion process. As a substitute for the conventional l1-norm-based sparse saliency, a metric of the sum of sparse salient features (SSSF) is used to promote more significant coefficients in the composition of the activity level measure. The designed activity level measure is verified to be more conducive to maintaining the integrity and sharpness of detailed information. Various experiments on multiple groups of clinical medical images verify the effectiveness of the proposed fusion method in both visual quality and objective assessment. Furthermore, the research work of this paper is helpful for further detection and segmentation of medical images.
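For orientation, the conventional l1-norm activity level that the proposed SSSF metric replaces can be sketched with the standard choose-max fusion rule over sparse coefficient patches. This is the baseline being improved upon, not the paper's SSSF measure.

```python
import numpy as np

def l1_activity(coeffs):
    """Conventional l1-norm activity level of sparse coefficient vectors
    (last axis = sparse coefficients of one patch)."""
    return np.abs(coeffs).sum(axis=-1)

def fuse_max_rule(c1, c2):
    """Per patch, keep the coefficient vector with the higher activity level."""
    pick = l1_activity(c1) >= l1_activity(c2)
    return np.where(pick[..., None], c1, c2)
```

A richer activity measure (such as SSSF) plugs into the same choose-max scheme by replacing `l1_activity`.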
ARTICLE | doi:10.20944/preprints202303.0319.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: industrial image processing; feature amplification; image transformation strategy; text detection; Probabilistic Hough Transform
Online: 17 March 2023 (09:05:54 CET)
Industrial nameplates serve as a means of conveying critical information and parameters. In this work, we propose a novel approach for rectifying industrial nameplate pictures utilizing a probabilistic Hough transform. Our method effectively corrects for distortion and clipping, and features a collection of challenging nameplate pictures for analysis. To determine the corners of the nameplate, we employ a progressive probabilistic Hough transform, which not only enhances detection accuracy but also handles complex industrial scenarios. Our approach yields clear, readable nameplate text, as demonstrated through experiments showing improved accuracy in model identification compared to other methods.
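Once the four nameplate corners have been located by the Hough stage, rectification reduces to estimating the homography that maps them to an upright rectangle. Below is a standard four-point direct linear transform (DLT) sketch of that step, not the paper's exact pipeline.

```python
import numpy as np

def homography_from_points(src, dst):
    """Solve the 3x3 homography mapping 4 src corners to 4 dst corners (DLT)."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear constraints on H.
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.asarray(A, float))
    H = vt[-1].reshape(3, 3)   # null vector of A, up to scale
    return H / H[2, 2]
```

Warping every pixel through the inverse of H then produces the fronto-parallel nameplate image handed to text recognition.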
CONCEPT PAPER | doi:10.20944/preprints202204.0129.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Digital Design; Digital Architecture; Image Processing; Machine learning; FPGA; Dedicated Design; Image Processor
Online: 14 April 2022 (05:09:47 CEST)
Many dedicated designs for real-time operation provide functionality for fixed-size operators, but where speed, scalability, and flexibility are required, extensive research is needed. Dedicated designs can provide real-time processing for many applications. This paper presents an FPGA-based design of a general image processor. The proposed design is based on a fixed-point representation of binary numbers and provides a mechanism to manage matrices on-chip along with matrix arithmetic. The matrices are represented with simple identifiers and microinstructions that assist in the computation of many operations useful for solving complex problems. The design was successfully implemented and tested using the VHDL language. The proposed design is an efficient architecture for a standalone processor, embedding all the computational resources necessary for an embedded image processing application.
ARTICLE | doi:10.20944/preprints202001.0205.v1
Subject: Social Sciences, Behavior Sciences Keywords: itch; scratch; automated real-time detection; machine-learning based image classifier; image sharpness
Online: 19 January 2020 (03:13:48 CET)
A 'little brother' of pain, itch is an unpleasant sensation that creates a specific urge to scratch. To date, various machine-learning based image classifiers (MBICs) have been proposed for quantitative analysis of itch-induced scratch behaviour of laboratory animals in an automated, non-invasive, inexpensive and real-time manner. In spite of MBICs' advantages, the overall performances (accuracy, sensitivity and specificity) of current MBIC approaches remain inconsistent, with values varying from ~50% to ~99%; the underlying reasons have yet to be investigated further, both computationally and experimentally. To examine this variation in the performance of MBICs in automated detection of itch-induced scratch, this article focuses on the experimental data recording step and reports for the first time that MBICs' overall performance is inextricably linked to the sharpness of experimentally recorded video of laboratory animal scratch behaviour. This article furthermore demonstrates for the first time that a linearly correlated relationship exists between video sharpness and the overall performance (accuracy and specificity, but not sensitivity) of MBICs, and highlights the primary role of experimental data recording in rapid, accurate and consistent quantitative assessment of laboratory animal itch.
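Frame sharpness of the kind correlated with classifier performance here is commonly scored with the variance of a discrete Laplacian. This is a generic sketch; the article's exact sharpness metric may differ.

```python
import numpy as np

def sharpness(frame):
    """Variance of the 4-neighbour discrete Laplacian: higher = sharper.
    Blurred frames have weak second derivatives, so the variance collapses."""
    lap = (-4 * frame[1:-1, 1:-1]
           + frame[:-2, 1:-1] + frame[2:, 1:-1]
           + frame[1:-1, :-2] + frame[1:-1, 2:])
    return float(lap.var())
```

Screening recorded video with such a score before training is one cheap way to control the data-quality factor the article identifies.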
REVIEW | doi:10.20944/preprints202308.0657.v2
Subject: Physical Sciences, Radiation And Radiography Keywords: image quality; interventional radiology; pediatrics
Online: 30 August 2023 (04:05:02 CEST)
Pediatric interventional cardiology procedures are essential in diagnosing and treating congenital heart disease in children; however, they raise concerns about potential radiation exposure. Managing radiation doses and assessing image quality in angiographs becomes imperative for safe and effective interventions. This systematic review aims to comprehensively analyze the current understanding of physical image quality metrics relevant for characterizing X-ray systems used in fluoroscopy-guided pediatric cardiac interventional procedures, considering the main factors reported in the literature that influence this outcome. A search in Scopus and Web of Science, using relevant keywords and inclusion/exclusion criteria, yielded fourteen relevant articles published between 2000 and 2022. The physical image quality metrics reported were noise, signal-to-noise ratio, contrast, contrast-to-noise ratio, and high-contrast spatial resolution. Various factors influencing image quality were investigated, such as polymethyl methacrylate thickness (often used to simulate water-equivalent tissue thickness), operation mode, anti-scatter grid presence, and tube voltage. Objective evaluations using these metrics ensure impartial assessments of the main factors affecting image quality, improving the characterization of fluoroscopic X-ray systems and aiding informed decisions to safeguard pediatric patients during procedures.
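Two of the metrics reviewed, SNR and CNR, have simple region-of-interest definitions. The sketch below uses one common convention (pooled noise in the CNR denominator); definitions vary somewhat across the surveyed papers.

```python
import numpy as np

def snr(roi):
    """Signal-to-noise ratio of a nominally uniform region of interest."""
    return float(roi.mean() / roi.std())

def cnr(roi_a, roi_b):
    """Contrast-to-noise ratio between two regions, with pooled noise."""
    noise = np.sqrt((roi_a.var() + roi_b.var()) / 2)
    return float(abs(roi_a.mean() - roi_b.mean()) / noise)
```

Computed on phantom images at different dose settings, these scores let the trade-off between image quality and pediatric dose be quantified objectively.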
COMMUNICATION | doi:10.20944/preprints202306.0492.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: strongest activations; image complexity; convolution
Online: 7 June 2023 (05:42:05 CEST)
Neural networks were treated as black boxes for a long time. Previous works have unearthed what aspects of an image were important for convolutional layers at different positions in the network. This was done using deconvolutional networks. In this paper, we examine how well a convolutional neural network performs when those convolutional layers which are relatively unimportant for a particular image (i.e., the image does not produce one of the strongest activations) are skipped in the training, validating, and testing process.
ARTICLE | doi:10.20944/preprints201906.0166.v1
Subject: Computer Science And Mathematics, Information Systems Keywords: MRI image; Texture Features; GLCM
Online: 18 June 2019 (05:36:29 CEST)
This paper presents a feature vector based on different statistical texture analyses of brain tumors in MRI images. The statistical texture features are computed using GLCM (Gray Level Co-occurrence Matrices) of the brain nodule structure. The brain nodule is segmented using a strips method that implements marker-based watershed image segmentation combined with PSO (Particle Swarm Optimization) and Fuzzy C-Means clustering (FCM). The GLCM of the segmented brain image is then computed at the four angles 0°, 45°, 90° and 135°. For each of the four angular directions, the texture features correlation, energy, contrast and homogeneity are calculated. Texture analysis has been applied to many types of images in past years; here, the proposed algorithm computes statistical texture features for iterative image segmentation. These results show that MRI images can be used in a brain cancer detection system.
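The GLCM features described above can be sketched directly in NumPy. Pixel offsets such as (1, 0), (1, -1), (0, 1) and (1, 1) correspond to the four angular directions under one common convention; the correlation feature is omitted for brevity, but follows the same pattern over the index grids.

```python
import numpy as np

def glcm(img, dx, dy, levels):
    """Normalized gray-level co-occurrence matrix for pixel offset (dx, dy).
    img must hold integer gray levels in [0, levels)."""
    h, w = img.shape
    m = np.zeros((levels, levels))
    for y in range(max(0, -dy), min(h, h - dy)):
        for x in range(max(0, -dx), min(w, w - dx)):
            m[img[y, x], img[y + dy, x + dx]] += 1
    return m / m.sum()

def glcm_features(p):
    """Contrast, energy and homogeneity of a normalized GLCM p."""
    i, j = np.indices(p.shape)
    contrast = ((i - j) ** 2 * p).sum()
    energy = (p ** 2).sum()
    homogeneity = (p / (1.0 + np.abs(i - j))).sum()
    return contrast, energy, homogeneity
```

A uniform region yields zero contrast and maximal energy, while a strong gradient concentrates mass one step off the GLCM diagonal.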
ARTICLE | doi:10.20944/preprints202309.0762.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: image processing; image analysis; deep learning; roof structure extraction; roof vectorization; frame field learning
Online: 12 September 2023 (08:36:45 CEST)
A topic of growing interest in urban remote sensing is the automated extraction of geometrical building information for 3D city modeling. Roof geometry information is useful for applications such as urban planning, solar potential estimation, telecommunication installation planning, and wind flow simulations for pollutant diffusion analysis. Recent research has shown that advances in remote sensing technologies and deep learning methods offer the prospect of deriving roof structure information accurately and efficiently. In this study, we propose a Vectorized Roof Extractor, a method based on Fully Convolutional Networks (FCNs) and an advanced polygonization method, to extract roof structure from aerial imagery and a normalized Digital Surface Model (nDSM) in a regularized vector format. The roof structure consists of building outlines (the external edges of the building roof) and inner rooflines (the internal intersections of the main roof planes). The methodology comprises segmentation, vectorization and post-processing for both outlines and inner rooflines. For comparison, we adapt the Frame Field Learning (FFL) method, originally designed to extract building polygons. Our experiments are conducted on a custom dataset derived for the city of Enschede, The Netherlands, using aerial imagery, an nDSM, and manually digitized training polygons. The results show that the proposed Vectorized Roof Extractor outperformed the adapted FFL on PoLiS distance, with values of 3.5 m and 1.2 m for outlines and inner rooflines, respectively. Furthermore, the model surpassed the adapted FFL on PoLiS-thresholded F-score for outlines and inner rooflines, with 0.31 and 0.57, respectively. The Vectorized Roof Extractor produced adequate visual results, with straighter walls and fewer missed inner roofline detections, and it can predict buildings with common walls thanks to skeleton graph computation.
To summarize, the proposed method is suitable for urban applications and has the potential to be improved further.
ARTICLE | doi:10.20944/preprints201810.0534.v1
Subject: Engineering, Industrial And Manufacturing Engineering Keywords: non-destructive testing; process optimization; porosity; pore hotspots; image-based simulations; 3D image analysis
Online: 23 October 2018 (09:58:18 CEST)
This paper presents the latest developments in microCT, both globally and locally, for supporting the additive manufacturing industry. There are a number of recently developed capabilities which are especially relevant to the non-destructive quality inspection of additive manufactured parts; and also for advanced process optimization. These new capabilities are all locally available but not yet utilized to their full potential, most likely due to a lack of knowledge of these capabilities. The aim of this paper is therefore to fill this gap and provide an overview of these latest capabilities, showcasing numerous local examples.
ARTICLE | doi:10.20944/preprints201805.0240.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: background reconstruction; image quality assessment; image dataset; subjective evaluation; perceptual quality; objective quality metric
Online: 17 May 2018 (09:36:33 CEST)
With an increased interest in applications that require a clean background image, such as video surveillance, object tracking, street view imaging and location-based services on web-based maps, multiple algorithms have been developed to reconstruct a background image from cluttered scenes. Traditionally, statistical measures and existing image quality techniques have been applied for evaluating the quality of the reconstructed background images. Though these quality assessment methods have been widely used in the past, their performance in evaluating the perceived quality of the reconstructed background image has not been verified. In this work, we discuss the shortcomings in existing metrics and propose a full reference Reconstructed Background image Quality Index (RBQI) that combines color and structural information at multiple scales using a probability summation model to predict the perceived quality in the reconstructed background image given a reference image. To compare the performance of the proposed quality index with existing image quality assessment measures, we construct two different datasets consisting of reconstructed background images and corresponding subjective scores. The quality assessment measures are evaluated by correlating their objective scores with human subjective ratings. The correlation results show that the proposed RBQI outperforms all the existing approaches. Additionally, the constructed datasets and the corresponding subjective scores provide a benchmark to evaluate the performance of future metrics that are developed to evaluate the perceived quality of reconstructed background images.
ARTICLE | doi:10.20944/preprints201612.0075.v1
Subject: Environmental And Earth Sciences, Remote Sensing Keywords: image recognition bases location; indoor positioning; RGB-D images; LiDAR; DataBase; mobile computing; image retrieval
Online: 15 December 2016 (07:17:35 CET)
This paper describes the first results of an Image Recognition Based Location (IRBL) system for mobile applications, focusing on the procedure to generate a database of range images (RGB-D). In an indoor environment, prior spatial knowledge of the surroundings is needed to estimate the camera position and orientation. To achieve this objective, a complete 3D survey of two different environments (the Bangbae metro station of Seoul and the E.T.R.I. building in Daejeon, Republic of Korea) was performed using a LiDAR (Light Detection And Ranging) instrument, and the obtained scans were processed to obtain a spatial model of the environments. From this, two databases of reference images were generated using specific software developed by the Geomatics group of Politecnico di Torino (ScanToRGBDImage). This tool synthetically generates different RGB-D images centered at each scan position in the environment. Later, the external parameters (X, Y, Z, ω, φ, κ) and the range information extracted from the retrieved database images are used as reference information for pose estimation of a set of mobile pictures acquired in the IRBL procedure. In this paper, the survey operations, the approach for generating the RGB-D images, and the IRBL strategy are reported. Finally, the analysis of the results and the validation test are described.
ARTICLE | doi:10.20944/preprints202312.0005.v1
Subject: Engineering, Mechanical Engineering Keywords: Microfracture; image processing; network; simulation analyzes
Online: 1 December 2023 (05:14:14 CET)
Fatigue fractures in materials are the main cause of approximately 80% of all material failures, and it is believed that such failures can be predicted and mathematically calculated in a reliable manner. Prediction modalities for fatigue fracture can be established from three fundamental variables, volume, number of fracture cycles, and applied stress, with the integration of Weibull constants (characteristic length). In this investigation, mechanical fatigue tests were carried out on specimens of different industrial materials smaller than 4 mm² in cross-section, followed by analysis through precision computed tomography in search of microfractures. The measurement of these microfractures, along with their metrics and classifications, was recorded. A convolutional neural network trained with deep learning was used to detect microfractures through image processing. The primary objective of this network is the detection of microfractures in images of 480x854 or 960x960 pixels, and its accuracy is above 95%. The network classifies images into those that contain microfractures and those that do not. Subsequently, the microfracture is isolated by means of image processing. Finally, the images that do contain this feature are processed to obtain area, perimeter, characteristic length, circularity, orientation, and microfracture-type metrics. All values are obtained in pixels and converted to metric units (μm) through a conversion factor based on the image resolution.
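The circularity metric and the pixel-to-micrometre conversion mentioned above have simple closed forms; the helper names below are illustrative, and the conversion factor depends on the scan resolution of the tomograph.

```python
import numpy as np

def circularity(area, perimeter):
    """4*pi*A / P^2: equals 1.0 for a perfect circle and decreases toward 0
    for elongated crack-like shapes."""
    return 4.0 * np.pi * area / perimeter ** 2

def px_to_um(value_px, um_per_px, power=1):
    """Convert a pixel measurement to micrometres.
    Use power=1 for lengths/perimeters and power=2 for areas."""
    return value_px * um_per_px ** power
```

Squaring the scale factor for areas is the detail most easily missed when reporting microfracture metrics in μm.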