ARTICLE | doi:10.20944/preprints202008.0336.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: image processing; image classification; computer vision; expert systems; amber gemstones
Online: 15 August 2020 (04:39:11 CEST)
The article describes a classification solution for amber stones. The problem of classifying amber has long been known among jewelers and amber artisans. Existing solutions can classify amber pieces by color, but the need to classify by shape and texture has not yet been met. The proposed solution is capable of classifying the gemstones according to shape. Amber can be considered a special kind of object because its form is difficult to define unambiguously. Data for the amber experiments was gathered from amber art craftsmen. In the proposed solution, amber form can be classified into 10 different classes (7 classes chosen during the experiment).
REVIEW | doi:10.20944/preprints202105.0127.v1
Subject: Keywords: Image Acquisition, Image preprocessing, Image enhancement, beatboxing, segmentation
Online: 7 May 2021 (09:09:14 CEST)
Human beatboxing is a vocal art that uses the speech organs to produce vocal drum sounds and imitate musical instruments. Beatbox sound classification is a current challenge, with applications in automatic database annotation and music-information retrieval. In this study, a large-vocabulary human-beatbox sound recognition system was developed by adapting the Kaldi toolbox, a widely used tool for automatic speech recognition. The corpus consisted of eighty boxemes, which were recorded repeatedly by two beatboxers. The sounds were annotated and transcribed to the system by means of a beatbox-specific morphographic writing system (Vocal Grammatics). Image processing techniques play a vital role in image acquisition, image pre-processing, clustering, segmentation and classification for different kinds of images, such as fruit, medical, vehicle and digital text images. In this study, unwanted noise is removed from the various images and enhancement techniques are applied, such as contrast limited adaptive histogram equalization, Laplacian and Haar filtering, unsharp masking, sharpening, high-boost filtering and color models. Clustering algorithms are then used to organize the data logically and to support pattern analysis, grouping, decision-making, and machine-learning techniques, and the regions are segmented using binary, K-means and Otsu segmentation algorithms. The images are classified with SVM and K-Nearest Neighbour (KNN) classifiers to produce good results for those images.
ARTICLE | doi:10.20944/preprints202112.0140.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: Image Recognition; Preference Net
Online: 8 December 2021 (14:43:39 CET)
Accuracy and computational cost are the main challenges of deep neural networks in image recognition. This paper proposes an efficient approach that reduces ranking to binary classification, using a new feed-forward network and feature selection based on ranking the image pixels. Preference net (PN) is a novel deep ranking learning approach based on the Preference Neural Network (PNN), which uses a new ranking objective function and a positive smooth staircase (PSS) activation function to accelerate the ranking of image pixels. PN has a new type of weighted kernel based on Spearman rank correlation instead of convolution to build the feature matrix. PN employs multiple kernels of different sizes to partially rank the image pixels in order to find the best feature sequence. PN consists of multiple PNNs that share an output layer; each ranker kernel has a separate PNN. The output results are converted to classification accuracy using the score function. Using a weighted average ensemble of the PN models for each kernel, PN shows promising results compared with the latest deep learning (DL) networks on the CIFAR-10 and Fashion-MNIST datasets in terms of accuracy and computational cost.
ARTICLE | doi:10.20944/preprints202109.0285.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: remote sensing; deep learning; image classification
Online: 16 September 2021 (13:38:55 CEST)
Autonomous image recognition has numerous potential applications in the field of planetary science and geology. For instance, the ability to classify images of rocks would give geologists immediate feedback without having to bring samples back to the laboratory. Planetary rovers could also classify rocks in remote places, and even on other planets, without human intervention. Shu et al. classified 9 different types of rock images using a Support Vector Machine (SVM) with the image features extracted autonomously. Through this method, the authors achieved a test accuracy of 96.71%. In this research, Convolutional Neural Networks (CNNs) have been used to classify the same set of rock images. Results show that a 3-layer network obtains an average accuracy of 99.60% across 10 trials on the test set. A version of Self-taught Learning was also implemented to demonstrate the generalizability of the features extracted by the CNN. Finally, one model has been chosen to be deployed on a mobile device to demonstrate practicality and portability. The deployed model achieves perfect classification accuracy on the test set while taking only 0.068 seconds to make a prediction, equivalent to about 14 frames per second.
ARTICLE | doi:10.20944/preprints202007.0591.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: Bacterial -Viral Pneumonia; COVID-19; X-ray Image; Deep Learning; Convolution Neural Network
Online: 24 July 2020 (14:02:07 CEST)
The paper presents an analysis of Corona Virus Disease based on a CNN probabilistic model. It involves a technique for classification and prediction that recognizes typical and diagnostically most important CT image features relating to Corona Virus. The main contributions of the research include predicting the probability of recurrence in no-recurrence (first-time detection) cases by applying our proposed convolutional neural network structure. The study is validated on 2002 chest X-ray images, comprising 60 confirmed positive COVID-19 cases and 650 bacterial, 412 viral and 880 normal X-ray images. The proposed CNN is compared with traditional classifiers using the proposed CHFS feature extraction model. The experimental study, carried out with real data, demonstrates the feasibility and potential of the proposed approach. The proposed CNN structure achieves 98.20% accuracy on potential COVID-19 cases, compared with traditional classifiers.
ARTICLE | doi:10.20944/preprints202103.0408.v1
Subject: Engineering, Automotive Engineering Keywords: Auto encoder; IoT; Image encryption; Artificial Neural Network; Machine Learning
Online: 16 March 2021 (09:32:11 CET)
Machine learning has transformed the health care system, which transmits medical data through IoT sensors, so it is very important to encrypt this data to protect patients. Encrypting medical images is time-consuming, hence the use of an autoencoder is essential. An autoencoder is used in this work to compress the image into a vector prior to the encryption process. In the proposed work, the digital image passes through a description function and a decoder to recover the image; various experiments are carried out on the hyperparameters to achieve the best classification outcome. The findings demonstrate that the combination of mean squared logarithmic error as the loss function, Adagrad as the optimizer, two layers for the encoder and a mirrored two for the decoder, and ReLU as the activation function generates the best autoencoder results. The combination of mean squared error (loss function), RMSprop (optimizer), three layers for the encoder and a mirrored three for the decoder, and ReLU (activation function) gives the best classification result. All the experiments with different hyperparameters ran very close to each other, even when the number of layers was changed. The running time is between 9 and 16 seconds per epoch.
ARTICLE | doi:10.20944/preprints201911.0218.v1
Subject: Earth Sciences, Environmental Sciences Keywords: Landsat; Google Earth; water index; unsupervised image classification; supervised image classification; Kappa coefficient
Online: 19 November 2019 (03:10:17 CET)
To address three important issues related to the extraction of water features from Landsat imagery, namely the selection of water indexes, the selection of classification algorithms for image classification, and the collection of ground truth data for accuracy assessment, this study applied four sets (ultra-blue, blue, green, and red light based) of water indexes (NDWI, MNDWI, MNDWI2, AWEIns, and AWEIs) combined with three types of image classification methods (zero-water index threshold, Otsu, and kNN) to 24 selected lakes across the globe to extract water features from Landsat-8 OLI imagery. The 1440 (4x5x3x24) image classification results were compared with the water features extracted from high resolution Google Earth images with the same (or ±1 day) acquisition dates by computing the Kappa coefficients. Results show the kNN method is better than the Otsu method, and the Otsu method is better than the zero-water index threshold method. If the computational cost is not an issue, the kNN method combined with the ultra-blue light based AWEIns is the best method for extracting water features from Landsat imagery because it produced the highest Kappa coefficients. If the computational cost is taken into account, the Otsu method is a good choice. AWEIns and AWEIs are better than NDWI, MNDWI and MNDWI2. AWEIns works better than AWEIs under the Otsu method, and the average rank of the image classification accuracy from high to low is the ultra-blue, blue, green, and red light-based AWEIns.
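As an illustration of the Otsu step mentioned in the abstract above, the following is a minimal sketch rather than the study's implementation. It assumes the standard NDWI definition, (green - NIR) / (green + NIR), and a histogram-based Otsu threshold; the function names, the bin count and the sample values are hypothetical.

```python
def ndwi(green, nir):
    """Normalized difference water index for one pixel (standard definition)."""
    denom = green + nir
    return 0.0 if denom == 0 else (green - nir) / denom


def otsu_threshold(values, bins=256):
    """Return the threshold that maximizes the between-class variance.

    `values` is a flat list of index values (e.g. NDWI per pixel).
    The bin count of 256 is an assumption, not taken from the study.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1

    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_var, best_bin = -1.0, 0
    w0, sum0 = 0, 0.0
    for i in range(bins):
        w0 += hist[i]              # pixels at or below bin i
        sum0 += i * hist[i]
        w1 = total - w0            # pixels above bin i
        if w0 == 0 or w1 == 0:
            continue
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_bin = var_between, i
    # Upper edge of the best bin: values >= threshold are treated as water.
    return lo + (best_bin + 1) * width
```

With a clearly bimodal index image, the returned threshold falls between the land and water modes; pixels at or above it would be labeled water. A zero-threshold classifier, by contrast, simply uses 0 in place of the computed value.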
COMMUNICATION | doi:10.20944/preprints202207.0450.v1
Subject: Earth Sciences, Oceanography Keywords: SAR image; ship wake; deep learning; synthetic dataset
Online: 29 July 2022 (05:51:03 CEST)
The classification of vessel types in SAR imagery is of crucial importance for maritime applications. However, the ability to use real SAR imagery for deep learning classification is limited, due to the general lack of such data and/or the labor-intensive nature of labeling them. Simulating SAR images can overcome these limitations, allowing the generation of an infinite number of datasets. In this contribution, we present a synthetic SAR imagery dataset with ship wakes, which comprises 46080 images for ten different real vessel models. The variety of simulation parameters includes 16 ship heading directions, 6 ship velocities, 8 wind directions, 2 wind velocities, and 3 incidence angles. In addition, we extensively investigate classification performance for noise-free, noisy, and denoised ship wake scenes. We utilize the standard AlexNet architecture and employ training from scratch. To achieve the best classification performance, we conduct Bayesian optimization to determine hyperparameters. Results demonstrate that the classification of vessel types based on their SAR signatures is highly efficient, with maximum accuracies of 96.16%, 92.7%, and 93.59%, when training using noise-free, noisy, and denoised datasets respectively. Thus, we conclude that the best strategy in practical applications should be to train convolutional neural networks on denoised SAR datasets. The results show that the versatility of the SAR simulator can open up new horizons in the application of machine learning to a variety of SAR platforms.
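The dataset size quoted in the abstract above is consistent with a full factorial grid over the listed simulation parameters, with one image per combination and vessel model. That reading is an assumption, and the short sketch below (with hypothetical names) only checks the arithmetic.

```python
import math

# Parameter counts as listed in the abstract; the dictionary keys are illustrative.
SIMULATION_GRID = {
    "ship_headings": 16,
    "ship_velocities": 6,
    "wind_directions": 8,
    "wind_velocities": 2,
    "incidence_angles": 3,
}
N_VESSEL_MODELS = 10


def scenes_per_vessel(grid):
    """Number of parameter combinations, assuming a full factorial design."""
    return math.prod(grid.values())


def total_images(grid, n_models):
    """Total image count, assuming one image per combination and vessel model."""
    return scenes_per_vessel(grid) * n_models


print(total_images(SIMULATION_GRID, N_VESSEL_MODELS))  # prints 46080
```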
ARTICLE | doi:10.20944/preprints202210.0092.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: complex network; neural network architecture; isotropic architecture; image classification
Online: 8 October 2022 (04:04:47 CEST)
Although neural network architectures are critical for their performance, how the structural characteristics of a neural network affect its performance has still not been fully explored. Here we map neural network architectures to directed acyclic graphs and find that incoherence, a structural characteristic that measures the order of directed acyclic graphs, is a good indicator of the performance of the corresponding neural networks. We therefore propose a deep isotropic neural network architecture built by folding a chain of identical blocks and then connecting the blocks with skip connections at different distances. Our models, named FoldNet, have two distinguishing features compared with traditional residual neural networks. First, the distances between block pairs connected by skip connections increase from always equal to one to specially selected different values, which leads to more incoherent graphs, lets the neural network explore larger receptive fields, and thus enhances its multi-scale representation ability. Second, the number of direct paths increases from one to multiple, which leads to a larger proportion of shorter paths and thus improves the direct propagation of information throughout the entire network. Image classification results on the CIFAR-10 and Tiny ImageNet benchmarks suggest that our new network architecture performs better than traditional residual neural networks.
ARTICLE | doi:10.20944/preprints202102.0083.v1
Subject: Mathematics & Computer Science, Algebra & Number Theory Keywords: SAR image classification; Spiking Neural Network(SNN); unsupervised learning
Online: 2 February 2021 (10:35:38 CET)
Recent neuroscience research shows that neural information in the brain is not encoded by spatial information alone. Spiking neural networks based on pulse frequency coding play a very important role in dealing with brain signals, especially complicated space-time information. In this paper, an unsupervised learning algorithm for bilayer feedforward spiking neural networks based on spike-timing dependent plasticity (STDP) competitiveness is proposed and applied to SAR image classification on MSTAR for the first time. The SNN learns autonomously from the input values without any labeled signal, and the overall classification accuracy of SAR targets reached 80.8%. The experimental results show that the algorithm adopts synaptic neurons and a network structure with stronger biological rationality, and has the ability to classify targets in SAR images. Meanwhile, the feature map extraction ability of the neurons is visualized through the generative property of the SNN, which is a beneficial attempt to apply brain-like neural networks to SAR image interpretation.
Subject: Earth Sciences, Oceanography Keywords: breaking waves; optical flow; convolutional neural networks; image classification
Online: 11 October 2021 (15:49:36 CEST)
The use of convolutional neural networks (CNNs) in image classification has become the standard method of approaching computer vision problems. Here we apply pre-trained networks to classify images of non-breaking, plunging and spilling breaking waves. The CNNs are used as basic feature extractors and a classifier is then trained on top of these networks. The dynamic nature of breaking waves is exploited by using image sequences to gain extra information and improve the classification results. We also see improved classification performance in using pre-computed image features such as the optical flow between image pairs. The inclusion of the dynamic information improves the classification between breaking wave classes. We also provide corrections to the methodology from the article from which the data originates to achieve a more accurate assessment of performance.
Subject: Mathematics & Computer Science, Other Keywords: aerial scene classification; remote-sensing image classification; few-shot learning; meta-learning
Online: 15 December 2020 (13:21:49 CET)
CNN-based methods have dominated the field of aerial scene classification for the past few years. While achieving remarkable success, CNN-based methods suffer from excessive parameters and notoriously rely on large amounts of training data. In this work, we introduce few-shot learning to the aerial scene classification problem. Few-shot learning aims to learn a model on a base set that can quickly adapt to unseen categories in a novel set, using only a few labeled samples. To this end, we propose a meta-learning method for few-shot classification of aerial scene images. First, we train a feature extractor on all base categories to learn a representation of the inputs. Then, in the meta-training stage, the classifier is optimized in the metric space by cosine distance with a learnable scale parameter. Finally, in the meta-testing stage, the query sample in the unseen category is predicted by the adapted classifier given a few support samples. We conduct extensive experiments on two challenging datasets: NWPU-RESISC45 and RSD46-WHU. The experimental results show that our method yields state-of-the-art performance. Furthermore, several ablation experiments are conducted to investigate the effects of dataset scale, the impact of different metrics and the number of support shots; the experimental results confirm that our model is especially effective in few-shot settings.
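The scaled-cosine classification step described in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the authors' method: class prototypes are taken as the mean of the support embeddings, the scale is a fixed number rather than a learned parameter, and all names and embeddings are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return 0.0 if nu == 0 or nv == 0 else dot / (nu * nv)


def prototypes(support):
    """Mean embedding per class (an assumption; the abstract does not say
    how class representatives are formed).

    `support` maps a class label to a list of embedding vectors.
    """
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in support.items()
    }


def classify(query, protos, scale=10.0):
    """Softmax over scaled cosine similarities to each class prototype.

    `scale` stands in for the learnable scale parameter; here it is fixed.
    """
    labels = list(protos)
    logits = [scale * cosine(query, protos[label]) for label in labels]
    m = max(logits)                      # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return {label: e / total for label, e in zip(labels, exps)}
```

A larger scale sharpens the softmax, which is why such a parameter is usually learned: raw cosine values lie in [-1, 1] and would otherwise give nearly uniform probabilities.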
ARTICLE | doi:10.20944/preprints202106.0634.v1
Subject: Keywords: Hyperspectral image; HSI; PCA; K-means clustering; unsupervised; classification; bands; satellite; ROSIS; AVIRIS
Online: 28 June 2021 (10:01:41 CEST)
The visualization of hyperspectral images on display devices with RGB colour channels is quite difficult due to the high dimensionality of these images. Thus, principal component analysis has been used as a dimensionality reduction algorithm that limits information loss by creating uncorrelated features. To classify regions in the hyperspectral images, K-means clustering has been used to form clusters/regions. These two algorithms have been implemented on three datasets imaged by the AVIRIS and ROSIS sensors.
ARTICLE | doi:10.20944/preprints202204.0163.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: artificial intelligence; deep learning; image-to-image translation; dual-energy computed tomography; pulmonary embolism; emergency radiology
Online: 18 April 2022 (09:45:00 CEST)
Detector-based spectral CT offers the possibility of obtaining spectral information from which discrete acquisitions at different energy levels can be derived, yielding so-called virtual monoenergetic images (VMI). In this study, we aimed to develop a jointly optimized deep learning framework based on dual-energy CT pulmonary angiography (DE-CTPA) data to generate synthetic monoenergetic images (SMI) for improving automatic pulmonary embolism (PE) detection in single-energy CTPA scans. For this purpose, we used two data sets: our institutional DE-CTPA data set D1 comprising polyenergetic arterial series and the corresponding VMI at low-energy levels (40 keV) with 7,892 image pairs, and a 10% subset of the 2020 RSNA Pulmonary Embolism Detection Challenge data set D2, which consisted of 161,253 polyenergetic images with dichotomous slice-wise annotations (PE/no PE). We trained a fully convolutional encoder-decoder on D1 to generate SMI from single-energy CTPA scans of D2, which were then fed into a ResNet50 network for training of the downstream PE classification task. The quantitative results on the reconstruction ability of our framework revealed high-quality visual SMI predictions with reconstruction results of 0.984 ± 0.002 (structural similarity) and 41.706 ± 0.547 dB (peak-signal-to-noise ratio). PE classification resulted in an AUC of 0.84 for our model, which achieved improved performance compared to other naive approaches with AUCs up to 0.81. Our study stresses the role of using joint optimization strategies for deep learning algorithms to improve automatic PE detection. The proposed pipeline may prove to be beneficial for computer-aided detection systems and could help rescue CTPA studies with suboptimal opacification of the pulmonary arteries from single-energy CT scanners.
ARTICLE | doi:10.20944/preprints202202.0058.v2
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: Document Image Classification; Corruption Robustness; Robustness to Distortions; Model Robustness
Online: 14 June 2022 (08:43:57 CEST)
Deep neural networks have been extensively researched in the field of document image classification to improve classification performance and have shown excellent results. However, there is little research in this area that addresses the question of how well these models would perform in a real-world environment, where the data the models are confronted with often exhibits various types of noise or distortion. In this work, we present two separate benchmark datasets, namely RVL-CDIP-D and Tobacco3482-D, to evaluate the robustness of existing state-of-the-art document image classifiers to different types of data distortions that are commonly encountered in the real world. The proposed benchmarks are generated by inserting 21 different types of data distortions with varying severity levels into the well-known document datasets RVL-CDIP and Tobacco3482, respectively, which are then used to quantitatively evaluate the impact of the different distortion types on the performance of latest document image classifiers. In doing so, we show that while the higher accuracy models also exhibit relatively higher robustness, they still severely underperform on some specific distortions, with their classification accuracies dropping from ~90% to as low as ~40% in some cases. We also show that some of these high accuracy models perform even worse than the baseline AlexNet model in the presence of distortions, with the relative decline in their accuracy sometimes reaching as high as 300-450% that of AlexNet. The proposed robustness benchmarks are made available to the community and may aid future research in this area.
ARTICLE | doi:10.20944/preprints202009.0566.v1
Subject: Engineering, Automotive Engineering Keywords: transportation mode classification; vulnerable road users; recurrence plots; computer vision; image classification system
Online: 24 September 2020 (04:41:32 CEST)
As the Autonomous Vehicle (AV) industry is rapidly advancing, classification of non-motorized (vulnerable) road users (VRUs) becomes essential to ensure their safety and the smooth operation of road applications. The typical practice of classifying non-motorized road users usually requires considerable training time and ignores the temporal evolution and behavior of the signal. In this research effort, we attempt to detect VRUs with high accuracy by proposing a novel framework that uses Deep Transfer Learning, which saves training time and cost, to classify images constructed from Recurrence Quantification Analysis (RQA) that reflect the temporal dynamics and behavior of the signal. Recurrence Plots (RPs) were constructed from low-power smartphone sensors without using GPS data. The resulting RPs were used as inputs for different pre-trained Convolutional Neural Network (CNN) classifiers: 227×227 images were constructed for AlexNet and SqueezeNet, and 224×224 images for VGG16 and VGG19. Results show that the classification accuracy of Convolutional Neural Network Transfer Learning (CNN-TL) reaches 98.70%, 98.62%, 98.71%, and 98.71% for AlexNet, SqueezeNet, VGG16, and VGG19, respectively. The results of the proposed framework outperform other results in the literature (to the best of our knowledge) and show that using CNN-TL is promising for VRU classification. Because of its relative straightforwardness, ability to be generalized and transferred, and potentially high accuracy, we anticipate that this framework might be able to solve various problems related to signal classification.
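To make the recurrence-plot idea in the abstract above concrete, here is a minimal sketch. It assumes the simplest variant: a one-dimensional signal, no time-delay embedding, and a fixed distance threshold `eps`. The abstract does not state these choices, so they are illustrative, as are the function names. Resizing the plot to the CNN input size is left out.

```python
def recurrence_plot(signal, eps):
    """Binary recurrence matrix: entry (i, j) is 1 when samples i and j
    are within `eps` of each other, else 0."""
    n = len(signal)
    return [
        [1 if abs(signal[i] - signal[j]) <= eps else 0 for j in range(n)]
        for i in range(n)
    ]


def recurrence_rate(rp):
    """Fraction of recurrent points, one of the basic RQA measures."""
    n = len(rp)
    return sum(map(sum, rp)) / (n * n) if n else 0.0
```

The matrix is symmetric with ones on the diagonal; periodic signals produce diagonal line patterns, which is the kind of texture an image classifier can pick up.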
ARTICLE | doi:10.20944/preprints201912.0059.v2
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: hyperspectral image classification; deep learning; channel-wise attention mechanism; spatial-wise attention mechanism
Online: 12 February 2020 (05:40:08 CET)
In recent years, researchers have paid increasing attention to hyperspectral image (HSI) classification using deep learning methods. To improve accuracy and reduce the number of training samples required, we propose a double-branch dual-attention mechanism network (DBDA) for HSI classification in this paper. Two branches are designed in DBDA to capture the abundant spectral and spatial features contained in HSI. Furthermore, a channel attention block and a spatial attention block are applied to these two branches respectively, which enables DBDA to refine and optimize the extracted feature maps. A series of experiments on four hyperspectral datasets shows that the proposed framework outperforms the state-of-the-art algorithm, especially when training samples are severely lacking.
ARTICLE | doi:10.20944/preprints201712.0057.v1
Subject: Earth Sciences, Other Keywords: dimension reduction; feature extraction; hyperspectral image; weighted feature space; low rank representation; spectral clustering
Online: 11 December 2017 (06:55:22 CET)
Containing hundreds of spectral bands (features), hyperspectral images (HSIs) have a high ability to discriminate between land cover classes. Traditional HSI data processing methods assign the same importance to all bands in the original feature space (OFS), although different spectral bands play different roles in identifying samples of different classes. In order to explore the relative importance of each feature, in this paper we learn a weighting matrix and obtain the relative weighted feature space (RWFS) as an enriched feature space for HSI data analysis. To overcome the difficulty of limited labeled samples, which is a common case in HSI data analysis, we extend our method to a semisupervised framework. To transfer available knowledge to unlabeled samples, we employ graph-based clustering in which low rank representation (LRR) is used to define the similarity function for the graph. After constructing the RWFS, any dimension reduction method and classification algorithm can be employed in it. The experimental results on two well-known HSI data sets show that some dimension reduction algorithms perform better in the new weighted feature space.
ARTICLE | doi:10.20944/preprints201810.0073.v1
Subject: Engineering, Biomedical & Chemical Engineering Keywords: Classification; F-score; Gray-Level Co-occurrence Matrix (GLCM); Gray-Level Run-Length Matrix (GLRLM); Hepatocellular Carcinoma (HCC); Liver Cancer; Liver Abscess; Image Texture; Sequential Backward Selection (SBS); Sequential Forward Selection (SFS); Support Vector Machine (SVM); Ultrasound Image
Online: 4 October 2018 (14:01:42 CEST)
This paper discusses computer-aided diagnosis (CAD) for distinguishing Hepatocellular Carcinoma (HCC), the most common type of liver cancer, from liver abscess, based on ultrasound image texture features and a Support Vector Machine (SVM) classifier. From 79 cases of liver disease, with 44 cases of HCC and 35 cases of liver abscess, this research extracts 96 Gray-Level Co-occurrence Matrix (GLCM) and Gray-Level Run-Length Matrix (GLRLM) features from the regions of interest (ROIs) in ultrasound images. Three feature selection models, i) Sequential Forward Selection, ii) Sequential Backward Selection, and iii) F-score, are adopted to select the features used to identify these liver diseases. Finally, the developed system can classify HCC and liver abscess by SVM with an accuracy of 88.875%. The proposed methods can provide diagnostic assistance in distinguishing the two kinds of liver disease by using a CAD system.
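For readers unfamiliar with GLCM texture features, the sketch below shows how a co-occurrence matrix and one derived feature (contrast) can be computed. The offset, the number of gray levels, the symmetric counting and the toy image are all assumptions made for illustration; the abstract above does not specify the settings actually used.

```python
def glcm(image, levels, dx=1, dy=0, symmetric=True):
    """Gray-level co-occurrence counts for the pixel offset (dx, dy).

    `image` is a list of rows of integer gray levels in [0, levels).
    """
    m = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                m[i][j] += 1
                if symmetric:
                    m[j][i] += 1   # count the pair in both directions
    return m


def contrast(m):
    """GLCM contrast: sum over (i, j) of (i - j)^2 times the normalized count."""
    total = sum(map(sum, m))
    if total == 0:
        return 0.0
    weighted = sum(
        (i - j) ** 2 * v for i, row in enumerate(m) for j, v in enumerate(row)
    )
    return weighted / total
```

In a pipeline like the one described, several such features per ROI (over multiple offsets) would form the feature vector passed to feature selection and then to the SVM.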
ARTICLE | doi:10.20944/preprints202201.0352.v1
Subject: Earth Sciences, Geoinformatics Keywords: Per-pixel classification confidence; spatial pattern; image classification; accuracy assessment; interpolation method
Online: 24 January 2022 (11:53:46 CET)
Obtaining classification confidence at the pixel level is a challenging task for accuracy assessment in remote sensing image classification. Among the various methods for estimating classification confidence at the pixel level, interpolation-based methods have drawn special attention in the literature. Even though they are widely recognized, their usefulness has not been rigorously evaluated. This paper conducts a comprehensive evaluation of three interpolation-based methods: the local error matrix method, the bootstrap method, and the geostatistical method. We applied each of the three methods to three representative datasets with different spatial resolutions, spectral bands, and numbers of classes. We then derived the estimated classification confidence and the true classification confidence and compared them using both exploratory data analysis (bi-histogram) and statistical analysis (Willmott's d and binned classification quality). The results indicate that the three interpolation methods provide some interesting insights into various aspects of estimating per-pixel classification confidence. Unfortunately, interpolation assumes that classification confidence is smooth across space, which is usually not true in practice. In other words, interpolation-based methods have limited practical use.
ARTICLE | doi:10.20944/preprints202201.0367.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: Artificial Intelligence; Deep Learning; Image Classification; Machine Learning; Predictive Models; Small Datasets; Supervised Learning
Online: 25 January 2022 (08:24:17 CET)
One of the most important challenges in the Machine and Deep Learning areas today is to build good models using small datasets, because sometimes it is not possible to have large ones. Several techniques have been proposed in the literature to address this challenge. This paper aims at studying the different available Deep Learning techniques and performing a thorough experimentation to analyze which technique or combination thereof improves the performance and effectiveness of the models. A complete comparison with classical Machine Learning techniques was carried out, to contrast the results obtained using both techniques when working with small datasets. Thirteen algorithms were implemented and trained using three different small datasets (MNIST, Fashion MNIST, and CIFAR-10). Each experiment was evaluated using a well-established set of metrics (Accuracy, Precision, Recall, F1, and the Matthews correlation coefficient). The experimentation allowed concluding that it is possible to find a technique or combination of them to mitigate a lack of data, but this depends on the nature of the dataset, the amount of data, and the metrics used to evaluate them.
ARTICLE | doi:10.20944/preprints202002.0334.v1
Subject: Earth Sciences, Geoinformatics Keywords: deep learning; drone imagery; hyperspectral image classification; tree species classification; 3D convolutional neural networks
Online: 24 February 2020 (01:13:13 CET)
Interest in drone solutions in forestry applications is growing. Using drones, datasets can be captured flexibly and at high spatial and temporal resolutions when needed. In forestry applications, fundamental tasks include the detection of individual trees, tree species classification, biomass estimation, etc. Deep Neural Networks (DNNs) have shown superior results compared with conventional machine learning methods such as the Multi-Layer Perceptron (MLP) when the input data is very large. The objective of this research was to investigate 3D convolutional neural networks (3D-CNN) to classify three major tree species in a boreal forest: pine, spruce, and birch. The proposed 3D-CNN models were employed to classify tree species in a test site in Finland. The classifiers were trained with a dataset of 3039 manually labelled trees. The accuracies were then assessed using independent datasets of 803 records. To find the most efficient feature combination, we compare the performance of 3D-CNN models trained with hyperspectral (HS) channels, RGB channels, and the canopy height model (CHM), separately and combined. It is demonstrated that the proposed 3D-CNN model with RGB and HS layers produces the highest classification accuracy. The producer accuracies of the best 3D-CNN classifier on the test dataset were 99.6%, 94.8%, and 97.4% for pines, spruces, and birches, respectively. The best 3D-CNN classifier produced ~5% better classification accuracy than the MLP with all layers. Our results suggest that the proposed method provides excellent classification results with acceptable performance metrics for HS datasets. Our results show that the pine class was detectable in most layers. Spruce was most detectable in RGB data, while birch was most detectable in the HS layers. Furthermore, the RGB datasets provide acceptable results for many low-accuracy applications.
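The per-class producer accuracy reported in the abstract above is computed from a confusion matrix: for each reference class, it is the share of its samples that the classifier labeled correctly. The sketch below assumes rows are reference classes and columns are predictions. The function name and any counts passed to it are illustrative and are not the study's data.

```python
def producer_accuracy(confusion, labels):
    """Per-class producer accuracy from a confusion matrix.

    Assumes confusion[k][j] counts reference class k predicted as class j,
    so each row sums to the number of reference samples of that class.
    """
    result = {}
    for k, label in enumerate(labels):
        row_total = sum(confusion[k])
        result[label] = confusion[k][k] / row_total if row_total else 0.0
    return result
```

For example, a made-up row `[9, 1, 0]` for pine means 9 of 10 reference pines were recovered, giving a producer accuracy of 0.9 for that class.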
ARTICLE | doi:10.20944/preprints202209.0169.v1
Subject: Earth Sciences, Geoinformatics Keywords: Synthetic Aperture Radar (SAR); Optical image (Sentinel 2); Random Forest (RF); CART; GEE
Online: 13 September 2022 (10:06:14 CEST)
Observing cultivated crops and other forms of land use is an important environmental and economic concern for agricultural land management and crop classification. Crop categorization provides data that are important for crop management, food security, and the development of agricultural policies. Remote sensing data, especially publicly available Sentinel 1 and 2 data, have been used effectively in crop mapping and classification in cloudy places because of their high spatial and temporal resolution. This study aimed to improve crop type classification by combining Sentinel-1 (Synthetic Aperture Radar (SAR)) data and Sentinel-2 Multispectral Instrument (MSI) data. In the study, Random Forest (RF) and Classification and Regression Trees (CART) classifiers were used to classify grain crops (Barley and Wheat). The classification results based on the combination of Sentinel-2 and Sentinel-1 data indicated an overall accuracy (OA) of 93% and a kappa coefficient (K) of 0.896 for RF, and (89.15%, 0.84) for the CART classifier. It is suggested to employ a mix of radar and optical data to attain the highest level of classification accuracy, since doing so improves the likelihood that details will be observed compared with the single-sensor classification technique and yields more accurate results.
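The overall accuracy and kappa coefficient reported above are both derived from a confusion matrix. The following is a minimal sketch of that arithmetic, assuming the standard definition of Cohen's kappa; the function name and the example counts are hypothetical and are not taken from the study.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of samples of reference class i
    that were assigned to class j by the classifier.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix contains no samples")
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n)) / total
    # Chance agreement: sum over classes of (row total * column total) / total^2.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Made-up two-class example (e.g. barley vs. wheat), for illustration only.
example = [[45, 5],
           [5, 45]]
oa, kappa = overall_accuracy_and_kappa(example)
```

For this made-up matrix the observed agreement is 0.9 and the chance agreement is 0.5, so kappa is 0.8.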
ARTICLE | doi:10.20944/preprints202108.0325.v1
Subject: Earth Sciences, Environmental Sciences Keywords: Multi-granularity encoding neural networks (MGNNE); feature extraction; multilayer perceptron (MLP); Principal component analysis (PCA); Remote Sensing image classification, LCLU.
Online: 16 August 2021 (11:28:21 CEST)
Deep learning classification is the state-of-the-art machine learning approach. Earlier work shows that deep convolutional neural networks have been applied very successfully to data such as images and video. This work addresses the recognition of remote sensing imagery of the earth's surface and its use for land cover and land use (LCLU) classification. First, this article summarizes emerging remote sensing applications and challenges for deep learning methods. Second, we propose four approaches to learn efficient and effective CNNs that transfer image representations from the ImageNet dataset to recognize LCLU datasets. We use VGG16, Inception-ResNet-V2, Inception-V3, and DenseNet201 models, pre-trained on ImageNet, to extract features from the EACC dataset. For feature selection, we propose principal component analysis (PCA) to improve accuracy and speed up the model. We train our model with a multi-layer perceptron (MLP) as the classifier. Lastly, we apply the multi-granularity encoding ensemble model. We achieve an overall accuracy of 92.3% for the nine-class classification problem. This work will help remote sensing scientists understand deep learning tools and apply them in large-scale remote sensing challenges.
ARTICLE | doi:10.20944/preprints201808.0112.v2
Subject: Mathematics & Computer Science, Computational Mathematics Keywords: remote sensing; image classification; fully connected conditional random fields (FC-CRF); convolutional neural networks (CNN)
Online: 28 November 2018 (07:11:42 CET)
The interpretation of land use and land cover (LULC) is an important issue in the fields of high-resolution remote sensing (RS) image processing and land resource management. Fully training a new or existing convolutional neural network (CNN) architecture for LULC classification requires a large number of remote sensing images. Thus, fine-tuning a pre-trained CNN for LULC detection is required. To improve the classification accuracy for high-resolution remote sensing images, it is necessary to use another feature descriptor and to adopt a classifier for post-processing. A fully connected conditional random field (FC-CRF) that uses the fine-tuned CNN layers, spectral features, and fully connected pairwise potentials is proposed for the classification of high-resolution remote sensing images. First, an existing CNN model is adopted, and the parameters of the CNN are fine-tuned with training datasets. Then, the probability that each image pixel belongs to each class type is calculated. Second, the spectral features and the digital surface model (DSM) are combined with a support vector machine (SVM) classifier to determine the probabilities of belonging to each LULC class type. Combined with the probabilities obtained by the fine-tuned CNN, new feature descriptors are built. Finally, the FC-CRF is introduced to produce the classification results, where the unary potentials are obtained from the new feature descriptors and the SVM classifier, and the pairwise potentials are obtained from the three-band RS imagery and the DSM. Experimental results show that the proposed classification scheme achieves good performance, with a total accuracy of about 85%.
ARTICLE | doi:10.20944/preprints201807.0516.v1
Subject: Earth Sciences, Other Keywords: land-cover classification; very high spatial resolution remote sensing image; adaptive majority vote; post-classification.
Online: 26 July 2018 (15:05:16 CEST)
Land-cover classification that uses very-high-resolution (VHR) remote sensing images is a topic of considerable interest. Although many classification methods have been developed, there is still room for improvements in the accuracy and usability of classification systems. In this paper, a novel post-processing approach based on a dual-adaptive majority voting strategy (D-AMVS) is proposed for improving the performance of initial classification maps. D-AMVS defines a strategy for refining each label of a classified map that is obtained by different classification methods from the same original image and fusing the different refined classification maps to generate a final classification result. The proposed D-AMVS contains three main blocks. 1) An adaptive region is generated by extending gradually the region around a central pixel based on two predefined parameters (T1 and T2) in order to utilize the spatial feature of ground targets in a VHR image. 2) For each classified map, the label of the central pixel is refined according to the majority voting rule within the adaptive region. This is defined as adaptive majority voting (AMV). Each initial classified map is refined in this manner pixel by pixel. 3) Finally, the refined classified maps are used to generate a final classification map, and the label of the central pixel in the final classification map is determined by applying AMV again. Each entire classified map is scanned and refined pixel by pixel based on the proposed D-AMVS. The accuracies of the proposed D-AMVS approach are investigated through two remote sensing images with high spatial resolutions of 1.0 and 1.3 m, respectively. Compared with the classical majority voting method and a relatively new post-processing method called general post-classification framework, the proposed D-AMVS can achieve a land-cover classification map with less noise and higher classification accuracies.
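The majority voting rule that the refinement step builds on can be sketched as follows. This is a simplified illustration using a fixed square window, which corresponds to the classical majority voting baseline mentioned above, not the adaptive region controlled by T1 and T2; the function name and the `radius` parameter are assumptions made for the example.

```python
from collections import Counter


def majority_vote_filter(label_map, radius=1):
    """Refine each label by majority vote inside a fixed square window.

    label_map is a list of rows of class labels. The window is clipped at
    the borders of the map. The original label is kept whenever it is
    among the most frequent labels in the window (ties keep the centre).
    """
    rows, cols = len(label_map), len(label_map[0])
    refined = [row[:] for row in label_map]
    for r in range(rows):
        for c in range(cols):
            window = [
                label_map[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            counts = Counter(window)
            best_count = max(counts.values())
            if counts[label_map[r][c]] < best_count:
                refined[r][c] = counts.most_common(1)[0][0]
    return refined


# A single mislabelled pixel surrounded by class 1 is relabelled to class 1.
noisy = [[1, 1, 1],
         [1, 2, 1],
         [1, 1, 1]]
cleaned = majority_vote_filter(noisy)
```

Votes are always read from the input map rather than the partially refined output, so the result does not depend on the scan order.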
ARTICLE | doi:10.20944/preprints201703.0134.v1
Subject: Earth Sciences, Geoinformatics Keywords: spatial-spectral feature; very high spatial resolution image; classification; Tobler’s First Law of Geography
Online: 17 March 2017 (05:06:12 CET)
Aerial image classification has become popular and has attracted extensive research efforts in recent decades. The main challenge lies in its very high spatial resolution but relatively insufficient spectral information. To this end, spatial-spectral feature extraction is a popular strategy for classification. However, parameter determination for that feature extraction is usually time-consuming and depends excessively on experience. In this paper, an automatic spatial feature extraction approach based on image raster and segmental vector data cross-analysis is proposed for the classification of very high spatial resolution (VHSR) aerial imagery. First, multi-resolution segmentation is used to generate strongly homogeneous image objects and extract corresponding vectors. Then, to automatically explore the region of a ground target, two rules, which are derived from Tobler’s First Law of Geography (TFL) and a topological relationship of vector data, are integrated to constrain the extension of a region around a central object. Third, the shape and size of the extended region are described. A final classification map is achieved through a supervised classifier using shape, size, and spectral features. Experiments on three real aerial images of VHSR (0.1 to 0.32 m) are done to evaluate effectiveness and robustness of the proposed approach. Comparisons to state-of-the-art methods demonstrate the superiority of the proposed method in VHSR image classification.
ARTICLE | doi:10.20944/preprints201806.0188.v1
Subject: Earth Sciences, Geoinformatics Keywords: minimum noise fraction (MNF) transformation; object-based image analysis (OBIA); APEX hyperspectral imagery; Random forest (RF) classifier; multiresolution segmentation (MRS); tree species classification
Online: 12 June 2018 (10:55:07 CEST)
Tree species composition is a key element for biodiversity and sustainable forest management, and hyperspectral data provide detailed spectral information that can be used for tree species classification. There are two main challenges in using hyperspectral imagery: a) the Hughes phenomenon, meaning that as the number of bands in hyperspectral imagery increases, the number of required classification samples increases exponentially, and b) in a more complex environment, such as riparian mixed forest, focusing on spectral variability per pixel may not be adequate for distinguishing tree species. Therefore, the focus of this study is to assess spectral-spatial dimensionality reduction of airborne hyperspectral imagery by using minimum noise fraction (MNF) transformation and object-based image analysis (OBIA). Airborne prism experiment (APEX) hyperspectral imagery was used. The study area was a riparian mixed forest located along the Salzach river, and six tree species including Picea abies, Populus (canadensis and balsamifera), Fraxinus excelsior, Alnus incana, and Salix alba were selected. The random forest (RF) machine learning algorithm was used to train and apply a prediction model for classification. A pixel-level classification was also done using the spectrally dimensionality-reduced APEX data. According to a confusion matrix, the object-level classification of MNF-derived components achieved an overall accuracy of 85% and a kappa coefficient of 0.805. The producer's accuracy per class varied between 80% for Fraxinus excelsior, Alnus incana, and Populus canadensis and 90% for Salix alba and Picea abies. Comparing the results with a pixel-level classification showed better performance for object-level classification (an overall accuracy of 63% and a kappa coefficient of 0.559 were achieved for pixel-level classification). The per-class performance of pixel-based classification varied from 45% for Alnus incana to 80% for Picea abies. In general, spectral-spatial complexity reduction using MNF transformation and object-level classification yielded statistically satisfactory results.
ARTICLE | doi:10.20944/preprints202102.0189.v1
Subject: Mathematics & Computer Science, Algebra & Number Theory Keywords: image quality assessment; image databases; superpixels; color image; color space; image quality measures
Online: 8 February 2021 (11:11:47 CET)
Objective Image Quality Assessment (IQA) measures are playing an increasingly important role in the evaluation of digital image quality. New IQA indices are expected to be strongly correlated with subjective observer evaluations expressed by MOS/DMOS scores. One such recently proposed index is the SuperPixel-based SIMilarity (SPSIM) index, which uses superpixel patches instead of the rectangular pixel grid. In this paper, the authors propose three modifications of the SPSIM index. For this purpose, the color space used by SPSIM was changed, and the way SPSIM determines similarity maps was modified using methods derived from the algorithm for computing the MDSI index. The third modification was a combination of the first two. These three new quality indices were used in the assessment process. The experimental results obtained on many color images from five image databases demonstrated the advantages of the proposed SPSIM modifications.
ARTICLE | doi:10.20944/preprints202007.0686.v1
Subject: Engineering, Electrical & Electronic Engineering Keywords: document scanning; whiteboard capture; image enhancement; image alignment; image registration; image quality assessment
Online: 28 July 2020 (14:03:51 CEST)
The move from paper to online is not only necessary for remote working, it is also significantly more sustainable. This trend has seen a rising need for high-quality digitization of content from pages and whiteboards to sharable online material. But capturing this information is not always easy, nor are the results always satisfactory. Available scanning apps vary in their usability and do not always produce clean results, retaining surface imperfections from the page or whiteboard in their output images. CleanPage, a novel smartphone-based document and whiteboard scanning system, is presented. CleanPage requires one button-tap to capture, identify, crop and clean an image of a page or whiteboard. Unlike equivalent systems, no user intervention is required during processing and the result is a high-contrast, low-noise image with a clean homogenous background. Results are presented for a selection of scenarios showing the versatility of the design. CleanPage is compared with two market leader scanning apps using two testing approaches: real paper scans and ground-truth comparisons. These comparisons are achieved by a new testing methodology that allows scans to be compared to unscanned counterparts, by using synthesized images. Real paper scans are tested using image quality measures. An evaluation of standard image quality assessments is included in this work and a novel quality measure for scanned images is proposed and validated. The user experience for each scanning app is assessed, showing CleanPage to be fast and easier to use.
ARTICLE | doi:10.20944/preprints202010.0323.v1
Subject: Engineering, Automotive Engineering Keywords: Image segmentation; sonar image; ocean engineering; morphological image processing
Online: 15 October 2020 (13:10:41 CEST)
Segmenting sonar images has remained a hard problem for years, since most of them are noisy and remain blurred after noise reduction. To address this problem, a fast segmentation algorithm based on the gray value characteristics of sonar images is proposed. An advantage of the algorithm is that no segmentation thresholds need to be calculated. It proceeds as follows: first, calculate the gray matrix of the fuzzy image background. After adjusting the gray value, segment the image into background, buffer, and target regions. After filtering, reset the pixels with gray value lower than 255 to binarize the image and eliminate most artifacts. Finally, remove the remaining noise by means of morphological image processing. Simulation results on several sonar images show that the algorithm can segment fuzzy sonar images quickly and effectively, without leaving target shapes incomplete. The method is thus shown to be stable and feasible.
ARTICLE | doi:10.20944/preprints201711.0193.v1
Subject: Keywords: computational intelligence; quantum hybrid intelligent systems; quantum machine learning; medical image processing; disease diagnosis; Fuzzy k-NN; Quantum-behaved PSO; cervical smear images; cancer detection
Online: 30 November 2017 (07:21:00 CET)
A quantum hybrid (QH) intelligent approach that blends the adaptive search capability of the quantum-behaved particle swarm optimisation (QPSO) method with the intuitionistic rationality of the traditional fuzzy k-nearest neighbours (Fuzzy k-NN) algorithm (known simply as the Q-Fuzzy approach) is proposed for efficient feature selection and classification of cells in cervical smeared (CS) images. From an initial multitude of seventeen (17) features describing the geometry, colour, and texture of the CS images, the QPSO stage of our proposed technique is used to select the best subset of features (i.e. global best particles), which represents a pruned-down collection of seven (7) features. Using a dataset of almost 1000 images, performance evaluation of our proposed Q-Fuzzy approach assesses the impact of our feature selection on classification accuracy by way of three experimental scenarios that are compared alongside two other approaches: the All-features approach (i.e. classification without prior feature selection) and another hybrid technique combining the standard PSO algorithm with the Fuzzy k-NN technique (P-Fuzzy approach). In the first and second scenarios, we further divided the assessment criteria into classification accuracy based on the choice of best features and classification accuracy in terms of the different categories of the cervical cells. In the third scenario, we introduced new QH hybrid techniques, i.e. QPSO combined with other supervised learning methods, and compared their classification accuracy alongside our proposed Q-Fuzzy approach. Furthermore, we employed statistical approaches to establish qualitative agreement with regard to the feature selection in scenarios 1 and 3. The synergy between the QPSO and Fuzzy k-NN in the proposed Q-Fuzzy approach marginally improves classification accuracy, as manifested in the reduction in the number of cell features, which is crucial for effective cervical cancer detection and diagnosis.
ARTICLE | doi:10.20944/preprints201703.0086.v1
Subject: Engineering, General Engineering Keywords: image enhancement; image fusion; color space; edge detector; underwater image
Online: 14 March 2017 (17:52:48 CET)
In order to improve contrast and restore color for underwater images captured by camera sensors without suffering from insufficient details and color cast, a fusion algorithm for image enhancement in different color spaces based on contrast limited adaptive histogram equalization (CLAHE) is proposed in this article. The original color image is first converted from RGB color space to two different color spaces: YIQ and HSI. The color space conversion from RGB to YIQ is a linear transformation, while the RGB to HSI conversion is nonlinear. Then, the algorithm applies CLAHE separately in the YIQ and HSI color spaces to obtain two different enhanced images. The luminance component (Y) in the YIQ color space and the intensity component (I) in the HSI color space are enhanced with the CLAHE algorithm. CLAHE has two key parameters, Block Size and Clip Limit, which mainly control the quality of the CLAHE-enhanced image. After that, the YIQ and HSI enhanced images are each converted back to RGB color. When the three components of red, green, and blue are not coherent in the YIQ-RGB or HSI-RGB images, the three components have to be harmonized with the CLAHE algorithm in RGB space. Finally, with a 4-direction Sobel edge detector in the bounded general logarithm ratio operation, a self-adaptive weight selection nonlinear image enhancement is carried out to fuse the YIQ-RGB and HSI-RGB images into the final fused image. The enhancement fusion algorithm has two key factors, the average of the Sobel edge detector and the fusion coefficient, which determine its effect. A series of evaluation metrics such as mean, contrast, entropy, colorfulness metric (CM), mean square error (MSE) and peak signal to noise ratio (PSNR) are used to assess the proposed enhancement algorithm. The experimental results showed that the proposed algorithm provides more detail enhancement and higher colorfulness restoration values than other existing image enhancement algorithms. The proposed algorithm can effectively suppress noise interference and improve underwater image quality.
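The idea of enhancing only the luminance channel in YIQ space and converting back to RGB can be illustrated with Python's standard `colorsys` module. This is only a sketch of the color space round trip: a plain gain on Y stands in for CLAHE, which is not implemented here, and the function name `scale_luminance` is hypothetical.

```python
import colorsys


def scale_luminance(rgb, gain):
    """Scale the Y (luminance) channel of one RGB pixel in YIQ space.

    rgb is a tuple of floats in [0, 1]. The chrominance channels I and Q
    are left unchanged, so only brightness is modified. A simple gain is
    used as a placeholder for a real enhancement such as CLAHE.
    """
    r, g, b = rgb
    y, i, q = colorsys.rgb_to_yiq(r, g, b)   # linear RGB -> YIQ transform
    y = min(1.0, max(0.0, y * gain))          # keep luminance in range
    return colorsys.yiq_to_rgb(y, i, q)       # back to RGB (clamped to [0, 1])


# A dark gray pixel becomes a lighter gray; hue and saturation are untouched.
brightened = scale_luminance((0.2, 0.2, 0.2), 2.0)
```

Working on Y alone is what keeps the color balance stable: for a gray pixel I and Q are zero, so the output stays gray at the new brightness.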
ARTICLE | doi:10.20944/preprints201902.0089.v3
Subject: Mathematics & Computer Science, Probability And Statistics Keywords: Digital image processing, color image, grayscale image, histogram equalization, histogram specification, image enhancement, RGB channel
Online: 11 February 2019 (10:42:57 CET)
This paper has two major parts. In the first part histogram equalization for the image enhancement was implemented without using the built-in function in MATLAB. Here, at first, a color image of a rat was chosen and the image was transformed into a grayscale image. After this conversion, histogram equalization was implemented on the grayscale image. Later on, in the same image for each RGB channel, histogram equalization was implemented to observe the effect of histogram equalization on each channel. In the end, the histogram equalization was implemented to this specific color image of a rat. In the second part, for the grayscale image in part 1, the desired histogram of another colored image of a rat was introduced and histogram specification was implemented on the original colored image.
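Histogram equalization without a built-in function comes down to building a histogram, accumulating it into a cumulative distribution, and using that as a lookup table. The sketch below shows this for a grayscale image in plain Python rather than MATLAB; it assumes the common CDF-based mapping and is not the paper's own code. For a color image the same function could be applied to each RGB channel separately.

```python
def equalize_histogram(image, levels=256):
    """Equalize a grayscale image given as a list of rows of integers.

    Pixel values must lie in [0, levels - 1]. Each gray level is mapped
    through the normalized cumulative histogram so that the output
    levels are spread over the full range.
    """
    # Step 1: histogram of gray levels.
    hist = [0] * levels
    for row in image:
        for value in row:
            hist[value] += 1

    # Step 2: cumulative distribution of the histogram.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    total = cdf[-1]
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Constant image: nothing to spread out.
        return [row[:] for row in image]

    # Step 3: lookup table from old gray level to new gray level.
    lut = [
        max(0, round((c - cdf_min) * (levels - 1) / (total - cdf_min)))
        for c in cdf
    ]
    return [[lut[value] for value in row] for row in image]


# Four distinct low-contrast levels are stretched over the full 0..255 range.
equalized = equalize_histogram([[10, 20], [30, 40]])
# equalized == [[0, 85], [170, 255]]
```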
ARTICLE | doi:10.20944/preprints201811.0565.v1
Subject: Mathematics & Computer Science, Probability And Statistics Keywords: Digital image processing, color image, grayscale image, histogram equalization, histogram specification, image enhancement, RGB channel
Online: 23 November 2018 (14:17:13 CET)
This paper has two major parts. In the first part histogram equalization for the image enhancement was implemented without using the built-in function in MATLAB. Here, at first, a color image of a rat was chosen and the image was transformed into a grayscale image. After this conversion, histogram equalization was implemented on the grayscale image. Later on, in the same image for each RGB channel, histogram equalization was implemented to observe the effect of histogram equalization on each channel. In the end, the histogram equalization was implemented to this specific color image of a rat. In the second part, for the grayscale image in part 1, the desired histogram of another colored image of a rat was introduced and histogram specification was implemented on the original colored image.
ARTICLE | doi:10.20944/preprints202108.0286.v1
Subject: Engineering, Electrical & Electronic Engineering Keywords: Image enhancement; DCT-Domain Perceived Contrast; Perceptual Image Quality
Online: 13 August 2021 (08:31:37 CEST)
This paper develops a detail image signal enhancement method that makes images perceived as clearer and more resolved, and so is more effective for higher resolution displays. We observe that the local variant signal enhancement makes images more vivid, and the more revealed granular signals harmonically embedded on the local variant signals make images more resolved. Based on this observation, we develop a method that not only emphasizes the local variant signals by scaling up the frequency energy in accordance with human visual perception, but also strengthens the granular signals by embedding the alpha-rooting enhanced frequency components. The proposed energy scaling method emphasizes the detail signals in texture images and rarely boosts noisy signals in plain images. In addition, to avoid the local ringing artifact, the proposed method adjusts the enhancement direction to be parallel to the underlying image signal direction. It was verified through subjective and objective quality evaluations that the developed method makes images perceived as clearer and more highly resolved.
ARTICLE | doi:10.20944/preprints202101.0345.v1
Online: 18 January 2021 (14:26:38 CET)
Imaging devices of less than 300,000 pixels are mostly used for sewage conduit exploration due to the small scale of the survey industry in Korea. In particular, devices of less than 100,000 pixels are still widely used, and the environment for image processing is very challenging. Since the sewage conduit images covered in this study have a very low resolution (240 × 320 = 76,800 pixels), it is very difficult to detect cracks. Because most sewer conduit images in Korea have very low resolution, this problem of low resolution was selected as the subject of study. Cracks were detected through a total of six steps, including enhancing the crack in Step 2, finding the optimal threshold value in Step 3, and applying an algorithm to detect cracks in Step 5. Cracks were effectively detected by the optimal parameters in Steps 2 and 3 and the user algorithm in Step 5. Despite the very low resolution, the cracked images showed 96.4% accuracy of detection, and the non-cracked images showed 94.5% accuracy. Moreover, the analysis was also excellent in quality. It is believed that the findings of this study can be effectively used for crack detection with low-resolution images.
ARTICLE | doi:10.20944/preprints201810.0393.v1
Subject: Engineering, Other Keywords: image analysis; Turin Shroud; body-image formation; energy propagation
Online: 18 October 2018 (03:55:21 CEST)
Recent studies on the image of the Turin Shroud (TS) suggest that it could have been formed through a not well-identified mechanism of energy radiation. To fill some gaps in the understanding of this imaging process, a reverse engineering method was applied to it, making it possible to exclude some candidate mechanisms. The formation of the image of a human face wrapped in a cloth was simulated using software developed ad hoc. The results of different kinds of radiation, depending on different parameters, were simulated, each one connected with accredited hypotheses. By comparing the different images produced by the software with the TS Face, some useful information was obtained about both the kind of radiation and the cloth wrapping conditions. The image distortion caused by a cloth wrapped around a face is also discussed by defining the best laws of radiation and of their attenuation with distance. A Lambertian law is not compatible with the TS image. A vertical radiation shows a problem in reproducing the required resolution. A radiation perpendicular to the emitting surface, like that produced by an electric field, appears promising for explaining the TS Face.
ARTICLE | doi:10.20944/preprints201705.0028.v1
Online: 3 May 2017 (09:19:59 CEST)
Recovering depth information of objects from two-dimensional images is one of the most important and basic problems in the computer vision field. In view of the shortcomings of existing methods of depth estimation, a novel approach based on SIFT (the Scale Invariant Feature Transform) is presented in this paper. The approach can estimate the depths of objects in two images captured by an uncalibrated ordinary monocular camera. In this approach, the first image is captured first. All of the camera parameters remain unchanged, and the second image is acquired after moving the camera a distance d along the optical axis. Then image segmentation and SIFT feature extraction are performed on the two images separately, and objects in the images are matched. Lastly, an object's depth can be computed from the lengths of a pair of straight line segments. To ensure that the most appropriate pair of straight line segments is chosen and to reduce computation, convex hull theory and triangle similarity are employed. The experimental results show that our approach is effective and practical.
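The triangle similarity step can be made concrete under a simple pinhole camera assumption, which is an assumption of this sketch and is not spelled out in the abstract. If a segment of fixed physical length, parallel to the image plane, appears with length l1 at depth Z and with length l2 after the camera moves a distance d toward it, then l1 * Z = l2 * (Z - d), so Z = d * l2 / (l2 - l1). The function and parameter names below are hypothetical.

```python
def depth_from_axial_motion(len_first, len_second, displacement):
    """Estimate object depth at the first camera position.

    len_first and len_second are the image lengths (e.g. in pixels) of the
    same matched line segment before and after the camera moves
    `displacement` toward the object along the optical axis. Assumes a
    pinhole camera with unchanged parameters, so the focal length cancels
    out and no calibration is needed.
    """
    if displacement <= 0:
        raise ValueError("displacement must be positive (motion toward the object)")
    if len_second <= len_first:
        raise ValueError("the segment must appear longer after moving closer")
    return displacement * len_second / (len_second - len_first)


# A segment growing from 100 px to 125 px after a 0.5 m move implies 2.5 m depth.
depth = depth_from_axial_motion(100.0, 125.0, 0.5)
```

The estimate becomes unstable when the two lengths are nearly equal, which is one reason the choice of line segment pair matters.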
ARTICLE | doi:10.20944/preprints201611.0057.v1
Subject: Engineering, Electrical & Electronic Engineering Keywords: multi-focus image, image fusion, region mosaic, contrast pyramid
Online: 10 November 2016 (07:34:22 CET)
This paper proposes a new approach for multi-focus image fusion based on Region Mosaicing on Contrast Pyramids (REMCP). A density-based region growing method is developed to construct a focused region mask for multi-focus images. The segmented focused region mask is decomposed into a mask pyramid, which is then used for supervised region mosaicking on a contrast pyramid. In this way, the focus measurement and the continuity of focused regions are incorporated, and the pixel-level pyramid fusion is improved at the region level. Objective and subjective experiments show that the proposed REMCP is more robust to noise than the compared algorithms and can fully preserve the focus information of the multi-focus images while reducing distortions of the fused images.
ARTICLE | doi:10.20944/preprints201811.0566.v2
Subject: Mathematics & Computer Science, Probability And Statistics Keywords: Color image, grayscale image, motion blurring, random noise, inverse filtering, Wiener filtering, restoration of an image
Online: 5 February 2019 (16:13:14 CET)
In this paper, at first, a color image of a car is taken. Then the image is transformed into a grayscale image. After that, the motion blurring effect is applied to that image according to the image degradation model described in equation 3. The blurring effect can be controlled by a and b components of the model. Then random noise is added in the image via Matlab programming. Many methods can restore the noisy and motion blurred image; particularly in this paper Inverse filtering as well as Wiener filtering are implemented for the restoration purpose. Consequently, both motion blurred and noisy motion blurred images are restored via Inverse filtering as well as Wiener filtering techniques and the comparison is made among them.
ARTICLE | doi:10.20944/preprints202201.0259.v2
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: image classifier; image part; quick learning; feature overlap; positional context
Online: 11 April 2022 (10:17:57 CEST)
This paper describes an image processing method that makes use of image parts instead of neural parts. Neural networks excel at image or pattern recognition and they do this by constructing complex networks of weighted values that can cover the complexity of the pattern data. These features however are integrated holistically into the network, which means that they can be difficult to use in an individual sense. A different method might scan individual images and use a more local method to try to recognise the features in it. This paper suggests such a method, where a trick during the scan process can not only recognise separate image parts, as features, but it can also produce an overlap between the parts. It is therefore able to produce image parts with real meaning and also place them into a positional context. Tests show that it can be quite accurate, on some handwritten digit datasets, but not as accurate as a neural network, for example. The fact that it offers an explainable interface could make it interesting however. It also fits well with an earlier cognitive model, and an ensemble-hierarchy structure in particular.
ARTICLE | doi:10.20944/preprints202006.0117.v1
Online: 9 June 2020 (05:00:26 CEST)
Speckle noise is one of the most difficult noises to remove, especially in medical applications. It is a nuisance in ultrasound imaging systems, which are used in about half of all medical screening systems. Thus, noise removal is an important step in these systems, thereby creating reliable, automated, and potentially low cost systems. Herein, a generalized approach, MFNR (Multi-Frame Noise Removal), is used, which is a complete noise removal system using KDE (Kernel Density Estimation). Any given type of noise can be removed if its probability density function (PDF) is known. Herein, we extracted the PDF parameters using KDE. Noise removal and detail preservation are not contrary to each other, as is the case in single-frame noise removal methods. Our results showed practically complete noise removal using the MFNR algorithm compared to standard noise removal tools. The Peak Signal to Noise Ratio (PSNR) performance was used as a comparison metric. This paper is an extension of our previous paper, where the MFNR algorithm was shown to be a general purpose complete noise removal tool for all types of noise.
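The PSNR comparison metric mentioned above follows the standard definition PSNR = 10 * log10(MAX^2 / MSE). The sketch below assumes 8-bit images by default (MAX = 255) and uses a hypothetical function name; it illustrates the metric only, not the MFNR algorithm.

```python
import math


def psnr(reference, test, max_value=255):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Both images are lists of rows of numbers. Returns infinity when the
    images are identical, since the mean squared error is then zero.
    """
    squared_errors = [
        (a - b) ** 2
        for ref_row, test_row in zip(reference, test)
        for a, b in zip(ref_row, test_row)
    ]
    if not squared_errors:
        raise ValueError("images must not be empty")
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / mse)


clean = [[100, 100], [100, 100]]
noisy = [[110, 90], [100, 100]]
score = psnr(clean, noisy)  # higher values mean the images are closer
```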
ARTICLE | doi:10.20944/preprints202002.0125.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: image inpainting; image completion; attention; pyramid structure loss; deep learning
Online: 10 February 2020 (10:16:37 CET)
This paper develops a multi-task learning framework that attempts to incorporate image structure knowledge to assist image inpainting, which has not been well explored in previous work. The primary idea is to train a shared generator to simultaneously complete the corrupted image and the corresponding structures (edge and gradient), thus implicitly encouraging the generator to exploit relevant structure knowledge while inpainting. In the meantime, we also introduce a structure embedding scheme to explicitly embed the learned structure features into the inpainting process, thereby providing possible preconditions for image completion. Specifically, a novel pyramid structure loss is proposed to supervise structure learning and embedding. Moreover, an attention mechanism is developed to further exploit the recurrent structures and patterns in the image to refine the generated structures and contents. Through multi-task learning, structure embedding, and attention, our framework takes advantage of the structure knowledge and outperforms several state-of-the-art methods on benchmark datasets quantitatively and qualitatively.
ARTICLE | doi:10.20944/preprints201906.0248.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: image segmentation; neutrosophic information; Shannon entropy; gray level image threshold
Online: 25 June 2019 (08:48:22 CEST)
This article presents a new method of segmenting grayscale images by minimizing Shannon's neutrosophic entropy. For the proposed segmentation method, the neutrosophic information components, i.e., the degree of truth, the degree of neutrality and the degree of falsity are defined taking into account the belonging to the segmented regions and at the same time to the separation threshold area. The principle of the method is simple and easy to understand and can lead to multiple thresholds. The efficacy of the method is illustrated using some test gray level images. The experimental results show that the proposed method has good performance for segmentation with optimal gray level thresholds.
ARTICLE | doi:10.20944/preprints201904.0078.v1
Subject: Behavioral Sciences, Social Psychology Keywords: forest recreation; forest landscape; landscape image; landscape image sketching technique
Online: 8 April 2019 (09:08:30 CEST)
The landscape image is the bridge of communication between people and forests, and the cut point of the supply-side reform of forest tourism products. The research collected 140 copies in total of forest landscape image drawings from non-art-major graduate students by randomly sampling during April and May, 2018, and constructed the landscape image conceptual model of forest by utilizing the landscape image sketching technique. The results showed that (1) In regard to linguistic knowledge, the natural landscape elements for instance, herbaceous plants, terrains, creatures, water and sky, and the broad-leaf forest objectively reflected not only the real forest landscape and the local native vegetation, but the variation of forest species with little attention. (2) On the perspective of spatial view, the sideways view indicated that graduate students preferred to watch forests at a moderate distance externally and few looked at forests internally. (3) In the view of self-orientation, the objective landscape indicated that graduate students preferred to demonstrate forest landscapes, they did not realize to interact with the environment. (4) On the aspect of social meaning, the scenic view and forest structure stated that graduate students preferred rural forest landscapes, not significantly for other special interests for forest. In conclusions, (1) the forest is thought to be a feature of people's life world and of rural scenes around homes, not an objective perception of the forest. (2) The forest is regarded as an important habitat for animals and a limited resource for people's life, production and recreation needs, into which people will go only to meet such needs. (3) The natural values of forests, like the ecology and aesthetics, etc. get more attention, while the social values of forests, like the life, production and culture receives rather low attention.
ARTICLE | doi:10.20944/preprints202006.0091.v1
Subject: Mathematics & Computer Science, General & Theoretical Computer Science Keywords: Breast Cancer Screening; Digital Image Elasto Tomography (DIET); Image Noise Removal, Image Enhancement; Multiple Frame Noise Removal (MFNR)
Online: 7 June 2020 (14:53:34 CEST)
Breast cancer is a leading cause of death among women. Conventional screening methods, such as mammography and ultrasound diagnosis, are expensive and have significant limitations. Digital Image Elasto Tomography (DIET) is a new noninvasive breast cancer screening system that has the potential to be a low-cost and reliable breast cancer screening tool. It is based on modal analysis of the breast mass, and stereographic 3D image analysis to detect the stiffer abnormal tissues. However, camera sensor noise, especially Gaussian noise, is a major source of Optical Flow (OF) error in this approach to tumor detection. This work studies the performance of different conventional filters, including the standard Gaussian filter tool, to remove this noise and produce more robust screening results. A radical approach, Multiple Frame Noise Removal (MFNR), is proposed for use in this type of medical image processing instead of a Gaussian filter or other typical image noise removal tools. It is a multiple frame noise removal method in which the Probability Density Function (PDF) of the noise is extracted from the multiple images by characterizing the same pixel positions in multiple images. The noise becomes deterministic, and hence is easily removed. The proposed algorithm was applied to a data set from 10 phantom breast tests with a prototype DIET system, and 10 in-vivo samples from healthy women. Comparisons were made to an optimal Gaussian filter form that is commonly used. Reductions in OF error using these digitally imaged data sets were used to compare performance. Refinement of the images for medical applications requires higher PSNR, which was successfully achieved by using the MFNR algorithm. In this study, the algorithm was used to improve the imaging results of a DIET system. The conventional wisdom that states that noise removal and detail preservation are contrasting effects is
ARTICLE | doi:10.20944/preprints202105.0408.v1
Subject: Engineering, Automotive Engineering Keywords: UAV Images; Monoscopic Mapping; Stereoscopic Plotting; Image Overlap; Optimal Image Selection
Online: 18 May 2021 (10:10:07 CEST)
Recently, the mapping industry has been focusing on the possibility of large-scale mapping from unmanned aerial vehicles (UAVs) owing to advantages such as easy operation and cost reduction. In order to produce large-scale maps from UAV images, it is important to obtain precise orientation parameters. For this, various techniques have been developed and are included in most commercial UAV image processing software. For mapping, it is equally important to select images that can cover a region of interest (ROI) with the fewest possible images. Otherwise, to map the ROI, one may have to handle too many images, and commercial software neither provides the information needed to select images nor explicitly explains how to select images for mapping. For these reasons, stereo mapping of UAV images in particular is time consuming and costly. In order to solve these problems, this study proposes a method to select images intelligently. We can select a minimum number of image pairs to cover the ROI with the fewest possible images. We can also select optimal image pairs to cover the ROI with the most accurate stereo pairs. We group images by strips, and generate the initial image pairs. We then apply an intelligent scheme to iteratively select optimal image pairs from the start to the end of an image strip. According to the results of the experiment, the number of images selected is greatly reduced by applying the proposed optimal image–composition algorithm. The selected image pairs produce a dense 3D point cloud over the ROI without any holes. For stereoscopic plotting, the selected image pairs were used to map the ROI successfully on a digital photogrammetric workstation (DPW), and a digital map covering the ROI was generated. The proposed method should contribute to time and cost reductions in UAV mapping.
REVIEW | doi:10.20944/preprints202012.0479.v1
Subject: Mathematics & Computer Science, Algebra & Number Theory Keywords: Image classification; Texture image analysis; Discriminant features; Combination methods; texture operators
Online: 18 December 2020 (16:21:50 CET)
In many image processing and computer vision applications, the main aim is to describe image contents. To this end, different visual properties such as color, texture and shape are extracted. In this respect, texture information plays an important role in image description and visual pattern classification. Texture refers to a specific local distribution of intensities that is repeated throughout the image. To date, different operators or descriptors have been proposed to analyze texture characteristics. In multi-object images, a single texture operator usually does not provide accurate results. So, in many cases, combinations of texture operators are used to achieve more discriminant features. In this paper, some combination methods are surveyed to analyze the effect of combined texture features on image content description. Also, in the results section, different related methods are compared in terms of accuracy and computational complexity.
ARTICLE | doi:10.20944/preprints202005.0167.v1
Subject: Mathematics & Computer Science, Applied Mathematics Keywords: neutrosophic information; Onicescu information energy; image segmentation; gray level image threshold
Online: 10 May 2020 (14:41:04 CEST)
This article presents a method of segmenting images with gray levels that uses Onicescu's information energy calculated in the context of the neutrosophic theory. Starting from the information energy calculation for complete neutrosophic information, it is shown how to extend its calculation for incomplete and inconsistent neutrosophic information. The segmentation method is based on calculation of thresholds for separating the gray levels using the local maximum points of the Onicescu information energy.
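For orientation, Onicescu's information energy of a discrete distribution is the sum of the squared probabilities. The sketch below computes it for a plain gray-level histogram; it does not reproduce the neutrosophic extension or the thresholding rule of the article, and the function name and input layout are assumptions made for the example.

```python
def information_energy(histogram):
    """Onicescu information energy E = sum(p_i ** 2) of a histogram.

    `histogram` holds non-negative counts per gray level; the counts are
    normalized to probabilities before squaring.
    """
    total = sum(histogram)
    if total == 0:
        raise ValueError("histogram must contain at least one count")
    return sum((count / total) ** 2 for count in histogram)
```

The energy ranges from 1/n for a uniform distribution over n levels up to 1 when all pixels share a single gray level, so it behaves as a concentration measure, opposite in direction to Shannon entropy.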
TECHNICAL NOTE | doi:10.20944/preprints202203.0095.v1
Subject: Engineering, Other Keywords: pre-processing; image transformation; image enhancement; geometric correction; radiometric correction; Satellite Imagery
Online: 7 March 2022 (09:43:08 CET)
During the past few years, various algorithms have been developed to extract features from high-resolution satellite imagery. For the classification of these extracted features, several complex algorithms have been developed. But these algorithms lack the critical stages of refining the data at the preliminary phase. Various satellite sensors have been launched, such as LISS3, IKONOS, QUICKBIRD, and WORLDVIEW. Before classification and extraction of semantic data, high-resolution imagery must be refined. The whole refinement process involves several steps of interaction with the data. These steps are the pre-processing algorithms presented in this paper. Pre-processing steps involve geometric correction, radiometric correction, noise removal, image enhancement, etc. These pre-processing algorithms increase the accuracy of the data. Applications of this data pre-processing include meteorology, hydrology, soil science, forestry, physical planning, etc. This paper also provides a brief description of the local maximum likelihood method, fuzzy method, stretch method and pre-processing methods, which are used before classifying and extracting features from the image.
ARTICLE | doi:10.20944/preprints201705.0027.v2
Subject: Social Sciences, Geography Keywords: remote sensing; image registration; multiple image features; different viewpoint; non-rigid distortion
Online: 13 June 2017 (09:52:10 CEST)
Remote sensing image registration plays an important role in military and civilian fields, such as natural disaster damage assessment, military damage assessment and ground targets identification, etc. However, due to the ground relief variations and imaging viewpoint changes, non-rigid geometric distortion occurs between remote sensing images with different viewpoint, which further increases the difficulty of remote sensing image registration. To address the problem, we propose a multi-viewpoint remote sensing image registration method which contains the following contributions. (i) A multiple features based finite mixture model is constructed for dealing with different types of image features. (ii) Three features are combined and substituted into the mixture model to form a feature complementation, i.e., the Euclidean distance and shape context are used to measure the similarity of geometric structure, and the SIFT (scale-invariant feature transform) distance which is endowed with the intensity information is used to measure the scale space extrema. (iii) To prevent the ill-posed problem, a geometric constraint term is introduced into the L2E-based energy function for better behaving the non-rigid transformation. We evaluated the performances of the proposed method by three series of remote sensing images obtained from the unmanned aerial vehicle (UAV) and Google Earth, and compared with five state-of-the-art methods where our method shows the best alignments in most cases.
ARTICLE | doi:10.20944/preprints202108.0392.v1
Subject: Engineering, Other Keywords: image quality assessment; real-time image processing; image functions adaptation; convolutional neural network; face alignment; deep neural network; random forest
Online: 18 August 2021 (17:06:02 CEST)
In recent years, data providers have been generating and streaming a large number of images. In particular, processing images that contain faces has received great attention due to its numerous applications, such as entertainment and social media apps. The enormous number of images shared on these applications presents serious challenges and requires massive computing resources to ensure efficient data processing. However, images are subject to a wide range of distortions in real application scenarios during processing, transmission, sharing, or a combination of many factors. So, there is a need to guarantee acceptable delivered content, even though the original versions of some distorted images are not available. In this paper, we present a framework developed to estimate image quality while processing a large number of images in real time. Our quality evaluation is measured using an integration of a deep network with random forests. In addition, a face alignment metric is used to assess the facial features. Experiments were conducted on two artificially distorted benchmark datasets, LIVE and TID2013. We show that our proposed approach outperforms state-of-the-art methods, having a Pearson Correlation Coefficient (PCC) and Spearman Rank Order Correlation Coefficient (SROCC) with subjective human scores of almost 0.942 and 0.931, respectively, while reducing the processing time from 4.8ms to 1.8ms.
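The two agreement measures reported in this abstract can be sketched as follows: PCC measures linear agreement between predicted quality scores and human scores, while SROCC is the Pearson correlation of their ranks and so only measures monotonic agreement. The function names are illustrative, and ties are handled with average ranks, which is a common convention rather than something stated in the abstract.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))
```

A metric whose scores are a monotonic but non-linear function of the human scores reaches an SROCC of 1 while its PCC stays below 1, which is why quality-assessment studies usually report both.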
CONCEPT PAPER | doi:10.20944/preprints202204.0129.v1
Subject: Mathematics & Computer Science, Other Keywords: Digital Design; Digital Architecture; Image Processing; Machine learning; FPGA; Dedicated Design; Image Processor
Online: 14 April 2022 (05:09:47 CEST)
Many dedicated designs for real-time operations provide functionality on fixed-sized operators, but where speed, scalability, and flexibility are required, extensive research is demanded. Dedicated designs can provide real-time processing for many applications. This paper presents an FPGA-based design of a general image processor. The proposed design is based on a fixed-point representation of binary numbers. The proposed design provides a mechanism to manage matrices on-chip along with matrix arithmetic. The matrices are represented with simple identifiers and microinstruction that assist in the computation of many operations which are useful for solving complex problems. The design was successfully implemented and tested using VHDL language. The proposed design is an efficient architecture as a standalone processor with all embedding computational resources necessary for an embedded image processing application.
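To make the fixed-point idea concrete, here is a minimal software sketch of fixed-point matrix arithmetic. The abstract does not state the word length or the position of the binary point, so the 8 fractional bits used here, and all function names, are hypothetical choices for illustration; the actual design is implemented in VHDL, not Python.

```python
FRAC_BITS = 8  # hypothetical: 8 fractional bits (scale factor 256)


def to_fixed(x):
    """Convert a real number to its scaled-integer fixed-point form."""
    return int(round(x * (1 << FRAC_BITS)))


def to_float(q):
    """Convert a fixed-point integer back to a float for inspection."""
    return q / (1 << FRAC_BITS)


def fixed_mul(a, b):
    """Multiply two fixed-point values and rescale the product.

    The raw product carries 2 * FRAC_BITS fractional bits, so it is shifted
    right once. Python's >> floors, which rounds negative results down.
    """
    return (a * b) >> FRAC_BITS


def fixed_matmul(A, B):
    """Matrix product of fixed-point matrices given as nested lists.

    Products are accumulated at full precision and rescaled once per output
    element, mirroring a wide accumulator in hardware.
    """
    inner, cols = len(B), len(B[0])
    return [[sum(row[k] * B[k][c] for k in range(inner)) >> FRAC_BITS
             for c in range(cols)]
            for row in A]
```

Rescaling once after accumulation, rather than after every product, loses less precision and is a typical reason hardware multiply-accumulate units keep a wider accumulator than the operand width.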
ARTICLE | doi:10.20944/preprints202001.0205.v1
Subject: Behavioral Sciences, Other Keywords: itch; scratch; automated real-time detection; machine-learning based image classifier; image sharpness
Online: 19 January 2020 (03:13:48 CET)
A 'little brother' of pain, itch is an unpleasant sensation that creates a specific urge to scratch. To date, various machine-learning based image classifiers (MBICs) have been proposed for quantitative analysis of itch-induced scratch behaviour of laboratory animals in an automated, non-invasive, inexpensive and real-time manner. In spite of MBICs' advantages, the overall performance (accuracy, sensitivity and specificity) of current MBIC approaches remains inconsistent, with values varying from ~50% to ~99%, and the underlying reasons have yet to be investigated further, both computationally and experimentally. To look into the variation in the performance of MBICs in automated detection of itch-induced scratch, this article focuses on the experimental data recording step, and reports here for the first time that MBICs' overall performance is inextricably linked to the sharpness of experimentally recorded video of laboratory animal scratch behaviour. This article furthermore demonstrates for the first time that a linear correlation exists between video sharpness and the overall performance (accuracy and specificity, but not sensitivity) of MBICs, and highlights the primary role of experimental data recording in rapid, accurate and consistent quantitative assessment of laboratory animal itch.
ARTICLE | doi:10.20944/preprints201906.0166.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: MRI image; Texture Features; GLCM
Online: 18 June 2019 (05:36:29 CEST)
This paper presents a feature vector built from statistical texture analysis of brain tumors in MRI images. The statistical texture features are computed using GLCM (Gray Level Co-occurrence Matrices) of the brain nodule structure. In this paper, the brain nodule is segmented using a strips method that implements marker-based watershed image segmentation based on PSO (Particle Swarm Optimization) and Fuzzy C-means clustering (FCM). Furthermore, the GLCM of the segmented brain image is calculated at four angles: 0°, 45°, 90° and 135°. For each of the four angular directions, the texture features correlation, energy, contrast and homogeneity are calculated. Texture analysis has been applied to different types of images in past years. In the proposed algorithm, the statistical texture features are calculated for iterative image segmentation. The results show that MRI images can be used in a brain cancer detection system.
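The GLCM step can be sketched as below: count how often gray level i occurs next to gray level j along one of the four directions, normalize the counts, and derive the four features. This is a generic illustration, not the authors' code. Definitions vary between toolkits, and the choices here (non-symmetric matrix, energy as the sum of squared entries, homogeneity with a 1 + |i - j| denominator) are assumptions, as are the function names.

```python
import math

# (row, column) step to the neighbouring pixel for each direction in degrees.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def glcm(image, levels, angle):
    """Normalized gray-level co-occurrence matrix of a nested-list image.

    `image` holds integer gray levels in range(levels); `angle` is one of
    0, 45, 90 or 135.
    """
    d_row, d_col = OFFSETS[angle]
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    if total == 0:
        raise ValueError("image too small for this offset")
    return [[v / total for v in row] for row in counts]


def texture_features(p):
    """Contrast, energy, homogeneity and correlation of a normalized GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    contrast = sum(v * (i - j) ** 2 for i, j, v in cells)
    energy = sum(v * v for _, _, v in cells)
    homogeneity = sum(v / (1 + abs(i - j)) for i, j, v in cells)
    mu_i = sum(i * v for i, _, v in cells)
    mu_j = sum(j * v for _, j, v in cells)
    var_i = sum(v * (i - mu_i) ** 2 for i, _, v in cells)
    var_j = sum(v * (j - mu_j) ** 2 for _, j, v in cells)
    if var_i == 0 or var_j == 0:
        correlation = float("nan")  # undefined when one marginal is constant
    else:
        cov = sum(v * (i - mu_i) * (j - mu_j) for i, j, v in cells)
        correlation = cov / math.sqrt(var_i * var_j)
    return {"contrast": contrast, "energy": energy,
            "homogeneity": homogeneity, "correlation": correlation}
```

Calling `texture_features(glcm(image, levels, angle))` for each of the four angles yields a 16-value feature vector, which is one plausible way to assemble the per-direction features the abstract lists.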
ARTICLE | doi:10.20944/preprints201810.0534.v1
Subject: Engineering, Industrial & Manufacturing Engineering Keywords: non-destructive testing; process optimization; porosity; pore hotspots; image-based simulations; 3D image analysis
Online: 23 October 2018 (09:58:18 CEST)
This paper presents the latest developments in microCT, both globally and locally, for supporting the additive manufacturing industry. There are a number of recently developed capabilities which are especially relevant to the non-destructive quality inspection of additive manufactured parts; and also for advanced process optimization. These new capabilities are all locally available but not yet utilized to their full potential, most likely due to a lack of knowledge of these capabilities. The aim of this paper is therefore to fill this gap and provide an overview of these latest capabilities, showcasing numerous local examples.
ARTICLE | doi:10.20944/preprints201805.0240.v1
Subject: Engineering, Electrical & Electronic Engineering Keywords: background reconstruction; image quality assessment; image dataset; subjective evaluation; perceptual quality; objective quality metric
Online: 17 May 2018 (09:36:33 CEST)
With an increased interest in applications that require a clean background image, such as video surveillance, object tracking, street view imaging and location-based services on web-based maps, multiple algorithms have been developed to reconstruct a background image from cluttered scenes. Traditionally, statistical measures and existing image quality techniques have been applied for evaluating the quality of the reconstructed background images. Though these quality assessment methods have been widely used in the past, their performance in evaluating the perceived quality of the reconstructed background image has not been verified. In this work, we discuss the shortcomings in existing metrics and propose a full reference Reconstructed Background image Quality Index (RBQI) that combines color and structural information at multiple scales using a probability summation model to predict the perceived quality in the reconstructed background image given a reference image. To compare the performance of the proposed quality index with existing image quality assessment measures, we construct two different datasets consisting of reconstructed background images and corresponding subjective scores. The quality assessment measures are evaluated by correlating their objective scores with human subjective ratings. The correlation results show that the proposed RBQI outperforms all the existing approaches. Additionally, the constructed datasets and the corresponding subjective scores provide a benchmark to evaluate the performance of future metrics that are developed to evaluate the perceived quality of reconstructed background images.
ARTICLE | doi:10.20944/preprints201612.0075.v1
Subject: Earth Sciences, Geoinformatics Keywords: image recognition bases location; indoor positioning; RGB-D images; LiDAR; DataBase; mobile computing; image retrieval
Online: 15 December 2016 (07:17:35 CET)
This paper describes the first results of an Image Recognition Based Location (IRBL) procedure for mobile applications, focusing on the procedure to generate a database of range images (RGB-D). In an indoor environment, prior spatial knowledge of the surroundings is needed to estimate the camera position and orientation. To achieve this objective, a complete 3D survey of two different environments (Bangbae metro station in Seoul and the E.T.R.I. building in Daejeon, Republic of Korea) was performed using a LiDAR (Light Detection And Ranging) instrument, and the obtained scans were processed to obtain a spatial model of the environments. From this, two databases of reference images were generated using dedicated software developed by the Geomatics group of Politecnico di Torino (ScanToRGBDImage). This tool synthetically generates different RGB-D images centered on each scan position in the environment. Later, the external parameters (X, Y, Z, ω, φ, κ) and the range information extracted from the retrieved DB images are used as reference information for pose estimation of a set of acquired mobile pictures in the IRBL procedure. In this paper, the survey operations, the approach for generating the RGB-D images and the IRBL strategy are reported. Finally, the analysis of the results and the validation test are described.
ARTICLE | doi:10.20944/preprints202109.0295.v1
Subject: Medicine & Pharmacology, Other Keywords: Obesity; Eating Disorder; Body Image; Adolescents.
Online: 16 September 2021 (16:34:57 CEST)
There is growing recognition of the adverse effects of body image dissatisfaction (BID) and eating disorder (ED) symptoms on adolescent health. The aim of this study was to estimate the prevalence of ED symptoms, BID, and their relationship in adolescents from public schools in Southern Brazil. A total of 782 schoolchildren (male: n=420, female: n=362; age: 15 ± 0.4 years) answered a self-administered questionnaire to identify sociodemographic data. The Children's Figure Rating Scale was adopted to identify body image, and the Eating Attitudes Test (EAT-26) was applied to investigate ED symptoms. Inferential statistics and hierarchical model-controlled logistic regression were used to test associations between variables. Most of the schoolchildren reported being satisfied with their bodies. However, we observed a higher prevalence of dissatisfaction with being overweight among girls and with thinness among boys. Female students and students from schools located in the central area of the city showed higher chances of developing ED symptoms, and the absence of ED symptoms appeared to act as a protective factor against BID in schoolchildren. The results of this study show the need to reflect on the factors that influence the development of ED and non-acceptance of one's own body in a population concerned with physical appearance.
CASE REPORT | doi:10.20944/preprints202012.0785.v1
Subject: Earth Sciences, Atmospheric Science Keywords: built environment; image analysis; remote sensing
Online: 31 December 2020 (09:51:50 CET)
Unmanned satellite technology continues to develop, and medium-resolution satellites with a range of sensitivities and spectral bands, such as Landsat, are very effective for observing environmental change. The purpose of this study is to monitor the development of built-up land using image transformation techniques and to estimate built-up land changes. The research method uses the NDVI, NDBI and Built-Up Index image transformations with Landsat satellite image data obtained from the USGS. Accuracy samples were selected by purposive sampling and assessed with a confusion matrix accuracy test. The results found built-up land development of 19.25% for the period 2004-2010 and 30.25% for the period 2010-2018. By sub-district area, the highest built-up land development was in the Kubung area, at 7.20% in the first period and 32.23% in the second. The confusion matrix accuracy test, based on 185 field samples, gave an image accuracy of 86.04%.
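The three image transformations named in this abstract are commonly defined as normalized band differences, and the sketch below uses those standard forms: NDVI = (NIR - Red) / (NIR + Red), NDBI = (SWIR - NIR) / (SWIR + NIR), and Built-Up Index = NDBI - NDVI. Whether the study used exactly these forms is an assumption, and the per-pixel function names are illustrative. The overall accuracy of a confusion matrix is the share of samples on its diagonal.

```python
def normalized_difference(a, b):
    """(a - b) / (a + b), returning 0.0 where both bands are zero."""
    denominator = a + b
    return 0.0 if denominator == 0 else (a - b) / denominator


def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel."""
    return normalized_difference(nir, red)


def ndbi(swir, nir):
    """Normalized Difference Built-up Index for one pixel."""
    return normalized_difference(swir, nir)


def built_up_index(red, nir, swir):
    """Built-Up Index as NDBI minus NDVI; positive values suggest built-up land."""
    return ndbi(swir, nir) - ndvi(nir, red)


def overall_accuracy(confusion):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total
```

With made-up reflectances, a vegetated pixel such as `built_up_index(0.1, 0.5, 0.3)` comes out strongly negative, while a pixel whose SWIR reflectance exceeds its NIR reflectance, such as `built_up_index(0.2, 0.25, 0.35)`, comes out positive.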
ARTICLE | doi:10.20944/preprints202012.0727.v1
Subject: Social Sciences, Accounting Keywords: city marketing; sustainable development; resilience; image
Online: 29 December 2020 (11:24:13 CET)
The focus of this study is to identify whether resilience and sustainable development can be used as an image for the strategic planning of city marketing. Resilience is about building and planning to future-proof cities, so that urban challenges and crises have the lowest impact and cities can bounce back and evolve as much as possible. Resilience is part of sustainable development. Thus, it is important for decision-makers to define the mission of their strategic planning holistically, taking into consideration the basic assets of a city (the environment, the economy and the society), how all of them can be combined to market the city, and the internal and external environment. In the past few years, city marketing has become an important tool for urban development. The main goal is to show how city marketing can be applied to a city that tries to be more resilient and more sustainable by using strategic urban planning to set the vision, to identify the challenges and the problematic areas, and to set new goals and objectives in order to plan and build to future-proof the complexity of an urban system. To answer the questions of this article, we use two case studies, Rotterdam (Netherlands) and Thessaloniki (Greece), drawing on a literature review and existing research alongside a benchmarking of their resilience strategies, as both cities are members of the Resilient Cities Network. From different perspectives of resilient thinking, both cities have managed to use resilience as a marketing image for further sustainable development.
ARTICLE | doi:10.20944/preprints201910.0188.v1
Subject: Engineering, Electrical & Electronic Engineering Keywords: digital watermarking; multiple image; transform domain
Online: 17 October 2019 (08:48:19 CEST)
In this paper, a technique of image watermarking using multiple images as watermarks is presented. The technique is based on transform domain functions including discrete wavelet transform (DWT), discrete cosine transform (DCT) and singular value decomposition (SVD) with an image as the host signal i.e. the watermarks will be used as proofs of the authenticity of the host image. The technique is executed by performing multilevel DWT followed by applying DCT and SVD to both the host and watermark. Multiple watermarks are used for the insurance of better security level. The scheme is immune to common image processing operations & some attacks and exhibits PSNR of 108.3781dB, normalized cross correlation (NCC) over 0.99 and normalized correlation (NC) over 0.99.
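Of the quality figures quoted in this abstract, normalized correlation between the embedded and the extracted watermark is easy to sketch. Several definitions are in use; the one below, the inner product divided by the product of the two norms, is a common choice and an assumption here, since the abstract does not give its formula. The function name and nested-list layout are likewise illustrative.

```python
import math


def normalized_correlation(original, extracted):
    """Normalized correlation between two equally sized watermark images.

    Returns sum(w * w') / (sqrt(sum(w ** 2)) * sqrt(sum(w' ** 2))), which is
    1.0 when the extracted watermark is a positive multiple of the original.
    """
    dot = 0.0
    norm_original = 0.0
    norm_extracted = 0.0
    for row_o, row_e in zip(original, extracted):
        for w, w_e in zip(row_o, row_e):
            dot += w * w_e
            norm_original += w * w
            norm_extracted += w_e * w_e
    if norm_original == 0 or norm_extracted == 0:
        raise ValueError("correlation is undefined for an all-zero watermark")
    return dot / (math.sqrt(norm_original) * math.sqrt(norm_extracted))
```

Values close to 1 after an attack such as compression or filtering indicate that the watermark survived largely intact, which is how figures like "over 0.99" are usually read.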
ARTICLE | doi:10.20944/preprints201906.0215.v1
Subject: Social Sciences, Education Studies Keywords: addiction; triathletes; body image; behavior regulation
Online: 21 June 2019 (11:36:23 CEST)
The aim of this research was to determine the risk of exercise dependence in individual-sport athletes and its relationship with body dissatisfaction and motivation. A total of 225 triathletes, swimmers, cyclists and athletes aged 18 to 63 years took part in the research, of whom 145 were men (M = 35.57 ±10.46 years) and 80 were women (M = 32.83 ±10.31 years). The EDS-R was used to study exercise dependence, the BSQ to study body dissatisfaction, the BREQ-3 to assess participants' motivation and the BIAQ to analyse body image avoidance behaviours. The results show that 8.5% of the subjects were at risk of exercise dependence and that 18.2% tended toward body dissatisfaction, with no significant differences by type of sport practiced. However, there were significant differences by sex in exercise dependence (15% vs 4.8%) and body dissatisfaction (31.1% vs 11%), with the higher percentages corresponding to women. Introjected regulation and food restriction behaviour were the predictor variables of exercise dependence and body dissatisfaction.
REVIEW | doi:10.20944/preprints201903.0095.v1
Subject: Life Sciences, Biophysics Keywords: Striated Muscle, image reconstruction, muscle physiology
Online: 7 March 2019 (12:42:36 CET)
Much has been learned about the interaction between myosin and actin through biochemistry, in vitro motility assays and cryo-electron microscopy of F-actin decorated with myosin heads. Comparatively less is known about actin-myosin interactions within the filament lattice of muscle, where myosin heads function as independent force generators and thus most measurements report an average signal from multiple biochemical and mechanical states. All of the 3-D imaging by electron microscopy that has revealed the interplay of the regular array of actin subunits and myosin heads within the filament lattice has been accomplished using the flight muscle of the large waterbug Lethocerus sp. Lethocerus flight muscle possesses a particularly favorable filament arrangement that enables all the myosin cross-bridges contacting the actin filament to be visualized in a thin section. This review covers the history of this effort and the progress toward visualizing the complex set of conformational changes that myosin heads make when binding to actin in several static states as well as fast frozen actively contracting muscle. The efforts have revealed a consistent pattern of changes to the myosin head structures determined by X-ray crystallography needed to explain the structure of the different acto-myosin interactions observed in situ.
ARTICLE | doi:10.20944/preprints201811.0028.v1
Subject: Social Sciences, Business And Administrative Sciences Keywords: ISO; social responsibility; image; profitability; SMEs
Online: 2 November 2018 (06:53:35 CET)
At present, business strategies in SMEs (small and medium enterprises) are crucial for consolidation in highly competitive markets, for achieving a better image and for business profitability. Among the most successful strategies are sustainable practices and social responsibility standards such as ISO 14001 and ISO 26001. The literature on sustainable business is based mainly on the theory of resources and capabilities and on stakeholder theory. These currents state that companies should focus on profitable strategies to ensure significant, long-term organizational and financial results for stakeholders. In this work, the sample consists of 215 companies from the commerce, services and industry sectors, located in the southern region of the State of Sonora in Mexico. The objective of the work is to analyze the influence of the ISO 14001 and 26001 standards on the image and profitability of SMEs. The statistical analysis of the data was carried out through linear regression by OLS (Ordinary Least Squares). The findings indicate that the ISO 14001 standard is the one that most influences the improvement of the business image and the level of profitability of the SME. In addition, we found that ISO 26001 has a partial influence on the image and profitability of the SME.
ARTICLE | doi:10.20944/preprints201810.0305.v1
Online: 15 October 2018 (11:49:29 CEST)
As the demand for a more sustainable society increases, adopting a sustainable banking approach serves as a competitive advantage for banks that are focused on attaining bank loyalty. This study examines the role of sustainable banking practices in bank loyalty, while exploring the mediating effect of corporate image in the relationship between sustainable banking practices and bank loyalty. Data from 511 customers of the banking sector were used for this study. Results from structural equation modeling show that sustainable banking practices positively and directly affect bank loyalty and corporate image, and that corporate image directly and positively affects bank loyalty and also mediates the relationship between sustainable banking practices and bank loyalty.
ARTICLE | doi:10.20944/preprints201802.0103.v1
Online: 15 February 2018 (16:49:55 CET)
An effective on-board cloud detection method in small satellites would greatly improve the downlink data transmission efficiency and reduce the memory cost. In this paper, an ensemble method combining a lightweight U-Net with wavelet image compression is proposed and evaluated. The red, green, blue and infrared waveband images from the Landsat-8 dataset are used for training and testing to estimate the performance of the proposed method. The LeGall-5/3 wavelet transform is applied to the dataset to accelerate the neural network and improve the feasibility of on-board implementation. The experimental results show that the overall accuracy of the proposed model reaches 97.45% using only four bands. Tests on low coefficients of the compressed dataset have shown that the overall accuracy of the proposed method is still higher than 95%, while its inference speed is accelerated to 0.055 second per million pixels and maximum memory cost reduces to 2Mb. By taking advantage of the mature image compression systems in small satellites, the proposed method offers a promising route to on-board cloud detection based on deep learning.
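As an illustration of the LeGall-5/3 transform mentioned in the abstract above, the following is a minimal sketch of one level of the reversible integer lifting scheme on a 1-D signal. It is not the authors' implementation: the function names, the even-length restriction and the symmetric boundary handling are assumptions made for this example, and a 2-D image transform would apply the same steps along rows and then columns.

```python
def legall53_forward(x):
    """One level of the reversible LeGall 5/3 lifting transform.

    x is an even-length list of integers. Returns (s, d): the low-pass
    (approximation) and high-pass (detail) coefficients.
    """
    n = len(x)
    if n < 2 or n % 2 != 0:
        raise ValueError("this sketch expects an even-length signal")
    half = n // 2
    # Predict step: each odd sample minus the floor-average of its even neighbours.
    d = []
    for i in range(half):
        left = x[2 * i]
        # Symmetric extension at the right edge: x[n] mirrors to x[n - 2].
        right = x[2 * i + 2] if 2 * i + 2 < n else x[n - 2]
        d.append(x[2 * i + 1] - (left + right) // 2)
    # Update step: each even sample plus a rounded quarter of the adjacent details.
    s = []
    for i in range(half):
        d_prev = d[i - 1] if i > 0 else d[0]  # symmetric extension at the left edge
        s.append(x[2 * i] + (d_prev + d[i] + 2) // 4)
    return s, d


def legall53_inverse(s, d):
    """Undo legall53_forward exactly (integer-to-integer, lossless)."""
    half = len(s)
    n = 2 * half
    x = [0] * n
    # Undo the update step to recover the even samples.
    for i in range(half):
        d_prev = d[i - 1] if i > 0 else d[0]
        x[2 * i] = s[i] - (d_prev + d[i] + 2) // 4
    # Undo the predict step to recover the odd samples.
    for i in range(half):
        left = x[2 * i]
        right = x[2 * i + 2] if 2 * i + 2 < n else x[n - 2]
        x[2 * i + 1] = d[i] + (left + right) // 2
    return x
```

Keeping only the low-pass output `s` halves the number of samples per level, which is one plausible reading of how working on low-frequency coefficients reduces the input size of a downstream network.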
ARTICLE | doi:10.20944/preprints202105.0605.v1
Subject: Mathematics & Computer Science, Algebra & Number Theory Keywords: deep learning; computed tomography; image classification; COVID-19; medical image analysis; pneumonia; CNN, LSTM, medical diagnosis
Online: 25 May 2021 (10:32:29 CEST)
Advancements in deep learning and availability of medical imaging data have led to use of CNN based architectures in disease diagnostic assisted systems. In spite of the abundant use of reverse transcription-polymerase chain reaction (RT-PCR) based tests in COVID-19 diagnosis, CT images offer an applicable supplement with its high sensitivity rates. Here, we study classification of COVID-19 pneumonia (CP) and non-COVID-19 pneumonia (NCP) in chest CT scans using efficient deep learning methods to be readily implemented by any hospital. We report our deep network framework design that encompasses Convolutional Neural Networks (CNNs) and bidirectional Long Short Term Memory (biLSTM) architectures. Our study achieved high specificity (CP: 98.3%, NCP: 96.2% Healthy: 89.3%) and high sensitivity (CP: 84.0%, NCP: 93.9% Healthy: 94.9%) in classifying COVID-19 pneumonia, non-COVID-19 pneumonia and healthy patients. Next, we provide visual explanations for the CNN predictions with gradient-weighted class activation mapping (Grad-CAM). The results provided a model explainability by showing that Ground Glass Opacities (GGO), indicators of COVID-19 pneumonia disease, were captured by our CNN network. Finally, we have implemented our approach in three hospitals proving its compatibility and efficiency.
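The per-class sensitivity and specificity figures quoted above are the kind of values obtained from a multi-class confusion matrix by treating each class in turn as "positive" and the rest as "negative". The sketch below shows that computation; the function name and the matrix layout (rows are true classes, columns are predicted classes) are assumptions for this example, and any numbers passed to it are the caller's own, not the study's data.

```python
def sensitivity_specificity(confusion, labels):
    """Per-class (sensitivity, specificity) from a square confusion matrix.

    confusion[i][j] is the number of samples whose true class is i and
    whose predicted class is j. Returns {label: (sensitivity, specificity)}.
    """
    total = sum(sum(row) for row in confusion)
    result = {}
    for k, name in enumerate(labels):
        tp = confusion[k][k]                          # class k correctly predicted
        fn = sum(confusion[k]) - tp                   # class k predicted as something else
        fp = sum(row[k] for row in confusion) - tp    # other classes predicted as k
        tn = total - tp - fn - fp                     # everything else
        result[name] = (tp / (tp + fn), tn / (tn + fp))
    return result
```

With three classes, a class can show high specificity and lower sensitivity at the same time, because the two quantities are computed over different denominators.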
ARTICLE | doi:10.20944/preprints202104.0318.v1
Subject: Keywords: Kerr frequency comb; Hilbert transform; integrated optics; all-optical signal processing; image processing; video image processing
Online: 12 April 2021 (14:27:20 CEST)
Advanced image processing will be crucial for emerging technologies such as autonomous driving, where the requirement to quickly recognize and classify objects under rapidly changing, poor visibility environments in real time will be needed. Photonic technologies will be key for next-generation signal and information processing, due to their wide bandwidths of 10’s of Terahertz and versatility. Here, we demonstrate broadband real time analog image and video processing with an ultrahigh bandwidth photonic processor that is highly versatile and reconfigurable. It is capable of massively parallel processing over 10,000 video signals simultaneously in real time, performing key functions needed for object recognition, such as edge enhancement and detection. Our system, based on a soliton crystal Kerr optical micro-comb with a 49GHz spacing with >90 wavelengths in the C-band, is highly versatile, performing different functions without changing the physical hardware. These results highlight the potential for photonic processing based on Kerr microcombs for chip-scale fully programmable high-speed real time video processing for next generation technologies.
ARTICLE | doi:10.20944/preprints201710.0187.v1
Subject: Mathematics & Computer Science, Analysis Keywords: medical image classification; local binary patterns; characteristic curves; whole slide image processing; automated HER2 scoring
Online: 31 October 2017 (03:10:22 CET)
This paper presents novel feature descriptors and classification algorithms for automated scoring of HER2 in Whole Slide Images (WSI) of breast cancer histology slides. Since a large amount of processing is involved in analyzing WSI images, the primary design goal has been to keep the computational complexity to the minimum possible level and to use simple, yet robust feature descriptors that can provide accurate classification of the slides. We propose two types of feature descriptors that encode important information about staining patterns and the percentage of staining present in ImmunoHistoChemistry (IHC) stained slides. The first descriptor is called a characteristic curve which is a smooth non-increasing curve that represents the variation of percentage of staining with saturation levels. The second new descriptor introduced in this paper is an LBP feature curve which is also a non-increasing smooth curve that represents the local texture of the staining patterns. Both descriptors show excellent interclass variance and intraclass correlation, and are suitable for the design of automatic HER2 classification algorithms. This paper gives the detailed theoretical aspects of the feature descriptors and also provides experimental results and comparative analysis.
ARTICLE | doi:10.20944/preprints201710.0181.v1
Subject: Mathematics & Computer Science, Analysis Keywords: ultrasound image analysis; speckle noise; synthetic ultrasound images; texture features; local binary patterns; image quality assessment
Online: 30 October 2017 (09:37:59 CET)
Speckle noise reduction is an important area of research in the field of ultrasound image processing. Several algorithms for speckle noise characterization and analysis have been recently proposed in the area. Synthetic ultrasound images can play a key role in noise evaluation methods as they can be used to generate a variety of speckle noise models under different interpolation and sampling schemes, and can also provide valuable ground truth data for estimating the accuracy of the chosen methods. However, not much work has been done in the area of modelling synthetic ultrasound images, and in simulating speckle noise generation to get images that are as close as possible to real ultrasound images. An important aspect of simulated synthetic ultrasound images is the requirement for extensive quality assessment for ensuring that they have the texture characteristics and gray-tone features of real images. This paper presents texture feature analysis of synthetic ultrasound images using local binary patterns (LBP) and demonstrates the usefulness of a set of LBP features for image quality assessment. Experimental results presented in the paper clearly show how these features could provide an accurate quality metric that correlates very well with subjective evaluations performed by clinical experts.
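For readers unfamiliar with local binary patterns, the sketch below computes the basic 8-neighbour LBP code for each interior pixel of a grayscale image and collects the codes in a 256-bin histogram. This is the textbook 3x3 operator, offered only as an illustration; the particular LBP feature set used in the paper is not specified in the abstract, and the function names and neighbour ordering are assumptions.

```python
def lbp_code(img, r, c):
    """Basic 3x3 LBP code of pixel (r, c) in a 2-D list of gray levels.

    Each of the 8 neighbours contributes one bit, set when the neighbour
    is greater than or equal to the centre pixel.
    """
    center = img[r][c]
    # Clockwise neighbour order starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist
```

Two textures can then be compared through a distance between their histograms, which is one common way such features feed into a quality or similarity score.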
ARTICLE | doi:10.20944/preprints202206.0384.v1
Subject: Engineering, Biomedical & Chemical Engineering Keywords: Deep Learning; Smartphone Image; Acne Grading; Acne Object Detection
Online: 28 June 2022 (10:05:25 CEST)
Skin image analysis using artificial intelligence (AI) has recently attracted significant research interest, particularly for analyzing skin images captured by mobile devices. Acne is one of the most common skin conditions with profound effects in severe cases. In this study, we developed an AI system called AcneDet for automatic acne object detection and acne severity grading using facial images captured by smartphones. AcneDet includes two models for conducting two tasks: (1) a Faster R-CNN-based deep learning model for the detection of acne lesion objects of four types including blackheads/whiteheads, papules/pustules, nodules/cysts, and acne scars; and (2) a LightGBM machine learning model for grading acne severity using the Investigator’s Global Assessment (IGA) scale. The output of the Faster R-CNN model, i.e., the counts of each acne type, were used as input for the LightGBM model for acne severity grading. A dataset consisting of 1,572 labeled facial images captured by both iOS and Android smartphones was used for training. The results show that the Faster R-CNN model achieves a mAP of 0.54 for acne object detection. The mean accuracy of acne severity grading by the LightGBM model is 0.85. With this study, we hope to contribute to the development of artificial intelligent systems that are able to help acne patients understand more about their conditions and support doctors in acne diagnosis.
ARTICLE | doi:10.20944/preprints201812.0137.v2
Subject: Life Sciences, Other Keywords: microscopy, fluorescence, machine learning, deep learning, inverse problems, image reconstruction, image restoration, super-resolution, deconvolution, spectral unmixing
Online: 5 February 2019 (10:30:40 CET)
Deep Learning is a recent and important addition to the computational toolbox available for image reconstruction in fluorescence microscopy. We review state-of-the-art applications such as image restoration, super-resolution, and light-field imaging, and discuss how the latest Deep Learning research can be applied to other image reconstruction tasks such as structured illumination, spectral deconvolution, and sample stabilisation. Despite its successes, Deep Learning also poses significant challenges, has often misunderstood capabilities, and overlooked limits. We will address key questions, such as: What are the challenges in obtaining training data? Can we discover structures not present in the training data? And, what is the danger of inferring unsubstantiated image details?
ARTICLE | doi:10.20944/preprints201709.0098.v2
Subject: Earth Sciences, Geoinformatics Keywords: farming-pasture ecotone; TM image; remote sensing; vegetation cover factor; scale conversion; land use; high resolution image
Online: 21 September 2017 (16:33:49 CEST)
The key to simulating soil erosion is to calculate the vegetation cover (C) factor. Methods that apply remote sensing to calculate C factor at regional scale cannot directly use the C factor formula. That is because the C factor formula is obtained by experiment, and needs the coverage ratio data of croplands, woodlands and grasslands at standard plot scale. In this paper, we present a C factor conversion method from a standard plot to a km-sized grid based on large sample theory and multi-scale remote sensing. Results show that: 1) Compared with the existing C factor formula, our method is based on the coverage ratio of croplands, woodlands and grasslands on a km-sized grid, takes the C factor formula obtained from the standard plot experiment and applies it to regional scale. This method improves the applicability of the C factor formula, and can satisfy the need to simulate soil erosion in large areas. 2) The vegetation coverage obtained by remote sensing interpretation is significantly consistent (paired samples t-test, t = −0.03, df = 0.12, 2-tail significance p < 0.05) and significantly correlated with the measured vegetation coverage. 3) The C factor of the study area is smaller in the middle, southern and northern regions, and larger in the eastern and western regions. The main reason for that is the distribution of woodlands, the Hunshandake and Horqin sandy lands and the valleys affected by human activities. 4) The method presented in this paper is more meticulous than the C factor method based on the vegetation index, improves the applicability of the C factor formula, and can be used to simulate soil erosion on large scale and provide strong support for regional soil and water conservation planning.
REVIEW | doi:10.20944/preprints202205.0343.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: Depth Completion; Depth Maps; Image-Guidance; Lidar
Online: 25 May 2022 (05:26:16 CEST)
Depth maps produced by LiDAR-based approaches are sparse. Even high-end LiDAR sensors produce highly sparse depth maps, which are also noisy around object boundaries. Depth completion is the task of generating a dense depth map from a sparse depth map. While traditional approaches focus on completing the depth directly from the sparse depth maps, modern techniques use RGB images as guidance to resolve this problem, and many others rely on affinity matrices for depth completion. Based on these approaches, we have sub-divided the literature into two major categories: traditional approaches and backbone-based approaches. The latter is further sub-divided into two-branch and spatial propagation approaches. The two-branch approaches in turn have a sub-category named guided-kernel approaches. In this paper, we present what is, to our knowledge, the first comprehensive survey of depth completion methods. We present a novel taxonomy of depth completion approaches, review and detail different state-of-the-art techniques within each category for depth completion of LiDAR data, and provide quantitative results for the approaches on the KITTI and NYUv2 depth completion benchmark datasets.
REVIEW | doi:10.20944/preprints202201.0016.v2
Online: 4 February 2022 (13:40:05 CET)
Video editing is a demanding job, for it requires skilled artists or workers equipped with plentiful physical strength and multidisciplinary knowledge, such as cinematography and aesthetics. As a result, more and more research focuses on proposing semi-automatic and even fully automatic solutions to reduce workloads. Since conventional methods are usually designed to follow some simple guidelines, they lack the flexibility and capability to learn complex ones. Fortunately, advances in computer vision and machine learning make up for the shortcomings of traditional approaches and make AI editing feasible. There is no survey covering this emerging research yet. This paper summarizes the development history of automatic video editing, and especially the applications of AI in partial and full workflows. We emphasize video editing and discuss related works from multiple aspects: modality, type of input videos, methodology, optimization, dataset, and evaluation metric. In addition, we summarize progress in the image editing domain, i.e., style transfer, retargeting, and colorization, and consider the possibility of transferring those techniques to the video domain. Finally, we give a brief conclusion to this survey and explore some open problems.
CONCEPT PAPER | doi:10.20944/preprints202109.0024.v1
Online: 1 September 2021 (14:32:47 CEST)
Microscopes based on dielectric mesoscale particles, using the effect of a photonic jet or terajet in the terahertz range, are a promising tool for overcoming the diffraction limit. However, the image they generate has limited contrast, which limits the application of this method. In this letter, we demonstrate that it is possible to increase the contrast of an image based on dielectric mesoscale particles that provide the formation of photonic hooks. In this case, the illumination of the object is carried out by an oblique incidence of subwavelength terajet, which significantly (more than 2 times) increases the contrast of the image.
Subject: Engineering, Automotive Engineering Keywords: forest fire; image recognition; graph neural network;
Online: 13 July 2021 (11:31:18 CEST)
Forest fire identification is important for forest resource protection. Effective monitoring of forest fires requires the deployment of multiple monitors with different viewpoints, while most traditional recognition models can only recognize images from a single source. By ignoring the information from images with different viewpoints, these models produce high rates of missed and false alarms. In this paper, we propose a graph neural network model based on the similarity of dynamic features of multi-view images to improve the accuracy of forest fire recognition. The input features of the nodes on the graph are converted into relational features of different gallery pairs by establishing pairs (nodes) representing different viewpoint images and gallery images. The new feature library relationship is used to update the image gallery with dynamic features in order to achieve the estimation of similarity between images and improve the image recognition rate of the model. In addition, to reduce the complexity of image pre-processing process and extract key features in images effectively, this paper also proposes a dynamic feature extraction method for fire regions based on image segment ability. By setting the threshold value of HSV color space, the fire region is segmented from the image, and the dynamic features of successive frames of the fire region are extracted. The experimental results show that, compared with the baseline method Resnet, this paper's method is more effective in identifying forest fires, and its recognition accuracy is improved by 2%. And the scheme of this paper can adapt to different forest fire scenes, with better generalization ability and anti-interference ability.
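The HSV thresholding step described above can be sketched as follows. The abstract does not give the threshold values, so `h_max`, `s_min` and `v_min` below are illustrative placeholders (roughly "red to yellow, saturated, bright"), and `fire_mask` is a hypothetical name, not a function from the paper.

```python
import colorsys


def fire_mask(rgb_image, h_max=60 / 360, s_min=0.4, v_min=0.6):
    """Boolean mask of candidate fire pixels by thresholding in HSV space.

    rgb_image is a 2-D list of (r, g, b) tuples with components in 0..255.
    The default thresholds are assumptions for illustration only.
    """
    mask = []
    for row in rgb_image:
        mask_row = []
        for r, g, b in row:
            # colorsys works on floats in [0, 1] and returns h, s, v in [0, 1].
            h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            mask_row.append(h <= h_max and s >= s_min and v >= v_min)
        mask.append(mask_row)
    return mask
```

Applying the mask to successive frames yields a region whose frame-to-frame changes (area, position) could serve as dynamic features; how the paper defines those features is not stated in the abstract.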
ARTICLE | doi:10.20944/preprints202106.0730.v1
Subject: Social Sciences, Accounting Keywords: tourist destination; image; promotion; experience; Bihor; Romania
Online: 30 June 2021 (11:49:13 CEST)
The concept of destination image is closely related to the brand image of the destination. A good image is a step in branding the destination. The image of the destination can be a primary, secondary or global one, the latter incorporating the first two. The sustainability of a positive image of the destination is based on both a positive secondary image and a positive global image. The purpose of this research is to analyze separately the two types of images for a given tourist destination that has registered in recent years a remarkable increase in the number of visitors. The research is based on a questionnaire-based survey of a sample of 607 people. The collected data were processed with SPSS and the results show significant differences between the two types of images (secondary image and global image), a dangerous situation in the medium and long term for destination management. The nuances in the perception of the image of the destination on the two types of respondents (who experienced respectively who did not experience the destination) can be explained by the aggressive strategy of promoting the tourist destination, but inefficient strategy for younger age groups. The study allows the formulation of conclusions and measures to correct the situation.
ARTICLE | doi:10.20944/preprints202104.0495.v1
Online: 19 April 2021 (14:17:10 CEST)
The genetic development of commercial broilers has led to body misconfiguration and consequent walking disabilities, mainly at slaughter age. The present study aimed to automatically identify broiler locomotion ability using image analysis. A total of 40 broilers, 40 days old, were placed to walk on a specially built runway, and their locomotion was recorded. An image segmentation algorithm was developed, the coordinates of the bird's center of mass were extracted from the segmented images for each frame analyzed, and the Unrest Index (UI) was applied. We calculated the movement of the center of mass from lateral images of the walking broilers, thereby capturing the bird's displacement speed in the forward direction. Results indicated that broiler speed on the runway tends to decrease as the gait score increases. Locomotion did not differ between males and females. The proposed algorithm was efficient in predicting the broiler gait score based on displacement speed.
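The center-of-mass tracking described above can be illustrated with a short sketch: compute the centroid of each segmented (binary) frame and convert the horizontal displacement between the first and last frame into a speed. The function names, the `fps` and `pixels_per_cm` calibration parameters and the choice of column index as the forward direction are assumptions for this example; the Unrest Index itself is not reproduced here.

```python
def center_of_mass(mask):
    """Centroid (row, col) of the truthy pixels in a 2-D binary mask."""
    count = sum_r = sum_c = 0
    for r, row in enumerate(mask):
        for c, value in enumerate(row):
            if value:
                count += 1
                sum_r += r
                sum_c += c
    if count == 0:
        raise ValueError("mask contains no foreground pixels")
    return sum_r / count, sum_c / count


def mean_forward_speed(masks, fps, pixels_per_cm):
    """Mean speed (cm/s) along the column axis across a sequence of masks.

    masks: one binary mask per video frame, in time order.
    fps: frames per second of the recording (hypothetical calibration input).
    pixels_per_cm: image scale on the runway (hypothetical calibration input).
    """
    if len(masks) < 2:
        raise ValueError("need at least two frames")
    first_col = center_of_mass(masks[0])[1]
    last_col = center_of_mass(masks[-1])[1]
    distance_cm = (last_col - first_col) / pixels_per_cm
    elapsed_s = (len(masks) - 1) / fps
    return distance_cm / elapsed_s
```

For example, a bird whose centroid moves 4 pixels over two frame intervals at 2 frames per second and 2 pixels per centimetre has a mean forward speed of 2 cm/s.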
Subject: Biology, Animal Sciences & Zoology Keywords: intramuscular fat; prediction; image analysis; Bísaro pork
Online: 13 January 2021 (13:16:19 CET)
This work presents an analytical methodology to predict meat juiciness (discriminant semi-quantitative analysis using groups of intervals of intramuscular fat) and intramuscular fat (regression analysis) in Longissimus thoracis et lumborum (LTL) muscle of Bísaro pigs, using as independent variables the animal carcass weight and parameters from color and image analysis. These are non-invasive and non-destructive techniques which allow development of rapid, easy and inexpensive methodologies to evaluate pork meat quality in a slaughterhouse. The proposed predictive supervised multivariate models were non-linear. Discriminant mixture analysis was used to evaluate meat juiciness by classifying samples into three groups: 0.6 to 1.1%; 1.25 to 1.5%; and greater than 1.5%. The obtained model allowed 100% correct classifications (92% in seven-fold cross-validation with five repetitions). Polynomial support vector machine regression to determine the intramuscular fat presented R2 and RMSE values of 0.88 and 0.12, respectively, in seven-fold cross-validation with five repetitions. This quantitative model (polynomial kernel optimized to a degree of three with a scale factor of 0.1 and a cost value of one) presented R2 and RSE values of 0.999 and 0.04, respectively. The overall predictive results demonstrated the relevance of photographic image and color measurements of the muscle to evaluate the intramuscular fat, rather than the usual time-consuming and expensive chemical analysis.
ARTICLE | doi:10.20944/preprints202011.0530.v1
Subject: Biology, Anatomy & Morphology Keywords: rye; image analysis; grain color; anthocyanins; proanthocyanidins
Online: 20 November 2020 (09:34:01 CET)
In rye, there is a considerable variety of grain color which is determined by the diversity of compounds localized in different parts of the grain (caryopsis) - pericarp, testa, and aleurone. The localization of anthocyanins and proanthocyanidins was analyzed in 26 rye samples with identified anthocyanin genes, along with the analysis of CIE color coordinates. The Grain Scan program  was used to analyze images of individual grains. The localization of anthocyanins and proanthocyanidins was studied on longitudinal and cross sections of grains using light microscopy and MALDI-imaging. The violet-grained samples contain anthocyanins in the pericarp, and the green-grained samples contain anthocyanins in the aleurone layer. The green, violet and yellow-grained rye, with the exception of two anthocyaninless mutants vi3 and vi6, shows the presence of proanthocyanidins in the brown-colored testa. Four main color groups of the rye grains (yellow, green, brown, violet) could be differentiated using the color coordinate h° (hue angle). Interspecies and intraspecies variability for the localization of colored flavonoids in cereal grains is discussed.
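The hue angle h° used above to separate the color groups is, in the CIE L*a*b* system, the angle of the (a*, b*) point measured counter-clockwise from the positive a* axis. A minimal sketch of that standard definition follows; the function name is an assumption, and the group boundaries used in the study are not given in the abstract, so none are encoded here.

```python
import math


def hue_angle(a_star, b_star):
    """CIE hue angle h° in degrees, in [0, 360), from CIELAB a* and b*."""
    # atan2 handles all four quadrants; the modulo maps negatives into [0, 360).
    return math.degrees(math.atan2(b_star, a_star)) % 360.0
```

Under this definition, positive a* with b* near zero gives angles near 0° (reddish), positive b* with a* near zero gives angles near 90° (yellowish), and negative a* gives angles near 180° (greenish).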
Subject: Behavioral Sciences, Cognitive & Experimental Psychology Keywords: amplitude spectrum; image statistics; complexity; aesthetics; phase
Online: 23 October 2020 (20:47:00 CEST)
Within the spectrum of a natural image, the amplitude of modulation decreases with spatial frequency. The speed of such an amplitude decrease, or the amplitude spectrum slope, of an image affects the perceived aesthetic value. Additionally, a human observer would consider a symmetric image more appealing than they do an asymmetric one. We investigated how these two factors jointly affect aesthetic preferences by manipulating both the amplitude spectrum slope and the symmetric level of images to assess their effects on aesthetic preference on a 6-point Likert scale. Our results showed that the preference ratings increased with the symmetry level but had an inverted U-shape relation to amplitude spectrum slope. In addition, a strong interaction existed between symmetry level and amplitude spectrum slope on preference rating, in that symmetry can amplify the amplitude spectrum slope’s effects. Such effects can be described by a quadratic function of the spectrum slope. That is, preference is an inverted U-shape function of spectrum slope whose intercept is determined by the number of symmetry axis. In addition, the interaction between the two factors is manifested as the modulation depth of the quadratic function.
ARTICLE | doi:10.20944/preprints202010.0122.v1
Subject: Mathematics & Computer Science, Algebra & Number Theory Keywords: symmetry; symmetry detection techniques; image processing; matlab.
Online: 6 October 2020 (11:10:24 CEST)
With the widespread use of mobile phones, mobile applications have gained a large place in different areas of life. Among the reasons users prefer one app over another in the rapidly growing application markets of the different platforms, the aesthetic appearance of an app's icon is perhaps the most important. This study focuses on the visual symmetry of the icons of the most downloaded apps developed for preparing for the Public Servant Exam in Turkey. Two types of data were obtained: one by applying image processing techniques to the icons in MathWorks MATLAB, and one from a survey of Korkut Ata University students. By comparing the obtained data with the binary logistic regression method, it was determined that the visual symmetry in the apps' icons partially contributed to the aesthetic appreciation.
ARTICLE | doi:10.20944/preprints201909.0232.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: image similarity; SMI; SMI temp index; PSMI
Online: 20 September 2019 (05:32:21 CEST)
Online social networking techniques and large-scale multimedia retrieval are developing rapidly, which has not only brought great convenience to our daily life but also generated, collected, and stored large-scale multimedia data. This trend places higher requirements and greater challenges on massive multimedia retrieval. In this paper, we investigate the problem of image similarity measurement, which is one of the key problems of multimedia retrieval. Firstly, the definition of similarity measurement of images and the related notions are proposed. Then, an efficient similarity measurement framework is proposed. Besides, we present a novel basic method of similarity measurement named SMIN. To improve the performance of similarity measurement, we carefully design a novel indexing structure called SMI Temp Index (SMII for short). Moreover, we establish an index of potential similar visual words off-line to solve the problem that the index cannot be reused. Experimental evaluations on two real image datasets demonstrate that the proposed approach outperforms the state of the art.
ARTICLE | doi:10.20944/preprints201906.0105.v1
Subject: Biology, Plant Sciences Keywords: image analysis; machine learning; algorithms; computer vision
Online: 12 June 2019 (12:39:18 CEST)
Spike shape and morphometric characteristics are among the key characteristics of cultivated cereals associated with their productivity. Identification of the genes controlling these traits requires morphometric data at harvesting and analysis of numerous plants, which could be automatically done using technologies of digital image analysis. A method for wheat spike morphometry utilizing 2D image analysis is proposed. Digital images are acquired in two variants: a spike on a table (one projection) or fixed with a clip (four projections). The method identifies spike and awns in the image and estimates their quantitative characteristics (area in image, length, width, circularity, etc.). Section model, quadrilaterals, and radial model are proposed for describing spike shape. Parameters of these models are used to predict spike shape type (spelt, normal, or compact) by machine learning. The mean error in spike density prediction for the images in one projection is 4.61 (~18%) versus 3.33 (~13%) for the parameters obtained using four projections.
ARTICLE | doi:10.20944/preprints201905.0308.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Colour Model; Steganography; Medical Image; C4S; distortion
Online: 27 May 2019 (10:13:02 CEST)
Visible light photography diagnostic images are coloured ex vivo medical images popularly used in Dermatology and Endoscopy for diagnosis and monitoring. The need to protect the integrity of these images as well as associated patient data calls for techniques such as image steganography and watermarking. This research explores and compares the effect of watermarking on the YIQ and YCbCr colour transforms used in processing digital coloured images and video in recent times. Using a new spread spectrum watermarking algorithm, it was found that YIQ has better distortion performance than YCbCr in the order of 3 dB, while YCbCr had lower BER for accurate watermark retrieval and tamper detection in the order of 1.3 × 10⁻³.
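The two quantities compared above, distortion in dB and bit error rate (BER), can be sketched as below. The abstract does not name the distortion measure, so the use of peak signal-to-noise ratio (PSNR) here is an assumption, as are the function names and the flat-list representation of pixels and watermark bits.

```python
import math


def psnr(original, distorted, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if len(original) != len(distorted) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(original, distorted)) / len(original)
    if mse == 0:
        return math.inf  # identical images: no distortion
    return 10 * math.log10(peak ** 2 / mse)


def bit_error_rate(embedded_bits, recovered_bits):
    """Fraction of watermark bits that differ after extraction."""
    if len(embedded_bits) != len(recovered_bits) or not embedded_bits:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = sum(1 for a, b in zip(embedded_bits, recovered_bits) if a != b)
    return errors / len(embedded_bits)
```

On this reading, a 3 dB PSNR advantage corresponds to roughly half the mean squared error, since 10·log10(2) ≈ 3.01.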
Online: 27 March 2019 (08:44:59 CET)
This study was designed to assess the reliability of geographic information using satellite image information in ungauged basins. For this, this study constructed geographic information using actual gauged data and satellite information data and conducted runoff analysis through S-RAT, a rainfall–runoff model, and performed the comparison and analysis of geographic information and runoff data. For actual gauged data, the gauged geographic information of the Water Resources Management Information System (WAMIS) was collected, and for satellite information, the image information of moderate-resolution imaging spectroradiometer (MODIS) observation sensor loaded on Terra Satellite was collected. As analysis areas, three basins where mountains occupy more than 80% and another three basins where urban areas occupy more than 7% in the Han River basin were selected. According to the analysis result, the gauged information and satellite image information showed great difference in runoff, maximum 50% in peak flood and maximum 17% in total flood, in the rivers with many urban areas, while the runoff difference in the rivers with many mountains showed maximum 13% in peak flood and 4% in total flood. What showed the greatest difference in image information was land use, and it turned out that the MODIS satellite recognized the urban rivers as cities for more than maximum 60% compared to WAMIS-gauged data. Meanwhile, in the forest area, the MODIS satellite image showed error of less than 5% of the WAMIS-gauged data, which indicates that it has higher applicability in Mountain Rivers.
ARTICLE | doi:10.20944/preprints201804.0206.v1
Subject: Medicine & Pharmacology, Nutrition Keywords: body self-image; adolescent; anthropometry; nutritional status
Online: 16 April 2018 (10:51:45 CEST)
The critical changes in physical appearance during adolescence can considerably influence the self-appraisal of body image. The purpose of this study is to analyze gender differences in body self-image in Mediterranean adolescents, and their relationship to the anthropometric characteristics of this population in different phases of adolescence. Participants were 809 Mediterranean teenagers (396 females) aged 11 to 17. A relatively low prevalence of dissatisfaction with body image was observed among healthy urban Mediterranean adolescents (boys 17.3%; girls 22.7%). Girls showed statistically significantly higher mean BSQ scores than boys (M = 61.7, SD = 26.6 versus M = 56.3, SD = 27.1; p < 0.001). Girls in late adolescence were more often classified as being dissatisfied (31%) than those in the early adolescent group (19.1%; p < 0.05). There was a good correlation of BSQ scores with all the anthropometric variables in males but not in females.
ARTICLE | doi:10.20944/preprints201704.0174.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Hierarchical search; Image retrieval; Multi-feature fusion
Online: 26 April 2017 (18:51:42 CEST)
To address the poor generalization performance, low retrieval accuracy and large time consumption of existing content-based image retrieval systems, this paper proposes a hierarchical image retrieval method based on multi-feature fusion. The retrieval accuracy on Corel5K, UKbench and Holidays is 68.23 (Top-1), 3.73 (N-S) and 88.20 (mAP), respectively. The experimental results show that the proposed method can effectively overcome the deficiencies of single-feature retrieval and save significant time at the cost of a small loss of accuracy.
ARTICLE | doi:10.20944/preprints201608.0106.v1
Subject: Behavioral Sciences, Applied Psychology Keywords: packaging; beer; image mold; packaging weight; taste
Online: 10 August 2016 (09:04:27 CEST)
People often say that beer tastes better from a bottle than from a can. However, one can ask whether this perceived difference is reliable across consumers; and, if so, whether it is purely a psychological phenomenon (associated with the influence of packaging on taste perception), or whether instead it reflects some more mundane physico-chemical interaction between the packaging material (or packing procedure/process) and the contents. We conducted two experiments in order to address these important questions. In the main experiment, 151 participants at the 2016 Edinburgh Science Festival were served a beer in a plastic cup. The beer was either poured from a bottle or can (i.e., a between-participants experimental design was used) and the participants were encouraged to pick up the packaging in order to inspect the label before tasting the beer. The participants rated the perceived taste, quality, and freshness of the beer, as well as their likelihood of purchase, and their estimate of the price. All of the beer came from the same batch (from Barney’s Brewery in Edinburgh). Nevertheless, those who evaluated the bottled beer rated it as tasting better than those who rated the beer that had been served from a can. Having demonstrated such a perceptual difference in terms of taste, we then went on to investigate whether people would prefer one packaging format over the other when the beer from bottle and can was served to a new group of participants blind (i.e., when the participants did not know the packaging material). The participants in this control study (N = 29) were asked which beer they preferred or else could state that the two samples tasted the same. No sign of preference was obtained under such conditions.
Explanations for the psychological impact of the packaging format, in terms of differences in packaging weight (between tin and glass), and/or prior associations of quality with specific packaging materials/formats (what some have chosen to call ‘image molds’) are discussed.
ARTICLE | doi:10.20944/preprints201810.0343.v1
Subject: Engineering, Control & Systems Engineering Keywords: unmanned aircraft (UAV); sensing; intelligent transportation; image fusion; signal alignment; runway detection; image registration; wavelet transform; Hough transform
Online: 16 October 2018 (08:49:55 CEST)
UAV network operation enables gathering and fusion from disparate information sources for flight control in both manned and unmanned platforms. In this investigation, a novel procedure for detecting runways and horizons as well as enhancing surrounding terrain is introduced based on fusion of enhanced vision system (EVS) and synthetic vision system (SVS) images. EVS and SVS image fusion has yet to be implemented in real-world situations due to signal misalignment. We address this through a registration step to align the EVS and SVS images. Four fusion rules combining discrete wavelet transform (DWT) sub-bands are formulated, implemented and evaluated. The resulting procedure is tested on real EVS-SVS image pairs and pairs containing simulated turbulence. Evaluations reveal that runways and horizons can be detected accurately even in poor visibility. Furthermore, it is demonstrated that different aspects of the EVS and SVS images can be emphasized by using different DWT fusion rules. The procedure is autonomous throughout landing, irrespective of weather. We believe the fusion architecture developed holds promise for incorporation into head-up displays (HUDs) and UAV remote displays to assist pilots landing aircraft in poor lighting and varying weather. The algorithm also provides a basis for rule selection in other signal fusion applications.
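The abstract does not specify its four fusion rules, so the following is only a sketch of what a DWT sub-band fusion rule can look like. It assumes a one-level, one-dimensional Haar transform and a common textbook rule (average the approximation coefficients, keep the larger-magnitude detail coefficient); the function names are hypothetical and inputs are assumed to be already registered and of even length.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_forward(signal):
    """One-level orthonormal Haar DWT of an even-length sequence."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / SQRT2 for a, b in pairs]
    detail = [(a - b) / SQRT2 for a, b in pairs]
    return approx, detail


def haar_inverse(approx, detail):
    """Inverse of haar_forward; reconstructs the original samples."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / SQRT2, (a - d) / SQRT2])
    return out


def fuse(sig_a, sig_b):
    """Illustrative rule: mean of approximations, max-magnitude detail."""
    approx_a, detail_a = haar_forward(sig_a)
    approx_b, detail_b = haar_forward(sig_b)
    approx = [(x + y) / 2 for x, y in zip(approx_a, approx_b)]
    detail = [x if abs(x) >= abs(y) else y for x, y in zip(detail_a, detail_b)]
    return haar_inverse(approx, detail)
```

Swapping the rule applied to each sub-band (for example, taking the approximation from one source only) is what lets different fusion rules emphasize different aspects of the two inputs.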
COMMUNICATION | doi:10.20944/preprints202209.0041.v1
Subject: Engineering, Biomedical & Chemical Engineering Keywords: Deep Learning; Convolutional Neural Networks; Medical Image Segmentation
Online: 5 September 2022 (03:12:55 CEST)
Convolutional neural network architectures have become increasingly complex, which has only slowly improved performance on well-known benchmark datasets in recent years. In this research, we have analyzed the true need for such complexity. We have introduced G-Net light, a lightweight modified GoogleNet with an improved filter count per layer to reduce feature overlaps and complexity. Additionally, by limiting the number of pooling layers in the proposed architecture, we have exploited the skip connections to minimize the spatial information loss. The proposed architecture is evaluated on three publicly available retinal vessel segmentation datasets. The proposed G-Net light outperforms other vessel segmentation architectures while reducing the number of trainable parameters.
CONCEPT PAPER | doi:10.20944/preprints202208.0072.v1
Subject: Engineering, Electrical & Electronic Engineering Keywords: Image Processing System; Drones; Surveillance system; FANET operations
Online: 3 August 2022 (03:54:45 CEST)
The major goal of this paper is to use image enhancement techniques to enhance and extract data in FANET applications in order to improve the efficiency of surveillance. The proposed conceptual system design can improve the likelihood of FANET operations in oil pipeline surveillance and in sports and media coverage, with the ultimate goal of providing efficient services to those who are interested. The system architecture model is based on current scientific principles and developing technologies. The two primary components of the system are a FANET, which is capable of gathering image data from video-enabled drones, and an image processing system that permits data collection and analysis. Based on the image processing technique, a proof of concept for efficient data extraction and enhancement in FANET situations and possible services is illustrated.
ARTICLE | doi:10.20944/preprints202207.0211.v1
Subject: Medicine & Pharmacology, Oncology & Oncogenics Keywords: Brain tumor; Image segmentation; PSO; ANOVA, K-means.
Online: 14 July 2022 (11:28:00 CEST)
Segmentation of brain tumor images is a major research topic in medical imaging to have a refined detection and understanding of abnormal masses in the brain. This paper proposes a new segmentation method, consisting of three main steps, to detect brain lesions using magnetic resonance imaging (MRI). In the first step, the parts of the image delineating the skull bone are removed to exclude insignificant data. In the second step, which is the main contribution of this study, the particle swarm optimization (PSO) technique is applied to detect the block that contains the brain lesions. The fitness function, used to determine the best block among all candidate blocks, is based on a two-way fixed-effects analysis of variance (ANOVA). In the last step of the algorithm, the K-means segmentation method is used in the lesion block to classify it as tumor or not. A thorough evaluation of the proposed algorithm is performed using the MRI database provided by the Kouba imaging center in Algiers, Algeria. Estimates of the selected fitness function are first compared to those based on the sum-of-absolute-differences (SAD) dissimilarity criterion and demonstrate the efficiency and robustness of the ANOVA. The performance of the optimized brain tumor segmentation algorithm is then compared to the results of several state-of-the-art techniques, including fuzzy C-means, K-means, Otsu thresholding, local thresholding, and watershed segmentation. The results obtained using Dice coefficient, Jaccard distance, correlation coefficient, and root mean square error (RMSE) measurements demonstrate the superiority of the proposed optimized segmentation algorithm over equivalent techniques.
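The overlap and error measures named in this abstract can be written down in a few lines. The sketch below assumes segmentations are given as flat lists of 0/1 labels of equal length and uses the standard definitions (Dice = 2|A∩B| / (|A| + |B|), Jaccard distance = 1 − |A∩B| / |A∪B|); the function names and the convention for two empty masks are assumptions, not taken from the paper.

```python
import math


def dice_coefficient(pred, truth):
    """Dice overlap between two binary masks (1.0 if both are empty)."""
    intersection = sum(p * t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    return 1.0 if total == 0 else 2.0 * intersection / total


def jaccard_distance(pred, truth):
    """One minus the Jaccard index (0.0 if both masks are empty)."""
    intersection = sum(p * t for p, t in zip(pred, truth))
    union = sum(pred) + sum(truth) - intersection
    return 0.0 if union == 0 else 1.0 - intersection / union


def rmse(a, b):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))
```

A Dice coefficient near 1 and a Jaccard distance near 0 both indicate that the predicted lesion mask closely matches the reference mask.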
ARTICLE | doi:10.20944/preprints202202.0139.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: uncertainty; prognostic modeling; image biomarkers; radiomics; radiomics harmonization
Online: 9 February 2022 (11:50:10 CET)
Problem. Image biomarker analysis, also known as radiomics, is a tool for tissue characterization and treatment prognosis that relies on routinely acquired clinical images and delineations. Due to the uncertainty in image acquisition, processing, and segmentation (delineation) protocols, radiomics often lacks reproducibility. Radiomics harmonization techniques have been proposed as a solution to reduce these sources of uncertainty and/or their influence on the prognostic model performance. A relevant question is how to estimate the protocol-induced uncertainty of a specific image biomarker, what the effect is on the model performance, and how to optimize the model given the uncertainty. In this manuscript, we show how protocol uncertainty can drastically reduce prognostic model performance. We introduce an effect-size measure η that assesses the protocol-induced uncertainty versus the measurable effect. Methods. Two non-small cell lung cancer (NSCLC) cohorts, composed of 421 and 240 patients respectively, were used for training and testing. Per patient, a Monte Carlo algorithm was used to generate three hundred synthetic contours with a surface dice tolerance measure less than 1.18 mm with respect to the original GTV. These contours were subsequently used to derive 104 radiomic features, which were ranked on their relative sensitivity to contour perturbation, expressed in the parameter η. The top four (low η) and the bottom four (high η) features were selected for two models based on Cox proportional hazards model. To investigate the influence of segmentation uncertainty on the prognostic model, we trained and tested the setup in 5000 augmented realizations (using a Monte Carlo sampling method); the log-rank test was used to assess the stratification performance and stability to segmentation uncertainty. Results. 
Although both low and high η setup showed significant testing set log-rank p-values (p=0.01) in the original GTV delineations (without segmentation uncertainty introduced), in the model with high uncertainty to effect ratio only around 30% of the augmented realizations resulted in model performance with p < 0.05 in the test set. In contrast, the low η setup performed with log-rank p < 0.05 in 90% of the augmented realizations. Moreover, the high η setup classification was uncertain for 50% of the subjects in the testing set (for 80% agreement rate), whereas the low η setup was uncertain only in 10% of the cases. The code and part of the data are available at https://github.com/Maastro-CDS-Imaging-Group/sure. Discussion. Estimating image biomarker model performance based only on the original GTV segmentation without considering segmentation uncertainty may be deceiving. The model might result in a significant stratification performance, but can be unstable for delineation variations, which are inherent to manual segmentation. Simulating segmentation uncertainty using the method described allows for more stable image biomarker estimation, selection, and model development. The segmentation uncertainty estimation method described here is universal and can be extended to estimate other protocol uncertainties (such as image acquisition and pre-processing).
ARTICLE | doi:10.20944/preprints202104.0611.v1
Subject: Behavioral Sciences, Applied Psychology Keywords: validity; reliability; assessment; body image; self-evaluation; students
Online: 22 April 2021 (14:05:42 CEST)
The Body-Esteem Scale is an assessment tool for adolescents and adults that evaluates three dimensions of self-evaluation of one’s body. It has been translated and validated in several countries, from America to Europe, but translation and reliability evidence was lacking for Portugal. This study aimed to translate and test the validity and reliability of the Body Esteem Scale for Adolescents and Adults (BESAA) in students in the context of Portuguese higher education. A total of 173 students (60.7% female) with a mean age of 19.7 (standard deviation = 2.2) years participated. Categorical Principal Component Analysis was used to assess the underlying dimensions of the BESAA. Construct validity was evaluated through correlation with the Appearance Schemas Inventory – Revised, and a three-factor model (“Appearance”, “Weight” and “Attribution”) was established. Confirmatory factor analysis was performed to verify the construct validity of the instrument. Items with factor weights (λ) < .40 were removed, as were those considered redundant by the modification indices estimated by the Lagrange Multipliers (LM) method (LM > 11, p < .001). We observed high correlations between theoretically similar factors and low correlations between different factors. The Portuguese BESAA showed adequate validity and reliability.
TECHNICAL NOTE | doi:10.20944/preprints202102.0618.v1
Subject: Earth Sciences, Atmospheric Science Keywords: Interpolation; Hydraulic Conductivity; Multi-Point Geostatistics; Training Image
Online: 26 February 2021 (12:47:53 CET)
Hydraulic conductivity is a key parameter in groundwater modeling and one of the most uncertain. Grid-based numerical simulation requires the spatial distribution of hydraulic conductivity, known only at sampled points, to be estimated at unsampled locations in the study area. This spatial interpolation has been routinely performed using variogram-based models (two-point geostatistics methods). These traditional techniques fail to capture complex geological structures, produce smoothing effects and ignore the higher-order moments of subsurface heterogeneities. In this work, a multiple-point geostatistics (MPS) method is applied to interpolate hydraulic conductivity data, which will be further used in the WASH123D numerical groundwater simulation model for regional smart groundwater management. To do this, MPS needs ‘training images’ (TIs) as a key input. A TI is a conceptual model of subsurface geological heterogeneity, which was developed using the concept of ages, topographic slope as an index criterion and the knowledge of geologists. After consideration of the full physics of the study area, an example shows the advantages of using multiple-point geostatistics compared with the traditional two-point geostatistics methods (such as Kriging) for the interpolation of hydraulic conductivity data in a complex geological formation.
ARTICLE | doi:10.20944/preprints202011.0534.v1
Subject: Materials Science, Biomaterials Keywords: Digital Image Correlation; damage; self-heating; EPDM; fillers
Online: 20 November 2020 (10:32:54 CET)
The effect of the strain rate on damage in carbon black filled EPDM stretched during single and multiple uniaxial loading is investigated. This has been performed by analysing the stress-strain response, the evolution of damage by Digital Image Correlation (DIC), the associated dissipative heat source by InfraRed thermography (IR), and the chain network damage by swelling. The strain rates were selected to cover the transition from quasi-static to medium strain rate conditions. In single loading conditions, increasing the strain rate results in preferential damage of the filler network while the rubber network is preserved. Such damage is accompanied by stress softening and a rise in the adiabatic heat source. Conversely, increasing the strain rate in cyclic loading conditions results in filler network accommodation and high self-heating, whose combined effect is proposed as a possible cause of the ability of filled EPDM to limit damage by reducing cavity opening during loading and favoring cavity closing upon unloading.