REVIEW | doi:10.20944/preprints202105.0127.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: Image Acquisition, Image preprocessing, Image enhancement, beatboxing, segmentation
Online: 7 May 2021 (09:09:14 CEST)
Human beatboxing is a vocal art making use of speech organs to produce vocal drum sounds and imitate musical instruments. Beatbox sound classification is a current challenge that can be used for automatic database annotation and music-information retrieval. In this study, a large-vocabulary human-beatbox sound recognition system was developed with an adaptation of the Kaldi toolbox, a widely used tool for automatic speech recognition. The corpus consisted of eighty boxemes, which were recorded repeatedly by two beatboxers. The sounds were annotated and transcribed to the system by means of a beatbox-specific morphographic writing system (Vocal Grammatics). Image processing techniques play a vital role in image acquisition, image pre-processing, clustering, segmentation, and classification for different kinds of images, such as fruit, medical, vehicle, and digital text images. In this study, unwanted noise is removed from the various images and enhancement techniques are applied, such as contrast-limited adaptive histogram equalization, Laplacian and Haar filtering, unsharp masking, sharpening, high-boost filtering, and color models. Clustering algorithms are then used to organize the data logically and to support pattern analysis, grouping, decision-making, and machine-learning techniques, and the regions are segmented using binary, K-means, and Otsu segmentation algorithms. The images are classified with SVM and K-Nearest Neighbour (KNN) classifiers, which produce good results for those images.
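The abstract above names Otsu segmentation among its steps. As a rough illustration only (not the authors' implementation), the following sketch picks the grey level that maximises the between-class variance of a flat list of 8-bit pixel values; the function name `otsu_threshold` and the flat-list input are assumptions made for this example.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t that maximises between-class variance.

    Pixels with value <= t form the background class, the rest the
    foreground class. `pixels` is a flat list of ints in [0, levels).
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0     # number of pixels at or below the candidate threshold
    sum0 = 0   # sum of their intensities
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # no background pixels yet
        w1 = total - w0
        if w1 == 0:
            break             # no foreground pixels left
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor 1 / total**2.
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


# Illustrative data: two well-separated intensity populations.
pixels = [10] * 50 + [200] * 50
t = otsu_threshold(pixels)
mask = [p > t for p in pixels]   # True marks the bright region
```

On a bimodal histogram like this one, any threshold between the two modes separates the classes equally well; the sketch keeps the first such level it encounters.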
TECHNICAL NOTE | doi:10.20944/preprints202203.0095.v1
Subject: Engineering, Control And Systems Engineering Keywords: pre-processing; image transformation; image enhancement; geometric correction; radiometric correction; Satellite Imagery
Online: 7 March 2022 (09:43:08 CET)
During the last few years, various algorithms have been developed to extract features from high-resolution satellite imagery. For the classification of these extracted features, several complex algorithms have been developed. However, these algorithms lack critical stages for refining the data at the preliminary phase. Various satellite sensors have been launched, such as LISS3, IKONOS, QUICKBIRD, and WORLDVIEW. Before classification and extraction of semantic data, high-resolution imagery must be refined. The whole refinement process involves several steps of interaction with the data. These steps are the pre-processing algorithms presented in this paper. Pre-processing steps involve geometric correction, radiometric correction, noise removal, image enhancement, etc. These pre-processing algorithms increase the accuracy of the data. Applications of such pre-processing include meteorology, hydrology, soil science, forestry, physical planning, etc. This paper also provides a brief description of the local maximum likelihood method, the fuzzy method, the stretch method, and the pre-processing methods that are used before classifying and extracting features from the image.
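The abstract above mentions a stretch method without giving details. One common form is a linear min-max contrast stretch; the sketch below assumes that variant, and the function name `linear_stretch` and the flat-list band representation are illustrative choices rather than anything taken from the paper.

```python
def linear_stretch(band, out_min=0, out_max=255):
    """Linearly rescale the values of one image band to [out_min, out_max].

    `band` is a flat list of numeric pixel values. A constant band has
    no contrast to stretch, so it maps to out_min everywhere.
    """
    lo, hi = min(band), max(band)
    if hi == lo:
        return [out_min for _ in band]
    scale = (out_max - out_min) / (hi - lo)
    return [round(out_min + (v - lo) * scale) for v in band]


# Illustrative values only: a narrow input range spread over 0..100.
stretched = linear_stretch([10, 20, 30], 0, 100)
```

In practice the minimum and maximum are often replaced by low and high percentiles so that a few outlier pixels do not dominate the scaling.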
ARTICLE | doi:10.20944/preprints202307.1395.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: Fractional order differential operator; Fractional order integral operator; Image enhancement; Image denoising
Online: 20 July 2023 (10:42:06 CEST)
The theory of fractional calculus extends the order of classical integer calculus from integer to non-integer. As a new engineering application tool, it has made many important research achievements in many fields, including image processing. This paper mainly studies the application of fractional calculus theory in image enhancement and denoising, including the basic theory of fractional calculus and its amplitude frequency characteristics, the application of fractional differential operator in image enhancement, and the application of fractional integral operator in image denoising. The experimental results show that the fractional calculus theory has more special advantages in image enhancement and denoising. Compared with the existing integer order image enhancement operators, the fractional differential operator can more effectively enhance the "weak edge" and "strong texture" details of the image. The fractional order integral image denoising operator can not only improve the signal-to-noise ratio of the image compared to traditional denoising methods, but also better preserve detailed information such as edges and textures of the image.
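The abstract above does not state which definition of the fractional derivative is used. A frequent choice in image enhancement is the Grünwald–Letnikov definition, whose truncated series yields the weights of a one-dimensional difference mask. The sketch below assumes that definition; the function names and the truncation length are illustrative.

```python
def gl_coefficients(order, n_terms):
    """First n_terms Grünwald–Letnikov weights for a derivative of the
    given (possibly non-integer) order.

    Uses the recurrence w_0 = 1, w_k = w_{k-1} * (1 - (order + 1) / k).
    """
    coeffs = [1.0]
    for k in range(1, n_terms):
        coeffs.append(coeffs[-1] * (1.0 - (order + 1.0) / k))
    return coeffs


def fractional_diff(signal, order, n_terms=3):
    """Apply the truncated mask causally along a 1-D signal (unit step)."""
    weights = gl_coefficients(order, n_terms)
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(weights):
            if i - k >= 0:            # ignore samples before the start
                acc += w * signal[i - k]
        out.append(acc)
    return out


half_order = gl_coefficients(0.5, 4)   # weights for a half-order derivative
first_order = gl_coefficients(1.0, 3)  # reduces to the backward difference
```

For order 1 the weights collapse to the ordinary backward difference, which is a convenient sanity check; for orders between 0 and 1 the tail weights are small and negative, so low-frequency content is attenuated rather than removed.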
ARTICLE | doi:10.20944/preprints202108.0574.v2
Subject: Computer Science And Mathematics, Signal Processing Keywords: variational methods; anisotropic diffusion; gradient-domain image processing; local contrast enhancement
Online: 24 September 2021 (10:24:26 CEST)
Gradient-domain image processing is a technique where, instead of operating directly on the image pixel values, the gradient of the image is computed and processed. The resulting image is obtained by reintegrating the processed gradient. This is normally done by solving the Poisson equation, most often by means of a finite difference implementation of the gradient descent method. However, this technique in some cases leads to severe haloing artefacts in the resulting image. To deal with this, local or anisotropic diffusion has been added as an ad hoc modification of the Poisson equation. In this paper, we show that a version of anisotropic gradient-domain image processing can result from a more general variational formulation through the minimisation of a functional formulated in terms of the eigenvalues of the structure tensor of the differences between the processed gradient and the gradient of the original image. Example applications of linear and non-linear local contrast enhancement and colour image daltonisation illustrate the behaviour of the method.
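The reintegration step described above can be sketched as an explicit finite-difference gradient descent on the Poisson equation. This is the plain isotropic baseline, not the anisotropic formulation the paper proposes; the function names, the fixed (Dirichlet) border values, and the step size are assumptions made for the example.

```python
def forward_gradient(img):
    """Forward-difference gradient; the last column/row is set to zero."""
    h, w = len(img), len(img[0])
    gx = [[img[i][j + 1] - img[i][j] if j + 1 < w else 0.0
           for j in range(w)] for i in range(h)]
    gy = [[img[i + 1][j] - img[i][j] if i + 1 < h else 0.0
           for j in range(w)] for i in range(h)]
    return gx, gy


def reintegrate(gx, gy, boundary, iterations=500, tau=0.25):
    """Solve laplacian(u) = div(g) on the interior by gradient descent.

    `boundary` supplies the fixed border values; its interior is only
    used as the starting guess. With tau = 0.25 each step is a Jacobi
    sweep.
    """
    h, w = len(boundary), len(boundary[0])
    # Divergence by backward differences, the adjoint of forward_gradient.
    div = [[0.0] * w for _ in range(h)]
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            div[i][j] = gx[i][j] - gx[i][j - 1] + gy[i][j] - gy[i - 1][j]

    u = [list(map(float, row)) for row in boundary]
    for _ in range(iterations):
        nxt = [row[:] for row in u]
        for i in range(1, h - 1):
            for j in range(1, w - 1):
                lap = (u[i - 1][j] + u[i + 1][j] + u[i][j - 1]
                       + u[i][j + 1] - 4.0 * u[i][j])
                nxt[i][j] = u[i][j] + tau * (lap - div[i][j])
        u = nxt
    return u


# Round trip on a small synthetic image: an unprocessed gradient should
# reintegrate to the original image.
original = [[float(i * i + 3 * j) for j in range(6)] for i in range(6)]
gx, gy = forward_gradient(original)
start = [[original[i][j] if i in (0, 5) or j in (0, 5) else 0.0
          for j in range(6)] for i in range(6)]
restored = reintegrate(gx, gy, start)
```

In an actual gradient-domain pipeline, `gx` and `gy` would be modified (for example scaled for contrast enhancement) before reintegration, which is where the haloing mentioned in the abstract can arise.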
ARTICLE | doi:10.20944/preprints202104.0318.v1
Subject: Engineering, Transportation Science And Technology Keywords: Kerr frequency comb; Hilbert transform; integrated optics; all-optical signal processing; image processing; video image processing
Online: 12 April 2021 (14:27:20 CEST)
Advanced image processing will be crucial for emerging technologies such as autonomous driving, where objects must be recognized and classified quickly, in real time, under rapidly changing, poor-visibility conditions. Photonic technologies will be key for next-generation signal and information processing, due to their versatility and their wide bandwidths of tens of terahertz. Here, we demonstrate broadband real-time analog image and video processing with an ultrahigh-bandwidth photonic processor that is highly versatile and reconfigurable. It is capable of massively parallel processing of over 10,000 video signals simultaneously in real time, performing key functions needed for object recognition, such as edge enhancement and detection. Our system, based on a soliton crystal Kerr optical micro-comb with a 49 GHz spacing and >90 wavelengths in the C-band, performs different functions without changing the physical hardware. These results highlight the potential of photonic processing based on Kerr microcombs for chip-scale, fully programmable, high-speed real-time video processing for next-generation technologies.
ARTICLE | doi:10.20944/preprints202301.0313.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Minipig; Brain; Segmentation; Landmarks; Image Processing; Deep Learning; Pig
Online: 17 January 2023 (12:42:08 CET)
Translation of basic animal research to find effective methods of diagnosing and treating human neurological disorders requires parallel analysis infrastructures. Small animals such as mice provide exploratory animal disease models. However, many interventions developed using small animal models fail to translate to human use due to physical or biological differences. Recently, large-animal minipigs have emerged in neuroscience due to both brain similarity and economic advantages. Medical image processing is a crucial part of research as it allows researchers to monitor their experiments and understand disease development. However, although many algorithms are created and optimized for MR analysis of human data, those tools are not directly applicable or sufficiently sensitive to measure minipig data. In this work, we propose PigSNIPE - a pipeline for the automated handling, processing, and analyzing of large-scale data sets of minipig MR images. The pipeline allows for image registration, AC-PC alignment, landmark detection, skull stripping, brainmasks and intracranial volume segmentation (DICE 0.98), tissue segmentation (DICE 0.82), and caudate-putamen brain segmentation (DICE 0.8) in under two minutes. To the best of our knowledge, this is the first automated pipeline tool aimed at large animal images.
CONCEPT PAPER | doi:10.20944/preprints202208.0072.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: Image Processing System; Drones; Surveillance system; FANET operations
Online: 3 August 2022 (03:54:45 CEST)
The major goal of this paper is to use image enhancement techniques for enhancing and extracting data in FANET applications to improve the efficiency of surveillance. The proposed conceptual system design can improve the likelihood of FANET operations in oil pipeline surveillance and in sports and media coverage, with the ultimate goal of providing efficient services to those who are interested. The system architecture model is based on current scientific principles and developing technologies. The two primary components of the system are a FANET, which is capable of gathering image data from video-enabled drones, and an image processing system that permits data collection and analysis. Based on the image processing technique, a proof of concept for efficient data extraction and enhancement in FANET situations and possible services is illustrated.
REVIEW | doi:10.20944/preprints202308.1855.v1
Subject: Engineering, Control And Systems Engineering Keywords: unmanned aerial vehicle; image processing; first-person view; sensors
Online: 29 August 2023 (03:41:09 CEST)
The dependence on Unmanned Aerial Vehicles (UAVs) has dramatically increased in many sectors around the globe. UAVs are in high demand, and their technology is developing quickly due to their sophisticated ability to handle various issues. UAVs are capable of replacing labor-intensive tasks with conducive and safe regulation. Additional tools or sensors need to be added to UAV systems so that UAVs can serve at an industrial level. The paper aims to consolidate and present a thorough understanding of the various stages of image processing pipelines deployed in UAV applications, including image acquisition, preprocessing, feature extraction, object detection and tracking, and decision-making processes. Throughout this paper, several aspects are discussed, such as the strengths, limitations, and performance metrics of existing approaches. The paper seeks to provide researchers, engineers, and practitioners with valuable insights into the challenges and opportunities of image processing systems for UAVs. Ultimately, the synthesis of this knowledge will contribute to enhancing the effectiveness, autonomy, and applicability of UAVs in diverse fields such as surveillance, agriculture, disaster management, and environmental monitoring.
REVIEW | doi:10.20944/preprints202308.0973.v1
Subject: Engineering, Bioengineering Keywords: spine deformity; scoliosis diagnostic; image processing; medical images
Online: 14 August 2023 (03:38:50 CEST)
Spinal deformity refers to a range of disorders that are defined by anomalous curvature of the spine and may be classified as scoliosis, lordosis, or kyphosis. Among these, scoliosis stands out as the most common type of spinal deformity in human beings, and it can be distinguished by abnormal lateral spine curvature accompanied by axial rotation. Accurate identification of spinal deformity is crucial for a person's diagnosis, and numerous assessment methods have been developed by researchers. Therefore, the present study aims to systematically review recent works on spinal deformity assessment for scoliosis diagnosis utilizing image processing techniques. To gather relevant studies, a search was conducted on three electronic databases (Scopus, ScienceDirect, and PubMed) for the period between 2012 and 2022, using specific keywords and focusing on scoliosis cases. A total of 17 papers fully satisfied the established criteria and were extensively evaluated. Despite variations in methodological designs across the studies, all reviewed articles obtained quality ratings higher than satisfactory. Various diagnostic approaches have been employed, including artificial intelligence mechanisms, image processing, and scoliosis diagnosis systems. These approaches have the potential to save time and, more significantly, can reduce the incidence of human error. While all assessment methods have potential in scoliosis diagnosis, they possess several limitations that can be ameliorated in forthcoming studies. Therefore, the findings of this study may serve as guidelines for the development of a more accurate spinal deformity assessment method that can aid medical personnel in the real diagnosis of scoliosis.
ARTICLE | doi:10.20944/preprints202305.0624.v1
Subject: Engineering, Control And Systems Engineering Keywords: aerial photography; agricultural crop; digital image processing; pattern identification
Online: 9 May 2023 (09:26:00 CEST)
The agricultural sector is undergoing a revolution that requires sustainable solutions to the challenges that arise from traditional farming methods. To address these challenges, technical and sustainable support is needed to develop projects that improve crop performance. This study focuses on the onion crop and the challenges presented throughout its phenological cycle. Aerial monitoring using unmanned aerial vehicles (UAV) and digital image processing were used to identify patterns in the onion crop, including humid areas, weed growth, vegetation deficits, and decreased harvest performance. An algorithm was developed to identify the patterns that most affected crop growth, as the average local production reported was 40.166 ton/ha, but only 25.00 ton/ha was reached due to blight caused by constant humidity and limited sunlight. This resulted in the death of leaves and poor development of bulbs, with 50% of the production being of medium size. It is estimated that approximately 20% of the production was lost due to blight and unfavorable weather conditions.
ARTICLE | doi:10.20944/preprints201610.0040.v1
Subject: Engineering, Automotive Engineering Keywords: agriculture; digital image processing; machine vision; precision agriculture; unmanned aerial vehicle (UAV)
Online: 12 October 2016 (10:28:54 CEST)
Precision agriculture is a farm management technology that involves sensing and then responding to the observed variability in the field. Remote sensing is one of the tools of precision agriculture. The emergence of small unmanned aerial vehicles (sUAVs) has paved the way to accessible remote sensing tools for farmers. This paper describes the comparison of two popular off-the-shelf sUAVs: 3DR Iris and DJI Phantom 2. Both units are equipped with a camera gimbal carrying a GoPro camera. The comparison of the two sUAVs involves a hovering test and a rectilinear motion test. In the hovering test, the sUAV was allowed to hover over a known object and images were taken every second for two minutes. The position of the object in the images was measured and used to assess the stability of the sUAV while hovering. In the rectilinear test, the sUAV was allowed to follow a straight path and images of a lined track were acquired. The lines in the images were then measured to determine how accurately the sUAV followed the path. Results showed that both sUAVs performed well in both the hovering test and the rectilinear motion test. This demonstrates that both sUAVs can be used for agricultural monitoring.
ARTICLE | doi:10.20944/preprints202303.0023.v1
Subject: Computer Science And Mathematics, Data Structures, Algorithms And Complexity Keywords: Game Design; Variational AutoEncoder (VAE); Image and Video Generation; Bayesian Algorithm; Loss Function; Data Clustering; Data and Image Analytics; MNIST database; Generator and Discriminator
Online: 1 March 2023 (11:17:12 CET)
In recent decades, the Variational AutoEncoder (VAE) model has shown good potential and capabilities in image generation and dimensionality reduction. The combination of VAE and various machine learning frameworks has also worked effectively in different daily life applications; however, its possibility and effectiveness in modern game design have seldom been explored or assessed. The use of its feature extractor for data clustering has been only minimally discussed in the literature either. This paper first explores different mathematical properties of the VAE model, in particular the theoretical framework of the encoding and decoding processes, the achievable lower bound, and the loss functions of different applications; it then applies the established VAE model to generating new game levels within two well-known game settings, and validates the effectiveness of its data clustering mechanism with the aid of the Modified National Institute of Standards and Technology (MNIST) database. Respective statistical metrics and assessments were also utilized for evaluating the performance of the proposed VAE model in the aforementioned case studies. Based on the statistical and spatial results, several potential drawbacks and future enhancements of the established model were outlined, with the aim of maximizing the strengths and advantages of VAE for future game design tasks and relevant industrial missions.
ARTICLE | doi:10.20944/preprints202302.0203.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Hyperspectral; Deep learning; Neural networks; image processing; classification; segmentation; hardware accelerators; CHIME mission
Online: 13 February 2023 (07:32:31 CET)
Modern hyperspectral imaging technologies generate enormous datasets that could potentially transmit a wealth of information, but such a resource presents numerous difficulties for data analysis and interpretation. Deep learning techniques undoubtedly provide a wide range of potential for solving both traditional imaging tasks and exciting new problems in the spatial-spectral domain. This is true in the primary application area of remote sensing, where hyperspectral technology originated and has made the majority of its progress, but it may be even more true in the vast array of existing and developing application areas that make use of these imaging technologies. The current review advances on two fronts. On the one hand, it is directed at domain experts who desire an updated overview of how deep learning architectures might work in conjunction with hyperspectral acquisition techniques to address specific tasks in various application sectors. On the other hand, it provides a picture of how deep learning technologies are applied to hyperspectral data from a (near) real-time perspective. The contributions of this review include these two points of view and a discussion of the opportunities and important problems associated with the development of the future CHIME mission to be launched by the European Space Agency (ESA).
ARTICLE | doi:10.20944/preprints202308.0979.v1
Subject: Engineering, Industrial And Manufacturing Engineering Keywords: image processing; K-means clustering algorithm; Traversal algorithm; Indirect solution method
Online: 14 August 2023 (08:57:37 CEST)
The utilization of tempered blast furnace (BF) slag through the direct fiber forming process to create high-value thermal insulation materials offers a dual benefit: it efficiently harnesses the latent heat within unused slag and substantially enhances the value of blast-furnace slag utilization. However, gauging the melting properties of iron slag under high temperatures is a challenge. In this study, we explore the melting behavior of SiO2 within a high-temperature molten pool. We employ dynamic visual data (video stream) captured via a non-contact charge coupled device (CCD) video recording system to extract SiO2 contours through image processing. The change in image centroid characteristics is used to establish a convolution function relationship, and MATLAB's traversal search algorithm determines SiO2's centroid position. Given that SiO2 is proportionate to crucible pixels, the area of SiO2 is calculated through pixel statistics within these contours. Subsequently, we propose a new indirect method to process image information, yielding SiO2 volume and mass at different time points. An exponential fitting yields the melting rate function of SiO2. Finally, we compare this indirect method with shape from shading (SFS), quantitative characterization, and dimensional analysis techniques. We also discuss the strengths and limitations of each method. Our findings reveal that the indirect solution method presented here boasts straightforward calculation steps and imposes minimal image format requirements. This research provides theoretical and technical support for blast-furnace slag's direct fiber forming process.
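Two of the steps described above, measuring area and centroid by pixel statistics inside an extracted contour and fitting an exponential to values over time, can be sketched as follows. This is an illustrative outline only: the binary-mask input, the `area_per_pixel` scale (which would come from the known crucible size), and the log-linear least-squares fit are assumptions, not details taken from the paper.

```python
import math


def area_and_centroid(mask, area_per_pixel=1.0):
    """Area and (row, col) centroid of the truthy pixels in a 2-D mask."""
    count = sum_r = sum_c = 0
    for r, row in enumerate(mask):
        for c, value in enumerate(row):
            if value:
                count += 1
                sum_r += r
                sum_c += c
    if count == 0:
        return 0.0, None          # empty region: no centroid defined
    return count * area_per_pixel, (sum_r / count, sum_c / count)


def fit_exponential(times, values):
    """Fit values ~ a * exp(b * t) by least squares on log(values).

    Requires strictly positive values and at least two distinct times.
    """
    n = len(times)
    logs = [math.log(v) for v in values]
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_t)
    return a, b


# Illustrative data: a 2x2 blob, and a synthetic decay with a = 2, b = -0.5.
mask = [[0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0]]
area, centroid = area_and_centroid(mask, area_per_pixel=0.25)

times = [0.0, 1.0, 2.0, 3.0]
masses = [2.0 * math.exp(-0.5 * t) for t in times]
a, b = fit_exponential(times, masses)
```

Fitting in log space weights small values more heavily than a direct nonlinear fit would, which may or may not be desirable for melting-rate data.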
ARTICLE | doi:10.20944/preprints202009.0583.v1
Subject: Computer Science And Mathematics, Information Systems Keywords: high-speed camera; crack propagation velocity; image sequence analysis; crack analysis; material testing; deformation measurement
Online: 24 September 2020 (12:19:52 CEST)
The determination of crack propagation velocities can provide valuable information for a better understanding of damage processes of concrete. The spatio-temporal analysis of crack patterns developing at a speed of several hundred meters per second is a rather challenging task. In the paper, a photogrammetric procedure for the determination of crack propagation velocities in concrete specimens using high-speed camera image sequences is presented. A cascaded image sequence processing procedure has been developed, which starts with the computation of displacement vector fields for a dense pattern of points on the specimen’s surface between consecutive time steps of the image sequence. These surface points are triangulated into a mesh, and discontinuities in the displacement vector fields, which represent cracks, are found by a deformation analysis applied to all triangles of the mesh. Connected components of the deformed triangles are computed using region-growing techniques. Then, the crack tips are determined using principal component analysis. The tips are tracked in the image sequence and the velocities between the time stamps of the images are derived. A major advantage of this method over established techniques is that it allows spatio-temporally resolved, full-field measurements rather than point-wise measurements, and that information on crack width can be obtained simultaneously. To validate the approach, the authors processed image sequences of tests on four compact-tension specimens performed on a split-Hopkinson tension bar. The images were taken by a high-speed camera at a frame rate of 160,000 images per second. By applying the developed image sequence processing procedure to these datasets, crack propagation velocities of about 800 m/s were determined with a precision on the order of 50 m/s.
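The last step described above, deriving velocities from tracked crack-tip positions and the image time stamps, amounts to distance per frame multiplied by the frame rate. The sketch below illustrates that arithmetic; the function name, the pixel scale, and the tip coordinates are hypothetical, and only the frame rate of 160,000 images per second comes from the abstract.

```python
import math


def tip_velocities(tip_positions_px, metres_per_pixel, frame_rate_hz):
    """Speed of a tracked crack tip between consecutive frames, in m/s.

    tip_positions_px: list of (x, y) tip coordinates in pixels, one
    entry per frame, in temporal order.
    """
    speeds = []
    for (x0, y0), (x1, y1) in zip(tip_positions_px, tip_positions_px[1:]):
        distance_m = math.hypot(x1 - x0, y1 - y0) * metres_per_pixel
        speeds.append(distance_m * frame_rate_hz)  # dt = 1 / frame rate
    return speeds


# Hypothetical track: the tip advances 50 px per frame at 0.1 mm per pixel.
track = [(0, 0), (50, 0), (100, 0)]
speeds = tip_velocities(track, metres_per_pixel=1e-4, frame_rate_hz=160_000)
```

With these made-up numbers each 5 mm step at 160,000 frames per second corresponds to 800 m/s, the order of magnitude reported in the abstract; it also shows that one pixel of tip-localisation error translates into 16 m/s at this scale.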
ARTICLE | doi:10.20944/preprints202210.0322.v1
Subject: Environmental And Earth Sciences, Atmospheric Science And Meteorology Keywords: TC center detection; image processing; optical flow; operational radar network; ACTION
Online: 21 October 2022 (07:38:27 CEST)
This study presents the algorithm ACTION, defined as Automatic Center detection of Tropical cyclone (TC) using Image processing based on the Operational radar Network. Taking advantage of the high visibility of weather radar imagery, the TC’s motion vector is calculated from the continuous image change using optical flow, and the rotation center of that motion is taken as the TC’s center. The algorithm’s performance was verified for typhoons (TCs in the Northwestern Pacific) affecting the Korean Peninsula from 2018 to 2021 and showed a high detection rate of 80.8% within an error distance of 40 km compared to the best track of the Korea Meteorological Administration (KMA). The detection rate was 100% for typhoons with temporally consistent morphological characteristics. ACTION automatically generates TC center information upon the TC’s initial entry inside the observation radius, even in the absence of uniform radar data. ACTION is easily computed using the Open Source Computer Vision (OpenCV) library, runs in real time, and can be directly applied to rapidly generated weather radar images; hence, it is currently being utilized by the KMA. The high-temporal-resolution TC center information calculated through ACTION is expected to improve the efficiency of TC forecasting.
ARTICLE | doi:10.20944/preprints201804.0139.v1
Subject: Engineering, Control And Systems Engineering Keywords: ANFIS; basmati rice; image processing; grading; quality assessment; fuzzy inference system
Online: 11 April 2018 (06:28:49 CEST)
Grading of rice grains has gained attention because quality assessment is required during import and export. Rice grain quality depends on the milling operation, in which the rice hull is removed with a huller system, followed by a whitening operation. In such a process, the adjustment, control, and operation of the rollers are important for the quality of the milled rice. Basmati rice in particular needs more quality assurance, as it is not parboiled and is exported globally with a high product value. In the present work, the basic problem of quality assessment in the rice industry is addressed with a technique based on digital image processing. Machine vision and digital image processing provide an automated, nondestructive, cost-effective, and fast alternative to the traditional method, which is carried out manually by human inspectors. A model of quality grade testing and identification is built based on morphological features using digital image processing and a knowledge-based adaptive neuro-fuzzy inference system (ANFIS). The qualities of rice kernels are determined with the help of shape descriptors and geometric features using sample images of milled rice. The adopted technique has been tested on a sufficient number of training images of basmati rice grain. The proposed method gives a promising result in the evaluation of rice quality, with 100% classification accuracy for broken and whole grain. The milling efficiency is also assessed using the ratio between head rice and broken rice percentage, and it is 77.27% for the test sample. The overall results of the adopted methodology are promising in terms of classification accuracy and efficiency.
ARTICLE | doi:10.20944/preprints202107.0543.v1
Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: Apoptosis; TUNEL; Caspase; image processing; thresholding; signal quantification; Drosophila
Online: 23 July 2021 (11:55:04 CEST)
Apoptosis is associated with numerous phenotypical characteristics, and is thus studied with many tools. In this study, we compared two broadly used apoptotic assays: TUNEL and staining with an antibody targeting the activated form of an effector caspase. To compare them, we developed a protocol based on commonly used tools such as filters, z-projection, and thresholding. Even though it is commonly used in image-processing protocols, thresholding remains a recurring problem. Here we analyzed the impact of processing parameters and readout choice on the accuracy of apoptotic signal quantification. Our results show that TUNEL is quite robust, although the image processing parameters can determine whether subtle differences in the apoptotic rate are detected. In contrast, images from anti-cleaved caspase staining are more delicate to handle and need to be processed more carefully. We then developed an open-source Fiji macro automating most steps of the image processing and quantification protocol. It is noteworthy that the field of application of this macro is wider than apoptosis, as it can be used to process and quantify other kinds of images.
ARTICLE | doi:10.20944/preprints201910.0117.v1
Subject: Engineering, Industrial And Manufacturing Engineering Keywords: SART process; precipitation aggregates; image analysis; microscopy; particle size distribution
Online: 10 October 2019 (10:55:10 CEST)
Precipitation processes are technologies commonly used in hydrometallurgical plants to recover metals or to treat wastewaters. Moreover, solid-liquid separation technologies, such as thickening or filtering, are relevant unit operations included in the precipitation technologies. These methods are strongly dependent on the characteristics of the solid precipitates formed during the specific precipitation reaction. One of these characteristics is the particle size distribution (PSD) of the solid precipitates that are fed into a solid-liquid separation process. Therefore, PSD determination is a typical practice for the characterization of the slurries generated in a precipitation plant. Furthermore, the precipitates generated in these processes have a colloidal or aggregation behavior, depending on the operational conditions. Nevertheless, the conventional methods used to estimate PSD (e.g., laser diffraction and/or cyclosizer) have not been designed to measure particles that tend to aggregate or disaggregate, since they involve external forces (e.g., centrifugal, agitation, pumping, and sonication). These forces affect the true size of the aggregates formed in a unit operation, so the measurement is no longer representative of the aggregate particle size. This study presents an alternative method of measuring the size distribution of particles with aggregation behavior, namely non-invasive microscopy combined with image processing and analysis. The samples used were obtained from an experimental precipitation process in which sulfidization was applied to treat the cyanide-copper complexes contained in a cyanidation solution. This method has been validated with statistical tools and compared with a conventional analysis based on laser diffraction. Our results show significant differences between the methods analyzed, demonstrating that image processing and analysis by microscopy is an excellent, non-invasive alternative for obtaining the size distribution of aggregates in precipitation processes.
ARTICLE | doi:10.20944/preprints202202.0204.v1
Subject: Medicine And Pharmacology, Pharmacy Keywords: computer vision; image processing; medication adherence; object detection; pill detection
Online: 17 February 2022 (08:45:14 CET)
Objective tools to track medication adherence are lacking. A tool to monitor pill intake that can be implemented in mHealth apps without the need for additional devices was developed. We propose a pill intake detection tool that uses digital image processing to analyze images of a blister to detect the presence of pills. The tool uses the circular Hough transform as a feature extraction technique and is therefore primarily useful for the detection of pills with a round shape. This pill detection tool is composed of two steps: first, the registration of a full blister and the storing of reference values in a local database; second, the detection and classification of taken and remaining pills in similar blisters, to determine the actual number of untaken pills. In the registration of round pills in full blisters, 100% of pills in gray blisters or blisters with a transparent cover were successfully detected. In the counting of untaken pills in partially opened blisters, 95.2% of remaining and 95.1% of taken pills were detected in gray blisters, while 88.2% of remaining and 80.8% of taken pills were detected in blisters with a transparent cover. The proposed tool provides promising results for the detection of round pills. However, the classification of taken and remaining pills needs to be further improved, in particular for the detection of pills with non-oval shapes.
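The circular Hough transform named above can be illustrated with a minimal voting scheme. The sketch below is not the tool's implementation: it assumes the pill radius is already known and that edge pixels have been extracted beforehand, and the function name and synthetic test circle are invented for the example.

```python
import math
from collections import Counter


def hough_circle_center(edge_points, radius, n_angles=360):
    """Estimate the centre of a circle of known radius from edge pixels.

    Every edge pixel votes for all candidate centres lying `radius`
    away from it; the most-voted accumulator cell is returned.
    """
    votes = Counter()
    for x, y in edge_points:
        for k in range(n_angles):
            theta = 2.0 * math.pi * k / n_angles
            cx = round(x - radius * math.cos(theta))
            cy = round(y - radius * math.sin(theta))
            votes[(cx, cy)] += 1
    return votes.most_common(1)[0][0]


# Synthetic edge pixels of a circle centred at (20, 15) with radius 5.
edge = sorted({
    (round(20 + 5 * math.cos(2.0 * math.pi * k / 360)),
     round(15 + 5 * math.sin(2.0 * math.pi * k / 360)))
    for k in range(360)
})
cx, cy = hough_circle_center(edge, radius=5)
```

Because both the edge pixels and the votes are rounded to the pixel grid, the estimate is only expected to be accurate to about one pixel. A practical detector would also search over a range of radii and keep several accumulator peaks, one per pill.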
ARTICLE | doi:10.20944/preprints202107.0638.v1
Subject: Biology And Life Sciences, Plant Sciences Keywords: Image Processing; Automated Plant Diseases Detection; Histogram Oriented Gradient (HOG); Local Binary Pattern (LBP); Support Vector Machine (SVM)
Online: 28 July 2021 (17:18:04 CEST)
Plants play a most important part on earth. Every organ of a plant plays a vital role in the ecological field as well as the medicinal field. There are many species of plants on earth, and different plants have different diseases. It is therefore necessary to identify plants and their diseases to prevent loss. Identifying plants and their diseases manually is very time consuming. In this research, an automatic system for detecting plants and their diseases is proposed. For experimental purposes, high-quality leaf images are used for training and testing. To detect the healthy and diseased areas in a leaf, region-based and color-based thresholding techniques were used. For feature selection, the Histogram of Oriented Gradients (HOG) and Local Binary Pattern (LBP) methods were applied. Finally, two-class and multi-class Support Vector Machines (SVM) were used for classification. It is observed that both feature selection processes with SVM give 99% accuracy. Finally, a graphical user interface was created to make the automated system understandable to all users.
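As an illustration of the LBP descriptor named in the abstract above, the following is a minimal sketch of the basic 8-neighbour Local Binary Pattern on a grayscale image stored as a list of lists. The function names `lbp_code` and `lbp_histogram` and the clockwise neighbour order are assumptions made for this example; the paper's own implementation and parameters are not given in the abstract.

```python
def lbp_code(img, r, c):
    """Basic 8-neighbour LBP code of the interior pixel (r, c) in a 2-D list of gray values."""
    center = img[r][c]
    # Neighbours visited clockwise, starting at the top-left pixel (an assumed convention).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbour at least as bright as the center sets its bit.
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels (border pixels are skipped)."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist
```

A histogram of this kind is the sort of feature vector that could then be passed to a classifier such as an SVM.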
ARTICLE | doi:10.20944/preprints202102.0189.v1
Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: image quality assessment; image databases; superpixels; color image; color space; image quality measures
Online: 8 February 2021 (11:11:47 CET)
Objective Image Quality Assessment (IQA) measures are playing an increasingly important role in the evaluation of digital image quality. New IQA indices are expected to be strongly correlated with subjective observer evaluations expressed by MOS/DMOS scores. One such recently proposed index is the SuperPixel-based SIMilarity (SPSIM) index, which uses superpixel patches instead of the rectangular pixel grid. In this paper, the authors propose three modifications of the SPSIM index. For this purpose, the color space used by SPSIM was changed and the way SPSIM determines similarity maps was modified using methods derived from the algorithm for computing the MDSI index. The third modification was a combination of the first two. These three new quality indices were used in the assessment process. The experimental results obtained on many color images from five image databases demonstrated the advantages of the proposed SPSIM modifications.
ARTICLE | doi:10.20944/preprints202007.0686.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: document scanning; whiteboard capture; image enhancement; image alignment; image registration; image quality assessment
Online: 28 July 2020 (14:03:51 CEST)
The move from paper to online is not only necessary for remote working, it is also significantly more sustainable. This trend has seen a rising need for high-quality digitization of content from pages and whiteboards to sharable online material. But capturing this information is not always easy, nor are the results always satisfactory. Available scanning apps vary in their usability and do not always produce clean results, retaining surface imperfections from the page or whiteboard in their output images. CleanPage, a novel smartphone-based document and whiteboard scanning system, is presented. CleanPage requires one button-tap to capture, identify, crop and clean an image of a page or whiteboard. Unlike equivalent systems, no user intervention is required during processing and the result is a high-contrast, low-noise image with a clean homogenous background. Results are presented for a selection of scenarios showing the versatility of the design. CleanPage is compared with two market leader scanning apps using two testing approaches: real paper scans and ground-truth comparisons. These comparisons are achieved by a new testing methodology that allows scans to be compared to unscanned counterparts, by using synthesized images. Real paper scans are tested using image quality measures. An evaluation of standard image quality assessments is included in this work and a novel quality measure for scanned images is proposed and validated. The user experience for each scanning app is assessed, showing CleanPage to be fast and easier to use.
ARTICLE | doi:10.20944/preprints202010.0323.v1
Subject: Engineering, Automotive Engineering Keywords: Image segmentation; sonar image; ocean engineering; morphological image processing
Online: 15 October 2020 (13:10:41 CEST)
Segmenting sonar images has remained a hard problem for years, since most of them are noisy and remain blurred after noise reduction. To address this problem, a fast segmentation algorithm is proposed on the basis of the gray value characteristics of sonar images. An advantage of this algorithm is that no segmentation thresholds need to be calculated. It proceeds as follows. First, the gray matrix of the fuzzy image background is calculated. After the gray values are adjusted, the image is segmented into the background region, buffer region and target regions. After filtering, the pixels with gray value lower than 255 are reset to binarize the image and eliminate most artifacts. Finally, the remaining noise is removed by means of morphological image processing. The simulation results on several sonar images show that the algorithm can segment fuzzy sonar images quickly and effectively, without leaving target shapes incomplete. The method is thus shown to be stable and feasible.
REVIEW | doi:10.20944/preprints202306.1179.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: image forensics; image forgery detection; robust image watermarking; deep learning
Online: 16 June 2023 (11:07:50 CEST)
Digital images have become an important carrier for people to access information in the information age. However, with the development of the technology, digital images are vulnerable to illegal access and tampering, to the extent that they pose a serious threat to personal privacy, social order and national security. Therefore, image forensic techniques have become an important research topic in the field of multimedia information security. In recent years, deep learning technology has been widely applied in the field of image forensics and the performance achieved has significantly exceeded the conventional forensic algorithms. This survey compares the state-of-the-art image forensic techniques based on deep learning in recent years. The image forensic techniques are divided into passive and active forensics. In passive forensics, forgery detection techniques are reviewed, and the basic framework, evaluation metrics and commonly used datasets for forgery detection are presented. The performance, advantages and disadvantages of existing methods are also compared and analyzed according to different types of detection. In active forensics, robust image watermarking techniques are overviewed, the evaluation metrics and basic framework of robust watermarking techniques are presented. The technical characteristics and performance of existing methods are analyzed based on the different types of attacks on images. Finally, future research directions and conclusions are given to provide useful suggestions for people in image forensics and related research fields.
ARTICLE | doi:10.20944/preprints201703.0086.v1
Subject: Engineering, Control And Systems Engineering Keywords: image enhancement; image fusion; color space; edge detector; underwater image
Online: 14 March 2017 (17:52:48 CET)
In order to improve contrast and restore color for underwater images captured by camera sensors without suffering from insufficient details and color cast, a fusion algorithm for image enhancement in different color spaces based on contrast limited adaptive histogram equalization (CLAHE) is proposed in this article. The original color image is first converted from RGB color space to two different special color spaces: YIQ and HSI. The color space conversion from RGB to YIQ is a linear transformation, while the RGB to HSI conversion is nonlinear. Then, the algorithm separately applies CLAHE in the YIQ and HSI color spaces to obtain two different enhanced images. The luminance component (Y) in the YIQ color space and the intensity component (I) in the HSI color space are enhanced with the CLAHE algorithm. CLAHE has two key parameters, Block Size and Clip Limit, which mainly control the quality of the CLAHE-enhanced image. After that, the YIQ and HSI enhanced images are each converted back to RGB color space. When the three components of red, green, and blue are not coherent in the YIQ-RGB or HSI-RGB images, the three components have to be harmonized with the CLAHE algorithm in RGB space. Finally, with a 4-direction Sobel edge detector in the bounded general logarithm ratio operation, a self-adaptive weight selection nonlinear image enhancement is carried out to fuse the YIQ-RGB and HSI-RGB images together to achieve the final fused image. The enhancement fusion algorithm has two key factors, the average of the Sobel edge detector and the fusion coefficient, and these two factors determine the effects of the enhancement fusion algorithm. A series of evaluation metrics such as mean, contrast, entropy, colorfulness metric (CM), mean square error (MSE) and peak signal to noise ratio (PSNR) are used to assess the proposed enhancement algorithm.
The experimental results show that the proposed algorithm provides more detail enhancement and higher values of colorfulness restoration as compared to other existing image enhancement algorithms. The proposed algorithm can effectively suppress noise interference and improve the quality of underwater images.
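The idea of enhancing only the luminance channel in YIQ space and converting back can be sketched for a single pixel with Python's standard `colorsys` module. This is a rough illustration, not the authors' pipeline: the function name `enhance_luminance` and the simple `gain` factor are assumptions, and the gain stands in for CLAHE, which operates on whole image tiles rather than single pixels.

```python
import colorsys


def enhance_luminance(rgb, gain=1.2):
    """Scale only the Y channel of one RGB pixel (components in [0, 1]); I and Q are kept."""
    y, i, q = colorsys.rgb_to_yiq(*rgb)   # linear RGB -> YIQ transform
    y = min(1.0, y * gain)                # placeholder for CLAHE on the Y channel
    return colorsys.yiq_to_rgb(y, i, q)   # back to RGB (clamped to [0, 1] by colorsys)
```

With `gain=1.0` the function reduces to a round trip through YIQ, which returns the original color up to floating-point error.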
REVIEW | doi:10.20944/preprints202307.0585.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: Underwater image analysis; Underwater image restoration; Underwater image enhancement; Underwater datasets; Underwater image quality evaluation
Online: 10 July 2023 (10:06:22 CEST)
In recent years, underwater exploration for deep-sea resource utilization and development has attracted considerable interest. In an underwater environment, the obtained images and videos undergo several types of quality degradation resulting from light absorption and scattering, low contrast, color deviation, blurred details, and nonuniform illumination. Therefore, the restoration and enhancement of degraded images and videos are critical. Numerous techniques of image processing, pattern recognition and computer vision have been proposed for image restoration and enhancement, but many challenges remain. This survey presents a comparison of the most prominent approaches in underwater image processing and analysis. It also gives an overview of the underwater environment with a broad classification into enhancement and restoration techniques and introduces the main causes of underwater image degradation in addition to the underwater image model. The existing underwater image analysis techniques, methods, datasets, and evaluation metrics are presented in detail. Furthermore, the existing limitations are analyzed, which are classified into image-related and environment-related categories. In addition, the performance is validated on images from the UIEB dataset for qualitative, quantitative, and computational time assessment. Areas in which underwater images have recently been applied are briefly discussed. Finally, recommendations for future research are provided and the conclusion is presented.
ARTICLE | doi:10.20944/preprints201902.0089.v3
Subject: Computer Science And Mathematics, Probability And Statistics Keywords: Digital image processing, color image, grayscale image, histogram equalization, histogram specification, image enhancement, RGB channel
Online: 11 February 2019 (10:42:57 CET)
This paper has two major parts. In the first part histogram equalization for the image enhancement was implemented without using the built-in function in MATLAB. Here, at first, a color image of a rat was chosen and the image was transformed into a grayscale image. After this conversion, histogram equalization was implemented on the grayscale image. Later on, in the same image for each RGB channel, histogram equalization was implemented to observe the effect of histogram equalization on each channel. In the end, the histogram equalization was implemented to this specific color image of a rat. In the second part, for the grayscale image in part 1, the desired histogram of another colored image of a rat was introduced and histogram specification was implemented on the original colored image.
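Histogram equalization without a built-in function, as described in the abstract above, can be sketched as follows. The sketch is written in Python rather than MATLAB and uses the common CDF-based mapping; the function name `equalize` and the handling of constant images are assumptions for this example, not details taken from the paper.

```python
def equalize(gray, levels=256):
    """Histogram-equalize a 2-D list of integer gray values in [0, levels)."""
    flat = [v for row in gray for v in row]
    hist = [0] * levels
    for v in flat:
        hist[v] += 1

    # Cumulative distribution function (as pixel counts) of the gray levels.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    total = len(flat)
    cdf_min = next(c for c in cdf if c > 0)   # CDF value at the darkest occupied level
    if total == cdf_min:                      # constant image: nothing to spread out
        return [row[:] for row in gray]

    # Map each level so that the occupied range stretches over [0, levels - 1].
    lut = [round(max(c - cdf_min, 0) / (total - cdf_min) * (levels - 1)) for c in cdf]
    return [[lut[v] for v in row] for row in gray]
```

For a color image, the same function could be applied to each of the R, G and B channels separately, which is the per-channel experiment the abstract describes.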
ARTICLE | doi:10.20944/preprints201811.0565.v1
Subject: Computer Science And Mathematics, Probability And Statistics Keywords: Digital image processing, color image, grayscale image, histogram equalization, histogram specification, image enhancement, RGB channel
Online: 23 November 2018 (14:17:13 CET)
This paper has two major parts. In the first part histogram equalization for the image enhancement was implemented without using the built-in function in MATLAB. Here, at first, a color image of a rat was chosen and the image was transformed into a grayscale image. After this conversion, histogram equalization was implemented on the grayscale image. Later on, in the same image for each RGB channel, histogram equalization was implemented to observe the effect of histogram equalization on each channel. In the end, the histogram equalization was implemented to this specific color image of a rat. In the second part, for the grayscale image in part 1, the desired histogram of another colored image of a rat was introduced and histogram specification was implemented on the original colored image.
ARTICLE | doi:10.20944/preprints202309.2177.v1
Subject: Engineering, Mechanical Engineering Keywords: particle image velocimetry; OpenPIV; python; image processing
Online: 30 September 2023 (09:59:14 CEST)
Particle Image Velocimetry (PIV) is a widely used experimental technique for measuring flow. In recent years, open-source PIV software has become more popular as it offers researchers and practitioners enhanced computational capabilities. Software development for graphical processing unit (GPU) architectures requires careful algorithm design and data structure selection for optimal performance. PIV software optimized for central processing units (CPUs) offers an alternative to specialized GPU software. In the present work, an improved algorithm for the OpenPIV-Python software is presented and implemented under a traditional CPU framework. The Python language was selected due to its versatility and widespread adoption. The algorithm was also tested on a supercomputing cluster, a workstation, and Google Colaboratory during the development phase. Using a known velocity field, the algorithm precisely captured the time-average flow, momentary velocity fields, and vortices.
ARTICLE | doi:10.20944/preprints202304.1088.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: deep learning; image aesthetics assessment; image enhancement
Online: 28 April 2023 (03:15:16 CEST)
Image aesthetic assessment (IAA) with neural attention has made significant progress due to its effectiveness in object recognition. Current studies have shown that the features learned by convolutional neural networks (CNN) at different learning stages indicate meaningful information. The shallow feature contains the low-level information of images and the deep feature perceives the image semantics and themes. Inspired by this, we propose a visual enhancement network with feature fusion (FF-VEN). It consists of two sub-modules, the visual enhancement module (VE module) and the shallow and deep feature fusion module (SDFF module). The former uses an adaptive filter in the spatial domain to simulate human eyes according to the region of interest (ROI) extracted by neural feedback. The latter not only takes out the shallow feature and the deep feature by transverse connection, but also uses a feature fusion unit (FFU) to fuse the pooled features together with the aim of information contribution maximization. Experiments on the standard AVA dataset and the Photo.net dataset show the effectiveness of FF-VEN.
ARTICLE | doi:10.20944/preprints202310.0838.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: neural networks; image denoising; image processing; denoising algorithms
Online: 13 October 2023 (04:19:29 CEST)
Image denoising has been one of the important problems in the field of computer vision, and it has a wide range of practical value in many applications, such as medical image processing, image enhancement, and computational photography. Traditional image denoising methods are usually based on hand-designed features and filters, but these methods perform poorly under complex noise and image structures. In recent years, the rapid development of neural network technology has revolutionized the image-denoising task. This paper introduces the background on neural networks and image denoising, explores the impact of neural networks on image denoising, and explains how neural networks can be used to denoise images. It also summarises other image-denoising methods and finally points out the challenges and problems currently faced by image denoising. Some possible new development directions are proposed to provide new solutions for image-denoising researchers and to promote the development of the field.
ARTICLE | doi:10.20944/preprints202306.0081.v1
Subject: Engineering, Bioengineering Keywords: Deep Learning; Image Synthesis; Image Generation; Machine Learning; Medical Imaging; CT to MRI; Synthetic MRI; Stroke; Image-to-image Translation
Online: 1 June 2023 (11:30:09 CEST)
CT scans are currently the most common imaging modality used for suspected stroke patients due to their short acquisition time and wide availability. However, MRI offers superior tissue contrast and image quality. In this study, eight deep learning models are developed, trained, and tested using a dataset of 181 CT/MR pairs from stroke patients. The resultant synthetic MRIs generated by these models are compared through a variety of qualitative and quantitative methods. The synthetic MRIs generated by a 3D UNet model consistently demonstrated superior performance across all methods of evaluation. Overall, the generation of synthetic MRIs from CT scans using the methods described in this paper produces realistic MRIs that can guide the registration of CT scans to MRI atlases. The synthetic MRIs enable the segmentation of white matter, gray matter, and cerebrospinal fluid using algorithms designed for MRIs, exhibiting a high degree of similarity to true MRIs.
ARTICLE | doi:10.20944/preprints202309.0946.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: adversarial attacks; artificial neural networks; robustness; image filtering; convolutional neural networks; image recognition; image distortion
Online: 14 September 2023 (08:31:30 CEST)
In this paper, we continue the research cycle on the properties of convolutional neural network-based image recognition systems and ways to improve noise immunity and robustness. Currently, a popular research area related to artificial neural networks is adversarial attacks. The effect of adversarial attacks on the image is barely perceptible to the human eye, yet it drastically reduces the neural network accuracy. Image perception by a machine is highly dependent on the propagation of high-frequency distortions throughout the network. At the same time, a human efficiently ignores high-frequency distortions, perceiving the shape of objects as a whole. The approach proposed in this paper can improve the image recognition accuracy in the presence of high-frequency distortions, in particular those caused by adversarial attacks. The proposed technique makes it possible to bring the logic of an artificial neural network closer to that of a human, for whom high-frequency distortions are not decisive in object recognition.
ARTICLE | doi:10.20944/preprints202306.0736.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: image denoising; image deblurring; salt&pepper noise; nonlinear diffusion.
Online: 12 June 2023 (02:18:59 CEST)
An algorithm for the treatment of images affected by both blurring and salt&pepper noise is proposed with a cost only proportional to the number of pixels. The methodology uses a discretization scheme for the Laplace operator multiplied by a suitable nonlinear term depending on the gradient. Even if this approach resembles a diffusion type algorithm, only one step of the procedure is applied, leading to significant time savings. The procedure is successfully tested on some standard black&white natural images.
ARTICLE | doi:10.20944/preprints202108.0286.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: Image enhancement; DCT-Domain Perceived Contrast; Perceptual Image Quality
Online: 13 August 2021 (08:31:37 CEST)
This paper develops a detail image signal enhancement method that makes images be perceived as clearer and more resolved, and so is more effective for higher resolution displays. We observe that enhancing the local variant signals makes images more vivid, and that revealing more of the granular signals harmonically embedded in the local variant signals makes images more resolved. Based on this observation, we develop a method that not only emphasizes the local variant signals by scaling up the frequency energy in accordance with human visual perception, but also strengthens the granular signals by embedding the alpha-rooting enhanced frequency components. The proposed energy scaling method emphasizes the detail signals in texture images and rarely boosts noisy signals in plain images. In addition, to avoid the local ringing artifact, the proposed method adjusts the enhancement direction to be parallel to the underlying image signal direction. Subjective and objective quality evaluations verified that the developed method makes images be perceived as clearer and highly resolved.
ARTICLE | doi:10.20944/preprints202101.0345.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: image processing; low resolution image; crack detection; user algorithm
Online: 18 January 2021 (14:26:38 CET)
Imaging devices of less than 300,000 pixels are mostly used for sewage conduit exploration due to the small scale of the survey industry in Korea. In particular, devices of less than 100,000 pixels are still widely used, and the environment for image processing is very poor. Since the sewage conduit images covered in this study have a very low resolution (240 × 320 = 76,800 pixels), it is very difficult to detect cracks. Because most sewer conduit images in Korea have very low resolution, this problem of low resolution was selected as the subject of study. Cracks were detected through a total of six steps, which include enhancing the crack in Step 2, finding the optimal threshold value in Step 3, and applying an algorithm to detect cracks in Step 5. Cracks were effectively detected by the optimal parameters in Steps 2 and 3 and the user algorithm in Step 5. Despite the very low resolution, the cracked images showed 96.4% accuracy of detection, and the non-cracked images showed 94.5% accuracy. Moreover, the analysis was also excellent in quality. It is believed that the findings of this study can be effectively used for crack detection with low-resolution images.
ARTICLE | doi:10.20944/preprints201810.0393.v1
Subject: Engineering, Control And Systems Engineering Keywords: image analysis; Turin Shroud; body-image formation; energy propagation
Online: 18 October 2018 (03:55:21 CEST)
Recent studies on the image of the Turin Shroud (TS) suggest that it could have been formed through a not well-identified mechanism of energy radiation. In order to fill some gaps in the understanding of this imaging process, a reverse engineering method has been applied to it, which made it possible to exclude some possible mechanisms. The image formation of a human face wrapped in a cloth has been simulated by using ad-hoc developed software. The results of different kinds of radiation, depending on different parameters, have been simulated, each one connected with accredited hypotheses. On the basis of the comparison among the different images produced by the software and the TS Face, some useful information about both the kind of radiation and the cloth wrapping conditions has been obtained. The effect of image distortion of a cloth wrapped around a face has also been discussed by defining the best laws of radiation and of their attenuation with distance. A Lambertian law is not compatible with the TS image. A vertical radiation shows a problem in reproducing the requested resolution. A radiation perpendicular to the emitting surface, like that produced by an electric field, appears promising to explain the TS Face.
ARTICLE | doi:10.20944/preprints201705.0028.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: monocular image; image segment; SIFT; depth measurement; convex hull
Online: 3 May 2017 (09:19:59 CEST)
Recovering depth information of objects from two-dimensional images is one of the most important and basic problems in the computer vision field. In view of the shortcomings of existing methods of depth estimation, a novel approach based on SIFT (the Scale Invariant Feature Transform) is presented in this paper. The approach can estimate the depths of objects in two images which are captured by an uncalibrated ordinary monocular camera. In this approach, the first image is captured first. All of the camera parameters remain unchanged, and the second image is acquired after moving the camera a distance d along the optical axis. Then image segmentation and SIFT feature extraction are performed on the two images separately, and objects in the images are matched. Lastly, an object depth can be computed from the lengths of a pair of straight line segments. In order to ensure that the most appropriate pair of straight line segments is chosen and to reduce the computation, the theory of convex hull and the knowledge of triangle similarity are employed. The experimental results show our approach is effective and practical.
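The triangle-similarity step can be illustrated under a simple pinhole-camera assumption, which the abstract does not spell out. If a segment of real length L, parallel to the image plane, lies at distance Z from the first camera position, its image length is l1 = f·L/Z. After the camera moves a distance d toward the object along the optical axis, the image length becomes l2 = f·L/(Z − d). Dividing the two relations eliminates the unknown focal length f and the unknown L, giving Z = d·l2/(l2 − l1). The function name and argument names below are hypothetical.

```python
def depth_from_segments(len_first, len_second, d):
    """Distance from the first camera position to the object (same unit as d).

    len_first, len_second: image lengths (e.g. in pixels) of the same matched
    segment before and after moving the camera a distance d toward the object.
    Assumes a pinhole camera and a segment parallel to the image plane.
    """
    if len_second <= len_first:
        raise ValueError("the segment must appear longer after moving closer")
    return d * len_second / (len_second - len_first)
```

For example, a segment that grows from 100 to 125 pixels after a 0.5 m forward move would, under these assumptions, lie 2.5 m from the first camera position. Because only a ratio of image lengths is used, no camera calibration is required.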
ARTICLE | doi:10.20944/preprints201611.0057.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: multi-focus image, image fusion, region mosaic, contrast pyramid
Online: 10 November 2016 (07:34:22 CET)
This paper proposes a new approach for multi-focus image fusion based on Region Mosaicing on Contrast Pyramids (REMCP). A density-based region growing method is developed to construct a focused region mask for multi-focus images. The segmented focused region mask is decomposed into a mask pyramid, which is then used for supervised region mosaicing on a contrast pyramid. In this way, the focus measurement and the continuity of focused regions are incorporated and the pixel-level pyramid fusion is improved at the region level. Objective and subjective experiments show that the proposed REMCP is more robust to noise than the compared algorithms and fully preserves the focus information of the multi-focus images while reducing distortions of the fused images.
ARTICLE | doi:10.20944/preprints201811.0566.v2
Subject: Computer Science And Mathematics, Probability And Statistics Keywords: Color image, grayscale image, motion blurring, random noise, inverse filtering, Wiener filtering, restoration of an image
Online: 5 February 2019 (16:13:14 CET)
In this paper, at first, a color image of a car is taken. Then the image is transformed into a grayscale image. After that, the motion blurring effect is applied to that image according to the image degradation model described in equation 3. The blurring effect can be controlled by a and b components of the model. Then random noise is added in the image via Matlab programming. Many methods can restore the noisy and motion blurred image; particularly in this paper Inverse filtering as well as Wiener filtering are implemented for the restoration purpose. Consequently, both motion blurred and noisy motion blurred images are restored via Inverse filtering as well as Wiener filtering techniques and the comparison is made among them.
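The abstract refers to a degradation model ("equation 3") that is not reproduced on this page. For orientation only, the standard textbook forms of the frequency-domain degradation model, the uniform linear motion-blur transfer function with displacement components a and b, and the two restoration filters are given below; whether the paper uses exactly these forms is an assumption, and the exposure time T and the constant K are symbols introduced here.

```latex
\begin{align}
  % Degradation model: blurred spectrum = transfer function x image spectrum + noise
  G(u,v) &= H(u,v)\,F(u,v) + N(u,v) \\
  % Uniform linear motion blur with displacements a, b over exposure time T
  H(u,v) &= \frac{T}{\pi(ua + vb)}\,\sin\!\big(\pi(ua + vb)\big)\,e^{-j\pi(ua + vb)} \\
  % Inverse filtering
  \hat{F}_{\text{inv}}(u,v) &= \frac{G(u,v)}{H(u,v)} \\
  % Wiener filtering with a constant K approximating the noise-to-signal power ratio
  \hat{F}_{\text{W}}(u,v) &= \left[\frac{1}{H(u,v)}\,
      \frac{|H(u,v)|^{2}}{|H(u,v)|^{2} + K}\right] G(u,v)
\end{align}
```

Inverse filtering amplifies noise wherever |H(u,v)| is close to zero, which is why the Wiener filter, which reduces to the inverse filter when K = 0, is usually the more robust of the two on noisy blurred images.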
ARTICLE | doi:10.20944/preprints202201.0259.v2
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: image classifier; image part; quick learning; feature overlap; positional context
Online: 11 April 2022 (10:17:57 CEST)
This paper describes an image processing method that makes use of image parts instead of neural parts. Neural networks excel at image or pattern recognition and they do this by constructing complex networks of weighted values that can cover the complexity of the pattern data. These features however are integrated holistically into the network, which means that they can be difficult to use in an individual sense. A different method might scan individual images and use a more local method to try to recognise the features in it. This paper suggests such a method, where a trick during the scan process can not only recognise separate image parts, as features, but it can also produce an overlap between the parts. It is therefore able to produce image parts with real meaning and also place them into a positional context. Tests show that it can be quite accurate, on some handwritten digit datasets, but not as accurate as a neural network, for example. The fact that it offers an explainable interface could make it interesting however. It also fits well with an earlier cognitive model, and an ensemble-hierarchy structure in particular.
ARTICLE | doi:10.20944/preprints202008.0336.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: image processing; image classification; computer vision; expert systems; amber gemstones
Online: 15 August 2020 (04:39:11 CEST)
The article describes a classification solution for amber stones. The problem of classifying amber has long been known among jewelers and artisans of amber art. Existing solutions can classify amber pieces according to color, but the need to classify by shape and texture has not been satisfied up to now. The proposed solution is capable of classifying the gemstones according to shape. Amber can be considered a specific object since its form is difficult to define unambiguously. Data for the amber experiments was gathered from amber art craftsmen. In the proposed solution, amber form can be classified into 10 different classes (7 classes chosen during the experiment).
ARTICLE | doi:10.20944/preprints202006.0117.v1
Subject: Medicine And Pharmacology, Other Keywords: Image Noise Removal; Image Enhancement; MFNR; Speckle noise; Median Filter
Online: 9 June 2020 (05:00:26 CEST)
Speckle noise is one of the most difficult noises to remove, especially in medical applications. It is a nuisance in ultrasound imaging systems, which are used in about half of all medical screening systems. Thus, noise removal is an important step in these systems, making it possible to create reliable, automated, and potentially low-cost systems. Herein, a generalized approach, MFNR (Multi-Frame Noise Removal), is used, which is a complete noise removal system using KDE (Kernel Density Estimation). Any given type of noise can be removed if its probability density function (PDF) is known. Herein, we extracted the PDF parameters using KDE. Noise removal and detail preservation are not contrary to each other, as is the case in single-frame noise removal methods. Our results showed practically complete noise removal using the MFNR algorithm compared to standard noise removal tools. The Peak Signal to Noise Ratio (PSNR) performance was used as a comparison metric. This paper is an extension of our previous paper, where the MFNR algorithm was shown to be a general-purpose complete noise removal tool for all types of noise.
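The PSNR metric used for comparison above follows directly from the mean squared error between a reference image and a processed image. The sketch below assumes 8-bit images stored as lists of lists, so the peak value is 255; the function name `psnr` is chosen for this example.

```python
import math


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2-D lists of gray values."""
    ref = [v for row in reference for v in row]
    tst = [v for row in test for v in row]
    if len(ref) != len(tst) or not ref:
        raise ValueError("images must be non-empty and have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref, tst)) / len(ref)
    if mse == 0:
        return math.inf          # identical images
    return 10.0 * math.log10(peak ** 2 / mse)
```

Higher values indicate that the processed image is closer to the reference; identical images give an infinite PSNR.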
ARTICLE | doi:10.20944/preprints202002.0125.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: image inpainting; image completion; attention; pyramid structure loss; deep learning
Online: 10 February 2020 (10:16:37 CET)
This paper develops a multi-task learning framework that attempts to incorporate image structure knowledge to assist image inpainting, which has not been well explored in previous work. The primary idea is to train a shared generator to simultaneously complete the corrupted image and the corresponding structures (edge and gradient), thus implicitly encouraging the generator to exploit relevant structure knowledge while inpainting. In the meantime, we also introduce a structure embedding scheme to explicitly embed the learned structure features into the inpainting process, thereby providing possible preconditions for image completion. Specifically, a novel pyramid structure loss is proposed to supervise structure learning and embedding. Moreover, an attention mechanism is developed to further exploit the recurrent structures and patterns in the image to refine the generated structures and contents. Through multi-task learning, structure embedding, and attention, our framework takes advantage of the structure knowledge and outperforms several state-of-the-art methods on benchmark datasets both quantitatively and qualitatively.
ARTICLE | doi:10.20944/preprints201906.0248.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: image segmentation; neutrosophic information; Shannon entropy; gray level image threshold
Online: 25 June 2019 (08:48:22 CEST)
This article presents a new method of segmenting grayscale images by minimizing Shannon's neutrosophic entropy. For the proposed segmentation method, the neutrosophic information components, i.e., the degree of truth, the degree of neutrality and the degree of falsity, are defined by taking into account membership in the segmented regions and, at the same time, membership in the separation threshold area. The principle of the method is simple and easy to understand, and it can lead to multiple thresholds. The efficacy of the method is illustrated using some test gray level images. The experimental results show that the proposed method performs well for segmentation with optimal gray level thresholds.
ARTICLE | doi:10.20944/preprints201904.0078.v1
Subject: Social Sciences, Psychology Keywords: forest recreation; forest landscape; landscape image; landscape image sketching technique
Online: 8 April 2019 (09:08:30 CEST)
The landscape image is a bridge of communication between people and forests, and an entry point for the supply-side reform of forest tourism products. This research collected a total of 140 forest landscape image drawings from non-art-major graduate students by random sampling during April and May 2018, and constructed a conceptual model of the forest landscape image using the landscape image sketching technique. The results showed the following. (1) In regard to linguistic knowledge, the natural landscape elements (for instance, herbaceous plants, terrain, creatures, water and sky) and the broad-leaf forest objectively reflected not only the real forest landscape and the local native vegetation but also, with little attention, the variation of forest species. (2) In terms of spatial view, the sideways view indicated that graduate students preferred to view forests externally from a moderate distance, and few looked at forests from the inside. (3) In terms of self-orientation, the objective landscape indicated that graduate students preferred to depict forest landscapes and did not think of interacting with the environment. (4) In terms of social meaning, the scenic view and forest structure indicated that graduate students preferred rural forest landscapes and showed no significant special interest in other aspects of the forest. In conclusion, (1) the forest is thought of as a feature of people's life world and of rural scenes around homes, rather than being perceived objectively. (2) The forest is regarded as an important habitat for animals and a limited resource for people's life, production and recreation needs, into which people will go only to meet such needs. (3) The natural values of forests, such as ecology and aesthetics, get more attention, while the social values of forests, such as life, production and culture, receive rather little attention.
ARTICLE | doi:10.20944/preprints202006.0091.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: Breast Cancer Screening; Digital Image Elasto Tomography (DIET); Image Noise Removal, Image Enhancement; Multiple Frame Noise Removal (MFNR)
Online: 7 June 2020 (14:53:34 CEST)
Breast cancer is a leading cause of death among women. Conventional screening methods, such as mammography and ultrasound diagnosis, are expensive and have significant limitations. Digital Image Elasto Tomography (DIET) is a new noninvasive breast cancer screening system that has the potential to be a low-cost and reliable breast cancer screening tool. It is based on modal analysis of the breast mass and stereographic 3D image analysis to detect the stiffer abnormal tissues. However, camera sensor noise, especially Gaussian noise, is a major source of Optical Flow (OF) error in this approach to tumor detection. This work studies the performance of different conventional filters, including the standard Gaussian filter, in removing this noise and producing more robust screening results. A radical approach, Multiple Frame Noise Removal (MFNR), is proposed for use in this type of medical image processing instead of a Gaussian filter or other typical image noise removal tools. It is a multiple-frame noise removal method in which the Probability Density Function (PDF) of the noise is extracted from multiple images by characterizing the same pixel positions across those images. The noise becomes deterministic and is hence easily removed. The proposed algorithm was applied to a data set from 10 phantom breast tests with a prototype DIET system, and to 10 in-vivo samples from healthy women. Comparisons were made to an optimal Gaussian filter form that is commonly used. Reductions in OF error on these digitally imaged data sets were used to compare performance. Refinement of the images for medical applications requires higher PSNR, which was successfully achieved by using the MFNR algorithm. In this study, the algorithm was used to improve the imaging results of a DIET system. The conventional wisdom that states that noise removal and detail preservation are contrasting effects is
ARTICLE | doi:10.20944/preprints202310.1144.v1
Subject: Biology And Life Sciences, Life Sciences Keywords: image denoising; filtering methods; biomedical image denoising; healthcare; adaptive filtering methods
Online: 18 October 2023 (09:18:36 CEST)
In this paper, filtering methods for biomedical image denoising are described comprehensively. The paper introduces biomedical image denoising, describes the relationship between biomedical image denoising and medical care, and introduces filtering methods in general, the filtering methods used for biomedical image denoising, the challenges encountered by current filtering methods, and other application fields of filtering methods. First, the background of biomedical image denoising is introduced. Biomedical image denoising is a challenge: different imaging modes have different noise characteristics, and noise levels can vary greatly depending on the specific application. Second, the paper describes the important role that biomedical image denoising plays in medical care, since the biomedical image directly affects the patient's diagnosis, the treatment plan and the overall quality of medical care. Then filtering methods are introduced in detail, describing the core concepts and related features of linear filtering, nonlinear filtering and frequency domain filtering, and then focusing on adaptive filtering, describing its characteristics, conditions of use, common algorithms and advantages. Next, the filtering methods used for biomedical image denoising are introduced, covering the core concepts of the Gaussian filter, median filter, total variation denoising and Wiener filter. Then, the challenges encountered by filtering methods are described, such as the accurate selection of filters and the balance between noise reduction and image detail preservation. Finally, applications of filtering methods in other fields are mentioned, such as audio processing and speech recognition.
In summary, this paper comprehensively expounds the denoising and filtering methods of biomedical images, the filtering methods of medical image denoising, the relationship between medical image denoising and medical care, and the challenges encountered by filtering methods.
REVIEW | doi:10.20944/preprints202309.0223.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: deep learning; medical images; image registration; medical image analysis; survey; review
Online: 5 September 2023 (03:51:29 CEST)
Image registration (IR) is a process that deforms images to align them with respect to a reference space, making it easier for medical practitioners to examine various medical images in a standardized reference frame, such as having the same rotation and scale. This document introduces image registration using a simple numeric example. It provides a definition of image registration along with a space-oriented symbolic representation. This review covers various aspects of image transformations, including affine, deformable, invertible, and bidirectional transformations, as well as medical image registration algorithms such as Voxelmorph, Demons, SyN, Iterative Closest Point, and SynthMorph. It also explores atlas-based registration and multistage image registration techniques, including coarse-fine and pyramid approaches. Furthermore, this survey paper discusses medical image registration taxonomies, datasets, evaluation measures, such as correlation-based metrics, segmentation-based metrics, processing time, and model size. It also explores applications in image-guided surgery, motion tracking, and tumor diagnosis. Finally, the document addresses future research directions, including the further development of transformers.
ARTICLE | doi:10.20944/preprints202105.0408.v1
Subject: Engineering, Automotive Engineering Keywords: UAV Images; Monoscopic Mapping; Stereoscopic Plotting; Image Overlap; Optimal Image Selection
Online: 18 May 2021 (10:10:07 CEST)
Recently, the mapping industry has been focusing on the possibility of large-scale mapping from unmanned aerial vehicles (UAVs) owing to advantages such as easy operation and cost reduction. In order to produce large-scale maps from UAV images, it is important to obtain precise orientation parameters. For this, various techniques have been developed and are included in most commercial UAV image processing software. For mapping, it is equally important to select images that can cover a region of interest (ROI) with the fewest possible images. Otherwise, to map the ROI, one may have to handle too many images, and commercial software neither provides the information needed to select images nor explicitly explains how to select images for mapping. For these reasons, stereo mapping of UAV images in particular is time consuming and costly. In order to solve these problems, this study proposes a method to select images intelligently. We can select a minimum number of image pairs to cover the ROI with the fewest possible images. We can also select optimal image pairs to cover the ROI with the most accurate stereo pairs. We group images by strips and generate the initial image pairs. We then apply an intelligent scheme to iteratively select optimal image pairs from the start to the end of an image strip. According to the results of the experiment, the number of images selected is greatly reduced by applying the proposed optimal image-composition algorithm. The selected image pairs produce a dense 3D point cloud over the ROI without any holes. For stereoscopic plotting, the selected image pairs were used to map the ROI successfully on a digital photogrammetric workstation (DPW), and a digital map covering the ROI was generated. The proposed method should contribute to time and cost reductions in UAV mapping.
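The idea of walking along a strip and keeping only as many images as are needed to cover the ROI can be illustrated with a classic greedy interval cover. This is only a sketch under simplifying assumptions, not the selection scheme of the paper: each image footprint is reduced to a hypothetical 1-D interval `(start, end)` along the strip, and the name `select_min_cover` is invented for the example.

```python
def select_min_cover(footprints, roi_start, roi_end):
    """Greedily pick footprint indices whose intervals cover [roi_start, roi_end].

    footprints: list of (start, end) along-strip extents, one per image.
    At each step, the footprint that starts inside the covered part and
    reaches furthest forward is chosen, which minimises the number of picks.
    """
    selected = []
    covered = roi_start
    while covered < roi_end:
        best_index, best_end = None, covered
        for index, (start, end) in enumerate(footprints):
            if start <= covered and end > best_end:
                best_index, best_end = index, end
        if best_index is None:
            raise ValueError("footprints leave a gap at position %s" % covered)
        selected.append(best_index)
        covered = best_end
    return selected
```

A real stereo workflow would also have to score overlap and geometric accuracy for each pair; the sketch only captures the coverage criterion.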
REVIEW | doi:10.20944/preprints202012.0479.v1
Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: Image classification; Texture image analysis; Discriminant features; Combination methods; texture operators
Online: 18 December 2020 (16:21:50 CET)
In many image processing and computer vision applications, the main aim is to describe image contents. To achieve this aim, different visual properties such as color, texture and shape are extracted. In this respect, texture information plays an important role in image description and visual pattern classification. Texture refers to a specific local distribution of intensities that is repeated throughout the image. To date, different operators or descriptors have been proposed to analyze texture characteristics. In multi-object images, a specific texture operator usually does not provide accurate results. So, in many cases, a combination of texture operators is used to obtain more discriminant features. In this paper, some combination methods are surveyed to analyze the effect of combined texture features on image content description. In addition, in the results section, different related methods are compared in terms of accuracy and computational complexity.
ARTICLE | doi:10.20944/preprints202005.0167.v1
Subject: Computer Science And Mathematics, Applied Mathematics Keywords: neutrosophic information; Onicescu information energy; image segmentation; gray level image threshold
Online: 10 May 2020 (14:41:04 CEST)
This article presents a method of segmenting images with gray levels that uses Onicescu's information energy calculated in the context of the neutrosophic theory. Starting from the information energy calculation for complete neutrosophic information, it is shown how to extend its calculation for incomplete and inconsistent neutrosophic information. The segmentation method is based on calculation of thresholds for separating the gray levels using the local maximum points of the Onicescu information energy.
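As a point of reference for this abstract, the classical Onicescu information energy of a discrete distribution is the sum of its squared probabilities. The sketch below computes it for a gray-level histogram; it covers only this classical quantity, not the neutrosophic extension or the thresholding rule of the article, and the function name `information_energy` is an assumption.

```python
def information_energy(histogram):
    """Onicescu information energy: sum of squared relative frequencies.

    histogram: non-negative counts per gray level.
    The value is 1.0 when all pixels share one gray level and 1/n when
    the pixels are spread uniformly over n gray levels.
    """
    total = sum(histogram)
    if total == 0:
        raise ValueError("histogram is empty")
    return sum((count / total) ** 2 for count in histogram)
```

Unlike Shannon entropy, this measure grows as the distribution becomes more concentrated, which is why local maxima are the points of interest.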
ARTICLE | doi:10.20944/preprints202303.0326.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: document image processing; deskew; Hough Line Transform; image rectification; machine learning; OCR; document orientation; image preprocessing; computer vision; AI
Online: 17 March 2023 (13:25:06 CET)
Document deskewing is a fundamental problem in document image processing. Existing methods have limitations: the Hough Line Transform can leave deskewed images upside down, Deep Learning models require huge amounts of human labour and computational resources and still fail to deskew while taking care of orientation, and OCR-based methods struggle to read text when it is tilted. In this paper, we propose a novel, simple, cost-effective deep learning method for fixing the skew and orientation of documents. Our approach reduces the search space for the machine learning model to predicting whether an image is upside down or not, avoiding the huge search space of predicting an angle between 0 and 360. We fine-tuned a MobileNetV2 model, pre-trained on ImageNet, using only 200 images and achieved good results. This method is useful for automation-based tasks, such as data extraction using OCR technology, and can greatly reduce manual labour.
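The reason a binary upside-down decision is enough can be sketched with a little angle arithmetic. A text-line direction is only defined modulo 180 degrees, so a line-based skew estimate cannot tell an upright page from a flipped one, and one extra bit resolves the ambiguity. The sketch assumes that the skew comes from some line detector and that the `upside_down` flag comes from a binary classifier; both function names are hypothetical, and this is not the authors' pipeline.

```python
def normalize_skew(angle_deg):
    """Fold a line angle into [-90, 90), since a line's direction repeats every 180 degrees."""
    return (angle_deg + 90.0) % 180.0 - 90.0


def total_correction(skew_deg, upside_down):
    """Rotation in [0, 360) that removes the skew and, if flagged, also flips the page upright.

    upside_down: assumed output of a binary orientation classifier.
    """
    correction = -normalize_skew(skew_deg)
    if upside_down:
        correction += 180.0
    return correction % 360.0
```

For example, skews of 5 and 185 degrees are indistinguishable to the line detector, and the classifier output alone decides whether the final rotation is 355 or 175 degrees.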
ARTICLE | doi:10.20944/preprints202112.0140.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: Image Recognition; Preference Net
Online: 8 December 2021 (14:43:39 CET)
Accuracy and computational cost are the main challenges of deep neural networks in image recognition. This paper proposes an efficient approach that reduces ranking to binary classification, using a new feed-forward network and feature selection based on ranking the image pixels. Preference net (PN) is a novel deep ranking learning approach based on the Preference Neural Network (PNN), which uses a new ranking objective function and a positive smooth staircase (PSS) activation function to accelerate the ranking of image pixels. PN has a new type of weighted kernel based on Spearman ranking correlation, instead of convolution, to build the feature matrix. PN employs multiple kernels of different sizes to partially rank the image pixels in order to find the best feature sequence. PN consists of multiple PNNs that share an output layer; each ranker kernel has a separate PNN. The output results are converted to classification accuracy using the score function. Using a weighted average ensemble of the PN models for each kernel, PN shows promising results compared to the latest deep learning (DL) networks on the CIFAR-10 and Fashion-MNIST datasets in terms of accuracy, at a lower computational cost.
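Spearman ranking correlation, which the abstract names as the basis of the kernel, is the Pearson correlation computed on ranks. The sketch below shows only that statistic on two flat lists (for instance, a flattened image patch and a flattened kernel); how PN weights or learns its kernels is not described here, and the function names are assumptions.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

Because only the ordering of the values matters, the result is unchanged by any monotonic rescaling of pixel intensities.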
ARTICLE | doi:10.20944/preprints202310.2003.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Image classification; Computer vision; Transfer learning; Image database; Plant nutrition; Leaf analysis
Online: 31 October 2023 (08:13:17 CET)
Computer vision is a powerful technology that has enabled solutions in various fields by analyzing visual attributes in images. One field that has taken advantage of computer vision is agricultural automation, which promotes high-quality crop production. The nutritional status of a crop is a crucial factor in determining its productivity. This status is mediated by approximately 14 chemical elements acquired by the plant, and their determination plays a pivotal role in farm management. To address the timely identification of nutritional disorders, this study focuses on the classification of three levels of phosphorus deficiencies through individual leaf analysis. The methodological steps include: (1) generating a database with laboratory-grown maize plants that were induced to total phosphorus deficiency, medium deficiency, and total nutrition, using different capture devices; (2) processing the images with state-of-the-art transfer learning architectures (i.e. VGG16, ResNet50, GoogLeNet, DenseNet201, and MobileNetV2); and (3) evaluating the classification performance of the models using the created database. The results show that the VGG16 model achieves superior performance, with 98% classification accuracy. However, the other studied architectures also demonstrate competitive performance and are considered state-of-the-art automatic leaf deficiency detection tools. The proposed method can be a starting point to fine-tune machine vision-based solutions tailored for real-time monitoring of crop nutritional status.
ARTICLE | doi:10.20944/preprints202304.0723.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: no-reference image quality assessment; multitask learning; image restoration; multi-level features.
Online: 21 April 2023 (10:52:35 CEST)
When image quality is evaluated, the human visual system (HVS) infers the details in the image through its internal generative mechanism. In this process, the HVS integrates both local and global information of the image, utilizes contextual information to restore the original image information, and compares it with the distorted image information for image quality evaluation. Inspired by this mechanism, a no-reference image quality assessment method is proposed based on a multitask image restoration network. The multitask image restoration network generates a pseudo-reference image as the main task and produces a structural similarity index measure map as an auxiliary task. Because the two tasks promote each other, a higher-quality pseudo-reference image is generated. In addition, when predicting the image quality score, both the quality restoration features and the difference features between the distorted and reference images are used, thereby fully utilizing the information from the pseudo-reference image. To enable the model to focus on both global and local features, a multi-scale feature fusion module is proposed. Experimental results demonstrate that the proposed method achieves excellent performance on both synthetically and authentically distorted databases.
ARTICLE | doi:10.20944/preprints201705.0027.v2
Subject: Social Sciences, Geography, Planning And Development Keywords: remote sensing; image registration; multiple image features; different viewpoint; non-rigid distortion
Online: 13 June 2017 (09:52:10 CEST)
Remote sensing image registration plays an important role in military and civilian fields, such as natural disaster damage assessment, military damage assessment and ground target identification. However, due to ground relief variations and imaging viewpoint changes, non-rigid geometric distortion occurs between remote sensing images taken from different viewpoints, which further increases the difficulty of remote sensing image registration. To address this problem, we propose a multi-viewpoint remote sensing image registration method with the following contributions. (i) A multiple-feature-based finite mixture model is constructed for dealing with different types of image features. (ii) Three features are combined and substituted into the mixture model so that they complement each other, i.e., the Euclidean distance and shape context are used to measure the similarity of geometric structure, and the SIFT (scale-invariant feature transform) distance, which carries the intensity information, is used to measure the scale space extrema. (iii) To prevent the problem from being ill-posed, a geometric constraint term is introduced into the L2E-based energy function to keep the non-rigid transformation well behaved. We evaluated the performance of the proposed method on three series of remote sensing images obtained from an unmanned aerial vehicle (UAV) and Google Earth, and compared it with five state-of-the-art methods; our method shows the best alignment in most cases.
ARTICLE | doi:10.20944/preprints202108.0392.v1
Subject: Engineering, Control And Systems Engineering Keywords: image quality assessment; real-time image processing; image functions adaptation; convolutional neural network; face alignment; deep neural network; random forest
Online: 18 August 2021 (17:06:02 CEST)
In recent years, data providers have been generating and streaming a large number of images. In particular, processing images that contain faces has received great attention due to its numerous applications, such as entertainment and social media apps. The enormous number of images shared on these applications presents serious challenges and requires massive computing resources to ensure efficient data processing. However, in real application scenarios images are subject to a wide range of distortions introduced during processing, transmission, sharing, or a combination of many factors. So, there is a need to guarantee acceptable delivery content, even when the original version of a distorted image is not available. In this paper, we present a framework developed to estimate image quality while processing a large number of images in real time. Our quality evaluation integrates a deep network with random forests. In addition, a face alignment metric is used to assess the facial features. Experiments were conducted on two artificially distorted benchmark datasets, LIVE and TID2013. We show that our proposed approach outperforms state-of-the-art methods, with a Pearson Correlation Coefficient (PCC) and a Spearman Rank Order Correlation Coefficient (SROCC) with subjective human scores of almost 0.942 and 0.931, respectively, while reducing the processing time from 4.8 ms to 1.8 ms.
ARTICLE | doi:10.20944/preprints202311.0161.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: deep learning; skin cancer; image augmentation; GAN; geometric augmentation; image classification; interpretable technique
Online: 2 November 2023 (10:52:57 CET)
This research paper presents a deep learning approach to early detection of skin cancer using image augmentation techniques. The authors propose a two-stage image augmentation technique that involves the use of geometric augmentation and generative adversarial network (GAN) to classify skin lesions as either benign or malignant. This research utilized the public HAM10000 dataset to test the proposed model. Several pre-trained models of CNN were employed, namely Xception, Inceptionv3, Resnet152v2, EfficientnetB7, InceptionresnetV2, and VGG19. Our approach achieved accuracy, precision, recall, and F1-score of 96.90%, 97.07%, 96.87%, 96.97%, respectively, which is higher than the performance achieved by other state-of-the-art methods. The paper also discusses the use of SHapley Additive exPlanations (SHAP), an interpretable technique for skin cancer diagnosis, which can help clinicians understand the reasoning behind the diagnosis and improve trust in the system. Overall, the proposed method presents a promising approach to automated skin cancer detection that could improve patient outcomes and reduce healthcare costs.
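The four metrics reported in this abstract are all derived from the confusion counts of a binary (benign versus malignant) classifier. The sketch below shows the standard definitions; the function name `binary_metrics` is an assumption, and any counts passed to it are illustrative inputs, not results from the paper.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1-score from binary confusion counts.

    tp/fp/fn/tn: true positives, false positives, false negatives, true negatives.
    Ratios with an empty denominator are reported as 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2.0 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

Recall is usually the figure to watch in screening settings, since a false negative corresponds to a missed malignant lesion.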
ARTICLE | doi:10.20944/preprints202310.0524.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Deep learning; image representation learning; self-supervised learning; masked image modeling; contrastive learning
Online: 9 October 2023 (12:52:30 CEST)
Self-supervised learning is a method that learns general representations from unlabeled data. Masked image modeling (MIM), one of the generative self-supervised learning methods, has drawn attention for its state-of-the-art performance on various downstream tasks, although it shows poor linear separability as a result of its token-level approach. In this paper, we propose a contrastive learning-based multi-view masked autoencoder for MIM, exploiting an image-level approach by learning common features from two different augmented views. We strengthen MIM by learning long-range global patterns from contrastive loss. Our framework adopts a simple encoder-decoder architecture, learning rich and general representations by following a simple process: 1) Two different views are generated from an input image with random masking, and with contrastive loss we can learn the semantic distance of the representations generated by an encoder. A high mask ratio of 80% works as strong augmentation and alleviates the representation collapse problem. 2) With reconstruction loss, the decoder learns to reconstruct the original image from the masked image. We assess our framework through several experiments on benchmark datasets for image classification, object detection, and semantic segmentation. We achieve 84.3% fine-tuning accuracy on ImageNet-1K classification and 76.7% in linear probing, exceeding previous studies, and show promising results on other downstream tasks. Experimental results demonstrate that our work can learn rich and general image representations by applying contrastive loss to masked image modeling.
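Step 1 of the process above, generating two randomly masked views of the same image at an 80% mask ratio, can be sketched at the level of patch indices. The sketch assumes the image has already been split into `num_patches` patches; the function names, the seeded generator and the rounding rule are illustrative choices, and the encoder, decoder and losses are omitted.

```python
import random


def random_mask(num_patches, mask_ratio, rng):
    """Split patch indices into (visible, masked) with the given mask ratio."""
    num_masked = int(round(num_patches * mask_ratio))
    indices = list(range(num_patches))
    rng.shuffle(indices)
    masked = sorted(indices[:num_masked])
    visible = sorted(indices[num_masked:])
    return visible, masked


def two_views(num_patches, mask_ratio=0.8, seed=0):
    """Draw two independent random maskings of the same image."""
    rng = random.Random(seed)
    return random_mask(num_patches, mask_ratio, rng), random_mask(num_patches, mask_ratio, rng)
```

Only the visible patches of each view would be fed to the encoder, so a high mask ratio also reduces the encoder's workload.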
REVIEW | doi:10.20944/preprints202309.2137.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Medical image analysis, Medical image data, Deep learning, Computer vision techniques, Optimisation methods
Online: 30 September 2023 (17:58:32 CEST)
Medical image analysis is an important branch in the field of medicine, which mainly uses image processing and analysis techniques to interpret and diagnose medical image data. Medical image data helps doctors to effectively observe and diagnose patients' body structures, tissues and lesions. Medical image analysis has been an important research area in the medical field, and it is important for disease diagnosis, treatment planning, and condition monitoring. In recent years, the rapid development of deep learning and computer vision technologies has contributed greatly to the automation, multimodal data fusion, real-time application, and accuracy improvement of medical image analysis. In addition, the development of deep learning has given rise to some new research areas in medical image analysis, such as Generative Adversarial Networks (GANs) for synthetic medical images, self-supervised learning for unsupervised feature learning, and neural network interpretability. In this paper, we will introduce some optimisation methods for medical images which are effective in improving the accuracy, efficiency and reliability of medical image analysis.
ARTICLE | doi:10.20944/preprints202306.0922.v1
Subject: Computer Science And Mathematics, Signal Processing Keywords: Multimodality medical image; Image fusion; Sparse representation (SR); Kronecker criterion; Activity level measure
Online: 13 June 2023 (10:09:15 CEST)
Multimodal medical image fusion is a fundamental but challenging problem in the fields of brain science research and brain disease diagnosis, and it is challenging for sparse representation (SR)-based fusion to characterize activity level with single measurement and no loss of effective information. In this paper, the Kronecker-criterion-based SR framework is applied for medical image fusion with a patch-based activity level integrating salient features of multiple domains. Inspired by the formation process of vision system, the spatial saliency is characterized by textural contrast (TC), which is composed of luminance and orientation contrasts to promote more highlighted texture information to participate in the fusion process. As substitution of the conventional l1-norm-based sparse saliency, a metric of sum of sparse salient features (SSSF) is used for promoting more significant coefficients to participate in the composition of activity level measure. The designed activity level measure is verified to be more conducive to maintain the integrity and sharpness of detailed information. Various experiments on multiple groups of clinical medical images verify the effectiveness of the proposed fusion method on both visual quality and objective assessment. Furthermore, the research work of this paper is helpful for further detection and segmentation of medical images.
ARTICLE | doi:10.20944/preprints202303.0319.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: industrial image processing; feature amplification; image transformation strategy; text detection; Probabilistic Hough Transform
Online: 17 March 2023 (09:05:54 CET)
Industrial nameplates serve as a means of conveying critical information and parameters. In this work, we propose a novel approach for rectifying industrial nameplate pictures using a probabilistic Hough transform. Our method effectively corrects for distortions and clipping, and the work features a collection of challenging nameplate pictures for analysis. To determine the corners of the nameplate, we employ a progressive probabilistic Hough transform, which not only enhances detection accuracy but can also handle complex industrial scenarios. Our approach produces clear and readable nameplate text, as demonstrated through experiments that show improved accuracy in model identification compared to other methods.
CONCEPT PAPER | doi:10.20944/preprints202204.0129.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Digital Design; Digital Architecture; Image Processing; Machine learning; FPGA; Dedicated Design; Image Processor
Online: 14 April 2022 (05:09:47 CEST)
Many dedicated designs for real-time operations provide functionality on fixed-size operators, but where speed, scalability, and flexibility are required, extensive research is needed. Dedicated designs can provide real-time processing for many applications. This paper presents an FPGA-based design of a general image processor. The proposed design is based on a fixed-point representation of binary numbers and provides a mechanism to manage matrices on-chip along with matrix arithmetic. The matrices are represented with simple identifiers and microinstructions that assist in the computation of many operations that are useful for solving complex problems. The design was successfully implemented and tested using VHDL. The proposed design is an efficient architecture for a standalone processor with all the computational resources necessary for an embedded image processing application.
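Fixed-point matrix arithmetic of the kind this abstract refers to can be modelled in software with scaled integers. The sketch below is a behavioural illustration only, not the paper's VHDL design: the 8 fractional bits, the function names, and the accumulate-then-shift order are assumptions.

```python
FRAC_BITS = 8  # assumed number of fractional bits (a Q-format choice for this sketch)


def to_fixed(value, frac_bits=FRAC_BITS):
    """Encode a real number as an integer scaled by 2**frac_bits."""
    return int(round(value * (1 << frac_bits)))


def to_float(fixed, frac_bits=FRAC_BITS):
    """Decode a scaled integer back to a real number."""
    return fixed / (1 << frac_bits)


def fixed_mul(a, b, frac_bits=FRAC_BITS):
    """Multiply two fixed-point values; the shift restores the scale (floors the result)."""
    return (a * b) >> frac_bits


def fixed_matmul(A, B, frac_bits=FRAC_BITS):
    """Matrix product on fixed-point entries: accumulate full products, then rescale once."""
    rows, inner, cols = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(inner)) >> frac_bits
             for j in range(cols)]
            for i in range(rows)]
```

Rescaling once after accumulation keeps more precision than shifting every product, at the cost of a wider accumulator in hardware.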
ARTICLE | doi:10.20944/preprints202001.0205.v1
Subject: Social Sciences, Behavior Sciences Keywords: itch; scratch; automated real-time detection; machine-learning based image classifier; image sharpness
Online: 19 January 2020 (03:13:48 CET)
A 'little brother' of pain, itch is an unpleasant sensation that creates a specific urge to scratch. To date, various machine-learning based image classifiers (MBICs) have been proposed for quantitative analysis of itch-induced scratch behaviour of laboratory animals in an automated, non-invasive, inexpensive and real-time manner. In spite of MBICs' advantages, the overall performance (accuracy, sensitivity and specificity) of current MBIC approaches remains inconsistent, with values varying from ~50% to ~99%, and the underlying reasons have yet to be investigated further, both computationally and experimentally. To look into the variation in the performance of MBICs in automated detection of itch-induced scratch, this article focuses on the experimental data recording step, and reports for the first time that MBICs' overall performance is inextricably linked to the sharpness of experimentally recorded video of laboratory animal scratch behaviour. This article furthermore demonstrates for the first time that a linear correlation exists between video sharpness and overall performance (accuracy and specificity, but not sensitivity) of MBICs, and highlights the primary role of experimental data recording in rapid, accurate and consistent quantitative assessment of laboratory animal itch.
ARTICLE | doi:10.20944/preprints201911.0218.v1
Subject: Environmental And Earth Sciences, Environmental Science Keywords: Landsat; Google Earth; water index; unsupervised image classification; supervised image classification; Kappa coefficient
Online: 19 November 2019 (03:10:17 CET)
To address three important issues related to extraction of water features from Landsat imagery, i.e., selection of water indexes and classification algorithms for image classification, and collection of ground truth data for accuracy assessment, this study applied four sets (ultra-blue, blue, green, and red light based) of water indexes (NDWI, MNDWI, MNDWI2, AWEIns, and AWEIs) combined with three types of image classification methods (zero-water index threshold, Otsu, and kNN) to 24 selected lakes across the globe to extract water features from Landsat-8 OLI imagery. A total of 1440 (4 × 5 × 3 × 24) image classification results were compared with the water features extracted from high resolution Google Earth images with the same (or ±1 day) acquisition dates by computing the Kappa coefficients. Results show the kNN method is better than the Otsu method, and the Otsu method is better than the zero-water index threshold method. If the computational cost is not an issue, the kNN method combined with the ultra-blue light based AWEIns is the best method for extracting water features from Landsat imagery because it produced the highest Kappa coefficients. If the computational cost is taken into account, the Otsu method is a good choice. AWEIns and AWEIs are better than NDWI, MNDWI and MNDWI2. AWEIns works better than AWEIs under the Otsu method, and the average rank of the image classification accuracy from high to low is the ultra-blue, blue, green, and red light-based AWEIns.
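The workflow described above (compute a water index, threshold it, score the result against reference data with a Kappa coefficient) can be sketched as follows. This is a simplified illustration and not the study's code: the green/NIR form of NDWI is the standard McFeeters definition, while the 64-bin histogram and the function names are assumptions chosen for the example.

```python
# Sketch: water index -> Otsu threshold -> Cohen's kappa against reference labels.
# Assumptions: pixel values are plain Python lists; 64 histogram bins.

def ndwi(green: float, nir: float) -> float:
    """Normalized Difference Water Index: (Green - NIR) / (Green + NIR)."""
    return (green - nir) / (green + nir)


def otsu_threshold(values, bins=64):
    """Return the threshold that maximizes between-class variance."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    total = len(values)
    sum_all = sum(h * c for h, c in zip(hist, centers))
    best_var, best_t = -1.0, lo
    w0, sum0 = 0, 0.0
    for i in range(bins - 1):
        w0 += hist[i]
        sum0 += hist[i] * centers[i]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, lo + (i + 1) * width
    return best_t


def cohens_kappa(pred, truth):
    """Cohen's kappa: agreement between two label lists, corrected for chance."""
    n = len(pred)
    p_observed = sum(p == t for p, t in zip(pred, truth)) / n
    labels = set(pred) | set(truth)
    p_expected = sum((pred.count(l) / n) * (truth.count(l) / n) for l in labels)
    if p_expected == 1.0:
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


if __name__ == "__main__":
    index_values = [-0.5] * 50 + [0.6] * 50      # hypothetical land / water pixels
    t = otsu_threshold(index_values)
    water = [int(v > t) for v in index_values]   # 1 = water, 0 = non-water
    reference = [0] * 50 + [1] * 50
    print(t, cohens_kappa(water, reference))
```

The zero-threshold method in the abstract corresponds to replacing `otsu_threshold(...)` with the constant 0.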
CONCEPT PAPER | doi:10.20944/preprints202312.0116.v2
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: image processing; keyframes; indexing; metadata
Online: 6 December 2023 (03:43:44 CET)
Video lectures are becoming more popular and in demand as online classroom teaching is becoming more prevalent. Massive Open Online Courses (MOOCs), such as NPTEL, have been creating high-quality educational content that is freely accessible to students online. A large number of colleges across the country are now using NPTEL videos in their classrooms. So more video lectures are being recorded, maintained, and uploaded. These videos generally contain information about that video before the lecture begins. We generally observe that these educational videos have metadata containing five to six attributes: Institute Name, Publisher Name, Department Name, Professor Name, Subject Name, and Topic Name. It would be easy to maintain these videos if we could organize them according to their categories. The indexing of these videos based on this information is beneficial for students all around the world to efficiently utilise these videos. In this project, we are trying to get the metadata information mentioned above from the video lectures.
REVIEW | doi:10.20944/preprints202308.0657.v2
Subject: Physical Sciences, Radiation And Radiography Keywords: image quality; interventional radiology; pediatrics
Online: 30 August 2023 (04:05:02 CEST)
Pediatric interventional cardiology procedures are essential in diagnosing and treating congenital heart disease in children; however, they raise concerns about potential radiation exposure. Managing radiation doses and assessing image quality in angiographs becomes imperative for safe and effective interventions. This systematic review aims to comprehensively analyze the current understanding of physical image quality metrics relevant for characterizing X-ray systems used in fluoroscopy-guided pediatric cardiac interventional procedures, considering the main factors reported in the literature that influence this outcome. A search in Scopus and Web of Science, using relevant keywords and inclusion/exclusion criteria, yielded fourteen relevant articles published between 2000 and 2022. The physical image quality metrics reported were noise, signal-to-noise ratio, contrast, contrast-to-noise ratio, and high contrast spatial resolution. Various factors influencing image quality were investigated, such as polymethyl methacrylate thickness (often used to simulate water equivalent tissue thickness), operation mode, anti-scatter grid presence and tube voltage. Objective evaluations using these metrics ensure impartial assessments of the main factors affecting image quality, improving the characterization of fluoroscopic X-ray systems and aiding informed decisions to safeguard pediatric patients during procedures.
COMMUNICATION | doi:10.20944/preprints202306.0492.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: strongest activations; image complexity; convolution
Online: 7 June 2023 (05:42:05 CEST)
Neural networks were treated as black boxes for a long time. Previous works have unearthed what aspects of an image were important for convolutional layers at different positions in the network. This was done using deconvolutional networks. In this paper, we examine how well a convolutional neural network performs when those convolutional layers which are relatively unimportant for a particular image (i.e., the image does not produce one of the strongest activations) are skipped in the training, validating, and testing process.
ARTICLE | doi:10.20944/preprints201906.0166.v1
Subject: Computer Science And Mathematics, Information Systems Keywords: MRI image; Texture Features; GLCM
Online: 18 June 2019 (05:36:29 CEST)
This paper presents a feature vector built from several statistical texture analyses of brain tumors in MRI images. The statistical texture features are computed using GLCM (Gray Level Co-occurrence Matrices) of the brain nodule structure. The brain nodule is segmented using a strips method to implement marker-based watershed image segmentation based on PSO (Particle Swarm Optimization) and Fuzzy C-means clustering (FCM). The GLCM of the segmented brain image is then calculated at four angles: 0°, 45°, 90° and 135°. For each of the four angular directions, the texture features calculated are correlation, energy, contrast and homogeneity. Texture analysis has been applied to different types of images in past years. The proposed algorithm therefore calculates statistical texture features for iterative image segmentation. These results show that MRI images can be used in a brain cancer detection system.
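The GLCM features named above can be illustrated with a short sketch. It assumes a distance of one pixel, a symmetric and normalized co-occurrence matrix, and the commonly used definitions of contrast, energy (sum of squared entries), homogeneity and correlation; the paper's exact settings are not stated in the abstract, and the function names are hypothetical.

```python
# Sketch: symmetric, normalized GLCM at 0, 45, 90, 135 degrees (distance 1)
# and four texture features. Image is a list of lists of integer gray levels.
import math

# (row offset, column offset) of the neighbour for each angle
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def glcm(image, levels, angle):
    """Return the normalized symmetric co-occurrence matrix for one angle."""
    dr, dc = OFFSETS[angle]
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count the pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def texture_features(p):
    """Contrast, energy, homogeneity and correlation of a normalized GLCM."""
    n = range(len(p))
    mu_i = sum(i * p[i][j] for i in n for j in n)
    mu_j = sum(j * p[i][j] for i in n for j in n)
    var_i = sum((i - mu_i) ** 2 * p[i][j] for i in n for j in n)
    var_j = sum((j - mu_j) ** 2 * p[i][j] for i in n for j in n)
    contrast = sum((i - j) ** 2 * p[i][j] for i in n for j in n)
    energy = sum(p[i][j] ** 2 for i in n for j in n)
    homogeneity = sum(p[i][j] / (1 + abs(i - j)) for i in n for j in n)
    if var_i > 0 and var_j > 0:
        cov = sum((i - mu_i) * (j - mu_j) * p[i][j] for i in n for j in n)
        correlation = cov / math.sqrt(var_i * var_j)
    else:
        correlation = float("nan")  # undefined for a constant image
    return {"contrast": contrast, "energy": energy,
            "homogeneity": homogeneity, "correlation": correlation}


if __name__ == "__main__":
    img = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 2, 2, 2], [2, 2, 3, 3]]
    for angle in (0, 45, 90, 135):
        print(angle, texture_features(glcm(img, 4, angle)))
```

Concatenating the four features over the four angles gives a 16-element feature vector, which is one plausible reading of the feature vector the abstract describes.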
ARTICLE | doi:10.20944/preprints202309.0762.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: image processing; image analysis; deep learning; roof structure extraction; roof vectorization; frame field learning
Online: 12 September 2023 (08:36:45 CEST)
A topic of growing interest in urban remote sensing is the automated extraction of geometrical building information for 3D city modeling. Roof geometry information is useful for applications such as urban planning, solar potential estimation, telecommunication installation planning, and wind flow simulations for pollutant diffusion analysis. Recent research has shown that advances in remote sensing technologies and deep learning methods offer the prospect of deriving roof structure information accurately and efficiently. In this study, we propose a Vectorized Roof Extractor method based on Fully Convolutional Networks (FCNs) and an advanced polygonization method to extract roof structure from aerial imagery and a normalized Digital Surface Model (nDSM) in a regularized vector format. The roof structure consists of building outlines (the external edges of the building roof) and inner rooflines (the internal intersections of the main roof planes). The methodology comprises segmentation, vectorization and post-processing for both outlines and inner rooflines. For comparison, we adapt the Frame Field Learning (FFL) method originally designed to extract building polygons. Our experiments are conducted on a custom data set derived for the city of Enschede, The Netherlands, using aerial imagery, nDSM and manually digitized training polygons. The results show that the proposed Vectorized Roof Extractor outperformed the adapted FFL on PoLiS distance, with values of 3.5 m and 1.2 m for outlines and inner rooflines, respectively. Furthermore, the model surpassed the adapted FFL on PoLiS-thresholded F-score for outlines and inner rooflines, with 0.31 and 0.57, respectively. The Vectorized Roof Extractor produced adequate visual results, with straighter walls and fewer missed inner roofline detections. It can predict buildings with common walls thanks to skeleton graph computation. To summarize, the proposed method is suitable for urban applications and has the potential to be improved further.
ARTICLE | doi:10.20944/preprints201810.0534.v1
Subject: Engineering, Industrial And Manufacturing Engineering Keywords: non-destructive testing; process optimization; porosity; pore hotspots; image-based simulations; 3D image analysis
Online: 23 October 2018 (09:58:18 CEST)
This paper presents the latest developments in microCT, both globally and locally, for supporting the additive manufacturing industry. There are a number of recently developed capabilities which are especially relevant to the non-destructive quality inspection of additive manufactured parts; and also for advanced process optimization. These new capabilities are all locally available but not yet utilized to their full potential, most likely due to a lack of knowledge of these capabilities. The aim of this paper is therefore to fill this gap and provide an overview of these latest capabilities, showcasing numerous local examples.
ARTICLE | doi:10.20944/preprints201805.0240.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: background reconstruction; image quality assessment; image dataset; subjective evaluation; perceptual quality; objective quality metric
Online: 17 May 2018 (09:36:33 CEST)
With an increased interest in applications that require a clean background image, such as video surveillance, object tracking, street view imaging and location-based services on web-based maps, multiple algorithms have been developed to reconstruct a background image from cluttered scenes. Traditionally, statistical measures and existing image quality techniques have been applied for evaluating the quality of the reconstructed background images. Though these quality assessment methods have been widely used in the past, their performance in evaluating the perceived quality of the reconstructed background image has not been verified. In this work, we discuss the shortcomings in existing metrics and propose a full reference Reconstructed Background image Quality Index (RBQI) that combines color and structural information at multiple scales using a probability summation model to predict the perceived quality in the reconstructed background image given a reference image. To compare the performance of the proposed quality index with existing image quality assessment measures, we construct two different datasets consisting of reconstructed background images and corresponding subjective scores. The quality assessment measures are evaluated by correlating their objective scores with human subjective ratings. The correlation results show that the proposed RBQI outperforms all the existing approaches. Additionally, the constructed datasets and the corresponding subjective scores provide a benchmark to evaluate the performance of future metrics that are developed to evaluate the perceived quality of reconstructed background images.
ARTICLE | doi:10.20944/preprints201612.0075.v1
Subject: Environmental And Earth Sciences, Remote Sensing Keywords: image recognition bases location; indoor positioning; RGB-D images; LiDAR; DataBase; mobile computing; image retrieval
Online: 15 December 2016 (07:17:35 CET)
This paper describes the first results of an Image Recognition Based Location (IRBL) procedure for mobile applications, focusing on the procedure to generate a database of range images (RGB-D). In an indoor environment, estimating the camera position and orientation requires prior spatial knowledge of the surroundings. To achieve this objective, a complete 3D survey of two different environments (Bangbae metro station in Seoul and the E.T.R.I. building in Daejeon, Republic of Korea) was performed using a LiDAR (Light Detection And Ranging) instrument, and the obtained scans were processed to obtain a spatial model of the environments. From this, two databases of reference images were generated using dedicated software developed by the Geomatics group of Politecnico di Torino (ScanToRGBDImage). This tool synthetically generates different RGB-D images centered on each scan position in the environment. Later, the external parameters (X, Y, Z, ω, φ, κ) and the range information extracted from the retrieved DB images are used as reference information for pose estimation of a set of acquired mobile pictures in the IRBL procedure. In this paper the survey operations, the approach for generating the RGB-D images and the IRBL strategy are reported. Finally, the analysis of the results and the validation test are described.
ARTICLE | doi:10.20944/preprints202312.0005.v1
Subject: Engineering, Mechanical Engineering Keywords: Microfracture; image processing; network; simulation analyzes
Online: 1 December 2023 (05:14:14 CET)
Fatigue fractures are the main cause of approximately 80% of all material failures, and it is believed that such failures can be predicted and mathematically calculated in a reliable manner. It is possible to establish prediction modalities for fatigue fracture according to three fundamental variables (volume, number of cycles to fracture, and applied stress), with the integration of Weibull constants (characteristic length). In this investigation, mechanical fatigue tests were carried out on specimens of different industrial materials smaller than 4 mm² in section, which were subsequently analyzed by precision computed tomography in search of microfractures. The measurements of these microfractures, along with their metrics and classifications, were recorded. A convolutional neural network trained with deep learning was used to detect microfractures in the images. The primary objective of this network is the detection of microfractures in images of 480x854 or 960x960 pixels, and its accuracy is above 95%. The network classifies images into those that contain microfractures and those that do not. Subsequently, the microfracture is isolated by means of image processing. Finally, the images that do contain this feature are analyzed by image processing to obtain the area, perimeter, characteristic length, circularity, orientation, and type of each microfracture. All values are obtained in pixels and converted to metric units (μm) through a conversion factor based on image resolution.
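Two of the metrics listed above lend themselves to a short worked example: circularity and the pixel-to-micrometre conversion. The sketch below uses the common definition of circularity, 4πA/P², which may differ from the one used in the study, and the scale of 2 μm per pixel is a hypothetical value chosen only for illustration.

```python
# Sketch: circularity and pixel-to-micrometre conversion for a segmented region.
# Assumptions: circularity = 4*pi*A / P**2; UM_PER_PX is a made-up scale factor.
import math

UM_PER_PX = 2.0  # hypothetical conversion factor derived from image resolution


def circularity(area: float, perimeter: float) -> float:
    """1.0 for a perfect circle, smaller for elongated or ragged shapes."""
    return 4.0 * math.pi * area / perimeter ** 2


def length_px_to_um(length_px: float, um_per_px: float = UM_PER_PX) -> float:
    """Lengths (perimeter, characteristic length) scale linearly."""
    return length_px * um_per_px


def area_px_to_um2(area_px: float, um_per_px: float = UM_PER_PX) -> float:
    """Areas scale with the square of the conversion factor."""
    return area_px * um_per_px ** 2


if __name__ == "__main__":
    area_px, perimeter_px = 100.0, 40.0  # a 10 x 10 pixel square region
    print(circularity(area_px, perimeter_px))
    print(length_px_to_um(perimeter_px), area_px_to_um2(area_px))
```

Circularity is dimensionless, so it can be computed before or after the unit conversion with the same result.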
ARTICLE | doi:10.20944/preprints202109.0295.v1
Subject: Medicine And Pharmacology, Dietetics And Nutrition Keywords: Obesity; Eating Disorder; Body Image; Adolescents.
Online: 16 September 2021 (16:34:57 CEST)
There is growing recognition of the adverse effects of body image dissatisfaction (BID) and eating disorder (ED) symptoms on adolescent health. The aim of this study was to estimate the prevalence of ED symptoms, BID, and their relationship in adolescents from public schools in Southern Brazil. A total of 782 schoolchildren (male: n=420, female: n=362; age: 15 ± 0.4 years) answered a self-administered questionnaire to identify sociodemographic data. The Children's Figure Rating Scale was adopted to identify body image, and the Eating Attitudes Test (EAT-26) was applied to investigate ED symptoms. Inferential statistics and hierarchical model-controlled logistic regression were used to assess associations between variables. Most of the schoolchildren reported being satisfied with their bodies. However, we observed a higher prevalence of dissatisfaction among girls for being overweight and among boys for thinness. Female students and students from schools located in the central area of the city showed higher chances of developing ED symptoms, and the absence of ED symptoms appeared to act as a protective factor against BID in schoolchildren. The results of this study show the need to reflect on the factors that influence the development of ED and non-acceptance of one's own body in a population concerned with physical appearance.
ARTICLE | doi:10.20944/preprints202109.0285.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: remote sensing; deep learning; image classification
Online: 16 September 2021 (13:38:55 CEST)
Autonomous image recognition has numerous potential applications in the field of planetary science and geology. For instance, having the ability to classify images of rocks would allow geologists to have immediate feedback without having to bring samples back to the laboratory. Also, planetary rovers could classify rocks in remote places and even on other planets without needing human intervention. Shu et al. classified 9 different types of rock images using a Support Vector Machine (SVM) with the image features extracted autonomously. Through this method, the authors achieved a test accuracy of 96.71%. In this research, Convolutional Neural Networks (CNNs) have been used to classify the same set of rock images. Results show that a 3-layer network obtains an average accuracy of 99.60% across 10 trials on the test set. A version of Self-taught Learning was also implemented to prove the generalizability of the features extracted by the CNN. Finally, one model has been chosen to be deployed on a mobile device to demonstrate practicality and portability. The deployed model achieves a perfect classification accuracy on the test set, while taking only 0.068 seconds to make a prediction, equivalent to about 14 frames per second.
CASE REPORT | doi:10.20944/preprints202012.0785.v1
Subject: Environmental And Earth Sciences, Atmospheric Science And Meteorology Keywords: built environment; image analysis; remote sensing
Online: 31 December 2020 (09:51:50 CET)
Unmanned satellite technology is developing rapidly, and medium-resolution satellites with varied sensitivity and spectral bands, such as Landsat, are very effective for observing environmental changes. The purpose of this study is to monitor the development of built-up land using image transformation techniques and to estimate built-up land changes. The research method uses the NDVI, NDBI and Built-Up Index image transformation techniques, with Landsat satellite image data obtained from USGS. Accuracy samples were selected by purposive sampling and evaluated with a confusion matrix accuracy test. The results show built-up land development of 19.25% for the period 2004 - 2010 and 30.25% for the period 2010 - 2018. By sub-district, the highest built-up land development was in the Kubung area, with 7.20% in the first period and 32.23% in the second period. The accuracy of the image analysis, assessed with the confusion matrix technique on a field sample of 185, was 86.04%.
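The three image transformations and the accuracy test named above can be sketched for a single pixel as follows. The band formulas are the commonly cited definitions (NDVI from red and near-infrared, NDBI from shortwave-infrared and near-infrared, Built-Up Index as NDBI minus NDVI); whether the study used exactly these forms is an assumption, and the reflectance values and confusion matrix below are invented for illustration.

```python
# Sketch: NDVI, NDBI, Built-Up Index and overall accuracy from a confusion matrix.
# Assumptions: standard index definitions; all numbers are made-up examples.

def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index."""
    return (nir - red) / (nir + red)


def ndbi(swir: float, nir: float) -> float:
    """Normalized Difference Built-up Index."""
    return (swir - nir) / (swir + nir)


def built_up_index(swir: float, nir: float, red: float) -> float:
    """BU = NDBI - NDVI; positive values suggest built-up land."""
    return ndbi(swir, nir) - ndvi(nir, red)


def overall_accuracy(confusion) -> float:
    """Correctly classified samples (diagonal) divided by all samples."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(map(sum, confusion))
    return correct / total


if __name__ == "__main__":
    # Hypothetical surface reflectances for one built-up pixel
    print(built_up_index(swir=0.30, nir=0.20, red=0.15))
    # Hypothetical 2-class confusion matrix: rows = reference, columns = classified
    print(overall_accuracy([[40, 10], [5, 45]]))
```

With Landsat 8 OLI the red, NIR and SWIR1 inputs would be bands 4, 5 and 6; for Landsat 5 and 7 they are bands 3, 4 and 5.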
ARTICLE | doi:10.20944/preprints202012.0727.v1
Subject: Business, Economics And Management, Accounting And Taxation Keywords: city marketing; sustainable development; resilience; image
Online: 29 December 2020 (11:24:13 CET)
The focus of this study is to identify whether resilience and sustainable development can be used as an image for the strategic planning of city marketing. Resilience is about planning and building future-proof cities, so that urban challenges and crises have the lowest impact and cities can bounce back and evolve as much as possible. Resilience is part of sustainable development. Thus, it is important for decision-makers to define the mission of their strategic planning holistically, taking into consideration the basic assets of a city (the environment, the economy and the society), how all of them can be combined to market the city, and the internal and external environment. In the past few years, city marketing has become an important tool for urban development. The main goal is to show how city marketing can be applied to a city that tries to be more resilient and more sustainable by using strategic urban planning to set the vision, to identify the challenges and problematic areas, and to set new goals and objectives in order to future-proof the complexity of an urban system. To answer the questions of this article, we use two case studies, Rotterdam (Netherlands) and Thessaloniki (Greece), drawing on a literature review and existing research alongside a benchmarking of their resilience strategies, as both cities are members of the Resilient Cities Network. From different perspectives of resilient thinking, both cities have managed to use resilience as a marketing image for further sustainable development.
ARTICLE | doi:10.20944/preprints201910.0188.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: digital watermarking; multiple image; transform domain
Online: 17 October 2019 (08:48:19 CEST)
In this paper, a technique of image watermarking using multiple images as watermarks is presented. The technique is based on transform domain functions including discrete wavelet transform (DWT), discrete cosine transform (DCT) and singular value decomposition (SVD), with an image as the host signal, i.e., the watermarks are used as proof of the authenticity of the host image. The technique is executed by performing multilevel DWT followed by applying DCT and SVD to both the host and the watermark. Multiple watermarks are used to ensure a better security level. The scheme is robust to common image processing operations and some attacks, and exhibits a PSNR of 108.3781 dB, normalized cross correlation (NCC) over 0.99 and normalized correlation (NC) over 0.99.
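The quality figures quoted above are computed from pixel data, and a small sketch makes the definitions concrete. It assumes the usual PSNR formula with a peak value of 255 and one common form of normalized correlation (inner product divided by the product of the norms); watermarking papers use several variants of NC and NCC, so the form below is an assumption rather than the authors' exact definition.

```python
# Sketch: PSNR between host and watermarked image, and a normalized correlation
# between original and extracted watermark. Images are lists of lists of numbers.
import math


def psnr(original, distorted, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    diffs = [(a - b) ** 2
             for row_a, row_b in zip(original, distorted)
             for a, b in zip(row_a, row_b)]
    mse = sum(diffs) / len(diffs)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


def normalized_correlation(w, w_extracted):
    """Inner product of the two images divided by the product of their norms."""
    flat_a = [v for row in w for v in row]
    flat_b = [v for row in w_extracted for v in row]
    dot = sum(a * b for a, b in zip(flat_a, flat_b))
    norm_a = math.sqrt(sum(a * a for a in flat_a))
    norm_b = math.sqrt(sum(b * b for b in flat_b))
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    host = [[52, 55], [61, 59]]
    marked = [[53, 56], [62, 60]]  # every pixel off by one, so MSE = 1
    print(psnr(host, marked))
    print(normalized_correlation(host, marked))
```

Higher PSNR means the watermark is less visible in the host image, while NC close to 1 means the extracted watermark closely matches the embedded one.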
ARTICLE | doi:10.20944/preprints201906.0215.v1
Subject: Social Sciences, Education Keywords: addiction; triathletes; body image; behavior regulation
Online: 21 June 2019 (11:36:23 CEST)
The aim of this research was to determine the risk of exercise dependence in individual-sport athletes and its relationship with body dissatisfaction and motivation. A total of 225 triathletes, swimmers, cyclists and athletes aged 18 to 63 years took part in the research, of whom 145 were men (M = 35.57 ± 10.46 years) and 80 were women (M = 32.83 ± 10.31 years). The EDS-R was used to assess exercise dependence, the BSQ to assess body dissatisfaction, the BREQ-3 to assess the motivation of participants and the BIAQ to analyse body image avoidance behaviours. The results show that 8.5% of the subjects were at risk of exercise dependence and that 18.2% tended toward body dissatisfaction, with no significant differences by type of sport practiced. However, there were significant differences by sex in exercise dependence (15% vs 4.8%) and body dissatisfaction (31.1% vs 11%), with the higher percentages corresponding to women. Introjected regulation and food restriction behaviour were the predictor variables of exercise dependence and body dissatisfaction.
REVIEW | doi:10.20944/preprints201903.0095.v1
Subject: Biology And Life Sciences, Biophysics Keywords: Striated Muscle, image reconstruction, muscle physiology
Online: 7 March 2019 (12:42:36 CET)
Much has been learned about the interaction between myosin and actin through biochemistry, in vitro motility assays and cryo-electron microscopy of F-actin decorated with myosin heads. Comparatively less is known about actin-myosin interactions within the filament lattice of muscle, where myosin heads function as independent force generators and thus most measurements report an average signal from multiple biochemical and mechanical states. All of the 3-D imaging by electron microscopy that has revealed the interplay of the regular array of actin subunits and myosin heads within the filament lattice has been accomplished using the flight muscle of the large waterbug Lethocerus sp. Lethocerus flight muscle possesses a particularly favorable filament arrangement that enables all the myosin cross-bridges contacting the actin filament to be visualized in a thin section. This review covers the history of this effort and the progress toward visualizing the complex set of conformational changes that myosin heads make when binding to actin in several static states as well as fast frozen actively contracting muscle. The efforts have revealed a consistent pattern of changes to the myosin head structures determined by X-ray crystallography needed to explain the structure of the different acto-myosin interactions observed in situ.
ARTICLE | doi:10.20944/preprints201811.0028.v1
Subject: Business, Economics And Management, Business And Management Keywords: ISO; social responsibility; image; profitability; SMEs
Online: 2 November 2018 (06:53:35 CET)
At present, business strategies in SMEs (small and medium enterprises) are crucial for consolidation in highly competitive markets, for achieving a better image and for business profitability. Among the most successful strategies are sustainable practices and social responsibility standards such as ISO 14001 and ISO 26001. The literature related to sustainable business is based mainly on the theory of resources and capabilities and on stakeholder theory. These currents state that companies should focus on profitable strategies to ensure significant and long-term results, in order to achieve organizational and financial results for stakeholders. In this work, the sample consists of 215 companies from the commerce, services and industry sectors, located in the southern region of the State of Sonora in Mexico. The objective of the work is to analyze the influence of the ISO 14001 and ISO 26001 standards on the image and profitability of SMEs. The statistical analysis of the data was carried out through linear regression by OLS (Ordinary Least Squares). The findings show that the ISO 14001 standard has the greatest influence on the improvement of the business image and the level of profitability of the SME. In addition, we found that ISO 26001 has a partial influence on the image and profitability of the SME.
ARTICLE | doi:10.20944/preprints201810.0305.v1
Subject: Business, Economics And Management, Marketing Keywords: sustainable banking; corporate image; bank loyalty
Online: 15 October 2018 (11:49:29 CEST)
As the demand for a more sustainable society increases, adopting a sustainable banking approach serves as a competitive advantage for banks that are focused on attaining bank loyalty. This study seeks to understand the role of sustainable banking practices in bank loyalty, while exploring the mediating effect of corporate image in the relationship between sustainable banking practices and bank loyalty. Data from 511 customers of the banking sector were used for this study. Results from the structural equation modeling show that sustainable banking practices positively and directly affect bank loyalty and corporate image, and that corporate image directly and positively affects bank loyalty and also mediates the relationship between sustainable banking practices and bank loyalty.
ARTICLE | doi:10.20944/preprints201802.0103.v1
Subject: Environmental And Earth Sciences, Space And Planetary Science Keywords: Cloud detection; Deep learning; Image Compression.
Online: 15 February 2018 (16:49:55 CET)
An effective on-board cloud detection method in small satellites would greatly improve the downlink data transmission efficiency and reduce the memory cost. In this paper, an ensemble method combining a lightweight U-Net with wavelet image compression is proposed and evaluated. The red, green, blue and infrared waveband images from the Landsat-8 dataset are used for training and testing to estimate the performance of the proposed method. The LeGall-5/3 wavelet transform is applied to the dataset to accelerate the neural network and improve the feasibility of on-board implementation. The experimental results show that the overall accuracy of the proposed model reaches 97.45% using only four bands. Tests on the low-frequency coefficients of the compressed dataset show that the overall accuracy of the proposed method is still higher than 95%, while its inference time is reduced to 0.055 second per million pixels and its maximum memory cost is reduced to 2Mb. By taking advantage of the mature image compression systems in small satellites, the proposed method offers a promising route to on-board cloud detection based on deep learning.
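The LeGall 5/3 transform mentioned above is usually implemented with integer lifting steps, which makes it cheap and exactly reversible. The sketch below shows one level of the 1-D reversible transform (the form used in lossless JPEG 2000) on an even-length list with symmetric boundary extension; how the paper applies it to 2-D multi-band imagery is not described in the abstract, so this is only an illustration of the building block.

```python
# Sketch: one level of the reversible integer LeGall 5/3 lifting transform (1-D).
# Assumptions: even-length integer input, symmetric extension at the borders.

def legall53_forward(x):
    """Split x into low-pass (approximation) and high-pass (detail) coefficients."""
    if len(x) < 2 or len(x) % 2:
        raise ValueError("input length must be even and at least 2")
    even, odd = x[0::2], x[1::2]
    half = len(even)
    # Predict step: detail = odd sample minus the average of its even neighbours
    detail = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]
        detail.append(odd[i] - ((even[i] + right) >> 1))
    # Update step: approximation = even sample plus a quarter of nearby details
    approx = []
    for i in range(half):
        left = detail[i - 1] if i > 0 else detail[0]
        approx.append(even[i] + ((left + detail[i] + 2) >> 2))
    return approx, detail


def legall53_inverse(approx, detail):
    """Undo the lifting steps in reverse order to recover the original samples."""
    half = len(approx)
    even = []
    for i in range(half):
        left = detail[i - 1] if i > 0 else detail[0]
        even.append(approx[i] - ((left + detail[i] + 2) >> 2))
    x = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]
        x.extend([even[i], detail[i] + ((even[i] + right) >> 1)])
    return x


if __name__ == "__main__":
    row = [10, 12, 14, 13, 9, 7, 7, 8]
    low, high = legall53_forward(row)
    print(low, high)
    print(legall53_inverse(low, high) == row)
```

Keeping only the approximation coefficients halves the data along that axis, which is consistent with the abstract's idea of running the network on the low-frequency part of the compressed data.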
ARTICLE | doi:10.20944/preprints202204.0163.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: artificial intelligence; deep learning; image-to-image translation; dual-energy computed tomography; pulmonary embolism; emergency radiology
Online: 18 April 2022 (09:45:00 CEST)
Detector-based spectral CT offers the possibility of obtaining spectral information from which discrete acquisitions at different energy levels can be derived, yielding so-called virtual monoenergetic images (VMI). In this study, we aimed to develop a jointly optimized deep learning framework based on dual-energy CT pulmonary angiography (DE-CTPA) data to generate synthetic monoenergetic images (SMI) for improving automatic pulmonary embolism (PE) detection in single-energy CTPA scans. For this purpose, we used two data sets: our institutional DE-CTPA data set D1 comprising polyenergetic arterial series and the corresponding VMI at low-energy levels (40 keV) with 7,892 image pairs, and a 10% subset of the 2020 RSNA Pulmonary Embolism Detection Challenge data set D2, which consisted of 161,253 polyenergetic images with dichotomous slice-wise annotations (PE/no PE). We trained a fully convolutional encoder-decoder on D1 to generate SMI from single-energy CTPA scans of D2, which were then fed into a ResNet50 network for training of the downstream PE classification task. The quantitative results on the reconstruction ability of our framework revealed high-quality visual SMI predictions with reconstruction results of 0.984 ± 0.002 (structural similarity) and 41.706 ± 0.547 dB (peak-signal-to-noise ratio). PE classification resulted in an AUC of 0.84 for our model, which achieved improved performance compared to other naive approaches with AUCs up to 0.81. Our study stresses the role of using joint optimization strategies for deep learning algorithms to improve automatic PE detection. The proposed pipeline may prove to be beneficial for computer-aided detection systems and could help rescue CTPA studies with suboptimal opacification of the pulmonary arteries from single-energy CT scanners.
ARTICLE | doi:10.20944/preprints202105.0605.v1
Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: deep learning; computed tomography; image classification; COVID-19; medical image analysis; pneumonia; CNN, LSTM, medical diagnosis
Online: 25 May 2021 (10:32:29 CEST)
Advancements in deep learning and the availability of medical imaging data have led to the use of CNN-based architectures in diagnostic assistance systems. In spite of the abundant use of reverse transcription-polymerase chain reaction (RT-PCR) based tests in COVID-19 diagnosis, CT images offer an applicable supplement with their high sensitivity rates. Here, we study classification of COVID-19 pneumonia (CP) and non-COVID-19 pneumonia (NCP) in chest CT scans using efficient deep learning methods to be readily implemented by any hospital. We report our deep network framework design that encompasses Convolutional Neural Networks (CNNs) and bidirectional Long Short Term Memory (biLSTM) architectures. Our study achieved high specificity (CP: 98.3%, NCP: 96.2%, Healthy: 89.3%) and high sensitivity (CP: 84.0%, NCP: 93.9%, Healthy: 94.9%) in classifying COVID-19 pneumonia, non-COVID-19 pneumonia and healthy patients. Next, we provide visual explanations for the CNN predictions with gradient-weighted class activation mapping (Grad-CAM). The results provided model explainability by showing that Ground Glass Opacities (GGO), indicators of COVID-19 pneumonia, were captured by our CNN. Finally, we have implemented our approach in three hospitals, demonstrating its compatibility and efficiency.
ARTICLE | doi:10.20944/preprints201710.0187.v1
Subject: Computer Science And Mathematics, Analysis Keywords: medical image classification; local binary patterns; characteristic curves; whole slide image processing; automated HER2 scoring
Online: 31 October 2017 (03:10:22 CET)
This paper presents novel feature descriptors and classification algorithms for automated scoring of HER2 in Whole Slide Images (WSI) of breast cancer histology slides. Since a large amount of processing is involved in analyzing WSI images, the primary design goal has been to keep the computational complexity to the minimum possible level and to use simple, yet robust feature descriptors that can provide accurate classification of the slides. We propose two types of feature descriptors that encode important information about staining patterns and the percentage of staining present in ImmunoHistoChemistry (IHC) stained slides. The first descriptor is called a characteristic curve which is a smooth non-increasing curve that represents the variation of percentage of staining with saturation levels. The second new descriptor introduced in this paper is an LBP feature curve which is also a non-increasing smooth curve that represents the local texture of the staining patterns. Both descriptors show excellent interclass variance and intraclass correlation, and are suitable for the design of automatic HER2 classification algorithms. This paper gives the detailed theoretical aspects of the feature descriptors and also provides experimental results and comparative analysis.
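One way to picture the characteristic curve described above is the sketch below. The abstract does not give the exact definition or the smoothing step, so this is an assumption: the curve is taken to be the percentage of pixels whose saturation is at least each threshold level, which is non-increasing by construction. The name `characteristic_curve` and its arguments are hypothetical.

```python
def characteristic_curve(saturations, levels):
    """Percentage of pixels with saturation >= each level.

    saturations: flat list of per-pixel saturation values (e.g. in [0, 1]).
    levels: ascending list of saturation thresholds.
    This thresholding definition is an assumption, not the paper's formula.
    """
    if not saturations:
        raise ValueError("saturations must be non-empty")
    n = len(saturations)
    return [100.0 * sum(1 for s in saturations if s >= t) / n for t in levels]


# Toy example: four pixels, four thresholds.
curve = characteristic_curve([0.1, 0.4, 0.4, 0.9], [0.0, 0.25, 0.5, 1.0])
```

Because raising the threshold can only exclude pixels, the resulting values never increase, which matches the non-increasing shape the abstract describes.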
ARTICLE | doi:10.20944/preprints201710.0181.v1
Subject: Computer Science And Mathematics, Analysis Keywords: ultrasound image analysis; speckle noise; synthetic ultrasound images; texture features; local binary patterns; image quality assessment
Online: 30 October 2017 (09:37:59 CET)
Speckle noise reduction is an important area of research in the field of ultrasound image processing. Several algorithms for speckle noise characterization and analysis have been recently proposed in the area. Synthetic ultrasound images can play a key role in noise evaluation methods as they can be used to generate a variety of speckle noise models under different interpolation and sampling schemes, and can also provide valuable ground truth data for estimating the accuracy of the chosen methods. However, not much work has been done in the area of modelling synthetic ultrasound images, and in simulating speckle noise generation to get images that are as close as possible to real ultrasound images. An important aspect of simulated synthetic ultrasound images is the requirement for extensive quality assessment for ensuring that they have the texture characteristics and gray-tone features of real images. This paper presents texture feature analysis of synthetic ultrasound images using local binary patterns (LBP) and demonstrates the usefulness of a set of LBP features for image quality assessment. Experimental results presented in the paper clearly show how these features could provide an accurate quality metric that correlates very well with subjective evaluations performed by clinical experts.
ARTICLE | doi:10.20944/preprints202304.0596.v2
Subject: Engineering, Bioengineering Keywords: Deep Learning; image-to-image translation; dosimetry; cycleGAN; CBCT; CT; limited FOV; artifact correction; Hounsfield unit recovery
Online: 5 June 2023 (09:53:36 CEST)
Radiotherapy commonly utilizes CBCT for patient positioning and treatment monitoring. CBCT is deemed to be safe for patients, making it suitable for the delivery of fractional doses. However, limitations such as a narrow field of view, beam hardening, scattered radiation artifacts, and variability in pixel intensity hinder the direct use of raw CBCT for dose recalculation during treatment. To address this issue, reliable correction techniques are necessary to remove artifacts and remap pixel intensity into HU values. This study proposes a deep-learning framework for calibrating CBCT images acquired with narrow FOV systems and demonstrates its potential use in proton treatment planning updates. A cycle-consistent GAN processes raw CBCT to reduce scatter and remap HU. Monte Carlo simulation is used to generate CBCT scans, making it possible to focus solely on the algorithm’s ability to reduce artifacts and cupping effects without considering intra-patient longitudinal variability, and producing a fair comparison between planning CT and calibrated CBCT dosimetry. To showcase the viability of the approach using real-world data, experiments were also conducted using real CBCT. Tests were performed on a publicly available dataset of 40 patients who received ablative radiation therapy for pancreatic cancer. The simulated CBCT calibration led to a difference in proton dosimetry of less than 2% compared to the planning CT. The potential toxicity effect on the organs at risk decreased from about 50% (uncalibrated) to 2% (calibrated). The gamma pass rate at 3%/2mm improved by about 37% in replicating the prescribed dose after calibration (53.78% vs 90.26%). Real data also confirmed this, with slightly inferior performance for the same criteria (65.36% vs 87.20%).
These results may confirm that generative artificial intelligence brings the use of narrow FOV CBCT scans incrementally closer to clinical translation in proton therapy planning updates.
ARTICLE | doi:10.20944/preprints202206.0384.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Deep Learning; Smartphone Image; Acne Grading; Acne Object Detection
Online: 28 June 2022 (10:05:25 CEST)
Skin image analysis using artificial intelligence (AI) has recently attracted significant research interest, particularly for analyzing skin images captured by mobile devices. Acne is one of the most common skin conditions with profound effects in severe cases. In this study, we developed an AI system called AcneDet for automatic acne object detection and acne severity grading using facial images captured by smartphones. AcneDet includes two models for conducting two tasks: (1) a Faster R-CNN-based deep learning model for the detection of acne lesion objects of four types including blackheads/whiteheads, papules/pustules, nodules/cysts, and acne scars; and (2) a LightGBM machine learning model for grading acne severity using the Investigator’s Global Assessment (IGA) scale. The output of the Faster R-CNN model, i.e., the counts of each acne type, was used as input for the LightGBM model for acne severity grading. A dataset consisting of 1,572 labeled facial images captured by both iOS and Android smartphones was used for training. The results show that the Faster R-CNN model achieves a mAP of 0.54 for acne object detection. The mean accuracy of acne severity grading by the LightGBM model is 0.85. With this study, we hope to contribute to the development of artificial intelligence systems that are able to help acne patients understand more about their conditions and support doctors in acne diagnosis.
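The hand-off between the two models described above (per-type lesion counts from the detector fed to the grading model) can be sketched as follows. This is an illustrative assumption, not the AcneDet implementation: the function `count_features`, the `(label, score)` detection format, and the `score_threshold` of 0.5 are hypothetical, and only the four lesion type names come from the abstract.

```python
from collections import Counter

# The four lesion types named in the abstract, in a fixed feature order.
ACNE_TYPES = (
    "blackheads/whiteheads",
    "papules/pustules",
    "nodules/cysts",
    "acne scars",
)


def count_features(detections, score_threshold=0.5):
    """Turn detector output into a per-type count vector.

    detections: iterable of (label, confidence score) pairs; this format
    and the default threshold are assumptions for illustration.
    Returns a list of counts ordered like ACNE_TYPES.
    """
    counts = Counter(
        label
        for label, score in detections
        if score >= score_threshold and label in ACNE_TYPES
    )
    return [counts[t] for t in ACNE_TYPES]


# Toy example: three confident detections and one below the threshold.
features = count_features([
    ("papules/pustules", 0.91),
    ("papules/pustules", 0.77),
    ("acne scars", 0.64),
    ("nodules/cysts", 0.30),
])
```

A vector like this, one per image, is the kind of tabular input a gradient-boosted classifier could be trained on to predict the IGA grade.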
ARTICLE | doi:10.20944/preprints201812.0137.v2
Subject: Computer Science And Mathematics, Mathematical And Computational Biology Keywords: microscopy, fluorescence, machine learning, deep learning, inverse problems, image reconstruction, image restoration, super-resolution, deconvolution, spectral unmixing
Online: 5 February 2019 (10:30:40 CET)
Deep Learning is a recent and important addition to the computational toolbox available for image reconstruction in fluorescence microscopy. We review state-of-the-art applications such as image restoration, super-resolution, and light-field imaging, and discuss how the latest Deep Learning research can be applied to other image reconstruction tasks such as structured illumination, spectral deconvolution, and sample stabilisation. Despite its successes, Deep Learning also poses significant challenges: its capabilities are often misunderstood and its limits overlooked. We will address key questions, such as: What are the challenges in obtaining training data? Can we discover structures not present in the training data? And, what is the danger of inferring unsubstantiated image details?
ARTICLE | doi:10.20944/preprints201709.0098.v2
Subject: Environmental And Earth Sciences, Remote Sensing Keywords: farming-pasture ecotone; TM image; remote sensing; vegetation cover factor; scale conversion; land use; high resolution image
Online: 21 September 2017 (16:33:49 CEST)
The key to simulating soil erosion is to calculate the vegetation cover (C) factor. Methods that apply remote sensing to calculate C factor at regional scale cannot directly use the C factor formula. That is because the C factor formula is obtained by experiment, and needs the coverage ratio data of croplands, woodlands and grasslands at standard plot scale. In this paper, we present a C factor conversion method from a standard plot to a km-sized grid based on large sample theory and multi-scale remote sensing. Results show that: 1) Compared with the existing C factor formula, our method is based on the coverage ratio of croplands, woodlands and grasslands on a km-sized grid, takes the C factor formula obtained from the standard plot experiment and applies it to regional scale. This method improves the applicability of the C factor formula, and can satisfy the need to simulate soil erosion in large areas. 2) The vegetation coverage obtained by remote sensing interpretation is significantly consistent (paired samples t-test, t = −0.03, df = 0.12, 2-tail significance p < 0.05) and significantly correlated with the measured vegetation coverage. 3) The C factor of the study area is smaller in the middle, southern and northern regions, and larger in the eastern and western regions. The main reason for that is the distribution of woodlands, the Hunshandake and Horqin sandy lands and the valleys affected by human activities. 4) The method presented in this paper is more meticulous than the C factor method based on the vegetation index, improves the applicability of the C factor formula, and can be used to simulate soil erosion on large scale and provide strong support for regional soil and water conservation planning.
ARTICLE | doi:10.20944/preprints202311.0891.v1
Subject: Biology And Life Sciences, Agricultural Science And Agronomy Keywords: Olive; Color Image; Xception; Sorting; Deep Learning
Online: 14 November 2023 (11:54:41 CET)
Olive fruits at different ripening stages give rise to various table olive products and oil qualities. Therefore, developing an efficient method for recognizing and sorting olive fruits based on their ripening stages can greatly facilitate postharvest processing. This study introduces an automatic computer vision system that utilizes deep learning technology to classify the 'Roghani' Iranian olive cultivar into five ripening stages using color images. The developed model employs convolutional neural networks (CNN) and transfer learning based on the Xception architecture and ImageNet weights as the base network. The model was fine-tuned by testing multiple configurations of well-known CNN layers. To minimize overfitting and enhance model generality, data augmentation techniques were employed. By considering different optimizers and two image sizes, four final candidate models were generated. These models were then compared in terms of loss and accuracy on the test dataset, classification performance (classification report and confusion matrix), and generality. All four candidates exhibited high accuracies ranging from 86.93% to 93.46% and comparable classification performance. In all models, at least one class was recognized with 100% accuracy. However, by taking into account the risk of overfitting, two models were discarded. Finally, a model with an image size of 224 × 224 and an SGD optimizer, which had a loss of 1.23 and an accuracy of 86.93%, was selected as the preferred option. The results of this study offer robust tools for automatic olive sorting systems, simplifying the differentiation of olives at various ripening levels for different post-harvest products.
REVIEW | doi:10.20944/preprints202310.0870.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: image inpainting object removal detection forensic forgery
Online: 13 October 2023 (08:25:14 CEST)
In recent years, significant advancements in the field of machine learning have influenced the domain of image restoration. While these technological advancements present prospects for improving the quality of images, they also present difficulties, particularly the proliferation of manipulated or counterfeit multimedia information on the internet. The objective of this paper is to provide a comprehensive review of existing inpainting algorithms and forgery detection methods, with a specific emphasis on techniques designed for removing objects from digital images. In this study, we examine various techniques, encompassing conventional texture synthesis methods as well as those based on neural networks. Furthermore, we explore the artifacts used to identify modified photos, present the artifacts frequently introduced by the inpainting procedure, and assess the state-of-the-art technology for detecting such modifications. Lastly, we look at the available datasets and how the methods compare with each other. Having covered all of the above, the final outcome of this study is a comprehensive perspective on the abilities and constraints of detecting images to which an inpainting object removal method was applied.
ARTICLE | doi:10.20944/preprints202309.0400.v1
Subject: Physical Sciences, Astronomy And Astrophysics Keywords: pulsar candidate image; lossless compression; PixelCNN; FAST
Online: 6 September 2023 (09:48:00 CEST)
The study focuses on the crucial aspect of lossless compression for FAST pulsar search data. The deep generative model PixelCNN, which stacks multiple masked convolutional layers, achieves neural network autoregressive modeling, making it one of the best image density estimators. However, the local nature of convolutional networks causes PixelCNN to concentrate only on nearby information, neglecting important information at greater distances. Although deepening the network can broaden the receptive field, excessive depth can compromise model stability, leading to issues like gradient degradation. To address these challenges, the study combines causal attention modules with residual connections, proposing the Causal Residual Attention Module to enhance the PixelCNN model. This innovation not only resolves convergence problems arising from network deepening but also widens the receptive field. It effectively utilizes global features, particularly capturing vertically correlated features prominently present in subgraphs of candidates. This significantly enhances its capability to model pulsar data. In the experiments, the model is trained and validated using the HTRU1 dataset. The study compares the average negative log-likelihood score with baseline models like GMM, STM, and PixelCNN. The results demonstrate the superior performance of our model over the other models. Finally, the study introduces the practical compression encoding process by combining the proposed model with arithmetic coding.
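The masked convolutions that PixelCNN stacks can be illustrated with a small sketch of the kernel mask for a single-channel image. This shows the standard PixelCNN masking idea rather than the proposed Causal Residual Attention Module; the function name `causal_mask` and its `include_center` flag are hypothetical names chosen for this example.

```python
def causal_mask(kernel_size, include_center):
    """Binary mask for a PixelCNN-style masked convolution kernel.

    A 1 marks a kernel position that may be used: every row above the
    centre row, and positions left of the centre in the centre row.
    include_center=False corresponds to the first-layer ("type A") mask,
    include_center=True to the later-layer ("type B") mask.
    """
    if kernel_size < 1 or kernel_size % 2 == 0:
        raise ValueError("kernel_size must be a positive odd integer")
    c = kernel_size // 2
    mask = []
    for r in range(kernel_size):
        row = []
        for col in range(kernel_size):
            # Pixel precedes the centre in raster-scan order.
            before = r < c or (r == c and col < c)
            at_center = r == c and col == c
            row.append(1 if before or (at_center and include_center) else 0)
        mask.append(row)
    return mask
```

Multiplying each kernel by such a mask keeps every prediction dependent only on pixels earlier in raster-scan order. The same restriction also keeps the receptive field local, which is the limitation the abstract sets out to address.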
ARTICLE | doi:10.20944/preprints202308.0349.v1
Subject: Social Sciences, Other Keywords: Body dissatisfaction, Body image, Female students, Perfection
Online: 3 August 2023 (14:07:43 CEST)
Many female university students are concerned about their bodies. Body image perception has become a public health issue globally. This study aimed to explore factors contributing to body image dissatisfaction among female students at the University of Venda. The study was qualitative in nature and employed an exploratory research design. A sample of 10 female students enrolled at the University of Venda was identified using a convenience sampling method. A pre-tested, semi-structured interview guide was used to collect data, and thematic content analysis was used to analyse the collected data. The findings of the study showed that body comparison, societal beauty standards, social media, and body shaming by family and friends were the main factors contributing to students' body image dissatisfaction. The findings further revealed that lack of self-confidence, stress, avoidance, anxiety and depressive symptoms were the challenges faced by students with body image dissatisfaction. Acceptance, self-care, and a healthy diet were identified as coping strategies to help deal with the challenges of students' body image dissatisfaction. In conclusion, students should be encouraged to seek professional help timeously to help navigate their body image concerns and avoid a decline in their daily functioning.
ARTICLE | doi:10.20944/preprints202306.1593.v1
Subject: Business, Economics And Management, Business And Management Keywords: Ecotourism; birdwatching; destination image; destination marketing; Colombia
Online: 22 June 2023 (10:40:39 CEST)
Colombia is noteworthy as a biodiversity hotspot, featuring an extraordinary number of endemic orchids, birds, and butterflies. This exploratory study examines perceptions of destination image, considering the cognitive and affective image in predicting the behavioral intentions of ecotourists through symmetric data analysis. Using Partial Least Squares (PLS), the author(s) analyzed 64 survey responses collected in rural areas, including a new 15-statement scale specialized in birdwatching. The findings support the reliability of the model; the symmetric analysis shows the higher influence of emotions and affections in increasing intentions of recommendation, considering birdwatching as based on personal relationships. Additionally, the cognitive image held by birders, despite representing destination attributes or sets of destination resources in a mental picture, does not have the same impact on behavioral intentions. Therefore, managers should develop positioning strategies based on the generation of emotions, because birdwatching tourists seek more emotional experiences.
ARTICLE | doi:10.20944/preprints202305.0067.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: low light; image enhancement; convolutional neural networks
Online: 2 May 2023 (07:32:56 CEST)
In this study, we explore the potential of using a straightforward neural network inspired by the retina model to efficiently restore low-light images. The retina model imitates the neurophysiological principles and dynamics of various optical neurons. Our proposed neural network model reduces the computational overhead compared to traditional signal-processing models while achieving results similar to complex deep learning models from a subjective perceptual perspective. By directly simulating retinal neuron functionalities with neural networks, we not only avoid manual parameter optimization but also lay the groundwork for constructing artificial versions of specific neurobiological organizations.