ARTICLE | doi:10.20944/preprints202308.1432.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Taekwondo poomsae; action recognition; skeletal data; camera viewpoint; martial arts
Online: 21 August 2023 (07:48:57 CEST)
Issues of fairness and consistency in Taekwondo poomsae evaluation have emerged owing to the lack of an objective evaluation method. This study proposes a three-dimensional (3D) convolutional neural network (CNN)-based action recognition model for the objective evaluation of Taekwondo poomsae. The model exhibits robust recognition performance regardless of variation in perspective by reducing the discrepancies between training and test images. The model uses 3D skeletons of the poomsae unit action collected using a full-body motion-capture suit to generate synthesized two-dimensional (2D) skeletons from the desired perspective. This approach aids in obtaining 2D skeletons from diverse perspectives as part of the training dataset and ensures consistent recognition performance regardless of the viewpoint. The model was trained using 2D skeletons projected from diverse viewpoints, and its performance was evaluated using various test datasets, including projected 2D skeletons and RGB images captured from various viewpoints. Comparison of the performance of the proposed model with that of previously reported action recognition models demonstrated the superiority of the model, underscoring its effectiveness in recognizing and classifying Taekwondo poomsae actions.
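As a rough illustration of the viewpoint-synthesis step this abstract describes (not the authors' code; the pinhole intrinsics, joint count, and viewing distance below are placeholder assumptions), a mocap skeleton can be rotated to an arbitrary azimuth and projected to a synthetic 2D skeleton:

```python
import numpy as np

def project_skeleton(joints_3d, yaw_deg, fx=1000.0, fy=1000.0,
                     cx=640.0, cy=360.0, distance=4.0):
    """Project 3D joints (N x 3, metres) to 2D pixels from a virtual
    camera rotated yaw_deg about the vertical axis (pinhole model)."""
    yaw = np.radians(yaw_deg)
    R = np.array([[ np.cos(yaw), 0.0, np.sin(yaw)],   # rotation about y
                  [ 0.0,         1.0, 0.0        ],
                  [-np.sin(yaw), 0.0, np.cos(yaw)]])
    cam = joints_3d @ R.T
    cam = cam + np.array([0.0, 0.0, distance])  # place skeleton ahead of camera
    u = fx * cam[:, 0] / cam[:, 2] + cx
    v = fy * cam[:, 1] / cam[:, 2] + cy
    return np.stack([u, v], axis=1)             # synthesized 2D skeleton

# Sweep azimuths to turn one mocap clip into a multi-view training set.
joints = np.random.rand(17, 3) - 0.5            # placeholder skeleton
views = [project_skeleton(joints, a) for a in range(0, 360, 30)]
```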
ARTICLE | doi:10.20944/preprints202008.0129.v1
Subject: Computer Science And Mathematics, Information Systems Keywords: Streaming data; Optical camera communications; rolling shutter camera
Online: 5 August 2020 (10:40:12 CEST)
This paper addresses a method to transmit streaming data via rolling shutter camera-based optical camera communications (OCC). Since the amount of data that can be contained within one frame is limited, and the continuity of received data cannot be guaranteed due to OCC environmental variations, we introduce the concept of dividing the streaming data into several fragmented sets that are transmitted sequentially. We propose a superframe to contain sequential packets of fragmented data and corresponding indexes, so that sequential packets for streaming data can be continuously collected. When redundant frame transmission is considered, any packet lost due to OCC environmental conditions can be recovered. Experimental results show that the proposed method can be successfully used to transmit streaming data, with the number of redundant frames required to acquire all data packets depending on image resolution. In addition, we describe how to identify missing packets from a network point of view to reduce the number of redundant frames needed to acquire all the data. This paper presents baseline results of communication performance when sending streaming data via rolling shutter-based OCC.
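A minimal sketch of the fragmentation-and-indexing idea behind the superframe (the two-byte header is a toy format invented for illustration; the paper's actual packet layout is not specified here):

```python
def fragment(stream: bytes, payload_len: int):
    """Split streaming data into indexed packets for a superframe.
    Toy header: 1-byte index, 1-byte total count (max 255 packets)."""
    chunks = [stream[i:i + payload_len]
              for i in range(0, len(stream), payload_len)]
    return [bytes([idx, len(chunks)]) + c for idx, c in enumerate(chunks)]

def reassemble(packets):
    """Collect packets (possibly duplicated or out of order, as with
    redundant frame transmission) and report missing indexes."""
    seen, total = {}, None
    for p in packets:
        idx, total = p[0], p[1]
        seen[idx] = p[2:]
    missing = sorted(set(range(total)) - set(seen)) if total else []
    data = b"".join(seen[i] for i in sorted(seen)) if not missing else None
    return data, missing

pkts = fragment(b"streaming data over rolling-shutter OCC", payload_len=8)
data, missing = reassemble(pkts[::-1] + pkts[:2])   # reordered + duplicates
assert data == b"streaming data over rolling-shutter OCC" and not missing
```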
ARTICLE | doi:10.20944/preprints202312.0491.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: Autonomous vehicles; camera orientation estimation; vanishing point; camera extrinsic parameters
Online: 7 December 2023 (16:49:02 CET)
This study introduces a multilayer perceptron (MLP) error compensation method for real-time camera orientation estimation, leveraging a single vanishing point and road lane lines within a steady-state framework. The research emphasizes cameras with a roll angle of 0°, predominant in autonomous vehicle contexts. The methodology estimates pitch and yaw angles using a single image and integrates two Kalman filter models with inputs from image points (u, v) and derived angles (pitch, yaw). Performance metrics, including AE, MINE, MAXE, SSE, and STDEV, were utilized, testing the system in both simulator and real-vehicle environments. The outcomes indicate that our method notably enhances the accuracy of camera orientation estimations, consistently outpacing competing techniques across varied scenarios. This method’s potency is evident in its adaptability and precision, holding promise for advanced vehicle systems and real-world applications.
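The steady-state geometric step can be sketched in closed form under the stated zero-roll assumption. The rotation order and sign conventions below are one common choice, and the paper's MLP error compensation and Kalman filtering are not shown:

```python
import numpy as np

def pitch_yaw_from_vp(u, v, fx, fy, cx, cy):
    """Closed-form pitch/yaw from a lane-line vanishing point (u, v),
    assuming zero roll and rotation order R = Rx(pitch) @ Ry(yaw);
    v grows downward, so a vanishing point above the principal point
    means the camera is pitched up."""
    pitch = np.arctan((cy - v) / fy)
    yaw = np.arctan((u - cx) / fx * np.cos(pitch))
    return pitch, yaw

# Illustrative intrinsics for a 1280 x 720 camera:
pitch, yaw = pitch_yaw_from_vp(u=700.0, v=330.0,
                               fx=1000.0, fy=1000.0, cx=640.0, cy=360.0)
print(np.degrees(pitch), np.degrees(yaw))   # ~1.72 and ~3.43 degrees
```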
ARTICLE | doi:10.20944/preprints202309.1690.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: Camera Calibration; Vanishing Point Detection; Transformer
Online: 26 September 2023 (02:15:36 CEST)
Previous camera self-calibration methods have exhibited notable shortcomings. On one hand, they emphasized either scene cues or vehicle-related cues exclusively, resulting in a lack of adaptability to diverse scenarios and a limited number of effective features. Furthermore, they utilized either only geometric features within traffic scenes or only semantic information, failing to consider both aspects together. This limited the comprehensiveness of feature extraction from scenes, ultimately reducing calibration accuracy. Additionally, conventional vanishing point-based self-calibration methods often required the design of additional edge-background models and manual parameter tuning, increasing operational complexity and the potential for errors. To address these limitations, we propose an innovative roadside camera self-calibration model based on the Transformer architecture. This model can simultaneously learn scene features and vehicle features within traffic scenarios while considering both geometric and semantic information. Through this approach, our model overcomes the constraints of prior methods, enhancing calibration accuracy and robustness while reducing operational complexity and the potential for errors. Our method outperforms existing approaches on both real-world scenarios and publicly available datasets, demonstrating its effectiveness.
ARTICLE | doi:10.20944/preprints202307.1921.v1
Subject: Environmental And Earth Sciences, Ecology Keywords: Vultures; decomposition state; Camera traps; dominance
Online: 27 July 2023 (12:01:51 CEST)
The species composition of vultures and their interactions on carcasses at various stages of decomposition are not well understood, yet they potentially affect the birds' food acquisition and survival. We collected data from six carcasses between June and December 2021 using camera traps set on carcasses undergoing various decomposition states in Sinamatella Camp of Hwange National Park, Zimbabwe. Of interest were the cases in which a vulture species dominated interactions, giving it an advantage in terms of food acquisition. Four vulture species were observed (White-backed, White-headed, Lappet-faced and Hooded Vultures). Vulture abundances were greatest on fresh carcasses and least on dry ones. Although dominance behaviors by the White-backed and White-headed Vultures were recorded over all other vulture species, there were no records of the Lappet-faced Vulture dominating other vultures. In addition, Hooded Vultures mostly dominated non-vulture avian species on carcasses in advanced decay. Our results demonstrate how various species may be prone to intense competition that may further place them at a disadvantage if food sources decline, more so under climatic shifts and various anthropogenic pressures.
ARTICLE | doi:10.20944/preprints202205.0006.v1
Subject: Biology And Life Sciences, Biophysics Keywords: structured illumination; fluorescence; brain; multi-camera
Online: 4 May 2022 (12:24:22 CEST)
Fluorescence microscopy provides an unparalleled tool for imaging biological samples. However, producing high-quality volumetric images quickly and without excessive complexity remains a challenge. Here, we demonstrate a simple multi-camera structured illumination microscope (SIM) capable of simultaneously imaging multiple focal planes, allowing for the capture of 3D fluorescent images without any axial movement of the sample. This simple setup allows for the acquisition of many different 3D imaging modes, including 3D time lapses, high-axial-resolution 3D images, and large 3D mosaics.
ARTICLE | doi:10.20944/preprints202005.0221.v1
Subject: Computer Science And Mathematics, Robotics Keywords: odometry; camera; positioning; navigation; indoor; robot
Online: 13 May 2020 (04:45:54 CEST)
Positioning is an essential aspect of robot navigation, and visual odometry is an important technique for continuously updating a robot's internal position information, especially indoors where GPS is unavailable. Visual odometry uses one or more cameras to find visual clues and estimate robot movements in 3D in relative terms. Recent progress has been made, especially with fully integrated systems such as the RealSense T265 from Intel, which is the focus of this article. We compare three visual odometry systems and one wheel odometry system on a ground robot. We do so in 8 scenarios, varying the speed and the number of visual features, with or without humans walking in the field of view. We continuously measure the position error in translation and rotation thanks to a ground-truth positioning system. Our results show that all odometry systems are challenged, but in different ways. On average, ORB-SLAM2 has the poorest results, while the RealSense T265 and the ZED Mini have comparable performance. In conclusion, a single odometry system might still not be sufficient, so using multiple instances and sensor fusion approaches is necessary while waiting for additional research and further improved products.
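The translation and rotation errors reported against ground truth are typically computed as below (a generic sketch, not the study's evaluation script):

```python
import numpy as np

def pose_errors(R_est, t_est, R_gt, t_gt):
    """Translation error (m) and rotation error (deg) of an odometry
    estimate against a ground-truth pose (R = 3x3 rotation matrix,
    t = translation vector)."""
    t_err = np.linalg.norm(t_est - t_gt)
    R_rel = R_est.T @ R_gt                        # residual rotation
    cos_a = (np.trace(R_rel) - 1.0) / 2.0         # angle from the trace
    r_err = np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0)))
    return t_err, r_err

t_err, r_err = pose_errors(np.eye(3), np.array([0.10, 0.0, 0.0]),
                           np.eye(3), np.zeros(3))
print(t_err, r_err)   # 0.1 m, 0.0 deg
```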
DATA DESCRIPTOR | doi:10.20944/preprints201810.0179.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: imaging; CMOS; camera; SNR; noise; performance
Online: 9 October 2018 (09:38:23 CEST)
Expensive cameras meant for research applications are usually characterized by their manufacturers, and detailed specifications are available for them. Suppliers of inexpensive cameras usually do not provide such detailed information. This data set provides the acquisition speed and noise characteristics of a monochrome 1.2-megapixel CMOS camera, the QHY5L-II M. The source code provided along with this data set can also be used to acquire similar data for other QHY cameras. This enables the use of such cost-effective cameras for scientific applications in other fields, beyond their designed use in astronomy.
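One standard way to obtain such noise characteristics from any camera, including inexpensive ones, is the photon-transfer practice of differencing two frames taken under identical conditions (a generic sketch, not this data set's source code):

```python
import numpy as np

def temporal_noise(frame_a, frame_b):
    """Temporal (read + shot) noise from two frames captured under
    identical conditions: differencing cancels fixed-pattern noise,
    and sqrt(2) corrects for the variance added by differencing."""
    diff = frame_a.astype(np.float64) - frame_b.astype(np.float64)
    return diff.std() / np.sqrt(2.0)

def snr(frame_a, frame_b):
    """Mean signal over temporal noise for a uniform exposure."""
    mean_signal = (frame_a.mean() + frame_b.mean()) / 2.0
    return mean_signal / temporal_noise(frame_a, frame_b)

a, b = np.random.poisson(500.0, (2, 480, 640)).astype(float)
print(snr(a, b))   # ~ sqrt(500) for shot-noise-limited frames
```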
COMMUNICATION | doi:10.20944/preprints202302.0003.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: image forensics; camera identification; fingerprint; forgery; PRNU
Online: 1 February 2023 (01:30:04 CET)
In the field of forensic imaging, it is important to be able to extract a “camera fingerprint” from one or a small set of images known to have been taken by the same camera (image sensor). Ideally, that fingerprint would be used to identify an individual source camera. The camera fingerprint is based on a certain kind of random noise, present in all image sensors, that is due to manufacturing imperfections and is thus unique and impossible to avoid. PRNU (Photo-Response Non-Uniformity) has become the most widely used method for SCI (Source Camera Identification). In this paper, we design a set of “attacks” on a PRNU-based SCI system and measure the success of each method. We understand an attack method as any processing that minimally alters image quality and is designed to fool PRNU detectors (or, generalizing, any camera fingerprint detector). The PRNU-based SCI system was taken from an outstanding reference that is publicly available.
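A minimal sketch of the PRNU pipeline such attacks must defeat, operating on float grayscale images (the Gaussian filter stands in for the wavelet denoiser used in practice; thresholds and preprocessing are omitted):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def residual(img):
    """Noise residual: image minus a denoised version. Real PRNU work
    uses a wavelet denoiser; a Gaussian filter is a stand-in here."""
    return img - gaussian_filter(img, sigma=1.0)

def fingerprint(images):
    """Maximum-likelihood-style PRNU estimate from flat-field images."""
    num = sum(residual(i) * i for i in images)
    den = sum(i * i for i in images) + 1e-8
    return num / den

def correlate(query, K):
    """Normalized correlation between a query residual and K * query;
    a high value supports 'same camera'. Attacks aim to suppress it
    while leaving the image visually intact."""
    w, s = residual(query).ravel(), (K * query).ravel()
    w, s = w - w.mean(), s - s.mean()
    return float(w @ s / (np.linalg.norm(w) * np.linalg.norm(s) + 1e-12))
```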
ARTICLE | doi:10.20944/preprints201905.0286.v1
Subject: Physical Sciences, Applied Physics Keywords: shear band; pyrometry; punch test; streak camera
Online: 23 May 2019 (16:26:52 CEST)
This paper presents the development of a new system designed to measure the local temperature field in an adiabatic shear band. Transient temperature fields are simultaneously recorded by an array of 32 InSb infrared (IR) detectors and a streak camera working in the visible-near infrared (VIS-NIR). Observations in IR offer a low temperature detection limit (350°C) but are highly sensitive to uncertainty in the emissivity. Observations in VIS-NIR allow measurement only at high temperatures (750°C) but are less affected by emissivity uncertainty and present a higher temperature sensitivity. By performing simultaneous measurements, it is possible to obtain data over a large temperature range with improved accuracy at high temperature. The different sources of error caused by uncertainty in the emissivity and by the spatial and temporal resolution of the detectors have been analyzed, and an estimate of the total measurement uncertainty of the system is given.
ARTICLE | doi:10.20944/preprints202308.1118.v1
Subject: Engineering, Industrial And Manufacturing Engineering Keywords: monocular camera; world coordinates; pose measurement; rigid body
Online: 15 August 2023 (10:33:35 CEST)
A method of measuring the absolute pose parameters of a moving rigid body using a monocular camera is proposed, aiming at addressing calibration difficulties and inconsistencies of repeated measurements of the rigid-body pose for a camera having a varying focal length. The proposed method does not require calibration beforehand. Using more than six non-coplanar control points symmetrically arranged in the rigid-body and world coordinate systems, the matrices of rotation and translation between the camera and two coordinate systems are obtained and the absolute pose of the rigid body measured. In this paper, formulas of the absolute pose measurement of a moving rigid body are deduced systematically and the complete implementation is presented. Position and attitude measurement experiments carried out on a three-axis precision turntable show that the average absolute error in the attitude angle of a moving rigid body measured by an uncalibrated camera at different positions changes by no more than 0.2 degrees. Analysis of the three-dimensional coordinate errors of the centroid of a moving rigid body shows little deviation in measurements made at three camera positions, with the maximum deviation of the average absolute error being 0.53 cm and the maximum deviation of the standard deviation being 0.66 cm. The proposed method can measure the absolute pose of a rigid body and is insensitive to the position of the camera in the measurement process. This work thus provides guidance for the repeated measurement of the absolute pose of a moving rigid body using a monocular camera.
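With at least six non-coplanar control points, the projection matrix of an uncalibrated camera can be recovered by the classical direct linear transform and decomposed into intrinsics and pose. The sketch below shows that standard machinery, which the described setting rests on; it is a plausible illustration, not the paper's exact derivation:

```python
import numpy as np
from scipy.linalg import rq

def dlt_projection(X, x):
    """Direct linear transform: recover the 3x4 projection matrix P
    from >= 6 non-coplanar 3D control points X (N x 3) and their image
    observations x (N x 2); no prior camera calibration is needed,
    which suits a camera whose focal length varies between setups."""
    rows = []
    for (Xw, Yw, Zw), (u, v) in zip(X, x):
        rows.append([Xw, Yw, Zw, 1, 0, 0, 0, 0, -u*Xw, -u*Yw, -u*Zw, -u])
        rows.append([0, 0, 0, 0, Xw, Yw, Zw, 1, -v*Xw, -v*Yw, -v*Zw, -v])
    _, _, Vt = np.linalg.svd(np.asarray(rows, dtype=float))
    return Vt[-1].reshape(3, 4)        # right singular vector of min sigma

def decompose(P):
    """Split P = K [R | t] into intrinsics and pose by RQ decomposition.
    The sign fix-up keeps K's diagonal positive; the det(R) = -1
    reflection case is left unhandled in this sketch."""
    K, R = rq(P[:, :3])
    D = np.diag(np.sign(np.diag(K)))
    K, R = K @ D, D @ R                # D @ D = I, so K R is unchanged
    t = np.linalg.solve(K, P[:, 3])
    return K / K[2, 2], R, t
```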
ARTICLE | doi:10.20944/preprints201910.0168.v1
Subject: Physical Sciences, Radiation And Radiography Keywords: neutron imaging; neutron detector; X-ray detector; CCD camera; CMOS camera; collimator; X-rays; image processing; computed tomography; CT reconstruction
Online: 15 October 2019 (11:15:26 CEST)
Neutron computed tomography (nCT) has been established at many major neutron sources worldwide, using high-end equipment requiring major investment and development. Many older and smaller reactors would be capable of doing nCT as well, but cannot afford the investment before feasibility is proven. We have developed a compact low-cost but high-quality detection system using a new cooled CMOS camera that can either be fully integrated into a sophisticated setup, or used with a rudimentary CT control and motion system to quickly evaluate feasibility of neutron CT at a given beam line facility. Exchanging the scintillation screen makes it feasible for X-rays as well, even for visible light (and transparent samples) using a matte screen. The control system uses a hack to combine motion control with existing imaging software so it can be used to test several dozen different cameras without writing specific drivers. Freeware software can do reconstruction and 3D imaging.
ARTICLE | doi:10.20944/preprints202311.1933.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: optical camera communication (OCC); hybrid OCC waveform; IoT application
Online: 30 November 2023 (10:05:31 CET)
Optical wireless communication is a promising emerging technology that addresses the limitations of radio-frequency-based wireless technologies. This study presents a novel hybrid modulation for optical camera communication (OCC); it integrates two waveforms transmitted from a single transmitter light-emitting diode (LED) and received by two rolling-shutter camera devices: a smart camera with a high-resolution image sensor captures the high-frequency signal, and a low-resolution image sensor in a smartphone camera captures the low-frequency signal. Through this hybrid scheme, two data streams are transmitted from a single LED lamp, which reduces the cost of the indoor OCC device compared to one that transmits two signals from two different LEDs. In the proposed scheme, rolling-shutter orthogonal frequency-division multiplexing is used for the high-frequency signals, and M-ary frequency-shift keying is used for the low-frequency signals in the time domain. The proposed scheme is compatible with both smartphone and USB cameras. By controlling the OCC parameters, the hybrid scheme can be implemented with high performance for a communication distance of 10 m.
COMMUNICATION | doi:10.20944/preprints202310.1794.v1
Subject: Engineering, Bioengineering Keywords: compton camera; semiconductor detectors; efficiency; analytical reconstruction method; Geant4
Online: 27 October 2023 (11:35:01 CEST)
Compton cameras detect scattered gamma rays and estimate the distribution of gamma-ray sources. Nonetheless, crafting a camera tailored to a specific application presents formidable challenges, often necessitating the implementation of diverse image reconstruction techniques. Delving into the factors influencing these cameras can pave the way for design optimization and performance enhancement. This study introduces an inventive detector design for Compton imaging systems, building upon the achievements of prior designs. The proposed system contains eight scatterer detectors and a semiconductor absorber detector, spaced at 1 mm and 30 mm intervals, respectively. The source-to-first-scatterer-detector distance is 5 mm, with scatterer and absorber detector plates measuring 70 × 70 × 2.125 mm³ and 70 × 70 × 10 mm³, respectively. The Geant4 simulation toolkit models the Compton imaging system, and an analytical method reconstructs the Compton camera images. Unlike more straightforward techniques, the analytical method directly reconstructs the images by solving the equation relating the projection data to the image. This approach is implemented in the C++ programming language. The study's findings reveal that the analytical method discerns optimal conditions and parameters that significantly influence efficiency, yielding a full width at half maximum (FWHM) of 3.7 mm with an angular uncertainty of approximately 2.7 degrees at an energy of 0.662 MeV. Compared to another experimental design employing the analytical image reconstruction approach, the FWHM value decreased by 0.7 mm. This study presents an innovative detector design and an analytical reconstruction method for Compton imaging systems, showcasing improved efficiency and accuracy.
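The geometric core of any Compton camera, including this design, is the cone constraint from standard Compton kinematics; a minimal sketch for two-interaction events (illustrative energies, written in Python rather than the study's C++):

```python
import numpy as np

ME_C2 = 0.511   # electron rest energy, MeV

def compton_angle(e_scatter, e_absorb):
    """Scattering angle (rad) for a two-interaction event: e_scatter is
    the energy deposited in a scatterer plate, e_absorb the energy of
    the fully absorbed scattered photon (both in MeV)."""
    e0 = e_scatter + e_absorb                        # incident energy
    cos_t = 1.0 - ME_C2 * (1.0 / e_absorb - 1.0 / e0)
    if not -1.0 <= cos_t <= 1.0:
        return None                                  # kinematically invalid
    return float(np.arccos(cos_t))

# A 0.662 MeV (137Cs) photon depositing 0.3 MeV in a scatterer:
theta = compton_angle(0.3, 0.362)
print(np.degrees(theta))   # ~69 degrees
```

Each valid event confines the source to a cone of half-angle theta about the scatter axis; combining many such cones, whether by back-projection or by solving the imaging equation analytically as this study does, yields the source image.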
ARTICLE | doi:10.20944/preprints202310.1010.v1
Subject: Biology And Life Sciences, Ecology, Evolution, Behavior And Systematics Keywords: behavior ecology; interspecific competition; interference; community structures; camera-trapping
Online: 17 October 2023 (05:44:09 CEST)
This study presents a comprehensive ecological evaluation of avian species based on 5,322 photographs obtained through camera-trap sampling. We identified 1,427 independent bird sightings, encompassing 26 families and 49 species. The study focused on temporal activity patterns, nesting behaviors, habitat preferences, and the overlap coefficient of activity patterns among 22 species of Passeriformes. Two species exhibited predominant morning activity, while five species were active in the afternoon, and 15 exhibited cathemeral activity (activity throughout the day). A cross-analysis revealed varying degrees of overlap in the activity patterns of pairs of species with similar behavioral ecology. Our findings indicate that despite exhibiting similar ecological behavior, these species display unique activity patterns, likely influenced by factors such as resource availability, competition avoidance, and thermoregulation strategies. The results highlight the richness and complexity of avian temporal niches and emphasize the need for further research into their correlation with environmental factors. This study contributes to a deeper understanding of niche separation within Passeriformes and expands our knowledge of avian behavioral ecology.
ARTICLE | doi:10.20944/preprints202304.0524.v1
Subject: Engineering, Automotive Engineering Keywords: Camera; Radar; Lidar; Automotive Engineering; Adverse Weather; Sensor Perception
Online: 18 April 2023 (12:40:48 CEST)
Vehicle safety promises to be one of the biggest benefits of Advanced Driver Assistance Systems (ADAS). Higher levels of automation remove the human driver from the chain of events that can lead to a crash. Sensors play an influential role in vehicle driving as well as in ADAS by helping the driver to watch the vehicle's surroundings for safe driving. Thus, the driving load of steering, accelerating and braking is drastically reduced for long-term driving. The baseline for the development of future intelligent vehicles relies even more on the fusion of data from surrounding sensors such as Camera, Lidar and Radar. These sensors not only need to perceive in clear weather but also need to detect accurately under adverse weather and illumination conditions; otherwise, a small error could have an incalculable impact on ADAS. Most current studies are based on indoor or static testing. To address this problem, this paper designs a series of dynamic test cases with the help of outdoor rain and intelligent lighting simulation facilities to make the sensor application scenarios more realistic. As a result, the effect of rainfall and illumination on sensor perception performance is investigated. As expected, the performance of all automotive sensors is degraded by adverse environmental factors, but their behaviour is not identical. Future work on sensor model development and sensor information fusion should therefore take this into account.
ARTICLE | doi:10.20944/preprints202209.0276.v1
Subject: Computer Science And Mathematics, Computer Science Keywords: Sensor fusion; Camera and LiDAR fusion; Odometry; Explainable AI
Online: 19 September 2022 (10:27:42 CEST)
Recent deep learning frameworks have drawn strong research interest for the application of ego-motion estimation, as they demonstrate superior results compared to geometric approaches. However, due to the lack of multimodal datasets, most of these studies have focused primarily on single-sensor-based estimation. To overcome this challenge, we collect a unique multimodal dataset named LboroAV2 using multiple sensors, including camera, Light Detection and Ranging (LiDAR), ultrasound, e-compass and rotary encoder. We also propose an end-to-end deep learning architecture for the fusion of RGB images and LiDAR laser scan data for the odometry application. The proposed method contains a convolutional encoder, a compressed representation and a recurrent neural network. Besides feature extraction and outlier rejection, the convolutional encoder produces a compressed representation that is used to visualise the network's learning process and to pass useful sequential information. The recurrent neural network uses this compressed sequential data to learn the relation between consecutive time steps. We use the LboroAV2 and KITTI VO datasets to experiment and evaluate our results. In addition to visualising the network's learning process, our approach gives superior results compared to other similar methods. The code for the proposed architecture is released on GitHub and is publicly accessible.
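A toy version of the described encoder-plus-recurrent architecture (layer sizes, channel counts and the single-channel LiDAR projection are illustrative assumptions, not the LboroAV2 network):

```python
import torch
import torch.nn as nn

class FusionOdometry(nn.Module):
    """Sketch of the described pipeline: a convolutional encoder
    compresses fused RGB + LiDAR input, and an LSTM relates consecutive
    time steps to regress 6-DoF motion. All sizes are illustrative."""
    def __init__(self, in_ch=4, feat_dim=256):    # 3 RGB + 1 LiDAR channel
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_ch, 32, 7, stride=2, padding=3), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, feat_dim))              # compressed representation
        self.rnn = nn.LSTM(feat_dim, 128, batch_first=True)
        self.head = nn.Linear(128, 6)              # translation + rotation

    def forward(self, seq):                        # seq: (B, T, C, H, W)
        b, t = seq.shape[:2]
        z = self.encoder(seq.flatten(0, 1)).view(b, t, -1)
        out, _ = self.rnn(z)
        return self.head(out)                      # per-step 6-DoF estimate

poses = FusionOdometry()(torch.randn(2, 5, 4, 64, 64))   # shape (2, 5, 6)
```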
ARTICLE | doi:10.20944/preprints201902.0265.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: computational imaging; lensless camera; CMOS image sensor; compressive sensing
Online: 28 February 2019 (07:13:35 CET)
A lensless camera is an ultra-thin computational-imaging system. Existing lensless cameras are based on the axial arrangement of an image sensor and a coding mask, and therefore, the back side of the image sensor cannot be captured. In this paper, we propose a lensless camera with a novel design that can capture the front and back sides simultaneously. The proposed camera is composed of multiple coded image sensors, which are complementary-metal-oxide-semiconductor~(CMOS) image sensors in which air holes are randomly made at some pixels by drilling processing. When the sensors are placed facing each other, the object-side sensor works as a coding mask and the other works as a sparsified image sensor. The captured image is a sparse coded image, which can be decoded computationally by using compressive-sensing-based image reconstruction. We verified the feasibility of the proposed lensless camera by simulations and experiments. The proposed thin lensless camera realizes super field-of-view imaging without lenses or coding masks, and therefore can be used for rich information sensing in confined spaces. This work also suggests a new direction in the design of CMOS image sensors in the era of computational imaging.
ARTICLE | doi:10.20944/preprints201808.0407.v1
Subject: Engineering, Control And Systems Engineering Keywords: angle estimation; microsoft kinect; single camera; markerless mocap system
Online: 23 August 2018 (05:55:56 CEST)
The use of motion capture has increased over the last decade in a varied spectrum of applications such as film special effects, game and robot control, rehabilitation systems, and animation. Current human motion capture techniques use markers, structured environments, and high-resolution cameras in dedicated settings. Because of its rapid movement, elbow angle estimation is regarded as the most difficult problem in human motion capture. In this paper, we take elbow angle estimation as our research subject and propose a novel, markerless and cost-effective solution that uses an RGB camera to estimate the elbow angle in real time using part affinity fields. We recruited five participants (height, 168 ± 8 cm; mass, 61 ± 17 kg) to perform a cup-to-mouth movement and simultaneously measured the angle with both the RGB camera and a Microsoft Kinect. The experimental results illustrate that the markerless and cost-effective RGB camera has median RMS errors of 3.06° and 0.95° in the sagittal and coronal planes, respectively, compared to the Microsoft Kinect.
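The elbow angle itself follows from three keypoints by elementary vector geometry, whichever pose estimator supplies them (the coordinates below are made up):

```python
import numpy as np

def elbow_angle(shoulder, elbow, wrist):
    """Elbow flexion angle (degrees) from three 2D keypoints, e.g. as
    returned by a part-affinity-field pose estimator."""
    u = np.asarray(shoulder, float) - np.asarray(elbow, float)
    v = np.asarray(wrist, float) - np.asarray(elbow, float)
    cos_a = u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12)
    return float(np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0))))

print(elbow_angle((120, 80), (140, 160), (200, 170)))  # one video frame
```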
ARTICLE | doi:10.20944/preprints201805.0414.v1
Subject: Engineering, Mechanical Engineering Keywords: auto-alignment; intelligent stereo camera; stereo film; three-dimensional
Online: 28 May 2018 (15:57:25 CEST)
This study presents the implementation of an instant preview and analysis system for intelligent stereo cameras (ISCs). A parameter-optimization prototype adopting the instant preview and analysis system of the ISCs achieves automatic alignment and obtains optimal stereo films by adjusting the gap and angle between the dual cameras. The instant preview and analysis system of the ISCs with parameter optimization can effectively enhance the quality of stereo films, reduce filming errors, and save retouching cost and time in harsh filming environments.
ARTICLE | doi:10.20944/preprints202308.1822.v1
Subject: Engineering, Mechanical Engineering Keywords: PIV; image acquisition; measurement; smartphone; high-speed camera; low-cost
Online: 29 August 2023 (03:48:21 CEST)
The study of velocimetry is important for characterizing and comprehending the effects of fluid flow, and the Particle Image Velocimetry (PIV) technique is one of the primary approaches for understanding the velocity vector field in a test section. Commercial PIV systems are expensive, with one of the main cost factors being high-speed camera equipment capable of capturing images at high frames per second (fps), rendering them impractical for many applications. This study proposes an evaluation of utilizing smartphones as image acquisition systems for PIV technique application. An experimental setup inspired by the known angular displacement of synthetic particles is proposed. A stepper motor rotates a plate containing an image of synthetic particles on its surface. The motion of the plate is captured by the smartphone camera, and the images are processed using PIVlab-MatLab® software. The use of two smartphones is assessed, with acquisition rates of either 240 fps or 960 fps and varying angular velocities. The results were satisfactory for velocities up to 0.7 m/s at an acquisition rate of 240 fps and up to 1.8 m/s at 960 fps, validating the use of smartphones as a cost-effective alternative for the PIV technique.
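A quick feasibility check underlying such fps limits: the particle displacement per frame must stay small enough for PIV cross-correlation, commonly below about a quarter of the interrogation window (the image scale below is an assumed value, not the study's):

```python
def pixel_displacement_per_frame(velocity_mps, fps, metres_per_pixel):
    """Particle displacement per frame in pixels; PIV cross-correlation
    typically needs this to stay below roughly a quarter of the
    interrogation window size."""
    return velocity_mps / fps / metres_per_pixel

# 1.8 m/s filmed at 960 fps with an assumed 0.1 mm/pixel scale:
print(pixel_displacement_per_frame(1.8, 960, 1e-4))   # ~18.8 px/frame
```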
ARTICLE | doi:10.20944/preprints202308.0850.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: Image recognition; Micro AR marker; Camera parameter control; Iterative recognition
Online: 11 August 2023 (02:53:04 CEST)
This paper presents a novel dynamic camera parameter control method for position and posture estimation of highly miniaturized AR markers (micro AR markers) using a low-cost general camera. The proposed method iteratively calculates the marker's position and posture until they converge to a specified accuracy, dynamically updating the camera's zoom, focus and other parameter values based on the detected marker's depth distance. For a 10 mm square micro AR marker, the proposed system demonstrated recognition accuracy better than ±1.0% for depth distance and 2.5° for posture angle, with a maximum recognition range of 1.0 m. In addition, the iterative calculation time was at most 0.7 seconds for the initial detection of the marker. These experimental results suggest that the proposed method and system can be applied to precise robotic handling of small objects at low cost.
ARTICLE | doi:10.20944/preprints202305.0923.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Injury prevention; FMS; depth camera; Gaussian Mixture Model; machine learning
Online: 12 May 2023 (10:21:23 CEST)
Background: Functional Movement Screening (FMS) allows for rapid assessment of an individual's physical activity level and timely detection of sports injury risk. However, traditional functional movement screening often requires on-site assessment by experts, which is time-consuming and prone to subjective bias. The study of automated functional movement screening has therefore become increasingly important. Methods: In this study, we propose an automated assessment method for FMS based on an improved Gaussian Mixture Model (GMM). First, minority samples are oversampled and movement features are manually extracted from the FMS dataset collected with two Azure Kinect depth sensors; then, a Gaussian mixture model is trained separately on the feature data of each score (1 point, 2 points, 3 points); finally, FMS assessment is conducted by maximum likelihood estimation. Results: The improved GMM has a higher scoring accuracy (improved GMM: 0.80) than other models (traditional GMM: 0.38; AdaBoost.M1: 0.70; naïve Bayes: 0.75), and the scoring results of the improved GMM show a high level of agreement with expert scoring (kappa = 0.67). Conclusions: The results show that the proposed method based on the improved Gaussian mixture model can effectively perform the FMS assessment task, and it is potentially feasible to use depth cameras for FMS assessment.
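A compact sketch of the per-score maximum-likelihood scheme described in the Methods (synthetic features and component counts are illustrative; the paper's oversampling and model improvements are not reproduced):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# One GMM per FMS score class (1, 2, 3 points), fit on that class's
# movement features; an unseen movement receives the score whose model
# assigns it the highest likelihood. The feature data here is synthetic.
rng = np.random.default_rng(0)
X = {s: rng.normal(loc=s, size=(100, 6)) for s in (1, 2, 3)}

models = {s: GaussianMixture(n_components=2, random_state=0).fit(X[s])
          for s in (1, 2, 3)}

def predict_score(features):
    """Maximum-likelihood FMS score for one feature vector."""
    feats = np.asarray(features, float).reshape(1, -1)
    return max(models, key=lambda s: models[s].score(feats))

print(predict_score(rng.normal(loc=2, size=6)))   # expected: 2
```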
ARTICLE | doi:10.20944/preprints201907.0158.v1
Subject: Biology And Life Sciences, Agricultural Science And Agronomy Keywords: Cunninghamia lanceolata; UAVs; hyperspectral camera; machine learning; random forests; XGBoost
Online: 11 July 2019 (11:41:33 CEST)
Accurate measurement of tree height and diameter at breast height (DBH) in forests to evaluate the growth rate of cultivars is still a significant challenge, even when using LiDAR and 3-D modeling. We propose an integrated pipeline methodology for measuring the biomass of different tree cultivars in plantation forests with high crown density that combines unmanned aerial vehicles (UAVs), hyperspectral image sensors, and machine-learning data processing algorithms. Using a plantation of Cunninghamia lanceolata, commonly known as Chinese fir, in Fujian, China, images were collected with a hyperspectral camera and orthorectified in HiSpectral Stitcher. Vegetation indices and modeling were processed in Python using the decision tree, random forest, support vector machine, and eXtreme Gradient Boosting (XGBoost) third-party libraries. Tree height and DBH of 2,880 samples were measured manually and clustered into three groups ("fast", "median", and "normal" growth), and 19 vegetation indices from 12,000 pixels were extracted as input features for the modeling. After modeling and cross-validation, the classifier generated by random forests had the best prediction accuracy compared to the other algorithms (75%). This framework can be applied to other tree species to support management and business decisions.
ARTICLE | doi:10.20944/preprints201711.0142.v1
Subject: Environmental And Earth Sciences, Remote Sensing Keywords: segmentation; multi-spectral camera; soil; tree; raster scanning; UAV application
Online: 22 November 2017 (05:37:02 CET)
The increased availability of high resolution remote sensor data for precision agriculture applications permits users to acquire deeper and more relevant knowledge about crop states, leading inevitably to better decisions. The algorithm libraries being developed and evolved around these applications rely on multi-spectral or hyper-spectral data acquired using manned or unmanned platforms. The current state of the art makes thorough use of vegetation indices to guide the operational management of agricultural land plots. One of the most challenging sub-problems is to correctly identify and separate crop from soil. Thresholding techniques based on the Normalized Difference Vegetation Index (NDVI) or other similar metrics have the advantage of being simple, easy-to-read transformations of the data packed with useful information. Obvious difficulties arise when crop/tree and soil have similar spectral responses, as in the case of grass-filled areas in vineyards. In this case grass and canopy are close in terms of NDVI values and thresholding techniques will generally fail. Radiometric approaches could be integrated or replaced by a geometric approach based on terrain data such as Digital Surface Models (DSMs). These models are one of the outputs of the orthorectification engines usually present in data acquired using unmanned platforms. In this paper we present two approaches based on DSMs that are able to segment crop/tree from soil over sloping terrain. The DSM data are processed through a two-dimensional data slicing or reduction technique. Each slice is separately processed as a one-dimensional time series to derive the terrain and tree structures separately, here interpreted as object probability densities. In particular, the first approach is a Cartesian grid rasterization (CARSCAN) of the terrain and the second is its immediate generalisation, a radial grid rasterization of the DSM model (FANSCAN). FANSCAN recovers information from the original image at greater frequencies on the Fourier plane. These approaches enable the identification of crop/tree from soil in the case of slopes or hilly terrain without any constraint on the displacement/direction of plant/tree rows. The proposed algorithm uses pure DSM information, even though it is possible to fuse its output with other classifiers.
ARTICLE | doi:10.20944/preprints201704.0169.v1
Subject: Engineering, Bioengineering Keywords: thermopile sensor; actimetry; thermal camera; data classification; tele-medicine; polysomnography
Online: 26 April 2017 (12:27:38 CEST)
This paper addresses the development of a new technique in the sleep analysis domain. Sleep is defined as a periodic physiological state during which vigilance is suspended and reactivity to external stimulation is diminished. We sleep on average between six and nine hours per night, and our sleep is composed of four to six cycles of about 90 minutes each. Each of these cycles is composed of a succession of several stages of sleep of varying depth. The analysis of sleep is usually done using polysomnography. This examination consists of recording, among other things, electrical cerebral activity by electroencephalography (EEG), ocular movements by electrooculography (EOG) and chin muscle tone by electromyography (EMG). The recording is mostly done in a hospital, more specifically in a service for monitoring pathologies related to sleep. The readings are then interpreted manually by an expert to generate a hypnogram, a curve showing the succession of sleep stages during the night in 30-second epochs. The proposed method is based on following a thermal signature, which makes it possible to classify activity into three classes: "awakening", "calm sleep" and "agitated sleep". This non-invasive method contributes to the screening of sleep disorders, to be validated by a more complete analysis of sleep. The measure provided by this new system, based on temperature monitoring (patient and ambient), is intended to be integrated into the tele-medicine platform developed within the framework of the Smart-EEG project by the SYEL - SYstèmes ELectroniques team. Analysis of the data collected during the first surveys carried out with this method showed a correlation between thermal signature and activity during sleep. The advantage of this method lies in its simplicity and in the possibility of measuring activity during sleep without direct contact with the patient, at home or in hospital.
ARTICLE | doi:10.20944/preprints202309.0092.v1
Subject: Biology And Life Sciences, Food Science And Technology Keywords: Dietary intake assessment; Wearable camera; Food; Nutrients; Portion size; Nutritional analysis
Online: 1 September 2023 (16:36:47 CEST)
Background: Accurate estimation of dietary intake is challenging, and whilst some progress has been made in high-income countries, low- and middle-income countries (LMICs) lag behind, contributing to critical nutritional data gaps. This study aimed to validate an objective, passive, image-based dietary intake assessment method against weighed food records in London, UK, for onward deployment to LMICs. Methods: Wearable camera devices were used to capture the food intake of eating occasions in 18 adults and 17 children of Ghanaian and Kenyan origin living in London. Participants were provided with pre-weighed meals of Ghanaian and Kenyan cuisine and camera devices to automatically capture images of the eating occasions. Food images were assessed for portion size, energy, and nutrient intake, and the relative validity of the method was assessed against the weighed food records. Results: Pearson and intra-class correlation coefficients of estimated intakes of food, energy and 19 nutrients ranged from 0.60 to 0.95 and 0.67 to 0.90, respectively. Bland-Altman analysis showed good agreement between the image-based method and the weighed food records. Under-estimation of dietary intake by the image-based method ranged from 4 to 23%. Conclusions: Passive food image capture and analysis provides an objective assessment of dietary intake comparable to weighed food records.
ARTICLE | doi:10.20944/preprints202111.0258.v1
Subject: Biology And Life Sciences, Animal Science, Veterinary Science And Zoology Keywords: feral cat; Felis catus; Australia; Indigenous Protected Area; 1080; camera; tracking
Online: 15 November 2021 (11:57:03 CET)
Feral cats are difficult to manage and even harder to monitor. We analysed the cost-efficacy of monitoring the pre- and post-bait abundance of feral cats via camera traps or track counts using four years of data from the Matuwa Indigenous Protected Area. Additionally, we report on the recovery of the feral cat population and the efficacy of subsequent Eradicat® aerial baiting programs following 12 months of intensive feral cat control in 2019 that consisted of aerial baiting and leg-hold trapping. Significantly fewer cats were captured in 2020 (n = 8) than in 2019 (n = 126). Pre-baiting surveys for 2020 and 2021 suggested that the population of feral cats on Matuwa was very low, at 5.5 and 4.4 cats/100 km respectively, which is well below our target threshold of 10 cats/100 km. Post-baiting surveys then recorded 3.6 and 3.0 cats/100 km respectively, which still equates to a 35% and 32% reduction in cat activity. Track counts recorded significantly more feral cats than camera traps and were cheaper to implement. We recommend that at least two methods of monitoring cats be implemented to prevent erroneous conclusions.
ARTICLE | doi:10.20944/preprints201710.0077.v1
Subject: Engineering, Other Keywords: camera calibration; multi-cameras system; ray tracing; glass checkerboard; bundle adjustment
Online: 12 October 2017 (04:55:15 CEST)
Multi-camera systems are widely applied in 3D computer vision, especially when multiple cameras are distributed on both sides of the measured object. The calibration methods of multi-camera systems are critical to the accuracy of vision measurement, and the key is to find an appropriate calibration target. In this paper, a high-precision camera calibration method for multi-camera systems based on a transparent glass checkerboard and ray tracing is described, which is used to calibrate multiple cameras distributed on both sides of the glass checkerboard. Firstly, the intrinsic parameters of each camera are obtained by Zhang's calibration method. Then, the cameras capture several images from the front and back of the glass checkerboard in different orientations, with all images containing distinct grid corners. As the cameras on one side are not affected by the refraction of the glass checkerboard, their extrinsic parameters can be directly calculated. However, the cameras on the other side are influenced by the refraction of the glass checkerboard, and direct use of the projection model produces calibration error. A multi-camera calibration method using a refractive projection model and ray tracing is developed to eliminate this error. Furthermore, both synthetic and real data are employed to validate the proposed approach. The experimental results of refractive calibration show that the error of the 3D reconstruction is smaller than 0.2 mm, the relative errors of both rotation and translation are less than 0.014%, and the mean and standard deviation of the reprojection error of the four-camera system are 0.00007 and 0.4543 pixels. The proposed method is flexible, highly accurate, and simple to carry out.
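The refraction correction rests on tracing camera rays through the glass with Snell's law in vector form; a minimal sketch (the refractive index and geometry are illustrative, not the paper's calibration code):

```python
import numpy as np

def refract(d, n, n1, n2):
    """Snell's law in vector form: refract unit ray direction d at a
    surface with unit normal n, going from refractive index n1 to n2."""
    d, n = d / np.linalg.norm(d), n / np.linalg.norm(n)
    cos_i = -d @ n
    r = n1 / n2
    k = 1.0 - r * r * (1.0 - cos_i * cos_i)
    if k < 0:
        return None                    # total internal reflection
    return r * d + (r * cos_i - np.sqrt(k)) * n

# A camera ray crossing a glass checkerboard (air -> glass -> air):
d0 = np.array([0.3, 0.0, 1.0]) / np.linalg.norm([0.3, 0.0, 1.0])
n = np.array([0.0, 0.0, -1.0])         # plate normal facing the camera
d1 = refract(d0, n, 1.0, 1.52)         # entering the glass
d2 = refract(d1, n, 1.52, 1.0)         # exiting: parallel to d0, offset
```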
ARTICLE | doi:10.20944/preprints201701.0077.v1
Subject: Engineering, Control And Systems Engineering Keywords: leaf area index; smartphone camera sensor; conifer forest; canopy gap fraction
Online: 17 January 2017 (09:59:36 CET)
Plant leaf area index (LAI) is a key characteristic affecting field canopy microclimate. In addition to traditional professional measuring instruments, smartphone camera sensors have been used to measure plant LAI. However, when smartphone methods were used to measure conifer forest LAI, very different performances were obtained depending on whether the smartphone was held at the zenith angle or at a 57.5° angle. To validate further the potential of smartphone sensors for measuring conifer LAI and to find the limits of this method, this paper reports the results of a comparison of two smartphone methods with an LAI-2000 instrument. It is shown that both methods can be used to reveal the conifer leaf-growing trajectory. However, the method with the phone oriented vertically upwards always produced better consistency in magnitude with LAI-2000. The bias of the LAI between the smartphone method and the LAI-2000 instrument was explained with regard to four aspects that can affect LAI: gap fraction, leaf projection ratio, sensor field of view (FOV), and viewing zenith angle (VZA). It was concluded that large FOV and large VZA cause the 57.5° method to overestimate the gap fraction and hence underestimate conifer LAI, especially when tree height is greater than 2.0 m. For the vertically upward method, the bias caused by the overestimated gap fraction is compensated for by an underestimated leaf projection ratio.
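Both smartphone methods ultimately invert the same Beer-Lambert gap-fraction relation; a sketch of that inversion (the spherical leaf-angle assumption G ≈ 0.5 is what makes the 57.5° angle attractive):

```python
import numpy as np

def lai_from_gap_fraction(gap_fraction, theta_deg=57.5, G=0.5):
    """Invert Beer's law, P(theta) = exp(-G * LAI / cos(theta)), to get
    LAI from the measured canopy gap fraction. Near 57.5 deg the leaf
    projection ratio G is close to 0.5 for any leaf angle distribution;
    for the vertical method, G = 0.5 assumes a spherical distribution."""
    theta = np.radians(theta_deg)
    return -np.cos(theta) * np.log(gap_fraction) / G

print(lai_from_gap_fraction(0.12))                  # 57.5-degree method
print(lai_from_gap_fraction(0.25, theta_deg=0.0))   # vertically upward
```

This is why an overestimated gap fraction directly drives the LAI underestimation the abstract describes.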
ARTICLE | doi:10.20944/preprints202310.0518.v1
Subject: Physical Sciences, Radiation And Radiography Keywords: Compton camera; wide-energy range; Monte Carlo simulation; scintillation detector; image reconstruction
Online: 9 October 2023 (11:08:24 CEST)
(1) Background: The imaging energy range of a typical Compton camera is limited because scattered gamma photons are seldom fully absorbed when the incident energies are above 3 MeV. Further improving the upper energy limit of gamma-ray imaging has important application significance in active interrogation of special nuclear materials and chemical warfare agents, as well as range verification of proton therapy; (2) Methods: To realize gamma-ray imaging in a wide energy range of 0.3–7 MeV, a principle prototype, named a portable three-layer Compton camera, was developed using scintillation detectors that consist of a silicon photomultiplier array coupled with a Gd3Al2Ga3O12:Ce pixelated scintillator array. Implemented in a list-mode maximum likelihood expectation maximization algorithm, a far-field energy-domain imaging method based on two-interaction and three-interaction events was applied to estimate the initial energy and spatial distribution of gamma-ray sources. The simulation model of the detectors was established with the Monte Carlo simulation toolkit Geant4. The reconstructed images of 133Ba, 137Cs and 60Co point-like sources were successfully obtained with our prototype in laboratory tests and compared with simulation studies; (3) Results: The proportion of effective imaging events accounted for about 2%, which allowed our prototype to reconstruct the distribution of a 0.05 μSv/h 137Cs source in 10 seconds. The angular resolution for resolving two 137Cs point-like sources was 15°. Additional simulated imaging of the 6.13 MeV gamma-rays from 14.1 MeV neutron scattering with water preliminarily demonstrated the imaging capability at high incident energies; (4) Conclusions: We conclude that the prototype has good imaging performance over a wide energy range (0.3–7 MeV), showing potential in several MeV gamma-ray imaging applications.
Subject: Computer Science And Mathematics, Information Systems Keywords: Area Estimation; Crowd Management; COVID-19; Edge Camera; Interpersonal Distance; Social Distancing
Online: 1 October 2021 (15:37:26 CEST)
For public safety and physical security, more than a billion closed-circuit television (CCTV) cameras are currently deployed around the world. The proliferation of artificial intelligence (AI) and machine learning (ML) technologies has enabled significant applications, including crowd surveillance. State-of-the-art distance and area estimation algorithms need either multiple cameras or a reference scale as ground truth; obtaining an estimate using a single camera without a scale reference remains an open question. In this paper, we propose a novel solution called E-SEC, which estimates the interpersonal distance between a pair of dynamic human objects, the area occupied by a dynamic crowd, and crowd density using a single edge camera. The E-SEC framework comprises edge CCTV cameras responsible for capturing a crowd on video frames, leveraging a customized YOLOv3 model for human detection. E-SEC contributes an interpersonal distance estimation algorithm vital for monitoring the social distancing of a crowd, and an area estimation algorithm for dynamically determining the area occupied by a crowd with changing size and position. A unified output module generates the crowd size, interpersonal distances, social distancing violations, area, and density for every frame. Experimental results validate the accuracy and efficiency of E-SEC on a range of different video datasets.
ARTICLE | doi:10.20944/preprints202203.0085.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: object segmentation; LiDAR-camera fusion; autonomous driving; artificial intelligence; semi-supervised learning; iseAuto
Online: 4 March 2022 (21:43:06 CET)
Object segmentation is still considered a challenging problem in autonomous driving, particularly under real-world conditions. Following this line of research, this paper approaches the problem of object segmentation using LiDAR-camera fusion and semi-supervised learning implemented in a fully convolutional neural network. Our method is tested on real-world data acquired with our custom vehicle, the iseAuto shuttle. The data include all-weather scenarios, featuring night and rainy weather. In this work, it is shown that with LiDAR-camera fusion, only a few annotated scenarios, and semi-supervised learning, it is possible to achieve robust performance on real-world data in a multi-class object segmentation problem. The performance of our algorithm is measured in terms of intersection over union, precision, recall, and area-under-the-curve average precision. Our network achieves 82% IoU in vehicle detection in day-fair scenarios and 64% IoU in vehicle segmentation in night-rain scenarios.
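The headline IoU figures are computed per class in the usual way (a generic metric sketch, not the paper's evaluation code):

```python
import numpy as np

def iou(pred_mask, gt_mask):
    """Intersection over union for one class's boolean segmentation masks."""
    pred, gt = pred_mask.astype(bool), gt_mask.astype(bool)
    union = np.logical_or(pred, gt).sum()
    return np.logical_and(pred, gt).sum() / union if union else 1.0

# e.g. iou(vehicle_channel_prediction > 0.5, vehicle_ground_truth)
```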
ARTICLE | doi:10.20944/preprints202112.0206.v1
Subject: Engineering, Control And Systems Engineering Keywords: Motion capture camera; robotic total station; autonomous vehicle; 6 DoF pose estimation; accuracy
Online: 13 December 2021 (13:30:53 CET)
To validate the accuracy and reliability of onboard sensors for object detection and localization in driver assistance, as well as autonomous driving applications under realistic conditions (indoors and outdoors), a novel tracking system is presented. This tracking system is developed to determine the position and orientation of a slow-moving vehicle (e.g. car during parking maneuvers), independent of the onboard sensors, during test maneuvers within a reference environment. One requirement is a 6 degree of freedom (DoF) pose with a position uncertainty below 5 mm (3σ), an orientation uncertainty below 0.3° (3σ) at a frequency higher than 20 Hz, and a latency smaller than 500 ms. To compare the results from the reference system with the vehicle’s onboard system, a synchronization via Precision Time Protocol (PTP) and a system interoperability to Robot Operating System (ROS) is implemented. The developed system combines motion capture cameras mounted in a 360° panorama view set-up on the vehicle with robotic total stations. A point cloud of the test site serves as a digital twin of the environment, in which the movement of the vehicle is simulated. Results have shown that the fused measurements of these sensors complement each other, so that the accuracy requirements for the 6 DoF pose can be met, while allowing a flexible installation in different environments.
ARTICLE | doi:10.20944/preprints202112.0352.v1
Subject: Engineering, Control And Systems Engineering Keywords: Ball-Plate System; STEM; USB HD camera; Python scripts; ready-made functions; PID controller
Online: 22 December 2021 (11:20:19 CET)
This paper presents the process of designing, fabricating, assembling, programming and optimizing a prototype of a nonlinear mechatronic Ball-Plate System (BPS) as a laboratory platform for STEM engineering education. Due to the nonlinearity and complexity of the BPS, the task presents challenging issues, such as: 1) difficulty in stabilizing a given position point, known as steady-state error; 2) position resolution, known as specific distance error; and 3) adverse environmental effects (light shadow error), also discussed in this paper. The laboratory BPS prototype for education was designed, manufactured and installed at the Karlovac University of Applied Sciences in the Department of Mechanical Engineering, Study of Mechatronics. The low-cost, two-degree-of-freedom BPS uses a USB HD camera for computer vision as feedback and two DC servomotors as actuators. To address the control problems, an advanced block diagram of the control system is proposed and discussed. An open-source control system based on Python scripts, which allows the use of ready-made library functions, supports changing the color of the ball and the parameters of the PID controller, thus simplifying both the control system and the underlying mathematical calculation. The authors will continue their research on this BPS mechatronic platform and its control algorithms.
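A minimal discrete PID loop of the kind such a Python-scripted BPS platform runs (gains, output limits and the 30 Hz vision rate are illustrative, not the platform's tuned values):

```python
class PID:
    """Discrete PID controller for one plate axis: the error is the
    ball's pixel offset from the set point as seen by the USB camera,
    and the output is the servo tilt command."""
    def __init__(self, kp, ki, kd, out_limit=10.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.out_limit = out_limit
        self.integral = 0.0
        self.prev_error = None

    def update(self, error, dt):
        self.integral += error * dt
        deriv = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        out = self.kp * error + self.ki * self.integral + self.kd * deriv
        return max(-self.out_limit, min(self.out_limit, out))  # servo range

pid_x = PID(kp=0.6, ki=0.05, kd=0.25)
# in the vision loop: tilt_x = pid_x.update(ball_x - target_x, dt=1/30)
```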
REVIEW | doi:10.20944/preprints202102.0459.v1
Subject: Engineering, Automotive Engineering Keywords: autonomous vehicles; self-driving cars; perception; camera; lidar; radar; sensor fusion; calibration; obstacle detection
Online: 22 February 2021 (11:31:02 CET)
The market for autonomous vehicles (AVs) is expected to experience significant growth over the coming decades and to revolutionize the future of transportation and mobility. The AV is a vehicle that is capable of perceiving its environment and performing driving tasks safely and efficiently with little or no human intervention, and it is anticipated to eventually replace conventional vehicles. Self-driving vehicles employ various sensors to sense and perceive their surroundings and also rely on advances in 5G communication technology to achieve this objective. Sensors are fundamental to the perception of surroundings, and the development of sensor technologies associated with AVs has advanced at a significant pace in recent years. Despite remarkable advancements, sensors can still fail to operate as required due to, for example, hardware defects, noise and environmental conditions. Hence, it is not desirable to rely on a single sensor for any autonomous driving task. The practical approach shown in recent research is to incorporate multiple, complementary sensors to overcome the shortcomings of individual sensors operating independently. This article reviews the technical performance and capabilities of sensors applicable to autonomous vehicles, mainly focusing on vision cameras, LiDAR and Radar sensors. The review also considers the compatibility of sensors with various software systems enabling the multi-sensor fusion approach for obstacle detection. This review article concludes by highlighting some of the challenges and possible future research directions.
ARTICLE | doi:10.20944/preprints201807.0490.v1
Subject: Engineering, Control And Systems Engineering Keywords: TV; monitor; projector distortion; distortion correction; calibration; depth camera; Unmanned Aerial Vehicles; UAVs; display
Online: 25 July 2018 (15:36:21 CEST)
TVs and monitors are among the most widely used displays in various environments. However, they are limited by their physical display conditions, such as a fixed size/position and a rigid/flat surface. In this paper, we suggest a new "Display In the Wild" (DIW) concept to overcome these problems. Our proposed DIW system allows us to display a flexibly large screen on dynamic non-planar surfaces at an arbitrary display position. To implement our DIW concept practically, we choose a projector as the hardware configuration in order to generate a screen anywhere with different sizes. However, distortion occurs when the projector displays content on a surface that is dynamic and/or non-planar. Therefore, we propose a distortion correction method for DIW to overcome the aforementioned surface constraints. Since projectors are not capture devices, we propose using a depth camera to determine the distortions on the surfaces quickly. We also propose DIW-specific calibration and fast/precise correction methods. Our calibration method is constructed to detect the projection surface easily and quickly, and it also allows our proposed system to accommodate intrinsic parameters such as display resolution and field of view. We accomplish a fast undistortion process for the projector by considering only surface boundary pixels, which enables our method to run in real time. In our comprehensive experiments, the proposed DIW system generates undistorted screens like those of TVs and monitors on dynamic non-planar surfaces at arbitrary display positions with Unmanned Aerial Vehicles (UAVs), in a fast and accurate manner, demonstrating its usefulness in practical DIW scenarios.
ARTICLE | doi:10.20944/preprints202305.2250.v1
Subject: Medicine And Pharmacology, Veterinary Medicine Keywords: respiration; automatic monitoring; rodent; rat; animal welfare; refinement; 3R; laboratory animals; camera-based monitoring; breathing
Online: 31 May 2023 (12:56:46 CEST)
Animal research has always been crucial for various medical and scientific breakthroughs, providing information on disease mechanisms, genetic predisposition to diseases, and pharmacological treatment. However, the use of animals in medical research is a source of great controversy and ongoing debate in modern science. To ensure a high level of bioethics, new guidelines have been adopted to replace animal testing wherever possible, reduce the number of animals per experiment, and refine procedures to minimize stress and pain. Supporting these guidelines, this article proposes a novel approach for unobtrusive, continuous, and automated monitoring of the respiratory rate of laboratory rats. It uses the cyclical expansion and contraction of the rats' thorax/abdominal region to determine this physiological parameter. In contrast to previous work, the focus is on unconstrained animals, which requires the algorithms to be especially robust to motion artifacts. To test the feasibility of the proposed approach, video material of multiple rats was recorded and evaluated. High agreement was obtained between RGB imaging and the reference method (respiratory rate derived from electrocardiography), reflected in a relative error of 5.46%. The current work shows that camera-based technologies are promising and relevant alternatives for monitoring the respiratory rate of unconstrained rats, contributing to the development of new alternatives for continuous and objective assessment of animal welfare and thereby guiding the way to modern and bioethical research.
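As a rough sketch of the signal path such camera-based methods share (not the authors' exact pipeline), one can average pixel intensities over the thorax/abdominal region per frame, band-pass filter around plausible rat breathing frequencies, and read off the dominant spectral peak. The region coordinates and filter band here are assumptions:

```python
import numpy as np
from scipy.signal import butter, filtfilt

def respiratory_rate(frames, fps, roi, band=(0.5, 3.5)):
    """Estimate breaths per minute from video of the thorax region.

    frames: iterable of grayscale frames (2D numpy arrays)
    roi:    (y0, y1, x0, x1) bounds of the thorax/abdominal region
    band:   plausible breathing band in Hz (assumption, not from the paper)
    """
    y0, y1, x0, x1 = roi
    signal = np.array([f[y0:y1, x0:x1].mean() for f in frames])
    signal -= signal.mean()

    # Band-pass to suppress lighting drift and motion artefacts.
    b, a = butter(3, [band[0] / (fps / 2), band[1] / (fps / 2)], btype="band")
    filtered = filtfilt(b, a, signal)

    # Dominant frequency of the filtered signal -> breaths per minute.
    spectrum = np.abs(np.fft.rfft(filtered))
    freqs = np.fft.rfftfreq(len(filtered), d=1.0 / fps)
    return 60.0 * freqs[np.argmax(spectrum)]
```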
ARTICLE | doi:10.20944/preprints202305.2180.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: Sensor fusion; object detection; deep learning; autonomous vehicles; camera-radar; adverse weather; fog; attention module
Online: 31 May 2023 (07:25:27 CEST)
Autonomous vehicles (AVs) suffer reduced maneuverability and performance due to the degradation of sensor performance in fog. Such degradation causes significant object detection errors in AVs' safety-critical conditions. For instance, YOLOv5 performs well under favorable weather but suffers missed detections and false positives due to atmospheric scattering caused by fog particles. Existing deep object detection techniques often exhibit a high degree of accuracy but are sluggish at object detection in fog, while methods with fast detection speed achieve it at the expense of accuracy; the lack of balance between detection speed and accuracy in fog persists. This paper presents an improved YOLOv5-based multi-sensor fusion network that combines radar object detection with camera image bounding boxes. We transformed the radar detections by mapping them into two-dimensional image coordinates and projected the resultant radar image onto the camera image. Using an attention mechanism, we emphasized and improved the important feature representations used for object detection while reducing high-level feature information loss. We trained and tested our multi-sensor fusion network on clear and multi-fog weather datasets obtained from the CARLA simulator. Our results show that the proposed method significantly enhances the detection of distant and small objects. Our small CR-YOLOnet model best strikes a balance between accuracy and speed, with an accuracy of 0.849 at 69 fps.
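The radar-to-image mapping described here is, in its simplest form, a pinhole projection of 3D radar returns through the camera's extrinsic and intrinsic matrices; the calibration values below are placeholders, not the paper's:

```python
import numpy as np

# Hypothetical calibration: camera intrinsics K and the rigid transform
# (R, t) from the radar frame to the camera frame.
K = np.array([[800.0, 0.0, 640.0],
              [0.0, 800.0, 360.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.0, 0.2, 0.1])   # radar mounted slightly off the camera

def project_radar_to_image(points_radar):
    """Project Nx3 radar detections (x, y, z in metres) to pixel coords."""
    points_cam = points_radar @ R.T + t
    points_cam = points_cam[points_cam[:, 2] > 0]  # keep points in front
    uvw = points_cam @ K.T
    return uvw[:, :2] / uvw[:, 2:3]                # perspective divide

pixels = project_radar_to_image(np.array([[2.0, 0.5, 20.0],
                                          [-1.5, 0.3, 35.0]]))
print(pixels)   # (u, v) positions where radar returns land on the image
```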
ARTICLE | doi:10.20944/preprints202011.0192.v1
Subject: Environmental And Earth Sciences, Atmospheric Science And Meteorology Keywords: Total cloud cover; all-sky camera; algorithms assessment; neural networks; machine learning; data-driven approach
Online: 4 November 2020 (10:55:10 CET)
Total cloud cover (TCC) retrieval from ground-based optical imagery is a problem that has been tackled by several generations of researchers. The number of human-designed algorithms for the estimation of TCC grows every year. However, there has not been much progress in terms of quality, mostly due to the lack of a systematic approach to the design of the algorithms, to the assessment of their generalization ability, and to the assessment of TCC retrieval quality. In this study, we discuss the optimization nature of data-driven schemes for TCC retrieval. In order to compare the algorithms, we propose a framework for the assessment of their characteristics. We present several new algorithms based on deep learning techniques: a model for outlier filtering and a few models for TCC retrieval from all-sky imagery. For the training and assessment of the data-driven algorithms of this study, we present the Dataset of All-Sky Imagery over the Ocean (DASIO), containing over one million all-sky optical images of the visible sky dome taken in various regions of the World Ocean. The research campaigns that contributed to the DASIO collection took place in the Atlantic Ocean, the Indian Ocean, the Red and Mediterranean Seas, and the Arctic Ocean. The optical imagery collected during these missions is accompanied by standard meteorological observations of cloudiness characteristics made by experienced observers. We assess the generalization ability of the presented models in several scenarios that differ in the regions selected for the training and validation subsets. As a result, we demonstrate that our models based on convolutional neural networks deliver superior quality compared to all previously published schemes. As a key result, we demonstrate a considerable drop in the ability to generalize beyond the training data in the case of a strong covariate shift between the training and validation subsets of imagery, which may take place in the case of region-aware subsampling.
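To make the data-driven formulation concrete, here is a minimal sketch (not the authors' architecture) of a CNN that regresses TCC in oktas (0-8) from an all-sky image, written in PyTorch:

```python
import torch
import torch.nn as nn

class TinyTCCNet(nn.Module):
    """Minimal CNN regressing total cloud cover in oktas from an RGB image."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, 1)

    def forward(self, x):
        z = self.features(x).flatten(1)
        # Sigmoid squashes to [0, 1]; scale to the 0-8 okta range.
        return 8.0 * torch.sigmoid(self.head(z))

model = TinyTCCNet()
dummy = torch.randn(4, 3, 224, 224)     # batch of all-sky images
print(model(dummy).shape)               # -> torch.Size([4, 1])
```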
ARTICLE | doi:10.20944/preprints202009.0583.v1
Subject: Computer Science And Mathematics, Information Systems Keywords: high-speed camera; crack propagation velocity; image sequence analysis; crack analysis; material testing; deformation measurement
Online: 24 September 2020 (12:19:52 CEST)
The determination of crack propagation velocities can provide valuable information for a better understanding of the damage processes of concrete. The spatio-temporal analysis of crack patterns developing at a speed of several hundred meters per second is a rather challenging task. In this paper, a photogrammetric procedure for the determination of crack propagation velocities in concrete specimens using high-speed camera image sequences is presented. A cascaded image sequence processing scheme has been developed, which starts with the computation of displacement vector fields for a dense pattern of points on the specimen's surface between consecutive time steps of the image sequence. These surface points are triangulated into a mesh, and, as representations of cracks, discontinuities in the displacement vector fields are found by a deformation analysis applied to all triangles of the mesh. Connected components of the deformed triangles are computed using region-growing techniques. Then, the crack tips are determined using principal component analysis. The tips are tracked through the image sequence, and the velocities between the time stamps of the images are derived. A major advantage of this method compared to established techniques is that it allows for spatio-temporally resolved, full-field measurements rather than point-wise measurements, and that information on crack width can be obtained simultaneously. To validate the approach, the authors processed image sequences of tests on four compact-tension specimens performed on a split-Hopkinson tension bar. The images were taken by a high-speed camera at a frame rate of 160,000 images per second. By applying the developed image sequence processing procedure to these datasets, crack propagation velocities of about 800 m/s were determined with a precision in the order of 50 m/s.
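The final velocity step reduces to simple kinematics once the tip has been tracked: displacement between consecutive frames, scaled from pixels to metres, divided by the inter-frame time. A sketch with assumed values (the 160,000 fps frame rate is from the paper; the pixel scale is hypothetical):

```python
import numpy as np

FPS = 160_000            # high-speed camera frame rate (from the paper)
M_PER_PIXEL = 1.0e-4     # assumed image scale: 0.1 mm per pixel

def tip_velocities(tip_positions_px):
    """Crack-tip speed between consecutive frames.

    tip_positions_px: Nx2 array of tracked tip coordinates (pixels).
    Returns an array of N-1 speeds in m/s.
    """
    deltas = np.diff(tip_positions_px, axis=0)          # pixel displacements
    dist_m = np.linalg.norm(deltas, axis=1) * M_PER_PIXEL
    return dist_m * FPS                                 # divide by dt = 1/FPS

# Tip advancing ~50 px per frame at 0.1 mm/px -> roughly 800 m/s.
track = np.array([[100.0, 200.0], [150.0, 201.0], [200.5, 203.0]])
print(tip_velocities(track))
```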
ARTICLE | doi:10.20944/preprints201812.0284.v1
Subject: Engineering, Industrial And Manufacturing Engineering Keywords: Reverse Engineering, RealSense D415, depth camera, device characterization, VDI/VDE normative, active stereo, performance comparison
Online: 24 December 2018 (15:11:39 CET)
Low-cost RGB-D cameras are increasingly used in several research fields, including human-machine interaction, safety, robotics, biomedical engineering, and even Reverse Engineering applications. Among the plethora of commercial devices, the Intel RealSense cameras proved to be among the most suitable, providing a good compromise between cost, ease of use, compactness, and precision. Released on the market in January 2018, the new Intel RealSense D415 has a wide acquisition range (i.e., ~160-10000 mm) and a narrow field of view to capture objects in rapid motion. Given the unexplored potential of this new device, especially when used as a 3D scanner, the present work aims to characterize and provide metrological considerations on the RealSense D415. In particular, tests are carried out to assess the device's performance in the near range (i.e., 100-1000 mm). Characterization is performed by integrating the guidelines of the existing standard (i.e., the German VDI/VDE 2634 Part 2 normative) with a number of literature-based strategies. The performance analysis is finally compared against the latest close-range sensors, thus providing useful guidance for researchers and practitioners aiming to use RGB-D cameras in Reverse Engineering applications.
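For readers wanting to reproduce basic near-range measurements, a minimal depth capture with the official pyrealsense2 wrapper looks like the following; the stream settings are one valid D415 configuration, not the paper's test protocol:

```python
import pyrealsense2 as rs

pipeline = rs.pipeline()
config = rs.config()
# One valid D415 depth mode; the paper's exact settings may differ.
config.enable_stream(rs.stream.depth, 1280, 720, rs.format.z16, 30)
pipeline.start(config)

try:
    frames = pipeline.wait_for_frames()
    depth = frames.get_depth_frame()
    # Distance in metres at the image centre.
    d = depth.get_distance(640, 360)
    print(f"centre depth: {d:.3f} m")
finally:
    pipeline.stop()
```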
ARTICLE | doi:10.20944/preprints202303.0340.v1
Subject: Computer Science And Mathematics, Logic Keywords: SLAM, localization, mapping, mobile mapping system, spherical camera, panoramic image, LiDAR, IMU, sensor fusion, pose graph.
Online: 20 March 2023 (04:00:38 CET)
The Mobile Mapping System (MMS) plays a crucial role in generating high-precision 3D maps for various applications. However, a traditional MMS that uses tilted LiDAR (light detection and ranging) has limitations in capturing complete information about the environment. To overcome these limitations, we propose a panoramic vision-aided Cartographer simultaneous localization and mapping (SLAM) system for MMS, named "PVL-Cartographer". The proposed system integrates multiple sensors to achieve accurate and robust localization and mapping. It contains two sub-systems: early fusion and middle fusion. In the early fusion, range maps are created from LiDAR points in a panoramic image space, facilitating the incorporation of visual features. The SLAM system works with visual features both with and without augmented ranges. In the middle fusion, a pose graph combines camera and LiDAR nodes, with IMU (Inertial Measurement Unit) data providing constraints between the nodes. Extensive experiments in challenging outdoor scenarios demonstrate the effectiveness of the proposed SLAM system in producing accurate results, even in conditions with limited features. Overall, our proposed PVL-Cartographer system offers a robust and accurate solution for MMS localization and mapping.
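The middle-fusion idea, a pose graph whose nodes come from the camera and LiDAR front-ends and whose edges encode IMU-derived constraints, can be sketched in 2D as follows (toy values, not the PVL-Cartographer implementation):

```python
import numpy as np

# Nodes are (x, y, theta) poses; edges are relative-motion constraints,
# e.g. IMU-derived odometry between consecutive nodes.
poses = {0: np.array([0.0, 0.0, 0.0]),
         1: np.array([1.0, 0.1, 0.05]),
         2: np.array([2.1, 0.2, 0.08])}

# (from_node, to_node, measured relative pose) constraints.
edges = [(0, 1, np.array([1.0, 0.0, 0.05])),
         (1, 2, np.array([1.0, 0.0, 0.03]))]

def relative_pose(a, b):
    """Pose of b expressed in the frame of a."""
    dx, dy = b[:2] - a[:2]
    c, s = np.cos(a[2]), np.sin(a[2])
    return np.array([c * dx + s * dy, -s * dx + c * dy, b[2] - a[2]])

def total_residual(poses, edges):
    """Sum of squared differences between predicted and measured motion;
    a pose-graph optimizer adjusts the poses to minimize this."""
    return sum(np.sum((relative_pose(poses[i], poses[j]) - z) ** 2)
               for i, j, z in edges)

print(total_residual(poses, edges))
```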
ARTICLE | doi:10.20944/preprints202106.0037.v1
Subject: Computer Science And Mathematics, Robotics Keywords: robot navigation; computer vision; camera calibration; mapping; path planning; communication; NAO robot; educational innovation; higher education
Online: 1 June 2021 (14:49:11 CEST)
Maze navigation using one or more robots has become a recurring challenge in the scientific literature and in real-life practice, with fleets having to find faster and better ways to navigate environments such as travel hubs (e.g., airports) or to evacuate a disaster zone. Many methods have been used to solve this issue, including the implementation of a variety of sensors and other signal-receiving systems. Most interestingly, camera-based techniques have become increasingly popular in this kind of application, given their robustness and scalability. In this paper, we implement an end-to-end strategy to address this scenario, allowing a robot to solve a maze autonomously by using computer vision and path planning. In addition, this robot shares the generated knowledge with another robot by means of communication protocols, and the second robot must adapt the solution to its own mechanical characteristics to solve the same challenge. The paper presents experimental validation of the four components of this solution, namely camera calibration, maze mapping, path planning, and robot communication. Finally, we present the integration and functionality of these methods applied in a pair of NAO robots.
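Of the four components, path planning is the most compact to illustrate; a breadth-first search over a grid representation of the mapped maze (a common choice, though the paper does not commit to this exact algorithm) returns a shortest path in steps:

```python
from collections import deque

def bfs_path(grid, start, goal):
    """Shortest path on a 4-connected occupancy grid (0 = free, 1 = wall)."""
    rows, cols = len(grid), len(grid[0])
    parents = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:          # walk parents back to start
                path.append(cell)
                cell = parents[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r+1, c), (r-1, c), (r, c+1), (r, c-1)):
            if 0 <= nr < rows and 0 <= nc < cols \
                    and grid[nr][nc] == 0 and (nr, nc) not in parents:
                parents[(nr, nc)] = cell
                queue.append((nr, nc))
    return None                              # maze has no route

maze = [[0, 1, 0, 0],
        [0, 1, 0, 1],
        [0, 0, 0, 1],
        [1, 1, 0, 0]]
print(bfs_path(maze, (0, 0), (3, 3)))
```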
ARTICLE | doi:10.20944/preprints202312.0492.v1
Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: Smart Cane; Jetson Nano (B01); 2D LiDAR; RGB-D Camera; Laser SLAM; Target Recognition; Cartographer; Improved Yolov5
Online: 7 December 2023 (07:52:26 CET)
In this paper, an intelligent blind-guide system based on 2D LiDAR and RGB-D camera sensing is proposed, and the system is mounted on a smart cane. The intelligent guide system relies on a 2D LiDAR, an RGB-D camera, an IMU, GPS, a Jetson Nano B01, an STM32, and other hardware. The main advantage of the proposed system is that the distance between the smart cane and obstacles can be measured by the 2D LiDAR using the Cartographer algorithm, thus achieving simultaneous localization and mapping (SLAM). At the same time, through the improved YOLOv5 algorithm, pedestrians, vehicles, pedestrian crosswalks, traffic lights, warning posts, stone piers, tactile paving, and other objects in front of the visually impaired can be quickly and effectively identified. Laser SLAM and improved YOLOv5 obstacle identification tests were carried out inside a teaching building on the campus of Hainan Normal University and on a pedestrian crossing on Longkun South Road in Haikou City, Hainan Province. The results show that the developed intelligent guide system can drive the wheels and omnidirectional wheels at the bottom of the smart cane, giving the cane a self-leading guide function like a "guide dog" that can effectively lead the visually impaired around obstacles to a predetermined destination while quickly and effectively identifying obstacles along the way. The laser SLAM speed of this system is 25-31 FPS, which enables short-distance obstacle avoidance and navigation in both indoor and outdoor environments. The improved YOLOv5 identifies 86 types of objects; the recognition rates for pedestrian crosswalks and vehicles are 84.6% and 71.8%, respectively, the overall recognition rate for the 86 object types is 61.2%, and the obstacle recognition speed of the intelligent guide system is 25-26 FPS.
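The authors' improved YOLOv5 weights are not public, but the stock YOLOv5 detector that it extends can be run in a few lines via torch.hub, which makes the obstacle-recognition side of such a pipeline easy to prototype:

```python
import torch

# Stock YOLOv5s from the ultralytics hub; the paper's improved variant
# would substitute its own weights here.
model = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True)

results = model("street_scene.jpg")      # path, URL, or numpy image
results.print()                          # class counts and inference time

# Detections as an Nx6 tensor: x1, y1, x2, y2, confidence, class id.
for *box, conf, cls in results.xyxy[0].tolist():
    print(model.names[int(cls)], f"{conf:.2f}", [round(v) for v in box])
```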
ARTICLE | doi:10.20944/preprints202304.1021.v1
Subject: Environmental And Earth Sciences, Remote Sensing Keywords: camera calibration; registration; normalized Zernike moments; corresponding point matching; essential matrix; relative orientation; absolute orientation; point cloud coloring
Online: 27 April 2023 (03:54:51 CEST)
With the continuous development of 3D city modeling, traditional close-range photogrammetry is limited by complex processing procedures and incomplete 3D depth information, making it unable to meet high-precision modeling requirements. In contrast, the integration of LiDAR and a camera in a mobile measurement system provides a new and highly effective solution. LiDAR can quickly and accurately acquire the 3D spatial coordinates of target objects, while optical imagery contains abundant color information. If the two are integrated, they can play an important role in multiple fields such as streetscape modeling, archaeology, and digital cities. Currently, integrated mobile measurement systems commonly require cameras, lasers, POS and IMU units, so the hardware cost is relatively high and the system integration is complex. Therefore, in this paper we propose a simple ground mobile measurement system composed of a LiDAR and a GoPro camera without a POS system, providing a more convenient and reliable way to automatically obtain 3D point cloud data with spectral information. The automatic point cloud coloring based on video images mainly includes four aspects: (1) Establishing models for radial and tangential distortion to correct the video images. (2) Establishing a registration method based on normalized Zernike moments to obtain the exterior orientation elements. Normalized Zernike moments are region-based shape descriptors that can reflect the features of images in multiple dimensions, even for low-quality video images. The results show that registration based on normalized Zernike moments provides a good result, with an accuracy of 0.5-1 pixel, far better than registration based on a collinearity equation. (3) Establishing relative orientation of adjacent video images based on essential matrix decomposition and nonlinear optimization. This involves uniformly using the SURF algorithm with a distance restriction and RANSAC to select corresponding points, which improves their reliability. The results indicate that the accuracy of the relative orientation method is high; moreover, the method converges to good results even for stereo image pairs with large rotation angles and displacements. Therefore, relative orientation based on essential matrix decomposition and nonlinear optimization has good applicability. (4) The video imagery suffers from significant motion blur and boundary distortion; therefore, a point cloud coloring method based on a Gaussian distribution with a central-region restriction is adopted. Only pixels within the central region are considered valid for coloring, and the point cloud is then colored based on the mean of the Gaussian distribution of the color set. Experimental results show that the coloring accuracy between the video imagery and the point cloud data is high, meeting the accuracy requirements of applications such as tunnel detection, street-view modeling, and 3D urban modeling.
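Step (3), relative orientation from the essential matrix, maps directly onto standard OpenCV calls. A minimal sketch assuming matched point arrays and a known camera matrix (the toy data below are synthetic; in practice the matches would come from the SURF-plus-RANSAC step the paper describes):

```python
import cv2
import numpy as np

def relative_orientation(pts1, pts2, K):
    """Recover relative rotation R and translation direction t between two
    views from matched image points (Nx2 float arrays) and intrinsics K."""
    E, _ = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC,
                                prob=0.999, threshold=1.0)
    # Decompose E and keep the (R, t) pair with positive-depth support;
    # t is known only up to scale, as in any monocular relative orientation.
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K)
    return R, t

# Toy matched points standing in for consecutive GoPro frames.
pts1 = (np.random.rand(50, 2) * 1000).astype(np.float64)
pts2 = pts1 + np.array([5.0, 0.0])              # toy horizontal shift
K = np.array([[900.0, 0.0, 960.0],
              [0.0, 900.0, 540.0],
              [0.0, 0.0, 1.0]])
R, t = relative_orientation(pts1, pts2, K)
print(R, t.ravel())
```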
ARTICLE | doi:10.20944/preprints202101.0255.v1
Subject: Environmental And Earth Sciences, Atmospheric Science And Meteorology Keywords: UAV; Parrot Sequoia multispectral camera; photosynthetic pigments; Norway spruce; forest; linear models; ground truth; needle age; crown detection
Online: 13 January 2021 (14:52:06 CET)
Remote sensing is one of the modern methods that have developed significantly over the last two decades and nowadays provides a new means for forest monitoring. High spatial and temporal resolutions are demanded for accurate and timely monitoring of forests. In this study, multi-spectral Unmanned Aerial Vehicle (UAV) images were used to estimate canopy parameters (delineation of crown extent, top, and height, as well as photosynthetic pigment contents). The UAV images in the Green, Red, Red-Edge, and NIR bands were acquired by a Parrot Sequoia camera over selected sites in two small catchments (Czech Republic) covered predominantly by Norway spruce monocultures. Individual tree extents, together with tree tops and heights, were derived from the Canopy Height Model (CHM). In addition, the following were tested: i) to what extent a linear relationship can be established between selected vegetation indices (NDVI and NDVIred-edge) derived for individual trees and the corresponding ground truth (e.g., biochemically assessed needle photosynthetic pigment contents), and ii) whether needle-age selection as ground truth and crown light conditions affect the validity of the linear models. The results of the statistical analysis show that the two vegetation indices tested here (NDVI and NDVIred-edge) have the potential to assess photosynthetic pigments in Norway spruce forests at a semi-quantitative level; however, needle-age selection as ground truth proved to be a very important factor. Usable results were obtained for the linear models only when the second-year needle pigment contents were used as ground truth. On the other hand, the illumination conditions of the crown proved to have very little effect on the models' validity. No previous study was found with which to directly compare these results on coniferous forest stands. This shows a further need for studies dealing with quantitative estimation of the biochemical variables of natural coniferous forests from spectral data acquired by UAV platforms at very high spatial resolution.
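Both indices reduce to simple band arithmetic on the Sequoia's four bands; a sketch with hypothetical per-crown mean reflectances:

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from reflectance values."""
    return (nir - red) / (nir + red)

def ndvi_red_edge(nir, red_edge):
    """Red-edge variant: substitutes the red-edge band for red, making the
    index more sensitive to chlorophyll in dense canopies."""
    return (nir - red_edge) / (nir + red_edge)

# Hypothetical per-crown mean reflectances from the Sequoia bands.
print(f"NDVI          = {ndvi(0.45, 0.05):.3f}")            # ~0.800
print(f"NDVI red-edge = {ndvi_red_edge(0.45, 0.20):.3f}")   # ~0.385
```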
ARTICLE | doi:10.20944/preprints202310.0239.v1
Subject: Engineering, Automotive Engineering Keywords: Driver state monitoring system; Multi-sensor data fusion; CNN Model; ToF(Time of Flight) camera; Hand on detection; EEG; ECG
Online: 5 October 2023 (07:34:02 CEST)
A stable driver state is essential during the manual transition of control that inevitably occurs in Level 3 automated driving. To this end, this paper proposes a CNN-based driver state monitoring system that uses multi-sensor data such as the driver's face image, biometric information, and vehicle behavior information as input. The system calculates the probability of drowsiness for each of four time periods using a convolutional neural network (CNN) fed with ToF camera-based eye blinking, ECG information (pulse rate) embedded in the steering wheel, and vehicle information (steering angle data). In order to build a reliable, high-quality training dataset (ground truth) for the CNN algorithm, a baseline was established by matching the driver's face image with electrocardiogram (ECG) and electroencephalogram (EEG) changes in the drowsy and normal states. In a simulation test of the proposed CNN algorithm using more than 20,000 driver images acquired with a driving simulator, the TNR was 94.8% and the accuracy was 94.2%. Our proposed method is expected to minimize human errors that may occur when switching control by monitoring inappropriate driver states (drowsiness) in real time.
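A minimal PyTorch sketch of the kind of classifier described, with layer sizes and window lengths that are assumptions rather than the paper's architecture: stack blink, pulse, and steering-angle channels over a time window and emit a drowsiness probability per period.

```python
import torch
import torch.nn as nn

class DrowsinessCNN(nn.Module):
    """Toy 1D CNN over stacked sensor channels (blink rate, pulse,
    steering angle) sampled over a time window; sizes are assumptions."""
    def __init__(self, channels=3, window=128, periods=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(channels, 16, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.head = nn.Linear(32, periods)   # one drowsiness logit per period

    def forward(self, x):                    # x: (batch, channels, window)
        return torch.sigmoid(self.head(self.net(x).flatten(1)))

model = DrowsinessCNN()
batch = torch.randn(8, 3, 128)
print(model(batch).shape)                    # -> torch.Size([8, 4])
```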
ARTICLE | doi:10.20944/preprints201608.0186.v1
Subject: Computer Science And Mathematics, Geometry And Topology Keywords: active vision; the conformal camera; the Riemann sphere; Möbius geometry; complex projective geometry; projective Fourier transform; retinotopy; binocular vision; horopter
Online: 20 August 2016 (11:24:25 CEST)
Primate vision is an active process that constructs a stable internal representation of the 3D world based on 2D sensory inputs that are inherently unstable due to incessant eye movements. We present here a mathematical framework for processing visual information for a biologically-mediated active vision stereo system with asymmetric conformal cameras. This model utilizes the geometric analysis on the Riemann sphere developed in the group-theoretic framework of the conformal camera, thus far only applicable in modeling monocular vision. The asymmetric conformal camera model constructed here includes the fovea’s asymmetric displacement on the retina and the eye’s natural crystalline lens tilt and decentration, as observed in ophthalmological diagnostics. We extend the group-theoretic framework underlying the conformal camera to the stereo system with asymmetric conformal cameras. Our numerical simulation shows that the theoretical horopter curves in this stereo system are conics that well approximate the empirical longitudinal horopters of the primate vision system.
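For readers unfamiliar with the group-theoretic setting, the conformal camera's image transformations live among the Möbius transformations of the Riemann sphere; in standard notation (background material, not the paper's asymmetric extension):

```latex
% Möbius (fractional linear) transformation acting on the Riemann sphere
% \hat{\mathbb{C}} = \mathbb{C} \cup \{\infty\}, the geometric setting of
% the conformal camera's projective image transformations.
\[
  w = \frac{a z + b}{c z + d},
  \qquad
  \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}(2,\mathbb{C}),
  \quad ad - bc = 1 .
\]
```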
ARTICLE | doi:10.20944/preprints202308.0322.v1
Subject: Engineering, Other Keywords: object‐level SLAM; RBPF‐SLAM; shape‐based pose estimation; undelayed initialization; IMU/camera fusion; tightly coupled; coarse‐to‐fine pose estimation
Online: 3 August 2023 (10:33:25 CEST)
Object-level Simultaneous Localization and Mapping (SLAM) has gained popularity in recent years since it can provide a means for intelligent robot-to-environment interactions. However, most of these methods assume that the distribution of the errors is Gaussian. This assumption is not valid under many circumstances. Further, these methods use a delayed initialization of the objects in the map. During this delayed period, the solution relies on the motion model provided by an Inertial Measurement Unit (IMU). Unfortunately, the errors tend to accumulate quickly due to the dead-reckoning nature of these motion models. Finally, current solutions depend on a set of salient features on the object's surface rather than the object's shape. This research proposes an accurate object-level solution to the SLAM problem with a 4.1 to 13.1 cm error in position (0.005 to 0.021 of the total path). The developed solution is based on Rao-Blackwellized Particle Filtering (RBPF), which does not assume any predefined error distribution for the parameters. Further, the solution relies on shape and thus can be used for objects that lack surface texture. Finally, the developed tightly coupled IMU/camera solution is based on an undelayed initialization of the objects in the map.
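The particle-filter machinery at the core of an RBPF can be sketched generically: propagate particles with the IMU motion model, weight them by an observation likelihood that need not be Gaussian, and resample. A toy 1D version (not the authors' full object-level formulation):

```python
import numpy as np

rng = np.random.default_rng(0)
particles = rng.normal(0.0, 0.5, 500)      # initial 1D pose hypotheses

def pf_step(particles, imu_delta, z_obs):
    """One predict-weight-resample cycle of a basic particle filter."""
    # Predict: dead-reckoned IMU motion plus process noise.
    particles = particles + imu_delta + rng.normal(0.0, 0.1, particles.size)
    # Weight: the likelihood need not be Gaussian; a heavy-tailed
    # Cauchy-like weighting is used here to echo the paper's point.
    weights = 1.0 / (1.0 + ((z_obs - particles) / 0.2) ** 2)
    weights /= weights.sum()
    # Resample in proportion to weight.
    return rng.choice(particles, size=particles.size, p=weights)

particles = pf_step(particles, imu_delta=1.0, z_obs=1.05)
print(particles.mean())                    # posterior position estimate
```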
ARTICLE | doi:10.20944/preprints202004.0032.v1
Subject: Environmental And Earth Sciences, Remote Sensing Keywords: indoor positioning system; image-based positioning system; computer vision; SIFT; feature detection; feature description; cell phone camera; PnP problem; projection matrix; epipolar geometry; OpenCV
Online: 3 April 2020 (11:59:48 CEST)
As people grow accustomed to effortless outdoor navigation, there is a rising demand for a similar possibility indoors as well. Unfortunately, indoor localization, one of the necessary requirements for navigation, continues to be a problem without a clear solution. In this article we propose a method for an indoor positioning system using a single image. This is made possible using a small preprocessed database of images with known control points as the only preprocessing needed. Using feature detection with the SIFT algorithm, we can search the database and find the image most similar to the one taken by the user. The pair of images is then used to find the coordinates of the database image by solving the PnP problem. Furthermore, the projection and essential matrices are determined, allowing for user image localization, i.e., determining the position of the user in the indoor environment. The benefit of this approach lies in a single image being the only input required from the user and in there being no need for new on-site infrastructure, thus enabling a simpler realization for building management.
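The two core steps, SIFT matching against a database image and camera pose from known control points, are both available in OpenCV. A condensed sketch, under the assumption that the 3D control points of the matched database image and their pixel positions in the user image are already known:

```python
import cv2
import numpy as np

def locate_user(user_img, db_img, control_3d, control_2d, K):
    """Sketch of the pipeline: confirm the closest database image via SIFT
    matching, then recover the user camera pose by solving PnP.

    control_3d: known 3D control points of the database image (Nx3, N >= 4)
    control_2d: their pixel positions in the user image (Nx2), assumed
                already transferred through the feature matches
    """
    sift = cv2.SIFT_create()
    _, des1 = sift.detectAndCompute(user_img, None)
    _, des2 = sift.detectAndCompute(db_img, None)

    # Lowe ratio test keeps only distinctive matches.
    matcher = cv2.BFMatcher()
    good = [m for m, n in matcher.knnMatch(des1, des2, k=2)
            if m.distance < 0.75 * n.distance]
    print(f"{len(good)} good SIFT matches against this database image")

    ok, rvec, tvec = cv2.solvePnP(control_3d.astype(np.float64),
                                  control_2d.astype(np.float64), K, None)
    R, _ = cv2.Rodrigues(rvec)
    return -R.T @ tvec            # user camera centre in world coordinates
```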