Search | Preprints.org

Early-season estimation of the area under winter wheat, a strategic crop, is important for decision makers. Classification of multi-temporal images is one approach, but it is affected by many factors: training sample size, the frequency and timing of image acquisition, the type of vegetation indices (VIs), the temporal gradient of spectral bands and VIs, the choice of classifier, and missing values caused by cloud cover. This paper addresses the impact of acquisition frequency and timing, VI type, and the spectral and VI gradients on the random forest (RF) classifier when multi-temporal images contain missing values. To investigate the appropriate temporal resolution for image acquisition, the study area was selected in the overlap between two LDCM paths. In our method, the missing values of cloudy bands for each pixel are retrieved as the mean of the k-nearest ordinary pixels. Multi-temporal image analysis is then performed under different scenarios provided by decision makers, defined by the crop types that should be extracted early in the season in the study areas. The classification results obtained by RF decrease by 1.6% when temporally missing values are retrieved by the proposed method, which is an acceptable result. Moreover, the experimental results demonstrated that if the temporal resolution of Landsat 8 were increased to one week, the classification task could be conducted earlier with somewhat better results in terms of OA and kappa. Incorporating VIs along with the temporal gradients of spectral bands and VIs as new features in RF improved the OA and kappa by 3.1% and 6.6%, respectively. Furthermore, the results showed that if only one image from the seasonal changes of crops is available, the temporal gradients of VIs and spectral bands play the main role in discriminating wheat from barley. The experiments also demonstrated that if wheat and barley are merged into a single class, the crop area can be estimated two months earlier with 97.1 and 93.5 in terms of OA and kappa, respectively.
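
The gap-filling step described in this abstract (replacing a cloudy value with the mean of the k-nearest ordinary pixels) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the function name `fill_missing_by_knn_mean`, the use of `None` to mark cloudy values, and the choice of Euclidean distance over the bands that are still observed are all assumptions, since the abstract does not say how "nearest" is defined.

```python
import math


def fill_missing_by_knn_mean(pixels, k=3):
    """Fill None entries with the mean of the k nearest cloud-free pixels.

    pixels: list of equal-length lists of band values; None marks a cloudy value.
    Assumption: "nearest" means Euclidean distance over the bands that the
    incomplete pixel still has; the abstract does not specify the metric.
    """
    complete = [p for p in pixels if None not in p]
    if not complete:
        raise ValueError("need at least one cloud-free pixel")

    filled = []
    for p in pixels:
        missing = [i for i, v in enumerate(p) if v is None]
        if not missing:
            filled.append(list(p))
            continue
        observed = [i for i, v in enumerate(p) if v is not None]
        # Rank the cloud-free pixels by distance over the observed bands only.
        neighbours = sorted(
            complete,
            key=lambda q: math.sqrt(sum((p[i] - q[i]) ** 2 for i in observed)),
        )[:k]
        new_p = list(p)
        for i in missing:
            # Replace the cloudy value with the neighbours' mean for that band.
            new_p[i] = sum(q[i] for q in neighbours) / len(neighbours)
        filled.append(new_p)
    return filled


# Hypothetical reflectance values; the last pixel has a cloudy second band.
example = [
    [0.10, 0.30, 0.50],
    [0.12, 0.32, 0.54],
    [0.80, 0.70, 0.20],
    [0.11, None, 0.52],
]
print(fill_missing_by_knn_mean(example, k=2)[3])
```

With `k=2` the two spectrally similar pixels are chosen as neighbours, so the filled value is the mean of their second band; the dissimilar third pixel does not contribute.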

Working Paper ARTICLE

Fault Diagnosis of Diesel Engine Valve Clearance Based on Variational Mode Decomposition and Random Forest

Nanyang Zhao, Zhiwei Mao, Donghai Wei, Haipeng Zhao, Jinjie Zhang, Zhinong Jiang

Subject: Engineering, Mechanical Engineering Keywords: diesel engine; fault diagnosis; variational mode decomposition; random forest; feature extraction

Online: 25 December 2019 (11:13:13 CET)

Preprint ARTICLE | doi:10.20944/preprints202201.0138.v1

Random Forest Based Power Sustainability and Cost Optimization in Smart Grid

Danalakshmi D, Łukasz Wróblewski, Sheela A, A. Hariharasudan, Mariusz Urbański

Subject: Business, Economics And Management, Business And Management Keywords: Smart Grid; Random Forest; Internet of Things; Power management; Machine Learning; Smart Meter; Priority Power Scheduling.

Online: 11 January 2022 (13:01:08 CET)

Preprint ARTICLE | doi:10.20944/preprints202102.0318.v3

Androgen Receptor Binding Category Prediction with Deep Neural Networks and Structure-, Ligand-, and Statistically-Based Features

Alfonso T. García-Sosa

Subject: Medicine And Pharmacology, Immunology And Allergy Keywords: Machine Learning; Artificial Intelligence; Androgen Receptor; Random Forest; Deep Neural Network; Convolutional

Online: 24 February 2021 (13:14:01 CET)

Preprint ARTICLE | doi:10.20944/preprints202210.0190.v1

Estimation of Above Ground Volume of Mangrove Forest Trees from Terrestrial LiDAR Data using Supervised Machine Learning Algorithms

Yeshwanth Adimoolam, Nithin D. Pillai, Gnanappazham Lakshmanan, Deepak Mishra, Vinay Kumar Dadhwal

Subject: Environmental And Earth Sciences, Remote Sensing Keywords: Above-ground biomass; mangroves; pneumatophores; terrestrial LiDAR; machine learning; random forest

Online: 13 October 2022 (08:14:07 CEST)

Accurately quantifying the above-ground volume (AGV) and thus above-ground biomass (AGB) of forest stands is an important aspect in the conservation of mangrove ecosystem owing to their ecological and economic benefits. However, the number of studies focusing on quantifying mangrove forests’ biomass has been relatively low due to their marshy terrain, making exploratory studies challenging. In recent times, the use of LiDAR technologies in forest inventory studies has become increasingly popular, due to the reliability of LiDAR as a highly accurate means of 3D spatial data acquisition. In this study, we propose an end-to-end methodology for estimating AGV of mangrove forest stands from terrestrial LiDAR data. Many of the recent studies on this topic effectively employ machine learning algorithms such as multi layer perceptron, random forests, etc. for filtering foliage in the point cloud data of single trees. This study further extends that approach by incorporating the impact of class imbalance of forest point cloud data in a weighted random forest classifier. For the task of segmentation of wood/foliage points in a single tree point cloud, this approach yielded an average increase of 2.737% in the balanced accuracy score, 0.007 in the Cohen’s kappa score, 2.745% in the ROC AUC score and 0.857% in the F1 score. For the task of AGV estimation of a single tree, this approach resulted in an average coefficient of determination of 0.93 with respect to the ground truth volumes. For the task of counting pneumatophores in a plot-level point cloud, the proposed breadth-first searching method yielded an average coefficient of determination of 0.9391. Also, the machine learning classifier and geometric features used in this study were invariant to tree species and hence could be generalised for the classification of point clouds of other tree species as well. 
Finally, a breadth-first graph-search segmentation based approach is also proposed as part of this pipeline to estimate the contribution of pneumatophores to the AGB of mangrove forest stands. Since pneumatophores are a special adaptation of mangrove forests for gaseous exchange in marshy environments, this study aims to incorporate the detection and AGB estimation of pneumatophores in the inventory of mangrove forest stands. Studying the contribution of pneumatophores to the AGB of mangrove forest plots could also aid future mangrove forest inventory studies in modeling the underlying root network and estimating the below-ground biomass of mangrove trees.

Preprint ARTICLE | doi:10.20944/preprints202407.1477.v1

Research on Students’ Utilization of Artificial Intelligence Based on Random Forest Model and PCA-K-means Algorithm

Can Liu, Siyi Xu, Zixiang Wang, Yuxuan Chen, Lu Chao, Yang Lin, Hao Yan

Subject: Computer Science And Mathematics, Computer Science Keywords: Artificial intelligence; Random forest model; PCA algorithm; K-means algorithm

Online: 18 July 2024 (08:48:13 CEST)

Preprint ARTICLE | doi:10.20944/preprints202407.1684.v1

A Regularized Tree Forest for Classification in the Presence of Extreme Class Imbalance

Samir K. Safi, Sheema Gul

Subject: Computer Science And Mathematics, Probability And Statistics Keywords: machine learning; optimal tree ensemble classifier; random forest; support vector machine; artificial neural network

Online: 22 July 2024 (07:13:53 CEST)

Preprint ARTICLE | doi:10.20944/preprints202308.1209.v1

Integrated Learning Activity Prediction Model of BHO-AdaBoosting Anti-Breast Cancer ERα Inhibitor Based on Improved Random Forest

Yanxuan Du, Zhengjie Xu, Jiaxin Huang, Chengxuan Lyu, Cunhao Lu, Jian Chen

Subject: Medicine And Pharmacology, Medicine And Pharmacology Keywords: Breast cancer; Activity prediction; Random forest; Feature selection; Bayesian hyperparameter optimization; AdaBoosting

Online: 17 August 2023 (03:57:54 CEST)

Breast cancer is the most common malignancy in women worldwide. The pathogenesis of this disease is closely related to the estrogen receptor alpha subtype (ERα). Therefore, it is of great importance to develop effective inhibitors of ERα activity for the treatment of breast cancer. In this paper, we propose a novel ensemble machine learning model for quantitative structure-activity relationship of anti-breast cancer drugs, which can effectively predict drug activity in small samples with multiple characteristic variables. To avoid the problem of over-fitting caused by low-correlation independent variables, the scoring mechanism of random forest was improved by incorporating three relevance indicators, including the maximum mutual information number, Pearson correlation coefficient and distance correlation coefficient, and 20 optimal molecular descriptors were selected. The Bayesian hyperparameter optimization method was used to optimize the parameters of multiple linear regression (MLR), support vector regression (SVR), and extreme gradient boosting (XGBoost), respectively. The AdaBoost strong learner was constructed by combining the weak learner with the weighted linear addition method. The results show that the proposed ensemble learning model has the best prediction performance compared to the three basic learner models and the CNN-LSTM combination prediction model. The root mean square error was reduced by 7.60%-26.51%. The mean relative error was reduced by 6.46%-30.92%. Goodness of fit increased by 9.57%-36.94%. Finally, the biological activities of 50 candidate compounds for ERα inhibitors were predicted, and it was found that 4-[2-benzyl-1-[4-(2-pyrrolidin-1-ylethoxy)phenyl]but-1-enyl]phenol had an excellent biological activity value pIC50, which had the potential to be an ERα inhibitor. The model proposed in this paper has good prediction accuracy, which can provide an effective reference for the discovery and development of anti-breast cancer drugs.

Preprint ARTICLE | doi:10.20944/preprints202108.0024.v1

Analyzing the Contribution of Road Traffic to Changes of Air Pollutants Using Random Forest: Insights from COVID-19 lockdown in Wuhan

Jiansheng Wu, Yun Qian, Yuan Wang, Na Wang

Subject: Environmental And Earth Sciences, Atmospheric Science And Meteorology Keywords: Nitrogen Dioxide (NO2); Random Forest; Contribution Rate; Air pollution; COVID-19 lockdown

Online: 2 August 2021 (11:54:10 CEST)

Preprint ARTICLE | doi:10.20944/preprints202403.0213.v2

The Impact of SMOTE and ADASYN on Random Forest and Advanced Gradient Boosting Techniques in Telecom Customer Churn Prediction

Mehdi Imani, Zahra Ghaderpour, Majid Joudaki, Ali Beikmohammadi

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: customer churn prediction; machine learning; classification techniques; SMOTE; ADASYN; Random Forest; XGBoost; LightGBM; CatBoost

Online: 10 April 2024 (07:57:46 CEST)

Preprint ARTICLE | doi:10.20944/preprints202307.1252.v1

Prediction of Au-Polymetallic Deposits Based on Spatial Multi-Layer Information Fusion by Random Forest Model in the Central Kunlun Area of Xinjiang, China

Yuepeng Zhang, Xiaofeng Ye, Shuyun Xie, Jianbiao Dong, Xuwei Zhou, Xiaoying Zhou

Subject: Environmental And Earth Sciences, Geophysics And Geology Keywords: spatial multi-information fusion; random forest; metallogenic prediction; Central Kunlun; Xinjiang

Online: 19 July 2023 (03:08:11 CEST)

In recent years, combining intelligent prospecting algorithms such as random forest with large volumes of geological and mineral data for quantitative prediction in exploration geochemistry has become an important topic, with the aim of improving the accuracy of target delineation. The ore-forming geological conditions in the central Kunlun area of Xinjiang are favorable, and the area has good prospecting potential. However, owing to the exhaustion of shallow deposits and the lag in geological prospecting work over the past ten years, the expected breakthrough in the search for large and super-large metal deposits has not occurred for many years, and there is a serious shortage of reserve resources. The use of new theories, methods and technologies for mineral resource investigation and evaluation has become an urgent need in current prospecting work. In view of this, based on the existing spatial database of geological and mineral resources in the central Kunlun of Xinjiang, and combined with the geological characteristics, genesis and metallogenic regularity of the area, this paper carries out a series of studies on gold polymetallic minerals with the help of a geographic information system and a data science programming platform. The researchers integrated geological and regional geochemical data and constructed random forest metallogenic discriminant models based on two different sampling methods (integrated random undersampling and selection of training samples) to predict the mineralization of gold polymetallic minerals in the central Kunlun area of Xinjiang and delineate the metallogenic target area. The quantitative predictions of gold polymetallic mineral resources in the central Kunlun area of Xinjiang by the two random forest models are compared and discussed: the known ore spots, fault structures and geochemical information are extracted, and the known gold polymetallic ore spots and geochemical data are used to form a training set and a prediction set for constructing the random forest model. The prediction evaluation and metallogenic prospect division show that, of the two sampling methods, the model built from selected training samples has higher prediction accuracy during training and is more reliable because it can fully learn the complex information in the original data. In metallogenic prospect prediction and potential division, the random forest model built from selected training samples has more reference value and greater significance for further exploration when actual exploration cost is considered, because its high-potential prediction area is small and its proportion of ore-bearing spots per unit area is high. This study also improves the prediction accuracy, reduces the exploration risk, and expands the application of machine learning algorithms in mathematical geology to prospecting in the central Kunlun area of Xinjiang. The delineated metallogenic potential area provides positive guidance for actual gold polymetallic prospecting work in this area.

Preprint ARTICLE | doi:10.20944/preprints202307.0841.v1

A New Land Cover Map of Two Watersheds under Long-Term Environmental Monitoring in the Swedish Arctic Using Random Forest Classification of Sentinel-2 Data

Yves Auda, Erik J. Lundin, Jonas Gustafsson, Oleg S. Pokrovsky, Simon Cazaurang, Laurent Orgogozo

Subject: Environmental And Earth Sciences, Remote Sensing Keywords: land cover; sentinel-2 images; random forest; boreal forest; alpine tundra

Online: 12 July 2023 (13:39:19 CEST)

Preprint ARTICLE | doi:10.20944/preprints202306.1742.v1

Experimental Analysis and Machine Learning of Ground Vibrations by Elevated High-Speed Railway Based on Random Forest and Bayesian Optimization

Yanmei CAO, Boyang Li, Qi Xiang, Yuxian Zhang

Subject: Engineering, Civil Engineering Keywords: Machine learning; ground vibration; on-site experiment; random forest; Bayesian optimization; elevated high-speed railway

Online: 26 June 2023 (05:11:33 CEST)

Preprint ARTICLE | doi:10.20944/preprints202306.0123.v1

Generalizability of a Random Forest-Based Model of Maize Lodging Built with Satellite Image Data and Its Application to Monitoring and Evaluating Maize Lodging Risks

Huirong Guo, Bo Ming, Chenwei Nie, Guoqiang Zhang, Hongye Yang, Shang Gao, Beibei Xue, Jiangfeng Xin, Dayun Feng, Biao Jia, Peng Hou, Jun Xue, Ruizhi Xie, Keru Wang, Shaokun Li

Subject: Biology And Life Sciences, Agricultural Science And Agronomy Keywords: Sentinel-2 multispectral data; Maize lodging; Random Forest classification; Predictive variables; Model generalizability

Online: 2 June 2023 (04:08:42 CEST)

Lodging is a common problem in maize production that seriously impacts yield, quality, and the capacity for mechanical harvesting. Evaluation of site-specific lodging risks requires establishment of a method for multi-year monitoring. In this study, spectral images collected by the Sentinel-2 satellite were processed to obtain three types of data: gray-level co-occurrence matrix (GLCM) texture, vegetation indices (VIs), and spectral reflectance (SR). Lodging classification models were then established with Random Forest (RF) using each of the three data types separately (the GLCM, VI, and SR models) and in combination (SR+VI model, SR+GLCM model, VI+GLCM model, and SR+VI+GLCM model). By gradually removing features with low importance scores from the SR+VI+GLCM model and analyzing the changes in the overall accuracy (OA), the optimal set of predictive variables was identified and used to construct the optimal model. A model built using data from a single timepoint in 2021 was tested on data collected at a similar timepoint in 2019 and vice versa to assess interannual model generalizability. The results demonstrate that, among models constructed with a single feature type, the GLCM model had significantly lower accuracy than the VI and SR models. During certain growth stages, the model constructed with combined features had significantly higher accuracy in monitoring maize lodging than models constructed with a single feature type. During selection of the optimal predictive variables, the accuracy of the model did not increase as the number of predictive variables increased. The positive and negative validation models had accuracies of 96.55% and 95.18%, with kappa values of 0.93 and 0.83, respectively. This indicates that the model generalizes well to the same reproductive stage between years. This study provides a detailed method for large-scale maize lodging monitoring, allowing for identification of optimal planting practices to reduce the probability of lodging and ultimately improving regional maize yield and quality.

Preprint ARTICLE | doi:10.3390/sci2040061

A Hybrid Approach: Dynamic Diagnostic Rules for Sensor Systems in Industry 4.0 Generated by Online Hyperparameter Tuned Random Forest

Ahlam Mallak, Madjid Fathi

Subject: Engineering, Industrial And Manufacturing Engineering Keywords: industry4.0; fault detection; fault diagnosis; random forest; diagnostic graph; distributed diagnosis; model-based; data-driven; hybrid approach; hydraulic test rig

Online: 24 September 2020 (00:00:00 CEST)

Preprint ARTICLE | doi:10.20944/preprints202007.0548.v1

A Hybrid Approach: Dynamic Diagnostic Rules for Sensor Systems in Industry 4.0 Generated by Online Hyperparameter Tuned Random Forest

Ahlam Mallak, Madjid Fathi

Subject: Computer Science And Mathematics, Information Systems Keywords: industry4.0; fault detection; fault diagnosis; random forest; diagnostic graph; distributed diagnosis; model-based; data-driven; hybrid approach; hydraulic test rig

Online: 23 July 2020 (11:26:41 CEST)

Preprint ARTICLE | doi:10.20944/preprints201806.0188.v1

Spectral-Spatial Dimensionality Reduction of APEX Hyperspectral Imagery for Tree Species Classification; a Case Study of Salzach Riparian Mixed Forest

Zahra Dabiri, Stefan Lang

Subject: Environmental And Earth Sciences, Remote Sensing Keywords: minimum noise fraction (MNF) transformation; object-based image analysis (OBIA); APEX hyperspectral imagery; Random forest (RF) classifier; multiresolution segmentation (MRS); tree species classification

Online: 12 June 2018 (10:55:07 CEST)

Tree species composition is a key element of biodiversity and sustainable forest management, and hyperspectral data provide detailed spectral information that can be used for tree species classification. There are two main challenges in using hyperspectral imagery: a) the Hughes phenomenon, meaning that as the number of bands in hyperspectral imagery increases, the number of required classification samples increases exponentially, and b) in a more complex environment, such as riparian mixed forest, focusing on spectral variability per pixel may not be adequate for distinguishing tree species. Therefore, the focus of this study is to assess spectral-spatial dimensionality reduction of airborne hyperspectral imagery by using minimum noise fraction (MNF) transformation and object-based image analysis (OBIA). Airborne Prism Experiment (APEX) hyperspectral imagery was used. The study area was a riparian mixed forest located along the Salzach river, and six tree species were selected: Picea abies, Populus (canadensis and balsamifera), Fraxinus excelsior, Alnus incana, and Salix alba. The random forest (RF) machine learning algorithm was used to train and apply a prediction model for classification. A pixel-level classification was also done using the spectrally dimensionality-reduced APEX data. According to the confusion matrix, the object-level classification of MNF-derived components achieved an overall accuracy of 85% and a kappa coefficient of 0.805. Producer’s accuracy per class varied between 80% for Fraxinus excelsior, Alnus incana, and Populus canadensis and 90% for Salix alba and Picea abies. Comparing these results to the pixel-level classification showed better performance for the object-level classification (the pixel-level classification achieved an overall accuracy of 63% and a kappa coefficient of 0.559). Class performance for the pixel-based classification varied from 45% for Alnus incana to 80% for Picea abies. In general, spectral-spatial complexity reduction using MNF transformation and object-level classification yielded statistically satisfactory results.
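
Several abstracts in this listing report overall accuracy (OA) and the kappa coefficient derived from a confusion matrix. As a reminder of how the two relate, here is a minimal sketch of the standard definitions. The function name and the 2x2 matrix are hypothetical and chosen only for illustration; they are not data from any of the listed papers.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of samples of reference class i that were
    assigned to class j by the classifier.
    """
    n = sum(sum(row) for row in confusion)
    if n == 0:
        raise ValueError("confusion matrix is empty")

    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(len(confusion))) / n

    # Chance agreement: product of row and column marginals, summed over classes.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")

    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical two-class matrix (rows: reference, columns: predicted).
oa, kappa = overall_accuracy_and_kappa([[45, 5], [10, 40]])
print(round(oa, 3), round(kappa, 3))
```

Because kappa discounts agreement expected by chance, it is lower than OA whenever chance agreement is non-zero, which is why the abstracts report both figures.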

Preprint ARTICLE | doi:10.20944/preprints202306.1169.v1

Practical Entropy Accumulation for Random Number Generators with Image Sensor-Based Quantum Noise Sources

Youngrak Choi, Ju-Sung Kang, Yongjin Yeom

Subject: Computer Science And Mathematics, Applied Mathematics Keywords: Entropy accumulation; Random Number Generator; Quantum random noises

Online: 16 June 2023 (04:32:21 CEST)

Preprint ARTICLE | doi:10.20944/preprints201809.0195.v1

Random Fixed Point Theorems for Generalized Random α-ψ-contractive Mappings with Applications to Stochastic Differential Equation

Chayut Kongban, Poom Kumam, Juan Martinez-Moreno

Subject: Computer Science And Mathematics, Analysis Keywords: random fixed point, random $\alpha-$admissible with respect to $\eta$, generalized random $\alpha-\psi-$contractive mapping.

Online: 11 September 2018 (11:52:25 CEST)

Preprint ARTICLE | doi:10.20944/preprints201905.0036.v3

An Algorithmic Random Integer Generator Based on the Distribution of Prime Numbers

Bertrand Teguia

Subject: Computer Science And Mathematics, Logic Keywords: prime numbers; random

Online: 4 June 2019 (11:12:53 CEST)

Working Paper ARTICLE

A Note on Discrete Degenerate Random Variables

Taekyun Kim, Dae San Kim, Lee-Chae Jang, H. Y. Kim

Subject: Computer Science And Mathematics, Discrete Mathematics And Combinatorics Keywords: discrete degenerate random variables; degenerate binomial random variable; degenerate Poisson random variable; new type degenerate Bell polynomials

Online: 15 November 2019 (16:43:03 CET)

Preprint REVIEW | doi:10.20944/preprints202309.0093.v1

Interpreting Randomized Controlled Trials

Pavlos Msaouel, Juhee Lee, Peter F Thall

Subject: Medicine And Pharmacology, Other Keywords: blocking; hazard ratios; confidence intervals; generalizability; randomized controlled trials; random allocation; random sampling; random treatment assignment; stratification; transportability

Online: 4 September 2023 (03:22:18 CEST)

Preprint ARTICLE | doi:10.20944/preprints202002.0350.v1

Random Pullback Attractor of a Non-autonomous Local Modified Stochastic Swift-Hohenberg with Multiplicative Noise

Yongjun Li, Tinggang Zhao, Hongqing Wu

Subject: Computer Science And Mathematics, Applied Mathematics Keywords: Swift-Hohenberg equation; Random-pullback attractor; Non-autonomous random dynamical system

Online: 24 February 2020 (12:30:08 CET)

Preprint ARTICLE | doi:10.20944/preprints202309.0879.v1

Random Number Generators: Principles and Applications

Anastasios Bikos, Panayiotis E. Nastou, Georgios Petroudis, Yannis Stamatiou

Subject: Computer Science And Mathematics, Security Systems Keywords: Random Number Generation; Cryptography

Online: 14 September 2023 (03:37:49 CEST)

Preprint ARTICLE | doi:10.20944/preprints201905.0165.v2

Geospatial-Temporal and Demand Models for Opioid Admissions, Implications for Policy

Lawrence Fulton, Zhijie Dong, Benjamin Zhan, C. Scott Kruse, Paula Stigler Granados

Subject: Medicine And Pharmacology, Pharmacy Keywords: opioids, GIS, random forests

Online: 18 June 2019 (11:15:56 CEST)

Preprint ARTICLE | doi:10.20944/preprints202403.1142.v1

Stochastic Compartment Model With Mortality and Its Application to Epidemic Spreading in Complex Networks

Teo Granger, Thomas M. Michelitsch, Michael Bestehorn, Alejandro P Riascos, Bernard A. Collet

Subject: Physical Sciences, Other Keywords: Epidemic spreading; compartment model with mortality; memory effects; random walks; random graphs

Online: 22 March 2024 (07:28:23 CET)

We study epidemic spreading in complex networks by a multiple random walker approach. Each walker performs an independent simple Markovian random walk on a complex undirected (ergodic) random graph, where we focus on Barabási-Albert (BA), Erdős-Rényi (ER) and Watts-Strogatz (WS) types. Both walkers and nodes can be either susceptible (S) or infected and infectious (I), representing their states of health. Susceptible nodes may be infected by visits of infected walkers, and susceptible walkers may be infected by visiting infected nodes. No direct transmission of the disease among walkers (or among nodes) is possible. This model mimics a large class of diseases such as Dengue and Malaria with transmission of the disease via vectors (mosquitos). Infected walkers may die during the time span of their infection, introducing an additional compartment D of dead walkers. Infected nodes never die and always recover from their infection after a random finite time. This assumption is based on the observation that infectious vectors (mosquitos) are not ill and do not die from the infection. The infectious time spans of nodes and walkers, and the survival times of infected walkers, are represented by independent random variables. We derive stochastic evolution equations for the mean-field compartmental populations with mortality of walkers and delayed transitions among the compartments. From linear stability analysis, we derive the basic reproduction numbers $R_M$ and $R_0$ with and without mortality, respectively, and prove that $R_M < R_0$. For $R_M, R_0 > 1$ the healthy state is unstable, whereas for zero mortality a stable endemic equilibrium exists (independent of the initial conditions), which we obtain explicitly. We observe that the solutions of the random walk simulations in the considered networks agree well with the mean-field solutions for strongly connected graph topologies, but less well for weakly connected structures and for diseases with high mortality. Our model has applications beyond epidemic dynamics, for instance in the kinetics of chemical reactions, the propagation of contaminants, and wood fires, among many others.
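
The walker-and-node transmission rules in this abstract can be illustrated with a toy discrete-time simulation. The sketch below is a strong simplification and not the authors' model: the function name, the per-step recovery and death probabilities (which give geometric rather than general random waiting times), the sequential update order, and the ring graph are all assumptions made for illustration.

```python
import random


def simulate_walkers(adjacency, n_walkers, steps,
                     p_infect=1.0, p_recover=0.1, p_die=0.02, seed=0):
    """Toy walker/node epidemic on an undirected graph.

    adjacency: dict mapping each node to a non-empty list of neighbours.
    Assumption: recovery and death are per-step Bernoulli events; the abstract
    only says these times are independent random variables.
    """
    rng = random.Random(seed)
    nodes = list(adjacency)
    node_state = {v: "S" for v in nodes}
    position = [rng.choice(nodes) for _ in range(n_walkers)]
    walker_state = ["S"] * n_walkers
    walker_state[0] = "I"  # start with a single infected walker

    history = []
    for _ in range(steps):
        for w in range(n_walkers):
            if walker_state[w] == "D":
                continue  # dead walkers no longer move
            position[w] = rng.choice(adjacency[position[w]])
            v = position[w]
            # Transmission only happens between a walker and the node it visits.
            if walker_state[w] == "I" and node_state[v] == "S":
                if rng.random() < p_infect:
                    node_state[v] = "I"
            elif walker_state[w] == "S" and node_state[v] == "I":
                if rng.random() < p_infect:
                    walker_state[w] = "I"

        for w in range(n_walkers):
            if walker_state[w] == "I":
                r = rng.random()
                if r < p_die:
                    walker_state[w] = "D"  # infected walkers may die
                elif r < p_die + p_recover:
                    walker_state[w] = "S"
        for v in nodes:
            # Infected nodes never die; they only recover.
            if node_state[v] == "I" and rng.random() < p_recover:
                node_state[v] = "S"

        history.append({
            "S": walker_state.count("S"),
            "I": walker_state.count("I"),
            "D": walker_state.count("D"),
            "infected_nodes": sum(1 for s in node_state.values() if s == "I"),
        })
    return history


# Hypothetical ring graph with 10 nodes and 20 walkers.
ring = {i: [(i - 1) % 10, (i + 1) % 10] for i in range(10)}
print(simulate_walkers(ring, n_walkers=20, steps=50)[-1])
```

The returned history tracks the walker compartments S, I and D together with the number of infected nodes at each step, which is the kind of trajectory the abstract compares against its mean-field solutions.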

Preprint ARTICLE | doi:10.20944/preprints202310.2066.v1

Random Walks Based Node Centralities to Attack Complex Networks

Massimiliano Turchetto, Michele Bellingeri, Roberto Alfieri, Ngoc-Kim-Khanh Khan, Quang Nguyen, Davide Cassi

Subject: Computer Science And Mathematics, Computer Science Keywords: real-world networks; node centrality; random walk processes; network robustness; network random walks

Online: 1 November 2023 (03:09:43 CET)

Preprint ARTICLE | doi:10.20944/preprints202312.1072.v1

Entropy of Difference

Pasquale Nardone, Giorgio Sonnino

Subject: Computer Science And Mathematics, Applied Mathematics Keywords: entropy; complexity measure; random signal

Online: 14 December 2023 (08:55:02 CET)

Preprint ARTICLE | doi:10.20944/preprints202407.2197.v1

Comparison of Statistical and Machine Learning Methods for Analysing Traffic Accident Fatalities

Farai Chigodora, Farai Fredric Mlambo, Herbert Hove

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: traffic fatalities, logistic regression, random forest

Online: 26 July 2024 (14:52:15 CEST)

Preprint ARTICLE | doi:10.20944/preprints202309.2160.v1

The Classification for The Sources in SDSS DR18: Searching for QSOs by Machine Learning

Xiao-Qing Wen, Ying-Zi Jiang, Feng-Hua Liu, Jun-Li Mi, Cui-Xia Li, Jiang Hu, Xiang-Ping Shi, Xiao-Wei Dong

Subject: Physical Sciences, Astronomy And Astrophysics Keywords: QSOs; LightGBM; CatBoost; XGBoost; random forest

Online: 30 September 2023 (08:04:34 CEST)

Preprint ARTICLE | doi:10.20944/preprints202308.1257.v1

Random Authentication Node Selection Mechanism in Block Network for Meta-Mobility Service Data Reliability

Jinsu Kim, Eunsun Choi, Byung-Gyu Kim, Namje Park

Subject: Computer Science And Mathematics, Computer Science Keywords: Blockchain; Mobility; Random Selection; Encoding; Token

Online: 17 August 2023 (09:55:13 CEST)

Preprint ARTICLE | doi:10.20944/preprints202307.1517.v1

The Random Vibrations of the Active Body of the Cultivators

PETRU CARDEI, Nicolae Constantin, Vergil Muraru, Catalin Persu, Raluca Sfiru, Nicolae-Valentin Vladut, Nicoleta Ungureanu, Mihai Matache, Cornelia Muraru-Ionel, Oana-Diana Cristea, Evelin-Anda LAZA

Subject: Biology And Life Sciences, Agricultural Science And Agronomy Keywords: random; vibrations; tillage; tools; complex; cultivator

Online: 21 July 2023 (11:39:13 CEST)

Preprint ARTICLE | doi:10.20944/preprints202305.1425.v1

Anaemia in Preschool-Aged Children in DR. Congo: Finding from a Nationally Representative Survey

Ngianga II (Shadrack) Kandala

Subject: Public Health And Healthcare, Public Health And Health Services Keywords: Haemoglobin; Anaemia; Dietary Diversity; Random-effect

Online: 19 May 2023 (10:06:34 CEST)

Preprint ARTICLE | doi:10.20944/preprints202102.0492.v3

How Accurate are WorldPop-Global-Unconstrained Gridded Population Data at the Cell-Level?: A Simulation Analysis in Urban Namibia

Dana R. Thomson, Douglas R. Leasure, Tomas Bird, Nikos Tzavidis, Andrew J. Tatem

Subject: Social Sciences, Geography, Planning And Development Keywords: LMIC; Global South; indicator; Random Forest

Online: 1 April 2022 (06:22:53 CEST)

Disaggregated population counts are needed to calculate health, economic, and development indicators in Low- and Middle-Income Countries (LMICs), especially in settings of rapid urbanisation. Censuses are often outdated and inaccurate in LMIC settings, and rarely disaggregated at fine geographic scale. Modelled gridded population datasets derived from census data have become widely used by development researchers and practitioners. These datasets are evaluated for accuracy at the spatial scale of the input data, which is often much coarser (e.g. administrative units) than the neighbourhood or cell-level scale of many applications. We simulate a realistic "true" 2016 population in Khomas, Namibia, a majority urban region, and introduce realistic levels of outdatedness (over 15 years) and inaccuracy in slum, non-slum, and rural areas. We aggregate these simulated realistic populations by census and administrative boundaries (to mimic census data), and generate 32 gridded population datasets that are typical of an LMIC setting using the WorldPop-Global-Unconstrained gridded population approach. We evaluate the cell-level accuracy of these simulated datasets using the original "true" population as a reference. In our simulation, we found large cell-level errors, particularly in slum cells, driven by the use of average population densities in large areal units to determine cell-level population densities. Age, accuracy, and aggregation of the input data also played a role in these errors. We suggest incorporating finer-scale training data into gridded population models generally, and WorldPop-Global-Unconstrained in particular (e.g., from routine household surveys or slum community population counts), and the use of new building footprint datasets as a covariate to improve cell-level accuracy.
It is important to measure accuracy of gridded population datasets at spatial scales more consistent with how the data are being applied, especially if they are to be used for monitoring key development indicators at neighbourhood scales with relevance to small dense deprived areas within larger administrative units.

Preprint ARTICLE | doi:10.20944/preprints201805.0302.v1

Relating Vertex and Global Graph Entropy in Randomly Generated Graphs

Philip Tee, George Parisis, Luc Berthouze, Ian Wakeman

Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: graph entropy; chromatic classes; random graphs

Online: 22 May 2018 (11:59:26 CEST)

Preprint REVIEW | doi:10.20944/preprints202304.0755.v1

Fuzzy Random Option Pricing in Continuous Time: A Systematic Review and an Extension of Vasicek’s Equilibrium Model of the Term Structure

Jorge de Andrés-Sánchez

Subject: Business, Economics And Management, Finance Keywords: option pricing; fuzzy-random variables; fuzzy numbers; fuzzy-random option pricing; Vasicek’s model of term structure

Online: 23 April 2023 (03:58:33 CEST)

Preprint ARTICLE | doi:10.20944/preprints202403.1569.v2

Highly-Sensitive Measure of Complexity Captures Boolean Networks Regimes and Temporal Order More Optimally

Manuel de J. Luevano, Alejandro Puga

Subject: Computer Science And Mathematics, Data Structures, Algorithms And Complexity Keywords: Random Boolean Networks; Entropy; Algorithmic Complexity; Compressibility

Online: 24 July 2024 (07:24:36 CEST)

Preprint ARTICLE | doi:10.20944/preprints202403.0944.v1

An Enhanced Particle Swarm Optimization (PSO) Employing Quasi-Random Numbers

Shivakumar Kannan, Urmila Diwekar

Subject: Engineering, Industrial And Manufacturing Engineering Keywords: Enhanced PSO; SOBOL; Halton; Quasi-random numbers

Online: 15 March 2024 (16:00:41 CET)

Preprint ARTICLE | doi:10.20944/preprints202311.1560.v1

Clustering of Floating Tracer in the Random Velocity Field Modulated by an Ellipsoidal Vortex Flow

Konstantin Koshel, Dmitry Stepanov, Nata Kuznetsova, Evgeny Ryzhov

Subject: Physical Sciences, Fluids And Plasmas Physics Keywords: tracer clustering; compressibility; ellipsoidal vortex; random flow

Online: 24 November 2023 (05:41:18 CET)

Preprint ARTICLE | doi:10.20944/preprints202311.0441.v1

Amino Acid Residues of the Metal Transporter OsNRAMP5 Responsible for Cadmium Absorption in Rice

Zhengtong Qu, Hiromi Nakanishi

Subject: Biology And Life Sciences, Agricultural Science And Agronomy Keywords: OsNRAMP5; cadmium; manganese; rice; transporter; random mutation

Online: 7 November 2023 (11:23:32 CET)

Preprint ARTICLE | doi:10.20944/preprints202309.0607.v1

Application of Machine Learning to Estimate Ammonia Atmospheric Emissions

Alessandro Marongiu, Anna Gilia Collalto, Gabriele Giuseppe Distefano, Elisabetta Angelino

Subject: Environmental And Earth Sciences, Pollution Keywords: ammonia; emission modelling; emission inventory; random forest

Online: 11 September 2023 (05:26:24 CEST)

Preprint ARTICLE | doi:10.20944/preprints202301.0557.v1

PDF Malware Detection Using Machine Learning

Awss AlMahadeen, mouhammd alkasassbeh

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: PDF; Malware; Machine Learning; Python; Random Forest

Online: 30 January 2023 (12:55:47 CET)

Preprint ARTICLE | doi:10.20944/preprints202208.0050.v1

Did Maxwell Dream of Electrical Bacteria?

Eleonora Alfinito, Maura Cesaria, Matteo Beccaria

Subject: Physical Sciences, Applied Physics Keywords: quorum sensing; resistance random network; complex networks

Online: 2 August 2022 (08:21:25 CEST)

Preprint ARTICLE | doi:10.20944/preprints202205.0023.v1

Random Triangle Theory: a Computational Approach

Ivano Azzini

Subject: Computer Science And Mathematics, Applied Mathematics Keywords: Random Triangle; Quasiorthogonal Dimension; Combinatorics; Computational Problems

Online: 5 May 2022 (07:58:23 CEST)

Preprint ARTICLE | doi:10.20944/preprints202202.0175.v1

Prediction of Linear Cationic Antimicrobial Peptides Active against Gram Negative and Positive Bacteria Based on Machine Learning Models

Ümmü Gülsüm Söylemez, Malik Yousef, Zülal Kesmen, Mine Erdem Büyükkiraz, Burcu Bakır-Güngör

Subject: Biology And Life Sciences, Biology And Biotechnology Keywords: antimicrobial peptide prediction; sequence analysis; random forest

Online: 14 February 2022 (11:57:01 CET)

Preprint ARTICLE | doi:10.20944/preprints202102.0498.v1

Machine Learning-Based Approach to Predict Insect-Herbivory-Damage and Insect-Type Attack in Maize Plants Using Hyperspectral Data

Danielle Elis Garcia Furuya, Mayara Maezano Faita Pinheiro, Felipe David Georges Gomes, Wesley Nunes Gonçalves, José Marcato Júnior, Diego de Castro Rodrigues, Maria Carolina Blassioli-Moraes, Mirian Fernandes Furtado Michereff, Miguel Borges, Raúl Alberto Alaumann, Ednaldo José Ferreira, Ana Paula Marques Ramos, Lucas Prado Osco, Lúcio André de Castro Jorge

Subject: Environmental And Earth Sciences, Atmospheric Science And Meteorology Keywords: proximal hyperspectral sensing; precision agriculture; random forest

Online: 22 February 2021 (17:20:41 CET)

A strategy to reduce qualitative and quantitative losses in crop yields is the early and accurate detection of insect damage in plants. Remote sensing systems like hyperspectral proximal sensors are a promising strategy for managing crops. In this respect, machine learning predictions associated with clustering techniques may be an interesting approach, mainly because of their robustness in evaluating high-dimensional data. In this paper, we model the spectral response of insect-herbivory-damage in maize plants and propose an approach based on machine learning and a clustering method to predict whether the plant is herbivore-attacked or not using leaf reflectance measurements. We differentiate insect-type damage based on the spectral response and indicate the most contributive wavelengths to perform it. For this, we used a maize experiment in semi-field conditions. The maize plants were submitted to three different treatments: control (healthy plants); plants submitted to Spodoptera frugiperda herbivory-damage; and plants submitted to Dichelops melacanthus herbivory-damage. The leaf spectral response of all plants (control and submitted to herbivory) was measured with a FieldSpec 3.0 Spectroradiometer from 350 to 2500 nm for eight consecutive days. We evaluated the performance of different learners like random forest (RF), support vector machine (SVM), extreme gradient boost (XGB), and neural networks (MLP), and measured the impact of a day-by-day analysis on the prediction. We proposed a novel framework with a ranking strategy, based on the accuracy returned by predictions, and a clustering method based on a self-organizing map (SOM) to identify important regions in the reflectance measurement. Our results indicated that the RF-based framework algorithm is the overall best learner to deal with this type of data. After the 5th day of analysis, the accuracy of the algorithm improved substantially.
It separated the three treatments into different groups with an F-measure equal to 0.967, 0.917, and 0.881, respectively. We also verified that the most contributive spectral regions are situated in the near-infrared domain. We conclude that the proposed approach with machine learning methods is adequate to monitor herbivory-damage of S. frugiperda and stink bugs like Dichelops melacanthus in maize, differentiating the types of insect-attack early on. We also demonstrate that the framework proposed for the analysis of the most contributive wavelengths is suitable to highlight spectral regions of interest.

Working Paper BRIEF REPORT

Linking Cannabis spp. Metabolite Profiles to Effects and Classifications

Ana Monk, Eric Lane

Subject: Medicine And Pharmacology, Pharmacology And Toxicology Keywords: Cannabis; Metabolite; Principal Component Analysis; Random Forest

Online: 5 September 2020 (07:51:50 CEST)

Preprint ARTICLE | doi:10.20944/preprints202008.0132.v1

Reinforcement of Cohesionless Soil by Multi-oriented Geosynthetic Inclusions

Shwetha Prasanna

Subject: Environmental And Earth Sciences, Soil Science Keywords: reinforced soil; hexapods; layered inclusion; random inclusion

Online: 5 August 2020 (10:51:29 CEST)

Preprint ARTICLE | doi:10.20944/preprints201812.0250.v3

Annually Modelling Built-settlements between Remotely-sensed Observations Using Relative Changes in Subnational Populations and Lights at Night

Jeremiah J. Nieves, Alessandro Sorichetta, Catherine Linard, Maksym Bondarenko, Jessica Steele, Forrest Stevens, Andrea E. Gaughan, Alessandra Carioli, Donna Clarke, Thomas Esch, Andrew J. Tatem

Subject: Environmental And Earth Sciences, Remote Sensing Keywords: Built-settlements; urban features; spatial growth; random forest; dasymetric modelling; population

Online: 9 October 2019 (10:48:20 CEST)

Preprint ARTICLE | doi:10.20944/preprints201804.0022.v1

Heterogeneity of Cooperative Membership: Implications for Cooperative Sustainability

Matthew Elliott, Lisa Elliott, Evert Van der Sluis

Subject: Business, Economics And Management, Economics Keywords: cooperatives; membership heterogeneity; random forest; collective action

Online: 2 April 2018 (11:01:16 CEST)

Preprint ARTICLE | doi:10.20944/preprints201611.0028.v1

Predictors of Survival in Children with Ependymoma from a Single Center: Using Random Survival Forests

Francisco H. C. Felix, Juvenia B. Fontenele

Subject: Medicine And Pharmacology, Oncology And Oncogenics Keywords: random survival forests; ependymoma; predictors; valproic acid

Online: 3 November 2016 (11:02:12 CET)

Preprint ARTICLE | doi:10.20944/preprints202111.0078.v1

GRaVN: A Convolutional Network Approach to Generalised Characterisation of Raman Spectra for Space Exploration

Jon Kissi, Tianqi Xie, Ken McIsaac, Gordon R Osinski, Sean Shieh

Subject: Engineering, Electrical And Electronic Engineering Keywords: GRaVN; machine learning; convolutional neural networks; CNN; raman spectroscopy; analogue missions; planetary science; random undersampling; random oversampling; CanMoon

Online: 3 November 2021 (09:24:38 CET)

Preprint ARTICLE | doi:10.20944/preprints202407.0661.v1

Extreme Behavior of Competing Risks with Random Sample Size

Long Bai, Kaihao Hu, Conghua Wen, Zhongquan Tan, Chengxiu Ling

Subject: Business, Economics And Management, Finance Keywords: extreme value theory; competing risks; random sample size

Online: 9 July 2024 (10:21:51 CEST)

Preprint ARTICLE | doi:10.20944/preprints202407.0562.v1

New Random Walk Algorithm Based on Different Seed Nodes for Community Detection

Wencong Li, Jiansheng Cai, Xiaodong Zhang, Jihui Wang

Subject: Computer Science And Mathematics, Discrete Mathematics And Combinatorics Keywords: Complex networks; Community detection; Random walk; Seed nodes

Online: 8 July 2024 (09:11:02 CEST)

Preprint ARTICLE | doi:10.20944/preprints202405.1116.v1

Exploring Pattern of Relapse in Pediatric Patients with Acute Lymphocytic Leukemia and Acute Myeloid Leukemia Undergoing Stem Cell Transplant Using Machine Learning Methods

David Shyr, Bing Zhang, Gopin Saini, Simon Brewer

Subject: Medicine And Pharmacology, Hematology Keywords: Leukemia; relapse; predictive model; random forest; machine learning

Online: 16 May 2024 (17:18:14 CEST)

Preprint ARTICLE | doi:10.20944/preprints202402.1414.v1

Combining Hydroacoustics Data with Landsat Images to Map Seagrass Cover and to Identify Spatial Pattern Predictors (Historically Low Rainfall, Benthic Topography and Hurricanes) of Long-Term Change (1984 to 2021) in a Jamaican Marine Protected Area

Kurt McLaren, Jasmine Sedman, Karen McIntyre, Kurt Prospere

Subject: Environmental And Earth Sciences, Remote Sensing Keywords: climate change; storms; submerged aquatic vegetation; random forest

Online: 26 February 2024 (11:35:43 CET)

Despite increased protection globally, climate change and other anthropogenic factors have caused a significant decline in seagrass cover. Identifying the specific causes of this decline is paramount if they are to be addressed. Therefore, we identified the causes of long-term change in seagrass/submerged aquatic vegetation (SAV) percentage cover and extent in the Bluefields Bay Special Fish Conservation Area (BBSFCA), a marine protected area on Jamaica’s southern coast. We used two random forest regression (RFr) models that included 2013 hydroacoustic survey SAV percentage cover data (dependent variable), and auxiliary data, reflectance data from level 2 processed 2013 Landsat 7 and 8 images and image band texture statistics maps (mean and variance) as predictors to generate 24 SAV percentage cover maps for the period 1984 – 2021 (37 years) using data from Landsat 4-5, 7 and 8 images, from which benthic feature maps (SAV present, absent and coral reef) were created. Rainfall and map data were used to determine if SAV extent/area (km²), average percentage cover and annual rainfall changed significantly over time and to evaluate the influence of rainfall. Rainfall impact on the overall spatial patterns of SAV loss, gain, and percentage cover change was assessed using a pixel-based regression. Finally, the most important spatial pattern predictors (two rainfall proxies (distance and direction from river mouth), benthic topography, depth, and hurricane exposure (a measure of hurricane disturbance)) of SAV loss, gain, and percentage cover change during 23 successive 1-to-4-year periods were identified using spatial Bayesian INLA generalized linear mixed models. SAV area/extent was largely stable with > 70% mean percentage cover for multiple years. However, Hurricane Ivan (in 2004) caused a significant decline in SAV area/extent (by 1.62 km², or 13%) during 2002 – 2006 and a second hurricane (Dean) in 2007 delayed recovery until 2015.
Additionally, rainfall declined significantly by >1000 mm since 1901, and mean monthly rainfall positively influenced SAV percentage cover change. The pixel-based regression highlighted areas where mean monthly rainfall had a positive overall effect on SAV cover percentage change (across the entire bay) and gain (closer to the mouth of a river). The most frequently selected important spatial pattern predictors were the two rainfall proxies (areas closer to the river mouth were more likely to experience SAV loss and gain) and depth, with shallow areas generally having a higher probability of SAV loss and gain. Three hurricanes had significant but different impacts depending on their distance from the southern coastline. Hurricane Gilbert, which made landfall in 1988, resulted in higher SAV percentage cover loss in 1987 - 1988. Benthic locations with a northwestern/northern facing aspect (the predominant direction of Ivan’s leading edge wind bands) experienced higher SAV losses during 2002 – 2006. Although some locations impacted by Ivan and not by Dean recovered during 2006 – 2008, exposure to Ivan explained percentage cover loss during 2006 – 2008 and average exposure to (the cumulative impact of) Ivan and Dean (both with tracks close to the southern coastline) explained SAV loss during 2013 – 2015. Therefore, despite historic lows in annual rainfall, overall, higher rainfall was beneficial, multiple hurricanes impacted the site, and despite two hurricanes in three years SAV recovered within a decade. Hurricanes and a further reduction in rainfall may pose a serious threat to SAV persistence in the future.

Preprint ARTICLE | doi:10.20944/preprints202401.1978.v1

Study of the Influence of Data Volume on the Quality of Regression to Restore the Distribution of Temperatures inside Tissue during Hyperthermia

Evgeny Kostyuchenko, Elena Amletova

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Hyperthermia, Regression, Data reduction, Decision Tree, Random Forest

Online: 29 January 2024 (09:52:17 CET)

Preprint ARTICLE | doi:10.20944/preprints202401.0072.v1

Autonomous Strike UAVs for Counterterrorism Missions: Challenges and Preliminary Solutions

Meshari Aljohani, Ravi Mukkamala, Stephan Olariu

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Drone; UAV; Smart contract; black box; random forest

Online: 3 January 2024 (02:24:28 CET)

Preprint ARTICLE | doi:10.20944/preprints202304.1222.v1

Phenylalanine Residues in the Active Site of CYP2E1 Participate in Determining the Binding Orientation and Metabolism-Dependent Genotoxicity of Aromatic Compounds

Keqi Hu, Hongwei Tu, Jiayi Xie, Zongying Yang, Zihuan Li, Yijing Chen, Yungang Liu

Subject: Biology And Life Sciences, Toxicology Keywords: aromatic compounds; CYP2E1; phenylalanine; molecular simulation; random forest

Online: 29 April 2023 (07:32:55 CEST)

Preprint ARTICLE | doi:10.20944/preprints202108.0248.v1

Spatial Warped Gaussian Processes: Estimation and Efficient Field Reconstruction

Gareth William Peters, Ido Nevat, Sai Ganesh Nagarajan, Tomoko Matsui

Subject: Computer Science And Mathematics, Probability And Statistics Keywords: Random fields; warped Gaussian Process; Spatial field reconstruction

Online: 11 August 2021 (10:39:35 CEST)

Working Paper ARTICLE

A Semi-Deterministic Random Walk with Resetting

Javier Villarroel, Miquel Montero, Juan Antonio Vega

Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: Random walk with resetting; Escape probabilities; Exit times

Online: 7 June 2021 (08:04:12 CEST)

Preprint ARTICLE | doi:10.20944/preprints202101.0349.v1

Determinants of Capital Structure in Financial Institutions: Evidence from selected Micro Finance Institutions of Ethiopia

Kanbiro Orkaido

Subject: Business, Economics And Management, Accounting And Taxation Keywords: Capital structure; Determinants; Microfinance Institutions; Random effect Model

Online: 18 January 2021 (14:50:08 CET)

Preprint ARTICLE | doi:10.20944/preprints202006.0028.v1

Genetic Algorithm: Reviews, Implementations, and Applications

Tanweer Alam, Shamimul Qamar, Amit Dixit, Mohamed Benaida

Subject: Computer Science And Mathematics, Data Structures, Algorithms And Complexity Keywords: genetic algorithm; search techniques; random tests; evolution; applications

Online: 4 June 2020 (07:44:03 CEST)

Preprint ARTICLE | doi:10.20944/preprints201808.0018.v1

1H NMR Based Serum Metabolic Profiling Reveals Differentiating Biomarkers in Patients with Diabetes and Diabetes Comorbidity

Atul Rawat, Gunjan Misra, Madhukar Saxena, Sukanya Tripathi, Durgesh Dubey, Sulekha Saxena, Avinash Aggarwal, Varsha Gupta, M Y Khan, Anand Prakash

Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: Nuclear Magnetic Resonance Spectroscopy; Metabolomics; Biomarker; Random Forest

Online: 1 August 2018 (11:30:39 CEST)

Background: Diabetes is among the most prevalent diseases worldwide. Of all the affected individuals, a significant proportion remains undiagnosed because of a lack of specific symptoms early in this disorder and inadequate diagnostics. Diabetes and its associated sequelae, i.e., comorbidities, are associated with microvascular and macrovascular complications. As diabetes is characterized by an altered metabolism of key metabolites and regulatory pathways, metabolic phenotyping can provide us with a better understanding of the unique set of regulatory perturbations that predispose to diabetes and its associated comorbidities. Methodology: The present study utilizes the analytical platform NMR spectroscopy coupled with Random Forest statistical analysis to identify the discriminatory metabolites of diabetes (DB) and diabetes-related comorbidity (DC) along with the healthy control (HC) subjects. A combined and pairwise analysis was performed between the serum samples of HC (n=50), DB (n=38), and DC (n=35) individuals to identify the discriminatory metabolites responsible for class separation. The perturbed metabolites were further rigorously validated using t-tests and AUROC analysis to examine their statistical significance. Results: The DB and DC patients were well discriminated from HC. However, 15 metabolites were found to be significantly perturbed in DC patients compared to DB. The identified panel of metabolites comprises TCA cycle intermediates (succinate, citrate) and methylamine metabolism intermediates (trimethylamine, methylamine, betaine); energy metabolites (glucose, lactate, pyruvate); and amino acids (valine, arginine, glutamate, methionine, proline and threonine). The metabolites were further used to identify the perturbed metabolic pathway and correlation of metabolites in DC patients.
Conclusion: The 1H NMR metabolomics may prove a promising technique to differentiate and predict diabetes and its comorbidities on their onset or progression by determining the altered levels of the metabolites in serum.

Preprint ARTICLE | doi:10.20944/preprints202405.1180.v1

Fractional Operators and Fractionally Integrated Random Fields on Z

Donatas Surgailis, Vytautė Pilipauskaitė

Subject: Computer Science And Mathematics, Probability And Statistics Keywords: fractional differentiation/integration operators; tempered fractional operators; fractional random field; random walk; limit theorems; long-range dependence; negative dependence; conditional autoregression

Online: 21 May 2024 (10:34:48 CEST)

Preprint ARTICLE | doi:10.20944/preprints201802.0008.v1

Computational Information Geometry For Binary Classification of High-Dimensional Random Tensors

Gia-Thuy Pham, Rémy Boyer, Frank Nielsen

Subject: Computer Science And Mathematics, Probability And Statistics Keywords: Optimal Bayesian detection, information geometry, minimal error probability, Chernoff/Bhattacharyya upper bound, large random tensor, Fisher information, large random sensing matrix

Online: 1 February 2018 (16:32:04 CET)

Preprint ARTICLE | doi:10.20944/preprints202406.0264.v1

Optimizing Machine Learning Models for Urban Studies: A Comparative Analysis of Hyperparameter Tuning Methods

Tris Kee, Winky K.O. Ho

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: hyperparameter tuning; optuna; grid search; random search; urban studies

Online: 5 June 2024 (10:59:28 CEST)

Preprint ARTICLE | doi:10.20944/preprints202405.0476.v1

Nitrogen Estimation in Fig Cultivation through Remote Sensing and Machine Learning

Karla Janeth Martínez-Macias, Aldo Rafael Martínez-Sifuentes, Selenne Yuridia Márquez-Guerrero, Arturo Reyes-González, Pablo Preciado-Rangel, Pablo Yescas-Coronado, Ramón Trucíos-Caciano

Subject: Environmental And Earth Sciences, Soil Science Keywords: Gradient Boosting; Random Forest; Artificial Neural Networks; Vegetation Index

Online: 9 May 2024 (12:15:23 CEST)

Preprint ARTICLE | doi:10.20944/preprints202404.1875.v1

Enhancing Precision: Unveiling Individualized Treatment Effects with Advanced Computational Methods

Satish Mandavalli

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Causal Forest; Causal Net; Healthcare; Virtual Twin Random Forest

Online: 29 April 2024 (10:34:18 CEST)

Preprint ARTICLE | doi:10.20944/preprints202404.1010.v1

Eigenvalue Distributions in Random Confusion Matrices: Applications to Machine Learning Evaluation

Oyebayo Ridwan Olaniran, Ali Rashash R. Alzahrani, Mohammed R. Alzahrani

Subject: Computer Science And Mathematics, Probability And Statistics Keywords: Eigenvalue; Confusion Matrix; Random matrix; Probability distribution; Evaluation metrics

Online: 16 April 2024 (07:37:52 CEST)

Preprint ARTICLE | doi:10.20944/preprints202311.0049.v1

Predicting Nurse Turnover for Highly Imbalanced Data Using SMOTE and Machine Learning Algorithms

Yuan Xu, Yongshin Park, Ju dong Park, Bora Sun

Subject: Public Health And Healthcare, Nursing Keywords: Nurse Turnover; Machine Learning; SMOTE; NSSRN; Random Forest; XGBoost

Online: 1 November 2023 (09:19:50 CET)

Preprint ARTICLE | doi:10.20944/preprints202310.1706.v1

Analysis of Reactive Power in Electrical Networks Supplying Non-linear, Fast-Varying Loads

Yuriy Sayenko, Ryszard Pawelek, Tetiana Baranenko

Subject: Engineering, Electrical And Electronic Engineering Keywords: reactive power; higher harmonics; interharmonics; random processes; autocorrelation function

Online: 26 October 2023 (10:39:24 CEST)

Preprint ARTICLE | doi:10.20944/preprints202310.1670.v1

Front Movement and Sweeping Rules of CO2 Flooding under Different Oil Displacement Patterns

Xiang Qi, Tiyao Zhou, Weifeng Lyu, Dongbo He, Yingying Sun, Meng Du, Mingyuan Wang, Zheng Li

Subject: Engineering, Energy And Fuel Technology Keywords: CO2 front; Sweep coefficient; Random Forest; Main controlling factor

Online: 25 October 2023 (15:08:12 CEST)

CO2 flooding stands as a pivotal technique for significantly enhancing oil recovery in low-permeability reservoirs. The movement and sweeping rules at the front of CO2 flooding play a critical role in its oil recovery, yet a comprehensive quantitative analysis remains an area in need of refinement. In this study, we have developed 1D and 2D numerical simulation models to explore the sweeping behavior of miscible, immiscible, and partly-miscible CO2 flooding patterns. The front position and movement rules of the three CO2 flooding patterns are determined. A novel approach of the contour area calculation method is introduced to quantitatively characterize the sweep coefficients, and the sweeping rules are discussed regarding geological parameters, oil viscosity, and injection-production parameters. Furthermore, the Random Forest (RF) algorithm is employed to identify the controlling factor of the sweep coefficient, as determined by out-of-bag (OOB) data displacement analysis. The results show that the miscible front is located at the point of maximum CO2 content in the oil phase. The immiscible front occurs at the point of maximum interfacial tension near the production well. Remarkably, the immiscible front moves at a faster rate compared to the miscible front. Geological parameters, including porosity, permeability, and reservoir thickness, significantly impact the gravity segregation effect, thereby influencing the CO2 sweep coefficient. Immiscible flooding exhibits the highest degree of gravity segregation, with a maximum gravity segregation degree (GSD) reaching 78.1. The permeability ratio is a crucial factor, with a lower limit of approximately 5.0 for reservoirs suitable for CO2 flooding. Injection-production parameters also play a pivotal role in sweep coefficient. Decreased well spacing and increased gas injection rates are found to enhance sweep coefficients by suppressing gravity segregation. 
Additionally, higher gas injection rates can improve the miscibility degree of partly-miscible flooding from 0.69 to 1.0. Oil viscosity proves to be a significant factor influencing the sweep coefficients, with high seepage resistance due to increasing oil viscosity dominating the miscible and partly-miscible flooding patterns. Conversely, gravity segregation primarily governs the sweep coefficient in immiscible flooding. In terms of controlling factors, the permeability ratio emerges as the paramount influence, with a factor importance value (FI) reaching 1.04. These results provide a theoretical foundation for the application of CO2 flooding, enhancing the understanding of the critical factors governing its success.

Preprint ESSAY | doi:10.20944/preprints202308.0080.v1

Impact of Land Cover Changes on Soil Mapping in Plain Areas: Evidence from Tongzhou District of Beijing, China

Xiangyuan Wu, Kening Wu Wu, Huafu Zhao, Shiheng Hao, Long Kang, Zhenyu Zhou

Subject: Environmental And Earth Sciences, Soil Science Keywords: land cover changes; soil mapping; random forest; plain areas

Online: 1 August 2023 (10:53:33 CEST)

The flat terrain in plain areas makes the land easily accessible for cultivation and farming, providing vast opportunities for agricultural development. Additionally, these areas are crucial for urban construction and economic growth. Soil mapping plays a crucial role in understanding soil characteristics and guiding land management practices. However, accurately mapping soils in plain regions can be challenging due to their low spatial variability and diverse land use types. This study focuses on the impact of land cover changes on the accuracy of soil mapping in plain areas, aiming to provide effective assistance in soil mapping through the analysis of their coupling relationship. Starting with a 20-year land cover change analysis, this study utilizes a unified approach that combines expert knowledge, mixed sampling methods, and random forest mapping techniques. The study incorporates environmental covariates that have minimal period influence and synergistically uses NDVI (Normalized Difference Vegetation Index) and land cover data from the same year. The analysis is based on transition matrices, confusion matrices, and their derived indicators. The research findings indicate that Tongzhou District has experienced rapid development over the past 20 years, with the area of construction land nearly doubling. 29% of arable land has been converted into construction land, resulting in an increase in the accuracy of the soil map from 58.99% to 66.91% over the 20-year period. The soil change area during this period accounts for 16.5% of the total area, with 51.9% of the changed areas overlapping with land cover change areas. These overlapping regions are predominantly influenced by human activities.
In terms of cultivated land types in the study area, the quantity of arable land has decreased by approximately 29% over the 20 years, while the proportion of sandy loam calcareous fluvo-aquic soil and light loam calcareous fluvo-aquic soil, which constitute nearly half of the soil types, has increased. These data demonstrate the coupling relationship between land cover changes and soil type variations, particularly the significant influence of human activities on soil structure. It is evident that on one hand, improving the extent of land use in plain areas enhances the credibility of soil mapping. On the other hand, human activities impact land cover, which in turn affects and reflects changes in the soil.
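The abstract above names transition matrices and confusion matrices as the basis of its analysis. As a minimal illustration of those two tools (not the authors' code), the Python sketch below builds a land cover transition matrix and computes overall accuracy from a confusion matrix. The class names, the six-cell toy rasters, and the function names `transition_matrix` and `overall_accuracy` are all hypothetical and chosen only for this example.

```python
from collections import Counter


def transition_matrix(before, after, classes):
    """Count cells moving from each class (rows) to each class (columns)."""
    counts = Counter(zip(before, after))
    return [[counts[(src, dst)] for dst in classes] for src in classes]


def overall_accuracy(confusion):
    """Share of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


# Hypothetical toy data: land cover of six cells at two dates.
classes = ["arable", "construction", "other"]
cover_start = ["arable", "arable", "arable", "construction", "other", "arable"]
cover_end = ["arable", "construction", "arable", "construction", "other", "construction"]

tm = transition_matrix(cover_start, cover_end, classes)

# Share of the initial arable cells that ended up as construction land.
arable_row = tm[classes.index("arable")]
arable_to_construction = arable_row[classes.index("construction")] / sum(arable_row)

# Hypothetical two-class confusion matrix (rows: reference, columns: mapped).
toy_confusion = [[8, 2], [3, 7]]
toy_accuracy = overall_accuracy(toy_confusion)
```

Each row of the transition matrix describes the fate of one initial class, so dividing an off-diagonal entry by its row sum gives a conversion share of the kind reported in the abstract.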

Preprint ARTICLE | doi:10.20944/preprints202306.1210.v1

Inverse Evaluation of Monopile Pile-soil Interaction Parameters Using Random Search

Hou Qiao, Wei Li, Zhenqiang Jiang, Chuanrui Guo

Subject: Engineering, Marine Engineering Keywords: offshore wind; parameter inversion; pile-soil interaction; random search

Online: 16 June 2023 (10:13:49 CEST)

Preprint ARTICLE | doi:10.20944/preprints202306.0705.v1

Phenology-Based Winter Wheat Classification for Crop Growth Monitoring Using Multi-Temporal Sentinel-2 Satellite Data

Solomon W Newete

Subject: Biology And Life Sciences, Agricultural Science And Agronomy Keywords: Phenology; Tillering; Random Forest; Crop type; Clustering; Unsupervised classification

Online: 9 June 2023 (11:04:40 CEST)

Preprint ARTICLE | doi:10.20944/preprints202305.0778.v1

Credit Reports Classification Based on Semi-Supervised Learning Methods

Ruiqi Feng, Lu Han, Muzi Chen

Subject: Computer Science And Mathematics, Computational Mathematics Keywords: Ant colony clustering algorithm; Random Forest; Fuzzy number; Classification

Online: 11 May 2023 (03:57:06 CEST)

Preprint ARTICLE | doi:10.20944/preprints202209.0088.v1

Numerical Simulation of the Elastic-Ideal Plastic Material Behavior of Short Fiber-Reinforced Composites Including Its Spatial Distribution with an Experimental Validation

Natalie Rauter

Subject: Engineering, Mechanical Engineering Keywords: Short fiber-reinforced composite; Random fields; Plasticity; Numerical simulation

Online: 6 September 2022 (10:11:54 CEST)

Preprint ARTICLE | doi:10.20944/preprints202208.0058.v1

Heterogeneity Extends Criticality

Fernanda Sánchez-Puig, Octavio Zapata, Omar K. Pineda, Gerardo Iñiguez, Carlos Gershenson

Subject: Physical Sciences, Theoretical Physics Keywords: complexity; phase transitions; criticality; Ising model; random Boolean networks

Online: 2 August 2022 (09:30:37 CEST)

Preprint ARTICLE | doi:10.20944/preprints202207.0462.v1

On the Predictability of Greek Systemic Bank Stocks using Machine Learning Techniques

Hera Antonopoulou, Leonidas Theodorakopoulos, Constantinos Halkiopoulos, Vicky Mamalougkou

Subject: Business, Economics And Management, Finance Keywords: Machine Learning; Random Forest; Google Trends; Predictability; Banks; Greece

Online: 29 July 2022 (13:07:42 CEST)

Preprint ARTICLE | doi:10.20944/preprints202112.0138.v2

Comparison of Machine Learning Techniques in Cotton Yield Prediction Using Satellite Remote Sensing

Francielle Morelli-Ferreira, Nayane Jaqueline Costa Maia, Danilo Tedesco, Elizabeth Haruna Kazama, Franciele Morlin Carneiro, Leticia Bernabe Santos, Getulio Freitas Seben Junior, Glauco Souza Rolim, Luciano Shozo Shiratsuchi, Rouverson Pereira Silva

Subject: Biology And Life Sciences, Agricultural Science And Agronomy Keywords: Yield mapping; vegetation index; Stepwise; SR; Random Forest; KNN

Online: 9 December 2021 (15:39:34 CET)

The use of machine learning techniques to predict yield from remote sensing is an irreversible trend, and on-farm studies aim to help rural producers in decision-making. Commercial fields equipped with these technologies in Mato Grosso, Brazil, were therefore monitored with satellite images to predict cotton yield using supervised learning techniques. The objective of this research was to identify how early in the growing season cotton yield can be predicted at the farm level, and which vegetation indices and machine learning algorithms predict it best. We proceeded in three steps: 1) We observed the yield in 398 ha (3 fields) and calculated eight vegetation indices (VIs) on five dates during the growing season. 2) We created scenarios to facilitate the analysis and interpretation of results: Scenario 1, all data (8 indices on 5 dates = 40 inputs), and Scenario 2, the best variable selected by stepwise regression (1 input). 3) In the search for the best algorithm, we performed hyperparameter adjustments, calibrations and tests using machine learning to predict yield, and evaluated the performances. Scenario 1 had the best metrics in all fields of study, and the Multilayer Perceptron (MLP) and Random Forest (RF) algorithms showed the best performances, with an adjusted R2 of 47% and an RMSE of only 0.24 t ha^-1. However, this scenario requires all predictive inputs generated throughout the growing season (approx. 180 days). We therefore optimized the prediction by testing only the best VI in each field and found that, among the eight VIs, the Simple Ratio (SR) driven by the K-Nearest Neighbor (KNN) algorithm predicts with an RMSE of 0.26 and 0.28 t ha^-1 and a MAPE of 5.20%, anticipating the cotton yield with low error by about 143 days. It also demands less computation to generate the prediction than, for example, MLP and RF. This makes it usable as a technique that helps predict cotton yield, saving time for planning, whether in marketing or in crop management strategies.
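The quantities this abstract reports can be made concrete with a short Python sketch. It is a minimal sketch, not the authors' pipeline: it assumes the standard definition of the Simple Ratio (near-infrared reflectance divided by red reflectance) and the usual formulas for RMSE and MAPE. The function names and the sample numbers are hypothetical.

```python
import math


def simple_ratio(nir, red):
    """Simple Ratio vegetation index, assumed here as NIR / Red reflectance."""
    return nir / red


def rmse(observed, predicted):
    """Root mean square error, in the same units as the inputs."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(observed, predicted):
    """Mean absolute percentage error, in percent (observed values must be non-zero)."""
    ratios = [abs((o - p) / o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Scenario 1 in the abstract: 8 indices on 5 dates give 40 model inputs.
n_inputs_all_data = 8 * 5

# Hypothetical reflectances and yields (t/ha), for illustration only.
sr_value = simple_ratio(nir=0.5, red=0.1)
observed_yield = [2.0, 4.0]
predicted_yield = [1.0, 5.0]
yield_rmse = rmse(observed_yield, predicted_yield)
yield_mape = mape(observed_yield, predicted_yield)
```

RMSE keeps the unit of the yield (t ha^-1 in the abstract), while MAPE is unitless, which is why the abstract can quote both for the same model.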

Working Paper ARTICLE

Biofertilization Alters the Composition and Interaction of the Protistan Community in the Wheat Rhizosphere under Field Conditions

Yongbin Li, Caixia Wang, Sanfeng Chen

Subject: Biology And Life Sciences, Agricultural Science And Agronomy Keywords: Microbiome; Diazotroph; Nitrogen fixation bacteria; Random Forest; Network; Trichomona

Online: 23 August 2021 (12:15:31 CEST)
