Preprint
Article

This version is not peer-reviewed.

Remote Sensing and Deep Learning Based Soil Moisture Monitoring System Using HCSWO Optimization Technique and AGRL-RBFN

Submitted:

03 March 2025

Posted:

05 March 2025


Abstract

Soil moisture is a crucial factor in the hydrological cycle: it supports plant development and ecosystem health and contributes to groundwater reserves. Consequently, it plays a significant role in the global climate system. Existing research has not sufficiently explored the impact of climatic changes on soil moisture patterns during monitoring, which complicates prediction and management efforts. To tackle this issue, the proposed study employs seasonal mapping and grouping techniques to observe climatic variations and predicts soil moisture using the AGRL-RBFN method with IL. Initially, historical remote sensing data on soil moisture are gathered and subjected to a three-step preprocessing procedure: gaps are filled using the AdaK-MCC method, noise is reduced with the Savitzky-Golay Filter (SGF), and atmospheric interference is corrected. After preprocessing, seasons are mapped and the AdaK-MCC method is used to group the data. A multivariate correlation analysis is then conducted on the grouped data through Principal Component Analysis (PCA), and the varying patterns within the grouped data are examined using the FWFCSD method. Features are extracted from these pattern and correlation analyses, after which optimal features are selected via the Hierarchical Correlated Spider Wasp Optimizer (HCSWO). Finally, the AGRL-RBFN with IL predicts soil moisture with an accuracy of 98.09%.


1. Introduction

Soil moisture is a crucial factor governing the interactions of the water, energy, and carbon cycles between terrestrial ecosystems and the atmosphere. Consequently, soil moisture data are vital for a range of environmental research areas, such as hydrology, climatology, meteorology, agriculture, water resource management, and the study of climate change. In agriculture, soil moisture data are used to assess drought severity, determine the onset of the rainy season, schedule planting activities, and provide early alerts regarding potential yield reductions. As climate conditions change, both the significance and the difficulty of monitoring soil moisture will increase markedly. Healthy plant growth requires soil moisture to stay within an ideal range that avoids both excessive saturation and drought. Soils that are too wet can leach essential nutrients and cause root diseases, whereas soils that are too dry can lead to plant mortality, reduced crop yields, and compromised quality (Pal & Bodhe, 2023). To address this concern, soil moisture is monitored regularly, frequently with probe techniques, to support informed decisions on irrigation, crop management, and water conservation. Accurate forecasting of soil moisture levels is also vital for evaluating farmland quality (Abbes et al., 2024; Yang et al., 2024). Remote sensing technology is a robust alternative for large-scale soil moisture monitoring, providing high spatial and temporal resolution. Currently, soil moisture data derived from remote sensing are supplied by the National Aeronautics and Space Administration (NASA) and the European Space Agency’s Soil Moisture and Ocean Salinity (SMOS) mission (Wang et al., 2023; Lee et al., 2023).
In recent years, machine learning and deep learning methodologies have been employed to estimate soil moisture from these data. These methodologies can automatically extract significant features from raw data and handle complex relationships within large datasets (Li & Yan, 2024). Current research predominantly employs machine learning techniques such as Support Vector Machines (SVM), k-nearest neighbors (KNN) (Uthayakumar et al., 2022), Random Forest (RF), and Artificial Neural Networks (ANN), which enhance accuracy in soil moisture estimation (Liu et al., 2022; Shokati et al., 2024). These methods capture intricate relationships and mitigate the risk of overfitting (Win et al., 2024; Peng et al., 2024). Convolutional Neural Networks (CNN) are particularly adept at considering spatial arrangements, thereby facilitating the extraction of deeper features (Liu et al., 2021; Hegazi et al., 2023). Long Short-Term Memory (LSTM) networks are used to capture temporal dynamics and to create a virtual soil moisture sensor from the data of other node transducers (Babu & Yadavamuthiah, 2024; Patrizi et al., 2022).
Roberts et al. (2022) introduced a deep learning model aimed at estimating soil moisture through the use of remote sensing data. This model employs a Convolutional Neural Network (CNN) to process reflection measurements and surface parameters, thereby improving the accuracy of global soil moisture estimations. The analysis of results indicates that the CNN method outperforms existing models in terms of accuracy; however, the framework demonstrates limitations in areas characterized by minimal variation in soil moisture.
Singh et al. (2024) introduced a machine learning (ML) methodology for predicting soil moisture utilizing Wireless Sensor Networks (WSN). The study primarily employed Random Forest (RF), Support Vector Machines (SVM), and Long Short Term Memory (LSTM) techniques for monitoring soil moisture levels. The findings indicate that the LSTM model consistently surpasses other ML algorithms across various evaluation metrics; however, it necessitates greater computational resources and extended training durations, which may impact the overall performance of the research framework.
Dabboor et al. (2023) established a deep learning framework aimed at predicting soil moisture content from satellite data through the application of several optimized machine learning models. The Gaussian Process Regression (GPR) and Artificial Neural Network (ANN) methods were utilized in the prediction process. The results demonstrate that the GPR model excelled compared to other ML models in accurately estimating soil moisture content, achieving a root mean square error (RMSE) of 4.05%. Nonetheless, the training time is prolonged due to the extensive hyperparameter tuning involved.
Nijaguna et al. (2023) introduced a deep learning methodology for the retrieval of soil moisture from remote sensing imagery. They employed a Gated Recurrent Unit (GRU) hybrid classifier for determining soil moisture content. The findings suggest that this hybrid classifier outperforms other models in terms of performance. Nevertheless, the conventional feature selection method did not encompass all significant features that could influence the research outcomes.
Liu et al. (2023) proposed a machine learning technique for estimating soil moisture utilizing multi-source remote sensing images. The Extreme Trees (ETr) and Random Forest (RF) models were effectively implemented for this estimation. While the ETr model demonstrated superior predictive accuracy compared to other models, it encountered overfitting issues due to the lower spatial resolution of the satellite imagery.
Adab et al. (2020) introduced a machine learning model aimed at estimating surface soil moisture utilizing remote sensing data. The study employed Random Forest (RF), Support Vector Machine (SVM), and Artificial Neural Network (ANN) models to improve soil moisture predictions. The findings indicate that the RF method outperformed traditional models; however, it is constrained by limited spatial coverage and inadequate measurements, which complicate accurate estimations.
Celik et al. (2022) concentrated on predicting soil moisture from remote sensing images through a deep learning approach. The Long Short-Term Memory (LSTM) method was utilized for this prediction. The results of the experimental analysis demonstrate that the LSTM approach surpassed existing models; nevertheless, inaccuracies were observed due to variations in soil texture data derived from the remote sensing images.
However, existing studies do not adequately address the complexities involved in predicting soil moisture in the context of climatic changes, and they lack comprehensive background information. Specifically, current research studies exhibit several limitations, which are outlined as follows:
• Climate change significantly influences soil moisture patterns through variations in temperature and extreme weather events. To date, there is little research focused on the complexity of predicting soil moisture in the context of changing climate conditions.
• Remote sensing data may be compromised by noise or other atmospheric influences. Many existing methodologies have overlooked this issue, complicating the development of accurate models.
• Gaps in remote sensing data often arise due to cloud cover, sensor malfunctions, and other factors. This results in incomplete soil moisture information, which has not been adequately addressed in current methodologies.
• Soil moisture dynamics occur across various spatial and temporal scales, and existing models are unable to encapsulate all relevant information within a single framework.
This study proposes an AGRL-RBFN approach for the soil moisture prediction process. The research methodology applied delineates specific objectives designed to tackle the identified research problems:
• To forecast soil moisture trends in relation to climatic changes through the application of the AdaK-MCC approach.
• To create precise models for soil moisture prediction by eliminating the influence of extraneous factors using the SGF approach.
• To predict soil moisture utilizing comprehensive data sets that are devoid of any gaps, employing the AdaK-MCC approach.
• To consolidate all relevant information pertaining to soil moisture prediction into a singular model by leveraging the FWFCSD approach.
The organization of the research paper is delineated as follows: Section 2 details the proposed methodology. In Section 3, the effectiveness of the proposed methodology is evaluated in comparison to existing studies. Finally, Section 4 concludes the paper and discusses potential future improvements.

2. Materials and Methods: Soil Moisture Prediction Using AGRL-RBFN with IL

In the proposed study, soil moisture is forecasted utilizing remote sensing data through the AGRL-RBFN with an IL approach, while also monitoring climatic variations via seasonal mapping and categorization. The primary stages of the research include pre-processing, seasonal mapping and categorization, analysis of varying patterns and multivariate correlations, feature extraction and selection, culminating in the prediction of soil moisture. The structural diagram representing the proposed soil moisture prediction methodology is depicted in Figure 1.

2.1. Data Acquisition

Soil moisture monitoring with remote sensing and deep learning begins with the collection of remote sensing data related to soil moisture. Remote sensors can assess soil moisture across extensive regions without requiring direct physical sampling. The gathered remote sensing data $Z$ on soil moisture are represented as follows:
$$Z = \{Z_1, Z_2, Z_3, \ldots, Z_a\}$$
where $a$ is the total number of observations in $Z$. This continuous and spatially extensive data collection enhances the accuracy of soil moisture predictions.

2.2. Pre-Processing

In this pre-processing phase, Ζ is pre-processed to ensure more accurate and reliable soil moisture predictions from the remote sensing data.

2.2.1. Gap Filling

The missing values in $Z$ are filled using the AdaK-MCC method. By grouping similar observations, K-Means Clustering (KMC) identifies patterns within the data, enabling more accurate and contextually relevant imputation of missing values. However, the final clusters can be strongly influenced by the initial placement of centroids, and poor initialization may lead to suboptimal clustering that degrades the imputed values. To address this, Adaptive Momentum Coefficients (AMC) are used for centroid initialization, allowing centroids to explore the solution space more widely and reducing the risk of local minima. First, $\beta$ clusters are selected from $Z$, and the centroids $\varsigma_j$ are initialized using AMC as:
$$\varsigma_j = \vartheta\,\varsigma_{j-1} + (1-\vartheta)\,G$$
$$G = \frac{1}{a}\sum_{i=1}^{a}\sum_{j=1}^{\beta} h_{ij}(Z_i, \varsigma_j)$$
where $\varsigma_j$ and $\varsigma_{j-1}$ are the current and previous centroids, $\vartheta$ is the momentum coefficient of the centroid, $h_{ij}$ is the distance between the $i$-th input $Z_i$ and the $j$-th centroid $\varsigma_j$, and $G$ is the gradient of the loss function. Next, each data point is assigned to the nearest centroid:
$$\rho_i = \arg\min_j \lVert Z_i - \varsigma_j \rVert^2$$
Here, $\rho_i$ denotes the cluster assigned to data point $Z_i$, and $\arg\min_j$ returns the index $j$ of the centroid with the smallest distance. The centroids are then recalculated from the data points assigned to each cluster. The updated centroid $\varsigma_j$ is:
$$\varsigma_j = \frac{1}{|\rho_i|}\sum_{Z_i \in \rho_i} Z_i$$
where $|\rho_i|$ is the number of data points in the cluster of $\varsigma_j$, so the updated centroid is the mean of the cluster members. These steps are repeated until the maximum number of iterations $t_{max}$ is reached. Once the final clusters are formed, the missing values are imputed from the values of similar points in the same cluster; the gap-filled data are denoted as $D_{gfill}$. The pseudocode for AdaK-MCC is given below.
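Pseudocode of AdaK-MCC
Input: Soil Moisture Data $Z$
Output: Gaps Filled Data $D_{gfill}$
Begin
    Initialize $\beta$, $\varsigma_j$, iteration counter $t = t_{min}$, maximum iteration $t_{max}$
    While $t < t_{max}$
        Derive $\varsigma_j$ # using Adaptive Momentum Coefficients
                 $\varsigma_j = \vartheta\,\varsigma_{j-1} + (1-\vartheta)\,G$
        Assign $\rho_i$ to data $Z_i$
                 $\rho_i = \arg\min_j \lVert Z_i - \varsigma_j \rVert^2$
        Update centroid $\varsigma_j$
                 $\varsigma_j = \frac{1}{|\rho_i|}\sum_{Z_i \in \rho_i} Z_i$
        $t = t + 1$
    End while
    Impute missing values from the members of the same cluster
    Return $D_{gfill}$
End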
Therefore, the gaps in the soil moisture data are filled, leading to improved accuracy in the models used for soil moisture estimation.
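The gap-filling idea above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the parameter `theta` (standing in for the momentum coefficient), and the choice to blend the previous centroid with the current cluster mean in place of the term $G$ are all assumptions made for illustration.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def momentum_kmeans(points, k, theta=0.5, t_max=50, seed=0):
    """K-means whose centroid update keeps a fraction theta of the previous centroid.

    Assumption: the cluster mean plays the role of the term G in the text.
    """
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(t_max):
        # Assignment step: nearest centroid by squared distance.
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points]
        # Momentum update step.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if not members:
                continue  # leave an empty cluster's centroid where it is
            mean = [sum(col) / len(members) for col in zip(*members)]
            centroids[j] = [theta * c + (1 - theta) * m
                            for c, m in zip(centroids[j], mean)]
    return labels, centroids

def fill_gaps(features, moisture, k=2, **kwargs):
    """Replace None entries in `moisture` with the mean of observed values in the same cluster."""
    labels, _ = momentum_kmeans(features, k, **kwargs)
    filled = list(moisture)
    for j in range(k):
        observed = [m for m, lab in zip(moisture, labels) if lab == j and m is not None]
        if not observed:
            continue  # nothing to impute from in this cluster
        cluster_mean = sum(observed) / len(observed)
        for i, lab in enumerate(labels):
            if lab == j and filled[i] is None:
                filled[i] = cluster_mean
    return filled
```

Here `features` holds the fully observed covariates used for clustering, and `moisture` is the series that contains gaps marked as `None`.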

2.2.2. Noise Reduction

The noise in the gap-filled data $D_{gfill}$ is reduced using SGF, a smoothing filter that reduces noise while preserving critical features in soil moisture data, which makes it especially useful for analyzing spectral data related to soil moisture content. The noise-reduced data $N_{nred}$ are determined as:
$$N_{nred} = \sum_{u=-\frac{d-1}{2}}^{\frac{d-1}{2}} J_u\, D_{gfill}(v+u)$$
where $J_u$ is the filter coefficient for index $u$, $d$ is the length of the moving window, and $D_{gfill}(v+u)$ is the input at index $v+u$ within the window.
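As a small worked illustration of the weighted moving-window sum above, the sketch below applies a Savitzky-Golay filter with window length $d = 5$ and a quadratic fit, for which the standard coefficients are $(-3, 12, 17, 12, -3)/35$. The window length, the polynomial order, and the decision to leave the two samples at each edge unsmoothed are assumptions; the paper does not state its filter settings.

```python
# Standard Savitzky-Golay smoothing coefficients for a 5-point window, quadratic fit.
SG_COEFFS = [-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35]

def savgol_5pt(series):
    """Smooth a sequence with the 5-point quadratic Savitzky-Golay filter.

    Assumption: the first and last two samples are returned unchanged.
    """
    half = len(SG_COEFFS) // 2
    out = list(series)
    for v in range(half, len(series) - half):
        out[v] = sum(c * series[v + u]
                     for c, u in zip(SG_COEFFS, range(-half, half + 1)))
    return out
```

Because the filter fits a quadratic inside each window, a purely quadratic trend passes through unchanged while an isolated spike is damped.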

2.2.3. Atmospheric Correction

Here, the atmospheric noise present in the noise-reduced data $N_{nred}$ is corrected. Atmospheric noise refers to unwanted disturbances that can affect the accuracy of remote sensing measurements, often caused by factors such as lightning, thunderstorms, and other natural disturbances. The data after correction for atmospheric noise, $C_{atno}$, are expressed as follows:
$$C_{atno} = f(N_{nred})$$
where $f$ denotes the noise correction function. The purpose of this correction is to remove atmospheric noise so that the measurements accurately reflect the true moisture content of the soil. The final pre-processed data is specified as ¨ .

2.3. Season Mapping

After pre-processing the data, climatic seasons are mapped based on the date, year, and month to distinguish soil moisture patterns influenced by climatic variations. The four seasons spring $S_{spr}$, summer $S_{sum}$, autumn (fall) $S_{aut}$, and winter $S_{win}$ are mapped. The mapped seasons $\Phi_{sea}$ are expressed as follows:
$$\Phi_{sea} = \begin{cases} S_{spr}, & \text{if } m(\delta,\gamma) \in \{3, 4, 5\} \\ S_{sum}, & \text{if } m(\delta,\gamma) \in \{6, 7, 8\} \\ S_{aut}, & \text{if } m(\delta,\gamma) \in \{9, 10, 11\} \\ S_{win}, & \text{if } m(\delta,\gamma) \in \{12, 1, 2\} \end{cases}$$
where $m(\delta,\gamma)$ represents the month along with the date and year in the dataset ¨ , while the numbers $1, 2, \ldots, 12$ correspond to the 12 months of the year from January to December. This seasonal mapping facilitates a clearer comprehension of the variations in soil moisture levels across different seasons.
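The month-to-season rule above translates directly into code. The following is a minimal sketch; the function name `map_season` and the string labels are illustrative choices rather than identifiers from the paper.

```python
from datetime import date

def map_season(d):
    """Map a date to one of the four seasons using the month ranges given in the text."""
    month = d.month
    if month in (3, 4, 5):
        return "spring"
    if month in (6, 7, 8):
        return "summer"
    if month in (9, 10, 11):
        return "autumn"
    return "winter"  # months 12, 1, 2
```

For example, a record dated `date(2017, 7, 15)` falls into the summer group.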

2.4. Group of Seasons

In this phase, the data are grouped using AdaK-MCC based on the mapped seasons $\Phi_{sea}$. KMC can categorize soil moisture data according to comparable seasonal traits, which facilitates the analysis of moisture fluctuations across spring, summer, autumn, and winter. Because the resulting clusters may be affected by inadequate centroid initialization, AMC is employed for initialization, which helps to avoid local minima and improves the seasonal classification. The AdaK-MCC method is detailed in Section 2.2.1. The grouped seasons are denoted as $G_{season}$.

2.5. Varying Pattern Analysis

Next, the variability patterns in the grouped seasons $G_{season}$ are analyzed using FWFCSD, since soil moisture data exhibit seasonal patterns affected by climatic variations. Fourier Series Decomposition (FSD) breaks the data into frequency components, which helps to analyze both short-term fluctuations and long-term trends. However, FSD assumes data continuity, and abrupt changes can distort the Fourier coefficients. To address this, Frequency Weighted Fourier Coefficients (FWFC) are used to adjust the importance of each coefficient according to its frequency, enabling a more flexible analysis that adapts to changing soil moisture patterns over time. First, the time series data in $G_{season}$ are decomposed into frequency components to analyze periodic patterns as follows:
$$\chi(t) = \frac{\eta_0}{2} + \sum_{p=1}^{\infty}\left[\eta_p \cos(p\,t) + \eta'_p \sin(p\,t)\right]$$
where $\chi(t)$ represents the frequency components in $G_{season}$ at time $t$, $\eta_0$ relates to the average soil moisture value, $\eta_p$ and $\eta'_p$ are the Fourier coefficients of the cosine and sine terms, and $p$ is the index of the frequency component. The FWFC is applied to the Fourier coefficients through a weighting function $\omega_F$ associated with the frequency, which allows a flexible analysis of varying conditions:
$$\eta_0 = \frac{1}{\pi}\int_{-\pi}^{\pi} \chi(t)\, dt$$
$$\eta_p = \omega_F \cdot \frac{1}{\pi}\int_{-\pi}^{\pi} \chi(t) \cos(p\,t)\, dt$$
$$\eta'_p = \omega_F \cdot \frac{1}{\pi}\int_{-\pi}^{\pi} \chi(t) \sin(p\,t)\, dt$$
Here, $\frac{1}{\pi}$ is the normalization factor. The analyzed variability pattern data $\Psi_{vapa}$ are obtained, representing the varying soil moisture patterns.
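The weighted coefficient integrals can be approximated numerically for a sampled series. The sketch below assumes the signal is sampled uniformly over one period mapped onto $[-\pi, \pi)$ and replaces each integral by a Riemann sum. The `weight` callable stands in for $\omega_F$; its form is not specified in the text, so the default of 1.0 (which reduces to plain FSD) is an assumption.

```python
import math

def fwfc(signal, max_p, weight=lambda p: 1.0):
    """Frequency-weighted Fourier coefficients of a uniformly sampled periodic signal.

    Returns (eta0, cosine coefficients, sine coefficients) for p = 1..max_p.
    `weight(p)` is a hypothetical stand-in for the weighting function omega_F.
    """
    n = len(signal)
    dt = 2 * math.pi / n
    ts = [-math.pi + i * dt for i in range(n)]
    eta0 = sum(signal) * dt / math.pi
    cos_c, sin_c = [], []
    for p in range(1, max_p + 1):
        w = weight(p)
        cos_c.append(w * sum(x * math.cos(p * t) for x, t in zip(signal, ts)) * dt / math.pi)
        sin_c.append(w * sum(x * math.sin(p * t) for x, t in zip(signal, ts)) * dt / math.pi)
    return eta0, cos_c, sin_c

# Synthetic example: chi(t) = 3 + 2 cos(t) + sin(2t), sampled at 256 points.
n = 256
grid = [-math.pi + 2 * math.pi * i / n for i in range(n)]
demo = [3 + 2 * math.cos(t) + math.sin(2 * t) for t in grid]
eta0, cos_c, sin_c = fwfc(demo, max_p=2)
_, cos_half, _ = fwfc(demo, max_p=2, weight=lambda p: 0.5)
```

With unit weights the sketch recovers the amplitudes used to build the synthetic signal, and a weight below 1 scales the corresponding coefficient down, which is how a frequency can be de-emphasized.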

2.6. Multivariate Correlation Analysis

In this phase, the multivariate correlations are analyzed for the grouped seasons $G_{season}$ using PCA. PCA helps to find patterns and relationships among variables by transforming correlated variables into uncorrelated principal components, thereby simplifying the understanding of the underlying data structure. The PCA steps are explained below.
Step 1: The data $G_{season}$ are standardized to ensure all variables are on the same scale. The standardized data $A$ are given by:
$$A = \frac{G_{season} - \mu}{\sigma}$$
where $\mu$ and $\sigma$ are the mean and standard deviation of $G_{season}$, respectively.
Step 2: Next, the covariance matrix is computed to identify how the soil moisture variables vary with each other. The covariance matrix is constructed as:
$$B = \frac{1}{g} A A^{T}$$
where $g$ is the total number of samples in $G_{season}$ and $A^{T}$ is the transpose of $A$.
Step 3: Then, the eigenvalues $E_{value}$ and eigenvectors $E_{vector}$ are computed to find the principal components that account for the greatest variance in the data:
$$B\, E_{vector} = E_{value}\, E_{vector}$$
The principal components $\chi$ are selected from $E_{vector}$, corresponding to the highest $E_{value}$, and are calculated as follows:
$$\chi = A\, E_{vector}$$
From the obtained principal components $\chi$, the multivariate correlation analyzed data are obtained and denoted as $\Phi_{muco}$.
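The three PCA steps can be sketched in plain Python. This is an illustrative sketch, not the authors' code: it assumes rows are samples and columns are variables (so the covariance is formed as $A^{T}A/g$ under that layout), assumes no column is constant, and uses power iteration to find only the leading eigenpair. All function names are hypothetical.

```python
import math

def standardize(rows):
    """Step 1: subtract the column mean and divide by the column standard deviation."""
    g = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / g for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / g) for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in rows]

def covariance(a):
    """Step 2: covariance matrix of standardized data (rows are samples)."""
    g, n = len(a), len(a[0])
    return [[sum(r[i] * r[j] for r in a) / g for j in range(n)] for i in range(n)]

def leading_component(b, iters=200):
    """Step 3: leading eigenvalue and eigenvector of b via power iteration."""
    n = len(b)
    v = [1.0 + 0.1 * i for i in range(n)]  # slightly asymmetric starting vector
    for _ in range(iters):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    value = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
    return value, v

def project(a, v):
    """Scores of each sample on the component v."""
    return [sum(x * y for x, y in zip(r, v)) for r in a]
```

Further components could be obtained by deflating the covariance matrix and repeating the power iteration, which is omitted here for brevity.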

2.7. Feature Extraction

Here, features are extracted from both the varying pattern data $\Psi_{vapa}$ and the multivariate correlation analyzed data $\Phi_{muco}$ to enhance the model's performance by simplifying complex data and improving predictions for soil moisture estimation. The extracted features $Y$ are represented as:
$$Y = \{Y_1, Y_2, Y_3, \ldots, Y_b\}$$
Here, $b$ is the total number of features extracted from $\Psi_{vapa}$ and $\Phi_{muco}$. The features extracted from $\Psi_{vapa}$ include mean reflectance, standard deviation, peak positions, Normalized Difference Vegetation Index (NDVI), Soil Moisture Index (SMI), ratio indices, contrast, homogeneity, entropy, curvature metrics, inflection points, wavelet coefficients, energy of coefficients, reflectance ratios, and derivative spectra. The features extracted from $\Phi_{muco}$ include pairwise correlation coefficients, correlation heatmap, clustered correlation groups, representative bands, canonical variates, canonical correlation coefficients, regression weights, standardized coefficients, higher-order terms, and unique contribution scores.

2.8. Feature Selection

From the extracted features Y , the optimal features are selected using HCSWO. The Spider Wasp Optimizer (SWO) enhances soil moisture estimation by selecting high-quality feature subsets. However, SWO struggles with highly correlated features, leading to redundancy in selected features and potentially reducing the model’s performance. Therefore, a Hierarchical Correlation Matrix (HCM) is used to cluster correlated features, enabling SWO to focus on representative features from each cluster. This approach enhances model’s performance by ensuring selected features contribute uniquely to the predictive model. The steps of HCSWO are as follows.
Hierarchical Correlation Matrix
The correlation between each pair of extracted features in $Y$ is determined in order to cluster the correlated features. The correlation coefficient $cor$ is defined as follows:
$$cor = \frac{Cov(Y_b, Y_{b+1})}{\sigma_{Y_b}\, \sigma_{Y_{b+1}}}$$
Here, $Cov(Y_b, Y_{b+1})$ denotes the covariance between features $Y_b$ and $Y_{b+1}$ of $Y$, and $\sigma_{Y_b}$ and $\sigma_{Y_{b+1}}$ are the standard deviations of $Y_b$ and $Y_{b+1}$. Next, to group the correlated features, hierarchical clustering is performed on the correlation matrix $cor$ as follows:
$$\Delta_{hier} = 1 - cor$$
where $\Delta_{hier}$ is the output of the HCM, which gives the clusters of correlated features. From these clusters, the representative features are denoted as $Y_{rep}$.
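A minimal sketch of the correlation-distance grouping is shown below. It computes Pearson correlations, converts them to the distance $1 - cor$, and merges features whose distance falls below a cut-off, which corresponds to single-linkage clustering cut at that height. The `threshold` value, the function names, and the rule of taking the first member of each group as its representative are assumptions for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def cluster_features(features, threshold=0.2):
    """Group feature names whose correlation distance 1 - cor is below `threshold`."""
    names = list(features)
    parent = {name: name for name in names}

    def find(name):
        # Union-find root lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if 1 - pearson(features[a], features[b]) < threshold:
                parent[find(b)] = find(a)
    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return list(groups.values())

def representatives(groups):
    """Pick the first feature of each group as its representative (an arbitrary rule)."""
    return [group[0] for group in groups]
```

Note that with the distance $1 - cor$, strongly negatively correlated features are far apart and therefore land in different groups.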
Initialization:
Initially, the population of spider wasps $\tilde{\Lambda}$ is initialized within the search space with respect to $Y_{rep}$ as follows:
$$\tilde{\Lambda}_{m \times l} = \begin{bmatrix} Y_{rep}^{1,1} & Y_{rep}^{1,2} & \cdots & Y_{rep}^{1,l} \\ Y_{rep}^{2,1} & Y_{rep}^{2,2} & \cdots & Y_{rep}^{2,l} \\ \vdots & \vdots & \ddots & \vdots \\ Y_{rep}^{m,1} & Y_{rep}^{m,2} & \cdots & Y_{rep}^{m,l} \end{bmatrix}_{m \times l}$$
$$\tilde{\Lambda}_x = (ub_x - lb_x)\,\xi + lb_x$$
Here, $m$ is the number of (female) spider wasps and $l$ is the dimension of the search space. The random position $\tilde{\Lambda}_x$ of the $x$-th spider wasp is generated from the random variable $\xi$ and the upper and lower bounds $ub_x$ and $lb_x$.
Fitness Evaluation
Next, the fitness value $\Theta_{fit}$, which identifies the optimal feature based on the classification accuracy $\varpi$, is determined as follows:
$$\Theta_{fit} = \max(\varpi)$$
Using $\Theta_{fit}$, the prey, i.e., the best feature, is selected from the population and compared during the mating stage.
Searching Phase
In this exploration phase, the female wasp searches for spiders to feed her young. The updated position $\tilde{\Lambda}_{x+1}$ of the female wasp in the search phase is calculated as:
$$\tilde{\Lambda}_{x+1} = \begin{cases} \tilde{\Lambda}_x + \ddot{\mu}_1\left(\tilde{\Lambda}_x^{v} - \tilde{\Lambda}_x^{w}\right), & \text{if } q_3 < q_4 \\ \tilde{\Lambda}_x^{w'} + \ddot{\mu}_2\left(lb_x + q_2\,(ub_x - lb_x)\right), & \text{otherwise} \end{cases}$$
where $\ddot{\mu}_1$ and $\ddot{\mu}_2$ denote the constant motion, $q_1, q_2, q_3, q_4$ are random numbers, and $v$, $w$, $w'$ are indices randomly selected from the population.
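The initialization and search-phase update can be sketched as follows. This is a simplified illustration under stated assumptions: the coefficients `mu1` and `mu2` are passed in as plain numbers, three distinct indices other than `x` are drawn for the randomly selected wasps, and the result is clipped to the bounds, a step the text does not mention.

```python
import random

def init_population(m, lb, ub, rng):
    """Draw m positions uniformly inside [lb, ub]: position = (ub - lb) * xi + lb."""
    return [[(u - l) * rng.random() + l for l, u in zip(lb, ub)] for _ in range(m)]

def search_step(pop, x, lb, ub, mu1, mu2, rng):
    """One search-phase update for wasp x (requires at least 4 wasps)."""
    others = [i for i in range(len(pop)) if i != x]
    v, w, w2 = rng.sample(others, 3)  # three distinct randomly selected wasps
    q2, q3, q4 = rng.random(), rng.random(), rng.random()
    if q3 < q4:
        # Move from the current position along the difference of two random wasps.
        new = [p + mu1 * (a - b) for p, a, b in zip(pop[x], pop[v], pop[w])]
    else:
        # Jump from a third random wasp by a scaled random point inside the bounds.
        new = [c + mu2 * (l + q2 * (u - l)) for c, l, u in zip(pop[w2], lb, ub)]
    # Assumption: keep the new position inside the search space by clipping.
    return [min(max(val, l), u) for val, l, u in zip(new, lb, ub)]
```

In a feature-selection setting, each dimension of a position would typically be thresholded to decide whether the corresponding representative feature is kept; that encoding is not described in the text and is left out here.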
Following and Escaping Stage
This stage combines exploration and exploitation: once the prey (spider) is found, the wasp starts chasing it. The updated position is expressed as:
$$\tilde{\Lambda}_{x+1} = \begin{cases} \tilde{\Lambda}_x + C\left|2\,q_5\left(\tilde{\Lambda}_x^{v} - \tilde{\Lambda}_x\right)\right|, & \text{if } q_3 < q_4 \\ \tilde{\Lambda}_x \cdot ƛ, & \text{otherwise} \end{cases}$$
The transition between the searching phase and the following and escaping phase is given by:
$$\tilde{\Lambda}_{x+1} = \begin{cases} \text{searching-phase update}, & \text{if } q_6 < \kappa \\ \text{following-and-escaping update}, & \text{otherwise} \end{cases}$$
Here, $\kappa$ is the control parameter, $q_5, q_6$ are random numbers, $C$ is the distance controlling factor, and $ƛ$ denotes a vector.
Hunting and Nesting Behavior
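In this stage, the paralyzed spider is pulled into the pre-prepared nest. The final updated position in the hunting and nesting behavior stage is expressed as:
$$\tilde{\Lambda}_{x+1} = \begin{cases} \tilde{\Lambda}_x^{*} + \cos(2\ddot{\pi}\dot{l})\left(\tilde{\Lambda}_x^{*} - \tilde{\Lambda}_x\right), & \text{if } q_3 < q_4 \\ \tilde{\Lambda}_x^{v} + q_3\,|\ddot{\gamma}|\left(\tilde{\Lambda}_x^{v} - \tilde{\Lambda}_x\right) + (1 - q_3)\,\dot{R}\left(\tilde{\Lambda}_x^{w} - \tilde{\Lambda}_x^{w'}\right), & \text{otherwise} \end{cases}$$
where $\tilde{\Lambda}_x^{*}$ is the best available solution, $\ddot{\pi}$ is a constant, $\ddot{\gamma}$ is a number generated based on Levy flight, $\dot{l}$ is the localization factor, $\dot{R}$ is a binary vector, and $v$, $w$, $w'$ are randomly selected indices as before.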
Mating Behavior
Mating behavior is the last stage; it determines the offspring's locations and improves their positions. A new offspring is produced by performing crossover between selected male and female wasps. Based on the fitness value $\Theta_{fit}$, the new offspring then replaces current population members, so that the population is updated by keeping the spider wasps with the best fitness values. By iteratively evaluating the fitness of the spider wasps, the optimal features $\Xi_{sel}$ are obtained.

2.9. Soil Moisture Prediction

In this phase, the soil moisture is predicted from the selected optimal features $\Xi_{sel}$ using AGRL-RBFN with IL. Radial Basis Function Networks (RBFNs) can be tailored to approximate diverse functions by adjusting the shape and parameters of the radial basis functions, enabling them to model complex soil moisture dynamics and various influencing factors. However, the local relationship between input data and output predictions limits their ability to capture global patterns in soil moisture data, especially in datasets with significant variations. Therefore, Adaptive Group Lasso Regularization (AGRL) is used in the RBFN; it jointly regularizes feature groups and identifies crucial interactions for understanding soil moisture variations, particularly in remote sensing applications with related features. Additionally, IL is used, which enables the model to update its parameters with new data without retraining from scratch. This benefits soil moisture prediction, as conditions may change due to seasonal variations or climate shifts. The structural diagram of the proposed AGRL-RBFN classifier is depicted in Figure 2.
The classifier consists of an input layer, an adaptive group lasso regularization stage, a hidden layer, and an output layer.
Input Layer
This layer receives the selected features $\Xi_{sel}$ as input and passes them to the hidden layer. The inputs are represented as:
$$I_{in} = \{\Xi_{sel_1}, \Xi_{sel_2}, \ldots, \Xi_{sel_z}\}$$
where $I_{in}$ denotes the inputs, which comprise $z$ selected features $\Xi_{sel}$.
Adaptive Group Lasso Regularization
To capture global patterns and regulate feature groups, AGRL is used. It regularizes the model based on feature groups, which helps capture interactions within these groups and is particularly useful for remote sensing data. It is given by:
$$\varepsilon_{reg} = \min_{\upsilon}\left[\hat{L}(\upsilon) + \hat{\lambda}\sum_{e=1}^{E} \Gamma_e \lVert \upsilon_e \rVert_2\right]$$
Here, $\varepsilon_{reg}$ is the output of the regularized feature groups with loss function $\hat{L}(\upsilon)$ of $I_{in}$, $\upsilon$ is the coefficient vector of $I_{in}$, $\hat{\lambda}$ is the regularization parameter, $\Gamma_e$ is the weight of group $e$, $\lVert \upsilon_e \rVert_2$ is the L2 norm (Euclidean norm) of the coefficients in group $e$, and $E$ is the number of feature groups.
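To make the penalty term concrete, the sketch below evaluates $\hat{\lambda}\sum_e \Gamma_e \lVert \upsilon_e \rVert_2$ for a grouped coefficient vector and shows the group soft-thresholding operator, the standard proximal step for a group lasso penalty, which shrinks a whole group toward zero at once. The function names and the idea of pairing the penalty with this particular operator are assumptions; the paper does not state how the minimization is carried out.

```python
import math

def group_lasso_penalty(groups, group_weights, lam):
    """lam * sum_e Gamma_e * ||v_e||_2 for coefficient groups v_e."""
    return lam * sum(g_w * math.sqrt(sum(c * c for c in group))
                     for group, g_w in zip(groups, group_weights))

def group_soft_threshold(group, t):
    """Proximal operator of t * ||v||_2: shrink the group norm by t, or zero the group."""
    norm = math.sqrt(sum(c * c for c in group))
    if norm <= t:
        return [0.0 for _ in group]
    scale = 1.0 - t / norm
    return [scale * c for c in group]
```

A group whose norm is below the threshold is removed entirely, which is what lets the penalty select or discard related features together.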
Hidden Layer
Here, the regularized data $\varepsilon_{reg}$ are transformed using Radial Basis Functions (RBFs), e.g., the Gaussian function, to model non-linear relationships in soil moisture. The RBF $I_{RBF}$ serves as the activation function and is given by:
$$I_{RBF} = \exp\left(-\frac{\lVert \varepsilon_{reg} - \iota_{cen}^{n} \rVert^2}{2\sigma_n^2}\right)$$
where $\iota_{cen}^{n}$ is the $n$-th centre of $I_{RBF}$ and $\sigma_n$ is the $n$-th width of $I_{RBF}$.
Output Layer
In this layer, the output of the hidden layer is used to make the soil moisture prediction. The output $I_{out}$ of this layer is given by:
$$I_{out} = \sum_{n=1}^{l} \Gamma_n I_{RBF} + \tilde{b}_{bias}$$
Here, $\tilde{b}_{bias}$ is the bias, $\Gamma_n$ is the weight of the $n$-th neuron, and $l$ is the number of hidden neurons in the RBF layer. From this output layer, the predicted soil moisture $I_{out}$ is obtained. The pseudocode for the AGRL-RBFN classifier is as follows.
Pseudocode of AGRL-RBFN
Input: Selected Features $\Xi_{sel}$
Output: Predicted Soil Moisture $I_{out}$
Begin
     Initialize $\sigma_n$, $\iota_{cen}^{n}$
     For each $I_{in}$
          Form $I_{in} = \{\Xi_{sel_1}, \Xi_{sel_2}, \ldots, \Xi_{sel_z}\}$
          Regularize the input $I_{in}$
                $\varepsilon_{reg} = \min_{\upsilon}\left[\hat{L}(\upsilon) + \hat{\lambda}\sum_{e=1}^{E} \Gamma_e \lVert \upsilon_e \rVert_2\right]$
          Evaluate $I_{RBF}$
          Calculate $I_{out}$
                $I_{out} = \sum_{n=1}^{l} \Gamma_n I_{RBF} + \tilde{b}_{bias}$
     End for
     Return $I_{out}$
End
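The hidden and output layers amount to a Gaussian RBF evaluation followed by a weighted sum. The following sketch shows that forward pass only; the centres, widths, weights, and bias are supplied by the caller, and how they are fitted (including the regularization stage) is outside its scope. The function names are illustrative.

```python
import math

def rbf_activations(x, centers, widths):
    """Gaussian RBF response of each hidden neuron: exp(-||x - c_n||^2 / (2 * sigma_n^2))."""
    return [math.exp(-sum((a - c) ** 2 for a, c in zip(x, center)) / (2 * s ** 2))
            for center, s in zip(centers, widths)]

def rbfn_predict(x, centers, widths, weights, bias):
    """Output layer: weighted sum of the hidden activations plus a bias."""
    phi = rbf_activations(x, centers, widths)
    return sum(w * p for w, p in zip(weights, phi)) + bias
```

An input that coincides with a centre activates that neuron fully (response 1), and the response decays with distance at a rate set by the width.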
Then, IL is applied to enable the model to adapt to new data, ensuring responsiveness to environmental changes while minimizing the loss function, which is essential for addressing seasonal and climate variations in soil moisture prediction. Finally, from the AGRL-RBFN classifier and IL approach the soil moisture is predicted effectively. The performance evaluation of the proposed soil moisture prediction system is discussed below.
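The text does not specify which IL algorithm is used. As one plausible illustration of updating parameters with new data without retraining from scratch, the sketch below applies a single online gradient (least-mean-squares) step to the output weights and bias for each newly arriving sample, keeping the RBF centres and widths fixed. The update rule, the learning rate `lr`, and the function name are assumptions.

```python
def incremental_update(weights, bias, phi, target, lr=0.1):
    """One online gradient step on squared error for a single new sample.

    phi    : hidden-layer activations for the new sample
    target : observed soil moisture for that sample
    Returns the updated (weights, bias); the inputs are not modified.
    """
    prediction = sum(w * p for w, p in zip(weights, phi)) + bias
    error = target - prediction
    new_weights = [w + lr * error * p for w, p in zip(weights, phi)]
    return new_weights, bias + lr * error
```

Each call costs time proportional to the number of hidden neurons, so the model can follow seasonal drift as new observations arrive.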

3. Results and Discussion

In this section, the proposed system's performance is assessed and compared with existing models using a range of metrics. The evaluation results were obtained by implementing and testing the framework in Python.

3.1. Dataset Description

The proposed approach for soil moisture monitoring is evaluated using the hyperspectral and soil-moisture dataset, which is listed in the reference section. This publicly available dataset contains hyperspectral and soil moisture data collected during a 2017 field campaign in Karlsruhe. It includes variables such as datetime, soil moisture percentage, soil temperature, and spectral bands. The dataset is divided so that 80% is used for training and the remaining 20% is reserved for testing.
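An 80/20 partition of this kind can be sketched as below. The text does not say whether the split was random or chronological; a seeded random shuffle is assumed here, and the function name and `seed` value are illustrative.

```python
import random

def train_test_split(records, train_fraction=0.8, seed=42):
    """Shuffle a copy of the records and cut it into training and test parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]
```

For time-ordered soil moisture records, a chronological cut (training on the earlier 80%) would avoid leaking future information into training and may be preferable.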

3.2. Performance Analysis

This section compares the performance of AdaK-MCC, FWFCSD, HCSWO, and AGRL-RBFN with existing models to showcase the improvements achieved by the proposed model.

3.2.1. Performance Analysis of Gap Filling

The performance evaluation of the proposed method AdaK-MCC for gap filling is discussed here. The graphical analysis of AdaK-MCC with existing KMC, Fuzzy C Means (FCM), K-Nearest Neighbor (KNN), K-Medoid in terms of Mean Absolute Error (MAE) is given below in Figure 3.
The gaps in the dataset are filled using the AdaK-MCC technique with an MAE of 0.047, which is lower than that of the existing methods. This is because the centroids are initialized using AMC, which leads to more effective imputation. The existing techniques KMC, FCM, KNN, and K-Medoid attained higher MAEs of 0.099, 0.157, 0.386, and 0.548, respectively. Hence, AdaK-MCC filled the gaps more accurately than the existing models.
Table 1 shows the silhouette score of proposed AdaK-MCC and other existing techniques. The proposed technique achieved a silhouette score of 0.975. In contrast, the existing KMC, FCM, KNN, and K-Medoid attained 0.953, 0.929, 0.858, and 0.813 of silhouette score, which are lower than the proposed AdaK-MCC. This demonstrates the proposed approach's superiority in gap filling over existing models.
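For reference, the two metrics used in this comparison can be computed as in the sketch below. It uses the usual definitions: MAE as the mean absolute difference between imputed and true values, and the silhouette score as the mean over points of $(b - a)/\max(a, b)$, where $a$ is the mean distance to the point's own cluster and $b$ is the smallest mean distance to another cluster. Euclidean distance and the function names are assumptions, and at least two clusters are required.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)

def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (needs at least two clusters)."""
    def dist(p, q):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

    clusters = sorted(set(labels))
    scores = []
    for i, (p, own) in enumerate(zip(points, labels)):
        same = [dist(p, q) for j, (q, lab) in enumerate(zip(points, labels))
                if lab == own and j != i]
        if not same:
            scores.append(0.0)  # convention for singleton clusters
            continue
        a = sum(same) / len(same)
        b = min(
            sum(dist(p, q) for q, lab in zip(points, labels) if lab == other)
            / sum(1 for lab in labels if lab == other)
            for other in clusters if other != own
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)
```

Values close to 1 indicate compact, well-separated clusters, which is the regime reported for all methods in Table 1.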

3.2.2. Performance Analysis of Clustering

In this section, the proposed AdaK-MCC is evaluated against existing clustering methods, including KMC, FCM, KNN, and K-Medoid, for a comprehensive clustering analysis.
Figure 4 and Table 2 compare AdaK-MCC with the existing techniques in terms of clustering time and Dunn index. AdaK-MCC attained a lower clustering time of 2118 ms and a higher Dunn index of 4.897, whereas KMC, FCM, KNN, and K-Medoid attained longer clustering times and lower Dunn indices of 2.984, 2.541, 1.924, and 1.368, respectively. This improvement is attributed to the centroid initialization using the AMC technique. Thus, the robust performance of AdaK-MCC for data clustering is validated.
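The Dunn index is the ratio of the smallest distance between points in different clusters to the largest distance between points in the same cluster, so higher values indicate compact, well-separated clusters. A minimal sketch of this standard definition follows; the points and labels are hypothetical, and other variants of the index (for example, centroid-based ones) exist.

```python
import math
from itertools import combinations

def dunn_index(points, labels):
    """Smallest inter-cluster distance divided by largest cluster diameter."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("at least two clusters are required")

    # Largest distance between two points of the same cluster.
    max_diameter = max(
        (math.dist(p, q)
         for members in clusters.values()
         for p, q in combinations(members, 2)),
        default=0.0,
    )
    if max_diameter == 0.0:
        raise ValueError("every cluster has zero diameter; index is undefined")

    # Smallest distance between two points of different clusters.
    min_separation = min(
        math.dist(p, q)
        for a, b in combinations(clusters.values(), 2)
        for p in a
        for q in b
    )
    return min_separation / max_diameter

# Hypothetical 2-D points forming two well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
index = dunn_index(points, [0, 0, 1, 1])
print(index)
```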

3.2.3. Performance Analysis of Varying Pattern Analysis

Here, the proposed FWFCSD is compared with existing techniques, namely FSD, Discrete Fourier Transform (DFT), Fast Fourier Transform (FFT), and Short Time Fourier Transform (STFT).
Figure 5 depicts the cross-correlation coefficient and reconstruction error of the proposed FWFCSD and the existing techniques. FWFC is integrated so that the varying soil moisture patterns in the grouped seasons can be analyzed flexibly. With FWFC, the proposed FWFCSD achieved a high cross-correlation coefficient of 0.9715 and a low reconstruction error of 0.0589. In contrast, the existing FSD, DFT, FFT, and STFT achieved an average cross-correlation coefficient of 0.893025 and an average reconstruction error of 0.148275, which are respectively lower and higher than those of the proposed FWFCSD. Therefore, the proposed FWFCSD showcases its efficacy in analyzing varying patterns.
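To make the two metrics concrete, the sketch below decomposes a synthetic seasonal signal with a plain discrete Fourier transform, keeps only the lowest harmonics, and scores the reconstruction. It is not FWFCSD: the truncated DFT is a stand-in, the signal is hypothetical, and it is assumed that the cross-correlation coefficient is the zero-lag Pearson correlation and that the reconstruction error is the root-mean-square error, neither of which is stated in the source.

```python
import cmath
import math

def dft(signal):
    """Naive O(n^2) discrete Fourier transform of a real sequence."""
    n = len(signal)
    return [sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal))
            for k in range(n)]

def inverse_dft(coeffs):
    """Inverse transform; returns the real part of each sample."""
    n = len(coeffs)
    return [(sum(c * cmath.exp(2j * cmath.pi * k * t / n)
                 for k, c in enumerate(coeffs)) / n).real
            for t in range(n)]

def keep_low_harmonics(coeffs, keep):
    """Zero every harmonic above `keep`, preserving conjugate pairs."""
    n = len(coeffs)
    return [c if min(k, n - k) <= keep else 0j for k, c in enumerate(coeffs)]

def pearson(x, y):
    """Zero-lag Pearson correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def rmse(x, y):
    """Root-mean-square difference between two sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

# Hypothetical signal: a mean level, one slow cycle, and a small fast ripple.
n = 24
signal = [30 + 5 * math.sin(2 * math.pi * t / n)
          + 0.5 * math.sin(2 * math.pi * 6 * t / n) for t in range(n)]
reconstructed = inverse_dft(keep_low_harmonics(dft(signal), keep=1))
corr = pearson(signal, reconstructed)
err = rmse(signal, reconstructed)
print(round(corr, 4), round(err, 4))
```

Keeping only the first harmonic drops the fast ripple, so the correlation stays close to 1 while the reconstruction error equals the root-mean-square amplitude of the discarded component.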
Table 3 compares the execution time of the proposed FWFCSD method with that of the existing FSD, DFT, FFT, and STFT techniques. FWFCSD achieved an execution time of 18564 ms, whereas the existing methods took 24796 ms, 28357 ms, 32497 ms, and 39641 ms, respectively. Thus, the proposed method analyzes the varying patterns more efficiently than the existing methods.
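The source does not describe how execution times were measured. One plausible approach, sketched below under that assumption, is to time a callable with a monotonic high-resolution clock and report the best of several runs in milliseconds; the helper name `time_ms` and the workload are hypothetical.

```python
import time

def time_ms(func, *args, repeats=5, **kwargs):
    """Return the best wall-clock time in milliseconds over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args, **kwargs)
        best = min(best, (time.perf_counter() - start) * 1000.0)
    return best

# Hypothetical workload: sorting a reversed list.
elapsed = time_ms(sorted, list(range(100_000, 0, -1)))
print(f"{elapsed:.3f} ms")
```

Taking the minimum over repeats reduces the influence of background load; reporting the mean or median is an equally valid choice as long as it is applied consistently to all compared methods.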

3.2.4. Performance Analysis of Feature Selection

In this section, the proposed HCSWO is evaluated against existing feature selection techniques.
Table 4 reports the feature selection time of each method. To select features without redundancy, HCM is incorporated to cluster the correlated features used for moisture prediction. HCSWO achieved the lowest feature selection time of 2118 ms. In contrast, the existing SWO, Ant Colony Optimizer (ACO), Grey Wolf Optimizer (GWO), and Cuckoo Search Optimizer (CSO) required longer feature selection times of 2211 ms, 5178 ms, 7135 ms, and 9052 ms, respectively. Hence, the proposed feature selection method is faster than the existing techniques.
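The idea of grouping correlated features before selection can be sketched as follows. This is not HCSWO or HCM: it is a simple greedy grouping by absolute Pearson correlation, used here only as an illustration, and the threshold of 0.9, the feature names, and the values are all hypothetical.

```python
import math

def pearson(x, y):
    """Zero-lag Pearson correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def group_correlated(features, threshold=0.9):
    """Greedy grouping: a feature joins the first group whose representative
    (its first member) it correlates with at or above the threshold."""
    groups = []
    for name, values in features.items():
        for group in groups:
            if abs(pearson(features[group[0]], values)) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])  # no match: start a new group
    return groups

# Hypothetical features: two nearly collinear bands and one unrelated variable.
features = {
    "band_1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "band_2": [2.0, 4.0, 6.0, 8.0, 10.5],
    "soil_temp": [5.0, 3.0, 6.0, 2.0, 4.0],
}
groups = group_correlated(features)
representatives = [group[0] for group in groups]
print(groups, representatives)
```

Keeping one representative per group removes redundant inputs before an optimizer searches over the remaining candidates.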

3.2.5. Performance Analysis of Soil Moisture Prediction

In this section, the performance of the proposed AGRL-RBFN is assessed in comparison to other relevant existing techniques.
Figure 6 and Figure 7 illustrate the performance of the proposed AGRL-RBFN in comparison to established techniques, including RBFN, Feed Forward Neural Network (FFNN), Recurrent Neural Network (RNN), and Artificial Neural Network (ANN). The AGRL-RBFN achieved an accuracy of 98.09%, precision of 98.17%, recall of 97.24%, F1-score of 98.95%, and specificity of 97.21%, all of which surpass the results of the existing methods. This superior performance can be attributed to the integration of AGRL within the RBFN, which effectively regularizes the feature groups for soil moisture prediction, in conjunction with the application of IL. In contrast, the traditional RBFN recorded accuracy, precision, and recall values of 96.27%, 97.55%, and 95.34%, respectively, which are inferior to those of the proposed AGRL-RBFN. Therefore, the proposed model demonstrates greater efficiency in predicting soil moisture compared to existing approaches.
Table 5 compares the proposed AGRL-RBFN with existing methods in terms of False Positive Rate (FPR) and False Negative Rate (FNR). The proposed AGRL-RBFN achieved an FPR of 0.0248 and an FNR of 0.076, both lower than those of the existing techniques, which attained higher FPR and FNR. As a result, the proposed model outperformed the existing techniques in predicting soil moisture.
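The metrics reported in this subsection all derive from the four confusion-matrix counts. The sketch below shows the standard definitions for a binary setting; the counts are hypothetical and do not come from the experiments, and the source does not state how soil moisture values were mapped to classes.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "specificity": tn / (tn + fp),
        "fpr": fp / (fp + tn),
        "fnr": fn / (fn + tp),
    }

# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

By construction, specificity equals 1 minus FPR, recall equals 1 minus FNR, and the F1-score lies between precision and recall, which provides a quick consistency check on any reported set of metrics.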

3.3. Comparison with Existing Approaches

This section provides a comparative evaluation of the proposed AGRL-RBFN against current methodologies for predicting soil moisture.
Table 6 summarizes this comparison. The AGRL-RBFN technique forecasts soil moisture with high accuracy, thereby improving the overall effectiveness of the system; however, it does not specify appropriate crops for varying soil moisture conditions. In contrast, established methods such as Long Short-Term Memory (LSTM), Deep Convolutional Neural Networks (DCNNs), Attention-Aware LSTM, Deep Learning (DL) models utilizing Transfer Learning (TL), and Machine Learning (ML) regression with a pre-classification approach often encounter significant obstacles. These obstacles include substantial computational demands, limited generalizability, dependence on large datasets for training, and difficulty in adapting to rapidly changing climatic conditions. Consequently, the AGRL-RBFN exhibits superior performance in soil moisture prediction compared to these existing techniques.

4. Conclusions

Soil Moisture (SM) plays a critical role in fostering plant development and boosting agricultural productivity. A deficiency in soil moisture, whether in the short or long term, can adversely affect both rainfed and subsistence agricultural practices. As climate conditions evolve, the importance and difficulty of monitoring soil moisture will rise significantly. For plants to thrive, it is essential that soil moisture levels are maintained within an optimal range, avoiding extremes of saturation or drought. Excessively wet soils can result in the leaching of vital nutrients, while overly dry soils may lead to reduced crop yields and compromised quality.
Current research fails to sufficiently tackle the intricacies associated with forecasting soil moisture amid climatic changes, and it is deficient in providing thorough background information. This study introduced an effective system for predicting soil moisture utilizing AdaK-MCC, FWFCSD, HCSWO, and AGRL-RBFN. Initially, data collection was followed by a pre-processing phase where data gaps were addressed and seasonal clustering was performed using AdaK-MCC, achieving a mean absolute error (MAE) of 0.047 and a clustering duration of 2118 milliseconds. Subsequently, the varying patterns were examined through FWFCSD, which required an execution time of 18564 milliseconds. Following this, optimal features were identified from the extracted data within a feature selection timeframe of 2118 milliseconds using HCSWO. Ultimately, soil moisture predictions were made using AGRL-RBFN, resulting in an accuracy of 98.09%, precision of 98.17%, and an F1-score of 98.95%. The effectiveness of the proposed framework was further validated by comparing it with existing related approaches. Thus, the proposed model successfully predicted soil moisture levels.
In this study, the prediction of soil moisture has been addressed; however, the compatibility of particular crops with various soil types remains unexamined. Future research will focus on this aspect to enhance smart agricultural practices.

Author Contributions

Conceptualization, B.A.; methodology, B.A.; software, B.A.; validation, C.C.; formal analysis, B.A.; investigation, B.A.; resources, B.A.; data curation, C.C.; writing (original draft preparation), B.A. and C.C.; writing (review and editing), C.C.; visualization, B.A. and C.C.; supervision, C.C.; project administration, C.C.; funding acquisition, B.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The dataset is available at the following link: https://github.com/felixriese/hyperspectral-soilmoisture-dataset.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Abbes, A. Ben, Jarray, N., & Farah, I. R. (2024). Advances in remote sensing based soil moisture retrieval: applications, techniques, scales and challenges for combining machine learning and physical models. Artificial Intelligence Review, 57(9), 1–22. https://doi.org/10.1007/s10462-024-10734-1
  2. Adab, H., Morbidelli, R., Saltalippi, C., Moradian, M., & Ghalhari, G. A. F. (2020). Machine learning to estimate surface soil moisture from remote sensing data. Water (Switzerland), 12(11), 1–28. https://doi.org/10.3390/w12113223
  3. Babu, C. V. S., & Yadavamuthiah, K. (2024). Soil quality prediction using deep learning. Sustainable Development in AI, Blockchain, and E-Governance Applications, 171–188. https://doi.org/10.4018/979-8-3693-1722-8.ch010
  4. Celik, M. F., Isik, M. S., Yuzugullu, O., Fajraoui, N., & Erten, E. (2022). Soil Moisture Prediction from Remote Sensing Images Coupled with Climate, Soil Texture and Topography via Deep Learning. Remote Sensing, 14(21), 1–24. https://doi.org/10.3390/rs14215584
  5. Dabboor, M., Atteia, G., Meshoul, S., & Alayed, W. (2023). Deep Learning-Based Framework for Soil Moisture Content Retrieval of Bare Soil from Satellite Data. Remote Sensing, 15(7), 1–19. https://doi.org/10.3390/rs15071916
  6. Filipović, N., Brdar, S., Mimić, G., Marko, O., & Crnojević, V. (2022). Regional soil moisture prediction system based on Long Short-Term Memory network. Biosystems Engineering, 213, 30–38. https://doi.org/10.1016/j.biosystemseng.2021.11.019
  7. Hegazi, E. H., Samak, A. A., Yang, L., Huang, R., & Huang, J. (2023). Prediction of Soil Moisture Content from Sentinel-2 Images Using Convolutional Neural Network (CNN). Agronomy, 13(3), 1–18. https://doi.org/10.3390/agronomy13030656
  8. Jia, Y., Jin, S., Chen, H., Yan, Q., Savi, P., Jin, Y., & Yuan, Y. (2021). Temporal-Spatial Soil Moisture Estimation from CYGNSS Using Machine Learning Regression with a Preclassification Approach. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 14, 4879–4893. https://doi.org/10.1109/JSTARS.2021.3076470
  9. Kumar, P., Udayakumar, A., Anbarasa Kumar, A., Senthamarai Kannan, K., & Krishnan, N. (2023). Multiparameter optimization system with DCNN in precision agriculture for advanced irrigation planning and scheduling based on soil moisture estimation. Environmental Monitoring and Assessment, 195(1), 1–26. https://doi.org/10.1007/s10661-022-10529-3
  10. Lee, S. J., Choi, C., Kim, J., Choi, M., Cho, J., & Lee, Y. (2023). Estimation of High-Resolution Soil Moisture in Canadian Croplands Using Deep Neural Network with Sentinel-1 and Sentinel-2 Images. Remote Sensing, 15(16), 1–26. https://doi.org/10.3390/rs15164063
  11. Li, M., & Yan, Y. (2024). Comparative Analysis of Machine-Learning Models for Soil Moisture Estimation Using High-Resolution Remote-Sensing Data. Land, 13(8), 1–24. https://doi.org/10.3390/land13081331
  12. Li, Q., Wang, Z., Shangguan, W., Li, L., Yao, Y., & Yu, F. (2021). Improved daily SMAP satellite soil moisture prediction over China using deep learning model with transfer learning. Journal of Hydrology, 600, 1–14. https://doi.org/10.1016/j.jhydrol.2021.126698
  13. Li, Q., Zhu, Y., Shangguan, W., Wang, X., Li, L., & Yu, F. (2022). An attention-aware LSTM model for soil moisture and soil temperature prediction. Geoderma, 409, 1–17. https://doi.org/10.1016/j.geoderma.2021.115651
  14. Liu, J., Xu, Y., Li, H., & Guo, J. (2021). Soil moisture retrieval in farmland areas with sentinel multi-source data based on regression convolutional neural networks. Sensors (Switzerland), 21(3), 1–21. https://doi.org/10.3390/s21030877
  15. Liu, Q., Gu, X., Chen, X., Mumtaz, F., Liu, Y., Wang, C., Yu, T., Zhang, Y., Wang, D., & Zhan, Y. (2022). Soil Moisture Content Retrieval from Remote Sensing Data by Artificial Neural Network Based on Sample Optimization. Sensors, 22(4), 1–21. https://doi.org/10.3390/s22041611
  16. Liu, Q., Wu, Z., Cui, N., Jin, X., Zhu, S., Jiang, S., Zhao, L., & Gong, D. (2023). Estimation of Soil Moisture Using Multi-Source Remote Sensing and Machine Learning Algorithms in Farming Land of Northern China. Remote Sensing, 15(17), 1–22. https://doi.org/10.3390/rs15174214
  17. Nijaguna, G. S., Manjunath, D. R., Abouhawwash, M., Askar, S. S., Basha, D. K., & Sengupta, J. (2023). Deep Learning-Based Improved WCM Technique for Soil Moisture Retrieval with Satellite Images. Remote Sensing, 15(8), 1–17. https://doi.org/10.3390/rs15082005
  18. Pal, J., & Bodhe, H. (2023). Design and Implementation of Deep Learning Model For Soil Moisture Analysis. An IoT Based Soil Moisture Monitoring on Losant Platform. International Journal of Research in Engineering and Science, 11(4), 509–514.
  19. Patrizi, G., Bartolini, A., Ciani, L., Gallo, V., Sommella, P., & Carratu, M. (2022). A Virtual Soil Moisture Sensor for Smart Farming Using Deep Learning. IEEE Transactions on Instrumentation and Measurement, 71, 1–11. https://doi.org/10.1109/TIM.2022.3196446
  20. Peng, Y., Yang, Z., Zhang, Z., & Huang, J. (2024). A Machine Learning-Based High-Resolution Soil Moisture Mapping and Spatial–Temporal Analysis: The mlhrsm Package. Agronomy, 14(3), 1–23. https://doi.org/10.3390/agronomy14030421
  21. Roberts, T. M., Colwell, I., Chew, C., Lowe, S., & Shah, R. (2022). A Deep-Learning Approach to Soil Moisture Estimation with GNSS-R. Remote Sensing, 14(14), 1–29. https://doi.org/10.3390/rs14143299
  22. Shokati, H., Masha, M., Noroozi, A., & Abkar, A. A. (2024). Random Forest-Based Soil Moisture Estimation Using Sentinel-2, Landsat-8/9, and UAV-Based Hyperspectral Data. Remote Sensing, 16, 1–17.
  23. Singh, T., Kundroo, M., & Kim, T. (2024). WSN-Driven Advances in Soil Moisture Estimation: A Machine Learning Approach. Electronics (Switzerland), 13(8), 1–20. https://doi.org/10.3390/electronics13081590
  24. Uthayakumar, A., Mohan, M. P., Khoo, E. H., Jimeno, J., Siyal, M. Y., & Karim, M. F. (2022). Machine Learning Models for Enhanced Estimation of Soil Moisture Using Wideband Radar Sensor. Sensors, 22(15), 1–11. https://doi.org/10.3390/s22155810
  25. Wang, Y., Zhao, J., Guo, Z., Yang, H., & Li, N. (2023). Soil Moisture Inversion Based on Data Augmentation Method Using Multi-Source Remote Sensing Data. Remote Sensing, 15(7), 1–17. https://doi.org/10.3390/rs15071899
  26. Win, K., Sato, T., & Tsuyuki, S. (2024). Application of Multi-Source Remote Sensing Data and Machine Learning for Surface Soil Moisture Mapping in Temperate Forests of Central Japan. Information (Switzerland), 15(8), 1–25. https://doi.org/10.3390/info15080485
  27. Yang, Y., Li, H., Sun, M., Liu, X., & Cao, L. (2024). A Study on Hyperspectral Soil Moisture Content Prediction by Incorporating a Hybrid Neural Network into Stacking Ensemble Learning. Agronomy, 14(9), 1–17. https://doi.org/10.3390/agronomy14092054
Figure 1. Structural Diagram of Soil Moisture Prediction.
Figure 2. Structural Diagram of Proposed AGRL-RBFN classifier.
Figure 3. Graphical Analysis of Gap Filling in terms of MAE.
Figure 4. Clustering Time Analysis.
Figure 5. Pictorial Representation of Varying Pattern Analysis.
Figure 6. Performance Analysis of Soil Moisture Prediction.
Figure 7. Graphical Representation of AGRL-RBFN with Existing Techniques.
Table 1. Tabular Analysis of Silhouette Score.
Techniques Silhouette Score
Proposed AdaK-MCC 0.975
KMC 0.953
FCM 0.929
KNN 0.858
K-Medoid 0.813
Table 2. Evaluation of Clustering Methodologies.
Techniques Dunn Index
Proposed AdaK-MCC 4.897
KMC 2.984
FCM 2.541
KNN 1.924
K-Medoid 1.368
Table 3. Execution Time Analysis of FWFCSD with Existing Techniques.
Techniques Execution Time (ms)
Proposed FWFCSD 18564
FSD 24796
DFT 28357
FFT 32497
STFT 39641
Table 4. Feature Selection Time Analysis.
Techniques Feature Selection Time (ms)
Proposed HCSWO 2118
SWO 2211
ACO 5178
GWO 7135
CSO 9052
Table 5. Tabular Analysis of Soil Moisture Prediction in terms of FPR and FNR.
Techniques FPR FNR
Proposed AGRL-RBFN 0.0248 0.076
RBFN 0.481 0.108
FFNN 0.596 0.257
RNN 0.723 0.429
ANN 0.876 0.571
Table 6. Comparative Analysis of Proposed with Existing Works.
| Study | Technique/Method Used | Advantages | Limitations |
| --- | --- | --- | --- |
| Proposed | AGRL-RBFN | Accurately predicts soil moisture, enabling better irrigation management | Does not indicate suitable crops for different soil moisture conditions |
| (Filipović et al., 2022) | LSTM | Predicted soil moisture effectively and worked well for irrigation scheduling | Struggled with rapidly changing weather conditions |
| (Kumar et al., 2023) | DCNN | Estimated soil moisture with optimized water use and boosted crop yield | Required high computational power for soil moisture prediction |
| (Li et al., 2022) | Attention-Aware LSTM | Predicted soil moisture and soil temperature | Lacks spatiotemporal prediction and interpretability |
| (Li et al., 2021) | DL with TL | Predicted surface soil moisture in diverse areas | Less effective during winter and under specific environmental conditions |
| (Jia et al., 2021) | ML regression with the pre-classification strategy | Predicted soil moisture using land-specific models from a pre-classification strategy | Relied on accurate land type classification |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.