4.1. GRU-Based Link Quality Estimation Model
In this work, we propose a GRU-based algorithm to build an LQE model. The GRU is a type of recurrent neural network (RNN) that has proven effective at capturing sequential dependencies in data [25]. It is particularly useful for sequential data of varying lengths, which makes it suitable for modeling time-varying link quality. The specific architecture and configuration of the GRU layers depend on the task complexity and the amount of training data available. Using historical channel data, the model predicts the desired output (the link quality indicator). According to [25], the GRU is more accurate and computationally efficient than the LSTM when working with large datasets. Therefore, we propose a GRU-based LQE framework to learn spatial-temporal features from channel data efficiently.
Figure 2 depicts the architecture of the GRU network. The first box represents the previous unit information, the middle box denotes the current unit, and the third box signifies the future unit information. The operation of the current unit depends on the gates, the current input, and the hidden layer. Within this architecture, the update gate determines how much of the information from the previous unit (h_{t−1}) is passed on to the current unit (h_t). The hidden layer's current state at a given time is updated by linearly interpolating between h_{t−1} and the candidate state h̃_t. The reset gate determines how new input is combined with stored information, while the update gate determines how much of the stored information is retained. Because it has no output gate, the GRU can be considered a distinct approach to integrating and combining information, and its structure is simpler than that of the LSTM [25]. The connections in Figure 2 are given by the following equations:
In a GRU, x is the input vector, h is the output vector, and W_xr, W_xz, and W_x are the weight matrices for the input vector. h̃ is the candidate output vector, and W_hr, W_hz, and W_h are the weight matrices for the previous time step. Additionally, b_r, b_z, and b are the bias terms. Element-wise multiplication is represented by ⊙, and σ and tanh represent the logistic sigmoid and hyperbolic tangent functions, respectively. The GRU has two gates: r is the reset gate and z is the update gate.
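For reference, the standard GRU formulation written with the symbols defined above would read as follows. This is a sketch under assumptions: the numbering (12) to (15) follows the references in the text, the time index t on x, h, r, and z is added for clarity, and the interpolation convention in (15), where z_t weights the candidate state, is assumed (some formulations swap z_t and 1 − z_t).

```latex
\begin{align}
z_t &= \sigma\left(W_{xz} x_t + W_{hz} h_{t-1} + b_z\right) \tag{12}\\
r_t &= \sigma\left(W_{xr} x_t + W_{hr} h_{t-1} + b_r\right) \tag{13}\\
\tilde{h}_t &= \tanh\left(W_{x} x_t + W_{h}\left(r_t \odot h_{t-1}\right) + b\right) \tag{14}\\
h_t &= \left(1 - z_t\right) \odot h_{t-1} + z_t \odot \tilde{h}_t \tag{15}
\end{align}
```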
In a GRU network, each GRU unit includes a reset gate (r) and an update gate (z) that handle past, present, and future data [26]. The update gate (z) controls how the history is combined with the newly added channel data and RIS phase shift, and is computed according to (12). Meanwhile, the reset gate (r) controls how much of the preceding GRU unit's state is combined with the new input data, and is computed according to (13). The current hidden unit is calculated in two steps. First, the candidate hidden unit is computed through (14): it takes two input vectors, the current input vector (x_t) and the previous hidden unit vector (h_{t−1}), the latter multiplied element-wise by the reset gate output, and the result passes through the tanh activation function. Second, the current hidden unit (h_t) is calculated using (15) [27].
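The two-step hidden state computation described above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: it assumes the standard GRU update in which the update gate weights the candidate state, and the helper names (`gru_step`, `init_params`), the layer sizes, and the random initialization are hypothetical.

```python
import math
import random

def sigmoid(v):
    """Element-wise logistic sigmoid of a list of floats."""
    return [1.0 / (1.0 + math.exp(-a)) for a in v]

def matvec(W, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(w * a for w, a in zip(row, v)) for row in W]

def add(*vectors):
    """Element-wise sum of equally sized vectors."""
    return [sum(vals) for vals in zip(*vectors)]

def gru_step(x_t, h_prev, p):
    """One GRU time step; returns the new hidden state h_t."""
    # Update gate z_t from the current input and the previous hidden state.
    z = sigmoid(add(matvec(p["Wxz"], x_t), matvec(p["Whz"], h_prev), p["bz"]))
    # Reset gate r_t, computed from the same two inputs.
    r = sigmoid(add(matvec(p["Wxr"], x_t), matvec(p["Whr"], h_prev), p["br"]))
    # Candidate state: the reset gate scales h_prev before the tanh.
    rh = [ri * hi for ri, hi in zip(r, h_prev)]
    pre = add(matvec(p["Wx"], x_t), matvec(p["Wh"], rh), p["b"])
    h_cand = [math.tanh(a) for a in pre]
    # Assumed convention: interpolate between h_prev and the candidate.
    return [(1 - zi) * hp + zi * hc for zi, hp, hc in zip(z, h_prev, h_cand)]

def init_params(n_in, n_hid, seed=0):
    """Hypothetical small random initialization, for illustration only."""
    rng = random.Random(seed)
    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    p = {}
    for name in ("Wxz", "Wxr", "Wx"):
        p[name] = mat(n_hid, n_in)
    for name in ("Whz", "Whr", "Wh"):
        p[name] = mat(n_hid, n_hid)
    for name in ("bz", "br", "b"):
        p[name] = [0.0] * n_hid
    return p

params = init_params(n_in=3, n_hid=4)
h = [0.0] * 4
for x in ([0.1, 0.5, -0.2], [0.3, 0.4, 0.0], [0.9, -0.1, 0.2]):
    h = gru_step(x, h, params)
print(h)
```

Because each new state is a convex combination of the previous state and a tanh output, every component of `h` stays strictly inside (−1, 1) when the initial state is zero.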
The overall architecture of the proposed LQE method is divided into three stages: data collection, preprocessing of the training dataset, and building the LQE model, as shown in Figure 3.
4.1.1. User Channel Data Simulation
We used MATLAB R2023a as the simulation tool to collect user channel data and RIS phase shift information. We first created a simulation environment containing a UAV, an RIS, and users, and then configured its parameters. With the parameters defined, we generated the channel data according to the channel model expressions in Section 2.1. To simulate the RIS phase shift, we defined the RIS parameters, such as the number of elements and the reflection coefficients, and calculated the phase shift from the channel model and the desired signal properties. In each subsequent time slot, we updated the phase shift of each RIS element based on the channel conditions and the phase shift algorithm in use. Finally, we generated the user channel data and RIS phase shifts by analyzing the received signal and extracting the relevant information for each user. The UAV flies at a fixed altitude along a horizontal trajectory, and we collected simulated user channel data at each position along that trajectory. The RIS phase shift algorithm uses 9 RIS elements to obtain data for a single UAV location and a single user. We obtained the channel gain for each UAV location, each RIS element, and each user over two links: the direct link (UAV-GU) and the virtual link (UAV-RIS-GU).
We collected data at each UAV position and, at each position, for each of the 9 RIS elements. We index the UAV positions by j, ranging from 1 to d, and the RIS elements by k, ranging from 1 to 9. Based on this notation, the collected data can be expressed as:
where D_s(j, k) represents the data collected at UAV position j and RIS element k, and the inner summation (Σ) runs over the RIS elements at each position (k = 1 to 9). This expression assumes that the data collected at each UAV position and RIS element can be combined in a meaningful way; the model is trained on the combined data.
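Using the indices j and k, the aggregation described here could be written as the double sum below. This is only a sketch of the form the text describes: the left-hand symbol D for the complete dataset is a hypothetical name, and the summation is assumed to denote collecting the records over all positions and elements.

```latex
\begin{equation*}
D = \sum_{j=1}^{d} \sum_{k=1}^{9} D_s(j, k)
\end{equation*}
```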
In this paper, we use an RIS phase shift algorithm, summarized in Algorithm 1. The algorithm starts by initializing the RIS phase shift value to 0 and setting the maximum number of phase shift iterations according to the number of RIS elements. It then enters a loop in which it obtains the current CSI from the base station, calculates the optimal phase shift value from the CSI, updates the RIS phase shift value, and increments the iteration counter. The loop continues until the iteration counter reaches the maximum number of iterations, at which point the algorithm ends. Following this procedure allows the RIS phase shift to be performed effectively, improving wireless communication performance in RIS-assisted UAV communication systems.
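The loop structure of the phase shift procedure can be sketched as follows. Several parts are assumptions made for illustration: the text does not state how the optimal phase is computed, so the sketch uses the common co-phasing rule that aligns each cascaded UAV-RIS-GU path with the direct UAV-GU path; one iteration is run per RIS element; and `make_static_csi`, `ris_phase_shift`, and `combined_gain` are hypothetical names, with random complex gains standing in for the simulated CSI.

```python
import cmath
import math
import random

def make_static_csi(num_elements, seed=0):
    """Hypothetical CSI snapshot: one direct gain and one cascaded gain per element."""
    rng = random.Random(seed)
    def cgain():
        return complex(rng.gauss(0, 1), rng.gauss(0, 1))
    return {
        "direct": cgain(),                                             # UAV-GU link
        "cascaded": [cgain() * cgain() for _ in range(num_elements)],  # UAV-RIS-GU
    }

def ris_phase_shift(num_elements, get_csi):
    """Iteratively set one phase shift per RIS element from the current CSI."""
    phases = [0.0] * num_elements   # initialize every phase shift to 0
    max_iter = num_elements         # assumed: one iteration per element
    it = 0
    while it < max_iter:
        csi = get_csi()             # obtain the current CSI
        # Assumed rule: rotate the cascaded path onto the direct path's phase.
        target = cmath.phase(csi["direct"])
        phases[it] = (target - cmath.phase(csi["cascaded"][it])) % (2 * math.pi)
        it += 1                     # increment the iteration counter
    return phases

def combined_gain(csi, phases):
    """Magnitude of the direct path plus all phase-shifted cascaded paths."""
    total = csi["direct"]
    for g, theta in zip(csi["cascaded"], phases):
        total += g * cmath.exp(1j * theta)
    return abs(total)

csi = make_static_csi(9)
phases = ris_phase_shift(9, lambda: csi)
print(combined_gain(csi, [0.0] * 9), combined_gain(csi, phases))
```

With a static channel, the co-phased gain equals the sum of the individual path magnitudes, which is the largest value any choice of phases can reach.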

The user channel data and RIS phase information can be denoted in matrix form as
X = [r_1, r_2, ..., r_M]^T,
where X ∈ ℝ^{M×F}, M denotes the total number of records, F is the number of features, and r_i is the row vector in the i-th row.
4.1.2. Data Preprocessing
The data include channel data and RIS phase shifts. After acquiring the data from the target area, pre-processing is carried out to develop an accurate LQE model. Data pre-processing consists of filling missing values, data labeling (i.e., SNR to LQI), and data normalization, which together improve the training process and the performance of the neural network. In UAV communication, some users may not receive a signal, and we treat the corresponding SNR entries as missing values. A user who does not receive a signal is assigned a noise value: the missing SNR is filled with −6. After the missing values are filled, we label the link quality levels based on the SNR value. For example, y = 1 for LQI = poor (SNR less than 10), y = 2 for LQI = fair (SNR 10-20), y = 3 for LQI = good (SNR 20-30), y = 4 for LQI = very good (SNR 30-40), and y = 5 for LQI = excellent (SNR greater than 40). We then apply the min-max normalization technique [17]. The link quality parameters in UAV communication have different ranges, so the data are normalized to between 0 and 1. This makes the data independent of the original range, allows features with different ranges to be compared, and helps reduce the model error and the training time. Because the scaling is set by the minimum and maximum values, extreme outliers compress the remaining values rather than being removed. The normalization process is shown in (17) as follows:
s' = (s − s_min) / (s_max − s_min) (17)
where s is the original data, s' is the normalized data, and s_min and s_max are the minimum and maximum values of s. After normalization, the data are ready to be fed into the neural network. Normalization can help the neural network learn faster and more accurately, since features with large numeric ranges no longer dominate the weight updates.
Algorithm 2 summarizes the preprocessing of the user channel data with RIS phase shift. After filling in missing values, labeling the data, and applying min-max normalization, the resulting data are ready for further analysis and modeling tasks.
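The three preprocessing steps can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' Algorithm 2: the function names are hypothetical, missing SNR values are assumed to be represented as `None`, and the handling of the class boundaries (10, 20, and 30 fall in the higher class, 40 falls in "very good") is an assumption, since the text gives only the ranges.

```python
NOISE_FILL_SNR = -6.0  # fill value for users that received no signal (from the text)

def fill_missing(snr_values, fill=NOISE_FILL_SNR):
    """Replace missing SNR entries (None) with the noise fill value."""
    return [fill if v is None else v for v in snr_values]

def snr_to_lqi(snr):
    """Map an SNR value to an LQI class 1..5 (boundary handling is assumed)."""
    if snr < 10:
        return 1  # poor
    if snr < 20:
        return 2  # fair
    if snr < 30:
        return 3  # good
    if snr <= 40:
        return 4  # very good
    return 5      # excellent

def min_max_normalize(values):
    """Scale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

raw_snr = [35.2, None, 12.0, 41.5, 8.3, None, 27.9]
filled = fill_missing(raw_snr)
labels = [snr_to_lqi(v) for v in filled]   # label before normalizing
normalized = min_max_normalize(filled)
print(labels)
print(normalized)
```

Labeling is done on the filled SNR values before normalization, so the thresholds apply to the original scale.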
4.1.3. Build Link Quality Estimation Model
To address the LQE problem in RIS-assisted UAV communication systems, we employ a GRU algorithm. Simulated user channel data and the RIS phase shift information are used as input. Among the various channel data parameters, we use the SNR to label the link quality indicator for the GRU-based LQE model. The GRU model is trained on the user channel data and RIS phase shift dataset. Once the features have been normalized as described in the preprocessing section using Algorithm 2, the GRU model is applied to them. Accordingly, the GRU-based LQE model is trained with the normalized dataset as input and the corresponding LQI as output. During training, the network hyperparameters [28] listed in Table 3 are adjusted by trial and error until the best-performing GRU model is obtained. As part of the RIS-assisted UAV communication system, the well-trained LQE model is then uploaded to the UAV to continuously monitor the link quality. Algorithms 3 and 4 summarize the detailed training process of the GRU-based LQE method.

LQE model offline training: We use X = (r_1, r_2, ..., r_I) to denote the sequence of input data vectors and Y to denote the corresponding LQI. The proposed LQE model is trained on the preprocessed user channel data and the corresponding LQI. The trained model then predicts the LQI for a given input.
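One common way to turn the record sequence X and the labels Y into training samples for a recurrent model is a sliding window. The sketch below assumes this scheme, which the text does not specify: each run of `window` consecutive records is paired with the LQI of its last record, and the name `make_sequences` and the toy values are hypothetical.

```python
def make_sequences(records, labels, window):
    """Pair each run of `window` consecutive records with the LQI of its last record."""
    if window < 1:
        raise ValueError("window must be at least 1")
    if len(records) != len(labels):
        raise ValueError("records and labels must have the same length")
    sequences, targets = [], []
    for end in range(window, len(records) + 1):
        sequences.append(records[end - window:end])  # records r_(end-window+1)..r_end
        targets.append(labels[end - 1])              # LQI of the most recent record
    return sequences, targets

# Toy normalized records (two features each) and their LQI labels.
X = [[0.1, 0.0], [0.4, 0.2], [0.9, 0.5], [0.7, 0.1], [0.2, 0.3]]
Y = [1, 2, 5, 4, 1]
sequences, targets = make_sequences(X, Y, window=3)
print(len(sequences), targets)
```

Each element of `sequences` has shape (window, F) and can be fed to a GRU layer one time step at a time.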
LQE model online training: After being trained offline, the LQE model is used by the UAV to predict the ground user link quality over time using Algorithm 4. The estimated ground user link quality then serves as input for optimizing UAV mobility in the deployment of RIS-assisted UAV-enabled wireless communications.
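The online stage can be pictured as a rolling buffer of the most recent records that is passed to the trained model at every time step. The sketch below is an assumption-laden illustration, not Algorithm 4: `OnlineLinkQualityEstimator` is a hypothetical wrapper, and `stand_in_model` is a placeholder rule used in place of the trained GRU so that the example runs on its own.

```python
from collections import deque

class OnlineLinkQualityEstimator:
    """Keeps the latest `window` records and queries a predictor at each step."""

    def __init__(self, predict, window):
        self.predict = predict               # callable: list of records -> LQI
        self.buffer = deque(maxlen=window)   # the oldest record drops out automatically

    def update(self, record):
        """Add one record; return an LQI once the window is full, else None."""
        self.buffer.append(record)
        if len(self.buffer) < self.buffer.maxlen:
            return None
        return self.predict(list(self.buffer))

def stand_in_model(sequence):
    """Placeholder for the trained GRU: maps the mean first feature to a class 1..5."""
    mean = sum(r[0] for r in sequence) / len(sequence)
    return min(5, int(mean * 5) + 1)

estimator = OnlineLinkQualityEstimator(stand_in_model, window=3)
stream = ([0.1], [0.1], [0.1], [0.9], [0.9])   # toy normalized records over time
outputs = [estimator.update(r) for r in stream]
print(outputs)
```

The first `window − 1` updates return `None` because the buffer is not yet full; after that, one estimate is produced per incoming record.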