This section provides an overview of ML/DL techniques applied to channel estimation in UWA communication, highlighting their effectiveness in improving communication reliability and efficiency. Section 3.1 discusses the ML/DL models, optimizers, and training examples. Section 3.2 then elaborates on key system characteristics such as bandwidth, transmitter-receiver (Tx-Rx) distance, modulation schemes, system types, subcarriers, and CP. Section 3.3 presents performance metrics such as training loss, BER, channel prediction, and computational complexity. Finally, Section 3.4 and Section 3.5 provide an in-depth discussion of the selected research works for single-carrier and multi-carrier UWA communication systems, respectively.
3.2. Key Characteristics of Channel Estimation Techniques
Table 6 and Table 7 compare the key characteristics of single-carrier and multi-carrier UWA systems. The comparison covers bandwidth, transmitter-receiver (Tx-Rx) distance, modulation schemes, system types, subcarriers, and CP attributes. Together, these attributes expose the distinct setups and operational differences between the two strategies and indicate where a communication link can be optimized.
For single-carrier systems, Table 6 begins with the parameter of bandwidth, defined as the difference between the maximum and minimum frequencies used for transmission. Bandwidth is a pivotal characteristic of UWA channels, typically limited to a few kilohertz, which in turn restricts the transmission rates achievable in these systems. Another essential parameter is Tx-Rx distance, which denotes the distance (in kilometers) between the transmitter (Tx) and receiver (Rx), a factor that directly influences communication efficiency. The final parameter for single-carrier systems is the modulation scheme employed, such as BPSK, QPSK, or frequency hopping spread spectrum (FH-SS). These modulation schemes are tailored to address the unique challenges of UWA systems.
Conversely, Table 7 highlights system characteristics unique to multi-carrier architectures, including technologies like OFDM, Affine Frequency-Division Multiplexing (AFDM), and Generalized Frequency-Division Multiplexing (GFDM). The number of subcarriers, which is typically a power of two, further distinguishes multi-carrier systems. Modulation schemes in these systems involve mapping techniques like PSK, QAM, and Continuous Phase Modulation (CPM). Additionally, the CP is a defining feature of multi-carrier systems, implemented to mitigate the detrimental effects of multipath propagation. CP lengths are commonly set to one-fourth of the number of subcarriers, which preserves signal integrity as long as the channel delay spread stays within the CP duration.
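The one-fourth rule for the CP can be illustrated with a minimal sketch. The subcarrier count of 64 and the placeholder time-domain samples below are assumptions chosen for illustration and are not taken from any of the reviewed systems; the function names are likewise hypothetical.

```python
def add_cyclic_prefix(symbol, cp_len):
    """Prepend the last cp_len time-domain samples of an OFDM symbol to its front."""
    if not 0 <= cp_len <= len(symbol):
        raise ValueError("cp_len must lie between 0 and the symbol length")
    if cp_len == 0:
        return list(symbol)  # symbol[-0:] would copy the whole symbol, so handle 0 explicitly
    return list(symbol[-cp_len:]) + list(symbol)


def remove_cyclic_prefix(rx_symbol, cp_len):
    """Drop the first cp_len samples at the receiver."""
    return list(rx_symbol[cp_len:])


n_subcarriers = 64              # assumed value, a power of two
cp_len = n_subcarriers // 4     # one-fourth rule from the text: 16 samples
symbol = [complex(k, -k) for k in range(n_subcarriers)]  # placeholder samples
tx = add_cyclic_prefix(symbol, cp_len)
```

With these assumed values the transmitted block has 80 samples, and stripping the first 16 at the receiver recovers the original symbol. A CP length of 0, as reported for some systems in Table 7, leaves the symbol unchanged.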
It can be observed from Table 6 and Table 7 that SC-UWA systems operate with bandwidths ranging from 4 kHz to 25 kHz, with transmission distances between 0.2 km and 3 km. The primary modulation schemes used are BPSK and QPSK, with some studies employing FH-SS. In contrast, MC-UWA systems utilize higher bandwidths, ranging from 4 kHz to 1000 kHz, with transmission distances between 0.5 km and 5 km. OFDM is the dominant system, with some studies incorporating OFDMA, MIMO-OFDM, AFDM, and OTFS. Modulation schemes include BPSK, QPSK, 8-QAM, 16-QAM, and 8-PSK, with cyclic prefix lengths varying from 0 to 256. The number of subcarriers ranges from 32 to 1024, reflecting the diversity in system configurations. Overall, SC-UWA systems focus on simpler modulation techniques and lower bandwidths, while MC-UWA systems use multi-carrier architectures with wider bandwidths for improved efficiency and performance.
3.3. Comparative Analysis of Channel Estimation Techniques
The comparative analysis of selected channel estimation (CE) techniques focuses on measurable criteria used to evaluate and compare the efficiency, accuracy, and reliability of various ML/DL techniques in UWA communication. These metrics provide valuable insights into how well these systems perform under different conditions.
Table 8 and Table 9 present the performance evaluation of SC-UWA and MC-UWA systems, respectively.
One of the key metrics for performance evaluation is training loss, which measures the training effectiveness of an algorithm. It is assessed by plotting MSE or mean absolute error (MAE) against the number of epochs during the training, testing, and validation phases. MSE calculates the average squared difference between predicted and actual values, providing a measure of prediction errors. The mathematical formula for MSE is [65]:
On the other hand, MAE quantifies the sum of absolute errors divided by the sample size. Its formula is [66]:
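In their standard form, the two losses are MSE = (1/n) Σ (y_i − ŷ_i)² and MAE = (1/n) Σ |y_i − ŷ_i|, where y_i is the actual value, ŷ_i the predicted value, and n the sample size. These symbols are assumed here, since the notation of [65] and [66] is not reproduced in this section. A minimal sketch of both computations, with hypothetical sample values:

```python
def mse(actual, predicted):
    """Mean squared error: average of squared differences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mae(actual, predicted):
    """Mean absolute error: sum of absolute differences divided by sample size."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical values for illustration only.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.0, 4.5]
mse_value = mse(actual, predicted)   # 0.375
mae_value = mae(actual, predicted)   # 0.5
```

Because MSE squares each error, the single error of 1.0 contributes two thirds of the MSE but only half of the MAE, which is why MSE penalizes outliers more heavily.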
Another critical parameter is complexity, which evaluates an algorithm’s efficiency compared to existing methods. A lower complexity is represented by a positive value indicating the gain in efficiency, while a higher complexity is denoted by a negative value, highlighting the algorithm’s inferiority compared to existing techniques. It is measured using time complexity (Big-O), runtime, FLOPs, network parameters, and storage requirements. Similarly, channel prediction performance compares predicted parameters with actual measurements using MSE, Normalized Mean Square Error (NMSE), Time-varying Impulse Response (TVIR), Cumulative Distribution Function (CDF), Jensen-Shannon (JS) divergence, and information entropy. Among these, TVIR is typically evaluated visually due to its graphical nature.
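Two of the channel prediction metrics named above can be sketched briefly. The definitions used are the standard ones (NMSE as error energy normalized by the energy of the true channel, JS divergence as the symmetrized KL divergence to the mixture distribution); the reviewed papers may normalize or choose the logarithm base differently, and the function names are assumptions.

```python
import math


def nmse_db(h_true, h_pred):
    """Normalized MSE in dB between equal-length sequences of (complex) channel taps."""
    err = sum(abs(t - p) ** 2 for t, p in zip(h_true, h_pred))
    ref = sum(abs(t) ** 2 for t in h_true)
    if err == 0:
        return float("-inf")  # perfect prediction
    return 10.0 * math.log10(err / ref)


def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) between two discrete distributions."""
    def kl(a, b):
        # Terms with a zero probability in `a` contribute nothing to the sum.
        return sum(x * math.log(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

With the natural logarithm, the JS divergence is 0 for identical distributions and reaches its maximum of ln 2 for distributions with disjoint support, which makes it a bounded alternative to the KL divergence when comparing predicted and measured channel statistics.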
Lastly, BER is a widely used metric for assessing communication system performance. It evaluates the accuracy of the receiver when it uses channel estimates obtained through the proposed technique. The performance is presented through BER versus SNR plots, which compare the proposed method against existing techniques.
Table 8 and Table 9 highlight the BER gain at a fixed SNR value relative to baseline techniques. When multiple techniques are compared, the BER gain is computed against the best-performing technique. If the authors provide multiple BER plots, the highest performance is taken, with the comparison technique’s name indicated in parentheses alongside the reported BER gain.
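A minimal sketch of how a BER value and a BER gain at a fixed SNR might be computed is given below. The relative-reduction convention, (baseline − proposed) / baseline, is an assumption made for illustration, since the exact formula behind the tabulated gains is not stated here; the bit sequences and BER values are hypothetical.

```python
def bit_error_rate(tx_bits, rx_bits):
    """Fraction of received bits that differ from the transmitted bits."""
    if len(tx_bits) != len(rx_bits):
        raise ValueError("bit sequences must have equal length")
    errors = sum(t != r for t, r in zip(tx_bits, rx_bits))
    return errors / len(tx_bits)


def relative_ber_gain(ber_baseline, ber_proposed):
    """Fractional BER reduction relative to a baseline (assumed convention)."""
    return (ber_baseline - ber_proposed) / ber_baseline


# Hypothetical bit sequences: 2 errors in 8 bits.
tx_bits = [0, 1, 1, 0, 1, 0, 0, 1]
rx_bits = [0, 1, 0, 0, 1, 0, 1, 1]
ber = bit_error_rate(tx_bits, rx_bits)   # 0.25
gain = relative_ber_gain(0.02, 0.01)     # halving the BER gives a gain of 0.5
```

Under this convention a positive gain means the proposed estimator yields fewer bit errors than the baseline at the same SNR.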
It can be observed from Table 8 and Table 9 that the training loss metric in SC-UWA systems varies, with some studies using MAE and others using MSE. Complexity is not explicitly reported for most studies. The BER improvements range from
to
, indicating varying levels of performance enhancement. In MC-UWA systems, training loss values are mostly MSE, with some studies combining MSE and BER for optimization. Complexity varies, with some models requiring high memory usage (e.g., 20.7 MB, millions of parameters) and others focusing on execution time (e.g., 47.8 ms per OFDM block). BER improvements range from
to
, with some studies reporting over 40% improvement in performance. Overall, SC-UWA techniques generally focus on simpler models with lower complexity, while MC-UWA techniques use advanced architectures with higher computational demands. The BER improvements in MC-UWA systems tend to be more significant, reflecting the advantages of multi-carrier communication in underwater environments.
3.4. Discussion on ML/DL-Based Channel Estimation in SC-UWA Communication Systems
The research works on single-carrier UWA systems demonstrate a progressive evolution in addressing the challenges posed by dynamic underwater environments. Each study contributes unique methodologies and insights, collectively advancing the field of UWA communication.
The work in [
20] introduces the ML-ECQP (Machine Learning-based Environment-aware Communication Channel Quality Prediction) method, using Logistic Regression (LR) to predict channel quality based on environmental parameters such as wind speed, water temperature, air humidity, and SNR. Operating at carrier frequencies of 23 kHz and 25 kHz over a transmitter-receiver distance of 205 meters, the system employs BPSK modulation. The study highlights the effectiveness of real-world experimental data from Furong Lake in optimizing MAE to
and achieving a BER close to
. While the method improves energy consumption and network performance, challenges such as prediction errors under favorable channel conditions and limited training data remain.
Building on the foundation of channel prediction, [23] explores M-ary Spread Spectrum modulation with BPSK for single-carrier systems operating within a 4 kHz bandwidth and transmission distances of 2–3 km. The study employs advanced LSTM architectures, including Bidirectional LSTM (BiLSTM) and stacked bidirectional and unidirectional LSTM (SBULSTM), trained with the Adam optimizer on simulated channel data generated via the BELLHOP software. The system achieves robust channel prediction and significant BER improvements, particularly under low SNR conditions, with values close to
for SBULSTM models.
Further advancing real-time channel prediction, [24] introduces the Adaptive Bidirectional Gated Recurrent Unit (ABiGRU) network, which utilizes short-term Channel Impulse Response (CIR) data for online training. The integration of Space-Time Block Coding and Minimum Mean Square Error (MMSE) pre-equalization enhances system performance, achieving superior BER and prediction accuracy compared to MMSE, LSTM, and GRU models, effectively bridging the gap between predictive accuracy and real-time adaptability. Similarly, [30] refines LSTM-based models by integrating attention mechanisms, leading to the development of the AttLstmPreNet model. By focusing on the critical parts of the input sequence, this model efficiently addresses sparsity and temporal coherence in fast time-varying UWA channels. Validated with simulated data, AttLstmPreNet outperforms conventional predictors like LMS and RLS, demonstrating its effectiveness in dynamic underwater environments. Together, these studies highlight advancements in channel estimation techniques, improving accuracy and robustness in underwater communication systems.
Finally, the study [36] presents UACC-GAN (underwater acoustic communication channel-generative adversarial network), a data-driven simulator for underwater acoustic communication channels. Using generative adversarial networks, it generates realistic time-varying impulse responses based on measured channel data. The model is validated against the WATERMARK dataset and achieves BER performance similar to real measured channels. However, it requires large training samples and lacks control over specific property variations, indicating areas for improvement. This work complements the predictive models above by focusing on simulation and design for underwater communication systems.
Together, these studies showcase a cohesive progression in single-carrier UWA systems, from environment-aware channel prediction to advanced deep learning architectures and stochastic simulation techniques. Each contribution addresses specific challenges while paving the way for future advancements in underwater communication technologies.
3.5. Discussion on ML/DL-Based Channel Estimation in MC-UWA Communication Systems
The research on multi-carrier UWA systems showcases a diverse range of methodologies aimed at addressing the inherent challenges of underwater communication. Each study contributes unique insights, collectively advancing the field through innovative techniques and performance improvements.
The work in [
21] introduces CsiPreNet, a combination of CNN and LSTM networks to predict channel state information (CSI). It operates within a 4 kHz bandwidth over distances of 1–5 km, using 681 subcarriers with modulation schemes like BPSK, QPSK, 8-QAM, and 16-QAM. The model achieves low error rates, with MAE of
and BER close to
under low SNR conditions. By integrating subcarrier-bit-power allocation and an offline-online prediction approach, it enhances resource allocation and system performance, setting a benchmark for future research. A DNN-based method for OFDM systems using QPSK modulation is presented in [
25]. The model features 64 subcarriers and a cyclic prefix size of 16, utilizing a comb-type pilot arrangement with 32 pilot tones. It is trained on the WATERMARK dataset, which includes diverse underwater channel conditions. The optimization process employs the Adam optimizer, improving efficiency in environments such as Norway-Oslofjord and Brest Commercial Harbor.
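The comb-type pilot arrangement described for [25] can be illustrated with a conventional least-squares (LS) estimator, the kind of baseline that DNN-based estimators are usually compared against. This sketch is not the DNN of [25]. It assumes pilots on every other subcarrier (32 pilots among 64 subcarriers), a unit pilot symbol, a noise-free channel that varies linearly across subcarriers, and linear interpolation between pilots; all of these choices and all names are hypothetical.

```python
import bisect


def interp_complex(k, xs, ys):
    """Linear interpolation of complex values ys at positions xs; edges are held."""
    if k <= xs[0]:
        return ys[0]
    if k >= xs[-1]:
        return ys[-1]
    i = bisect.bisect_right(xs, k) - 1          # xs[i] <= k < xs[i + 1]
    w = (k - xs[i]) / (xs[i + 1] - xs[i])
    return (1 - w) * ys[i] + w * ys[i + 1]


def ls_comb_estimate(rx, pilot_idx, pilot_val, n_sc):
    """LS estimate H = Y / X at pilot tones, interpolated to all subcarriers."""
    h_pilots = [rx[k] / pilot_val for k in pilot_idx]
    return [interp_complex(k, pilot_idx, h_pilots) for k in range(n_sc)]


n_sc = 64                                   # subcarriers, as in [25]
pilot_idx = list(range(0, n_sc, 2))         # 32 pilot tones (assumed even spacing)
pilot_val = 1 + 0j                          # assumed known pilot symbol
data_val = (1 + 1j) / 2 ** 0.5              # placeholder QPSK data symbol

# Assumed smooth, noise-free frequency response for illustration.
h_true = [complex(1 + 0.01 * k, 0.02 * k) for k in range(n_sc)]
tx = [pilot_val if k % 2 == 0 else data_val for k in range(n_sc)]
rx = [h * x for h, x in zip(h_true, tx)]

h_hat = ls_comb_estimate(rx, pilot_idx, pilot_val, n_sc)
```

Because the assumed channel is linear in the subcarrier index and noise-free, the interpolated estimate matches the true response on every subcarrier up to the last pilot. Only the final subcarrier, which lies beyond the last pilot, is held at the nearest pilot value. Real UWA channels are far less benign, which is the gap learned estimators aim to close.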
In addition to the OFDM systems of [21] and [25], the study in [26] explores the use of a Denoising Autoencoder (DAE) and a DNN to improve OFDM systems by addressing impulsive noise. The DAE pre-processes noisy signals using a Gated Bernoulli-Gaussian (GBG) model, recovering clean data before a DNN, trained with the Adam optimizer over 10,000 epochs, predicts CSI. This approach achieves a BER gain of approximately
and an MSE of 0.0024 at 20 dB SNR. While a DNN-based improvement is employed in [26], the work in [27] introduces a CNN-based OFDM receiver that enhances signal recovery and communication reliability under challenging underwater conditions. Using QPSK modulation, the system achieves low MSE values and BER improvements over 0.17, with SNR gains of 2–3 dB at
. Experiments conducted over distances of 750–3160 m with the WATERMARK dataset show that the model reduces storage needs and operates efficiently with a runtime of 47.8 ms per OFDM block. The CNN predicts CIR using skip connections, enabling reliable underwater communication.
Further advancing channel estimation, a 2D BiLSTM-based CIR estimator for UWA sensor networks is proposed in [
28]. It employs Sound Speed Profile (SSP) estimation from water temperature data. The model, deployed in an OFDM system with a 5 kHz bandwidth, 462 subcarriers, and a cyclic prefix length of 22.6 ms, achieves low MSE values (0.02–0.14) during SSP estimation. It also demonstrates superior performance, with a BER gain of 2–3 dB SNR at
and a network throughput of 396 bps in practical experiments. Similarly, [29] proposes a model-driven deep learning-based estimation network. The problem is modeled as sparse signal recovery, where approximate message passing (AMP) detects and exploits the sparse nature of the channel. The ST-LAMP network, designed as a generalized estimator, is applied to random sparse UWA channels. To achieve Bayesian optimal solutions, the authors use a Gaussian mixture (GM) prior, constructing the GM-LAMP network with a shrinkage function based on the minimum mean squared error criterion.
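To make the role of the shrinkage function concrete, the sketch below runs classical AMP with a soft-threshold shrinkage on a synthetic sparse vector. It is not the learned ST-LAMP or GM-LAMP network of [29]: reading "ST" as soft threshold is an assumption, the parameters are fixed rather than learned per layer, the signal is real-valued rather than complex, and the measurement matrix, sparsity level, and tuning constant `alpha` are all hypothetical.

```python
import math
import random


def soft_threshold(r, lam):
    """Shrinkage: move each entry toward zero by lam and zero out small entries."""
    return [math.copysign(max(abs(v) - lam, 0.0), v) for v in r]


def amp(y, A, n_iter=20, alpha=1.1):
    """Classical AMP for y = A x with sparse x (real-valued, noise-free sketch)."""
    m, n = len(A), len(A[0])
    x = [0.0] * n
    v = list(y)                                   # residual
    for _ in range(n_iter):
        # Pseudo-data: current estimate plus back-projected residual.
        r = [x[j] + sum(A[i][j] * v[i] for i in range(m)) for j in range(n)]
        # Threshold proportional to the residual's RMS value.
        lam = alpha * math.sqrt(sum(e * e for e in v) / m)
        x = soft_threshold(r, lam)
        # Onsager correction: fraction of nonzero entries relative to m.
        onsager = sum(1 for e in x if e != 0) / m
        ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
        v = [y[i] - ax[i] + onsager * v[i] for i in range(m)]
    return x


random.seed(0)
m, n, k = 40, 80, 4                               # assumed problem size and sparsity
A = [[random.gauss(0.0, 1.0 / math.sqrt(m)) for _ in range(n)] for _ in range(m)]
h = [0.0] * n
for idx in random.sample(range(n), k):            # sparse "channel" with k taps
    h[idx] = random.gauss(0.0, 1.0)
y = [sum(A[i][j] * h[j] for j in range(n)) for i in range(m)]
h_hat = amp(y, A)
```

In a learned variant such as LAMP, each iteration becomes a network layer whose matrices and threshold parameters are trained, and replacing the soft threshold with a shrinkage derived from a GM prior corresponds to the step the text describes for GM-LAMP.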
While the SSP estimation [
28] and model-driven estimation [
29] contribute to enhancing UWA communication by improving prediction accuracy, robustness, and system adaptability, another deep learning-based receiver for UWA-OFDM, known as SCABNet, is presented in [
31]. It combines deep neural networks and expert knowledge. It uses an attention-enhanced bi-directional LSTM (AM-BiLSTM) for signal detection and a skip connection CNN (SC-CNN) for channel estimation. The model is trained on real experimental UWA channel data from the WATERMARK dataset [
35] and evaluated under various conditions, including fewer pilots, absence of cyclic prefixes, symbol time offset (STO), and carrier frequency offset (CFO). By achieving lower BER, it demonstrates resilience in time-varying and frequency-selective channels but requires substantial offline training, which may demand significant computational resources. Its applicability to other underwater environments beyond the WATERMARK dataset remains unexplored, and future research may focus on improving training efficiency and adapting the model for real-time applications. Similarly, [
32] presents CWGAN, which employs channel attention denoising (CAD) to enhance channel estimation accuracy in underwater acoustic communication. The system operates within a bandwidth range of 6–10 kHz and transmission distances from 0.75 km to 3.16 km, using OFDM with 512 to 1024 subcarriers and cyclic prefix lengths between 128 and 256 samples to mitigate multipath interference. It integrates CNN-based models and CWGAN-GP (Conditional Wasserstein GAN with Gradient Penalty) for improved channel prediction. Performance evaluation using MSE, BER, and channel prediction accuracy confirms enhanced signal robustness and reliability in underwater environments.
The study in [
33] presents a DL method for MIMO-OFDM underwater acoustic IoT networks. A Bi-directional Deep Pelican Convolutional Neural Network (BDPCNN) is used for signal detection and an Adaptive Recursive Least Squares (ARLS) approach is employed for channel estimation. Moreover, the Pelican Optimizer is used to optimize network settings, reducing computational complexity. The system’s performance is evaluated under different water conditions, including turbid, coastal, clear ocean, and pure seawater, using metrics such as BER, MSE, energy efficiency, and channel capacity. The findings indicate lower BER values (0.0086–0.021) at 20 dB SNR, improved MSE-based channel estimation, and better energy efficiency compared to traditional models. However, the model is trained on simulated data, which may not fully represent real-world underwater channel variations. The study does not examine hardware limitations, requiring further research to validate the approach with real-world measurement datasets and adaptive real-time applications. Similarly, [
39] introduces a Multi-Task Learning (MTL) framework for predicting time-varying channels, addressing challenges in high-dimensional CIR prediction. The framework uses a Shared Feature Learning (SFL) layer to capture multipath correlations and a Task-Specific Head (TSH) layer for refining predictions. Based on maritime experiments in Wuyuan Bay, China, the study compares different SFL configurations, including LSTM, attention-enhanced LSTM, and transformer architectures. The MTL framework achieves lower prediction errors in both one-step and multi-step forecasting. While it reduces computational complexity compared to Single-Task Learning, advanced models like transformers still require significant resources.
The work reported in [37] addresses large Doppler spread, low SNR, and complex multipath propagation in UWA OFDM communication with DenseNet-based channel estimation. Owing to its dense connectivity and feature reuse, the proposed DenseNet estimator efficiently captures complicated channel characteristics. The model is trained on WATERMARK BCH, KAU1, and KAU2 channel data [35]. Even with fewer pilot symbols, DenseNet surpasses conventional estimators in BER, MSE, and channel estimation accuracy. The study also shows that the model adapts to both BPSK and QPSK modulation and to different environmental conditions. DenseNet’s BER improvements (up to 96.3% over LS and 94.2% over MMSE) make it a reliable UWA channel estimation solution. The model reduces pilot overhead and errors in amplitude and phase estimation.
The authors in [38] propose a CNN-based channel estimator and an LSTM-based equalizer for OFDM receivers using model-driven deep learning. The CNN extracts essential features from the CFR, whereas the LSTM equalizes signals using temporal relationships. The WATERMARK dataset’s NOF1 and NCS1 channels [35] are used to train and test the model. Simulation results show that the proposed technique outperforms LS, MMSE, CNN-MLP, DNN, and ComNet in BER and in robustness to channel mismatch. Pilot symbol reduction and CP removal are also examined, showing that the proposed model performs well with fewer pilots and no CP. The study does not analyze computational complexity, so it is unknown how well the approach would scale to real-time, resource-constrained deployments. The model’s performance under large Doppler spread and extreme multipath conditions is also not examined in depth. To prove its viability, hardware implementations, real-time flexibility, and dataset expansion are needed.
The study in [40] introduces two AFDM receiver designs for doubly selective fading channels. The first design uses DNNs for data prediction but faces problems with pilot-data interference when guard intervals are absent. To solve this, the authors propose an iterative receiver that estimates the channel, detects symbols, and cancels interference without guard intervals. Simulations show that the DNN receiver effectively reduces pilot-data interference, reaching a BER of 0.001 within 0.5 dB of the ideal case. However, the model requires extensive offline training, has limited adaptability to untrained channels, and involves high computational complexity in iterative processing. Similarly, [41] presents a stacked CNN-ResNet (S-CNN-ResNet) receiver for Orthogonal Time Frequency Space (OTFS) communications. This receiver is designed to mitigate the Doppler Squint Effect. It integrates a CNN for channel feature extraction from pilot data and an enhanced ResNet for symbol recovery to improve feature learning. The model achieves better BER performance while balancing complexity against performance. However, it requires substantial training data and processing resources, which may limit its feasibility for real-time applications. Additionally, performance improvements at higher SNRs are less significant compared to iterative techniques.
To summarize, the review of ML and DL techniques for channel estimation in UWA communication highlights significant advancements in improving signal prediction, bit error rate reduction, and system reliability. The comparison between single-carrier and multi-carrier systems demonstrates how different AI-driven models, including CNNs, LSTMs, and reinforcement learning approaches, adapt to varying underwater conditions and enhance communication performance. Despite these improvements, challenges such as computational complexity, real-time adaptability, and efficient training data utilization persist. This exploration of AI-based channel estimation methods sets the stage for deeper investigations into adaptive modulation and modulation recognition techniques, ensuring a more seamless and reliable underwater communication framework.