Toward Intelligent Underwater Acoustic Systems: Systematic Insights into Channel Estimation and Modulation Methods

Imran A Tasadduq; Muhammad Rashid

doi:10.20944/preprints202507.0077.v1

Submitted: 26 June 2025

Posted: 01 July 2025


Abstract

Underwater acoustic communication (UWA) is essential for many mission-critical applications such as deep-sea exploration, maritime security, and environmental monitoring. However, it continues to face several challenges from multipath propagation, channel variations, and unpredictable underwater noise. Recent advancements in artificial intelligence, particularly machine learning (ML) and deep learning (DL), have significantly improved channel estimation, adaptive modulation, and modulation recognition. To assess these advancements, a systematic literature review (SLR) is needed. Existing review articles focus on isolated parts such as channel estimation or modulation recognition. As a result, they do not compare methods across domains, making it hard to understand their overall impact in real applications. Moreover, they often miss system-level comparisons and practical constraints such as bandwidth, system complexity, and real-time adaptability, making it hard to judge real-world significance. To bridge this gap, this SLR provides a comprehensive and structured synthesis of 43 recent studies (2020–2025), covering both single-carrier and multi-carrier UWA systems. It categorizes ML/DL techniques applied at the physical layer and standardizes performance metrics such as bit error rate, training loss, and computational overhead. Furthermore, it highlights key architectural trends in model design and synthesizes insights across diverse scenarios to identify existing research gaps. This level of integration and comparative analysis has not been presented in previous reviews. As a result, the holistic perspective offered by this SLR serves as a timely and valuable resource for guiding future advancements in robust, intelligent, and scalable UWA communication systems.

Keywords: Channel Estimation; Adaptive Modulation; Modulation Recognition; Underwater Acoustic Communication; Machine Learning; Deep Learning

Subject: Engineering - Electrical and Electronic Engineering

1. Introduction

Underwater acoustic (UWA) communication is used in several critical areas such as environmental monitoring, deep-sea exploration, military operations, rescue missions, commercial maritime applications, and underwater robotics [1]. Electromagnetic waves, commonly used for wireless communication on land, do not work efficiently underwater due to high absorption and scattering. Consequently, UWA communication relies on sound waves to transmit and receive information [2]. These sound waves move efficiently through water, which makes it possible to communicate over long distances, even deep under the sea [3].

Despite its importance, UWA communication faces challenges due to the complex nature of the environment. These challenges include multipath propagation, channel time variations, and unpredictable channel conditions [4,5,6]. Multipath propagation occurs when sound waves reflect off underwater surfaces. These reflections cause delays, frequency shifts, and variations in signal speed, making decoding difficult [7]. Similarly, the underwater environment changes due to factors like currents, temperature shifts, and moving objects. These fluctuations cause considerable channel time variations, leading to signal degradation over time [8]. Moreover, UWA communication channels are highly unpredictable due to ambient noise, changes in salinity, and variations in water pressure. To address these challenges, UWA systems operate at low frequencies (10–15 kHz) to extend signal range, though this comes at the cost of lower data rates. Additionally, the long-distance signal loss and the slow sound speed (1500 m/s) weaken communication reliability and efficiency [9].

Addressing the challenges of UWA communication requires various advanced techniques such as channel estimation, adaptive modulation, and modulation recognition [1]. Channel estimation identifies channel characteristics to adapt to underwater changes and enhances decoding accuracy, while equalization compensates for distortions [10,11]. Adaptive modulation dynamically selects the modulation scheme based on real-time channel conditions. It optimizes data rates in good conditions while ensuring reliability in challenging environments [12]. Modulation recognition enables the receiver to identify the correct modulation scheme. It helps in interference detection, spectrum monitoring, and military communications [13]. Together, these techniques enhance the efficiency and reliability of UWA systems.

In recent years, machine learning (ML) and deep learning (DL) have emerged as transformative tools for overcoming challenges in UWA communication [5,14,15,16]. These models help with channel estimation by learning underwater patterns. This allows the system to adjust in real time and decode signals more accurately. In adaptive modulation, reinforcement learning chooses the best modulation type based on current channel conditions, improving both speed and reliability. For modulation recognition, deep learning models like Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) detect the modulation scheme by picking out key signal features. These AI techniques work together to make underwater communication more reliable, efficient, and adaptable.

1.1. Motivation for a Systematic Literature Review

A systematic review is needed for UWA communication because current research on ML/DL-based UWA communication is scattered. Researchers have frequently applied ML/DL techniques for enhancing channel estimation, adaptive modulation, and modulation recognition. However, they often use different methods, datasets, and evaluation tools. Moreover, their results exhibit differences in training loss, modulation schemes, throughput gains, and bit error rate improvements. This makes it hard to compare results or draw clear conclusions. A well-organized review can consolidate findings from multiple studies, offering a comparative analysis of ML/DL-based approaches across single-carrier and multi-carrier systems. Given the ongoing challenges of multipath interference, extreme Doppler shifts, limited training data, and resource constraints in UWA systems, an SLR becomes important in guiding researchers toward developing more robust, scalable, and efficient AI-driven solutions.

1.2. State-of-the-Art Review Articles and their Limitations

Table 1 provides a summary of existing reviews on ML and DL techniques in UWA communication. It can be observed from Table 1 that existing reviews explore the potential of ML and DL approaches to enhance system robustness, optimize communication protocols, and improve signal processing in complex underwater environments. While these contributions are significant, the reviews exhibit noticeable research gaps. They fail to provide a holistic evaluation of ML and DL techniques for channel estimation, modulation recognition, and adaptive modulation. Consequently, there is a lack of detailed comparative analyses of ML and DL algorithms, their underlying system characteristics, and performance metrics. The broader impact of these advancements on the efficiency, scalability, and reliability of UWA communication systems also remains to be assessed.

1.3. Research Questions

The limitations of state-of-the-art review articles, highlighted in Table 1, are addressed by this SLR. In particular, it explores the answers to the following four research questions:

Research Question 1 (RQ1): How do ML and DL techniques improve channel estimation in UWA communication, and what are the key system characteristics and performance metrics of these methods?

Research Question 2 (RQ2): How do ML and DL techniques improve adaptive modulation in UWA communication, and what are the key system characteristics and performance metrics of these methods?

Research Question 3 (RQ3): How effective are ML/DL-driven modulation recognition approaches in identifying modulation schemes under complex underwater conditions, and what are their strengths and limitations?

Research Question 4 (RQ4): What innovative approaches and emerging trends in machine/deep learning can be employed to address unresolved challenges in underwater acoustic communication, and how can these advancements shape the future of intelligent, efficient, and scalable UWA systems?

This SLR aims to address key questions (RQ1 to RQ4) by evaluating existing methodologies, identifying research gaps, and proposing future directions for UWA communication systems in the context of ML/DL techniques.

1.4. SLR Framework

The framework of this SLR is shown in Figure 1. Research studies were carefully selected from four scientific databases (IEEE, Springer, Elsevier, and Google Scholar) based on specific inclusion and exclusion criteria. These criteria are detailed in Section 2, which outlines the systematic process of study selection and classification.

The selected studies were categorized into three main areas: channel estimation, adaptive modulation, and modulation recognition. Section 3 presents an overview of ML and DL-driven channel estimation techniques, discussing their effectiveness in predicting signal distortions and optimizing decoding accuracy. Section 4 explores adaptive modulation strategies that dynamically adjust transmission parameters in response to underwater conditions, improving spectral efficiency and system reliability. Section 5 analyzes ML-based modulation recognition methods, highlighting classification accuracy and robustness in noisy environments.

Section 6 addresses existing challenges while proposing innovative research directions. Section 7 provides detailed answers to formulated research questions. Finally, Section 8 outlines constraints in search coverage, database selection, inclusion criteria, and generalization, while the conclusion in Section 9 summarizes the article.

2. Literature Review Methodology

To address the research questions outlined in the introduction of this article, the systematic literature review process was conducted following the guidelines provided in [19]. A standard literature review process involves defining categories and developing a structured review protocol for selecting relevant research articles. Accordingly, Section 2.1 provides the foundational background for the various categories. Subsequently, the detailed steps of the review protocol are outlined in Section 2.2.

2.1. Detailed Exploration of Category Backgrounds

Section 1.3 states that this SLR targets three key areas—channel estimation, adaptive modulation, and modulation recognition—which are briefly introduced here.

2.1.1. Channel Estimation

Channel estimation is the process of understanding and predicting how the underwater environment affects signal transmission [20,21,22]. As mentioned in the introduction, signals in a typical UWA environment experience multipath propagation, which causes delays and distortion. The slow sound speed and varying water conditions further affect reliability. Researchers use ML/DL techniques like deep neural networks (DNN) and Long Short-Term Memory (LSTM) models to analyze patterns and improve signal prediction [23,24,25,26]. ML/DL models and techniques enhance message quality by adapting to underwater conditions in real time [27,28,29,30,31,32,33].

UWA communication channels typically operate with narrow bandwidths (a few kHz to tens of kHz) and varying transmission distances, from meters to kilometers. Modulation schemes like Binary Phase Shift Keying (BPSK), Quadrature Phase Shift Keying (QPSK), and Orthogonal Frequency Division Multiplexing (OFDM) structure signals efficiently. Metrics such as Mean Squared Error (MSE) and Bit Error Rate (BER) evaluate channel estimation and reliability. Advances in deep learning-based predictors and reinforcement learning improve UWA communication efficiency and robustness.

A typical UWA OFDM system that uses ML/DL models and techniques is shown in Figure 2. The figure depicts a multi-carrier system but is applicable to a single-carrier system as well after removing the multi-carrier-specific blocks. This is a very general illustration without specific details of the machine/deep learning block. However, in most cases, the ML/DL model is trained offline first and then used in the live system along with online training and/or tuning.

At the transmitter, the input data undergoes modulation, where symbols are mapped using techniques like QPSK or 16-QAM (Quadrature Amplitude Modulation) [29,33,34,35]. These modulated symbols are then converted into parallel streams and processed through an Inverse Fast Fourier Transform (IFFT) to generate OFDM signals. A cyclic prefix (CP) is added to mitigate inter-symbol interference caused by multipath propagation in underwater environments. The signal is then amplified and transmitted through the underwater channel [24,36,37,38].

At the receiver, the incoming signal first passes through a bandpass filter to remove unwanted noise. The cyclic prefix is stripped off, and the signal undergoes Fast Fourier Transform (FFT) to revert it back to the frequency domain. ML/DL techniques are employed for channel estimation, where models like DNNs or reinforcement learning algorithms analyze distortions and compensate for underwater channel variations. The estimated channel response is used to refine the received signal, ensuring accurate demodulation and data recovery. This ML/DL-driven approach enhances signal reliability, reduces BER, and optimizes communication efficiency in challenging underwater environments [39,40,41].
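The transmit and receive chain described above can be sketched in a few lines of NumPy. This is only an illustration under assumptions that are not taken from the reviewed studies: 64 subcarriers, a cyclic prefix of 16 samples, a hypothetical static three-tap channel, no noise, and a pilot-based least-squares (LS) estimate standing in for the ML/DL channel estimation block.

```python
import numpy as np

rng = np.random.default_rng(0)
N, CP = 64, 16                    # subcarriers and cyclic-prefix length (assumed)
h = np.array([1.0, 0.5, 0.2j])    # hypothetical 3-tap channel, shorter than the CP

def qpsk(bits):
    """Map bit pairs to unit-energy QPSK symbols."""
    b = bits.reshape(-1, 2)
    return ((1 - 2 * b[:, 0]) + 1j * (1 - 2 * b[:, 1])) / np.sqrt(2)

def ofdm_tx(symbols):
    """IFFT to the time domain, then prepend the cyclic prefix."""
    x = np.fft.ifft(symbols)
    return np.concatenate([x[-CP:], x])

def channel(tx):
    """Noise-free multipath channel: linear convolution with h."""
    return np.convolve(tx, h)

def ofdm_rx(received):
    """Strip the cyclic prefix and return to the frequency domain."""
    return np.fft.fft(received[CP:CP + N])

# Pilot block: known symbols give a per-subcarrier LS channel estimate.
pilot = qpsk(rng.integers(0, 2, 2 * N))
H_ls = ofdm_rx(channel(ofdm_tx(pilot))) / pilot

# Data block: one-tap equalization with the estimate, then hard decisions.
bits = rng.integers(0, 2, 2 * N)
Y = ofdm_rx(channel(ofdm_tx(qpsk(bits))))
eq = Y / H_ls
bits_hat = np.column_stack([eq.real < 0, eq.imag < 0]).astype(int).ravel()
ber = float(np.mean(bits_hat != bits))
print(f"BER over one noise-free block: {ber}")
```

Because the cyclic prefix is longer than the channel memory, the linear convolution looks circular to the receiver, which is what allows a single complex division per subcarrier. A learned estimator would replace the `H_ls` line.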

2.1.2. Adaptive Modulation

Adaptive modulation allows underwater communication systems to adjust modulation schemes based on real-time conditions, ensuring stable data transmission [42,43,44,45]. Since underwater environments constantly change due to temperature, salinity, pressure, and noise, fixed modulation schemes can be unreliable. Adaptive modulation selects the best scheme based on current conditions. For example, stable conditions allow higher data rate schemes like 16-QAM or OFDM, while rough conditions require more robust options like BPSK or FSK for better signal reliability. Reinforcement learning techniques, such as Q-learning and actor-critic models, automate this decision-making process by learning from previous transmissions.

These systems typically operate within bandwidths of 5 kHz to 10 kHz and cover distances from centimeters in small networks to kilometers for deep-sea communication. Machine learning and deep learning optimize throughput, bit error rate, and real-time signal processing, with some models prioritizing lower complexity for energy efficiency in resource-limited underwater devices. Figure 3 illustrates a typical underwater communication system using machine learning for adaptive modulation, where the modulator block in the transmitter can represent either a multicarrier or single-carrier system.

The adaptive modulation process starts at the transmitter, where the channel encoder adds extra data to protect against errors. The data is then modulated using techniques like BPSK, QPSK, or OFDM [22,46,47,48]. A modulation selector picks the best modulation scheme based on channel quality estimation. The amplified signal is sent through the UWA channel, where it faces challenges like attenuation, Doppler shifts, and noise. On the receiver side, the signal is filtered to remove noise. The channel decoder then reconstructs the original data, fixing any errors from transmission. ML and DL models are used to estimate channel quality. They analyze distortions and adjust the modulation scheme using real-time feedback and training data. The system keeps learning from past transmissions to improve efficiency and reliability. By using ML/DL-driven adaptive modulation, the system optimizes data rates when conditions are good. It also maintains reliable communication in difficult underwater environments. This approach improves signal clarity and reduces BER, making underwater communication more effective [43,44,45,49,50].
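As a rough illustration of how a reinforcement learning agent could drive the modulation selector, the sketch below uses a tabular Q-learning update to pick among BPSK, QPSK, and 16-QAM. Everything numerical here is an assumption made for the example: the three channel states, the packet-success probabilities, and the reward (bits per symbol on success, zero on failure). The discount factor is set to zero, so the update reduces to a contextual-bandit form of Q-learning; it is not the method of any specific reviewed study.

```python
import random

MODS = {"BPSK": 1, "QPSK": 2, "16-QAM": 4}   # bits per symbol
# Hypothetical packet-success probabilities per channel state (assumed values).
P_SUCCESS = {
    "poor": {"BPSK": 0.90, "QPSK": 0.30, "16-QAM": 0.02},
    "fair": {"BPSK": 0.99, "QPSK": 0.95, "16-QAM": 0.30},
    "good": {"BPSK": 1.00, "QPSK": 0.99, "16-QAM": 0.95},
}

def train(steps=30000, epsilon=0.1, gamma=0.0, seed=0):
    """Epsilon-greedy tabular Q-learning with a 1/visits step size."""
    rnd = random.Random(seed)
    states, actions = list(P_SUCCESS), list(MODS)
    q = {s: {a: 0.0 for a in actions} for s in states}
    visits = {s: {a: 0 for a in actions} for s in states}
    state = rnd.choice(states)
    for _ in range(steps):
        # Explore with probability epsilon, otherwise exploit the current table.
        if rnd.random() < epsilon:
            action = rnd.choice(actions)
        else:
            action = max(actions, key=lambda a: q[state][a])
        success = rnd.random() < P_SUCCESS[state][action]
        reward = float(MODS[action]) if success else 0.0
        next_state = rnd.choice(states)      # channel state drawn independently
        visits[state][action] += 1
        alpha = 1.0 / visits[state][action]
        target = reward + gamma * max(q[next_state].values())
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    # Greedy policy: best-valued modulation for each channel state.
    return {s: max(actions, key=lambda a: q[s][a]) for s in states}

policy = train()
print(policy)
```

With these assumed probabilities the learned policy favors the robust scheme in the poor state and the high-rate scheme in the good state, which mirrors the qualitative behavior described in the text.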

2.1.3. Modulation Recognition

Modulation recognition helps underwater receivers automatically identify the type of modulation used in a transmitted signal [51,52,53,54]. Since underwater signals can be affected by environmental variations, having an intelligent system that recognizes modulation patterns makes underwater communication more efficient and less error-prone. Machine/deep learning plays a significant role in this process, with deep learning techniques like Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) being widely used to classify modulation types. The system is trained to analyze received signals, extract key features, and match them to known modulation patterns like BPSK, QPSK, QAM, FSK, DSSS (Direct Sequence Spread Spectrum), and OFDM. Modulation recognition techniques typically operate within bandwidths ranging from 1 kHz to 30 kHz, covering short-range (hundreds of meters) to long-distance communication (several kilometers). Performance metrics such as cross-entropy loss, accuracy percentage, and precision rates help researchers evaluate how well these models perform [55,56,57,58].
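The three metrics named above can be computed directly from a classifier's output probabilities. The following is a minimal sketch; the class list and the four probability vectors are made-up values chosen only to exercise the functions, not results from any reviewed study.

```python
import math

CLASSES = ["BPSK", "QPSK", "16-QAM"]   # illustrative label set

def predict(probs):
    """Index of the most probable class for each signal."""
    return [max(range(len(p)), key=p.__getitem__) for p in probs]

def cross_entropy(probs, labels):
    """Mean negative log-probability assigned to the true class."""
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)

def accuracy(probs, labels):
    """Fraction of signals whose predicted class matches the label."""
    preds = predict(probs)
    return sum(int(a == b) for a, b in zip(preds, labels)) / len(labels)

def precision(probs, labels, cls):
    """Of the signals predicted as cls, the fraction that truly are cls."""
    hits = [y for y_hat, y in zip(predict(probs), labels) if y_hat == cls]
    return sum(int(y == cls) for y in hits) / len(hits) if hits else 0.0

# Hypothetical softmax outputs for four received signals and their true labels.
probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.5, 0.4, 0.1], [0.2, 0.2, 0.6]]
labels = [0, 1, 1, 2]

print("cross-entropy:", cross_entropy(probs, labels))
print("accuracy:", accuracy(probs, labels))
print("precision (BPSK):", precision(probs, labels, 0))
```

In this toy case one QPSK signal is misread as BPSK, so accuracy is 3 out of 4 and BPSK precision is 1 out of 2.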

An AI-driven modulation recognition system for underwater acoustic channels analyzes received signals to determine modulation types, ensuring accurate signal interpretation [59,60,61,62]. The process starts at the transmitter, where the channel encoder adds redundancy to protect against errors. The encoded data is then modulated before transmission. At the receiver, machine learning and deep learning techniques assess distortions, perform channel estimation, and reconstruct the original data. Reinforcement learning methods, such as Q-learning and actor-critic models, refine the AI model by learning from previous transmissions. This approach enhances signal clarity, reduces bit error rate, and improves system reliability in underwater acoustic networks [63,64]. Figure 4 illustrates this system, where the modulator block can represent either a multicarrier or single-carrier setup.

2.2. Development of the Review Protocol

The development of a review protocol is a critical component of the literature review process. The protocol outlined in this section incorporates all the necessary steps to ensure a systematic and thorough review. These steps include the criteria for selecting and excluding research articles (Section 2.2.1), the literature search process (Section 2.2.2), and the approach used to extract and analyze the selected studies (Section 2.2.3).

2.2.1. Criteria for Selection and Rejection

Subject Relevance: Research must be directly relevant to the context of this study and contribute to answering the formulated research questions.
Publication Date (2020–2025): Only research published between 2020 and 2025 is included. Studies published before 2020 are excluded.
Publisher: Selected research must be published in one of the three renowned scientific databases—IEEE, Springer, or Elsevier. Additionally, to ensure comprehensive coverage, the first 10 pages of Google Scholar were searched for each key term, allowing consideration of articles from other databases.
Impactful Contributions: Selected research must present key advancements in UWA communication, using ML/DL to enhance channel estimation, adaptive modulation or modulation recognition.
Results-Oriented: Studies with proposals and findings supported by solid evidence, facts, and experimental validation are favored.
Repetition: Identical or redundant research within the same context is excluded.

2.2.2. Literature Search Process

This SLR prioritizes high-quality sources to ensure a comprehensive understanding of ML/DL applications in UWA communication. IEEE, Elsevier, and Springer were selected for their extensive coverage and rigorous peer review. Journal articles were favored over conference proceedings for their comprehensive and validated findings. To broaden the scope, we also searched the first 10 pages of Google Scholar for each key term, identifying high-impact studies beyond traditional databases. This ensured the inclusion of valuable research not indexed in IEEE, Elsevier, or Springer. Limiting the search to 10 pages maintained focus on authoritative, widely cited works while filtering out irrelevant entries. This approach strengthens the review’s foundation by capturing key advancements, challenges, and future directions.

Table 2 presents a breakdown of search terms used across three major scientific databases—IEEE, Elsevier, and Springer. A combination of search terms such as channel estimation, adaptive modulation, modulation recognition, underwater acoustic communication, and machine/deep learning was employed, along with a publication filter (2020–2025). The table highlights the number of search results retrieved for each term, reflecting the extent of research coverage in these databases. Figure 5 illustrates the sequence of steps undertaken during the selection of research articles. In total, search terms were applied across four scientific databases, resulting in approximately 41,261 entries. Using the predefined selection and rejection criteria, 26,230 studies were excluded based on their title, 8,480 based on their abstract, and 5,495 following a general review. Subsequently, the remaining 1,356 articles underwent a detailed evaluation, culminating in the selection of 43 highly relevant research articles.
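The screening funnel can be checked with simple arithmetic. The sketch below uses only the counts quoted in the paragraph above and subtracts each exclusion stage in turn.

```python
retrieved = 41_261
excluded = {"title": 26_230, "abstract": 8_480, "general review": 5_495}

remaining = retrieved
for stage, count in excluded.items():
    remaining -= count
    print(f"after {stage} screening: {remaining}")
```

Taken exactly as stated, the three exclusion counts leave 1,056 records rather than the 1,356 quoted for detailed evaluation. The text does not indicate which figure accounts for the difference of 300, so the counts are worth checking against Figure 5.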

2.2.3. Systematic Approach Used in Extracting and Analyzing Studies

Table 3 provides a framework for extracting, analyzing, and classifying research studies. It categorizes studies into key domains: channel estimation, adaptive modulation, and modulation recognition. Consequently, it enables comparative assessments of methodologies and performance metrics. Additionally, Figure 6 presents statistical data on research articles, organized by publication year.

3. Results on ML/DL-Based Channel Estimation

This section provides an overview of ML/DL techniques applied to channel estimation in UWA communication, highlighting their effectiveness in improving communication reliability and efficiency. The discussion includes various ML/DL models, optimizers, and training examples in Section 3.1. Subsequently, Section 3.2 elaborates key system characteristics such as bandwidth, transmitter-receiver (Tx-Rx) distance, modulation schemes, system types, subcarriers, and CP. Similarly, Section 3.3 provides performance metrics such as training loss, BER, channel prediction and computational complexity. Finally, Section 3.4 and Section 3.5 provide an in-depth discussion on the selected research works for single-carrier and multi-carrier UWA communication systems respectively.

3.1. Overview of Channel Estimation Approaches

Table 4 and Table 5 provide an overview of ML/DL techniques, optimizers, and training examples. These methods are used to estimate channels for both single-carrier UWA (SC-UWA) and multi-carrier UWA (MC-UWA) communication systems. The ML/DL techniques column includes algorithms like DNN and CNN. These methods make predictions more accurate and reduce errors caused by noise and interference in UWA environments. Optimizers like Adam, Adagrad, Adadelta, and Nadam minimize the training loss by iteratively updating the model weights. Adam is especially popular because it adapts the step size of each parameter automatically, making optimization more efficient. The training examples column shows the types of data used. Some data comes from direct measurements in UWA environments, while others are generated using models like [34]. Additionally, some datasets are based on actual measured data, such as [35]. The quality and diversity of these datasets significantly impact the effectiveness of the techniques. It can be observed from Table 4 and Table 5 that single-carrier studies favor simpler models such as LR, LSTM, ABiGRU and UACC-GAN, with Adam being the most commonly used optimizer. In contrast, the multi-carrier table presents more complex models like CNN, DNN, BiLSTM, Transformers, and GAN-based architectures, showing a trend toward advanced deep learning techniques. Adam remains the dominant optimizer, though some studies use RMSprop and Pelican. Overall, SC-UWA studies focus on simpler architectures, while MC-UWA techniques employ more sophisticated models for improved accuracy and efficiency.
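To make the remark about Adam concrete, the sketch below implements the standard Adam update for a single scalar parameter and applies it to a toy quadratic loss. The loss function, starting point, learning rate, and step count are assumptions chosen for illustration; in the reviewed studies the same update would be applied to every network weight by a deep learning framework.

```python
import math

def adam_minimize(grad, x0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Minimize a scalar function with Adam, given its gradient."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g        # running mean of gradients
        v = beta2 * v + (1 - beta2) * g * g    # running mean of squared gradients
        m_hat = m / (1 - beta1 ** t)           # bias correction for the early steps
        v_hat = v / (1 - beta2 ** t)
        # The division by sqrt(v_hat) is what rescales the step per parameter.
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

# Toy loss (x - 3)^2 with gradient 2 (x - 3); the minimum is at x = 3.
x_min = adam_minimize(lambda x: 2.0 * (x - 3.0), x0=0.0)
print(x_min)
```

The first step has magnitude close to `lr` regardless of the gradient scale, which is the sense in which the step size is adjusted automatically.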

3.2. Key Characteristics of Channel Estimation Techniques

Table 6 and Table 7 provide a comprehensive comparison of the key characteristics of single-carrier and multi-carrier UWA systems. These comparisons highlight critical parameters, including bandwidth, transmission-reception (Tx-Rx) distance, modulation schemes, system types, subcarriers, and CP attributes. By analyzing these attributes, the tables shed light on the distinctive setups and operational differences between the two strategies, offering valuable insights for optimizing communication.

For single-carrier systems, Table 6 begins with the parameter of bandwidth, defined as the difference between the maximum and minimum frequencies used for transmission. Bandwidth is a pivotal characteristic of UWA channels, typically limited to a few kilohertz, which in turn restricts the transmission rates achievable in these systems. Another essential parameter is Tx-Rx distance, which denotes the distance (in kilometers) between the transmitter (Tx) and receiver (Rx), a factor that directly influences communication efficiency. The final parameter for single-carrier systems is the modulation scheme employed, such as BPSK, QPSK, or frequency hopping spread spectrum (FH-SS). These modulation schemes are tailored to address the unique challenges of UWA systems.

Conversely, Table 7 highlights system characteristics unique to multi-carrier architectures, including technologies like OFDM, Affine Frequency-Division Multiplexing (AFDM), and Generalized Frequency-Division Multiplexing (GFDM). The number of subcarriers, which is typically a power of two, further distinguishes multi-carrier systems. Modulation schemes in these systems involve mapping techniques like PSK, QAM, and Continuous Phase Modulation (CPM), contributing to efficient data handling. Additionally, the CP is a defining feature of multi-carrier systems, implemented to mitigate the detrimental effects of multipath propagation. CP lengths are commonly set to one-fourth of the number of subcarriers, ensuring reliable signal integrity even in fluctuating underwater conditions.
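The one-fourth rule for the CP has a direct cost and benefit that can be worked out numerically. The sketch below assumes a critically sampled system whose sample rate equals the bandwidth, a hypothetical 10 kHz bandwidth with 1024 subcarriers, and the nominal 1500 m/s sound speed quoted earlier. The function name and returned fields are illustrative.

```python
def cp_budget(n_sub, bandwidth_hz, sound_speed=1500.0):
    """Timing and overhead of a CP set to one-fourth of the subcarrier count."""
    n_cp = n_sub // 4
    t_symbol = n_sub / bandwidth_hz    # useful OFDM symbol duration in seconds
    t_cp = n_cp / bandwidth_hz         # guard interval in seconds
    return {
        "n_cp": n_cp,
        "t_symbol_s": t_symbol,
        "t_cp_s": t_cp,
        # Fraction of transmitted samples that carry data.
        "efficiency": n_sub / (n_sub + n_cp),
        # Largest extra path length whose echo still falls inside the CP.
        "max_excess_path_m": t_cp * sound_speed,
    }

budget = cp_budget(n_sub=1024, bandwidth_hz=10_000)
for key, value in budget.items():
    print(key, value)
```

Under these assumptions the CP costs 20% of the transmitted samples and absorbs echoes delayed by up to 25.6 ms, which corresponds to roughly 38 m of additional path length.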

It can be observed from Table 6 and Table 7 that SC-UWA systems operate with bandwidths ranging from 4 kHz to 25 kHz, with transmission distances between 0.2 km and 3 km. The primary modulation schemes used are BPSK and QPSK, with some studies employing FH-SS. In contrast, MC-UWA systems utilize higher bandwidths, ranging from 4 kHz to 1000 kHz, with transmission distances between 0.5 km and 5 km. OFDM is the dominant system, with some studies incorporating OFDMA, MIMO-OFDM, AFDM, and OTFS. Modulation schemes include BPSK, QPSK, 8-QAM, 16-QAM, and 8-PSK, with cyclic prefix lengths varying from 0 to 256. The number of subcarriers ranges from 32 to 1024, reflecting the diversity in system configurations. Overall, SC-UWA systems focus on simpler modulation techniques and lower bandwidths, while MC-UWA systems use advanced multi-carrier architectures for improved efficiency and performance.

3.3. Comparative Analysis of Channel Estimation Techniques

The comparative analysis of selected channel estimation (CE) techniques focuses on measurable criteria used to evaluate and compare the efficiency, accuracy, and reliability of various ML/DL techniques in UWA communication. These metrics provide valuable insights into how well these systems perform under different conditions. Table 8 and Table 9 aim to present the performance evaluation of systems classified as SC-UWA and MC-UWA systems, respectively.

One of the key metrics for performance evaluation is training loss, which measures the training effectiveness of an algorithm. It is assessed by plotting MSE or mean absolute error (MAE) against the number of epochs during the training, testing, and validation phases. MSE calculates the average squared difference between predicted and actual values, providing a measure of prediction errors. With $y_i$ denoting the actual value, $\hat{y}_i$ the predicted value, and $n$ the number of samples, the formula for MSE is [65]:

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_{i} - \hat{y}_{i})^{2} \tag{1}$$

On the other hand, MAE quantifies the sum of absolute errors divided by the sample size. Its formula is [66]:

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_{i} - \hat{y}_{i}| \tag{2}$$
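Equations (1) and (2) translate directly into code. The short sketch below evaluates both on a made-up set of four actual and predicted values; the numbers are purely illustrative.

```python
def mse(y, y_hat):
    """Mean squared error, Equation (1)."""
    return sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y)

def mae(y, y_hat):
    """Mean absolute error, Equation (2)."""
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)

y_true = [1.0, 2.0, 3.0, 4.0]   # hypothetical actual values
y_pred = [1.5, 2.0, 2.0, 4.5]   # hypothetical predictions

print("MSE:", mse(y_true, y_pred))
print("MAE:", mae(y_true, y_pred))
```

The errors here are -0.5, 0, 1, and -0.5. MSE weights the single large error more heavily than MAE does, which is why the two metrics are not interchangeable when comparing studies.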

Another critical parameter is complexity, which evaluates an algorithm’s efficiency compared to existing methods. A positive value denotes lower complexity, i.e., a gain in efficiency, while a negative value denotes higher complexity than existing techniques. Complexity is measured using time complexity (Big-O), runtime, FLOPs, network parameters, and storage requirements. Similarly, channel prediction performance compares predicted parameters with actual measurements using MSE, Normalized Mean Square Error (NMSE), Time-varying Impulse Response (TVIR), Cumulative Distribution Function (CDF), Jensen-Shannon (JS) divergence, and information entropy. Among these, TVIR is typically evaluated visually due to its graphical nature.
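Two of the channel prediction metrics listed above, NMSE and JS divergence, have compact standard definitions. The sketch below uses the usual forms (squared error normalized by the energy of the true channel, and the symmetrized KL divergence with base-2 logarithms); individual studies may normalize differently, and the sample channel taps and distributions are invented for the example.

```python
import math

def nmse_db(h, h_hat):
    """Normalized mean square error between true and estimated taps, in dB."""
    err = sum(abs(a - b) ** 2 for a, b in zip(h, h_hat))
    ref = sum(abs(a) ** 2 for a in h)
    return 10 * math.log10(err / ref)

def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    mix = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with zero probability contribute nothing to the sum.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, mix) + 0.5 * kl(q, mix)

h_true = [1 + 0j, 0.5j]      # hypothetical channel taps
h_est = [0.9 + 0j, 0.5j]     # hypothetical estimate

print("NMSE (dB):", nmse_db(h_true, h_est))
print("JS, disjoint:", js_divergence([1.0, 0.0], [0.0, 1.0]))
print("JS, identical:", js_divergence([0.5, 0.5], [0.5, 0.5]))
```

With base-2 logarithms the JS divergence is bounded between 0 for identical distributions and 1 for distributions with no overlap, which makes it convenient for comparing generated and measured channel statistics.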

Lastly, BER is a widely used metric for assessing communication system performance. It evaluates the accuracy of the receiver using channel estimates obtained through the proposed technique. The performance is presented through BER versus SNR plots, which compare the proposed method against existing techniques. Table 8 and Table 9 highlight the BER gain at a fixed SNR value relative to baseline techniques. When multiple techniques are compared, the BER gain is computed against the best-performing technique. If the authors provide multiple BER plots, the highest performance is taken, with the comparison technique’s name indicated in parentheses alongside the reported BER gain.
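A BER versus SNR curve is obtained by counting bit errors at each SNR point. The sketch below does this for the simplest reference case, BPSK over an additive white Gaussian noise channel, and compares the count with the closed-form result. The AWGN channel, the bit count, and the SNR grid are assumptions made for the example; a UWA study would replace the channel and receiver with its own models.

```python
import math
import random

def bpsk_ber_theory(snr_db):
    """Closed-form BPSK bit error rate over AWGN at the given Eb/N0 in dB."""
    ebn0 = 10 ** (snr_db / 10)
    return 0.5 * math.erfc(math.sqrt(ebn0))

def bpsk_ber_sim(snr_db, n_bits=100_000, seed=1):
    """Monte Carlo BER: transmit random bits, add noise, count wrong decisions."""
    rnd = random.Random(seed)
    sigma = math.sqrt(1 / (2 * 10 ** (snr_db / 10)))   # noise std for Eb = 1
    errors = 0
    for _ in range(n_bits):
        bit = rnd.getrandbits(1)
        received = (1 - 2 * bit) + rnd.gauss(0, sigma)  # bit 0 -> +1, bit 1 -> -1
        decided = 1 if received < 0 else 0
        errors += int(decided != bit)
    return errors / n_bits

for snr_db in (0, 4, 8):
    print(snr_db, bpsk_ber_theory(snr_db), bpsk_ber_sim(snr_db))
```

At high SNR the simulated value becomes noisy because few errors are observed, which is one reason reported BER values near $10^{-5}$ or below depend on how many bits were simulated.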

It can be observed from Table 8 and Table 9 that training loss values in SC-UWA systems vary, with some studies using MAE and others using MSE. Complexity is not explicitly provided for most studies. The BER improvements range from $10^{-3}$ to $10^{-5}$, indicating varying levels of performance enhancement. In MC-UWA systems, training loss values are mostly MSE, with some studies combining MSE and BER for optimization. Complexity varies, with some models requiring high memory usage (e.g., 20.7 MB, millions of parameters) and others focusing on execution time (e.g., 47.8 ms per OFDM block). BER improvements range from $10^{-3}$ to $10^{-6}$, with some studies reporting over 40% improvement in performance. Overall, SC-UWA techniques generally focus on simpler models with lower complexity, while MC-UWA techniques use advanced architectures with higher computational demands. The BER improvements in MC-UWA systems tend to be more significant, reflecting the advantages of multi-carrier communication in underwater environments.
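The link between parameter count and storage can be made concrete with a small calculation. The sketch below counts the weights and biases of a fully connected network and converts the total to bytes, assuming 32-bit floating-point storage. The layer sizes are hypothetical and do not correspond to any reviewed model.

```python
def count_dense_params(layer_sizes):
    """Weights plus biases of a fully connected network with the given widths."""
    pairs = zip(layer_sizes[:-1], layer_sizes[1:])
    return sum(n_in * n_out + n_out for n_in, n_out in pairs)

def storage_bytes(n_params, bytes_per_param=4):
    """Model size when every parameter is stored as a 32-bit float (assumed)."""
    return n_params * bytes_per_param

layers = [64, 128, 64]               # hypothetical input, hidden, output widths
n_params = count_dense_params(layers)
print("parameters:", n_params)
print("storage (bytes):", storage_bytes(n_params))

# Reverse direction: parameters that fit in 20.7 MB at 4 bytes each,
# taking 1 MB as 10**6 bytes (an assumption about the reported unit).
print("parameters in 20.7 MB:", int(20.7e6 // 4))
```

Under these assumptions a 20.7 MB model corresponds to roughly five million parameters, which is in line with the "millions of parameters" wording above.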

3.4. Discussion on ML/DL-Based Channel Estimation in SC-UWA Communication Systems

The research works on single-carrier UWA systems demonstrate a progressive evolution in addressing the challenges posed by dynamic underwater environments. Each study contributes unique methodologies and insights, collectively advancing the field of UWA communication.

The work in [20] introduces the ML-ECQP (Machine Learning-based Environment-aware Communication Channel Quality Prediction) method, using Logistic Regression (LR) to predict channel quality based on environmental parameters such as wind speed, water temperature, air humidity, and SNR. Operating at carrier frequencies of 23 kHz and 25 kHz over a transmitter-receiver distance of 205 meters, the system employs BPSK modulation. The study highlights the effectiveness of real-world experimental data from Furong Lake in optimizing MAE to $4 \times 10^{-3}$ and achieving a BER close to $10^{-3}$. While the method improves energy consumption and network performance, challenges such as prediction errors under favorable channel conditions and limited training data remain.

Building on the foundation of channel prediction, [23] explores M-ary Spread Spectrum modulation with BPSK for single-carrier systems operating within a 4 kHz bandwidth and transmission distances of 2–3 km. The study employs advanced LSTM architectures, including Bidirectional LSTM (BiLSTM) and stacked bidirectional and unidirectional LSTM (SBULSTM), optimized using the Adam optimizer and trained on simulated channel data generated via BELLHOP software. The system achieves robust channel prediction and significant BER improvements, particularly under low SNR conditions, with values close to 10^{-2} for SBULSTM models.

Further advancing real-time channel prediction, [24] introduces the Adaptive Bidirectional Gated Recurrent Unit (ABiGRU) network, which utilizes short-term Channel Impulse Response (CIR) data for online training. The integration of Space-Time Block Coding and Minimum Mean Square Error (MMSE) pre-equalization enhances system performance, achieving superior BER and prediction accuracy compared to MMSE, LSTM, and GRU models, effectively bridging the gap between predictive accuracy and real-time adaptability. Similarly, [30] refines LSTM-based models by integrating attention mechanisms, leading to the development of the AttLstmPreNet model. By focusing on critical aspects of the input sequence, this model efficiently addresses sparsity and temporal coherence in fast time-varying UWA channels. Validated with simulated data, AttLstmPreNet outperforms conventional predictors like LMS and RLS, demonstrating its effectiveness in dynamic underwater environments. Together, these studies highlight advancements in channel estimation techniques, improving accuracy and robustness in underwater communication systems.

Finally, the study [36] presents UACC-GAN (underwater acoustic communication channel-generative adversarial network), a data-driven simulator for underwater acoustic communication channels. Using generative adversarial networks, it generates realistic time-varying impulse responses based on measured channel data. The model is validated against the WATERMARK dataset and achieves BER performance similar to real measured channels. However, it requires large training samples and lacks control over specific property variations, indicating areas for improvement. This work supports previous predictive models by focusing on simulation and design for underwater communication systems.

Together, these studies showcase a cohesive progression in single-carrier UWA systems, from environment-aware channel prediction to advanced deep learning architectures and stochastic simulation techniques. Each contribution addresses specific challenges while paving the way for future advancements in underwater communication technologies.

3.5. Discussion on ML/DL-Based Channel Estimation in MC-UWA Communication Systems

The research on multi-carrier UWA systems showcases a diverse range of methodologies aimed at addressing the inherent challenges of underwater communication. Each study contributes unique insights, collectively advancing the field through innovative techniques and performance improvements.

The work in [21] introduces CsiPreNet, a combination of CNN and LSTM networks to predict channel state information (CSI). It operates within a 4 kHz bandwidth over distances of 1–5 km, using 681 subcarriers with modulation schemes like BPSK, QPSK, 8-QAM, and 16-QAM. The model achieves low error rates, with MAE of 4 \times 10^{-3} and BER close to 10^{-3} under low SNR conditions. By integrating subcarrier-bit-power allocation and an offline-online prediction approach, it enhances resource allocation and system performance, setting a benchmark for future research. A DNN-based method for OFDM systems using QPSK modulation is presented in [25]. The model features 64 subcarriers and a cyclic prefix size of 16, utilizing a comb-type pilot arrangement with 32 pilot tones. It is trained on the WATERMARK dataset, which includes diverse underwater channel conditions. The optimization process employs the Adam optimizer, improving efficiency in environments such as Norway-Oslofjord and Brest Commercial Harbor.
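To make the comb-type pilot arrangement concrete, the Python sketch below applies a conventional least-squares (LS) estimate at the pilot subcarriers and linearly interpolates between them. This is the kind of classical baseline that DNN-based estimators are usually compared against, not the method of [25]. The subcarrier and pilot counts follow the setup described above; the even pilot spacing, the two-tap channel, and all function names are assumptions made for illustration.

```python
import cmath

N_SUBCARRIERS = 64                           # as reported for [25]
N_PILOTS = 32                                # as reported for [25]
PILOT_SPACING = N_SUBCARRIERS // N_PILOTS    # assumes evenly spaced pilots


def frequency_response(taps, n_subcarriers=N_SUBCARRIERS):
    """DFT of the channel impulse response taps onto the subcarrier grid."""
    return [
        sum(h * cmath.exp(-2j * cmath.pi * k * n / n_subcarriers)
            for n, h in enumerate(taps))
        for k in range(n_subcarriers)
    ]


def ls_comb_estimate(rx_symbols, pilot_symbol, spacing=PILOT_SPACING):
    """LS estimate at pilot subcarriers, linear interpolation in between."""
    n = len(rx_symbols)
    h_pilot = {k: rx_symbols[k] / pilot_symbol for k in range(0, n, spacing)}
    h_est = []
    for k in range(n):
        left = (k // spacing) * spacing
        right = left + spacing
        if right not in h_pilot:
            h_est.append(h_pilot[left])      # past the last pilot: hold its value
        else:
            frac = (k - left) / spacing
            h_est.append((1 - frac) * h_pilot[left] + frac * h_pilot[right])
    return h_est


# Hypothetical noiseless two-tap channel, used only to exercise the functions.
taps = [1.0, 0.5j]
pilot = 1 + 0j
h_true = frequency_response(taps)
rx = [h * pilot for h in h_true]   # the estimator reads only the pilot positions
h_hat = ls_comb_estimate(rx, pilot)
```

In this noiseless toy case the estimate is exact at the pilot positions and carries a small interpolation error elsewhere. With noise and longer delay spreads that error grows, which is the gap learned estimators aim to close.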

In addition to the OFDM systems of [21] and [25], the study in [26] explores the use of a Denoising Autoencoder (DAE) and a DNN to improve OFDM systems by addressing impulsive noise. The DAE pre-processes noisy signals using a Gated Bernoulli-Gaussian (GBG) model, recovering clean data before a DNN, trained with the Adam optimizer over 10,000 epochs, predicts CSI. This approach achieves a BER gain of approximately 10^{-2} and an MSE of 0.0024 at 20 dB SNR. While a DNN-based improvement is employed in [26], the work in [27] introduces a CNN-based OFDM receiver that enhances signal recovery and communication reliability under challenging underwater conditions. Using QPSK modulation, the system achieves low MSE values and BER improvements over 0.17, with SNR gains of 2–3 dB at 1 \times 10^{-3}. Experiments conducted over 750–3160 m distances with the WATERMARK dataset show that the model reduces storage needs and operates efficiently with a runtime of 47.8 ms per OFDM block. The CNN predicts CIR using skip connections, enabling reliable underwater communication.

Further advancing channel estimation, a 2D BiLSTM-based CIR estimator for UWA sensor networks is proposed in [28]. It employs Sound Speed Profile (SSP) estimation from water temperature data. The model, deployed in an OFDM system with a 5 kHz bandwidth, 462 subcarriers, and a cyclic prefix length of 22.6 ms, achieves low MSE values (0.02–0.14) during SSP estimation. It also demonstrates superior performance, with a BER gain of 2–3 dB in SNR at 1 \times 10^{-3} and a network throughput of 396 bps in practical experiments. Similarly, [29] proposes a model-driven deep learning-based estimation network. The problem is modeled as sparse signal recovery, where approximate message passing (AMP) detects and exploits the sparse nature of the channel. The ST-LAMP network, designed as a generalized estimator, is applied to random sparse UWA channels. To achieve Bayesian optimal solutions, the authors use a Gaussian mixture (GM) prior, constructing the GM-LAMP network with a shrinkage function based on the minimum mean squared error criterion.
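The sparsity-promoting step in AMP-type estimators can be illustrated with a soft-threshold shrinkage function. The Python sketch below assumes that the "ST" in ST-LAMP refers to such a soft threshold. The threshold value and the example taps are hypothetical, and neither the learned parameters nor the GM-based MMSE shrinkage of [29] are reproduced here.

```python
def soft_threshold(r, lam):
    """Shrink a (complex) value toward zero by lam, keeping its phase."""
    magnitude = abs(r)
    if magnitude <= lam:
        return 0j                      # small entries are treated as noise
    return r * (magnitude - lam) / magnitude


def shrink_taps(noisy_taps, lam):
    """Apply the shrinkage element-wise to a noisy channel estimate."""
    return [soft_threshold(r, lam) for r in noisy_taps]


# Hypothetical noisy estimate of a sparse channel: one strong tap, two weak ones.
noisy = [3 + 4j, 0.2 + 0.1j, -0.3j]
denoised = shrink_taps(noisy, lam=1.0)
```

Entries whose magnitude falls below the threshold are set to zero, so only the dominant multipath arrival survives. In a learned AMP network, the threshold would be a trainable per-layer parameter rather than a fixed constant.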

While the SSP estimation [28] and model-driven estimation [29] contribute to enhancing UWA communication by improving prediction accuracy, robustness, and system adaptability, another deep learning-based receiver for UWA-OFDM, known as SCABNet, is presented in [31]. It combines deep neural networks and expert knowledge. It uses an attention-enhanced bi-directional LSTM (AM-BiLSTM) for signal detection and a skip connection CNN (SC-CNN) for channel estimation. The model is trained on real experimental UWA channel data from the WATERMARK dataset [35] and evaluated under various conditions, including fewer pilots, absence of cyclic prefixes, symbol time offset (STO), and carrier frequency offset (CFO). By achieving lower BER, it demonstrates resilience in time-varying and frequency-selective channels but requires substantial offline training, which may demand significant computational resources. Its applicability to other underwater environments beyond the WATERMARK dataset remains unexplored, and future research may focus on improving training efficiency and adapting the model for real-time applications. Similarly, [32] presents CWGAN, which employs channel attention denoising (CAD) to enhance channel estimation accuracy in underwater acoustic communication. The system operates within a bandwidth range of 6-10 kHz and transmission distances from 0.75 km to 3.16 km, using OFDM modulation with 512 to 1024 subcarriers and cyclic prefix values between 128 and 256 samples to mitigate multipath interference. It integrates CNN-based models and CWGAN-GP (Conditional Wasserstein GAN with Gradient Penalty) for improved channel prediction. Performance evaluation using MSE, BER, and channel prediction accuracy confirms enhanced signal robustness and reliability in underwater environments.

The study in [33] presents a DL method for MIMO-OFDM underwater acoustic IoT networks. A Bi-directional Deep Pelican Convolutional Neural Network (BDPCNN) is used for signal detection and an Adaptive Recursive Least Squares (ARLS) approach is employed for channel estimation. Moreover, the Pelican Optimizer is used to optimize network settings, reducing computational complexity. The system’s performance is evaluated under different water conditions, including turbid, coastal, clear ocean, and pure seawater, using metrics such as BER, MSE, energy efficiency, and channel capacity. The findings indicate lower BER values (0.0086–0.021) at 20 dB SNR, improved MSE-based channel estimation, and better energy efficiency compared to traditional models. However, the model is trained on simulated data, which may not fully represent real-world underwater channel variations. The study does not examine hardware limitations, requiring further research to validate the approach with real-world measurement datasets and adaptive real-time applications. Similarly, [39] introduces a Multi-Task Learning (MTL) framework for predicting time-varying channels, addressing challenges in high-dimensional CIR prediction. The framework uses a Shared Feature Learning (SFL) layer to capture multipath correlations and a Task-Specific Head (TSH) layer for refining predictions. Based on maritime experiments in Wuyuan Bay, China, the study compares different SFL configurations, including LSTM, attention-enhanced LSTM, and transformer architectures. The MTL framework achieves lower prediction errors in both one-step and multi-step forecasting. While it reduces computational complexity compared to Single-Task Learning, advanced models like transformers still require significant resources.

The work reported in [37] addresses large Doppler spread, low SNR, and complex multipath propagation in UWA OFDM communication with DenseNet-based channel estimation. Owing to its dense connectivity and feature reuse, the proposed DenseNet estimator efficiently captures complicated channel characteristics. The model is trained on WATERMARK BCH, KAU1, and KAU2 channel data [35]. Even with fewer pilot symbols, DenseNet surpasses conventional estimators in BER, MSE, and channel estimation accuracy. The study also shows that the model adapts to both BPSK and QPSK modulation and to different environmental conditions. DenseNet’s BER improvements (up to 96.3% over LS and 94.2% over MMSE) make it a reliable UWA channel estimation solution. The model reduces pilot overhead and errors in amplitude and phase estimation.

The authors in [38] propose a CNN-based channel estimator and an LSTM-based equalizer for OFDM receivers using model-driven deep learning. The CNN extracts essential features from the CFR, whereas the LSTM equalizes signals using temporal relationships. The WATERMARK dataset’s NOF1 and NCS1 channels [35] are used to train and test the model. Simulation findings show that the proposed technique outperforms LS, MMSE, CNN-MLP, DNN, and ComNet in BER and channel mismatch robustness. Pilot symbol reduction and CP removal are also examined, showing that the proposed model performs well with fewer pilots and no CP. The study does not analyze computational complexity, so it is unknown how well the approach would scale to real-time, resource-constrained deployments. The model’s performance under large Doppler spread and extreme multipath situations is also understudied. To prove its viability, hardware implementations, real-time flexibility, and dataset expansion are needed.

The study in [40] introduces two AFDM receiver designs for doubly selective fading channels. The first design uses DNNs for data prediction but faces problems with pilot-data interference when guard intervals are absent. To solve this, the authors propose an iterative receiver that estimates the channel, detects symbols, and cancels interference without guard intervals. Simulations show that the DNN receiver effectively reduces pilot-data interference, achieving channel estimation with a 0.001 BER within 0.5 dB of the ideal case. However, the model requires extensive offline training, has limited adaptability to untrained channels, and involves high computational complexity in iterative processing. Similarly, [41] presents a stacked CNN ResNet (S-CNN-ResNet) receiver for Orthogonal Time Frequency Space (OTFS) communications. This receiver is designed to mitigate the Doppler Squint Effect. It integrates a CNN for channel feature extraction from pilot data and an enhanced ResNet for symbol recovery to improve feature learning. The model achieves better BER performance while balancing complexity against performance. However, it requires substantial training data and processing resources, which may limit its feasibility for real-time applications. Additionally, performance improvements at higher SNRs are less significant compared to iterative techniques.

To summarize, the review of ML and DL techniques for channel estimation in UWA communication highlights significant advancements in improving signal prediction, bit error rate reduction, and system reliability. The comparison between single-carrier and multi-carrier systems demonstrates how different AI-driven models, including CNNs, LSTMs, and reinforcement learning approaches, adapt to varying underwater conditions and enhance communication performance. Despite these improvements, challenges such as computational complexity, real-time adaptability, and efficient training data utilization persist. This exploration of AI-based channel estimation methods sets the stage for deeper investigations into adaptive modulation and modulation recognition techniques, ensuring a more seamless and reliable underwater communication framework.

4. Results on ML/DL-Based Adaptive Modulation

This section provides an analysis of ML/DL techniques applied to adaptive modulation in underwater acoustic communication systems. By focusing on optimizing modulation schemes and coding strategies, these approaches aim to enhance communication reliability and efficiency in dynamic underwater environments. The discussion encompasses key methods (Section 4.1), system characteristics (Section 4.2), and performance comparison (Section 4.3), offering insights into the effectiveness of ML/DL-based solutions in addressing the unique challenges of SC-UWA systems (Section 4.4) as well as MC-UWA systems (Section 4.5).

4.1. Overview of Adaptive Modulation Strategies

An overview of ML/DL techniques, optimizers, and training examples for single-carrier and multi-carrier underwater acoustic communication systems, implementing adaptive modulation, is presented in Table 10 and Table 11. The detailed descriptions of the various parameters listed in these tables were provided earlier in Section 3.1. Notably, when reinforcement learning approaches were implemented using table lookup and Q-learning, the optimizer is marked as ’Not Applicable’. This distinction underscores a fundamental difference: the learning process in such cases is governed by the Bellman equation rather than traditional optimization methods [67].
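The reason no optimizer applies to table-lookup Q-learning can be seen from its update rule, which moves a table entry toward the Bellman target directly instead of following the gradient of a loss. The Python sketch below is a generic tabular update. The state and action labels, the reward, and the learning-rate and discount values are hypothetical and not drawn from any reviewed study.

```python
def q_learning_update(q_table, state, action, reward, next_state,
                      alpha=0.1, gamma=0.9):
    """One tabular Q-learning step toward the Bellman target."""
    best_next = max(q_table[next_state].values())
    td_target = reward + gamma * best_next
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return q_table[state][action]


# Hypothetical channel-quality states and modulation actions.
STATES = ["low_snr", "high_snr"]
ACTIONS = ["BPSK", "QPSK"]
q_table = {s: {a: 0.0 for a in ACTIONS} for s in STATES}

# A successful BPSK transmission at low SNR earns a hypothetical reward of 1.0.
updated = q_learning_update(q_table, "low_snr", "BPSK",
                            reward=1.0, next_state="high_snr")
```

Only the visited table entry changes at each step, so there are no network weights and therefore nothing for Adam or SGD to optimize.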

It can be observed from Table 10 and Table 11 that SC-UWA systems utilize machine learning models such as SVM, KNN, LDA, BRT, MLR, MLP, CNN, and RL. Training data includes measured datasets, simulated datasets, and in some cases, no dataset is used, particularly in reinforcement learning approaches. In MC-UWA systems, adaptive modulation techniques involve models like A-kNN, CNN, and RL-based approaches. Training examples range from measured data to simulated datasets, with Bellhop being a common simulation tool. Some reinforcement learning studies do not use a dataset, focusing instead on theoretical evaluations. Overall, SC-UWA techniques employ a mix of traditional machine learning models and deep learning approaches, while MC-UWA techniques integrate more reinforcement learning methods and simulation-based training. The diversity in optimization strategies and training datasets highlights the evolving nature of adaptive modulation in underwater acoustic communication.

4.2. Key Characteristics of Adaptive Modulation Techniques

Table 12 and Table 13 provide a comparison of the key characteristics for selected adaptive modulation techniques in SC-UWA and MC-UWA communication systems, respectively. Similar to Table 6 and Table 7, the comparisons highlight critical parameters, including bandwidth, transmitter-receiver (Tx-Rx) distance, modulation schemes, system types, subcarriers, and CP attributes. By analyzing these attributes, the tables shed light on the distinctive setups and operational differences between the two strategies, offering valuable insights for optimizing communication.

In SC-UWA systems, bandwidths range from 4 kHz to 10 kHz, with transmission distances varying between 0.82 km and 3 km. Some studies do not specify bandwidth or transmission distance, particularly those focusing on theoretical evaluations. In MC-UWA systems, bandwidths range from 5 kHz to 8 kHz, with transmission distances spanning from 0.3 km to 5 km. OFDM is the dominant system, with some studies incorporating OTFS. The number of subcarriers varies from 32 to 1024, and cyclic prefixes range from 64 to 400. Modulation schemes include PSK, QAM, and FSK, with some studies not specifying modulation details. Overall, SC-UWA techniques generally use lower bandwidths and simpler modulation schemes, while MC-UWA techniques leverage multi-carrier systems with higher bandwidths and more complex configurations. The diversity in system characteristics highlights the evolving nature of adaptive modulation in underwater acoustic communication.

4.3. Comparative Analysis of Adaptive Modulation Techniques

Table 14 and Table 15 present the performance evaluation of systems implementing adaptive modulation, categorized into SC-UWA and MC-UWA underwater acoustic communication systems, respectively. The attributes of Training Loss, Complexity, and Gain in BER have already been explained in Section 3.3. By definition, in underwater acoustic communication, throughput is the average number of information bits effectively received per unit time, accounting for protocol overhead, channel impairments, and retransmissions. The ability to accurately estimate channel conditions, or classification accuracy, directly affects throughput in adaptive modulation-based underwater acoustic communication systems. A high classification accuracy guarantees the best coding and modulation decisions, increasing data rates and reducing retransmissions. Low accuracy, on the other hand, can result in more errors or underuse of the channel, both of which lower effective throughput. For this reason, the numbers in the "Throughput" columns of Table 14 and Table 15 indicate the percentage accuracy in selecting the most appropriate modulation scheme.

In single-carrier systems, training loss values are often not provided, with some studies using mean squared error or threshold-based approaches. Complexity is generally unspecified, but throughput improvements range from 3.6% to over 99% accuracy. Bit error rate gains vary, with some studies reporting substantial improvements while others provide specific values such as 4.5 \times 10^{-3}. In multi-carrier systems, training loss values include actor and critic loss, cross-entropy, and mean squared error, though some studies do not specify a loss function. Complexity varies, with some models having higher processing and memory demands. Throughput improvements range from near-ideal performance to specific percentage increases, such as 4% higher accuracy. Bit error rate gains are generally better than threshold values, with some studies maintaining a stable bit error rate of 0.001 or reporting improvements of up to 32%. Overall, single-carrier techniques focus on simpler models with lower complexity, while multi-carrier techniques leverage advanced architectures with higher computational demands. The improvements in throughput and bit error rate highlight the effectiveness of machine learning and deep learning in enhancing adaptive modulation for underwater acoustic communication.

4.4. Discussion on Adaptive Modulation Techniques in SC-UWA Communication Systems

Recent advancements in machine learning (ML) and reinforcement learning (RL) have shown promise in optimizing modulation and coding schemes (MCS) for UWA networks, outperforming traditional rule-based approaches. These techniques leverage channel parameters such as SNR, delay spread, and Doppler effects to dynamically adapt transmission strategies, improving throughput, BER, and spectral efficiency. The following studies explore various ML and RL-based approaches for adaptive modulation in single-carrier and in some cases both single and multicarrier UWA communications, highlighting their contributions and shortcomings in addressing these challenges.

The manuscript in [42] presents an ML-driven link adaptation (LA) strategy for UWA communication networks to tackle the difficulties arising from rapidly fluctuating and intricate channel circumstances. The authors compare rule-based strategies (such as 3D analysis, modulation-wise analysis, and fixed-SNR strategy) with ML algorithms (including SVM, KNN, pseudo-linear discriminant analysis, and boosted regression trees) to classify modulation and coding scheme levels by analyzing measured sea trial datasets. The boosted regression tree attains exceptional accuracy (99.97%) in MCS classification, surpassing alternative techniques. Significant contributions encompass illustrating the superiority of machine learning over rule-based methodologies for link adaptation in ultra-reliable and low-latency communication networks, as well as emphasizing the efficacy of boosted regression trees in managing diverse datasets. Nonetheless, the methodology is significantly dependent on substantial training data, and its real-time application is constrained by the reciprocity problem of the FDD system and the frame delays between transceivers.
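For contrast with the ML classifiers, a rule-based fixed-SNR strategy can be sketched as a simple threshold lookup. The Python snippet below is a generic illustration of that idea. The switching thresholds and scheme names are hypothetical and are not the values used in [42].

```python
# Hypothetical SNR switching thresholds in dB, sorted ascending; not from [42].
MCS_THRESHOLDS = [(0.0, "BPSK"), (8.0, "QPSK"), (15.0, "16-QAM")]


def select_mcs_fixed_snr(snr_db, thresholds=MCS_THRESHOLDS):
    """Pick the highest-order scheme whose SNR threshold is met."""
    chosen = thresholds[0][1]          # fall back to the most robust scheme
    for min_snr, scheme in thresholds:
        if snr_db >= min_snr:
            chosen = scheme
    return chosen


choice = select_mcs_fixed_snr(10.0)    # selects "QPSK" under these thresholds
```

Such a rule depends on SNR alone, whereas the ML classifiers discussed here can also take parameters such as delay spread and Doppler effects into account.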

Building on the advantages of ML-based link adaptation demonstrated by Alamgir et al., the research in [47] presents an iterative learning framework for dependable link adaptation in the Internet of Underwater Things (IoUT). It uses two ML approaches, multilayer regression (MLR) and multilayer perceptron (MLP), to jointly forecast the Modulation and Coding Scheme (MCS) and BER from various channel parameters (e.g., SNR, Delay Spread, Frequency Spread) derived from authentic underwater datasets. The system mitigates the poor SNR-BER correlation in underwater settings by iteratively optimizing MCS selections to achieve the target BER, resulting in up to 25% increased throughput compared to traditional SNR-based adaptive modulation. Significant contributions encompass the amalgamation of MCS and BER prediction models, validation with empirical data from the Gulf of Incheon, and enhancements in performance via MLP-based learning. Nonetheless, there is a need for additional optimization to minimize latency in real-time IoUT implementations.

Building upon the iterative ML frameworks discussed previously, the authors in [43] present a hybrid deep learning model that integrates CNN with Boosted Single Feedforward Layers (BSFL) to dynamically choose among CDMA, TDMA, and OFDM modulation methods in underwater acoustic networks. The CNN collects channel features, whereas the BSFL forecasts the ideal modulation scheme, with a high accuracy of 98.6% and a 30% enhancement in BER performance relative to traditional approaches. Significant achievements encompass illustrating the efficacy of hybrid learning in dynamic underwater settings and surpassing established models such as CNN+RF and DCNN in modulation selection. The method necessitates substantial processing resources and enormous datasets for training, while its real-time applicability is constrained by significant complexity.

While the prior study leverages hybrid deep learning, the research in [44] shifts focus to reinforcement learning and presents an RL-based automatic modulation switching system designed to improve UWA communication by dynamically selecting among ASK, PSK, OFDM, and BFSK schemes according to real-time channel circumstances. Significant contributions encompass a cost-efficient, software-defined acoustic modem utilizing UNETStack and Raspberry Pi, realizing a 3.648% enhancement in RSSI, a 32% decrease in BER at 7 dB SNR, and a 5% augmentation in utility at 10 dB SNR relative to fixed FH-BFSK. The RL system utilizes a Q-matrix and a greedy policy for adaptive decision-making, enhancing reliability and efficiency in dynamic underwater situations. Nonetheless, the system’s efficacy is constrained by the confined experimental framework (0.05–0.1 m water range), and its practical scalability has yet to be validated.

Expanding on RL-based adaptive strategies, the paper [45] further examines a Q-learning-based adaptive modulation scheme for shallow sea UWA communication, leveraging SNR, multipath spread length, and Doppler frequency offset to dynamically select optimal modulation modes (OFDM, MFSK, DSSS) for improved performance. Field experiments demonstrated that the RL approach outperformed fixed threshold and random selection methods, achieving higher throughput (14,645.3 bits) and lower BER in time-varying channels. Key contributions include the practical validation of RL in real-world UWA conditions and the demonstration of its superiority over conventional adaptive modulation strategies. However, shortcomings include limited scalability to diverse environments due to site-specific training data, potential latency in real-time decision-making, and the lack of comparison with more advanced RL algorithms beyond Q-learning.

In summary, ML and RL techniques demonstrate significant potential in optimizing adaptive modulation for UWA communications, consistently outperforming traditional approaches in terms of accuracy, throughput, and BER performance.

4.5. Discussion on Adaptive Modulation Techniques in MC-UWA Communication Systems

The following studies explore various learning-based methods for adaptive modulation in multicarrier UWA communications, highlighting their innovative contributions while acknowledging existing limitations in computational efficiency and real-world applicability. These approaches range from attention-based classifiers to deep reinforcement learning frameworks, each offering unique solutions to the challenges of UWA channel adaptation.

The paper [46] proposes an ML-based adaptive modulation framework for UWA, introducing an A-kNN classifier to address UWA channel uncertainty and dynamics. The A-kNN leverages attention mechanisms to improve MCS selection accuracy, while the dimensionality reduced and data clustered A-kNN (DRDC-A-kNN) variant enhances efficiency through principal component analysis (PCA)-based dimensionality reduction and k-means clustering. The framework includes online learning for adaptability to new environments, validated using real-world lake data. Simulations employing real-world data from three lake experiments show that these approaches outperform model-based methods in throughput and dependability. However, the reliance on manually extracted features may limit performance, and the computational overhead of kNN for large datasets remains a challenge.

Expanding on the attention-based classification approach, the research [48] presents a proximal policy optimization (PPO)-based adaptive modulation system for UWA OFDM communication, tackling the dynamic and intricate characteristics of UWA channels by translating feedback data into a continuous state space. Significant contributions encompass the analysis of environmental impacts (multipath effects, Doppler shift, noise), the formulation of AM as a Markov Decision Process (MDP), and the utilization of PPO for effective policy optimization. The approach attains near-optimal throughput, surpassing Deep Q-Network (DQN), Double DQN (DDQN), and QL in both convergence velocity and stability. Drawbacks encompass dependence on precise channel feedback and elevated computational complexity stemming from PPO’s recurrent updates.

Building on the use of PPO for adaptive decision-making, the 2023 publication [49] by Cui et al. further explores deep RL, proposing a deep reinforcement learning (DRL) framework that uses a DQN to adaptively pick modulation schemes in OFDM systems. The adaptive modulation problem is formulated as an MDP, with real-time CSI and SNR as inputs for choosing modulation modes. This adaptive method maximizes system throughput while maintaining acceptable BER. According to SWellEx-96-based simulations, the DRL-based adaptive modulation strategy reduces BER and improves system throughput compared to existing methods. However, such simulations may not fully replicate the complexity of real-world underwater environments, limiting the generalizability of the findings.

Extending beyond conventional OFDM-based approaches, the manuscript [50] integrates OTFS modulation with DL and meta-learning techniques. It presents an adaptive modulation framework for UWA communications that uses OTFS modulation and deep learning, specifically a CNN, to extract channel characteristics in the Doppler-delay domain and identify optimal modulation and coding schemes (MCS). It tackles the difficulties posed by rapidly fluctuating UWA channels by utilizing the resilience of OTFS and applies model-agnostic meta-learning (MAML) to improve adaptability to novel environments with scarce data. The strategy surpasses conventional ML-based adaptive modulation and fixed MCS approaches, attaining superior throughput and expedited convergence in practical UWA contexts. However, the work omits a discussion of computational complexity and real-time implementation problems, which may be essential for actual deployment.

In conclusion, these studies demonstrate significant progress in applying ML and RL techniques to adaptive modulation for multicarrier UWA communications, achieving notable improvements in throughput, reliability, and environmental adaptability.

5. Results on ML/DL-Based Modulation Recognition

This section presents the results of ML/DL-based modulation recognition in UWA communication systems. It explores the application of ML/DL techniques in this domain, focusing on their effectiveness and implementation. The methods employed in selected research studies for modulation recognition are outlined in Section 5.1. Additionally, the system configurations utilized in these studies are detailed in Section 5.2. Section 5.3 compares the performance of these studies, offering insights into their strengths and limitations. Finally, Section 5.4 provides a comprehensive discussion of the selected research works, highlighting their contributions and implications for modulation recognition in UWA communication systems.

5.1. Overview of Modulation Recognition Techniques

An overview of ML/DL techniques, optimizers, and training examples for underwater acoustic communication systems is presented in Table 16. Since the same UWA modulation recognition system can recognize both single-carrier and multicarrier modulations, we did not make separate tables for single and multicarrier UWA systems. The detailed descriptions of the various parameters listed in this table were provided earlier in Section 3.1.

It can be observed from Table 16 that various models such as SCNet, RNN, CNN, ResNet, SVM, and reinforcement learning are applied across different studies. Optimizers include Adam, gradient descent, and momentum SGD, while some studies do not specify an optimizer. Training examples vary, with some studies using simulated data, others relying on measured data, and a few combining both approaches. Bellhop is a common simulation tool used in some cases. The diversity in models, optimization strategies, and training datasets highlights the evolving nature of modulation recognition techniques in underwater acoustic communication.

5.2. Key Characteristics of Modulation Recognition Techniques

Table 17 provides a comparison of the key characteristics of modulation recognition (MR) techniques in a typical UWA communication system. These characteristics were already explained in Section 3.2.

Table 17 shows that the bandwidths vary across studies, ranging from 1 kHz to 100 kHz, with some studies not specifying bandwidth values. Transmission distances also differ, spanning from a few meters to several kilometers, depending on the experimental setup. The modulation schemes considered include PSK, QAM, FSK, DSSS, OFDM, LFM, CW, and SSB, showing a wide range of modulation types used in underwater acoustic communication. Some studies focus on a limited set of modulation schemes, while others evaluate multiple types.

5.3. Performance Comparison of Modulation Recognition Techniques

Table 18 provides a detailed performance evaluation of a UWA communication system that implements modulation recognition techniques. The two characteristics "Training Loss" and "Complexity" were explained earlier. Two more characteristics, namely "Average Accuracy" and "Average Precision", are used when evaluating the performance of a typical modulation recognition system. These are explained as follows [68]. Average accuracy measures the overall correctness of the modulation recognition system across all modulation types. It is calculated as the ratio of correctly classified signals to the total number of signals tested. It is given by:

\text{Accuracy} = \frac{\text{Number of Correct Predictions}}{\text{Total Number of Predictions}} \qquad (3)

Precision measures the reliability of a classifier for a specific modulation type, focusing on how many of the predicted positives (e.g., signals classified as "QPSK") are actually correct. It is calculated by using the following formula.

\text{Precision} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}} \qquad (4)

where TP represents "True Positive" while FP represents "False Positive". Often a confusion matrix is used to compute the accuracy and precision of the modulation recognition algorithm. A confusion matrix, or error matrix, is a tabular depiction of a classification model’s efficacy, illustrating the frequency of correct and wrong predictions for each class. It is particularly advantageous for assessing supervised learning systems in tasks such as modulation recognition, image classification, or medical diagnosis.
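As a minimal illustration of Eqs. (3) and (4), the following Python sketch builds a confusion matrix from a small, made-up set of labels and derives the overall accuracy and the per-class precision from it. The label lists and function names are hypothetical and are not taken from any of the reviewed studies.

```python
from collections import Counter


def confusion_matrix(true_labels, predicted_labels, classes):
    """Return a nested dict cm[true][pred] holding the number of such outcomes."""
    counts = Counter(zip(true_labels, predicted_labels))
    return {t: {p: counts[(t, p)] for p in classes} for t in classes}


def accuracy(cm, classes):
    """Eq. (3): correct predictions (the diagonal) over all predictions."""
    correct = sum(cm[c][c] for c in classes)
    total = sum(cm[t][p] for t in classes for p in classes)
    return correct / total


def precision(cm, classes, target):
    """Eq. (4): TP / (TP + FP) for one modulation class."""
    tp = cm[target][target]
    fp = sum(cm[t][target] for t in classes if t != target)
    return tp / (tp + fp) if (tp + fp) else 0.0


# Hypothetical toy data: six received signals and the classifier's decisions.
classes = ["BPSK", "QPSK", "2FSK"]
y_true = ["BPSK", "BPSK", "QPSK", "QPSK", "2FSK", "2FSK"]
y_pred = ["BPSK", "QPSK", "QPSK", "QPSK", "2FSK", "BPSK"]

cm = confusion_matrix(y_true, y_pred, classes)
overall_accuracy = accuracy(cm, classes)
qpsk_precision = precision(cm, classes, "QPSK")
```

In this toy case four of the six signals are classified correctly, and two of the three signals labeled "QPSK" by the classifier are truly QPSK, which shows how a class can have a precision that differs from the overall accuracy.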

Table 18 shows that the training loss values vary across studies, with some using cross-entropy, contrastive loss, hinge loss, and mean squared error, while others do not specify a loss function. Complexity differs, with models ranging from low to high computational demands, measured in execution time, number of parameters, and floating-point operations. Accuracy values range from 64 percent to 100 percent, showing varying levels of performance across different techniques. Precision also varies, with some models achieving values between 40 percent and 100 percent, while others report more stable precision rates. The diversity in training loss, complexity, accuracy, and precision highlights the effectiveness of different machine learning and deep learning approaches in modulation recognition for underwater acoustic communication.

5.4. Discussion on ML/DL-Based MR Techniques in UWA Communication Systems

Modulation Recognition in UWA communication is crucial for effective signal processing, yet it faces significant challenges due to the complex and dynamic nature of UWA channels. The following literature review explores recent advancements in deep learning and signal processing techniques applied to UWA MR.

For UWA signal detection, Wang et al. (2021) [51] suggest a Sequence Convolutional Network (SCNet) for AMR. SCNet leverages 1D sequence convolutions with adaptive kernel sizes to enhance feature extraction while keeping computation efficient. It achieves higher recognition accuracy, faster training, and fewer parameters compared to traditional CNN and RNN models. Tested on real-world underwater channel data, SCNet outperforms models like LSTM, ResNet, and DenseNet. While primarily validated on simulated data, the results suggest strong potential for practical deployment in challenging underwater environments.

Extending beyond convolutional models, Huang et al. [52] present a practical and efficient approach to AMR for UWA signals by combining an optimizing autoencoder (OAE) with an evaluation-enhanced K-nearest neighbors (EEKNN) algorithm. The OAE refines noisy signal features by learning their relationship to clean, ideal ones, making it easier to distinguish between different modulation types. Meanwhile, EEKNN improves the classification process by using Mahalanobis distance and a more thoughtful voting mechanism that reduces the impact of outliers. Together, these techniques deliver impressive results—achieving up to 99.25% accuracy and very fast recognition times (3.48 ms) on real-world data from the South China Sea. While the method shows great promise, especially for practical applications, it would benefit from further testing in more diverse and real-time underwater conditions to confirm its broader applicability.
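The role of the Mahalanobis distance in a nearest-neighbor classifier can be sketched as follows. This is a simplified illustration restricted to two-dimensional features with a plain majority vote; it does not reproduce the evaluation-enhanced voting of [52], and all function names and data points are hypothetical.

```python
from collections import Counter


def covariance_2d(points):
    """Sample covariance matrix of a list of 2-D feature vectors."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return ((sxx, sxy), (sxy, syy))


def inverse_2x2(m):
    """Closed-form inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    return ((d / det, -b / det), (-c / det, a / det))


def mahalanobis(u, v, inv_cov):
    """sqrt((u - v)^T S^-1 (u - v)) for 2-D vectors."""
    dx, dy = u[0] - v[0], u[1] - v[1]
    q = (dx * (inv_cov[0][0] * dx + inv_cov[0][1] * dy)
         + dy * (inv_cov[1][0] * dx + inv_cov[1][1] * dy))
    return q ** 0.5


def knn_predict(query, train_points, train_labels, k, inv_cov):
    """Majority vote among the k training points closest to the query."""
    ranked = sorted(zip(train_points, train_labels),
                    key=lambda pl: mahalanobis(query, pl[0], inv_cov))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 2-D features for two modulation classes.
train_points = [(0.0, 0.0), (1.0, 0.1), (0.5, -0.1),
                (0.0, 5.0), (1.0, 5.2), (0.5, 4.9)]
train_labels = ["A", "A", "A", "B", "B", "B"]
inv_cov = inverse_2x2(covariance_2d(train_points))
```

Unlike the Euclidean distance, the Mahalanobis distance rescales each feature direction by its spread in the training data, so a feature with a large natural variance does not dominate the neighbor ranking.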

While Huang et al. focused on enhancing denoising and classification, the authors in [53] propose a deep learning model, R&CNN, for AMR in UWA communications. By combining the strengths of RNN and CNN, the model effectively captures both the time-dependent and spatial features of acoustic signals. Key architectural choices, such as using 1D convolutional kernels and removing pooling layers, help preserve important signal characteristics while keeping the model lightweight and efficient. Tested on two real-world datasets (Trestle and South China Sea), R&CNN delivers impressive accuracy (up to 99.38%) and fast recognition times (7.164 ms), outperforming several well-known models such as LeNet5, AlexNet8, LSTM, and CNN-LSTM. While the results are promising, further validation under more diverse underwater conditions would strengthen the case for its practical deployment.

Building further on hybrid architectures that fuse CNNs with advanced training techniques, the paper [54] introduces a classifier, called UWA communication modulation classifier-supervised contrastive learning (UMC-SCL). The method starts by using a lightweight CNN to filter out ocean noise, ensuring only meaningful signals are processed. It then employs ResNet50 as a feature extractor, trained with supervised contrastive learning to make features from the same modulation type more consistent, even under low SNR conditions. A simple fully connected layer is used for final classification, making the system both accurate and efficient. Tested on a mix of simulated, pool, and real ocean data, the model achieves strong performance—reaching 98.6% accuracy at 0 dB—and shows clear advantages over traditional methods, especially in noisy environments. Although highly promising, further testing in more diverse underwater conditions would help confirm its robustness and generalizability.
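The idea of pulling together features of the same modulation type can be illustrated with a small sketch of a supervised contrastive loss. The code below follows the commonly used formulation in which, for each anchor, same-class samples act as positives and all other samples appear in the denominator; the temperature value, the embeddings, and the function names are assumptions made for illustration and are not taken from [54].

```python
import math


def normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss averaged over anchors that have a positive."""
    z = [normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue
        # Temperature-scaled cosine similarity to every other sample.
        sims = {a: sum(x * y for x, y in zip(z[i], z[a])) / temperature
                for a in range(n) if a != i}
        # Log-sum-exp of the denominator, shifted by the maximum for stability.
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Hypothetical 2-D embeddings forming two tight clusters.
emb = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_consistent = supcon_loss(emb, [0, 0, 1, 1])  # labels match the clusters
loss_mixed = supcon_loss(emb, [0, 1, 0, 1])       # labels cut across clusters
```

The loss is small when samples sharing a label are close in the embedding space and large when they are not, which is the property that encourages consistent features per modulation type.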

Departing from deep neural networks, in [55], the authors present the sixth-order cumulant ($C_{63}$) to effectively separate OFDM signals from PSK and FSK variants, while an enhanced bispectrum approach helps differentiate between specific modulation types like BPSK, QPSK, 2FSK, and 4FSK. The system shows impressive performance in challenging non-cooperative environments, achieving perfect recognition at 0 dB SNR in simulations and nearly 99% accuracy in real-world lake tests using a ResNet-based classifier. While effective, the method struggles in extremely noisy conditions (below -8 dB SNR), relies heavily on simulated training data, and faces computational challenges for real-time applications.
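To make the sixth-order cumulant concrete, it can be expressed through the mixed moments $M_{pq} = E[x^{p-q}(x^{*})^{q}]$. Assuming a zero-mean signal whose third-order moments vanish, one obtains $C_{63} = M_{63} - 9M_{21}M_{42} + 12M_{21}^{3} - 3M_{20}M_{43} - 3M_{22}M_{41} + 18M_{20}M_{21}M_{22}$. The sketch below evaluates this expression on ideal, noise-free constellations; the normalization and estimator actually used in [55] may differ, and the function names are hypothetical.

```python
def moment(samples, p, q):
    """Mixed moment M_pq = E[x^(p-q) * conj(x)^q] estimated by a sample mean."""
    return sum((x ** (p - q)) * (x.conjugate() ** q) for x in samples) / len(samples)


def c63(samples):
    """Sixth-order cumulant C_63 of zero-mean complex samples.

    Assumes the third-order moments vanish, which holds for the
    symmetric constellations used below.
    """
    m20 = moment(samples, 2, 0)
    m21 = moment(samples, 2, 1)
    m22 = moment(samples, 2, 2)
    m41 = moment(samples, 4, 1)
    m42 = moment(samples, 4, 2)
    m43 = moment(samples, 4, 3)
    m63 = moment(samples, 6, 3)
    return (m63 - 9 * m21 * m42 + 12 * m21 ** 3
            - 3 * m20 * m43 - 3 * m22 * m41 + 18 * m20 * m21 * m22)


# Ideal unit-power constellations with equiprobable symbols (no noise, no channel).
bpsk = [1 + 0j, -1 + 0j]
qpsk = [1 + 0j, 0 + 1j, -1 + 0j, 0 - 1j]

c63_bpsk = c63(bpsk)
c63_qpsk = c63(qpsk)
```

For these ideal constellations the expression evaluates to 16 for BPSK and 4 for QPSK, which illustrates why such statistics can serve as discriminating features; with noisy, channel-distorted signals the estimates spread around their theoretical values.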

While cumulant and bispectrum-based methods offer mathematical precision, the authors of [56] present an edge-enabled adaptive modulation framework for the Internet of Underwater Things (IoUT) that leverages network pruning and ensemble learning (EL) to balance computational efficiency and accuracy. The key contributions include: (1) a novel CSI dataset for six modulation schemes, enhancing feature representation under noise; (2) a Taylor expansion-based pruning criterion to reduce redundant CNN parameters while maintaining performance; (3) an EL strategy to compensate for accuracy loss post-pruning, achieving 93.4% accuracy at 5 dB SNR; and (4) successful deployment on edge devices like NVIDIA Jetson TX2, demonstrating practical feasibility. However, the framework struggles with SNR below 0 dB, where feature extraction becomes unreliable, and its reliance on simulated data may limit real-world adaptability in highly dynamic underwater environments.
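A Taylor expansion-based pruning criterion can be sketched as follows. The code uses the widely known first-order form, in which the importance of a channel is the absolute value of the mean product of its activations and the loss gradients with respect to them; whether [56] uses exactly this variant is an assumption, and the numbers and function names are hypothetical.

```python
def taylor_importance(activations, gradients):
    """First-order Taylor score of one channel: |mean(activation * gradient)|."""
    products = [a * g for a, g in zip(activations, gradients)]
    return abs(sum(products) / len(products))


def select_channels_to_prune(channel_acts, channel_grads, num_to_prune):
    """Return the indices of the lowest-scoring channels and all scores."""
    scores = [taylor_importance(a, g)
              for a, g in zip(channel_acts, channel_grads)]
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return sorted(order[:num_to_prune]), scores


# Hypothetical activations and gradients for three channels, two positions each.
acts = [[1.0, 2.0], [0.1, 0.1], [3.0, -1.0]]
grads = [[0.5, 0.5], [0.01, -0.01], [1.0, 1.0]]

pruned, scores = select_channels_to_prune(acts, grads, num_to_prune=1)
```

The score approximates how much the loss would change if the channel were removed, so channels with scores near zero are the first candidates for pruning.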

In contrast to real-valued pruning and ensemble techniques, the paper [57] proposes a novel adaptive modulation method for UWA communication signals utilizing deep complex networks (DCNs). Their key contribution lies in developing a DCN architecture that directly processes complex-valued UWA signals, effectively capturing amplitude and phase information, which is crucial given the complex channel impairments in UWA environments. This approach aims to improve classification accuracy and robustness compared to traditional real-valued neural networks that often discard or separately process the complex nature of the signals. A potential shortcoming, though not explicitly detailed in the abstract, could be the computational complexity associated with DCNs, especially when deployed in resource-constrained underwater environments, and the need for extensive training data specific to diverse UWA channel conditions.

Exploring hardware-integrated solutions, in [58], the authors present a support vector machine (SVM)-powered underwater acoustic modem that integrates continuous wavelet transform (CWT)-based feature extraction and FPGA-based signal processing to enhance UWA communication. Key contributions include: (1) a novel system architecture combining SVM for signal classification and CWT for robust feature extraction, achieving 98.28% accuracy at 5 dB SNR; (2) the introduction of a transitional "C" symbol to mitigate spectrum leakage and improve demodulation reliability; and (3) an FPGA implementation for real-time processing, demonstrating a stable 10,000 baud rate with zero BER under controlled conditions. However, the computational complexity of CWT and SVM could pose challenges for low-power edge devices.

Returning to hybrid deep learning models, the paper in [62] proposes a hybrid neural network model, S&SEFM, which combines SqueezeNet and SENet for modulation recognition in UWA communication. It introduces multi-attribute features, namely the wavelet time-frequency (WTF) spectrum, square power spectrum, and cyclic spectrum contour maps, to mitigate the limitations of single-feature methods and employs multi-scale feature fusion to enhance recognition accuracy. The model demonstrates strong generalization across different UWA channels and robustness against Doppler shift, achieving high recognition rates in both simulated and sea trial data. However, the paper does not address computational complexity in real-time applications or the model’s performance in extremely low SNR conditions.

Continuing the pursuit of multi-scale feature extraction, Wang et al. [63] introduce several contributions to AMR in UWA systems. Key contributions include a data augmentation method that increases data sevenfold to address small sample size issues, a novel "microscale" concept for rationalizing UWA signals into time series, and the "One2Three block" temporal feature extractor designed to extract features from three microscales. Additionally, they propose a "Dual-Stream SE block" as a spatial feature extractor to synthesize advanced spatial features for AMR. The method’s effectiveness is validated on real-world datasets from the South China Sea and the Yellow Sea, demonstrating promising recognition accuracy for eight common UAC modulation modes. The computational complexity introduced by the multi-microscale feature extraction and the dual-stream architecture might be a concern for real-time deployments.

Taking a step further into attention mechanisms, a two-stream transformer (TSTR)-based network is proposed in [64] for AMR of UWA signals, aiming to overcome the challenges of complex UWA channels and severe ocean noise. Their key contributions include an input preprocessing layer that extracts I/Q and time-frequency features, a feature capture layer (FCL) for extracting high-dimensional signal features across time, frequency, and time-frequency domains, and a classification layer for modulation estimation. A notable innovation is the use of a multihead self-attention module with adaptive soft thresholding within the FCL to handle noise and varying feature characteristics. The computational intensity of the transformer architecture, particularly the multihead self-attention, might pose challenges for real-time implementation on resource-constrained UWA platforms.
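The soft-thresholding operation mentioned above can be sketched in a few lines. In this illustration the threshold is set to a fraction of the mean absolute feature value; in a trained network that fraction would typically be produced by a small learned sub-network, so the fixed `scale` argument, like the function names, is an assumption and not the design of [64].

```python
def soft_threshold(x, tau):
    """Shrink x toward zero by tau; values with |x| <= tau become zero."""
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0


def adaptive_soft_threshold(features, scale):
    """Apply soft thresholding with tau = scale * mean(|features|).

    `scale` (between 0 and 1) stands in for a learned coefficient.
    """
    tau = scale * sum(abs(v) for v in features) / len(features)
    return [soft_threshold(v, tau) for v in features], tau


# Hypothetical feature vector: two strong components and two weak, noise-like ones.
features = [2.0, -0.5, 0.1, -3.0]
denoised, tau = adaptive_soft_threshold(features, scale=0.5)
```

Weak components below the threshold are set to zero while strong components are kept with reduced magnitude, which is the mechanism by which soft thresholding suppresses noise-like features.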

To automate and optimize model design, Jiang et al. [59] propose neural architecture search (NAS) for modulation recognition of UWA signals to automatically discover optimal neural architectures, particularly improving performance in low SNR conditions. It introduces a feature fusion method combining time-frequency and cyclic spectrum features with an attention mechanism to enhance phase-modulated signal recognition. The method outperforms traditional models like AlexNet, ResNet, and GoogLeNet in simulations and field tests, demonstrating robustness in underwater environments. The approach has relatively high computational complexity and potential overfitting when pooling layers are included.

Focusing on enhancing representational diversity through feature fusion, the paper by Wang, Yang, and Fang [60] proposes a multi-scale feature fusion hybrid model (HM) for UWA communication signal modulation recognition, addressing the challenges posed by complex underwater environments. Their primary contribution is the integration of Gram angle field (GAF), Markov transition field (MTF), and recurrence plot (RP) to fuse time and frequency domain features into low-dimensional representations, which are then processed by a deep learning network. This approach demonstrates superior recognition accuracy (94.31% in lake trials) compared to existing methods, as validated by comparative experiments. A potential shortcoming could be the computational overhead associated with generating and fusing these multiple feature representations, which might impact real-time deployment in resource-constrained UWA systems.
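Two of the time-series-to-image encodings named above can be sketched briefly. The code below computes a summation-type Gramian angular field and a thresholded recurrence plot for a short series, following their standard textbook definitions; the Markov transition field is omitted for brevity, and the exact variants and parameters used in [60] are not known here, so the function names and the threshold `eps` are assumptions.

```python
import math


def rescale(series):
    """Linearly map a series to the interval [-1, 1]."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must not be constant")
    return [2 * (x - lo) / (hi - lo) - 1 for x in series]


def gramian_angular_summation_field(series):
    """GASF[i][j] = cos(phi_i + phi_j), with phi = arccos(rescaled sample)."""
    # Clamp to guard against rounding slightly outside [-1, 1].
    x = [max(-1.0, min(1.0, v)) for v in rescale(series)]
    phi = [math.acos(v) for v in x]
    return [[math.cos(a + b) for b in phi] for a in phi]


def recurrence_plot(series, eps):
    """RP[i][j] = 1 if samples i and j are within eps of each other, else 0."""
    return [[1 if abs(a - b) <= eps else 0 for b in series] for a in series]


# Hypothetical three-sample series.
series = [0.0, 1.0, 2.0]
gasf = gramian_angular_summation_field(series)
rp = recurrence_plot(series, eps=1.0)
```

Both encodings turn a length-N series into an N-by-N image, which is what allows image-oriented networks to process the signal but also explains the additional computational overhead noted above.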

Finally, complementing deep learning with signal processing and channel estimation, Yang et al. (2024) [61] propose a modulation classification method for non-cooperative UWA communication by integrating channel estimation to mitigate signal distortion caused by multipath fading and environmental noise. It leverages higher-order cumulants as features and employs various classifiers (e.g., SVM, GBDT, XGBoost) to validate the approach, demonstrating improved recognition accuracy after signal restoration. The method is tested on both simulated and real-world datasets (e.g., RML2016.10a and Five-Element Acoustic Underwater Dataset), showing robustness across different SNR conditions. The method’s performance depends on the accuracy of channel estimation, which can be compromised by measurement errors.

The reviewed literature highlights a clear trend towards leveraging advanced deep learning architectures and sophisticated feature engineering to address the complexities of UWA MR. While significant progress has been made in improving accuracy and robustness, particularly in challenging noisy environments, common limitations often revolve around the computational complexity of proposed models.

6. Challenges and Future Research Directions

This article has shown that several ML and DL approaches have been successfully employed for UWA systems for channel estimation, adaptive modulation, and modulation recognition. Although these methods have shown encouraging performance, several open challenges and promising research directions remain. Based on the studies referenced from [23,27,29,31,33,36,37,38,40,41,43,47,50,55,57,60,61,62,63,65], the following key challenges and future research directions have been identified.

6.1. Challenges in ML/DL-Based UWA Communication

Despite significant advancements in ML and DL techniques for underwater acoustic (UWA) communication, several challenges remain that hinder their practical deployment and efficiency. These challenges must be addressed to develop more scalable and adaptive solutions.

Computational Complexity and Real-Time Processing: ML/DL models such as DenseNet and transformer-based architectures [37,38] require substantial computational resources for training and inference, making real-time deployment a challenge. These models demand extensive memory and processing power, which limits their feasibility in resource-constrained underwater environments. Additionally, GAN-based simulators necessitate a large amount of training data and computational power to model underwater acoustic channels realistically [36]. The real-time execution of such models remains complex, particularly for mobile underwater communication nodes with limited processing capabilities.
Limited Training Data and Generalization: ML/DL-based underwater communication systems rely on large labeled datasets for robust training and model generalization. However, data collection in real-world underwater environments is extremely challenging due to dynamic changes in water temperature, pressure, and salinity [23,33,61]. Many ML/DL architectures, particularly hybrid models combining CNNs and RNNs, struggle with generalization due to variations in underwater acoustic signals across different geographic locations. Additionally, the emergence of Physics-Informed Neural Networks aims to bridge the gap between purely data-driven models and domain-specific physics by integrating environmental parameters into ML/DL-based predictions [41]. However, these networks require specialized datasets that accurately represent underwater propagation conditions, which remain difficult to obtain.
Multipath Propagation and Doppler Effects: Multipath propagation occurs when acoustic signals reflect off surfaces like the ocean floor, causing delays, signal fading, and interference [27,31,61]. Doppler shifts due to the motion of underwater vehicles further complicate signal processing. While attention-based models and BiLSTM architectures show promising improvements in dealing with multipath interference and Doppler shifts, extreme variations still pose a significant challenge [57]. Hybrid models such as CNN-LSTM combinations exhibit enhanced temporal tracking of dynamic underwater channels but struggle with extreme Doppler distortions that lead to data loss and synchronization issues.
Energy Efficiency and Hardware Constraints: Underwater communication devices are often deployed in remote environments with limited access to power sources. High-complexity models such as deep CNN architectures require significant computational resources, making them impractical for long-duration underwater deployments [29]. Pruned CNN architectures and lightweight deep learning models are being explored to mitigate energy consumption, but their performance often suffers due to reduced model complexity [56,63]. Additionally, implementing ML/DL models on low-power underwater sensors requires optimizing hardware configurations to balance computational efficiency and real-time performance.

Addressing these challenges is crucial for advancing AI-driven UWA communication systems. Future research should focus on developing lightweight and adaptive models, enhancing the generalization of techniques, improving robustness against multipath and Doppler effects, and designing energy-efficient algorithms for deployment in underwater networks. By overcoming these limitations, AI-powered underwater communication can become more reliable, scalable, and efficient, supporting applications such as marine research, deep-sea exploration, and naval security.

6.2. Future Research Directions

Future research directions can be categorized into the following key areas:

Development of Lightweight ML/DL Models [43,60,63]: As underwater communication devices operate in resource-constrained environments, optimizing models for real-time edge computing is crucial. Current architectures such as deep CNNs and RNNs often require extensive computational power, making deployment on low-power underwater sensors challenging.

- Future research should focus on knowledge distillation, a technique that transfers knowledge from larger, complex models to smaller, more efficient models while retaining performance.
- Quantization techniques should be explored to reduce model precision requirements, allowing ML/DL systems to run efficiently on energy-limited platforms.
- Incorporating pruning methods can further reduce computational demands by eliminating redundant parameters in deep learning models.
Integration of Physics-Informed Neural Networks (PINNs) [40,65]: Physics-Informed Neural Networks (PINNs) combine traditional ML/DL approaches with domain-specific physics principles, enabling more accurate channel estimation and modulation schemes in underwater environments.

- Future hybrid PINN-ML architectures can improve adaptive modulation by integrating wave propagation models into training data, reducing error rates caused by environmental distortions.
- By employing PINNs, AI models can learn underwater acoustic behaviors, compensating for multipath propagation, Doppler effects, and fluctuating water conditions.
Advancements in Reinforcement Learning for Adaptive Modulation [38,50]:

- Adaptive modulation techniques allow communication systems to dynamically adjust transmission parameters based on real-time channel conditions.
- Reinforcement learning (RL)-based methods such as Q-learning and Deep Q-Networks (DQN) optimize transmission power and modulation schemes to maximize throughput while minimizing bit error rates (BER).
- Meta-learning approaches can further speed up model adaptation, enabling systems to rapidly respond to varying underwater environments without requiring extensive retraining.
Improved Modulation Recognition Techniques [54,62]: Effective modulation recognition is essential for signal demodulation in UWA communication systems.

- Contrastive learning-based models enhance classification accuracy in low-SNR conditions, distinguishing modulation patterns in noisy underwater environments.
- Hybrid CNN-RNN architectures provide better temporal and spatial feature extraction, improving modulation classification accuracy under extreme multipath interference.
Expansion to Novel Modulation Schemes [47,55]: Current ML/DL applications largely focus on conventional single-carrier systems and OFDM-based techniques.

- Future research should explore Index Modulation, which improves spectral efficiency by encoding information in the position of active subcarriers.
- Non-Orthogonal Multiple Access can enhance capacity by allowing multiple users to share the same frequency resource, increasing efficiency in underwater communication.
- Generalized Frequency Division Multiplexing provides flexible subcarrier arrangements that mitigate interference, improving performance in dynamic underwater environments.

These future research directions highlight the need for efficient, scalable, and adaptive models to enhance underwater acoustic communication systems. By addressing these challenges, AI-driven methodologies will pave the way for more reliable, energy-efficient, and intelligent UWA communication networks, supporting critical applications such as deep-sea exploration, naval security, and marine research.
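As a small illustration of the quantization direction listed above, the following sketch applies symmetric 8-bit quantization to a list of weights and then reconstructs them. It is a generic post-training scheme shown under simplifying assumptions (one scale per tensor, round-to-nearest); the weight values and function names are hypothetical and are not drawn from the reviewed studies.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integer codes."""
    return [q * scale for q in quantized]


# Hypothetical weights of a small layer.
weights = [0.5, -1.27, 0.0, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each weight is stored in one byte instead of four, and the reconstruction error is bounded by half the scale factor, which is the accuracy-versus-memory trade-off that quantization research for underwater platforms would need to evaluate.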

7. Responses to Formulated Research Questions

This section addresses the research questions formulated in this article by evaluating the effectiveness of ML-based and DL-based techniques in UWA communication.

7.1. RQ1: How do ML/DL Techniques Improve Channel Estimation in UWA Communication, and What Are the Key System Characteristics and Performance Metrics of These Methods?

ML and DL techniques have revolutionized channel estimation in UWA communication by improving signal prediction, reducing BER, and enhancing communication reliability. Insights drawn from Table 4 and Table 5 show that various ML and DL methods, including LR, DBN, LSTM, and CNN, are applied in both SC-UWA and MC-UWA systems. SC-UWA systems primarily utilize simpler ML models like LR and LSTM, whereas MC-UWA systems employ more complex architectures such as CNN-LSTM hybrids, DNNs, and GANs, indicating that multi-carrier systems require advanced learning techniques to manage signal distortions.

The key characteristics of channel estimation techniques, as presented in Table 6 and Table 7, highlight differences between SC-UWA and MC-UWA systems in terms of bandwidth, transmission distance, modulation schemes, system configurations, subcarriers, and CP. SC-UWA systems typically operate within a bandwidth range of 1-25 kHz and support transmission distances from 0.2 km to 3 km, with modulation schemes mainly limited to BPSK and QPSK. In contrast, MC-UWA systems feature bandwidths extending up to 37.5 kHz, with transmission distances reaching 5 km, utilizing more robust modulation schemes like OFDM, AFDM, and QAM. Additionally, MC-UWA systems integrate subcarriers ranging from 512 to 1024, with CP values between 16 and 256 samples to combat multipath interference.

Table 8 and Table 9 compare training loss, computational complexity, and BER performance across different ML/DL-based channel estimation techniques. Advanced models such as LSTM-based predictors and CNN-enhanced estimators achieve MSE values as low as $10^{-4}$, indicating superior accuracy in channel estimation. While some models, such as DenseNet-based estimators, demand high computational resources, pruned CNN architectures significantly lower complexity, making them more viable for real-time applications. Gains of up to 8 dB at a BER of $10^{-3}$ demonstrate the effectiveness of AI-driven channel estimation in enhancing communication reliability.

7.2. RQ2: How do ML/DL Techniques Improve Adaptive Modulation in UWA Communication, and What Are the Key System Characteristics and Performance Metrics of These Methods?

ML and DL techniques have significantly improved adaptive modulation in UWA communication by dynamically adjusting transmission parameters based on real-time conditions. Insights from Table 10 and Table 11 show that various methods, including SVM, RL, DBN, and CNNs, are applied in both SC-UWA and MC-UWA systems. SC-UWA systems primarily utilize classification-based ML/DL models such as SVM and CNN, whereas MC-UWA systems employ reinforcement learning-based models to optimize modulation schemes in response to channel variations.

The key characteristics of adaptive modulation techniques, as presented in Table 12 and Table 13, highlight differences between SC-UWA and MC-UWA systems in terms of bandwidth, transmission distance, modulation schemes, system configurations, subcarriers, and cyclic prefix. SC-UWA systems typically operate within a 5-10 kHz bandwidth and support transmission distances from 0.82 km to 3 km, with modulation schemes including PSK, QAM, and FSK. In contrast, MC-UWA systems feature bandwidths extending up to 25 kHz, with transmission distances reaching 5 km, utilizing OFDM-based modulation with subcarriers ranging from 32 to 1024 and CP values between 64 and 400 samples to enhance robustness against multipath interference.

Table 14 and Table 15 compare training loss, computational complexity, throughput, and BER performance across different ML/DL-based adaptive modulation techniques. Advanced models such as RL-based adaptive modulation and CNN-enhanced classifiers achieve classification accuracies exceeding 98%, demonstrating superior adaptability to underwater channel variations. While some models, such as meta-learning-based adaptive modulation, demand higher computational resources, pruned CNN architectures significantly lower complexity, making them more viable for real-time applications. BER improvements of up to 14.8% and throughput gains of 25% indicate the effectiveness of adaptive modulation in enhancing communication reliability.
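The reinforcement-learning view of adaptive modulation summarized in this answer can be illustrated with a toy tabular Q-learning agent that chooses between two modulation schemes according to a coarse SNR state. Everything in the sketch is an assumption made for illustration: the two-state channel, the success probabilities, the reward (bits delivered per symbol), and the hyperparameters do not come from the reviewed studies.

```python
import random

MODULATIONS = ["BPSK", "QPSK"]
BITS_PER_SYMBOL = [1, 2]
# Hypothetical probability that a symbol is received correctly,
# indexed as [state][action]; state 0 = low SNR, state 1 = high SNR.
SUCCESS_PROB = [[0.95, 0.10],
                [0.99, 0.95]]


def train(steps=20000, alpha=0.02, gamma=0.5, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0], [0.0, 0.0]]
    state = rng.randrange(2)
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(2)                         # explore
        else:
            action = max(range(2), key=lambda a: q[state][a])  # exploit
        success = rng.random() < SUCCESS_PROB[state][action]
        reward = BITS_PER_SYMBOL[action] if success else 0
        next_state = rng.randrange(2)  # channel state evolves independently of the action
        target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    return q


def greedy_policy(q):
    """Modulation with the highest learned value in each SNR state."""
    return [MODULATIONS[max(range(2), key=lambda a: q[s][a])] for s in range(2)]


q_table = train()
policy = greedy_policy(q_table)
```

Under these assumed probabilities the agent learns to use the more robust scheme at low SNR and the higher-rate scheme at high SNR, which mirrors the throughput-versus-reliability trade-off that RL-based adaptive modulation is meant to resolve.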

7.3. RQ3: How Effective Are ML/DL-Driven Modulation Recognition Approaches in Identifying Modulation Schemes Under Complex Underwater Conditions, and What Are Their Strengths and Limitations?

Modulation recognition using ML and DL techniques has significantly improved the classification accuracy and reliability of UWA communication systems. Insights from Tables 16–18 highlight the effectiveness of various ML and DL techniques, system configurations, and performance metrics in modulation recognition.

Table 16 provides an overview of ML and DL techniques used for modulation recognition, including CNNs, ResNet, SVM, DCN, and hybrid architectures. These models leverage feature extraction, sequence modeling, and contrastive learning to distinguish modulation types under challenging underwater conditions. The training examples used in these studies include both simulated and measured data, ensuring robustness across different environments.

Table 17 presents the key characteristics of modulation recognition systems, including bandwidth, transmission distance, and modulation schemes. The bandwidths range from 1 kHz to 30 kHz, with transmission distances varying from 100 meters to 5 kilometers. The modulation schemes considered include PSK, QAM, FSK, DSSS, OFDM, and LFM, demonstrating the versatility of models in recognizing diverse signal types.

Table 18 compares training loss, computational complexity, classification accuracy, and precision across different ML/DL-based modulation recognition techniques. Advanced models such as CNN-RNN hybrids and transformer-based architectures achieve classification accuracies exceeding 95%, demonstrating superior performance over traditional feature-based methods. While some models, such as ensemble learning-based classifiers, demand higher computational resources, pruned CNN architectures significantly lower complexity, making them more viable for real-time applications. Precision values range from 40% to 100%, indicating the effectiveness of modulation recognition in minimizing false classifications.

7.4. RQ4: What Innovative Approaches and Emerging Trends in Machine/Deep Learning Can Be Employed to Address Unresolved Challenges in Underwater Acoustic Communication, and How Can These Advancements Shape the Future of Intelligent, Efficient, and Scalable UWA Systems?

ML and DL have transformed underwater acoustic communication, yet several challenges remain unresolved, as discussed in Section 6.1. To address these challenges, future research directions have been discussed in Section 6.2. To summarize, future research must focus on developing lightweight, real-time ML/DL models optimized for edge computing, reducing complexity while maintaining accuracy. Techniques such as knowledge distillation and quantization can significantly improve computational efficiency. Another promising direction is the integration of physics-informed neural networks, which incorporate underwater wave propagation physics into ML/DL models, enhancing signal processing accuracy and reliability. Advancements in reinforcement learning for adaptive modulation, particularly meta-learning approaches, can improve system adaptability to fluctuating underwater environments, optimizing spectral efficiency. Moreover, contrastive learning-based modulation recognition and hybrid CNN-RNN architectures should be refined to boost classification accuracy in low-SNR conditions and extreme multipath interference scenarios. Expanding AI applications to novel modulation schemes, including index modulation, non-orthogonal multiple access, and generalized frequency division multiplexing, will further optimize UWA communication. These advancements will pave the way for more efficient, scalable, and intelligent underwater acoustic systems, ensuring better reliability and adaptability in oceanic exploration, naval security, and maritime connectivity.

8. Limitations of Research

Although this systematic literature review (SLR) has been conducted following established guidelines [19] and strictly adhering to the developed review protocol, certain limitations remain:

Search Exhaustiveness: While we carefully selected appropriate search terms and thoroughly scanned the search results, some queries returned thousands of articles, making exhaustive screening impractical. Additionally, several studies were excluded based on their titles alone, so relevant research whose title did not reflect its content may have been missed. Consequently, we do not claim absolute exhaustiveness in this review.
Database Selection: This SLR is based on four major scientific databases: IEEE, Elsevier, Springer, and Google Scholar, ensuring access to high-quality journals and conference publications. However, relevant studies may exist in other databases that were not included in our search. While this may result in missing some recent research, we believe that the selected databases provide a comprehensive and high-quality representation of the latest advancements in underwater acoustic communication.
Scope of Inclusion Criteria: The inclusion criteria were designed to focus on studies published between 2020 and 2025. While this ensures relevance to recent advancements, older foundational works that may still hold significance were excluded. Future research could incorporate a broader time frame to capture historical developments in machine/deep learning applications for underwater acoustic communication.
Generalization of Findings: The findings of this SLR are based on selected studies that met predefined criteria. While efforts were made to ensure a balanced representation of methodologies, there is a possibility that certain niche approaches or emerging techniques were underrepresented. Further studies incorporating additional perspectives may enhance the comprehensiveness of future reviews.

Despite these limitations, we believe that the core findings of this SLR provide valuable insights into the role of machine/deep learning in underwater acoustic communication and serve as a strong foundation for future research.

9. Conclusion

This systematic literature review has provided a structured evaluation of ML and DL techniques in underwater acoustic communication, focusing on four key research questions. For RQ1, ML and DL models significantly enhance channel estimation by improving signal prediction and reducing bit error rates. Techniques such as CNNs, LSTMs, and GANs adapt well to underwater variations, achieving MSE values as low as $10^{-4}$. However, computational complexity remains a challenge for real-time deployment. Regarding RQ2, adaptive modulation powered by ML and DL dynamically optimizes transmission parameters, improving efficiency. RL and meta-learning approaches yield BER reductions of up to 14.8% and throughput gains of 25%. Further research is needed to address energy constraints and real-time adaptability. For RQ3, ML/DL-driven modulation recognition achieves classification accuracies exceeding 95%. CNN-RNN hybrids, transformers, and contrastive learning models improve signal detection, though precision variability (33.33%–100% in Table 18) under low-SNR and high-Doppler conditions necessitates refinements. RQ4 highlights future research directions, including lightweight, real-time models, integration of physics-informed neural networks for enhanced signal processing, and advancements in RL-based adaptive modulation. Hybrid deep learning architectures combining data augmentation and adversarial training are promising for improving modulation recognition accuracy. By consolidating insights from multiple studies, this review provides a foundation for future AI-driven UWA communication advancements, unlocking new possibilities in deep-sea exploration, naval security, and maritime connectivity.

References

Theocharidis, T.; Kavallieratou, E. Underwater communication technologies: a review. Telecommunication Systems 2025, 88, 54.
Stojanovic, M.; Preisig, J. Underwater acoustic communication channels: Propagation models and statistical characterization. IEEE Communications Magazine 2009, 47, 84–89.
Ali, M.F.; Jayakody, D.N.K.; Chursin, Y.A.; Affes, S.; Dmitry, S. Recent advances and future directions on underwater wireless communications. Archives of Computational Methods in Engineering 2020, 27, 1379–1412.
Huang, L.; Wang, Y.; Zhang, Q.; Han, J.; Tan, W.; Tian, Z. Machine Learning for Underwater Acoustic Communications. IEEE Wireless Communications 2022, 29, 102–108.
Huang, L.; Wang, Y.; Zhang, Q.; Han, J.; Tan, W.; Tian, Z. Machine learning for underwater acoustic communications. IEEE Wireless Communications 2022, 29, 102–108.
Niu, H.; Li, X.; Zhang, Y.; Xu, J. Advances and applications of machine learning in underwater acoustics. Intelligent Marine Technology and Systems 2023, 1, 8.
Shovon, I.I.; Shin, S. Survey on multi-path routing protocols of underwater wireless sensor networks: Advancement and applications. Electronics 2022, 11, 3467.
Menaka, D.; Gauni, S.; Manimegalai, C.T.; Kalimuthu, K. Challenges and vision of wireless optical and acoustic communication in underwater environment. International Journal of Communication Systems 2022, 35, e5227.
Zia, M.Y.I.; Poncela, J.; Otero, P. State-of-the-art underwater acoustic communication modems: Classifications, analyses and design challenges. Wireless Personal Communications 2021, 116, 1325–1360.
Khan, M.R.; Das, B.; Pati, B.B. Channel estimation strategies for underwater acoustic (UWA) communication: An overview. Journal of the Franklin Institute 2020, 357, 7229–7265.
Wu, F.Y.; Yang, H.Z.; Zhou, Y.; Tong, F. A Fast Kalman Equalizer for Single-Carrier Underwater Acoustic Communication. IEEE Transactions on Industrial Informatics 2024.
Jing, L.; Dong, C.; He, C.; Shi, W.; Wang, H.; Zhou, Y. Adaptive Modulation and Coding for Underwater Acoustic OTFS Communications Based on Meta-Learning. IEEE Communications Letters 2024.
Wang, X.; Liu, J.; Han, G.; Wang, J.; Cui, J. End-to-End modulation recognition in underwater acoustic communications using temporal large kernel convolution with gated channel mixer. IEEE Transactions on Vehicular Technology 2024.
Shwetha, M.; Krishnaveni, S. A systematic analysis, outstanding challenges, and future prospects for routing protocols and machine learning algorithms in underwater wireless acoustic sensor networks. Journal of Interconnection Networks 2025, 25, 2330001.
Niu, H.; Li, X.; Zhang, Y.; Xu, J. Advances and applications of machine learning in underwater acoustics. Intelligent Marine Technology and Systems 2023, 1, 8.
Wang, B.; Yang, H.; Fang, T. Modulation recognition of underwater acoustic communication signals based on deep learning. EURASIP Journal on Advances in Signal Processing 2024, 2024, 103.
Liu, H.; Ma, L.; Wang, Z.; Qiao, G. Channel Prediction for Underwater Acoustic Communication: A Review and Performance Evaluation of Algorithms. Remote Sens. 2024, 16, 1546.
Li, Z.; Chitre, M.; Stojanovic, M. Underwater acoustic communications. Nature Reviews Electrical Engineering 2025, 2, 83–95.
Shaffril, H.A.M.; Samah, A.A.; Samsuddin, S.F. Guidelines for developing a systematic literature review for studies related to climate change adaptation. Environmental Science and Pollution Research 2021, 28, 22265–22277.
Yougan Chen, Weijian Yu, X. S.L.W.Y.T.X.X. Environment-aware communication channel quality prediction for underwater acoustic transmissions: A machine learning method. Applied Acoustics 2021, 181, 108128.
Liu, L.; Cai, L.; Ma, L.; Qiao, G. Channel State Information Prediction for Adaptive Underwater Acoustic Downlink OFDMA System: Deep Neural Networks Based Approach. IEEE Transactions on Vehicular Technology 2021, 70, 9063–9076.
Lee-Leon, A.; Yuen, C.; Herremans, D. Underwater acoustic communication receiver using deep belief network. IEEE Transactions on Communications 2021, 69, 3698–3708.
Gang Qiao, Yufei Liu, F. Z.Y.Z.S.M.G.Y. Deep learning-based M-ary spread spectrum communication system in shallow water acoustic channel. Applied Acoustics 2022, 192, 108742.
Hu, X.; Huo, Y.; Dong, X.; Wu, F.Y.; Huang, A. Channel prediction using adaptive bidirectional GRU for underwater MIMO communications. IEEE Internet of Things Journal 2023, 11, 3250–3263.
Yonglin Zhang, Haibin Wang, C. L.X.C.F.M. On the performance of deep neural network aided channel estimation for underwater acoustic OFDM communications. Ocean Engineering 2022, 259, 111518.
Xinbin Li, Zhaoxing Han, H. Y.L.Y.S.H. Deep Learning for OFDM Channel Estimation in Impulsive Noise Environments. Wireless Personal Communications 2022, 125, 2947–2964.
Zhang, Y.; Li, C.; Wang, H.; Wang, J.; Yang, F.; Meriaudeau, F. Deep learning aided OFDM receiver for underwater acoustic communications. Applied Acoustics 2022, 187, 108515.
Kim, Y.; Lee, H.; Seol, S.; Chung, J. 2D BiLSTM based channel impulse response estimator for improving throughput in underwater sensor network. IEEE Access 2022, 10, 57227–57233.
Feng, X.; Zhou, M.; Wang, J.; Sun, H.; Pan, G.; Wen, M. Model-driven deep learning-based estimation for underwater acoustic channels with uncertain sparsity. IEEE Transactions on Wireless Communications 2023, 23, 5710–5725.
Zhu, Z.; Tong, F.; Zhou, Y.; Zhang, Z.; Zhang, F. Deep learning prediction of time-varying underwater acoustic channel based on LSTM with attention mechanism. Journal of Marine Science and Application 2023, 22, 650–658.
Zhang, Y.; Chang, J.; Liu, Y.; Xing, L.; Shen, X. Deep learning and expert knowledge based underwater acoustic OFDM receiver. Physical Communication 2023, 58, 102041.
Guo, J.; Guo, T.; Li, M.; Wu, T.; Lin, H. Underwater-Acoustic-OFDM Channel Estimation Based on Deep Learning and Data Augmentation. Electronics 2024, 13, 689.
Kapileswar, N.; Phani Kumar, P. Optimized deep learning driven signal detection and adaptive channel estimation in underwater acoustic IoT networks. International Journal of Communication Systems 2024, 37, e5673.
Qarabaqi, P.; Stojanovic, M. Statistical characterization and computationally efficient modeling of a class of underwater acoustic communication channels. IEEE Journal of Oceanic Engineering 2013, 38, 701–717.
van Walree, P.A.; Socheleau, F.X.; Otnes, R.; Jenserud, T. The watermark benchmark for underwater acoustic modulation schemes. IEEE Journal of Oceanic Engineering 2017, 42, 1007–1018.
Liu, S.; Yan, H.; Ma, L.; Liu, Y.; Han, X. UACC-GAN: A Stochastic Channel Simulator for Underwater Acoustic Communication. IEEE Journal of Oceanic Engineering 2024.
Liu, S.; Adil, M.; Ma, L.; Mazhar, S.; Qiao, G. DenseNet-Based Robust Channel Estimation in OFDM for Improving Underwater Acoustic Communication. IEEE Journal of Oceanic Engineering 2025.
Cui, X.; Zhang, C.; Li, J.; Jiang, B.; Li, S.; Liu, J. Deep Learning Model-Driven Channel Estimation and Equalization for Underwater Acoustic OFDM Receivers. Internet Technology Letters 2025, p. e619.
Tian, T.; Raj, A.; Xavier, B.M.; Zhang, Y.; Wu, F.Y.; Yang, K. A Multi-Task Learning Framework for Underwater Acoustic Channel Prediction: Performance Analysis on Real-World Data. IEEE Transactions on Wireless Communications 2024.
Huang, P.; Li, Q.; Huang, D.; Wang, J. Channel estimation and symbol detection for AFDM over doubly selective fading channels. Physical Communication 2025, 69, 102597.
Zhang, Y.; Wang, Y.; Liu, Y.; Shi, L.; Zang, Y. A Deep Learning Receiver for Underwater Acoustic OTFS Communications With Doppler Squint Effect. IEEE Wireless Communications Letters 2025.
Alamgir, M.; Sultana, M.N.; Chang, K. Link adaptation on an underwater communications network using machine learning algorithms: Boosted regression tree approach. IEEE Access 2020, 8, 73957–73971.
Anitha, D.; Karthika, R. Hybrid deep learning-based adaptive multiple access schemes underwater wireless networks. Intelligent Automation & Soft Computing 2023, 35, 2463–2477.
Sweta, T.; Ruthrapriya, S.; Sneka, J.; Alex, J.S.R.; Rohith, G.; Das, M. Reinforcement learning-based automated modulation switching algorithm for an enhanced underwater acoustic communication. Results in Engineering 2024, 23, 102791.
Qiu, Y.; Yang, X.; Tong, F.; Chen, D. Evaluation of Reinforcement Learning-Based Adaptive Modulation in Shallow Sea Acoustic Communication. Journal of Marine Science and Application 2025, pp. 1–8.
Huang, L.; Zhang, Q.; Tan, W.; Wang, Y.; Zhang, L.; He, C.; Tian, Z. Adaptive modulation and coding in underwater acoustic communications: a machine learning perspective. EURASIP Journal on Wireless Communications and Networking 2020, 2020, 1–25.
Byun, J.; Cho, Y.H.; Im, T.; Ko, H.L.; Shin, K.; Kim, J.; Jo, O. Iterative learning for reliable link adaptation in the Internet of Underwater Things. IEEE Access 2021, 9, 30408–30416.
Cui, X.; Zhang, Z.; Li, J.; Jiang, B.; Li, S.; Liu, J. Reinforcement learning-based adaptive modulation scheme over underwater acoustic OFDM communication channels. Physical Communication 2023, 61, 102207.
Cui, X.; Yan, P.; Li, J.; Li, S.; Liu, J. Deep reinforcement learning-based adaptive modulation for OFDM underwater acoustic communication system. EURASIP Journal on Advances in Signal Processing 2023, 2023, 1.
Jing, L.; Dong, C.; He, C.; Shi, W.; Wang, H.; Zhou, Y. Adaptive Modulation and Coding for Underwater Acoustic OTFS Communications Based on Meta-Learning. IEEE Communications Letters 2024.
Wang, Y.; Jin, Y.; Zhang, H.; Lu, Q.; Cao, C.; Sang, Z.; Sun, M. Underwater communication signal recognition using sequence convolutional network. IEEE Access 2021, 9, 46886–46899.
Huang, Z.; Li, S.; Yang, X.; Wang, J. OAE-EEKNN: An accurate and efficient automatic modulation recognition method for underwater acoustic signals. IEEE Signal Processing Letters 2022, 29, 518–522.
Zhang, W.; Yang, X.; Leng, C.; Wang, J.; Mao, S. Modulation recognition of underwater acoustic signals using deep hybrid neural networks. IEEE Transactions on Wireless Communications 2022, 21, 5977–5988.
Gao, D.; Hua, W.; Su, W.; Xu, Z.; Chen, K. Supervised contrastive learning-based modulation classification of underwater acoustic communication. Wireless Communications and Mobile Computing 2022, 2022, 3995331.
Zhang, R.; He, C.; Jing, L.; Zhou, C.; Long, C.; Li, J. A modulation recognition system for underwater acoustic communication signals based on higher-order cumulants and deep learning. Journal of Marine Science and Engineering 2023, 11, 1632.
Wang, X.; Tu, Y.; Liu, J.; Han, G.; Yu, C.; Cui, J.H. Edge-Enabled Modulation Classification in Internet of Underwater Things Based on Network Pruning and Ensemble Learning. IEEE Internet of Things Journal 2023, 11, 13608–13621.
Yao, X.; Yang, H.; Sheng, M. Automatic modulation classification for underwater acoustic communication signals based on deep complex networks. Entropy 2023, 25, 318.
Guerrero-Chilabert, G.S.; Moreno-Salinas, D.; Sánchez-Moreno, J. Design and Development of an SVM-Powered Underwater Acoustic Modem. Journal of Marine Science and Engineering 2024, 12, 773.
Jiang, Z.; Zhang, J.; Wang, T.; Wang, H. Modulation recognition of underwater acoustic communication signals based on neural architecture search. Applied Acoustics 2024, 225, 110155.
Wang, B.; Yang, H.; Fang, T. Modulation recognition of underwater acoustic communication signals based on deep learning. EURASIP Journal on Advances in Signal Processing 2024, 2024, 103.
Yang, X.; Wang, Z.; Shen, T.; Zhao, D. Modulation Classification of Underwater Communication Signals Based on Channel Estimation. Journal of Marine Science and Engineering 2024, 12, 1877.
Wang, Y.; Shen, T.; Wang, T.; Qiao, G.; Zhou, F. Modulation recognition for underwater acoustic communication based on hybrid neural network and feature fusion. Applied Acoustics 2024, 225, 110185.
Wang, J.; Huang, Z.; Shi, W.; Mao, S. One2ThreeNet: An automatic microscale-based modulation recognition method for underwater acoustic communication systems. IEEE Transactions on Wireless Communications 2024.
Li, J.; Jia, Q.; Cui, X.; Gulliver, T.A.; Jiang, B.; Li, S.; Yang, J. Automatic modulation recognition of underwater acoustic signals using a two-stream transformer. IEEE Internet of Things Journal 2024.
Wand, M.; Kristoffersen, M.B.; Franzke, A.W.; Schmidhuber, J. Analysis of neural network based proportional myoelectric hand prosthesis control. IEEE Transactions on Biomedical Engineering 2022, 69, 2283–2293.
Willmott, C.J.; Matsuura, K. Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance. Climate Research 2005, 30, 79–82.
Sutton, R.S.; Barto, A.G. Reinforcement learning: An introduction; MIT Press: Cambridge, 1998.
Tharwat, A. Classification assessment methods. Applied Computing and Informatics 2021, 17, 168–192.

Figure 1. Overview of the SLR: From the Article Selection to Comparative Analysis.

Figure 2. A Typical UWA OFDM (a) Transmitter and (b) Receiver.

Figure 3. A Typical UWA Transceiver Using ML/DL for Adaptive Modulation.

Figure 4. A Typical UWA Transceiver Using ML/DL for Modulation Recognition.

Figure 5. Stepwise Selection Process for Research Articles, illustrating filtering criteria applied across scientific databases to refine relevant studies for systematic analysis.

Figure 6. Year-Wise Distribution of Selected Research Articles from WoS-Indexed Journals (2020–2025), Illustrating Publication Trends and the Evolving Focus on ML/DL Applications in UWA Communication.

Table 1. Summary of Existing Reviews on ML/DL Techniques in UWA Communication.

Ref. (Year)	Key Contributions	Limitations
[4] (2022)	Targets adaptive modulation at the physical layer, and provides a taxonomy of ML algorithms while discussing their potential to address UWA challenges.	Does not evaluate ML and DL techniques for channel estimation or modulation recognition, and does not provide performance metrics for adaptive modulation.
[5] (2023)	Highlights the potential of ML and DL in addressing dynamic underwater environments, with a focus on adaptive modulation, channel prediction, and demodulation.	Lacks a detailed comparative analysis of ML and DL algorithms, including their system characteristics and performance metrics for key UWA challenges.
[6] (2023)	Focuses on source localization, target recognition, and geoacoustic inversion, while providing an evaluation of key techniques, datasets, and ML/DL models.	Lacks emphasis on ML and DL techniques for addressing UWA challenges such as channel estimation, adaptive modulation, and modulation recognition.
[17] (2024)	Analyzes UWA channel prediction techniques, categorizing them into linear, kernel-based, and deep learning approaches, with evaluations of performance and complexity.	Lacks investigations and the impact of adaptive modulation and modulation recognition on enhancing the efficiency and reliability of UWA communication.
[18] (2025)	Focuses on channel modeling, signal processing techniques, and network protocols, while suggesting future directions like standardized models and data-driven solutions.	Lacks exploration of ML and DL applications for specific UWA challenges, such as channel estimation, adaptive modulation, and modulation recognition.
[1] (2025)	Investigates DL techniques for modulation recognition in UWA communication, proposing a hybrid model with multi-scale feature fusion.	Does not explore the other key components such as channel estimation and adaptive modulation.
[14] (2025)	Analyzes routing protocols and ML algorithms, highlighting benefits, challenges, future prospects, and providing a detailed taxonomy and performance evaluation.	Does not explore channel estimation, adaptive modulation, and modulation recognition in terms of their key system characteristics and performance metrics.

Table 2. Search Results for ML and DL Techniques in UWA Communication (2020–2025).

	Search Term	IEEE	Elsevier	Springer
1	Underwater Acoustic	5189	8422	6187
2	Underwater Acoustic Machine Learning	346	1885	1264
3	Underwater Acoustic Deep Learning	601	2001	1350
4	Underwater Acoustic Channel Estimation Machine Learning	24	586	296
5	Underwater Acoustic Channel Estimation Deep Learning	60	649	342
6	Underwater Acoustic Channel Prediction Machine Learning	23	672	326
7	Underwater Acoustic Channel Prediction Deep Learning	19	744	363
8	Underwater Acoustic Receiver Machine Learning	36	526	359
9	Underwater Acoustic Receiver Deep Learning	55	561	373
10	Underwater Acoustic Modulation Classification	52	419	299
11	Underwater Acoustic Modem Machine Learning	6	107	43
12	Underwater Acoustic Modulation Machine Learning	42	382	243
13	Underwater Acoustic Modulation Deep Learning	76	407	276
14	Underwater Acoustic Modulation Recognition Machine Learning	16	199	128
15	Underwater Acoustic Modulation Recognition Deep Learning	28	222	147
Total Articles from Databases		6983	18882	12396
Sum of All Articles		38261
Additional Google Scholar Articles		3000
Final Grand Total (All Sources)		41261

Table 3. Systematic Process for Extracting, Analyzing, and Classifying Research Studies in ML/DL-Based UWA Communication: Outlining Key Data Collection Methods, Evaluation Criteria, and Classification Approaches.

No.	Item	Corresponding Details
1	Citation Data	It includes the title, author(s), publication year, publisher, and research type (journal or conference).
2	Overview	A concise summary outlining the fundamental proposal and primary objective of the research
3	Results	Findings obtained from the analyzed research, highlighting key insights and conclusions
4	Data Collection	Specifies whether the study employs quantitative or qualitative data gathering methods
5	Assumptions	Identifies any underlying assumptions made to support and validate the research findings
6	Validation	Describes the methodology used to verify the accuracy and reliability of the proposed study
7	Channel Estimation Techniques	Overview: Table 4 and Table 5; Characteristics: Table 6 and Table 7; Comparison: Table 8 and Table 9
8	Adaptive Modulation Techniques	Overview: Table 10 and Table 11; Characteristics: Table 12 and Table 13; Comparison: Table 14 and Table 15
9	Modulation Recognition	Overview: Table 16; Characteristics: Table 17; Comparison: Table 18

Table 4. Overview of ML/DL-Based Channel Estimation in SC-UWA Communication.

Ref	ML/DL Technique	Optimizer	Training Examples
[20]	LR	Gradient Descent	Measured Data
[23]	LSTM, BiLSTM, SBULSTM	—	Simulated (BELLHOP) and Measured Data
[24]	ABiGRU	Adam	Measured and Simulated
[30]	AttLstmPreNet	—	Simulated Data (From [34])
[36]	UACC-GAN	Adam	Measured Data (From [35])

Table 5. Overview of ML/DL-Based Channel Estimation in MC-UWA Communication.

Ref	ML/DL Technique	Optimizer	Training Examples
[21]	CsiPreNet	Adam	Measured Data
[25]	DNN	Adam	Measured Data (From [35])
[26]	DAE and DNN	—	Simulated Data (GBG)
[27]	CNN	Adam	Measured and Simulated
[28]	2D BiLSTM	Adam	Measured and Simulated
[29]	UDNet	Adam	Measured and Simulated
[31]	SC-CNN,AM-BiLSTM	RMSprop	Measured Data (From [35])
[32]	CWGAN-GP, CNN, CAD	—	Simulated Data (From [34])
[33]	BDPCNN	Pelican	Unclear
[36]	UACC-GAN	Adam	Measured Data (From [35])
[39]	LSTM and Transformer	Adam	Measured Data
[37]	DenseNet	Adam	Measured Data (From [35])
[38]	CNN and LSTM	Adam	Measured Data (From [35])
[40]	DNN	Adam	Simulated Data
[41]	S-CNN-ResNet	Adam	Simulated Data

Table 6. Characteristics of ML/DL-Based Channel Estimation in SC-UWA Comm.

Ref	Bandwidth (kHz)	Tx-Rx Distance (km)	Modulation
[20]	2	0.205	BPSK
[23]	4	2 to 3	BPSK, DS-SS
[24]	14 to 18	1	QPSK
[30]	10 & 5	1	—
[36]	10 to 18	—	FH-SS

Table 7. Characteristics of ML/DL-Based Channel Estimation in MC-UWA Comm.

Ref	Bandwidth (kHz)	Tx-Rx Distance (km)	System	Subcarriers	Modulation	CP
[21]	4	1 to 5	OFDMA	681	BPSK, QPSK, 8-QAM, 16-QAM	25
[25]	—	0.54, 0.75, 0.80, 1.08, 3.16	OFDM	64	QPSK	16
[26]	—	—	OFDM	512	16QAM	64
[27]	—	0.75, 3	OFDM	—	QPSK	—
[28]	5	1	OFDM	462	—	22.6ms
[29]	5	1.5, 3	OFDM	1024	QPSK, 8PSK, 16QAM	256
[31]	6	0.54, 0.75, 3.16	OFDM	512	BPSK	128
[32]	6-10	0.75, 1.08	OFDM	512	QPSK	128
[33]	1000	—	MIMO-OFDM	512	QPSK	128
[36]	10-18	—	OFDM	1024	QPSK	256
[39]	—	0.883, 0.967	OFDM	—	—	—
[37]	32.5-37.5	0.8, 1.08, 3.16	OFDM	1024	BPSK, QPSK	256
[38]	2	—	OFDM	64	QPSK	16
[40]	—	—	AFDM	32, 128	BPSK, QPSK	0
[41]	4	—	OTFS	—	BPSK	—

Table 8. Comparison for ML/DL-Based Channel Estimation in SC-UWA Communication

Ref.	Training Loss	Complexity	Channel Prediction	Gain (BER)
[20]	Not Given (Cross-Entropy)	—	66% Accuracy, 88% Precision	—
[23]	Not Given (Cross-Entropy)	—	—	$1.2 \times 10^{- 3}$
[24]	0.01285 (MAE)	Lower (Big-O)	$0.55 \times 10^{- 5}$	$2 \times 10^{- 5}$
[30]	$4 \times 10^{- 3}$ (MAE)	—	7% better	—
[36]	Not Given (WGAN-GP)	—	TVIR, CDF, JS Divergence, Entropy	Close to measurements

Table 9. Comparison for ML/DL-Based Channel Estimation Techniques in MC-UWA Comm.

Ref.	Training Loss	Complexity	Channel Prediction	Gain (BER)
[21]	0.025 (MAE)	Higher (Big-O, Runtime)	—	$0.5 \times 10^{- 6}$
[25]	$1.2 \times 10^{- 4}$ (MSE)	Medium $O (L K^{2})$	Near optimal	40% Improvement
[26]	0.1 (L2 loss function)	—	—	$1 \times 10^{- 2}$ at an SNR of 20 dB
[27]	Combines MSE and BER	47.8 ms (a single OFDM block)	—	Improvement of over 0.17
[28]	MSE	—	—	$1 \times 10^{- 3}$ (ComNet)
[29]	Not Given (MSE)	Same (Time complexity)	$- 24$ dB (NMSE)	$4 \times 10^{- 4}$ (AMP)
[31]	$0.018$ (MSE)	2.12 MB (Memory), 561,526 (Parameters), $1.04 \times 10^{6}$ (FLOPs)	—	$1 \times 10^{- 3}$ (ComNet)
[32]	—	—	$2 \times 10^{- 6}$ (MSE)	$0.5 \times 10^{- 3}$ (ChannelNet)
[33]	$0.01$ (MSE)	0.2 MB (Memory), 8,956 (Parameters), Lower by $0.5 \times 10^{5}$ (FLOPs)	$0.25 \times 10^{- 6}$ (MSE)	$1 \times 10^{- 3}$ (biLSTM)
[36]	Not Given (WGAN-GP)	—	TVIR, CDF, JS Divergence, Entropy	Close to measurements
[39]	$4.09 \times 10^{- 3}$ (MSE)	8,460,928 (Parameters), 23.7 s (T. Time), 83.3 ms (P. Time)	$9 \times 10^{- 4}$ (MSE)	Not Done
[37]	$0.05$ (MSE)	Lower (CNN)	Visually comparable (TVIR), $\approx 0$	$7 \times 10^{- 4}$ (FC-NN)
[38]	Not Given (MSE)	Not Done	Not Done	$1 \times 10^{- 2}$ (ComNet)
[40]	Not Given (MSE)	Lower by $21.3$ ms (Runtime)	$1 \times 10^{- 4}$ (NMSE)	$1.7 \times 10^{- 2}$ (LMMSE)
[41]	0.02 (MSE)	20.70 MB (Memory), 4,996,384 (Parameters), $2.48 \times 10^{8}$ (FLOPs)	3 dB gain at $10^{- 3}$	$1 \times 10^{- 7}$ (CNN-ResNet)
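
Table 8 and Table 9 report channel prediction quality as MSE or NMSE, in some rows in linear units and in others in dB. The following Python sketch shows one common convention for both metrics on a complex channel vector. The function names and the toy channel taps are hypothetical, and individual studies may normalize differently, so the sketch should be read as an aid for interpreting the columns rather than as the definition used by any specific reference.

```python
import numpy as np

def mse(h_true, h_est):
    """Mean squared error between true and estimated (possibly complex) channel taps."""
    h_true = np.asarray(h_true)
    h_est = np.asarray(h_est)
    return float(np.mean(np.abs(h_true - h_est) ** 2))

def nmse_db(h_true, h_est):
    """Error energy normalized by the true channel energy, expressed in dB."""
    h_true = np.asarray(h_true)
    h_est = np.asarray(h_est)
    error_energy = np.sum(np.abs(h_true - h_est) ** 2)
    channel_energy = np.sum(np.abs(h_true) ** 2)
    return float(10.0 * np.log10(error_energy / channel_energy))

# Hypothetical 4-tap channel and an estimate with a 10% amplitude error on every tap.
h_true = np.array([1 + 0j, 0.5j, 0, -0.25])
h_est = 1.1 * h_true

mse_value = mse(h_true, h_est)
nmse_value_db = nmse_db(h_true, h_est)
```

Under this convention, an NMSE of $-24$ dB corresponds to a linear value of about $4 \times 10^{-3}$, and a linear NMSE of $10^{-4}$ corresponds to $-40$ dB, which helps when comparing rows that use different units.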

Table 10. Overview of ML/DL-Based Adaptive Modulation in SC-UWA Communication

Ref.	ML/DL Technique	Optimizer	Training Examples
[42]	SVM, KNN, LDA, BRT	—	Measured Data
[47]	MLR, MLP	—	Measured Data
[43]	CNN	Adam	Simulated Data
[44]	RL	Not Applicable	No Dataset Used
[45]	RL	Not Applicable	Measured Data

Table 11. Overview of ML/DL Based Adaptive Modulation in MC-UWA Communication

Ref.	ML/DL Technique	Optimizer	Training Examples
[46]	A-kNN	Not Applicable	Measured Data
[48]	RL	PPO	Simulated (Bellhop) & Measured Data
[43]	CNN	Adam	Simulated Data
[49]	RL	Adam	Simulated Data (Bellhop)
[50]	CNN	Adam	Measured Data
[44]	RL	Not Applicable	No Dataset Used
[45]	RL	Not Applicable	Measured Data

Table 12. Characteristics of Adaptive Modulation Techniques in SC-UWA Comm.

Ref.	Bandwidth	Tx-Rx Distance	Modulation
[42]	5 kHz, 10 kHz	1 km, 2 km, 3 km	PSK, QAM
[47]	4 kHz	1 km	PSK, QAM
[43]	—	—	CDMA, TDMA
[44]	—	5 to 355 cm	FH-BFSK, ASK, PSK
[45]	5 kHz	0.82 km	FSK, DS-SS

Table 13. Characteristics of Adaptive Modulation Techniques in MC-UWA Comm.

Ref.	Bandwidth	Tx-Rx Distance	System	Subcarriers	Modulation	CP
[46]	—	—	OFDM	—	FSK	—
[48]	6 kHz	5 km	OFDM	256	PSK, QAM	64
[43]	—	—	OFDM	—	—	—
[49]	8 kHz	5 km	OFDM	1024	PSK, QAM	400
[50]	6 kHz	0.3 km to 1.5 km	OTFS	32	PSK	—
[44]	—	5 to 355 cm	OFDM	—	—	—
[45]	5 kHz	0.82 km	OFDM	200	—	—

Table 14. Comparison of Adaptive Modulation Techniques in SC-UWA Communication

Ref.	Training Loss	Complexity	Throughput	Gain in BER
[42]	Not Given (MSE)	—	>99% (Accuracy)	—
[47]	—	—	25% higher	Comparable to desired values
[43]	—	—	>98% (Accuracy)	Substantial
[44]	Not Applicable	—	3.648% (RSSI)	32%
[45]	Not Given (TD)	—	14.8% better	$4.5 \times 10^{- 3}$

Table 15. Comparison of Adaptive Modulation Techniques in MC-UWA Communication

Ref.	Training Loss	Complexity	Throughput or Equivalent	Gain in BER
[46]	Not Applicable	Less than the standard (Processing + Memory)	Near ideal	Near ideal
[48]	Not Given (Actor and Critic Loss)	—	4% higher	Better than others
[43]	—	—	>98% (Accuracy)	Substantial
[49]	Not Given (MSE)	High	Higher than others	Better than threshold
[50]	Not Given (Cross-entropy)	Higher	4% higher	Kept at $0.001$ throughout
[44]	Not Applicable	—	3.648% (RSSI)	32%
[45]	Not Applicable	—	14.8% higher	Stable
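
To illustrate the kind of trade-off that the RL-based schemes in Table 14 and Table 15 learn, the following Python sketch uses an epsilon-greedy multi-armed bandit to choose among modulation schemes so as to maximize delivered bits per symbol. This is a deliberately simplified, stateless stand-in and not the algorithm of any reviewed study; the scheme names, bits per symbol, and success probabilities are hypothetical values for a single fixed channel condition.

```python
import random

# Hypothetical schemes: (bits per symbol, assumed packet success probability).
# The numbers are illustrative only and do not come from measurements.
SCHEMES = {
    "BPSK": (1, 0.99),
    "QPSK": (2, 0.95),
    "16QAM": (4, 0.20),
}

def run_epsilon_greedy(num_packets=5000, epsilon=0.1, seed=0):
    """Select a scheme per packet and track the mean delivered bits per symbol."""
    rng = random.Random(seed)
    names = list(SCHEMES)
    counts = {name: 0 for name in names}
    value = {name: 0.0 for name in names}  # running mean reward per scheme
    for _ in range(num_packets):
        if rng.random() < epsilon:
            choice = rng.choice(names)  # explore a random scheme
        else:
            choice = max(names, key=lambda name: value[name])  # exploit best estimate
        bits, p_success = SCHEMES[choice]
        # Reward: bits delivered if the packet succeeds, zero otherwise.
        reward = bits if rng.random() < p_success else 0
        counts[choice] += 1
        # Incremental update of the running mean reward.
        value[choice] += (reward - value[choice]) / counts[choice]
    return counts, value

counts, value = run_epsilon_greedy()
best_scheme = max(counts, key=counts.get)
```

With these assumed probabilities, the expected rewards are 0.99, 1.9, and 0.8 bits per symbol, so the agent should settle on QPSK as the most frequently chosen scheme. The RL approaches in the reviewed studies additionally condition the decision on channel state, which a stateless bandit like this one does not model.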

Table 16. Overview of ML/DL-Based Modulation Recognition

Ref.	ML/DL Technique	Optimizer	Training Examples
[51]	SCNet	Adam	Simulated Data
[52]	OAE-EEKNN	Gradient Descent & Adam	Measured Data
[53]	RNN & CNN	Adam	Measured Data
[54]	SCL	Adam	Simulated and Measured Data
[55]	ResNet	Adam	Simulated (Bellhop) and Measured Data
[56]	CNN with Ensemble Learning + Network Pruning	—	Simulated Data
[57]	DCN	—	Simulated and Measured Data
[58]	SVM	Not Applicable	Simulated and Measured Data
[62]	Hybrid SqueezeNet and SENet	—	Simulated and Measured Data
[63]	Hybrid (One2Three+Dual-Stream SE block)	Adam	Measured Data
[64]	TSTR	Adam	Measured Data (from [35])
[59]	NAS	Momentum SGD	Simulated and Measured Data
[60]	2D ResNet & CNN	—	Measured Data
[61]	Several classifiers	—	Simulated and Measured Data

Table 17. Characteristics of ML/DL-Based Modulation Recognition Techniques

Ref.	Bandwidth	Tx-Rx Distance	Modulations Considered
[51]	1,000 symbols/s	1.5 km	PSK, QAM, SSB, FM, PAM, FSK
[52]	10 kHz	1 km	FSK, PSK, QAM, DSSS, OFDM
[53]	4 kHz	1 km	PSK, FSK, OFDM
[54]	—	3, 6, 12, 60 m	PSK, FSK
[55]	4 kHz	1 km	PSK, FSK, OFDM
[56]	—	45 m	PSK, FSK, QAM
[57]	—	3 km, 5 km	PSK, QAM
[58]	100 kHz	6 m	FSK
[62]	—	10, 500, 1000 m	PSK, FSK, DSSS, OFDM
[63]	—	7 m, 1 km	PSK, FSK, QAM, DSSS, OFDM
[64]	—	—	PSK, QAM, FSK
[59]	1 kHz	100 m to 3 km	LFM, FSK, PSK, DSSS, OFDM
[60]	3 kHz, 10 kHz	1.22 km	FSK, PSK, CW, DSSS, LFM, OFDM
[61]	31.25 kHz	5 km, 1.25 km	PSK, QAM

Table 18. Comparative Analysis of Modulation Recognition Techniques.

Ref.	Training Loss	Complexity	Accuracy	Precision
[51]	0.01 (CCE)	Lower (153,930 parameters)	95.3%	89% to 100%
[52]	Not Given (MSE)	Lower (Exec. Time 3.48 ms)	99.25%	—
[53]	$1.946$ (Cross Entropy)	Lower (Exec. Time 7.164 ms)	99.38%	—
[54]	Not Given (Contrastive Loss)	—	98.6%	66% to 100%
[55]	$10^{- 8}$ (Cross-Entropy)	—	100%	40% to 100%
[56]	Not Given (negative log likelihood)	Lower (4.5 ms, $0.5 \times 10^{6}$ parameters, $0.5 \times 10^{8}$ FLOPs)	93.4%	33.33% to 100%
[57]	Not Given (Cross Entropy)	—	64% to 73%	58.8% to 100%
[58]	0.10 (Hinge Loss)	Low (SVM efficiency)	98.28% to 99.78%	99.90% to 99.94%
[62]	—	Lower by 9 times (No. of parameters)	98.5%	97% to 99%
[63]	Nearly zero (Not Given)	Lower (0.28 M parameters, 7.02 M FLOPs)	99%	—
[64]	Not Given (Cross-Entropy)	Medium (145.9 k parameters)	86% to 91.1%	61% to 100%
[59]	Not Given (Cross-entropy)	High (3.10 M parameters, 0.58 M FLOPs)	92.2%	80% to 100%
[60]	0.2 (Cross Entropy)	High (38.48 M parameters, 1,753 s training time)	94.31%	91.02%
[61]	—	Medium (Runtime from 2.23 ms to 8,720.94 ms)	64% to 83%	61% to 100%

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the authors and the preprint are cited in any reuse.