Improving Long-Range Significant Wave Height Forecasts for Maritime Energy Efficiency: A Residual U-Net Approach Validated with Real-Ship Fuel Consumption Data

Hyunju Lee; Jaehee Jung; Joon-Woo Roh

doi:10.20944/preprints202606.0758.v1

Submitted: 08 June 2026

Posted: 09 June 2026


Abstract

Accurate significant wave height (SWH) prediction is essential for fuel-efficient ship operation and weather routing, as wave-induced resistance directly affects propulsion demand and fuel consumption. This study proposes a Residual U-Net-based deep learning correction model to improve long-range SWH forecasts from WAVEWATCH III (WW3). WW3 global forecast fields were corrected using the proposed model, with CMEMS reanalysis data used as the ground-truth reference. The corrected outputs, denoted as WW3_UNET, were evaluated against 10-minute-resolution main engine fuel oil consumption (ME1_FOC) records and onboard wave observations from a commercial vessel traversing the South Atlantic in 2025. WW3_UNET showed markedly improved agreement with ship observations compared with the raw WW3 forecast across all lead times from 0 to 288 h. When a 24-hour moving average was applied, WW3_UNET achieved a correlation of 0.720 with ME1_FOC at the 168–180 h lead time, closely approaching the 0.736 obtained from onboard wave measurements. These results indicate that AI-corrected forecasts can provide observation-consistent wave information up to 7–8 days in advance. The proposed approach can support fuel-aware weather routing and voyage planning, thereby contributing to improved maritime energy efficiency and decarbonization.

Keywords: significant wave height; residual U-Net; CMEMS; WW3; wave forecast correction; real-ship telemetry; ship fuel consumption; weather routing

Subject: Engineering - Marine Engineering

1. Introduction

1.1. Regulatory Context: IMO Decarbonization Mandates

According to the IMO's Fourth Greenhouse Gas Study (2020), the maritime sector accounts for approximately 3% of global greenhouse gas (GHG) emissions and transports more than 80% of world trade by volume [1]. In response to accelerating climate targets, the IMO has introduced stringent measures including the Carbon Intensity Indicator (CII) and Energy Efficiency Existing Ship Index (EEXI), which took effect in 2023. Under the CII framework, ships rated D for three consecutive years or E in a single year are required to develop a corrective action plan as part of SEEMP Part III. This regulatory landscape demands that shipping companies move beyond conventional operational practices toward data-driven, efficiency-focused navigation strategies in which high-quality environmental forecasts play a central enabling role.

1.2. Climate Change and Increasing Offshore Environmental Variability

Climate change is intensifying the variability of key offshore environmental factors, including sea surface winds, waves (wave height and wave period), and ocean currents [11,12]. Under high-emission scenarios, annual mean significant wave height and mean wave period are projected to change by 5–15% across widespread ocean regions, with robust signals detected along approximately 50% of the world's coastlines [11]. The South Atlantic and Southern Ocean—the primary study region of this work—are among the areas projected to experience some of the most pronounced increases in wave energy [11,12].

Critically, the anticipated changes extend beyond shifts in mean conditions. The IPCC Sixth Assessment Report (AR6) confirms that human-induced greenhouse gas emissions have already increased the frequency and/or intensity of various weather and climate extremes, with these trends projected to accelerate under future warming [13]. With respect to ocean waves specifically, extreme significant wave height return values are projected to increase over more than 25% of the ocean surface by the end of the century, particularly in high-latitude regions [14]. This amplification of wave variability and extremes presents a compounding challenge for numerical wave prediction models: the physical parameterizations embedded in models such as WAVEWATCH III (WW3) were largely calibrated under historical climate conditions, and their performance may degrade as wave regimes shift toward more extreme and variable states not well represented in that calibration record. Data-driven AI models, which learn directly from observed and reanalysis data, therefore offer a complementary approach that can adapt to emerging wave climate signals more flexibly than fixed-physics numerical models.

Against this backdrop, more accurate offshore meteorological and oceanographic prediction—particularly wave forecasting—is becoming increasingly important for efficient navigation that accounts for fuel consumption. The ability to anticipate not only average wave conditions but also episodic extreme events along planned routes is directly relevant to fuel management, operational safety, and CII compliance. Among the environmental variables that affect a vessel's resistance at sea, significant wave height (SWH) is widely recognized as the dominant contributor to added resistance in open-ocean voyages.

1.3. Limitations of Existing Numerical Wave Models

Numerical wave models such as WW3 provide global SWH forecasts with lead times up to 288 hours at 0.5° spatial resolution. While WW3 performs reliably at short lead times (0–48 h), systematic forecast errors tend to accumulate at longer lead times due to atmospheric forcing uncertainties and physical parameterization limitations [7,8]. In the South Atlantic—a region characterized by energetic swell systems generated by mid-latitude storms—these errors can be particularly significant. Furthermore, as climate-driven variability increases the frequency of anomalous wave states, fixed-physics models face growing challenges in capturing these departures from climatological norms.

Deep learning techniques offer a promising avenue for post-processing numerical model output and reducing systematic biases. Convolutional neural network (CNN) architectures that preserve spatial structure, such as U-Net, have demonstrated strong performance in geophysical field correction tasks [5,6]. However, the direct validation of AI-corrected wave data against independent, high-resolution ship-based observations and synchronized fuel consumption telemetry—as opposed to satellite altimetry or buoy data alone—remains underexplored.

1.4. Research Objectives and Novelty

This study addresses two interrelated objectives:

(1) Develop a Residual U-Net model to correct WW3 SWH forecasts across all lead times (0–288 h) using CMEMS reanalysis data as ground truth; and

(2) Quantify the correlation between corrected SWH forecast data and real-ship fuel consumption (ME1_FOC) to demonstrate the practical value of AI-enhanced wave prediction for efficient maritime navigation and decarbonization.

The key novelty of this work lies in its validation methodology. While prior studies have evaluated AI-corrected wave fields against satellite altimetry, reanalysis products, or fixed buoy observations, this study employs 10-minute-resolution telemetry data from an operational commercial vessel—including synchronized main engine fuel consumption (ME1_FOC) and onboard SWH observations recorded along actual South Atlantic crossing routes. This approach provides a direct, operationally relevant measure of forecast improvement quality, bridging the gap between AI-enhanced geophysical modeling and practical maritime energy management.

2. Related Research

2.1. Weather Routing and Energy Efficiency

Mannarini et al. (2016) demonstrated through the VISIR (Variable Integrated Ship Routing) system that integrating real-time ocean current and wave data into voyage planning significantly reduces fuel consumption and travel time in the Mediterranean, highlighting that the accuracy of environmental input data is as important as the routing algorithm itself [2]. Zis et al. (2020) provided a comprehensive taxonomy and survey of ship weather routing studies, showing that weather routing problems are commonly formulated to minimize fuel consumption, operating cost, voyage time, or navigational risk under varying wind, wave, and current conditions [15]. Yang et al. (2020) showed that speed optimization considering ocean currents and wave conditions is a key lever for enhancing environmental sustainability in maritime shipping, particularly for vessels on long transoceanic routes [3]. Ormevik et al. (2023) proposed a high-fidelity approach for modeling weather-dependent fuel consumption on ship routes with speed optimization, demonstrating that weather effects such as wave conditions can materially influence fuel consumption estimates and routing decisions [16]. These studies collectively indicate that even marginal improvements in the accuracy of wave and current forecasts can translate into measurable operational benefits over multi-day or multi-week voyages. More recently, studies have explored machine learning-based approaches to improve wave forecasts, including the use of CMEMS reanalysis data to correct systematic biases in operational models [4].

2.2. AI-Based Ocean Model Post-Processing

Deep learning methods have increasingly been applied to improve geophysical forecast fields. U-Net architectures, originally developed for biomedical image segmentation [5], have proven effective for spatial field correction tasks due to their encoder-decoder structure with skip connections, which preserves both large-scale spatial patterns and fine-scale details. Residual learning—where the model predicts the difference (residual) between a forecast and a target rather than the target itself [6]—has been shown to accelerate convergence and improve performance in post-processing applications. Several studies have applied CNN-based post-processing to numerical weather prediction (NWP) output, including wind fields, precipitation, and sea surface temperature. More recently, AI-based approaches have also been applied directly to wave forecast correction. For example, Cao et al. (2025) proposed a Transformer-enhanced U-Net model to correct numerical wave forecasts, demonstrating that encoder-decoder architectures can effectively capture spatially structured wave-field errors [17]. These studies support the suitability of U-Net-type architectures for correcting operational ocean forecast fields, while also highlighting the need for validation against application-specific operational data.

2.3. Research Gap

Previous studies have established the importance of accurate environmental information for weather routing, speed optimization, and fuel-efficient ship operation [2,3,15,16]. In parallel, deep learning-based post-processing methods, including U-Net-type architectures and residual learning, have shown strong potential for correcting spatially structured errors in numerical geophysical and wave forecast fields [4,5,6,17]. However, these two research streams have not yet been sufficiently connected from the perspective of full-scale vessel operation. Existing weather routing and fuel-consumption studies generally rely on numerical or reanalysis-based environmental inputs, while AI-based wave forecast correction studies have mainly evaluated model performance against satellite altimetry, reanalysis products, or buoy observations. Although these validation sources are physically meaningful, they do not directly demonstrate whether AI-corrected wave forecasts better represent the wave-energy patterns that influence fuel consumption during actual ship operation. In particular, the improvement in corrected significant wave height (SWH) forecasts has rarely been quantified against synchronized real-ship telemetry, including onboard wave observations and main engine fuel consumption records, across the full range of numerical forecast lead times. This study addresses this gap by correcting WW3 SWH forecasts over 0–288 h lead times using a Residual U-Net model and validating the corrected outputs against 10-minute-resolution telemetry data from an operational vessel.

3. Data and Methodology

3.1. Overall Framework of the Proposed Method

This section describes the overall methodology of the proposed AI-based wave correction framework. As shown in Figure 1, the workflow consists of five sequential components: (1) preparation of WW3 forecast inputs and CMEMS reanalysis targets, (2) preprocessing and forecast–target pairing, (3) Residual U-Net-based correction, (4) generation of corrected significant wave height (SWH) fields, and (5) ship-track-based validation using onboard wave observations and main engine fuel consumption data. This overall framework is presented first to provide a unified view of the study design, and the detailed procedures of each component are described in the following subsections.

Figure 1 provides an overview of the proposed framework. WW3 global forecast data are used as the primary multi-channel input, while CMEMS reanalysis SWH is used as the ground-truth reference. After spatial interpolation and lead-time matching, paired samples are constructed for model training. The Residual U-Net is then trained to predict the residual between CMEMS and WW3 SWH rather than directly estimating the full wave field. The predicted residual is added to the original WW3 SWH to generate the corrected output, denoted as WW3_UNET. Finally, the corrected data are interpolated to the 10-minute ship positions and validated against onboard wave observations and synchronized main engine fuel consumption records.

3.2. Datasets

The data components summarized in Figure 1 are described in this section. Three types of data were used in this study: WW3 global forecast fields as multi-channel model inputs, CMEMS reanalysis SWH as the ground-truth target for residual learning, and real-ship telemetry data as an independent validation dataset.

3.2.1. WW3 Global Wave Forecast

WAVEWATCH III (WW3) global forecast data were used as the primary input for the AI correction model [7]. The dataset covers the period June–December 2025, with initialization twice daily at 00 and 12 UTC. Forecasts extend to 288 hours (12 days) at 6-hour intervals, yielding 49 time steps per initialization cycle. The spatial domain spans −70° to 70° latitude and −180° to 180° longitude at 0.5° horizontal resolution (281 × 720 grid points). Input variables include: significant wave height (wavhgt), mean wave direction (wavdir), mean wave period (wavprd), peak wave direction (wpkdir), peak wave frequency (wpkfre), zonal wind (wndu), meridional wind (wndv), wind speed (wndspd), and wind direction (wnddir).

3.2.2. CMEMS Wave Reanalysis (Ground Truth)

The Copernicus Marine Environment Monitoring Service (CMEMS) global wave reanalysis product was used as the target (ground truth) for model training [4]. This dataset covers the same June–December 2025 period at 0.2° horizontal resolution (899 × 1800 grid points), providing SWH fields that represent the best estimate of the true sea state through assimilation of satellite altimetry, buoy, and other observational data. Its higher spatial resolution and data assimilation make it substantially more accurate than the WW3 operational forecast, particularly in capturing mesoscale wave features.

3.2.3. Real-Ship Telemetry Data

High-frequency vessel telemetry data were obtained from a commercial vessel operating in the South Atlantic during June–December 2025. Observations are recorded at 10-minute intervals and include: onboard significant wave height observations (OBS_SHIP), main engine fuel oil consumption (ME1_FOC, kg/h), engine RPM, vessel speed over ground, and GPS position. Four distinct route segments traversing the South Atlantic study domain (latitude 40°S to 5°N, longitude 50°W to 20°E) were identified (Table 1 and Figure 2). The present analysis focuses on Route 1 (July 21–August 2, 2025), which represents a complete South Atlantic crossing.

3.3. Data Preprocessing and Forecast-Target Pairing

As shown in Figure 1, preprocessing consisted of spatial interpolation, forecast–target pairing, and ship-position interpolation. CMEMS reanalysis data were first interpolated to the WW3 0.5° grid using bilinear interpolation to enable direct comparison and training target generation. WW3 forecast fields were then matched to CMEMS values by aligning initialization time and lead time, so that each forecast step was paired with the CMEMS field at its valid time, yielding paired (forecast, target) samples. The study subgrid covers the South Atlantic: latitude 40°S to 5°N, longitude 50°W to 20°E.
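The regridding step can be illustrated with a minimal NumPy sketch. It assumes regular, ascending latitude and longitude vectors and target points that lie inside the source extent, and it does not handle land-masked (NaN) cells, which a real SWH field would contain. The function name `regrid_bilinear` and the synthetic linear test field are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def regrid_bilinear(field, src_lat, src_lon, dst_lat, dst_lon):
    """Bilinearly interpolate a 2-D field from one regular grid to another.

    Assumes ascending 1-D coordinates and targets inside the source extent.
    """
    # Index of the lower corner of the source cell containing each target coordinate.
    i = np.clip(np.searchsorted(src_lat, dst_lat, side="right") - 1, 0, len(src_lat) - 2)
    j = np.clip(np.searchsorted(src_lon, dst_lon, side="right") - 1, 0, len(src_lon) - 2)
    # Fractional position inside the cell (0 at the lower corner, 1 at the upper).
    ty = ((dst_lat - src_lat[i]) / (src_lat[i + 1] - src_lat[i]))[:, None]
    tx = ((dst_lon - src_lon[j]) / (src_lon[j + 1] - src_lon[j]))[None, :]
    # Values at the four corners of each enclosing cell.
    f00 = field[np.ix_(i, j)]
    f01 = field[np.ix_(i, j + 1)]
    f10 = field[np.ix_(i + 1, j)]
    f11 = field[np.ix_(i + 1, j + 1)]
    return (1 - ty) * ((1 - tx) * f00 + tx * f01) + ty * ((1 - tx) * f10 + tx * f11)

# Synthetic 0.2-degree source grid slightly larger than the study subgrid.
src_lat = np.linspace(-41.0, 6.0, 236)
src_lon = np.linspace(-51.0, 21.0, 361)
# 0.5-degree target grid over 40S-5N, 50W-20E.
dst_lat = np.linspace(-40.0, 5.0, 91)
dst_lon = np.linspace(-50.0, 20.0, 141)
# A linear test field is reproduced exactly by bilinear interpolation.
source_field = 2.0 * src_lat[:, None] + 3.0 * src_lon[None, :]
regridded = regrid_bilinear(source_field, src_lat, src_lon, dst_lat, dst_lon)
```

The same function can sample a field at ship positions if it is called with single-element target vectors, although production code would more likely rely on an established regridding library.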

For validation against ship observations, gridded WW3, WW3_UNET, and CMEMS data were interpolated to the 10-minute vessel positions using bilinear spatial interpolation. Temporal interpolation between adjacent 6-hour model time steps was performed using linear interpolation consistent with each lead time band. For instance, an observation at 03:20 UTC on July 20 was assigned to the 0–6 h lead time band using the 00:00 UTC initialization and interpolated between the 0 h and 6 h forecast steps; the same observation was simultaneously assigned to the 12–18 h band using the 12:00 UTC initialization from July 19.
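The band assignment in the example above can be reproduced with a short standard-library sketch. It assumes the 6-hour WW3 output step stated earlier and plain linear weighting between the two bracketing forecast steps; the function names are hypothetical, and handling of the final 288 h step is omitted.

```python
from datetime import datetime

STEP_H = 6  # WW3 output interval in hours

def lead_time_bracket(obs_time, init_time, step_h=STEP_H):
    """Return (lower_step_h, upper_step_h, weight) for linear interpolation in time."""
    lead_h = (obs_time - init_time).total_seconds() / 3600.0
    if lead_h < 0:
        raise ValueError("observation precedes forecast initialization")
    lower = int(lead_h // step_h) * step_h
    weight = (lead_h - lower) / step_h  # 0 at the lower step, 1 at the upper step
    return lower, lower + step_h, weight

def interpolate_in_time(swh_lower, swh_upper, weight):
    """Linear blend of the SWH values at the two bracketing forecast steps."""
    return (1.0 - weight) * swh_lower + weight * swh_upper

# The observation from the text, matched against two initialization cycles.
obs = datetime(2025, 7, 20, 3, 20)
same_day = lead_time_bracket(obs, datetime(2025, 7, 20, 0, 0))
prev_day = lead_time_bracket(obs, datetime(2025, 7, 19, 12, 0))
```

Here `same_day` falls in the 0–6 h bracket and `prev_day` in the 12–18 h bracket, so one ship record contributes to several lead-time bands, one per initialization cycle that covers it.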

3.4. Residual U-Net Architecture and Model Selection Rationale

As illustrated in Figure 1, the core correction component of the proposed framework is a Residual U-Net architecture. The model is designed to transform multi-channel WW3 forecast fields into a spatial residual field that represents the difference between CMEMS reanalysis SWH and WW3-predicted SWH. This architecture was selected because it combines multi-scale spatial feature extraction with residual learning, both of which are particularly suitable for correcting structured errors in geophysical forecast fields.

First, the U-Net structure [5] preserves the spatial structure of the input wave field through its encoder-decoder design with skip connections. The encoder progressively downsamples the input feature maps to capture large-scale spatial context, while the decoder upsamples back to the original resolution; skip connections between corresponding encoder and decoder levels retain fine-scale spatial details that would otherwise be lost. This property is essential for wave height correction, where both the broad spatial pattern of swell systems and local-scale gradients influence forecast accuracy. Alternative architectures such as plain CNNs lack the multi-scale spatial reasoning provided by the encoder-decoder structure, and while Transformer-based models have shown promise in sequential data tasks, their quadratic attention complexity makes them less efficient for dense spatial field outputs without extensive architectural adaptations.

Second, the residual learning strategy [6], in which the model predicts only the difference (residual) ΔH = H_CMEMS − H_WW3 rather than the full SWH field, offers significant practical advantages. Because WW3 already provides a physically consistent first-guess estimate, the residual to be corrected is substantially smaller in magnitude and more spatially structured than the raw SWH field. This reduces the learning burden on the model and accelerates convergence, since the network only has to represent a comparatively small correction rather than reproduce the full wave field. The final corrected output is:

H_corr = H_WW3 + ΔĤ

where ΔĤ is the model-predicted residual. The model takes as input all nine WW3 spatial fields (wavhgt, wavdir, wavprd, wpkdir, wpkfre, wndu, wndv, wndspd, wnddir), enabling it to leverage the full suite of atmospheric and wave state information available in the operational forecast.
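As a minimal sketch of how a training sample and the corrected field relate, the following NumPy code stacks the nine WW3 variables into a channel-first array, forms the residual target, and applies a predicted residual. The 91 × 141 subgrid shape is inferred from the stated South Atlantic bounds at 0.5° and is an assumption, as are the function names; input normalization and the encoding of directional variables are not specified in the text and are left out.

```python
import numpy as np

# The nine WW3 input variables named in the text.
CHANNELS = ["wavhgt", "wavdir", "wavprd", "wpkdir", "wpkfre",
            "wndu", "wndv", "wndspd", "wnddir"]

def build_sample(ww3_fields, h_cmems):
    """Stack the WW3 fields into (C, H, W) and form the residual target."""
    x = np.stack([ww3_fields[name] for name in CHANNELS], axis=0)
    delta = h_cmems - ww3_fields["wavhgt"]  # residual: H_CMEMS minus H_WW3
    return x, delta

def apply_correction(h_ww3, delta_hat):
    """Corrected SWH: the WW3 first guess plus the predicted residual."""
    return h_ww3 + delta_hat

rng = np.random.default_rng(0)
shape = (91, 141)  # assumed subgrid size, not stated in the text
ww3 = {name: rng.random(shape) for name in CHANNELS}
cmems = ww3["wavhgt"] + 0.1  # synthetic target with a uniform 0.1 m offset
x, delta = build_sample(ww3, cmems)
# With a perfect residual prediction the corrected field equals the target.
h_corr = apply_correction(ww3["wavhgt"], delta)
```

In practice `delta_hat` would come from the trained network rather than from the true residual; the perfect-prediction case is used here only to check that the target and the correction are mutually consistent.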

3.5. Lead-Time-Specific Model Training Strategy

A separate Residual U-Net model was trained independently for each of the 49 lead times (0, 6, 12, ..., 288 h). This lead-time-specific training strategy is a deliberate design choice motivated by the distinct error characteristics that emerge at different forecast horizons. At short lead times (0–48 h), WW3 errors are primarily driven by initialization biases and local atmospheric forcing inaccuracies; these tend to manifest as amplitude offsets in specific geographic sub-regions. At long lead times (>120 h), errors increasingly reflect cumulative atmospheric forcing uncertainties and systematic underestimation or overestimation of swell energy propagation from distant storm systems. By training individual models for each lead time, each Residual U-Net is optimized to correct the specific bias patterns relevant to that horizon rather than applying a single generic correction across the full forecast range. This fine-grained approach is intended to help the corrected forecasts maintain high correlation with ship observations and fuel consumption even at lead times where the raw WW3 forecast becomes unreliable.
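The bookkeeping behind this strategy can be sketched as a dictionary of models keyed by lead time. In the sketch below, `MeanBiasModel` is a deliberately simple, hypothetical stand-in that stores a mean residual field; it is not the Residual U-Net used in the study and only illustrates that each lead time receives its own independently fitted correction. The synthetic data and array shapes are likewise assumptions.

```python
import numpy as np

LEAD_TIMES_H = list(range(0, 289, 6))  # 0, 6, ..., 288 h -> 49 lead times

class MeanBiasModel:
    """Hypothetical stand-in for a trained network: a mean residual field."""

    def fit(self, h_ww3, h_cmems):
        # Inputs have shape (n_samples, H, W) and belong to ONE lead time.
        self.bias = np.mean(h_cmems - h_ww3, axis=0)
        return self

    def predict(self, h_ww3):
        return h_ww3 + self.bias

def train_per_lead_time(pairs_by_lead):
    """Fit one model per lead time from a {lead_h: (h_ww3, h_cmems)} mapping."""
    return {lead: MeanBiasModel().fit(*pairs) for lead, pairs in pairs_by_lead.items()}

rng = np.random.default_rng(1)
pairs = {}
for lead in LEAD_TIMES_H:
    truth = rng.random((8, 4, 5))
    # Synthetic forecast whose positive bias grows with lead time.
    pairs[lead] = (truth + 0.001 * lead, truth)
models = train_per_lead_time(pairs)
```

Because each entry is fitted only on pairs from its own horizon, the correction learned for 288 h differs from the one learned for 0 h, which is the property the lead-time-specific design relies on.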

3.6. Rolling Cross-Validation Experimental Design

To rigorously evaluate the correction performance across all seven months of data, a monthly rolling validation scheme was implemented. In each experiment, one month serves as the test set, the immediately adjacent month as the validation set (for early stopping and hyperparameter selection), and the remaining five months as the training set. This design ensures that the test data are always temporally separated from the training data, preventing data leakage. Seven experiments were conducted in total, covering test months from June to December 2025.
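The split logic can be written down in a few lines. The text does not say whether the "immediately adjacent" validation month precedes or follows the test month, so this sketch assumes the following month and falls back to the preceding month for December; that choice, the month labels, and the function name are all assumptions.

```python
MONTHS = ["2025-06", "2025-07", "2025-08", "2025-09",
          "2025-10", "2025-11", "2025-12"]

def rolling_splits(months):
    """Yield one train/val/test split per test month."""
    n = len(months)
    for k, test in enumerate(months):
        # Assumption: validation is the month after the test month; the last
        # month uses the preceding one because nothing follows it.
        val = months[k + 1] if k + 1 < n else months[k - 1]
        train = [m for m in months if m not in (test, val)]
        yield {"train": train, "val": val, "test": test}

splits = list(rolling_splits(MONTHS))
```

Each of the seven splits leaves five training months. Forecasts initialized near a month boundary have valid times that extend into the next month, so an implementation would also need to decide whether samples are assigned by initialization time or by valid time.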

4. Experimental Results and Analysis

4.1. Qualitative Comparison of SWH Time Series

Figures 3–10 present the SWH time series for Route 1 (July 21–August 2, 2025) at representative lead times of 0–12 h, 72–84 h, 120–132 h, and 168–180 h, comparing OBS_SHIP, WW3, WW3_UNET, and CMEMS, together with the ME1_FOC fuel consumption record. At short lead times (0–12 h, Figure 3 and Figure 4), all three model-based datasets broadly capture the dominant wave cycles along the route, with two pronounced swell maxima near July 22 and July 29. However, WW3 consistently overestimates SWH at the wave peaks and exhibits elevated high-frequency noise, while WW3_UNET and CMEMS show markedly improved alignment with OBS_SHIP in both magnitude and phase.

At the intermediate lead time of 72–84 h (Figure 5 and Figure 6), the gap between WW3 and observations begins to widen visibly. Temporal phase shifts appear in WW3, displacing predicted wave maxima by several hours, while WW3_UNET maintains smoother and more coherent tracking of the swell evolution. The 24-hour moving average (Figure 6) further clarifies this improvement by filtering high-frequency noise and revealing the underlying swell energy trend—the trend that most directly drives vessel resistance and fuel consumption.

At 120–132 h (Figure 7 and Figure 8), WW3 forecast degradation becomes pronounced: peak amplitudes are significantly overestimated and temporal alignment with observed swell cycles deteriorates. WW3_UNET substantially reduces these errors, with the moving average showing that the AI-corrected forecast continues to follow the multi-day swell trend captured by CMEMS and OBS_SHIP.

At 168–180 h (Figure 9 and Figure 10)—corresponding to a 7–8 day planning horizon—the divergence of raw WW3 from observations is stark. WW3's correlation with OBS_SHIP at this lead time falls to R = 0.483, and its correlation with ME1_FOC degrades to only R = 0.098, rendering it practically uninformative for routing decisions. WW3_UNET, by contrast, maintains R = 0.702 with OBS_SHIP and R = 0.454 with ME1_FOC in the raw time series. The 24-hour moving average further amplifies this advantage: WW3_UNET (mvg avg) achieves R = 0.720 with ME1_FOC, essentially matching the 0.736 achievable from direct onboard wave measurements. This result is of high practical relevance: it demonstrates that AI-corrected wave forecasts can provide near-observation-quality environmental data at the 7–8 day voyage planning stage, enabling route optimization decisions that are informed by wave energy patterns rather than degraded model noise.

4.2. Quantitative Correlation Analysis: SWH vs. OBS_SHIP

Figure 11 and Table 2 present the Pearson correlation coefficients between model-derived SWH and ship-observed SWH (OBS_SHIP) for all lead times along Route 1. CMEMS serves as a benchmark reference (R = 0.967 raw; R = 0.981 with 24-hour moving average), reflecting the best achievable correlation with the available reanalysis product. WW3_UNET consistently and substantially outperforms WW3 across the entire lead time spectrum.

At 0–12 h, WW3 achieves R = 0.937 while WW3_UNET achieves R = 0.961. The improvement becomes increasingly dramatic at longer lead times: at 168–180 h, WW3 drops to R = 0.483, while WW3_UNET maintains R = 0.702. Beyond 200 h, WW3 correlations fall near-zero or become unstable, while WW3_UNET retains meaningful positive values throughout the full 288-hour range. The 24-hour moving average further amplifies the contrast: WW3_UNET (mvg avg) sustains R ≥ 0.920 up to the 120–132 h band, while WW3 (mvg avg) falls below 0.900 after 108–120 h.
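The two statistics reported here, the Pearson correlation and its 24-hour moving-average variant, can be sketched as follows. At 10-minute sampling a 24-hour window spans 144 records. The text does not state whether the window is centered or trailing; this sketch keeps only fully covered positions and smooths both series identically, so they stay aligned either way. The synthetic swell and noise series are illustrative assumptions and are unrelated to the study's data.

```python
import numpy as np

SAMPLES_PER_DAY = 24 * 6  # 10-minute records -> 144 samples per 24 h

def moving_average(x, window=SAMPLES_PER_DAY):
    """Moving average that returns only positions with a full window."""
    kernel = np.ones(window) / window
    return np.convolve(np.asarray(x, dtype=float), kernel, mode="valid")

def pearson_r(a, b):
    """Pearson correlation coefficient of two equal-length series."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a - a.mean()
    b = b - b.mean()
    return float(np.sum(a * b) / np.sqrt(np.sum(a * a) * np.sum(b * b)))

# Synthetic 12-day record: a shared multi-day swell signal plus independent noise.
rng = np.random.default_rng(42)
t = np.arange(12 * SAMPLES_PER_DAY)
swell = 2.0 + np.sin(2.0 * np.pi * t / (5 * SAMPLES_PER_DAY))
obs = swell + rng.normal(0.0, 0.5, t.size)
fcst = swell + rng.normal(0.0, 0.5, t.size)

r_raw = pearson_r(fcst, obs)
r_smooth = pearson_r(moving_average(fcst), moving_average(obs))
```

In this synthetic case the smoothed correlation is higher than the raw one because averaging suppresses the independent high-frequency noise while retaining the shared multi-day signal, which mirrors the qualitative effect described for the moving-average curves.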

4.3. Quantitative Correlation Analysis: SWH vs. ME1_FOC

The central practical analysis of this study examines how SWH forecast quality translates into the ability to explain wave-related variability in shipboard fuel consumption. A higher correlation between SWH and ME1_FOC indicates that the SWH data more accurately captures the wave-induced resistance variations that drive fuel burn—and therefore more reliably supports fuel-aware routing decisions. Figure 12 first compares the agreement of each SWH dataset with CMEMS across lead times, while Figure 13 and Table 3 present the lead-time-resolved correlations between ME1_FOC and SWH from each data source. The benchmark correlations are represented by CMEMS (mvg avg, R = 0.757) and OBS_SHIP (mvg avg, R = 0.736), which provide reference levels for interpreting the relationship between SWH and ME1_FOC.

The results reveal four key findings. First, at short lead times (0–60 h), WW3_UNET consistently shows an increase of approximately 0.07–0.09 in correlation coefficient R compared with the raw WW3 forecast, establishing a robust improvement baseline across the operational planning window. Second, the performance gap widens dramatically at longer lead times: at 168–180 h, WW3 (mvg avg) achieves only R = 0.419, while WW3_UNET (mvg avg) achieves R = 0.720—essentially matching the OBS_SHIP performance (0.736) and demonstrating that AI-corrected data can provide an observation-consistent proxy for wave-energy trends at the 7–8 day planning horizon.

Third, raw WW3 correlations with ME1_FOC become negative at 180–204 h lead times (e.g., −0.051, −0.106), indicating that the forecast is not merely unhelpful but potentially counter-productive for routing decisions at these ranges. WW3_UNET maintains positive and meaningful correlations throughout the entire 0–288 h range, representing a fundamentally different and superior information value profile for long-range routing.

Fourth, the effect of the 24-hour moving average merits specific attention. The progressive improvement in correlation when moving from raw to smoothed data—most pronounced for WW3_UNET at long lead times—reflects an important physical insight: the wave-induced added resistance experienced by a vessel is not driven by instantaneous wave heights but by the sustained energy content of the swell system over multi-hour periods. The AI model appears to capture this energetic trend structure particularly well, filtering out high-frequency noise while retaining the slowly evolving swell patterns that most directly translate to fuel consumption. At very long lead times (240–288 h), WW3_UNET (mvg avg) achieves correlations of 0.804–0.867 with ME1_FOC, substantially exceeding both raw WW3 and even the static CMEMS reanalysis (0.757), which may suggest that the model has learned to represent the synoptic-scale swell patterns that persist over multi-week timescales.

4.4. Implications for Fuel-Aware Weather Routing and Maritime Energy Efficiency

The correlation results have direct and quantifiable implications for operational weather routing and maritime energy efficiency. Modern routing optimization algorithms use forecast SWH to estimate added resistance and adjust recommended speed profiles. A routing engine informed by WW3_UNET data can produce more accurate resistance estimates across the full forecast horizon (0–288 h), translating to better-calibrated speed recommendations: slowing down appropriately in high-wave zones while maintaining speed in windows where the raw WW3 would erroneously predict rough conditions.

The operational significance is illustrated concretely by the 168–180 h lead time result: WW3_UNET (mvg avg) achieves R = 0.720 with ME1_FOC, approaching the 0.736 from direct onboard observations. For a typical transoceanic voyage lasting 20–30 days, the departure routing plan is commonly finalized 7–10 days in advance. The fact that WW3_UNET effectively extends reliable wave forecasting to this horizon means that voyage optimization decisions—including selection of wave-avoidance detours, speed reduction zones, and engine load profiles—can be made at the pre-departure stage with observation-consistent reliability. Given that fuel costs represent approximately 50–60% of total vessel operating costs, and that CII ratings are calculated on the basis of annual fuel consumption, systematic improvement in routing quality through better wave data can translate directly into improved CII grade management and reduced compliance risk for shipping operators.

5. Discussion

This study demonstrates that a Residual U-Net model, trained to predict the systematic bias between WW3 wave height forecasts and CMEMS reanalysis fields, produces SWH estimates that are meaningfully more aligned with real-ship observations and fuel consumption patterns than the original forecast across the full 0–288 h lead time range. The results confirm the physical expectation that wave height is a primary driver of added resistance and hence fuel burn in open-ocean conditions, and demonstrate that the quality of wave forecast data is a practically limiting factor in the precision of voyage energy optimization.

The particularly striking improvement at long lead times (>168 h) is mechanistically attributable to two factors. First, systematic bias correction: the AI model reduces persistent geographic and seasonal overestimation or underestimation patterns in WW3 that compound over forecast time. Second, spatial pattern learning: by training on CMEMS reanalysis fields at high spatial resolution, the model learns to represent the mesoscale and synoptic-scale wave structures of the South Atlantic swell climate that WW3's 0.5° resolution and physical parameterizations systematically underrepresent. At these long ranges, the raw WW3 forecast has lost skill for predicting specific wave events along the vessel track; the AI model's maintained correlations reflect residual climatological skill—capturing the slowly evolving wave energy envelope of the region—that is sufficient to inform multi-day routing decisions.

The observation that WW3_UNET (mvg avg) at some long lead times exceeds even the CMEMS-ME1_FOC correlation (0.757) is noteworthy. This may reflect that the model has learned to correct temporal phase errors in WW3 that are not present in CMEMS—since CMEMS reanalysis is not a forecast product, it does not carry the phase uncertainty inherent in long-range numerical prediction. The model thus provides a correction that is distinct from simple convergence to CMEMS values, and may partially capture information about the temporal evolution of swell systems that CMEMS reanalysis smooths over.

5.1. Limitations

Several important limitations of this study must be acknowledged. First and most significantly, the analysis is based on a single vessel trajectory (Route 1) in a single ocean region (South Atlantic). While Route 1 represents a complete and physically meaningful South Atlantic crossing during the peak austral winter swell season, it does not capture the full diversity of wave regimes, seasonal conditions, or vessel types that a generalized AI correction system would need to handle. The South Atlantic is a particularly energetic wave environment dominated by long-period Southern Ocean swells; performance in lower-energy regions such as the equatorial Pacific or semi-enclosed seas may differ substantially.

Second, the model was trained on a seven-month window (June–December 2025), which limits its exposure to the full annual wave climate cycle of the study region. The rolling cross-validation design mitigates in-sample overfitting, but cannot guarantee skill for wave states outside the training distribution—a concern that grows in relevance as climate change pushes wave regimes toward conditions not well-represented in recent historical data.
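The idea behind a rolling cross-validation can be sketched as follows. The exact fold layout is not restated in this section, so the example assumes an expanding-window (rolling-origin) scheme with monthly blocks; the function name `rolling_origin_splits` and the parameters `min_train` and `horizon` are hypothetical and chosen only for illustration.

```python
def rolling_origin_splits(periods, min_train=3, horizon=1):
    """Return (train, test) pairs in which every test block follows its training block in time."""
    splits = []
    for end in range(min_train, len(periods) - horizon + 1):
        train = periods[:end]               # everything up to the forecast origin
        test = periods[end:end + horizon]   # the block immediately after it
        splits.append((train, test))
    return splits


# Seven-month window from the text: June to December 2025.
MONTHS = [f"2025-{m:02d}" for m in range(6, 13)]

for train, test in rolling_origin_splits(MONTHS):
    print(f"train {train[0]}..{train[-1]} -> test {test[0]}")
```

In such a scheme no test month is ever seen during training for its own fold, which limits in-sample overfitting; it cannot, however, expose the model to months outside the seven-month window, which is the limitation described above.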

Third, the use of CMEMS reanalysis as the training target introduces an implicit upper bound on model performance with respect to wave-field accuracy, particularly for features that CMEMS itself does not fully resolve. In addition, the correlation analysis captures the linear relationship between SWH and fuel consumption but does not isolate the causal contribution of waves from other operational factors such as engine load variations, vessel draft changes, speed adjustments, or navigation decisions. Therefore, cases in which WW3_UNET shows higher correlation with ME1_FOC than CMEMS should not be interpreted as evidence that the corrected forecast is physically more accurate than CMEMS reanalysis. Rather, such results indicate that the corrected forecast may be more temporally aligned with the wave-energy patterns experienced by the vessel along the analyzed route. Further validation using additional routes, seasons, vessel types, and controlled performance indicators is required to confirm the generality of this behavior.

Future work will address these limitations by extending the analysis to the remaining three South Atlantic routes (Routes 2–4) and additional ocean basins, including the North Atlantic, Indian Ocean, and Western Pacific, to assess generalizability across wave regimes and seasonal variability. Validation across different vessel types—bulk carriers, tankers, LNG carriers—will be pursued to characterize how hull form and operational profile modulate the SWH–fuel consumption relationship. The integration of AI-corrected surface current data alongside wave correction, and the direct quantification of fuel savings achievable under a simulated WW3_UNET routing scenario relative to a baseline WW3 routing, represent the natural next steps toward operational deployment.

6. Conclusions

This study presented a Residual U-Net-based deep learning framework for correcting WAVEWATCH III significant wave height forecasts using CMEMS reanalysis data, validated against 10-minute-resolution real-ship telemetry data from a commercial vessel operating in the South Atlantic. The principal conclusions are as follows:

The AI-corrected WW3_UNET forecasts consistently outperform raw WW3 in correlation with ship-observed SWH across all lead times from 0 to 288 hours. The improvement grows markedly with lead time, with WW3_UNET maintaining R = 0.702 with OBS_SHIP at 168–180 h compared with R = 0.483 for WW3.

The 24-hour moving average of WW3_UNET achieves R = 0.720 with shipboard fuel consumption (ME1_FOC) at both 36–48 h and 168–180 h lead times, closely approaching the 0.736 obtained from the 24-hour moving average of direct onboard wave observations and substantially exceeding the corresponding raw WW3 value (0.419 at 168–180 h). This result indicates that AI-corrected forecasts can provide observation-consistent wave energy information for voyage planning up to 7–8 days in advance.

The 24-hour moving average effect reveals that the AI model captures the slowly evolving swell energy trends that drive vessel resistance and fuel consumption, rather than instantaneous wave heights, making it particularly suited to operational energy optimization workflows.

Raw WW3 correlations with ME1_FOC fall to near zero or below between 168 and 216 h lead time (0.098 to −0.106) and recover only partially thereafter (at most 0.319), indicating that unmodified WW3 data are operationally unreliable at these horizons. WW3_UNET maintains positive correlations throughout (at least 0.279), effectively extending the useful wave forecast planning horizon by several days.

These findings provide quantitative, operationally grounded support for integrating AI-corrected wave forecast data into fuel-aware weather routing and voyage planning systems, with implications for maritime energy efficiency and decarbonization.

Author Contributions

Conceptualization: H.L., J.J., J.R.; Methodology: J.J., J.R.; Software: J.J., H.L.; Formal analysis: J.J.; Investigation: H.L.; Data curation: J.J., H.L.; Validation: J.J.; Writing—original draft: H.L., J.J., J.R.; Writing—review and editing: J.R.; Supervision: J.R. All authors have read and agreed to the submitted version of the manuscript.

Funding

This research study was supported by the Korea Institute of Marine Science & Technology Promotion (KIMST), funded by the Ministry of Oceans and Fisheries (RS-2023-00256331).

Data Availability Statement

The WW3 and CMEMS datasets used in this study are available from their respective public data providers. The real-ship telemetry data, including onboard wave observations and fuel consumption records, are not publicly available due to commercial confidentiality and data-sharing restrictions.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. International Maritime Organization (IMO). Fourth IMO Greenhouse Gas Study; IMO: London, UK, 2020.
2. Mannarini, G.; Carelli, L.; Portell, C.; Coppini, G. VISIR: Technological infrastructure of an operational service for safe and efficient navigation in the Mediterranean Sea. Nat. Hazards Earth Syst. Sci. 2016, 16, 1791–1806.
3. Yang, L.; Chen, F.; Liu, J.; Lin, Z. Ship speed optimization considering ocean currents to enhance environmental sustainability in maritime shipping. Sustainability 2020, 12, 3649.
4. Copernicus Marine Service. Ocean State Report, Issue 8. J. Oper. Oceanogr. 2024, 17 (Suppl. 1), s1–s220.
5. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional networks for biomedical image segmentation. Proceedings of MICCAI, Munich, Germany, 5–9 October 2015; pp. 234–241.
6. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. Proceedings of CVPR, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
7. Tolman, H.L. User manual and system documentation of WAVEWATCH III version 3.14. NOAA/NWS/NCEP/MMAB Tech. Note 2009, 276, 194.
8. Ardhuin, F.; Rogers, E.; Babanin, A.V.; et al. Semiempirical dissipation source functions for ocean waves. Part I: Definition, calibration, and validation. J. Phys. Oceanogr. 2010, 40, 1917–1941.
9. Hersbach, H.; Bell, B.; Berrisford, P.; et al. The ERA5 global reanalysis. Q. J. R. Meteorol. Soc. 2020, 146, 1999–2049.
10. Jolliff, J.K.; Kindle, J.C.; Shulman, I.; et al. Summary diagrams for coupled hydrodynamic-ecosystem model skill assessment. J. Mar. Syst. 2009, 76, 64–82.
11. Morim, J.; Hemer, M.; Wang, X.L.; et al. Robustness and uncertainties in global multivariate wind-wave climate projections. Nat. Clim. Chang. 2019, 9, 711–718.
12. Meucci, A.; Young, I.R.; Hemer, M.; et al. 140 Years of Global Ocean Wind-Wave Climate Derived from CMIP6 ACCESS-CM2 and EC-Earth3 GCMs: Global Trends, Regional Changes, and Future Projections. J. Clim. 2023, 36, 1605–1631.
13. IPCC. Climate Change 2021: The Physical Science Basis. Contribution of Working Group I to the Sixth Assessment Report; Masson-Delmotte, V., et al., Eds.; Cambridge University Press: Cambridge, UK, 2021.
14. Lobeto, H.; Menendez, M.; Losada, I.J. Future behavior of wind wave extremes due to climate change. Sci. Rep. 2021, 11, 7869.
15. Zis, T.P.V.; Psaraftis, H.N.; Ding, L. Ship weather routing: A taxonomy and survey. Ocean Eng. 2020, 213, 107697.
16. Ormevik, A.B.; Fagerholt, K.; Meisel, F.; Sandvik, E. A high-fidelity approach to modeling weather-dependent fuel consumption on ship routes with speed optimization. Marit. Transp. Res. 2023, 4, 100096.
17. Cao, Y.; Zhang, S.; Lv, G.; Yu, M.; Ai, B. AI-based correction of wave forecasts using the Transformer-enhanced UNet model. Adv. Atmos. Sci. 2025, 42, 221–231.

Figure 1. Overall framework of the Residual U-Net-based SWH correction and ship-track validation procedure.

Figure 2. Ship trajectory segments in the South Atlantic study domain. Lighter shades indicate the start of each route; darker shades indicate the end.

Figure 3. Time series of SWH (left axis) and ME1_FOC (right axis) for Route 1 at lead time 0–12 h. OBS_SHIP: blue; WW3: orange; WW3_UNET: green; CMEMS: red; ME1_FOC: purple.

Figure 4. 24-hour moving average of SWH and ME1_FOC for Route 1 at lead time 0–12 h.

Figure 5. Time series of SWH and ME1_FOC for Route 1 at lead time 72–84 h.

Figure 6. 24-hour moving average of SWH and ME1_FOC for Route 1 at lead time 72–84 h.

Figure 7. Time series of SWH and ME1_FOC for Route 1 at lead time 120–132 h.

Figure 8. 24-hour moving average of SWH and ME1_FOC for Route 1 at lead time 120–132 h.

Figure 9. Time series of SWH and ME1_FOC for Route 1 at lead time 168–180 h.

Figure 10. 24-hour moving average of SWH and ME1_FOC for Route 1 at lead time 168–180 h. At this 7–8 day horizon, WW3_UNET (mvg avg) achieves R = 0.720 with ME1_FOC, approaching the 0.736 from direct onboard observations.

Figure 11. Pearson correlation coefficients of SWH with OBS_SHIP by lead time. Solid lines: raw data; dashed lines: 24-hour moving average. WW3: orange; WW3_UNET: green; CMEMS: red.

Figure 12. Pearson correlation coefficients of SWH with CMEMS by lead time. Solid lines: raw data; dashed lines: 24-hour moving average. WW3: orange; WW3_UNET: green; OBS_SHIP: blue.

Figure 13. Pearson correlation coefficients of ME1_FOC with SWH from each data source by lead time. Solid lines: raw data; dashed lines: 24-hour moving average. WW3: orange; WW3_UNET: green; OBS_SHIP: blue; CMEMS: red.

Table 1. Ship trajectory segments in the South Atlantic study domain.

Route	Start	End	Region	Notes
1 (Red)	2025-07-21 06:00	2025-08-02 06:00	South Atlantic	Primary analysis route
2 (Green)	2025-08-05 12:00	2025-08-20 00:00	South Atlantic	—
3 (Orange)	2025-10-21 12:00	2025-10-31 12:00	South Atlantic	Contains data gaps
4 (Purple)	2025-11-20 18:00	2025-12-01 18:00	South Atlantic	—

Table 2. Pearson correlation coefficients (R) of SWH with OBS_SHIP for Route 1 (selected lead times). Bold values indicate WW3_UNET.

Lead Time (h)	WW3	WW3_UNET	CMEMS	WW3 (mvg)	WW3_UNET (mvg)	CMEMS (mvg)
0–12	0.937	0.961	0.967	0.967	0.969	0.981
36–48	0.927	0.975	0.967	0.972	0.989	0.981
72–84	0.894	0.967	0.967	0.955	0.982	0.981
120–132	0.819	0.901	0.967	0.877	0.920	0.981
168–180	0.483	0.702	0.967	0.726	0.858	0.981
216–228	0.459	0.600	0.967	0.609	0.766	0.981
276–288	0.062	0.434	0.967	0.245	0.639	0.981
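The growth of the correction benefit with lead time can be read directly from Table 2. The short Python sketch below computes the gain in raw correlation with OBS_SHIP (WW3_UNET minus WW3) for each listed lead time; the values are copied from the table, while the dictionary layout and the function name `correlation_gain` are illustrative assumptions.

```python
# (WW3, WW3_UNET) raw correlations with OBS_SHIP, copied from Table 2.
TABLE2_RAW = {
    "0-12": (0.937, 0.961),
    "36-48": (0.927, 0.975),
    "72-84": (0.894, 0.967),
    "120-132": (0.819, 0.901),
    "168-180": (0.483, 0.702),
    "216-228": (0.459, 0.600),
    "276-288": (0.062, 0.434),
}


def correlation_gain(table):
    """Return the WW3_UNET minus WW3 correlation difference per lead-time bin."""
    return {lead: round(unet - ww3, 3) for lead, (ww3, unet) in table.items()}


gains = correlation_gain(TABLE2_RAW)
for lead, gain in gains.items():
    print(f"{lead:>8} h: +{gain:.3f}")
```

The gain rises from 0.024 at 0–12 h to 0.372 at 276–288 h, although not monotonically: it is 0.219 at 168–180 h and 0.141 at 216–228 h.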

Table 3. Pearson correlation coefficients (R) between ME1_FOC and SWH data sources for Route 1, all lead times. Bold values indicate WW3_UNET (AI-corrected).

Lead Time (h)	WW3	WW3_UNET	OBS_SHIP	CMEMS	WW3 (mvg)	WW3_UNET (mvg)	OBS_SHIP (mvg)	CMEMS (mvg)
0–12	0.399	0.471	0.568	0.604	0.597	0.632	0.736	0.757
12–24	0.409	0.495	0.568	0.604	0.609	0.662	0.736	0.757
24–36	0.441	0.531	0.568	0.604	0.650	0.705	0.736	0.757
36–48	0.467	0.543	0.568	0.604	0.680	0.720	0.736	0.757
48–60	0.462	0.540	0.568	0.604	0.683	0.718	0.736	0.757
60–72	0.441	0.529	0.568	0.604	0.667	0.715	0.736	0.757
72–84	0.427	0.535	0.568	0.604	0.650	0.713	0.736	0.757
84–96	0.376	0.492	0.568	0.604	0.606	0.678	0.736	0.757
96–108	0.369	0.478	0.568	0.604	0.607	0.663	0.736	0.757
108–120	0.450	0.498	0.568	0.604	0.675	0.682	0.736	0.757
120–132	0.456	0.522	0.568	0.604	0.659	0.690	0.736	0.757
132–144	0.400	0.499	0.568	0.604	0.578	0.643	0.736	0.757
144–156	0.349	0.483	0.568	0.604	0.582	0.681	0.736	0.757
156–168	0.267	0.504	0.568	0.604	0.554	0.734	0.736	0.757
168–180	0.098	0.454	0.568	0.604	0.419	0.720	0.736	0.757
180–192	−0.051	0.427	0.568	0.604	0.171	0.666	0.736	0.757
192–204	−0.106	0.279	0.568	0.604	−0.048	0.557	0.736	0.757
204–216	−0.035	0.329	0.568	0.604	0.063	0.565	0.736	0.757
216–228	0.152	0.459	0.568	0.604	0.339	0.682	0.736	0.757
228–240	0.310	0.543	0.568	0.604	0.564	0.804	0.736	0.757
240–252	0.319	0.613	0.568	0.604	0.607	0.867	0.736	0.757
252–264	0.296	0.615	0.568	0.604	0.574	0.829	0.736	0.757
264–276	0.263	0.586	0.568	0.604	0.427	0.809	0.736	0.757
276–288	0.246	0.576	0.568	0.604	0.509	0.813	0.736	0.757
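One way to make the notion of a "useful planning horizon" concrete is to find the first lead-time bin at which the moving-average correlation with ME1_FOC drops below a chosen threshold. The Python sketch below does this with the moving-average columns of Table 3; the threshold of 0.5 is a hypothetical choice for illustration, not a criterion used in the study, and the function name `first_bin_below` is likewise an assumption.

```python
# Lead-time bins of Table 3: 0-12, 12-24, ..., 276-288 h (24 bins).
LEAD_BINS = [f"{start}-{start + 12}" for start in range(0, 288, 12)]

# 24-hour moving-average correlations with ME1_FOC, copied from Table 3.
WW3_MVG = [0.597, 0.609, 0.650, 0.680, 0.683, 0.667, 0.650, 0.606,
           0.607, 0.675, 0.659, 0.578, 0.582, 0.554, 0.419, 0.171,
           -0.048, 0.063, 0.339, 0.564, 0.607, 0.574, 0.427, 0.509]
UNET_MVG = [0.632, 0.662, 0.705, 0.720, 0.718, 0.715, 0.713, 0.678,
            0.663, 0.682, 0.690, 0.643, 0.681, 0.734, 0.720, 0.666,
            0.557, 0.565, 0.682, 0.804, 0.867, 0.829, 0.809, 0.813]
CMEMS_MVG = 0.757  # constant across lead times (reanalysis, not a forecast)

THRESHOLD = 0.5  # hypothetical skill threshold, for illustration only


def first_bin_below(values, threshold):
    """Return the first lead-time bin whose correlation is below threshold, or None."""
    for lead, r in zip(LEAD_BINS, values):
        if r < threshold:
            return lead
    return None


ww3_limit = first_bin_below(WW3_MVG, THRESHOLD)
unet_limit = first_bin_below(UNET_MVG, THRESHOLD)
above_cmems = [lead for lead, r in zip(LEAD_BINS, UNET_MVG) if r > CMEMS_MVG]

print("WW3 first falls below threshold at:", ww3_limit)
print("WW3_UNET first falls below threshold at:", unet_limit)
print("WW3_UNET exceeds CMEMS (mvg) at:", above_cmems)
```

With this threshold, raw WW3 first falls below 0.5 in the 168–180 h bin, whereas WW3_UNET stays above it for every bin up to 288 h. The same data show that WW3_UNET (mvg) exceeds the CMEMS (mvg) value of 0.757 in the five bins from 228 to 288 h.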

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.