Abstract: Marine wave hazards are among the most prevalent risks in oceanic environments, posing substantial threats to maritime safety, offshore operations, and coastal infrastructure. Accurate forecasting of sea state conditions, particularly significant wave height, is therefore essential for mitigating risks associated with vessel instability, maritime accidents, and damage to marine structures. Conventional numerical wave prediction models, although widely applied, often suffer from high computational costs and limited capability in representing nonlinear wave dynamics under rapidly changing atmospheric conditions. In recent years, deep learning approaches have emerged as promising alternatives for ocean state prediction. Convolutional neural networks (CNNs), in particular, have demonstrated strong performance in feature extraction tasks; however, CNN-based models may experience information loss and degraded predictive skill when applied to extreme sea states characterized by strong nonlinearity, wave breaking, and steep wave gradients.
To overcome these limitations, this study proposes a regional significant wave height forecasting framework based on the Vision Transformer (ViT) architecture. Unlike convolution-based models that rely on localized receptive fields, the ViT employs a multi-head self-attention mechanism capable of capturing long-range dependencies and global spatiotemporal relationships between atmospheric forcing and wave responses. This design enables more effective preservation of fine-scale features and improves representation of complex wind-wave interactions, particularly under extreme marine conditions. The primary objective of this research is to develop a high-accuracy significant wave height prediction model with extended lead times, with an emphasis on improving performance during high-energy wave events.
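For reference, the multi-head self-attention mentioned above is commonly written in the following standard form. The abstract does not state the exact formulation used in the model, so the symbols here are assumptions for illustration: X is the matrix of patch embeddings, W_i^Q, W_i^K, W_i^V and W^O are learned projections, d_k is the key dimension, and h is the number of heads.

```latex
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V
\qquad
\mathrm{head}_i = \mathrm{Attention}\!\left(X W_i^{Q},\; X W_i^{K},\; X W_i^{V}\right)
\qquad
\mathrm{MultiHead}(X) = \mathrm{Concat}\!\left(\mathrm{head}_1, \dots, \mathrm{head}_h\right) W^{O}
```

Because the softmax weights are computed between every pair of patches, each output can draw on any location in the input field, in contrast to the fixed local receptive field of a convolution kernel.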
The model was trained and validated using ERA5 reanalysis data provided by the European Centre for Medium-Range Weather Forecasts (ECMWF), which offer comprehensive and consistent atmospheric and oceanic variables across diverse wave climate regimes. A systematic evaluation was conducted to assess the effects of different input variable combinations, including significant wave height, 10 m sea surface wind components, and mean wave period. In addition, the influence of input sequence length on prediction accuracy was investigated using historical windows ranging from 6 to 48 hours. The results indicate that the optimal model configuration used significant wave height together with the 10 m wind vector components as input features, highlighting the critical role of wind-wave coupling in wave evolution. Furthermore, an input sequence length of 18 hours yielded the highest predictive skill, effectively balancing temporal dependency representation and noise suppression.
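The pairing of an 18-hour input window with a 24-hour lead time can be sketched as below. This is a minimal illustration, not the authors' preprocessing: it assumes hourly time steps and a single grid point, and the names `Frame` and `make_samples` and the (swh, u10, v10) tuple layout are hypothetical.

```python
from typing import List, Sequence, Tuple

# Hypothetical layout: (significant wave height, u10, v10) at one time step.
Frame = Tuple[float, float, float]


def make_samples(
    series: Sequence[Frame], window: int = 18, lead: int = 24
) -> List[Tuple[List[Frame], float]]:
    """Pair each `window`-step history with the wave height `lead` steps later.

    Assumes one frame per hour, so window=18 and lead=24 correspond to an
    18-hour input sequence and a 24-hour forecast.
    """
    samples = []
    # Last start index whose target still falls inside the series.
    last_start = len(series) - window - lead
    for start in range(last_start + 1):
        end = start + window              # exclusive end of the input window
        target_idx = end - 1 + lead       # `lead` steps after the last input
        inputs = list(series[start:end])
        target = series[target_idx][0]    # wave height only
        samples.append((inputs, target))
    return samples
```

Changing `window` to other values between 6 and 48 reproduces the kind of sequence-length comparison described above; a series shorter than `window + lead` yields no samples.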
For 24-hour forecasts, the proposed ViT-based model achieved a root mean square error of 0.323 m and a correlation coefficient of 0.848, demonstrating notable improvements over persistence-based baselines and performance comparable to or exceeding that of existing deep learning approaches reported in the literature. These findings highlight the strong potential of transformer-based architectures for enhancing operational wave forecasting systems, particularly under extreme sea-state conditions where traditional and CNN-based models may exhibit reduced reliability. Future work should explore hybrid CNN-Transformer architectures, incorporation of additional physical variables such as bathymetry, ocean currents, and atmospheric pressure, and broader validation across open-ocean, coastal, and semi-enclosed sea environments.
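The two reported scores and a persistence baseline can be computed as in the sketch below. It makes two assumptions not stated in the abstract: that the correlation coefficient is the Pearson coefficient, and that persistence means reusing the last observed value as the forecast. All function names are illustrative.

```python
import math
from typing import List, Sequence, Tuple


def rmse(pred: Sequence[float], obs: Sequence[float]) -> float:
    """Root mean square error between forecasts and observations."""
    if len(pred) != len(obs) or len(pred) == 0:
        raise ValueError("pred and obs must be non-empty and equal in length")
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(pred))


def pearson_r(pred: Sequence[float], obs: Sequence[float]) -> float:
    """Pearson correlation coefficient (assumed definition of 'correlation')."""
    if len(pred) != len(obs) or len(pred) == 0:
        raise ValueError("pred and obs must be non-empty and equal in length")
    mean_p = sum(pred) / len(pred)
    mean_o = sum(obs) / len(obs)
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(pred, obs))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    if var_p == 0 or var_o == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_p * var_o)


def persistence_pairs(
    obs: Sequence[float], lead: int
) -> Tuple[List[float], List[float]]:
    """Persistence baseline: the forecast for time t + lead is the value at t."""
    if lead <= 0 or lead >= len(obs):
        raise ValueError("lead must be positive and shorter than the series")
    return list(obs[:-lead]), list(obs[lead:])
```

Scoring a model and the persistence forecast on the same targets with `rmse` and `pearson_r` gives a like-for-like comparison at a given lead time.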