UNet-LSCNet: Integrating Dynamic Snake Convolution and Vision Transformer for Water Body Extraction with Complex Boundaries

Yukai Zhang; Xi Zhang; Zhenhua Wang; Wanwen He

doi:10.20944/preprints202604.1325.v1

Submitted: 10 April 2026

Posted: 20 April 2026


Abstract

Water body extraction from remote sensing imagery is crucial for ecological monitoring and water resource management. However, the irregular, complex shapes of water bodies make it difficult for current approaches to capture fine-grained detail and preserve structural consistency. To overcome the fixed receptive fields and sampling schemes of traditional convolutional networks, this paper proposes UNet-LSCNet, an architecture based on U-Net. The proposed model integrates dynamic snake convolution (DSConv), the convolutional block attention module (CBAM), and a lightweight Vision Transformer (LaViT) to enable locally adaptive geometric modeling and contextually enriched semantic representation. Experimental results show that the proposed approach outperforms several widely used models. Specifically, UNet-LSCNet achieves an mIoU of 95.67% and an F1-score of 96.32%, while maintaining a competitive inference speed of 4.18 frames per second (FPS). It is also more stable in highly complex scenes, such as slender meandering rivers and fragmented small-scale water bodies. Ablation experiments confirm that the modules work synergistically and show that the proposed model improves segmentation accuracy and morphological robustness without compromising inference efficiency.
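For readers unfamiliar with the two accuracy metrics quoted in the abstract, the following is a minimal sketch of how mIoU and F1-score are commonly computed for a binary water/background mask. The abstract does not state the paper's exact metric definitions, so this sketch assumes mIoU is the mean of the water and background IoU and F1 is computed for the water class. The function name `binary_segmentation_metrics` and the toy masks are hypothetical and are not taken from the paper.

```python
def binary_segmentation_metrics(pred, truth):
    """Return mIoU and water-class F1 for flat 0/1 masks.

    Assumption (not from the paper): mIoU averages the IoU of the
    water class (1) and the background class (0).
    """
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")

    # Confusion-matrix counts, treating water (1) as the positive class.
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)

    def safe_div(num, den):
        # Avoid division by zero when a class is absent.
        return num / den if den else 0.0

    iou_water = safe_div(tp, tp + fp + fn)
    iou_background = safe_div(tn, tn + fp + fn)
    miou = (iou_water + iou_background) / 2

    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    f1 = safe_div(2 * precision * recall, precision + recall)

    return {"mIoU": miou, "F1": f1}


# Toy example: eight pixels flattened from a hypothetical mask.
pred = [1, 1, 0, 0, 1, 0, 0, 0]
truth = [1, 0, 0, 0, 1, 1, 0, 0]
metrics = binary_segmentation_metrics(pred, truth)
print(metrics)
```

In practice the counts would be accumulated over every pixel of every test image before the ratios are taken, rather than averaged per image; which convention the paper follows is not stated in the abstract.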

Keywords: water body extraction; semantic segmentation; dynamic snake convolution; vision transformer; Sentinel-2

Subject: Environmental and Earth Sciences - Remote Sensing

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
