Preprint Article, Version 1. Preserved in Portico. This version is not peer-reviewed.

Comparison of CNNs and ViTs Based Hybrid Models Using Gradient Profile Loss for Classification of Oil Spills in SAR Images

Version 1 : Received: 23 October 2021 / Approved: 25 October 2021 / Online: 25 October 2021 (15:42:36 CEST)

How to cite: Basit, A.; Siddique, M.A.; Sarfraz, M.S. Comparison of CNNs and ViTs Based Hybrid Models Using Gradient Profile Loss for Classification of Oil Spills in SAR Images. Preprints 2021, 2021100363. https://doi.org/10.20944/preprints202110.0363.v1

Abstract

Oil spills on the sea surface threaten marine and coastal ecosystems. Spaceborne synthetic aperture radar (SAR) data have been used effectively for oil spill detection because SAR operates day and night in all weather conditions. The problem is often modeled as a semantic segmentation task: the images are segmented into multiple regions of interest such as sea surface, oil spill, look-alikes, ships and land. Training a classifier for this task is particularly challenging because of the inherent class imbalance. In this work, we train a convolutional neural network (CNN) with multiple feature extractors for pixel-wise classification, and introduce the use of a new loss function for this task, namely the ‘gradient profile’ (GP) loss, which is a constituent of the more generic Spatial Profile loss proposed for image translation problems. For training, testing and performance evaluation, we use a publicly available dataset with selected oil spill events verified by the European Maritime Safety Agency (EMSA). The results show that the proposed CNN trained with a combination of GP, Jaccard and focal loss functions detects oil spills with an intersection over union (IoU) of 63.95%. The IoU values for the sea surface, look-alikes, ships and land classes are 96.00%, 60.87%, 74.61% and 96.80%, respectively. The mean intersection over union (mIoU) over all classes is 78.45%, which accounts for a 13% improvement over the state of the art for this dataset. Moreover, we provide an extensive ablation study on different hybrid models based on CNNs and Vision Transformers (ViTs) to demonstrate the effectiveness of adding GP loss as an additional loss function for training. The results show that GP loss significantly improves the mIoU and F1 scores for both CNN-based and ViT-based hybrid models. GP loss turns out to be a promising loss function in the context of deep learning with SAR images.
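To make the reported metrics concrete, the following minimal Python sketch shows how a per-class IoU can be computed from pixel labels and how the mIoU follows as the unweighted mean of the per-class values. The function names `iou_per_class` and `mean_iou` are hypothetical and chosen for illustration; this is not the authors' evaluation code, and it assumes the mIoU is a plain average over the five classes, which is consistent with the figures quoted in the abstract.

```python
def iou_per_class(y_true, y_pred, num_classes):
    """Per-class IoU from two flat sequences of integer class labels.

    For class c, IoU = |pixels where both are c| / |pixels where either is c|.
    A class absent from both sequences yields NaN.
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        union = sum(1 for t, p in zip(y_true, y_pred) if t == c or p == c)
        ious.append(inter / union if union else float("nan"))
    return ious


def mean_iou(ious):
    """Unweighted mean of per-class IoU values (assumed definition of mIoU)."""
    return sum(ious) / len(ious)


# Per-class IoU values (in percent) as reported in the abstract.
reported = {
    "oil spill": 63.95,
    "sea surface": 96.00,
    "look-alikes": 60.87,
    "ships": 74.61,
    "land": 96.80,
}
miou = mean_iou(list(reported.values()))
print(f"mIoU = {miou:.2f}%")

# Toy example with two classes and four pixels.
toy = iou_per_class([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
print(toy)
```

Averaging the five reported class values reproduces the stated mIoU of 78.45% after rounding, which suggests that each class contributes equally to the mean regardless of how many pixels it covers.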

Keywords

Oil spills; synthetic aperture radar (SAR); deep convolutional neural networks (DCNNs); vision transformers (ViTs); deep learning; semantic segmentation; marine pollution; remote sensing

Subject

Engineering, Control and Systems Engineering
