Preprint Article, Version 1. Preserved in Portico. This version is not peer-reviewed.

Neither Zero-Catch Data nor Model Structure. Noisy Labels Are the Key Hindrance of Improving Fisheries Forecasting Performance

Version 1 : Received: 29 August 2023 / Approved: 30 August 2023 / Online: 31 August 2023 (02:55:17 CEST)

How to cite: Li, Z.; Zhang, T. Neither Zero-Catch Data nor Model Structure. Noisy Labels Are the Key Hindrance of Improving Fisheries Forecasting Performance. Preprints 2023, 2023082081. https://doi.org/10.20944/preprints202308.2081.v1

Abstract

The zero-catch problem is a key issue in CPUE (catch per unit effort) standardization. Previous studies have treated all zero-catch data uniformly, which discards some correctly labeled samples. For main catch species with few zero-catch samples, fisheries forecasting performance remains low despite continual updates to forecasting model structure, because it is unknown whether the samples were recorded correctly. In this paper, we propose a method based on confident learning theory to detect anomalous samples in the datasets. The method unifies the treatment of zero-catch and non-zero samples as noise within an overarching framework of learning with noisy labels, which reveals the heterogeneity among zero-catch samples (as well as among non-zero samples) and the homogeneity between zero-catch and non-zero samples. Using data for three tuna species in the tropical Atlantic Ocean from 2016 to 2019, at a spatial resolution of 0.5° × 0.5° and a daily temporal resolution, performance on all three classical machine learning models (random forest, support vector machine, and XGBoost) improves significantly over the respective baseline. These results support the proposed method as a self-adaptive, effective way to detect and repair anomalous samples in fishery datasets.
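The core idea of confident learning can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes CPUE has been discretized into integer classes (for example, zero-catch versus non-zero), that every class has at least one labeled sample, and that out-of-sample predicted probabilities are already available from some classifier. The function name `confident_label_issues` and its arguments are hypothetical.

```python
import numpy as np

def confident_label_issues(labels, pred_probs):
    """Flag samples whose recorded label disagrees with a confidently predicted class.

    labels:     shape (n,), integer class of each sample as recorded (assumed encoding).
    pred_probs: shape (n, k), out-of-sample class probabilities (assumed given).
    Returns a boolean array of shape (n,), True where the label looks noisy.
    """
    labels = np.asarray(labels)
    pred_probs = np.asarray(pred_probs, dtype=float)
    n_classes = pred_probs.shape[1]

    # Per-class threshold: average self-confidence of the samples labeled with that class.
    thresholds = np.array(
        [pred_probs[labels == k, k].mean() for k in range(n_classes)]
    )

    # A sample counts toward class k only if its probability reaches the class threshold.
    above = pred_probs >= thresholds
    has_confident_class = above.any(axis=1)

    # Among the classes that pass, take the most probable one.
    masked = np.where(above, pred_probs, -np.inf)
    confident_class = masked.argmax(axis=1)

    # Candidate label error: a confident class exists and differs from the recorded label.
    return has_confident_class & (confident_class != labels)
```

Flagged samples are only candidates for review or relabeling; how they are repaired is a separate design choice that this sketch does not cover.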

Keywords

Zero-catch problem; Non-zero samples; Thunnus; Confident learning; Learning with noisy labels

Subject

Environmental and Earth Sciences, Oceanography
