Physical and Biogeochemical Drivers for Forecasting Red Tides in Southwest Florida: A Regionally Integrated Machine Learning Framework

Matthew Duus; Ahmed S. Elshall; Michael L. Parsons; Ming Ye

doi:10.20944/preprints202603.0568.v1

Submitted: 06 March 2026

Posted: 06 March 2026


Abstract

Harmful algal blooms (HABs) caused by Karenia brevis (K. brevis) present a persistent ecological and public health challenge across coastal Florida. This study develops a regionally integrated machine learning framework to predict weekly K. brevis bloom occurrence using environmental data from both the Peace and Caloosahatchee Rivers, combined with coastal bloom records from Southwest Florida and Tampa Bay to enhance the spatial and temporal continuity of the response record. A Random Forest classifier was trained on a multi-decadal dataset incorporating river discharge, nutrient concentrations (total nitrogen and total phosphorus), wind forcing, sea surface temperature, salinity, and sea surface height anomalies as a proxy for Loop Current variability. The model achieved strong predictive performance on a chronologically withheld test set, with an overall accuracy of approximately 90%, a balanced accuracy of 87.6%, and high precision and recall for bloom events. Bloom timing and persistence were captured with strong agreement during ongoing bloom periods, while non-bloom conditions were identified with low false-positive rates. Feature-response analyses indicated that bloom probability increased most sharply under moderate discharge and nutrient conditions, with diminished sensitivity at higher extremes. Learning curve analysis demonstrated robust training performance and stable generalization, with validation accuracy plateauing near 84%, suggesting a data-limited ceiling on forecast skill. By aggregating nutrient inputs across multiple watersheds and integrating spatially aligned bloom observations, this study demonstrates the utility of multi-source machine learning frameworks for regional-scale HAB prediction. The results support the development of early warning tools and provide a reproducible foundation for evaluating how combined watershed loading and physical forcing are associated with K. brevis bloom occurrence in complex estuarine systems with watershed and coastal coupling.
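The abstract reports both overall accuracy and balanced accuracy on a chronologically withheld test set. The sketch below illustrates why the two can differ when bloom weeks are less common than non-bloom weeks. It is not the authors' code: the function names, the toy labels, and the 20% test fraction are assumptions made for illustration, since the abstract does not state the split used.

```python
from typing import Sequence, Tuple


def chronological_split(records: Sequence, test_fraction: float = 0.2) -> Tuple[list, list]:
    """Split time-ordered records so the most recent ones form the test set.

    The 0.2 default is an illustrative assumption, not a value from the paper.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    cut = int(round(len(records) * (1.0 - test_fraction)))
    return list(records[:cut]), list(records[cut:])


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of weeks classified correctly."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of bloom recall (sensitivity) and non-bloom recall (specificity)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return 0.5 * (sensitivity + specificity)


# Toy example: ten weekly labels (1 = bloom, 0 = no bloom), oldest first.
weeks = list(range(10))
train_weeks, test_weeks = chronological_split(weeks, test_fraction=0.2)

y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]  # one bloom week missed
print(accuracy(y_true, y_pred), balanced_accuracy(y_true, y_pred))
```

In this toy case one missed bloom week barely changes overall accuracy but cuts bloom recall in half, so balanced accuracy falls below overall accuracy. The ordering matches the figures reported in the abstract, where balanced accuracy (87.6%) is lower than overall accuracy (approximately 90%).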

Keywords: bloom forecasting; Karenia brevis; red tide; harmful algal blooms; nutrient loading; watershed–coastal coupling; Gulf of Mexico or Gulf of America; machine learning; random forest classifier

Subject: Environmental and Earth Sciences - Environmental Science

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
