1. Introduction
Phytoplankton are a diverse group of microscopic photosynthetic organisms that play a crucial role in marine ecosystems and global biogeochemical cycles. Functioning as primary producers, they harness solar energy to convert carbon dioxide and nutrients into organic matter, serving as the foundation of the marine food web [
1]. Their pivotal role extends to the cycling of carbon, nitrogen, and other elements between the atmosphere and the ocean [
2,
3]. Phytoplankton also act as climate regulators by absorbing substantial amounts of atmospheric carbon dioxide during photosynthesis, thereby mitigating the impacts of greenhouse gas emissions on the planet’s climate system [
4,
5,
6]. Additionally, they contribute significantly to oxygen production in the ocean, a vital element for the survival of marine organisms [
3,
7]. Apart from their ecological and biogeochemical importance, changes in phytoplankton phenology can have direct impacts on human societies by influencing the survival of fish larvae and consequently fish stocks [
8]. Harmful algal blooms (HABs), for instance, can release toxins that accumulate in seafood, posing threats to human health and causing disruptions in local economies [
9]. It is, therefore, important to understand phytoplankton dynamics and identify changes to their seasonality and abundance.
Monitoring phytoplankton through remote sensing is important for studying their distribution, abundance, and productivity across the world oceans and inland waters [
10]. Remote sensing enables the acquisition of high-resolution data on large spatial and temporal scales, providing a comprehensive understanding of the processes governing phytoplankton dynamics [
11]. A key advantage of remote sensing lies in its capacity to offer synoptic coverage of the ocean surface, surpassing the limitations of traditional sampling methods [
12]. This facilitates the detection of spatial and temporal patterns in phytoplankton abundance and productivity, as well as the identification of ecological hotspots [
13] and their responses to global environmental changes, such as oceanic warming, ocean acidification, and eutrophication [
14]. Furthermore, monitoring oceans via remote sensing can provide early warnings of HABs, thereby enabling effective mitigation of the associated risks [
15,
16]. Remotely sensed data can also be utilized to calibrate and validate ocean biogeochemical models, which are crucial for predicting the responses of marine ecosystems to global environmental changes [
17].
Ocean colour algorithms are analytical tools that harness satellite imagery to estimate the concentrations of phytoplankton, sediment, and dissolved organic matter in the oceans. These algorithms operate on the principle of light absorption and scattering by various components present in the water, which influence the observed colour of light reflected from the ocean’s surface [
18]. To estimate phytoplankton concentration, these algorithms leverage the spectral characteristics of phytoplankton pigments. Chlorophyll-a, the dominant photosynthetic pigment found in most phytoplankton, displays a distinctive optical signature, with high absorption of blue and red light and relatively high reflection of green light [
19,
20]. Often using the ratio (or difference) of blue to green light reflected by the ocean’s surface, these algorithms can derive estimates of chlorophyll-a concentration ([CHL-a]), thereby inferring the abundance of phytoplankton [
18]. Recent advances in satellite technology and algorithm development have significantly improved the accuracy of these empirical algorithms [
21,
22]. However, the performance of ocean colour models can still vary depending on the studied environmental conditions, with some regions posing greater challenges [
21]. Studies evaluating the performance of ocean colour algorithms generally indicate reasonable estimates across a range of conditions, though their accuracy can be affected by factors like cloud cover, atmospheric interference, bottom reflectance, and the presence of other water constituents, for example, in rivers and estuaries [
23,
24]. Data-driven techniques could provide better solutions to such issues.
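The band-ratio principle described above can be illustrated with a minimal sketch. It assumes a fourth-order polynomial in the log-transformed blue-to-green reflectance ratio, which is the general form of empirical ratio algorithms; the coefficients and the function name `band_ratio_chl` are placeholders chosen for illustration and are not the operational values of any published algorithm.

```python
import math

# Placeholder polynomial coefficients, chosen for illustration only.
# They are NOT the operational coefficients of any published algorithm.
COEFFS = (0.3, -2.9, 2.7, -1.2, -0.6)


def band_ratio_chl(rrs_blue, rrs_green, coeffs=COEFFS):
    """Estimate [CHL-a] from a blue-to-green remote sensing reflectance ratio."""
    if rrs_blue <= 0 or rrs_green <= 0:
        raise ValueError("reflectances must be positive")
    # Log-transformed band ratio: large for blue (clear) water, small for green water.
    r = math.log10(rrs_blue / rrs_green)
    # Polynomial in r gives log10 of the chlorophyll-a concentration.
    log_chl = sum(a * r**i for i, a in enumerate(coeffs))
    return 10.0 ** log_chl


# Made-up reflectances: blue-dominated water versus green-dominated water.
clear_water = band_ratio_chl(rrs_blue=0.008, rrs_green=0.002)
green_water = band_ratio_chl(rrs_blue=0.002, rrs_green=0.004)
```

With these placeholder coefficients, the blue-dominated spectrum yields a lower [CHL-a] estimate than the green-dominated one, which is the qualitative behaviour the ratio approach relies on.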
Neural networks have emerged as a promising approach for analyzing complex remote sensing data, as they can learn to recognize patterns within the data. One notable example is the Case2R algorithm developed by using the Inverse Modelling Technique [
25]. This algorithm uses an Inverse Radiative Transfer Model-Neural Network (IRTM-NN) to estimate [CHL-a] and total suspended matter from normalized water-leaving reflectance, employing a large database of
in situ measurements for training and validation [
25]. Convolutional Neural Networks (CNN) and Artificial Neural Networks (ANN) have also demonstrated promising results in accurately estimating [CHL-a], with some studies reporting improved accuracy compared to traditional ocean colour algorithms [
26,
27,
28,
39]. These methods provide deterministic estimates of [CHL-a] even though the dataset itself is noisy, owing in particular to errors in collecting [CHL-a] samples and to imperfect remotely sensed imagery. In contrast, a branch of deep learning develops probabilistic neural network models, which allow for uncertainty quantification and stochastic model evaluation [
30].
Bayesian Neural Networks (BNNs) are a class of neural networks that incorporate probabilistic modeling, generally using Bayes’ rule, to capture uncertainty in the model’s predictions [
31]. A BNN treats its parameters as random variables by associating a probability distribution to the network’s weights. Training these probabilistic networks involves incorporating prior distributions over the network weights, which are then updated using Bayesian inference techniques, such as Markov Chain Monte Carlo or variational methods [
32]. By accounting for uncertainty, BNNs enable more efficient and robust predictions, especially in scenarios with limited data or complex datasets [
33]. Bayesian models, including BNNs, offer significant advantages over deterministic models, notably in their capacity to integrate prior knowledge, estimate uncertainties through probability distributions, and resist overfitting owing to their probabilistic framework [
34]. However, these benefits come at the cost of increased computational complexity compared to conventional deterministic neural networks, necessitating longer training times. Nevertheless, these models have found applications in various fields, including but not limited to computer vision [
35], natural language processing [
36], robotics [
37], and healthcare [
38], where reliable uncertainty estimates and probabilistic reasoning are essential for making informed decisions and assessing risks.
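For reference, the Bayesian update of the network weights described above can be written compactly. The notation is an assumption introduced here for illustration: $\mathbf{w}$ denotes the network weights, $\mathcal{D}$ the training data, and $(\mathbf{x}^*, y^*)$ a new input and its prediction.

```latex
% Posterior over the weights (Bayes' rule)
p(\mathbf{w} \mid \mathcal{D})
  = \frac{p(\mathcal{D} \mid \mathbf{w})\, p(\mathbf{w})}{p(\mathcal{D})}

% Predictive distribution: average the network output over the posterior
p(y^* \mid \mathbf{x}^*, \mathcal{D})
  = \int p(y^* \mid \mathbf{x}^*, \mathbf{w})\, p(\mathbf{w} \mid \mathcal{D})\, d\mathbf{w}
```

The evidence $p(\mathcal{D})$ is generally intractable for neural networks, which is why approximate techniques such as Markov Chain Monte Carlo or variational methods are needed.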
The application of BNNs in ocean colour remote sensing to address some of the common challenges listed above remains relatively unexplored. It is worth mentioning that
Werther et al. 2022 propose using Monte Carlo dropout as a means of achieving a BNN ocean colour model, whereas here, the full posterior distribution of the NN’s parameters is obtained using Bayes’ rule. Specifically, by relying on Monte Carlo dropout, the posterior they obtain for the NN parameters is a probability distribution defined by two points, where each parameter assumes the trained value with some set probability
p or a value of zero with probability
1 − p. This is different from inferring the posterior distribution function using Bayes’ rule.
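The two-point distribution implied by Monte Carlo dropout can be sketched as follows. The symbols are assumptions introduced for illustration: $w_i$ is a single network parameter, $\hat{w}_i$ its trained value, $p$ the probability of retaining it, and $\delta$ the Dirac delta.

```latex
q(w_i) = p\,\delta(w_i - \hat{w}_i) + (1 - p)\,\delta(w_i)
```

Under this form, all probability mass sits on the two values $\hat{w}_i$ and $0$, whereas a posterior inferred through Bayes' rule can spread mass over a continuous range of parameter values.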
This study introduces a new BNN ocean colour algorithm designed to enhance the estimation of [CHL-a] from remotely sensed reflectance data while providing robust means for uncertainty quantification. The new BNN ocean colour model is applicable at a global scale, and improves upon traditional deterministic models by predicting the probability distribution of [CHL-a]. The aim of this research is to address the limitations of traditional ocean colour algorithms by integrating Bayesian inference principles into the algorithmic framework. The study explores the accuracy and reliability of the BNN ocean colour algorithm through systematic experimentation and comparison with existing algorithms. Specifically, the strength of the proposed methodology lies in the ability to perform a comprehensive statistical analysis with uncertainty quantification estimates. The findings of this study are expected to provide new insights to the field of ocean colour remote sensing and contribute to a deeper understanding of phytoplankton dynamics.
2. Match-up Data
We utilized the extensive bio-optical
in situ database from
Valente et al. 2022, which merges 27 datasets that were individually processed to maximize data quality. Observations with missing or incorrect dates and/or geographic coordinates, as well as those obtained using incompatible measurement methods or exhibiting extreme values, were excluded. To build this database, a specified threshold on the coefficient of variation was applied to assess spatial and temporal variability among replicate data points. Replicates whose coefficient of variation fell below this threshold were averaged, while those exceeding it were discarded, thereby ensuring a reliable and consistent dataset.
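The replicate-screening rule described above can be sketched as follows. The threshold value, the use of the sample standard deviation, and the function name `merge_replicates` are assumptions made for illustration; the source does not state the threshold used in the database.

```python
from statistics import mean, stdev

# Hypothetical threshold; the value used in the database is not given here.
CV_THRESHOLD = 0.5


def merge_replicates(values, cv_threshold=CV_THRESHOLD):
    """Average replicate measurements, or return None if they vary too much."""
    if not values:
        return None
    average = mean(values)
    if average <= 0:
        # The coefficient of variation is undefined for a non-positive mean.
        return None
    if len(values) == 1:
        return average
    # Coefficient of variation: sample standard deviation relative to the mean.
    cv = stdev(values) / average
    return average if cv <= cv_threshold else None


consistent = merge_replicates([1.0, 1.1, 0.9])  # low variability: averaged
inconsistent = merge_replicates([0.1, 5.0])     # high variability: discarded
```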
The unified database consists of inherent optical properties, such as the absorption coefficients of phytoplankton ($a_{ph}$), detrital matter ($a_d$), and colored dissolved organic matter ($a_g$), and the backscattering coefficient of particles ($b_{bp}$). Additionally, the dataset contains [CHL-a] measured by fluorometry ([CHL-a]fluor) or by High-Performance Liquid Chromatography ([CHL-a]hplc), total suspended matter (TSM), the diffuse attenuation coefficient for downward irradiance ($K_d$), and remote sensing reflectance ($R_{rs}$). Each of these variables offers valuable information about the optical and biological characteristics of the ocean, enabling comprehensive analyses and investigations related to ocean colour and ecosystem dynamics.
The data were collected from multi-project archives obtained through open internet services or directly from data providers. In particular, the match-ups were compiled from 27 sets of
in situ data obtained from various sources, including SeaBASS [
41], NOMAD [
42], MERMAID [
43], ICES (
https://www.ices.dk/data/dataset-collections/Pages/Plankton.aspx), ARCSSPP [
44], BIOCHEM [
45], BODC (
https://www.bodc.ac.uk/data/bodc_database/), COASTCOLOUR [
46], MAREDAT [
47], and SEADATANET (
seadatanet.org). In addition, data were collected from projects including MOBY [
48], BOUSSOLE, AERONET-OC, HOT, GeP&CO, AMT, AWI, BARENTSSEA, BATS, CALCOFI, CCELTER, CIMT, ESTOC, IMOS, PALMER, TPSS, and TARA. Remotely sensed reflectances were retrieved from ESA’s Medium Resolution Imaging Spectrometer (MERIS), ESA’s Ocean and Land Colour Instrument (OLCI), NASA’s Sea-viewing Wide Field-of-view Sensor (SeaWiFS), NASA’s Visible Infrared Imaging Radiometer Suite (VIIRS) and NASA’s MODerate resolution Imaging Spectro-radiometer (MODIS).
3. Bayesian Neural Network
BNNs were trained to serve as probabilistic ocean colour surrogates that account for uncertainties during training [
49]. This is done by training a BNN within the Bayesian framework, which is outlined in
Section A.1. The Bayesian framework characterizes probabilities as expressions of uncertainty, encompassing both prior information regarding the problem and uncertainties within the data. In contrast, the traditional frequentist approach explores probabilities through the lens of long-term frequencies, demanding substantial datasets for analysis. A thorough comparison between the two approaches can be found in
Samaniego 2010. The BNN trained in this study is a densely connected multilayer perceptron [
51] consisting of 2 hidden layers with 20 neurons each, alongside the input and output layers. Softplus activation functions were employed at the outputs of the input and hidden layers. In this framework, the weights and biases of the neurons are treated as random variables and assigned a uniform prior distribution. A preliminary sensitivity study testing different Bayesian prior distributions and neural network architectures was also conducted; because it showed similar behavior across the architectures and priors tested, it is omitted for brevity.
An input is propagated through the BNN to predict the posterior value of [CHL-a]. As is common in Bayesian inference applications, this [CHL-a] is assumed to be noisy, following a Gaussian distribution with a learnable variance; i.e., the prediction of the BNN, $\hat{y}$, is $\hat{y} = f(\mathbf{x}) + \epsilon$, where $\mathbf{x}$ is the input, $f$ is the network output, and $\epsilon \sim \mathcal{N}(0, \sigma^2)$ is Gaussian noise with zero mean and learnable variance $\sigma^2$.
For a given input, [CHL-a] predictions are obtained by sampling model parameters from their respective distributions, resulting in a distribution of [CHL-a] rather than the single value one would obtain with a deterministic model. Note that this prediction is the approximate posterior distribution of [CHL-a]; the approach therefore differs from previous studies in the literature that investigate the problem of ocean colour uncertainties from the complementary frequentist viewpoint [
52,
53]. The advantage of having a probabilistic ocean colour model lies in its capability to quantify uncertainties, providing uncertainty bounds [
54] for the model predictions. This feature proves particularly valuable in risk analysis studies requiring a probabilistic representation of [CHL-a].
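The sampling procedure described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the trained model: the weights are drawn from a single stand-in Gaussian shared by all parameters (a trained BNN would have a learned distribution per parameter), the noise standard deviation is fixed instead of learned, and the five input values and all function names are made up.

```python
import math
import random


def softplus(v):
    # Numerically stable softplus: log(1 + exp(v)).
    return max(v, 0.0) + math.log1p(math.exp(-abs(v)))


def sample_layer(n_in, n_out, rng, mean=0.0, std=0.3):
    """Draw one weight matrix and bias vector from a stand-in Gaussian."""
    weights = [[rng.gauss(mean, std) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.gauss(mean, std) for _ in range(n_out)]
    return weights, biases


def dense(x, layer, activate):
    """Apply one fully connected layer, optionally followed by softplus."""
    weights, biases = layer
    out = [sum(w * xi for w, xi in zip(row, x)) + b
           for row, b in zip(weights, biases)]
    return [softplus(o) for o in out] if activate else out


def predictive_samples(x, n_samples=500, hidden=20, noise_std=0.1, seed=0):
    """Sample predictions f(x) + eps, redrawing all weights for each sample."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        layer1 = sample_layer(len(x), hidden, rng)
        layer2 = sample_layer(hidden, hidden, rng)
        layer3 = sample_layer(hidden, 1, rng)
        h = dense(dense(x, layer1, True), layer2, True)
        f = dense(h, layer3, False)[0]
        samples.append(f + rng.gauss(0.0, noise_std))  # additive Gaussian noise
    return samples


def summarize(samples, lower=0.025, upper=0.975):
    """Return the sample mean and an empirical central interval."""
    ordered = sorted(samples)
    n = len(ordered)
    return (sum(ordered) / n,
            ordered[int(lower * (n - 1))],
            ordered[int(upper * (n - 1))])


reflectances = [0.004, 0.005, 0.004, 0.003, 0.002]  # made-up input values
draws = predictive_samples(reflectances)
mean_pred, lower_bound, upper_bound = summarize(draws)
```

The spread between `lower_bound` and `upper_bound` plays the role of the uncertainty bounds mentioned above; in a trained model it would narrow where the data constrain the weights well and widen elsewhere.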
The model is trained using Stochastic Variational Inference (SVI), a technique for approximating the posterior distribution of the model’s latent variables, as implemented in Pyro [
55]. SVI formulates inference as an optimization problem, seeking the best-fitting distribution within a parametric family to approximate the posterior distribution [
56,
57,
58]. During training, the SVI algorithm optimizes the model’s parameters by minimizing the total loss (Eqn. 1), iteratively updating the variational parameters to approximate the true posterior distribution. A brief review of the SVI framework is provided in
Section A.2.
Within the SVI framework, the Evidence Lower Bound (ELBO) loss function was adopted to train the neural network [
59]. The ELBO is commonly used for training BNNs because maximizing it maximizes a lower bound on the log evidence and, equivalently, minimizes the Kullback-Leibler divergence between the approximate and true posterior distributions [
58]. In the context of BNNs, the ELBO loss encompasses both the data likelihood and the prior distribution over the model parameters. The ELBO, whose negative is the loss minimized during training, is expressed as:
$$\mathrm{ELBO} = \mathbb{E}_{q_{\phi}(z)}\left[\log p_{\theta}(x, z) - \log q_{\phi}(z)\right], \tag{1}$$
where $x$ represents the observations, $z$ the latent variables, and $p_{\theta}$ and $q_{\phi}$ the model and approximate posterior distributions, parameterized by $\theta$ and $\phi$, respectively. Note that the trained BNN model captures aleatoric (inherent to the underlying phenomena) and epistemic (imperfect models and lack of data) uncertainties [
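The relationship between the ELBO, the log evidence, and the quality of the approximate posterior can be checked numerically on a toy problem. The sketch below is not the BNN: it assumes a one-dimensional conjugate model, $z \sim \mathcal{N}(0, 1)$ and $x \mid z \sim \mathcal{N}(z, 1)$, for which the exact posterior $\mathcal{N}(x/2, 1/2)$ and the evidence $\mathcal{N}(x; 0, 2)$ are known in closed form. All names and values are illustrative.

```python
import math
import random


def log_normal_pdf(value, mean, std):
    """Log-density of a univariate Gaussian."""
    return (-0.5 * math.log(2.0 * math.pi * std**2)
            - 0.5 * ((value - mean) / std) ** 2)


def elbo_estimate(x, q_mean, q_std, n_samples=2000, seed=0):
    """Monte Carlo estimate of E_q[log p(x, z) - log q(z)].

    Toy model (an assumption for illustration): z ~ N(0, 1), x | z ~ N(z, 1).
    The approximate posterior is q(z) = N(q_mean, q_std^2).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        z = rng.gauss(q_mean, q_std)
        log_joint = log_normal_pdf(z, 0.0, 1.0) + log_normal_pdf(x, z, 1.0)
        total += log_joint - log_normal_pdf(z, q_mean, q_std)
    return total / n_samples


x_obs = 1.2
# Closed-form log evidence for the toy model: x ~ N(0, 2).
log_evidence = log_normal_pdf(x_obs, 0.0, math.sqrt(2.0))
# ELBO with q set to the exact posterior N(x/2, 1/2): the bound is tight.
elbo_exact = elbo_estimate(x_obs, x_obs / 2.0, math.sqrt(0.5))
# ELBO with a poorly matched q: the gap equals the KL divergence to the posterior.
elbo_poor = elbo_estimate(x_obs, -1.0, 2.0)
```

When the approximate posterior equals the exact one, the ELBO coincides with the log evidence; a mismatched approximation lowers the ELBO by the Kullback-Leibler divergence, which is what the optimizer works to reduce.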
60].
All BNN models were trained using the ADAM [
61] stochastic optimization algorithm with a learning rate of
. Furthermore, the BNN was trained using a 10-fold cross validation strategy, where the training dataset in each fold comprised of randomly sampling
of the complete dataset and leaving the rest for validation. The presented results correspond to using the best trained BNN for inference on the entire dataset. We note here that, as can also be seen in the Results section, there was no overfitting issue, for three reasons: the model is shallow, the data are noisy, and the model prediction is made noisy as described earlier.
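A fold-based split of this kind can be sketched as follows. The sketch assumes a standard shuffled k-fold partition, in which each fold trains on nine tenths of the samples and validates on the remaining tenth; the exact sampling scheme used in this study may differ, and the function name `k_fold_indices` and the sample count are hypothetical.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, validation) index lists for k shuffled folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Every k-th shuffled index goes to the same fold, giving near-equal sizes.
    folds = [indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# Hypothetical dataset size, used only to show the resulting split sizes.
splits = list(k_fold_indices(1000))
```

A model would then be trained on each `train` list and scored on the matching `validation` list, with the best-scoring model retained for inference.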
7. Conclusion
This work introduces the use of a Bayesian neural network (BNN) for estimating [CHL-a] from remotely sensed data, using the largest available database of in situ match-ups. The BNN’s maximum a posteriori (MAP) estimate was shown to outperform established ocean colour models, such as OC4 and OCI, providing reliable estimates for [CHL-a]. Furthermore, the learning-based method allows for more degrees of freedom in ocean colour modeling by involving more input variables, describing the state of the ocean, to predict the chlorophyll-a concentration. The true potential of the BNN model, however, lies in its uncertainty quantification capabilities: the BNN predicts the distribution of potential [CHL-a] values, which builds confidence in the predicted values. Our findings demonstrate that the BNN model exhibits a remarkable capacity to capture mesoscale features and ocean circulation patterns, effectively delineating spatial and temporal variations in [CHL-a] across diverse marine ecosystems. Furthermore, by including uncertainty, the proposed model can provide more accurate information than traditional algorithms for coastal waters when using higher spatial resolution ocean colour imagery, such as that from the Sentinel-3 OLCI. This can benefit coastal ecosystem health and biodiversity assessments by supporting studies of nutrient circulation, the detection of localised HABs, and the monitoring of climate change and other anthropogenic impacts on phytoplankton dynamics.
The BNN’s ability to quantify uncertainty in predictions offers more confidence in the results, particularly in regions with sparse or irregular data coverage, and serves as a crucial step toward fostering informed decision-making in marine research and management. These uncertainty estimates help understand when and where the BNN predictions are reliable, as opposed to other regions where uncertainties are large and additional data may be necessary to improve prediction accuracy. The southern Red Sea (
Figure 9) is an example of a region where regional tuning of the ocean colour model is needed to increase the accuracy of [CHL-a] estimation. Other such shallow coastal environments can be identified globally by applying the BNN model. Tuning the model further with more high-quality regional observations may reveal new information regarding their phytoplankton phenology.
The introduction of BNN models also creates new possibilities in the field of ocean colour remote sensing. Future research can expand the scope of this work by incorporating additional variables such as sea surface temperature to further improve the accuracy of [CHL-a] estimates, and estimating [CHL-a] along the water column.
Author Contributions
Conceptualization, M.H. and N.P.; methodology, M.H. and N.P.; software, M.H.; validation, M.H., N.P., R.B., D.R. and I.H.; formal analysis, M.H., N.P., R.B., D.R., O.K. and I.H.; investigation, M.H., N.P., R.B., D.R., O.K. and I.H.; resources, O.K. and I.H.; data curation, M.H. and N.P.; writing—original draft preparation, M.H. and N.P.; writing—review and editing, M.H., N.P., R.B., D.R., O.K. and I.H.; visualization, M.H.; supervision, R.B., D.R., O.K. and I.H.; project administration, R.B., D.R., O.K. and I.H.; funding acquisition, R.B., O.K. and I.H. All authors have read and agreed to the published version of the manuscript.