Submitted: 17 June 2026
Posted: 22 June 2026
Abstract
Keywords:
1. Introduction
- Unified forecasting formulation. We formulate a multivariate, multi-horizon oyster-reef environmental forecasting problem with a 48-hour input and 24-hour output window, using consistent time-based splits and per-feature evaluation across seven key environmental variables.
- Systematic model comparison. We perform a controlled comparison of Random Forest (RF) against ARIMA, Gradient Boosting, LSTM, and GRU under identical data processing, windowing, and evaluation metrics, providing a clear ranking of lightweight and deep models for this task.
- Empirical validation of theory. We show that the empirical performance of RF—strong average accuracy and per-feature robustness—aligns with its theoretical variance reduction and regularization properties, thereby validating the proposed optimization strategies.
- Reproducible, extensible baseline. The resulting RF-based pipeline constitutes a reproducible, low-complexity baseline that can be extended in future work to joint environmental–biological forecasting and to more advanced deep learning architectures.
2. Dataset and Problem Formulation
2.1. Notation and Variables
2.2. Multivariate, Multi-Horizon Forecasting Setup
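The 48-hour input and 24-hour output windowing can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes hourly sampling (so 48 hours equals 48 steps), a dense `(T, 7)` array with no gaps, and flattening of each window into a single feature vector for tree-based models. The function name `make_windows` and the synthetic series are hypothetical.

```python
import numpy as np

def make_windows(series, input_len=48, output_len=24):
    """Slice a (T, F) array into flattened input and target windows.

    Returns X with shape (N, input_len * F) and Y with shape (N, output_len * F),
    where N = T - input_len - output_len + 1.
    """
    series = np.asarray(series, dtype=float)
    n_steps, n_features = series.shape
    n_windows = n_steps - input_len - output_len + 1
    if n_windows <= 0:
        raise ValueError("series is shorter than one input+output window")

    X = np.empty((n_windows, input_len * n_features))
    Y = np.empty((n_windows, output_len * n_features))
    for i in range(n_windows):
        past = series[i : i + input_len]                                # 48 past steps
        future = series[i + input_len : i + input_len + output_len]    # next 24 steps
        X[i] = past.ravel()
        Y[i] = future.ravel()
    return X, Y

# Synthetic stand-in for the seven environmental variables (assumption).
rng = np.random.default_rng(0)
demo = rng.normal(size=(200, 7))
X_demo, Y_demo = make_windows(demo)
print(X_demo.shape, Y_demo.shape)
```

Flattening keeps the sketch compatible with estimators that expect two-dimensional inputs; sequence models such as LSTM or GRU would instead keep the `(48, 7)` layout.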
2.3. Forecasting Operator
2.4. Training, Validation and Test Splits
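A time-based split keeps the three partitions contiguous and in chronological order, so that no future observation informs training. The sketch below is illustrative only: the 70/15/15 fractions and the name `chronological_split` are assumptions, since the split proportions are not preserved in this outline.

```python
import numpy as np

def chronological_split(n_samples, train_frac=0.70, val_frac=0.15):
    """Return index arrays for contiguous train/validation/test blocks in time order."""
    if not (0 < train_frac < 1 and 0 <= val_frac < 1 and train_frac + val_frac < 1):
        raise ValueError("fractions must leave a non-empty test block")
    train_end = int(n_samples * train_frac)
    val_end = train_end + int(n_samples * val_frac)
    idx = np.arange(n_samples)
    # No shuffling: earlier samples train the model, later samples evaluate it.
    return idx[:train_end], idx[train_end:val_end], idx[val_end:]

train_idx, val_idx, test_idx = chronological_split(1000)
```

When windows overlap, samples near a boundary share raw observations with the neighbouring partition; splitting the raw series before windowing, or leaving a gap of one full window between blocks, avoids that leakage.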
2.5. Evaluation Metrics
2.6. Data Quality and Preprocessing
2.7. Problem Challenges and Modeling Implications
3. Random Forests for Environmental Forecasting
3.1. CART Regression Tree: Vector-Valued Leaves
3.2. Bootstrap Aggregation and Multi-Output Ensemble
3.3. Random Feature Subsampling and Variance Reduction
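For reference, the standard variance decomposition behind bagging with decorrelated trees can be written as below. The symbols are assumptions introduced here, since the paper's own notation is not reproduced in this outline: $B$ is the number of trees, $T_b(x)$ the prediction of tree $b$, $\sigma^2$ the variance of a single tree's prediction at $x$, and $\rho$ the pairwise correlation between trees, which are taken to be identically distributed.

```latex
\operatorname{Var}\!\left[\frac{1}{B}\sum_{b=1}^{B} T_b(x)\right]
  = \rho\,\sigma^{2} + \frac{1-\rho}{B}\,\sigma^{2}
```

Increasing $B$ shrinks only the second term, whereas random feature subsampling acts on the first term by lowering $\rho$.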
3.4. Consistency and Practical Considerations
3.5. RF Implementation and Link to Data
4. Optimizing Random Forest for Environmental Forecasting
4.1. Structural Regularization and Hyperparameters
4.2. Honest Forests and Split/Estimation Separation
4.3. Leaf Shrinkage and Regularized Estimation
4.4. Temporal Smoothing and Multi-Horizon Consistency
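One simple way to smooth a multi-horizon forecast is a centered moving average along the horizon axis. The sketch below is a stand-in under stated assumptions, not the smoothing operator used in the paper (which is not shown in this outline): predictions are assumed to be arranged as `(N, 24, 7)`, the window length of 3 is arbitrary, and `smooth_horizons` is a hypothetical name.

```python
import numpy as np

def smooth_horizons(y_pred, window=3):
    """Centered moving average over the horizon axis of a (N, H, F) array.

    Edges are handled by repeating the first and last horizon values,
    so the output has the same shape as the input.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    y = np.asarray(y_pred, dtype=float)
    half = window // 2
    padded = np.pad(y, ((0, 0), (half, half), (0, 0)), mode="edge")
    out = np.zeros_like(y)
    for k in range(window):
        # Each shifted slice contributes one term of the moving average.
        out += padded[:, k : k + y.shape[1], :]
    return out / window

spike = np.zeros((1, 24, 7))
spike[0, 10, 0] = 3.0
smoothed = smooth_horizons(spike)
```

A wider window suppresses more horizon-to-horizon jitter but also flattens genuine short-lived events, so the window length is best chosen on the validation split.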
4.5. Subsampling, ExtraTrees, and Efficiency
4.6. Domain-Aware Feature Grouping
4.7. Practical Guidance and Hyperparameters
- trees;
- or group-based feature sampling;
- , , ;
- subsample fraction ;
- optional temporal smoothing on the 24-step output sequence.
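The guidance above can be illustrated with scikit-learn's `RandomForestRegressor`, which supports multi-output targets directly. Every numeric value in this sketch is a placeholder: the recommended settings are not preserved in this outline, so the tree count, depth, leaf sizes, and subsample fraction below are assumptions chosen only so the example runs quickly on synthetic data.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for flattened 48-step inputs and 24-step targets, 7 features each.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(300, 48 * 7))
Y_train = rng.normal(size=(300, 24 * 7))

rf = RandomForestRegressor(
    n_estimators=50,        # placeholder tree count
    max_features="sqrt",    # random feature subsampling at each split
    max_depth=20,           # placeholder structural limit
    min_samples_leaf=5,     # placeholder leaf-size regularization
    min_samples_split=10,   # placeholder split threshold
    bootstrap=True,
    max_samples=0.8,        # placeholder subsample fraction per tree
    n_jobs=-1,
    random_state=0,
)
rf.fit(X_train, Y_train)

# Reshape flat predictions back to (samples, horizons, features).
Y_hat = rf.predict(X_train[:4]).reshape(-1, 24, 7)
print(Y_hat.shape)
```

Group-based feature sampling has no built-in switch in scikit-learn, so it would require a custom wrapper and is not shown here.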
5. Results
5.1. Experimental Setup
- Input length: 48 hours (past observations).
- Prediction length: 24 hours (future horizons).
- Evaluation metrics: root mean square error (RMSE) and mean absolute error (MAE), reported per feature and per model; we also compute averages across features (with and without DOSAT, as discussed below).
- Training loss: mean squared error (MSE) for LSTM and GRU; tree-based and ARIMA models use their standard regression objectives.
- Hardware: all classical models (ARIMA, RF, Gradient Boosting) run efficiently on CPU; LSTM/GRU can use CPU or GPU but remain lightweight at this scale.
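Per-feature RMSE and MAE can be computed as in the following sketch. It assumes predictions and targets arranged as `(N, 24, 7)` arrays, errors pooled over samples and horizons, and the feature order used in the results table; the helper name `per_feature_errors` is hypothetical.

```python
import numpy as np

FEATURES = ["temp", "sal", "vert", "tide", "dosat", "domgl", "phn"]

def per_feature_errors(y_true, y_pred):
    """Return {feature: (rmse, mae)} pooled over samples and horizons."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if y_true.shape != y_pred.shape or y_true.shape[-1] != len(FEATURES):
        raise ValueError("expected matching (N, H, 7) arrays")
    err = y_pred - y_true
    rmse = np.sqrt(np.mean(err ** 2, axis=(0, 1)))   # one value per feature
    mae = np.mean(np.abs(err), axis=(0, 1))
    return {name: (float(r), float(m)) for name, r, m in zip(FEATURES, rmse, mae)}

# Toy check: a constant offset of 2 gives RMSE = MAE = 2 for every feature.
errors = per_feature_errors(np.zeros((2, 24, 7)), np.full((2, 24, 7), 2.0))
```

Because the variables have different units and scales, these per-feature values are not directly comparable across features without normalization.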
5.2. Baseline Models
5.3. Quantitative Performance
5.3.1. Per-Feature Interpretation and Validation of the Optimized Random Forest
- Temperature (temp). Temperature dynamics arise from nonlinear interactions among meteorological forcing, tidal mixing, and salinity gradients. The optimized RF captures these interactions effectively and benefits from ensemble averaging, which stabilizes predictions under diurnal and weather-driven variability. ARIMA’s linear structure cannot represent these nonlinearities, Gradient Boosting is more sensitive to noise due to sequential boosting, and the recurrent models tend to oversmooth extremes given the limited sequence length and dataset size.
- Salinity (sal). Salinity exhibits threshold-like transitions driven by freshwater pulses and tidal mixing. The RF’s split-based structure naturally accommodates such discontinuities and remains robust to outliers and regime shifts. ARIMA fails to capture abrupt changes. Gradient Boosting is effectively tied with RF (RMSE 0.3631 versus 0.3585, with a marginally lower MAE of 0.2575 versus 0.2585) but is slightly more sensitive to rare events, and LSTM/GRU struggle with the mixture of periodic and event-driven behavior when data volume is limited.
- Vertical velocity (vert). Vertical velocity is the one feature where RF does not achieve the lowest RMSE: GRU (0.1095), LSTM (0.1129), and Gradient Boosting (0.1168) all improve on RF (0.1350). This variable is smooth and strongly autocorrelated, making it well suited to recurrent architectures. GRU, in particular, leverages gated temporal memory to track subtle flow dynamics that static models cannot represent. RF performs reasonably but lacks explicit temporal state, while ARIMA cannot model nonlinear vertical mixing and LSTM is slightly less stable than GRU on this dataset.
- Tide. Although tidal motion is quasi-periodic, it is modulated by nonlinear meteorological and bathymetric effects. The optimized RF captures these nonlinear distortions with low variance, whereas ARIMA only models idealized harmonic structure and degrades when the signal deviates from pure periodicity. Gradient Boosting is competitive but more prone to overfitting local patterns, and the recurrent models tend to misalign phase or oversmooth when trained on limited sequences.
- Dissolved oxygen saturation (dosat). Dissolved oxygen saturation is highly nonlinear and strongly influenced by temperature, biological activity, and turbulent mixing. The optimized RF handles these multi-factor interactions and remains stable under extreme values and nonstationarity. ARIMA breaks down entirely under these conditions, with an RMSE of 1665.95 against an MAE of 38.30, a gap that suggests a small number of extreme forecast excursions dominate the squared error. Gradient Boosting performs reasonably but is more sensitive to rare high/low excursions, while LSTM/GRU require larger datasets to reliably model the noisy, multi-driver dynamics.
- Dissolved oxygen concentration (domgl). Dissolved oxygen concentration depends on nonlinear coupling among temperature, salinity, and biological processes. The RF effectively models these interactions and handles heteroscedasticity and sharp drops or spikes. ARIMA misses the nonlinear coupling, Gradient Boosting is close but slightly more prone to overfitting, and LSTM/GRU benefit from temporal structure but are limited by noise and dataset size.
- pH (phn). pH responds nonlinearly to CO₂ dynamics, temperature, and biological activity. The optimized RF captures these relationships and remains robust to small-scale fluctuations and sensor noise. ARIMA’s linear formulation is insufficient for carbonate chemistry, Gradient Boosting is competitive but slightly noisier, and LSTM/GRU do not gain enough advantage from temporal structure given the limited data.
5.3.2. Validation of the Optimized Random Forest
5.4. Discussion
6. Conclusions and Future Work
Acknowledgments
References
- Kemp, W.M.; Boynton, W.R.; Adolf, J.E.; Boesch, D.F.; Boicourt, W.C.; Brush, G.; Cornwell, J.C.; Fisher, T.R.; Glibert, P.M.; Hagy, J.D.; et al. Eutrophication of Chesapeake Bay: historical trends and ecological interactions. Mar. Ecol. Prog. Ser. 2005, 303, 1–29.
- Dai, M.; Zhao, Y.; Chai, F.; Chen, M.; Chen, N.; Chen, Y.; Cheng, D.; Gan, J.; Guan, D.; Hong, Y.; et al. Persistent eutrophication and hypoxia in the coastal ocean. Camb. Prism. Coast. Futur. 2023, 1, e19.
- Breitburg, D.; Levin, L.A.; Oschlies, A.; Grégoire, M.; Chavez, F.P.; Conley, D.J.; Garçon, V.; Gilbert, D.; Gutiérrez, D.; Isensee, K.; et al. Declining oxygen in the global ocean and coastal waters. Science 2018, 359, eaam7240.
- Reichstein, M.; Camps-Valls, G.; Stevens, B.; Jung, M.; Denzler, J.; Carvalhais, N.; Prabhat, F. Deep learning and process understanding for data-driven Earth system science. Nature 2019, 566, 195–204.
- Pervej, M.F.; Tan, L.T.; Hu, R.Q. Artificial Intelligence Assisted Collaborative Edge Caching in Small Cell Networks. In Proceedings of the GLOBECOM 2020 - 2020 IEEE Global Communications Conference, 2020; pp. 1–7.
- Tan, L.T.; Hu, R.Q. Mobility-Aware Edge Caching and Computing in Vehicle Networks: A Deep Reinforcement Learning. IEEE Trans. Veh. Technol. 2018, 67, 10190–10203.
- Tan, L.T.; Hu, R.Q.; Hanzo, L. Twin-Timescale Artificial Intelligence Aided Mobility-Aware Edge Caching and Computing in Vehicular Networks. IEEE Trans. Veh. Technol. 2019, 68, 3086–3099.
- Wang, Q.; Tan, L.T.; Hu, R.Q.; Qian, Y. Hierarchical Energy-Efficient Mobile-Edge Computing in IoT Networks. IEEE Internet Things J. 2020, 7, 11626–11639.
- Le, T.; Shetty, S. Artificial intelligence-aided privacy preserving trustworthy computation and communication in 5G-based IoT networks. Ad Hoc Netw. 2022, 126, 102752.
- Zahin, A.; Tan, L.T.; Hu, R.Q. Sensor-Based Human Activity Recognition for Smart Healthcare: A Semi-supervised Machine Learning. In Proceedings of the Artificial Intelligence for Communications and Networks; Springer International Publishing, 2019; pp. 450–472.
- Zahin, A.; Tan, L.T.; Hu, R.Q. A Machine Learning Based Framework for the Smart Healthcare Monitoring. 2020 Intermountain Engineering, Technology and Computing (IETC), 2020.
- Le, T.; Reisslein, M.; Shetty, S. Multi-Timescale Actor-Critic Learning for Computing Resource Management With Semi-Markov Renewal Process Mobility. IEEE Trans. Intell. Transp. Syst. 2024, 25, 452–461.
- Tan, L.; Van, L.; Sachin, S. Privacy-Aware Framework of Robust Malware Detection in Indoor Robots: Hybrid Quantum Computing and Deep Neural Networks. TechRxiv, 2025.
- Tan, L.; Van, L.; Sachin, S. Quantum-Augmented AI/ML for O-RAN: Hierarchical Threat Detection with Synergistic Intelligence and Interpretability. TechRxiv, 2025.
- Le, T.; Le, V. DPFAGA-Dynamic Power Flow Analysis and Fault Characteristics: A Graph Attention Neural Network. In Proceedings of the 2025 International Conference on the AI Revolution: Research, Ethics, and Society (AIR-RES 2025), 2025.
- Le, V.; Le, T. Hybrid Quantum–Classical Encoding for Accurate Residue-Level pKa Prediction. In Proceedings of the International Conference on the AI Revolution: Research, Ethics, and Society (AIR-RES 2026), 2026.
- Ho, T.K. The random subspace method for constructing decision forests. IEEE Trans. Pattern Anal. Mach. Intell. 1998, 20, 832–844.
- Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
- Biau, G.; Devroye, L.; Lugosi, G. Consistency of random forests and other averaging classifiers. J. Mach. Learn. Res. 2008, 9.
- Geurts, P.; Ernst, D.; Wehenkel, L. Extremely randomized trees. Mach. Learn. 2006, 63, 3–42.
- Le, V.; Antwi, R.; Le, T. Lightweight Machine Learning for Oyster-Reef Environmental Forecasting: A Random-Forest Baseline with Classical and Deep Benchmarks. Available online: https://www.dropbox.com/scl/fi/uqh17lvvkcqtp1kux5unn/Techreport_LMLEP.pdf?rlkey=knuvsiac74ih4jswtlp9xvlhh&st=krq8pdvy&dl=0.
- Wager, S.; Athey, S. Estimation and inference of heterogeneous treatment effects using random forests. J. Am. Stat. Assoc. 2018, 113, 1228–1242.












| Feature | RMSE: ARIMA | RMSE: Random Forest | RMSE: Gradient Boosting | RMSE: LSTM | RMSE: GRU | MAE: ARIMA | MAE: Random Forest | MAE: Gradient Boosting | MAE: LSTM | MAE: GRU |
|---|---|---|---|---|---|---|---|---|---|---|
| temp | 1.0848 | 0.5470 | 0.7041 | 0.8626 | 0.8605 | 0.7209 | 0.3416 | 0.4847 | 0.5982 | 0.5911 |
| sal | 0.4444 | 0.3585 | 0.3631 | 0.5329 | 0.5194 | 0.3135 | 0.2585 | 0.2575 | 0.4225 | 0.3880 |
| vert | 0.2093 | 0.1350 | 0.1168 | 0.1129 | 0.1095 | 0.1386 | 0.0948 | 0.0839 | 0.0808 | 0.0745 |
| tide | 0.2079 | 0.0763 | 0.0850 | 0.0862 | 0.0841 | 0.1386 | 0.0502 | 0.0590 | 0.0574 | 0.0549 |
| dosat | 1665.95 | 13.93 | 15.11 | 21.25 | 20.41 | 38.30 | 9.48 | 10.32 | 15.69 | 15.04 |
| domgl | 2.0535 | 0.9666 | 1.0366 | 1.0683 | 1.0208 | 1.3064 | 0.6548 | 0.7121 | 0.7530 | 0.7139 |
| phn | 0.1948 | 0.1210 | 0.1285 | 0.1573 | 0.1616 | 0.1440 | 0.0877 | 0.0928 | 0.1211 | 0.1251 |
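
The cross-feature averages mentioned in the experimental setup, with and without dosat, can be recomputed from the tabulated RMSE values as sketched below. The numbers are copied from the table; the unweighted mean and the helper `mean_rmse` are assumptions, since the averaging convention is not stated in this outline.

```python
FEATURES = ["temp", "sal", "vert", "tide", "dosat", "domgl", "phn"]

# Per-feature RMSE copied from the results table, in FEATURES order.
RMSE = {
    "ARIMA":             [1.0848, 0.4444, 0.2093, 0.2079, 1665.95, 2.0535, 0.1948],
    "Random Forest":     [0.5470, 0.3585, 0.1350, 0.0763, 13.93, 0.9666, 0.1210],
    "Gradient Boosting": [0.7041, 0.3631, 0.1168, 0.0850, 15.11, 1.0366, 0.1285],
    "LSTM":              [0.8626, 0.5329, 0.1129, 0.0862, 21.25, 1.0683, 0.1573],
    "GRU":               [0.8605, 0.5194, 0.1095, 0.0841, 20.41, 1.0208, 0.1616],
}

def mean_rmse(values, exclude=()):
    """Unweighted mean over features, optionally skipping some of them."""
    kept = [v for name, v in zip(FEATURES, values) if name not in exclude]
    return sum(kept) / len(kept)

with_dosat = {m: mean_rmse(v) for m, v in RMSE.items()}
without_dosat = {m: mean_rmse(v, exclude=("dosat",)) for m, v in RMSE.items()}

best_with = min(with_dosat, key=with_dosat.get)
best_without = min(without_dosat, key=without_dosat.get)
print(best_with, best_without)
```

Because dosat errors are one to two orders of magnitude larger than those of the other variables, the average that includes dosat is dominated by that single feature, which is why reporting both versions is informative.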

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).