ARTICLE | doi:10.20944/preprints202301.0561.v2
Subject: Computer Science And Mathematics, Information Systems
Keywords: Sensors; data-set; Machine learning; river floods; river level
Online: 1 February 2023 (03:54:50 CET)
Reliable and accurate flood prediction is a challenging task in poorly gauged basins due to data scarcity. Data is an essential component of any AI/ML model today, and the performance of such models depends heavily on the availability of a sufficient amount of trusted, representative data. However, unlike a few well-studied rivers, most rivers in developing countries are still insufficiently monitored, which significantly hinders the design and development of advanced flood prediction models and early warning systems. This paper presents a multi-modal, sensor-based, near-real-time river monitoring system that produces a multi-feature data set for the Kikuletwa river in Northern Tanzania, an area that suffers heavily from frequent floods. Our deployed system, which gathers information about river depth levels and weather at several locations, aims at widening the ground truth of the river characteristics and eventually improving the accuracy of flood predictions. We provide details on the monitoring system used to gather the data and report on the methodology and the nature of the data. Finally, we present the relevance of the data set in the context of flood prediction, discussing the most suitable AI/ML-based forecasting approaches, while also highlighting some applications of the data set beyond flood warning systems.
ARTICLE | doi:10.20944/preprints202301.0558.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning
Keywords: Heavy rainfall; River floods; Machine learning
Online: 30 January 2023 (13:01:38 CET)
Advancements in Machine Learning techniques, the availability of more data-sets, and increased computing power have enabled significant growth in a number of research areas. Predicting, detecting and classifying complex events in earth systems, which are by nature difficult to model, is one such area. In this work, we investigate the application of different machine learning techniques for detecting and identifying extreme rainfall events in a sub-catchment within the Pangani River Basin in Northern Tanzania. Identifying and predicting extreme rainfall events is a crucial preliminary task towards predicting rainfall-induced river floods. To identify a rain condition in the selected sub-catchment, we use data from five weather stations which have been labeled for the whole sub-catchment. To assess which Machine Learning technique is best suited to rainfall identification, we apply five different algorithms to a historical dataset covering the period from 1979 to 2014. We evaluate the performance of the models in terms of precision and recall, reporting Random Forest and XGBoost as the ones with the best overall performance. However, since the class distribution is imbalanced, the generic Multi-layer Perceptron performs best when identifying the heavy rainfall events, which are ultimately the main cause of rainfall-induced river floods in the Pangani River Basin.
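The distinction the abstract draws between overall performance and performance on the heavy rainfall class can be illustrated with per-class precision and recall. The sketch below is only an illustration, not the authors' evaluation code: the class names (`none`, `light`, `heavy`), the function name, and the ten toy labels are all hypothetical and do not come from the paper's dataset.

```python
from collections import Counter


def per_class_precision_recall(y_true, y_pred):
    """Return {label: (precision, recall)} for every label that appears."""
    actual = Counter(y_true)      # how often each label truly occurs
    predicted = Counter(y_pred)   # how often each label is predicted
    true_pos = Counter()          # correct predictions per label
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            true_pos[truth] += 1

    scores = {}
    for label in sorted(set(actual) | set(predicted)):
        precision = true_pos[label] / predicted[label] if predicted[label] else 0.0
        recall = true_pos[label] / actual[label] if actual[label] else 0.0
        scores[label] = (precision, recall)
    return scores


# Hypothetical toy labels (NOT from the paper): an imbalanced set in which
# the majority class "none" dominates and "heavy" is rare.
y_true = ["none"] * 6 + ["light", "light", "heavy", "heavy"]
y_pred = ["none"] * 6 + ["light", "none", "heavy", "none"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
scores = per_class_precision_recall(y_true, y_pred)

print(f"accuracy: {accuracy:.2f}")
for label, (precision, recall) in scores.items():
    print(f"{label:>5}: precision={precision:.2f} recall={recall:.2f}")
```

In this toy case the classifier is right on 8 of 10 samples, yet it recovers only half of the `heavy` events. A model can therefore rank well on aggregate scores while another model does better on the rare class, which is why per-class figures matter when the class distribution is imbalanced.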