2.1. Study Area and Data
2.1.1. Study Reservoir: Bovan (Aleksinac, Serbia)
The study was conducted for the Bovan Reservoir, formed by the construction of the Bovan dam in the vicinity of Aleksinac (southeastern Serbia). The reservoir is multipurpose and is used for municipal water supply for Aleksinac and surrounding settlements, flood protection, sediment retention, low-flow augmentation, hydropower generation, and irrigation. Due to these functions, the reservoir represents an important regional infrastructure asset for water supply and water-regime management.
The catchment is characterized by a mixed rainfall-snow hydrological regime, with peak inflows typically occurring during spring and early summer. Reservoir operation is primarily governed by water-supply requirements and flood-protection constraints, which influence the seasonal storage strategy.
The operational regulation type corresponds to annual flow equalization, implying pronounced seasonal components in the water-level regime as well as a dependence on antecedent storage conditions (system “memory” effects). This characteristic is particularly relevant for sequential modelling at an hourly time step.
The main reservoir characteristics (based on the available project/operational information) are: gross storage 60 × 10⁶ m³, active storage 41 × 10⁶ m³, dead storage 3 × 10⁶ m³, minimum operating level 243.00 m a.s.l., normal water level 252.50 m a.s.l., and maximum water level 261.50 m a.s.l. In practice, reservoir operation is conducted within operational zones defined by the dead storage, minimum operating level, and normal/maximum water levels. Inflow–outflow scenarios are evaluated in terms of maintaining the water level within the prescribed limits. Water-level dynamics follow the storage balance, while changes in storage are translated into water-level variations through the elevation–storage relationship (H–V), which motivates the need for stable multi-step simulation.
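The link between the storage balance and the H–V relationship can be illustrated with a minimal sketch. It assumes an explicit hourly update of storage from inflow and outflow, followed by linear interpolation on an elevation–storage table. The curve points, function names, and the constant `DT_SECONDS` are hypothetical placeholders introduced for illustration; they are not the surveyed Bovan curve and not the model used in this study.

```python
import numpy as np

DT_SECONDS = 3600.0  # hourly time step (1 h)

# Placeholder elevation-storage (H-V) table: illustrative values only,
# NOT the surveyed curve of the Bovan Reservoir.
V_CURVE = np.array([0.0, 20e6, 40e6, 60e6])       # storage, m^3
H_CURVE = np.array([240.0, 248.0, 255.0, 262.0])  # water level, m a.s.l.


def step_storage(v, q_in, q_out, dt=DT_SECONDS):
    """One explicit step of the storage balance dV/dt = Q_in - Q_out."""
    return v + (q_in - q_out) * dt


def level_from_storage(v):
    """Map storage to water level by linear interpolation on the H-V table."""
    return float(np.interp(v, V_CURVE, H_CURVE))


def simulate_levels(v0, q_in, q_out):
    """Recursive multi-step simulation: each step starts from the previous storage."""
    v = v0
    levels = []
    for qi, qo in zip(q_in, q_out):
        v = step_storage(v, qi, qo)
        levels.append(level_from_storage(v))
    return levels
```

Because each step reuses the previously computed storage, any error in one step propagates to all later steps, which is why stability of the multi-step simulation matters.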
Climatically, the area belongs to the temperate zone with pronounced temperate-continental characteristics, with seasonal variability reflected in inflow conditions and water-level dynamics.
2.1.2. Data Sources and Measured Variables
The data were obtained from the operational SCADA system, which collects and archives real-time measurements from the monitoring and control infrastructure of the Bovan dam. The SCADA system at Bovan has been in operational use for only the last few years and, in practice, exhibits frequent data gaps and availability issues (missing values and discontinuities in the time series). The available dataset therefore represents a typical real-world operational setting in which measurements exist but are not ideal. This circumstance is both a limitation and a key motivation of the study: the objective is to develop and evaluate a model that remains usable for scenario-based analyses even when the dataset is relatively short and affected by interruptions.
Although the observation period is relatively short for data-driven modelling, the hourly resolution provides a sufficiently large number of samples for training deep learning models. Nevertheless, the limited temporal coverage may restrict representation of rare extreme events.
SCADA systems are widely used for monitoring and control of industrial and municipal infrastructure, enabling continuous acquisition of sensor measurements and operational records. In this study, the analysis is based on hourly (1 h) data covering the period from 18 May 2021 to 20 October 2022. The following variables were used:
Reservoir water level, 1 h;
Inflow proxy, represented by discharge at the Žučkovac hydrological station, 1 h. The station is part of the RHMZ reporting network (South Morava basin). It was established in 1967 and has a long-term discharge record from that year, digital water-level recording since 2007, and a catchment area of 394 km². These characteristics make Žučkovac a representative indicator of inflow variability in the study catchment [23];
Outflow (release), defined as the aggregated discharge across all measurement points where releases are recorded, 1 h;
Water temperature, 1 h.
Because discharge at the Žučkovac station serves only as an indicator of inflow variability in the study basin, any mismatch between the flow at the station cross-section and the actual inflow entering the reservoir is treated as part of real-world measurement and data uncertainty.
2.1.3. Data Preprocessing and Missing-Value Handling
All signals were aligned to an hourly time step (1 h). Missing values were handled using a two-level strategy: short gaps (up to several consecutive hours) were filled by linear interpolation to preserve continuity for sequence construction, whereas longer gaps were not interpolated. Instead, any training or evaluation windows containing such discontinuities were excluded. This approach reduces the risk of introducing artificial trends and supports a more realistic assessment of model performance under operational data conditions.
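The two-level gap strategy described above can be sketched as follows. This is a minimal illustration under assumptions, not the authors' implementation: the threshold `MAX_GAP_HOURS = 3` is a hypothetical value standing in for "up to several consecutive hours", and the function names are introduced only for this example. Runs of missing values no longer than the threshold are filled by linear interpolation, longer runs stay missing, and only windows without any missing value are kept.

```python
import numpy as np
import pandas as pd

MAX_GAP_HOURS = 3  # hypothetical threshold; the text only says "several consecutive hours"


def fill_short_gaps(series, max_gap=MAX_GAP_HOURS):
    """Linearly interpolate only those NaN runs whose length is <= max_gap."""
    is_nan = series.isna()
    # Label each run of consecutive missing / non-missing samples.
    run_id = (is_nan != is_nan.shift()).cumsum()
    # Length of the NaN run each sample belongs to (0 for observed samples).
    run_len = is_nan.groupby(run_id).transform("sum")
    short = is_nan & (run_len <= max_gap)
    interpolated = series.interpolate(method="linear", limit_area="inside")
    # Keep original values, and take interpolated ones only inside short gaps.
    return series.where(~short, interpolated)


def valid_window_starts(frame, window):
    """Start positions of length-`window` slices that contain no missing value."""
    ok = frame.notna().all(axis=1).to_numpy().astype(int)
    counts = np.convolve(ok, np.ones(window, dtype=int), mode="valid")
    return np.flatnonzero(counts == window)
```

Computing the run length explicitly avoids partially filling long gaps, which a plain interpolation with a fill limit would otherwise do.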
2.1.4. Train/Validation/Test Split (Time-Ordered)
The dataset was split chronologically, without any random shuffling, to prevent information leakage from future observations into the past. The first 80% of the time series was used for model development (training), with the last 10% of this training period (that is, 8% of the full record) reserved as a validation subset. The final 20% of the full record was retained as an independent test set.
Hyperparameter selection and early stopping were performed exclusively on the validation subset derived from the training period.
The test set remained untouched during model development and was used only for the final performance assessment in the recursive multi-step (rollout) simulation mode.
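The split described above can be expressed as index ranges over the time-ordered samples. The sketch below is an illustration under assumptions: the function name and its parameters are hypothetical, and the fractions are applied to sample counts, whereas the study may have applied them to calendar time or to gap-free windows.

```python
def chronological_split(n_samples, dev_fraction=0.8, val_fraction_of_dev=0.1):
    """Return (train, val, test) slices for a time-ordered record, without shuffling."""
    n_dev = round(n_samples * dev_fraction)     # development period (training + validation)
    n_val = round(n_dev * val_fraction_of_dev)  # validation = tail of the development period
    train = slice(0, n_dev - n_val)
    val = slice(n_dev - n_val, n_dev)
    test = slice(n_dev, n_samples)              # most recent part of the record
    return train, val, test
```

For a record of 1000 hourly samples this gives 720 training, 80 validation, and 200 test samples, in that temporal order.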
Table 1.
Variables used in the analysis.
| Variable | Symbol | Unit | Temporal resolution |
|---|---|---|---|
| Reservoir water level | | m a.s.l. | 1 h |
| Inflow proxy | | m³/s | 1 h |
| Aggregated outflow | | m³/s | 1 h |
| Water temperature | | °C | 1 h |
All variables were aligned to the hourly (1 h) time step. Missing values were handled using a two-level strategy: short gaps were filled by linear interpolation, while longer gaps were not interpolated and windows containing discontinuities were excluded from training and evaluation.
Table 2.
Time coverage and dataset split.
| Dataset | Period | Role in modelling |
|---|---|---|
| Full dataset | 18 May 2021–20 October 2022 | Hourly inputs |
| Model training (with an internal validation subset) | 18 May 2021–16 July 2022 | Model training and validation |
| Test period | 16 July 2022–20 October 2022 | Final performance evaluation |
Early-stopping note. Early stopping was applied only on the validation subset extracted from the training period, while the test set was kept independent and used exclusively for final evaluation in rollout mode. This protocol enables configuration selection based on stability within the validation part of the training period, while reporting multi-step simulation performance on an unseen test window.
2.1.5. Data Preprocessing and Feature Engineering
All signals (water level, inflow, outflow, and water temperature) were aligned to an hourly time step (1 h), and missing values were handled with the two-level strategy described in Section 2.1.3. To enable the model to learn periodic patterns typical of reservoir operation (daily operational cycles and seasonal variability), additional time-related covariates were derived from the timestamp. Specifically, the features Hour, DayOfYear, and Weekday were constructed to encode intra-day, annual, and weekly periodicity.
This approach allows part of the unstructured seasonal and operational effects to be explicitly represented in the input, without introducing additional physical process variables.
Table 3.
Derived temporal features used in the model.
| Derived feature | Symbol | Range/type | Resolution | Purpose in the model |
|---|---|---|---|---|
| Hour of day | Hour | 0–23 | 1 h | Intra-day periodicity and operational patterns |
| Day of year | DayOfYear | 1–365 (366) | 1 h | Seasonal variability (hydrological and thermal regime) |
| Day of week | Weekday | 0–6 | 1 h | Weekday/weekend differences; demand and water-supply operation |
The Weekday feature was introduced as a proxy for changes in demand and operational management between working days and weekends, which can influence abstraction patterns and, indirectly, releases.
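The three timestamp-derived covariates could be computed along the following lines. This is a sketch under assumptions: it presumes the hourly data are held in a pandas DataFrame with a DatetimeIndex, the function name is hypothetical, and the Weekday convention (Monday = 0, Sunday = 6) is that of pandas, since the text gives only the range 0–6.

```python
import pandas as pd


def add_time_features(frame):
    """Append Hour, DayOfYear, and Weekday columns derived from a DatetimeIndex."""
    out = frame.copy()
    idx = out.index
    out["Hour"] = idx.hour            # 0-23
    out["DayOfYear"] = idx.dayofyear  # 1-365 (366 in leap years)
    out["Weekday"] = idx.dayofweek    # assumed convention: Monday = 0 ... Sunday = 6
    return out
```

The features are kept as plain integers here, matching the ranges listed in Table 3; any further encoding is left open.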