Reverse flood routing in an open channel using genetic algorithm

Flood routing in flood forecasting issue, calculation the height of flood bands, determining the river boundaries, and estimation of protective facilities for flood –exposed building is applicable. In many cases, due to the lack of measuring stations, the status of the upstream flood generating hydrograph is not known. The purpose of this study is to present an integrated method comprising of an optimization model and a hydrodynamic numerical model for flood modeling to determine the upstream hydrograph using the provided hydrograph at the downstream measuring station of a river. The routing procedure consists of three steps: (1) generating a hypothetical upstream hydrograph using genetic algorithm method; (2) hydrodynamic modeling using a numerical simulation model for flood routing according to the hypothetical hydrograph which is generated in the first step; (3) compare the calculated and observed hydrograph in downstream by using a fitness function. This recommended procedure was named as Reverse Flood Routing Method (RFRM) and was then applied to Karun River, the largest river in Iran. Comparing the generated upstream hydrograph by the RFRM model with the corresponding measured hydrograph at Ahvaz hydrometric station, as an ungauged river location, shows the high accuracy of the recommended model in this study.


INTRODUCTION
Since the earliest recorded civilization, humans has tended to settle near rivers because of proximity to water supplies, natural needs for urban and rural populations, and advantages for agriculture activities, resulting for living a significant percentage of people around the world near rivers (Baldassarre et al., 2013). But living along rivers has dangers, like a flood, which is a natural phenomenon and its occurrence is inevitable in many places and times and sometimes no humanmade structures can stop it. However, there are some methods for mitigating the risk of its happening and consequences. Many researchers has studied the causes and presented strategies for reducing the destructive effects of the flood. Floods control and management and also the design of the suitable flood-resistance structures and specifying flood catchment areas on both sides of a river is dependent on the flood hydrograph, its peak discharge, and base flow time. Flow discharge is obtained from velocity, depth, and width of flow in the river, so river, and some cross-sections may be prepared for flow depth measurement (Zucco et al., 2015). These equipped cross-sections are generally called hydrometric stations. In these types of stations, water elevations are continuously recorded and are converted to discharges using a pre-prepared accurate rating curve or stage-discharge relationship. Therefore, the flood hydrographs are provided at these stations (Perumal et al., 2007;D'Oria and Tanda, 2012). Construction, equipment, and maintenance of hydrometric stations require high costs for governments or water agencies, especially for natural waterways with complex and challenging conditions. For this reason, the number of such stations along a river or canal may be limited and is mostly dependent on the importance of the river area and also on the current and future development projects. Therefore, providing hydrograph at any point of a waterway or river needs to set up a hydrodynamic numerical model and run it for the desired reach of channel and time period. These types of models need all information like stagedischarge at upstream and downstream boundaries of the chosen distance during the selected period.
To mitigate the flood risk, the knowledge of discharge at gauging stations and upstream or downstream of the section is essential (Zucco et al., 2015). Accurate information about flood characteristics is one of the critical issues for managing its disaster (Soleimani-Alyar et al., 2016). There are several hydraulic and hydrologic mathematical methods used for flood modeling in rivers and waterways (Pashazadeh and Javan, 2020). A hydraulic model consists of the conservation and energy or momentum equations that use geometric and hydraulic data for flood routing. But a hydrologic model relies on limited parameters coupled with a linear or nonlinear equations based on the continuity equation for flood routing, which is simple in comparison to a hydraulic model (Bozorg Hadad et al., 2018). Flow hydrographs are one of the fundamental elements for flood modeling which is recorded as time series at hydrometric or gauged stations also, the information of discharge hydrograph is essential for flood-risk assessment and management of water resource systems (Todaro et al., 2019). However, due to the lack of hydrometric and, or gauged stations, these hydrographs may not be available everywhere (Kaya et al., 2017). The Muskingum method is one of the hydrological approaches that been used for many years due to its simplicity and acceptable accuracy (Norouzi and Bazargan, 2020). The usual downstream routing of flood flow, i.e., using the existing upstream hydrograph, is a stable process for the Froude numbers less than one and, the results are reliable. In contrast, the reverse flood routing determines the upstream hydrograph based on the river geometry, flow hydraulic characteristics, and downstream hydrograph (Bruen and Dooge, 2007). A part of performing reverse routing is to solve the 1-D Saint-Venant equations, solving these equations requires initial and boundary conditions, and discharge hydrograph at the end of the river reach (Spada et al., 2017).
In this study, a novel procedure is deployed to develop an integrated model comprising a hydrodynamic mathematical model, which solves the 1-D Saint-Venant equations, and an optimization model accompanied by a genetic algorithm method. This combined program can generate the upstream missed hydrograph using the existing downstream hydrograph.

Governing 1D Hydrodynamic Equations
The basic formulation of unsteady one-dimensional flow in open channels defined as St.venant equations. These equations consist of continuity and momentum equation, which written in the following form, respectively (Cunge et al., 1980;Mohamadi andKashefipour, 2014, 2020): Where = discharge, = cross section area, x = distance in flow direction, = gravitational acceleration, Z= water surface elevation, R=hydraulic radius, n=Manning roughness coefficient, qL=lateral inflow or out flow (negative for out flow), TW=top width of channel, t= time and β=momentum correction coefficient.
These partial differential equations are solved using different types of numerical methods. For example, in FASTER (Flow And Solute Transport in Estuaries and Rivers) model, these equations are numerically solved using an implicit staggered central finite difference scheme with variable grid size, which is accurate and unconditionally stable. More details regarding this type of numerical solution of the Saint-Venant equations can be found in Kashefipour (2002). In the MIKE11 commercial model, which was developed and released by DHI (Danish Hydraulic Institute), the Saint-Venant equations are solved based on an implicit finite difference scheme developed by Abbott and lonescu (DHI Reference Manual, 2017).

Introduction to Optimization
To minimize an error, effort or energy, or to maximize profit or outcomes, the optimization process can be used in all sciences such as math, computer, engineering, and so on (Kramer, 2017). In other words, optimization is a technical and iterative procedure for making something better. In any process, there is a set of inputs and a set of outputs, as shown in Figure 1.

Fig.1. Optimization process
Optimization refers to finding the values of inputs in such a way to get the "best" output values. The definition of "best" varies from a problem to another, but in mathematical terms, it refers to maximizing or minimizing one or more objective functions, by changing the input parameters. The set of all possible solutions or values, which the inputs can take, make up the search space. In this search space, lies a facts or a set of points that gives the optimal solution. The optimizations aim is to find that point or set of points in the search space.

Genetic Algorithm
Genetic Algorithms (GA) guide search strategies inspired by a biological process, derived from Darwin's principal of natural selection and survival of fittest (Sivapragasam et al., 2008). This method is frequently used to solve optimization problems, research and in machine learning. The general framework of the most evolutionary algorithm are similar, that means they start with a population of random solution which is evaluated by a fitness function. The most popular components of this algorithm are crossover, mutation, and selection (Mirjalili et al., 2020). Crossover is an operator that allows exchanging genes between parents to produce a new solution.
There are three main ways of crossover methods in the literature, single-point crossover, double point crossover and multipoint crossover. For example, in single-point crossover, the chromosome of the two-parent solution change before and after single-point. Crossover is the primary operator of the exploitation of GA. The mutation causes random changes in genes and is the primary mechanism of exploration in GA. There are many mutation techniques such as uniform, nonuniform etc. that can be used in a genetic algorithms, by using a selection operator, the best offspring solutions selected to be parents in the new parental population to allow convergence towards optimal solutions. Selection process is based on the fitness value in a population. When we face a minimize situation, the low value is acceptable but in a maximize situation, the high value is preferred. There are many methods for simulating the selection, Roulette wheel selection is one way. More details about GA can be found in Haupt and Haupt (2004).

MATERIALS AND METHODS
Describe the procedure of developing the integrated model in this research is followed using an application for Karun River, which flows in the southwest of Iran.

Case Study Description
Karun River is the largest and the only navigation river in the southwest of Iran. The basin area of this river is about 45220 km 2 , with the 50 years average discharge being reported about 600m 3 /s. The selected reach length was about 50 km with, 49 cross-sections ( Figure 2). This figure shows study area domain; Ahvaz (cross-section 49 or C.S49) and Farsiat (cross-section 1 or C.S1) are the upstream and downstream boundaries, respectively. The measured hydrographs (Q-t chart) and time-series of water elevations (Z-t chart) were available at both boundaries for several periods, i.e., the continuous measurements at both hydrometric stations.
In this study, the numerical FASTER model, which was initially developed by Kashefipour (2002), and genetic algorithm was used for simulation the flow in Karun River and optimization, respectively. A piecewise equation was added to the FASTER model enabling it to calculate the Manning roughness coefficient at any spatial point and any time during simulation using flow discharge and depth. This dynamic equation makes the model very accurate for determining timedependent discharge and depth at any point of the domain. This part of the FASTER model is fully described by Mohamadi and Kashefipour (2014). The FASTER model was set up for the aforementioned domain. All data needed for implementing the model for the years 2016-2017 and also the bed elevations for all cross-sections were provided by KWPA (Khuzestan Water and Power Authorities) company. Ahvaz and Farsiat hydrometric stations were defined as the upstream and downstream boundaries, respectively (see Figure 2). The Q-t data is usually defined for the upstream boundary and the Z-t data for the downstream boundary.

Calibration of FASTER Model
Since the main part of the developed integrated model in this study is the flow simulation model, it was essential to ensure from the FASTER model accuracy in the prediction of discharges and water surface elevations. For this reason, the FASTER model was first calibrated using the existing data. It was found that the dynamic piecewise equations introduced for the Manning roughness coefficient by Mohamadi and Kashefipour (2014) in the FASTER model need to be slightly revised for this domain of Karun River. The original equations were replaced by the revised ones in the FASTER model. So we were assured of the accuracy of the model predictions for the discharges and water surface elevations. These two equations are written as Equations 3 and 4. Comparison of these equations with the original ones in the FASTER model shows that for the discharges more than 1100 m 3 /s, the roughness coefficients increase about 9%, and this value was about 27% for the discharges less than 1100 m 3 /s.
Where yave and uave are the cross-sectional average of flow depth and velocity, respectively; a=1.09 and b=1.27.

Optimization Method
The optimization model is written based on the genetic algorithm (GA) using the well-known and free access MATLAB software. To write a genetic algorithm first, a function as a fitness function is defined. The fitness function in this study is defined as Equation 5.
Where Qobs and Qcal are measured and calculated discharge, respectively. Given that we face a minimization situation, the goal is to bring this function closer to zero as close as possible. By changing some of the parameters in the genetic algorithm, at last, seven different optimization models were written. These parameters were the number of members of the population (Npop), percentage of crossover ( ), number of mutations (Nmut) and, number of iteration. These models were produced based on trial and error, previous experiences, and using two types of numbers, Integer (Models 1 to 3) and Real (Models 4 to 7) numbers, 0.9

Model Development
The main model consists of three parts including: 1-a numerical hydrodynamic model (here FASTER model), which solves the Saint Venant equations and determines water elevations and discharges at any point and time of simulation. 2-a genetic algorithm (GA) model, which produces the new set of discharges (hydrograph) for upstream boundary using the previous set and checking the fitness function (Equation 5) for minimization, and 3-an interface for linking the above models and manage the number of iterations towards the best-optimized results. Ahvaz and Farsiat hydrometric stations were set as the upstream and downstream boundaries, respectively. The Q-t data (hydrograph) and Z-t data for ten days period in 2016-2017 were defined for these boundaries, respectively. It is assumed that the upstream hydrograph is not existed, and the hydrodynamic model cannot be implemented without this hydrograph. Therefore first, a hypothetical hydrograph is produced for upstream boundary using the GA part of the model based on the downstream hydrograph. For producing the first hydrograph, a range of discharge was defined for the GA model. The hydrograph was randomly chosen between twice of the maximum discharge and 0.8 of the minimum discharge the downstream hydrograph. This range was set for converging the solution faster and receiving the final upstream hydrograph quicker with the minimum necessary iteration. The flow routing is then started along with this domain and continued up to the end of simulation time (i.e., ten days or 240 hours). At the end of the simulation time of the FASTER model, the downstream hydrograph is provided. By the defined fitness function, the calculated and observed hydrographs in the downstream boundary are compared. At last, for the reasonable error, the generated hypothetical hydrograph is the final upstream hydrograph. Otherwise, the GA model produces another hypothetical hydrograph, and this cycle continues until the calculated and observed hydrographs at the downstream boundary had an acceptable error. The interface part of the model transfers the information between the FASTER and GA models, i.e., the upstream hydrograph from GA to FASTER and the downstream hydrograph from FASTER to GA. In this part of the model, a conditional statement is placed to control the number of iterations based on the calculated error. According to the number of iterations, i.e., 10000 and running FASTER and GA models, it is necessary to use a supercomputer to reduce the program run time. A flowchart of the model is shown in Figure 3.

RESULTS AND DISCUSSION
Optimization problems are generally solved using a series of repetitive mathematical calculations and, by defining one or more conditional statements, tries to get the best answers. For example, in this study, in each iteration, a GA model, a numerical hydrodynamic model, and an interface model are implemented, and running these models together with a usual computer takes a very long time, especially when the number of iterations is high. Also, it should be mentioned that according to the simulation time and the amount of the time step, the hydrodynamic model may take a considerable time to be completed in each iteration. Therefore, in this type of computer program, the time consuming of running program is a key parameter. For minimizing the timeconsuming running a program, we used a supercomputer with a 64-core processor. However, selecting a suitable combination of the parameters and their values for the GA model (i.e., models 1-7, see Table 1) may significantly reduce time of implementing the main model. In addition to program execution time, the performance and accuracy of the GA model, which produces the upstream hydrograph, is another key parameter.
In this study, the number of iteration for all seven considered models in the GA model was the same and equal to 10000. The most critical decision for implementing a genetic algorithm is to choose a good representation, otherwise leads to poor performance and answer. The results showed that the best combination of the GA parameters is model No.6 (see Table 1). Running the program revealed that after 3030 iterations, the RMSE% of comparing the produced and real hydrographs for the downstream boundary was calculated less than 0.005, which indicated the high accuracy of the GA model. Furthermore, the implementation time of the program to reach the best answer is much shorter.
The produced hydrographs for the upstream boundary by the different GA models are illustrated in Table 2, and the real hydrograph is also added to this table. The RMSE% based on Equation 6 is also calculated and added to this table. As it is shown in Table 2, model No.6 performed better than the other considered models. Figures 4a-4g compares the produced or estimated hydrographs by seven GA models with the measured or actual hydrograph at the upper boundary, i.e., Ahvaz hydrometric station in this study.  Knowing the river conditions in terms of different discharges is essential for many river engineering projects, such as numerical flow and water quality modeling in riverine basins, watershed management, designing hydraulic and water supply structures, and flood control and distribution plans. However, continuous measurement of discharges along a river network may need to establish several hydrometric stations, and this may be impossible due to budget shortage or severe conditions, or high instability of river sections. This research teaches us how to produce the missed or non-measured hydrographs for upstream from downstream or vice-versa.

Conclusion
In this study, an integrated model was developed comprising a GA model, a hydrodynamic numerical 1-D model, and an interface program. This mathematical model was written so it can be used for the reverse flow routing to produce an upstream boundary hydrograph from the known downstream hydrograph. It was found that the GA parameters play an important role in not only producing precise hydrographs and best answers, but also reducing consuming computer running time. Here, we could generate the upstream hydrograph with a minimum RMSE error 0.04% for the GA parameters of Npop=10 (number of members of population), Nmut=1 (number of mutation), and PC=0.9 (percentage of crossover) with minimum iteration.