Preprint Article, Version 1. This version is not peer-reviewed.

How to Explain and Predict the Shape Parameter of the Generalized Extreme Value Distribution of Streamflow Extremes Using a Big Dataset

Version 1 : Received: 8 November 2018 / Approved: 12 November 2018 / Online: 12 November 2018 (04:59:22 CET)

A peer-reviewed article of this Preprint also exists.

Tyralis H, Papacharalampous G, Tantanee S (2019) How to explain and predict the shape parameter of the generalized extreme value distribution of streamflow extremes using a big dataset. Journal of Hydrology 574:628–645. doi:10.1016/j.jhydrol.2019.04.070

Journal reference: Journal of Hydrology 2019, 574, 628-645
DOI: 10.1016/j.jhydrol.2019.04.070

Abstract

Identifying important explanatory variables for the location and scale parameters of the generalized extreme value (GEV) distribution, when the latter is used to model annual streamflow maxima, is known to have reduced the uncertainties in inferences, as estimated through regional flood frequency analysis frameworks. However, important explanatory variables have not been found for the GEV shape parameter, despite its critical significance: it determines the behaviour of the upper tail of the distribution. Here we examine the nature of the shape parameter by revealing its relationships with basin attributes. We use a dataset that comprises daily streamflow and forcing data, climatic indices, and topographic, land cover, soil and geological characteristics of 591 basins with minimal human influence in the contiguous United States. We propose a framework that uses random forests and linear models to find (a) important predictor variables of the shape parameter and (b) an interpretable model with high predictive performance. The study comprises assessing the predictive performance of the models, selecting a parsimonious prediction model and interpreting the results in an ad hoc manner. The findings suggest that the shape parameter mostly depends on climatic indices, while the selected prediction model results in more than 20% higher accuracy in terms of RMSE compared to a naïve approach. The implications are important, since incorporating the regression model into regional flood frequency analysis frameworks can considerably reduce the predictive uncertainties.
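As a rough illustration of why the shape parameter controls the upper tail, the sketch below evaluates the standard GEV quantile function for three shape values while holding location and scale fixed. It is not taken from the article: the function names `gev_quantile` and `return_level`, the parameter values, and the sign convention (positive `xi` meaning a heavier upper tail) are all assumptions made for this example, and some hydrological texts and libraries use the opposite sign.

```python
import math


def gev_quantile(p, mu, sigma, xi):
    """Return the p-quantile of a GEV(mu, sigma, xi) distribution.

    Assumed sign convention: xi > 0 gives a heavy (unbounded) upper tail,
    xi < 0 gives a bounded upper tail, and xi = 0 is the Gumbel case.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    y = -math.log(p)  # positive for 0 < p < 1
    if abs(xi) < 1e-12:
        # Gumbel limit of the general formula as xi -> 0
        return mu - sigma * math.log(y)
    return mu + sigma * (y ** (-xi) - 1.0) / xi


def return_level(T, mu, sigma, xi):
    """Level exceeded on average once every T years, for annual maxima."""
    return gev_quantile(1.0 - 1.0 / T, mu, sigma, xi)


if __name__ == "__main__":
    # Hypothetical location and scale values; only xi changes between rows.
    for xi in (-0.2, 0.0, 0.2):
        print(f"xi={xi:+.1f}  100-year level={return_level(100, 100.0, 30.0, xi):.1f}")
```

With location and scale unchanged, the 100-year level grows as `xi` increases, which is why errors in the shape parameter translate directly into errors in estimated extreme floods.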

Subject Areas

CAMELS; flood frequency; hydrological signatures; extreme value theory; random forests; spatial modelling
