Assessing the performance of satellite-based products for monitoring extreme rainfall events in Bangladesh

This work analyzes the performance of satellite-based precipitation products for monitoring extreme rainfall events. Five precipitation products are inter-compared and evaluated for their ability to capture four indices of extreme rainfall during 1998-2019. The satellite products show variable performance: in general, the occurrence and amount of rainfall in extreme events can be systematically either underestimated or overestimated by the datasets throughout the country. Products that incorporate ground-truth data show the best performance.


INTRODUCTION
Bangladesh is a country highly affected by natural disasters. In many cases, these disasters are associated with extreme hydrometeorological events, such as tropical cyclones and wet spells, which cause floods and landslides and affect lives, infrastructure and livelihoods (Eckstein et al., 2020). These events can be triggered by intense precipitation during the monsoon season and can be enhanced by geographical factors such as proximity to the sea and low-lying, flat terrain (Mirza, 2011), which, combined with the high population density, create conditions of high propensity for natural disasters. Moreover, given that a large proportion of Bangladesh's land is under agricultural use, extreme rainfall events can become a significant threat to the dominant rainfed crops and to the country's food security during the main crop season (Kelley et al., 2020).
The above highlights the importance of accurate precipitation measurements for characterizing the amount, frequency and variability of extreme events over a small country where precipitation exhibits high variability (Shahid, 2010). However, timely and reliable rainfall information with appropriate spatial coverage can be difficult to obtain from rain gauges, given their installation and maintenance costs. Apart from ground observation networks, other instruments are available for measuring precipitation, namely radars and satellites. Weather radars have proven very useful in generating detailed information at regional levels, but few networks worldwide generate the homogeneous, long-term data that are necessary for studies of extreme events, which limits their coverage (Liang and Ding, 2017). In contrast, gridded precipitation products that are continuous in time and space, generated by merging multiple ground and satellite sources of information, have made it possible to capture both spatial and temporal variability (Sorooshian et al., 2011).
Multiple gridded precipitation datasets have been released during the last decades for different purposes, resolutions, and spatial coverages (Beck et al., 2017). These products range from those generated using rain gauges only, such as the Global Precipitation Climatology Centre (GPCC) (Schamm et al., 2010) and the Asian Precipitation-Highly Resolved Observational Data Integration Towards Evaluation of Water Resources (APHRODITE) (Yatagai et al., 2009), to products that merge satellite and rain gauge data. By merging climate model outputs and observations through data assimilation schemes, current atmospheric reanalysis products provide homogeneous and continuous precipitation estimates. The NCEP/NCAR Reanalysis (Kalnay et al., 1996) and the European Centre for Medium-Range Weather Forecasts (ECMWF) reanalysis ERA5 (C3S, 2017) are among the most used. However, reanalysis products can show large biases in relation to field observations, since they have been developed mostly for large-scale applications. Multiple precipitation products have also become available due to recent advances in satellite sensors and remote sensing algorithms. These products are generated from sensors that vary widely in their characteristics, which may be geostationary or geocentric and use active or passive signals (Kidd and Levizzani, 2011); consequently, retrieved precipitation can be heterogeneous and have highly variable error (Sun et al., 2018). These uncertainties have decreased in recent years with the development of gridded products that merge direct satellite precipitation estimates and ground observations (Xie et al., 2003).
Among the most commonly used operational satellite-derived precipitation products are the Tropical Rainfall Measuring Mission (TRMM) (Huffman et al., 2007), the Climate Prediction Center morphing technique (CMORPH) (Joyce et al., 2004), and the Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks (PERSIANN) (Ashouri et al., 2015). Additionally, the Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS) has become one of the most used products among those generated by merging satellite and gauge data. The global coverage and near-real-time availability of these products make them useful for multiple applications, such as the monitoring of extreme rainfall events. However, the inherent differences between these products have been reported to lead to significant differences in the accuracy of extreme rainfall retrieval, making their assessment necessary (Huang et al., 2014; Jiang et al., 2019). Although various studies have assessed satellite precipitation products in the context of extreme events, in Bangladesh these studies have focused on other aspects of rainfall variability. For instance, TRMM precipitation was evaluated by Islam et al. (2005) for its performance in retrieving daily-scale precipitation, and by Islam and Uyeda (2008) to assess seasonal patterns in vertical profiles of rainfall intensity. Similarly, TRMM precipitation was assessed by Islam and Cartwright (2020) in terms of accumulated totals. Nashwan et al. (2019) used a set of gridded rainfall products to assess their performance in representing spatial patterns in annual and seasonal precipitation trends, revealing important differences among products.
The main purpose of this study is to evaluate satellite-derived estimates of extreme rainfall indices over Bangladesh during the monsoon season, using a set of rain gauges as reference. We considered four independent satellite gridded products: CHIRPS, TRMM, PERSIANN and CMORPH. Along with these products, we evaluated the performance of the Enhancing National Climate Services for Bangladesh (ENACTS-BMD) dataset, recently generated by Columbia University's International Research Institute for Climate and Society and the Bangladesh Meteorological Department.

Rain gauge data
The performance of the selected satellite precipitation products in capturing heavy rainfall events was assessed using ground-truth data provided by the Bangladesh Meteorological Department (BMD). The data consist of daily precipitation observations from 30 weather stations for the period 1998 through 2019, which corresponds to the overlapping period of the satellite data time series, as presented below. The period June through September (JJAS) was considered as the monsoon season. Although BMD has a greater number of stations spanning a longer period than the selected ones, we removed stations with more than 10% missing data, as well as two stations located on small islands in the South of the country.
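The station screening described above can be sketched as follows. This is a minimal illustration rather than the processing actually used in the study: the function names, the use of `None` to mark a missing daily value, and the `excluded` argument for the island stations are assumptions made for the example.

```python
from typing import Dict, FrozenSet, List, Optional


def missing_fraction(series: List[Optional[float]]) -> float:
    """Fraction of daily values that are missing (None)."""
    if not series:
        return 1.0  # an empty record is treated as entirely missing
    return sum(value is None for value in series) / len(series)


def screen_stations(
    records: Dict[str, List[Optional[float]]],
    max_missing: float = 0.10,                # 10% limit stated in the text
    excluded: FrozenSet[str] = frozenset(),   # hypothetical: e.g. island stations
) -> Dict[str, List[Optional[float]]]:
    """Keep stations that are not excluded and have at most max_missing missing data."""
    return {
        name: series
        for name, series in records.items()
        if name not in excluded and missing_fraction(series) <= max_missing
    }
```

For example, a station with 5 missing values out of 100 would be kept, while one with 20 missing values would be dropped.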

Satellite precipitation products
Five satellite-based precipitation products were selected. These products correspond to daily precipitation data from CHIRPS, PERSIANN-CDR, TRMM, CMORPH and ENACTS-BMD. The latest CHIRPS version (V.2) was used in this study. CHIRPS is a satellite-derived precipitation dataset provided at 0.05° and 0.25° spatial resolutions and at temporal resolutions from daily to annual, from 1981 to the present. It is developed by the Santa Barbara Climate Hazards Group at the University of California in association with the U.S. Geological Survey Earth Resources Observation and Science Center. Data are obtained by combining infrared cold cloud duration data and TRMM precipitation data to generate a pentad rainfall estimate. These data are subsequently blended with ground precipitation measurements using an inverse distance weighting-based algorithm.
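To illustrate the idea behind inverse distance weighting, the sketch below estimates a value at a target location from nearby gauge values. It shows the plain textbook form of the technique, not the CHIRPS blending procedure itself; the function name `idw_estimate`, the exponent of 2 and the use of planar distances in degrees are assumptions made for the example.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) coordinates, e.g. longitude and latitude


def idw_estimate(
    target: Point,
    gauges: List[Tuple[Point, float]],
    power: float = 2.0,  # assumed exponent; larger values favour closer gauges
) -> float:
    """Weighted mean of gauge values, with weights 1 / distance**power."""
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in gauges:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            return value  # target coincides with a gauge: use its value directly
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total
```

A target located midway between two gauges receives the average of their values, and the estimate moves toward a gauge's value as the target approaches it.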
PERSIANN-CDR precipitation is generated by using an artificial neural network model developed by the National Centers for Environmental Prediction (NCEP) to convert infrared brightness temperature measured by geostationary satellites into precipitation rates. The final product is a multi-satellite high-resolution estimate of daily precipitation at 0.25° × 0.25°, calibrated using monthly Global Precipitation Climatology Project (GPCP) precipitation data (Ashouri et al., 2015).
TRMM was a satellite mission, operational between 1997 and 2015, carried out by NASA and Japan's National Space Development Agency. This multi-instrument mission was designed to monitor tropical and sub-tropical precipitation between 50°N and 50°S, generating data at 0.25° × 0.25° resolution every 3 hours from infrared and microwave sensor data. We used the calibrated 3B42-V7 product, which, after 2015 and until 2019, is generated by the Global Precipitation Measurement (GPM) project using information from the Integrated Multi-satellite Retrievals for GPM (IMERG) algorithm.
CMORPH uses satellite passive microwave observations to estimate instantaneous precipitation, which is then propagated spatially using thermal infrared observations from geostationary satellites. The "raw" product is generated at 8 km resolution every 30 minutes between 60°S and 60°N, from 2002 to 2017. We used the more recent 0.25° × 0.25° resolution CMORPH Version 1.0 daily product of Xie et al. (2017), which corresponds to a reprocessed version using observations from the Climate Prediction Center (CPC) and Global Precipitation Climatology Project (GPCP) products.
Finally, the ENACTS product of IRI and BMD was considered for evaluation. The ENACTS-BMD dataset is a high-resolution daily gridded (0.05° × 0.05°) rainfall and temperature dataset constructed by blending data from BMD weather stations, satellite products (for rainfall) and reanalysis datasets (for temperature) (Acharya et al., 2020; Dinku et al., 2017). To construct the gridded rainfall, BMD station data were merged with satellite rainfall estimates from CHIRP (Funk et al., 2014). ENACTS-BMD used more station data (almost all available) than CHIRPS. Since February 2020, BMD has hosted this dataset on its website. Its record begins in January 1981 and is ongoing (updated every month in real time) at daily, decadal and monthly temporal resolutions.

The extreme precipitation indices used in this study are listed in Table 1. We modified the original definitions recommended by the Expert Team on Climate Change Detection and Indices (ETCCDI), since our period corresponds to the monsoon season (JJAS) instead of the whole year. These indices are the total JJAS rainfall from days on which daily precipitation (> 1 mm/day) exceeds the 95th or 99th percentile (R95p and R99p, respectively), and the contribution to total JJAS precipitation from very wet days (> 95th percentile; R95pTOT) and extremely wet days (> 99th percentile; R99pTOT).
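The index definitions above can be sketched in code as follows. This is an illustrative reading of the definitions, not the procedure used in the study, and it rests on several assumptions: the percentile threshold is computed from the wet days (> 1 mm/day) of all JJAS seasons pooled together, percentiles use linear interpolation between order statistics, and the percentage index is taken relative to the wet-day JJAS total. The names `percentile` and `extreme_indices` are hypothetical.

```python
from typing import Dict, List, Tuple

WET_DAY_MM = 1.0  # wet-day threshold in mm/day, as stated in the text


def percentile(values: List[float], q: float) -> float:
    """q-th percentile (0-100), linear interpolation between order statistics."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile of an empty sample is undefined")
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (position - lower)


def extreme_indices(
    seasons: Dict[int, List[float]], q: float = 95.0
) -> Dict[int, Tuple[float, float]]:
    """Map each year to (RqP in mm, RqPTOT in %) from daily JJAS rainfall.

    With q=95 this gives (R95p, R95pTOT); with q=99, (R99p, R99pTOT).
    """
    # Assumption: the threshold comes from wet days of all seasons pooled.
    wet_days = [p for days in seasons.values() for p in days if p > WET_DAY_MM]
    threshold = percentile(wet_days, q)

    result = {}
    for year, days in seasons.items():
        wet_total = sum(p for p in days if p > WET_DAY_MM)
        extreme_total = sum(p for p in days if p > threshold)
        share = 100.0 * extreme_total / wet_total if wet_total > 0 else 0.0
        result[year] = (extreme_total, share)
    return result
```

Calling `extreme_indices(seasons, q=99.0)` on the same input gives the R99p and R99pTOT counterparts; averaging the yearly values over 1998-2019 would give the climatology, and their standard deviation the interannual variability.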

RESULTS AND DISCUSSION
In this section, results are presented first in terms of the overall performance of the satellite products in representing total seasonal precipitation (JJAS). Subsequently, the comparison between satellite estimates and observations of extreme precipitation indices is presented, including a statistical evaluation of associated errors.

Monsoon rainfall climatology
Maps of the climatology and interannual variability of total JJAS precipitation for the satellite products and stations are presented in Fig. 1. The five gridded products agree overall in capturing the pattern of higher precipitation over the Southeast and Northeast regions. These products are not completely independent and, as expected, CHIRPS and ENACTS-BMD look similar and show the best performance, since station data are used in their algorithms; the rest show greater discrepancies in relation to the observed values, similar to what was previously described by Nashwan et al. (2019) for other gridded products. In general, the main relative differences are observed for the maximum mean values across the country, with TRMM and CMORPH showing the highest and lowest maximum, respectively. Fig. 1 also shows the root mean square error (RMSE), a measure of the absolute difference between the datasets being compared, and the bias between estimates and observations, both calculated between each gauge and the closest grid cell. The highest country-mean RMSE is observed for PERSIANN and CMORPH, followed by TRMM, CHIRPS and ENACTS-BMD. While PERSIANN shows a slightly positive bias (52 mm), resulting from a combination of positive and negative biases, CMORPH exhibits a general underestimation of JJAS precipitation (bias = -274 mm). Interannual variability, represented by the standard deviation (SD; Fig. 1f-1j), shows a similar pattern. The largest discrepancies are observed for CMORPH, which overestimates variability, especially over the Southeast. PERSIANN shows a lower and more homogeneous variability across the country. TRMM tends to underestimate SD, showing an area of high variability in the Northeast region, coinciding with the maximum amount in Fig. 1d. Both CHIRPS and ENACTS-BMD show good agreement with the stations, with ENACTS-BMD performing best.
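The comparison between gauges and the closest grid cell can be sketched as below. This is a minimal example under stated assumptions: the grid is regular in latitude and longitude and described by its cell-centre coordinates, the bias is defined as the mean of estimate minus observation (so that a negative value indicates underestimation, consistent with the sign convention used in the text), and the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def nearest_cell(
    lat: float, lon: float, grid_lats: Sequence[float], grid_lons: Sequence[float]
) -> Tuple[int, int]:
    """Row and column indices of the grid-cell centre closest to a gauge."""
    i = min(range(len(grid_lats)), key=lambda k: abs(grid_lats[k] - lat))
    j = min(range(len(grid_lons)), key=lambda k: abs(grid_lons[k] - lon))
    return i, j


def bias(estimates: List[float], observations: List[float]) -> float:
    """Mean of (estimate - observation); negative means underestimation."""
    return sum(e - o for e, o in zip(estimates, observations)) / len(observations)


def rmse(estimates: List[float], observations: List[float]) -> float:
    """Root mean square error between paired estimates and observations."""
    squared = sum((e - o) ** 2 for e, o in zip(estimates, observations))
    return math.sqrt(squared / len(observations))
```

In use, the yearly JJAS totals (or index values) of the cell returned by `nearest_cell` would be paired with the gauge series and passed to `bias` and `rmse`, giving one value of each metric per station.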

Extreme precipitation indices: rain gauges
In this section, the performance of the five selected precipitation products is evaluated in terms of extreme precipitation indices. As a first analysis, Fig. 2 shows the climatology of the selected indices (Table 1) calculated from rain gauge data only. The mean R95p exhibits a spatial pattern similar to that of total JJAS rainfall, ranging from 266 to 839 mm, with the highest values in the East of the country (Fig. 2a). A similar spatial distribution is observed for total seasonal JJAS rainfall on extremely wet days (R99p; Fig. 2b), which ranges from 83 to 250 mm, with a country average of 140 mm. The total contribution from the top 5% of rainy days (R95pTOT, Fig. 2c) shows a regional average of 25%, ranging from 23% to 29%. With a similar pattern, the contribution from extremely wet days (R99pTOT, Fig. 2d) ranges from 7% to 9%. The values of these last two indices show a slightly different spatial pattern. For example, in the rainy Northeast area, which has high values of R95p and R99p, the total contribution of very and extremely wet days is relatively lower. On the other hand, some stations in the center of the country show a relatively higher total contribution from very rainy days. In addition, the averages of 25% (R95pTOT) and 8% (R99pTOT) indicate that a small number of heavy precipitation events can account for a significant percentage of total precipitation during the rainy season. Fig. 2 also shows the interannual variability of the indices. Considering the country-averaged R95p and R99p of 446 mm and 140 mm, respectively, the interannual variations of the total rainfall from the top 5% and 1% of precipitation events are 62% and 121%, respectively. The SD for R95pTOT ranges from 9% to 16% and tends to be higher for stations located in the center of the country. The SD of R99pTOT appears slightly more heterogeneous in its spatial distribution.

Extreme precipitation indices: gridded products
Indices of extreme precipitation were calculated for each of the gridded products. Fig. 3 shows maps of the 1998-2019 climatology of R95p, including the error metrics RMSE and bias. In general, Fig. 3a-3e shows a similar spatial distribution of R95p for the five gridded products, but some differences can be observed. As expected, CHIRPS and ENACTS-BMD look similar, while PERSIANN and CMORPH show relatively higher values in the Northeast and South of the country. The latter can be corroborated by the RMSE values for each satellite product. Overall, the error in the inland areas of the country is similar for each dataset, although it tends to be greater for CMORPH. In addition, the error for all the products is highest over coastal areas. This may be due to the greater complexity of retrieving precipitation in these areas, which has been documented, for example, for microwave and infrared sensors (Kim et al., 2017), and similar results have been obtained over different regions (e.g., Jiang et al., 2019). Additionally, the maps in Fig. 3f and 3g show that ENACTS-BMD reproduces an error similar to that of CHIRPS, but lower over regions such as the Northeast.
The bias statistics (Fig. 3k-3o) show a shift from an average underestimation of 18 mm by CHIRPS to an average overestimation of 133 mm by ENACTS-BMD. CHIRPS overestimates R95p in the Northeast and South areas of the country. In the case of PERSIANN, a gradient is observed that goes from a general underestimation in the South to an overestimation of R95p towards the North. On the other hand, both TRMM and CMORPH show a general overestimation of R95p throughout the country. These results indicate that although the spatial pattern of R95p is adequately captured by these gridded products, the associated error varies considerably. The highest RMSE is presented by CMORPH, especially in coastal and southern areas of the country. Furthermore, CHIRPS, PERSIANN and TRMM present errors that do not differ substantially among them in terms of magnitude and spatial distribution. It is also observed that ENACTS-BMD presents a lower associated error. The satellite products present a varying performance in terms of the bias statistics, which has also been highlighted when probability of detection metrics are used (Islam, 2018).

Fig. 4 shows the climatology of R99p, also including the error metrics RMSE and bias. The spatial distribution of R99p is very similar to that previously described for R95p and rain gauges (Fig. 2). Again, while ENACTS-BMD shows a lower RMSE in relation to CHIRPS and to the other products, the bias shifts from negative to positive, indicating an overestimation of total precipitation during extremely wet days. Furthermore, PERSIANN generally shows the lowest R99p values, and TRMM and CMORPH show the highest values towards the Northeast and South of the country, which range between 132 mm and 402 mm, and 123 mm and 289 mm, respectively. The RMSE maps show a slight reduction in ENACTS-BMD in relation to CHIRPS, similar values for PERSIANN and TRMM, and a higher average error in CMORPH.
In the case of R99p, the bias in CHIRPS is generally negative, which shifts to positive in ENACTS-BMD. PERSIANN has a negative bias that is distributed similarly to CHIRPS. On the other hand, TRMM and CMORPH present a positive bias throughout the country, except over some southern coastal areas.
Results for the index R95pTOT, representing the contribution from the top 5% of rainy days to total JJAS precipitation, are presented in Fig. 5. For all products, R95pTOT generally varies from about 20% to 40%. In this case, and unlike the previous indices, the greatest contribution is observed for inland areas. The coastal zones, the Northeast and the Southeast, which represent hotspots of extreme events according to R95p and R99p, present a greater number of rainy events, which makes the contribution of extreme events to total precipitation relatively lower. Fig. 5b shows that ENACTS-BMD represents an increase in R95pTOT in relation to CHIRPS, the latter being more homogeneous. While PERSIANN exhibits the most homogeneous spatial distribution, TRMM and CMORPH show the highest values and very similar results. Again, although ENACTS-BMD is able to decrease the RMSE relative to CHIRPS, the bias shows the same shift from negative to positive observed previously. While PERSIANN shows a slight underestimation, a general overestimation is observed for TRMM and CMORPH. These results are in agreement with the skill of these products in representing both total JJAS rainfall and R95p. For example, the similar values of R95pTOT for TRMM and CMORPH are the result of similar error patterns in JJAS rainfall and R95p for both products.

CONCLUSIONS
Five satellite-derived precipitation products were evaluated in terms of their performance in representing the climatology of four indices of extreme rainfall over Bangladesh. There are important discrepancies among these products in their ability to reproduce total precipitation during the rainy season. Statistically, this pattern is preserved when extreme rainfall indices are calculated. The five satellite-derived products are able to capture the spatial distribution and variability of the indices. Nevertheless, the performance can be highly variable in terms of values and associated errors with respect to ground-truth data. First, there is a poorer representation of the indices over coastal and high-precipitation areas, which can be attributed to the inherent complexity of retrieving precipitation from satellites (sensors, algorithms). This is partially improved by CHIRPS and ENACTS-BMD, which are generated by merging with station data. PERSIANN, TRMM and CMORPH exhibit the worst performance, with CMORPH showing the highest RMSE and biases. On the other hand, CHIRPS and ENACTS-BMD are products generated using BMD stations (although ENACTS-BMD used more station data than CHIRPS), resulting in a better performance. Moreover, ENACTS-BMD presents the best performance among the products, considering that it uses almost all available BMD stations, with the particular feature of a systematic overestimation of the indices, suggesting that the occurrence of extreme events could be included in its generation in order to improve its overall performance.
The results show that, despite the observed biases, satellite products are able to capture the main features of extreme rainfall events, which is important for ungauged areas. However, products with poor performance should be used carefully, and perhaps together with other auxiliary data, for monitoring heavy rainfall events. Additionally, the results provide evidence of the usefulness of the algorithms used to correct satellite precipitation using ground data, which has been little explored in the case of extreme events.