Comparison of CMIP6 and CMIP5 models in simulating mean and extreme precipitation over East Africa

This study examines the improvement of Coupled Model Intercomparison Project Phase 6 (CMIP6) models over their CMIP5 predecessors in simulating mean and extreme precipitation over the East Africa region. The study compares the climatology of the precipitation indices simulated by the CMIP models with the CHIRPS dataset using robust statistical techniques for 1981-2005. The results display the varying performance of the general circulation models (GCMs) in the simulation of annual and seasonal precipitation climatology over the study domain. The CMIP6 multimodel ensemble mean (CMIP6-MME) shows improved performance in simulating the local annual mean cycle, with a better representation of the two peaks, especially the MAM rainfall, relative to its predecessor. Moreover, extreme indices are better captured in CMIP6 models than in their predecessors. The CMIP6-MME performed better than the CMIP5-MME, with smaller biases in simulating SDII, CDD, and R20mm over East Africa. Remarkably, most CMIP6 models are unable to simulate extremely wet days (R95p). A few CMIP6 models (e.g., NorESM2-MM and CNRM-CM6-1) depict robust performance in reproducing the observed indices across all analyses. Conversely, the OND season shows overestimation of some indices (i.e., R95p and PRCPTOT), but not of SDII, CDD, and R20mm. Consistent with other studies, the ensemble mean for both CMIP5 and CMIP6 outperforms the individual models owing to the cancellation of some systematic errors. Generally, CMIP6 depicts improved performance in the simulation of the MAM season relative to the CMIP5 models. However, the new model generation is still marred with uncertainty and thereby depicts substandard performance over the East Africa domain. This calls for further attribution studies into the sources of the persistent systematic biases, and for the identification of individual models with robust features that can accurately simulate the observed patterns.


Introduction
The twenty-first century has witnessed unprecedented occurrences of extreme events that adversely affect every component of societal infrastructure and natural ecosystems (IPCC, 2018).
Nations in the extratropics continue to bear the brunt of climate change, characterized by incidences such as acute droughts and floods, wildfires, tropical cyclones, and heatwaves (IPCC, 2014, 2018). Regions that were predominantly considered 'safe havens' have been hit by far-reaching impacts of anomalous events, which have weakened their economies and their resilience to cope with such situations (Dahinden et al., 2017; Madakumbura et al., 2019). If such patterns are not forecasted accurately and in good time, and their future evolution is not projected, catastrophic impacts leading to massive loss of human lives and property will be the 'new norm'.
This calls for action by all relevant stakeholders to devise possible measures and solutions to minimize any potential damages.
The scientific community continues to play a critical role in providing timely and accurate information regarding the evolution of the climate extremes that define the present world and their future incidence. The advent of the Coupled Model Intercomparison Project (CMIP; https://www.wcrp-climate.org/wgcm-cmip) has significantly aided in understanding the projected patterns of the climate system. A resultant advancement is the General Circulation Models (GCMs), which are capable of enhancing our understanding of the climate system. The output of these models informs relevant stakeholders on the formulation of effective and sustainable policies.
The steady progress from the first experiments, conducted in a bid to enhance our knowledge of the complex earth system, to the latest sixth phase (CMIP6) has witnessed remarkable improvement. Consequently, numerous studies have been conducted across various spectrums using the GCMs in an attempt to understand, attribute, and/or simulate various aspects of the climate system. Nevertheless, glaring weaknesses have been noted in various outputs of CMIP datasets over different regions. For instance, the first two phases of CMIP entailed simple simulations in which the CO2 concentration was increased at a rate of 1 % per year, so that no interannual variations in the radiative forcing were permitted (Meehl et al., 1997, 2005; Stouffer et al., 2016). Fast forward, notable advances in the modelling of the climate system have been witnessed with the development of the fifth phase of CMIP (CMIP5) compared with its predecessors (Taylor et al., 2012; Smith et al., 2013). To illustrate, CMIP5 entailed more model outputs than CMIP3 and involved the use of Representative Concentration Pathways (RCPs; Riahi et al., 2007; van Vuuren et al., 2007). Other significant features can be accessed from the relevant literature (Lamarque et al., 2013; Eyring et al., 2013; Taylor et al., 2012; Friedlingstein et al., 2014).
Despite the significant progress and new developments in phase 5, significant impediments remained unresolved. For instance, the quantification of radiative forcing and responses continued to depict unsatisfactory performance (Stouffer et al., 2016). Further, the CMIP5 outputs are still marred with persistent systematic model biases, leading to major climate uncertainties (Flato et al., 2013). Other challenges noted involve an inadequate comprehension of the mechanisms influencing internal changes and, hence, unskillful decadal climate predictions (Meehl et al., 2007, 2009; Collins et al., 2013). This resulted in more experiments that attempt to address some of these pertinent issues, which were noticeable in the model outputs used for the generation of the IPCC Fifth Assessment Report (AR5).
As a way forward, CMIP6 has been developed with massive improvements compared to previous outputs (Eyring et al., 2016). To illustrate this, a higher spatial resolution (~ 70 km), in comparison to the coarser resolution (~ 250 km) of CMIP5, characterizes the current model generation. Besides, improved physical processes and biogeochemical cycles, new features such as improved aerosol effects or refined parameterization schemes, and large ensemble members are among the developments that describe the latest model outputs. Subsequently, large volumes of research outputs based on CMIP6 have highlighted notable improvements in modelling various aspects of the climate system, or have presented comparative analyses of the added value in CMIP6 as compared to CMIP5 (Voldoire et al., 2019; Mauritsen et al., 2019; Hajima et al., 2020; Moseid et al., 2020). Studies focusing on simulations or projections of mean and extreme climate based on CMIP6 (e.g., Akinsanola et al., 2020; Almazroui et al., 2020a, b; Grose et al., 2020; Jiang et al., 2020) or comparative studies of CMIP6 against CMIP5 performance have also reported better and more reliable results (Gusain et al., 2020; Jiang et al., 2020; Luo et al., 2020; Nie et al., 2020; Seneviratne and Hauser, 2020; Zamani et al., 2020; Zhu et al., 2020). Chen et al. (2020), while comparing the performance of CMIP6 with that of CMIP5 in the simulation of climate extremes, noted a significant reduction in the model spread among the CMIP6 models compared to CMIP5, particularly over regions in the northern latitudes. Mainly, the study observed more distinct projections of the precipitation indices depicting very heavy precipitation days above 20 mm (R20mm) and maximum consecutive 5-day precipitation (RX5day) than in CMIP5 simulations.
Conversely, a regional study conducted using CMIP6 across the 41 sub-regions globally, as delineated for the upcoming IPCC Sixth Assessment Report (AR6), reveals limited improvements compared to the CMIP5 models (Kim et al., 2020). Interestingly, the study shows persistent systematic biases (i.e., cold biases) in cold extremes over high-latitude regions. Nonetheless, the simulation of precipitation extremes depicts an improved model skill in CMIP6 for the indices denoting intensity and frequency, with evidently reduced biases. The aforementioned studies continue to show varying results in the simulation of mean or extreme climate events in CMIP6 as compared to the predecessor. Remarkably, these studies continue to enhance our understanding of the suitable models to be employed for accurate diagnosis and projection in impact analysis.
As witnessed across other regions, the discrepancies in CMIP5 or regional climate models (RCMs) are equally recognized in studies based on such datasets over the EA domain (Yang et al., 2015; Kent et al., 2015; Kisembe et al., 2018; Ongoma et al., 2018a). Remarkably, no studies so far have employed the latest CMIP6 model outputs to examine their capability in simulating the observed extreme events or in projecting possible future incidences over the region. The improved performance illustrated over various domains gives a promise of accurate and reliable projections of future climate in a region that has been reported to demonstrate a "paradox" pattern based on previous model versions.
Thus, this study seeks to establish whether the CMIP6 models have a stronger capability to reproduce mean seasonal and extreme events over the EA region, and whether the uncertainty is narrowed in CMIP6 as compared to CMIP5. The rest of the paper is organized as follows: details of the data and techniques employed are enumerated in section 2, while results are captured in section 3. Discussions are presented in section 4. Lastly, the conclusion and recommendations are highlighted in section 5.

Model outputs and observation
This study uses thirteen historical simulations of CMIP6 (Eyring et al., 2016) and the predecessor (Taylor et al., 2012) obtained from an open-source platform available at https://esgf-node.llnl.gov/projects/esgf-llnl. The details regarding the dataset's structures are well presented in to other existing datasets (Kimani et al., 2017; Cattani et al., 2018; Gebrechorkos et al., 2018; Dinku et al., 2018; Ayugi et al., 2019). More details regarding the algorithm and production process can be accessed from Funk et al. (2015). Many studies have pointed out the insufficiency of reliable in-situ datasets that can be used for weather and climate studies over the EA region (Camberlin and Okoola, 2003; Su et al., 2008). The advent of alternative sources such as satellite-derived or reanalysis datasets has played a critical role by providing a substitute source for climate estimates.
In this paper, all datasets are restricted to a uniform period of 1981-2005 and remapped to the coarsest model spatial resolution using a robust remapping technique.
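The remapping step can be illustrated with a short Python sketch. The text does not name the remapping technique, so separable bilinear interpolation is used here purely as an assumed stand-in; the function name `bilinear_remap` and its arguments are hypothetical, and the source coordinates are assumed to be in ascending order.

```python
import numpy as np

def bilinear_remap(field, src_lat, src_lon, dst_lat, dst_lon):
    """Remap a 2-D field (lat, lon) onto a target grid.

    Assumption: bilinear interpolation, applied first along longitude
    and then along latitude. Source coordinates must be ascending.
    """
    field = np.asarray(field, dtype=float)
    # Interpolate each latitude row onto the target longitudes.
    tmp = np.array([np.interp(dst_lon, src_lon, row) for row in field])
    # Interpolate each resulting column onto the target latitudes.
    out = np.array([np.interp(dst_lat, src_lat, col) for col in tmp.T]).T
    return out

# Hypothetical grids roughly spanning East Africa.
src_lat = np.linspace(-12.0, 6.0, 10)
src_lon = np.linspace(28.0, 42.0, 8)
field = 2.0 * src_lat[:, None] + 3.0 * src_lon[None, :]
coarse = bilinear_remap(field, src_lat, src_lon,
                        np.array([-10.0, 0.0, 5.0]),
                        np.array([30.0, 35.5]))
```

For precipitation, a conservative remapping scheme may be preferable when moving from a fine to a coarse grid, since it preserves area totals; the sketch above only shows the general idea of placing all datasets on a common grid.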

Climate indices
This study employs extreme climate indices defined by the Expert Team on Climate Change Detection and Monitoring Indices (ETCCDMI). The listed indices mainly consider aspects of the intensity, frequency, and duration of extreme precipitation events over the study area (Klein Tank et al., 2009; Zhang et al., 2011). The indices can be divided into four main classes. Firstly, the duration indices, which mainly define periods of excessive wetness or dryness. Here we used consecutive dry days (CDD), which represents the length of the most prolonged dry spell in a year and characterizes possible drought occurrence. Secondly, a percentile-based index that defines very wet days. The precipitation index used in this class represents the rainfall amount falling above the 95th percentile (R95p). Thirdly, the threshold-based indices, identified as the count of days when the precipitation quantity is above or below a fixed threshold, are equally employed. Here, we used the count of very heavy precipitation days > 20 mm (R20mm). Lastly, the study employed indices that describe the seasonal precipitation total (PRCPTOT) and those that define precipitation intensity, such as the simple daily intensity index (SDII). Details regarding the indices are provided in Table 3.
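As an illustration of how the five indices above can be derived from a daily precipitation series, a minimal Python sketch is given below. It assumes the usual ETCCDI conventions (a wet day is a day with at least 1 mm, and R20mm counts days with at least 20 mm), which are not stated explicitly in the text; the function names and the `p95_wet` argument (the 95th percentile of wet-day precipitation in a base period, supplied by the user) are hypothetical.

```python
import numpy as np

WET_DAY_MM = 1.0  # assumed wet-day threshold (ETCCDI convention)

def longest_dry_spell(pr):
    """CDD: longest run of consecutive days with pr < 1 mm."""
    longest = current = 0
    for value in pr:
        current = current + 1 if value < WET_DAY_MM else 0
        longest = max(longest, current)
    return longest

def precipitation_indices(pr, p95_wet):
    """Return PRCPTOT, SDII, R20mm, CDD and R95p for one daily series (mm/day)."""
    pr = np.asarray(pr, dtype=float)
    wet = pr >= WET_DAY_MM
    prcptot = pr[wet].sum()              # total precipitation on wet days
    n_wet = int(wet.sum())
    return {
        "PRCPTOT": prcptot,
        "SDII": prcptot / n_wet if n_wet else float("nan"),
        "R20mm": int((pr >= 20.0).sum()),   # count of very heavy precipitation days
        "CDD": longest_dry_spell(pr),
        "R95p": pr[wet & (pr > p95_wet)].sum(),  # rain from very wet days
    }

# Illustrative ten-day series with an assumed base-period percentile of 26 mm.
example = precipitation_indices([0, 0, 5, 25, 0, 0, 0, 30, 2, 0.5], p95_wet=26.0)
```

In practice the function would be applied per grid cell and per season (MAM or OND) before the climatology over 1981-2005 is taken.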
where PC is the pattern correlation coefficient between the model outputs and the observation, and PC0 is the highest achievable PC (here set to 1). The two remaining terms represent the SSD of the simulated and observed patterns, respectively. A score close to 1 indicates near-perfect agreement between the model and the observation, whereas a score of 0 indicates the opposite. This technique has been applied successfully in various studies (e.g., Luo et al., 2020; Xin et al., 2020; Zhu et al., 2020).
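The skill score described above can be sketched in Python as follows. Because the equation itself does not appear in the text, the form used here, 4(1 + PC)^2 / [(s + 1/s)^2 (1 + PC0)^2] with s the ratio of the simulated to the observed standard deviation, is an assumption based on one commonly used variant of the Taylor skill score; other variants use a different exponent. The function name is hypothetical and the statistics are unweighted (no area weighting).

```python
import numpy as np

def taylor_skill_score(model, obs, pc0=1.0):
    """Assumed form: 4(1+PC)^2 / ((s + 1/s)^2 (1+PC0)^2), s = std_model / std_obs."""
    model = np.asarray(model, dtype=float).ravel()
    obs = np.asarray(obs, dtype=float).ravel()
    pc = np.corrcoef(model, obs)[0, 1]      # pattern correlation coefficient
    ratio = model.std() / obs.std()         # ratio of standard deviations
    return 4.0 * (1.0 + pc) ** 2 / ((ratio + 1.0 / ratio) ** 2 * (1.0 + pc0) ** 2)

obs_field = np.array([1.0, 3.0, 2.0, 5.0, 4.0])
score_perfect = taylor_skill_score(obs_field, obs_field)        # identical pattern
score_scaled = taylor_skill_score(2.0 * obs_field, obs_field)   # right pattern, doubled amplitude
score_inverse = taylor_skill_score(-obs_field, obs_field)       # anti-correlated pattern
```

Under this assumed form, a model that matches both the pattern and the amplitude of the observation scores 1, a model with the correct pattern but the wrong amplitude is penalized, and an anti-correlated pattern scores 0.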
Lastly, a portrait diagram depicting the RMSE of each model is derived by first computing the multimodel mean and median for each index and then expressing each model's RMSE in relative terms. More information regarding the computation of this approach can be obtained from Sillmann et al. (2013).
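A minimal Python sketch of the relative error underlying such a portrait diagram is given below. It assumes the normalization commonly used for this type of diagram, in which each model's RMSE is expressed relative to the median RMSE of all models, (RMSE - RMSE_median) / RMSE_median; the function names are hypothetical and the RMSE is unweighted.

```python
import numpy as np

def rmse(model, obs):
    """Root-mean-square error between a simulated and an observed field."""
    diff = np.asarray(model, dtype=float) - np.asarray(obs, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def relative_rmse(rmses):
    """Assumed normalization: (RMSE - median RMSE) / median RMSE, per index."""
    rmses = np.asarray(rmses, dtype=float)
    median = np.median(rmses)
    return (rmses - median) / median

# Illustrative RMSEs of three models for one index.
relative = relative_rmse([1.0, 2.0, 4.0])
```

Negative values then mark models that perform better than the typical model for a given index, and positive values mark models that perform worse, which corresponds to the colder and warmer colors of the diagram.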

Climatology of mean and precipitation indices
As a first step, this study examined the characteristics of the monthly precipitation rate over East Africa as simulated by CMIP6/5 against the observed data. The annual cycle of the model simulations and their multimodel ensemble mean relative to CHIRPS is presented in Figures 2 and 3. Over EA, CHIRPS demonstrates a bimodal pattern with peaks during MAM and OND. The months of June-September (JJAS) are relatively dry, with July being the driest month, reflecting a precipitation rate of < 1 mm/day over most of the region. The two peaks shown in the observed data are mostly associated with the tropical rain belt that oscillates between 15°S and 15°N throughout the year (Nicholson, 2018). Notably, CMIP6 shows improved simulation of the MAM season compared to CMIP5, whose models are unable to perfectly capture the rainfall patterns during that season (Figure 2). Moreover, enhancement is further noted in the reproducibility of the OND peak in CMIP6, with notable models such as NorESM2-MM, CNRM-CM6-1, and MPI-ESM1-2-LR able to simulate the peak satisfactorily. Comparative analysis of the models' performance in the simulation of the two peaks for CMIP6 and CMIP5 reveals varying performance. Significantly, the two model generations vastly overestimated the OND peak, with CMIP5 depicting overestimations by all models, including the MME (Figure 3). Interestingly, the CMIP6-MME robustly mimicked the two peaks of the MAM and OND seasons, unlike the CMIP5-MME. This shows better reproducibility of annual rainfall over the study region by the new model generation.
Generally, a relatively dry bias is reflected in the MAM season, with CMIP5/6 showing underestimations. Existing studies have remarked on the tendency of CMIP models to manifest a wet (dry) precipitation bias during the OND (MAM) rainy season over the study domain (Otieno and Anyah, 2013; Ongoma et al., 2018b; Mumo et al., 2020). The pronounced dry biases during MAM (Figures 2 and 3) could be a result of the observed reduction in seasonal precipitation (Funk et al., 2008; Lyon and Dewitt, 2012; Liebman et al., 2014; Ongoma et al., 2017; Ayugi et al., 2018). Thus, alternative solutions must be devised to cope with the dry patterns reflected in CMIP5 and CMIP6 over the region. With the models' well-pronounced ability to reproduce the MAM rains, the future projection of these rains will be of great importance to the region's economy.
On the contrary, OND precipitation shows many wet biases during the recent decades, as is well reproduced in both model ensembles (Figs. 2 and 3). The findings agree with previous findings (Hastenrath et al., 2011; Ongoma et al., 2017). Compared to the long rains, the OND season is strongly correlated with the meridional and vertical circulation cells in the central Indian Ocean, in addition to the intensified upper-level subsidence over East Africa (Mutai et al., 2012; Nicholson, 2015). The ability of models to reflect such changes shows their strong skill in reproducing East Africa's climate. Past studies have reported a paradox scenario with the long rains, which needs to be addressed. Nevertheless, the overestimation of the OND rains can negatively impact farmers, who would expect more rainfall than eventually occurs.
Despite the ensemble mean showing underestimations in both seasons in CMIP6/5, it outdid most individual models and lay closer to the observations (Figures 2 and 3). Typically, the study shows an improved performance in CMIP6 in the simulation of the local annual mean cycle, with a better representation of the two peaks, especially the MAM peak. The model NorESM2-MM showed a remarkable performance in the annual cycle simulation over the study area (Fig. 2).
Comparative studies across other regions globally equally show varying performance, with some studies depicting improved performance by CMIP6 models and their ensembles (Xin et al., 2020; Zhu et al., 2020; Zamani et al., 2020), while unsatisfactory performance of CMIP6 in simulating annual precipitation is reported over the Tibetan Plateau in China (Zhu and Yang, 2020).
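The multimodel ensemble mean and the bias discussed in this section can be illustrated with a short Python sketch. The function names are hypothetical, the numbers are invented for illustration only, and the means are unweighted (no area weighting is applied).

```python
import numpy as np

def ensemble_mean(cycles):
    """Multimodel ensemble mean: average over the model axis (axis 0)."""
    return np.mean(np.asarray(cycles, dtype=float), axis=0)

def relative_bias_percent(model, obs):
    """Relative bias of the simulated mean with respect to the observed mean (%)."""
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return 100.0 * (model.mean() - obs.mean()) / obs.mean()

# Two hypothetical models, one too wet and one too dry, for two months (mm/day).
obs_cycle = np.array([2.5, 5.0])
model_cycles = np.array([[2.0, 4.0],    # dry-biased model
                         [4.0, 8.0]])   # wet-biased model
mme = ensemble_mean(model_cycles)
bias_mme = relative_bias_percent(mme, obs_cycle)
bias_dry = relative_bias_percent(model_cycles[0], obs_cycle)
bias_wet = relative_bias_percent(model_cycles[1], obs_cycle)
```

In this toy case the ensemble mean has a smaller absolute bias than the wet-biased model because the errors of opposite sign partly cancel, which is the mechanism usually invoked to explain why the MME tends to outperform most individual models.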

Spatial patterns of precipitation extremes
A comparative analysis of model performance in the reproducibility of the observed seasonal climatology of precipitation indices for MAM and OND over the study region is shown in Figures 4 and 5. The figures present the spatial distribution of the precipitation biases of the five extreme indices used (Table 2). This suggests that while the CMIP6 models simulated large biases in PRCPTOT and R95p, the models outperformed the CMIP5 models with smaller negative biases in SDII, R20mm, and CDD (as Figures 4g, j, and m illustrate).
In agreement with previous studies (e.g., Osima et al., 2018; Ogega et al., 2020), the GCMs show an underestimation of CDD and R20mm, especially over eastern Kenya and northeastern Tanzania, where the models agree significantly. A related study at the global level equally observed persistent underestimation of R20mm and CDD in CMIP models. The study attributed the simulated biases to increased biases in CMIP6 models as compared to CMIP5 models. The remarkable negative bias for CDD is mainly a result of poor simulations by models such as INM-CM4 (26 days) and MIROC-ES2L (30 days) (not shown here). Chen et al. (2020) attributed the notable underestimation of CDD to the increased spatial resolution in CMIP6 models, which capture more precipitation of the kind that is often simulated at a much finer scale by high-resolution models. Notably, western Uganda shows a substantial overestimation of heavy precipitation days, despite the underestimation in most regions, during MAM (Figures 4k, l, m, and n) and OND (Figures 5k, l, m, and n). Furthermore, the bias in PRCPTOT exceeds 200 mm, that in R95p is > 40 mm, and that in CDD is -10.8 days. This could be due to moist westerlies originating from the Congo basin, which result in enhanced rain during the wet seasons in the north and southwest when other parts of the country are cold and dry (Mchugh, 2004; Kizza et al., 2009). Numerous studies have pointed out the challenge of simulating precipitation as compared with temperature (Almazroui, 2020; Chen et al., 2020; Jiang et al., 2020; Zhu et al., 2020). This is mainly associated with issues related to model parameterization, especially in regions with complex physiographical features such as the East African region, or with challenges related to local mesoscale features such as lakes, vegetation cover, or a long coastline that cause regional heterogeneity (Nikulin et al., 2012).
Generally, the comparison between the CMIP6-MME and CMIP5-MME over the EA region depicts varying performance, with improvement in the simulation of some indices (i.e., SDII, R20mm, and CDD) and no significant improvement in the reproducibility of extremely wet days for seasonal precipitation. Essentially, the new models depict smaller biases in some indices, such as PRCPTOT during MAM (Figure 4a), with an amplitude of < 100 mm compared with > 150 mm in CMIP5-MME. However, a higher amplitude is observed in the simulation of PRCPTOT during OND (Fig. 5), with > 250 mm in CMIP6-MME relative to a bias of 200 mm in CMIP5-MME. The areal-mean bias has reduced in MAM (OND) by 7 % (23 %). Interestingly, the dry biases shown in CMIP5-MME over eastern Kenya and southern Tanzania are enhanced in CMIP6-MME, with model agreement demonstrated by a significant score for annual precipitation (Figures 4e and f). Remarkably, most regions that portray either dry or wet biases appear in both CMIP6-MME and CMIP5-MME, except that the CMIP6-MME shows statistically significant changes at the 95 % confidence level in such regions. Substantial dry biases are observed during MAM and OND for SDII. The results show an improvement in seasonal simulation, particularly for the MAM season as compared to the OND rains over the study region. The systematic overestimation (underestimation) in mean precipitation is equally observed in other similar evaluative studies over the Tibetan Plateau and the East Asian Monsoon region (Zhu and Yang, 2020).
A portrait diagram that summarizes the performance of all individual models in simulating the precipitation extremes indices for MAM and OND is presented in Figures 8 and 9, respectively.
The portrait highlights the regional mean RMSE for each index (PRCPTOT, SDII, CDD, R20mm, and R95p), depicted in rows, and for the 26 CMIP models, depicted in columns. CMIP6 (red) and CMIP5 (blue) models from the same institution are evaluated for each precipitation index with respect to the observation. Colder colors indicate that a model performs better, whilst warmer colors denote models with relatively low skill on average. Moreover, the CMIP5/6-MME is equally evaluated and presented in the last two columns of the portrait diagram.
The analysis shows that a few CMIP6 models (e.g., NorESM2-MM and CNRM-CM6-1) depict robust performance in reproducing the observed indices across all analyses. The majority of the CMIP6 models show substandard performance, with most models depicting warmer colors.
However, comparative analysis shows improved performance during MAM relative to OND in the CMIP6 models.

Discussion
The present study assesses the performance of the new GCMs of CMIP6 against their predecessors in the simulation of mean annual climatology and extreme events during MAM and OND over the EA region. Our results indicate the new ensemble models' satisfactory performance in the simulation of the two peaks of the MAM and OND seasons, despite the overestimations depicted by most models.
Previous studies that assessed the performance of CMIP5 over the region observed underestimation of the annual mean climatology and the MAM season, and overestimation of the short rains occurring during OND (Yang et al., 2015; Ongoma et al., 2018b). CMIP6, on the contrary, presents an improvement in the simulation of MAM rainfall, with a few models reproducing the OND peak (Figure 2). Further assessment of precipitation indices shows enhancement of CMIP6 models in representing MAM rainfall compared with OND (Figures 4-7). The results of the present study could be attributed to the improved spatial resolution of the models (Eyring et al., 2016), which can capture local convective systems that could not otherwise be registered in the previous GCMs with coarser spatial resolution (Taylor et al., 2012). Most significantly, parameterization in the GCMs plays an essential role in the biases observed from one region to another (Flato et al., 2013; Stouffer et al., 2017). CMIP5 featured an inability to represent the local climate, which could be attributed to poor parameterization schemes and coarser spatial resolution (Kisembe et al., 2018; Ongoma et al., 2018b; Ayugi et al., 2020b; Mumo et al., 2020). The overestimation by most CMIP6 models and their respective ensembles (Figures 3 and 5) could be attributed to the systematic biases resulting from intermodel weaknesses in their framework schemes (Wu et al., 2019; Volodin et al., 2019; Tatebe et al., 2019). Similar results have been observed in related studies conducted over the Tibetan Plateau (Zhu and Yang, 2020). Notably, the complex geomorphology that distinguishes the Tibetan region is also present in the study area. For instance, the high mountains (e.g., Mt. Kilimanjaro, Mt. Kenya, Mt. Elgon, and Mt. Ruwenzori), with elevations of > 4000 m above sea level, play a significant role in enhancing mesoscale features, which are in turn reflected in models with high resolution (~ 70 km) (Indeje et al., 2000; Ogwang et al., 2014). The precipitation indices (i.e., R95p), which persistently remained poorly represented (Figures 4, and 5 consensus on the future state of the regional climate for suitable policy formulation and adaptation.
On the contrary, the latest models' inability to accurately simulate some extreme events and the persistent overestimation of the OND peak highlight the remaining uncertainty and call for in-depth studies on why the models cannot simulate the OND peak robustly. Persistent uncertainty leaves stakeholders such as policymakers and users of climate information without definite future climate projections for OND rainfall, since the models' reliability cannot be wholly trusted. This calls for further investigation and attribution studies into the sources of the persistent systematic biases. The weak performance of CMIP6 models over the East African region calls for caution in climate studies, and for assessment studies to identify the individual models with robust features that accurately simulate the observed patterns for future usage.

Summary and Conclusion
The main findings of the present study can be itemized as follows:
1. The CMIP6 models show improved performance in the simulation of mean and extreme precipitation over East Africa relative to CMIP5 models. Particularly, the CMIP6-MME robustly reproduces the two peaks of the bimodal pattern, relative to CMIP5 models. For seasonal climatology, CMIP6 models exhibit robust performance in the simulation of the MAM season and improved reproducibility of the OND season, with notable models such as NorESM2-MM, CNRM-CM6-1, and MPI-ESM1-2-LR able to simulate the peak satisfactorily relative to CMIP5. Despite the ensemble mean showing underestimations in both seasons in CMIP6/5, it outdid most individual models and lay closer to the observations. For instance, the CMIP6-MME robustly mimicked the peak of the MAM season, unlike the CMIP5-MME. This shows better reproducibility of MAM rainfall over the study region by the new model generation.
2. The performance of CMIP6-MME varies as compared to CMIP5-MME in the simulation of extreme indices. The model ensemble for the MAM season indicates underestimation for most indices except CDD over the study area, while the OND season shows overestimation of some indices (i.e., R95p and PRCPTOT), but not of SDII, CDD, and R20mm. Nonetheless, the CMIP6-MME performed better than the CMIP5-MME, with smaller biases in simulating SDII, CDD, and R20mm over East Africa. To illustrate, CMIP5 depicts a larger areal-mean relative bias in total precipitation than CMIP6, with 28 % as compared to 21 % in CMIP6 for MAM precipitation. Other indices show biases of 8.1 % (1.0 %) for R95p, -3.9 % (-4.2 %) for SDII, and -0.8 days (-1.0 days) for R20mm, while OND depicts biases of 29.8 % (21.6 %) for R95p, -2.2 % (-2.3 %) for SDII, and 0.6 days (0.1 days) for R20mm. This suggests that while the CMIP6 models simulated large biases in PRCPTOT and R95p, the models outperformed the CMIP5 models with smaller negative biases in SDII, R20mm, and CDD. Remarkably, most CMIP6 models are unable to simulate the extremely wet days (R95p), while satisfactory simulation of CDD is noted in the new model outputs over the study region.
3. A few CMIP6 models (e.g., NorESM2-MM and CNRM-CM6-1) depict robust performance in reproducing the observed indices across all analyses. The majority of the CMIP6 models are still marred with uncertainty. Consistent with other studies, the ensemble mean for both CMIP5 and CMIP6 outperforms the individual models owing to the cancellation of some systematic errors. Generally, CMIP6 depicts improved performance in the simulation of the MAM season relative to the CMIP5 models.
Moreover, the simulation of extreme indices is better captured in CMIP6 models than in their predecessors. Generally, however, little improvement is noted over the East Africa domain in the new model generation despite the improved parameterization schemes, enhanced spatial resolution, and improved physical processes, including the biogeochemical cycles (Eyring et al., 2016).

Authors contributions:
All authors contributed equally to the development of the manuscript and consented to its submission for publication in Theoretical and Applied Climatology. The