Evaluation of the Global Climate Models in CMIP6 over Uganda

This study employed 15 CMIP6 GCMs and evaluated their ability to simulate rainfall over Uganda during 1981-2014. The models and the ensemble mean were assessed on their ability to reproduce the annual climatology, seasonal rainfall distribution, and trend, and on statistical metrics including mean bias error, root mean square error, and pattern correlation coefficient. The Taylor diagram and Taylor skill score (TSS) were used to rank the models. The models' performance varies greatly from one season to another. The models reproduced the observed bimodal rainfall pattern of March to May (MAM) and September to November (SON) rains over the region. Some models slightly overestimated, while others slightly underestimated, the MAM rainfall. However, most models strongly overestimated rainfall during SON. The models showed a positive spatial correlation with the observed dataset, whereas the interannual correlation was low. Some models could not capture the rainfall patterns around local-scale features, for example, around the Lake Victoria basin and mountainous areas. The best performing models identified in the study include GFDL-ESM4, BCC-CSM2-MR, IPSL-CM6A-LR, CanESM5, GFDL-CM4-gr1, and GFDL-CM4-gr2. The models CNRM-CM6-1 and CNRM-ESM2-1 underestimated rainfall throughout the annual cycle and in the mean climatology. However, these two models better reproduced the spatial trends of rainfall during both MAM and SON. The model spread in CMIP6 over the study area calls for further investigation of its causes and of robust machine learning approaches to minimize the biases.
The spatial correlation of the models with CHIRPS is positive at seasonal and annual scales, but a negative correlation is depicted for interannual variability.

Previous evaluations of CMIP5 over East Africa reported a relatively poor performance of the models in simulating rainfall over the region (Akurut et al., 2014; Onyutha et al., 2016, 2019; Yang et al., 2015; Ongoma et al., 2019; Mumo et al., 2020). According to these studies, the models strongly overestimated the SON rains but underestimated the MAM rainfall. In addition, some of the models misplaced the peak of the MAM seasonal rainfall, which occurs in April, showing peaks in March or May instead. A few studies (Akurut et al., 2014; Onyutha et al., 2016, 2019) have been conducted in Uganda based on GCMs, particularly Coupled Model Intercomparison Project (CMIP) output. These studies were carried out over a small domain of the Lake Victoria basin. Therefore, a national study is needed to evaluate the performance of the GCMs in reproducing rainfall over the study region.
CMIP outputs have been widely used in many climate change studies and in developing the assessment reports of the Intergovernmental Panel on Climate Change (IPCC) (IPCC, 2012, 2013). The latest output from the World Climate Research Programme (WCRP) is phase six (CMIP6) of the CMIP project (Eyring et al., 2016). The models have an additional value in the parameterization schemes for the climate system's major physical and biogeochemical processes compared to the previous version of CMIP5 (Taylor et al., 2012). Recent studies that have utilized CMIP6 have reported that the models exhibited improvements compared to CMIP5 (Akinsanola et al., 2020; Almazroui et al., 2020a,b; Zhu et al., 2020). So far, only one study (Almazroui et al., 2020b) has utilized the CMIP6 models over Africa. The study covered the entire continent. Thus, regionalized studies are needed to evaluate the GCMs before predicting future regional climate.
This study evaluates the ability of the CMIP6 models to simulate mean rainfall over Uganda and selects the best performing models, which will be used to generate a multi-model ensemble.
Uganda lies within East Africa, bounded by longitudes 29°E to 35.2°E and latitudes 1.5°S to 4.5°N. The total territorial area is 241,038 km², of which land covers 197,100 km² and the remaining 43,938 km² is covered by water. This includes the world's second-largest lake, Lake Victoria, located in the southern part of the country and shared with two neighboring countries, Kenya (~6%) and Tanzania (49%), with the remaining section (~45%) in Uganda. The lowest elevation regions lie in the northwestern part around Lake Albert along the Rift Valley, while high elevation areas are in the southwest (Mts. Rwenzori and Mufumbira) and the east (Mts. Elgon and Moroto) of the country.
The climate of the region is mostly influenced by the ITCZ, interactions between the Indian and western Pacific oceans (ENSO), Congo air mass, local features (Basalirwa, 1995), and Indian Ocean Dipole (IOD) (Saji 1999). Some parts of the country in the north and southwest also receive enhanced rainfall from June to August (JJA), which is also attributed to moist westerlies from the Congo basin (Ogwang et al., 2015).

Observed datasets
Many discrepancies exist in ground station data over Africa, both temporally and spatially (Sylla et al., 2012). Thus, as a proxy for observation, we utilized a monthly satellite-gauge-based dataset, the Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS.V2) (Funk et al., 2015). The CHIRPS.V2 data are built using smart interpolation techniques and high-resolution, long-period precipitation estimates based on infrared cold cloud duration observations. The dataset has a spatial resolution of 0.05° × 0.05° and runs from 1981 to date.
CHIRPS.V2 data are preferred over other datasets because of their high resolution and their ability to capture the effects of topography and local features on rainfall over the study domain (Dinku et al., 2018; Ayugi et al., 2019; Ngoma et al., 2020).

Climate Model Datasets
The study utilized historical simulations of 15 GCMs from CMIP6 obtained from the Earth System Grid Data Portal (https://esgf-node.llnl.gov/search/cmip6). Basic information about the model datasets, the development centers, and their respective spatial resolutions is given in Table 1. The study considered the ensemble of the first realization (r1i1p1f1) of the historical runs for all the models to ensure consistency in comparing and evaluating model performance against observation and to minimize the bias in the models. Although the models' historical runs span 1850 to 2014, or 2015 for some models, this study covers the period 1981-2014 to match the time frame of the gridded observation dataset.

Methods
First, the averaged ensemble members of the first run of all the models were converted to the International System (SI) unit for precipitation and set to a standard date format. The models were then re-gridded to a common grid of 1° × 1° horizontal resolution using distance-weighted remapping (Isaaks and Srivastava, 1989). The model ensemble was generated by averaging all the models using a simple arithmetic mean. The models were then evaluated by examining their ability to reproduce the annual rainfall cycle and the mean seasonal climatology of MAM and SON rainfall. The temporal rainfall distribution and the spatial and linear trends of the models were compared with observation data for further assessment. The Theil-Sen slope estimator (Sen, 1968) was used to determine the magnitude of the trends, whilst the Mann-Kendall (MK) test was applied to determine their significance (Mann, 1945; Kendall, 1975). These approaches have been applied in various trend analysis studies (Ayugi et al., 2019; Rizwan et al., 2020).
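The trend analysis described above can be sketched in a few lines. This is a minimal illustration rather than the code used in the study: the function names sen_slope and mann_kendall are hypothetical, the input is assumed to be an evenly spaced series with one seasonal rainfall value per year, and the Mann-Kendall variance omits the correction for tied values.

```python
import math
from itertools import combinations
from statistics import median


def sen_slope(series):
    """Theil-Sen slope: median of all pairwise slopes (units per time step)."""
    slopes = [(series[j] - series[i]) / (j - i)
              for i, j in combinations(range(len(series)), 2)]
    return median(slopes)


def mann_kendall(series):
    """Two-sided Mann-Kendall trend test (assumption: no tied values).

    Returns (S, Z, p_value) for an evenly spaced series.
    """
    n = len(series)
    # S is the number of increasing pairs minus the number of decreasing pairs.
    s = sum((series[j] > series[i]) - (series[j] < series[i])
            for i, j in combinations(range(n), 2))
    # Variance of S under the null hypothesis of no trend, without tie correction.
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return s, z, p_value
```

A trend would typically be reported as significant at the 5% level when the returned p-value falls below 0.05. For operational work, a tested library implementation that handles ties and serial correlation is preferable to this sketch.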
The spatio-temporal performance of each model and the ensemble in simulating rainfall over the region was further assessed using statistical metrics including mean bias error (MBE), centered root mean square error (RMSE), and pattern correlation coefficient (CC). The Taylor diagram was used to rank the models (Taylor, 2001). This approach has been employed in many studies (Kisembe et al., 2019; Ayugi et al., 2019; Ngoma et al., 2020) to rank models' performance. Furthermore, the Taylor skill score (TSS) was used in ranking the models. The TSS, calculated as in Equation (1), is a numerical summary of the Taylor diagram that expresses it as a single synthetic measure. In Equation (1), Rm is the spatial correlation coefficient of the climatological mean between simulation and observation, Ro is the maximum attainable correlation coefficient, set here to 0.999, and σm and σo are the standard deviations of the simulated and observed spatial patterns in climatological means, respectively. The closer the value of TSS is to 1, the better the agreement between the simulation and observation. A similar approach has been successfully employed in previous studies (e.g., Zhu et al., 2020; Luo et al., 2020; Xin et al., 2020).
Several studies have suggested that relying on a single GCM, rather than an ensemble, is not adequate to reproduce observed patterns (Kim et al., 2015; You et al., 2018). In addition, due to the inherent uncertainties of individual GCMs, the multi-model ensemble (MME) average generally provides more reliable and robust estimates than each individual model (Tebaldi and Knutti, 2007).
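Equation (1) itself is not reproduced in this text. As an illustration only, one widely used form of the skill score from Taylor (2001) that matches the symbols defined above is TSS = 4(1 + Rm)^p / [(σm/σo + σo/σm)^2 (1 + Ro)^p]. The sketch below assumes this form. The function name taylor_skill_score and the exponent argument power are hypothetical, and the default power of 2 is an assumption, since Taylor (2001) gives variants with different exponents and the one used in this study cannot be confirmed from the text.

```python
def taylor_skill_score(r_m, sigma_m, sigma_o, r_0=0.999, power=2):
    """Taylor skill score under an assumed functional form.

    r_m      spatial correlation between simulated and observed climatology
    sigma_m  standard deviation of the simulated spatial pattern
    sigma_o  standard deviation of the observed spatial pattern
    r_0      maximum attainable correlation (0.999 in the text)
    power    exponent on the correlation terms (assumption, default 2)
    """
    if sigma_m <= 0 or sigma_o <= 0:
        raise ValueError("standard deviations must be positive")
    ratio = sigma_m / sigma_o
    numerator = 4.0 * (1.0 + r_m) ** power
    denominator = (ratio + 1.0 / ratio) ** 2 * (1.0 + r_0) ** power
    return numerator / denominator
```

With this form the score reaches 1 only when the simulated and observed standard deviations are equal and the correlation equals the maximum attainable value, and it penalizes too much and too little simulated variability symmetrically.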
An ensemble of the best performing models helps reduce uncertainties among the models. Previous literature offers no guideline for selecting the maximum number of GCMs used to generate the ensemble. For example, Ongoma et al. (2019) and Ayugi et al. (2020) identified eight and five best performing models, respectively. Following the recommendation of Ahmed et al. (2019), this study identifies top-ranked GCMs for the development of a multi-model ensemble (MME), which is a necessity in climate change impact assessment.
[...] throughout the year (Nicholson, 2018; Nicholson et al., 2018). Most of the models, as well as the MME, reproduce the June-August (JJA) seasonal rainfall, which is significant in some parts of the country, including the north and southwest areas. This is attributed to the influx of moist westerlies from the Congo basin (Basalirwa, 1995). However, most of the models overestimate the SON rains. CNRM-CM6-1 and CNRM-ESM2-1 exhibit poor performance, as they underestimate rainfall in all months. The ensemble mean captures the MAM seasonal cycle relatively well, whereas it overestimates the SON rains. This overestimation can be attributed to the large wet bias depicted by 13 of the 15 models assessed. These results agree with previous studies (Yang et al., 2015; Ongoma et al., 2018; Mumo et al., 2020) carried out over East Africa based on CMIP5.
However, the CMIP6 ensemble reproduces the MAM rains better than the CMIP5 MME, which showed a dry bias in replicating the seasonal MAM rains (Mumo et al., 2020). The MAM rainfall season is of great significance to the rain-fed, agriculture-based economy of the country, since the monthly rainfall influences the timing of crop planting and harvesting. Given the models' well-pronounced ability to reproduce the MAM rains, future projections of this season will be of great importance to the country's economy. Past studies have reported a paradox concerning the long rains, which needs to be addressed. However, the overestimation of the SON rains can negatively impact farmers, who would expect more rainfall than is eventually received.

Seasonal Analysis
Annual rainfall over the region ranges between 500 and 2500 mm per year, with high spatial variability across the region. High rainfall is received in the country's southern parts, around Lake Victoria, and in the eastern parts. On the other hand, low rainfall amounts have been recorded in the southwest, northeast, and northern parts of the country. Various mechanisms influence the rainfall over the region. The ITCZ, ENSO, IOD, Congo westerlies, and mesoscale effects are the most important (Basalirwa, 1995; Nicholson, 1996; Ogwang et al., 2014, 2015).
The spatial distribution of seasonal rainfall for CHIRPS and as simulated by the models for MAM and SON is shown in Figures 3 and 4, respectively. About 6 of the 15 models underestimate MAM rainfall over most areas, whereas 8 models slightly overestimate the seasonal rain (Figure 3). However, most models capture the higher rainfall amount over the Lake Victoria basin relative to other parts of the study area. This is attributed to mesoscale effects, including land and sea breezes (Nsubuga et al., 2014). Twelve models overestimate the SON rainfall spatial patterns over the area, while only two models, CNRM-CM6-1 and CNRM-ESM2-1, reveal a dry bias (underestimate) in the spatial distribution of SON rainfall (Figure 4). These two models, which come from the same parent institution, fail to capture enhanced rainfall patterns around mountainous areas, for example, in the east around Mount Elgon. Various studies have linked this shortcoming to the models' parameterization schemes and low resolution, which cannot capture topographic effects (Ogwang et al., 2016; Kisembe et al., 2019). UKESM1-0-LL performed relatively better than other models in reproducing the spatial patterns of SON rainfall.

Temporal distribution
The temporal distribution of CHIRPS and CMIP6 model-simulated rainfall for the MAM and SON seasons is shown in Figures 5 and 6, respectively. As shown in Figure 5a, the observed rainfall is distributed around a mean value of 129.89 mm with a standard deviation of 14.42 mm. Ten of the 15 models exhibit mean values higher than observed, together with higher standard deviations. This indicates poor performance, as these models tend to overestimate rainfall and exhibit more variability. Further, 5 of the 15 models show mean values lower than the CHIRPS records, implying an underestimation of the mean rainfall. However, these models show relatively low standard deviations, thus replicating the temporal variability patterns of rainfall over the study domain. The MME mean exhibited the lowest standard deviation of 5.93 mm, better capturing the temporal patterns of rainfall during MAM over the region. During SON, CHIRPS data reveal a lower mean value of 113.62 mm and a standard deviation of 13.5 mm. In addition, 13 of the 15 models exhibit higher mean values than observed. The standard deviation of the models is also higher than for MAM. All the models show a standard deviation of less than 20 mm, signifying more variability in rainfall received during SON.
[...] (Nicholson, 1996, 2017). These mechanisms cause higher interannual variability in SON rainfall than in MAM rainfall. The poor representation of these mechanisms in model parameterization increases the model uncertainties in simulating rainfall patterns over the study region (Kent et al., 2015; Endris et al., 2016; Souverijns et al., 2016).
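The seasonal means and standard deviations quoted in this section could be derived from monthly data along the following lines. This is a hedged sketch, not the study's procedure: the names SEASONS, seasonal_series, and summarize are hypothetical, the input is assumed to be area-averaged monthly rainfall keyed by (year, month), the seasonal value is taken as the mean of the three monthly values (the text does not state whether means or totals were used), and the sample standard deviation is an assumption.

```python
from statistics import mean, stdev

# Month numbers for the two rainy seasons discussed in the text.
SEASONS = {"MAM": (3, 4, 5), "SON": (9, 10, 11)}


def seasonal_series(monthly, season):
    """Build a {year: seasonal mean} series from {(year, month): rainfall_mm}."""
    months = SEASONS[season]
    years = sorted({year for (year, _month) in monthly})
    series = {}
    for year in years:
        values = [monthly[(year, m)] for m in months if (year, m) in monthly]
        # Keep only years in which all three months are available.
        if len(values) == len(months):
            series[year] = mean(values)
    return series


def summarize(series):
    """Return (mean, sample standard deviation) of a seasonal series."""
    values = list(series.values())
    return mean(values), stdev(values)
```

Applying summarize to the observed series and to each model's series would give pairs of values that can be compared in the same way as the figures quoted above.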
The trends were further evaluated and tested for their significance and magnitude. MAM is the main crop-growing season over the study area. A decrease in rainfall during this season would have far-reaching negative impacts on the region's economy, which depends on rain-fed agriculture. The difference between simulated and observed trends matters because the simulated rate of increase in MAM rainfall exceeds the observed rate, so farmers could expect more rain than eventually falls. The observed increase in SON rainfall would benefit farmers by shifting the growing season to SON. However, this brings other uncertainties, as the rain during this season is influenced by several mechanisms such as ENSO and IOD (Nicholson, 1996; Behera et al., 2006; Ogwang et al., 2015; Nicholson et al., 2017). These mechanisms cause SON rainfall to exhibit high interannual variability, so it is not completely reliable for rain-fed agriculture. The CMIP6 models also overestimate rainfall received during SON across the region.

Temporal Bias, RMSE and Correlation Coefficient metrics
A model's performance is considered better if it exhibits low bias, small RMSE, and a higher positive correlation coefficient (CC). The metrics were analyzed and averaged over the study domain for MAM and SON seasons during the study period of 1981-2014, as shown in Figure 9.
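The three metrics can be written compactly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the inputs are assumed to be equal-length sequences of simulated and observed values on the common grid (or a common time axis), and no area weighting is applied, which is a simplification.

```python
import math


def mean_bias_error(sim, obs):
    """Mean of (simulated - observed); positive values indicate a wet bias."""
    return sum(s - o for s, o in zip(sim, obs)) / len(obs)


def rmse(sim, obs):
    """Root mean square error between simulation and observation."""
    return math.sqrt(sum((s - o) ** 2 for s, o in zip(sim, obs)) / len(obs))


def centered_rmse(sim, obs):
    """RMSE after removing each field's own mean (the Taylor-diagram RMSD)."""
    sim_mean = sum(sim) / len(sim)
    obs_mean = sum(obs) / len(obs)
    squared = sum(((s - sim_mean) - (o - obs_mean)) ** 2
                  for s, o in zip(sim, obs))
    return math.sqrt(squared / len(obs))


def pattern_correlation(sim, obs):
    """Pearson correlation between the two fields (needs non-constant inputs)."""
    sim_mean = sum(sim) / len(sim)
    obs_mean = sum(obs) / len(obs)
    cov = sum((s - sim_mean) * (o - obs_mean) for s, o in zip(sim, obs))
    var_sim = sum((s - sim_mean) ** 2 for s in sim)
    var_obs = sum((o - obs_mean) ** 2 for o in obs)
    return cov / math.sqrt(var_sim * var_obs)
```

The three error measures are linked by the identity RMSE² = MBE² + (centered RMSE)², which is why a model with a small centered RMSE on a Taylor diagram can still carry a large overall bias.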
During the MAM season (Figure 9a), 5 of the models depict a dry rainfall bias over the region, in the order CNRM-ESM2-1, CNRM-CM6-1, UKESM1-0-LL, EC-Earth-Veg, and SAMO-UNICON. CNRM-ESM2-1 exhibits the highest dry bias of >80 mm, while the ensemble and CanESM5 simulate the MAM rainfall relatively well, with a slight dry bias of <10 mm. The rest of the models (9/15) show a wet bias over the region during the MAM season. However, the wet bias was not large: the highest, simulated by BCC-ESM1, was <50 mm. In addition, most of the models revealed a wet bias for the SON rainfall, except two models, CNRM-ESM2-1 and CNRM-CM6-1. These two models' overall performance in simulating rainfall over the region is poor, as they tend to underestimate the rainfall throughout the whole period. SAMO-UNICON, CESM2, and CESM2-WACCM perform well in simulating the SON rains, with a relatively lower wet bias of <40 mm. The models' biases are usually attributed to the coarse resolution of the models, which cannot capture topographic effects, and to poor representation of convective schemes (Kisembe et al., 2019).
The RMSE of the CMIP6 models and ensemble employed in the study against CHIRPS data over the study domain is shown in Figure 9b. The models depict a relatively low RMSE when simulating rainfall for the MAM season as compared to SON. Only two models (CNRM- [...] performance. In addition, UKESM1-0-LL reveals the lowest RMSE of <40 mm in simulating the SON rains and performs better than the rest of the models.
[...] parts, higher RMSE is depicted over the western and southern parts. The rainfall over these regions is not evenly distributed, which is attributed to the effects of topography and mesoscale systems (Nsubuga et al., 2017). These mechanisms are not well captured by the parameterization and coarse resolution of the models.
All the models exhibit a positive correlation with observed spatial patterns of rainfall.
In addition, the models correlate more positively with rainfall observed in the western parts of the region than in the eastern parts. The models' ensemble mean shows a strong positive correlation with the CHIRPS observed patterns. This indicates that the models performed well in simulating the observed patterns of rainfall over the study domain. The best performing models are BCC-CSM2-MR, GFDL-CM4-gr1, GFDL-CM4-gr2, GFDL-ESM4, and UKESM1-0-LL, while CESM2, CESM2-WACCM, CNRM-CM6-1, CNRM-ESM2-1, and MRI-ESM2-0 exhibit low correlation with observed patterns.

Model Ranking
A summary of annual bias, RMSE, and the correlation coefficient is presented in Table 3. [...] As compared to [...] over East Africa, there has been an improvement in the correlation of the models with observed patterns. In a related study, Ongoma et al. (2019) evaluated the performance of CMIP5 in simulating rainfall over East Africa against Climatic Research Unit (CRU) datasets. The positive correlations between the models and CRU were low, ranging from 0.01 to 0.24, and even the ensemble mean had a negative correlation.
A Taylor diagram was used to rank the CMIP6 models' skill in simulating spatial seasonal and annual mean rainfall over the region. Figure 13 shows the models' correlation with the observed patterns, the centered RMSD, and the ability of the models to reproduce the variability in rainfall, quantified by the standard deviation. Overall, most models perform better in reproducing SON rainfall than MAM rainfall (Figure 13). These results agree with a previous study by Ongoma et al. (2019), which utilized the CMIP5 models over East Africa. The better performance during SON is attributed to the underlying mechanisms influencing rainfall over the region: rainfall during SON is largely driven by large-scale features such as ENSO and IOD, which are captured by the GCMs. All the models were positively correlated with the observed data for both MAM and SON seasons. Only one model, CNRM-CM6-1, had a negative correlation with observation when simulating annual rainfall.
In selecting the best performing models, we employed the Taylor skill score (TSS) shown in Figure 14. The closer the value of TSS is to 1, the better the agreement between the simulation and observation. The TSS of the models ranges from 0.51 to 0.78 during MAM and from 0.62 to 0.85 during SON. However, the models exhibit poor performance in simulating annual rainfall patterns over the region, with the TSS ranging between 0.14 and 0.74.
During [...] In addition, for annual rainfall, the best performing models include BCC-CSM2-MR, MRI-ESM2-0, IPSL-CM6A-LR, GFDL-ESM4, and GFDL-CM4-gr2. Generally, the performance of the models in reproducing rainfall over the study region varies from one season to another. In addition, the models perform more poorly in reproducing observed annual rainfall patterns than seasonal ones. It is noted that some of the models which exhibit good performance could not reproduce well the seasonal climatology and linear trends of rainfall over the study domain. Thus, the best performing models include GFDL-ESM4, BCC-CSM2-MR, IPSL-CM6A-LR, CanESM5, GFDL-CM4-gr1, and GFDL-CM4-gr2. Studies by [...] and Mumo and Yu (2020) using CMIP5 over East Africa reported that CanESM5 exhibited the best performance in reproducing the MAM rainfall. However, in the present study, more models outperformed CanESM5 during MAM. Thus, the CMIP6 models show further improvement in reproducing rainfall over the study domain.

Summary and conclusion
Rainfall is the most essential weather parameter in the tropics as it affects many socioeconomic activities. Uganda's national economy largely depends on rain-fed agriculture, so any slight fluctuation in rainfall will have far-reaching effects on the community livelihoods.
Understanding rainfall variability and trends is crucial for predicting likely patterns and for structuring effective adaptation and mitigation strategies and climate change policies.
In this study, we evaluated 15 GCMs of CMIP6 and their ensemble mean in reproducing mean rainfall over the country at annual and seasonal scales. Only the first run of the first realization of the models was considered. The models were re-gridded to a common grid of 1° × 1° spatial resolution. The models and the ensemble mean were compared against the CHIRPS dataset, used as a proxy for observation, for the period 1981-2014 by evaluating their ability to reproduce the climatology, linear trends, temporal distribution, and important statistical metrics.
The models tend to reproduce well the bimodal rainfall regime received over the country. The results revealed that some models slightly overestimated, while others slightly underestimated, the MAM rainfall. In addition, most models strongly overestimated the short rains. Previous studies have also noted this over East Africa with CMIP5 (Mumo and Yu, 2020). Many past studies have reported that the SON rains exhibit higher interannual variability than the MAM rains (Nicholson, 2017; Kisembe et al., 2019; Egeru et al., 2019; Ngoma et al., 2020). This is attributed to the fact that SON rainfall is regulated by global teleconnections such as ENSO and IOD. Therefore, more research is necessary to understand the mechanisms governing precipitation over Uganda (e.g., land-atmosphere interaction) and how models simulate them. The two models CNRM-CM6-1 and CNRM-ESM2-1 tend to underestimate rainfall throughout the year. The performance of the models varies from seasonal to annual scale. Most models performed better during SON than during MAM according to the TSS ranking. The models further depicted a reduction in dry bias compared to CMIP5 in simulating MAM rainfall. Nevertheless, some of the models that showed good performance in the ranking could not simulate well the seasonal climatology of the study region. Taking all of this into consideration, the best performing models include GFDL-ESM4, BCC-CSM2-MR, IPSL-CM6A-LR, CanESM5, GFDL-CM4-gr1, and GFDL-CM4-gr2. The spatial correlation of the models with CHIRPS is positive at seasonal and annual scales, but a negative correlation is depicted for interannual variability.
The findings of this study are of great importance to climatologists and end-users of the datasets. The results will help model producers improve parameterization schemes where the models could not reproduce the observed patterns well. There is still a need for improvement in the models to minimize biases resulting from topography and local-scale convective effects. For end-users, more caution is needed when using CMIP6 outputs to project rainfall during SON, as most models tend to overestimate it. However, the model outputs generally reproduce rainfall consistent with observed datasets during MAM, and thus can be adopted for future rainfall projection during MAM over Uganda.