Evaluation of precipitation simulations in CMIP6 models over Uganda

This study employed 15 CMIP6 GCMs and evaluated their ability to simulate rainfall over Uganda during 1981–2014. The models and the ensemble mean were assessed based on their ability to reproduce the annual climatology, seasonal rainfall distribution and trend. Statistical metrics used include mean bias error, normalized root mean square error, and pattern correlation coefficient. The Taylor diagram and Taylor skill score (TSS) were used in ranking the models. The models' performance varies greatly from one season to the other. The models reproduced the observed bimodal rainfall pattern of March to May (MAM) and September to November (SON) occurring over the region. Some models slightly overestimated, while some slightly underestimated, the MAM rainfall. However, there was a high rainfall overestimation during SON by most models. The models showed a positive spatial correlation with the observed dataset, whereas a low correlation was shown inter-annually. Some models could not capture the rainfall patterns around local-scale features, for example, around the Lake Victoria basin and mountainous areas. The best performing models identified in the study include GFDL-ESM4, CanESM5, CESM2-WACCM, MRI-ESM2-0, NorESM2-LM, UKESM1-0-LL, and CNRM-CM6-1. The models CNRM-CM6-1 and CNRM-ESM2-1 underestimated rainfall throughout the annual cycle and mean climatology. However, these two models better reproduced the spatial trends of rainfall during both MAM and SON. Caution should be taken when employing the models in seasonal climate change studies as their performance varies from one season to another. The model spread in CMIP6 over the study area also calls for further investigation of its attribution and the possible implementation of robust machine learning approaches to minimize the biases.

KEYWORDS
CHIRPS, CMIP6, CRU, East Africa, Model Evaluation, Rainfall, Uganda

1 | INTRODUCTION
Precipitation remains the most valuable weather parameter in the tropics, where Uganda lies. Various mechanisms, ranging from mesoscale and synoptic-scale features to global teleconnections, regulate precipitation over the area. Rainfall is the dominant form of precipitation over the region and is mainly influenced by the tropical rain belt that oscillates north-south throughout the year (Nicholson, 1996, 2018). Thus, the region mainly experiences a bimodal rainfall pattern with "long rains" from March to May (MAM) and "short rains" from September to November (SON). The economy of the region largely depends on rainfed agriculture (Nsubuga and Rautenbach, 2017).
Unfortunately, the precipitation over the region exhibits high spatio-temporal variation (Basalirwa, 1995; Nsubuga et al., 2014; Nsubuga and Rautenbach, 2017). Past studies have reported a decline in MAM rainfall, while an upward trend has been observed in the SON season (Yang et al., 2014; Ongoma and Chen, 2017; Egeru et al., 2019; Ngoma et al., 2021). These changes in rainfall have resulted in more frequent and intense extreme events like droughts and floods (Mulinde et al., 2016; Nicholson, 2017; Ojara et al., 2020). According to the National Adaptation Programmes of Action, the wet areas of Uganda (the Lake Victoria basin, the east and the northwest) are becoming wetter (Government of Uganda (GOU), 2015). These situations have resulted in the destruction of property and loss of lives (GOU, 2015).
The observed impacts of changes in rainfall call for an understanding of future patterns to support informed decision making in planning. Projected future rainfall is mainly based on simulations of general circulation models (GCMs). Various studies have been carried out over the African continent with the use of either GCMs or regional climate models (RCMs), at continental, subcontinental or country scale (Indeje et al., 2001; Endris et al., 2013; Ogwang et al., 2014, 2015, 2016; Akinsanola et al., 2015, 2017; Mugume et al., 2017; Kisembe et al., 2018; Ongoma et al., 2018; Osima et al., 2018; Ayugi et al., 2020). Over East Africa (EA), several studies, including Yang et al. (2015), Ongoma et al. (2018), and Mumo and Yu (2020), have successfully utilized GCM datasets in understanding rainfall variability over the region. Past studies that evaluated the fifth Coupled Model Intercomparison Project (CMIP5) datasets over the region reported a relatively poor performance of the models in simulating rainfall over the region (Akurut et al., 2014; Yang et al., 2015; Onyutha et al., 2016, 2019; Ongoma et al., 2019; Mumo and Yu, 2020). According to these studies, the models highly overestimated the SON rains but underestimated the MAM rainfall. In addition, some of the models misplaced the peak of the MAM seasonal rainfall, which occurs in April, showing peaks in March or May instead. A few studies (Akurut et al., 2014; Onyutha et al., 2016, 2019) have been conducted in Uganda based on GCMs, particularly CMIP output. These studies were carried out over the small domain of the Lake Victoria basin. Therefore, a national study is needed to evaluate the performance of the GCMs in reproducing rainfall over the entire country. CMIP outputs have been widely used in many climate change studies and in developing the assessment reports of the Intergovernmental Panel on Climate Change (IPCC) (IPCC, 2012, 2013).
The latest output from the World Climate Research Programme (WCRP) is phase six (CMIP6) of the CMIP project (Eyring et al., 2016). The models have added value in the parameterization schemes for the climate system's major physical and biogeochemical processes compared to the previous phase, CMIP5 (Taylor et al., 2012). Recent studies that have utilized CMIP6 have reported that the models exhibit improvements compared to CMIP5 (Akinsanola et al., 2020; Zhu et al., 2020; Almazroui et al., 2020a, 2020b; Ayugi et al., 2021). So far, few studies (e.g., Almazroui et al., 2020b) have utilized the CMIP6 models over Africa. That study was conducted over the whole of Africa and evaluated the multimodel ensemble but not individual models. Meanwhile, the existing study over the East Africa region mainly focused on evaluating the CMIP6 models in simulating the statistics of extreme precipitation (Akinsanola et al., 2021).
Thus, regionalized studies are necessary to evaluate the GCMs before employing them in predicting future regional climate. Existing studies based on CMIP5 models have pointed to a paradoxical scenario in the future climate of the Greater Horn of Africa (GHA) (Tierney et al., 2015). This has caused confusion among relevant stakeholders about the reliability of the models and the possible impact on policy planning and development. Meanwhile, the compounded occurrence of anomalous events over the study region during recent decades presents a worrying situation that calls for urgent action. For instance, the 2019 short rains over Uganda were considered the wettest observed, affecting thousands of people by destroying key infrastructure (ReliefWeb, 2020). Accurate weather forecasting and climate projection as a way of minimizing the losses therefore remain paramount.
As a first step to address the gap, this study seeks to evaluate the ability of the new generation of CMIP6 experiments to simulate mean rainfall over Uganda and to select the best performing models, which will be used to generate a multi-model ensemble (MME) mean for projecting rainfall patterns over the region. The improved performance of CMIP6 reported over other domains (Luo et al., 2020; Xin et al., 2020; Zamani et al., 2020; Zhu et al., 2020; Ayugi et al., 2021) suggests that the models may project the region's climate more reliably, which will play a key role in developing robust policies for sustainable development. The study is divided into four sections. Section 1 gives the introduction of the study. Section 2 gives a brief description of the study domain, the datasets, and the methods employed in the study. Section 3 presents the results and discussion under the following sub-topics: annual rainfall cycle, seasonal analysis, temporal distribution, linear trend, temporal statistical metrics, spatial statistical metrics, and lastly, model ranking. The summary and conclusion of the study are presented in Section 4.
2 | STUDY AREA, DATA AND METHODS

2.1 | Study area
Uganda lies within East Africa, bounded by longitudes 29°E to 35.2°E and latitudes 1.5°S to 4.5°N (Figure 1). The country has an approximate area of 241,038 km², of which 43,938 km² is covered by water. This includes the world's second-largest lake, Lake Victoria, which is shared with two neighbouring countries, Kenya (~6%) and Tanzania (49%), with the remaining section (~45%) in Uganda. The lowest elevation regions lie in the northwestern part around Lake Albert along the Rift Valley, while high elevation areas are in the southwest (Mts. Rwenzori and Mufumbira) and the east (Mts. Elgon and Moroto) of the country.
FIGURE 1 Location of Uganda in Africa along longitudes 29.2–35.2°E and latitudes 1.5°S–4.5°N (a), and elevation (m) and physical features (b). The digital elevation model (DEM) dataset was obtained from the Shuttle Radar Topography Mission (SRTM) at 90 m spatial resolution (3 arcsec). The lowest elevation is represented by light yellow in the northwest and the highest elevation by brown (Mt Rwenzori in the southwest and Mt Elgon in the east).
The climate of the region is mostly influenced by the equatorial rain belt, interactions between the Indian and western Pacific oceans (ENSO), the Congo air mass, local features (Basalirwa, 1995), and the Indian Ocean Dipole (IOD) (Saji et al., 1999). Some parts of the country in the north and southwest also receive enhanced rainfall from June to August (JJA), which is attributed to moist westerlies from the Congo Basin (Ogwang et al., 2015).

2.2 | Observed datasets
Many discrepancies exist in ground station data over most countries in Africa, in both the temporal and spatial aspects (Sylla et al., 2012). Thus, as a proxy for observed data, two precipitation datasets, one gauge-based and one satellite-derived, are utilized to support confidence in the findings, following the approach of related studies (Akinsanola et al., 2020; Ayugi et al., 2020). The datasets are the Climatic Research Unit dataset (CRU TS4.04; Harris et al., 2020) and the Climate Hazards Group Infrared Precipitation with Station data (CHIRPS.V2; Funk et al., 2015). CHIRPS has been shown to perform very well over the study region relative to other existing satellite datasets because of its high resolution and its ability to capture the effects of topography and local features on rainfall over the study domain (Asadullah et al., 2008; Diem et al., 2014, 2019; Kimani et al., 2017; Cattani et al., 2018; Gebrechorkos et al., 2017; Dinku et al., 2018; Ayugi et al., 2019; Nicholson et al., 2019; Ngoma et al., 2021). Notably, Dinku et al. (2018) employed >1,200 station datasets to evaluate CHIRPS and established a higher skill and very low bias over the EA domain. Equally, a number of studies have validated the reliability of the CRU datasets over the study region in comparison to other gauge-based products or reanalysis datasets (Ogwang et al., 2016; Ongoma and Chen, 2017). The CHIRPS.V2 data are built using a smart interpolation technique together with high-resolution, long-period precipitation estimates and infrared cold cloud duration observations. The dataset has a spatial resolution of 0.05° × 0.05° and runs from 1981 to date. The CRU monthly time series (TS4.04) are available from 1901 to 2018 and are gridded at 0.5° × 0.5° resolution. They can be accessed from crudata.uea.ac.uk/cru/data/ (Harris et al., 2014, 2020).

2.3 | Climate model datasets
The study utilized historical simulations of 15 GCMs from CMIP6 obtained from the Earth System Grid Data Portal (https://esgf-node.llnl.gov/search/cmip6). The basic information about the model datasets, the development centers, and their respective spatial resolution is summarized in Table 1. The study considered the ensemble of the first realization (r1i1p1f1) of the historical runs for all the models to ensure consistency in comparing and evaluating model performance against observation and to minimize the bias in the models. Although the models' historical runs span 1850 to 2014, or 2015 for some models, this study covers the period 1981–2014 to match the time frame of the gridded observation datasets.

2.4 | Digital elevation model data
Elevation data were downloaded from the Shuttle Radar Topography Mission (SRTM) 90-m DEM (digital elevation model) website (www.cgiar-csi.org/data/srtm-90mdigital-elevation-database-v4-1). For this study, a coarse resolution DEM (0.05°) was resampled from the 90 m data and mosaicked over the Uganda region using Geographical Information System (GIS) functionality.

2.5 | Methods
Averaged ensemble members of the first run of all the models were converted to the International System (SI) unit for precipitation and set to a standard date format. The models were then re-gridded to a common grid of 1° × 1° resolution using a distance-weighted average remapping procedure (Isaaks and Srivastava, 1989). This interpolation technique handles diverse geography by triangulating the nearest data points and assigning grid points to sub-regions according to the nearest cell centers of the input grid, which makes it suitable for comparative analysis on uniform grids (Vermeulen et al., 2017). The ensemble of the models was generated by averaging all the models using a simple arithmetic mean. The models were then evaluated by examining their ability to reproduce the annual rainfall cycle and the mean seasonal climatology of MAM and SON rainfall. The temporal rainfall distribution and the spatial and linear trends of the models were compared with observation data for further assessment. The Theil-Sen slope estimator (Sen, 1968) was used to measure the magnitude of the trends, whilst the modified Mann-Kendall (m-MK) test was applied to detect the significance of the trends (Mann, 1945; Kendall, 1975; Sneyers, 1990; Hamed and Rao, 1998). The advantages of the m-MK test are that it can accommodate missing values in a time series and that it relies on relative magnitudes rather than numerical values, which allows "trace" or "below detection" data to be used (Hirsch et al., 1993). These approaches have been applied in various trend analysis studies (Tadeyo et al., 2020; Karim et al., 2020; Ongoma et al., 2021).
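The re-gridding and ensemble-averaging steps can be illustrated with a minimal sketch. It is not the remapping software used in the study: the function names `idw_remap` and `ensemble_mean`, the choice of four nearest neighbours, the inverse-distance power of 1, and the use of plain degree distances instead of great-circle distances are all assumptions made for illustration.

```python
import math


def idw_remap(src_points, target, k=4, power=1.0):
    """Inverse-distance-weighted average of the k nearest source points.

    src_points: list of (lon, lat, value) tuples on a model's native grid.
    target: (lon, lat) of one cell centre on the common grid.
    k and power are illustrative assumptions, not values from the study.
    """
    tlon, tlat = target
    # Distance in degrees is a simplification of great-circle distance.
    nearest = sorted(
        (math.hypot(lon - tlon, lat - tlat), value)
        for lon, lat, value in src_points
    )[:k]
    # A source point that coincides with the target is returned directly.
    if nearest[0][0] == 0.0:
        return nearest[0][1]
    weights = [1.0 / dist ** power for dist, _ in nearest]
    weighted_sum = sum(w * value for w, (_, value) in zip(weights, nearest))
    return weighted_sum / sum(weights)


def ensemble_mean(fields):
    """Simple arithmetic mean across models, cell by cell.

    fields: list of equally long lists, one per model, on the common grid.
    """
    n_models = len(fields)
    return [sum(cell_values) / n_models for cell_values in zip(*fields)]


# Hypothetical native-grid values (mm) around a 1-degree cell centre.
native = [(29.0, 0.0, 100.0), (31.0, 0.0, 200.0),
          (29.0, 2.0, 100.0), (31.0, 2.0, 200.0)]
print(idw_remap(native, (30.0, 1.0)))
print(ensemble_mean([[1.0, 2.0], [3.0, 4.0]]))
```

Because every model is first mapped to the same 1° × 1° cell centres, the ensemble mean reduces to a cell-by-cell average with equal weight for each model.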
The spatio-temporal performance of each model and the ensemble in simulating rainfall over the region was further assessed using statistical metrics including mean bias error (MBE), normalized root mean square error (NRMSE), and pattern correlation coefficient (PCC). NRMSE is more reliable than RMSE for comparing model performance when the model outputs are in different units, or in the same unit but with different orders of magnitude, and it is less influenced by spatial errors (Willmott, 1982). The Taylor diagram was used in ranking the models (Taylor, 2001). This approach has been employed in many studies in the region (Kisembe et al., 2018; Ayugi et al., 2019; Ngoma et al., 2021) to rank model performance.
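A minimal sketch of the three metrics follows, assuming that simulated and observed values are supplied as equally long lists on the same grid or time axis. The function names are hypothetical, and the NRMSE is normalized by the range of the observations, as described later in the text; other normalizations exist.

```python
import math


def mean_bias_error(sim, obs):
    """Mean of (simulated - observed); positive values indicate a wet bias."""
    return sum(s - o for s, o in zip(sim, obs)) / len(obs)


def nrmse(sim, obs):
    """RMSE divided by the range of the observations (non-dimensional)."""
    rmse = math.sqrt(sum((s - o) ** 2 for s, o in zip(sim, obs)) / len(obs))
    return rmse / (max(obs) - min(obs))


def pattern_correlation(sim, obs):
    """Pearson correlation between simulated and observed values."""
    n = len(obs)
    mean_s, mean_o = sum(sim) / n, sum(obs) / n
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(sim, obs))
    var_s = sum((s - mean_s) ** 2 for s in sim)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return cov / math.sqrt(var_s * var_o)


# Hypothetical seasonal rainfall values (mm) at four grid cells.
observed = [100.0, 120.0, 140.0, 160.0]
simulated = [110.0, 130.0, 150.0, 170.0]
print(mean_bias_error(simulated, observed))
print(nrmse(simulated, observed))
print(pattern_correlation(simulated, observed))
```

In this toy case the model is uniformly 10 mm too wet, so the bias and NRMSE are non-zero while the correlation is perfect, which shows why the three metrics are read together.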
Furthermore, the Taylor skill score (TSS) was used in ranking the models. The TSS, calculated as shown in Equation (1), is a numerical summary of the Taylor diagram that provides a synthetic measure.
where R_m is the pattern correlation coefficient of the climatological mean between simulation and observation, R_0 is the maximum attainable correlation coefficient, set here to 0.999, and σ_m and σ_0 are the standard deviations of the simulated and observed spatial patterns in climatological means, respectively. The closer the value of TSS is to 1, the better the agreement between the simulation and observation. A similar approach has been successfully employed in previous studies (e.g., Luo et al., 2020; Xin et al., 2020; Zhu et al., 2020). Several studies have suggested that a single GCM is not adequate to reproduce observed patterns (Kim et al., 2015; You et al., 2018). In addition, due to the inherent uncertainties of individual GCMs, the MME average generally provides more reliable and robust estimates than each individual model (Tebaldi and Knutti, 2007). An ensemble of best performing models helps in reducing uncertainties among the models. Previous literature reveals no guideline for selecting the maximum number of GCMs in generating the ensemble. For example, Ongoma et al. (2019) and Ayugi et al. (2020) identified eight and five best performing models, respectively. Following the recommendation of Ahmed et al. (2019), this study identifies top-ranked GCMs for the development of the MME, which is a necessity in climate change impact assessment.
TABLE 1 CMIP6 models employed in the study, the modeling centers, horizontal resolution and references
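The skill score can be sketched in code under one stated assumption about its form. A commonly used version that matches the symbols defined above is TSS = 4(1 + R_m)^2 / [(σ_m/σ_0 + σ_0/σ_m)^2 (1 + R_0)^2]; the exponent of 2 on the correlation terms is an assumption here, since Taylor (2001) also gives variants with other exponents. The function names and the example statistics are hypothetical.

```python
def taylor_skill_score(r_m, sigma_m, sigma_0, r_0=0.999):
    """Taylor skill score under the assumed squared-correlation form.

    r_m: pattern correlation between simulation and observation.
    sigma_m, sigma_0: simulated and observed spatial standard deviations.
    r_0: maximum attainable correlation (0.999 in the text).
    """
    ratio = sigma_m / sigma_0
    numerator = 4.0 * (1.0 + r_m) ** 2
    denominator = (ratio + 1.0 / ratio) ** 2 * (1.0 + r_0) ** 2
    return numerator / denominator


def rank_models(stats):
    """Return model names sorted from highest to lowest TSS.

    stats: dict mapping model name -> (r_m, sigma_m, sigma_0).
    """
    scores = {name: taylor_skill_score(*values) for name, values in stats.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical statistics for two placeholder models.
example = {
    "model_A": (0.9, 14.0, 14.0),
    "model_B": (0.5, 14.0, 14.0),
}
print(rank_models(example))
```

With this form the score equals 1 only when the correlation reaches R_0 and the simulated and observed standard deviations are equal, and it falls to 0 as the correlation approaches −1.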

3.1 | Annual cycle
A good model is one that is able to reproduce the seasonality of a weather parameter, as stated by Sperber and Palmer (1996). CHIRPS, CRU and the models' simulations reproduce the bimodal rainfall pattern over Uganda, that is, the long rains (MAM) and the short rains (SON), as shown in Figure 2. Both reference datasets (CHIRPS and CRU) depict almost similar patterns, with slight differences in January, April, and August–October rainfall. The bimodal pattern is associated with the influence of the tropical rain belt, which moves north and south throughout the year (Nicholson, 2018; Nicholson et al., 2018). Most of the models, as well as the MME, reproduce the JJA seasonal rainfall, which is significant in some parts of the country, including the north and southwest areas. This is attributed to the influx of moist westerlies from the Congo basin (Basalirwa, 1995). However, there is an overestimation of the SON rains by most of the models. CNRM-CM6-1 and CNRM-ESM2-1 perform relatively poorly as they underestimate rainfall in all months. The ensemble mean captures the MAM seasonal cycle relatively well, whereas it overestimates the SON rains. This can be attributed to the large wet bias depicted by 13 of the 15 models assessed. These results corroborate previous studies (Yang et al., 2015; Ongoma et al., 2019; Mumo and Yu, 2020) over East Africa based on CMIP5.
There is an improvement in the performance of the CMIP6 ensemble in reproducing the MAM rains as compared to the MME of CMIP5, which showed a dry bias in replicating the seasonal MAM rains (Mumo and Yu, 2020). This is of great significance to the rain-fed agro-based economy of the country since the monthly rainfall influences the timing of crop planting and harvesting. Given the well-pronounced ability of the models to reproduce the MAM rains, future projections of these rains will be of great importance to the country's economy. Past studies have reported a paradoxical scenario for the long rains which needs to be addressed; for example, Tierney et al. (2015) projected a downward trend, whereas other studies (e.g., Ngoma et al., 2021) project an increase in rainfall. The overestimation of the SON rains can negatively impact farming since farmers may plan for rainfall that does not materialize, leading to losses.

3.2 | Seasonal analysis
The spatial distribution of seasonal rainfall for CHIRPS, CRU, and as simulated by the models for MAM and SON is shown in Figures 3 and 4, respectively. CHIRPS and CRU show the expected spatial distribution of rainfall. However, CHIRPS better captures the enhanced rainfall patterns around geophysical features such as Lake Victoria and the mountainous regions, as well as the low rainfall in the southwestern parts of the country. This could be attributed to its higher resolution compared to CRU, which enables it to capture topographic effects on rainfall over the area. Six of the 15 models underestimate MAM rainfall over most areas, whereas 8 models slightly overestimate the seasonal rain (Figure 3). However, most models tend to capture the higher rainfall amount over the Lake Victoria basin relative to other parts of the study area. According to Nsubuga et al. (2014), this is attributed to mesoscale effects, including land and sea breezes. Annual rainfall over the region ranges between 500 and 2,800 mm per year (Nsubuga and Rautenbach, 2017), with high spatial variability across the region. High rainfall is received in the country's southern parts, around Lake Victoria, and in the eastern parts. On the other hand, low rainfall amounts are recorded in the southwest, northeast, and northern parts of the country. Various mechanisms influence the rainfall over the region. The tropical rain belt, El Niño Southern Oscillation (ENSO), IOD, Congo westerlies, and mesoscale effects are the most important (Basalirwa, 1995; Nicholson, 1996; Ogwang et al., 2014, 2015). Eleven models overestimate the SON rainfall spatial patterns over the area, while only two models, CNRM-CM6-1 and CNRM-ESM2-1, reveal a dry bias (underestimate) in the spatial distribution of SON rainfall (Figure 4). Interestingly, these two models from the same parent institution fail to capture enhanced rainfall patterns around mountainous areas, for example, in the east around Mount Elgon.
Various studies have linked this observation to the parameterization schemes in the models and their low resolution, which cannot capture topographic effects (Ogwang et al., 2016; Kisembe et al., 2018). Generally, NorESM2-LM and UKESM1-0-LL performed relatively better than the other models in reproducing the spatial patterns of SON rainfall.

3.3 | Temporal distribution
The temporal distribution of CHIRPS, CRU and CMIP6 model-simulated rainfall for the MAM and SON seasons is shown in Figures 5 and 6, respectively. As shown in Figure 5a,b for CHIRPS (CRU), MAM rainfall is distributed around a mean value of 129.89 (124.38) mm with a standard deviation of 14.42 (13.7) mm. Nine of the 15 models exhibit mean values higher than observed, together with a higher standard deviation. This indicates poor performance, as these models tend to overestimate rainfall and exhibit more variability. Furthermore, 6 of the 15 models show mean values lower than observed, implying an underestimation of the mean rainfall. However, these models show a relatively low standard deviation, thus replicating the temporal variability of rainfall over the study domain. The MME mean exhibited the lowest standard deviation of 5.84 mm. CNRM-CM6-1, CNRM-ESM2-1, IPSL-CM6A-LR, and UKESM1-0-LL capture the temporal variation of MAM rainfall over the region relatively well, with a standard deviation of less than 20 mm. During SON, CHIRPS (CRU) data reveal a lower mean value of 113.62 (118.58) mm and a standard deviation of 13.5 (13.83) mm. In addition, 12 of the 15 models exhibit higher mean values than observed. NorESM2-LM and UKESM1-0-LL performed better in reproducing the temporal distribution of SON rainfall, though the former exhibited a high standard deviation of 44.1 mm. The standard deviation of the models is also higher than that for MAM. All the models show a standard deviation greater than 20 mm, signifying more variability in rainfall received during SON.
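The seasonal mean and standard deviation quoted above can be computed along the following lines. This is a sketch under assumptions: the input is taken to be area-averaged monthly rainfall keyed by (year, month), each seasonal value is taken as the mean of its three monthly values, and the sample standard deviation (n − 1 denominator) is used. The text does not state these choices, and the function names are hypothetical.

```python
from statistics import mean, stdev

SEASONS = {"MAM": (3, 4, 5), "SON": (9, 10, 11)}


def seasonal_series(monthly, season):
    """Return {year: mean rainfall of the season's three months}.

    monthly: dict mapping (year, month) -> area-averaged rainfall in mm.
    Every year present must contain all three months of the season.
    """
    months = SEASONS[season]
    years = sorted({year for year, _ in monthly})
    return {year: mean(monthly[(year, m)] for m in months) for year in years}


def seasonal_stats(monthly, season):
    """Mean and sample standard deviation of the yearly seasonal values."""
    values = list(seasonal_series(monthly, season).values())
    return mean(values), stdev(values)


# Hypothetical two-year record (mm per month) for illustration only.
record = {
    (1981, 3): 100.0, (1981, 4): 150.0, (1981, 5): 110.0,
    (1982, 3): 120.0, (1982, 4): 160.0, (1982, 5): 140.0,
}
print(seasonal_stats(record, "MAM"))
```

Applying the same function to each model and to the observations gives the pairs of values that are compared in this section.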

3.4 | Trend analysis
The spatial patterns of the linear trend of mean rainfall for the MAM and SON seasons are shown in Figures 7 and 8, respectively. According to CHIRPS, a negative trend at a rate of less than −0.4 mm·year⁻¹ is observed over most parts of the region, and a positive trend of less than 0.8 mm·year⁻¹ is observed in parts of the northeast, southwest, and central region in MAM (Figure 7). CRU depicts an almost similar linear trend, ranging from −0.4 to 0.4 mm·year⁻¹, with smaller regions around Mt Elgon showing a negative trend extending to −0.8 mm·year⁻¹. The ability of the models to reproduce the linear spatial trends varies from one model to another. Most of the models simulated trends close to the observed range, except for BCC-CSM2-MR and UKESM1-0-LL, which depict a higher positive MAM rainfall trend. Furthermore, as shown in Figure 8, a positive spatial linear trend of 0 to 1.2 mm·year⁻¹ is observed with CHIRPS data over most parts of the region during the SON season. CRU, however, reveals a positive linear trend over the whole study domain, signifying an increase in SON rainfall over the whole study region. In total, 3 of the 15 models depict a negative trend for SON rainfall. These are GFDL-CM4, MRI-ESM2-0, and UKESM1-0-LL. However, the models CanESM5, BCC-ESM1, and CESM2-WACCM showed the highest positive linear trend for SON rainfall. Overall, the models reproduce the spatial trends of rainfall better during MAM than during SON. This is attributed to the various mechanisms that influence SON rainfall, such as ENSO, IOD, and the quasi-biennial oscillation (QBO) (Nicholson, 1996, 2017).
The trends were further evaluated and tested for their significance and magnitude. Table 2 shows the mean, slope, Z-score, and significance of the linear trend of MAM and SON rainfall for CHIRPS, CRU, the ensemble and the 15 CMIP6 models. The rainfall over the region exhibits insignificant trends, with a decreasing trend during MAM and an increasing trend during SON.
The mean (Z-score) values of 129.89 mm (−0.24) and 113.62 mm (0.65) for MAM and SON, respectively, are observed in the CHIRPS dataset. These results agree with past studies over the study area (Kizza et al., 2009; Nsubuga et al., 2014; Ngoma et al., 2021). CRU, however, shows a significantly increasing trend for SON with a Z-score of 1.96, just at the threshold. Only two models, BCC-CSM2-MR and UKESM1-0-LL, show a significant trend for the MAM season, with Z-scores of 2.19 and 3.50. On the other hand, BCC-ESM1 portrayed pronounced wetting patterns during SON with a Z-score of 3.45. Most models (9/15) captured the observed linear trend during SON, while 6/15 models reproduced the MAM tendencies. This shows the challenge models face in accurately capturing the factors influencing the MAM season as compared to the SON rains. This could be explained by the nature of the country's climate, being situated between convective regions. The observed decline in MAM rainfall will have far-reaching negative impacts on the region's economy, which depends on rain-fed agriculture. The effect would be significant because the models depict an increasing trend contrary to what is observed; farmers could thus expect more rain than is actually received. The exact cause of the declining patterns during MAM remains unclear due to the weak correlation with large SST anomalies (Liebmann et al., 2014). Existing studies have narrowed it down to the influence of the west to central Pacific and the western Indian Ocean as the significant contributors to the observed decline (Williams and Funk, 2011; Liebmann et al., 2014; Ayugi et al., 2018). Meanwhile, the observed increase in SON rainfall would benefit farmers by shifting the growing season to SON. However, this brings other uncertainties, as the rain during this season is influenced by several mechanisms such as ENSO and IOD (Nicholson, 1996; Behera et al., 2006; Ogwang et al., 2015; Nicholson, 2017).
These mechanisms cause the SON rainfall to exhibit high interannual variability, so it is not completely reliable for rain-fed agriculture. The CMIP6 models also overestimate rainfall received during SON across the region.
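The slope and Z-score reported for each dataset can be illustrated with a short sketch of the Theil-Sen estimator and the Mann-Kendall statistic. Note the simplification: this is the classical Mann-Kendall test with a tie correction, not the modified version of Hamed and Rao (1998), which additionally adjusts the variance for serial correlation. The function names and the example series are hypothetical.

```python
import math
from itertools import combinations
from statistics import median


def sen_slope(values):
    """Theil-Sen slope: median of all pairwise slopes (units per time step)."""
    pairs = combinations(range(len(values)), 2)
    return median((values[j] - values[i]) / (j - i) for i, j in pairs)


def mann_kendall_z(values):
    """Classical Mann-Kendall Z statistic with tie correction."""
    n = len(values)
    # S counts increasing pairs minus decreasing pairs.
    s = sum(
        (values[j] > values[i]) - (values[j] < values[i])
        for i, j in combinations(range(n), 2)
    )
    # Group sizes of tied values reduce the variance of S.
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in counts.values() if t > 1)
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0
    if s > 0:
        return (s - 1) / math.sqrt(var_s)
    if s < 0:
        return (s + 1) / math.sqrt(var_s)
    return 0.0


# Hypothetical seasonal rainfall series (mm), one value per year.
series = [float(v) for v in range(1, 11)]
print(sen_slope(series))
print(mann_kendall_z(series))
```

A trend is usually taken as significant at the 5% level when |Z| exceeds 1.96, which is the threshold referred to for the CRU SON trend above.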

3.5 | Temporal bias, NRMSE and correlation coefficient metrics
A model's performance is considered to be good if it exhibits low bias, small NRMSE, and a higher positive pattern correlation coefficient (PCC). The models were evaluated relative to CHIRPS data as it exhibited greater performance and agreement with the models compared to CRU. The metrics were analyzed and averaged over the study domain for MAM and SON seasons during the study period of 1981-2014, as shown in Figure 9.
During the MAM season (Figure 9a), 6 of the models depict a dry rainfall bias over the region, in the order CNRM-ESM2-1, CNRM-CM6-1, UKESM1-0-LL, EC-Earth3-Veg, NorESM2-LM, and SAMO-UNICON. CNRM-ESM2-1 exhibits the highest dry bias of >80 mm, while the ensemble and CanESM5 simulate the MAM rainfall relatively well with a low bias of <10 mm. The rest of the models (8/15) show a wet bias over the region during the MAM season. However, the bias was not high even for BCC-ESM1, which simulates the highest wet bias of <50 mm. In addition, most of the models revealed a wet bias for the SON rainfall, except two models, CNRM-ESM2-1 and CNRM-CM6-1. These two models tend to underestimate rainfall throughout the whole period. NorESM2-LM, UKESM1-0-LL, CESM2, and CESM2-WACCM perform well in simulating the SON rains, with a relatively lower wet bias of <40 mm. The models' biases are usually attributed to the coarse resolution of the models, which cannot capture topographic effects, and to the poor representation of convective schemes (Kisembe et al., 2018; Ongoma et al., 2019).
The NRMSE of the ensemble and the CMIP6 models employed in the study against CHIRPS data over the study domain is shown in Figure 9b. The models depict a relatively low NRMSE when simulating rainfall for the MAM season as compared to SON. Only two models (CNRM-ESM2-1 and CNRM-CM6-1) show a high NRMSE of >0.5 for the MAM season against CHIRPS data. However, most models reveal an NRMSE greater than 0.5 in simulating the SON rainfall. The models with the highest NRMSE (>0.8) for SON rains were, in order, BCC-CSM2-MR, BCC-ESM1, EC-Earth3-Veg, IPSL-CM6A-LR, and SAMO-UNICON. The ensemble and the CanESM5 model show a low NRMSE of 0.12 and 0.21, respectively, for MAM, indicating their better performance. In addition, UKESM1-0-LL reveals the lowest NRMSE of 0.28 in simulating the SON rains and performs better than the rest of the models. Figure 9c shows the PCC of the ensemble and the models relative to the CHIRPS dataset for MAM and SON rainfall. Correlation identifies the ability of models to reproduce observed variable patterns. About 8 of the 15 models and the ensemble reveal a positive correlation with observed patterns during the MAM season. CNRM-CM6-1 shows the highest correlation of 0.44 when simulating the MAM rains. On the other hand, the remaining models showed negative correlations, with SAMO-UNICON showing the lowest correlation of −0.31. For SON, the ensemble and 8 of the models showed positive correlations. BCC-ESM1 reveals the highest correlation of 0.39 against CHIRPS data; thus, it replicates the observed patterns relatively well. Furthermore, six models show negative correlations, with the CESM2 model depicting the lowest value of −0.41.
Very few studies based on CMIP6 have been conducted over Africa or its sub-regions that could be used for comparison with the present study, and the existing ones either employed ensembles or mainly focused on annual or other analyses (e.g., Piemontese et al., 2019; Almazroui et al., 2020b; Ayugi et al., 2021). Equally, in previous studies based on CMIP5 or RCMs over Africa or its subregions, for example East Africa, the models listed as exhibiting robust performance (e.g., CNRM-CM5) were also among the top eight performing models, with good correlation relative to other models in simulating the MAM season (Kisembe et al., 2018; Ongoma et al., 2019; Ayugi et al., 2020). The robust performance of the CNRM-CM6-1 model could be associated with improvements in the mass and energy conservation in the simulated climate system, which limit long-term drift. Also, deep ocean biases are generally reduced, whereas sea ice in the Arctic is improved. The model's sensitivity to rising CO₂ has increased. Moreover, the equilibrium climate sensitivity (4.9 K) is now close to the upper bound of the range estimated from CMIP5 models. Meanwhile, a comparative analysis of CMIP6 and CMIP5 models in simulating mean and extreme precipitation over East Africa has also listed a few CMIP6 models, including CNRM-CM6-1 and NorESM2-MM, among the models that depict robust performance in reproducing the seasonal patterns across all analyses relative to their predecessors. Further research is recommended to establish a deeper understanding of the improved performance of CMIP6 models relative to CMIP5.

| Spatial annual bias, NRMSE and correlation coefficient
The statistical metrics of bias, NRMSE, and correlation coefficient of the GCM-simulated rainfall against CHIRPS over the study area for the period 1981-2014, at the annual mean monthly scale, are shown in Figures 10-12. The bias results reveal varying wet and dry biases among the models. BCC-CSM2-MR and BCC-ESM1 show the highest wet bias, in the range of 10-140 mm, hence depicting overestimation of rainfall over the region as shown in Figure 10. Furthermore, CNRM-CM6-1 and CNRM-ESM2-1 reveal the highest dry bias, in the range −10 to −140 mm, signifying underestimation of observed rainfall over the study area. UKESM1-0-LL and NorESM2-LM performed relatively well in simulating rainfall patterns over the western part of the region, with a minimum bias between −20 and 20 mm. However, these two models show a higher dry bias when simulating rainfall over the eastern part. The ensemble mean, CanESM5, GFDL-ESM4, MRI-ESM2-0, EC-Earth3-Veg, and IPSL-CM6A-LR exhibit the lowest bias, in the range −20 to 40 mm, except for small regions where the bias was greater than 40 mm.
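As an illustration of the bias maps described above, the following minimal Python sketch computes a grid-point mean bias as the time mean of model minus observation. The function name `mean_bias`, the array layout (time, lat, lon), and the assumption that both datasets have already been regridded to a common grid and share units (mm) are ours, not details taken from the study.

```python
import numpy as np


def mean_bias(model, obs):
    """Grid-point mean bias: time mean of (model - obs).

    Assumes `model` and `obs` are arrays of shape (time, lat, lon) on the
    same grid and in the same units (e.g., mm). Positive values indicate
    a wet bias, negative values a dry bias. NaNs (e.g., masked cells) are
    ignored in the time mean.
    """
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return np.nanmean(model - obs, axis=0)


if __name__ == "__main__":
    # Synthetic data only: 12 "months" on a 2 x 3 grid.
    rng = np.random.default_rng(0)
    obs = rng.uniform(50.0, 200.0, size=(12, 2, 3))
    model = obs + 15.0  # a model that is uniformly 15 mm too wet
    print(mean_bias(model, obs))
```

A map of this quantity per model would correspond conceptually to the wet and dry bias patterns discussed for Figure 10.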
NRMSE is the non-dimensional form of RMSE, derived by normalizing RMSE by the range of the observations. The models generally exhibit varying differences in the current NRMSE analysis as shown in Figure 11. The ensemble and UKESM1-0-LL show the lowest NRMSE and thus perform better in simulating annual rainfall patterns over most parts of the country. Although BCC-CSM2-MR, BCC-ESM1, CESM2, and CESM2-WACCM show low NRMSE over the eastern and northern parts, higher NRMSE is depicted over the western and southern parts. The rainfall over these regions is not evenly distributed, and this is attributed to the effects of topography and mesoscale systems (Nsubuga and Rautenbach, 2017). Thus, there is a high likelihood that these mechanisms are not well captured by the parameterization and coarse resolution of the models.
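The NRMSE definition given above (RMSE divided by the range of the observations) can be sketched in Python as follows. The function name `nrmse` and the optional `axis` argument are illustrative assumptions; the study does not describe its implementation.

```python
import numpy as np


def nrmse(model, obs, axis=None):
    """RMSE normalized by the range (max - min) of the observations.

    With axis=None a single value is returned for the whole array.
    With axis=0 and inputs of shape (time, lat, lon), one value is
    returned per grid point, giving a spatial NRMSE map.
    """
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    rmse = np.sqrt(np.nanmean((model - obs) ** 2, axis=axis))
    obs_range = np.nanmax(obs, axis=axis) - np.nanmin(obs, axis=axis)
    return rmse / obs_range


if __name__ == "__main__":
    # Synthetic example: constant error of 1 against a range of 10.
    obs = np.array([0.0, 10.0])
    model = np.array([1.0, 11.0])
    print(nrmse(model, obs))
```

Because the normalization uses the observed range, the score is dimensionless and can be compared between seasons with different rainfall totals.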
All the models exhibit a positive correlation with observed spatial patterns of rainfall except NorESM2-LM, which reveals a negative correlation over a small part of the north (Figure 12). In addition, the models correlate more positively with rainfall observed in the western parts of the region than in the eastern parts. The models' ensemble mean shows a strong positive correlation with the CHIRPS observed patterns. This indicates that the models performed well in simulating the observed patterns of rainfall over the study domain. The best performing models are GFDL-CM4, GFDL-ESM4, BCC-CSM2-MR, SAMO-UNICON, and UKESM1-0-LL, while CESM2, CESM2-WACCM, CNRM-CM6-1, CNRM-ESM2-1, MRI-ESM2-0, and NorESM2-LM exhibit low correlation with observed patterns.
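A pattern correlation coefficient between a simulated and an observed rainfall field can be sketched as the Pearson correlation over all valid grid cells. The sketch below is unweighted; this assumes that area weighting matters little for a small near-equatorial domain, which is our simplification and not a statement of the study's method. The function name `pattern_correlation` is hypothetical.

```python
import numpy as np


def pattern_correlation(model_field, obs_field):
    """Unweighted Pearson correlation between two 2-D fields.

    Both fields must be on the same grid. Cells that are NaN in either
    field (e.g., masked areas outside the country) are excluded.
    """
    m = np.asarray(model_field, dtype=float).ravel()
    o = np.asarray(obs_field, dtype=float).ravel()
    valid = np.isfinite(m) & np.isfinite(o)
    m = m[valid] - m[valid].mean()  # remove the spatial mean
    o = o[valid] - o[valid].mean()
    return float(np.sum(m * o) / np.sqrt(np.sum(m ** 2) * np.sum(o ** 2)))


if __name__ == "__main__":
    obs = np.array([[1.0, 2.0], [3.0, 4.0]])
    # A field with the same spatial pattern but a different mean and amplitude.
    print(pattern_correlation(2.0 * obs + 5.0, obs))
```

Note that this statistic is insensitive to a uniform offset or scaling, so a model can have a high pattern correlation and still carry a large wet or dry bias.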

| Model ranking
A summary of annual bias, NRMSE, and pattern correlation coefficient is presented in Table 3. Most of the models tend to overestimate annual rainfall over the region. Only CNRM-CM6-1, CNRM-ESM2-1, NorESM2-LM, and UKESM1-0-LL show a negative bias. Based on the lowest bias, CanESM5, CESM2, CESM2-WACCM, EC-Earth3-Veg, MRI-ESM2-0, NorESM2-LM, and UKESM1-0-LL perform best. Generally, the NRMSE is in the range of 0.1 to 0.67. UKESM1-0-LL shows the lowest NRMSE, while CNRM-ESM2-1 reveals the highest. Six of the 15 models employed in the study correlate negatively with observed annual rainfall. The positive correlation ranges between 0 and 0.41, and the correlation of the models' ensemble with CHIRPS is 0.16. Compared with earlier results over East Africa, there has been an improvement in the correlation of the models with observed patterns. In a related study, Ongoma et al. (2019) evaluated the performance of CMIP5 in simulating rainfall over East Africa against CRU data. The positive correlation between the models and CRU was low, ranging from 0.01 to 0.24, with the ensemble mean having a negative correlation.
A Taylor diagram was used to rank the CMIP6 models' skill in simulating spatial seasonal and annual mean rainfall over the region. Figure 13 shows the models' performance in correlating with the observed patterns, the centered RMSD, and the ability of the models to reproduce the variability in rainfall quantified by the standard deviation. Overall, most models perform better in reproducing SON season rainfall than MAM rainfall (Figure 13). These results agree with a previous study by Ongoma et al. (2019), which utilized the CMIP5 models over East Africa. The better SON performance in the current study is attributed to the underlying mechanisms influencing rainfall over the region. Rainfall during SON is largely driven by large-scale features such as ENSO and IOD, which are captured by the GCMs. All the models positively correlated with the observed data for MAM, SON, and the annual scale. In selecting the best performing models, we employed the TSS shown in Figure 14. The closer the TSS value is to 1, the better the agreement between the simulation and observation. The range of TSS of the models during MAM is 0.35-0.71, whereas during SON it is 0.35-0.82. However, the models exhibit poor performance in simulating annual rainfall patterns over the region, with the TSS ranging between 0.16 and 0.76. During MAM, GFDL-ESM4 shows the highest score, and SAMO-UNICON depicts the poorest performance. UKESM1-0-LL reveals the highest score, and IPSL-CM6A-LR shows the poorest performance, during SON. GFDL-ESM4 exhibits the best performance, while NorESM2-LM exhibits the poorest, at the annual scale. The best performing models during MAM (Figure 14) are, in order, GFDL-ESM4, MRI-ESM2-0, GFDL-CM4, CESM2-WACCM, CESM2, UKESM1-0-LL, EC-Earth3-Veg, BCC-CSM2-MR, and CNRM-ESM2-1. During SON, the best performing models include UKESM1-0-LL, BCC-CSM2-MR, NorESM2-LM, GFDL-ESM4, CNRM-CM6-1, CESM2-WACCM, CESM2, and GFDL-CM4.
In addition, for annual rainfall, the best performing models include GFDL-ESM4, EC-Earth3-Veg, GFDL-CM4, CESM2, CNRM-CM6-1, CESM2-WACCM, MRI-ESM2-0, and CNRM-ESM2-1. Generally, the performance of the models in reproducing rainfall over the study region varies from one season to the other. In addition, the models perform more poorly when reproducing observed annual rainfall patterns than seasonal ones. It is also noted that some of the models which exhibit good performance could not reproduce well the seasonal climatology and linear trends of rainfall over the study domain. Thus, the best performing models include GFDL-ESM4, CanESM5, CESM2-WACCM, NorESM2-LM, UKESM1-0-LL, MRI-ESM2-0, and CNRM-CM6-1. Studies by Ongoma et al. (2019) and Mumo and Yu (2020) using the CMIP5 over East Africa reported that CanESM5 exhibited the best performance in reproducing the MAM rainfall. However, in the present study, more models outperformed CanESM5 during MAM. Thus, more improvement is exhibited by the CMIP6 models in reproducing rainfall over the study domain. Furthermore, CanESM5 tends to reproduce well the temporal patterns of rainfall over the region but exhibits poorer performance for the spatial characteristics. This could be attributed to the coarser resolution of the model.
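The Taylor skill score used in the ranking above can be illustrated with a short Python sketch. This section does not state which form of the score was applied, so the sketch assumes the fourth-power form from Taylor (2001), TSS = 4(1 + R)^4 / [(s + 1/s)^2 (1 + R0)^4], where R is the pattern correlation, s is the ratio of the model to the observed spatial standard deviation, and R0 is the maximum attainable correlation (assumed here to be 1). The function name `taylor_skill_score` is hypothetical.

```python
import numpy as np


def taylor_skill_score(model_field, obs_field, r0=1.0):
    """Taylor (2001) skill score, fourth-power form (an assumed choice).

    Returns a value between 0 and 1; values closer to 1 indicate closer
    agreement in both pattern (correlation) and amplitude (standard
    deviation ratio). NaN cells in either field are excluded.
    """
    m = np.asarray(model_field, dtype=float).ravel()
    o = np.asarray(obs_field, dtype=float).ravel()
    valid = np.isfinite(m) & np.isfinite(o)
    m, o = m[valid], o[valid]
    r = np.corrcoef(m, o)[0, 1]          # pattern correlation
    s = np.std(m) / np.std(o)            # normalized standard deviation
    return float(4.0 * (1.0 + r) ** 4 / ((s + 1.0 / s) ** 2 * (1.0 + r0) ** 4))


if __name__ == "__main__":
    obs = np.array([[1.0, 2.0], [3.0, 4.0]])
    print(taylor_skill_score(obs, obs))        # perfect agreement
    print(taylor_skill_score(2.0 * obs, obs))  # right pattern, doubled amplitude
```

The score penalizes both a weak pattern correlation and an over- or under-estimated spatial variability, which is why a model can rank differently here than on correlation alone.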

| SUMMARY AND CONCLUSION
Rainfall is the most essential weather parameter in the tropics as it affects many socio-economic activities. Uganda's national economy largely depends on rain-fed agriculture, so any slight fluctuation in rainfall will have far-reaching effects on community livelihoods. Understanding rainfall variability and trends is crucial for predicting likely patterns and for structuring effective adaptation and mitigation strategies and climate change policies. In this study, we evaluated 15 GCMs of the CMIP6 and their ensemble mean in reproducing mean rainfall over the country at annual and seasonal scales. In the study, only the first run of the first realization of the models was considered. The models and the ensemble mean were compared against CHIRPS and CRU datasets as a proxy for observations for the period 1981-2014 by evaluating their ability to reproduce the climatology, linear trends, and temporal distribution. Statistical metrics were computed against CHIRPS data as the reference because it showed stronger agreement with the models in the climatology than the CRU dataset.
The models tend to reproduce the bimodal rainfall pattern received over the country well. The results show that some models slightly overestimate, while others slightly underestimate, the MAM rainfall. In addition, most models strongly overestimated the short rains (SON). Previous studies have also noted this over East Africa with CMIP5 (Mumo and Yu, 2020). The SON rains have been reported to exhibit higher interannual variability as compared to MAM by many past studies (Nicholson, 2017; Kisembe et al., 2018; Egeru et al., 2019; Ngoma et al., 2021). This is attributed to the fact that SON rainfall is mainly regulated by global teleconnections such as ENSO and IOD. Therefore, more research is necessary to understand the mechanisms governing precipitation over Uganda (e.g., land-atmosphere interaction) and how models simulate them. The two models, CNRM-CM6-1 and CNRM-ESM2-1, tend to underestimate rainfall throughout the study period. The performance of the models varies from seasonal to annual scale. Most models exhibited better performance during SON than MAM according to the TSS ranking. The models further depicted a reduction in dry bias compared to CMIP5 in simulating MAM rainfall. The spatial correlation of the models with CHIRPS is positive at seasonal and annual scales, but a negative correlation is depicted for interannual variability. Nevertheless, some of the models that showed good performance in the ranking could not simulate well the seasonal climatology of the study region. With all of this taken into consideration, the best performing models include GFDL-ESM4, CanESM5, CESM2-WACCM, NorESM2-LM, UKESM1-0-LL, MRI-ESM2-0, and CNRM-CM6-1.
The findings of this study are of great importance to climatologists and end-users of the datasets. The results will help model developers improve parameterization schemes where the models could not reproduce the observed patterns well. There is still a need for improvement in the models to minimize biases resulting from topography and local-scale convective effects. For end-users, more caution is needed when using CMIP6 outputs in projecting rainfall during SON, as most models tend to overestimate it. However, the model outputs generally reproduce rainfall consistent with observed datasets during MAM, and thus can be adopted for future rainfall projections during MAM over Uganda.
This work was supported by the National Key Research and Development Program of China (2017YFA0603804) and the National Natural Science Foundation of China (41575070).