Mapping Cassava Production in Uganda

Renata Retkute; Christopher A. Gilligan

doi:10.20944/preprints202605.2030.v1

Submitted: 28 May 2026

Posted: 29 May 2026


Abstract

Cassava is a critical staple crop for food security and rural livelihoods in Sub-Saharan Africa, yet high-resolution maps of its distribution remain scarce, particularly for smallholder systems. In this study, we generated a 10 m resolution cassava presence map for Uganda (CM24) by fine-tuning a Random Forest classifier on TESSERA foundation model embeddings derived from Sentinel-1 and Sentinel-2 time series. Using field survey data from the Copernicus4GEOGLAM campaign for training and validation, the model achieved excellent discriminative ability (validation AUC = 0.9532, test AUC = 0.9524). Visual validation against high-resolution satellite imagery confirmed good spatial agreement, capturing both large contiguous fields and small fragmented plots. Comparison with two existing global products (CassavaMap and SPAM2020) and two seasons of national survey data conducted by the Uganda Bureau of Statistics showed that CM24 produced national harvested area estimates that fell between the two survey totals, whereas CassavaMap and SPAM2020 systematically overestimated harvested area by factors of two to three. Our results demonstrate that foundation-model embeddings offer a robust and scalable approach for mapping cassava in heterogeneous smallholder landscapes. The resulting CM24 map provides a spatially explicit tool to support disease surveillance, agricultural monitoring, and food security planning in Uganda and beyond.

Keywords: cassava mapping; smallholder agriculture; TESSERA; foundation model; Uganda

Subject: Biology and Life Sciences - Agricultural Science and Agronomy

1. Introduction

Cassava (Manihot esculenta Crantz) is a cornerstone of food security and rural livelihoods in Sub-Saharan Africa, where over half of the world’s crop is produced and an estimated half a billion people consume cassava daily as the second most important source of dietary carbohydrates after maize [1]. Its exceptional tolerance to drought and ability to grow on marginal soils make cassava a climate-resilient staple that ensures baseline food availability even under erratic rainfall and rising temperatures [2]. Beyond subsistence, cassava generates socio-economic benefits for millions of smallholder farmers by diversifying income, creating employment, and reducing vulnerability to economic and environmental shocks [3]. Increasingly recognised as a strategic asset for climate adaptation and industrial development, cassava products—starch, ethanol, and animal feed—play an expanding role in economic growth and poverty reduction across the continent [4].

Despite this critical importance, accurate spatial information on the distribution of the crop throughout cassava growing regions, cultivation practices, and health status remains strikingly sparse. The crop is highly vulnerable to cassava mosaic disease (CMD) and cassava brown streak disease (CBSD), which are spread rapidly through infected cuttings and by whitefly vectors. High-resolution maps are needed to support early detection, targeted surveillance, and effective intervention [5]. Cassava’s extended growing cycle (6–24 months) and cultivation in marginal, remote areas make ground-based monitoring logistically challenging and expensive, whereas remote sensing offers cost-effective alternatives for estimating planted area, yield, and production trends [?]. The growing role of cassava as an industrial feedstock demands reliable supply forecasts that depend on spatially explicit cropping calendars and yield variability assessments [6]. Climate change is projected to alter suitable growing zones across Africa, so spatially explicit models of current distribution serve as baselines for adaptation planning, varietal selection, and targeting of extension services [7]. Thus, mapping cassava using satellite imagery and field surveys is essential for improving agricultural management, disease control, and food security planning across Sub-Saharan Africa.

Recent advances in remote sensing and machine learning have opened new pathways for scalable crop mapping, yet cassava mapping remains challenging due to intercropping practices, small fragmented fields, and physiological overlap with analogous crops [8]. Current approaches to produce cassava production maps rely on profiles of vegetation indices derived from remote sensing time series [9,10]. Remote sensing-derived phenological metrics have also been used to predict cassava mosaic disease outbreaks for areas where the presence of cassava is known in situ [11]. Nevertheless, these phenology-based strategies remain vulnerable to persistent cloud cover during critical growth phases, mixed-pixel effects in fragmented smallholder landscapes, spectral confusion between cassava and similar crops (e.g., sugarcane [10]), and an inability to capture intercropping patterns—limitations that constrain both accuracy and scalability in tropical agricultural regions.

A particularly promising development to overcome these limitations is the emergence of geospatial foundation models (GFMs) for Earth observation [12]. These pre-trained, task-agnostic representations, which can be fine-tuned for downstream tasks with minimal labelled data, have demonstrated strong performance in applications such as crop classification [13], crop yield and tillage monitoring [14], and flood mapping and burn scar detection [15]. One such model is TESSERA (Temporal Embeddings of Surface Spectra for Earth Representation and Analysis), which generates dense, precomputed representations from multi-modal satellite time series [16]. These representations condense a full year of Sentinel-1 SAR and Sentinel-2 optical observations into compact 128-dimensional latent embeddings at 10 m spatial resolution, offering a rich feature space that can potentially capture subtle phenological and structural differences between cassava and other crops without requiring explicit handcrafted indices or region-specific calibration. Such a foundation model is particularly well-suited for mapping smallholder cassava systems, where fields are small, fragmented, and often intercropped.

Cassava production in Sub-Saharan Africa is predominantly carried out by smallholder farmers, who lack timely, high-resolution agricultural information. In Uganda, cassava serves as both a staple food and a strategic reserve crop for food security. According to the 2024 National Population and Housing Census, about 1.8 million agricultural households grow cassava nationwide [17]. Despite this importance, spatially explicit data on cassava distribution at field scales remain limited. In this study, we generate a high-resolution (10 m) cassava distribution map for Uganda (CM24) by fine-tuning a random forest classifier on TESSERA embeddings, using field survey data for training. We compare our mapped cassava presence (CM24) with Annual Agricultural Survey data for 2019 and two existing products (CassavaMap and SPAM2020). By providing a fine-scale, spatially explicit tool, our work supports agricultural monitoring, targeted interventions, and food security planning for smallholder cassava producers in Uganda and beyond.

2. Materials and Methods

2.1. Study Area

The study covers the entire territory of Uganda, which lies approximately between longitudes $29.5^{\circ}$ E and $35.0^{\circ}$ E and latitudes $1.5^{\circ}$ S and $4.5^{\circ}$ N. Rainfall patterns differ across the country. The south experiences a bimodal rainfall pattern, with the primary rainy seasons occurring from March to May (MAM) and September to November (SON) [18]. In contrast, the drier northern region follows a unimodal rainfall pattern, characterised by a single, prolonged rainy season from April to October, followed by an extended dry season between November and March [19].

2.2. Ground Reference Data

We used the publicly available land cover and crop type dataset for Uganda from the European Commission’s Joint Research Centre [20], referred to here as Cop4Geoglam. The data were collected during a field campaign in northern and north-eastern Uganda during the 2022 short rains season. Sampling was based on 338 patches of 500 m × 500 m; within each patch, fields were delineated, and for each sampled field, geolocation, visit time, and field characteristics were recorded, with cropland fields further assessed for crop type.

2.3. Satellite Imagery

This study utilized TESSERA (Temporal Embeddings of Surface Spectra for Earth Representation and Analysis), which uses self-supervised learning to produce 128-dimensional latent embeddings at 10 m spatial resolution, distilling a full year of Sentinel-1 synthetic aperture radar (SAR) and Sentinel-2 optical observations into a compact feature vector for each 10 m pixel globally [16]. We obtained pre-computed TESSERA embeddings for Uganda for the 2024 calendar year using the geotessera Python package (version 1.0, available at github.com/ucam-eo/tessera). The global embedding archive is arranged into $0.1^{\circ} \times 0.1^{\circ}$ geographic tiles (approximately 11 km × 11 km at the equator), each stored as a cloud-optimised GeoTIFF containing all 128 embedding channels at 10 m pixel resolution. A total of 3,780 tiles fully covered Uganda. For each 10 m pixel, the 128-dimensional feature vector was extracted and used directly as input to a pixel-wise supervised classifier. The trained classifier was then applied to the same embedding tiles to produce binary maps of cassava presence/absence at 10 m resolution for the entire study area.

2.4. Model Training and Evaluation

Using built-in R caret functions [21], the Random Forest (RF) model was trained via repeated 5-fold cross-validation with three repeats. To address class imbalance, up-sampling of the minority class was applied, and class probabilities were requested for subsequent threshold optimization—all handled natively by caret. The model was configured with 100 trees; all other hyperparameters retained their default values as implemented in the caret package.
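The up-sampling step can be illustrated with a minimal Python sketch. This is not the caret implementation used in the study; it only shows the general idea of resampling the smaller class with replacement until the classes are balanced. The function name `upsample_minority` and the toy data are hypothetical. In a cross-validation setting, such resampling would normally be applied to the training folds only, so that held-out data keep their original class balance.

```python
import random
from collections import Counter


def upsample_minority(features, labels, seed=0):
    """Resample smaller classes with replacement until every class
    has as many samples as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(features, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(rows) for rows in by_class.values())
    out_x, out_y = [], []
    for y, rows in by_class.items():
        # Draw the missing samples at random from the existing ones.
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for x in rows + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


# Toy data (assumed for illustration): five non-cassava pixels, one cassava pixel.
features = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.9]]
labels = ["other", "other", "other", "other", "other", "cassava"]
bal_x, bal_y = upsample_minority(features, labels)
print(Counter(bal_y))
```

After up-sampling, both classes contribute the same number of rows to training, which prevents the classifier from favouring the majority class simply because it is more frequent.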

Model performance was assessed using accuracy, precision, recall, and F1 score [22]. These metrics were derived from the confusion matrix as follows:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$

$$\text{Precision} = \frac{TP}{TP + FP} \tag{2}$$

$$\text{Recall} = \frac{TP}{TP + FN} \tag{3}$$

$$\text{F1 score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{4}$$

where

True Positive, $TP$, is the number of samples labelled as positive by the model that are actually positive;
False Positive, $FP$, is the number of samples labelled as positive by the model that are actually negative;
True Negative, $TN$, is the number of samples labelled as negative by the model that are actually negative;
False Negative, $FN$, is the number of samples labelled as negative by the model that are actually positive.
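As a worked illustration of Equations (1) to (4), the short Python sketch below computes the four metrics from confusion-matrix counts. The function name `classification_metrics` and the example counts are assumptions chosen for illustration; they are not values from this study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall and F1 score from confusion-matrix counts.
    Ratios with a zero denominator are returned as 0.0 (a convention assumed here)."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts: 1000 pixels, 100 of which are truly cassava.
m = classification_metrics(tp=80, fp=20, tn=880, fn=20)
print(m)
```

With these hypothetical counts, accuracy is 0.96 while precision, recall and F1 are all 0.80, which shows how accuracy alone can look favourable when the negative class dominates.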

2.5. Annual Agricultural Survey Data

The Uganda Bureau of Statistics (UBOS) established the Annual Agricultural Survey (AAS) in 2017 to address the growing demand for agricultural statistics [23]. We used results from the AAS 2019, the third survey implemented by UBOS in collaboration with the Ministry of Agriculture, Animal Industries and Fisheries [24]. From the annexes, we extracted the estimated harvested area of cassava by sub-region for the first and second seasons. Total harvested area was defined as the total planted area, calculated over those observations for which production was available (not missing) and greater than zero. The AAS collects data for one agricultural year (here, January to December 2019). In Uganda, the agricultural year comprises two seasons: the first from January to June and the second from July to December. For each season, agricultural households are interviewed twice, during post-planting and post-harvesting visits. The assignment of districts to sub-regions follows the classification of the Uganda Annual Agricultural Survey 2019, with the list provided in Appendix A.

2.6. Existing Cassava Production Maps

A few cassava production maps at country, regional or continental scales have been derived from empirical approaches using demographic and economic data rather than remote sensing. CassavaMap provides a 1 km disaggregation of cassava production and harvested area using administrative unit data for 32 countries in Africa, including Uganda; the estimates are standardized to 2014 FAO reported levels and downscaled based on the distribution of rural population in 2014 [25].

Similarly, the International Food Policy Research Institute (IFPRI) Spatial Production Allocation Model (SPAM) disaggregates crop production and harvested area at 10 km resolution using a cross-entropy approach that allocates sub-national statistics to pixels based on agricultural land suitability, population density, and irrigation extent [26]. For the current study, we used Global Spatially-Disaggregated Crop Production Statistics Data for 2020 Version 2.0 Release 2 [27]. This dataset compiles sub-national crop production statistics for 42 major crops, harmonized with FAOSTAT country totals and supplemented by multiple national and sub-national sources.

3. Results

3.1. Characteristics of Ground Reference Data

Of the 23,349 field polygons delineated in the Cop4Geoglam dataset, we retained only those with valid land cover labels, excluding fields marked “NA”, i.e., fields that were not labelled (Figure 1A-B). The resulting labelled pixels, carrying land cover and crop-type labels, comprised 32.2% of the full set of 839,552 pixels. Cassava presence pixels accounted for 14.0% of this labelled set. The remaining non-cassava pixels were distributed across the following classes: built-up area (1.7%), forest (2.2%), natural grassland (31.6%), natural shrubs (8.3%), water (0.1%), and other crop types (39.7%).

The two patches of delineated fields with the highest cassava pixel proportions are shown in Figure 1C. Patch ID 124098 (West Nile region; $31.474^{\circ}$ E, $3.250^{\circ}$ N) has a cassava proportion of 59%, comprising 35 cassava, 35 non-cassava, and 50 NA fields, while patch ID 525062 (Eastern region; $33.844^{\circ}$ E, $1.243^{\circ}$ N) has a cassava proportion of 57%, comprising 86 cassava, 86 non-cassava, and 116 NA fields. The second patch contains more fields, and its field polygons are correspondingly smaller. Overall, field boundaries align well with the satellite imagery. The occasional slight misalignment may be explained by shifting planting decisions and field utilization in response to weather and market signals.

3.2. Performance of the RF Classifier

The labelled ground reference dataset, comprising all field data with valid labels aggregated into a single set, was used to train, test, and validate a Random Forest classifier for pixel-level cassava detection. The model was trained using 80% of the data (training set), tested using 10% of the data (testing set), and final performance was assessed on the remaining 10% of the data (validation set).
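The 80/10/10 partition described above can be sketched as follows. This is a plain random split of sample indices written for illustration; the text does not state whether the split was stratified by class or grouped by field, so neither is assumed here, and the function name `split_80_10_10` is hypothetical.

```python
import random


def split_80_10_10(n_samples, seed=0):
    """Shuffle sample indices and cut them into training (80%),
    testing (10%) and validation (remaining ~10%) subsets."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_train = int(0.8 * n_samples)
    n_test = int(0.1 * n_samples)
    train = idx[:n_train]
    test = idx[n_train:n_train + n_test]
    validation = idx[n_train + n_test:]
    return train, test, validation


# Hypothetical example with 1000 labelled pixels.
train_idx, test_idx, val_idx = split_80_10_10(1000)
print(len(train_idx), len(test_idx), len(val_idx))
```

Because neighbouring pixels within one field are highly correlated, a purely random pixel-level split such as this one can place pixels from the same field in different subsets; grouping the split by field or patch is a common alternative when a stricter estimate of generalisation is needed.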

The random forest model was tuned automatically within the R caret package by varying the number of variables randomly sampled at each split (mtry). Among the tested values (mtry = 2, 65, and 128), the highest accuracy was achieved with mtry = 65 (accuracy = 0.907), compared with mtry = 2 (accuracy = 0.899) and mtry = 128 (accuracy = 0.906). Therefore, mtry = 65 was selected as the optimal tuning parameter for the final model.

After training the model, the performance was optimized by selecting the classification probability threshold based on the testing data. Figure 2A shows the performance metrics of the Random Forest classifier as a function of this threshold (ranging from 0 to 1). Accuracy rose from 0.323 at a threshold of 0.01 to a peak of 0.918 near thresholds of 0.42–0.44, then gradually declined to approximately 0.860 at higher thresholds up to 0.99. Precision remained at or near 1.0 for thresholds up to 0.03, after which it declined gradually to about 0.86, indicating that the model maintained very high positive predictive value across most threshold values. The weighted F1 score increased from 0.35 at a threshold of 0.01 to a peak of approximately 0.953 near thresholds of 0.43–0.44, then gradually declined to around 0.925 for thresholds above 0.90. Based on these results, the optimal classification threshold was set to 0.43 for the rest of the analysis. The random forest classifier demonstrated excellent discriminative ability in both the test and validation sets. The validation AUC was 0.9532 and the test AUC was 0.9524, indicating nearly identical and highly accurate performance with minimal overfitting (Figure 2B).
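The threshold selection described above can be sketched as a simple sweep over candidate thresholds. The sketch assumes that the classifier outputs a probability for the positive class and that accuracy is the selection criterion; the study inspected accuracy, precision and weighted F1 together, so this is a simplification. The function names and toy data are hypothetical.

```python
def accuracy_at(probs, labels, threshold):
    """Accuracy when samples with probability >= threshold are labelled positive."""
    correct = sum((p >= threshold) == bool(y) for p, y in zip(probs, labels))
    return correct / len(labels)


def best_threshold(probs, labels, thresholds):
    """Return the first threshold that maximises accuracy on held-out data."""
    return max(thresholds, key=lambda t: accuracy_at(probs, labels, t))


# Candidate thresholds 0.01, 0.02, ..., 0.99, matching the range discussed in the text.
thresholds = [i / 100 for i in range(1, 100)]

# Toy held-out predictions (assumed for illustration only).
probs = [0.05, 0.10, 0.20, 0.35, 0.40, 0.45, 0.60, 0.80, 0.90, 0.95]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

t_best = best_threshold(probs, labels, thresholds)
print(t_best, accuracy_at(probs, labels, t_best))
```

Selecting the threshold on the testing set and reporting final performance on a separate validation set, as done in the study, keeps the threshold choice from inflating the reported metrics.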

To further assess the spatial reliability of the predictions, a visual validation was performed using high-resolution satellite imagery from Google Earth Pro (Image ©2026 CNES / Airbus). A single 10 km × 10 km tile centred at longitude 33.85, latitude 1.25 was selected. Ten 500 m × 500 m squares were randomly sampled to represent a range of cassava fractional cover, ensuring that these squares did not overlap with the Cop4Geoglam data (Figure 2C). The CM24 cassava presence map at 10 m resolution across the region showed that cassava covered approximately 36% of the total area (Figure 2D). For each square, the model’s predicted cassava presence (red pixels) was overlaid on the corresponding high-resolution imagery (Figure 2E). In all ten cases, the spatial distribution of predicted cassava pixels showed good visual agreement with the actual field patterns visible in the satellite data, confirming that the model captures both large cassava blocks and small, fragmented plots.

3.3. Comparing Mapped Areas with Annual Agricultural Survey Data and Existing Cassava Maps

To evaluate the accuracy of our cassava distribution map (CM24), we compared it against two existing products (CassavaMap and SPAM2020) and against harvested areas estimated for two agricultural survey seasons (AA19S1 and AA19S2) of the 2019 AAS in Uganda. To enable comparison, we aggregated CM24 to 1 km (Figure 3A) and 10 km (Figure 3D) grid cells. For analysis, we extracted data on harvested area for Uganda from the global maps of CassavaMap (Figure 3B) and SPAM2020 (Figure 3E). The pixel-wise comparison showed differences between CM24 and each of the other available products (Figure 3C and F). The difference between CM24 and CassavaMap ranged from -50.00 to 56.63 ha per 1 km grid cell (median = -0.38 ha, mean = -2.14 ha), indicating that CM24 estimates were slightly lower overall. The difference between CM24 and SPAM2020 at 10 km resolution was substantially larger in magnitude, ranging from -3850 to 3601 ha per 10 km cell (median = -348 ha, mean = -428 ha), reflecting systematically lower estimates by CM24 relative to SPAM2020 across most of Uganda. One notable outlier was a 10 km grid cell in Zombo district (longitude 30.92351, latitude 2.3923995; West Nile subregion), where SPAM2020 reported a high harvested area of 8,287.3 ha, a value that strongly deviates from surrounding patterns and likely contributes to the large negative differences observed in that region.
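The aggregation of a 10 m presence map to coarser grid cells can be sketched as a block sum converted to hectares. The sketch assumes a projected grid in which every pixel is exactly 10 m × 10 m (0.01 ha), so that a 1 km cell holds 100 × 100 pixels and a 10 km cell holds 1000 × 1000 pixels; the function name `aggregate_to_hectares` and the toy mask are hypothetical, and the study's actual resampling procedure may differ.

```python
PIXEL_SIZE_M = 10
PIXEL_AREA_HA = PIXEL_SIZE_M * PIXEL_SIZE_M / 10_000  # 0.01 ha per 10 m pixel


def aggregate_to_hectares(mask, block):
    """Sum a binary cassava mask (list of rows of 0/1) over block x block
    windows and return the cassava area per window in hectares."""
    n_rows, n_cols = len(mask), len(mask[0])
    out = []
    for r0 in range(0, n_rows, block):
        row_out = []
        for c0 in range(0, n_cols, block):
            count = sum(
                mask[r][c]
                for r in range(r0, min(r0 + block, n_rows))
                for c in range(c0, min(c0 + block, n_cols))
            )
            row_out.append(count * PIXEL_AREA_HA)
        out.append(row_out)
    return out


# Toy 4 x 4 mask aggregated with 2 x 2 windows (illustration only).
mask = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
coarse = aggregate_to_hectares(mask, block=2)

# A 1 km cell (100 x 100 pixels) fully covered by cassava.
full_cell = aggregate_to_hectares([[1] * 100 for _ in range(100)], block=100)
print(coarse, full_cell)
```

Under these assumptions a 1 km cell can hold at most 100 ha of cassava and a 10 km cell at most 10,000 ha, which gives a useful sanity bound when comparing per-cell values across products.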

Survey values differed notably between seasons, with Season 2 generally showing larger harvested areas across most subregions (Figure 3G-H), for example in Lango (84,678 ha vs. 8,856 ha) and Busoga (45,518 ha vs. 16,146 ha). At the national level, total harvested area from the two survey seasons differed considerably: 184,814 ha (Season 1) versus 355,878 ha (Season 2) [24]. CM24 estimated a total of 361,160 ha (calculated from data across both seasons), which falls between the two AAS survey estimates and is close to the AAS report for Season 2. In contrast, CassavaMap (729,253 ha) and SPAM2020 (804,638 ha) far exceeded both survey totals, indicating systematic over-prediction. Among the three map products, CM24 frequently aligned more closely with at least one of the survey seasons, whereas CassavaMap and SPAM2020 consistently estimated higher harvested area in most subregions. For instance, in West Nile, CM24 (55,245 ha) approximated Season 2 (52,737 ha), while CassavaMap (98,605 ha) and SPAM2020 (101,816 ha) were substantially higher. Agreement between each map-based harvested area estimate and the survey-based estimates was quantified using the root-mean-square error (RMSE) across subregions, which summarises the typical deviation between map and survey values; lower RMSE indicates closer agreement. RMSE against the two survey seasons also favoured CM24: for Season 1, RMSE values were 23,295 ha (CM24), 49,732 ha (CassavaMap), and 60,129 ha (SPAM2020); for Season 2, RMSE values were 17,831 ha (CM24), 33,994 ha (CassavaMap), and 38,039 ha (SPAM2020). Overall, CM24 consistently produced harvested area estimates that were closer to the available survey data than either CassavaMap or SPAM2020, while also delivering improved spatial granularity.
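For reference, the RMSE between map-based and survey-based subregional totals can be computed as in the minimal Python sketch below. The function name `rmse` and the three example values are hypothetical and are not the subregional figures from this study.

```python
import math


def rmse(map_values, survey_values):
    """Root-mean-square error between paired map and survey estimates:
    sqrt( mean( (map_i - survey_i)^2 ) ) over all subregions i."""
    if len(map_values) != len(survey_values) or not map_values:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(m - s) ** 2 for m, s in zip(map_values, survey_values)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical harvested areas (thousand ha) for three subregions.
map_est = [50.0, 30.0, 20.0]
survey_est = [47.0, 34.0, 20.0]
error = rmse(map_est, survey_est)
print(round(error, 3))
```

Because the errors are squared before averaging, a single subregion with a large mismatch contributes disproportionately to the RMSE, which is worth keeping in mind when a product contains localised outliers.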

4. Discussion

High-resolution crop distribution maps are essential for disease surveillance, yield forecasting, and climate adaptation, yet they remain conspicuously absent for smallholder cassava systems in Sub-Saharan Africa. By fine-tuning a Random Forest classifier on TESSERA foundation model embeddings, we generated a 10 m resolution cassava presence map for Uganda (CM24) suitable for use in heterogeneous agricultural landscapes. The visual validation against high-resolution satellite imagery confirmed that the model’s spatial predictions align well with ground patterns, providing confidence for operational use.

Existing global cassava maps (CassavaMap, SPAM2020) rely on coarse demographic and agricultural suitability proxies rather than direct remote sensing signals. Their systematic over-prediction of harvested area compared with CM24 likely reflects the difficulty of downscaling administrative statistics to pixel-related scales without a strong crop-specific phenological signal. CM24, by contrast, produces national totals that fall between the two survey seasons and tracks subregional patterns more closely, as reflected in its lower RMSE values. The extreme outlier in SPAM2020 (Zombo district) illustrates a broader vulnerability of cross-entropy allocation methods when input data contain local anomalies. Our findings suggest that foundation-model embeddings offer a viable path toward crop-specific maps that respect actual satellite-observed patterns, even in data-sparse smallholder contexts.

Recent continental-scale mapping of maize and cropland in Africa [28,29] has shown that dense Sentinel-1 and Sentinel-2 time series can achieve accuracies comparable to ours. For soybean, Wang et al. [30] demonstrated 10 m mapping using phenological metrics. Cassava presents unique challenges: a flexible growing cycle of 6–24 months, frequent intercropping, and spectral similarity with other broadleaf crops. The TESSERA embeddings, which condense a full year of multi-modal observations into a compact feature space, appear to overcome these challenges without requiring manually defined phenological windows. This suggests that self-supervised representations could serve as a generalisable backbone for mapping other root crops or minor cereals in tropical regions.

The spatial arrangement of host plants is a key driver of the spread of CMD and CBSD [31,32]. Several epidemiological models have been developed for CBSD [33,34,35,36], but their predictive utility depends critically on accurate, high-resolution host distribution data. A detailed cassava presence map is therefore essential to parameterise disease spread models and to inform evidence-based surveillance and control policies at national or regional scales [37,38,39,40,41]. Coupling crop distribution maps with mechanistic epidemic models enables a transition from reactive to proactive plant health management, allowing targeted interventions before outbreaks escalate.

Several limitations should be acknowledged. Ground reference data were collected during a single season and concentrated in northern and north-eastern Uganda. While the TESSERA embeddings are pre-trained globally, expanding the training set to include multiple years and regions would improve generalisability, particularly for southern Uganda where bimodal rainfall patterns may alter cassava phenology. Our pixel-wise classifier does not explicitly model field boundaries; integrating object-based or convolutional architectures could reduce salt-and-pepper noise and better align predictions with cadastral units. The survey data (AAS 2019) themselves carry sampling and reporting uncertainties, and direct plot-level validation of CM24 remains a priority.

Despite these limitations, CM24 provides a substantial step forward. Its 10 m resolution and moderate computational requirements make it suitable for routine use by agricultural monitoring agencies and plant health services. Future work should extend the time series to capture inter-annual crop rotation, integrate the map with yield models, and scale the approach to neighbouring countries facing similar smallholder cassava challenges.

Author Contributions

Conceptualization, R.R. and C.A.G.; methodology, R.R.; software, R.R.; validation, R.R.; formal analysis, R.R.; investigation, R.R.; resources, C.A.G.; data curation, R.R.; writing (original draft preparation), R.R.; writing (review and editing), R.R. and C.A.G.; visualization, R.R.; project administration, R.R.; funding acquisition, C.A.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Gates Foundation grant INV070408, which we gratefully acknowledge.

Data Availability Statement

Zenodo https://www.mdpi.com/ethics.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Appendix A. Sub-Regions and Their Districts (Based on AAS 2019)

S. Buganda: Kalangala, Masaka, Mpigi, Rakai, Ssembabule, Wakiso, Lyantonde, Bukomansimbi, Butambala, Gomba, Kalungu, Lwengo
N. Buganda: Kiboga, Luwero, Mubende, Mukono, Nakasongola, Kayunga, Mityana, Nakaseke, Buikwe, Buvuma, Kyankwanzi
West Nile: Adjumani, Arua, Moyo, Nebbi, Yumbe, Koboko, Maracha, Zombo
Lango: Apac, Lira, Amolatar, Dokolo, Oyam, Alebtong, Kole, Otuke
Acholi: Gulu, Kitgum, Pader, Amuru, Agago, Lamwo, Nwoya, Omoro
Kigezi: Kabale, Kisoro, Rukungiri, Kanungu, Mitooma, Rubanda
Bunyoro: Hoima, Kibaale, Masindi, Buliisa, Kiryandongo, Kagadi, Kakumiro
Tooro: Bundibugyo, Kabarole, Kasese, Kamwenge, Kyenjojo, Kyegegwa
Busoga: Bugiri, Iganga, Jinja, Kamuli, Mayuge, Kaliro, Buyende, Luuka, Namayingo
Teso: Katakwi, Kumi, Soroti, Kaberamaido, Amuria, Bukedea, Serere
Bukedi: Busia, Pallisa, Tororo, Budaka, Butaleja, Kibuku
Elgon: Kapchorwa, Mbale, Sironko, Bududa, Bukwo, Manafwa, Bulambuli, Kween
Karamoja: Kotido, Moroto, Nakapiripirit, Abim, Kaabong, Amudat, Napak
Ankole: Bushenyi, Mbarara, Ntungamo, Ibanda, Isingiro, Kiruhura, Buhweju, Sheema

References

Natural Resources Institute. Transforming cassava to improve livelihoods in sub-Saharan Africa. Impact case study, n.d. Accessed: 2026-05-07.
Bacsi, Z.; Jarso, D.D. Cassava Response to Weather Variability in Eastern Africa. Agriculture 2026, 16, 209. [CrossRef]
Borku, A.W.; Tora, T.T.; Masha, M. Cassava in focus: A comprehensive literature review, its production, processing landscape, and multi-dimensional benefits to society. Food Chemistry Advances 2025, 7, 100945. [CrossRef]
CGIAR Research Program on Roots, Tubers and Bananas. RTB crop breeding in Africa shows wide-scale adoption of improved varieties. In Crop Improvement, Adoption, and Impact of Improved Varieties in Food Crops in Sub-Saharan Africa; CAB International, 2015.
Legg, J.P.; Lava Kumar, P.; Makeshkumar, T.; Tripathi, L.; Ferguson, M.; Kanju, E.; Ntawuruhunga, P.; Cuellar, W. Cassava virus diseases: biology, epidemiology, and management. Advances in Virus Research 2015, 91, 85–142. [CrossRef]
Adebayo, W.G. Cassava production in Africa: A panel analysis of the drivers and trends. Heliyon 2023, 9, e19939. [CrossRef]
Sikazwe, G.; Yocgo, R.E.E.; Landi, P.; Richardson, D.M.; Hui, C. Predicting the current and future suitable habitats of cassava and cassava brown streak disease in Africa. East African Journal of Science, Technology and Innovation 2026, 7. [CrossRef]
Silva, D.V.; Ferreira, E.A.; Oliveira, M.C.; Pereira, G.A.; Oliveira, R.A. Productivity of cassava and other crops in an intercropping system. Ciencia e investigación agraria 2016, 43, 159–166. [CrossRef]
Daraneesrisuk, J.; Ninsawat, S.; Losiri, C.; Sitthi, A. Sugarcane and Cassava Classification Using Machine Learning Approach Based on Multi-temporal Remote Sensing Data Analysis. In Applied Geography and Geoinformatics for Sustainable Development; Springer International Publishing, 2022; pp. 183–194. [CrossRef]
Wang, X.; Wang, Q.; Lai, H.; Zhang, Z.; Yun, T.; Lu, X.; Wang, G.; Lao, S.; Liao, Q.; Lu, S.; et al. A multi-sensor, phenology-based approach framework for mapping cassava cultivation dynamics and intercropping in highly fragmented agricultural landscapes. ISPRS Journal of Photogrammetry and Remote Sensing 2025, 228, 44–63. [CrossRef]
Chaiyana, A.; Khiripet, N.; Ninsawat, S.; Siriwan, W.; Shanmugam, M.S.; Virdis, S.G. Mapping and predicting cassava mosaic disease outbreaks using earth observation and meteorological data-driven approaches. Remote Sens. Appl. Soc. Environ. 2024, 35, 101231. [CrossRef]
Xiao, A.; Xuan, W.; Wang, J.; Huang, J.; Tao, D.; Lu, S.; Yokoya, N. Foundation Models for Remote Sensing and Earth Observation: A Survey, 2024. [CrossRef]
Szwarcman, D.; Roy, S.; Fraccaro, P.; Gíslason, Þ.E.; Blumenstiel, B.; Ghosal, R.; de Oliveira, P.H.; de Sousa Almeida, J.L.; Sedona, R.; Kang, Y.; et al. Prithvi-EO-2.0: A Versatile Multitemporal Foundation Model for Earth Observation Applications. IEEE Transactions on Geoscience and Remote Sensing 2026, 64, 1–20. [CrossRef]
Ma, Y.; Shen, Y.; Swatantran, A.; Lobell, D.B. Harvesting AlphaEarth: Benchmarking the Geospatial Foundation Model for Agricultural Downstream Tasks, 2026. [CrossRef]
Astruc, G.; Gonthier, N.; Mallet, C.; Landrieu, L. AnySat: One Earth Observation Model for Many Resolutions, Scales, and Modalities, 2024. [CrossRef]
Feng, Z.; Atzberger, C.; Jaffer, S.; Knezevic, J.; Sormunen, S.; Young, R.; Lisaius, M.C.; Immitzer, M.; Jackson, T.; Ball, J.; et al. TESSERA: Temporal Embeddings of Surface Spectra for Earth Representation and Analysis, 2025. [CrossRef]
Uganda Bureau of Statistics. The National Population and Housing Census 2024 – Final Report - Volume 1 (Main); Uganda Bureau of Statistics: Kampala, Uganda, 2024.
Ngoma, H.; Wen, W.; Ojara, M.; Ayugi, B. Assessing current and future spatiotemporal precipitation variability and trends over Uganda, East Africa, based on CHIRPS and regional climate model datasets. Meteorology and Atmospheric Physics 2021, 133, 823–843. [CrossRef]
Phillips, J.; McIntyre, B. ENSO and interannual rainfall variability in Uganda: implications for agricultural management. International Journal of Climatology 2000, 20, 171–182. [CrossRef]
European Commission, Joint Research Centre. Uganda AOI, 2026. [CrossRef]
Kuhn, M. Building Predictive Models in R Using the caret Package. Journal of Statistical Software 2008, 28. [CrossRef]
Grandini, M.; Bagli, E.; Visani, G. Metrics for Multi-Class Classification: an Overview, 2020. [CrossRef]
Ponzini, G.; Baryahirwa, S.; Brunelli, C.; Ilukor, J.; Kilic, T.; Mugabe, S.; Mupere, A.; Okello, P.; Oumo, F.; Ssennono, V. The integration of socio-economic and agricultural surveys by national statistical offices: The case of the Uganda Harmonized Integrated Survey 1. Statistical Journal of the IAOS 2022, 38, 141–161.
Uganda Bureau of Statistics. Annual Agriculture Survey (AAS) 2020. Online, 2022. Accessed: 2026-05-12.
Szyniszewska, A.M. CassavaMap, a fine-resolution disaggregation of cassava production and harvested area in Africa in 2014. Scientific Data 2020, 7. [CrossRef]
International Food Policy Research Institute (IFPRI). Global Spatially-Disaggregated Crop Production Statistics Data for 2020 Version 2.0 Release 2, 2024. [CrossRef]
International Food Policy Research Institute (IFPRI). Global Spatially-Disaggregated Crop Production Statistics Data for 2020 Version 2.0 Release 2, 2024. [CrossRef]
Abdelrahim, N.A.M.; Jin, S. Continental maize mapping and distribution in Africa by integrating radar and optical imagery. Environmental Monitoring and Assessment 2025, 197. [CrossRef]
Rufin, P.; Hammer, P.L.; Thomas, L.F.; Lisboa, S.N.; Ribeiro, N.; Sitoe, A.; Hostert, P.; Meyfroidt, P. National-scale field delineation in Mozambique refines our understanding of cropland distribution, field size, and deforestation actors. Environmental Research Letters 2026, 21, 084009. [CrossRef]
Wang, R.; Zhang, J.; Lu, X.; Fu, Z.; Cai, G.; Liu, B.; Li, J. JM-Guided Sentinel 1/2 Fusion and Lightweight APM-UNet for High-Resolution Soybean Mapping. Remote Sensing 2025, 17, 3934. [CrossRef]
Alicai, T.; Szyniszewska, A.M.; Omongo, C.A.; Abidrabo, P.; Okao-Okuja, G.; Baguma, Y.; Ogwok, E.; Kawuki, R.; Esuma, W.; Tairo, F.; et al. Expansion of the cassava brown streak pandemic in Uganda revealed by annual field survey data for 2004 to 2017. Scientific Data 2019, 6. [CrossRef]
Suprunenko, Y.F.; Gilligan, C.A. Where to refine spatial data to improve accuracy in crop disease modelling: an analytical approach with examples for cassava. Royal Society Open Science 2025, 12. [CrossRef]
McQuaid, C.F.; Sseruwagi, P.; Pariyo, A.; van den Bosch, F. Cassava brown streak disease and the sustainability of a clean seed system. Plant Pathology 2015, 65, 299–309. [CrossRef]
McQuaid, C.F.; van den Bosch, F.; Szyniszewska, A.; Alicai, T.; Pariyo, A.; Chikoti, P.C.; Gilligan, C.A. Spatial dynamics and control of a crop pathogen with mixed-mode transmission. PLOS Computational Biology 2017, 13, e1005654. [CrossRef]
Godding, D.; Stutt, R.O.J.H.; Alicai, T.; Abidrabo, P.; Okao-Okuja, G.; Gilligan, C.A. Developing a predictive model for an emerging epidemic on cassava in sub-Saharan Africa. Scientific Reports 2023, 13. [CrossRef]
Retkute, R.; Gilligan, C.A. A novel two-stage parameter estimation framework integrating Approximate Bayesian Computation and Machine Learning: The ABC-RF-rejection algorithm, 2025. [CrossRef]
Godding, D.; Stutt, R.O.J.H.; Savi, M.K.; Ahanhanzo, C.; Tiendrebeogo, F.; Doungous, O.; Godefroid, M.; Bakelana, Z.; Mavoungou, J.F.; Oppong, A.; et al. Predicting the cross-continental spread of the cassava brown streak disease epidemic in sub-Saharan Africa, 2025. [CrossRef]
Retkute, R.; Gilligan, C.A. Developing a spatio-temporal model for banana bunchy top disease: leveraging remote sensing and survey data. Frontiers in Plant Science 2025, 16. [CrossRef]
Retkute, R.; Zandjanakou-Tachin, M.; Omondi, B.A.; Agoi, U.R.; Vodounou, Y.M.; Akofodji, H.; Akpla, E.; Dossou, L.; Médénou, E.; Etchiha, A.; et al. Controlling banana bunchy top disease in Benin: Crop protection strategies with socio-economic perspectives. Plants, People, Planet 2025. [CrossRef]
Retkute, R.; Gilligan, C.A. Cost-effective early detection of banana bunchy top disease: insights from spatio-temporal modelling in Benin, 2026. [CrossRef]
Smith, J.W.; Stutt, R.O.J.H.; Retkute, R.; Mona, T.; Thurston, W.; Bacha, N.; Gutu, K.; Horo, J.T.; Alemayehu, Y.; Hodson, D.; et al. Evaluating a landscape-scale model to forecast wheat stem rust. Environmental Research Letters 2026, 21, 014034. [CrossRef]

Figure 1. Ground data distribution and examples. (A) Map of Uganda showing the spatial distribution of ground truth samples. (B) Distribution and composition of ground truth samples. Note that sample coordinates have been jittered to prevent overlap and do not represent exact geographic locations. Pie charts indicate the pixel-level proportion of not labelled (dark grey), cassava (green), and non-cassava (orange) within each sample. (C) Two illustrative samples selected for their high proportion of cassava pixels: (i) patch ID = 124098; and (ii) patch ID = 525062. For each sample, the left column shows delineated field polygons coloured according to the categories in A, and the right column shows the corresponding high-resolution satellite imagery obtained from Google Earth Engine (Image ©2025 CNES / Airbus).

Figure 2. Performance of the RF classifier. (A) Metrics of the Random Forest classifier across classification probability thresholds for the testing data. (B) ROC curves with AUC for the testing and validation sets. (C) Location of Cop4Geoglam data (white polygons) and 10 randomly selected samples for visual inspection. (D) CM24 cassava presence (red pixels) over the same region at 10 m resolution. (E) CM24 cassava presence (red pixels) and corresponding high-resolution satellite imagery for the 10 sites shown in C. Numbers correspond to the site locations. High-resolution satellite imagery was obtained from Google Earth Engine (Image ©2026 CNES / Airbus).

Figure 3. (A) CM24 cassava map aggregated to ha per 1 km grid cell. The colour scale is log-transformed. (B) CassavaMap harvested area in Uganda [25]. The colour scale is log-transformed. (C) Difference between CM24 and CassavaMap in cassava production per 1 km grid cell. (D) CM24 cassava map aggregated to ha per 10 km grid cell. The colour scale is log-transformed. (E) SPAM2020 cassava harvested area in Uganda [27]. The colour scale is log-transformed. (F) Difference between CM24 and SPAM2020 in cassava production per 10 km grid cell. (G) Comparison of cassava harvested area (ha) per subregion in Uganda between two survey seasons (AA19S1 and AA19S2) and three map-based products: CM24 (10 m map), CassavaMap, and SPAM2020. Subregions are ordered by the maximum surveyed harvested area. Grey lines connect the two survey seasons for each subregion. (H) Map of the subregions in Uganda (adapted from [24]).

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.