Submitted:
11 September 2024
Posted:
12 September 2024
Abstract
Keywords:
1. Introduction
2. Data
2.1. Land Cover Maps
2.1.1. ECOCLIMAP-SG
2.1.2. ECOCLIMAP-SG+
- Specialist maps (i.e. maps that focus on a specific land cover type, such as forest, crops or urban) are more reliable than non-specialist maps;
- The more maps that agree on the land cover label at a given location, the more confident we are in the label;
- When no better information is available, the ECOSG label is kept.
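
The three rules above can be illustrated with a small weighted vote at a single location. This is only a sketch under stated assumptions, not the ECOSG+ procedure of [23]: the function name `merge_labels`, the input layout and the two weights are hypothetical, since the text only says that specialist maps are more reliable, not by how much.

```python
from collections import defaultdict

# Hypothetical weights: the source only states that specialist maps are
# more reliable than non-specialist ones, not by how much.
SPECIALIST_WEIGHT = 2.0
GENERALIST_WEIGHT = 1.0


def merge_labels(votes, ecosg_label):
    """Pick a label for one location from several map votes.

    votes: list of (label, is_specialist) pairs, one per map that has
           an opinion at this location (maps with no data are left out).
    ecosg_label: the ECOSG label, used as fallback.
    Returns (label, score), where score is the summed weight of the
    maps that agree with the returned label.
    """
    if not votes:
        # Third rule: no better information available, keep the ECOSG label.
        return ecosg_label, 0.0
    scores = defaultdict(float)
    for label, is_specialist in votes:
        # First rule: specialist maps count more.
        # Second rule: agreeing maps add up.
        scores[label] += SPECIALIST_WEIGHT if is_specialist else GENERALIST_WEIGHT
    # Ties are resolved in favour of the label that was seen first.
    best = max(scores, key=scores.get)
    return best, scores[best]
```

With these assumed weights, one specialist map outvotes a single non-specialist map, while two agreeing non-specialist maps tie with it; a real scoring scheme would need to settle such cases explicitly.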
2.1.3. ESA WorldCover
2.1.4. Other Land Cover Maps Used
2.2. Training, Testing and Validation Sets
3. Methods
3.1. Map Translation with Auto-Encoders
- It requires the training to be redone if changes are made to the input or the output map.
- It does not provide a “common ground” for both maps.
- It is supervised by the output map, with its inaccuracies.
- the reconstruction loss (cross-entropy loss to ensure the auto-encoder correctly reproduces the original map),
- the translation loss (cross-entropy loss to penalize an incorrect translation),
- the embedding loss (mean squared error loss to ensure that the latent space is shared across all maps).
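
As an illustration of how the three losses listed above might be combined, here is a minimal sketch for one pixel and one pair of maps, written in plain Python rather than a deep learning framework. The function names, the argument layout and the weighted-sum combination (with default weights of 1) are assumptions made for this example; the text does not state how the terms are weighted.

```python
import math


def cross_entropy(probs, target):
    """Negative log-probability assigned to the true class index."""
    return -math.log(probs[target])


def mean_squared_error(a, b):
    """Mean squared difference between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def total_loss(rec_probs, src_label, trans_probs, tgt_label, z_src, z_tgt,
               weights=(1.0, 1.0, 1.0)):
    """Combine the three losses for one pixel and one (source, target) map pair.

    rec_probs:    class probabilities obtained when the source embedding
                  is decoded back into the source map's labels.
    trans_probs:  class probabilities obtained when the source embedding
                  is decoded into the target map's labels.
    z_src, z_tgt: latent vectors of the same pixel encoded from each map.
    weights:      hypothetical weighting; the source does not give one.
    """
    reconstruction = cross_entropy(rec_probs, src_label)
    translation = cross_entropy(trans_probs, tgt_label)
    embedding = mean_squared_error(z_src, z_tgt)
    w_rec, w_tra, w_emb = weights
    return w_rec * reconstruction + w_tra * translation + w_emb * embedding
```

The embedding term is zero only when both maps place the pixel at the same point of the latent space, which is what makes that space a shared representation.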
3.2. Training Strategy
3.3. Production of the Final Map: Merging Inference and ECOSG+
3.4. Generation of an Ensemble Land Cover
3.5. Evaluation Method
4. Results
4.1. Evaluation of the Inference against ECOSG+

4.2. Evaluation of ECOSG-ML against LUCAS

4.3. Qualitative Evaluation of the Final Map

4.4. Demonstration of Ensemble Land Cover Generation

5. Discussion
5.1. Limitations
- Obviously wrong classifications. Some pixels may show inconsistent land cover (e.g. lake or river pixels surrounded by sea pixels, or permanent snow at low altitude or latitude).
- Default secondary labels. For some primary labels (e.g. “Crops” or “Forest”), a default secondary label is predicted almost all the time (“19. winter C3 crops” for “Crops”, “8. temperate broadleaf deciduous” or “12. boreal needleleaf evergreen” for “Forest”), as visible in Figure 4. This results in correct primary label classification but incorrect secondary label classification.
- Overly simple ensemble construction. With the current method of generating the members, the random number u used to invert the CDF is the same everywhere on the map. As a result, all locations are modified in the same way, as if the uncertainty varied identically everywhere, which may not be valid. Moreover, only a qualitative evaluation of the ensemble is made here. In particular, how well the ensemble represents the land cover uncertainty has not been established.
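
To make the last limitation concrete, the sketch below generates members by inverting a per-pixel cumulative distribution function (CDF) with one value of u shared by all pixels. It assumes that each pixel carries a vector of class probabilities treated as a categorical distribution; the function names and the nested-list data layout are hypothetical and chosen for readability only.

```python
import random
from itertools import accumulate


def invert_cdf(probs, u):
    """Return the first class index whose cumulative probability exceeds u."""
    for k, c in enumerate(accumulate(probs)):
        if u < c:
            return k
    # Guard against rounding when the probabilities sum to slightly less than 1.
    return len(probs) - 1


def make_member(prob_map, u):
    """Build one ensemble member by applying the same u to every pixel.

    prob_map: 2-D list of per-pixel class-probability vectors.
    """
    return [[invert_cdf(p, u) for p in row] for row in prob_map]


def make_ensemble(prob_map, n_members, seed=0):
    """Draw one u per member (not one per pixel) and build all members."""
    rng = random.Random(seed)
    return [make_member(prob_map, rng.random()) for _ in range(n_members)]
```

Because u is shared, two pixels with identical probability vectors always receive the same label within a member, which is why all locations are perturbed in the same way. Note also that, in this sketch, the outcome depends on the order in which the classes are accumulated.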
5.2. Potential Directions for Improvement
- Enrich input information. Many of the current limitations are due to a lack of input information. The addition of informative variables like elevation or a position encoding would certainly help the network to better detect some labels (such as “6. permanent snow” or the bioclimatic classification). Such complementary information can be added as input to the auto-encoder or in the latent space (therefore as input to the decoder). For example, despite the limitations of ECOSG, it certainly contains valuable information to distinguish some secondary labels. After being projected in the latent space, the information from any land cover has the same resolution and channels, which makes the combination easier.
- Better loss function. The current loss is unaware of class similarities (classes are more similar within the same primary label, for example) and is unweighted. It is possible to put more weight on the loss of some classes, if these classes are critical, or to compensate for an unbalanced training set.
- Better input for CDF inversion. In this work, we used a single random number for all pixels and patches. This is technically very simple, and it is preferable to an independent random draw for each pixel, which would reduce the correlation between geographically close pixels. However, it is not entirely satisfactory because all locations are modified in the same way. A suggestion for future developments is to use a 2-dimensional stochastic process with appropriate properties to generate the members.
5.3. Prospects for Future Use
- Update other components of physiography. To be used in NWP, the whole physiography database must be updated to be consistent with the land cover maps. Other components include Leaf Area Index (LAI), albedo, lake parameters and tree height. In ECOSG, these components are present but stored in a way that is highly dependent on the land cover map (for example, LAI and tree height are only stored for pixels with vegetation or trees). Therefore, although ECOSG is a priori compatible because it is already used in NWP, reusing its values can be complicated. Moreover, the values for the other components might be outdated since ECOSG is a static database. Consequently, we recommend using up-to-date, high-quality sources for these other components as much as possible. Over Europe, Copernicus products (see footnotes 3, 4 and 5) are available. Machine learning can also help to provide up-to-date and fit-for-purpose datasets for these components, as in, e.g., [30].
- Assess benefit of new maps in NWP. Once an updated physiography database is available, the potential benefit of this update will need to be evaluated. In particular, the resolution of ECOSG-ML also allows sub-kilometer NWP experiments to be carried out, for which the influence of the physiography is expected to be large.
- Assess benefit of ensemble land cover maps in physics-driven and data-driven ensemble forecasts. Besides the remaining questions about the representativity of the ensemble, there are open questions on the opportunities for using ensemble land covers in EPS. The effect of using a different land cover for each forecast member is unknown and is, in our opinion, an interesting question.
6. Conclusions
Author Contributions
Funding
Data Availability Statement
- The 6 land cover maps described in this paper (users who do not need the ensemble may download member 0 only), each of these stored as 200 TIF files.
- The weights obtained after training, stored as a PyTorch checkpoint (to be loaded with the provided code).
- The DS1 and DS2 datasets used for training and testing, stored as HDF5 files.
Acknowledgments
Conflicts of Interest
Abbreviations
| DS1 | Dataset for phase 1 of the training (France mainland, 5 maps) |
| DS2 | Dataset for phase 2 of the training (EURAT, 3 maps) |
| ECOSG | ECOCLIMAP-SG: a physiography database currently used in NWP |
| ECOSG+ | ECOCLIMAP-SG+: the land cover map created by [23], used as a reference |
| ECOSG-ML | ECOCLIMAP-SG-ML: the ensemble land cover map described in this manuscript |
| EPS | Ensemble Prediction Systems |
| EURAT | Europe-Atlantic domain (longitudes: -32 to 42, latitudes: 20 to 72) |
| NWP | Numerical Weather Prediction |
References
- Bauer, P.; Thorpe, A.; Brunet, G. The quiet revolution of numerical weather prediction. Nature 2015, 525, 47–55.
- Nuissier, O.; Duffourg, F.; Martinet, M.; Ducrocq, V.; Lac, C. Hectometric-scale simulations of a Mediterranean heavy-precipitation event during the Hydrological cycle in the Mediterranean Experiment (HyMeX) first Special Observation Period (SOP1). Atmospheric Chemistry and Physics 2020, 20, 14649–14667.
- Sabatier, T.; Largeron, Y.; Paci, A.; Lac, C.; Rodier, Q.; Canut, G.; Masson, V. Semi-idealized simulations of wintertime flows and pollutant transport in an Alpine valley. Part II: Passive tracer tracking. Quarterly Journal of the Royal Meteorological Society 2020, 146, 827–845.
- Lemonsu, A.; Alessandrini, J.; Capo, J.; Claeys, M.; Cordeau, E.; de Munck, C.; Dahech, S.; Dupont, J.; Dugay, F.; Dupuis, V.; others. The heat and health in cities (H2C) project to support the prevention of extreme heat in cities, 2024.
- Seity, Y.; Brousseau, P.; Malardel, S.; Hello, G.; Bénard, P.; Bouttier, F.; Lac, C.; Masson, V. The AROME-France Convective-Scale Operational Model. Monthly Weather Review 2011, 139, 976–991.
- Bengtsson, L.; Andrae, U.; Aspelien, T.; Batrak, Y.; Calvo, J.; Rooy, W.d.; Gleeson, E.; Hansen-Sass, B.; Homleid, M.; Hortal, M.; Ivarsson, K.I.; Lenderink, G.; Niemelä, S.; Nielsen, K.P.; Onvlee, J.; Rontu, L.; Samuelsson, P.; Muñoz, D.S.; Subias, A.; Tijm, S.; Toll, V.; Yang, X.; Køltzow, M.Ø. The HARMONIE–AROME Model Configuration in the ALADIN–HIRLAM NWP System. Monthly Weather Review 2017, 145, 1919–1935.
- Masson, V.; Le Moigne, P.; Martin, E.; Faroux, S.; Alias, A.; Alkama, R.; Belamari, S.; Barbu, A.; Boone, A.; Bouyssel, F.; Brousseau, P.; Brun, E.; Calvet, J.C.; Carrer, D.; Decharme, B.; Delire, C.; Donier, S.; Essaouini, K.; Gibelin, A.L.; Giordani, H.; Habets, F.; Jidane, M.; Kerdraon, G.; Kourzeneva, E.; Lafaysse, M.; Lafont, S.; Lebeaupin Brossier, C.; Lemonsu, A.; Mahfouf, J.F.; Marguinaud, P.; Mokhtari, M.; Morin, S.; Pigeon, G.; Salgado, R.; Seity, Y.; Taillefer, F.; Tanguy, G.; Tulet, P.; Vincendon, B.; Vionnet, V.; Voldoire, A. The SURFEXv7.2 land and ocean surface platform for coupled or offline simulation of earth surface variables and fluxes. Geoscientific Model Development 2013, 6, 929–960.
- Le Moigne, P.; Boone, A.; Calvet, J.C.; Decharme, B.; Faroux, S.; Gibelin, A.L.; Lebeaupin, C.; Mahfouf, J.F.; Martin, E.; Masson, V. SURFEX scientific documentation. Note de centre (CNRM/GMME), Météo-France, Toulouse, France 2009, 268.
- Zanaga, D.; Van De Kerchove, R.; Daems, D.; De Keersmaecker, W.; Brockmann, C.; Kirches, G.; Wevers, J.; Cartus, O.; Santoro, M.; Fritz, S.; Lesiv, M.; Herold, M.; Tsendbazar, N.; Xu, P.; Ramoino, F.; Arino, O. ESA WorldCover 10 m 2021 v200. 2022.
- Malinowski, R.; Lewiński, S.; Rybicki, M.; Gromny, E.; Jenerowicz, M.; Krupiński, M.; Nowakowski, A.; Wojtkowski, C.; Krupiński, M.; Krätzschmar, E.; Schauer, P. Automated Production of a Land Cover/Use Map of Europe Based on Sentinel-2 Imagery. Remote Sensing 2020, 12, 3523.
- Venter, Z.S.; Sydenham, M.A.K. Continental-Scale Land Cover Mapping at 10 m Resolution Over Europe (ELC10). Remote Sensing 2021, 13, 2301.
- Mirmazloumi, S.M.; Kakooei, M.; Mohseni, F.; Ghorbanian, A.; Amani, M.; Crosetto, M.; Monserrat, O. ELULC-10, a 10 m European Land Use and Land Cover Map Using Sentinel and Landsat Data in Google Earth Engine. Remote Sensing 2022, 14, 3041.
- Sumbul, G.; de Wall, A.; Kreuziger, T.; Marcelino, F.; Costa, H.; Benevides, P.; Caetano, M.; Demir, B.; Markl, V. BigEarthNet-MM: A Large Scale Multi-Modal Multi-Label Benchmark Archive for Remote Sensing Image Classification and Retrieval. IEEE Geoscience and Remote Sensing Magazine 2021, 9, 174–180.
- Schmitt, M.; Hughes, L.H.; Qiu, C.; Zhu, X.X. SEN12MS – A Curated Dataset of Georeferenced Multi-Spectral Sentinel-1/2 Imagery for Deep Learning and Data Fusion, 2019. arXiv:1906.07789 [cs].
- Zhang, D.; Zhao, J.; Chen, J.; Zhou, Y.; Shi, B.; Yao, R. Edge-aware and spectral–spatial information aggregation network for multispectral image semantic segmentation. Engineering Applications of Artificial Intelligence 2022, 114, 105070.
- Aksoy, A.K.; Ravanbakhsh, M.; Kreuziger, T.; Demir, B. A Consensual Collaborative Learning Method for Remote Sensing Image Classification Under Noisy Multi-Labels. 2021 IEEE International Conference on Image Processing (ICIP), 2021, pp. 3842–3846. ISSN: 2381-8549.
- Baudoux, L.; Inglada, J.; Mallet, C. Toward a Yearly Country-Scale CORINE Land-Cover Map without Using Images: A Map Translation Approach. Remote Sensing 2021, 13, 1060.
- Baudoux, L.; Inglada, J.; Mallet, C. Multi-nomenclature, multi-resolution joint translation: an application to land-cover mapping. International Journal of Geographical Information Science 2023, 37, 403–437.
- Gneiting, T.; Raftery, A.E. Weather forecasting with ensemble methods. Science 2005, 310, 248–249.
- Frogner, I.L.; Andrae, U.; Bojarova, J.; Callado, A.; Escribà, P.; Feddersen, H.; Hally, A.; Kauhanen, J.; Randriamampianina, R.; Singleton, A.; others. HarmonEPS—the HARMONIE ensemble prediction system. Weather and Forecasting 2019, 34, 1909–1937.
- Ben Bouallègue, Z.; Clare, M.C.; Magnusson, L.; Gascon, E.; Maier-Gerber, M.; Janoušek, M.; Rodwell, M.; Pinault, F.; Dramsch, J.S.; Lang, S.T.; others. The rise of data-driven weather forecasting: A first statistical assessment of machine learning-based weather forecasts in an operational-like context. Bulletin of the American Meteorological Society, 2024.
- Oskarsson, J.; Landelius, T.; Lindsten, F. Graph-based neural weather prediction for limited area modeling. arXiv preprint arXiv:2309.17370, 2023.
- Bessardon, G.; Rieutord, T.; Gleeson, E.; Palmason, B.; Oswald, S. High-resolution land use land cover dataset for meteorological modelling – Part 1: ECOCLIMAP-SG+ an agreement-based dataset. Land 2024.
- Venter, Z.S.; Barton, D.N.; Chakraborty, T.; Simensen, T.; Singh, G. Global 10 m land use land cover datasets: A comparison of dynamic world, world cover and esri land cover. Remote Sensing 2022, 14, 4101.
- Inglada, J.; Vincent, A.; Arias, M.; Tardy, B.; Morin, D.; Rodes, I. Operational High Resolution Land Cover Map Production at the Country Scale Using Satellite Image Time Series. Remote Sensing 2017, 9, 95.
- EEA. CORINE Land Cover 2018 (vector), Europe, 6-yearly - version 2020_20u1, May 2020, 2018.
- Ballin, M.; Barcaroli, G.; Masselli, G. New LUCAS 2022 sample and subsamples design: Criticalities and solutions. Technical report, Publications Office of the European Union, 2022.
- Devroye, L. Non-Uniform Random Variate Generation; Springer: New York, NY, 1986.
- Fawcett, T. An introduction to ROC analysis. Pattern recognition letters 2006, 27, 861–874.
- Keany, E.; Bessardon, G.; Gleeson, E. Using machine learning to produce a cost-effective national building height map of Ireland to categorise local climate zones. Advances in Science and Research. Copernicus GmbH, 2022, Vol. 19, pp. 13–27.
| 1 | The ECOCLIMAP-SG wiki: https://opensource.umr-cnrm.fr/projects/ecoclimap-sg/wiki (last accessed 2024/10/15 11:15:12) |
| 2 | Zenodo archive: https://doi.org/10.5281/zenodo.5843595 (last accessed 2024/10/15 11:15:12) |
| 3 | Leaf area index: https://land.copernicus.eu/en/products/vegetation/high-resolution-leaf-area-index (last accessed 2024/10/15 11:15:12) |
| 4 | Albedo: https://www.copernicus.eu/en/global-land-surface-albedo (last accessed 2024/10/15 11:15:12) |
| 5 | Building height: https://land.copernicus.eu/api/en/products/urban-atlas/building-height-2012 (last accessed 2024/10/15 11:15:12) |



| Primary label | Inference (primary) | ECOSG (primary) | Secondary label | Inference (secondary) | ECOSG (secondary) | Support |
|---|---|---|---|---|---|---|
| Water | 0.9149 | 0.4402 | 1. sea and oceans | 0.8185 | 0.5937 | 2% |
| | | | 2. lakes | 0.7615 | 0.305 | 3% |
| | | | 3. rivers | 0.3268 | 0.0812 | 140570 |
| Bare | 0.8767 | 0.6892 | 4. bare land | 0.645 | 0.4758 | 1% |
| | | | 5. bare rock | 0.7874 | 0.0242 | 1% |
| Snow | 0.7018 | 0.4119 | 6. permanent snow | 0.7018 | 0.4119 | 83305 |
| Forest | 0.8806 | 0.6206 | 7. boreal broadleaf deciduous | 0.311 | 0.3506 | 397158 |
| | | | 8. temperate broadleaf deciduous | 0.6397 | 0.4017 | 14% |
| | | | 9. tropical broadleaf deciduous | - | - | 0 |
| | | | 10. temperate broadleaf evergreen | 0.016 | 0.0808 | 25087 |
| | | | 11. tropical broadleaf evergreen | - | - | 0 |
| | | | 12. boreal needleleaf evergreen | 0.7628 | 0.6183 | 12% |
| | | | 13. temperate needleleaf evergreen | 0.5098 | 0.2372 | 7% |
| | | | 14. boreal needleleaf deciduous | - | - | 41558 |
| Shrubs | 0.0855 | 0.0509 | 15. shrubs | 0.0855 | 0.0509 | 228646 |
| Grass | 0.6983 | 0.428 | 16. boreal grassland | 0.5956 | 0.0574 | 1% |
| | | | 17. temperate grassland | 0.6848 | 0.4222 | 10% |
| | | | 18. tropical grassland | - | 0.0072 | 597 |
| Crops | 0.8513 | 0.6773 | 19. winter C3 crops | 0.7021 | 0.5265 | 25% |
| | | | 20. summer C3 crops | 0.0 | 0.1015 | 3% |
| | | | 21. C4 crops | 0.2624 | 0.1984 | 8% |
| Flooded | 0.5621 | 0.2335 | 22. flooded trees | 0.0118 | - | 53089 |
| | | | 23. flooded grassland | 0.5478 | 0.2293 | 1% |
| Urban | 0.7543 | 0.3387 | 24. LCZ1: compact high-rise | - | 0.0284 | 8955 |
| | | | 25. LCZ2: compact midrise | 0.3257 | 0.1207 | 53105 |
| | | | 26. LCZ3: compact low-rise | 0.0697 | 0.0683 | 33709 |
| | | | 27. LCZ4: open high-rise | 0.0272 | - | 9746 |
| | | | 28. LCZ5: open midrise | 0.282 | 0.0676 | 139875 |
| | | | 29. LCZ6: open low-rise | 0.6833 | 0.0781 | 1% |
| | | | 30. LCZ7: lightweight low-rise | - | - | 38 |
| | | | 31. LCZ8: large low-rise | 0.4995 | 0.103 | 254488 |
| | | | 32. LCZ9: sparsely built | 0.434 | 0.1319 | 3% |
| | | | 33. LCZ10: heavy industry | 0.0998 | 0.0881 | 10641 |
| Overall accuracy | 0.831 | 0.583 | Overall accuracy | 0.634 | 0.411 | 50M |
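
The table reports an overall accuracy at both the primary and the secondary label level. One plausible way to obtain both figures from a single pair of secondary-label maps is sketched below. The grouping of secondary labels into primary labels is read from the table itself; the function names are hypothetical, and the per-label scores in the table are not reproduced because the section does not name the metric behind them.

```python
# Secondary-label ranges per primary label, read from the table above.
PRIMARY_RANGES = {
    "Water": range(1, 4),
    "Bare": range(4, 6),
    "Snow": range(6, 7),
    "Forest": range(7, 15),
    "Shrubs": range(15, 16),
    "Grass": range(16, 19),
    "Crops": range(19, 22),
    "Flooded": range(22, 24),
    "Urban": range(24, 34),
}
SECONDARY_TO_PRIMARY = {s: p for p, r in PRIMARY_RANGES.items() for s in r}


def overall_accuracy(predicted, reference):
    """Fraction of positions where the two label sequences agree."""
    return sum(a == b for a, b in zip(predicted, reference)) / len(reference)


def two_level_accuracy(predicted, reference):
    """Return (primary accuracy, secondary accuracy) for secondary-label inputs."""
    prim_pred = [SECONDARY_TO_PRIMARY[s] for s in predicted]
    prim_ref = [SECONDARY_TO_PRIMARY[s] for s in reference]
    return (overall_accuracy(prim_pred, prim_ref),
            overall_accuracy(predicted, reference))
```

For instance, predicting "19. winter C3 crops" where the reference is "20. summer C3 crops" counts as correct at the primary level ("Crops") but as an error at the secondary level, which is why the primary accuracy can never be lower than the secondary one.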
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).