Submitted: 09 June 2024
Posted: 11 June 2024
Abstract
Keywords:
1. Introduction
2. Materials and Methods
2.1. Study Area
2.2. Imagery Data Set
2.2.1. NAIP Orthoimagery
2.2.2. Sentinel Multi-Spectral Imagery
2.3. Layer Generation and Dicing
2.4. Selection of Features
2.5. Machine Learning Algorithms: Level 0
2.5.1. Random Forest
2.5.2. Gradient Boosting
2.6. Stacking Ensemble Machine Learning
2.7. Accuracy Assessment
3. Results
3.1. Level 0 and Level 1 Classification Comparison
Columns Cropland through Shadow are land use/land cover classes.

| Model | Accuracy statistic | Cropland | Grassland | Shrubland | Built-up | Water | Shadow |
|---|---|---|---|---|---|---|---|
| Random Forest | Precision | 0.9804 | 0.8977 | 0.9669 | 0.9015 | 0.9737 | 1 |
| Random Forest | Recall | 0.9259 | 0.965 | 0.9733 | 0.9632 | 0.74 | 0.75 |
| Random Forest | F1-Statistic | 0.9524 | 0.9301 | 0.9701 | 0.9313 | 0.8409 | 0.8571 |
| Random Forest | Overall | Overall accuracy 0.9326; Kappa 0.9141; MCC 0.9149 | | | | | |
| GBM | Precision | 0.9813 | 0.9363 | 0.9605 | 0.9086 | 0.9762 | 0.9 |
| GBM | Recall | 0.9722 | 0.955 | 0.9733 | 0.9421 | 0.82 | 0.8182 |
| GBM | F1-Statistic | 0.9767 | 0.9455 | 0.9669 | 0.9251 | 0.8913 | 0.8571 |
| GBM | Overall | Overall accuracy 0.9407; Kappa 0.9248; MCC 0.925 | | | | | |
| XGB | Precision | 0.9811 | 0.9234 | 0.973 | 0.9278 | 0.9545 | 0.878 |
| XGB | Recall | 0.963 | 0.965 | 0.96 | 0.9474 | 0.84 | 0.8182 |
| XGB | F1-Statistic | 0.972 | 0.9438 | 0.9664 | 0.9375 | 0.8936 | 0.8471 |
| XGB | Overall | Overall accuracy 0.942; Kappa 0.9265; MCC 0.9267 | | | | | |
| Stacking | Precision | 0.972 | 0.9363 | 0.9613 | 0.9424 | 0.9545 | 0.9024 |
| Stacking | Recall | 0.963 | 0.955 | 0.9933 | 0.9474 | 0.84 | 0.8409 |
| Stacking | F1-Statistic | 0.9674 | 0.9455 | 0.977 | 0.9449 | 0.8936 | 0.8706 |
| Stacking | Overall | Overall accuracy 0.9474; Kappa 0.9334; MCC 0.9335 | | | | | |
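
The F1-Statistic rows in the table above are consistent with the usual definition of F1 as the harmonic mean of precision and recall. The following is a minimal sketch of that check, not the authors' code: the helper name `f1_score` and the dictionary `random_forest` are illustrative assumptions, and the input values are copied from the Random Forest rows of the table.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the Random Forest rows of the table.
# The dictionary itself is only an illustrative container.
random_forest = {
    "Cropland": (0.9804, 0.9259),
    "Water": (0.9737, 0.74),
    "Shadow": (1.0, 0.75),
}

# Recompute F1 per class and round to the four decimals used in the table.
f1_by_class = {
    name: round(f1_score(p, r), 4) for name, (p, r) in random_forest.items()
}

for name, f1 in f1_by_class.items():
    print(f"{name}: F1 = {f1}")
```

Because precision and recall are themselves rounded in the table, a recomputed F1 may differ from the reported one in the last decimal place.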

3.2. Identifying and Reducing Overfitting

3.3. Contribution of Features
4. Discussion
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A

Appendix B: Confusion Matrices Produced on Independent Validation Data Points, Left (Random Cross-Validation), Right (Target-Oriented Cross-Validation). Validation Data Points Were Generated Using Random Sampling, and Labeling Was Performed Using Google Earth Pro.
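
The three agreement statistics reported in this paper (overall accuracy, Kappa, and MCC) can all be derived from a confusion matrix of the kind shown in this appendix. The sketch below uses the standard definitions of Cohen's kappa and the multiclass Matthews correlation coefficient. The function name `agreement_metrics` and the three-class matrix are hypothetical, and the counts are invented for illustration, not taken from the study.

```python
from math import sqrt


def agreement_metrics(matrix):
    """Return (overall accuracy, Cohen's kappa, multiclass MCC).

    `matrix` is a square confusion matrix given as a list of rows;
    rows are reference labels and columns are predicted labels.
    """
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    row_sums = [sum(row) for row in matrix]        # reference totals per class
    col_sums = [sum(col) for col in zip(*matrix)]  # predicted totals per class
    cross = sum(r * c for r, c in zip(row_sums, col_sums))

    overall = correct / total
    chance = cross / total**2                      # expected chance agreement
    kappa = (overall - chance) / (1 - chance)

    numerator = correct * total - cross
    denominator = sqrt(
        (total**2 - sum(c * c for c in col_sums))
        * (total**2 - sum(r * r for r in row_sums))
    )
    mcc = numerator / denominator if denominator else 0.0
    return overall, kappa, mcc


# Hypothetical 3-class confusion matrix (illustrative counts only).
example = [
    [50, 2, 3],
    [4, 40, 1],
    [2, 3, 45],
]

oa, kappa, mcc = agreement_metrics(example)
print(f"OA={oa:.4f}  Kappa={kappa:.4f}  MCC={mcc:.4f}")
```

Kappa and MCC share the same numerator up to scaling and differ only in how the marginals enter the denominator. This is why the two statistics track each other closely in the reported tables.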

Appendix C: Permutation-Based Variable Importance of Base Learners (RF (A), GBM (B), XGB (C)) and Stack Learner (Stack-XGB) in Random Cross-Validation Approach.

Appendix D: Comparisons of Base-Learner and Stacking Ensemble Performance Using Target-Oriented Validation.





| Cross-Validation | Classifier | Overall Accuracy (%) | Kappa (%) | MCC (%) |
|---|---|---|---|---|
| Random-CV | RF | 89.64 | 86.94 | 87.45 |
| Random-CV | GBM | 90.6 | 88.17 | 88.54 |
| Random-CV | XGB | 90.75 | 88.35 | 88.75 |
| Random-CV | STACK | 91.08 | 88.79 | 89.15 |
| LLO-CV | RF | 89.93 | 87.31 | 87.77 |
| LLO-CV | GBM | 92.92 | 91.09 | 91.26 |
| LLO-CV | XGB | 92.96 | 91.15 | 91.33 |
| LLO-CV | STACK | 93.98 | 92.43 | 92.57 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).