Preprint Article Version 2 Preserved in Portico This version is not peer-reviewed

Improving Land Cover Classification Using Genetic Programming for Feature Construction

Version 1 : Received: 7 October 2020 / Approved: 8 October 2020 / Online: 8 October 2020 (09:21:34 CEST)
Version 2 : Received: 23 December 2020 / Approved: 24 December 2020 / Online: 24 December 2020 (08:59:19 CET)

How to cite: Batista, J.; Cabral, A.; Vasconcelos, M.; Vanneschi, L.; Silva, S. Improving Land Cover Classification Using Genetic Programming for Feature Construction. Preprints 2020, 2020100168. https://doi.org/10.20944/preprints202010.0168.v2

Abstract

Genetic Programming (GP) is a powerful Machine Learning (ML) algorithm that can produce readable white-box models. Although it has been used successfully to solve an array of problems in different scientific areas, GP is still not well known in Remote Sensing. The M3GP algorithm, a variant of the standard GP algorithm, performs Feature Construction by evolving hyper-features from the original ones. In this work, we use the M3GP algorithm on several sets of satellite images over different countries to create hyper-features from satellite bands to improve the classification of land cover types. We add the evolved hyper-features to the reference datasets and observe a significant improvement in the performance of three state-of-the-art ML algorithms (Decision Trees, Random Forests and XGBoost) on multiclass classifications and no significant effect on the binary classifications. We show that adding the M3GP hyper-features to the reference datasets brings better results than adding the well-known spectral indices NDVI, NDWI and NBR. We also compare the performance of the M3GP hyper-features in the binary classification problems with that of hyper-features created by other Feature Construction methods such as FFX and EFS.
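
As an illustration of what "adding spectral indices to the reference datasets" can look like, the sketch below extends a single pixel's band values with NDVI, NDWI and NBR. It is not taken from the manuscript: the band keys (`green`, `red`, `nir`, `swir2`), the helper names, and the sample reflectance values are hypothetical, and NDWI is computed here with the green/NIR definition, which is only one of several definitions in use and may differ from the one the authors adopt. An evolved hyper-feature would presumably be appended in the same way, as one more column.

```python
from typing import Dict


def normalized_difference(a: float, b: float) -> float:
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def add_spectral_indices(pixel: Dict[str, float]) -> Dict[str, float]:
    """Return a copy of the band values extended with NDVI, NDWI and NBR.

    The band keys are hypothetical names chosen for this sketch.
    """
    extended = dict(pixel)
    # NDVI: near-infrared versus red.
    extended["NDVI"] = normalized_difference(pixel["nir"], pixel["red"])
    # NDWI: green versus near-infrared (assumed definition; others exist).
    extended["NDWI"] = normalized_difference(pixel["green"], pixel["nir"])
    # NBR: near-infrared versus shortwave infrared.
    extended["NBR"] = normalized_difference(pixel["nir"], pixel["swir2"])
    return extended


# Made-up reflectance values for a single vegetated pixel.
pixel = {"green": 0.08, "red": 0.05, "nir": 0.45, "swir2": 0.15}
extended = add_spectral_indices(pixel)
```

The original bands are kept and the indices are added alongside them, so a classifier trained on `extended` sees both the raw and the derived features.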

Keywords

Genetic Programming; Evolutionary Computation; Machine Learning; Classification; Multiclass Classification; Feature Construction; Hyper-features; Spectral Indices

Subject

Computer Science and Mathematics, Algebra and Number Theory

Comments (1)

Comment 1
Received: 24 December 2020
Commenter: João Batista
Commenter's Conflict of Interests: Author
Comment: After an initial round of reviews, this manuscript was extended with details such as:
- A more detailed explanation of the M3GP algorithm;
- Inclusion of more information about the climate in the Study Areas;
- Commentaries on the popularity of each original feature in the creation of hyper-features for each problem;
- Commentaries on the impact of the hyper-features in each class (rather than just overall accuracy).