Assessing the Potential of Multi-spectral and Multi-temporal Satellite Images for Classification and Mapping of Plant Communities in a Temperate Region

Classification and mapping of plant communities is an essential step for conservation and management of ecosystems and biodiversity. We adopt the Genus-Physiognomy-Ecosystem (GPE) system developed in a previous study for satellite-based classification of plant communities. This paper assesses the potential of multi-spectral and multi-temporal images collected by the Sentinel-2 satellites. The research was conducted in five representative study sites in a temperate region, which together contain 44 types of plant communities, including a few land cover types. The plant community types were enumerated in the study sites, and ground truth data were prepared with reference to extant vegetation surveys, visual interpretation of high-resolution images, and onsite field observations. We acquired all Sentinel-2 Level-1C product images available for the study sites between 2017 and 2019 and generated monthly median composite images consisting of ten spectral bands and twelve spectral indices. A Gradient Boosting Decision Trees (GBDT) classifier was employed as an efficient and distributed gradient boosting technique for the supervised classification of the large datasets involved in the research. The cross-validation accuracy in terms of kappa coefficient varied from 87% at the Oze site, with 41 land cover and plant community types, to 95% at the Hakkoda site, with 19 land cover and plant community types, with an average of 91% across all sites. In addition, the resulting maps showed a clear distribution of the plant community types at all sites, highlighting the potential of Sentinel-2 multi-spectral and multi-temporal images combined with the GPE classification system for operational and broad-scale mapping of land cover and plant communities.


Introduction
Classification and mapping of plant communities is an essential step for conservation and management of ecosystems and biodiversity. In recent years, the availability of free and open access data, high performance computing, and automated data processing and analysis capabilities has brought new opportunities for classification and mapping of plant communities from remotely sensed images (Murakami and Mochizuki, 2014; Wulder, 2018). In contrast to potential natural vegetation mapping based on climatic parameters available at coarse spatial resolution (Hengl et al., 2018), actual vegetation mapping (Bredenkamp et al., 1998; Su et al., 2020) with recently available satellite images can provide much more detailed information at higher spatial resolution for improving the knowledge of plant communities.
In Japan, a wide variety of land cover and plant communities exists, ranging from Southern Subtropical Forests to Northern Arctic Meadows (Numata et al., 1972; Miyawaki, 1984; Himiyama, 1998). Nationwide vegetation surveys have been conducted continuously since 1973, and the distribution of plant communities is well known. The first vegetation survey of the entire country was completed in 1999 with the production of vegetation survey maps at 1:50,000 scale (MoE and AAS, 1999). Since 1999, extensive field surveys have been repeated and a 1:25,000 scale vegetation survey map is being produced nationwide (Hioki, 2007). The vegetation survey organizes plant communities into phyto-sociological units (Miyawaki, 1968; Ohno, 2006). The plant communities are recognized through field surveys and delineated in a geographical environment via a manual procedure facilitated by visual interpretation of aerial and satellite images. The manual delineation procedure is subject to human discernment, laborious, and costly. A more automated technology is needed to cope with these issues.
The major objective of this paper is to assess the potential of multi-spectral and multi-temporal images available from the Sentinel-2 mission satellites (Sentinel-2A and 2B) for operational and broad-scale mapping of land cover and plant community types by adopting the Genus-Physiognomy-Ecosystem (GPE) system developed in a previous study for satellite-based classification of plant communities.

Study sites
This research was conducted in five representative sites of the Tohoku region in Japan. The five study sites were selected so that together they represent all land cover and plant community types present in the Tohoku region. The location of the five study sites is shown in Figure 1.

Preparation of ground truth data
First, the land cover and plant community types present in the five study sites were enumerated by adopting the Genus-Physiognomy-Ecosystem (GPE) system developed in a previous study (Sharma, 2021) for satellite-based classification and mapping of plant communities. Extant vegetation survey reports available from the Nature Conservation Bureau, Ministry of the Environment, and Asia Air Survey Co., Ltd. were utilized as reference materials for enumerating land cover and plant community types in each study site. The land cover and plant community types were further verified by onsite field observations conducted between 2017 and 2020 in all study sites. The final confirmed list of land cover and plant community types present in the five study sites is given in Table 1. The ground truth data, polygons of around 1 ha representing homogeneous land cover and plant community types, were collected with reference to extant vegetation survey maps (1:25,000 scale) produced from extensive field surveys between 2012 and 2020, and to visual interpretation of time-lapse images available in Google Earth by local experts in plant ecology and vegetation sciences. The distribution of ground truth data in the study sites is shown in Figure 2.

Processing of satellite data
We acquired all Level-1C product images collected by the Sentinel-2 mission satellites (Sentinel-2A and 2B) for the study sites between 2017 and 2019. The Sentinel-2 mission satellites collect optical imagery at high spatial resolution (10-60 m) in visible, near infrared, and shortwave infrared wavelengths at a revisit frequency of five days (Drusch et al., 2012). The images were processed for cloud masking, and ten spectral bands (blue, green, red, red edge 1-3, near infrared, mid infrared, and shortwave infrared 1-2) were extracted. For each scene, twelve vegetation indices (listed in Table 2) were also calculated. The spectral band and spectral index images were composited by computing monthly median values. In this manner, we generated 264 features in total (22 spectral bands and spectral indices × 12 months) for machine learning, classification, and mapping.
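The compositing step described above can be sketched roughly as follows. This is a minimal illustration in NumPy, not the processing chain used in the study: the array layout, the function names `ndvi` and `monthly_median_features`, and the use of NaN to mark cloud-masked pixels are all assumptions made for the example. NDVI is shown only as a familiar example of a vegetation index; the twelve indices actually used are those listed in Table 2.

```python
import numpy as np

def ndvi(nir, red):
    """Normalized difference vegetation index; NaN where nir + red is zero."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    denom = nir + red
    out = np.full(denom.shape, np.nan)
    np.divide(nir - red, denom, out=out, where=denom != 0)
    return out

def monthly_median_features(scenes, months):
    """Stack per-month median composites into one feature array.

    scenes: (n_scenes, n_layers, H, W), NaN marks cloud-masked pixels (assumed layout).
    months: (n_scenes,) calendar month of each scene, 1 to 12.
    Returns an array of shape (12 * n_layers, H, W), ordered month by month.
    """
    scenes = np.asarray(scenes, dtype=float)
    months = np.asarray(months)
    _, n_layers, h, w = scenes.shape
    composites = np.full((12, n_layers, h, w), np.nan)
    for m in range(1, 13):
        selected = scenes[months == m]
        if selected.shape[0] > 0:
            # The median ignores masked (NaN) observations at each pixel.
            composites[m - 1] = np.nanmedian(selected, axis=0)
    return composites.reshape(12 * n_layers, h, w)

# Synthetic stand-in: 3 years x 12 months of scenes, 22 layers, 4 x 4 pixels.
rng = np.random.default_rng(0)
scenes = rng.random((36, 22, 4, 4))
months = np.tile(np.arange(1, 13), 3)
scenes[0, :, 0, 0] = np.nan  # pretend one pixel is cloud-masked in one scene
features = monthly_median_features(scenes, months)
print(features.shape)
```

With 22 layers per scene (ten bands plus twelve indices), the stacked array has 22 × 12 = 264 feature layers per pixel, matching the feature count given in the text.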

Machine learning and classification
We employed a Gradient Boosting Decision Trees (GBDT) classifier implemented in XGBoost, an efficient and optimized distributed gradient boosting library (https://github.com/dmlc/xgboost), for the supervised classification of Sentinel-2 images, as it can handle large data volumes with Compute Unified Device Architecture (CUDA) computations. We implemented a train-test split method for fine tuning of input features and model parameters. For this method, ground truth data were shuffled and randomly split into train (75%) and test (25%) sets. The GBDT model was trained on the training data, whereas the test data were utilized for fine tuning the parameters of the model. Classification accuracy metrics (Accuracy, Kappa coefficient, F1-score, Recall, and Precision) were utilized for quantitative evaluation. The GBDT model established in this way was utilized for prediction and mapping of land cover and plant community types separately for each site.
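As an illustration of the evaluation procedure, the following NumPy sketch shows a shuffled 75/25 split and the computation of the listed accuracy metrics from a confusion matrix. It is a sketch under stated assumptions rather than the code used in the study: the function names are hypothetical, the classifier itself is omitted (any model that returns predicted class labels, such as an XGBoost classifier, would take its place), and macro-averaging of precision, recall, and F1-score across classes is an assumption, since the averaging scheme is not stated in the text.

```python
import numpy as np

def shuffle_split(n_samples, test_fraction=0.25, seed=0):
    """Return shuffled (train_idx, test_idx) index arrays for a random split."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_fraction))
    return order[n_test:], order[:n_test]

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are reference classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=np.int64)
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm

def accuracy_metrics(cm):
    """Overall accuracy, Cohen's kappa, and macro precision, recall, F1."""
    cm = cm.astype(float)
    total = cm.sum()
    tp = np.diag(cm)
    row, col = cm.sum(axis=1), cm.sum(axis=0)
    p_o = tp.sum() / total                 # observed agreement
    p_e = (row * col).sum() / total ** 2   # agreement expected by chance
    precision = np.divide(tp, col, out=np.zeros_like(tp), where=col > 0)
    recall = np.divide(tp, row, out=np.zeros_like(tp), where=row > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom,
                   out=np.zeros_like(tp), where=denom > 0)
    return {
        "accuracy": p_o,
        "kappa": (p_o - p_e) / (1 - p_e),  # undefined if p_e == 1
        "precision": precision.mean(),
        "recall": recall.mean(),
        "f1": f1.mean(),
    }

# 75/25 split of 100 hypothetical ground truth samples.
train_idx, test_idx = shuffle_split(100)

# Toy labels standing in for test-set reference and predicted classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]
metrics = accuracy_metrics(confusion_matrix(y_true, y_pred, n_classes=3))
print(metrics)
```

The kappa coefficient corrects the overall accuracy for the agreement expected by chance, which is why it is a common headline metric when the number of classes differs between sites.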

Model test results
The model test results obtained from machine learning (GBDT classifier) on the multi-temporal Sentinel-2 images are shown as confusion matrices (Figures 3-5) for three sites (Hakkoda, Zao, and Shirakami). Because of the large number of classes involved, class-wise accuracy tables (Tables 3 and 4) are shown instead for two sites (Oze and Kitakami). The classification accuracy metrics obtained for all study sites are summarized in Table 5. The classification accuracy in terms of kappa coefficient varied from 87% at the Oze site with 41 classes to 95% at the Hakkoda site with 19 classes.

Land Cover and Plant Community Maps
The Land Cover and Plant Community Maps produced in this research are shown in Figures 6-10. These maps clearly show the extent and distribution of land cover and plant community types in each study site.
Preparation of ground truth data becomes very difficult, time-consuming, and expensive as the heterogeneity and complexity of plant community types increase. Even with large amounts of high quality ground truth data, classification of satellite images becomes increasingly challenging as the number of classes increases. On the other hand, the phyto-sociological classes based on characteristic species (Poore, 1955; Whittaker, 1980; Miyawaki and Fujiwara, 1988), as delineated by the nationwide vegetation survey, are out of reach of automated digital mapping approaches, because remote sensing signals are mostly governed by physical interactions with dominant species rather than characteristic species. Therefore, an appropriate and effective organization of plant communities is essential for operational and broad-scale mapping. In line with this, the Genus-Physiognomy-Ecosystem (GPE) system, developed by Sharma (2021) for the classification of plant communities from the perspective of satellite remote sensing, was extended in this research for operational mapping of land cover and plant community types collectively.

Conclusions
In this research, we presented operational mapping of land cover and plant community types in five study sites in a temperate region of Japan using multi-spectral and multi-temporal Sentinel-2 images. Machine learning based accuracy analysis showed the potential of Sentinel-2 images for mapping land cover and plant community types with the Genus-Physiognomy-Ecosystem (GPE) system, as the kappa coefficient varied from 87% (41 classes at the Oze site) to 95% (19 classes at the Hakkoda site). Still, misclassifications were detected in some classes, such as Betula DBF, Alnus DBF, Fagus DBF, Quercus DBF, Picea ECF, Hydrangea Shrub, and Zoysia Herb, particularly at sites with many classes. A further increase in the temporal resolution of Sentinel-2 images with the future launches of the Sentinel-2C and 2D satellites is expected to improve the classification accuracy of plant communities. Our future plan is to expand this methodology to seamless mapping of plant communities in the same region by further increasing the ground truth data.