Preprint
Article

This version is not peer-reviewed.

Evaluating the Influence of Pseudo Tree Crown (PTC) Input Alternatives for Machine Learning and Deep Learning Models on Individual Tree Classification Performance

Submitted: 23 April 2026
Posted: 23 April 2026


Abstract
Individual tree classification has a long history of diverse development, with recent trends focusing on the adoption of machine learning and deep learning approaches. These approaches are simple and powerful, letting the model run on auto-pilot, but they weaken the need for an understanding of physical characteristics. Over more than a decade of research, we have focused on establishing a direct representation of individual trees that bridges 2D top-down imagery and true 3D models. In this study, we investigated the fundamental question of how the input data influence these ML/DL models. In 2024, we introduced a novel data transformation method, the Pseudo Tree Crown (PTC), which provides a pseudo-3D pixel-value perspective that enhances the informational richness of images and significantly improves classification performance. Our original implementation was successfully tested on urban and deciduous trees in 2024 and was later extended to Canadian natural conifer species under snow conditions in 2025. However, the original PTC relied on the green band, limiting its applicability to green-leaf species. In this study, we analyzed and compared the performance of different data variations and transformations, such as the Green-Red Vegetation Index (GRVI) and Principal Component Analysis (PCA), both as direct inputs and in their PTC forms. Classifications were conducted using Random Forest, ResNet50, and YOLOv10. The results confirmed the effectiveness of the PTC, which consistently improves classification accuracy by at least 7% without introducing additional computational time or complexity. Furthermore, PTC exhibits robust, consistent behaviour across all data forms, demonstrating its strong resilience and reliability.
Keywords: 

1. Introduction

Individual tree classification is a fundamental task in forest remote sensing and serves as a critical parameter for both urban tree management [1] and natural mature forest ecosystems [2,3]. Accurate identification at the individual tree level enables a wide range of applications, including species-specific inventory, biomass estimation, biodiversity monitoring, and precision forestry practices [4].
In recent years, the research focus has increasingly shifted toward the adoption of machine learning (ML) and deep learning (DL) techniques in forest species and individual tree classification [3,5,6,7,8]. A wide variety of models, ranging from classical classifiers to deep learning architectures, have been proposed to improve classification performance [9,10]. However, most of these approaches are heavily data-driven and often lack an explicit consideration of the underlying physical and biological characteristics of trees. As a result, model development tends to emphasize architectural complexity and incremental gains in accuracy on relatively small, curated datasets, rather than fostering a deeper understanding of tree structure, spectral behaviour, and ecological context [11]. Technically speaking, the same approaches can be applied to any target objects, e.g., crop mapping [12,13], animal detection [14], mineral exploration [15], and iris recognition [16]. This trend has led to a proliferation of increasingly “fancy” models, where improvements are sometimes marginal and may not generalize well beyond the experimental setup [17]. Meanwhile, traditional methods such as watershed segmentation [18] and region-growing [19] approaches, which are more closely tied to geometric and structural interpretations of tree crowns, have gradually been sidelined in contemporary research or relegated to serving as comparison benchmarks [20].
Despite the rapid evolution of ML and DL algorithms, the format and quality of data remain the true foundation of individual tree classification [21]. Historically, major advances in remote sensing have been driven not only by model innovation but also by the introduction of new data types and acquisition platforms, ranging from ground-based systems to airborne and satellite observations, and, more recently, to UAVs [22]. Similarly, transitions from RGB imagery to multispectral and hyperspectral data [23], as well as the integration of LiDAR [19], thermal [24], and synthetic aperture radar (SAR) [25], have each introduced distinct physical characteristics that enabled breakthroughs in Earth observation and opened new research directions.
In this context, some studies have moved beyond purely model-centric approaches, seeking to align ML/DL frameworks with a more physically grounded understanding of Earth observation processes. At the same time, recent work has reported notable progress in multi-branch, multi-label [26], multi-source data fusion [27,28,29] and multimodal integration [30,31]. Some researchers have also reported issues with data quality, such as noise affecting classification accuracy and uncertainty, particularly for multimodal data [32].
Over more than a decade of research, we have focused on establishing a direct estimation of an individual tree's true three-dimensional (3D) structure from two-dimensional (2D) top-down imagery, progressing from multi-angular analysis [33], to 2D longitudinal profiles [34,35], and to 3D mesh-based crown top surfaces [36,37]. We have also examined how the available data are presented when used directly as input to ML/DL models.
In 2024, we introduced a novel data representation method, Pseudo Tree Crown (PTC), that provides a pseudo-3D pixel-value perspective, which enhances the informational richness of imagery. The original implementation was successfully tested on urban and deciduous trees [38], and later extended to Canadian conifer species in 2025 [39]. However, the initial PTC formulation relied on the green band, limiting its applicability primarily to green-leaf species.
In this study, in addition to continuing to test the PTC approach on new tree species and different datasets, we investigate the influence of alternative inputs derived from different variations and transformations of the original data. Specifically, we employ the Green-Red Vegetation Index (GRVI) [40] and Principal Component Analysis (PCA) to generate new PTC transformations. The results are compared between PTC and non-PTC inputs, as well as among different PTC alternatives. Selected classifiers, including Random Forest, ResNet50, and YOLOv10, are adopted to evaluate performance.
The key contributions and implications of this study are: First, we explicitly demonstrate the significant influence of input data alternatives on the performance of both ML/DL models, highlighting the critical role of data design in classification tasks; Second, the results confirm the effectiveness of the proposed PTC transformation, which consistently improves classification accuracy without introducing additional computational cost or model complexity; Third, the study shows that PTC exhibits robust and stable performance across diverse data forms, indicating strong generalizability, resilience, and reliability in varying experimental conditions.

2. Materials and methods

2.1. The study area and data description

The original dataset comprises RGB images collected at the Chenglong Campus of Sichuan Normal University. The study area is located in the eastern part of Chengdu, Sichuan, China (104°12’5.76"E, 30°33’54.64"N), as illustrated in Figure 1. It contains five main tree species: Tree of Heaven (Ailanthus altissima), which has even-pinnate compound leaves with alternately arranged leaflets; the leaflets are ovate-lanceolate with shallow serrations and a relatively thin texture. Osmanthus Tree (Osmanthus fragrans) has leathery, opposite, elongated-oval leaves with fine serrations or smooth edges and a smooth, glossy surface. Big-leaved Fig (Ficus virens) has even-pinnate compound leaves with alternately arranged leaflets; the leaflets are ovate-lanceolate with shallow serrations and a relatively thin texture. Chinese Banyan (Ficus microcarpa) has leaves that are large, thin-leathery, elongated oval or ovate-lanceolate, with entire margins and a tapering tip; the surface is smooth. Camphor Tree (Cinnamomum camphora) has alternate, thin-leathery leaves that are ovate or ovate-elliptical, with three prominent veins from the base and smooth margins. A sample top-down view of tree crowns is shown in Figure 2.
A total of 639 trees were observed on site, confirmed, and manually labelled, including 117 Tree of Heaven, 151 Osmanthus Tree, 107 Big-leaved Fig, 147 Chinese Banyan, and 117 Camphor Tree.
The survey was conducted using a UAV, the DJI Mavic 3 Enterprise Survey Edition (Shenzhen, China), a professional-grade drone designed for surveying and mapping. It is equipped with two cameras: a 4/3-inch CMOS 20 MP wide-angle camera and a 1/2-inch CMOS 12 MP telephoto camera. The 4/3-inch wide-angle camera was used to capture a wider field of view. The survey was conducted at an average flight altitude of approximately 80 m, with the shutter speed set to 1/500 s. The forward overlap was 80% and the side overlap 70%, to balance image mosaicking quality and data volume. The flight speed was about 15 m/s. All images were georeferenced to the WGS84 coordinate system and saved in JPEG format. All aerial imagery was acquired on November 28, 2025, and the data are available upon request.

2.2. Methods: PTC, GRVI and PCA

As mentioned in the introduction, this study had one primary objective: to investigate the influence of input data alternatives, derived from variations and transformations of the original data source, on different ML/DL models. We conducted the work in two separate groups of experiments: (1) directly feeding all data alternatives into the selected models; and (2) transforming them into PTCs before feeding them into the models.
The individual tree crowns were manually segmented from the original imagery. Similar to our findings in 2025 [39], we reduced significant background noise by clipping the surrounding area so that only the tree crowns remained, as shown in the original RGB image in Figure 3. In addition, because tree sizes varied across the dataset, all samples were standardized to a uniform size to ensure consistency during model training. All inputs were processed in the same way, so this standardization should not affect the comparison of final outcomes.
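To make this preprocessing step concrete, the following is a minimal sketch, assuming each manually clipped crown is stored as a separate image file; the uniform chip size of 224 × 224 pixels is a hypothetical choice, as the exact size used is not stated here.

```python
# Minimal preprocessing sketch: resample each clipped crown image to a uniform size.
# CHIP_SIZE is a hypothetical value; the manuscript does not specify the size used.
from PIL import Image

CHIP_SIZE = (224, 224)

def standardize_crown(path_in: str, path_out: str) -> None:
    """Load a clipped crown image and resample it to a fixed chip size."""
    crown = Image.open(path_in).convert("RGB")
    crown = crown.resize(CHIP_SIZE)  # default bicubic resampling
    crown.save(path_out)
```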
Based on the UAV RGB imagery, several derived data variations were designed, including the Green band only, the Green-Red Vegetation Index (GRVI), as shown in Equation 1, and Principal Component Analysis (PCA) components. The Green band was the layer transformed into PTC in our original work [38,39]. The GRVI is a vegetation index that serves as an alternative to the Normalized Difference Vegetation Index (NDVI), while PCA concentrates the image information into a few components. We used the first principal component, which corresponds primarily to image intensity and contains over 97% of the information; we refer to it as PCA1 throughout the remainder of the manuscript.
The first group of experiments is illustrated in Figure 3. We used the original RGB data to extract the Green band, calculate the GRVI (Equation 1), and perform the PCA transformation (PC1 was used). The Green band was then used to generate the original version of PTC, which is shown in Figure 4. As a result, a total of five data types were used in the first group of experiments: RGB, GRVI, PCA1, Green, and PTC, where RGB is the only true-colour input.
$\mathrm{GRVI} = \dfrac{\rho_{Green} - \rho_{Red}}{\rho_{Green} + \rho_{Red}}$ (1)
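As an illustration of how the three single-band alternatives can be derived from an RGB crown image, the following is a minimal sketch, assuming the image is a H × W × 3 array and that PCA is fitted on the per-pixel RGB vectors of that image; it is not the authors' exact implementation.

```python
# Minimal sketch: derive the Green band, GRVI (Equation 1), and PCA1 layers
# from an RGB crown image stored as a HxWx3 array (assumed workflow).
import numpy as np

def derive_inputs(rgb: np.ndarray):
    rgb = rgb.astype(np.float64)
    red, green = rgb[..., 0], rgb[..., 1]

    green_band = green                                        # Green band only
    grvi = (green - red) / np.clip(green + red, 1e-6, None)   # Equation 1, zero-safe

    # PCA1: first principal component of the per-pixel RGB vectors
    pixels = rgb.reshape(-1, 3)
    centered = pixels - pixels.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    pca1 = (centered @ vt[0]).reshape(rgb.shape[:2])

    return green_band, grvi, pca1
```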
The second group of experiments is illustrated with a sample in Figure 5. We used the same tree's RGB data, along with its variations and transformations, to generate PTCs.
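Because the PTC is described here only as a pseudo-3D pixel-value perspective, the following is a heavily hedged sketch of one way such a rendering could be produced from any single-band layer (Green, GRVI, or PCA1): pixel values are plotted as a height surface and the rendered view is saved as the model input. The rendering parameters are illustrative assumptions, not the authors' settings.

```python
# Hedged PTC-style sketch: render a single-band crown layer as a pseudo-3D surface image.
# The surface rendering and view settings below are illustrative assumptions only.
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401  (registers the 3D projection)

def render_pseudo_3d(layer: np.ndarray, out_path: str) -> None:
    """Plot pixel values of a 2D layer as a height surface and save the rendered view."""
    z = layer.astype(float)
    h, w = z.shape
    x, y = np.meshgrid(np.arange(w), np.arange(h))
    fig = plt.figure(figsize=(4, 4))
    ax = fig.add_subplot(111, projection="3d")
    ax.plot_surface(x, y, z, cmap="viridis", linewidth=0)
    ax.set_axis_off()
    fig.savefig(out_path, dpi=150, bbox_inches="tight")
    plt.close(fig)
```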
Three ML/DL models were selected and employed for classification: Random Forest (RF) [41], ResNet50 [42], and YOLOv10 [43]. Our PTC is independent of the models themselves. RF represents a classic and stable machine learning approach, ResNet50 represents a reliable deep learning approach, and YOLOv10 is currently one of the most robust object detection algorithms. We acknowledge that other YOLO variants exist, such as YOLOv11 and YOLOv12; however, the version number does not necessarily reflect a performance increment, and we prioritized stability and robustness. Moreover, the models themselves are not the focus of this study; they were used to provide a relative comparison of the data alternatives rather than of the approaches. The dataset was divided into training and validation subsets using an 8:2 ratio, and all models were run multiple times with random assignment of trees to training and testing.
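A minimal sketch of this evaluation protocol is given below, assuming the crown chips are flattened into feature vectors for the RF case; the hyperparameters and number of runs are illustrative, not the values used in the experiments.

```python
# Minimal sketch of the 8:2 split with repeated runs, shown for the Random Forest case.
# Hyperparameters and the number of runs are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

def repeated_rf_accuracy(X: np.ndarray, y: np.ndarray, n_runs: int = 5):
    """Return mean and standard deviation of accuracy over repeated random 8:2 splits."""
    scores = []
    for seed in range(n_runs):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.2, stratify=y, random_state=seed)
        clf = RandomForestClassifier(n_estimators=200, random_state=seed)
        clf.fit(X_tr, y_tr)
        scores.append(accuracy_score(y_te, clf.predict(X_te)))
    return float(np.mean(scores)), float(np.std(scores))
```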

3. Results

In this study, model performance was evaluated using multiple metrics, including Accuracy, Precision, Recall, F1 Score, and Intersection over Union (IoU). For each training run, detailed outputs were recorded, including per-species accuracy matrices, training time, confusion matrices, and training loss and validation accuracy curves, to provide a comprehensive and intuitive assessment of model performance. The species-specific classification results are presented in Table 1.
The results indicate that PTC consistently achieves strong and stable performance across all cases, outperforming other data variations and transformations across all three models. This consistency further highlights the effectiveness of PTC in enhancing feature representation and improving classification reliability.
$\text{Accuracy} = \dfrac{TP + TN}{TP + FN + FP + TN}$
$\text{Precision} = \dfrac{TP}{FP + TP}$
$\text{Recall} = \dfrac{TP}{TP + FN}$
$\text{F1 Score} = \dfrac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$
$\text{IoU} = \dfrac{TP}{TP + FP + FN}$
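For reference, these per-class metrics can be computed directly from the one-vs-rest counts of a confusion matrix; the following is a minimal sketch of that calculation, not tied to any specific library output.

```python
# Minimal sketch: compute the evaluation metrics from one-vs-rest counts (TP, FP, FN, TN).
def per_class_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    precision = tp / (fp + tp) if (fp + tp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    return {"Accuracy": accuracy, "Precision": precision,
            "Recall": recall, "F1": f1, "IoU": iou}
```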
As noted in the methods, all results were obtained from multiple runs, and the mean and standard deviation are reported in the tables. The accuracy results for the different variations and transformations used directly as model input are summarized in Table 2. PTC clearly outperformed the other data alternatives in all three models, achieving accuracies of 0.668, 0.865, and 0.793 for RF, ResNet50, and YOLOv10, respectively.
The confusion matrices of the species classification for different data alternatives using ResNet50 are illustrated in Figure 6, providing a detailed view of class-wise performance. The matrices show that most classes are well distinguished, with relatively low misclassification rates, further confirming the effectiveness of the PTC-based features in capturing discriminative structural information. The RF and YOLOv10 confusion matrices are provided in Appendix A (Figure A1 and Figure A2).
The training loss and validation accuracy (val acc) over epochs for the different data alternatives using ResNet50 are shown in Figure 7 as a sample. The remaining results for RF and YOLOv10 are provided in Appendix A (Figure A3 and Figure A4). The curves indicate that ResNet50 and YOLOv10 exhibit very similar, stable convergence behaviour, while RF had a much lower loss; there are no significant signs of overfitting or instability.
Table 4 presents the performance of all transformed PTCs. Although no single PTC variation stands out as a clear dominant performer, all configurations yield highly comparable results, with only minor differences among them. This further supports the consistency and stability of the PTC representation across different variations. The confusion matrices from ResNet50 are illustrated in Figure 8, and the RF and YOLOv10 confusion matrices are provided in Appendix A (Figure A5 and Figure A6). The learning curves of training loss and validation accuracy for the different PTCs are shown in Figure 9; the remaining curves are listed in Appendix A (Figure A7 and Figure A8).
The computation time comparison is presented in Table 5. Despite the additional PTC transformation step, the overall computational overhead remains minimal; with RF, PTC even achieved the shortest time.
Table 3. Species classification results of different PTC transformations using RF, ResNet50 and YOLOv10 models
Model Data Tree Species Precision Recall F1-score IoU
RF RGB Tree of Heaven 0.5700 0.5938 0.5819 0.4199
Osmanthus Tree 0.6013 0.6636 0.6234 0.4646
Chinese Banyan 0.7211 0.7000 0.7100 0.5111
Big-leaved Fig 0.5824 0.6667 0.6243 0.4332
Camphor Tree 0.5882 0.4909 0.5441 0.3636
Green Tree of Heaven 0.6296 0.6084 0.6169 0.4498
Osmanthus Tree 0.6340 0.6501 0.6404 0.4749
Chinese Banyan 0.6907 0.6834 0.6854 0.5283
Big-leaved Fig 0.6239 0.6739 0.6459 0.4810
Camphor Tree 0.5387 0.5341 0.5347 0.3739
GRVI Tree of Heaven 0.8283 0.8179 0.8176 0.7127
Osmanthus Tree 0.7279 0.7595 0.7342 0.5896
Chinese Banyan 0.7271 0.6928 0.7041 0.5528
Big-leaved Fig 0.6292 0.6928 0.6509 0.4905
Camphor Tree 0.6699 0.5552 0.5961 0.4448
PCA1 Tree of Heaven 0.6014 0.6000 0.6003 0.4312
Osmanthus Tree 0.6510 0.6375 0.6438 0.4803
Chinese Banyan 0.6374 0.6708 0.6529 0.4915
Big-leaved Fig 0.5844 0.5970 0.5902 0.4206
Camphor Tree 0.5889 0.5483 0.5669 0.3975
ResNet50 RGB Tree of Heaven 0.9091 0.8333 0.8696 0.7692
Osmanthus Tree 0.9231 0.8000 0.8571 0.7500
Chinese Banyan 0.9333 0.9333 0.9333 0.8750
Big-leaved Fig 0.7917 0.9048 0.8444 0.7308
Camphor Tree 0.7692 0.8696 0.8163 0.6897
Green Tree of Heaven 0.9130 0.8750 0.8936 0.8077
Osmanthus Tree 0.8846 0.7667 0.8214 0.6970
Chinese Banyan 0.9000 0.9000 0.9000 0.8182
Big-leaved Fig 0.8000 0.9524 0.8696 0.7692
Camphor Tree 0.7500 0.7826 0.7660 0.6207
GRVI Tree of Heaven 0.9565 0.9167 0.9362 0.8800
Osmanthus Tree 0.6970 0.7667 0.7302 0.5750
Chinese Banyan 0.9231 0.8000 0.8471 0.7500
Big-leaved Fig 0.7619 0.7619 0.7619 0.6154
Camphor Tree 0.4800 0.5217 0.5000 0.3333
PCA1 Tree of Heaven 0.8947 0.7083 0.7907 0.6538
Osmanthus Tree 0.9524 0.6667 0.7843 0.6452
Chinese Banyan 0.8486 0.9333 0.8889 0.8000
Big-leaved Fig 0.6774 1.0000 0.8077 0.6207
Camphor Tree 0.7500 0.7826 0.7660 0.6774
YOLOv10 RGB Tree of Heaven 0.8119 0.8056 0.7936 0.6631
Osmanthus Tree 0.7971 0.6968 0.7068 0.5833
Chinese Banyan 0.7941 0.6896 0.7381 0.5966
Big-leaved Fig 0.7046 0.7190 0.7049 0.6107
Camphor Tree 0.6657 0.5000 0.5542 0.3923
Green Tree of Heaven 0.8360 0.7986 0.7962 0.6675
Osmanthus Tree 0.8656 0.6783 0.6816 0.5736
Chinese Banyan 0.8444 0.7673 0.8031 0.6781
Big-leaved Fig 0.7366 0.6440 0.6711 0.5536
Camphor Tree 0.7258 0.6556 0.6610 0.5276
GRVI Tree of Heaven 0.8537 0.7461 0.7927 0.6681
Osmanthus Tree 0.8644 0.7821 0.8171 0.6968
Chinese Banyan 0.7576 0.7408 0.7475 0.6033
Big-leaved Fig 0.7404 0.6765 0.6966 0.5641
Camphor Tree 0.6820 0.7582 0.7114 0.5827
PCA1 Tree of Heaven 0.8731 0.8334 0.8469 0.7346
Osmanthus Tree 0.6243 0.6482 0.6340 0.5493
Chinese Banyan 0.7972 0.7069 0.7491 0.6245
Big-leaved Fig 0.7365 0.7965 0.7618 0.6318
Camphor Tree 0.6304 0.6130 0.6161 0.4638
Table 4. Accuracy Comparison of Different Models Across Different PTC Transformations
Input Data Random Forest ResNet50 YOLOv10
RGB-PTC 0.6367 ± 7.6 × 10⁻⁴ 0.8809 ± 2.8 × 10⁻³ 0.8132 ± 1.3 × 10⁻⁴
Green-PTC 0.6680 ± 1.5 × 10⁻³ 0.8652 ± 3.8 × 10⁻⁴ 0.7928 ± 5.8 × 10⁻³
GRVI-PTC 0.6328 ± 1.9 × 10⁻⁴ 0.8281 ± 1.2 × 10⁻² 0.8083 ± 6.9 × 10⁻⁴
PCA1-PTC 0.7266 ± 1.0 × 10⁻⁴ 0.8594 ± 3.2 × 10⁻⁴ 0.7715 ± 8.1 × 10⁻⁴
Table 5. Computation time with different data variations and transformations on different models
Model Input Data Variation time (seconds)
RF Directly RGB 76.90
Green 78.07
GRVI 79.79
PCA1 77.29
PTC 75.28
PTCs RGB 81.15
Green 75.28
GRVI 83.58
PCA1 77.14
ResNet50 Directly RGB 45.79
Green 23.55
GRVI 29.49
PCA1 43.05
PTC 45.49
PTCs RGB 29.48
Green 45.49
GRVI 27.31
PCA1 36.75
YOLOv10 Directly RGB 409.53
Green 412.97
GRVI 416.11
PCA1 406.36
PTC 413.26
PTCs RGB 416.58
Green 413.26
GRVI 412.83
PCA1 422.48

4. Discussion

The comparison of PTC with RGB variations and transformations is presented in Table 2. The results show that PTC consistently outperforms all original data variations and transformations as the direct input for all three models. This finding aligns with our previous studies [38,39], which demonstrate that PTC enhances information representation by incorporating vertical structural characteristics.
A key finding of Table 2 is that RF exhibits substantially lower accuracy compared to ResNet50 and YOLOv10. For the non-PTC data alternatives, classification accuracy remains below 50%, with the exception of the Green band, which achieves a marginal improvement to 53%. This limitation can be attributed to the fundamental mechanism of RF, which relies on feature vectors and is not well suited to modelling image data, particularly in capturing spatial relationships. However, the PTC transformation demonstrates notably superior performance, achieving an accuracy of 66.8% when used with RF. These results provide strong evidence that the choice of input data representation plays a critical role in determining the performance of both ML and DL classification models, which is the primary research objective of this study.
The Green band ranked second across all three models and showed the smallest gap to PTC, about 7% in ResNet50 and YOLOv10, while the other alternatives trailed PTC by 12-14%. As illustrated in Figure 3, the Green band effectively highlights sunlit tree crowns while suppressing shadowed canopy regions, enhancing the visibility of key crown structures. In comparison, the GRVI reduces shadow effects but also alters the representation of tree crown structures. Meanwhile, PCA exaggerates grayscale variations, leading to an oversimplified depiction of the tree crown. This finding is consistent with our previous studies [38,39]: the Green band is the most effective and simplest transformation for generating PTC in any green-leaved species, including deciduous and conifer trees. The differences between the Green band and the other non-PTC alternatives ranged from 7-9%, greater than or equal to the corresponding differences for ResNet50 and YOLOv10 (5-7%). This echoes the influence of the input data, which is absolutely critical and cannot be ignored.
Although some improvement was expected from PCA due to its ability to concentrate information, it does not produce meaningful performance gains. This suggests that merely enhancing spatial resolution or pixel-level correlation does not necessarily benefit machine learning or deep learning models. In contrast, features with more discernible shapes are easier to extract from both image processing and interpretation perspectives.
Since PTC does not introduce information loss and does not require complex computation, the computation time is comparable to the other input alternatives, as shown in Table 5, and is even lower for the RF model. Hence, the additional PTC transformation step is justified, especially considering the observed accuracy improvement of at least 7%. Furthermore, following our previous studies that examined different view angles, species, and tree ages, the effectiveness of PTC is once again confirmed.
These findings also suggest the flexibility of PTC, as it can be applied to a wide range of tree species and integrated with various ML/DL models to produce more robust results. The next stage of this research involves automating the application of PTC to entire images, which currently represents its primary limitation. More importantly, PTC serves as an intermediate bridge for retrieving true tree crown structures from top-down imagery.
The second group of experiments, representing the second objective of this study and shown in Table 4, suggests that PTC is highly robust and resilient to variations and transformations in the original data. Unlike the first group of experiments, where performance varied noticeably across different input alternatives, the PTC-based results exhibit a high degree of consistency. There is no single variation or transformation that consistently outperforms the others; however, the differences among them remain minimal, at approximately 3% or less.
This consistency indicates that once data are processed through the PTC transformation, the influence of input variation is significantly reduced. In other words, PTC effectively normalizes or stabilizes the feature space, making downstream ML/DL models less sensitive to differences in input data, including resolution and data quality. It may be worth exploring the RGB-converted PTC further: the RGB-to-mono grayscale conversion is essentially the average of the RGB values, and it achieved marginally higher accuracy, 1.5-1.9% better than the original Green-band PTC.
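For clarity, the RGB-to-mono conversion discussed above is sketched below as a plain, unweighted channel average, consistent with the description in the text; any weighting scheme beyond that would be an assumption.

```python
# Minimal sketch: RGB-to-mono conversion as an unweighted channel average,
# in contrast to using the Green band alone.
import numpy as np

def rgb_to_mono(rgb: np.ndarray) -> np.ndarray:
    """Average the R, G and B channels of a HxWx3 array into one grayscale layer."""
    return rgb.astype(np.float64).mean(axis=-1)
```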
Given that PTC has already demonstrated clear superiority over non-PTC approaches, these findings further suggest its robustness and resilience. This characteristic is particularly valuable in real-world applications, where data variability is inevitable, as it ensures stable and reliable model performance across different conditions.

5. Conclusion

In this study, we systematically evaluated the effectiveness of the PTC transformation in comparison with various original data variations and transformations. The results demonstrate that PTC consistently outperforms all original input alternatives across ML/DL models, achieving an accuracy improvement of at least 7%. This reinforces previous findings that incorporating vertical structural information significantly enhances feature representation.
This study also shows that structurally meaningful and visually discernible features play a more critical role in effective learning and interpretation compared to merely enhancing pixel-level correlations.
Furthermore, PTC exhibits strong flexibility and generalizability. It can be readily applied across different tree species, imaging conditions, and model architectures without requiring complex computation or introducing information loss. The second group of experiments demonstrated that PTC-based inputs produce highly consistent results across different variations and transformations, with only marginal performance differences. This indicates that PTC effectively stabilizes the feature space and reduces sensitivity to input data variability, including differences in resolution and data quality.
Overall, as the third study in the PTC research line, we reinforce the general conclusion that PTC is not only a high-performing representation but also a robust and reliable one for real-world applications. Its ability to bridge top-down imagery with meaningful tree crown structures represents an important step toward more accurate and interpretable remote sensing analysis.
In our future work, we will focus on automating the PTC generation process at the full-image scale, addressing its current limitations and enabling broader practical deployment, as well as establishing a direct correlation between PTC and true tree crown structures.

Author Contributions

Conceptualization, K.Z.; methodology, Y.T. and K.Z.; software, Y.T. and K.Z.; validation, Y.T., K.Z. and W.C.; formal analysis, Y.T. and K.Z.; investigation, Y.T. and K.Z.; resources, K.Z. and W.C.; data curation, Y.T., K.Z. and W.C.; writing—original draft preparation, Y.T. and K.Z.; writing—review and editing, K.Z. and J.L.; visualization, Y.T. and K.Z.; supervision, K.Z., W.C. and J.L.; project administration, W.C.; funding acquisition, W.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the project "Intelligent Analysis of Urban Road Traffic Accident Risk Based on the Fusion of Street View and High-Resolution Remote Sensing Imagery" (Key Project No. ZNJW2026ZZZD003).

Institutional Review Board Statement

Not applicable

Data Availability Statement

The data are available upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
PTC Pseudo tree crown
ITS Individual tree species
GRVI Green Red Vegetation Index
PCA Principal Component Analysis

Appendix A. Confusion matrices and learning curves of species classification for different data alternatives and PTC transformations using RF and YOLOv10

Figure A1. The confusion matrices of species classification results of different data alternatives using RF
Figure A2. The confusion matrices of species classification results of different data alternatives using YOLOv10
Figure A3. Learning curves showing training loss and validation accuracy across epochs of different data alternatives using RF
Figure A4. Learning curves showing training loss and validation accuracy across epochs of different data alternatives using YOLOv10
Figure A5. The confusion matrices of species classification results of different PTC transformations using RF
Figure A6. The confusion matrices of species classification results of different PTC transformations using YOLOv10
Figure A7. Learning curves showing training loss and validation accuracy across epochs of different PTC transformations using RF
Figure A8. Learning curves showing training loss and validation accuracy across epochs of different PTC transformations using YOLOv10

References

1. Wang, K.; Wang, T.; Liu, X. A Review: Individual Tree Species Classification Using Integrated Airborne LiDAR and Optical Imagery with a Focus on the Urban Environment. Forests 2018, 10, 1.
2. Guo, X.; Li, H.; Jing, L.; Wang, P. Individual Tree Species Classification Based on Convolutional Neural Networks and Multitemporal High-Resolution Remote Sensing Images. Sensors 2022, 22.
3. Wang, L.; Lu, D.; Xu, L.; Robinson, D.T.; Tan, W.; Xie, Q.; Guan, H.; Chapman, M.A.; Li, J. Individual tree species classification using low-density airborne multispectral LiDAR data via attribute-aware cross-branch transformer. Remote Sensing of Environment 2024, 315, 114456.
4. Lei, Z.; Li, H.; Zhao, J.; Jing, L.; Tang, Y.; Wang, H. Individual Tree Species Classification Based on a Hierarchical Convolutional Neural Network and Multitemporal Google Earth Images. Remote Sensing 2022, 14.
5. Natesan, S.; Armenakis, C.; Vepakomma, U. Individual tree species identification using Dense Convolutional Network (DenseNet) on multitemporal RGB images from UAV. Journal of Unmanned Vehicle Systems 2020, 8, 310–333.
6. Cetin, Z.; Yastikli, N. The Use of Machine Learning Algorithms in Urban Tree Species Classification. ISPRS International Journal of Geo-Information 2022, 11, 226.
7. Bensi, M.E.; Esquivel, R.A. Unraveling the Significance of the Classification Tree Algorithm in Machine Learning: A Literature Review. Journal of Theoretical and Applied Sciences 2023, 1, 604–611.
8. He, X.; Han, X.; Chen, Y.; Huang, L. A Light-Weighted Fusion Vision Mamba for Multimodal Remote Sensing Data Classification. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 2025, 18, 21532–21548.
9. Zhong, L.; Dai, Z.; Fang, P.; Cao, Y.; Wang, L. A Review: Tree Species Classification Based on Remote Sensing Data and Classic Deep Learning-Based Methods. Forests 2024, 15, 852.
10. Southworth, J.; Smith, A.C.; Safaei, M.; Rahaman, M.; Alruzuq, A.; Tefera, B.B.; Muir, C.S.; Herrero, H.V. Machine learning versus deep learning in land system science: a decision-making framework for effective land classification. Frontiers in Remote Sensing 2024, 5–2024.
11. Abreu-Dias, R.; Juan, M.S.G.; Fernando, M.R.; Luis, M.S. Advances in the Automated Identification of Individual Tree Species: A Systematic Review of Drone- and AI-Based Methods in Forest Environments. Technologies 2025, 13, 187.
12. Wang, Y.; Huang, H.; State, R. Cross Domain Early Crop Mapping Using CropSTGAN. IEEE Access 2024, 12, 130800–130815.
13. Wang, Y.; Huang, H.; State, R. Cross Domain Early Crop Mapping with Label Spaces Discrepancies using MultiCropGAN. ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences 2024, X-1-2024, 241–248.
14. Xu, Z.; Wang, T.; Skidmore, A.K.; Lamprey, R. A review of deep learning techniques for detecting animals in aerial and satellite images. International Journal of Applied Earth Observation and Geoinformation 2024, 128, 103732.
15. Wang, W.; Zhao, C.; Wu, Y. Spatial weighting — An effective incorporation of geological expertise into deep learning models. Geochemistry 2024, 84, 126212.
16. Xiong, Q.; Zhang, X.; Shen, J. A prior embedding-driven architecture for long distance blind iris recognition. Biomedical Signal Processing and Control 2025, 109, 108048.
17. Ben Khalifa, M.A.; El Koundi, M.; Farah, I.R. Pushing boundaries in remote sensing: A comprehensive review of deep learning for spatial super-resolution. Remote Sensing Applications: Society and Environment 2025, 40, 101809.
18. Fan, W.; Tian, J.; Troles, J.; Döllerer, M.; Kindu, M.; Knoke, T. Comparing Deep Learning and MCWST Approaches for Individual Tree Crown Segmentation. ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci. X-1-2024 2024, 67–73.
19. Burmeister, J.M.; Richter, R.; Reder, S.; Mund, J.P.; Döllner, J. Tree Instance Segmentation in Urban 3D Point Clouds Using a Coarse-to-Fine Algorithm Based on Semantic Segmentation. ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci. X-4/W5-2024 2024, 79–86.
20. Fu, Y.; Niu, Y.; Wang, L.; Li, W. Individual-Tree Segmentation from UAV–LiDAR Data Using a Region-Growing Segmentation and Supervoxel-Weighted Fuzzy Clustering Approach. Remote Sensing 2024, 16, 608.
21. D’Amico, G.; Nilsson, M.; Axelsson, A.; Chirici, G. Data homogeneity impact in tree species classification based on Sentinel-2 multitemporal data case study in central Sweden. International Journal of Remote Sensing 2024, 45, 5050–5075.
22. Zhang, H.; Liu, B.; Yang, B.; Guo, J.; Hu, Z.; Zhang, M.; Yang, Z.; Zhang, J. Efficient tree species classification using machine and deep learning algorithms based on UAV-LiDAR data in North China. Frontiers in Forests and Global Change 2025, 8–2025.
23. Marconi, S.; Weinstein, B.G.; Zou, S.; Bohlman, S.A.; Zare, A.; Singh, A.; Stewart, D.; Harmon, I.; Steinkraus, A.; White, E.P. Continental-scale hyperspectral tree species classification in the United States National Ecological Observatory Network. Remote Sensing of Environment 2022, 282, 113264.
24. Zakrzewska, A.; Kopeć, D.; Krajewski, K.; Charyton, J. Canopy temperatures of selected tree species growing in the forest and outside the forest using aerial thermal infrared (3.6–4.9 µm) data. European Journal of Remote Sensing 2022, 55, 313–325.
25. Blickensdörfer, L.; Oehmichen, K.; Pflugmacher, D.; Kleinschmit, B.; Hostert, P. National tree species mapping using Sentinel-1/2 time series and German National Forest Inventory data. Remote Sensing of Environment 2024, 304, 114069.
26. Qin, T.; Zhao, Q. Multi-branch and multi-label tree species classification using deep learning for UAV aerial photography and Sentinel remote sensing images. Sci Rep 2025, 15.
27. Heinzel, J.; Koch, B. Investigating multiple data sources for tree species classification in temperate forest and use for single tree delineation. International Journal of Applied Earth Observation and Geoinformation 2012, 18, 101–110.
28. Wan, H.; Tang, Y.; Jing, L.; Li, H.; Qiu, F.; Wu, W. Tree Species Classification of Forest Stands Using Multisource Remote Sensing Data. Remote Sensing 2021, 13.
29. Cormier, K.; Zhang, K.F.; Padron-Uy, J.; Wong, A.; Gagnier, K.; Parihar, A. Data Warehouse Design for Multiple Source Forest Inventory Management and Image Processing. In Proceedings of the 2025 IEEE Canadian Conference on Electrical and Computer Engineering (CCECE); IEEE, 2025; pp. 475–480.
30. Liu, B.; Hao, Y.; Huang, H.; Chen, S.; Li, Z.; Chen, E.; Tian, X.; Ren, M. TSCMDL: Multimodal Deep Learning Framework for Classifying Tree Species Using Fusion of 2-D and 3-D Features. IEEE Transactions on Geoscience and Remote Sensing 2023, 61, 1–11.
31. Cao, Y.; Coops, N.C.; Murray, B.A.; Sinclair, I.; Geordie, R.M. M3FNet: Multi-modal multi-temporal multi-scale data fusion network for tree species composition mapping. ISPRS Journal of Photogrammetry and Remote Sensing 2026, 231, 797–814.
32. He, X.; Han, X.; Zhao, Y.; Chen, Y.; Zou, L. Uncertainty-Based Dendritic Model for Multimodal Remote Sensing Data Classification. IEEE Transactions on Geoscience and Remote Sensing 2026, 64, 1–18.
33. Hu, B.; Zhang, F.; Wang, J. On the retrival of vegetation parameters from multi-angular hyperspectral remote sensing data. In Proceedings of the 2009 IEEE Toronto International Conference Science and Technology for Humanity (TIC-STH); IEEE, 2009.
34. Zhang, K.; Hu, B. Individual Urban Tree Species Classification Using Very High Spatial Resolution Airborne Multi-Spectral Imagery Using Longitudinal Profiles. Remote Sensing 2012, 4, 1741–1764.
35. Zhang, K.; Robinson, J.; Jing, L. Canopy vertical parameters estimation using unmanned aerial vehicle (UAV) imagery. In Proceedings of the 2016 IEEE International Geoscience and Remote Sensing Symposium (IGARSS); IEEE, 2016.
36. Zhang, K.; Jing, L.; Robinson, J. Douglas fir productivity estimation using very high spatial resolution imagery - a case study on ground treatment impact in west Kootenay, British Columbia, Canada. In Proceedings of the 2016 IEEE International Geoscience and Remote Sensing Symposium (IGARSS); IEEE, 2016.
37. Balkenhol, L.; Zhang, K.F. Identifying Individual Tree Species Structure with High-Resolution Hyperspectral Imagery Using a Linear Interpretation of the Spectral Signature. In Proceedings of the 38th Canadian Symposium on Remote Sensing, 2017.
38. Miao, S.; Zhang, K.F.; Zeng, H.; Liu, J. Improving artificial-intelligence-based individual tree species classification using pseudo tree crown derived from unmanned aerial vehicle imagery. Remote Sensing 2024, 16, 1849.
39. Zhang, K.; Zhang, T.; Liu, J. Individual tree species classification using Pseudo Tree Crown (PTC) on coniferous forests. Remote Sensing 2025, 17, 3102.
40. Tucker, C.J. Red and photographic infrared linear combinations for monitoring vegetation. Remote Sensing of Environment 1979, 8, 127–150.
41. Breiman, L. Random Forests. Machine Learning 2001, 45, 5–32.
42. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016; pp. 770–778.
43. Wang, A.; Chen, H.; Liu, L.; Chen, K.; Lin, Z.; Han, J.; Ding, G. YOLOv10: Real-Time End-to-End Object Detection. arXiv 2024, arXiv:2405.14458.
Figure 1. The study area is the Chenglong Campus of Sichuan Normal University, which is located in the eastern part of Chengdu, Sichuan, China (104°12’5.76"E, 30°33’54.64"N)
Figure 2. The sample tree species illustration (from left to right): (a) Tree of Heaven (Ailanthus altissima), (b) Osmanthus Tree (Osmanthus fragrans), (c) Big-leaved Fig (Ficus virens), (d) Chinese Banyan (Ficus microcarpa), (e) Camphor Tree (Cinnamomum camphora)
Figure 3. Original image variations and transformations. RGB images were used to calculate the GRVI, extract the Green band, and perform the PCA transformation (PC band 1 used). The Green band was used to generate the PTC, i.e., the original version of the PTC.
Figure 4. Samples of the original PTC generated for different species. From left to right: (a) Tree of Heaven, (b) Osmanthus Tree, (c) Big-leaved Fig, (d) Chinese Banyan, (e) Camphor Tree
Figure 5. The PTCs transformed from a sample Tree of Heaven tree using different alternatives. From left to right: RGB, Green, GRVI and PCA.
Figure 6. The confusion matrices of species classification results of different data alternatives using ResNet50
Figure 7. Learning curves showing training loss and validation accuracy across epochs of different data alternatives using ResNet50
Figure 8. The confusion matrices of species classification results of different PTC transformations using ResNet50
Figure 9. Learning curves showing training loss and validation accuracy across epochs of different PTC transformations using ResNet50
Table 1. Species classification results of different data variations and transformations of RF, ResNet50 and YOLOv10 models
Model Data Tree Species Precision Recall F1-score IoU
RF RGB Tree of Heaven 0.4570 0.4513 0.4506 0.2910
Osmanthus Tree 0.4778 0.3888 0.4216 0.2672
Chinese Banyan 0.4922 0.5218 0.5024 0.3379
Big-leaved Fig 0.4644 0.4366 0.4466 0.2875
Camphor Tree 0.4211 0.4821 0.4439 0.2870
Green Tree of Heaven 0.4346 0.3894 0.4083 0.2606
Osmanthus Tree 0.5150 0.5439 0.5269 0.3590
Chinese Banyan 0.5150 0.5439 0.5269 0.3590
Big-leaved Fig 0.5354 0.4341 0.4740 0.3133
Camphor Tree 0.4726 0.5044 0.4857 0.3212
GRVI Tree of Heaven 0.4501 0.4360 0.4383 0.2812
Osmanthus Tree 0.4833 0.3783 0.4133 0.2610
Chinese Banyan 0.4905 0.5283 0.5046 0.3376
Big-leaved Fig 0.4833 0.4880 0.4812 0.3187
Camphor Tree 0.4000 0.4451 0.4146 0.2629
PCA1 Tree of Heaven 0.4468 0.4572 0.4489 0.2936
Osmanthus Tree 0.5080 0.5283 0.5167 0.3533
Chinese Banyan 0.4950 0.5283 0.5113 0.3499
Big-leaved Fig 0.4327 0.3945 0.4106 0.2616
Camphor Tree 0.4886 0.4630 0.4764 0.3155
PTC Tree of Heaven 0.6337 0.6097 0.6196 0.4517
Osmanthus Tree 0.6381 0.6514 0.6431 0.4769
Chinese Banyan 0.6948 0.6847 0.6881 0.5303
Big-leaved Fig 0.6280 0.6752 0.6486 0.4830
Camphor Tree 0.5427 0.5354 0.5374 0.3758
ResNet50 RGB Tree of Heaven 0.7241 0.7000 0.7119 0.5526
Osmanthus Tree 0.7917 0.7917 0.7917 0.6552
Chinese Banyan 0.6471 0.7333 0.6875 0.5238
Big-leaved Fig 0.6667 0.5714 0.6154 0.4444
Camphor Tree 0.5652 0.5652 0.5652 0.3939
Green Tree of Heaven 1.0000 0.5000 0.6667 0.5000
Osmanthus Tree 0.7333 0.7333 0.7333 0.5789
Chinese Banyan 0.7667 0.7667 0.7667 0.6216
Big-leaved Fig 0.5405 0.9524 0.6897 0.5263
Camphor Tree 0.7368 0.6087 0.6667 0.5000
GRVI Tree of Heaven 0.778 0.5833 0.6667 0.5000
Osmanthus Tree 0.5946 0.7333 0.6567 0.4889
Chinese Banyan 0.5429 0.6333 0.5846 0.4130
Big-leaved Fig 0.6500 0.6190 0.6341 0.4643
Camphor Tree 0.6667 0.5217 0.5854 0.4138
PCA1 Tree of Heaven 0.7222 0.5417 0.6190 0.4483
Osmanthus Tree 0.7143 0.8333 0.7692 0.6250
Chinese Banyan 0.6250 0.8333 0.7143 0.5556
Big-leaved Fig 0.7692 0.4762 0.5882 0.4167
Camphor Tree 0.5909 0.5652 0.5882 0.4062
PTC Tree of Heaven 0.9130 0.8750 0.8936 0.8077
Osmanthus Tree 0.8846 0.7667 0.8214 0.6970
Chinese Banyan 0.9000 0.9000 0.9000 0.8182
Big-leaved Fig 0.8000 0.9524 0.8696 0.7692
Camphor Tree 0.7500 0.7826 0.7660 0.6207
YOLOv10 RGB Tree of Heaven 0.8662 0.6064 0.7006 0.5444
Osmanthus Tree 0.7289 0.5067 0.5929 0.4240
Chinese Banyan 0.7116 0.4999 0.6029 0.4330
Big-leaved Fig 0.4950 0.6299 0.5535 0.3875
Camphor Tree 0.4276 0.6081 0.5019 0.3361
Green Tree of Heaven 0.7802 0.6384 0.6917 0.5454
Osmanthus Tree 0.7503 0.6686 0.6591 0.5030
Chinese Banyan 0.8103 0.6291 0.7029 0.5566
Big-leaved Fig 0.7319 0.7510 0.7265 0.5863
Camphor Tree 0.4769 0.6317 0.5219 0.3576
GRVI Tree of Heaven 0.8675 0.6058 0.7002 0.5430
Osmanthus Tree 0.7259 0.5072 0.5937 0.4240
Chinese Banyan 0.7120 0.5431 0.6031 0.4335
Big-leaved Fig 0.4945 0.6324 0.5536 0.3882
Camphor Tree 0.4283 0.6184 0.5014 0.3363
PCA1 Tree of Heaven 0.7639 0.5413 0.6336 0.4807
Osmanthus Tree 0.6721 0.5850 0.6233 0.4559
Chinese Banyan 0.5803 0.4826 0.5247 0.3664
Big-leaved Fig 0.5136 0.5149 0.5109 0.3661
Camphor Tree 0.5380 0.6325 0.5819 0.4152
PTC Tree of Heaven 0.9359 0.6063 0.7774 0.5675
Osmanthus Tree 0.7548 0.8750 0.8030 0.6721
Chinese Banyan 0.8449 0.7674 0.8080 0.6779
Big-leaved Fig 0.9853 0.7210 0.8104 0.7077
Camphor Tree 0.4820 0.5787 0.5186 0.3732
Table 2. Accuracy Comparison of Different Models Across Spectral Inputs and Indices. RGB denotes the RGB true-colour data, and Green denotes the Green-band grayscale values.
Input Data Random Forest ResNet50 YOLOv10
RGB 0.4766 ± 1.9 × 10⁻³ 0.7090 ± 1.8 × 10⁻⁴ 0.6748 ± 3.1 × 10⁻⁴
Green 0.5313 ± 3.1 × 10⁻³ 0.7969 ± 4.9 × 10⁻⁴ 0.7331 ± 1.4 × 10⁻³
GRVI 0.4844 ± 1.9 × 10⁻³ 0.7227 ± 1.8 × 10⁻⁴ 0.6748 ± 3.1 × 10⁻⁴
PCA1 0.4961 ± 1.5 × 10⁻³ 0.7227 ± 1.8 × 10⁻⁴ 0.6516 ± 1.8 × 10⁻³
PTC 0.6680 ± 1.5 × 10⁻³ 0.8652 ± 3.8 × 10⁻⁴ 0.7928 ± 5.8 × 10⁻³
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits the free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.