1. Introduction
Forests are the largest carbon reservoir and ecosystem on land, providing not only vital ecological services but also enormous economic benefits in the course of human development [1]. Acquiring forestry parameters such as tree height, crown width, species, and biomass is critical for forest investigation and monitoring. Forest resource monitoring provides an effective scientific methodology for off-ground density estimation, change trend analysis, forest growth detection, harvest prediction, and so on [2-4]. Traditional forest resource monitoring relies on manual field collection, which is time-consuming and labor-intensive and therefore unsuitable for large-scale research. In addition, parameters such as tree height and crown width collected by hand have a high margin of error; it is therefore necessary to explore a new and reliable forest survey method that meets the current needs of forestry production and ecological construction [5]. Because remote sensing offers a wide monitoring range, quick data acquisition, and low cost, applying it to the extraction of forestry parameters over large areas is both theoretically sound and practical.
Passive optical remote sensing, including multispectral, hyperspectral, and high-resolution remote sensing, has been widely used for estimating forest parameters, with notable progress and outcomes. The spectral information of passive optical data from the visible to the near-infrared reflects the physical structure of the forest, and features such as vegetation indices and texture information can then be derived from it. Ouma used semi-variance functions on QuickBird images to investigate the relationship between forest biomass and spectral variables in Kenya [6]. Marshall & Thenkabail compared hyperspectral EO-1 Hyperion data with multispectral data for biomass estimation and determined that the hyperspectral data were superior [7]. Mohammadi et al. developed a model for forest stock estimation in northern Iraq using Landsat ETM+ data [8]. Franklin et al. estimated the depression of spruce using Thematic Mapper (TM) data with an accuracy of 80% [9]. These studies show that passive optical remote sensing data are mostly used to invert the horizontal structural parameters of forests and are rarely used to estimate their vertical structure (e.g., tree height). This is mainly attributed to the low signal penetration of optical remote sensing data, which makes information in the vertical direction difficult to obtain. Some researchers, such as Brown et al. [10], have tried to use high-resolution overlapping stereo images for canopy height estimation, but the elevation accuracy of the ground surface beneath the trees remains insufficient.
Synthetic Aperture Radar (SAR), an active remote sensing technology, can penetrate forest canopies and observe the ground in all weather conditions. SAR signals also interact with treetops and trunks, which allows the vertical structure of forests to be captured. Cloude & Papathanassiou used polarization coherence tomography to reconstruct low-frequency three-dimensional (3D) images and provided a method for optimal interferometric baseline selection to estimate forest vertical structure [11]. Blomberg et al. used L-band SAR data from Argentina’s observation satellite SAOCOM to accurately invert forest biomass in northern Europe [12]. Matasci et al. approximated the above-ground biomass of forests with a root-mean-square deviation (RMSD) of less than 20% using European Space Agency (ESA) P-band radar data [13]. Although SAR is sensitive to forest vertical structure, the backscatter signal often saturates when forest biomass is large. For example, Luckman et al. used JERS-1 SAR data to estimate tropical forest biomass and found that the backscatter coefficient saturated when the biomass reached 6 kg/m², which limited the accuracy of forest biomass estimation [14].
Light Detection and Ranging (LiDAR) offers high angular resolution, high range resolution, and strong anti-interference ability, which make it possible to gather high-precision 3D surface information while avoiding signal saturation in high-biomass areas [15]. In forestry surveys in particular, LiDAR has significant advantages over other remote sensing technologies for forest height measurement and vertical structure acquisition in forest stands. Depending on the sampling method and configuration, LiDAR can provide highly accurate horizontal and vertical information on forests, whereas optical sensors can only provide detailed information on their horizontal distribution. Therefore, this study uses airborne LiDAR data to identify the critical indicators of the forest resources present in the sample area.
Accurate segmentation of tree point clouds is the basis for estimating forestry parameters. Tree crown segmentation methods based on LiDAR data fall into two main categories: raster-based tree segmentation and direct point cloud-based tree segmentation. Raster-based tree segmentation first interpolates the 3D point cloud into a digital surface model (DSM) and then derives a canopy height model (CHM) by normalizing the heights. Then, based on the height undulations in the CHM, local maximum filters [16, 17] or variable windows [18, 19] are used to search for local maxima as initial treetop locations, and finally edge detection or feature extraction methods are employed to identify tree canopies. Watershed segmentation algorithms [20, 21, 22] and flow tracking algorithms [23] are two examples of raster-based tree segmentation algorithms. CHM-based segmentation is quick and effective, but it can produce incorrect segments and omit details. Moreover, the segmentation accuracy is directly influenced by the CHM resolution, and the CHM represents only canopy surface information without describing the canopy’s vertical structure. As LiDAR technology has developed, the density and accuracy of point clouds have improved rapidly, and many researchers now segment tree crowns directly from the point cloud data [24, 25]. Wang et al. first proposed a voxel-based segmentation of raw point cloud data that exploits the vertical canopy structure of the forest, dividing canopy areas of different heights according to the elevation distribution within the voxels before segmenting individual trees [26]. Morsdorf et al. used local maxima as seed points for k-means clustering of 3D point clouds [27]. Li et al. proposed a top-to-bottom region growing algorithm relying on the relative distance between trees; this method achieved 90% segmentation accuracy for coniferous forests, but it did not transfer to dense forest areas with overlapping canopies [28]. Compared with traditional raster-based tree segmentation, processing the point cloud directly reflects the 3D structure of trees more accurately. Unfortunately, the majority of LiDAR-based tree segmentation studies focus on low-density stands, and most of their methods are not well suited to complex forest environments with overlapping canopies and a variety of tree species. Additionally, a single segmentation method is not universal and is difficult to apply to trees of different scales. To obtain good canopy segmentation for subsequent tree species classification and parameter extraction, this paper adopts a rotating profile segmentation method that obtains all possible seed points as initial treetops and finds canopy edges by analyzing the trend of the profile point clouds.
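The raster-based workflow described above (DSM, height normalization to a CHM, then a local maximum search for initial treetops) can be illustrated with a minimal sketch. It is not the method of any cited study: the function names, the fixed square window, the `min_height` threshold, and the use of a terrain raster `dtm` for normalization are all assumptions made for illustration.

```python
import numpy as np

def canopy_height_model(dsm, dtm):
    """Normalize surface heights by the terrain: CHM = DSM - DTM, clipped at 0.
    (dtm is an assumed terrain raster on the same grid as dsm.)"""
    return np.clip(np.asarray(dsm, float) - np.asarray(dtm, float), 0.0, None)

def detect_treetops(chm, window=3, min_height=2.0):
    """Return (row, col) cells that equal the maximum of their
    window x window neighbourhood and are at least min_height tall.
    window must be odd; both parameters are illustrative assumptions."""
    chm = np.asarray(chm, float)
    pad = window // 2
    # Pad with -inf so border cells are compared only with real neighbours.
    padded = np.pad(chm, pad, mode="constant", constant_values=-np.inf)
    windows = np.lib.stride_tricks.sliding_window_view(padded, (window, window))
    local_max = windows.max(axis=(2, 3))
    is_top = (chm == local_max) & (chm >= min_height)
    return [tuple(int(v) for v in rc) for rc in np.argwhere(is_top)]

# Synthetic example: flat terrain at 100 m with two crowns.
dtm = np.full((7, 7), 100.0)
heights = np.zeros((7, 7))
heights[1, 1], heights[1, 2] = 10.0, 6.0   # crown 1 and its shoulder
heights[5, 4], heights[4, 4] = 15.0, 9.0   # crown 2 and its shoulder
chm = canopy_height_model(dtm + heights, dtm)
treetops = detect_treetops(chm, window=3, min_height=2.0)
```

A fixed window that is too small splits large crowns into several treetops, and one that is too large merges neighbouring trees, which is one way to see why the CHM resolution and window choice directly affect segmentation accuracy. Flat plateaus also yield several tied maxima in this simple version.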
For tree species classification and identification based on LiDAR data, Holmgren & Persson used a supervised classification method to distinguish Norway spruce and Scots pine with 95% accuracy [29]. Othmani et al. used terrestrial laser scanning (TLS) data and a wavelet transform to distinguish five tree species with an overall accuracy of 88% [30]. Lin & Hyyppä used a support vector machine approach to classify tree species by extracting point cloud distribution, crown-internal, and tree-external features, and achieved an overall accuracy of 85% [31]. Kim et al. extracted canopy structure parameters for tree species classification using leaf-on and leaf-off LiDAR data acquired in the growing and deciduous seasons; the results indicated that tree species identification from both datasets combined was superior to that from single-season data [32]. In addition, other scholars have introduced point cloud intensity information into tree species classification. For example, Ørka et al. combined structural and intensity features to classify Norway spruce and birch, and their results showed that the classification accuracy was better than with structural or intensity features alone [33]. Although LiDAR intensity is primarily related to the reflectance of surface features, it is also affected by several confounding variables, such as parameters connected to the feature’s environment, the sensor hardware system, and the data acquisition geometry [34]. As a result, parametric models based on intensity information are usually limited to a single location. As the studies above suggest, accurate canopy structure information is the most reliable feature for tree species classification. In this paper, a machine learning method is used to learn the shapes of canopy profiles of known tree species in sample plots and then to build the tree species identification model. The method is suitable for most tree species with different shapes and can be widely used in most forest survey situations.
LiDAR has long been applied successfully in forestry parameter extraction. Solodukhin et al. used LiDAR point cloud data for tree height extraction, and the RMSE between their estimated tree heights and photogrammetry results was 14 cm [35]. Some parameters can be obtained directly from the tree crowns segmented from LiDAR data: tree height and crown width or height are easily obtained, whereas diameter at breast height (DBH) and tree species cannot be obtained directly. Although LiDAR data cannot directly measure DBH, some existing studies use field-measured data to establish relationships and indirectly infer DBH from LiDAR data. For example, Shrestha & Wynne estimated the DBH of trees in urban areas of central Oklahoma, USA, using the Optech ALTM 2050 system with an R² of 0.89 [36]. As parameters derived from LiDAR coordinate information, canopy structure parameters are widely used in forest biomass inversion. They are usually calculated from the vegetation echoes after elevation normalization and include the 25th, 50th, and 75th percentile heights, maximum tree height, mean tree height, and forest canopy height. Bortolot & Wynne established a regression between the 25th, 50th, and 75th percentile heights and biomass, and obtained correlation coefficients between predicted and measured values ranging from 0.59 to 0.82, with RMSE ranging from 13.6 to 140.4 t/ha [37]. Wang et al. estimated above-ground biomass with an Unmanned Aerial Vehicle (UAV) LiDAR system, and the results showed that the mean tree height was the most reasonable predictor of above-ground biomass [38]. Several researchers have recognized the importance of LiDAR intensity data and applied it to biomass inversion. García et al., for example, estimated biomass in a Mediterranean forest in central Spain using height parameters derived from airborne LiDAR point cloud data together with distance-corrected intensity parameters; their results showed that intensity correction could improve the accuracy of forest biomass estimation [39]. Numerous studies have demonstrated that parameter estimation is more accurate when tree species are taken into account. Donoghue et al. discovered that LiDAR-based tree height and biomass estimation algorithms for coniferous forests were not applicable to mixed forests [40]. Jin et al. introduced tree species as a dummy variable into a regression model built on point cloud features to estimate the stocking volume at the peak forest site in Guangxi, which raised the coefficient of determination R² of the model estimates [41]. Pang & Li divided temperate forests in the Xiaoxing’an Mountains into coniferous, broadleaf, and mixed forests for biomass inversion, and the findings revealed that differentiated biomass modeling can further improve biomass estimation accuracy [42]. Therefore, in this paper we use existing tree species information to verify and update incorrect tree species information in the sample plots, and then correct the diameter at breast height, above-ground biomass, storage volume, and other tree parameters in the sample plots based on the accurate tree species information.
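The canopy structure parameters mentioned above (percentile heights, maximum height, and mean height of the normalized vegetation echoes) are straightforward to compute. The following sketch shows one way to do so; the function name and the 2 m `ground_threshold` used to separate vegetation echoes from ground and low returns are assumptions for illustration, not values taken from the cited studies.

```python
import numpy as np

def height_metrics(z, ground_threshold=2.0):
    """Canopy height metrics from height-normalized LiDAR returns.

    z: 1-D sequence of heights above ground (metres).
    ground_threshold: returns at or below this height are treated as
    ground/understory and ignored (an assumed cutoff).
    """
    veg = np.asarray(z, dtype=float)
    veg = veg[veg > ground_threshold]
    if veg.size == 0:
        raise ValueError("no vegetation returns above the threshold")
    p25, p50, p75 = np.percentile(veg, [25, 50, 75])
    return {
        "p25": float(p25),
        "p50": float(p50),
        "p75": float(p75),
        "max": float(veg.max()),
        "mean": float(veg.mean()),
    }

# Two ground-level returns plus five vegetation returns.
metrics = height_metrics([0.1, 0.5, 3.0, 5.0, 7.0, 9.0, 11.0])
```

Metrics like these typically serve as predictors in a regression against field-measured biomass or volume; a species dummy variable, as in the study by Jin et al., would enter such a model as an additional column.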
In summary, this paper addresses the urgent needs of current forestry surveys by using LiDAR point cloud data, which contain high-precision horizontal and vertical structure information, to verify and update erroneous information in manually collected sample plots. The paper addresses the following issues: (1) To solve the segmentation problem of interlocking canopies under the complex growing conditions of the northeastern primeval forest, a rotating profile segmentation method is used to obtain the canopy edge points and, from them, the segmented point cloud. (2) Because the spectral information of tree species varies with season and growth phase, and because multispectral or hyperspectral remote sensing data are in most circumstances difficult to acquire together with LiDAR data, this paper classifies tree species using the geometric structure of tree canopies derived from the segmented point cloud. Since the structure of individual tree species varies considerably, the shape of the canopy section is used: the 3D information is converted into two-dimensional (2D) information, and the line segments intercepted from the section are then used to turn the 2D shape into a one-dimensional (1D) interpolation vector by determining the change trend of the line segments. Based on the 1D vector, a deep belief network (DBN) is used to establish the tree species recognition model, which is combined with the sample tree species information to update erroneous species records. (3) Finally, forestry parameters (diameter at breast height, above-ground biomass, and storage volume) are estimated and updated based on the updated tree species information. This achieves the extraction of forestry survey parameters from LiDAR point cloud data and validates the advantages of LiDAR data for forestry parameter extraction in forest resource surveying.
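To make the 3D-to-2D-to-1D reduction in item (2) more concrete, the sketch below shows one possible reading of it: a thin vertical slice through the treetop gives a 2D profile (distance along the slice, height), and the upper envelope of that profile is resampled to a fixed-length 1D vector by linear interpolation. This is only an illustrative assumption and not necessarily the procedure developed later in the paper; the function name and the parameters `half_width`, `radius`, and `n_samples` are hypothetical.

```python
import numpy as np

def profile_vector(points, treetop_xy, angle_deg,
                   half_width=0.25, radius=5.0, n_samples=16):
    """Fixed-length 1-D description of one crown profile.

    points: (N, 3) array of x, y, z for a single segmented crown.
    treetop_xy: (x, y) of the treetop the profile passes through.
    angle_deg: azimuth of the vertical slice (rotate it to get more profiles).
    half_width, radius, n_samples: assumed slice half-thickness, profile
    half-length, and output vector length.
    """
    pts = np.asarray(points, dtype=float)
    theta = np.deg2rad(angle_deg)
    direction = np.array([np.cos(theta), np.sin(theta)])
    normal = np.array([-np.sin(theta), np.cos(theta)])
    rel = pts[:, :2] - np.asarray(treetop_xy, dtype=float)
    along, across = rel @ direction, rel @ normal
    # 3D -> 2D: keep only points inside the thin vertical slice.
    keep = (np.abs(across) <= half_width) & (np.abs(along) <= radius)
    along, z = along[keep], pts[keep, 2]
    # 2D -> 1D: highest return per bin, gaps filled by linear interpolation.
    edges = np.linspace(-radius, radius, n_samples + 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    idx = np.clip(np.digitize(along, edges) - 1, 0, n_samples - 1)
    envelope = np.full(n_samples, np.nan)
    for b in range(n_samples):
        zb = z[idx == b]
        if zb.size:
            envelope[b] = zb.max()
    filled = ~np.isnan(envelope)
    if not filled.any():
        raise ValueError("no points fall inside the profile slice")
    return np.interp(centers, centers[filled], envelope[filled])

# Synthetic cone-shaped crown sampled along the x axis, plus one
# off-profile point that the slice should ignore.
crown = np.array([[-4, 0, 6], [-2, 0, 8], [0, 0, 10],
                  [2, 0, 8], [4, 0, 6], [0, 3, 20]], dtype=float)
vec = profile_vector(crown, treetop_xy=(0.0, 0.0), angle_deg=0.0,
                     radius=5.0, n_samples=5)
```

Calling the function for several values of `angle_deg` and concatenating the results would give a rotation-sampled feature vector of fixed length, which is the kind of input a classifier such as a DBN requires.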
This paper is organized as follows:
Section 1 discusses the significance and advantages of LiDAR point cloud data in forestry resource surveying, as well as the current status and limitations of research on tree segmentation, tree species classification, and forestry parameter extraction based on LiDAR data, which leads to the method of checking and updating the incorrect information of manually collected sample plots based on LiDAR data proposed in this paper.
Section 2 includes an overview of the study area’s location and characteristics, as well as an introduction to the experimental data gathering methods and characteristics (including measured sample data and LiDAR point cloud data). It also describes the paper’s research methods and processes, such as point cloud data pre-processing, the tree segmentation method for rotating profiles, tree species classification based on segmented point clouds, and estimation and update of forestry parameters.
Section 3 contains the results of tree canopy segmentation, species identification, and parameter extraction, while Section 4 provides a full analysis and explanation of the findings. Section 5 outlines the approach’s merits and drawbacks and provides an outlook on future research.