Modeling Potential C, N, H Content in Aboveground Biomass with Spectral Data from Sentinel 2a

Neftalí Reyes-Zurita, Joaquín A. Rincón-Ramírez, Gerardo Rodríguez-Ortiz, José R. Enríquez-del Valle, Vicente A. Velasco-Velasco, Ernesto Castañeda-Hidalgo
Tecnológico Nacional de México, Instituto Tecnológico del Valle de Oaxaca, Division de Estudios de Posgrado e Investigación. Ex Hacienda de Nazareno s/n, Z.C. 71233 Xoxocotlán, Oaxaca, México; Colegio de Postgraduados, Campus Tabasco, Periférico s/n, Z.C. 86500 H. Cárdenas, México. Corresponding author (gerardo.rodriguez@voaxaca.tecnm.mx).
Abstract: Nutrient estimation in forest ecosystems through satellite images yields accurate data by transforming forest stand data and modeling their relationship with the spectral information of the image. The objective of the study was to quantify and validate the content of C, N and H in aboveground tree biomass in managed stands using spatial modeling and satellite images. This study was conducted during 2017-2018 in managed forest stands in San Juan Lachao, Oaxaca, Mexico. Fifteen 400 m² experimental sites were selectively established, using a completely randomized experimental design of five silvicultural treatments with three replications. As part of data preprocessing, the assumptions of normality and homogeneity of variances were checked using the Shapiro-Wilk and Bartlett tests, respectively. The average Normalized Difference Vegetation Index (NDVI) of the pixels surrounding the sampling sites was contrasted against the data obtained from the forest inventory, and regression models to estimate C, N, H and biomass were generated. Models were validated by NDVI. With the models we estimated 0.95 t ha⁻¹ of biomass, which contains between 0.61 and 0.63 of C, 0.44 to 0.46 of N and 0.24 of H. The models generated had coefficients of determination (R²) of 0.85 to 0.87, with significant parameters (p ≤ 0.0001). These results confirm that the use of Sentinel
Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 11 March 2020 | doi:10.20944/preprints202003.0187.v1


Introduction
Sentinel 2A belongs to a program of new satellites developed by the European Union that have been launched since 2015. Since 2016 the data have been available to any user worldwide. These images have become an indispensable tool for the analysis of flora and fauna. Research conducted with this sensor has obtained very satisfactory results owing to its high spatial, temporal, spectral and radiometric resolution [1]. Therefore, with multitemporal analysis, it is possible to predict and prevent future risk scenarios [2].
In recent years, thanks to a deeper understanding of the functional dynamics of plant physics and physiology, sensor technology has been used to evaluate plants indirectly and nondestructively [3]. Satellite images have an extensive field of use; one of their main applications is observing changes over time in the vegetation under study [4].
Knowledge of the spectral models of different terrestrial covers permits adequate interpretation of satellite images. Vigorous vegetation reflects very little energy in the visible region of the spectrum, mainly because chlorophyll absorbs it to carry out photosynthesis, whereas a high proportion of the electromagnetic radiation in the near-infrared region is reflected because of the characteristic structure of leaf tissues. Satellite images are thus considered a potential tool for quantifying the nutrient content of vegetation. However, estimating macro- and micronutrients in heterogeneous mixed grassland remains a great challenge [5].
The Normalized Difference Vegetation Index (NDVI) is an index used in agricultural and forest sites. It shows a direct relationship between the numerical value captured by the sensor and the plant variable to be measured, such as plant biomass or vigor. The spectral response of healthy vegetation shows a clear contrast between the visible spectrum, especially the red band, and the near-infrared (NIR) band. In its interpretation, values below 0.1 are considered to correspond to water bodies and bare soils, while higher values indicate the photosynthetic activity of different types of vegetation [6]. This index was introduced to separate vegetation from the brightness produced by the soil. It is based on the peculiar radiometric behavior of vegetation, related to photosynthetic activity and the plant's leaf structure, which determine plant vigor [7].
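As a minimal sketch of the interpretation described above, the snippet below computes NDVI for single pixels with the standard formula (NIR - red) / (NIR + red) and applies the 0.1 cut-off mentioned in the text. The function names and the reflectance values are hypothetical and serve only as an illustration; they are not data from this study.

```python
def ndvi(nir, red):
    """Return NDVI for one pixel from near-infrared and red reflectance."""
    total = nir + red
    if total == 0:
        return 0.0  # assumption: a pixel with no signal is reported as NDVI 0
    return (nir - red) / total


def is_vegetation(value, threshold=0.1):
    """Apply the 0.1 cut-off: lower values are read as water or bare soil."""
    return value >= threshold


# Hypothetical (NIR, red) reflectance pairs, for illustration only.
pixels = {
    "dense canopy": (0.45, 0.05),
    "bare soil": (0.30, 0.25),
    "water": (0.02, 0.04),
}

for name, (nir, red) in pixels.items():
    value = ndvi(nir, red)
    print(f"{name}: NDVI = {value:.2f}, vegetation = {is_vegetation(value)}")
```

With these assumed values only the canopy pixel passes the threshold, because its NIR reflectance is far higher than its red reflectance.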
In Mexico [8], Italy [9], the Philippines [10], China [11], France [12] and other countries, work has been done with the same satellite and the NDVI for urban vegetation analysis, soil classification, crop analysis and other purposes. Direct models center the data at a single point, while satellite images (with correct image processing and acquisition dates as close as possible to the sampling dates) have been shown to improve the accuracy of forest biomass estimations [13].
Although this approach has been relegated because of the methodological difficulties of producing detailed cartographic representations of vegetation variability or providing relevant information for forest management and decision-making at the landscape scale [10], transforming the data recorded at the site (diameter at breast height and total height) and relating them to the pixels provides the information needed to model the strata of a given forest in terms of silviculture and ecology. In addition, satellite data make it possible to estimate tree height and to update the inventory or fill in missing data effectively by correlating pixel data with points recorded in the stand inventory. Moreover, LIDAR measurements provide samples of the forest structure that must be integrated with satellite images to predict and map variations in forest structure at the landscape scale [14].
Aboveground biomass is a fundamental element of forest ecosystems, important for its capacity to store C as well as other elements such as N and H, which indicates the forest's production capacity. Forest ecosystems store very significant amounts of greenhouse gases [15]. Estimation of forest stand biomass with accurate data from remote sensors is based on a strong statistical relationship between C and the spectral response captured by the sensor as spatially explicit knowledge [16].
Through research, it has been shown that forest reflectivity is assessed through the vertical distribution of the tree strata [17]. Interactions between the structural variables of the inventory data such as height, basal area, density, age and biomass at the time of modeling with satellite images have also been detected [18].
Planning forestry activities requires information such as geographical features. In addition to thematic maps that support decision-making, new alternatives have been adopted, as is the case in northwestern Mexico where Landsat ETM satellite images from 2001 were used to optimize forestry planning [19].
The community of San Juan Lachao has large forest areas that are geographically difficult to access. This community is a pioneer in the commercialization of C bonds (Carboin) in the international voluntary market, through development of the project under the forest protocol for Mexico of the Climate Action Reserve. The community is interested in constantly improving its methods of evaluating its natural resources. In this context, the use of Sentinel satellite images presents an alternative that saves time and resources in evaluating natural and environmental resources. For this reason, the objective of this study was to quantify and validate the C, N and H content in aboveground tree biomass in different silvicultural treatments using spatial modeling and satellite images.

Study Area
The

Nutrients in Aboveground Biomass
In the arboreal compartment, the total tree volume of the pines and broadleaf species was
To estimate the biomass of the different tree parts, a sample of approximately 100 g was extracted from each part and kept in a tagged paper bag to determine fresh weight (FW, kg) and dry weight (DW, kg) after drying for six days at a temperature between 75 °C and 100 °C. The total
NDVI = (NIR - R) / (NIR + R)
Where: NDVI = normalized difference vegetation index, NIR = near infrared, R = red.
The indexes were constructed to highlight characteristics of the vegetation that are mainly a function of its chlorophyll, cell structure and water content [22]. The indexes are more sensitive in some parts of the electromagnetic spectrum and are better detected in specific bands [23].
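The abstract states that the average NDVI of the pixels surrounding each sampling site was contrasted with the inventory data. A minimal sketch of that averaging step is given below, assuming a square window centred on the pixel that contains the site. The window size, the function name `window_mean`, the use of `None` for masked pixels and the grid values are all assumptions made for illustration, since the text does not specify them.

```python
def window_mean(grid, row, col, radius=1):
    """Mean of the valid cells in a square window centred on (row, col).

    The window is clipped at the grid edges; None marks a masked pixel
    (for example cloud) and is skipped.
    """
    values = []
    for r in range(max(0, row - radius), min(len(grid), row + radius + 1)):
        for c in range(max(0, col - radius), min(len(grid[r]), col + radius + 1)):
            if grid[r][c] is not None:
                values.append(grid[r][c])
    if not values:
        raise ValueError("no valid pixels around the site")
    return sum(values) / len(values)


# Hypothetical NDVI grid, for illustration only.
ndvi_grid = [
    [0.70, 0.72, 0.68, 0.40],
    [0.74, 0.80, 0.76, 0.35],
    [0.71, 0.78, None, 0.30],
]

# Mean NDVI of the 3 x 3 window around the site located at row 1, column 1.
site_mean = window_mean(ndvi_grid, row=1, col=1)
print(f"mean NDVI around the site: {site_mean:.4f}")
```

Averaging over a small window rather than reading a single pixel reduces the effect of small positioning errors between the plot coordinates and the image grid.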

Data Analysis
A database was compiled and used to test regression models per tree. The dependent variable was the value of the pixels and the independent variables were C, H, N and biomass (Table 1). The analyses were carried out with SAS [24] statistical software. Assumptions of normality and homogeneity of variances were verified with the Shapiro-Wilk (UNIVARIATE procedure) and Bartlett tests, respectively. The models were fitted using the MODEL procedure and selected based on their statistical indicators. A simple linear model was used for C, N and biomass, and an exponential regression for H.
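The two model forms named above can be sketched in plain Python as follows. This is not the SAS MODEL procedure used in the study: the linear model is fitted by ordinary least squares, and the exponential model is assumed to have the form y = β0 · exp(β1 · x) and is fitted by least squares on log(y), which is a simpler substitute for a direct nonlinear fit. The variables x and y are generic, and the data are synthetic values generated from known parameters so that the fit can be checked.

```python
import math


def fit_linear(x, y):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values are all identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


def fit_exponential(x, y):
    """Fit y = b0 * exp(b1 * x) by least squares on log(y); needs y > 0."""
    log_b0, b1 = fit_linear(x, [math.log(yi) for yi in y])
    return math.exp(log_b0), b1


# Synthetic data generated from known parameters, for illustration only.
x = [0.60, 0.65, 0.70, 0.75, 0.80]
y_linear = [1.0 + 2.0 * xi for xi in x]
y_exponential = [0.5 * math.exp(1.5 * xi) for xi in x]

print("linear (b0, b1):", fit_linear(x, y_linear))
print("exponential (b0, b1):", fit_exponential(x, y_exponential))
```

Fitting on log(y) minimizes relative rather than absolute errors, so its estimates can differ slightly from those of a nonlinear fit on the original scale when the data are noisy.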

Results
The models that best estimated C, N, H and biomass from spectral data were the simple linear and the exponential regression models. The root mean square error (RMSE) showed low values, between 0.01 and 0.02, indicating a good fit. The NDVI spectral data explained 85% of the variation in C, H and biomass and 87% of the variation in N; in all cases, the relationship was significant (p = 0.0001). Accordingly, the parameters (β0 and β1) used to estimate the nutrient contents and biomass were significant in the models (p = 0.0001).
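The two fit statistics reported above can be computed as in the following sketch. It assumes the RMSE is the square root of the mean squared residual over n observations (statistical packages often divide by the residual degrees of freedom instead) and that R² is one minus the ratio of the residual to the total sum of squares. The observed and predicted values are hypothetical, not results from this study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, dividing by the number of observations."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Hypothetical observed and model-predicted values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

print(f"RMSE = {rmse(observed, predicted):.4f}")
print(f"R2 = {r_squared(observed, predicted):.4f}")
```

RMSE is expressed in the units of the response variable, so a value of 0.01 to 0.02 is only meaningful relative to the magnitude of the estimated quantities.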
The NDVI, based on the reflectivity of Sentinel bands 4 and 8, satisfactorily described the elements' behavior because of the strong energy absorption caused by vigorous, healthy vegetation, chlorophyll absorption and the moisture present in the vegetation (Table 2).

Discussion
The need to understand and simulate the behaviors associated with different vegetation changes has been increasing. One of the first processes is observation; then comes experimentation, and finally satellite images [25,26] to obtain geolocated information, especially in large areas of difficult access [27]. To estimate C as accurately as possible, the spatial resolution of the satellite image becomes a key factor in validating models. Yan [28], working with resolutions between 30 and 1000 m², used a model that showed high coefficients of variation (CV) as the image resolution decreased. In our investigation, the CV was 5.55 at a 20 × 20 m resolution; the difference is due to the type of satellite used (Sentinel 2a).
Studies carried out in Mexico, the United States, Europe, Iran and Asia with different satellites, such as Sentinel 2 and 3, Landsat and Envisat, and using statistical models report coefficients of determination similar to those obtained in our research.
This highlights that Sentinel sensors differ from the others in that they are much better suited to making these types of estimates in forest ecosystems (Table 3) [34,35].
Indirect estimation of C and forest variability is done with different methods and technologies [36] from different variables, either height [37,38] or biomass, which was the methodology used in our investigation.
In estimating N with NDVI, it has been shown that reflectivity in some parts of the electromagnetic spectrum is highly correlated with the N percentage measured in the stand at tree level [39,40]. Our research produced R² = 0.87, comparable with the results of Pellissier [41] and Wang [42] in mixed temperate forest. By comparison, results generated in agricultural areas with airborne hyperspectral images reached a fit of R² = 0.79, suggesting that Sentinel 2a images estimate nutrient contents much better. Moreover, compared with the results obtained by Ewald [43] with Lidar data (R² = 0.63), the images obtained from Sentinel allow more robust models to be generated to predict the aforementioned variables.
For this reason, the use of remote sensing has boomed in nutrient quantification and in forest management, where it is used to estimate forest productivity over large areas and to support the establishment of inventory control points [44]. H quantification gave 0.22 t ha⁻¹ through the pixel ratio, compared with 0.24 t ha⁻¹ obtained from the samples collected in the stand. Norverto [45] mentions that H is one of the scarcest elements; it can make up 6% of a complete tree, depending on the species.
The different satellite images provide very accurate information at the local, regional and national levels if their methodology is followed correctly [28]. The biomass estimated through the NDVI and simple regression was 0.95 t ha⁻¹ with R² = 0.85, values slightly lower than those obtained by Lumbierres [46] using NDVI from MODIS images (1 to 10 t ha⁻¹). This difference could be due, first, to the fact that the spatial resolution of the MODIS sensor is lower than that of Sentinel and, second, to its smaller R² = 0.62 with multiple regression. It may also vary depending on vegetation type and site condition. For example, Ruiz [47] obtained R² = 0.89 in Pinus halepensis Mill., while Méndez [48], who used the ALOS-PALSAR sensor, found R² = 0.63 in pine species, and Cáceres [49], in an analysis of the temporal behavior of grass biomass in Honduras, obtained R² = 0.82 from the relationship between biomass and the spectral index (SAVI) with simple linear regression using Sentinel 2a images. However, in the biomass quantification of managed forest areas of this study, a coefficient of R² = 0.85 was obtained (

Conclusions
Estimates carried out using statistical models with spectral data in an area of stands managed