ARTICLE | doi:10.20944/preprints202112.0138.v2
Subject: Biology, Agricultural Sciences & Agronomy Keywords: Yield mapping; vegetation index; Stepwise; SR; Random Forest; KNN
Online: 9 December 2021 (15:39:34 CET)
Using machine learning to predict yield from remote sensing is an irreversible trend, and on-farm studies aim to help rural producers make decisions. Technology-equipped commercial fields in Mato Grosso, Brazil, were therefore monitored with satellite images to predict cotton yield using supervised learning techniques. The objective of this research was to identify how early in the growing season, with which vegetation indices, and with which machine learning algorithms cotton yield can best be predicted at the farm level. To that end, we took the following steps: 1) We observed yield on 398 ha (3 fields) and calculated eight vegetation indices (VI) on five dates during the growing season. 2) We created two scenarios to facilitate the analysis and interpretation of results: Scenario 1, all data (8 indices on 5 dates = 40 inputs), and Scenario 2, the best variable selected by stepwise regression (1 input). 3) To find the best algorithm, we performed hyperparameter tuning, calibration, and testing of machine learning models to predict yield, and evaluated their performance. Scenario 1 had the best metrics in all study fields, and the Multilayer Perceptron (MLP) and Random Forest (RF) algorithms performed best, with an adjusted R² of 47% and an RMSE of only 0.24 t ha⁻¹. However, this scenario requires all predictive inputs generated throughout the growing season (approx. 180 days). We therefore optimized the prediction by testing only the best VI in each field, and found that, among the eight VIs, the Simple Ratio (SR) driven by the K-Nearest Neighbor (KNN) algorithm predicts yield with an RMSE of 0.26 and 0.28 t ha⁻¹ and a MAPE of 5.20%, anticipating cotton yield with low error by approximately 143 days. It also requires less computation to generate the prediction than MLP and RF, which makes it usable as a technique for predicting cotton yield and saves time for planning, whether in marketing or in crop management strategies.
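As a rough illustration of the single-input scenario described above, the sketch below computes the Simple Ratio (near-infrared reflectance divided by red reflectance) and feeds it to a tiny K-Nearest Neighbor regressor, then scores the predictions with RMSE and MAPE. The reflectance pairs, yields, and the choice of k = 3 are made up for the example and are not taken from the study; the authors' actual pipeline and hyperparameters are not given in the abstract.

```python
import math

def simple_ratio(nir, red):
    """Simple Ratio vegetation index: NIR reflectance divided by red reflectance."""
    return nir / red

def knn_predict(train_x, train_y, query, k=3):
    """Average the targets of the k training points closest to `query` (one feature)."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: abs(p[0] - query))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def rmse(actual, predicted):
    """Root mean squared error, in the units of the target."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical (NIR, red) reflectance pairs and yields in t/ha, for illustration only.
train_pixels = [(0.40, 0.10), (0.45, 0.09), (0.50, 0.08),
                (0.35, 0.12), (0.55, 0.07), (0.30, 0.15)]
train_yield = [4.0, 4.4, 4.9, 3.5, 5.3, 3.0]
test_pixels = [(0.42, 0.10), (0.52, 0.08)]
test_yield = [4.1, 5.0]

train_sr = [simple_ratio(n, r) for n, r in train_pixels]
test_sr = [simple_ratio(n, r) for n, r in test_pixels]
predictions = [knn_predict(train_sr, train_yield, s, k=3) for s in test_sr]

print("RMSE (t/ha):", round(rmse(test_yield, predictions), 3))
print("MAPE (%):", round(mape(test_yield, predictions), 2))
```

Because KNN only stores the training points and averages neighbors at prediction time, it has no training phase to speak of, which is consistent with the abstract's remark about lower computational demand than MLP and RF.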
ARTICLE | doi:10.20944/preprints202105.0216.v1
Subject: Social Sciences, Accounting Keywords: Built environment; pedestrian volume; stepwise regression; principal component analysis; Melbourne
Online: 10 May 2021 (15:34:00 CEST)
Previous studies have mostly examined how sustainable cities try to promote non-motorized travel by creating a walking-friendly environment. Few of them, however, identify how the built environment affects pedestrian volume in high-density areas. This paper presents a methodology that combines Pearson correlation analysis, stepwise regression, and principal component analysis to explore the internal correlation and potential impact of built environment variables. To study this relationship, cross-sectional data from the Melbourne central business district were selected. Pearson's correlation coefficient confirmed that visible green index and intersection density were not correlated with pedestrian volume. The results from stepwise regression showed that land-use mix degree, public transit stop density, and employment density could be associated with pedestrian volume. Moreover, two principal components were extracted by factor analysis. The first component yielded an internal correlation in which land-use and amenities components were positively associated with pedestrian volume. The second component represents parking facilities density, which is negatively related to pedestrian volume. Based on the results, existing street problems were identified and policy recommendations were put forward: diversifying community services within walking distance, improving the service level of the public transit system, and restricting on-street parking in Melbourne.
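The correlation screening step mentioned above can be sketched as follows. This is a minimal example that computes Pearson's r between each candidate built-environment variable and pedestrian volume and ranks the variables by absolute correlation. The variable names and all numbers are hypothetical and do not come from the Melbourne dataset.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical street-segment observations, for illustration only.
pedestrian_volume = [120, 340, 560, 410, 230, 650]
predictors = {
    "land_use_mix": [0.2, 0.5, 0.8, 0.6, 0.3, 0.9],
    "transit_stop_density": [1, 3, 5, 4, 2, 6],
    "parking_density": [9, 6, 3, 5, 8, 2],
}

# Correlation of every predictor with the response.
correlations = {name: pearson_r(values, pedestrian_volume)
                for name, values in predictors.items()}

# Strongest absolute association first; the sign shows the direction.
ranked = sorted(correlations, key=lambda name: abs(correlations[name]), reverse=True)
for name in ranked:
    print(f"{name}: r = {correlations[name]:+.3f}")
```

A screening like this only looks at one variable at a time; stepwise regression and principal component analysis, as used in the paper, additionally account for overlap among the predictors.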
ARTICLE | doi:10.20944/preprints201806.0429.v1
Subject: Biology, Forestry Keywords: Mixed forests; Questionnaire Survey; Ecosystem Services; Stepwise Regression; Climate Change
Online: 26 June 2018 (15:48:31 CEST)
Scientific studies have shown that mixed forests of silver fir (Abies alba Mill.) and European beech (Fagus sylvatica L.) provide higher ecosystem services than monospecific forests. Mixed forests are known for their high resilience to climate change impacts and superior biodiversity compared to monospecific forests. In many countries, the promotion of mixed forests in forest management is becoming government policy, since they can contribute to fulfilling the Sustainable Development Goals set by the United Nations, specifically Goals 13 and 15. However, not much is known about public perceptions of mixed forests compared to monoculture forests. Our study on ecosystem services provided by mixed and monospecific forests in southwest Germany fills this gap. Based on a survey with 520 valid responses, we analyzed people's perceptions of 18 different supporting, cultural, regulating, and provisioning ecosystem services, measured on a Likert scale. Stepwise regression analyses show relations between social profiles (gender, age, education, profession) and respondents' perceptions. Our findings show that people perceive mixed forests as providing better cultural, regulating, and supporting ecosystem services than monospecific forests of fir and beech, whereas provisioning services were perceived as being equally or better provided by monospecific forests. Positive perceptions of the ecosystem services provided by mixed forests were significantly influenced mainly by the perceived abundance of old trees, the feeling of pleasantness in mixed forests, age, profession, and education. Our findings indicate that there is high public support for the promotion of silver fir and beech mixed forests in southwest Germany.
ARTICLE | doi:10.20944/preprints201711.0138.v1
Subject: Keywords: sensitive analysis; variable fuzzy method; mutual entropy; stepwise regression analysis; mountain flash flood risk
Online: 21 November 2017 (09:28:07 CET)
Flash floods are among the most significant natural disasters in China, particularly in mountainous areas, causing heavy economic damage and loss of life. Accurate risk assessment is critical to efficient flash flood management. There are more than 530,000 small watersheds in 2058 counties in China where flash floods should be prevented. In practice, with limited funds and differing risk levels, each small watershed must also be prioritized for flash flood prevention and control. Taking Licheng county in China as an example, this paper aims to establish these management priorities. First, sensitive indexes are identified within an index system of 9 indexes based on the underlying surface characteristics of small watersheds in hilly regions. Second, the range of each index and its rank division for evaluation are determined. Based on these rank divisions, the flash flood risk grade eigenvalue (H) is calculated by the Variable Fuzzy Method (VFM) using 1000 samples generated by Latin hypercube sampling. Third, the key sensitivity factors that affect H are assessed by two different global sensitivity analysis methods: stepwise regression analysis and mutual entropy. Both results indicate that watershed slope (S) is the most sensitive factor and antecedent precipitation index (CN) the second, while the ranking of the other factors differs slightly between the two methods. This study shows that stepwise regression analysis and mutual entropy analysis are appropriate for the sensitivity analysis of mountain flash flood risk. Finally, based on watershed slope (S), the priorities for flash flood prevention and control of 119 small watersheds in Licheng county are given.
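The sampling step described above can be illustrated with a minimal Latin hypercube sampler: each index range is cut into as many equal strata as there are samples, one value is drawn inside every stratum, and the strata are shuffled independently per index. The two index ranges below (a slope in degrees and a CN-like index) are hypothetical placeholders; the paper's nine indexes and their actual ranges are not given in the abstract.

```python
import random

def latin_hypercube(n_samples, bounds, seed=0):
    """Draw n_samples points so that, in every dimension, each of the
    n_samples equal-width strata contains exactly one point."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        # One uniform draw inside each stratum of the unit interval.
        unit_points = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # Shuffle so strata are paired at random across dimensions.
        rng.shuffle(unit_points)
        # Rescale from [0, 1) to the index's own range.
        columns.append([low + p * (high - low) for p in unit_points])
    # Transpose the per-dimension columns into one row per sample.
    return [list(row) for row in zip(*columns)]

# Hypothetical ranges for two indexes, for illustration only.
bounds = [(0.0, 45.0),    # watershed slope, degrees (assumed range)
          (30.0, 100.0)]  # CN-like index (assumed range)
samples = latin_hypercube(1000, bounds)

print(len(samples), "samples; first row:", [round(v, 2) for v in samples[0]])
```

Each row could then be passed to the risk model to obtain one value of H, giving the input-output table on which a regression-based or entropy-based sensitivity ranking is computed.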
ARTICLE | doi:10.20944/preprints201608.0202.v2
Subject: Earth Sciences, Environmental Sciences Keywords: HR satellite remote sensing; urban fabric vulnerability; UHI & heat waves; landsat & MODIS sensors; LST & urban heating; segmentation & objects classification; data mining; feature extraction & selection; stepwise regression & model calibration
Online: 26 October 2021 (13:11:23 CEST)
Densely urbanized areas with a low percentage of green vegetation are highly exposed to Heat Waves (HW), which are now increasing in frequency and intensity in mid-latitude regions as well, due to ongoing Climate Change (CC). Their negative effects may combine with those of the Urban Heat Island (UHI), a local phenomenon in which air temperatures in the compact built-up cores of towns rise more than those in the surrounding rural areas, with significant impact on the quality of the urban environment, citizens' health, energy consumption, and transport, as occurred in the summer of 2003 in France and central-northern Italy. In this context, this work aims to design and develop a methodology based on aerospace remote sensing (Earth Observation, EO) at medium-high resolution and recent GIS techniques for the extensive characterization of the urban fabric's response to these temperature-related climatic impacts, within the general framework of supporting local and national strategies and policies for adaptation to CC. Because of its extent and variety of built-up typologies, the municipality of Rome was selected as the test area for developing and validating the methodology. We started with photointerpretation of detailed-scale cartography (CTR 1:5000) over a reference area consisting of a transect of about 5 x 20 km, extending from downtown to the suburbs and including all the built-up classes of interest. The reference built-up vulnerability classes found inside the transect were then used as training areas to classify the entire territory of the municipality of Rome. To this end, High Resolution (HR) multispectral satellite EO data from the Landsat sensors were used within a purpose-built "supervised" classification procedure based on data mining and "object-classification" techniques. The classification results were then used to implement a calibration method, based on a typical UHI temperature distribution derived from MODIS satellite sensor LST (Land Surface Temperature) data from the summer of 2003, to obtain an analytical expression of the vulnerability model, previously introduced on a semi-empirical basis.