ARTICLE | doi:10.20944/preprints202203.0337.v2
Subject: Earth Sciences, Geoinformatics Keywords: landslide susceptibility; stacking ensemble; machine learning; random forest; gradient boosting decision tree; extreme gradient boosting
Online: 5 October 2022 (10:29:51 CEST)
The current study aims to apply and compare the performance of six machine learning algorithms, including three basic classifiers: random forest (RF), gradient boosting decision tree (GBDT), and extreme gradient boosting (XGB), as well as their hybrid classifiers, using the logistic regression (LR) method (RF+LR, GBDT+LR, and XGB+LR), in order to map the landslide susceptibility of Zhangjiajie City, Hunan Province, China. First, a landslide inventory map was created with 206 historical landslide points and 412 non-landslide points, which was randomly divided into two datasets for model training (80%) and model testing (20%). Second, 15 landslide conditioning factors (i.e., altitude, slope, aspect, plane curvature, profile curvature, relief, roughness, rainfall, topographic wetness index (TWI), normalized difference vegetation index (NDVI), distance to roads, distance to rivers, land use/land cover (LULC), soil texture, and lithology) were initially selected to establish a landslide factor database. Thereafter, the multicollinearity test and information gain ratio (IGR) technique were applied to rank the importance of the factors. Subsequently, we used a series of metrics (e.g., accuracy, precision, recall, F-measure, area under the receiver operating characteristic (ROC) curve (AUC), kappa index, mean absolute error (MAE), and root mean square error (RMSE)) to evaluate the accuracy and performance of the six models. Based on the AUC values derived from the models, the GBDT+LR model with the highest AUC value (0.8168) was identified as the most efficient model for mapping landslide susceptibility, followed by the XGB+LR, XGB, RF+LR, GBDT, and RF models, which achieved AUC values of 0.8124, 0.8118, 0.8060, 0.7927, and 0.7883, respectively. The results from this study suggest that the stacking ensemble machine learning method is promising for use in landslide susceptibility mapping in the Zhangjiajie area and is capable of targeting the areas prone to landslides.
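Two of the evaluation metrics named in this abstract, AUC and the kappa index, can be illustrated with a minimal sketch. The labels and scores below are invented for illustration and are not the study's data; the function names `roc_auc` and `cohen_kappa` are hypothetical helpers, and the 0.5 decision threshold is an assumption.

```python
from itertools import product


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive (landslide) sample
    scores higher than a random negative one; ties count as 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between labels and predictions,
    corrected for the agreement expected by chance."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    classes = set(y_true) | set(y_pred)
    expected = sum((y_true.count(c) / n) * (y_pred.count(c) / n)
                   for c in classes)
    return (observed - expected) / (1.0 - expected)


# Invented toy data: 1 = landslide point, 0 = non-landslide point.
labels = [1, 1, 0, 0, 0, 1]
susceptibility = [0.9, 0.4, 0.35, 0.8, 0.1, 0.7]
predicted = [1 if s >= 0.5 else 0 for s in susceptibility]  # assumed threshold

auc_value = roc_auc(labels, susceptibility)
kappa_value = cohen_kappa(labels, predicted)
```

The pairwise definition of AUC used here is quadratic in the number of samples, which is acceptable for a dataset of a few hundred points; a rank-based formulation would be preferable for larger inventories.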
ARTICLE | doi:10.20944/preprints202103.0112.v1
Subject: Materials Science, Biomaterials Keywords: Intramolecular charge transfer; copolymers; pi-pi stacking; direct arylation polycondensation; excitonic state
Online: 2 March 2021 (21:36:19 CET)
Three low band gap copolymers based on isoindigo acceptor units were designed and successfully synthesized by the direct arylation polycondensation method. Two of them were benzodithiophene (BDT)-isoindigo copolymers (PBDTI-OD and PBDTI-DT) with 2-octyldodecyl (OD) and 2-decyltetradecyl (DT) substituted isoindigo units, respectively. Thiophene donor and DT-substituted isoindigo acceptor units were copolymerized to synthesize PTI-DT. The copolymers have a broad absorption range that extends to over 760 nm with a band gap ~ .5 eV. The photophysical property studies showed that the BDT-based copolymers have non-polar ground states. Their emission exhibited the population of an intramolecular charge transfer (ICT) state in polar solvents and a tightly bound excitonic state in non-polar solvents due to self-aggregation. In contrast, the emission from the thiophene-based copolymers was only from the tightly bound excitonic state. The thermal decomposition temperature of the copolymers was above 380 °C. The X-ray diffraction pattern of the three copolymers showed a halo due to pi-pi stacking. A second, sharper peak was observed in the BDT-based copolymer with the longer side chain on the isoindigo unit (PBDTI-DT) and in the thiophene-based copolymers, with PTI-DT exhibiting a better structural order.
ARTICLE | doi:10.20944/preprints202109.0181.v1
Subject: Keywords: Classification; stacking ensemble method; heart surgery; unbalanced data problem; hybrid predictive model; machine learning in healthcare; resampling method; Edited-Nearest-Neighbor; nonparametric test.
Online: 10 September 2021 (10:53:35 CEST)
Owing to spectacular improvements in health care and biomedicine, hospitals now record a tremendous amount of data. In addition, the most effective approach to reducing disease mortality is to diagnose the disease as early as possible. As a result, data mining that applies machine learning to disease data provides good opportunities to examine the hidden patterns in these collections. An accurate forecast of mortality after heart surgery leads to successful medical treatment and lower costs. This research proposes a new stacking predictive model that, after applying the random forest feature importance method, predicts mortality after heart surgery on a highly unbalanced dataset using the most practical features. To solve the unbalanced data problem, a combination of the SVM-SMOTE over-sampling algorithm and the Edited-Nearest-Neighbor under-sampling algorithm is used. This research compares the introduced model with several other machine learning classifiers through shuffle hold-out and 10-fold cross-validation strategies. Both the shuffle hold-out and the 10-fold cross-validation results indicated that our model had the highest efficiency compared to the other models. Furthermore, the Friedman statistical test is applied to examine the differences between models. The result demonstrates that the introduced stacking model reached the most accurate predicting performance after Logistic Regression.
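The Friedman test mentioned in this abstract compares several models across the same folds by ranking them within each fold. The following is a minimal sketch of the test statistic only, without tie correction or p-value; the fold scores are invented, and `average_ranks` and `friedman_statistic` are hypothetical helper names rather than anything from the paper.

```python
def average_ranks(values):
    """Rank values so the largest gets rank 1; tied values share
    the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def friedman_statistic(scores):
    """Friedman chi-square statistic for a table with one row per fold
    and one column per model (no tie correction applied)."""
    n = len(scores)      # number of folds (blocks)
    k = len(scores[0])   # number of models (treatments)
    rank_sums = [0.0] * k
    for row in scores:
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) \
        - 3.0 * n * (k + 1)


# Invented accuracy scores: 4 folds (rows) x 3 models (columns).
fold_scores = [
    [0.90, 0.85, 0.80],
    [0.88, 0.86, 0.79],
    [0.91, 0.84, 0.83],
    [0.87, 0.85, 0.82],
]
statistic = friedman_statistic(fold_scores)
```

With the same model winning every fold, the statistic reaches its maximum of n(k - 1); it is compared against a chi-square distribution with k - 1 degrees of freedom to decide whether the models differ.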
ARTICLE | doi:10.20944/preprints202103.0650.v1
Subject: Materials Science, Biomaterials Keywords: palladium; palladium alloys; generalized stacking fault energy; twinnability; stacking fault energy; ab initio calculations
Online: 25 March 2021 (17:19:22 CET)
Generalized stacking fault energies of palladium alloys were calculated using density functional theory. The stacking fault energy of palladium alloys is correlated with the valence electrons of the transition metal element. The twinning tendency is also modified by the presence of an alloying element in the plane of deformation. The obtained results suggest that Pd–transition metal alloys with elements such as Cr, Mo, W, Mn, and Re are expected to exhibit a high work hardening rate due to their tendency to emit partial dislocations and form mechanical twins, which results in increased strength and ductility.
ARTICLE | doi:10.20944/preprints202104.0414.v1
Subject: Physical Sciences, Acoustics Keywords: Heterogeneous Nucleation; Ice Polymorph; Stacking Disorder; Phase Selectivity
Online: 15 April 2021 (12:43:16 CEST)
Ice with a stacking-disordered structure, consisting of random sequences of cubic ice (Ic) and hexagonal ice (Ih) layers, has recently been reported to be more stable than pure Ih or Ic. However, because heterogeneous nucleation has a much lower free energy barrier, the freezing of water is in practice usually controlled by heterogeneous nucleation triggered by an external medium. Herein, molecular dynamics simulations were carried out to explore how the ice polymorph depends on the lattice structure of the substrate. It turns out that, during the nucleation stage, the polymorph of ice nuclei can be strongly altered by a graphene substrate, on which Ih was found to make up the large majority of newly formed ice. This can be attributed to the structural similarity between graphene and the basal face of Ih. Beyond the nucleation stage, our results suggest that the substrate cannot affect the polymorph of ice that is far from the graphene surface. The polymorph selectivity of graphene for Ih diminishes as the ice layer grows.
ARTICLE | doi:10.20944/preprints201811.0096.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: machine learning; stacking; forecasting; regression; sales; time series
Online: 5 November 2018 (09:54:54 CET)
In this paper, we study the use of machine learning models for sales time series forecasting. The effect of machine learning generalization has been considered. A stacking approach for building a regression ensemble of single models has been studied. The results show that, using stacking techniques, we can improve the performance of predictive models for sales time series forecasting.
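The stacking idea described here, combining the predictions of single models with a second-level regressor, can be sketched on a toy series. Everything below is an assumption made for illustration: the sales figures are invented, the two base models (a naive last-value forecast and a moving average) are stand-ins for the unspecified models of the paper, and the meta-model is a plain two-weight least-squares fit without intercept, evaluated in-sample.

```python
from statistics import mean


def base_forecasts(series, window=3):
    """For each time step, collect two base forecasts and the actual value.
    Both forecasts use only past values, so they are out-of-sample."""
    rows = []
    for t in range(window, len(series)):
        naive = series[t - 1]                   # last observed value
        moving = mean(series[t - window:t])     # mean of the last `window` values
        rows.append((naive, moving, series[t]))
    return rows


def fit_stacking_weights(rows):
    """Least-squares weights (w1, w2) for y ~ w1*naive + w2*moving,
    obtained by solving the 2x2 normal equations with Cramer's rule."""
    s11 = sum(a * a for a, _, _ in rows)
    s12 = sum(a * b for a, b, _ in rows)
    s22 = sum(b * b for _, b, _ in rows)
    s1y = sum(a * y for a, _, y in rows)
    s2y = sum(b * y for _, b, y in rows)
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("base forecasts are collinear")
    return (s1y * s22 - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


def sse(rows, w1, w2):
    """Sum of squared errors of the weighted combination."""
    return sum((y - w1 * a - w2 * b) ** 2 for a, b, y in rows)


# Invented monthly sales figures.
sales = [120, 135, 128, 150, 162, 158, 171, 180, 176, 190]
rows = base_forecasts(sales)
w_naive, w_moving = fit_stacking_weights(rows)
stacked_error = sse(rows, w_naive, w_moving)
```

Because the weights (1, 0) and (0, 1) reproduce the individual base models, the fitted combination can never have a larger squared error than either of them on the data it was fitted to; whether the gain carries over to unseen data has to be checked on a held-out period.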
ARTICLE | doi:10.20944/preprints202204.0209.v1
Subject: Physical Sciences, Condensed Matter Physics Keywords: defect-induced superconductivity; graphite; stacking faults; magnetic force microscopy
Online: 22 April 2022 (03:43:45 CEST)
Granular superconductivity at high temperatures in graphite can emerge at certain two-dimensional (2D) stacking faults (SFs) between twisted (around the c-axis) or untwisted crystalline regions with Bernal (ABA...) and/or rhombohedral (ABCABCA...) stacking order. One way to observe such 2D superconductivity experimentally is to measure the frozen magnetic flux produced by a permanent current loop that remains after removing an external magnetic field applied normal to the SFs. Magnetic force microscopy was used to localize and characterize such a permanent current path found in one natural graphite sample out of ∼50 measured graphite samples of different origins. The position of the current path drifts with time and roughly follows a logarithmic time dependence similar to the one for flux creep in type II superconductors. We demonstrate that a ≃10 nm deep scratch on the sample surface at the position of the current path causes a change in its location. A further scratch was enough to irreversibly destroy the remanent state of the sample at room temperature. Our studies clarify some of the reasons for the difficulty of finding trapped flux in the remanent state at room temperature in graphite samples with SFs.
REVIEW | doi:10.20944/preprints202212.0175.v1
Subject: Medicine & Pharmacology, Pharmacology & Toxicology Keywords: SARS-CoV-2; Poly(A); dexrazoxane; supramolecular self-assembly; base stacking
Online: 9 December 2022 (09:55:12 CET)
In 2018, the author identified a previously unknown/unreported association between dexrazoxane and poly(ADP-ribose) (PAR). Interestingly, PAR is a close structural analogue of the polyadenine nucleotide polymer, polyadenosine monophosphate (poly(A)). In this report, subsequent in silico modelling of the interaction between dexrazoxane and poly(A) reveals some notable differences from the previously reported interaction between dexrazoxane and PAR. Significantly, the supramolecular self-assembly of dexrazoxane and poly(A) is distinguished by vertically orientated nonelectrostatic forces comparable to the stabilizing interactions between stacked bases within DNA. Notably, the vertical separation of 3.4 Å between each stack is consistent with solvent entropy as a dominant driving force in stabilising the interaction. Additionally, concomitant conformational analysis by the author reveals the existence of low energy planar conformers of dexrazoxane. This analysis enables an explanation for the considerable discrepancies and conflicts that exist within the reported pharmacokinetic data for dexrazoxane. Exploring the significance of the interaction between dexrazoxane and poly(A), the author illustrates that survival, translation and replication of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) are absolutely dependent upon the mature and unhindered poly(A) tail of the SARS-CoV-2 genome. The proposition herein, that dexrazoxane, as a chameleonic agent, sequesters the poly(A) tail of the SARS-CoV-2 genome by the catalysis of a supramolecular hybrid assembly, establishes SARS-CoV-2 infected cells as deep compartments for the accumulation of dexrazoxane. Taken together, dexrazoxane, or its demethylated analogue, represents a novel treatment to kill the SARS-CoV-2 virus by irreversible destabilization of the SARS-CoV-2 poly(A) tail.
ARTICLE | doi:10.20944/preprints202007.0149.v1
Subject: Materials Science, General Materials Science Keywords: organic electronics; organic semiconductors; molecular design; crystal design; π-stacking; charge mobility
Online: 8 July 2020 (11:23:32 CEST)
The chemical versatility of organic semiconductors provides nearly unlimited opportunities for tuning their electronic properties. However, despite decades of research, the relationship between molecular structure, molecular packing and charge mobility in these materials remains poorly understood. This reduces the search for high-mobility organic semiconductors to an inefficient trial-and-error approach. To clarify this relationship, investigations of the effect of small changes in chemical structure on the properties of organic semiconductors are particularly important. In this study, we address computationally the impact of substituting C-H atom pairs with nitrogen atoms (N-substitution) on the molecular properties, molecular packing and charge mobility of crystalline oligoacenes. Besides lowering the frontier molecular orbital levels, N-substitution dramatically alters the molecular electrostatic potential, yielding pronounced electron-rich and electron-deficient areas. These changes in the molecular electrostatic potential strengthen face-to-face and edge-to-edge interactions in the corresponding crystals and result in a crossover from the herringbone packing motif to π-stacking. When the electron-rich and electron-deficient areas are large, sharply defined and, probably, have a certain symmetry, charge mobility increases up to 3-4 cm² V⁻¹ s⁻¹. The results obtained highlight the potential of azaacenes for application in organic electronic devices and are expected to facilitate the rational design of organic semiconductors for the steady improvement of organic electronics.
ARTICLE | doi:10.20944/preprints202301.0040.v1
Subject: Life Sciences, Biotechnology Keywords: ADP-ribosylation; proteomics; post-translational modifications; deep-learning; stacking-based ensemble learning; protein network
Online: 4 January 2023 (02:26:50 CET)
Protein phosphorylation and ADP-ribosylation (ADPr), two types of post-translational modification (PTM), are the processes of adding phosphate groups and ADP-ribose moieties to proteins, respectively. Although both PTM types can occur on many amino acid types, serine is the most common. Serine phosphorylation (pS), serine ADPr (SADPr), and their in situ crosstalk (pSADPr) play essential roles in biological processes. Although in silico classifiers have been developed for predicting pS and SADPr sites, no classifier for predicting pSADPr sites is available. In this study, we developed classifiers to predict pSADPr sites. Specifically, we collected 3250 human pSADPr, 7520 SADPr, 151,227 pS and 80,096 unmodified serine sites. Based on them, we investigated the characteristics of pSADPr sites and constructed three classifiers to predict pSADPr sites from the pS dataset, the SADPr dataset and the protein sequences separately. We built and evaluated five deep-learning classifiers on ten-fold cross-validation and independent test datasets. Three of them (e.g., a Convolutional Neural Network with One-Hot encoding, dubbed CNNOH) performed better than the other two. For instance, CNNOH had AUC values of 0.700, 0.914 and 0.954 for recognizing pSADPr sites from the SADPr, pS and unmodified serine sites, respectively. Therefore, it is more challenging to distinguish pSADPr sites from SADPr sites than from the other two. This is consistent with our observation that the characteristics of pSADPr sites are more similar to those of SADPr sites than to those of the other two. Furthermore, we used the classifiers as base classifiers to develop a few stacking-based ensemble classifiers to improve performance. However, none of the ensemble classifiers showed better performance, suggesting that the base classifiers already perform well enough. Finally, we developed an online tool for extensively predicting human pSADPr sites based on the CNNOH classifier, dubbed EdeepSADPr. It is freely available through http://edeepsadpr.bioinfogo.org/.
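The one-hot encoding that feeds a classifier such as CNNOH can be sketched as follows. This is an illustration under assumptions, not the authors' pipeline: the flank size, the padding character, the toy sequence, and the helper names `extract_window` and `one_hot` are all hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def extract_window(sequence, center, flank, pad="-"):
    """Return the residues from `flank` positions before to `flank`
    positions after a serine, padding beyond the sequence ends."""
    if sequence[center] != "S":
        raise ValueError("the window must be centred on a serine")
    padded = pad * flank + sequence + pad * flank
    # In `padded`, the serine sits at index center + flank.
    return padded[center:center + 2 * flank + 1]


def one_hot(window):
    """Encode each residue as a 20-element 0/1 vector; padding and
    non-standard residues become all-zero vectors."""
    encoded = []
    for residue in window:
        vector = [0] * len(AMINO_ACIDS)
        index = AMINO_ACIDS.find(residue)
        if index >= 0:
            vector[index] = 1
        encoded.append(vector)
    return encoded


# Toy protein fragment with a serine at index 3 (assumed flank of 2).
window = extract_window("MKTSAYIAK", center=3, flank=2)
matrix = one_hot(window)  # 5 positions x 20 channels
```

The resulting matrix has one row per window position and one column per amino acid, which is the shape a one-dimensional convolutional layer would typically take as input.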
SHORT NOTE | doi:10.20944/preprints202105.0689.v1
Subject: Physical Sciences, Acoustics Keywords: complex networks; network models; link prediction; automata theory; network automata; Cannistraci-Hebb; stacking modelling
Online: 28 May 2021 (10:13:43 CEST)
Link prediction is an iconic problem in complex networks because it deals with the ability to predict nonobserved existing or future parts of the network structure. The impact of this prediction on real applications can be disruptive: from prediction of covert links between terrorists in their social networks to repositioning of drugs in molecular diseasome networks. Here we compare: (1) an ensemble meta-learning method (Ghasemian et al.), which uses an artificial intelligence (AI) stacking strategy to create a single meta-model from hundreds of other models; (2) a structural predictability method (SPM, Lü et al.), which relies on a theory derived from quantum mechanics and does not assume any model; (3) a modelling rule named Cannistraci-Hebb (CH, Muscoloni et al.), which relies on one brain-bioinspired model adapting to the intrinsic network structure. We conclude that brute-force stacking of algorithms by AI does not perform better than (and is often significantly outperformed by) SPM and one simple brain-bioinspired rule such as CH. This agrees with the Gödel incompleteness: stacking is optimal but incomplete, you cannot squeeze out more than what is already in your features. Hence, we should also pursue AI that resembles human-like physical ‘understanding’ of simple generalized rules associated with complexity. The future might be populated by AI that ‘steals for us the fire from Gods’, towards machine intelligence that creates new rules rather than stacking the ones already known.
ARTICLE | doi:10.20944/preprints202202.0101.v2
Subject: Engineering, Civil Engineering Keywords: artificial intelligence; climate forecast; deep learning; ensemble model; multi-layer perceptron; neural network; regression; soil temperature; stacking method
Online: 17 February 2022 (09:56:27 CET)
Soil temperature is a fundamental parameter in water resources and engineering. A cost-effective model that can forecast soil temperature accurately is widely needed. Recently, many studies have applied artificial intelligence (AI) at both surface and underground levels for soil temperature prediction. However, there is no comprehensive and detailed assessment of the performance of different AI approaches in soil temperature estimation, and mostly a limited set of atmospheric variables is used as input data for AI models. In the present study, a wide variety of land and atmospheric variables is applied to evaluate the performance of a wide range of AI methods on soil temperature prediction. Herein, thirteen approaches are taken into account, from classic regressions to the well-established methods of random forest and gradient boosting to advanced AI techniques such as multi-layer perceptron and deep learning. The results show that AI is a promising approach to climate parameter forecasting and that deep learning demonstrates the best performance among the models. It has the highest R-squared, ranging from 0.957 to 0.980, the lowest NRMSE, ranging from 2.237% to 3.287%, and the lowest MAE, ranging from 0.510 to 0.743, in predicting soil temperature. The prediction is repeated for different sizes of data, and the outcomes confirm the conclusion mentioned above.
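The three scores quoted in this abstract (R-squared, NRMSE and MAE) can be computed as in the sketch below. The temperature values are invented, `regression_metrics` is a hypothetical helper, and normalising the RMSE by the observed range is an assumption, since the abstract does not say which NRMSE convention was used.

```python
from math import sqrt


def regression_metrics(observed, predicted):
    """Return R-squared, NRMSE (percent of the observed range) and MAE."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = sqrt(ss_res / n)
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r_squared = 1.0 - ss_res / ss_tot
    # Assumed convention: normalise by the range of the observations.
    nrmse_percent = 100.0 * rmse / (max(observed) - min(observed))
    return {"r2": r_squared, "nrmse_percent": nrmse_percent, "mae": mae}


# Invented soil temperatures in degrees Celsius.
measured = [10.0, 12.0, 14.0, 16.0]
forecast = [11.0, 12.0, 13.0, 16.0]
scores = regression_metrics(measured, forecast)
```

Other common NRMSE conventions divide by the mean or the standard deviation of the observations instead, so values are only comparable between studies that use the same normalisation.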
REVIEW | doi:10.20944/preprints201707.0020.v2
Subject: Earth Sciences, Geology Keywords: thin-skinned tectonics; thick-skinned tectonics; structural geology; structure of mountain ranges; fold-and-thrust belts; décollement; nappe stacking; continent-continent collision; subduction; basin inversion
Online: 27 July 2017 (10:14:40 CEST)
This paper gives an overview of the large-scale tectonic styles encountered in orogens worldwide. Thin-skinned and thick-skinned tectonics represent two end-member styles recognized in mountain ranges. Both styles are encountered in former passive margins of continental plates. The thick-skinned style, which includes the entire crust and possibly the lithospheric mantle, is associated with intracontinental contraction. Delamination of subducting continental crust and horizontal protrusion of upper plate crust into the opening gap occur in the terminal stage of continent-continent collision. Continental crust thinned prior to contraction is likely to develop relatively thin thrust sheets of crystalline basement. A true thin-skinned type requires a detachment layer of sufficient thickness. The thickness of the décollement layer, as well as the mechanical contrast between the décollement layer and the detached cover, controls the style of folding and thrusting within the detached cover units. In subduction-related orogens, thin- and thick-skinned deformation may occur several hundreds of kilometers from the plate contact zone. Basin inversion resulting from horizontal contraction may lead to the formation of basement uplifts by the combined reactivation of pre-existing normal faults and initiation of new reverse faults. In most orogens, thick-skinned and thin-skinned structures both occur and evolve with a pattern where nappe stacking propagates outward and downward.