ARTICLE | doi:10.20944/preprints202201.0447.v1
Subject: Medicine & Pharmacology, Psychiatry & Mental Health Studies Keywords: mass multivariate analysis; neuroimaging, depression, schizophrenia
Online: 31 January 2022 (11:07:48 CET)
We applied a mass multivariate method to structural, resting-state and task-related fMRI data from two groups of patients, with schizophrenia and depression respectively, in order to define several regions of significant relevance to the differential diagnosis between those conditions. The regions included the left planum polare, the left opercular part of the inferior frontal gyrus (OpIFG), the medial orbital gyrus (MOrG), the posterior insula (PIns), and the parahippocampal gyrus (PHG). This study delivers evidence that a multimodal neuroimaging approach can potentially enhance the validity of psychiatric diagnosis. Neither structural, resting-state, nor task-related functional MRI can provide independent biomarkers on its own. Further studies need to consider and implement a model of incremental validity that combines clinical measures with different neuroimaging modalities to discriminate depressive disorders from schizophrenia. Biological signatures of disease at the level of neuroimaging are more likely to underpin broader nosological entities in psychiatry.
ARTICLE | doi:10.20944/preprints201711.0191.v1
Subject: Social Sciences, Econometrics & Statistics Keywords: Pesticides, Vegetable, Nepal, Determinant, Multivariate Probit
Online: 29 November 2017 (13:27:57 CET)
Pesticides are currently a core global concern: they are a boon to farmers facing increasing disease and pest pressure, yet pesticide residue is a major worry for human health. For that reason, identifying the factors that affect pesticide application is essential. To identify and evaluate the determinants of pesticide application in Nepal, a household survey of 300 households was carried out and an empirical analysis was performed using a multivariate probit model. Powder and liquid forms of pesticides in the summer and winter seasons of vegetable farming were taken as the outcome variables, and socio-economic, demographic, farm-level and perception data as the explanatory variables. Use of chemical fertilizers, age and gender of the head of household, household size and access to weather information were found to be the most influential factors. Moreover, forms of pesticides and growing seasons were found to be complementary to each other. Policy options should therefore be devised to balance the needs of farmers and the health of consumers.
ARTICLE | doi:10.20944/preprints201807.0215.v1
Subject: Engineering, Electrical & Electronic Engineering Keywords: multivariate gaussian mixture model (MVGMM); multivariate linear regression; expectation-maximization imputation; WiFi localization; hidden markov model (HMM)
Online: 12 July 2018 (08:24:06 CEST)
The extensive deployment of wireless infrastructure provides a low-cost way to track mobile users in indoor environments. This paper demonstrates a prototype model of an accurate and reliable room location awareness system in a real public environment, where three typical problems arise. First, a massive number of access points (APs) can be sensed, leading to a high-dimensional classification problem. Second, heterogeneous devices record different received signal strength (RSS) levels due to variations in chip-set and antenna attenuation. Third, APs are not necessarily visible in every scanning cycle, leading to missing data. This paper presents a probabilistic Wi-Fi fingerprinting method in a hidden Markov model (HMM) framework for mobile user tracking. Considering the spatial correlation of the signal strengths from multiple APs, a Multivariate Gaussian Mixture Model (MVGMM) is fitted to model the probability distribution of RSS measurements in each cell. Furthermore, the unseen property of invisible APs is investigated and shown to be effective for differentiating between cells. The proposed system is able to achieve comparable localization performance. The field test results show a reliable 97% room-level localization accuracy for multiple mobile users in a real university campus WiFi network without any prior knowledge of the environment.
ARTICLE | doi:10.20944/preprints202012.0321.v1
Subject: Earth Sciences, Atmospheric Science Keywords: quantile regression; groundwater; environmental; multivariate; metals; health
Online: 14 December 2020 (10:13:09 CET)
One of the most important defining characteristics of groundwater quality is pH, as it fundamentally controls the amount and chemical form of many organic and inorganic solutes in groundwater. Groundwater data are frequently characterized by wide variability in the factors that may influence the pH distribution. For this reason, it is challenging to link the spatio-temporal dynamics of pH to a single environmental factor using ordinary least squares regression of the conditional mean. In this study, quantile regression was used to estimate the response of pH to nine environmental factors (As, Cd, Fe, Mn, Pb, turbidity, electrical conductivity, total dissolved solids and nitrates). Results of 25%, 50% and 75% quantile regression and ordinary least squares (OLS) regression were compared. The standard regression of the conditional mean (OLS) underestimated the rates of change of pH due to the selected factors in comparison with the regression quantiles. The effect of arsenic increased for sampling locations with higher pH values (higher quantiles), as did the influence of Pb and Mn. However, the effects of Cd and Fe decreased for sampling locations in higher quantiles. It can be concluded that these heterogeneities would have been missed had this study focused exclusively on the conditional mean of the pH values. Consequently, quantile regression provides a more comprehensive account of possible spatio-temporal relationships between environmental covariates in groundwater. This study is one of the first to apply this technique to groundwater systems in sub-Saharan Africa. The approach is useful and has broad application to other mining environments, especially in tropical low-income countries where climatic conditions can drive rapid cycling or transformation of pollutants. It is also pertinent to geopolitical contexts where regulatory, monitoring and management capacities are weak and where mining pollution of groundwater largely occurs.
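As a rough illustration of why quantile regression can reveal effects that a conditional-mean fit hides, the sketch below shows the check (pinball) loss that quantile regression minimises, in the simplest intercept-only case. The function names and the pH readings are hypothetical and are not taken from the study; a real analysis would fit covariates with a statistics package.

```python
def pinball_loss(residual, tau):
    """Check (pinball) loss for one residual (observed minus fitted) at level tau."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual


def best_constant(values, tau):
    """Return the observed value that minimises the total pinball loss.

    For an intercept-only model the minimiser is a sample quantile, so
    searching over the observed values is sufficient.
    """
    def total_loss(candidate):
        return sum(pinball_loss(v - candidate, tau) for v in values)

    return min(sorted(values), key=total_loss)


# Hypothetical pH readings, for illustration only (not data from the study).
ph = [5.8, 6.1, 6.4, 6.6, 6.9, 7.2, 7.5, 7.9, 8.3]

# One fitted constant per quantile level, mirroring the 25%, 50%, 75% comparison.
quartile_fits = {tau: best_constant(ph, tau) for tau in (0.25, 0.50, 0.75)}
```

Changing `tau` shifts the asymmetry of the penalty, which is what lets separate slopes be estimated for low-pH and high-pH sampling locations once covariates are added.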
ARTICLE | doi:10.3390/sci1030057
Online: 20 September 2019 (00:00:00 CEST)
This study assessed farmers’ perception of climate change, estimated the determinants of adaptation practices, and evaluated the relationship among them using the multivariate probit model. A survey of 300 agricultural households was carried out covering 10 sample districts, considering five agro-ecological zones and a vulnerability index. Four adaptation choices (change in planting date, crop variety, crop type and investment in irrigation) were taken as outcome variables, and socioeconomic, demographic, institutional, farm-level and perception variables were used as explanatory variables. Their marginal effects were determined for three climatic variables: temperature, precipitation and drought. Age, gender and education of the head of household, credit access, farm area, rain-fed farming and tenure are found to be more influential than other factors. All four adaptation options are found to be complementary to each other. Importantly, the intensity of impact of dependent variables in different models, and for available adaptation options, is found to be unequal. Therefore, policy options and support facilities should be devised according to climatic variables and adaptation options to achieve superior results.
ARTICLE | doi:10.20944/preprints201808.0118.v1
Subject: Mathematics & Computer Science, Probability And Statistics Keywords: Archimedean Copula; Elliptical Copula; Multivariate Distribution; Hydrology
Online: 6 August 2018 (11:39:25 CEST)
This study identified the best copula to characterize the joint probability distribution of rainfall severity and duration in Peninsular Malaysia using two-dimensional copulas. To construct the copulas, the Inference Function for Margins (IFM) and Canonical Maximum Likelihood (CML) methods were used. For copula fitting, the rainfall variables derived from the Standardized Precipitation Index (SPI) were fitted to several distributions. Five copulas, namely Gaussian, Clayton, Frank, Joe and Gumbel, were tested to establish the best-fitting copula. The tests produced satisfactory copula fits for rainfall severity and duration. Based on the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC), only three copulas produced a better fit for the parametric and semi-parametric approaches. Finally, two consistency tests were conducted, and the results showed that the Frank copula produced consistent results.
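The model-selection step can be sketched as follows, assuming each candidate copula has already been fitted and its maximised log-likelihood is available. The candidate labels, log-likelihood values and sample size below are invented for illustration and do not come from the study; only the standard AIC and BIC definitions are used.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k ln(n) - 2 ln(L). Lower is better."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical maximised log-likelihoods for three one-parameter copulas.
candidates = {"copula_A": 41.2, "copula_B": 35.7, "copula_C": 44.9}
n_obs = 120  # hypothetical number of severity/duration pairs

scores = {
    name: (aic(ll, 1), bic(ll, 1, n_obs)) for name, ll in candidates.items()
}
best_by_aic = min(scores, key=lambda name: scores[name][0])
best_by_bic = min(scores, key=lambda name: scores[name][1])
```

When all candidates have the same number of parameters, as assumed here, AIC and BIC rank them identically; the two criteria only diverge when the candidates differ in complexity.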
ARTICLE | doi:10.20944/preprints202109.0081.v1
Subject: Chemistry, Food Chemistry Keywords: rootstocks; untargeted metabolomics; features; grafted; multivariate analysis; volatile compounds
Online: 6 September 2021 (09:52:40 CEST)
To allow for a broad survey of subtle metabolic shifts in wine caused by rootstock and irrigation, an integrated metabolomics-based workflow followed by quantitation was developed. This workflow was particularly useful when applied to a poorly studied variety, cv. Chambourcin. By including volatile metabolites that might otherwise have been missed with a targeted analysis, this approach allowed deeper modeling of treatment differences, which could then be used to identify important compounds. Wines produced on a per-vine basis over two years were analyzed using SPME-GC-MS/MS. From the 382 and 221 features that differed significantly among rootstocks in 2017 and 2018 respectively, we tentatively identified 94 compounds by library search and retention index, with 22 confirmed and quantified using authentic standards. Own-rooted Chambourcin differed from other root systems for multiple volatile compounds, with fewer differences among grafted vines. For example, the average concentration of β-Damascenone was significantly lower in other rootstocks (8.59 µg/L) than in own-rooted vines (9.49 µg/L), whereas mean Linalool was significantly higher in 1103P rootstock compared to own-rooted. β-Damascenone was higher in regulated deficit irrigation (RDI) than in other treatments. The workflow outlined was shown to be useful not only for scientific investigation but also for creating a protocol for analysis that would ensure differences of interest to industry are not missed.
ARTICLE | doi:10.20944/preprints202106.0530.v1
Subject: Earth Sciences, Atmospheric Science Keywords: airborne LiDAR; forest attributes; multivariate power model; sample size
Online: 22 June 2021 (13:03:33 CEST)
Exploring the effect of the sample size on the estimation accuracy of airborne LiDAR forest attributes in a large-scale area can help in optimizing the technical application scheme of operational ALS-based large-scale forest stand inventories. In our study, sample datasets composed of different sample plots were constructed by repeated sampling from 1003 sample plots in a subtropical study area covering 2376 × 103 km2. Sixteen multiplicative power models were built in each forest type consisting of four forest attributes. Through these models, the variations of standard deviation (SD) and coefficient of variation (CV) of R2 and rRMSE of forest attribute estimation models for different quantity levels of sample plots were also analyzed. The results showed that, first, when the sample size increased from 30 to the top limit, the SD of the forest attributes and LiDAR variables showed a decreasing trend. Second, as the sample size increased, the rRMSE of the 16 forest attribute estimation models gradually decreased, while the R2 gradually increased. Third, when the sample size was small, both the SD of R2 and rRMSE of the models were large, and the SD of R2 and rRMSE gradually decreased as the sample size increased. In 50 models conducted for each attribute at the same sample size, for the mean standard deviations of forest attributes, the ten best performing models were lower than those of the total 50 models, and the worst ten models were the opposite. When the sample size increased, the accuracy of each forest attribute estimation model for each forest type gradually improved. The variation of forest attributes and the LiDAR variable of the construction model are critical factors that affect the model’s accuracy. To efficiently apply airborne LiDAR in order to survey large-scale subtropical forest resources, the sample size of the Chinese fir forest, pine forest, eucalyptus forest, and broad-leaved forest should be 110, 80, 85, and 70, respectively.
ARTICLE | doi:10.20944/preprints201810.0374.v1
Subject: Life Sciences, Biotechnology Keywords: mini-bioreactors; parallelization; automation; digitalization; multivariate analysis; dynamic processes
Online: 17 October 2018 (06:19:46 CEST)
Mini-bioreactor systems enabling automatized operation of numerous parallel cultivations have been used to accelerate and optimize bioprocess development. As implementation of fed-batch conditions, multiple options of process control and sample analysis are possible, these systems represent valuable screening tools for large-scale production. However, the dynamic behavior of cultivations has not yet been considered regarding data evaluation and decision making during high-throughput screening in mini-bioreactors. In this study, the characterization of Saccharomyces cerevisiae AH22 secreting recombinant endopolygalacturonase is performed in 48 parallel fed-batch cultivations regarding 16 experimental conditions. Automated parallel process control, frequent sampling and analysis were implemented. Data-driven multivariate methods were developed to allow for fast, automated decision making as well as online predictive data analysis regarding endopolygalacturonase production. Using dynamic process information, a cultivation with abnormal behavior could be detected by principal component analysis as well as two clusters of similarly behaving cultivations, later classified according to the feeding rate. By decision tree analysis, cultivation conditions leading to an optimal recombinant product formation could be identified automatically. The developed method is easily adaptable and suitable for automatized process development reducing the experimental times and costs.
ARTICLE | doi:10.20944/preprints201805.0126.v1
Subject: Mathematics & Computer Science, Computational Mathematics Keywords: graph isomorphism problem; multivariate polynomial system; zero-knowledge proof
Online: 8 May 2018 (09:35:15 CEST)
Zero-Knowledge Proofs (ZKP) provide a reliable option to verify that a claim is true without giving detailed information other than the answer. A classical example is provided by the ZKP based on the Graph Isomorphism problem (GI), where a prover must convince the verifier that he knows an isomorphism between two isomorphic graphs without publishing the bijection. We design a novel ZKP that exploits the NP-hard problem of finding the algebraic ideal of a multivariate polynomial set, and is consequently resistant to quantum computer attacks. Since this polynomial set is obtained by considering instances of GI, we guarantee that the protocol is at least as secure as the GI-based protocol.
ARTICLE | doi:10.20944/preprints202201.0472.v1
Subject: Medicine & Pharmacology, Pharmacology & Toxicology Keywords: aspirin; pharmacometabolomic; nuclear magnetic resonance; spectroscopy; gastric toxicity; multivariate analysis
Online: 31 January 2022 (17:26:48 CET)
Background: Low-dose aspirin (LDA) is the backbone for secondary prevention of coronary artery disease, though limited by gastric toxicity. This study aimed to identify novel metabolites that could predict LDA-induced gastric toxicity using pharmacometabolomics. Methods: Pre-dose urine samples were collected from male Sprague-Dawley rats. The rats were treated orally with either LDA (10 mg/kg) or 1% methylcellulose (10 ml/kg) for 28 days. The rats' stomachs were examined for gastric toxicity using a stereomicroscope. The urine samples were analyzed using proton nuclear magnetic resonance spectroscopy. Metabolites were systematically identified by exploring established databases and multivariate analyses to identify the spectral pattern of metabolites related to LDA-induced gastric toxicity. Results: Treatment with LDA resulted in gastric toxicity in 20/32 rats (62.5%). The orthogonal projections to latent structures discriminant analysis (OPLS-DA) model displayed a goodness-of-fit (R2Y) value of 0.947, suggesting a near-perfect reproducibility, a goodness-of-prediction (Q2Y) of -0.185 with perfect sensitivity, specificity and accuracy (100%). Furthermore, the area under the receiver operating characteristic curve (AUROC) was 1. The final OPLS-DA model had an R2Y value of 0.726 and Q2Y of 0.142 with sensitivity (100%), specificity (95.0%) and accuracy (96.9%). Citrate, hippurate, methylamine, trimethylamine N-oxide and alpha-keto-glutarate were identified as the possible metabolites implicated in the LDA-induced gastric toxicity. Conclusion: The study identified metabolic signatures that correlated with the development of low-dose aspirin-induced gastric toxicity in rats. This pharmacometabolomic approach could further be validated to predict LDA-induced gastric toxicity in patients with coronary artery disease.
REVIEW | doi:10.20944/preprints202105.0194.v1
Subject: Medicine & Pharmacology, Allergology Keywords: aphthous stomatitis, risk factors, genetic polymorphisms, multivariate analysis, systematic review
Online: 10 May 2021 (13:55:48 CEST)
The cause and prevention of recurrent aphthous stomatitis (also called aphthous ulcers or canker sores) are still unknown. This may be due in part to ignorance of the risk factors present in susceptible people. In this systematic review (PROSPERO record #CRD42019122214), we show that most of the risk factors for the disease are single nucleotide genetic polymorphisms in genes related to the functioning of immune system (TLR4, MMP9, E-selectin, IL-1 beta and TNF-alpha). Single nucleotide genetic polymorphisms do not constitute a modifiable risk. This indicates that, at least in part, susceptibility to recurrent aphthous stomatitis is hereditary, and that these factors cannot be modified.
Subject: Social Sciences, Accounting Keywords: performance analysis; elite football; multivariate analysis; principal components analysis; LaLiga
Online: 8 February 2021 (16:18:14 CET)
Principal components analysis provides information about the main characteristics of teams based on a set of indicators, instead of displaying individualized information for each indicator. In this work we used principal components analysis to reduce an extensive data matrix and improve its interpretation. Subsequently, using the new components and a multiple linear regression, we carried out a comparative analysis between the best and bottom teams of LaLiga. The sample consisted of the matches corresponding to the 2015/16, 2016/17 and 2017/18 seasons. The results showed that the best teams were characterized and differentiated from bottom teams by a greater number of successful passes and a greater number of dynamic offensive transitions. The bottom teams were characterized by executing more defensive than offensive actions and by showing fewer goals and a greater ball possession time in the final third of the field. Goals, ball possession time in the final third of the field, number of effective shots and crosses are the main performance factors that influence offensive success in football. This information increases knowledge about the key performance indicators in football.
ARTICLE | doi:10.20944/preprints201608.0118.v1
Subject: Biology, Physiology Keywords: acclimation; coral reefs; endosymbiosis; molecular biology; multivariate statistics; temperature; upwelling
Online: 11 August 2016 (11:03:03 CEST)
Multivariate statistical approaches (MSA), such as principal components analysis and multidimensional scaling, seek to uncover meaningful patterns within datasets by considering multiple response variables in a concerted fashion. Although these techniques are readily used by ecologists to visualize and explain differences between study sites, they could theoretically be employed to differentiate organisms within an experimental framework while simultaneously identifying response variables that drive documented experimental differences. Therefore, MSA were used herein to attempt to understand the response of the common, Indo-Pacific reef coral Seriatopora hystrix to temperature changes using data from laboratory-based temperature challenge studies performed in Southern Taiwan. Gene expression and physiological data partitioned experimental specimens by time of sampling, treatment temperature, and site of origin upon employing MSA, signifying that S. hystrix and its dinoflagellate endosymbionts display physiological and molecular signatures that are characteristic of sampling time, site of colony origin, and/or temperature regime. These findings promote the utility of MSA for documenting biologically meaningful shifts in the physiological and/or sub-cellular response of marine invertebrates exposed to environmental change.
ARTICLE | doi:10.20944/preprints202206.0328.v1
Subject: Biology, Ecology Keywords: mangrove forests; Marine Protected Areas; α-diversity; β-diversity; multivariate analyses
Online: 24 June 2022 (03:28:50 CEST)
Differences in fish assemblages’ structure and their relation with environmental variables (due to the variations in sampled seasons, habitats, and zones), were analyzed in two adjacent estuaries on the north Pacific coast of Mexico. Environmental variables and fish catches were registered monthly between August 2018 and October 2020. Multivariate analyses were conducted to define habitats and zones based on their environmental characteristics, and the effect of this variability on fish assemblages’ composition, biomass, and diversity (α and β) was evaluated. A total of 12,008 fish individuals of 143 species were collected using different fishing nets. Multivariate analyses indicated that fish assemblages’ structure was different between zones due to the presence, height, and coverage of distinct mangrove species. Additionally, factors such as depth and salinity showed effects on fish assemblages’ diversity (α and β-nestedness), which presented higher values in the ocean and remained similar in the rest of the analyzed zones and habitats. These results and the differences in species replacement (β-turnover) indicate the singularity of fish assemblages at estuaries (even in areas very close to the ocean), and the necessity to establish local management strategies for these ecosystems.
ARTICLE | doi:10.20944/preprints202106.0470.v1
Subject: Earth Sciences, Atmospheric Science Keywords: heavy metals; surface sediment; Manila Bay; pollution; multivariate analysis; ecological risk
Online: 18 June 2021 (08:32:18 CEST)
Recent work on heavy metal pollution in Manila Bay suggests elevated concentrations in the surface sediments. It is critical to identify the sources of these heavy metals to effectively rehabilitate the bay. Our study investigated the sources of the heavy metal pollution that ended up in Manila Bay and the risks associated with these toxic metals, based on a recently conducted survey. Surface sediment samples with higher heavy metal concentrations were found in the upper to middle parts of the bay, while lower concentrations were found in the southeast areas. Multivariate analyses such as hierarchical cluster analysis (HCA), principal component analysis (PCA), and Pearson correlation analysis were used to identify the sources of the heavy metals. The heavy metal pollution in Manila Bay is attributed to several rivers draining northeast of Manila Bay, particularly the Marilao-Meycauayan-Obando River System (MMORS), which is cited as one of the 30 dirtiest river systems in the world. The assessment of ecological risks associated with heavy metals in the sediments found higher incidences of toxicity in the north and middle parts of Manila Bay. Cu and Cr posed higher risks of toxicity than any other heavy metals. Based on our analysis, the counterclockwise water gyre of the bay can explain the distribution and ecological risks associated with the heavy metals, as supported by the findings of the PCA. Given the high priority given by the Philippine government to rehabilitating the bay, our study strongly shows that efforts to restore the ecological status of Manila Bay will only succeed if the pollution from major rivers draining into it is properly addressed.
ARTICLE | doi:10.20944/preprints202105.0412.v1
Subject: Mathematics & Computer Science, Algebra & Number Theory Keywords: Expression of multilayer network models, oriented graph, multivariate model, nonlinear regression
Online: 18 May 2021 (10:26:19 CEST)
Neural network models are mostly represented by oriented graphs where only the components, the constitutive elements of the graph, are transcribed into mathematical expressions. However, accurate knowledge of the full expression of the model is required in certain situations, such as selecting, among several reference models, the one that best fits the available data, or comparing the explanatory and predictive performance of an established model with respect to some reference models. In this paper, we establish a formalism for the mathematical expression of the multilayer perceptron neural network in a general framework, MLP-p-n-q, with p, n and q natural integers, and show its restriction to cases where one has a hidden layer and multivariate outputs (MLP-p-1-q), and then a single output (MLP-p-1-1). Then, we give some specific cases of the most commonly used models. An application case is presented in the context of solving a nonlinear regression problem.
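To make the idea of a "full expression" concrete, here is a minimal sketch of a single-hidden-layer perceptron with several outputs, written so that the closed-form expression is visible in the code. It is not the paper's notation: the function name, the weight layout, the tanh hidden activation and the linear output layer are all assumptions chosen for illustration.

```python
import math


def mlp_one_hidden(x, W1, b1, W2, b2, activation=math.tanh):
    """Evaluate a one-hidden-layer perceptron with several outputs.

    Output k is
        y_k = b2[k] + sum_j W2[k][j] * activation(b1[j] + sum_i W1[j][i] * x[i])
    where W1 has one row per hidden unit and W2 one row per output.
    The tanh activation and linear output are illustrative assumptions.
    """
    hidden = [
        activation(bias + sum(w * xi for w, xi in zip(row, x)))
        for row, bias in zip(W1, b1)
    ]
    return [
        bias + sum(w * h for w, h in zip(row, hidden))
        for row, bias in zip(W2, b2)
    ]


# Hypothetical weights: 2 inputs, 3 hidden units, 2 outputs.
W1 = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]]
b1 = [0.0, 0.1, -0.1]
W2 = [[1.0, -1.0, 0.5], [0.2, 0.3, -0.7]]
b2 = [0.05, -0.05]

outputs = mlp_one_hidden([1.0, 2.0], W1, b1, W2, b2)
```

Restricting `W2` and `b2` to a single row gives the single-output case, which is the further restriction the abstract mentions.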
ARTICLE | doi:10.20944/preprints202012.0080.v1
Subject: Medicine & Pharmacology, Allergology Keywords: multivariate linear method; validation; diagnosis; discriminative; signatures of disease; schizophrenia; depression
Online: 3 December 2020 (10:38:31 CET)
To overcome this problem, our group designed a novel machine learning technique, the multivariate linear method (MLM), which can capture convergent data from voxel-based morphometry, functional resting-state and task-related neuroimaging, and the relevant clinical measures. In this paper we report results from convergent cross-validation of biological signatures of disease in a sample of patients with schizophrenia as compared to depression. Our model provides evidence that combining the neuroimaging and clinical data in MLM analysis can inform the differential diagnosis in terms of incremental validity, reaching 90% prediction accuracy.
ARTICLE | doi:10.20944/preprints202007.0195.v1
Subject: Biology, Animal Sciences & Zoology Keywords: cranial variation; otters (Lutra lutra); 3D surface scanning; multivariate statistical methods
Online: 9 July 2020 (12:52:26 CEST)
3D surface scans were carried out to determine the shapes of the upper sections of (skeletal) crania of adult Eurasian otters (Lutra lutra) from Great Britain. Landmark points were placed on these shapes by using a graphical user interface (GUI) and distance measurements (i.e., the length, height, and width of the crania) could be found by using the landmark points. These “GUI-based” distances were shown to be accurate and reliable in comparison to physical measurements taken on the crania directly by using a digital calliper. The crania of males were 6.85mm, 5.44mm, 1.66mm larger in terms of length, width and height, respectively, than females in our sample (P < 0.001), i.e., male otters had significantly larger skulls than females. Significant differences in size occurred also by geographical area in Great Britain (P < 0.05). Multilevel Principal Components Analysis (mPCA) indicated that sex and geographical area explained 31.1% and 9.6% of shape variation in “unscaled” shape data and that they explained 17.2% and 9.7% of variation in “scaled” data. The first mode of variation at level 1 (sex) correctly reflected size changes between males and females for “unscaled” shape data. Modes at level 2 (geographical area) also showed possible changes in size and shape. Clustering by sex and geographical area was observed in standardised component scores. Such clustering in cranial shape by geographical area might reflect genetic differences that are known to occur in otter populations in Great Britain, although other potentially confounding factors (e.g. population age-structure, diet, etc.) might also drive regional differences. Furthermore, sample sizes per group were small for geographical comparisons. However, this work provides a successful first test of the effectiveness of 3D surface scans and multivariate methods such as mPCA to study the cranial morphology of otters.
ARTICLE | doi:10.20944/preprints202004.0392.v1
Subject: Mathematics & Computer Science, Applied Mathematics Keywords: Multivariate Public Key Cryptosystem; Random polynomial; Oil Vinegar signature; Provable Security
Online: 22 April 2020 (06:09:50 CEST)
An oil and vinegar scheme is a signature scheme based on multivariate quadratic polynomials over finite fields. The system of polynomials contains $n$ variables, divided into two groups: $v$ vinegar variables and $o$ oil variables. The scheme is called balanced (OV) or unbalanced (UOV), depending on whether $v = o$ or not, respectively. These schemes are very fast and require modest computational resources, which makes them ideal for low-cost devices such as smart cards. However, the OV scheme has already been proven to be insecure, and the UOV scheme has been proven to be very vulnerable for many parameter choices. In this paper, we propose a new multivariate public key signature whose central map consists of a set of polynomials obtained from the multiplication of block matrices. Our construction is motivated by the design of the Simple Matrix Scheme for Encryption and the UOV scheme. We show that it is secure against the Separation Method, which can be used to attack the UOV scheme, and against the Rank Attack, which is one of the deadliest attacks against multivariate public-key cryptosystems. Some theoretical results on matrices with polynomial entries are also given to support the construction of the scheme.
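The structural property that oil and vinegar schemes rely on can be illustrated with a toy example: a central polynomial has no oil-times-oil terms, so once the vinegar variables are fixed it becomes affine in the oil variables and can be solved by linear algebra. The sketch below only checks that affine property over a tiny field. The field size, coefficients and function names are hypothetical, and this is in no way the scheme proposed in the paper or a secure construction.

```python
P = 7  # small prime field GF(7), chosen only for illustration


def central_poly(vinegar, oil, vv, vo, lin_v, lin_o, const):
    """Evaluate one oil-and-vinegar style central polynomial over GF(P).

    vv[i][j] multiplies vinegar[i] * vinegar[j],
    vo[i][j] multiplies vinegar[i] * oil[j],
    and there are deliberately no oil[i] * oil[j] terms.
    """
    total = const
    for i, vi in enumerate(vinegar):
        total += lin_v[i] * vi
        for j, vj in enumerate(vinegar):
            total += vv[i][j] * vi * vj
        for j, oj in enumerate(oil):
            total += vo[i][j] * vi * oj
    for j, oj in enumerate(oil):
        total += lin_o[j] * oj
    return total % P


# Hypothetical coefficients for two vinegar and two oil variables.
vv = [[1, 2], [0, 3]]
vo = [[4, 5], [6, 1]]
lin_v = [2, 3]
lin_o = [1, 4]
const = 5
vinegar = [3, 6]  # vinegar values are fixed first


def g(oil):
    """The central polynomial as a function of the oil variables only."""
    return central_poly(vinegar, oil, vv, vo, lin_v, lin_o, const)


def add(a, b):
    """Component-wise addition of two vectors over GF(P)."""
    return [(x + y) % P for x, y in zip(a, b)]


# An affine map satisfies g(a + b) - g(a) - g(b) + g(0) = 0.
a, b = [2, 5], [6, 3]
is_affine = (g(add(a, b)) - g(a) - g(b) + g([0, 0])) % P == 0
```

Adding any oil-times-oil coefficient would break the identity checked by `is_affine`, which is why the absence of those terms is the trapdoor that makes signing efficient.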
ARTICLE | doi:10.20944/preprints201911.0025.v1
Subject: Chemistry, Physical Chemistry Keywords: macro-minerals; micro-minerals; environmental-minerals; beef quality; beef production; multivariate analysis
Online: 3 November 2019 (17:38:11 CET)
The mineral profile of beef is relevant to human health, but also to animal performance and meat quality. This study analyzes the relationships of 20 minerals in beef (measured by ICP-OES) with 3 animal performance and 13 meat quality traits, analyzed on 182 samples of Longissimus thoracis. Animals’ breed and sex showed limited effects. The major sources of variation (farm/date of slaughter, individual animal within group and side/sample within animal) differed greatly from trait to trait. Mineral contents were correlated with animal performance and meat quality: 52 of the 320 correlations were significant at the farm/date level, and 101 of the 320 at the individual animal level. Five latent factors explained 69% of mineral co-variation. The most important, the “Mineral quantity” factor, correlated with age at slaughter and with the meat color traits. Two latent factors (“Na+Fe+Cu” and “Fe+Mn”) correlated with performance and meat color traits. Two others (“K-B-Pb” and “Zn”) correlated with meat chemical composition, and the latter also with carcass weight, daily gain and meat color traits. Meat cooking losses correlated with “K-B-Pb”. Latent factor analysis appears to be a useful means of disentangling the very complex relationships that the minerals in meat have with animal performance and meat quality traits.
ARTICLE | doi:10.20944/preprints201810.0669.v1
Subject: Biology, Ecology Keywords: scorpion ecology; multivariate statistics; body size; offspring characteristics, K and r strategists
Online: 29 October 2018 (10:09:45 CET)
There are no studies that quantitatively compare life histories among scorpion species. Statistical procedures applied to 94 scorpion species indicate that those with larger bodies do not necessarily have larger litters or longer life cycles, contrary to some theoretical predictions.
ARTICLE | doi:10.20944/preprints201810.0176.v1
Subject: Biology, Agricultural Sciences & Agronomy Keywords: agricultural stakeholders; extension; multivariate analysis; socio-ecological systems; mental models; sustainable agriculture
Online: 9 October 2018 (06:03:38 CEST)
The sustainability of agriculture depends as much on the natural resources required for production as it does on the stakeholders that manage those resources. It is thus essential to understand the variables that influence the decision-making process of agricultural stakeholders to design educational programs, interventions, and policies geared towards their specific needs, a required step to enhance agricultural sustainability. We examined the perceptions, experiences, and priorities that influence management decisions of five major groups of agricultural stakeholders (conventional small grain producers, organic small grain producers, organic vegetable producers, extension agents and agro-industry crop consultants, and researchers) across Montana, United States. Results revealed that while stakeholder groups have distinct perceptions, experiences, and priorities, there were similarities across groups. Specifically, organic vegetable and organic small grain producers showed similar responses that were, in turn, divergent from those of conventional producers, researchers, and crop consultants. Conventional small grain producers and researchers showed overlapping response patterns while crop consultants formed an isolated group. Our results reinforce the need for agricultural education and programs that address unique and shared experiences, priorities, and concerns of multiple stakeholder groups. This study endorses the call for a paradigm shift from the traditional top-down agricultural extension model to one that accounts for participants’ socio-ecological contexts to facilitate the adoption of sustainable agricultural systems that support environmental and human wellbeing.
ARTICLE | doi:10.20944/preprints201809.0224.v1
Subject: Medicine & Pharmacology, Other Keywords: aging; muscle; protein; metabolism; metabolomics; profiling; biomarkers; multi-marker; physical performance; multivariate
Online: 12 September 2018 (17:11:33 CEST)
Physical frailty and sarcopenia (PF&S) are hallmarks of aging that share a common pathogenic background. Perturbations in protein/amino acid metabolism may play a role in the development of PF&S. In this preliminary study, 68 community-dwellers aged 70 years and older, 38 with PF&S and 30 non-sarcopenic, non-frail controls (nonPF&S), were enrolled. A panel of 37 serum amino acids and derivatives was assayed by UPLC-MS. Partial Least Squares Discriminant Analysis (PLS-DA) was used to characterize the amino acid profile of PF&S. The optimal complexity of the PLS-DA model was found to be three latent variables. The proportion of correct classification was 76.6 ± 3.9% (75.1 ± 4.6% for enrollees with PF&S; 78.5 ± 6.0% for controls). Older adults with PF&S were characterized by higher levels of asparagine, aspartic acid, citrulline, ethanolamine, glutamic acid, sarcosine, and taurine. The profile of nonPF&S individuals was defined by higher levels of α-aminobutyric acid and methionine. Distinct profiles of circulating amino acids and derivatives characterize older individuals with PF&S. The dissection of these patterns may provide novel insights into the role played by protein/amino acid perturbations in the disabling cascade and possible new targets for interventions.
ARTICLE | doi:10.20944/preprints201806.0482.v1
Subject: Mathematics & Computer Science, Probability And Statistics Keywords: Monte Carlo; regime-switching multivariate black-scholes; metamodeling; variable annuity; portfolio valuation
Online: 29 June 2018 (11:31:49 CEST)
Dynamic hedging has been adopted by many insurance companies to mitigate the financial risks associated with variable annuity guarantees. In order to simulate the performance of dynamic hedging for variable annuity products, insurance companies rely on nested stochastic projections, which are highly computationally intensive and often prohibitive for large variable annuity portfolios. Metamodeling techniques have recently been proposed to address the computational issues. However, it is difficult for researchers to obtain real datasets from insurance companies to test metamodeling techniques and publish the results in academic journals. In this paper, we create synthetic datasets that can be used for the purpose of addressing the computational issues associated with the nested stochastic valuation of large variable annuity portfolios. The runtime used to create these synthetic datasets would be about 3 years if a single CPU were used. These datasets are readily available to researchers and practitioners so that they can focus on testing metamodeling techniques.
ARTICLE | doi:10.20944/preprints202105.0105.v1
Subject: Earth Sciences, Atmospheric Science Keywords: data scarcity; water quality; missing data; univariate imputation; multivariate imputation; machine learning; hydroinformatics.
Online: 6 May 2021 (15:18:23 CEST)
The monitoring of surface-water quality followed by water-quality modeling and analysis is essential for generating effective strategies in water-resource management. However, worldwide, particularly in developing countries, water-quality studies are limited due to the lack of a complete and reliable dataset of surface-water-quality variables. In this context, several statistical and machine-learning models were assessed for imputing water-quality data at six monitoring stations located in the Santa Lucía Chico river (Uruguay), a mixed lotic and lentic river system. The challenge of this study is represented by the high percentage of missing data (between 50% and 70%) and the high temporal and spatial variability that characterizes the water-quality variables. The competing algorithms implemented belonged to both univariate and multivariate imputation methods (inverse distance weighting (IDW), Random Forest Regressor (RFR), Ridge (R), Bayesian Ridge (BR), AdaBoost (AB), Huber Regressor (HR), Support Vector Regressor (SVR), and K-nearest neighbors Regressor (KNNR)). According to the results, more than 76% of the imputation outcomes are considered satisfactory (NSE > 0.45). The imputation performance shows better results at the monitoring stations located inside the reservoir than at the ones positioned along the mainstream. IDW was the most frequently chosen model for data imputation.
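To make the two quantities named in this abstract concrete, here is a minimal sketch of inverse distance weighting and of the Nash-Sutcliffe efficiency (NSE) used as the acceptance score. It assumes planar station coordinates and a distance exponent of 2. The abstract does not state how distances were defined or which exponent was used, so the function names and parameters are hypothetical.

```python
def idw_impute(target, stations, power=2.0):
    """Estimate a missing value at `target` from (x, y, value) station tuples."""
    num = 0.0
    den = 0.0
    for x, y, value in stations:
        dist = ((x - target[0]) ** 2 + (y - target[1]) ** 2) ** 0.5
        if dist == 0.0:
            return value          # co-located station: reuse its observation
        weight = 1.0 / dist ** power
        num += weight * value
        den += weight
    return num / den


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - (sum of squared errors) / (total sum of squares)."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst
```

An NSE of 1 means a perfect match, and an NSE of 0 means the imputation is no better than always predicting the observed mean, which gives some context for the 0.45 threshold quoted above.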
ARTICLE | doi:10.20944/preprints201808.0415.v1
Subject: Earth Sciences, Environmental Sciences Keywords: Taiwan rivers; water quality; multivariate statistical analysis; river pollution index; pollution source apportionment
Online: 23 August 2018 (11:54:51 CEST)
This study applies multivariate statistical techniques, including cluster analysis to evaluate and classify river pollution levels in Taiwan, and principal component analysis-multiple linear regression (PCA-MLR) to identify possible pollution sources. Water quality and heavy metal monitoring data from the Taiwan Environmental Protection Administration (EPA) were evaluated for 14 rivers in the four regions of Taiwan. The Erren River was classified as the most polluted river in Taiwan. Biochemical oxygen demand, ammonia, and total phosphate concentration in this river were the highest of the 14 rivers evaluated. In addition, heavy metal levels of the following rivers exceeded the Taiwan EPA standard limit: lead - in the Dongshan, Jhuoshuei, and Xinhuwei Rivers; copper - in the Dahan, Laojie, and Erren Rivers; and manganese - in all rivers. Water pollution in the Erren River was estimated to originate 72% from industrial sources, 16% from domestic black water, and 12% from natural sources and runoff from other tributaries. Our research showed that PCA-MLR and the cluster analysis model accomplished our study objectives and will be helpful tools for evaluating water quality in rivers. We suggest that continuous monitoring be conducted to track water pollution from anthropogenic activities.
ARTICLE | doi:10.20944/preprints202104.0499.v1
Subject: Engineering, Automotive Engineering Keywords: fault detection; induced draft fan; multivariate state estimation technique (MSET); model update; power plant
Online: 19 April 2021 (14:36:56 CEST)
The induced draft (ID) fan is important auxiliary equipment in the thermal power plant. It is of great significance to monitor the operation of the ID fan for safe and efficient production. In this paper, an adaptive warning model is proposed to detect early faults of ID fans. First, a non-parametric monitoring model is constructed to describe the normal operation states with the multivariate state estimation technique (MSET). Then, an early warning approach is presented to identify abnormal behaviors based on the results of the MSET model. As the performance of the MSET model is heavily influenced by the normal operation data in the historic memory matrix, an adaptive strategy is proposed by using the samples with a high data quality index (DQI) to manage the memory matrix and update the model. The proposed method is applied to a 300 MW coal-fired power plant for early fault detection, and it is compared with the model without an update. Results show that the proposed method can detect the fault earlier and more accurately.
ARTICLE | doi:10.20944/preprints201808.0072.v4
Subject: Engineering, Civil Engineering Keywords: flood risk; copula; compound events; multivariate; storm surge; spatial dependence; coastal catchment; Bayesian Network.
Online: 11 September 2018 (14:19:43 CEST)
Traditional flood hazard analyses often rely on univariate probability distributions; however, in many coastal catchments, flooding is the result of complex hydrodynamic interactions between multiple drivers. For example, synoptic meteorological conditions can produce considerable rainfall-runoff, while also generating wind-driven elevated sea levels. When these drivers interact in space and time, they can exacerbate flood impacts; this phenomenon is known as compound flooding. In this paper, we build a Bayesian Network based on Gaussian copulas to generate the equivalent of 500 years of daily stochastic boundary conditions for a coastal watershed in Southeast Texas. In doing so, we overcome many of the limitations of conventional univariate approaches and are able to probabilistically represent compound floods caused by riverine and coastal interactions. We calculate the resulting water levels using a 1D steady-state hydraulic model and find that flood stages in the catchment are strongly affected by backwater effects from tributary inflows and downstream water levels. By comparing with a bathtub modeling approach, we show that simplifying the multivariate dependence between flood drivers can lead to an underestimation of flood impacts, highlighting that accounting for multivariate dependence is critical for the accurate representation of flood risk in coastal catchments prone to compound events.
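The dependence structure mentioned in this abstract can be illustrated with the smallest possible case, a bivariate Gaussian copula. The sketch below is not the paper's Bayesian Network. It only shows how two correlated standard normals are turned into dependent uniform variables. The correlation `rho`, the sample size, and the function name are assumptions for illustration.

```python
import math
import random
from statistics import NormalDist


def gaussian_copula_pairs(rho, n, seed=0):
    """Draw n pairs (u, v) of uniforms whose dependence follows a Gaussian copula."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        # 2x2 Cholesky factor of [[1, rho], [rho, 1]] correlates the two normals
        x = z1
        y = rho * z1 + math.sqrt(1.0 - rho * rho) * z2
        # the standard normal CDF maps each coordinate to a uniform on [0, 1]
        pairs.append((std_normal.cdf(x), std_normal.cdf(y)))
    return pairs
```

Each uniform pair could then be pushed through the inverse CDFs of the chosen marginal distributions (for example, of river discharge and sea level) to obtain dependent boundary conditions. Those marginals are not given in the abstract and would have to be fitted separately.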
ARTICLE | doi:10.20944/preprints201709.0099.v1
Subject: Biology, Forestry Keywords: near-infrared spectroscopy; multivariate analysis; partial least-squares regression; floor litter; optimal wavelength selection
Online: 21 September 2017 (04:36:21 CEST)
Near-infrared spectroscopy (NIRS) was implemented to monitor the moisture content of broadleaf litters. Partial least-squares regression (PLSR) models, incorporating optimal wavelength selection techniques, have been proposed to better predict the litter moisture of the forest floor. Three broadleaf litters were used to sample the reflection spectra corresponding to the different degrees of litter moisture. A maximum normalization preprocessing technique was successfully applied to remove unwanted noise from the reflectance spectra of litters. Four variable selection methods were also employed to extract the optimal subset of measured spectra for establishing the best prediction model. The results showed that the PLSR model with the peak of beta coefficients method was the best predictor among all candidate models. The proposed NIRS procedure is thought to be a suitable technique for on-the-spot evaluation of litter moisture.
ARTICLE | doi:10.20944/preprints202208.0283.v1
Subject: Engineering, Industrial & Manufacturing Engineering Keywords: grinding; multivariate statistics; maintenance decision; condition-based maintenance; condition monitoring; health management; prognostics; fault diagnosis
Online: 16 August 2022 (09:44:46 CEST)
The stochastic nature of grinding processes poses a challenge in predicting the quality of the resulting surfaces. Post-production measurements for form, surface roughness, and circumferential waviness are commonly performed because it is infeasible to measure all quality parameters during the grinding operation. Therefore, it is challenging to diagnose in real time the root cause of quality deviations resulting from variations in the machine’s operating condition. This paper introduces a novel approach to predicting the overall quality of the individual parts. The grinder is equipped with sensors to implement condition-based maintenance and is induced with five frequently occurring failure conditions for the experimental test runs. The crucial quality parameters are measured for the produced parts. Fuzzy c-means (FCM) and Hotelling’s T-squared (T2) have been evaluated to generate quality labels from the multivariate quality data. Benchmarked random forest regression models are trained using the fault diagnosis feature set and quality labels. Quality labels from the T2 statistic of quality parameters are preferred over the FCM approach for their repeatability. The model trained on T2 labels achieves more than 94% accuracy when compared to the measured ring disposition. The predicted overall quality using the sensors’ feature set is compared against the threshold to reach a trustworthy maintenance decision.
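For readers unfamiliar with the T2 statistic named in this abstract, the following sketch computes Hotelling's T-squared distance of one new observation from a reference sample, restricted to two quality parameters so that the covariance matrix can be inverted by hand. The two-parameter restriction, the function name, and the input layout are assumptions. The abstract does not describe how the authors computed or thresholded the statistic.

```python
def hotelling_t2_2d(reference, x):
    """T2 = d^T S^-1 d for a 2-D point x, with mean and covariance S from `reference`."""
    n = len(reference)
    m0 = sum(r[0] for r in reference) / n
    m1 = sum(r[1] for r in reference) / n
    # unbiased sample covariance entries
    s00 = sum((r[0] - m0) ** 2 for r in reference) / (n - 1)
    s11 = sum((r[1] - m1) ** 2 for r in reference) / (n - 1)
    s01 = sum((r[0] - m0) * (r[1] - m1) for r in reference) / (n - 1)
    det = s00 * s11 - s01 * s01
    d0, d1 = x[0] - m0, x[1] - m1
    # closed-form quadratic form using the inverse of the 2x2 covariance matrix
    return (s11 * d0 * d0 - 2.0 * s01 * d0 * d1 + s00 * d1 * d1) / det
```

A part whose T2 value exceeds a chosen control limit would be labeled as deviating in overall quality. The limit itself is a design choice that is not specified here.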
ARTICLE | doi:10.20944/preprints202203.0205.v1
Subject: Earth Sciences, Environmental Sciences Keywords: heavy metals; abandoned mine; soil pollution; potential ecological risk; multivariate analysis; health index; soil; sediments
Online: 15 March 2022 (10:58:46 CET)
A recent survey that determined heavy metal concentrations in an abandoned Hg mine in Palawan, Philippines, found the occurrence of Hg with As, Ba, Cd, Co, Cr, Cu, Fe, Hg, Mn, Ni, Pb, Sb, Tl, V, and Zn. While the Hg originated from the mine waste calcines, as supported by previous studies, the origin of the other heavy metals remains unknown. Our study investigated the sources of heavy metal pollution surrounding the abandoned Hg mine and assessed the soil and sediment quality, ecological risks, and health risks associated with these toxic metals. Multivariate analyses, such as hierarchical cluster analysis (HCA), principal component analysis (PCA), and Pearson correlation analysis, were used to identify the heavy metal sources from the results of a previous paper. Our results showed that Fe, Ni, Cr, Co, and Mn are associated with the ultramafic geology of the study area, whereas As, Ba, Cd, Cu, Pb, Sb, Tl, V, and Zn are likely due to historical mining and processing of cinnabar from 1953 to 1976. The mine waste calcines were used as construction material for the wharf and as land filler for the adjacent communities. The modified contamination factor (mCdeg) showed that the coast of Honda Bay is highly contaminated, while the inland areas, including the rivers, are very- to ultra-highly contaminated. There is a considerable ecological risk associated with the heavy metals, wherein Ni, Hg, Cr, and Mn contribute an average of 46.3 %, 26.3 %, 11.2 %, and 9.3 % to the potential ecological risk index (RI), respectively. The overall mean hazard index (HI) for both adults (1.4) and children (12.1) exceeded 1, implying the probability of non-carcinogenic adverse effects. The mean total cancer risk over a lifetime (LCR) for both adults (1.19×10⁻³) and children (2.89×10⁻³) exceeded the tolerable threshold of 10⁻⁴, suggesting a potentially high risk of developing cancer, mainly through Ni, Co, and Cr exposure.
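The two aggregate indices quoted in this abstract follow simple summation rules, sketched below under their commonly used definitions: the hazard index as a sum of hazard quotients (dose divided by reference dose), and the potential ecological risk index as a sum of toxic-response factors multiplied by contamination factors (concentration divided by background). The function names and the dictionary layout are hypothetical, and no values from the study are reproduced.

```python
def hazard_index(doses, reference_doses):
    """Sum of hazard quotients HQ = dose / reference dose over all metals."""
    return sum(doses[m] / reference_doses[m] for m in doses)


def ecological_risk_index(concentrations, backgrounds, toxic_response):
    """Hakanson-style RI: sum over metals of Tr * (concentration / background)."""
    return sum(
        toxic_response[m] * concentrations[m] / backgrounds[m]
        for m in concentrations
    )
```

Under these definitions, an HI above 1 means that the combined exposure exceeds the level regarded as free of non-carcinogenic effects, which is how the values of 1.4 and 12.1 in the abstract are interpreted.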
ARTICLE | doi:10.20944/preprints202106.0439.v1
Subject: Earth Sciences, Atmospheric Science Keywords: heavy metals; MMORS, Meycauayan River; soil pollution; multivariate analysis; Sediment Quality Guidelines; Single Pollution Index
Online: 16 June 2021 (10:34:20 CEST)
The City of Meycauayan is considered one of the most polluted cities in the developing world on account of industrial discharges of toxic materials to the environment. This work investigated the sources of the heavy metal pollution by analyzing soil and sediment samples for heavy metals (Cr, Hg, Ni, and Pb) together with selected environmental indicators (TN, TOM, and TP) located along the Meycauayan River. Hierarchical cluster analysis (HCA), principal components analysis (PCA), and Pearson correlation analysis (CA) were used to identify the sources of the metals. Results showed delineated locations of severe levels of heavy metal pollution downstream because of the concentration of industrial activities. Cr contributed more than any other heavy metal analyzed, due to the proliferation of tanneries discharging untreated wastewaters to the river. Significant inputs of Pb and Hg from Pb-acid battery recycling and gold smelting industries were also found. Risk assessments indicated severe levels of heavy metal pollution where industrial activities are concentrated. The mean Cr, Pb, Ni, and Hg in the sampling locations have mean incidences of toxicity of 91.7 %, 53.6 %, 27.7 %, and 70.0 %, respectively. Our study showed a serious need to address heavy metal pollution in Meycauayan.
ARTICLE | doi:10.20944/preprints201809.0038.v1
Subject: Earth Sciences, Environmental Sciences Keywords: Karenia brevis, harmful algal bloom (HAB), moderate resolution imaging Spectroradiometer (MODIS), prediction, chlorophyll, multivariate regression
Online: 3 September 2018 (13:52:41 CEST)
Over the past two decades, persistent occurrences of harmful algal blooms (HAB; Karenia brevis) have been reported in Charlotte County, southwestern Florida. We developed data-driven models that rely on spatiotemporal remote sensing and field data to identify factors controlling HAB propagation, provide a same-day distribution (nowcasting), and forecast their occurrences up to three days in advance. We constructed multivariate regression models using historical HAB occurrences (213 events reported from January 2010 to October 2017) compiled by the Florida Fish and Wildlife Conservation Commission and validated the models against a subset (20%) of the reported historical events. The models were designed to specifically capture the onset of the HABs instead of those that developed days earlier and continued thereafter. A prototype of an early warning system was developed through a threefold exercise. The first step involved the automatic downloading and processing of daily Moderate Resolution Imaging Spectroradiometer (MODIS) Aqua products using SeaDAS ocean color processing software to extract temporal and spatial variations of remote sensing-based variables over the study area. The second step involved the development of a multivariate regression model for same-day mapping of HABs and similar subsequent models for forecasting HAB occurrences one, two, and three days in advance. Eleven remote sensing variables and two non-remote sensing variables were used as inputs for the generated models. In the third and final step, model outputs (same-day and forecasted distribution of HABs) were posted automatically on a web-based GIS (http://www.esrs.wmich.edu/webmap/bloom/). 
Our findings include the following: (1) the variables most indicative of the timing of bloom propagation are bathymetry, euphotic depth, wind direction, SST, chlorophyll-a [OC3M] and distance from the river mouth, and (2) the model predictions were 90% successful for same-day mapping and 65%, 72% and 71% for the one-, two- and three-day advance predictions, respectively. The adopted methodologies are reliable, dependent on readily available remote sensing data sets, and cost-effective and thus could potentially be used to map and forecast algal bloom occurrences in data-scarce regions.
COMMUNICATION | doi:10.20944/preprints202111.0549.v1
Subject: Keywords: Principal Component Regression, Partial Least Squares, Orthogonal Partial Least Squares, multivariate regression, hypothesis generation, Parkinson’s disease
Online: 29 November 2021 (15:42:03 CET)
In the current era of ‘big data’, scientists are able to quickly amass enormous amounts of data in a limited number of experiments. The investigators then try to hypothesize about the root cause based on the observed trends for the predictors and the response variable. This involves identifying the discriminatory predictors that are most responsible for explaining variation in the response variable. In the current work, we investigated three related multivariate techniques: Principal Component Regression (PCR), Partial Least Squares or Projections to Latent Structures (PLS), and Orthogonal Partial Least Squares (OPLS). To perform a comparative analysis, we used a publicly available dataset for Parkinson’s disease patients. We first performed the analysis using a cross-validated number of principal components for the aforementioned techniques. Our results demonstrated that PLS and OPLS were better suited than PCR for identifying the discriminatory predictors. Since the X data did not exhibit a strong correlation, we also performed Multiple Linear Regression (MLR) on the dataset. A comparison of the top five discriminatory predictors identified by the four techniques showed a substantial overlap between the results obtained by PLS, OPLS, and MLR, and the three techniques exhibited a significant divergence from the variables identified by PCR. A further investigation of the data revealed that PCR could be used to identify the discriminatory variables successfully if the number of principal components in the regression model were increased. In summary, we recommend using PLS or OPLS for hypothesis generation and systemizing the selection process for principal components when using PCR.
ARTICLE | doi:10.20944/preprints201810.0523.v1
Subject: Biology, Other Keywords: spatiotemporal neural dynamics; vision; dorsal and ventral streams; multivariate pattern analysis; representational similarity analysis; fMRI; MEG
Online: 23 October 2018 (06:41:16 CEST)
To build a representation of what we see, the human brain recruits regions throughout the visual cortex in cascading sequence. Recently, an approach was proposed to evaluate the dynamics of visual perception in high spatiotemporal resolution at the scale of the whole brain. This method combined functional magnetic resonance imaging (fMRI) data with magnetoencephalography (MEG) data using representational similarity analysis and revealed a hierarchical progression from primary visual cortex through the dorsal and ventral streams. To assess the replicability of this method, here we present results of a visual recognition neuro-imaging fusion experiment, and compare them within and across experimental settings. We evaluated the reliability of this method by assessing the consistency of the results under similar test conditions, showing high agreement within participants. We then generalized these results to a separate group of individuals and visual input by comparing them to the fMRI-MEG fusion data of Cichy et al. (2016), revealing a highly similar temporal progression recruiting both the dorsal and ventral streams. Together these results are a testament to the reproducibility of the fMRI-MEG fusion approach and allow for the interpretation of these spatiotemporal dynamics in a broader context.
REVIEW | doi:10.20944/preprints201807.0241.v1
Subject: Materials Science, Biomaterials Keywords: biomaterial; bone regeneration; drug release; hydrogel; lignin; multivariate data processing; osteogenesis; scaffolds; stem cells; tissue engineering
Online: 13 July 2018 (15:07:37 CEST)
Renewable resources are gaining increasing interest as a source of environmentally benign biomaterials, such as drug encapsulation/release compounds and scaffolds for tissue engineering in regenerative medicine. As lignin is the second most abundant natural polymer, interest in its valorization for biomedical utilization is rapidly growing. Depending on resource and isolation procedure, lignin shows specific antioxidant and antimicrobial activity. Today, efforts in research and industry are directed toward lignin utilization as a renewable macromolecular building block for the preparation of polymeric drug encapsulation and scaffold materials. Within the last five years, remarkable progress has been made in isolation, functionalization and modification of lignin and lignin-derived compounds. However, the literature so far mainly focuses on lignin-derived fuels, lubricants and resins. The purpose of this review is to summarize the current state of the art and to highlight the most important results in the field of lignin-based materials for potential use in biomedicine (reported in 2014–2018). Special focus is placed on lignin-derived nanomaterials for drug encapsulation and release as well as lignin hybrid materials used as scaffolds for guided bone regeneration in stem cell-based therapies.
ARTICLE | doi:10.20944/preprints201804.0161.v1
Subject: Life Sciences, Other Keywords: electronic nose; nanowire gas sensors; food quality control; Parmigiano Reggiano; multivariate data analysis; artificial neural network
Online: 12 April 2018 (06:25:29 CEST)
Parmigiano Reggiano cheese is one of the most appreciated and consumed foods worldwide, especially in Italy, for its high content of nutrients and for its taste. However, these characteristics make this product subject to counterfeiting in different forms. In this study, a novel method based on an electronic nose has been developed to investigate the potential of this tool to determine the rind percentage in grated Parmigiano Reggiano packages, which should be lower than 18%. Samples differing in rind percentage, seasoning and rind working process were considered to tackle the problem from all angles. In parallel, the GC-MS technique was used to identify the compounds that characterize Parmigiano and to relate them to the sensor responses. Data analysis consisted of two stages: multivariate analysis (PLS) and classification performed hierarchically with PLS-DA and ANNs. Results are promising in terms of correct classification of the samples. The classification rate is higher for ANNs than for PLS-DA, reaching values as high as 100%.
ARTICLE | doi:10.20944/preprints201808.0519.v1
Subject: Chemistry, Analytical Chemistry Keywords: metabolomics; γ-Hydroxybutyric acid; polyamine profiling analysis, gas chromatography-mass spectrometry; star pattern recognition analysis; multivariate analysis
Online: 30 August 2018 (08:14:55 CEST)
1) Background: Recently, illegal abuse of γ-hydroxybutyric acid (GHB) has increased in drug-facilitated crimes, but determination of GHB exposure and intoxication is difficult due to the rapid metabolism of GHB. Its biochemical mechanism has not been completely investigated, and no metabolomic study using polyamine profile and pattern analyses has been performed in rat urine following intraperitoneal injection of GHB. 2) Methods: Polyamine profiling analysis by gas chromatography-mass spectrometry combined with star pattern recognition analysis was performed in this study. Multivariate statistical analysis was used to evaluate discrimination between control and GHB administration groups. 3) Results: Six polyamines were determined in control, single and multiple GHB administration groups. Star patterns showed distorted hexagonal shapes with characteristic and readily distinguishable patterns for each group. N1-Acetylspermine (p < 0.001), putrescine (p < 0.006), N1-acetylspermidine (p < 0.009), and spermine (p < 0.027) were significantly increased in the single administration group but were significantly lower in the multiple administration group than in the control group. N1-Acetylspermine was the main polyamine for discrimination between control, single and multiple administration groups. Spermine showed similar levels in single and multiple administration groups. 4) Conclusions: The polyamine metabolic pattern was monitored in GHB administration groups. N1-Acetylspermine and spermine were evaluated as potential biomarkers of GHB exposure and addiction.
ARTICLE | doi:10.20944/preprints201902.0039.v1
Subject: Mathematics & Computer Science, Probability And Statistics Keywords: Volatility; Stocks; Persistence; Exchange Rate, Inflation Rate; Financial Time Series; Generalized Autoregressive Conditional Heteroscedasticity (GARCH); Multivariate GARCH (MGARCH)
Online: 4 February 2019 (14:56:07 CET)
The aim of this research work was to provide a model for predicting stock volatility in the Nigerian stock market. To achieve this, monthly data for the Nigerian Stock Exchange, exchange rate, share index and inflation rate were collected for the period from January 1990 to December 2016. The descriptive statistics revealed that these variables exhibit volatility, as is characteristic of time-varying financial series. A DCC model was fitted, in which the coefficients of all parameters and of the correlation targeting term (rho_21) take both negative and positive values and lie very close to 1 and -1, indicating high persistence in the conditional variances. The DCC model satisfied the properties of a good model of the conditional mean and variance within the confidence interval (C.I.) of 1 and -1; that is, the conditional variances are finite and their series are strictly stationary. This implies that the Nigerian Stock Exchange, exchange rate, share index and inflation rate will experience non-steady shocks in the stock market. However, each of these variables has a different length of recovery (volatility half-life): 1.5 months, 6.5 months, 6 months and 2.4 months for the stock exchange, exchange rate, share index and inflation rate, respectively. By implication, the volatility of these variables has long memory and is persistent and mean-reverting.
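The abstract does not say how the volatility half-lives were obtained. A common definition, assumed here, ties the half-life to the persistence of a GARCH-type variance process (for GARCH(1,1), the sum of the ARCH and GARCH coefficients): it is the number of periods after which a shock to the conditional variance has decayed by half. The function names below are hypothetical.

```python
import math


def volatility_half_life(persistence):
    """Periods needed for a variance shock to halve, given persistence in (0, 1)."""
    if not 0.0 < persistence < 1.0:
        raise ValueError("persistence must lie strictly between 0 and 1")
    return math.log(0.5) / math.log(persistence)


def implied_persistence(half_life):
    """Inverse relation: the persistence that corresponds to a given half-life."""
    if half_life <= 0.0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (1.0 / half_life)
```

Under this assumed definition, a longer half-life corresponds to persistence closer to 1, so a variable with a half-life of several months absorbs shocks more slowly than one whose half-life is close to a single month.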
ARTICLE | doi:10.20944/preprints201711.0114.v1
Subject: Earth Sciences, Environmental Sciences Keywords: Hydrochemical characteristics; water-rock interaction; multivariate statistical analysis; mixing model; δD and δ18O isotopes; natural water system; Kangding County
Online: 17 November 2017 (12:34:26 CET)
The utilization of water resources is of great concern to human life. To assess the natural water system in Kangding County, integrated methods of hydrochemical analysis, multivariate statistics and geochemical modelling were applied to surface water, groundwater and thermal water samples. Surface water and groundwater were dominated by the Ca-HCO3 type, while thermal water belonged to the Ca-HCO3 and Na-Cl types. The analysis identified the driving factors that affect the hydrochemical components. According to the combined assessments, the hydrochemical process was controlled by the dissolution of carbonate and silicate minerals, with slight influence from anthropogenic activity. The mixing model of groundwater and thermal water was calculated using the silica-enthalpy method, yielding a cold-water fraction of 0.56-0.79 and an estimated reservoir temperature of 130-199 °C. δD and δ18O isotopes suggested that surface water, groundwater and thermal springs were of meteoric origin. Thermal water is likely to circulate deeply through the Xianshuihe fault zone, while groundwater flows through secondary fractures, where it is recharged with thermal water. These analytical results were used to construct a hydrological conceptual model, providing a better understanding of the natural water system in Kangding County.
ARTICLE | doi:10.20944/preprints202010.0436.v1
Subject: Keywords: Naïve Bayes Classification; Eulers Strength Formula; Cricket Prediction; Supervised Learning; KNIME Tool; Cricket prediction; sports analytics; multivariate regression; neural network
Online: 21 October 2020 (12:34:00 CEST)
In cricket, the Twenty20 format is the most watched and loved, since no one can guess who will win a match until the last ball of the last over. The Indian Premier League (IPL) started in India in 2008 and is now the most popular T20 league in the world, so we decided to develop a machine learning model for predicting the outcome of its matches. Winning a cricket match depends on many key factors, such as home ground advantage, past performances on that ground, records at the same venue, the overall experience of the players, the record against a particular opposition, and the current form of the team and of individual players. This paper outlines the key factors that affect the result of a cricket match and the regression model that best fits this data and gives the best predictions. Cricket is a mainstream and widely played sport across India, with the largest fan base. The Indian Premier League follows the 20-20 format, which is very unpredictable. The IPL match predictor is an ML-based prediction approach in which the data sets and previous stats are used for training across all important factors, such as toss, home ground, captains, favorite players, opposition battle and previous stats, with each factor having a different strength, with the help of the KNIME tool, a Naive Bayes network and Euler's strength calculation formula.
ARTICLE | doi:10.20944/preprints202010.0328.v1
Subject: Engineering, Automotive Engineering Keywords: wearable biosensors; wireless technology; human grip force; motor control; complex task-user systems; expertise; multivariate data; correlation analysis; functional analysis
Online: 15 October 2020 (15:13:43 CEST)
Biosensors and wearable sensor systems with transmitting capabilities are currently being developed and used for the monitoring of health data, exercise activities, and other performance data. Unlike conventional approaches, these devices enable convenient, continuous, and unobtrusive monitoring of a user's behavioral signals in real time. Examples include signals relating to hand and finger movement/pressure control, reflected by individual grip force data. As will be shown here, these translate directly into task-, skill- and hand-specific (dominant versus non-dominant hand) grip force profiles for different measurement loci in the fingers and palm of the hand. On the basis of thousands of sensor data from multiple sensor locations, individual grip force profiles of a task expert, a trained user and a highly proficient user (expert) performing an image-guided and robot-assisted precision task with the dominant or the non-dominant hand are analyzed in several steps following Tukey's "detective work" approach. Correlation analyses (Pearson's Product Moment) reveal skill-specific differences in individual grip force profiles across multiple sources of variation, functionally mapped to the somatosensory brain networks which ensure grip force control and its evolution with control expertise. Implications for the real-time monitoring of individual grip force profiles and their evolution with training in complex task-user systems are brought forward.
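The product-moment correlation named in this abstract can be illustrated with a minimal sketch. The function name and the sample values are hypothetical, chosen only to show the computation; they are not grip force data from the study.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical readings from two sensor locations (arbitrary units).
    sensor_a = [0.8, 1.1, 1.5, 1.9, 2.4]
    sensor_b = [0.5, 0.9, 1.2, 1.8, 2.1]
    print(f"r = {pearson_r(sensor_a, sensor_b):.3f}")
```

A value near 1 would indicate that the two sensor locations vary together over the task, and a value near -1 that one rises as the other falls.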
ARTICLE | doi:10.20944/preprints201904.0218.v1
Subject: Social Sciences, Business And Administrative Sciences Keywords: eco-innovation; anticipated regulation; self-regulation; industry-specific characteristics; information sourcing openness; multivariate probit model; zero inflated negative binomial model
Online: 19 April 2019 (11:25:06 CEST)
The move to a low-carbon economy is very important for enhancing international competitiveness, and eco-innovation is the critical factor of the green paradigm. This study investigates in depth the determinants of eco-innovation among manufacturing firms in Korea, proposing anticipated regulations, self-regulations and industry-specific characteristics as external factors and open information sources as internal factors. The data used in the analysis cover 1946 sample firms from the Korean Innovation Survey 2010, which is based on the Oslo Manual. Using multivariate probit analysis and zero-inflated negative binomial (ZINB) regression analysis, we find that anticipated regulations and self-regulations have significant influences on both eco-process innovation and eco-product innovation, while industrial characteristics have no effect. The empirical results also show that the breadth of information sources has a positive effect on businesses implementing eco-innovations. Our findings suggest that the Korean government should provide a good platform where firms can better understand the future trends of environmental policies, particularly policies on anticipated and self-regulations. At the same time, Korean firms should establish a voluntary system to control environmental activities so that they can improve eco-innovations by integrating external information.
ARTICLE | doi:10.20944/preprints202201.0317.v1
Subject: Social Sciences, Econometrics & Statistics Keywords: Cohort-Component Method; Multivariate Methods; Time Series Analysis; Monte Carlo Methods; Stochastic Forecasting; Demography; Statistical Epidemiology; Labor Market Research; Health Economics
Online: 21 January 2022 (10:32:54 CET)
Demographic change is leading to the aging of German society. As long as the baby boom cohorts are still of working age, the working population will also age, and it will decline as soon as this baby boom generation gradually reaches retirement age. At the same time, there has been a trend towards increasing absenteeism (times of inability to work) in companies since the 2000s, with the number of days of absence increasing with age. We present a novel stochastic forecast approach that combines population forecasting with forecasts of labor force participation trends, considering epidemiological aspects. For this, we combine a stochastic Monte Carlo-based cohort-component forecast of the population with projections of labor force participation rates and morbidity rates. This article examines the purely demographic effect on the economic costs associated with such absenteeism due to the inability to work. Under expected future employment patterns and constant morbidity patterns, absenteeism is expected to increase by close to 5 percent by 2050 relative to 2020, associated with increasing economic costs of almost 3 percent. Our results illustrate how strongly the pronounced baby boom/baby bust phenomenon determines demographic development in Germany in the midterm.
ARTICLE | doi:10.20944/preprints201804.0157.v1
Subject: Mathematics & Computer Science, Probability And Statistics Keywords: information theory; cohomology; algebraic topology; topological data analysis; genetic expression; epigenetics; machine learning; statistical physic; multivariate mutual-information; complex systems; biodiversity
Online: 12 April 2018 (05:35:26 CEST)
This paper establishes methods that quantify the structure of statistical interactions within a given data set using the characterization of information theory in cohomology by finite methods, and provides their expression in terms of statistical physics and machine learning. Following [1–3], we show directly that k multivariate mutual-informations (Ik) are k-coboundaries. The k-cocycles are given by Ik = 0, which generalizes statistical independence to arbitrary dimension k. The topological approach makes it possible to investigate Shannon's information in the multivariate case without the assumption of independent identically distributed variables. We develop the computationally tractable subcase of simplicial information cohomology represented by entropy Hk and information Ik landscapes. The I1 component defines a self-internal energy functional Uk, and the (−1)k Ik,k≥2 components define the contribution to a free energy functional Gk of the k-body interactions. The set of information paths in simplicial structures is in bijection with the symmetric group and random processes, provides a topological expression of the 2nd law and points toward a discrete Noether theorem (1st law). The local minima of free energy, related to conditional information negativity and the non-Shannonian cone of Yeung, characterize a minimum free energy complex. This complex formalizes the minimum free-energy principle in topology, provides a definition of a complex system, and characterizes a multiplicity of local minima that quantifies the diversity observed in biology. Finite data size effects and estimation bias severely constrain the effective computation of the information topology on data, and we provide simple statistical tests for the undersampling bias and for the k-dependences following . We give an example of application of these methods to genetic expression and cell-type classification.
The maximal positive Ik identifies the variables that co-vary the most in the population, whereas the minimal negative Ik identifies clusters and the variables that differentiate-segregate the most. The methods unravel biologically relevant I10 with a sample size of 41. It establishes generic methods to quantify the epigenetic information storage and a unified epigenetic unsupervised learning formalism.
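The sign behaviour of Ik described in this abstract can be illustrated with a minimal sketch. It assumes the standard alternating-sum definition of multivariate mutual information, Ik = Σ over non-empty subsets S of (−1)^(|S|−1) H(X_S), and a plain plug-in entropy estimate from observed frequencies. The function names are hypothetical, and the sketch omits the undersampling tests the abstract mentions.

```python
import math
from collections import Counter
from itertools import combinations
from typing import Hashable, Sequence, Tuple

Row = Tuple[Hashable, ...]


def entropy(rows: Sequence[Row], cols: Sequence[int]) -> float:
    """Plug-in joint entropy (in bits) of the selected columns."""
    counts = Counter(tuple(row[c] for c in cols) for row in rows)
    n = len(rows)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information_k(rows: Sequence[Row], cols: Sequence[int]) -> float:
    """I_k as an alternating sum of joint entropies over all non-empty subsets.

    Assumed definition: I_k = sum_S (-1)^(|S| - 1) * H(X_S).
    """
    total = 0.0
    for size in range(1, len(cols) + 1):
        sign = (-1) ** (size - 1)
        for subset in combinations(cols, size):
            total += sign * entropy(rows, subset)
    return total


# Toy data: two independent fair bits and their XOR, one row per outcome.
xor_rows = [(x, y, x ^ y) for x in (0, 1) for y in (0, 1)]

if __name__ == "__main__":
    print("I2(X;Y)   =", mutual_information_k(xor_rows, (0, 1)))
    print("I3(X;Y;Z) =", mutual_information_k(xor_rows, (0, 1, 2)))
```

In the XOR toy data every pair of variables is independent (I2 = 0) while the triple is fully dependent, which gives a negative I3 of one bit. The number of subsets grows as 2^k, so this brute-force form is practical only for small k.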
ARTICLE | doi:10.20944/preprints201709.0112.v1
Subject: Mathematics & Computer Science, Computational Mathematics Keywords: multivariate logarithmic polynomial; generating function; completely monotonic function; Bernstein function; integral representation; Lévy-Khintchine representation; real part; imaginary part; uniform convergence; recurrence relation; mathematical induction
Online: 23 September 2017 (10:55:57 CEST)
In the paper, by induction and recursively, the author proves that the generating function of multivariate logarithmic polynomials and its reciprocal are a Bernstein function and a completely monotonic function respectively, establishes a Lévy-Khintchine representation for the generating function of multivariate logarithmic polynomials, deduces an integral representation for multivariate logarithmic polynomials, presents an integral representation for the reciprocal of the generating function of multivariate logarithmic polynomials, computes real and imaginary parts for the generating function of multivariate logarithmic polynomials, derives two integral formulas, and denies the uniform convergence of a known integral representation for Bernstein functions.