ARTICLE | doi:10.20944/preprints202007.0269.v1
Subject: Keywords: regularized latent class analysis; regularization; fused regularization; fused grouped regularization; distractor analysis
Online: 12 July 2020 (16:59:18 CEST)
The last series of the Raven's Standard Progressive Matrices test (SPM-LS) has been studied with respect to its psychometric properties in a number of recent papers. In this paper, the SPM-LS dataset is analyzed with regularized latent class models (RLCMs). For dichotomous item response data, an alternative estimation approach for RLCMs is proposed. For polytomous item responses, different alternatives for performing regularized latent class analysis are proposed. The usefulness of the proposed methods is demonstrated in a simulated data illustration and for the SPM-LS dataset, for which the regularized latent class model resulted in five partially ordered latent classes.
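As a rough illustration (not necessarily the paper's exact specification), a regularized latent class model for dichotomous items typically maximizes a penalized log-likelihood in which a fused penalty pulls class-specific item parameters toward each other:

\[
\ell_P(\boldsymbol{\pi},\mathbf{p}) \;=\; \sum_{n}\log\Bigl(\sum_{c=1}^{C}\pi_c\prod_{j} p_{jc}^{\,x_{nj}}(1-p_{jc})^{1-x_{nj}}\Bigr)\;-\;N\sum_{j}\sum_{c<c'}P_{\lambda}\bigl(\lvert \operatorname{logit} p_{jc}-\operatorname{logit} p_{jc'}\rvert\bigr),
\]

where $P_\lambda$ is a sparsity-inducing penalty (e.g., lasso or SCAD) and the fused terms encourage classes to share item parameters, so that classes collapse when the data do not support distinguishing them.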
ARTICLE | doi:10.20944/preprints202209.0231.v1
Subject: Mathematics & Computer Science, Probability And Statistics Keywords: neural networks; regularization; deep networks
Online: 15 September 2022 (13:06:13 CEST)
Numerous approaches address over-fitting in neural networks: by imposing a penalty on the parameters of the network (L1, L2, etc.); by changing the network stochastically (drop-out, Gaussian noise, etc.); or by transforming the input data (batch normalization, etc.). In contrast, we aim to ensure that a minimum amount of supporting evidence is present when fitting the model parameters to the training data. At the level of a single neuron, this is equivalent to ensuring that both sides of the separating hyperplane (for a standard artificial neuron) contain a minimum number of data points, noting that these points need not belong to the same class for the inner layers. We first benchmark the results of this approach on the standard Fashion-MNIST dataset, comparing it to various regularization techniques. Interestingly, we note that by nudging each neuron to divide, at least in part, its input data, the resulting networks make use of every neuron, avoiding hyperplanes that lie completely on one side of their input data (which is equivalent to passing a constant into the next layers). To illustrate this point, we study the prevalence of saturated nodes throughout training, showing that neurons are activated more frequently and earlier in training when using this regularization approach. A direct consequence of the improved neuron activation is that deep networks become easier to train. This is crucially important when the network topology is not known a priori and fitting often remains stuck in a suboptimal local minimum. We demonstrate this property by training networks of increasing depth (and constant width): most regularization approaches result in increasingly frequent training failures (over different random seeds), whilst the proposed evidence-based regularization significantly outperforms them in its ability to train deep networks.
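The abstract does not give the penalty in closed form; the following sketch is one plausible differentiable surrogate (the function name, the soft-sign sharpness, and the minimum fraction are assumptions, not the authors' formulation) for encouraging each neuron to keep a minimum share of the batch on each side of its hyperplane:

```python
import tensorflow as tf

def evidence_penalty(preacts, min_frac=0.1, sharpness=10.0):
    """Hypothetical surrogate for evidence-based regularization.

    preacts: (batch, n_neurons) pre-activations of one layer.
    Encourages at least `min_frac` of the batch to fall on each side of every
    neuron's hyperplane, using a soft (sigmoid) indicator so the penalty stays
    differentiable.
    """
    frac_pos = tf.reduce_mean(tf.sigmoid(sharpness * preacts), axis=0)  # share of batch on positive side
    frac_neg = 1.0 - frac_pos
    shortfall = tf.nn.relu(min_frac - tf.minimum(frac_pos, frac_neg))   # shortfall below the required share
    return tf.reduce_sum(shortfall)

# usage (hypothetical): total_loss = task_loss + lambda_reg * evidence_penalty(layer_preactivations)
```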
ARTICLE | doi:10.20944/preprints202104.0515.v2
Subject: Mathematics & Computer Science, Algebra & Number Theory Keywords: LSTM; convolution; regularization; stock market
Online: 20 April 2021 (21:12:17 CEST)
Many studies in the current literature annotate patterns in stock prices and use computer vision models to learn and recognize these patterns from stock price-action chart images. The current literature also uses Long Short-Term Memory networks to predict prices from continuous dollar-amount data. In this study, we combine the two techniques. We annotate the consolidation breakouts for given stock price data, and we use continuous stock price data to predict consolidation breakouts. Unlike computer vision models that look at an image of the stock price action, we explore using the convolution operation on raw dollar values to predict consolidation breakouts in a supervised learning setting. Unlike LSTMs that predict stock prices given continuous stock data, we use the continuous stock data to classify a given price window as breakout or not. Finally, we perform a regularization study to examine the effect of L1, L2, and Elastic Net regularization. We hope that combining regression and classification sheds more light on stock market prediction studies.
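A minimal sketch of such a regularization study (the architecture, window length, and penalty strengths are illustrative assumptions, not the authors' settings): a 1-D convolution over raw price windows with a swappable kernel regularizer.

```python
import tensorflow as tf
from tensorflow.keras import layers, regularizers

def build_breakout_classifier(window_len=60, reg=regularizers.l1_l2(l1=1e-5, l2=1e-4)):
    """Classify a raw-price window as consolidation breakout vs. no breakout.
    Pass regularizers.l1(...), regularizers.l2(...), or regularizers.l1_l2(...)
    to compare L1, L2, and Elastic Net penalties."""
    model = tf.keras.Sequential([
        layers.Input(shape=(window_len, 1)),                              # raw dollar values, one channel
        layers.Conv1D(16, 5, activation="relu", kernel_regularizer=reg),  # convolution on raw prices
        layers.GlobalMaxPooling1D(),
        layers.Dense(1, activation="sigmoid", kernel_regularizer=reg),    # breakout probability
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model
```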
ARTICLE | doi:10.20944/preprints201710.0165.v1
Subject: Physical Sciences, Applied Physics Keywords: magnetoencephalography; signal space separation; magnetometer; gradiometer; beamforming; regularization
Online: 27 October 2017 (03:50:25 CEST)
Background: Modern MEG devices include 102 sensor triplets, each containing one magnetometer and two planar gradiometers. The first processing step is often a signal space separation (SSS), which provides powerful noise reduction. A question commonly raised by researchers and reviewers is which data should be employed in source reconstruction: (1) magnetometers only, (2) gradiometers only, or (3) magnetometers and gradiometers together. The MEG community is currently divided about the proper answer, and strong arguments for and against each of these three approaches are often expressed. Methods: First, we provide theoretical evidence that gradiometers and magnetometers contain the same information after SSS, and argue that both result from the backprojection of the same SSS components. Then, we compare beamforming source reconstructions from magnetometers and gradiometers in real MEG recordings before and after SSS. Results: Without SSS, the correlation between source time series extracted from magnetometers and gradiometers was high, with Pearson correlation coefficients r = 0.5-0.8. After SSS, these correlation values increased dramatically, exceeding 0.90 across all cortical areas. Conclusions: After SSS, almost identical source reconstructions (r > 0.9) can be obtained with magnetometers and gradiometers, as long as regularization is selected appropriately to account for the different properties of the magnetometer and gradiometer covariance matrices.
ARTICLE | doi:10.20944/preprints201712.0197.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: air pollutant prediction; multi-task learning; regularization; analytical solution
Online: 28 December 2017 (09:09:20 CET)
In this paper, we tackle air quality forecasting by using machine learning approaches to predict the hourly concentration of air pollutants (e.g., ozone, PM2.5, and sulfur dioxide). Machine learning, as one of the most popular techniques, is able to efficiently train a model on big data by using large-scale optimization algorithms. Although some works exist that apply machine learning to air quality prediction, most of the prior studies are restricted to small-scale data and simply train standard regression models (linear or non-linear) to predict the hourly air pollution concentration. In this work, we propose refined models to predict the hourly air pollution concentration based on meteorological data of previous days by formulating the prediction of the 24 hours as a multi-task learning problem. This enables us to select a good model with different regularization techniques. We propose a useful regularization that enforces the prediction models of consecutive hours to be close to each other, and compare it with several typical regularizations for multi-task learning, including standard Frobenius norm regularization, nuclear norm regularization, and ℓ2,1-norm regularization. Our experiments show that the proposed formulations and regularization achieve better performance than existing standard regression models and existing regularizations.
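One plausible way to write the fused multi-task objective (the notation is assumed, not taken from the paper), with one weight vector $\mathbf{w}_h$ per forecast hour, is

\[
\min_{W=(\mathbf{w}_1,\dots,\mathbf{w}_{24})}\;\sum_{h=1}^{24}\bigl\lVert X\mathbf{w}_h-\mathbf{y}_h\bigr\rVert_2^2\;+\;\lambda\sum_{h=1}^{23}\bigl\lVert \mathbf{w}_{h+1}-\mathbf{w}_h\bigr\rVert_2^2,
\]

where the second term enforces the proposed closeness of consecutive hours; the compared alternatives replace it with $\lVert W\rVert_F^2$ (Frobenius norm), $\lVert W\rVert_*$ (nuclear norm), or $\lVert W\rVert_{2,1}$ regularization.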
ARTICLE | doi:10.20944/preprints202004.0288.v1
Subject: Mathematics & Computer Science, Analysis Keywords: branch points; bifurcation points; Fredholm operator; uniformization; asymptotics; iterations; regularization
Online: 17 April 2020 (01:34:37 CEST)
Necessary and sufficient conditions for the existence of branches of solutions of nonlinear operator equations in the neighbourhood of branching points are derived. The approach is based on the reduction of the nonlinear operator equations to finite-dimensional problems. Methods of nonlinear functional analysis, integral equations, spectral theory based on the Kronecker-Poincaré index, the Morse-Conley index, power geometry, and other methods are employed. The proposed methodology enables the justification of theorems on the existence of bifurcation points and bifurcation sets in nonstandard models. The formulated theorems are constructive. For a certain smoothness of the nonlinear operator, the asymptotic behaviour of the solutions is analysed in the neighbourhood of the branch points, and uniformly convergent iterative schemes with a choice of the uniformization parameter enable a comprehensive analysis of the details of the problems. The general theorems are illustrated with nonlinear integral equations.
ARTICLE | doi:10.20944/preprints202003.0022.v1
Subject: Mathematics & Computer Science, Applied Mathematics Keywords: Arbitrage-Regularization; Bond Pricing; Model Selection; Deep Learning; Dynamic PCA
Online: 2 March 2020 (01:15:00 CET)
A regularization approach to model selection, within a generalized HJM framework, is introduced, which learns the closest arbitrage-free model to a prespecified factor model. This optimization problem is represented as the limit of a one-parameter family of computationally tractable penalized model selection tasks. General theoretical results are derived and then specialized to affine term-structure models, where new types of arbitrage-free machine learning models for the forward-rate curve are estimated numerically and compared to classical short-rate models and the dynamic Nelson-Siegel factor model.
ARTICLE | doi:10.20944/preprints201911.0261.v1
Subject: Engineering, General Engineering Keywords: feature selection; locally linear embedding; regularization technology; bearing fault diagnosis
Online: 22 November 2019 (10:05:03 CET)
The purpose of feature selection is to find important features in the original high-dimensional space. As a typical feature selection algorithm, the locally linear embedding (LLE)-based feature selection algorithm, which applies the idea of LLE to the graph-preserving feature selection framework, has received wide attention. However, the LLE-based feature selection framework is sensitive to noise and to the choice of K-nearest neighbors. To address these problems, an improved LLE-based feature selection algorithm, robust LLE (RLLE) vote, is proposed. In this algorithm, $l_1$ and $l_2$ regularization are introduced into the high-dimensional reconstruction model of LLE. Furthermore, RLLE vote also introduces a criterion to measure the difference between the reconstructed features and the original features, and the important features can then be selected by this criterion. Extensive experiments are carried out on a benchmark fault dataset and a bearing dataset collected in our own laboratory, and the experimental results demonstrate that RLLE vote achieves the best performance compared with existing state-of-the-art methods.
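A plausible form of the regularized high-dimensional reconstruction step (the paper's exact weighting may differ) replaces LLE's least-squares weights with an elastic-net-penalized fit over the $K$ nearest neighbours:

\[
\min_{\mathbf{w}_i}\;\Bigl\lVert \mathbf{x}_i-\sum_{j\in N_K(i)} w_{ij}\,\mathbf{x}_j\Bigr\rVert_2^2\;+\;\lambda_1\lVert\mathbf{w}_i\rVert_1\;+\;\lambda_2\lVert\mathbf{w}_i\rVert_2^2,
\]

which makes the reconstruction weights less sensitive to noisy samples and to the choice of $K$.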
ARTICLE | doi:10.20944/preprints201710.0076.v2
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: big data; machine learning; regularization; data quality; robust learning framework
Online: 17 October 2017 (03:47:41 CEST)
The concept of ‘big data’ has been widely discussed, and its value has been illuminated throughout a variety of domains. To quickly mine potential value and handle the ever-increasing volume of information, machine learning is playing an increasingly important role and faces more challenges than ever. Because few studies exist regarding how to modify machine learning techniques to accommodate big data environments, we provide a comprehensive overview of the evolution of big data, the foundations of machine learning, and the bottlenecks and trends of machine learning in the big data era. More specifically, based on learning principles, we discuss regularization to enhance generalization. The data quality challenges of big data are reduced to the curse of dimensionality, class imbalance, concept drift, and label noise, and the underlying reasons and mainstream methodologies to address these challenges are introduced. Learning model development has been driven by domain specifics, dataset complexities, and the presence or absence of human involvement. In this paper, we propose a robust learning paradigm by aggregating the aforementioned factors. Over the next few decades, we believe that these perspectives will lead to novel ideas and encourage more studies aimed at incorporating knowledge and establishing data-driven learning systems that involve both data quality considerations and human interactions.
ARTICLE | doi:10.20944/preprints202205.0232.v1
Subject: Engineering, Civil Engineering Keywords: concrete tensile fatigue; neural networks; Bayesian regularization; parameter assessment; fatigue life prediction
Online: 17 May 2022 (13:53:48 CEST)
The fatigue life of concrete is affected by many interwoven factors whose effect is nonlinear. Because of its unique self-learning ability and strong generalization capability, the Bayesian regularized backpropagation neural network (BR-BPNN) is proposed to predict concrete behavior in tensile fatigue. The optimal model was determined through various combinations of network parameters. The average relative impact value (ARIV) was constructed to evaluate the correlation between fatigue life and its influencing parameters (maximum stress level Smax, stress ratio R, static strength f, failure probability P). ARIV results were also compared with other factor assessment methods (weight equation and multiple linear regression analyses). Using BR-BPNN, S-N curves were then obtained for the combinations of R = 0.1, 0.2, 0.5; f = 5, 6, 7 MPa; P = 5%, 50%, 95%. The tensile fatigue results under different testing conditions were finally compared for compatibility. It was concluded that Smax has the most significant negative effect on fatigue life; the degree of influence of R, P, and f, which positively correlate with fatigue life, decreases successively. ARIV is confirmed as a feasible way to analyze the importance of parameters and could be recommended for future applications. The tensile fatigue performance of plain concrete under different stress states (flexural tension, axial tension, splitting tension) does not differ significantly. Besides utilizing the valuable fatigue test data scattered in the literature, insights gained from this work could provide a reference for subsequent fatigue test program design and fatigue evaluation.
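For reference, in the standard formulation of Bayesian regularized backpropagation (e.g., MacKay's evidence framework, on which BR-BPNN builds), training minimizes a weighted sum of the data error and the weight norm, with the weights $\alpha,\beta$ inferred from the data rather than fixed by hand:

\[
F(\mathbf{w}) \;=\; \beta\, E_D(\mathbf{w}) \;+\; \alpha\, E_W(\mathbf{w}), \qquad E_D=\sum_n \bigl(t_n-\hat t_n(\mathbf{w})\bigr)^2, \quad E_W=\tfrac{1}{2}\lVert\mathbf{w}\rVert_2^2 ,
\]

which penalizes large weights and thereby improves generalization on the relatively small fatigue datasets available.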
Subject: Mathematics & Computer Science, Algebra & Number Theory Keywords: classification; optimization; batch normalization; kernel regularization; convolution; pooling; dropout layer; learning rate
Online: 20 July 2021 (09:34:53 CEST)
Alcoholism is attributed to regular or excessive drinking of alcohol and leads to disturbances of the neuronal system in the human brain. This results in certain malfunctions of neurons that can be detected by an electroencephalogram (EEG) using several electrodes placed at appropriate positions on the scalp. It is of great interest to be able to classify an EEG recording as that of a normal person or an alcoholic person using data from the minimum possible number of electrodes (or channels). Due to the complex nature of EEG signals, accurate classification of alcoholism using only a small amount of data is a challenging task. Artificial neural networks, specifically convolutional neural networks (CNNs), provide efficient and accurate results in various pattern-based classification problems. In this work, we apply a CNN to raw EEG data and demonstrate how we achieved 98% average accuracy by optimizing a baseline CNN model and outperforming its results across a range of performance evaluation metrics on the UCI-KDD EEG dataset. This article explains the step-wise improvement of the baseline model using the dropout, batch normalization, and kernel regularization techniques, and provides a comparison of the two models that can be beneficial for aspiring practitioners who aim to develop similar classification models with CNNs. A performance comparison with other approaches using the same dataset is also provided.
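A minimal sketch of how the three regularization additions can be combined in one model (layer sizes, rates, and input dimensions are illustrative assumptions, not the reported architecture):

```python
from tensorflow.keras import layers, models, regularizers

def build_eeg_cnn(n_samples=256, n_channels=4, l2=1e-4, drop=0.3):
    """Toy 1-D CNN over raw EEG windows combining kernel regularization,
    batch normalization, and dropout, as discussed in the abstract."""
    return models.Sequential([
        layers.Input(shape=(n_samples, n_channels)),
        layers.Conv1D(32, 7, padding="same", kernel_regularizer=regularizers.l2(l2)),
        layers.BatchNormalization(),            # stabilizes activations between layers
        layers.Activation("relu"),
        layers.MaxPooling1D(4),
        layers.Dropout(drop),                   # stochastic regularization
        layers.Flatten(),
        layers.Dense(64, activation="relu", kernel_regularizer=regularizers.l2(l2)),
        layers.Dropout(drop),
        layers.Dense(1, activation="sigmoid"),  # alcoholic vs. control
    ])
```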
ARTICLE | doi:10.20944/preprints201808.0219.v1
Subject: Mathematics & Computer Science, Computational Mathematics Keywords: reduced order modeling; regularization; fluid dynamics; stochastic Burgers Equation; proper orthogonal decomposition; spatial filter
Online: 13 August 2018 (08:12:13 CEST)
In this paper, we introduce the evolve-then-filter (EF) regularization method for reduced order modeling of convection-dominated stochastic systems. The standard Galerkin projection reduced order model (G-ROM) yields numerical oscillations in the convection-dominated regime. The evolve-then-filter reduced order model (EF-ROM) aims at the numerical stabilization of the standard G-ROM and uses an explicit ROM spatial filter to regularize various terms in the reduced order model (ROM). Our numerical results are based on a stochastic Burgers equation with linear multiplicative noise; they show that the EF-ROM yields significantly better results than the G-ROM.
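Schematically (with notation assumed here), each time step of the EF-ROM consists of an evolve step, which advances the standard G-ROM coefficients, followed by a filter step, which applies the ROM spatial filter to the intermediate solution:

\[
\text{evolve:}\quad \frac{\tilde{\mathbf{a}}-\mathbf{a}^{n}}{\Delta t}=\mathbf{F}\bigl(\mathbf{a}^{n}\bigr),
\qquad
\text{filter:}\quad \mathbf{a}^{n+1}=\overline{\tilde{\mathbf{a}}},
\]

where $\overline{\,\cdot\,}$ denotes the explicit ROM spatial filter that damps the small-scale oscillations responsible for the instability of the unfiltered G-ROM.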
ARTICLE | doi:10.20944/preprints202201.0411.v1
Subject: Engineering, General Engineering Keywords: compressive sensing; image reconstruction; regularization; total variation; augmented Lagrangian; non-local self-similarity; wavelet denoising
Online: 27 January 2022 (11:02:58 CET)
In remote sensing applications, one of the key points is the acquisition, real-time pre-processing, and storage of information. Due to the large amount of information present in the form of images or videos, compression of these data is necessary. Compressed sensing (CS) is an efficient technique to meet this challenge. It consists of acquiring a signal, assuming that it has a sparse representation, using a minimal number of non-adaptive linear measurements. After this CS process, a reconstruction of the original signal must be performed at the receiver. Reconstruction techniques are often unable to preserve the texture of the image and tend to smooth out its details. To overcome this problem, we propose in this work a CS reconstruction method that combines total variation regularization and a non-local self-similarity constraint. The optimization is performed via the augmented Lagrangian method, which avoids the difficult problem of the non-linearity and non-differentiability of the regularization terms. The proposed algorithm, called denoising compressed sensing by regularization terms (DCSR), performs not only image reconstruction but also denoising. To evaluate the performance of the proposed algorithm, we compare it with state-of-the-art methods, such as Nesterov's algorithm, group-based sparse representation, and wavelet-based methods, in terms of denoising and preservation of edges, texture, and image details, as well as computational complexity. Our approach gains up to 25% in denoising efficiency and visual quality, as measured by two metrics: PSNR and SSIM.
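In outline (symbols assumed, not the paper's notation), the reconstruction solves a penalized least-squares problem combining the two regularizers, which the augmented Lagrangian splits into simpler subproblems:

\[
\min_{\mathbf{x}}\;\tfrac12\lVert \mathbf{y}-\Phi\mathbf{x}\rVert_2^2\;+\;\lambda_{\mathrm{TV}}\,\mathrm{TV}(\mathbf{x})\;+\;\lambda_{\mathrm{NL}}\,\Psi_{\mathrm{NLSS}}(\mathbf{x}),
\]

where $\Phi$ is the CS measurement matrix, $\mathrm{TV}$ is the total-variation term, and $\Psi_{\mathrm{NLSS}}$ encodes the non-local self-similarity constraint.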
ARTICLE | doi:10.20944/preprints201705.0175.v2
Subject: Mathematics & Computer Science, Analysis Keywords: generalized functions; tempered distributions; regular functions; local functions; regularization-localization duality; regularity; Heisenberg’s uncertainty principle
Online: 17 July 2017 (06:25:18 CEST)
In this paper, we relate Poisson’s summation formula to Heisenberg’s uncertainty principle. Both express Fourier dualities within the space of tempered distributions, and these dualities are furthermore the inverses of one another. While Poisson’s summation formula expresses a duality between discretization and periodization, Heisenberg’s uncertainty principle expresses a duality between regularization and localization. We define regularization and localization on generalized functions and show that the Fourier transforms of regular functions are local functions and, vice versa, the Fourier transforms of local functions are regular functions.
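In its simplest normalization, Poisson’s summation formula reads, for suitable $f$ (e.g., Schwartz functions),

\[
\sum_{k\in\mathbb{Z}} f(k)\;=\;\sum_{n\in\mathbb{Z}} \hat f(n), \qquad \hat f(\xi)=\int_{\mathbb{R}} f(t)\,e^{-2\pi i \xi t}\,dt,
\]

i.e., sampling $f$ on the integers corresponds, under the Fourier transform, to periodizing $\hat f$ with period one; this is the discretization-periodization duality referred to above.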
ARTICLE | doi:10.20944/preprints202208.0216.v1
Subject: Earth Sciences, Geoinformatics Keywords: block cokriging; clay composition; granulometry; multi-collocated cokriging; multi-collocated factorial cokriging; regularization; SIDSAM; VIS-NIR-SWIR spectroscopy
Online: 11 August 2022 (11:30:23 CEST)
Traditional soil characterization methods are time-consuming, laborious, and invasive, and do not allow long-term repeatability of measurements. The overall aim of this paper was to assess and model the spatial variability of the soil in an olive grove in southern Italy, using data from two sensors of different types, a multi-spectral on-board drone radiometer and a hyperspectral visible-near infrared-shortwave infrared (VIS-NIR-SWIR) reflectance radiometer, as well as sample data, to arrive at a delineation of homogeneous areas. The hyperspectral data were processed using the continuum removal methodology to obtain information about the content and composition of clay. The multispectral data, in contrast, were first upscaled to the support of the soil data using geostatistics and taking into account the change of support. Secondly, the two-sensor data were integrated with soil granulometric properties by using the multivariate geostatistical techniques of multi-collocated cokriging and factor cokriging, in order to achieve a more exhaustive and finer-scale soil characterisation. The paper shows the impact of the change of support on the uncertainty of soil prediction, which can have a significant effect on decision making in Precision Agriculture. Moreover, four regionalised factors at two different scales (two per scale) were retained and mapped. Each factor provided a different delineation of the field, with areas characterised by different granulometry and clay composition. The applied method is sufficiently flexible and could be applied to any number and type of sensors.
ARTICLE | doi:10.20944/preprints202111.0092.v1
Subject: Mathematics & Computer Science, Computational Mathematics Keywords: Inverse problems; Regularization; Bayesian inference; Machine Learning; Artificial Intelligence; Gauss-Markov-Potts; Variational Bayesian Approach (VBA); Physics Informed ML
Online: 3 November 2021 (20:18:51 CET)
Classical methods for inverse problems are mainly based on regularization theory, in particular those based on the optimization of a criterion with two parts: a data-model matching term and a regularization term. Different choices for these two terms and a great number of optimization algorithms have been proposed. When these two terms are distance or divergence measures, the criterion has a Bayesian maximum a posteriori (MAP) interpretation, in which the two terms correspond, respectively, to the likelihood and the prior probability models.
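Concretely, with an assumed forward model $\mathbf{g}=\mathbf{H}\mathbf{f}+\boldsymbol{\epsilon}$ (notation introduced here for illustration), the regularized criterion and its MAP reading are

\[
\hat{\mathbf{f}}=\arg\min_{\mathbf{f}}\;\Bigl[\Delta_1\bigl(\mathbf{g},\mathbf{H}\mathbf{f}\bigr)+\lambda\,\Delta_2\bigl(\mathbf{f}\bigr)\Bigr]
\;=\;\arg\max_{\mathbf{f}}\;p(\mathbf{g}\mid\mathbf{f})\,p(\mathbf{f}),
\]

where the data-model matching term corresponds to the likelihood, $p(\mathbf{g}\mid\mathbf{f})\propto\exp\{-\Delta_1(\mathbf{g},\mathbf{H}\mathbf{f})\}$, and the regularization term to the prior, $p(\mathbf{f})\propto\exp\{-\lambda\,\Delta_2(\mathbf{f})\}$.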