ARTICLE | doi:10.20944/preprints202004.0426.v1
Subject: Mathematics & Computer Science, Probability And Statistics Keywords: Bandit Algorithm; Upper Confidence Bounds; Kullback-Leibler divergence
Online: 24 April 2020 (04:24:31 CEST)
Upper confidence bound (UCB) multi-armed bandit algorithms typically rely on concentration inequalities (such as Hoeffding’s inequality) to construct the upper confidence bound. Intuitively, the tighter the bound is, the more likely it is that the respective arm is judged appropriately for selection or rejection. Hence we derive and utilise an optimal inequality. Usually the sample mean (and sometimes the sample variance) of previous rewards is the information used in the bounds which drive the algorithm, but intuitively the more information that is taken from the previous rewards, the tighter the bound could be. Hence our inequality explicitly incorporates the value of each and every past reward into the upper bound expression which drives the method. We show how this UCB method fits into the broader scope of other information theoretic UCB algorithms, but unlike them is free from assumptions about the distribution of the data. We conclude by reporting some already established regret information, and give some numerical simulations to demonstrate the method’s effectiveness.
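For orientation, the classical Hoeffding-based baseline that the abstract contrasts itself with can be sketched as follows. This is the standard UCB1 index (sample mean plus sqrt(2 ln t / n)), not the optimal inequality derived in the paper; the function names `ucb1_index` and `run_ucb1`, the Bernoulli reward model, and the arm probabilities are illustrative assumptions.

```python
import math
import random


def ucb1_index(mean, pulls, t):
    """Classical UCB1 index: sample mean plus a Hoeffding-style bonus."""
    return mean + math.sqrt(2.0 * math.log(t) / pulls)


def run_ucb1(arm_probs, horizon, seed=0):
    """Play Bernoulli arms (assumed reward model) with UCB1.

    Returns the number of times each arm was pulled.
    """
    rng = random.Random(seed)
    k = len(arm_probs)
    pulls = [0] * k
    totals = [0.0] * k
    for t in range(1, horizon + 1):
        if t <= k:
            # Pull every arm once so each sample mean is defined.
            arm = t - 1
        else:
            # Pick the arm with the largest upper confidence bound.
            arm = max(
                range(k),
                key=lambda i: ucb1_index(totals[i] / pulls[i], pulls[i], t),
            )
        reward = 1.0 if rng.random() < arm_probs[arm] else 0.0
        pulls[arm] += 1
        totals[arm] += reward
    return pulls
```

Note that this index uses only the sample mean and the pull count of each arm, which is exactly the limitation the abstract points to: the individual past rewards play no role in the bound.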
ARTICLE | doi:10.3390/sci2040078
Online: 22 October 2020 (00:00:00 CEST)
This paper presents a quantitative approach to poetry, based on the use of several statistical measures (entropy, informational energy, N-gram, etc.) applied to a few characteristic English writings. We found that the English language changes its entropy as time passes, and that entropy depends on the language used and on the author. In order to compare two similar texts, we were able to introduce a statistical method to assess the information entropy between two texts. We also introduced a method of computing the average information conveyed by a group of letters about the next letter in the text. We found a formula for computing the Shannon language entropy and we introduced the concept of N-gram informational energy of a poem. We also constructed a neural network, which is able to generate Byron-type poetry and to analyze the information proximity to the genuine Byron poetry.
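A minimal sketch of the kind of measures named above, assuming plain letter N-gram frequencies: Shannon entropy in bits and informational energy taken as the sum of squared probabilities (Onicescu's definition). The helper names and the preprocessing (lower-casing, dropping non-letters, letting N-grams run across word boundaries) are assumptions for illustration and may differ from the paper's own procedure.

```python
import math
from collections import Counter


def ngram_distribution(text, n=1):
    """Relative frequencies of letter n-grams (non-letters are dropped)."""
    letters = [c for c in text.lower() if c.isalpha()]
    grams = ["".join(letters[i:i + n]) for i in range(len(letters) - n + 1)]
    counts = Counter(grams)
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()}


def shannon_entropy(dist):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def informational_energy(dist):
    """Informational energy: sum of squared probabilities."""
    return sum(p * p for p in dist.values())
```

For instance, `shannon_entropy(ngram_distribution(poem, n=2))` would give the bigram entropy of a string `poem`; a uniform distribution maximises the entropy and minimises the informational energy.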
Subject: Physical Sciences, Acoustics Keywords: Kullback–Leibler divergence; granular gases; kinetic theory; molecular dynamics
Online: 8 October 2020 (10:43:55 CEST)
Finding the proper entropy functional associated with the inelastic Boltzmann equation for a granular gas is a yet unsolved challenge. The original H-theorem hypotheses do not fit here and the H-functional presents some additional measure problems that are solved by the Kullback–Leibler divergence (KLD) of a reference velocity distribution function from the actual distribution. The right choice of the reference distribution in the KLD is crucial for the latter to qualify or not as a Lyapunov functional, the “homogeneous cooling state” (HCS) distribution of the freely cooling system being a potential candidate. Due to the lack of a formal proof, the aim of this work is to support this conjecture aided by molecular dynamics simulations of inelastic hard disks and spheres in a wide range of values for the coefficient of restitution (α). Our results reject the Maxwellian distribution as a possible reference, whereas they reinforce the HCS one. Moreover, the KLD is used to measure the amount of information lost on using the former rather than the latter, and reveals a nonmonotonic dependence on α. Additionally, a Maxwell-demon-like velocity-inversion experiment highlights the microscopic irreversibility of the granular gas dynamics.
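As a reminder of the quantity involved, the KLD of a reference distribution q from an actual distribution p can be sketched for binned (histogram) data as below. The discretisation into bins and the function name `kl_divergence` are assumptions for illustration; the paper works with continuous velocity distribution functions obtained from simulations.

```python
import math


def kl_divergence(p, q):
    """D_KL(p || q) in nats for two discrete distributions on the same bins.

    Bins with p_i == 0 contribute nothing; a bin with p_i > 0 and q_i == 0
    makes the divergence infinite.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of bins")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue
        if qi == 0.0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total
```

The divergence is zero when the two distributions coincide and is not symmetric in its arguments, which is why the choice of reference distribution matters.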
ARTICLE | doi:10.20944/preprints202208.0234.v1
Subject: Mathematics & Computer Science, Probability And Statistics Keywords: Variational Bayesian Approach (VBA); Kullback–Leibler Divergence; Mean Field Approximation (MFA); Optimization Algorithm
Online: 12 August 2022 (10:26:02 CEST)
In many Bayesian computations, we first obtain the expression of the joint distribution of all the unknown variables given the observed data. In general, this expression is not separable in those variables. Thus, obtaining the marginals for each variable and computing the expectations are difficult and costly. This problem becomes even more difficult in high-dimensional settings, which is an important issue in inverse problems. We may then try to propose a surrogate expression with which we can do approximate computations. Often a separable approximation can be useful enough. The Variational Bayesian Approximation (VBA) is a technique that approximates the joint distribution $p$ with an easier, for example separable, one $q$ by minimizing the Kullback–Leibler divergence $KL(q|p)$. When $q$ is separable in all the variables, the approximation is also called Mean Field Approximation (MFA) and so $q$ is the product of the approximated marginals. A first standard and general algorithm is alternate optimization of $KL(q|p)$ with respect to $q_i$. A second general approach is its optimization in the Riemannian manifold. However, in this paper, for practical reasons, we consider the case where $p$ is in the exponential family and so is $q$. For this case, $KL(q|p)$ becomes a function of the parameters $\boldsymbol{\theta}$ of the exponential family. Then, we can use any other optimization algorithm to obtain those parameters. In this paper, we compare three optimization algorithms: standard alternate optimization, a gradient-based algorithm and a natural gradient algorithm and study their relative performances on three examples.
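To illustrate the remark that $KL(q|p)$ becomes an ordinary function of the parameters once both distributions are in the exponential family, here is a toy sketch with univariate Gaussians, where the divergence has a closed form and can be minimised by plain gradient descent. This is not one of the paper's three examples: the one-dimensional setting, the parametrisation by mean and log standard deviation, the step size, and the function names are all assumptions. In this toy case $q$ can match $p$ exactly, so it shows only the mechanics of the gradient-based approach, not the mean-field approximation error.

```python
import math


def kl_gauss(m_q, s_q, m_p, s_p):
    """Closed-form KL(q | p) for q = N(m_q, s_q^2) and p = N(m_p, s_p^2)."""
    return (
        math.log(s_p / s_q)
        + (s_q ** 2 + (m_q - m_p) ** 2) / (2.0 * s_p ** 2)
        - 0.5
    )


def fit_by_gradient_descent(m_p, s_p, steps=2000, lr=0.05):
    """Minimise KL(q | p) over the mean and log-std of q by gradient descent."""
    m, log_s = 0.0, 0.0  # start from q = N(0, 1)
    for _ in range(steps):
        s = math.exp(log_s)
        grad_m = (m - m_p) / s_p ** 2          # d KL / d m
        grad_log_s = s ** 2 / s_p ** 2 - 1.0   # d KL / d log(s)
        m -= lr * grad_m
        log_s -= lr * grad_log_s
    return m, math.exp(log_s)
```

Optimising over log(s) rather than s keeps the standard deviation positive without any explicit constraint; a natural gradient variant would additionally rescale these gradients by the inverse Fisher information of $q$.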
ARTICLE | doi:10.20944/preprints202210.0169.v1
Subject: Mathematics & Computer Science, Probability And Statistics Keywords: low-rank matrices; errors-in-variables models; lower bounds; Kullback-Leibler divergence; information-theoretic limitations
Online: 12 October 2022 (09:57:25 CEST)
Noisy data is always encountered in real applications, such as bioinformatics, neuroimaging and remote sensing. Existing methods mainly consider linear or generalized linear errors-in-variables regression, while relatively little attention is paid to the multivariate response case, and how to evaluate the estimation performance under perturbed covariates is still an open question. In this paper, we consider the information-theoretic limitations of estimating a low-rank matrix in the multi-response errors-in-variables regression model. By applying information theory and statistical techniques based on concentration inequalities, the minimax lower bound is provided in terms of the squared Frobenius loss, which recovers the rate provided under the clean covariate assumption in the previous literature. Hence our result further indicates that, even under the more realistic errors-in-variables situation, no more samples are required to achieve a rate-optimal estimation.
Subject: Engineering, Automotive Engineering Keywords: Modal expansion; Information theory; Kullback-Leibler divergence; Utility theory; virtual sensing; response reconstruction; Structural Dynamics
Online: 15 April 2021 (09:38:20 CEST)
A framework for optimal sensor placement (OSP) for virtual sensing using the modal expansion technique and taking into account uncertainties is presented based on information and utility theory. The OSP maximizes a utility function that quantifies the expected information gained from the data for reducing the uncertainty of quantities of interest (QoI) predicted at the virtual sensing locations. The utility function is extended to make the OSP design robust to uncertainties in structural model and modelling error parameters, resulting in a multidimensional integral of the expected information gain over all possible values of the uncertain parameters and weighted by their assigned probability distributions. Approximate methods are used to compute the multidimensional integral and solve the optimization problem that arises. The Gaussian nature of the response QoI is exploited to derive useful and informative analytical expressions for the utility function. A thorough study of the effect of model, prediction and measurement errors and their uncertainties, as well as the prior uncertainties in the modal coordinates on the selection of the optimal sensor configuration is presented, highlighting the importance of accounting for robustness to errors and other uncertainties.
ARTICLE | doi:10.20944/preprints201812.0209.v2
Subject: Mathematics & Computer Science, Probability And Statistics Keywords: neural population coding; mutual information; Kullback-Leibler divergence; Rényi divergence; Chernoff divergence; approximation; discrete variables
Online: 7 March 2019 (07:36:24 CET)
Although Shannon mutual information has been widely used, its effective calculation is often difficult for many practical problems, including those in neural population coding. Asymptotic formulas based on Fisher information sometimes provide accurate approximations to the mutual information but this approach is restricted to continuous variables because the calculation of Fisher information requires derivatives with respect to the encoded variables. In this paper, we consider information-theoretic bounds and approximations of the mutual information based on Kullback–Leibler divergence and Rényi divergence. We propose several information metrics to approximate Shannon mutual information in the context of neural population coding. While our asymptotic formulas all work for discrete variables, one of them has consistent performance and high accuracy regardless of whether the encoded variables are discrete or continuous. We performed numerical simulations and confirmed that our approximation formulas were highly accurate for approximating the mutual information between the stimuli and the responses of a large neural population. These approximation formulas may potentially bring convenience to the applications of information theory to many practical and theoretical problems.
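For small discrete problems the exact quantity being approximated can be computed directly, since the mutual information is the Kullback–Leibler divergence between the joint distribution and the product of its marginals. The sketch below assumes the joint distribution is given as a nested list `joint[x][y]`; the function name is illustrative, and this brute-force sum is the baseline that becomes impractical for large neural populations, not one of the paper's approximation formulas.

```python
import math


def mutual_information(joint):
    """Exact I(X; Y) in bits from a joint probability table joint[x][y]."""
    p_x = [sum(row) for row in joint]
    p_y = [sum(col) for col in zip(*joint)]
    total = 0.0
    for i, row in enumerate(joint):
        for j, p_xy in enumerate(row):
            if p_xy > 0.0:
                # KL term between the joint and the product of marginals.
                total += p_xy * math.log2(p_xy / (p_x[i] * p_y[j]))
    return total
```

Independent variables give zero, while two perfectly correlated binary variables give one bit.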
Subject: Keywords: Textual data distributions; supervised learning; unsupervised learning; Kullback-Leibler divergence; sentiment; textual analytics; text generation; vaccine; stock market
Online: 17 June 2021 (10:03:41 CEST)
Efficient textual data distributions (TDD) alignment and generation are open research problems in textual analytics and NLP. It is presently difficult to parsimoniously and methodologically confirm that two or more natural language datasets belong to similar distributions, and to identify the extent to which textual data possess alignment. This study focuses on addressing a segment of the broader problem described above by applying multiple supervised and unsupervised machine learning (ML) methods to explore the behavior of TDD by (i) topical alignment and (ii) sentiment alignment. Furthermore, we use multiple text generation methods, including fine-tuned GPT-2, to generate text by topic and by sentiment. Finally, we develop a unique process-driven variation of Kullback-Leibler divergence (KLD) application to TDD, named KL Textual Distributions Contrasts (KL-TDC), to identify the alignment of machine generated textual corpora with naturally occurring textual corpora. This study thus identifies a unique approach for generating and validating TDD by topic and sentiment, which can be used to help address sparse data problems and other research, practice and classroom situations in need of artificially generated topic or sentiment aligned textual data.
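A generic way to contrast two corpora with KLD, sketched here under stated assumptions, is to compare their smoothed word-frequency distributions over a shared vocabulary. This is not the KL-TDC process itself, whose details are not given in the abstract; whitespace tokenisation, add-alpha smoothing, and the function names are all illustrative choices.

```python
import math
from collections import Counter


def smoothed_word_distribution(text, vocabulary, alpha=1.0):
    """Add-alpha smoothed word frequencies over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    total = sum(counts[w] for w in vocabulary) + alpha * len(vocabulary)
    return {w: (counts[w] + alpha) / total for w in vocabulary}


def corpus_kl(text_p, text_q, alpha=1.0):
    """D_KL(P || Q) in nats between the word distributions of two texts."""
    vocabulary = sorted(set(text_p.lower().split()) | set(text_q.lower().split()))
    p = smoothed_word_distribution(text_p, vocabulary, alpha)
    q = smoothed_word_distribution(text_q, vocabulary, alpha)
    return sum(p[w] * math.log(p[w] / q[w]) for w in vocabulary)
```

Smoothing keeps every probability strictly positive, so the divergence stays finite even when one corpus contains words the other lacks; a lower value suggests closer alignment between, say, a generated corpus and a natural one.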
ARTICLE | doi:10.20944/preprints201712.0074.v1
Subject: Engineering, Electrical & Electronic Engineering Keywords: radar; transmit signal waveform design; doubly spread; extended target; fluctuation; Kullback-Leibler divergence; locally most powerful detector; colored noise
Online: 12 December 2017 (08:48:57 CET)
Radar transmit signal design is a critical factor for radar performance. In this paper, we investigate the problem of radar signal waveform design under small signal power conditions for detecting a doubly spread target, whose impulse response can be modeled as a random process, in a colored noise environment. The doubly spread target spans multiple range bins (range-spread) and its impulse response is time-varying due to fluctuation (hence also Doppler-spread), such that the target impulse response is both time-selective and frequency-selective. Instead of adopting the conventional assumption that the target is wide-sense stationary uncorrelated scattering, we assume that the target impulse response is wide-sense stationary in both range and time to account for the possible correlation between the impulse responses corresponding to close range intervals. The locally most powerful detector, which is asymptotically optimal for small signal cases, is then derived for detecting such targets. The signal waveform is optimized to maximize the detection performance of the detector or, equivalently, to maximize the Kullback-Leibler divergence. Numerical simulations validate the effectiveness of the proposed waveform design under small signal power conditions, and the performance of the optimum waveform design is shown in comparison to the frequency modulated waveform.
Subject: Mathematics & Computer Science, Algebra & Number Theory Keywords: Large deviation principle; Sub-critical SINR random network model; Poisson point process; Empirical power measure; Empirical connectivity measure; Relative entropy; Kullback action
Online: 13 April 2021 (09:17:54 CEST)
The article obtains large deviation asymptotics for sub-critical communication networks modelled as signal-interference-noise-ratio (SINR) random networks. To achieve this, we define the empirical power measure and the empirical connectivity measure, and prove joint large deviation principles (LDPs) for the two empirical measures on two different scales. Using the joint LDPs, we prove an asymptotic equipartition property (AEP) for wireless telecommunication networks modelled as sub-critical SINR random networks. Further, we prove a local large deviation principle (LLDP) for the sub-critical SINR random network. From the LLDPs, we prove the large deviation principle and a classical McMillan theorem for the stochastic SINR model processes. Note that the LDPs for the empirical measures of this stochastic SINR random network model were derived on spaces of measures equipped with the $\tau$-topology, and the LLDPs were deduced in the space of SINR model processes without any topological limitations. We motivate the study by describing a possible anomaly detection test for SINR random networks.