Preprint Article, Version 1. Preserved in Portico. This version is not peer-reviewed.

# Variational Bayesian Approximation (VBA): A comparison between three optimization algorithms

Version 1: Received: 11 August 2022 / Approved: 12 August 2022 / Online: 12 August 2022 (10:26:02 CEST)

How to cite: Fallah Mortezanejad, S. A.; Mohammad-Djafari, A. Variational Bayesian Approximation (VBA): A comparison between three optimization algorithms. Preprints 2022, 2022080234. https://doi.org/10.20944/preprints202208.0234.v1

## Abstract

In many Bayesian computations, we first obtain the expression of the joint distribution of all the unknown variables given the observed data. In general, this expression is not separable in those variables. Thus, obtaining the marginal of each variable and computing the expectations is difficult and costly. This problem becomes even more difficult in high-dimensional settings, which is an important issue in inverse problems. We may then try to propose a surrogate expression with which we can do approximate computations. Often a separable approximation is useful enough. The Variational Bayesian Approximation (VBA) is a technique that approximates the joint distribution $p$ with an easier one $q$, for example a separable one, by minimizing the Kullback–Leibler divergence $KL(q|p)$. When $q$ is separable in all the variables, the approximation is also called the Mean Field Approximation (MFA), and $q$ is then the product of the approximated marginals. A first standard and general algorithm is the alternate optimization of $KL(q|p)$ with respect to each $q_i$. A second general approach is its optimization on a Riemannian manifold. In this paper, however, for practical reasons, we consider the case where $p$ is in the exponential family and so is $q$. In this case, $KL(q|p)$ becomes a function of the parameters $\boldsymbol{\theta}$ of the exponential family, and any other optimization algorithm can be used to obtain those parameters. We compare three optimization algorithms (standard alternate optimization, a gradient-based algorithm, and a natural gradient algorithm) and study their relative performance on three examples.
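To make the comparison concrete, the following is a minimal sketch, not the authors' implementation, of the three kinds of algorithm on a toy problem where everything is available in closed form. It assumes a bivariate Gaussian target $p$ with correlation 0.8 and a separable Gaussian $q = q_1 q_2$, so that $KL(q|p)$ is an explicit function of the means and variances of $q$. The target values, step sizes, iteration counts, the log-variance parametrization, and all function names are illustrative assumptions that do not come from the paper.

```python
import math

# Target p = N(MU, Sigma) in 2-D, described by its mean and precision matrix.
# These numbers are illustrative assumptions, not taken from the paper.
MU = [1.0, 2.0]
RHO = 0.8  # correlation in Sigma = [[1, RHO], [RHO, 1]]
_det = 1.0 - RHO**2
LAM = [[1.0 / _det, -RHO / _det], [-RHO / _det, 1.0 / _det]]  # Sigma^{-1}


def kl(m, v):
    """KL(q|p) for q = N(m1, v1) N(m2, v2) and p = N(MU, LAM^{-1})."""
    d = [m[0] - MU[0], m[1] - MU[1]]
    quad = sum(d[i] * LAM[i][j] * d[j] for i in range(2) for j in range(2))
    trace = LAM[0][0] * v[0] + LAM[1][1] * v[1]
    logdet_lam = math.log(LAM[0][0] * LAM[1][1] - LAM[0][1] * LAM[1][0])
    return 0.5 * (trace + quad - 2.0 - math.log(v[0]) - math.log(v[1]) - logdet_lam)


def grad(m, v):
    """Gradient of KL with respect to the means and the log-variances."""
    d = [m[0] - MU[0], m[1] - MU[1]]
    g_m = [LAM[i][0] * d[0] + LAM[i][1] * d[1] for i in range(2)]
    g_r = [0.5 * (LAM[i][i] * v[i] - 1.0) for i in range(2)]
    return g_m, g_r


def alternate(n_iter=100):
    m, v = [0.0, 0.0], [1.0, 1.0]
    for _ in range(n_iter):
        for i in range(2):
            j = 1 - i
            # Closed-form minimizer of KL over q_i with q_j held fixed.
            v[i] = 1.0 / LAM[i][i]
            m[i] = MU[i] - LAM[i][j] * (m[j] - MU[j]) / LAM[i][i]
    return m, v


def gradient_descent(n_iter=2000, lr=0.1):
    m, r = [0.0, 0.0], [0.0, 0.0]  # r holds the log-variances
    for _ in range(n_iter):
        g_m, g_r = grad(m, [math.exp(x) for x in r])
        m = [m[i] - lr * g_m[i] for i in range(2)]
        r = [r[i] - lr * g_r[i] for i in range(2)]
    return m, [math.exp(x) for x in r]


def natural_gradient(n_iter=500, lr=0.3):
    m, r = [0.0, 0.0], [0.0, 0.0]
    for _ in range(n_iter):
        v = [math.exp(x) for x in r]
        g_m, g_r = grad(m, v)
        # Inverse Fisher information of q in (m_i, log v_i) is diag(v_i, 2).
        m = [m[i] - lr * v[i] * g_m[i] for i in range(2)]
        r = [r[i] - lr * 2.0 * g_r[i] for i in range(2)]
    return m, [math.exp(x) for x in r]


results = {
    "alternate": alternate(),
    "gradient": gradient_descent(),
    "natural gradient": natural_gradient(),
}
for name, (m, v) in results.items():
    print(name, m, v, kl(m, v))
```

On this toy target all three routines should reach the same minimizer: the means of $q$ match the mean of $p$, and each variance equals the inverse of the corresponding diagonal entry of the precision matrix, which is smaller than the true marginal variance. The sketch only illustrates how the three update rules differ; it says nothing about their relative performance on the paper's examples.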

## Keywords

Variational Bayesian Approach (VBA); Kullback–Leibler Divergence; Mean Field Approximation (MFA); Optimization Algorithm

## Subject

Computer Science and Mathematics, Probability and Statistics