Preprint
Short Note

This version is not peer-reviewed.

A Short Note on Gaussian Distribution with Non-Constant Correlation

Submitted:

07 April 2025

Posted:

07 April 2025


Abstract

This article studies the terminal distribution of multivariate Brownian motion when the correlations are not constant. In particular, under the assumption that the correlation function is driven by a single factor, we develop PDEs that quantify the moments of the conditional distribution of the other factors. Using a normal distribution and moment matching, we find a good approximation to the true Fokker-Planck solution; the method offers good analytic tractability and fast performance thanks to the low dimension of the PDEs to be solved. The method can be applied to model the correlation skew effect in quantitative finance, and to other cases where a non-constant correlation is desired in modelling a multivariate distribution.


1. Background

The Gaussian copula is widely used in quantitative finance modelling. The Gaussian distribution is closely related to an underlying Brownian motion: the standard multivariate normal distribution is the terminal distribution of a multivariate Brownian motion whose correlations are constant over time. The constant correlation is a limitation of this model, however, and may not fit the actual market. On the other hand, if the correlations are not constant, the resulting terminal distribution has no closed-form representation in general. Without a closed-form solution or analytic tractability, the model becomes less attractive for practical use. There is research in alternative directions that bypasses this tractability issue; for example, in (Lucic 2012; Luján 2022) the respective authors construct different terminal distributions that admit shapes with the desired correlation skew effect. In this paper we still focus on the terminal distribution arising from the Brownian motion itself. We study the PDE for the density function and show that, with some assumptions on the correlation function, we can derive PDEs that describe the moments of the marginal and conditional distributions. In particular, these PDEs are of lower dimension, so the calculations are fast and practical. With these moments we can build moment-matching approximations, with good analytic tractability, to the true terminal distribution. The resulting analytic distribution is a useful variation of the standard multivariate normal distribution and can be used, for example, to model the correlation skew effect in quantitative finance.

2. Methodology

2.1. Model Setup

We study the following problem. It is stated in two dimensions, but we show later that similar techniques apply in higher dimensions.
x(0) = y(0) = 0
dx = dw_1
dy = \rho(x, y, t)\, dw_1 + \sqrt{1 - \rho^2(x, y, t)}\, dw_2
\langle dw_1, dw_2 \rangle = 0
The Fokker-Planck equation (Fokker 1914; Kolmogorov 1931; Planck 1917) describes the joint probability density function p(x, y, t) by:
\frac{\partial p}{\partial t} = \frac{1}{2} \left( \frac{\partial^2 p}{\partial x^2} + 2\, \frac{\partial^2 (\rho p)}{\partial x\, \partial y} + \frac{\partial^2 p}{\partial y^2} \right)
This is a 2d PDE in the convention of the quant finance industry (2d refers to the two space variables (x, y); counting t it is in fact a 3-d PDE, but given the common presence of t in this type of PDE we count only the space dimensions), and general numerical methods for it are slow. However, we can reduce the complexity of the 2d PDE if we make a reasonable assumption on the correlation function, as below:
\rho(x, y, t) = \rho(x + y, t)
This means the correlation depends on (x, y) only through the sum (x + y), which can be interpreted as: the correlation depends on a market factor that is the average of the underlyers. With this extra assumption, we can simplify the problem as follows.
Let us make the change of variables
u = \frac{1}{2}(x + y)
v = \frac{1}{2}(x - y)
Then we have
\langle du, dv \rangle = \frac{1}{4} \left( \langle dx, dx \rangle - \langle dy, dy \rangle \right) = 0
And d u , d v can be written as
du = \sqrt{\frac{1 + \rho(u, t)}{2}}\, dw_3
dv = \sqrt{\frac{1 - \rho(u, t)}{2}}\, dw_4
Note that the first diffusion involves only u, so the Fokker-Planck equation for the probability density of u is a 1d PDE:
\frac{\partial p(u, t)}{\partial t} = \frac{1}{2} \frac{\partial^2}{\partial u^2} \left( \frac{1 + \rho(u, t)}{2}\, p(u, t) \right)
So we can solve for p(u, t) first, and then turn to v(t). For any given path u(s), 0 \le s \le t, v(t) is simply a sum of infinitesimal normal variables with variances \frac{1 - \rho(u(s), s)}{2}\, ds, so the distribution of v(t) conditioned on this path u(s), 0 \le s \le t, is a normal distribution with mean 0 and variance
\int_0^t \frac{1 - \rho(u(s), s)}{2}\, ds
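This path-wise variance formula is straightforward to check by simulation. Below is a minimal Euler–Maruyama sketch in Python/NumPy; the particular ρ(u, t) is an illustrative assumption (any correlation function with values in (−1, 1) would do). By the tower property over u-paths, the unconditional variance of v(T) should equal the mean of the path-wise integral:

```python
import numpy as np

rng = np.random.default_rng(0)

def rho(u, t):
    # illustrative state-dependent correlation, values kept inside (-1, 1);
    # this specific functional form is an assumption for the demo
    return 0.2 + 0.5 * np.tanh(-u)

T, n_steps, n_paths = 1.0, 500, 200_000
dt = T / n_steps
u = np.zeros(n_paths)          # u(0) = 0
v = np.zeros(n_paths)          # v(0) = 0
integral = np.zeros(n_paths)   # path-wise integral of (1 - rho)/2 ds

for k in range(n_steps):
    r = rho(u, k * dt)
    # dv = sqrt((1 - rho)/2) dw4, independent of dw3
    v += np.sqrt((1.0 - r) / 2.0 * dt) * rng.standard_normal(n_paths)
    integral += (1.0 - r) / 2.0 * dt
    # du = sqrt((1 + rho)/2) dw3
    u += np.sqrt((1.0 + r) / 2.0 * dt) * rng.standard_normal(n_paths)

# the two numbers agree up to Monte Carlo noise
print(v.var(), integral.mean())
```

Because dw_3 and dw_4 are independent, the increments of v are exactly conditionally normal even in the discrete scheme, so the two printed numbers differ only by sampling noise.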

2.1.1. Conditional Distribution and the First Two Moments

Conditioning on a whole path is not easy to use in calculations; it is more useful to condition on the value u(t) instead. Let \{v(t) \mid u(t) = u\} denote the conditional distribution; note that this distribution is not strictly normal in general. A more detailed discussion of this distribution is left for later; for now we focus on the first two moments, i.e., the mean and variance of this conditional distribution.
The mean is clearly zero by symmetry. For the variance, denoted f(u, t), we have
Theorem 1.
Let \{v(t) \mid u(t) = u\} be the conditional distribution of v(t) given u(t) = u. Then its variance f(u, t) satisfies
f(u, t) = E\left[ \int_0^t \frac{1 - \rho(u(s), s)}{2}\, ds \;\Big|\; u(t) = u \right]
Proof. 
The proof is straightforward; we include it for completeness. By the definition of the stochastic integral as a limit of sums, v(t) \mid u(t) = u is the following sum, subject to the constraint u(t) = u:
\sum_i \sqrt{\frac{1 - \rho(u_i, s_i)}{2}}\, \Delta_i w_4
Using the definition of variance, we get the sum
\sum_{i, j} \sqrt{\frac{1 - \rho(u_i, s_i)}{2}} \sqrt{\frac{1 - \rho(u_j, s_j)}{2}}\, \Delta_i w_4\, \Delta_j w_4
which, after taking expectations, has non-zero terms only when i = j:
\sum_i \frac{1 - \rho(u_i, s_i)}{2}\, \Delta_i t
with the constraint u(t) = u.
Then take the limit \Delta_i t \to 0 and the statement is proven. □
Note f(u, t) is a path integral over all paths u(s) that reach u at time t. We have the following:
p(u, t + dt)\, f(u, t + dt) = \int p(x, t) \left[ f(x, t) + dt\, \frac{1 - \rho(x, t)}{2} \right] p(u, t + dt \mid x, t)\, dx
Here p(u, t + dt \mid x, t) is the transition probability from state (x, t) to (u, t + dt).
Now, following the Fokker-Planck derivation technique, we get:
Theorem 2.
\frac{\partial}{\partial t}(p f) = p\, \frac{1 - \rho(u, t)}{2} + \frac{1}{2} \frac{\partial^2}{\partial u^2} \left( p f\, \frac{1 + \rho(u, t)}{2} \right)
Proof. 
\frac{\partial}{\partial t}(p f) = \lim_{dt \to 0} \frac{p(u, t + dt)\, f(u, t + dt) - p(u, t)\, f(u, t)}{dt} = \lim_{dt \to 0} \frac{\int p(x, t) f(x, t)\, p(u, t + dt \mid x, t)\, dx - p(u, t) f(u, t)}{dt} + \lim_{dt \to 0} \int p(x, t)\, \frac{1 - \rho(x, t)}{2}\, p(u, t + dt \mid x, t)\, dx
Note the second term evaluates to
\lim_{dt \to 0} \int p(x, t)\, \frac{1 - \rho(x, t)}{2}\, p(u, t + dt \mid x, t)\, dx = p(u, t)\, \frac{1 - \rho(u, t)}{2}
So we just have to prove
\lim_{dt \to 0} \frac{\int p(x, t) f(x, t)\, p(u, t + dt \mid x, t)\, dx - p(u, t) f(u, t)}{dt} = \frac{1}{2} \frac{\partial^2}{\partial u^2} \left( p f\, \frac{1 + \rho(u, t)}{2} \right)
Let h(u) be a smooth function with compact support, and consider
\int h(u) \int p(x, t) f(x, t)\, p(u, t + dt \mid x, t)\, dx\, du = \int p(x, t) f(x, t) \int h(u)\, p(u, t + dt \mid x, t)\, du\, dx = \int p(x, t) f(x, t) \int \left( h(x) + h'(x)(u - x) + \tfrac{1}{2} h''(x)(u - x)^2 + O((u - x)^3) \right) p(u, t + dt \mid x, t)\, du\, dx
Now the integral \int (u - x)^k\, p(u, t + dt \mid x, t)\, du is the k-th moment of the increment of the diffusion du = \sqrt{(1 + \rho(u, t))/2}\, dw_3, so we have
\int (u - x)\, p(u, t + dt \mid x, t)\, du = 0
\int (u - x)^2\, p(u, t + dt \mid x, t)\, du = \frac{1 + \rho(x, t)}{2}\, dt
\int (u - x)^k\, p(u, t + dt \mid x, t)\, du = o(dt) \quad \text{for } k > 2
Then we have, to first order in dt,
\int h(u) \int p(x, t) f(x, t)\, p(u, t + dt \mid x, t)\, dx\, du = \int p(x, t) f(x, t)\, h(x) \int p(u, t + dt \mid x, t)\, du\, dx + dt\, \frac{1}{2} \int p(x, t) f(x, t)\, h''(x)\, \frac{1 + \rho(x, t)}{2}\, dx = \int p(x, t) f(x, t)\, h(x)\, dx + dt\, \frac{1}{2} \int p(x, t) f(x, t)\, h''(x)\, \frac{1 + \rho(x, t)}{2}\, dx
so
\lim_{dt \to 0} \frac{\int h(u) \int p(x, t) f(x, t)\, p(u, t + dt \mid x, t)\, dx\, du - \int h(u)\, p(u, t) f(u, t)\, du}{dt} = \frac{1}{2} \int p(x, t) f(x, t)\, h''(x)\, \frac{1 + \rho(x, t)}{2}\, dx = \int h(x)\, \frac{1}{2} \frac{\partial^2}{\partial x^2} \left( p(x, t) f(x, t)\, \frac{1 + \rho(x, t)}{2} \right) dx
The last step above is integration by parts (twice). Because h(u) is an arbitrary smooth function, it follows that:
\lim_{dt \to 0} \frac{\int p(x, t) f(x, t)\, p(u, t + dt \mid x, t)\, dx - p(u, t) f(u, t)}{dt} = \frac{1}{2} \frac{\partial^2}{\partial u^2} \left( p f\, \frac{1 + \rho(u, t)}{2} \right)
  □
To recap, we have these two key equations:
\frac{\partial p(u, t)}{\partial t} = \frac{1}{2} \frac{\partial^2}{\partial u^2} \left( \frac{1 + \rho(u, t)}{2}\, p(u, t) \right)
\frac{\partial}{\partial t}(p f) = p\, \frac{1 - \rho(u, t)}{2} + \frac{1}{2} \frac{\partial^2}{\partial u^2} \left( p f\, \frac{1 + \rho(u, t)}{2} \right)
We can solve for p first and then for f (it is also possible to bundle the discretized PDEs for p and f and solve them together). Knowing p(u, t) and f(u, t), we know the marginal distribution of u and the mean (0) and variance of the conditional distribution \{v(t) \mid u(t) = u\}. As mentioned above, the conditional distribution \{v(t) \mid u(t) = u\} is not strictly normal in general. Sampling from it is basically a two-step process: first choose a path for u subject to the terminal condition u(t) = u; this yields the path-wise integral \int_0^t \frac{1 - \rho(u(s), s)}{2}\, ds. Then draw a point from a normal distribution with variance set to that integral. Equivalently, one can first draw a sample from a standard normal distribution, then choose a path for u subject to the terminal condition, compute \int_0^t \frac{1 - \rho(u(s), s)}{2}\, ds, and scale the normal variable by its square root.
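As a sketch of this two-stage solve, the explicit finite-difference scheme below discretizes directly on the products, as the recap equations suggest. The grid, time step, the narrow Gaussian standing in for the initial delta at u = 0, and the zero far-field boundaries are all illustrative assumptions. For constant ρ the exact answer is known (f(u, t) = (1 − ρ)t/2, independent of u), which gives a convenient sanity check:

```python
import numpy as np

def solve_p_and_f(rho, u_grid, T, n_steps):
    """Explicit FD for p_t = 1/2 (a p)_uu and (pf)_t = p (1-rho)/2 + 1/2 (a pf)_uu,
    where a(u, t) = (1 + rho(u, t))/2; derivatives are taken on the products."""
    du = u_grid[1] - u_grid[0]
    dt = T / n_steps

    def lap(q):
        # second difference on interior points; boundaries stay at ~0 (assumption)
        out = np.zeros_like(q)
        out[1:-1] = (q[2:] - 2.0 * q[1:-1] + q[:-2]) / du**2
        return out

    # u(0) = 0: approximate the initial delta by a narrow Gaussian (assumption)
    eps = 1e-2
    p = np.exp(-u_grid**2 / (2.0 * eps)) / np.sqrt(2.0 * np.pi * eps)
    pf = np.zeros_like(p)

    # explicit scheme: keep dt well below du^2 / max(a) for stability
    for k in range(n_steps):
        t = k * dt
        a = (1.0 + rho(u_grid, t)) / 2.0
        p_new = p + 0.5 * dt * lap(a * p)
        pf += dt * p * (1.0 - rho(u_grid, t)) / 2.0 + 0.5 * dt * lap(a * pf)
        p = p_new

    f = np.zeros_like(p)
    mask = p > 1e-12
    f[mask] = pf[mask] / p[mask]
    return p, f

# sanity check with constant rho = 0.4: f should be (1 - 0.4)/2 * T = 0.3
u_grid = np.linspace(-6.0, 6.0, 601)
p, f = solve_p_and_f(lambda u, t: 0.4 + 0.0 * u, u_grid, T=1.0, n_steps=20_000)
mass = p.sum() * (u_grid[1] - u_grid[0])
print(mass, f[300])  # ~1.0 and ~0.3 (index 300 is u = 0)
```

An implicit or Crank–Nicolson discretization would allow much larger time steps; the explicit scheme is used here only for brevity.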
One might think the second view preserves the normality of the overall sampling result, but it does not. A heuristic argument: the normal sampling tends to concentrate near the center, and the scaling also has a centralizing tendency, so the combination is not the same as what a constant scaling would produce. This is only a heuristic, of course, but next we develop equations for higher moments, and one can then see the distribution is not strictly normal because the relation between its 4th and 2nd moments differs from that of a normal distribution.

2.1.2. Higher Order Moments

We can derive equations for the higher-order moments by the same technique. To demonstrate, we look at the 4th order. Let g(u, t) be the 4th-order moment of the conditional distribution \{v(t) \mid u(t) = u\}.
Theorem 3. 
\frac{\partial}{\partial t}(p g) = 3 (1 - \rho(u, t))\, p f + \frac{1}{2} \frac{\partial^2}{\partial u^2} \left( p g\, \frac{1 + \rho(u, t)}{2} \right)
Proof. 
The 4th-order moment is the limit of the sum below, with the constraint u(t) = u:
\sum_{i, j, k, l} \prod_{m \in \{i, j, k, l\}} \sqrt{\frac{1 - \rho(u_m, s_m)}{2}}\, \Delta_m w_4
After taking expectations and removing zero terms, this becomes the following (the terms with all four indices equal approach 0 as the grid is refined, so we ignore them):
6 \sum_{i < j} \frac{1 - \rho(u_i, s_i)}{2}\, \frac{1 - \rho(u_j, s_j)}{2}\, \Delta_i t\, \Delta_j t
Taking the limit, in integral representation this is
g(u, t) = E\left[ 6 \int_0^t \frac{1 - \rho(u(s), s)}{2} \int_s^t \frac{1 - \rho(u(x), x)}{2}\, dx\, ds \;\Big|\; u(t) = u \right]
Then, following the Fokker-Planck derivation, we have:
p(u, t + dt)\, g(u, t + dt) = \int p(x, t) \left[ g(x, t) + 6\, dt\, f(x, t)\, \frac{1 - \rho(x, t)}{2} \right] p(u, t + dt \mid x, t)\, dx
Here p(u, t + dt \mid x, t) is the transition probability from state (x, t) to (u, t + dt). Following the previous derivation, we get
\frac{\partial}{\partial t}(p g) = 3 (1 - \rho(u, t))\, p f + \frac{1}{2} \frac{\partial^2}{\partial u^2} \left( p g\, \frac{1 + \rho(u, t)}{2} \right)
  □
Now we can show that the conditional distribution is not normal in general, because a normal distribution's 4th-order moment is 3 times the square of its variance.
Lemma 1. 
If g = 3 f^2, then \frac{\partial f}{\partial u} = 0.
Proof. 
Plug g = 3 f^2 into the equations for pf and pg; the computation is straightforward. □
If \frac{\partial f}{\partial u} = 0, then the distribution becomes the standard multivariate normal distribution.
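The 4th-moment relation is also easy to probe by simulation. Conditional on a u-path, v(t) is normal with variance I = \int_0^t \frac{1 - \rho}{2}\, ds, so E[v^4] = 3\, E[I^2]; this is strictly larger than 3\, (E[I])^2 = 3\, \mathrm{Var}(v(t))^2 whenever I varies across paths, which is exactly the excess kurtosis that rules out strict normality. A sketch with an assumed illustrative ρ:

```python
import numpy as np

rng = np.random.default_rng(1)

def rho(u, t):
    # assumed illustrative correlation for the demo
    return 0.6 * np.tanh(-u)

T, n_steps, n_paths = 1.0, 400, 200_000
dt = T / n_steps
u = np.zeros(n_paths)
v = np.zeros(n_paths)
I = np.zeros(n_paths)  # path-wise integral of (1 - rho)/2 ds

for k in range(n_steps):
    r = rho(u, k * dt)
    c = (1.0 - r) / 2.0
    v += np.sqrt(c * dt) * rng.standard_normal(n_paths)
    I += c * dt
    u += np.sqrt((1.0 + r) / 2.0 * dt) * rng.standard_normal(n_paths)

m4 = (v**4).mean()
print(m4, 3.0 * (I**2).mean())  # these two agree
print(3.0 * I.mean()**2)        # strictly smaller: excess kurtosis
```

The gap between 3 E[I²] and 3 (E[I])² is 3 Var(I), so the sampling mechanism only produces a normal distribution when the integral is the same on every path, i.e. when ρ does not depend on u.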

2.1.3. Normal Approximation to the Conditional

With the above in mind, we still prefer to approximate the true conditional distribution by a normal distribution whose variance matches f(u, t). First, this matches the moments up to 2nd order, which is usually good enough for many practical uses. Second, the normal distribution has very good analytic tractability.
Note that the normal distribution is not an arbitrary choice either: write the true joint density as p(u, t)\, q(u, v, t), where q(u, v, t) is the conditional density of v given u. When we use a normal density form for q, it satisfies the Fokker-Planck equation in most of its terms.
This leads to a question: is there a good analytic form for q that matches the moments to higher order (for example, 4th order)? Obviously such a form would involve f, g, or whichever moments it needs to match. Due to my limited knowledge, this remains an interesting open question to me.

3. Higher Dimensions and General Form

In higher dimensions, a similar technique can be applied if we assume the correlations are driven by one factor. Denote that factor by M; then, by Cholesky decomposition, we can rewrite the dynamics as follows:
dM = A(M, t)\, dW_1
dX_2 = A_{21}(M, t)\, dW_1 + A_{22}(M, t)\, dW_2
dX_3 = A_{31}(M, t)\, dW_1 + A_{32}(M, t)\, dW_2 + A_{33}(M, t)\, dW_3
\cdots
dX_i = \sum_k A_{ik}\, dW_k
Here the dW_i are independent Brownian motions, and all the correlation terms depend only on M and t. We have the following result:
Theorem 4. 
Let p(M, t) be the probability density function of M at time t. Then
\frac{\partial p(M, t)}{\partial t} = \frac{1}{2} \frac{\partial^2}{\partial M^2} \left( A^2 p \right)
Let f_i(M, t) = E[X_i \mid M(t) = M]. Then
\frac{\partial}{\partial t}(p f_i) = -\frac{\partial}{\partial M} \left( p\, A_{i1} A \right) + \frac{1}{2} \frac{\partial^2}{\partial M^2} \left( A^2 p f_i \right)
Let g_{ij}(M, t) = E[X_i X_j \mid M(t) = M]. Then
\frac{\partial}{\partial t}(p g_{ij}) = p \sum_{k \le \min(i, j)} A_{ik} A_{jk} - \frac{\partial}{\partial M} \left( p f_i A_{j1} A + p f_j A_{i1} A \right) + \frac{1}{2} \frac{\partial^2}{\partial M^2} \left( A^2 p g_{ij} \right)
Proof. 
Standard Fokker-Planck derivation using Taylor expansions and integration by parts, similar to the 2-d case above. □
These are 1-d PDEs, and after solving them numerically we use a normal distribution matching these moments to approximate the conditional distribution of (X_2, X_3, \ldots, X_n \mid M).
Note that the equations can be written in the following intrinsic form:
\frac{\partial p(M, t)}{\partial t} = \frac{1}{2} \frac{\partial^2}{\partial M^2} \left( \frac{\langle dM, dM \rangle}{dt}\, p \right)
\frac{\partial}{\partial t}(p E_i) = -\frac{\partial}{\partial M} \left( p\, \frac{\langle dX_i, dM \rangle}{dt} \right) + \frac{1}{2} \frac{\partial^2}{\partial M^2} \left( \frac{\langle dM, dM \rangle}{dt}\, p E_i \right)
\frac{\partial}{\partial t}(p\, \mathrm{COV}_{ij}) = p\, \frac{\langle dX_i, dX_j \rangle}{dt} - \frac{\partial}{\partial M} \left( p E_i\, \frac{\langle dX_j, dM \rangle}{dt} + p E_j\, \frac{\langle dX_i, dM \rangle}{dt} \right) + \frac{1}{2} \frac{\partial^2}{\partial M^2} \left( \frac{\langle dM, dM \rangle}{dt}\, p\, \mathrm{COV}_{ij} \right)
Note the form above assumes all of M, X_i have zero drift. The general form with non-zero drift can be derived as well but is more complicated.

4. Implementation Example

We show an example of the 2-d case: we discretize p and f together, solving for p over a time step first and then for f. We do not use the chain rule to break out the partial derivatives of the products; instead we discretize the products directly. With standard finite-difference methods, the calculation is fast and stable. We present an example of the distribution below:
Figure 1 shows the contour of a Gaussian distribution with correlation skew. The underlying correlation function is:
\rho(u) =
\begin{cases}
0.9 & \text{if } u < -2\sqrt{t} \\
0.9 - \dfrac{u + 2\sqrt{t}}{4\sqrt{t}} \cdot 0.4 & \text{if } -2\sqrt{t} \le u < 2\sqrt{t} \\
0.5 & \text{if } u \ge 2\sqrt{t}
\end{cases}
The graph axes are x and y; note that u, v are the two diagonal directions.
The shape of the contour is as expected. Since we put a higher correlation where u = (x + y)/2 is lower and a lower correlation where u is higher, the probability is more concentrated where u is low and more dispersed where u is high. Note that with u fixed, the graph also shows symmetry in the direction of v.
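For concreteness, the piecewise-linear correlation above can be coded as follows (this is our reading of the formula, with kinks at ±2√t and a linear ramp from 0.9 down to 0.5 between them):

```python
import numpy as np

def rho_skew(u, t):
    """Piecewise-linear correlation of the Figure 1 example: 0.9 on the far left,
    0.5 on the far right, linear in between (the sqrt-t kink locations are our
    reading of the formula)."""
    s = 2.0 * np.sqrt(t)
    mid = 0.9 - (u + s) / (2.0 * s) * 0.4  # note 4*sqrt(t) = 2*s
    return np.where(u < -s, 0.9, np.where(u < s, mid, 0.5))

print(rho_skew(np.array([-5.0, 0.0, 5.0]), 1.0))  # [0.9, 0.7, 0.5]
```

The function is continuous at both kinks, which keeps the finite-difference solve well behaved.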
The following graphs show more detail on p(u) in the above example.
In Figure 2, the distribution of u is very close to, but different from, a standard normal. To see the difference, we reflect the probability around the center; one can then see that the negative side has a fatter tail than the positive side.
This is expected: since we correlate x, y more when x + y is more negative, x + y has more potential to move lower in the negative direction, and since we de-correlate x, y more when x + y is more positive, the diversifying effect gives x + y less potential to move higher on the positive side.
To demonstrate this point, we can increase the skew of the correlation further to magnify the fat-tail effect. Below is p(u) for a more skewed correlation function.
The correlation function in Figure 3 is
\rho(u) =
\begin{cases}
0.9 & \text{if } u < -2\sqrt{t} \\
0.9 - \dfrac{u + 2\sqrt{t}}{4\sqrt{t}} \cdot 1.4 & \text{if } -2\sqrt{t} \le u < 2\sqrt{t} \\
-0.5 & \text{if } u \ge 2\sqrt{t}
\end{cases}
The contour in Figure 1 shows that v is concentrated when u is more negative and spread out when u is more positive. Figure 4 below shows the standard deviation of v conditioned on u, i.e., the f function.

5. Copula Application

Knowing p(u) and f(u), we can integrate any function against this approximated terminal distribution. Given (u, v), it maps to (x, y), and the marginal distributions of x and y are each an approximated normal distribution (note that the true terminal distribution has true normal marginals; since we approximate the terminal distribution, we can regard the marginals as approximated too), so they can readily be used to invert CDFs.
We found the moment matching to 2nd order produces good accuracy in our tests.
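A sketch of how the approximation can be sampled for copula use is below. The p and f inputs here are toy stand-ins for actual PDE output (they correspond to constant ρ = 0 at t = 1, for which the exact answer is known), and the grid-based inverse-CDF interpolation is one simple choice among many:

```python
import numpy as np

rng = np.random.default_rng(2)

def sample_xy(u_grid, p, f, n):
    """Sample (x, y) from the moment-matched approximation:
    u ~ density p(u), v | u ~ N(0, f(u)); then x = u + v, y = u - v."""
    du = u_grid[1] - u_grid[0]
    cdf = np.cumsum(p) * du
    cdf /= cdf[-1]
    u = np.interp(rng.random(n), cdf, u_grid)   # inverse-CDF draw of u
    var = np.interp(u, u_grid, f)               # conditional variance f(u)
    v = rng.normal(0.0, np.sqrt(np.maximum(var, 0.0)))
    return u + v, u - v

# toy inputs standing in for PDE output (assumption): constant rho = 0, t = 1,
# so u ~ N(0, 1/2) and f = (1 - rho)/2 * t = 1/2
u_grid = np.linspace(-6.0, 6.0, 1201)
p = np.exp(-u_grid**2) / np.sqrt(np.pi)
f = np.full_like(u_grid, 0.5)
x, y = sample_xy(u_grid, p, f, 200_000)
print(x.var(), y.var())  # each marginal should be close to N(0, 1) when rho = 0
```

From the samples (or, better, from p and f directly), the marginal CDFs of x and y can be evaluated and inverted to feed the copula.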

References

  1. Fokker, A. D. (1914). Die mittlere Energie rotierender elektrischer Dipole im Strahlungsfeld. Annalen der Physik, 348(5), 810–820. [CrossRef]
  2. Kolmogorov, A. (1931). Über die analytischen Methoden in der Wahrscheinlichkeitsrechnung. Mathematische Annalen, 104, 415–458. [CrossRef]
  3. Lucic, V. (2012). Correlation skew via product copula. Financial Engineering Workshop, Cass Business School.
  4. Luján, I. (2022). Pricing the correlation skew with normal mean–variance mixture copulas. Journal of Computational Finance, 26(2).
  5. Planck, M. (1917). Über einen Satz der statistischen Dynamik und seine Erweiterung in der Quantentheorie. Sitzungsberichte der.
Figure 1. Contour of Gaussian distribution with correlation skew
Figure 2. Marginal distribution of u
Figure 3. Marginal distribution of u
Figure 4. std dev of v conditioned on u