Induced Dualistic Geometry of Finitely Parametrized Probability Densities on Manifolds

This paper describes the geometrical structure and explicit expressions of families of finitely parametrized probability densities over a smooth manifold M. The geometry of a family of probability densities on M is inherited from probability densities on Euclidean spaces {Uα} via bundle morphisms induced by orientation-preserving diffeomorphisms ρα : Uα → M. The current literature inherits densities on M from tangent spaces via the Riemannian exponential map exp : T_x M → M; densities on M are defined locally on the region where the exponential map is a diffeomorphism. We generalize this approach with an arbitrary orientation-preserving bundle morphism and show that the dualistic geometry of a family of densities on Uα can be inherited by a family of densities on M. Furthermore, we provide explicit expressions for parametrized probability densities on ρα(Uα) ⊂ M. Finally, using the component densities on ρα(Uα), we construct parametrized mixture densities on totally bounded subsets of M and describe the inherited mixture product dualistic geometry of the family of mixture densities.


Introduction
Statistics and data analysis on manifolds are of interest in fields such as image processing, shape analysis, machine learning, and natural computation. For instance, the notion of probability densities on the manifold of motions [1] arises naturally; yet a description of the geometrical properties and explicit expressions of families of probability densities over such general manifolds is lacking.
The search for explicit expressions of probability density functions over manifolds goes back to the field of directional statistics on spheres [2] and, more recently, to work on the space of symmetric positive definite matrices [3]. Both studies employ the notion of the Riemannian exponential map described by Pennec in [4].
However, finding the exact expression of the Riemannian exponential map for general Riemannian manifolds can in practice be computationally expensive, as it involves solving the geodesic equation, which is a second-order differential equation. Therefore, in this work we aim to extend and generalize the approach of [4] in two ways:
1. Inherit the dualistic geometry of families of parametrized probability densities over Euclidean spaces to probability densities over manifolds via an orientation-preserving bundle morphism. This generalizes the Riemannian exponential map.
2. Construct and study the product dualistic geometry of families of parametrized mixture densities on totally bounded subsets of M. This extends the region on which families of finitely parametrized probability densities and their corresponding geometrical structure can be described on M.
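The discussion of this paper revolves around the following bundle morphism induced by a local orientation-preserving diffeomorphism ρ:
\[ \begin{array}{ccc} S \subset \mathrm{Prob}(U) & \xrightarrow{\;(\rho^{-1})^{*}\;} & \tilde{S} \subset \mathrm{Prob}(M) \\ \downarrow & & \downarrow \\ U & \xrightarrow{\;\rho\;} & M \end{array} \]
In this work we will discuss how the geometry of S, a family of probability densities over a Euclidean space U ⊂ R^n, is inherited by S̃ via the pulled-back bundle morphism defined by ρ^{-1}.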
The paper is organized as follows. In the second section we establish the domain of discourse. We provide a brief summary of the natural identification of probability densities over manifolds both as functions and as volume forms described in the existing literature. This allows us to use density and volume form interchangeably in further discussion for simplicity. This describes the vertical parts of the bundle morphism shown above.
In the third section we extend the notion of naturality of Riemannian structure to the dualistic structure [5] of Hessian-Riemannian manifolds. We study how a dualistic structure can be inherited by one manifold from another via a diffeomorphism, and we show how the pulled-back dualistic structure can be determined explicitly via the pulled-back local coordinates. Since statistical manifolds are well known to admit dualistic geometry, this will be used throughout the rest of the paper to transfer families of probability densities over Euclidean spaces to probability densities over manifolds. This describes the top horizontal part of the bundle morphism shown above.
In the fourth section we consider the entire bundle morphism and present a construction of probability densities on manifolds that generalizes the current approach [4] as discussed above. We first show how a family of parametrized probability densities S̃ on M is inherited from an open subset U ⊂ R^n via the bundle morphism, while preserving the dualistic geometrical structure of S on U. This allows us to derive an exact expression of probability density functions on M as pulled-back densities. Moreover, the induced family of probability densities S̃ on M inherits the dualistic geometry of S on U, which in turn allows us to inherit useful properties such as the metric and the divergence. This local construction is then extended to construct parametrized mixture densities over totally bounded subsets of M. We show that, under rather mild natural conditions, the family of mixture densities is a product manifold with the corresponding dualistic product geometry (generated by the locally induced families of probability densities).
Finally, in section five we provide a simple example of inducing a family of mixture Gaussian density functions over the unit 2-sphere S^2.

Setting
Let M be a connected n-dimensional smooth Riemannian manifold with finite Riemannian volume. Let Vol(M) denote the line bundle of smooth densities over M, and let E(Vol(M)) denote the smooth sections of Vol(M). Let Dens_+(M) denote the subspace of positive densities over M (the reader is referred to [6,7] for details on the volume bundle):
\[ \mathrm{Dens}_+(M) := \{ \mu \in \mathcal{E}(\mathrm{Vol}(M)) \mid \mu > 0 \}. \]
We consider the subspace of positive densities that integrate to 1 over M:
\[ \mathrm{Prob}(M) := \Big\{ \mu \in \mathrm{Dens}_+(M) \;\Big|\; \int_M \mu = 1 \Big\}. \]
Let µ_0 := dV_g, the Riemannian volume form, be the reference measure on M. The space of probability density functions over M, denoted by P(M) ⊂ L^1(M, µ_0) ∩ C^∞(M, R_+), can be associated naturally with the volume forms in Prob(M) ⊂ Dens_+(M) via the Hodge star operator [8] or, more notably in recent work, via a diffeomorphism [7,9] given by the map p ↦ p µ_0. For the rest of the paper we will identify probability densities as functions with probability densities as volume forms, naturally with respect to the Riemannian volume form dV_g.
In this work we will also restrict our attention to finitely parametrized families of probability distributions. A family of finitely parametrized probability distributions, equipped with the Fisher information metric, has the structure of a Riemannian manifold, also known as a statistical manifold [5].
That is, the statistical manifold of the family of probability distributions will be finite dimensional for the rest of the paper.

Preliminary: Induced dualistic structure
Consider a statistical manifold S with its corresponding metric g together with a pair of g-conjugate connections (∇, ∇^*). The triplet (g, ∇, ∇^*), known as the dualistic structure [5], is fundamental to the study of the intrinsic geometry of statistical manifolds [10]. The triplet (g, ∇, ∇^*) satisfies the following for all X, Y, Z ∈ E(TS):
\[ X\, g(Y, Z) = g(\nabla_X Y, Z) + g(Y, \nabla^{*}_X Z). \]
In this section we show how a dualistic structure can be inherited by an arbitrary smooth manifold from a given manifold with dualistic structure, naturally via a diffeomorphism.
We discuss two different ways of pulling back (dually flat) dualistic structures given a diffeomorphism from one manifold to another. We first show that general dualistic structures can be pulled back directly via the diffeomorphism. We then show that, when the manifolds are dually flat, the induced dualistic structure can be computed explicitly via the pulled-back coordinates and metric.
Whilst the first method arises more naturally in a theoretical setting, the second provides a more computable description of such an induced dualistic structure, and it is equivalent to the first when the manifolds are dually flat.
It is worth noting that the intrinsic geometry of a statistical manifold (S, g, ∇, ∇^*) can alternatively be associated with a pair (g, T), where T denotes the Amari-Chentsov tensor. In a recent paper, Ay et al. [11] showed how (g, T) can be inherited via sufficient statistics. Since diffeomorphisms are injective, and are therefore sufficient statistics, our discussion is a special case of what they showed.
In this section we provide an alternative proof for the case when S is finite dimensional and the map between statistical manifolds is a diffeomorphism. In particular, we show the relation between the induced dualistic structure, the Hessian structure, and local coordinate systems on finite dimensional statistical manifolds.

Naturality of dualistic structure
Suppose S is a finite dimensional manifold equipped with a torsion-free dually flat dualistic structure (g, ∇, ∇^*); then we can induce, via a diffeomorphism, a (dually flat) dualistic structure on another manifold S̃.
Theorem 1. If ϕ : S̃ → S is a diffeomorphism between smooth manifolds, and S is equipped with a torsion-free dualistic structure (g, ∇, ∇^*), then S̃ is a Riemannian manifold with induced dualistic structure (ϕ^*g, ϕ^*∇, ϕ^*∇^*).
Proof. Let S̃ be a smooth n-dimensional manifold, and let S be a smooth n-dimensional manifold with torsion-free dualistic structure (g, ∇, ∇^*); then the following condition is satisfied for all X, Y, Z ∈ E(TS):
\[ X\, g(Y, Z) = g(\nabla_X Y, Z) + g(Y, \nabla^{*}_X Z). \]
Let ϕ : S̃ → S be a diffeomorphism. Then the pullback of g along ϕ, given by g̃ = ϕ^*g, defines a Riemannian metric on S̃.
The pullback of ∇ via ϕ is given by:
\[ (\varphi^{*}\nabla)_{\tilde X} \tilde Y := (\varphi_{*})^{-1}\big( \nabla_{\varphi_{*}\tilde X}\, \varphi_{*}\tilde Y \big), \qquad \tilde X, \tilde Y \in \mathcal{E}(T\tilde S), \]
where E(TS̃) denotes the set of smooth sections of the tangent bundle over S̃, ϕ_* is the push-forward of ϕ, and ϕ^* is the pullback of ϕ. Since the pullback of a torsion-free connection by a diffeomorphism is a torsion-free connection, ϕ^*∇ and ϕ^*∇^* are torsion-free connections on the tangent bundle π̃ : TS̃ → S̃.
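The remaining step of the argument, that the pulled-back triplet again satisfies the duality condition, can be sketched as follows. This is a reconstruction under the assumptions that g̃ := ϕ^*g, ∇̃ := ϕ^*∇, ∇̃^* := ϕ^*∇^*, and that the pullback connection is defined through ϕ_*(∇̃_X̃ Ỹ) = ∇_{ϕ_*X̃} ϕ_*Ỹ; it is not taken verbatim from the source.

```latex
% Sketch: duality of (g, nabla, nabla^*) on S transfers to the pullback on \tilde S.
\begin{align*}
\tilde X\, \tilde g(\tilde Y, \tilde Z)
  &= \tilde X \big( g(\varphi_* \tilde Y, \varphi_* \tilde Z) \circ \varphi \big)
   = \big[ (\varphi_* \tilde X)\, g(\varphi_* \tilde Y, \varphi_* \tilde Z) \big] \circ \varphi \\
  &= \big[ g(\nabla_{\varphi_* \tilde X}\, \varphi_* \tilde Y,\; \varphi_* \tilde Z)
         + g(\varphi_* \tilde Y,\; \nabla^{*}_{\varphi_* \tilde X}\, \varphi_* \tilde Z) \big] \circ \varphi \\
  &= \big[ g(\varphi_* (\tilde\nabla_{\tilde X} \tilde Y),\; \varphi_* \tilde Z)
         + g(\varphi_* \tilde Y,\; \varphi_* (\tilde\nabla^{*}_{\tilde X} \tilde Z)) \big] \circ \varphi \\
  &= \tilde g(\tilde\nabla_{\tilde X} \tilde Y, \tilde Z)
   + \tilde g(\tilde Y, \tilde\nabla^{*}_{\tilde X} \tilde Z).
\end{align*}
```

The second line uses the duality condition on S, and the third uses the defining relation of the pullback connection, so no property of ϕ beyond being a diffeomorphism is needed.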
Remark 1. This means the diffeomorphism ϕ : S̃ → S is a local isometry.
Proof. Let ϕ : (S̃, g̃, ∇̃, ∇̃^*) → (S, g, ∇, ∇^*) be a local isometry, where (g̃, ∇̃, ∇̃^*) = (ϕ^*g, ϕ^*∇, ϕ^*∇^*), then by the proof of Theorem 1: Hence we have: Equation 1 is obtained by: By definition the Riemannian curvature tensor on S is given, for X, Y, Z ∈ TS, by:
\[ R(X, Y)Z = \nabla_X \nabla_Y Z - \nabla_Y \nabla_X Z - \nabla_{[X,Y]} Z. \]
Let X̃, Ỹ, Z̃, W̃ ∈ TS̃. The pullback curvature tensor ϕ^*R is thus given by: By symmetry, ϕ^*R^* satisfies: If S is dually flat, meaning R = 0 = R^*, we then have the equality ϕ^*R = 0 = ϕ^*R^*.
This result has been known for the Levi-Civita connection on Riemannian manifolds. Here we generalize it slightly to a pair of g-conjugate dual connections.

Computing induced dualistic structure
Finally we discuss how the pulled-back dually flat dualistic structure (ϕ^*g, ϕ^*∇, ϕ^*∇^*) can be determined explicitly via the Hessian dualistic structure generated by the pulled-back metric, local coordinates, and the corresponding induced potential function.
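A dualistic structure (g^D, ∇^D, ∇^{D*}) on S can be determined by a divergence function D via the following equations [5,14,15] for each point p ∈ S:
\[ g^{D}_{ij}\big|_p := -\partial^{1}_i \partial^{2}_j D[p; q]\big|_{q=p}, \qquad \Gamma^{D}_{ijk}\big|_p := -\partial^{1}_i \partial^{1}_j \partial^{2}_k D[p; q]\big|_{q=p}, \qquad \Gamma^{D^{*}}_{ijk}\big|_p := -\partial^{2}_i \partial^{2}_j \partial^{1}_k D[p; q]\big|_{q=p}, \]
where (θ^i) denote local coordinates on S with corresponding local coordinate frame (∂_i) about p, and ∂^ℓ_i denotes the i-th partial derivative with respect to the ℓ-th argument of D. By an abuse of notation, we may write [5]:
\[ D[\partial_i ; \partial_j] := \partial^{1}_i \partial^{2}_j D[p; q]\big|_{q=p}, \qquad D[\partial_i \partial_j ; \partial_k] := \partial^{1}_i \partial^{1}_j \partial^{2}_k D[p; q]\big|_{q=p}, \qquad D[\partial_k ; \partial_i \partial_j] := \partial^{1}_k \partial^{2}_i \partial^{2}_j D[p; q]\big|_{q=p}. \]
Remark 2. Conversely, given a torsion-free dualistic structure and a local coordinate system, there exists a divergence that induces the dualistic structure [16]. We will refer to the divergence D̃ on S̃ corresponding to the pulled-back dualistic structure (ϕ^*g, ϕ^*∇, ϕ^*∇^*) (not necessarily dually flat) as the induced divergence on S̃.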
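As a worked illustration of how a divergence generates a metric, consider the univariate Gaussian family with coordinates (µ, σ) and take D to be the Kullback-Leibler divergence. Both the choice of family and the choice of D are assumptions made here for illustration; the source does not single them out at this point. The metric is obtained by differentiating once in each argument and restricting to the diagonal q = p.

```latex
% Worked example (assumption: D is the KL divergence between univariate Gaussians).
\begin{align*}
D[p; q] &= \log\frac{\sigma_2}{\sigma_1}
          + \frac{\sigma_1^{2} + (\mu_1 - \mu_2)^{2}}{2\sigma_2^{2}} - \frac{1}{2},
  \qquad p = (\mu_1, \sigma_1),\; q = (\mu_2, \sigma_2), \\[4pt]
-\partial_{\mu_1}\partial_{\mu_2} D\big|_{q=p} &= \frac{1}{\sigma^{2}}, \qquad
-\partial_{\sigma_1}\partial_{\sigma_2} D\big|_{q=p} = \frac{2}{\sigma^{2}}, \qquad
-\partial_{\mu_1}\partial_{\sigma_2} D\big|_{q=p} = -\partial_{\sigma_1}\partial_{\mu_2} D\big|_{q=p} = 0, \\[4pt]
\big(g^{D}_{ij}\big) &= \frac{1}{\sigma^{2}}
  \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}.
\end{align*}
```

The result is the Fisher information metric of the Gaussian family in (µ, σ) coordinates, which is the expected outcome when the generating divergence is the Kullback-Leibler divergence.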
For the rest of the section we will assume both S̃ and S are dually flat, and we show how the pulled-back dually flat dualistic structure can be determined explicitly.
Let (∂_i) denote the local coordinate frame for TS corresponding to the local ∇-affine coordinates (θ^i), and let D̃ ⊂ TS̃ denote the tangent subspace spanned by the vector fields ∂̃_i: Moreover, if we consider the smooth pulled-back coordinates (θ̄_i := θ^i ∘ ϕ) and the corresponding local coordinate frame (∂̄_i): This implies that for each i there exists a constant c_i such that θ̃_i = θ̄_i + c_i, which in turn implies ∂̃_i = ∂̄_i for all i.
Since g̃ is a Hessian metric with respect to ∇̃, there exists a potential function ψ̃ such that g̃ = ∇̃∇̃ψ̃, that is, g̃_ij = ∂̃_i ∂̃_j ψ̃ = ∂̄_i ∂̄_j ψ̃. The corresponding g̃-dual local coordinate system of S̃ with respect to (θ̄_i), denoted by (η̄_i), can be defined by (η̄_i) = ∂̄_i ψ̃, with corresponding local coordinate frame (∂̄^i) of TS̃ [5].
Now consider the Hessian Riemannian structure of dually flat manifolds [13] induced by the following divergence function: where ψ̃^† is a smooth function on S̃ representing the Legendre-Fenchel transformation of ψ̃ with respect to the pair of g̃-dual local coordinates (θ̄_i), (η̄_i) on S̃. Let ḡ denote the Hessian metric generated by D̃: ḡ_ij|_p := g^{D̃}_ij|_p. By definition g̃_ij|_p = −∂
Let (∇^{D̃}, ∇^{D̃*}) denote the pair of ḡ-dual connections defined by D̃. We now show that (∇^{D̃}, ∇^{D̃*}) = (∇̃, ∇̃^*). Let X̃, Ỹ, Z̃ ∈ E(TS̃) and p ∈ S̃; the following is satisfied by construction: Since ḡ = g̃, the two equations are equal; hence, for p ∈ S̃, let (∂̄_i = ∂̃_i) denote the local frame of T_p S̃, and let Since ∂̃_i = ∂̄_i for all i, and ḡ = g̃, we have for all p ∈ S̃:

Probability densities on manifold
In this section we construct families of probability densities over smooth manifolds via orientation-preserving bundle morphisms. This extends and generalizes the construction of probability distributions on geodesically complete Riemannian manifolds via the Riemannian exponential map described in previous literature [4], summarized as follows. Given an arbitrary point x on a Riemannian manifold M, the geodesic γ_v : [0, I_v) ⊂ R → M with initial point x and initial velocity v ∈ T_x M is uniquely determined on an interval [0, I_v). This allows us to define the Riemannian exponential map at x, which maps the tangent space T_x M to M by tracing along the geodesic γ_v starting at x with initial velocity v ∈ T_x M for time 1. To be precise, we have the following definition [17]:
Definition 2. Given a point x ∈ M, consider the subset O_x of T_x M given by
\[ O_x := \{ v \in T_x M \mid \gamma_v \text{ is defined on an interval containing } [0, 1] \}; \]
then the exponential map at x is the map
\[ \exp_x : O_x \to M, \qquad v \mapsto \gamma_v(1). \]
For each x ∈ M, there exists a neighbourhood W_x of the origin in T_x M on which the Riemannian exponential map is a diffeomorphism. The image of the Riemannian exponential map can be viewed as a generalization of "going from x in direction v along the shortest path for time 1" within this neighbourhood. It is also well known that if M is geodesically complete, then the Riemannian exponential map is defined on the entire tangent space T_x M for any x ∈ M.
In previous literature [4], probability densities and the corresponding statistical properties on geodesically complete Riemannian manifolds are constructed by inheriting probability densities on the tangent space via the Riemannian exponential map.
In particular, let W_x ⊂ T_x M denote the region where the exponential map is a diffeomorphism. For each y ∈ exp_x(W_x) there is a unique v ∈ W_x such that exp_x(v) = y.
Given a probability density function p on T_x M, a probability density function p̃ on M can be constructed by specifying its value at each y = exp_x(v) ∈ exp_x(W_x):
\[ \tilde p(y) := p(\log_x(y)), \tag{3} \]
where log_x is the inverse of the Riemannian exponential map on W_x. However, finding the exact expression of the Riemannian exponential map for general Riemannian manifolds may be computationally expensive in practice, as it involves solving the geodesic equation, which is a second-order differential equation. Therefore, in this work we aim to find explicit expressions of parametrized probability distributions on manifolds with a more general map.
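The exponential-map construction summarized above can be made concrete on the unit 2-sphere, where exp_x and log_x have closed forms. The following is a minimal sketch rather than the authors' implementation: the function names are hypothetical, tangent vectors are represented as vectors in R^3 orthogonal to x, and the density on T_x S^2 is assumed to be an isotropic Gaussian.

```python
import numpy as np

def sphere_exp(x, v):
    """Riemannian exponential map on the unit sphere at x, for v tangent at x."""
    nv = np.linalg.norm(v)
    if nv < 1e-12:
        return x.copy()
    return np.cos(nv) * x + np.sin(nv) * (v / nv)

def sphere_log(x, y):
    """Inverse of sphere_exp at x; undefined at the antipodal point (cut locus)."""
    c = float(np.clip(np.dot(x, y), -1.0, 1.0))
    if c < -1.0 + 1e-12:
        raise ValueError("y lies in the cut locus of x")
    w = y - c * x                      # component of y orthogonal to x
    nw = np.linalg.norm(w)
    if nw < 1e-12:
        return np.zeros(3)             # y == x
    return np.arccos(c) * (w / nw)

def tangent_gaussian(v, sigma):
    """Isotropic Gaussian density on the 2-dimensional tangent plane (assumed)."""
    return np.exp(-np.dot(v, v) / (2.0 * sigma**2)) / (2.0 * np.pi * sigma**2)

def pulled_back_density(x, y, sigma):
    """Value of the inherited density at y: the tangent density at log_x(y)."""
    return tangent_gaussian(sphere_log(x, y), sigma)

x = np.array([0.0, 0.0, 1.0])          # base point (north pole)
y = np.array([1.0, 0.0, 0.0])          # a point on the equator
value_at_y = pulled_back_density(x, y, sigma=0.5)
```

On the sphere the cut locus of x is the single antipodal point, which is why `sphere_log` rejects it explicitly instead of returning an arbitrary tangent vector.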
We extend and generalize the above construction in two ways. We first discuss families of probability densities on M inherited locally from open subsets of R^n via an orientation-preserving bundle diffeomorphism. We discuss how the pulled-back family of probability densities on M inherits the geometrical properties of the family of probability densities on R^n, and show that it generalizes the above construction with the Riemannian exponential map.
We then extend this to construct probability distributions on M supported beyond the region where the map ρ is a diffeomorphism. In particular, we show that parametrized families of probability densities inherited in this fashion can be extended to any totally bounded subset V of M. We first consider an orientation-preserving open cover of V, where each element of the open cover is equipped with a local family of inherited probability distributions. A family of parametrized mixture densities L_V on the entire V can thus be constructed by gluing the locally inherited densities. Finally we discuss the geometrical properties of the elements of L_V, and show that L_V is a product manifold of the locally inherited families under two conditions.
We begin our discussion by first considering the case when M is a Riemannian manifold. In the last subsection we generalize it to an arbitrary smooth topological manifold.

Local densities on M via local bundle morphism
Let M be a smooth topological manifold and let U be an open subset of R^n. Suppose there exists an orientation-preserving diffeomorphism ρ : U ⊂ R^n → M. Here we consider locally inherited families of probability densities on M via the orientation-preserving diffeomorphism ρ. In particular, we construct local parametrized families of probability densities over ρ(U) ⊂ M as a subspace of the density bundle Vol(M) via the bundle morphism induced by ρ.
Let Ŝ := {p_θ | θ ∈ Ξ ⊂ R^m} ⊂ P(U) denote a finitely parametrized family of probability density functions over U ⊂ R^n. We further assume the elements of Ŝ are mutually absolutely continuous; then Ŝ has the structure of a statistical manifold [5].
Let µ_0 be an arbitrary reference measure on U, and let S := {ν_θ | θ ∈ Ξ} denote the set of volume densities over U naturally associated to Ŝ with respect to µ_0 (as discussed in Section 2).
Locally defined families of probability densities over ρ(U) ⊂ M can thus be constructed via the pullback bundle morphism defined by the orientation-preserving diffeomorphism ρ^{-1} : M → U:
\[ \begin{array}{ccc} S \subset \mathrm{Prob}(U) & \xrightarrow{\;(\rho^{-1})^{*}\;} & \tilde{S} \subset \mathrm{Prob}(M) \\ \downarrow & & \downarrow \\ U & \xrightarrow{\;\rho\;} & M \end{array} \]
where S̃ ⊂ Prob(M) is a family of probability densities over M given by:
\[ \tilde S := (\rho^{-1})^{*} S = \{ \tilde\nu_\theta = (\rho^{-1})^{*}\nu_\theta \mid \nu_\theta \in S \}. \]
More precisely, let x ∈ V := ρ(U) ⊆ M, and let X_1, …, X_n ∈ T_x M be arbitrary vectors. Given a density ν_θ ∈ S ⊂ Prob(U), the pulled-back density ν̃_θ on V ⊂ M is given by:
\[ \tilde\nu_\theta(X_1, \ldots, X_n) := \big((\rho^{-1})^{*}\nu_\theta\big)(X_1, \ldots, X_n) = \nu_\theta\big((\rho^{-1})_{*}X_1, \ldots, (\rho^{-1})_{*}X_n\big). \]
Since ρ is an orientation-preserving diffeomorphism, so is ρ^{-1}, hence we have the following equality:
\[ \int_{\rho(U)} \tilde\nu_\theta = \int_{\rho(U)} (\rho^{-1})^{*}\nu_\theta = \int_U \nu_\theta = 1. \]
Suppose ν_θ has probability density function p_θ with respect to the reference measure µ_0 on U, i.e. ν_θ = p_θ dµ_0; then in local coordinates x^1, …, x^n of M, the above integral has the following form: Next we show the diagram commutes: since ρ is a local diffeomorphism, for each v ∈ U there exists a unique y ∈ ρ(U) ⊂ M such that y = ρ(v), and (v, p_θ) is a section of the line bundle π_U : Prob(U) → U. We have the following equalities:
Finally, suppose S has dualistic structure given by (g, ∇, ∇^*). Since ρ^{-1} is a diffeomorphism, so is ρ^{-1*}. Therefore, by the discussion in Section 3, S̃ has the inherited dualistic structure (ϕ^*g, ϕ^*∇, ϕ^*∇^*), where ϕ = ρ^{-1*}. In particular, the induced family of probability distributions inherits the geometrical structure via the bundle morphism as well.
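The pullback of a density under an orientation-preserving diffeomorphism reduces, in coordinates, to the change-of-variables formula: the density on U is evaluated at ρ^{-1}(y) and multiplied by the (positive) Jacobian determinant of ρ^{-1}. The following one-dimensional sketch checks numerically that the pulled-back density still integrates to 1. The choices ρ = tanh, U = R, the Gaussian family on U, and all function names are assumptions made for illustration only.

```python
import math
import numpy as np

def rho(v):
    """Orientation-preserving diffeomorphism rho : R -> (-1, 1) (assumed example)."""
    return np.tanh(v)

def rho_inv(y):
    return np.arctanh(y)

def rho_inv_jacobian(y):
    """d(rho^{-1})/dy = 1 / (1 - y^2); positive, so orientation is preserved."""
    return 1.0 / (1.0 - y**2)

def gaussian_pdf(v, mu, sigma):
    """Density p_theta on U = R with theta = (mu, sigma)."""
    return np.exp(-(v - mu)**2 / (2.0 * sigma**2)) / (sigma * math.sqrt(2.0 * math.pi))

def pulled_back_pdf(y, mu, sigma):
    """Coordinate expression of the pulled-back density on rho(U) = (-1, 1)."""
    return gaussian_pdf(rho_inv(y), mu, sigma) * rho_inv_jacobian(y)

def midpoint_integral(f, a, b, n=400_000):
    """Midpoint rule; the nodes never touch the endpoints a and b."""
    h = (b - a) / n
    nodes = a + h * (np.arange(n) + 0.5)
    return float(np.sum(f(nodes)) * h)

total = midpoint_integral(lambda y: pulled_back_pdf(y, 0.3, 0.8), -1.0, 1.0)
```

The parameters (µ, σ) of the density on U are reused unchanged as parameters of the density on ρ(U), which mirrors the convention of keeping the original parametrization for the induced family.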
Remark 4. Note that both the local coordinate map p_θ ∈ S ↦ θ ∈ R^m and ϕ := ρ^{-1*} are diffeomorphisms. For the rest of the paper we will, without loss of generality, assume S̃ to be parametrized by (θ_i) instead of the pulled-back local coordinates θ_i ∘ ρ^{-1*} (described in Section 3.2 and Remark 3), unless specified otherwise, whenever the map ρ^{-1*} is clear.

Special case: Riemannian exponential map
We now illustrate the framework outlined above on a special case: the locally inherited family of probability densities via the Riemannian exponential map discussed in [4].
Example 2. Let M be a complete Riemannian manifold. For each x ∈ M, the exponential map exp_x restricts to a diffeomorphism from an open set U_x ⊂ T_x M onto M \ cut(x), where cut(x) is known as the cut locus of x in the current literature [17]. Since each tangent space T_x M is a topological vector space, it can be considered naturally as a metric space with the metric topology induced by the Riemannian metric. Since finite dimensional topological vector spaces of the same dimension n := dim(M) are unique up to isomorphism, T_x M is isomorphic to R^n. Moreover, since the Euclidean metric and the Riemannian metric are equivalent on finite dimensional topological vector spaces, the respective induced metric topologies are also equivalent. This means probability density functions over T_x M can be considered naturally as density functions over R^n.
Let S_x denote a finitely parametrized family of probability densities over U_x. Since exp_x is a diffeomorphism on U_x, we can construct a parametrized family of probability distributions on exp_x(U_x) by:
\[ \tilde S_x := \log_x^{*} S_x = \{ \log_x^{*} p \mid p \in S_x \}, \]
where log_x := exp_x^{-1} denotes the Riemannian logarithm map. The elements p̃ ∈ S̃_x = log_x^* S_x are given by p̃(y) = log_x^* p(y) = p(exp_x^{-1}(y)) = p(log_x(y)). This coincides with equation (3). Since exp_x is an orientation-preserving diffeomorphism from U_x onto M \ cut(x), this reduces to a special case of the construction via orientation-preserving bundle morphisms discussed above.
It is worth noting that for general Riemannian manifolds this approach may be quite limiting, since U_x may be a small region in the tangent space.
Throughout the rest of the paper, we will use the Riemannian exponential map as an example to illustrate our approach. It is however worth noting that our construction applies to all orientation-preserving diffeomorphisms, not just the Riemannian exponential map.

Mixture densities on totally bounded subsets of M
In this section we discuss probability distributions on M supported beyond the region where the map ρ is a diffeomorphism. In particular, we discuss how parametrized families of mixture probability densities with locally inherited dualistic geometry can be defined on totally bounded subsets of M. We begin with the notion of an orientation-preserving open cover of M.
Remark 5. There always exists an orientation-preserving open cover for an orientable smooth manifold M; we may simply consider the smooth atlas A := {(ρ_α, U_α)} of M. It is worth noting that we only require ρ_α : U_α → M to be orientation-preserving locally on each U_α. Therefore, an orientation-preserving open cover exists even when M is non-orientable.

We now provide two examples of orientation-preserving open cover using Riemannian exponential map:
Example 3. Given a complete Riemannian manifold M and a point x ∈ M, the injectivity radius at x is the real number [17]:
\[ \mathrm{inj}(x) := \sup\{ r > 0 \mid \exp_x \text{ restricted to } B(0, r) \subset T_x M \text{ is a diffeomorphism onto its image} \}. \]
For x ∈ M, let B_x := B(0, inj(x)) ⊂ T_x M ≅ R^n denote the ball of injectivity radius centred at 0, i.e. the largest metric ball in T_x M on which exp_x is a diffeomorphism; then {exp_x(B_x)}_{x∈M} is an open cover of M.
At the origin of B_x, the pushforward of the Riemannian exponential map, (exp_x)_* : T_0(T_x M) ≅ T_x M → T_x M, is the identity map, a linear isomorphism. Since exp_x is a diffeomorphism on the connected ball B_x, the determinant of its differential cannot change sign there, so exp_x is an orientation-preserving local diffeomorphism on the ball of injectivity radius [17]. Hence {(ρ_x, U_x)} := {(exp_x, B_x)}_{x∈M} is an orientation-preserving open cover of M.
Example 4. Alternatively we can consider another orientation-preserving open cover extended from the one defined above: is also an orientation-preserving open cover of M.

Refinement of orientation-preserving open cover:
Lemma 1. Given an orientation-preserving open cover of a Riemannian manifold M, there exists a refinement of the open cover by metric balls in M.

Proof. Given (orientation-preserving
Since N_{x_α} is open for all x_α ∈ ρ_α(U_α), there exists ε_{x_α} > 0 such that the metric ball centred at x_α, denoted by B_{x_α}, satisfies B_{x_α} := B(x_α, ε_{x_α}) ⊂ N_{x_α}. In other words, B_{x_α} is the metric ball centred at x_α in normal coordinates under the norm given by the radial distance function. Since ρ_α is a diffeomorphism for all α, this implies
Observe that the proof did not use the fact that ρ_α is orientation-preserving. It suffices to consider an open cover {(ρ_α, U_α)} of M such that the ρ_α are just diffeomorphisms, together with the following simple result:
Lemma 2. Let f : M → N be a local diffeomorphism between manifolds M, N. Then there exists a local orientation-preserving diffeomorphism f̃ : M → N.
Proof. Since f : M → N is a local diffeomorphism, the pushforward f_* : T_p M → T_{f(p)} N is a linear isomorphism for all p ∈ M. Hence the determinant of the matrix Df is non-zero: det Df ≠ 0. If f is orientation-preserving, there is nothing left to prove. We therefore assume f is orientation-reversing, in other words det Df < 0.
Let x^1, …, x^n denote local coordinates on M and let f̂ denote the coordinate representation of f; then we can write f̂(x) = (f̂^1(x), …, f̂^n(x)), where x := (x^1(p), …, x^n(p)). Choose a ∈ {1, …, n−1}, and let f̃ : M → N denote the diffeomorphism from M to N defined by the following coordinate representation:
\[ \hat{\tilde f}(x) := \big( \hat f^{1}(x), \ldots, \hat f^{a+1}(x), \hat f^{a}(x), \ldots, \hat f^{n}(x) \big). \]
In other words, we define f̃ by swapping the a-th and (a+1)-st coordinates of f. The matrix representation of f̃_* in standard coordinates is thus given by Df̃ = I′ · Df, where I′ is the matrix given by:
\[ I' := \begin{pmatrix} I_{a-1} & & \\ & \begin{smallmatrix} 0 & 1 \\ 1 & 0 \end{smallmatrix} & \\ & & I_{n-a-1} \end{pmatrix}, \]
where I_k is the identity matrix of dimension k, the sub-matrix with rows (0, 1) and (1, 0) is located at the (a, a)-th to the (a+1, a+1)-st position of I′, and the rest of the entries are all zero. Since det Df < 0 and det I′ = −1, this means det Df̃ = det I′ · det Df = −det Df > 0.
For the rest of the discussion we will consider the orientation-preserving open cover by metric balls {(ρ_α, W_{x_α})} of the Riemannian manifold M.
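The determinant argument in the proof of Lemma 2 can be checked numerically: left-multiplying a Jacobian by the matrix that exchanges two adjacent rows multiplies its determinant by −1. The sketch below uses hypothetical helper names and a 0-based row index, which are illustrative choices rather than notation from the source.

```python
import numpy as np

def swap_matrix(n, a):
    """Identity matrix with rows a and a+1 exchanged (0-based, 0 <= a < n-1)."""
    s = np.eye(n)
    s[[a, a + 1]] = s[[a + 1, a]]
    return s

def make_orientation_preserving(jac, a=0):
    """Return jac unchanged if det > 0, else swap two output coordinates."""
    if np.linalg.det(jac) > 0:
        return jac
    return swap_matrix(jac.shape[0], a) @ jac

# Jacobian of an orientation-reversing map (determinant -2).
jac = np.array([[0.0, 1.0, 0.0],
                [1.0, 0.0, 0.0],
                [0.0, 0.0, 2.0]])
fixed = make_orientation_preserving(jac)
```

The construction needs at least two coordinates to swap, which matches the requirement that the index a ranges over 1, …, n−1 in the proof.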

Mixture densities on totally bounded subsets of M
Now we are ready to construct parametrized families of probability densities on totally bounded subsets of a Riemannian manifold M. Let V ⊂ M be a totally bounded subset of M, and let {(ρ_α, U_α)} be an orientation-preserving open cover of M. Let {(ρ_{x_α}, W_{x_α})} denote a refinement of {(ρ_α, U_α)} by open metric balls in M, as discussed in Lemma 1.
Since {ρ_α(W_{x_α}) = B(x_α, ε_{x_α})}_{x_α ∈ M} is an open cover of M by metric balls, it is an open cover of V ⊂ M as well. Moreover, since V is totally bounded, there exists a finite subcover {B(x_α, ε_{x_α})}_{α=1}^{Λ} of V.
For simplicity, by an abuse of notation, we will denote the finite subcover by {(ρ_α, W_α)}_{α=1}^{Λ}, and for each α we let S_α denote a finitely parametrized family of probability densities over W_α. Note that in general we allow the parametric families S_α to have different parametrizations and dimensions m_α. Furthermore, consider for each α the induced family of local probability densities S̃_α; we can then define parametrized mixture densities over V ⊂ M by patching together the locally induced ones:
\[ \tilde\nu := \sum_{\alpha=1}^{\Lambda} \varphi_\alpha \tilde\nu_\alpha, \qquad \tilde\nu_\alpha \in \tilde S_\alpha, \]
where ϕ_α ∈ (0, 1) for all α ∈ {1, …, Λ}, and ∑_{α=1}^{Λ} ϕ_α = 1. Let S_0 denote the simplex of mixture coefficients treated as a family of discrete distributions: Then S_0 is an exponential family, hence a dually flat manifold with local parametrization given by [5]: Let θ_0 denote the set of parameters of S_0. We denote the set of mixture densities by L_V:
\[ L_V := \Big\{ \tilde\nu = \sum_{\alpha=1}^{\Lambda} \varphi_\alpha \tilde\nu_\alpha \;\Big|\; (\varphi_\alpha)_{\alpha=1}^{\Lambda} \in S_0,\; \tilde\nu_\alpha \in \tilde S_\alpha \Big\}. \]
In local coordinates x^1, …, x^n of M, the mixture volume form ν̃ can be expressed in the following form:
\[ \tilde\nu = \sum_{\alpha=1}^{\Lambda} \varphi_\alpha\, \tilde p_\alpha(x, \theta^{\alpha})\, dx^{1} \wedge \cdots \wedge dx^{n}, \]
where p̃_α(x, θ^α) are the parametrized probability density functions naturally associated to ν̃_α, parametrized by sets of parameters (θ^α) ∈ Ξ_α (see Remark 4). In local coordinates x^1, …, x^n of M, p̃_α(x, θ^α) is defined implicitly by p̃_α(x, θ^α) dx^1 ∧ ⋯ ∧ dx^n := ν̃_α. The parameters of the mixture distribution p̃(x, ξ) are collected in (ξ_i)_{i=1}^{d} := (θ_0, θ̃_1, …, θ̃_Λ).
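The patching of locally induced densities into a mixture can be sketched on the one-dimensional manifold M = R with two charts. Everything concrete here is an assumption for illustration: the charts ρ_α(v) = c_α + tanh(v) with images (c_α − 1, c_α + 1), the Gaussian families on each chart domain, the function names, and the parametrization of the weights by θ_0^α = log(ϕ_α / ϕ_Λ), which is the usual natural parametrization of a categorical distribution but is not recoverable from the source text.

```python
import numpy as np

def mixture_weights(theta0):
    """Weights on the open simplex from Lambda - 1 natural parameters (assumed form)."""
    z = np.append(np.asarray(theta0, dtype=float), 0.0)  # last weight is the reference
    e = np.exp(z - z.max())
    return e / e.sum()

def component_pdf(y, centre, mu, sigma):
    """Pulled-back N(mu, sigma^2) under rho(v) = centre + tanh(v); zero off the chart."""
    y = np.asarray(y, dtype=float)
    u = y - centre
    inside = np.abs(u) < 1.0
    out = np.zeros_like(y)
    v = np.arctanh(u[inside])
    gauss = np.exp(-(v - mu)**2 / (2.0 * sigma**2)) / (sigma * np.sqrt(2.0 * np.pi))
    out[inside] = gauss / (1.0 - u[inside]**2)   # Jacobian of rho^{-1}
    return out

def mixture_pdf(y, theta0, components):
    """Mixture density: weighted sum of the locally induced component densities."""
    phi = mixture_weights(theta0)
    return sum(w * component_pdf(y, *c) for w, c in zip(phi, components))

# Two charts with images (-1, 1) and (0, 2); each entry is (centre, mu, sigma).
components = [(0.0, 0.0, 0.7), (1.0, 0.2, 0.5)]
theta0 = [np.log(3.0)]                            # gives weights (0.75, 0.25)

n = 600_000
h = 3.0 / n
grid = -1.0 + h * (np.arange(n) + 0.5)            # midpoint nodes on V = (-1, 2)
total = float(np.sum(mixture_pdf(grid, theta0, components)) * h)
```

The two component families have different supports and separate parameter sets, so the example is in the spirit of the conditions imposed on mixture components later in the section.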

Remark: Exhaustion by compact set and the extent of extended support
It is worth noting that since every smooth topological manifold is σ-locally compact, M admits a compact exhaustion:
Definition 4. An exhaustion by compact sets is an increasing sequence of compact subsets K_j of M such that
\[ K_j \subset \mathrm{int}(K_{j+1}) \ \text{for all } j, \qquad M = \bigcup_{j} K_j. \]
For each x ∈ M, for any totally bounded subset V ⊂ M, there exists N ∈ ℕ such that for all n > N: Even though at first glance total boundedness might appear quite restrictive, this shows that it allows us to approximate the manifold sufficiently well.

Geometrical structure of L V
Consider a totally bounded subset V ⊂ M, an orientation-preserving finite open cover {(ρ_α, W_α)}_{α=1}^{Λ} of V by metric balls, and dually flat families of densities S_α over W_α. We will show that the family of mixture distributions L_V is, under two conditions, a dually flat product Riemannian manifold, hence naturally inheriting the local geometry of the families of component densities S_α established in the beginning of Section 3.

L V as a smooth manifold
For simplicity, let p̃_α(x) := p̃(x, θ^α) denote the probability density function corresponding to ν̃_α ∈ S̃_α for all α. We first show that L_V is indeed a smooth manifold under the following two natural conditions:
(C1) Families of mixture component distributions have different proper supports: Let K_α := {x ∈ M | p̃_α(x) > 0, ∀ p̃_α ∈ S̃_α} denote the proper support of the probability densities p̃_α ∈ S̃_α for each α ∈ {1, …, Λ}. We assume:
(C2) No functional dependency between mixture component densities: We construct mixture densities in L_V as unconstrained mixtures, meaning there is no functional dependency between mixture components. In other words, changing the parameters θ^β ∈ Ξ_β of a mixture component p̃_β ∈ S̃_β has no influence on p̃_α ∈ S̃_α for β ≠ α, and vice versa. We write this condition as follows: For each p̃_α ∈ S̃_α,
1. The first condition C1 can always be satisfied simply by choosing a suitable open cover of V.
2. The second condition C2 is automatically fulfilled for unconstrained mixture models. One can imagine introducing functional dependencies among mixture component distributions, but this is not the case considered here. We make the assumption that if we alter one distribution p̃_α ∈ S̃_α, it does not affect the distributions in S̃_β for β ≠ α.
Therefore the parametrization map θ_0, θ^1, …, By condition C2, the parameters are also independent in the sense that for β ≠ α: In other words, the pullback by ρ does not introduce additional functional dependencies among parameters.
Mixture densities p̃ ∈ L_V can thus be identified naturally by the map p̃ = ∑_{α=1}^{Λ} ϕ_α p̃_α ↦ (ϕ_1, …, ϕ_Λ, p̃_1, …, p̃_Λ), where the image represents the mixture coefficients and the mixture component distributions. Since the parametrizations θ^α ↦ p̃_α are smooth with smooth inverse, L_V is a smooth manifold with coordinates (θ_0, θ̃_1, …, θ̃_Λ).

Torsion-free dualistic structure on L_V
Now consider the following function on L_V × L_V: where D_KL is the Kullback-Leibler divergence (relative entropy) on the mixture coefficients S_0 = {ϕ_α = P(A = α)}_{α=1}^{Λ} as a family of discrete distributions, and D_α is the induced divergence on the smooth manifold S̃_α described in Remark 2 in Section 3.
It is immediate by definition that D satisfies the conditions of a divergence:
1. Non-negativity: Since the D_α's and D_KL are both non-negative, so is D:
2. Identity: Since the D_α's and D_KL are divergences, the following is satisfied:
be coordinates of S_α, and once again let (ξ_i)_{i=1}^{d} := (θ_0, θ̃_1, …, θ̃_Λ) denote the coordinates of L_V, where Moreover, if where , where θ̃^β_i denote the parametrization of S̃_β for some β ∈ {1, …, Λ}, then by condition C2: where g^β is the metric on S_β, and p_β := (ρ^{-1*})^{-1} p̃_β = ρ^* p̃_β. The last equality is due to the analysis towards the end of Section 3. Otherwise, let
The Christoffel symbols of the connection ∇^D are given by: where we recall −D This implies that for X_0, Y_0 ∈ E(TS_0), X_α, Y_α ∈ E(TS̃_α) for α ∈ {1, …, Λ}, we have the following: By symmetry and the fact that g^D = g^{D*} [5], we also have the following: and we also obtain the following result analogous to equation (12):
Finally, for X_0, Y_0 ∈ E(TS_0), X_α, Y_α ∈ E(TS̃_α) for α ∈ {1, …, Λ}, we have the following: Hence by equations (8, 13, 16), the dualistic structure (g^D, ∇^D, ∇^{D*}) naturally decomposes into the parts corresponding to the mixture coefficients and the mixture components. We abbreviate equations (8, 13, 16) to the following compact notation:

Dualistic structure of L V
To show that L_V is indeed a product Riemannian manifold, we recall some properties of product Riemannian manifolds [18,19].
Remark 7. Note that since L_V consists of a finite mixture of probability distributions, to show that L_V = S_0 × S̃_1 × ⋯ × S̃_Λ, it suffices to consider the dualistic structure of the product of two manifolds.
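Given two Riemannian manifolds (M, g_1), (N, g_2), the product Riemannian metric on M × N is given by [18]:
\[ g := g_1 \oplus g_2, \qquad g(X, Y) = g_1(PX, PY) + g_2(QX, QY), \]
where P, Q are the projections from T(M × N) to TM, TN respectively. Suppose ∇^1 and ∇^2 are connections on M, N respectively; then the product connection is given by [19]:
\[ \nabla_{X_1 + X_2}(Y_1 + Y_2) = \nabla^{1}_{X_1} Y_1 + \nabla^{2}_{X_2} Y_2, \]
where X_1, Y_1 ∈ TM and X_2, Y_2 ∈ TN. We once again abbreviate the product connection to a more compact notation for simplicity: ∇ = ∇^1 ⊕ ∇^2. The Lie bracket on M × N is
\[ [X_1 + X_2, Y_1 + Y_2] = [X_1, Y_1] + [X_2, Y_2], \]
and the curvature tensor is given by
\[ R(X, Y)Z = \nabla_X \nabla_Y Z - \nabla_Y \nabla_X Z - \nabla_{[X,Y]} Z. \]
The curvature tensor on M × N is thus
\[ R(X_1 + X_2, Y_1 + Y_2)(Z_1 + Z_2) = R_1(X_1, Y_1)Z_1 + R_2(X_2, Y_2)Z_2, \tag{19} \]
where R_1, R_2 denote the curvature tensors of M, N respectively. Hence if M and N are flat, so is M × N.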
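The block structure of a product metric can be observed numerically when the metric is generated by a divergence that is a sum of divergences on the factors. The sketch below assumes, for illustration only, a two-component mixture whose divergence is the Kullback-Leibler divergence on the weights plus Gaussian Kullback-Leibler divergences on the two component families, with coordinates ξ = (θ_0, µ_1, σ_1, µ_2, σ_2); neither this additive form nor the Gaussian components are confirmed by the source. The metric is computed as minus the mixed second derivative of the divergence on the diagonal, by central finite differences.

```python
import numpy as np

def kl_weights(t, s):
    """KL divergence between two-point weight vectors with natural parameters t, s."""
    p = 1.0 / (1.0 + np.exp(-t))
    q = 1.0 / (1.0 + np.exp(-s))
    return p * np.log(p / q) + (1.0 - p) * np.log((1.0 - p) / (1.0 - q))

def kl_gauss(m1, s1, m2, s2):
    """KL divergence from N(m1, s1^2) to N(m2, s2^2)."""
    return np.log(s2 / s1) + (s1**2 + (m1 - m2)**2) / (2.0 * s2**2) - 0.5

def divergence(xi, zeta):
    """Assumed additive divergence on the product of the three factors."""
    return (kl_weights(xi[0], zeta[0])
            + kl_gauss(xi[1], xi[2], zeta[1], zeta[2])
            + kl_gauss(xi[3], xi[4], zeta[3], zeta[4]))

def metric_from_divergence(div, xi, h=1e-4):
    """g_ij = -(d/dxi_i)(d/dzeta_j) div(xi, zeta) at zeta = xi, by central differences."""
    d = len(xi)
    g = np.zeros((d, d))
    for i in range(d):
        for j in range(d):
            ei = np.zeros(d); ei[i] = h
            ej = np.zeros(d); ej[j] = h
            mixed = (div(xi + ei, xi + ej) - div(xi + ei, xi - ej)
                     - div(xi - ei, xi + ej) + div(xi - ei, xi - ej)) / (4.0 * h * h)
            g[i, j] = -mixed
    return g

xi = np.array([0.4, 0.0, 1.0, 2.0, 0.5])
g = metric_from_divergence(divergence, xi)
```

Entries of `g` that couple different factors vanish up to finite-difference error, which is the numerical counterpart of the direct-sum decomposition g = g_1 ⊕ g_2.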
Hence equation (20) equals:
where $\nabla^{1*}, \nabla^{2*}$ denote the $g_1$-, $g_2$-dual connections to $\nabla^1, \nabla^2$ on $M$, $N$ respectively. The unique [12] $g$-dual connection to $\nabla$ on $M \times N$, denoted by $\nabla^*$, is thus given by the following:
$$\nabla^* = \nabla^{1*} \oplus \nabla^{2*}.$$
Furthermore, since the curvature of $M \times N$ satisfies the product curvature tensor described in equation (19), if $(M, g_1, \nabla^1, \nabla^{1*})$ and $(N, g_2, \nabla^2, \nabla^{2*})$ are both dually flat, then so is their product.
Remark 8. By the theorem above, $L_V = S_0 \times \tilde S_1 \times \cdots \times \tilde S_\Lambda$ is therefore a product manifold with product dualistic structure:
where the equality follows from equation (17) and the discussion towards the end of Section 3.2.³
Since the mixture coefficients $S_0$ correspond to the family of multinomial distributions, which in turn is a member of the exponential family, $S_0$ is dually flat. Therefore by the above result, if the mixture component families $S_1, \ldots, S_\Lambda$ on the orientation-preserving open cover $\{W_\alpha\}_{\alpha=1}^{\Lambda}$ are all dually flat, then so are $\tilde S_1, \ldots, \tilde S_\Lambda$. By applying induction on the above result, $L_V = S_0 \times \tilde S_1 \times \cdots \times \tilde S_\Lambda$ is also dually flat.
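By the discussion towards the end of Section 3, let $\tilde\psi_\alpha$ denote the pulled-back potential function on $\tilde S_\alpha$ defined by the local coordinates $\tilde\theta^\alpha$ and the pulled-back metric $\tilde g_\alpha$ on $\tilde S_\alpha$. The $\tilde g_\alpha$-dual local coordinates to $\tilde\theta^\alpha$ can be defined via the induced potential function $\tilde\psi_\alpha$ by:
$$\tilde\eta^\alpha_i := \frac{\partial \tilde\psi_\alpha}{\partial \tilde\theta^\alpha_i}.$$
Since $S_0$ and $\tilde S_\alpha$ are all dually flat for $\alpha \in [1, \ldots, \Lambda]$, we can write the divergences $D_{KL}$ and $D_\alpha$ of $S_0$ and $\tilde S_\alpha$ in the canonical form [5] respectively as follows:
³ For a detailed discussion of the dualistic structures on $S_0, \tilde S_1, \ldots, \tilde S_\Lambda$, please refer to the beginning of Section 4.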
The functions $\psi_0^\dagger$ and $\tilde\psi_\alpha^\dagger$ denote the Legendre-Fenchel transformations of $\psi_0$ and $\tilde\psi_\alpha$, and are given by the following equations, respectively:
The divergence $D$ on $L_V$ from equation (7) can then be expressed as:
The first and second parts are convex due to the linearity of the derivative, the independence of parameters given by condition C2, and the fact that the Hessians of the potential functions $\psi_0, \tilde\psi_1, \ldots, \tilde\psi_\Lambda$ are positive semi-definite. The third part is a sum of inner products, which is again an inner product on the product parameter space in
Recall that the parameters of the mixture distribution p( 5
By linearity of the derivative and condition C2, the $g^D$-coordinate dual to $(\xi_i)_{i=1}^{d}$ is given by $(\eta_0, \tilde\eta_1, \ldots, \tilde\eta_\Lambda) :=$ . Furthermore, the dual potential of $\psi_0(\varphi) + \sum_{\alpha=1}^{\Lambda} \tilde\psi_\alpha(\tilde p_\alpha)$ is given by the following Legendre-Fenchel transformation:
The third equality follows from the functional independence of $\varphi \in S_0$ and the $\tilde p_\alpha$'s in $\tilde S_\alpha$. Hence the Legendre-Fenchel transform of the first component of $D$ is exactly the second component of $D$,
Finally we discuss generalizations of the constructions discussed in this section.
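As a numerical aside on the canonical form used in this section, the following sketch evaluates potential, dual coordinates, and Legendre-Fenchel dual potential for a categorical family (standing in for the mixture coefficient family $S_0$) and checks that the canonical form reproduces the Kullback-Leibler divergence. The argument order of the divergence, the use of a second categorical family as a stand-in for a component family, and all function names are assumptions made for illustration only.

```python
import math

def natural_params(p):
    """theta_i = log(p_i / p_0) for i = 1..n (natural coordinates)."""
    return [math.log(pi / p[0]) for pi in p[1:]]

def potential(theta):
    """psi(theta) = log(1 + sum_i exp(theta_i)), the log-partition function."""
    return math.log(1.0 + sum(math.exp(t) for t in theta))

def dual_coords(p):
    """eta_i = d psi / d theta_i = p_i for i = 1..n (expectation coordinates)."""
    return list(p[1:])

def dual_potential(p):
    """Legendre-Fenchel dual: psi_dagger(eta) = <theta, eta> - psi(theta)."""
    theta, eta = natural_params(p), dual_coords(p)
    return sum(t * e for t, e in zip(theta, eta)) - potential(theta)

def canonical_divergence(p, q):
    """psi(theta_p) + psi_dagger(eta_q) - <theta_p, eta_q> (assumed argument order)."""
    theta_p, eta_q = natural_params(p), dual_coords(q)
    inner = sum(t * e for t, e in zip(theta_p, eta_q))
    return potential(theta_p) + dual_potential(q) - inner

def kl(q, p):
    """Kullback-Leibler divergence KL(q || p) of two categorical distributions."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p))

# Two functionally independent factors (illustrative values).
p0, q0 = [0.2, 0.3, 0.5], [0.4, 0.4, 0.2]
p1, q1 = [0.6, 0.4], [0.1, 0.9]

d0 = canonical_divergence(p0, q0)
d1 = canonical_divergence(p1, q1)
d_total = d0 + d1  # divergence on the product: sum over the independent factors

neg_entropy_p0 = sum(pi * math.log(pi) for pi in p0)
```

With this convention `d0` coincides with `kl(q0, p0)`, and `dual_potential(p0)` coincides with the negative entropy `neg_entropy_p0`. Because the two factors share no parameters, the potential of the product is the sum of the factor potentials and its Legendre-Fenchel dual is the sum of the factor duals, which is why `d_total` is again a canonical divergence.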

Inheriting densities of unbounded support
Whilst the previous discussion allows us to inherit families of distributions supported in open subsets of Euclidean spaces $\mathbb{R}^n$, it can be extended to inherit families of distributions with unbounded support over $\mathbb{R}^n$.
One way to do so is to apply the orientation-preserving bundle morphism construction twice.
Let $x \in M$ be arbitrary. The first bundle morphism is constructed via a diffeomorphism from $\mathbb{R}^n$ (linearly isomorphic to the tangent space $T_x M$ of $M$) to a star-shaped open neighbourhood $V_x$ of the origin of $T_x M$.
Using the bundle morphism construction discussed in the beginning of this section, this allows us to construct a family of distributions supported in $V_x$.
The second bundle morphism is constructed via the local diffeomorphism $\exp$, as discussed in the beginning of this section. We discuss this more formally as follows: let $M$ be a Riemannian manifold and let $x \in M$ be arbitrary. Consider a metric ball $B(x, \epsilon_x) \subset M$.
By Gonnord and Tosel [20], there exists a diffeomorphism $f_x : V_x \to \mathbb{R}^n$, and by Lemma 2 we can assume $f_x$ to be orientation-preserving.
Let $S$ be a family of probability densities supported in $\mathbb{R}^n$. We can induce a family of probability densities $\tilde S$ on $M$ by the following composition of orientation-preserving bundle morphisms.
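The first of the two steps described above can be sketched numerically in one dimension. The sketch below assumes $n = 1$, a hypothetical neighbourhood $V_x = (-\epsilon, \epsilon)$ with $\epsilon = 0.5$, and the particular diffeomorphism $f(v) = \operatorname{artanh}(v/\epsilon)$; none of these choices come from the paper. It pulls a standard Gaussian on $\mathbb{R}$ back to $V_x$ and checks that the result is still a probability density. The second step (composition with $\exp$) is not modelled here.

```python
import math

EPS = 0.5  # hypothetical radius of the neighbourhood V_x = (-EPS, EPS)

def f(v):
    """Assumed diffeomorphism V_x -> R; orientation-preserving since f' > 0."""
    return math.atanh(v / EPS)

def df(v):
    """Derivative of f, i.e. the 1-D Jacobian determinant."""
    return 1.0 / (EPS * (1.0 - (v / EPS) ** 2))

def gaussian(x, mu=0.0, sigma=1.0):
    """Density with unbounded support on R."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def induced_density(v):
    """Density on V_x obtained by change of variables: p(f(v)) * f'(v)."""
    return gaussian(f(v)) * df(v)

def integrate(func, a, b, n=200_000):
    """Midpoint rule; the nodes never touch the endpoints a and b."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

total_mass = integrate(induced_density, -EPS, EPS)
```

The value `total_mass` is close to 1, as expected: the change of variables moves all of the mass of the Gaussian into the bounded neighbourhood, which is what allows the subsequent local diffeomorphism to carry it onto the manifold.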
The matrix $D\bar\rho^{-1}$ is given by:
and the corresponding determinant is given by the following, which is always positive for $x, y, z \in \mathbb{R}$:
$$\frac{1}{\sqrt{x^2 + y^2}\,\sqrt{x^2 + y^2 + z^2}}.$$
Therefore $\bar\rho^{-1}$ is orientation-preserving, and so is $\rho^{-1}$.⁸
⁸ Since $x^2 + y^2 + z^2 = 1$ is constant on $S^2 \hookrightarrow \mathbb{R}^3$, the determinant of $D\rho^{-1}$ becomes $\frac{1}{\sqrt{x^2 + y^2}}$, which is always positive on $S^2$.
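The positivity of such a Jacobian determinant can be checked numerically. The sketch below assumes that the extended inverse chart is the usual spherical coordinate map $(x, y, z) \mapsto (\theta, \phi, r)$ with $\theta = \arccos(z/r)$ and $\phi = \operatorname{atan2}(y, x)$; this choice of map, the ordering of its outputs, and the helper names are assumptions for illustration, since other conventions change the closed form. It compares a finite-difference determinant with $1/(\sqrt{x^2+y^2}\sqrt{x^2+y^2+z^2})$.

```python
import math

def rho_bar_inv(x, y, z):
    """Assumed extension of the inverse chart: (x, y, z) -> (theta, phi, r)."""
    r = math.sqrt(x * x + y * y + z * z)
    return (math.acos(z / r), math.atan2(y, x), r)

def jacobian(func, point, h=1e-6):
    """Central finite-difference Jacobian; entry [i][k] is d f_i / d x_k."""
    cols = []
    for k in range(3):
        plus, minus = list(point), list(point)
        plus[k] += h
        minus[k] -= h
        fp, fm = func(*plus), func(*minus)
        cols.append([(a - b) / (2.0 * h) for a, b in zip(fp, fm)])
    return [[cols[k][i] for k in range(3)] for i in range(3)]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def closed_form(x, y, z):
    """Determinant under the assumed convention."""
    return 1.0 / (math.sqrt(x * x + y * y) * math.sqrt(x * x + y * y + z * z))

generic_point = (0.3, 0.4, 0.5)    # arbitrary point off the z-axis
sphere_point = (0.48, 0.36, 0.8)   # lies on the unit sphere

det_generic = det3(jacobian(rho_bar_inv, generic_point))
det_sphere = det3(jacobian(rho_bar_inv, sphere_point))
```

Both determinants are positive and match `closed_form` at the respective points; on the unit sphere the value reduces to $1/\sqrt{x^2+y^2}$, so the change of measure there depends only on $x$ and $y$.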
In this example we will inherit truncated bivariate Gaussian distributions on the rectangle U_1 = (0,
The induced distributions $\tilde S_{V_1}$ on $V_1 \subset S^2$ via the pullback bundle morphism induced by $\rho^{-1}$ are given by the following commuting diagram (c.f. Section 4.1), in which $\tilde S_{V_1} \subset \mathrm{Prob}(S^2)$:
The induced probability density $\tilde p \in \tilde S_{V_1} := \rho^{-1*} S_{U_1}$ on $\rho(U_1) \subset S^2$ is given by (see equation (4)):
The closed-form expression of $\tilde p \in \tilde S_{V_1}$ is thus given by the following expression:
where $r := \sqrt{x^2 + y^2 + z^2} = 1$ on $S^2$. Notice that since $z = \sqrt{1 - x^2 - y^2}$ on $S^2$, the change of measure only depends on $x, y$.
Furthermore, since $S_{U_1}$ is an exponential family, which is an $\alpha$-affine manifold with $\alpha = 1$, and $\rho$ is a (local) diffeomorphism and hence a sufficient statistic for $\tilde S_{V_1}$, by Example 1, $\tilde S_{V_1}$ is a 1-affine manifold as well. The canonical divergence on $S_{U_1}$ is given by the Kullback-Leibler divergence $D_{KL}$, hence the induced divergence $\tilde D_1$ on $\tilde S_{V_1}$ can be computed by:
$$\tilde D_1(\rho^{-1*} p, \rho^{-1*} q) = D_{KL}(p, q), \quad \text{for } p, q \in S_{U_1}. \qquad (26)$$
This serves as an illustration of the construction of the pulled-back dualistic structure and induced divergence discussed in Remark 2 of Section 3.
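Equation (26) is consistent with the standard fact that the Kullback-Leibler divergence is invariant under a diffeomorphic change of variables. The following one-dimensional sketch checks this numerically. The densities `p`, `q` on $(0, 1)$ and the map `rho(x) = x**2` are hypothetical stand-ins chosen because the integrals have a closed form; they are not the truncated Gaussians or the chart of the example above.

```python
import math

def p(x):
    """Illustrative density on U = (0, 1)."""
    return 2.0 * x

def q(x):
    """Second illustrative density on U = (0, 1)."""
    return 3.0 * x * x

def rho(x):
    """Assumed orientation-preserving diffeomorphism (0, 1) -> (0, 1)."""
    return x * x

def rho_inv(y):
    return math.sqrt(y)

def d_rho_inv(y):
    """Derivative of rho_inv, the 1-D Jacobian determinant of the inverse."""
    return 0.5 / math.sqrt(y)

def induce(density):
    """Induced density on rho(U): density(rho_inv(y)) * d_rho_inv(y)."""
    return lambda y: density(rho_inv(y)) * d_rho_inv(y)

def kl(f, g, n=200_000):
    """Midpoint-rule approximation of KL(f || g) on (0, 1)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += f(t) * math.log(f(t) / g(t))
    return h * total

kl_source = kl(p, q)
kl_induced = kl(induce(p), induce(q))
kl_exact = math.log(2.0 / 3.0) + 0.5  # closed form for this particular pair
```

Up to quadrature error, `kl_source`, `kl_induced`, and `kl_exact` agree, so computing the divergence between the original densities gives the same value as computing it between the induced ones.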
Let $V_2 = \{(x, y, z) \in S^2 \mid y \geq 0\}$ denote another closed subset of $S^2$ and let $V = V_1 \cup V_2$ denote a totally bounded subset of $S^2$.