1. Introduction
The study of estimation efficiency within the framework of information geometry has evolved significantly since the pioneering work of Rao [6], followed some years later by [2,7,8], and is surveyed in more recent papers and books such as [9,10]. The Fisher information metric, which provides a canonical Riemannian structure on parametric statistical models, allows the intrinsic quantification of statistical distinguishability and the derivation of sharp risk bounds. This line was developed in [1], where intrinsic versions of classical results, such as Cramér–Rao inequalities, were established under regularity conditions. Building on these foundations, the notion of global efficiency has recently attracted renewed attention, emphasizing the behavior of estimators not only locally but across whole regions of the parameter space. This is particularly relevant since the interplay between geometry and physics has been further enriched by applications of Fisher information to variational principles in classical and quantum mechanics; see [11,12,13].
The specific aim of this paper is to derive lower bounds for two global risk measures of an estimator over a subset of the parameter space, under the intrinsic geometry induced by the Fisher information: the average risk and the maximum risk. In the Introduction, we outline the setting of the problem and recall some results on local risk bounds, which will later be applied to obtain global bounds. Related work appears in [2,3].
1.1. The Framework
Let be a sample space, a σ-algebra of subsets of and a σ-finite positive measure on . A parametric statistical model is defined as the triple , where is a measure space, is a topological space, known as the parameter space, and f is a non-negative measurable map such that is a probability measure on , . Here, is referred to as the reference measure and f as the model function.
For simplicity, in this paper we shall focus on the case in which is an open connected subset of . In this setting, it is customary to use the same symbol to denote both the points in and their coordinate representations. Adopting this convention, we shall present the results in this familiar form hereafter, even though the statements can be formulated in greater generality.
Additionally, we assume that the model function f satisfies certain regularity conditions:
- 1. When x is fixed, the real function is a function on the manifold .
- 2. The functions in x, , are linearly independent and belong to for a suitable .
- 3. The partial derivatives of the required orders and the integration of with respect to can always be interchanged.
- 4. The model is identifiable: the map , with , is one-to-one.
Within this framework, the probabilistic mechanism that generates the data under analysis can be equivalently described as a probability measure, a density function or a parameter, that is, a point in the parametric manifold . When all these conditions are satisfied, we shall say that the parametric statistical model is regular. We shall start by considering that is a Riemannian manifold with an arbitrary fundamental tensor h in with components , although in this case it is well known that the parameter space has a natural Riemannian structure, induced by the probability measures, called the information metric, whose fundamental tensor components, , are the components of the Fisher information matrix. For further details, see [1,6,7,8,10], among many others.
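For completeness, the components of the Fisher information matrix have the usual expression (the coordinate labels $\theta^1,\dots,\theta^n$ are introduced here only for reference; the definition itself is standard):

$$ g_{ij}(\theta) \;=\; E_\theta\!\left[\frac{\partial \log f(x,\theta)}{\partial \theta^{i}}\,\frac{\partial \log f(x,\theta)}{\partial \theta^{j}}\right], \qquad i,j=1,\dots,n. $$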
In this context, given a sample size k, an estimator for the true parameter , that is, the parameter corresponding to the true probabilistic mechanism that has generated the sample data, is a measurable map , assuming that the probability measure on is .
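In symbols, and with the product-space notation chosen here for illustration rather than taken from the model above, an estimator based on k independent observations is a measurable map

$$ U \colon \bigl(\mathcal{X}^{k}, \mathcal{A}^{\otimes k}\bigr) \longrightarrow \Theta, $$

the data being governed by the k-fold product of the true probability measure.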
1.2. Local Bounds
Let be the components of the metric tensor of the Riemannian metric on and the components of the information metric on . Then consider the Levi–Civita connection associated with and , and estimators such that B is a field on . Let , be the tangent space at ; for each we define where d is the Riemannian distance and is a geodesic defined in an open interval that contains zero, such that and with a tangent vector at equal to . Now, if we set the following, and it is known that is a diffeomorphism which maps onto (see Hicks [14]). We have
Theorem 1
(Riemannian Cramér–Rao lower bound).
Let be an estimator for a sample size k, corresponding to an n-dimensional regular parametric family of density functions. Assume that the manifold Θ is simply connected and that . Let us assume that the mean squared Riemannian distance given by , between the true parameter and an estimate, , exists and that the covariant derivative of B can be obtained by differentiating under the integral sign. Then
where and represent the divergence operator.
Proof. Let C be any vector field. Then, applying the Cauchy–Schwarz inequality twice, where and denote, respectively, the inner product and the norm defined on each tangent space. Let , where is the gradient operator. Taking expectations and using the repeated index convention, Furthermore, we also have and Thus, but . Moreover, Then the theorem follows. □
Remark 1.
We can choose a geodesic spherical coordinate system with origin ; under this coordinate system, we have the following, where g is the determinant of the metric tensor. Then Bishop's comparison theorems (see [15]) can be used to estimate . In the Euclidean case, and thus . When the sectional curvatures are non-positive, we obtain and therefore . Finally, when the supremum of the sectional curvatures, , is positive and the diameter of the manifold satisfies , we have and then we obtain . In any case, , with or , depending on the sign of the sectional curvature.
Corollary 1.
Suppose there is a global chart such that . Identifying the points with their coordinates, we have
where MSE is the mean squared error, and we are using the repeated index summation convention.
Proof. It follows straightforwardly from the previous theorem and the facts that d is the Euclidean distance, and . □
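As a point of comparison, and as a standard fact rather than a statement taken from the present paper, in the unbiased case the classical multivariate Cramér–Rao inequality yields

$$ \mathrm{MSE}(\hat\theta) \;=\; E_\theta\bigl[\lVert \hat\theta-\theta\rVert^{2}\bigr] \;\ge\; \frac{1}{k}\,\operatorname{tr}\bigl(G(\theta)^{-1}\bigr), $$

where $G(\theta)$ is the Fisher information matrix of a single observation.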
Corollary 2
(Intrinsic Cramér–Rao lower bound).
If , we have
where ρ is the Rao distance.
Proof. If the Riemannian metric is the Fisher metric, the corresponding Riemannian distance is known as the Rao distance and . □
1.3. Global Bounds
Whatever loss function is considered, it is well known that, in general, there is no estimator whose risk function is uniformly smaller than that of any other. Therefore, given an estimator, it seems reasonable, in order to measure its performance over a certain region of the statistical model, to compute the integral of the risk and then to divide this quantity by the Riemannian volume of the region considered. In what follows, we take the Rao distance as the loss function and the Fisher metric as the Riemannian metric. This is the Intrinsic Analysis framework.
Let be a measurable subset with , where V is the Riemannian measure. We denote the Riemannian average of the mean squared Rao distance by
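Written out, and with the symbols $U$, $\rho$, $W$ and $V$ chosen here rather than taken from the display above, this average reads

$$ \frac{1}{V(W)}\int_{W} E_\theta\!\left[\rho^{2}(U,\theta)\right]\, dV(\theta), $$

that is, the mean squared Rao distance integrated over $W$ and normalized by the Riemannian volume of $W$, as described in the previous paragraph.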
The performance index obtained is a weighted average of the mean squared distance. This approach is compatible with a Bayesian point of view: a uniform prior with respect to the Riemannian volume is a kind of noninformative prior (see [16]). It can be shown (see [17]) that when the parameter space is a locally compact topological group, this Riemannian volume is a left-invariant Haar measure and is unique up to a multiplicative constant. In any case, this volume is invariant under any group that leaves the parametric family of densities invariant. In the first part of the paper we derive lower bounds for this global index on balls of radius R.
Another way to measure the global behavior of an estimator is to consider the maximum risk in a region of the parameter space. This is a minimax approach. The last part of the paper is devoted to obtaining lower bounds for the maximum risk.
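In the same (assumed) notation, the quantity considered in Section 3 is the local minimax risk

$$ \sup_{\theta\in W} E_\theta\!\left[\rho^{2}(U,\theta)\right]. $$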
2. Variational Methods to Obtain Global Bounds
As we shall show, variational methods can be used to obtain global bounds. A previous study in this direction can be found in [2]. The idea is to consider the integral of the local bounds for the Rao distance given above, when the Riemannian metric is the Fisher metric, over a submanifold with boundary . That is, where we take if the sectional curvatures are non-positive and in the other case. The above functional depends only on B, and we can attempt to find the vector field B that minimizes it. Since the minimum we obtain is over a class of vector fields larger than that of bias vector fields, this method gives a lower bound for the average of the mean squared Rao distance.
Lemma 1.
The field B minimizes the functional
and the minimum value is given by
Proof. Consider the first variation , where is an arbitrary field. Then it is easy to see that and Thus, the functional is strictly convex, and the stationary point is a global minimum. Now, condition is equivalent to Taking into account that we obtain We can therefore write the stationary condition as and, by the Gauss divergence theorem, where is the Riemannian measure on . Then (4) follows from the fact that the previous equality must hold for any . We now turn to the second part of the lemma. By the first stationary condition (4) and by (6), we obtain and, substituting it in , Now, by the second stationary condition in (4), It is clear that ; then, since on and it turns out that . Finally, by the Gauss divergence theorem we obtain the second equality in (5). □
Remark 2.
Note that the minimum value of depends only on , and that satisfies the partial differential equation as is easy to check from (4). We have solved this boundary value problem in the case where is a ball of radius R and the sectional curvatures are constant, .
Theorem 2.
When the parametric statistical model is a manifold of constant sectional curvature , we have the following lower bound for the average of the mean squared Rao distance, in balls of radius R such that :
where
and
Proof. By symmetry and uniqueness, the solution of the boundary problem in , depends only on the distance to the center of . Then, taking geodesic spherical coordinates with origin at the center of , since (see the Appendix), we have We can then write Let and ; taking into account that we obtain Moreover, since it turns out that If we try , since we have that is, Now, if , then Thus, and where is determined by the condition . It is easy to see that this series converges iff . This is always true in the case of non-negative sectional curvatures. Furthermore, we have to evaluate . In spherical coordinates (see the Appendix), Since , we find that and □
Corollary 3.
When the parametric statistical model is a Euclidean manifold, we have the following lower bound for the Riemannian average of the mean squared Rao distance on a ball of radius R:
where is a generalized hypergeometric function (see (A2) of the Appendix).
If the Euclidean manifold M is complete and simply connected, we obtain the following lower bound in the manifold:
Proof. This is a particular case of the previous theorem with . The second part follows by taking the limit as in (12). □
Example 1.
As an example, consider the n-variate normal distribution with known covariance matrix . Given a sample of size k, the Riemannian average of the mean squared Rao distance corresponding to the sample mean is , which coincides with the previous bound.
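A quick Monte Carlo check of this example can be sketched as follows; the dimension, sample size, covariance matrix and all variable names are chosen here purely for illustration. For a normal model with known covariance Σ, the Fisher metric on the mean parameter is Σ⁻¹, so the Rao distance is the Mahalanobis distance and the sample mean should give a mean squared Rao distance close to n/k.

```python
import numpy as np

# Hypothetical illustration: for X_1,...,X_k ~ N(mu, Sigma) with Sigma known,
# the Fisher metric on the mean is Sigma^{-1}, the Rao distance is the
# Mahalanobis distance, and the sample mean attains E[rho^2] = n/k.
rng = np.random.default_rng(0)
n, k, reps = 3, 20, 200_000

A = rng.normal(size=(n, n))
Sigma = A @ A.T + n * np.eye(n)      # an arbitrary positive-definite covariance
Sigma_inv = np.linalg.inv(Sigma)
mu = rng.normal(size=n)              # "true" mean parameter

samples = rng.multivariate_normal(mu, Sigma, size=(reps, k))   # shape (reps, k, n)
mu_hat = samples.mean(axis=1)                                  # sample means
diff = mu_hat - mu
rho2 = np.einsum("ri,ij,rj->r", diff, Sigma_inv, diff)         # squared Rao distances

print(np.mean(rho2), n / k)          # both values should be close to 0.15
```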
In the case , the manifold is Euclidean and we can apply the previous result; we obtain which coincides with the result already obtained by Chentsov [2]. In fact, if we take a Cartesian coordinate system with origin and try to solve the variational problem for a cube with center , we have to solve the Dirichlet problem: If we try for suitable real-valued functions , we obtain Obviously, a solution is given by , with g such that The solution of the last equation is and then This provides the bound which improves upon the result given by Chentsov [2]. By Corollary 1, we can also give, in the general non-Euclidean case (for a fixed coordinate system), a bound for the mean squared error (MSE) of the form where is an upper bound of in . We can also give lower bounds in the general case.
Theorem 3.
When the parametric statistical model is a manifold with sectional curvatures bounded from above by , we have the following lower bound for the average of the mean squared Rao distance:
where is the area of the n-dimensional sphere of radius R, its volume, and is the solution of the boundary problem (11) in on a manifold of constant sectional curvature .
Proof. Consider geodesic spherical coordinates . Let be the solution to the boundary problem (11) on a manifold with sectional curvatures bounded from above by . Let be the solution to the same problem, but on a manifold of constant sectional curvature , which, as we know, depends only on the radial coordinate. Then By Bishop's comparison theorems, we have and, since we have with the Laplacian for the constant sectional curvature case. Thus, Now, since , we can apply the comparison theorem for elliptic differential equations ([18], Theorem 6, p. 243). We find that and, since equality holds on the boundary, Finally, by (7) and (4), and the theorem follows. □
Remark 3.
The estimates for the volumes of balls given in the Appendix are useful for obtaining a final expression for these bounds. Note that if the sectional curvatures are bounded from below by κ and from above by , then, by Proposition A3, we have
3. Lower Bounds for the Maximum Risk
Even though the Riemannian average of the risk could be used to derive bounds for the maximum risk, sharper minimax bounds can be obtained more directly.
Lemma 2.
Let X be a smooth field on Θ such that , let f be a non-negative function on Θ, and let W be a submanifold with boundary in Θ. Then
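Results of this kind rest on the Riemannian divergence theorem; with ν denoting the outward unit normal on ∂W and dA the induced boundary measure (notation chosen here), it reads

$$ \int_{W}\operatorname{div}(fX)\,dV \;=\; \int_{W}\bigl(f\,\operatorname{div}X + \langle \operatorname{grad} f, X\rangle\bigr)\,dV \;=\; \int_{\partial W} f\,\langle X,\nu\rangle\,dA. $$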
Theorem 4.
We have the following lower bound for the local minimax risk of an estimator on W
Proof. By the previous lemma, if , then, integrating (14) with respect to and applying Fubini's theorem, we have Thus, □
Corollary 4.
When the parametric statistical model is an n–dimensional Euclidean manifold, we have the following lower bound for the local minimax risk:
where . If the Euclidean manifold M is complete and simply connected, we obtain the following lower bound over the manifold:
Proof. Since we have and thus The second statement follows by taking the limit as . □
We can also use the previous lemma to derive bounds for the average of the mean squared Rao distance.
Theorem 5.
We have the following lower bound for the Riemannian average of the mean squared Rao distance in :
where if the sectional curvatures are non-positive and if the supremum of the sectional curvatures, , is positive.
Proof. Consider (16) for , and integrate it with respect to from 0 to R. Now, taking into account that is a positive monotonically increasing function of r, and we obtain and the theorem follows. □
Corollary 5.
When the parametric statistical model is a Euclidean manifold, we have the following lower bound for the Riemannian average of the mean squared Rao distance on :
If the Euclidean manifold Θ is complete and simply connected, we obtain the following lower bound over the manifold:
Proof. Since we have and then The second statement follows by taking the limit as . □
Remark 4.
Example 1 shows that the bound obtained here is worse than the variational one as R goes to infinity, but better as R goes to zero:
Remark 5.
Note that global bounds for the average of the mean squared Rao distance also provide bounds for the local minimax risk in an obvious way. It can be shown that these last bounds are sharper than the bounds provided by the variational methods.
Author Contributions
Conceptualization, J.M.C. and J.M.O.; writing—original draft preparation, J.M.C. and J.M.O.; writing—review and editing, J.M.C. and J.M.O. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
The study did not require ethical approval.
Data Availability Statement
No new data were created or analyzed in this study. Data sharing is not applicable to this article.
Conflicts of Interest
The authors declare no conflicts of interest.
Appendix A. Comparison Theorems and Volumes
We can use Bishop’s theorems to obtain the volume of a ball of radius r in a Riemannian manifold whose sectional curvatures are constant and to give bounds for this volume when the sectional curvatures are bounded. We have the following propositions:
Proposition A1.
If the sectional curvatures are constant and equal to , the volume of a Riemannian ball of radius r and center is given by
Proof. We have where is the unit sphere in . On the other hand, by Bishop's comparison theorems, when the sectional curvatures are constant, with Then, integrating this expression, we have However, does not depend on . In fact, where is the surface element of the unit sphere in a Euclidean manifold and, since we conclude that . Thus, we may write and finally, □
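For reference, the resulting volume can also be written in the standard closed form (the function $S_\kappa$ is introduced here; for $\kappa>0$ the formula is valid for $r<\pi/\sqrt{\kappa}$):

$$ V_\kappa(r) \;=\; \operatorname{vol}(S^{n-1})\int_{0}^{r} S_\kappa(t)^{\,n-1}\,dt, \qquad S_\kappa(t)=\begin{cases}\sin(\sqrt{\kappa}\,t)/\sqrt{\kappa}, & \kappa>0,\\ t, & \kappa=0,\\ \sinh(\sqrt{-\kappa}\,t)/\sqrt{-\kappa}, & \kappa<0,\end{cases} $$

where $\operatorname{vol}(S^{n-1})=2\pi^{n/2}/\Gamma(n/2)$ is the area of the Euclidean unit sphere.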
Proposition A2.
When the sectional curvatures are constant and equal to and , we have the following expression for the volume of a Riemannian ball of radius r:
Proof. From the previous proposition, Then, since by the definition of , and, making the change of variable , we have Moreover, there is a relationship between integrals of this kind and generalized hypergeometric functions. These functions are defined by where and z is any complex number if , if , and they diverge for all if (see Abramowitz [19]). The relationship is as follows: This leads to and the proposition is proved. □
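For the reader's convenience, the generalized hypergeometric series referred to above is conventionally defined (see [19]) by

$$ {}_pF_q(a_1,\dots,a_p;\,b_1,\dots,b_q;\,z) \;=\; \sum_{m=0}^{\infty}\frac{(a_1)_m\cdots(a_p)_m}{(b_1)_m\cdots(b_q)_m}\,\frac{z^{m}}{m!}, $$

where $(a)_m=a(a+1)\cdots(a+m-1)$ is the Pochhammer symbol; the series converges for every z when $p\le q$, converges for $|z|<1$ when $p=q+1$, and diverges for all $z\neq 0$ when $p>q+1$, unless it terminates.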
Proposition A3.
Let be the volume of a ball with center θ and radius r, on a manifold with sectional curvatures bounded from below by κ and from above by . Then
where and are, respectively, the volumes of balls of radius r and arbitrary centers in manifolds with constant sectional curvatures κ and .
Proof. If we integrate, from to , the inequalities in Bishop's comparison theorems, we obtain Moreover, and, since we conclude that □
References
- Oller, J.M.; Corcuera, J.M. Intrinsic Analysis of Statistical Estimation. The Annals of Statistics 1995, 23, 1562–1581.
- Chentsov, N.N. Statistical Decision Rules and Optimal Inference; Translations of Mathematical Monographs, Vol. 53; American Mathematical Society: Providence, RI, 1982; translated from the Russian by the Israel Program for Scientific Translations.
- Rao, B.L.S.P. Remarks on Cramer-Rao Type Integral Inequalities for Randomly Censored Data. Lecture Notes–Monograph Series 1995, 27, 163–175.
- Oller, J.M. On an intrinsic analysis of statistical estimation. In Multivariate Analysis: Future Directions 2; Cuadras, C., Rao, C., Eds.; North-Holland Series in Statistics and Probability; North-Holland: Amsterdam, 1993; pp. 421–437.
- Sato, M.; Akahira, M. An information inequality for the Bayes risk. The Annals of Statistics 1996, 24, 2288–2295.
- Rao, C.R. Information and Accuracy Attainable in Estimation of Statistical Parameters. Bulletin of the Calcutta Mathematical Society 1945, 37, 81–91.
- Atkinson, C.; Mitchell, A.F.S. Rao's Distance Measure. Sankhyā: The Indian Journal of Statistics, Series A 1981, 43, 345–365.
- Burbea, J.; Rao, C.R. Entropy differential metric, distance and divergence measures in probability spaces: A unified approach. Journal of Multivariate Analysis 1982, 12, 575–596.
- Nielsen, F. An Elementary Introduction to Information Geometry. Entropy 2020, 22.
- Amari, S. Information Geometry and Its Applications, 1st ed.; Springer, 2016.
- Frieden, B. Science from Fisher Information: A Unification, 2nd ed.; Cambridge University Press: Cambridge, 2004.
- Brody, D.C.; Hughston, L.P. Statistical geometry in quantum mechanics. Proc. R. Soc. Lond. A 1998, 454, 2445–2475.
- Bernal-Casas, D.; Oller, J.M. Variational Information Principles to Unveil Physical Laws. Mathematics 2024, 12.
- Hicks, N. Notes on Differential Geometry; Van Nostrand Mathematical Studies; Van Nostrand, 1965.
- Chavel, I. Eigenvalues in Riemannian Geometry; Elsevier, 1984.
- Jeffreys, H. An invariant form for the prior probability in estimation problems. Proceedings of the Royal Society of London, Series A 1946, 186, 453–461.
- Berger, J.O. Statistical Decision Theory and Bayesian Analysis; Springer: New York, 1985.
- Rauch, J. Partial Differential Equations; Graduate Texts in Mathematics, Vol. 128; Springer-Verlag: New York, 1991.
- Abramowitz, M.; Stegun, I.A. Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables; Dover: New York, 1964.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).