3. Minimum Riemannian risk estimators
In the framework of intrinsic analysis, the loss function is the square of the Rao distance, that is, the Riemannian distance induced by the information metric on the parameter space Θ. Once the class of equivariant estimators has been determined, a natural question arises: which equivariant estimator minimizes the risk?
First of all, we summarize the basic geometric results corresponding to the model (1) which are going to be used hereafter. We use a standardized version of the information metric: the usual information metric of this linear model divided by the constant factor n, i.e. the number of rows of the design matrix. This metric is given by
which is, up to a linear coordinate change, the Poincaré hyperbolic metric of the upper half space, see [13]. The Riemannian curvature
is constant and negative and the unique geodesic, parameterized by the arc–length, which connects two points
and
, when
, is given by:
where
s is the arc–length,
and
are
vectors whose components, and also
, are convenient real integration constants, such that
,
,
,
being
the Riemannian distance between
and
. Finally,
K is given by
. When
, the geodesic is given by
where
B is a positive integration constant.
The Rao distance
between the points
and
is
where
and
or, equivalently,
Let
be the inverse of the exponential map corresponding to Levi-Civita connection and
its components corresponding to the basis field
. Then, we have
It is well known that the Riemannian distance induced by the information metric is invariant under equivariant estimator transformations. We shall supply a direct, alternative proof for the linear model setting.
Proposition 2.
The Rao distance ρ given by (11) is invariant under the action of the induced group by on the parameter space, . In other words
Proof: Observe that
and taking into account that
is the projection matrix into
F, we have
Therefore
and the invariance of
and
trivially follows. □
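As a numerical illustration of this kind of invariance, the following minimal sketch uses the standard Poincaré upper half-space distance (curvature −1), which the text relates to the standardized information metric only up to a linear coordinate change. It is therefore not the Rao distance (11) itself. The function names `halfspace_distance` and `act`, the sample points, and the choice of transformation (a positive scaling combined with a horizontal translation) are assumptions made for illustration.

```python
import math

def halfspace_distance(p, q):
    """Distance between p = (x, t) and q = (y, u) in the standard Poincare
    upper half-space, where x, y are coordinate lists and t, u > 0."""
    (x, t), (y, u) = p, q
    sq = sum((xi - yi) ** 2 for xi, yi in zip(x, y)) + (t - u) ** 2
    return math.acosh(1.0 + sq / (2.0 * t * u))

def act(a, b, p):
    """Hypothetical group element: scale by a > 0, then translate the
    horizontal coordinates by b. This map is an isometry of the half-space."""
    x, t = p
    return ([a * xi + bi for xi, bi in zip(x, b)], a * t)

# Two arbitrary points (illustrative values only)
p = ([0.3, -1.2], 0.7)
q = ([1.5, 0.4], 2.0)

d_before = halfspace_distance(p, q)
d_after = halfspace_distance(act(3.5, [2.0, -1.0], p),
                             act(3.5, [2.0, -1.0], q))
print(d_before, d_after)  # the two values agree up to rounding error
```

The scaling multiplies the squared Euclidean separation and the product of the heights by the same factor, so the ratio inside `acosh` is unchanged; this is the same mechanism by which a distance of this form stays invariant under a scale and location group.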
Proposition 3.
acts transitively on Θ.
Proof: The transitivity follows observing that a is an arbitrary positive real number and is the projection matrix into F with . □
Since
, and thus
, is invariant under the action of
and
acts transitively on
, the distribution of
does not depend on
, and therefore, the risk of any equivariant estimator remains constant and independent of the target parameter provided that this risk is finite. More precisely, observe that if we let
from (1) and (4) we clearly have that
with a rank
m idempotent covariance matrix, and
and
are independent random variables following a chi-square distribution with
m and
degrees of freedom, equal to the dimensions of
F and
since
and
are quadratic forms based on the projection matrices on these subspaces of
and
(or
) and
are independent random vectors. Therefore, since
and
, we have that
and
or
whose distributions depend only on
and
, independent random variables with fixed distribution, whatever the value of
.
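The distributional facts used above can be checked numerically. The sketch below builds the orthogonal projection onto the column space F of a randomly generated design matrix and the projection onto its orthogonal complement, then simulates the two quadratic forms for standard normal vectors. It assumes, as is standard for this model, that the second subspace is the orthogonal complement of F in R^n, so that the second number of degrees of freedom is n − m. The matrix `X`, the sizes `n`, `m`, and the sample size are hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 12, 3

# Hypothetical full-rank design matrix; F is its column space
X = rng.standard_normal((n, m))
P = X @ np.linalg.solve(X.T @ X, X.T)   # orthogonal projection onto F
Q = np.eye(n) - P                       # projection onto the complement of F

# Simulate standard normal vectors (one per row) and the quadratic forms
Z = rng.standard_normal((50_000, n))
U = ((Z @ P) * Z).sum(axis=1)           # z' P z, chi-square with m d.f.
V = ((Z @ Q) * Z).sum(axis=1)           # z' Q z, chi-square with n - m d.f.

print("rank of P:", round(np.trace(P)), " rank of Q:", round(np.trace(Q)))
print("sample means:", U.mean(), V.mean())
print("sample correlation:", np.corrcoef(U, V)[0, 1])
```

Since a chi-square variable has mean equal to its degrees of freedom, the sample means should be close to m and n − m, and the sample correlation should be close to zero, consistent with the independence of the two projected vectors.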
Since the risk of any equivariant estimator remains constant on the parameter space, it is enough to examine it at one point, for instance at the point . Let us denote the expectation with respect to the n–variate linear normal model by and by E the . We can prove the following propositions.
Proposition 4.
for any .
Proof: From (14) and (13), since
we have
Expanding the square of the difference and taking into account that the standard Euclidean norm of a vector is less than or equal to the sum of the absolute values of its components, we obtain
Notice that both bounds (22) and (23) are invariant under the action of the induced group on the parameter space.
As we mentioned before, from [14], it is enough to prove that the risk is finite at
. Taking into account (16), it follows from (23) that
Observe that if
Q has a chi-square distribution with
k degrees of freedom
Therefore, since
and
are independent random variables following a central chi-square distribution with
m and
degrees of freedom, we have
and
Then, taking the average, it follows that
Since n and m are positive integers being , we conclude that the risk is finite if . □
Proposition 4 is a sufficient condition for the existence of the Riemannian risk of the equivariant estimator
, thus
is well defined for
, which we shall assume hereafter.
Proposition 5.
There exists a unique minimizer of the Riemannian risk given by Φ.
Proof:
Let us consider the Riemannian risk at
as a function of
s, that is
The particular selection of
, from which
F follows, relies on the Riemannian structure of
induced by the information metric. The Riemannian curvature is constant and equal to
and taking into account (10) we have that
is a geodesic in
; precisely, a geodesic parameterized by the arc–length, see [13] for further details.
Then, following [15], the real-valued function
is strictly convex. Since almost sure convexity of a stochastic process carries over to the mean of the process, the map
F is strictly convex as well.
On the other hand, from Fatou’s Lemma as or . This, together with the strict convexity of F, yields the existence of a unique minimizer of the function F, which depends on n and m.
Finally, since the map is a strictly monotonic function, there must exist a unique , namely , such that . □
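The structure of this argument (an average of strictly convex functions that grows without bound at both ends has exactly one minimizer) can be illustrated with a toy example. The sketch below does not use the risk function F of the proof; `toy_risk` is a hypothetical stand-in built from hyperbolic cosines, chosen because its minimizer has a closed form against which the search can be checked. The golden-section search is likewise only one possible numerical method, not necessarily the one used in the next section.

```python
import math

def golden_section_min(f, lo, hi, tol=1e-9):
    """Locate the minimizer of a strictly convex function f on [lo, hi]
    by golden-section search (bracket shrinks by a fixed ratio per step)."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:
            # minimizer lies in [lo, d]; reuse c as the new right probe
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = f(c)
        else:
            # minimizer lies in [c, hi]; reuse d as the new left probe
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = f(d)
    return 0.5 * (lo + hi)

# Hypothetical sample; each term cosh(s - z_i) is strictly convex in s
z = [-0.8, 0.1, 0.4, 1.7, 2.2]

def toy_risk(s):
    """Toy stand-in for a risk: an average of strictly convex functions."""
    return sum(math.cosh(s - zi) for zi in z) / len(z)

s_star = golden_section_min(toy_risk, -10.0, 10.0)

# Closed form: the derivative vanishes where tanh(s) = sum(sinh z) / sum(cosh z)
closed_form = math.atanh(sum(map(math.sinh, z)) / sum(map(math.cosh, z)))
print(s_star, closed_form)
```

Strict convexity is what makes the bracketing step valid: comparing the function at two interior points always identifies a subinterval that still contains the unique minimizer.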
This result guarantees the uniqueness of the MIRE, although a numerical analysis is required to obtain it explicitly (see next section). It is therefore useful to develop a simple approximate estimator, hereafter referred to as
a-MIRE, obtained by minimizing a convenient upper bound of
. Since
we shall have
and therefore
the upper bound
is clearly a convex function with an absolute minimum attained when
satisfies
Furthermore, given an arbitrary
m, we have
and, therefore,
a-MIRE is very close to MLE for large values of
n. Observe also that it is possible to compute
a-MIRE for
, a condition which is slightly stronger than the one required for the existence of the MIRE in Proposition 4.
A further aspect is the intrinsic bias of the equivariant estimators. Connections between minimum risk, bias and invariance have been established, see [14]. Since the action of the group G is not commutative, we cannot guarantee the unbiasedness of the MIRE, and an additional analysis must be performed. First of all, we compute the bias vector, see [4], a quantitative measure of the bias which is compatible with Lehmann’s results.
Let
and
be the components of
corresponding to the basis field
,
. With matrix notation,
. Furthermore, let us define
for
and
; taking into account (16) and (19), and from (11) and (15), we have
where
and
Let be the intrinsic bias vector corresponding to an equivariant estimator evaluated at the point and let be its components. In matrix notation, . We have
Proposition 6.
If , the bias vector is finite and
where and are independent random variables following a chi-square distribution with m and degrees of freedom respectively.
Moreover, the square of the norm of the bias vector is constant and given by
Proof: Observe that if
we have
where
denotes the Riemannian norm at the tangent space at
.
On the other hand, taking into account (31) and defining
as in (16), observe that
,
is independent of
and
and
has the same distribution. Then we have
and since
it follows that
.
is obtained directly from (31). The distributions of
and
follow from basic properties of the multivariate normal distribution. Finally, the norm of the bias vector field follows from (32) and (8).
□
We may remark, finally, that the norm of the bias vector field of any equivariant estimator,
, is invariant under the action of the induced group,
, on the parameter space and since this group acts transitively on
, this quantity must be constant, which is clear from (33).