Preprint
Article

This version is not peer-reviewed.

Chordal Metric Formula Between Generalized Singular Values of Grassmann Matrix Pairs by Riemannian Optimization Models

Submitted: 24 February 2025
Posted: 25 February 2025


Abstract
In this paper, we present a new explicit formula for the sum of chordal distances between the generalized singular values of Grassmann matrix pairs, based on Riemannian optimization models. The new formula involves small-size unitary matrices and real orthogonal matrices derived from Riemannian optimization models. We then apply Newton's method on Riemannian manifolds to solve efficiently for the variable matrices involved. The new explicit formula provides significant improvements over the existing theoretical and computational upper bounds.

0. Introduction

The generalized singular value decomposition (GSVD) is useful in matrix computation problems and practical applications, such as the general Gauss-Markov linear model, the Kronecker canonical form of a general matrix pencil, the generalized total least squares problem, and gene data analysis [1,2,3,4,5,6,8,9,10,11,12]. In particular, the GSVD and generalized singular values (GSVs) provide a comparative mathematical framework for two genome-scale expression data sets, as seen in [1,2,3,4,22].

0.1. GSVD and Its Applications

More precisely, GSVD induces a linear transformation of two data sets from the two genes × arrays spaces to two reduced and diagonalized "genelets" × "arraylets" spaces. A single microarray probes the relative expression levels of p genes in a single sample. A series of q_1 arrays probes the genome-scale expression levels in q_1 different samples, i.e., under q_1 different experimental conditions. Let the matrix A, of size p-genes × q_1-arrays, denote the full expression data, whose k-th row is the expression of the k-th gene across the different samples that correspond to the different arrays. Let the matrix B, of size t-genes × q_2-arrays, denote the relative expression levels of t genes under q_2 = q_1 = q < max{p, t} experimental conditions that correspond one to one to the q_1 conditions underlying A. GSVD induces a simultaneous linear transformation of the two expression data sets A and B from the p-genes × q-arrays and t-genes × q-arrays spaces to two reduced q-genelets × q-arraylets spaces.
Let A and B have the following GSVD: A = UΣ_A R^{−1}, B = VΣ_B R^{−1}. In these spaces the data are represented by the diagonal nonnegative matrices Σ_A and Σ_B, whose entries we denote ⟨k|Σ_A|l⟩ = ε_{1,l} δ_{kl} and ⟨k|Σ_B|l⟩ = ε_{2,l} δ_{kl} for all 1 ≤ k, l ≤ q, where δ_{kl} = 1 if k = l and δ_{kl} = 0 if k ≠ l. By ⟨k|Σ_A we denote the k-th row of the matrix Σ_A, which lists the expression of the k-th gene across the different samples that correspond to the different arrays. By Σ_A|l⟩ we denote the l-th column of the matrix Σ_A, which lists the genome-scale expression measured by the l-th array. The antisymmetric angular distance between the data sets [3], θ_l = arctan(ε_{1,l}/ε_{2,l}) − π/4, indicates the relative significance of the l-th genelet, i.e., its significance in the first data set relative to that in the second, in terms of the ratio of the expression information captured by this genelet in the first data set to that in the second. An angular distance of 0 indicates a genelet of equal significance in both data sets, with ε_{1,l} = ε_{2,l}.
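The angular-distance diagnostic is straightforward to compute once the diagonals of Σ_A and Σ_B are available. Below is a minimal Python sketch; the function name and the sample values are ours, chosen only for illustration:

```python
import numpy as np

def angular_distances(eps1, eps2):
    """Antisymmetric angular distance theta_l = arctan(eps1_l / eps2_l) - pi/4
    for each genelet l; arctan2 also handles eps2_l = 0 gracefully."""
    eps1 = np.asarray(eps1, dtype=float)
    eps2 = np.asarray(eps2, dtype=float)
    return np.arctan2(eps1, eps2) - np.pi / 4

# theta_l near +pi/4 flags genelets significant mainly in the first data set,
# near -pi/4 mainly in the second, and 0 equal significance in both.
theta = angular_distances([0.9, 0.5, 0.1], [0.1, 0.5, 0.9])
```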
Alter et al. [1] made a comparative analysis of gene mRNA expression data sets by transforming them into GSVs and analyzed the degree of change in gene expression levels, in the reduced and diagonalized "genelets" × "arraylets" spaces, for samples under different experimental conditions. This behavior is significant for guiding further experimental research. Motivated by practical applications involving the proximity and range of GSVs, Xu et al. [22] considered the upper bound on the sum of chordal metrics between generalized singular values of Grassmann matrix pairs and applied this result to the comparative analysis of gene mRNA expression data.

0.2. Riemannian Optimization

Optimization on Riemannian manifolds is called Riemannian optimization. Unconstrained optimization methods on Euclidean space, such as steepest descent, conjugate gradient, and Newton's methods, generalize to methods on a Riemannian manifold. The term "optimization on manifolds" is applied to algorithms that preserve the manifold constraints during iterations. Generally speaking, preserving constraints has advantages in many situations, for example, when the cost function is undefined or of little use outside the feasible region, or when the algorithm is terminated prior to convergence yet a feasible solution is required. Theories and methods for optimization on manifolds date back to the 1970s, and algorithms for specific problems, e.g., various eigenvalue problems, appeared even earlier. Most of the general-purpose algorithms, however, did not appear until the 1990s [30]. While it is trivial to generate trial points in R^n along straight search lines, it is not as easy to do so on a curved manifold. A natural choice is the geodesic, the analog of a straight line: it has the shortest length between two points on the manifold. For more details we refer readers to [29,30,31,32].
In this paper we provide an explicit formula for the sum of chordal metrics between generalized singular values of Grassmann matrix pairs via Riemannian optimization models, which strictly improves on the existing upper bounds in [22] in theory. The new formula involves two small-size unitary variable matrices from Riemannian optimization models. We then use Newton methods on Riemannian manifolds to solve the involved optimization models and obtain these two variable matrices efficiently. The contributions of this paper are summarized as follows.
  • On the theoretical side, we provide a new explicit formula for the sum of chordal metrics between generalized singular values of Grassmann matrix pairs via Riemannian optimization models, instead of the existing upper bounds.
  • On the algorithmic side, the new formula involves only two small-size unitary variable matrices from Riemannian optimization models, and Newton methods on Riemannian manifolds are given for computing them, which reduces the required computational cost.

0.3. Notations

Throughout this paper, by R, C, R^n, C^{m×n}, U_n and OR_n we denote the sets of real numbers, complex numbers, real vectors of order n, m × n matrices with entries in C, n × n unitary matrices, and n × n real orthogonal matrices, respectively. The symbols | · | and Re(·) stand for the absolute value and the real part of a complex number, respectively. The symbols I_n and O_{m×n} represent the identity matrix of order n and the m × n zero matrix, respectively. For a square matrix A = (a_{ij}) ∈ C^{n×n}, by A^T, A^H, A^{−1}, and tr(A) we denote the transpose, conjugate transpose, inverse, and trace of A, respectively. We use ‖ · ‖_2 to denote the spectral norm of a matrix. The singular value set of A is denoted by σ(A). A complex number a + bı (a, b ∈ R), identified with the real pair (a, b), corresponds one to one to a point on the coordinate plane; the plane in this one-to-one correspondence with the complex numbers is called the Gauss plane, denoted by P(1, 1). For given matrices A, B ∈ C^{n×n}, A < (≤) B means that B − A is a positive (semi-)definite matrix. As in [9,15,16,17], we also use the chordal metric on the Riemann sphere to measure the difference between two GSVs. The pair (α, β) can be regarded as a point in the Gauss plane P(1, 1). For two points (α, β) ≠ (0, 0) and (γ, δ) ≠ (0, 0) in the Gauss plane P(1, 1), their chordal distance is defined by
ρ((α, β), (γ, δ)) = |αδ − βγ| / √((|α|² + |β|²)(|γ|² + |δ|²))    (1)
to measure the difference between the two points (α, β) and (γ, δ). Next, we outline some definitions related to the GSVD; readers may refer to [7,9,16] for details.
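Before turning to the definitions, here is a minimal numerical companion to (1); the helper name is ours:

```python
import numpy as np

def chordal_distance(alpha, beta, gamma, delta):
    """Chordal distance (1) between nonzero points (alpha, beta) and (gamma, delta)
    of the Gauss plane; works for real or complex inputs."""
    num = abs(alpha * delta - beta * gamma)
    den = np.sqrt((abs(alpha)**2 + abs(beta)**2) * (abs(gamma)**2 + abs(delta)**2))
    return num / den

# rho is invariant under scaling of either point, and 0 <= rho <= 1:
assert np.isclose(chordal_distance(1, 0, 3, 0), 0.0)   # same point up to scale
assert np.isclose(chordal_distance(1, 0, 0, 1), 1.0)   # maximally distant points
```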
Definition 1. 
The matrix pair {A, B} with A, B ∈ C^{n×n} is said to be regular if there exists ζ ∈ C such that det(A + ζB) ≠ 0. Denote by
G_{1,2} = {(α, β) ≠ (0, 0) : α, β ∈ C}.
The pair (α, β) ∈ G_{1,2} is called a generalized eigenvalue of a regular pair {A, B} if det(βA − αB) = 0.
The set of generalized eigenvalues of the regular pair {A, B} is denoted by λ(A, B).
Definition 2. 
Let A ∈ C^{m×n} and B ∈ C^{p×n}. The matrix pair {A, B} is an (m, p, n)-Grassmann matrix pair (GMP) if rank((A^T, B^T)^T) = n.
The GSVD has several formulations in the literature. In this paper we adopt the following form, as in [9,16,19].
Definition 3. 
Let {A, B} be an (m, p, n)-GMP. Then there exist unitary matrices U ∈ C^{m×m}, V ∈ C^{p×p}, and a nonsingular matrix R ∈ C^{n×n} such that
U^H A R = Σ_A,  V^H B R = Σ_B,    (2)
Σ_A = diag(Λ, O_{(m−r−s)×(n−r−s)}),  Σ_B = diag(O_{(p+r−n)×r}, Ω),    (3)
where O_{(m−r−s)×(n−r−s)} and O_{(p+r−n)×r} are zero matrices, and
Λ = diag(α_1, …, α_{r+s}),  Ω = diag(β_{r+1}, …, β_n),
with
1 = α_1 = ⋯ = α_r > α_{r+1} ≥ ⋯ ≥ α_{r+s} > α_{r+s+1} = ⋯ = α_n = 0,
0 = β_1 = ⋯ = β_r < β_{r+1} ≤ ⋯ ≤ β_{r+s} < β_{r+s+1} = ⋯ = β_n = 1,
and
α_i² + β_i² = 1,  1 ≤ i ≤ n.
The set σ{A, B} = {(α_i, β_i)}_{i=1}^n is the set of generalized singular values (GSVs) of {A, B}.
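For concreteness, the pairs (α_i, β_i) of Definition 3 can be computed without a dedicated GSVD routine by the standard QR-then-SVD (CS decomposition) route. The sketch below assumes [A^T, B^T]^T has full column rank; the helper name is ours, not a library API:

```python
import numpy as np

def generalized_singular_values(A, B):
    """GSVs (alpha_i, beta_i) of the GMP {A, B}, alpha descending, alpha^2+beta^2=1.

    Sketch: stack Z = [A; B], take a thin QR so that Q has orthonormal columns,
    then alpha_i are the singular values of the top (A-) block of Q."""
    m, n = A.shape
    Z = np.vstack([A, B])
    Q, _ = np.linalg.qr(Z)                            # thin QR of the stacked pair
    alpha = np.linalg.svd(Q[:m, :], compute_uv=False)
    if alpha.size < n:                                # if m < n, the missing alphas are 0
        alpha = np.concatenate([alpha, np.zeros(n - alpha.size)])
    alpha = np.clip(alpha, 0.0, 1.0)
    beta = np.sqrt(1.0 - alpha**2)                    # beta is then ascending
    return alpha, beta
```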
We note that, by Definition 3, given another (m, p, n)-GMP {Ã, B̃}, there also exist unitary matrices Ũ ∈ C^{m×m}, Ṽ ∈ C^{p×p}, a nonsingular R̃ ∈ C^{n×n}, and nonnegative diagonal matrices Σ_Ã ∈ C^{m×n}, Σ_B̃ ∈ C^{p×n} for the GSVD of {Ã, B̃}.
For the GSVD of a real matrix pair {A, B} we adopt the following definition, as in [9,16,19].
Definition 4. 
Let A ∈ R^{m×n}, B ∈ R^{p×n} satisfy rank((A^T, B^T)^T) = n. Then there exist two orthogonal matrices U ∈ R^{m×m}, V ∈ R^{p×p} and a nonsingular matrix R ∈ R^{n×n} such that
U^T A R = Σ_A,  V^T B R = Σ_B,    (4)
Σ_A = diag(Λ, O_{(m−r−s)×(n−r−s)}),  Σ_B = diag(O_{(p+r−n)×r}, Ω),
where Λ = diag(α_1, …, α_{r+s}) and Ω = diag(β_{r+1}, …, β_n) with
1 = α_1 = ⋯ = α_r > α_{r+1} ≥ ⋯ ≥ α_{r+s} > α_{r+s+1} = ⋯ = α_n = 0,  0 = β_1 = ⋯ = β_r < β_{r+1} ≤ ⋯ ≤ β_{r+s} < β_{r+s+1} = ⋯ = β_n = 1,    (5)
and α_i² + β_i² = 1 for 1 ≤ i ≤ n.

0.4. Overview of Existing Results on the Chordal Metric Between GSVs of GMPs

A large number of papers have appeared on various topics related to the GSVD and its applications, e.g., see [5,6,8,9,10,11,12,16,17,18,19,23]. One interesting and important topic concerns the chordal metric between GSVs of Grassmann matrix pairs (GMPs), which was first analyzed by Sun [16] in 1983. In [16], Sun presented the Hoffman-Wielandt theorem for GSVs of GMPs and gave bounds on perturbations of GSVs, which generalized several well-known results for the standard singular value problem. Earlier, Paige and Saunders [15] in 1981 had given a bound for GSV variations. Li [9] in 1993 presented several perturbation bounds for GSVs and their associated subspaces. Recently, Xu et al. [22] considered sharp upper bounds on the chordal metric between generalized singular values of Grassmann matrix pairs. The following results provide perturbation bounds based on the chordal metric of GSVs.
Theorem 1 
(The Hoffman-Wielandt type theorem for GSVs of GMPs [16]). Let {A, B} and {Ã, B̃} be two (m, p, n)-GMPs, σ{A, B} = {(α_i, β_i)}_{i=1}^n and σ{Ã, B̃} = {(α̃_i, β̃_i)}_{i=1}^n. Then
∏_{i=1}^n (1 − ρ²((α_i, β_i), (α̃_i, β̃_i))) ≥ 1 − d²(Y, Ỹ),    (6)
∑_{i=1}^n ρ²((α_i, β_i), (α̃_i, β̃_i)) ≤ n(1 − (1 − d²(Y, Ỹ))^{1/n}),    (7)
where
Y = [A; B],  Ỹ = [Ã; B̃],
ρ((α_i, β_i), (α̃_i, β̃_i)) is given by (1), and d(Y, Ỹ) = (1 − |det(Y^H Ỹ)|² / (det(Y^H Y) det(Ỹ^H Ỹ)))^{1/2}.
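The quantity d(Y, Ỹ) is a ratio of determinants and is prone to overflow or underflow for large n; a minimal sketch using log-determinants (function name ours) is:

```python
import numpy as np

def d_metric(Y, Ytil):
    """d(Y, Ytil) from Theorem 1, via log-determinants for numerical stability."""
    _, log_cross = np.linalg.slogdet(Y.conj().T @ Ytil)     # log |det(Y^H Ytil)|
    _, log_gram1 = np.linalg.slogdet(Y.conj().T @ Y)        # log det(Y^H Y) > 0
    _, log_gram2 = np.linalg.slogdet(Ytil.conj().T @ Ytil)  # log det(Ytil^H Ytil) > 0
    ratio = np.exp(2.0 * log_cross - log_gram1 - log_gram2)
    return np.sqrt(max(0.0, 1.0 - ratio))                   # guard tiny rounding
```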
Theorem 2 
(Explicit expression and sharp upper bound [22]). Let {A, B} and {Ã, B̃} be two (m, p, n)-GMPs, σ{A, B} = {(α_i, β_i)}_{i=1}^n and σ{Ã, B̃} = {(α̃_i, β̃_i)}_{i=1}^n. Then
∏_{i=1}^n (1 − ρ²((α_i, β_i), (α̃_i, β̃_i))) ≥ max_{Ψ_1 ∈ U_m, Ψ_2 ∈ U_p} (1 − d²(Y, Ỹ_Ψ)) = 1 − d²(Y, Ỹ_{Ψ_0}),    (8)
∑_{i=1}^n ρ²((α_i, β_i), (α̃_i, β̃_i)) ≤ n(1 − (max_{Ψ_1 ∈ U_m, Ψ_2 ∈ U_p} (1 − d²(Y, Ỹ_Ψ)))^{1/n}) = n(1 − (1 − d²(Y, Ỹ_{Ψ_0}))^{1/n}),    (9)
where
Y = [A; B],  Ỹ = [Ã; B̃],  Ỹ_Ψ = ΨỸ,  Ỹ_{Ψ_0} = Ψ_0 Ỹ,
Ψ = diag(Ψ_1, Ψ_2),  Ψ_0 = diag(UŨ^H, VṼ^H),
and Ψ_1 ∈ U_m, Ψ_2 ∈ U_p; U, Ũ, V, Ṽ are the unitary matrices in the GSVDs of {A, B} and {Ã, B̃} (cf. (2)-(3)). The inequality (9) is an equality if and only if ρ((α_1, β_1), (α̃_1, β̃_1)) = ρ((α_2, β_2), (α̃_2, β̃_2)) = ⋯ = ρ((α_n, β_n), (α̃_n, β̃_n)).
In this paper we continue along this line and derive an explicit expression for ∑_{i=1}^n ρ²((α_i, β_i), (α̃_i, β̃_i)) instead of the existing upper bound in (9).

0.5. Organization

The rest of this paper is organized as follows. Section 1 contains the preliminaries. In Section 2, we present an explicit expression of ∑_{i=1}^n ρ²((α_i, β_i), (α̃_i, β̃_i)) by Riemannian optimization models, together with the corresponding numerical methods. Finally, concluding remarks are drawn in Section 3.

1. Preliminaries

In this section we collect some useful lemmas needed to derive the main results.
Lemma 1. 
[21] Let A_1, …, A_m ∈ C^{n×n}, with singular values σ_1(A_j) ≥ σ_2(A_j) ≥ ⋯ ≥ σ_n(A_j) ≥ 0 arranged in decreasing order, and let c ∈ R, j = 1, …, m. We have the following extreme value:
max_{U_1, …, U_m ∈ U_n} |tr(cI_n ± ∏_{j=1}^m U_j A_j)| = nc + ∑_{i=1}^n ∏_{j=1}^m σ_i(A_j).
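A quick numerical check of Lemma 1, specialized to m = 1 and c = 0 (where the maximizer is available in closed form from an SVD), can be sketched as follows:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# For m = 1, c = 0, Lemma 1 reads max_{U in U_n} |tr(U A)| = sum_i sigma_i(A),
# attained at U = V W^H, where A = W diag(sigma) V^H is an SVD of A.
W, sigma, Vh = np.linalg.svd(A)
U_opt = Vh.conj().T @ W.conj().T
assert np.isclose(abs(np.trace(U_opt @ A)), sigma.sum())
```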
Let {A, B} and {Ã, B̃} be two (m, p, n)-GMPs, and let Z = [A; B] and Z̃ = [Ã; B̃]. Define
S_1 = { (Z^H Z)^{−1/2} A^H, m ≤ n, m ≤ p;  A(Z^H Z)^{−1/2}, n ≤ m, m ≤ p;  (Z^H Z)^{−1/2} B^H, p ≤ n, p ≤ m;  B(Z^H Z)^{−1/2}, n ≤ p, p ≤ m },
S_2 = { (Z̃^H Z̃)^{−1/2} Ã^H, m ≤ n, m ≤ p;  Ã(Z̃^H Z̃)^{−1/2}, n ≤ m, m ≤ p;  (Z̃^H Z̃)^{−1/2} B̃^H, p ≤ n, p ≤ m;  B̃(Z̃^H Z̃)^{−1/2}, n ≤ p, p ≤ m }    (10)
and
M(Φ) = Φ^H S_2^H S_2 Φ S_1^H S_1,  Φ ∈ U_{min{m,p,n}}.    (11)
Let {A, B} and {Ã, B̃} be two (m, p, n) real matrix pairs as in (4), and define
S_1 = { (Z^T Z)^{−1/2} A^T, m ≤ n, m ≤ p;  A(Z^T Z)^{−1/2}, n ≤ m, m ≤ p;  (Z^T Z)^{−1/2} B^T, p ≤ n, p ≤ m;  B(Z^T Z)^{−1/2}, n ≤ p, p ≤ m },
S_2 = { (Z̃^T Z̃)^{−1/2} Ã^T, m ≤ n, m ≤ p;  Ã(Z̃^T Z̃)^{−1/2}, n ≤ m, m ≤ p;  (Z̃^T Z̃)^{−1/2} B̃^T, p ≤ n, p ≤ m;  B̃(Z̃^T Z̃)^{−1/2}, n ≤ p, p ≤ m }    (12)
and
M(Φ) = Φ^T S_2^T S_2 Φ S_1^T S_1,  Φ ∈ OR_{min{m,p,n}}.    (13)
We set
Q_i = { diag(I_i, O_{(min{m,p,n}−i)×(min{m,p,n}−i)}), m ≤ n, m ≤ p or n ≤ m, m ≤ p;  diag(O_{(min{m,p,n}−i)×(min{m,p,n}−i)}, I_i), p ≤ n, p ≤ m or n ≤ p, p ≤ m }    (14)
for 1 ≤ i ≤ min{m, p, n}.
Lemma 2. 
[24] Let f(X) : C^{n×n} → R be an analytic function of several complex variables on the domain XX^H ≤ I_n. Then f(X) attains its maximum modulus on the characteristic manifold {X ∈ C^{n×n} : XX^H = I_n}.
Lemma 3. 
Let {A, B} and {Ã, B̃} be two (m, p, n)-GMPs and let σ{A, B} = {(α_i, β_i), i = 1, 2, …, n} and σ{Ã, B̃} = {(α̃_j, β̃_j), j = 1, 2, …, n}. Then
∑_{i=1}^n (α_i² α̃_i² + β_i² β̃_i²) = 2 max_{Φ ∈ U_{min{m,p,n}}} tr(M(Φ)) − tr(S_1^H S_1 + S_2^H S_2) + n,    (31)
where M(Φ) is given by (11).
Proof. 
Let Z = [A; B] and Z̃ = [Ã; B̃]; then Z^H Z = R^{−H} R^{−1} and Z̃^H Z̃ = R̃^{−H} R̃^{−1} are nonsingular. Let
S_1 = (Z^H Z)^{−1/2} A^H = (R^{−H} R^{−1})^{−1/2} R^{−H} Σ_A^H U^H,
S_2 = (Z̃^H Z̃)^{−1/2} Ã^H = (R̃^{−H} R̃^{−1})^{−1/2} R̃^{−H} Σ_Ã^H Ũ^H,
S_3 = (Z^H Z)^{−1/2} B^H = (R^{−H} R^{−1})^{−1/2} R^{−H} Σ_B^H V^H,
S_4 = (Z̃^H Z̃)^{−1/2} B̃^H = (R̃^{−H} R̃^{−1})^{−1/2} R̃^{−H} Σ_B̃^H Ṽ^H,
and
M_1(Φ) = S_1 Φ^H S_2^H S_2 Φ S_1^H, Φ ∈ U_m;  M_2(Φ) = S_1^H Φ^H S_2 S_2^H Φ S_1, Φ ∈ U_n;
M_3(Φ) = S_3 Φ^H S_4^H S_4 Φ S_3^H, Φ ∈ U_p;  M_4(Φ) = S_3^H Φ^H S_4 S_4^H Φ S_3, Φ ∈ U_n.    (18)
We first prove
∑_{i=1}^n (α_i² α̃_i² + β_i² β̃_i²) = 2 max_{Φ ∈ U_m} tr(M_1(Φ)) − tr(S_1^H S_1 + S_2^H S_2) + n.    (35)
Using the GSVDs of {A, B} and {Ã, B̃} in Definition 3 we have
A = U Σ_A R^{−1},  B = V Σ_B R^{−1},  Ã = Ũ Σ_Ã R̃^{−1},  B̃ = Ṽ Σ_B̃ R̃^{−1},
where U, Ũ, V, Ṽ, Σ_A, Σ_B, Σ_Ã and Σ_B̃ are given by (2) and (3). Let M_1(Φ) be given by (18). Then
max_{Φ ∈ U_m} tr(M_1(Φ)) = max_{Φ ∈ U_m} tr(Φ^H S_2^H S_2 Φ S_1^H S_1) = max_{Φ ∈ U_m} tr(Φ^H Ũ Σ_Ã Σ_Ã^H Ũ^H Φ U Σ_A Σ_A^H U^H).    (20)
Next we consider two cases for different m and n. Case 1: if n ≤ m, then we write
Σ_A = [Σ̂_A; O_{(m−n)×n}],  Σ_Ã = [Σ̂_Ã; O_{(m−n)×n}],  U^H Φ Ũ = [W_11, W_12; W_21, W_22],  Φ ∈ U_m,
where Σ̂_A, Σ̂_Ã, W_11 ∈ C^{n×n} and W_22 ∈ C^{(m−n)×(m−n)}. Substituting into (20) we have
max_{Φ ∈ U_m} tr(M_1(Φ)) = max_{W_11 W_11^H ≤ I_n} tr(Σ̂_A^H W_11 Σ̂_Ã Σ̂_Ã^H W_11^H Σ̂_A),
where W_11 W_11^H ≤ I_n means that I_n − W_11 W_11^H is positive semi-definite. Since tr(Σ̂_A^H W_11 Σ̂_Ã Σ̂_Ã^H W_11^H Σ̂_A), as an analytic function of several complex variables on the domain W_11 W_11^H ≤ I_n, attains its maximum modulus on the characteristic manifold {W_11 ∈ C^{n×n} : W_11 W_11^H = I_n} (see [21]), by Lemma 2 we have
max_{Φ ∈ U_m} tr(M_1(Φ)) = max_{W_11 W_11^H = I_n} tr(Σ̂_A^H W_11 Σ̂_Ã Σ̂_Ã^H W_11^H Σ̂_A),
which together with Lemma 1 and (20) yields
max_{Φ ∈ U_m} tr(M_1(Φ)) = ∑_{i=1}^n α_i² α̃_i².
Let U^H Ψ U = [U_11, U_12; U_21, U_22] and Ũ^H Ψ̃ Ũ = [Ũ_11, Ũ_12; Ũ_21, Ũ_22] with Ψ, Ψ̃ ∈ U_m, U_11, Ũ_11 ∈ C^{n×n} and U_22, Ũ_22 ∈ C^{(m−n)×(m−n)}; then by Lemma 1 we have
tr(S_1^H S_1 + S_2^H S_2) = tr(Σ_A Σ_A^H + Σ_Ã Σ_Ã^H) = ∑_{i=1}^n (α_i² + α̃_i²).
Case 2: if m ≤ n, then we write
Σ_A = (Σ̂_A, O_{m×(n−m)}),  Σ_Ã = (Σ̂_Ã, O_{m×(n−m)}).
Then by (20) and Lemma 1 we have
max_{Φ ∈ U_m} tr(M_1(Φ)) = max_{Φ ∈ U_m} tr(Φ^H Ũ Σ_Ã Σ_Ã^H Ũ^H Φ U Σ_A Σ_A^H U^H) = max_{Φ ∈ U_m} tr(Φ^H Ũ Σ̂_Ã Σ̂_Ã^H Ũ^H Φ U Σ̂_A Σ̂_A^H U^H) = ∑_{i=1}^m α_i² α̃_i² = ∑_{i=1}^n α_i² α̃_i²,
since α_{m+1} = ⋯ = α_n = α̃_{m+1} = ⋯ = α̃_n = 0 by the definition of the GSVD. It also follows that
tr(S_1^H S_1 + S_2^H S_2) = tr(U Σ_A Σ_A^H U^H + Ũ Σ_Ã Σ_Ã^H Ũ^H) = ∑_{i=1}^m (α_i² + α̃_i²) = ∑_{i=1}^n (α_i² + α̃_i²).
By Definition 3 we have β_i² = 1 − α_i² and β̃_i² = 1 − α̃_i², and combining the above identities we obtain
∑_{i=1}^n (α_i² α̃_i² + β_i² β̃_i²) = ∑_{i=1}^n (2α_i² α̃_i² − (α_i² + α̃_i²) + 1) = 2∑_{i=1}^n α_i² α̃_i² − ∑_{i=1}^n (α_i² + α̃_i²) + n = 2 max_{Φ ∈ U_m} tr(M_1(Φ)) − tr(S_1^H S_1 + S_2^H S_2) + n.
By Cases 1 and 2 we deduce that (35) holds true. By similar arguments we get
∑_{i=1}^n (α_i² α̃_i² + β_i² β̃_i²) = 2 max_{Φ ∈ U_n} tr(M_2(Φ)) − tr(S_1^H S_1 + S_2^H S_2) + n.    (42)
Note that
∑_{i=1}^n (α_i² α̃_i² + β_i² β̃_i²) = 2∑_{i=1}^n β_i² β̃_i² − ∑_{i=1}^n (β_i² + β̃_i²) + n,
and hence by similar arguments we have
∑_{i=1}^n (α_i² α̃_i² + β_i² β̃_i²) = 2 max_{Φ ∈ U_p} tr(M_3(Φ)) − tr(S_3^H S_3 + S_4^H S_4) + n,    (43)
∑_{i=1}^n (α_i² α̃_i² + β_i² β̃_i²) = 2 max_{Φ ∈ U_n} tr(M_4(Φ)) − tr(S_3^H S_3 + S_4^H S_4) + n.    (44)
Selecting, among (35) and (42)-(44), the identity whose variable unitary matrix has the smallest dimension yields (31). This completes the proof.    □
Lemma 4. 
Let Λ = diag(λ_1, …, λ_n), Δ = diag(δ_1, …, δ_n) ∈ R^{n×n} be diagonal matrices with λ_1 ≥ ⋯ ≥ λ_n ≥ 0 and δ_1 ≥ ⋯ ≥ δ_n ≥ 0. Then
max_{Φ ∈ OR_n} tr(Λ Φ^T Δ Φ) = ∑_{i=1}^n λ_i δ_i.
Proof. 
It follows from Lemma 1 that
max_{Φ ∈ OR_n} tr(Λ Φ^T Δ Φ) = max_{Φ ∈ OR_n} tr(Λ Φ^H Δ Φ) ≤ max_{Φ ∈ U_n} tr(Λ Φ^H Δ Φ) = ∑_{i=1}^n λ_i δ_i.
Meanwhile,
∑_{i=1}^n λ_i δ_i = tr(Λ Δ) ≤ max_{Φ ∈ OR_n} tr(Λ Φ^T Δ Φ).
Combining the two inequalities yields
max_{Φ ∈ OR_n} tr(Λ Φ^T Δ Φ) = ∑_{i=1}^n λ_i δ_i.
This completes the proof.    □
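Lemma 4 is easy to probe numerically: with both diagonals sorted decreasingly the maximum is attained at Φ = I_n, and random orthogonal trials should never exceed it. A small Monte Carlo sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
lam = np.sort(rng.uniform(0, 1, n))[::-1]      # lambda_1 >= ... >= lambda_n >= 0
delta = np.sort(rng.uniform(0, 1, n))[::-1]    # delta_1  >= ... >= delta_n  >= 0
Lam, Delta = np.diag(lam), np.diag(delta)

best = float(np.sum(lam * delta))              # value at Phi = I_n
for _ in range(2000):
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))   # random orthogonal Phi
    assert np.trace(Lam @ Q.T @ Delta @ Q) <= best + 1e-12
```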
Lemma 5. 
Let {A, B} and {Ã, B̃} be real matrix pairs with A, Ã ∈ R^{m×n}, B, B̃ ∈ R^{p×n}, rank((A^T, B^T)^T) = rank((Ã^T, B̃^T)^T) = n, and GSVs (α_i, β_i) and (α̃_i, β̃_i), respectively, given as in (4) and (5). Then
∑_{i=1}^n (α_i² α̃_i² + β_i² β̃_i²) = 2 max_{Φ ∈ OR_{min{m,p,n}}} tr(M(Φ)) − tr(S_1^T S_1 + S_2^T S_2) + n,
where M(Φ) is given by (13).
Proof. 
Let Z = [A; B] and Z̃ = [Ã; B̃]; then Z^T Z = R^{−T} R^{−1} and Z̃^T Z̃ = R̃^{−T} R̃^{−1} are nonsingular. Let
S_1 = (Z^T Z)^{−1/2} A^T = (R^{−T} R^{−1})^{−1/2} R^{−T} Σ_A^T U^T,
S_2 = (Z̃^T Z̃)^{−1/2} Ã^T = (R̃^{−T} R̃^{−1})^{−1/2} R̃^{−T} Σ_Ã^T Ũ^T,
S_3 = (Z^T Z)^{−1/2} B^T = (R^{−T} R^{−1})^{−1/2} R^{−T} Σ_B^T V^T,
S_4 = (Z̃^T Z̃)^{−1/2} B̃^T = (R̃^{−T} R̃^{−1})^{−1/2} R̃^{−T} Σ_B̃^T Ṽ^T,
and
M_1(Φ) = S_1 Φ^T S_2^T S_2 Φ S_1^T, Φ ∈ OR_m;  M_2(Φ) = S_1^T Φ^T S_2 S_2^T Φ S_1, Φ ∈ OR_n;
M_3(Φ) = S_3 Φ^T S_4^T S_4 Φ S_3^T, Φ ∈ OR_p;  M_4(Φ) = S_3^T Φ^T S_4 S_4^T Φ S_3, Φ ∈ OR_n.
We first prove
∑_{i=1}^n (α_i² α̃_i² + β_i² β̃_i²) = 2 max_{Φ ∈ OR_m} tr(M_1(Φ)) − tr(S_1^T S_1 + S_2^T S_2) + n.    (19)
Case 1: if n ≤ m, then, arguing as in the proof of Lemma 3 with Lemma 4 in place of Lemmas 1 and 2,
max_{Φ ∈ OR_m} tr(M_1(Φ)) = max_{Φ ∈ OR_m} tr(Φ^T S_2^T S_2 Φ S_1^T S_1) = max_{Φ ∈ OR_m} tr(Φ^T Ũ Σ_Ã Σ_Ã^T Ũ^T Φ U Σ_A Σ_A^T U^T),
which together with Lemma 4 yields
max_{Φ ∈ OR_m} tr(M_1(Φ)) = ∑_{i=1}^n α_i² α̃_i².
Let U^T Ψ U = [U_11, U_12; U_21, U_22] and Ũ^T Ψ̃ Ũ = [Ũ_11, Ũ_12; Ũ_21, Ũ_22] with Ψ, Ψ̃ ∈ OR_m, U_11, Ũ_11 ∈ R^{n×n} and U_22, Ũ_22 ∈ R^{(m−n)×(m−n)}; then
tr(S_1^T S_1 + S_2^T S_2) = tr(Σ_A Σ_A^T + Σ_Ã Σ_Ã^T) = ∑_{i=1}^n (α_i² + α̃_i²).
Case 2: if m ≤ n, then we write
Σ_A = (Σ̂_A, O_{m×(n−m)}),  Σ_Ã = (Σ̂_Ã, O_{m×(n−m)}).
Then by Lemma 4 we have
max_{Φ ∈ OR_m} tr(M_1(Φ)) = max_{Φ ∈ OR_m} tr(Φ^T Ũ Σ_Ã Σ_Ã^T Ũ^T Φ U Σ_A Σ_A^T U^T) = max_{Φ ∈ OR_m} tr(Φ^T Ũ Σ̂_Ã Σ̂_Ã^T Ũ^T Φ U Σ̂_A Σ̂_A^T U^T) = ∑_{i=1}^m α_i² α̃_i² = ∑_{i=1}^n α_i² α̃_i²,
since α_{m+1} = ⋯ = α_n = α̃_{m+1} = ⋯ = α̃_n = 0 by the definition of the GSVD. It also follows that
tr(S_1^T S_1 + S_2^T S_2) = tr(U Σ_A Σ_A^T U^T + Ũ Σ_Ã Σ_Ã^T Ũ^T) = ∑_{i=1}^m (α_i² + α̃_i²) = ∑_{i=1}^n (α_i² + α̃_i²).
By Definition 4 we have β_i² = 1 − α_i² and β̃_i² = 1 − α̃_i², and combining the above identities we obtain
∑_{i=1}^n (α_i² α̃_i² + β_i² β̃_i²) = 2∑_{i=1}^n α_i² α̃_i² − ∑_{i=1}^n (α_i² + α̃_i²) + n = 2 max_{Φ ∈ OR_m} tr(M_1(Φ)) − tr(S_1^T S_1 + S_2^T S_2) + n.
By Cases 1 and 2 we deduce that (19) holds true. By similar arguments we get
∑_{i=1}^n (α_i² α̃_i² + β_i² β̃_i²) = 2 max_{Φ ∈ OR_n} tr(M_2(Φ)) − tr(S_1^T S_1 + S_2^T S_2) + n.
Note that
∑_{i=1}^n (α_i² α̃_i² + β_i² β̃_i²) = 2∑_{i=1}^n β_i² β̃_i² − ∑_{i=1}^n (β_i² + β̃_i²) + n,
and hence by similar arguments we have
∑_{i=1}^n (α_i² α̃_i² + β_i² β̃_i²) = 2 max_{Φ ∈ OR_p} tr(M_3(Φ)) − tr(S_3^T S_3 + S_4^T S_4) + n,
∑_{i=1}^n (α_i² α̃_i² + β_i² β̃_i²) = 2 max_{Φ ∈ OR_n} tr(M_4(Φ)) − tr(S_3^T S_3 + S_4^T S_4) + n.
Selecting, among these identities, the one whose variable orthogonal matrix has the smallest dimension yields the assertion of the lemma. This completes the proof.    □

2. Explicit Formula of ∑_{i=1}^n ρ²((α_i, β_i), (α̃_i, β̃_i)) with Riemannian Optimization Models

We first present a new explicit formula of ∑_{i=1}^n ρ²((α_i, β_i), (α̃_i, β̃_i)) for complex Grassmann matrix pairs. Then we describe numerical methods for solving the involved Riemannian optimization models.

2.1. For Complex Grassmann Matrix Pairs

We present a new explicit formula of ∑_{i=1}^n ρ²((α_i, β_i), (α̃_i, β̃_i)) for complex Grassmann matrix pairs.
Lemma 6. 
Let M(Φ), Q_i, S_1, S_2 be given by (10), (11) and (14). Then
max_{Φ ∈ U_{min{m,p,n}}} tr(M(Φ)) = tr(M(F_2 F_1^H)),
where F_1, F_2 ∈ U_{min{m,p,n}} can be obtained as the maximizers Π_1, Π_2 of the trace function optimization models
max_{Π_1 ∈ U_{min{m,p,n}}} |tr(Π_1^H S_1^H S_1 Π_1 Q_i)|,    (45)
max_{Π_2 ∈ U_{min{m,p,n}}} |tr(Π_2^H S_2^H S_2 Π_2 Q_i)|    (46)
for every i = 1, …, min{m, p, n}, respectively.
Proof. 
We first prove the case m ≤ n, m ≤ p; the other cases n ≤ m, m ≤ p; p ≤ n, p ≤ m; and n ≤ p, p ≤ m can be proved similarly. By (10), (11) and (18), S_1, S_2, M, F_1, F_2 are taken as S_1, S_2, M_1, F_1, F_2 there, where F_1 and F_2 can be obtained by solving (45) and (46) for every i, respectively. Let f_i be the i-th column of the unitary matrix F_1. By Lemma 1 we have, for i = 1,
max_{Π_1 ∈ U_{min{m,p,n}}} |tr(Π_1^H S_1^H S_1 Π_1 Q_1)| = |tr(F_1^H S_1^H S_1 F_1 Q_1)| = f_1^H S_1^H S_1 f_1 = λ_1.
Here f_1 is not unique; it can be chosen to satisfy S_1^H S_1 f_1 = λ_1 f_1. Similarly, for i > 1,
max_{Π_1 ∈ U_{min{m,p,n}}} |tr(Π_1^H S_1^H S_1 Π_1 Q_i)| = |tr(F_1^H S_1^H S_1 F_1 Q_i)| = f_1^H S_1^H S_1 f_1 + ⋯ + f_i^H S_1^H S_1 f_i = λ_1 + ⋯ + λ_i.
Here f_i is chosen to satisfy S_1^H S_1 f_i = λ_i f_i. Since F_1 is unitary, f_k^H f_j = 0 and f_k^H f_k = 1 for k ≠ j, k, j = 1, …, m. It follows that f_k^H S_1^H S_1 f_j = λ_j f_k^H f_j = 0 and f_k^H S_1^H S_1 f_k = λ_k f_k^H f_k = λ_k for k ≠ j. Then
F_1^H S_1^H S_1 F_1 = Λ_1,    (47)
where Λ_1 = diag(λ_i). Similarly,
F_2^H S_2^H S_2 F_2 = Λ_2,
where Λ_2 = diag(χ_i). By Lemma 1 and (47) we have, for i = 1,
max_{Π_1 ∈ U_{min{m,p,n}}} |tr(Π_1^H S_1^H S_1 Π_1 Q_1)| = |tr(F_1^H S_1^H S_1 F_1 Q_1)| = λ_1 = max_{1 ≤ i ≤ m} λ_i.    (51)
For the other i = 2, …, m, we have
max_{Π_1 ∈ U_{min{m,p,n}}} |tr(Π_1^H S_1^H S_1 Π_1 Q_i)| = |tr(F_1^H S_1^H S_1 F_1 Q_i)| = |tr(Λ_1 Q_i)| = λ_1 + ⋯ + λ_i.    (52)
Hence Λ_1 = diag(λ_i) and Λ_2 = diag(χ_i) are sorted in descending order with λ_i ≥ 0, χ_i ≥ 0, for otherwise there would be a conflict with (51) and (52). Then by Lemma 1 we have
max_{Φ ∈ U_m} tr(M_1(Φ)) = max_{Φ ∈ U_m} tr(Φ^H S_2^H S_2 Φ S_1^H S_1) = max_{Φ ∈ U_m} tr(Φ^H F_2 Λ_2 F_2^H Φ F_1 Λ_1 F_1^H) = tr(Λ_2 Λ_1).    (53)
It is easy to check that tr(M_1(F_2 F_1^H)) = tr(Λ_2 Λ_1), which together with (53) implies max_{Φ ∈ U_m} tr(M_1(Φ)) = tr(M_1(F_2 F_1^H)). The proof is complete.    □
Theorem 3. 
Let M(Φ), Q_i, S_1, S_2 be given by (10), (11) and (14). Let {A, B} and {Ã, B̃} be two (m, p, n)-GMPs, σ{A, B} = {(α_i, β_i), i = 1, 2, …, n} and σ{Ã, B̃} = {(α̃_j, β̃_j), j = 1, 2, …, n}. Then
∑_{i=1}^n ρ²((α_i, β_i), (α̃_i, β̃_i)) = tr(S_1^H S_1 + S_2^H S_2) − 2 max_{Φ ∈ U_{min{m,p,n}}} tr(M(Φ)) − 2 tr(L_1^{1/2}(I_{min{m,p,n}} − L_1)^{1/2} L_2^{1/2}(I_{min{m,p,n}} − L_2)^{1/2})
= tr(S_1^H S_1 + S_2^H S_2) − 2 tr(M(F_2 F_1^H)) − 2 tr(L_1^{1/2}(I_{min{m,p,n}} − L_1)^{1/2} L_2^{1/2}(I_{min{m,p,n}} − L_2)^{1/2}),
where
L_1 = F_1^H S_1^H S_1 F_1,  L_2 = F_2^H S_2^H S_2 F_2,
and F_1, F_2 ∈ U_{min{m,p,n}} can be obtained as the maximizers Π_1, Π_2 of the trace function optimization models (45) and (46).
Proof. 
We first prove the case m ≤ n, m ≤ p; the other cases n ≤ m, m ≤ p; p ≤ n, p ≤ m; and n ≤ p, p ≤ m can be proved similarly. By (10), (11) and (18), S_1, S_2, M, F_1, F_2 can be taken as S_1, S_2, M_1, F_1, F_2. Since F_1 and F_2 are maximizers of (45) and (46) for every i = 1, …, m, respectively, and S_1^H S_1 = A(A^H A + B^H B)^{−1} A^H = U Σ_A Σ_A^H U^H, for i = 1, by (45) and Lemma 1 we have
max_{Π_1 ∈ U_{min{m,p,n}}} |tr(Π_1^H A(A^H A + B^H B)^{−1} A^H Π_1 Q_1)| = |tr(F_1^H A(A^H A + B^H B)^{−1} A^H F_1 Q_1)| = α_1².
For i = 2 we have
max_{Π_1 ∈ U_{min{m,p,n}}} |tr(Π_1^H A(A^H A + B^H B)^{−1} A^H Π_1 Q_2)| = |tr(F_1^H A(A^H A + B^H B)^{−1} A^H F_1 Q_2)| = α_1² + α_2².
For i > 2, we have
max_{Π_1 ∈ U_{min{m,p,n}}} |tr(Π_1^H A(A^H A + B^H B)^{−1} A^H Π_1 Q_i)| = α_1² + ⋯ + α_i².
Hence L_1 = F_1^H S_1^H S_1 F_1 = diag(α_i²) and, likewise, L_2 = F_2^H S_2^H S_2 F_2 = diag(α̃_i²). It follows that
∑_{i=1}^n α_i β_i α̃_i β̃_i = tr(L_1^{1/2}(I_m − L_1)^{1/2} L_2^{1/2}(I_m − L_2)^{1/2}),
which together with Lemma 3 and Lemma 6 yields
∑_{i=1}^n ρ²((α_i, β_i), (α̃_i, β̃_i)) = n − ∑_{i=1}^n (α_i² α̃_i² + β_i² β̃_i²) − 2∑_{i=1}^n α_i β_i α̃_i β̃_i
= tr(S_1^H S_1 + S_2^H S_2) − 2 max_{Φ ∈ U_m} tr(M_1(Φ)) − 2 tr(L_1^{1/2}(I_m − L_1)^{1/2} L_2^{1/2}(I_m − L_2)^{1/2})
= tr(S_1^H S_1 + S_2^H S_2) − 2 tr(M_1(F_2 F_1^H)) − 2 tr(L_1^{1/2}(I_m − L_1)^{1/2} L_2^{1/2}(I_m − L_2)^{1/2}).
For the cases n ≤ m, m ≤ p; p ≤ n, p ≤ m; and n ≤ p, p ≤ m, the desired results can be proved by similar arguments.    □
Remark 1. 
From Theorem 3 we see that the formula for ∑_{i=1}^n ρ²((α_i, β_i), (α̃_i, β̃_i)) involves only two small-size unitary matrices F_1, F_2 ∈ U_{min{m,p,n}}, which may reduce the computational cost of evaluating the explicit expression in Theorem 3.
When using Theorem 3, we first determine S_1 and S_2 by comparing m, p, n; then F_1 and F_2 can be computed from (45) and (46), as illustrated in the sketch below.
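Since the maximizers F_1, F_2 of (45) and (46) diagonalize S_1^H S_1 and S_2^H S_2 with eigenvalues in descending order, the explicit expression of Theorem 3 can be evaluated directly from two Hermitian eigendecompositions. The Python sketch below does exactly that; it stands in for (and can cross-check) the Riemannian Newton solver of Section 2.2, and the function name is ours:

```python
import numpy as np

def sum_chordal_sq(S1, S2):
    """Evaluate the Theorem 3 formula for sum_i rho^2 of the GSV pairs.

    lam and chi play the roles of the diagonals of L1 = F1^H S1^H S1 F1 and
    L2 = F2^H S2^H S2 F2, obtained here by eigendecomposition instead of the
    trace-optimization models (45)-(46)."""
    lam = np.linalg.eigvalsh(S1.conj().T @ S1)[::-1]   # descending eigenvalues
    chi = np.linalg.eigvalsh(S2.conj().T @ S2)[::-1]
    lam = np.clip(lam, 0.0, 1.0)                       # guard rounding outside [0,1]
    chi = np.clip(chi, 0.0, 1.0)
    trace_M = np.sum(lam * chi)                        # tr(M(F2 F1^H)) = tr(L2 L1)
    cross = np.sum(np.sqrt(lam * (1 - lam)) * np.sqrt(chi * (1 - chi)))
    return lam.sum() + chi.sum() - 2.0 * trace_M - 2.0 * cross
```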

2.2. Solving the Riemannian Optimization Models (45) and (46) for Complex Grassmann Matrix Pairs

We now use Newton's method on Riemannian manifolds to solve the Riemannian optimization models (45) and (46) of Lemma 6 and Theorem 3, namely
max_{Π_1 ∈ U_{min{m,p,n}}} |tr(Π_1 S_1^H S_1 Π_1^H Q_i)|    (54)
for n ≤ m, n ≤ p and m ≤ n, m ≤ p, and
max_{Π_2 ∈ U_{min{m,p,n}}} |tr(Π_2 S_1^H S_1 Π_2^H Q_i)|    (55)
for p ≤ n, p ≤ m and n ≤ p, p ≤ m. Solving max_{Π_1 ∈ U_{min{m,p,n}}} |tr(Π_1 S_2^H S_2 Π_1^H Q_i)| can be discussed similarly. For simplicity, we assume n ≤ m and n ≤ p for (54) and n ≤ p, p ≤ m for (55); the other cases for different relationships among m, n, p can be discussed similarly. To compute (54) and (55), we consider the following matrix optimization problems:
max f_1(Φ_1) := ½ tr(Φ_1^H Q_i Φ_1 C)  s.t. Φ_1 ∈ U_n    (56)
and
max f_2(Φ_2) := ½ tr(Φ_2^H Q_i Φ_2 D)  s.t. Φ_2 ∈ U_n    (57)
for 1 ≤ i ≤ n. We only need to focus on the following matrix optimization problems:
max f_1(Φ_11) := ½ tr(Φ_11^H C Φ_11)  s.t. Φ_11 ∈ St(n, i) := {Φ_11 ∈ C^{n×i} | Φ_11^H Φ_11 = I_i},    (58)
max f_2(Φ_22) := ½ tr(Φ_22^H D Φ_22)  s.t. Φ_22 ∈ St(n, n − i + 1),    (59)
where C = (A^H A + B^H B)^{−1/2} A^H A (A^H A + B^H B)^{−1/2}, D = (A^H A + B^H B)^{−1/2} B^H B (A^H A + B^H B)^{−1/2}, Φ_11 consists of the first i columns of Φ_1^H, and Φ_22 consists of the last n − i + 1 columns of Φ_2^H. In the following, we focus on the solution of the matrix optimization problem (58); problem (59) can be solved in a similar way.
We note that f_1(Φ_11 Q) = f_1(Φ_11) for all Φ_11 ∈ St(n, i) and Q ∈ U_i. Let the complex Grassmann manifold Grass(n, i) be the set of all i-dimensional complex subspaces of C^n. If [X] denotes the subspace spanned by the columns of X ∈ St(n, i), then [X] ∈ Grass(n, i). In particular, for any X ∈ St(n, i), the natural projection [X] ∈ Grass(n, i) corresponds to the equivalence class {XQ ∈ St(n, i) | Q ∈ U_i} of St(n, i). Thus, instead of problem (58), we consider the following optimization problem:
max f̃_1([Φ_11]) := f_1(Φ_11)  s.t. [Φ_11] ∈ Grass(n, i).    (60)
Next, we present Newton's method for solving the optimization problem (60). Let Φ_11 ∈ St(n, i). The tangent space of St(n, i) at Φ_11 is given by [29]
T_{Φ_11} St(n, i) = {Z ∈ C^{n×i} | Z = Φ_11 Ω + (Φ_11)_⊥ K, Ω^H = −Ω, Ω ∈ C^{i×i}, K ∈ C^{(n−i)×i}},
which can be endowed with the inner product
⟨Z_1, Z_2⟩ = Re[tr(Z_2^H (I − ½ Φ_11 Φ_11^H) Z_1)],  Z_1, Z_2 ∈ T_{Φ_11} St(n, i), Φ_11 ∈ St(n, i).
Here, (Φ_11)_⊥ is such that span((Φ_11)_⊥) is the orthogonal complement of span(Φ_11), where span(Φ_11) denotes the linear space spanned by the column vectors of Φ_11. We note that the tangent space of Grass(n, i) at [Φ_11] is given by [29]
T_{[Φ_11]} Grass(n, i) = {Z ∈ C^{n×i} | Z = (Φ_11)_⊥ K, K ∈ C^{(n−i)×i}} ⊂ T_{Φ_11} St(n, i).
Hence, we can define a Riemannian metric on Grass(n, i) by
⟨Z_1, Z_2⟩ = Re[tr(Z_2^H Z_1)],  Z_1, Z_2 ∈ T_{[Φ_11]} Grass(n, i), Φ_11 ∈ St(n, i),
with the induced norm ‖ · ‖. Then the orthogonal projection onto T_{[Φ_11]} Grass(n, i) is given by
P_{Φ_11} Z = (I − Φ_11 Φ_11^H) Z,  Z ∈ C^{n×i}.
We define the local cost function g : T_{[Φ_11]} Grass(n, i) → R by
g(Z) = f̃_1([Φ_11 + Z]).
It is easy to check that, for any Z ∈ T_{[Φ_11]} Grass(n, i),
g(Z) = f_1(Φ_11) + Re tr(Z^H D_{Φ_11}) + ½ vec(Z)^H (H_{Φ_11} − (Φ_11^H D_{Φ_11})^T ⊗ I_n) vec(Z),
where D_{Φ_11} = C Φ_11 and H_{Φ_11} = I_i ⊗ C are the derivative and the Hessian of f_1 at Φ_11, respectively.
Based on the above analysis, Newton’s method for solving the optimization problem (60) can be described as follows [29].
Algorithm 4. 
Step 0. Choose Φ_11^0 ∈ St(n, i), ρ, η ∈ (0, 1), σ ∈ (0, 1/2], and let k := 0.
Step 1.
Apply the conjugate gradient (CG) method to solving
P_{Φ_11^k}(C Z^k − Z^k (Φ_11^k)^H D_{Φ_11^k}) = −P_{Φ_11^k} D_{Φ_11^k}
for Z^k ∈ T_{[Φ_11^k]} Grass(n, i) such that
‖P_{Φ_11^k}(C Z^k − Z^k (Φ_11^k)^H D_{Φ_11^k}) + P_{Φ_11^k} D_{Φ_11^k}‖ ≤ η_k ‖P_{Φ_11^k} D_{Φ_11^k}‖    (61)
and
⟨P_{Φ_11^k} D_{Φ_11^k}, Z^k⟩ ≥ η_k ⟨Z^k, Z^k⟩,    (62)
where η_k = min{η, ‖P_{Φ_11^k} D_{Φ_11^k}‖}. If (61) and (62) are not attainable, then let
Z^k = P_{Φ_11^k} D_{Φ_11^k}.
Step 2.
Let l_k ≥ 0 be the smallest integer l such that
f_1(π(Φ_11^k + ρ^l Z^k)) ≥ f_1(Φ_11^k) + σ ρ^l ⟨P_{Φ_11^k} D_{Φ_11^k}, Z^k⟩.
Set
Φ_11^{k+1} = π(Φ_11^k + ρ^{l_k} Z^k).
Step 3.
Replace k by k + 1 and go to Step 1.
In Step 2 of Algorithm 4, π : C^{n×i} → St(n, i) is the projection onto St(n, i), defined as follows: let X ∈ C^{n×i} be of full column rank; then
π(X) = argmin_{Y ∈ St(n, i)} ‖X − Y‖_2.
As noted in [29], if the SVD of X is given by X = U Σ V^H, then π(X) = U I_{n,i} V^H; if the QR decomposition of X is X = QR, then π(X) = Q.
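A compact illustration of the projection π, together with a simplified variant of Algorithm 4 that always takes the gradient fallback Z^k = P_{Φ_11^k} D_{Φ_11^k} (i.e., no CG inner solve), is sketched below; this is a gradient-ascent caricature of the Newton iteration, for orientation only:

```python
import numpy as np

def project_to_stiefel(X):
    """pi(X) for full-column-rank X: the SVD route gives U I V^H; the Q factor
    of a thin QR spans the same subspace, so either representative works on
    the Grassmann manifold."""
    U, _, Vh = np.linalg.svd(X, full_matrices=False)
    return U @ Vh

def grassmann_ascent(C, i, iters=500, rho=0.5, sigma=0.25):
    """Maximize f1(Phi) = 0.5 tr(Phi^H C Phi) over Grass(n, i), Hermitian C,
    using the projected gradient P_Phi(C Phi) and Armijo backtracking."""
    n = C.shape[0]
    rng = np.random.default_rng(0)
    Phi = project_to_stiefel(rng.standard_normal((n, i)))
    f = lambda P: 0.5 * np.real(np.trace(P.conj().T @ C @ P))
    for _ in range(iters):
        D = C @ Phi                            # Euclidean derivative D_Phi
        Z = D - Phi @ (Phi.conj().T @ D)       # P_Phi D: tangent-space projection
        g = np.real(np.trace(Z.conj().T @ Z))
        if g < 1e-14:                          # stationary on the manifold
            break
        t = 1.0
        while f(project_to_stiefel(Phi + t * Z)) < f(Phi) + sigma * t * g:
            t *= rho                           # Armijo backtracking along pi(Phi + tZ)
        Phi = project_to_stiefel(Phi + t * Z)
    return Phi                                 # spans a (generically dominant) i-dim eigenspace of C
```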
Remark 2.
(i) An explicit expression of ∑_{i=1}^n ρ²((α_i, β_i), (α̃_i, β̃_i)) is given in Theorem 3, while Theorem 2 only provides an upper bound; Theorem 3 therefore improves the existing results in [22] in theory (see the Example below). We adopt the trace function optimization formulation with orthogonal constraints in Theorem 3 to evaluate the explicit expression, which is comparatively cheap, whereas Theorem 3.2 in [22] needs the unitary matrices U, V, Ũ, Ṽ of the GSVDs of the two matrix pairs {A, B} and {Ã, B̃}. Computing GSVDs may cost considerably more than computing small-size unitary matrices; hence, on the algorithmic side, Theorem 3 also improves Theorem 3.2 in [22].
(ii) The equality in Theorem 3 rests on matrix optimization with orthogonal constraints, and based on this formulation we use the classical Golub-Kahan bidiagonalization method to solve it efficiently. The proposed two-variable trace function optimization algorithms are efficient when A^H A + B^H B is well conditioned. When W = A^H A + B^H B is ill-conditioned, we compute factors W_1, W_2 with W = W_1 W_2 by matrix decompositions (e.g., the Schur or QR decomposition) such that W_1, W_2 have small condition numbers; we then calculate W_1^{−1}, W_2^{−1} and form W^{−1} = W_2^{−1} W_1^{−1}, see [27,28]. In general, we first compute the inverses of the factors with small condition numbers and then multiply these two inverses, which recovers the inverse of the original matrix with a large condition number. The proposed algorithms remain efficient when the condition number of A^H A + B^H B is large; see Table 1. In practical applications, the exact value of ∑_{i=1}^n ρ²((α_i, β_i), (α̃_i, β̃_i)) obtained from the new explicit formula is obviously more precise than an upper bound.
Example. To compare the explicit formula in Theorem 3 with the upper bound of ∑_{i=1}^n ρ²((α_i, β_i), (α̃_i, β̃_i)) in [22], we randomly generate Grassmann matrix pairs {A, B} and {Ã, B̃} by the MATLAB commands randn(m,n)+i*randn(m,n) for A, Ã and randn(p,n)+i*randn(p,n) for B, B̃ with different random m, n, p. We first assess the condition number of A^H A + B^H B in the two-norm, denoted κ_2(·). The results are reported in Table 1.
Table 1. Comparison of the exact values, the explicit formula evaluated by Algorithm 4, and the upper bound for different (m, p, n)
(m, p, n)    κ_2(·)    Exact value    Formula by Alg. 4    Upper bound
(80,40,60) 30.2 27.24709031 27.24698583 41.3510
(200,500,450) 218.6 189.26904380 189.26910247 380.4629
(900,800,700) 359.4 458.42617335 458.42618210 573.0744
(60,120,140) 600.3 103.56173621 103.56172847 126.0846
(100,200,150) 400.6 79.35162440 79.35157629 118.5329
(250,500,450) O(10^3) 268.49009138 268.49008938 376.4290
(800,600,500) O(10^3) 409.05215925 409.05215570 454.7393
(2000,1900,1800) O(10^4) 1516.38636811 1516.38632996 1704.1106
(500,600,800) O(10^5) 379.23164800 379.23164655 634.2572
Here 'Formula by Alg. 4' denotes the equality in Theorem 3 evaluated by Algorithm 4, and 'Upper bound' denotes the upper bound in [22]. The table shows that the equality in Theorem 3 agrees with the exact value of ∑_{i=1}^n ρ²((α_i, β_i), (α̃_i, β̃_i)) to many digits and can be much sharper than the upper bound in [22]. This again verifies Remark 2(i). A self-contained script reproducing this kind of comparison is sketched below.
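In the same spirit as Table 1, the sketch below generates one random complex pair (the Python analogue of the MATLAB randn commands), computes the exact value of ∑_i ρ² from the GSVs, and evaluates the Theorem 3 formula; it reuses the hypothetical helpers generalized_singular_values, chordal_distance, and sum_chordal_sq from the earlier sketches:

```python
import numpy as np

rng = np.random.default_rng(7)
m, p, n = 80, 40, 60                     # first shape of Table 1

def random_pair(m, p, n):
    """Analogue of randn(m,n)+1i*randn(m,n) and randn(p,n)+1i*randn(p,n)."""
    A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
    B = rng.standard_normal((p, n)) + 1j * rng.standard_normal((p, n))
    return A, B

def inv_sqrt(W):
    """W^{-1/2} for Hermitian positive definite W, via eigendecomposition."""
    w, V = np.linalg.eigh(W)
    return V @ np.diag(1.0 / np.sqrt(w)) @ V.conj().T

A, B = random_pair(m, p, n)
At, Bt = random_pair(m, p, n)
kappa = np.linalg.cond(A.conj().T @ A + B.conj().T @ B)   # kappa_2(A^H A + B^H B)

# Exact value from the GSVs, paired index by index.
a, b = generalized_singular_values(A, B)
at, bt = generalized_singular_values(At, Bt)
exact = sum(chordal_distance(x, y, xt, yt) ** 2
            for x, y, xt, yt in zip(a, b, at, bt))

# Theorem 3 formula: here p = min{m, p, n}, so (10) selects the B-based S_1, S_2.
S1 = inv_sqrt(A.conj().T @ A + B.conj().T @ B) @ B.conj().T
S2 = inv_sqrt(At.conj().T @ At + Bt.conj().T @ Bt) @ Bt.conj().T
print(kappa, exact, sum_chordal_sq(S1, S2))   # the last two should nearly coincide
```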

3. Concluding Remarks

In this paper we give explicit expressions for the sum of chordal metrics between generalized singular values of Grassmann matrix pairs, which strictly improve on the existing upper bounds in [22]. For the new formula, only two small-size singular value decompositions are required in the algorithm, and these dominate the computational cost. As a byproduct, we also present an interval estimate that does not use the singular values.

References

  1. Alter, O., Brown, P. O., and Botstein, D. 2000. "Singular value decomposition for genome-wide expression data processing and modelling." Proceedings of the National Academy of Sciences 97: 10101–10106. [CrossRef]
  2. Alter, O., Brown, P. O., and Botstein, D. 2001. "Processing and modelling gene expression data using singular value decomposition." Proceedings of SPIE 4266: 171–186. [CrossRef]
  3. Alter, O., Brown, P. O., and Botstein, D. 2003. "Generalized singular decomposition for comparative analysis of genome-scale expression data sets of two different organisms." Proceedings of the National Academy of Sciences 100: 3351–3356. [CrossRef]
  4. Alter, O., Golub, G. H., Brown, P. O., and Botstein, D. 2004. "Novel genome-scale correlation between DNA replication and RNA transcription during the cell cycle in yeast is predicted by data-driven models." In Proceedings of the Miami Nature Biotechnology Winter Symposium on the Cell Cycle, Chromosomes and Cancer, vol. 15, edited by M. P. Deutscher et al. Miami: University of Miami School of Medicine.
  5. Bai, Z., and Demmel, J. W. 1993. "Computing the generalized singular value decomposition." SIAM Journal on Scientific Computing 14: 1464–1486.
  6. Bai, Z., and Zha, H. 1993. "A new preprocessing algorithm for the computation of the generalized singular value decomposition." SIAM Journal on Scientific Computing 14: 1007–1012. [CrossRef]
  7. Bhatia, R. 1997. Matrix Analysis. New York: Springer.
  8. Sun, J.-G. 1983. "Perturbation analysis for the generalized eigenvalue problem and the generalized singular value problem." In Matrix Pencils, Lecture Notes in Mathematics, vol. 973, 221–244. Berlin: Springer.
  9. Li, R.-C. 1993. "Bounds on perturbations of generalized singular values and of associated subspaces." SIAM Journal on Matrix Analysis and Applications 14: 195–234. [CrossRef]
  10. Li, R.-C. 1993. "A perturbation bound for definite pencils." Linear Algebra and its Applications 179: 191–202. [CrossRef]
  11. Li, R.-C. 1994. "On perturbations of matrix pencils with real spectra." Mathematics of Computation 62: 231–265.
  12. Li, R.-C. 2002. "On perturbations of matrix pencils with real spectra, a revisit." Mathematics of Computation 72: 715–728.
  13. Mahony, R. E. 1996. "The constrained Newton method on a Lie group and the symmetric eigenvalue problem." Linear Algebra and its Applications 248(15): 67–89.
  14. Owren, B., and Welfert, B. 2000. "The Newton iteration on Lie groups." BIT 40(1): 121–145.
  15. Paige, C. C., and Saunders, M. A. 1981. "Towards a generalized singular value decomposition." SIAM Journal on Numerical Analysis 18: 398–405.
  16. Sun, J.-G. 1983. "Perturbation analysis for the generalized singular value problem." SIAM Journal on Matrix Analysis and Applications 20: 611–625.
  17. Sun, J.-G. 1998. "Perturbation analysis of generalized singular subspaces." Numerische Mathematik 79: 615–641.
  18. Sun, J.-G. 2000. "Condition number and backward error for the generalized singular value decomposition." SIAM Journal on Matrix Analysis and Applications 22: 323–341.
  19. Van Loan, C. F. 1976. "Generalizing the singular value decomposition." SIAM Journal on Numerical Analysis 13: 76–83.
  20. Wang, J. H., and Li, C. 2011. "Kantorovich's theorems for Newton's method for mappings and optimization problems on Lie groups." IMA Journal of Numerical Analysis 31.
  21. Xu, W. W., Li, W., Zhu, L., and Huang, X. P. 2019. "The analytic solutions of a class of constrained matrix minimization and maximization problems with applications." SIAM Journal on Optimization 29: 1657–1686.
  22. Xu, W. W., Pang, H. K., Li, W., Huang, X. P., and Guo, W. J. 2018. "On the explicit expression of chordal metric between generalized singular values of Grassmann matrix pairs with applications." SIAM Journal on Matrix Analysis and Applications 39(4): 1547–1563.
  23. Zha, H. 1992. "A numerical algorithm for computing restricted singular value decomposition of matrix triplets." Linear Algebra and Its Applications 168: 1–26.
  24. Hua, L. K. 1963. Harmonic Analysis of Functions of Several Complex Variables in the Classical Domains. Providence, RI: American Mathematical Society.
  25. Golub, G. H., and Van Loan, C. F. 2013. Matrix Computations, 4th ed. Baltimore: Johns Hopkins University Press.
  26. Golub, G. H., and Kahan, W. 1965. "Calculating the singular values and pseudo-inverse of a matrix." SIAM Journal on Numerical Analysis 2: 205–224.
  27. Ben-Israel, A., and Greville, T. N. E. 1977. Generalized Inverses: Theory and Applications. New York: Wiley.
  28. Jodár, L., Law, A. G., Rezazadeh, A., Watson, J. H., and Wu, G. 1991. "Computations for the Moore-Penrose and Other Generalized Inverses." Congressus Numerantium 80: 57–64.
  29. Manton, J. H. 2002. "Optimization algorithms exploiting unitary constraints." IEEE Transactions on Signal Processing 50: 635–650.
  30. Absil, P.-A., Mahony, R., and Sepulchre, R. 2008. Optimization Algorithms on Matrix Manifolds. Princeton: Princeton University Press.
  31. Adler, R. L., Dedieu, J. P., Margulies, J. Y., Martens, M., and Shub, M. 2002. "Newton’s method on Riemannian manifolds and a geometric model for the human spine." IMA Journal of Numerical Analysis 22: 359–390.
  32. Edelman, A., Arias, T. A., and Smith, S. T. 1998. "The geometry of algorithms with orthogonality constraints." SIAM Journal on Matrix Analysis and Applications 20: 303–353.