1. Introduction
The study of geometric means of matrices has become increasingly important in various fields, including numerical linear algebra, differential equations, and signal processing. Positive definite matrices arise naturally in many applications, such as in the discretization of differential equations via methods like finite differences or finite elements. These methods produce matrix-sequences whose sizes increase as the mesh becomes finer. An essential concept related to matrix-sequences is the asymptotic eigenvalue distribution in the Weyl sense, which is a classical subject of study; see e.g. [17,23,40]. Since the publication of the seminal papers by Tyrtyshnikov in 1996 [41] and Tilli in 1998 [38,39], there has been growing interest in this area, which eventually contributed to the development of the theory of Generalized Locally Toeplitz (GLT) sequences [35,36]. The increasing attention to this topic is not purely theoretical, as the asymptotic eigenvalue and singular value distributions have significant practical applications, particularly in the analysis of large-scale matrix computations, especially in the context of the numerical approximation of systems of (fractional) partial differential equations ((F)PDEs); see e.g. the books and review papers [7,8,20,21,22] and references therein. In fact, it is worth noticing that virtually any meaningful approximation technique for a (F)PDE leads to a GLT matrix-sequence, including finite differences, finite elements of any order, discontinuous Galerkin techniques, finite volumes, isogeometric analysis, etc.

More in detail, if we fix positive integers $d$ and $r$, then the set of $d$-level $r$-block GLT matrix-sequences forms a maximal *-algebra of matrix-sequences, isometrically equivalent to the maximal *-algebra of $2d$-variate $r\times r$ matrix-valued measurable functions defined on $[0,1]^d\times[-\pi,\pi]^d$. Furthermore, a $d$-level $r$-block GLT sequence $\{A_{\boldsymbol n}\}_n$ is uniquely associated with an $r\times r$ matrix-valued Lebesgue-measurable function $\kappa$, known as the GLT symbol, which is defined over the domain $[0,1]^d\times[-\pi,\pi]^d$. Notice that the set $[0,1]^d$ can be replaced by any bounded Peano-Jordan measurable subset of $\mathbb{R}^d$, as occurring with the notion of reduced GLT *-algebras; see [35, pp. 398-399, formula (59)] for the first occurrence, with applications to approximated PDEs on general non-Cartesian domains in $d$ dimensions, [36, Section 3.1.4] for the first formal proposal, and [5] for an exhaustive treatment, containing both the *-algebra theoretical results and a number of applications. The symbol provides a powerful tool for analyzing the singular value and eigenvalue distributions when the matrices $A_{\boldsymbol n}$ are Hermitian and part of a matrix-sequence of increasing size. The notation $\{A_{\boldsymbol n}\}_n\sim_{\mathrm{GLT}}\kappa$ indicates that $\{A_{\boldsymbol n}\}_n$ is a GLT sequence with symbol $\kappa$. Notably, the symbol of a GLT sequence is unique, in the sense that if $\{A_{\boldsymbol n}\}_n\sim_{\mathrm{GLT}}\kappa$ and $\{A_{\boldsymbol n}\}_n\sim_{\mathrm{GLT}}\xi$, then $\kappa=\xi$ almost everywhere in $[0,1]^d\times[-\pi,\pi]^d$ [7,8,20,21]. Furthermore, by the *-algebra structure, $\{A_{\boldsymbol n}\}_n\sim_{\mathrm{GLT}}\kappa$ and $\{B_{\boldsymbol n}\}_n\sim_{\mathrm{GLT}}\kappa$ imply that $\{A_{\boldsymbol n}-B_{\boldsymbol n}\}_n\sim_{\mathrm{GLT}}0$, i.e., the matrix-sequence $\{A_{\boldsymbol n}-B_{\boldsymbol n}\}_n$ is zero-distributed; the latter is very important for building explicit matrix-sequences that approximate a given GLT matrix-sequence and whose inversion is computationally cheap, in the context of preconditioning of large linear systems.
In certain physical applications, it is often necessary to represent the results of multiple experiments through a single average matrix $G$, where the data are represented by a set of positive definite matrices $A_1,\dots,A_p$. The arithmetic mean $\frac{1}{p}\sum_{k=1}^{p}A_k$ is not appropriate in these cases, because it does not fulfill the requirement that the inverse of the mean should coincide with the mean of the inverses $A_1^{-1},\dots,A_p^{-1}$. This property, which is crucial in certain physical models, is satisfied by the geometric mean. For positive real numbers $a_1,\dots,a_p$, the geometric mean is defined as $(a_1\cdots a_p)^{1/p}$, a concept that is extended to the case of matrices [10,26] in a nontrivial way, where the difficulty is of course the lack of commutativity. A well-known definition that satisfies desirable properties such as congruence invariance, permutation invariance, and consistency with the scalar geometric mean was proposed by ALM [1]. They defined the geometric mean of two Hermitian positive definite (HPD) matrices as
$$G(A,B)=A^{1/2}\big(A^{-1/2}BA^{-1/2}\big)^{1/2}A^{1/2}. \qquad (1)$$
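To fix ideas, the following minimal Python sketch (our own naming, assuming NumPy and SciPy are available) computes the two-matrix geometric mean (1) and verifies the consistency with the scalar case on a pair of commuting matrices.

```python
import numpy as np
from scipy.linalg import sqrtm, inv

def geomean2(A, B):
    """Geometric mean of two HPD matrices, formula (1):
    A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}."""
    As = sqrtm(A)                 # principal square root of A
    Asi = inv(As)                 # A^{-1/2}
    return As @ sqrtm(Asi @ B @ Asi) @ As

# Sanity check: for commuting HPD matrices the mean reduces to (AB)^{1/2}.
A = np.diag([2.0, 8.0])
B = np.diag([8.0, 2.0])
print(np.allclose(geomean2(A, B), 4.0 * np.eye(2)))   # True
```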
We recall the definition of functions applied to diagonalizable matrices, which we frequently utilize throughout our work. Suppose $A\in\mathbb{C}^{m\times m}$ is diagonalizable, meaning $A=MDM^{-1}$, where $D=\operatorname{diag}(\lambda_1,\dots,\lambda_m)$ is a diagonal matrix, and $f$ is a given function. In this case, $f(A)$ is defined as $f(A)=Mf(D)M^{-1}$, with $f(D)$ being the diagonal matrix whose diagonal entries are $f(\lambda_i)$, for $i=1,\dots,m$. In the case where $f$ is a multi-valued function, the same branch of $f$ must be chosen for any repeated eigenvalue.
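As a minimal illustration (our own naming; NumPy only), the definition can be implemented verbatim through the eigendecomposition:

```python
import numpy as np

def fun_of_matrix(A, f):
    """f(A) = M f(D) M^{-1} for a diagonalizable matrix A = M D M^{-1},
    with f applied entrywise to the diagonal of D."""
    lam, M = np.linalg.eig(A)     # columns of M are eigenvectors of A
    return M @ np.diag(f(lam)) @ np.linalg.inv(M)

# Example: the principal square root of an HPD matrix.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
R = fun_of_matrix(A, np.sqrt)
print(np.allclose(R @ R, A))      # True
```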
If $k>2$, then the ALM-mean is obtained through a recursive iteration process, where at each step the geometric mean of $k$ matrices is computed by reducing the problem to means of $k-1$ matrices. A significant limitation of this method is its linear convergence, which leads to a high computational cost due to the large number of iterations required at each recursive step. As a result, the computation of the ALM-mean using this approach becomes quite expensive: despite the elegance of the ALM geometric mean for two matrices, it is computationally infeasible to extend this formula to more than two matrices [14].
To overcome these limitations, the Karcher mean [13] was introduced as a generalization of the geometric mean to more than two matrices. The Karcher mean of HPD matrices $A_1,\dots,A_p$ is defined as the unique positive definite solution $X$ of the matrix equation
$$\sum_{k=1}^{p}\log\big(A_k^{-1}X\big)=0, \qquad (3)$$
as established by Moakher [26, Proposition 3.4]. This equation can be equivalently expressed in other forms, such as
$$\sum_{k=1}^{p}\log\big(X^{1/2}A_k^{-1}X^{1/2}\big)=0,$$
by utilizing the formula $M\log(K)M^{-1}=\log(MKM^{-1})$, which holds for any invertible matrix $M$ and any matrix $K$ with real positive eigenvalues. This formulation arises from Riemannian geometry, where the space of positive definite matrices forms a Riemannian manifold with non-positive curvature. The Karcher mean represents the center of mass (or barycenter) on this manifold [10]. In this manifold, the distance between two positive definite matrices $A$ and $B$ is defined as
$$\delta(A,B)=\big\|\log\big(A^{-1/2}BA^{-1/2}\big)\big\|_F,$$
where $\|\cdot\|_F$ denotes the Frobenius norm.
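In code, the geodesic distance can be evaluated directly from the definition; a minimal sketch under the same assumptions as before:

```python
import numpy as np
from scipy.linalg import sqrtm, logm, inv

def riemann_dist(A, B):
    """delta(A, B) = || log(A^{-1/2} B A^{-1/2}) ||_F for HPD A, B."""
    Asi = inv(sqrtm(A))           # A^{-1/2}
    return np.linalg.norm(logm(Asi @ B @ Asi), "fro")

# For commuting matrices the distance reduces to ||log(A^{-1} B)||_F.
A = np.diag([1.0, 4.0])
B = np.diag([4.0, 1.0])
print(np.isclose(riemann_dist(A, B), np.sqrt(2.0) * np.log(4.0)))  # True
```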
Several numerical methods have been proposed for solving the Karcher mean equation. Initially, fixed-point iteration methods were used, but these suffered from slow convergence, especially in cases where the matrices involved were poorly conditioned [26]. Later, methods based on gradient descent on the Riemannian manifold were introduced. A common iteration scheme for approximating the Karcher mean is
$$X_{\nu+1}=X_\nu^{1/2}\exp\Big(\theta_\nu\sum_{k=1}^{p}\log\big(X_\nu^{-1/2}A_kX_\nu^{-1/2}\big)\Big)X_\nu^{1/2},$$
where $X_\nu$ is the approximation at the $\nu$-th step, $\theta_\nu$ is a step length, and the exponential and logarithm are matrix functions. Although this method improves convergence, it can exhibit slow linear convergence in certain cases. For specific fixed choices of the step length $\theta_\nu$ and of the initial guess $X_0$, as considered in [25,26], the iteration can fail to converge for some sets of matrices. Furthermore, similar iterations have been proposed in [3,28], but without specific recommendations on choosing the initial value or the step length. While an optimal $\theta_\nu$ could theoretically be determined using a line search strategy, this approach is often computationally expensive. Heuristic strategies for selecting the step size, as discussed in [18], may result in slow convergence in many cases.
To further enhance the convergence rate, a Richardson-like iteration is employed. Indeed, the considered method improves the convergence by using a parameter $\theta$, which controls the step size in each iteration [13]. More precisely, given $X_0$ positive definite, the Richardson iteration is given by
$$X_{\nu+1}=T(X_\nu),\qquad T(X)=X-\theta\,X\sum_{k=1}^{p}\log\big(A_k^{-1}X\big),\qquad\nu\ge0. \qquad (5)$$
Any solution of Eq. (3) is a fixed point of the map $T$ in (5). The iterative formula can also be rewritten as
$$X_{\nu+1}=X_\nu^{1/2}\Big(I-\theta\sum_{k=1}^{p}\log\big(X_\nu^{1/2}A_k^{-1}X_\nu^{1/2}\big)\Big)X_\nu^{1/2}, \qquad (6)$$
provided that all the iterates $X_\nu$ remain positive definite. Equation (6) further shows that if $X_\nu$ is Hermitian, then $X_{\nu+1}$ is also a Hermitian matrix.
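A minimal sketch of iteration (6) follows (our own naming; the fixed step length and the arithmetic mean as initial guess are illustrative choices, not the adaptive strategy of [13]):

```python
import numpy as np
from scipy.linalg import sqrtm, logm, inv

def karcher_richardson(mats, theta=0.1, tol=1e-12, maxit=500):
    """Richardson iteration (6) for the Karcher mean of HPD matrices."""
    p = len(mats)
    invs = [inv(A) for A in mats]          # the A_k^{-1} are computed once
    X = sum(mats) / p                      # arithmetic mean as initial guess
    for _ in range(maxit):
        Xs = sqrtm(X)
        S = sum(logm(Xs @ Ai @ Xs) for Ai in invs)
        X_new = Xs @ (np.eye(X.shape[0]) - theta * S) @ Xs
        if np.linalg.norm(X_new - X, "fro") <= tol * np.linalg.norm(X, "fro"):
            return X_new
        X = X_new
    return X

# Commuting sanity check: the Karcher mean equals (A1 A2 A3)^{1/3}.
A1, A2, A3 = np.diag([1.0, 2.0]), np.diag([2.0, 4.0]), np.diag([4.0, 8.0])
print(np.allclose(karcher_richardson([A1, A2, A3]), np.diag([2.0, 4.0])))
```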
The parameter $\theta$ plays a crucial role in controlling the step size of each iteration, and choosing an optimal value of $\theta$ can significantly influence the convergence behavior of the iteration. If $\theta$ is small enough, the iteration is guaranteed to produce positive definite matrices and to converge towards the solution. In particular, when the matrices commute, a suitable choice of $\theta$ ensures at least quadratic convergence. More generally, the optimal value of $\theta$ can be determined based on the condition numbers of the matrices $G^{1/2}A_k^{-1}G^{1/2}$, where $G$ is the desired solution: the closer the eigenvalues of these matrices are to 1, the faster the convergence. This analysis guarantees that if the initial guess is close enough to the solution and is positive definite, the sequence generated by the iteration remains well-defined and converges to the desired solution. However, if the initial iterate is not positive definite, adjusting the value of $\theta$ or modifying the iteration scheme may be necessary to ensure that all iterates remain positive definite.
Numerical experiments show that selecting appropriate initial guesses, such as the arithmetic mean $\frac{1}{p}\sum_{k=1}^{p}A_k$ or the identity matrix $I$, can significantly affect the convergence rate. In particular, the cheap mean introduced in [12] provides a practical initial approximation that leads to faster convergence in many cases.
In our study, we extend the analysis of the Karcher mean to matrix-sequences, particularly those arising from the discretization of differential equations, which often form GLT sequences. Numerical results demonstrate that the geometric mean of not only two but also more than two GLT matrix-sequences is itself a GLT matrix-sequence, with the symbol of the new sequence given by the geometric mean of the original symbols: the latter is formally proven in the case of two GLT matrix-sequences. Regarding the examples, we consider either scalar unilevel and multilevel GLT matrix-sequences or block GLT asymptotic structures, with special attention to cases stemming from the approximation by local methods of differential operators. By analyzing the spectral distribution of the geometric mean of these matrix-sequences, we provide new insights into the asymptotic behavior of large-scale matrix computations and their potential applications in numerical analysis.
This paper is structured as follows. In Section 2, we introduce notations, terminology, and preliminary results concerning Toeplitz and GLT structures, which are essential for the mathematical formulation of the problem and its technical solution. In Section 3, we present the geometric mean of two matrices and the Karcher mean of more than two matrices, followed by a discussion of the iterative methods employed for their computation; the section contains the GLT theoretical results. Section 4 contains numerical experiments that illustrate the (asymptotic) spectral behavior of the geometric mean of GLT matrix-sequences, in both 1D and 2D settings and in both scalar and block cases. Finally, in Section 5 we draw conclusions and we highlight a few open problems.
2. Preliminaries
In this section, we provide the necessary tools for performing the spectral analysis of the matrices involved, based on the theory of unilevel and multilevel block GLT matrix-sequences.
2.1. Matrices and Matrix-Sequences
Given a square matrix $A\in\mathbb{C}^{m\times m}$, we denote by $A^*$ its conjugate transpose and by $A^\dagger$ the Moore–Penrose pseudoinverse of $A$. Recall that $A^\dagger=A^{-1}$ whenever $A$ is invertible. Regarding matrix norms, $\|\cdot\|$ refers to the spectral norm and, for $1\le p\le\infty$, the notation $\|\cdot\|_p$ stands for the Schatten $p$-norm, defined as the $p$-norm of the vector of the singular values. Note that the Schatten $\infty$-norm, which is equal to the largest singular value, coincides with the spectral norm $\|\cdot\|$; the Schatten 1-norm, being the sum of the singular values, is often referred to as the trace-norm; and the Schatten 2-norm coincides with the Frobenius norm. Schatten $p$-norms, as important special cases of unitarily invariant norms, are treated in detail in the wonderful book by Bhatia [9].
Finally, the expression matrix-sequence refers to any sequence of the form $\{A_n\}_n$, where $A_n$ is a square matrix of size $d_n$, with $d_n$ strictly increasing, so that $d_n\to\infty$ as $n\to\infty$. An $r$-block matrix-sequence, or simply a matrix-sequence if $r$ can be deduced from the context, is a special matrix-sequence in which the size of $A_n$ is $d_n=rn$, with $r$ fixed and $n$ strictly increasing.
2.2. Multi-Index Notation
To deal effectively with multilevel structures, it is necessary to use multi-indices, which are vectors of the form $\boldsymbol i=(i_1,\dots,i_d)\in\mathbb{Z}^d$. The related notation is listed below.
- $\boldsymbol 0,\boldsymbol 1,\boldsymbol 2,\dots$ are vectors of all zeroes, ones, twos, etc.
- $\boldsymbol h\le\boldsymbol k$ means that $h_j\le k_j$ for all $j=1,\dots,d$. In general, relations between multi-indices are evaluated componentwise.
- Operations between multi-indices, such as addition, subtraction, multiplication, and division, are also performed componentwise.
- The multi-index interval $[\boldsymbol h,\boldsymbol k]$ is the set $\{\boldsymbol j\in\mathbb{Z}^d:\boldsymbol h\le\boldsymbol j\le\boldsymbol k\}$. We always assume that the elements in an interval $[\boldsymbol h,\boldsymbol k]$ are ordered in the standard lexicographic manner.
- $\boldsymbol j=\boldsymbol h,\dots,\boldsymbol k$ means that $\boldsymbol j$ varies from $\boldsymbol h$ to $\boldsymbol k$, always following the lexicographic ordering.
- $\boldsymbol n\to\infty$ means that $\min(\boldsymbol n)=\min_{j=1,\dots,d}n_j\to\infty$.
- The product of all the components of $\boldsymbol n$ is denoted as $N(\boldsymbol n)=\prod_{j=1}^{d}n_j$.
A multilevel matrix-sequence is a matrix-sequence $\{A_{\boldsymbol n}\}_n$ such that $n$ varies in some infinite subset of $\mathbb{N}$, $\boldsymbol n=\boldsymbol n(n)$ is a multi-index in $\mathbb{N}^d$ depending on $n$, and $\boldsymbol n\to\infty$ when $n\to\infty$. This is typical of many approximations of differential operators in $d$ dimensions.
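For instance, the lexicographic ordering of a multi-index interval is exactly the order produced by `itertools.product` in Python (a small illustration, with our own choice of $\boldsymbol n$):

```python
from itertools import product

n = (2, 3)   # d = 2 levels
# Multi-indices 1 <= j <= n, generated in lexicographic order.
idx = list(product(*(range(1, nj + 1) for nj in n)))
print(idx)        # [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3)]
print(len(idx))   # 6 = N(n), the product of the components of n
```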
2.3. Singular Value and Eigenvalue Distributions of a Matrix-Sequence
Let $\mu_k$ be the Lebesgue measure in $\mathbb{R}^k$. Throughout this work, all the terminology from measure theory (such as "measurable set", "measurable function", "almost everywhere", etc.) always refers to the Lebesgue measure. Let $C_c(\mathbb{C})$ (resp., $C_c(\mathbb{R})$) be the space of continuous complex-valued functions with bounded support defined on $\mathbb{C}$ (resp., $\mathbb{R}$). If $A\in\mathbb{C}^{m\times m}$, the singular values and the eigenvalues of $A$ are denoted by $\sigma_1(A),\dots,\sigma_m(A)$ and $\lambda_1(A),\dots,\lambda_m(A)$, respectively. The set of the eigenvalues (i.e., the spectrum) of $A$ is denoted by $\Lambda(A)$.
Definition 1 (Singular value and eigenvalue distribution of a matrix-sequence). Let $\{A_n\}_n$ be a matrix-sequence, with $A_n$ of size $d_n$, and let $\psi:D\to\mathbb{C}^{r\times r}$ be a measurable function defined on a set $D\subset\mathbb{R}^k$ with $0<\mu_k(D)<\infty$.
- We say that $\{A_n\}_n$ has an (asymptotic) singular value distribution described by $\psi$, and we write $\{A_n\}_n\sim_\sigma\psi$, if, for every $F\in C_c(\mathbb{R})$,
$$\lim_{n\to\infty}\frac{1}{d_n}\sum_{i=1}^{d_n}F\big(\sigma_i(A_n)\big)=\frac{1}{\mu_k(D)}\int_D\frac{1}{r}\sum_{i=1}^{r}F\big(\sigma_i(\psi(\mathbf{x}))\big)\,\mathrm{d}\mathbf{x}.$$
- We say that $\{A_n\}_n$ has an (asymptotic) eigenvalue (or spectral) distribution described by $\psi$, and we write $\{A_n\}_n\sim_\lambda\psi$, if, for every $F\in C_c(\mathbb{C})$,
$$\lim_{n\to\infty}\frac{1}{d_n}\sum_{i=1}^{d_n}F\big(\lambda_i(A_n)\big)=\frac{1}{\mu_k(D)}\int_D\frac{1}{r}\sum_{i=1}^{r}F\big(\lambda_i(\psi(\mathbf{x}))\big)\,\mathrm{d}\mathbf{x}.$$
In this case, the function $\psi$ is referred to as the eigenvalue (or spectral) symbol of $\{A_n\}_n$; we write $\{A_n\}_n\sim_{\sigma,\lambda}\psi$ when both relations hold.
The same definition applies when the considered matrix-sequence shows a multilevel structure. In that case, $n$ is replaced by $\boldsymbol n$, uniformly in the limit relations, and $d_n$ by $d_{\boldsymbol n}$.
The informal meaning behind the spectral distribution definition is as follows: if $\psi$ is continuous, then a suitable ordering of the eigenvalues $\lambda_j(A_n)$, $j=1,\dots,d_n$, assigned in correspondence with an equispaced grid on $D$, reconstructs approximately the $r$ surfaces $\mathbf{x}\mapsto\lambda_i(\psi(\mathbf{x}))$, $i=1,\dots,r$. For example, in the simplest case where $r=1$ and $D=[0,1]$, the eigenvalues of $A_n$ are approximately equal (up to a few potential outliers) to the samples $\psi(x_j)$, where
$$x_j=\frac{j}{n},\qquad j=1,\dots,n.$$
If $r=2$ and $D=[0,1]$, the eigenvalues of $A_n$ are approximately equal (again up to a few potential outliers) to $\lambda_1(\psi(x_j))$ and $\lambda_2(\psi(x_j))$, where
$$x_j=\frac{j}{n},\qquad j=1,\dots,n.$$
If the considered structure is two-level, then the subscript is $\boldsymbol n=(n_1,n_2)$ and the grid is the two-dimensional equispaced grid with nodes $(j_1/n_1,j_2/n_2)$. Furthermore, for $d>2$ levels, a similar reasoning applies.
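This informal meaning can be checked numerically on the classical second-difference Toeplitz matrix, whose spectral symbol is $\psi(\theta)=2-2\cos\theta$; since the symbol is even, the grid can be taken on $(0,\pi)$, and for this particular matrix the match with the samples is exact up to rounding (script and naming are ours):

```python
import numpy as np
from scipy.linalg import toeplitz

n = 100
col = np.zeros(n); col[0], col[1] = 2.0, -1.0
A = toeplitz(col)                               # tridiag(-1, 2, -1)
eigs = np.sort(np.linalg.eigvalsh(A))
grid = np.arange(1, n + 1) * np.pi / (n + 1)    # equispaced grid on (0, pi)
samples = np.sort(2.0 - 2.0 * np.cos(grid))     # samples of the symbol
print(np.max(np.abs(eigs - samples)))           # ~1e-14: exact match here
```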
Finally, we report an observation which is useful in the following derivations.

Remark 1. The relations $\{A_n\}_n\sim_\lambda f$ and $\Lambda(A_n)\subseteq S$ for all $n$ imply that the range of $f$ is a subset of the closure of $S$. In particular, $\{A_n\}_n\sim_\lambda f$ with $A_n$ Hermitian and positive definite for all $n$ implies that $f$ is nonnegative definite almost everywhere, and simply nonnegative almost everywhere if $r=1$. The same applies when a multilevel matrix-sequence is considered, and similar statements hold when singular values are taken into account.
2.4. Approximating Classes of Sequences
In this subsection, we present the notion of the approximating class of sequences and a related key result.
Definition 2 (Approximating class of sequences).
Let $\{A_n\}_n$ be a matrix-sequence and let $\{\{B_{n,j}\}_n\}_j$ be a class of matrix-sequences, with $A_n$ and $B_{n,j}$ of size $d_n$. We say that $\{\{B_{n,j}\}_n\}_j$ is an approximating class of sequences (a.c.s.) for $\{A_n\}_n$ if the following condition is met: for every $j$, there exists $n_j$ such that, for every $n\ge n_j$,
$$A_n=B_{n,j}+R_{n,j}+N_{n,j},\qquad\operatorname{rank}(R_{n,j})\le c(j)\,d_n,\qquad\|N_{n,j}\|\le\omega(j),$$
where $n_j$, $c(j)$, and $\omega(j)$ depend only on $j$, and
$$\lim_{j\to\infty}c(j)=\lim_{j\to\infty}\omega(j)=0.$$
The notation $\{B_{n,j}\}_n\xrightarrow{\mathrm{a.c.s.}}\{A_n\}_n$ denotes that $\{\{B_{n,j}\}_n\}_j$ is an a.c.s. for $\{A_n\}_n$.
The following theorem expresses the related convergence theory and is a powerful tool used, for example, in the construction of the GLT *-algebra.

Theorem 1. Let $\{A_n\}_n$ and $\{B_{n,j}\}_n$, with $j\in\mathbb{N}$, be matrix-sequences and let $\psi,\psi_j$ be measurable functions defined on a set $D$ with positive and finite Lebesgue measure. Suppose that:
1. $\{B_{n,j}\}_n\sim_\sigma\psi_j$ for every $j$;
2. $\{B_{n,j}\}_n\xrightarrow{\mathrm{a.c.s.}}\{A_n\}_n$;
3. $\psi_j\to\psi$ in measure.
Then
$$\{A_n\}_n\sim_\sigma\psi.$$
Moreover, if all the involved matrices are Hermitian, the first assumption is replaced by $\{B_{n,j}\}_n\sim_\lambda\psi_j$ for every $j$, and the other two are left unchanged, then
$$\{A_n\}_n\sim_\lambda\psi.$$
We end this section by observing that the same definition can be given, and the corresponding results (with obvious changes) hold, when the involved matrix-sequences show a multilevel structure. In that case, $n$ is replaced by $\boldsymbol n$, uniformly with respect to $j$, and $d_n$ by $d_{\boldsymbol n}$.
2.5. Matrix-Sequences with Explicit or Hidden (Asymptotic) Structure
In this subsection, we introduce the three types of matrix structures that constitute the basic building blocks of the GLT *-algebras. To be more specific, for any positive integers $d$ and $r$ we consider the set of $d$-level $r$-block GLT matrix-sequences. For any such choice, the considered set forms a *-algebra of matrix-sequences, which is maximal and is isometrically equivalent to the maximal *-algebra of $2d$-variate $r\times r$ matrix-valued measurable functions (with respect to the Lebesgue measure) defined canonically over $[0,1]^d\times[-\pi,\pi]^d$; see [4,7,8,19,20,21] and references therein.

The reduced version is essential when dealing with approximations of integro-differential operators (also in fractional versions) defined over general (non-Cartesian) domains. The idea was presented in [35,36] and it was exhaustively developed in [5], where the GLT symbols are again measurable functions defined over $\Omega\times[-\pi,\pi]^d$, with $\Omega$ Peano-Jordan measurable and contained in $[0,1]^d$. Also the reduced versions form maximal *-algebras, isometrically equivalent to the corresponding maximal *-algebras of measurable functions. The considered GLT *-algebras represent rich examples of hidden (asymptotic) structures. Their building blocks are formed by two classes of explicit algebraic structures, the $d$-level $r$-block Toeplitz and sampling diagonal matrix-sequences (see Section 2.7 and Section 2.8), plus the asymptotic structures given by the zero-distributed matrix-sequences; see Section 2.6. It is worth noticing that the latter class plays the role of compact operators with respect to bounded linear operators, and in fact they form a two-sided ideal of matrix-sequences with respect to any of the GLT *-algebras.
2.6. Zero-Distributed Sequences
Zero-distributed sequences are defined as matrix-sequences $\{A_n\}_n$ such that $\{A_n\}_n\sim_\sigma0$. Note that, for any $r\ge1$, $\{A_n\}_n\sim_\sigma0$ is equivalent to $\{A_n\}_n\sim_\sigma O_r$, where $O_r$ is the $r\times r$ zero matrix. The following theorem (see [34] and [20]) provides a useful characterization for detecting this type of sequence. We use the natural convention $1/\infty=0$.
Theorem 2. Let $\{A_n\}_n$ be a matrix-sequence, with $A_n$ of size $d_n$. Then
- $\{A_n\}_n\sim_\sigma0$ if and only if $A_n=R_n+N_n$ with $\operatorname{rank}(R_n)/d_n\to0$ and $\|N_n\|\to0$ as $n\to\infty$;
- $\{A_n\}_n\sim_\sigma0$ if there exists $p\in[1,\infty]$ such that $\|A_n\|_p/d_n^{1/p}\to0$ as $n\to\infty$.
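As a toy illustration of the second criterion (with $p=2$, i.e., the Frobenius norm; the scaling is our own choice), consider random matrices with entries of order $1/n$:

```python
import numpy as np

# ||N_n||_2 stays O(1) while d_n^{1/2} = n^{1/2} grows, so the ratio
# vanishes and {N_n}_n is zero-distributed by Theorem 2.
rng = np.random.default_rng(0)
for n in (100, 400, 1600):
    N = rng.standard_normal((n, n)) / n
    print(n, np.linalg.norm(N, "fro") / np.sqrt(n))   # decreasing to 0
```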
As in Section 2.4, the same definition can be given, and the corresponding result (with obvious changes) holds, when the involved matrix-sequences show a multilevel structure. In that case, $n$ is replaced by $\boldsymbol n$ and $d_n$ by $d_{\boldsymbol n}$.
2.7. Multilevel Block Toeplitz Matrices
Given $\boldsymbol n\in\mathbb{N}^d$, a matrix of the form
$$\big[A_{\boldsymbol i-\boldsymbol j}\big]_{\boldsymbol i,\boldsymbol j=\boldsymbol 1}^{\boldsymbol n},$$
with blocks $A_{\boldsymbol k}\in\mathbb{C}^{r\times r}$, $\boldsymbol k\in[-(\boldsymbol n-\boldsymbol 1),\boldsymbol n-\boldsymbol 1]$, is called multilevel block Toeplitz, or, more precisely, $d$-level $r$-block Toeplitz matrix.

Given a matrix-valued function $f:[-\pi,\pi]^d\to\mathbb{C}^{r\times r}$ belonging to $L^1([-\pi,\pi]^d)$, the $\boldsymbol n$-th Toeplitz matrix associated with $f$ is defined as
$$T_{\boldsymbol n}(f)=\big[\hat f_{\boldsymbol i-\boldsymbol j}\big]_{\boldsymbol i,\boldsymbol j=\boldsymbol 1}^{\boldsymbol n},$$
where
$$\hat f_{\boldsymbol k}=\frac{1}{(2\pi)^d}\int_{[-\pi,\pi]^d}f(\boldsymbol\theta)\,\mathrm{e}^{-\mathrm{i}\,\boldsymbol k\cdot\boldsymbol\theta}\,\mathrm{d}\boldsymbol\theta,\qquad\boldsymbol k\in\mathbb{Z}^d,$$
are the Fourier coefficients of $f$, in which $\mathrm{i}$ denotes the imaginary unit, the integrals are computed componentwise, and $\boldsymbol k\cdot\boldsymbol\theta=k_1\theta_1+\dots+k_d\theta_d$. Equivalently, $T_{\boldsymbol n}(f)$ can be expressed as
$$T_{\boldsymbol n}(f)=\sum_{|j_1|<n_1}\cdots\sum_{|j_d|<n_d}J_{n_1}^{(j_1)}\otimes\cdots\otimes J_{n_d}^{(j_d)}\otimes\hat f_{(j_1,\dots,j_d)},$$
where $\otimes$ denotes the Kronecker tensor product between matrices and $J_m^{(l)}$ is the matrix of order $m$ whose $(i,j)$ entry equals 1 if $i-j=l$ and zero otherwise. $\{T_{\boldsymbol n}(f)\}_{\boldsymbol n\in\mathbb{N}^d}$ is the family of (multilevel block) Toeplitz matrices associated with $f$, which is called the generating function.
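For a scalar unilevel symbol ($d=r=1$), $T_n(f)$ can be assembled by approximating the Fourier coefficients with a quadrature rule on an equispaced grid; a minimal sketch (our own naming):

```python
import numpy as np
from scipy.linalg import toeplitz

def toeplitz_from_symbol(f, n, m=2048):
    """T_n(f) for a scalar generating function f on [-pi, pi]; the Fourier
    coefficients are approximated by the rectangle rule on m nodes."""
    theta = -np.pi + 2.0 * np.pi * np.arange(m) / m
    vals = f(theta)
    fk = [np.mean(vals * np.exp(-1j * k * theta)) for k in range(n)]
    fmk = [np.mean(vals * np.exp(1j * k * theta)) for k in range(n)]
    return toeplitz(fk, fmk)       # (i, j) entry is hat f_{i - j}

A = toeplitz_from_symbol(lambda t: 2.0 - 2.0 * np.cos(t), 5).real
print(np.round(A, 12))             # recovers tridiag(-1, 2, -1)
```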
2.8. Block Diagonal Sampling Matrices
Given $\boldsymbol n\in\mathbb{N}^d$, $d,r\ge1$, and a function $a:[0,1]^d\to\mathbb{C}^{r\times r}$, we define the multilevel block diagonal sampling matrix $D_{\boldsymbol n}(a)$ as the block diagonal matrix
$$D_{\boldsymbol n}(a)=\mathop{\operatorname{diag}}_{\boldsymbol i=\boldsymbol 1,\dots,\boldsymbol n}a\Big(\frac{\boldsymbol i}{\boldsymbol n}\Big),$$
where $\boldsymbol i/\boldsymbol n=(i_1/n_1,\dots,i_d/n_d)$ and the blocks are arranged according to the lexicographic ordering of Section 2.2.
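In the unilevel scalar case ($d=r=1$) the definition reduces to $D_n(a)=\operatorname{diag}(a(i/n),\,i=1,\dots,n)$; a one-line sketch (our own naming):

```python
import numpy as np

def diag_sampling(a, n):
    """Unilevel scalar sampling matrix D_n(a) = diag(a(1/n), ..., a(n/n))."""
    return np.diag(a(np.arange(1, n + 1) / n))

print(np.diag(diag_sampling(lambda x: 1.0 + x, 4)))   # [1.25 1.5 1.75 2. ]
```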
2.9. The *-Algebra of d-Level r-Block GLT Matrix-Sequences
Let $r\ge1$ be a fixed integer. A multilevel $r$-block GLT sequence, or simply a GLT sequence if we do not need to specify $r$, is a special multilevel $r$-block matrix-sequence $\{A_{\boldsymbol n}\}_n$ equipped with a measurable function $\kappa:[0,1]^d\times[-\pi,\pi]^d\to\mathbb{C}^{r\times r}$, called symbol. The symbol is essentially unique, in the sense that if $\kappa_1,\kappa_2$ are two symbols of the same GLT sequence, then $\kappa_1=\kappa_2$ almost everywhere. We write $\{A_{\boldsymbol n}\}_n\sim_{\mathrm{GLT}}\kappa$ to denote that $\{A_{\boldsymbol n}\}_n$ is a GLT sequence with symbol $\kappa$.
It can be proven that the set of multilevel block GLT sequences is the *-algebra generated by the three classes of sequences defined in Section 2.6, Section 2.7, and Section 2.8: zero-distributed, multilevel block Toeplitz, and block diagonal sampling matrix-sequences. The GLT class satisfies several algebraic and topological properties that are treated in detail in [7,8,20,21]. Here, we focus on the main operative properties, listed below, which represent a complete characterization of GLT sequences, equivalent to the full constructive definition.
GLT Axioms
GLT 1. If $\{A_{\boldsymbol n}\}_n\sim_{\mathrm{GLT}}\kappa$ then $\{A_{\boldsymbol n}\}_n\sim_\sigma\kappa$ in the sense of Definition 1, with $D=[0,1]^d\times[-\pi,\pi]^d$ and $k=2d$. Moreover, if each $A_{\boldsymbol n}$ is Hermitian, then $\{A_{\boldsymbol n}\}_n\sim_\lambda\kappa$, again in the sense of Definition 1 with $\psi=\kappa$.

GLT 2. We have
- $\{T_{\boldsymbol n}(f)\}_n\sim_{\mathrm{GLT}}\kappa(\mathbf{x},\boldsymbol\theta)=f(\boldsymbol\theta)$ if $f$ is in $L^1([-\pi,\pi]^d)$;
- $\{D_{\boldsymbol n}(a)\}_n\sim_{\mathrm{GLT}}\kappa(\mathbf{x},\boldsymbol\theta)=a(\mathbf{x})$ if $a:[0,1]^d\to\mathbb{C}^{r\times r}$ is Riemann-integrable;
- $\{Z_{\boldsymbol n}\}_n\sim_{\mathrm{GLT}}\kappa(\mathbf{x},\boldsymbol\theta)=O_r$ if and only if $\{Z_{\boldsymbol n}\}_n\sim_\sigma0$.

GLT 3. If $\{A_{\boldsymbol n}\}_n\sim_{\mathrm{GLT}}\kappa$ and $\{B_{\boldsymbol n}\}_n\sim_{\mathrm{GLT}}\xi$, then:
1. $\{A_{\boldsymbol n}^*\}_n\sim_{\mathrm{GLT}}\kappa^*$;
2. $\{\alpha A_{\boldsymbol n}+\beta B_{\boldsymbol n}\}_n\sim_{\mathrm{GLT}}\alpha\kappa+\beta\xi$ for all $\alpha,\beta\in\mathbb{C}$;
3. $\{A_{\boldsymbol n}B_{\boldsymbol n}\}_n\sim_{\mathrm{GLT}}\kappa\xi$;
4. $\{A_{\boldsymbol n}^\dagger\}_n\sim_{\mathrm{GLT}}\kappa^{-1}$, provided that $\kappa$ is invertible almost everywhere.

GLT 4. $\{A_{\boldsymbol n}\}_n\sim_{\mathrm{GLT}}\kappa$ if and only if there exist GLT sequences $\{B_{\boldsymbol n,j}\}_n\sim_{\mathrm{GLT}}\kappa_j$ such that $\{B_{\boldsymbol n,j}\}_n\xrightarrow{\mathrm{a.c.s.}}\{A_{\boldsymbol n}\}_n$ and $\kappa_j\to\kappa$ in measure.

GLT 5. If $\{A_{\boldsymbol n}\}_n\sim_{\mathrm{GLT}}\kappa$ and $A_{\boldsymbol n}=X_{\boldsymbol n}+Y_{\boldsymbol n}$, where
- every $X_{\boldsymbol n}$ is Hermitian,
- $\|X_{\boldsymbol n}\|,\|Y_{\boldsymbol n}\|\le C$ for some constant $C$ independent of $n$,
- $N(\boldsymbol n)^{-1}\|Y_{\boldsymbol n}\|_1\to0$,
then $\{A_{\boldsymbol n}\}_n\sim_\lambda\kappa$.

GLT 6. If $\{A_{\boldsymbol n}\}_n\sim_{\mathrm{GLT}}\kappa$ and each $A_{\boldsymbol n}$ is Hermitian, then $\{f(A_{\boldsymbol n})\}_n\sim_{\mathrm{GLT}}f(\kappa)$ for every continuous function $f:\mathbb{R}\to\mathbb{C}$.
Note that, by GLT 1, it is always possible to obtain the singular value distribution from the GLT symbol, while the eigenvalue distribution can only be deduced either if the involved matrices are Hermitian or if the related matrix-sequence is quasi-Hermitian in the sense of GLT 5.
3. Geometric Mean of GLT Matrix-Sequences
This section discusses the geometric mean of positive definite matrices, starting with the well-established case of two matrices and then considering multiple matrices. In particular we give distribution results in the case of GLT matrix-sequences.
3.1. Means of Two Matrices
The geometric mean of two positive numbers $a$ and $b$ is simply $\sqrt{ab}$, a fact well-known from basic arithmetic. However, extending this concept to HPD matrices introduces a number of challenges, as matrix multiplication is not commutative. The question of how to define a geometric mean for matrices in a way that preserves key properties such as congruence invariance, consistency with scalars, and symmetry was solved by ALM [1]. Their work presented an axiomatic approach to the geometric mean of two HPD matrices previously defined in Equation (1). ALM formalized the geometric mean for matrices by establishing a set of ten essential properties, known as the ALM axioms. These axioms ensure that the geometric mean behaves appropriately in the matrix setting. Here are three key properties:
1. Permutation invariance: $G(A,B)=G(B,A)$ for all HPD $A,B$.
2. Congruence invariance: $G(M^*AM,M^*BM)=M^*G(A,B)M$ for all HPD $A,B$ and all invertible matrices $M$.
3. Consistency with scalars: $G(A,B)=(AB)^{1/2}$ for all commuting $A,B$ (note that $(AB)^{1/2}$ is well defined for all commuting HPD $A,B$ because $\Lambda(AB)\subset(0,\infty)$ and $AB$ is similar to the HPD matrix $A^{1/2}BA^{1/2}$).
When considering sequences of matrices, particularly in the framework of GLT sequences, the geometric mean operation is well-behaved with respect to the GLT structure. If $d=r=1$ and we consider two scalar unilevel GLT matrix-sequences, that is, $\{A_n\}_n\sim_{\mathrm{GLT}}\kappa$ and $\{B_n\}_n\sim_{\mathrm{GLT}}\xi$, where $A_n,B_n$ are HPD matrices for every $n$, the matrix-sequence $\{G(A_n,B_n)\}_n$ of their geometric means also forms a scalar unilevel GLT matrix-sequence. The symbol of the resulting sequence is the geometric mean $(\kappa\xi)^{1/2}$ of the individual symbols $\kappa$ and $\xi$.
Theorem 3 ([20], Theorem 10.2). Let $d=r=1$. Suppose $\{A_n\}_n\sim_{\mathrm{GLT}}\kappa$ and $\{B_n\}_n\sim_{\mathrm{GLT}}\xi$, where $A_n,B_n$ are HPD for every $n$. Assume that at least one between $\kappa$ and $\xi$ is nonzero almost everywhere. Then
$$\{G(A_n,B_n)\}_n\sim_{\mathrm{GLT}}(\kappa\xi)^{1/2}$$
and
$$\{G(A_n,B_n)\}_n\sim_{\sigma,\lambda}(\kappa\xi)^{1/2}.$$
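As a minimal numerical sanity check of Theorem 3 (our own script; we pick a commuting pair so that the exact eigenvalues of the mean are available), the eigenvalues of $G(A_n,B_n)$ can be compared with the samples of $(\kappa\xi)^{1/2}$:

```python
import numpy as np
from scipy.linalg import toeplitz, sqrtm, inv

def geomean2(A, B):
    As = sqrtm(A); Asi = inv(As)
    return As @ sqrtm(Asi @ B @ Asi) @ As

n = 120
col = np.zeros(n); col[0], col[1] = 3.0, -1.0
An = toeplitz(col)          # T_n(3 - 2cos t), HPD, kappa = 3 - 2cos(theta)
Bn = 2.0 * np.eye(n)        # T_n(2), xi = 2
G = geomean2(An, Bn)

grid = np.arange(1, n + 1) * np.pi / (n + 1)
samples = np.sort(np.sqrt(2.0 * (3.0 - 2.0 * np.cos(grid))))
print(np.max(np.abs(np.sort(np.linalg.eigvalsh(G)) - samples)))   # ~1e-13
```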
The previous result is easily extended to the case of matrix-sequences resulting from the geometric mean of two block multilevel GLT matrix-sequences, thanks to the powerful *-algebra structure of the considered spaces described in Section 2.9. Indeed, the following two generalizations of Theorem 3 hold.
Theorem 4. Let $d\ge1$ and $r=1$. Suppose $\{A_{\boldsymbol n}\}_n\sim_{\mathrm{GLT}}\kappa$ and $\{B_{\boldsymbol n}\}_n\sim_{\mathrm{GLT}}\xi$, where $A_{\boldsymbol n},B_{\boldsymbol n}$ are HPD for every multi-index $\boldsymbol n$. Assume that at least one between $\kappa$ and $\xi$ is nonzero almost everywhere. Then
$$\{G(A_{\boldsymbol n},B_{\boldsymbol n})\}_n\sim_{\mathrm{GLT}}(\kappa\xi)^{1/2}$$
and
$$\{G(A_{\boldsymbol n},B_{\boldsymbol n})\}_n\sim_{\sigma,\lambda}(\kappa\xi)^{1/2}.$$
Proof. Since both $A_{\boldsymbol n}$ and $B_{\boldsymbol n}$ are positive definite for every multi-index $\boldsymbol n$, the matrix-sequence $\{G(A_{\boldsymbol n},B_{\boldsymbol n})\}_n$ is well defined according to formula (1), since $A_{\boldsymbol n}^{\pm1/2}$ is well defined for every multi-index $\boldsymbol n$. According to the assumption, we start with the case where $\kappa$ is nonzero almost everywhere. Hence the matrix-sequence $\{A_{\boldsymbol n}^{-1}\}_n$ is a GLT matrix-sequence with GLT symbol $\kappa^{-1}$ by Axiom GLT 3, part 4. Since the square root is continuous and well defined over positive definite matrices, the matrix-sequence $\{A_{\boldsymbol n}^{-1/2}\}_n$ is also a GLT matrix-sequence, with GLT symbol $\kappa^{-1/2}$, by virtue of Axiom GLT 6.

Now, using GLT 3, part 3, two times, we infer that $\{A_{\boldsymbol n}^{-1/2}B_{\boldsymbol n}A_{\boldsymbol n}^{-1/2}\}_n$ is a GLT matrix-sequence with GLT symbol $\kappa^{-1/2}\xi\kappa^{-1/2}=\kappa^{-1}\xi$, where $A_{\boldsymbol n}^{-1/2}B_{\boldsymbol n}A_{\boldsymbol n}^{-1/2}$ is positive definite because of the Sylvester inertia law. Hence the square root of $A_{\boldsymbol n}^{-1/2}B_{\boldsymbol n}A_{\boldsymbol n}^{-1/2}$ is well defined and, by exploiting again Axiom GLT 6, we deduce that $\{(A_{\boldsymbol n}^{-1/2}B_{\boldsymbol n}A_{\boldsymbol n}^{-1/2})^{1/2}\}_n$ is a GLT matrix-sequence with GLT symbol $(\kappa^{-1}\xi)^{1/2}$. Finally, by exploiting Axiom GLT 6 on $\{A_{\boldsymbol n}\}_n$, we have that $\{A_{\boldsymbol n}^{1/2}\}_n$ is a GLT matrix-sequence with GLT symbol $\kappa^{1/2}$, and the application of GLT 3, part 3, two times leads to the desired conclusion
$$\{G(A_{\boldsymbol n},B_{\boldsymbol n})\}_n=\big\{A_{\boldsymbol n}^{1/2}\big(A_{\boldsymbol n}^{-1/2}B_{\boldsymbol n}A_{\boldsymbol n}^{-1/2}\big)^{1/2}A_{\boldsymbol n}^{1/2}\big\}_n\sim_{\mathrm{GLT}}\kappa^{1/2}(\kappa^{-1}\xi)^{1/2}\kappa^{1/2}=(\kappa\xi)^{1/2},$$
where the latter and Axiom GLT 1 imply $\{G(A_{\boldsymbol n},B_{\boldsymbol n})\}_n\sim_{\sigma,\lambda}(\kappa\xi)^{1/2}$, given that every $G(A_{\boldsymbol n},B_{\boldsymbol n})$ is HPD.

Finally, the other case where $\xi$ is nonzero almost everywhere has the very same proof, since the geometric mean is invariant under permutations and hence $G(A_{\boldsymbol n},B_{\boldsymbol n})=G(B_{\boldsymbol n},A_{\boldsymbol n})$, so that the same steps can be repeated by exchanging $A_{\boldsymbol n}$ and $B_{\boldsymbol n}$. •
Theorem 5. Let $d,r\ge1$. Suppose $\{A_{\boldsymbol n}\}_n\sim_{\mathrm{GLT}}\kappa$ and $\{B_{\boldsymbol n}\}_n\sim_{\mathrm{GLT}}\xi$, where $A_{\boldsymbol n},B_{\boldsymbol n}$ are HPD for every multi-index $\boldsymbol n$. Assume that at least one between the minimal eigenvalue of $\kappa$ and the minimal eigenvalue of $\xi$ is nonzero almost everywhere. Then
$$\{G(A_{\boldsymbol n},B_{\boldsymbol n})\}_n\sim_{\mathrm{GLT}}G(\kappa,\xi)=\kappa^{1/2}\big(\kappa^{-1/2}\xi\kappa^{-1/2}\big)^{1/2}\kappa^{1/2}$$
and
$$\{G(A_{\boldsymbol n},B_{\boldsymbol n})\}_n\sim_{\sigma,\lambda}G(\kappa,\xi).$$
Furthermore, $G(\kappa,\xi)=(\kappa\xi)^{1/2}$ whenever the GLT symbols $\kappa$ and $\xi$ commute.
Proof. The case of $r=1$ is already contained in Theorem 4, so we assume $r>1$, i.e., a true block GLT setting. The proof is in fact a repetition of that of the previous theorem, with the only care to be taken in the GLT symbol computations, where the multiplication is noncommutative.

Since both $A_{\boldsymbol n}$ and $B_{\boldsymbol n}$ are positive definite for every multi-index $\boldsymbol n$, the matrix-sequence $\{G(A_{\boldsymbol n},B_{\boldsymbol n})\}_n$ is well defined according to formula (1), since $A_{\boldsymbol n}^{\pm1/2}$ is well defined for every multi-index $\boldsymbol n$. According to the assumption, we start with the case where $\kappa$ is invertible almost everywhere, so that $\{A_{\boldsymbol n}^{-1}\}_n\sim_{\mathrm{GLT}}\kappa^{-1}$ by Axiom GLT 3, part 4, and $\{A_{\boldsymbol n}^{-1/2}\}_n\sim_{\mathrm{GLT}}\kappa^{-1/2}$ thanks to Axiom GLT 6.

Now, using GLT 3, part 3, two times, we have $\{A_{\boldsymbol n}^{-1/2}B_{\boldsymbol n}A_{\boldsymbol n}^{-1/2}\}_n\sim_{\mathrm{GLT}}\kappa^{-1/2}\xi\kappa^{-1/2}$, where $A_{\boldsymbol n}^{-1/2}B_{\boldsymbol n}A_{\boldsymbol n}^{-1/2}$ is positive definite because of the Sylvester inertia law. Hence the square root of $A_{\boldsymbol n}^{-1/2}B_{\boldsymbol n}A_{\boldsymbol n}^{-1/2}$ is well defined and, by exploiting again Axiom GLT 6, we obtain $\{(A_{\boldsymbol n}^{-1/2}B_{\boldsymbol n}A_{\boldsymbol n}^{-1/2})^{1/2}\}_n\sim_{\mathrm{GLT}}(\kappa^{-1/2}\xi\kappa^{-1/2})^{1/2}$. Finally, by exploiting Axiom GLT 6 on the matrix-sequence $\{A_{\boldsymbol n}\}_n$ and using GLT 3, part 3, two times, we conclude
$$\{G(A_{\boldsymbol n},B_{\boldsymbol n})\}_n\sim_{\mathrm{GLT}}\kappa^{1/2}\big(\kappa^{-1/2}\xi\kappa^{-1/2}\big)^{1/2}\kappa^{1/2}, \qquad (11)$$
where the symbol $\kappa^{1/2}(\kappa^{-1/2}\xi\kappa^{-1/2})^{1/2}\kappa^{1/2}$ is exactly $G(\kappa,\xi)$. Furthermore, relation (11) and Axiom GLT 1 imply
$$\{G(A_{\boldsymbol n},B_{\boldsymbol n})\}_n\sim_{\sigma,\lambda}G(\kappa,\xi),$$
where $G(\kappa,\xi)=(\kappa\xi)^{1/2}$ whenever $\kappa$ and $\xi$ commute.

Finally, the remaining case where $\xi$ is invertible almost everywhere has the very same proof, since the geometric mean is invariant under permutations and hence $G(A_{\boldsymbol n},B_{\boldsymbol n})=G(B_{\boldsymbol n},A_{\boldsymbol n})$, so that the same steps can be repeated by exchanging $A_{\boldsymbol n}$ with $B_{\boldsymbol n}$ and $\kappa$ with $\xi$. •
3.2. Mean of More Than Two Matrices
In this section, we describe the iterative method used to compute the Karcher mean of more than two HPD matrices. The Karcher mean is an extension of the geometric mean to more than two matrices and can be computed using an iterative method based on the Richardson-like iteration. As detailed earlier in the introduction (see Equation (6)), the iteration updates the approximation at each step based on the logarithmic correction term. The step-size parameter $\theta_\nu$ is dynamically computed during the iteration, ensuring that the process converges efficiently by accounting for the condition numbers of the matrices involved.
Specifically, $\theta_\nu$ is computed from the extreme eigenvalues of the matrices $G^{1/2}A_k^{-1}G^{1/2}$, $k=1,\dots,p$, where $G$ is the current approximation of the Karcher mean; we refer to [13] for the precise expression of the resulting step length and for the related convergence analysis.
The Richardson-like iteration can be implemented in different equivalent forms; the basic one reads
$$X_{\nu+1}=X_\nu-\theta\,X_\nu\sum_{k=1}^{p}\log\big(A_k^{-1}X_\nu\big). \qquad (15)$$
Among the equivalent formulations, Equation (15) is the most practical for implementation: it avoids inversions of the iterates, which can introduce numerical instabilities and increase the computational complexity, and it retains the simplicity of direct matrix operations without introducing unnecessary complications. The formulations that iterate on the inverses aim to reduce the number of matrix inversions, but their final step requires inverting the result, which can lead to inaccuracies, especially for poorly conditioned matrices. A numerically more efficient approach uses the Cholesky factorization to reduce the computational cost of forming matrix square roots at every step, enhancing efficiency, since forming the Cholesky factor costs less than computing a full matrix square root [24].
Suppose $X_\nu=R_\nu^*R_\nu$ is the Cholesky decomposition of $X_\nu$, where $R_\nu$ is an upper triangular matrix. The iteration step can be rewritten as
$$X_{\nu+1}=R_\nu^*\Big(I-\theta\sum_{k=1}^{p}\log\big(R_\nu A_k^{-1}R_\nu^*\big)\Big)R_\nu.$$
In this formulation, the Cholesky factor $R_\nu$ is updated at each iteration. The condition number of the Cholesky factor $R_\nu$ in the spectral norm is the square root of the condition number of $X_\nu$, thus ensuring good numerical accuracy. For this heuristic to be effective, it is essential that $X_0$ provides a good approximation of $G$. Therefore, selecting $X_0$ as the cheap mean is critical. An adaptive version of this iteration has been proposed and implemented in the Matrix Means Toolbox [11].
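A one-step sketch of this Cholesky-based variant (our own naming; SciPy's `cholesky` returns the upper triangular factor $R$ with $X=R^*R$):

```python
import numpy as np
from scipy.linalg import cholesky, logm

def karcher_step_chol(X, inv_mats, theta):
    """One step X -> R^* (I - theta * sum_k log(R A_k^{-1} R^*)) R,
    where X = R^* R and inv_mats holds the precomputed A_k^{-1}."""
    R = cholesky(X)                                   # upper triangular
    S = sum(logm(R @ Ai @ R.conj().T) for Ai in inv_mats)
    return R.conj().T @ (np.eye(X.shape[0]) - theta * S) @ R
```

The design point is that the congruences $R\,A_k^{-1}R^*$ keep all the arguments of the matrix logarithm Hermitian positive definite, while only a triangular factorization of the iterate is required.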
Of course, the Richardson-like iteration is relevant for computing the Karcher mean efficiently; we also exploit its formal expression for theoretical purposes when dealing with sequences of matrices, particularly those involving GLT sequences. For the theory, we come back to relation (15) and we consider the associated iteration with fixed step length,
$$X_{\nu+1}=X_\nu-\theta\,X_\nu\sum_{k=1}^{p}\log\big(A_k^{-1}X_\nu\big),\qquad\nu\ge0, \qquad (20)$$
with $X_0$ a given positive definite matrix. We know that $X_\nu$ converges to the geometric mean of $A_1,\dots,A_p$ as $\nu$ tends to infinity, for every fixed positive definite initial guess $X_0$.
Fix $p\ge2$. Suppose now that the block multilevel matrix-sequences $\{A_{\boldsymbol n}^{(k)}\}_n\sim_{\mathrm{GLT}}\kappa_k$, $k=1,\dots,p$, are given, where the matrices $A_{\boldsymbol n}^{(k)}$ are positive definite for every multi-index $\boldsymbol n$. Due to the positive definiteness of the matrices and because of $\{A_{\boldsymbol n}^{(k)}\}_n\sim_{\mathrm{GLT}}\kappa_k$, from Axiom GLT 1 it follows that each $\kappa_k$ is nonnegative definite almost everywhere (see Remark 1).

In this setting, it is conjectured that the sequence of Karcher means $\{G(A_{\boldsymbol n}^{(1)},\dots,A_{\boldsymbol n}^{(p)})\}_n$ forms a new GLT matrix-sequence whose symbol is the geometric mean of the individual symbols $\kappa_1,\dots,\kappa_p$, specifically $(\kappa_1\cdots\kappa_p)^{1/p}$ if all the symbols commute and the Karcher mean $G(\kappa_1,\dots,\kappa_p)$ in the general case. In order to attack the problem, the initial guess matrix-sequence $\{X_0^{(\boldsymbol n)}\}_n$ must be of GLT type with nonnegative definite GLT symbol. In this way, thanks to (20) and using the GLT axioms in the way it is done in Theorem 5, we easily deduce that, for every fixed $\nu$, $\{X_\nu^{(\boldsymbol n)}\}_n$ is still a GLT matrix-sequence, with a symbol $\kappa^{(\nu)}$ obtained by applying the very same iteration to the symbols $\kappa_1,\dots,\kappa_p$ and converging to $G(\kappa_1,\dots,\kappa_p)$ as $\nu$ tends to infinity.

Finally, Theorem 1 and Axiom GLT 4 could be applied if we proved that $\{\{X_\nu^{(\boldsymbol n)}\}_n\}_\nu$ is an a.c.s. for the limit sequence $\{G(A_{\boldsymbol n}^{(1)},\dots,A_{\boldsymbol n}^{(p)})\}_n$. This could be proven using Schatten estimates like those in the second item of Theorem 2, but at the moment this is not easy, because the known convergence proofs for the Karcher iterations are all based on pointwise convergence.