A Simplified Algorithm for a Full-Rank Update Quasi-Newton Method

Preprint (not peer-reviewed). Submitted: 27 December 2024. Posted: 30 December 2024.
Abstract
One of the most common ideas for finding the zero of a nonlinear function is to replace it by a series of suitably chosen linear functions, whose zeros can easily be determined and whose sequence of zeros approximates the zero of the nonlinear function. Widely used classic methods are Newton’s method and a large family of quasi-Newton methods (secant, Broyden’s, discretized Newton, Steffensen’s, ...). This strategy can be called “linearization”, and such methods may be called “linearization methods”. An efficient linearization method for solving a system of nonlinear equations was developed which showed good stability and convergence properties. It uses an unconventional and simple strategy to improve the performance of classic methods by full-rank update of the Jacobian approximates. It can be considered both as a discretized Newton’s method and as a quasi-Newton method with full-rank update of the Jacobian approximates. The solution to the secant equation presented in [2] was based on the Wolfe-Popper procedure [29,42]: the secant equation was split into two equations by introducing an auxiliary variable. A simplified algorithm is given in this paper for the full-rank update procedure described in [2]. It directly solves the secant equation with the pseudo-inverse of the Jacobian approximate matrix. Numerical examples are shown for demonstration purposes. The convergence and efficiency of the suggested method are discussed and compared with the convergence and efficiency of classic linearization methods.

1. Introduction

A mathematical model $\phi(\omega, x)$ with some parameters $x = \{x_i\}$, $i = 1, \ldots, n$, is constructed for an observed system which gives an observable response $D = \{D_j\}$, sampled at locations $\omega_j$, $j = 1, \ldots, m$ ($m > n$), to an observable external effect. It is assumed that the simulated system response $\phi(\omega, x)$ is sensitive to the perturbations of all parameters $x_i$. The $n$ adjustable parameters $x$ of the mathematical model are determined so that the distance

$$l_q = \| f(x) \|_q = \| \phi(\omega, x) - D \|_q \qquad (1.1)$$

between the observed and the simulated system responses ($D_j$ and $\phi(\omega, x)$) is minimized, where $l_q : \mathbb{R}^n \to \mathbb{R}$ is a $q$-norm

$$l_q = \left( \sum_{j=1}^{m} \left| f_j(x) \right|^q \right)^{1/q} \qquad (1.2)$$

If $q = 2$ is chosen, then we get the Euclidean or least-squares norm

$$l_2 = \left( \sum_{j=1}^{m} f_j(x)^2 \right)^{1/2} \qquad (1.3)$$

that is well computable and safely usable for a great class of problems. Minimizing the distance (1.1) then leads to the problem of minimizing the least-squares norm

$$l_2 = \| f(x) \|_2 = \| \phi(\omega, x) - D \|_2 \qquad (1.4)$$

($x \in \mathbb{R}^n$, $f : \mathbb{R}^n \to \mathbb{R}^m$ and $n < m$). The minimum value of this norm is zero, but for “real life” problems it is not reachable in most cases due to modeling inaccuracies and measurement noise. The suggested procedure gives a least-squares solution to the overdetermined system of nonlinear equations

$$f(x) = 0 \qquad (1.5)$$

where the solution $x^*$ minimizes the least-squares norm (1.4) of the nonlinear residual function

$$f(x) = \phi(\omega, x) - D \qquad (1.6)$$
Least-squares minimizations are made successively within the iteration steps. The basic concept of solving Equation (1.5) is that the function $f(x)$ is “linearized” by repeatedly replacing it with a linear function $y_p = y(x_p)$ as

$$f(x) \approx y_p = f_p + J_p \left( x - x_p \right) \qquad (1.7)$$

for Newton’s method, where $f_p = f(x_p)$ is the function value, $J_p = J(x_p)$ is the Jacobian matrix of the function $f(x)$ ($p$ is the iteration counter), and

$$f(x) \approx y_p = f_p + S_p \left( x - x_p \right) \qquad (1.8)$$

for the quasi-Newton methods, where $S_p = S(x_p)$ is the finite-difference Jacobian approximate of the function $f(x)$. The nonlinear problem is then solved by solving a series of linear problems

$$y_p = 0 \qquad (1.9)$$

and it follows that the new approximate $x_{p+1}$ to the solution $x^*$ can be given from the equations

$$J_p \, \Delta x_p = -f_p \qquad (1.10)$$

(Newton’s method) and

$$S_p \, \Delta x_p = -f_p \qquad (1.11)$$

(quasi-Newton methods), where the iteration stepsize is

$$\Delta x_p = x_{p+1} - x_p \qquad (1.12)$$

in both cases.
The system of simultaneous multi-variable nonlinear Equations (1.5) can be solved by Newton’s method when the derivatives $J_p$ of $f(x)$ are available analytically, and a new iterate $x_{p+1}$ can then be determined. In many cases, explicit formulas for the function $f(x)$ are not available and the Jacobian $J_p$ can only be approximated. The partial derivatives of the analytic Jacobian may be replaced by suitable finite-difference quotients (discretized Newton’s iteration [12,25]) with properly chosen stepsizes. For most problems, Newton’s method using analytic derivatives and Newton’s method using properly chosen divided differences are virtually indistinguishable [12]. However, the determination of the finite-difference stepsizes is not clearly defined. The suggested full-rank update procedure helps to overcome this deficit.
Let $S_p$ be a full column rank matrix (all column vectors of matrix $S_p$ are linearly independent). As $n < m$ (there are more rows than columns), $S_p$ is not invertible and an exact solution to the overdetermined system of Equation (1.11) does not exist. However, $S_p^T S_p$ is invertible and the pseudo-inverse

$$S_p^+ = \left( S_p^T S_p \right)^{-1} S_p^T \qquad (1.13)$$

of matrix $S_p$ is a unique left inverse, which allows finding a least-squares approximate solution $\tilde{x}$ to Equation (1.11). The pseudo-inverse can be determined in different ways (e.g. rank factorization [14], singular value decomposition [31]). Singular value decomposition is a widely used technique, and it has the advantage that it gives a solution even if $S_p$ is singular (not a full column rank matrix). Matrix $S_p$ can be factorized as

$$S_p = U_p \Sigma_p V_p^T \qquad (1.14)$$

where $U_p$ ($m \times n$ matrix) and $V_p$ ($n \times n$ matrix) are orthogonal matrices with orthonormal column eigenvectors, and

$$\Sigma_p = \mathrm{diag}\left( \sigma_{i,p} \right) \qquad (1.15)$$

$i = 1, \ldots, n$, with $\sigma_{i,p}$ the singular values of matrix $S_p$ and $\sigma_1 \geq \cdots \geq \sigma_n$. Then the least-squares solution to Equation (1.11) exists and the iteration stepsize is

$$\Delta x_p = -S_p^+ f_p \qquad (1.16)$$

where

$$S_p^+ = V_p \Sigma_p^{-1} U_p^T \qquad (1.17)$$

is the pseudo-inverse of $S_p$ and

$$\Sigma_p^{-1} = \mathrm{diag}\left( \frac{1}{\sigma_{i,p}} \right) \qquad (1.18)$$

The pseudo-inverse $S_p^+$ gives a unique least-squares solution to Equation (1.11), for which the $q = 2$ norm (Euclidean norm) (1.4) of the residual function (1.6) is minimal. If the rank of matrix $S_p$ is less than $n$ (the column vectors of $S_p$ are not linearly independent and some singular values are zero), then a unique solution to Equation (1.11) does not exist. The spectral condition number

$$\kappa\left( S_p \right) = \frac{\sigma_1}{\sigma_n} \geq 1 \qquad (1.19)$$

of matrix $S_p$ measures the linear dependency of the column vectors. If they are linearly independent (none of the singular values are zero) and the condition number $\kappa$ is not much larger than one, then the matrix is well-conditioned. This is the desirable situation throughout the whole iteration process. Then the column vectors of matrix $S_p$ are in “general positions” and the Ortega-Rheinboldt condition [25] is satisfied. If $\kappa \gg 1$, then the matrix is ill-conditioned and the solution $\Delta x_p$ (iteration stepsize) to Equation (1.11) may be sensitive to small changes in $S_p$ or $f_p$.
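As a concrete illustration, the following is a minimal NumPy sketch (not part of the original paper) of the SVD-based least-squares step (1.16)-(1.18); it also reports the spectral condition number (1.19):

```python
import numpy as np

def secant_step_svd(S, f):
    """Least-squares step dx solving S dx = -f via the SVD pseudo-inverse."""
    U, sigma, Vt = np.linalg.svd(S, full_matrices=False)   # S = U diag(sigma) V^T (1.14)
    S_pinv = Vt.T @ np.diag(1.0 / sigma) @ U.T             # S^+ = V diag(1/sigma) U^T (1.17)
    kappa = sigma[0] / sigma[-1]                           # condition number (1.19)
    return -S_pinv @ f, kappa                              # stepsize (1.16)
```

The same step could also be obtained with `np.linalg.pinv` or `np.linalg.lstsq`; the explicit SVD form is used here only to mirror Equations (1.14)-(1.18).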
Classic quasi-Newton linearization methods generate a sequence of improved iterates $x_p$ ($p = 0, 1, 2, \ldots$), so that the next approximate $x_{p+1}$ to the solution $x^*$ is determined on the basis of a rank-one update $S_{p+1}$ of the Jacobian approximate $S_p$. Such a linearization update is not unique when only a single new approximate $x_{p+1}$ exists. In the single-variable case ($n = 1$), there are two possible updates (secant lines), between which the selection is not obvious (see details in Section 3). In the multi-variable case, only a rank-one update is possible if a single new approximate $x_{p+1}$ exists.
The solution to Equation (1.11) presented in [2] and applied to the identification of physically nonlinear dynamic systems [3,4,5] was based on the Wolfe-Popper procedure [29,42]. The secant equation (1.11) was split into two equations by introducing an auxiliary variable. The simplified solution presented in Section 4 directly solves Equation (1.11) with the pseudo-inverse (1.13) of matrix $S_p$. The simplified algorithm given in Section 5 provides an unconventional and simple strategy to improve the performance of Newton and quasi-Newton iterations for the solution of the system of nonlinear equations (1.5). It can be considered both as a discretized Newton’s method and as a quasi-Newton method with full-rank update of the Jacobian approximates.

The rate of convergence $\alpha$ of the classic secant method equals the golden section ratio $\varphi$ ($\alpha = \varphi \approx 1.618$) for a simple root in the single-variable case. The suggested method gives an additional new independent approximate $x_{p+1}^B$ to the solution $x^*$, so that a rank-$n$ linearization update can be made from the classic approximate $x_{p+1}$ (denoted by $x_{p+1}^A$ in what follows) and the suggested new approximate $x_{p+1}^B$. The results of single- and multi-variable numerical examples are given in Section 6. The efficiency of the proposed method is discussed in Section 7 and compared with other classic rank-one update and line-search methods on the basis of available test data. Concluding remarks are summarized in Section 8.

2. Notations

Vectors and matrices are denoted by bold-face letters. Subscripts refer to components of vectors and matrices; superscripts $A$ and $B$ refer to interpolation base points. Notations $A$ and $B$ are introduced to be able to clearly distinguish between the two new approximates $x^A$ and $x^B$. Vectors and matrices may also be given by their general elements. $\Delta$ refers to the difference of two elements. $x$ and $X$ denote unknown quantities, $f$ and $F$ denote function values and matrices, $t$ and $T$ denote multiplier scalars and scaling transformation matrices. $e$, $\varepsilon$ and $E$ denote approximate errors, $p$ is the iteration counter, $\alpha$ is the convergence rate, and $\varepsilon^*$ is the termination criterion. $n$ is the number of unknowns, $m$ is the number of function values; $i$, $j$ and $k$ are running indices of matrix columns and rows. Superscripts $S$ and $TS$ refer to the traditional secant method and to the suggested full-rank update method (T-Secant) respectively.

3. Linearization Methods

The origin of the secant method can be traced back to ancient times, to the “rule of double false position” described in the 18th century B.C. on the Egyptian Rhind Papyrus [28]; it predates Newton’s method by more than 3000 years. It is also well known that the local convergence rates of the classic Newton and classic secant methods for simple roots are quadratic and super-linear respectively [8,11,26,38]. Weerakoon and Fernando’s third-order Newton variant [40] involves the evaluation of one function value and two derivatives, similarly to Traub’s suggestion [38]. Secant method variations with improved convergence have been reported in the literature [13,15,16,17,23,27,33,34,40]. Muller’s method [24] uses a quadratic approximation. A class of variations employs two function value evaluations and one derivative evaluation [37]. Third-order methods are obtained by a second derivative evaluation, such as Halley’s method [23] and the “super Halley” method [1]. Further improvements with higher-degree polynomial approximations are proposed by Chen [9], Kanwar [17], Zhang [43] and Wang [39].

3.1. Single-Variable Case

The zero $x^*$ of a scalar nonlinear function $f(x)$ ($x \mapsto f(x)$, $x \in \mathbb{R}^1$, $f : \mathbb{R}^1 \to \mathbb{R}^1$) has to be determined, where

$$f(x^*) = 0 \qquad (3.1)$$

Linearization means that $f(x)$ is locally replaced by a linear function $y(x)$ ($x \mapsto y(x)$, $x \in \mathbb{R}^1$, $y : \mathbb{R}^1 \to \mathbb{R}^1$), a secant or tangent line, as shown in Figure 1 (left), and operations are made on the linear function $y_p(x)$ in the $p$-th iteration. $y_p(x)$ may be defined through two points $A_p(x_p^A, f_p^A)$ and $B_p(x_p^B, f_p^B)$ of the nonlinear function $f(x)$, where $f_p^A = f(x_p^A)$ and $f_p^B = f(x_p^B)$ are function values, or through one point $A_p(x_p^A, f_p^A)$ and the “slope”

$$S_p = \frac{\Delta f_p}{\Delta x_p} \qquad (3.2)$$

of the secant line $y_p(x)$, where $\Delta x_p = x_p^B - x_p^A$ and $\Delta f_p = f_p^B - f_p^A$ are differences, or by the “slope”

$$S_p = f'(x_p^A) = \left. \frac{df}{dx} \right|_{x_p^A} \qquad (3.3)$$

of the tangent line $y_p(x)$ at $x_p^A$, where $f'$ is the derivative function of $f(x)$. The successive local replacement of the nonlinear function $f(x)$ by a secant or tangent line $y_p(x)$ gives a simple and efficient numerical root-finding procedure. It follows from the condition

$$y_p(x_{p+1}^A) = 0 \qquad (3.4)$$

that the zero

$$x_{p+1}^A = x_p^A - \frac{f_p^A}{S_p} \qquad (3.5)$$

of the linear function $y_p(x)$ approximates the zero $x^*$ of the nonlinear function $f(x)$, and the new point $A_{p+1}(x_{p+1}^A, f_{p+1}^A)$ can be determined for the next iteration, as shown in Figure 1 (right). With the iteration step length

$$\Delta x_p^A = x_{p+1}^A - x_p^A \qquad (3.6)$$

from point $A_p$, Equation (3.5) can be rewritten as

$$S_p \, \Delta x_p^A = -f_p^A \qquad (3.7)$$

which is called the secant equation; we may also call it the “linearization” equation, as it includes Newton’s method as a special case (tangent line). The secant-based procedure has the advantage that it does not need the calculation of function derivatives; it only uses function values, and the order of asymptotic convergence is super-linear with convergence rate $\alpha^S \approx 1.618$. However, the “slope” of the updated secant line $y_{p+1}(x)$ can be determined in two ways from the secant (quasi-Newton) conditions, as

$$S_{p+1} \, \Delta x_p^A = f_{p+1}^A - f_p^A \qquad (3.8)$$

or

$$S_{p+1} \, \Delta x_p^B = f_{p+1}^A - f_p^B \qquad (3.9)$$

where

$$\Delta x_p^B = x_{p+1}^A - x_p^B \qquad (3.10)$$

is the iteration step length from point $B_p$. The choice is far from obvious, as shown in Figure 1 (right). The tangent-line-based iteration (Newton’s method) has the advantage that the “slope”

$$S_{p+1} = f'(x_{p+1}^A) \qquad (3.11)$$

of the updated tangent line is always well-defined. The iteration then continues with the updated secant or tangent line $y_{p+1}(x)$.
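For reference, here is a minimal sketch of the classic secant iteration (3.2)-(3.7); keeping the two most recent points is one common convention, and the final call reproduces the first steps of the Section 6.1 example:

```python
import math

def secant(f, xA, xB, tol=1e-10, max_iter=50):
    fA, fB = f(xA), f(xB)
    for _ in range(max_iter):
        S = (fB - fA) / (xB - xA)     # secant slope (3.2)
        x_new = xA - fA / S           # zero of the secant line (3.5)
        xA, fA = xB, fB               # keep the two most recent points
        xB, fB = x_new, f(x_new)
        if abs(fB) < tol:
            break
    return xB

root = secant(lambda x: math.cos(x) - x, -2.0, 2.0)   # test function (6.1), root ~0.7390851
```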

3.2. Multi-Variable Case

The scalar linearization procedure can be extended to multiple dimensions. The zero $x^*$ of a nonlinear vector-valued function

$$f(x) = \begin{bmatrix} f_1(x) \\ \vdots \\ f_m(x) \end{bmatrix} \qquad (3.12)$$

of $n$ variables

$$x = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} \qquad (3.13)$$

($x \mapsto f(x)$, $x \in \mathbb{R}^n$, $f : \mathbb{R}^n \to \mathbb{R}^m$, $n \leq m$) has to be determined, where

$$f(x^*) = 0 \qquad (3.14)$$

Then $f(x)$ is locally replaced by an $n$-dimensional hyperplane $y(x)$ ($x \mapsto y(x)$, $x \in \mathbb{R}^n$, $y : \mathbb{R}^n \to \mathbb{R}^m$, $n \leq m$) and operations are made on the hyperplane $y_p(x)$ in the $p$-th iteration. $y_p(x)$ may be defined through $n + 1$ points $A_p(x_p^A, f_p^A)$ and $B_{k,p}(x_{k,p}^B, f_{k,p}^B)$ ($k = 1, \ldots, n$) of the nonlinear function $f(x)$, where $f_p^A = f(x_p^A)$ and $f_{k,p}^B = f(x_{k,p}^B)$ ($k = 1, \ldots, n$) are function values, or by one point $A_p(x_p^A, f_p^A)$ and by the “slope” (divided differences)

$$S_p = \Delta F_p \, \Delta X_p^{-1} \qquad (3.16)$$

($i = 1, \ldots, n$, $j = 1, \ldots, m$, $k = 1, \ldots, n$) of the $n$-dimensional secant hyperplane $y_p(x)$, where $\Delta x_{k,p} = x_{k,p}^B - x_p^A$ and $\Delta f_{k,p} = f_{k,p}^B - f_p^A$ are difference vectors with $n$ and $m$ components respectively, and

$$\Delta F_p = \left[ \Delta f_{k,p} \right] = \left[ f_{k,p}^B - f_p^A \right] = \begin{bmatrix} \Delta f_{1,1,p} & \cdots & \Delta f_{n,1,p} \\ \vdots & & \vdots \\ \Delta f_{1,m,p} & \cdots & \Delta f_{n,m,p} \end{bmatrix} \qquad (3.17)$$

$$\Delta X_p = \left[ \Delta x_{k,p} \right] = \left[ x_{k,p}^B - x_p^A \right] = \begin{bmatrix} \Delta x_{1,1,p} & \cdots & \Delta x_{n,1,p} \\ \vdots & & \vdots \\ \Delta x_{1,n,p} & \cdots & \Delta x_{n,n,p} \end{bmatrix} \qquad (3.18)$$

The $n$-dimensional tangent hyperplane $y_p(x)$ through point $A_p(x_p^A, f_p^A)$ may be defined by the “slope”

$$S_p = \left[ \frac{\partial f_{j,p}}{\partial x_{k,p}} \right] = \begin{bmatrix} \frac{\partial f_{1,p}}{\partial x_{1,p}} & \cdots & \frac{\partial f_{1,p}}{\partial x_{n,p}} \\ \vdots & & \vdots \\ \frac{\partial f_{m,p}}{\partial x_{1,p}} & \cdots & \frac{\partial f_{m,p}}{\partial x_{n,p}} \end{bmatrix} \qquad (3.19)$$
that corresponds to the Jacobian matrix $J_p$ of the nonlinear vector-valued function $f(x)$; accordingly, Definition (3.16) corresponds to the approximation of the Jacobian matrix. It follows from the condition

$$y_p(x_{p+1}^A) = 0 \qquad (3.20)$$

that the zero

$$x_{p+1}^A = x_p^A - S_p^+ f_p^A \qquad (3.21)$$

of the $n$-dimensional hyperplane $y(x)$ approximates the zero $x^*$ of the nonlinear vector-valued function $f(x)$, where $(\cdot)^+$ stands for the pseudo-inverse. Then the $i$-th element of the new approximate $x_{p+1}^A$ in the $p$-th iteration will be

$$x_{i,p+1}^A = x_{i,p}^A - \sum_{j=1}^{m} S_{i,j,p}^+ \, f_{j,p}^A \qquad (3.22)$$

($i = 1, \ldots, n$, $j = 1, \ldots, m$). With the iteration stepsize

$$\Delta x_p^A = x_{p+1}^A - x_p^A = -S_p^+ f_p^A = \left[ -\sum_{j=1}^{m} S_{i,j,p}^+ \, f_{j,p}^A \right]_i \qquad (3.23)$$

Equation (3.21) can be rewritten as

$$S_p \, \Delta x_p^A = -f_p^A \qquad (3.24)$$

which is called the “secant equation” or “linearization equation” in the multi-variable case. With the new approximate $x_{p+1}^A$ and the corresponding function value $f_{p+1}^A$, an updated Jacobian or Jacobian approximate $S_{p+1}$ has to be determined. Table 1 summarizes the basic equations of the above detailed linearization methods.
If all partial derivatives of the function $f(x)$ are known, then the Jacobian update can easily be calculated. Newton’s method is one of the most widely used algorithms, with very attractive theoretical and practical properties and with some limitations. The computational cost of Newton’s method is high, since the Jacobian $J_p$ and the solution to the linear system (3.24) must be computed at each iteration. It is well known that the local convergence of Newton’s method is q-quadratic if the initial trial approximate $x_0$ is close enough to the solution $x^*$, if $J(x^*)$ is non-singular and if $J(x)$ satisfies the Lipschitz condition

$$\left\| J(x) - J(x^*) \right\| \leq L \left\| x - x^* \right\| \qquad (3.25)$$

for all $x$ close enough to $x^*$.
However, in many cases the function $f(x)$ is not an analytical function, the partial derivatives are not known or are difficult to evaluate, and Newton’s method cannot be applied. Quasi-Newton methods are widely used for solving systems of nonlinear equations when the Jacobian is not known or is difficult to determine. The Jacobian is approximated by the divided differences (3.16), and the system of nonlinear equations (1.5) is solved by repeatedly solving the systems of linear equations (3.24) providing the new approximates $x_{p+1}^A$.

The Jacobian approximate (3.16) should be updated according to the fundamental equation of the quasi-Newton methods (the “quasi-Newton condition” or “secant condition”)

$$S_{p+1} \, \Delta x_p^A = f_{p+1}^A - f_p^A \qquad (3.26)$$

for all $p = 0, 1, 2, \ldots$. However, condition (3.26) does not uniquely specify the update $S_{p+1}$, so the iterative procedure is not well-defined and further constraints are needed. Different methods offer their own specific solutions, but a single new quasi-Newton approximate $x_{p+1}$ will never allow a full-rank update of the Jacobian approximate $S_{p+1}$ (Equation (3.26) is an underdetermined system of $m$ linear equations with $m \, n$ unknowns). A large family of methods is available with different additional conditions or assumptions. Martínez [21] made a thorough survey of the family of practical quasi-Newton methods.
The partial derivatives of the Jacobian matrix may be replaced by suitable difference quotients (discretized Newton iteration, see [12,25])

$$\frac{\partial f_j(x)}{\partial x_k} \approx \frac{f_j\left( x + \Delta x_k \, d_k \right) - f_j(x)}{\Delta x_k} = \frac{\Delta f_j(x)}{\Delta x_k} \qquad (3.27)$$

($k = 1, \ldots, n$, $j = 1, \ldots, m$) with $n$ additional function value evaluations, where $d_k$ is the $k$-th Cartesian unit vector. However, it is difficult to choose the stepsizes $\Delta x_k$. If any $\Delta x_k$ is too large, then Expression (3.27) can be a bad approximation to the Jacobian, so the iteration converges much more slowly, if it converges at all. On the other hand, if any $\Delta x_k$ is too small, then $\Delta f_j(x) \approx 0$, and cancellations can occur, which reduces the accuracy of the difference quotients (3.27) (see [35]). Another modification is the inexact-Newton approach, where the linear Newton equation is solved by an iterative linear solver (see [6,10,22]). Wolfe [42] and Popper [29] suggested a column update of matrix (3.17).
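For illustration, here is a minimal sketch of the discretized-Newton Jacobian approximation (3.27); the stepsize heuristic (square root of machine epsilon, scaled by $|x_k|$) is an assumption of this sketch, not the paper’s prescription:

```python
import numpy as np

def fd_jacobian(f, x):
    """Approximate the m x n Jacobian of f at x by forward differences (3.27)."""
    x = np.asarray(x, dtype=float)
    fx = np.asarray(f(x))
    # heuristic stepsizes ~ sqrt(machine eps), scaled by |x_k| (an assumption)
    h = np.sqrt(np.finfo(float).eps) * np.maximum(1.0, np.abs(x))
    J = np.empty((fx.size, x.size))
    for k in range(x.size):
        xk = x.copy()
        xk[k] += h[k]                                # perturb along the unit vector d_k
        J[:, k] = (np.asarray(f(xk)) - fx) / h[k]    # difference quotient (3.27)
    return J
```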
One of the most widely used formulas, the Broyden formula [7], makes a rank-one update of the Jacobian approximate as

$$S_{p+1} = S_p + Q_p = S_p + \frac{\left( \Delta f_p^A - S_p \, \Delta x_p^A \right) \left( \Delta x_p^A \right)^T}{\left\| \Delta x_p^A \right\|^2} \qquad (3.28)$$

where $\Delta f_p^A = f_{p+1}^A - f_p^A$ and $Q_p$ is a rank-one matrix; by using the Sherman-Morrison formula, the inverse Jacobian approximate update is given as

$$S_{p+1}^+ = S_p^+ + \frac{\left( \Delta x_p^A - S_p^+ \, \Delta f_p^A \right) \left( \Delta x_p^A \right)^T S_p^+}{\left( \Delta x_p^A \right)^T S_p^+ \, \Delta f_p^A} \qquad (3.29)$$

Broyden’s secant condition (3.26) can be rewritten as

$$S_p \, \Delta x_p^A + Q_p \, \Delta x_p^A = f_{p+1}^A - f_p^A \qquad (3.30)$$

and with the secant Equation (3.24) we have

$$Q_p \, \Delta x_p^A = f_{p+1}^A \qquad (3.31)$$

As $Q_p$ is a rank-one matrix, this equation has infinitely many solutions (the left nullspace of matrix $Q_p$ is an $(n-1)$-dimensional vector space [36]), the Ortega-Rheinboldt condition [25] will not be satisfied (the column vectors of matrix $S_p$ should be linearly independent and have to be “in general position” throughout the whole iteration process), and the result may be an ill-conditioned Broyden update $S_{p+1}$.
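For comparison with the full-rank update introduced in the next section, a sketch of Broyden’s rank-one update (3.28) together with its Sherman-Morrison inverse form (3.29) follows; a square, well-conditioned case is assumed for simplicity:

```python
import numpy as np

def broyden_update(S, S_inv, dx, df):
    """Rank-one Broyden update of S and of its inverse after a step dx,
    where df = f_new - f_old."""
    S_new = S + np.outer(df - S @ dx, dx) / (dx @ dx)            # Equation (3.28)
    v = S_inv @ df                                               # S^+ df
    S_inv_new = S_inv + np.outer(dx - v, dx @ S_inv) / (dx @ v)  # Equation (3.29)
    return S_new, S_inv_new
```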

4. T-Secant Method

A numerical procedure has been developed for solving an overdetermined system of nonlinear equations (1.5). It can be considered both as a discretized Newton’s method and as a quasi-Newton method with full-rank update of the Jacobian approximates. The suggested full-rank update procedure (the “T-Secant” method [2]) provides an unconventional and simple strategy to improve the performance of quasi-Newton iterations. It is based on the classic secant linearization, which is completed with a new independent approximate $x_{p+1}^B$. The secant equation is modified by a suitably chosen non-uniform scaling transformation, and the new independent approximate $x_{p+1}^B$ is determined from the modified secant equation. The solution to the secant equation presented in [2] was based on the Wolfe-Popper procedure [29,42], so that the secant equation was split into two equations by introducing an auxiliary variable. The simplified algorithm presented in this paper directly solves the secant equation by using the pseudo-inverse of the transformed Jacobian approximate.

4.1. Single-Variable Case

The suggested procedure uses the information on the improvement of the classic secant procedure and gives a new approximate $x_{p+1}^B$ in the vicinity of the classic secant approximate $x_{p+1}^A$, as follows. Two initial approximates $x_p^A$ and $x_p^B$ with function values $f_p^A$ and $f_p^B$ are given, and the new approximate $x_{p+1}^A$ with function value $f_{p+1}^A$ is known from the solution of the classic secant Equation (3.7). An independent new approximate

$$x_{p+1}^B = x_{p+1}^A + \Delta x_{p+1} \qquad (4.1)$$

is to be determined in the vicinity of the classic secant approximate $x_{p+1}^A$. Let the ratio

$$t_p^f = \frac{f_{p+1}^A}{f_p^A} \qquad (4.2)$$

($f_p^A \neq 0$) of the function value improvement and the ratio

$$t_p^x = \frac{x_{p+1}^B - x_{p+1}^A}{x_{p+1}^A - x_p^A} = \frac{\Delta x_{p+1}}{\Delta x_p^A} \qquad (4.3)$$

($\Delta x_p^A \neq 0$) of the desired iteration stepsize change be defined. The single-variable secant equation (3.7) is then modified as

$$S_{T,p} \, \Delta x_p^A = -f_p^A \qquad (4.4)$$

where

$$S_{T,p} = \frac{t_p^f}{t_p^x} \, S_p \qquad (4.5)$$

Then the new approximate $x_{p+1}^B$ can be expressed from Equation (4.4):

$$\frac{t_p^f \left( x_{p+1}^A - x_p^A \right)}{x_{p+1}^B - x_{p+1}^A} \, S_p \, \Delta x_p^A = -f_p^A \qquad (4.6)$$

After rearrangement,

$$x_{p+1}^A - x_{p+1}^B = \frac{t_p^f}{f_p^A} \, S_p \left( \Delta x_p^A \right)^2 \qquad (4.7)$$

and with the secant Equation (3.7),

$$x_{p+1}^B = x_{p+1}^A - \frac{t_p^f}{f_p^A} \, S_p \left( \Delta x_p^A \right)^2 = x_{p+1}^A + t_p^f \, \Delta x_p^A \qquad (4.8)$$

and the desired new iteration stepsize

$$\Delta x_{p+1} = t_p^f \, \Delta x_p^A \qquad (4.9)$$

can be given. The suggested procedure can also be applied to Newton’s method (T-Newton method). Then the “slope” $S_p$ corresponds to the derivative $f'$ of the function $f(x)$.
The geometrical representation of the suggested method can be derived as follows [2]. Let the new approximate $x_{p+1}^B$ be the zero of a function $z_p(x)$:

$$z_p(x_{p+1}^B) = 0 \qquad (4.10)$$

Then replacing $x_{p+1}^B$ with $x$ in Equation (4.8) gives the function

$$z_p(x) = \frac{t_p^f \, S_p \left( \Delta x_p^A \right)^2}{x - x_{p+1}^A} + f_p^A = 0 \qquad (4.11)$$

with zero at

$$x = x_{p+1}^B \qquad (4.12)$$

Equation (4.11) is a hyperbolic function with vertical and horizontal asymptotes $x_{p+1}^A$ and $f_p^A$, and its root $x_{p+1}^B$ will be in the vicinity of $x_{p+1}^A$, at an “appropriate distance” that is regulated by the function value $f_{p+1}^A$ (see Figure 2). This virtue of the suggested procedure provides an automatic mechanism for keeping the actual approximates $x_p^A$ and $x_p^B$ in general positions throughout the whole iteration process, ensuring stable and efficient numerical performance. The suggested method’s performance is demonstrated with the results of numerical tests on test functions in Section 6.
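A minimal sketch of one scalar T-secant step follows, combining the classic secant zero (3.5) with the additional approximate (4.8); the printed values reproduce the first step of the Section 6.1 example:

```python
import math

def t_secant_step(f, xA, xB):
    """One scalar T-secant step: returns the pair (x_{p+1}^A, x_{p+1}^B)."""
    fA, fB = f(xA), f(xB)
    S = (fB - fA) / (xB - xA)               # secant slope (3.2)
    xA_new = xA - fA / S                    # classic secant approximate (3.5)
    t_f = f(xA_new) / fA                    # improvement ratio (4.2)
    xB_new = xA_new + t_f * (xA_new - xA)   # T-secant approximate (4.8)-(4.9)
    return xA_new, xB_new

print(t_secant_step(lambda x: math.cos(x) - x, -2.0, 2.0))   # ~(-0.416, 0.915)
```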

4.2. Multi-Variable Case

Let two independent approximates $x_p^A = [x_{k,p}^A]$ and $x_p^B = [x_{k,p}^B]$ to the zero $x^*$ of the nonlinear vector-valued function $f(x)$ be given in the $p$-th iteration ($k = 1, \ldots, n$). Let the series of $n$ approximates $x_{k,p}^B$ be constructed by individually incrementing the elements $x_{k,p}^A$ of the approximate $x_p^A$ by an increment

$$\Delta x_{k,p} = x_{k,p}^B - x_{k,p}^A \qquad (4.13)$$

as

$$x_{k,p}^B = x_p^A + \Delta x_{k,p} \, d_k \qquad (4.14)$$

where $d_k$ is the $k$-th Cartesian unit vector. It follows from this special construction of the approximates $x_{k,p}^B$ that $x_{k,i,p}^B - x_{k,p}^A = 0$ for $i \neq k$ and $x_{k,i,p}^B - x_{k,p}^A = \Delta x_{k,p}$ for $i = k$, and matrix (3.18) will be a diagonal matrix:

$$\Delta X_p = \left[ \Delta x_{k,p} \right] = \left[ x_{k,p}^B - x_p^A \right] = \begin{bmatrix} \Delta x_{1,p} & & 0 \\ & \ddots & \\ 0 & & \Delta x_{n,p} \end{bmatrix} = \mathrm{diag}\left( \Delta x_{k,p} \right) \qquad (4.15)$$

Let the ratios

$$T_p^F = \mathrm{diag}\left( t_{j,p}^F \right) = \mathrm{diag}\left( \frac{f_{j,p+1}^A}{f_{j,p}^A} \right) \qquad (4.16)$$

($f_{j,p}^A \neq 0$) of the function value improvements ($j = 1, \ldots, m$) and the ratios

$$T_p^X = \mathrm{diag}\left( t_{i,p}^X \right) = \mathrm{diag}\left( \frac{x_{i,p+1}^B - x_{i,p+1}^A}{x_{i,p+1}^A - x_{i,p}^A} \right) = \mathrm{diag}\left( \frac{\Delta x_{i,p+1}}{\Delta x_{i,p}^A} \right) \qquad (4.17)$$

($\Delta x_{i,p}^A \neq 0$) of the desired iteration stepsize changes ($i = 1, \ldots, n$) be defined, similarly to the single-variable case. Let the secant equation (3.24) be modified as

$$S_{T,p} \, \Delta x_p^A = -f_p^A \qquad (4.18)$$

where

$$S_{T,p} = T_p^F \, S_p \left( T_p^X \right)^{-1} = \mathrm{diag}\left( t_{j,p}^F \right) \left[ \frac{\Delta f_{k,j,p}}{\Delta x_{i,p}} \right] \mathrm{diag}\left( \frac{1}{t_{i,p}^X} \right) = \left[ \frac{t_{j,p}^F}{t_{i,p}^X} \frac{\Delta f_{k,j,p}}{\Delta x_{i,p}} \right] \qquad (4.19)$$

($i = 1, \ldots, n$, $j = 1, \ldots, m$, $k = 1, \ldots, n$), or in explicit form:

$$S_{T,p} = \begin{bmatrix} \frac{f_{1,p+1}^A}{f_{1,p}^A} & & 0 \\ & \ddots & \\ 0 & & \frac{f_{m,p+1}^A}{f_{m,p}^A} \end{bmatrix} \begin{bmatrix} \frac{\Delta f_{1,1,p}}{\Delta x_{1,p}} & \cdots & \frac{\Delta f_{n,1,p}}{\Delta x_{n,p}} \\ \vdots & & \vdots \\ \frac{\Delta f_{1,m,p}}{\Delta x_{1,p}} & \cdots & \frac{\Delta f_{n,m,p}}{\Delta x_{n,p}} \end{bmatrix} \begin{bmatrix} \frac{\Delta x_{1,p}^A}{\Delta x_{1,p+1}} & & 0 \\ & \ddots & \\ 0 & & \frac{\Delta x_{n,p}^A}{\Delta x_{n,p+1}} \end{bmatrix} \qquad (4.20)$$
and

$$S_{T,p}^+ = T_p^X \, S_p^+ \left( T_p^F \right)^{-1} \qquad (4.21)$$

is the pseudo-inverse of $S_{T,p}$. Then Equation (4.18) can be rewritten as

$$\Delta x_p^A = -S_{T,p}^+ \, f_p^A \qquad (4.22)$$

Then the $i$-th element $x_{i,p+1}^B$ of the new approximate $x_{p+1}^B$ in the $p$-th iteration will be

$$x_{i,p+1}^B = x_{i,p+1}^A + \Delta x_{i,p+1} = x_{i,p+1}^A - \frac{\left( \Delta x_{i,p}^A \right)^2}{\sum_{j=1}^{m} S_{i,j,p}^+ \frac{f_{j,p}^A}{t_{j,p}^F}} \qquad (4.23)$$

where $t_{j,p}^F \neq 0$, $j = 1, \ldots, m$ and $i = 1, \ldots, n$. Let the ratios

$$\mu_{i,p} = \frac{\sum_{j=1}^{m} S_{i,j,p}^+ \, f_{j,p}^A}{\sum_{j=1}^{m} S_{i,j,p}^+ \frac{f_{j,p}^A}{t_{j,p}^F}} = \frac{-\Delta x_{i,p}^A}{\sum_{j=1}^{m} S_{i,j,p}^+ \frac{f_{j,p}^A}{t_{j,p}^F}} \qquad (4.24)$$

be introduced. By using Equation (3.23), the new iteration stepsize can be expressed from Equation (4.23) as

$$\Delta x_{i,p+1} = x_{i,p+1}^B - x_{i,p+1}^A = -\frac{\left( \Delta x_{i,p}^A \right)^2}{\sum_{j=1}^{m} S_{i,j,p}^+ \frac{f_{j,p}^A}{t_{j,p}^F}} = \mu_{i,p} \, \Delta x_{i,p}^A \qquad (4.25)$$

Table 2 summarizes the basic equations of the above detailed multi-variable update method. The suggested procedure can also be applied to Newton’s method (T-Newton method); then matrix $S_p$ corresponds to the Jacobian matrix $J_p$ of the function $f(x)$. The function value improvement parameters $t_{j,p}^F$ ($j = 1, \ldots, m$) in Equation (4.24), defined in (4.16), have a key role in the suggested iteration process, and the absolute value of the denominator $f_{j,p}^A$ has to be bounded from below by a pre-given value $T_{min}$ to avoid division by zero. When the values $f_{j,p}^A$ tend to zero with increasing iteration counter $p$, then

$$\mu_{i,p} \to T_{min} \qquad (4.26)$$

Figure 3 shows the effect of the iteration parameter $T_{min}$ on the iteration process ($N = 1000$). It is clear that too high a value of $T_{min}$ has a negative effect on the iteration efficiency. It can be seen in the example that the value $0.001$ causes the iteration process to fail, while values $T_{min} \leq 0.0001$ have almost no effect.
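A sketch of one multi-variable T-secant update (4.22)-(4.25) follows, assuming the Jacobian approximate $S_p$ (3.16) has already been formed and clamping the ratios $t^F$ at a lower bound `T_min` as discussed above; all names and defaults are illustrative:

```python
import numpy as np

def t_secant_update(f, xA, S, T_min=1e-4):
    """One multi-variable T-secant update: returns x_{p+1}^A and x_{p+1}^B."""
    fA = np.asarray(f(xA))
    S_pinv = np.linalg.pinv(S)                   # pseudo-inverse S_p^+
    dxA = -S_pinv @ fA                           # classic stepsize (3.23)
    xA_new = xA + dxA                            # classic approximate (3.21)
    tF = np.asarray(f(xA_new)) / fA              # improvement ratios (4.16)
    tF = np.where(np.abs(tF) < T_min, np.copysign(T_min, tF), tF)
    dx_new = -dxA**2 / (S_pinv @ (fA / tF))      # new stepsize (4.23)-(4.25)
    return xA_new, xA_new + dx_new               # x_{p+1}^A, x_{p+1}^B
```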
The suggested procedure may also be repeated without using the classic secant approximates and without updating the previously known Jacobian approximate $S_p$, as

$$\Delta x_{i,p+1+h} = x_{i,p+1+h}^B - x_{i,p+1+h}^A = -\frac{\left( \sum_{j=1}^{m} S_{i,j,p}^+ \, f_{j,p+h}^A \right)^2}{\sum_{j=1}^{m} S_{i,j,p}^+ \frac{f_{j,p+h}^A}{t_{j,p+h}^F}} \qquad (4.27)$$

where

$$t_{j,p+h}^F = \frac{f_{j,p+1+h}^A}{f_{j,p+h}^A} \qquad (4.28)$$

($h = 1, \ldots$), until the new approximates

$$x_{i,p+1+h}^B = x_{i,p+1+h}^A + \Delta x_{i,p+1+h} \qquad (4.29)$$

($i = 1, \ldots, n$) are sufficiently improved. A numerical example is shown in Table 3 with the test function

$$f(x) = x^3 - 2x - 5 \qquad (4.30)$$

for demonstration purposes. The results indicate linear convergence with convergence rate $\alpha = 1$. In Table 3, $\alpha$ is the computed convergence rate, $N_f$ is the number of function value evaluations and $L$ is the mean convergence rate suggested by Broyden [7].

5. Algorithm

Let $\varepsilon^*$ be the error bound for the termination criterion; if $x^*$ is known, then

$$e_p^A = x_p^A - x^* \qquad (5.1)$$

is the error vector of approximate $x_p^A$ in the $p$-th iteration, with elements $e_{i,p}^A$ ($i = 1, \ldots, n$). Let the error norm

$$\varepsilon_p = \frac{\left\| e_p^A \right\|_2}{\sqrt{n}} = \sqrt{\frac{\sum_{i=1}^{n} \left( e_{i,p}^A \right)^2}{n}} \qquad (5.2)$$

be defined, where $\| \cdot \|_2$ is the Euclidean norm, and let the iteration be terminated when

$$\varepsilon_p < \varepsilon^* \qquad (5.3)$$

holds. Choose $T_{min}$ as a lower bound for $t_{j,p}^F$ ($j = 1, \ldots, m$) and let $f_{min}$ be a lower bound for $\left| f_{j,p}^A \right|$ ($j = 1, \ldots, m$). Let $p = 0$ and let the approximate $x_p^A$ and the difference vector $\Delta x_p$ be given. Calculate the corresponding function values $f_p^A$ and assure that $f_{min} < \left| f_{j,0}^A \right|$ ($j = 1, \ldots, m$). The iteration constants $f_{min}$ and $T_{min}$ are necessary to avoid division by zero and to avoid computed values near the numerical precision.
  • Step 1: Generate the set of $n$ additional approximates $x_{k,p}^B$ (Equation (4.14)) and evaluate the function values $f_{k,p}^B$ ($k = 1, \ldots, n$). Assure that $f_{min} < \left| f_{k,j,p}^B \right|$ ($j = 1, \ldots, m$).
  • Step 2 (classic secant update method): Construct the Jacobian approximate matrix $S_p$, determine its pseudo-inverse $S_p^+$ [31], calculate $x_{p+1}^A$ from Equation (3.21) and $\varepsilon_p$ from Equation (5.2).
  • Step 3: If $\varepsilon_p < \varepsilon^*$, then terminate the iteration, else continue with Step 4.
  • Step 4 (suggested update method): Calculate $f_{p+1}^A$ (assure that $f_{min} < \left| f_{j,p+1}^A \right|$), $T_p^F$ from Equation (4.16) and $\mu_{i,p}$ from Equation (4.24). Let $T_{min} < \left| t_{j,p}^F \right|$ and determine $x_{p+1}^B$ from Equation (4.25).
  • Step 5: Continue the iteration from Step 1 with $p = p + 1$, $x_{p+1}^A$ and $x_{p+1}^B$.
If $p_{max}$ is the number of iterations necessary to satisfy the termination criterion $\varepsilon_p < \varepsilon^*$ and $n$ is the number of unknowns to be determined, then the suggested update method needs $n + 1$ function evaluations in each iteration, and altogether

$$N_f = p_{max} \left( n + 1 \right) \qquad (5.4)$$

function evaluations to reach the desired termination criterion. $p_{max}$ depends on many circumstances, such as the nature of the function $f(x)$, the termination criteria ($\varepsilon^*$ or others), the distance of the initial approximate $x_0^A$ from the solution $x^*$, and the iteration constants $f_{min}$ and $T_{min}$.
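The steps above can be condensed into a compact sketch. Since $x^*$ is generally unknown, the stepsize norm is used here in place of the error norm (5.2) in the termination test; this substitution, and all names and defaults, are assumptions of the sketch:

```python
import numpy as np

def t_secant(f, x0, dx0, eps=1e-10, T_min=1e-4, f_min=1e-14, p_max=100):
    n = x0.size
    xA, dx = x0.astype(float), dx0.astype(float)
    for p in range(p_max):
        fA = np.asarray(f(xA))
        fA = np.where(np.abs(fA) < f_min, np.copysign(f_min, fA), fA)
        # Step 1: n additional approximates x_k^B = x^A + dx_k d_k (4.14)
        dF = np.column_stack([np.asarray(f(xA + dx[k] * np.eye(n)[k])) - fA
                              for k in range(n)])
        # Step 2: classic secant update; dX is diagonal (4.15), S_p from (3.16)
        S_pinv = np.linalg.pinv(dF / dx)
        dxA = -S_pinv @ fA                      # stepsize (3.23)
        xA_new = xA + dxA                       # classic approximate (3.21)
        # Step 3: termination test (stepsize norm instead of (5.2))
        if np.linalg.norm(dxA) / np.sqrt(n) < eps:
            return xA_new
        # Step 4: suggested update, ratios (4.16) and new stepsize (4.23)-(4.25)
        tF = np.asarray(f(xA_new)) / fA
        tF = np.where(np.abs(tF) < T_min, np.copysign(T_min, tF), tF)
        dx_new = -dxA**2 / (S_pinv @ (fA / tF))
        # Step 5: continue with the new pair x_{p+1}^A, x_{p+1}^B
        xA, dx = xA_new, dx_new
    return xA

# Usage sketch with one common residual form of the Rosenbrock function (n = m = 2):
rosen = lambda x: np.array([10.0 * (x[1] - x[0]**2), 1.0 - x[0]])
sol = t_secant(rosen, np.array([-1.2, 1.0]), np.array([0.1, 0.1]))
```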

6. Numerical Tests Results

6.1. Single Variable Test Function

A numerical example is given with the single-variable test function

$$f(x) = \cos x - x \qquad (6.1)$$

with root $x^* \approx 0.73908513\ldots$. The results of the classic secant iteration process are shown in Table 4 and Figure 4. Iterations were made with the initial approximates $x_0^A = -2.0$ and $x_0^B = 2.0$ ($p = 0$), providing $f_0^A = 1.584$. The first secant approximate $x_1^A = -0.416\ldots$ is found as the zero of the first secant line $y_0(x)$, providing $f_1^A = 1.331\ldots$ ($p = 1$). The next iteration ($p = 2$) continues with the approximate $x_2^A = 0.442\ldots$ and with $f_2^A = 0.462\ldots$.
The results of the suggested iteration process are shown in Table 5 and Figure 5. Iterations were made with the same initial approximates as in the case of the classic secant iteration, and the first secant approximate $x_1^A = -0.416\ldots$ is found. The first T-secant approximate $x_1^B = 0.915\ldots$ is given as the zero of the first hyperbola function $z_0(x)$ (Figure 5, left). The iteration then goes on with the approximates $x_1^A = -0.416\ldots$ and $x_1^B = 0.915\ldots$, providing $f_1^A = 1.331\ldots$ ($p = 1$), and gives the new approximates $x_2^A = 0.667\ldots$ and $x_2^B = 0.764\ldots$ as the zeros of the second secant and hyperbola functions $y_1(x)$ and $z_1(x)$ respectively (Figure 5, left). The next iteration ($p = 2$) then continues with the approximates $x_2^A = 0.667\ldots$ and $x_2^B = 0.764\ldots$ and with $f_2^A = 0.119\ldots$, and gives $x_3^A = 0.7387\ldots$ and $x_3^B = 0.7391\ldots$ (Figure 5, right).

6.2. Solution of an Inverse Problem

An example is given for an imaginary modeling problem. Modeling of systems on the basis of observed or specified system responses plays a key role in scientific research. A central problem of modeling is to fit a structured mathematical model to available data, so that the values of the unknown model parameters are determined such that the simulated data is as near to the available data as possible. A system response can generally be represented by a curve in a two- (or three-) dimensional space and can be simulated by a computer program. Parameter identification corresponds to minimizing the distance between observed and simulated system responses. Parameter identification problems can generally be formulated as nonlinear least-squares problems, so that some unknown model parameters are determined providing minimal deviation between observed and simulated system responses. Solving nonlinear least-squares problems is often a difficult task, especially if the number of unknowns is high.

A rational system under investigation produces a measurable response to a known, measurable external effect. A mathematical model with $n$ unknown parameters $p_1, \ldots, p_n$ simulates the behavior of the system, providing a response to the same external effect as acted on the system during observation. Let the observed and simulated system responses be represented by two-dimensional curves, sampled in $k$ discrete points with coordinates $x_i$ and $y_i$ ($i = 1, \ldots, k$). The coordinates of a synthetic observed response were generated as
$$x(z) = (a + b) \cos(z + \alpha) - c \cos\left( \frac{a + b}{b} (z + \alpha) \right) + x_0 \qquad (6.2)$$

$$y(z) = (a + b) \sin(z + \alpha) - c \sin\left( \frac{a + b}{b} (z + \alpha) \right) + y_0 \qquad (6.3)$$

(an epicycloid) with parameters $x_0 = 10$, $y_0 = 8$, $a = 4$, $b = 2$, $c = 3.5$ and $\alpha = 1$, and with $0 \leq z \leq 2\pi$. The parameters

$$p = \begin{bmatrix} x_0 & y_0 & a & b & c \end{bmatrix}^T \qquad (6.4)$$

were considered to be unknown, and the system response was simulated with the initial approximate

$$p_0 = \begin{bmatrix} x_0 & y_0 & a & b & c \end{bmatrix}^T = \begin{bmatrix} 8 & 11 & 3.5 & 2.5 & 3 \end{bmatrix}^T \qquad (6.5)$$
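A short sketch generating the synthetic “observed” response (6.2)-(6.3) with the parameters quoted above; the sampling density of $z$ is an assumption for illustration:

```python
import numpy as np

def epicycloid(p, alpha=1.0, m=100):
    """Sample the epicycloid (6.2)-(6.3) at m points over 0 <= z <= 2*pi."""
    x0, y0, a, b, c = p
    z = np.linspace(0.0, 2.0 * np.pi, m)
    x = (a + b) * np.cos(z + alpha) - c * np.cos((a + b) / b * (z + alpha)) + x0
    y = (a + b) * np.sin(z + alpha) - c * np.sin((a + b) / b * (z + alpha)) + y0
    return x, y

x_obs, y_obs = epicycloid([10.0, 8.0, 4.0, 2.0, 3.5])   # synthetic "observed" response
x_sim, y_sim = epicycloid([8.0, 11.0, 3.5, 2.5, 3.0])   # response for the initial p_0
```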
The distance between observed and simulated system responses was defined for arbitrary two-dimensional curves. This definition can be extended to $n$ dimensions, making it possible to formulate a wide range of parameter identification problems. The value of the defined distance is dimensionless; it expresses the ratio of the area between the system response graphs, in a normalized coordinate system, to the area of a rectangle with unit area.

The distance between the observed and the simulated system responses was quantified by the area $D(p)$ between the graphs of the system responses in a normalized coordinate system. A partial area $r_j(p)$ was defined as the area of a quadrangle between the $(j-1)$-th and $j$-th division points on the system response graphs ($j = 1, \ldots, m$), as shown in Figure 6. These division points were selected equidistantly along the system response graphs.
Thus, the parameter identification problem can be formulated as solving a system of nonlinear equations

$$r(p) = 0 \qquad (6.6)$$

where $p \mapsto r(p)$, $p \in \mathbb{R}^5$, $r : \mathbb{R}^5 \to \mathbb{R}^m$, and $r$ is the residual vector with components $r_j$ ($j = 1, \ldots, m$). The unknown parameters $p$ of the epicycloid (6.2) and (6.3) were determined according to the suggested update algorithm. The variation of the parameters $p$ through the iterations and the observed and simulated system responses for the initial approximate are shown in Figure 6. The observed and simulated system response pairs are shown in Figure 7 after 7, 13, 19, 25, 31 and 43 simulations (S).
The results of numerical tests with different initial approximates showed that the optimal solution was not reached in all cases. The convergence was especially sensitive to the initial values of parameters $a$, $b$ and $c$. Several local optimal solutions were detected, as shown in Figure 8.

7. Efficiency

The efficiency of an algorithm for the solution of nonlinear equations is thoroughly discussed by Traub [38], as follows. Let $\beta$ be the order of the iteration sequence, such that for the approximate errors $e_i = x_i - x^*$ there exists a nonzero constant $C$ (asymptotic error constant) for which

$$\frac{\left| e_{i+1} \right|}{\left| e_i \right|^{\beta}} \to C \qquad (7.1)$$

A natural measure of the information used by an algorithm is the “informational usage” $d$, which is defined as the number of new pieces of information (values of the function and its derivatives) required per iteration (called “horner” by Ostrowski [26]). Then the efficiency of the algorithm within one iteration can be measured by the “informational efficiency”

$$EFF = \frac{\beta}{d} \qquad (7.2)$$

An alternative definition of efficiency is

$$*EFF = \beta^{1/d} \qquad (7.3)$$
called the “efficiency index” by Ostrowski [26]. Another measure of efficiency, called “computational efficiency”, takes into account the “cost” of calculating the different derivatives. The concepts of informational efficiency ($EFF$) and efficiency index ($*EFF$) take into account neither the cost of evaluating $f$ and its derivatives, nor the total number of pieces of information needed to achieve a certain accuracy in the root of the function. If $f$ is composed of elementary functions, then the derivatives are also composed of elementary functions, thus the cost of evaluating the derivatives is merely the cost of combining the elementary functions.
Broyden [7] suggested the mean convergence rate

$$L = \frac{1}{N_f} \ln \frac{R\left( x_0^A \right)}{R\left( x_{p_{max}}^A \right)} \qquad (7.4)$$

as a measure of the efficiency of an algorithm for solving a particular problem, where $N_f$ is the total number of function evaluations, $x_0^A$ is the initial approximate, and $x_{p_{max}}^A$ is the last approximate to the solution $x^*$ when the termination criterion is satisfied after $p_{max}$ iterations. $R(x)$ is the Euclidean norm of $f(x)$.
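A small helper, sketched here for illustration, computes Broyden’s mean convergence rate (7.4) together with the $N$-scaled variant $L_N$ introduced below as Equation (7.5):

```python
import math

def mean_convergence_rate(R0, R_final, N_f, N=1):
    """Broyden's mean convergence rate L (7.4) and the N-scaled L_N (7.5)."""
    L = math.log(R0 / R_final) / N_f
    return L, N * L

# Row 1 of Table 7 (Broyden 1.): R0 = 4.92, R_final = 4.7e-10, N_f = 59, N = 2
print(mean_convergence_rate(4.92, 4.7e-10, 59, N=2))   # ~(0.391, 0.78)
```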
Zhang’s two-step method [43] is similar to the King-Werner method [18,41], with convergence rate 2.414 [32]. Chen’s [9] and Wang’s [39] three-step methods show asymptotic convergence order 2.732…. Chen’s method [9] requires two function evaluations and one derivative evaluation, using the second-order polynomial proposed in [39], and shows an Ostrowski [26] efficiency index of 1.174. Wang’s method [39] requires three function and two derivative evaluations and shows an Ostrowski index of 1.101. Zhang [43] published a finite-difference-based secant method with asymptotic convergence order 2.618. Later, Ren [32] showed that Zhang’s method [43] has only 2.414 convergence order, due to some mistakes in the derivation. Table 6 compares the efficiencies of the classic algorithms (rows 1-2: Secant, Newton), the improved algorithms with the suggested full-rank update (rows 3, 5: T-Secant, T-Newton), the T-Secant method with constant Jacobian approximate $S_p$ (row 4: TS-const. $S_p$, see data in Table 3), and Chen’s [9] and Wang’s [39] methods (rows 6-7).
Very limited data are available to compare the performance of the suggested update method (T-Secant) with other classic methods, especially for large numbers of unknowns. Efficiency results were given by Broyden [7] for the Rosenbrock function for $N = 2$. The calculated convergence rates for the two Broyden method variants [7], for Powell’s method [30], for the adaptive coordinate descent method [19] and for the Nelder-Mead simplex method [20] were compared with the calculated values for the T-secant method in Table 7 [2]. Rows 1-5 are data from the referenced papers, rows 6-8 are T-secant results with the referenced initial approximates, and rows 9-15 are calculated data for $N > 2$. If the final value $R(x_{p_{max}}^A)$ is zero, then the mean convergence rates ($L$ and $L_N$) cannot be computed (zero in the denominator); a substitute value $10^{-25}$ was used when $R(x_{p_{max}}^A) = 0$ in rows 6, 7, 10 and 13.
The results show that the mean convergence rate $L$ (Equation (7.4)) for $N = 2$ is much higher for the T-secant method ($5.5 - 6.9$) than for the other listed methods ($0.1 - 0.6$). However, it is obvious that the mean convergence rate values decrease rapidly with increasing $N$ (more unknowns need more function evaluations). A modified convergence rate

$$L_N = N \cdot L = \frac{N}{N_f} \ln \frac{R\left( x_0^A \right)}{R\left( x_{p_{max}}^A \right)} \qquad (7.5)$$

is suggested as an $N$-independent measure of efficiency (see Table 7). The values of $L$ and $L_N$ are at least 10 times larger for the T-secant method than for the referenced classic methods for $N = 2$ (see Table 7).

The efficiency measures ($L$ and $L_N$) also depend on the initial conditions (distance of the initial approximate from the optimal solution, termination criterion). Results from a large number of numerical tests indicate an average $L_N \approx 7.4$, with standard deviation around $3.7$, for the T-secant method even for large $N$ values.

8. Conclusions

A numerical procedure has been developed for solving an overdetermined system of nonlinear equations (1.5). It can be considered both as a discretized Newton’s method and as a quasi-Newton method with full-rank update of the Jacobian approximates. Quasi-Newton methods are widely used for solving systems of nonlinear equations when the function derivatives (Jacobian) are not known or are difficult to determine. The derivatives (Jacobian) are approximated by the divided differences (3.16), and the system of nonlinear equations (1.5) is solved by repeatedly solving the systems of linear equations (3.24). The new approximate $x_{p+1}^A$ is determined from the classic secant equation, and the divided differences are updated according to the secant condition (3.26). However, the secant condition does not uniquely specify the Jacobian approximate, so the update procedure is not well-defined and further constraints are needed. Different methods offer specific update solutions.
The suggested numerical procedure (the “T-Secant” method [2]) provides an unconventional and simple strategy to improve the performance of quasi-Newton iterations by full-rank update of the Jacobian approximates. It is based on the classic secant linearization and allows full-rank update of the Jacobian approximates. The classic secant iteration is completed with a new independent approximate $x_{p+1}^B$, determined from a modified secant equation (4.18). The modification is made by a suitably chosen non-uniform scaling transformation. Scaling is done by the ratios of the classic secant function value improvements (4.16) and by the ratios of the desired new iteration stepsizes to the classic secant stepsizes (4.17). The solution to the secant equation presented in [2] was based on the Wolfe-Popper procedure [29,42], so that the secant equation was split into two equations by introducing an auxiliary variable. The simplified algorithm presented in this paper directly solves the secant equation by using the pseudo-inverse (4.21) of the transformed Jacobian approximate (4.19).
It has been shown that the new T-secant approximate $x_{p+1}^B$ will be in the vicinity of the classic secant approximate $x_{p+1}^A$ if the classic secant iterates converge to the root of the nonlinear function [2]. The Jacobian approximate is then full-rank updated by constructing new divided differences (Jacobian approximates) from the classic secant and T-secant approximates.
It was shown that the iterative procedure possesses a super-quadratic asymptotic convergence property, with convergence rate $\alpha = \varphi + 1 = \varphi^2 \approx 2.618$ for a simple root in the single-variable case [2], where $\varphi$ is the golden section ratio. The suggested procedure can also be applied to Newton’s method (matrix $S_p$ corresponds to the Jacobian matrix $J_p$). Numerical test results indicate that the quadratic convergence ($\alpha = 2$) of Newton’s method increases to cubic ($\alpha = 3$).
The efficiency has been studied in the multi-variable case and compared with other classic rank-one update and line-search methods on the basis of available data. The results show that its efficiency is considerably better than that of other classic low-rank update methods. The suggested method’s performance was demonstrated by the results of numerical tests with widely used single- and multi-variable benchmark test functions. A Rosenbrock test function was used with up to 1000 variables [2]. The method has also been successfully applied to the solution of different “real life” inverse problems [3,4,5] on physically nonlinear dynamic systems. Further studies can be made on the convergence properties in the case of multiple roots and on the efficiency characteristics with more benchmark test functions. The suggested procedure may be used for other applications, especially with a large number of unknowns and when a low number of function evaluations is crucial.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Acknowledgments

A considerable part of the research work was done between 1988 and 1992 at the Technical University of Budapest (Hungary), at the TNO-BOUW Structural Division (The Netherlands) and at the Technical High-school of Luleå (Sweden). The work was sponsored by the Technical University of Budapest (Hungary), the Hungarian Academy of Sciences (Hungary), TNO-BOUW (The Netherlands), Sandvik Rock Tools (Sweden), CP Test a/s (Denmark) and Óbuda University (Hungary). Valuable discussions and personal support from Géza Petrasovits, György Popper, Peter Middendorp, Rikard Skov, Bengt Lundberg, Mario Martinez and Csaba J. Hegedűs are greatly appreciated.

Conflicts of Interest

The author declares no conflict of interest.

References

  1. Amat, S.; Busquier, S.; Gutiérrez, J.M. Geometric constructions of iterative functions to solve nonlinear equations. J. Comput. Appl. Math. 2003, 157, 197–205.
  2. Berzi, P. Convergence and Stability Improvement of Quasi-Newton Methods by Full-Rank Update of the Jacobian Approximates. MDPI AppliedMath 2024, 4, 143–181.
  3. Berzi, P.; Beccu, R.; Lundberg, B. Identification of a Percussive Drill Rod Joint from its Response to Stress Wave Loading. International Journal of Impact Engineering 1994, 18, 281–290.
  4. Berzi, P. Pile-Soil Interaction due to Static and Dynamic Load. In Proceedings of the 13th International Conference on Soil Mechanics and Foundation Engineering, New Delhi, India, 5 January 1994; pp. 609–612.
  5. Berzi, P.; Popper, Gy. Evaluation of dynamic load test results on piles. In Proceedings of the International Symposium on Identification of Nonlinear Mechanical Systems from Dynamic Tests (Euromech 280), Ecully, France, 29 October 1991; pp. 121–128.
  6. Birgin, E.G.; Krejić, N.; Martínez, J.M. Globally convergent inexact quasi-Newton methods for solving nonlinear systems. Numer. Algorithms 2003, 32, 249–260.
  7. Broyden, C.G. A Class of Methods for Solving Nonlinear Simultaneous Equations. Mathematics of Computation 1965, 19, 577–593.
  8. Broyden, C.G.; Dennis, J.E.; Moré, J.J. On the local and superlinear convergence of quasi-Newton methods. J. Inst. Math. Appl. 1973, 12, 223–245.
  9. Chen, L.; Ma, Y. A new modified King–Werner method for solving nonlinear equations. Computers and Mathematics with Applications 2011, 62, 3700–3705.
  10. Dembo, R.S.; Eisenstat, S.C.; Steihaug, T. Inexact Newton methods. SIAM J. Numer. Anal. 1982, 19, 400–408.
  11. Dennis, J.E.; Moré, J.J. A characterization of superlinear convergence and its application to quasi-Newton methods. Mathematics of Computation 1974, 28, 543–560.
  12. Dennis, J.E., Jr.; Schnabel, R.B. Numerical Methods for Unconstrained Optimization and Nonlinear Equations; Prentice-Hall: Englewood Cliffs, NJ, 1983.
  13. Gerlach, J. Accelerated convergence in Newton’s method. SIAM Rev. 1994, 36, 272–276.
  14. Hegedűs, Cs. Numerical Methods I; ELTE, Faculty of Informatics: Budapest, 2015.
  15. Homeier, H.H.H. On Newton-type methods with cubic convergence. J. Comput. Appl. Math. 2005, 176, 425–432.
  16. Jisheng, K.; Yitian, L.; Xiuhua, W. Third-order modification of Newton’s method. J. Comput. Appl. Math. 2007, 205, 1–5.
  17. Kanwar, V.; Sharma, J.R.; Mamta. A new family of Secant-like methods with super-linear convergence. Appl. Math. Comput. 2005, 171, 104–107.
  18. King, R.F. Tangent method for nonlinear equations. Numer. Math. 1972, 18, 298–304.
  19. Loshchilov, I.; Schoenauer, M.; Sebag, M. Adaptive Coordinate Descent. In Genetic and Evolutionary Computation Conference (GECCO); ACM Press, 2011; pp. 885–892.
  20. Nelder, J.A.; Mead, R. A simplex method for function minimization. Computer Journal 1965, 7, 308–313.
  21. Martínez, J.M. Practical quasi-Newton methods for solving nonlinear systems. Journal of Computational and Applied Mathematics 2000, 124, 97–121.
  22. Martínez, J.M.; Qi, L. Inexact Newton methods for solving non-smooth equations. J. Comput. Appl. Math. 1995, 60, 127–145.
  23. Melman, A. Geometry and convergence of Euler’s and Halley’s methods. SIAM Rev. 1997, 39, 728–735.
  24. Muller, D.E. A Method for Solving Algebraic Equations Using an Automatic Computer. Math. Tables and Other Aids to Computation 1956, 10, 208–215.
  25. Ortega, J.M.; Rheinboldt, W.C. Iterative Solution of Nonlinear Equations in Several Variables; Academic Press: New York, 1970.
  26. Ostrowski, A.M. Solution of Equations and Systems of Equations; Academic Press: New York, 1966.
  27. Özban, A.Y. Some new variants of Newton’s method. Appl. Math. Lett. 2004, 17, 677–682.
  28. Papakonstantinou, J.M.; Tapia, R.A. Origin and evolution of the secant method in one dimension. The American Mathematical Monthly 2013, 120, 500–518.
  29. Popper, Gy. Numerical method for least square solving of nonlinear equations. Periodica Polytechnica 1985, 29, 67–69.
  30. Powell, M.J.D. An efficient method for finding the minimum of a function of several variables without calculating derivatives. Computer Journal 1964, 7, 155–162.
  31. Press, W.H.; Flannery, B.P.; Teukolsky, S.A.; Vetterling, W.T. Numerical Recipes; Cambridge University Press: Cambridge, 1986.
  32. Ren, H.; Wu, Q.; Bi, W. On convergence of a new secant-like method for solving nonlinear equations. Appl. Math. Comput. 2010, 217, 583–589.
  33. Scavo, T.R.; Thoo, J.B. On the geometry of Halley’s method. American Math. Monthly 1995, 102, 417–426.
  34. Shaw, S.; Mukhopadhyay, B. An improved regula falsi method for finding simple roots of nonlinear equations. Appl. Math. and Computation 2015, 254, 370–374.
  35. Stoer, J.; Bulirsch, R. Introduction to Numerical Analysis; Springer-Verlag, 2002.
  36. Strang, G. Introduction to Linear Algebra, Revised International Edition; Wellesley-Cambridge Press, 2005.
  37. Thukral, R. A New Secant-type method for solving nonlinear equations. Amer. J. Comput. Appl. Math. 2018, 8, 32–36.
  38. Traub, J.F. Iterative Methods for the Solution of Equations, 1st ed.; Prentice-Hall: Englewood Cliffs, NJ, USA, 1964.
  39. Wang, X.; Kou, J.; Gu, C. A new modified secant-like method for solving nonlinear equations. Comput. Math. Appl. 2010, 60, 1633–1638.
  40. Weerakoon, S.; Fernando, T.G.I. A variant of Newton’s method with accelerated third-order convergence. Appl. Math. Lett. 2000, 13, 87–93.
  41. Werner, W. Über ein Verfahren der Ordnung 1 + √2 zur Nullstellenbestimmung. Numer. Math. 1979, 32, 333–342.
  42. Wolfe, P. The Secant Method for Simultaneous Nonlinear Equations. Communications of the ACM 1959, 2, 12–13.
  43. Zhang, H.; Li, D.-S.; Liu, Y.-Z. A new method of secant-like for nonlinear equations. Commun. Nonlinear Sci. Numer. Simul. 2009, 14, 2923–2927.
Figure 1. Left: Linearization of a nonlinear function. Right: Classic secant method.
Figure 2. Suggested full-rank update method (T-secant).
Figure 3. Effect of the iteration parameter $T_{min}$ on the iteration process ($N = 1000$).
Figure 4. Classic secant iterations with test function (6.1) (see Table 4). Left: $x_1^A = -0.416$ is the root of $y_0(x)$, then $x_2^A = 0.442$ is the root of $y_1(x)$. Middle: $x_3^A = 0.898$ is the root of $y_2(x)$ and $x_4^A = 0.728$ is the root of $y_3(x)$. Right: $x_5^A = 0.7387$ is the root of $y_4(x)$ and $x_6^A = 0.739086$ is the root of $y_5(x)$.
Figure 5. T-Secant iterations with test function (6.1) (see Table 5). Left: $x_1^A = -0.416$ is the root of $y_0(x)$, $x_1^B = 0.915$ is the root of $z_0(x)$, then $x_2^A = 0.667$ is the root of $y_1(x)$, $x_2^B = 0.764$ is the root of $z_1(x)$. Right: $x_3^A = 0.7387$ is the root of $y_2(x)$, $x_3^B = 0.7391$ is the root of $z_2(x)$.
Figure 6. Definition of the distance between system responses (left) and variation of the parameters through the iterations (right).
Figure 7. Observed and simulated system response pairs after “S” function value evaluations.
Figure 8. Local optimal solutions.
Table 1. Linearization methods’ basic equations (single- and multi-variable cases)

| # | Single-variable ($m = n = 1$) | Multi-variable ($m \geq n > 1$) | Equations |
|---|---|---|---|
| 1 | $\Delta x_p = x_p^B - x_p^A$ | $\Delta X_p = [\Delta x_{k,p}] = [x_{k,p}^B - x_p^A]$ | 3.18 |
| 2 | $\Delta f_p = f_p^B - f_p^A$ | $\Delta F_p = [\Delta f_{k,p}] = [f_{k,p}^B - f_p^A]$ | 3.17 |
| 3 | $S_p = \Delta f_p / \Delta x_p$ | $S_p = \Delta F_p \, \Delta X_p^{-1}$ | 3.2, 3.16 |
| 4 | $x_{p+1}^A = x_p^A - f_p^A / S_p$ | $x_{p+1}^A = x_p^A - S_p^+ f_p^A$ | 3.5, 3.21 |
| 5 | $S_{p+1} \Delta x_p^A = f_{p+1}^A - f_p^A$ | $S_{p+1} \Delta x_p^A = f_{p+1}^A - f_p^A$ | 3.8, 3.26 |
Table 2. Basic equations of the classic and suggested multi-variable update methods

| # | Classic secant method | Suggested update method | Equations |
|---|---|---|---|
| 1 | | $x_{k,p}^B = x_p^A + \Delta x_{k,p} \, d_k$ | 4.14 |
| 2 | $\Delta X_p = [\Delta x_{k,p}]$ | $\Delta X_p = \mathrm{diag}(\Delta x_{k,p})$ | 3.18, 4.15 |
| 3 | $\Delta F_p = [f_{k,p}^B - f_p^A]$ | $\Delta F_p = [f_{k,p}^B - f_p^A]$ | 3.17 |
| 4 | | $T_p^X = \mathrm{diag}(t_{i,p}^X) = \mathrm{diag}(\Delta x_{i,p+1} / \Delta x_{i,p}^A)$ | 4.17 |
| 5 | | $T_p^F = \mathrm{diag}(t_{j,p}^F) = \mathrm{diag}(f_{j,p+1}^A / f_{j,p}^A)$ | 4.16 |
| 6 | $S_p = \Delta F_p \, \Delta X_p^{-1}$ | $S_{T,p} = T_p^F \, S_p \, (T_p^X)^{-1}$ | 3.16, 4.19 |
| 7 | $S_p \, \Delta x_p^A = -f_p^A$ | $S_{T,p} \, \Delta x_p^A = -f_p^A$ | 3.24, 4.18 |
| 8 | $\Delta x_p^A = -S_p^+ f_p^A$ | $\Delta x_p^A = -S_{T,p}^+ f_p^A$ | 3.23, 4.22 |
| 9 | $x_{i,p+1}^A = x_{i,p}^A - \sum_{j=1}^m S_{i,j,p}^+ f_{j,p}^A$ | $x_{i,p+1}^B = x_{i,p+1}^A - (\Delta x_{i,p}^A)^2 \big/ \sum_{j=1}^m S_{i,j,p}^+ (f_{j,p}^A / t_{j,p}^F)$ | 3.22, 4.23 |
| 10 | $x_{i,p+1}^B = x_{i,p}^B$ | $\mu_{i,p} = \sum_{j=1}^m S_{i,j,p}^+ f_{j,p}^A \big/ \sum_{j=1}^m S_{i,j,p}^+ (f_{j,p}^A / t_{j,p}^F)$ | 4.24 |
| 11 | $\Delta x_{i,p+1} = x_{i,p+1}^B - x_{i,p+1}^A$ | $\Delta x_{i,p+1} = \mu_{i,p} \, \Delta x_{i,p}^A$ | 4.25 |
Table 3. T-Secant iteration with constant Jacobian approximate $S_p$ (Equation (4.27))

| $h$ | $x_{p+h}^A$ | $x_{p+h}^B$ | $\Delta x_{p+h}^A$ | $t_{p+h}^f$ | $\Delta x_{p+1+h}$ | $e_{p+1+h}^A$ | $\alpha_{p+h}^T$ | $N_f$ | $L$ |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.700 | 2.000 | 0.421 | -0.085 | 0.036 | $2.6 \cdot 10^{-2}$ | | 2 | 1.23 |
| 1 | 2.121 | 2.156 | -0.036 | -0.359 | -0.013 | $2.2 \cdot 10^{-2}$ | | 4 | 0.87 |
| 2 | 2.085 | 2.072 | 0.013 | -0.342 | 0.0044 | $7.6 \cdot 10^{-3}$ | 1.08 | 6 | 0.76 |
| 3 | 2.098 | 2.102 | -0.0044 | -0.348 | -0.0015 | $2.7 \cdot 10^{-3}$ | 0.97 | 8 | 0.70 |
| 4 | 2.093 | 2.092 | 0.0015 | -0.346 | 0.00053 | $9.2 \cdot 10^{-4}$ | 1.01 | 10 | 0.67 |
| 5 | 2.0949 | 2.0955 | -0.00053 | -0.347 | -0.00018 | $3.2 \cdot 10^{-4}$ | 0.997 | 12 | 0.65 |
| 6 | 2.0944 | 2.0942 | 0.00018 | -0.346 | 0.000063 | $1.1 \cdot 10^{-4}$ | 1.001 | 14 | 0.63 |
| 7 | 2.0946 | 2.0947 | -0.000063 | -0.346 | 0.000022 | $3.8 \cdot 10^{-5}$ | 0.9997 | 16 | 0.62 |
Table 4. Classic secant iterations with test function (6.1) (see Figure 4)

| $p$ | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| $x_p^A$ | -2.000 | 2.000 | -0.416 | 0.442 | 0.898 | 0.7279 | 0.7387 |
| $\Delta x_p$ | 4.000 | -2.416 | 0.858 | 0.456 | -0.170 | 0.011 | $3.6 \cdot 10^{-4}$ |
| $f_p^A$ | 1.584 | -2.416 | 1.331 | 0.462 | -0.275 | 0.019 | $6.1 \cdot 10^{-4}$ |
| $x_{p+1}^A$ | -0.416 | 0.442 | 0.898 | 0.728 | 0.7387 | 0.739086 | 0.739085 |
| $e_{p+1}^A$ | 1.155 | 0.297 | 0.159 | 0.011 | 0.0004 | $9.1 \cdot 10^{-7}$ | $7.3 \cdot 10^{-11}$ |
| $\alpha^S$ | | 0.113 | 15.5 | 0.459 | 4.249 | 1.292 | |
| $f_{p+1}^A$ | 1.331 | 0.462 | -0.275 | 0.019 | $6.1 \cdot 10^{-4}$ | $-1.5 \cdot 10^{-6}$ | $1.2 \cdot 10^{-10}$ |
Table 5. T-Secant iterations with test function (6.1) (see Figure 5)

| $p$ | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| $x_p^A$ | -2.000 | -0.416 | 0.6668 | 0.7387 | 0.7390851328 |
| $\Delta x_p$ | 4.000 | 1.331 | 0.0968 | $4.1 \cdot 10^{-4}$ | $3.9 \cdot 10^{-10}$ |
| $f_p^A$ | 1.584 | 1.331 | 0.1190 | $6.7 \cdot 10^{-4}$ | $6.5 \cdot 10^{-10}$ |
| $x_{p+1}^A$ | -0.416 | 0.667 | 0.7387 | 0.7390851328 | 0.7390851332 |
| $e_{p+1}^A$ | 1.155 | 0.072 | $4.0 \cdot 10^{-4}$ | $3.8 \cdot 10^{-10}$ | |
| $\alpha^{TS}$ | | 3.210 | 1.874 | 2.668 | |
| $f_{p+1}^A$ | 1.331 | 0.119 | $6.7 \cdot 10^{-4}$ | $6.5 \cdot 10^{-10}$ | |
| $t_p^F$ | 0.840 | 0.089 | 0.0057 | $9.6 \cdot 10^{-7}$ | |
| $x_{p+1}^B$ | 0.915 | 0.764 | 0.7391 | 0.7390851332 | |
Table 6. Efficiencies of classic and improved algorithms

| # | Method | $d$ | $\beta$ | $EFF$ [38] | $*EFF$ [26] | $L_{max}$ [7] |
|---|---|---|---|---|---|---|
| 1 | Secant | 1 | 1.618… | 1.618… | 1.618… | 4.0 |
| 2 | Newton | 2 | 2.0 | 1.0 | 1.414 | 3.0 |
| 3 | T-Secant | 2 | 2.618… | 1.309… | 1.618… | 4.5 |
| 4 | TS - const. $S_p$ | 2 | 1.0 | 0.5 | 1.0 | 0.6 |
| 5 | T-Newton | 3 | 3.0 | 1.0 | 1.442 | 3.0 |
| 6 | Chen [9] | 3 | 1.618 | 0.539 | 1.173 | |
| 7 | Wang [39] | 5 | 1.618 | 0.323 | 1.101 | |
Table 7. Calculated values of the mean convergence rates

| # | $N$ | Method | $R(x_0^A)$ | $R(x_{p_{max}}^A)$ | $p_{max}$ | $N_f$ | $L$ | $L_N$ |
|---|---|---|---|---|---|---|---|---|
| 1 | 2 | Broyden 1. [7] | 4.92 | 4.7E-10 | - | 59 | 0.391 | 0.78 |
| 2 | 2 | Broyden 2. [7] | 4.92 | 2.6E-10 | - | 39 | 0.607 | 1.22 |
| 3 | 2 | Powell [30] | 4.92 | 7.0E-10 | - | 151 | 0.150 | 0.30 |
| 4 | 2 | ACD [19] | 130.1 | 1.0E-10 | - | 325 | 0.086 | 0.17 |
| 5 | 2 | Nelder-Mead [20] | 2.00 | 1.4E-10 | - | 185 | 0.127 | 0.25 |
| 6 | 2 | T-secant [7,30] | 4.92 | 1.0E-25 | 3 | 9 | 6.573 | 13.15 |
| 7 | 2 | T-secant [19] | 130.1 | 1.0E-25 | 3 | 9 | 6.937 | 13.87 |
| 8 | 2 | T-secant [20] | 2.00 | 6.7E-15 | 2 | 6 | 5.556 | 11.11 |
| 9 | 3 | T-secant | 72.72 | 1.4E-14 | 5 | 20 | 1.809 | 5.43 |
| 10 | 3 | T-secant | 32.47 | 1.0E-25 | 4 | 16 | 3.815 | 11.45 |
| 11 | 5 | T-secant | 93.53 | 1.3E-14 | 8 | 48 | 0.760 | 3.80 |
| 12 | 5 | T-secant | 7.19 | 5.9E-14 | 4 | 24 | 1.351 | 6.76 |
| 13 | 10 | T-secant | 202.6 | 1.0E-25 | 14 | 154 | 0.408 | 4.08 |
| 14 | 200 | T-secant | 92.78 | 9.0E-15 | 10 | 2010 | 0.042 | 8.44 |
| 15 | 1000 | T-secant | 212.4 | 3.6E-13 | 6 | 6006 | 0.006 | 5.66 |