A Non-Parametric Algorithm for Predicting Future Samples in Single- and Multi-Channel Time Series

Ioannis Dologlou

doi:10.20944/preprints202511.2136.v1

Submitted: 26 November 2025

Posted: 27 November 2025


Abstract

A new method for estimating future samples in time series data is presented and compared against the well-known ESPRIT technique. It exploits the null space of the Hankel matrix of the data, allowing the prediction of future samples with better accuracy and confidence. Moreover, a generalization of the algorithm that applies to multichannel signals is derived. Both cases, with and without cross-channel coupling, are considered and different algorithms are presented. The method is fully deterministic, with computational complexity comparable to that of ESPRIT. Testing involves 4000 randomly chosen data sets with variable spectral characteristics.

Keywords:

time-series forecast non-parametric optimization null-space multichannel

Subject:

Computer Science and Mathematics - Signal Processing

1. Introduction

Difference equations have been widely used to model time series by means of exponential functions. The parameters of the exponential functions result from an optimization process that minimizes the modelling error. These techniques suffer from numerical problems that affect the accuracy of the computed parameters. The frequencies and the damping factors of the exponential functions are the roots of polynomials that result from the optimization process. However, the numerical computation of roots is rather sensitive, leading to compromised accuracy. To overcome this problem, alternative approaches for the computation of the exponential parameters have been proposed in the literature. Among the most efficient techniques are the well-known ESPRIT and HSVD algorithms, which perform well in spectral estimation but still suffer from numerical problems when predicting future samples [1,2]. In addition, round-off errors add up, creating significant deviations from the correct values of the parameters, which compromises any prediction based on extrapolation of the exponential functions. This paper exploits the null space of the Hankel matrix of the data, allowing the prediction of future samples with better accuracy and confidence. In a more general sense, the notion of null space refers here to the set of eigenvectors of the data Hankel matrix that are associated with the smallest eigenvalues.

A generalized form of the new algorithm is also derived and applied to multichannel signals. Multichannel modeling has been widely used in signal processing applications. In particular, time-varying multichannel signals are very common in real-life applications such as video and multi-source audio, but also in econometrics, when studying the combined evolution of various economic indices. The prediction of future vectors in a multichannel configuration has been an important problem with interesting applications. Among the most popular approaches are the multichannel autoregressive, subspace identification, deep learning and wavelet methods [8]. Here the multichannel autoregressive technique is considered and a new algorithm for the computation of future vectors is presented. It exploits the null space of the block-like Hankel matrix constructed from existing vectors of the multichannel signal. Both cases, with and without cross-channel coupling, are considered and different algorithms are presented.

2. The Orthogonal Approach

Let us consider the p_th order Hankel matrix H, of size p×(N-p+1), built from a given set W of N samples w₁, w₂, … w_N with N>p. The SVD of H is H=U*S*V, where U is a p×p matrix of singular vectors of size p, S is a p×p diagonal matrix of singular values arranged in ascending order, and V is a p×(N-p+1) matrix of singular vectors of size (N-p+1) [3]. Assuming that the size of the null space of H is q, only p-q singular values in S are non-zero, and q of the singular vectors of U, namely u₁ … u_q, span the null space of H. If U_q is the p×q matrix of these vectors, then the N-p+1 column vectors of H are orthogonal to U_q [4,5,6]. The (N+1)_th sample of the set W, namely w(N+1), is unknown and has to be estimated. Together with the p-1 known samples w(N-p+2), w(N-p+3), w(N-p+4) … w(N), it forms a new column vector of size p, namely x₁=[w(N-p+2) w(N-p+3) w(N-p+4) … w(N) w(N+1)]^T, which is the last column of the augmented Hankel matrix H_a1=[H x₁]. Since all the column vectors of H are orthogonal to U_q, the same should hold for the column vectors of the augmented matrix H_a1; hence x₁ is also orthogonal to U_q. By splitting x₁ into two parts, the known part s_{1}^{T}=[w(N-p+2) w(N-p+3) … w(N)] and the unknown part y₁=w(N+1), the following equation is obtained,

s_{1}^{T} * u_{1 i} + y_{1} * u_{pi} = 0 ∀ i ≤ q

(1)

where u_1i consists of the p-1 first components of the i_th singular vector u_i and u_pi is the last component of u_i.

Equation (1) is only theoretically valid. In real life the orthogonality is only approximate and (1) should be reformulated as follows,

s_{1}^{T} * u_{1 i} + y_{1} * u_{pi} = e_{1 i} ∀ i ≤ q

(1.1)

where e_1i is the approximation error. Best approximation will be achieved when the accumulated squared error is minimized,

\sum_{i = 1}^{q} e_{1 i}^{2} \to \text{minimum}

(2)

which leads to the solution of the following minimization problem,

\sum_{i = 1}^{q} (s_{1}^{T} * u_{1 i} + y_{1} * u_{pi})^{2} \to \text{minimum}

(3)

An equivalent minimization problem results after elaborating the above expression (3),

\sum_{i = 1}^{q} (s_{1}^{T} * u_{1 i} * u_{1 i}^{T} * s_{1} + 2 * y_{1} * u_{pi} * u_{1 i}^{T} * s_{1} + y_{1}^{2} * u_{pi}^{2}) \to \text{minimum}

(4)

By setting to zero the first derivative of (4) with respect to y₁ the following solution is obtained,

y_{1} = - \sum_{i = 1}^{q} u_{pi} * u_{1 i}^{T} * s_{1} / \sum_{i = 1}^{q} u_{pi}^{2}

(5)
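As an illustration of equation (5), the following Python/NumPy sketch builds the Hankel matrix, takes the left singular vectors associated with the q smallest singular values as an approximate null space, and evaluates the estimate of w(N+1). The function names `hankel_matrix` and `predict_next` are illustrative and do not come from the original description, and the choice of q is left to the caller. Note that NumPy returns singular values in descending order, so the null-space vectors are taken from the last columns of U rather than the first.

```python
import numpy as np

def hankel_matrix(w, p):
    """Return the p x (N-p+1) Hankel matrix whose columns are windows of w."""
    w = np.asarray(w, dtype=float)
    n_cols = len(w) - p + 1
    return np.column_stack([w[j:j + p] for j in range(n_cols)])

def predict_next(w, p, q):
    """Estimate w(N+1) from equation (5) with q approximate null-space vectors."""
    w = np.asarray(w, dtype=float)
    H = hankel_matrix(w, p)
    # Full SVD so that U is p x p; singular values come in descending order,
    # hence the q smallest ones correspond to the last q columns of U.
    U, _, _ = np.linalg.svd(H, full_matrices=True)
    Uq = U[:, p - q:]
    s1 = w[len(w) - p + 1:]      # known part: w(N-p+2) ... w(N)
    u_head = Uq[:p - 1, :]       # u_{1i}: first p-1 components of each u_i
    u_last = Uq[p - 1, :]        # u_{pi}: last component of each u_i
    # Equation (5): y1 = - sum_i u_pi * (u_1i^T s1) / sum_i u_pi^2
    return -np.sum(u_last * (s1 @ u_head)) / np.sum(u_last ** 2)
```

For a noise-free sum of two real sinusoids, whose Hankel matrix has rank 4, a call such as `predict_next(w, 8, 4)` should reproduce the next sample up to round-off.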

3. The Case of Two Unknown Samples

To estimate w(N+1) and w(N+2) simultaneously, a similar approach is used. Besides the vector x₁, a second column vector x₂ of size p is formed, consisting of the p-2 known samples w(N-p+3), w(N-p+4), … w(N) and the two unknown samples w(N+1) and w(N+2). The augmented Hankel matrix H_a1 becomes H_a2=[H x₁ x₂], and both vectors x₁ and x₂ have to be orthogonal to U_q. Consequently, next to equation (1), an additional equation is needed to guarantee the orthogonality of x₂. Vector x₂ is split into two parts, the known part s₂ and the unknown components y₁ and y₂, with s_{2}^{T}=[w(N-p+3) w(N-p+4) … w(N)] and [y₁ y₂]=[w(N+1) w(N+2)]. Due to the orthogonality property the following must hold,

s_{2}^{T} * u_{2 i} + y_{1} * u_{p - 1, i} + y_{2} * u_{pi} = 0 ∀ i ≤ q

(6)

where u_2i consists of the p-2 first components of the i_th singular vector u_i and u_p-1,i and u_pi are the last two components of u_i.

As mentioned before, equation (6) is only theoretically valid. In real life the orthogonality is only approximate and (6) should be reformulated as follows,

s_{2}^{T} * u_{2 i} + y_{1} * u_{p - 1, i} + y_{2} * u_{pi} = e_{2 i} ∀ i ≤ q

(7)

where e_2i is the approximation error. For best estimates of both w(N+1) and w(N+2) the following combined optimization problem must be solved,

\sum_{i = 1}^{q} e_{1 i}^{2} + \sum_{i = 1}^{q} e_{2 i}^{2} \to \text{minimum}

(8)

By using (1.1) and (7) the above minimization problem can be expressed as,

\sum_{i = 1}^{q} (s_{1}^{T} * u_{1 i} + y_{1} * u_{pi})^{2} + \sum_{i = 1}^{q} (s_{2}^{T} * u_{2 i} + y_{1} * u_{p - 1, i} + y_{2} * u_{pi})^{2} \to \text{minimum}

(9)

To solve the above the first derivatives of expression (9) with respect to y₁ and y₂ are set to zero. This leads to the following two equations,

y_{1} * \sum_{i = 1}^{q} (u_{pi}^{2} + u_{p - 1, i}^{2}) + y_{2} * \sum_{i = 1}^{q} (u_{p - 1, i} * u_{pi}) = - \sum_{i = 1}^{q} (u_{pi} * s_{1}^{T} * u_{1 i} + u_{p - 1, i} * s_{2}^{T} * u_{2 i})

(10.1)

y_{1} * \sum_{i = 1}^{q} u_{p - 1, i} * u_{pi} + y_{2} * \sum_{i = 1}^{q} u_{pi}^{2} = - \sum_{i = 1}^{q} u_{pi} * s_{2}^{T} * u_{2 i}

(10.2)

The above linear system of equations provides the best estimates for y₁ and y₂.
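For reference, equations (10.1) and (10.2) can be collected into a single symmetric 2×2 system. The block below is only a restatement in matrix form, using the symbols already defined above and introducing no new quantities.

```latex
\begin{pmatrix}
\sum_{i=1}^{q} (u_{pi}^{2} + u_{p-1,i}^{2}) & \sum_{i=1}^{q} u_{p-1,i} \, u_{pi} \\
\sum_{i=1}^{q} u_{p-1,i} \, u_{pi} & \sum_{i=1}^{q} u_{pi}^{2}
\end{pmatrix}
\begin{pmatrix} y_{1} \\ y_{2} \end{pmatrix}
= -
\begin{pmatrix}
\sum_{i=1}^{q} (u_{pi} \, s_{1}^{T} u_{1i} + u_{p-1,i} \, s_{2}^{T} u_{2i}) \\
\sum_{i=1}^{q} u_{pi} \, s_{2}^{T} u_{2i}
\end{pmatrix}
```

Written this way, the symmetry of the coefficient matrix is visible, and it is the same structure that reappears in the three-sample and general cases.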

4. The General Case of Many Unknown Samples

When three future samples need to be estimated, a third term in the optimization problem (8) is added [7,9]. By following the same type of notation equation (8) becomes,

\sum_{i = 1}^{q} e_{1 i}^{2} + \sum_{i = 1}^{q} e_{2 i}^{2} + \sum_{i = 1}^{q} e_{3 i}^{2} \to \text{minimum}

(11)

where e_3i is the error due to the third unknown sample w(N+3)=y₃.

By following the same notation as in the case of two unknown samples the error e_3i can be expressed as follows,

s_{3}^{T} * u_{3 i} + y_{1} * u_{p - 2, i} + y_{2} * u_{p - 1, i} + y_{3} * u_{pi} = e_{3 i} ∀ i ≤ q

(12)

where s_{3}^{T}=[w(N-p+4) w(N-p+5) … w(N)], u_3i consists of the p-3 first components of the i_th singular vector u_i, and u_p-2,i, u_p-1,i and u_pi are the last three components of u_i.

By setting the first derivatives of expression (11) with respect to y₁, y₂ and y₃ to zero, a linear system of three equations is formed and its solution provides the estimates of the three unknown samples y₁, y₂ and y₃.

By substituting (1.1), (7) and (12) into expression (11) and carrying out the algebraic manipulations, the following set of equations is obtained,

y_{1} * \sum_{i = 1}^{q} (u_{pi}^{2} + u_{p - 1, i}^{2} + u_{p - 2, i}^{2}) + y_{2} * \sum_{i = 1}^{q} (u_{pi} * u_{p - 1, i} + u_{p - 1, i} * u_{p - 2, i}) + y_{3} * \sum_{i = 1}^{q} u_{pi} * u_{p - 2, i} = - \sum_{i = 1}^{q} (u_{pi} * s_{1}^{T} * u_{1 i} + u_{p - 1, i} * s_{2}^{T} * u_{2 i} + u_{p - 2, i} * s_{3}^{T} * u_{3 i})

(13.1)

y_{1} * \sum_{i = 1}^{q} (u_{pi} * u_{p - 1, i} + u_{p - 1, i} * u_{p - 2, i}) + y_{2} * \sum_{i = 1}^{q} (u_{pi}^{2} + u_{p - 1, i}^{2}) + y_{3} * \sum_{i = 1}^{q} u_{pi} * u_{p - 1, i} = - \sum_{i = 1}^{q} (u_{pi} * s_{2}^{T} * u_{2 i} + u_{p - 1, i} * s_{3}^{T} * u_{3 i})

(13.2)

y_{1} * \sum_{i = 1}^{q} u_{pi} * u_{p - 2, i} + y_{2} * \sum_{i = 1}^{q} u_{pi} * u_{p - 1, i} + y_{3} * \sum_{i = 1}^{q} u_{pi}^{2} = - \sum_{i = 1}^{q} u_{pi} * s_{3}^{T} * u_{3 i}

(13.3)

The above system of equations (13) may be expressed as,

y * A = b

(14)

where y=[y₁ y₂ y₃], A is a symmetric matrix and b is a vector.

In a more general notation, the elements of the symmetric matrix A for all j ≤ k and k≤n can be expressed as,

a_{jk} = \sum_{i = 1}^{q} \sum_{m = 1}^{n - k + 1} u_{p + 1 - m, i} * u_{p + j - k + 1 - m, i}

(15)

where n stands for the number of samples to be estimated, namely n=3 in this case.

Similarly, the elements of b can be expressed as,

b_{j} = - \sum_{i = 1}^{q} \sum_{k = j}^{n} u_{p + j - k, i} * s_{k}^{T} * u_{k i}

(16)

With A and b known, the estimation of y is straightforward based on equation (14),

y = b * A^{- 1}

(17)

where y is the vector of n unknown future samples of the set W.
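The general construction of equations (15), (16) and (17) can be sketched in Python/NumPy as follows. The function name `predict_block` is illustrative, and it is assumed that the caller supplies the p×q matrix of null-space vectors `Uq` (for example, from an SVD of the Hankel matrix) and that n < p. Because A is symmetric, solving A y = b for a column vector gives the same result as y = b A⁻¹ for a row vector.

```python
import numpy as np

def predict_block(w, Uq, n):
    """Estimate w(N+1) ... w(N+n) using equations (15), (16) and (17).

    w  : 1-D array of the N known samples
    Uq : p x q array whose columns are the (approximate) null-space vectors
    n  : number of future samples to estimate, with 1 <= n < p
    """
    w = np.asarray(w, dtype=float)
    Uq = np.asarray(Uq, dtype=float)
    p = Uq.shape[0]
    if not 1 <= n < p:
        raise ValueError("n must satisfy 1 <= n < p")
    A = np.zeros((n, n))
    b = np.zeros(n)
    for j in range(1, n + 1):              # 1-based indices, as in the text
        for k in range(j, n + 1):
            # Equation (15): the dot product of two rows of Uq sums over i.
            a_jk = 0.0
            for m in range(1, n - k + 2):
                a_jk += Uq[p - m] @ Uq[p + j - k - m]
            A[j - 1, k - 1] = A[k - 1, j - 1] = a_jk
            # Equation (16): s_k holds the last p-k known samples and
            # u_{ki} the first p-k components of each null-space vector.
            s_k = w[len(w) - (p - k):]
            b[j - 1] -= Uq[p + j - k - 1] @ (s_k @ Uq[:p - k, :])
    # Equation (17), written as a linear solve instead of an explicit inverse.
    return np.linalg.solve(A, b)
```

Using a linear solve rather than forming A⁻¹ explicitly is a design choice of this sketch, not something prescribed by the text.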

5. The Multichannel Solution Without Cross Channel Coupling

The above technique can be generalized and applied to the multichannel configuration. In particular for the case without cross channel coupling each vector is a linear combination of previous vectors according to the following equation,

x_{n} = a_{1} * x_{n - 1} + a_{2} * x_{n - 2} + \dots + a_{p - 1} * x_{n - p + 1} + e_{n}

(18)

where the coefficients a_{i}, i = 1 … p-1, are scalars and x_{n}, n = 1, … N, are the N vectors of size m of the multichannel signal [8].

If matrix H_b in (19) stands for the block Hankel matrix of the above multichannel signal, the error e_{n} will be zero for all n = p … N when the vector a^{T} = [a_{p-1} … a_{1} -1] is part of the null space of matrix H_b,

H_{b} = \begin{pmatrix} x_{1}^{T} & \dots & x_{N - p + 1}^{T} \\ \vdots & \ddots & \vdots \\ x_{p}^{T} & \dots & x_{N}^{T} \end{pmatrix}

(19)

Assuming that the augmented block Hankel matrix H_ba=[H_b h₁] shares the same null space as H_b, where h₁ is a p×m matrix whose last row is the unknown vector x_{N+1}, as shown below,

h_{1} = (x_{N - p + 2} \dots x_{N} x_{N + 1})^{T}

(20)

then all column vectors of matrix h₁ should also be orthogonal to the null space of matrix H_b.

The SVD of H_b is H_b=U_b*S_b*V_b, where U_b is a p×p matrix of singular vectors of size p, S_b is a p×p diagonal matrix of singular values arranged in ascending order, and V_b is a p×[(N-p+1)*m] matrix of singular vectors of size (N-p+1)*m. Assuming that the size of the null space of H_b is q, only p-q singular values in S_b are non-zero, and q of the singular vectors of U_b, namely u_b1 … u_bq, span the null space of H_b. If U_bq is the p×q matrix of these vectors, then the (N-p+1)*m column vectors of H_b are orthogonal to U_bq. The m column vectors of h₁, which should also be orthogonal to U_bq, each consist of (p-1) known values and one unknown value, their last element. They can be treated independently, with the vectors in U_bq as null space. Therefore, to determine each of the m elements of vector x_{N+1}, equation (5) can be used separately with each one of the m column vectors of matrix h₁, after substituting the corresponding null space vectors u_b1 … u_bq.

The elements of the future vectors x_{N+k}, k=2, … n, can be determined by applying equations (15), (16) and (17) with the corresponding null space vectors u_b1 … u_bq.

More specifically, by introducing the matrices

h_{k} = (x_{N - p + k + 1} \dots x_{N} \dots x_{N + k})^{T}, k = 2, \dots n

and assuming that they share the same null space as H_b, each one of the m column vectors of these matrices can be processed separately using equations (15), (16) and (17) to provide n future estimates for each individual channel.
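A minimal Python/NumPy sketch of the uncoupled multichannel case for a single future vector is given below. It assumes the signal is stored as an N×m array with one vector x_n per row; the function name `predict_next_vector_uncoupled` is illustrative. One SVD of the block Hankel matrix (19) is shared by all channels, and equation (5) is then evaluated for every column of h₁ at once.

```python
import numpy as np

def predict_next_vector_uncoupled(X, p, q):
    """Estimate x_{N+1} for an N x m signal X without cross-channel coupling."""
    X = np.asarray(X, dtype=float)
    N, m = X.shape
    n_blocks = N - p + 1
    # Block Hankel matrix (19): row r is x_{r+1}^T ... x_{r+N-p+1}^T (r from 0).
    Hb = np.vstack([X[r:r + n_blocks].reshape(1, -1) for r in range(p)])
    # Singular values are returned in descending order, so the approximate
    # null space is spanned by the last q columns of U.
    U, _, _ = np.linalg.svd(Hb, full_matrices=True)
    Uq = U[:, p - q:]
    u_head = Uq[:p - 1, :]       # first p-1 components of each null vector
    u_last = Uq[p - 1, :]        # last component of each null vector
    S = X[N - p + 1:, :]         # known rows of h_1: x_{N-p+2} ... x_N
    # Equation (5) applied independently to each of the m columns of h_1.
    return -(S.T @ u_head @ u_last) / np.sum(u_last ** 2)
```

Estimating several future vectors would follow the same pattern, applying the system of equations (15) to (17) per channel with the shared null-space vectors.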

6. The Multichannel Solution with Cross Channel Coupling

When cross channel coupling is assumed, each vector is a linear combination of previous vectors according to the following equation,

x_{n} = A_{1} * x_{n - 1} + A_{2} * x_{n - 2} + \dots + A_{p - 1} * x_{n - p + 1} + e_{n}

(21)

where the coefficients A_{i}, i = 1 … p-1, are m×m matrices, x_{n}, n = 1, … N, are the N vectors of size m and e_{n} is the modelling error. By expanding equation (21), the k_th element of vector x_{n}, denoted as x_{nk}, is equal to

x_{n k} = a_{1 k} * x_{n - 1} + a_{2 k} * x_{n - 2} + \dots + a_{(p - 1) k} * x_{n - p + 1} + e_{n k}

(22)

where a_{1k}, a_{2k}, … a_{(p-1)k} are the k_th rows of the matrices A_{1}, A_{2}, … A_{p-1} respectively, and e_{nk} is the k_th element of the modelling error.

If matrix H_bk in (23) stands for the block-like Hankel matrix associated with the k_th element of the multichannel signal, the error e_{nk} will be zero for all n = p … N when the vector a^{T} = [a_{(p-1)k} … a_{1k} -1] is part of the null space of matrix H_bk,

H_{bk} = \begin{pmatrix} x_{1} & x_{2} & \dots & x_{N - p + 1} \\ \vdots & \vdots & & \vdots \\ x_{p - 1} & x_{p} & \dots & x_{N - 1} \\ x_{pk} & x_{(p + 1)k} & \dots & x_{Nk} \end{pmatrix}

(23)

Assuming that the augmented block-like Hankel matrix H_bka=[H_bk h_k] shares the same null space as H_bk, where h_k is a vector of size (p-1)*m+1 whose last element is the k_th unknown sample x_{(N+1)k} of the vector x_{N+1}, as shown below,

h_{k} = {(x_{N - p + 2}^{T} \dots x_{N}^{T} x_{(N + 1) k})}^{T}

(24)

then vector h_k should also be orthogonal to the null space of matrix H_bk.

The SVD of H_bk is H_bk=U_bk*S_bk*V_bk, where U_bk is a [(p-1)*m+1]×[(p-1)*m+1] matrix of singular vectors of size (p-1)*m+1, S_bk is a [(p-1)*m+1]×[(p-1)*m+1] diagonal matrix of singular values arranged in ascending order, and V_bk is a [(p-1)*m+1]×(N-p+1) matrix of singular vectors of size (N-p+1). Assuming that the size of the null space of H_bk is q, only [(p-1)*m+1]-q singular values in S_bk are non-zero, and q of the singular vectors of U_bk, namely u_bk1 … u_bkq, span the null space of H_bk. If U_bkq is the [(p-1)*m+1]×q matrix of these vectors, then the (N-p+1) column vectors of H_bk, along with vector h_k, are orthogonal to U_bkq. The vector h_k consists of (p-1)*m known values and one unknown value, its last element. The unknown element x_{(N+1)k} can be estimated independently, with the vectors in U_bkq as the corresponding null space. Hence, for the unknown element of vector h_k, equation (5) can be used with the q vectors u_bk1 … u_bkq as the corresponding null space.

In order to estimate all the elements of the unknown vector x_{N+1}, the same procedure is repeated for all m possible values of k. For each value of k, the SVD of the block-like Hankel matrix in (23) has to be computed and the new null space vectors have to be used with equation (5).
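The per-channel procedure for the coupled case can be sketched as follows in Python/NumPy. As before, the signal is assumed to be stored as an N×m array with one vector per row, and the function name `predict_next_vector_coupled` is illustrative. The upper (p-1)*m rows of the matrix in (23) are the same for every k, so they are built once, while the last row and the SVD are recomputed for each channel.

```python
import numpy as np

def predict_next_vector_coupled(X, p, q):
    """Estimate x_{N+1} for an N x m signal X with cross-channel coupling."""
    X = np.asarray(X, dtype=float)
    N, m = X.shape
    n_cols = N - p + 1
    # Shared upper part of (23): column j stacks x_{j+1} ... x_{j+p-1} (j from 0).
    top = np.column_stack([X[j:j + p - 1].reshape(-1) for j in range(n_cols)])
    known = X[N - p + 1:].reshape(-1)     # x_{N-p+2} ... x_N stacked, as in (24)
    est = np.empty(m)
    for k in range(m):
        # Last row of (23): k-th element of x_p ... x_N.
        Hbk = np.vstack([top, X[p - 1:, k]])
        # Descending singular values: null space in the last q columns of U.
        U, _, _ = np.linalg.svd(Hbk, full_matrices=True)
        Uq = U[:, U.shape[1] - q:]
        u_head, u_last = Uq[:-1, :], Uq[-1, :]
        # Equation (5) for the single unknown element x_{(N+1)k}.
        est[k] = -np.sum(u_last * (known @ u_head)) / np.sum(u_last ** 2)
    return est
```

In this sketch the same q is used for every channel; nothing in the text requires that, and a per-channel choice would only change the slicing of U inside the loop.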

7. Experimentation – Testing

The proposed algorithm was tested against the well-known ESPRIT method, which models the signal by means of exponentially damped sinusoids. A large set of signals, four thousand in total, with variable complexity in terms of the rank of their covariance matrix, was considered. The comparison was based on the accuracy of predicting future samples as well as on the overall consistency of the two algorithms, where consistency refers to the homogeneity of the prediction errors. Each signal consists of 2000 to 2500 samples and was generated from white noise filtered by random low-pass FIR filters with hundreds of taps. Four sets of future samples were selected for the comparison, namely the first, tenth, twentieth and thirtieth future samples. For each set, both the proposed and the ESPRIT algorithms were used, and the RMSE between the predicted and the actual values of the signals was compared. To evaluate the consistency of the algorithms, the variance of the errors is used. In order to eliminate variations due to differences in signal amplitude, all errors were normalized with respect to the actual value of the predicted sample.
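The test setup described above could be approximated along the following lines. This is only a sketch under several assumptions that are not stated in the text: the FIR filter is a Hamming-windowed sinc, the cutoff is drawn uniformly from an arbitrary range, the default tap count of 301 is arbitrary, and the variance is the ordinary population variance of the normalized errors. The names `random_lowpass_signal` and `normalized_error_stats` are illustrative.

```python
import numpy as np

def random_lowpass_signal(rng, n_samples, n_taps=301):
    """White noise filtered by a random low-pass FIR filter (assumed design)."""
    cutoff = rng.uniform(0.02, 0.2)          # assumed range, Nyquist = 1
    k = np.arange(n_taps) - (n_taps - 1) / 2
    # Windowed-sinc low-pass impulse response.
    h = cutoff * np.sinc(cutoff * k) * np.hamming(n_taps)
    noise = rng.standard_normal(n_samples + n_taps - 1)
    # "valid" keeps only fully overlapped outputs, giving n_samples values.
    return np.convolve(noise, h, mode="valid")

def normalized_error_stats(pred, actual):
    """RMSE and variance of errors normalized by the actual sample values."""
    pred = np.asarray(pred, dtype=float)
    actual = np.asarray(actual, dtype=float)
    err = (pred - actual) / actual
    return np.sqrt(np.mean(err ** 2)), np.var(err)
```

A signal of the stated length can then be drawn with, for example, `random_lowpass_signal(np.random.default_rng(0), 2000)`.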

For the ESPRIT method, the prediction for each one of the 4000 signals is based on twenty different models with orders equally spaced between 140 and 900. The overall prediction is computed by averaging these twenty results.

For the new method, the size p of the Hankel matrix H is set to 600. Alternative choices for p were also tested and showed no significant change in the results, as long as p is not too small compared to the length N of the signals. Regarding the null space, more than one size is used for each signal: an optimum set of null spaces is considered and the overall prediction is computed by averaging the individual results. Two constraints apply when choosing the optimum set of null spaces: a) the sizes of the null spaces in the set are consecutive numbers and b) the variance of the results of the set is minimal. The second constraint is imposed to guarantee maximum consistency of the chosen set.
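One possible reading of the two constraints is sketched below in Python/NumPy. It assumes that predictions have already been computed for a run of consecutive null-space sizes and that the number of sizes in the set (`width`) is fixed in advance; neither the candidate range nor the set size is specified in the text, and the function name `select_consecutive_set` is illustrative.

```python
import numpy as np

def select_consecutive_set(preds, width):
    """Pick the window of consecutive null-space sizes with minimal variance.

    preds[i] is the prediction obtained with null-space size q0 + i, so
    neighbouring entries correspond to consecutive sizes.
    Returns (start_index, averaged_prediction).
    """
    preds = np.asarray(preds, dtype=float)
    if not 1 <= width <= len(preds):
        raise ValueError("width must be between 1 and len(preds)")
    # Each row is one candidate set of `width` consecutive sizes.
    windows = np.lib.stride_tricks.sliding_window_view(preds, width)
    start = int(np.argmin(windows.var(axis=1)))
    return start, float(windows[start].mean())
```

The averaged value of the selected window plays the role of the overall prediction for the signal.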

Table 1 shows the obtained results for the four different setups in predicting future samples. In all cases the new method performs better than ESPRIT in terms of both RMSE and variance.

Figure 1, Figure 2, Figure 3 and Figure 4 show the shape of representative signals among the 4000 used. The rank of the covariance matrix of all 4000 signals varies considerably, ranging from low to high rank. This way the results are not biased with respect to the degree of predictability of the signals.

8. Conclusion

A new method for predicting future samples is presented and compared against the ESPRIT method. It was found to be more accurate and more consistent than ESPRIT, based on results from four thousand time series of variable spectral complexity. The new method exploits the properties of the eigenvectors of the Hankel matrix of the time series that are associated with the low eigenvalues. A generalization of the algorithm to multichannel signals was also presented. Both cases, with and without cross-channel coupling, were considered and different algorithms were proposed. The algorithms are fast and stable and can predict any set of future samples/vectors. Further research will focus on the optimal selection of the null space of the time series Hankel matrix as well as on the choice of the model size.

References

1. S. van Huffel, H. Chen, C. Decanniere and P. van Hecke, "Algorithm for time-domain NMR data fitting based on total least squares", J. Magn. Res., Series A 110, pp. 228-237, 1994.
2. R. Roy, A. Paulraj and T. Kailath, "ESPRIT - a subspace rotation approach to estimation of parameters of cisoids in noise", IEEE Trans. Acoust. Speech Signal Process., Vol. 34, No. 5, pp. 1340-1342, 1986.
3. R.A. Horn and C.R. Johnson, "Matrix Analysis", Cambridge University Press, 1996.
4. I. Dologlou and G. Carayannis, "LPC/SVD analysis of signals with zero modelling error", Signal Processing, Vol. 23 (3), pp. 293-298, 1991.
5. I. Dologlou and G. Carayannis, "Physical interpretation of signal reconstruction from reduced rank matrices", IEEE Trans. Acoust. Speech Signal Process., July 1991, pp. 1681-1682.
6. J.A. Cadzow, "Signal enhancement: A composite property mapping algorithm", IEEE Trans. Acoust. Speech Signal Process., Vol. 36, No. 1, January 1988, pp. 49-62.
7. I. Dologlou, "Estimating future samples in time series data", or doi.org/10.20944/preprints202509.2140.v2 July 2025.
8. H. Lütkepohl, "New Introduction to Multiple Time Series Analysis", Springer, 2005.
9. I. Dologlou, "A Non-Parametric Algorithm To Estimate Future Samples In Time Series", World Journal of Applied Mathematics and Statistics, 2025, 1(4), 01-05.

Figure 1. Typical signal among the 4000 used (high rank covariance).

Figure 2. Typical signal among the 4000 used (high rank covariance).

Figure 3. Typical signal among the 4000 used (low rank covariance).

Figure 4. Typical signal among the 4000 used (low rank covariance).

Table 1. RMSE and variance for the same set of 4000 signals using the new method and ESPRIT. Results refer to the estimation of the 1^st, 10^th, 20^th, and 30^th future samples.

	Number of tested signals	RMSE New Method	Variance New Method	RMSE ESPRIT	Variance ESPRIT
1^st Future sample	4000	9.707*10^-4	8.378*10^-7	0.0052	2.65*10^-5
10^th Future sample	4000	0.0064	4.0922*10^-5	0.0737	0.0054
20^th Future sample	4000	0.0128	1.624*10^-4	0.1933	0.0373
30^th Future sample	4000	0.0149	2.2255*10^-4	0.4215	0.1777


© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.