Preprint
Varying-Coefficient Additive Models with Density Responses and Functional Auto-Regressive Error Process


Submitted: 13 July 2025
Posted: 14 July 2025


Abstract
In many practical applications, data collected over time exhibit auto-correlation, which, if not addressed, can lead to incorrect statistical inferences. To address this, we propose a varying-coefficient additive model with density responses, incorporating a functional autoregressive (FAR) error process to account for serial dependency. We present a three-step spline-based estimation procedure for the varying-coefficient components after mapping densities into a linear space using the log-quantile density transformation. First, a B-spline series approximation provides initial estimates of the bivariate varying-coefficient functions. Second, spline estimation of the error process is obtained from the residuals. Lastly, improved estimates of the additive components are obtained by removing the estimated error process. Theoretical results, including convergence rates and asymptotic properties, are provided, and the practical performance of the method is demonstrated through simulations and real data analyses.

1. Introduction

Density, or more generally, distributional data appear increasingly often in various real-world research domains. Specific examples include the distributions of cross-sectional or intraday stock returns [9,15], mortality densities [13], and distributions of intrahub connectivity in neuroimaging [11,14]. In many applications, the density curves are observed consecutively over time, which we refer to in this article as a density time series. A motivating example is shown in Figure 1: panel (a) shows the density time series of the global mortality rate (‰) over an interval of 100 days between January 22, 2020 and April 15, 2021, while panel (b) offers an alternative view by plotting the densities on three different days. In this article, we consider a regression model in which a density time series, serving as the response, is coupled with scalar predictors.
Unlike conventional functional data, density data do not form a linear space due to inherent constraints, such as nonnegativity and the requirement to integrate to one. These characteristics present significant challenges when attempting to directly apply functional data analysis techniques to random densities. Several studies have addressed this issue and proposed various solutions, which can be broadly categorized into two types. One natural approach to overcoming the nonlinear limitations of density functions is to map them into a Hilbert space using a suitable transformation. In this direction, [12] proposed two continuous and invertible transformations, the log hazard transformation and the log quantile density (LQD) transformation, to map probability densities to an unrestricted space of square-integrable functions. [8] mapped densities to unrestricted square-integrable functions via the LQD transformation of [12], and modeled and fitted the responses with an additive functional-to-scalar regression. In order to forecast density functions derived from cross-sectional and intraday returns, [9] proposed two approaches based on compositional data analysis and a modified log quantile transformation, combining the functional principal component (FPC) representation with exponential smoothing forecasting methods. The second way involves choosing appropriate metrics and adopting a geometric approach. For instance, [17] adopted the infinite-dimensional version of the Aitchison geometry to construct a density-on-scalar linear regression model in Bayes–Hilbert spaces, while [13] discussed Fréchet regression in a general metric space with the Wasserstein metric. [5] utilized the geometry of tangent bundles of the Wasserstein space to propose distribution-on-distribution regression, and also studied an extension to autoregressive models of order one for distribution-valued time series. Using a different methodology, [19] also discussed an order-p Wasserstein autoregressive model.
Denote $\mathcal{F}$ as the space of density functions $f$ defined on a common support $\mathcal{U}$. Without loss of generality, we assume that $\mathcal{U} = [0, 1]$. Given a transformation $\Psi: \mathcal{F} \to L^2$, for any given $X \in \mathbb{R}^d$, the conditional Fréchet mean of the random density $f$ is defined as
$$\mu(\cdot \mid X) = \arg\min_{d \in \mathcal{F}} E\left( \|\Psi(f) - \Psi(d)\|_2^2 \mid X \right),$$
where the expectation $E$ is taken with respect to the joint distribution of $(X, f)$.
This is equivalent to the following equation:
$$\Psi(\mu(\cdot \mid X))(u) = E\left[ \Psi(f)(u) \mid X \right], \quad 0 \le u \le 1,$$
leading to the fact that
$$\mu(s \mid X) = \Psi^{-1}\left( E(\Psi(f)(u) \mid X) \right)(s), \quad 0 \le s \le 1.$$
Data that we consider in this article are the density time series $d_t$ coupled with scalar predictors $(X_t, Z_t)$. To deal with the density functions, we adopt the LQD transformation $\Psi: \mathcal{F} \to L^2$, where $\mathcal{F}$ is the class of density functions $d$ satisfying $\int_{\mathbb{R}} u^2 d(u)\, du < \infty$. For each $d_t \in \mathcal{F}$, let $Q_t(u)$ be the quantile function corresponding to the cumulative distribution function $F_t(y)$ of $d_t(y)$ with support on $[0, 1]$, and let $q_t(u)$ be the quantile density function, i.e., $q_t(u) = Q_t'(u) = \frac{d}{du} F_t^{-1}(u)$ for $u \in [0, 1]$. Then the LQD transformation of $d_t$ is defined as
$$\Psi(d_t)(u) = \log\left( \frac{d}{du} F_t^{-1}(u) \right), \quad u \in [0, 1].$$
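For intuition, the LQD transform can be computed numerically from a density on a grid by inverting its CDF, using the identity $q(u) = 1/d(Q(u))$ so that $\Psi(d)(u) = -\log d(Q(u))$. The following is a minimal sketch (in Python with numpy; the function name and grid choices are ours, not part of the proposed method):

```python
import numpy as np

def lqd_transform(density, y_grid, eps=1e-8):
    """Numerical LQD transform psi(u) = log q(u) of a strictly positive
    density on [0, 1], given its values `density` on the grid `y_grid`."""
    # CDF F(y) by trapezoidal integration, normalised so that F(1) = 1
    cdf = np.concatenate(([0.0], np.cumsum(
        0.5 * np.diff(y_grid) * (density[1:] + density[:-1]))))
    cdf /= cdf[-1]
    u = np.linspace(eps, 1.0 - eps, len(y_grid))
    Q = np.interp(u, cdf, y_grid)            # quantile function Q(u) = F^{-1}(u)
    d_at_Q = np.interp(Q, y_grid, density)   # d(Q(u))
    # q(u) = dQ/du = 1 / d(Q(u)), hence psi(u) = log q(u) = -log d(Q(u))
    return u, -np.log(np.maximum(d_at_Q, eps))
```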
In this article, a varying-coefficient additive model with a functional auto-regressive error process is proposed to estimate $E(\Psi(d_t) \mid X_t, Z_t)$. Then $d_t$ can be written as
$$d_t = \Psi^{-1}\left( E(\Psi(d_t) \mid X_t, Z_t) \right) + \delta_{t1},$$
where $\delta_{t1}$ is the error from the regression.
Besides $\delta_{t1}$, there is usually another source of error, which comes from the estimation of $d_t$, i.e., $\hat d_t = d_t + \delta_{t2}$. This is because, in most practical applications, only $n_t$ random samples generated from each $d_t$, rather than the density $d_t$ itself, can be observed; namely, $Y_{t1}, \ldots, Y_{tn_t} \sim d_t$. In this article, we assume that $n_t = n$. Following [12], $d_t$ is estimated by the modified kernel smoothing method, i.e.,
$$\hat d_t(y) = \sum_{i=1}^{n} K\left( \frac{y - Y_{ti}}{h} \right) w(y, h) \Big/ \sum_{i=1}^{n} \int_0^1 K\left( \frac{s - Y_{ti}}{h} \right) w(s, h)\, ds,$$
where the weight function $w(y, h)$ is defined so as to rectify the boundary effects as follows:
$$w(y, h) = \left( \int_{-y/h}^{1} K(u)\, du \right)^{-1} I_{\{y \in [0, h)\}} + \left( \int_{-1}^{(1-y)/h} K(u)\, du \right)^{-1} I_{\{y \in (1-h, 1]\}} + I_{\{y \in [h, 1-h]\}},$$
in which $K$ is a kernel of bounded variation, symmetric about 0, with bandwidth $h < 1/2$, such that $\int_0^1 K(u)\, du > 0$, and $\int_{\mathbb{R}} |u| K(u)\, du$, $\int_{\mathbb{R}} K^2(u)\, du$, and $\int_{\mathbb{R}} |u| K^2(u)\, du$ are finite. Thus, when we fit the regression model with $\hat d_t$ as the observation of $d_t$, we have
$$\hat d_t = \Psi^{-1}\left( E(\Psi(d_t) \mid X_t, Z_t) \right) + \delta_{t1} + \delta_{t2}.$$
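As an illustration of the modified kernel estimator above, the sketch below implements $\hat d_t$ with the Epanechnikov kernel, for which the boundary integrals have a closed form; the function names and the normalising grid size are our own choices:

```python
import numpy as np

def kde_boundary(samples, h, y_grid):
    """Boundary-corrected kernel density estimate on [0, 1] (a sketch of the
    modified estimator above, using the Epanechnikov kernel)."""
    def K(u):
        return 0.75 * np.maximum(1.0 - u ** 2, 0.0)

    def K_int(a, b):                        # closed-form \int_a^b K(u) du
        F = lambda v: 0.75 * (v - v ** 3 / 3.0)
        return np.maximum(F(np.clip(b, -1, 1)) - F(np.clip(a, -1, 1)), 0.0)

    def w(y):                               # boundary weight w(y, h)
        return np.where(y < h, 1.0 / K_int(-y / h, 1.0),
               np.where(y > 1.0 - h, 1.0 / K_int(-1.0, (1.0 - y) / h), 1.0))

    num = K((y_grid[:, None] - samples[None, :]) / h).sum(axis=1) * w(y_grid)
    s = np.linspace(0.0, 1.0, 512)          # normalising integral over [0, 1]
    den = np.trapz(K((s[:, None] - samples[None, :]) / h).sum(axis=1) * w(s), s)
    return num / den
```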
The contribution of this article lies in integrating density time series modeling with a functional autoregressive error process. A common assumption in regression modeling is the independence of the random errors. However, data collected over time often exhibit serial dependence, rendering this assumption inappropriate. By incorporating an autoregressive structure into the error process, this work accounts for the temporal correlation inherent in density time series, thereby improving the model's flexibility and accuracy. For example, daily price curves and liquidity demand-supply curves in economics and finance, or climatological curves describing natural phenomena [1,3,4,6], all change over time. In such cases, the daily fluctuations in the collected data are heavily influenced by previous values, resulting in what is known as serial dependence. Numerous empirical studies have demonstrated that neglecting the temporal dependencies between observations can lead to biased statistical results and flawed inferences. Consequently, it is crucial to incorporate autoregressive error structures that explicitly account for the autocorrelation present within the data, ensuring a more accurate representation of the underlying temporal dynamics in regression modeling. While functional autoregressive (FAR) models have been extensively studied [1,2,3,4,6,10,18], there is a notable gap in the literature regarding the use of FAR processes to model the random error component in functional regression models. In this article, to improve the accuracy of density time series predictions, we integrate the FAR error structure within a varying-coefficient additive regression model. This approach allows for better handling of temporal dependencies in the error process, ultimately enhancing the model's predictive performance.
The remainder of this article is organized as follows. Section 2 outlines the methodology for constructing a varying coefficient additive model with a density response, incorporating a functional autoregressive (FAR) error process. In Section 3, we introduce a three-step procedure for estimating the bivariate varying-coefficient components within the model. Section 4 provides the theoretical results and corresponding inferences derived from the model. Monte Carlo simulations are conducted in Section 5 to demonstrate the efficiency and robustness of the proposed approach. In Section 6, we showcase the practical applicability of the model through analyses of COVID-19 data and U.S. income data. Finally, a discussion of the findings is provided in Section 7, with detailed proofs of the theoretical results included in the Appendix.

2. Model Setup

In this article, we focus on the modeling of density responses. Due to the constraints on density functions, which are nonnegative and integrate to 1, we work with the functions after the LQD transformation.
In this article, the main goal is to estimate $E(\Psi(d_t)(u) \mid x)$ based on the transformed densities, via
$$E(\Psi(d_t)(u) \mid x) = \sum_{m=1}^{k} z_{t,m}\, g_m(u, x_{t,m}), \quad 0 \le u \le 1,$$
leading to our proposed varying-coefficient additive model with density responses and a functional auto-regressive $FAR(p)$ error process (DVCA-FAR):
$$f_t(u) = \Psi(d_t)(u) = \sum_{m=1}^{k} z_{t,m}\, g_m(u, x_{t,m}) + \varepsilon_t(u), \quad 0 \le u \le 1, \; 1 \le t \le T,$$
where $\varepsilon_t(u)$ is the $FAR(p)$ process
$$\varepsilon_t(u) = \int \gamma_1(s, u)\, \varepsilon_{t-1}(s)\, ds + \cdots + \int \gamma_p(s, u)\, \varepsilon_{t-p}(s)\, ds + e_t(u).$$
In the above model, the random density $d_t(\cdot) \in \mathcal{F}$ serves as the response, and $\Psi: \mathcal{F} \to L^2$ is the LQD transformation. Each density is coupled with the $k$-dimensional covariates $x_t = (x_{t,1}, \ldots, x_{t,k})^\tau$ and $z_t = (z_{t,1}, \ldots, z_{t,k})^\tau$ with supports $S_x$ and $S_z$, respectively. Without loss of generality, we take $S_x = S_z = [0, 1]$. In this article, the covariate $x_t$ can be $z_t$ or $t/T$.
The bivariate functions $g_m(\cdot, x_m)$ quantify the effect of $z$, and the $\gamma_l(\cdot, \cdot)$ are smooth functions satisfying $\int\!\int \gamma_l^2(s, u)\, du\, ds < \infty$. Moreover, the errors $e_t(u)$ form an independent and identically distributed random process satisfying $E(e_t(u)) = 0$ and $\mathrm{Cov}(e_t(u), e_t(s)) = \sigma_t^2(u, s)$.
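To make the FAR error structure concrete, the following sketch simulates an $FAR(1)$ process on a grid by discretizing the integral; the kernel $\gamma_1(s, u) = 0.2us$ and the innovation are illustrative choices borrowed from Case 1 of Section 5:

```python
import numpy as np

def simulate_far1(T, n_grid=101, seed=0):
    """Simulate eps_t(u) = \int gamma_1(s, u) eps_{t-1}(s) ds + e_t(u) on a
    grid, with gamma_1(s, u) = 0.2*u*s and the Case-1 innovation of Sec. 5."""
    rng = np.random.default_rng(seed)
    u = np.linspace(0.0, 1.0, n_grid)
    gamma = 0.2 * np.outer(u, u)             # gamma_1(s, u), rows indexed by s
    eps = np.zeros((T, n_grid))
    for t in range(1, T):
        e_t = (0.2 * rng.normal(0.0, 0.1) * np.sin(np.pi * u)
               + rng.normal(0.0, 0.05) * np.sin(2.0 * np.pi * u))
        # discretise the integral over s with the trapezoidal rule
        eps[t] = np.trapz(gamma * eps[t - 1][:, None], u, axis=0) + e_t
    return u, eps
```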
With the estimated densities $\hat d_t$, let $\hat f_t = \Psi(\hat d_t)$. Then the DVCA-FAR model can be written as
$$\hat f_t(u) = \sum_{m=1}^{k} z_{t,m}\, g_m(u, x_{t,m}) + \varepsilon_t(u) + \varepsilon_{ft}(u), \quad 0 \le u \le 1,$$
where $\varepsilon_{ft}(u)$ is the random error generated from the transformation of the estimated density.

3. Three-Step Estimation Methodology

We propose a three-step estimation procedure. First, the B-spline smoothing method is employed to obtain the initial estimate of the bivariate varying coefficients, disregarding the error structure. Second, using the initial estimate and the response, we estimate the error process. The functional autoregressive (FAR) process is then estimated using the sequential test proposed by [10]. Finally, after removing the estimated FAR error, the optimal estimate is derived using the spline method.

3.1. Initial Estimation of Bivariate Varying-Coefficient Function

First, disregarding the error structure, we adopt the B-spline series approximation method to obtain initial estimates of $g_m(u, x_m)$, $m = 1, \ldots, k$.
Let $\{B_0(u), \ldots, B_{N_0}(u)\}$ be the set of B-spline basis functions for $u$ of order $q$ with $L_0$ interior knots, where $N_0 + 1 = L_0 + q$. Similarly, for each $m = 1, \ldots, k$, let $\{B_{0,m}(x_m), \ldots, B_{N_m,m}(x_m)\}$ be the set of B-spline basis functions of order $q$ for $x_m$ with $L_m$ interior knots, where $N_m + 1 = L_m + q$. Denote $b_{j,m}^*(x_m)$ as the normalization of $B_{j,m}(x_m)$ and $b_r(u) = N_0^{1/2} B_r(u)$ as the scaled version of $B_r(u)$.
Using these basis functions, we define the tensor product of the B-spline basis as
$$b_{r,j,m}(u, x_m) = b_r(u)\, b_{j,m}^*(x_m), \quad 1 \le r \le N_0, \; 1 \le j \le N_m, \; 1 \le m \le k.$$
The spline approximation of $g_m(u, x_m)$ is given by
$$g_m(u, x_m) \approx \sum_{r=1}^{N_0} \sum_{j=1}^{N_m} \lambda_{r,j,m}\, b_{r,j,m}(u, x_m), \quad 1 \le m \le k.$$
With the least squares method, the estimate of $g_m(u, x_m)$ is
$$\tilde g_m(u, x_m) = \sum_{r=1}^{N_0} \sum_{j=1}^{N_m} \tilde\lambda_{r,j,m}\, b_{r,j,m}(u, x_m), \quad 1 \le m \le k,$$
where $\tilde\lambda = (\tilde\lambda_{1,1,1}, \ldots, \tilde\lambda_{N_0,N_k,k})^\tau$ is the $(N_0 \sum_{m=1}^{k} N_m)$-dimensional vector satisfying
$$\tilde\lambda = \arg\min_{\lambda} \sum_{t=1}^{T} \sum_{i=1}^{n} \left\{ \hat f_t(u_i) - \sum_{m=1}^{k} z_{t,m} \sum_{r=1}^{N_0} \sum_{j=1}^{N_m} \lambda_{r,j,m}\, b_{r,j,m}(u_i, x_{t,m}) \right\}^2.$$
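A minimal numpy/scipy sketch of this first step is given below; the basis sizes $N_0 = N_m = 6$ and the helper names are ours, and the normalization/scaling of the basis functions is omitted since it does not affect the least-squares fit:

```python
import numpy as np
from scipy.interpolate import BSpline

def bspline_basis(x, n_basis, order=4):
    """Open-uniform B-spline basis of the given order on [0, 1], so that
    (number of basis functions) = (interior knots) + order, as in the text."""
    inner = np.linspace(0.0, 1.0, n_basis - order + 2)[1:-1]
    knots = np.concatenate((np.zeros(order), inner, np.ones(order)))
    B = np.empty((len(x), n_basis))
    for j in range(n_basis):
        c = np.zeros(n_basis)
        c[j] = 1.0
        B[:, j] = BSpline(knots, c, order - 1, extrapolate=False)(x)
    return np.nan_to_num(B)

def fit_initial(f_hat, z, x, u_grid, N0=6, Nm=6):
    """Step 1 (sketch): least-squares fit of the tensor-product spline
    coefficients, ignoring the FAR error structure.

    f_hat : (T, n) transformed responses on u_grid; z, x : (T, k) covariates.
    """
    T, n = f_hat.shape
    k = z.shape[1]
    Bu = bspline_basis(u_grid, N0)                        # (n, N0)
    blocks = []
    for m in range(k):
        Bx = bspline_basis(x[:, m], Nm)                   # (T, Nm)
        # design column for (t, i): z_{t,m} * b_r(u_i) * b_{j,m}(x_{t,m})
        blk = (z[:, m, None, None, None] * Bu[None, :, :, None]
               * Bx[:, None, None, :]).reshape(T * n, N0 * Nm)
        blocks.append(blk)
    B = np.hstack(blocks)                                 # (T*n, k*N0*Nm)
    lam, *_ = np.linalg.lstsq(B, f_hat.reshape(T * n), rcond=None)
    return lam
```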
Theorem 1 shows that the initial estimate $\tilde g_m(u, x_m)$ is uniformly consistent.

3.2. Estimation of FAR Error Process

With the initial estimates $\tilde g_m(u, x_m)$, we now estimate the FAR error process. To this end, let
$$\tilde\varepsilon_t(u) = f_t(u) - \sum_{m=1}^{k} z_{t,m}\, \tilde g_m(u, x_{t,m}), \quad 1 \le t \le T.$$
Denote the additive function $\rho_t(u) = \sum_{l=1}^{p} \int \gamma_l(s, u)\, \varepsilon_{t-l}(s)\, ds$; thus, the $FAR(p)$ error process (3) can be written as $\varepsilon_t(u) = \rho_t(u) + e_t(u)$.
Let $\{B_0(u), B_1(u), \ldots, B_N(u)\}$ be the set of B-spline basis functions of order $q$ with $L$ interior knots, where $N + 1 = L + q$. Define the tensor product of the B-spline basis as
$$b_{r,j}(u, s) = B_r(u)\, B_j(s), \quad 1 \le r, j \le N.$$
Then, the spline approximation of the functions $\gamma_l(\cdot, \cdot)$ is given by
$$\gamma_l(s, u) \approx \sum_{r=1}^{N} \sum_{j=1}^{N} \mu_{r,j,l}\, b_{r,j}(u, s), \quad 1 \le l \le p.$$
The estimate of the $pN^2$-dimensional vector $\mu = (\mu_{1,1,1}, \ldots, \mu_{N,N,p})^\tau$ can be obtained by minimizing the following quadratic loss function:
$$\hat\mu = \arg\min_{\mu} \sum_{t=p+1}^{T} \sum_{i=1}^{n} \left\{ \tilde\varepsilon_t(u_i) - \sum_{l=1}^{p} \sum_{r=1}^{N} \sum_{j=1}^{N} \mu_{r,j,l} \int b_{r,j}(u_i, s)\, \tilde\varepsilon_{t-l}(s)\, ds \right\}^2.$$
Then, the corresponding estimates of $\gamma_l(\cdot, \cdot)$ and the additive function $\rho_t(u)$ are given by
$$\hat\gamma_l(s, u) = \sum_{r=1}^{N} \sum_{j=1}^{N} \hat\mu_{r,j,l}\, b_{r,j}(u, s), \quad 1 \le l \le p,$$
and
$$\hat\rho_t(u) = \sum_{l=1}^{p} \sum_{r=1}^{N} \sum_{j=1}^{N} \hat\mu_{r,j,l} \int b_{r,j}(u, s)\, \hat\varepsilon_{t-l}(s)\, ds, \quad 0 \le u \le 1, \; p+1 \le t \le T,$$
respectively.
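The following sketch carries out this second step; since $b_{r,j}(u_i, s) = B_r(u_i) B_j(s)$, each integral $\int b_{r,j}(u_i, s)\,\tilde\varepsilon_{t-l}(s)\,ds$ reduces to $B_r(u_i) \int B_j(s)\,\tilde\varepsilon_{t-l}(s)\,ds$, which makes the least-squares problem easy to assemble. It reuses `bspline_basis` from the Section 3.1 sketch, and the basis size $N = 5$ is an illustrative choice:

```python
import numpy as np

def fit_far(eps_tilde, u_grid, p=1, N=5):
    """Step 2 (sketch): least-squares estimate of the FAR(p) kernel
    coefficients mu_{r,j,l}, from residual curves eps_tilde of shape (T, n)."""
    T, n = eps_tilde.shape
    B = bspline_basis(u_grid, N)                          # (n, N)
    # I[t, j] = \int B_j(s) eps_tilde_t(s) ds, by the trapezoidal rule
    I = np.trapz(B[None, :, :] * eps_tilde[:, :, None], u_grid, axis=1)
    cols = []
    for l in range(1, p + 1):
        # regressor for lag l at (t, u_i): B_r(u_i) * I[t - l, j]
        X_l = (B[None, :, :, None] * I[p - l:T - l, None, None, :]
               ).reshape((T - p) * n, N * N)
        cols.append(X_l)
    X = np.hstack(cols)                                   # ((T-p)*n, p*N*N)
    mu, *_ = np.linalg.lstsq(X, eps_tilde[p:].reshape((T - p) * n), rcond=None)
    return mu.reshape(p, N, N)                            # mu[l-1, r, j]
```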
In practice, the order $p$ of the error process is unknown. To address this, we utilize the sequential test of [10] to identify the order $p$. The detailed identification procedure is given in Section 3.4.2.

3.3. Improved Estimation of Bivariate Varying-Coefficient Function

By removing the estimated error process (8) from the response function f t ( u ) and repeating the same procedure as in Section 3.1, we obtain an improved estimation of g m ( u , x m ) .
To do so, we first denote
$$f_t^c(u) = f_t(u) - \sum_{l=1}^{p} \int \gamma_l(s, u)\, \varepsilon_{t-l}(s)\, ds, \quad 0 \le u \le 1, \; p+1 \le t \le T,$$
and the estimate of $f_t^c(u)$ as
$$\hat f_t^c(u) = f_t(u) - \sum_{l=1}^{p} \int \hat\gamma_l(s, u)\, \hat\varepsilon_{t-l}(s)\, ds, \quad 0 \le u \le 1, \; p+1 \le t \le T.$$
On the other hand,
$$f_t^c(u) = \sum_{m=1}^{k} z_{t,m}\, g_m(u, x_{t,m}) + e_t(u), \quad 0 \le u \le 1.$$
Therefore, following the same procedure as in the first step, the improved spline approximation estimates are given by
$$\hat g_m(u, x_m) = \sum_{r=1}^{N_0} \sum_{j=1}^{N_m} \hat\lambda_{r,j,m}\, b_{r,j,m}(u, x_m), \quad 1 \le m \le k,$$
where $\hat\lambda = (\hat\lambda_{1,1,1}, \ldots, \hat\lambda_{N_0,N_k,k})^\tau$ is the $(N_0 \sum_{m=1}^{k} N_m)$-dimensional vector satisfying
$$\hat\lambda = \arg\min_{\lambda} \sum_{t=1}^{T} \sum_{i=1}^{n} \left\{ \hat f_t^c(u_i) - \sum_{m=1}^{k} z_{t,m} \sum_{r=1}^{N_0} \sum_{j=1}^{N_m} \lambda_{r,j,m}\, b_{r,j,m}(u_i, x_{t,m}) \right\}^2.$$
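A sketch of this third step, reusing the helpers above, might look as follows; it forms $\hat\rho_t(u)$ from the fitted FAR coefficients and refits the corrected responses:

```python
import numpy as np

def fit_improved(f_hat, z, x, u_grid, eps_tilde, gamma_hat):
    """Step 3 (sketch): subtract the fitted FAR error and refit the
    varying-coefficient surfaces; reuses bspline_basis/fit_initial/fit_far."""
    p, N, _ = gamma_hat.shape
    T, n = f_hat.shape
    B = bspline_basis(u_grid, N)
    I = np.trapz(B[None, :, :] * eps_tilde[:, :, None], u_grid, axis=1)
    rho = np.zeros_like(f_hat)
    for t in range(p, T):
        for l in range(1, p + 1):
            # rho_t(u) = sum_l \int gamma_l(s, u) eps_{t-l}(s) ds
            rho[t] += B @ gamma_hat[l - 1] @ I[t - l]
    # refit on the corrected responses f_t - rho_t, t = p+1, ..., T
    return fit_initial(f_hat[p:] - rho[p:], z[p:], x[p:], u_grid)
```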
Theorems 2 and 3 show the uniform convergence and asymptotic normality of the improved estimate $\hat g_m(u, x_m)$. Moreover, the simulation studies in Section 5 show that the improved estimate is more efficient than the initial estimate $\tilde g_m(u, x_m)$.

3.4. Implementation

3.4.1. Selection of Bandwidth

In empirical applications, prior to modeling, it is necessary to estimate the density, which involves selecting the appropriate bandwidth for the modified kernel estimation. In this section, we employ the leave-one-out cross-validation method to determine the optimal bandwidth. Specifically, the bandwidth h is chosen by minimizing the following mean squared error (MSE):
$$CV(h) = \frac{1}{nT} \sum_{t=1}^{T} \sum_{i=1}^{n} \left[ d_t(y_{ti}) - \hat d_t(y_{ti}) \right]^2,$$
where, for each $i = 1, \ldots, n$, $\hat d_t(y_{ti})$ is the estimate of $d_t(y_{ti})$ with bandwidth $h$ obtained by using the observations from the $t$-th section other than the $i$-th one.
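Note that the criterion above involves the true $d_t$, which is available in simulations but not in applications; a standard computable surrogate is the least-squares leave-one-out criterion, which we sketch below using the `kde_boundary` helper from Section 1 (this surrogate, and the function name, are ours):

```python
import numpy as np

def loo_cv_bandwidth(samples_by_t, h_grid):
    """Bandwidth selection sketch: score each h by the least-squares LOO
    criterion  \int \hat d^2 - (2/n) sum_i \hat d_{-i}(Y_i), summed over t."""
    s = np.linspace(0.0, 1.0, 256)
    scores = []
    for h in h_grid:
        score = 0.0
        for Y in samples_by_t:               # the n observations of section t
            dhat = kde_boundary(Y, h, s)
            loo = np.array([kde_boundary(np.delete(Y, i), h, Y[i:i + 1])[0]
                            for i in range(len(Y))])
            score += np.trapz(dhat ** 2, s) - 2.0 * loo.mean()
        scores.append(score)
    return h_grid[int(np.argmin(scores))]
```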

3.4.2. Identifying the Order of the FAR Process

In this section, we utilize the order determination procedure proposed by [10] to identify the order $p$ of the FAR error process. The main idea is to represent the FAR process as a fully functional linear model with dependent regressors and to construct a sequential test procedure.
Consider the sequence of hypotheses
$$H_{0,p}: \{\varepsilon_t\} \text{ is } FAR(p) \quad \text{vs.} \quad H_{a,p+1}: \{\varepsilon_t\} \text{ is } FAR(p+1), \quad p = 0, 1, 2, \ldots,$$
where $FAR(0)$ denotes an independent and identically distributed process. The sequential test begins with $p = 0$ and terminates as soon as $H_{0,p}$ is not rejected; the order of the process is then identified as $p$. See [10] for more details and further explanation.
To construct the test statistic, define $\eta_j(s) = \sum_{l=1}^{p} \tilde\varepsilon_{j-l}(sp - (l-1))\, I_l(s)$ and $\varphi(s, u) = p \sum_{l=1}^{p} \gamma_l(sp - (l-1), u)\, I_l(s)$, where $I_l$ is the indicator function of the interval $[(l-1)/p, l/p]$. Denote by $\{\hat x_j, 1 \le j \le T\}$ the orthonormal basis of $L^2$ obtained from the eigenfunctions of $\hat C_\eta(s, u) = \frac{1}{T} \sum_{j=1}^{T} (\eta_j(s) - \bar\eta(s))(\eta_j(u) - \bar\eta(u))$, with corresponding decreasingly ordered eigenvalues $\hat\lambda_j$, where $\bar\eta$ is the mean function of the $\eta_j$. In the proposed method, only the first $q_\eta$ eigenfunction/eigenvalue pairs are used. Moreover, define $\{\hat y_j, 1 \le j \le T\}$ and $q_\pi$ analogously to $\{\hat x_j\}$ and $q_\eta$ for the response functions, denoted $\pi_j$.
For the product space $L^2([0, 1] \times [0, 1])$, denote $\eta(j, k) = \langle \eta_j, \hat x_k \rangle$, $\pi(j, m) = \langle \pi_j, \hat y_m \rangle$, and $\psi(k, m) = \langle \varphi, \hat x_k \hat y_m \rangle$. Let $\boldsymbol{\pi} = [\pi(j, m)]_{T \times q_\pi}$, $\boldsymbol{\eta} = [\eta(j, k)]_{T \times q_\eta}$, and $\boldsymbol{\psi} = [\psi(k, m)]_{q_\eta \times q_\pi}$, $j = 1, \ldots, T$; $k = 1, \ldots, q_\eta$; $m = 1, \ldots, q_\pi$.
Construct the matrix $\hat A$ with entries $\hat A(k, k') = \langle \hat x_{k,p}, \hat x_{k',p} \rangle$, where $\hat x_{k,p}(s) = \hat x_k\left( \frac{s + p - 1}{p} \right)$, $0 \le s \le 1$. Define the orthonormal eigenvectors $\hat\beta_k$ with corresponding ordered eigenvalues $\hat\xi_1 \ge \cdots \ge \hat\xi_{q_\eta}$ by $\hat A \hat\beta_k = \hat\xi_k \hat\beta_k$, $1 \le k \le q_\eta$, and denote $\hat B = [\hat\beta_1, \ldots, \hat\beta_{q^*}]$, where $q^* = \max\{ k \in \{1, \ldots, q_\eta\} : \|\hat z_{k,p}\|^2 \ge 0.9p \}$ and $\hat z_{k,p}(s) = \sum_{i=1}^{q_\eta} \hat\beta_{k,i}\, \hat x_{i,p}(s)$.
Following [10], the test statistic is constructed as
$$\hat\tau_p = \frac{1}{T}\, \mathrm{vec}[\hat B^\tau \hat\psi]^\tau \left[ (I_{q_\pi} \otimes \hat B^\tau)(\hat C \otimes \hat\Lambda)(I_{q_\pi} \otimes \hat B) \right]^{-1} \mathrm{vec}[\hat B^\tau \hat\psi],$$
where $\hat\Lambda = \mathrm{diag}(\hat\lambda_1, \ldots, \hat\lambda_{q_\eta})$ and $\hat C = \frac{1}{T} (\boldsymbol{\pi} - \boldsymbol{\eta} \hat\psi)^\tau (\boldsymbol{\pi} - \boldsymbol{\eta} \hat\psi)$. Under $H_{0,p}$, the test statistic has an approximate chi-squared distribution with $q_\pi q^*$ degrees of freedom.
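The sequential stopping logic itself is simple; the sketch below shows it, with the statistic construction delegated to an assumed callable `test_stat` (not implemented here) that returns $\hat\tau_p$ and its degrees of freedom following [10]:

```python
from scipy.stats import chi2

def identify_far_order(residuals, test_stat, alpha=0.05, p_max=5):
    """Sequential order identification (sketch). `test_stat(residuals, p)` is
    an assumed callable returning (tau_hat, dof) for testing H_{0,p} vs
    H_{a,p+1}, following [10]."""
    for p in range(p_max + 1):
        tau, dof = test_stat(residuals, p)
        if tau <= chi2.ppf(1.0 - alpha, dof):   # fail to reject H_{0,p}: stop
            return p
    return p_max                                # cap if every H_{0,p} rejected
```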

4. Theoretical Results

In this section, we discuss the asymptotic properties of both the initial and improved estimates of $g_m(u, x_m)$. Moreover, the consistency of the estimated order $p$ is derived. All proofs are given in the Appendix.
Throughout the remainder of this article, for any fixed interval $[a, b]$, we denote the space of $l$-th order smooth functions as $C^{(l)}[a, b] = \{g \mid g^{(l)} \in C[a, b]\}$ and the class of Lipschitz-continuous functions for some fixed constant $C > 0$ as $\mathrm{Lip}([a, b], C) = \{g \mid |g(x) - g(x')| \le C|x - x'|, \; x, x' \in [a, b]\}$. Let $S_{x_m}$ and $S_{z_m}$ denote the supports of $x_m$ and $z_m$, respectively. Then, the supports of $x$ and $z$ are $S_x = \prod_{m=1}^{k} S_{x_m}$ and $S_z = \prod_{m=1}^{k} S_{z_m}$, respectively. The necessary assumptions for the asymptotic results are as follows.
(A1) For any $d \in \mathcal{F}$, $d$ is differentiable, and there exists a constant $M > 1$ such that $\|d\|_\infty$, $\|1/d\|_\infty$, and $\|d'\|_\infty$ are all bounded by $M$.
(A2) (a) The kernel density $K$ is Lipschitz-continuous, bounded, and symmetric about 0. Furthermore, $K \in \mathrm{Lip}([-1, 1], L_k)$ for some constant $L_k > 0$. (b) The kernel density $K$ satisfies $\int_0^1 K(u)\, du > 0$, and $\int_{\mathbb{R}} |u| K(u)\, du$, $\int_{\mathbb{R}} K^2(u)\, du$, and $\int_{\mathbb{R}} |u| K^2(u)\, du$ are finite.
(A3) The covariates $x_{t,m}$, $z_{t,m}$, $1 \le m \le k$, and the errors $\varepsilon_t(u)$ satisfy the following moment conditions: for some $s > 2$,
$$\max_{1 \le t \le T} \max_{1 \le m \le k} E(|x_{t,m}|^{2s}) < \infty, \quad \max_{1 \le t \le T} \max_{1 \le m \le k} E(|z_{t,m}|^{2s}) < \infty, \quad \max_{1 \le t \le T} \sup_u E(|\varepsilon_t(u)|^{2s}) < \infty.$$
For each $t = 1, \ldots, T$, the covariance function $\mathrm{Cov}(\varepsilon_t(s), \varepsilon_t(v)) = \Sigma_t(s, v)$ has finite eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots$, with maximal eigenvalue $\lambda_{\max}$, satisfying $\sum_j \lambda_j < \infty$.
(A4) The additive component functions $g_m(u, x_m)$, $1 \le m \le k$, are continuous on $[0, 1] \times [a_m, b_m]$ and twice continuously partially differentiable with respect to $u$ and $x_m$, where $[a_m, b_m]$ is a compact subset of $S_{x_m}$.
(A5) $N_0 \asymp (nT)^{1/6} \log nT$, $N_m \asymp (nT)^{1/6} \log nT$, $1 \le m \le k$, and $h \asymp n^{-1/3}$, as $n, T \to \infty$.
Remark 1.
Assumption (A1) is basic and essential for deriving the consistency of densities after transformation. The conditions in (A2) on the kernel function K ( · ) are mild and can be satisfied by commonly used kernel functions such as the uniform and Epanechnikov kernels. The moment conditions in (A3) are crucial for deriving the uniform convergence and other asymptotic properties based on the spline function. The smoothness conditions for the component functions in (A4) are greatly relaxed. The conditions in (A5) are commonly applied in spline smoothing to ensure optimal convergence rates.
We first derive the uniform consistency of the initial estimates of bivariate functions g m ( u , x m ) , as stated in Theorem 1.
Theorem 1.
Assume that assumptions (A1)–(A5) hold, and that $\tilde g_m(u, x_m)$, $m = 1, \ldots, k$, are the initial estimates of $g_m(u, x_m)$ defined by (5). Then, as $n \to \infty$ and $T \to \infty$, it holds that
$$\sup_{u, x_m \in [0, 1]} |\tilde g_m(u, x_m) - g_m(u, x_m)| = O_p\left( (nT)^{-1/3} \log(nT) + n^{-1/3} \right).$$
Theorem 2 characterizes the uniform convergence of the improved estimate of $g_m(u, x_m)$, and Theorem 3 describes the asymptotic properties of both the initial and improved estimates.
Theorem 2.
Assume that assumptions (A1)–(A5) hold, that $\hat g_m(u, x_m)$, $m = 1, \ldots, k$, are the improved estimates of $g_m(u, x_m)$ defined by (9), and that the order $p$ of the functional error process is known. Then, as $n \to \infty$ and $T \to \infty$, the following holds:
$$\sup_{u, x_m \in [0, 1]} |\hat g_m(u, x_m) - g_m(u, x_m)| = O_p\left( (nT)^{-1/3} (\log(nT))^2 + n^{-1/3} \right).$$
To present the asymptotic normality of the estimates, we introduce some notation. Denote $\mathbf{b}(u, x_{t,m}) = (b_{1,1,m}(u, x_{t,m}), \ldots, b_{N_0,N_m,m}(u, x_{t,m}))^\tau$, $\mathbf{b}_z(u, x_{t,m}) = z_{t,m}\, \mathbf{b}(u, x_{t,m})$, $\mathbf{Bz}_{t,m} = (\mathbf{b}_z(u_1, x_{t,m}), \ldots, \mathbf{b}_z(u_n, x_{t,m}))^\tau_{n \times N_0 N_m}$, $\mathbf{B}_m = (\mathbf{Bz}_{1,m}^\tau, \ldots, \mathbf{Bz}_{T,m}^\tau)^\tau$, and $\mathbf{B} = (\mathbf{B}_1, \ldots, \mathbf{B}_k)$.
Let $\mathbf{B}^* = \mathbf{B}/\sqrt{nT}$, and let $A_m = (0, \ldots, I, \ldots, 0)$ be a $1 \times k$ block matrix, with the $m$-th block being the $N_0 N_m \times N_0 N_m$ identity matrix and the $j$-th ($j \ne m$) block being the $N_0 N_m \times N_0 N_j$ zero matrix.
Theorem 3.
Assume that assumptions (A1)–(A5) hold, and that $\tilde g_m(u, x_m)$ and $\hat g_m(u, x_m)$, $m = 1, \ldots, k$, are the initial and improved estimates of $g_m(u, x_m)$ defined by (5) and (9), respectively. Then, as $nT \to \infty$, the following hold for all $u \in (0, 1)$ and $x_m \in [0, 1]$:
(i) The initial estimate $\tilde g_m(u, x_m)$ is asymptotically normally distributed, i.e.,
$$\sqrt{nT}\, (C_m \Sigma_\varepsilon C_m^\tau)^{-1/2} \left( \tilde g_m(u, x_m) - g_m(u, x_m) \right) \xrightarrow{D} N(0, 1),$$
where $C_m = \mathbf{b}^\tau(u, x_m)\, E\left( A_m (\mathbf{B}^{*\tau} \mathbf{B}^*)^{-1} \mathbf{B}^{*\tau} \right)$, and the covariance matrix $\Sigma_\varepsilon = (\Sigma_{t,s})_{1 \le t, s \le T}$, with $\Sigma_{t,s} = \mathrm{Cov}(\varepsilon_t, \varepsilon_s)$.
(ii) The improved estimate $\hat g_m(u, x_m)$ is asymptotically normally distributed, i.e.,
$$\sqrt{nT}\, (C_m \Xi_\varepsilon C_m^\tau)^{-1/2} \left( \hat g_m(u, x_m) - g_m(u, x_m) \right) \xrightarrow{D} N(0, 1),$$
where the covariance matrix $\Xi_\varepsilon = \mathrm{diag}(\Xi_{t,t})_{1 \le t \le T}$, with $\Xi_{t,t}(u, s) = \sigma_t^2(u, s)$.

5. Numerical Study

In this section, we conduct two simulation studies to demonstrate the performance of the proposed identification and estimation procedure for the additive model.

5.1. Case 1

With the assumption that the auto-regressive error process is known, this case is conducted to demonstrate the performance of the estimation procedure with finite $n$ and $T$. We consider the following DVCA-FAR(1) model:
$$f_t(u) = z_{t,1}\, g_1(u, x_{t,1}) + z_{t,2}\, g_2(u, x_{t,2}) + \varepsilon_t(u), \quad 0 \le u \le 1,$$
where the error function $\varepsilon_t(u)$ is
$$\varepsilon_t(u) = \int \gamma_1(s, u)\, \varepsilon_{t-1}(s)\, ds + e_t(u), \quad 2 \le t \le T.$$
Let the bivariate varying-coefficient functions be
$$g_1(u, x_{t,1}) = \sin(2\pi u)(2x_{t,1} - 1), \quad g_2(u, x_{t,2}) = \sin(2\pi u)\sin(2\pi x_{t,2}),$$
and the coefficient functions be
$$\gamma_1(s, u) = 0.2\, us, \quad e_t(u) = 0.2\, \eta_{t,1} \sin(\pi u) + \eta_{t,2} \sin(2\pi u),$$
where $\eta_{t,1} \sim N(0, 0.1^2)$, $\eta_{t,2} \sim N(0, 0.05^2)$, and $\eta_{t,1}$ is independent of $\eta_{t,2}$ for $u \in [0, 1]$.
The covariates $z_{t,1}, z_{t,2}$ are generated from $N(0, 1)$ and $N(0, 0.5^2)$, respectively, while $x_{t,1}, x_{t,2}$ are generated by $(x_{t,1}, x_{t,2})^\tau = (\Phi(v_{t,1}), \Phi(v_{t,2}))^\tau$, $1 \le t \le T$, where $\Phi$ is the cumulative distribution function of the standard normal distribution and $v_{t,1}, v_{t,2}$ are mutually independent standard normal variables.
To generate the response densities, for each given $Z = z$ and $X = x$, let $\alpha(u, x, z)$ be the additive surface given by $\alpha(u, x, z) = \sum_{m} z_m\, g_m(u, x_m)$. The conditional quantile function $Q(\cdot \mid x, z)$ with the error process $\varepsilon(u)$ satisfies $Q(u \mid x, z) = F^{-1}(u \mid x, z) = \theta(x, z)^{-1} \int_0^u \exp\{\alpha(v, x, z) + \varepsilon(v)\}\, dv$, where $\theta(x, z) = \int_0^1 \exp\{\alpha(v, x, z) + \varepsilon(v)\}\, dv$.
Applying the conditional quantile function to $\{U_{t,1}, \ldots, U_{t,n_t}\} \sim U(0, 1)$, which are independent of $X_t$ and $Z_t$, we obtain the random samples $Y_t = \{Y_{t,j} = Q(U_{t,j} \mid X_t, Z_t) : 1 \le j \le n_t\}$ for each $1 \le t \le T$, so that $Y_{t,1}, \ldots, Y_{t,n_t} \sim d_t$, where $d_t$ is the random response density. Denoting $f_t(u) = \Psi(d_t)(u)$, we obtain the transformed density expressed in model (2). Without loss of generality, we assume that $n_t = n$ independent and identically distributed observations are available for each response distribution.
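A sketch of this inverse-CDF sampling step, given the additive surface and an error curve on a common grid, is as follows (the function name is ours):

```python
import numpy as np

def sample_from_density(alpha_curve, eps_curve, u_grid, n, rng):
    """Draw Y_{t,1}, ..., Y_{t,n} ~ d_t by the inverse-CDF construction above:
    Q(u | x, z) = theta^{-1} \int_0^u exp{alpha(v, x, z) + eps(v)} dv."""
    g = np.exp(alpha_curve + eps_curve)      # integrand on u_grid
    Q = np.concatenate(([0.0], np.cumsum(
        0.5 * np.diff(u_grid) * (g[1:] + g[:-1]))))
    Q /= Q[-1]                               # division by theta(x, z)
    return np.interp(rng.uniform(size=n), u_grid, Q)
```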
With $T = 100$ and $n = 100$, the estimation is conducted over 200 Monte Carlo runs. Figure 2 displays the true curves of the FAR error process $\varepsilon(u)$ in panel (a) and the corresponding spline-based estimates in panel (b). Figure 3 presents a comprehensive view of the true surfaces of $g_m(u, x_m)$ alongside the average estimates obtained from the 200 Monte Carlo simulations. Specifically, the left panel displays the true surfaces, the middle panel shows the initial spline estimates derived without accounting for the $FAR(1)$ error process, and the right panel provides the improved estimates achieved through the same method after removing the estimated error process. To better illustrate the performance, the bivariate function estimates are presented from two different perspectives, offering a clearer understanding of the improvements made through the error correction process.
Figure 2 indicates that the estimation of ε ( u ) yields highly accurate results, which is further corroborated by the findings presented in Figure 3. Specifically, the right panel of Figure 3 illustrates that the improved estimation, after accounting for the FAR error process, significantly outperforms the initial estimation shown in the middle panel. This improvement highlights the effectiveness of the proposed methodology in refining the bivariate function estimates by appropriately addressing the error structure.
For further comparison, we conducted simulations with sample sizes $T = 50, 100$ and $n = 50, 100$ observations. We use the root mean squared error (RMSE) to measure the accuracy of the estimates, including the initial and improved estimates of $g_m(u, x_m)$. The RMSE is defined as
$$\mathrm{RMSE}(\dot g_m) = \frac{1}{T} \sum_{t=1}^{T} \left\{ \frac{1}{n} \sum_{i=1}^{n} \left\| \dot g_m(u_i, x_{t,m}) - g_m(u_i, x_{t,m}) \right\|_2^2 \right\}^{1/2},$$
where $\dot g_m$ is $\tilde g_m$ or $\hat g_m$.
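On a common evaluation grid, this criterion reduces to the short helper below (a sketch; the array names are ours):

```python
import numpy as np

def rmse(g_est, g_true):
    """RMSE above: g_est and g_true hold g(u_i, x_{t,m}) as (T, n) arrays."""
    return np.mean(np.sqrt(np.mean((g_est - g_true) ** 2, axis=1)))
```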
Based on the results from 200 Monte Carlo simulations, Table 1 presents the average root mean square errors (RMSEs) along with their standard deviations for both the initial and improved estimations of g m ( u , x m ) . The findings reveal that the RMSEs of the bivariate functions decrease as both the sample size T and the number of observations n increase. Notably, the RMSEs associated with the improved estimation are consistently smaller than those for the initial estimation. This is to be expected, as the initial estimates were derived without accounting for the error process, which, as demonstrated, has a significant impact on the accuracy of the results. By incorporating the error process and removing its estimated effects in the improved estimation, the model yields more accurate and refined results.
To provide a clearer explanation of the theoretical differences between the two estimates, we calculate and compare their biases and standard deviations, using the first simulation example as an illustration.
Table 2 presents the average bias and standard deviation (SD) of both the initial and improved estimations of g m ( u , x m ) . The results clearly indicate that, across all settings of the model sample size, the standard deviation of the improved estimates of the bivariate additive functions is substantially smaller than that of the initial estimates, with the bias also being correspondingly reduced. Moreover, as the sample size increases, both the standard deviation and bias of the estimates decrease, further reinforcing the reliability of the improved method. This finding numerically substantiates the claim that the improved estimation method results in a smaller asymptotic variance-covariance matrix compared to the initial estimation, thereby enhancing the precision and robustness of the estimates.

5.2. Case 2

Case 2 is conducted to demonstrate the efficiency of identifying the auto-regressive order of the functional error process. The densities are also generated from model (12), but with a $FAR(2)$ error function, where $\gamma_2(s, u) = \frac{1}{4} u s^2$. All other settings are the same as for Case 1.
Table 3 presents the empirical power of the testing algorithm used to determine the order of the FAR error process under various settings and significance levels. The results clearly demonstrate that the power of the test increases as the sample size T and the number of observations n grow. Specifically, the power approaches 1 as both T and n increase to 100 when testing the null hypothesis of an independent and identically distributed (i.i.d.) sample. This suggests that the test becomes increasingly reliable with larger sample sizes. However, the power is slightly lower when testing the null hypothesis of order 1 against order 2, which is expected given the complexity of distinguishing between these two orders. Furthermore, the size of the test remains low when testing order 2 against a higher-order process, confirming the accuracy and feasibility of the testing algorithm for determining the appropriate order of the functional error process. These findings validate the effectiveness of the proposed testing algorithm in practical applications, ensuring its robustness and precision in various settings.
Furthermore, to assess the efficiency of the auto-regressive order p on the overall estimation results, Table 4 presents the average RMSEs for the bivariate varying-coefficient functions. The observed pattern closely mirrors the results from Case 1, further reinforcing the effectiveness and reliability of the proposed model’s identification and estimation procedures. These findings provide strong empirical evidence that the auto-regressive order plays a crucial role in enhancing the accuracy of the estimation process. The consistency of the RMSE results across different settings underscores the robustness of the model, confirming its ability to effectively account for the error structure and yield precise estimates of the bivariate varying-coefficient functions.

6. Real Data Analysis

In this section, we demonstrate the feasibility and efficiency of the proposed estimation procedure through the analysis of two real-world datasets. By applying the developed methodology to empirical data, we aim to showcase the practical utility of the model in capturing the underlying patterns and dependencies inherent in the datasets. The analysis serves not only to validate the effectiveness of the proposed estimation approach but also to highlight its applicability across different domains, offering insights into the model’s versatility and robustness in handling complex, time-dependent, and non-Euclidean data structures.

6.1. COVID-19 Data

On March 11, 2020, the World Health Organization (WHO) declared COVID-19, an infectious disease caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), a global pandemic. The rapid and widespread transmission of the virus led to unprecedented global health challenges, with countries around the world instituting lockdowns and other measures to curb the spread of the disease. As of August 15, 2021, official statistics from the WHO reported a staggering 221,885,822 confirmed cases and 4,583,539 deaths across nearly all countries, reflecting the profound and far-reaching impact of the pandemic. Given the magnitude of this crisis, it is essential for international health organizations and research institutions to closely monitor the evolving global trends of COVID-19. Such monitoring facilitates timely and accurate analysis, enabling effective public health responses and the development of strategies for medical treatment, prevention, and control of future outbreaks. Understanding the dynamics of the epidemic through data-driven models is thus critical for informing policy decisions and improving global health outcomes in the face of such a devastating crisis.
To demonstrate this proposition, we select the mortality rate as an appropriate metric for measuring the global trend of the COVID-19 pandemic. The mortality rate is defined as the ratio of cumulative deaths per day to the total population of each country, serving as a key indicator of the disease’s lethality and spread. Notably, the data required to calculate the mortality rate are inherently temporally dependent, as they rely on the data from the previous day. This results in the mortality rate, and thus the global trend of the epidemic, exhibiting temporal auto-correlation.
The data on COVID-19-related deaths, which are critical for our analysis, can be accessed from the Johns Hopkins University repository. This repository includes a dynamic tracking map that provides a comprehensive view of the global trends related to the pandemic. The dataset, which is publicly available at https://www.jhu.edu/, covers the period from January 22, 2020, to April 15, 2021. Additionally, the most recent total population data for each country, necessary for calculating the mortality rate, can be obtained from the World Bank’s online platform, accessible at https://data.worldbank.org/indicator. These publicly available datasets serve as a rich resource for tracking the progression of the pandemic and for conducting rigorous statistical and epidemiological analyses aimed at understanding the disease dynamics across different regions.
Due to the varying outbreak times across different countries and regions, we define the origin of the time scale as the point at which the cumulative number of confirmed COVID-19 cases reached 100 in all countries. For this analysis, we focus on a dataset containing daily cumulative deaths from 189 countries, considering a 100-day period following this reference time. At each time point t, we estimate the density function of the mortality rate, denoted as d ^ t ( y ) , using the observations from the 189 countries. Figure 1 (a) presents the estimated densities of the global mortality rate (‰) over the 100-day interval, with data records from up to 189 countries for each time point. Figure 1 (b) offers an alternative perspective by displaying the estimated densities for three selected days. From the figures, it is evident that the mortality rate densities are well-defined across the observed period. Moreover, a temporal dependency among the distributions is clearly observable, which suggests the presence of an auto-regressive process in the data, potentially supporting the hypothesis of a FAR error structure.
The primary goal of this analysis, based on the COVID-19 data, is to identify the FAR process underlying the mortality rate and to estimate its component functions. For the sake of simplicity, we begin by considering a special case in which the covariate $z$ is constant ($= 1$) and $x$ represents the time scale $t/T$; namely, we consider the following model:
$$\hat f_t(u) = \Psi(\hat d_t)(u) = g_1(u, x_{t,1}) + \varepsilon_t(u), \quad 1 \le t \le 100,$$
where $\varepsilon_t(u) = \sum_{l=1}^{p} \int \gamma_l(u, s)\, \varepsilon_{t-l}(s)\, ds + e_t(u)$ and $x_{t,1}$ denotes the time scale $t/T$.
Based on the initial spline estimation of g 1 ( u , x 1 ) , the testing algorithm is employed to determine the order of the FAR process. Table 5 presents the test results, specifically the p-values under different hypotheses. The table reveals that the observed p-values indicate significant evidence of auto-correlation in the data. This suggests that the underlying process can indeed be effectively modeled as a first-order functional auto-regressive process, denoted as F A R ( 1 ) . Such findings provide strong empirical support for the presence of temporal dependencies in the COVID-19 mortality rate, further justifying the application of a FAR error structure in modeling the dynamics of the epidemic.
Figure 4 presents the heat map of the estimated bivariate function g 1 ( u , x 1 ) after accounting for the functional error process and identifying the auto-regressive order. The heat map reveals a relatively stable temporal pattern, with the function initially reaching a minimum value at lower values of u, gradually increasing, and eventually reaching a maximum at later time points. This pattern highlights the underlying dynamics of the COVID-19 mortality rate across successive days. The observed correlation between the data from consecutive days supports the notion that the global mortality rate exhibits significant temporal dependencies. This is consistent with the design of the mortality rate measure, which is derived from previous daily data, reflecting the evolving trend of the pandemic on a global scale. The results further validate the presence of auto-correlation in the mortality rate data and underscore the necessity of incorporating a functional auto-regressive process to capture these dependencies effectively.

6.2. USA Income Data

Personal income statistics play a crucial role in enabling governments to understand the dynamics between national income, spending, and saving, while also serving as an important tool for evaluating and comparing the economic well-being across different regions or nations. In this context, we focus on the density time series of per capita personal income, which is defined as the total personal income of an area divided by its population. This measure provides a more granular perspective on the economic conditions within a region, reflecting the distribution and trends in income on a per-person basis over time. By analyzing such time series, policymakers and researchers can gain insights into the long-term economic trajectory of a region, assess disparities in income distribution, and make informed decisions regarding fiscal policies, social welfare programs, and economic development strategies.
Income data for the USA are publicly available at the official website of the United States Bureau of Economic Analysis (http://www.bea.gov/). We consider the quarterly per capita personal income of the 50 states in the USA from the first quarter of 2010 to the fourth quarter of 2020, namely $t = 1, \ldots, 44$. For each $t$, we obtain the density function of per capita income, $\hat d_t(y)$, each based on 50 observations. As quarterly personal income is an economic measure based on national conditions, we choose two related covariates, `GDP' (quarterly gross domestic product of the USA) and `Population' (quarterly total population of the USA), which can also be obtained from http://www.bea.gov/.
The income curve, traditionally studied in economics as panel data, primarily reflects the relationship between consumers’ equilibrium points. As individuals’ income levels fluctuate, the connections between these equilibrium points form a trajectory, symbolizing not only an increase in income but also a corresponding rise in consumer satisfaction. This approach emphasizes the dynamic nature of income growth and its impact on individual well-being, providing valuable insights into consumer behavior over time.
In contrast, the income density curve, treated as functional data, is used to describe the distribution of income within a specific region or demographic group. It offers a graphical representation of the characteristics and trends in income across various income intervals, providing a more holistic view of the socio-economic landscape. By analyzing the income density curve, the degree of income inequality within a population can be effectively observed, revealing important patterns of wealth distribution. This type of curve is crucial for economic research as it facilitates a deeper understanding of consumption behavior, socio-economic conditions, and the formulation of societal policies.
Moreover, income density curves are instrumental in economic forecasting and analysis. By examining trends in income distribution, economists can combine insights into consumer preferences and consumption habits at different income levels. This integration enables predictions about future economic conditions and shifts in consumption patterns, making the income density curve a key tool for anticipating changes in both microeconomic and macroeconomic contexts. In this way, the income density curve plays an essential role in informing policy decisions, economic strategies, and the broader understanding of economic well-being.
Figure 5 (a) illustrates the density time series of quarterly personal income over a period of 44 quarters. The density curves reveal that, over the past decade, the overall distribution of per capita income across various states in the United States has exhibited a consistent pattern. Specifically, there are relatively few individuals in the high-income and middle-to-high-income brackets, a moderate number in the middle-income category, and a larger proportion in the middle-to-low-income segments.
To further elucidate the temporal evolution of income distribution, Figure 5 (b) presents the density curves for three distinct time points: the second quarter of 2015, the first quarter of 2017, and the third quarter of 2018. A clear trend emerges, showing a gradual shift towards higher income values over time, accompanied by a corresponding decrease in the peak of the density curve. This shift is unsurprising given the broader economic and technological advancements in modern society. As the economy progresses, the proportion of low-income individuals in the United States has been steadily declining, while the number of middle-to-high-income individuals has been rising. Consequently, the distribution of income is becoming more balanced, with a growing proportion of the population occupying the middle to high-income brackets. This pattern reflects the broader trends of economic development and income redistribution that have taken place in recent years.
We consider the following DVCA-FAR model:
$$\hat f_t(u) = \Psi(\hat d_t)(u) = g_0(u, x_{t,0}) + z_{t,1}\, g_1(u, x_{t,1}) + z_{t,2}\, g_2(u, x_{t,2}) + \varepsilon_t(u), \quad 1 \le t \le 44,$$
where $\varepsilon_t(u) = \sum_{l=1}^{p} \int \gamma_l(u, s)\, \varepsilon_{t-l}(s)\, ds + e_t(u)$; $z_{t,1}$ denotes the quarterly gross domestic product of the USA, $z_{t,2}$ denotes the quarterly total population of the USA, and $x_{t,0}, x_{t,1}, x_{t,2}$ denote the time scale $t/T$.
Similar to the previous estimation procedure, the testing algorithm is applied to determine the optimal order based on the initial spline estimates. Table 6 presents the p-values of the test under various hypotheses, which indicate that the $FAR(2)$ model is the most appropriate for modeling the error process in this context. This result suggests that a second-order functional autoregressive process effectively captures the autocorrelated structure of the error terms in the income data.
Using the three-step estimation procedure, estimates of the bivariate varying-coefficient functions can be obtained. Figure 6 displays the heat maps of the three bivariate functions, where $g_0(u, x_0)$ represents the common effect of the data over time, while $g_1(u, x_1)$ and $g_2(u, x_2)$ reflect the impact of quarterly GDP and quarterly total population, respectively. The heat map of $g_0$ reveals a clear pattern in which high and low values alternate over time, indicating that individuals with both higher and lower per capita income experience similar effects, in contrast to those in the middle-income bracket, who show an opposing trend. For $g_1$, the figure demonstrates a mode that is consistent for both small and large values of $u$ but varies over time: initially, the function reaches a maximum, then decreases to a minimum before increasing again towards the end of the time scale. In contrast, the effect of population, as shown in the heat map of $g_2$, exhibits a trend opposite to that of the common effect, and the population impact is relatively consistent across both higher and lower per capita income groups. Taken together, these results suggest a significant dependency of the quarterly personal income distributions in the United States on previous statistics, with the dynamics of income closely tied to both macroeconomic factors (such as GDP) and demographic factors (such as population).

7. Discussion

Data collected from sequential time points often exhibit autocorrelation, which must be addressed in the modeling process. Additionally, the analysis of non-Euclidean data has become increasingly common. To tackle these challenges, we propose a varying-coefficient additive model with density responses, incorporating a Functional Auto-Regressive (FAR) error process. Due to the complexity and constraints of the data, we first map density functions into a linear space using the transformation method by [12]. We then develop a three-step estimation procedure for the varying-coefficient components. First, B-spline series approximation is used to obtain initial estimates of the bivariate varying-coefficient components without considering the functional errors. Next, the FAR error process order is determined using the test statistic from [10] based on the residuals from the initial estimation. Finally, the FAR error process is removed, and improved spline estimators are constructed for the varying-coefficient components. Theoretical results, including convergence rates and asymptotic properties, are derived for both the initial and improved estimations. The performance of the proposed method is demonstrated through simulations and two real datasets, showing the importance of accounting for autocorrelation and validating the efficiency of the proposed approach.
Future research can explore a variety of related problems. In this study, we employed the varying-coefficient additive model to establish the relationship between density function responses and scalar predictors. With the advent of large and complex datasets, functional predictors have become increasingly valuable in analyzing practical data applications. Furthermore, to account for sequence dependence, we utilized the FAR process to model the correlation structures in the data. However, more intricate models representing autocorrelation could also be explored to further refine the representation of sequence dependence in functional errors. Future work will focus on extending these approaches and conducting additional studies in these areas.

Appendix A

In this appendix, we provide detailed proofs of the theoretical results.
Proof of Theorem 1 
As described in the article, the proposed varying-coefficient additive model with functional error process can be written as
$$f_t(u) = \sum_{m=1}^{k} z_{t,m}\, g_m(u, x_{t,m}) + \varepsilon_t(u), \quad 0 \le u \le 1.$$
Denote $\mathbf{b}(u, x_m) = (b_{1,1,m}(u, x_m), \ldots, b_{N_0,N_m,m}(u, x_m))^\tau$, $\lambda_m = (\lambda_{1,1,m}, \ldots, \lambda_{N_0,N_m,m})^\tau$, $\tilde\lambda_m = (\tilde\lambda_{1,1,m}, \ldots, \tilde\lambda_{N_0,N_m,m})^\tau$, $\lambda = (\lambda_1^\tau, \ldots, \lambda_k^\tau)^\tau$, and $\tilde\lambda = (\tilde\lambda_1^\tau, \ldots, \tilde\lambda_k^\tau)^\tau$, $m = 1, \ldots, k$, $t = 1, \ldots, T$.
By the spline approximation method, $g_m(u, x_m)$ can be written in matrix form as
$$g_m(u, x_m) \approx \sum_{r=1}^{N_0} \sum_{j=1}^{N_m} \lambda_{r,j,m}\, b_{r,j,m}(u, x_m) = \mathbf{b}^\tau(u, x_m)\, \lambda_m.$$
Ignoring the FAR error term, the initial estimator of the bivariate varying-coefficient functions $g_m(u, x_m)$ can be written as
$$\tilde g_m(u, x_m) = \sum_{r=1}^{N_0} \sum_{j=1}^{N_m} \tilde\lambda_{r,j,m}\, b_{r,j,m}(u, x_m) = \mathbf{b}^\tau(u, x_m)\, \tilde\lambda_m, \quad 1 \le m \le k,$$
where $\tilde\lambda = (\tilde\lambda_{1,1,1}, \ldots, \tilde\lambda_{N_0,N_k,k})^\tau$ is the $(N_0 \sum_{m=1}^{k} N_m)$-dimensional vector defined by
$$\tilde\lambda = \arg\min_{\lambda} \sum_{t=1}^{T} \sum_{i=1}^{n} \left\{ f_t(u_i) - \sum_{m=1}^{k} z_{t,m} \sum_{r=1}^{N_0} \sum_{j=1}^{N_m} \lambda_{r,j,m}\, b_{r,j,m}(u_i, x_{t,m}) \right\}^2.$$
We first introduce some notation. Denote $\mathbf{B} = (\mathbf{B}_1, \ldots, \mathbf{B}_k)$, $\mathbf{B}_m = (\mathbf{Bz}_{1,m}^\tau, \ldots, \mathbf{Bz}_{T,m}^\tau)^\tau$, $\mathbf{Bz}_{t,m} = (\mathbf{b}_z(u_1, x_{t,m}), \ldots, \mathbf{b}_z(u_n, x_{t,m}))^\tau$, $\mathbf{b}_z(u, x_{t,m}) = z_{t,m}\, \mathbf{b}(u, x_{t,m})$, $\mathbf{f}_t = (f_t(u_1), \ldots, f_t(u_n))^\tau$, $\mathbf{f} = (\mathbf{f}_1^\tau, \ldots, \mathbf{f}_T^\tau)^\tau$, $\boldsymbol{\varepsilon}_t = (\varepsilon_t(u_1), \ldots, \varepsilon_t(u_n))^\tau$, and $\boldsymbol{\varepsilon} = (\boldsymbol{\varepsilon}_1^\tau, \ldots, \boldsymbol{\varepsilon}_T^\tau)^\tau$.
Let $\hat f_t(u) = \Psi(\hat d_t)(u)$ be the estimate of $f_t(u)$ based on the observations $\{Y_t\}$. Denote $\hat{\mathbf{f}} = (\hat{\mathbf{f}}_1^\tau, \ldots, \hat{\mathbf{f}}_T^\tau)^\tau$, where $\hat{\mathbf{f}}_t = (\hat f_t(u_1), \ldots, \hat f_t(u_n))^\tau$. We then have $\hat{\mathbf{f}} = \mathbf{f} + \boldsymbol{\varepsilon}_f$, where $\boldsymbol{\varepsilon}_f = (\boldsymbol{\varepsilon}_{f1}^\tau, \ldots, \boldsymbol{\varepsilon}_{fT}^\tau)^\tau$ and $\boldsymbol{\varepsilon}_{ft} = \Psi(\hat d_t) - \Psi(d_t)$ is the error generated from the LQD transformation of the density estimate $\hat d_t$. For each $d_t$, we assume that the error from the transformation and kernel smoothing is independent and identically distributed. The estimate of $\lambda$ is given by $\tilde\lambda = (\mathbf{B}^\tau \mathbf{B})^{-1} \mathbf{B}^\tau \hat{\mathbf{f}}$.
For simplicity, denote $g_z(u, x_t) = \sum_{m=1}^{k} z_{t,m}\, g_m(u, x_{t,m})$, $\mathbf{g}_t = (g_z(u_1, x_t), \ldots, g_z(u_n, x_t))^\tau$, and $\mathbf{g} = (\mathbf{g}_1^\tau, \ldots, \mathbf{g}_T^\tau)^\tau$. Let $A_m = (0, \ldots, I, \ldots, 0)$ be an $N_0 N_m \times N_0 \sum_{i=1}^{k} N_i$ block matrix, with each block an $N_0 N_m \times N_0 N_i$ matrix, $i = 1, \ldots, k$, and the $m$-th block an identity matrix.
To prove the consistency of $\tilde g_m(u, x_m)$, we first decompose $\tilde g_m(u, x_m) - g_m(u, x_m)$ into three parts, as follows:
$$\begin{aligned}
\tilde g_m(u, x_m) - g_m(u, x_m) &= \mathbf{b}^\tau(u, x_m) A_m (\mathbf{B}^\tau \mathbf{B})^{-1} \mathbf{B}^\tau \hat{\mathbf{f}} - g_m(u, x_m) \\
&= \mathbf{b}^\tau(u, x_m) A_m (\mathbf{B}^\tau \mathbf{B})^{-1} \mathbf{B}^\tau \mathbf{f} + \mathbf{b}^\tau(u, x_m) A_m (\mathbf{B}^\tau \mathbf{B})^{-1} \mathbf{B}^\tau \boldsymbol{\varepsilon}_f - g_m(u, x_m) \\
&= \left[ \mathbf{b}^\tau(u, x_m) A_m (\mathbf{B}^\tau \mathbf{B})^{-1} \mathbf{B}^\tau \mathbf{g} - g_m(u, x_m) \right] + \mathbf{b}^\tau(u, x_m) A_m (\mathbf{B}^\tau \mathbf{B})^{-1} \mathbf{B}^\tau \boldsymbol{\varepsilon} + \mathbf{b}^\tau(u, x_m) A_m (\mathbf{B}^\tau \mathbf{B})^{-1} \mathbf{B}^\tau \boldsymbol{\varepsilon}_f \\
&=: g_B(u, x_m) + g_V(u, x_m) + g_e(u, x_m),
\end{aligned}$$
where
$$g_B(u, x_m) = \mathbf{b}^\tau(u, x_m) A_m (\mathbf{B}^\tau \mathbf{B})^{-1} \mathbf{B}^\tau \mathbf{g} - g_m(u, x_m), \qquad g_V(u, x_m) = \mathbf{b}^\tau(u, x_m) A_m (\mathbf{B}^\tau \mathbf{B})^{-1} \mathbf{B}^\tau \boldsymbol{\varepsilon}, \qquad g_e(u, x_m) = \mathbf{b}^\tau(u, x_m) A_m (\mathbf{B}^\tau \mathbf{B})^{-1} \mathbf{B}^\tau \boldsymbol{\varepsilon}_f.$$
For $g_B(u, x_m)$, we have
$$\begin{aligned}
g_B(u, x_m) &= \mathbf{b}^\tau(u, x_m) A_m (\mathbf{B}^\tau \mathbf{B})^{-1} \mathbf{B}^\tau \mathbf{g} - g_m(u, x_m) \\
&= \mathbf{b}^\tau(u, x_m) A_m (\mathbf{B}^\tau \mathbf{B})^{-1} \mathbf{B}^\tau (\mathbf{g} - \mathbf{B}\lambda) + \mathbf{b}^\tau(u, x_m) \lambda_m - g_m(u, x_m) \\
&= \mathbf{b}^\tau(u, x_m) A_m \left( \frac{1}{nT} \mathbf{B}^\tau \mathbf{B} \right)^{-1} \frac{\mathbf{B}^\tau}{nT} (\mathbf{g} - \mathbf{B}\lambda) + \mathbf{b}^\tau(u, x_m) \lambda_m - g_m(u, x_m).
\end{aligned}$$
It follows from [16] that the approximation order of the traditional bivariate spline estimator is $O_p(N_m^{-2})$, i.e., there exists a constant $C_0$ such that
$$\sup_{u, x_m} |g_m(u, x_m) - \mathbf{b}(u, x_m)^\tau \lambda_m| \le C_0 N_m^{-2}.$$
For simplicity, we assume that there exists a constant $N_1$ such that $N_0 = N_m = N_1$, $1 \le m \le k$. As a result, we obtain $\sup_{u,x} \frac{1}{nT} |\mathbf{g} - \mathbf{B}\lambda| \le C_1 N_1^{-2}$ for some constant $C_1$. Combined with the result that $\|(\frac{1}{nT} \mathbf{B}^\tau \mathbf{B})^{-1}\| = O_p(N_1^2)$, which can be derived from DeVore & Lorentz (1993), we obtain
$$\sup_{u, x_m \in [0,1]} |g_B(u, x_m)| = O_p(N_1^{-2}).$$
For $g_V(u, x_m) = \mathbf{b}^\tau(u, x_m) A_m (\mathbf{B}^\tau \mathbf{B})^{-1} \mathbf{B}^\tau \boldsymbol{\varepsilon}$, note that $\boldsymbol{\varepsilon}$ is a functional auto-regressive process, defined as $\varepsilon_t(u) = \sum_{l=1}^{p} \int \gamma_l(s, u)\, \varepsilon_{t-l}(s)\, ds + e_t(u)$. We use the assumption that $\{e_t\}_{t=1}^{T}$ is independent of $x_t, z_t$ and satisfies $E(e_t(u) \mid x_t, z_t) = 0$, and that the largest eigenvalue $\lambda_{\max}$ of the covariance function $\Sigma_\varepsilon(u)$ is finite. Therefore,
$$E(g_V(u, x_m)) = E[E(g_V(u, x_m) \mid x, z)] = E\left[ E\left( \mathbf{b}^\tau(u, x_m) A_m (\mathbf{B}^\tau \mathbf{B})^{-1} \mathbf{B}^\tau \boldsymbol{\varepsilon} \mid x, z \right) \right] = E\left[ \mathbf{b}^\tau(u, x_m) A_m (\mathbf{B}^\tau \mathbf{B})^{-1} \mathbf{B}^\tau E(\boldsymbol{\varepsilon} \mid x, z) \right] = 0,$$
and
$$\begin{aligned}
E(g_V(u, x_m))^2 &= E\left[ E\left( g_V(u, x_m)^2 \mid x, z \right) \right] \\
&= E\left[ E\left( \left( \mathbf{b}^\tau(u, x_m) A_m (\mathbf{B}^\tau \mathbf{B})^{-1} \mathbf{B}^\tau \boldsymbol{\varepsilon} \right)^2 \mid x, z \right) \right] \\
&= E\left[ E\left( \mathbf{b}^\tau(u, x_m) A_m (\mathbf{B}^\tau \mathbf{B})^{-1} \mathbf{B}^\tau \boldsymbol{\varepsilon} \boldsymbol{\varepsilon}^\tau \mathbf{B} (\mathbf{B}^\tau \mathbf{B})^{-1} A_m^\tau \mathbf{b}(u, x_m) \mid x, z \right) \right] \\
&= E\left[ \mathbf{b}^\tau(u, x_m) A_m (\mathbf{B}^\tau \mathbf{B})^{-1} \mathbf{B}^\tau E(\boldsymbol{\varepsilon} \boldsymbol{\varepsilon}^\tau \mid x, z) \mathbf{B} (\mathbf{B}^\tau \mathbf{B})^{-1} A_m^\tau \mathbf{b}(u, x_m) \right] \\
&= \mathbf{b}^\tau(u, x_m) A_m (\mathbf{B}^\tau \mathbf{B})^{-1} \mathbf{B}^\tau \Sigma_\varepsilon \mathbf{B} (\mathbf{B}^\tau \mathbf{B})^{-1} A_m^\tau \mathbf{b}(u, x_m) \\
&\le \frac{C \lambda_{\max}}{nT}\, \mathbf{b}^\tau(u, x_m) A_m \left( \frac{1}{nT} \mathbf{B}^\tau \mathbf{B} \right)^{-1} A_m^\tau \mathbf{b}(u, x_m) \\
&\le \frac{C \lambda_{\max}}{nT}\, \|\mathbf{b}(u, x_m)\|^2 \left\| \left( \frac{1}{nT} \mathbf{B}^\tau \mathbf{B} \right)^{-1} \right\|.
\end{aligned}$$
Subsequently, combining the above results, we have
$$\sup_{u, x_m \in [0,1]} |g_V(u, x_m)| = O_p\left( \sqrt{N_1/(nT)} \right) = O_p\left( N_1^{1/2} / \sqrt{nT} \right).$$
For $g_e(u, x_m) = \mathbf{b}^\tau(u, x_m) A_m (\mathbf{B}^\tau \mathbf{B})^{-1} \mathbf{B}^\tau \boldsymbol{\varepsilon}_f$, note that $f_t(u) = \Psi(d_t)(u)$ and the estimate obtained from the observations is $\hat f_t(u) = \Psi(\hat d_t)(u)$. From Petersen & Müller (2016), it can be obtained that $\sup_{d_t \in \mathcal{F}} |\hat d_t - d_t| = O_p(h + (nh)^{-1/2})$. Since $\boldsymbol{\varepsilon}_{ft} = \hat f_t - f_t$ is the error from the transformation of $\hat d_t(\cdot)$ and $d_t(\cdot)$, the consistency of the LQD transformation implies that $\sup_{d_t} |\Psi(\hat d_t) - \Psi(d_t)| = O_p(h + (nh)^{-1/2})$. Then, under the smoothness assumption, we obtain
$$\sup_{u, x_m \in [0,1]} |g_e(u, x_m)| = O_p\left( h + (nh)^{-1/2} \right).$$
Therefore, as $n, T \to \infty$, with $h \asymp n^{-1/3}$ and $N_0, N_m \asymp (nT)^{1/6} \log nT$ (so that $N_1 \asymp (nT)^{1/6} \log nT$), it is easy to see that
$$\begin{aligned}
\sup_{u, x_m \in [0,1]} |\tilde g_m(u, x_m) - g_m(u, x_m)| &\le \sup_{u, x_m \in [0,1]} |g_B(u, x_m)| + \sup_{u, x_m \in [0,1]} |g_V(u, x_m)| + \sup_{u, x_m \in [0,1]} |g_e(u, x_m)| \\
&= O_p(N_1^{-2}) + O_p\left( \sqrt{N_1/(nT)} \right) + O_p\left( h + (nh)^{-1/2} \right) \\
&= O_p\left( (nT)^{-1/3} \log(nT) + n^{-1/3} \right).
\end{aligned}$$
The proof of the theorem is completed. □
Proof of Theorem 2
The improved spline approximation of the error process $\varepsilon_t(u)$ is given by
$$\hat\varepsilon_t(u) = \sum_{l=1}^{p} \sum_{r=1}^{N} \sum_{j=1}^{N} \hat\mu_{r,j,l} \int b_{r,j}(u, s)\, \hat\varepsilon_{t-l}(s)\, ds, \quad 0 \le u \le 1,$$
where $\hat\mu = (\hat\mu_{1,1,1}, \ldots, \hat\mu_{N,N,p})^\tau$ is the $pN^2$-dimensional vector minimizing the quadratic loss, i.e.,
$$\hat\mu = \arg\min_{\mu} \sum_{t=p+1}^{T} \sum_{i=1}^{n} \left\{ \tilde\varepsilon_t(u_i) - \sum_{l=1}^{p} \sum_{r=1}^{N} \sum_{j=1}^{N} \mu_{r,j,l} \int b_{r,j}(u_i, s)\, \tilde\varepsilon_{t-l}(s)\, ds \right\}^2.$$
Let $f_t^c(u) = f_t(u) - \varepsilon_t(u)$ and $\hat f_t^c(u) = f_t(u) - \hat\varepsilon_t(u)$. Applying the spline algorithm again, the improved estimate of $g_m(u, x_m)$ is given by
$$\hat g_m(u, x_m) = \sum_{r=1}^{N_0} \sum_{j=1}^{N_m} \hat\lambda_{r,j,m}\, b_{r,j,m}(u, x_m), \quad 1 \le m \le k,$$
where $\hat\lambda = (\hat\lambda_{1,1,1}, \ldots, \hat\lambda_{N_0,N_k,k})^\tau$ is the $(N_0 \sum_{m=1}^{k} N_m)$-dimensional vector satisfying
$$\hat\lambda = \arg\min_{\lambda} \sum_{t=1}^{T} \sum_{i=1}^{n} \left\{ \hat f_t^c(u_i) - \sum_{m=1}^{k} z_{t,m} \sum_{r=1}^{N_0} \sum_{j=1}^{N_m} \lambda_{r,j,m}\, b_{r,j,m}(u_i, x_{t,m}) \right\}^2.$$
Denote $\mathbf{f}_t^c = (f_t^c(u_1), \ldots, f_t^c(u_n))^\tau$ and $\mathbf{f}^c = (\mathbf{f}_1^{c\tau}, \ldots, \mathbf{f}_T^{c\tau})^\tau$; $\hat{\mathbf{f}}_t^c = (\hat f_t^c(u_1), \ldots, \hat f_t^c(u_n))^\tau$ and $\hat{\mathbf{f}}^c = (\hat{\mathbf{f}}_1^{c\tau}, \ldots, \hat{\mathbf{f}}_T^{c\tau})^\tau$; and $\mathbf{e}_t = (e_t(u_1), \ldots, e_t(u_n))^\tau$ and $\mathbf{e} = (\mathbf{e}_1^\tau, \ldots, \mathbf{e}_T^\tau)^\tau$.
Based on the estimate of $\boldsymbol{\varepsilon}$, the estimate of $\lambda$ is given by $\hat\lambda = (\mathbf{B}^\tau \mathbf{B})^{-1} \mathbf{B}^\tau \hat{\mathbf{f}}^c$, where $\hat{\mathbf{f}}^c = \mathbf{f} - \hat{\boldsymbol{\varepsilon}}$. Since random error exists in the density estimation process, namely $\hat{\mathbf{f}} = \mathbf{f} + \boldsymbol{\varepsilon}_f$, where $\boldsymbol{\varepsilon}_f$ is defined as before, we denote $\tilde{\hat{\mathbf{f}}}^c = \hat{\mathbf{f}} - \hat{\boldsymbol{\varepsilon}}$; the estimate of $\lambda$ based on the observations is then $\hat\lambda = (\mathbf{B}^\tau \mathbf{B})^{-1} \mathbf{B}^\tau \tilde{\hat{\mathbf{f}}}^c$.
As in the proof of Theorem 1, we decompose $\hat g_m(u, x_m) - g_m(u, x_m)$ as
$$
\begin{aligned}
\hat g_m(u, x_m) - g_m(u, x_m) &= b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau \tilde{\hat f}^c - g_m(u, x_m) \\
&= b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau (f - \hat\varepsilon) + b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau \varepsilon_f - g_m(u, x_m) \\
&= \big[b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau g - g_m(u, x_m)\big] + b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau (\varepsilon - \hat\varepsilon) \\
&\quad + b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau \varepsilon_f \\
&= g_B(u, x_m) + g_V(u, x_m) + g_e(u, x_m),
\end{aligned}
$$
where
$$
\begin{aligned}
g_B(u, x_m) &= b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau g - g_m(u, x_m), \\
g_V(u, x_m) &= b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau (\varepsilon - \hat\varepsilon), \\
g_e(u, x_m) &= b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau \varepsilon_f.
\end{aligned}
$$
For $g_B(u, x_m)$,
$$
\begin{aligned}
g_B(u, x_m) &= b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau g - g_m(u, x_m) \\
&= b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau [g - B\lambda] + b^\tau(u, x_m)\lambda_m - g_m(u, x_m) \\
&= b^\tau(u, x_m) A_m \Big(\frac{1}{nT} B^\tau B\Big)^{-1} \frac{1}{nT} B^\tau (g - B\lambda) + b^\tau(u, x_m)\lambda_m - g_m(u, x_m).
\end{aligned}
$$
Similar to the discussion in the proof of Theorem 1, we have
$$\sup_{u, x_m \in [0,1]} |g_B(u, x_m)| = O_p(N_1^{-2}).$$
For $g_V(u, x_m) = b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau (\varepsilon - \hat\varepsilon)$, the functional error process can be written as
$$\varepsilon_t(u) = \sum_{l=1}^{p} \int \gamma_l(u, s)\, \varepsilon_{t-l}(s)\, ds + e_t(u) \approx \sum_{l=1}^{p} \sum_{r=1}^{N} \sum_{j=1}^{N} \mu_{r,j,l} \int b_{r,j}(u, s)\, \varepsilon_{t-l}(s)\, ds + e_t(u),$$
with the corresponding approximation
$$\hat\varepsilon_t(u) = \sum_{l=1}^{p} \sum_{r=1}^{N} \sum_{j=1}^{N} \hat\mu_{r,j,l} \int b_{r,j}(u, s)\, \tilde\varepsilon_{t-l}(s)\, ds, \quad 0 \le u \le 1.$$
Denote $b_t(u) = \big(\int b_{1,1}(u, s)\,\tilde\varepsilon_{t-1}(s)\,ds, \ldots, \int b_{N,N}(u, s)\,\tilde\varepsilon_{t-p}(s)\,ds\big)^\tau$, $b_t = (b_t(u_1), \ldots, b_t(u_n))^\tau$, and $B_\varepsilon = (b_{p+1}^\tau, \ldots, b_T^\tau)^\tau$; likewise $\varepsilon_t = (\varepsilon_t(u_1), \ldots, \varepsilon_t(u_n))^\tau$ and $\varepsilon = (\varepsilon_{p+1}^\tau, \ldots, \varepsilon_T^\tau)^\tau$.
The model can then be rewritten in matrix form as $\varepsilon \approx B_\varepsilon \mu + e$. Based on the initial estimates, the model becomes $\tilde\varepsilon \approx B_\varepsilon \mu + \tilde e$, where $\tilde e$ is stacked in the same way as $\varepsilon$, and the estimate of $\mu$ is given by $\hat\mu = (B_\varepsilon^\tau B_\varepsilon)^{-1} B_\varepsilon^\tau \tilde\varepsilon$.
Then,
$$
\begin{aligned}
\hat\varepsilon_t(u) - \varepsilon_t(u) &= \sum_{l=1}^{p} \sum_{r=1}^{N} \sum_{j=1}^{N} \hat\mu_{r,j,l} \int b_{r,j}(u, s)\, \tilde\varepsilon_{t-l}(s)\, ds - \varepsilon_t(u) \\
&= b_t^\tau(u) (B_\varepsilon^\tau B_\varepsilon)^{-1} B_\varepsilon^\tau \tilde\varepsilon - \varepsilon_t(u) \\
&= b_t^\tau(u) (B_\varepsilon^\tau B_\varepsilon)^{-1} B_\varepsilon^\tau (\tilde\varepsilon - B_\varepsilon \mu) + \big(b_t^\tau(u)\mu - \varepsilon_t(u)\big).
\end{aligned}
$$
There exists a constant $C_t$ such that $\sup_u |\varepsilon_t(u) - b_t^\tau(u)\mu| \le C_t N^{-2}$, and similarly $\sup_u |\tilde\varepsilon - B_\varepsilon \mu| \le C_p N^{-2}$ for another constant $C_p$. Hence, as the sample size tends to infinity, $\sup_u |\hat\varepsilon - \varepsilon| = O_p(N^{-2})$. Therefore,
$$\sup_{u, x_m \in [0,1]} |g_V(u, x_m)| = \sup_{u, x_m \in [0,1]} \big| b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau (\varepsilon - \hat\varepsilon) \big| = O_p(N^{-2}).$$
For $g_e(u, x_m) = b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau \varepsilon_f$, under the smoothness condition on $f_t$ and combining with the proof of Theorem 1, we obtain
$$\sup_{u, x_m \in [0,1]} |g_e(u, x_m)| = O_p\big(h + (nh)^{-1/2}\big).$$
Therefore, as $n, T \to \infty$, with $h \asymp n^{-1/3}$ and $N_0, N_m \asymp (nT)^{1/6} \log(nT)$, namely $N_1 \asymp (nT)^{1/6} \log(nT)$ and $N \asymp (nT)^{1/6} \log(nT)$, it follows that
$$
\begin{aligned}
\sup_{u, x_m \in [0,1]} |\hat g_m(u, x_m) - g_m(u, x_m)| &\le \sup_{u, x_m \in [0,1]} |g_B(u, x_m)| + \sup_{u, x_m \in [0,1]} |g_V(u, x_m)| + \sup_{u, x_m \in [0,1]} |g_e(u, x_m)| \\
&= O_p(N_1^{-2}) + O_p(N^{-2}) + O_p\big(h + (nh)^{-1/2}\big) \\
&= O_p\big((nT)^{-1/3} (\log nT)^{-2} + n^{-1/3}\big).
\end{aligned}
$$
The proof of the theorem is completed. □
Proof of Theorem 3
(i) We first prove the asymptotic normality of the initial estimator $\tilde g_m(u, x_m)$. With (1)–(4), we rewrite $\sqrt{nT}\,\big(\tilde g_m(u, x_m) - g_m(u, x_m)\big)$ as
$$
\begin{aligned}
\sqrt{nT}\,\big(\tilde g_m(u, x_m) - g_m(u, x_m)\big) &= \sqrt{nT}\,\big[b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau g - g_m(u, x_m)\big] \\
&\quad + \sqrt{nT}\,\big[b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau \varepsilon + b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau \varepsilon_f\big].
\end{aligned}
$$
Since the error process $\varepsilon_t$ is independent of the covariates $(x_t, z_t)$, we have $E(\varepsilon_t \mid x_t, z_t) = 0$. Combined with the result of Theorem 1, it follows that
$$
\begin{aligned}
E\big[\sqrt{nT}\,\big(\tilde g_m(u, x_m) - g_m(u, x_m)\big)\big] &= \sqrt{nT}\, E\big[b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau (g - B\lambda)\big] + \sqrt{nT}\, E\big[b^\tau(u, x_m)\lambda_m - g_m(u, x_m)\big] \\
&\quad + \sqrt{nT}\, E\big[b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau E(\varepsilon \mid x, z)\big] \\
&\quad + \sqrt{nT}\, E\big[b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau E(\varepsilon_f \mid x, z)\big] = 0.
\end{aligned}
$$
Meanwhile,
$$\mathrm{Var}\big[\sqrt{nT}\,\big(\tilde g_m(u, x_m) - g_m(u, x_m)\big)\big] = nT\, \mathrm{Var}\big[b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau \varepsilon + b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau \varepsilon_f\big].$$
Based on the results proved in Theorem 1, we obtain
$$
\begin{aligned}
nT\, \mathrm{Var}\big[b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau \varepsilon\big] &= nT\, E\Big[E\Big(\big(b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau \varepsilon\big)^2 \,\Big|\, x, z\Big)\Big] \\
&= nT\, E\Big[E\Big(b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau \varepsilon \varepsilon^\tau B (B^\tau B)^{-1} A_m^\tau b(u, x_m) \,\Big|\, x, z\Big)\Big].
\end{aligned}
$$
Denote $D_m = A_m (B^{*\tau} B^*)^{-1} B^{*\tau}$, where $B^* = B/\sqrt{nT}$; then
$$nT\, \mathrm{Var}\big[b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau \varepsilon\big] = E\big[b^\tau(u, x_m)\, E\big(D_m \varepsilon \varepsilon^\tau D_m^\tau \mid x, z\big)\, b(u, x_m)\big].$$
Denote $\Sigma_\varepsilon = E(\varepsilon \varepsilon^\tau \mid x, z)$. Since the error process is auto-correlated, this covariance matrix can be decomposed into two parts, $\Sigma_\varepsilon = \Sigma_1 + \Sigma_2$, where $\Sigma_1 = \mathrm{diag}(\Sigma_{t,t})_{1 \le t \le T}$ is block-diagonal with $t$-th diagonal block $\Sigma_{t,t} = \mathrm{Cov}(\varepsilon_t)$, and $\Sigma_2 = (\Sigma_{t,s})_{1 \le t \ne s \le T} = \big(\mathrm{Cov}(\varepsilon_t, \varepsilon_s)\big)_{1 \le t \ne s \le T}$ collects the off-diagonal blocks, representing the dependence between the autoregressive errors.
On the other hand,
$$
\begin{aligned}
nT\, \mathrm{Var}\big[b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau \varepsilon_f\big] &= nT\, E\Big[E\Big(b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau \varepsilon_f \varepsilon_f^\tau B (B^\tau B)^{-1} A_m^\tau b(u, x_m) \,\Big|\, x, z\Big)\Big] \\
&= nT\, \sigma_\epsilon^2\, E\Big[E\Big(b^\tau(u, x_m) A_m (B^\tau B)^{-1} A_m^\tau b(u, x_m) \,\Big|\, x, z\Big)\Big];
\end{aligned}
$$
under the smoothness assumption, as $n, T \to \infty$ the error generated by the density estimation step tends to zero, so the variance of this part vanishes.
Therefore, since the second-moment bounds established in Theorem 1 satisfy the conditions of the Lindeberg–Feller central limit theorem, under the assumption that $nT \to \infty$,
$$\sqrt{nT}\,\big(C_m \Sigma_\varepsilon C_m^\tau\big)^{-1/2}\,\big(\tilde g_m(u, x_m) - g_m(u, x_m)\big) \xrightarrow{D} N(0, 1),$$
where $C_m = b^\tau(u, x_m) E(D_m) = b^\tau(u, x_m) E\big(A_m (B^{*\tau} B^*)^{-1} B^{*\tau}\big)$ and $\Sigma_\varepsilon = (\Sigma_{t,s})_{1 \le t, s \le T}$ with $\Sigma_{t,s} = \mathrm{Cov}(\varepsilon_t, \varepsilon_s)$.
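In practice, this limit can be inverted to give pointwise inference. As a sketch, assuming a consistent plug-in estimate $\hat\Sigma_\varepsilon$ of $\Sigma_\varepsilon$ (for instance, built from the fitted residual curves) and negligible bias, an asymptotic $(1-\alpha)$ pointwise confidence interval for $g_m(u, x_m)$ is
$$\tilde g_m(u, x_m) \;\pm\; z_{1-\alpha/2}\, \sqrt{\frac{C_m \hat\Sigma_\varepsilon C_m^\tau}{nT}},$$
where $z_{1-\alpha/2}$ denotes the standard normal quantile.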
(ii) After obtaining the spline estimate $\hat\varepsilon$ of the error process, the improved estimator of the bivariate varying-coefficient functions $g_m(u, x_m)$ is obtained from the refined model $f_t^c(u) = f_t(u) - \varepsilon_t(u) = \sum_{m=1}^{k} z_{t,m}\, g_m(u, x_{t,m}) + e_t(u)$.
As in the proof of Theorem 2,
$$
\begin{aligned}
\sqrt{nT}\,\big(\hat g_m(u, x_m) - g_m(u, x_m)\big) &= \sqrt{nT}\,\big[b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau \tilde{\hat f}^c - g_m(u, x_m)\big] \\
&= \sqrt{nT}\,\big[b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau (f - \hat\varepsilon) + b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau \varepsilon_f - g_m(u, x_m)\big] \\
&= \sqrt{nT}\,\big[b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau g - g_m(u, x_m)\big] \\
&\quad + \sqrt{nT}\,\big[b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau (\varepsilon - \hat\varepsilon) + b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau \varepsilon_f\big].
\end{aligned}
$$
Since $\hat\varepsilon$ is an estimate of the error process $\varepsilon$, the convergence results in part (i) yield $E\big[\sqrt{nT}\,\big(\hat g_m(u, x_m) - g_m(u, x_m)\big)\big] = 0$.
Meanwhile,
$$
\begin{aligned}
\mathrm{Var}\big[\sqrt{nT}\,\big(\hat g_m(u, x_m) - g_m(u, x_m)\big)\big] &= nT\, \mathrm{Var}\big[b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau \big(g + (\varepsilon - \hat\varepsilon) + \varepsilon_f\big) - g_m(u, x_m)\big] \\
&= nT\, \mathrm{Var}\big[b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau (\varepsilon - \hat\varepsilon) + b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau \varepsilon_f\big].
\end{aligned}
$$
Based on the results proved in part (i) and the same notation, we obtain
$$
\begin{aligned}
nT\, \mathrm{Var}\big[b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau (\varepsilon - \hat\varepsilon)\big] &= nT\, E\Big[E\Big(\big(b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau (\varepsilon - \hat\varepsilon)\big)^2 \,\Big|\, x, z\Big)\Big] \\
&= nT\, E\Big[E\Big(b^\tau(u, x_m) A_m (B^\tau B)^{-1} B^\tau (\varepsilon - \hat\varepsilon)(\varepsilon - \hat\varepsilon)^\tau B (B^\tau B)^{-1} A_m^\tau b(u, x_m) \,\Big|\, x, z\Big)\Big] \\
&= E\big[b^\tau(u, x_m)\, D_m\, E\big((\varepsilon - \hat\varepsilon)(\varepsilon - \hat\varepsilon)^\tau \mid x, z\big)\, D_m^\tau\, b(u, x_m)\big].
\end{aligned}
$$
Denote $\Xi_\varepsilon = E\big((\varepsilon - \hat\varepsilon)(\varepsilon - \hat\varepsilon)^\tau \mid x, z\big)$; similarly, it can be decomposed into two parts, $\Xi_\varepsilon = \Xi_1 + \Xi_2$, where $\Xi_1 = \mathrm{diag}(\Xi_{t,t})_{1 \le t \le T}$ is block-diagonal with $t$-th diagonal block $\Xi_{t,t} = \mathrm{Cov}(\varepsilon_t - \hat\varepsilon_t)$, and $\Xi_2 = (\Xi_{t,s})_{1 \le t \ne s \le T} = \big(\mathrm{Cov}(\varepsilon_t - \hat\varepsilon_t,\, \varepsilon_s - \hat\varepsilon_s)\big)_{1 \le t \ne s \le T}$ collects the off-diagonal blocks.
Owing to the convergence of $\varepsilon - \hat\varepsilon$ established above, the covariance matrix reduces to $\Xi_\varepsilon = \Xi_1 = \mathrm{Cov}(e_t)$ as $n, T \to \infty$. Since $\mathrm{Cov}(e_t(u), e_t(s)) = \sigma_t^2(u, s)$, the covariance matrix can be written as $\Xi_\varepsilon = \mathrm{Cov}(e_t) = \mathrm{diag}(\Xi_{t,t})_{1 \le t \le T}$, where $\Xi_{t,t}(u, s) = \sigma_t^2(u, s)$.
Therefore,
$$\sqrt{nT}\,\big(C_m \Xi_\varepsilon C_m^\tau\big)^{-1/2}\,\big(\hat g_m(u, x_m) - g_m(u, x_m)\big) \xrightarrow{D} N(0, 1),$$
where $C_m = b^\tau(u, x_m) E\big(A_m (B^{*\tau} B^*)^{-1} B^{*\tau}\big)$ and $\Xi_\varepsilon = \mathrm{diag}(\Xi_{t,t})_{1 \le t \le T}$ with $\Xi_{t,t}(u, s) = \sigma_t^2(u, s)$.
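For completeness, a plug-in estimate of the limiting covariance surface $\sigma^2(u, s)$ can be pooled from the fitted FAR innovations. The sketch below assumes the innovations share a common covariance so that averaging over $t$ is sensible; `e_hat` (the $(T-p) \times n$ matrix of residual curves $\hat e_t = \tilde\varepsilon_t - \hat\varepsilon_t$ on the grid) and the name `innovation_cov` are illustrative, not the authors' construction.

```python
import numpy as np

def innovation_cov(e_hat):
    """Pooled estimate of sigma^2(u_i, u_j) = Cov(e_t(u_i), e_t(u_j))."""
    e_c = e_hat - e_hat.mean(axis=0, keepdims=True)   # centre each grid point over t
    return e_c.T @ e_c / e_hat.shape[0]               # n x n covariance surface on the grid
```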
The proof of the theorem is completed. □

Author Contributions

Conceptualization, Zixuan Han, Tao Li and Jinhong You; Data curation, Zixuan Han; Formal analysis, Zixuan Han; Funding acquisition, Tao Li and Jinhong You; Investigation, Zixuan Han; Methodology, Zixuan Han, Tao Li and Jinhong You; Project administration, Tao Li, Jinhong You and Narayanaswamy Balakrishnan; Resources, Zixuan Han; Software, Zixuan Han; Supervision, Tao Li, Jinhong You and Narayanaswamy Balakrishnan; Validation, Zixuan Han; Writing – original draft, Zixuan Han; Writing – review & editing, Zixuan Han, Tao Li, Jinhong You and Narayanaswamy Balakrishnan. All authors have read and agreed to the published version of the manuscript.

Funding

Tao Li's research is supported by the Humanities and Social Science Fund of the Ministry of Education of China (No. 21YJA910001). Jinhong You's research is supported by the National Natural Science Foundation of China (NSFC, No. 11971291).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original datasets employed in this study are publicly accessible from the official website of Johns Hopkins University at https://www.jhu.edu/ and the World Bank’s online platform at https://data.worldbank.org/.

Acknowledgments

Not applicable.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Berhoune, K.; Bensmain, N. Sieves estimator of functional autoregressive process. Statistics and Probability Letters 2018, 135, 60–69.
2. Bosq, D. Linear Processes in Function Spaces: Theory and Applications; Springer Science & Business Media: New York, 2000.
3. Chen, Y.; Chua, W.S.; Härdle, W. Forecasting limit order book liquidity supply–demand curves with functional autoregressive dynamics. Quantitative Finance 2019, 19(9), 1473–1489.
4. Chen, Y.; Li, B. An adaptive functional autoregressive forecast model to predict electricity price curves. Journal of Business and Economic Statistics 2017, 35(3), 371–388.
5. Chen, Y.; Lin, Z.; Müller, H. Wasserstein regression. Journal of the American Statistical Association 2023, 118(542), 869–882.
6. Kowal, D.R.; Matteson, D.S.; Ruppert, D. Functional autoregression for sparsely sampled data. Journal of Business and Economic Statistics 2019, 37(1), 97–109.
7. DeVore, R.; Lorentz, G. Constructive Approximation; Springer Science & Business Media, 1993; Volume 303.
8. Han, K.; Müller, H.; Park, B. Additive functional regression for densities as responses. Journal of the American Statistical Association 2020, 115(530), 997–1010.
9. Kokoszka, P.; Miao, H.; Petersen, A.; Shang, H.L. Forecasting of density functions with an application to cross-sectional and intraday returns. International Journal of Forecasting 2019, 35, 1304–1317.
10. Kokoszka, P.; Reimherr, M. Determining the order of the functional autoregressive model. Journal of Time Series Analysis 2013, 34, 116–129.
11. Petersen, A.; Chen, C.; Müller, H. Quantifying and visualizing intraregional connectivity in resting-state functional magnetic resonance imaging with correlation densities. Brain Connectivity 2019, 9(1), 37–47.
12. Petersen, A.; Müller, H. Functional data analysis for density functions by transformation to a Hilbert space. The Annals of Statistics 2016, 44(1), 183–218.
13. Petersen, A.; Müller, H. Fréchet regression for random objects with Euclidean predictors. The Annals of Statistics 2019, 47(2), 691–719.
14. Saha, A.; Banerjee, S.; Kurtek, S.; Narang, S.; Lee, J.; Rao, G.; Martinez, J.; Bharath, K.; Rao, A.; Baladandayuthapani, V. DEMARCATE: Density-based magnetic resonance image clustering for assessing tumor heterogeneity in cancer. NeuroImage: Clinical 2016, 12, 132–143.
15. Sen, R.; Ma, C. Forecasting density function: Application in finance. Journal of Mathematical Finance 2015, 5, 433–447.
16. Stone, C. The use of polynomial splines and their tensor products in multivariate function estimation. The Annals of Statistics 1994, 22(1), 118–171.
17. Talská, R.; Menafoglio, A.; Machalová, J.; Hron, K.; Fišerová, E. Compositional regression with functional response. Computational Statistics & Data Analysis 2018, 123(1), 66–85.
18. Xu, X.; Chen, Y.; Zhang, G.; Koch, T. Modeling functional time series and mixed-type predictors with partially functional autoregressions. Journal of Business and Economic Statistics 2022, 1–18.
19. Zhang, C.; Kokoszka, P.; Petersen, A. Wasserstein autoregressive models for density time series. Journal of Time Series Analysis 2022, 43, 30–52.
Figure 1. Densities of global mortality rate (‰) of COVID-19 over an interval of 100 days. (a): Three-dimensional trend of density time series during the whole period; (b) Density curves at three different selected days.
Figure 2. Average estimates of the FAR(1) error process $\varepsilon(u)$ obtained from 200 Monte Carlo runs with sample size $T = 100$ and $n = 100$ observations. (a): true curves; (b): spline estimates.
Figure 3. Average estimates of $g_m(u, x_m)$, $m = 1, 2$. Left panels: true densities; middle panels: initial estimates; right panels: improved estimates. The upper two panels display $g_1(u, x_1)$ from two angles; the lower panels display $g_2(u, x_2)$.
Figure 4. Heat map of the bivariate function $g_1(u, x_1)$ in the model based on the COVID-19 mortality rate (‰) data.
Figure 5. Densities of national quarterly personal income in the USA over an interval of 44 quarters. (a): Three-dimensional trend of the density time series during the whole period; (b): Density curves at three selected quarters.
Figure 6. Heat maps of the bivariate varying-coefficient functions $g_m(u, x_m)$, $m = 0, 1, 2$, in the USA income data model.
Table 1. Average RMSEs of both initial and improved estimates of $g_m(u, x_m)$.

| T | n | $g_1(u, x_1)$ Initial | $g_1(u, x_1)$ Improved | $g_2(u, x_2)$ Initial | $g_2(u, x_2)$ Improved |
|---|---|---|---|---|---|
| 50 | 50 | 0.2247 | 0.1848 | 0.2139 | 0.1785 |
| 50 | 100 | 0.1759 | 0.1325 | 0.1844 | 0.1521 |
| 100 | 50 | 0.1826 | 0.1471 | 0.1732 | 0.1354 |
| 100 | 100 | 0.1431 | 0.1164 | 0.1319 | 0.1057 |
Table 2. Average standard deviation (SD) and bias of both initial and improved estimates of $g_m(u, x_m)$.

| T | n | $g_1$ Init. SD | $g_1$ Init. Bias | $g_1$ Impr. SD | $g_1$ Impr. Bias | $g_2$ Init. SD | $g_2$ Init. Bias | $g_2$ Impr. SD | $g_2$ Impr. Bias |
|---|---|---|---|---|---|---|---|---|---|
| 50 | 50 | 0.205 | 0.147 | 0.168 | 0.104 | 0.219 | 0.137 | 0.183 | 0.117 |
| 50 | 100 | 0.179 | 0.122 | 0.142 | 0.093 | 0.196 | 0.128 | 0.164 | 0.095 |
| 100 | 50 | 0.174 | 0.136 | 0.151 | 0.082 | 0.187 | 0.131 | 0.158 | 0.086 |
| 100 | 100 | 0.133 | 0.099 | 0.112 | 0.057 | 0.153 | 0.111 | 0.129 | 0.061 |
Table 3. Empirical power of the testing algorithm for determining the order of the FAR error process under different significance levels.

| T | n | $p = 0$ vs $p \ge 1$ (0.05) | $p = 0$ vs $p \ge 1$ (0.1) | $p \le 1$ vs $p \ge 2$ (0.05) | $p \le 1$ vs $p \ge 2$ (0.1) | $p \le 2$ vs $p \ge 3$ (0.05) | $p \le 2$ vs $p \ge 3$ (0.1) |
|---|---|---|---|---|---|---|---|
| 50 | 50 | 0.893 | 0.962 | 0.787 | 0.846 | 0.082 | 0.134 |
| 50 | 100 | 0.931 | 0.985 | 0.824 | 0.893 | 0.073 | 0.125 |
| 100 | 50 | 0.942 | 0.972 | 0.821 | 0.881 | 0.071 | 0.121 |
| 100 | 100 | 0.985 | 1.000 | 0.889 | 0.935 | 0.064 | 0.113 |
Table 4. Average RMSEs of both initial and improved estimates of $g_m(u, x_m)$.

| T | n | $g_1(u, x_1)$ Initial | $g_1(u, x_1)$ Improved | $g_2(u, x_2)$ Initial | $g_2(u, x_2)$ Improved |
|---|---|---|---|---|---|
| 50 | 50 | 0.2739 | 0.2438 | 0.2691 | 0.2235 |
| 50 | 100 | 0.2264 | 0.1852 | 0.2157 | 0.1809 |
| 100 | 50 | 0.2136 | 0.1817 | 0.2232 | 0.1761 |
| 100 | 100 | 0.1729 | 0.1263 | 0.1816 | 0.1224 |
Table 5. P-values of the testing algorithm for identifying the order of the functional error process based on the COVID-19 mortality rate data.

| Null hypothesis | $p = 0$ | $p \le 1$ |
|---|---|---|
| Alternative hypothesis | $p \ge 1$ | $p \ge 2$ |
| P-value | 0.000 | 0.194 |
Table 6. P-values of the testing algorithm for identifying the order of the functional error process based on USA income data.

| Null hypothesis | $p = 0$ | $p \le 1$ | $p \le 2$ |
|---|---|---|---|
| Alternative hypothesis | $p \ge 1$ | $p \ge 2$ | $p \ge 3$ |
| P-value | 0.000 | 0.000 | 0.436 |