Preprint
Article

This version is not peer-reviewed.

On the Application of the Autoregressive–Moving-Average Model for Studying Phylogenetic Rate of Trait Evolution

A peer-reviewed article of this preprint also exists.

Submitted:

25 November 2024

Posted:

26 November 2024

You are already at the latest version

Abstract
The rate of evolution plays a crucial role in understanding the pace at which species evolve. Various statistical models have been developed to estimate the rate parameters for a group of related species evolving along a phylogenetic tree. Existing models often assume the independence of the rate parameters; however, this assumption may not account for scenarios where the rate of evolution correlates with its evolutionary history. We propose using the autoregressive-moving-average (ARMA) model for modeling the rate of evolution along the tree, hypothesizing that rates between two successive generations (ancestor-descendant) are time dependent and correlated along the tree. We denote \texttt{phyrateARMA}$(p,q)$ as a phylogenetic ARMA($p$,$q$) model in our innovation. Our algorithm begins by utilizing the tree and trait data to estimate the rates on each branch, followed by implementing the ARMA process to infer the relationships between successive rates. We apply our innovation to analyze the primate body mass dataset and plant genome size dataset and test for the autoregressive effect of rate of evolution along the tree.
Keywords: 
;  ;  ;  ;  

1. Introduction

In evolution, studying the heritable characteristics of the biological population is helpful to understand the diversity between species on our planet Earth [1]. Macroevolution is expected to occur when selection acts on a trait that has a heritable basis of phenotypic variation. During the evolutionary process, speciation results in new species, and the comparison of traits (e.g. height, weight, size, ⋯ etc.) among a group of related species can be made by studying the speed of changes in their characteristics over successive generations [2].
Although one subgroup of species evolved at a faster rate and resulted in a larger variation in the trait, the other subgroup of species evolved at a relatively lower rate and produced a moderate variation of the trait. When evolutionary processes such as natural selection (including sexual selection) and genetic drift act on this variation, certain characteristics become more common or rare within a population [3]. For example, Darwin finches are a group of about 18 species of dull-colored passerine birds on the Galápagos islands [4]. They are well known for their remarkable diversity in form, size, and function of the beak, which is highly adapted to different food sources [5]. Another example is angiosperms, which survive and thrive successfully on our planet. The evolution of fruits is one of the most important characteristics, as fruits not only provide a food source for other species, but also protect seeds and contribute to seed dispersal [6]. The survival and success of fruits require adaptation to their environment. For example, while dragon fruit (Selenicereus) endures temperatures up to 40°C (104°F) for survival, watermelon (Citrullus lanatus) needs temperatures higher than about 25°C (77°F) to thrive. Studying the reproducible properties by the rate of evolution to the resistance of temperature would help us shed light on the evolution of angiosperms themselves and understand their ecological implications.
The rate of evolution is a measurement of the change in an evolutionary lineage over time and can be defined as the ratio of the character displacement over a certain time interval. [7] defined the rate of change between two samples using three quantities: the proportional difference between the sample means, the pooled standard deviation of the samples, and the time interval between the samples. For example, suppose that a character has been measured twice, t 1 and t 2 , where t 1 and t 2 are expressed as the time before the present in millions of years. The time interval between the two samples can be written as d t = t 1 t 2 , which is 1 million years if t 1 = 6.27 and t 2 = 5.27 . The average value of the character is defined as x 1 in the previous sample and x 2 in the later sample. Let y t i = log ( x t i ) , i = 1 , 2 be the natural logarithm taking into account x t 1 and x t 2 . Then the evolutionary rate ( σ ) can be defined as in Eq. (1)
σ = ( y t 2 y t 1 ) / ( t 2 t 1 ) .
Next, multiplying the time difference t 2 t 1 on both sides of Eq. (1) produces the character difference within a time unit y t 2 y t 1 = σ ( t 2 t 1 ) . Conceptually consider that the character change occurred in infinitesimal time and denote the character displacement by d y t = lim t 1 t 2 ( y t 2 y t 1 ) = lim t 1 t 2 σ ( t 2 t 1 ) , then we have the differential equation d y t = σ d t . Given y ( 0 ) = y 0 , the solution is y t = y 0 + σ t which shows that y t increased with time t. Unfortunately, this may not be an appropriate model for describing character change in the evolutionary perspective. Instead, one may consider that the variation of the character change adopts a certain dynamic. For example, if one considers that the variation of the character change is proportional to time, then a stochastic variable W t can be introduced, where W t is a Wiener process with independent Gaussian distributed increment and d W t = W t + d t W t is a normal distributed random variable with mean 0 and variance d t (i.e. d W t N ( 0 , d t ) ). Thus, the displacement of the continuous trait variable y t ( = log x t ) solves stochastic differential equation shown in Eq. (2)
d y t = σ d W t .
Given y ( 0 ) = y 0 , integrate both sides of Eq. (2), one has
y t = y 0 + σ W t ,
where W 0 = 0 , and W t N ( 0 , t ) .
Given the initial value y 0 , Eq. (3) describes the dynamic of the trait variable y t at time t pending the rate parameter σ . Figure 1 presents one hundred trajectories generated using rates σ = 1 (bottom left) and σ = 4 (bottom right), respectively. It can be seen straightforwardly that while a smaller rate ( σ = 1 ) yields a narrower range of character value, the larger rate ( σ = 4 ) yields a wider range of character value.
For a group of n related species, denote y i , t , i = 1 , 2 , , n as the trait variable for the ith species. Then we can apply the Brownian motion with the rate parameter to explore the dynamic of the trait along the evolutionary history using a phylgoenetic tree T that represents the relatedness between species. In particular, the estimation of the evolution rate σ can be performed using comparative phylogenetic methods (PCM) [8,9,10].
There are models created on the basis of Eq. (3) via considering the constant-rates BM model may not be well addressed for the evolution in many scenarios [11,12,13,14,15]. Those models come with the assumption that the evolution of a species changes over time, the rates of evolution can be modeled as either constants or stochastic variables along times (branch lengths). The trait variable y t adopts the following dynamics.
y t = μ + σ t W t ,
where σ t can be constant (i.e. σ t = σ , t > 0 ) [16], piecewise constant (i.e. σ t = σ i , γ i 1 t γ i where γ i and γ i 1 are the successive time regimes) [11] or a random variable modeled by another pertinent process (i.e. σ t D ( t ) where D ( t ) is a distribution function for the stochastic variable) [17,18].
A hypothetical tree of four species A , B , C , D and the corresponding simulation of the trajectories y i , t s using Brownian motion along a rooted phylogenetic tree under two different rates is shown in Figure 2.
Those models have been broadly applied in many studies. For example, in the evolution of the morphology of the world’s largest flowers (Raffkesianeae: up to 1 meters in diameter), [19] found that the enormous flowers evolved from ancestors with tiny flowers. In the study of the evolution of the size of the plant genome, [20] found that the woody lineages had a stochastic motion rate that was nearly five times slower than the rate of the herbaceous lineages. Although the existing framework has produced rate models, none consider a scenario in which the rate is treated as a time-correlated stochastic variable, which could potentially enhance the study of rate evolution. This highlights the need for our work to apply a time series model [21,22]. Specifically, this approach aims to answer key questions: Are the evolutionary rates of biological traits statistically independent, or are they believed to be phylogenetically serially autocorrelated ?
Note that in modeling the rate of evolution, it is essential to consider its dynamic nature, as the rate can fluctuate, increase, or decrease over time [18,23], rather than a constant. Consider that the implementation of the time-correlated rate evolution σ t could possibly provide an alternative to reveal embedded information about species evolution, in this work we intend to expand the model y t = μ + σ t W t in Eq. (4) within the framework of correlated rate evolution ( σ t = g ( σ s | Θ ) for s < t where Θ is a parameter vector) to model the trait evolution for phylogenetic comparative analysis. In particular, we use the autoregressive moving average (ARMA) time series model that has been widely applied in econometrics to model the rate parameter [24,25]. The description of the methods can be found in Section 2. The simulations are detailed in Section 3. Empirical analyzes are presented in Section 4. The discussions and conclusions are covered in Section 5.

2. Methods

2.1. Trait Evolution with Time Correlated Rate

We describe our procedure as follows. In Section 2.1.1, we adopt phylogenetic ridge regression [26] to perform the analysis using the phylogenetic tree and trait data as input to obtain rate estimates. In Section 2.1.2, we perform a phylogenetic rate ARMA(p,q) regression, w here the rate estimates are incorporated along the tree using a traversal algorithm, treating the data as time series.

2.1.1. Step 1: Phylogenetic Ridge Regression

Given the tree with the topology and branch lengths, one assumed that the observed trait values are a linear combination of past values, the time elapsed, and the rate of evolutionary change. To illustrate, we use a case of 3 taxa tree as shown in Figure 3 where the extant species A, B, and C, and the fossil species D, E, and O have trait values of ( y A , y B , y C ) and ( y D , y E , y 0 ) . These traits values can be expressed as a linear combination of the time elapsed and the rate of evolution shown in Eq. (5).
y A = y 0 + l A σ A + l D σ D + l E σ E , y B = y 0 + l B σ B + l D σ D + l E σ E , y C = y 0 + l C σ C + l E σ E , y D = y 0 + l D σ D + l E σ E , y E = y 0 + l E σ E .
One can formulate into the system of linear equation in Eq. (6)
y A y B y C y D y E = y 0 y 0 y 0 y 0 y 0 + l A 0 0 l D l E 0 l B 0 l D l E 0 0 l C l D l E 0 0 0 l D l E 0 0 0 l D l E σ A σ B σ C σ D σ E .
Given empirical data where the tip values are y A , y B , and y C , and the ancestral values y 4 and y 5 may be known through fossil records or remain unknown. Assuming the currently provided data y = ( y A , y B , y C ) , and the phylogenetic tree T with a given set of known branch lengths L = { l A , l B , l C , l D , l E } , the tip states ( y A , y B , y C ) can be written into a matrix form in Eq. (7)
y A y B y C = y 0 y 0 y 0 + l A 0 0 l D l E 0 l B 0 l D l E 0 0 l C 0 l E σ A σ B σ C σ D σ E .
Use the can written into Equation (8)
Y ^ = L Σ ,
where
Σ ^ = arg min Σ i ( y i ( L Σ ) i ) 2 + λ j Σ j 2 ,
which can be written as
Σ ^ = ( L T L + λ I ) 1 L T y .
In general, given tip data y = { y 1 , y 2 , , y n } R n , and tree topology T of n taxa with branch length set B = { l j 1 , l j 2 , , l j m } where ( j 1 , j 2 , , j m ) : = J is internal node index set with the corresponding with the ancestral-descedant relationship delineated by the tree topology, we use the R package rrphylo [27] to perform the analysis for estimation of phenotypic evolutionary rates { σ j s } j s J using Eq. (10) with adopting the leave-one-out cross-validation (LOOCV) to search for the optimal λ . In this method, each observation is removed one at a time, and the remaining data is used to make predictions. The error for the excluded observation is calculated, and this process is repeated for all observations. The total LOOCV error is computed for various λ values, and the λ with the smallest error is selected as the best regularization parameter, improving model performance [26].

2.1.2. Step 2: Phylogenetic Rate ARMA(p,q) for Trait Evolution

Next, we implements the ARMA model for studying rate { σ j s } j s J . Given tree topology T of n taxa and with tip node y = { y 1 , y 2 , , y n } R n and internal node index set J = ( j 1 , j 2 , , j m ) and the corresponding branch length set B = { l j 1 , l j 2 , , l j m } with the ancestral-descedant relationship delineated by the tree topology, by utilizing phylogenetic ridge regression in Eq. (10) one can get a set of estimated rates Σ J = { σ j s } s = 1 m where index j s corresponds to the internal index, then the phylogenetic ARMA( p , q ) model of rates defined by a mixture of phylogenetic autoregressive (AR) and phylogenetic moving average (MA) models given by the Eq. (11):
σ j s = i = 1 p ϕ i σ j s i + j = 1 q θ j ϵ j s j + ϵ j s ,
where σ j s represents the time series at time j s , ϕ i are coefficients for the AR terms, θ j are coefficients for the MA terms, and ϵ j s denotes the white noise error terms following a normal distribution with mean zero and variable proportional to the branch length j s as shown in Eq. (12)
ϵ j s N ( 0 , τ 2 l j s ) .
For a phylogenetic ARMA(1,1) rate model, denote σ des = σ j s as the descendent ( σ j s ) and σ anc = σ j s 1 as the ancestor, the model in Eq. (11) can be represented as in Eq. (13).
σ des = ϕ σ anc + θ ϵ anc + ϵ des , where ϵ des N ( 0 , τ 2 l des ) .
An exemplary example of the 3 taxa case is shown in Figure 4.
We are interested in the joint distribution L ( Θ | y , σ , ϵ , T ) where Θ is the parameter vector (i.e. Θ = ( { ϕ i } i = 1 p , { θ j } j = 1 q , τ ) for the ARMA ( p , q ) . On a specific branch s J with branch length l s , we assume the error ϵ j s follow a normal distribution with mean conditioning on the ancestor and variance with branch length l j s . The logarithmic likelihood on a branch is a one- dimensional ARMA ( p , q ) that has likelihood in Eq. (14).
log L ( Θ | y , σ j s , ϵ j s , T ) = 1 2 log ( 2 π ) log ( τ ) ϵ j s τ .
Assuming branches are independent, the full log likelihood given the tree T and trait Y , σ = { σ s } s J , ϵ = { ϵ s } s J can be written as
log L ( Θ | y , σ , ϵ , T ) = log L ( Θ | ϵ 0 , σ 0 , T ) + s J log L ( Θ | y , σ s , ϵ s , T ) ,
where ϵ 0 is the residual at the root and the product operator s J follows the corresponding tree topology T .
To perform model inference on parameter estimation, given trait vector y = ( y 1 , y 2 , , y n ) t on the tip, and a phylogenetic tree T with known branch length set J where l 0 = 0 is the branch length of the root node, and each branch l j s . Use y and the tree T to obtain the rate estimates at the root σ 0 from the Brownian motion model [12]
σ 0 = ( y μ ^ 1 ) t C 1 ( y μ ^ 1 ) / n ,
where μ ^ = 1 C 1 y ) / ( 1 t C 1 1 ) is the MLE estimate for the ancestral at the root under the Brownian motion model, and C is the variance covariance matrix transformed from the tree T and 1 = (1, 1, ⋯, 1)t is vectors of 1s. Then we use the maximum likelihood approach for parameter estimation and inference. Refer to Section 6.2 Appendix for expression and parameter estimation a first few order phylorateARMA((p, q)) model.
For empirical analysis, we analyze the data using the model up to order p = 2 , q = 2 . For model selection, the Akaike information criterion A I C = 2 k log L and its sample size correction version A I C c = A I C + 2 k ( k + 1 ) / ( n k 1 ) [28] and the weighted parameter are calculated using equation w k = exp 0.5 × Δ AIC k / k = 1 K exp 0.5 × Δ AIC k , where w i is the weight for the models i and Δ AIC i = AIC i min i { AIC i } . The weighted parameter estimate ξ ˜ is calculated as ξ ˜ = k = 1 K w i × ξ ^ k . Here, ξ ^ i represents the MLE estimate from model i. This method incorporates uncertainty in the selection of the model by averaging parameters based on AIC weights [29].

2.2. Testing of AR(1) Effect

We conduct inference to test for the existence of the autoregressive effect. This is equivalent to testing whether ϕ = 0 , as shown in Section 2.2.1. Additionally, we investigate whether the autoregressive effect varies across different clades of the tree. For this, we pose the null hypothesis H 0 : ϕ 1 = ϕ 2 , which is discussed in Section 2.2.2.

2.2.1. Test of Correlated Rate Evolution

Given the AR(1) process: σ j s = ϕ σ j s 1 + ϵ j s where ϵ j s N ( 0 , τ 2 l j s ) is white noise, the asymptotic variance of the MLE ϕ ^ is: Var ( ϕ ^ ) = τ 2 / j s J σ j s 2 . The null hypothesis is given by: H 0 : ϕ = ϕ 0 where ϕ 0 is a hypothesized value. The alternative hypothesis for a two-sided test is: H 1 : ϕ ϕ 0 . We set ϕ 0 = 0 to test for the AR(1) effect. To perform the test, first fit an AR(1) model to the data and obtain the estimate ϕ ^ of ϕ and its standard error SE ( ϕ ^ ) . The test statistic is calculated as:
t = ϕ ^ ϕ 0 SE ( ϕ ^ ) .
This test statistic is compared with the critical value of the t distribution with n 1 degrees of freedom, where n is the sample size. The standard deviation estimate can be derived from the Hessian matrix obtained during the optimization process, which is then used to conduct hypothesis testing.
In the context of hypothesis testing for phylogenetic rate model in an AR(1) model, the process begins by establishing the null hypothesis, H 0 : ϕ = ϕ 0 , where ϕ 0 represents a specific hypothesized value. The alternative hypothesis for a two-sided test is H 1 : ϕ ϕ 0 . The methodology for testing these hypotheses involves several steps: (i) Firstly, an AR(1) model is fitted to the data. From this model, the estimate ϕ ^ of ϕ and its standard error S E ( ϕ ^ ) are obtained. (ii) The next step involves calculating the test statistic using the formula t = ϕ ^ ϕ 0 S E ( ϕ ^ ) . (iii) Finally, this test statistic is compared against the critical value from the t-distribution, which is determined by the sample size J = | J | where | J | is the number of the branches of the tree and corresponds to J 1 degrees of freedom. This comprehensive method is essential to determine whether the hypothesized value ϕ 0 is a plausible value for ϕ in the AR(1) model.

2.2.2. Testing Heterogeneity Rates on Sub Clade

We consider whether heterogeneity autoregressive effect exists on the two subclade, as shown in Figure 5.
The null Hypothesis ( H 0 ) poses that a single tree with n taxa and use one parameter ϕ while the alternative Hypothesis ( H a ) states that 2 autoregressive parameters ϕ 1 and ϕ 2 would more approporate when tree the is divided into two subtrees from the root, and parameters are estimated independently.
The null hypothesis in the likelihood ratio test states that the simpler model (in this case, the single combined tree) is sufficient to explain the variability in the data. The alternative hypothesis suggests that dividing the tree into two subtrees provides a better fit. The likelihood ratio test (LRT) statistic is computed as:
LRT = 2 log L H 0 log L H a ,
where log L H 0 is the log-likelihood under the null hypothesis and log L H a is the log-likelihood under the alternative hypothesis. The test statistic follows a chi-square distribution with the degrees of freedom d f = 1 .
The procedure in our framework is summarized in Algorithm 1.
Algorithm 1: Procedure of Phylogenetic ARMA(p,q) Rate Data Analysis
1:
Input: a phylogenetic tree T with m branch lengthes B = { b j 1 , b j 2 , , b j m } , and quantitative trait values y = ( y 1 , y 2 , , y n ) t of n taxa.
2:
Output: estimated rate of evolution Σ J = { σ ˜ j s } s = 1 m , the parameter estimates ( ϕ ^ s , θ ^ s, τ ^ 2 ), the AICc values, the test statistics t-value for H 0 : ϕ = 0 , and χ 2 -value for H 0 : ϕ 1 = ϕ 2 and their corresponding p values.
3:
Apply the phylogenetic ridge regression using the tree T , branch lengths B , and trait value Y using Eq. (10) to obtain the rate estimates { σ ˜ j s } s = 1 m .
4:
Use { σ ˜ j s } s = 1 m as input and apply the tree traveral algorithm on the tree T to fit the phylonetic rate ARMA(p,q) model.
5:
Evaluate the AICc to compare the phyrateARMA( p , q ) model, and compute the Akaike weight and the weighted estimate.
6:
Test of the significance of the AR(1) relationship in the two scenarios:
  • Test of correlated rate evolution H 0 : ϕ = 0 .
  • Testing heterogeneity of AR (1) on two subclades H 0 : ϕ 1 = ϕ 2 .
7:
Return the results.

3. Simulation

We assess the performance of the model using simulation. Four types of trees are used for the assessment: star tree, balanced tree, left tree, coalescent tree, and birth-and-death tree, each of size 16 , 32 , 64 , 128 . The initial samples are drawn from an independent normal distribution. The initial estimate for σ 0 by applying the formula in Eq. (16) at the root is estimated by the Brownian motion model using the R package geiger. For each type of tree, for each data set and for each prior set, a million replicates of the sample will be generated to assess the posterior sample distribution.
Let the tree T have m branches, with each branch representing a transition between an ancestral node j s 1 and a descendant node j s . The tree is traversed in postorder (from root to tips), and the trait values for each node are computed based on the following AR(1) process.
For each branch, the evolutionary rate at the descendant node j s is determined by the rate at its ancestor j s 1 , with some noise (innovation) added. The autoregressive equation is shown in Eq. (19)
σ j s = ϕ σ j s 1 + ϵ j s
Initially, the rate value at the root node is set to zero: β 0 = 0 .
After simulating the rates using Eq. (19) for all nodes in the tree, the final trait values at the tree tips (taxa) are computed by accumulating the rates along the path from the root to each tip.
Let J denote the set of nodes from the root to tip j, and j s denote the branch length leading to node j s . The trait value at each tip j is:
y j = j s J σ j s j s
where y j is the final accumulated trait value at tip j.
The final output consists of two arrays: (1) the array of simulated evolutionary rates for each node, σ , and (2) the trait values at the tree tips, y summarized by the following equations Eq. (21)
σ j s = ϕ σ j s 1 + ϵ j s , ϵ N ( 0 , τ 2 l j s ) , y j = j s J σ j s j s .
This simulation captures the evolutionary rate changes along a phylogeny under an AR(1) model, accounting for both ancestral states and random innovation at each branch. We create a function simulate_trait_data_given_phi which simulates the trait evolution process along a phylogenetic tree using an autoregressive process (AR(1)) for the rates of evolution. The simulation setup is shown in Algorithm 2
Algorithm 2: Estimating Type I Error and Power with Phylogenetic AR(1) rate Model
1:
Input: tree type {balanced, coalescent, pure birth, birth death }, taxa size { 16 , 32 , 64 , 128 } , simulations 1000, α = 0.05 , ϕ 0 = 0 , σ 0 = 1 , ϕ a = { 0.05 , 0.10 , 0.15 , . 0.95 , 1 } .
2:
Output: Type I error and power for each taxon size.
3:
for each tree type do
4:
    Generate tree T with branch length set B = { l j s } j s J
5:
    for each taxa do
6:
        Initialize Type I Error T 1 = 0
Estimate Type I Error
7:
        for  i = 1 to sims  do
8:
           Use Eq. (21) with ϕ = ϕ 0 to simulate trait data y
9:
           Use Eq. (10) to obtain rate estimates { σ ˜ j s } j s J
10:
           Optimize the neg log likelihood in Eq. (15) to obtain MLE parameters ϕ ^ , θ ^ , τ ^ .
11:
           Use Eq. (17) to test ϕ = 0 using MLE parameter, and its standard error and p value.
12:
           if  p < α  then
13:
                T 1 = T 1 + 1
14:
           end if
15:
        end for
16:
        Store T 1 for this tree and taxa size
Estimate Power
17:
        for each ϕ a  do
18:
           Initialize power pow  = 0
19:
           for  i = 1 to sims  do
20:
               Use Eq. (21) with ϕ = ϕ a to simulate trait data y
21:
               Use Eq. (10) to obtain rate estimates { σ ˜ j s } j s J
22:
               Optimize the neg log likelihood in Eq. (15) to obtain the MLE parameters ϕ ^ , θ ^ , τ ^ .
23:
               Use Eq. (17) to test ϕ = 0 using the MLE parameter and its standard error and p value.
24:
               if  p < α  then
25:
                   pow= pow  + 1
26:
               end if
27:
           end for
28:
           Store pow for this tree and taxa size
29:
        end for
30:
    end for
31:
end for
32:
Return T 1 , p o w
Figure 6 and Table 1 show how the statistical power to detect an effect changes with different values of ϕ a and different taxa levels.
From Figure 6 and Table 1, one can envision that the power of the test changes depending on the number of taxa. Generally, as the number of taxa increases, the power also increases. Next, for all taxa levels, as ϕ a increases, the power of the test also increases. This could mean that as the parameter ϕ a increases, the ability to detect an effect or a difference becomes stronger. Third, all curves converge or come closer together at higher values of ϕ a , indicating that the power could be more consistent across different levels of taxa at these higher values. This simulation provide the evidence that our methods fit the statistical properties as expect in the performance of evaluating the statistical power and type I error.

4. Empirical Analysis

4.1. Genome Size Rate Evolution in Herbaticus and Woody Plant

This study collected and analyzed data sets on the genome size in pg of herbaticus and woody plant. The primary data source is [30], we includes 35 herbaticus and 33 woody species in this study. The main objective is to test whether there is a time-series relationship in the evolutionary rate of these plant’ genome size.
Figure 7 shows the phylgoenetic tree for herbaticus, woody and their joint tree.

4.1.1. Genome Size: Combined Herbaticus and Woody Species

Table 2 presents the parameter estimates for various autoregressive (AR) and autoregressive moving average (ARMA) models. ARMA(1,1) has a ϕ 1 value of 0.600 and a θ 1 value of 0.738 , indicating a moderate correlation between consecutive rates of evolution, with a residual variance of τ 2 = 4.663 . The weighted estimates combine values across the models, with ϕ 1 = 0.050 and θ 1 = 0.202 , and τ 2 = 4.722 , providing a comprehensive view of the underlying evolutionary process.
Using the phylogenetic ARMA(1,1) rate model described in Eq. (13), the equation for σ des demonstrates how the descendant’s rate of evolution is influenced by multiple factors. The term ϕ σ anc indicates that -5.0% of the ancestor’s evolutionary rate ( σ anc ) is carried over to the descendant, reflecting a weak and inverse correlation between the two rates. The second term, θ ϵ anc , with a coefficient of 0.202 , adjusts for the random variation (or noise) in the ancestor’s rate. This positive value suggests that random fluctuations in the ancestor’s rate have a small reinforcing effect on the descendant. Finally, ϵ des , with variance proportional to τ 2 = 4.722 , represents the new random variation specific to the descendant. The weighted estimates offer a balanced perspective on how both inherited traits and random fluctuations shape evolutionary changes across generations.
In Table 3, AR(2) ranks first with the lowest AICc value of 537.870 and the highest Akaike weight ( w = 0.530 ), indicating the best fit among the models. AR(1) follows with an AICc of 539.850 and a weight of 0.2 , suggesting a close fit but slightly less optimal than AR(2). ARMA(1,1) ranks third with an AICc of 540.070 and a weight of 0.180 . Models ARMA(2,1) and ARMA(2,2) show considerably lower Akaike weights, indicating that increasing model complexity beyond AR(2) does not significantly improve the fit.

4.1.2. Genome Size: Herbaticus Species

The parameter estimates in Table 4 for the herbaticus models show that AR(1) has a ϕ 1 value of 0.470 with residual variance τ 2 = 0.009 , indicating a moderate influence from the previous rate of evolution. AR(2) introduces a second autoregressive term ( ϕ 2 = 0.019 ) and maintains a similar variance of τ 2 = 0.009 . The ARMA(1,1) model has ϕ 1 = 0.491 and θ 1 = 0.024 , with the same variance of τ 2 = 0.009 . More complex models, such as ARMA(2,1) and ARMA(2,2), show mixed ϕ 1 and θ 1 values, but they maintain the same low variance of τ 2 = 0.009 .
The weighted estimate balances these models, showing ϕ 1 = 0.427 , θ 1 = 0.021 , and τ 2 = 0.009 . This suggests that the descendant’s rate of evolution is influenced by 42.7 % of the ancestor’s rate, with a small adjustment of 2.1 % due to noise from the ancestor, all within a low variance process.
In Table 5, AR(1) ranks first with the lowest AICc value of 110.550 and the highest Akaike weight ( w = 0.610 ), making it the best-fitting model for the herbaticus data. ARMA(1,1) follows in second place with an AICc of 108.550 and a weight of 0.220 , while AR(2) ranks fourth with an AICc of 105.690 and a weight of 0.050 . More complex models like ARMA(2,1) and ARMA(2,2) have even lower Akaike weights, indicating that they provide little additional explanatory power compared to simpler models.

4.1.3. Genome Size: Woody Species

The parameter estimates in Table 6 for the woody models indicate that AR(1) has a ϕ 1 value of 0.013 and a residual variance of τ 2 = 0.002 , suggesting a very weak negative influence from the previous rate of evolution. AR(2) introduces a second autoregressive term ( ϕ 2 = 0.302 ) and slightly increases the variance to τ 2 = 0.003 . ARMA(1,1) shows stronger dependence with ϕ 1 = 0.359 and θ 1 = 0.446 , while maintaining a variance of τ 2 = 0.002 .
The weighted estimate balances the models, with ϕ 1 = 0.023 , θ 1 = 0.044 , and τ 2 = 0.002 . This suggests that the descendant’s rate is slightly positively influenced by the ancestor’s rate ( 2.3 %) but slightly negatively impacted by the noise ( 4.4 %), with very low overall variance ( 0.002 ), indicating minimal change between generations.
In Table 7, AR(1) ranks first with the lowest AICc value of 192.770 and the highest Akaike weight ( w = 0.860 ), indicating it is the best-fitting model for the woody data. ARMA(1,1) ranks second with an AICc of 188.710 and a much smaller weight ( w = 0.110 ), while AR(2) ranks last with an AICc of 180.880 and no Akaike weight ( w = 0.000 ). More complex models, such as ARMA(2,1) and ARMA(2,2), also have minimal Akaike weights ( w = 0.010 ), suggesting they provide little additional explanatory power and overcomplicate the fit for the woody data.
From above one may foreseen that the woody primate and the herbaticus genome size possess different autoregressive effect for the rates of the evolution, we provide the test in the following sections.

4.1.4. Testing Autocorrelation Rate

Interpretation of AR(1) Test Results
Table 8 show the AR1 test result
Table 8 presents the results of the AR(1) test under the null hypothesis H 0 : ϕ = 0 (no autocorrelation) and the alternative hypothesis H a : ϕ 0 . The herbaticus tree shows a significant positive autocorrelation with a large z-value of 5.100 and a p-value of 0.000, strongly rejecting the null hypothesis. This suggests a significant effect of the previous rate of evolution on the current rate in the herbaticus dataset.
In contrast, the woody tree show no significant autocorrelation, with z-values of 0.121 , and p-value 0.903 indicates that the null hypothesis cannot be rejected for these datasets, suggesting no significant relationship between the past and current rates of evolution in the woody trees. This may due to woody tree with slower rates (5 times slower than the herbaticus tree [20]).
The combined tree shows no significant autocorrelation, with z-values of 1.119 , and p-values of 0.263 . The result suggests that there is no significant relationship between the past and current rates of evolution in the combined trees.
Testing Heterogeneity Rates on subclades
Table 9 presents the results of the heterogeneity test in evolutionary rates across the herbaticus, woody, and combined (dino) subclades using various autoregressive models (AR(1), AR(2), ARMA(1,1), ARMA(2,1), ARMA(2,2)). The log-likelihood values for the herbaticus ( log L di ), woody ( log L no ), and combined ( log L dino ) datasets are provided, along with the chi-squared statistic and p-values.
All models yield highly significant results (p-values = 0.00), indicating strong evidence for rate heterogeneity between the subclades. The highest chi-squared statistic is observed for AR(1) (847.16), suggesting that it explains the greatest amount of variance in the data compared to the other models.

4.2. Body Mass Rate Evolution in Diurnal and Nocturnal Primate

This study collected and analyzed data sets on the body weights of male and female primates. The primary data source are [33] and Animal Diversity Web [34], which includes diurnal and nocturnal male and female primates. The main objective is to test whether there is a time-series relationship in the evolutionary rate of these primates’ body weights. The original data set contains 88 primate species, 34 diurnal species and 54 nocturnal species. Body weight for males, females, or combined is collected. The analysis is as follows, combined analysis of male and female primate data. For the male-only and female-only analysis case, see Section 6.1.2 for the corresponding analysis.

4.2.1. Body Mass: Combined Diurnal and Nocturnal Species

Table 10 presents the parameter estimates for various autoregressive moving average (ARMA) and autoregressive (AR) models. ARMA(1,1) shows the highest ϕ 1 value of 0.875 and θ 1 of 0.652 , indicating a significant correlation between consecutive rates of evolution, with residual variance τ 2 = 1.103 . The weighted estimates combine the values across the models, with ϕ 1 = 0.779 and θ 1 = 0.566 , and τ = 1.109 providing a balanced view of the underlying process.
By the phylgoenetic ARMA (1,1) rate model in Eq. (13), as the equation for σ des shows how the descendant’s rate of evolution is determined by a combination of factors. The term ϕ σ anc suggests that 77.9% of the ancestor’s evolutionary rate ( σ anc ) carries over to the descendant, indicating a strong correlation between the two rates. The second term, θ ϵ anc , which has a coefficient of 0.566 , adjusts for the random variation (or noise) in the ancestor’s rate. This negative value means that random fluctuations in the ancestor’s rate are counterbalanced, effectively reducing the influence of ancestor-specific randomness on the descendant. Finally, ϵ des has variance proportional to τ 2 = 1.109 representing the new random variation specific to the descendant. The weighted estimates provide a balanced view of the process by combining these terms, giving insight into how both inherited traits and random fluctuations influence evolutionary changes across generations.
In Table 11, ARMA(1,1) ranks first with the lowest AICc value of 417.080 and the highest Akaike weight ( w = 0.610 ), indicating the best fit among the models. ARMA(2,1) and ARMA(2,2) follow, but with considerably lower Akaike weights, suggesting that increasing model complexity beyond ARMA (1,1) does not produce substantial fit improvements. AR(1) is the lowest, highlighting its reduced explanatory power compared to more complex models.

4.2.2. Body Mass: Diurnal Species

The parameter estimates in Table 10 for the diurnal models show that AR(1) has a ϕ 1 value of 0.422 with residual variance τ 2 = 0.129 , indicating a moderate level of influence from the previous rate of evolution. AR(2) introduces a second autoregressive term ( ϕ 2 = 0.206 ) and slightly increases the variance to τ 2 = 0.131 . ARMA(1,1) and more complex models, such as ARMA(2,1) and ARMA(2,2), demonstrate negative ϕ 1 and θ 1 values, but these more complex models have similar or slightly lower variances ( τ 2 = 0.127 ). The weighted estimate balances between models, showing ϕ 1 = 0.260 , θ 1 = 0.222 , and τ 2 = 0.127 . The weighted estimate shows that the descendant’s rate is influenced by 26.0 % of the ancestor’s rate, with a positive adjustment of 22.2 % from the ancestor’s noise, while the process has a relatively low variance of 0.127 .
In Table 13, AR (1) is the first with the lowest AICc value of 45.680 and the highest Akaike weight ( w = 0.480 ), making it the best-fitting model for the diurnal data. ARMA(1,1) ranks second with an AICc of 47.170 and a lower weight ( w = 0.230 ), followed by AR(2) with an AICc of 47.710 ( w = 0.170 ). More complex models such as ARMA(2,1) and ARMA(2,2) rank lowest, with minimal Akaike weights, indicating that they provide little additional explanatory power.

4.2.3. Body Mass: Nocturnal Species

The parameter estimates in Table 14 for the nocturnal models indicate that AR(1) has a ϕ 1 value of 0.213 and a residual variance of τ 2 = 0.003 , suggesting a weak negative influence from the previous rate of evolution. AR(2) introduces a second autoregressive term ( ϕ 2 = 0.080 ) and slightly increases the variance to τ 2 = 0.004 . ARMA (1,1) has larger ϕ 1 = 0.593 and θ 1 = 0.901 , with a similar variance of τ 2 = 0.004 , indicating a stronger dependency on the autoregressive and moving average terms. The weighted estimate shows a balance between models with ϕ 1 = 0.181 , θ 1 = 0.037 , and τ 2 = 0.003 . The weighted estimate indicates that the descendant’s rate is slightly negatively influenced by the ancestor’s rate ( 18.1 %) and its noise ( 3.7 %), with very low overall variance ( 0.003 ), suggesting minimal change between generations.
In Table 15, AR (1) is the first with the lowest AICc value of 255.450 and the highest Akaike weight ( w = 0.930 ), indicating it is the best-fitting model for the nocturnal data. AR(2) and ARMA(1,1) rank much lower, with AICc values of 248.710 and 248.690 , respectively, both having a very small Akaike weight ( w = 0.030 ). More complex models such as ARMA(2,1) and ARMA(2,2) rank lowest, with minimal Akaike weights, indicating they add little explanatory power and overcomplicate the fit for the nocturnal data.
From above one may foreseen that the nocturnal primate and the diurnal primate body mass possess different autoregressive effect for the rates of the evolution, we provide the test in the following sections.

4.2.4. Testing Autocorrelation Rate

Interpretation of AR(1) Test Results
Table 16 show the AR1 test result
Table 16 presents the results of the AR(1) test under the null hypothesis H 0 : ϕ = 0 (no autocorrelation) and the alternative hypothesis H a : ϕ 0 . The combined tree and the diurnal tree show significant positive autocorrelation with large z-values ( 4.229 and 4.237 , respectively) and p values of 0.000 , indicating a strong rejection of the null hypothesis. This suggests a significant effect of the previous evolution rate on the current rate in these datasets.
Interestingly, the nocturnal tree exhibits a significant negative autocorrelation with a z value of 2.763 and a p value of 0.006 . Despite the negative z-value, the small p-value suggests a significant relationship. This indicates that while each tree shows a significant effect, the nocturnal tree behaves differently from the combined and diurnal trees, potentially indicating unique evolutionary dynamics in nocturnal species.
Testing Heterogeity Rates on subclades
Table 17 presents the results of the test of heterogeneity in evolutionary rates in the diurnal, nocturnal and combined (dino) subclades using various autoregressive models (AR(1), AR(2), ARMA(1,1), ARMA(2,1), ARMA(2,2)). The log-likelihood values for the diurnal ( log L _di), nocturnal ( log L _no), and combined ( log L _dino) data are shown, along with the chi-squared statistic and p-values. All models show highly significant results (p-values = 0.00), indicating strong evidence for rate heterogeneity between subclades. The chi-squared statistic is highest for AR1 ( 637.06 ), suggesting it explains the greatest amount of variance in the data compared to the other models.

5. Discussion and Conclusions

In this study, we first assumed that species evolve along a phylogenetic tree, with time length, evolutionary rate, and trait values expressed as a linear combination. Using the given trait values and branch lengths of the phylogenetic tree, we applied ridge regression and used the R package RRphylo to estimate the evolutionary rate. Next, we modeled the evolutionary rate using the ARMA(p,q) models, fitting these models to ancestral trait values, and then applied these models to estimate present-day trait values.
In the empirical analysis, we tested whether the trait evolution of diurnal and nocturnal populations exhibited time-series correlation. In the dataset, which combined male and female primates body mass, the results showed an autoregressive effect in the separate analysis of diurnal and nocturnal populations, indicating a time-series relationship. To observe the evolutionary dynamics, the branches might have different autoregressive coefficients ( ϕ ). These branches can be analyzed to understand how differences between them affect the overall model. Further analysis could deepen the exploration of the effects of this statistical model in the biological context, providing a more comprehensive scientific explanation to more precisely understand and interpret evolutionary processes and phylogenetic patterns in biological systems.
Several methods exist for estimating autoregressive (AR) and moving average (MA) model parameters. Maximum Likelihood Estimation (MLE) maximizes a likelihood function, assuming normally distributed errors, while Least Squares Estimation minimizes squared residuals for AR parameters but is less suited for MA parameters. The Yule-Walker Equations use sample autocorrelations to estimate AR parameters but not MA ones, whereas the Innovations Algorithm estimates both AR and MA parameters using residuals. Burg’s Method improves AR estimates by minimizing forecast errors, and Bayesian Estimation employs prior distributions and observed data, often via MCMC. The Bootstrapping Method starts by estimating the AR(1) parameters ϕ 1 and ϕ 2 for both time series. Then, bootstrapping is used to resample with replacement from each series multiple times. For each bootstrapped sample, calculate Δ ϕ = ϕ ^ 1 ϕ ^ 2 . Next, obtain the empirical distribution of Δ ϕ . If 0 falls outside the 95% confidence interval, there is evidence that the two series have different AR(1) parameters. Each method has advantages depending on the time series characteristics, analysis goals, and available resources.
The proposed research on Covid evolution could be extended by investigating how regional factors such as vaccination rates, public health interventions, and population density contribute to the mutation rate of SARS-CoV-2, or by exploring correlations between mutations and viral traits like transmissibility or immune escape. For land mammals, an extension could examine how adaptability in diet breadth, climate niche, and range size has changed over evolutionary periods or incorporate phylogenetic data to study how related species respond to environmental pressures. Lastly, the study on latitude’s influence on elevation could explore how this relationship differs across biomes or is affected by human activities like deforestation and urbanization, providing a more comprehensive understanding of climate-related changes. These extensions would enhance the versatility and application of the phylogenetic autoregressive moving average regression model across various domains.

6. Appendix

6.1. Code and Scripts

Script and data can be accessed through the following link https://tonyjhwueng.info/phyarmarate.

6.1.1. Model Script

6.1.2. Figures and Table

6.2. Phylogenetic ARMA(p,q) Model for Rate Evolution

Given tree topology T of n taxa and with tip node y = ( y 1 , y 2 , , y n ) and internal node J = ( j 1 , j 2 , , j m ) and the corresponding branch length set B = { l j 1 , l j 2 , , l j m } with the ancestral relationship, with fitting the RRphylo [27], one can get a set of estimated rates Σ J = { σ j s } s = 1 m for the tree, the phylogenetic ARMA( p , q ) model of rates is a mixture of phylogenetic autoregressive (AR) and phylogenetic moving average (MA) models given by the equation:
σ j s = c + i = 1 p ϕ i σ j s i + j = 1 q θ j ϵ j s j + ϵ j s , where ϵ j s N ( 0 , τ 2 l j s )
where σ j s represents the time series at time j s , c is a constant, ϕ i are coefficients for the AR terms, θ j are coefficients for the MA terms, and ϵ j s denotes the white noise error terms with mean 0 and variance τ 2 l j s .

6.2.1. PhyRateARMA(1,0) Model

The phylogenetic AR(1) process for rate evolution along the is written as
σ j s = ϕ σ j s + ϵ j s ,
where nodes j s , j s J and ϵ j s N ( 0 , τ 2 l j s ) .
The negative log-likelihood is:
L ( ϕ , τ 2 | T , Σ J ) = m 2 log ( 2 π ) + m 2 log ( τ 2 ) + 1 2 τ 2 σ j s , σ j s Σ J ( σ j s ϕ σ j s ) 2 .
where m = | J | is the number of branches, and σ j s and σ j s are the two successive rates along the branch length l j s and l j s , respectively.
To find the maximum likelihood estimates of ϕ and τ 2 for the phyAR(1) rate process, we aim to minimize the negative log-likelihood. This requires setting the derivatives of the negative log-likelihood with respect to ϕ and τ 2 to zero and solving these equations for the parameters.
The first order conditions for the derivatives are: L ϕ = 0 , and L τ 2 = 0 . For ϕ , the derivative simplifies to:
L ϕ = 1 τ 2 ( σ j s , σ j s ) Σ J ( σ j s σ j s ϕ σ j s 2 ) = 0 ,
which leads to:
ϕ ^ = ( σ j s , σ j s ) Σ J σ j s σ j s ( σ j s , σ j s ) Σ J σ j s 2 .
For τ 2 , the derivative is:
L τ 2 = m 2 τ 2 1 2 ( τ 2 ) 2 ( σ j s , σ j s ) Σ J ( σ j s ϕ σ j s ) 2 = 0 ,
leading to:
τ ^ 2 = 1 m ( σ j s , σ j s ) Σ J ( σ j s ϕ σ j s ) 2 .
These formulations allow us to estimate the parameters that best fit our phylogenetic AR(1) model based on the observed evolutionary rates along a phylogenetic tree.

6.2.2. PhyRateARMA(2,0) Model

A phylogenetic autoregressive model of order 2 (Phylo AR(2)) can be expressed for evolutionary rate changes along a phylogenetic tree. The rate at node t is modeled as:
σ j s = ϕ 1 σ j s + ϕ 2 σ j s + ϵ j s ,
where σ j s is the rate on the next down successive branch.
Given a phylogenetic lineage with rates σ 1 , σ 2 , , σ T , the likelihood function for the Phylo AR(2) model parameters ϕ 1 , ϕ 2 and τ 2 is:
L ( ϕ 1 , ϕ 2 , τ 2 | T , Σ J ) = σ j s , σ j s , σ j s Σ J 1 2 π τ 2 exp ( σ j s ϕ 1 σ j s ϕ 2 σ j s ) 2 2 τ 2
The log-likelihood function simplifies to:
log L ( ϕ 1 , ϕ 2 , τ 2 | T , Σ J ) = m 2 2 log ( 2 π τ 2 ) 1 2 τ 2 σ j s , σ j s , σ j s Σ J ( σ j s ϕ 1 σ j s ϕ 2 σ j s ) 2
To find the maximum likelihood estimates (MLE) for ϕ 1 , ϕ 2 , and τ 2 , the log-likelihood function’s partial derivatives are taken with respect to each parameter, set to zero, and solved for these parameters.
The equations for partial derivatives are:
ϕ 1 L ( ϕ 1 , ϕ 2 , τ 2 | T , Σ J ) = 1 τ 2 σ j s , σ j s , σ j s Σ J ( σ j s ϕ 1 σ j s ϕ 2 σ j s ) σ j s ,
ϕ 2 log L ( ϕ 1 , ϕ 2 , τ 2 | T , Σ J ) = 1 τ 2 σ j s , σ j s , σ j s Σ J ( σ j s ϕ 1 σ j s ϕ 2 σ j s ) σ j s ,
τ 2 log L ( ϕ 1 , ϕ 2 , τ 2 | T , Σ J ) = m 2 2 τ 2 + 1 2 ( τ 2 ) 2 σ j s , σ j s , σ j s Σ J ( σ j s ϕ 1 σ j s ϕ 2 σ j s ) 2 .
Solving these equations provides the MLEs for ϕ 1 , ϕ 2 , and τ 2 in a phylogenetic context.

6.2.3. PhyRateARMA(1,1) Model

Consider an ARMA(1, 1) model for a time series { σ j s } j s J , which can be written as:
σ j s = ϕ 1 σ j s + ϵ j s + θ 1 ϵ j s ,
where ϵ j s represents the corresponding white noise error terms with variance τ 2 l j s associated with the noise of the evolutionary process.
The likelihood function for the ARMA(1, 1) process, assuming that the errors are normally distributed, is given by the product of the probability densities of the innovations η t , which are the one-step forecast errors:
L ( ϕ , θ , τ 2 | T , Σ J ) = σ j s , σ j s Σ J 1 2 π τ 2 exp ( σ j s ϕ σ j s θ ε j s ) 2 2 τ 2 ,
log L ( ϕ , θ , τ 2 | T , Σ J ) = m 1 2 log ( 2 π ) m 1 2 log ( τ 2 )
1 2 τ 2 σ j s , σ j s Σ J ( σ j s ϕ σ j s θ ε j s ) 2 ,
The MLE estimates for ϕ , θ , and τ 2 are found by taking the partial derivatives of the log-likelihood function with respect to these parameters and setting them to zero:
log L ϕ = 1 τ 2 σ j s , σ j s Σ J ( σ j s ϕ σ j s θ ε j s ) σ j s = 0 ,
log L θ = 1 τ 2 σ j s , σ j s Σ J ( σ j s ϕ σ j s θ ε j s ) ε j s = 0 ,
log L τ 2 = m 1 2 τ 2 + 1 2 ( τ 2 ) 2 σ j s , σ j s Σ J ( σ j s ϕ σ j s θ ε j s ) 2 = 0 .
These equations generally require numerical optimization to solve because the innovations ( σ j s ϕ σ j s θ ε j s ) 2 depend on both ϕ and θ in a non-linear way. To estimate the parameters ϕ and θ , we use iterative numerical optimization methods that handle nonlinearity. Specifically, we apply L-BFGS-B method [36] by initialized with a starting point and boundary constraints, to maximize the likelihood function. This is done by the R function optim.

6.2.4. PhyRateARMA(2,2) Model

The ARMA(2,2) model is a mixture of autoregressive (AR) and moving average (MA) models, used to model and forecast time series data. It is specified as ARMA(p,q) where p is the order of the AR part and q is the order of the MA part.
An phylogenetic ARMA(2,2) rate model is given by the equation:
σ j s = ϕ 1 σ j s + ϕ 2 σ j s + θ 1 ϵ j s + θ 2 ϵ j s + ϵ j s
where σ j s represents the time series at time j s , c is a constant, ϕ 1 , ϕ 2 are coefficients for the AR terms, θ 1 , θ 2 are coefficients for the MA terms, and ϵ j s denotes the white noise error terms.
Similar to ARMA(1,1), to estimate the parameters ( ϕ 1 , ϕ 2 ) , ( θ 1 , θ 2 ) and τ , we again apply L-BFGS-B method [36] to maximize the likelihood function done by the R function optim.

6.3. Test Example

Figure 9. A rooted phylogenetic tree T of 8 taxa.
Figure 9. A rooted phylogenetic tree T of 8 taxa.
Preprints 140767 g009
Trait values in Table 18.
Table 18. Trait value.
Table 18. Trait value.
y 1 y 2 y 3 y 4 y 5 y 6 y 7 y 8
0.03 0.09 0.45 0.85 0.51 0.05 1.31 0.57
Table 19. Edge lengths obtained by tree T with branch lengths using ape package [37].
Table 19. Edge lengths obtained by tree T with branch lengths using ape package [37].
eg 1 eg 2 eg 3 eg 4 eg 5 eg 6 eg 7
0.143 0.143 0.143 0.143 0.286 0.286 0.143
eg 8 eg 9 eg 10 eg 11 eg 12 eg 13 eg 14
0.143 0.143 0.143 0.143 0.286 0.286 0.571
Use (10) (and package RRphylo) to obtain rate estimate { σ s } in Table 20.
Table 20. Rate estimates { σ ^ j s } from rrphylo.
Table 20. Rate estimates { σ ^ j s } from rrphylo.
nd 1 nd 2 nd 3 nd 4 nd 5 nd 6 nd 7 nd 8
0.320 0.034 0.053 0.070 1.247 0.178 0.445 0.003
nd 9 nd 10 nd 11 nd 12 nd 13 nd 14 nd 15
0.023 0.041 0.006 0.062 0.027 0.054 0.168
Table 21. Maximum likelihood for parameter estimates under each model.
Table 21. Maximum likelihood for parameter estimates under each model.
Model ϕ 1 ϕ 2 θ 1 θ 2 τ 2
AR(1) 0.296 0.130
AR(2) 0.303 0.017 0.141
ARMA(1,1) 0.582 0.396 0.125
ARMA(2,1) 0.087 0.669 0.396 0.125
ARMA(2,2) 0.046 0.629 0.120 0.276 0.125
ξ ˜ 0.322 0.070 0.126 0.008 0.130
Table 22. Model output. k number of parameter, AICc weight w.
Table 22. Model output. k number of parameter, AICc weight w.
Model k log L A I C c w Rank
AR1 2 8.66 13.33 0.60 1st
ARMA1 3 8.67 11.33 0.22 2nd
ARMA21 4 8.67 9.33 0.08 3th
AR2 3 7.52 9.04 0.07 4th
ARMA22 5 8.67 7.33 0.03 5th

6.4. Trait Dataset for Empirical Analysis

For plant data and primate data, the tree files in newick format as well as the trait dataset in csv files can be accessed at https://tonyjhwueng.info/phyarmarate/dataset

References

  1. Pettay, J.E.; Kruuk, L.E.; Jokela, J.; Lummaa, V. Heritability and genetic constraints of life-history trait evolution in preindustrial humans. Proceedings of the National Academy of Sciences 2005, 102, 2838–2843. [Google Scholar] [CrossRef] [PubMed]
  2. Holstad, A.; Voje, K.L.; Opedal, Ø.H.; Bolstad, G.H.; Bourg, S.; Hansen, T.F.; Pélabon, C. Evolvability predicts macroevolution under fluctuating selection. Science 2024, 384, 688–693. [Google Scholar] [CrossRef] [PubMed]
  3. Scott-Phillips, T.C.; Laland, K.N.; Shuker, D.M.; Dickins, T.E.; West, S.A. The niche construction perspective: a critical appraisal. Evolution 2014, 68, 1231–1243. [Google Scholar] [CrossRef] [PubMed]
  4. Soons, J.; Herrel, A.; Genbrugge, A.; Aerts, P.; Podos, J.; Adriaens, D.; De Witte, Y.; Jacobs, P.; Dirckx, J. Mechanical stress, fracture risk and beak evolution in Darwin’s ground finches (Geospiza). Philosophical Transactions of the Royal Society B: Biological Sciences 2010, 365, 1093–1098. [Google Scholar] [CrossRef]
  5. Podos, J.; Nowicki, S. Beaks, adaptation, and vocal evolution in Darwin’s finches. Bioscience 2004, 54, 501–510. [Google Scholar] [CrossRef]
  6. Xiang, Y.; Huang, C.H.; Hu, Y.; Wen, J.; Li, S.; Yi, T.; Chen, H.; Xiang, J.; Ma, H. Evolution of Rosaceae fruit types based on nuclear phylogeny in the context of geological times and genome duplication. Molecular biology and evolution 2017, 34, 262–281. [Google Scholar] [CrossRef]
  7. Thorne, J.L.; Kishino, H.; Painter, I.S. Estimating the rate of evolution of the rate of molecular evolution. Molecular biology and evolution 1998, 15, 1647–1657. [Google Scholar] [CrossRef]
  8. Uyeda, J.C.; Zenil-Ferguson, R.; Pennell, M.W. Rethinking phylogenetic comparative methods. Systematic Biology 2018, 67, 1091–1109. [Google Scholar] [CrossRef]
  9. Garamszegi, L.Z. Modern phylogenetic comparative methods and their application in evolutionary biology: concepts and practice; Springer, 2014.
  10. Cornwell, W.; Nakagawa, S. Phylogenetic comparative methods. Current Biology 2017, 27, R333–R336. [Google Scholar] [CrossRef]
  11. OMeara, B.; Ané, C.; Sanderson, M.; Wainwright, P. Testing different rates of continuous trait evolution using likelihood. Evolution 2006, 60, 922–933. [Google Scholar]
  12. Adams, D.C. A method for assessing phylogenetic least squares models for shape and other high-dimensional multivariate data. Evolution 2014, 68, 2675–2688. [Google Scholar] [CrossRef] [PubMed]
  13. Maddison, W.P.; Midford, P.E.; Otto, S.P. Estimating a binary character’s effect on speciation and extinction. Systematic biology 2007, 56, 701–710. [Google Scholar] [CrossRef] [PubMed]
  14. Uyeda, J.C.; Hansen, T.F.; Arnold, S.J.; Pienaar, J. The million-year wait for macroevolutionary bursts. Proceedings of the National Academy of Sciences 2011, 108, 15908–15913. [Google Scholar] [CrossRef] [PubMed]
  15. Gingerich, P.D. Rates of evolution on the time scale of the evolutionary process. Microevolution rate, pattern, process 2001, 127–144. [Google Scholar]
  16. Felsenstein, J. Phylogeny and the comparative method. America Naturalist 1985, 125, 1–15. [Google Scholar] [CrossRef]
  17. Jhwueng, D.C.; Maroulas, V. Adaptive trait evolution in random environment. Journal of Applied Statistics 2016, 43, 2310–2324. [Google Scholar] [CrossRef]
  18. Jhwueng, D.C. Modeling rate of adaptive trait evolution using Cox–Ingersoll–Ross process: An Approximate Bayesian Computation approach. Computational Statistics & Data Analysis 2020, 145, 106924. [Google Scholar]
  19. Davis, C.C.; Latvis, M.; Nickrent, D.L.; Wurdack, K.J.; Baum, D.A. Floral gigantism in Rafflesiaceae. Science 2007, 315, 1812–1812. [Google Scholar] [CrossRef]
  20. Beaulieu, J.; Jhwueng, D.C.; Boettiger, C.; O’öMeara, B. Modeling stabilizing selection: expanding the Ornstein-Uhlenbeck model of adaptive evolution. Evolution 2012, 66, 2369–2383. [Google Scholar]
  21. Tsay, R.S. Analysis of financial time series; Vol. 543, John wiley & sons, 2005.
  22. Pham, H.T.; Yang, B.S. Estimation and forecasting of machine health condition using ARMA/GARCH model. Mechanical Systems and Signal Processing 2010, 24, 546–558. [Google Scholar] [CrossRef]
  23. Sakamoto, M.; Venditti, C. Phylogenetic non-independence in rates of trait evolution. Biology letters 2018, 14, 20180502. [Google Scholar] [CrossRef] [PubMed]
  24. Bollerslev, T. Generalized autoregressive conditional heteroskedasticity. Journal of econometrics 1986, 31, 307–327. [Google Scholar] [CrossRef]
  25. Engle, R. GARCH 101: The use of ARCH/GARCH models in applied econometrics. Journal of economic perspectives 2001, 15, 157–168. [Google Scholar] [CrossRef]
  26. Castiglione, S.; Serio, C.; Mondanaro, A.; Melchionna, M.; Carotenuto, F.; Di Febbraro, M.; Profico, A.; Tamagnini, D.; Raia, P. Ancestral state estimation with phylogenetic ridge regression. Evolutionary Biology 2020, 47, 220–232. [Google Scholar] [CrossRef]
  27. Castiglione, S.; Tesone, G.; Piccolo, M.; Melchionna, M.; Mondanaro, A.; Serio, C.; Di Febbraro, M.; Raia, P. A new method for testing evolutionary rate variation and shifts in phenotypic evolution. Methods in Ecology and Evolution 2018, 9, 974–983. [Google Scholar] [CrossRef]
  28. Song, G.; Zhu, L.; Gao, A.; Kong, L. Blockwise AICc and its consistency properties in model selection. Communications in Statistics-Theory and Methods 2021, 50, 3198–3213. [Google Scholar] [CrossRef]
  29. Zhang, J.; Yang, Y.; Ding, J. Information criteria for model selection. Wiley Interdisciplinary Reviews: Computational Statistics 2023, 15, e1607. [Google Scholar] [CrossRef]
  30. Henniges, M.C.; Johnston, E.; Pellicer, J.; Hidalgo, O.; Bennett, M.D.; Leitch, I.J. The Plant DNA C-Values Database: a one-stop shop for plant genome size data. In Plant Genomic and Cytogenetic Databases; Springer, 2023; pp. 111–122.
  31. Hedges, S.B.; Dudley, J.; Kumar, S. TimeTree: a public knowledge-base of divergence times among organisms. Bioinformatics 2006, 22, 2971–2972. [Google Scholar] [CrossRef]
  32. Sanderson, M.J. Estimating absolute rates of molecular evolution and divergence times: a penalized likelihood approach. Molecular biology and evolution 2002, 19, 101–109. [Google Scholar] [CrossRef]
  33. Galán-Acedo, C.; Arroyo-Rodríguez, V.; Andresen, E.; Arasa-Gisbert, R. Ecological traits of the world’s primates. Scientific Data 2019, 6, 55. [Google Scholar] [CrossRef]
  34. Dewey, T.; Shefferly, N.; Havens, A. Animal Diversity Web. University of Michigan Museum of Zoology 2010. [Google Scholar]
  35. Dyer, M.A.; Martins, R.; da Silva Filho, M.; Muniz, J.A.P.; Silveira, L.C.L.; Cepko, C.L.; Finlay, B.L. Developmental sources of conservation and variation in the evolution of the primate eye. Proceedings of the National Academy of Sciences 2009, 106, 8963–8968. [Google Scholar] [CrossRef] [PubMed]
  36. Andrei, N. Modern Numerical Nonlinear Optimization; Vol. 195, Springer, 2022.
  37. Paradis, E.; Schliep, K. ape 5.0: an environment for modern phylogenetics and evolutionary analyses in R. Bioinformatics 2019, 35, 526–528. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Trajectories of Brownian motion using two different rates (left: σ = 1 , right: σ = 4 ). The time space is set to 100. The histograms are plotted using the endpoints of the trajectories.
Figure 1. Trajectories of Brownian motion using two different rates (left: σ = 1 , right: σ = 4 ). The time space is set to 100. The histograms are plotted using the endpoints of the trajectories.
Preprints 140767 g001
Figure 2. Trajectories of trait evolution for 4 species A , B , C , D along a rooted phylogenetic tree. Middle panel: four taxa rooted tree of 4 tips A , B , C , D . Left panel: a set of four dependent trajectories along the tree using a single rate ( σ t = σ on all branches). Right panel: a set of four dependent trajectories along the tree using two rates ( σ t = σ 1 on the blue branches, and σ t = σ 2 on the red branches). μ is the root status denoted as a parameter of interest (analogous to y 0 ).
Figure 2. Trajectories of trait evolution for 4 species A , B , C , D along a rooted phylogenetic tree. Middle panel: four taxa rooted tree of 4 tips A , B , C , D . Left panel: a set of four dependent trajectories along the tree using a single rate ( σ t = σ on all branches). Right panel: a set of four dependent trajectories along the tree using two rates ( σ t = σ 1 on the blue branches, and σ t = σ 2 on the red branches). μ is the root status denoted as a parameter of interest (analogous to y 0 ).
Preprints 140767 g002
Figure 3. A rooted phylogenetic tree of three taxa with tip nodes A, B, and C, internal nodes D and E, and root node O, where the branch lengths are l A , l B , l C , l D , and l E , and the rate variables are σ A , σ B , σ C , σ D , and σ E .
Figure 3. A rooted phylogenetic tree of three taxa with tip nodes A, B, and C, internal nodes D and E, and root node O, where the branch lengths are l A , l B , l C , l D , and l E , and the rate variables are σ A , σ B , σ C , σ D , and σ E .
Preprints 140767 g003
Figure 4. The phylogenetic ARMA ( 1 , 1 ) rate model for the rate of evolution along the tree. The root status O starts with σ 0 , along the tree the error of the rate estimate. The ARMA rates are bond to the tree toplogy’s ancestral-descendant relationship. For instance, the rate σ A at tip node A has the relationship with the ancestor node D of σ A = ϕ σ A + θ σ D + ϵ A where ϵ A N ( 0 , τ 2 l A ) while the rate σ D at internal node D has the relationship with the ancestor node E of σ E = ϕ σ E + θ σ 0 + ϵ D where ϵ D N ( 0 , τ 2 l D ) .
Figure 4. The phylogenetic ARMA ( 1 , 1 ) rate model for the rate of evolution along the tree. The root status O starts with σ 0 , along the tree the error of the rate estimate. The ARMA rates are bond to the tree toplogy’s ancestral-descendant relationship. For instance, the rate σ A at tip node A has the relationship with the ancestor node D of σ A = ϕ σ A + θ σ D + ϵ A where ϵ A N ( 0 , τ 2 l A ) while the rate σ D at internal node D has the relationship with the ancestor node E of σ E = ϕ σ E + θ σ 0 + ϵ D where ϵ D N ( 0 , τ 2 l D ) .
Preprints 140767 g004
Figure 5. Test heterogeneity rates for the trait evolution on the two sub clade of the tree. Test for two ϕ ’s ( ϕ 1 , ϕ 2 ) vs. one ϕ .
Figure 5. Test heterogeneity rates for the trait evolution on the two sub clade of the tree. Test for two ϕ ’s ( ϕ 1 , ϕ 2 ) vs. one ϕ .
Preprints 140767 g005
Figure 6. Power Curve for varying Taxa levels. The horizontal axis represents an adjusted value of a parameter called ϕ a , ranging from 0 to near 1. The vertical axis depicts the statistical power, indicating the probability of rejecting a null hypothesis when an alternative is true; a range from 0 suggests no detection ability, while 1 signifies certain detection. The four curves represent different Taxa counts ( 16 , 32 , 64 , 128 ) , which could indicate distinct sample sizes or biological classifications. The horizontal lines present the type I error rate using level 0.05 .
Figure 6. Power Curve for varying Taxa levels. The horizontal axis represents an adjusted value of a parameter called ϕ a , ranging from 0 to near 1. The vertical axis depicts the statistical power, indicating the probability of rejecting a null hypothesis when an alternative is true; a range from 0 suggests no detection ability, while 1 signifies certain detection. The four curves represent different Taxa counts ( 16 , 32 , 64 , 128 ) , which could indicate distinct sample sizes or biological classifications. The horizontal lines present the type I error rate using level 0.05 .
Preprints 140767 g006
Figure 7. Left: herbaticus tree. Right: woody specie midddle: combined herbaticus and woody species. The herbaticus and woody trees are obtained by TimeTree [31] where species names are entered and the system generate the tree in Newick format, and the combined tree is obtained by using R ape package function bind.tree with applying the molecular dating with penalized likelihood approach [32] for branch length estimation. This is done by using the R function chronopl for taking the herbaticus tree and woody tree as input.
Figure 7. Left: herbaticus tree. Right: woody specie midddle: combined herbaticus and woody species. The herbaticus and woody trees are obtained by TimeTree [31] where species names are entered and the system generate the tree in Newick format, and the combined tree is obtained by using R ape package function bind.tree with applying the molecular dating with penalized likelihood approach [32] for branch length estimation. This is done by using the R function chronopl for taking the herbaticus tree and woody tree as input.
Preprints 140767 g007
Figure 8. Left: tree with 54 nocturnal primate. Right: tree with 28 diurnal primate. Middle: combined phylogenetic tree of 88 primate species [33,35].
Figure 8. Left: tree with 54 nocturnal primate. Right: tree with 28 diurnal primate. Middle: combined phylogenetic tree of 88 primate species [33,35].
Preprints 140767 g008
Table 1. The type I error (T1 err) rate and the power of tree AR(1) rate on the null H 0 : ϕ = 0 using 4 type of tree and 4 taxa size. The alternative ϕ a are set to values 0.05 , 0.2 , 0.35 , 0.5 , 0.65 , 0.8 , 0.95 , respectively. 1000 replicates are used and the significance level α is set to 0.05 .
Table 1. The type I error (T1 err) rate and the power of tree AR(1) rate on the null H 0 : ϕ = 0 using 4 type of tree and 4 taxa size. The alternative ϕ a are set to values 0.05 , 0.2 , 0.35 , 0.5 , 0.65 , 0.8 , 0.95 , respectively. 1000 replicates are used and the significance level α is set to 0.05 .
Tree type Taxa T1 err 0.05 0.20 0.35 0.50 0.65 0.80 0.95
Balanced 16 0.066 0.050 0.211 0.481 0.746 0.924 0.970 0.991
32 0.041 0.084 0.355 0.792 0.965 0.997 1.000 1.000
64 0.043 0.081 0.616 0.970 1.000 1.000 1.000 1.000
128 0.055 0.124 0.903 1.000 1.000 1.000 1.000 1.000
Pure Birth 16 0.056 0.057 0.184 0.443 0.750 0.907 0.971 0.997
32 0.057 0.071 0.343 0.807 0.971 0.998 1.000 1.000
64 0.058 0.086 0.642 0.965 0.999 1.000 1.000 1.000
128 0.037 0.125 0.884 1.000 1.000 1.000 1.000 1.000
Birth Death 16 0.058 0.064 0.197 0.490 0.748 0.924 0.969 0.996
32 0.061 0.074 0.313 0.766 0.968 0.997 1.000 1.000
64 0.057 0.085 0.604 0.971 1.000 1.000 1.000 1.000
128 0.053 0.118 0.861 1.000 1.000 1.000 1.000 1.000
Balanced 16 0.054 0.075 0.210 0.465 0.753 0.910 0.980 0.990
32 0.043 0.074 0.346 0.763 0.963 0.996 1.000 1.000
64 0.053 0.085 0.611 0.968 1.000 1.000 1.000 1.000
128 0.066 0.164 0.893 1.000 1.000 1.000 1.000 1.000
Table 2. MLE parameter estimated for the combined hebaticus (35) and woody (33) 68 species.
Table 2. MLE parameter estimated for the combined hebaticus (35) and woody (33) 68 species.
ϕ 1 ϕ 2 θ 1 θ 2 τ 2
AR(1) 0.095 4.732
AR(2) 0.108 0.083 4.748
ARMA(1,1) 0.600 0.738 4.663
ARMA(2,1) 0.046 0.554 0.738 4.663
ARMA(2,2) 0.769 0.168 0.869 0.131 4.663
Weighted Estimate 0.050 0.079 0.202 0.003 4.722
Table 3. Model information for the combined herbaticus and woody 88 primate species.
Table 3. Model information for the combined herbaticus and woody 88 primate species.
Model k log L A I C c w Rank
AR(2) 3 265.940 537.870 0.530 1st
AR(1) 2 267.920 539.850 0.200 2nd
ARMA(1,1) 3 267.040 540.070 0.180 3rd
ARMA(2,1) 4 267.040 542.070 0.070 4th
ARMA(2,2) 5 267.040 544.070 0.020 5th
Table 4. Herbaticus species model parameter estimates.
Table 4. Herbaticus species model parameter estimates.
ϕ 1 ϕ 2 θ 1 θ 2 τ 2
AR(1) 0.470 0.009
AR(2) 0.451 0.019 0.009
ARMA(1,1) 0.491 0.024 0.009
ARMA(2,1) 0.167 0.163 0.150 0.009
ARMA(2,2) 0.128 0.609 0.476 0.480 0.009
Weighted Estimate 0.427 0.030 0.021 0.014 0.009
Table 5. Herbaticus species model results.
Table 5. Herbaticus species model results.
Model k log L A I C c w Rank
AR(1) 2 57.270 110.550 0.610 1st
ARMA(1,1) 3 57.270 108.550 0.220 2nd
AR(2) 3 55.840 105.690 0.050 4th
ARMA(2,1) 4 57.250 106.500 0.080 3rd
ARMA(2,2) 5 57.270 104.550 0.030 5th
Table 6. Woody species model parameter estimates.
Table 6. Woody species model parameter estimates.
ϕ 1 ϕ 2 θ 1 θ 2 τ 2
AR(1) 0.013 0.002
AR(2) 0.017 0.302 0.003
ARMA(1,1) 0.359 0.446 0.002
ARMA(2,1) 0.269 0.315 0.283 0.003
ARMA(2,2) 0.223 0.714 0.207 0.450 0.002
Weighted Estimate 0.023 0.010 0.044 0.005 0.002
Table 7. Woody species model results.
Table 7. Woody species model results.
Model k log L A I C c w Rank
AR(1) 2 98.390 192.770 0.860 1st
ARMA(1,1) 3 97.360 188.710 0.110 2nd
ARMA(2,2) 5 97.160 184.310 0.010 3rd
ARMA(2,1) 4 95.960 183.910 0.010 4th
AR(2) 3 93.440 180.880 0.010 5th
Table 8. H 0 : ϕ = 0 , H a : ϕ 0 .
Table 8. H 0 : ϕ = 0 , H a : ϕ 0 .
Tree case z-val p-val significant ?
combined tree 1.119 0.263 N
herbaticus tree 5.100 0.000 Y
woody tree 0.121 0.903 N
Table 9. Testing heterogeneity rates on subclades. H 0 : ϕ 1 = ϕ 2 , H a : ϕ 1 ϕ 2 .
Table 9. Testing heterogeneity rates on subclades. H 0 : ϕ 1 = ϕ 2 , H a : ϕ 1 ϕ 2 .
Model log L _di log L _no log L _dino LRT χ 2 p-val Significant?
AR(1) 57.27 98.39 267.92 847.16 0.00 Y
AR(2) 55.84 93.44 265.94 830.44 0.00 Y
ARMA(1,1) 57.27 97.36 267.04 843.34 0.00 Y
ARMA(2,1) 57.25 95.96 267.04 840.50 0.00 Y
ARMA(2,2) 57.27 97.16 267.04 842.94 0.00 Y
Table 10. MLE parameter estimated for the combined dirunal and nocturnal 88 primate species.
Table 10. MLE parameter estimated for the combined dirunal and nocturnal 88 primate species.
ϕ 1 ϕ 2 θ 1 θ 2 τ 2
AR(1) 0.372 1.170
AR(2) 0.336 0.063 1.169
ARMA(1,1) 0.875 0.652 1.103
ARMA(2,1) 0.784 0.091 0.652 1.103
ARMA(2,2) 0.468 0.406 0.225 0.427 1.103
Weighted Estimate 0.779 0.057 0.566 0.034 1.109
Table 11. Model information for the combined dirunal and nocturnal 88 primate species.
Table 11. Model information for the combined dirunal and nocturnal 88 primate species.
Model k log L A I C c w Rank
ARMA(1) 3 205.540 417.080 0.610 1st
ARMA(2,1) 4 205.540 419.080 0.230 2nd
ARMA(2,2) 5 205.540 421.080 0.080 3rd
AR(2) 3 208.060 422.120 0.050 4th
AR(1) 2 209.640 423.280 0.030 5th
Table 12. Diurnal species model parameter estimates.
Table 12. Diurnal species model parameter estimates.
ϕ 1 ϕ 2 θ 1 θ 2 τ 2
AR(1) 0.422 0.129
AR(2) 0.526 0.206 0.131
ARMA(1,1) 0.138 0.730 0.127
ARMA(2,1) 0.004 0.134 0.730 0.127
ARMA(2,2) 0.008 0.130 0.128 0.858 0.127
Weighted Estimate 0.260 0.050 0.222 0.026 0.127
Table 13. Diurnal species model results.
Table 13. Diurnal species model results.
Model k log L A I C c w Rank
AR(1) 2 20.840 45.680 0.480 1st
ARMA(1,1) 3 20.580 47.170 0.230 2nd
AR(2) 3 20.850 47.710 0.170 3rd
ARMA(2,1) 4 20.580 49.170 0.080 4th
ARMA(2,2) 5 20.580 51.170 0.030 5th
Table 14. Nocturnal species model parameter estimates.
Table 14. Nocturnal species model parameter estimates.
ϕ 1 ϕ 2 θ 1 θ 2 τ 2
AR(1) 0.213 0.003
AR(2) 0.230 0.080 0.004
ARMA(1,1) 0.593 0.901 0.004
ARMA(2,1) 0.647 0.075 1.000 0.004
ARMA(2,2) 0.107 0.610 0.243 0.757 0.004
Weighted Estimate 0.181 0.002 0.037 0.000 0.003
Table 15. Nocturnal species model results.
Table 15. Nocturnal species model results.
Model k log L A I C c w Rank
AR(1) 2 129.730 255.450 0.930 1st
AR(2) 3 127.350 248.710 0.030 2nd
ARMA(1,1) 3 127.340 248.690 0.030 3th
ARMA(2,1) 4 127.290 246.580 0.010 4th
ARMA(2,2) 5 123.660 237.320 0.000 5th
Table 16. H 0 : ϕ = 0 , H a : ϕ 0 .
Table 16. H 0 : ϕ = 0 , H a : ϕ 0 .
Tree case z-val p-val significant ?
Diurnal tree 4.237 0.000 Y
Nocturnal tree 2.763 0.006 Y
Combined tree 4.229 0.000 Y
Table 17. Testing Heterogeity Rates on subclade. H 0 : one ϕ 1 = ϕ 2 , H a : ϕ 1 ϕ 2 .
Table 17. Testing Heterogeity Rates on subclade. H 0 : one ϕ 1 = ϕ 2 , H a : ϕ 1 ϕ 2 .
Model log L _di log L _no log L _dino LRT χ 2 p-val Significant?
AR(1) 20.84 129.73 209.64 637.06 0.00 Y
AR(2) - 20.85 127.35 208.06 629.12 0.00 Y
ARMA(1,1) 20.58 127.34 205.54 624.60 0.00 Y
ARMA(2,1) 20.58 127.29 205.54 624.50 0.00 Y
ARMA(2,2) 20.58 123.66 205.54 617.24 0.00 Y
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permit the free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Prerpints.org logo

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

Subscribe

Disclaimer

Terms of Use

Privacy Policy

Privacy Settings

© 2025 MDPI (Basel, Switzerland) unless otherwise stated