Valuing the future, discounting in random environments: A review

: We address the process of discounting in random environments which allows to value 1 the far future in economic terms. We review several approaches to the problem regarding different 2 well-established stochastic market dynamics in the continuous-time context and include the 3 Feynman-Kac approach. We also review the relation between bond pricing theory and discount 4 and introduce the market price of risk and the risk neutral measures from an intuitive point of 5 view devoid of excessive formalism. We provide the discount for each economic model and 6 discuss their key results. We ﬁnally present a summary of our previous empirical studies on 7 several countries of the long-run discount problem. 8 three to ten pertinent keywords speciﬁc to the 9 article; yet reasonably common within the subject discipline.)


Introduction
The introduction around three decades ago of the view and methods of statistical 12 physics into economics and finance signaled the appearance of a new interdisciplinary 13 aspect of physics which is sometimes called "econophysics" [1][2][3]. The fact that financial 14 prices are random with sudden and uncontrollable ups and downs has been known 15 since very long, but the first step towards a systematic mathematical analysis of price 16 randomness was taken by Bachelier in 1900 who proposed a model for the market 17 dynamics in which prices follow the ordinary Brownian motion [4]. However, Bachelier's 18 model is not completely satisfactory because in such a representation prices can be either 19 positive or negative, contradicting one of the most fundamental principles of economics, 20 the "principle of limited liability", which affirms that prices cannot attain negative 21 values. This limitation in Bachelier's model was remedied more than six decades later 22 by Osborne [5] by assuming that prices obey the geometric Brownian motion where 23 prices are described by the exponential of the ordinary Brownian motion and, hence, 24 they cannot ever attain negative values. 25 Let us denote by S(t) a speculative price (or an economic index) at time t. In the continuous time framework, the geometric Brownian motion assumes that where S 0 = S(t 0 ) is the price at some initial time t 0 and x(t), the so-called return, is described by the ordinary Brownian motion, that is to say, by the stochastic differential equation ronmental problems in the long run. These reasons include, among others, ethical 79 considerations [21,22], impatience, economic growth [23] and arguments based on the 80 maximization of utility functions that are mostly chosen for mathematical convenience 81 [24], all of them ingrained in a phenomenological expression called the Ramsey formula 82 [25] which constitutes the standard approach to discounting in the economics litera-83 ture (particularly in the long-run) [14]. From an empirical point of view, any practical 84 economist involved in environmental debates might try to use, as the forward discount 85 rate, the average of historical interest rates which occurred in the last 200 hundred years 86 (2.7 % in stable countries [26]), or take the average of Wall Street forward looking models 87 which price bonds of maturity as long as 30 years. Unfortunately, due to historical fluc- 88 tuations of real interest rates, the appropriate rate is considerably below such averages 89 [26]. 90 In econophysics the problem of discounting, despite its relevance, is virtually This starting phenomenological law is built on the empirical observation that the bigger 96 M(t), the greater its variation at a given time along with the simpler assumption that 97 such a variation is linear in M(t) and not, for instance, quadratic. 98 2.1. Definitions and general setting 99 We define the interest rate as the relative time derivative i.e., the rate is the derivative of the logarithm of wealth. Let us incidentally note that 100 linearity shown in Eq. (4) is equivalent to assume that r(t) is independent of M(t).

101
In the simplest case the law (4) represents a direct proportionality, that is to say, r(t) is constant and and from (5) we see that where r is the constant interest rate which has units of 1/(time) (wealth is assumed to be dimensionless). By direct integration we have the usual exponential law [12] M(t) = e r(t−t 0 ) M(t 0 ), Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 16 November 2021 which connects wealth at some time t 0 , for instance, today (which, without loss of 102 generality, we can take equal to zero) with wealth at some future time t > t 0 . 103 The growth law (6) appears in many branches of natural and social sciences. Thus in radioactivity if N(t) represents the number of active nuclei at time t the usual hypothesis is that this number decreases as where λ > 0 is the decay constant. Similar considerations apply to many other situations, 104 as in chemical reactions or population dynamics, just to name just a few. 105 As we have mentioned above, discounting is the procedure of linking wealth at different times. This is done through the discount function defined as where M(0) is today's wealth. In the case of constant rates we see from Eq. (7) that this function is given by the decreasing exponential The assumption of constant rates is actually unrealistic. A first generalization would be to assume that rates are known functions of time r(t). In such a case, the growth law (6) would be given by which after integrating yields However, the assumption of rates being given by constants or by deterministic functions of time is unjustified, specially over long periods of time. Financial interest rates are typically described as random, as the many models for stochastic interest rates appearing in the literature show [13,30,31]. In other words, r(t) is a random function of time and in consequence the discount function δ(t) given by Eq. (11) is a stochastic process. The effective discount function is then defined as the average of δ(t), taken over all possible realization of r(t). The function r(t) can, in principle, be any random process. However, the most natural and simplifying assumption is that rates are Markovian processes with continuous paths, that is, rates are diffusion processes [13]. Therefore, rates are solutions to stochastic differential equations of the form where W(t) is the Wiener process and the equation is interpreted in the Itô sense. Note 106 that we assume that drift f (r) and noise intensity g(r) do not depend explicitly on time, 107 that is to say, the time dependence is only implicit through r = r(t) which means that the 108 interest rate process is time homogeneous and may be stationary [32]. This is certainly In order to get an operative expression for the effective discount function (12) we define the additional random process The interpretation of x(t) is apparent after substituting Eq. (5) into Eq. (14) and integrating. We find which can be taken as an alternative definition of the accumulated return x(t).

112
Substituting Eq. (14) into Eq. (12) we see that the effective discount function can be written as which implies that, in terms of the probability density function (PDF) p(x, r, t|r 0 ) of the bidimensional diffusion process (x(t), r(t)), we can write where we have included the dependence on the initial rate, r 0 = r(0), in the discount 113 function D(t|r 0 ).
After solving the initial-value problem (17)- (18) and obtain the joint PDF p(x, r, t|r 0 ), 117 the discount function follows from Eq. (15). There are, however, two different approaches 118 for obtaining it. One of them, which is standard in the financial literature, is based on 119 the backward Fokker-Planck equation and it is called the Feynman-Kac approach [13]. A 120 second procedure is based on Fourier analysis [27]. We will explain next both approaches. Using this method one obtains a partial differential for the discount function D(t|r 0 ) 123 which is based on the backward Fokker-Planck equation for the joint density p(x, r, t|r 0 ).

124
In what follows we will assume that t 0 = 0 and denote x 0 = x(t 0 ). Note that by 125 definition x 0 = 0 (cf. Eq. (14)). However, we temporally keep x 0 = 0 and set x 0 = 0 at 126 the end of the calculation when needed.

127
The backward FPE for the PDF p(x, r, t|x 0 , r 0 , t 0 ) that corresponds to the bidimensional process (16) is [32] ∂p Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 16 November 2021 with final condition as t 0 → t, Let us observe that the problem (19)- (20) is invariant under translations of both time and x 0 . We thus define the new variables so that Note that under this change of variables, we also have where the last equality comes form the invariance under time and x translations, that is, Consequently the final condition (20) becomes the initial condition Having set the backward FPE in the form given by Eq. (22) we next obtain the 128 equation satisfied by the effective discount D(t|r 0 ). To this end, we multiply Eq. (22) by 129 e −x and integrate over x and r, we have On the other hand, integrating by parts the first integral on the right hand side of Eq. (24) and using (25) where we have taken into account the boundary condition (otherwise implicit in the definition of D given in Eq. (15)) lim x →±∞ e −x p(x , r, t |r 0 ) = 0.

132
An alternative method for obtaining the discount function is based on the joint characteristic function., that is, on the Fourier transform of the joint density: One of the chief advantages of working with the characteristic function is that obtaining the effective discount is straightforward. Indeed, comparison of Eq. (15), with Eq. (30) shows that Therefore, in order to obtain the discount function we only need to know the joint charac-133 teristic function of the bidimensional process (x, r). The procedure is quite advantageous 134 in linear cases. In a forthcoming section we will apply this procedure to some standard 135 models of interest rates. As we will see in the next section, the process of discounting just described is very closely related to an important problem in finance called bond pricing. In the context of bond pricing, there can be two kinds of investors. For one hand, if investors are risk neutral then bond prices can be modeled based on the data generating measure p which is the solution of the Fokker-Planck equation (17) with initial condition (18). This is sometimes called the Local Expectation Hypothesis (LEH) [35,36]. Nonetheless, a more general assumption is that investors are sensitive to risk. In such a case bonds are somewhat more accurately priced using an artificial density p * usually called risk-neutral (or risk-correcting) probability measure. Both magnitudes -the data generating measure p and the risk-neutral measure p * -are related through a quantity, denoted by q(r, t) and called market price of risk which, as described in the next section, is the extra return per unit of risk that investors demand to bear risk. This additional return is thus determined by a function q = q(r, t) that in its most general form may depend on the rate r and current time t, although the most usual assumption is that q = q(r) only depends on the rate [37]. Following a standard procedure for bond pricing [37,38] which we will present in Sect. III, one takes risk into account by replacing the drift f (r) by f * (r), and q(r) ≥ 0 is the market price of risk. The form of q(r) is, in principle, unknown and has to be conjectured. The simplest and most common assumption is that q(r) = q is constant, in such a case the value of q may be more easily estimated from empirical data. Note that now the risk-neutral measure p * (x, r, t|r 0 ) is given by the Fokker-Planck equation (17) with f (r) replaced by f * (r), that is, with the initial condition given by In an analogous way the discount function adjusted for risk will now be given by the Feynman-Kac equation (27) with f (r) replaced by f * (r). Or, using the Fourier method, the discount function will be given in terms of the risk-neutral characteristic function,p * (ω 1 , ω 2 , t|r 0 ), by (cf. Eq. (31)) and refer the interested reader to more specialized works for further information [13].

144
A bond is a financial instrument that one purchases now and provides a payment in the future. From a more technical point of view, we say that a (discount) bond is a default-free claim on a specified sum of money to be delivered at a given future date called maturity time. Such claims are bought and issued by investors. Let us denote by B(t 0 , t) the price at time t 0 of a discount bond maturing at time t ≥ t 0 , with unit maturity value, B(t, t) = 1.
Let us incidentally note that if the final maturity price is not 1 (say, B(t, t) = β) then the 145 (initial) price of the bond would be βB(t 0 , t).

146
Bonds are classified according to the time interval to maturity τ defined as Thus, if τ = 10 years we talk about a 10 year bond that is traded initially at t 0 (for instance, 147 today) with price B(t 0 , t 0 + 10) and which after 10 years has unit value. Similarly for a 3

149
The central question is to know the backward evolution of the bond price, from 150 unit maturity to the initial purchasing price B(t 0 , t). Note that the problem is virtually 151 identical to the problem of discounting discussed in Sect. II, with the sole difference 152 that in discounting we look for the forward evolution from a known initial value to an unknown final value, while in bond pricing the situation is reverse, since we know the 154 final value but not the initial one.

155
In order to proceed further we define the instantaneous rate of return b(t 0 , t) (also called forward rate) as the relative time variation of the bond price (compare with Eq. The knowledge of the forward rate b(t 0 , t) allows us to relate the initial price B(t 0 , t) and the maturing price B(t, t) = 1. Indeed, the integration of the above equation directly leads to The close analogy between bond pricing and discounting is now apparent. Indeed 156 the comparison of Eq. (37) with Eq. (11) shows that B(t 0 , t) is the equivalent of the 157 discount function δ(t) and that the forward rate b(t 0 , t) is the equivalent of the discount 158 rate r(t). However, in what follows, we will use the notation r(t) not for the forward 159 rate b(t 0 , t) but for the so-called spot rate (also called nominal rate) which we will define 160 in Eq. (39).

161
Another quantity of interest is the yield to maturity y(t 0 , τ) defined by From (37) we see that that is to say, the yield is the time average of the forward rate over the maturity period τ.

162
A final quantity is needed, the spot or nominal rate, which is defined as the limit of the yield when maturity tends to 0, 2 Solving the indeterminacy by expanding the integral in powers of τ, we see that the spot rate is given in terms of the forward rate by In other words, the spot rate is the instantaneous forward rate.

163
Let us finally note that a loan of amount M subscribed at time t 0 with an interest rate r(t 0 ) (the spot rate) will, at time t 0 + dt 0 , increase in value to M + dM, where Indeed, at any time t 0 , the value of the spot rate r(t 0 ) is the instantaneous increase of 164 the loan value, that is, r(t 0 ) = d ln M(t 0 )/dt 0 (compare with Eq. (36)). All of this clearly 165 heightens the close similarities with discounting mentioned above.

166
However, subsequent values of the spot rate are not necessarily certain. We will see 167 next the consequences of this fact on the time evolution of the bond price B(t 0 , t). Suppose the spot rate r(t 0 ) is not deterministic but random. In such a case, and analogously to discounting, the usual assumption is that r 0 = r(t 0 ) is a Markovian random process with continuous trajectories; that is, a diffusion process obeying a stochastic differential equation of the form where W(t 0 ) is the standard Wiener process. We have assumed that the drift and the 170 noise intensity are independent of time, thus the time dependence of these coefficients is 171 implicit through r 0 = r(t 0 ). We know that this implies invariance under time translations 172 and we can set t 0 = 0 when needed without loss of generality. 173 We will now obtain the time evolution of the bond price B(t 0 , t) from the purchasing time t 0 to maturity t and to this end we follow Oldrich Vasicek [37]. Let us first observe that the most natural hypothesis consists in assuming that the bond price B is a function of the initial spot rate r 0 = r(t 0 ) and write In this way B(t 0 , t|r 0 ) represents the price of a bond issued at time t 0 and maturing at time t, given that the initial interest rate is r 0 = r(t 0 ). The infinitesimal variation of the bond price is then defined by We expand in Taylor series up to second order Substituting for Eq. (42) and taking into account that dW(t 0 ) = O(dt 1/2 0 ) [32,33] we write However dW(t 0 ) 2 = dt 0 (in mean square sense) [32,33] and, up to first order in dt 0 , we obtain Defining and we see from (44) that the bond price satisfies the stochastic differential equation showing that the bond price is also a diffusion process.

175
Averaging Eq. (47) and recalling that E[dW(t 0 )] = 0 we see that which proves that µ(t 0 , t|r 0 ) is the average of the instantaneous rate of return [cf. Eq. ( 176 36)] at time t 0 on a bond with maturing date t, given that the current spot rate is r 0 . In 177 an analogous way one can easily show that σ 2 (t 0 , t|r 0 ) is the variance [37]. 178 We therefore see from the above development that the bond price is a random  Consider an investor who, at time t 0 , sells an amount M 1 of a bond maturing at time t 1 and at the same time buys an amount M 2 of another bond with a different maturing date t 2 . The total worth of the portfolio thus constructed is M = M 2 − M 1 . Note that each amount M i (i = 1, 2) is a multiple of the bond price B(t 0 , t i |r 0 ) (i = 1, 2) and, hence, they also obey the stochastic differential equation (47). That is, In consequence the infinitesimal variation dM = dM 2 − dM 1 of the worth of the portfolio 193 changes over time according to Suppose we choose the amounts M 1 and M 2 such that where and the random term in Eq. (48) vanishes. This renders the portfolio composed of such amounts of the two bonds instantaneously riskless: where µ i = µ(t 0 , t i |r 0 ). The rate of return r M of this portfolio is In order to avoid arbitrage opportunities -that is, making profits without taking any risk-the rate r M must be equal to the spot rate r 0 . If not, the portfolio can be purchased by taking funds borrowed at the spot rate, or otherwise sold and the profits lent out to accomplish a riskless arbitrage [37]. Therefore [compare also Eq. (41) with Eq. (50)] Rearranging terms we get (µ This equation is valid for arbitrary maturities t 1 , t 2 , . . . , it then follows that the ratio

196
Let us denote by q(t 0 |r 0 ) the common value of such a ratio for a bond of any maturity date, given that the current spot rate (at time t 0 ) is r 0 , The quantity q(t 0 |r 0 ) is called the market price of risk, as it gives the variation of the Note that if q = 0 the spot rate r 0 = r(t 0 ) and the average rate of return µ coincide.  The introduction of the market price implies a non-random bond price B = B(t 0 , t|r 0 ) which, in turn, allows a deterministic equation for the price. In effect, rewriting Eq. (51) as and substituting µ and σ by their definitions given in Eqs. (45) and (46) we have which, after rearranging terms, yields This equation, called the term structure equation, is a partial differential equation for B(t 0 , t|r 0 ), once we know the random character of the spot rate process r(t) (through its drift f and noise intensity g) and the market price of risk q(t 0 |r 0 ) is specified. Bond prices are thus obtained after solving the deterministic equation (52) with the final condition: Let us observe that the term structure equation (52) for the bond price B is identical to the Feynman-Kac equation (27) for the discount function D as long as we make the following change of drift On the other hand, as we have seen in Sec. 2, the solution of the Feynman-Kac equation (27) for the discount function D(t|r 0 ) is written as the average [cf. Eq. (15)] where p(x, r, t|r 0 , t 0 ) is the probability density function of the bidimensional diffusion process (x(t), r(t)) defined by Eq. (16), Now the analogy between the term structure equation (52) and the Feynman-Kac equation (29) suggests that we can write the bond price B(t 0 , t|r 0 ) as an average over the different realizations of the spot rate r(t 0 ). However, this averaging procedure is taken using a modified PDF called the risk-free measure. Thus, it can be proved in a more rigorous way that [31,37] where p * (x, r, t|r 0 , t 0 ) is the risk-free measure which is the PDF of the bidimensional 203 process (x(t 0 ), r(t 0 )) defined by the following pair of stochastic differential equations 204 that include the market price of risk [see Eq. (54)]: That is, p * is the solution to the FPE with the initial condition Since, as we have shown in Sect. 2.2, the Feynman-Kac approach to discounting is equivalent to the Fourier method described in Sect. 2.3, we can apply the latter to obtain directly the bond price knowing only the risk neutral PDF, without having to solve the Feynman-Kac equation (52) with condition (23). Indeed, the characteristic function of the risk neutral density p * is the joint Fourier transform which after comparing with Eq. (55) yields Finally, once we know the bond price, the yield to maturity y(t 0 , τ|r 0 ) (also called the term structure of interest rates) is readily evaluated from Eq. (38): The graphic representations of y(t 0 , τ|r 0 ) as a function of t 0 and for different values of the 206 maturity interval τ are called yield curves and are of prime importance for practitioners.  On the other hand there is a property of the market that seems to be well founded 223 on empirical grounds [6]. This is the property of mean reversion meaning that prices tend 224 to return to some fundamental value, called normal level, which is usually identified as  Before proceeding with the introduction of some standard models for the market 231 evolution, we will briefly explain the link between bonds and (real) interest rates.

232
Financial economists have developed a large number of models of interest rate 233 processes to enable them to price bonds and other cash flows. In these models interest

238
On the other hand, environmental economists are interested in the real behavior of the economic growth over longer horizons, in contrast to financial economists who are typically more interested in nominal rates over shorter periods of time. The behavior of real and nominal rates usually differ because due to inflation real rates can take on negative values. In this way real rates r(t) are generally defined by the so-called Fisher procedure: where i(t) is the inflation rate which is usually generated from consumer price indexes as we will explain in the next section. The quantity n(t) represents nominal rates which are typically constructed out of government bonds and are usually positive (even tough in recent years, nominal rates have taken slightly negative values). In order to explain the close relationship between nominal rates and bonds, let us first recall that nominal rates were called spot rates in the previous section where we used there the notation r(t) instead of n(t). We thus define the spot (i.e., nominal) rate as (see Eqs. (39)- (40)) where b(t , t) is the forward rate for bonds defined in Eq. (36) (see also Eq. (37)), that is where B(t , t) is the price at time t of a (government) bond maturing at time t ≥ t . Let us recall the definition of the yield to maturity y(t, τ) given in Eq. (38), and comparing with Eq. (62) we see that in terms of the yield nominal rates n(t) can be defined as Thus for empirical analysis the yield can be used as an estimator of the nominal rates: and the accuracy of such an estimator increases as τ → 0. We will return to this 239 discussion in the next section.

240
In this way, taking nominal rates corrected by inflation as a proxy of economic 241 growth, we have recently shown [26,28,29] through a detailed empirical study on many 242 countries that real interest rates are negative around 25 % of the time (see next section).

243
To understand how discounting depends on the random process used to characterize 244 interest rates we have focussed on three different models and obtained exact analytical 245 expressions for the discount function [27]. The three models describe to varying degree a 246 number of relevant characteristics observed in data, while being simple enough to allow 247 for complete analytical treatment. Their main results are summarized in Table 1.

248
The first model is based on the Ornstein-Uhlenbeck (OU) process -also called 249 Vasicek model in the financial literature [13]-which allows for negative rates and 250 is, therefore, suitable for pricing environmental problems. The model has a stationary 251 probability distribution and exhibits reversion to the mean, which means that the process 252 tends to return to its average stationary value. We will review this model below.

253
The second and third models we have considered are given by the Feller and 254 log-normal processes respectively. For these processes rates cannot be negative. The

255
Feller process -also known as the Cox-Ingersoll-Ross (CIR) model [41]-has reversion 256 to the mean and stationary probability distribution. It is one of the most popular 257 models in finance [13] and we have recently review the main properties of the Feller 258 process in previous works [27,40]. A third model, also implying positive rates, is the log-259 normal process (occasionally called Dotham model in the financial literature [43]). The shortcomings the log-normal process has also been used in the financial literature mainly 262 because is positive and allows for analytical treatment [13]. We refer the interested reader 263 to our previous work [27] for details on this model.

264
As remarked in the introduction we are primarily interested in valuing the far future 265 for environmental problems rather than the short time discount of finance, the latter  Table 1. Key statistical features for three standard models: the Vasicek (Ornstein-Uhlenbeck), the Cox-Ingersoll-Ross (Feller), and the Log-normal models. Average and variance are provided in terms of the model parameters to better compare the long run discount D(t) asymptotic behavior. The long run discount is provided showing an exponential decay with long-run rate of discount r ∞ for the Vasicek and the Cox-Ingersoll-Ross models, jointly with an specific combination of parameters in the Log-normal case (k 2 /2 < α, mild fluctuations). δ is defined in Eq. (113).
In this model, rates are described by the Ornstein-Uhlenbeck process [37], that is a diffusion model with linear drift and constant noise intensity: where r(t) is the rate and W(t) the Wiener process. The parameter m (the normal level) 272 is a mean value to which rates revert, k > 0 is the amplitude of fluctuations, and α > 0 is 273 the strength of the reversion to the mean. These parameters have to be estimated from 274 empirical data.
We seek the solution to this problem in the form of a Gaussian function: where A(ω 1 , t), B(ω 1 , t), and C(ω 1 , t) are unknown functions to be consistently determined. Substituting the ansatz (70) into Eq. (68), identifying like powers in ω 2 and taking into account Eq. (69), we find that these functions satisfy the following set of differential equationsȦ = −2αA − k 2 /2, A(ω 1 , 0) = 0; These equations are a set of linear ordinary differential equations which can be sequentially integrated, with the result and 276 C(ω 1 , t|r 0 ) = iω 1 r 0 1 We have thus proved that the joint characteristic functionp(ω 1 , ω 2 , t|r 0 ) is given by the Gaussian function (70) with the coefficients given above. Note that in terms of the bidimensional distribution (70) the distribution of the rate is given by the marginal PDF This results in the Gaussian densitỹ In the stationary state (t → ∞) we havẽ which proves that the normal level m is the stationary mean value, It can also be shown that the correlation function of the process, defined as the average (τ ≥ 0) in the stationary state reads [27] C(τ) = (k 2 /2α)e −ατ , which means that α −1 is the correlation time, τ c , of the rate. Indeed Let us observe that the volatility, σ 2 = C(0), is independent of the normal level and given by The discount function D(t|r 0 ) is also obtained from Eqs. (70) -(73) although in this 277 case after setting ω 1 = −i and ω 2 = 0 (cf. Eq. (31)). We have which after rearranging terms can be written as where r 0 = r(0) is the initial rate. Note that as t → ∞ (in fact when t α −1 , i.e., for times much grater than the correlation time α −1 ) Eq. (79) shows at once that the discount function of the Vasicek model has the typical exponential decay where is the long-run discount rate. Let us note that the long-run rate can be defined as the limit as long as the limit exists. Let us also note the important fact that r ∞ is smaller than the 279 mean value of the return given by the normal level m. This reduction is quantified by the 280 "noise-to-signal" ratio k/α, which means that either a long persistence (recall that this is 281 equivalent to long correlation time, i.e., α small) or an increase of the noise fluctuations 282 (i.e., k large) reduce the long-run discount rate as compared with the average rate. 283 We finally easily see from Eq. (79) that as t → 0 the discount function approxi- As mentioned above risk aversion is taken into account by introducing the market price of risk q(r) and changing drift according to Eq. (32). For the Vasicek model, in which f (r) = −α(r − m) and g(r) = k, we have and assuming q(r) = q to be a constant independent of r, we write where Since the modified drift f * (r) has the same form that f (r) we conclude that the adjustedfor-risk discount function will be given by Eq. (79) after the replacement m → m * . In particular, the adjusted long-run discount now reads [cf. Eq. (81)] We thus see that the long-run discount depends on the historical rate m, but this is shifted by two terms. The first term raises the long-run rate due to the market price of risk. The second shift lowers it by an amount given by the ratio of uncertainty (as measured by k) and persistence (as measured by α). We rewrite Eq. (86) as This makes it clear that whether or not the overall shift in the long-run discount rate is 288 positive or negative depends on the size of the market price of risk and on the noise-to-289 signal ratio between the volatility parameter and the reversion rate.

290
It is not surprising that the market price of risk raises the long term rate, but it is 291 not so obvious that uncertainty and persistence can lower it. Indeed for the Orstein-  In the financial literature one of the most accepted models for interest rates is the Cox-Ingersoll-Ross (CIR) model [41] where rates follow the Feller process described by drift and noise intensity given respectively by [42] The Feller model is thus a diffusion process described by the stochastic differential equation where W(t) is the standard Wiener process and, as in the OU process, m > 0 represents 307 the mean stationary rate (the normal level) and α −1 is the correlation time [27]. Let us note 308 that since the diffusion coefficient in one-dimensional diffusions is given by the square properties for the process [40]. process remains always positive. 319 We have reviewed rather thoroughly the properties of the Feller process and refer 320 the reader to [27] and [40] for a more detailed information.

321
The process is not Gaussian and the stationary PDF as t → ∞ is the Gamma distribution [27] where is a positive and dimensionless constant which combines all the parameters of the model 322 into a single expression. As mentioned above a major characteristic of the Feller process 323 is that r(t) cannot attain negative values which makes the model a convenient tool for 324 pricing bonds which are never negative [13].
Equation (94) is a linear partial differential equation of first order whose solution can be obtained by the method of characteristics and we refer the interested reader to our work [27] for a detailed information. Once we know the solutionp(ω 1 , ω 2 , t|r 0 ), the discount function is then obtained through Eq. (31) with the result [27] where θ is defined in Eq. (91) and Notice that λ > α and the time scale represented by λ −1 is smaller than the correlation 326 time α −1 .

327
In this case the long-run discount rate, defined by the limit (cf. Eq. (82)) is directly obtained from Eq. (96) with the result and, as in the Vasicek model, the effective discount reduces to the expected exponential decay Substituting into Eq. (98) the expressions for θ and λ given in Eqs. (91) and (97) we write which clearly shows that the long-run discount rate is always smaller than the stationary average rate: r ∞ < m.
where q(r) is the market price of risk as discussed in the previous section. For any function q(r) (including a constant market price of risk q) this adjusted drift leads to an unsolvable Fokker-Planck equation with no analytical expression for the adjusted discount and the long-run discount rate. It is, nonetheless, possible to get analytical expressions for these quantities if the market price of risk has the following functional form where q ≥ 0 is a positive quantity. In such a case we may write where The adjusted drift has the same form than f (r). Therefore, the adjusted discount function will be given Eq. (96) with the replacements α → α * and m → m * and the long-run discount is [cf. Eq. (100)] From the definitions of α * and m * we easily see that α * ≤ α and α * m * = αm. Hence, writing r * as  In this model rates are described by the the geometric Brownian motion (log-normal process) and the model is determined by the stochastic differential equation where r is the interest rate, α and k are constant parameters, α may be positive or negative whereas k is always positive and W(t) is the standard Wiener process. Equation (106) can be integrated at once yielding showing that r(t) is never negative (r 0 > 0). Therefore, the log-normal model is more suited for modeling nominal interest rates and bonds in finance than for the long-run real rates of environmental economics. Contrary to OU and Feller processes, the log-normal process does not show reversion to the mean. Indeed, as t increases we see from Eq. ( 107) that the rate either diverges when α > 0 or goes to zero if α < 0. In an equivalent way one can also show from Eq. (107) that the mean and variance of the process are [27] r(t) = r 0 e αt , Var[r(t)] = r 2 0 e 2αt e k 2 t − 1 .
The discount associated with the log-normal process model was studied in 1978 by L. U. Dothan [43] and in finance it is sometime refereed to as the Dothan model. Because it allows for analytical treatment it is one of the models used in the literature [13]. For this model the FPE for the joint density of the discounting processes (x(t), r(t)) is given by (cf. Eq. (17)) ∂p ∂t with the usual initial condition given by Eq. (18). The Fourier transform of this expression leads to the following equation for the characteristic functionp(ω 1 , ω 2 , t|r 0 ) and the initial condition (93). Eq. (108) is a partial differential equation of second order which cannot be solved by the method of characteristics and we refer the interested reader to our work [27] for a detailed information on how to solve Eq. (109) using the time-Laplace transform. Hence -and contrary to Vasicek and CIR models where it is possible to obtain exact expressions for the discount function D(t)for the log-normal case we can only achieve the exact expression of its Laplace transform, The resulting formula -written as an integral of special functions, the Kummer functionis rather intricate and we will not write it here (see [27] for more information). However, from the exact expression forD(s) we can get asymptotic expressions as t → ∞ of the discount function D(t) in real time. This is done using the so-called Tauberian theorems which relate the small s behavior ofD(s) with the long-time behavior of D(t) [44,45]. The final result is the following asymptotic expression for the discount function D(t) in the long run as t → ∞ [27] The asymptotic form of the discount function thus depends on the values taken 335 by the ratio α/k 2 between the strength of the constant deterministic drift α and the 336 amplitude of fluctuations given by k 2 /2 (which can be considered the "signal-to-noise 337 ratio" of this model).

338
(i) The case k 2 /2 > α corresponds to strong fluctuations, where the noise intensity 339 k 2 /2 is greater than the drift parameter α. In this case the discount tends to a constant 340 value (for the actual value of this constant see [27]).

341
(ii) The case k 2 /2 < α corresponds to mild fluctuations for which the deterministic drift is stronger than noise. In such a case the discount function has the expected exponential decay with a long-run rate of discount given by [27] where δ > 1 is a positive numerical factor which only depends on the ratio 2α/k 2 and reads where ψ(·) is the digamma function.

342
Let us write Eq. (111) in a more characteristic form. Indeed, from Eq. (107) we see that and with the help of Eq. (111) we write Eq. (111) as (t → ∞ and k 2 /2 < α). Note that the average E[ln r(t)/r 0 ] is what a practitioner would 343 take as an estimate of the discount rate up to time t within the log-normal model. Since 344 δ > 1, the analytical result (114) shows that the actual long-run rate of the model is a 345 fraction of the average rate. We have shown elsewhere that the long-run discount rate is 346 at most 73 % of the average rate [26]. In this way when 2α/k 2 > 1 the log-normal model 347 follows a similar pattern to that of the OU and Feller models: In all of them the long-run 348 rate is smaller than the average rate. This general statement is in fact a direct consequence 349 of Jenesen's inequality, which states that the average of a convex function is greater than 350 or equal to the function of the average; that is, ). Assuming f to be the 351 decreasing exponential and X the cumulative process x(t) defined in Eq. (14), it follows 352 immediately that the long-run rate r ∞ must be always less than or equal to the average 353 rate. Nonetheless, our procedure quantifies the difference among averages [27].   Assuming a constant market price of risk, q(r) = q ≥ 0, we have Again f * (r) has the same form than f (r) and all previous results will apply after making 364 the replacement α → α + q.

366
In order to choose an appropriate model for rates, which would allow us to obtain 367 realistic long-run discount functions, we performed a rather complete empirical study 368 on interest rates combined with inflation. Our study follows the line partly initiated by 369 Newell and Pizer [48] (see also [49]). To our knowledge there are few empirical studies 370 on real rates, with some exceptions. We remark here the recent and excellent survey by 371 Giglio, Maggiori and Stroebel on the housing market in London and Singapore [50,51] 372 which has allowed for a rather realistic estimation of long-run discount rates.

373
Our first concern was knowing how the discount process depended on the underly-  Table I

381
Data is reported in Table 1.

382
Since all but two of our nominal interest rate processes are for ten year government bonds, which pay out over a ten year period, we smoothed out inflation rates with a ten year moving average, and subtracted the annualized inflation index from the annualized nominal rate to compute the real interest rate, as explained in the previous section by means of the Fisher's procedure (cf. Eq, (61)), where n(t) is the nominal rate and i(t) the inflation rate. The particular case of United

383
States is plotted in Figure 2. Let us recall that denoting by B(t, t + τ) the government bond issued at time t and maturing at time t + τ with unit maturity, B(t, t) = 1, the yield y(t, τ) is defined as (cf. Eq. (38)) One can argue that τ = 10 years is not a short period of time in order to consider 385 y(t, τ = 10 years) a very accurate estimator of n(t) (cf. Eqs. (63) and (64)). Although this 386 may be true we must bear in mind that 10 years bonds are the shortest bonds available 387 for most of the countries analyzed.

388
The inflation rate is estimated through the Consumer Price Index (CPI) as where I(t) is the aggregated inflation up to time t, and τ =10 years. The relation between I(t) and the Consumer Price Index (CPI) is where C(t) is the time series of the empirical CPI. The instantaneous rate of inflation i(t) is, therefore, estimated by the quantity i(t + τ) which written in terms of the CPI reads A remarkable characteristic observed in many epochs for all countries is that real 389 interest rates frequently become negative, often by substantial amounts and over long 390 periods of time as shown in  (iii) The Min and Max columns illustrate the robustness of the estimation procedure by providing the minimum and the maximum value of parameter estimation on four data blocks of equal length. (iv) The parameter α is estimated by fitting the empirical correlation function to an exponential (cf. Eq. (77)) after using the whole data block.
(v) Countries in boldface are those considered historically more stable with positive long-run ratesr ∞ > 0.
the empirical work to the OU (Vasicek) model. We also assume the Local Expectation

395
Hypothesis for which we live in a risk neutral world and the market price of risk is zero 396 [35,36,38]. 397 We can estimate the parameters m, α and k of the Vasicek model to each of the data series. There are several feasible procedures. One possible way is to deal with stationary averages. The parameter m is easily estimated because it is the stationary mean value of the rate [cf. Eq. (76)] m = E[r(t)].
The estimation of parameters α and k is based on the correlation function of the Ornstein-Uhlenbeck process. Thus from Eq. (77) we have Evaluating then the empirical correlation from data and fitting it by an exponential we estimate α (measured in 1/year units) for each country. The third and last parameter, k, is obtained from the empirical standard deviation σ 2 = E[|r(t) − m| 2 ], which for the Vasicek model is given by Eq. (78). That is, The resulting parameters for all countries are listed in Table 2 along with its maximum 398 and minimum value for each country. This procedure allows us to show that parameters 399 may indeed fluctuate on different periods of time.

400
Once the parameters of the Vasicek model have been estimated, the long-run discount rate is readily evaluated from Eq. (81), For this calculation, we have neglected market price of risk as mentioned above.

401
The fourteen countries divide into two very clear groups. Nine countries, with 402 relative stable (that is, positive) real interest rates, have long-run positive rates (indicated 403 in boldface type in Table 2). The average historical rate for these nine countries is 404 m = 2.7 % while the average long-run rate is r ∞ = 2.1 % which, on average, is 29 % 405 lower than m. Five countries with less stable behavior have long-run negative rates 406 and an exponentially increasing discount. We can term these countries as unstable and 407 it may not be a coincidence that all five have experienced fascist governments. Four 408 cases of this group have a negative average rate m due to at least one period of runaway 409 inflation; the exception is Spain, which has a (highly positive) mean real interest rate, 410 but still has a long-run negative rate. In every case convergence to the long-run rate 411 happens within 30 years, and typically within less than a decade. This is in contrast 412 to other treatments of fluctuating rates, which assume short term rates are always (or 413 nearly always) positive and predict that the decrease in the discounting rate happens 414 over a much longer timescale, which can be measured in hundreds years [48,49,[52][53][54][55]. 415 Alternatively, we can estimate parameters using the well-established maximum like-416 lihood procedure. Maximum likelihood estimation for the Vasicek model is extensively 417 documented in the financial mathematics literature (see for instance [13]). The approach 418 differs from the previous one as it focuses its attention on two consecutive steps of our 419 time series (generally consecutive years) and takes the conditional probability to perform 420 the estimation. Table 3 shows that the most inaccurate estimator isα, a not surprising 421 fact since the estimation of α is known to be difficult for the Vasicek model [56]. The 422 last two columns in Table 3 include the long-run interest rate estimatorr ∞ and its error 423 calculated through error propagation.    Table 3: Maximum likelihood estimation for the Vasicek model and the long-run interest rate.m is the estimator of the mean real interest rate in 1/years (in %).α is the estimator related to the characteristic reversion time in 1/year. The squared root of the estimator of k 2 is the volatility of the process and k 2 is given in terms of 1/(year) 3 (multiplied by 10 4 to be comparable with results in Table 2). These estimators are accompanied with the square root of the variance, σ's, of each estimator.r ∞ is the subsequent estimator of the long-run real interest rate in 1/year (in %). Negative values ofr ∞ mean the discount function is asymptotically increasing and its standard error is obtained through error propagation. The last two rows show separately the average over all countries, the stable countries with r ∞ > 0 and the unstable countries with r ∞ < 0. In all three rows standard error provided corresponds to the standard deviation of ther ∞ for the different countries.

448
We have reviewed one of the most important aspects of economics and finance as is 449 the problem of discount which weights the future relative to the present. The problem is 450 obviously very relevant in finance over relatively short time spans, but it is even more 451 crucial for long-run planning in addressing environmental problems on how to act now 452 with measures to mitigate the effects of climate change. To our knowledge this is a rather 453 unknown issue to the econophysics community and this review is specially intended 454 to them. We have thus addressed the problem with the simplest possible approach 455 and yet with a high level of rigor and generality. In this way we have also developed 456 the traditional method used in mathematical finance to address the problem as is the 457 Feynman-Kac approach. In addition we have reviewed the bond pricing theory and its 458 close similarity with discounting and presented a short introduction to the term structure 459 of interest rates along with the market price of risk. 460 We obtain quantitative results on the problem by studying with some detail three for all models the long-run discount rate is always less than the long-time average rate.

468
A conclusion which necessarily has to have consequences in any long-run economic 469 planning. 470 We have finally reviewed our recent empirical study on 14 different countries which   Table 5 of Ref. [28] (see Section 5). In the top figure we plot the discount function D(t) while in the bottom figure we plot the log ratio − ln D(t)/t. In the top figure we observe the asymptotic exponential decay of the discount after more than a hundred years, while in the bottom figure it is clearly seen the existence of a long-run discount rate for the Vasicek model (cf. Eq. (82)). The initial rate r 0 is arbitrarily taken to be 1%. In both models we assume no market price of risk q(r) = 0 (Local Expectation Hypothesis).