Maximum Likelihood Estimation of Discretely Sampled Diffusions: A Closed-Form Approximation Approach
Aït-Sahalia
Presentation by Biqing Cai
WISE, Xiamen University

March 23, 2009

CONTENTS

- Literature Review
- Introduction of Aït-Sahalia (2002)
- Procedure of approximation to transition density
- Performance
- Extensions

Background

- Since the Black-Scholes (1973) model was introduced into finance, continuous-time models have been widely used. Diffusions, jump-diffusions and, more recently, Lévy processes are used to describe asset price movements.

- Closed-form solutions for derivative prices or bond prices are a major advantage of these models. Estimating continuous-time models is nevertheless difficult because, in general, we do not know the transition densities.

- Suppose our model is

$$dX_t = \mu(X_t;\theta)\,dt + \sigma(X_t;\theta)\,dW_t \tag{1}$$

and our observations are discrete, sampled at intervals of length ∆.

- The transition density (or conditional density) $f(x_{t+\Delta} \mid x_t)$ is usually unknown, except in a few cases, including the Black-Scholes (1973) model, the Vasicek (1977) model and the CIR (1985) model.

Question: For general specifications of a diffusion process whose transition density is not known, how can we derive the transition density?

Intuitively, the Euler scheme can be used, which discretizes equation (1) as

$$\Delta X_t = \mu(X_t;\theta)\,\Delta + \sigma(X_t;\theta)\,\Delta W_t \tag{2}$$

with $\Delta W_t \sim N(0, \Delta)$. The maximum likelihood method or GMM can then be used, e.g. Chan et al. (1992). The Euler scheme is also often used to simulate diffusion or jump-diffusion processes, because the discretized process converges to the true process as ∆ → 0.
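As a minimal sketch of how the Euler scheme turns into a likelihood, the code below treats each increment as Gaussian with mean µ(x)∆ and variance σ(x)²∆. The function names and the Vasicek-style parameter values are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def euler_loglik(x, delta, mu, sigma):
    """Euler (Gaussian) pseudo-log-likelihood of a discretely observed path x."""
    x0, x1 = x[:-1], x[1:]
    mean = x0 + mu(x0) * delta          # conditional mean implied by equation (2)
    var = sigma(x0) ** 2 * delta        # conditional variance implied by equation (2)
    return float(np.sum(-0.5 * np.log(2 * np.pi * var)
                        - 0.5 * (x1 - mean) ** 2 / var))

def simulate_euler(x_start, n, delta, mu, sigma, rng):
    """Simulate n Euler steps of size delta starting from x_start."""
    x = np.empty(n + 1)
    x[0] = x_start
    for i in range(n):
        dw = rng.normal(0.0, np.sqrt(delta))   # Delta W ~ N(0, delta)
        x[i + 1] = x[i] + mu(x[i]) * delta + sigma(x[i]) * dw
    return x

# Hypothetical Vasicek-type specification: dX = kappa*(alpha - X)dt + vol*dW.
kappa, alpha, vol = 0.5, 0.06, 0.02
mu = lambda x: kappa * (alpha - x)
sigma = lambda x: vol * np.ones_like(x)

rng = np.random.default_rng(0)
path = simulate_euler(0.05, 500, 1 / 12, mu, sigma, rng)
ll = euler_loglik(path, 1 / 12, mu, sigma)
```

Maximizing `euler_loglik` over the parameters gives the Euler estimator; the sketch only evaluates it at one parameter value.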

- However, as pointed out by Lo (1988), estimating equation (2) by MLE introduces the so-called discretization bias, which makes the estimator inconsistent unless the time interval goes to zero.
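The size of this bias can be illustrated in a case where the exact answer is known. The sketch below assumes a Vasicek model dX = κ(α − X)dt + σdW, whose exact discrete-time form is an AR(1) with coefficient exp(−κ∆), while the Euler scheme imposes the coefficient 1 − κ∆. The Euler estimator of κ therefore converges to (1 − exp(−κ∆))/∆ rather than to κ. The function name and the parameter values are illustrative assumptions.

```python
import numpy as np

def euler_kappa_limit(kappa, delta):
    """Large-sample limit of the Euler estimator of kappa in the Vasicek model."""
    # (1 - exp(-kappa*delta)) / delta, written with expm1 for numerical accuracy.
    return -np.expm1(-kappa * delta) / delta

kappa_true = 0.5
for delta in (1 / 252, 1 / 12, 1.0):
    limit = euler_kappa_limit(kappa_true, delta)
    print(f"delta={delta:.4f}  limit={limit:.6f}  bias={limit - kappa_true:.6f}")
```

The limit is below the true κ for every ∆ > 0 and approaches κ only as ∆ → 0, which is the inconsistency at a fixed sampling interval.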

- In fact, the rapid development of estimation and testing methods for continuous-time processes has mainly focused on reducing the discretization error, although some discretized models are still used, e.g. Das (2002).

- As pointed out by Sundaresan (2000): "Perhaps the most significant development in continuous-time field during the last decade has been the innovations in econometric theory and estimation techniques for models in continuous time."

Literature Review

- Hansen and Scheinkman (1995) is one of the first papers to estimate diffusion processes without introducing discretization error. They derive moment conditions using the infinitesimal generator. However, the method requires the process to be stationary, which limits its general applicability.

- Pedersen (1995) uses simulated likelihood, which simulates the process on a finer grid between observations to reduce the discretization error. See also Brandt and Santa-Clara (2002).

- Stanton (1997) shows how to derive moment conditions that reduce the discretization bias to higher order, i.e. of order ∆², ∆³, .... He uses nonparametric methods to estimate the approximate moments. Nonparametric methods are also used in Bandi and Phillips (2003). For a survey of nonparametric methods, see Cai and Hong (2008).

By now, there are two powerful methods for analyzing continuous-time models, including stochastic volatility and jump-diffusion models, both of which carry a heavy computational burden.

- The first is EMM, proposed by Gallant and Tauchen (1996). EMM can be decomposed into two steps. In the first, we estimate an auxiliary model that captures some characteristics of the data. In the second, simulated data are used to match the moments derived in the first stage.

- The second is MCMC. The critical feature of MCMC analysis in this setup is that it requires data augmentation, as described by Eraker (2001).

Idea of Aït-Sahalia (2002)

Here I begin to discuss Aït-Sahalia's approximate likelihood method without going into the technical details. If you are interested, feel free to discuss them with me.

- The basic idea of the paper is to draw an analogy between the approximation of a distribution in the CLT and the approximation of the transition density of a diffusion, using a Hermite expansion.

- The sampling interval ∆ plays the role of the sample size n in the CLT. Just as the sum of random variables must be standardized in the CLT so that it does not degenerate, here we need to transform the observed X into another diffusion Z.

- Another important point is that, for fixed ∆, the density $p_X$ cannot in general be approximated around a Normal density if the distribution is in fact too far from Normal.

- For the density $p_X$ to be expanded around N(0, 1), its tails must be thin enough. So we need to transform the original diffusion into a diffusion with thinner tails. This involves transforming X to Y.

Transform X → Y

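The transformation of X into Y is defined as:

$$Y \equiv \gamma(X;\theta) = \int^{X} \frac{du}{\sigma(u;\theta)} \tag{3}$$

Then, by Itô's Lemma, we have:

$$dY_t = \mu_Y(Y_t;\theta)\,dt + dW_t$$

where

$$\mu_Y(y;\theta) = \frac{\mu(\gamma^{-1}(y;\theta);\theta)}{\sigma(\gamma^{-1}(y;\theta);\theta)} - \frac{1}{2}\frac{\partial \sigma}{\partial x}(\gamma^{-1}(y;\theta);\theta) \tag{4}$$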

Remark: Because σ > 0, the function γ is increasing and invertible.

- Example: if $D_X = (0, +\infty)$ and $\sigma(x;\theta) = x^{\rho}$, then $Y = X^{1-\rho}/(1-\rho)$ if $0 < \rho < 1$, and $D_Y = (0, +\infty)$.

Remark: In fact, CEV diffusions of this kind are what is analyzed in Aït-Sahalia (1999). However, this transformation does not always yield closed-form expressions. For example, when $\sigma(x;\theta) = \sqrt{\beta_0 + \beta_1 x + \beta_2 x^{\beta_3}}$, the integration cannot be performed analytically. This motivates Bakshi and Ju (2005) to provide a refinement.
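The CEV example can be checked numerically. The sketch below assumes σ(x) = x^ρ with 0 < ρ < 1, so that γ(x) = x^(1−ρ)/(1−ρ), and verifies by finite differences that dγ/dx = 1/σ(x). The drift µ(x) = κ(α − x) used in `mu_Y` is a hypothetical choice, made only to show how equation (4) is evaluated.

```python
import numpy as np

def gamma(x, rho):
    """Transformation (3) for sigma(x) = x**rho, 0 < rho < 1."""
    return x ** (1.0 - rho) / (1.0 - rho)

def gamma_inv(y, rho):
    """Inverse of gamma: maps y back to x."""
    return ((1.0 - rho) * y) ** (1.0 / (1.0 - rho))

def mu_Y(y, rho, kappa, alpha):
    """Drift (4) of Y for the assumed drift mu(x) = kappa*(alpha - x)."""
    x = gamma_inv(y, rho)
    # mu(x)/sigma(x) - (1/2) * d sigma/dx, with d sigma/dx = rho * x**(rho - 1)
    return kappa * (alpha - x) / x ** rho - 0.5 * rho * x ** (rho - 1.0)

rho = 0.5
x = np.linspace(0.5, 2.0, 7)
h = 1e-6
dgamma_dx = (gamma(x + h, rho) - gamma(x - h, rho)) / (2.0 * h)
max_err = float(np.max(np.abs(dgamma_dx - 1.0 / x ** rho)))
```

For ρ = 1/2 this gives Y = 2√X, and `max_err` measures how closely the finite-difference derivative of γ matches 1/σ.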

Transforming Y → Z

- By transforming X to Y, we obtain thinner tails for $p_Y$. However, as ∆ → 0 (although in practice it does not), $p_Y$ becomes peaked around the conditioning value $y_0$.

- To avoid using a Dirac mass as the leading term of the expansion, he performs a further transformation:

$$Z \equiv \Delta^{-1/2}(Y - y_0) \tag{5}$$

Then, for fixed ∆, Z happens to be close enough to N(0, 1), which makes it possible to create a convergent series expansion with a N(0, 1) leading term (Theorem 1).

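Let $p_Y(\Delta, y|y_0;\theta)$ denote the conditional density of $Y_{t+\Delta}$ given $Y_t = y_0$, and define the density function of Z. We have:

$$p_Z(\Delta, z|y_0;\theta) \equiv \Delta^{1/2}\, p_Y(\Delta, \Delta^{1/2} z + y_0 | y_0;\theta) \tag{6}$$

Once we have obtained an approximation to the function $(z, y_0) \mapsto p_Z(\Delta, z|y_0;\theta)$, we can obtain:

$$p_Y(\Delta, y|y_0;\theta) \equiv \Delta^{-1/2}\, p_Z(\Delta, \Delta^{-1/2}(y - y_0) | y_0;\theta) \tag{7}$$

Thus

$$p_X(\Delta, x|x_0;\theta) = \frac{1}{\sigma(x;\theta)}\, p_Y(\Delta, \gamma(x;\theta) | \gamma(x_0;\theta);\theta) \tag{8}$$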
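The chain of changes of variables can be checked in a case where every density is known exactly. The sketch below assumes the Black-Scholes model dX = µX dt + σX dW, for which γ(x) = ln(x)/σ, the drift of Y is the constant µ/σ − σ/2, and $p_Z$ is a Normal density shifted by that drift times √∆. Function names and parameter values are illustrative assumptions.

```python
import numpy as np

def phi(z):
    """Standard Normal density."""
    return np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)

def p_Z(delta, z, mu, sigma):
    """Exact density of Z = (Y - y0)/sqrt(delta) when Y has constant drift mu_Y."""
    mu_Y = mu / sigma - 0.5 * sigma
    return phi(z - mu_Y * np.sqrt(delta))

def p_Y(delta, y, y0, mu, sigma):
    """Equation (7): recover the density of Y from the density of Z."""
    return p_Z(delta, (y - y0) / np.sqrt(delta), mu, sigma) / np.sqrt(delta)

def p_X(delta, x, x0, mu, sigma):
    """Equation (8): recover the density of X, with gamma(x) = log(x)/sigma."""
    return p_Y(delta, np.log(x) / sigma, np.log(x0) / sigma, mu, sigma) / (sigma * x)

def lognormal_density(delta, x, x0, mu, sigma):
    """Known Black-Scholes transition density, used as the benchmark."""
    m = np.log(x0) + (mu - 0.5 * sigma ** 2) * delta
    s2 = sigma ** 2 * delta
    return np.exp(-0.5 * (np.log(x) - m) ** 2 / s2) / (x * np.sqrt(2.0 * np.pi * s2))

x_grid = np.linspace(80.0, 120.0, 9)
via_transforms = p_X(1 / 12, x_grid, 100.0, 0.08, 0.2)
benchmark = lognormal_density(1 / 12, x_grid, 100.0, 0.08, 0.2)
```

In this example the shift of $p_Z$ away from N(0, 1) is of order √∆, which illustrates why N(0, 1) is a natural leading term.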

Approximation of Transition Function of the Transformed Data

- The classical Hermite polynomials are:

$$H_j(z) \equiv e^{z^2/2} \frac{d^j}{dz^j}\left[e^{-z^2/2}\right], \qquad j \ge 0$$

- Let $\phi(z) \equiv e^{-z^2/2}/\sqrt{2\pi}$ be the N(0, 1) density function, then define:

$$p_Z^{(J)}(\Delta, z|y_0;\theta) \equiv \phi(z) \sum_{j=0}^{J} \eta_Z^{(j)}(\Delta, y_0;\theta)\, H_j(z) \tag{9}$$

The coefficients $\eta_Z^{(j)}$ are given by:

$$\eta_Z^{(j)}(\Delta, y_0;\theta) \equiv \frac{1}{j!} \int_{-\infty}^{+\infty} H_j(z)\, p_Z(\Delta, z|y_0;\theta)\, dz \tag{10}$$

Remark: Now, our question becomes how to obtain the coefficients.
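A small numerical sketch of the Hermite expansion follows. With the sign convention above, $H_j(z) = (-1)^j He_j(z)$, where $He_j$ is the probabilists' Hermite polynomial available in NumPy. As an illustrative target it takes $p_Z(z) = \phi(z - m)$ with an assumed shift m = 0.2, computes the coefficients by Gauss-Hermite quadrature, and compares the truncated series with the target. The target density and all names are assumptions for the example.

```python
from math import factorial
import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval

def H(j, z):
    """H_j(z) = exp(z^2/2) d^j/dz^j exp(-z^2/2) = (-1)^j He_j(z)."""
    coef = np.zeros(j + 1)
    coef[j] = (-1.0) ** j
    return hermeval(z, coef)

def phi(z):
    """Standard Normal density."""
    return np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)

# Gauss-Hermite nodes and weights for the weight function exp(-z^2/2).
nodes, weights = hermegauss(60)

def eta(j, m):
    """Coefficient (10) for the illustrative density p_Z(z) = phi(z - m)."""
    # phi(z - m) = exp(-z^2/2) * exp(m*z - m^2/2) / sqrt(2*pi);
    # the factor exp(-z^2/2) is absorbed by the quadrature weights.
    g = H(j, nodes) * np.exp(m * nodes - 0.5 * m ** 2) / np.sqrt(2.0 * np.pi)
    return float(np.sum(weights * g)) / factorial(j)

def p_Z_approx(z, m, J):
    """Truncated expansion (9) with J + 1 terms."""
    return phi(z) * sum(eta(j, m) * H(j, z) for j in range(J + 1))

m = 0.2
z_grid = np.linspace(-3.0, 3.0, 13)
max_abs_error = float(np.max(np.abs(p_Z_approx(z_grid, m, J=6) - phi(z_grid - m))))
```

The factor 1/j! in (10) comes from the orthogonality relation: the integral of $H_j H_k \phi$ is j! when j = k and zero otherwise.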

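Explicit Expression for the coefficients

We have:

$$\eta_Z^{(j)}(\Delta, y_0;\theta) = \frac{1}{j!} \int_{-\infty}^{+\infty} H_j(z)\, p_Z(\Delta, z|y_0;\theta)\, dz$$

$$= \frac{1}{j!} \int_{-\infty}^{+\infty} H_j(z)\, \Delta^{1/2}\, p_Y(\Delta, \Delta^{1/2} z + y_0 | y_0;\theta)\, dz$$

$$= \frac{1}{j!}\, E\left[H_j\left(\Delta^{-1/2}(Y_{t+\Delta} - y_0)\right) \mid Y_t = y_0;\theta\right]$$

Define $H_j(\Delta^{-1/2}(Y_{t+\Delta} - y_0)) \equiv f(Y_{t+\Delta}, y_0)$ and apply Taylor's expansion; we have:

$$E[f(Y_{t+\Delta}, y_0) \mid Y_t = y_0] = \sum_{k=0}^{K} A^k(\theta) \cdot f(y_0, y_0)\, \frac{\Delta^k}{k!} + E\left[A^{K+1}(\theta) \cdot f(Y_{t+\delta}, y_0) \mid Y_t = y_0\right] \frac{\Delta^{K+1}}{(K+1)!} \tag{11}$$

In (11), A(θ) is the infinitesimal generator of the diffusion Y, defined as:

$$A(\theta): f \mapsto \mu_Y(\cdot;\theta)\, \frac{\partial f}{\partial y} + \frac{1}{2}\, \frac{\partial^2 f}{\partial y^2}$$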
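To see the Taylor expansion (11) at work, the sketch below applies the generator repeatedly to a polynomial f and sums the first K + 1 terms. It assumes an Ornstein-Uhlenbeck diffusion for Y with drift $\mu_Y(y) = -\kappa y$ and takes f(y) = y², for which the exact conditional expectation is known. The model, the test function and all names are illustrative assumptions.

```python
from math import factorial
import numpy as np
from numpy.polynomial import Polynomial

def generator(f, mu_Y):
    """Apply A: f -> mu_Y * f' + (1/2) * f'' to a polynomial f."""
    return mu_Y * f.deriv(1) + 0.5 * f.deriv(2)

def taylor_expectation(f, mu_Y, y0, delta, K):
    """Sum_{k=0}^{K} (A^k f)(y0) * delta^k / k!, the leading terms of (11)."""
    total, g = 0.0, f
    for k in range(K + 1):
        total += g(y0) * delta ** k / factorial(k)
        g = generator(g, mu_Y)      # move on to A^{k+1} f
    return total

kappa, delta, y0 = 0.5, 1 / 12, 0.5
mu_Y = Polynomial([0.0, -kappa])    # mu_Y(y) = -kappa * y
f = Polynomial([0.0, 0.0, 1.0])     # f(y) = y^2

approx = taylor_expectation(f, mu_Y, y0, delta, K=4)
# Exact second moment of the OU transition: mean^2 + variance.
decay = np.exp(-2.0 * kappa * delta)
exact = y0 ** 2 * decay + (1.0 - decay) / (2.0 * kappa)
```

With K = 4 the truncated sum and the exact value agree to several decimal places at a monthly ∆, consistent with a remainder of order ∆ to the power K + 1.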

If we gather the terms in equation (10) according to powers of ∆, and let J → ∞, we have an explicit expression for $\tilde{p}_Y^{(K)}$:

$$\tilde{p}_Y^{(K)}(\Delta, y|y_0;\theta) = \Delta^{-1/2}\, \phi\left(\frac{y - y_0}{\Delta^{1/2}}\right) \exp\left(\int_{y_0}^{y} \mu_Y(\omega;\theta)\, d\omega\right) \sum_{k=0}^{K} c_k(y|y_0;\theta)\, \frac{\Delta^k}{k!} \tag{12}$$

where $c_0(y|y_0;\theta) = 1$ and, for all j ≥ 1:

$$c_j(y|y_0;\theta) = j\,(y - y_0)^{-j} \int_{y_0}^{y} (\omega - y_0)^{j-1} \left\{\lambda_Y(\omega;\theta)\, c_{j-1}(\omega|y_0;\theta) + \frac{1}{2}\, \frac{\partial^2 c_{j-1}(\omega|y_0;\theta)}{\partial \omega^2}\right\} d\omega$$

Here, $\lambda_Y(y;\theta) \equiv -\{\mu_Y^2(y;\theta) + \partial \mu_Y(y;\theta)/\partial y\}/2$, and $-\lambda_Y$ is the potential.

Remark: In applications, equation (12) is what we use to approximate the transition density, because we can control the error to any order of ∆ as needed. As Aït-Sahalia points out, in applications to financial models three terms are enough to make it sufficiently accurate.
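As a rough illustration of equation (12), the sketch below implements the K = 1 approximation with numerical integration and compares it with the exact transition density of an Ornstein-Uhlenbeck diffusion, $dY = -\kappa Y dt + dW$. With $c_0 = 1$, the coefficient $c_1$ is the average of $\lambda_Y$ between $y_0$ and y. The OU benchmark, the trapezoid rule and all names are assumptions for the example; the paper works with closed-form coefficients instead.

```python
import numpy as np

def _trapezoid(vals, grid):
    """Trapezoid rule; handles grids that run from y0 down to y as well."""
    return float(np.sum(0.5 * (vals[1:] + vals[:-1]) * np.diff(grid)))

def approx_density_K1(y, y0, delta, mu_Y, dmu_Y, n=2001):
    """Equation (12) truncated at K = 1 (requires y != y0)."""
    w = np.linspace(y0, y, n)
    drift_integral = _trapezoid(mu_Y(w), w)
    lam = -0.5 * (mu_Y(w) ** 2 + dmu_Y(w))      # lambda_Y on the grid
    c1 = _trapezoid(lam, w) / (y - y0)          # c_1, using c_0 = 1
    z = (y - y0) / np.sqrt(delta)
    phi_z = np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)
    return phi_z / np.sqrt(delta) * np.exp(drift_integral) * (1.0 + c1 * delta)

def exact_ou_density(y, y0, delta, kappa):
    """Exact Gaussian transition density of dY = -kappa*Y dt + dW."""
    mean = y0 * np.exp(-kappa * delta)
    var = (1.0 - np.exp(-2.0 * kappa * delta)) / (2.0 * kappa)
    return np.exp(-0.5 * (y - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

kappa, delta, y0, y = 0.5, 1 / 12, 0.5, 0.6
mu_Y = lambda w: -kappa * w
dmu_Y = lambda w: -kappa * np.ones_like(w)

approx = approx_density_K1(y, y0, delta, mu_Y, dmu_Y)
exact = exact_ou_density(y, y0, delta, kappa)
rel_error = abs(approx - exact) / exact
```

Even with a single correction term the relative error is small at a monthly sampling interval in this example; adding further $c_k$ terms would reduce it by additional powers of ∆.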

Convergence Properties

To make this method valid, we need to show:

1. The approximate density converges to the true density. This is shown by Theorem 1. However, Theorem 1 requires the time interval ∆ to lie below an upper bound, which cautions us to be careful when using the method. Luckily, for financial models we can generally ignore this problem, and if there is mean reversion near the bound, the approximation always converges.
2. The consistency of the MLE estimator. This is shown by Theorem 2.

Performance

Figure: Comparison of different approximation methods (Figure 1 of the paper: accuracy and speed of different approximation methods for $p_X$; time in seconds on a log scale).

Note: The figure reports the average uniform absolute error of various density approximation techniques applied to the Vasicek, Cox-Ingersoll-Ross and Black-Scholes models. "Euler" refers to the discrete-time, continuous-state, first-order Gaussian approximation scheme for the transition density given in equation (5.4); "Binomial Tree" refers to the discrete-time, discrete-state (two) approximation; "Simulations" refers to an implementation of Pedersen (1995)'s simulated-likelihood method; "PDE" refers to the numerical solution of the Fokker-Planck-Kolmogorov partial differential equation satisfied by the transition density, using the Crank-Nicolson algorithm. For implementation details on the different methods considered, see Jensen and Poulsen (1999).

Figure: Simulation Comparison (table from the paper: comparison of approximate estimators for the Vasicek, Cox-Ingersoll-Ross and Black-Scholes models; for each parameter it reports the mean and standard deviation of MLE minus TRUE, EUL minus MLE, and the first- and second-order approximations minus MLE).

Figure: Simulation Comparison Cont'

Note: This simulation is done with 5000 Monte Carlo simulations, each containing 1000 observations. The true parameter values are set to be realistic for US interest rates and stock prices.

Extensions

Now, we discuss the extensions of this paper.

- Aït-Sahalia (1999) applies this method to approximate the densities of different spot interest rate models.
- As I have mentioned, Bakshi and Ju (2005) provide a refinement of Aït-Sahalia's method to make it more widely applicable.
- Egorov et al. (2003) extend the model to the time-inhomogeneous case with almost the same procedure.
- Aït-Sahalia (2003) extends this method to the multivariate case. However, the transformation X → Y is not always feasible. He distinguishes between the reducible and irreducible cases and derives different expansions for each.
- Yu (2007) extends the model to more general multivariate jump-diffusion models, with Aït-Sahalia (2002) as a special case. Yu (2007) first guesses the functional form of the transition density and then derives the coefficients by approximating the forward and backward Kolmogorov equations. In this sense, it is the same idea suggested by Lo (1988).
