Understanding Volatility


Online Appendix for: Information in the term structure of yield curve volatility

Anna Cieslak and Pavol Povala∗

First version: May, 2009 This version: September, 2011

∗ Cieslak is at Northwestern University, Kellogg School of Management. Povala is at the University of Lugano, Switzerland. Cieslak: [email protected], Department of Finance, Kellogg School of Management, Northwestern University, 2001 Sheridan Road, Evanston, IL 60208, phone: +1 847 467 2149. Povala: [email protected], University of Lugano, Institute of Finance, Via Buffi 13a, 6900 Lugano, Switzerland, phone: +1 646 238 4049.

Contents

I. Data description
    I.A. GovPX and BrokerTec
    I.B. Testing for microstructure noise
    I.C. Realized covariance matrix estimation
    I.D. Extracting zero coupon yield curve from high-frequency data
    I.E. Survey data

II. Model solution
    II.A. Dependence between X and V factors
    II.B. General form of the market prices of risk
    II.C. Solution for bond prices
    II.D. Instantaneous volatility of yields
    II.E. Conditional covariance of X and V
    II.F. Discrete approximation to the unconditional covariance matrix of X and V

III. Moments of the state variables
    III.A. Moments of the Vt process
    III.B. Moments of the Yt dynamics

IV. Model estimation
    IV.A. Discretization and vectorization of the state space
    IV.B. Econometric identification
    IV.C. Implementation of the filter
    IV.D. Pseudo-maximum likelihood estimation

References

I. Data description

This section gives a brief description of the high-frequency Treasury data, our zero curve construction methodology, and macroeconomic surveys.

The US Treasury market is open around the clock, but trading volumes and volatility are concentrated during the New York trading hours. Roughly 95% of trading occurs between 7:30AM and 5:00PM EST (see also Fleming, 1997). This interval covers all major macroeconomic and monetary policy announcements, which are commonly scheduled either for 9:00AM EST or 2:15PM EST. We consider this time span as a trading day. Around US bank holidays, there are trading days with a very low level of trading activity. In such cases, we follow the approach of Andersen and Benzoni (2010) and delete days with no trading for more than three hours.

We choose the ten-minute sampling frequency so that it strikes a balance between the nonsynchronicity in trading and the efficiency of the realized volatility estimators (Zhang, Mykland, and Ait-Sahalia, 2005). Microstructure noise does not appear to be an issue in our data, as indicated by the volatility signature plots and the very low autocorrelation of equally spaced yield changes (see Figure 1).

The liquidity in the secondary bond market is concentrated in two-, three-, five- and ten-year securities (see also Fleming and Mizrach, 2008, Table 1). We assume that the dynamics of this most liquid segment span the information content of the whole curve. Since any method for bootstrapping the zero curve is precise for maturities close to the observed yields, for the subsequent covolatility analysis we select yields which are closest to the observed coupon bond maturities.

I.A. GovPX and BrokerTec

Table I reports some basic statistics on the Treasury bond transaction data in the period 1992:01 through 2010:12. For the GovPX period (1992:01–2000:12), we report the average number of quotes per trading day, and for the BrokerTec period (2001:01–2010:12) we report the average number of transactions per trading day. The number of Treasury bonds and bills in our sample period totals 1148. These were transacted or quoted more than 49.3 million times in the on-the-run secondary market.

I.B. Testing for microstructure noise

To avoid potential bias in the estimates of the realized volatility using high-frequency data, we apply several tests for the presence of noise caused by the market microstructure effects. In a first step, we compute the first order autocorrelation in high-frequency price returns. Table II reports the first order autocorrelation of equally-spaced ten-minute yield changes in the US Treasury zero curve in the period 1992:01–2007:12. The autocorrelation is statistically significant for the maturities of three, five and ten years. However, the magnitude of all autocorrelations is very small, which makes them economically insignificant.

Table I: Average number of quotes/trades per day in the GovPX and BrokerTec databases

Bond maturity    GovPX period    BrokerTec period
3M                        374                   –
6M                        352                   –
2Y                       2170                1414
3Y                       1385                1017
5Y                       3128                2801
7Y                        637                1500
10Y                      2649                2659
30Y                       793                1114

Table II: Autocorrelation of high-frequency yield changes

                        2Y         3Y         5Y         7Y        10Y
Autocorrelation     0.0039     0.0154    -0.0064    -0.0040    -0.0175
p-value           (0.0648)   (0.0001)   (0.0025)   (0.0589)   (0.0001)

In a second step, we use the volatility signature plots displaying the average realized volatility against the sampling frequency (Figure 1). In the presence of microstructure noise, the average realized volatility increases with the sampling frequency. The reason is the dominance of noise at very high sampling frequencies (see e.g., Bandi and Russell, 2008). None of the above diagnostics suggests that the microstructure noise present in our data is large enough to overwhelm our results.
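The two diagnostics above can be illustrated with a short sketch. It is not the authors' code: the function names `lag1_autocorr` and `rv_signature` are hypothetical, and the sketch computes realized variance for a single period, whereas Figure 1 reports weekly averages.

```python
import numpy as np

def lag1_autocorr(changes):
    """First-order sample autocorrelation of a series of yield changes."""
    d = np.asarray(changes, dtype=float)
    d = d - d.mean()
    return float(d[1:] @ d[:-1] / (d @ d))

def rv_signature(yields, steps):
    """Realized variance of one yield series when every k-th grid point is
    used, for each k in `steps` (k = 1 is the finest sampling)."""
    y = np.asarray(yields, dtype=float)
    return {k: float(np.sum(np.diff(y[::k]) ** 2)) for k in steps}

# A pure "bounce" series: all variation is noise, so the realized variance
# is large at the finest sampling and vanishes at coarser sampling.
y = np.array([0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0])
sig = rv_signature(y, steps=(1, 2))
rho1 = lag1_autocorr(np.diff(y))
```

In noisy data one would expect a strongly negative `rho1` and a signature that rises as the sampling step shrinks. The small autocorrelations in Table II and the flat profile in Figure 1 point the other way.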

I.C. Realized covariance matrix estimation

This section discusses the robustness and efficiency of the realized second moment estimator proposed in the paper, which are critical for our results. In the paper, we use the outer product estimator given by:

$$ RCov(t, t+1; N) = \sum_{i=1}^{N} \left( y_{t+\frac{i}{N}} - y_{t+\frac{i-1}{N}} \right) \left( y_{t+\frac{i}{N}} - y_{t+\frac{i-1}{N}} \right)'. \tag{1} $$
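As a minimal sketch of Eq. (1), assuming the yields are already sampled on an equally spaced intraday grid and stored as an array with one row per grid point and one column per maturity (the function name `realized_cov` is hypothetical):

```python
import numpy as np

def realized_cov(yields):
    """Outer-product realized covariance of Eq. (1).

    yields: array of shape (N + 1, m), i.e. N + 1 equally spaced
    observations of m yields over the period (t, t + 1).
    """
    dy = np.diff(np.asarray(yields, dtype=float), axis=0)  # N x m changes
    return dy.T @ dy                                       # sum_i dy_i dy_i'

# Two maturities, three grid points (two yield changes).
rcov = realized_cov([[0.0, 0.0], [1.0, 2.0], [1.0, 1.0]])
```

The diagonal of the result holds the realized variances and the off-diagonal elements the realized covariances.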

Figure 1: Volatility signature plot, 1992:01–2007:12. We plot the average weekly realized volatility (RV, in bps) against the sampling frequency (in minutes) for the whole sample 1992:01–2007:12. We consider five maturities in the zero coupon curve: two, three, five, seven, and ten years.

In case of asynchronous trading, the realized covariance matrix estimator defined in Eq. (1) can be biased toward zero (see e.g. Hayashi and Yoshida, 2005; Audrino and Corsi, 2007). The bias is to a large extent generated by the interpolation of non-synchronously traded assets, and its severity depends on the difference in liquidity of the assets considered.¹ Hayashi and Yoshida (2005, HY) propose a covariance estimator which corrects for the bias in (1). The estimator sums up all cross-products of returns which have an overlap in their time spans, and thus no data is thrown away. The covariance of two bond yields reads:

$$ RCov^{HY}_{i,j}(t, t+h; N_i, N_j) = \sum_{k=1}^{N_i} \sum_{l=1}^{N_j} \left( y^{i}_{t+\frac{kh}{N_i}} - y^{i}_{t+\frac{(k-1)h}{N_i}} \right) \left( y^{j}_{t+\frac{lh}{N_j}} - y^{j}_{t+\frac{(l-1)h}{N_j}} \right) I\left( \tau_i \cap \tau_j \neq \emptyset \right), \tag{2} $$

where τi and τj denote the interval of the return on the first and second bond, respectively.

To verify the robustness of our realized covariance estimator (1), we implement the HY approach for the realized covariance of the ten- and five-year bond. Both estimators deliver very similar results in terms of magnitude and covariance dynamics. They are highly correlated (90%) and the t-test for the difference in means does not reject the null that µ_outer = µ_HY (p-val = 0.57).

There are at least two reasons why we stick to the simple outer-product realized covariance estimator (1). For one, the estimators of Hayashi and Yoshida (2005) and Audrino and Corsi (2007) are not directly applicable in our case because for the construction of the zero curve we require a synchronized set of yield changes. More importantly, on-the-run Treasury bonds are largely homogeneous in terms of liquidity, which is well proxied by the average number of quotes/trades per day reported in Table I of Appendix I.B.

¹ Audrino and Corsi (2008) offer a thorough discussion of the bias in realized covariance.

Table III: Correlation of zero coupon yields with CMT and GSW yields

               2Y       3Y       5Y       7Y      10Y
Corr CMT    1.000    1.000    0.999    0.996    0.997
Corr GSW    1.000    0.999    0.999    0.998    0.997
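The Hayashi and Yoshida estimator in Eq. (2) can be sketched as follows. This is an illustrative implementation under simplifying assumptions, not the authors' code: each yield is given with its own observation times, return intervals are treated as half-open, and the function name `hayashi_yoshida` is hypothetical.

```python
import numpy as np

def hayashi_yoshida(t_i, y_i, t_j, y_j):
    """Sum of cross-products of yield changes whose time intervals overlap.

    t_i, y_i: observation times and yields of the first bond;
    t_j, y_j: the same for the second bond. Times must be increasing.
    """
    t_i, y_i, t_j, y_j = (np.asarray(a, dtype=float)
                          for a in (t_i, y_i, t_j, y_j))
    r_i, r_j = np.diff(y_i), np.diff(y_j)
    cov = 0.0
    for k in range(len(r_i)):
        for l in range(len(r_j)):
            # (t_i[k], t_i[k+1]] and (t_j[l], t_j[l+1]] share some time
            if t_i[k] < t_j[l + 1] and t_j[l] < t_i[k + 1]:
                cov += r_i[k] * r_j[l]
    return cov

# Synchronous grid: reduces to the ordinary realized covariance.
sync = hayashi_yoshida([0, 1, 2], [0.0, 1.0, 3.0], [0, 1, 2], [0.0, 2.0, 5.0])
# First bond observed only at the end points: its single change overlaps
# with both changes of the second bond.
asyn = hayashi_yoshida([0, 2], [0.0, 3.0], [0, 1, 2], [0.0, 2.0, 5.0])
```

On a common grid only the contemporaneous products survive, which is why the two estimators can be compared directly, as in the robustness check above.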

I.D. Extracting zero coupon yield curve from high-frequency data

We fit the discount curve using smoothing splines. One of the important steps in the procedure is to select the appropriate number of knot points. We make the number of knot points dependent on the number of available bonds and locate them at the bond maturities. In our setting, the number of knot points varies between three and six. The fact that we consider only one specific part of the zero coupon yield curve allows us to use constant roughness penalty as in Fisher, Nychka, and Zervos (1994) for estimating the whole curve. Waggoner (1997) proposes a varying roughness penalty for the smoothing splines procedure with a low penalty at the short end and a high penalty at the very long end of the curve.

In the period 2001:01–2010:12, the intraday quotes on Treasury bills are not available from the BrokerTec database. In order to anchor the very short end for the smoothing splines procedure, we include the daily data on the three-month Treasury bill obtained from the FRED database at the FRB St. Louis.

Before using the constructed zero curve for the realized volatility estimation, we compare our zero coupon yields with the daily Constant Maturity Treasury rates (CMT) from the Fed, as well as with zero yields compiled by Gürkaynak, Sack, and Wright (2006) (GSW). Our daily yields are almost perfectly correlated with the CMTs as well as with the GSW yields. Table III summarizes the results.

I.E. Survey data

BlueChip Financial Forecasts. The BlueChip Financial Forecasts (BCFF) survey contains monthly forecasts of yields, inflation and GDP growth given by approximately 45 leading financial institutions. The BCFF is published on the first day of each month, but the survey itself is conducted over a two-day period, usually between the 23rd and 27th of each month. The exception is the survey for the January issue, which generally takes place between the 17th and 20th of December. The precise dates as to when the survey was conducted are not published. The BCFF provides forecasts of constant maturity yields across several maturities: three and six months, one, two, five, ten, and 30 years. The short end of the term structure is additionally covered with the forecasts of the Fed funds rate, prime bank rate and three-month LIBOR rate. The forecasts are quarterly averages of interest rates for the current quarter, the next quarter, and so on out to five quarters ahead. The figures are expressed as percent per annum. In addition, panelists provide forecasts for macroeconomic quantities: real GDP, GDP price index and Consumer Price Index (CPI). The numbers are seasonally adjusted quarter-on-quarter changes.

BlueChip Economic Indicators. The BlueChip Economic Indicators (BCEI) survey contains individual and consensus forecasts of about 50 professional economists from leading financial and advisory institutions. The survey is compiled on a monthly basis, and contains predictions of key financial and macroeconomic indicators, e.g. real and nominal GDP, GDP deflator, CPI, three-month T-bill rate, industrial production, unemployment, housing starts. The survey is conducted over two days, generally beginning on the first business day of each month. The newsletter is typically finished on the third day following completion of the survey and published on the tenth of a month. Every month, panelists provide two types of forecasts: (i) the average figure for the current calendar year and (ii) the average figure for the next calendar year. For instance, in January 2001 the survey contains forecasts for 2001 and 2002. In February 2001, the forecast horizon shrinks to 11 months for the current year, and to 23 months for the next year, and so on. The diminishing forecast horizon implies that the cross-sectional uncertainty measures computed from the individual responses display a visible seasonal pattern. To gauge uncertainty, every month we use the mean absolute deviation of individual forecasts. To remove the problem of seasonality, we adjust the series with an X-12 ARIMA filter. The consensus forecast is defined as the median of individual forecasts in a given month.
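The two cross-sectional statistics just described can be sketched for a single survey month as below. The sketch assumes that the mean absolute deviation is taken around the cross-sectional mean (the text does not say whether the mean or the median is the reference point), it skips missing responses, and it leaves out the X-12 ARIMA seasonal adjustment. The function name is hypothetical.

```python
import numpy as np

def consensus_and_uncertainty(forecasts):
    """Median consensus and mean absolute deviation of panelists' forecasts.

    forecasts: individual forecasts for one variable in one survey month;
    NaN marks a panelist who did not respond.
    """
    f = np.asarray(forecasts, dtype=float)
    f = f[~np.isnan(f)]
    consensus = float(np.median(f))
    # Assumption: deviations are measured from the cross-sectional mean.
    mad = float(np.mean(np.abs(f - f.mean())))
    return consensus, mad

consensus, mad = consensus_and_uncertainty([2.0, 2.5, 3.0, 4.5, np.nan])
```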

II. Model solution

We provide solutions for the general version of the model, which incorporates both correlation between the dW and dZ shocks and a general form of the market prices of risk. Based on arguments presented in the body of the paper, we analyze a restricted version of the model, in which the correlation parameter is set to zero and only dZ shocks are priced.

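II.A. Dependence between X and V factors

In the general case, $X_t$ and $V_t$ can be correlated, i.e.:

$$ dZ_X = dW\rho + \sqrt{1 - \rho'\rho}\; dB \tag{3} $$

$$ \phantom{dZ_X} = dW\rho + \tilde{\rho}\, dB, \tag{4} $$

where $dB$ is a (2 × 1)-vector of Brownian motions which is independent of $dW$, and $\rho$ is a (2 × 1)-vector such that $\rho \in [-1, 1]$ and $\rho'\rho < 1$ (e.g., da Fonseca, Grasselli, and Tebaldi, 2006; Buraschi, Porchia, and Trojani, 2010). We use the short notation $\tilde{\rho} := \sqrt{1 - \rho'\rho}$.

II.B. General form of the market prices of risk

Let us write the shocks to Y under the physical dynamics as (for brevity we omit the superscript P):

$$ dZ = \begin{pmatrix} dZ_X \\ dZ_f \end{pmatrix} = \begin{pmatrix} dW\rho + \tilde{\rho}\, dB \\ dZ_f \end{pmatrix} = \begin{pmatrix} dW\rho \\ 0 \end{pmatrix} + \underbrace{\begin{pmatrix} \tilde{\rho} I_{2\times 2} & 0_{2\times 1} \\ 0_{1\times 2} & 1_{1\times 1} \end{pmatrix}}_{R} \underbrace{\begin{pmatrix} dB \\ dZ_f \end{pmatrix}}_{d\tilde{Z}} = \begin{pmatrix} dW\rho \\ 0 \end{pmatrix} + R\, d\tilde{Z}, \tag{5} $$

where

$$ R = \begin{pmatrix} \tilde{\rho} I_{2\times 2} & 0_{2\times 1} \\ 0_{1\times 2} & 1_{1\times 1} \end{pmatrix}, \qquad d\tilde{Z} = \begin{pmatrix} dB \\ dZ_f \end{pmatrix}. \tag{6} $$

The change of drift is specified as: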

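$$ d\tilde{Z} = d\tilde{Z}^{Q} - \Lambda_{Y,t}\, dt \tag{7} $$

$$ dW = dW^{Q} - \Lambda_{V,t}\, dt \tag{8} $$

$$ \Lambda_{Y,t} = \Sigma_Y^{-1}(V_t) \left( \lambda_Y^0 + \lambda_Y^1 Y_t \right) \tag{9} $$

$$ \Lambda_{V,t} = \left( \sqrt{V_t} \right)^{-1} \Lambda_V^0 + \sqrt{V_t}\, \Lambda_V^1, \tag{10} $$

where $\lambda_Y^0$ and $\lambda_Y^1$ are an (n + 1)-vector and an (n + 1) × (n + 1) matrix of parameters, and $\Lambda_V^0$ and $\Lambda_V^1$ are n × n constant matrices. To exclude arbitrage, the market price of risk requires that the parameter matrix Q be invertible, so that $V_t$ stays in the positive-definite domain. This specification implies the risk-neutral dynamics of $Y_t$ given by:

$$ dY_t = \left[ \left( \mu_Y - \begin{pmatrix} \Lambda_V^0 \rho \\ 0 \end{pmatrix} - R\lambda_Y^0 \right) + \left( K_Y - R\lambda_Y^1 \right) Y_t - \begin{pmatrix} V_t \Lambda_V^1 \rho \\ 0 \end{pmatrix} \right] dt + \Sigma_Y(V_t)\, dZ_t^{Q}. \tag{11} $$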

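Let:

$$ \mu_Y^{Q} = \mu_Y - \begin{pmatrix} \Lambda_V^0 \rho \\ 0 \end{pmatrix} - R\lambda_Y^0 \tag{12} $$

$$ K_Y^{Q} = K_Y - R\lambda_Y^1. \tag{13} $$

The dynamics of $V_t$ is given as:

$$ dV_t = \left[ \Omega\Omega' - \Lambda_V^0 Q - Q'\Lambda_V^{0\prime} + \left( M - Q'\Lambda_V^{1\prime} \right) V_t + V_t \left( M' - \Lambda_V^1 Q \right) \right] dt + \sqrt{V_t}\, dW_t^{Q} Q + Q'\, dW_t^{Q\prime} \sqrt{V_t}. \tag{14} $$

Let

$$ \Omega^{Q}\Omega^{Q\prime} = \Omega\Omega' - \Lambda_V^0 Q - Q'\Lambda_V^{0\prime} = (k - 2v)\, Q'Q \tag{15} $$

$$ M^{Q} = M - Q'\Lambda_V^{1\prime}, \tag{16} $$

where, to preserve the same distribution under P and Q, we assume $\Lambda_V^0 = vQ'$ for a scalar $v$ such that $(k - 2v) > n - 1$.

II.C. Solution for bond prices

Since both components of the state vector, i.e. $Y_t$, $V_t$, are affine, bond prices are of the form:

$$ F(Y_t, V_t; t, \tau) = \exp\left\{ A(\tau) + B(\tau)' Y_t + Tr\left[ C(\tau) V_t \right] \right\}. \tag{17} $$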

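By the discounted Feynman-Kac theorem, the drift of dF equals rF, thus:

$$ \mathcal{L}_{\{Y,V\}} F + \frac{\partial F}{\partial t} = rF, \tag{18} $$

where $\mathcal{L}_{\{Y,V\}}$ is the joint infinitesimal generator of the couple $\{Y_t, V_t\}$ under the risk neutral measure. We have:

$$ \mathcal{L}_{\{Y,V\}} F = \left( \mathcal{L}_Y + \mathcal{L}_V + \mathcal{L}_{Y,V} \right) F \tag{19} $$

$$ \mathcal{L}_Y F = \frac{\partial F}{\partial Y'} \left[ \mu_Y^{Q} + K_Y^{Q} Y - \begin{pmatrix} V \Lambda_V^1 \rho \\ 0 \end{pmatrix} \right] + \frac{1}{2} Tr\left[ \frac{\partial^2 F}{\partial Y \partial Y'}\, \Sigma_Y(V) \Sigma_Y'(V) \right] \tag{20} $$

$$ \mathcal{L}_V F = Tr\left[ \left( \Omega^{Q}\Omega^{Q\prime} + M^{Q} V + V M^{Q\prime} \right) \mathcal{R} F + 2 V \mathcal{R} Q'Q \mathcal{R} F \right] \tag{21} $$

$$ \mathcal{L}_{Y,V} F = 2\, Tr\left[ \left( \mathcal{R} Q' \rho \frac{\partial}{\partial X'} \right) F\, V \right]. \tag{22} $$

$\mathcal{R}$ is a matrix differential operator: $\mathcal{R}_{ij} := \frac{\partial}{\partial V_{ij}}$. Substituting derivatives of (17) into (18) gives:

$$ \begin{aligned} & B_\tau' \left( \mu_Y^{Q} + K_Y^{Q} Y \right) - Tr\left[ \Lambda_V^1 \rho B_{X,\tau}' V \right] + \frac{1}{2} Tr\left[ B_{X,\tau} B_{X,\tau}' V \right] + \frac{1}{2} B_{f,\tau}^2 \sigma_f^2 \\ & \quad + Tr\left[ \Omega^{Q}\Omega^{Q\prime} C_\tau \right] + Tr\left[ \left( C_\tau M^{Q} + M^{Q\prime} C_\tau + 2 C_\tau Q'Q C_\tau \right) V \right] \\ & \quad + Tr\left[ \left( C_\tau Q' \rho B_{X,\tau}' + B_{X,\tau} \rho' Q C_\tau \right) V \right] \\ & = \frac{\partial A_\tau}{\partial \tau} + \frac{\partial B_\tau'}{\partial \tau} Y + Tr\left[ \frac{\partial C_\tau}{\partial \tau} V \right] + \gamma_0 + \gamma_Y' Y. \end{aligned} \tag{23–26} $$

By matching coefficients, we obtain the system of equations:

$$ \frac{\partial A}{\partial \tau} = B_\tau' \mu_Y^{Q} + \frac{1}{2} B_{f,\tau}^2 \sigma_f^2 + Tr\left[ \Omega^{Q}\Omega^{Q\prime} C_\tau \right] - \gamma_0 \tag{27} $$

$$ \frac{\partial B}{\partial \tau} = K_Y^{Q\prime} B_\tau - \gamma_Y \tag{28} $$

$$ \frac{\partial C}{\partial \tau} = \frac{1}{2} B_{X,\tau} B_{X,\tau}' + C_\tau \left( M^{Q} + Q' \rho B_{X,\tau}' \right) + \left( M^{Q\prime} + B_{X,\tau} \rho' Q \right) C_\tau + 2 C_\tau Q'Q C_\tau - \Lambda_V^1 \rho B_{X,\tau}'. \tag{29} $$

To obtain the solution provided in the text, set $\rho = 0_{2\times 1}$, $\Lambda_V^0 = 0_{2\times 2}$ and $\Lambda_V^1 = 0_{2\times 2}$.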
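For the restricted model (ρ = 0 and zero Λ_V matrices), the system (27)–(29) can be integrated numerically. The sketch below is an illustration rather than the authors' implementation. It assumes the terminal conditions A(0) = 0, B(0) = 0, C(0) = 0 (a bond pays one at maturity), it uses ΩΩ′ = kQ′Q, and it relies on a fixed-step Runge-Kutta scheme. The function name `bond_loadings` and all parameter values are hypothetical.

```python
import numpy as np

def bond_loadings(tau, mu_q, K_q, M, Q, k, sigma_f, gamma0, gamma_y,
                  steps=2000):
    """Integrate Eqs. (27)-(29) with rho = 0 and Lambda_V = 0 from 0 to tau."""
    n = M.shape[0]          # dimension of X and V (n = 2 in the text)
    QQ = Q.T @ Q

    def rhs(A, B, C):
        Bx, Bf = B[:n], B[n]
        dA = (B @ mu_q + 0.5 * Bf ** 2 * sigma_f ** 2
              + k * np.trace(QQ @ C) - gamma0)
        dB = K_q.T @ B - gamma_y
        dC = 0.5 * np.outer(Bx, Bx) + C @ M + M.T @ C + 2.0 * C @ QQ @ C
        return dA, dB, dC

    state = (0.0, np.zeros(n + 1), np.zeros((n, n)))   # assumed A(0), B(0), C(0)
    h = tau / steps
    for _ in range(steps):                             # classical RK4 steps
        k1 = rhs(*state)
        k2 = rhs(*(s + 0.5 * h * d for s, d in zip(state, k1)))
        k3 = rhs(*(s + 0.5 * h * d for s, d in zip(state, k2)))
        k4 = rhs(*(s + h * d for s, d in zip(state, k3)))
        state = tuple(s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
    return state

# Hypothetical parameter values, for illustration only.
M = np.array([[-1.0, 0.0], [0.2, -0.8]])
Q = np.diag([0.1, 0.2])
K_q = np.diag([-0.5, -0.3, -0.1])
A, B, C = bond_loadings(1.0, np.zeros(3), K_q, M, Q, k=3, sigma_f=0.01,
                        gamma0=0.02, gamma_y=np.ones(3))
```

Given the loadings, the model-implied yield at maturity τ follows from (17) as −(A + B′Y + Tr[CV])/τ. With a diagonal risk-neutral mean reversion matrix, each element of B has the closed form (γ/κ)(1 − e^{κτ}), which gives a convenient check on the integrator.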

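II.D. Instantaneous volatility of yields

The instantaneous volatility of yields is given as:

$$ \frac{1}{dt} \left\langle dy_t^{\tau_1}, dy_t^{\tau_2} \right\rangle = \frac{1}{\tau_1 \tau_2} \left\{ B_{f,\tau_2} B_{f,\tau_1} \sigma_f^2 + Tr\left[ \left( B_{X,\tau_1} B_{X,\tau_2}' + 2 C_{\tau_2} Q' \rho B_{X,\tau_1}' + 2 C_{\tau_1} Q' \rho B_{X,\tau_2}' + 4 C_{\tau_1} Q'Q C_{\tau_2} \right) V_t \right] \right\}. \tag{30} $$

Proof. The only term which requires clarification is $B_{\tau_1}' dY_t \times Tr\left[ C_{\tau_2} dV_t \right] = B_{X,\tau_1}' dX_t \times Tr\left[ C_{\tau_2} dV_t \right]$:

$$ \begin{aligned} B_{X,\tau_1}' dX \times Tr\left[ C_{\tau_2} dV \right] &= B_{X,\tau_1}' \sqrt{V}\, dZ_X \times Tr\left[ C_{\tau_2} \left( \sqrt{V}\, dW Q + Q'\, dW' \sqrt{V} \right) \right] \\ &= B_{X,\tau_1}' \sqrt{V} \left( dW \rho + \tilde{\rho}\, dB \right) \times 2\, Tr\left[ Q C_{\tau_2} \sqrt{V}\, dW \right] \\ &= 2\, Tr\left[ C_{\tau_2} Q' \rho B_{X,\tau_1}' V \right], \end{aligned} $$

where we use the following fact:

$$ Tr\left[ C \left( \sqrt{V}\, dW Q + Q'\, dW' \sqrt{V} \right) \right] = 2\, Tr\left[ Q C \sqrt{V}\, dW \right]. \tag{31} $$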

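II.E. Conditional covariance of X and V

We consider the conditional covariance matrix of X and V:

$$ d\left\langle \begin{pmatrix} X_{t,1} \\ X_{t,2} \\ f_t \end{pmatrix}, \begin{pmatrix} V_{t,11} \\ V_{t,12} \\ V_{t,22} \end{pmatrix} \right\rangle = \begin{pmatrix} d\langle X_1, V_{11} \rangle & d\langle X_1, V_{12} \rangle & d\langle X_1, V_{22} \rangle \\ d\langle X_2, V_{11} \rangle & d\langle X_2, V_{12} \rangle & d\langle X_2, V_{22} \rangle \\ d\langle f, V_{11} \rangle & d\langle f, V_{12} \rangle & d\langle f, V_{22} \rangle \end{pmatrix}. \tag{32} $$

The elements of the covariance matrix are given by:

$$ d\langle X_k, V_{ij} \rangle = \rho' \left( Q_{:,j} V_{ik} + Q_{:,i} V_{jk} \right), \tag{33} $$

where $Q_{:,j}$ denotes the j-th column of matrix Q.

Proof. The expression follows by simple algebra:

$$ \begin{aligned} \frac{1}{dt}\, d\langle V_{ij}, X_k \rangle &= \left[ e_i' \sqrt{V}\, dW Q e_j + e_i' Q'\, dW' \sqrt{V} e_j \right] e_k' \sqrt{V}\, dW \rho \\ &= Tr\left[ e_j e_i' \sqrt{V}\, dW Q \right] \times Tr\left[ \rho e_k' \sqrt{V}\, dW \right] + Tr\left[ e_j e_i' Q'\, dW' \sqrt{V} \right] \times Tr\left[ \rho e_k' \sqrt{V}\, dW \right] \\ &= vec\left( \sqrt{V} e_i e_j' Q' \right)' vec\left( \sqrt{V} e_k \rho' \right) + vec\left( \sqrt{V} e_j e_i' Q' \right)' vec\left( \sqrt{V} e_k \rho' \right) \\ &= Tr\left[ Q e_j e_i' V e_k \rho' \right] + Tr\left[ Q e_i e_j' V e_k \rho' \right] \\ &= \rho' \left( Q_{:,j} V_{ik} + Q_{:,i} V_{jk} \right), \end{aligned} \tag{34} $$

where $e_i$ is the i-th column of the identity matrix.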

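II.F. Discrete approximation to the unconditional covariance matrix of X and V

We can use the discretized dynamics of X and V to compute the unconditional covariance matrix:

$$ X_{t+\Delta t} = \bar{\mu}_{X,\Delta t} + \Phi_{X,\Delta t} X_t + \sqrt{V_t \Delta t} \left( U_{t+\Delta t}\, \rho + \tilde{\rho}\, b_{t+\Delta t} \right) \tag{35} $$

$$ V_{t+\Delta t} = k \bar{\mu}_{V,\Delta t} + \Phi_{V,\Delta t} V_t \Phi_{V,\Delta t}' + \sqrt{V_t \Delta t}\, U_{t+\Delta t} Q + Q' U_{t+\Delta t}' \sqrt{V_t \Delta t}, \tag{36} $$

where $U_t$ is a 2 × 2 matrix of Gaussian shocks, and $b_t$ is a 2-vector of Gaussian shocks. The covariance between X and V is computed as:

$$ Cov\left[ X, vec(V) \right] = E\left[ X\, vec(V)' \right] - E(X)\, E\left[ vec(V)' \right]. \tag{37} $$

The element $E\left[ X\, vec(V)' \right]$ reads:

$$ vec\, E\left[ X (vec V)' \right] = \left[ I_{n^3} - \left( \Phi_V \otimes \Phi_V \right) \otimes \Phi_X \right]^{-1} \left( vec A + vec B \right), \tag{38} $$

where A is given as:

$$ A = \bar{\mu}_X\, vec\left( k \bar{\mu}_V \right)' + \bar{\mu}_X\, vec\left( \Phi_V E(V_t) \Phi_V' \right)' + \Phi_X E(X_t)\, vec\left( k \bar{\mu}_V \right)', \tag{39} $$

and the element (k, ij) of matrix B, associated with the covariance of $X_k$ and $V_{ij}$, has the form:

$$ B_{k,ij} = \rho' \left( Q_{:,j} V_{ik} + Q_{:,i} V_{jk} \right) \Delta t, \qquad \text{where } B = \begin{pmatrix} B_{1,11} & B_{1,12} & B_{1,21} & B_{1,22} \\ B_{2,11} & B_{2,12} & B_{2,21} & B_{2,22} \end{pmatrix}. \tag{40} $$

Note that the second and third columns of B are identical.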

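III. Moments of the state variables

This section derives moments of the state variables necessary for the implementation of the unscented Kalman filter.

III.A. Moments of the Vt process

The first conditional moment of the volatility process $V_t$ is given as:

$$ E_t\left( V_{t+\Delta t} \right) = k \bar{\mu}_{V,\Delta t} + \Phi_{V,\Delta t} V_t \Phi_{V,\Delta t}', \tag{41} $$

where

$$ \Phi_{V,\Delta t} = e^{M \Delta t}, \qquad \bar{\mu}_{V,\Delta t} = \int_0^{\Delta t} e^{Ms} Q'Q e^{M's}\, ds = -\frac{1}{2} \hat{C}_{12}(\Delta t)\, \hat{C}_{11}'(\Delta t), \tag{42} $$

with

$$ \begin{pmatrix} \hat{C}_{11}(\Delta t) & \hat{C}_{12}(\Delta t) \\ \hat{C}_{21}(\Delta t) & \hat{C}_{22}(\Delta t) \end{pmatrix} = \exp\left[ \Delta t \begin{pmatrix} M & -2Q'Q \\ 0 & -M' \end{pmatrix} \right]. \tag{43} $$

Assuming stationarity (i.e. negative eigenvalues of M), the unconditional first moment of $V_t$ follows as:

$$ \lim_{\Delta t \to \infty} vec\, E_t\left( V_{t+\Delta t} \right) = k\, vec\left( \bar{\mu}_{V,\infty} \right) = -k \left[ (I \otimes M) + (M \otimes I) \right]^{-1} vec(Q'Q). \tag{44} $$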
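Expressions (42)–(44) translate into a few lines of linear algebra. The following is a sketch under stated assumptions rather than the authors' code: `expm` is a small self-contained matrix exponential (truncated Taylor series with scaling and squaring) written only to keep the example dependency-free, a library routine such as `scipy.linalg.expm` would normally be preferred, and the parameter values are made up.

```python
import numpy as np

def expm(A, terms=20):
    """Matrix exponential via scaling and squaring of a Taylor series."""
    A = np.asarray(A, dtype=float)
    norm = np.linalg.norm(A, 1)
    s = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    A_s = A / 2 ** s
    E = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for j in range(1, terms + 1):
        term = term @ A_s / j
        E = E + term
    for _ in range(s):
        E = E @ E
    return E

def mu_bar_V(M, Q, dt):
    """Integral in Eq. (42) from the block matrix exponential in Eq. (43)."""
    n = M.shape[0]
    H = np.block([[M, -2.0 * Q.T @ Q], [np.zeros((n, n)), -M.T]])
    C = expm(dt * H)
    C11, C12 = C[:n, :n], C[:n, n:]
    return -0.5 * C12 @ C11.T

def mu_bar_V_inf(M, Q):
    """Unconditional limit in Eq. (44), without the factor k."""
    n = M.shape[0]
    I = np.eye(n)
    lhs = np.kron(I, M) + np.kron(M, I)
    vec = -np.linalg.solve(lhs, (Q.T @ Q).reshape(-1, order="F"))
    return vec.reshape(n, n, order="F")

# Hypothetical parameters and a weekly step.
M = np.array([[-1.0, 0.0], [0.3, -0.5]])
Q = np.diag([0.2, 0.1])
dt = 1.0 / 52.0
mu_dt = mu_bar_V(M, Q, dt)
mu_inf = mu_bar_V_inf(M, Q)
```

Both results can be checked without numerical integration. Differentiating the integrand shows that `mu_dt` satisfies M µ̄ + µ̄ M′ = Φ Q′Q Φ′ − Q′Q, and `mu_inf` solves the Lyapunov equation M µ̄ + µ̄ M′ + Q′Q = 0.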

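The conditional and unconditional covariance matrices of $V_t$ read:

$$ Cov_t\left[ vec\left( V_{t+\Delta t} \right) \right] = \left( I_{n^2} + K_{n,n} \right) \left[ \Phi_{V,\Delta t} V_t \Phi_{V,\Delta t}' \otimes \bar{\mu}_{V,\Delta t} + k \left( \bar{\mu}_{V,\Delta t} \otimes \bar{\mu}_{V,\Delta t} \right) + \bar{\mu}_{V,\Delta t} \otimes \Phi_{V,\Delta t} V_t \Phi_{V,\Delta t}' \right] \tag{45} $$

$$ \lim_{\Delta t \to \infty} Cov_t\left[ vec\left( V_{t+\Delta t} \right) \right] = \left( I_{n^2} + K_{n,n} \right) k \left( \bar{\mu}_{V,\infty} \otimes \bar{\mu}_{V,\infty} \right). \tag{46} $$

$K_{n,n}$ is the commutation matrix with the property that $K_{n,n}\, vec(A) = vec(A')$. These moments are derived in Buraschi, Cieslak, and Trojani (2010) and thus are stated without proof.

Gourieroux, Jasiak, and Sufana (2009) show that when $\Omega\Omega' = kQ'Q$, k integer, the dynamics of $V_t$ can be represented as the sum of outer products of k independent Ornstein-Uhlenbeck processes with a zero long-run mean:

$$ V_t = \sum_{i=1}^{k} v_t^i v_t^{i\prime} \tag{47} $$

$$ v_{t+\Delta t}^i = \Phi_{V,\Delta t} v_t^i + \epsilon_{t+\Delta t}^i, \qquad \epsilon_t^i \sim N\left( 0, \bar{\mu}_{V,\Delta t} \right). \tag{48} $$

Taking the outer-product implies that the exact discretization of $V_t$ has the form:

$$ V_{t+\Delta t} = k \bar{\mu}_{V,\Delta t} + \Phi_{V,\Delta t} V_t \Phi_{V,\Delta t}' + u_{t+\Delta t}^V, \tag{49} $$

where the shock $u_{t+\Delta t}^V$ is a heteroskedastic martingale difference sequence.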
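The representation (47)–(48) suggests a simple way to simulate the volatility factors for integer k. The sketch below is illustrative only: the function names and the numerical values of the autoregressive matrix and of the shock covariance are hypothetical, and the k vectors are stored as the rows of an array.

```python
import numpy as np

def war_step(v, Phi, mu_bar, rng):
    """One step of Eq. (48) for all k vectors at once.

    v: array (k, n) whose rows are the vectors v_t^i.
    """
    chol = np.linalg.cholesky(mu_bar)
    eps = rng.standard_normal(v.shape) @ chol.T   # rows are N(0, mu_bar)
    return v @ Phi.T + eps                        # row i: Phi v^i + eps^i

def outer_sum(v):
    """V_t in Eq. (47): the sum over i of v^i v^i'."""
    return v.T @ v

rng = np.random.default_rng(0)
Phi = np.array([[0.9, 0.0], [0.1, 0.8]])          # hypothetical Phi_V
mu_bar = np.array([[0.04, 0.01], [0.01, 0.02]])   # hypothetical mu_bar_V
v0 = np.array([[0.3, 0.1], [-0.2, 0.4], [0.1, -0.3]])   # k = 3, n = 2
V0 = outer_sum(v0)

draws = np.array([outer_sum(war_step(v0, Phi, mu_bar, rng))
                  for _ in range(5000)])
```

Every simulated matrix is symmetric and positive semidefinite by construction, and the Monte Carlo average of `draws` should be close to k µ̄ + Φ V0 Φ′, the conditional mean in (49).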

III.B. Moments of the Yt dynamics

We assume that the dimension of $X_t$ is n = 2 and $f_t$ is a scalar process. Let $Y_t = (X_t', f_t)'$:

$$ dY_t = \left( \mu_Y + K_Y Y_t \right) dt + \Sigma_Y(V_t)\, dZ_t. \tag{50} $$

It is straightforward to show that the conditional and unconditional first moments of $Y_t$ have the form:

$$ E_t\left( Y_{t+\Delta t} \right) = \left( e^{K_Y \Delta t} - I \right) K_Y^{-1} \mu_Y + e^{K_Y \Delta t} Y_t \tag{51} $$

$$ \lim_{\Delta t \to \infty} E_t\left( Y_{t+\Delta t} \right) = -K_Y^{-1} \mu_Y, \tag{52} $$

where $K_Y$ is assumed to be lower triangular with negative eigenvalues. To compute the conditional covariance of $Y_t$, let $V_Y(t,T) := Cov_t(Y_T)$. Following Fisher and Gilles (1996), the application of Ito's lemma to $\hat{Y}(t,T) := E_t(Y_T)$ reveals that:

$$ d\hat{Y}(t,T) = \hat{\sigma}_Y(t,T)\, dZ_t, \tag{53} $$

where

$$ \hat{\sigma}_Y(t,T) := \Phi_Y(t,T)\, \Sigma_Y(V_t), \tag{54} $$

with $\Phi_Y(t,T) = e^{K_Y (T-t)}$ and

$$ \Sigma_Y(V_t) = \begin{pmatrix} \sqrt{V_t} & 0 \\ 0 & \sigma_f \end{pmatrix}. \tag{55} $$

Then, integrating $d\hat{Y}(t,T)$ yields:

$$ Y_T = \hat{Y}_{T,T} = \hat{Y}_{t,T} + \int_{s=t}^{T} \hat{\sigma}_Y(s,T)\, dZ_s. \tag{56} $$

Therefore, we have:

$$ V_Y(t,T) = Cov_t\left[ \int_{s=t}^{T} \hat{\sigma}_Y(s,T)\, dZ_s \right] = E_t\left[ \int_{s=t}^{T} \hat{\sigma}_Y(s,T)\, \hat{\sigma}_Y(s,T)'\, ds \right] \tag{57} $$

$$ \phantom{V_Y(t,T)} = \int_{s=t}^{T} \Phi_Y(s,T)\, E_t \begin{pmatrix} V_s & 0 \\ 0 & \sigma_f^2 \end{pmatrix} \Phi_Y'(s,T)\, ds. \tag{58} $$

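Note that since $K_Y$ is lower triangular, $\Phi_Y(t,T) = e^{K_Y (T-t)}$ is also lower triangular, and we have:

$$ \Phi_Y(t,T) = \begin{pmatrix} \Phi_X(t,T) & 0 \\ \Phi_{fX}(t,T) & \Phi_f(t,T) \end{pmatrix}. \tag{59} $$

Let us for convenience define two matrices:

$$ M_{1Y}(t,T) = \begin{pmatrix} \Phi_X(t,T) \otimes \Phi_X(t,T) \\ \Phi_X(t,T) \otimes \Phi_{fX}(t,T) \\ \Phi_{fX}(t,T) \otimes \Phi_X(t,T) \\ \Phi_{fX}(t,T) \otimes \Phi_{fX}(t,T) \end{pmatrix} \quad \text{and} \quad M_{0Y}(t,T) = \begin{pmatrix} 0_{8\times 1} \\ \Phi_f^2(t,T)\, \sigma_f^2 \end{pmatrix}. \tag{60} $$

With the help of simple matrix algebra applied to (58), the conditional covariance of $Y_t$ has the (vectorized) form:

$$ vec\, V_Y(t,T) = \int_{s=t}^{T} M_{1Y}(s,T) \left[ \Phi_V(s,T) \otimes \Phi_V(s,T) \right] ds \times vec(V_t) \tag{61} $$

$$ \phantom{vec\, V_Y(t,T) =} + k \int_{s=t}^{T} M_{1Y}(s,T)\, vec\left[ \bar{\mu}_V(t,s) \right] ds + \int_{s=t}^{T} M_{0Y}(s,T)\, ds. \tag{62} $$

The unconditional covariance of Y is given as:

$$ \lim_{T \to \infty} vec\, V_Y(t,T) = \lim_{T \to \infty} \left\{ \int_{s=t}^{T} k M_{1Y}(s,T)\, vec\left[ \bar{\mu}_V(t,s) \right] ds + \int_{s=t}^{T} M_{0Y}(s,T)\, ds \right\}. \tag{63} $$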

This expression exists if the mean reversion matrices M and $K_Y$ are negative definite. The expressions for the conditional mean (51) and covariance (62) give rise to an exact discretization of the process $Y_t$.

Remark 1. In order to avoid the numerical integration, we can resort to a discrete-time approximation of the unconditional covariance matrix of Y factors. To this end, we discretize the dynamics

$$ dY_t = \left( \mu_Y + K_Y Y_t \right) dt + \Sigma_Y(V_t)\, dZ_t \tag{64} $$

as

$$ Y_{t+\Delta t} = \bar{\mu}_{Y,\Delta t} + \Phi_{Y,\Delta t} Y_t + \Sigma_Y(V_t) \sqrt{\Delta t}\, \varepsilon_{t+\Delta t}, \tag{65} $$

where $\bar{\mu}_{Y,\Delta t} = \left( e^{K_Y \Delta t} - I \right) K_Y^{-1} \mu_Y$. The second moment of the discretized dynamics is straightforward to obtain as:

$$ vec\, E\left( Y Y' \right) = \left( I - \Phi_{Y,\Delta t} \otimes \Phi_{Y,\Delta t} \right)^{-1} \times vec\left[ \bar{\mu}_{Y,\Delta t} \bar{\mu}_{Y,\Delta t}' + \bar{\mu}_{Y,\Delta t} E(Y') \Phi_{Y,\Delta t}' + \Phi_{Y,\Delta t} E(Y) \bar{\mu}_{Y,\Delta t}' + E\left[ \Sigma_Y(V_t) \Sigma_Y(V_t)' \right] \Delta t \right] \tag{66} $$

$$ vec\left[ Var(Y) \right] = vec\, E\left( Y Y' \right) - vec\left[ E(Y) E(Y)' \right]. $$

We check that for the weekly discretization step $\Delta t = \frac{1}{52}$ this approximation works well, and implies a significant reduction of the computational time.
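The approximation in Remark 1 amounts to solving one linear system. A minimal sketch follows, assuming that the unconditional mean of Σ_Y Σ_Y′ is supplied by the user (for example, a block-diagonal matrix built from the unconditional mean of V and σ_f²). The function name and the numerical inputs are hypothetical.

```python
import numpy as np

def unconditional_moments(mu_bar, Phi, E_SS, dt):
    """Unconditional mean and variance of Y from Eqs. (65)-(66).

    mu_bar: intercept of the discretized dynamics, Phi: autoregressive
    matrix, E_SS: unconditional mean of Sigma_Y Sigma_Y', dt: time step.
    """
    m = len(mu_bar)
    EY = np.linalg.solve(np.eye(m) - Phi, mu_bar)    # E(Y) = mu_bar + Phi E(Y)
    rhs = (np.outer(mu_bar, mu_bar) + np.outer(mu_bar, EY) @ Phi.T
           + Phi @ np.outer(EY, mu_bar) + E_SS * dt)
    vec_EYY = np.linalg.solve(np.eye(m * m) - np.kron(Phi, Phi),
                              rhs.reshape(-1, order="F"))
    EYY = vec_EYY.reshape(m, m, order="F")
    return EY, EYY - np.outer(EY, EY)

# Hypothetical inputs with a weekly step.
dt = 1.0 / 52.0
Phi = np.array([[0.98, 0.0, 0.0], [0.0, 0.95, 0.0], [0.01, 0.02, 0.99]])
mu_bar = np.array([0.001, 0.0, 0.002])
E_SS = np.diag([0.5, 0.3, 0.01])
EY, VarY = unconditional_moments(mu_bar, Phi, E_SS, dt)
```

The resulting variance satisfies Var(Y) = Φ Var(Y) Φ′ + E[Σ_Y Σ_Y′]Δt, which is the discrete-time counterpart of the integral in (63).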

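IV. Model estimation

IV.A. Discretization and vectorization of the state space

This section collects the details about the vectorization of the transition dynamics for $Y_t$ and $V_t$. Parameters for the discretized transition dynamics of $Y_t$ are given by:

$$ \bar{\mu}_{Y,\Delta t} = \left( e^{K_Y \Delta t} - I \right) K_Y^{-1} \mu_Y \tag{67} $$

$$ \Phi_{Y,\Delta t} = e^{K_Y \Delta t}. \tag{68} $$

Parameter matrices $\Phi_{V,\Delta t}$ and $\bar{\mu}_{V,\Delta t}$ for the discretized transition dynamics of $V_t$ are given by:

$$ \bar{\mu}_{V,\Delta t} = \int_0^{\Delta t} \Phi_{V,s}\, Q'Q\, \Phi_{V,s}'\, ds \tag{69} $$

$$ \Phi_{V,\Delta t} = e^{M \Delta t}. \tag{70} $$

The closed form solution for the integral $\bar{\mu}_{V,\Delta t}$ is given by

$$ \int_0^{\Delta t} \Phi_{V,s}\, Q'Q\, \Phi_{V,s}'\, ds = -\hat{C}_{12}(\Delta t)\, \hat{C}_{11}'(\Delta t), $$

where

$$ \begin{pmatrix} \hat{C}_{11}(\Delta t) & \hat{C}_{12}(\Delta t) \\ \hat{C}_{21}(\Delta t) & \hat{C}_{22}(\Delta t) \end{pmatrix} = \exp\left[ \Delta t \begin{pmatrix} M & -Q'Q \\ 0 & -M' \end{pmatrix} \right]. $$

See Van Loan (1978) for the proof. We recast the discretized covariance matrix dynamics $V_t$ in a vector form:

$$ vec\left( V_{t+\Delta t} \right) = k\, vec\left( \bar{\mu}_{V,\Delta t} \right) + \left( \Phi_{V,\Delta t} \otimes \Phi_{V,\Delta t} \right) vec(V_t) + vec\left( u_{t+\Delta t}^V \right). \tag{71} $$

Since the process $V_t$ lives in the space of symmetric matrices, its lower triangular part preserves all information. Let us for convenience define two linear transformations of some symmetric matrix A: (i) an elimination matrix: $E_n\, vec(A) = vech(A)$, where $vech(\cdot)$ denotes half-vectorization, (ii) a duplication matrix: $D_n\, vech(A) = vec(A)$. Using half-vectorization, we define $\bar{V}_t := vech(V_t) = E_n\, vec(V_t)$, which contains $\bar{n} = n(n+1)/2$ unique elements of $V_t$:

$$ \bar{V}_{t+\Delta t} = k E_n\, vec\left( \bar{\mu}_{V,\Delta t} \right) + E_n \left( \Phi_{V,\Delta t} \otimes \Phi_{V,\Delta t} \right) D_n \bar{V}_t + E_n\, vec\left( u_{t+\Delta t}^V \right). \tag{72} $$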
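The elimination and duplication matrices used in (72) are easy to construct explicitly. The sketch below assumes column-major vectorization and a half-vectorization that stacks the lower-triangular columns. The function names are hypothetical.

```python
import numpy as np

def _lower_index(n):
    """(row, column) pairs of the lower triangle, column by column."""
    return [(i, j) for j in range(n) for i in range(j, n)]

def elimination_matrix(n):
    """E_n such that E_n vec(A) = vech(A)."""
    idx = _lower_index(n)
    E = np.zeros((len(idx), n * n))
    for r, (i, j) in enumerate(idx):
        E[r, j * n + i] = 1.0          # position of A[i, j] in vec(A)
    return E

def duplication_matrix(n):
    """D_n such that D_n vech(A) = vec(A) for symmetric A."""
    idx = _lower_index(n)
    D = np.zeros((n * n, len(idx)))
    for r, (i, j) in enumerate(idx):
        D[j * n + i, r] = 1.0          # A[i, j]
        D[i * n + j, r] = 1.0          # A[j, i], the mirrored element
    return D

n = 2
E_n, D_n = elimination_matrix(n), duplication_matrix(n)
V = np.array([[1.0, 2.0], [2.0, 3.0]])
vech_V = E_n @ V.reshape(-1, order="F")

# Transition matrix of the half-vectorized state in Eq. (72),
# for a hypothetical autoregressive matrix Phi_V.
Phi_V = np.array([[0.9, 0.0], [0.1, 0.8]])
B_V = E_n @ np.kron(Phi_V, Phi_V) @ D_n
```

Here `B_V @ vech_V` reproduces the half-vectorization of Φ V Φ′, so the n̄ × n̄ matrix `B_V` carries the same information as the n² × n² Kronecker product in (71).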

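Collecting all elements, we can redefine the state as $S_t = (Y_t', \bar{V}_t')'$, whose transition is described by the conditional mean:

$$ E_t\left( S_{t+\Delta t} \right) = \begin{pmatrix} \left( e^{K_Y \Delta t} - I \right) K_Y^{-1} \mu_Y + e^{K_Y \Delta t} Y_t \\ k E_n\, vec\left( \bar{\mu}_{V,\Delta t} \right) + E_n \left( \Phi_{V,\Delta t} \otimes \Phi_{V,\Delta t} \right) D_n \bar{V}_t \end{pmatrix}, \tag{73} $$

and the conditional covariance of the form:

$$ Cov_t\left( S_{t+\Delta t} \right) = \begin{pmatrix} Cov_t\left( Y_{t+\Delta t} \right) & 0_{n \times \bar{n}} \\ 0_{\bar{n} \times n} & Cov_t\left( \bar{V}_{t+\Delta t} \right) \end{pmatrix}. \tag{74} $$

The block diagonal structure in the last expression follows from our assumption that shocks in $Y_t$ be independent of shocks in $V_t$. The respective blocks are given as:

$$ Cov_t\left( Y_{t+\Delta t} \right) = \Sigma_Y(V_t)\, \Sigma_Y(V_t)'\, \Delta t \tag{75} $$

$$ Cov_t\left( \bar{V}_{t+\Delta t} \right) = E_n\, Cov_t\left( V_{t+\Delta t} \right) E_n' = E_n \left( I_{n^2} + K_{n,n} \right) \left[ \Phi_{V,\Delta t} V_t \Phi_{V,\Delta t}' \otimes \bar{\mu}_{V,\Delta t} + k \left( \bar{\mu}_{V,\Delta t} \otimes \bar{\mu}_{V,\Delta t} \right) + \bar{\mu}_{V,\Delta t} \otimes \Phi_{V,\Delta t} V_t \Phi_{V,\Delta t}' \right] E_n', \tag{76} $$

where $K_{n,n}$ denotes a commutation matrix (see e.g., Magnus and Neudecker, 1979). Buraschi, Cieslak, and Trojani (2010) provide the derivation of the last expression.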
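The conditional covariance of vec(V) inside (76) can be evaluated directly once the commutation matrix is available. The following sketch assumes column-major vectorization. The function names and the numerical inputs are hypothetical.

```python
import numpy as np

def commutation_matrix(n):
    """K_{n,n} such that K vec(A) = vec(A') for an n x n matrix A."""
    K = np.zeros((n * n, n * n))
    for i in range(n):
        for j in range(n):
            K[i * n + j, j * n + i] = 1.0
    return K

def cov_vec_V(V, Phi, mu_bar, k):
    """Conditional covariance of vec(V_{t+dt}), the inner part of Eq. (76)."""
    n = V.shape[0]
    PVP = Phi @ V @ Phi.T
    core = (np.kron(PVP, mu_bar) + k * np.kron(mu_bar, mu_bar)
            + np.kron(mu_bar, PVP))
    return (np.eye(n * n) + commutation_matrix(n)) @ core

A = np.array([[1.0, 2.0], [3.0, 4.0]])
K = commutation_matrix(2)

Phi = np.array([[0.9, 0.0], [0.1, 0.8]])          # hypothetical Phi_V
mu_bar = np.array([[0.04, 0.01], [0.01, 0.02]])   # hypothetical mu_bar_V
V = np.array([[0.5, 0.1], [0.1, 0.3]])
cov = cov_vec_V(V, Phi, mu_bar, k=3)
cov_at_zero = cov_vec_V(np.zeros((2, 2)), Phi, mu_bar, k=3)
```

When V_t = 0 the expression collapses to the covariance of a Wishart matrix with k degrees of freedom and scale µ̄, so the variance of the first diagonal element equals 2k µ̄₁₁². Pre- and post-multiplying `cov` by the elimination matrix E_n and its transpose then gives the block that enters (74).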

IV.B. Econometric identification

This section details our econometric identification procedure and parameter restrictions. To ensure econometric identification, we consider invariant model transformations of the type Ỹt = v + LYt and Ṽt = LVt L′, for a scalar v and an invertible matrix L. Such transformations result in the equivalence of the state variables, the short rate and thus yields (Dai and Singleton, 2000). If allowed, they can invalidate the results of an estimation.

To prevent the invariance, we adopt several normalizations for the physical dynamics of the process Yt:
(i) Setting µY = 0 allows us to treat γ0 as a free parameter.
(ii) Restricting γf = 1 makes σf identified.
(iii) Since both KX and Vt determine interactions between the elements of Xt, they are not separately identifiable. We set KX to a diagonal matrix, and allow correlations of the Xt factors to be generated solely by Vt. By the same token, the last row of matrix KY, i.e. (KfX, Kf), is left unrestricted, as ft does not interact with Xt via the diffusion term.

The identification of volatility factors Vt is ensured with three restrictions: (i) M is lower triangular, (ii) Q is diagonal with positive elements, and (iii) the diagonal elements of Q are uniquely determined by setting γX = 1n×1, where 1n×1 is a vector of ones. These normalizations protect Vt against affine transformations and orthonormal rotations of Brownian motions. Finally, to guarantee the stationarity of the state, we require that the mean reversion matrices KY and M be negative definite. Due to the lower triangular structure of both, this is equivalent to restricting the diagonal elements of each matrix to be negative.

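IV.C. Implementation of the filter

This section summarizes the algorithm for the unscented Kalman filtering. We recast the transition and measurement equations above into one state space. The compound transition equation is given by:

$$ S_{t+\Delta t} = A + B S_t + \varepsilon_{t+\Delta t}, \tag{77} $$

and the compound measurement equation is given by:

$$ m_t = h(S_t; \Theta) + \vartheta_t. \tag{78} $$

$S_t = (Y_t', \bar{V}_t')'$ and A are $(n + \bar{n} + 1) \times 1$-dimensional vectors, where A is given by:

$$ A = \begin{pmatrix} \left( \Phi_{Y,\Delta t} - I \right) K_Y^{-1} \mu_Y \\ k \cdot E_n\, vec\left( \bar{\mu}_{V,\Delta t} \right) \end{pmatrix}. \tag{79} $$

B is a block-diagonal matrix of the form:

$$ B = \begin{pmatrix} \Phi_{Y,\Delta t} & 0_{n \times \bar{n}} \\ 0_{\bar{n} \times n} & E_n \left( \Phi_{V,\Delta t} \otimes \Phi_{V,\Delta t} \right) D_n \end{pmatrix}. \tag{80} $$

The vector of shocks is of the form:

$$ \varepsilon_{t+\Delta t} = \begin{pmatrix} u_{t+\Delta t}^Y \\ E_n\, vec\left( u_{t+\Delta t}^V \right) \end{pmatrix}, \tag{81} $$

and its covariance matrix is given by a block-diagonal matrix:

$$ Cov_t\left( \varepsilon_{t+\Delta t} \right) = \begin{pmatrix} Cov_t\left( Y_{t+\Delta t} \right) & 0_{n \times \bar{n}} \\ 0_{\bar{n} \times n} & Cov_t\left( \bar{V}_{t+\Delta t} \right) \end{pmatrix}. \tag{82} $$

$m_t$ is a vector of observed yields and volatility measures given by $m_t = (y_t^{\tau}, v_t^{\tau_i, \tau_j})'$. Model implied yields and volatilities are affine in the state vector. Function $h(\cdot)$ translates the state variables to model implied yields and volatilities:

$$ h(S_t; \Theta) = \begin{pmatrix} f(S_t; \Theta) \\ g(V_t; \Theta) \end{pmatrix}. \tag{83} $$

The vector of measurement errors:

$$ \vartheta_t = \begin{pmatrix} \sqrt{R_y}\, e_t^y \\ \sqrt{R_v}\, e_t^v \end{pmatrix} \tag{84} $$

is Gaussian. Its covariance matrix, for six yields and three volatility measurements, is given by:

$$ Cov(\vartheta_t) = \begin{pmatrix} \sigma_y^2 I_6 & 0_{6 \times 3} \\ 0_{3 \times 6} & diag\left( \sigma_{i,v}^2 \right)_{i=1,2,3} \end{pmatrix}. \tag{85} $$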

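The core of the UKF is the unscented transformation, which approximates the distribution of a nonlinear transformation of a random variable by a set of sample points. In the UKF framework, we apply the unscented transformation recursively to B and $h(\cdot)$. We define $L_S := n + \bar{n} + 1$. Assume that we know the mean $\bar{S}$ and the covariance $P_S$ of $S_t$ at each point in time t. We form a matrix $\mathcal{S}$ of $2L_S + 1$ sigma vectors:

$$ \mathcal{S}_0 = \bar{S} \tag{86} $$

$$ \mathcal{S}_i = \bar{S} + \left( \sqrt{(L_S + \lambda) P_S} \right)_i, \quad i = 1, \ldots, L_S \tag{87} $$

$$ \mathcal{S}_i = \bar{S} - \left( \sqrt{(L_S + \lambda) P_S} \right)_{i - L_S}, \quad i = L_S + 1, \ldots, 2L_S, \tag{88} $$

where $\lambda = \alpha^2 (L_S - \kappa) - L_S$ is a scaling parameter governing the spread of sigma points around the mean and $\left( \sqrt{(L_S + \lambda) P_S} \right)_i$ is the i-th column of the matrix square root of $(L_S + \lambda) P_S$. Sigma points $\mathcal{S}$ are propagated through function $h(\cdot)$ to get $\mathcal{M}$. The first two moments of $m_t$ are approximated by:

$$ \bar{m} \approx \sum_{i=0}^{2L_S} W_i^{\mu} \mathcal{M}_i \tag{89} $$

$$ P_m \approx \sum_{i=0}^{2L_S} W_i^{\sigma} \left( \mathcal{M}_i - \bar{m} \right) \left( \mathcal{M}_i - \bar{m} \right)', \tag{90} $$

where $W^{\mu}$ and $W^{\sigma}$ denote weights for the mean and the covariance matrix, respectively, and are defined as:

$$ W_0^{\mu} = \frac{\lambda}{L_S + \lambda} \tag{91} $$

$$ W_0^{\sigma} = \frac{\lambda}{L_S + \lambda} + 1 - \alpha^2 + \beta \tag{92} $$

$$ W_i^{\mu} = W_i^{\sigma} = \frac{1}{2 (L_S + \lambda)}, \quad i = 1, \ldots, 2L_S. \tag{93} $$

Parameters α and β mainly determine higher moments of the distribution.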
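The unscented transformation can be sketched as a single function. This is an illustration under stated assumptions, not the authors' implementation: the matrix square root is taken to be the Cholesky factor, the scaling parameter λ is computed exactly as written in the text (Wan and van der Merwe, 2001, write it with +κ, and the two forms coincide for κ = 0), the weights on the non-central points are 1/(2(L_S + λ)) so that the mean weights sum to one, and the default values of α, β and κ are arbitrary.

```python
import numpy as np

def unscented_transform(func, mean, cov, alpha=0.5, beta=2.0, kappa=0.0):
    """Approximate the mean and covariance of func(S), with S ~ (mean, cov)."""
    mean = np.asarray(mean, dtype=float)
    L = mean.size
    lam = alpha ** 2 * (L - kappa) - L              # scaling, as in the text
    root = np.linalg.cholesky((L + lam) * np.asarray(cov, dtype=float))

    # 2L + 1 sigma points stored as columns, Eqs. (86)-(88)
    pts = np.column_stack([mean]
                          + [mean + root[:, i] for i in range(L)]
                          + [mean - root[:, i] for i in range(L)])

    # weights, Eqs. (91)-(93)
    w_mu = np.full(2 * L + 1, 1.0 / (2.0 * (L + lam)))
    w_sig = w_mu.copy()
    w_mu[0] = lam / (L + lam)
    w_sig[0] = lam / (L + lam) + 1.0 - alpha ** 2 + beta

    # propagate the points and form the moments, Eqs. (89)-(90)
    images = np.column_stack([func(pts[:, i]) for i in range(2 * L + 1)])
    m_bar = images @ w_mu
    dev = images - m_bar[:, None]
    P_m = (dev * w_sig) @ dev.T
    return m_bar, P_m

# For an affine map the approximation is exact, which gives a simple check.
mean = np.array([1.0, -2.0, 0.5])
cov = np.array([[2.0, 0.3, 0.0], [0.3, 1.0, 0.2], [0.0, 0.2, 0.5]])
H = np.array([[1.0, 2.0, 0.0], [0.0, -1.0, 3.0]])
c = np.array([0.5, -1.0])
m_bar, P_m = unscented_transform(lambda s: H @ s + c, mean, cov)
```

For a nonlinear measurement function, such as the map from the state to yield volatilities, the same call returns the approximate moments that the filter uses in place of a linearization.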

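The UKF Algorithm

1. Initialize at unconditional moments:²

$$ \hat{S}_0 = E[S_0] \tag{94} $$

$$ P_{S_0} = E\left[ (S_0 - \hat{S}_0)(S_0 - \hat{S}_0)' \right] \tag{95} $$

for $k \in 1, \ldots, \infty$:

2. Compute the sigma points:

$$ \mathcal{S}_{k-1} = \left[ \hat{S}_{k-1} \quad \hat{S}_{k-1} + \sqrt{(L_S + \lambda) P_{S,k-1}} \quad \hat{S}_{k-1} - \sqrt{(L_S + \lambda) P_{S,k-1}} \right] \tag{96} $$

3. Time update:

$$ \mathcal{S}_{k|k-1}^{a} = B\left( \mathcal{S}_{k-1} \right) \tag{97} $$

$$ \hat{S}_k^{-} = \sum_{i=0}^{2L_S} W_i^{\mu} \mathcal{S}_{i,k|k-1}^{a} \tag{98} $$

$$ P_{S_k}^{-} = \sum_{i=0}^{2L_S} W_i^{\sigma} \left( \mathcal{S}_{i,k|k-1}^{a} - \hat{S}_k^{-} \right) \left( \mathcal{S}_{i,k|k-1}^{a} - \hat{S}_k^{-} \right)' + Cov_t\left( \varepsilon_{t+\Delta t} \right) \tag{99} $$

4. Augment sigma points:

$$ \mathcal{S}_{k|k-1} = \left[ \mathcal{S}_{k|k-1}^{a} \quad \mathcal{S}_{0,k|k-1}^{a} + \sqrt{(L_S + \lambda)\, Cov_t(\varepsilon_{t+\Delta t})} \quad \mathcal{S}_{0,k|k-1}^{a} - \sqrt{(L_S + \lambda)\, Cov_t(\varepsilon_{t+\Delta t})} \right] \tag{100} $$

$$ \mathcal{M}_{k|k-1} = h\left( \mathcal{S}_{k|k-1} \right) \tag{101} $$

$$ \hat{m}_k^{-} = \sum_{i=0}^{2L_S} W_i^{\mu} \mathcal{M}_{i,k|k-1} \tag{102} $$

5. Measurement equations update:

$$ P_{m_k}^{-} = \sum_{i=0}^{2L_S} W_i^{\sigma} \left( \mathcal{M}_{i,k|k-1} - \hat{m}_k^{-} \right) \left( \mathcal{M}_{i,k|k-1} - \hat{m}_k^{-} \right)' + Cov_t\left( \vartheta_{t+\Delta t} \right) \tag{103} $$

$$ P_{S_k m_k} = \sum_{i=0}^{2L_S} W_i^{\sigma} \left( \mathcal{S}_{i,k|k-1} - \hat{S}_k^{-} \right) \left( \mathcal{M}_{i,k|k-1} - \hat{m}_k^{-} \right)' \tag{104} $$

$$ K_k = P_{S_k m_k} \left( P_{m_k}^{-} \right)^{-1} \tag{105} $$

$$ \hat{S}_k = \hat{S}_k^{-} + K_k \left( m_k - \hat{m}_k^{-} \right) \tag{106} $$

$$ P_{S_k} = P_{S_k}^{-} - K_k P_{m_k}^{-} K_k'. \tag{107} $$

² We borrow the algorithm from Wan and van der Merwe (2001).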
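The last three steps of the measurement update, Eqs. (105)–(107), can be sketched as below. The function name and argument names are hypothetical. The predicted moments are assumed to come from the time update and from the sigma-point sums in (103)–(104).

```python
import numpy as np

def measurement_update(S_pred, P_pred, m_pred, P_m, P_sm, m_obs):
    """Kalman gain and updated state moments, Eqs. (105)-(107).

    S_pred, P_pred: predicted state mean and covariance;
    m_pred, P_m: predicted measurement mean and covariance;
    P_sm: state-measurement cross-covariance; m_obs: observed measurements.
    """
    # K = P_sm P_m^{-1}, computed with a linear solve instead of an inverse
    K = np.linalg.solve(P_m.T, P_sm.T).T
    S_new = S_pred + K @ (m_obs - m_pred)
    P_new = P_pred - K @ P_m @ K.T
    return S_new, P_new, K

# Scalar example: unit prior variance, direct observation, unit noise variance.
S_new, P_new, K = measurement_update(
    S_pred=np.array([0.0]), P_pred=np.array([[1.0]]),
    m_pred=np.array([0.0]), P_m=np.array([[2.0]]),
    P_sm=np.array([[1.0]]), m_obs=np.array([1.0]))
```

In this scalar case the gain is one half, so the updated mean lies halfway between the prediction and the observation and the variance is halved.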

IV.D. Pseudo-maximum likelihood estimation

Collecting all measurements in vector $m_{t+1}$, let $\hat{m}_{t+1}^{-}$ and $P_{m,t+1}^{-}$ denote the time-t forecasts of the time-(t + 1) values of the measurement series and of their conditional covariance, respectively, as returned by the filter (for convenience 1 means one week). By normality of measurement errors, we can compute the quasi-log likelihood value for each time point in our sample:

$$ l_{t+1}(\Theta) = -\frac{1}{2} \ln \left| P_{m,t+1}^{-} \right| - \frac{1}{2} \left( \hat{m}_{t+1}^{-} - m_{t+1} \right)' \left( P_{m,t+1}^{-} \right)^{-1} \left( \hat{m}_{t+1}^{-} - m_{t+1} \right), \tag{108} $$

and obtain parameter estimates by maximizing the criterion:

$$ \hat{\Theta} := \arg\max_{\Theta} \mathcal{L}\left( \Theta, \{m_t\}_{t=1}^{T} \right) \quad \text{with} \quad \mathcal{L}\left( \Theta, \{m_t\}_{t=1}^{T} \right) = \sum_{t=0}^{T-1} l_{t+1}(\Theta), \tag{109} $$

with T = 1003 weeks. The initial log-likelihood is evaluated at the unconditional moments of the state vector (see Section III for the expressions).
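The criterion in (108)–(109) can be sketched as follows. The function names are hypothetical, and the inputs are assumed to be the one-step-ahead forecasts and forecast covariances returned by the filter at a given parameter vector. As in (108), the constant term of the Gaussian density is omitted.

```python
import numpy as np

def quasi_loglik_term(m_obs, m_pred, P_m):
    """Single-period contribution l_{t+1} in Eq. (108)."""
    err = np.asarray(m_pred, dtype=float) - np.asarray(m_obs, dtype=float)
    _, logdet = np.linalg.slogdet(P_m)
    return -0.5 * logdet - 0.5 * err @ np.linalg.solve(P_m, err)

def quasi_loglik(m_obs_seq, m_pred_seq, P_m_seq):
    """Sum of the contributions over the sample, Eq. (109)."""
    return sum(quasi_loglik_term(m, mp, P)
               for m, mp, P in zip(m_obs_seq, m_pred_seq, P_m_seq))

# One scalar measurement with forecast error 2 and forecast variance 4.
term = quasi_loglik_term([1.0], [3.0], np.array([[4.0]]))
total = quasi_loglik([[1.0], [1.0]], [[3.0], [3.0]], [np.array([[4.0]])] * 2)
```

A numerical optimizer would then search over the parameter vector for the maximum of this sum, with the filter rerun at every candidate value.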

References

Andersen, T. G., and L. Benzoni (2010): “Do Bonds Span Volatility Risk in the US Treasury Market? A Specification Test for Affine Term Structure Models,” Journal of Finance, 65, 603–655.

Audrino, F., and F. Corsi (2007): “Realized Correlation Tick-by-Tick,” Working paper, University of St. Gallen.

Audrino, F., and F. Corsi (2008): “Realized Covariance Tick-by-Tick in Presence of Rounded Time Stamps and General Microstructure Effects,” Discussion paper no. 2008-04.

Bandi, F., and J. Russell (2008): “Microstructure Noise, Realized Variance, and Optimal Sampling,” Review of Economic Studies, 75, 339–369.

Buraschi, A., A. Cieslak, and F. Trojani (2010): “Correlation Risk and the Term Structure of Interest Rates,” Working paper, University of Lugano.

Buraschi, A., P. Porchia, and F. Trojani (2010): “Correlation Risk and Optimal Portfolio Choice,” Journal of Finance, 65, 393–420.

da Fonseca, J., M. Grasselli, and C. Tebaldi (2006): “Option Pricing when Correlations Are Stochastic: An Analytical Framework,” Working Paper, ESILV, University of Padova and University of Verona.

Dai, Q., and K. Singleton (2000): “Specification Analysis of Affine Term Structure Models,” Journal of Finance, 55, 1943–1978.

Fisher, M., and C. Gilles (1996): “Estimating Exponential-Affine Models of the Term Structure,” Working Paper, Federal Reserve Bank of Atlanta.

Fisher, M., D. Nychka, and D. Zervos (1994): “Fitting the term structure of interest rates with smoothing splines,” Working paper, Federal Reserve and North Carolina State University.

Fleming, M. J. (1997): “The Round-the-Clock Market for U.S. Treasury Securities,” FRBNY Economic Policy Review.

Fleming, M. J., and B. Mizrach (2008): “The Microstructure of a U.S. Treasury ECN: The BrokerTec Platform,” Working paper, Federal Reserve Bank of New York and Rutgers University.

Gourieroux, C., J. Jasiak, and R. Sufana (2009): “The Wishart Autoregressive Process of Multivariate Stochastic Volatility,” Journal of Econometrics, 150, 167–181.

Gürkaynak, R. S., B. Sack, and J. H. Wright (2006): “The U.S. Treasury Yield Curve: 1961 to the Present,” Working paper, Federal Reserve Board.

Hayashi, T., and N. Yoshida (2005): “On covariance estimation of non-synchronously observed diffusion processes,” Bernoulli, 11, 359–379.

Magnus, J. R., and H. Neudecker (1979): “The Commutation Matrix: Some Properties and Applications,” Annals of Statistics, 7, 381–394.

Van Loan, C. F. (1978): “Computing Integrals Involving the Matrix Exponential,” IEEE Transactions on Automatic Control, 23, 395–404.

Waggoner, D. F. (1997): “Spline Methods for Extracting Interest Rate Curves from Coupon Bond Prices,” FRB of Atlanta working paper, 97-10.

Wan, E. A., and R. van der Merwe (2001): Kalman Filtering and Neural Networks. John Wiley & Sons, Inc.

Zhang, L., P. A. Mykland, and Y. Ait-Sahalia (2005): “A Tale of Two Time Scales: Determining Integrated Volatility With Noisy High-Frequency Data,” Journal of the American Statistical Association, 100, 1394–1411.