
A Winner’s Curse for Econometric Models: On the Joint Distribution of In-Sample Fit and Out-of-Sample Fit and its Implications for Model Selection

Peter Reinhard Hansen
Department of Economics, Stanford University

November 4, 2010

PRH (Stanford University)

In-Sample Fit and Out-of-Sample Fit

November 4, 2010

1 / 50

Introduction

Estimated models tend to have a better fit in-sample than out-of-sample.
Main result: Asymptotic distribution of {In-Sample Fit, Out-of-Sample Fit}.
Implications:
  Theoretical justification for out-of-sample analyses
  Model Selection

Likelihood Ratio Statistic

Consider a sample (in-sample)

$$X = (X_1, \ldots, X_n).$$

MLE: $\hat\theta_x = \hat\theta(X) = \arg\max_\theta \ell(X, \theta)$.

By construction:

$$\ell(X, \hat\theta_x) \geq \ell(X, \theta^*), \qquad \theta^* = \arg\max_\theta E\{\ell(X, \theta)\} \quad \text{(“true” population parameter)}.$$

Out-of-Sample Likelihood Ratio Statistic

$Y = (Y_1, \ldots, Y_n)$:
  independent of $X$
  same distribution as $X$

How well does the estimated model $\hat\theta_x$ describe $Y$?

$$\ell(Y, \hat\theta_x) - \ell(Y, \theta^*).$$

With $\eta = \ell(X, \hat\theta_x) - \ell(X, \theta^*)$ we often have

$$2\eta \xrightarrow{d} \chi^2.$$

What can be said about $\tilde\eta = \ell(Y, \hat\theta_x) - \ell(Y, \theta^*)$?

Example: Simplest Possible

Let $X, Y \sim N(\theta^*, 1)$ be independent and define $Z_1 = X - \theta^*$ and $Z_2 = Y - \theta^*$ (standard normals).

In-sample (one observation): $X$.

$$Q(X, \theta) = -\log(2\pi) - (X - \theta)^2.$$

MLE: $\hat\theta = \hat\theta(X) = X$.

Overfit:

$$\eta = Q(X, \hat\theta) - Q(X, \theta^*) = (X - \theta^*)^2 = Z_1^2 \sim \chi^2_1.$$

Example (continued)

Out-of-sample: $Y$.

$$Q(Y, \theta) = -\log(2\pi) - (Y - \theta)^2.$$

Fit (using in-sample MLE $\hat\theta = \hat\theta(X) = X$):

$$\begin{aligned}
\tilde\eta = Q(Y, \hat\theta) - Q(Y, \theta^*) &= -(Y - \hat\theta)^2 + (Y - \theta^*)^2 \\
&= -(Y - \theta^* + \theta^* - \hat\theta)^2 + (Y - \theta^*)^2 \\
&= 2(X - \theta^*)(Y - \theta^*) - (X - \theta^*)^2 \\
&= 2 Z_1 Z_2 - Z_1^2.
\end{aligned}$$

Joint distribution: $(\eta, \tilde\eta) \sim (Z_1^2,\; 2 Z_1 Z_2 - Z_1^2)$.
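The one-observation example can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the value of `theta_star`, the seed and the number of draws are illustrative choices, not taken from the slides. It evaluates $Q = 2 \log L$ directly and compares the result with the closed forms $Z_1^2$ and $2 Z_1 Z_2 - Z_1^2$.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sim = 200_000
theta_star = 0.0  # illustrative population value (assumption)

x = rng.normal(theta_star, 1.0, n_sim)  # one in-sample observation per draw
y = rng.normal(theta_star, 1.0, n_sim)  # one out-of-sample observation per draw


def q(obs, theta):
    """Q = 2 * log-likelihood of N(theta, 1) for a single observation."""
    return -np.log(2 * np.pi) - (obs - theta) ** 2


theta_hat = x  # the MLE based on the single in-sample observation
eta = q(x, theta_hat) - q(x, theta_star)        # in-sample overfit
eta_tilde = q(y, theta_hat) - q(y, theta_star)  # out-of-sample counterpart

z1, z2 = x - theta_star, y - theta_star
print("eta matches Z1^2:", np.allclose(eta, z1**2))
print("eta_tilde matches 2 Z1 Z2 - Z1^2:", np.allclose(eta_tilde, 2 * z1 * z2 - z1**2))
print("sample means of eta and eta_tilde:", eta.mean(), eta_tilde.mean())
```

The sample means should be close to $+1$ and $-1$, since $E(Z_1^2) = 1$ and $E(Z_1 Z_2) = 0$.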

General Framework

$$\underbrace{X = (X_1, \ldots, X_n)}_{\text{In-Sample}} \qquad \underbrace{Y = (Y_1, \ldots, Y_m)}_{\text{Out-of-Sample}}$$

We care about: the criterion function $Q(Y, \theta)$.

Available for estimation: $X = (X_1, \ldots, X_n)$, drawn from the same distribution as $Y$.

Estimate $\theta$ by maximizing the empirical criterion function:

$$\hat\theta = \hat\theta(X) = \arg\max_\theta Q(X, \theta).$$

MSPE Example

Example: Want to predict $Y = X_{n+1}$ at time $n$. A prediction, $\theta$, results in:

$$Q(Y, \theta) = -(X_{n+1} - \theta)^2.$$

Minimize in-sample MSE:

$$\max_\theta Q(X, \theta) = -\sum_{t=1}^{n} (X_t - \theta)^2.$$

In-sample estimator:

$$\hat\theta(X) = \bar X = n^{-1} \sum_{t=1}^{n} X_t.$$
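For the MSPE criterion, the relative fits can be computed directly. The sketch below assumes NumPy and uses an illustrative sample size and population mean (both assumptions); it shows that the in-sample gain of the sample mean over the population mean equals $n(\bar X - \theta^*)^2$, which is never negative, while the out-of-sample counterpart can take either sign.

```python
import numpy as np

rng = np.random.default_rng(3)
n, theta_star = 50, 0.0  # illustrative values (assumptions)

x = rng.normal(theta_star, 1.0, n)    # in-sample observations X_1, ..., X_n
x_next = rng.normal(theta_star, 1.0)  # the observation to be predicted, X_{n+1}


def q_in(theta):
    """In-sample criterion: minus the sum of squared errors."""
    return -np.sum((x - theta) ** 2)


def q_out(theta):
    """Out-of-sample criterion: minus the squared prediction error."""
    return -((x_next - theta) ** 2)


theta_hat = x.mean()  # maximizer of q_in
eta = q_in(theta_hat) - q_in(theta_star)
eta_tilde = q_out(theta_hat) - q_out(theta_star)

print("eta:", eta, "closed form:", n * (theta_hat - theta_star) ** 2)
print("eta_tilde:", eta_tilde)
```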

Relative Fit: In-sample

In-sample measure of overfitting:

$$\eta = Q(X, \hat\theta) - Q(X, \theta^*).$$

How well does $\hat\theta = \hat\theta(X)$ “fit” $X$ relative to the population parameter,

$$\theta^* = \arg\max_\theta E\{Q(X, \theta)\}.$$

Likelihood special case: $Q(X, \theta) = 2\ell(X, \theta)$ (= 2 log Likelihood).

$\eta$ is Kullback-Leibler information.

Relative Fit: Out-of-Sample

The out-of-sample equivalent, the one we care about, is

$$\tilde\eta = Q(Y, \hat\theta) - Q(Y, \theta^*).$$

How well does $\hat\theta = \hat\theta(X)$ fit independent data drawn from the same distribution as $X$?

In-sample: overfitting, $\eta \geq 0$.

Out-of-sample: $\tilde\eta \gtrless 0$, and it tends to have $\tilde\eta < 0$.

Relation is more than one of opposite expectations

Regular problems ($Q = 2 \log L$, correctly specified): we have

$$E\{Q(X, \hat\theta) - Q(X, \theta^*)\} = +k, \qquad E\{Q(Y, \hat\theta) - Q(Y, \theta^*)\} = -k.$$

The difference $E\{Q(X, \hat\theta) - Q(Y, \hat\theta)\} = 2k$ motivated Akaike’s Information Criterion (AIC).

The relation between $\eta$ and $\tilde\eta$ is much closer than that of having opposite expected values.

Regular problems: we have

$$E\{Q(Y, \hat\theta) - Q(Y, \theta^*) \mid X\} = -\underbrace{\{Q(X, \hat\theta) - Q(X, \theta^*)\}}_{\eta}.$$

Important!

Extremum Estimator Framework

Assumption 1
(i) $\bar Q(X, \theta) = n^{-1} Q(X, \theta) \xrightarrow{p} Q(\theta)$ uniformly in $\theta$ on an open neighborhood of $\theta^*$, as $n \to \infty$.
(ii) $\bar H(X, \theta) = \partial^2 \bar Q(X, \theta) / \partial\theta\,\partial\theta'$ exists and is continuous in an open neighborhood of $\theta^*$.
(iii) $\bar H(X, \theta) \xrightarrow{p} -\mathcal{I}(\theta)$ uniformly in $\theta$ in an open neighborhood of $\theta^*$; $\mathcal{I}(\theta)$ is continuous in a neighborhood of $\theta^*$, and $\mathcal{I}_0 = \mathcal{I}(\theta^*) \in \mathbb{R}^{k \times k}$ is positive definite.
(iv) $n^{-1/2} S(X, \theta^*) \xrightarrow{d} N\{0, J_0\}$, with $S(\cdot) = \partial Q(\cdot) / \partial\theta$, where $J_0 = \lim_{n \to \infty} E\{n^{-1} S(X, \theta^*) S(X, \theta^*)'\}$.

Theorem 1

Given Assumption 1. As $n, m \to \infty$ with $m/n \to \pi$, we have

$$\begin{pmatrix} \eta \\ \tilde\eta \end{pmatrix} \xrightarrow{d} \begin{pmatrix} \zeta_1 \\ 2\sqrt{\pi}\,\zeta_2 - \pi\,\zeta_1 \end{pmatrix},$$

where $\zeta_1 = Z_1' \Lambda Z_1$, $\zeta_2 = Z_1' \Lambda Z_2$, and $Z_i \sim \text{iid } N_k(0, I_k)$.

Furthermore, $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_k)$, where $\lambda_1, \ldots, \lambda_k$ are the eigenvalues of $\mathcal{I}_0^{-1} J_0$, with

$$-n^{-1} H(X, \theta^*) \xrightarrow{p} \mathcal{I}_0 \quad \text{and} \quad J_0 = \lim_{n \to \infty} E\{n^{-1} S(X, \theta^*) S(X, \theta^*)'\}.$$

The Joint Distribution of Eta and Eta-tilde

[Figure: contour plot of the joint density of (eta, etatilde). Case: Λ = I and k = 3.]

Conditional Density

$$\tilde\eta \mid \eta \sim N(-\eta,\; 4\eta)$$

[Figure: conditional density of $\tilde\eta$ given $\eta$, for η = 1, η = 4 and η = 10.]
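The limit in Theorem 1 and the conditional law on this slide can be illustrated by simulation. This is a sketch assuming NumPy; the helper name `simulate_limit`, the seed and the window width around $\eta = 4$ are hypothetical choices made for the illustration. It draws from the limit distribution with $\Lambda = I$, $k = 3$ and $\pi = 1$ and inspects the draws whose $\eta$ is close to 4.

```python
import numpy as np


def simulate_limit(lam, pi, n_sim, rng):
    """Draw (zeta1, 2*sqrt(pi)*zeta2 - pi*zeta1) with Lambda = diag(lam)."""
    lam = np.asarray(lam, dtype=float)
    k = lam.size
    z1 = rng.standard_normal((n_sim, k))
    z2 = rng.standard_normal((n_sim, k))
    zeta1 = np.sum(lam * z1 * z1, axis=1)  # Z1' Lambda Z1
    zeta2 = np.sum(lam * z1 * z2, axis=1)  # Z1' Lambda Z2
    return zeta1, 2.0 * np.sqrt(pi) * zeta2 - pi * zeta1


rng = np.random.default_rng(1)
eta, eta_tilde = simulate_limit(lam=np.ones(3), pi=1.0, n_sim=400_000, rng=rng)

# Keep the draws with eta in a narrow window around 4; for Lambda = I and
# pi = 1 the conditional law of eta_tilde there is approximately N(-4, 16).
near = np.abs(eta - 4.0) < 0.1
cond_mean = eta_tilde[near].mean()
cond_var = eta_tilde[near].var()

print("unconditional means:", eta.mean(), eta_tilde.mean())
print("conditional mean and variance near eta = 4:", cond_mean, cond_var)
```

With $\Lambda = I$ the unconditional means should be near $k = 3$ and $-\pi k = -3$.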

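Sketch of Proof for the General Case

In-sample: $S(\cdot, \theta) = \partial Q(\cdot, \theta) / \partial\theta$ and $H(\cdot, \theta) = \partial^2 Q(\cdot, \theta) / \partial\theta\,\partial\theta'$.

$$0 = S(X, \hat\theta) \simeq S(X, \theta^*) + H(X, \theta^*)(\hat\theta - \theta^*) \quad \Rightarrow \quad (\hat\theta - \theta^*) \simeq [-H(X, \theta^*)]^{-1} S(X, \theta^*).$$

Next,

$$Q(X, \theta^*) \simeq Q(X, \hat\theta) + S(X, \hat\theta)'(\theta^* - \hat\theta) + \tfrac{1}{2} (\hat\theta - \theta^*)' H(X, \theta^*) (\hat\theta - \theta^*),$$

so that

$$\begin{aligned}
\eta = Q(X, \hat\theta) - Q(X, \theta^*) &\simeq \tfrac{1}{2} (\hat\theta - \theta^*)' [-H(X, \theta^*)] (\hat\theta - \theta^*) \\
&\simeq \tfrac{1}{2} S(X, \theta^*)' [-H(X, \theta^*)]^{-1} S(X, \theta^*).
\end{aligned}$$

Often

$$[-H(X, \theta^*)]^{-1/2} S(X, \theta^*) = \left[ -n^{-1} \sum_{t=1}^{n} h_t(\theta^*) \right]^{-1/2} n^{-1/2} \sum_{t=1}^{n} s_t(\theta^*) \xrightarrow{d} N(0, \Sigma).$$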

Sketch of Proof for the General Case

Out-of-sample:

$$Q(Y, \hat\theta) \simeq Q(Y, \theta^*) + S(Y, \theta^*)'(\hat\theta - \theta^*) + \tfrac{1}{2} (\hat\theta - \theta^*)' H(Y, \theta^*) (\hat\theta - \theta^*),$$

so that

$$\begin{aligned}
\tilde\eta = Q(Y, \hat\theta) - Q(Y, \theta^*) \simeq{} & S(Y, \theta^*)' [-H(X, \theta^*)]^{-1} S(X, \theta^*) \\
& - \tfrac{1}{2} S(X, \theta^*)' [-H(X, \theta^*)]^{-1} [-H(Y, \theta^*)] [-H(X, \theta^*)]^{-1} S(X, \theta^*).
\end{aligned}$$

Simplest case, $m = n$ large: $H(X, \theta^*) [H(Y, \theta^*)]^{-1} \simeq I$, so that

$$\tilde\eta \simeq Z_Y' Z_X - \eta.$$

Empirical Example: Term Structure VAR

Five interest rates with different maturities: 3, 6, 12, 60 and 120 months.

[Figure: time series of the five rates (TB3MS, TB6MS, GS1, GS5, GS10), January 1959 to January 2007; vertical axis from 0.00 to 18.00.]

Cointegrated VAR

$$\Delta X_t = \alpha\beta' X_{t-1} + \sum_{j=1}^{p-1} \Gamma_j \Delta X_{t-j} + \mu + \varepsilon_t,$$

with $p = 1, \ldots, 12$, and cointegration rank $r = 0, \ldots, 5$.

Sample divided into odd months, $T_{odd}$, and even months, $T_{even}$.

Estimate the parameters, $\theta = (\alpha, \beta, \Gamma_1, \ldots, \Gamma_{p-1}, \mu)$, by maximizing either

$$Q_{odd}(\cdot, \theta) = -\frac{T_{odd}}{2} \log \left| \frac{1}{T_{odd}} \sum_{t \in T_{odd}} \hat\varepsilon_t \hat\varepsilon_t' \right|$$

or

$$Q_{even}(\cdot, \theta) = -\frac{T_{even}}{2} \log \left| \frac{1}{T_{even}} \sum_{t \in T_{even}} \hat\varepsilon_t \hat\varepsilon_t' \right|.$$

In-Sample

In-sample: criterion function (maximized log-likelihood)

          Odd months                    Even months
          r=3      r=4      r=5        r=3      r=4      r=5
p = 1     2871.16  2874.11  2874.76    2980.96  2987.08  2987.19
p = 2     2948.22  2951.38  2952.70    3039.33  3048.10  3048.32
p = 3     2989.19  2993.22  2993.63    3073.42  3079.72  3079.91
p = 4     3024.61  3028.73  3029.46    3102.98  3109.33  3109.58
p = 5     3054.65  3057.46  3058.19    3114.13  3120.43  3120.59
p = 6     3086.68  3089.42  3090.15    3130.52  3135.51  3135.66
p = 7     3109.40  3113.22  3113.75    3178.24  3182.07  3182.20
p = 8     3142.61  3145.66  3146.02    3218.15  3222.01  3222.15
p = 9     3178.79  3179.84  3179.90    3250.05  3254.92  3254.97
p = 10    3203.53  3204.56  3204.72    3296.67  3299.91  3299.93
p = 11    3226.99  3227.48  3227.56    3316.55  3320.13  3320.16
p = 12    3252.82  3253.10  3253.10    3338.01  3342.79  3342.79

In-Sample (AIC)

Akaike’s Information Criterion (AIC)

          Odd months                    Even months
          r=3      r=4      r=5        r=3      r=4      r=5
p = 1     2850.16  2850.11  2849.76    2959.96  2963.08  2962.19
p = 2     2902.22  2902.38  2902.70    2993.33  2999.10  2998.32
p = 3     2918.19  2919.22  2918.63    3002.42  3005.72  3004.91
p = 4     2928.61  2929.73  2929.46    3006.98  3010.33  3009.58
p = 5     2933.65  2933.46  2933.19    2993.13  2996.43  2995.59
p = 6     2940.68  2940.42  2940.15    2984.52  2986.51  2985.66
p = 7     2938.40  2939.22  2938.75    3007.24  3008.07  3007.20
p = 8     2946.61  2946.66  2946.02    3022.15  3023.01  3022.15
p = 9     2957.79  2955.84  2954.90    3029.05  3030.92  3029.97
p = 10    2957.53  2955.56  2954.72    3050.67  3050.91  3049.93
p = 11    2955.99  2953.48  2952.56    3045.55  3046.13  3045.16
p = 12    2956.82  2954.10  2953.10    3042.01  3043.79  3042.79

Out-of-Sample

Out-of-sample: criterion function

          Odd months                    Even months
          r=3      r=4      r=5        r=3      r=4      r=5
p = 1     2843.65  2848.35  2848.98    2941.21  2950.63  2951.93
p = 2     2866.41  2873.09  2874.13    2937.93  2951.44  2952.66
p = 3     2852.75  2856.10  2856.68    2913.50  2918.82  2921.47
p = 4     2833.62  2834.61  2835.36    2876.32  2876.13  2879.63
p = 5     2822.51  2824.16  2824.78    2849.55  2849.50  2853.05
p = 6     2815.58  2816.87  2817.64    2837.17  2838.54  2841.02
p = 7     2794.20  2797.97  2798.59    2839.22  2840.04  2842.28
p = 8     2768.54  2773.15  2773.98    2799.91  2804.68  2805.98
p = 9     2755.05  2759.44  2760.06    2763.76  2767.40  2768.22
p = 10    2715.22  2719.79  2720.18    2733.72  2737.91  2738.88
p = 11    2716.26  2722.15  2722.60    2716.76  2719.00  2720.85
p = 12    2695.15  2699.25  2699.16    2676.57  2678.54  2678.88

Eta – Eta-tilde

In-sample: “Odd” observations. Entries are Q(p, r) − Q(p − 1, r).

       Odd months (in-sample): η        Even months: η̃
p      r=2    r=3    r=4    r=5        r=2     r=3     r=4     r=5
1       0.00   0.00   0.00   0.00       0.00    0.00    0.00    0.00
2      74.96  77.06  77.27  77.93       0.79   -3.28    0.80    0.73
3      43.65  40.97  41.84  40.93     -22.38  -24.43  -32.62  -31.19
4      34.95  35.41  35.50  35.83     -35.31  -37.18  -42.69  -41.84
5      31.08  30.04  28.74  28.74     -28.26  -26.77  -26.63  -26.57
6      32.50  32.03  31.96  31.95      -8.15  -12.38  -10.96  -12.04
7      23.63  22.73  23.80  23.60       4.36    2.04    1.50    1.26
8      34.23  33.21  32.44  32.27     -33.52  -39.31  -35.35  -36.30
9      35.39  36.18  34.18  33.88     -36.11  -36.15  -37.28  -37.76
10     25.04  24.74  24.72  24.82     -31.54  -30.04  -29.49  -29.34
11     23.37  23.46  22.92  22.84     -17.71  -16.96  -18.91  -18.03
12     26.36  25.82  25.62  25.54     -41.96  -40.19  -40.45  -41.97

Scatterplot

[Figure: scatterplot of $Q(p+1, r) - Q(p, r)$, $p \geq 3$.]

Multiple Models

With multiple models, $j = 1, \ldots, M$.

Determine a nesting model,

$$\theta \in \Theta,$$

so that the $j$-th model is characterized by $\Theta_j \subset \Theta$.

Parameters in j-th Model

Population parameter in the $j$-th model is given by

$$\theta_j^* = \arg\max_{\theta \in \Theta_j} \bar Q(\theta), \quad \text{where} \quad \bar Q(\theta) = \lim_{n \to \infty} n^{-1} \underbrace{\sum_{i=1}^{n} q_i(X_i, \theta)}_{Q(X, \theta)}.$$

Estimator in the $j$-th model is

$$\hat\theta_j = \arg\max_{\theta \in \Theta_j} Q(X, \theta) = \arg\max_{\theta \in \Theta_j} \sum_{i=1}^{n} q_i(X_i, \theta).$$

A Useful Decomposition

Observed in-sample fit expressed by 3 terms:

$$Q(X, \hat\theta_j) = \underbrace{Q(\theta_j^*)}_{\text{Genuine } (\mu_j)} + \underbrace{Q(X, \theta_j^*) - Q(\theta_j^*)}_{\text{Ordinary noise } (\nu_j)} + \underbrace{Q(X, \hat\theta_j) - Q(X, \theta_j^*)}_{\text{Deceptive noise } (\eta_j)}.$$

Example: $y_i = \beta_{(j)}' x_i^{(j)} + \varepsilon_{j,i}$, with $\sigma^2_{(j)} = E(\varepsilon_{j,i}^2)$.

$$Q(X, \hat\beta_{(j)}) = -\sum_{i=1}^{n} \hat\varepsilon_{j,i}^2 = \underbrace{-\sum_{i=1}^{n} \sigma^2_{(j)}}_{\mu_j = -n\sigma^2_{(j)}} + \underbrace{\sum_{i=1}^{n} (\sigma^2_{(j)} - \varepsilon_{j,i}^2)}_{\nu_j} + \underbrace{\sum_{i=1}^{n} (\varepsilon_{j,i}^2 - \hat\varepsilon_{j,i}^2)}_{\eta_j}.$$

Sampling variation comes in two flavors, one of them being particularly nasty.
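The three-term decomposition for the regression example can be verified on simulated data. The sketch below assumes NumPy and a single-regressor model with hypothetical population values `beta` and `sigma2`; in practice $\varepsilon_{j,i}$ is unobserved, so the split is only computable in a simulation like this one.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
beta, sigma2 = 0.5, 1.0  # hypothetical population values (assumptions)

x = rng.standard_normal(n)
eps = rng.normal(0.0, np.sqrt(sigma2), n)  # true errors, known only in simulation
y = beta * x + eps

beta_hat = (x @ y) / (x @ x)  # least squares without intercept
resid = y - beta_hat * x

mu = -n * sigma2                   # genuine fit
nu = np.sum(sigma2 - eps**2)       # ordinary noise
eta = np.sum(eps**2 - resid**2)    # deceptive noise (overfit)
q_observed = -np.sum(resid**2)     # observed in-sample fit

print("mu + nu + eta:", mu + nu + eta, "observed:", q_observed)
print("deceptive noise eta:", eta)
```

The deceptive term is nonnegative by construction, because least squares cannot fit worse in-sample than the true coefficient does.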

[Figure: bar charts for five models, M1 to M5, showing the observed fit, its split into genuine and deceptive parts, and genuine minus deceptive (“Gen-Dec”). Bar labels:]

      Observed  Genuine  Deceptive  Gen-Dec
M1    100       92        8         84
M2     95       88        7         81
M3     95       81       14         67
M4     90       70       20         50
M5     80       68       12         56

[Figure: second set of bar charts with the same layout. Bar labels:]

      Observed  Genuine  Deceptive  Gen-Dec
M1    100       82       18         64
M2     95       83       12         71
M3     95       81       14         67
M4     90       78       12         66
M5     80       78        2         76

AIC Example

Estimate $K$ regression models,

$$y_i = \beta_j x_{j,i} + \varepsilon_{j,i}, \qquad j = 1, \ldots, K,$$

by least squares, $\hat\beta_j = \sum_{i=1}^{n} x_{j,i} y_i / \sum_{i=1}^{n} x_{j,i}^2$, where $x_{j,i} \sim \text{iid } N(0, 1)$.

The true model is $y_i = \varepsilon_i$, where $\varepsilon_i \perp\!\!\!\perp x_{j,i}$ and $\varepsilon_i \sim \text{iid } N(0, 1)$.

$$AIC_j = -\left( \sum_{i=1}^{n} \hat\varepsilon_{j,i}^2 + 2 \right).$$

Consider $AIC_{max} = \max_{1 \leq j \leq K} AIC_j$ and $Q(Y, \hat\beta_j)$.

AIC Example (cont.)

n = 100, Nsim > 1,000,000

K    AICmax    Qmax(Y)   Bias   AICmin    Qmin(Y)
1    -101.00   -101.01   0.01   -101.00   -101.01
2    -100.36   -101.66   1.30   -101.63   -100.37
3     -99.90   -102.13   2.23   -101.80   -100.19
4     -99.54   -102.50   2.97   -101.88   -100.12
5     -99.24   -102.81   3.57   -101.91   -100.08
6     -98.99   -103.07   4.08   -101.94   -100.06
7     -98.77   -103.30   4.53   -101.95   -100.04
8     -98.57   -103.49   4.92   -101.96   -100.03
9     -98.40   -103.67   5.28   -101.97   -100.02
10    -98.24   -103.84   5.60   -101.97   -100.02
11    -98.09   -103.98   5.89   -101.98   -100.01
12    -97.96   -104.12   6.17   -101.98   -100.01
13    -97.83   -104.25   6.42   -101.98   -100.01
14    -97.72   -104.36   6.65   -101.99   -100.01
15    -97.61   -104.48   6.87   -101.99   -100.00
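The AICmax and Qmax(Y) columns can be approximated with a small Monte Carlo experiment. This is a sketch under stated assumptions, not the code behind the table: it uses NumPy, far fewer replications than the slide, a hypothetical function name `aic_experiment`, and it assumes the out-of-sample criterion is evaluated on a fresh draw of both the selected regressor and the dependent variable, with the same sample size $n$.

```python
import numpy as np


def aic_experiment(K, n=100, n_sim=10_000, seed=0):
    """Average AIC of the AIC-selected model and its average out-of-sample fit."""
    rng = np.random.default_rng(seed)
    aic_max = np.empty(n_sim)
    q_max = np.empty(n_sim)
    for s in range(n_sim):
        x = rng.standard_normal((K, n))  # one regressor per candidate model
        y = rng.standard_normal(n)       # true model: y is pure noise
        beta_hat = (x @ y) / np.sum(x * x, axis=1)
        ssr = np.sum((y - beta_hat[:, None] * x) ** 2, axis=1)
        aic = -(ssr + 2.0)
        j = np.argmax(aic)               # model selected by AIC
        # Fresh out-of-sample data for the selected model (assumption).
        x_out = rng.standard_normal(n)
        y_out = rng.standard_normal(n)
        aic_max[s] = aic[j]
        q_max[s] = -np.sum((y_out - beta_hat[j] * x_out) ** 2)
    return aic_max.mean(), q_max.mean()


aic_1, q_1 = aic_experiment(K=1)
aic_5, q_5 = aic_experiment(K=5)
print("K = 1: AICmax, Qmax(Y), bias:", aic_1, q_1, aic_1 - q_1)
print("K = 5: AICmax, Qmax(Y), bias:", aic_5, q_5, aic_5 - q_5)
```

With one model the gap between the two averages should be close to zero, and it should widen as K grows, in line with the Bias column.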

Winner’s Curse

[Figure: contour plots in four panels, labeled m = 1, m = 2, m = 5 and m = 20.]

AIC/BIC Paradox (cont.)

Consider the hypothetical case where
  $\mu_j$ and $\nu_j$ are the same for all models;
  and all models have the same number of parameters.

Which model does AIC/BIC select?

$$\arg\max_j \xi_j = \arg\max_j (\mu_j + \nu_j + \eta_j - p_j) \simeq \arg\max_j \eta_j.$$

That happens to be

$$\min_j (\mu_j - \eta_j),$$

the worst possible model! Knock-out blow to model selection in this context.

AIC Example (cont.) Model Average

The simple average, $\bar\beta_x = K^{-1} \sum_{j=1}^{K} \hat\beta_j$, results in

$$\bar\eta \sim \tfrac{1}{K} \chi^2_1, \quad \text{with expected value } \tfrac{1}{K}.$$

When some models are better than other models (unlike this example), the question is whether $\mu_j$ of the selected model is sufficiently large to offset $\eta_j$.

On Information Criteria

AIC may work reasonably well with a small set of nested models.
With “many” models it has a tendency to select a model with a large $\eta_j$. BIC does so to a lesser extent.
Not suitable for comparing many models with the same number of parameters. There is no penalty for the “search”.

Macro Application (Stock and Watson 131 Time Series)

Focus on nine series: PI, IP, UR, EMP, TBILL, TBOND, PPI, CPI, PCED
S&W “An Empirical Comparison of Methods for Forecasting using Many Predictors”.

Forecast horizon: 12 months ahead.

Baseline forecast:

$$X_{t+h} = \alpha + \beta X_t + \gamma PC_t^{(1)} + \varepsilon_{t,t+h}.$$

Alternative forecasts: add a variable or $PC_t^{(j)}$.

Fixed scheme (1960:01-1994:12). Evaluation (1995:01-2003:12).

Winner’s Curse

Report $\hat\sigma_X^2 = n^{-1} \sum_{t=1}^{n} \hat\varepsilon_t^2$ and $\hat\sigma_Y^2 = m^{-1} \sum_{t=n+1}^{n+m} \hat\varepsilon_t^2$.

         σ̂²_X    σ̂²_Y    σ̂²_{,X}   ΔQ_X     σ̂²_{,Y}   ΔQ_Y
PI        3.61    2.95     2.75     27.21%    4.19     -34.98%
IP       21.02   10.38    11.96     56.36%   12.09     -15.22%
UR        1.02    0.26     0.55     62.44%    0.56     -76.75%
EMP      46.67   25.05    36.06     25.78%   34.39     -31.70%
TBILL     2.75    1.54     2.41     13.28%    2.41     -45.04%
TBOND     1.21    0.62     0.95     24.53%    0.44      35.61%
PPI      26.57   24.13    23.48     12.35%   24.61      -1.94%
CPI      10.79   14.76    10.27      4.92%   14.71       0.35%
PCED      7.52    3.17     6.96      7.82%    3.32      -4.57%


Example: Portfolio Choice

Vector of returns: $X_t \sim N(\mu, \Sigma)$.

Objective is

$$\max_w \; w'\mu - \tfrac{\gamma}{2} w'\Sigma w, \quad \text{subject to } \iota'w = 1,$$

or, in the presence of a risk-free asset,

$$\max_w \; w_0\mu_0 + w'\mu - \tfrac{\gamma}{2} w'\Sigma w, \quad \text{subject to } \iota'w = 1 - w_0.$$

Criterion function:

$$Q(X, w) = \sum_{t=1}^{n} w'X_t - \tfrac{\gamma}{2} w' \sum_{t=1}^{n} (X_t - \bar X)(X_t - \bar X)' w.$$

Example: Optimal Weight

Solution to (no risk-free asset)

$$\max_w \; w'\mu - \tfrac{\gamma}{2} w'\Sigma w, \quad \text{subject to } \iota'w = 1:$$

$$w = \Sigma^{-1} \left( \mu\gamma^{-1} + \iota \, \frac{1 - \iota'\Sigma^{-1}\mu/\gamma}{\iota'\Sigma^{-1}\iota} \right).$$

Solution to (with a risk-free asset)

$$\max_w \; w_0\mu_0 + w'\mu - \tfrac{\gamma}{2} w'\Sigma w, \quad \text{subject to } \iota'w = 1 - w_0:$$

$$w = \gamma^{-1} \Sigma^{-1} (\mu - \mu_0\iota).$$
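The two closed-form solutions translate directly into code. The following is a sketch assuming NumPy; the function names and the three-asset inputs (`mu`, `sigma`, `gamma`, `mu0`) are hypothetical and chosen only to exercise the formulas. It also checks the first-order conditions that the formulas are derived from.

```python
import numpy as np


def weights_no_riskfree(mu, sigma, gamma):
    """Mean-variance weights subject to the weights summing to one."""
    mu = np.asarray(mu, dtype=float)
    ones = np.ones_like(mu)
    s_inv_mu = np.linalg.solve(sigma, mu)     # Sigma^{-1} mu
    s_inv_one = np.linalg.solve(sigma, ones)  # Sigma^{-1} iota
    adj = (1.0 - ones @ s_inv_mu / gamma) / (ones @ s_inv_one)
    return s_inv_mu / gamma + adj * s_inv_one


def weights_with_riskfree(mu, sigma, gamma, mu0):
    """Risky weights w and risk-free weight w0 = 1 - sum(w)."""
    mu = np.asarray(mu, dtype=float)
    w = np.linalg.solve(sigma, mu - mu0) / gamma
    return 1.0 - w.sum(), w


# Hypothetical inputs for illustration only.
mu = np.array([1.0, 1.2, 0.9])
sigma = np.array([[4.0, 1.0, 0.5],
                  [1.0, 5.0, 1.0],
                  [0.5, 1.0, 3.0]])
gamma, mu0 = 2.0, 0.3

w = weights_no_riskfree(mu, sigma, gamma)
w0, w_rf = weights_with_riskfree(mu, sigma, gamma, mu0)

# First-order condition without a risk-free asset: mu - gamma * Sigma w is
# a constant multiple of the vector of ones (the Lagrange multiplier).
foc = mu - gamma * sigma @ w
print("weights (no risk-free):", w, "sum:", w.sum())
print("first-order condition entries:", foc)
print("risk-free weight and risky weights:", w0, w_rf)
```

Using `np.linalg.solve` avoids forming $\Sigma^{-1}$ explicitly, which is the usual choice when the covariance matrix is estimated and possibly ill-conditioned.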

Example: Portfolio Choice (cont.)

Jobson & Korkie (JASA, 1980): 20 randomly selected stocks. Estimates for monthly returns (December 1949 – December 1975):

µ̂ = (0.50 0.90 1.10 1.74 1.82 1.11 0.91 1.18 1.35 1.07 1.16 1.23 0.81 1.18 0.88 1.20 0.72 1.16 0.92 1.25)′

Σ̂ =
53.6   6.6  19.8  34.1   6.3   5.8  16.9  15.3   9.6  10.1  10.3  18.9   8.5  14.6  14.6  14.3  27.9  25.1  11.8  16.9
 6.6  29.8  16.7  20.7   6.5  11.8   8.4   9.3  10.8  11.2   9.6   8.8  13.5  14.1  16.5   8.8  14.9  16.7  22.8  10.3
19.8  16.7  82.9  48.0  18.7  21.0  22.2  16.2  16.3  18.9  21.6  27.0   8.8  23.3  17.4  22.3  36.7  41.4  21.4  27.7
34.1  20.7  48.0 178.1  27.5  19.3  32.4  26.9  18.3  22.4  21.8  41.7  17.3  42.9  26.3  30.3  66.0  47.6  20.7  43.9
 6.3   6.5  18.7  27.5 118.1  26.3  23.9  12.4  14.2  23.1  31.0  13.1   5.4  20.4   9.9  14.3  17.1  20.2  13.5  18.5
 5.8  11.8  21.0  19.3  26.3  57.1  20.2  11.7  15.2  16.3  13.7  19.3   7.8  21.5  11.3  13.2  13.5  12.3  16.8  18.1
16.9   8.4  22.2  32.4  23.9  20.2  52.1  15.3  12.1  17.7  18.0  21.4   9.6  26.4  16.2  15.2  25.6  24.8  15.5  25.3
15.3   9.3  16.2  26.9  12.4  11.7  15.3  48.3   9.7   9.4   8.6  14.4   9.9  11.3  13.3  17.0  32.1  21.7  14.3  15.8
 9.6  10.8  16.3  18.3  14.2  15.2  12.1   9.7  29.8  11.2  13.1  13.8   7.3  16.7  11.4   8.2  15.7  20.6  14.8  10.7
10.1  11.2  18.9  22.4  23.1  16.3  17.7   9.4  11.2  35.1  22.6  13.0   7.9  17.6  10.7  12.6  16.2  21.5  14.2  14.7
10.3   9.6  21.6  21.8  31.0  13.7  18.0   8.6  13.1  22.6  47.6  16.6   6.0  19.8   9.3  13.5  20.5  18.8  13.3  17.7
18.9   8.8  27.0  41.7  13.1  19.3  21.4  14.4  13.8  13.0  16.6  65.6   7.9  23.1  11.6  25.8  35.8  26.4  17.0  23.7
 8.5  13.5   8.8  17.3   5.4   7.8   9.6   9.9   7.3   7.9   6.0   7.9  23.5  12.0  14.3   8.5  15.2  14.2  15.8   9.7
14.6  14.1  23.3  42.9  20.4  21.5  26.4  11.3  16.7  17.6  19.8  23.1  12.0  51.2  16.4  14.7  26.2  25.6  20.4  20.9
14.6  16.5  17.4  26.3   9.9  11.3  16.2  13.3  11.4  10.7   9.3  11.6  14.3  16.4  28.7  12.2  19.9  24.3  22.4  13.8
14.3   8.8  22.3  30.3  14.3  13.2  15.2  17.0   8.2  12.6  13.5  25.8   8.5  14.7  12.2  56.0  32.3  24.5  13.1  14.5
27.9  14.9  36.7  66.0  17.1  13.5  25.6  32.1  15.7  16.2  20.5  35.8  15.2  26.2  19.9  32.3 109.5  50.8  18.6  32.3
25.1  16.7  41.4  47.6  20.2  12.3  24.8  21.7  20.6  21.5  18.8  26.4  14.2  25.6  24.3  24.5  50.8 131.8  27.0  29.2
11.8  22.8  21.4  20.7  13.5  16.8  15.5  14.3  14.8  14.2  13.3  17.0  15.8  20.4  22.4  13.1  18.6  27.0  44.7  16.1
16.9  10.3  27.7  43.9  18.5  18.1  25.3  15.8  10.7  14.7  17.7  23.7   9.7  20.9  13.8  14.5  32.3  29.2  16.1  58.7

Portfolio Choice with N=5

Without a risk-free asset (N = 5)

    T        η        η̃      η/T     η̃/T   Q̄(X,w*)   Q̄(X,ŵ)   Q̄(Y,ŵ)   Q̄(Y,w^e)
   60    36.67   -43.73    0.61   -0.73     0.34      0.95     -0.38      0.07
  120    34.92   -37.89    0.29   -0.32     0.34      0.63      0.02      0.06
  180    34.24   -36.23    0.19   -0.20     0.34      0.53      0.14      0.06
  240    34.06   -35.42    0.14   -0.15     0.33      0.48      0.19      0.05
  360    33.68   -34.45    0.09   -0.10     0.33      0.43      0.24      0.05
  480    33.69   -34.32    0.07   -0.07     0.33      0.40      0.26      0.05
  600    33.49   -34.12    0.06   -0.06     0.33      0.39      0.27      0.05
 6000    33.38   -33.48    0.01   -0.01     0.33      0.34      0.33      0.05

With a risk-free asset (N = 5)

    T        η        η̃      η/T     η̃/T   Q̄(X,w*)   Q̄(X,ŵ)   Q̄(Y,ŵ)   Q̄(Y,ŵ^e)
   60    45.29   -56.71    0.75   -0.95     0.42      1.17     -0.53      0.17
  120    42.53   -47.32    0.35   -0.39     0.41      0.77      0.02      0.25
  180    41.49   -44.53    0.23   -0.25     0.41      0.64      0.17      0.27
  240    41.18   -43.39    0.17   -0.18     0.41      0.58      0.23      0.28
  360    40.62   -41.86    0.11   -0.12     0.41      0.52      0.29      0.29
  480    40.56   -41.60    0.08   -0.09     0.41      0.49      0.32      0.30
  600    40.31   -41.25    0.07   -0.07     0.41      0.48      0.34      0.30
 6000    40.08   -40.28    0.01   -0.01     0.41      0.42      0.40      0.31
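As a quick arithmetic check on the N = 5 figures without a risk-free asset, the sketch below reads the row starting 0.34 as Q̄(X, w*) and the row starting 0.95 as Q̄(X, ŵ). It assumes that Q̄ is the criterion divided by T and that η is the in-sample gain Q(X, ŵ) − Q(X, w*). Both are assumptions about the notation, not statements taken from this slide. Under them, Q̄(X, ŵ) should equal Q̄(X, w*) + η/T up to rounding.

```python
# Values copied from the N = 5 table (no risk-free asset).
T = [60, 120, 180, 240, 360, 480, 600, 6000]
eta = [36.67, 34.92, 34.24, 34.06, 33.68, 33.69, 33.49, 33.38]
q_in_true = [0.34, 0.34, 0.34, 0.33, 0.33, 0.33, 0.33, 0.33]  # read as Q-bar(X, w*)
q_in_hat = [0.95, 0.63, 0.53, 0.48, 0.43, 0.40, 0.39, 0.34]   # read as Q-bar(X, w-hat)


def max_identity_gap(T, eta, q_true, q_hat):
    """Largest |Q-bar(X, w-hat) - (Q-bar(X, w*) + eta/T)| over the rows."""
    return max(
        abs(qh - (qt + e / t))
        for t, e, qt, qh in zip(T, eta, q_true, q_hat)
    )


gap = max_identity_gap(T, eta, q_in_true, q_in_hat)
# Two figures rounded to two decimals can differ by about 0.01 in total.
print(f"largest discrepancy: {gap:.4f}")
```

The discrepancy stays within what two-decimal rounding allows, which supports this reading of the columns.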

Portfolio Choice with N=20

Without a risk-free asset (N = 20)

    T        η        η̃      η/T     η̃/T   Q̄(X,w*)   Q̄(X,ŵ)   Q̄(Y,ŵ)   Q̄(Y,w^e)
   60   234.42  -540.35    3.91   -9.01     0.86      4.76     -8.15      0.43
  120   186.06  -268.06    1.55   -2.23     0.85      2.40     -1.38      0.42
  180   174.23  -220.70    0.97   -1.23     0.85      1.82     -0.38      0.42
  240   169.06  -201.31    0.70   -0.84     0.85      1.55      0.01      0.42
  360   164.08  -184.04    0.46   -0.51     0.85      1.30      0.34      0.42
  480   161.89  -176.24    0.34   -0.37     0.85      1.19      0.48      0.42
  600   160.30  -171.30    0.27   -0.29     0.85      1.11      0.56      0.42
 6000   155.23  -156.37    0.03   -0.03     0.85      0.87      0.82      0.42

With a risk-free asset (N = 20)

    T        η        η̃      η/T     η̃/T   Q̄(X,w*)   Q̄(X,ŵ)   Q̄(Y,ŵ)   Q̄(Y,ŵ^e)
   60   266.52  -667.30    4.44  -11.12     0.89      5.33    -10.23      0.30
  120   206.31  -309.14    1.72   -2.58     0.88      2.60     -1.69      0.37
  180   191.91  -249.55    1.07   -1.39     0.88      1.94     -0.51      0.40
  240   185.53  -225.31    0.77   -0.94     0.88      1.65     -0.06      0.41
  360   179.54  -203.95    0.50   -0.57     0.88      1.37      0.31      0.42
  480   176.83  -194.38    0.37   -0.40     0.88      1.25      0.47      0.43
  600   174.96  -188.47    0.29   -0.31     0.88      1.17      0.56      0.43
 6000   168.80  -170.36    0.03   -0.03     0.87      0.90      0.85      0.44

Shrink towards 1/N

[Figure: panels for N=5 and N=20, shown for T=60 and T=1200.]
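The idea of shrinking estimated portfolio weights towards the equal-weighted portfolio 1/N can be sketched in a small simulation. Everything specific here is an assumption made for illustration and is not taken from the slides: the mean-variance criterion mean(w'R) − (γ/2) var(w'R), the linear shrinkage rule w(λ) = (1 − λ) ŵ + λ (1/N), and the population mean and covariance used to generate the data.

```python
import numpy as np


def criterion(returns, w, gamma=1.0):
    """Assumed mean-variance criterion: mean(w'R) - (gamma/2) * var(w'R)."""
    p = returns @ w
    return p.mean() - 0.5 * gamma * p.var()


def shrinkage_path(X, Y, lambdas, gamma=1.0):
    """In-sample and out-of-sample criterion for weights shrunk towards 1/N."""
    n_assets = X.shape[1]
    mu_hat = X.mean(axis=0)
    sigma_hat = np.cov(X, rowvar=False, bias=True)      # ddof=0, matches var() above
    w_hat = np.linalg.solve(sigma_hat, mu_hat) / gamma  # maximizes the in-sample criterion
    w_equal = np.full(n_assets, 1.0 / n_assets)
    rows = []
    for lam in lambdas:
        w = (1.0 - lam) * w_hat + lam * w_equal
        rows.append((float(lam), criterion(X, w, gamma), criterion(Y, w, gamma)))
    return rows


rng = np.random.default_rng(0)
N, T = 5, 60
mu = np.full(N, 0.05)              # hypothetical population mean
cov = 0.02 * np.eye(N) + 0.01      # hypothetical population covariance
X = rng.multivariate_normal(mu, cov, size=T)   # estimation sample
Y = rng.multivariate_normal(mu, cov, size=T)   # independent evaluation sample

path = shrinkage_path(X, Y, np.linspace(0.0, 1.0, 6))
for lam, q_in, q_out in path:
    print(f"lambda={lam:.1f}  in-sample={q_in:.3f}  out-of-sample={q_out:.3f}")
```

By construction the in-sample criterion can only fall as λ moves away from 0, because ŵ maximizes it. Whether some shrinkage helps out-of-sample depends on the draw and on N and T.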

Shrink towards 1/N

[Figure: panels for N=20 and N=5.]

We can try too hard

Extensive search over models with limited data is dangerous.
Too little data?
Too many econometricians?
Argument for letting empirical analysis be guided by economic theory (design of medical studies).
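The danger of an extensive search with limited data can be illustrated with a stylized experiment. This is a hypothetical setup, not one from the talk: the outcome is pure noise, each of M candidate models uses a single pure-noise regressor, and the model with the best in-sample fit is selected. The function names and the uncentered R² are choices made for this sketch.

```python
import numpy as np


def best_of_m(n, m, rng):
    """Select the best-fitting of m pure-noise one-regressor models.

    Returns (in-sample R^2, out-of-sample R^2) of the selected model,
    using uncentered R^2 because the regressions have no intercept.
    """
    y_in, y_out = rng.standard_normal(n), rng.standard_normal(n)
    x_in, x_out = rng.standard_normal((n, m)), rng.standard_normal((n, m))
    # Least-squares slope for each candidate regressor separately.
    beta = (x_in * y_in[:, None]).sum(axis=0) / (x_in ** 2).sum(axis=0)
    sse_in = ((y_in[:, None] - x_in * beta) ** 2).sum(axis=0)
    j = int(np.argmin(sse_in))  # the "winner" of the search
    r2_in = 1.0 - sse_in[j] / (y_in ** 2).sum()
    sse_out = ((y_out - x_out[:, j] * beta[j]) ** 2).sum()
    r2_out = 1.0 - sse_out / (y_out ** 2).sum()
    return r2_in, r2_out


def average_fit(n, m, reps=500, seed=0):
    """Monte Carlo averages of (in-sample R^2, out-of-sample R^2)."""
    rng = np.random.default_rng(seed)
    draws = np.array([best_of_m(n, m, rng) for _ in range(reps)])
    return draws.mean(axis=0)


for m in (1, 10, 100):
    r2_in, r2_out = average_fit(n=50, m=m)
    print(f"M={m:4d}  mean in-sample R2={r2_in:.3f}  mean out-of-sample R2={r2_out:.3f}")
```

In this setup no candidate has any predictive content, so any in-sample fit of the selected model is spurious. The wider the search, the better the winner looks in-sample and the worse it tends to do on fresh data.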

Discussion

Joint Distribution of $(\eta, \tilde{\eta})$
Conditional density (mixture of normals): $\tilde{\eta} \mid \eta \sim N(-\eta, 4\eta)$.

Winner's Curse
AIC/BIC in trouble
A parsimonious model... is not parsimonious if selected from a large pool of models

Resolutions... avoid the extreme: $\eta = \max_\theta Q(X, \theta) - Q(X, \theta^*)$.
Shrinkage and Model Averaging (including Bayesian): $M^{-1} \sum_{j=1}^{M} \eta_j < \max_j \eta_j$.
Restricted model: e.g. $\theta = \theta(\psi)$, with $\eta = \max_\psi Q(X, \theta(\psi)) - Q(X, \theta(\psi^*))$.
One parsimonious model
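The winner's curse can be sketched numerically under the conditional law η̃ | η ~ N(−η, 4η). The construction below is a stylized assumption chosen because it reproduces that law, not necessarily the setting of the talk: with Z1 and Z2 independent N(0, I_k), set η = Z1'Z1 and η̃ = 2 Z1'Z2 − Z1'Z1. The M competing models are also assumed independent, and the values of k, M and the number of replications are arbitrary.

```python
import numpy as np


def simulate_pairs(k, size, rng):
    """Draw (eta, eta_tilde) with eta_tilde | eta ~ N(-eta, 4 * eta).

    Stylized construction (an assumption for illustration): with Z1, Z2
    independent N(0, I_k), eta = Z1'Z1 and eta_tilde = 2 * Z1'Z2 - Z1'Z1.
    """
    z1 = rng.standard_normal(size + (k,))
    z2 = rng.standard_normal(size + (k,))
    eta = (z1 ** 2).sum(axis=-1)
    eta_tilde = 2.0 * (z1 * z2).sum(axis=-1) - eta
    return eta, eta_tilde


rng = np.random.default_rng(0)
reps, n_models, k = 20_000, 10, 3
eta, eta_tilde = simulate_pairs(k, (reps, n_models), rng)

winner = eta.argmax(axis=1)   # model with the best in-sample fit in each replication
rows = np.arange(reps)

print("typical model:  mean eta = %.2f, mean eta_tilde = %.2f"
      % (eta.mean(), eta_tilde.mean()))
print("selected model: mean eta = %.2f, mean eta_tilde = %.2f"
      % (eta[rows, winner].mean(), eta_tilde[rows, winner].mean()))
print("model average:  mean of (1/M) sum_j eta_j = %.2f"
      % eta.mean(axis=1).mean())
```

Selecting on the largest η also selects the most negative expected η̃, which is the winner's curse in miniature. The average over models is never larger than the maximum, which is the sense in which averaging avoids the extreme.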
