A Winner's Curse for Econometric Models: On the Joint Distribution of In-Sample Fit and Out-of-Sample Fit and its Implications for Model Selection
Peter Reinhard Hansen
Department of Economics, Stanford University
November 4, 2010
PRH (Stanford University)
In-Sample Fit and Out-of-Sample Fit
November 4, 2010
1 / 50
Introduction
Estimated models tend to have a better fit in-sample than out-of-sample.
Main result: Asymptotic distribution of {In-Sample Fit, Out-of-Sample Fit}.
Implications:
Theoretical justification for out-of-sample analyses
Model selection
Likelihood Ratio Statistic
Consider a sample $X = (X_1, \ldots, X_n)$ (in-sample).
MLE: $\hat\theta_x = \hat\theta(X) = \arg\max_\theta \ell(X, \theta)$.
By construction: $\ell(X, \hat\theta_x) \ge \ell(X, \theta^\star)$, where $\theta^\star = \arg\max_\theta \mathrm{E}\{\ell(X, \theta)\}$ ("true" population parameter).
Out-of-Sample Likelihood Ratio Statistic
$Y = (Y_1, \ldots, Y_n)$: independent of $X$, same distribution as $X$.
How well does the estimated model $\hat\theta_x$ describe $Y$? $\ell(Y, \hat\theta_x) - \ell(Y, \theta^\star)$.
With $\eta = \ell(X, \hat\theta_x) - \ell(X, \theta^\star)$ we often have $2\eta \xrightarrow{d} \chi^2$.
What can be said about $\tilde\eta = \ell(Y, \hat\theta_x) - \ell(Y, \theta^\star)$?
Example: Simplest Possible
Let $X, Y \sim N(\theta^\star, 1)$ be independent and define $Z_1 = X - \theta^\star$ and $Z_2 = Y - \theta^\star$ (standard normals).
In-sample (one observation, $X$): $-Q(X, \theta) = \log(2\pi) + (X - \theta)^2$.
MLE: $\hat\theta = \hat\theta(X) = X$.
Overfit: $\eta = Q(X, \hat\theta) - Q(X, \theta^\star) = (X - \theta^\star)^2 = Z_1^2 \sim \chi^2_1$.
Example (continued)
Out-of-sample ($Y$): $-Q(Y, \theta) = \log(2\pi) + (Y - \theta)^2$.
Fit (using the in-sample MLE $\hat\theta = \hat\theta(X) = X$):
$$
\begin{aligned}
\tilde\eta &= Q(Y, \hat\theta) - Q(Y, \theta^\star) = -(Y - \hat\theta)^2 + (Y - \theta^\star)^2 \\
&= -(Y - \theta^\star + \theta^\star - \hat\theta)^2 + (Y - \theta^\star)^2 \\
&= 2(X - \theta^\star)(Y - \theta^\star) - (X - \theta^\star)^2 \\
&= 2 Z_1 Z_2 - Z_1^2 .
\end{aligned}
$$
Joint distribution: $(\eta, \tilde\eta) \sim (Z_1^2,\; 2 Z_1 Z_2 - Z_1^2)$.
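The one-observation example can be checked by simulation. The sketch below is only an illustration: the function name `simulate_fit_pair`, the value 0.7 for the population parameter, the seed and the number of draws are all assumptions made here, not part of the slides. It drops the constant $\log(2\pi)$ from $Q$ because it cancels in both differences.

```python
import random
import statistics


def simulate_fit_pair(theta_star: float, rng: random.Random) -> tuple[float, float]:
    """Draw one (eta, eta_tilde) pair for the one-observation N(theta_star, 1) example."""
    x = rng.gauss(theta_star, 1.0)  # in-sample observation
    y = rng.gauss(theta_star, 1.0)  # independent out-of-sample observation
    theta_hat = x                   # MLE based on the single in-sample observation

    def q(obs: float, theta: float) -> float:
        # Criterion up to the constant -log(2*pi), which cancels in the differences
        return -(obs - theta) ** 2

    eta = q(x, theta_hat) - q(x, theta_star)        # algebraically Z1**2
    eta_tilde = q(y, theta_hat) - q(y, theta_star)  # algebraically 2*Z1*Z2 - Z1**2
    return eta, eta_tilde


rng = random.Random(12345)  # arbitrary seed, chosen for reproducibility
draws = [simulate_fit_pair(0.7, rng) for _ in range(200_000)]
mean_eta = statistics.fmean(e for e, _ in draws)
mean_eta_tilde = statistics.fmean(t for _, t in draws)
print(f"mean eta       = {mean_eta:.3f}  (theory: +1)")
print(f"mean eta_tilde = {mean_eta_tilde:.3f}  (theory: -1)")
```

The in-sample term is never negative, while the out-of-sample term averages to minus one: the same $Z_1^2$ that flatters the in-sample fit is subtracted out-of-sample.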
General Framework
In-sample: $X = (X_1, \ldots, X_n)$. Out-of-sample: $Y = (Y_1, \ldots, Y_m)$.
We care about: the criterion function $Q(Y, \theta)$.
Available for estimation: $X = (X_1, \ldots, X_n)$ drawn from the same distribution as $Y$.
Estimate $\theta$ by maximizing the empirical criterion function: $\hat\theta = \hat\theta(X) = \arg\max_\theta Q(X, \theta)$.
MSPE Example
Example: Want to predict $Y = X_{n+1}$ at time $n$. A prediction, $\theta$, results in
$Q(Y, \theta) = -(X_{n+1} - \theta)^2$.
Minimize in-sample MSE:
$\max_\theta Q(X, \theta)$, with $Q(X, \theta) = -\sum_{t=1}^{n} (X_t - \theta)^2$.
In-sample estimator: $\hat\theta(X) = \bar X = n^{-1} \sum_{t=1}^{n} X_t$.
Relative Fit: In-sample
In-sample measure of overfitting: $\eta = Q(X, \hat\theta) - Q(X, \theta^\star)$.
How well does $\hat\theta = \hat\theta(X)$ "fit" $X$ relative to the population parameter, $\theta^\star = \arg\max_\theta \mathrm{E}\{Q(X, \theta)\}$?
Likelihood special case: $Q(X, \theta) = 2\ell(X, \theta)$ (= 2 log-likelihood).
$\eta$ is Kullback-Leibler information.
Relative Fit: Out-of-Sample
The out-of-sample equivalent, the one we care about, is $\tilde\eta = Q(Y, \hat\theta) - Q(Y, \theta^\star)$.
How well does $\hat\theta = \hat\theta(X)$ fit independent data drawn from the same distribution as $X$?
In-sample: overfitting, $\eta \ge 0$.
Out-of-sample: $\tilde\eta \gtrless 0$, tends to have $\tilde\eta < 0$.
Relation is more than one of opposite expectations
Regular problems ($Q = 2 \log L$, correctly specified): we have
$\mathrm{E}\{Q(X, \hat\theta) - Q(X, \theta^\star)\} = +k$,
$\mathrm{E}\{Q(Y, \hat\theta) - Q(Y, \theta^\star)\} = -k$.
The difference $\mathrm{E}\{Q(X, \hat\theta) - Q(Y, \hat\theta)\} = 2k$ motivated Akaike's Information Criterion (AIC).
The relation between $\eta$ and $\tilde\eta$ is much closer than that of having opposite expected values.
Regular problems: we have
$\mathrm{E}\{Q(Y, \hat\theta) - Q(Y, \theta^\star) \mid X\} = -\{Q(X, \hat\theta) - Q(X, \theta^\star)\} = -\eta$.
Important!
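In the one-observation normal example from earlier in the deck, the conditional relation can be verified in one line. This is a sanity check for that special case only, not a proof of the general statement; it uses that $Z_2$ is independent of $X$ with mean zero.

```latex
\begin{aligned}
\mathrm{E}(\tilde\eta \mid X)
  &= \mathrm{E}\left(2 Z_1 Z_2 - Z_1^2 \mid Z_1\right) \\
  &= 2 Z_1 \, \mathrm{E}(Z_2) - Z_1^2 \\
  &= -Z_1^2 = -\eta .
\end{aligned}
```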
Extremum Estimator Framework
Assumption 1
(i) $\bar Q(X, \theta) = n^{-1} Q(X, \theta) \xrightarrow{p} Q(\theta)$ uniformly in $\theta$ on an open neighborhood of $\theta^\star$, as $n \to \infty$.
(ii) $\bar H(X, \theta) = \partial^2 \bar Q(X, \theta) / \partial\theta\,\partial\theta'$ exists and is continuous in an open neighborhood of $\theta^\star$.
(iii) $\bar H(X, \theta) \xrightarrow{p} -\mathcal{I}(\theta)$ uniformly in $\theta$ in an open neighborhood of $\theta^\star$; $\mathcal{I}(\theta)$ is continuous in a neighborhood of $\theta^\star$ and $\mathcal{I}_0 = \mathcal{I}(\theta^\star) \in \mathbb{R}^{k \times k}$ is positive definite.
(iv) $n^{-1/2} S(X, \theta^\star) \xrightarrow{d} N\{0, \mathcal{J}_0\}$, where $S(\cdot) = \partial Q(\cdot) / \partial\theta$ and $\mathcal{J}_0 = \lim_{n \to \infty} \mathrm{E}\{n^{-1} S(X, \theta^\star) S(X, \theta^\star)'\}$.
Theorem 1
Given Assumption 1. As $n, m \to \infty$ with $m/n \to \pi$, we have
$$
\begin{pmatrix} \eta \\ \tilde\eta \end{pmatrix}
\xrightarrow{d}
\begin{pmatrix} \zeta_1 \\ 2\sqrt{\pi}\,\zeta_2 - \pi\,\zeta_1 \end{pmatrix},
$$
where $\zeta_1 = Z_1' \Lambda Z_1$, $\zeta_2 = Z_1' \Lambda Z_2$, and $Z_i \sim \text{iid } N_k(0, I_k)$.
Furthermore, $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_k)$, where $\lambda_1, \ldots, \lambda_k$ are the eigenvalues of $\mathcal{I}_0^{-1} \mathcal{J}_0$.
Here $n^{-1} H(X, \theta^\star) \xrightarrow{p} -\mathcal{I}_0$ and $\mathcal{J}_0 = \lim_{n \to \infty} \mathrm{E}\{n^{-1} S(X, \theta^\star) S(X, \theta^\star)'\}$.
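Draws from the limit distribution in Theorem 1 are easy to generate once $\Lambda$ and $\pi$ are given. The following is a minimal sketch under assumptions chosen here for illustration: $\Lambda = I$ with $k = 3$, $\pi = 1$, an arbitrary seed, and the hypothetical helper name `draw_limit_pair`. It checks only the two means, which are $\mathrm{tr}(\Lambda)$ and $-\pi\,\mathrm{tr}(\Lambda)$ because $Z_1$ and $Z_2$ are independent.

```python
import math
import random
import statistics


def draw_limit_pair(lambdas: list[float], pi_ratio: float, rng: random.Random) -> tuple[float, float]:
    """One draw of (zeta1, 2*sqrt(pi)*zeta2 - pi*zeta1) with Lambda = diag(lambdas)."""
    z1 = [rng.gauss(0.0, 1.0) for _ in lambdas]
    z2 = [rng.gauss(0.0, 1.0) for _ in lambdas]
    zeta1 = sum(lam * a * a for lam, a in zip(lambdas, z1))         # Z1' Lambda Z1
    zeta2 = sum(lam * a * b for lam, a, b in zip(lambdas, z1, z2))  # Z1' Lambda Z2
    return zeta1, 2.0 * math.sqrt(pi_ratio) * zeta2 - pi_ratio * zeta1


rng = random.Random(2010)  # arbitrary seed
lambdas = [1.0, 1.0, 1.0]  # assumed case: Lambda = I, k = 3
pi_ratio = 1.0             # assumed case: m/n -> 1
draws = [draw_limit_pair(lambdas, pi_ratio, rng) for _ in range(100_000)]
mean_eta = statistics.fmean(d[0] for d in draws)
mean_eta_tilde = statistics.fmean(d[1] for d in draws)
print(f"mean of eta limit       = {mean_eta:.3f}  (theory: {sum(lambdas):.1f})")
print(f"mean of eta_tilde limit = {mean_eta_tilde:.3f}  (theory: {-pi_ratio * sum(lambdas):.1f})")
```

Changing `lambdas` or `pi_ratio` shows how misspecification (eigenvalues different from one) or a longer evaluation sample rescales both components.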
The Joint Distribution of Eta and Eta-tilde
Case: $\Lambda = I$ and $k = 3$.
[Figure: contour plot of the joint density of (eta, etatilde); horizontal axis eta from 0 to 8, vertical axis etatilde from -10 to 4.]
Conditional Density
$\tilde\eta \mid \eta \sim N(-\eta, 4\eta)$
[Figure: conditional density of $\tilde\eta$ given $\eta$, for $\eta = 1$, $\eta = 4$ and $\eta = 10$; horizontal axis from -10 to 10, vertical axis from 0.00 to 0.20.]
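The conditional law $N(-\eta, 4\eta)$ can be evaluated directly. The sketch below uses a hypothetical helper, `conditional_density`, written for this illustration; it evaluates the normal density with mean $-\eta$ and variance $4\eta$ at its mode for the three values of $\eta$ shown in the figure.

```python
import math


def conditional_density(eta_tilde: float, eta: float) -> float:
    """Density of N(-eta, 4*eta) evaluated at eta_tilde; requires eta > 0."""
    if eta <= 0:
        raise ValueError("eta must be positive")
    variance = 4.0 * eta
    exponent = -((eta_tilde + eta) ** 2) / (2.0 * variance)
    return math.exp(exponent) / math.sqrt(2.0 * math.pi * variance)


for eta in (1.0, 4.0, 10.0):
    peak = conditional_density(-eta, eta)  # the density is highest at eta_tilde = -eta
    print(f"eta = {eta:4.1f}: mode at {-eta:6.1f}, peak density {peak:.4f}")
```

A larger in-sample overfit shifts the out-of-sample distribution further into negative territory and also spreads it out, which is why the curves flatten as $\eta$ grows.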
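Sketch of Proof for the General Case
In-sample: $S(\cdot, \theta) = \partial Q(\cdot, \theta) / \partial\theta$ and $H(\cdot, \theta) = \partial^2 Q(\cdot, \theta) / \partial\theta\,\partial\theta'$.
$$0 = S(X, \hat\theta) \simeq S(X, \theta^\star) + H(X, \theta^\star)(\hat\theta - \theta^\star)
\quad\Rightarrow\quad
(\hat\theta - \theta^\star) \simeq [-H(X, \theta^\star)]^{-1} S(X, \theta^\star).$$
Next,
$$Q(X, \theta^\star) \simeq Q(X, \hat\theta) + S(X, \hat\theta)'(\theta^\star - \hat\theta) + \tfrac{1}{2} (\hat\theta - \theta^\star)' H(X, \theta^\star) (\hat\theta - \theta^\star),$$
so that
$$
\begin{aligned}
\eta = Q(X, \hat\theta) - Q(X, \theta^\star)
&\simeq \tfrac{1}{2} (\hat\theta - \theta^\star)' [-H(X, \theta^\star)] (\hat\theta - \theta^\star) \\
&\simeq \tfrac{1}{2} S(X, \theta^\star)' [-H(X, \theta^\star)]^{-1} S(X, \theta^\star).
\end{aligned}
$$
Often
$$[-H(X, \theta^\star)]^{-1/2} S(X, \theta^\star) = \left[ -n^{-1} \sum_{t=1}^{n} h_t(\theta^\star) \right]^{-1/2} n^{-1/2} \sum_{t=1}^{n} s_t(\theta^\star) \xrightarrow{d} N(0, \Sigma).$$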
Sketch of Proof for the General Case
Out-of-sample:
$$Q(Y, \hat\theta) \simeq Q(Y, \theta^\star) + S(Y, \theta^\star)'(\hat\theta - \theta^\star) + \tfrac{1}{2} (\hat\theta - \theta^\star)' H(Y, \theta^\star) (\hat\theta - \theta^\star),$$
so that
$$
\begin{aligned}
\tilde\eta = Q(Y, \hat\theta) - Q(Y, \theta^\star)
&\simeq S(Y, \theta^\star)' [-H(X, \theta^\star)]^{-1} S(X, \theta^\star) \\
&\quad - \tfrac{1}{2} S(X, \theta^\star)' [-H(X, \theta^\star)]^{-1} [-H(Y, \theta^\star)] [-H(X, \theta^\star)]^{-1} S(X, \theta^\star).
\end{aligned}
$$
Simplest case, $m = n$ large: $H(X, \theta^\star) [H(Y, \theta^\star)]^{-1} \approx I$. So that
$$\tilde\eta \approx Z_Y' Z_X - \eta.$$
Empirical Example: Term Structure VAR
Five interest rates with different maturities: 3, 6, 12, 60 and 120 months.
[Figure: time series of the five rates (TB3MS, TB6MS, GS1, GS5, GS10), January 1959 to January 2007; vertical axis from 0.00 to 18.00.]
Cointegrated VAR
$$\Delta X_t = \alpha\beta' X_{t-1} + \sum_{j=1}^{p-1} \Gamma_j \Delta X_{t-j} + \mu + \varepsilon_t,$$
with $p = 1, \ldots, 12$, and cointegration rank $r = 0, \ldots, 5$.
Sample divided into odd months, $\mathcal{T}_{odd}$, and even months, $\mathcal{T}_{even}$.
Estimate the parameters, $\theta = (\alpha, \beta, \Gamma_1, \ldots, \Gamma_{p-1}, \mu)$, by maximizing either
$$Q_{odd}(\cdot, \theta) = -\frac{T_{odd}}{2} \log \left| \frac{1}{T_{odd}} \sum_{t \in \mathcal{T}_{odd}} \hat\varepsilon_t \hat\varepsilon_t' \right|$$
or
$$Q_{even}(\cdot, \theta) = -\frac{T_{even}}{2} \log \left| \frac{1}{T_{even}} \sum_{t \in \mathcal{T}_{even}} \hat\varepsilon_t \hat\varepsilon_t' \right|.$$
In-Sample
In-sample: criterion function (maximized log-likelihood)
              Odd months                    Even months
  p     r = 3     r = 4     r = 5     r = 3     r = 4     r = 5
  1   2871.16   2874.11   2874.76   2980.96   2987.08   2987.19
  2   2948.22   2951.38   2952.70   3039.33   3048.10   3048.32
  3   2989.19   2993.22   2993.63   3073.42   3079.72   3079.91
  4   3024.61   3028.73   3029.46   3102.98   3109.33   3109.58
  5   3054.65   3057.46   3058.19   3114.13   3120.43   3120.59
  6   3086.68   3089.42   3090.15   3130.52   3135.51   3135.66
  7   3109.40   3113.22   3113.75   3178.24   3182.07   3182.20
  8   3142.61   3145.66   3146.02   3218.15   3222.01   3222.15
  9   3178.79   3179.84   3179.90   3250.05   3254.92   3254.97
 10   3203.53   3204.56   3204.72   3296.67   3299.91   3299.93
 11   3226.99   3227.48   3227.56   3316.55   3320.13   3320.16
 12   3252.82   3253.10   3253.10   3338.01   3342.79   3342.79
In-Sample (AIC)
Akaike's Information Criterion (AIC)
              Odd months                    Even months
  p     r = 3     r = 4     r = 5     r = 3     r = 4     r = 5
  1   2850.16   2850.11   2849.76   2959.96   2963.08   2962.19
  2   2902.22   2902.38   2902.70   2993.33   2999.10   2998.32
  3   2918.19   2919.22   2918.63   3002.42   3005.72   3004.91
  4   2928.61   2929.73   2929.46   3006.98   3010.33   3009.58
  5   2933.65   2933.46   2933.19   2993.13   2996.43   2995.59
  6   2940.68   2940.42   2940.15   2984.52   2986.51   2985.66
  7   2938.40   2939.22   2938.75   3007.24   3008.07   3007.20
  8   2946.61   2946.66   2946.02   3022.15   3023.01   3022.15
  9   2957.79   2955.84   2954.90   3029.05   3030.92   3029.97
 10   2957.53   2955.56   2954.72   3050.67   3050.91   3049.93
 11   2955.99   2953.48   2952.56   3045.55   3046.13   3045.16
 12   2956.82   2954.10   2953.10   3042.01   3043.79   3042.79
Out-of-Sample
Out-of-sample: criterion function
              Odd months                    Even months
  p     r = 3     r = 4     r = 5     r = 3     r = 4     r = 5
  1   2843.65   2848.35   2848.98   2941.21   2950.63   2951.93
  2   2866.41   2873.09   2874.13   2937.93   2951.44   2952.66
  3   2852.75   2856.10   2856.68   2913.50   2918.82   2921.47
  4   2833.62   2834.61   2835.36   2876.32   2876.13   2879.63
  5   2822.51   2824.16   2824.78   2849.55   2849.50   2853.05
  6   2815.58   2816.87   2817.64   2837.17   2838.54   2841.02
  7   2794.20   2797.97   2798.59   2839.22   2840.04   2842.28
  8   2768.54   2773.15   2773.98   2799.91   2804.68   2805.98
  9   2755.05   2759.44   2760.06   2763.76   2767.40   2768.22
 10   2715.22   2719.79   2720.18   2733.72   2737.91   2738.88
 11   2716.26   2722.15   2722.60   2716.76   2719.00   2720.85
 12   2695.15   2699.25   2699.16   2676.57   2678.54   2678.88
Eta – Eta-tilde
In-sample: "Odd" observations, $Q(p, r) - Q(p - 1, r)$.
        Odd months (in-sample): η            Even months: η̃
  p    r = 2    r = 3    r = 4    r = 5    r = 2    r = 3    r = 4    r = 5
  1     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
  2    74.96    77.06    77.27    77.93     0.79    -3.28     0.80     0.73
  3    43.65    40.97    41.84    40.93   -22.38   -24.43   -32.62   -31.19
  4    34.95    35.41    35.50    35.83   -35.31   -37.18   -42.69   -41.84
  5    31.08    30.04    28.74    28.74   -28.26   -26.77   -26.63   -26.57
  6    32.50    32.03    31.96    31.95    -8.15   -12.38   -10.96   -12.04
  7    23.63    22.73    23.80    23.60     4.36     2.04     1.50     1.26
  8    34.23    33.21    32.44    32.27   -33.52   -39.31   -35.35   -36.30
  9    35.39    36.18    34.18    33.88   -36.11   -36.15   -37.28   -37.76
 10    25.04    24.74    24.72    24.82   -31.54   -30.04   -29.49   -29.34
 11    23.37    23.46    22.92    22.84   -17.71   -16.96   -18.91   -18.03
 12    26.36    25.82    25.62    25.54   -41.96   -40.19   -40.45   -41.97
Scatterplot: $Q(p + 1, r) - Q(p, r)$, $p \ge 3$
[Figure: scatterplot.]
Multiple Models
With multiple models, $j = 1, \ldots, M$:
Determine a nesting model, $\theta \in \Theta$,
so that the $j$-th model is characterized by $\Theta_j \subset \Theta$.
Parameters in j-th Model
Population parameter in the $j$-th model is given by
$\theta_j^\star = \arg\max_{\theta \in \Theta_j} \bar Q(\theta)$, where $\bar Q(\theta) = \lim_{n \to \infty} n^{-1} \underbrace{\sum_{i=1}^{n} q_i(X_i, \theta)}_{Q(X, \theta)}$.
Estimator in the $j$-th model is
$\hat\theta_j = \arg\max_{\theta \in \Theta_j} Q(X, \theta) = \arg\max_{\theta \in \Theta_j} \sum_{i=1}^{n} q_i(X_i, \theta)$.
A Useful Decomposition
Observed in-sample fit expressed by 3 terms:
$$Q(X, \hat\theta_j) = \underbrace{Q(\theta_j^\star)}_{\text{Genuine } (\mu_j)} + \underbrace{Q(X, \theta_j^\star) - Q(\theta_j^\star)}_{\text{Ordinary noise } (\nu_j)} + \underbrace{Q(X, \hat\theta_j) - Q(X, \theta_j^\star)}_{\text{Deceptive noise } (\eta_j)}.$$
Example: $y_i = \beta_{(j)}' x_i^{(j)} + \varepsilon_{j,i}$, with $\sigma^2_{(j)} = \mathrm{E}(\varepsilon_{j,i}^2)$.
$$Q(X, \hat\beta_{(j)}) = -\sum_{i=1}^{n} \hat\varepsilon_{j,i}^2 = \underbrace{-\sum_{i=1}^{n} \sigma^2_{(j)}}_{\mu_j = -n\sigma^2_{(j)}} + \underbrace{\sum_{i=1}^{n} (\sigma^2_{(j)} - \varepsilon_{j,i}^2)}_{\nu_j} + \underbrace{\sum_{i=1}^{n} (\varepsilon_{j,i}^2 - \hat\varepsilon_{j,i}^2)}_{\eta_j}.$$
Sampling variation comes in two flavors, one of them being particularly nasty.
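The three-term decomposition in the regression example is an exact identity, which a short simulation makes concrete. This is a sketch with assumed inputs (one regressor, slope 0.5, unit error variance, an arbitrary seed) and a hypothetical function name, `decompose_fit`; none of these come from the slides.

```python
import random


def decompose_fit(n: int, beta: float, sigma2: float, rng: random.Random) -> dict[str, float]:
    """Split Q(X, beta_hat) = -sum(residual^2) into genuine, ordinary-noise and deceptive parts."""
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    eps = [rng.gauss(0.0, sigma2 ** 0.5) for _ in range(n)]  # true errors
    y = [beta * xi + ei for xi, ei in zip(x, eps)]

    # Least squares slope and fitted residuals
    beta_hat = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
    resid = [yi - beta_hat * xi for xi, yi in zip(x, y)]

    return {
        "observed": -sum(r * r for r in resid),                           # Q(X, beta_hat)
        "genuine": -n * sigma2,                                           # mu_j
        "ordinary": sum(sigma2 - e * e for e in eps),                     # nu_j
        "deceptive": sum(e * e - r * r for e, r in zip(eps, resid)),      # eta_j
    }


parts = decompose_fit(n=100, beta=0.5, sigma2=1.0, rng=random.Random(7))
total = parts["genuine"] + parts["ordinary"] + parts["deceptive"]
print(parts)
print(f"sum of the three terms = {total:.6f}")
```

The deceptive term is never negative here, because least squares residuals cannot have a larger sum of squares than the true errors; that is the part of the fit that does not carry over to new data.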
[Figure: bar charts for five models, M1 to M5, with panels "Observed", "Genuine" plus "Deceptive", and "Gen-Dec" (genuine minus deceptive). Bar values as labeled in the two charts:]
First chart
Model        M1    M2    M3    M4    M5
Observed    100    95    95    90    80
Genuine      92    88    81    70    68
Deceptive     8     7    14    20    12
Gen-Dec      84    81    67    50    56
Second chart
Model        M1    M2    M3    M4    M5
Observed    100    95    95    90    80
Genuine      82    83    81    78    78
Deceptive    18    12    14    12     2
Gen-Dec      64    71    67    66    76
AIC Example
Estimate $K$ regression models,
$y_i = \beta_j x_{j,i} + \varepsilon_{j,i}$, $j = 1, \ldots, K$,
by least squares, $\hat\beta_j = \sum_{i=1}^{n} x_{j,i} y_i / \sum_{i=1}^{n} x_{j,i}^2$, where $x_{j,i} \sim \text{iid } N(0, 1)$.
True model is $y_i = \varepsilon_i$, where $\varepsilon_i \perp\!\!\!\perp x_{j,i}$, and $\varepsilon_i \sim \text{iid } N(0, 1)$.
$$\mathrm{AIC}_j = -\left( \sum_{i=1}^{n} \hat\varepsilon_{j,i}^2 + 2 \right).$$
Consider $\mathrm{AIC}_{\max} = \max_{1 \le j \le K} \mathrm{AIC}_j$ and $Q(Y, \hat\beta_j)$.
AIC Example (cont.)
n = 100, Nsim > 1,000,000
  K    AICmax   Qmax(Y)   Bias    AICmin   Qmin(Y)
  1   -101.00   -101.01   0.01   -101.00   -101.01
  2   -100.36   -101.66   1.30   -101.63   -100.37
  3    -99.90   -102.13   2.23   -101.80   -100.19
  4    -99.54   -102.50   2.97   -101.88   -100.12
  5    -99.24   -102.81   3.57   -101.91   -100.08
  6    -98.99   -103.07   4.08   -101.94   -100.06
  7    -98.77   -103.30   4.53   -101.95   -100.04
  8    -98.57   -103.49   4.92   -101.96   -100.03
  9    -98.40   -103.67   5.28   -101.97   -100.02
 10    -98.24   -103.84   5.60   -101.97   -100.02
 11    -98.09   -103.98   5.89   -101.98   -100.01
 12    -97.96   -104.12   6.17   -101.98   -100.01
 13    -97.83   -104.25   6.42   -101.98   -100.01
 14    -97.72   -104.36   6.65   -101.99   -100.01
 15    -97.61   -104.48   6.87   -101.99   -100.00
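A scaled-down version of this experiment can be run in a few seconds. The sketch below is not the author's code: it uses far fewer replications than the slides report, fixes $K = 5$, and relies on a hypothetical helper, `one_replication`. It also draws the out-of-sample regressor only for the selected model, which is a shortcut that is valid here because the regressors are iid and independent of the selection step.

```python
import random
import statistics


def one_replication(n: int, k: int, rng: random.Random) -> tuple[float, float, float]:
    """Return (AIC of model 1, largest AIC among k models, out-of-sample Q of the AIC winner)."""
    y = [rng.gauss(0.0, 1.0) for _ in range(n)]  # true model: y_i = eps_i
    first_aic = 0.0
    best_aic = float("-inf")
    best_beta = 0.0
    for j in range(k):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]  # regressor of model j, unrelated to y
        beta_hat = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
        rss = sum((b - beta_hat * a) ** 2 for a, b in zip(x, y))
        aic = -(rss + 2.0)  # AIC_j = -(sum of squared residuals + 2)
        if j == 0:
            first_aic = aic
        if aic > best_aic:
            best_aic, best_beta = aic, beta_hat
    # Fresh out-of-sample data of the same size for the selected model
    x_new = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y_new = [rng.gauss(0.0, 1.0) for _ in range(n)]
    q_out = -sum((b - best_beta * a) ** 2 for a, b in zip(x_new, y_new))
    return first_aic, best_aic, q_out


rng = random.Random(2010)  # arbitrary seed
results = [one_replication(n=100, k=5, rng=rng) for _ in range(2000)]
mean_first = statistics.fmean(r[0] for r in results)
mean_best = statistics.fmean(r[1] for r in results)
mean_q_out = statistics.fmean(r[2] for r in results)
print(f"mean AIC of a single model   = {mean_first:.2f}")
print(f"mean AICmax over K = 5       = {mean_best:.2f}")
print(f"mean out-of-sample Q, winner = {mean_q_out:.2f}")
print(f"estimated bias               = {mean_best - mean_q_out:.2f}")
```

With only 2000 replications the estimates are noisy, so they should be compared with the K = 5 row of the table only roughly; the sign of the bias is the robust feature.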
Winner's Curse
[Figure: contour plots of the joint density in four panels, m = 1, m = 2, m = 5 and m = 20; horizontal axes from 0 to 8, vertical axes from -10 to 4.]
AIC/BIC Paradox (cont.)
Consider the hypothetical case where
$\mu_j$ and $\nu_j$ are the same for all models;
and all models have the same number of parameters.
Which model does AIC/BIC select?
$$\arg\max_j \xi_j = \arg\max_j (\mu_j + \nu_j + \eta_j - p_j) \simeq \arg\max_j \eta_j.$$
That happens to be $\min_j (\mu_j - \eta_j)$.
The worst possible model! Knock-out blow to model selection in this context.
AIC Example (cont.)
Model Average
The simple average, $\bar\beta_x = K^{-1} \sum_{j=1}^{K} \hat\beta_j$, results in $\bar\eta \sim \frac{1}{K} \chi^2_1$, with expected value $\frac{1}{K}$.
When some models are better than other models (unlike this example) the question is whether $\mu_j$ of the selected model is sufficiently large to offset $\eta_j$.
On Information Criteria
AIC may work reasonably well with a small set of nested models.
With "many" models it has a tendency to select a model with a large $\eta_j$. BIC does so to a lesser extent.
Not suitable for comparing many models with the same number of parameters. There is no penalty for the "search".
Macro Application (Stock and Watson 131 Time Series)
Focus on nine series: PI, IP, UR, EMP, TBILL, TBOND, PPI, CPI, PCED.
S&W, "An Empirical Comparison of Methods for Forecasting using Many Predictors".
Forecast horizon: 12 months ahead.
Baseline forecast: $X_{t+h} = \alpha + \beta X_t + \gamma PC_t^{(1)} + \varepsilon_{t,t+h}$.
Alternative forecasts: add a variable or $PC_t^{(j)}$.
Fixed scheme (1960:01-1994:12). Evaluation (1995:01-2003:12).
Winner's Curse
Report $\hat\sigma^2_X = n^{-1} \sum_{t=1}^{n} \hat\varepsilon_t^2$ and $\hat\sigma^2_Y = m^{-1} \sum_{t=n+1}^{n+m} \hat\varepsilon_t^2$.
Series   σ̂²_X   σ̂²_Y   σ̂²_{,X}    ΔQ_X   σ̂²_{,Y}     ΔQ_Y
PI        3.61    2.95      2.75   27.21%      4.19   -34.98%
IP       21.02   10.38     11.96   56.36%     12.09   -15.22%
UR        1.02    0.26      0.55   62.44%      0.56   -76.75%
EMP      46.67   25.05     36.06   25.78%     34.39   -31.70%
TBILL     2.75    1.54      2.41   13.28%      2.41   -45.04%
TBOND     1.21    0.62      0.95   24.53%      0.44    35.61%
PPI      26.57   24.13     23.48   12.35%     24.61    -1.94%
CPI      10.79   14.76     10.27    4.92%     14.71     0.35%
PCED      7.52    3.17      6.96    7.82%      3.32    -4.57%
Winner's Curse
[Figure]
Example: Portfolio Choice
Vector of returns: $X_t \sim N(\mu, \Sigma)$.
Objective is
$\max_w \; w'\mu - \frac{\gamma}{2} w'\Sigma w$, subject to $\iota'w = 1$,
or, in the presence of a risk-free asset,
$\max_w \; w_0\mu_0 + w'\mu - \frac{\gamma}{2} w'\Sigma w$, subject to $\iota'w = 1 - w_0$.
Criterion function:
$$Q(X, w) = \sum_{t=1}^{n} w'X_t - \frac{\gamma}{2} w' \sum_{t=1}^{n} (X_t - \bar X)(X_t - \bar X)' w.$$
Example: Optimal Weight
Solution to (no risk-free asset)
$\max_w \; w'\mu - \frac{\gamma}{2} w'\Sigma w$, subject to $\iota'w = 1$:
$$w^\star = \Sigma^{-1} \left[ \mu\gamma^{-1} + \iota \, \frac{1 - \iota'\Sigma^{-1}\mu/\gamma}{\iota'\Sigma^{-1}\iota} \right].$$
Solution to (with a risk-free asset)
$\max_w \; w_0\mu_0 + w'\mu - \frac{\gamma}{2} w'\Sigma w$, subject to $\iota'w = 1 - w_0$:
$$w^\star = \gamma^{-1} \Sigma^{-1} (\mu - \mu_0\iota).$$
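Both closed-form solutions can be evaluated with a small linear solver. The sketch below is an illustration under assumptions: the helper names (`solve`, `weights_no_riskfree`, `weights_with_riskfree`), the risk aversion $\gamma = 2$ and the risk-free return $\mu_0 = 0.3$ are chosen here and do not appear in the slides. The inputs are the first three entries of the Jobson and Korkie mean vector and covariance matrix reported in this deck.

```python
def solve(a: list[list[float]], b: list[float]) -> list[float]:
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting (small dense systems)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix; inputs are not modified
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def weights_no_riskfree(mu: list[float], sigma: list[list[float]], gamma: float) -> list[float]:
    """w = Sigma^{-1} [ mu/gamma + iota (1 - iota' Sigma^{-1} mu / gamma) / (iota' Sigma^{-1} iota) ]."""
    ones = [1.0] * len(mu)
    s_inv_mu = solve(sigma, mu)      # Sigma^{-1} mu
    s_inv_iota = solve(sigma, ones)  # Sigma^{-1} iota
    scale = (1.0 - sum(s_inv_mu) / gamma) / sum(s_inv_iota)
    return [a / gamma + scale * b for a, b in zip(s_inv_mu, s_inv_iota)]


def weights_with_riskfree(mu: list[float], sigma: list[list[float]], gamma: float,
                          mu0: float) -> tuple[list[float], float]:
    """w = gamma^{-1} Sigma^{-1} (mu - mu0 iota); the risk-free weight is w0 = 1 - iota' w."""
    excess = [m - mu0 for m in mu]
    w = [v / gamma for v in solve(sigma, excess)]
    return w, 1.0 - sum(w)


mu = [0.50, 0.90, 1.10]
sigma = [[53.6, 6.6, 19.8],
         [6.6, 29.8, 16.7],
         [19.8, 16.7, 82.9]]
gamma, mu0 = 2.0, 0.3  # assumed values, for illustration only

w = weights_no_riskfree(mu, sigma, gamma)
w_rf, w0 = weights_with_riskfree(mu, sigma, gamma, mu0)
print("weights without a risk-free asset:", [round(v, 4) for v in w])
print("risky weights with a risk-free asset:", [round(v, 4) for v in w_rf], "risk-free weight:", round(w0, 4))
```

In the constrained case the gradient $\mu - \gamma\Sigma w$ must be proportional to $\iota$, which gives a quick way to check the weights without redoing the algebra.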
Example: Portfolio Choice (cont.)
Jobson & Korkie (JASA, 1980): 20 randomly selected stocks. Estimates for monthly returns (December 1949 – December 1975):
µ̂' = (0.50 0.90 1.10 1.74 1.82 1.11 0.91 1.18 1.35 1.07 1.16 1.23 0.81 1.18 0.88 1.20 0.72 1.16 0.92 1.25)
Σ̂ =
 53.6   6.6  19.8  34.1   6.3   5.8  16.9  15.3   9.6  10.1  10.3  18.9   8.5  14.6  14.6  14.3  27.9  25.1  11.8  16.9
  6.6  29.8  16.7  20.7   6.5  11.8   8.4   9.3  10.8  11.2   9.6   8.8  13.5  14.1  16.5   8.8  14.9  16.7  22.8  10.3
 19.8  16.7  82.9  48.0  18.7  21.0  22.2  16.2  16.3  18.9  21.6  27.0   8.8  23.3  17.4  22.3  36.7  41.4  21.4  27.7
 34.1  20.7  48.0 178.1  27.5  19.3  32.4  26.9  18.3  22.4  21.8  41.7  17.3  42.9  26.3  30.3  66.0  47.6  20.7  43.9
  6.3   6.5  18.7  27.5 118.1  26.3  23.9  12.4  14.2  23.1  31.0  13.1   5.4  20.4   9.9  14.3  17.1  20.2  13.5  18.5
  5.8  11.8  21.0  19.3  26.3  57.1  20.2  11.7  15.2  16.3  13.7  19.3   7.8  21.5  11.3  13.2  13.5  12.3  16.8  18.1
 16.9   8.4  22.2  32.4  23.9  20.2  52.1  15.3  12.1  17.7  18.0  21.4   9.6  26.4  16.2  15.2  25.6  24.8  15.5  25.3
 15.3   9.3  16.2  26.9  12.4  11.7  15.3  48.3   9.7   9.4   8.6  14.4   9.9  11.3  13.3  17.0  32.1  21.7  14.3  15.8
  9.6  10.8  16.3  18.3  14.2  15.2  12.1   9.7  29.8  11.2  13.1  13.8   7.3  16.7  11.4   8.2  15.7  20.6  14.8  10.7
 10.1  11.2  18.9  22.4  23.1  16.3  17.7   9.4  11.2  35.1  22.6  13.0   7.9  17.6  10.7  12.6  16.2  21.5  14.2  14.7
 10.3   9.6  21.6  21.8  31.0  13.7  18.0   8.6  13.1  22.6  47.6  16.6   6.0  19.8   9.3  13.5  20.5  18.8  13.3  17.7
 18.9   8.8  27.0  41.7  13.1  19.3  21.4  14.4  13.8  13.0  16.6  65.6   7.9  23.1  11.6  25.8  35.8  26.4  17.0  23.7
  8.5  13.5   8.8  17.3   5.4   7.8   9.6   9.9   7.3   7.9   6.0   7.9  23.5  12.0  14.3   8.5  15.2  14.2  15.8   9.7
 14.6  14.1  23.3  42.9  20.4  21.5  26.4  11.3  16.7  17.6  19.8  23.1  12.0  51.2  16.4  14.7  26.2  25.6  20.4  20.9
 14.6  16.5  17.4  26.3   9.9  11.3  16.2  13.3  11.4  10.7   9.3  11.6  14.3  16.4  28.7  12.2  19.9  24.3  22.4  13.8
 14.3   8.8  22.3  30.3  14.3  13.2  15.2  17.0   8.2  12.6  13.5  25.8   8.5  14.7  12.2  56.0  32.3  24.5  13.1  14.5
 27.9  14.9  36.7  66.0  17.1  13.5  25.6  32.1  15.7  16.2  20.5  35.8  15.2  26.2  19.9  32.3 109.5  50.8  18.6  32.3
 25.1  16.7  41.4  47.6  20.2  12.3  24.8  21.7  20.6  21.5  18.8  26.4  14.2  25.6  24.3  24.5  50.8 131.8  27.0  29.2
 11.8  22.8  21.4  20.7  13.5  16.8  15.5  14.3  14.8  14.2  13.3  17.0  15.8  20.4  22.4  13.1  18.6  27.0  44.7  16.1
 16.9  10.3  27.7  43.9  18.5  18.1  25.3  15.8  10.7  14.7  17.7  23.7   9.7  20.9  13.8  14.5  32.3  29.2  16.1  58.7
Portfolio Choice with N=5

Without a risk-free asset (N = 5)
    T       η       η̃    η/T    η̃/T  Q̄(X,w*)  Q̄(X,ŵ)  Q̄(Y,ŵ)  Q̄(Y,w^e)
   60   36.67  -43.73   0.61  -0.73     0.34     0.95    -0.38      0.07
  120   34.92  -37.89   0.29  -0.32     0.34     0.63     0.02      0.06
  180   34.24  -36.23   0.19  -0.20     0.34     0.53     0.14      0.06
  240   34.06  -35.42   0.14  -0.15     0.33     0.48     0.19      0.05
  360   33.68  -34.45   0.09  -0.10     0.33     0.43     0.24      0.05
  480   33.69  -34.32   0.07  -0.07     0.33     0.40     0.26      0.05
  600   33.49  -34.12   0.06  -0.06     0.33     0.39     0.27      0.05
 6000   33.38  -33.48   0.01  -0.01     0.33     0.34     0.33      0.05

With a risk-free asset (N = 5)
    T       η       η̃    η/T    η̃/T  Q̄(X,w*)  Q̄(X,ŵ)  Q̄(Y,ŵ)  Q̄(Y,ŵ^e)
   60   45.29  -56.71   0.75  -0.95     0.42     1.17    -0.53      0.17
  120   42.53  -47.32   0.35  -0.39     0.41     0.77     0.02      0.25
  180   41.49  -44.53   0.23  -0.25     0.41     0.64     0.17      0.27
  240   41.18  -43.39   0.17  -0.18     0.41     0.58     0.23      0.28
  360   40.62  -41.86   0.11  -0.12     0.41     0.52     0.29      0.29
  480   40.56  -41.60   0.08  -0.09     0.41     0.49     0.32      0.30
  600   40.31  -41.25   0.07  -0.07     0.41     0.48     0.34      0.30
 6000   40.08  -40.28   0.01  -0.01     0.41     0.42     0.40      0.31
Portfolio Choice with N=20

Without a risk-free asset (N = 20)
    T        η        η̃    η/T     η̃/T  Q̄(X,w*)  Q̄(X,ŵ)  Q̄(Y,ŵ)  Q̄(Y,w^e)
   60   234.42  -540.35   3.91   -9.01     0.86     4.76    -8.15      0.43
  120   186.06  -268.06   1.55   -2.23     0.85     2.40    -1.38      0.42
  180   174.23  -220.70   0.97   -1.23     0.85     1.82    -0.38      0.42
  240   169.06  -201.31   0.70   -0.84     0.85     1.55     0.01      0.42
  360   164.08  -184.04   0.46   -0.51     0.85     1.30     0.34      0.42
  480   161.89  -176.24   0.34   -0.37     0.85     1.19     0.48      0.42
  600   160.30  -171.30   0.27   -0.29     0.85     1.11     0.56      0.42
 6000   155.23  -156.37   0.03   -0.03     0.85     0.87     0.82      0.42

With a risk-free asset (N = 20)
    T        η        η̃    η/T     η̃/T  Q̄(X,w*)  Q̄(X,ŵ)  Q̄(Y,ŵ)  Q̄(Y,ŵ^e)
   60   266.52  -667.30   4.44  -11.12     0.89     5.33   -10.23      0.30
  120   206.31  -309.14   1.72   -2.58     0.88     2.60    -1.69      0.37
  180   191.91  -249.55   1.07   -1.39     0.88     1.94    -0.51      0.40
  240   185.53  -225.31   0.77   -0.94     0.88     1.65    -0.06      0.41
  360   179.54  -203.95   0.50   -0.57     0.88     1.37     0.31      0.42
  480   176.83  -194.38   0.37   -0.40     0.88     1.25     0.47      0.43
  600   174.96  -188.47   0.29   -0.31     0.88     1.17     0.56      0.43
 6000   168.80  -170.36   0.03   -0.03     0.87     0.90     0.85      0.44
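As a reading aid, the reported figures appear to be linked by η/T = Q̄(X,ŵ) − Q̄(X,w*) and, taking Q̄(Y,w*) ≈ Q̄(X,w*), by η̃/T ≈ Q̄(Y,ŵ) − Q̄(X,w*). This link is inferred from the numbers rather than stated on the slide, and the reading of "w" as the population-optimal weights w* is an assumption. For N = 20 without a risk-free asset at T = 60, the check works out as follows (agreement is up to rounding):

```latex
\begin{align*}
\eta/T &= 234.42/60 \approx 3.91,
  & \bar{Q}(X,\hat{w}) - \bar{Q}(X,w^{*}) &= 4.76 - 0.86 = 3.90, \\
\tilde{\eta}/T &= -540.35/60 \approx -9.01,
  & \bar{Q}(Y,\hat{w}) - \bar{Q}(X,w^{*}) &= -8.15 - 0.86 = -9.01.
\end{align*}
```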
Shrink towards 1/N
[Figure: panels for N = 5 and N = 20, at T = 60 and T = 1200.]
Shrink towards 1/N
[Figure: panels for N = 20 and N = 5.]
We can try too hard
Extensive search over models with limited data is dangerous.
Too little data?
Too many econometricians?
An argument for letting empirical analysis be guided by economic theory (cf. the design of medical studies).
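The danger of an extensive search can be illustrated with a small simulation. The sketch below assumes M candidate models, none of which truly improves on the benchmark, with in-sample gains η_j = Z1_j² and out-of-sample gains η̃_j = 2·Z1_j·Z2_j − Z1_j² for independent standard normals Z1_j, Z2_j (so that η̃_j given η_j is N(−η_j, 4η_j)). The setup, the function name `selected_fit`, and the choice of M are illustrative assumptions, not the talk's own design.

```python
import numpy as np

def selected_fit(num_models, reps=20000, seed=0):
    """Average fit of the model chosen by best in-sample gain (assumed setup)."""
    rng = np.random.default_rng(seed)
    z1 = rng.standard_normal((reps, num_models))  # in-sample estimation noise
    z2 = rng.standard_normal((reps, num_models))  # out-of-sample noise
    eta = z1 ** 2                                 # in-sample gain of each model
    eta_tilde = 2.0 * z1 * z2 - z1 ** 2           # out-of-sample gain of each model
    best = eta.argmax(axis=1)                     # model with the best in-sample fit
    rows = np.arange(reps)
    return (
        eta[rows, best].mean(),        # in-sample gain of the selected model
        eta_tilde[rows, best].mean(),  # its out-of-sample gain
        eta.mean(),                    # average in-sample gain across all models
    )

results = {m: selected_fit(m) for m in (1, 10, 100)}
for m, (in_fit, out_fit, avg_fit) in results.items():
    print(f"M={m:3d}  selected eta={in_fit:6.2f}  "
          f"selected eta_tilde={out_fit:6.2f}  average eta={avg_fit:5.2f}")
```

Under these assumptions the in-sample gain of the selected model grows with M while its out-of-sample gain deteriorates by about the same amount, and the average gain across models stays near one.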
Discussion
Joint distribution of $(\eta, \tilde{\eta})$
  Conditional density (mixture of normals): $\tilde{\eta} \mid \eta \sim N(-\eta, 4\eta)$.
Winner's Curse
  AIC/BIC in trouble
  A parsimonious model... is not parsimonious if selected from a large pool of models.
Resolutions... avoid the extreme: $\eta = \max_{\theta} Q(X, \theta) - Q(X, \theta^{\star})$.
  Shrinkage and Model Averaging (including Bayesian): $M^{-1} \sum_{j=1}^{M} \eta_j < \max_j \eta_j$.
  Restricted model: e.g. $\theta = \theta(\psi)$, $\eta = \max_{\psi} Q(X, \theta(\psi)) - Q(X, \theta(\psi^{\star}))$.
  One parsimonious model
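The conditional normal law on this slide, read here as η̃ | η ~ N(−η, 4η), can be checked numerically in a setting where it holds exactly. The sketch below assumes a Gaussian location model X_i ~ N(θ*, I_k) with known unit covariance, and defines η = 2[ℓ(X, θ̂) − ℓ(X, θ*)] and η̃ = 2[ℓ(Y, θ̂) − ℓ(Y, θ*)]. The model, the factor 2, the sample sizes, and the function names are all illustrative assumptions rather than the talk's own example.

```python
import numpy as np

def simulate_eta_pairs(n=20, k=3, reps=20000, seed=0):
    """Draw (eta, eta_tilde) pairs in an assumed Gaussian location model."""
    rng = np.random.default_rng(seed)
    theta_star = np.zeros(k)                  # hypothetical true parameter
    x = rng.standard_normal((reps, n, k)) + theta_star  # in-sample data X
    y = rng.standard_normal((reps, n, k)) + theta_star  # independent sample Y
    theta_hat = x.mean(axis=1)                # MLE computed from X only

    def two_loglik_gain(data):
        # 2 * [l(data, theta_hat) - l(data, theta_star)] under unit covariance
        ss_star = ((data - theta_star) ** 2).sum(axis=(1, 2))
        ss_hat = ((data - theta_hat[:, None, :]) ** 2).sum(axis=(1, 2))
        return ss_star - ss_hat

    return two_loglik_gain(x), two_loglik_gain(y)

eta, eta_tilde = simulate_eta_pairs()

# If eta_tilde | eta ~ N(-eta, 4 * eta), this standardized quantity is N(0, 1).
z = (eta_tilde + eta) / (2.0 * np.sqrt(eta))

print("mean of eta:", eta.mean())              # close to k in this setup
print("mean of eta_tilde:", eta_tilde.mean())  # close to -k in this setup
print("mean and std of z:", z.mean(), z.std()) # close to 0 and 1
```

In this model η = n‖X̄ − θ*‖² and η̃ = n[2(X̄ − θ*)′(Ȳ − θ*) − ‖X̄ − θ*‖²], so the standardized quantity `z` is exactly standard normal; any visible deviation would be simulation noise.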