Non-Parametric Econometrics Emmanuel Flachaire
Chapter 2
Kernel Regression
Introduction
- Two-variable regression: one dependent variable y and one regressor x
- The relationship between y and x is not specified a priori
- The regression model is defined as: $y = m(x) + \varepsilon$   (1)
- The unknown function m is estimated nonparametrically
- Nonparametric regression: no parameters
- Nonparametric vs. parametric: a (classical) trade-off between robustness and efficiency
[Figure: Determinants of wages. log(wage) (vertical axis, 8.5 to 10.0) plotted against experience (horizontal axis, 0 to 50).]
[Figure: Determinants of wages (heteroskedasticity). Squared residuals (vertical axis, 0.0 to 1.0) plotted against experience (horizontal axis, 0 to 50).]
Nadaraya-Watson estimator
From the discrete to the continuous case
- The regression $y = m(x) + \varepsilon$ can be rewritten as $E(y \mid x) = m(x)$
- If x is a dummy, we compute the mean of y for the 1st group (x = 0) and for the 2nd group (x = 1), e.g. the mean wage of men and of women
- If x is continuous, we cannot estimate $E(y \mid x)$ in this way
- Naïve estimator: we compute the mean of y over the observations whose regressor value is "close" to x:
  \[ \hat m(x) = \frac{\sum_{i=1}^n I\left(-\frac{1}{2} < \frac{x - x_i}{h} < \frac{1}{2}\right) y_i}{\sum_{i=1}^n I\left(-\frac{1}{2} < \frac{x - x_i}{h} < \frac{1}{2}\right)} \]
- The estimator $\hat m$ is not smooth
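The naïve estimator can be sketched in a few lines of Python with NumPy. The function name `naive_estimator` and the toy data are hypothetical, chosen only to show how the fitted value jumps when an observation enters or leaves the window.

```python
import numpy as np

def naive_estimator(x0, x, y, h):
    """Mean of y over the observations with -1/2 < (x0 - x_i)/h < 1/2."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    inside = np.abs((x0 - x) / h) < 0.5      # indicator function of the window
    if not inside.any():
        return np.nan                         # no observation close to x0
    return y[inside].mean()

# Hypothetical toy data
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.0, 2.0, 5.0, 4.0])

# With h = 3 the window around x0 has half-width 1.5
print(naive_estimator(2.4, x, y, h=3.0))      # window covers x = 1, 2, 3
print(naive_estimator(2.6, x, y, h=3.0))      # window covers x = 2, 3, 4
```

The two evaluation points are close, yet the fitted value changes abruptly because one observation leaves the window and another enters it: this is the lack of smoothness noted above.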
Nadaraya-Watson estimator
From the discrete to the continuous case
- To obtain a smooth estimator, we replace the indicator function by a kernel function (Gaussian, Epanechnikov)
- Nadaraya-Watson estimator:
  \[ \hat m(x) = \frac{\sum_{i=1}^n K\left(\frac{x - x_i}{h}\right) y_i}{\sum_{i=1}^n K\left(\frac{x - x_i}{h}\right)} \]
- It can be viewed as a weighted mean of y, with weights depending on the distance of $x_i$ to x:
  \[ \hat m(x) = \sum_{i=1}^n w_i(x)\, y_i \quad \text{with} \quad w_i(x) = \frac{K\left(\frac{x - x_i}{h}\right)}{\sum_{j=1}^n K\left(\frac{x - x_j}{h}\right)} \]
- This estimator is very sensitive to the choice of h, but not to the choice of K
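A minimal sketch of the Nadaraya-Watson estimator as a kernel-weighted mean, assuming a Gaussian kernel. The function names, the simulated data and the bandwidth h = 3 are illustrative assumptions, not values taken from the wage application.

```python
import numpy as np

def gaussian_kernel(u):
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def nadaraya_watson(x0, x, y, h, kernel=gaussian_kernel):
    """Kernel-weighted mean of y at each evaluation point in x0."""
    x0 = np.atleast_1d(np.asarray(x0, dtype=float))
    x = np.asarray(x, dtype=float)
    k = kernel((x0[:, None] - x[None, :]) / h)    # K((x0 - x_i)/h), shape (len(x0), n)
    w = k / k.sum(axis=1, keepdims=True)          # weights w_i(x0); each row sums to 1
    return w @ np.asarray(y, dtype=float)

# Simulated data (hypothetical), loosely shaped like a wage-experience profile
rng = np.random.default_rng(0)
x = rng.uniform(0, 50, size=200)
y = 9.0 + 0.05 * x - 0.001 * x**2 + rng.normal(scale=0.2, size=200)

grid = np.linspace(5, 45, 9)
m_hat = nadaraya_watson(grid, x, y, h=3.0)
```

Because the weights are positive and sum to one, each fitted value lies between the smallest and the largest observed y. Passing a different `kernel` function changes the fit much less than changing `h` does.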
Nadaraya-Watson estimator
Formal derivation
- A conditional expectation is defined as
  \[ E(y \mid x) = \int_{-\infty}^{+\infty} y\, f(y \mid x)\, dy \]
- It can be rewritten in terms of unconditional densities:
  \[ E(y \mid x) = \int_{-\infty}^{+\infty} \frac{y\, f(x, y)}{f(x)}\, dy = \frac{\int_{-\infty}^{+\infty} y\, f(x, y)\, dy}{\int_{-\infty}^{+\infty} f(x, y)\, dy} \]
- An estimator is obtained with a kernel density estimate $\hat f(x, y)$
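Nadaraya-Watson estimator
Formal derivation
We obtain the Nadaraya-Watson estimator with
  \[ \hat f(x, y) = \frac{1}{nh^2} \sum_{i=1}^n K\left(\frac{x - x_i}{h}\right) K\left(\frac{y - y_i}{h}\right) \]
The denominator is:
  \[ \int_{-\infty}^{+\infty} \hat f(x, y)\, dy = \frac{1}{nh} \sum_{i=1}^n K\left(\frac{x - x_i}{h}\right) \underbrace{\frac{1}{h} \int_{-\infty}^{+\infty} K\left(\frac{y - y_i}{h}\right) dy}_{=1} \]
The numerator is:
  \[ \int_{-\infty}^{+\infty} y\, \hat f(x, y)\, dy = \frac{1}{nh} \sum_{i=1}^n K\left(\frac{x - x_i}{h}\right) \underbrace{\frac{1}{h} \int_{-\infty}^{+\infty} y\, K\left(\frac{y - y_i}{h}\right) dy}_{=y_i} \]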
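The last step of the derivation can be written out as follows. It assumes that K integrates to one and has mean zero (true of any symmetric kernel such as the Gaussian or Epanechnikov) and uses the change of variable $u = (y - y_i)/h$.

```latex
\[
\int_{-\infty}^{+\infty} K\!\left(\frac{y - y_i}{h}\right) dy
  = h \int_{-\infty}^{+\infty} K(u)\, du = h,
\qquad
\int_{-\infty}^{+\infty} y\, K\!\left(\frac{y - y_i}{h}\right) dy
  = h \int_{-\infty}^{+\infty} (y_i + h u)\, K(u)\, du = h\, y_i
\]
\[
\hat m(x)
  = \frac{\int y\, \hat f(x, y)\, dy}{\int \hat f(x, y)\, dy}
  = \frac{\frac{1}{nh} \sum_{i=1}^n K\!\left(\frac{x - x_i}{h}\right) y_i}
         {\frac{1}{nh} \sum_{i=1}^n K\!\left(\frac{x - x_i}{h}\right)}
  = \frac{\sum_{i=1}^n K\!\left(\frac{x - x_i}{h}\right) y_i}
         {\sum_{i=1}^n K\!\left(\frac{x - x_i}{h}\right)}
\]
```

The factor 1/(nh) cancels in the ratio, and the denominator before cancellation is the usual kernel density estimate of f(x).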
Nadaraya-Watson estimator
Comments
The denominator of the estimator is an estimate of the density f(x):
- The Nadaraya-Watson estimator cannot be computed at a point x where this estimate equals 0 (the denominator is then 0)
- The estimator is very imprecise at x if f(x) ≈ 0
- The estimator cannot be used to forecast outside the sample, unlike parametric estimation
Local polynomial
Bias 1: non-equidistant observations
[Figure: the curve m(x), with x1, x2, xA, xB, x3 marked on the horizontal axis and y1, y2, yA, yB, y3 on the vertical axis.]
Local polynomial
Order 1: locally linear
- Source of the bias: non-equidistant observations
- This bias can be substantial at the boundaries of the sample
- Solution: replace the weighted mean by a fitted value obtained from a locally linear regression
- $\hat m(x) = \hat\alpha$, obtained from the regression model
  \[ y_i = \alpha + \beta (x_i - x) + \varepsilon_i , \]
  estimated by Weighted Least Squares, with weights $K\left(\frac{x_i - x}{h}\right)$
- Equivalently, $\hat m(x) = \hat\alpha$ is the OLS estimate in the regression model
  \[ \sqrt{K\left(\tfrac{x_i - x}{h}\right)}\; y_i = \alpha \sqrt{K\left(\tfrac{x_i - x}{h}\right)} + \beta \sqrt{K\left(\tfrac{x_i - x}{h}\right)}\, (x_i - x) + \varepsilon_i \]
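The local linear fit can be sketched as an ordinary least-squares problem on the square-root-weighted data. The function name `local_linear`, the Gaussian kernel and the toy data are assumptions made for illustration. The data are exactly linear and unevenly spaced, so any gap between a fit and the true line is bias.

```python
import numpy as np

def local_linear(x0, x, y, h):
    """Local linear fit at x0; returns (alpha_hat, beta_hat)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    k = np.exp(-0.5 * ((x - x0) / h) ** 2)       # Gaussian weights (constant factor dropped)
    sw = np.sqrt(k)
    X = np.column_stack([np.ones_like(x), x - x0])
    # OLS on the sqrt(K)-transformed model is WLS with weights K
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef[0], coef[1]

# Hypothetical data: m(x) = 2 + 0.5 x, evaluated at the left boundary x0 = 0
x = np.array([0.0, 0.1, 0.2, 0.5, 3.0, 3.5, 7.0, 10.0])
y = 2.0 + 0.5 * x
alpha, beta = local_linear(0.0, x, y, h=2.0)

# Order-0 (weighted mean) fit at the same point, for comparison
k0 = np.exp(-0.5 * ((x - 0.0) / 2.0) ** 2)
nw_at_0 = np.sum(k0 * y) / np.sum(k0)
```

Here `alpha` recovers m(0) = 2 and `beta` the slope 0.5, while the weighted mean `nw_at_0` is pulled above 2 because all neighbouring observations lie to the right of the boundary.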
Local polynomial
Bias 2: concavity/convexity of the curve
[Figure: the curve m(x), with x1, x2, x3 marked on the horizontal axis and y1, y2, y3 on the vertical axis.]
Local polynomial
Order 2: locally quadratic
- Source of the bias: concavity/convexity of the curve
- Solution: fitted value obtained from a locally quadratic regression
- $\hat m(x) = \hat\alpha$, obtained from the regression model
  \[ y_i = \alpha + \beta (x_i - x) + \gamma (x_i - x)^2 + \varepsilon_i , \]
  estimated by Weighted Least Squares, with weights $K\left(\frac{x_i - x}{h}\right)$
- Equivalently, $\hat m(x) = \hat\alpha$ is the OLS estimate in the regression model
  \[ \sqrt{K(\cdot)}\; y_i = \alpha \sqrt{K(\cdot)} + \beta \sqrt{K(\cdot)}\, (x_i - x) + \gamma \sqrt{K(\cdot)}\, (x_i - x)^2 + \varepsilon_i \]
- Remark: Nadaraya-Watson ≡ local polynomial of order 0
Local polynomial
Application
[Figure: Determinants of wages. Local polynomial fits of order 0 (Nadaraya-Watson), order 1 (local linear) and order 2 (local quadratic); vertical axis 8.5 to 10.0, horizontal axis 0 to 50.]
Local polynomial
Derivatives
Usefulness of derivatives:
- 1st derivative, $m'(x)$ → effect of a change in x on y
- 2nd derivative, $m''(x)$ → curvature of the function
Derivative estimates can be computed from their definitions:
  \[ \hat m'(x) = \frac{1}{2h} \left[ \hat m(x + h) - \hat m(x - h) \right] \]
  \[ \hat m''(x) = \frac{1}{h^2} \left[ \hat m(x + h) - 2\, \hat m(x) + \hat m(x - h) \right] \]
with h → 0 as the sample size increases, n → ∞
Local polynomial
Derivatives
Derivatives are given directly by the local polynomial estimation.
Local polynomial estimation of order p:
  \[ \min_{\alpha, \beta, \gamma, \dots, \delta} \; \sum_{i=1}^n K\left(\frac{x_i - x}{h}\right) \left[ y_i - \alpha - \beta (x_i - x) - \gamma (x_i - x)^2 - \dots - \delta (x_i - x)^p \right]^2 \]
This criterion is obtained by using, in $\sum_i K(\cdot)\, [y_i - m(x_i)]^2$, a Taylor approximation of $m(x_i)$:
  \[ m(x_i) \approx m(x) + m'(x)(x_i - x) + \frac{m''(x)}{2} (x_i - x)^2 + \dots + \frac{m^{(p)}(x)}{p!} (x_i - x)^p \]
Thus, we have: $m(x) = \alpha$, $m'(x) = \beta$, $m''(x) = 2\gamma$, ..., and $m^{(p)}(x) = p!\, \delta$
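A sketch of how the derivative estimates can be read off the local polynomial coefficients, assuming a Gaussian kernel. The function name `local_polynomial` and the noise-free quadratic test data are hypothetical. The j-th coefficient is multiplied by j! to give the estimate of the j-th derivative.

```python
import numpy as np
from math import factorial

def local_polynomial(x0, x, y, h, p=2):
    """Estimates of m(x0), m'(x0), ..., m^(p)(x0) from a local fit of order p."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    sw = np.sqrt(np.exp(-0.5 * ((x - x0) / h) ** 2))    # sqrt of Gaussian weights
    X = np.vander(x - x0, N=p + 1, increasing=True)      # columns 1, (x_i - x0), ..., (x_i - x0)^p
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return np.array([factorial(j) * coef[j] for j in range(p + 1)])

# Hypothetical check on m(x) = 1 + 2x + 3x^2, so m'(x) = 2 + 6x and m''(x) = 6
x = np.linspace(-2.0, 2.0, 41)
y = 1.0 + 2.0 * x + 3.0 * x**2
m, dm, d2m = local_polynomial(0.5, x, y, h=0.5, p=2)

# Order 0 reduces to the kernel-weighted mean (Nadaraya-Watson)
m0 = local_polynomial(0.5, x, y, h=0.5, p=0)[0]
```

With a quadratic regression function and p = 2 the Taylor expansion is exact, so `m`, `dm` and `d2m` match m(0.5), m'(0.5) and m''(0.5) up to rounding. The order-0 value `m0` does not, which is the curvature bias discussed earlier.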
Sensitivity to the bandwidth
Heteroskedasticity
[Figure: fits obtained with the default bandwidth, default*2 and default/2; vertical axis 0.0 to 1.0, horizontal axis 0 to 50.]
How to select the bandwidth?
Rule of thumb
- Choose a criterion, such as the mean integrated squared error:
  \[ \mathrm{MISE}(h) = E \int \left[ \hat m(x) - m(x) \right]^2 dx \]
- Minimizing an approximation of the MISE gives: $h_{opt} = c\, n^{-1/5}$
- The constant c depends on the unknown functions f and m:
  - We can replace f by a "reference" density function, as in kernel density estimation
  - We cannot replace m by a "reference" regression function: which one should we use?
- Consequence: there is no rule of thumb to select h!
How to select the bandwidth?
Cross validation
- Cross-validation amounts to minimizing the following criterion:
  \[ \mathrm{CV}(h) = \frac{1}{n} \sum_{i=1}^n \left[ y_i - \hat m_{-i}(x_i) \right]^2 \]
  where $\hat m_{-i}(\cdot)$ is estimated from the sample from which observation i has been removed.
- It is quite similar to the OLS criterion (Min SSR)
- If observation i is not removed from the sample, the criterion is minimized at $h_{opt} = 0$
- It works quite well, and the result can often be checked with a plot of $\hat m$
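The criterion can be sketched for the Nadaraya-Watson estimator by zeroing the diagonal of the kernel matrix, which removes observation i from its own prediction. The function name `cv_criterion`, the Gaussian kernel, the simulated data and the bandwidth grid are all assumptions made for the example.

```python
import numpy as np

def cv_criterion(h, x, y, leave_out=True):
    """CV(h) for the Nadaraya-Watson estimator with a Gaussian kernel."""
    k = np.exp(-0.5 * ((x[:, None] - x[None, :]) / h) ** 2)
    if leave_out:
        np.fill_diagonal(k, 0.0)             # observation i gets zero weight in m_{-i}(x_i)
    m_hat = (k @ y) / k.sum(axis=1)
    return np.mean((y - m_hat) ** 2)

# Simulated data (hypothetical)
rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0, 1, 150))
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=150)

# Grid search over candidate bandwidths
grid = np.linspace(0.02, 0.5, 49)
cv = np.array([cv_criterion(h, x, y) for h in grid])
h_cv = grid[cv.argmin()]
```

Calling `cv_criterion(h, x, y, leave_out=False)` with a very small h drives the criterion towards zero, since each fitted value then nearly reproduces its own y_i. This is why the observation has to be left out.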
Confidence intervals
Asymptotic method
- Under regularity conditions, the Nadaraya-Watson estimator is asymptotically Normal:
  \[ \sqrt{hn}\, \left[ \hat m(x) - m(x) \right] \to N\left( b\, ,\; \frac{v \sigma^2}{f(x)} \right) , \]
  where $\sigma^2$ is the variance of the error term $\varepsilon$ and $v = \int K^2(u)\, du$.
- b is the asymptotic bias of $\hat m(x)$, depending on the unknown quantities $f(x)$, $f'(x)$, $m'(x)$ and $m''(x)$.
- If the bias is negligible (h sufficiently small), we have
  \[ \hat m(x) \pm 1.96 \sqrt{\frac{v\, \hat\sigma^2}{n h \hat f(x)}} \]
- A value smaller than $h_{opt}$ is preferred → undersmooth
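A sketch of the pointwise asymptotic interval, ignoring the bias term b. It assumes a Gaussian kernel, for which $v = 1/(2\sqrt{\pi})$, and homoskedastic errors, with $\hat\sigma^2$ estimated from the in-sample residuals. The function name, the simulated data and the bandwidth are hypothetical.

```python
import numpy as np

def gauss(u):
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def nw_asymptotic_ci(x0, x, y, h, z=1.96):
    """Returns (lower, m_hat, upper) at x0 for the Nadaraya-Watson estimator."""
    n = len(x)
    k0 = gauss((x0 - x) / h)
    m0 = np.sum(k0 * y) / np.sum(k0)               # Nadaraya-Watson estimate at x0
    f0 = np.sum(k0) / (n * h)                      # kernel density estimate of f(x0)
    kk = gauss((x[:, None] - x[None, :]) / h)
    resid = y - (kk @ y) / kk.sum(axis=1)          # in-sample residuals
    sigma2 = np.mean(resid**2)                     # assumes a constant error variance
    v = 1.0 / (2.0 * np.sqrt(np.pi))               # integral of K(u)^2 for the Gaussian kernel
    half = z * np.sqrt(v * sigma2 / (n * h * f0))
    return m0 - half, m0, m0 + half

# Simulated data (hypothetical)
rng = np.random.default_rng(2)
x = rng.uniform(0, 1, 300)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=300)
lo, m0, hi = nw_asymptotic_ci(0.5, x, y, h=0.05)
```

The interval widens where the estimated density `f0` is small, which matches the earlier comment that the estimator is imprecise where f(x) ≈ 0. Under heteroskedasticity a local variance estimate would be needed instead of the single `sigma2`.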
Confidence intervals
Bootstrap methods
- In general, the finite-sample distribution of a statistic τ is unknown in practice → we use an approximation
- Asymptotic method: distribution of τ as n → ∞
- Bootstrap method: distribution of τ under a predefined data-generating process (DGP)
- If the "bootstrap" DGP is close to the true DGP, the approximation given by the bootstrap should be good
Bootstrap principle:
  - an initial estimation is used to calculate the residuals
  - these residuals are used to generate new samples (resampling)
  - the regression is re-estimated on each new sample
- The set of new estimates allows us to measure the variation in the estimated values
Confidence intervals
Bootstrap methods
A bootstrap CI is obtained as follows:
- Select an optimal bandwidth $h_{opt}$
- Re-estimate the regression with $h^+ = 1.1\, h_{opt}$ (oversmooth) to obtain $\hat m_{h^+}$
- Re-estimate the regression with $h^- = 0.9\, h_{opt}$ (undersmooth) to obtain the residuals $\hat\varepsilon_i$
- Bootstrap DGP: $y_i^* = \hat m_{h^+}(x_i) + \hat\varepsilon_i\, \eta_i^*$
- Estimate the regression over the bootstrap sample $(y^*, x)$ with the optimal bandwidth $h_{opt}$ → $\hat m^*(x)$
- Repeat the 2 preceding steps many times. The 95% CI is given by the 0.025 and 0.975 quantiles of the EDF (empirical distribution function) of $\hat m^*(x)$: percentile approach.
Confidence intervals
Bootstrap methods
- Bootstrap DGP: $y_i^* = \hat m_{h^+}(x_i) + \hat\varepsilon_i\, \eta_i^*$
- Percentile method:
  - from B bootstrap samples, compute $\hat m_b^*(x_i)$ (b = 1, ..., B)
  - the EDF of $\hat m_b^*(x_i)$ is the bootstrap distribution of $\hat m(x_i)$
  - use EDF quantiles to define a confidence interval: $[q_{.025}\, ;\, q_{.975}]$ for a 95% interval
- Percentile-t method:
  - from B bootstrap samples, compute $\tau_b^* = [\hat m_b^*(x_i) - \hat m(x_i)] / \hat s(x_i)$
  - the EDF of $\tau_b^*$ is the bootstrap distribution of $\hat\tau$
  - use EDF quantiles to define a confidence interval: $[\hat m(x) - q_{.975}\, \hat s(x)\, ;\, \hat m(x) - q_{.025}\, \hat s(x)]$
- Key difference: τ is asymptotically pivotal
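The percentile interval can be sketched as below. The slides do not specify the distribution of $\eta_i^*$, so the code assumes Rademacher draws (±1 with equal probability), a common wild-bootstrap choice. The function names, the simulated data and the value h_opt = 0.08 are hypothetical. In practice h_opt would come from cross-validation.

```python
import numpy as np

def nw(x0, x, y, h):
    """Nadaraya-Watson estimate (Gaussian kernel) at the points x0."""
    k = np.exp(-0.5 * ((np.atleast_1d(x0)[:, None] - x[None, :]) / h) ** 2)
    return (k @ y) / k.sum(axis=1)

def bootstrap_percentile_ci(x0, x, y, h_opt, B=499, level=0.95, seed=0):
    rng = np.random.default_rng(seed)
    m_plus = nw(x, x, y, 1.1 * h_opt)                  # oversmoothed fit used in the bootstrap DGP
    resid = y - nw(x, x, y, 0.9 * h_opt)               # residuals from the undersmoothed fit
    draws = np.empty(B)
    for b in range(B):
        eta = rng.choice([-1.0, 1.0], size=len(x))     # Rademacher draws: an assumed form of eta*
        y_star = m_plus + resid * eta                  # bootstrap sample
        draws[b] = nw(x0, x, y_star, h_opt)[0]         # re-estimate with h_opt
    a = (1.0 - level) / 2.0
    return np.quantile(draws, [a, 1.0 - a])            # 0.025 and 0.975 quantiles for level 0.95

# Simulated data (hypothetical)
rng = np.random.default_rng(3)
x = np.sort(rng.uniform(0, 1, 150))
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=150)
ci = bootstrap_percentile_ci(0.25, x, y, h_opt=0.08, B=199)
```

A percentile-t version would also need a standard error $\hat s(x)$ for each estimate, which this sketch leaves out.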
Parametric vs. nonparametric model
A simple test
- Let us assume that the true regression function is:
  $y = \beta_0 + \beta_1 x + \beta_3 x^3 + \varepsilon$   (2)
- A parametric test of the hypotheses
  $H_0: y = \beta_0 + \beta_1 x + \varepsilon$   vs.   $H_1: y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon$
  may fail to reject the null: when x is symmetric around zero, for instance, the quadratic term picks up none of the cubic component
- By contrast, a test based on the alternative $H_1: y = m(x) + \varepsilon$ would likely reject the null
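A small numerical illustration of why the quadratic alternative can miss a cubic departure. It assumes a design that is symmetric around zero and uses noise-free data with hypothetical coefficient values, so that the outcome does not depend on a random draw.

```python
import numpy as np

# Symmetric design (assumption) and a noise-free cubic regression function
x = np.linspace(-2.0, 2.0, 101)
y = 1.0 + 0.5 * x + 0.3 * x**3

# Fit the model under H1: regressors 1, x, x^2
X = np.column_stack([np.ones_like(x), x, x**2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

print(beta[2])                  # coefficient on x^2
print(np.max(np.abs(resid)))    # largest misfit left by the quadratic model
```

The coefficient on x² is zero up to rounding, because x³ is an odd function and is orthogonal to the even regressors under a symmetric design. A t-test on β2 therefore has nothing to detect, although the residuals show that the model is clearly misspecified. With an asymmetric design the quadratic term would capture part of the cubic component.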
Parametric vs. nonparametric model
A simple test
- Is a parametric model adequate? A measure of distance is:
  \[ \int \left[ \hat m(x) - m(x, \hat\beta) \right]^2 dx \]
- Härdle and Mammen note that the nonparametric estimator is biased and propose to replace $m(x, \hat\beta)$ by its kernel-smoothed version:
  \[ \hat m(x, \hat\beta) = \frac{\sum_{i} K\left(\frac{x - x_i}{h}\right) m(x_i, \hat\beta)}{\sum_{i} K\left(\frac{x - x_i}{h}\right)} \]
- A simple test statistic is thus
  \[ T_n = \sqrt{h}\, \sum_{i=1}^n \left[ \hat m(x_i) - \hat m(x_i, \hat\beta) \right]^2 \psi(x_i) \]
Parametric vs. nonparametric model
A simple bootstrap test
A bootstrap P-value is obtained as follows:
- estimate $\hat m(x)$ and $m(x, \hat\beta)$, and calculate $T_n$
- save the residuals from the parametric model: $\hat\varepsilon_i = y_i - m(x_i, \hat\beta)$
- generate a bootstrap sample from: $y_i^* = m(x_i, \hat\beta) + \varepsilon_i^*$, with $\varepsilon_i^*$ generated from the saved residuals
- calculate a bootstrap test statistic $T_n^*$ from the new sample
- repeat the 2 preceding steps many times
The P-value is the proportion of $T_n^*$ greater than $T_n$
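The whole procedure can be sketched for a linear null model. Several choices below are assumptions rather than part of the slides: the weight function ψ is set to 1, the kernel is Gaussian, the bootstrap errors are the saved residuals multiplied by Rademacher draws, and the bandwidth h = 0.3 and the simulated cubic data are hypothetical.

```python
import numpy as np

def smooth(x, v, h):
    """Nadaraya-Watson smoother of the values v, evaluated at the sample points."""
    k = np.exp(-0.5 * ((x[:, None] - x[None, :]) / h) ** 2)
    return (k @ v) / k.sum(axis=1)

def hm_statistic(x, y, h):
    """T_n for the null y = b0 + b1 x + e, with psi = 1 (assumption)."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fit = X @ beta                                       # m(x_i, beta_hat)
    t = np.sqrt(h) * np.sum((smooth(x, y, h) - smooth(x, fit, h)) ** 2)
    return t, fit, y - fit

def hm_bootstrap_pvalue(x, y, h, B=199, seed=0):
    rng = np.random.default_rng(seed)
    t_obs, fit, resid = hm_statistic(x, y, h)
    t_star = np.empty(B)
    for b in range(B):
        eta = rng.choice([-1.0, 1.0], size=len(x))       # Rademacher draws (assumption)
        t_star[b], _, _ = hm_statistic(x, fit + resid * eta, h)
    return np.mean(t_star > t_obs)                       # share of T_n* above T_n

# Simulated data (hypothetical) from a cubic regression function
rng = np.random.default_rng(4)
x = rng.uniform(-2, 2, 200)
y = 1.0 + 0.5 * x + 0.3 * x**3 + rng.normal(scale=0.3, size=200)
p_cubic = hm_bootstrap_pvalue(x, y, h=0.3)
```

Each bootstrap sample is generated under the null, so the statistics `t_star` approximate the distribution of T_n when the linear model is correct. A small `p_cubic` indicates that the observed distance between the two fits is too large to be explained by sampling noise.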