An ESP Decision-Theoretic Approach (Companion paper)

Comments welcome.

Benjamin Holcblat∗ Carnegie Mellon University Tepper School of Business November 6, 2011

Abstract

A decision-theoretic approach to estimation provides flexibility and finite-sample optimality. However, a decision-theoretic approach is generally impossible or delicate within the existing classical estimation theory. In this paper, we present a general classical decision-theoretic approach within the ESP estimation framework developed in the core paper “Estimating Consumption-Based Asset Pricing Models. The ESP Approach”. For a large class of loss functions, we provide point estimates, confidence sets and tests. We establish their consistency and their robustness to lack of identification. In addition, we prove that, in contrast to standard classical tests, ESP tests do not yield any asymptotic error.

Helpful comments were provided by Laurence Alès, Dennis Epple, Gautam Iyer, Rick Green, Lars-Alexander Kuehn, Chen Li, Artem Neklyudov, Carlos Ramirez, Bryan Routledge, Max Roy, Stefano Sacchetto, Batchimeg Sambalaibat, Teddy Seidenfeld, Steven Shreve, Fallaw Sowell and seminar participants at Carnegie Mellon University. Any errors are my own.

∗ Contact information. Address: Tepper School of Business, 5000 Forbes Avenue, Pittsburgh, PA 15213. Email: [email protected].

1. Introduction

Holcblat (2011) shows that an ESP intensity is an appropriate finite-sample summary of the uncertainty about the population parameter. Therefore, in this paper, we regard estimation as a choice of parameter values in the spirit of microeconomic theory under uncertainty. In other words, we develop a decision-theoretic approach within the estimation framework offered in the core paper (Holcblat, 2011). By a decision-theoretic approach, we mean an approach in which the econometrician’s estimation decision is an optimal answer, given his goal, to the uncertainty summarized by the ESP intensity.

The econometrician is endowed with a utility function $u : (d_e, \theta) \mapsto u(d_e, \theta)$, where $d_e$ is an estimation decision and $\theta \in \Theta$ is a potential value of the solution to the empirical moment conditions. The estimation decision is typically a parameter value (point estimation) or a set of parameter values (hypothesis testing).¹ The utility function indicates the utility that decision $d_e$ provides to the econometrician when a solution to the empirical moment conditions is $\theta$. Utility functions correspond to the opposite of the loss functions considered in the core paper. The econometrician chooses an estimation decision, $d_e$, that maximizes his ESP expected utility, $\int_\Theta u(d_e, \theta)\tilde{f}_{\theta_T^*,sp}(\theta)d\theta$. ESP expected utility is a generalization of the expected utility defined in microeconomic theory, in the sense that utility functions are integrated w.r.t. an intensity measure that is not necessarily a probability measure.²

Without loss of utility, the econometrician does not randomize his estimation decision (mixed strategy). For the same reason as in Bayesian estimation (e.g. Theorem 3.12 p.147 in Schervish, 1995), randomization cannot improve on an optimal non-randomized estimation decision (pure strategy). A randomized decision is a weighted average of non-randomized decisions, and the average of elements of a set cannot be bigger than the maximum of the set.

A decision-theoretic approach is generally delicate within the standard classical estimation theory. Often, it is not possible, as in standard moment-based estimation, in which case the objective function is not expressed in terms of the dimension of interest, the parameter values. For example, the objective functions of GMM, empirical likelihood (EL) and exponential tilting (ET) are expressed, respectively, in terms of a norm of the empirical moment conditions, the probability weight of the observed sample, and the informational content (or entropy) of the sample. When a decision-theoretic approach is possible, it typically does not produce a complete ranking of estimation decisions. Given two decision rules $d_{e1}(.)$ and $d_{e2}(.)$, the risk functions $\theta \mapsto E_\theta[u(d_{e1}(X), \theta)]$ and $\theta \mapsto E_\theta[u(d_{e2}(X), \theta)]$ typically cross each other (e.g. section 2.D in Gouriéroux and Monfort, 1989). In Bayesian theory, integration of the classical risk functions w.r.t. the posterior makes a decision-theoretic approach possible. The decision-theoretic approach presented in this paper shares common features with the one for Bayesian estimation. However, there are fundamental differences, which are analyzed in section 6 of the core paper.

The paper is organized as follows. Section 2 presents the ESP decision-theoretic approach for point estimation. Section 3 provides the ESP decision-theoretic approach for hypothesis testing. In sections 2 and 3, we assume identification holds. Section 4 presents the counterpart of sections 2 and 3 in the case of multiple solutions to the moment conditions. Appendices A and B recall, respectively, the notations and the assumptions from the core paper. Proofs are in Appendix C.

¹ In this paper, we do not develop tests of goodness-of-fit, since we only consider just-restricted moment conditions (see Assumption 5(c)). Sowell (2007; 2009) proposes tests of goodness-of-fit based on a transformation of over-restricted systems into just-restricted systems based on the GMM objective function. The author has a work in progress where he develops ESP tests of goodness-of-fit independently of GMM.

² Note that this extension is mathematically straightforward. Normalizing the ESP intensity to make it a density, $\tilde{f}_{\theta_T^*,sp}(.) / \int_\Theta \tilde{f}_{\theta_T^*,sp}(\theta)d\theta$, does not affect the definitions below. However, the sets of decision-theoretic axioms used should be modified. This is left for future research.

2. Point estimation

In this section, an estimation decision by the econometrician is an element of the parameter space.

2.1. Continuous utility functions

In this subsection, we consider the case where the utility function of the econometrician is continuous. In this case, we require the following assumptions.

Assumption 1. (a) $u(.,.)$ is continuous; (b) for all $\dot\theta \in \Theta$ and $\theta \in \Theta \setminus \{\dot\theta\}$, $u(\theta, \dot\theta) < u(\dot\theta, \dot\theta)$; (c) for all $(\theta, \dot\theta) \in \Theta^2$, $u(\theta, \theta) = u(\dot\theta, \dot\theta)$; (d) for all $\theta_e, \theta \in \Theta$, $0 \le u(\theta_e, \theta)$.

Assumption 1(a) is standard in decision theory (e.g. Definition 3.C.1 and Proposition 3.C.1 p.46-47 in Mas-Colell, Whinston and Green, 1995). See section 2.2 below for a relevant case where the utility function is not continuous. Since the parameter space $\Theta$ is compact, continuity implies boundedness, and thus rules out St. Petersburg type paradoxes (e.g. p.185 in Mas-Colell, Whinston and Green, 1995). Boundedness also ensures that the ESP estimated expected utility is always well-defined, i.e. $\int_\Theta u(\theta_e, \theta)\tilde{f}_{\theta_T^*,sp}(\theta)d\theta < \infty$ for all $\theta_e \in \Theta$. Assumption 1(b) formalizes the econometrician’s preference for accuracy. This means the econometrician is strictly better off when his point estimate equals a solution to the empirical moment conditions than otherwise. Assumption 1(c) means the econometrician’s preference for accuracy is independent of the actual values of the solutions to the empirical moment conditions. Assumption 1(d) is the opposite of the standard convention in decision theory for estimation. Usually the convention is $u(.,.) \le 0$, where $-u(.,.)$ is a loss function (e.g. p.52,60 in Robert, 1994). Our convention has the advantage of inducing the same terminology as in microeconomic theory.

Point estimate

Once we have characterized the utility function, the definition of the corresponding point estimates follows.

Definition 2.1 (ESP point estimate). Given a utility function $u(.,.)$, an ESP point estimate, $\hat\theta_T^u$, is an $\mathcal{E}/\mathcal{B}(\Theta)$-measurable maximizer of the ESP expected utility, i.e.

$$\hat\theta_T^u := \arg\max_{\theta_e \in \Theta} \tilde{\mathbb{E}}[u(\theta_e, \theta_T^*)]$$

where $\tilde{\mathbb{E}}[u(\theta_e, \theta_T^*)] := \int_\Theta u(\theta_e, \theta)\tilde{f}_{\theta_T^*,sp}(\theta)d\theta$.³
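As an illustration only, the maximization in Definition 2.1 can be sketched numerically on a grid. The sketch below is not the estimation procedure of the core paper: `toy_intensity` is a hypothetical stand-in for the ESP intensity on $\Theta = [0, 1]$, `quadratic_utility` is one assumed utility satisfying Assumption 1 on that set, and the integral is replaced by a Riemann sum.

```python
import math


def esp_point_estimate(grid, intensity, utility):
    """Grid approximation of the maximizer of the ESP expected utility."""
    step = grid[1] - grid[0]
    weights = [intensity(theta) for theta in grid]

    def expected_utility(theta_e):
        # Riemann sum for the integral of u(theta_e, theta) * intensity(theta) over Theta
        return sum(utility(theta_e, t) * w for t, w in zip(grid, weights)) * step

    return max(grid, key=expected_utility)


def toy_intensity(theta):
    # Hypothetical stand-in for the ESP intensity on Theta = [0, 1] (not normalized)
    return math.exp(-0.5 * ((theta - 0.6) / 0.1) ** 2)


def quadratic_utility(theta_e, theta):
    # Assumed utility: continuous, nonnegative on [0, 1]^2,
    # and equal to its maximal value 1 only when theta_e == theta
    return 1.0 - (theta_e - theta) ** 2


grid = [i / 200 for i in range(201)]
theta_hat = esp_point_estimate(grid, toy_intensity, quadratic_utility)
print(theta_hat)
```

With this particular utility, the maximizer is the grid point closest to the mean of the normalized toy intensity. Other utilities satisfying Assumption 1 would in general select a different point.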

The following proposition presents finite-sample properties of maximization of ESP expected utility.

Proposition 2.1. Under Assumptions 5-7 and 1,
i) $\theta_e \mapsto \int_\Theta u(\theta_e, \theta)\tilde{f}_{\theta_T^*,sp}(\theta)d\theta$ is continuous over $\Theta$;
ii) there exists an ESP point estimate $\hat\theta_T^u$.

Proof. See Appendix C.1 p.26. □

Proposition 2.1i) means the preference relation generated by maximization of the ESP expected utility is continuous,⁴ i.e. if, for two converging sequences of parameter values $\{\theta_n^{(1)}\}_{n \ge 1}$ and $\{\theta_n^{(2)}\}_{n \ge 1}$, $\theta_n^{(1)}$ is always preferred to $\theta_n^{(2)}$, then the preference cannot be reversed at the limit (e.g. p.46 in Mas-Colell, Whinston and Green, 1995). Proposition 2.1ii) is a consequence of Proposition 2.1i).

The following proposition presents asymptotic properties of maximization of ESP expected utility.

³ Note that this notation corresponds to the usual notation only if there can be only one solution to the empirical moment conditions.

⁴ Continuity of the utility function is different from continuity of the expected utility.

Proposition 2.2. Under Assumptions 5-11 and 1, as $T \to \infty$,
i) $\sup_{\theta_e \in \Theta} \left| \int_\Theta u(\theta_e, \theta)\tilde{f}_{\theta_T^*,sp}(\theta)d\theta - u(\theta_e, \theta_0) \right| \to 0$ P-a.s.;
ii) an ESP point estimate converges P-a.s. to the population parameter, i.e. $\lim_{T \to \infty} \hat\theta_T^u = \theta_0$ P-a.s.

Proof. See Appendix C.2 p.26. □

Proposition 2.2i) means the preference relation corresponding to the ESP expected utility is consistent, i.e. the preference relation corresponding to the ESP expected utility converges to the preference relation corresponding to the utility function with knowledge of the population parameter. Proposition 2.2ii) is an immediate consequence of Proposition 2.2i).
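The uniform convergence in Proposition 2.2i) can be visualized with a toy computation. The sketch below assumes, purely for illustration, a normalized Gaussian-shaped intensity centered at a hypothetical $\theta_0 = 0.6$ whose `scale` shrinks as a proxy for a growing sample size. The actual ESP intensity is not Gaussian, and the utility is the same assumed quadratic utility as before.

```python
import math


def sup_gap(scale, theta_0=0.6, n=400):
    """Largest gap over the grid between expected utility and u(theta_e, theta_0)."""
    grid = [i / n for i in range(n + 1)]
    step = 1.0 / n
    raw = [math.exp(-0.5 * ((t - theta_0) / scale) ** 2) for t in grid]
    total = sum(raw) * step
    weights = [w / total for w in raw]  # normalized so that the weights integrate to one

    def u(theta_e, theta):
        # Assumed utility satisfying Assumption 1 on [0, 1]
        return 1.0 - (theta_e - theta) ** 2

    gaps = []
    for theta_e in grid:
        expected = sum(u(theta_e, t) * w for t, w in zip(grid, weights)) * step
        gaps.append(abs(expected - u(theta_e, theta_0)))
    return max(gaps)


# A smaller scale mimics an intensity that concentrates around theta_0
gaps = [sup_gap(scale) for scale in (0.2, 0.1, 0.05)]
print(gaps)
```

In this toy setting the supremum gap decreases as the intensity concentrates, which is the behaviour the proposition establishes for the ESP intensity under Assumptions 5-11 and 1.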

Confidence set

Point estimates are not necessarily stable. The typical symptom of instability is the absence of a unique well-separated maximum of the objective function. Confidence sets provide an indication of the stability of point estimates. We define ESP confidence set estimates as follows.

Definition 2.2 (ESP confidence set). Given a utility function $u(.,.)$, an ESP confidence set of level $1 - \alpha$ with $\alpha \in [0, 1]$ is a $\mathcal{B}(\Theta)$-measurable set

$$\tilde{S}_{u,T} := \left\{ \theta_e \in \Theta : \frac{1}{K_T^u} \int_\Theta u(\theta_e, \theta)\tilde{f}_{\theta_T^*,sp}(\theta)d\theta \ge k_{\alpha,T} \right\}$$

where $k_{\alpha,T}$ is the highest bound satisfying $\frac{1}{K_T^u} \int_{\tilde{S}_{u,T}} \int_\Theta u(\theta_e, \theta)\tilde{f}_{\theta_T^*,sp}(\theta)d\theta d\theta_e \ge 1 - \alpha$ and $K_T^u := \int_{\Theta^2} u(\theta_e, \theta)\tilde{f}_{\theta_T^*,sp}(\theta)d\theta d\theta_e$.
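A grid version of Definition 2.2 can be sketched as follows, under the same illustrative assumptions as above (a hypothetical `toy_intensity` on $\Theta = [0, 1]$ and an assumed quadratic utility). Grid points are added in decreasing order of normalized ESP expected utility until the accumulated level reaches $1 - \alpha$, which approximates the highest bound $k_{\alpha,T}$. Ties at the threshold are not treated specially in this sketch.

```python
import math


def esp_confidence_set(grid, intensity, utility, alpha):
    """Grid approximation of an ESP confidence set of level 1 - alpha."""
    step = grid[1] - grid[0]
    weights = [intensity(t) for t in grid]
    # g[j]: ESP expected utility of the decision grid[j]
    g = [sum(utility(te, t) * w for t, w in zip(grid, weights)) * step for te in grid]
    k_norm = sum(g) * step  # normalizing constant K_T^u
    # Add decisions from highest to lowest expected utility until the level is reached
    order = sorted(range(len(grid)), key=lambda j: g[j], reverse=True)
    chosen, mass = [], 0.0
    for j in order:
        chosen.append(j)
        mass += g[j] * step / k_norm
        if mass >= 1.0 - alpha:
            break
    return sorted(grid[j] for j in chosen), mass


def toy_intensity(theta):
    # Hypothetical stand-in for the ESP intensity on Theta = [0, 1]
    return math.exp(-0.5 * ((theta - 0.6) / 0.1) ** 2)


def quadratic_utility(theta_e, theta):
    # Assumed utility satisfying Assumption 1 on [0, 1]
    return 1.0 - (theta_e - theta) ** 2


grid = [i / 200 for i in range(201)]
members, level = esp_confidence_set(grid, toy_intensity, quadratic_utility, alpha=0.1)
print(members[0], members[-1], level)
```

Because the quadratic utility assigns substantial utility to decisions far from the solution, the resulting set is wide in this toy example. A more sharply peaked utility would give a tighter set.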

By definition, all the elements of the parameter space contained in $\tilde{S}_{u,T}$ provide a higher ESP expected utility than any element of the parameter space outside $\tilde{S}_{u,T}$. In other words, ESP confidence sets correspond to the parameter values which are the closest to maximizing the ESP expected utility. Thus it is the smallest set-estimate satisfying a constraint of ESP expected utility level $1 - \alpha$. We do not require the ESP expected utility level to be exactly equal to $1 - \alpha$, in order to ensure existence. If the ESP intensity is locally perfectly flat, the ESP expected utility level over the ESP confidence set cannot equal $1 - \alpha$. To the knowledge of the author, such confidence sets have not been studied in Bayesian theory except in the case of the 0-1 utility function. In standard classical estimation, standard deviations are sufficient to report since, whatever the sample size, the uncertainty is summarized by a Gaussian distribution centered at the point estimate.

Since the integral $\int_{\Theta^2} u(\theta_e, \theta)\tilde{f}_{\theta_T^*,sp}(\theta)d\theta d\theta_e$ can take an arbitrary positive value, we normalize the ESP expected utility to define ESP confidence sets. The following assumption ensures the possibility to normalize, i.e. $K_T^u \neq 0$.

Assumption 2. The domain of definition of the rough ESP intensity is not empty, i.e. $\hat\Theta_T \neq \emptyset$.

Assumption 2 is mild. If $\hat\Theta_T$ is empty, either there is no support for the model of interest or the sample size is too small. For $T$ big enough, $\hat\Theta_T$ is not empty.⁵

To study the consistency of ESP confidence sets, we introduce a notion of convergence.

Definition 2.3 (Convergence of sets). Let $\mathrm{int}(A)$ denote the interior of a set $A$. A sequence of sets $\{A_T\}_{T \ge 1}$ converges to a set $A$ if and only if for all $a_1 \in \mathrm{int}(A)$ and $a_2 \in \mathrm{int}(A^c)$ there exists $\dot{T} \in \mathbb{N}$ s.t. $T \ge \dot{T}$ implies $a_1 \in \mathrm{int}(A_T)$ and $a_2 \in \mathrm{int}(A_T^c)$. It is denoted $A_T \to A$.

Definition 2.3 means that a sequence of sets converges to a limiting set if the interiors of the sets match asymptotically. Using this definition, we can prove that ESP confidence sets converge to their asymptotic counterpart. The following proposition ensures existence and consistency of ESP confidence sets.

Proposition 2.3. Define an asymptotic ESP confidence set of level $1 - \alpha$ as a measurable set

$$\tilde{S}_{u,\infty} := \left\{ \theta_e \in \Theta : \frac{1}{K_\infty} u(\theta_e, \theta_0) \ge k_{\alpha,\infty} \right\}$$

where $k_{\alpha,\infty}$ is the highest bound satisfying $\frac{1}{K_\infty} \int_{\tilde{S}_{u,\infty}} u(\theta_e, \theta_0)d\theta_e \ge 1 - \alpha$ and $K_\infty := \int_\Theta u(\theta_e, \theta_0)d\theta_e$. For all $\alpha \in [0, 1]$,
i) under Assumptions 5-7 and 1-2, there exist an ESP confidence set, $\tilde{S}_{u,T}$, and an asymptotic ESP confidence set of level $1 - \alpha$;
ii) under Assumptions 5-11 and 1-2, as $T \to \infty$, $\tilde{S}_{u,T} \to \tilde{S}_{u,\infty}$ P-a.s.

Proof. See Appendix C.3 p.27. □

Asymptotic ESP confidence sets correspond to the parameter values that provide the most weighted utility. With continuous utility functions, an asymptotic ESP confidence set does not include only the population parameter: parameter values different from the population parameter also provide utility to the econometrician. Proposition 2.3 is a consequence of Proposition 2.1iii).

⁵ By Corollary 1 in the core paper, for $T$ big enough there exists a consistent solution to the empirical moment conditions. Thus, for $T$ big enough, $\hat\Theta_T$ contains a neighborhood of a solution to the empirical moment conditions by Assumption 10(e).

2.2. 0-1 utility functions

In research, estimation typically aims at increasing scientific knowledge, whose effects are hard to measure. In this case, 0-1 utility functions are relevant. A 0-1 utility function equals one by normalization when the estimation decision is right (i.e. when $\theta_e$ is a solution to the empirical moment conditions) and zero otherwise.

Point estimate

In point estimation, the 0-1 utility function is not a usual function. On the one hand, Assumption 5(d) rules out situations with a continuum of solutions to the empirical moment conditions. On the other hand, by construction, the ESP intensity measure is absolutely continuous w.r.t. the Lebesgue measure, which ignores points. Therefore, in point estimation, the 0-1 utility function is a generalized function which corresponds to a family of Dirac distributions indexed by $\theta_e$, $\{\delta_{\theta_e}(.)\}_{\theta_e \in \Theta}$.⁶

Definition 2.4 (Maximum ESP point estimate). A maximum ESP point estimate, $\hat\theta_T$, is an ESP estimate that maximizes the ESP expected 0-1 utility function, i.e.

$$\hat\theta_T := \arg\max_{\theta_e \in \Theta} \tilde{\mathbb{E}}[\delta_{\theta_e}(\theta_T^*)]$$

where $\delta_{\theta_e}(.)$ is the Dirac distribution at $\theta_e$.

The following immediate proposition clarifies the meaning of Definition 2.4.

Proposition 2.4. Under Assumptions 5-7, Definition 2.4 is equivalent to each of the following properties:
i) if there exists a unique $\hat\theta_T \in \mathrm{int}(\Theta)$, for $r > 0$ small enough, $\hat\theta_T = \arg\max_{\theta_e \in \Theta} \int I_{B_r(\theta_e)}(\theta)\tilde{f}_{\theta_T^*,sp}(\theta)d\theta$;
ii) $\hat\theta_T = \arg\max_{\theta_e \in \Theta} \tilde{f}_{\theta_T^*,sp}(\theta_e)$.

Proof. See Appendix C.4 p.30. □

Proposition 2.4i) provides an alternative, but equivalent, formalization of absolute preference for “truth”. Proposition 2.4ii) provides an alternative interpretation of maximum ESP point estimates. A maximum ESP point estimate is the parameter whose estimated probability weight of being a solution to the empirical moment conditions is the highest. In this sense, it is a maximum-probability estimate. Proposition 2.4ii) also shows that our maximum ESP point estimate corresponds to the point estimate introduced in Sowell (2009) to correct the higher-order bias of entropy-based estimates (or exponential tilting estimates). Sowell (2009) shows that the logarithm of the ESP intensity divided by the sample size corresponds to the exponential tilting objective function plus two terms which vanish asymptotically. He deduces that maximum ESP estimates share the same first-order asymptotic properties as ET estimates, but that they are higher-order bias corrected thanks to the two extra terms of the objective function. In accordance with Sowell (2009), the following proposition shows that maximum ESP estimates are consistent.

Proposition 2.5. Under Assumptions 5-7,
i) there exists a maximum ESP point estimate $\hat\theta_T$;
ii) under additional Assumptions 8-11 and 2, a maximum ESP point estimate converges P-a.s. to the population parameter, i.e. $\lim_{T \to \infty} \hat\theta_T = \theta_0$ P-a.s.

Proof. See Appendix C.5 p.30. □

Proposition 2.5i) follows from Lemma 2 in Jennrich (1969) and the continuity of the ESP intensity. Unlike in Sowell (2009), Proposition 2.5ii) is here a consequence of the consistency of the ESP intensity.

⁶ See section 1 p.13-20 in chap.1 in Schwartz (2010) for a discussion about the differences between (Schwartz) distributions and functions. We follow this misuse of language here to avoid a too cumbersome terminology. Note also that the meaning of Dirac distributions in this section is different from the one in Theorems 5.1 and 5.3 of the core paper. Here Dirac distributions formalize absolute preference of the econometrician for “truth”, while in Theorem 5.1 and 5.2 they formalize probabilistic distributions.
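The role of “$r > 0$ small enough” in Proposition 2.4i) can be illustrated on a grid. The sketch below uses a hypothetical bimodal `toy_intensity` (an assumption for illustration, not an ESP intensity computed from data) whose highest peak is at 0.3 but whose second bump carries more mass. With a small radius the ball-integral maximizer coincides with the mode, as in Proposition 2.4ii). With a large radius it does not.

```python
import math


def toy_intensity(theta):
    # Hypothetical intensity: a tall narrow bump at 0.3 and a lower, wider bump at 0.7
    tall = math.exp(-0.5 * ((theta - 0.3) / 0.05) ** 2)
    wide = 0.6 * math.exp(-0.5 * ((theta - 0.7) / 0.1) ** 2)
    return tall + wide


def ball_argmax(values, half_width):
    """Index maximizing the sum of `values` over a window of +/- half_width grid points."""
    n = len(values)

    def window_sum(j):
        # Discrete analogue of integrating the intensity over the ball B_r(theta_e)
        return sum(values[max(0, j - half_width): min(n, j + half_width + 1)])

    return max(range(n), key=window_sum)


grid = [i / 1000 for i in range(1001)]
values = [toy_intensity(t) for t in grid]

mode_index = max(range(len(grid)), key=lambda j: values[j])  # arg max of the intensity
small_ball_index = ball_argmax(values, half_width=10)        # r = 0.01
large_ball_index = ball_argmax(values, half_width=300)       # r = 0.3

print(grid[mode_index], grid[small_ball_index], grid[large_ball_index])
```

The large-radius maximizer is pulled towards the region with more total mass, which is why the equivalence in Proposition 2.4i) is stated only for small radii.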

Confidence set

As for continuous utility functions, we define confidence sets.

Definition 2.5 (Maximum ESP confidence set). A maximum ESP confidence set of level $1 - \alpha$ with $\alpha \in [0, 1]$ is a $\mathcal{B}(\Theta)$-measurable set

$$\tilde{S}_T := \left\{ \theta_e \in \Theta : \frac{1}{K_T} \tilde{f}_{\theta_T^*,sp}(\theta_e) \ge k_{\alpha,T} \right\}$$

where $k_{\alpha,T}$ is the highest bound satisfying $\frac{1}{K_T} \int_{\tilde{S}_T} \tilde{f}_{\theta_T^*,sp}(\theta)d\theta \ge 1 - \alpha$ and $K_T := \int_\Theta \tilde{f}_{\theta_T^*,sp}(\theta)d\theta$.

Since $\int_\Theta \delta_{\theta_e}(\theta)\tilde{f}_{\theta_T^*,sp}(\theta)d\theta = \tilde{f}_{\theta_T^*,sp}(\theta_e)$, Definition 2.5 is in line with Definition 2.2, and thus the same interpretation still holds. By Proposition 4.5 in the core paper, all elements of the maximum-probability set have a higher probability weight of being a solution to the empirical moment conditions than those outside. In this sense, they are maximum-probability based. As for continuous utility functions, we do not require the ESP intensity measure of the maximum ESP confidence sets to equal $1 - \alpha$, in order to ensure existence. To the knowledge of the author, Sowell (2007) introduces this type of confidence interval in the saddlepoint literature. The Bayesian counterpart of maximum ESP sets is typically called a “highest posterior density α-credible region” (e.g. Definition 5.5.2 in Robert, 1994). The following proposition ensures existence and consistency of maximum ESP confidence sets.

Proposition 2.6. For all $\alpha \in \,]0, 1[$,
i) under Assumptions 5-7 and 2, there exists a maximum ESP confidence set, $\tilde{S}_T$;
ii) under Assumptions 5-11 and 2, as $T \to \infty$, $\tilde{S}_T \to \{\theta_0\}$ P-a.s.

Proof. See Appendix C.6 p.30. □

Proposition 2.6i) follows from the same arguments as Proposition 2.3i).
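Definition 2.5 amounts to a highest-intensity region, which can be approximated on a grid by adding points in decreasing order of intensity until the normalized intensity measure reaches $1 - \alpha$. The sketch below assumes a hypothetical Gaussian-shaped intensity on $\Theta = [0, 1]$ (the function `bump` and its `scale` parameter are illustrative, not part of the ESP framework) and compares a diffuse and a concentrated intensity.

```python
import math


def max_esp_confidence_set(grid, intensity, alpha):
    """Grid approximation of a maximum ESP confidence set of level 1 - alpha."""
    step = grid[1] - grid[0]
    f = [intensity(t) for t in grid]
    k_norm = sum(f) * step  # normalizing constant K_T
    # Add points from highest to lowest intensity until the level is reached
    order = sorted(range(len(grid)), key=lambda j: f[j], reverse=True)
    chosen, mass = [], 0.0
    for j in order:
        chosen.append(j)
        mass += f[j] * step / k_norm
        if mass >= 1.0 - alpha:
            break
    return sorted(grid[j] for j in chosen), mass


def bump(scale):
    # Hypothetical intensity centered at 0.6; a smaller scale mimics a larger sample
    return lambda theta: math.exp(-0.5 * ((theta - 0.6) / scale) ** 2)


grid = [i / 1000 for i in range(1001)]
wide, wide_mass = max_esp_confidence_set(grid, bump(0.1), alpha=0.05)
narrow, narrow_mass = max_esp_confidence_set(grid, bump(0.02), alpha=0.05)

print(wide[0], wide[-1])
print(narrow[0], narrow[-1])
```

In this toy example the set shrinks around 0.6 as the intensity concentrates, in the spirit of Proposition 2.6ii). With a multimodal intensity the same routine would return a union of disjoint intervals.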

3. Hypothesis testing

In this section, the estimation decision corresponds to the choice of a subset of the parameter space $\Theta$.

3.1. Notations and definitions

The following definition sets the notations for tests.

Definition 3.1 (Test). Define the subsets $\Theta_H \subset \Theta$ and $\Theta_A \subset \Theta$ such that $\Theta_H \cap \Theta_A = \emptyset$. Define the measurable decision space $(D, \mathcal{D})$ where $D := \{d_H, d_A\}$ and where $\mathcal{D}$ is the power set of $D$. The decisions $d_H$ and $d_A$ respectively correspond to acceptance of $\Theta_H$ and rejection of $\Theta_H$. Given a sample size $T$, a test is an $\mathcal{E}/\mathcal{D}$-measurable function $d_T(.)$.

As in point estimation, the decision which maximizes the ESP expected utility is retained. Thus, we define an ESP test as follows.

Definition 3.2 (ESP test). Given a utility function $u(.,.)$, an ESP hypothesis test is an $\mathcal{E}/\mathcal{D}$-measurable function, $d_T$, such that for all $\omega \in \Omega$, if

$$\tilde{\mathbb{E}}[u(d_H, \theta_T^*)] > \tilde{\mathbb{E}}[u(d_A, \theta_T^*)]$$

then $d_T(\omega) = d_H$; and otherwise $d_T(\omega) = d_A$.

The standard classical theory uses the asymptotic Gaussian distribution to summarize the uncertainty about the population parameter. This implies the population parameter can be outside the parameter space with a strictly positive probability, although the economic model is typically not defined for these parameter values. For example, in consumption-based asset pricing, standard confidence intervals and tests for the time discount factor allow it to take values higher than one, since the support of the Gaussian distribution is the whole real line. However, for values higher than one, the consumption-based asset pricing model is typically not defined: the value function of a dynastic representative agent explodes to infinity for a time discount factor higher than one. In contrast, like ESP confidence sets, ESP tests do not regard values outside the parameter space as possible. By construction, the ESP intensity is not defined outside the parameter space.

Another concern with the standard classical theory is the asymptotic properties of tests. In the standard classical theory, a test is consistent if the probability of rejection of the hypothesis goes to one when $\theta_0 \in \Theta_A$ as the sample size increases to infinity (e.g. p.553 in Gouriéroux and Monfort, 1989). However, a consistent test asymptotically rejects the hypothesis of the test when $\theta_0 \in \Theta_H$ with a probability equal to the level of the test. As shown below, such an asymptotic mistake does not occur in our framework. We introduce the notion of double consistency to characterize this property.

Definition 3.3 (Double consistency). A test $d_T(.)$ is doubly consistent P-a.s. if and only if

$$\lim_{T \to \infty} d_T = \begin{cases} d_H & \text{if } \theta_0 \in \Theta_H \\ d_A & \text{if } \theta_0 \in \Theta_A \end{cases} \quad \text{P-a.s.}$$

In a test, there are two possible estimation decisions (acceptance and rejection of the test hypothesis) and two possible true propositions (the hypothesis is correct and the alternative is correct). Therefore, in hypothesis testing, a utility function takes at most four different values. Consequently, instead of the distinction between continuous and 0-1 utility functions as for point estimation, we distinguish between set and point hypotheses.

3.2. Set hypothesis

The following assumption sets notations for the utility function.

Assumption 3. For all $(d, \theta) \in D \times \Theta$, $u : (d, \theta) \mapsto c_d I_{\Theta_H}(\theta) + b_d I_{\Theta_A}(\theta)$ with $c_{d_H} > c_{d_A}$ and $b_{d_A} > b_{d_H}$.

The strict inequality conditions on the values of the utility function ensure that right estimation decisions provide a strictly higher expected utility to the econometrician than wrong ones. The maximum ESP (or maximum expected “truth”) approach corresponds to $c_{d_H} = b_{d_A} = 1$ and $c_{d_A} = b_{d_H} = 0$. The following proposition conveniently reformulates ESP tests in the case of set hypotheses.

Proposition 3.1. Under Assumptions 5-7 and 3, the ESP hypothesis test in Definition 3.2 is equivalent to the test $d_T$ such that for all $\omega \in \Omega$, if

$$c\,\tilde{F}_T(\Theta_H) > \tilde{F}_T(\Theta_A) \quad \text{with } c := \frac{c_{d_H} - c_{d_A}}{b_{d_A} - b_{d_H}}, \qquad (1)$$

then $d_T(\omega) = d_H$; and otherwise $d_T(\omega) = d_A$.

Proof. See Appendix C.7 p.30. □

Proposition 3.1 is the immediate counterpart of a standard result in Bayesian estimation (e.g. p.218 in Schervish, 1995). In the case of a maximum ESP approach (i.e. $c = 1$), the meaning is clear. We accept the hypothesis if the estimated intensity measure of solutions to the empirical moment conditions lying in $\Theta_H$ is higher than that of solutions lying in $\Theta_A$. If there can only be one solution to the empirical moment conditions, we accept the most probable hypothesis. Despite this appealing meaning, Proposition 3.1 also shows that the hypothesis with the biggest volume is favoured. The following proposition ensures the existence and the double consistency of an ESP set-hypothesis test.

Proposition 3.2. Given a utility function $u(.,.)$,
i) under Assumptions 5-7 and 3, there exists an ESP set-hypothesis test;
ii) under Assumptions 5-11 and 3, an ESP set-hypothesis test is doubly consistent P-a.s.

Proof. See Appendix C.8 p.30. □

Proposition 3.2i) is immediate. Proposition 3.2ii) is a consequence of the convergence of the ESP intensity measure to a Dirac distribution centered at the population parameter. Unlike in the standard classical approach, there is no uncertainty asymptotically, and thus no mistake occurs.
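The equivalence stated in Proposition 3.1 follows from comparing the two ESP expected utilities under Assumption 3: decision $d$ has expected utility $c_d \tilde{F}_T(\Theta_H) + b_d \tilde{F}_T(\Theta_A)$, and rearranging the comparison gives rule (1). The sketch below checks this on a grid with a hypothetical `toy_intensity` on $\Theta = [0, 1]$. The sets $\Theta_H = [0, 0.5)$ and $\Theta_A = [0.5, 1]$ and all numerical values are illustrative assumptions.

```python
import math


def esp_set_test(grid, intensity, in_h, in_a, c_h, c_a, b_h, b_a):
    """ESP test for u(d, theta) = c_d * 1{theta in Theta_H} + b_d * 1{theta in Theta_A}.

    (c_h, b_h) are the utility values of decision d_H, (c_a, b_a) those of d_A.
    Returns the decision from Definition 3.2 and the truth value of rule (1).
    """
    step = grid[1] - grid[0]
    f_h = sum(intensity(t) for t in grid if in_h(t)) * step  # intensity measure of Theta_H
    f_a = sum(intensity(t) for t in grid if in_a(t)) * step  # intensity measure of Theta_A
    eu_h = c_h * f_h + b_h * f_a  # ESP expected utility of d_H
    eu_a = c_a * f_h + b_a * f_a  # ESP expected utility of d_A
    c = (c_h - c_a) / (b_a - b_h)
    return ("d_H" if eu_h > eu_a else "d_A"), c * f_h > f_a


def toy_intensity(theta):
    # Hypothetical stand-in for the ESP intensity, peaked at 0.6
    return math.exp(-0.5 * ((theta - 0.6) / 0.1) ** 2)


grid = [i / 1000 for i in range(1001)]
in_h = lambda t: t < 0.5
in_a = lambda t: t >= 0.5

# Maximum ESP approach: c = 1
max_esp = esp_set_test(grid, toy_intensity, in_h, in_a, c_h=1, c_a=0, b_h=0, b_a=1)
# Wrongly rejecting the hypothesis is ten times as costly: c = 10
cautious = esp_set_test(grid, toy_intensity, in_h, in_a, c_h=10, c_a=0, b_h=0, b_a=1)
print(max_esp, cautious)
```

In both calls the direct comparison of expected utilities and rule (1) agree, while the decision itself changes with the weight $c$.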

3.3. Point hypothesis

In the case of a point hypothesis (i.e. $\Theta_H := \{\theta_H\}$), we derive results similar to those for set-hypothesis tests. The counterpart of Assumption 3 is the following assumption.

Assumption 4. For all $(d, \theta) \in D \times \Theta$, $u : (d, \theta) \mapsto c_d \delta_{\theta_H}(\theta) + b_d I_{\Theta_A}(\theta)$ with $c_{d_H} > c_{d_A}$ and $b_{d_A} > b_{d_H}$.

Since $\Theta_H$ is a parameter value, the utility function is expressed in terms of a Dirac distribution for the same reason as in section 2.2. A maximum ESP approach also corresponds to $c_{d_H} = b_{d_A} = 1$ and $c_{d_A} = b_{d_H} = 0$. The counterpart of Proposition 3.1 is the following immediate proposition.

Proposition 3.3. Under Assumptions 5-7 and 4, the ESP hypothesis test in Definition 3.2 is equivalent to the test $d_T$ such that for all $\omega \in \Omega$, if

$$c\,\tilde{f}_{\theta_T^*,sp}(\theta_H) > \tilde{F}_T(\Theta_A) \quad \text{with } c := \frac{c_{d_H} - c_{d_A}}{b_{d_A} - b_{d_H}}, \qquad (2)$$

then $d_T(\omega) = d_H$; and otherwise $d_T(\omega) = d_A$.

Proof. See Appendix C.9 p.30. □

In the case of a maximum ESP approach (i.e. $c = 1$), an ESP test does not have the same straightforward meaning as in Proposition 3.1. The LHS of equation (2) is in terms of intensity weight, while the RHS is in terms of intensity measure. The test hypothesis is accepted when the estimated intensity (or probability by Proposition 4.5) weight of $\theta_H$ being a solution to the empirical moment conditions is higher than the intensity measure of $\Theta_A$. In a maximum ESP approach, Proposition 3.3 also shows some similarity with Jeffreys’ Bayes factors (e.g. subsection 4.2.2 in Schervish, 1995). The following proposition is the point-hypothesis counterpart of Proposition 3.2.

Proposition 3.4. Given a utility function $u(.,.)$,
i) under Assumptions 5-7 and 4, there exists an ESP point-hypothesis test;
ii) under Assumptions 5-11 and 4, an ESP point-hypothesis test is doubly consistent P-a.s.

Proof. See Appendix C.10 p.30. □
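Rule (2) compares an intensity weight at a point with an intensity measure of a set, which the following sketch makes concrete. It assumes a hypothetical unnormalized `toy_intensity` on $\Theta = [0, 1]$ and takes $\Theta_A = \Theta \setminus \{\theta_H\}$, so that the intensity measure of $\Theta_A$ equals that of $\Theta$ because a single point carries no mass. These choices are illustrative only.

```python
import math


def esp_point_test(grid, intensity, theta_h, c):
    """ESP point-hypothesis test based on rule (2): accept if c * f(theta_H) > F(Theta_A)."""
    step = grid[1] - grid[0]
    # Theta_A is Theta minus {theta_h}; removing one point does not change the
    # integral of an absolutely continuous intensity, so we integrate over the whole grid.
    f_a = sum(intensity(t) for t in grid) * step
    return "d_H" if c * intensity(theta_h) > f_a else "d_A"


def toy_intensity(theta):
    # Hypothetical stand-in for the ESP intensity, peaked at 0.6 (not normalized)
    return math.exp(-0.5 * ((theta - 0.6) / 0.1) ** 2)


grid = [i / 1000 for i in range(1001)]
at_peak = esp_point_test(grid, toy_intensity, theta_h=0.6, c=1.0)
in_tail = esp_point_test(grid, toy_intensity, theta_h=0.9, c=1.0)
print(at_peak, in_tail)
```

Here the hypothesis is accepted at the peak of the toy intensity and rejected in its tail. Since the two sides of rule (2) are measured in different units, the outcome depends on the scale of the parameter space, which is one reason why the interpretation is less direct than for set hypotheses.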

4. Robustness to lack of identification of the decision-theoretic approach

In this section, we present how the multiplicity of solutions to the moment conditions affects the decision-theoretic approach of sections 2 and 3. For clarity, the structure of the section is similar to that of sections 2 and 3. For brevity, we only indicate the necessary changes w.r.t. sections 2 and 3. Proofs are adaptations of the proofs in the case of identification.

4.1. Point estimation

By definition, point estimation is not relevant in the case of multiple solutions to the moment conditions. Therefore, like existing point estimates, ESP point estimates are in this case only locally consistent, i.e. they are consistent when the parameter space is restricted to a subset containing a unique solution. However, ESP confidence sets reflect the lack of reliability of ESP point estimates. We show that ESP confidence sets are globally consistent in the presence of multiple solutions to the moment conditions. In contrast, standard classical confidence sets are not consistent. By construction, they treat the uncertainty about the population parameter as a Gaussian density centered at the point estimate. Thus, standard point estimates contaminate standard confidence sets.

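4.1.1 Continuous utility functions

Point estimate

Definition 2.1 becomes the following.

Definition 4.1. Denote by $\mathcal{P}(\Theta) := \{\Theta_i\}_{i=1}^n$ a partition of $\Theta$ such that for all $i \in [[1, n]]$, $\Theta_i$ contains a unique solution to the moment conditions, $\theta_0^{(i)} \in \mathrm{int}(\Theta_i)$. Given a utility function $u(.,.)$ and a subset $\Theta_i$, a local ESP point estimate, $\hat\theta_T^{u,(i)}$, is an $\mathcal{E}/\mathcal{B}(\Theta)$-measurable maximizer of the ESP expected utility over $\Theta_i$, i.e.

$$\hat\theta_T^{u,(i)} := \arg\max_{\theta_e \in \Theta_i} \tilde{\mathbb{E}}[u(\theta_e, \theta_T^*)]$$

where $\tilde{\mathbb{E}}[u(\theta_e, \theta_T^*)] := \int_\Theta u(\theta_e, \theta)\tilde{f}_{\theta_T^*,sp}(\theta)d\theta$.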

Note that we still integrate the utility function over the whole parameter space. Proposition 2.1 remains valid for local ESP point estimates after obvious changes. Proposition 2.2 becomes the following.

Proposition 4.1. Under Assumptions 5-7, 12-14 and 1, for all $i \in [[1, n]]$, as $T \to \infty$,
i) $\sup_{\theta_e \in \Theta} \left| \int_\Theta u(\theta_e, \theta)\tilde{f}_{\theta_T^*,sp}(\theta)d\theta - \sum_{i=1}^n u(\theta_e, \theta_0^{(i)}) \right| \to 0$ P-a.s.;
ii) a local ESP point estimate converges locally P-a.s. to its corresponding solution to the moment conditions, i.e. $\lim_{T \to \infty} \hat\theta_T^{u,(i)} = \theta_0^{(i)}$ P-a.s.

Proof. Adapt proof of Proposition 2.2. □

Proposition 4.1 shows that multiplicity of solutions to the moment conditions implies a multimodal ESP intensity. Similarly, multiplicity of local minima in a GMM objective function may be a symptom of multiple solutions to the moment conditions. Theorem 4.1.2 in Amemiya (1985) has a spirit similar to Proposition 4.1.

Confidence set

Definitions 2.2, 2.3 and Proposition 2.3i) remain valid. Proposition 2.3ii) becomes the following.

Proposition 4.2. Define an asymptotic ESP confidence set of level $1 - \alpha$ as a measurable set

$$\tilde{S}_{u,\infty} := \left\{ \theta_e \in \Theta : \frac{1}{K_\infty} \sum_{i=1}^n u(\theta_e, \theta_0^{(i)}) \ge k_{\alpha,\infty} \right\}$$

where $k_{\alpha,\infty}$ is the highest bound satisfying $\frac{1}{K_\infty} \sum_{i=1}^n \int_{\tilde{S}_{u,\infty}} u(\theta_e, \theta_0^{(i)})d\theta_e \ge 1 - \alpha$ and $K_\infty := \sum_{i=1}^n \int_\Theta u(\theta_e, \theta_0^{(i)})d\theta_e$. For all $\alpha \in [0, 1]$, under Assumptions 5-7, 12-14 and 1-2, as $T \to \infty$, $\tilde{S}_{u,T} \to \tilde{S}_{u,\infty}$ P-a.s.

Proof. Adapt proof of Proposition 2.3. □

By Proposition 4.1i), the definition of an asymptotic ESP confidence set is in line with the definition in Proposition 2.3.

4.1.2 0-1 utility functions

Point estimate

Definition 2.4 becomes the following.

Definition 4.2. A local maximum ESP point estimate, $\hat\theta_T^{(i)}$, is an ESP estimate that maximizes the ESP expected 0-1 utility function over $\Theta_i$, i.e.

$$\hat\theta_T^{(i)} := \arg\max_{\theta_e \in \Theta_i} \tilde{\mathbb{E}}[\delta_{\theta_e}(\theta_T^*)]$$

where $\delta_{\theta_e}(.)$ is the Dirac distribution at $\theta_e$.

After obvious modifications, Propositions 2.4 and 2.5i) remain valid for local maximum ESP point estimates. Proposition 2.5ii) becomes the following.

Proposition 4.3. Under Assumptions 5-7 and 12-14, a local maximum ESP point estimate converges locally P-a.s. to its corresponding solution to the moment conditions, i.e. for all $i \in [[1, n]]$, $\lim_{T \to \infty} \hat\theta_T^{(i)} = \theta_0^{(i)}$ P-a.s.

Proof. Adapt proof of Proposition 2.5. □

Set estimate

Definition 2.2 and Proposition 2.6i) remain valid for the same reason, but Proposition 2.6ii) becomes the following.

Proposition 4.4. Under Assumptions 5-7, 12-14 and 1-2, for all $\alpha \in \,]0, 1[$, as $T \to \infty$,

$$\tilde{S}_T \to \bigsqcup_{i=1}^n \left\{ \theta_0^{(i)} \right\} \quad \text{P-a.s.}$$

where $\bigsqcup$ denotes a union of disjoint sets.

Proof. Adapt proof of Proposition 2.5. □

4.2. Hypothesis testing

Standard classical tests correspond to standard classical confidence intervals. Thus they are not robust to the presence of multiple solutions to the moment conditions. ESP tests present some robustness to this situation. They take into account the uncertainty due to the multiplicity of solutions. Definitions 3.1 and 3.2 remain relevant, unlike Definition 3.3.

4.2.1 Set hypothesis

Propositions 3.1 and 3.2i) remain valid for the same reasons, but Proposition 3.2ii) becomes the following.

Proposition 4.5. Given a utility function $u(.,.)$, under Assumptions 5-7, 12-14 and 3, as $T \to \infty$, P-a.s.

$$\lim_{T \to \infty} d_T = \begin{cases} d_H & \text{if } c\,\#\left\{\theta_0^{(i)} : i \in [[1, n]] \text{ and } \theta_0^{(i)} \in \Theta_H\right\} > \#\left\{\theta_0^{(i)} : i \in [[1, n]] \text{ and } \theta_0^{(i)} \in \Theta_A\right\} \\ d_A & \text{otherwise} \end{cases}$$

Proof. Adapt proof of Proposition 3.2ii). □

In other words, if the number of solutions to the moment conditions in $\Theta_H$ weighted by $c$ is higher than the number in $\Theta_A$, the hypothesis is accepted. In the case of a maximum ESP approach, $c = 1$.
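The counting rule of Proposition 4.5 can be compared with rule (1) of Proposition 3.1 in a toy setting. The sketch below assumes, in the spirit of Proposition 4.1i), a hypothetical intensity made of one narrow unit-mass bump per solution. The solutions, the sets $\Theta_H = [0, 0.5)$ and $\Theta_A = [0.5, 1]$, and the bump shape are illustrative assumptions rather than outputs of the ESP approximation.

```python
import math


def count_rule(solutions, in_h, in_a, c):
    """Limiting decision of Proposition 4.5: compare weighted counts of solutions."""
    n_h = sum(1 for s in solutions if in_h(s))
    n_a = sum(1 for s in solutions if in_a(s))
    return "d_H" if c * n_h > n_a else "d_A"


def intensity_rule(solutions, in_h, in_a, c, scale=0.01, n=2000):
    """Rule (1) applied to a toy intensity with one unit-mass bump per solution."""
    grid = [i / n for i in range(n + 1)]
    step = 1.0 / n
    norm = scale * math.sqrt(2.0 * math.pi)

    def intensity(t):
        # Hypothetical concentrated intensity: sum of narrow Gaussian-shaped bumps
        return sum(math.exp(-0.5 * ((t - s) / scale) ** 2) / norm for s in solutions)

    f_h = sum(intensity(t) for t in grid if in_h(t)) * step
    f_a = sum(intensity(t) for t in grid if in_a(t)) * step
    return "d_H" if c * f_h > f_a else "d_A"


solutions = [0.2, 0.7, 0.9]  # one solution in Theta_H, two in Theta_A
in_h = lambda t: t < 0.5
in_a = lambda t: t >= 0.5

decisions = {c: (count_rule(solutions, in_h, in_a, c),
                 intensity_rule(solutions, in_h, in_a, c)) for c in (1.0, 3.0)}
print(decisions)
```

With $c = 1$ the hypothesis is rejected because more solutions lie in $\Theta_A$. With $c = 3$ the single solution in $\Theta_H$ outweighs the two in $\Theta_A$. In both cases the two rules agree in this toy example.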

4.2.2 Point hypothesis

Propositions 3.3 and 3.4i) remain valid for the same reasons, but Proposition 3.4ii) becomes the following.

Proposition 4.6. Given a utility function $u(.,.)$, under Assumptions 5-7, 12-14 and 3, as $T \to \infty$, P-a.s.

$$\lim_{T \to \infty} d_T = \begin{cases} d_H & \text{if } \theta_H \in \left\{\theta_0^{(i)}\right\}_{i=1}^n \\ d_A & \text{otherwise} \end{cases}$$

Proof. Adapt proof of Proposition 3.4ii). □

According to Proposition 4.6, if the hypothesis corresponds to a solution to the moment conditions, the hypothesis is accepted. Thus, the solution to the moment conditions which corresponds to the point hypothesis is favoured.

References

Aït-Sahalia, Y. and Yu, J.: 2006, Saddlepoint approximations for continuous-time Markov processes, Journal of Econometrics 134, 507–551.
Almudevar, A., Field, C. and Robinson, J.: 2000, The density of multivariate M-estimates, The Annals of Statistics 28(1), 275–297.
Backus, D. K., Routledge, B. R. and Zin, S. E.: September 2008 (first draft June 2007), Who holds risky assets?, working paper.
Backus, D. K. and Zin, S. E.: 1994, Reverse engineering the yield curve, NBER working paper No. 4676.
Bansal, R., Kiku, D. and Yaron, A.: 2007, Risks for the long run: Estimation and inference, working paper.
Bansal, R. and Yaron, A.: 2004, Risks for the long run: a potential resolution of asset pricing puzzles, Journal of Finance 59(4), 1481–1509.
Barro, R. J.: 2006, Rare disasters and asset markets in the twentieth century, Quarterly Journal of Economics 121(3), 823–866.
Berger, J. O.: (1980) 2006, Statistical decision theory and Bayesian analysis, Statistics, second edn, Springer.
Berk, R. H.: 1972, Consistency and asymptotic normality of MLE’s for exponential models, The Annals of Mathematical Statistics 43(1), 193–204.
Bhattacharya, R. N. and Ramaswamy, R. R.: (1976) 1986, Normal Approximation and Asymptotic Expansions, second edn, Robert E. Krieger Publishing Company, Inc.
Boguth, O. and Kuehn, L.-A.: 2011, Consumption volatility risk, working paper.
Brown, B. W. and Newey, W. K.: 2002, Generalized method of moments, efficient bootstrapping, and improved inference, Journal of Business and Economic Statistics 20(4), 507–517.
Calin, O. L., Chen, Y., Cosimano, T. F. and Himonas, A. A.: 2005, Solving asset pricing models when the price-dividend function is analytic, Econometrica 73(3), 961–982.
Campbell, J. Y. and Cochrane, J. H.: 1999, By force of habit: A consumption-based explanation of aggregate stock market behavior, The Journal of Political Economy 107(2), 205–251.
Carrasco, M. and Florens, J.-P.: 2000, Generalization of GMM to a continuum of moment conditions, Econometric Theory 16(6), 797–834.
Chen, C.-F.: 1985, On asymptotic normality of limiting density functions with Bayesian implications, Journal of the Royal Statistical Society. Series B (Methodological) 47(3), 540–546.
Chen, X., Favilukis, J. and Ludvigson, S. C.: 2008, An estimation of economic models with recursive preferences, working paper.
Chen, X. and Ludvigson, S. C.: 2009, Land of addicts? An empirical investigation of habit-based asset pricing models, Journal of Applied Econometrics 24, 1057–1093.
Chen, Y., Cosimano, T. F. and Himonas, A. A.: 2008a, Analytic solving of asset pricing models: The by force of habit case, Journal of Economic Dynamics and Control 32, 3631–3660.
Chen, Y., Cosimano, T. F. and Himonas, A. A.: 2008b, Continuous time one-dimensional asset pricing models with analytic price-dividend functions, working paper (forthcoming in Economic Theory).


Chernozhukov, V. and Hong, H.: 2003, An MCMC approach to classical estimation, Journal of Econometrics 115, 293–346.
Constantinides, G. M. and Ghosh, A.: 2008, Asset pricing tests with long run risks in consumption growth, working paper.
Daley, D. J. and Vere-Jones, D.: (1988) 2008, An Introduction to the Theory of Point Processes. General Theory and Structure (vol. 2), Probability and Its Applications, second edn, Springer.
Daniels, H. E.: 1954, Saddlepoint approximations in statistics, The Annals of Mathematical Statistics 25(4), 631–650.
de Finetti, B.: 1937, La prévision: ses lois logiques, ses sources subjectives, Annales de l’Institut Henri Poincaré 7(1), 1–68. Translated in English in “Studies in Subjective Probability” by Henry E. Kyburg and Howard E. Smokler, 1964.
de Finetti, B.: 1968, International Encyclopedia of the Social Sciences, New York Macmillan, chapter “Probability: Interpretations”, pp. 496–505.
de Laplace, P.-S.: (1774) 1878, Œuvres complètes de Laplace, Vol. 8, Gauthier-Villars, chapter “Mémoire sur la probabilité des causes par les événements”, pp. 31–71.
D’Haultfoeuille, X.: 2009, Essai sur quelques problèmes d’identification en économie, PhD thesis, Université Paris 1 Panthéon-Sorbonne and Paris School of Economics.
Duffie, D.: (1992) 2001, Dynamic Asset Pricing Theory, third edn, Princeton University Press.
Dufour, J.-M.: 1997, Some impossibility theorems in econometrics with applications to structural and dynamic models, Econometrica 65(6), 1365–1387.
Epstein, L. G. and Zin, S. E.: 1989, Substitution, risk aversion, and the temporal behavior of consumption and asset returns: A theoretical framework, Econometrica 57(4), 937–969.
Epstein, L. G. and Zin, S. E.: 1991, Risk aversion, and the temporal behavior of consumption and asset returns: An empirical analysis, The Journal of Political Economy 99(2), 263–286.
Esscher, F.: 1932, On the probability function in the collective theory of risk, Scandinavian Actuarial Journal pp. 175–195.
Field, C.: 1982, Small sample asymptotic expansions for multivariate M-estimates, The Annals of Statistics 10(3), 672–689.
Field, C. and Ronchetti, E.: 1990, Small Sample Asymptotics, Lecture notes-Monograph Series, Institute of Mathematical Statistics.
Flodén, M.: 2008, A note on the accuracy of Markov-chain approximations to highly persistent AR(1) processes, Economics Letters 99, 516–520.
Folland, G. B.: 1984, Real Analysis. Modern Techniques and Their Applications, Pure & Applied Mathematics, Wiley-Interscience.
Gagliardini, P., Gouriéroux, C. and Renault, E.: 2011, Efficient derivative pricing by the extended method of moments, Econometrica 79(4), 1181–1232.
Gallant, R. A. and Hong, H.: 2007, A statistical inquiry into the plausibility of recursive utility, Journal of Financial Econometrics 5, 523–559.
Ghosh, A., Julliard, C. and Taylor, A. P.: 2010, What is the consumption-CAPM missing? An information-theoretic framework for the analysis of asset pricing models, working paper.
Ghosh, J. K. and Ramamoorthi, R. V.: 2003, Bayesian Nonparametrics, Statistics, Springer.
Gouriéroux, C. and Monfort, A.: (1989) 1996, Statistique et Modèles Econométriques, Col. “Economie et Statistiques Avancées”, série ENSAE et CEPE, 2nd edn, Economica. Translated in English by Quang Vuong under the title “Statistics and Econometric Models”.
Goutis, C. and Casella, G.: 1999, Explaining the saddlepoint approximation, The American Statistician 53(3), 216–224.
Gregory, A. W., Lamarche, J.-F. and Smith, G. W.: 2002, Information-theoretic estimation of preference parameter: macroeconomic applications and simulation evidence, Journal of Econometrics 107, 213–233.
Grossman, S. J. and Shiller, R. J.: 1981, The determinants of the variability of stock market prices, American Economic Review 71(2), 222–227.
Guggenberger, P. and Smith, R. J.: 2005, Generalized empirical likelihood estimators and tests under partial, weak, and strong identification, Econometric Theory 21, 667–709.
Guvenen, F.: 2009, A parsimonious macroeconomic model for asset pricing, Econometrica 77(6), 1711–1750.
Hall, A. R.: 2005, Generalized Method of Moments, Advanced Texts in Econometrics, Oxford University Press.
Hall, P. and Horowitz, J. L.: 1996, Bootstrap critical values for tests based on generalized-method-of-moments estimators, Econometrica 64(4), 891–916.
Hansen, L. P.: 1982, Large sample properties of generalized method of moments estimators, Econometrica 50(4), 1029–1054.
Hansen, L. P. and Heaton, J. C.: 2008, Consumption strikes back? Measuring long-run risk, Journal of Political Economy 116(2), 260–302.
Hansen, L. P. and Singleton, K. J.: 1982, Generalized instrumental variables estimation of nonlinear rational expectations models, Econometrica 50(5), 1269–1286.
Hansen, L. P., Heaton, J. C., Lee, J. and Roussanov, N.: 2007, Handbook of Econometrics, Vol. 6A, North-Holland, chapter 61, pp. 3967–4056.
Hartigan, J. A.: 1983, Bayes Theory, Series in Statistics, Springer.
Hazewinkel, M. (ed.): 2002, The Online Encyclopaedia of Mathematics, Springer. Available at http://eom.springer.de.
Hiriart-Urruty, J.-B. and Lemaréchal, C.: (1993) 1996, Convex Analysis and Minimization Algorithms 1. Fundamentals, number 305 in Comprehensive Studies in Mathematics, Springer.
Holcblat, B.: 2011, An ESP decision-theoretic approach, Companion Paper to “Estimating Consumption-Based Asset Pricing Models. The ESP Approach”.
Imbens, G. W.: 1997, One-step estimators for over-identified generalized method of moments models, The Review of Economic Studies 64(3), 359–383.
Itô, K.: 1970, Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, Vol. 3, University of California Press, chapter “Poisson Point Processes Attached to Markov Processes”, pp. 225–239.
Jennrich, R. I.: 1969, Asymptotic properties of non-linear least squares estimators, The Annals of Mathematical Statistics 40(2), 633–643.
Jensen, J. L.: 1995, Saddlepoint Approximations, Oxford Statistical Science Series, Oxford University Press.
Jensen, J. L. and Wood, A. T.: 1998, Large deviation and other results for minimum contrast estimators, Annals of the Institute of Statistical Mathematics 50(4), 673–695.
Judd, K. L.: 1998, Numerical Methods in Economics, MIT Press.


Julliard, C. and Ghosh, A.: (2007) 2010, Can rare events explain the equity premium puzzle?, working paper.
Kallenberg, O.: (1997) 2002, Foundations of Modern Probability, Probability and Its Applications, second edn, Springer.
Karatzas, I. and Shreve, S. E.: (1988) 2005, Brownian Motion and Stochastic Calculus, Graduate Texts in Mathematics, second edn, Springer. Corrected eighth printing.
Kass, R. E., Tierney, L. and Kadane, J. B.: 1990, Bayesian and Likelihood Methods In Statistics and Econometrics: Essays in Honor of George A. Barnard, Amsterdam: North Holland, chapter “The Validity of posterior expansions based on Laplace’s Method”, pp. 473–498.
Kitamura, Y. and Stutzer, M.: 1997, An information-theoretic alternative to generalized method of moments estimation, Econometrica 65(4), 861–874.
Kolmogorov, A. N.: (1933) 1936, Osnovnie poniatya teorii veroyatnostei, ONTI NKTP SSSR. Translated in English by Nathan Morrison under the title “Foundations of probability”.
Komunjer, I.: 2011, Global identification in nonlinear models with moment restrictions, Econometric Theory (forthcoming).
Kumagai, S.: 1980, An implicit function theorem: Comment, Journal of Optimization Theory and Applications 31(2), 285–288.
Kydland, F. E. and Prescott, E. C.: 1982, Time to build and aggregate fluctuations, Econometrica 50(6), 1345–1370.
LeCam, L.: 1953, On some asymptotic properties of maximum likelihood estimates and related Bayes’ estimates, University of California Publications in Statistics 1(11), 277–330.
Lucas, R. E.: 1976, Econometric policy evaluation: A critique, Carnegie Rochester Conference Series on Public Policy 1, 19–46.
Magnac, T. and Thesmar, D.: 2002, Identifying dynamic discrete decision processes, Econometrica 70(2), 801–816.
Mas-Colell, A., Whinston, M. D. and Green, J. R.: 1995, Microeconomic Theory, Oxford University Press.
Matthes, K., Kerstan, J. and Mecke, J.: (1974) 1978, Infinitely Divisible Point Processes, Probability Mathematical Statistics Monograph, Wiley.
Mehra, R. and Prescott, E. C.: 1985, The equity premium: A puzzle, Journal of Monetary Economics 15, 145–161.
Monfort, A.: (1980) 1996, Cours de Probabilités, “Economie et statistiques avancées”, ENSAE et CEPE, third edn, Economica.
Nakamura, E., Steinsson, J., Barro, R. and Ursúa, J.: 2010, Crises and recoveries in an empirical model of consumption disasters, working paper.
Neely, C. J., Roy, A. and Whiteman, C.: 2001, Risk aversion versus intertemporal substitution: A case study of identification failure in the intertemporal consumption CAPM, Journal of Business and Economic Statistics 19(4), 395–403.
Newey, W. K. and McFadden, D. L.: 1994, Handbook of Econometrics, Vol. 4, Elsevier Science Publishers, chapter “Large Sample Estimation and Hypothesis Testing”, pp. 2113–2247.
Newey, W. K. and Smith, R. J.: 2004, Higher order properties of GMM and generalized empirical likelihood estimators, Econometrica 72, 219–255.
Neyman, J.: 1977, Frequentist probability and frequentist statistics, Synthese 36(1), 97–131.


Otsu, T.: 2006, Generalized empirical likelihood inference for nonlinear and time series models under weak identification, Econometric Theory 22, 513–527.
Parker, J. A. and Julliard, C.: 2005, Consumption risk and the cross section of expected returns, Journal of Political Economy 113(1), 185–222.
Radner, R.: 1972, Existence of equilibrium of plans, prices, and price expectations in a sequence of markets, Econometrica 40(2), 289–303.
Robert, C. P.: (1994) 2007, The Bayesian Choice, From Decision-Theoretic Foundations to Computational Implementation, Springer Texts in Statistics, second edn, Springer.
Roberts, A. W. and Varberg, D. E.: 1973, Convex Functions, Pure and Applied Mathematics, Academic Press.
Rogers, L. and Williams, D.: (1979) 2008, Diffusions, Markov Processes and Martingales. Foundations (vol 1), Cambridge Mathematical Library, second edn, Cambridge University Press.
Ronchetti, E. and Trojani, F.: 2003, Saddlepoint approximations and test statistics for accurate inference in overidentified moment conditions models, Working paper, National Centre of Competence in Research, Financial Valuation and Risk Management.
Rothenberg, T. J.: 1971, Identification in parametric models, Econometrica 39(3), 577–591.
Rust, J.: 1994, Handbook of Econometrics, Vol. 4, Elsevier Science, chapter 51 “Structural Estimation of Markov Decision Processes”, pp. 3082–3143.
Saporta, G.: (1990) 2006, Probabilités, Analyse des données et Statistique, second edn, Technip.
Savage, L. J.: (1954) 1972, The Foundations of Statistics, Dover.
Schennach, S. M.: 2005, Bayesian exponentially tilted empirical likelihood, Biometrika 92(1), 31–46.
Schervish, M. J.: (1995) 1997, Theory of Statistics, Statistics, Springer.
Schwartz, L.: (1950-1951) 2010, Théorie des distributions, third edn, Hermann.
Schwartz, L.: (1955) 1998, Méthodes mathématiques pour les sciences physiques, Hermann. Ecrit avec le concours de Denise Huet. Published in English under the title “Mathematics For the Physical Sciences”.
Seidenfeld, T.: 1992, R.A. Fisher’s fiducial argument and Bayes’ theorem, Statistical Science 7(3), 358–368.
Sims, C. A.: 2007a, Bayesian methods in applied econometrics, or, why econometrics should always and everywhere be Bayesian, Discussion paper.
Sims, C. A.: 2007b, Thinking about instrumental variables, Discussion paper.
Singleton, K. J.: 2006, Empirical Dynamic Asset Pricing, Model Specification and Econometric Assessment, Princeton University Press.
Skiadas, C.: 2009, Asset Pricing Theory, Princeton Series in Finance, Princeton University Press.
Skovgaard, I. M.: 1985, Large deviation approximations for maximum likelihood estimators, Probability and Mathematical Statistics 6(2), 89–107.
Skovgaard, I. M.: 1990, On the density of minimum contrast estimators, The Annals of Statistics 18(2), 779–789.
Sowell, F.: 1996, Optimal tests of parameter variation in the generalized method of moments framework, Econometrica 64(5), 1085–1108.


Sowell, F.: 2007, The empirical saddlepoint approximation for GMM estimators, working paper, Tepper School of Business, Carnegie Mellon University.
Sowell, F.: 2009, The empirical saddlepoint likelihood estimator applied to two-step GMM, working paper, Tepper School of Business, Carnegie Mellon University.
Stock, J. H. and Wright, J. H.: 2000, GMM with weak identification, Econometrica 68(5), 1055–1096.
van Binsbergen, J. H., Fernández-Villaverde, J., Koijen, R. S. and Rubio-Ramírez, J. F.: 2010, The term structure of interest rates in a DSGE model with recursive preferences, working paper.
van der Vaart, A. W.: 1998, Asymptotic Statistics, Cambridge Series in Statistical and Probabilistic Mathematics, Cambridge University Press.
Villanacci, A., Carosi, L., Benevieri, P. and Battinelli, A.: 2002, Differential Topology and General Equilibrium with Complete and Incomplete Markets, Kluwer Academic Publishers.
Vissing-Jørgensen, A. and Attanasio, O. P.: 2003, Stock-market participation, intertemporal substitution, American Economic Review, AEA Papers and Proceedings 93(2).
von Mises, R.: (1928) 1957, Probability, Statistics and Truth, Dover.
Wald, A.: 1949, Note on the consistency of the maximum likelihood estimate, Annals of Mathematical Statistics 20(4), 595–601.


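A. Main notations and abbreviations

Notations | Description | Size
$\theta_0$ | Population parameter | m×1
$\{X_t\}_{t=1}^{T}$ | Data | p×1
$X$ | A generic observation | p×1
$\psi(.,.)$ | Moment function | m×1
$\psi_t(.)$ | $\psi(X_t,.)$ | m×1
$\hat\theta_T^{u}$ | ESP point estimate. See Definition 2.1 p.3. | m×1
$\tilde f_{\theta_T^*,sp}(.)$ | (Smooth) ESP intensity. | 1×1
$\tilde S_{u,T}$ | ESP confidence set |
$n(.)$ | Standard Gaussian density |
$\hat f_{\theta_T^*,sp}(.)$ | Rough ESP intensity. | 1×1
$\mathcal{B}(A)$ | Borel σ-algebra generated by the set $A$ |
$\#A$ | Number of elements in the set $A$ |
$N_T(.)$ | $\#\{\theta \in . : \frac{1}{T}\sum_{t=1}^{T}\psi_t(\theta) = 0_{m\times 1}\}$, i.e. point random-field. |
$F_T(.)$ | Intensity measure of $N_T(.)$. |
$\tilde F_T(.)$ | (Smooth) estimated intensity measure. |
$\Sigma_T(\theta)$ | Estimated tilted variance of a solution. |
$A^c$ | $\Theta \setminus A$, i.e. $\theta \in \Theta$ s.t. $\theta \notin A$ |
$\partial A$ | Boundary of $A$ |
$A^{-\eta}$ | $\{a \in A : \inf(a, \partial A \cap (\partial\Theta)^c) > \eta\}$ |
$\hat\Theta_T$ | Domain of definition of the rough ESP intensity |
$B_r(\dot\theta)$ | Ball of radius $r$ centered at $\dot\theta$; $B_r(\dot\theta) \cap \Theta$ denotes its restriction to $\Theta$ |
$\delta_{\dot\theta}(.)$ | Dirac distribution (point mass) at $\dot\theta$ |
$n$ | Number of solutions to the moment conditions |
$[[1,n]]$ | $\{1, 2, \ldots, n\}$ |
$A_T \to A$ | $A_T$ converges to $A$. See Definition 2.2 p.4. |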

B. Assumptions from Holcblat (2011)

In this appendix, we recall the assumptions from the core paper that are used in this paper.

B.1. The ESP estimand and estimator

Assumption 5. (a) $\{X_t\}_{t=1}^{\infty}$ is a sequence of random vectors of dimension $p$ on the complete probability sample space $(\Omega, \mathcal{E}, P)$; (b) Let the measurable space $(\Theta, \mathcal{B}(\Theta))$ be such that $\Theta \subset \mathbb{R}^m$ is compact and $\mathcal{B}(\Theta)$ denotes the Borel σ-algebra on $\Theta$; (c) The moment function $\psi : \mathbb{R}^p \times \Theta \to \mathbb{R}^m$ is $\mathcal{E} \otimes \mathcal{B}(\Theta)/\mathcal{B}(\mathbb{R}^m)$-measurable, where $\mathcal{E} \otimes \mathcal{B}(\Theta)$ denotes the product σ-algebra; (d) For the sample size at hand $T$, the expectation of the number of solutions to the empirical moment conditions is finite, i.e. $\sum_{n=1}^{\infty} n\, p_{n,T} < \infty$ a.s., where $p_{n,T}$ is the probability of having $n$ solutions to the empirical moment conditions.

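Assumption 6. For all $x \in \mathbb{R}^p$, $\theta \mapsto \psi(x, \theta)$ is continuously differentiable over $\Theta$.

Assumption 7. For any set $A \subset \Theta$, denote $A^{-\eta} := \{a \in A : \inf(a, \partial A \cap (\partial\Theta)^c) > \eta\}$ with $\eta > 0$, where $A^c$ and $\partial A$ respectively denote the complement of $A$ in $\Theta$ and its boundary. Define the set

\[
\check\Theta_T := \left\{ \theta \in \Theta : \left| \left[ \frac{1}{T}\sum_{t=1}^{T} \frac{\partial \psi_t(\theta)}{\partial \theta'} \right]^{-1} \left[ \frac{1}{T}\sum_{t=1}^{T} \psi_t(\theta)\psi_t(\theta)' \right] \left[ \frac{1}{T}\sum_{t=1}^{T} \frac{\partial \psi_t(\theta)'}{\partial \theta} \right]^{-1} \right|_{\det} = 0 \right\}.
\]

For the sample at hand, for all $\eta > 0$ small enough, the sets $\check\Theta_T$ and $\hat\Theta_T^{-\eta}$ do not have any common elements, i.e. $\check\Theta_T \cap \hat\Theta_T^{-\eta} = \emptyset$, where $\emptyset$ denotes the empty set.

B.2. Asymptotic behaviour of the ESP estimator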

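Assumption 8. (a) $\{X_t\}_{t=1}^{\infty}$ are i.i.d.; (b) In the parameter space $\Theta$, there exists a unique solution $\theta_0 \in \mathrm{int}(\Theta)$ to the moment conditions $\mathbb{E}[\psi(X, \theta)] = 0_{m \times 1}$; (c) $\mathbb{E}\left[\sup_{\theta \in \Theta} \|\psi(X, \theta)\|\right] < \infty$; (d) $\mathbb{E}\left[\sup_{\theta \in \Theta} \left\| \frac{\partial \psi(X, \theta)}{\partial \theta'} \right\|\right] < \infty$; (e) $\left| \mathbb{E}\left[ \frac{\partial \psi(X, \theta_0)}{\partial \theta'} \right] \right|_{\det} \neq 0$.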

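Assumption 9. Define the set

\[
\hat\Theta_\infty := \left\{ \theta \in \Theta : \exists \tau_\infty(\theta) \in \mathbb{R}^m \text{ s.t. }
\begin{array}{l}
\exists r > 0,\ \forall \tau \in B_r(\tau_\infty(\theta)),\ \mathbb{E}\left[ e^{\tau'\psi(X,\theta)} \right] < \infty \\
\mathbb{E}\left[ e^{\tau_\infty(\theta)'\psi(X,\theta)} \frac{\partial \psi(X,\theta)}{\partial \theta'} \right] < \infty \\
|\Sigma_\infty(\theta)|_{\det} \neq 0 \\
\mathbb{E}\left[ \psi(X,\theta)\, e^{\tau_\infty(\theta)'\psi(X,\theta)} \right] = 0_{m \times 1}
\end{array}
\right\},
\]

where

\[
\Sigma_\infty(\theta) := \left[ \mathbb{E}\, e^{\tau_\infty(\theta)'\psi(X,\theta)} \frac{\partial \psi(X,\theta)}{\partial \theta'} \right]^{-1} \mathbb{E}\left[ e^{\tau_\infty(\theta)'\psi(X,\theta)} \psi(X,\theta)\psi(X,\theta)' \right] \left[ \mathbb{E}\, e^{\tau_\infty(\theta)'\psi(X,\theta)} \frac{\partial \psi(X,\theta)'}{\partial \theta} \right]^{-1}.
\]

(a) There exists $\bar r > 0$ such that there exists $\dot T \in \mathbb{N}$, so that for all $T > \dot T$, $B_{\bar r}(\theta_0) \subset \hat\Theta_T$. Define a fixed $\eta \in\, ]0, \bar r[$; (b) For all $\dot\theta \in \hat\Theta_\infty^{-\eta}$, there exist $r_1, r_2 > 0$ such that for all $\tau \in B_{r_1}(\tau_\infty(\dot\theta))$, $\mathbb{E}\left[ \sup_{\theta \in B_{r_2}(\dot\theta)} \left\| \psi(X,\theta)\, e^{\tau'\psi(X,\theta)} \right\| \right] < \infty$.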

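Assumption 10. (a) For all $x \in \mathbb{R}^p$, the function $\theta \mapsto \psi(X, \theta)$ is four times continuously differentiable in a neighborhood of $\theta_0$ P-a.s.; (b) For all $k \in [[1,2]]$, there exists $r > 0$ such that $\mathbb{E}\left[ \sup_{\theta \in B_r(\theta_0)} \|D^k \psi(X,\theta)\| \right] < \infty$, where $D^k$ denotes the differential operator w.r.t. $\theta$ of order $k$; (c) For all $k \in [[1,4]]$, there exists $M > 0$ such that there exist $\dot T \in \mathbb{N}$ and $r > 0$, so that for all $T > \dot T$ and $\theta \in B_r(\theta_0)$, $\left\| D^k\, |\Sigma_T(\theta)|_{\det}^{-\frac{1}{2}} \right\| < M$; (d) For all $k \in [[1,4]]$, there exists $M > 0$ such that there exist $\dot T \in \mathbb{N}$ and $r > 0$, so that for all $T > \dot T$ and $\theta \in B_r(\theta_0)$, $\left\| D^k \ln\left[ \frac{1}{T}\sum_{t=1}^{T} e^{\tau_T(\theta)'\psi_t(\theta)} \right] \right\| < M$; (e) There exists $r > 0$ such that $\mathbb{E}\left[ \sup_{\theta \in B_r(\theta_0)} \left\| \psi(X,\theta)\psi(X,\theta)' \right\| \right] < \infty$.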

Assumption 11. Let $\eta > 0$ be defined as in Assumption 9(a). (a) For all $\varepsilon > 0$, there exist $\dot T \in \mathbb{N}$ and $M > 0$ such that $T > \dot T$ implies, for all $\theta \in \hat\Theta_\infty^{-\eta}$, $e^{-\varepsilon T}\, |\Sigma_T(\theta)|_{\det}^{-\frac{1}{2}} \leq M$; (b) For all $\dot\theta \in \hat\Theta_\infty^{-\eta}$, there exist $r_1, r_2 > 0$ such that $\mathbb{E}\left[ \sup_{(\tau,\theta) \in B_{r_1}(\tau_\infty(\dot\theta)) \times B_{r_2}(\dot\theta)} e^{\tau'\psi(X,\theta)} \right] < \infty$.

B.3. Case with multiple solutions to the moment conditions

We adapt the assumptions of the previous section to allow for multiple solutions to the moment conditions. Assumptions 8(b)(e) become the following.

Assumption 12. Denote by $[[1,n]]$ the integers in $[1,n]$. (b') In the parameter space $\Theta$, there exist multiple solutions, $\{\theta_0^{(i)}\}_{i=1}^{n}$ with $n$ the number of solutions (footnote 7), to the moment conditions $\mathbb{E}[\psi(X,\theta)] = 0_{m \times 1}$ such that for all $i \in [[1,n]]$, $\theta_0^{(i)} \in \mathrm{int}(\Theta)$; (e') For all $i \in [[1,n]]$, $\left| \mathbb{E}\left[ \frac{\partial \psi(X, \theta_0^{(i)})}{\partial \theta'} \right] \right|_{\det} \neq 0$.

Assumption 9(a) becomes the following.

Assumption 13. (a') For all $i \in [[1,n]]$, there exists $\bar r^{(i)} > 0$ such that there exists $\dot T^{(i)} \in \mathbb{N}$, so that for all $T > \dot T^{(i)}$, $B_{\bar r^{(i)}}(\theta_0^{(i)}) \subset \hat\Theta_T$.

Assumption 10 becomes the following.

Assumption 14. For all $\theta_0^{(i)}$ with $i \in [[1,n]]$, Assumptions 10(a)-(e) are satisfied with $\theta_0$ replaced by $\theta_0^{(i)}$.

C. Proofs

C.1. Proof of Proposition 2.1

i) By Proposition 4.4 and Assumption 1(a) respectively, $\tilde f_{\theta_T^*,sp}(.)$ and $u(.,.)$ are continuous over compact sets. Thus, by the Lebesgue dominated convergence theorem, $\theta_e \mapsto \int_\Theta u(\theta_e, \theta)\tilde f_{\theta_T^*,sp}(\theta)d\theta$ is continuous over $\Theta$. ii) By i) and Lemma C.1, apply Lemma 2 from Jennrich (1969). □

Lemma C.1. Under Assumptions 7 and 1(a), $(\omega, \theta_e) \mapsto \int_\Theta u(\theta_e, \theta)\tilde f_{\theta_T^*,sp}(\theta)d\theta$ is $\mathcal{E} \otimes \mathcal{B}(\Theta)/\mathcal{B}(\mathbb{R})$-measurable.

Proof. Apply a standard preliminary result to the Fubini theorem (e.g. Lemma 1.26 p.14 in Kallenberg, 2002). □

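Footnote 7: In accordance with Assumption 5(d), the number of solutions is unbounded but finite.

C.2. Proof of Proposition 2.2

i) By Lemma C.2 below, for all $\varepsilon > 0$ there exists an open cover of $\Theta$, $\bigcup_{\dot\theta \in \Theta} B_{r_{\dot\theta}}(\dot\theta)$, such that, for each $\dot\theta \in \Theta$,

\[
\begin{cases}
r_{\dot\theta} > 0 \\
\sup_{\theta_e \in B_{r_{\dot\theta}}(\dot\theta)} \left| \int_\Theta u(\theta_e, \theta)\tilde f_{\theta_T^*,sp}(\theta)d\theta - u(\theta_e, \theta_0) \right| < \varepsilon \text{ for } T \text{ big enough.}
\end{cases}
\]

Now, any open cover of a compact set contains a finite subcover. Therefore, by Assumption 5(b), there exist $\{\dot\theta_k\}_{k=1}^{K} \in \Theta^K$, $\{T_k\}_{k=1}^{K} \in \mathbb{N}^K$ and $\{r_k\}_{k=1}^{K} \in\, ]0, \infty[^K$ such that

\[
\begin{cases}
\Theta = \bigcup_{k=1}^{K} B_{r_k}(\dot\theta_k) \\
\sup_{\theta_e \in B_{r_k}(\dot\theta_k)} \left| \int_\Theta u(\theta_e, \theta)\tilde f_{\theta_T^*,sp}(\theta)d\theta - u(\theta_e, \theta_0) \right| < \varepsilon \text{ for } T > T_k.
\end{cases}
\]

Thus for $T > \max_{k \in [[1,K]]} T_k$, $\sup_{\theta_e \in \Theta} \left| \int_\Theta u(\theta_e, \theta)\tilde f_{\theta_T^*,sp}(\theta)d\theta - u(\theta_e, \theta_0) \right| < \varepsilon$.

ii) Since by Assumptions 1(b)(c) $u(\theta_0, \theta_0) > u(\theta_e, \theta_0)$ for all $\theta_e \in \Theta$, it follows from iii). □

Lemma C.2. Under Assumptions 5-11 and 1(a), for all $\dot\theta_e \in \Theta$ and for all $\varepsilon > 0$, there exist $r > 0$ and $\dot T \in \mathbb{N}$ such that for all $\theta_e \in B_r(\dot\theta_e)$ and for all $T > \dot T$,

\[
\left| \int_\Theta u(\theta_e, \theta)\tilde f_{\theta_T^*,sp}(\theta)d\theta - u(\theta_e, \theta_0) \right| \leq \varepsilon.
\]

Proof. For a fixed $\dot\theta_e \in \Theta$, by the triangle inequality, $\forall r > 0$, $\forall \theta_e \in B_r(\dot\theta_e)$,

\[
\begin{aligned}
\left| \int_\Theta u(\theta_e, \theta)\tilde f_{\theta_T^*,sp}(\theta)d\theta - u(\theta_e, \theta_0) \right|
\leq{} & \left| \int_\Theta u(\theta_e, \theta)\tilde f_{\theta_T^*,sp}(\theta)d\theta - \int_\Theta u(\dot\theta_e, \theta)\tilde f_{\theta_T^*,sp}(\theta)d\theta \right| \\
& + \left| \int_\Theta u(\dot\theta_e, \theta)\tilde f_{\theta_T^*,sp}(\theta)d\theta - u(\dot\theta_e, \theta_0) \right| \\
& + \left| u(\dot\theta_e, \theta_0) - u(\theta_e, \theta_0) \right|.
\end{aligned}
\]

It remains to prove that for all $\varepsilon > 0$, by choosing $r$ small enough and $T$ big enough, each of the three terms can be made smaller than $\frac{\varepsilon}{3}$. By the uniform continuity of $u(.,.)$ (by the Heine-Cantor theorem) and Theorem 5.1 from the core paper, the first term is smaller than $\frac{\varepsilon}{3}$ for $r$ small enough. For $T$ big enough, the second term is smaller than $\frac{\varepsilon}{3}$ by Theorem 5.1 from the core paper. For $r$ small enough, the third term is smaller than $\frac{\varepsilon}{3}$. □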
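The bound on the first term can be spelled out as follows. This is only a sketch of one way to read the step: it assumes that the ESP intensity is nonnegative and that its total mass $\int_\Theta \tilde f_{\theta_T^*,sp}(\theta)d\theta$ is bounded by some constant $C > 0$ for $T$ big enough. The constant $C$ is introduced here for illustration; that Theorem 5.1 of the core paper delivers such a bound is our reading, not a statement quoted from it. By uniform continuity of $u(.,.)$, choose $r$ such that $\sup_{\theta \in \Theta} |u(\theta_e, \theta) - u(\dot\theta_e, \theta)| \leq \frac{\varepsilon}{3C}$ for all $\theta_e \in B_r(\dot\theta_e)$. Then

\[
\left| \int_\Theta \left[ u(\theta_e, \theta) - u(\dot\theta_e, \theta) \right] \tilde f_{\theta_T^*,sp}(\theta)d\theta \right|
\leq \sup_{\theta \in \Theta} \left| u(\theta_e, \theta) - u(\dot\theta_e, \theta) \right| \int_\Theta \tilde f_{\theta_T^*,sp}(\theta)d\theta
\leq \frac{\varepsilon}{3C}\, C = \frac{\varepsilon}{3}.
\]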

C.3. Proof of Proposition 2.3

Notations for this proof. $\forall \theta_e \in \Theta$,

\[
\begin{aligned}
h_T(\theta_e) &:= \frac{1}{K_T} \int_\Theta u(\theta_e, \theta)\tilde f_{\theta_T^*,sp}(\theta)d\theta \\
h(\theta_e) &:= u(\theta_e, \theta_0)
\end{aligned}
\tag{3}
\]

i) Existence of $k_{\alpha,T}$ follows from Lemma C.3 and C.5iii). Measurability of the ESP confidence set, $\{\theta_e : h_T(\theta_e) \geq k_{\alpha,T}\}$, follows from the $\mathcal{E} \otimes \mathcal{B}(\Theta)/\mathcal{B}(\mathbb{R})$-measurability of $h_T(.) - k_{\alpha,T}$ by Lemma C.5iv) and Lemma C.1. Thus there exists an ESP confidence set. Existence of an asymptotic ESP confidence set follows from the same arguments as in Lemma C.5iii). ii) Proof by contradiction. Assume that $k_{\alpha,T}$ does not converge to $k_{\alpha,\infty}$ as $T \to \infty$. Then there exist $\varepsilon > 0$ and a subsequence $\{k_{\alpha,\beta_1(T)}\}_{T \geq 1}$ such that $|k_{\alpha,\beta_1(T)} - k_{\alpha,\infty}| > \varepsilon$. By Lemma C.4 and the Bolzano-Weierstrass theorem, there exists a converging subsequence $\{k_{\alpha,\beta_2 \circ \beta_1(T)}\}_{T \geq 1}$ of the sequence $\{k_{\alpha,\beta_1(T)}\}_{T \geq 1}$. Distinguish the two cases $\lim_{T \to \infty} k_{\alpha,\beta_2 \circ \beta_1(T)} > k_{\alpha,\infty}$ and $k_{\alpha,\infty} > \lim_{T \to \infty} k_{\alpha,\beta_2 \circ \beta_1(T)}$. Both lead to a contradiction. □

Lemma C.3. Under Assumptions 5-7 and 1-2, for $\eta > 0$ small enough,

\[
K_T := \int_{\Theta^2} u(\theta_e, \theta)\tilde f_{\theta_T^*,sp}(\theta)\,d\theta\, d\theta_e \neq 0.
\]

Proof. This follows from Assumptions 1 and 2. □

Lemma C.4. Under Assumptions 5-7 and 1-2, P-a.s. i) $h_T(.) > 0$; ii) $\theta_e \mapsto h_T(\theta_e)$ is continuous; iii) $\int_\Theta h_T(\theta_e)d\theta_e = 1$.

Proof. i) By Assumption 1(d) and Assumption 7 respectively, $u(.,.)$ and $\tilde f_{\theta_T^*,sp}(.)$ are positive. ii) $u(.,.)$ and $\tilde f_{\theta_T^*,sp}(.)$ are bounded as continuous functions over a compact set. Apply the Lebesgue dominated convergence theorem. iii) Note that $K_T := \int_{\Theta^2} u(\theta_e, \theta)\tilde f_{\theta_T^*,sp}(\theta)\,d\theta\, d\theta_e$. □

Lemma C.5. For a fixed $T \in \mathbb{N}$, define $\mathbb{P}$ on the measurable space $(\Theta, \mathcal{B}(\Theta))$ such that $\forall B \in \mathcal{B}(\Theta)$,

\[
\mathbb{P}(B) := \int_B h_T(\theta_e)d\theta_e.
\]

Under Assumptions 5-7 and 1-2, P-a.s. i) $\mathbb{P}$ is a probability measure; ii) $\forall k > 0$, $k \mapsto \mathbb{P}(\{\theta_e \in \Theta : h_T(\theta_e) \geq k\})$ is a left-continuous decreasing function; iii) $k_{\alpha,T}$ exists; iv) $\omega \mapsto k_{\alpha,T}(\omega)$ is $\mathcal{E}/\mathcal{B}(\mathbb{R})$-measurable.

Proof. i) Immediate. ii) This comes from a standard continuity property of measures (e.g. Lemma 1.14 p.8 in Kallenberg, 2002). iii) Define

\[
\dot k := \sup_{k \in \mathbb{R}} \left\{ k : \mathbb{P}(\{\theta_e \in \Theta : h_T(\theta_e) \geq k\}) \geq 1 - \alpha \right\}.
\]

By Lemma C.4ii) $h_T(.)$ is bounded, thus by Lemma C.4iii) $\dot k < \infty$ exists. Therefore, by definition of a supremum and ii), the result follows. iv) For the same fixed $T$ used to define $\mathbb{P}$, define for this proof, $\forall \omega \in \Omega$, $\forall k > 0$,

\[
g(\omega, k) := \mathbb{P}(\{\theta_e \in \Theta : h_T(\theta_e) \geq k\}).
\]

By Lemma C.1 and a standard preliminary result to the Fubini theorem (e.g. Lemma 1.26 p.14 in Kallenberg, 2002), $\forall k > 0$, $g(., k)$ is $\mathcal{E}/\mathcal{B}([0,1])$-measurable. Adapt the proof of Lemma 2 in Jennrich (1969) to finish the proof. □
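To make the supremum in Lemma C.5iii) concrete, the following is a minimal numerical sketch, not code from the paper. It assumes that the normalized objective has already been evaluated on equally sized grid cells of Θ; the names `esp_threshold`, `confidence_cells`, `h_grid` and `cell` are hypothetical. Because the map from a level k to the mass of the set where the function is at least k is a decreasing step function on a grid, scanning the attained levels downwards and stopping at the first one whose mass reaches 1 − α returns the supremum.

```python
def esp_threshold(h_grid, cell, alpha):
    """Grid approximation of sup{k : mass{h >= k} >= 1 - alpha}.

    h_grid : values of a nonnegative function on equally sized grid cells
    cell   : volume of one grid cell
    alpha  : level in (0, 1)
    All names are hypothetical; this is an illustration, not the paper's code.
    """
    total = sum(h_grid) * cell
    if total <= 0:
        raise ValueError("h_grid must have positive total mass")
    # Normalise so the discretised function integrates to one (cf. Lemma C.4 iii).
    dens = sorted((v / total for v in h_grid), reverse=True)
    mass, i, n = 0.0, 0, len(dens)
    while i < n:
        k = dens[i]
        # Add every cell tied at level k, so that mass equals mass{h >= k}.
        while i < n and dens[i] == k:
            mass += k * cell
            i += 1
        # First level (scanning downwards) whose mass reaches 1 - alpha.
        if mass >= 1.0 - alpha - 1e-12:
            return k
    return dens[-1]


def confidence_cells(h_grid, cell, alpha):
    """Indices of the grid cells in {h >= k}, on the normalised scale."""
    total = sum(h_grid) * cell
    k = esp_threshold(h_grid, cell, alpha)
    return [j for j, v in enumerate(h_grid) if v / total >= k]


if __name__ == "__main__":
    h = [1, 2, 4, 2, 1]  # toy values on five cells of width 0.2
    print(esp_threshold(h, 0.2, 0.2))
    print(confidence_cells(h, 0.2, 0.2))
```

The small tolerance in the comparison only guards against floating-point rounding in the accumulated mass; on a finer grid the returned level approximates the threshold of the continuous problem under the stated assumptions.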


Lemma C.6. Under Assumptions 5-11 and 1-2, as $T \to \infty$,

\[
\sup_{\theta_e \in \Theta} \| h_T(\theta_e) - h(\theta_e) \| \to 0 \quad \text{P-a.s.}
\]

Proof. This follows from Proposition 2.1iii). □

Lemma C.7. Under Assumptions 5-11 and 1-2, $\{k_{\alpha,T}\}_{T \geq 1}$ is bounded P-a.s.

Proof. Proof by contradiction. Assume that $\{k_{\alpha,T}\}_{T \geq 1}$ is unbounded. Now, any real-valued sequence has a monotone subsequence (footnote 8). Then there exists a subsequence $k_{\alpha,\beta(T)} \to \pm\infty$. This is a contradiction, since $h_T(.)$ is bounded over the compact set $\Theta$ by Lemma C.4ii). □

Lemma C.8. Under Assumptions 5-11 and 1-2, for all $a > 0$, for all $\varepsilon > 0$, for $T$ big enough, for all $\dot\theta_e \in \Theta$,

\[
I_{\{\theta_e \in \Theta : h_T(\theta_e) > a + \varepsilon\}}(\dot\theta_e) \leq I_{\{\theta_e \in \Theta : h(\theta_e) > a\}}(\dot\theta_e).
\]

Proof. By Lemma C.6, for $T$ big enough, $\sup_{\theta_e \in \Theta} \| h_T(\theta_e) - h(\theta_e) \| < \varepsilon$, so that $h_T(\theta_e) > a + \varepsilon$ implies $h(\theta_e) > h_T(\theta_e) - \varepsilon > a$. Hence $\{\theta_e \in \Theta : h_T(\theta_e) > a + \varepsilon\} \subset \{\theta_e \in \Theta : h(\theta_e) > a\}$. □

C.4. Proof of Proposition 2.4

i) A proof by contradiction is immediate. ii) Apply the definition of Dirac distributions.

C.5. Proof of Proposition 2.5

i) By Proposition 4.4i) from the core paper, apply Lemma 2 from Jennrich (1969). ii) It follows from Corollary 2 of the core paper.

C.6. Proof of Proposition 2.6

i) Adapt the proof of Proposition 2.3i). ii) By Corollary 2 of the core paper, a proof by contradiction is immediate.

C.7. Proof of Proposition 3.1

It is definition-chasing.

C.8. Proof of Proposition 3.2

i) Write $d_T(.)$ with the help of indicator functions. ii) By Corollary ??, a proof by contradiction is immediate.

Footnote 8: Let $\{u_n\}_{n \geq 1}$ be a real-valued sequence. Define $E := \{n \in \mathbb{N} : \forall q > n,\ u_q \geq u_n\}$. If $\#E = \infty$, any infinite subset of $E$ corresponds to an increasing subsequence. If $\#E < \infty$, $\exists n_1 \in \mathbb{N}$ s.t. $\forall n > n_1$, $\exists q > n$ with $u_q < u_n$. Thus, we can recursively define a strictly decreasing subsequence. Consequently, in both cases the result holds.
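The set E used in footnote 8 can be computed on a finite list as a small illustration. This is a hedged sketch only: the dichotomy in the footnote concerns infinite sequences, so on a finite list the code merely shows that the terms indexed by E are nondecreasing. The function name `later_terms_not_smaller` is hypothetical.

```python
def later_terms_not_smaller(u):
    """Indices n such that every later term satisfies u[q] >= u[n].

    Finite-list analogue of the set E in footnote 8 (hypothetical helper,
    not from the paper). The last index always qualifies, since it has no
    later terms.
    """
    indices = []
    running_min = float("inf")  # minimum of the terms strictly after position n
    for n in range(len(u) - 1, -1, -1):
        if u[n] <= running_min:
            indices.append(n)
        running_min = min(running_min, u[n])
    indices.reverse()
    return indices


if __name__ == "__main__":
    u = [3, 1, 4, 1, 5, 9, 2, 6]
    E = later_terms_not_smaller(u)
    print(E, [u[n] for n in E])
```

Reading the selected terms in index order gives a nondecreasing subsequence, which mirrors the first case of the footnote.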


C.9. Proof of Proposition 3.3

Adapt the proof of Proposition 3.1.

C.10. Proof of Proposition 3.4

Adapt the proof of Proposition 3.2.
