Computing dynamic optimal mechanisms when hidden types are Markov and controlled by hidden actions

Kenichi Fukushima (University of Wisconsin, Madison)
Yuichiro Waki (University of Queensland)

September 25, 2011

This note documents how the main theoretical results in Fukushima and Waki (2011) extend to a richer setting where the agent can influence the evolution of his hidden type $\theta_t$ through a hidden action $y_t$. An example of such a setting is one where $\theta_t$ represents a hidden stock of wealth or human capital and $y_t$ is a hidden investment. The notation follows Fukushima and Waki (2011) unless otherwise indicated.

In each period the agent draws a type $\theta_t \in \Theta$ and sends a report $r_t \in \Theta$ to the planner. The planner chooses an outcome $x_t \in X$ and recommends an action $y_t \in Y$ given the agent's history of reports. The agent then chooses an action $y_t' \in Y$ which may or may not equal $y_t$. We assume $Y$ is a finite set with cardinality $M$. If the agent's current type is $\theta_t$ and he chooses action $y_t$, his next period type $\theta_{t+1}$ is drawn from the density $\pi(\cdot|\theta_t, y_t) > 0$. The initial distribution is $\pi(\cdot|\theta_{-1}, y_{-1})$, where $(\theta_{-1}, y_{-1})$ is publicly known. We let $\mathcal{Y}$ denote the set of function sequences $y = \{y_t\}_{t=0}^{\infty}$, $y_t : \Theta^{t+1} \to Y$ for each $t$, and write
$$\Pr(\theta^t \mid \theta_{-1}, y_{-1}, y) = \pi(\theta_t \mid \theta_{t-1}, y_{t-1}(\theta^{t-1})) \times \cdots \times \pi(\theta_1 \mid \theta_0, y_0(\theta_0)) \times \pi(\theta_0 \mid \theta_{-1}, y_{-1}).$$
We also let $y|\theta^{t-1} = \{y_{t+s}(\theta^{t-1}, \cdot)\}_{s=0}^{\infty}$ denote the continuation of $y$ after $\theta^{t-1}$.

An allocation is then a sequence $(x, y) = \{x_t, y_t\}_{t=0}^{\infty}$, where $x_t : \Theta^{t+1} \to X$ and $y_t : \Theta^{t+1} \to Y$ for each $t$. We do not introduce randomizations to keep the notation simple, although doing so is quite straightforward and useful for computations (as it helps obtain convexity).

If allocation $(x, y)$ takes place, that is, if after each shock history $\theta^t$ the outcome $x_t(\theta^t)$ occurs and the agent chooses $y_t(\theta^t)$, the agent obtains lifetime utility
$$U(x, y; \theta_{-1}, y_{-1}) = \sum_{t=0}^{\infty} \sum_{\theta^t} \beta^t u(x_t(\theta^t), y_t(\theta^t); \theta_t) \Pr(\theta^t \mid \theta_{-1}, y_{-1}, y)$$
and the planner incurs cost
$$C(x, y; \theta_{-1}, y_{-1}) = \sum_{t=0}^{\infty} \sum_{\theta^t} q^t c(x_t(\theta^t)) \Pr(\theta^t \mid \theta_{-1}, y_{-1}, y).$$
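As a concrete illustration of how these sums can be evaluated, the sketch below (not part of the original note; the array names and the restriction to rules that depend only on the current type are assumptions made purely for illustration) computes a truncated version of $U$, exploiting the fact that for such rules the sum over histories collapses to a recursion on the marginal distribution of $\theta_t$.

```python
import numpy as np

# --- hypothetical primitives (assumptions made only for this illustration) ---
N, M, nx = 3, 2, 2            # |Theta|, |Y|, and a 2-point grid standing in for X
rng = np.random.default_rng(0)
pi = rng.random((M, N, N))    # pi[y, th_prev, th] = pi(th | th_prev, y) > 0
pi /= pi.sum(axis=2, keepdims=True)
beta = 0.9
u = rng.random((nx, M, N))    # u[x, y, th]

def lifetime_utility(x_rule, y_rule, th_init, y_init, T=500):
    """Truncated U(x, y; th_{-1}, y_{-1}) when x_t, y_t depend only on the current type.

    The double sum over t and histories theta^t, weighted by Pr(theta^t | ...),
    collapses in that case to a recursion on the marginal distribution of theta_t.
    """
    mu = pi[y_init, th_init, :].copy()                     # Pr(th_0 | th_{-1}, y_{-1})
    flow = np.array([u[x_rule[t], y_rule[t], t] for t in range(N)])
    P = np.array([pi[y_rule[t], t, :] for t in range(N)])  # P[th_prev, th]
    total = 0.0
    for t in range(T):
        total += beta**t * (mu @ flow)
        mu = mu @ P                                        # push the type distribution forward
    return total

print(lifetime_utility(np.zeros(N, int), np.zeros(N, int), th_init=0, y_init=0))
```

The planner's cost $C$ can be evaluated in exactly the same way, with $c$ and $q$ in place of $u$ and $\beta$.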

An allocation $(x, y)$ is therefore incentive compatible if
$$U(x, y; \theta_{-1}, y_{-1}) \ge U(x \circ r, y'; \theta_{-1}, y_{-1}) \quad \forall (r, y') \in R \times \mathcal{Y} \tag{1}$$
and satisfies promise keeping if
$$U(x, y; \theta_{-1}, y_{-1}) \ge U_0. \tag{2}$$
The planning problem starting from $(\theta_{-1}, y_{-1}, U_0)$ is to minimize $C(x, y; \theta_{-1}, y_{-1})$ by choice of $(x, y)$ subject to incentive compatibility and promise keeping.

We have the following analog of Lemma 1:

Lemma A1. An allocation $(x, y)$ is incentive compatible if and only if
$$u(x_t(\theta^t), y_t(\theta^t); \theta_t) + \beta U_{t+1}(\theta^t; \theta_t, y_t(\theta^t)) \ge u(x_t(\theta^{t-1}, \theta_t'), y_t'; \theta_t) + \beta U_{t+1}(\theta^{t-1}, \theta_t'; \theta_t, y_t') \tag{3}$$
for all $t$, $\theta^{t-1}$, $\theta_t$, $\theta_t'$, and $y_t'$, where
$$U_t(\theta^{t-1}; \theta_-, y_-) = \sum_{s=t}^{\infty} \sum_{\theta_t^s} \beta^{s-t} u(x_s(\theta^{t-1}, \theta_t^s), y_s(\theta^{t-1}, \theta_t^s); \theta_s) \Pr(\theta_t^s \mid \theta_-, y_-, y|\theta^{t-1}).$$

Proof. The only if part is clear. So let $(x, y)$ satisfy (3) and fix $(r, y') \in R \times \mathcal{Y}$. For each $t$, define $r|^t$ and $y'|^t$ by $(r|_s^t(\theta^s), y'|_s^t(\theta^s)) = (r_s(\theta^s), y_s'(\theta^s))$ for all $s \le t$ and $\theta^s$, and $(r|_s^t(\theta^s), y'|_s^t(\theta^s)) = (\theta_s, y_s(r^t(\theta^t), \theta_{t+1}^s))$ for all $s \ge t+1$ and $\theta^s$. That is, $(r|^t, y'|^t)$ follows $(r, y')$ until period $t$ and then reverts back to truth-telling and obedience from $t+1$. Applying (3) inductively we have $U(x, y; \theta_{-1}, y_{-1}) \ge U(x \circ r|^0, y'|^0; \theta_{-1}, y_{-1}) \ge U(x \circ r|^1, y'|^1; \theta_{-1}, y_{-1}) \ge \cdots \ge U(x \circ r|^t, y'|^t; \theta_{-1}, y_{-1})$ for any $t$. Since $u$ is bounded and $\beta \in (0, 1)$, this implies
$$U(x, y; \theta_{-1}, y_{-1}) \ge \lim_{t \to \infty} U(x \circ r|^t, y'|^t; \theta_{-1}, y_{-1}) = U(x \circ r, y'; \theta_{-1}, y_{-1}).$$
Hence $(x, y)$ is incentive compatible.

Notice here that the continuation utility profile $U_t(\theta^{t-1}; \cdot, \cdot)$ is a function of $(\theta_-, y_-) \in \Theta \times Y$, so a recursive formulation in the spirit of Fernandes and Phelan (2000) has $N \times M$ continuous state variables.

We say that $\pi$ has an order $K$ mixture representation if we can write
$$\pi(\theta \mid \theta_-, y_-) = \sum_{k=1}^{K} p_k(\theta) w_k(\theta_-, y_-),$$
where $p : \Theta \to \mathbb{R}_+^K$ and $w : \Theta \times Y \to \mathbb{R}_+^K$ satisfy $\sum_{\theta \in \Theta} p_k(\theta) = 1$ for each $k$ and $\sum_{k=1}^{K} w_k(\theta_-, y_-) = 1$ for each $(\theta_-, y_-)$. Under this representation, we can define
$$a_t(\theta^{t-1}) = \sum_{\theta_t} \left[ u(x_t(\theta^t), y_t(\theta^t); \theta_t) + \beta \sum_{s=t+1}^{\infty} \sum_{\theta_{t+1}^s} \beta^{s-t-1} u(x_s(\theta^s), y_s(\theta^s); \theta_s) \Pr(\theta_{t+1}^s \mid \theta_t, y_t(\theta^t), y|\theta^t) \right] p(\theta_t) \tag{4}$$
and write
$$U_t(\theta^{t-1}; \cdot, \cdot) = \sum_{k=1}^{K} a_t^k(\theta^{t-1}) w_k(\cdot, \cdot).$$
This suggests that, by using $a_t$ instead of $U_t$ as an endogenous state variable, it should be possible to reduce the dimensionality from $N \times M$ to $K$.
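One can always obtain such a representation with $K = N$ by taking $p_k(\theta) = \mathbf{1}\{\theta = \theta^{(k)}\}$ and $w_k(\theta_-, y_-) = \pi(\theta^{(k)} \mid \theta_-, y_-)$; a representation with $K < N$ amounts to a nonnegative rank-$K$ factorization of $\pi$ viewed as an $N \times NM$ matrix. The sketch below (illustrative only; the array layout is an assumption) constructs the trivial order-$N$ representation and verifies it.

```python
import numpy as np

# hypothetical transition array: pi[y, th_prev, th] = pi(th | th_prev, y)
N, M = 3, 2
rng = np.random.default_rng(1)
pi = rng.random((M, N, N)) + 0.1
pi /= pi.sum(axis=2, keepdims=True)

# Trivial order-N mixture representation:
#   p_k(th)      = 1{th == k}          (each p_k sums to 1 over th)
#   w_k(th_, y_) = pi(k | th_, y_)     (w(th_, y_) sums to 1 over k)
K = N
p = np.eye(N, K)                       # p[th, k]
w = np.transpose(pi, (1, 0, 2))        # w[th_prev, y, k] = pi(k | th_prev, y)

# verify: pi(th | th_, y_) == sum_k p_k(th) * w_k(th_, y_)
recon = np.einsum('tk,pyk->ypt', p, w)
assert np.allclose(recon, pi)
print("order-N mixture representation verified")

# a representation with K < N, when one exists, can be sought via nonnegative
# matrix factorization of the N x (N*M) matrix pi, followed by renormalization
```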

Let us now write $a_t(\theta^{t-1}; x, y)$ to describe the mapping from $(x, y)$ to $a_t(\theta^{t-1})$ defined by (4). Let us also write $a_0(x, y) = a_0(\theta^{-1}; x, y)$, as this is independent of $\theta^{-1}$. We then define the auxiliary planning problem starting from $(\theta_{-1}, y_{-1}, a_0)$ as the problem of choosing $(x, y)$ to minimize $C(x, y; \theta_{-1}, y_{-1})$ subject to incentive compatibility (1) and
$$a_0(x, y) = a_0. \tag{5}$$
We let $A^* \subset V^K$ denote the set of $a_0$'s for which the constraint set of this problem is non-empty (which is independent of $(\theta_{-1}, y_{-1})$) and let $J^* : \Theta \times Y \times A^* \to \mathbb{R}$ denote the optimal value function. If
$$a_0^* \in \arg\min_{a_0 \in A^*} J^*(\theta_{-1}, y_{-1}, a_0) \quad \text{s.t.} \quad a_0 \cdot w(\theta_{-1}, y_{-1}) \ge U_0,$$
then a solution to the auxiliary planning problem starting from $(\theta_{-1}, y_{-1}, a_0^*)$ is a solution to the planning problem starting from $(\theta_{-1}, y_{-1}, U_0)$.

The analog of Lemma 2 is:

Lemma A2. An allocation $(x, y)$ satisfies the constraints of the auxiliary planning problem (1) and (5) if and only if there exists $a = \{a_t\}_{t=0}^{\infty}$, $a_t : \Theta^t \to A^*$, such that $(x, y, a)$ satisfies
$$u(x_t(\theta^t), y_t(\theta^t); \theta_t) + \beta a_{t+1}(\theta^t) \cdot w(\theta_t, y_t(\theta^t)) \ge u(x_t(\theta^{t-1}, \theta_t'), y_t'; \theta_t) + \beta a_{t+1}(\theta^{t-1}, \theta_t') \cdot w(\theta_t, y_t') \tag{6}$$
$$a_t(\theta^{t-1}) = \sum_{\theta_t} \left[ u(x_t(\theta^t), y_t(\theta^t); \theta_t) + \beta a_{t+1}(\theta^t) \cdot w(\theta_t, y_t(\theta^t)) \right] p(\theta_t) \tag{7}$$
for all $t$, $\theta^t$, $\theta_t'$, $y_t'$, and $a_0(\theta^{-1}) = a_0$.

Proof. Virtually identical to that of Lemma 2.

The analog of the $B$ operator therefore maps $A \subset V^K$ into
$$B(A) = \{a \in V^K \mid \exists (x, y, a^+) \in F(a; A)\}$$
where $F(a; A)$ is the set of function triples $(x, y, a^+) : \Theta \to X \times Y \times A$ satisfying
$$u(x(\theta), y(\theta); \theta) + \beta a^+(\theta) \cdot w(\theta, y(\theta)) \ge u(x(\theta'), y'; \theta) + \beta a^+(\theta') \cdot w(\theta, y'), \quad \forall \theta, \theta', y'$$
$$a = \sum_{\theta} \left[ u(x(\theta), y(\theta); \theta) + \beta a^+(\theta) \cdot w(\theta, y(\theta)) \right] p(\theta).$$
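On discretized primitives, $B(A)$ can be computed for a finite set $A$ by enumeration: for every assignment $\theta \mapsto (x(\theta), y(\theta), a^+(\theta))$ with $a^+(\theta) \in A$ that satisfies the incentive constraints, the adding-up condition pins down the implied vector $a$. The sketch below is a minimal illustration with made-up primitives and a crude rounding step to keep the set small; the discretization scheme used in Fukushima and Waki (2011) may differ. It applies one such step and iterates it upward from a placeholder singleton, in the spirit of the inner approximation in Proposition A3 below.

```python
import itertools
import numpy as np

# --- hypothetical discretized primitives (illustration only) ---
N, M, K, nx = 2, 2, 2, 3      # |Theta|, |Y|, mixture order, grid size for X
rng = np.random.default_rng(2)
beta = 0.9
u = rng.random((nx, M, N))                  # u[x, y, th]
p = np.eye(N)                               # p[th, k]: trivial K = N mixture
w = rng.random((N, M, K))
w /= w.sum(axis=2, keepdims=True)           # w[th, y, k], sums to 1 over k

def apply_B(A):
    """One step of the B operator on a finite set A of K-vectors.

    For every assignment th -> (x(th), y(th), a+(th)) with a+(th) in A that
    satisfies the incentive constraints, keep the implied promise vector
      a = sum_th [u(x(th), y(th); th) + beta * a+(th) . w(th, y(th))] p(th).
    """
    A = [np.asarray(v) for v in A]
    choices = list(itertools.product(range(nx), range(M), range(len(A))))
    out = []
    for assign in itertools.product(choices, repeat=N):     # one triple per type th
        own = [u[assign[t][0], assign[t][1], t]
               + beta * A[assign[t][2]] @ w[t, assign[t][1], :] for t in range(N)]
        ok = all(own[t] >= u[assign[s][0], yp, t] + beta * A[assign[s][2]] @ w[t, yp, :]
                 for t in range(N) for s in range(N) for yp in range(M))
        if ok:
            out.append(sum(own[t] * p[t, :] for t in range(N)))
    return out

# iterate upward from a singleton (a placeholder standing in for the starting
# set suggested by Proposition A3 below)
A_n = [np.zeros(K)]
for n in range(2):
    A_n = apply_B(A_n) or A_n
    A_n = [np.array(v) for v in {tuple(np.round(a, 1)) for a in A_n}]   # coarse dedup
    print(f"iteration {n}: {len(A_n)} candidate promise vectors")
```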

At this point it is useful to construct a particular incentive compatible allocation as follows. First pick any $\bar{x} \in X$ and let $W : \Theta \to \mathbb{R}$ solve the Bellman equation
$$W(\theta) = \max_{y \in Y} \left\{ u(\bar{x}, y; \theta) + \beta \sum_{\theta_+} W(\theta_+) \pi(\theta_+ \mid \theta, y) \right\}.$$
For each $\theta$ let $\bar{y}(\theta)$ solve the right hand side problem. Then set $\bar{x}_t(\theta^t) = \bar{x}$ and $\bar{y}_t(\theta^t) = \bar{y}(\theta_t)$ for each $t$ and $\theta^t$. We then have for any $t$ and $\theta^{t-1}$:
$$U_t(\theta^{t-1}; \theta_-, y_-) = \sum_{s=t}^{\infty} \sum_{\theta_t^s} \beta^{s-t} u(\bar{x}, \bar{y}(\theta_s); \theta_s) \Pr(\theta_t^s \mid \theta_-, y_-, y|\theta^{t-1}) = \sum_{\theta} W(\theta) \pi(\theta \mid \theta_-, y_-).$$
So for each $t, \theta^{t-1}, \theta_t, \theta_t', y_t'$:
$$u(\bar{x}, \bar{y}(\theta_t); \theta_t) + \beta U_{t+1}(\theta^t; \theta_t, \bar{y}(\theta_t)) \ge u(\bar{x}, y_t'; \theta_t) + \beta U_{t+1}(\theta^{t-1}, \theta_t'; \theta_t, y_t').$$
It follows from Lemma A1 that $(\bar{x}, \bar{y})$ is incentive compatible.
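Since the right hand side of this Bellman equation is a $\beta$-contraction in $W$, it can be solved by standard value iteration. A minimal sketch (with hypothetical arrays `u_bar` and `pi` standing in for $u(\bar{x}, \cdot; \cdot)$ and the transition density):

```python
import numpy as np

# hypothetical primitives: u_bar[y, th] = u(x_bar, y; th), pi[y, th, th_plus]
N, M = 3, 2
rng = np.random.default_rng(3)
u_bar = rng.random((M, N))
pi = rng.random((M, N, N))
pi /= pi.sum(axis=2, keepdims=True)
beta = 0.9

W = np.zeros(N)
for _ in range(10_000):              # value iteration; the map is a beta-contraction
    Q = u_bar + beta * pi @ W        # Q[y, th] = u(x_bar, y; th) + beta * E[W | th, y]
    W_new = Q.max(axis=0)
    if np.max(np.abs(W_new - W)) < 1e-10:
        W = W_new
        break
    W = W_new
y_bar = Q.argmax(axis=0)             # the maximizing action y_bar(th)
print(W, y_bar)
```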

We have the following analog of Proposition 3:

Proposition A3. $A^*$ is a non-empty and compact set, and is the largest fixed point of $B$. If $A_0 \subset V^K$ is a compact set satisfying $A_0 \supset B(A_0) \supset A^*$ (one example being $A_0 = V^K$) then $B^n(A_0)$ is decreasing in $n$ and $\cap_{n=0}^{\infty} B^n(A_0) = A^*$. If $A_0 \subset V^K$ satisfies $A_0 \subset B(A_0) \subset A^*$ (one example being $A_0 = \{a_0(\bar{x}, \bar{y})\}$), then $B^n(A_0)$ is increasing in $n$ and $\mathrm{cl}(\cup_{n=0}^{\infty} B^n(A_0)) = A^*$.

Proof. The analogs of Lemmas 5-8 follow from virtually identical arguments. Thus: (i) $A \subset V^K$, $A \subset B(A) \Longrightarrow B(A) \subset A^*$, (ii) $B(A^*) = A^*$, (iii) $A \subset A' \subset V^K \Longrightarrow B(A) \subset B(A')$, and (iv) $A$ is compact $\Longrightarrow B(A)$ is compact. Similarly for the first two parts of the proposition.

To prove the final part of the proposition, suppose $A_0 \subset B(A_0) \subset A^*$. Then from (ii), (iii), and the compactness of $A^*$, we know that $B^n(A_0)$ is increasing and $\mathrm{cl}(\cup_{n=0}^{\infty} B^n(A_0)) \subset A^*$. To prove $A^* \subset \mathrm{cl}(\cup_{n=0}^{\infty} B^n(A_0))$, pick any $a \in A^*$. We construct a sequence in $\cup_{n=0}^{\infty} B^n(A_0)$ that converges to $a$. For this, first pick another $a' \in A_0 (\subset A^*)$. By the definition of $A^*$ there exist incentive compatible allocations $(x, y)$ and $(x', y')$ such that $a = a_0(x, y)$ and $a' = a_0(x', y')$. Next for each $n \ge 1$, do the following. Define $x^n = \{x_t^n\}_{t=0}^{\infty}$ by truncating $x$ after $n$ periods and appending $x'$. Thus for $t > n$:
$$(x_0^n(\theta^0), \ldots, x_t^n(\theta^t)) = (x_0(\theta^0), \ldots, x_n(\theta^n), x_0'(\theta_{n+1}), \ldots, x_{t-n-1}'(\theta_{n+1}^t)).$$
And let
$$(r^n, y^n) \in \arg\max_{(\check{r}, \check{y}) \in R \times \mathcal{Y}} U(x^n \circ \check{r}, \check{y}).$$
Here, since $(x', y')$ is incentive compatible, we can assume without loss that for $t > n$, $r_t^n(\theta^t) = \theta_t$ and $y_t^n(\theta^t) = y_{t-n-1}'(\theta_{n+1}^t)$. Finally, let $(\hat{x}^n, \hat{y}^n) = (x^n \circ r^n, y^n)$. By construction, $(\hat{x}^n, \hat{y}^n)$ is incentive compatible, $a_0(\hat{x}^n, \hat{y}^n) \ge a_0(x^n, y)$, and $a_{n+1}(\theta^n; \hat{x}^n, \hat{y}^n) = a'$ for all $\theta^n$.

We next show $a_0(\hat{x}^n, \hat{y}^n) \in \cup_{n=0}^{\infty} B^n(A_0)$ for all $n$. From the incentive compatibility of $(\hat{x}^n, \hat{y}^n)$ and $a_{n+1}(\theta^n; \hat{x}^n, \hat{y}^n) = a'$ we obtain by induction $a_0(\hat{x}^n, \hat{y}^n) \in B^{n+1}(\{a'\})$. This, (iii), and the fact that $B^n(A_0)$ is increasing in $n$ then imply the result.

To verify $a_0(\hat{x}^n, \hat{y}^n) \to a$ as $n \to \infty$, we pick an arbitrary subsequence $\{a_0(\hat{x}^{n'}, \hat{y}^{n'})\}_{n'=1}^{\infty}$ and show that it has a further subsequence $\{a_0(\hat{x}^{n''}, \hat{y}^{n''})\}_{n''=1}^{\infty}$ that converges to $a$. Applying to $(r^n, y^n)$ the argument we applied to $r^n$ in the proof of Proposition 3, we obtain a subindex $n''$ along which $(r^{n''}, y^{n''})$ converges to some $(\tilde{r}, \tilde{y})$. Also for each $t$ we have $x_t^{n''} = x_t$ for $n'' \ge t$. This together with the boundedness of $u$ implies $a_0(\hat{x}^{n''}, \hat{y}^{n''}) = a_0(x^{n''} \circ r^{n''}, y^{n''}) \to a_0(x \circ \tilde{r}, \tilde{y})$. Combining this with $a_0(\hat{x}^{n''}, \hat{y}^{n''}) \ge a_0(x^{n''}, y)$ and $a_0(x^{n''}, y) \to a$, we obtain $a_0(x \circ \tilde{r}, \tilde{y}) \ge a$. But the incentive compatibility of $(x, y)$ implies $a_0(x \circ \tilde{r}, \tilde{y}) \le a_0(x, y) = a$, so $a_0(x \circ \tilde{r}, \tilde{y}) = a$.

Now let $A_0 = \{a_0(\bar{x}, \bar{y})\}$. From the incentive compatibility of $(\bar{x}, \bar{y})$, (ii), and (iii), we have $B(A_0) \subset A^*$. To see $A_0 \subset B(A_0)$, observe that if we set $(x(\theta), y(\theta), a^+(\theta)) = (\bar{x}, \bar{y}(\theta), a_0(\bar{x}, \bar{y}))$ for each $\theta$ we have $(x, y, a^+) \in F(a_0(\bar{x}, \bar{y}); A_0)$.
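In a numerical implementation the monotone iterations in Proposition A3 are run on finite approximations of these sets, and a natural stopping rule is to terminate once successive iterates are close in Hausdorff distance. A small self-contained helper (an assumption about how one might organize the computation, not a procedure taken from the note):

```python
import numpy as np

def hausdorff(A, B):
    """Hausdorff distance between two finite sets of K-vectors, given as 2-D arrays."""
    A, B = np.atleast_2d(A), np.atleast_2d(B)
    D = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)   # pairwise distances
    return max(D.min(axis=1).max(), D.min(axis=0).max())

# e.g., stop the set iteration A_{n+1} = B(A_n) once hausdorff(A_next, A_curr) < 1e-4
print(hausdorff(np.array([[0.0, 0.0], [1.0, 0.0]]), np.array([[0.0, 0.1]])))
```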

The analog of the $T$ operator maps $J : \Theta \times Y \times A^* \to \mathbb{R}$ into $TJ : \Theta \times Y \times A^* \to \mathbb{R}$, defined as
$$TJ(\theta_-, y_-, a) = \inf_{(x, y, a^+) \in F(a; A^*)} \sum_{\theta} \left[ c(x(\theta)) + q J(\theta, y(\theta), a^+(\theta)) \right] \pi(\theta \mid \theta_-, y_-). \tag{8}$$
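On a finite grid approximation of $A^*$, the operator $T$ can be iterated directly. The sketch below is illustrative only: it uses made-up primitives, the trivial order-$N$ mixture, and a crude rule that snaps the implied promise vector onto its nearest grid point; the discretization used in Fukushima and Waki (2011) may differ. It performs one application of an approximate $T$.

```python
import itertools
import numpy as np

# --- hypothetical discretized primitives (illustration only) ---
N, M, K, nx = 2, 2, 2, 3
rng = np.random.default_rng(4)
beta, q = 0.9, 0.9
u = rng.random((nx, M, N))                       # u[x, y, th]
c = np.linspace(0.0, 1.0, nx)                    # c[x]
p = np.eye(N)                                    # p[th, k]: trivial K = N mixture
w = rng.random((N, M, K))
w /= w.sum(axis=2, keepdims=True)                # w[th, y, k]
pi = np.einsum('tk,pyk->pyt', p, w)              # pi[th_prev, y_prev, th]

A_grid = np.array(list(itertools.product(np.linspace(0.0, 5.0, 6), repeat=K)))
J = np.zeros((N, M, len(A_grid)))                # J[th_prev, y_prev, a-index]

def bellman_step(J):
    """One crude grid approximation of the T operator in (8)."""
    TJ = np.full_like(J, np.inf)
    choices = list(itertools.product(range(nx), range(M), range(len(A_grid))))
    for assign in itertools.product(choices, repeat=N):          # (x, y, a+) per type th
        own = [u[assign[t][0], assign[t][1], t]
               + beta * A_grid[assign[t][2]] @ w[t, assign[t][1], :] for t in range(N)]
        if not all(own[t] >= u[assign[s][0], yp, t]
                   + beta * A_grid[assign[s][2]] @ w[t, yp, :]
                   for t in range(N) for s in range(N) for yp in range(M)):
            continue                                             # incentive constraints fail
        a = sum(own[t] * p[t, :] for t in range(N))              # implied promise vector
        ia = np.argmin(np.linalg.norm(A_grid - a, axis=1))       # snap to nearest grid point
        for tp in range(N):
            for yp in range(M):
                cost = sum((c[assign[t][0]] + q * J[t, assign[t][1], assign[t][2]])
                           * pi[tp, yp, t] for t in range(N))
                TJ[tp, yp, ia] = min(TJ[tp, yp, ia], cost)
    return TJ

J1 = bellman_step(J)
print(np.isfinite(J1).mean())    # fraction of grid states reached at this resolution
```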

The analog of Proposition 4 is therefore:

Proposition A4. $J^*$ is a bounded lower semicontinuous function, and $\|T^n J - J^*\| \to 0$ as $n \to \infty$ for any bounded $J : \Theta \times Y \times A^* \to \mathbb{R}$. There exists a function $g^* : \Theta \times Y \times A^* \to (X \times Y \times A^*)^{\Theta}$ which attains the infimum on the right hand side of (8) when $J = J^*$, and for any such $g^*$ the allocation $(x^*, y^*)$ defined recursively by $(x_t^*(\theta^t), y_t^*(\theta^t), a_{t+1}^*(\theta^t)) = g^*(\theta_{t-1}, y_{t-1}^*(\theta^{t-1}), a_t^*(\theta^{t-1}))(\theta_t)$ solves the auxiliary planning problem starting from $(\theta_{-1}, y_{-1}, a_0^*(\theta^{-1}))$.

Proof. Virtually identical to that of Proposition 4.
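Once $J^*$ and a minimizing policy $g^*$ have been computed (say, stored as a lookup table on the grid), the recursion in Proposition A4 is straightforward to simulate. In the sketch below the policy table `g` is a random placeholder standing in for the actual minimizers, so the output only illustrates the mechanics of the recursion.

```python
import numpy as np

# hypothetical policy table: g[th_prev, y_prev, a_index, th] = (x, y, next a_index),
# e.g. recovered from the minimizers of an approximate T-operator iteration
N, M, nA = 2, 2, 36
rng = np.random.default_rng(5)
g = rng.integers(0, [3, M, nA], size=(N, M, nA, N, 3))           # placeholder values
pi = rng.random((N, M, N))
pi /= pi.sum(axis=2, keepdims=True)                              # pi[th_prev, y_prev, th]

def simulate(th_init, y_init, a0_index, T=10, seed=0):
    """Simulate (x*_t, y*_t, a*_{t+1}) along a random type path, following Prop. A4."""
    rng = np.random.default_rng(seed)
    th_prev, y_prev, ia = th_init, y_init, a0_index
    path = []
    for t in range(T):
        th = rng.choice(N, p=pi[th_prev, y_prev])   # draw th_t ~ pi(. | th_{t-1}, y_{t-1})
        x, y, ia = g[th_prev, y_prev, ia, th]       # apply g*(th_{t-1}, y_{t-1}, a*_t)(th_t)
        path.append((th, x, y))
        th_prev, y_prev = th, y
    return path

print(simulate(0, 0, 0))
```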

References

Fernandes, A., and C. Phelan (2000): "A Recursive Formulation for Repeated Agency with History Dependence," Journal of Economic Theory, 91(2), 223-247.

Fukushima, K., and Y. Waki (2011): "Computing Dynamic Optimal Mechanisms When Hidden Types Are Markov," Working paper.
