Three Steps Ahead

Yuval Heller Nuffield College & Department of Economics, Oxford

UCLA, January 2014

Yuval Heller (Oxford)

Three Steps Ahead

1 / 32

Introduction

Background

Evidence suggests that people deviate systematically from payoff-maximizing behavior; these biases have economic implications.
- In some cases, the biases cannot be attributed to complexity costs.
- There is substantial heterogeneity of the biases in the population.

Research approach:
- People base their choices on heuristics (rules of thumb).
- Different heuristics “compete” in a process of cultural learning.
- Understanding how biases survive these competitive forces can help us better understand the biases and their implications.


Limited Foresight

Stylized facts:
- People look only a few steps ahead in long games.
- Some subjects systematically look fewer steps ahead than others.

Examples: repeated Prisoner’s Dilemma (Selten & Stoecker, 1986), Centipede games (McKelvey & Palfrey, 1992), sequential bargaining (Johnson et al., 2002).

Related bias: limited iterative thinking in one-stage games (“level-k”).

Questions:
1. How do the naive agents survive?
2. Why is there no “arms race” for better foresight?


Research Objectives

1. Characterize a stable state in which all agents have short foresight abilities, and some agents look further ahead than others.
2. Under which additional assumptions is this stable state unique?
3. A novel explanation for cooperative behavior in long finite games.
   - All types have the same maximal payoff (unlike Kreps et al., 1982).


Model

Outline

1. Model
2. Stability Analysis
3. Characterization of a Nash Equilibrium
4. Evolutionary Stability
5. Uniqueness
6. Discussion


Overview

- A large population of agents interacts in the repeated Prisoner’s Dilemma.
- An agent’s type determines (1) foresight ability and (2) behavior.
- Foresight abilities are partially observable.
- More successful types become more frequent (payoff-monotonic dynamics): cultural learning or biological evolution.
  - E.g., replicator dynamics: the number of offspring is proportional to payoffs.
- Rarely, a few agents experiment with a new type (“mutants”).

Objective: characterize stable states of the population in the long run.
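A minimal Python sketch of payoff-monotonic dynamics, using a discrete-time (Euler) approximation of the replicator dynamics. The function name `replicator_step`, the step size `dt`, and the 2x2 Hawk-Dove-style payoff matrix are illustrative assumptions, not part of the model in the talk.

```python
def replicator_step(freqs, payoff, dt=0.1):
    """One Euler step of the replicator dynamics for a symmetric game.

    freqs[i]     : current frequency of type i (illustrative input)
    payoff[i][j] : payoff of type i when matched with type j
    """
    n = len(freqs)
    # Expected payoff of each type against the current population.
    fitness = [sum(payoff[i][j] * freqs[j] for j in range(n)) for i in range(n)]
    mean = sum(f * x for f, x in zip(fitness, freqs))
    # Types that earn more than the population average grow in frequency.
    new = [x + dt * x * (f - mean) for x, f in zip(freqs, fitness)]
    total = sum(new)
    return [x / total for x in new]


# Hypothetical Hawk-Dove-style payoffs: each type does better against the other type.
payoff = [[0.0, 3.0], [1.0, 2.0]]
freqs = [0.9, 0.1]
for _ in range(2000):
    freqs = replicator_step(freqs, payoff)
print(freqs)
```

In this toy example the population settles at the interior mix at which both types earn the same payoff, which is the kind of balanced heterogeneous state the talk looks for.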


Types and Populations

The type of each agent has two components:
- Foresight ability: {L1, L2, ..., Lk, ...}: how early the agent becomes aware of the final period of the game and its strategic implications.
- Behavior in the repeated Prisoner’s Dilemma.
  - In which situations the agent cooperates, and in which he defects.

State of the population: a distribution over the set of types.
- Incumbents: types with positive frequency.


Repeated Prisoner’s Dilemma

Agents are randomly matched and play the repeated Prisoner’s Dilemma.
- The game has a random length T (geometric distribution).
- At each round there is a continuation probability δ close to 1.

The payoffs at each stage are (A > 1; the row player’s payoff is listed first):

           C            D
C        A, A        0, A+1
D      A+1, 0        1, 1


Information Structure

An agent with ability Lk is informed about the realized length k rounds before the end.
- Horizon: the number of remaining stages.

Partial observability of abilities (à la Dekel, Ely & Yilankaya, 2007):
- Each agent observes the opponent’s ability with probability p.
- With probability 1 − p: a non-informative signal (the opponent is a stranger).

All signals are private.
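A small Python sketch of the timing described above, under stated assumptions: the horizon counts the current round (so the last round has horizon 1), and an Lk agent in a game shorter than k rounds is taken to be informed from round 1. The function names are hypothetical.

```python
import random


def sample_length(delta, rng):
    """Draw a game length T >= 1: after each round the game continues with probability delta."""
    t = 1
    while rng.random() < delta:
        t += 1
    return t


def horizon(total_rounds, current_round):
    """Remaining rounds including the current one (assumed convention: last round = 1)."""
    return total_rounds - current_round + 1


def informed_round(total_rounds, k):
    """First round (1-based) in which an L_k agent knows the realized length."""
    return max(1, total_rounds - k + 1)


def observes_opponent(p, rng):
    """True if the agent observes the opponent's ability; otherwise the opponent is a stranger."""
    return rng.random() < p


rng = random.Random(0)
T = sample_length(0.95, rng)
print(T, informed_round(T, 3), observes_opponent(0.5, rng))
```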


Stability Analysis


Static Stability Analysis

Payoff-monotonic population dynamics     | Static auxiliary game
-----------------------------------------+--------------------------------
Random matching in a single population   | Symmetric 2-player game
Feasible types                           | Feasible actions
State of the population                  | Mixed strategy
Necessary condition for stability        | Symmetric Nash equilibrium
Stable state in a broad set of dynamics  | Evolutionarily stable strategy

Stability is robust to the presence of sophisticated agents.

References: Nash (1950 thesis), Maynard-Smith & Price (1973), Taylor & Jonker (1978), Thomas (1985), Bomze (1986), Cressman (1990, 1997), Sandholm (2010), ...


Auxiliary Static game

Stage 0: Each player chooses an ability from {L1, L2, ..., Lk, ...}.
- At the end of stage 0: partial observability of types.

Stages 1-T:
- Each player decides whether to cooperate or defect at each stage.
- A player with ability Lk is informed about the realized length k rounds before the end.


Strategies in the Auxiliary Game

(Behavior) strategy σ = (µ, β):
- µ: distribution over abilities.
- β(I): probability of defection at each information set I (playing rule).

Each information set has 4 components:
1. Foresight ability of the player (Lk).
2. Horizon: how many rounds remain (if known).
3. The signal about the opponent’s ability.
4. Public history of actions in the previous rounds.

u(σ, σ′): the expected payoff of a player who follows strategy σ and faces an opponent who follows strategy σ′.


Characterization of a Nash Equilibrium


Definition (σ* = (µ*, b*))

Two abilities: $\mu^*(L_3) = \frac{1}{p\,(A-1)}$, $\mu^*(L_1) = 1 - \mu^*(L_3)$.

Deterministic simple behavior b*:
- Everyone plays perfect tit-for-tat at unknown horizons.
  - Defect iff players played different actions in the previous round.
- L1 agents: defect at the last round.
- L3 agents: defect at the last two rounds.
  - Horizon 3: perfect tit-for-tat against strangers & L1; defect otherwise.
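The playing rule b* can be sketched as a small Python function. This is only an illustration under assumptions not stated on the slide: perfect tit-for-tat cooperates in the first round, a horizon of `None` means the horizon is still unknown, and the argument names are hypothetical.

```python
def b_star(ability, horizon, opponent_signal, last_actions):
    """Return 'C' or 'D' under the playing rule b* for the incumbent abilities L1 and L3.

    ability         : 1 or 3, the player's own foresight ability
    horizon         : number of remaining rounds, or None if not yet known
    opponent_signal : observed ability of the opponent, or None for a stranger
    last_actions    : (own, opponent) actions in the previous round, or None in round 1
    """
    if ability not in (1, 3):
        raise ValueError("b* is sketched here only for abilities L1 and L3")

    def perfect_tit_for_tat():
        # Defect iff the players chose different actions in the previous round.
        if last_actions is None:
            return "C"  # assumed opening move
        return "D" if last_actions[0] != last_actions[1] else "C"

    # An L_k agent knows the horizon only once it is at most k.
    if horizon is None or horizon > ability:
        return perfect_tit_for_tat()
    if ability == 1:
        return "D"  # last round
    if horizon <= 2:
        return "D"  # L3 defects in the last two rounds
    # Horizon 3: cooperative rule against strangers and observed L1, defect otherwise.
    if opponent_signal is None or opponent_signal == 1:
        return perfect_tit_for_tat()
    return "D"
```

For example, `b_star(3, 3, 3, ("C", "C"))` returns `"D"`: an L3 agent who observes an L3 opponent starts defecting at horizon 3.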

Theorem: σ* is a symmetric Nash equilibrium for all $\frac{A}{(A-1)^2} < p < \frac{A-1}{A}$

(e.g., A = 10: 10/81 ≈ 12% < p < 90%).
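As a quick numerical check of the theorem's range and of the equilibrium frequency, here is a short Python sketch using exact fractions. The helper names `p_bounds` and `mu_star_L3` are hypothetical; the formulas are the ones stated in the definition and theorem.

```python
from fractions import Fraction


def p_bounds(A):
    """Bounds on the observation probability p: A/(A-1)^2 < p < (A-1)/A."""
    A = Fraction(A)
    return A / (A - 1) ** 2, (A - 1) / A


def mu_star_L3(A, p):
    """Equilibrium frequency of L3 agents: 1 / (p (A - 1))."""
    return 1 / (Fraction(p) * (Fraction(A) - 1))


low, high = p_bounds(10)
print(low, high)                        # 10/81 9/10
print(mu_star_L3(10, Fraction(1, 2)))   # 2/9
```

For A = 10 the lower bound evaluates to 10/81 (about 12.3%) and the upper bound to 9/10; at p = 1/2, two ninths of the population have ability L3.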


Intuition: why σ* is an equilibrium

If p is not too low, a heterogeneous population is stable:
- L1 fares better against an L3 opponent (“commitment” induces cooperation at horizon 3).
- L3 fares better against an L1 opponent (defects at horizon 2).
- “Hawk-dove”-like game: a unique frequency balances the payoffs.

If p is not too high, there is no “arms race” for higher abilities:
- The optimal play of L>3 agents is to mimic L3’s play.
- L2 is strictly dominated by L3 (worse payoff against L3, same against L1).


Evolutionary Stability


Evolutionarily Stable Strategy

Nash equilibria (1) allow arbitrary play off the equilibrium path, and (2) may be dynamically unstable.

Definition (Maynard-Smith & Price, 1973): A symmetric Nash equilibrium σ is an evolutionarily stable strategy (ESS) if for each other best reply σ′: u(σ, σ′) > u(σ′, σ′).

Interpretation: if σ is adopted by the population, it cannot be invaded by any alternative strategy (σ′) that is initially rare.
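The ESS condition can be illustrated on a small symmetric matrix game. The Python sketch below tests a candidate strategy against mutants on a grid of mixed strategies; it is a numerical illustration for a hypothetical 2x2 Hawk-Dove-style game, not a proof and not the repeated game analyzed in the talk.

```python
def u(x, y, M):
    """Expected payoff of mixed strategy x against mixed strategy y in matrix game M."""
    return sum(x[i] * M[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))


def is_ess_on_grid(sigma, M, steps=100, tol=1e-9):
    """Check the ESS conditions for sigma against two-action mutants on a grid."""
    for n in range(steps + 1):
        q = n / steps
        if abs(q - sigma[0]) < tol:
            continue  # skip the candidate itself
        mutant = (q, 1 - q)
        gain = u(sigma, sigma, M) - u(mutant, sigma, M)
        if gain < -tol:
            return False  # sigma is not a best reply to itself (not Nash)
        if abs(gain) <= tol and u(sigma, mutant, M) <= u(mutant, mutant, M) + tol:
            return False  # a best-reply mutant is not outperformed by sigma
    return True


# Hypothetical Hawk-Dove-style payoffs (row player's payoff).
M = [[0.0, 3.0], [1.0, 2.0]]
print(is_ess_on_grid((0.5, 0.5), M))  # the interior mix
print(is_ess_on_grid((1.0, 0.0), M))  # a pure strategy
```

In this toy game every mutant is a best reply to the interior mix, so stability rests entirely on the second condition, u(σ, σ′) > u(σ′, σ′).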


Limit ESS

ESSs almost never exist in repeated games due to equivalent strategies, which differ only off the equilibrium path. There is no ESS in the repeated Prisoner’s Dilemma (Lorberbaum, 1994).

Definition (Selten, 1983): σ is a limit ESS if it is a limit of ESSs of perturbed games when the perturbations converge to 0.

Perturbed game: a game with minimal probabilities of choosing each action a at each information set h.


Stability Result

Assumption: better abilities have increasing cognitive costs c(Lk).

Theorem: If c(L4) > c(L3), then σ* is a limit ESS (for all $\frac{A}{(A-1)^2} < p < \frac{A-1}{A}$).

Remark: σ* is stable for any converging sequence of perturbed games. Without c(Lk): the set of strategies similar to σ* in which L≥3 mimic L3’s behavior is evolutionarily stable (Thomas, 1985).

Question: Is σ* the only stable strategy?


“Folk-Theorem” Result

Proposition (all types and all rates of cooperation are stable): For any p > 0, Lk and r, if δ is sufficiently high, then there exists a limit ESS in which µ(Lk) = 1 and players cooperate with frequency close to r.

Sketch of proof:
- Lk vs. Lk: follow a cycle with cooperation frequency r when uninformed; defect otherwise.
- Lk vs. Lk′: a different cycle that yields more to Lk and less to Lk′.

Always defecting is a stable outcome for each p, δ (& unique if p = 0).

Is σ* unique in a plausible subset of stable strategies?


Uniqueness


Early-Niceness

Definition: A strategy is early-nice if each player cooperates when (1) the horizon is unknown or sufficiently large, and (2) no one has ever defected before.

Remarks:
1. Focus on “nice” incumbents; no restrictions on mutants.
2. Equivalent definition: efficiency + non-discrimination against mutants:
   - Efficient play at large horizons, also if one of the players “trembles” and chooses a different ability.
   - Motivation for efficiency: “secret handshake” (Robson, 1990).
3. Fits experimentally observed behavior (e.g., Selten & Stoecker, 1986).


Result

Theorem (Uniqueness of σ*): Let A > 3. Any early-nice limit ESS is realization equivalent to σ* (i.e., it may differ only off the equilibrium path).

Intuition. Let $L_k$ be the lowest incumbent ability.
- Everyone must defect during the last k rounds.
- If only $L_k$: “mutants” with ability $L_{k+1}$ invade.
- If $L_k$ & $L_{k+1}$: $L_k$ is outperformed.
- If $L_k$ & $L_{\geq k+3}$: unstable to invasions of abilities in between.
- If k > 1: “mutants” with ability $L_1$ invade.


Graphical Representation of Results


Discussion


Extensions

1. Having a far-sighted L∞ ability.
2. Allowing players to send false signals:
   - “Cheap talk”: always defecting is the unique outcome.
   - Results can be extended to a setup with costly lies:
     - At stage 0, a player chooses a true ability and a deception effort.
     - Efforts determine the probability of observing the opponent’s true ability.
3. A setup with several games: if games in which looking far ahead decreases efficiency (like Centipede, social dilemma games) are sufficiently frequent.


Related Literature (1)

Level-k evolution in one-shot games (Stahl, 1993; Stennek, 2000; Mohlin, 2012).
- My model: 0 < p < 1 leads to a qualitatively different result.

Uncertainty about the final period (Samuelson, 1987; Neyman, 1999) and limited foresight (Jehiel, 2001; Mengel, 2012) can induce cooperation in the repeated Prisoner’s Dilemma.
- My model: limited foresight is part of the result (not an assumption).


Related Literature (2)

Evolution of preferences (Güth & Yaari, 1992; Dekel et al., 2007): a high level of observability can lead to a homogeneous population of cooperative players.
- My model: moderate partial observability induces a heterogeneous population of cooperators and shirkers.

Other related papers: complex sequential problems (Geanakoplos & Gray, 1991), co-existence of sophisticated & naive agents (Crawford, 2003).


Question: Why an uncertain horizon that becomes certain when reaching foresight?

Interpretation: the “physical” interaction is finite; the horizon is infinite when the last period is not part of the strategic considerations.

“A key criterion that determines whether we should use a model with a finite or an infinite horizon is whether the last period enters explicitly into the players’ strategic considerations.” (Osborne & Rubinstein, 1994)

Similar results can be obtained in a model with a fixed length.


Question: Why do we interpret Lk as limited foresight?

Alternative “myopic” notion: L′k evaluates longer games as having horizon k.

How do bounded agents play long zero-sum games (e.g., chess)?
- Bounded minimax algorithm: look several steps ahead, and use a heuristic function to evaluate non-final positions.

In non-zero-sum repeated games: position = history.
- The myopic notion uses a constant evaluation.
- Evaluations should be history-dependent (& simple).
- In my model: the evaluation relies on the infinite-horizon benchmark.
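The bounded minimax idea mentioned above can be sketched in Python on a toy zero-sum game. The game (two players alternately remove 1 or 2 stones, and whoever takes the last stone wins) and the constant heuristic are illustrative assumptions, chosen only to show how a depth limit and a heuristic evaluation interact.

```python
def bounded_minimax(stones, max_to_move, depth, heuristic):
    """Depth-limited minimax value from the maximizer's point of view.

    Final positions get their true payoff (+1 if the maximizer won, -1 otherwise);
    non-final positions reached at depth 0 are scored by the heuristic function.
    """
    if stones == 0:
        # The previous mover took the last stone and won.
        return -1 if max_to_move else 1
    if depth == 0:
        return heuristic(stones, max_to_move)
    values = [
        bounded_minimax(stones - take, not max_to_move, depth - 1, heuristic)
        for take in (1, 2)
        if take <= stones
    ]
    return max(values) if max_to_move else min(values)


def constant(stones, max_to_move):
    """A 'myopic' evaluation: every non-final position looks the same."""
    return 0


print(bounded_minimax(4, True, 10, constant))  # deep search finds the win
print(bounded_minimax(4, True, 1, constant))   # shallow search sees only the heuristic
```

With a constant evaluation, a shallow search cannot distinguish between positions; a position-dependent heuristic is what makes limited lookahead informative.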


Conclusion

Summary

- Moderate partial observability induces a stable heterogeneous population of agents who look a few steps ahead and cooperate until the last few rounds.
- Everyone obtains the same maximal payoff.
- Efficient play at early stages implies uniqueness.
- Insights may be applicable to other biases (e.g., level-k).


(Informal) Motivation for Early Niceness:

1. 10 rounds of PD in the lab: most subjects defect only in the last few rounds (Selten & Stoecker, 1986; Andreoni & Miller, 1993; Cooper et al., 1996).
2. Tournaments of PD among algorithms (Axelrod, 1984; Wu & Axelrod, 1995).
3. Robson (1990): “secret-handshake mutants” can take a population from an inefficient ESS to an efficient ESS.
4. Early cooperation is the unique outcome in related setups:
   - Finite long repeated PD with a few “crazy” players (Kreps et al., 1982).
   - δ = 1 and small noise / complexity costs (Fudenberg & Maskin, 1990; Binmore & Samuelson, 1992).


Sketch of Proof 1/2 (Theorem 1)

L1 & L3 play the following reduced game (given b*); only payoffs at horizons 2-3 are presented, other payoffs are the same (the row player’s payoff is listed first):

              L1              L3
L1        2·A, 2·A        A, 2·A+1
L3        2·A+1, A        f(p), f(p)

f(p) is decreasing in p, and $f(p) < A \Leftrightarrow p > \frac{1}{A-1} \Leftrightarrow$ Hawk-Dove game.

µ*(L3): the unique frequency that balances the payoffs of L1 & L3.
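The payoff balance can be checked numerically. The slide does not spell out f(p); the Python sketch below assumes f(p) = A + 1 − p·(A − 1), which is what one obtains from b* when each L3 agent independently observes the opponent with probability p (payoff 1 at horizon 2, and an expected A − p·(A − 1) at horizon 3). Treat this formula as an assumption to be checked against the paper.

```python
from fractions import Fraction


def f(p, A):
    """Assumed L3-vs-L3 payoff over horizons 2-3: A + 1 - p (A - 1)."""
    return A + 1 - p * (A - 1)


def reduced_payoffs(mu3, p, A):
    """Expected reduced-game payoffs of L1 and L3 when a share mu3 of opponents is L3."""
    mu1 = 1 - mu3
    u_L1 = mu1 * 2 * A + mu3 * A
    u_L3 = mu1 * (2 * A + 1) + mu3 * f(p, A)
    return u_L1, u_L3


A, p = Fraction(10), Fraction(1, 2)
mu3 = 1 / (p * (A - 1))  # the frequency from the definition of sigma*
u_L1, u_L3 = reduced_payoffs(mu3, p, A)
print(mu3, u_L1, u_L3)
```

Under the assumed f(p), both abilities earn the same payoff at µ*(L3) = 1/(p·(A − 1)), and f(p) < A holds exactly when p > 1/(A − 1), consistent with the Hawk-Dove condition on the slide.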


Sketch of Proof 2/2 (Theorem 1)

b* is a best reply for all abilities:
- Uncertain horizon: Lorberbaum et al. (2002).
- Against strangers ($\mu^*(L_1) > \frac{1}{A} \Leftrightarrow p > \frac{A}{(A-1)^2}$):
  - Defection at horizon 3 is better against L3 and worse against L1.
- Against observed L3 ($p < \frac{A-1}{A}$):
  - Defection at horizon 4 is better (worse) against an observing (unobserving) opponent.

Other abilities cannot yield higher payoffs:
- L2 is strictly dominated by L3 (worse payoff against L3, same against L1).
- L>3 cannot improve on L3’s optimal play.


Sketch of Proof 1/2 (Theorem 2)

p > ... ⇒ The smallest incumbent ability must be L1.

Assume that all incumbents cooperate at horizons > 2 against cooperative strangers.
- p < ... ⇒ all incumbents must cooperate at horizons > 3 against all cooperative opponents.
- Incumbents must only include L1 and L3 ⇒ equivalent to σ*.


Sketch of Proof 2/2 (Theorem 2)

Assume the opposite: some incumbents defect at horizons > 2 against cooperative strangers.

⇒ $\mu(L_1) < \frac{2}{A+1}$ (otherwise, L1 outperforms)
⇒ $\mu(L_{>2}) < \frac{1}{A \cdot p}$ (otherwise, L2 are outperformed by L1)
⇒ p > ... ⇒ $\mu(L_2) > 0$
⇒ Everyone cooperates at horizons > 3 against cooperative strangers.
⇒ Everyone cooperates at horizons > 4.

Balanced payoffs imply a unique frequency of abilities: L1, L2 & L≥4. Unstable to small “group” perturbations.


Uniqueness Results for Weaker Solution Concepts

1. Symmetric perfect equilibrium → heterogeneous population of L1 and a subset of {L2, L3, L4}.
2. Neutrally stable strategy → a (possibly) shifted σ* of Lk and Lk+2.
3. Perfect + neutrally stable → σ* is unique.

... the apps below to open or edit this item. pdf-12115\tape-reading-and-market-tactics-the-three-st ... uccessful-stock-trading-by-humphrey-bancroft-neill.pdf.