Communication equilibrium payoffs in repeated games with imperfect monitoring

Jérôme Renault (X and GIS “Sciences de la décision”), joint with Tristan Tomala (HEC)

TSE-GREMAQ, October 14, 2008
Introduction

Repeated games are dynamic interactions played in stages. In this lecture, the players repeat the same stage game over and over, and the stage game is perfectly known.
• In the standard model (perfect monitoring), the actions played at a given stage are publicly observed before the next stage is reached. The Folk Theorem holds: the equilibrium payoffs of the repeated game are the feasible and individually rational payoffs.
• We study here the model with imperfect monitoring: at the end of each stage, the players receive signals depending on the action profile (e.g. Principal-Agent problems). No general characterization of the equilibrium payoffs is known.
We assume that the players can communicate with an exogenous mediator between the stages, and consider the communication equilibrium payoffs of the repeated game (Myerson 1982, Forges 1985).

We characterize these payoffs in the general n-person case: they are the feasible payoffs which are robust to undetectable deviations and jointly rational. This extends the results of Lehrer (1992) and Mertens, Sorin, Zamir (1994) for 2-player games.
Outline:
I. The model of repeated games with imperfect monitoring
II. The standard case of perfect monitoring
III. Aspects of imperfect monitoring
IV. Communication equilibrium payoffs
V. Punishment levels
VI. Feasible payoffs robust to undetectable deviations
VII. Jointly rational payoffs
VIII. The characterization
IX. Elements of the proof
X. More on imperfect monitoring without a mediator
References
I. The model of repeated games with imperfect monitoring

Repeated games with imperfect monitoring are also called repeated games with signals, or supergames.

Data:
• A finite stage game G given by a set of players N = {1, ..., n} and, for each player i, a set of actions A^i and a payoff function g^i : A → ℝ, where A = ∏_i A^i stands for the set of action profiles.
• An observation structure: for each player i, a finite set of signals U^i, and a signalling function f : A → Δ(U), where U = ∏_i U^i is the set of signal profiles.

Play: at every stage t = 1, 2, ..., the players independently choose actions in their own action sets. If a_t ∈ A is the joint action chosen, a profile of signals u_t = (u_t^i)_i is drawn according to f(a_t). The stage payoff of player i is g^i(a_t), but all that player i learns before starting stage t + 1 is u_t^i.
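The data above can be sketched in code. This is a minimal illustration, not from the talk: all names (ACTIONS, play_stage, ...) are invented, the stage game used is the prisoner's dilemma, and the signalling function shown is the perfect-monitoring one, under which each player's signal is the whole action profile.

```python
import random

# Stage game data: action sets A^i, payoff function g, signalling function f.
ACTIONS = {1: ["C", "D"], 2: ["C", "D"]}          # A^1, A^2

def g(profile):
    """Vector payoff g(a) for the prisoner's dilemma."""
    table = {("C", "C"): (3, 3), ("C", "D"): (0, 4),
             ("D", "C"): (4, 0), ("D", "D"): (1, 1)}
    return table[profile]

def f(profile):
    """Signalling function: a lottery over signal profiles, as a dict
    {signal profile: probability}. Perfect monitoring: u^i = a for all i."""
    return {(profile, profile): 1.0}

def play_stage(profile, rng=random):
    """Draw a signal profile according to f(a); return payoffs and signals."""
    signal_profiles, probs = zip(*f(profile).items())
    signals = rng.choices(signal_profiles, weights=probs)[0]
    return g(profile), signals

payoffs, signals = play_stage(("C", "D"))
print(payoffs)   # (0, 4)
print(signals)   # (('C', 'D'), ('C', 'D'))
```

Only f needs to change to model other observation structures; the payoffs are unaffected.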
Illustration: the prisoner's dilemma

            C2        D2
  C1      (3, 3)    (0, 4)
  D1      (4, 0)    (1, 1)

(Unique equilibrium payoff of the one-shot game: (1, 1).)
• Standard case of perfect monitoring: U^i = A, and u_t^i = a_t for each player i.
• Trivial observation for player i: U^i is a singleton (play in the dark).
• Public signals: all players receive the same signal (Fudenberg, Levine, Maskin 1994; Mailath, Morris 2002; Hörner, Olszewski 2006, ...).
• Observable payoffs (Tomala 1999, ...).
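The special cases above can be made concrete as signalling functions for the prisoner's dilemma. A hedged sketch: the function names are illustrative, the signals are deterministic for simplicity, and the particular public signal chosen (whether anyone defected) is an assumption made here for illustration, not taken from the talk.

```python
# Each function maps an action profile a = (a1, a2) to the signal profile (u1, u2).

def f_perfect(a):
    # Perfect monitoring: U^i = A and u^i = a for each player.
    return (a, a)

def f_dark(a):
    # Trivial observation: each U^i is a singleton ("play in the dark").
    return ("*", "*")

def f_public(a):
    # Public signals: all players receive the same signal. Here the (assumed)
    # signal only reveals whether at least one player defected.
    s = "some D" if "D" in a else "all C"
    return (s, s)

a = ("C", "D")
print(f_perfect(a))  # (('C', 'D'), ('C', 'D'))
print(f_dark(a))     # ('*', '*')
print(f_public(a))   # ('some D', 'some D')
```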
Strategies and payoffs

A strategy for player i: σ^i = (σ_t^i)_{t≥1}, where σ_t^i : (A^i × U^i)^{t−1} → Δ(A^i) gives the lottery played at stage t as a function of his current information. A strategy profile σ naturally induces a probability over plays.

Average T-stage payoff for player i:

  γ_T^i(σ) = E_σ [ (1/T) ∑_{t=1}^T g^i(a_t) ].

For λ in (0, 1], λ-discounted payoff for player i:

  γ_λ^i(σ) = E_σ [ ∑_{t=1}^∞ λ(1−λ)^{t−1} g^i(a_t) ].
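The two payoff criteria can be sketched numerically on a given realized play (a_1, a_2, ...). The names are illustrative, and the play simulated below (player 1's payoffs along (C, C) at stage 1 and then (D, D) forever) is just one example:

```python
def average_payoff(stage_payoffs, T):
    """gamma_T^i(sigma): average of the first T stage payoffs."""
    return sum(stage_payoffs[:T]) / T

def discounted_payoff(stage_payoffs, lam):
    """gamma_lambda^i(sigma): sum over t >= 1 of lam * (1 - lam)^(t-1) * g^i(a_t).
    enumerate is zero-based, so its index is exactly t - 1."""
    return sum(lam * (1 - lam) ** k * x for k, x in enumerate(stage_payoffs))

# Player 1's stage payoffs: 3 once (cooperation), then 1 forever (truncated at 1000).
payoffs = [3] + [1] * 999
print(average_payoff(payoffs, 10))                 # 1.2
print(round(discounted_payoff(payoffs, 0.1), 3))   # 1.2
```

Note that for small λ the discounted payoff weights early stages more, while the T-stage average treats all of the first T stages equally; both converge to 1 here as T → ∞ or λ → 0.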
Equilibrium payoffs

Definition: A uniform equilibrium of the repeated game is a strategy profile σ such that:
1) ∀ε > 0, σ is an ε-Nash equilibrium of every discounted game with a low enough discount factor: ∃λ_0, ∀λ ≤ λ_0, ∀i ∈ N, ∀τ^i ∈ Σ^i, γ_λ^i(τ^i, σ^{−i}) ≤ γ_λ^i(σ) + ε, and
2) (γ_λ^i(σ))_{i∈N} converges as λ goes to 0; the limit is called an equilibrium payoff.

Denote by E_∞ the set of equilibrium payoffs.

Remarks: this notion captures long-term strategic aspects; the same definition can be given with the average payoffs γ_T^i(σ) = E_σ[(1/T) ∑_{t=1}^T g^i(a_t)]; it is more robust than lim_{λ→0} E_λ; no refinement is used here.
II. The standard case of perfect monitoring

Example: the prisoner's dilemma with perfect monitoring

            C2        D2
  C1      (3, 3)    (0, 4)
  D1      (4, 0)    (1, 1)

(3, 3) is an equilibrium payoff of the repeated game: play C^i as long as your opponent does, otherwise play D^i forever.

[Figure: the equilibrium payoff set E_∞ in the (J1, J2)-plane: the feasible payoffs with both coordinates at least 1.]
The standard Folk theorem

Define two sets.

Feasible payoffs: conv g(A) = g(Δ(A)), where g : A → ℝ^N is the vector payoff function.

Punishment level of player i (independent minmax):

  v^i = min_{p^{−i} ∈ ∏_{j≠i} Δ(A^j)} max_{p^i ∈ Δ(A^i)} g^i(p^i, p^{−i}).

Individually rational payoffs: IR = {x = (x^i)_i ∈ ℝ^N : ∀i, x^i ≥ v^i}.

Standard Folk theorem: the equilibrium payoffs of the repeated game are the payoffs which are both feasible (they can be achieved) and individually rational (every player gets at least his punishment level):

  E_∞ = g(Δ(A)) ∩ IR.

Aumann (1981): the Folk theorem “has been generally known in the profession for at least 15 or 20 years, but has not been published; its authorship is obscure."
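As a sanity check on the definition, player 1's independent minmax in the prisoner's dilemma can be approximated by a grid search over player 2's mixed actions (a rough sketch rather than an exact linear program; the function name is invented):

```python
# Player 1's payoffs against player 2 mixing with probability q on C:
#   g1(C, q) = 3q + 0(1-q),   g1(D, q) = 4q + 1(1-q).

def independent_minmax_v1(steps=1000):
    """min over player 2's mixed action q of max over player 1's actions."""
    best = float("inf")
    for k in range(steps + 1):
        q = k / steps                     # P(player 2 plays C)
        g_C = 3 * q + 0 * (1 - q)         # payoff of C against q
        g_D = 4 * q + 1 * (1 - q)         # payoff of D against q
        best = min(best, max(g_C, g_D))
    return best

print(independent_minmax_v1())  # 1.0
```

The minimum is attained at q = 0: player 2 punishes by playing D, and player 1's best reply (D) then yields 1, so v = (1, 1) and E_∞ is exactly the shaded set in the figure above.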
III. Aspects of imperfect monitoring

Signals do matter. Example: the prisoner's dilemma in the dark

            C2        D2
  C1      (3, 3)    (0, 4)
  D1      (4, 0)    (1, 1)

Here E_∞ = {(1, 1)}.

With imperfect signals, there might be in general:
1) undetectable deviations;
2) incentives to play informative actions (Lehrer 1989, 1992);
3) a deviation which is detected by some players but not by others (R.-Tomala 1998);
4) a deviation which is detected, but the identity of the deviator is unknown (Tomala 1998; R.-Scarlatti-Scarsini 2005 and 2008);
5) in a punishment phase, a possibility to use the signals to correlate actions and punish below the level v^i (leading to new punishment levels, Gossner-Tomala 2007).

Open problem: compute E_∞ in general (unknown even for two players). We characterize the set of communication equilibrium payoffs C_∞: issue 5) disappears, and issues 2) and 3) are simplified.
IV. Communication equilibrium payoffs

Communication equilibrium payoffs (Myerson, Forges). Add an exogenous mediator who can communicate with the players between the stages (no commitment power, no utility). A canonical representation leads to an extended game where, at each stage t:
• first, the mediator privately sends to each player i a “recommendation" r_t^i in A^i,
• then the stage game is played as usual: every player i plays some action a_t^i and observes the signal u_t^i,
• finally, each player reports a message m_t^i in U^i to the mediator.

We now have a supergame with n + 1 players, where the mediator has payoff zero.

Definition: A communication equilibrium is a uniform equilibrium of the extended game in which every player i plays his faithful strategy σ^{i*}: play what is recommended, report what is observed. A communication equilibrium payoff of Γ is a limit payoff associated to a communication equilibrium (forgetting the mediator's component). Denote by C_∞ the set of communication equilibrium payoffs.
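One stage of the canonical extended game can be sketched as follows. The names are illustrative; the mediator's recommendation lottery and the signalling function are passed in as parameters, and both players are assumed faithful:

```python
import random

def mediated_stage(recommendation_lottery, f, rng=random):
    """One stage of the extended game with faithful players."""
    # 1. The mediator draws a recommendation profile r and privately sends
    #    r^i to each player i.
    profiles, probs = zip(*recommendation_lottery.items())
    r = rng.choices(profiles, weights=probs)[0]
    # 2. Faithful players play what is recommended: a^i = r^i.
    a = r
    # 3. The signal profile u = (u^i)_i is drawn according to f(a)
    #    (f is deterministic in this sketch).
    u = f(a)
    # 4. Faithful players report what they observed: m^i = u^i.
    m = u
    return a, u, m

# Perfect monitoring on the prisoner's dilemma; the mediator recommends (C, C).
a, u, m = mediated_stage({("C", "C"): 1.0}, lambda a: (a, a))
print(a)       # ('C', 'C')
print(m == u)  # True
```

A deviation at step 2 (playing something other than r^i) or step 4 (misreporting) is what the equilibrium conditions must deter.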
Communication equilibrium payoffs in repeated games with imperfect monitoring IV. Communication equilibrium payoffs
Communication equilibrium payoffs (Myerson, Forges) Add an exogeneous mediator who can communicate with the players between the stages (no comittment power, no utility). Canonical representation leading to an extended game where at each stage t: • first, the mediator privately sends to each player i a “recommendation" rti in Ai , • then the stage game is played as usual: every player i plays some action ati and observes the signal uti , • finally each player reports a message mti in U i to the mediator. We now have a supergame with n + 1 players, the mediator has payoff zero. Definition: A communication equilibrium is a uniform equilibrium of the extended game where every player i plays his faithfull strategy σ i ∗ : play what is recommended, report what is observed. A communication equilibrium payoff of Γ is a limit payoff associated to a communication equilibrium (forgetting the mediator’s component). Denote by C∞ the set of communication equilibrium payoffs.
12/29
Communication equilibrium payoffs in repeated games with imperfect monitoring IV. Communication equilibrium payoffs
Communication equilibrium payoffs (Myerson, Forges) Add an exogeneous mediator who can communicate with the players between the stages (no comittment power, no utility). Canonical representation leading to an extended game where at each stage t: • first, the mediator privately sends to each player i a “recommendation" rti in Ai , • then the stage game is played as usual: every player i plays some action ati and observes the signal uti , • finally each player reports a message mti in U i to the mediator. We now have a supergame with n + 1 players, the mediator has payoff zero. Definition: A communication equilibrium is a uniform equilibrium of the extended game where every player i plays his faithfull strategy σ i ∗ : play what is recommended, report what is observed. A communication equilibrium payoff of Γ is a limit payoff associated to a communication equilibrium (forgetting the mediator’s component). Denote by C∞ the set of communication equilibrium payoffs.
12/29
Communication equilibrium payoffs in repeated games with imperfect monitoring IV. Communication equilibrium payoffs
Communication equilibrium payoffs (Myerson, Forges) Add an exogeneous mediator who can communicate with the players between the stages (no comittment power, no utility). Canonical representation leading to an extended game where at each stage t: • first, the mediator privately sends to each player i a “recommendation" rti in Ai , • then the stage game is played as usual: every player i plays some action ati and observes the signal uti , • finally each player reports a message mti in U i to the mediator. We now have a supergame with n + 1 players, the mediator has payoff zero. Definition: A communication equilibrium is a uniform equilibrium of the extended game where every player i plays his faithfull strategy σ i ∗ : play what is recommended, report what is observed. A communication equilibrium payoff of Γ is a limit payoff associated to a communication equilibrium (forgetting the mediator’s component). Denote by C∞ the set of communication equilibrium payoffs.
12/29
Communication equilibrium payoffs in repeated games with imperfect monitoring IV. Communication equilibrium payoffs
IV. Communication equilibrium payoffs

Communication equilibrium payoffs (Myerson, Forges)
Add an exogenous mediator who can communicate with the players between the stages (no commitment power, no utility). Canonical representation, leading to an extended game where at each stage t:
• first, the mediator privately sends to each player i a "recommendation" r_t^i in A^i,
• then the stage game is played as usual: every player i plays some action a_t^i and observes the signal u_t^i,
• finally, each player reports a message m_t^i in U^i to the mediator.
We now have a supergame with n + 1 players, in which the mediator has payoff zero.
Definition: A communication equilibrium is a uniform equilibrium of the extended game where every player i plays his faithful strategy σ^{i*}: play what is recommended, report what is observed. A communication equilibrium payoff of Γ is a limit payoff associated with a communication equilibrium (forgetting the mediator's component). Denote by C∞ the set of communication equilibrium payoffs.
12/29
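The canonical stage just described (recommend → play and observe → report) can be sketched as a toy loop. This is an illustrative sketch only: the `Mediator` plan, the action labels, and the perfect-monitoring signalling function are my placeholders, not objects from the talk.

```python
import random

def signal(i, profile):
    # Hypothetical deterministic signalling function f^i : A -> U^i.
    # For the toy, every player observes the whole action profile.
    return tuple(profile)

def play_stage(mediator_plan, n_players):
    # 1. The mediator privately sends a recommendation r^i to each player i.
    recommendations = mediator_plan()
    # 2. Faithful players play exactly what was recommended...
    actions = list(recommendations)
    # ...and each observes a private signal u^i.
    signals = [signal(i, actions) for i in range(n_players)]
    # 3. Faithful players report exactly what they observed.
    reports = list(signals)
    return actions, reports

actions, reports = play_stage(lambda: [random.choice(["C", "D"]) for _ in range(2)], 2)
# Faithfulness: every report equals the signal actually observed.
assert all(r == tuple(actions) for r in reports)
```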
V. Punishment levels

Punishment levels: the correlated minmax of player i is
w^i = min_{p^{-i} ∈ ∆(∏_{j≠i} A^j)} max_{p^i ∈ ∆(A^i)} g^i(p^i, p^{-i})
(instead of v^i = min_{p^{-i} ∈ ∏_{j≠i} ∆(A^j)} max_{p^i ∈ ∆(A^i)} g^i(p^i, p^{-i})).
Example with 3 players, where player 3 chooses the matrix (entries are player 3's payoffs):

W:        L     R            E:        L     R
  T      −1     0              T       0     0
  B       0     0              B       0    −1

Then w^3 = −1/2 < v^3 = −1/4.
Rem: standard Folk theorem for communication equilibria: C∞ = g(∆(A)) ∩ IRc, where IRc = {x = (x^i)_i ∈ IR^N : ∀i, x^i ≥ w^i}.
13/29
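Both punishment levels in the 3-player example can be checked numerically. A minimal sketch (the LP formulation and the grid search are my choices, not from the talk), assuming W has its −1 at (T, L) and E at (B, R):

```python
import numpy as np
from scipy.optimize import linprog

# Player 3's payoff: -1 at (T,L) if P3 plays W, -1 at (B,R) if P3 plays E.
# Profiles of players 1-2 ordered (T,L), (T,R), (B,L), (B,R).
g3 = {"W": np.array([-1.0, 0.0, 0.0, 0.0]),
      "E": np.array([0.0, 0.0, 0.0, -1.0])}

# Correlated minmax w^3 = min_{p in Delta(A^1 x A^2)} max_{a^3} g^3(a^3; p):
# LP in variables (t, p): minimize t subject to t >= g3(a3) . p for a3 in {W, E}.
c = [1.0, 0, 0, 0, 0]
A_ub = [[-1.0, *g3["W"]], [-1.0, *g3["E"]]]          # g3(a3).p - t <= 0
res = linprog(c, A_ub=A_ub, b_ub=[0.0, 0.0],
              A_eq=[[0.0, 1, 1, 1, 1]], b_eq=[1.0],
              bounds=[(None, None)] + [(0, None)] * 4)
w3 = res.fun

# Independent minmax v^3: players 1 and 2 mix independently; grid search
# over alpha = P(T) and beta = P(L).
grid = np.linspace(0, 1, 201)
v3 = min(max(-a * b, -(1 - a) * (1 - b)) for a in grid for b in grid)

print(w3, v3)   # w3 is -1/2, strictly below v3 = -1/4
```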
VI. Feasible payoffs robust to undetectable deviations

Feasible payoffs which are robust to undetectable deviations
Example: the prisoner's dilemma again, where P1 plays in the dark and U^2 = {a, b, c} (P2's signal is written next to each cell):

          C^2        D^2
C^1    (3,3) a    (0,4) c
D^1    (4,0) b    (1,1) c

(3,3) ∈ C∞. Strategy of the mediator:
- on the main path at stage t, recommend P2 to play C^2, and P1 to play C^1 with probability 1 − 1/√t. Continue as long as P2's reported signal matches P1's recommended action. Otherwise, go to the punishment phase.
- punishment phase: punish forever, i.e. recommend (D^1, D^2) at every stage.
The players have no incentive to deviate from their faithful strategy.
14/29
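Under faithful play, the mediator's main path can be simulated to see the average payoff converge to (3, 3). A rough sketch (the horizon and the random seed are my choices):

```python
import math
import random

random.seed(0)

# Stage payoffs of the prisoner's dilemma, (P1, P2), indexed by (a1, a2).
G = {("C", "C"): (3, 3), ("C", "D"): (0, 4),
     ("D", "C"): (4, 0), ("D", "D"): (1, 1)}

def average_payoff(T):
    # Main path: at stage t the mediator recommends C to P1 with probability
    # 1 - 1/sqrt(t) (and D otherwise), always recommends C to P2;
    # both players are faithful, so no punishment phase is ever triggered.
    tot1 = tot2 = 0.0
    for t in range(1, T + 1):
        a1 = "C" if random.random() < 1 - 1 / math.sqrt(t) else "D"
        p1, p2 = G[(a1, "C")]
        tot1 += p1
        tot2 += p2
    return tot1 / T, tot2 / T

x1, x2 = average_payoff(200_000)
print(x1, x2)   # close to (3, 3): the D-frequency up to stage T vanishes like 2/sqrt(T)
```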
Suppose now that at some stage, the mediator recommends some action profile a = (a^i)_i. If some player i can play an action b^i which:
- does not change the signal of the other players,
- gives player i at least as much information as a^i,
- and gives a better payoff to player i,
then player i has no incentive to play a^i. (Assume here that the signals are deterministic: each player i has a signalling function f^i : A → U^i.)
Definition: for every pair of actions a^i and b^i of player i, write b^i ≥ a^i if:
(i) ∀a^{-i} ∈ A^{-i}, ∀j ≠ i, f^j(b^i, a^{-i}) = f^j(a^i, a^{-i}) (a^i and b^i are equivalent),
(ii) ∀a^{-i}, b^{-i} ∈ A^{-i}, f^i(a^i, a^{-i}) ≠ f^i(a^i, b^{-i}) ⟹ f^i(b^i, a^{-i}) ≠ f^i(b^i, b^{-i}) (b^i is more informative than a^i).
15/29
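Since conditions (i) and (ii) quantify over finite sets, the relation b^i ≥ a^i can be tested by brute force. A small sketch on the prisoner's-dilemma example above (the function names are mine):

```python
from itertools import product

# Deterministic signalling functions of the example: P1 plays in the dark
# (constant signal); P2 observes a at (C,C), b at (D,C), and c whenever P2 plays D.
def f1(a1, a2):
    return "nothing"

def f2(a1, a2):
    return {"C": {"C": "a", "D": "b"}, "D": {"C": "c", "D": "c"}}[a2][a1]

A2 = ["C", "D"]

def dominates_for_p1(b1, a1):
    # (i): b1 and a1 induce the same signal for the other player (here P2)...
    equiv = all(f2(b1, a2) == f2(a1, a2) for a2 in A2)
    # (ii): ...and b1 is at least as informative for P1 as a1.
    more_info = all(f1(b1, x) != f1(b1, y)
                    for x, y in product(A2, A2) if f1(a1, x) != f1(a1, y))
    return equiv and more_info

# Playing D instead of the recommended C changes P2's signal (b instead of a),
# so the deviation is detectable and D >= C fails via condition (i):
print(dominates_for_p1("D", "C"))   # False
print(dominates_for_p1("C", "C"))   # True (trivially)
```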
Definition (Lehrer): The set of feasible payoffs which are robust to undetectable deviations is g(P), where:
P = { p ∈ ∆(A) : ∀i ∈ N, ∀b^i, a^i ∈ A^i s.t. b^i ≥ a^i,
∑_{a^{-i} ∈ A^{-i}} p(a^i, a^{-i}) g^i(a^i, a^{-i}) ≥ ∑_{a^{-i} ∈ A^{-i}} p(a^i, a^{-i}) g^i(b^i, a^{-i}) }.

Theorem (Lehrer 1992, MSZ 1994): For two-player games, C∞ = g(P) ∩ IRc.
16/29
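Given the pairs b^i ≥ a^i, membership of a distribution p in P is just a finite family of linear inequalities. An illustrative checker (the data structures and the two-player toy game are my constructions, not from the talk):

```python
from itertools import product

def in_P(p, payoffs, dominating_pairs, action_sets):
    # For every pair (b_i, a_i) with b_i >= a_i, deviating from a_i to b_i
    # must not raise player i's expected payoff under p.
    for i, (b_i, a_i) in dominating_pairs:
        lhs = rhs = 0.0
        # Sum over opponents' profiles a^{-i}, with player i's slot filled in.
        for others in product(*(A for j, A in enumerate(action_sets) if j != i)):
            prof = list(others)
            prof.insert(i, a_i)
            faithful = tuple(prof)
            prof[i] = b_i
            deviated = tuple(prof)
            lhs += p.get(faithful, 0.0) * payoffs[faithful][i]
            rhs += p.get(faithful, 0.0) * payoffs[deviated][i]
        if lhs < rhs - 1e-12:
            return False
    return True

# Toy two-player game in which action B of player 0 is assumed undetectable
# from A (B >= A) and pays strictly more: weight on A violates the constraint.
payoffs = {("A", "L"): (0, 0), ("B", "L"): (1, 0),
           ("A", "R"): (0, 0), ("B", "R"): (1, 0)}
action_sets = [["A", "B"], ["L", "R"]]
pairs = [(0, ("B", "A"))]              # hypothetical: B >= A already verified
print(in_P({("A", "L"): 1.0}, payoffs, pairs, action_sets))   # False
print(in_P({("B", "L"): 1.0}, payoffs, pairs, action_sets))   # True
```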
VII. Jointly rational payoffs

Jointly rational payoffs
New phenomenon with at least 3 players: somebody has deviated, but who? → Several suspected players have to be punished simultaneously. This creates new constraints on the equilibrium payoffs.
Example with 3 players, where player 3 chooses the matrix:

W:        L          R
  T    (0,0,0)    (0,3,0)
  B    (3,0,0)    (1,1,0)

M: every cell is (0,2,0).        E: every cell is (2,0,0).

Player 3 always has payoff 0, and w^1 = w^2 = w^3 = 0.
[Figure: equilibrium payoffs in the (J1, J2)-plane — the set under perfect observation (Folk theorem) versus the set under trivial observation.]
17/29
(Same game as above.) Now: P1 and P2 just see each other's moves, and P3 is in the dark. Is (0, 0, 0) ∈ C∞?
Suppose we have an equilibrium where the mediator recommends to play (T, L, W). If player 1 deviates by playing B, he has to be punished, so P2 has to report it to the mediator, and the mediator has to recommend M to P3 in the future. But in this case P2 has a profitable deviation: falsely report a deviation by P1 and then receive the payoff 2 of matrix M. Hence (0, 0, 0) ∉ C∞.
It is impossible here to know whether the deviation comes from P1 or from P2.
18/29
Punishments by P3 are of the form λM + (1 − λ)E, with λ ∈ [0, 1]. For a target payoff x, such a punishment is effective only if x^1 ≥ 2(1 − λ) and x^2 ≥ 2λ; an admissible λ exists if and only if x^1 + x^2 ≥ 2.
[Figure: the set of equilibrium payoffs in the (J1, J2)-plane, cut by this joint constraint.]
19/29
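Eliminating λ from the two inequalities gives the closed form x^1 + x^2 ≥ 2, which can be cross-checked by scanning λ. A quick sketch (the grid resolution and test points are my choices):

```python
import numpy as np

def punishable(x1, x2, grid=np.linspace(0, 1, 2001)):
    # Some mix lam*M + (1-lam)*E must hold BOTH suspects below the target:
    #   x1 >= 2*(1-lam)  (punishing P1 relies on the E-component)
    #   x2 >= 2*lam      (punishing P2 relies on the M-component)
    return any(x1 >= 2 * (1 - lam) and x2 >= 2 * lam for lam in grid)

for x1, x2 in [(3, 0), (1, 1), (1.5, 0.5), (0.5, 0.5), (2, 2)]:
    # The scan agrees with the closed form x1 + x2 >= 2.
    assert punishable(x1, x2) == (x1 + x2 >= 2)
print("scan matches x1 + x2 >= 2")
```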
An action of player i in the extended game is called a decision (rule), an element of:
D^i = { d^i = (α^i, μ^i), with α^i : R^i → A^i and μ^i : R^i × U^i → M^i }.
Consider the scenario: the mediator recommends the action profile a, the players j ≠ i play faithfully, whereas player i plays according to a mixed decision δ^i. Denote by ψ^i(δ^i, a) ∈ ∆(U) the induced law of the messages received by the mediator, and by g^i_{δ^i}(a) the expected payoff of player i.
Given a subset of players J, the set of similar decisions of the players in J is defined as:
SD(J) = { (δ^i)_{i∈J} ∈ ∏_{i∈J} ∆(D^i) : ∀i, j ∈ J, ∀a ∈ A, ψ^i(δ^i, a) = ψ^j(δ^j, a) }.
SD(J) is a polytope. If a player in J deviates according to an element of SD(J), the mediator has to punish all the players in J simultaneously.
20/29
Rem: if q is the Dirac measure on player i, then SD({i}) = ∆(D^i): every deviation of player i makes player i a suspect.
Example again (the 3-player game above): P1 and P2 just see each other's moves, P3 is in the dark. Consider:
d^1: play B, report R;        d^2: play R, report B.
d = (d^1, d^2) is a similar decision for players 1 and 2: the mediator cannot distinguish between "player 1 is deviating with d^1" and "player 2 is deviating with d^2".
21/29
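After the recommendation (T, L) to P1 and P2, all the mediator sees is the pair of reports. A tiny sketch checking that d^1 and d^2 produce identical reports (the encodings and function names are mine):

```python
def mediator_view(action1, report_fn1, action2, report_fn2):
    # Each player observes the other's realized action; a report function maps
    # that observation to the message actually sent to the mediator.
    obs1, obs2 = action2, action1
    return (report_fn1(obs1), report_fn2(obs2))

faithful = lambda obs: obs      # report exactly what was observed
lie_R = lambda obs: "R"         # d^1's report: always claim R was seen
lie_B = lambda obs: "B"         # d^2's report: always claim B was seen

# d^1: P1 plays B and reports R, while P2 is faithful (plays L).
view_d1 = mediator_view("B", lie_R, "L", faithful)
# d^2: P2 plays R and reports B, while P1 is faithful (plays T).
view_d2 = mediator_view("T", faithful, "R", lie_B)

print(view_d1, view_d2)   # identical views: ('R', 'B') in both cases
```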
For each q in ∆(N), define the "punishment level" l(q) by:
l(q) = max_{δ ∈ SD(Supp q)} min_{a ∈ A} ∑_{i∈N} q^i g^i_{δ^i}(a) = min_{p ∈ ∆(A)} max_{δ ∈ SD(Supp q)} ∑_{a∈A} p(a) ∑_{i∈N} q^i g^i_{δ^i}(a).
The set of jointly rational payoffs is defined as:
JR = { x ∈ IR^N : ∀q ∈ ∆(N), x · q ≥ l(q) }.
Rem: if q is the Dirac measure on player i, then SD({i}) = ∆(D^i) and l(q) = w^i: jointly rational payoffs are always individually rational.
22/29
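The equality of the two expressions for l(q) is an instance of the minmax theorem (the deviators mix over SD(Supp q) against pure a, versus p ∈ ∆(A) against the deviators). For a plain matrix game the two sides are the two standard LPs; a generic sketch of that duality (this is not the actual SD(J) computation, and the random matrix is my illustration):

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
M = rng.uniform(-1, 1, size=(3, 4))      # payoff to the maximizer (rows)

# max over mixed rows sigma of min over pure columns: LP in (v, sigma),
# maximize v subject to v <= (sigma^T M)_a for every column a.
res1 = linprog(c=[-1.0] + [0.0] * 3,
               A_ub=np.hstack([np.ones((4, 1)), -M.T]), b_ub=np.zeros(4),
               A_eq=[[0.0, 1, 1, 1]], b_eq=[1.0],
               bounds=[(None, None)] + [(0, None)] * 3)

# min over mixed columns p of max over pure rows: the dual LP in (v, p),
# minimize v subject to v >= (M p)_r for every row r.
res2 = linprog(c=[1.0] + [0.0] * 4,
               A_ub=np.hstack([-np.ones((3, 1)), M]), b_ub=np.zeros(3),
               A_eq=[[0.0, 1, 1, 1, 1]], b_eq=[1.0],
               bounds=[(None, None)] + [(0, None)] * 4)

print(-res1.fun, res2.fun)   # the two values coincide (minmax theorem)
```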
VIII. The characterization

Theorem (R-Tomala 04): in a repeated game with imperfect monitoring, the communication equilibrium payoffs are the feasible payoffs which are both robust to undetectable deviations and jointly rational:
C∞ = g(P) ∩ JR.
Remarks:
• Perfect observation case: JR = IRc (Folk theorem).
• Trivial observation case: C∞ is the set of correlated equilibrium payoffs of the stage game.
• For two-player games, JR = IR (back to the Lehrer and MSZ result).
• C∞ is convex and compact, but need not be a polytope.
• Corollaries: E∞ ⊂ g(P) ∩ JR, and Eλ ⊂ g(P) ∩ JR for every λ.
• Case of random signals: with f : A → ∆(U), the result holds with
P = { p ∈ ∆(A) : ∀i ∈ N, ∀δ^i ∈ ∆(D^i) s.t. ∀a ∈ A, ψ^i(δ^i, a) = f(a), ∑_{a∈A} p(a) g^i(a) ≥ ∑_{a∈A} p(a) g^i_{δ^i}(a) }.
23/29
IX. Elements of the proof

Main idea of the proof: define an auxiliary 2-player repeated game with incomplete information. PI corresponds to a "cheater" (a potential deviator in the original game); PII corresponds to the mediator in the original game and has payoff 0.
Set of states in the auxiliary game: K = N (the state represents the identity of the deviator in the original game). Initially, one of the states is selected and announced to PI only. At every stage, PII selects an action in A, whereas PI selects an action in D = ∏_i D^i; the payoffs of PI and the signals are defined by analogy with the original game.
There is a very strong analogy between the communication equilibrium payoffs of the original game and some equilibrium payoffs of the auxiliary game → study of equilibrium payoffs for a particular class of repeated games with incomplete information: lack of information on one side, known own payoffs, state-dependent signalling, a specific "faithful" strategy for PI.
24/29
We prove a result for such games, using:
• a theorem of Kohlberg (1975), in the spirit of Blackwell approachability.
Simplified idea: let C be a closed convex set in IR^K and (x_n)_n a bounded sequence in IR^K. Write x̄_n = (1/n) ∑_{i=1}^n x_i and y_n = P_C(x̄_n). Assume that for every n such that x̄_n ∉ C, the hyperplane H containing y_n and orthogonal to [x̄_n, y_n] separates x̄_n from x_{n+1}. Then d(x̄_n, C) →_{n→∞} 0.
[Figure: the hyperplane through y_n, orthogonal to [x̄_n, y_n], with x̄_n on one side and C and x_{n+1} on the other.]
Used with C = IR_−^K. Taking q = (x̄_n − y_n)/‖x̄_n − y_n‖ ∈ ∆(N) leads to the definition of JR:
JR = { x ∈ IR^N : ∀q ∈ ∆(N), x · q ≥ max_{δ ∈ SD(Supp q)} min_{a∈A} ∑_{i∈N} q^i g^i_{δ^i}(a) }.
25/29
Communication equilibrium payoffs in repeated games with imperfect monitoring IX. Elements of the proof
• Statistical tests in equilibrium strategies (as in Renault 00, or Lehrer 90). The game is played by blocks of stages of polynomial size. Main path + statistical tests for deviations + very long (but not infinite) punishment phases. This requires an extension of Chebyshev's inequality without independence (actions played in a block).

(Lehrer) Let R_1, ..., R_n be Bernoulli r.v. with parameter p, and Y_1, ..., Y_n be Bernoulli r.v. such that for each m, R_m is independent of (R_1, ..., R_{m−1}, Y_1, ..., Y_m). Then

∀ε > 0,  P( | (R_1 Y_1 + ... + R_n Y_n)/n − p (Y_1 + ... + Y_n)/n | ≥ ε ) ≤ 1/(n ε²).
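A quick Monte Carlo sanity check of Lehrer's inequality; the dependent choice Y_m = R_{m−1} is my own example, chosen because it satisfies the independence hypothesis (R_m is i.i.d., hence independent of R_1, ..., R_{m−1}, Y_1, ..., Y_m) while the Y's are far from independent of the R's.

```python
import numpy as np

# Monte Carlo check of Lehrer's inequality with the dependent choice
# Y_m = R_{m-1}: R_m remains independent of (R_1,...,R_{m-1}, Y_1,...,Y_m).
rng = np.random.default_rng(1)
p, n, eps, trials = 0.3, 200, 0.2, 2000
bad = 0
for _ in range(trials):
    R = rng.random(n) < p            # i.i.d. Bernoulli(p)
    Y = np.empty(n, dtype=bool)
    Y[0] = True                      # Y_1 arbitrary
    Y[1:] = R[:-1]                   # Y_m = R_{m-1} for m >= 2
    dev = abs((R & Y).sum() / n - p * Y.sum() / n)
    bad += dev >= eps
freq = bad / trials
print(freq, "<=", 1 / (n * eps**2))  # empirical frequency vs the bound 0.125
```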
26/29
Communication equilibrium payoffs in repeated games with imperfect monitoring X. More on imperfect monitoring without a mediator
More on imperfect monitoring without a mediator:
• Computing the punishment levels may be difficult.

Payoffs for player 3 (P1 chooses the row T or B, P2 the column L or R, P3 the matrix W or E):

    W:   L    R        E:   L    R
    T   −1    0        T    0    0
    B    0    0        B    0   −1

Perfect monitoring: −1/4 (independent minmax).
P1 and P2 see each other, P3 in the dark: −1/2 (correlated minmax).
Assume now that P3 observes P2 only, P2 observes P1 only, and P1 plays in the dark. Punishment level v? (Gossner Tomala 07):
v = −(1/2)(x² + (1 − x)²), where x solves −x log₂(x) − (1 − x) log₂(1 − x) = 1/2 (binary entropy of 1/2 bit), giving v ≈ −0.402.
27/29
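The value v ≈ −0.402 can be reproduced numerically; the sketch below solves the binary-entropy equation (in bits) by bisection, a method chosen here for simplicity, and evaluates the formula for v.

```python
import math

# Solve -x*log2(x) - (1-x)*log2(1-x) = 1/2 by bisection (the binary entropy
# is increasing on (0, 1/2)), then evaluate v = -(1/2)(x^2 + (1-x)^2).
def h2(x):
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

lo, hi = 1e-9, 0.5
while hi - lo > 1e-12:
    mid = 0.5 * (lo + hi)
    if h2(mid) < 0.5:
        lo = mid
    else:
        hi = mid
x = 0.5 * (lo + hi)
v = -0.5 * (x**2 + (1 - x)**2)
print(round(v, 3))   # -0.402
```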
Communication equilibrium payoffs in repeated games with imperfect monitoring X. More on imperfect monitoring without a mediator
• Playing more informative actions, even in case of trivial monitoring. Equilibrium paths may be complex: U¹ = {a, b, c}, P2 plays in the dark. Entries give the payoffs (u¹, u²) and P1's signal:

        H           M           B
G   (0, 1), a   (1, 1), c   (0, 0), c
D   (0, 1), b   (0, 0), c   (1, 1), c

An equilibrium is given by (σ¹, σ²), where: σ² plays i.i.d. (1/2) G + (1/2) D at odd stages and repeats its last action at even stages; σ¹ plays H at odd stages (to "buy" the information), then M or B accordingly at even stages. Equilibrium payoff: (1/2, 1), which cannot be obtained as a convex combination of payoffs at which P1 plays a best reply.
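The payoff (1/2, 1) of this two-stage cycle can be checked by direct averaging; the dictionary encoding of the payoff table below is my own.

```python
# Payoff table (u1, u2) for each action profile (P1's action, P2's action);
# the signals are not needed here, only the path they induce: at odd stages
# P1 plays H and learns P2's action, at even stages P2 repeats and P1
# best-replies (M against G, B against D).
U = {('H', 'G'): (0, 1), ('M', 'G'): (1, 1), ('B', 'G'): (0, 0),
     ('H', 'D'): (0, 1), ('M', 'D'): (0, 0), ('B', 'D'): (1, 1)}
best = {'G': 'M', 'D': 'B'}          # P1's best reply once P2's action is known
avg = [0.0, 0.0]
for a2, prob in (('G', 0.5), ('D', 0.5)):
    odd = U[('H', a2)]               # odd stage: P1 buys the information
    even = U[(best[a2], a2)]         # even stage: P2 repeats, P1 best-replies
    avg[0] += prob * (odd[0] + even[0]) / 2
    avg[1] += prob * (odd[1] + even[1]) / 2
print(avg)  # [0.5, 1.0]
```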
28/29
Communication equilibrium payoffs in repeated games with imperfect monitoring X. More on imperfect monitoring without a mediator
• A deviation is detected, but the identity of the deviator is unknown. 3-player minority game: 3 players have to vote for one of two alternatives, A and B. The player (if any) who votes for the less chosen alternative receives a reward of one euro. Between the stages, only the current majority alternative is publicly announced. Game with public signals and observable payoffs. Feasible payoffs: convex hull of {(1, 0, 0), (0, 1, 0), (0, 0, 1), (0, 0, 0)}. It can be shown that (0, 0, 0) ∈ E∞. (R-Scarlatti-Scarsini 05 and 08)
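The vertices of the feasible set can be enumerated by brute force over the eight vote profiles; this is a small sketch, where the helper `payoff` (my own naming) encodes the one-euro minority reward.

```python
from itertools import product

# Stage payoffs of the 3-player minority game: a player earns 1 euro
# exactly when she is the unique voter for her alternative.
def payoff(votes):
    return tuple(1 if votes.count(v) == 1 else 0 for v in votes)

vertices = sorted(set(payoff(v) for v in product('AB', repeat=3)))
print(vertices)  # [(0, 0, 0), (0, 0, 1), (0, 1, 0), (1, 0, 0)]
```

Unanimous profiles give (0, 0, 0); every 2-vs-1 split rewards exactly one player, so the feasible set is the convex hull announced on the slide.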
29/29
Communication equilibrium payoffs in repeated games with imperfect monitoring References
D. Abreu, D. Pearce and E. Stacchetti. Toward a theory of discounted repeated games with imperfect monitoring. Econometrica, 58, 1041–1063, 1990.
R. J. Aumann and M. Maschler. Repeated games with incomplete information. With the collaboration of R. Stearns. Cambridge, MA: MIT Press, 1995.
R. J. Aumann and L. S. Shapley. Long-term competition: a game-theoretic analysis. In N. Megiddo, editor, Essays in Game Theory, pages 1–15. Springer-Verlag, New York, 1994.
F. Forges. An approach to communication equilibria. Econometrica, 54, 1375–1385, 1986.
D. Fudenberg and E. Maskin. The Folk theorem in repeated games with discounting or with incomplete information. Econometrica, 54, 533–554, 1986.
D. Fudenberg, D. K. Levine and E. Maskin. The Folk theorem with imperfect public information. Econometrica, 62, 997–1039, 1994.
O. Gossner. The Folk theorem for finitely repeated games with mixed strategies. International Journal of Game Theory, 24, 95–107, 1995.
O. Gossner and T. Tomala. Secret correlation in repeated games with imperfect monitoring. Mathematics of Operations Research, 32, 413–424, 2007.
E. Kohlberg. Optimal strategies in repeated games with incomplete information. International Journal of Game Theory, 4, 7–24, 1975.
E. Lehrer. Lower equilibrium payoffs in two-player repeated games with non-observable actions. International Journal of Game Theory, 18, 57–89, 1989.
E. Lehrer. Nash equilibria of n-player repeated games with semi-standard information. International Journal of Game Theory, 19, 191–217, 1990.
E. Lehrer. Correlated equilibria in two-player repeated games with non-observable actions. Mathematics of Operations Research, 17, 175–199, 1992a.
E. Lehrer. On the equilibrium payoffs set of two-player repeated games with imperfect monitoring. International Journal of Game Theory, 20, 211–226, 1992b.
E. Lehrer. Two-player repeated games with nonobservable actions and observable payoffs. Mathematics of Operations Research, 17, 200–224, 1992c.
R. B. Myerson. Optimal coordination mechanisms in generalized principal-agent problems. Journal of Mathematical Economics, 10, 67–81, 1982.
R. B. Myerson. Multistage games with communication. Econometrica, 54, 323–358, 1986.
J. Renault and T. Tomala. Repeated proximity games. International Journal of Game Theory, 27, 539–559, 1998.
J. Renault. 2-player repeated games with lack of information on one side and state-independent signalling. Mathematics of Operations Research, 25, 552–572, 2000.
J. Renault, S. Scarlatti and M. Scarsini. A Folk theorem for minority games. Games and Economic Behavior, 53, 208–230, 2005.
J. Renault, S. Scarlatti and M. Scarsini. Discounted and finitely repeated minority games with public signals. Mathematical Social Sciences, 56, 44–74, 2008.
J. Renault and T. Tomala. Communication equilibria in supergames. Games and Economic Behavior, 49, 313–344, 2004.
Thanks for your attention !
29/29