Supplementary Appendix to Coevolution of Deception and Preferences: Darwin and Nash Meet Machiavelli

Yuval Heller∗ and Erik Mohlin†

June 14, 2017

Abstract In Heller and Mohlin (2017) we develop a framework in which individuals’ preferences coevolve with their abilities to deceive others about their preferences and intentions. Specifically, individuals are characterised by (i) a level of cognitive sophistication and (ii) a subjective utility function. Increased cognition is costly, but higher-level individuals have the advantage of being able to deceive lower-level opponents about their preferences and intentions in some of the matches. In the remaining matches, the individuals perfectly observe each other’s preferences. Our main result shows that, essentially, only efficient outcomes can be stable. Moreover, under additional mild assumptions, we show that an efficient outcome is stable if and only if the gain from unilateral deviation is smaller than the effective cost of deception in the environment. In this supplementary analysis we relax the assumption of perfect observability in matches without deception, and study the robustness of our main results to the introduction of partial observability. Keywords: Evolution of Preferences; Indirect Evolutionary Approach; Theory of Mind; Depth of Reasoning; Deception; Efficiency. JEL codes: C72, C73, D03, D83.

1 Introduction

In Heller and Mohlin (2017) we develop a framework in which individuals’ preferences coevolve with their abilities to deceive others about their preferences and intentions. Specifically, individuals are characterised by (i) a level of cognitive sophistication and (ii) a subjective utility function. Increased cognition is costly, but higher-level individuals have the advantage of being able to deceive lower-level opponents about their preferences and intentions in some of the matches. In the remaining matches, the individuals perfectly observe each other’s preferences. Our main result shows that, essentially, only efficient outcomes can be stable. Moreover, under additional mild assumptions, we show that an efficient outcome is stable if and only if the gain from unilateral deviation is smaller than the effective cost of deception in the environment. In this supplementary analysis we relax the assumption of perfect observability in matches without deception, and study the robustness of our two main results to the introduction of partial observability. The supplementary analysis relies on the notation presented in Heller and Mohlin (2017), and we only present and discuss new notation and changes with respect to the baseline model.

∗ Affiliation: Department of Economics, Bar Ilan University. Address: Ramat Gan 5290002, Israel. E-mail: [email protected].
† Affiliation: Department of Economics, Lund University. Address: Tycho Brahes väg 1, 220 07 Lund, Sweden. E-mail: [email protected].

2 Partial Observability When There Is No Deception

2.1 Changes to the Baseline Model

In case there is no deception we follow the modelling approach of Dekel, Ely, and Yilankaya (2007) in assuming that each agent privately observes her opponent’s preferences with probability $p$ and obtains a non-informative signal $\emptyset$ otherwise, and the agents play a Bayesian–Nash equilibrium of the resulting game of incomplete information. That is, we assume that each non-deceived individual has correct beliefs about the distribution of her opponent’s play and hence chooses a mixed action that is a best reply to this belief given her own preferences. Formally, we redefine the notion of a configuration as follows.

Definition 1. A configuration is a pair $(\mu, b)$ where $\mu \in \Delta(\Theta)$ is a type distribution, and $b = (b^N, b^D)$ is a behaviour policy, where $b^N : C(\mu) \times (C(\mu) \cup \{\emptyset\}) \to \Delta(A)$ and $b^D : C(\mu) \times C(\mu) \to \Delta(A)$ satisfy:

1. The profile $b^N$ is a Bayesian–Nash equilibrium of the game without deception, i.e. for each $\theta, \theta' \in C(\mu)$:

$$q(n_\theta, n_{\theta'}) + q(n_{\theta'}, n_\theta) < 1 \;\Rightarrow\; b^N_\theta(\theta') \in \operatorname{argmax}_{a \in A} \Bigl[ p \cdot \pi\bigl(a, b^N_{\theta'}(\theta)\bigr) + (1 - p) \cdot \pi\bigl(a, b^N_{\theta'}(\emptyset)\bigr) \Bigr], \text{ and}$$

$$b^N_\theta(\emptyset) \in \operatorname{argmax}_{a \in A} \sum_{\theta' \in C(\mu)} \mu(\theta') \cdot \bigl(1 - q(n_\theta, n_{\theta'}) - q(n_{\theta'}, n_\theta)\bigr) \cdot \Bigl[ p \cdot \pi\bigl(a, b^N_{\theta'}(\theta)\bigr) + (1 - p) \cdot \pi\bigl(a, b^N_{\theta'}(\emptyset)\bigr) \Bigr].$$

2. The profile $b^D$ consists of deception equilibria, i.e. for each $\theta, \theta' \in C(\mu)$,

$$n_\theta > n_{\theta'} \;\Rightarrow\; \bigl(b^D_\theta(\theta'), b^D_{\theta'}(\theta)\bigr) \in DE(\theta, \theta').$$

We interpret $b^N(\theta, \emptyset)$ as the strategy used by type $\theta$ when facing an opponent of an unknown type (henceforth, a stranger). Standard arguments imply that for any type distribution $\mu$, there exists a behaviour policy $b = (b^N, b^D)$ such that $(\mu, b)$ is a configuration.

Given a configuration $(\mu, b)$ and two incumbent types $\theta, \theta'$, we define $x(\theta, \theta' | \mu, b) \in \Delta(A^2)$ as the distribution of action profiles conditional on type $\theta$ being matched with $\theta'$:

$$\begin{aligned}
x(\theta, \theta' | \mu, b)(a, a') ={}& \bigl(q(n_\theta, n_{\theta'}) + q(n_{\theta'}, n_\theta)\bigr) \cdot b^D_\theta(\theta')(a) \cdot b^D_{\theta'}(\theta)(a') \\
&+ \bigl(1 - (q(n_\theta, n_{\theta'}) + q(n_{\theta'}, n_\theta))\bigr) \cdot \Bigl[ p^2 \cdot b^N_\theta(\theta')(a) \cdot b^N_{\theta'}(\theta)(a') + p \cdot (1 - p) \cdot b^N_\theta(\theta')(a) \cdot b^N_{\theta'}(\emptyset)(a') \\
&\qquad + (1 - p) \cdot p \cdot b^N_\theta(\emptyset)(a) \cdot b^N_{\theta'}(\theta)(a') + (1 - p)^2 \cdot b^N_\theta(\emptyset)(a) \cdot b^N_{\theta'}(\emptyset)(a') \Bigr].
\end{aligned}$$

We redefine the expected fitness payoff of type $\theta$ conditional on being matched with type $\theta'$ as follows:

$$\pi_\theta(\theta' | \mu, b) = \sum_{(a, a') \in A^2} x(\theta, \theta' | \mu, b)(a, a') \cdot \pi(a, a').$$

Let $x(\mu, b) \in \Delta(A^2)$ denote the aggregate distribution over action profiles induced by the configuration $(\mu, b)$:

$$x(\mu, b)(a, a') = \sum_{\theta, \theta' \in C(\mu)} \mu(\theta) \cdot \mu(\theta') \cdot x(\theta, \theta' | \mu, b)(a, a').$$
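The conditional distribution over action profiles and the resulting expected fitness can be illustrated numerically. The following is a minimal sketch, not part of the original analysis: the function names, the dictionary representation of strategies, the deception probability 0.2, the observation probability 0.5, the particular behaviour of the two types, and the Prisoner’s Dilemma payoffs with g = l = 0.5 are all illustrative assumptions.

```python
from itertools import product


def match_distribution(q_total, p, b_dec, b_obs, b_blind, actions):
    """Distribution over action profiles (a, a') for a matched pair (theta, theta').

    q_total : q(n_theta, n_theta') + q(n_theta', n_theta), the probability of deception
    p       : probability that a non-deceived agent observes the opponent's type
    b_dec, b_obs, b_blind : pairs (strategy of theta, strategy of theta') played
        under deception, after observing the opponent, and after the
        uninformative signal; each strategy maps actions to probabilities
    """
    x = {}
    for a, a2 in product(actions, repeat=2):
        deceived = b_dec[0][a] * b_dec[1][a2]
        not_deceived = (
            p * p * b_obs[0][a] * b_obs[1][a2]                 # both observe
            + p * (1 - p) * b_obs[0][a] * b_blind[1][a2]       # only theta observes
            + (1 - p) * p * b_blind[0][a] * b_obs[1][a2]       # only theta' observes
            + (1 - p) ** 2 * b_blind[0][a] * b_blind[1][a2]    # neither observes
        )
        x[(a, a2)] = q_total * deceived + (1 - q_total) * not_deceived
    return x


def expected_fitness(x, pi):
    """Expected fitness of theta: sum of x(a, a') * pi(a, a') over action profiles."""
    return sum(prob * pi[profile] for profile, prob in x.items())


# Hypothetical example: Prisoner's Dilemma payoffs with g = l = 0.5 (assumed values).
ACTIONS = ("co", "de")
CO = {"co": 1.0, "de": 0.0}
DE = {"co": 0.0, "de": 1.0}
PD = {("co", "co"): 1.0, ("co", "de"): -0.5, ("de", "co"): 1.5, ("de", "de"): 0.0}

# Assumed behaviour: under deception theta defects while theta' cooperates;
# both cooperate after observing each other and defect against strangers.
x = match_distribution(0.2, 0.5, b_dec=(DE, CO), b_obs=(CO, CO),
                       b_blind=(DE, DE), actions=ACTIONS)
print(x)
print(expected_fitness(x, PD))
```

The four terms inside `not_deceived` correspond one-to-one to the four terms in the bracket of the displayed formula for the conditional distribution, so the returned probabilities sum to one whenever the input strategies do.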

If $p < 1$ then not all post-entry populations will have focal behaviour policies, because the entry of a fraction of mutants may affect the Bayesian–Nash equilibrium actions played by incumbents who do not observe each other’s types. If all these post-entry behaviour policies involve play that is “far” from the original configuration, then clearly the configuration is not stable. Following the modelling approach of Dekel, Ely, and Yilankaya (2007), we redefine our notion of stability to require that there exist some nearby post-entry behaviour policy, and that in all nearby behaviour policies the entrants do not outperform incumbents. Formally, given two distributions of action profiles¹ $x, x' \in \Delta(A^2)$, let $|x - x'| = \max_{(a, a') \in A^2} |x(a, a') - x'(a, a')|$. We redefine focality as follows:

Definition 2. Let $(\mu, b)$ and $(\tilde\mu, \tilde b)$ be two configurations such that $C(\mu) \subseteq C(\tilde\mu)$. We say that $(\tilde\mu, \tilde b)$ is focal (with respect to $(\mu, b)$) if $\theta, \theta' \in C(\mu)$ implies that $\tilde b^N_\theta(\emptyset) = b^N_\theta(\emptyset)$, $\tilde b^N_\theta(\theta') = b^N_\theta(\theta')$, and $\tilde b^D_\theta(\theta') = b^D_\theta(\theta')$.

We redefine the notion of an NSC as follows.

Definition 3. A configuration $(\mu, b)$ is a neutrally stable configuration (NSC) if, for every $\mu' \in \Delta(\Theta)$ and every $\bar\delta \in (0, 1)$, there are some $\delta \in (0, \bar\delta)$ and $\bar\varepsilon \in (0, 1)$ such that for all $\varepsilon \in (0, \bar\varepsilon)$, it holds that:

1. If $(\tilde\mu, \tilde b^f)$, where $\tilde\mu = (1 - \varepsilon) \cdot \mu + \varepsilon \cdot \mu'$, is a focal configuration, then $\mu$ is an NSS in the type game $\Gamma_{(\tilde\mu, \tilde b^f)}$.

2. If there does not exist any focal configuration $(\tilde\mu, \tilde b^f)$, then there exists a nearby configuration $(\tilde\mu, \tilde b)$ that satisfies $|x(\mu, b) - x(\tilde\mu, \tilde b)| < \delta$. Moreover, for any nearby configuration $(\tilde\mu, \tilde b)$ such that $|x(\mu, b) - x(\tilde\mu, \tilde b)| < \delta$, the distribution $\mu$ is an NSS in the type game $\Gamma_{(\tilde\mu, \tilde b)}$.

We conclude this subsection by discussing a few issues related to our extended model of partial observability.

1. The current model is general enough to embed both our baseline model and Dekel, Ely, and Yilankaya’s (2007) model. In particular, it coincides with the baseline model in Heller and Mohlin (2017) when $p = 1$, and it coincides with Dekel, Ely, and Yilankaya’s (2007) model when the costs of deception are sufficiently high, i.e. when $k_2 > \max_{a, a', \tilde a, \tilde a'} (\pi(a, a') - \pi(\tilde a, \tilde a'))$. In the latter case we say that the setup is without deception.

2. As in the baseline model, the current model still assumes that a successful deception is very powerful and that a player with a higher cognitive level knows whether or not his deception was successful.

3. Definition 3 differs from Dekel, Ely, and Yilankaya (2007, Definition 3) in two respects:

(a) While Dekel, Ely, and Yilankaya (2007) consider only monomorphic groups of mutants (i.e. mutants all having the same type), we additionally consider stability against polymorphic groups of mutants (as discussed in Remark 5 in Section 2.2 of Heller and Mohlin (2017)). Footnote 2 below discusses the influence of this difference on our results.

(b) Definition 3 includes the extra quantifier $\bar\delta$. The verbal definition of an NSC and the preceding motivating arguments in Dekel, Ely, and Yilankaya (2007, p. 690) require that the incumbents outperform the mutants in all nearby post-entry configurations. Despite this, their Definition 3 (which does not have the quantifier $\bar\delta$) requires that the incumbents outperform the mutants in all post-entry configurations (also those that are not nearby). We have added the quantifier $\bar\delta$ in order to be consistent with the argument (presented in Dekel, Ely, and Yilankaya, 2007, p. 690) that an entry by a small group of mutants is unlikely to substantially change the equilibrium that for a long time has been played by the incumbents and that, therefore, it is sufficient to require that the mutants be outperformed only in nearby post-entry configurations (assuming that such nearby configurations exist). It turns out that our results remain essentially the same if one uses either of the two versions of the definition.

¹ All of our results remain the same if one replaces the $L_1$-norm with the $L_2$- or $L_\infty$-norm.

2.2 Notion of p-Efficiency and Characterising the Highest Types’ Behaviour

Notion of p-Efficiency. Given two strategies $\sigma, \tilde\sigma$ and a probability $p \in [0, 1]$, let $p \cdot \sigma + (1 - p) \cdot \tilde\sigma$ be the mixture strategy; i.e. let $(p \cdot \sigma + (1 - p) \cdot \tilde\sigma)(a) = p \cdot \sigma(a) + (1 - p) \cdot \tilde\sigma(a)$ for each action $a \in A$. We say that a strategy profile $(\sigma^*, \sigma^{*\prime})$ is p-efficient with respect to a fixed strategy profile $(\tilde\sigma, \tilde\sigma')$ if the profile $(\sigma^*, \sigma^{*\prime})$ maximises the sum of fitness payoffs in a setup in which, with probability $p$, each player is allowed to choose her strategy, but with the remaining probability she must play her part of the fixed strategy profile $(\tilde\sigma, \tilde\sigma')$, and the event that one player gets to choose her strategy is independent of the event that the other player gets to choose his strategy. Formally, we define p-efficiency as follows (using the notation $\bar\pi(\sigma, \sigma') = \pi(\sigma, \sigma') + \pi(\sigma', \sigma)$).

Definition 4. Fix $p \in [0, 1]$. A strategy profile $(\sigma^*, \sigma^{*\prime})$ is p-efficient with respect to strategy profile $(\tilde\sigma, \tilde\sigma')$ if

$$(\sigma^*, \sigma^{*\prime}) \in \operatorname{argmax}_{\sigma, \sigma'} \bar\pi\bigl(p \cdot \sigma + (1 - p) \cdot \tilde\sigma,\; p \cdot \sigma' + (1 - p) \cdot \tilde\sigma'\bigr).$$

We say that a strategy profile is p-efficient if it is p-efficient with respect to itself.

Remark 1. Observe that:

1. A strategy profile $(\sigma^*, \sigma^{*\prime})$ is 1-efficient with respect to any strategy profile iff $(\sigma^*, \sigma^{*\prime})$ is efficient (i.e. maximises the sum of payoffs without constraints).

2. If a strategy profile is $\tilde p$-efficient (with respect to itself), then it is also p-efficient for any $p \le \tilde p$.

3. Any strategy profile is 0-efficient with respect to any other strategy profile.

Characterisation of the Highest Types’ Behaviour. In what follows we extend the results of Section 3.2 of Heller and Mohlin (2017) and characterise the behaviour of an incumbent type, $\bar\theta = (u, \bar n)$, that has the highest level of cognition in the population. We show that the behaviour of an agent with the highest cognitive level satisfies the following two conditions:

1. p-efficiency against opponents with the highest level. When an agent of the highest cognitive level observes that her opponent also has the highest cognitive level, then she and her opponent play in a way that maximises the sum of fitness payoffs (taking as given the agents’ behaviour when observing an uninformative signal).²

2. Fitness maximisation against all deceived opponents.

Proposition 1. Let $(\mu^*, b^*)$ be an NSC, and let $\bar\theta, \bar\theta', \theta \in C(\mu^*)$ with $n_{\bar\theta} = n_{\bar\theta'} = \bar n$.

1. The strategy profile $\bigl(b^N_{\bar\theta}(\bar\theta'), b^N_{\bar\theta'}(\bar\theta)\bigr)$ is p-efficient with respect to $\bigl(b^N_{\bar\theta}(\emptyset), b^N_{\bar\theta'}(\emptyset)\bigr)$, i.e.

$$\bigl(b^N_{\bar\theta}(\bar\theta'), b^N_{\bar\theta'}(\bar\theta)\bigr) \in \operatorname{argmax}_{\sigma, \sigma'} \bar\pi\bigl(p \cdot \sigma + (1 - p) \cdot b^N_{\bar\theta}(\emptyset),\; p \cdot \sigma' + (1 - p) \cdot b^N_{\bar\theta'}(\emptyset)\bigr).$$

2. If $n_\theta < n_{\bar\theta}$, then $\bigl(b^D_{\bar\theta}(\theta), b^D_\theta(\bar\theta)\bigr) \in FMDE(\bar\theta, \theta)$.

The proof is similar to the corresponding proof of Theorem 1 in Heller and Mohlin (2017), and is presented in brief in Appendix A.1.

² The first part of Proposition 1 (p-efficiency) relies on allowing polymorphic groups of mutants. If one uses an alternative definition that allows only monomorphic groups of mutants, then the constrained efficiency result ought to be weakened to specify only that each incumbent with the highest cognitive level plays a constrained efficient strategy when observing an opponent with the same type, i.e.

$$\bigl(b^N_{\bar\theta}(\bar\theta), b^N_{\bar\theta}(\bar\theta)\bigr) \in \operatorname{argmax}_{\sigma} \bar\pi\bigl(p \cdot \sigma + (1 - p) \cdot b^N_{\bar\theta}(\emptyset),\; p \cdot \sigma + (1 - p) \cdot b^N_{\bar\theta}(\emptyset)\bigr).$$
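The role of the independence assumption in the definition of p-efficiency can be seen by expanding the objective of Definition 4. The following worked expansion is an illustration rather than part of the original argument; it assumes that $\pi$ is extended to mixed strategies as an expected payoff, and is therefore linear in each argument.

```latex
\begin{align*}
\pi\bigl(p\sigma + (1-p)\tilde\sigma,\; p\sigma' + (1-p)\tilde\sigma'\bigr)
  ={}& p^{2}\,\pi(\sigma, \sigma')
     + p(1-p)\,\pi(\sigma, \tilde\sigma') \\
     &+ (1-p)\,p\,\pi(\tilde\sigma, \sigma')
     + (1-p)^{2}\,\pi(\tilde\sigma, \tilde\sigma').
\end{align*}
```

The four weights are the probabilities that both players, only the first, only the second, or neither gets to choose her strategy. Adding the analogous expansion for the second player yields $\bar\pi$. Under the same assumption the objective is linear in $\sigma$ for fixed $\sigma'$ and linear in $\sigma'$ for fixed $\sigma$, so a maximiser can always be found among pure profiles.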

2.3 Pure Stable Configurations

In this subsection we characterise what pure actions are the outcomes of NSCs. Our first result characterises what actions are the outcomes of a pure NSC for any $p < 1$ (the perfect observability case is analysed in Section 3.3 of Heller and Mohlin (2017)). Specifically, it shows that $(\mu, a)$ is a pure NSC iff:

1. The profile $(a, a)$ is a Nash equilibrium (the “if side” requires the profile to be a strict equilibrium).

2. The profile $(a, a)$ is p-efficient.

3. All incumbents have the minimal cognitive level.

Proposition 2. Let $p \in [0, 1)$.

1. If $(\mu, a^*)$ is a pure NSC then: (a) $n_\theta = 1$ for each $\theta \in C(\mu)$, (b) $(a^*, a^*)$ is a Nash equilibrium, and (c) $(a^*, a^*)$ is p-efficient, i.e. $(a^*, a^*) \in \operatorname{argmax}_{\sigma, \sigma'} \bar\pi(p \cdot \sigma + (1 - p) \cdot a^*,\; p \cdot \sigma' + (1 - p) \cdot a^*)$.

2. If the profile $(a^*, a^*)$ is a strict Nash equilibrium and p-efficient, then there exists a pure NSC $(\mu, a^*)$.

Proof.

1. Let $(\mu, a^*)$ be a pure NSC.

(a) The proof is omitted as it is essentially the same as the proof of Lemma 1.

(b) Assume to the contrary that $(a^*, a^*)$ is not a Nash equilibrium. Let $a' \in A$ be a best reply to $a^*$. The fact that everyone plays $a^*$ implies that $u_\theta(a^*, a^*) \ge u_\theta(a, a^*)$ for each incumbent type $\theta \in C(\mu)$ and each $a \in A$. Assume first that there exists an incumbent type $\theta \in C(\mu)$ such that for each $\varepsilon > 0$ there exists $a \in A$ such that $u_\theta(a, (1 - \varepsilon) \cdot a^* + \varepsilon \cdot a') > u_\theta(a^*, (1 - \varepsilon) \cdot a^* + \varepsilon \cdot a')$. This assumption implies that following an entry of mutants who strictly prefer to play $a'$, there is no nearby post-entry configuration for a sufficiently small $\delta$. Thus, we can focus on the opposite case in which there exists $\bar\varepsilon > 0$ such that for each incumbent type $\theta \in C(\mu)$, each $\varepsilon \in (0, \bar\varepsilon)$, and each action $a$, the following inequality holds:

$$u_\theta(a^*, (1 - \varepsilon) \cdot a^* + \varepsilon \cdot a') \ge u_\theta(a, (1 - \varepsilon) \cdot a^* + \varepsilon \cdot a'). \tag{1}$$

Consider an invasion of $\varepsilon$ mutants with type $\theta' = (u_{\theta'}, 1) \notin C(\mu)$, where $u_{\theta'}$ represents pro-generous indifferent preferences (see Appendix A1 of Heller and Mohlin (2017)). Inequality (1) implies that there exists a post-entry focal configuration in which: (1) the incumbents always play $a^*$, (2) the mutants always play $(1 - \varepsilon) \cdot a^* + \varepsilon \cdot a'$, and (3) the mutants achieve a strictly higher fitness than the incumbents.

(c) The proof is essentially the same as the proof of part (1) of Proposition 1.

2. Let $(a^*, a^*)$ be a strict Nash equilibrium and p-efficient. Consider a monomorphic configuration $(\mu, a^*)$ consisting of type $\theta^* = (u^*, 1)$ where all incumbents are of cognitive level 1 and of the same preference type $u^*$, according to which action $a^*$ is a dominant action. Consider first mutants with cognitive level one. If the mutants do not play action $a^*$ against incumbents and strangers, then they are strictly outperformed when they are sufficiently rare. Mutants who play $a^*$ with probability one against incumbents and strangers cannot gain by playing a different strategy against observed mutants, by the definition of p-efficiency. Finally, mutants of a higher cognitive level cannot gain more than $\pi(a^*, a^*)$ against the incumbents, and are outperformed (due to their higher cognitive costs) if they are sufficiently rare.

Example 1. Simple applications of the above result (together with the result in the case $p = 1$) fully characterise pure NSC outcomes in:

1. Pure coordination games (in which $\pi(a, a) > 0$ and $\pi(a, a') = 0$ for each $a \ne a'$): there exists a pure NSC $(\mu, a)$ iff $\pi(a, a) \ge p^2 \cdot \hat\pi$.

2. Prisoner’s Dilemma games (see Table 1): defection is a pure NSC outcome if $p = 0$, cooperation is a pure NSC if $p = 1$ and $g \le c$ (due to Proposition 1 of Heller and Mohlin (2017)), and no pure NSC outcome exists for any $p \in (0, 1)$.

As any profile is 0-efficient, Proposition 2 immediately implies that any strict equilibrium is the outcome of a pure NSC when $p = 0$. An interesting question is, which strict equilibria remain stable for sufficiently low levels of observability? The following corollary shows that a strict equilibrium is robust to low levels of observability essentially iff a unilateral deviation cannot increase the sum of payoffs. Formally:

Definition 5. A symmetric action profile $(a, a)$ is unilaterally efficient in the game $G = (A, \pi)$ if $\bar\pi(a, a) \ge \bar\pi(a, a')$ for each action $a' \ne a \in A$, and it is strictly unilaterally efficient if the inequality is strict.

Corollary 1. Let $(a^*, a^*)$ be a strict equilibrium of the fitness game $G = (A, \pi)$.

1. If $(a^*, a^*)$ is strictly unilaterally efficient, then there exists $\bar p \in (0, 1)$, such that $a^*$ is the outcome of a pure NSC for any $p \in [0, \bar p]$.

2. If $(a^*, a^*)$ is not unilaterally efficient, then there exists $\bar p \in (0, 1)$, such that $a^*$ is not the outcome of a pure NSC for any $p \in (0, \bar p)$.

Remark 2 (Implications for the setup without deception). Dekel, Ely, and Yilankaya (2007) show that a sufficient condition for a pure outcome to be stable for $p \in (0, 1)$ is for the outcome to be both (1) efficient and (2) a strict Nash equilibrium. Proposition 2 weakens these sufficient conditions, by requiring that the pure profile be only p-efficient (rather than efficient); in addition, Prop. 1 shows that these weaker conditions are also necessary for the stability of a pure outcome.
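The conditions appearing in Proposition 2 and Definition 5 are easy to check numerically for a pure symmetric profile. The sketch below is illustrative only: the function names and the Prisoner’s Dilemma payoffs with g = l = 0.5 are assumptions, and the search over deviations is restricted to pure strategies, which suffices if payoffs are extended to mixed strategies as expected payoffs (the objective is then linear in each player’s strategy separately).

```python
from itertools import product


def mix(p, chosen, fixed, actions):
    """Mixed strategy p*chosen + (1-p)*fixed, for pure actions chosen and fixed."""
    return {a: p * (a == chosen) + (1 - p) * (a == fixed) for a in actions}


def payoff(pi, s, s2):
    """Expected fitness of a player using s against an opponent using s2."""
    return sum(s[a] * s2[b] * pi[(a, b)] for a in s for b in s2)


def joint(pi, s, s2):
    """Sum of the two players' expected fitness payoffs."""
    return payoff(pi, s, s2) + payoff(pi, s2, s)


def is_nash(pi, actions, a, strict=False):
    """Is (a, a) a (strict) Nash equilibrium of the symmetric game pi?"""
    deviations = [pi[(b, a)] for b in actions if b != a]
    if strict:
        return all(pi[(a, a)] > v for v in deviations)
    return all(pi[(a, a)] >= v for v in deviations)


def is_p_efficient(pi, actions, a, p, tol=1e-12):
    """Is the pure profile (a, a) p-efficient with respect to itself?"""
    fixed = mix(p, a, a, actions)
    best = max(
        joint(pi, mix(p, c, a, actions), mix(p, c2, a, actions))
        for c, c2 in product(actions, repeat=2)
    )
    return joint(pi, fixed, fixed) >= best - tol


def is_unilaterally_efficient(pi, actions, a):
    """Definition 5: no unilateral deviation from (a, a) raises the payoff sum."""
    return all(2 * pi[(a, a)] >= pi[(a, b)] + pi[(b, a)] for b in actions if b != a)


# Hypothetical Prisoner's Dilemma with g = l = 0.5 (assumed values).
ACTIONS = ("co", "de")
PD = {("co", "co"): 1.0, ("co", "de"): -0.5, ("de", "co"): 1.5, ("de", "de"): 0.0}

for a in ACTIONS:
    print(a, is_nash(PD, ACTIONS, a, strict=True),
          is_p_efficient(PD, ACTIONS, a, 0.5),
          is_unilaterally_efficient(PD, ACTIONS, a))
```

For these assumed payoffs, mutual defection is a strict equilibrium but fails p-efficiency at p = 0.5, whereas mutual cooperation is p-efficient but not an equilibrium, in line with the statement in Example 1 that no pure NSC outcome exists for intermediate p.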

2.4 Games with Uniformly Efficient Actions

In this subsection we study NSCs of games that admit uniformly efficient actions. Action $a$ is uniformly efficient if playing it increases the sum of payoffs regardless of the opponent’s strategy (i.e. if $\bar\pi(a, a') > \bar\pi(a', a')$ for each $a' \ne a$). A few examples of uniformly efficient actions are:

1. cooperation in the Prisoner’s Dilemma (see Table 1);

2. maximal contribution in a public good game;

3. the action “dove” in the common formulations of the “hawk-dove” game; and

4. reporting the maximal suitcase’s weight in the traveler’s dilemma (Basu, 1994).

Observe that if $a$ is uniformly efficient, then $(a, a)$ is the unique p-efficient profile with respect to any profile $(\sigma, \sigma')$ and any $p$. Proposition 1 implies that if the underlying game admits a uniformly efficient action $a$, then in any NSC, any incumbent with the highest cognitive level plays action $a$ when she observes an opponent with the highest cognitive level. Formally:

Corollary 2. Let $G$ be a game that admits a uniformly efficient action $a$. Let $(\mu, b)$ be an NSC. Let $\bar\theta, \bar\theta' \in C(\mu)$ be two incumbent types with the highest cognitive level. Then, $b^N_{\bar\theta}(\bar\theta') = a$.

Our next result focuses on NSCs in which all incumbents have the same cognitive level, and shows that the outcome of any such NSC is that all incumbents play the uniformly efficient action.³

Proposition 3. Let $G$ be a game that admits a uniformly efficient action $a$. Let $(\mu, b)$ be an NSC such that all incumbents have the same cognitive level. Then $b \equiv a$.

Proof. By Corollary 2, $b^N_\theta(\theta') = a$ for each $\theta, \theta' \in C(\mu)$. Assume to the contrary that there exists an incumbent type $\theta \in C(\mu)$ with $b^N_\theta(\emptyset) = \sigma \ne a$. Consider a mutant type $\theta'$ with a pro-generous indifferent utility function and the same cognitive level as the incumbents. There is a focal post-entry configuration that satisfies the following conditions:

1. Incumbents play action $a$ against observed mutants.

2. The mutants play the strategy $p \cdot a + (1 - p) \cdot x$ against strangers and incumbents.

3. The mutants play action $x$ against observed mutants.

Observe that the mutants play, on average, the same distribution of actions as the incumbents when facing other incumbents. As a result, it is indeed a best reply for the incumbents to play against the mutants in the same way they play against incumbents (i.e. to play the action $a$). Next observe that the mutants and the incumbents both get the same expected payoff when being matched against incumbents, but the mutants get a strictly higher payoff when being matched against other mutants. This implies that the mutants strictly outperform the incumbents, and this contradicts the assumption that $(\mu, b)$ is an NSC.

Propositions 2 and 3 jointly imply the following corollary in the setup without deception (in which everyone must have the minimal cognitive level):

Corollary 3. Consider a setup without deception. Let $p \in (0, 1)$. Let $G$ be an underlying game that admits a uniformly efficient action $a$. Then:

1. If $(a, a)$ is a strict Nash equilibrium of the underlying game, then action $a$ is the outcome of a pure NSC, and there exist no other NSC outcomes.

2. If $(a, a)$ is not a Nash equilibrium of the underlying game, then the game does not admit any NSC.

Remark 3. Dekel, Ely, and Yilankaya (2007, page 697) conjecture that the Prisoner’s Dilemma game admits no NSCs (in the setup without deception) when $p$ is close to one (and show that cooperation is not stable for $p < 1$). Corollary 3 confirms and strengthens the conjecture by proving that it holds for any $p \in (0, 1)$ and for any game with a uniformly efficient action.
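The definition of a uniformly efficient action used in this subsection can be checked directly from a payoff matrix. The following minimal sketch is an illustration under stated assumptions: the function names, the Prisoner’s Dilemma payoffs (g = l = 0.5) and the hawk-dove payoffs (resource value 2, fighting cost 4) are hypothetical choices, not values taken from the text.

```python
def joint_payoff(pi, a, b):
    """Sum of the two players' payoffs when one plays a and the other plays b."""
    return pi[(a, b)] + pi[(b, a)]


def is_uniformly_efficient(pi, actions, a):
    """Check joint_payoff(a, b) > joint_payoff(b, b) for every action b != a."""
    return all(
        joint_payoff(pi, a, b) > joint_payoff(pi, b, b)
        for b in actions
        if b != a
    )


# Hypothetical Prisoner's Dilemma with g = l = 0.5 (assumed values).
PD = {("co", "co"): 1.0, ("co", "de"): -0.5, ("de", "co"): 1.5, ("de", "de"): 0.0}

# Hypothetical hawk-dove game with resource value 2 and fighting cost 4 (assumed values).
HD = {("hawk", "hawk"): -1.0, ("hawk", "dove"): 2.0,
      ("dove", "hawk"): 0.0, ("dove", "dove"): 1.0}

print(is_uniformly_efficient(PD, ("co", "de"), "co"))
print(is_uniformly_efficient(PD, ("co", "de"), "de"))
print(is_uniformly_efficient(HD, ("hawk", "dove"), "dove"))
```

Whether a given action passes the check depends on the payoff values, so the outcome for these two games should be read as a property of the assumed numbers.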

2.5 Stable Configurations in the Prisoner’s Dilemma

In this subsection, we characterise stable configurations in Prisoner’s Dilemma games. In a Prisoner’s Dilemma game (as described in Table 1) each player decides simultaneously whether to cooperate (abbr., co) or defect (abbr., de); if both players cooperate they obtain a payoff of one, if both defect they obtain zero, and if one of the players defects, the defector gets $1 + g$, while the cooperator gets $-l$, where $g, l > 0$ and $1 + g - l < 2$ (i.e. mutual cooperation is the efficient outcome).

³ We are grateful to Okan Yilankaya for suggesting an example that led us to the statement and the proof of this result.

Table 1: Matrix Payoffs of Prisoner’s Dilemma Games ($g, l > 0$, $1 + g - l < 2$). In each cell the first entry is the row player’s payoff.

            co              de
  co      1, 1          −l, 1+g
  de    1+g, −l          0, 0

Proposition 4 shows that the Prisoner’s Dilemma game does not admit any NSC for any $p \in (0, 1)$. As discussed before, (1) when $p = 1$ and $g < c$ the unique NSC outcome is everyone cooperating, and (2) when $p = 0$ the unique NSC outcome is everyone defecting.

Proposition 4. Let $G$ be a Prisoner’s Dilemma game and let $p \in (0, 1)$. Then the environment does not admit any NSCs.

The proof (presented in Appendix A.2) shows that in any NSC all incumbents have the same cognitive level. By Proposition 3, this implies that the environment does not admit NSCs.

2.6 Discussion

The main results of the baseline model ($p = 1$) in Heller and Mohlin (2017) show that (1) only efficient profiles can be NSCs, and (2) there exist non-Nash efficient NSCs, provided that the cost of deception is sufficiently large. In this supplementary analysis we have studied the robustness of these results under partial observability. On the one hand, our analysis shows that the first of these main results is robust and that it also holds under partial observability:

1. A somewhat weaker notion of efficiency (namely, p-efficiency) is satisfied by the behaviour of the incumbents with the highest cognitive level in any NSC for any $p > 0$.

2. In games such as the Prisoner’s Dilemma, we show that only the efficient profile might be the outcome of an NSC.

On the other hand, our analysis shows that our second main result of the baseline model is not robust, as we show that non-Nash efficient profiles cannot be NSC outcomes for any $p < 1$ in games like the Prisoner’s Dilemma, even when the effective cost of deception is arbitrarily large. Similarly, we show that non-Nash efficient outcomes cannot be pure NSC outcomes in any game. If a game admits a profile that is both efficient and Nash, then the profile is an NSC outcome for any $p \in [0, 1]$. If the underlying game does not admit such a profile, then our results show that the environment does not admit a pure NSC for any $p \in (0, 1)$, and that games like the Prisoner’s Dilemma do not admit any NSC. This suggests that in order to study stability in such environments one might need to apply weaker solution concepts or to follow a dynamic (rather than static) approach.

A Formal Proofs

A.1 Proof of Proposition 1 (Highest Types’ Behaviour with Partial Observability)

1. Assume to the contrary that

$$\bigl(b^N_{\bar\theta}(\bar\theta'), b^N_{\bar\theta'}(\bar\theta)\bigr) \notin \operatorname{argmax}_{\sigma, \sigma'} \bar\pi\bigl(p \cdot \sigma + (1 - p) \cdot b^N_{\bar\theta}(\emptyset),\; p \cdot \sigma' + (1 - p) \cdot b^N_{\bar\theta'}(\emptyset)\bigr),$$

and let $\sigma_1, \sigma_2$ be two strategies that maximise this expression. Consider two distinct mutant types $\theta_1, \theta_2 \notin C(\mu^*)$ with $n_{\theta_1} = n_{\theta_2} = \bar n$ and indifferent and pro-generous utility functions. Suppose equal fractions of these two mutant types enter the population. There is a focal post-entry configuration that satisfies the following conditions:

(a) Incumbents who observe a mutant opponent of type $\theta_1$ ($\theta_2$) play against such an opponent in the same way they play against type $\bar\theta$ ($\bar\theta'$).

(b) The mutants play fitness-maximising deception equilibria against all lower types.

(c) The mutants of type $\theta_1$ ($\theta_2$) play the same as type $\bar\theta$ ($\bar\theta'$) when playing against strangers, observed (non-deceived) incumbent opponents, and observed mutant opponents with the same type $\theta_1$ ($\theta_2$).

(d) Mutants of type $\theta_i$ play strategy $\sigma_i$ when observing the opponent to be of the other mutant type, $\theta_{3-i}$.

In such a post-entry configuration the mutants earn the same fitness as $\bar\theta$ and $\bar\theta'$ against the incumbents, and a strictly higher expected fitness against the mutants. This implies that $(\mu^*, b^*)$ cannot be an NSC.

2. Assume to the contrary that $\bigl(b^D_{\bar\theta}(\theta), b^D_\theta(\bar\theta)\bigr) \notin FMDE(\bar\theta, \theta)$. Consider mutant type $\hat\theta \notin C(\mu^*)$ with $n_{\hat\theta} = \bar n$ and an indifferent and pro-generous utility function $u_{\hat\theta} \notin C(\mu^*)$. Consider a post-entry configuration in which the incumbents keep playing their pre-entry play among themselves (and against strangers), and the mutants mimic the play of $\bar\theta$, except that they play fitness-maximising deception equilibria against all lower types. The mutants obtain a weakly higher payoff than $\bar\theta$ against all types, and a strictly higher payoff than $\bar\theta$ against some lower types. Thus $(\mu^*, b^*)$ cannot be an NSC.

A.2 Proof of Proposition 4 (the Prisoner’s Dilemma admits no NSCs)

Assume to the contrary that there exists an NSC $(\mu, b)$. The proof includes the following parts:

1. Each pure action is the unique subjective best reply to itself for each incumbent with the highest level.

We begin by showing that de is the unique best reply to itself. Assume to the contrary that there is an incumbent type $\bar\theta \in C(\mu)$ with $n_{\bar\theta} = \bar n$ such that $co \in BR_{u_{\bar\theta}}(de)$. Let $\theta' = (\bar n, u')$ be a mutant type with pro-generous indifferent preferences. Then for any $\varepsilon > 0$ there exists a focal post-entry configuration $(\tilde\mu, \tilde b)$, with $\tilde\mu = (1 - \varepsilon) \cdot \mu + \varepsilon \cdot 1_{\theta'}$, in which:

(a) $\bigl(b^N_{\theta'}(\theta), b^N_\theta(\theta')\bigr) = \bigl(b^N_{\bar\theta}(\theta), b^N_\theta(\bar\theta)\bigr)$ for each type $\theta \in C(\mu) \setminus \{\bar\theta\}$ and $b^N_{\theta'}(\emptyset) = b^N_{\bar\theta}(\emptyset)$, i.e. the mutant $\theta'$ and the incumbent $\bar\theta$ are involved in the same play against any non-deceived incumbent type $\theta \ne \bar\theta$;

(b) $\bigl(b^D_{\theta'}(\theta), b^D_\theta(\theta')\bigr) \in FMDE$ for each $\theta \in C(\mu)$ with $n_\theta < \bar n$, i.e. the mutant $\theta'$ plays fitness-maximising deception equilibria against deceived opponents;

(c) $b^N_{\theta'}(\theta') = co$, i.e. the mutant $\theta'$ cooperates against another mutant; and

(d) $\bigl(b^N_{\theta'}(\bar\theta), b^N_{\bar\theta}(\theta')\bigr) = (de, co)$, i.e. the mutant $\theta'$ defects against an incumbent $\bar\theta$, and the incumbent $\bar\theta$ cooperates against $\theta'$ (this is a subjective best reply of $\bar\theta$ due to the assumption that $co \in BR_{u_{\bar\theta}}(de)$ and the fact that $b^N_{\bar\theta}(\bar\theta) = co$ due to Corollary 2).

Observe that the mutant $\theta'$ achieves the same payoff as $\bar\theta$ against the incumbents, and strictly outperforms it against the mutants, which contradicts $(\mu, b)$ being an NSC. This establishes that de is the unique best reply to itself.

The fact that each incumbent of the highest type cooperates against an observed opponent of the same type with the highest cognitive level implies that action co is a subjective best reply to itself. If action co is a non-strict subjective best reply to itself given the preferences of some type $\bar\theta \in C(\mu)$ with $n_{\bar\theta} = \bar n$, then action co must be subjectively weakly dominated. This implies that after an invasion by a mutant of the highest cognitive level for whom defection is a dominant action, incumbents of type $\bar\theta$ will strictly prefer to defect against strangers, and therefore also against opponents of the same type. This contradicts the existence of a nearby post-entry configuration, and hence contradicts $(\mu, b)$ being an NSC.

2. Whenever one of the agents has a cognitive level strictly below $\bar n$, both players defect.

Part (1) implies that each incumbent of the highest level induces the play of either (co, co) or (de, de) in any deception equilibrium. Proposition 1 implies that each incumbent of the highest type plays a fitness-maximising deception equilibrium. These two implications are only compatible if: (1) each incumbent of the highest level plays (de, de) in any deception equilibrium (the profile (co, co) cannot be a fitness-maximising deception equilibrium because then the player with the highest level could obtain a higher fitness by defecting), and (2) defection is a subjective dominant action for each incumbent with a cognitive level strictly below $\bar n$ (because otherwise the profile (de, de) is not a fitness-maximising deception equilibrium, as the player with the highest level could obtain a higher fitness by inducing the deceived partner to cooperate). This implies that when one of the players has a cognitive level below $\bar n$, both players defect.

3. All incumbents have the same cognitive level (i.e. $n_\theta = \bar n$ for each $\theta \in C(\mu)$).

Assume to the contrary that there are incumbent agents with a cognitive level strictly below $\bar n$. Let $n' < \bar n$ be the second-highest cognitive level in the population (i.e. $n' = \max\{n_\theta \mid \theta \in C(\mu) \text{ and } n_\theta < \bar n\}$). Let $\theta' = (n', u')$ be a mutant agent with pro-generous indifferent preferences. Then, there exists a focal post-entry configuration $(\tilde\mu, \tilde b)$, with $\tilde\mu = (1 - \varepsilon) \cdot \mu + \varepsilon \cdot 1_{\theta'}$, in which:

(a) $b^N_{\theta'}(\theta') = co$, i.e. the mutant $\theta'$ cooperates against another mutant;

(b) $\bigl(b^N_{\theta'}(\theta), b^N_\theta(\theta')\bigr) = (de, de)$ for all $\theta \in C(\mu)$, i.e. both players defect when a mutant meets a non-deceived incumbent;

(c) $b^N_{\theta'}(\emptyset) = de$, i.e. the mutant $\theta'$ defects when seeing an uninformative signal;

(d) $\bigl(b^D_{\theta'}(\theta), b^D_\theta(\theta')\bigr) \in FMDE$ for all $\theta \in C(\mu)$ with $n_\theta < n'$, i.e. the mutant $\theta'$ plays fitness-maximising deception equilibria against deceived opponents of lower level; and

(e) $\bigl(b^D_{\theta'}(\theta), b^D_\theta(\theta')\bigr) \in \{(co, co), (de, de)\}$ for all $\theta \in C(\mu)$ with $n_\theta = \bar n$, i.e. any deceiving incumbent plays either (co, co) or (de, de) against the mutant.

The above properties imply that the mutants strictly outperform the incumbents with cognitive level $n'$ (part (a) leads to a strict advantage for the mutants while parts (b)–(d) ensure that mutants earn no less than incumbents). This contradicts that $(\mu, b)$ is an NSC.

4. Due to Propositions 3 and 2, the fact that all incumbents have the same cognitive level contradicts $(\mu, b)$ being an NSC.

References Basu, K. (1994): “The Traveler’s Dilemma: Paradoxes of Rationality in Game Theory,” American Economic Review (Papers and Proceedings), 84(2), 391–395. Dekel, E., J. C. Ely, and O. Yilankaya (2007): “Evolution of Preferences,” Review of Economic Studies, 74, 685–704. 10

Heller, Y., and E. Mohlin (2017): “Coevolution of Deception and Preferences: Darwin and Nash Meet Machiavelli,” mimeo.
