Evolutionary Game Theory and the Chain Store Paradox

Throughout the fields of philosophy and economics there are numerous paradoxes, especially in the area where these two fields meet: rational choice and game theory. One of these paradoxes, known as the chain store paradox, is based on a seemingly simple game from game theory. Like many similar games, though, it has been used to discuss rationality and has been analyzed from a wide array of perspectives. This paper analyzes the chain store paradox in terms of evolutionary game theory; several variations of the game are first introduced, followed by an evolutionary analysis of the game in both asymmetric (two populations) and symmetric (single population) forms. It is ultimately shown that, under certain conditions, evolutionary dynamics arrive at a conclusion that differs from traditional modular rationality and is more in line with intuition.

The chain store game and the ensuing paradox were first presented by Reinhard Selten in his paper “The Chain Store Paradox” (Selten, 1978). As described by Selten, the game occurs in multiple stages with one long run competitor, who participates in every round, and a collection of short run competitors, each participating in a single stage. The story behind the game is that the long run competitor, player A, is the owner of a chain store with branches in m towns. In each location, there is also a “small businessman”, player k (for k = 1 to m), who has the option of obtaining enough capital to enter the market as a competitor to player A. In each individual stage, player k is the first to act, followed by player A.

Bargar |2

Player A, who is sometimes described as the incumbent or monopolist, will from here on be referred to as the incumbent. Player k, also known as player B, the competitor (Wiseman, 2009), or the entrant (Govindan, 1994), will from here on be referred to as the competitor. The competitor has the option of either entering the market (“in”) or staying out of the market (“out”). If the competitor stays out, the resulting payoffs are independent of the incumbent's intended action. If the competitor enters, however, the incumbent can respond either cooperatively (also known as accommodating or acquiescing) or aggressively (also known as fighting).

While it seems as though every game theorist has their own subtly nuanced payoff structure for this game, the one considered here is Selten's original structure. When the competitor decides to stay out, he has a payoff of 1 (he can use his capital elsewhere) and the incumbent has a payoff of 5 (he maintains his monopoly). If the incumbent decides to accommodate after the competitor enters, then, according to Selten, both players get a payoff of 2. In many variations of the game the payoffs in this state differ – most commonly, the incumbent receives 0 while the competitor receives some arbitrary b > 0 (Fudenberg & Tirole, 1991) or 0

                       Competitor
Incumbent          In          Out
Cooperative        2,2         5,1
Aggressive         0,0         5,1

Table 1: Normal Form of Chain Store Game (payoffs listed as incumbent, competitor)

In his analysis of the game as played with a finite number of iterations, Selten brings up two possible solutions: induction and deterrence. The paradox lies in their disagreement – Govindan describes it as “a phenomenon in which subgame perfection runs counter to one's intuition about the situation” (Govindan, 1994).

In the induction argument, Selten starts from the final stage and shows that the competitor will choose to enter, as it would be irrational for the incumbent to fight his entrance and take a payoff of 0 when they could both have a payoff of 2. By this reasoning, the incumbent's choice of action in the second to last round has no effect on the final round, so the second to last competitor will also enter and the incumbent is compelled to cooperate. By backwards induction, every competitor comes to the same conclusion and enters, to which the incumbent's only rational response is to accommodate.

This can also be seen by considering modular rationality, as described by Skyrms. The concept of modular rationality mandates that a strategy must specify rational choices for each decision in the game (Skyrms, 1996). One example of a strategy that fails this test is “peace by mutually assured destruction,” the strategy of one nation threatening to retaliate against a nuclear strike with an even more massive nuclear strike, effectively annihilating all nations involved. This strategy is considered to be modular irrational because, while the threat may decrease the likelihood of a nuclear strike, actually launching a massive retaliation isn't a rational decision once a strike has occurred. It is in the same vein that the incumbent's aggressive strategy is considered to be modular irrational: once the competitor has entered the market, the incumbent should choose the outcome that gives him a larger payoff.

Selten's other proposed strategy is the one that generally seems more intuitive – deterrence. The incumbent can get a larger payoff over the course of the game if he can effectively deter competitors from entering by fighting all of those that do, until succumbing to induction and being cooperative for the last several rounds (e.g. for 20 rounds total, play aggressive for the first 17) (Selten, 1978).
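The gap between the two solutions can be made concrete with a few lines of arithmetic. The following is a minimal sketch, not code from the paper: it applies backward induction to a single stage using the Table 1 payoffs and then totals the incumbent's payoff over 20 stages. The variable names are illustrative, and the deterrence total rests on the idealized assumption that nobody enters during the 17 stages in which the incumbent is prepared to fight.

```matlab
% Stage-game payoffs from Table 1, written as [incumbent, competitor].
coopIn = [2 2];   % competitor enters, incumbent cooperates
aggIn  = [0 0];   % competitor enters, incumbent fights
outAny = [5 1];   % competitor stays out (incumbent's plan is irrelevant)

% Backward induction on a single stage: the incumbent moves last, so he
% picks whichever reply to entry pays him more.
if coopIn(1) >= aggIn(1)
    replyToEntry = coopIn;
else
    replyToEntry = aggIn;
end
% The competitor anticipates that reply and compares it with staying out.
competitorEnters = replyToEntry(2) > outAny(2);

% Incumbent's total over m = 20 stages under each of the two solutions.
m = 20;
nFight = 17;      % stages in which the incumbent is prepared to fight
inductionTotal = m * replyToEntry(1);
% ASSUMPTION: deterrence works perfectly, so nobody enters in the first
% 17 stages and the last 3 entrants are accommodated.
deterrenceTotal = nFight * outAny(1) + (m - nFight) * coopIn(1);

fprintf('competitor enters: %d\n', competitorEnters);
fprintf('induction: %d, deterrence: %d\n', inductionTotal, deterrenceTotal);
```

Under these assumptions the induction solution gives the incumbent 40 in total and successful deterrence gives 91, which is why deterrence looks attractive even though the threat behind it is not credible stage by stage.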
A common topic of discussion relating to this game is “reputation,” or the incumbent's ability to deter the competitor from entering (Fudenberg & Levine, 1989). In fact, the incumbent can play a mixed strategy: in the case where the competitor's payoffs are 0, b, and -1 for staying out, entering against a cooperative incumbent, and entering against an aggressive incumbent, respectively, the competitor will stay out if he believes that the incumbent will fight him with a probability of $b/(b+1)$ or greater (Fudenberg & Tirole, 1991) (Wiseman, 2009). For Selten's original payoffs, the corresponding frequency is $p \ge 1/2$, where p is the probability of the incumbent fighting. This can be shown by the following inequality, where the left hand side is the competitor's expected value from entering and the right hand side is the guaranteed payoff of staying out:

$$(1 - p)(2) + (p)(0) \le 1,$$

which reduces to

$$p \ge \frac{1}{2}.$$
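The same comparison can be written once for arbitrary competitor payoffs, so that the two payoff structures discussed above become special cases of a single threshold. The symbols $u_{out}$, $u_C$, and $u_A$ are introduced here purely for illustration and do not come from the cited sources; the sketch assumes $u_C > u_A$, so that dividing by $u_C - u_A$ keeps the direction of the inequality.

```latex
% u_out: competitor's payoff for staying out
% u_C, u_A: competitor's payoff for entering against a cooperative / aggressive incumbent
\begin{align*}
(1-p)\,u_C + p\,u_A &\le u_{out} \\
u_C - p\,(u_C - u_A) &\le u_{out} \\
p &\ge \frac{u_C - u_{out}}{u_C - u_A} \\[4pt]
\text{Selten: } (u_{out}, u_C, u_A) = (1, 2, 0)
  &\;\Rightarrow\; p \ge \frac{2-1}{2-0} = \frac{1}{2} \\
\text{Fudenberg \& Tirole: } (u_{out}, u_C, u_A) = (0, b, -1)
  &\;\Rightarrow\; p \ge \frac{b-0}{b+1} = \frac{b}{b+1}
\end{align*}
```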

Applying the principles of evolutionary game theory to the chain store game brings up some interesting results, but first requires a modification to the game's structure. As described by John Maynard Smith, the symmetric evolutionary model requires an infinite, asexually reproducing population competing in “pairwise contests”, with a finite number of possible strategies (Maynard Smith, 1982). In each successive generation, the proportion of competitors playing each strategy is updated as a result of the relative fitnesses of those in the previous generation. Three separate cases will now be discussed.

In the first case, consider a slight modification of these circumstances in which a single incumbent, with probability $p_A$ of playing aggressively, plays against an infinite population of competitors, with a proportion $p_{in}$ entering and $p_{out} = 1 - p_{in}$ staying out. Leaving $p_A$ fixed, the fitness functions for the competitor population are

$$W(in) = (1 - p_A)\,U(in, C) + p_A\,U(in, A),$$
$$W(out) = (1 - p_A)\,U(out, C) + p_A\,U(out, A),$$ and
$$\bar{W} = p_{in}\,W(in) + p_{out}\,W(out),$$

where $U(x, C)$ and $U(x, A)$ denote the competitor's payoff from action $x$ against a cooperative and an aggressive incumbent, respectively. The resulting frequency in the next generation, using Maynard Smith's model, is $p_{in}' = p_{in}\,W(in) / \bar{W}$ (Maynard Smith, 1982). Looking at several combinations of initial conditions, this model confirms the ratio discussed previously: if $p_A > 1/2$, the entire population of competitors will eventually stay out (figure 1), if $p_A < 1/2$ they will all eventually enter (figure 2), and if $p_A = 1/2$ there will be no change in proportions (figure 3).[1]

[Figure 1: “Chain Store Game - P(aggressive) =0.51”. Frequencies (0 to 1) against iterations (0 to 500) for “Competitor: percent in” and “Competitor: percent out”.]

[Figure 2: “Chain Store Game - P(aggressive) =0.49”. Same axes and series as figure 1.]

[Figure 3: “Chain Store Game - P(aggressive) =0.5”. Same axes and series as figure 1.]
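One way to see why these runs need hundreds of iterations when P(aggressive) is close to one half: under the discrete update $p_{in}' = p_{in} W(in)/\bar{W}$, the ratio of entrants to non-entrants is multiplied by the same constant every generation. The sketch below assumes Selten's payoffs and a fixed incumbent; $p_A$ (the incumbent's probability of fighting) and $p_{in}(t)$ (the entering proportion in generation $t$) are notation chosen here for illustration.

```latex
\begin{align*}
\frac{p_{in}(t+1)}{p_{out}(t+1)}
  &= \frac{p_{in}(t)\,W(in)/\bar{W}}{p_{out}(t)\,W(out)/\bar{W}}
   = \frac{W(in)}{W(out)} \cdot \frac{p_{in}(t)}{p_{out}(t)},
  \qquad \frac{W(in)}{W(out)} = \frac{2\,(1 - p_A)}{1} \\[4pt]
p_A = 0.51: &\quad \frac{W(in)}{W(out)} = 0.98, \qquad 0.98^{500} \approx 4 \times 10^{-5} \\
p_A = 0.49: &\quad \frac{W(in)}{W(out)} = 1.02, \qquad 1.02^{500} \approx 2 \times 10^{4} \\
p_A = 0.50: &\quad \frac{W(in)}{W(out)} = 1 \quad \text{(the proportions never change)}
\end{align*}
```

The multiplier is within 2% of one in the first two cases, so the population drifts slowly at first and only approaches fixation after several hundred generations.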

[1] These charts were generated using Matlab; several of the corresponding m-files can be found in the appendix.

In the second case, there are two distinct populations – one of incumbents and one of competitors. Each paired interaction occurs between the two populations, but the changing proportions still rely on the relative fitnesses between members of each population. The fitness functions for the competitor population are the same as before, with the exception that the proportion $p_A$ of aggressive incumbents is no longer fixed. The fitness functions for the incumbent population are defined in the same way:

$$W(C) = p_{in}\,U(C, in) + p_{out}\,U(C, out),$$
$$W(A) = p_{in}\,U(A, in) + p_{out}\,U(A, out),$$ and
$$\bar{W} = p_C\,W(C) + p_A\,W(A),$$

where $p_C = 1 - p_A$ is the proportion of cooperative incumbents, $p_{in}$ and $p_{out}$ are the proportions of competitors entering and staying out, and $U(y, x)$ is the incumbent's payoff from action $y$ when the competitor plays $x$. The difference is that there is now a population of incumbents competing with one another, rather than just a single incumbent. In terms of the story, this is analogous to several chains that are competing nationwide, but don't have any locations in the same town. In this case, most ranges of initial conditions will force an eventual equilibrium in which all competitors enter and all incumbents are cooperative; this is because, as long as there are competitors entering the market, incumbents get a larger payoff from being cooperative, as shown in figure 4. In figure 5, there are initially enough incumbents playing aggressively to drive the competitors out of the market; in the process, though, the number of incumbents playing aggressively slightly decreases, as the cooperators get a larger payout while there are still competitors entering.

[Figure 4: “Chain Store Game”. Frequencies (0 to 1) against iterations (0 to 140) for “Incumbent: percent aggressive” and “Competitor: percent in”.]

[Figure 5: “Chain Store Game”. Frequencies (0 to 1) against iterations (0 to 25) for the same two series.]
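The two-population dynamics described above can be sketched along the lines of the single-incumbent m-file in the appendix. This is an illustrative reconstruction, not the author's script: the variable names and, in particular, the two sets of initial proportions are assumptions chosen to show both regimes, and they are not the values behind figures 4 and 5.

```matlab
% Two-population replicator sketch for the chain store game (Table 1 payoffs).
% ASSUMPTION: the initial proportions below are illustrative only.
n = 200;                      % number of generations
starts = [0.2 0.5;            % row 1: few aggressive incumbents
          0.9 0.5];           % row 2: mostly aggressive incumbents
finalAgg = zeros(1, 2);       % final share of aggressive incumbents
finalIn  = zeros(1, 2);       % final share of entering competitors

for k = 1:2
    pA  = starts(k, 1);       % incumbents playing aggressive
    pIn = starts(k, 2);       % competitors entering
    for x = 2:n
        pC = 1 - pA;
        pOut = 1 - pIn;
        % competitor fitness against the current incumbent mix
        Wi = pC*2 + pA*0;
        Wo = pC*1 + pA*1;
        % incumbent fitness against the current competitor mix
        Wc = pIn*2 + pOut*5;
        Wa = pIn*0 + pOut*5;
        % discrete replicator update, both populations at once
        pInNext = pIn*Wi / (pIn*Wi + pOut*Wo);
        pANext  = pA*Wa  / (pC*Wc + pA*Wa);
        pIn = pInNext;
        pA  = pANext;
    end
    finalAgg(k) = pA;
    finalIn(k)  = pIn;
end

disp([finalAgg; finalIn]);    % columns: start 1, start 2
```

With the first start the population should head toward all competitors entering and all incumbents cooperating; with the second, competitors are driven out while the aggressive share settles somewhat below its starting value, since cooperators do better only while entrants remain.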


However, there is no stable mixed equilibrium of both populations in this version of the game, as the cooperating incumbents will always fare better than the aggressive ones while any competitors are entering, and as soon as the aggressive ones have left, all of the competitors will enter the market (as shown in figure 4).

The final case is that in which there is a single population with four possible strategies, modeled after Skyrms' treatment of the ultimatum game (Skyrms, 1996). Each agent is assumed to play as the incumbent half of the time and as the competitor the other half. A strategy therefore specifies an action for each role, which gives four separate classifications, each of which has been given a name:

Strategy           As incumbent     As competitor
Timid (T)          Cooperative      Out
Ruthless (R)       Aggressive       In
Inductive (N)      Cooperative      In
Defensive (D)      Aggressive       Out

Table 2 – possible strategies

Writing $p_T$, $p_R$, $p_N$, and $p_D$ for the proportions of the four strategies, and dropping the common factor of 1/2 as the appendix code does, the simplified fitness functions come out to $W(T) = 6p_T + 3p_R + 3p_N + 6p_D$, $W(R) = 7p_T + 2p_N + 5p_D$, $W(N) = 7p_T + 2p_R + 4p_N + 5p_D$, and $W(D) = 6p_T + p_R + p_N + 6p_D$. For most combinations of initial conditions consisting of all four strategies, the Inductive strategy comes out on top, as seen in figure 6. However, as in figure 7, if there are enough defensive players initially, they can drive out the inductive strategy by “punishing” them for entering, similar to case two discussed earlier. Interestingly, the timid players “free ride” on the defensive ones – once the ruthless and inductive strategies (the “in” strategies) go extinct, the timid and defensive strategies both get the same payoff, the one for staying out of the market. While the “in” strategies are dying out, the timid strategy does slightly better than the defensive strategy by cooperating instead of being aggressive; the defensive players alone pay the cost of fighting off the inductive players (Skyrms, 1996).
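The four fitness functions follow mechanically from Tables 1 and 2, which the sketch below makes explicit by building the 4-by-4 payoff matrix and running the replicator update without correlation. It is an illustration rather than the appendix code: the matrix names `UA`, `UB`, `S`, and `G`, the equal initial proportions, and the 100 generations are all assumptions made here.

```matlab
% Single-population replicator sketch with the four strategies of Table 2.
% Stage payoffs from Table 1; rows = incumbent action (1 cooperative,
% 2 aggressive), columns = competitor action (1 in, 2 out).
UA = [2 5; 0 5];              % incumbent's payoff
UB = [2 1; 0 1];              % competitor's payoff

% Each strategy as [action as incumbent, action as competitor].
% Order: Timid, Ruthless, Inductive, Defensive.
S = [1 2; 2 1; 1 1; 2 2];

% G(i,j): payoff to strategy i against strategy j, summed over both roles
% (the common factor of 1/2 is dropped; it cancels in the update).
G = zeros(4);
for i = 1:4
    for j = 1:4
        asIncumbent  = UA(S(i,1), S(j,2));   % i is incumbent, j competes
        asCompetitor = UB(S(j,1), S(i,2));   % j is incumbent, i competes
        G(i,j) = asIncumbent + asCompetitor;
    end
end

% ASSUMPTION: equal initial proportions and 100 generations, for illustration.
p = [0.25; 0.25; 0.25; 0.25];
for x = 2:100
    W = G * p;                % fitness of each strategy
    p = p .* W / (p' * W);    % discrete replicator update
end

disp(G);
disp(p');
```

Each row of `G` holds the coefficients of one fitness function, so changing a payoff in Table 1 only requires editing `UA` or `UB`. From an equal start the Inductive strategy should take over, in line with the behavior described for figure 6.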


[Figure 6: “Chain Store Game”. Frequencies (0 to 1) against iterations (0 to 50) for the Timid, Ruthless, Inductive, and Defensive strategies.]

[Figure 7: “Chain Store Game”. Same axes and series as figure 6.]

A few interesting phenomena occur when certain strategies are left out. For instance, the ruthless and timid strategies tend to reach a mixed equilibrium (¼ and ¾, respectively) in the absence of the inductive strategy, as shown in figure 8. If the proportion of defenders is increased, an equilibrium is reached between the defensive and the timid players. Unless defenders are present in a large enough quantity, the inductive strategy always wins.

One variation that can be added to this case is that of correlation: a factor e is introduced, making it more likely for each agent to play against others with the same strategy. This concept is also described in Skyrms, and involves replacing each probability with a conditional probability: $p(S_i \mid S_i) = p(S_i) + e\,p(\neg S_i)$ and $p(S_j \mid S_i) = p(S_j) - e\,p(S_j)$ for $j \ne i$ (Skyrms, 1996). As shown in figure 9, a correlation coefficient as low as 1/3 can result in the inductive strategy dying out when all four start with equal proportions. Interestingly, in this case, the timid strategy eventually becomes the only one left. When the correlation coefficient is increased to 2/3, a stable equilibrium is once again reached between the timid and defensive players (figure 10).
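The ¼ and ¾ split between ruthless and timid players can be checked by hand. The sketch below assumes that only those two strategies are present and that there is no correlation (e = 0); it uses payoffs summed over both roles from Tables 1 and 2 with the common factor of 1/2 dropped, and $p_T$ and $p_R$ denote the two proportions.

```latex
\begin{align*}
W(T) &= (5 + 1)\,p_T + (2 + 1)\,p_R = 6p_T + 3p_R \\
W(R) &= (5 + 2)\,p_T + (0 + 0)\,p_R = 7p_T \\[4pt]
W(T) = W(R),\ p_R = 1 - p_T
  &\;\Rightarrow\; 6p_T + 3(1 - p_T) = 7p_T
  \;\Rightarrow\; p_T = \tfrac{3}{4},\ p_R = \tfrac{1}{4} \\[4pt]
W(R) - W(T) &= 4p_T - 3
\end{align*}
```

The last line shows why the mixture is stable within this pair: when timid players exceed ¾ the ruthless strategy earns more and grows, and when they fall below ¾ the timid strategy earns more.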


[Figure 8: “Chain Store Game”. Frequencies (0 to 1) against iterations (0 to 300) for the Timid, Ruthless, Inductive, and Defensive strategies.]

[Figure 9: “Chain Store Game, e =0.33333”. Frequencies (0 to 1) against iterations (0 to 100) for the same four strategies.]

[Figure 10: “Chain Store Game, e =0.66667”. Frequencies (0 to 1) against iterations (0 to 25) for the same four strategies.]

The common theme throughout all of these examples is that if there are enough incumbents threatening to play aggressively, or there is a single incumbent playing aggressively sufficiently often, all of the competitors will eventually stay out of the market. In the first case, that of a single incumbent, this agrees with the threshold obtained by adapting the condition given by Fudenberg and Tirole and by Wiseman to Selten's payoffs. While this result seems to generally disagree with the induction theory and modular rationality, that isn't an issue: Skyrms states regarding his analysis of the ultimatum game, “a strategy of commitment that fails the test of modular rationality can persist. […] Evolution does not respect modular rationality” (Skyrms, 1996). The same reasoning applies to the chain store paradox – the threat of aggressive play is a seemingly irrational commitment, but, as most people's intuition suggests, it is an effective tool for maximizing the incumbent player's long run payoff.


References

Fudenberg, D., & Levine, D. K. (1989). Reputation and Equilibrium Selection in Games With a Patient Player. Econometrica, 57(4), 759-778.

Fudenberg, D., & Tirole, J. (1991). Game Theory. Cambridge, Mass.: MIT Press.

Govindan, S. (1994). Stability and the Chain Store Paradox. Journal of Economic Theory, 66, 536-547.

Maynard Smith, J. (1982). Evolution and the Theory of Games. Cambridge, UK: Cambridge University Press.

Selten, R. (1978). The Chain Store Paradox. Theory and Decision, 9, 127-159.

Skyrms, B. (1996). Evolution and the Social Contract. Cambridge, UK: Cambridge University Press.

Wiseman, T. (2009). Reputation and Exogenous Private Learning. Journal of Economic Theory, 144, 1352-1357.


Appendix – Matlab m-files

Single Incumbent vs. Competitor Population

function p_in = chainStore_const(n,pc0,pi0,ifPlot)
% n      - number of generations to iterate
% pc0    - fixed probability that the incumbent cooperates
% pi0    - initial proportion of competitors entering
% ifPlot - plot the frequencies if true

%iterated frequencies
t = linspace(1,n,n);

%player B
p_in = linspace(1,n,n);
p_in(1) = pi0;

%Utilities for player B
Uic=2; Uia=0; Uoc=1; Uoa=1;

%iterate n-1 times
for x = 2:n
    %frequencies of player B (named pIn to avoid shadowing the built-in pi)
    pIn=p_in(x-1);
    po=1-pIn;

    %fitness functions player B
    Wi = pc0*Uic + (1-pc0)*Uia;
    Wo = pc0*Uoc + (1-pc0)*Uoa;
    WbarB = pIn*Wi + po*Wo;

    %updated frequencies
    p_in(x) = pIn*Wi/WbarB;
end

%plot frequencies if "ifPlot"
if(ifPlot)
    plot(t,p_in,'k',t,1-p_in,'k:')
    xlabel('Iterations')
    ylabel('Frequencies')
    title(strcat('Chain Store Game - P(aggressive) = ',num2str(1-pc0)))
    legend('Competitor: percent in','Competitor: percent out','Location','EastOutside')
    ylim([0 1]);
    xlim([1 n]);
end
end

Correlated Single Population

function [pT,pR,pN,pD] = chainStore_correlated(n,pT0,pR0,pN0,pD0,e,ifPlot)
% n        - number of generations to iterate
% pT0..pD0 - initial proportions of the four strategies
% e        - correlation factor (0 gives random pairing)
% ifPlot   - plot the frequencies if true

%iterated frequencies
t = linspace(1,n,n);
%Timid
pT = linspace(1,n,n); pT(1) = pT0;
%Ruthless
pR = linspace(1,n,n); pR(1) = pR0;
%Inductive
pN = linspace(1,n,n); pN(1) = pN0;
%Defensive
pD = linspace(1,n,n); pD(1) = pD0;

%Utilities for player A
Uci=2; Uco=5; Uai=0; Uao=5;
%Utilities for player B
Uic=2; Uia=0; Uoc=1; Uoa=1;

%iterate n-1 times
for x = 2:n
    %frequencies
    pt=pT(x-1); pr=pR(x-1); pn=pN(x-1); pd=pD(x-1);

    %fitness functions - disregard the factor of 1/2
    WT = (pt+e*(1-pt))*(Uco + Uoc) + (1-e)*pr*(Uci + Uoa) ...
       + (1-e)*pn*(Uci + Uoc) + (1-e)*pd*(Uco + Uoa);
    WR = (1-e)*pt*(Uao + Uic) + (pr+e*(1-pr))*(Uai + Uia) ...
       + (1-e)*pn*(Uai + Uic) + (1-e)*pd*(Uao + Uia);
    WN = (1-e)*pt*(Uco + Uic) + (1-e)*pr*(Uci + Uia) ...
       + (pn+e*(1-pn))*(Uci + Uic) + (1-e)*pd*(Uco + Uia);
    WD = (1-e)*pt*(Uao + Uoc) + (1-e)*pr*(Uai + Uoa) ...
       + (1-e)*pn*(Uai + Uoc) + (pd+e*(1-pd))*(Uao + Uoa);
    Wbar = pt*WT + pr*WR + pn*WN + pd*WD;

    %updated frequencies
    pT(x) = pt*WT/Wbar;
    pR(x) = pr*WR/Wbar;
    pN(x) = pn*WN/Wbar;
    pD(x) = pd*WD/Wbar;
end

%plot frequencies
if(ifPlot)
    plot(t,pT,t,pR,t,pN,t,pD)
    xlabel('Iterations')
    ylabel('Frequencies')
    title(strcat('Chain Store Game, e = ',num2str(e)))
    legend('Timid','Ruthless','Inductive','Defensive','Location','EastOutside')
    ylim([0 1]);
    xlim([1 n]);
end
end