Abstract

In this paper I study the economics of self-enforcing international environmental agreements in which agents never know exactly what the state of the world is. Specifically, I consider countries that use Bayesian learning to update their beliefs about the state of the world. Using a very simple framework that treats pollution as a common bad, I study how Bayesian learning conveys information to countries and whether full disclosure of information necessarily improves aggregate welfare. Interestingly, I find that the value of information is always negative, which suggests that strategic interactions between countries make them significantly worse off. I also consider a dynamic setting in which countries' emissions can affect the learning process and, surprisingly, I find that the equilibrium breaks down to a coalition that cannot have more than two countries.

1 Introduction

How to cooperate? While research on climate change and other international environmental problems often calls for cooperation among nations to cut carbon emissions, we do not observe many international environmental agreements (IEA) in the world.¹ Since the seminal work by Barrett (1994), scholars have attempted to explain the low participation rates of international environmental agreements.

∗ I would like to thank Rob Williams for his encouragement and valuable suggestions for the paper. I would also like to thank Ian Page and other participants in the class for their feedback. The remaining errors are all my own. Contact: [email protected].

1 See the discussion in Stern (2007) and Barrett (2003).


Barrett (1994) models the problem as a two-stage game in which countries choose whether to become signatories in the first stage and then choose emissions in the second stage. Signatories (those who agreed to join the agreement in the first stage) choose emissions to maximize their joint surplus. Since the agreements are assumed to be self-enforcing, we have to impose internal and external stability conditions to obtain the equilibrium number of signatories (in a stable equilibrium). In most scenarios that Barrett (1994) considers, the maximum equilibrium number of signatories is three.²

There are potentially two problems with such modelling. The first is the lack of treatment of uncertainty. Authors often assume that the cost-benefit ratio is deterministic, meaning that all countries know exactly what it is. There is still controversy over the quantification of atmospheric temperature changes (as well as their potential damage) due to the accumulation of greenhouse gases (Intergovernmental Panel on Climate Change, 2007a,b,c), so it is impossible for governments to know the exact benefits when they decide to cut carbon emissions. Na and Shin (1998) is the first paper that attempts to introduce individual uncertainty into the framework of self-enforcing IEA. Nevertheless, they allow for only three countries, so it is difficult to generalize their results to a broader framework. Kolstad (2007) allows for systematic uncertainty, so that all countries are subject to risk on an identical cost-benefit ratio of carbon emissions abatement.

The second problem is the static nature of the game. In the early models of IEA, scholars consider static models only, because what matters for the interaction of agents is a flow pollutant. More papers have started looking at a stock pollutant instead, and therefore we need a model that turns the IEA into a dynamic game.
Ulph (2004) is one of the papers that features a stock pollutant while also allowing countries to reconsider their position each period, by considering a 2-period version of Barrett's type of game. Rubio and Ulph (2007) consider a similar game with an infinite horizon, while also allowing for dynamic membership. In both papers the cost is a function of the stock of the pollutant instead of the emissions in that period, meaning that if a country chooses not to abate by a sufficient amount, the cost can go up in the next period. It is natural to think about the problem of learning in the dynamic context of IEA, and the above papers also feature certain kinds of learning by agents. Kolstad and Ulph (2008) look at different forms of learning, which differ in the timing at which countries learn the true cost-benefit ratio, using a simple flow-pollutant format.

Regardless, there are still shortcomings in the papers above. In reality, countries never know the true state of the world, yet they have to guess what the cost-benefit ratio is when they make abatement decisions. In other words, countries probably have priors over the true state of the world (in our case, the cost-benefit ratio of abatement), and they update their beliefs (through Bayes' Rule) by observing “events” (catastrophes, hereafter) that suggest what the state of the world is.

Previous papers have mixed results on whether learning can sustain more cooperation. While intuition suggests that the value of information should be positive, as shown in many papers (Na and Shin, 1998; Ulph, 2004), certain kinds of learning that allow countries to take strategic actions after information is revealed can make the value of information negative (Ulph and Maddison, 1997; Kolstad and Ulph, 2008). Since countries never know exactly what the state of the world is, it seems that in some cases strategic actions (which some authors worry can have a negative impact on the number of signatories) are muted. Imagine a case where countries strongly believe that pollution is very costly: this creates a huge incentive to free-ride and hence results in a smaller coalition (in a world where the marginal benefits and costs of polluting are constant). In other words, the outcome is going to depend on the prior and the realization of signals.

Scholars have previously considered Bayesian learning in the environmental economics literature. Kelly and Kolstad (1999) introduce Bayesian learning into an optimal growth model and study how the learning process affects the benefits of pollution control. Karp and Zhang (2006) show that information in the case of anticipated learning has negative value because learning implies a more optimistic view of future damages. My paper is the first to study Bayesian learning in the framework of self-enforcing IEA.

2 Rubio and Ulph (2006) critically review some of the claims made by Barrett (1994), and some of the results are overturned when some assumptions are relaxed.
The result can differ from what these papers found because we are looking at the internal and external stability conditions, under which countries have to balance their benefits and costs of joining and not joining the agreement, and catastrophes are going to affect both benefits and costs. The above papers assume that there is a single regulator who sets the control measures.³

In my model, all countries know that the cost-benefit ratio (of pollution) is either $\gamma^H$ or $\gamma^L$, but they do not know which of the two it is. Instead, they form common priors over $\gamma$. Catastrophes (which are costly) happen according to a probability distribution related to $\gamma$, and countries update their beliefs on $\gamma$ (forming a posterior). I first consider a simple static model in which each country faces a binary choice of either polluting or abating, as in Kolstad (2007), to demonstrate how Bayesian learning affects the equilibrium. It may look odd to talk about a static model while also allowing for Bayesian learning, but the context of the problem is the same in each period, which allows me to focus on a single period.

3 This point is emphasized in Ulph (2004).

I start by characterizing the signatories equilibrium in a setting where the benefits and costs of polluting are linear in private and global emissions respectively. Since the game is finitely repeated, the equilibrium number of signatories in each stage game is a function of the expected cost-benefit ratio, given the information set in each time period. Because this depends on the realization of the catastrophe variable, I run a Monte Carlo simulation to estimate welfare under Bayesian learning. I contrast these welfare measures with the ones obtained when information (in this case, the true cost-benefit ratio) is fully revealed. Surprisingly, I find that countries prefer Bayesian learning to full disclosure of information, because the equilibrium features less cooperation when the cost is high. Countries would like the expected state to converge to the true state faster if the cost is low, and more slowly if the cost is high. In expectation, I find that the latter effect dominates the former.

After setting the stage, I add more dynamics to the model by making the probability that a catastrophe happens a function of emissions one period earlier. In other words, I allow countries' decisions to affect the learning process directly, so that the pollution decision now includes an information tradeoff on top of the cost-benefit tradeoff. I start by describing a two-period framework to focus on the pollution decision in period 1, which can potentially affect the learning process, while agents stick with the same plan in period 2 as described in the static framework. I find that the equilibrium number of signatories can go up, which would imply that Bayesian learning does enforce more cooperation in this case.
In contrast to the static framework, where countries cannot control the learning process, signatories now know that if they decide to pollute, the learning process will be faster, and this harms their expected welfare, as we saw in the static model. There is thus an extra benefit to controlling the emissions of signatories, which drives more cooperation, and hence we observe a bigger coalition.

Still, this model is not a complete illustration of what the world looks like, but it is certainly a big step towards that. This paper only attempts to compare exogenously given benefits and costs, instead of modelling how the impact of greenhouse gas control diffuses through the economy (and hence results in benefits and costs). Understanding how this diffusion works is very important in designing climate change and other kinds of environmental policies, and it will certainly be important to look at the distributional issues of international environmental agreements as well.

The rest of the paper is organized as follows. In Section 2, I look at a simple “static” model of a self-enforcing international environmental agreement. After presenting the IEA equilibria under different learning paths, I compare the corresponding welfare using Monte Carlo simulations. In Section 3, I present another framework in which emissions can directly affect the Bayesian learning process. Analytically, I look at a simple two-period model so that the equilibrium properties in Section 2 carry over. I conclude in Section 4.

2 “Static” Model

Consider a world with $N$ identical countries. Both the benefits and the costs of pollution are assumed to be linear in emissions; however, the benefit is linear in private emissions while the cost is linear in global emissions. Each country can choose either to pollute ($q_i = 1$) or not ($q_i = 0$). The ratio of costs to benefits is $\gamma$, which is not known to countries. Nonetheless, countries do know that $\gamma$ can take the value of either $\gamma^H$ or $\gamma^L$, where $1 > \gamma^H > \gamma^L$. $\gamma$ not only governs the cost-benefit ratio, but also determines the frequency of catastrophes. Catastrophes $z_t$ can take a value of either $Z$ or 0. Under the high cost state ($\gamma^H$), catastrophes $z_t = Z$ happen more often,⁴ with probability $p^H > p^L$. Bear in mind that I normalize the benefit of pollution to 1, so $Z$ and $\gamma$ can be thought of as the relative costs of catastrophes and global pollution, respectively. In this static model, I assume that both $p^H$ and $p^L$ are constant and that this is common knowledge to all countries. I assume that all countries have a common prior $\alpha$ that the cost state is high ($\gamma^H$).

The timing of the game is as follows. At the beginning of the period, catastrophes are revealed (so $z_t = Z$ or $z_t = 0$) and all countries update their beliefs over $\gamma$. After that, countries simultaneously decide whether to become a signatory or not (the ‘membership phase’). Signatories then come together and decide whether to pollute or not, and all signatories have to follow that decision. Non-signatories, on the other hand, simultaneously and non-cooperatively decide whether to pollute or not (the ‘pollution phase’). This timing allows us not to consider the economic cost of catastrophes, because catastrophes have already happened with exogenous probabilities by the time of coalition negotiation, and decisions at time $t$ do not affect the realization of catastrophes at time $t + 1$. Effectively, country $i$'s payoff is

$$V_i = q_i - \gamma (q_i + Q_{-i}) \tag{1}$$

where $Q_{-i}$ is the total emissions of all countries other than $i$. I denote by $\gamma_t$ the expected value of $\gamma$ given the information set at time $t$, i.e.

$$\gamma_t = \hat{\alpha}_t \gamma^H + (1 - \hat{\alpha}_t) \gamma^L \tag{2}$$

where $\hat{\alpha}_t$ is the posterior probability that countries hold after updating with the information at time $t$, contingent on the value of $z_t$.

4 This assumption is equivalent to the monotone likelihood ratio condition required for agents to translate the information into learning about the state.

2.1 Non-Cooperative and Cooperative Equilibria

Throughout the paper, I make the following assumption:

Assumption 1 $1 > \gamma^H > \gamma^L > \frac{1}{N}$

This assumption guarantees an interior solution of the membership game. If the cost-benefit ratio is bigger than one, no country would find it optimal to pollute. On the other hand, if the cost-benefit ratio is smaller than $1/N$, even a coalition consisting of all countries in the world (that is, $n = N$, the full coalition and cooperative equilibrium) will not find it optimal to abate, since

$$W_t^C(n = N, \tilde{Q} = 1) = 1 - N\gamma_t > 0$$

Under Assumption 1, the non-cooperative solution is straightforward. Since benefits outweigh the costs, all countries will choose to pollute. This is also the optimal strategy for non-signatories. The aggregate payoff (in each period) is given by

$$W_t^{NC} = N(1 - \gamma_t N) \tag{3}$$

As derived earlier, the cooperative equilibrium is for all countries to abate. The aggregate welfare is 0.

2.2 IEA Equilibrium - Full Learning Case

After characterizing the cooperative and non-cooperative equilibria, I now discuss the equilibrium with an international environmental agreement (IEA). Before moving to the Bayesian learning case, it is convenient to outline the full learning case, as in Kolstad and Ulph (2008), so that we can contrast the welfare of the two equilibria in the next subsection. In the full learning case, countries learn the true cost-benefit ratio $\gamma$ before both the membership phase and the pollution phase of the first period.

Let $V^S(n)$ and $V^N(n)$ be the payoff to each signatory and non-signatory respectively, given that there are $n$ signatories in the coalition. Here are some definitions that will be useful.

Definition $I(x)$ gives the smallest integer that is strictly greater than $x$.

Definition The IEA with $n$ signatories is internally stable if $V^S(n) > V^N(n - 1)$.

Definition The IEA with $n$ signatories is externally stable if $V^N(n) > V^S(n + 1)$.

The internal and external stability conditions guarantee that we have a Nash equilibrium in the (sub)game. If the IEA is internally stable, no signatory would find it optimal to switch and become a non-signatory; if the IEA is externally stable, no non-signatory would find it optimal to join the coalition and become a signatory. Whenever I refer to an equilibrium number of signatories $n^*$ in this paper, it means that $n^*$ satisfies both the internal and external stability conditions. The following lemma will be useful in many of the propositions that follow.

Lemma 1 For a given coalition of $n$ signatories and a given $\gamma$, it is NOT optimal for signatories to pollute if $\gamma > 1/n$.

Proof Signatories know that non-signatories always pollute. If signatories decide to pollute, each of them has a payoff of $V^S(n) = 1 - \gamma N$, since overall emissions will be $N$. If signatories decide not to pollute, each of them has a payoff of $V^S(n) = -\gamma(N - n)$, as only the $(N - n)$ non-signatories pollute. Therefore, it is not optimal for signatories to pollute if $-\gamma(N - n) > 1 - \gamma N \Rightarrow \gamma > 1/n$.

The result below replicates the result in Kolstad (2007).

Proposition 1 Under Assumption 1 and the full learning mechanism, the expected number of signatories in equilibrium is $\alpha I(\frac{1}{\gamma^H}) + (1 - \alpha) I(\frac{1}{\gamma^L})$ and it is unique.

Proof Start at the time when all countries know what the true state is. Since all countries know the true $\gamma$ at the membership phase, I simply use $\gamma$ here to denote the true value. When $n = I(1/\gamma)$, we have $n > 1/\gamma$. From Lemma 1, this implies that all signatories will agree not to pollute. More importantly, from the definition of the $I(\cdot)$ function, $n - 1 < 1/\gamma$. In other words, if a single signatory leaves the coalition, the remaining signatories will choose to pollute. The internal stability condition hence becomes $V^S(n) > V^N(n - 1) \Rightarrow -\gamma(N - n) > 1 - \gamma N \Rightarrow \gamma > 1/n$, which is satisfied. External stability is satisfied under Assumption 1: $V^N(n) > V^S(n + 1) \Rightarrow 1 - \gamma(N - n) > -\gamma(N - (n + 1)) \Rightarrow 1 > \gamma$. The uniqueness of the equilibrium comes from the fact that $n$ is selected so that it is just optimal for signatories not to pollute at the margin. If there are more signatories than $n$, then at least one country would find it optimal to “free-ride”, due to the fact that $\gamma < 1$. To complete the proof, we move the game back to the time when the uncertainty is not yet resolved: the expected number of signatories in equilibrium is just the expectation of the number of signatories in the two possible subgames.
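As an illustration of Lemma 1 and the two stability definitions, the following Python sketch enumerates coalition sizes and keeps those that are both internally and externally stable. The function names, the treatment of n = N (external stability is taken to hold vacuously because no outsider is left), and the parameter values in the usage note are assumptions made for this example only; it is a minimal sketch, not the procedure used in the paper.

```python
import math


def smallest_integer_above(x):
    """I(x): the smallest integer strictly greater than x."""
    return math.floor(x) + 1


def signatories_abate(n, gamma):
    """Lemma 1: a coalition of n signatories abates if gamma > 1/n."""
    return n > 0 and gamma * n > 1


def global_emissions(n, gamma, n_countries):
    # Non-signatories always pollute; signatories pollute unless Lemma 1 applies.
    return n_countries - n if signatories_abate(n, gamma) else n_countries


def payoff_signatory(n, gamma, n_countries):
    q = 0 if signatories_abate(n, gamma) else 1
    return q - gamma * global_emissions(n, gamma, n_countries)


def payoff_nonsignatory(n, gamma, n_countries):
    return 1 - gamma * global_emissions(n, gamma, n_countries)


def stable_coalitions(gamma, n_countries):
    """Return all coalition sizes that satisfy both stability conditions."""
    stable = []
    for n in range(1, n_countries + 1):
        internal = (payoff_signatory(n, gamma, n_countries)
                    > payoff_nonsignatory(n - 1, gamma, n_countries))
        # Assumption: with n = N there is no outsider, so external stability
        # is treated as vacuously satisfied.
        external = n == n_countries or (
            payoff_nonsignatory(n, gamma, n_countries)
            > payoff_signatory(n + 1, gamma, n_countries))
        if internal and external:
            stable.append(n)
    return stable
```

For example, with the assumed values gamma = 0.3 and 30 countries, `stable_coalitions(0.3, 30)` contains a single coalition size, which coincides with `smallest_integer_above(1 / 0.3)`.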

This standard result implies that in the high cost-benefit ratio state the number of signatories is lower, which implies that aggregate welfare ($W = (N - n^*)(1 - N\gamma) < 0$) is lower. This is primarily driven by the strategic actions of non-signatories. This result will also be useful for the analysis in the next subsection, where we focus on Bayesian learning.

2.3 IEA Equilibrium - Bayesian Learning Case

If we look carefully at the tradeoff facing both non-signatories and signatories, we notice that countries never need to learn the true cost-benefit ratio: they base all their decisions on the expected value of the cost. Since the decision in period $t$ (whether to pollute or not) does not affect the learning process, the damage, or the decision in period $t + 1$, I can analyze this game ‘statically’: all periods can be treated in the same way, except that the expected cost ($\gamma_t$) differs in each subgame. We can view the game as a finite repetition of the signatories stage game, and we know that the Nash equilibrium of the repeated game is equivalent to the Nash equilibrium of the individual stage games. Given that the uncertainty is “systematic”, as it applies to all countries equally, we can simply repeat the earlier exercise of showing internal and external stability, replacing the true $\gamma$ (which we had in the previous case) with its expected value $\gamma_t$. By Bayes' Rule and given the prior $\alpha$, if countries observe a catastrophe,

$$\hat{\alpha}(z_1 = Z) = \frac{p^H \alpha}{p^L + (p^H - p^L)\alpha} \tag{4a}$$

Similarly, if countries do not observe a catastrophe,

$$\hat{\alpha}(z_1 = 0) = \frac{(1 - p^H)\alpha}{1 - (p^L + (p^H - p^L)\alpha)} \tag{4b}$$
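The updating rule in (4a) and (4b), together with the expected cost-benefit ratio in (2), can be sketched in a few lines of Python. The function names and the numerical values below are illustrative assumptions, not part of the model's specification.

```python
def posterior_high(alpha, p_high, p_low, catastrophe):
    """Posterior probability of the high cost state, following (4a) and (4b)."""
    # Unconditional probability of observing a catastrophe under belief alpha.
    prob_catastrophe = p_low + (p_high - p_low) * alpha
    if catastrophe:
        return p_high * alpha / prob_catastrophe
    return (1 - p_high) * alpha / (1 - prob_catastrophe)


def expected_gamma(alpha_hat, gamma_high, gamma_low):
    """Expected cost-benefit ratio, following (2)."""
    return alpha_hat * gamma_high + (1 - alpha_hat) * gamma_low


# Assumed illustrative values: prior 0.5, p^H = 0.2, p^L = 0.05.
after_catastrophe = posterior_high(0.5, 0.2, 0.05, catastrophe=True)
after_quiet_period = posterior_high(0.5, 0.2, 0.05, catastrophe=False)
```

With these assumed values, a catastrophe moves the belief above the prior and a quiet period moves it below, so the expected cost-benefit ratio rises or falls accordingly.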

The rest of the problem is almost identical to the one we saw in the full learning case. Therefore it is straightforward to prove the following.

Proposition 2 Under Assumption 1 and the Bayesian learning mechanism, the number of signatories is $I(\frac{1}{\gamma_t})$ in each period and it is unique.

We should notice that the equilibrium number of signatories potentially differs across periods when we use a Bayesian learning mechanism. It is intuitive why the number of signatories changes: as agents learn about the true cost-benefit ratio, they update their priors and form an updated expectation of the cost-benefit ratio. We can interpret the Bayesian mechanism (Bayes) as a slow convergence to the true state, and full disclosure (Full) as an immediate convergence. Therefore, the above result motivates us to look at the value of information (in a rough sense) in this context: if all countries are told the true state, what would the improvement in aggregate welfare be? The value of information in this setting can be positive or negative: information generally has positive value because it allows agents to act early, while strategic actions could bring the value of information down to negative.⁵

| High / Bayes | High / Full | Low / Bayes | Low / Full | Exp Bayes | Exp Full | $\gamma^L/\gamma^H$ | $p^L/p^H$ | $\alpha$ |
|---|---|---|---|---|---|---|---|---|
| -7293.1 | -7793.6 | -2590.4 | -2485.2 | -3530.9 | -3546.9 | 0.2/0.5 | 0.05/0.2 | 0.2 |
| -7515.2 | -7793.6 | -2684.0 | -2485.2 | -5099.6 | -5139.4 | 0.2/0.5 | 0.05/0.2 | 0.5 |
| -3880.6 | -4135.4 | -957.58 | -795.26 | -2419.1 | -2465.3 | 0.1/0.3 | 0.05/0.2 | 0.5 |
| -3832.7 | -4135.4 | -956.62 | -795.26 | -2394.7 | -2465.3 | 0.1/0.3 | 0.05/0.1 | 0.5 |

Note: ‘Bayes’ estimates are computed using Monte Carlo simulations with 1000 observations. ‘Full’ scenarios are computed by assuming that countries are told the true cost-benefit ratio before the membership phase of the first period begins. Expected values (‘Exp’) are then calculated by weighting the corresponding high and low values, using the priors as weights. The payoffs do not include the economic cost of catastrophes because it enters both the ‘Bayes’ and ‘Full’ scenarios equally. The following parameters are used in all specifications: (1) Number of Countries = 30, (2) Discount Factor = 0.95, (3) Time periods = 100.

Table 1: Aggregate Welfare under Different Scenarios and Specifications

In light of that, I estimate the expected aggregate welfare using Monte Carlo simulations of the discounted sum of aggregate welfare (because the timing of information crucially determines welfare). Table 1 above compares aggregate welfare under four different scenarios using different parameters. The very surprising result is that the expected value of aggregate welfare under full disclosure is always lower than the one under Bayesian learning. This acts as an indirect piece of evidence that the value of information under the IEA framework can be negative. If we investigate the numbers more closely, the main result is primarily driven by the difference in aggregate welfare when the cost-benefit ratio is high. The argument above goes through here: since the strategic actions between agents make the number of signatories low (which harms aggregate welfare), agents would not want to learn the information when the cost is high. On the other hand, when the cost is low, agents would like the equilibrium to converge faster, hence we observe a higher aggregate welfare under full disclosure.

We also find interesting results when we look across rows. Comparing rows 1 and 2, when countries are more “optimistic” (by having a lower prior $\alpha$), the difference between the values of information increases when the cost is high while it shrinks when the cost is low. This is mostly driven by the change in the convergence rate when $\alpha$ decreases: if the true state is low (high), convergence is faster (slower), and the value of information changes accordingly, as in the discussion above. I get similar results when I change the $\gamma$ and $p$ pairs. Aggregate welfare increases as a result of a decrease in cost.

5 It is inconclusive in the literature on self-enforcing IEA whether information has positive or negative value. This is left for further theoretical research, as other papers only attempt to quantify the value of information in their corresponding settings. See the discussion in Kolstad and Ulph (2008).
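The welfare comparison described above can be sketched as a small Monte Carlo exercise in Python. This is a hedged illustration rather than the paper's actual simulation code: the function names, the seed, the choice to evaluate per-period welfare $(N - n)(1 - N\gamma)$ at the true $\gamma$, discounting from $t = 0$, and the use of the strict definition of $I(x)$ at integer values of $1/\gamma$ are all assumptions.

```python
import math
import random


def smallest_integer_above(x):
    """I(x): the smallest integer strictly greater than x."""
    return math.floor(x) + 1


def posterior_high(alpha, p_high, p_low, catastrophe):
    """Bayesian update of the belief in the high cost state, as in (4a)-(4b)."""
    prob_catastrophe = p_low + (p_high - p_low) * alpha
    if catastrophe:
        return p_high * alpha / prob_catastrophe
    return (1 - p_high) * alpha / (1 - prob_catastrophe)


def period_welfare(n_signatories, gamma_true, n_countries):
    # W = (N - n)(1 - N * gamma), evaluated at the true gamma (assumption).
    return (n_countries - n_signatories) * (1 - n_countries * gamma_true)


def full_disclosure_welfare(gamma_true, n_countries=30, delta=0.95, periods=100):
    """Discounted welfare when the true gamma is known from the first period."""
    n = smallest_integer_above(1 / gamma_true)
    return sum(delta ** t * period_welfare(n, gamma_true, n_countries)
               for t in range(periods))


def bayes_welfare(state_is_high, gamma_high, gamma_low, p_high, p_low, alpha,
                  n_countries=30, delta=0.95, periods=100, draws=1000, seed=0):
    """Average discounted welfare over simulated catastrophe histories."""
    rng = random.Random(seed)
    gamma_true = gamma_high if state_is_high else gamma_low
    p_true = p_high if state_is_high else p_low
    total = 0.0
    for _ in range(draws):
        belief = alpha
        for t in range(periods):
            # Catastrophe is revealed first, then beliefs and the coalition adjust.
            catastrophe = rng.random() < p_true
            belief = posterior_high(belief, p_high, p_low, catastrophe)
            gamma_t = belief * gamma_high + (1 - belief) * gamma_low
            n = smallest_integer_above(1 / gamma_t)
            total += delta ** t * period_welfare(n, gamma_true, n_countries)
    return total / draws
```

Under these assumptions, `full_disclosure_welfare(0.3)` is about -4135.4. In the high state the simulated Bayesian welfare can never fall below the full-disclosure value, because the expected cost never exceeds the true cost and the coalition is therefore never smaller; the reverse ordering holds in the low state.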

3 Dynamic Model

In this section, I introduce some dynamics into the simple model in order to study the repeated version of the static game above. There are two basic departures that we have to take into account. First, the probability of a catastrophe next period is a function of current global emissions. To keep things simple, I assume that the probability that a catastrophe happens under the high cost state depends on total emissions in the previous period, while the probability remains constant under the low cost state, i.e.

$$\tilde{p}^L_{t+1} = p^L \tag{5a}$$

$$\tilde{p}^H_{t+1}(Q_t) = p^L + \frac{Q_t}{N}(p^H - p^L) \equiv p^L + \Lambda^p \frac{Q_t}{N} \tag{5b}$$

where $\Lambda^p = p^H - p^L > 0$. Assume that emissions before the first period are $N$, so that $\tilde{p}^H_1(Q_0) = p^H$.

Under the convenient timing of the game, countries observe both the catastrophes and the number of signatories before they make the pollution decision. This implies, however, that the next period posterior probability is going to depend on the current emissions since

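$$\hat{\alpha}_{t+1}(z_{t+1} = Z) = \frac{\tilde{p}^H_{t+1}\hat{\alpha}_t}{\tilde{p}^L_{t+1} + (\tilde{p}^H_{t+1} - \tilde{p}^L_{t+1})\hat{\alpha}_t} = 1 - \frac{(1 - \hat{\alpha}_t)p^L}{p^L + \Lambda^p \hat{\alpha}_t Q_t/N} \tag{6a}$$

$$\hat{\alpha}_{t+1}(z_{t+1} = 0) = \frac{(1 - \tilde{p}^H_{t+1})\hat{\alpha}_t}{1 - (\tilde{p}^L_{t+1} + (\tilde{p}^H_{t+1} - \tilde{p}^L_{t+1})\hat{\alpha}_t)} = 1 - \frac{(1 - \hat{\alpha}_t)(1 - p^L)}{1 - (p^L + \Lambda^p \hat{\alpha}_t Q_t/N)} \tag{6b}$$

By taking derivatives with respect to $Q_t$, we get the following corollary:

Corollary 3 (a) $\frac{\partial \hat{\alpha}_{t+1}(z_{t+1} = Z)}{\partial Q_t} > 0$; (b) $\frac{\partial \hat{\alpha}_{t+1}(z_{t+1} = 0)}{\partial Q_t} < 0$.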

This corollary will be useful in the subsequent analysis of the equilibrium. It follows from the fact that, by polluting more, the difference between $\tilde{p}^H$ and $\tilde{p}^L$ increases. This increases the volatility of the learning process: if a catastrophe happens, countries can more strongly believe that the state of the world is high cost, and the same argument goes through in the opposite direction if countries observe no catastrophe. Consequently, higher pollution generally increases the convergence rate of the equilibria.

Recall that I assume that countries renegotiate the size of the coalition in each period, which gives an equilibrium number of signatories. As noted in the last section, the size of the coalition can differ across periods; as a result, this paper features a dynamic membership game. A question one can then ask is: what do countries expect to get in period 2 when they are in period 1? I follow the simple approach used by Rubio and Ulph (2007) and assume a random assignment rule, so that countries endogenize the effect of emissions on the expected payoff (in terms of both signatory status and catastrophes).⁶ If the equilibrium number of signatories in period 2 is large, then a country in period 1 also takes into account that it has a higher chance of becoming a signatory in period 2.

Second, we now have to deal with the economic cost of catastrophes $z_t$. It matters because, when countries are deciding whether to pollute, pollution now leads to a higher chance of a catastrophe next period, and consequently the expected cost goes up in the next period. To avoid confusion and build the results step by step, I first assume this effect away in the first part of this section. I then bring this extra cost back and see how the results change.

I start the analysis by looking at the decision of signatories, assuming that non-signatories always pollute. I then come back and evaluate the claim that non-signatories always pollute, and study when this assumption holds. To present my argument in the simplest way, I use a simple 2-period model in which global pollution in period 1 affects the posterior probabilities in period 2. I conclude this section by studying how the results change when we depart from a two-period model.
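The emission-dependent updating rule in (5a)-(5b) and (6a)-(6b) can be checked numerically. The Python sketch below computes the two posteriors over a grid of emission levels so that the signs in Corollary 3 can be inspected; the function name and all parameter values are assumptions chosen for illustration.

```python
def posterior_high_endogenous(alpha, p_high, p_low, emissions, n_countries,
                              catastrophe):
    """Posterior of the high cost state when emissions shift the catastrophe
    probability in the high state, following (5b), (6a) and (6b)."""
    lam = p_high - p_low                                   # Lambda^p
    p_tilde_high = p_low + lam * emissions / n_countries   # (5b)
    prob_catastrophe = p_low + lam * alpha * emissions / n_countries
    if catastrophe:
        return p_tilde_high * alpha / prob_catastrophe               # (6a)
    return (1 - p_tilde_high) * alpha / (1 - prob_catastrophe)       # (6b)


# Assumed illustrative values: N = 30, current belief 0.5, p^H = 0.2, p^L = 0.05.
emission_levels = range(0, 31)
after_catastrophe = [
    posterior_high_endogenous(0.5, 0.2, 0.05, q, 30, True)
    for q in emission_levels
]
after_quiet_period = [
    posterior_high_endogenous(0.5, 0.2, 0.05, q, 30, False)
    for q in emission_levels
]
```

In this example the posterior after a catastrophe rises with emissions and the posterior after a quiet period falls with emissions; with zero emissions the signal carries no information and both posteriors equal the current belief.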

3.1 Two-Period Model

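The value to each signatory at period 1 can be written as follows:

$$V_1^S(\hat{\alpha}_1, n) = \max_{q_1 \in \{0,1\}} \left\{ q_1 - \gamma_1(\hat{\alpha}_1) \cdot (N - n(1 - q_1)) + \delta E_1 V_2 \right\} \tag{7}$$

where

$$\begin{aligned} E_1 V_2 &= [\hat{\alpha}_1 \tilde{p}^H_2(Q_1) + (1 - \hat{\alpha}_1)\tilde{p}^L_2] \cdot [V_2(\hat{\alpha}_2(z_2 = Z), n_2(z_2 = Z)) - Z] \\ &\quad + [\hat{\alpha}_1(1 - \tilde{p}^H_2(Q_1)) + (1 - \hat{\alpha}_1)(1 - \tilde{p}^L_2)] \cdot V_2(\hat{\alpha}_2(z_2 = 0), n_2(z_2 = 0)) \\ &= [p^L + \hat{\alpha}_1 \Lambda^p Q_1/N] \cdot [V_2(\hat{\alpha}_2(z_2 = Z), n_2(z_2 = Z)) - Z] \\ &\quad + [1 - (p^L + \hat{\alpha}_1 \Lambda^p Q_1/N)] \cdot V_2(\hat{\alpha}_2(z_2 = 0), n_2(z_2 = 0)) \end{aligned} \tag{8}$$

and

$$V_2(\alpha, n) = \frac{1 - n}{N} V_2^N(\alpha, n) + \frac{n}{N} V_2^S(\alpha, n) \tag{9}$$

6 Rubio and Ulph (2007) also describe other ways that one can model this in a dynamic programming setting.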

We solve the game backwards, starting from the last period. The payoffs in the last period are identical to the ones we solved for in the previous section; we expect the number of signatories to be equal to $I(1/\gamma_2)$ in the stage game. The following lemma will be useful.

Lemma 2 If there is a significant number of non-signatories or the change in $\gamma_2$ is small, $V_2(\alpha, n)$ is decreasing in $\gamma_2$.

Proof I treat $n$ as a differentiable function of $\gamma$ to keep the notation simple. Using (9) and the results in the previous section, we can write $V_2(\alpha, n)$ as

$$V_2(\alpha, n_2) = \frac{n_2}{N}(-\gamma_2(N - n_2)) + \frac{1 - n_2}{N}(1 - \gamma_2(N - n_2)) = \frac{1}{N}(-n_2 + 1 - \gamma_2(N - n_2)) \tag{10}$$

Taking the derivative with respect to $\gamma_2$ (and taking into account the fact that $n_2$ is a function of $\gamma_2$),

$$\frac{\partial V_2}{\partial \gamma_2} = -(1 - \gamma_2)\frac{\partial n_2}{\partial \gamma_2} - (N - n_2 - 1) \tag{11}$$

Given the fact that $\frac{\partial n_2}{\partial \gamma_2} < 0$, we cannot sign the derivative in general. However, under the condition that $N - n_2$ is large, or that $\frac{\partial n_2}{\partial \gamma_2}$ may be 0 for small changes in $\gamma$, the second effect dominates and hence the derivative is negative.

Define $V_2^Z \equiv V_2(\hat{\alpha}_2(z_2 = Z), n_2(z_2 = Z))$, $V_2^0 \equiv V_2(\hat{\alpha}_2(z_2 = 0), n_2(z_2 = 0))$ and $\Lambda^{V_2} \equiv V_2^Z - V_2^0$. Following from the fact that $\gamma$ and $\alpha$ are positively correlated and from Corollary 3, it is straightforward to prove the following.

Corollary 4 $V_2^Z$, $V_2^0$ and $\Lambda^{V_2}$ have the following properties: (a) $\Lambda^{V_2} < 0$; (b) $V_2^Z$ is decreasing in $Q_1$; (c) $V_2^0$ is increasing in $Q_1$; (d) $\Lambda^{V_2}$ is decreasing in $Q_1$.

The signs implied by this corollary are intuitive. When a catastrophe happens, beliefs are updated such that the expected cost increases. This results in a potential decrease in the equilibrium number of signatories, hence global pollution in period 2 increases and harms welfare (and the opposite holds when no catastrophe happens). Global pollution in period 1 controls the convergence rate of $\gamma$. In the case where information is beneficial (no catastrophe), more pollution in period 1, which results in more information, creates an extra gain in welfare. Similarly, when information is harmful (a catastrophe happens), more pollution in period 1 creates an extra cost in welfare.

Now we can start analyzing the signatories' decision. Signatories will NOT pollute if and only if

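$$\begin{aligned} &-\gamma_1(N - n) + \delta V_2^0|_{lowQ} + \delta(p^L + \hat{\alpha}_1\Lambda^p - \hat{\alpha}_1\Lambda^p n/N)\Lambda^{V_2}|_{lowQ} - \delta[p^L + \hat{\alpha}_1\Lambda^p(N - n)/N]Z \\ &\quad > 1 - \gamma_1 N + \delta V_2^0|_{highQ} + \delta(p^L + \hat{\alpha}_1\Lambda^p)\Lambda^{V_2}|_{highQ} - \delta[p^L + \hat{\alpha}_1\Lambda^p]Z \end{aligned} \tag{12}$$

Denote $\Delta x = x|_{highQ} - x|_{lowQ}$. Using Corollaries 3 and 4,

$$\begin{aligned} n\gamma_1 &> 1 + \underbrace{\delta\Delta V_2^0}_{>0} + \underbrace{\delta(p^L + \hat{\alpha}_1\Lambda^p)\Delta\Lambda^{V_2}}_{<0} + \underbrace{\delta(\hat{\alpha}_1\Lambda^p n/N)\Lambda^{V_2}|_{lowQ}}_{<0} - \underbrace{\delta\hat{\alpha}_1\Lambda^p nZ/N}_{>0} \\ &= 1 + \underbrace{\delta(1 - (p^L + \hat{\alpha}_1\Lambda^p))\Delta V_2^0}_{>0} + \underbrace{\delta(p^L + \hat{\alpha}_1\Lambda^p)\Delta V_2^Z}_{<0} + \underbrace{\delta(\hat{\alpha}_1\Lambda^p n/N)\Lambda^{V_2}|_{lowQ}}_{<0} - \underbrace{\delta\hat{\alpha}_1\Lambda^p nZ/N}_{>0} \end{aligned} \tag{13}$$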

Similarly, non-signatories will pollute if

$$1 + \delta(1 - (p^L + \hat{\alpha}_1\Lambda^p(N - n)/N))\Delta V_2^0 + \delta(p^L + \hat{\alpha}_1\Lambda^p(N - n)/N)\Delta V_2^Z + \delta(\hat{\alpha}_1\Lambda^p/N)\Lambda^{V_2}|_{lowQ} > \gamma_1 + \delta(\hat{\alpha}_1\Lambda^p/N)Z \tag{14}$$

Notice that the high $Q$ and low $Q$ scenarios for non-signatories imply $Q_1 = N - n$ and $Q_1 = N - n - 1$ respectively. In this subsection, I assume that catastrophes are not costly, i.e. $Z = 0$, so the fourth term in (13) drops out of the subsequent analysis.

Looking at (13), it is not immediately obvious how this IEA equilibrium differs from the one we found in the static model. The first term is the change in value when no catastrophe happens (due to a potential increase in signatories and decrease in global pollution). The difference is positive because period-1 pollution contributes to the updating of information and the update is beneficial to society. The second term is the change when a catastrophe happens; it is negative because the update is harmful to society, as it results in a potential increase in global pollution. These two terms isolate the effect by which global pollution in period 1 affects only the potential number of signatories in period 2. The third term takes into account the second effect of global pollution in period 1, namely the increase in the chance that a catastrophe happens in period 2; hence it is negative. If the sum of these three terms is greater than zero, the number of signatories is at least as large as in the IEA equilibrium of the static model. The following intermediate result shows that if learning is slow and the expected $\gamma_t$ does not change much across scenarios (whether pollution is high or low, and whether or not a catastrophe happens in period 2), so that the number of signatories is the same in all four cases, then the three terms in (13) sum to zero.

Lemma 3 Suppose $\Delta x = x|_{highQ} - x|_{lowQ}$ where $highQ: Q_1 = K$ and $lowQ: Q_1 = K - h$. If the number of signatories in period 2 is the same under all scenarios,

$$\Theta \equiv \delta(1 - (p^L + \hat{\alpha}_1\Lambda^p K/N))\Delta V_2^0 + \delta(p^L + \hat{\alpha}_1\Lambda^p K/N)\Delta V_2^Z + \delta(\hat{\alpha}_1\Lambda^p h/N)\Lambda^{V_2}|_{lowQ} = 0$$

Proof See Appendix.

We can apply the lemma to the problem of signatories by setting $K = N$ and $h = n$. The intuition is that when countries fully consider all the possibilities (with the appropriate probabilities), if global emissions in period 2 are going to be the same anyway (since $n_2$ is always the same), then the net effect is zero when signatories in period 1 consider whether or not to pollute. This striking result also implies that internal stability is satisfied. With some algebra, the external stability condition is also satisfied, because non-signatories are looking at the same difference in expected period-2 values as the signatories (using the lemma again, with $K$ and $h$ replaced accordingly).
By studying (14) carefully (and recalling that we assumed $Z = 0$ here), nonsignatories will always pollute. This leads to the following corollary, which states that the result of the previous section carries over.

Corollary 5 Under (i) Assumption 1, (ii) the dynamic Bayesian learning mechanism, (iii) the number of signatories in period 2 being the same under all scenarios, and (iv) catastrophes involving no economic loss, the equilibrium number of signatories in period 1 is $I(1/\gamma_1)$ and it is unique.

Can we use the above to generalize some results? Let $n_2^Z$ ($n_2^0$) be the number of signatories when a catastrophe in period 2 is (not) revealed. I am going to make the following assumption throughout the paper:

Assumption 2 The pollution decision of signatories cannot change the number of signatories in any state of $z_2$.

This assumption reduces the problem to one dimension: signatories only consider whether or not a catastrophe happens in the next period, and they can only affect the chance of ending up in each state of $z_2$ through $p^H$.$^{7}$ We know that when catastrophes are revealed, the number of signatories can decrease because the expected $\gamma$ is larger. The following lemma presents another result of the paper:

Lemma 4 Under Assumption 2,

$$
\Theta = \frac{m\delta}{N}\,(\hat\alpha_1\Lambda^P h/N)\,(1-\gamma^H) \ge 0,
$$

where $m$ is the (positively-defined) difference in the equilibrium number of signatories depending on whether or not catastrophes are revealed in period 2. When $m > 0$, $\Theta > 0$.

Proof See Appendix.

This is the main result of the paper. Equations (13) and (14) can be boiled down to

$$
\text{Signatories:}\quad n\gamma_1 > 1 + \Theta^S \tag{15}
$$

$$
\text{Non-signatories:}\quad 1 + \Theta^N > \gamma_1 \tag{16}
$$

where $\Theta^S \equiv \frac{m\delta}{N}(\hat\alpha_1\Lambda^P n/N)(1-\gamma^H) \ge 0$ and $\Theta^N \equiv \frac{m\delta}{N}(\hat\alpha_1\Lambda^P/N)(1-\gamma^H) \ge 0$. Condition (16) always holds, and similarly external validity also holds. Internal validity implies that the number of signatories in this case is $n^* = I\bigl(\frac{1+\Theta^S}{\gamma_1}\bigr) \ge I\bigl(\frac{1}{\gamma_1}\bigr)$. This is summed up in the following proposition.
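As an illustration of how (15) can raise the coalition size, here is a minimal Python sketch. It assumes the equilibrium is the smallest coalition satisfying the signatories' condition, and it solves the inequality directly because $\Theta^S$ itself depends on $n$. The function names and all parameter values are hypothetical and chosen only for illustration.

```python
def theta_s(n, m, delta, alpha1, lam_p, N, gamma_high):
    """Theta^S from the text: (m*delta/N) * (alpha1*lam_p*n/N) * (1 - gamma_high)."""
    return (m * delta / N) * (alpha1 * lam_p * n / N) * (1.0 - gamma_high)


def smallest_coalition(gamma1, N, m=0, delta=0.0, alpha1=0.0, lam_p=0.0, gamma_high=0.0):
    """Smallest n in 1..N with n*gamma1 > 1 + Theta^S(n); None if there is none.

    With m = 0 the learning term vanishes and the condition reduces to
    n*gamma1 > 1 (assumed here to correspond to the static benchmark).
    """
    for n in range(1, N + 1):
        if n * gamma1 > 1.0 + theta_s(n, m, delta, alpha1, lam_p, N, gamma_high):
            return n
    return None


# Hypothetical numbers, for illustration only.
static_n = smallest_coalition(gamma1=0.26, N=5)
dynamic_n = smallest_coalition(gamma1=0.26, N=5, m=3, delta=0.95,
                               alpha1=0.3, lam_p=0.9, gamma_high=0.4)
print(static_n, dynamic_n)
```

With these illustrative numbers the learning term pushes the smallest self-enforcing coalition from 4 to 5 members, in line with the direction of the inequality $n^* \ge I(1/\gamma_1)$.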

Proposition 6 Under Assumptions 1 and 2, and if catastrophes involve no economic loss, the equilibrium number of signatories in period 1 under the dynamic Bayesian framework cannot be smaller than that under the static Bayesian framework.

The results from this section lead us to think more about how the convergence of beliefs about the state enters the decision-making of agents. As the simulations in the previous part showed, countries prefer the convergence to be slower in expectation. In the static model of Section 2, countries have no way to change the learning process itself, because all catastrophes occur according to a predetermined exogenous process. In this dynamic setting, however, countries know that they can alter the learning process by choosing whether or not to pollute. The marginal signatory now has an extra cost of leaving the coalition: if she decides to leave, not only does global pollution go up (which is a cost to her), but the faster convergence of the Bayesian learning process also creates an extra cost, since it lowers her expected welfare if the number of signatories goes down.

$^{7}$ It is very hard to quantify the assumption because the number of signatories in the next period is a non-differentiable function of current emissions. I omit this for the sake of presentation.

3.2

When Catastrophes are Costly

Now let us bring costly catastrophes back into the picture. Non-signatories will pollute if

$$
1 - \gamma_1 + \frac{\delta\hat\alpha_1\Lambda^P}{N}\left[\frac{m}{N}(1-\gamma^H) - Z\right] > 0
\tag{17}
$$

Equivalently, using (15), signatories will not pollute if

$$
\begin{aligned}
n\gamma_1 &> 1 + \frac{m\delta}{N}(\hat\alpha_1\Lambda^P n/N)(1-\gamma^H) - \delta\hat\alpha_1\Lambda^P nZ/N \\
&= 1 + \delta\hat\alpha_1\Lambda^P\,\frac{n}{N}\left[\frac{m}{N}(1-\gamma^H) - Z\right]
\end{aligned}
\tag{18}
$$

Therefore, whether the number of signatories increases or decreases as a result of the Bayesian learning mechanism depends on the relative strength of learning and cost. As shown earlier in Lemma 3, if the number of signatories does not change, $m = 0$ and the term in brackets is always negative, hence the equilibrium number of signatories will decrease. If (17) holds, we can rearrange it to get

$$
\frac{\delta\hat\alpha_1\Lambda^P}{N}\left[\frac{m}{N}(1-\gamma^H) - Z\right] > -(1-\gamma_1)
\tag{19}
$$

Substituting (19) into (18), we get

$$
\begin{aligned}
n\gamma_1 &> 1 + \delta\hat\alpha_1\Lambda^P\,\frac{n}{N}\left[\frac{m}{N}(1-\gamma^H) - Z\right] > 1 + n\bigl(-(1-\gamma_1)\bigr) \\
n\gamma_1 &> 1 - n + n\gamma_1 \quad\Rightarrow\quad n > 1
\end{aligned}
\tag{20}
$$

(20) is satisfied for any coalition of signatories that consists of more than one member. This is a very surprising result. When non-signatories always pollute, signatories will never pollute. It also implies that any member of the coalition now has an incentive to free-ride. The external validity condition for non-signatories is always satisfied under Lemma 4 and the assumption that they always pollute. To satisfy the internal validity condition, the maximum number of signatories can only be 2. The result is summarized in the following proposition:

Proposition 7 Under Assumption 2 and when catastrophes are costly, if non-signatories always pollute, the equilibrium number of signatories in the dynamic Bayesian learning framework cannot exceed 2.

The intuition is that both non-signatories and signatories are looking at the same margin: the trade-off between the benefit-of-information term $\Theta$ and the cost of catastrophes $Z$. Once the cost of catastrophes is low enough that non-signatories are willing to pollute, it also follows that signatories always find polluting more costly.
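The implication in (20) can be checked numerically. The sketch below is only a sanity check under stated assumptions: it encodes conditions (17) and (18) with the shared per-country term $\frac{\delta\hat\alpha_1\Lambda^P}{N}[\frac{m}{N}(1-\gamma^H) - Z]$, draws hypothetical parameter values at random, and searches for a one-member coalition that satisfies both conditions. It verifies only that (17) and (18) together require $n > 1$, not the full equilibrium argument. All function names and parameter ranges are illustrative.

```python
import random


def per_country_term(delta, alpha1, lam_p, N, m, gamma_high, Z):
    """delta*alpha1*lam_p/N * [(m/N)(1 - gamma_high) - Z], shared by (17) and (18)."""
    return delta * alpha1 * lam_p / N * ((m / N) * (1.0 - gamma_high) - Z)


def nonsignatory_pollutes(gamma1, x):
    """Condition (17): 1 - gamma1 + x > 0."""
    return 1.0 - gamma1 + x > 0.0


def signatories_abate(n, gamma1, x):
    """Condition (18): n*gamma1 > 1 + n*x."""
    return n * gamma1 > 1.0 + n * x


def search_one_member_coalition(trials=20000, seed=0):
    """Look for parameters where (17) and (18) both hold with n = 1."""
    rng = random.Random(seed)
    for _ in range(trials):
        N = rng.randint(2, 20)
        x = per_country_term(delta=rng.random(), alpha1=rng.random(),
                             lam_p=rng.random(), N=N, m=rng.randint(0, N),
                             gamma_high=rng.random(), Z=rng.uniform(0.0, 5.0))
        gamma1 = rng.random()
        if nonsignatory_pollutes(gamma1, x) and signatories_abate(1, gamma1, x):
            return gamma1, x
    return None


print(search_one_member_coalition())
```

For example, with the hypothetical values `gamma1 = 0.5` and `x = -0.1`, condition (17) holds, a coalition of two satisfies (18), and a coalition of one does not.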

3.3

Generalization to More Periods

If the game is finitely repeated, we can treat the analysis above as the last two periods $(T-1, T)$ of the model. The results of the last subsection are 'robust' to the prior belief entering the period ($\hat\alpha_t$), provided we are willing to assume that non-signatories always pollute in all periods. When $t = T-2$, given that the number of signatories is always 2, Lemma 3 implies that $\Theta = 0$, and the trade-off for signatories is essentially

$$
n\gamma_1 > 1 - \delta\hat\alpha_{T-2}\Lambda^P nZ/N
\tag{21}
$$

and the equilibrium number of signatories will be lower than in our static setting. Qualitatively, by the above arguments we should expect the number of signatories to be 2 in all (finite) periods except the last. It is much harder to analyze the results qualitatively for more than 3 periods: one has to rely on a computational model to solve the dynamic programming problem. The reason is that the results are contingent on the value and realizations of catastrophes, so that $\hat\alpha_t$ responds differently. As illustrated in (21), $\hat\alpha_t$ and $Z$ directly affect the number of signatories through the internal validity condition. In other words, the period-2 problem in a two-period model also involves this whole chain of implications from the pollution decision in period $t$. I leave this part for interested readers.
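One way to see why (21) lowers the coalition size relative to the static setting is to collect the terms in $n$. This is a sketch that takes (21) as given and assumes $\delta$, $\hat\alpha_{T-2}$, $\Lambda^P$ and $Z$ are all positive:

```latex
\begin{align*}
n\gamma_1 > 1 - \delta\hat\alpha_{T-2}\Lambda^P \frac{nZ}{N}
\;\Longleftrightarrow\;
n\Bigl(\gamma_1 + \frac{\delta\hat\alpha_{T-2}\Lambda^P Z}{N}\Bigr) > 1
\;\Longleftrightarrow\;
n > \frac{1}{\gamma_1 + \delta\hat\alpha_{T-2}\Lambda^P Z/N}.
\end{align*}
```

Under these assumptions the threshold on the right is below $1/\gamma_1$, so the smallest coalition satisfying (21) is no larger than in the static case.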


4

Concluding Remarks

Models of self-enforcing international environmental agreements (IEAs) started from a simple static and deterministic framework. Scholars have since tried to make the model more realistic by relaxing some assumptions and considering other components, such as uncertainty about costs (Na and Shin, 1998; Kolstad, 2007), a stock pollutant instead of a flow pollutant (Ulph, 2004; Rubio and Ulph, 2007), and heterogeneity and learning mechanisms (Kolstad and Ulph, 2008). This paper is the first to consider Bayesian learning in the framework of self-enforcing IEAs.

I first consider a rough form of Bayesian learning in which an exogenous process determines the learning process. Using this 'naive' framework, I am able to compare aggregate expected welfare under Bayesian learning with that under full disclosure. I find that expected welfare is always higher under Bayesian learning. I then consider another Bayesian learning process over which agents have some control. I assume, for simplicity, that when agents pollute they become 'more' informed about the state of the world (while assuming that catastrophes carry no cost). As I argue in the paper, the simulation exercise in the static model suggests that countries 'prefer' a slower learning process. As a consequence, pollution creates an extra cost for the marginal signatory, and this increases the size of the coalition. When I allow for costly catastrophes, I show that the equilibrium number of signatories collapses to a maximum coalition of 2.

This paper is just a start for this fruitful branch of research in the literature on self-enforcing IEAs. It allows readers to think more about how learning shapes the equilibrium. Of course, the paper does not intend to represent reality, but it offers some important insights that earlier papers missed.

I show that information potentially has a negative expected value. It will be interesting to see whether such a negative value of information is robust to the settings of the model. As I mentioned earlier, modelling the pollutant as a stock pollutant would be one big step closer to representing reality. I leave this as a potentially interesting topic for enthusiastic readers. I am aware that the last result of my paper depends on the particular setting of my model, so I look forward to work in which other scholars bring Bayesian learning into models of international environmental agreements differently. In particular, information can potentially enter and affect individuals in many ways other than through a cost shock.


Appendix

Proof of Lemma 3

Denote by $n$ and $n_2$ the number of signatories in periods 1 and 2 respectively. To simplify the analysis, let $ZH$ denote high $Q_1$ and $z_2 = Z$, and similarly for $0H$, $0L$ and $ZL$. Define $P \equiv p^L + \hat\alpha_1\Lambda^P K/N$ and $\Lambda^\gamma \equiv \gamma^H - \gamma^L > 0$. Using (6a) and (6b), we have

$$
\hat\alpha_2^{0H} = 1 - \frac{(1-\hat\alpha_1)(1-p^L)}{1-P} \tag{22a}
$$

$$
\hat\alpha_2^{ZH} = 1 - \frac{(1-\hat\alpha_1)\,p^L}{P} \tag{22b}
$$

$$
\hat\alpha_2^{0L} = 1 - \frac{(1-\hat\alpha_1)(1-p^L)}{1-P+\Lambda^P\hat\alpha_1 h/N} \tag{22c}
$$

$$
\hat\alpha_2^{ZL} = 1 - \frac{(1-\hat\alpha_1)\,p^L}{P-\Lambda^P\hat\alpha_1 h/N} \tag{22d}
$$
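The four posteriors in (22a)-(22d) can be evaluated with a single function. The following Python sketch is illustrative only: the function name and the parameter values are hypothetical, and it assumes that $\hat\alpha$ is the belief attached to the high-cost state, with $P = p^L + \hat\alpha_1\Lambda^P Q_1/N$ the unconditional probability of a catastrophe.

```python
def posterior(alpha1, p_low, lam_p, q1, N, catastrophe):
    """Posterior belief alpha_hat_2 after period 1, following (22a)-(22d).

    q1 is period-1 global pollution (K for high Q, K - h for low Q).
    """
    p_cat = p_low + alpha1 * lam_p * q1 / N  # unconditional catastrophe probability
    if catastrophe:
        return 1.0 - (1.0 - alpha1) * p_low / p_cat
    return 1.0 - (1.0 - alpha1) * (1.0 - p_low) / (1.0 - p_cat)


# Hypothetical parameter values, for illustration only.
alpha1, p_low, lam_p, N, K, h = 0.4, 0.1, 0.5, 10, 10, 4

a_0H = posterior(alpha1, p_low, lam_p, K, N, catastrophe=False)      # (22a)
a_ZH = posterior(alpha1, p_low, lam_p, K, N, catastrophe=True)       # (22b)
a_0L = posterior(alpha1, p_low, lam_p, K - h, N, catastrophe=False)  # (22c)
a_ZL = posterior(alpha1, p_low, lam_p, K - h, N, catastrophe=True)   # (22d)

# Unconditional catastrophe probability under high Q, used to average the posteriors.
P_high = p_low + alpha1 * lam_p * K / N
print(a_0H, a_ZH, a_0L, a_ZL, P_high * a_ZH + (1.0 - P_high) * a_0H)
```

With these numbers a catastrophe raises the belief and its absence lowers it, the update is more extreme when $Q_1$ is high, and the probability-weighted average of the two posteriors returns the prior $\hat\alpha_1$.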

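By our assumption that $n_2$ is the same under all scenarios and the definitions of $\Delta V_2^0$ and $\Delta V_2^Z$,

$$
\begin{aligned}
\Delta V_2^0 &= V_2^0\big|_{\mathrm{high}Q} - V_2^0\big|_{\mathrm{low}Q} \\
&= \frac{1}{N}\,(\gamma_2^{0L} - \gamma_2^{0H})(N-n_2) \\
&= \frac{1}{N}\,\Lambda^\gamma (N-n_2)\,(\hat\alpha_2^{0L} - \hat\alpha_2^{0H})
\end{aligned}
\tag{23a}
$$

$$
\Delta V_2^Z = \frac{1}{N}\,\Lambda^\gamma (N-n_2)\,(\hat\alpha_2^{ZL} - \hat\alpha_2^{ZH})
\tag{23b}
$$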

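Using (22a)-(22d),

$$
\hat\alpha_2^{0L} - \hat\alpha_2^{0H} = \frac{(1-\hat\alpha_1)(1-p^L)}{1-P} - \frac{(1-\hat\alpha_1)(1-p^L)}{1-P+\Lambda^P\hat\alpha_1 h/N} = \frac{(1-\hat\alpha_1)(1-p^L)\,\Lambda^P\hat\alpha_1 h/N}{(1-P)(1-P+\Lambda^P\hat\alpha_1 h/N)} > 0
\tag{24a}
$$

$$
\hat\alpha_2^{ZL} - \hat\alpha_2^{ZH} = \frac{(1-\hat\alpha_1)\,p^L}{P} - \frac{(1-\hat\alpha_1)\,p^L}{P-\Lambda^P\hat\alpha_1 h/N} = -\frac{(1-\hat\alpha_1)\,p^L\,\Lambda^P\hat\alpha_1 h/N}{P\,(P-\Lambda^P\hat\alpha_1 h/N)} < 0
\tag{24b}
$$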

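Combining (23a), (23b), (24a), (24b) and simplifying, we get

$$
\delta(1-P)\,\Delta V_2^0 + \delta P\,\Delta V_2^Z = \frac{\delta}{N^2}\,\Lambda^\gamma\Lambda^P (N-n_2)(1-\hat\alpha_1)\,\hat\alpha_1 h\;\underbrace{\left[\frac{1-p^L}{1-P+\Lambda^P\hat\alpha_1 h/N} - \frac{p^L}{P-\Lambda^P\hat\alpha_1 h/N}\right]}_{A}
\tag{25}
$$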

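Using the definition of $P$,

$$
\begin{aligned}
A &= \frac{1}{C}\left[(1-p^L)P - \Lambda^P\hat\alpha_1 h/N - p^L + P p^L\right] \\
&= \frac{\Lambda^P\hat\alpha_1 (K-h)/N}{C} > 0
\end{aligned}
\qquad \text{where } C \equiv (1-P+\Lambda^P\hat\alpha_1 h/N)(P-\Lambda^P\hat\alpha_1 h/N)
\tag{26}
$$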

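Now we deal with the remaining term. By derivations similar to those above,

$$
\begin{aligned}
\Lambda V_2\big|_{\mathrm{low}Q} &= \frac{1}{N}\,\Lambda^\gamma (N-n_2)\,(\hat\alpha_2^{0L} - \hat\alpha_2^{ZL}) \\
&= \frac{1}{N}\,\Lambda^\gamma (N-n_2)\left[\frac{(1-\hat\alpha_1)\,p^L}{P-\Lambda^P\hat\alpha_1 h/N} - \frac{(1-\hat\alpha_1)(1-p^L)}{1-P+\Lambda^P\hat\alpha_1 h/N}\right] \\
&= \frac{1}{N}\,\Lambda^\gamma (N-n_2)\,\frac{(1-\hat\alpha_1)}{C}\left[p^L - P + \Lambda^P\hat\alpha_1 h/N\right] \\
&= -\frac{1}{N}\,\Lambda^\gamma (N-n_2)\,\frac{(1-\hat\alpha_1)}{C}\,\Lambda^P\hat\alpha_1 (K-h)/N < 0
\end{aligned}
\tag{27}
$$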

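Putting all the pieces together (with $P = p^L + \hat\alpha_1\Lambda^P K/N$ as defined above),

$$
\begin{aligned}
&\delta(1-P)\,\Delta V_2^0 + \delta P\,\Delta V_2^Z + \delta(\hat\alpha_1\Lambda^P h/N)\,\Lambda V_2\big|_{\mathrm{low}Q} \\
&\quad = \frac{\delta}{N^2}\,\Lambda^\gamma\Lambda^P (N-n_2)(1-\hat\alpha_1)\,\hat\alpha_1 h\,\frac{\Lambda^P\hat\alpha_1 (K-h)/N}{C} - \frac{\delta}{N^2}\,(\hat\alpha_1\Lambda^P h)\,\Lambda^\gamma (N-n_2)\,\frac{(1-\hat\alpha_1)}{C}\,\Lambda^P\hat\alpha_1 (K-h)/N \\
&\quad = 0
\end{aligned}
$$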
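Lemma 3 can also be checked numerically. The sketch below is a sanity check under stated assumptions rather than part of the proof: it evaluates $\Theta$ term by term from the posteriors in (22a)-(22d), and it takes the per-country period-2 value to be $V_2 = \frac{1}{N}(-n_2 + 1 - \gamma_2(N-n_2))$ with $\gamma_2 = \gamma^L + \hat\alpha_2\Lambda^\gamma$, which is the form that appears in the proof of Lemma 4. The function name and parameter values are hypothetical.

```python
def theta_same_n2(N, K, h, n2, alpha1, p_low, lam_p, gamma_low, gamma_high, delta):
    """Theta from Lemma 3 when the period-2 coalition size n2 is the same in all states."""
    lam_g = gamma_high - gamma_low

    def p_cat(q1):
        # Unconditional probability of a catastrophe given period-1 pollution q1.
        return p_low + alpha1 * lam_p * q1 / N

    def post(q1, catastrophe):
        # Posterior beliefs (22a)-(22d).
        if catastrophe:
            return 1.0 - (1.0 - alpha1) * p_low / p_cat(q1)
        return 1.0 - (1.0 - alpha1) * (1.0 - p_low) / (1.0 - p_cat(q1))

    def v2(alpha2):
        # Assumed per-country period-2 value, linear in the posterior belief.
        gamma2 = gamma_low + alpha2 * lam_g
        return (-n2 + 1.0 - gamma2 * (N - n2)) / N

    v_0h, v_0l = v2(post(K, False)), v2(post(K - h, False))
    v_zh, v_zl = v2(post(K, True)), v2(post(K - h, True))
    P = p_cat(K)
    return delta * ((1.0 - P) * (v_0h - v_0l)              # no-catastrophe term
                    + P * (v_zh - v_zl)                    # catastrophe term
                    + (alpha1 * lam_p * h / N) * (v_zl - v_0l))  # probability-shift term


# Hypothetical parameter values, for illustration only.
theta_signatories = theta_same_n2(N=10, K=10, h=4, n2=3, alpha1=0.4, p_low=0.1,
                                  lam_p=0.5, gamma_low=0.2, gamma_high=0.6, delta=0.9)
theta_nonsignatory = theta_same_n2(N=10, K=7, h=1, n2=3, alpha1=0.4, p_low=0.1,
                                   lam_p=0.5, gamma_low=0.2, gamma_high=0.6, delta=0.9)
print(theta_signatories, theta_nonsignatory)
```

Both values are zero up to floating-point error. One way to read this is that, with $n_2$ fixed, $V_2$ is linear in the posterior, so its expectation depends only on the prior and not on $Q_1$.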

Proof of Lemma 4

Assume that $n_2^Z = n_2^0 - m$ where $m \ge 0$, by the arguments made above that the equilibrium number of signatories is smaller if catastrophes are revealed. If $m = 0$ the results follow from Lemma 3, so I focus on the case $m > 0$. Recall the results in (23a) and (23b),

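$$
\Delta V_2^0 = \frac{1}{N}\,\Lambda^\gamma (N-n_2^0)\,(\hat\alpha_2^{0L} - \hat\alpha_2^{0H})
\tag{28a}
$$

$$
\begin{aligned}
\Delta V_2^Z &= \frac{1}{N}\,\Lambda^\gamma (N-n_2^Z)\,(\hat\alpha_2^{ZL} - \hat\alpha_2^{ZH}) \\
&= \frac{1}{N}\,\Lambda^\gamma (N-n_2^0)\,(\hat\alpha_2^{ZL} - \hat\alpha_2^{ZH}) + \frac{m}{N}\,\Lambda^\gamma\,(\hat\alpha_2^{ZL} - \hat\alpha_2^{ZH})
\end{aligned}
\tag{28b}
$$

$$
\begin{aligned}
\Lambda V_2\big|_{\mathrm{low}Q} &= \frac{1}{N}\bigl(-n_2^Z + 1 - \gamma_2^{ZL}(N-n_2^Z)\bigr) - \frac{1}{N}\bigl(-n_2^0 + 1 - \gamma_2^{0L}(N-n_2^0)\bigr) \\
&= \frac{1}{N}\,\Lambda^\gamma (N-n_2^0)\,(\hat\alpha_2^{0L} - \hat\alpha_2^{ZL}) + \frac{m}{N}\,(1-\gamma_2^{ZL})
\end{aligned}
\tag{28c}
$$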

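Using (2) and (22d),

$$
\begin{aligned}
\gamma_2^{ZL} &= \gamma^L + \hat\alpha_2^{ZL}\Lambda^\gamma \\
&= \gamma^L + \left[1 - \frac{(1-\hat\alpha_1)\,p^L}{P-\Lambda^P\hat\alpha_1 h/N}\right]\Lambda^\gamma \\
&= \gamma^H - \frac{(1-\hat\alpha_1)\,p^L}{P-\Lambda^P\hat\alpha_1 h/N}\,\Lambda^\gamma \\
1 - \gamma_2^{ZL} &= (1-\gamma^H) + \frac{(1-\hat\alpha_1)\,p^L}{P-\Lambda^P\hat\alpha_1 h/N}\,\Lambda^\gamma
\end{aligned}
\tag{29}
$$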

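Substituting (24b), (28b), (28c) and (29) into the last terms in (13), and using Lemma 3 to cancel the terms with $(N-n_2^0)$,

$$
\begin{aligned}
\Theta &= \delta\bigl(1-(p^L+\hat\alpha_1\Lambda^P K/N)\bigr)\Delta V_2^0 + \delta\bigl(p^L+\hat\alpha_1\Lambda^P K/N\bigr)\Delta V_2^Z + \delta(\hat\alpha_1\Lambda^P h/N)\,\Lambda V_2\big|_{\mathrm{low}Q} \\
&= -\frac{m\delta}{N}\,\frac{(1-\hat\alpha_1)\,p^L\,\Lambda^P\hat\alpha_1 h/N}{P-\Lambda^P\hat\alpha_1 h/N}\,\Lambda^\gamma + \frac{m\delta}{N}\,(\hat\alpha_1\Lambda^P h/N)(1-\gamma^H) + \frac{m\delta}{N}\,(\hat\alpha_1\Lambda^P h/N)\,\frac{(1-\hat\alpha_1)\,p^L}{P-\Lambda^P\hat\alpha_1 h/N}\,\Lambda^\gamma \\
&= \frac{m\delta}{N}\,(\hat\alpha_1\Lambda^P h/N)(1-\gamma^H) > 0
\end{aligned}
$$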
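The closed form in Lemma 4 can be compared against a term-by-term evaluation of $\Theta$. As before, this Python sketch is a sanity check under stated assumptions: the per-country period-2 value is taken from the expression inside (28c), the posteriors from (22a)-(22d), and all names and parameter values are hypothetical.

```python
def theta(N, K, h, n2_0, n2_Z, alpha1, p_low, lam_p, gamma_low, gamma_high, delta):
    """Theta evaluated term by term, allowing different period-2 coalition sizes."""
    lam_g = gamma_high - gamma_low

    def p_cat(q1):
        # Unconditional probability of a catastrophe given period-1 pollution q1.
        return p_low + alpha1 * lam_p * q1 / N

    def post(q1, catastrophe):
        # Posterior beliefs (22a)-(22d).
        if catastrophe:
            return 1.0 - (1.0 - alpha1) * p_low / p_cat(q1)
        return 1.0 - (1.0 - alpha1) * (1.0 - p_low) / (1.0 - p_cat(q1))

    def v2(n2, alpha2):
        # Per-country period-2 value as written inside (28c).
        gamma2 = gamma_low + alpha2 * lam_g
        return (-n2 + 1.0 - gamma2 * (N - n2)) / N

    v_0h, v_0l = v2(n2_0, post(K, False)), v2(n2_0, post(K - h, False))
    v_zh, v_zl = v2(n2_Z, post(K, True)), v2(n2_Z, post(K - h, True))
    P = p_cat(K)
    return delta * ((1.0 - P) * (v_0h - v_0l) + P * (v_zh - v_zl)
                    + (alpha1 * lam_p * h / N) * (v_zl - v_0l))


def theta_closed_form(N, h, m, alpha1, lam_p, gamma_high, delta):
    """Closed form from Lemma 4: (m*delta/N) * (alpha1*lam_p*h/N) * (1 - gamma_high)."""
    return (m * delta / N) * (alpha1 * lam_p * h / N) * (1.0 - gamma_high)


# Hypothetical parameter values: n2 falls from 5 to 3 when a catastrophe is revealed (m = 2).
numeric = theta(N=10, K=10, h=4, n2_0=5, n2_Z=3, alpha1=0.4, p_low=0.1,
                lam_p=0.5, gamma_low=0.2, gamma_high=0.6, delta=0.9)
closed = theta_closed_form(N=10, h=4, m=2, alpha1=0.4, lam_p=0.5,
                           gamma_high=0.6, delta=0.9)
print(numeric, closed)
```

With these numbers both evaluations agree (about 0.00576), and setting `n2_Z` equal to `n2_0` recovers the zero of Lemma 3.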


References

Barrett, Scott (1994), “Self-enforcing international environmental agreements.” Oxford Economic Papers, 46, 878–94.

Barrett, Scott (2003), Environment and Statecraft: the strategy of environmental treaty-making. Oxford University Press, New York.

Intergovernmental Panel on Climate Change (2007a), Climate Change 2007: The Physical Science Basis. Cambridge University Press.

Intergovernmental Panel on Climate Change (2007b), Climate Change 2007: Impacts, Adaptation, and Vulnerability. Cambridge University Press.

Intergovernmental Panel on Climate Change (2007c), Climate Change 2007: Mitigation of Climate Change. Cambridge University Press.

Karp, Larry S. and Jiangfeng Zhang (2006), “Regulation with anticipated learning about environmental damages.” Journal of Environmental Economics and Management, 51, 259–279.

Kelly, David L. and Charles D. Kolstad (1999), “Bayesian learning, growth, and pollution.” Journal of Economic Dynamics and Control, 23, 491–518.

Kolstad, Charles D. (2007), “Systematic uncertainty in self-enforcing international environmental agreements.” Journal of Environmental Economics and Management, 53, 68–79.

Kolstad, Charles D. and Alistair Ulph (2008), “Uncertainty, learning and heterogeneity in international environmental agreements.” mimeo.

Na, Seong-lin and Hyun Song Shin (1998), “International environmental agreements under uncertainty.” Oxford Economic Papers, 50, 173–85.

Rubio, Santiago Jose and Alistair Ulph (2006), “Self-enforcing international environmental agreements revisited.” Oxford Economic Papers, 58, 233–263.

Rubio, Santiago Jose and Alistair Ulph (2007), “An infinite-horizon model of dynamic membership of international environmental agreements.” Journal of Environmental Economics and Management, 54, 296–310.

Stern, Nicholas (2007), The Economics of Climate Change: The Stern Review. Cambridge University Press, Cambridge and New York.

Ulph, Alistair (2004), “Stable international environmental agreements with a stock pollutant, uncertainty and learning.” Journal of Risk and Uncertainty, 29, 53–73.

Ulph, Alistair and James Maddison (1997), “Uncertainty, learning and international environmental policy coordination.” Environmental and Resource Economics, 9, 451–466.
