Observational Learning and Payoff Externalities: An Experiment

David Owens^a

^a Haverford College, 370 Lancaster Ave, Haverford, PA 19041

Abstract

This paper employs an experimental setting to measure the effect of payoff externalities on the observational learning process. In our design, two subjects, acting sequentially, receive a private, informative signal drawn from a continuous distribution, and then guess the state of the world. We use the strategy method to map the entire signal space onto decisions, yielding a precise measure of preferences. We find that second-movers' decisions are more responsive to their private information and to payoff externalities, and less responsive to information from the first-mover's decision, than predicted by Bayesian Nash Equilibrium. We conduct additional sessions in which the computer plays the role of the first-mover and uses a publicly known strategy. Second-movers conform more closely to theoretical predictions in these sessions, suggesting that the reluctance to learn from observation may be caused by perceived unpredictability in first-movers' decision-making processes.

Email address: [email protected] (David Owens)

Preprint submitted to Elsevier

April 10, 2014

1. Introduction

Before making a choice, economic decision makers (DM's) often observe others making a similar choice. Consider a diner choosing among several restaurants on the same block. She may have some private information as to which restaurant would best suit her preferences. She may also look in the windows to observe the crowd in each one. The crowds tell our diner how many others, recently faced with a decision similar to her own, chose to eat in each restaurant. Each person, in choosing a restaurant, reveals a measure of the private information on which the choice was based. In this way, early diners bestow a positive information externality on later diners. If they take advantage of the information revealed, later diners can make more informed decisions than early ones. This process, by which a DM uses previous decisions to inform her own choice, is called observational learning (OL).

The majority of the OL literature¹ assumes that payoffs are related across DM's only through the information externalities described above. That is, observed decisions can affect a DM's expected payoff only through the information that those decisions reveal. However, OL models have been applied to contexts, such as the adoption of new technologies, where the outcomes of previous decisions can affect DM's payoffs directly: in other words, to contexts where payoff externalities or network effects are likely to play a role. The inclusion of payoff externalities in an experimental study of the OL process is therefore a sensible extension of the existing literature.

In an experimental environment, we explore a setting where information and payoff externalities exist simultaneously. We combine a continuous information space with a binary decision space. We use a 'strategy method', which affords a precise quantification of the OL process and allows us to explore how it is affected by payoff externalities.

Our baseline treatment follows Çelen and Kariv (2004a). Two DM's, Player 1 and Player 2, each make a binary choice and act in sequence, with Player 2 observing Player 1's choice. Each receives a signal drawn iid from a uniform distribution, and guesses the 'state of the world', ω ∈ {A, B}. ω is determined by the sum of the two signals, making payoffs perfectly correlated between Players. Further, Player 1's choice (A or B) becomes more attractive to Player 2 because of the information that it conveys about Player 1's signal, and thus ω. We describe this OL setting as including a 'positive information externality', similar to that of the diner described above. Importantly, in this baseline treatment, the only incentive motivating Players' actions is to correctly guess the state of the world.

Two additional treatments add payoff externalities, both positive and negative, to the baseline described above.

¹ See Section 2 for a brief overview of the existing literature.

This gives Player 2 an added incentive to choose the same (different) action as Player 1 if the payoff externality is positive (negative). These treatments are not classical OL settings: decisions are motivated by more than correctly guessing ω, since attaining a positive payoff externality, or avoiding a negative one, complicates the incentives. A crucial feature of our design is that the magnitude of the payoff externality is small relative to the payoff for guessing ω. As a result, Player 2 always benefits by following extreme private signals that perfectly reveal ω, or nearly do so. Extreme values of the payoff externality would make the game one of coordination rather than OL, in which Player 2's private information would become irrelevant to her optimal action.

While other studies employ longer sequences of DM's, we use only two. As such, our investigation of the OL process focuses primarily on Player 2's decision. A sequence of two DM's allows Bayesian Nash Equilibrium (BNE) to provide a straightforward theoretical benchmark in our baseline treatment, and risk-neutral Bayesian Nash Equilibrium (RN-BNE) to do so in treatments with payoff externalities.² Longer sequences of DM's have multiple equilibria and allow for complicated beliefs to rationalize a wide range of behavior. Payoff externalities force DM's to consider the actions of their followers as well as their predecessors.³ This further clouds a decision process already complicated by the existence of information and payoff externalities. As the purpose of this paper is to assess the relative influence of the two types of externalities, a sequence of only two subjects is most appropriate.

In our baseline treatment, we find Player 2 to be strongly influenced by what she observes from Player 1. However, the influence of OL falls short of that predicted by BNE, as Player 2 relies on her private information more than predicted. This has two consequences. First, Player 2's decisions are less accurate than predicted by BNE. Second, as her decisions depend more on private information, they reveal more of this private information. Thus, more information can be gleaned by observing sequences of DM's than predicted by theory.

We also find Player 2's decisions to be more strongly influenced by payoff externalities, both positive and negative, than RN-BNE predicts. Combined with the above finding, this implies that negative payoff externalities can have significant adverse effects on the decisions of those who observe the choices of others. A diner with a mild aversion to crowded restaurants, for example, may eat more low-quality meals than is optimal, given her preferences, in pursuit of privacy. The results carry ambiguous predictions for positive payoff externalities. In our sessions, positive externalities have little effect on the accuracy of Player 2's decisions, as Player 2 correctly guesses ω with a similar frequency under positive and zero externalities. This may be surprising from a theoretical standpoint, as payoff externalities introduce an incentive that competes with the task of guessing the state of the world for marginal s2. One could imagine a different scenario where DM's over-responsiveness to positive externalities causes them to imitate too much and make poor choices.

Finally, Player 2 behavior conforms more closely to RN-BNE predictions in sessions where a computer plays the role of Player 1, removing all ambiguity about Player 1's behavior. Thus, we suggest a mistrust of Player 1's decision-making process as a potential explanation for the stronger influence that payoff externalities exert on Player 2's choices.

The remainder of the paper is organized as follows: Section 2 discusses the existing OL literature, while Section 3 derives RN-BNE predictions for our model. Section 4 describes the experimental procedures in detail, Section 5 summarizes the experimental results, and Section 6 concludes.

² Appendix D discusses the implications of risk-averse DM's.
³ This tension exists, to a limited degree, for Player 1 in our design. Section 3 explains that it does not affect Player 1's equilibrium strategy for any of the three treatments.

2. Related Literature

Bikhchandani, Hirshleifer, and Welch (1992) and Banerjee (1992) develop the classical OL model, a setting where externalities across DM's are purely informational. Their theoretical framework involves agents acting in an exogenously-determined order, making once-in-a-lifetime decisions and observing all predecessors. General predictions include uniformity of behavior, even when private information differs across players. In addition, OL impedes the process of information aggregation, as DM's who rely less on their private information in favor of decisions that they observe render their own decisions less informative to those observing them. We call this aspect information suppression.

Anderson and Holt (1995) use a laboratory setting to test the Bikhchandani, Hirshleifer, and Welch (1992) model. They find that subjects are generally responsive to information contained in observed actions, including in circumstances in which they must act contrary to their private information. Several other experimental studies have reaffirmed subjects' willingness to learn from observed actions.⁴ Some notable deviations from the Bikhchandani, Hirshleifer, and Welch (1992) model include Çelen and Kariv (2004b), which considers a model in which DM's observe only their direct predecessor, and Hendricks, Sorensen, and Wiseman (2012), in which DM's observe all decisions but not the order in which they are reached. In Monzon and Rapp (2011), DM's are uncertain about their own position in the decision sequence, while Sgroi (2003) allows DM's to determine the timing of their own decisions. Gale and Kariv (2003), Acemoglu, Dahleh, Lobel, and Ozdaglar (2011) and others consider observational learning in social networks. Scharfstein and Stein (1990) show, in an investment setting, that a preference to avoid being alone in making a mistake exacerbates herding, and thus information suppression. Eyster and Rabin (2010) present a bounded-rationality OL model in which DM's behave as though predecessors' decisions are based on private information alone. Such DM's ignore the information suppression aspect of the OL process, and implicitly over-weight the decisions of late movers. Çelen, Kariv, and Schotter (2010) allow experimental subjects to receive advice from their predecessors as well as observing their decisions, and find advice the more influential of the two, even when they are equally informative. In a different setting, Merlo and Schotter (2003) find that OL outperforms 'learning by doing' as a learning mechanism.

Real-world studies of OL include Cai, Chen, and Fang (2009), which finds that diners given ranking information are more likely to order popular dishes. Duflo and Saez (2003) find that OL significantly impacts investment in tax-deferred retirement accounts among groups of university employees. Burke, Tobler, Baddeley, Schultz, and Ungerleider (2010), among others in the natural sciences, show that OL is not unique to humans.

Hung and Plott (2001) conduct the first experimental study of the OL process with payoff externalities. They find that positive externalities increase subjects' tendencies to imitate actions that they observe. Work by Drehman, Oechssler, and Roider (2007) uses an internet design to explore an OL setting with externalities.⁵ They find evidence that subjects respond to externalities, and that they are myopic, meaning that they tend to ignore the effect of their followers when choosing their actions.

The studies mentioned above find that subjects tend to replicate the actions of those that they observe when it is profitable to do so. All use a binary signal, binary action design, which affords a relatively coarse measure of learning. For example, when private signals are accurate with a probability of 2/3, the ratio of expected payoffs of the two available actions can take on three values: 1/2, 1 and 2. Therefore, when subjects benefit from ignoring their private information, the payoff to doing so is twice as high as the payoff to not doing so. The studies therefore yield a binary measure of whether subjects learn from observation when the benefits thereto are very high.

Recent studies have measured observational learning more precisely. Weizsäcker (2010) uses a large data set from 13 different herding experiments, and finds that subjects tend to follow their private information unless the cost to doing so is large (as it is in many of the binary signal designs mentioned above). Çelen and Kariv (2004a) develop the design that we adapt, with a continuous information and decision space, which allows each subject's beliefs to be perfectly characterized by their decision. This design allows them to distinguish between 'herd behavior' and 'information cascades.' In the former, the observation of others makes uniform behavior more likely. In the latter, these observations render a DM's private information obsolete. Importantly, the decisions of a DM in an information cascade are completely uninformative.

Our continuous decision space allows us to discern subjects' preferences between actions for the entirety of their information space. As each private signal corresponds to a different payoff for each action, we are able to extrapolate choices for a continuum of expected payoff ratios. This allows us to specify, for each action, the payoff premium necessary to entice subjects to ignore their private information.

⁴ Hung and Plott (2001) and Drehman, Oechssler, and Roider (2007) are examples.
⁵ They employ positive externalities, negative externalities, and a unique setting where subjects receive a negative externality for those that they follow and a positive externality for those that follow them.
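For comparison with the binary-signal designs discussed above, the coarseness of the induced payoff ratios follows from standard Bayesian updating; the worked equation below is our own illustration, not drawn from those studies. With a symmetric binary signal of accuracy q = 2/3, the posterior odds of the two states, and hence the ratio of the two actions' expected payoffs, is a power of 2:

\[
\frac{E[\pi_A]}{E[\pi_B]} \;=\; \frac{P(\omega = A \mid \text{information})}{P(\omega = B \mid \text{information})} \;=\; \left(\frac{q}{1-q}\right)^{n} \;=\; 2^{\,n},
\]

where n is the net number of effective signals favoring A. For the marginal decisions of interest, n ∈ {−1, 0, 1}, giving the three ratios 1/2, 1 and 2; our continuous-signal design instead generates a continuum of such ratios.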

3. Theory

3.1. The Model

In our model, two players act in sequence, with 'Player 2' observing the decision of 'Player 1'. Each Player i receives a private, informative signal $s_i \in [-1, 1]$,⁶ and chooses an action $d_i \in \{A, B\}$. The private signals $s_1$ and $s_2$ jointly determine the state of the world, $\omega \in \{A, B\}$: $\omega = A$ if $s_1 + s_2 \geq 0$, and $\omega = B$ if $s_1 + s_2 < 0$. Player i receives a financial reward of Y if $d_i = \omega$ and no reward if $d_i \neq \omega$. In addition to the reward described above, each Player receives a payoff externality of X if and only if $d_1 = d_2$. This paper explores both positive and negative values of X. We restrict our attention to moderate values of the payoff externality, $-Y < X < Y$, preserving the study as one of OL, rather than of coordination.

We now consider the decision problem faced by each Player. We first derive BNE predictions for the classical OL case, where externalities are purely informational. Then, we consider BNE in the more general case with payoff externalities, under the assumption of risk-neutrality (RN-BNE). As payoff externalities allow risk-preferences to influence the OL process, we discuss their impact in Appendix D.

⁶ In our experimental sessions, subjects received signals uniformly distributed on [−10, 10]. To simplify the analysis in this Section, we normalize the signal space to [−1, 1].

3.2. RN-BNE Cutoff Strategies: $\hat{s}^*_1$, $\hat{s}^{A*}_2$, $\hat{s}^{B*}_2$

In the absence of payoff externalities (X = 0), payoff-maximizers choose the most likely state of the world, or $d^*_i = A$ if and only if $P(\omega = A \mid s_i) \geq P(\omega = B \mid s_i)$. Importantly, as the reward, Y, to either correct state is equal, risk-preferences play no role. As established in Çelen and Kariv (2004a), Player 1 employs a cutoff of 0 in equilibrium (or $\hat{s}^*_1 = 0$), and Player 2 chooses action A if and only if $s_2 \geq -E(s_1 \mid d_1)$. This amounts to a two-part cutoff strategy in which Player 2 chooses A for $s_2 \geq -0.5$ if $d_1 = A$, and for $s_2 \geq 0.5$ if $d_1 = B$ (in our notation, $\hat{s}^{A*}_2 = -0.5$ and $\hat{s}^{B*}_2 = 0.5$). If Player 2 did not observe $d_1$, her decision would be identical to Player 1's, and Player 2 would choose $d_2 = A$ if and only if $s_2 \geq 0$. We designate this strategy profile, defined by $\hat{s}_1 = \hat{s}^A_2 = \hat{s}^B_2 = 0$, as the Zero Learning Benchmark, as it represents the outcome that would occur were no learning to take place.

As discussed above, because Y > X, Player 2 in our design always prefers to follow her extreme private information, regardless of $d_1$. However, when Player 2's private information is only marginally informative, the information conveyed by $d_1$ is compelling enough to reverse her decision relative to the Zero Learning Benchmark. As a result, $d_2 = d_1$ for $s_2 \in [-0.5, 0.5]$. We use the term Imitation to describe an outcome where $d_2 = d_1$. BNE predicts Imitation with a frequency of 75% in the baseline, X = 0 treatment.

Payoff externalities complicate the analysis. Regardless of X, $\hat{s}_1 = 0$, $\hat{s}^A_2 = -0.5$ and $\hat{s}^B_2 = 0.5$ maximize the likelihood of Accuracy, defined for Player i as $d_i = \omega$.⁷

However, payoff externalities can influence decisions on the margin, where $P(A = \omega)$ and $P(B = \omega)$ are similar. On the margin, X can induce Player 2 to choose $d_2 = A$ even if $P(A = \omega) < P(B = \omega)$, and vice versa. Intuitively, $\hat{s}^{A*}_2$ should decrease for X = 1, and increase for X = −1, relative to X = 0, while $\hat{s}^{B*}_2$ should change in the opposite direction. Player 2's RN-BNE strategy is characterized by equations 1 and 2, with the derivation relegated to Appendix A.⁸

\[
\hat{s}^{A*}_2 = -\frac{1}{2} \times \left(1 + \frac{X}{Y}\right) \tag{1}
\]
\[
\hat{s}^{B*}_2 = \frac{1}{2} \times \left(1 + \frac{X}{Y}\right) \tag{2}
\]

Note that the game is symmetric, in that $\hat{s}^*_1 = 0$ and $\hat{s}^{A*}_2 = -\hat{s}^{B*}_2$. Further, positive payoff externalities increase the set of Player 2's signals for which she prefers to choose $d_2 = d_1$, and negative payoff externalities decrease it. Inserting Y = 2 into equations 1 and 2 yields $\hat{s}^{A*}_2 = \{-0.75, -0.5, -0.25\}$ and $\hat{s}^{B*}_2 = \{0.75, 0.5, 0.25\}$ for X = {1, 0, −1}, respectively. As a result, Player 2 imitates Player 1's action with a frequency of 87.5%, 75% and 62.5%. Importantly, unconditional Imitation is not predicted. Player 2 benefits from imitating Player 1's action for many private signals, but should always follow her private information in the case of extreme $s_2$.

As Imitation and Accuracy will be useful metrics with which to compare our experimental results, their predictions are illustrated in Figures 1 and 2. Because $s_1$ and $s_2$ are uniform and iid on [−1, 1], there is a 1-to-1 mapping between probabilities and areas on the $(s_1, s_2)$ plane.

⁷ The use of the word 'Accuracy' is somewhat misleading under X ≠ 0, as subjects are not strictly trying to guess the state of the world.
⁸ These equations characterize equilibrium cutoffs for $X \in \left(\frac{Y}{2}\left(1 - \sqrt{5}\right),\ Y\right]$, which includes the employed values X = −1, 0 and 1, for Y = 2.

[Figure 1: Predicted Imitation, $d_2 = d_1$ (RN-BNE). Regions of the $(s_1, s_2)$ plane in which $d_2 = d_1$ under the equilibrium cutoffs. Panels: (a) X = 1, $P(d_2 = d_1) = 0.875$; (b) X = 0, $P(d_2 = d_1) = 0.750$; (c) X = −1, $P(d_2 = d_1) = 0.625$.]

[Figure 2: Predicted Player 2 Accuracy, $d_2 = \omega$ (RN-BNE). Regions of the $(s_1, s_2)$ plane in which $d_2 = \omega$ under the equilibrium cutoffs. Panels: (a) X = 1, $P(d_2 = \omega) = 0.844$; (b) X = 0, $P(d_2 = \omega) = 0.875$; (c) X = −1, $P(d_2 = \omega) = 0.844$.]

Figure 2(b) shows that the predicted frequency of 'accurate' decisions under BNE is 0.875 for the classical observational learning problem, X = 0. Figures 2(a) and 2(c) show the decrease in Accuracy predicted by RN-BNE under X = 1 and X = −1, respectively. Intuitively, Figure 2 shows how Player 2 under X = 1 (X = −1) is willing to accept a lower probability of making an accurate decision in order to imitate Player 1 more (less).
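These predicted frequencies can be recovered directly from the model. The sketch below is a minimal Monte Carlo check (ours, not part of the original analysis) that simulates the game under the RN-BNE cutoffs of equations 1 and 2 and reports the implied Imitation and Accuracy rates; function and variable names are illustrative.

```python
import numpy as np

def predicted_rates(x, y=2.0, n=1_000_000, seed=0):
    """Simulate the two-player game under the RN-BNE cutoff strategies."""
    rng = np.random.default_rng(seed)
    s1 = rng.uniform(-1, 1, n)
    s2 = rng.uniform(-1, 1, n)
    s2a = -0.5 * (1 + x / y)                 # cutoff after d1 = A (equation 1)
    s2b = 0.5 * (1 + x / y)                  # cutoff after d1 = B (equation 2)
    d1_is_a = s1 >= 0.0                      # Player 1 uses the cutoff 0
    cutoff2 = np.where(d1_is_a, s2a, s2b)    # Player 2's cutoff depends on d1
    d2_is_a = s2 >= cutoff2
    omega_is_a = (s1 + s2) >= 0.0            # realized state of the world
    return np.mean(d1_is_a == d2_is_a), np.mean(d2_is_a == omega_is_a)

for x in (1, 0, -1):
    imit, acc = predicted_rates(x)
    print(f"X={x:+d}: Imitation ~ {imit:.3f}, Accuracy ~ {acc:.3f}")
# Approximately 0.875/0.844, 0.750/0.875, 0.625/0.844, matching Figures 1 and 2.
```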

4. Experimental Design

Experimental sessions were conducted at the experimental social science laboratory (Xlab) at the University of California, Berkeley. A total of 161 subjects participated, comprised of staff and students. Each session involved between 10 and 20 subjects, with a total of 30, 30 and 36 participating under conditions of X = 1, 0 and −1, respectively. Separate sessions were conducted with a computer playing the role of Player 1 in each treatment, with 17, 25 and 23 subjects participating in such sessions. Subjects entered decisions into computer terminals using the experimental software program z-Tree (Fischbacher, 2007), and were divided by partitions to limit undocumented interaction and to ensure anonymity of decisions and private information. Each session lasted between 60 and 90 minutes.

Before participating, subjects read a detailed set of instructions, which was then read out loud to them by an experimenter.⁹ Subjects were paid $5 as a show-up fee, and gained subsequent earnings for their performance in each round, according to the payoff schedule described in Section 3. Average total earnings were roughly $21. During the experiment, earnings were described in terms of 'experimental tokens', with an exchange rate of 25 cents per token. Externalities were never directly mentioned to subjects, who were shown the payoffs corresponding to their own and their peer's actions, as in Table 1 for the X = 1 treatment when $s_1 + s_2 < 0$.¹⁰ As described in Section 3, one token was added to the payoff associated with all outcomes under X = −1, so that those payoffs also range from 0 to 3, rather than from −1 to 2.

                          Other's Choice
                          A         B
    Your Choice    A      1         0
                   B      2         3

Table 1: Payoff Presentation: X = 1 for ω = B (i.e., $s_1 + s_2 < 0$)

⁹ Copies of the experimental instructions are available at XXXXX.
¹⁰ There was an analogous table for $s_1 + s_2 \geq 0$.

Sessions consisted of 30 rounds of the decision problem, at the beginning of each of which subjects were randomly assigned to a partner, and to the role of either Player 1 or Player 2. The computer then drew signals $s_1$ and $s_2$ from a U[−1, 1] distribution, which were viewed by Players 1 and 2, respectively, and Players chose $d_i \in \{A, B\}$. Before choosing $d_2$, Player 2 viewed $d_1$. Prior to viewing her signal, each subject chose a cutoff $\hat{s}_i$, such that $d_i = A$ if $s_i \geq \hat{s}_i$ and $d_i = B$ if $s_i < \hat{s}_i$.¹¹ After both subjects made their choices, they were informed of the value of each signal, and the payoff that they received. This process was repeated for each of the 30 rounds in each session, and subjects were paid privately upon their exit from the laboratory.

¹¹ Player 2 viewed $d_1$ before choosing $\hat{s}_2$. Thus, we observe $\hat{s}^A_2$ if $d_1 = A$, and $\hat{s}^B_2$ if $d_1 = B$, but not both.
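To make the payoff structure concrete, the sketch below (ours; names are illustrative) plays out a single round in token terms under the normalized signal space, including the one-token shift applied to all outcomes in the X = −1 treatment.

```python
import numpy as np

def play_round(x, s1_hat, s2A_hat, s2B_hat, rng, y=2):
    """One round: draw signals, apply the chosen cutoffs, compute token payoffs."""
    s1, s2 = rng.uniform(-1, 1, 2)
    d1 = 'A' if s1 >= s1_hat else 'B'
    cutoff2 = s2A_hat if d1 == 'A' else s2B_hat   # Player 2's cutoff depends on d1
    d2 = 'A' if s2 >= cutoff2 else 'B'
    omega = 'A' if s1 + s2 >= 0 else 'B'
    shift = 1 if x == -1 else 0                   # X = -1 payoffs shifted up one token
    pay = [y * (d == omega) + x * (d1 == d2) + shift for d in (d1, d2)]
    return d1, d2, omega, pay                     # pay in tokens; 1 token = $0.25

rng = np.random.default_rng(0)
print(play_round(x=-1, s1_hat=0.0, s2A_hat=-0.25, s2B_hat=0.25, rng=rng))
```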

5. Results

As OL is the study of what DM's learn from observing others, we are principally interested in the behavior of Player 2. As explained in Section 3, RN-BNE's prediction of Player 1's behavior is the same across treatments ($\hat{s}_1 = 0$). We do address an important aspect of Player 1 behavior in Section 5.3, but focus primarily on Player 2's behavior, and how it is influenced by $d_1$. The application of our experimental results to the research questions will rely on two types of summary statistics: observed choice outcomes, $d_2 \in \{A, B\}$, and observed cutoffs, $\hat{s}_2 \in [-1, 1]$.

5.1. Results: Accuracy and Imitation

In our analysis of $d_2$, we utilize the concepts of 'Accuracy' ($d_2 = \omega$) and 'Imitation' ($d_2 = d_1$), as defined in Section 3. Each of these measures captures an important aspect of the OL process. Accuracy conveys the degree to which OL helps observers make better decisions, while Imitation is a measure of the uniformity of decisions, or herd behavior. Table 2 shows the prevalence of Accuracy and Imitation across treatments, with RN-BNE predictions included for comparison.

Table 2: Accuracy and Imitation

                               X = 1     X = 0     X = -1
    Observations¹              450       450       540
    Accuracy: d2 = ω           0.742     0.756     0.620
      RN-BNE Prediction        0.844     0.875     0.844
    Imitation: d2 = d1         0.798     0.638     0.430
      RN-BNE Prediction        0.875     0.750     0.625

    ¹ Denotes number of observations for Player 2 only.

According to Table 2, Player 2 chooses $d_2 = \omega$ with a significantly lower frequency than predicted by RN-BNE in all three treatments (p < 0.001 in each case, according to a binomial test). This is evidence that the information in $d_1$ is not fully taken advantage of, but is consistent with typical levels of randomness in experimental behavior. However, while RN-BNE predicts the same frequency of Accuracy (0.844) under X = 1 and X = −1, it is significantly more common under X = 1 (χ² = 11.2, p = 0.001). Further, accurate decisions are no less frequent under X = 1 than under X = 0 (χ² = 0.01, p = 0.916).

Imitation falls significantly short of the levels predicted by RN-BNE (p < 0.001, binomial, for each) across treatments. For example, consider X = 0, where there are no payoff externalities to distort incentives. RN-BNE predicts that Player 2 takes the same action as Player 1 with a frequency of 0.75. In our sessions, such Imitation occurred with a frequency of only 0.638. Our subjects appear to imitate their predecessors less than predicted by the theoretical benchmark. This is consistent with Kariv (2005) and Weizsäcker (2010), which also find that their subjects follow observed actions too infrequently, implicitly over-valuing their private information.

Generally, Table 2 suggests that subjects imitate observed actions less frequently than predicted by RN-BNE, and that this divergence causes them to make suboptimal decisions. They do, however, appear highly sensitive to payoff externalities. Because information externalities are positive in our setting, positive payoff externalities improve the Accuracy of decisions, while negative payoff externalities do the opposite.

5.2. Cutoffs

Table 3 shows the cutoffs chosen by Player 2. The first two rows show the distribution of cutoffs after Player 2 observes $d_1 = A$ and $d_1 = B$, respectively. The vertical line in each graph represents the RN-BNE prediction for each X and $d_1$. Below the graphs, the means of each distribution are displayed. The bottom panel of Table 3 categorizes the observed cutoffs by the amount of learning they represent, relative to RN-BNE predictions. The first of the five mutually exclusive categories listed at the bottom of the Table is 'Over-Learning', which includes observed cutoffs that represent a stronger tendency toward Imitation than suggested by RN-BNE ($\hat{s}^A_2 < \hat{s}^{A*}_2$, $\hat{s}^B_2 > \hat{s}^{B*}_2$). Cutoffs are labeled 'RN-BNE Learning' if $\hat{s}_2 \approx \hat{s}^*_2$. The label 'Under-Learning' is applied to $\hat{s}_2$ that represent a tendency toward Imitation, but one weaker than suggested by RN-BNE. The 'Zero-Learning Benchmark' includes $\hat{s}_2 \approx 0$, while 'Reverse-Learning' encompasses cutoffs that show a tendency against Imitation ($\hat{s}^A_2 > 0$ and $\hat{s}^B_2 < 0$).¹²

¹² As specified in Table 3's notes, cutoffs within 0.05 of $\hat{s}^*_2$ and of 0 are categorized as RN-BNE learning and zero learning, respectively.

Table 3: Summary Statistics and Levels of Learning

                                  X = 1       X = 0       X = -1
    [Histograms of ŝ^A_2 (top row) and ŝ^B_2 (second row) by treatment,
     with vertical lines at the RN-BNE predictions, omitted.]

    Mean ŝ^A_2¹                   -0.623      -0.188       0.208
                                  (0.06)      (0.07)      (0.07)
    Mean ŝ^B_2¹                    0.608       0.324      -0.103
                                  (0.06)      (0.06)      (0.07)

    Over-Learning²                 0.456       0.222       0.289
    RN-BNE Learning³               0.029       0.202       0.009
    Under-Learning⁴                0.387       0.256       0.069
    Zero-Learning⁵                 0.071       0.156       0.146
    Reverse-Learning⁶              0.058       0.164       0.487

    ¹ Robust standard errors in parentheses, clustered by subject.
    ² Over-Learning: ŝ^A_2 ∈ [−1, ŝ^{A*}_2 − .05], ŝ^B_2 ∈ [ŝ^{B*}_2 + .05, 1]
    ³ RN-BNE Learning: ŝ^A_2 ∈ (ŝ^{A*}_2 − .05, ŝ^{A*}_2 + .05), ŝ^B_2 ∈ (ŝ^{B*}_2 − .05, ŝ^{B*}_2 + .05)
    ⁴ Under-Learning: ŝ^A_2 ∈ [ŝ^{A*}_2 + .05, −.05], ŝ^B_2 ∈ [.05, ŝ^{B*}_2 − .05]
    ⁵ Zero-Learning Benchmark: ŝ^A_2, ŝ^B_2 ∈ (−.05, .05)
    ⁶ Reverse-Learning: ŝ^A_2 ∈ [.05, 1], ŝ^B_2 ∈ [−1, −.05]

Few subjects choose the $\hat{s}^A_2$ predicted by RN-BNE, particularly under X ≠ 0. 68% of cutoffs suggest learning under the classic OL setting (X = 0), with the degree of learning spread fairly evenly among over-learning, RN-BNE learning and under-learning. Under X = 1, cutoffs are most commonly categorized as over-learning (0.456), while reverse learning is rare (0.058). The opposite is true of X = −1, where 48.7% of $\hat{s}_2$ represent a preference against taking the observed action, $d_1$.

Importantly, Table 3 categorizes cutoffs relative to RN-BNE predictions. Therefore, the high frequency of Over-Learning observed under X = 1, for example, is not only evidence that $d_1$ carries a stronger influence under X = 1, but that its influence is stronger relative to RN-BNE predictions.

Three regularities emerge from the discussion of our results. First, Player 2 is positively influenced by the information represented by Player 1's decision, as evidenced by the tendency towards Imitation in X = 0. Second, this influence is less strong than predicted by RN-BNE: Imitation occurs with a frequency significantly lower than predicted, $\bar{\hat{s}}^A_2 > \hat{s}^{A*}_2$ and $\bar{\hat{s}}^B_2 < \hat{s}^{B*}_2$. Finally, payoff externalities have a stronger effect than predicted by RN-BNE, as evidenced by the frequency of Imitation and the sensitivity of the cutoffs themselves to payoff externalities.

5.3. Player 1 Ambiguity

Above, we present evidence that Player 2 is less sensitive to the information contained in $d_1$, and more sensitive to the associated payoff externalities, than an expected-payoff maximizer would be. This can be rationalized if Player 2 believes $d_1$ to be uninformative due to unpredictability in Player 1's decision-making process. To this end, additional sessions were conducted in which a computer played the role of Player 1. Subjects in these sessions were told that the 'computerized Player 1' always chooses $\hat{s}_1 = 0$, removing all uncertainty about Player 1's behavior. 65 subjects participated in these sessions, generating a total of 975 observations.¹³

Table 4 reports regressions with Player 2's cutoffs as the dependent variable, including cutoffs that followed both subjects and computers in the role of Player 1. The regressions allow $\hat{s}^A_2$ and $\hat{s}^B_2$ to depend on the value of the payoff externality X and on a dummy variable equal to one if Player 1 is played by a computer. RN-BNE predictions are included as a benchmark.

Table 4 shows clear evidence that Player 2's behavior is different when a computer plays the role of Player 1. The coefficients on 'comp' point towards an increase in OL, as cutoffs are more negative following $d_1 = A$, and more positive following $d_1 = B$, though these effects are at best marginally significant. The interaction term 'X×comp' is highly significant in both regressions, suggesting that the influence of payoff externalities is weaker when a computer plays the role of Player 1. Collectively, the computer sessions propose an explanation for Player 2's reliance on payoff externalities: removing all ambiguity about Player 1's behavior decreases this reliance, suggesting that the ambiguity is an important factor in externalities' influence.

¹³ Subjects actually participated in 30 rounds each, but we report only the results of the first fifteen. This keeps experience in the role of Player 2 comparable to that in the other treatments.
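A sketch of the estimation structure behind Table 4 (ours; the data frame and variable names below are illustrative stand-ins, not the paper's actual files) is a clustered OLS regression of Player 2's cutoff on the externality, the computer dummy, and their interaction:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for the Player-2 cutoff data (one row per decision).
rng = np.random.default_rng(1)
n = 600
df = pd.DataFrame({
    "subject": rng.integers(0, 60, n),        # cluster identifier
    "X": rng.choice([-1, 0, 1], n),           # payoff externality in the round
    "comp": rng.integers(0, 2, n),            # 1 if a computer played Player 1
})
# Cutoffs following d1 = A, generated around the RN-BNE benchmark plus noise.
df["cutoff"] = -0.5 - 0.25 * df["X"] + rng.normal(0, 0.2, n)

# OLS of the cutoff on X, comp, and X x comp, with standard errors
# clustered by subject, mirroring the structure of Table 4.
model = smf.ols("cutoff ~ X + comp + X:comp", data=df)
result = model.fit(cov_type="cluster", cov_kwds={"groups": df["subject"]})
print(result.params)
```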

Table 4: Regression Results, Including Computer Player 1

                     ŝ^A_2                      ŝ^B_2
                Observed      RN-BNE       Observed      RN-BNE
    Constant    -0.202        -0.500        0.276         0.500
                (0.037)***                  (0.036)***
    comp        -0.080         0            0.063         0
                (0.059)                     (0.054)
    X           -0.416        -0.250        0.359         0.250
                (0.043)***                  (0.046)***
    X×comp       0.178         0           -0.208         0
                (0.075)**                   (0.076)***
    n            1136                       1279
    R²           0.259                      0.196

    ¹ Robust standard errors in parentheses, clustered by subject.
    ² comp is a dummy variable for a computer playing the role of Player 1.

6. Discussion and Conclusion

This paper supplements the existing observational learning (OL) literature by adding payoff externalities, a characteristic common to many real-world settings to which OL models are applied. By restricting our attention to sequences of two Players, with Player 2 observing Player 1, we can focus on the influence that Player 1's decisions exert on Player 2. In the presence of payoff externalities, longer sequences would force second-movers to consider their actions' effect on third-movers, complicating the motivations behind second-movers' behavior.

Our design contains a continuous uniform information space and a binary decision space. Bayesian Nash Equilibrium (BNE) provides the theoretical benchmark to which we compare our results. In the absence of payoff externalities (X = 0), BNE predicts Imitation, Player 2 taking the same action as Player 1, with a frequency of 75%, a prediction unaffected by risk-preferences. With X = 1 (X = −1), predicted Imitation increases (decreases) to 87.5% (62.5%), if Player 2 is risk-neutral. Risk-averse preferences exacerbate the effect of each externality on Player 2's predicted Imitation.

As they influence players' tendency towards Imitation, payoff externalities also affect the information suppressed by the OL process. The more Player 2's decision is influenced by $d_1$, the less informative are $d_1$ and $d_2$, collectively, to an observer. Therefore, Player 2 suppresses information most dramatically under X = 1. Conversely, the most information is conveyed under X = −1.

As in Çelen and Kariv (2004a) and Weizsäcker (2010), subjects in our design appear over-reliant on their private information, relative to the information conveyed by the decisions that they observe. For example, under X = 0, $\bar{\hat{s}}^A_2 = -.188$. This cutoff implies Player 2's indifference between following her private information ($d_2 = B$) and imitating $d_1$ ($d_2 = A$) when the expected payoff to Imitation is $\frac{1 - .188}{.188 - 0} = 4.31$ times as high. We also find that Player 2 is more sensitive to payoff externalities than predicted by Risk-Neutral Bayesian Nash Equilibrium (RN-BNE). Imitation increases by 16% under positive payoff externalities, and decreases by 21% under negative payoff externalities, where RN-BNE predicts differences of 12.5% in each case. As stated above, payoff externalities allow for risk-preferences to partially explain this phenomenon. However, as discussed in Appendix D, even highly risk-averse utility functions have only marginal effects on optimal cutoffs. The strength of payoff externalities' influence, relative to information, is therefore unlikely to be caused entirely by risk-aversion.

We conduct additional sessions of each treatment in which a computer played the role of Player 1, which removed any ambiguity about Player 1's behavior. Player 2's behavior changes significantly in these sessions, suggesting that perceived uncertainty regarding Player 1's behavior influences Player 2's decisions.

Our results contain insights for real-world OL settings. First, the reliance on private information suggests that decisions in classical OL settings (X = 0) convey more information than BNE predicts. The reliance on private information reduces the amount of information that is suppressed in the OL process, making the decision space collectively more informative, relative to BNE predictions. Further, individual decisions can be improved by small, positive payoff externalities, as the increase in imitation will exceed that predicted by theory, and DM's will receive the accidental benefit of OL. This process speeds the suppression of information, however. For this reason, OL settings characterized by positive payoff externalities, such as technology adoption and fashion, are much more likely to lead to information cascades than their classical OL counterparts, where observed decisions are valued only for their informational content.


References

Acemoglu, D., M. A. Dahleh, I. Lobel, and A. Ozdaglar (2011): "Bayesian Learning in Social Networks," Review of Economic Studies, 78(4), 1201–1236.

Anderson, L. R., and C. A. Holt (1995): "Information cascades in the laboratory," American Economic Review, 87, 847–862.

Banerjee, A. (1992): "A simple model of herd behavior," Quarterly Journal of Economics, 107(3), 797–817.

Bikhchandani, S., D. A. Hirshleifer, and I. Welch (1992): "A theory of fads, fashion, custom, and cultural change as informational cascades," Journal of Political Economy, 100(5), 992–1026.

Burke, C. J., P. N. Tobler, M. Baddeley, W. Schultz, and L. G. Ungerleider (2010): "Neural mechanisms of observational learning," Proceedings of the National Academy of Sciences of the United States of America, 107(32), 14431–14436.

Cai, H., Y. Chen, and H. Fang (2009): "Observational Learning: Evidence from a Randomized Natural Field Experiment," American Economic Review, 99(3), 864–882.

Çelen, B., and S. Kariv (2004a): "Distinguishing Informational Cascades from Herd Behavior in the Laboratory," American Economic Review, 94(3), 484–497.

Çelen, B., and S. Kariv (2004b): "Observational Learning Under Imperfect Information," Games and Economic Behavior, 47(1), 72–86.

Çelen, B., S. Kariv, and A. Schotter (2010): "An Experimental Test of Advice and Social Learning," Management Science, 56(10), 1687–1701.

Drehman, M., J. Oechssler, and A. Roider (2007): "Herding with and without payoff externalities - an internet experiment," International Journal of Industrial Organization, 25, 391–415.

Duflo, E., and E. Saez (2003): "The Role of Information and Social Interactions in Retirement Plan Decisions: Evidence from a Randomized Experiment," The Quarterly Journal of Economics, 118(3), 815–842.

Eyster, E., and M. Rabin (2010): "Naïve Herding in Rich-Information Settings," American Economic Journal: Microeconomics, 2(4), 221–243.

Fischbacher, U. (2007): "z-Tree: Zurich Toolbox for Ready-made Economic Experiments," Experimental Economics, 10(2), 171–178.

Gale, D., and S. Kariv (2003): "Bayesian learning in social networks," Games and Economic Behavior, 45(2), 329–346, Special Issue in Honor of Robert W. Rosenthal.

Hendricks, K., A. Sorensen, and T. Wiseman (2012): "Observational Learning and Demand for Search Goods," American Economic Journal: Microeconomics, 4(1), 1–31.

Hung, A. A., and C. R. Plott (2001): "Information Cascades: Replication and Extension to Majority Rule and Conformity-Rewarding Institutions," American Economic Review, 91(5), 1508–1520.

Kariv, S. (2005): "Overconfidence and Informational Cascades," Working Paper.

McKelvey, R. D., and T. R. Palfrey (1995): "Quantal Response Equilibrium in Normal Form Games," Games and Economic Behavior, 10, 6–38.

Merlo, A., and A. Schotter (2003): "Learning by not doing: an experimental investigation of observational learning," Games and Economic Behavior, 42(1), 116–136.

Monzon, I., and M. R. Rapp (2011): "Observational Learning with Position Uncertainty," Carlo Alberto Notebooks 206, Collegio Carlo Alberto.

Scharfstein, D. S., and J. C. Stein (1990): "Herd Behavior and Investment," The American Economic Review, 80(3), 465–479.

Sgroi, D. (2003): "The right choice at the right time. A herding experiment in endogenous time," Experimental Economics, 6, 159–180.

Weizsäcker, G. (2010): "Do We Follow Others When We Should? A Simple Test of Rational Expectations," American Economic Review, 100(5), 2340–2360.

Appendix A. RN-BNE Derivation for $s_i \sim U[-1, 1]$, X ≠ 0

We begin by arguing that, under X ≠ 0, both players employ cutoff strategies in equilibrium. Equation A.1 shows Player 1's expected profits for choosing A and B for any strategy employed by Player 2 (in particular, we do not assume that Player 2 uses a cutoff strategy). $P_{jk}$ is the probability that, given Player 2's strategy, she chooses action j after observing $d_1 = k$.

\[
\pi^A_1 = Y \times \frac{s_1 + 1}{2} + P_{AA} \times X
\qquad
\pi^B_1 = Y \times \frac{1 - s_1}{2} + P_{BB} \times X \tag{A.1}
\]

As $\pi^A_1$ is increasing in $s_1$ and $\pi^B_1$ is decreasing, there is some $\hat{s}_1$ such that $\pi^A_1 \geq \pi^B_1$ for all $s_1 \geq \hat{s}_1$ and $\pi^A_1 \leq \pi^B_1$ for all $s_1 \leq \hat{s}_1$. In other words, in any RN-BNE, Player 1 employs a cutoff strategy. Note that the proof does not assume that Player 2 uses a cutoff strategy.

We use this to update Player 2's expected profit function. $\pi^{jk}_2$ is Player 2's expected profit for action j after observing $d_1 = k$, as a function of $s_2$.

\[
\pi^{AA}_2 =
\begin{cases}
Y \times \frac{s_2 + 1}{1 - \hat{s}_1} + X & \text{if } s_2 \leq -\hat{s}_1 \\
Y + X & \text{if } s_2 > -\hat{s}_1
\end{cases}
\qquad
\pi^{BA}_2 =
\begin{cases}
Y \times \frac{-\hat{s}_1 - s_2}{1 - \hat{s}_1} & \text{if } s_2 \leq -\hat{s}_1 \\
0 & \text{if } s_2 > -\hat{s}_1
\end{cases} \tag{A.2}
\]

\[
\pi^{AB}_2 =
\begin{cases}
0 & \text{if } s_2 \leq -\hat{s}_1 \\
Y \times \frac{s_2 + \hat{s}_1}{1 + \hat{s}_1} & \text{if } s_2 > -\hat{s}_1
\end{cases}
\qquad
\pi^{BB}_2 =
\begin{cases}
Y + X & \text{if } s_2 \leq -\hat{s}_1 \\
Y \times \frac{1 - s_2}{1 + \hat{s}_1} + X & \text{if } s_2 > -\hat{s}_1
\end{cases} \tag{A.3}
\]

Setting $\pi^{AA}_2 = \pi^{BA}_2$ (for $s_2 \leq -\hat{s}_1$), and $\pi^{AB}_2 = \pi^{BB}_2$ (for $s_2 > -\hat{s}_1$), and rearranging yields Player 2's optimal cutoff strategy following $d_1 = A$ and $d_1 = B$, $\hat{s}^{A*}_2$ and $\hat{s}^{B*}_2$, respectively.

\[
\hat{s}^{A*}_2 = \frac{1}{2Y} \times \left[\hat{s}_1 (X - Y) - (X + Y)\right]
\qquad
\hat{s}^{B*}_2 = \frac{1}{2Y} \times \left[\hat{s}_1 (X - Y) + (X + Y)\right] \tag{A.4}
\]

As Player 2's optimal strategy is a function of $\hat{s}_1$, Player 1 must consider its influence in her own strategy. Player 1's expected payoff is:

\[
\pi_1 = P(d_1 = \omega) \times Y + P(d_1 = d_2) \times X \tag{A.5}
\]
\[
P(d_1 = \omega) = \frac{3 - \hat{s}_1^2}{4} \tag{A.6}
\]
\[
P(d_1 = d_2) = P(d_1 = A \mid \hat{s}_1) \times P_{AA} + P(d_1 = B \mid \hat{s}_1) \times P_{BB} \tag{A.7}
\]
\[
\phantom{P(d_1 = d_2)} = \frac{1 - \hat{s}_1}{2} \times \frac{1 - \hat{s}^A_2}{2} + \frac{1 + \hat{s}_1}{2} \times \frac{\hat{s}^B_2 + 1}{2} \tag{A.8}
\]

Substituting equations A.4, A.6 and A.8 into equation A.5 yields:

\[
\pi_1 = Y \times \frac{3 - \hat{s}_1^2}{4} + X \times \left[\frac{3}{4} + \frac{1}{4}\frac{X}{Y} + \frac{\hat{s}_1^2}{4}\left(\frac{X}{Y} - 1\right)\right] \tag{A.9}
\]

Meaning:

\[
\frac{\partial \pi_1}{\partial \hat{s}_1^2} = \frac{1}{4Y}\left[X^2 - XY - Y^2\right] \tag{A.10}
\]

The bracketed portion of expression A.10 is negative for $X \in \left(\frac{Y}{2}\left(1 - \sqrt{5}\right),\ Y\right)$.¹⁴ Thus, $\hat{s}^*_1 = 0$ for all such values of X, as moving $\hat{s}_1$ away from zero reduces Player 1's expected payoff. As for Player 2's optimal strategy, substituting $\hat{s}_1 = 0$ into equation A.4 yields $\hat{s}^{A*}_2 = -\frac{1}{2}\left(1 + \frac{X}{Y}\right)$ and $\hat{s}^{B*}_2 = \frac{1}{2}\left(1 + \frac{X}{Y}\right)$. Thus, with Y = 2, $\hat{s}^{A*}_2 = \{-0.75, -0.5, -0.25\}$ and $\hat{s}^{B*}_2 = \{0.75, 0.5, 0.25\}$ for X = {1, 0, −1}, respectively.

¹⁴ Equation A.4, which describes Player 2's behavior, is valid only for −Y ≤ X ≤ Y. Thus, the derivation of $\hat{s}_1$ is also only valid for these values. For X > Y, the unique equilibrium is $\hat{s}^*_1 = 0$, $\hat{s}^{A*}_2 = -1$, $\hat{s}^{B*}_2 = 1$; for X < −Y, it is $\hat{s}^*_1 = 0$, $\hat{s}^{A*}_2 = 1$, $\hat{s}^{B*}_2 = -1$. These are coordination games.
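A quick symbolic check of equation A.4 (ours; written with sympy and illustrative variable names) solves the two indifference conditions above directly:

```python
import sympy as sp

s1h, s2, X, Y = sp.symbols('s1h s2 X Y', real=True)

# Indifference after d1 = A (region s2 <= -s1h): pi2_AA = pi2_BA
pi2_AA = Y * (s2 + 1) / (1 - s1h) + X
pi2_BA = Y * (-s1h - s2) / (1 - s1h)
s2A_star = sp.solve(sp.Eq(pi2_AA, pi2_BA), s2)[0]

# Indifference after d1 = B (region s2 > -s1h): pi2_AB = pi2_BB
pi2_AB = Y * (s2 + s1h) / (1 + s1h)
pi2_BB = Y * (1 - s2) / (1 + s1h) + X
s2B_star = sp.solve(sp.Eq(pi2_AB, pi2_BB), s2)[0]

print(sp.simplify(s2A_star))  # equals (s1h*(X - Y) - (X + Y)) / (2*Y), i.e. (A.4)
print(sp.simplify(s2B_star))  # equals (s1h*(X - Y) + (X + Y)) / (2*Y)
# With s1h = 0 and Y = 2: -3/4, -1/2, -1/4 for X = 1, 0, -1.
print([s2A_star.subs({s1h: 0, Y: 2, X: x}) for x in (1, 0, -1)])
```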

Appendix B. Information Suppression

A measure of the information suppressed by $d_1$ is the error with which an observer estimates $s_1$ after viewing $d_1$. $\theta_1 = E(s_1 \mid d_1)$ is such an observer's best estimate, and is represented in equation B.1, with $\theta_j \equiv \theta_1 \mid d_1 = j$.

\[
\theta_A = \frac{1 + \hat{s}_1}{2}
\qquad
\theta_B = \frac{-1 + \hat{s}_1}{2} \tag{B.1}
\]

Similarly, equation B.2 defines the likelihood of observing each $d_1$, with $P_j = P(d_1 = j \mid \hat{s}_1)$.¹⁵

\[
P_A = \frac{1 - \hat{s}_1}{2}
\qquad
P_B = \frac{1 + \hat{s}_1}{2} \tag{B.2}
\]

Using equations B.1 and B.2, equation B.3 calculates the mean-squared error of $\theta_1$, whose evaluation yields equation B.4.

\[
Var(\theta_1) = P_A \int_{\hat{s}_1}^{1} (\theta_A - s_1)^2 \, ds_1 + P_B \int_{-1}^{\hat{s}_1} (\theta_B - s_1)^2 \, ds_1 \tag{B.3}
\]
\[
\phantom{Var(\theta_1)} = \frac{1}{12}\left(\hat{s}_1^4 + 6\hat{s}_1^2 + 1\right) \tag{B.4}
\]

As equation B.4 is an even function of $\hat{s}_1$, $\hat{s}_1 = 0$ minimizes $Var(\theta_1)$. Thus, the RN-BNE prediction for each treatment, $\hat{s}^*_1 = 0$, also minimizes information suppression.

We now turn to the calculation of $Var(\theta_{12})$, the variance of $\theta_{12} = E(s_1 + s_2 \mid d_1, d_2)$. Equation B.5 defines the four possible $\theta_{jk} \equiv \theta_{12} \mid d_1 = j, d_2 = k$ for generic $\hat{s}_1$, $\hat{s}^A_2$ and $\hat{s}^B_2$.

\[
\theta_{AA} = \frac{1 + \hat{s}_1}{2} + \frac{1 + \hat{s}^A_2}{2}
\quad
\theta_{AB} = \frac{1 + \hat{s}_1}{2} + \frac{-1 + \hat{s}^A_2}{2}
\quad
\theta_{BA} = \frac{-1 + \hat{s}_1}{2} + \frac{1 + \hat{s}^B_2}{2}
\quad
\theta_{BB} = \frac{-1 + \hat{s}_1}{2} + \frac{-1 + \hat{s}^B_2}{2} \tag{B.5}
\]

Equation B.6 defines the probabilities of each of the four possible $\{d_1, d_2\}$ outcomes, with $P_{jk} \equiv P(d_1 = j, d_2 = k \mid \hat{s}_1, \hat{s}^A_2, \hat{s}^B_2)$.¹⁶

\[
P_{AA} = \frac{(1 - \hat{s}_1)(1 - \hat{s}^A_2)}{4}
\quad
P_{AB} = \frac{(1 - \hat{s}_1)(\hat{s}^A_2 + 1)}{4}
\quad
P_{BA} = \frac{(\hat{s}_1 + 1)(1 - \hat{s}^B_2)}{4}
\quad
P_{BB} = \frac{(\hat{s}_1 + 1)(\hat{s}^B_2 + 1)}{4} \tag{B.6}
\]

Combining equations B.5 and B.6 yields equation B.7, which calculates the variance of $\theta_{12}$. The evaluation of its integrals yields the lengthy equation B.8.

\[
\begin{aligned}
Var(\theta_{12}) = \; & P_{AA} \int_{\hat{s}^A_2}^{1} \int_{\hat{s}_1}^{1} \left(\theta_{AA} - (s_1 + s_2)\right)^2 ds_1 \, ds_2
+ P_{AB} \int_{-1}^{\hat{s}^A_2} \int_{\hat{s}_1}^{1} \left(\theta_{AB} - (s_1 + s_2)\right)^2 ds_1 \, ds_2 \\
+ \; & P_{BA} \int_{\hat{s}^B_2}^{1} \int_{-1}^{\hat{s}_1} \left(\theta_{BA} - (s_1 + s_2)\right)^2 ds_1 \, ds_2
+ P_{BB} \int_{-1}^{\hat{s}^B_2} \int_{-1}^{\hat{s}_1} \left(\theta_{BB} - (s_1 + s_2)\right)^2 ds_1 \, ds_2
\end{aligned} \tag{B.7}
\]

\[
\begin{aligned}
= \frac{1}{48} \Big[ \;
& (\hat{s}_1 - 1)^2 (\hat{s}^A_2 - 1)^2 \left(\hat{s}_1^2 - 2\hat{s}_1 + (\hat{s}^A_2)^2 - 2\hat{s}^A_2 + 2\right) \\
+ \; & (\hat{s}_1 - 1)^2 (\hat{s}^A_2 + 1)^2 \left(\hat{s}_1^2 - 2\hat{s}_1 + (\hat{s}^A_2)^2 + 2\hat{s}^A_2 + 2\right) \\
+ \; & (\hat{s}_1 + 1)^2 (\hat{s}^B_2 - 1)^2 \left(\hat{s}_1^2 + 2\hat{s}_1 + (\hat{s}^B_2)^2 - 2\hat{s}^B_2 + 2\right) \\
+ \; & (\hat{s}_1 + 1)^2 (\hat{s}^B_2 + 1)^2 \left(\hat{s}_1^2 + 2\hat{s}_1 + (\hat{s}^B_2)^2 + 2\hat{s}^B_2 + 2\right) \Big]
\end{aligned} \tag{B.8}
\]

Equation B.8 is uniquely minimized at $\hat{s}_1 = \hat{s}^A_2 = \hat{s}^B_2 = 0$. Thus, the 'Zero-Learning Benchmark' maximizes the amount of information collectively conveyed by $\{d_1, d_2\}$. $\hat{s}^*_1$ remains constant at zero across treatments, while $\hat{s}^{A*}_2$ and $\hat{s}^{B*}_2$ vary. Substituting $\hat{s}_1 = 0$ into equation B.8 and imposing symmetry ($\hat{s}^A_2 = -\hat{s}^B_2$) yields equation B.9. This equation shows the increase in $Var(\theta_{12})$ that accompanies increased Imitation (decreasing $\hat{s}^A_2$ from zero and, equivalently, increasing $\hat{s}^B_2$ from zero). Importantly, equation B.9 embodies no assumptions about risk preferences, and is valid for any $\hat{s}^A_2$, independent of the preferences or beliefs that generate it.

\[
Var(\theta_{12}) = \frac{1}{12}\left[\left(\hat{s}^A_2\right)^4 + 7\left(\hat{s}^A_2\right)^2 + 2\right] \tag{B.9}
\]

¹⁵ $P_A = P(s_1 \geq \hat{s}_1)$ and $P_B = P(s_1 < \hat{s}_1)$.
¹⁶ $P_{AA} = P(s_1 \geq \hat{s}_1) \times P(s_2 \geq \hat{s}^A_2)$, $P_{AB} = P(s_1 \geq \hat{s}_1) \times P(s_2 < \hat{s}^A_2)$, $P_{BA} = P(s_1 < \hat{s}_1) \times P(s_2 \geq \hat{s}^B_2)$ and $P_{BB} = P(s_1 < \hat{s}_1) \times P(s_2 < \hat{s}^B_2)$.

Appendix C. Quantal Response Equilibrium Under X = 1, RN-BNE predicts sˆA∗ 2 = −0.75, which yields an expected payoff, π2 = 2.56. If Player 2 were to deviate from sˆA∗ ˆA 2 as dramatically as possible, and choose s 2 = 1, π2 decreases to 0.5. Under X = −1, on the other hand, the most dramatic deviation (ˆ sA 2 = 1 rather than sˆA∗ 2 = −0.25) decreases π2 only from 2.06 to 1.50. Positive payoff externalities increase the cost of Player 2’s mistakes, while negative payoff externalities mitigate the damage. Intuitively, failing to imitate d1 decreases Player 2’s chances of receiving both Y and X. Both are costly under X = 1, while the latter is a benefit under X = −1, reducing the downside of significant deviations from RN-BNE behavior. Figure C.3 displays the relationship between π2 and sˆA 2 ∈ [−1, 1] in each treatment.

[Figure C.3: $\pi_2$ vs. $\hat{s}^A_2$. Panels: (a) X = 1; (b) X = 0; (c) X = −1.]

In our sessions, Player 2 conforms most closely to RN-BNE predictions under X = 1, and deviates most strongly under X = −1. In other words, subjects perform 'best' under X = 1. Above, we argue that a combination of under-learning and a stronger-than-predicted influence of payoff externalities is responsible for this regularity. The payoff functions of Figure C.3 suggest an alternative explanation: behavior appears most random under X = −1, where random behavior is least strongly punished. The behavior that we observe could be random deviations from RN-BNE predictions, with payoff differentials mitigating the magnitude of the difference.

The problem can be addressed by applying a version of Quantal Response Equilibrium (QRE, McKelvey and Palfrey (1995)) analysis. QRE allows for noisy behavior, assuming that each available action is chosen with positive probability. The probability associated with each action is increasing in its expected payoff. We assume a logistic form of QRE, in which the probability that subject i chooses cutoff $\hat{s}_j$ is determined by equation C.1. As QRE is a discrete-choice model, we sorted the observed cutoffs into 21 bins, and treat each bin as a separate, discrete action.¹⁷

\[
P(\hat{s}_j) = \frac{e^{\beta \pi(\hat{s}_j)}}{\sum_{k=-1}^{1} e^{\beta \pi(\hat{s}_k)}} \tag{C.1}
\]

β is a 'sensitivity parameter,' which determines the strength of the relationship between actions' payoffs and probabilities. If β = 0, behavior is uniformly random, while as β approaches ∞, behavior converges to RN-BNE predictions. For intermediate values of β, subjects choose higher-paying cutoffs more often, but choose each cutoff with a positive probability. Thus, our interpretation of our results is consistent with empirical values of β that are increasing in X.
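A minimal numerical illustration of equation C.1 (ours; the payoff vector below is made up for illustration, not the estimated payoffs):

```python
import numpy as np

def qre_choice_probs(payoffs, beta):
    """Logit choice probabilities over a discrete set of cutoff bins (eq. C.1)."""
    z = beta * np.asarray(payoffs, dtype=float)
    z -= z.max()                          # subtract the max for numerical stability
    expz = np.exp(z)
    return expz / expz.sum()

# Hypothetical expected payoffs pi(s_k) for a handful of cutoff bins
pi = np.array([2.56, 2.40, 2.00, 1.20, 0.50])
for beta in (0.0, 1.0, 5.0):
    print(beta, np.round(qre_choice_probs(pi, beta), 3))
# beta = 0 gives uniform probabilities; larger beta concentrates mass on the
# highest-payoff bin, as described in the text.
```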

Table C.5: Quantal Response Equilibrium

                   X = 1           X = 0           X = -1
    β1             6.59 (−1235)    4.79 (−1283)    2.59 (−1605)
    β2A            2.62 (−1170)    1.58 (−1288)    0.42 (−1644)
    β2B            2.20            2.91            0.12

    Log-likelihoods of the associated models in parentheses.

Table C.5 shows β estimates for subjects in each role in each of the three treatments, along with the log-likelihoods of the associated models. The parameter values suggest that Player 2's behavior is most responsive to payoff differentials under X = 1, and least sensitive under X = −1. Therefore, the behavior that we observe is not driven jointly by random behavior and differentially steep payoff functions, and subjects do perform 'best' under X = 1. As payoff externalities decrease, subjects make not only bigger mistakes, but more costly mistakes, reinforcing our conclusions of limited learning and weighty payoff externalities.

¹⁷ The bins are equivalent to those presented in Table 3: $\hat{s}_2 \leq -0.95$ were grouped together, as were $-0.95 < \hat{s}_2 \leq -0.85$, etc. The expected payoff to cutoffs in each bin was calculated at its midpoint.


Appendix D. Risk-Averse Preferences for Player 2

The derivation of optimal cutoffs in Appendix A assumes risk-neutrality. This section discusses the decision of a risk-averse Player 2 responding to a Player 1 with $\hat{s}_1 = 0$. We model Player 2's preferences over wealth (w), at the decision level, using the simple utility function $u_2(w) = w^\alpha$. As preferences are no longer linear, and $\hat{s}_1 = 0$, equations D.1 and D.2 dictate behavior rather than equations A.2 and A.3.

\[
EU^{AA}_2 =
\begin{cases}
(Y + X)^\alpha (s_2 + 1) + X^\alpha (-s_2) & \text{if } s_2 \leq 0 \\
(Y + X)^\alpha & \text{if } s_2 > 0
\end{cases}
\qquad
EU^{BA}_2 =
\begin{cases}
Y^\alpha (-s_2) & \text{if } s_2 \leq 0 \\
0 & \text{if } s_2 > 0
\end{cases} \tag{D.1}
\]

\[
EU^{AB}_2 =
\begin{cases}
0 & \text{if } s_2 \leq 0 \\
Y^\alpha s_2 & \text{if } s_2 > 0
\end{cases}
\qquad
EU^{BB}_2 =
\begin{cases}
(Y + X)^\alpha & \text{if } s_2 \leq 0 \\
(Y + X)^\alpha (1 - s_2) + X^\alpha s_2 & \text{if } s_2 > 0
\end{cases} \tag{D.2}
\]

Setting $EU^{AA}_2 = EU^{BA}_2$ for $s_2 \leq 0$, and $EU^{AB}_2 = EU^{BB}_2$ for $s_2 > 0$, yields $\hat{s}^A_2$ and $\hat{s}^B_2$, respectively, as shown in equations D.3 and D.4.

\[
\hat{s}^A_2 = \frac{-(Y + X)^\alpha}{(Y + X)^\alpha + Y^\alpha - X^\alpha} \tag{D.3}
\]
\[
\hat{s}^B_2 = \frac{(Y + X)^\alpha}{(Y + X)^\alpha + Y^\alpha - X^\alpha} \tag{D.4}
\]

Recall that payoffs were scaled up by one token in the X = −1 treatment, which matters once utility is nonlinear. Accounting for this adjustment, the resulting expression for that treatment is

\[
\hat{s}^A_2 = \frac{-\left[(Y + X + 1)^\alpha - 1^\alpha\right]}{(Y + X + 1)^\alpha + (Y + 1)^\alpha - (X + 1)^\alpha - 1^\alpha}.
\]

To illustrate the effect of risk-preferences on optimal cutoff behavior, consider the risk-averse utility function $u(w) = \sqrt{w}$, or α = 0.5. Substituting this into the expressions above, $\hat{s}^{A*}_2$ becomes −0.807 and −0.193 for X = 1 and X = −1, respectively. These cutoffs correspond to predicted Imitation frequencies of 90.4% and 59.7%, which differ by less than 3 percentage points from those predicted under the assumption of risk-neutrality. It is therefore unlikely that risk-preferences account for the entirety of the influence of payoff externalities.
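A quick numeric check of these figures (ours; it uses the expressions above, including the reconstructed shifted-payoff formula for the X = −1 treatment):

```python
def cutoff_risk_averse(x, y=2.0, alpha=0.5):
    """Player 2's cutoff after d1 = A with utility u(w) = w**alpha (Appendix D)."""
    if x >= 0:
        num = (y + x) ** alpha
        den = (y + x) ** alpha + y ** alpha - x ** alpha
    else:  # X = -1 treatment: all payoffs shifted up by one token
        num = (y + x + 1) ** alpha - 1 ** alpha
        den = (y + x + 1) ** alpha + (y + 1) ** alpha - (x + 1) ** alpha - 1 ** alpha
    return -num / den

for x in (1, -1):
    s2a = cutoff_risk_averse(x)
    imitation = (1 - s2a) / 2       # P(s2 >= cutoff | d1 = A) with s2 ~ U[-1, 1]
    print(f"X={x:+d}: cutoff {s2a:.3f}, predicted Imitation {imitation:.3f}")
# Prints roughly -0.807 / 0.904 for X = +1 and -0.193 / 0.597 for X = -1.
```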


Appendix E. Computer as Player 1

Figure E.4 shows the distribution of $\hat{s}_1$ observed in experimental sessions across treatments. As stated in Section 5.3, the deviations from $\hat{s}^*_1 = 0$ result in $d_1 \neq d^*_1$ for 13, 19 and 15% of $d_1$ across treatments. Figure E.5 shows the distribution of $\hat{s}_2$ in treatments where a computer played the (predictable) role of Player 1, separated by X and $d_1$.

[Figure E.4: $\hat{s}_1$. Histograms by treatment; panels: (a) X = 1; (b) X = 0; (c) X = −1.]

[Figure E.5: $\hat{s}^A_2$ and $\hat{s}^B_2$: Computer as Player 1. Histograms by treatment and observed $d_1$; panels: (a) $\hat{s}^A_2$, X = 1; (b) $\hat{s}^A_2$, X = 0; (c) $\hat{s}^A_2$, X = −1; (d) $\hat{s}^B_2$, X = 1; (e) $\hat{s}^B_2$, X = 0; (f) $\hat{s}^B_2$, X = −1.]
