Gender, Social Networks and Performance∗

Ilse Lindenlaub†

Anja Prummer‡

October 20, 2017

Abstract This paper empirically documents gender differences in social networks across a wide range of settings, from friendship to work ties: Men have more connections (a higher degree) whereas women have denser networks (a higher clustering coefficient). We develop a theory that links network structure to productivity and earnings. While a higher degree leads to better access to information, more clustering leads to higher peer pressure. Both affect effort in a model of team production, each beneficial in different environments. We find that information is particularly valuable under high uncertainty, whereas peer pressure is more valuable in the opposite case. We therefore expect men to outperform women in jobs that are characterized by high uncertainty in project outcomes and earnings. We find empirical support for our predictions that (i) workers with low clustering and high degree have a comparative advantage in risky occupations and (ii) differences in network structure across gender can help explain disparities in their labor market outcomes.

Keywords: Networks, Peer Pressure, Information, Gender, Labor Market Outcomes

JEL Classification: D85, Z13, J16



∗ We are grateful to our advisors Jérôme Adda, Jan Eeckhout, Massimo Morelli, Nicola Pavoni and Fernando Vega Redondo for helpful discussions and advice. We further would like to thank Carlos Alós-Ferrer, Alp Atakan, Estelle Cantillon, Hector Chade, Giancarlo Corsetti, Raquel Fernandez, Edoardo Gallo, Sanjeev Goyal, François Maniquet, Andrea Mattozzi, Sujoy Mukerji, Francesco Nava, Onur Özgür, Brian Rogers, Carl Sanders, Karl Schlag, Klaus Schmidt, Jan-Peter Siedlarek, Adam Szeidl, Yves Zenou as well as seminar participants at the University of Cambridge, the University of Cologne, UC Louvain-CORE, Ludwig-Maximilians University Munich, the EUI Micro Working Group, the Berkeley Student Theory Group, EEA 2013, SAET 2013, CTN 2013 and PET 2013 for helpful comments. We thank Tianhao Wu and Mrinmoyee Chatterjee for very valuable research assistance. Additionally, we thank Unicredit, which awarded this paper the UWIN Best Paper Award. All remaining errors are ours.

† Contact: Department of Economics, Yale University, 28 Hillhouse Avenue, 06520 New Haven, USA, [email protected]

‡ Contact: School of Economics and Finance, Queen Mary University of London, Mile End Road, London, E1 4NS, UK, [email protected]


Loose connections are the connections you need. It’s the No. 1 rule of business.
Sallie Krawcheck, owner of the global women’s network 85 Broads1

1 Introduction

Gender differences in labor market outcomes remain striking. In the US, management occupations, such as financial manager and chief executive, display particularly large gender wage gaps, whereas healthcare support and administrative occupations show much smaller gaps. Similar patterns were found for the UK, where full-time working women in the financial sector earn only 68% of their male colleagues’ salaries – a gap almost twice as large as the gap in the economy as a whole. Gender differences also appear in promotions, a notable example being academia where women hold only 33% of tenured professor positions in the US.2 What these occupations and sectors have in common is that they are characterized by significant uncertainty regarding earnings and project outcomes: Research is characterized by ever-changing tasks with uncertain outcomes. In turn, earnings of both executives and financial managers are largely based on performance pay which drives the gender wage gap in these occupations, suggesting that men perform better.3

But why do women perform relatively poorly in “high-risk” occupations? We offer a novel answer to this question, which is based on social network heterogeneity between men and women. In the labor market, social networks have been shown to play a crucial role in shaping workers’ incentives and performance within a firm. This paper takes these insights a step further and proposes that the structure of social ties matters for labor market outcomes on the job. We first document that men and women exhibit different network structures in a variety of settings. We then develop a theory which predicts that men’s networks allow them to perform better than women in uncertain environments with potentially high but risky returns. Last, we provide empirical support for our predictions and show that network structure indeed affects labor market outcomes.

1 Krawcheck at Marie Claire’s luncheon for the New Guard, November 2013.

2 For the US, see https://www.bls.gov/opub/reports/womens-earnings/2016/pdf/home.pdf, BLSReports (2017), Goldin and Katz (2011), Goldin (2014) and data from the Integrated Postsecondary Education Data System (IPEDS) for 2015-2016 from the National Center for Education Statistics. See https://visual.ons.gov.uk/find-out-the-gender-pay-gap-for-your-job/, UK Office for National Statistics, 2016, for evidence on the UK.

3 See Equality and H.R.Commission (2009), Albanesi and Olivetti (2008) and Albanesi and Olivetti (2009) for the empirical relevance of performance pay for the wage gap.

Thus, we make three distinct but related contributions: First, we establish in three different environments that men’s and women’s social networks differ. We show that women have fewer friends than men, that is they have a lower degree, but their friends are more likely to be

friends among each other, implying a higher clustering coefficient. Thus, women have smaller but tighter networks, whereas men have larger but looser networks. This observation holds true in three data sets that allow us to construct agents’ networks: (i) the AddHealth data (a large longitudinal survey of young Americans) where we construct individuals’ networks based on friendship nominations; (ii) data from the Enron company where we construct employees’ networks based on their email exchanges; and (iii) in a large sample of academic computer scientists (from the dblp computer science bibliography) where we build the scholars’ networks based on co-authorships. The environments in which we identify gender differences in networks vary considerably – AddHealth is based on a non-work setting, Enron is a work setting within a firm, and academia is yet a different work setting. This highlights the pervasiveness of these gender disparities in network structure, which we believe are a new finding in the literature. Motivated by these findings, we develop a theory highlighting how network patterns affect labor market outcomes – the second contribution of this paper. Our theory relies on a trade-off between tight and loose networks that provide different types of social capital: a tight network fosters trust or peer pressure among agents, which reduces their incentive to shirk. This is because they fear repercussions not only from the individual they affect directly with their behavior but also from other members of their tight network. As a result, closed networks help overcome free-riding problems (Coleman (1988a)). But network closure comes at a cost. Networks with high closure do not allow individuals to access as much information as networks with lower closure. Being in a loose network with links to individuals that are not connected themselves is particularly valuable for information acquisition. 
This is what the literature has referred to as the “strength of weak ties” (Granovetter (1973)).4 We develop a theory where networks provide both access to information and peer pressure, and contribute to the literature by formalizing this trade-off in a labor market setting. We are interested in the circumstances under which tightly connected female networks, and thus high peer pressure, are more important for performance on the job, and in the environments in which the opposite is the case.

In our model, workers are ex-ante heterogeneous regarding their informal networks at work and are repeatedly selected into partnerships to complete projects of uncertain output value. Project success positively depends on the partners’ efforts, where effort is unobservable. If the project is completed successfully, the project payoff is shared between the team members. Because output is split but effort costs are not, there is a team moral hazard problem at work, inducing inefficiently low effort (as in Holmstrom (1982)).

4 These two types of social capital can also be related to the concepts of bonding versus bridging social capital defined in Putnam (2000).

We are interested in the effort levels of the project partners as a proxy for their performance and specifically in the factors influencing this choice. First, the choice of effort depends on information about the value of the project, which can be high or low, depending on the state of the world. Each worker receives a signal about the true value, and can observe the signals of their friends in the network. The more signals and thus information a worker has, the more precise is his belief about the state of the world and the better is his judgment as the optimal effort is state dependent. Second, effort is positively influenced by the amount of peer pressure individuals face. The amount of information and peer pressure of a worker depends on his network structure. Workers with a higher degree receive more signals and therefore hold more information. In turn, workers with higher clustering face more peer pressure through the following mechanism: A failed project leads to frictions not only between project partners but also between them and their common friends, that is their disagreement spreads through the entire group – an idea based on the structural balance theory.5 Since an intact friendship is necessary for a successful project, repercussions of a failure are especially bad for a worker with high clustering. Therefore, higher clustering leads to higher effort in order to be on good terms with future potential project partners. We analyze under which circumstances a network with higher clustering (and thus more peer pressure) is more beneficial for job performance and ultimately wages and when a network with a higher degree (i.e. more information) is advantageous. Our main theoretical findings are as follows: A higher degree is more beneficial for performance in volatile environments, where the uncertainty about the project value is considerable. 

5 This concept was first proposed by Heider (1946) and spawned a field of research that remains active today. For an overview of structural balance theory, see Easley and Kleinberg (2010), chapters 3 and 5.

This is true when (i) overall information (that is information coming from sources unrelated to the network) is scarce, (ii) when signals are noisy and (iii) when project rewards differ significantly across states. In these cases, uncertainty about the state of the world and associated rewards is large and the benefits of purely information-based, loose networks outweigh the benefits of closed networks that lead to more peer pressure. In turn, peer pressure leads to higher effort and thus project completion in environments characterized by certainty where additional information has no value. In general, someone with more information can better fine-tune his effort to the expected project reward, exerting high effort only when there is something at stake. In turn, workers facing high peer pressure exert extra effort even if the expected project reward is low. In addition to these predictions that high clustering is more conducive to effort in some environments and a high degree is in others, we also show that degree and clustering are complementary: The marginal effect of clustering on effort is particularly large

when degree is high (i.e. when information is abundant) and vice versa. Effort choices directly translate into wages. Someone with higher clustering earns more than someone with higher degree when uncertainty about the state is negligible. Such a worker then has a comparative advantage in jobs whose outcomes are more certain compared to jobs with less certain outcomes. We also show that, in line with our result on effort, the marginal return to clustering is higher when degree is high. Finally, due to the dynamic effect of clustering, there is a strong persistence of wage patterns across time, consolidating early career wage gaps. Our third contribution is to provide empirical support for our predictions. We focus on the two data sets in which we can measure performance: AddHealth, which contains information about labor market outcomes, and the sample of computer scientists, where we scrape data on performance from Google Scholar.6 In a first step, we test the predictions of our model that link network structure and labor market outcomes, based on the AddHealth data. First, we test our prediction on comparative advantage: We find that high clustering is associated with lower earnings but only in risky occupations, where we measure occupational riskiness as the unexplained variation in earnings within an occupation. Moreover, lower clustering and higher degree makes it more likely that an individual is employed in a risky occupation. We view these findings as evidence for our model’s prediction that individuals with relatively high clustering and/or relatively low degree have a comparative advantage in less risky environments. Second, and in line with our theory, we find that both characteristics, degree and clustering, are complementary in earnings, that is the marginal return to clustering is increasing in degree. 

6 Note that we do not have any performance measures in the Enron email data.

Third, we find suggestive evidence for our prediction that the earnings gap at the beginning of the career persists especially for those individuals with high clustering.

A main concern is that characteristics omitted from this analysis may drive both the network structure an individual builds and labor market outcomes. For instance, more risk averse agents may build closer networks with higher clustering and at the same time select into less risky work environments. Another trait that could influence both network structure and labor market outcomes is competitiveness. To address this issue, we obtain proxies for risk aversion and competitiveness from AddHealth and use them as additional controls. Importantly, this does not alter our results on how network characteristics impact labor market outcomes.

In a second step, we aim to bring the focus back to gender disparities in labor market outcomes. Our findings on gender differences in social networks coupled with the model and our

empirical analysis on network structure and labor market outcomes lead to the following prediction: Men, based on their loose networks, should outperform women in work environments that are characterized by uncertainty. We turn to two data sources to provide evidence for this claim. First, we find that female academic computer scientists – we view computer scientists in research as an occupation that involves significant uncertainty – would perform considerably better if they adjusted their network characteristics towards lower clustering and higher degree. Second, we provide suggestive evidence from the US Census that the gender earnings gap is considerably higher in risky occupations – a finding not documented previously. We provide a novel reason why women perform poorly in these occupations, which is based on their disadvantageous network structures.

The paper proceeds as follows: We first discuss the related literature in Section 2. In Section 3, we document empirically how men’s and women’s networks differ. In Section 4, we develop our model. Section 5 tests the model’s predictions empirically. Section 6 discusses both challenges in the empirical analysis as well as model choices. Section 7 concludes.
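
The informational channel summarized in this introduction (more signals sharpen a worker's belief about the state of the world) can be illustrated with a small numerical sketch. It assumes, purely for illustration, a binary state (high or low project value), a common prior, and conditionally independent signals that are each correct with probability q. None of these functional forms is taken from the model in Section 4, and all function names are hypothetical. The sketch computes the expected posterior variance of the belief about the state, a simple proxy for remaining uncertainty, as a function of the number of signals observed.

```python
from math import comb


def posterior_high(prior, q, n, h):
    """Posterior probability of the high state after n signals, h of which say 'high'.

    Assumption (illustrative only): each signal is correct with probability q,
    independently across signals conditional on the state.
    """
    like_high = q ** h * (1 - q) ** (n - h)
    like_low = (1 - q) ** h * q ** (n - h)
    return prior * like_high / (prior * like_high + (1 - prior) * like_low)


def expected_posterior_variance(prior, q, n):
    """Expected value of post * (1 - post), averaged over all signal realizations."""
    total = 0.0
    for h in range(n + 1):
        # Probability of observing h 'high' signals under each state.
        p_h_given_high = comb(n, h) * q ** h * (1 - q) ** (n - h)
        p_h_given_low = comb(n, h) * (1 - q) ** h * q ** (n - h)
        p_h = prior * p_h_given_high + (1 - prior) * p_h_given_low
        post = posterior_high(prior, q, n, h)
        total += p_h * post * (1 - post)
    return total


if __name__ == "__main__":
    # Hypothetical parameter values: even prior, signals correct 70% of the time.
    for n_signals in range(6):
        ev = expected_posterior_variance(prior=0.5, q=0.7, n=n_signals)
        print(n_signals, round(ev, 4))
```

Under these assumptions the remaining uncertainty falls as the number of signals grows, and it does not fall at all when signals are uninformative (q = 0.5). This is in the spirit of the claim that additional network contacts are most useful when there is something to learn about the state.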

2 Related Literature

Our paper contributes to several strands of literature, which we discuss in turn.

2.1 Gender Differences in Labor Market Performance

First, we contribute to the work on the gender gap in labor market performance and wages.7 Common explanations for this gap are discrimination (Black and Strahan (2001), Goldin and Rouse (2000), Wenneras and Wold (1997)), differences in abilities and preferences which result in occupational self-selection (Polachek (1981)), differences in the number and length of career interruptions (Bertrand et al. (2010)), differences in hours worked at home, translating into lower female effort at work (Albanesi and Olivetti (2009)) or differences in performing job tasks with low promotability (Babcock et al. (2017)). Differences between men and women have also been found in their competitiveness (Gneezy et al. (2003), Gneezy and Rustichini (2004), Niederle and Vesterlund (2007)), in their preferences for team-based over individual pay schemes (Kuhn and Villeval (2013)), in their risk aversion (for a summary, see Eckel and Grossman (2008)), in their ability to bargain (Babcock and Laschever (2003), Card et al. (2013)), and in terms of future fertility concerns (Adda et al. (2011)). Similar to us, some papers argue that the wage gap is more severe in high-powered jobs, especially in the financial and corporate world.

7 Reviews of gender wage differences and possible explanations can be found in Blau and Kahn (2000), Bertrand (2011), Blau and Kahn (2016).

Albanesi and Olivetti (2009) attribute this to women’s higher cost of managerial effort due to more working hours at home. Goldin and Katz (2011) and Goldin (2014) argue that in these occupations the penalty for reduced and flexible work hours is particularly high. Our paper, which is complementary to this literature, suggests a new disparity between men and women, their network structure, as a source of the performance gap and ultimately the wage gap.

2.2 Gender and Networks

Second, we add to the literature on gender differences in networks. This literature originates in sociology with the study of children’s and adult networks. Children’s Networks Children’s networks are generally measured through interaction in play. A common finding is that boys tend to play in larger groups, whereas girls interact with few others (Halverson Jr and Waldrop (1973), Laosa and Brophy (1972), Lever (1976)). If networks are determined through nomination of friends, then boys and girls name the same number of friends on average (Eder and Hallinan (1978)). The fact that an analysis of best friends leads to different friendship patterns than an analysis of play is also evidenced in Hallinan (1980), Belle (1989) and Hartup (1983). For research that focuses on how integrated networks are, see Benenson (1990, 1993), Parker and Seal (1996). All three studies find that boys have both larger and denser networks. However, there is also work highlighting that friendship patterns are subject to age and the specific environment, in particular with respect to the share of boys versus girls (Eder and Hallinan (1978), Cairns et al. (1995)). Contrary to previous studies, the most recent analysis, Lee et al. (2007), finds that girls have a higher number of friends. A general issue with the studies is that they have very small sample sizes.8 Generally, a comparison between the studies is impeded due to the differences in how networks and friendships are measured and the findings being far from conclusive. We improve upon these studies by analyzing networks of school children in a large sample, where we have access to the entire school. 
This allows us to comprehensively identify the overall network structure, going beyond the number of friends.9

8 Halverson Jr and Waldrop (1973) have a sample of “58 normal white children of middle-class parents (33 males and 25 females)” (p 678f); Lever (1976) analyzes the interaction patterns of 181 school children. Eder and Hallinan (1978) focus on dyadic relationships in the classroom with 5 classes of 25-35 students each. Benenson (1990) documents her findings for a sample of 81 boys and 73 girls from 8 different 4th and 5th grade classes.

9 While there is a large number of publications using the AddHealth data set, which are accessible on the AddHealth website, these studies differ from ours as none of them analyze the network structure of boys versus girls explicitly. The work most closely related to ours using this data is Olivetti et al. (2015) who find a positive relationship between the labor supply of daughters and their mothers as well as their friends’ mothers.

Adult Networks Sociological studies of adult networks point out interaction patterns similar to those found in childhood (Booth (1972), Fischer and Oliker (1983), Marsden (1987), Moore (1990)): men and women tend to name the same total number of friends, but this depends greatly on the life stage. Married women with children named the fewest friends. Additionally, women’s friends tend to be related to them, whereas men’s friends stem from the workplace. More recently, Baumeister and Sommer (1997) argue that adult men, due to the sexual nature, tend to have larger networks than women, whose networks tend to be dyadic. Friebel and Seabright (2011) consider phone conversations and find that women make fewer calls than men, but that their calls last longer. Friebel et al. (2017) document that women form networks more selectively in an experiment conducted with 250 undergraduates, resulting in tighter networks. Contrary to this, no difference in networking between men and women has been found by Mengel (2015) in another experiment. Overall, however, these studies suggest that men tend to have larger networks, whereas women’s networks are tighter – features that we will measure precisely by computing degree and clustering coefficient across gender and that we will confirm in large datasets across school and work environments.
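
As a minimal illustration of the two network statistics referred to throughout, the following Python sketch computes a node's degree and its local clustering coefficient (the share of pairs of a node's neighbors that are themselves linked). The undirected toy network, the node labels and the function names are hypothetical and chosen only for illustration. The text up to this point does not spell out the exact definitions applied to the AddHealth, Enron and dblp data, so the standard local clustering coefficient on an undirected graph is assumed here.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set of neighbors} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree(adj, node):
    """Number of direct connections of a node."""
    return len(adj[node])


def clustering(adj, node):
    """Local clustering coefficient: fraction of neighbor pairs that are linked.

    By convention (an assumption here), nodes with fewer than two neighbors get 0.
    """
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    linked_pairs = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return 2.0 * linked_pairs / (k * (k - 1))


# Hypothetical toy network: a closed triangle around "a", an open star around "m".
edges = [
    ("a", "b"), ("a", "c"), ("b", "c"),
    ("m", "p"), ("m", "q"), ("m", "r"), ("m", "s"),
]
adj = build_adjacency(edges)

if __name__ == "__main__":
    for node in ("a", "m"):
        print(node, degree(adj, node), clustering(adj, node))
```

In this toy network, node a has a small but fully closed neighborhood (lower degree, higher clustering), whereas node m has more links but none among its neighbors (higher degree, lower clustering), which mirrors the tight versus loose contrast discussed above.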

2.3 Networks and the Labor Market

Networks also have implications for various labor market outcomes.10 We focus here on referral networks and networks at the workplace including co-authorship networks with a focus on gender differences. Additionally, we discuss the role of networks in building social capital, which in turn affects labor market success.

Referral Networks A significant share of the literature on labor markets and networks focuses on job search or job referral networks, both in sociology and economics.11 This is in contrast to our work which focuses on networks at the workplace and how they affect performance on the job. Additionally, the role of gender is usually neglected. Referral networks affect wages, but the empirical evidence is mixed, with Marmaros and Sacerdote (2002), Dustmann et al. (2015), Brown et al. (2016), Damm (2009), Hensvik and Skans (2016), Schmutte (2014), Seidel et al. (2000) showing that referrals lead to higher wages. In contrast, Åslund and Skans (2010) and Bentolila et al. (2010) find a negative effect of referrals on wages. The networks used in these studies tend to be ethnic-based (Borjas (1992, 1995), Bandiera et al. (2009), Bertrand et al. (2000), Dustmann et al. (2015)). Further, networks are often proxied by neighborhoods (Bayer et al. (2008), Hellerstein et al. (2011), for an overview see Topa (2011)). Others use networks that are created through membership in firms and co-workers,12 but a general issue with this approach is that working at a larger firm automatically yields a larger network. Variations in referrals across network attributes have been considered by Granovetter (1995), who emphasizes the importance of weak ties in job search and in reducing unemployment.13 Further, weak ties tend to increase wages, although the evidence is not particularly robust (Green et al. (1995), Green et al. (1999), Bridges and Villemez (1986), Elliott (1999)). Most relevant for our work is the finding that women tend to have networks that are detrimental to job search due to their reduced span (Campbell (1988), Leicht and Marx (1997)).14 We consider the referral networks literature as complementary to our work. While referrals are important, so are networks at the workplace and we lack an understanding of how networks affect job performance, a gap our paper addresses. An empirical analysis of the importance of networks at the workplace is challenging as the economic effects may also be driven by the referral network. We will discuss below why it is unlikely that all of our empirical results are driven by referral networks, though.

Networks at the Workplace Most relevant for our work is the analysis of networks at the workplace. Networks within firms have received limited attention, mainly due to the scarcity of data. We focus here on studies that analyze the network structure with a focus on gender differences.15 The first studies on this topic by Burt (1992, 1998) highlight that women do not build denser networks, but that their networks are larger. Interestingly, the firm rewards different types of networks for men and women. Women’s networks that lead to success tend to be denser, with a lower span than men’s, whereas men are rewarded for sparser, more spread out networks. Ibarra (1992, 1993, 1997) confirms the benefit that women obtain from denser networks. However, she shows that women do not build denser networks than men, implying an unawareness about what type of network is beneficial. Burke et al. (1995), in line with the previous studies, show that men’s and women’s network size does not differ greatly.16 The discussed papers have several shortcomings, the first being small sample sizes that vary between 63 and 284. Further, all of the studies focus on managers (within narrow sectors, such as advertising, computers, insurance). We would expect a significant self-selection into occupations and sectors, which might bias the observed networks. The most recent study on communication patterns within a technology company and an electronics company is provided by Kleinbaum et al. (2008). They highlight that women participate in a greater volume of electronic and face-to-face interactions and do so with a larger and more diverse set of communication partners. A problem with their finding, and that of other studies, is the lack of distinction between formal and informal networks (a notable exception being Ibarra (1997)). Ibarra (1992), for example, shows that the organizational structure (i.e. the formal network which we consider as given to the individual) impacts what kind of networks are reported. We improve upon previous studies by analyzing the informal network of Enron in a more general setting, across occupations, based on a relatively large sample and without relying on self-reported data.

10 For example, they offer insurance which affects mobility (Munshi and Rosenzweig (2016)). Azmat and Ferrer (2015) consider networks as a means to attract new customers.

11 Empirical job referral networks are analyzed in Dustmann et al. (2015), Bandiera et al. (2009), Beaman and Magruder (2012), Beaman et al. (2013), Hensvik and Skans (2016), Heath (2013), Schmutte (2014), Goel and Lang (2009); a review of previous studies is provided by Ioannides and Loury (2004) and Topa (2011). From a theoretical perspective, Montgomery (1991), Simon and Warner (1992), Marsden and Gorman (2001), Arrow and Borzekowski (2004), Calvó-Armengol and Jackson (2004), Calvó-Armengol and Zenou (2005), Calvó-Armengol and Jackson (2007), Galenianos (2013), Fontaine (2008) consider the effect of differences in job search networks.

12 See Lalanne and Seabright (2016), Shue (2013), Hwang and Kim (2009), Renneboog and Zhao (2011), Horton et al. (2012), Kramarz and Thesmar (2013), Fracassi and Tate (2012), Engelberg et al. (2012), Brown et al. (2012), Liu (2014), Berardi and Seabright (2011), Cingano and Rosolia (2012), Glitz (2015), Saygin et al. (2014).

13 This notion was formalized by Montgomery (1994) and empirically confirmed by Yakubovich (2005). Granovetter (1973) defines weak ties as those found between acquaintances with occasional and less intense contact. Weak ties also lead to higher occupational status (Lin et al. (1981)).

14 There exist several papers in which weak ties are defined as those with own ethnic groups and strong ties with family and friends (see Kuzubas and Szabo (2015), Goel and Lang (2009)). These papers again are not concerned with network structure and do not take gender into account.

15 Work on social interactions at firms without focus on gender includes Rotemberg (1994), Ananat et al. (2013). Reagans and Zuckerman (2001) analyze global network patterns.

16 Kürtösi (2008) also studies networks at work, analyzing the networks of 250 individuals working for a city council in Hungary. She highlights that relationships at the workplace are heavily affected by the organizational structure and does not disentangle informal and formal networks. Additionally, men and women tend to have very different occupations, making a comparison of their networks difficult.

Collaboration Networks While the studies mentioned so far have focused on networks within firms, there is also a growing literature on academic collaboration networks. The advantage of collaboration networks is that agents choose their network, with little organizational constraint (unless we consider fields where researchers work together in laboratories). While there is an abundance of studies in different fields, we focus here on research that addresses gender differences in co-authorship networks.17 In economics, Boschini and Sjögren (2007) find that, controlling for subfield, women are more likely to work on their own, but co-authorship patterns depend on how research intensive the environment is (McDowell et al. (2006)). Female mathematicians, on the other hand, are more likely to collaborate, while men and women have the same number of distinct co-authors (Mihaljević-Brandt et al. (2016)).

17 Co-authorship in economics has been analyzed by Goyal et al. (2006), Ductor et al. (2014), Fafchamps et al. (2010), Van der Leij and Goyal (2011). Moody (2004) analyzes co-authorship patterns in sociology, without addressing gender. The same holds for Newman (2001) who looks at biomedical research, physics and computer science, Kretschmer and Gupta (1998), who consider co-authorship in theoretical population genetics, Kundra and Kretschmer (1999), who consider collaboration in Indian medicine and Tomassini and Luthi (2007), who describe the global co-authorship network in genetic programming. Academic librarians and their co-authorship patterns are analyzed by Bahr and Zemon (2000). Additional work on co-authorship networks is by Glänzel and Schubert (2004). Yoshikane and Kageura (2004) analyze co-authorship patterns in electrical engineering and information processing as well as in polymer science and biochemistry.

The fact that co-authorship patterns vary by field and gender has also been shown by Schucan Bird (2011), who analyzes publications in the social sciences. According to Garg and Kumar (2014), women in the life sciences in India tend to work in smaller teams with fewer internationally collaborative papers. Abramo et al. (2013) find a higher propensity for females to collaborate across various fields. However, women are less likely to collaborate internationally. Campion and Shrum (2004) and Miller and Shrum (2012) show that

men and women in Ghana have roughly the same number of co-authors. Remarkably, access to improved communication technologies has led to a decrease in network size for women and widened the gender gap.18 This highlights that gender differences in collaborations vary across fields. The evidence on whether women co-author more or less is inconclusive. This may also be due to self-selection or the fact that gender ratios are different across fields. Further, the evidence is often constrained by small sample size (e.g. Schucan Bird (2011)) or specific years (Boschini and Sjögren (2007)). Circumventing both of these shortcomings, we analyze gender differences in networks in a field that has not been addressed before, computer science, in addition to high school students and the Enron email data. Contrary to the discussed studies, we do not only focus on degree in this collaboration network but also on clustering and establish significant gender differences, with women having a higher clustering coefficient but men having a higher degree.

Team Interactions The importance of social interactions was first analyzed in economics by Becker (1974) with a focus on family relations. Departing from this, social interactions have been addressed at the workplace, with a focus on the three following aspects: (i) team collaboration at the workplace, (ii) the incentives teams provide and (iii) the impact of team work on productivity.19 We emphasize the role of network structure and the trade-off between network density and network span. This goes back to the sociology literature, in particular to Coleman (1988a,b), Granovetter (1973, 1995, 2005), and Burt (1992, 1997). In economics, this trade-off has been analyzed in the context of borrowing (Karlan et al. (2009), Jackson et al.
(2012)) and trade (Dixit (2003)) but has so far not been linked to productivity.20

18 Bozeman and Gaughan (2011) highlight that women have slightly more collaborators on average than do men across various disciplines. However, their data are obtained through self-reported collaborations.
19 See Rotemberg (1994), Ichniowski et al. (1997), Podolny and Baron (1997), Gibbons (1998), Hamilton et al. (2003), Bandiera et al. (2005), Bandiera et al. (2009), Bandiera et al. (2010), Mas and Moretti (2009), Gibbons and Waldman (1999), Kandel and Lazear (1992), Jackson and Schneider (2011).
20 The effect of community size on social outcomes is investigated by Allcott et al. (2007).
21 See Yakubovich (2005), Green et al. (1995), Green et al. (1999), Bridges and Villemez (1986), Elliott (1999). General effects of social network structures are analyzed in Granovetter (2005), Jackson et al. (2016).
22 See Burt (1992), Horton et al. (2012), Liu (2014), Berardi and Seabright (2011), Engelberg et al. (2012), Brown et al. (2012) for the effects on promotion and Conti et al. (2013) for the effect of indegree (capturing popularity) on wages.

Networks and Labor Market Success The literature on referral networks has already highlighted that a greater network span can lead to lower unemployment and higher wages.21 Additionally, certain networks have been related to a number of career advantages, for example quicker promotions or a wage premium.22 An overview of social networks and their role in organizations is provided by Kilduff and Tsai (2003). Social networks within organizations have also been connected to job satisfaction (Sparrowe et al. (2001), Hurlbert (1991), Flap and Völker (2001)) and mobility (Podolny and Baron (1997)). Networks and research productivity have been analyzed by Azoulay et al. (2010), Waldinger (2010), Ductor et al. (2014). The effect of networks on promotions in academia is discussed in McDowell and Smith (1992), Combes et al. (2008), Zinovyeva and Bagues (2015).23 While networks matter for labor market outcomes, their role seems to differ for men and women. First, there is evidence that women at the workplace do not get to participate in informal networks, which hinders their success within an organization as they lack relevant information and social capital (Linehan and Scullion (2008), Lyness and Thompson (2000), Metz and Tharenou (2001), McGuire (2000), McCarthy (2004), Bierema (2005)). Singh and Vinnicombe (2004) relate social exclusion in networks to the lack of women in the boardroom. This pattern carries over to collaboration networks, with women being excluded from informal networks, which affects their productivity (Primack and O’Leary (1993), Shen (2013), Hong and Zhao (2016), Ynalvez and Shrum (2011), Nicolaou and Birley (2003), Hu et al. (2014), Lee and Bozeman (2005), Gersick et al. (2000), Kyvik and Teigen (1996)). Second, even if women network, they do not reap the same benefits as men. This is documented in Brass (1984, 1985), Carroll and Teo (1996), Brett and Stroh (1997), Kanter (1977), Ragins and Sundstrom (1989) and most recently Forret and Dougherty (2004), who show in a sample of MBA graduates in the US that network activities positively affect career outcomes only for men. Further, networks create social capital, which in turn affects labor market success (Burt (1997), Lin (1999), Seibert et al. (2001), Van Emmerik (2006), Franzen and Hangartner (2006)). Van Emmerik (2006) shows that men and women create the same level of soft social capital, but men create more hard social capital, implying that men co-author and collaborate more. We add to this literature by developing a theory on the role of network structure for productivity and earnings.
The environments that have been analyzed empirically so far are managerial or academic in nature and our framework fits these environments as we consider collaboration on projects. Our model thus focusses on work settings where we know that networks have an impact, and we provide a novel mechanism that clarifies why this is the case. Moreover, we contribute new evidence based on AddHealth and a large sample of academic computer scientists that network structures indeed matter for labor market outcomes.

3 Gender Differences in Networks

We discuss gender network structures in three dissimilar settings. First, we analyze network patterns among students. Second, we turn to the email networks from Enron. Last, we investigate the co-authorship networks in academic computer science.

23 For an overview of gender differences in promotions see Ginther and Kahn (2004).

We begin with a formal definition of graphs that represent networks. A graph consists of a set of nodes $N$ and an $n \times n$ matrix $g$, where $g_{ij}$ represents the possibly directed relation between $i$ and $j$. As we focus on unweighted graphs, $g_{ij}$ equals either 0 or 1. For each node in the graph, we define two concepts that allow an assessment of agents’ network structure: degree and clustering coefficient.

Degree The degree is a measure of how connected an individual is. There are three types of degree: in-degree (ID), out-degree (OD) and degree (D). Respectively, they are denoted by

\[
ID_i = \sum_j g_{ji}, \qquad OD_i = \sum_j g_{ij}, \qquad D_i = \sum_j \min\{1, g_{ij} + g_{ji}\}.
\]

The in-degree describes how many agents named or wrote an email to individual $i$. The out-degree provides information on how many agents individual $i$ nominated or sent emails to. For an undirected network (like that of academic computer scientists), only the degree is defined.

Clustering Coefficient The clustering coefficient is a measure of how close-knit or tight a network is. It is computed as the ratio of the actual number of links between a node’s neighbors to the total possible number of links between the node’s neighbors. This measure depends on whether the network is directed or undirected:

\[
CC_i^{dir} = 2\,\frac{\sum_{j \neq i;\, k \neq j;\, k \neq i} g_{ij} g_{ik} g_{jk}}{D_i (D_i - 1)}, \qquad
CC_i = \frac{\sum_{j \neq i;\, k \neq j;\, k \neq i} g_{ij} g_{ik} g_{jk}}{D_i (D_i - 1)}.
\]

3.1 Friendship Networks

We obtain the friendship networks from the AddHealth data set, which contains data on students in grades 7-12 from a nationally representative sample of roughly 140 US schools in 1994-95. Every student attending the sampled schools on the interview day was asked to complete a questionnaire (in-school data) on respondents’ demographic and behavioral characteristics, education, family background and friendships. This sample contains information on 90,118 students. Students were asked to name up to 5 male and 5 female friends. The AddHealth website describes surveys and data in detail.24 The friendship network constructed from the AddHealth data is a directed network, based on friendship nominations.25 For this network, we compute both directed and undirected clustering coefficients as well as in-, out- and overall degree. We restrict attention to the individuals whose age and gender we can identify and those with an identifier. This leaves us with a dataset of 85 627 students. The descriptive statistics of the students are provided in Table 1, split up according to gender. The results for the entire sample are given in Table 3. We show that girls always have a higher clustering coefficient than boys across all ages. In turn, all measures of degree change with age. Younger girls have a higher degree than younger boys, whereas older boys (from the age of 16 onwards) have a slightly higher degree than older girls. We then exclude the youngest and oldest students from the sample to highlight that our results are not due to these outliers, see Table 4. Further, we ensure that our results are not due to a different share of boys and girls at schools by restricting attention to schools where the share of boys and girls is balanced. Our baseline results carry over to this specification as well (see Table 5). Additionally, some of the friends named do not attend the same school. In our regressions so far, we exclude these external friends. We also check whether degree is affected by including these individuals and we show that this is not the case (Table 6). So overall, we find that clustering is unambiguously higher for girls, hinting at girls choosing denser and tighter networks. The number of friendships is much more sensitive to age, confirming the results of sociologists who do not find conclusive evidence for the number of nominated friends. However, at older ages, which are most relevant for the labor market, boys have larger networks in all of our specifications.

24 For more details on the AddHealth data, see http://www.cpc.unc.edu/projects/addhealth.
25 For more details on the friendship networks, see Appendix A.
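The degree and clustering measures computed for the friendship network can be illustrated on a toy example. The following is a minimal sketch, not the code used for the paper: it assumes the network is stored as a 0/1 adjacency matrix (a list of lists, a layout chosen here for illustration) with `g[i][j] = 1` when `i` nominated `j`, and it computes in-, out- and overall degree together with the undirected clustering coefficient on the symmetrized network. Assigning a coefficient of zero to nodes with fewer than two neighbours is a convention assumed here, and the function names are hypothetical.

```python
def degrees(g):
    """Return (in_degree, out_degree, degree) for a 0/1 adjacency matrix g.

    g[i][j] = 1 means that i nominated j (a directed link from i to j).
    """
    n = len(g)
    others = lambda i: (j for j in range(n) if j != i)
    in_deg = [sum(g[j][i] for j in others(i)) for i in range(n)]
    out_deg = [sum(g[i][j] for j in others(i)) for i in range(n)]
    # Overall degree counts each neighbour once, whatever the link direction.
    deg = [sum(min(1, g[i][j] + g[j][i]) for j in others(i)) for i in range(n)]
    return in_deg, out_deg, deg


def undirected_clustering(g):
    """Undirected clustering coefficient of every node.

    The network is first symmetrized. The sum runs over ordered pairs (j, k)
    of distinct neighbours, so each link among neighbours is counted twice
    and the ratio to D_i (D_i - 1) lies between 0 and 1.
    """
    n = len(g)
    u = [[min(1, g[i][j] + g[j][i]) if i != j else 0 for j in range(n)]
         for i in range(n)]
    coefficients = []
    for i in range(n):
        d = sum(u[i])
        if d < 2:
            coefficients.append(0.0)  # assumed convention for D_i < 2
            continue
        links = sum(u[i][j] * u[i][k] * u[j][k]
                    for j in range(n) for k in range(n)
                    if j != i and k != i and j != k)
        coefficients.append(links / (d * (d - 1)))
    return coefficients


# Toy network with nominations 0->1, 1->0, 0->2, 2->1, 3->0.
toy = [[0, 1, 1, 0],
       [1, 0, 0, 0],
       [0, 1, 0, 0],
       [1, 0, 0, 0]]
in_deg, out_deg, deg = degrees(toy)
clustering = undirected_clustering(toy)
```

In the toy network, node 0 has three neighbours of which only one pair is linked, so its clustering coefficient is one third, whereas nodes 1 and 2 each have two neighbours that are linked to each other.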

3.2 Networks at the Workplace

We also study networks at work, at the Enron company. We reconstruct the network at Enron based on email communications that were made publicly available by the Federal Energy Regulatory Commission during its investigation of this company following its fraudulent bankruptcy.26 The original dataset contains about half a million emails, containing 44 583 distinct email addresses, either senders or receivers. Since our focus here is on the (informal) network at work, we focus on those email addresses containing the ending “enron.com”. One challenge is that ‘gender’ is not recorded. Fortunately, in many email addresses first and last name are separated, so we are able to extract the first name of the employee and assign a gender.27 This procedure leaves us with 10 949 individuals with an Enron email address whose gender we successfully predicted.28 We further restrict our attention to those emails that have a single receiver, as it seems plausible that group emails may not be indicative of the informal network that we seek to analyze. This reduces the sample to 6 604 employees with 2 685 women and 3 919 men, which we call our baseline sample. We construct the unweighted network of these 6 604 employees based on their email communication. Since emails are in principle directed, we report both directed and undirected network characteristics. But our main focus is on the undirected clustering coefficient and the degree, as those measures correspond best to the ones we use in our model. The summary statistics are in Table 7 in Appendix A and we find that women have a significantly higher clustering coefficient and a lower degree compared to men (Table 8).

One concern with our analysis based on Enron emails is that it captures both professional and social ties of employees. Our theory, however, is about the impact of informal/social networks at the workplace. To address this concern, we perform a text analysis of the email contents, which helps us classify emails into those that are ‘work-related’ and those that are ‘social’, based on the content of the email subject and body (see Appendix A for details). We specify that the algorithm classify emails into 10 topics and then investigate, based on the returned key words (i.e. most frequent words) of those topics, which ones are predominantly work-related and which ones are social. See Table 9 for the topics and corresponding key words, Table 10 for how many emails are assigned to each topic and Table 11 for the average network characteristics by topic. We focus on the two topics whose key words mostly convey social content (Topic 7 and Topic 9) and construct the ‘informal’ network based on them. They contain 969 unique Enron email addresses to which we can assign a gender, with 545 men and 424 women. Our results in Table 12 show that when restricting our attention to purely social ties, the gender differences in networks remain robust: Women have a lower degree and a higher clustering coefficient. This finding is robust to the case where we focus on more frequent interactions: we exclude the links that were based on only (i) a single email, (ii) two emails, or (iii) three emails between any two employees (Tables 13-15). We also address the issue that the original sample contained more men (which may bias our results if network formation is predominantly driven by homophily) by randomly selecting an equal number of men and women from our sample. We reconstruct the network based on this randomly selected subsample and show that women still have a significantly higher clustering coefficient and a lower degree (Table 16). These patterns are not only apparent when focussing on degree and undirected clustering coefficient but are robust to considering in- and out-degree as well as the directed clustering coefficient, see again Tables 8 and 12-16. We conclude that the informal networks constructed from Enron email data display significant gender differences, with men having a higher degree and women a higher clustering coefficient.

26 The Enron data is available at http://www.cs.cmu.edu/~enron/.
27 We use the package Gender in R to predict gender. The package provides a function to predict gender from names using historical data; a comprehensive description can be found via https://cran.r-project.org/web/packages/gender/gender.pdf.
28 While we lose a significant number of individuals because their email addresses do not allow us to detect their gender, we think that it is unlikely that those missing employees are predominantly male or female.
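The construction of the email network and the frequency-based robustness checks can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the procedure actually used: emails are assumed to be given as `(sender, recipients)` pairs, emails are counted per unordered pair of employees regardless of direction, and the function name `email_network` and the parameter `min_emails` are hypothetical.

```python
from collections import Counter


def email_network(emails, min_emails=1):
    """Build an undirected, unweighted network from email records.

    emails: iterable of (sender, recipients) pairs, recipients being a list.
    Only emails with a single recipient are used, and a link between two
    employees is kept if they exchanged at least `min_emails` such emails
    (counting both directions together, which is an assumption made here).
    Returns a dict mapping each employee to the set of their neighbours.
    """
    counts = Counter()
    for sender, recipients in emails:
        if len(recipients) != 1:
            continue  # drop group emails
        receiver = recipients[0]
        if receiver == sender:
            continue  # ignore emails sent to oneself
        counts[frozenset((sender, receiver))] += 1

    neighbours = {}
    for pair, count in counts.items():
        if count >= min_emails:
            a, b = tuple(pair)
            neighbours.setdefault(a, set()).add(b)
            neighbours.setdefault(b, set()).add(a)
    return neighbours


# Hypothetical records: one group email and one self-addressed email are dropped.
records = [("ann", ["bob"]), ("bob", ["ann"]), ("ann", ["cat"]),
           ("ann", ["bob", "cat"]), ("cat", ["cat"])]
baseline = email_network(records)                 # every single-recipient link
frequent = email_network(records, min_emails=2)   # drops links based on one email
```

Raising `min_emails` to 2, 3 or 4 mirrors the idea of excluding links that rest on only one, two or three emails; the degree of a node is then the size of its neighbour set.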

3.3 Co-Authorship Networks

Last, we study another distinct work environment: collaboration networks of academic computer scientists. We obtain this data from the dblp computer science bibliography, a service providing open bibliographic information on all major computer science journal publications since 1995 (our sample includes academic papers from 1995-2016).29 The raw data set contains names of scholars and the names of co-authors for each publication listed on the platform. After data cleaning, we have 1 348 324 unique names. To 906 863 of them we can assign a gender, based on their first names (where we use the same gender identification program as for Enron). We have 672 630 men and 234 233 women in this baseline sample and construct their co-authorship networks based on 1 978 857 academic articles. In this network, nodes are authors and a link between two nodes exists if the corresponding authors have published at least one paper together. Note that this co-authorship network is an undirected network since collaborations are bilateral. Hence the network characteristics we compute are degree and (undirected) clustering coefficient. On average, a computer scientist has a degree of 7 (i.e. 7 co-authors) and a clustering coefficient of 0.13, see Table 17. Our baseline results show significant differences in collaboration networks across gender: While female computer scientists have a higher clustering coefficient, male computer scientists have a significantly higher degree (Table 18). As a robustness check, we count co-authors only if their gender is known. The gender differences in networks are robust (Table 20).

29 See http://dblp.uni-trier.de/

We have analyzed the network structure of men and women in three very different environments: a private one (AddHealth) and two work settings (Enron and computer science). Across the board, we find that there are significant differences in how men and women network, with men having a higher degree (this holds true also for AddHealth when we focus on older age groups instead of younger students) and women having a higher clustering coefficient. We believe this finding is new in the literature. We are also confident that the networks we analyze reflect informal networks rather than networks that are heavily influenced by the organizational structure of the work environment. This is uncontroversial in the AddHealth data, where everyone can choose whom to be friends with, as well as in the network of computer scientists, where co-authorships are voluntary. This is less clear-cut within Enron, which is why we conducted a text analysis that allows us to focus on emails with social content.

We now build a model where both degree and clustering coefficient impact labor market outcomes and, in a second step, connect our model’s predictions to the data.

4 Model

4.1 Model Set-Up

We consider an undirected network $g$ of $N$ workers. Two of those workers, $i, j \in N$, are selected in each period $t$. We focus here on a two period model, $t \in \{1, 2\}$, but the analysis readily extends to a longer horizon. Once two workers are selected they have to complete a project. Whether they are successful depends on their exerted effort, which in turn depends on their network structure and past project outcomes. In order to highlight how each of these factors matters we first consider the game that is played in each period $t$.

1. Worker Selection At the beginning of each period, any two workers are drawn with the same probability from the set of workers to complete a project. Whenever two workers $i$ and $j$ have a direct link, denoted by $g_{ij} = g_{ji} = 1$, they have an informal connection. We assume that workers can only complete their project successfully if there exists a direct link between them. If there is no link between two selected workers, their project fails with certainty, leading to zero payoff.30 The number of links of worker $i$, his degree, is denoted by $D_i$. Then, the joint probability of being selected for a project and being partnered with a directly connected worker is given by (see Appendix C for details)

\[
s_i = \frac{2 D_i}{N (N - 1)}. \tag{1}
\]

This probability is proportional to the degree of an individual. This implies that workers with higher degrees will be selected more often into potentially profitable projects.31

30 A link or rather a good relationship between workers makes them better team partners. To simplify, we set the payoff of projects between unlinked workers to zero.
31 This is in line with Aral et al. (2012), who study project performance in a recruiting firm. They find that peripheral nodes, i.e. nodes that are not well connected, do fewer projects per unit of time than central nodes.

2. Information Every period is marked by a state of the world, $\theta$, which is high or low

\[
\theta = \begin{cases} \theta_h & \text{with probability } q \\ \theta_l & \text{with probability } 1 - q \end{cases}
\]

and iid. It is drawn after project teams are formed and is not observable to the workers. In the high (low) state, the project value is $2v_h$ ($2v_l$), with $v_h > v_l$. We assume that the payoff of the project is split equally among the project partners.32 In the following, we show how a worker’s network structure affects his information about the state of the world. Each worker obtains a signal about the state (with a signal value of one (zero) indicating the high (low) state) but he can also observe the signals of workers he is directly or indirectly connected to. We denote the probability of a correct signal by $p$ and assume that signals are informative with $p > 1/2$. Since we focus on ego networks, we distinguish between the number of signals a worker obtains from himself and his direct friends, $n_{int,i} = D_i + 1$, and the signals he obtains from external sources including indirect connections, $n_{ext,i}$. This enables us to vary the baseline amount of information below. We denote by $n_i = n_{int,i} + n_{ext,i}$ the overall number of signals of worker $i$. Based on his observed signals, a worker then computes a sufficient statistic $y_i$, which is the number of high signals out of all observed signals, that is $y_i \in \{0, 1, \ldots, n_i\}$. Note that two project partners hold the same information. Based on $y_i$, the posterior probability of being in the high state, $Pr(\theta_h | y_i)$, is computed via Bayesian updating and thus having a higher number of signals gives a more precise posterior. The project value, $\pi(y_i)$, is then given by $\pi(y_i) = Pr(\theta_h | y_i) v_h + (1 - Pr(\theta_h | y_i)) v_l$. To summarize, the network structure matters as a higher degree gives a higher number of internal signals, which in turn affects the expectation about the project value.

3. Choice of Effort The paired workers simultaneously choose what effort, $e_i \geq 0$, $\forall i$, to exert on the project. This effort is costly with all workers facing the same cost function $c$, which we assume quadratic for simplicity, i.e. $c(e) = e^2/2$. Given that the project certainly fails if the two project partners are not connected, we focus on the effort choice of two directly linked team mates. Effort makes project success more likely.
The probability that the project is completed if effort choices are $e_i$ and $e_j$ is given by $f(e_i, e_j) \in [0, 1)$. To ensure that $f(e_i, e_j)$ is strictly smaller than one, we assume that effort is bounded.33 This implies that success cannot be guaranteed. Further, we make some natural assumptions on the success function $f$, namely that it is twice continuously differentiable, increasing and concave in each argument, that it has constant returns to scale and is symmetric in both arguments.34 Moreover, we assume that $f$ is strictly super-modular, $f_{12} = f_{21} > 0$, implying that effort levels of the workers are strategic complements. We focus on complements as the natural benchmark for a team problem: With substitutes a worker should complete the project by himself, circumventing the team moral hazard problem that stems from the individual team partner bearing the full cost of effort but only obtaining a share of the project value. Finally, if one team member chooses zero effort, the project fails for sure. After effort has been chosen, the project outcome – success or failure – is realized. A worker’s payoff is his share of the project value minus cost of effort.

These three stages – worker selection, information acquisition, and effort choice – occur in both periods. What differs across periods is information (i.e. the signals workers obtain) and the effect of peer pressure (which impacts effort only if today’s project outcome matters for tomorrow’s). Effort depends on information through the sufficient statistic $y$. It depends on peer pressure because publicly observable past project outcomes affect current relationships between workers, especially when the network is characterized by high clustering. We outline this peer pressure channel here informally and defer the formal discussion to Appendix C. We assume that a project failure leads to discord among project partners, negatively affecting their friendship. We further argue that this discord between partners also spreads to common friends. This idea is based on the well-established structural balance theory: Triads of friends are only stable as long as the relationships are balanced. Suppose that $i$, $j$ and $l$ are all directly connected. Initially, all three relationships are intact. Then, $i$ and $j$ work on a project together that fails, affecting not only their link but rendering the entire triad unstable. This instability is resolved by the workers taking sides. To simplify our analysis, we assume that all relationships in a triad will turn bad after a project failure.35 This is why project failures affect workers with high clustering more than those with low clustering: they are deprived of more future project opportunities. In sum, each project failure negatively affects relationships, whereas a project success means that all directly connected workers remain on good terms. We denote the quality of the relationship by $\gamma \in \{\gamma_b, \gamma_g\}$, that is the relationship can be bad or good. A relationship between $i$ and $j$ turns bad after a project failure if in the previous period either (1) $i$ and $j$ were teamed up or (2) $i$ or $j$ were teamed with a common friend.

In each period, a strategy of an agent maps his signals $y$ and the state $\gamma$ into an effort level, where we focus on pure strategies. Given that both the relationship-status and signals are observable for both team partners, our equilibrium notion is public perfect equilibrium (PPE). This is a strategy profile that satisfies the usual requirements of being mutually best responses (Nash equilibrium) and sequentially rational. See Appendix C for the formal definition of strategies and equilibrium (Definition 2). In our setting a higher degree leads to more signals, allowing for a more precise belief about the project value. Higher clustering, on the other hand, leads to a larger number of bad relationships after a project failure and therefore incentivizes effort through peer pressure. This is the main trade-off we are focussing on. To show in more detail how peer pressure influences effort choices, we focus on a dynamic setting.

32 We impose the equal split assumption as we aim for a model in which agents are perfectly symmetric except for their network. This allows us to show the effects of network structures in the cleanest way possible.
33 That is $e_i \in [0, e_{max}]$ where $f(e_{max}, e_{max}) < 1$. By choosing an appropriate bound on $v_h$, we can guarantee an interior solution $e \leq e_{max}$.
34 Note that two of our key results, Propositions 2 and 3, merely require convex costs, but not constant returns to scale, implying that our results hold more generally. We have restricted attention to a more stylized setting as it generates closed form solutions that make the interpretation of the effects of information and peer pressure simple.
35 Our assumption is a simplification of the following idea: When a project fails, a worker faces, with positive probability, more than one negative connection if he and the project partner had common friends, but only one negative connection if the project failed with someone with whom he has no common friend.
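The selection and information channels of the set-up can be made concrete numerically. The sketch below is an illustration under explicit assumptions, not part of the paper's analysis: signals are taken to be independent conditional on the state (so that the count of high signals is a sufficient statistic), and the function names are hypothetical. It evaluates the selection probability in (1), the posterior $Pr(\theta_h | y)$ obtained by Bayes' rule, and the project value $\pi(y)$.

```python
def selection_probability(degree, n_workers):
    """Equation (1): probability of being drawn together with a linked partner."""
    return 2 * degree / (n_workers * (n_workers - 1))


def posterior_high(y, n, p, q):
    """Posterior probability of the high state after y high signals out of n.

    p is the probability that a signal is correct and q the prior of the high
    state. Signals are assumed conditionally independent given the state; the
    binomial coefficient cancels from numerator and denominator.
    """
    like_high = p ** y * (1 - p) ** (n - y)
    like_low = (1 - p) ** y * p ** (n - y)
    return q * like_high / (q * like_high + (1 - q) * like_low)


def project_value(y, n, p, q, v_high, v_low):
    """Expected per-worker project value pi(y)."""
    post = posterior_high(y, n, p, q)
    return post * v_high + (1 - post) * v_low


# A worker with degree 3 among 4 workers, and beliefs after two signals.
s = selection_probability(3, 4)
balanced = posterior_high(1, 2, p=0.7, q=0.5)   # one high and one low signal
confident = posterior_high(2, 2, p=0.7, q=0.5)  # two high signals
```

With one high and one low signal of equal precision the posterior returns to the prior, while two high signals move it towards the high state; additional signals therefore shift $\pi(y)$ further away from its prior mean.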

4.2 Effort Choice and Wages

Each team partner maximizes his payoff with respect to effort across two periods. We assume that in period 1 each worker is in a good relationship with everyone he is connected to and thus omit the dependence on the relationship state. The dynamic problem of team partner $i$ is then given by

\[
\max_{e_i, e_i'} \; f(e_i, e_j)\pi(y_i) + (1 - f(e_i, e_j))\,0 - c(e_i)
+ \beta s_i \Big( Pr_i(\gamma_g')\, E\big[ f(e_i', e_k')\pi(y_i') - c(e_i') \,\big|\, \gamma_g' \big] + Pr_i(\gamma_b')\, E\big[ f(e_i', e_k')\pi(y_i') - c(e_i') \,\big|\, \gamma_b' \big] \Big). \tag{2}
\]

We index second period variables by a prime and denote by $\beta$ the discount factor. The expectation is taken with respect to the distribution of the number of high signals in period two, $y_i'$. As each worker observes not only his own signal but also the signals of all workers he is (in)directly connected to, this implies that $y_i = y_j \equiv y$ and $y_i' = y_k' \equiv y'$. The expected payoff in the second period depends on whether a worker is selected for the project, which occurs with probability $s_i$. If he is chosen, he can either be teamed with someone he has a good or someone he has a bad relationship with. The probability of a good relationship with future partner $k$ is given by

\[
Pr_i(\gamma_g') = f(e_i, e_j) + (1 - r_{ij})(1 - f(e_i, e_j)), \tag{3}
\]

where $r_{ij} = C_{ij}/D_i$ is the probability that the second period’s team partners have a bad relationship after a first period failure between $i$ and $j$, and where $C_{ij}$ is a proxy for their common friends and thus clustering.36 Note that this probability is symmetric across first period’s project partners, $r_{ij} = r_{ji}$. Thus, worker $i$ has a good relationship with all his potential second period project partners only if his current project succeeds, which happens with probability $f(e_i, e_j)$. If it fails, then he only has a good relationship with his future partner if this partner is not the same as or a common friend of the current one, indicated by the joint probability $(1 - r_{ij})(1 - f(e_i, e_j))$. Similarly, the probability of a bad relationship state is given by $Pr_i(\gamma_b') = 1 - Pr_i(\gamma_g')$.

We solve problem (2) by backward induction, starting in the second period, where workers play a static game. Worker $i$ chooses effort to maximize his expected payoff, given by

\[
\max_{e_i} \; f(e_i, e_j)\pi(y) - c(e_i). \tag{4}
\]

Given our assumptions on $f$ and $c$, the first order condition of (4) is both necessary and sufficient for a maximum. The problem is symmetric for worker $j$. Based on the first order approach, we determine the pure strategy public perfect equilibria of the static game and denote by $e(y)$ the optimal strategy based on signals $y$.

Lemma 1 (Static Game).
1. Every public perfect equilibrium is symmetric: $e_i(y) = e_j(y) = e(y)$ $\forall y$.
2. For each $y$, there exist exactly two pure public perfect equilibria.
(a) Zero effort: $e(y) = 0$.
(b) Strictly positive effort:
\[
e(y) = f_1(1, 1)\pi(y). \tag{5}
\]

36 Formally, $C_{ij} = 1 + \sum_{k, k \neq i, k \neq j} g_{ik} g_{jk}$, where $\sum_{k, k \neq i, k \neq j} g_{ik} g_{jk}$ gives the number of common friends of $i$ and $j$. So, $r_{ij}$ is the probability that in the next period, worker $i$ is doing a project with someone who would be affected by today’s project failure, given that $i$ and $j$ are chosen for a project in period one.
37 Again, see Appendix C for the formal definition of these strategies.

All proofs are collected in Appendix D. Given the symmetry of the problem, both workers exert the same effort in equilibrium. Moreover, there exist two pure strategy PPE. There always exists an equilibrium where both project partners exert zero effort independently of signal realizations. It is a best response to choose zero effort given the partner chooses zero effort as, by assumption, $f(e_i, 0) = f(0, e_j) = 0$. But there also exists a PPE with strictly positive efforts. The uniqueness of the equilibrium with strictly positive effort follows from supermodularity and the constant returns to scale of $f$, as well as the convexity of the cost function. We now turn to the dynamic problem in the first period, where not only the signals, but also considerations about the relationship state with future project partners matter. In what follows, we focus on a strategy profile where, for any realization of signals, a worker exerts strictly positive effort if the relationship to the project partner is good, and zero effort if it is bad.37

Although there clearly exist other equilibria in this model, in Appendix C (see Equilibrium Selection) we make a case why this equilibrium is a reasonable one to focus on. We denote by $V_i^*(y') \equiv \max_{e_i'} f(e_i', e_k')\pi(y') - c(e_i')$ the maximized second period payoff when the relationship between $i$ and $k$ is good. Using this notation along with (2) and (3), the maximization problem of agent $i$ in the first period reads

\[
\max_{e_i} \; f(e_i, e_j)\pi(y) - c(e_i) + \beta s_i \big( f(e_i, e_j) + (1 - r_{ij})(1 - f(e_i, e_j)) \big) E[V_i^*(y')]. \tag{6}
\]

Similar to the static problem, we show that there exists a unique PPE in which both team partners exert strictly positive effort. With some abuse of notation, we denote the optimal effort function in period one and two by $e(y)$ and $e(y')$, and omit the relationship-state $\gamma$ as an argument, as effort is only strictly positive in case of a good relationship.

Proposition 1 (Dynamic Game).
1. Every PPE is symmetric: $e_i(y) = e_j(y) = e(y)$ $\forall y$ and $e_i'(y') = e_j'(y') = e'(y')$ $\forall y'$.
2. In both periods, there exists a unique PPE with strictly positive effort levels. That is, $\forall y, y'$:
\[
e(y) = f_1(1, 1)\big(\pi(y) + \beta s r E[V^*(y')]\big), \tag{7}
\]
\[
e'(y') = f_1(1, 1)\pi(y'). \tag{8}
\]
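The closed forms in (7) and (8) can be checked numerically for a particular success function. The sketch below assumes $f(e_i, e_j) = a\sqrt{e_i e_j}$, a hypothetical functional form chosen here only because it is symmetric, has constant returns to scale, is supermodular for positive efforts and satisfies $f(e_i, 0) = 0$; under it $f_1(1, 1) = a/2$. The scale `a`, the parameter values and the function names are illustrative assumptions, and `expected_value` stands in for $E[V^*(y')]$.

```python
from math import sqrt


def success(e_i, e_j, a):
    """Assumed success function f(e_i, e_j) = a * sqrt(e_i * e_j)."""
    return a * sqrt(e_i * e_j)


def second_period_effort(pi_next, a):
    """Equation (8) with f_1(1, 1) = a / 2."""
    return (a / 2) * pi_next


def second_period_value(pi_next, a):
    """Maximized second period payoff V*(y') at the symmetric effort level."""
    e = second_period_effort(pi_next, a)
    return success(e, e, a) * pi_next - e ** 2 / 2


def first_period_effort(pi_now, beta, s, r, expected_value, a):
    """Equation (7): current project value plus the peer pressure term."""
    return (a / 2) * (pi_now + beta * s * r * expected_value)


def first_period_objective(e_i, e_j, pi_now, beta, s, r, expected_value, a):
    """Objective (6) of worker i given the partner's effort e_j."""
    f = success(e_i, e_j, a)
    prob_good = f + (1 - r) * (1 - f)  # equation (3)
    return f * pi_now - e_i ** 2 / 2 + beta * s * prob_good * expected_value


# Illustrative parameter values (assumptions, not calibrated to the paper).
params = dict(pi_now=1.2, beta=0.9, s=0.1, r=0.5, expected_value=0.3, a=0.5)
e_star = first_period_effort(**params)
payoff_at_e_star = first_period_objective(e_star, e_star, **params)
```

Evaluating `first_period_objective` at efforts slightly above or below `e_star`, while holding the partner at `e_star`, yields a lower payoff, which is the best-response property behind the symmetric equilibrium. The peer pressure term `beta * s * r * expected_value` is what raises first period effort above the static level `(a / 2) * pi_now`.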

Lemma 1 established that in the second period there exists a unique equilibrium with strictly positive effort, which is symmetric (which we display again in (8)). Proposition 1 establishes that also in the first period, effort levels are symmetric. First, any two team workers have the same signals. Second, two workers must have the same number of common friends, Cij = Cji , and thus si rij = sj rji = sr (since si rij =

Cij 1 (N −1)N 2

). Therefore, we obtain $\beta s_i r_{ij} E[V_i^*(y')] = \beta s r E[V^*(y')]$. The proposition establishes that in both periods effort increases in the contemporaneous project values, $\pi(y)$ or $\pi(y')$. But in the first period there is an additional factor at play, captured by $\beta s r E[V^*(y')]$ in (7): the dynamic incentives of maintaining good relationships push first period effort up. The equilibrium effort determines the first and second period wages (which we use interchangeably with productivity). We focus on wages conditional on the state for a given team member, where we drop the subscript i when the wage is the same across team partners:

$$w(\theta) \equiv E[f(e(y), e(y))\, v \mid \theta] \tag{9}$$
$$w_i'(\theta, \theta') \equiv s_i \Pr_i(\gamma_g' \mid \theta)\, E[f(e'(y'), e'(y'))\, v' \mid \theta'], \tag{10}$$

where $\theta, \theta' \in \{\theta_l, \theta_h\}$ are the realized first and second period states and $v, v' \in \{v_l, v_h\}$ are the associated project values. The expectations are as usual taken over the number of high signals. We define these expected wages given that a certain state of the world has materialized. Recalling that q is the probability that the high state occurs, the expected wage across states can then be easily computed, e.g. $E[w] = q\,w(\theta_h) + (1 - q)\,w(\theta_l)$ for the first period. Note that the structure of both periods’ wages is the same in that agents obtain their share of output in case the project is successful. In the second period, however, one also has to take into account the joint probability of being selected and having a good friendship history with the project partner, given by $s_i \Pr_i(\gamma_g' \mid \theta)$. Since friendship histories matter, the second period expected payoff not only depends on contemporaneous effort but also on first period effort. Both periods’ wages are increasing in effort, highlighting the tight link between the agents’ actions and their rewards.

4.3 Degree and Information

We now turn to the effect of information on effort and wages. All else equal, a worker with a higher degree receives more signals about the state of the world and thus more information. We want to know how effort varies with the number of signals and how this depends on the environment’s underlying uncertainty.

Definition 1 (Uncertainty). We call a setting uncertain if all of the following features are given:
• high and low project values differ, $v_l \neq v_h$
• signals are not completely informative, $p \in \left(\tfrac{1}{2}, 1\right)$
• workers’ prior about the state reflects some uncertainty, $q \in (0, 1)$
• overall information is bounded, $n_i < \infty$.

In turn, by vanishing uncertainty we mean a situation in which any of the four requirements from Definition 1 is violated. We obtain the following result.

Proposition 2 (Degree, Effort & Wages). A higher degree leads to more information, which
1. unambiguously increases average first period effort only if the state is high.
2. increases (decreases) second period effort when the state is high (low).
3. unambiguously increases first and second period wages only if the state in both periods is high.
The impact of additional information on effort and wages in both periods vanishes as the underlying uncertainty vanishes.


First, information impacts effort through the belief about the current project value: A high signal leads to a more optimistic belief and therefore to higher effort. Since signals are informative, the expected project value, E[π(y)] (where the expectation is taken over the number of high signals y), increases in the number of signals conditional on the realized state of the world being high θ = θh , and decreases in the number of signals conditional on the state being low, θ = θl . Therefore, the more signals are available the more accurate is the worker’s posterior belief about the state of the world. In the high state, he exerts on average higher effort compared to a worker with lower degree. The opposite is true for the low state. As a result, in both periods E[e(y)|θh ] − E[e(y)|θl ] is increasing in information. Intuitively, workers with more accurate information, i.e. more signals, can better fine-tune their effort to the expected project reward. Based on this discussion, the second period effort increases in information if the state is high and decreases if the state is low. The first period effort, in turn, does not only depend on the first period project value, but also on the expected second period payoff, see (7). A higher E[V ∗ (y 0 )] translates into higher effort on average. We prove in Lemma 3 (Appendix D) that E[V ∗ (y 0 )] is increasing in the number of signals: Having more signals yields a more precise belief about the state and therefore allows each team to better adjust their efforts. Generally, being able to adjust effort optimally leads to higher payoffs, and this is why more signals lead to a higher value of the problem. In sum, a higher degree improves information about the state of the world and is particularly beneficial when the true state is high. In this case, additional signals induce the agents to put significantly more weight on the high state, translating into higher effort, project completion and productivity/wages. 
Notice that the effect of additional information on effort and thus wages is reinforced when the uncertainty of the underlying environment is considerable but dies out when uncertainty is small. The reason behind this result is that the expected project value becomes independent of the number of overall signals as uncertainty vanishes, that is if either (i) there is no difference between high and low project values; or (ii) signals are completely informative; or (iii) a worker’s prior reflects complete certainty about the state of the world; or (iv) overall information becomes abundant. In any of these cases, an agent does not need to rely on his network to learn about the state of the world.
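The role of the number of signals can be illustrated numerically. The following is a minimal sketch, not the paper's implementation: it assumes that each signal is correct with probability p independently across signals, that the prior on the high state is q, and that $\pi(y)$ is the posterior-expected project value after observing y high signals out of n. The function names are hypothetical.

```python
from math import comb

def posterior_high(y, n, p, q):
    """Posterior probability of the high state after y high signals out of n.
    Assumes i.i.d. signals that match the true state with probability p."""
    like_high = p**y * (1 - p)**(n - y)
    like_low = (1 - p)**y * p**(n - y)
    return q * like_high / (q * like_high + (1 - q) * like_low)

def expected_value_given_state(n, p, q, v_l, v_h, state_high):
    """E[pi(y) | state], averaging over the number y of high signals."""
    p_high_signal = p if state_high else 1 - p
    total = 0.0
    for y in range(n + 1):
        prob_y = comb(n, y) * p_high_signal**y * (1 - p_high_signal)**(n - y)
        mu = posterior_high(y, n, p, q)
        total += prob_y * (mu * v_h + (1 - mu) * v_l)
    return total

def state_gap(n, p, q, v_l, v_h):
    """Difference in expected project value between the high and the low state."""
    high = expected_value_given_state(n, p, q, v_l, v_h, True)
    low = expected_value_given_state(n, p, q, v_l, v_h, False)
    return high - low

if __name__ == "__main__":
    # Illustrative parameter values only (p in (1/2, 1), q in (0, 1)).
    for n in (1, 3, 5, 10):
        print(n, state_gap(n, p=0.7, q=0.5, v_l=0.0, v_h=1.0))
```

Under these assumptions the gap between the high-state and low-state expected project values widens with the number of signals, and it is zero when $v_l = v_h$ or when the prior is degenerate, in line with the notion of vanishing uncertainty in Definition 1.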

4.4 Clustering and Peer Pressure

Last, we analyze the effect of clustering on effort choices and wages. Clustering induces higher peer pressure, which attenuates the team moral hazard problem in the first period and thus affects first period effort and wages.

Proposition 3 (Clustering, Effort & Wages). Higher clustering increases peer pressure which leads to both higher first period average effort and higher first period wages independently of the state of the world. The effect of peer pressure (through clustering sr) on first period average effort is straightforward and unambiguously positive (see equation (7)). This channel is independent of both the true state of the world in period one and the underlying uncertainty. Peer pressure induces higher effort because a potential project failure today puts more friendships and thus future project opportunities in jeopardy. Since peer pressure works as a dynamic incentive, second period effort is unaffected by it. It then follows that peer pressure boosts the first period wage independently of the state and the underlying uncertainty. Only in the second period, the effect on wages is ambiguous: Peer pressure leads to higher first period effort (increasing P r(γg0 |θ) and thus the wage), but having many common friends also makes a non-intact relationship with the second period team partner more likely (lowering P r(γg0 |θ)).
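The direction of the first period effect can be read off equations (7) and (8). As a sketch, treat the peer pressure term $sr$ as a continuous parameter and assume $f_1(1,1) > 0$, $\beta > 0$ and $E[V^*(y')] > 0$ (sign assumptions made here for illustration). Since second period effort in (8) does not depend on $sr$, neither does $E[V^*(y')]$, so that

```latex
\[
\frac{\partial e(y)}{\partial (sr)} = f_1(1,1)\,\beta\,E[V^*(y')] > 0,
\qquad
\frac{\partial e'(y')}{\partial (sr)} = 0 .
\]
```

The first derivative is the same for every signal realization y, which is why the effect on first period effort does not depend on the state of the world.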

4.5 Peer Pressure versus Information

While the previous discussion has shown that clustering and degree impact effort in quite different ways, they can both have a positive effect on effort, depending on the level of uncertainty and the state of the world. We now show that these two network characteristics are complementary as the effort and the wage in period 1 exhibit increasing differences in clustering and degree:

Proposition 4 (Complementarity of Degree & Clustering). Higher peer pressure, i.e. more clustering, leads to a greater increase in average first period effort and the first period wage if the worker has a lot of information, i.e. a high degree. And vice versa.

More clustering has a greater positive impact on effort when the degree is high, i.e. when information is already abundant (and vice versa). Consequently, wages, which are a function of effort, also display complementarities in peer pressure and information. Our theory thus rationalizes why the different types of social capital that emerge from tight versus loose networks are complementary.

While the discussion so far has focussed on comparative statics effects of a single network characteristic holding other network characteristics fixed, we now turn to the more interesting but also more involved case of comparing two types of workers: one with higher degree but lower clustering (called D-worker) and one with lower degree but more clustering (denoted as C-worker), which is an empirically relevant case as clustering and degree are negatively
correlated in all of our datasets. We are interested in their relative performance depending on the underlying uncertainty of the environment. We therefore define the notion of comparative advantage in our context: the C-worker holds a comparative advantage in environments with lower uncertainty if his relative expected wage,

$$\frac{E[w_C]}{E[w_D]} = \frac{q\,w_C(\theta_h) + (1 - q)\,w_C(\theta_l)}{q\,w_D(\theta_h) + (1 - q)\,w_D(\theta_l)},$$

is increasing as uncertainty decreases.

Proposition 5 (Trade-Off Between Information and Peer Pressure).
1. Comparative Advantage: C-workers hold a comparative advantage in environments with low uncertainty.
2. Wage Dynamics: If a C-worker has a weakly lower first period wage than a D-worker, then he also expects a lower wage in the second period.

Our model predicts that workers with higher clustering and lower degree have a comparative advantage in environments characterized by less uncertainty relative to workers with less clustering and higher degree. Workers with higher degree obtain more information and thus have an advantage when information is valuable. But in environments with low uncertainty this is not the case and workers with higher clustering exert relatively more effort. This leads to higher wages for C-workers relative to D-workers, which underlines one of our key predictions: Clustering gains importance as uncertainty vanishes. Our model also predicts a strong impact of early career wage gaps on the future wage trajectory through peer pressure, which puts workers with high clustering but low information at a disadvantage. As a result, if there is a wage gap in the first period, it persists even if they perform equally well in the second period (i.e. even if uncertainty vanishes in the second period). Moreover, the second period wage gaps between C-workers and D-workers arise even if they exert the same effort in the first period.

5 Empirical Results

Our framework shows that the impact of network structure on labor market outcomes (effort and wages/productivity) depends on the underlying uncertainty of the work environment. We will first provide empirical support for the three main predictions of our model that link network structure to outcomes, based on the AddHealth data. Out of the three data sets that we used to analyze gender networks, this is the one that contains the richest set of labor market outcomes, allowing us to link them to network characteristics. In a second step, we will bring the attention back to the differences in network structure across gender that we have documented initially. Guided by our theoretical predictions, we want to assess whether differences in networks across gender have implications for the gender gap in productivity and wages. We base this second part of the analysis on the data of academic computer scientists as well as on the US Census. We describe these data sources and the variables of interest in more detail in Appendix B.

5.1 Differences in Networks and Labor Market Outcomes

Evidence from AddHealth

We use the AddHealth data to empirically assess our model’s predictions as it contains the richest set of labor market outcomes in addition to the network characteristics. We seek to test three predictions:

Prediction 1: Workers with a higher clustering coefficient (and/or lower degree) have relatively higher wages in less risky environments and therefore a comparative advantage in such settings.

This prediction is based on Proposition 5, part 1. We use AddHealth data to analyze whether individuals with a relatively high clustering coefficient (and/or relatively low degree) earn more in less ‘risky’ occupations. To test this prediction, we first need to construct a representative measure of occupational ‘risk’. We do so using data from the US Census during time periods that match the waves of AddHealth we focus on (waves 3 and 4). We measure occupational risk by the standard deviation of residual earnings by occupation in a Mincer-type wage regression (see Appendix B for the details). The standard deviation of the residual earnings in an occupation is a measure of the wage variation that is associated with this occupation and cannot be predicted based on commonly used observable controls. As such, this measure of unpredicted wage risk is closely related to the uncertainty in our model. In Figure 1, we plot this measure of occupational risk for 21 occupation groups, which correspond to occupations available in wave 3 of AddHealth:38 While occupations like legal occupations or management are high risk occupations, occupations in education or health support are considered low risk based on our measure. In Figure 2, Appendix B, we plot our measure of occupational risk against occupational mean wages, which reveals a risk-return trade-off.
[Figure 1: Occupational Risk By Occupation. Horizontal axis: Risk (tick marks from .45 to .7). Occupation groups shown: Legal Occ, Sales, Arts/Entertainment, Management, Personal Care Service, Construction, Healthcare Practitioner, Maintenance, Business/Finance, Food Services, Transportation, Social Science, Education, Computer/Math Occ, Production, Social Services, Installation, Protective Services, Healthcare Support, Admin, Architecture/Engineer.]

38 Wave 4 contains more disaggregated occupational codes, but we aggregate them up to these 21 groups throughout for consistency.

To test how the impact of networks on labor income differs depending on the riskiness of the environment, we regress annual log earnings in wave 3 of AddHealth (which corresponds to years 2000-2001) on network characteristics (that were assessed in wave 1 of AddHealth), network characteristics interacted with occupational risk and network characteristics interacted with each other:

$$\text{LogEarnings}_{it} = \beta_0 + \beta_1 \text{Risk}_{jt} + \beta_2 \text{Clustering}_i + \beta_3 \text{Degree}_i + \beta_4 (\text{Degree}_i \times \text{Clustering}_i) + \beta_5 (\text{Risk}_{jt} \times \text{Degree}_i) + \beta_6 (\text{Risk}_{jt} \times \text{Clustering}_i) + x_{it}^T \gamma + \varepsilon_{it} \tag{11}$$

where $\text{LogEarnings}_{it}$ are log annual earnings of individual i at time t = wave 3 reported in AddHealth39, $\text{Risk}_{jt}$ is the riskiness of occupation j at time t measured based on Census data, $\text{Clustering}_i$ and $\text{Degree}_i$ are clustering coefficient and degree of individual i, $x_{it}$ is a vector of individual-level controls (such as work experience, education, hours, tenure, etc.) and $\varepsilon_{it}$ is a mean-zero error term.

39 Note that across waves 3 and 4 of AddHealth we do not have a consistent measure of hourly pay, which is why we use annual earnings instead but control for hours worked. With some abuse, we use the words ‘wages’ and ‘earnings’ interchangeably here.

Our main focus is on how network characteristics affect earnings. In particular, we are interested in the non-linear effects of network characteristics on earnings depending on the riskiness of the occupation (coefficients $\beta_5$ and $\beta_6$): our model predicts that $\beta_6$ is negative, meaning that a high clustering coefficient is disadvantageous in risky environments; and that $\beta_5$ is positive, meaning that a high degree is advantageous in risky environments. The results of our main specification (Table 25, column (1), Appendix B) show that a higher clustering coefficient is associated with lower wages but only in risky occupations. If the environment is completely safe ($\text{Risk}_{jt} = 0$, i.e. no unanticipated wage risk) then a one standard deviation increase in the clustering coefficient has a large positive effect on earnings: it increases yearly earnings by 19%, meaning that higher clustering is in general advantageous. However, in riskier occupations this effect flips and becomes negative due to a sizable negative coefficient on the interaction term Risk × Clustering Coefficient. The interpretation is that increasing an individual’s clustering coefficient by one standard deviation decreases annual earnings by 4.8% if he/she works in the most risky occupation of our sample (legal occupations, with a risk measure of about 0.72), a large wage penalty.40 We view this as supportive evidence for our model since workers with high clustering earn relatively more in less risky environments, implying a comparative advantage in those settings. Furthermore, clustering has a positive impact on earnings, indicated by coefficient $\beta_2$, in line with our model that predicts a positive effect of peer pressure on starting wages independent of the state of the world (Proposition 3). In turn, degree (coefficient $\beta_3$) has no significant impact on earnings. This may, at least in light of our theory, not be too surprising since the effect of degree on wages is generally ambiguous unless the state of the world is high (Proposition 2, part 3). When interacted with the occupation’s riskiness, the effect of degree on earnings becomes positive (high degree is advantageous in risky occupations, as the theory predicts) but remains insignificant. Most other variables have the expected impact on earnings: hours, age and prior work experience have a positive effect while years of education for these young workers has a negative effect.41 We conduct various robustness checks: Our baseline results also hold when controlling for gender in column (2) of Table 25, as well as for marital status where we interact marital status also with the gender dummy in column (3). Moreover, the coefficient of interest remains nearly unchanged if we also control for race in column (4). Furthermore, these results are even more significant when controlling for in-degree instead of degree (Table 27) and are to some extent robust to controlling for out-degree (Table 26).
Last, we show robustness of these results when running these regressions for network measures that are constructed in alternative ways, which are available upon request.42 Note that, contrary to wave 3, in wave 4 of AddHealth we do not find significant effects of network characteristics on earnings.

Prediction 2: The network characteristics clustering coefficient and degree are complementary for productivity.

Our second prediction is based on Proposition 4: a high clustering coefficient is particularly advantageous for labor market outcomes when the degree is high. That is, when information is already abundant, more clustering is particularly beneficial.

40 The overall effect was then calculated as $\partial \text{LogEarnings}_{it} / \partial \text{Clustering}_i \,|_{\text{Risk}_{jt} = 0.72} = \beta_2 + \beta_6 \times 0.72 = -4.8\%$. Note that in this computation, we evaluate degree at its mean (= 0), which is why $\beta_4$ drops.
41 At young age, more years of education means less work experience and less human capital accumulated on the job, which is why higher education implies lower starting wages compared to those that have begun working much earlier.
42 Recall that we restrict attention to friends that can be uniquely identified and attend the same school. We additionally calculate our network measures taking into account external friends and students without an identifier, see Table 6. Using these measures in our regressions does not affect our results qualitatively.
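The risk-dependent marginal effect of clustering in model (11), evaluated in footnote 40, can be sketched as simple arithmetic. The coefficient values below are not taken from Table 25: they are backed out from the two effects quoted in the text (19% at Risk = 0 and -4.8% at Risk = 0.72), treating percentages as log points, so they are rounded and purely illustrative assumptions. Degree is held at its mean of zero, as in footnote 40, and the variable names are hypothetical.

```python
# Effects of a one standard deviation increase in clustering, as quoted in the text.
effect_at_zero_risk = 0.19    # Risk = 0
effect_at_max_risk = -0.048   # Risk = 0.72 (legal occupations)
max_risk = 0.72

# Implied coefficients (assumption: backed out from the quoted effects, not from Table 25).
beta_2 = effect_at_zero_risk
beta_6 = (effect_at_max_risk - beta_2) / max_risk

def clustering_effect(risk):
    """Marginal effect of clustering on log earnings at a given occupational risk,
    with degree evaluated at its mean of zero so that the beta_4 term drops."""
    return beta_2 + beta_6 * risk

# Risk level at which the implied effect of clustering changes sign.
break_even_risk = -beta_2 / beta_6

if __name__ == "__main__":
    print("implied beta_6:", beta_6)
    print("implied break-even risk:", break_even_risk)
    for risk in (0.45, 0.55, 0.65, 0.72):
        print(risk, clustering_effect(risk))
```

Under these rounded values the implied sign change falls inside the range of risk values spanned by the horizontal axis of Figure 1, which is consistent with the statement that the effect of clustering flips in riskier occupations.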

We test this prediction in the AddHealth data using the same model (11) that we used for our earnings analysis above. We are now interested in how the interaction Degreei × Clusteringi impacts earnings. Our model predicts that β4 is positive. We find that this is robustly true in the data, see Table 25, columns (1)-(4). Unlike with the previous prediction, the complementarity between clustering and degree also holds in wave 4, the results are available upon request. Prediction 3: Wage gaps in starting wages are persistent, especially for workers with high clustering. This prediction is based on Proposition 5, part 2. We want to test whether the initial earnings disadvantage for workers with high clustering coefficient is persistent. We test this prediction by focusing on the subsample of AddHealth workers who are present in both waves (wave 3 and wave 4) and who work in the same occupation in both waves (i.e. a sample of stayers). This way we can keep the riskiness of the individuals’ occupation almost constant over time.43 This is in line with our model, where we consider wage dynamics for a given riskiness of the environment. The downside of this approach is that the sample is reduced to roughly 600 observations. Ideally we would like to measure the initial wage disadvantage of workers with high clustering on the occupational level but it is not straightforward how to obtain such a measure in the data. We therefore proxy the initial wage gap between workers with high clustering and those with low clustering in a different way, namely by the occupational gender earnings gap. The underlying assumption is that the previous period’s gender wage gap in a worker’s occupation proxies the advantage in starting wages of workers with low clustering over workers with high clustering – something we believe is reasonable given our evidence that women are characterized by a higher clustering coefficient in AddHealth. 
43 We re-compute our occupational riskiness measure for the time period of wave 4 based on data from the American Community Survey in 2005-07 but it is similar to the one based on the data from the 2000 Census.

In order to ensure representativeness of those gender earnings gaps at the occupational level, we choose to measure them in the US 2000 Census and then merge them into AddHealth as opposed to measuring them directly in AddHealth based on a small sample. We regress individual earnings growth on individual network characteristics and the previous period’s occupational gender earnings gap, where we also interact network characteristics with the gender earnings gap. That is, we estimate an equation of the form

$$\Delta_{t+1,t}\text{Earnings}_{ij} = \beta_0 + \beta_1 \text{EarningsGap}_{jt} + \beta_2 \text{Clustering}_i + \beta_3 \text{Degree}_i + \beta_4 (\text{EarningsGap}_{jt} \times \text{Clustering}_i) + \beta_5 (\text{EarningsGap}_{jt} \times \text{Degree}_i) + x_{it}^T \gamma + y_{jt}^T \alpha + \varepsilon_{ijt} \tag{12}$$

where again i indicates an individual and j his/her occupation, and where t = wave 3 and t + 1 = wave 4 of AddHealth since these are the two waves for which we have labor market outcomes. Note that $\Delta_{t+1,t}\text{Earnings}_{ij}$ is defined as individual i’s log difference in earnings between t and t + 1 in occupation j, $\text{Degree}_i$ and $\text{Clustering}_i$ are our usual network characteristics, $x_{it}$ is a vector of observable characteristics of individual i at time t (when available we also control for observables in t + 1) and $y_{jt}$ is a vector of occupational characteristics at time t, including occupational mean wage and occupational risk both measured in the 2000 Census. Note that

$$\text{EarningsGap}_{jt} = \log\left[\frac{E(\text{earnings}_{ijt} \mid \text{occ}_{it} = j,\ \text{gender}_i = \text{male})}{E(\text{earnings}_{ijt} \mid \text{occ}_{it} = j,\ \text{gender}_i = \text{female})}\right]$$

is the gender earnings gap in occupation j at time t, measured in the 2000 Census; it is zero when women and men earn the same in occupation j, it is positive if men earn more and negative if women earn more. Note that all variables in (12) that control for occupational characteristics are measured in the US Census to increase representativeness, while all individual-level variables are measured in AddHealth. Finally, $\varepsilon_{ijt}$ is a mean-zero error term.

Our main focus is on the interaction term $\text{EarningsGap}_{jt} \times \text{Clustering}_i$: Our theory predicts that $\beta_4$ is negative, meaning that the initial occupational gender earnings gap has a particularly adverse impact on workers with high clustering coefficient (who are predominantly women). More precisely, increasing the clustering coefficient by one standard deviation changes earnings growth by $(\beta_2 + \beta_4 \text{EarningsGap}_{jt}) \times 100$ percent where the non-linear part is important: if the earnings gap favors men in t (in which case $\text{EarningsGap}_{jt}$ is positive), then an increase in clustering by one standard deviation pushes down earnings growth by $\beta_4 \text{EarningsGap}_{jt}$ compared to an occupation where there was no gender earnings gap in t. The estimation results are reported in Table 28.
Column (1), our baseline specification, indeed suggests that having a high clustering coefficient is a disadvantage particularly in occupations where the gender earnings gap was high initially. The coefficient $\beta_4$ is negative and significant at the 10% level. To quantify the effect, suppose there is an occupation in which men earn twice as much as women in the initial period. In this occupation, an increase in clustering by one standard deviation implies a change in earnings growth of -17%.44 If we disregard the positive effect $\beta_2$ (the non-interaction term of clustering is borderline significant here), then a one standard deviation increase in clustering even leads to a 31% decline of earnings growth in occupations where men earn twice as much initially. This result is robust to including additional controls. We control for occupational characteristics such as occupational mean earnings and occupational risk in column (2), for growth in hours worked between wave 3 and wave 4 in column (3), for gender and marital status in column (4), and for race in column (5). The result becomes even more significant when controlling for in-degree instead of degree (Table 29) and it is also robust to controlling for out-degree (Table 30), as well as to using our alternatively computed network measures (see above for how we compute them; these results are available upon request). Overall, the result that an initial earnings disadvantage, as proxied by the occupational gender wage gap, impedes wage growth of those workers with high clustering seems remarkably robust and the magnitude of the effect is sizable.

44 This is computed as $(\beta_2 + \beta_4 \text{EarningsGap}_{jt}) \times 100 = (0.1322 - 0.44 \times \log(2)) \times 100 = -17\%$.
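The quantification in footnote 44 can be reproduced with a few lines of arithmetic. This is a sketch using the rounded coefficients printed in the footnote (0.1322 and -0.44); the function and variable names are hypothetical.

```python
from math import log

beta_2 = 0.1322   # coefficient on Clustering, as printed in footnote 44
beta_4 = -0.44    # coefficient on EarningsGap x Clustering, as printed in footnote 44

def clustering_effect_on_growth(earnings_gap, include_level_effect=True):
    """Percent change in earnings growth from a one standard deviation increase
    in clustering, for a given occupational gender earnings gap (in logs)."""
    level = beta_2 if include_level_effect else 0.0
    return (level + beta_4 * earnings_gap) * 100

# Men earn twice as much as women: EarningsGap = log(2).
gap_men_earn_double = log(2)
total_effect = clustering_effect_on_growth(gap_men_earn_double)
interaction_only = clustering_effect_on_growth(gap_men_earn_double, include_level_effect=False)

if __name__ == "__main__":
    print("total effect (percent):", total_effect)
    print("interaction term only (percent):", interaction_only)
```

With these rounded inputs the total effect is about -17.3 percent and the interaction term alone is about -30.5 percent, close to the 31% quoted in the text, which may be based on the unrounded estimate.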

5.2 Gender Differences in Networks and Labor Market Outcomes

We have documented two sets of empirical results: (i) men and women differ in their network structures with women being characterized by higher clustering and men by higher degree; (ii) high clustering is disadvantageous for labor market outcomes in risky environments. In this section, we aim to connect these results. We will provide suggestive evidence that differences in network structures across gender can help explain disparities in male and female labor market outcomes. We will base this analysis on our data of academic computer scientists and the US Census. Evidence from Academic Computer Scientists We supplement our findings from AddHealth by a case study of one particular occupation: computer scientists in research. This occupation is somewhat exceptional in that our occupational risk measure based on unexplained wage dispersion is not a good proxy for its riskiness: First, in the Census data that we used to compute occupational risk, we have very few observations that work in academic/research computer scientist jobs. Second, in many countries, academic positions are public sector jobs with heavily regulated wages, which naturally decreases (unexplained) wage dispersion. Therefore, wage dispersion is not the preferable measure to classify this occupation as risky or not (nor are wages the preferred performance measure; we discuss alternative performance measures below). Nevertheless, we argue that the occupation computer scientists in academic/research environments is characterized by complex and, especially, uncertain tasks. The success of research and patents is difficult to foresee at the time of production. Moreover, there is a considerable amount of uncertainty stemming from the lack of job security before tenure. We therefore view this occupation as intrinsically risky. 
We want to test whether researchers with high degree and low clustering perform better in this risky environment compared to those with the opposite network structure and whether this has implications for the gender productivity gap. In order to measure the productivity of computer scientists, we randomly subsample about 29 000 of them from our dblp data set and
scrape their Google Scholar profiles (10 932 women and 17 884 men).45 This subsample has the same average clustering coefficient but a higher average degree than the full sample.46 Based on the Google Scholar data, we measure performance (or productivity) according to three indicators: h-index, i10-index and number of citations. The i10-index is the number of publications with at least 10 citations; this index was created by Google Scholar and is high for scholars with many publications (even if they have as few as 10 citations). The h-index equals the number h when the scholar has published h papers each of which has been cited in other papers at least h times. Thus, contrary to the i10-index, the h-index does not only measure quantity of publications but also takes quality into account, where quality can be interpreted to be higher when the number of citations is higher. The h-index is high if the scholar has many publications and if they are heavily cited. Another advantage of the h-index is that it is widely used to assess productivity of researchers while the i10-index is only used by Google Scholar. The h-index is our preferred measure of productivity. The summary statistics of these performance measures are in Table 35 (Appendix B). Computer science is an academic field where scholars generally have many citations, with an average of about 6 200. The productivity according to all three performance measures varies widely in the sample. Naturally, the i10-index is higher on average than the h-index. While these performance measures are positively correlated, the correlation is far from perfect (Table 36). To test whether high degree and low clustering enhance performance in this risky environment, we regress each of the three performance measures on these network characteristics. 
The results are reported in Table 37, Appendix B: For a given clustering coefficient, an increase of degree by one increases the h-index by .17, the i10-index by 1.28 and citations by 59. In turn, for a given degree, increasing the clustering coefficient from 0 to 1 decreases the h-index by 11, the i10-index by 35 and citations by 3056.

Given our initial finding that computer scientists show large disparities in networks across gender, with male scientists having higher degrees but female scientists having higher clustering coefficients, we now ask whether these differences in networks impact the gender productivity gap.47 We first want to highlight that there are interesting performance differences across gender. We regress each of the three performance measures on a gender dummy, which equals 1 if the scholar is female and 0 if the scholar is male (Table 39). While being a woman increases the i10-index by 2.1 (the coefficient is however not statistically significant), it decreases the h-index by about 2. Our interpretation is that women tend to have more publications of little impact compared to men (captured by a higher i10-index). In turn, men have fewer publications but those they have are of higher impact/quality based on the citation count (captured by a higher h-index). Women also have overall more citations but the gender difference is not significant.

We are interested in how much of the performance gap in the h-index is due to women’s disadvantageous network characteristics in this risky environment. Similarly, we want to understand how much larger the women’s lead in the i10-index and citation count would be if they were not characterized by high clustering and low degree. The results are in Table 40. When focusing on the h-index, our preferred performance measure, we observe that differences in network characteristics across gender account for a significant share of the gender gap, column (1): Controlling for network characteristics makes the (absolute value of the) gender coefficient drop by about 15% compared to the regression where we only controlled for gender (column (1) of Table 39). This suggests that 15% of the gender performance gap in the h-index among computer scientists is due to differences in network characteristics. We perform similar exercises for the other performance measures, the i10-index and citations. Women have an advantage over men in both of these measures. Interestingly, controlling for network characteristics improves female performance even more (compare columns (2) and (3) across Tables 39 and 40). All of this suggests that women’s network characteristics hold them back. If female computer scientists did not have their disadvantageous network characteristics they might perform better according to our measures of performance.

45 We could in principle extract more Google Scholar profiles than 29 000. We drew a line here since this process is very time-consuming and computationally intensive. Note that the scraping is done by a standard web crawler using Python.
46 While in the full sample the average degree was 7, here it is roughly 10.
47 This subsample of 29 000 scholars shows similar differences in network characteristics across gender as the full sample, see Table 38.
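The contrast between the two indices discussed above can be made concrete with a short sketch. The functions follow the verbal definitions given earlier (h-index: the largest h such that h papers have at least h citations each; i10-index: the number of papers with at least 10 citations). The two citation profiles are made-up illustrations, not data from the sample.

```python
def h_index(citations):
    """Largest h such that at least h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

def i10_index(citations):
    """Number of papers with at least 10 citations."""
    return sum(1 for cites in citations if cites >= 10)

# Hypothetical profiles: many modestly cited papers vs. fewer heavily cited ones.
many_modest = [10] * 30
fewer_heavy = [100] * 15

if __name__ == "__main__":
    for name, profile in (("many_modest", many_modest), ("fewer_heavy", fewer_heavy)):
        print(name, "h-index:", h_index(profile), "i10-index:", i10_index(profile))
```

In this illustration the first profile has the higher i10-index while the second has the higher h-index, which is the pattern the interpretation above attributes to quantity versus impact of publications.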
We view these results of how differences in gender networks may account for productivity gaps as an interesting implication of our theory.

US Census and American Community Survey

Taking our evidence together, our model makes predictions about the gender earnings gap in risky versus safe occupations. In particular, (i) our analysis of earnings that showed an earnings penalty for having a high clustering coefficient in risky environments and (ii) the fact that women have a higher clustering coefficient than men, lead to the following prediction: The gender earnings gap should be larger in risky occupations. Since we carry out this test on the occupational level without directly alluding to network characteristics, we use US Census data which is more representative of the whole population than AddHealth. Our main specification estimates the effect of occupational risk on the occupational gender earnings gap:

$$\text{EarningsGap}_{jt} = \beta_0 + \beta_1 \text{Risk}_{jt} + y_{jt}^T \gamma + \epsilon_{jt} \qquad (13)$$

where $\text{EarningsGap}_{jt}$ is the occupational gender earnings gap defined as above, $\text{Risk}_{jt}$ is our measure of occupational risk, $y_{jt}$ is a vector of other controls at the occupational level such as the hours gap or the education gap, and $\epsilon_{jt}$ is a mean-zero error term. Our main focus is on $\beta_1$, the coefficient of $\text{Risk}_{jt}$. In order to highlight the role of risk in the occupational gender earnings gap, we first estimate model (13) without controlling for occupational risk. Column (1) in Table 41 shows the results based on the 2000 Census (which coincides with the time of wave 3 of AddHealth). Variation in the observed characteristics of occupations, $y_{jt}$, goes a long way in explaining the variation of the gender earnings gap at the occupational level, indicated by an adjusted $R^2$ of 0.8. For instance, and quite intuitively, the gender wage gap is positively related to the gender gap in hours or the gap in educational attainment.48 We then ask whether the riskiness of an occupation provides additional explanatory power. Column (2) of Table 41 includes our measure of occupational risk: Its coefficient is positive, sizable and significant. The gender wage gap is larger in riskier occupations, in line with our previous evidence. A one percent increase in occupational risk leads to a 0.6 percent increase in the gender earnings gap. Including occupational risk as a regressor increases the adjusted $R^2$ by almost 5 percentage points, meaning that the explanatory power of the model has increased significantly when risk is accounted for.
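The comparison between columns (1) and (2) amounts to estimating (13) with and without the risk regressor and comparing adjusted $R^2$. The sketch below illustrates that logic with a small pure-Python OLS routine. It is not the estimation code behind Table 41: the eight "occupations", their hours gaps, risk values and the coefficients used to generate the earnings gap are all made-up assumptions.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def ols(y, columns):
    """OLS with an intercept; returns (coefficients, adjusted R^2)."""
    n = len(y)
    x = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(x[0])
    xtx = [[sum(row[p] * row[q] for row in x) for q in range(k)] for p in range(k)]
    xty = [sum(x[i][p] * y[i] for i in range(n)) for p in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in x]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k)
    return beta, adj_r2


# Hypothetical occupation-level data (assumptions for illustration only).
hours_gap = [0.05, 0.10, 0.15, 0.20, 0.08, 0.12, 0.18, 0.25]
risk = [0.2, 0.9, 0.4, 0.7, 0.1, 0.8, 0.3, 0.6]
noise = [0.01, -0.01, 0.005, -0.005, 0.0, 0.01, -0.01, 0.0]
earnings_gap = [0.05 + 0.8 * h + 0.3 * r + e
                for h, r, e in zip(hours_gap, risk, noise)]

beta_restricted, adj_restricted = ols(earnings_gap, [hours_gap])
beta_full, adj_full = ols(earnings_gap, [hours_gap, risk])
print("adjusted R^2 without risk:", round(adj_restricted, 3))
print("adjusted R^2 with risk:   ", round(adj_full, 3))
print("estimated coefficient on risk:", round(beta_full[2], 3))
```

Because the simulated earnings gap depends on risk by construction, adding the risk column raises the adjusted $R^2$; in the paper the corresponding question is an empirical one.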
For robustness, we repeat the estimation for other time periods based on the 1990 Census (Table 42) and the American Community Survey for 2005-07 (Table 43), which shows that this result is stable over time.49

The Literature

Our findings are in line with several studies showing that earnings and performance gaps between men and women are especially large in occupations that are characterized by uncertainty like those in the financial sector, the film industry and basic research. First, the within-occupational wage gap is particularly severe in management occupations, especially for financial managers and chief executives. The evidence suggests that women’s lower earnings in financial management and executive occupations are especially due to large differences in performance pay and bonuses.50 A second well-studied sector where gender inequalities persist is the film industry (Lutter (2012) and Lutter (2013)), where women create lower box office revenues from movies. This industry is highly project-based, with tasks that have uncertain outcomes. Ferriani et al. (2009) argue that the film market requires fast adjustment to new work environments since film ventures operate under constant uncertainty and have to foresee ex-ante whether the project opportunity is valuable. They argue that information is crucial to identify potentially successful scripts and to assemble the right project team. Based on the finding that producers who are more central in their network (i.e., have more access to information) are more likely to increase the box office revenue from a movie, the authors conclude that social networks provide crucial access to information. In a similar vein, Lutter (2013) documents that women with loose information-based networks perform better in the film industry than women with dense networks, supporting our hypothesis that information is the key to success in uncertain environments.

A third well-known area for gender disparities is the market for patents. Hunt et al. (2012) document that women in the US are much less likely to be granted a patent than men, with women holding only 5.5% of commercialized patents. Gabbay and Zuckerman (1998) document that in basic research, which is typically characterized by complex, uncertain tasks, scientists benefit from sparse networks with many holes, whereas in applied research, which is typically characterized by non-complex, certain tasks, scientists benefit from dense networks. In line with this view, Ding et al. (2006) argue that an important reason for the gender wage gap in patenting is that women’s networks are less effective: In relying more on close relationships, they lack access to industry contacts.

We offer a unified theory for our empirical findings and these findings in the literature. Our model provides a new network-based mechanism for why men outperform women under uncertainty.

48 The experience gap takes potential experience as opposed to actual experience into account and thus misses maternity leaves, which are an important reason why women have less actual experience. This is possibly why the sign on this variable is counterintuitively negative.
49 The advantage of the American Community Survey is that more years are available so we can pool observations across years. Even though wave 4 of AddHealth was rolled out in 2008, we chose to use data for 2005-07 to stay away from the Great Recession.
50 See Albanesi and Olivetti (2009) for evidence on management occupations.
In uncertain environments, information is crucial for success and men hold more of this type of social capital than women, which is why they perform better. In contrast, in sectors characterized by stable earnings like health support or social services, the wage gap is much smaller or even reversed. It is not reversed in all occupations that lack earnings or project uncertainty because many factors, not just network differences, fuel the wage gap. However, we argue that in these industries and occupations, the wage gap is smaller than it would be in the absence of women’s network features.

6 Discussion

6.1 Risk Aversion and Competitiveness

We are aware that our analysis may suffer from several endogeneity issues. The main concern in our context is that we did not control for other unobserved factors that may impact both network structure as well as occupational choices and earnings. This would bias our main effects of network characteristics on outcomes. Two factors come to mind: risk aversion and competitiveness. Women are both more risk averse and less competitive than men; for a summary of this literature, see Croson and Gneezy (2009). One can imagine that both characteristics lead them to (i) form tighter networks and (ii) be less likely to choose risky occupations and, if they do choose them, to underperform compared to men.

To address this concern, we use a series of variables in AddHealth that can proxy for (i) risk aversion and (ii) competitiveness, perform a factor analysis to reduce the dimensionality of these measures and include the resulting factors as controls in all of our regressions. To measure risk aversion, we take into account helmet use on bicycles, seatbelt use, motorcycle use, marijuana consumption and the frequency of doing something dangerous because the individual was dared to. In turn, to measure competitiveness we take into account participation in high school sports, intensity of sport activity and participation in debate clubs. We provide details on these proxy variables and the construction of the factors in Appendix B. We are most confident in the first factor of risk aversion and the first two factors of competitiveness, which we therefore use as proxies for these unobserved characteristics. Perhaps surprisingly, we do not find any significant correlation between the network characteristics (degree and clustering coefficient) on the one hand and the various measures of risk aversion on the other (Table 22). We also do not find any notable correlation between the clustering coefficient and the various measures of competitiveness. There is, however, a positive correlation between degree and competitiveness (Table 24).
To test more formally whether risk aversion and competitiveness are confounding factors in our analysis, we include them as controls (either one or both of them) into our regressions. Importantly, this does not alter our main results: First, clustering still has a negative impact on earnings if the occupation is risky (the coefficient of interest drops slightly but remains borderline significant, Table 25, columns (5)-(7)). Second, the additional controls do not change our result on the persistence of the initial earnings disadvantage for high clustering workers. This effect becomes even more pronounced when controlling for risk aversion, columns (6)-(8) in Table 28. This robustness check gives us confidence that network structure per se matters for labor market outcomes and that differences in network structures are not just a by-product of unobserved heterogeneity in preferences or competitiveness.


6.2 Network Structure versus Referral Networks

An important question is whether our empirical results are really due to networks at work or whether they may have also been influenced by referral networks. Even though we have no way to formally test these alternative explanations, there are several reasons why it is unlikely that all of our findings are driven by referral networks: First, the data on academic computer scientists includes information on both networks at work and performance. We provide evidence of how collaboration networks impact performance, where we do not see any compelling reason why referral networks confound these effects. Second, in the AddHealth data, we do not find a significant effect of degree on starting wages (in contrast to what a theory of referral networks would predict). Moreover, while we consider it plausible that referral networks impact starting wages, we are aware of no theory or empirical results showing that they also impact the self-selection into risky versus non-risky occupations: Based on the comparative advantage prediction of our model and the results from the earnings regressions above, workers with a relatively high clustering coefficient (and/or relatively low degree) should be more likely to select into less risky occupations if they understand the effect of networks on labor market outcomes. To test this prediction, we run an ordered probit model, which is a choice model where the alternatives are ordered by risk. In our baseline specification, we regress the riskiness of the occupation that was chosen by an individual on his/her network characteristics,

$$\text{Risk}_{ijt} = \beta_0 + \beta_1 \text{Clustering}_i + \beta_2 \text{Degree}_i + x_{it}^T \gamma + \epsilon_{ijt} \qquad (14)$$

where $\text{Risk}_{ijt}$ is the risk of occupation j in which individual i is employed at time t, $x_{it}$ is a similar vector of individual controls as above and $\epsilon_{ijt}$ is a mean-zero error term. In (14), we cluster the standard errors at the occupational level. Our main explanatory variables of interest are $\text{Clustering}_i$ and $\text{Degree}_i$. If individuals understand their comparative advantage based on network types, we would expect that $\beta_1$ is negative (higher clustering leads to choosing less risky occupations) while $\beta_2$ is positive (higher degree encourages choosing riskier occupations). Estimating (14) on both wave 3 and wave 4 of the AddHealth data yields results that are consistent with the comparative advantage prediction of our model: In wave 3, a higher clustering coefficient makes it less likely to be employed in riskier occupations (Table 31, column (1)), while degree does not significantly predict the occupational risk choices of individuals. This result is robust to including additional controls, columns (2)-(8). In wave 4 of AddHealth, a higher degree is associated with a higher probability of being employed in a risky occupation and this effect is significant (Table 33). Thus, there is a sense in which individuals self-select into the occupation that caters to their network characteristics, which the literature on referral networks is silent about. However, the economic size of these effects is small, indicating that most individuals may not be fully aware of their network types and/or their usefulness across different occupations.51
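To make the sign predictions for the ordered probit concrete, the sketch below computes category probabilities from the textbook ordered probit formula, P(category k) = Φ(c_k − index) − Φ(c_{k−1} − index). The coefficients, the two cutpoints and the three risk categories are hypothetical values chosen for illustration; they are not the estimates from Tables 31 to 34.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, written via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ordered_probit_probs(index, cutpoints):
    """Probabilities of the ordered categories for a given linear index.

    Category k has probability Phi(c_k - index) - Phi(c_{k-1} - index),
    with c_0 = -infinity and c_K = +infinity.
    """
    cdf_values = [0.0] + [normal_cdf(c - index) for c in cutpoints] + [1.0]
    return [cdf_values[k + 1] - cdf_values[k] for k in range(len(cdf_values) - 1)]


# Hypothetical parameters (assumptions): negative effect of clustering,
# positive effect of degree, three ordered risk categories.
beta_clustering = -0.3
beta_degree = 0.1
cutpoints = [-0.5, 0.5]  # boundaries between low / medium / high risk


def linear_index(clustering, degree):
    return beta_clustering * clustering + beta_degree * degree


probs_low_clustering = ordered_probit_probs(linear_index(0.0, 0.0), cutpoints)
probs_high_clustering = ordered_probit_probs(linear_index(1.0, 0.0), cutpoints)

labels = ["low risk", "medium risk", "high risk"]
for label, p0, p1 in zip(labels, probs_low_clustering, probs_high_clustering):
    print(f"{label}: {p0:.3f} -> {p1:.3f}")
```

With a negative clustering coefficient, raising clustering shifts probability mass from the riskiest category toward the least risky one, which is the direction of the marginal effects discussed in the text.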

6.3 Informal and Fixed Networks

We focus in our setting on informal and fixed networks. Firms are concerned with the optimal organizational structure and hierarchy within a firm (i.e. with the formal network), a question at the heart of the literature on personnel economics (for an overview, see Lazear and Oyer (2007)). However, they cannot implement informal interactions. Thus, they are required to take informal networks as given, that is, as fixed. These informal networks are nevertheless essential to the success of a firm and may interact with the organizational framework. Understanding how they operate is therefore vital and can improve the operations of a firm. Our evidence suggests that informal networks are strongly influenced by gender. We document that boys and girls differ in their networks, which carries over to male and female adults at the workplace. This indicates that gender differences in networks are stable across different ages and environments and highlights that these informal networks should be viewed as fixed not only from the point of view of the firm, but also from the worker’s perspective. Our interpretation is supported by the early literature on gender differences in work networks, as both women and men display significant variation in their network structure (Burt (1998), Ibarra (1992, 1993, 1997)), implying that individuals do not adjust their networks in order to reap the greatest benefits. There are different potential explanations for this observation, one being that network structure should be seen as a type that is stable over time and across environments (Burt (2011)), another being that individuals are not aware of what constitutes a more or less successful network (Burt (1998)). The fact that agents do not know about optimal team composition has been corroborated by Sarsons (2015), who shows that women are unaware that co-authorship with men is detrimental to the probability of them being tenured.
The most compelling evidence that there is a lack of information on the benefits of networking has been provided by Forret and Dougherty (2004). They show that women’s networking only affects their perceived career success, highlighting that women are unaware that their networking efforts have no impact. But if the network can be considered as an intrinsic type or if individuals do not understand what type of network is beneficial for them, it seems a logical first step to keep networks fixed, not only from the firms’ but also from the workers’ perspective.

51 To grasp the magnitude of the effect of clustering on occupational risk in wave 3, we compute the marginal effects of choosing different occupations. For illustration, we picked two occupations that, according to our occupational risk measure, are high-risk (management occupations) and low-risk (health care support). The effects are in Table 32. In column (1), the coefficient on clustering implies that increasing the clustering coefficient by one standard deviation increases the probability of choosing the low-risk occupation Health Care Support by 0.2%. In turn, according to column (2), increasing the clustering coefficient by one standard deviation decreases the probability of choosing the high-risk occupation Management by 0.1%. For the marginal effects in wave 4, see Table 34.

7 Conclusion

We empirically identify a new dimension of heterogeneity between men and women, namely differences in their social network structures, and develop a theory that connects these differences to discrepancies in their labor market outcomes. We first establish in three very different environments, both social and at work, that men have a higher degree than women, whereas women have a higher clustering coefficient. Based on this finding, we develop a theory that sheds light on the relative advantages of having a male network (high degree, low clustering) versus a female network (low degree, high clustering). A higher clustering coefficient implies higher peer pressure, whereas a higher degree improves access to information. Both peer pressure and access to information are advantageous, but in different environments. We find that, in environments where uncertainty is high, information is crucial and, therefore, men outperform women. We find empirical support for the main predictions of our theory: Individuals with high clustering and low degree have a comparative advantage in less risky occupations, reflected by their smaller propensity to select into risky occupations and their lower earnings in risky settings. Moreover, early career wage gaps are particularly persistent for individuals with high clustering. We also show that differences in network structure across gender can help explain disparities in some of their labor market outcomes. It is important to note that these results are not driven by differences in risk aversion or competitiveness across gender, which we control for. It is beyond the scope of this paper to analyze the source of network differences between men and women. These differences might be due to different patterns of socialization or distinct preferences. To analyze the origins we would require more systematic data on children at younger ages, which to the best of our knowledge is not available at this point.
Last, in its current form, our model is not used to study the optimal composition of a team. The optimal team composition should depend on the network structures of the team members. We believe that this is an interesting extension of our research, which we aim to address in future work.

Data Appendix A: Network Characteristics

Data Sources

We use three different data sources in this section: the AddHealth data, the Enron email data and data on collaboration networks of computer scientists from the dblp platform.

AddHealth Data

Table 1: Descriptive Statistics Add Health

                     Male Students                          Female Students
                     Mean     Std Dev.  Min     Max         Mean     Std Dev.  Min     Max
Cl. Coeff. (dir.)    0.105    0.146     0       1           0.109    0.133     0       1
Cl. Coeff.           0.147    0.183     0       1           0.167    0.181     0       1
Degree               6.9      4.976     0       39          7.544    4.691     0       37
In-Degree            3.726    3.734     0       37          4.109    3.633     0       34
Out-Degree           4.419    3.591     0       10          5.14     3.371     0       10
Age                  15.084   1.719     10      19          14.923   1.702     10      19
Size/1000            0.953    0.539     0.025   2.328       0.950    0.548     0.025   2.328
Observations         42376                                  42416

Note: Non-standardized network variables according to gender; individuals that cannot be uniquely identified are omitted.
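As a reminder of what the network variables in Table 1 measure, the following sketch computes in-degree, out-degree, undirected degree and the undirected clustering coefficient from a set of directed friendship nominations. It is an illustration under stated assumptions rather than the construction used for AddHealth: the names and nominations are invented, a link is treated as undirected if either person names the other, and the clustering coefficient is set to 0 for individuals with fewer than two neighbors.

```python
def network_stats(nominations):
    """Compute basic network measures from directed nominations.

    nominations maps each person to the set of friends that person names.
    Returns a dict of per-person measures.
    """
    people = set(nominations)
    for named in nominations.values():
        people |= set(named)

    out_degree = {p: len(nominations.get(p, ())) for p in people}
    in_degree = {p: 0 for p in people}
    neighbors = {p: set() for p in people}
    for source, named in nominations.items():
        for target in named:
            in_degree[target] += 1
            # Undirected link if either side names the other (assumption).
            neighbors[source].add(target)
            neighbors[target].add(source)

    stats = {}
    for p in people:
        nbrs = sorted(neighbors[p])
        k = len(nbrs)
        # Count linked pairs among p's neighbors.
        links = sum(
            1
            for i in range(k)
            for j in range(i + 1, k)
            if nbrs[j] in neighbors[nbrs[i]]
        )
        clustering = 2.0 * links / (k * (k - 1)) if k > 1 else 0.0
        stats[p] = {
            "in_degree": in_degree[p],
            "out_degree": out_degree[p],
            "degree": k,
            "clustering": clustering,
        }
    return stats


# Hypothetical nominations (invented for illustration).
nominations = {
    "ana": {"ben", "cat"},
    "ben": {"ana", "cat"},
    "cat": {"ana"},
    "dan": {"ana"},
}
stats = network_stats(nominations)
for person in sorted(stats):
    print(person, stats[person])
```

In this toy network "ana" has the highest degree but a low clustering coefficient because only one of the three possible pairs among her neighbors is linked, whereas "ben" has fewer neighbors who all know each other.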

Table 2: Difference in Network Characteristics Men-Women, Age > 17, Add Health

Cl. Coeff. (dir)     -0.0165      (0.0331)
Cl. Coeff.           -0.0794∗     (0.0313)
Degree               0.0787∗∗     (0.0271)
In Degree            0.128∗∗∗     (0.0253)
Out Degree           -0.0801∗∗    (0.0289)
Observations         4930

Standard errors in parentheses
∗ p < 0.05, ∗∗ p < 0.01, ∗∗∗ p < 0.001

Table 3: Differences in Degree & Clustering for Boys and Girls: Entire Sample Add Health

                     Cl. Coeff.  Cl. Coeff.  Cl. Coeff.  Cl. Coeff.  Degree      Degree      In Degree   In Degree   Out Degree  Out Degree
                     (dir)       (dir)
Female               0.0482∗∗∗   0.0858∗∗∗   0.138∗∗∗    0.170∗∗∗    0.121∗∗∗    0.161∗∗∗    0.103∗∗∗    0.140∗∗∗    0.195∗∗∗    0.211∗∗∗
                     (0.00672)   (0.0139)    (0.00656)   (0.0136)    (0.00626)   (0.0130)    (0.00652)   (0.0137)    (0.00638)   (0.0130)
Age                  -0.000432               -0.000555               -0.0441∗∗∗              -0.0166∗∗∗              -0.0524∗∗∗
                     (0.00284)               (0.00277)               (0.00251)               (0.00258)               (0.00259)
Size/1000            -0.104      -0.0602     -0.103      -0.0630     1.377∗      1.280∗      1.820∗∗∗    1.721∗∗∗    0.724       0.646
                     (0.569)     (0.568)     (0.632)     (0.630)     (0.593)     (0.586)     (0.483)     (0.487)     (0.701)     (0.692)
Age 16-17                        0.0374                  0.00906                 0.137∗∗∗                0.182∗∗∗                0.0309
                                 (0.0243)                (0.0234)                (0.0217)                (0.0224)                (0.0227)
Age 18-19                        -0.000737               -0.0197                 0.0823∗                 0.225∗∗∗                -0.105∗
                                 (0.0511)                (0.0472)                (0.0417)                (0.0408)                (0.0442)
Female×Age 16-17                 -0.0186                 0.0146                  -0.144∗∗∗               -0.133∗∗∗               -0.0781∗∗∗
                                 (0.0147)                (0.0145)                (0.0133)                (0.0138)                (0.0138)
Female×Age 18-19                 -0.00769                -0.0103                 -0.246∗∗∗               -0.271∗∗∗               -0.132∗∗∗
                                 (0.0341)                (0.0321)                (0.0264)                (0.0254)                (0.0287)
Female×Size/1000                 -0.0325∗                -0.0391∗∗               0.0258∗                 0.0247∗                 0.0203
                                 (0.0131)                (0.0128)                (0.0116)                (0.0120)                (0.0121)
SCHOOL FIXED EFFECTS INCLUDED
Observations         84792       84792       84792       84792       84792       84792       84792       84792       84792       84792
R2                   0.057       0.057       0.092       0.093       0.202       0.205       0.132       0.134       0.170       0.171

Standard errors in parentheses, ∗ p < 0.05, ∗∗ p < 0.01, ∗∗∗ p < 0.001
Note: Network characteristics are standardized and can be interpreted in terms of standard deviations, all regressions include school fixed effects, size is calculated as the number of students at a given school.

Table 4: Differences in Degree & Clustering for Boys and Girls: Age 12-18, Add Health

                     Cl. Coeff.  Cl. Coeff.  Cl. Coeff.  Cl. Coeff.  Degree      Degree      In Degree   In Degree   Out Degree  Out Degree
                     (dir)       (dir)
Female               0.0484∗∗∗   0.0858∗∗∗   0.139∗∗∗    0.170∗∗∗    0.121∗∗∗    0.161∗∗∗    0.103∗∗∗    0.140∗∗∗    0.196∗∗∗    0.212∗∗∗
                     (0.00675)   (0.0139)    (0.00659)   (0.0137)    (0.00630)   (0.0131)    (0.00658)   (0.0138)    (0.00642)   (0.0131)
Age                  0.000560                0.00190                 -0.0411∗∗∗              -0.0123∗∗∗              -0.0500∗∗∗
                     (0.00294)               (0.00289)               (0.00262)               (0.00273)               (0.00269)
Size/1000            0.883       0.936       0.987       1.043       2.245∗∗∗    2.159∗∗∗    2.205∗∗∗    2.119∗∗∗    1.915∗∗     1.848∗∗
                     (0.569)     (0.569)     (0.620)     (0.619)     (0.564)     (0.557)     (0.452)     (0.456)     (0.678)     (0.670)
Age 16-17                        0.0370                  0.00821                 0.134∗∗∗                0.179∗∗∗                0.0304
                                 (0.0244)                (0.0234)                (0.0218)                (0.0225)                (0.0227)
Age 18-19                        0.00327                 -0.0271                 0.120∗∗                 0.267∗∗∗                -0.0819
                                 (0.0548)                (0.0504)                (0.0451)                (0.0447)                (0.0477)
Female×Age 16-17                 -0.0185                 0.0150                  -0.143∗∗∗               -0.132∗∗∗               -0.0783∗∗∗
                                 (0.0147)                (0.0145)                (0.0133)                (0.0139)                (0.0139)
Female×Age 18-19                 -0.00119                0.0105                  -0.243∗∗∗               -0.273∗∗∗               -0.121∗∗∗
                                 (0.0365)                (0.0345)                (0.0285)                (0.0278)                (0.0308)
Female×Size/1000                 -0.0324∗                -0.0390∗∗               0.0250∗                 0.0237∗                 0.0199
                                 (0.0132)                (0.0128)                (0.0117)                (0.0121)                (0.0122)
SCHOOL FIXED EFFECTS INCLUDED
Observations         83688       83688       83688       83688       83688       83688       83688       83688       83688       83688
R2                   0.057       0.057       0.093       0.093       0.202       0.204       0.132       0.134       0.170       0.170

Standard errors in parentheses, ∗ p < 0.05, ∗∗ p < 0.01, ∗∗∗ p < 0.001
Note: Network characteristics are standardized and can be interpreted in terms of standard deviations, all regressions include school fixed effects, size is calculated as the number of students at a given school, students below 12 and above 18 are dropped from the sample.

Table 5: Differences in Degree & Clustering for Boys and Girls: Age 12-18, Balanced Gender Ratio, Add Health

                     Cl. Coeff.  Cl. Coeff.  Cl. Coeff.  Cl. Coeff.  Degree      Degree      In Degree   In Degree   Out Degree  Out Degree
                     (dir)       (dir)
Female               0.0302∗∗∗   0.0416∗     0.131∗∗∗    0.151∗∗∗    0.105∗∗∗    0.136∗∗∗    0.111∗∗∗    0.189∗∗∗    0.168∗∗∗    0.147∗∗∗
                     (0.00821)   (0.0200)    (0.00803)   (0.0198)    (0.00744)   (0.0184)    (0.00774)   (0.0194)    (0.00773)   (0.0190)
Age                  0.00198                 0.00823∗                -0.0410∗∗∗              -0.0109∗∗∗              -0.0470∗∗∗
                     (0.00336)               (0.00330)               (0.00290)               (0.00302)               (0.00302)
Size/1000            0.419∗      0.615∗∗     0.553∗∗∗    1.027∗∗∗    0.460∗∗∗    0.228∗∗     0.500∗∗∗    0.481∗∗∗    0.328∗∗     0.143
                     (0.198)     (0.195)     (0.152)     (0.151)     (0.0867)    (0.0882)    (0.0439)    (0.0507)    (0.108)     (0.107)
Age 16-17                        -0.00311                -0.0107                 0.119∗∗∗                0.204∗∗∗                -0.000906
                                 (0.0277)                (0.0266)                (0.0243)                (0.0251)                (0.0254)
Age 18-19                        -0.00719                -0.0244                 0.0985∗                 0.279∗∗∗                -0.111∗
                                 (0.0596)                (0.0549)                (0.0483)                (0.0478)                (0.0513)
Female×Age 16-17                 0.00891                 0.0313                  -0.127∗∗∗               -0.142∗∗∗               -0.0524∗∗∗
                                 (0.0168)                (0.0166)                (0.0150)                (0.0157)                (0.0156)
Female×Age 18-19                 0.00854                 0.0134                  -0.227∗∗∗               -0.281∗∗∗               -0.0965∗∗
                                 (0.0399)                (0.0379)                (0.0310)                (0.0305)                (0.0334)
Female×Size/1000                 -0.0145                 -0.0328∗                0.0361∗∗                0.00120                 0.0453∗∗
                                 (0.0156)                (0.0153)                (0.0138)                (0.0142)                (0.0145)
SCHOOL FIXED EFFECTS INCLUDED
Observations         57605       57605       57605       57605       57605       57605       57605       57605       57605       57605
R2                   0.057       0.057       0.092       0.093       0.225       0.227       0.148       0.150       0.186       0.187

Standard errors in parentheses, ∗ p < 0.05, ∗∗ p < 0.01, ∗∗∗ p < 0.001
Note: Network characteristics are standardized and can be interpreted in terms of standard deviations, all regressions include school fixed effects, size is calculated as the number of students at a given school, students below 12 and above 18 are dropped from the sample, the regression includes only schools which have a share of women between .49 and .51.

Table 6: Differences in Degree for Boys and Girls: Non-Identified Friends Included, Add Health

                     Degree      Degree      In Degree   In Degree   Out Degree  Out Degree
Female               0.237∗∗∗    0.261∗∗∗    0.103∗∗∗    0.143∗∗∗    0.340∗∗∗    0.331∗∗∗
                     (0.00643)   (0.0132)    (0.00653)   (0.0137)    (0.00651)   (0.0130)
Age                  -0.0274∗∗∗              -0.0172∗∗∗              -0.0274∗∗∗
                     (0.00262)               (0.00259)               (0.00270)
Size/1000            1.661∗∗     1.551∗∗     1.797∗∗∗    1.701∗∗∗    1.070       0.975
                     (0.531)     (0.541)     (0.477)     (0.480)     (0.599)     (0.607)
Age 16-17                        0.127∗∗∗                0.184∗∗∗                0.0136
                                 (0.0226)                (0.0224)                (0.0236)
Age 18-19                        0.0916∗                 0.229∗∗∗                -0.0949∗
                                 (0.0446)                (0.0407)                (0.0475)
Female×Age 16-17                 -0.111∗∗∗               -0.135∗∗∗               -0.0279
                                 (0.0137)                (0.0138)                (0.0142)
Female×Age 18-19                 -0.231∗∗∗               -0.273∗∗∗               -0.103∗∗∗
                                 (0.0284)                (0.0254)                (0.0309)
Female×Size/1000                 0.0284∗                 0.0233                  0.0239
                                 (0.0120)                (0.0120)                (0.0125)
SCHOOL FIXED EFFECTS INCLUDED
Observations         84792       84792       84792       84792       84792       84792
R2                   0.155       0.157       0.130       0.132       0.132       0.134

Standard errors in parentheses, ∗ p < 0.05, ∗∗ p < 0.01, ∗∗∗ p < 0.001
Note: Network characteristics are standardized and can be interpreted in terms of standard deviations, all regressions include school fixed effects, size is calculated as the number of students at a given school, degree is calculated taking non-identified friends into account.

Enron Data

Table 7: Summary Statistics – Baseline

                                      Min     Max      Mean    Std
Outdegree                             0.00    410.00   4.09    17.22
Indegree                              0.00    335.00   4.09    13.88
Degree                                0.00    472.00   6.38    21.82
Clustering Coefficient (Undirected)   0.00    1.00     0.37    0.34
Clustering Coefficient (Directed)     0.00    1.00     0.23    0.28
Observations                          6604

Note: This table is based on our Enron baseline sample of 6 604 employees.

Table 8: Summary Statistics by Gender - Baseline

                                      Male        Female      Difference
Outdegree                             4.191631    3.936685    0.254946
                                                              (0.58722)
Indegree                              4.166624    3.973184    0.19344
                                                              (0.55069)
Degree                                6.550906    6.130726    0.42018
                                                              (0.76684)
Clustering Coefficient (Undirected)   0.3603268   0.3838312   -0.0235044**
                                                              (-2.0563)
Clustering Coefficient (Directed)     0.2231572   0.2412111   -0.0180539**
                                                              (-2.0003)
Observations                          6604

t-statistics in parentheses. ***p<0.01, **p<0.05, *p<0.1. Note: This table is based on our Enron baseline sample of 6 604 employees.

LDA Topics Modeling. We want to classify emails into those that are ‘work-related’ and those that are ‘social’ based on the content of the email subject and body. Toward this goal, we perform LDA (Latent Dirichlet Allocation) topic modeling over the email content. LDA is a statistical method widely used in Natural Language Processing. The assumption is that if observations are words collected into documents, then each document is a mixture of a small number of topics and each word is attributable to one of the document’s topics. Every topic has probabilities of generating each word. The topic boundaries are then determined by the probabilities that certain words occur. We fit 10 topics to the Enron emails; the number of topics is what we can control ex-ante. However, in the LDA we cannot specify what these topics are. This is the program’s task based on the probability distributions over words: For each email i, the LDA algorithm returns a probability distribution over topics $P_i = [p_{i1}, p_{i2}, ..., p_{i10}]$. The assigned topic of email i is $\arg\max_k p_{ik}$. We again focus on the network that is based on single receivers (excluding group emails), see above. Before applying the topic model, we perform the following cleaning process:

• merge email content and email subject
• transform to lowercase
• remove punctuations and numbers
• remove stop words such as “has”, “do”, “you”, etc.
• remove custom stop words including “www”, “http”, “com”, “html”, “enron”, “email”, “mailto”, “re” and “subject”
• stem certain words, for example, change “had” to “have”
• remove all white spaces

Applying the topic model, the program gives the following 10 categories (topics) with corresponding key words. Here we give some examples of the words that appear most frequently, i.e. key words; in each column/topic, words are ordered by decreasing frequency:

Table 9: Topic-Key Word Matrix

1          2         3            4         5            6          7         8        9        10
cc         deal      width        energy    image        will       game      get      new      detected
pm         gas       table        power     please       please     updated   know     day      watson
sent       price     size         will      message      agreement  checkout  can      alias    date
original   database  border       gas       information  can        week      just     houston  reservation
message    date      font         new       may          need       team      like     travel   river
thanks     deal      height       market    click        time       play      will     visit    pipeline
mark       hour      color        company   use          questions  fantasy   sent     click    service
please     start     cellpadding  business  recipient    call       free      message  city     usd
forwarded  final     cellspacing  us        access       thanks     season    good     free     capacity
fax        schedule  img          trading   contact      know       football  one      time     Feb

Once the 10 topics are determined, the algorithm assigns a probability distribution for each email content over topics, and the topic that receives the largest probability is then the topic of the email. After classifying all the emails in the data set by topic, we can compute the distribution over the topics (i.e. how many emails fall into each category) as well as the average probability that the email falls into the topic it is assigned to (Table 10). Here is an example of a typical work-related email (the 10th email in our list): “Mr. Buckner, for delivered gas behind San Diego, Enron Energy Services is the appropriate Enron entity. I have forwarded your request to Zarin Imam at EES. Her phone number is 713-853-7107. Phillip Allen”. It is in Topic 4, which we identified as a work-related topic based on keywords.
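The cleaning steps listed before the topic model can be sketched as a small text pipeline. This is an illustrative approximation, not the code used for the Enron analysis: the general stop-word set below is a tiny hypothetical stand-in for a full list, the stemming map contains only the one example given in the text, and the regular expression keeps plain ASCII letters only.

```python
import re

# Tiny illustrative stop-word list (assumption); a real list would be longer.
STOP_WORDS = {"has", "do", "you", "i", "your", "to", "at", "the", "is"}
# Custom stop words as listed in the text.
CUSTOM_STOP_WORDS = {"www", "http", "com", "html", "enron", "email",
                     "mailto", "re", "subject"}
# Only the stemming example mentioned in the text.
STEM_MAP = {"had": "have"}


def clean_email(subject, body):
    """Return the list of cleaned tokens for one email."""
    text = f"{subject} {body}".lower()          # merge fields, lowercase
    text = re.sub(r"[^a-z\s]", " ", text)       # drop punctuation and numbers
    tokens = text.split()                       # split() also collapses white space
    tokens = [t for t in tokens if t not in STOP_WORDS]
    tokens = [t for t in tokens if t not in CUSTOM_STOP_WORDS]
    return [STEM_MAP.get(t, t) for t in tokens]


example_tokens = clean_email(
    "Re: Gas deal",
    "I had forwarded your request to EES at 713-853-7107.",
)
print(example_tokens)
```

The resulting token lists would then be passed to a topic-model implementation of one's choice; that step is left out here because the text does not say which LDA software was used.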

Table 10: Summary Statistics by Topic

Topic   Number of Emails   Average Prob.
1       69435              0.5390860
2       29760              0.6163105
3       2550               0.5811157
4       32477              0.5235197
5       30572              0.5660184
6       93468              0.5168961
7       4757               0.5624698
8       71150              0.5756421
9       14932              0.5400725
10      5500               0.4526563

Note: For each email i, the LDA algorithm returns a probability distribution over topics $P_i = [p_{i1}, p_{i2}, ..., p_{i10}]$. The assigned topic of email i is $\arg\max_k p_{ik}$. The average probability reported for each topic is the mean of this largest probability, taken over all emails i assigned to that topic.
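The two quantities in Table 10 follow mechanically from the per-email topic distributions. The sketch below shows the assignment rule and the per-topic averages on three invented probability vectors with only three topics; the numbers are assumptions for illustration and the function name is hypothetical.

```python
def summarize_topics(topic_probs):
    """Assign each email to its most probable topic and summarize by topic.

    topic_probs is a list of probability lists, one per email.
    Returns {topic: (number_of_emails, average_assigned_probability)},
    with topics numbered from 1.
    """
    counts = {}
    totals = {}
    for probs in topic_probs:
        best = max(range(len(probs)), key=lambda k: probs[k])
        topic = best + 1
        counts[topic] = counts.get(topic, 0) + 1
        totals[topic] = totals.get(topic, 0.0) + probs[best]
    return {t: (counts[t], totals[t] / counts[t]) for t in counts}


# Hypothetical topic distributions for three emails (not Enron data).
emails = [
    [0.7, 0.2, 0.1],
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
]
summary = summarize_topics(emails)
for topic in sorted(summary):
    n_emails, avg_prob = summary[topic]
    print(f"topic {topic}: {n_emails} emails, average probability {avg_prob:.2f}")
```

A low average probability for a topic, as for topic 10 in Table 10, indicates that emails assigned to it are on average less clearly separated from the other topics.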

Table 11: Summary Statistics by Topic

Topic   Outdegree   Indegree    Degree      CC (Undirected)   CC (Directed)
1       2.8629730   2.8629730   4.6318919   0.1643709         0.1164492
2       2.0123457   2.0123457   3.4224966   0.0903606         0.0671486
3       0.6056338   0.6056338   1.1478873   0.0024008         0.0012271
4       1.7531894   1.7531894   3.0215898   0.0958132         0.0743884
5       1.3655678   1.3655678   2.4278388   0.0527038         0.0411118
6       2.8774323   2.8774323   4.7345113   0.1600291         0.1193435
7       0.8453865   0.8453865   1.5461347   0.0280657         0.0226686
8       2.1962705   2.1962705   3.6042984   0.1182538         0.0842104
9       0.9540816   0.9540816   1.7500000   0.0165915         0.0158571
10      1.2029197   1.2029197   2.0817518   0.0497318         0.0341557

Note: CC stands for Clustering Coefficient.

Table 12: Summary Statistics by Gender (LDA Topics Model) – Baseline

                                      Male         Female       Difference     t-statistic
Outdegree                             1.2513761    0.8867925    0.3645836**    (2.372)
Indegree                              1.1743119    0.9858491    0.1884628*     (1.7988)
Degree                                2.157798     1.709906     0.447892**     (2.2859)
Clustering Coefficient (Undirected)   0.08168651   0.11353656   -0.03185005    (-0.9925)
Clustering Coefficient (Directed)     0.05893266   0.09246837   -0.03353571    (-1.2618)
Observations                          969

t-statistics in parentheses. ***p<0.01, **p<0.05, *p<0.1. Note: Analysis is based on the baseline sample with 969 employees, where we focus on email exchanges falling into topics 7 and 9 from Table 9.
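For readers who want to reproduce the Difference column, the sketch below computes a difference in means and its t-statistic. The text does not say which two-sample test is used, so the unequal-variance (Welch) form is an assumption of this example, as is the function name mean_difference_t.

```python
import math

def mean_difference_t(x, y):
    """Difference in means of x and y with a Welch-type t-statistic.

    Assumption for illustration: unequal variances across the two groups.
    """
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    # Unbiased sample variances.
    vx = sum((v - mx) ** 2 for v in x) / (nx - 1)
    vy = sum((v - my) ** 2 for v in y) / (ny - 1)
    se = math.sqrt(vx / nx + vy / ny)   # standard error of the difference
    diff = mx - my
    return diff, diff / se
```

Applied to the male and female values of a network characteristic (for example, outdegree), the first return value corresponds to Difference and the second to the t-statistic in parentheses.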

Table 13: Summary Statistics by Gender (LDA Topics Model) – Robustness 1

                                      Male         Female       Difference     t-statistic
Outdegree                             0.8587156    0.5542453    0.3044703**    (2.2439)
Indegree                              0.7577982    0.6839623    0.0738359      (0.91462)
Degree                                1.456881     1.150943     0.305938*      (1.8271)
Clustering Coefficient (Undirected)   0.09005311   0.12379234   -0.03373923    (-0.82461)
Clustering Coefficient (Directed)     0.06361183   0.10551364   -0.04190181    (-1.2303)
Observations                          607

t-statistics in parentheses. ***p<0.01, **p<0.05, *p<0.1. Note: We focus on email exchanges falling into topics 7 and 9 from Table 9. We only consider connections between workers who have more than one email exchange. After imposing this threshold, we have 607 employees in our sample.


Table 14: Summary Statistics by Gender (LDA Topics Model) – Robustness 2

                                      Male         Female       Difference     t-statistic
Outdegree                             0.6146789    0.3632075    0.2514714**    (2.2196)
Indegree                              0.5596330    0.4339623    0.1256707*     (1.9551)
Degree                                1.053211     0.750000     0.303211**     (2.2308)
Clustering Coefficient (Undirected)   0.1187454    0.1008724    0.017873       (0.34524)
Clustering Coefficient (Directed)     0.09162081   0.09065046   0.00097035     (0.021602)
Observations                          443

t-statistics in parentheses. ***p<0.01, **p<0.05, *p<0.1. Note: We focus on email exchanges falling into topics 7 and 9 from Table 9. We only consider connections between workers who have more than two email exchanges. After imposing this threshold, we have 443 employees in our sample.

Table 15: Summary Statistics by Gender (LDA Topics Model) – Robustness 3

                                      Male         Female       Difference     t-statistic
Outdegree                             0.3926606    0.2264151    0.1662455*     (1.7928)
Indegree                              0.3467890    0.2853774    0.0614116      (1.3901)
Degree                                0.6678899    0.4858491    0.1820408*     (1.7592)
Clustering Coefficient (Undirected)   0.1242735    0.1309028    -0.0066293     (-0.089554)
Clustering Coefficient (Directed)     0.1063300    0.1232026    -0.0168726     (-0.24773)
Observations                          308

t-statistics in parentheses. ***p<0.01, **p<0.05, *p<0.1. Note: We focus on email exchanges falling into topics 7 and 9 from Table 9. We only consider connections between workers who have more than three email exchanges. After imposing this threshold, we have 308 employees in our sample.


Table 16: Summary Statistics by Gender (LDA Topics Model) – Robustness 4

                                      Male         Female       Difference     t-statistic
Outdegree                             1.0975       0.7725       0.325**        (2.0819)
Indegree                              1.015        0.855        0.16*          (1.5533)
Degree                                1.8700       1.4875       0.3825*        (1.9815)
Clustering Coefficient (Undirected)   0.06998089   0.08347487   -0.01349398    (-0.41186)
Clustering Coefficient (Directed)     0.04605156   0.06915951   -0.02310795    (-0.88869)
Observations                          400          400

t-statistics in parentheses. ***p<0.01, **p<0.05, *p<0.1. Note: We focus on email exchanges falling into topics 7 and 9 from Table 9. We consider a randomly selected gender-balanced subsample of 400 women and 400 men.


Computer Scientists Data

Table 17: Summary Statistics – Baseline

                         Min     Max       Mean    Std
Degree                   1.00    1049.00   7.34    14.24
Clustering Coefficient   0.00    0.25      0.13    0.08
Observations             906863

Note: This analysis is based on our baseline sample of 906 863 scholars.

Table 18: Summary Statistics by Gender – Baseline

                         Female      Male        Difference     t-statistic
Degree                   6.897871    7.959661    -1.06179***    (-25.657)
Clustering Coefficient   0.1366825   0.1295397   0.0071428***   (31.757)
Observations             906863

t-statistics in parentheses. ***p<0.01, **p<0.05, *p<0.1. Note: This analysis is based on our baseline sample of 906 863 scholars.

Table 19: Summary Statistics – Robustness

                         Min     Max      Mean       Std
Degree                   1.00    506.00   6.207028   11.17447
Clustering Coefficient   0.00    0.25     0.12       0.08
Observations             655700

Note: Here we consider a link to a co-author only if his/her gender is known. This leaves 655 700 authors, of whom 491 080 are men and 164 620 are women.

Table 20: Summary Statistics by Gender – Robustness

                         Female      Male        Difference     t-statistic
Degree                   5.537091    6.431604    -0.894513***   (-25.657)
Clustering Coefficient   0.1219477   0.1159816   0.0059661***   (24.729)
Observations             655700

t-statistics in parentheses. ***p<0.01, **p<0.05, *p<0.1. Note: Here we consider a link to a co-author only if his/her gender is known. This leaves 655 700 authors, of whom 491 080 are men and 164 620 are women.


Data Appendix B: Empirical Tests of the Model

Data Sources

We use four different data sources for this empirical analysis: AddHealth, the computer scientists data from the dblp bibliography, the US Census and the American Community Survey. Here we describe the data cleaning and variable construction.

AddHealth. We use the AddHealth data to test the model's predictions about network characteristics and labor market outcomes (occupational choices, earnings and earnings dynamics). We use waves 3 and 4 of AddHealth, which report labor market outcomes. Wave 3 was rolled out during 2000/2001 (the age of the individuals in this sample ranges between 18 and 27, with an average age of 22) and wave 4 was rolled out during 2006-2008 (the age of the individuals in this sample ranges between 25 and 34, with an average age of 29). We focus on a subsample of AddHealth in waves 3 and 4 for which: (i) we can compute the network characteristics above, (ii) individuals are attached to the labor force and currently working full-time in some occupation (where we define full-time as working at least 40 hours per week; we do not have information on weeks worked), (iii) individuals report annual income from wages and salaries (we drop individuals with a suspiciously low annual income of less than $1000). In both waves, we drop individuals employed in farming and the military. We use/construct the following variables:

Network Characteristics: Computed based on nominations of friends, see above. For the regression analysis we use standardized network characteristics, degree and clustering coefficient, that have mean 0 and standard deviation 1. The network characteristics are based on wave 1 of AddHealth since this is the only wave for which we have friendship nominations and thus can construct the social network for each individual.

Earnings: There is no variable that consistently measures the hourly rate of pay in both wave 3 and wave 4. Moreover, several of the reported earnings measures have extremely low response rates. We therefore use the response to the question about earnings during the year in which the questionnaire was rolled out, which includes wages or salaries, as well as tips, bonuses, overtime pay and income from self-employment. This is the earnings variable with the highest response rate. In order to deal with the problem that earnings confound rate of pay (i.e. wage) and hours, we control for (weekly) hours.

Occupations: Respondents are asked about job classifications, which we take as an indicator for the occupation they work in. There are 23 occupational categories in wave 3 (only coarse categories from the 1998 Standard Occupational Classification (SOC) system are available), while occupational codes in wave 4 are more disaggregated and based on the 2000 Standard Occupational Classification (SOC) system. We aggregate wave 4 codes to make them comparable to the occupational codes used in wave 3. In both wave 3 and wave 4, the occupation code of the current/most recent job is recorded, so we know with certainty which occupation the respondent is currently working in.

Demographic Characteristics: We control in our analysis for a large range of demographic variables. Education measures the highest grade completed in wave 3 and the highest educational level in wave 4. We adjust the wave 4 codes to make the education variable comparable. Tenure is the number of years on the current job. We control for having children with a dummy variable which is 1 if the respondent has children (while in wave 3 we could assess the number of children, this is not the case in wave 4). The definition of each variable used is explained in the notes below each table. Marital status is only reported in wave 3 (in wave 4 individuals are asked how many partners they ever married, but it is unclear whether the marriage is still intact). We also control for whether respondents majored in business in college or whether they received vocational training.

Measures of Risk Aversion and Competitiveness: See below for details.

Computer Scientists. We focus on a (randomly drawn) subsample of about 29 000 scholars for which we scraped Google Scholar information in order to assess their performance. The network characteristics are measured as above. The performance measures from Google Scholar are the i10-index, the h-index and the total number of citations. We describe them further in the main text.

US Census. We use the US Census data as well as the American Community Survey (a yearly survey starting in 2000, also administered by the Census Bureau) to compute measures of occupational risk, the occupational gender wage gap and occupational mean earnings. Our occupational risk measure is based on unexplained earnings risk and we explain the details below. The occupational gender gap is the log difference between average male and average female earnings in each occupation. Occupational mean earnings are the average earnings in an occupation. We crosswalk the occupational codes used in the Census data to merge them into the occupational codes of AddHealth. We focus on the 2000 Census for the analysis of wave 3 of AddHealth. We merge occupational information from the 2005-2007 American Community Survey into wave 4 of AddHealth. The Census data report annual pre-tax wage and salary income and this is the earnings measure we use. Throughout, we focus on full-time workers during prime age (25-65). The reason we compute occupational risk, occupational mean earnings and occupational gender earnings gaps from the Census and ACS and not from AddHealth is that these data are representative of the whole US economy. We merge these measures, which are assessed at the occupational level, to the occupations that individuals report in AddHealth.

Measurement

Measuring Occupational Risk

In order to assess whether a work environment is 'risky' or 'safe' we construct a measure of occupational earnings risk: the standard deviation of residual earnings by occupation. That is, we run a Mincer-type wage regression (i.e. we regress individual log earnings on commonly used observable characteristics of individuals as well as occupation and industry dummies), where the residual gives the unexplained/unpredicted portion of the individual's earnings. The standard deviation of residual earnings by occupation measures the wage variation that is associated with a certain occupation and cannot be predicted based on observable controls, i.e. it is a reasonable measure of wage risk. This measure of unpredicted wage risk is closely related to the uncertainty in our model. We use Census 2000 data to construct this risk measure for wave 3 and use a crosswalk to make the occupational codes from the Census comparable to those from AddHealth. In turn, to construct the occupational risk for wave 4 we use ACS data from 2005-07 (just before the Great Recession, to avoid confounding factors). Note that we have dropped farming and military occupations, as we also drop those in our empirical analysis.
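The construction of the risk measure can be sketched in a few lines. This is a simplified illustration under stated assumptions rather than the estimation code behind the paper: the function name occupational_risk and its arguments are hypothetical, industry dummies are omitted for brevity, and the occupation dummies absorb the intercept.

```python
import numpy as np

def occupational_risk(log_earnings, controls, occupation):
    """Standard deviation of Mincer-regression residuals by occupation.

    log_earnings: length-n array of individual log earnings.
    controls:     (n, k) array of observable characteristics (hypothetical).
    occupation:   length-n array of occupation codes.
    """
    y = np.asarray(log_earnings, dtype=float)
    X = np.asarray(controls, dtype=float).reshape(len(y), -1)
    levels, codes = np.unique(np.asarray(occupation), return_inverse=True)
    dummies = np.eye(len(levels))[codes]   # one dummy per occupation
    Z = np.column_stack([X, dummies])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta                   # unexplained log earnings
    # Residual dispersion within each occupation (needs >= 2 workers each).
    return {lvl: float(resid[codes == k].std(ddof=1))
            for k, lvl in enumerate(levels.tolist())}
```

The returned dictionary maps each occupation code to its residual standard deviation, which can then be merged onto individual records by occupation.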

Figure 2: Risk-Return Relationship for Broad Occupational Categories. [Scatter plot with fitted values. Horizontal axis: Mean of Log Yearly Wage; vertical axis: Risk.]

Measuring Risk Aversion in AddHealth

In the AddHealth data, wave 1, there are various measures that can be used to proxy risk aversion:

1. helmet usage when riding a bicycle: 0-never,...,4-always;
2. frequency of riding a motorcycle during last 12 months: 0-never,...,4-almost every day;
3. frequency of wearing a seatbelt in a car: 0-never,...,4-always;
4. frequency of smoking marijuana during lifetime: range 1 to 950 times;
5. frequency of doing something dangerous because dared to: 0-never,...,6-nearly every day.

We use a factor analysis to reduce the dimensionality of these measures, where the five described variables enter. Since the variables (except 'marijuana use') are categorical, we first compute the polychoric correlation matrix of these variables, taking into account that we have a mix of categorical and non-categorical variables. We then perform the factor analysis using this correlation matrix (rather than the raw variables) as an input. This returns two factors. The rotated factor loadings are displayed below:

Table 21: Rotated Factor Loadings (Risk Aversion)

             Factor 1   Factor 2
Helmet       -0.2467    0.1900
Motorcycle   0.4789     -0.0592
Seatbelt     -0.3274    0.1373
Marijuana    0.2084     0.0620
Dare         0.4782     -0.0388
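To make the step from a correlation matrix to factor loadings concrete, the sketch below extracts unrotated loadings by the principal-component method. It is an illustration under assumptions: the text does not state the extraction or rotation method, the polychoric correlation matrix is taken as given, no rotation is applied, and the function name principal_factor_loadings is hypothetical.

```python
import numpy as np

def principal_factor_loadings(R, n_factors):
    """Unrotated loadings from a correlation matrix R (principal-component method).

    Each loading column is an eigenvector of R scaled by the square root
    of its eigenvalue, keeping the n_factors largest eigenvalues.
    """
    R = np.asarray(R, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(R)            # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_factors]   # largest first
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    loadings = eigvecs[:, order] * scale
    # Sign convention: largest-magnitude loading of each factor is positive.
    rows = np.abs(loadings).argmax(axis=0)
    signs = np.sign(loadings[rows, np.arange(loadings.shape[1])])
    signs[signs == 0] = 1.0
    return loadings * signs
```

Keeping all factors reproduces the correlation matrix exactly (loadings times their transpose equals R); keeping only the leading ones gives the low-dimensional summary that is then rotated and interpreted.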

The first factor seems reasonable in terms of signs and it also has non-trivial loadings: it loads positively on variables that indicate "risk-loving" behavior (motorcycle, marijuana and danger due to dare) and it loads negatively on variables that indicate "risk-averse" behavior (seatbelt and helmet). The second factor, however, does not have a clear interpretation and the loadings are much lower. We therefore choose to include factor 1, the non-trivial factor that is easily interpreted as a proxy for "risk-loving" behavior, as a control variable in our regressions. Our results are robust to additionally including factor 2 as a regressor.

Correlations: Table 22 shows the correlations between the proxies for risk aversion (raw data) and factor 1 as well as between these measures of risk aversion and our network characteristics of interest. We make the following observations: First, quite intuitively, our raw measures of risky behavior (marijuana, dare and motorcycle) are positively correlated with each other but negatively correlated with our proxies for risk aversion (seatbelt and helmet). Similarly, the risk aversion factor obtained from the factor analysis (which can be interpreted as a proxy for risk-loving behavior) is strongly positively correlated with our proxies for risk-loving behavior (marijuana, dare and motorcycle) and negatively correlated with our proxies for risk aversion (seatbelt and helmet). Perhaps surprisingly, there is no notable correlation between the network characteristics (degree and clustering coefficient) on the one hand and the various measures of risk aversion on the other.

Table 22: Cross-correlation table: Proxies for Risk Aversion and Network Characteristics

Variables             Helmet      Motorcycle  Seatbelt    Marijuana   Dare        Clustering  Degree      Risk Aversion Factor
Helmet                1.000
Motorcycle            -0.069      1.000
                      (0.000)
Seatbelt              0.104       -0.110      1.000
                      (0.000)     (0.000)
Marijuana             -0.004      0.081       -0.097      1.000
                      (0.777)     (0.000)     (0.000)
Dare                  -0.069      0.207       -0.124      0.110       1.000
                      (0.000)     (0.000)     (0.000)     (0.000)
Clustering            -0.020      0.012       0.006       -0.007      0.007       1.000
                      (0.117)     (0.351)     (0.656)     (0.563)     (0.565)
Degree                -0.016      0.028       0.028       -0.018      0.015       -0.039      1.000
                      (0.207)     (0.026)     (0.024)     (0.160)     (0.235)     (0.002)
Risk Aversion Factor  -0.327      0.686       -0.476      0.344       0.697       0.007       0.015       1.000
                      (0.000)     (0.000)     (0.000)     (0.000)     (0.000)     (0.577)     (0.257)

P-values in parentheses.

Measuring Competitiveness in AddHealth

In AddHealth, wave 1, there are various measures that can be used to proxy competitiveness:

1. participation in high school sport teams in year of interview (list of sports: wrestling, swimming, football, track, tennis, soccer, ice hockey, field hockey, basketball, baseball): 0-no, 1-yes;
2. intensity of active sport, such as baseball, softball, basketball, soccer, swimming, or football, during past week: 0-not at all,...,3-five or more times;
3. participation in high school debate club in year of interview: 0-no, 1-yes.

To capture measure 1 in a more compact way, we create a new variable that indicates how many high school sports the respondent participates in. We therefore end up with three proxies for competitiveness: Measure 1 captures participation in sports and, to some extent, the intensity of this participation. Measure 2 captures the intensity with which the respondent participates in sports. Measure 3 captures participation in a non-sport, competitive activity: debating. (Note that we experimented with two alternative ways of creating Measure 1: (i) indicating whether the respondent participates in high school sports or not (instead of the number of sports the respondent participates in); or (ii) indicating whether the respondent participates in one of the most competitive sports like football, swimming, track etc. The results remain similar and are available upon request.)

We use factor analysis to reduce the dimensionality of these measures. Since the three variables of interest are categorical, we first compute the polychoric correlation matrix of these variables. We then perform the factor analysis using this correlation matrix (rather than the raw variables) as an input, which returns two factors. The rotated factor loadings are displayed below:

Table 23: Rotated Factor Loadings (Competitiveness)

                    Factor 1   Factor 2
High School Sport   0.4518     0.1269
Active Sport        0.4711     -0.0654
Debate Club         0.0401     0.2837

The first factor positively correlates with all three proxies for competitiveness and loads highly on sport participation (High School Sport) and intensity (Active Sport). The second factor, in turn, loads less on the sport variables but has a higher loading on "debate". In our opinion, each factor plausibly captures a certain facet of "competitiveness" and we therefore include both of them as controls in our regressions. Our results do not change if we only include factor 1 as a regressor.

Correlations: Table 24 shows the correlations between the proxies for competitiveness (raw data) and the factors as well as between these measures of competitiveness and our network characteristics of interest. We make the following observations: First, our raw measures of competitiveness are positively correlated (except for a negative but insignificant correlation between 'active sport' and 'debate club'). The factors from above (which can be interpreted as proxies for competitiveness) strongly positively correlate with our proxies for competitiveness. Last, we are interested in how our network characteristics correlate with competitiveness. There is no notable correlation between the clustering coefficient and the various measures of competitiveness. There is, however, a positive correlation between degree and competitiveness, where competitiveness is measured either by the factors or by the raw measures of sport participation and intensity.

Table 24: Cross-correlation table: Proxies for Competitiveness and Network Characteristics

Variables                 High School Sport  Active Sport  Debate Club  Clustering  Degree    Comp. Factor 1  Comp. Factor 2
High School Sport         1.000
Active Sport              0.295              1.000
                          (0.000)
Debate Club               0.109              -0.010        1.000
                          (0.000)            (0.440)
Clustering                0.042              0.021         0.012        1.000
                          (0.001)            (0.100)       (0.347)
Degree                    0.138              0.113         0.004        -0.039      1.000
                          (0.000)            (0.000)       (0.759)      (0.002)
Competitiveness Factor 1  0.827              0.768         0.190        0.040       0.155     1.000
                          (0.000)            (0.000)       (0.000)      (0.001)     (0.000)
Competitiveness Factor 2  0.693              0.217         0.791        0.034       0.091     0.671           1.000
                          (0.000)            (0.000)       (0.000)      (0.006)     (0.000)   (0.000)

P-values in parentheses.

Table 25: Earnings and Network Characteristics (Degree)

                                  (1)           (2)           (3)           (4)           (5)           (6)           (7)
                                  Log Earnings  Log Earnings  Log Earnings  Log Earnings  Log Earnings  Log Earnings  Log Earnings
Clustering Coefficient            0.194*        0.187*        0.171         0.166         0.166         0.163         0.163
                                  (0.106)       (0.105)       (0.105)       (0.105)       (0.105)       (0.105)       (0.105)
Degree                            0.00628       0.0158        0.0101        0.0123        0.0113        0.0101        0.00931
                                  (0.101)       (0.101)       (0.100)       (0.100)       (0.100)       (0.100)       (0.100)
Occupational Risk                 -0.250        -0.236        -0.232        -0.220        -0.217        -0.217        -0.214
                                  (0.185)       (0.183)       (0.183)       (0.183)       (0.183)       (0.183)       (0.183)
Risk×Clustering Coefficient       -0.336*       -0.332*       -0.303        -0.302        -0.303        -0.295        -0.296
                                  (0.197)       (0.196)       (0.196)       (0.195)       (0.195)       (0.195)       (0.195)
Risk×Degree                       0.0517        0.0335        0.0425        0.0346        0.0351        0.0427        0.0429
                                  (0.190)       (0.189)       (0.188)       (0.188)       (0.188)       (0.188)       (0.188)
Degree×Clustering Coefficient     0.0464**      0.0419**      0.0400**      0.0364*       0.0360*       0.0378*       0.0375*
                                  (0.0201)      (0.0200)      (0.0199)      (0.0200)      (0.0200)      (0.0200)      (0.0200)
Education                         -0.0269***    -0.0195**     -0.0195**     -0.0202**     -0.0190**     -0.0209**     -0.0197**
                                  (0.00854)     (0.00859)     (0.00857)     (0.00857)     (0.00864)     (0.00862)     (0.00870)
Age                               0.446**       0.470**       0.498**       0.490**       0.490**       0.498**       0.497**
                                  (0.214)       (0.212)       (0.212)       (0.212)       (0.212)       (0.212)       (0.212)
Age²                              -0.00782      -0.00844*     -0.00916*     -0.00897*     -0.00893*     -0.00915*     -0.00911*
                                  (0.00477)     (0.00474)     (0.00473)     (0.00473)     (0.00473)     (0.00473)     (0.00473)
Hours                             0.0141***     0.0120***     0.0119***     0.0117***     0.0115***     0.0117***     0.0115***
                                  (0.00169)     (0.00171)     (0.00171)     (0.00171)     (0.00172)     (0.00172)     (0.00172)
Tenure                            -0.000138***  -0.000139***  -0.000141***  -0.000140***  -0.000141***  -0.000141***  -0.000142***
                                  (0.0000449)   (0.0000447)   (0.0000446)   (0.0000446)   (0.0000446)   (0.0000446)   (0.0000446)
Kids                              0.00868       0.0378        -0.00106      0.0111        0.00954       0.00950       0.00804
                                  (0.0375)      (0.0376)      (0.0388)      (0.0389)      (0.0389)      (0.0389)      (0.0389)
Business Major                    0.106         0.118*        0.121*        0.126*        0.124*        0.125*        0.124*
                                  (0.0675)      (0.0672)      (0.0670)      (0.0669)      (0.0669)      (0.0669)      (0.0669)
Vocational Training               0.0315        0.0292        0.0260        0.0267        0.0236        0.0279        0.0250
                                  (0.0329)      (0.0327)      (0.0326)      (0.0326)      (0.0327)      (0.0326)      (0.0327)
Female                                          -0.167***     -0.141***     -0.134***     -0.126***     -0.140***     -0.132***
                                                (0.0295)      (0.0328)      (0.0328)      (0.0337)      (0.0340)      (0.0348)
Married                                                       0.198***      0.189***      0.190***      0.188***      0.189***
                                                              (0.0507)      (0.0508)      (0.0508)      (0.0508)      (0.0508)
Married×Female                                                -0.134*       -0.142**      -0.143**      -0.139**      -0.141**
                                                              (0.0693)      (0.0692)      (0.0692)      (0.0693)      (0.0693)
Risk Aversion Factor 1                                                                    0.0259                      0.0247
                                                                                          (0.0234)                    (0.0235)
Competitiveness Factor 1                                                                                -0.0410       -0.0419
                                                                                                        (0.0364)      (0.0364)
Competitiveness Factor 2                                                                                0.0975*       0.0959*
                                                                                                        (0.0554)      (0.0554)
Constant                          2.893         2.748         2.459         2.562         2.547         2.499         2.487
                                  (2.364)       (2.350)       (2.344)       (2.342)       (2.342)       (2.342)       (2.342)
Dummies Work 1995-2000            Yes           Yes           Yes           Yes           Yes           Yes           Yes
Race Dummies                      No            No            No            Yes           Yes           Yes           Yes
Observations                      2560          2560          2560          2560          2560          2560          2560
R²                                0.174         0.184         0.189         0.193         0.193         0.194         0.194

Standard errors in parentheses. ***p<0.01, **p<0.05, *p<0.1. Note: We here explain the variables that may not be self-explanatory. Risk Aversion Factor 1 is the first factor from our factor analysis to measure risk aversion, see above for details. Competitiveness Factor 1 and Competitiveness Factor 2 are the two factors returned by the factor analysis on competitiveness. Tenure gives years of tenure in the main job in wave 3. Vocational Training is a dummy equalling 1 if the respondent has received vocational training. Business Major is a dummy equalling 1 if the respondent majored in business in college. Education measures the highest grade completed. Kids is a dummy equalling 1 if the respondent has kids. Married is a dummy equalling 1 if the respondent is married in wave 3. Clustering Coefficient and Degree are the network characteristics measured for each respondent in wave 1 of AddHealth. Dummies Work 1995-2000 include a dummy for each year 1995-2000 which is 1 if the individual worked then and 0 otherwise. Occupational Risk is the occupational earnings risk measure that we computed above using the 2000 Census. Race Dummies are four dummy variables for: African American, American Indian, Asian and White.
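The core of the specification in Table 25, log earnings regressed on the network characteristics, occupational risk and their pairwise interactions, can be sketched as below. This is an illustrative simplification, not the estimation code behind the table: the function name fit_interaction_ols is hypothetical, all further controls (education, age, hours, and so on) are left out, and no standard errors are computed.

```python
import numpy as np

def fit_interaction_ols(log_earnings, clustering, degree, risk):
    """OLS of log earnings on network characteristics, risk and interactions.

    Inputs are equal-length arrays; clustering and degree are assumed to be
    standardized as described in the data section.
    """
    y = np.asarray(log_earnings, dtype=float)
    cc = np.asarray(clustering, dtype=float)
    d = np.asarray(degree, dtype=float)
    r = np.asarray(risk, dtype=float)
    names = ["Constant", "Clustering Coefficient", "Degree",
             "Occupational Risk", "Risk x Clustering Coefficient",
             "Risk x Degree", "Degree x Clustering Coefficient"]
    # Design matrix: levels first, then the three interaction terms.
    X = np.column_stack([np.ones_like(y), cc, d, r, r * cc, r * d, d * cc])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return dict(zip(names, beta.tolist()))
```

In this setup, a negative coefficient on the risk-clustering interaction means that the return to clustering falls as occupational risk rises, which is the pattern the first columns of Table 25 point to.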


Table 26: Earnings and Network Characteristics (Outdegree)

                                  (1)           (2)           (3)           (4)           (5)           (6)           (7)
                                  Log Earnings  Log Earnings  Log Earnings  Log Earnings  Log Earnings  Log Earnings  Log Earnings
Clustering Coefficient            0.153         0.151         0.137         0.134         0.135         0.130         0.131
                                  (0.104)       (0.103)       (0.103)       (0.103)       (0.103)       (0.103)       (0.103)
Outdegree                         -0.00884      0.0114        0.0192        0.0180        0.0189        0.0133        0.0145
                                  (0.1000)      (0.0994)      (0.0992)      (0.0990)      (0.0990)      (0.0991)      (0.0991)
Occupational Risk                 -0.248        -0.234        -0.230        -0.219        -0.215        -0.216        -0.213
                                  (0.185)       (0.184)       (0.183)       (0.183)       (0.183)       (0.183)       (0.183)
Risk×Clustering Coefficient       -0.297        -0.298        -0.271        -0.272        -0.273        -0.264        -0.265
                                  (0.197)       (0.196)       (0.195)       (0.195)       (0.195)       (0.195)       (0.195)
Risk×Outdegree                    0.0648        0.0331        0.0176        0.0153        0.0141        0.0269        0.0252
                                  (0.187)       (0.186)       (0.186)       (0.185)       (0.185)       (0.186)       (0.186)
Outdegree×Clustering Coefficient  0.0204        0.0184        0.0171        0.0149        0.0152        0.0163        0.0166
                                  (0.0154)      (0.0153)      (0.0153)      (0.0153)      (0.0153)      (0.0153)      (0.0153)
Education                         -0.0265***    -0.0191**     -0.0192**     -0.0198**     -0.0186**     -0.0207**     -0.0194**
                                  (0.00854)     (0.00858)     (0.00856)     (0.00856)     (0.00862)     (0.00862)     (0.00869)
Age                               0.437**       0.462**       0.492**       0.486**       0.485**       0.493**       0.492**
                                  (0.214)       (0.212)       (0.212)       (0.212)       (0.212)       (0.212)       (0.212)
Age²                              -0.00763      -0.00831*     -0.00906*     -0.00888*     -0.00884*     -0.00906*     -0.00902*
                                  (0.00477)     (0.00474)     (0.00473)     (0.00473)     (0.00473)     (0.00473)     (0.00473)
Hours                             0.0141***     0.0120***     0.0120***     0.0117***     0.0115***     0.0117***     0.0115***
                                  (0.00169)     (0.00171)     (0.00171)     (0.00171)     (0.00172)     (0.00172)     (0.00172)
Tenure                            -0.000136***  -0.000137***  -0.000139***  -0.000138***  -0.000139***  -0.000139***  -0.000140***
                                  (0.0000450)   (0.0000447)   (0.0000446)   (0.0000446)   (0.0000446)   (0.0000446)   (0.0000446)
Kids                              0.00665       0.0365        -0.00270      0.00975       0.00799       0.00808       0.00645
                                  (0.0375)      (0.0376)      (0.0388)      (0.0389)      (0.0389)      (0.0389)      (0.0389)
Business Major                    0.110         0.123*        0.126*        0.131*        0.129*        0.130*        0.129*
                                  (0.0675)      (0.0671)      (0.0669)      (0.0669)      (0.0669)      (0.0669)      (0.0669)
Vocational Training               0.0291        0.0272        0.0241        0.0248        0.0215        0.0258        0.0226
                                  (0.0329)      (0.0327)      (0.0326)      (0.0326)      (0.0327)      (0.0326)      (0.0327)
Female                                          -0.171***     -0.144***     -0.137***     -0.127***     -0.141***     -0.133***
                                                (0.0295)      (0.0328)      (0.0329)      (0.0337)      (0.0341)      (0.0349)
Married                                                       0.199***      0.190***      0.191***      0.189***      0.190***
                                                              (0.0508)      (0.0508)      (0.0508)      (0.0508)      (0.0508)
Married×Female                                                -0.136*       -0.143**      -0.145**      -0.140**      -0.142**
                                                              (0.0693)      (0.0693)      (0.0693)      (0.0693)      (0.0693)
Risk Aversion Factor 1                                                                    0.0291                      0.0280
                                                                                          (0.0234)                    (0.0235)
Competitiveness Factor 1                                                                                -0.0363       -0.0377
                                                                                                        (0.0361)      (0.0361)
Competitiveness Factor 2                                                                                0.0959*       0.0945*
                                                                                                        (0.0554)      (0.0555)
Constant                          3.001         2.833         2.531         2.623         2.603         2.554         2.538
                                  (2.364)       (2.350)       (2.344)       (2.342)       (2.341)       (2.341)       (2.341)
Dummies Work 1995-2000            Yes           Yes           Yes           Yes           Yes           Yes           Yes
Race Dummies                      No            No            No            Yes           Yes           Yes           Yes
Observations                      2560          2560          2560          2560          2560          2560          2560
R²                                0.173         0.183         0.189         0.192         0.193         0.193         0.194

Standard errors in parentheses. ***p<0.01, **p<0.05, *p<0.1. Note: We here explain the variables that may not be self-explanatory. Risk Aversion Factor 1 is the first factor from our factor analysis to measure risk aversion, see above for details. Competitiveness Factor 1 and Competitiveness Factor 2 are the two factors returned by the factor analysis on competitiveness. Tenure gives years of tenure in the main job in wave 3. Vocational Training is a dummy equalling 1 if the respondent has received vocational training. Business Major is a dummy equalling 1 if the respondent majored in business in college. Education measures the highest grade completed. Kids is a dummy equalling 1 if the respondent has kids. Married is a dummy equalling 1 if the respondent is married in wave 3. Clustering Coefficient and Degree are the network characteristics measured for each respondent in wave 1 of AddHealth. Dummies Work 1995-2000 include a dummy for each year 1995-2000 which is 1 if the individual worked then and 0 otherwise. Occupational Risk is the occupational earnings risk measure that we computed above using the 2000 Census. Race Dummies are four dummy variables for: African American, American Indian, Asian and White.


Table 27: Earnings and Network Characteristics (Indegree)

                                  (1)           (2)           (3)           (4)           (5)           (6)           (7)
                                  Log Earnings  Log Earnings  Log Earnings  Log Earnings  Log Earnings  Log Earnings  Log Earnings
Clustering Coefficient            0.203*        0.194*        0.179*        0.175*        0.174         0.170         0.170
                                  (0.107)       (0.106)       (0.106)       (0.106)       (0.106)       (0.106)       (0.106)
Indegree                          -0.00877      0.0000372     -0.0173       -0.0117       -0.0147       -0.0109       -0.0136
                                  (0.101)       (0.100)       (0.100)       (0.1000)      (0.1000)      (0.1000)      (0.100)
Occupational Risk                 -0.249        -0.235        -0.230        -0.219        -0.216        -0.216        -0.213
                                  (0.185)       (0.184)       (0.183)       (0.183)       (0.183)       (0.183)       (0.183)
Risk×Clustering Coefficient       -0.369*       -0.361*       -0.331*       -0.329*       -0.329*       -0.322        -0.322
                                  (0.198)       (0.197)       (0.197)       (0.197)       (0.197)       (0.197)       (0.197)
Risk×Indegree                     0.0659        0.0493        0.0791        0.0639        0.0672        0.0641        0.0672
                                  (0.189)       (0.188)       (0.188)       (0.188)       (0.188)       (0.188)       (0.188)
Indegree×Clustering Coefficient   0.0525**      0.0468**      0.0461**      0.0435*       0.0427*       0.0427*       0.0420*
                                  (0.0232)      (0.0231)      (0.0230)      (0.0230)      (0.0230)      (0.0231)      (0.0231)
Education                         -0.0259***    -0.0186**     -0.0185**     -0.0192**     -0.0180**     -0.0199**     -0.0187**
                                  (0.00855)     (0.00860)     (0.00858)     (0.00857)     (0.00865)     (0.00863)     (0.00872)
Age                               0.444**       0.467**       0.496**       0.490**       0.489**       0.496**       0.495**
                                  (0.214)       (0.212)       (0.212)       (0.212)       (0.212)       (0.212)       (0.212)
Age²                              -0.00781      -0.00842*     -0.00916*     -0.00899*     -0.00895*     -0.00914*     -0.00910*
                                  (0.00477)     (0.00474)     (0.00473)     (0.00473)     (0.00473)     (0.00473)     (0.00473)
Hours                             0.0141***     0.0120***     0.0120***     0.0117***     0.0115***     0.0117***     0.0115***
                                  (0.00169)     (0.00172)     (0.00171)     (0.00171)     (0.00172)     (0.00172)     (0.00172)
Tenure                            -0.000136***  -0.000138***  -0.000139***  -0.000138***  -0.000139***  -0.000139***  -0.000140***
                                  (0.0000450)   (0.0000447)   (0.0000446)   (0.0000446)   (0.0000446)   (0.0000446)   (0.0000446)
Kids                              0.00835       0.0374        -0.00164      0.0107        0.00919       0.00905       0.00762
                                  (0.0375)      (0.0376)      (0.0388)      (0.0389)      (0.0389)      (0.0389)      (0.0389)
Business Major                    0.105         0.117*        0.120*        0.126*        0.124*        0.125*        0.124*
                                  (0.0676)      (0.0672)      (0.0670)      (0.0670)      (0.0670)      (0.0670)      (0.0670)
Vocational Training               0.0295        0.0272        0.0241        0.0249        0.0219        0.0257        0.0228
                                  (0.0329)      (0.0327)      (0.0326)      (0.0326)      (0.0327)      (0.0326)      (0.0327)
Female                                          -0.167***     -0.142***     -0.135***     -0.126***     -0.139***     -0.131***
                                                (0.0295)      (0.0328)      (0.0328)      (0.0337)      (0.0341)      (0.0349)
Married                                                       0.198***      0.189***      0.190***      0.188***      0.189***
                                                              (0.0508)      (0.0508)      (0.0508)      (0.0508)      (0.0509)
Married×Female                                                -0.129*       -0.137**      -0.138**      -0.135*       -0.136**
                                                              (0.0693)      (0.0692)      (0.0692)      (0.0693)      (0.0693)
Risk Aversion Factor 1                                                                    0.0253                      0.0240
                                                                                          (0.0235)                    (0.0235)
Competitiveness Factor 1                                                                                -0.0332       -0.0339
                                                                                                        (0.0362)      (0.0362)
Competitiveness Factor 2                                                                                0.0879        0.0863
                                                                                                        (0.0553)      (0.0553)
Constant                          2.916         2.775         2.479         2.570         2.554         2.515         2.503
                                  (2.364)       (2.349)       (2.344)       (2.341)       (2.341)       (2.341)       (2.342)
Dummies Work 1995-2000            Yes           Yes           Yes           Yes           Yes           Yes           Yes
Race Dummies                      No            No            No            Yes           Yes           Yes           Yes
Observations                      2560          2560          2560          2560          2560          2560          2560
R²                                0.173         0.184         0.189         0.192         0.193         0.193         0.194

Standard errors in parentheses. ***p<0.01, **p<0.05, *p<0.1. Note: We here explain the variables that may not be self-explanatory. Risk Aversion Factor 1 is the first factor from our factor analysis to measure risk aversion, see above for details. Competitiveness Factor 1 and Competitiveness Factor 2 are the two factors returned by the factor analysis on competitiveness. Tenure gives years of tenure in the main job in wave 3. Vocational Training is a dummy equalling 1 if the respondent has received vocational training. Business Major is a dummy equalling 1 if the respondent majored in business in college. Education measures the highest grade completed. Kids is a dummy equalling 1 if the respondent has kids. Married is a dummy equalling 1 if the respondent is married in wave 3. Clustering Coefficient and Degree are the network characteristics measured for each respondent in wave 1 of AddHealth. Dummies Work 1995-2000 include a dummy for each year 1995-2000 which is 1 if the individual worked then and 0 otherwise. Occupational Risk is the occupational earnings risk measure that we computed above using the 2000 Census. Race Dummies are four dummy variables for: African American, American Indian, Asian and White.


Table 28: Earnings Dynamics and Network Characteristics (Degree) (1) Wage Growth 0.132 (0.102)

(2) Wage Growth 0.135 (0.102)

(3) Wage Growth 0.138 (0.102)

(4) Wage Growth 0.134 (0.102)

(5) Wage Growth 0.137 (0.102)

(6) Wage Growth 0.144 (0.102)

(7) Wage Growth 0.136 (0.102)

(8) Wage Growth 0.144 (0.102)

Degree

0.00268 (0.0698)

0.00480 (0.0700)

-0.00212 (0.0700)

-0.00544 (0.0701)

-0.00349 (0.0702)

-0.0104 (0.0703)

-0.00254 (0.0703)

-0.00942 (0.0704)

Degree×Gender Wage Gap w3

-0.0194 (0.193)

-0.0229 (0.193)

-0.00625 (0.193)

0.00693 (0.193)

0.0192 (0.194)

0.0308 (0.194)

0.0311 (0.194)

0.0433 (0.194)

(Clustering Coefficient)×(Gender Wage Gap w3)

-0.439 (0.268)

-0.449∗ (0.268)

-0.455∗ (0.268)

-0.438 (0.268)

-0.435 (0.269)

-0.454∗ (0.269)

-0.428 (0.269)

-0.448∗ (0.269)

0.115∗∗∗ (0.0198)

0.108∗∗∗ (0.0207)

0.108∗∗∗ (0.0206)

0.103∗∗∗ (0.0209)

0.101∗∗∗ (0.0210)

0.105∗∗∗ (0.0212)

0.101∗∗∗ (0.0214)

0.105∗∗∗ (0.0216)

-0.0878∗∗∗ (0.0235)

-0.0879∗∗∗ (0.0236)

-0.0892∗∗∗ (0.0236)

-0.0891∗∗∗ (0.0236)

-0.0883∗∗∗ (0.0237)

-0.0877∗∗∗ (0.0237)

-0.0890∗∗∗ (0.0238)

-0.0884∗∗∗ (0.0237)

Age (w3)

-0.923 (0.679)

-0.934 (0.680)

-0.898 (0.679)

-0.911 (0.679)

-0.981 (0.682)

-0.985 (0.682)

-0.972 (0.683)

-0.977 (0.683)

Age2 (w3)

0.0140 (0.0115)

0.0141 (0.0115)

0.0135 (0.0114)

0.0138 (0.0115)

0.0150 (0.0115)

0.0151 (0.0115)

0.0148 (0.0115)

0.0149 (0.0115)

Hours (w3)

-0.00425 (0.00367)

-0.00397 (0.00370)

0.000932 (0.00465)

0.00229 (0.00478)

0.00310 (0.00481)

0.00225 (0.00484)

0.00321 (0.00483)

0.00239 (0.00486)

Gender Wage Gap (w3)

-0.0441 (0.239)

0.0433 (0.300)

0.0961 (0.301)

-0.0369 (0.316)

-0.0995 (0.319)

-0.0615 (0.320)

-0.0877 (0.320)

-0.0510 (0.321)

Kids (w3)

-0.0393 (0.0927)

-0.0345 (0.0929)

-0.0331 (0.0927)

-0.0240 (0.0961)

-0.0318 (0.0969)

-0.0371 (0.0969)

-0.0313 (0.0971)

-0.0368 (0.0971)

Job Change (w3)

0.0216 (0.0375)

0.0250 (0.0377)

0.0217 (0.0377)

0.0207 (0.0377)

0.0191 (0.0377)

0.0184 (0.0377)

0.0196 (0.0378)

0.0187 (0.0378)

Vocational Training (w3)

-0.0310 (0.0758)

-0.0277 (0.0759)

-0.0294 (0.0758)

-0.0222 (0.0759)

-0.0170 (0.0761)

-0.0202 (0.0761)

-0.0111 (0.0765)

-0.0141 (0.0765)

Business Major (w3)

0.00427 (0.129)

0.000798 (0.129)

0.0182 (0.129)

0.00721 (0.130)

0.00915 (0.130)

0.00706 (0.130)

0.0112 (0.130)

0.00854 (0.130)

Occupational Mean Earnings (w3)

0.138 (0.130)

0.148 (0.130)

0.195 (0.134)

0.203 (0.134)

0.205 (0.134)

0.206 (0.134)

0.208 (0.134)

Occupational Risk (w3)

-0.276 (0.600)

-0.431 (0.606)

-0.311 (0.613)

-0.224 (0.615)

-0.264 (0.616)

-0.225 (0.619)

-0.260 (0.619)

0.00744∗ (0.00430)

0.00817∗ (0.00434)

0.00834∗ (0.00434)

0.00790∗ (0.00435)

0.00824∗ (0.00435)

0.00782∗ (0.00436)

Female

0.106 (0.0786)

0.113 (0.0791)

0.132 (0.0802)

0.100 (0.0808)

0.119 (0.0818)

Married (w3)

-0.0550 (0.0858)

-0.0476 (0.0862)

-0.0471 (0.0862)

-0.0478 (0.0864)

-0.0473 (0.0863)

Clustering Coefficient

Education (w3) Tenure (w3)

∆_w4, w3 Hours

Risk Aversion Factor 1

0.0719 (0.0527)

0.0727 (0.0529)

Competitiveness Factor 1

-0.0660 (0.0829)

-0.0692 (0.0829)

Competitiveness Factor 2

0.0869 (0.128)

0.0820 (0.128)

Constant Race Dummies Observations R2

15.16 (9.993)

14.11 (10.06)

13.26 (10.05)

12.88 (10.05)

13.77 (10.09)

13.77 (10.09)

13.67 (10.11)

13.68 (10.10)

No 544 0.197

No 544 0.199

No 544 0.204

No 544 0.207

Yes 544 0.212

Yes 544 0.214

Yes 544 0.213

Yes 544 0.216

Standard errors in parentheses. ***p<0.01, **p<0.05, *p<0.1. Note: We explain here the variables that may not be self-explanatory. w3 stands for wave 3 and w4 stands for wave 4 of AddHealth. The parenthetical remark (w3) indicates that the variable is measured in w3. Gender Earnings Gap in occupation j is defined as the log difference of average male earnings and average female earnings in occupation j. It is measured using data from the 2000 Census, which coincides in terms of timing with wave 3 (w3), and then merged into AddHealth occupations using an occupational crosswalk. ∆w4,w3 Hours is defined as the difference between hours worked in wave 4 and hours worked in wave 3. Risk Aversion is the first factor from our factor analysis measuring risk aversion. Competitiveness 1 and Competitiveness 2 are the two factors returned by the factor analysis on competitiveness. Tenure gives years of tenure in the job held in wave 3. Job Change is a dummy equalling 0 if the respondent is still in his/her first full-time job and one otherwise. Vocational Training is a dummy equalling 1 if the respondent has received vocational training. Business Major is a dummy equalling 1 if the respondent majored in business in college. Education measures the highest grade completed. Kids is a dummy equalling one if the respondent has kids. Married is a dummy equalling one if the respondent is married in wave 3. Clustering Coefficient and Degree are the network characteristics measured for each respondent in wave 1 of AddHealth. Dummies Work 1995-2000 include a dummy for each year 1995-2000, which is one if the individual worked in that year and zero otherwise. Race Dummies are four dummy variables for: African American, American Indian, Asian and White. We focus on a subsample of AddHealth in which individuals (i) worked full-time in w3 and w4 and (ii) have not switched occupation across waves.

Table 29: Earnings Dynamics and Network Characteristics (Indegree) (1) Wage Growth 0.136 (0.102)

(2) Wage Growth 0.138 (0.102)

(3) Wage Growth 0.141 (0.102)

(4) Wage Growth 0.140 (0.102)

(5) Wage Growth 0.135 (0.102)

(6) Wage Growth 0.138 (0.102)

(7) Wage Growth 0.146 (0.102)

(8) Wage Growth 0.137 (0.102)

(9) Wage Growth 0.146 (0.103)

In-Degree

0.0113 (0.0788)

0.0113 (0.0789)

0.00235 (0.0789)

0.00158 (0.0789)

0.00130 (0.0790)

0.00810 (0.0792)

0.00131 (0.0793)

0.00839 (0.0793)

0.00148 (0.0794)

Gender Wage Gap (w3)

-0.0419 (0.238)

0.0443 (0.299)

0.0989 (0.300)

-0.0276 (0.315)

-0.0290 (0.315)

-0.0897 (0.318)

-0.0495 (0.319)

-0.0776 (0.319)

-0.0387 (0.320)

Indegree×Gender Wage Gap w3

-0.0760 (0.208)

-0.0710 (0.208)

-0.0524 (0.208)

-0.0390 (0.208)

-0.0396 (0.208)

-0.0419 (0.208)

-0.0321 (0.208)

-0.0338 (0.209)

-0.0230 (0.209)

(Clustering Coefficient)×(Gender Wage Gap w3)

-0.450∗ (0.267)

-0.459∗ (0.268)

-0.466∗ (0.267)

-0.457∗ (0.267)

-0.443∗ (0.268)

-0.440 (0.269)

-0.460∗ (0.269)

-0.434 (0.270)

-0.455∗ (0.270)

Education (w3)

0.117∗∗∗ (0.0193)

0.110∗∗∗ (0.0203)

0.110∗∗∗ (0.0203)

0.106∗∗∗ (0.0206)

0.104∗∗∗ (0.0209)

0.102∗∗∗ (0.0209)

0.106∗∗∗ (0.0211)

0.102∗∗∗ (0.0214)

0.106∗∗∗ (0.0216)

-0.0878∗∗∗ (0.0236)

-0.0880∗∗∗ (0.0236)

-0.0891∗∗∗ (0.0236)

-0.0901∗∗∗ (0.0236)

-0.0887∗∗∗ (0.0237)

-0.0882∗∗∗ (0.0238)

-0.0875∗∗∗ (0.0237)

-0.0889∗∗∗ (0.0238)

-0.0882∗∗∗ (0.0238)

Age (w3)

-0.941 (0.678)

-0.950 (0.679)

-0.909 (0.678)

-0.934 (0.678)

-0.910 (0.680)

-0.978 (0.683)

-0.981 (0.682)

-0.968 (0.684)

-0.971 (0.683)

Age2 (w3)

0.0142 (0.0114)

0.0143 (0.0115)

0.0137 (0.0114)

0.0141 (0.0114)

0.0138 (0.0115)

0.0149 (0.0115)

0.0150 (0.0115)

0.0147 (0.0115)

0.0148 (0.0115)

Hours (w3)

-0.00412 (0.00366)

-0.00386 (0.00369)

0.00106 (0.00464)

0.00249 (0.00476)

0.00227 (0.00478)

0.00306 (0.00480)

0.00219 (0.00484)

0.00316 (0.00483)

0.00232 (0.00486)

Job Change (w3)

0.0227 (0.0375)

0.0259 (0.0376)

0.0225 (0.0376)

0.0225 (0.0376)

0.0208 (0.0377)

0.0192 (0.0377)

0.0186 (0.0377)

0.0195 (0.0378)

0.0188 (0.0378)

Vocational Training (w3)

-0.0325 (0.0758)

-0.0292 (0.0759)

-0.0305 (0.0758)

-0.0233 (0.0759)

-0.0234 (0.0760)

-0.0190 (0.0762)

-0.0218 (0.0761)

-0.0140 (0.0766)

-0.0166 (0.0765)

Business Major (w3)

0.00757 (0.129)

0.00385 (0.129)

0.0211 (0.129)

0.0145 (0.129)

0.00902 (0.130)

0.0107 (0.130)

0.00867 (0.130)

0.0124 (0.130)

0.00975 (0.130)

Occupational Mean Earnings (w3)

0.137 (0.130)

0.147 (0.130)

0.188 (0.133)

0.191 (0.134)

0.199 (0.134)

0.201 (0.134)

0.203 (0.134)

0.204 (0.134)

Occupational Risk (w3)

-0.275 (0.599)

-0.433 (0.604)

-0.314 (0.611)

-0.318 (0.612)

-0.234 (0.615)

-0.276 (0.615)

-0.237 (0.618)

-0.275 (0.618)

0.00748∗ (0.00429)

0.00824∗ (0.00433)

0.00816∗ (0.00434)

0.00830∗ (0.00434)

0.00786∗ (0.00435)

0.00820∗ (0.00436)

0.00778∗ (0.00436)

0.102 (0.0787)

0.103 (0.0789)

0.110 (0.0795)

0.129 (0.0805)

0.0994 (0.0810)

0.117 (0.0820)

Married (w3)

-0.0540 (0.0856)

-0.0470 (0.0861)

-0.0460 (0.0860)

-0.0475 (0.0862)

-0.0464 (0.0861)

Kids (w3)

-0.0261 (0.0961)

-0.0339 (0.0969)

-0.0394 (0.0969)

-0.0336 (0.0971)

-0.0393 (0.0971)

Clustering Coefficient

Tenure (w3)

∆_w4, w3 Hours Female

Risk Aversion Factor 1

0.0725 (0.0527)

0.0732 (0.0529)

Competitiveness Factor 1

-0.0597 (0.0822)

-0.0631 (0.0822)

Competitiveness Factor 2

0.0816 (0.127)

0.0768 (0.127)

13.63 (10.12)

13.60 (10.11)

Constant Dummies Work 1995-2000 Race Dummies Observations R2

15.40 (9.978)

14.34 (10.04)

13.42 (10.04)

13.27 (10.03)

12.90 (10.06)

13.77 (10.10)

13.73 (10.09)

Yes

Yes

Yes

Yes

Yes

Yes

Yes

Yes

No 544 0.197

No 544 0.199

No 544 0.204

No 544 0.206

Yes 544 0.207

Yes 544 0.212

Yes 544 0.215

Yes 544 0.213

544 0.216

Standard errors in parentheses. ***p<0.01, **p<0.05, *p<0.1. Note: We explain here the variables that may not be self-explanatory. w3 stands for wave 3 and w4 stands for wave 4 of AddHealth. The parenthetical remark (w3) indicates that the variable is measured in w3. Gender Earnings Gap in occupation j is defined as the log difference of average male earnings and average female earnings in occupation j. It is measured using data from the 2000 Census, which coincides in terms of timing with wave 3 (w3), and then merged into AddHealth occupations using an occupational crosswalk. ∆w4,w3 Hours is defined as the difference between hours worked in wave 4 and hours worked in wave 3. Risk Aversion is the first factor from our factor analysis measuring risk aversion. Competitiveness 1 and Competitiveness 2 are the two factors returned by the factor analysis on competitiveness. Tenure gives years of tenure in the job held in wave 3. Job Change is a dummy equalling 0 if the respondent is still in his/her first full-time job and one otherwise. Vocational Training is a dummy equalling 1 if the respondent has received vocational training. Business Major is a dummy equalling 1 if the respondent majored in business in college. Education measures the highest grade completed. Kids is a dummy equalling one if the respondent has kids. Married is a dummy equalling one if the respondent is married in wave 3. Clustering Coefficient and Degree are the network characteristics measured for each respondent in wave 1 of AddHealth. Dummies Work 1995-2000 include a dummy for each year 1995-2000, which is one if the individual worked in that year and zero otherwise. Race Dummies are four dummy variables for: African American, American Indian, Asian and White. We focus on a subsample of AddHealth in which individuals (i) worked full-time in w3 and w4 and (ii) have not switched occupation across waves.


Table 30: Earnings Dynamics and Network Characteristics (Outdegree) (1) Wage Growth 0.132 (0.102)

(2) Wage Growth 0.134 (0.102)

(3) Wage Growth 0.136 (0.102)

(4) Wage Growth 0.135 (0.102)

(5) Wage Growth 0.131 (0.102)

(6) Wage Growth 0.135 (0.102)

(7) Wage Growth 0.142 (0.102)

(8) Wage Growth 0.134 (0.102)

(9) Wage Growth 0.142 (0.102)

Out-Degree

-0.0214 (0.0842)

-0.0178 (0.0844)

-0.0285 (0.0844)

-0.0330 (0.0844)

-0.0373 (0.0848)

-0.0360 (0.0851)

-0.0427 (0.0852)

-0.0338 (0.0854)

-0.0400 (0.0855)

Gender Wage Gap (w3)

-0.0527 (0.238)

0.0285 (0.299)

0.0832 (0.300)

-0.0452 (0.315)

-0.0467 (0.315)

-0.109 (0.317)

-0.0704 (0.318)

-0.0950 (0.319)

-0.0569 (0.320)

Outdegree×Gender Wage Gap w3

0.106 (0.235)

0.0922 (0.235)

0.124 (0.236)

0.128 (0.235)

0.136 (0.236)

0.146 (0.237)

0.162 (0.237)

0.155 (0.237)

0.170 (0.237)

(Clustering Coefficient)×(Gender Wage Gap w3)

-0.435 (0.267)

-0.445∗ (0.268)

-0.448∗ (0.267)

-0.440∗ (0.267)

-0.427 (0.268)

-0.425 (0.269)

-0.443 (0.269)

-0.419 (0.269)

-0.438 (0.269)

0.114∗∗∗ (0.0194)

0.108∗∗∗ (0.0204)

0.107∗∗∗ (0.0203)

0.103∗∗∗ (0.0205)

0.101∗∗∗ (0.0209)

0.0997∗∗∗ (0.0209)

0.104∗∗∗ (0.0210)

0.0994∗∗∗ (0.0213)

0.104∗∗∗ (0.0216)

-0.0879∗∗∗ (0.0235)

-0.0880∗∗∗ (0.0236)

-0.0895∗∗∗ (0.0236)

-0.0905∗∗∗ (0.0235)

-0.0892∗∗∗ (0.0236)

-0.0882∗∗∗ (0.0237)

-0.0878∗∗∗ (0.0237)

-0.0888∗∗∗ (0.0238)

-0.0883∗∗∗ (0.0237)

Age (w3)

-0.959 (0.679)

-0.966 (0.679)

-0.931 (0.678)

-0.954 (0.678)

-0.931 (0.680)

-1.002 (0.682)

-1.009 (0.682)

-0.993 (0.683)

-1.001 (0.683)

Age2 (w3)

0.0145 (0.0114)

0.0146 (0.0115)

0.0141 (0.0114)

0.0145 (0.0114)

0.0142 (0.0115)

0.0154 (0.0115)

0.0155 (0.0115)

0.0151 (0.0115)

0.0153 (0.0115)

Hours (w3)

-0.00418 (0.00366)

-0.00393 (0.00369)

0.00107 (0.00464)

0.00258 (0.00477)

0.00237 (0.00479)

0.00320 (0.00481)

0.00234 (0.00484)

0.00330 (0.00483)

0.00246 (0.00486)

Job Change (w3)

0.0230 (0.0375)

0.0262 (0.0376)

0.0228 (0.0376)

0.0226 (0.0376)

0.0209 (0.0377)

0.0194 (0.0377)

0.0187 (0.0377)

0.0201 (0.0378)

0.0192 (0.0378)

Vocational Training (w3)

-0.0283 (0.0757)

-0.0252 (0.0758)

-0.0269 (0.0757)

-0.0204 (0.0758)

-0.0206 (0.0759)

-0.0156 (0.0760)

-0.0186 (0.0760)

-0.00945 (0.0764)

-0.0123 (0.0764)

Business Major (w3)

0.00825 (0.129)

0.00399 (0.129)

0.0224 (0.129)

0.0161 (0.129)

0.0106 (0.130)

0.0132 (0.130)

0.0114 (0.130)

0.0161 (0.130)

0.0138 (0.130)

Occupational Mean Earnings (w3)

0.137 (0.130)

0.146 (0.130)

0.188 (0.133)

0.192 (0.134)

0.198 (0.134)

0.200 (0.134)

0.199 (0.134)

0.201 (0.134)

Occupational Risk (w3)

-0.255 (0.600)

-0.412 (0.605)

-0.293 (0.611)

-0.297 (0.612)

-0.209 (0.615)

-0.249 (0.615)

-0.211 (0.618)

-0.247 (0.618)

0.00760∗ (0.00430)

0.00840∗ (0.00433)

0.00832∗ (0.00434)

0.00851∗ (0.00434)

0.00806∗ (0.00435)

0.00840∗ (0.00435)

0.00797∗ (0.00436)

0.105 (0.0784)

0.106 (0.0787)

0.111 (0.0792)

0.131 (0.0804)

0.0970 (0.0811)

0.116 (0.0822)

Married (w3)

-0.0566 (0.0858)

-0.0489 (0.0863)

-0.0483 (0.0862)

-0.0488 (0.0864)

-0.0481 (0.0863)

Kids (w3)

-0.0214 (0.0960)

-0.0300 (0.0968)

-0.0352 (0.0968)

-0.0299 (0.0969)

-0.0352 (0.0969)

Clustering Coefficient

Education (w3) Tenure (w3)

∆_w4, w3 Hours Female

Risk Aversion Factor 1

0.0728 (0.0526)

0.0738 (0.0528)

Competitiveness Factor 1

-0.0702 (0.0825)

-0.0741 (0.0825)

Competitiveness Factor 2

0.0949 (0.128)

0.0907 (0.128)

14.06 (10.11)

14.09 (10.11)

Constant Dummies Work 1995-2000 Race Dummies Observations R2

15.70 (9.982)

14.59 (10.05)

13.77 (10.04)

13.59 (10.04)

13.21 (10.06)

14.15 (10.10)

14.17 (10.09)

Yes

Yes

Yes

Yes

Yes

Yes

Yes

Yes

No 544 0.197

No 544 0.199

No 544 0.204

No 544 0.207

Yes 544 0.208

Yes 544 0.212

Yes 544 0.215

Yes 544 0.214

544 0.217

Standard errors in parentheses. ***p<0.01, **p<0.05, *p<0.1. Note: We explain here the variables that may not be self-explanatory. w3 stands for wave 3 and w4 stands for wave 4 of AddHealth. The parenthetical remark (w3) indicates that the variable is measured in w3. Gender Earnings Gap in occupation j is defined as the log difference of average male earnings and average female earnings in occupation j. It is measured using data from the 2000 Census, which coincides in terms of timing with wave 3 (w3), and then merged into AddHealth occupations using an occupational crosswalk. ∆w4,w3 Hours is defined as the difference between hours worked in wave 4 and hours worked in wave 3. Risk Aversion is the first factor from our factor analysis measuring risk aversion. Competitiveness 1 and Competitiveness 2 are the two factors returned by the factor analysis on competitiveness. Tenure gives years of tenure in the job held in wave 3. Job Change is a dummy equalling 0 if the respondent is still in his/her first full-time job and one otherwise. Vocational Training is a dummy equalling 1 if the respondent has received vocational training. Business Major is a dummy equalling 1 if the respondent majored in business in college. Education measures the highest grade completed. Kids is a dummy equalling one if the respondent has kids. Married is a dummy equalling one if the respondent is married in wave 3. Clustering Coefficient and Degree are the network characteristics measured for each respondent in wave 1 of AddHealth. Dummies Work 1995-2000 include a dummy for each year 1995-2000, which is one if the individual worked in that year and zero otherwise. Race Dummies are four dummy variables for: African American, American Indian, Asian and White. We focus on a subsample of AddHealth in which individuals (i) worked full-time in w3 and w4 and (ii) have not switched occupation across waves.


-0.0523∗∗∗ (0.0201)

-0.0641 (0.104)

-0.0779∗ (0.0441)

Vocational Training

Kids

No 4869

No 4869

Yes 4869

Yes 4869

Yes 4869

Yes 4869

Yes

0.0158 (0.0624)

0.0538 (0.0349)

0.0236 (0.0411)

-0.0907∗∗ (0.0361)

-0.0136∗ (0.00737)

-0.0900 (0.211)

-0.0388∗ (0.0202)

-0.0680 (0.0952)

0.0638 (0.200)

-0.0338 (0.0272)

-0.0000358 (0.0274)

-0.0140 (0.0119)

(8) Occupational Risk -0.0303∗∗∗ (0.0102)

Standard errors in parentheses. ***p<0.01, **p<0.05, *p<0.1. Note: We explain here the variables that may not be self-explanatory. Risk Aversion is the first factor from our factor analysis measuring risk aversion. Competitiveness 1 and Competitiveness 2 are the two factors returned by the factor analysis on competitiveness. Tenure gives years of tenure in the main job in wave 3. Vocational Training is a dummy equalling 1 if the respondent has received vocational training. Business Major is a dummy equalling 1 if the respondent majored in business in college. Education measures the highest grade completed. Kids is a dummy equalling one if the respondent has kids. Married is a dummy equalling one if the respondent is married in wave 3. Clustering Coefficient and Degree are the network characteristics measured for each respondent in wave 1 of AddHealth. Dummies Work 1995-2000 include a dummy for each year 1995-2000, which is one if the individual worked in that year and zero otherwise. Race Dummies are four dummy variables for: African American, American Indian, Asian and White. Log Earnings is the log of annual earnings in wave 3. The dependent variable Occupational Risk is the occupational earnings risk measure that we computed above using the 2000 Census.

No 4869

No 4869

Race Dummies Observations

Yes

Yes

Yes

Yes

Dummies Work 1995-2000

-0.0904∗∗ (0.0364)

-0.0135∗ (0.00736)

-0.0968 (0.221)

-0.0375∗ (0.0200)

-0.0656 (0.0977)

0.0642 (0.199)

-0.0344 (0.0265)

-0.00113 (0.0273)

-0.0135 (0.0119)

(7) Occupational Risk -0.0303∗∗∗ (0.0102)

0.0165 (0.0630)

0.0281 (0.0437)

-0.0921∗∗∗ (0.0348)

-0.0133∗ (0.00745)

-0.108 (0.219)

-0.0372∗ (0.0195)

-0.0682 (0.0956)

0.0622 (0.200)

-0.0381 (0.0253)

0.00250 (0.0283)

-0.00921 (0.0117)

(6) Occupational Risk -0.0287∗∗∗ (0.0104)

Competitiveness Factor 2 Yes

-0.0919∗∗∗ (0.0350)

-0.0854∗∗ (0.0391)

Yes

-0.0132∗ (0.00745)

-0.0128∗ (0.00738)

-0.0130∗ (0.00740)

-0.117 (0.230)

-0.0355∗ (0.0194)

-0.0653 (0.0983)

0.0626 (0.200)

-0.114 (0.230)

-0.0299 (0.0226)

-0.0652 (0.0985)

0.0700 (0.197)

-0.116 (0.229)

-0.0530∗∗∗ (0.0201)

-0.0668 (0.0989)

0.0722 (0.197)

-0.0389 (0.0243)

0.00129 (0.0281)

-0.00839 (0.0120)

(5) Occupational Risk -0.0286∗∗∗ (0.0105)

0.0553 (0.0366)

Yes

-0.111 (0.230)

0.0722 (0.197)

-0.0417∗ (0.0235)

-0.000786 (0.0278)

-0.00477 (0.0127)

(4) Occupational Risk -0.0282∗∗∗ (0.0101)

Competitiveness Factor 1

Risk Aversion Factor 1

Married

Log Earnings

Female

-0.0682 (0.0986)

0.0653 (0.207)

Business Major

-0.0447∗ (0.0245)

-0.0466∗ (0.0249)

-0.0422 (0.0318)

Age

0.000202 (0.0279)

-0.00517 (0.0129)

(3) Occupational Risk -0.0286∗∗∗ (0.00998)

0.00136 (0.0279)

-0.00349 (0.0281)

Education

-0.00574 (0.0130)

-0.00641 (0.0129)

(2) Occupational Risk -0.0292∗∗∗ (0.0101)

Degree

Clustering Coefficient

(1) Occupational Risk -0.0294∗∗∗ (0.0101)

Table 31: Occupational Choice – Ordered Probit Wave 3

Table 32: Occupational Choice – Ordered Probit Wave 3 (Marginal Effects) (1) Occupational Risk (Health Care Support – Low Risk) 0.00157∗∗∗ (0.000528)

(2) Occupational Risk (Management – High Risk) -0.000862∗∗∗ (0.000289)

Degree

0.000726 (0.000617)

-0.000398 (0.000338)

Education

0.00000185 (0.00142)

-0.00000102 (0.000779)

Age

0.00175 (0.00141)

-0.000961 (0.000773)

Business Major (d)

-0.00332 (0.0124)

0.00176 (0.00603)

Vocational Training (d)

0.00352 (0.00792)

-0.00197 (0.00328)

Kids (d)

0.00201 (0.00263)

-0.00112 (0.00167)

Female (d)

0.00466 (0.0116)

-0.00254 (0.00699)

Log Earnings

0.000705∗ (0.000382)

-0.000386∗ (0.000209)

Married (d)

0.00468 (0.00508)

-0.00265 (0.00269)

Risk Aversion Factor 1

-0.00123 (0.00213)

0.000671 (0.00117)

Competitiveness Factor 1

-0.00279 (0.00181)

0.00153 (0.000990)

Competitiveness Factor 2

-0.000818 (0.00324) Yes

0.000448 (0.00177) Yes

Race Dummies

Yes

Yes

Observations

4869

4869

Clustering Coefficient

Dummies Work 1995-2000

Clustered standard errors at occupation level in parentheses. ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01. The probability of choosing Management Occupations = .0459; the probability of choosing Health Care Support Occupations = .055. (d) for discrete change of dummy variable from 0 to 1. For a description of the remaining variables, see Table 31.


-0.0766 (0.0767)

Kids

Yes 5339

Yes 5339

Yes 5339

Yes 5339

Yes

0.0457 (0.0439)

-0.0157 (0.0398)

0.0287 (0.0398)

0.00633 (0.0469)

-0.0174 (0.205)

-0.0703 (0.0671)

-0.00932 (0.0542)

0.226 (0.148)

-0.00620 (0.0113)

0.00358 (0.0430)

0.0205∗∗ (0.00890)

-0.00346 (0.00719)

(7) Occupational Risk

Standard errors in parentheses. ***p<0.01, **p<0.05, *p<0.1. Note: We explain here the variables that may not be self-explanatory. Risk Aversion is the first factor from our factor analysis measuring risk aversion. Competitiveness 1 and Competitiveness 2 are the two factors returned by the factor analysis on competitiveness. Tenure gives years of tenure in the main job in wave 3. Vocational Training is a dummy equalling 1 if the respondent has received vocational training. Business Major is a dummy equalling 1 if the respondent majored in business in college. Education measures the highest grade completed. Kids is a dummy equalling one if the respondent has kids. Married is a dummy equalling one if the respondent is married in wave 3. Clustering Coefficient and Degree are the network characteristics measured for each respondent in wave 1 of AddHealth. Dummies Work 1995-2000 include a dummy for each year 1995-2000, which is one if the individual worked in that year and zero otherwise. Race Dummies are four dummy variables for: African American, American Indian, Asian and White. Log Earnings is the log of annual earnings in wave 4. The dependent variable Occupational Risk is the occupational earnings risk measure that we computed above. Note that we do not control for marital status because we do not have this information in wave 4.

No 5339

No 5339

Race Dummies Observations

No 5339

Yes

Yes

Yes

0.00687 (0.0468)

-0.0261 (0.209)

-0.0686 (0.0681)

-0.00649 (0.0549)

0.227 (0.148)

-0.00683 (0.0112)

Dummies Work 1995-2001

0.0292 (0.0411)

0.00649 (0.0468)

-0.0166 (0.203)

-0.0708 (0.0674)

-0.00924 (0.0540)

0.227 (0.148)

-0.00622 (0.0106)

0.00241 (0.0429)

0.0212∗∗ (0.00908)

-0.00356 (0.00722)

(6) Occupational Risk

0.0460 (0.0439) Yes

0.00710 (0.0466)

-0.0261 (0.207)

-0.0690 (0.0685)

-0.00635 (0.0548)

0.227 (0.148)

-0.00701 (0.0107)

0.00409 (0.0429)

0.0209∗∗ (0.00951)

-0.00303 (0.00712)

(5) Occupational Risk

Competitiveness Factor 2 Yes

0.00873 (0.0477)

-0.0292 (0.207)

-0.0713 (0.0690)

-0.00496 (0.0540)

0.227 (0.147)

-0.00930 (0.0102)

0.00297 (0.0429)

0.0217∗∗ (0.00996)

-0.00307 (0.00713)

(4) Occupational Risk

-0.0136 (0.0417)

Yes

-0.0319 (0.213)

-0.0720 (0.0692)

-0.00487 (0.0539)

0.229 (0.149)

-0.00904 (0.00984)

0.00197 (0.0428)

0.0261∗∗ (0.0105)

-0.00115 (0.00699)

(3) Occupational Risk

Competitiveness Factor 1

Risk Aversion Factor 1

Log Earnings

Female

-0.00338 (0.0570)

Vocational Training

-0.00802 (0.00843)

Age

0.229 (0.149)

0.00180 (0.0430)

Education

Business Major

0.0262∗∗ (0.0105)

0.0262∗∗ (0.0106)

Degree 0.00293 (0.0445)

-0.00119 (0.00697)

(2) Occupational Risk

-0.00105 (0.00667)

Occupational Risk Clustering Coefficient

(1) Occupational Risk

Table 33: Occupational Choice – Ordered Probit Wave 4

Table 34: Occupational Choice – Ordered Probit Wave 4 (Marginal Effects) (1) Occupational Risk (Health Care Support – Low Risk) 0.000177 (0.000368)

(2) Occupational Risk (Management – High Risk) -0.000244 (0.000507)

Degree

-0.00105∗∗ (0.000456)

0.00145∗∗ (0.000628)

Education

-0.000183 (0.00220)

0.000253 (0.00303)

Log Earnings

-0.000324 (0.00240)

0.000446 (0.00331)

Age

0.000318 (0.000580)

-0.000438 (0.000800)

Business Major (d)

-0.0119 (0.0157)

0.0154 (0.0195)

Vocational Training (d)

0.000476 (0.00307)

-0.000658 (0.00406)

Kids (d)

0.00359 (0.00546)

-0.00496 (0.00701)

Female (d)

0.000889 (0.0107)

-0.00122 (0.0145)

Risk Aversion Factor 1

-0.00147 (0.00204)

0.00202 (0.00281)

Competitiveness Factor 1

0.000804 (0.00204)

-0.00111 (0.00281)

Competitiveness Factor 2

-0.00234 (0.00225)

0.00322 (0.00310)

Race Dummies

Yes

Yes

Dummies for Work 1995-2001

Yes

Yes

Observations

5339

5339

Clustering Coefficient

Clustered standard errors at occupation level in parentheses. ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01. The probability of selecting into Health Care Support = .0629; the probability of selecting into Management = .078. (d) for discrete change of dummy variable from 0 to 1. For a description of the remaining variables, see Table 33.


Table 35: Summary Statistics of Performance Measures in Google Scholar Subsample

               Min        Max             Mean          Std
citations      0.00000    679167.00000    6205.23895    21411.60298
h-index        0.00000    262.00000       20.79069      24.84482
i10-index      0.00000    1973.00000      60.22703      141.29833
Observations   28 816

Table 36: Correlation Matrix of Performance Measures in Google Scholar Subsample

               citations   h-index   i10-index
citations      1.00        0.77      0.60
h-index        0.77        1.00      0.78
i10-index      0.60        0.78      1.00
Observations   28 816

Table 37: Network Characteristics and Performance

                         h-index         i10-index       citations
                         (1)             (2)             (3)
Degree                   0.17472∗∗∗      1.27925∗∗∗      58.56022∗∗∗
                         (0.00677)       (0.03813)       (5.92192)
Clustering Coefficient   −11.31338∗∗∗    −35.17516∗∗∗    −3,056.21900∗
                         (1.90668)       (10.74617)      (1,668.89000)
Constant                 20.31973∗∗∗     50.89976∗∗∗     5,978.61200∗∗∗
                         (0.29530)       (1.66432)       (258.47070)
Observations             28,816          28,816          28,816
R2                       0.02794         0.04214         0.00405

Standard errors in parentheses. ∗ p<0.1; ∗∗ p<0.05; ∗∗∗ p<0.01.


Table 38: Network Characteristics by Gender

                         Female       Male         Difference
Degree                   9.786681     11.167021    -1.38034∗∗∗ (-5.2232)
Clustering Coefficient   0.1237841    0.1187366    0.0050475∗∗∗ (5.3023)
Observations             28 816

t-statistics in parentheses. ∗ p<0.1; ∗∗ p<0.05; ∗∗∗ p<0.01

Table 39: Gender and Performance

               h-index        i10-index      citations
               (1)            (2)            (3)
Gender         −2.02864∗∗∗    2.11412        226.02580
               (0.30209)      (1.71646)      (261.42460)
Constant       21.58399∗∗∗    59.46930∗∗∗    6,147.40400∗∗∗
               (0.18607)      (1.05722)      (161.01990)
Observations   28,816         28,816         28,816
R2             0.00156        0.00005        0.00003

Standard errors in parentheses. ∗ p<0.1; ∗∗ p<0.05; ∗∗∗ p<0.01

Table 40: Network Characteristics and Performance

                         h-index         i10-index       citations
                         (1)             (2)             (3)
Degree                   0.17380∗∗∗      1.28141∗∗∗      58.73139∗∗∗
                         (0.00676)       (0.03814)       (5.92349)
Clustering Coefficient   −11.04119∗∗∗    −35.81342∗∗∗    −3,106.91600∗
                         (1.90617)       (10.74851)      (1,669.37800)
Gender                   −1.73301∗∗∗     4.06366∗∗       322.77700
                         (0.29813)       (1.68112)       (261.09910)
Constant                 20.95413∗∗∗     49.41219∗∗∗     5,860.45500∗∗∗
                         (0.31466)       (1.77432)       (275.57460)
Observations             28,816          28,816          28,816
R2                       0.02908         0.04233         0.00410

Standard errors in parentheses. ∗ p<0.1; ∗∗ p<0.05; ∗∗∗ p<0.01


Table 41: Occupational Gender Earnings Gap (Year 2000) (1) Wage Gap -0.198∗∗∗ (0.000877)

(2) Wage Gap -0.246∗∗∗ (0.000769)

Education Gap

3.488∗∗∗ (0.00164)

3.242∗∗∗ (0.00145)

Hours Gap

1.558∗∗∗ (0.00128)

0.906∗∗∗ (0.00127)

Share Female

0.0177∗∗∗ (0.000198)

0.0411∗∗∗ (0.000175)

Share White

-0.408∗∗∗ (0.00154)

-0.457∗∗∗ (0.00135)

Share Black

-1.394∗∗∗ (0.00194)

-0.943∗∗∗ (0.00174)

Experience Gap

0.584∗∗∗ (0.000540)

Occupational Risk Constant Observations Adjusted R2

0.609∗∗∗ (0.00137) 3801304 0.804

0.345∗∗∗ (0.00122) 3801304 0.850

Standard errors in parentheses. ***p<0.01, **p<0.05, *p<0.1. Note: The regressions are based on US Census data, year 2000, which coincides in terms of timing with wave 3 of AddHealth. Variable Gender Earnings Gap is defined as the log difference of average male earnings and average female earnings in an occupation. Variable Experience Gap is the log difference between male and female average potential experience in an occupation. Education Gap is the log difference between male and female average years of schooling in an occupation. Hours Gap is the log difference in hours worked between men and women in an occupation. Share Female is the fraction of workers in an occupation that are female. Share White is the fraction of workers in an occupation that are white. Share Black is the fraction of workers in an occupation that are African American. Occupational Risk is the unexplained occupational earnings risk, for details see above.
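The gap variables described in this note are all log differences between male and female occupational averages. As a minimal sketch of that definition, the following Python snippet computes them for a single occupation. The numbers are hypothetical placeholders and are not taken from the Census data; the function name `log_gap` is likewise only illustrative.

```python
import math


def log_gap(male_mean, female_mean):
    """Log difference between the male and the female average of a variable."""
    if male_mean <= 0 or female_mean <= 0:
        raise ValueError("averages must be strictly positive to take logs")
    return math.log(male_mean) - math.log(female_mean)


# Hypothetical occupation-level averages (male, female); illustration only.
occupation = {
    "earnings": (52000.0, 40000.0),
    "schooling": (14.0, 14.0),
    "hours": (44.0, 40.0),
}

gaps = {name: log_gap(m, f) for name, (m, f) in occupation.items()}
for name, gap in gaps.items():
    print(f"{name} gap: {gap:.4f}")
```

A positive value means the male average exceeds the female average; equal averages give a gap of zero.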


Table 42: Occupational Gender Earnings Gap (Year 1990) (1) Wage Gap -0.602∗∗∗ (0.000868)

(2) Wage Gap -0.482∗∗∗ (0.000814)

Education Gap

1.412∗∗∗ (0.00140)

1.366∗∗∗ (0.00129)

Hours Gap

1.715∗∗∗ (0.00138)

0.875∗∗∗ (0.00168)

Share Female

0.0921∗∗∗ (0.000200)

0.128∗∗∗ (0.000190)

Share White

-0.957∗∗∗ (0.00263)

-1.194∗∗∗ (0.00244)

Share Black

-2.123∗∗∗ (0.00320)

-1.925∗∗∗ (0.00295)

Experience Gap

0.594∗∗∗ (0.000777)

Occupational Risk Constant Observations Adjusted R2

1.225∗∗∗ (0.00248) 3233742 0.681

1.147∗∗∗ (0.00229) 3233742 0.730

Standard errors in parentheses. ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01. Note: These regressions are based on the 1990 US Census. For a description of the variables, see Table 41.

Table 43: Occupational Gender Earnings Gap (Years 2005-2007) (1) Wage Gap -0.102∗∗∗ (0.00101)

(2) Wage Gap -0.258∗∗∗ (0.000867)

Education Gap

3.049∗∗∗ (0.00187)

3.055∗∗∗ (0.00158)

Hours Gap

2.028∗∗∗ (0.00167)

1.257∗∗∗ (0.00160)

Share Female

0.0484∗∗∗ (0.000229)

0.0437∗∗∗ (0.000193)

Share White

-0.680∗∗∗ (0.00138)

-0.582∗∗∗ (0.00117)

Share Black

-1.488∗∗∗ (0.00176)

-0.943∗∗∗ (0.00158)

Experience Gap

0.577∗∗∗ (0.000572)

Occupational Risk Constant Observations Adjusted R2

0.785∗∗∗ (0.00117) 2512715 0.837

0.400∗∗∗ (0.00106) 2512715 0.884

Standard errors in parentheses. ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01. These regressions are based on the American Community Survey, 2005-2007. For a description of the variables, see Table 41.


Theory Appendix C: Omitted Details of Model and Derivations

Derivation of $s_i$

The probability that one agent is chosen is given by
$$Pr(K) = \frac{N-1}{\frac{1}{2}N(N-1)} = \frac{2}{N},$$
and the probability that this agent $i$ is linked to the suggested project partner $j$, given that he is selected, by
$$Pr(g_{ij} = 1 \mid K) = \frac{D_i}{N-1}.$$
Then, the probability of being chosen and being partnered with a friend is
$$s_i \equiv Pr(g_{ij} = 1 \wedge K) = Pr(g_{ij} = 1 \mid K)\,Pr(K) = \frac{2D_i}{N(N-1)}.$$
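The expression for $s_i$ can be checked by brute force on a small network. The sketch below assumes, in line with the derivation, that each of the $N(N-1)/2$ unordered pairs of agents is equally likely to be drawn for a project; the 4-agent adjacency matrix and the function names are hypothetical and serve only as an illustration.

```python
from fractions import Fraction
from itertools import combinations


def s_closed_form(adjacency, i):
    """Closed form s_i = 2 * D_i / (N * (N - 1)) from the derivation."""
    n = len(adjacency)
    degree = sum(adjacency[i])  # D_i: number of direct links of agent i
    return Fraction(2 * degree, n * (n - 1))


def s_by_enumeration(adjacency, i):
    """Enumerate all N(N-1)/2 equally likely pairs and count those that
    contain agent i together with one of i's friends."""
    n = len(adjacency)
    pairs = list(combinations(range(n), 2))
    hits = sum(1 for a, b in pairs if i in (a, b) and adjacency[a][b] == 1)
    return Fraction(hits, len(pairs))


# Hypothetical 4-agent network with links 0-1, 0-2, 1-2 and 2-3.
example = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]

for agent in range(len(example)):
    print(agent, s_closed_form(example, agent), s_by_enumeration(example, agent))
```

Both functions return the same value for every agent, and the value rises linearly in the agent's degree $D_i$.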

Peer Pressure and Relationship Quality

We outline here formally how a project outcome affects the relationships of workers. As mentioned previously, whether the project of workers $i$ and $j$ was a success, $S$, or a failure, $F$, is publicly observable and denoted by $\omega \in \Omega = \{S, F\} \times \{1, 2, \ldots, N\}^2$. As an example, $\omega = S12$ means that a project was successfully completed by workers 1 and 2. We also condition on the workers who carried out the project, as we care not only about whether the project was successful but also about the workers who were involved.

Each project failure induces some bad relationships in the network $g$. The network that contains the links that signify a bad relationship is denoted by $g_b \subset g$. The specific network $g_b$ that arises after $Fij$, that is, a project failure between workers $i$ and $j$ where $g_{ij} = 1$, is given by
$$g_b(Fij) = \{\{ij, il, jl\} \mid g_{il} = 1 \wedge g_{jl} = 1, \forall l\}.$$
Workers $i$ and $j$ have a bad relationship with each other if their joint project fails. But a worker $l$ who is connected to both $i$ and $j$ also has a bad relationship with both of them. Denote by $g_g(Fij) = g \setminus g_b(Fij)$ the good relationships in the network $g$. Let $\gamma_g \in g_g$ and $\gamma_b \in g_b$. Further, for any $i, j$, $g_g(Sij) = g$.

Perfect Public Equilibrium

The relationship quality between two directly connected workers constitutes a state, $\gamma \in \Gamma = \{\gamma_g, \gamma_b\}$. Also, recall the publicly observed signals $y \in Y = \{0, 1, \ldots, n_{max}\}$, where $n_{max} = \max_i n_i$. We can define a pure public strategy $\sigma : \Gamma \times Y \to E$, which maps from the relationship state and the signals into the action space. Due to our restriction to public strategies, the equilibrium concept applied is that of a public perfect equilibrium. We index the variables in the second period by a prime.

Definition 2. A public perfect equilibrium (PPE) is a profile of public strategies $\sigma$ that for any state $\gamma, \gamma' \in \Gamma$ and for any signal realization $y, y' \in Y$ specifies a Nash equilibrium for the repeated game,

72

i.e. in the first period, σ(γg , y) is a Nash equilibrium and in the second period σ 0 (γ 0 , y 0 ) is a Nash equilibrium. We restrict attention to the strategies according to which agents exert high effort if the relationship to the project partner is good and zero effort otherwise. This implies for period one and period two strategies: Period 1: ∀ y

σ(γg , y) > 0

Period 2: ∀ y 0

σ 0 (γg0 , y 0 ) > 0

and

σ 0 (γb0 , y 0 ) = 0.
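The construction of the bad-relationship network $g_b(Fij)$ after a failed project can be sketched in a few lines. This is only an illustration: the adjacency-dictionary representation, the function names and the example network are assumptions made here for concreteness.

```python
def bad_links_after_failure(adjacency, i, j):
    """Links that turn bad after a failed project of linked workers i and j.

    The link ij turns bad, and so do il and jl for every worker l
    who is connected to both i and j.
    """
    if j not in adjacency[i]:
        raise ValueError("workers i and j must be linked (g_ij = 1)")
    bad = {frozenset((i, j))}
    for l in adjacency[i] & adjacency[j]:  # common neighbours of i and j
        bad.add(frozenset((i, l)))
        bad.add(frozenset((j, l)))
    return bad


def good_links_after_failure(adjacency, i, j):
    """Remaining good links, g minus g_b(Fij)."""
    all_links = {frozenset((a, b)) for a in adjacency for b in adjacency[a]}
    return all_links - bad_links_after_failure(adjacency, i, j)


# Hypothetical network: a triangle 1-2-3 plus the link 3-4.
network = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
bad_in_triangle = bad_links_after_failure(network, 1, 2)
good_in_triangle = good_links_after_failure(network, 1, 2)
bad_on_pendant = bad_links_after_failure(network, 3, 4)
```

In the clustered part of the network a single failure spoils all three links of the triangle, whereas a failure on the link 3-4, which has no common neighbour, spoils only that link. This is the sense in which clustering generates peer pressure.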

To simplify notation, the main text denotes the first and second period strategies by $e(y)$ and $e(y')$ (instead of $\sigma(\gamma_g, y)$ and $\sigma'(\gamma'_g, y')$), omitting the relationship state as an argument.

Equilibrium Selection

In our analysis, we have selected the equilibrium that induces workers to play high effort if their relationship is good and zero effort if their relationship is bad. Alternatively, agents could choose to play the static high effort PPE each period, independently of their relationship. Another possibility is to select zero effort independently of past project outcomes and signals.52 We evaluate these different equilibria according to their expected payoffs. We find that if workers always choose the payoff maximizing equilibrium, then the zero effort equilibrium will never be played. Men will do even better in volatile environments, whereas women keep their advantage in environments with little uncertainty, leaving our conclusions of Section 4 unchanged. To see this, we define the payoffs from choosing the static high effort PPE and from our proposed strategy, respectively:

\[ W_i^{stat} = s_i (1+\beta)\, E\left[f(e'(y), e'(y))\pi(y) - c(e'(y))\right], \tag{15} \]

\[ W_i^{dyn} = s_i E\left[f(e(y), e(y))\pi(y) - c(e(y))\right] + s_i \beta E\left[1 - r(1 - f(e(y), e(y)))\right] E\left[f(e'(y'), e'(y'))\pi(y') - c(e'(y'))\right]. \tag{16} \]

The equilibrium we select yields a higher payoff than the static PPE whenever $W_i^{dyn} > W_i^{stat}$. To simplify notation, we let $E[V_1] = E[f(e(y), e(y))\pi(y) - c(e(y))]$ and $E[V_2] = E[f(e'(y'), e'(y'))\pi(y') - c(e'(y'))]$. Welfare under our strategy, $W_i^{dyn}$, is higher than welfare in the static high effort PPE, $W_i^{stat}$, whenever

\[ E[V_1] - E[V_2] > \beta r_i \left(1 - E[f(e(y), e(y))]\right) E[V_2]. \tag{17} \]

52 Obviously, there are other equilibria, for instance one in which, whenever a project fails, all relationships in the network turn bad and all players then choose zero effort. Another possibility is that a good relationship leads to zero effort and a bad relationship to positive effort. We find these equilibria hard to justify and therefore use the static PPE as a benchmark. Further, endogenizing the equilibrium selection is beyond the scope of this work.
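The step from the welfare comparison $W_i^{dyn} > W_i^{stat}$ to condition (17) is a rearrangement of (15) and (16). The sketch below checks that equivalence numerically. All input values are hypothetical, the expected values $E[V_1]$, $E[V_2]$ and $E[f]$ are taken as given numbers rather than computed from the model, and a single symbol `r` is used for the parameter written $r$ in (16) and $r_i$ in (17).

```python
def static_welfare(s, beta, ev2):
    """W_stat from (15): the static payoff E[V_2] in both periods."""
    return s * (1.0 + beta) * ev2


def dynamic_welfare(s, beta, r, ev1, ev2, ef):
    """W_dyn from (16); ef stands for E[f(e(y), e(y))]."""
    prob_good = 1.0 - r * (1.0 - ef)  # E[1 - r(1 - f)], linear in f
    return s * ev1 + s * beta * prob_good * ev2


def condition_17(beta, r, ev1, ev2, ef):
    """Inequality (17): E[V_1] - E[V_2] > beta * r * (1 - E[f]) * E[V_2]."""
    return ev1 - ev2 > beta * r * (1.0 - ef) * ev2


# Hypothetical inputs (s, beta, r, E[V_1], E[V_2], E[f]).
cases = [
    (0.5, 0.9, 0.5, 0.60, 0.50, 0.8),  # (17) holds
    (0.5, 0.9, 0.5, 0.52, 0.50, 0.8),  # (17) fails
]
agreement = [
    (dynamic_welfare(s, b, r, v1, v2, f) > static_welfare(s, b, v2))
    == condition_17(b, r, v1, v2, f)
    for (s, b, r, v1, v2, f) in cases
]
```

Since $s_i$ multiplies both welfare expressions, it drops out of the comparison, which is why it does not appear in (17).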

So, if $E[V_1] - E[V_2] > 0$ and $E[f(e(y), e(y))]$ is sufficiently large, then welfare is higher under our strategy.53 An example of parameter values for which equation (17) holds is given in Table 44. We assume $f(e_i, e_j) = \sqrt{e_i e_j}$ and $c(e_i) = \frac{1}{2} e_i^2$. In this example, men exert on average lower effort than women, in both states of the world. This is not surprising given that the project value in both states of the world is fairly similar.

Table 44: Welfare Parameters

v_l   v_h   p     q     β     d_W   d_M   C_W   C_M   N
1.5   1.6   0.75  0.5   0.9   2     3     2     1     4

Notice that $E[f(e(y), e(y))]$ is large if effort is high under any signal realization. Effort does not vary greatly with the different signal realizations if the project values across states are similar, implying little uncertainty in the environment. We have shown that women exert higher effort than men in these environments, see Proposition 2. If agents always play the strategy that yields the highest payoff, then in an environment with high uncertainty the static high effort PPE will be selected, whereas in an environment with low uncertainty and relatively high payoffs, our proposed strategy is implemented. But this implies that the differences between men and women, which we discussed in Section 4, remain unchanged. Compared to our strategy, women would do even worse relative to men in uncertain environments and perform the same in situations with low uncertainty and high payoffs.

But the equilibrium that is payoff maximizing might not be selected. If a worker exerts positive effort, but his team partner shirks and exerts zero effort, then he will face a loss. So, if there is a possibility of mis-coordination, it might be better to always choose zero effort. Whether the expected payoff maximizing equilibrium or the zero effort equilibrium (which yields no losses even under mis-coordination) will be selected depends on whether payoff or risk dominant strategies should be played. The evidence for this is mixed at best (Van Huyck et al. (1990), Cooper et al. (1990), Cooper et al. (1992)). We believe that it is plausible to assume that workers are willing to risk choosing high effort, which can potentially result in a loss (namely when they trust their project partner after a good history), and that they go for the strategy that ensures a nonnegative profit after a loss and thus a bad history.

53 Note that $E[V_1] - E[V_2] > 0$ might not always be the case, although $e > e'$. To see this we consider the example given in Table 6, where $E[V_1] < E[V_2]$. The reason is that workers choose very high effort in the first period even if the project does not yield a payoff in order to avoid having a bad relationship in the second period.

Theory Appendix D: Proofs

Throughout, we make the following assumption on $f$:

Assumption 1. The success probability function $f$ satisfies:

1. Symmetry: $f(e_i, e_j) = f(e_j, e_i)$.
2. $f_1(e_i, e_j) > 0$, $f_2(e_j, e_i) > 0$.
3. $f_{11}(e_i, e_j) = f_{22}(e_j, e_i) < 0$.
4. Strict Supermodularity: $f_{12}(e_i, e_j) = f_{21}(e_i, e_j) > 0$.
5. $f(e_i, 0) = f(0, e_j) = 0$.
6. $f(\lambda e_i, \lambda e_j) = \lambda f(e_i, e_j)$, $\lambda e_i, \lambda e_j \le e_{max}$.54
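To make Assumption 1 concrete, the following sketch checks its six items numerically for the functional form $f(e_i, e_j) = \sqrt{e_i e_j}$ used in the welfare example. The finite-difference helpers, the evaluation points, the step sizes and the tolerance are all assumptions chosen for illustration; passing such a check at a few points is of course not a proof.

```python
import math


def f(ei, ej):
    """Candidate success probability: f(e_i, e_j) = sqrt(e_i * e_j)."""
    return math.sqrt(ei * ej)


def d1(g, x, y, h=1e-5):
    return (g(x + h, y) - g(x - h, y)) / (2 * h)


def d2(g, x, y, h=1e-5):
    return (g(x, y + h) - g(x, y - h)) / (2 * h)


def d11(g, x, y, h=1e-4):
    return (g(x + h, y) - 2 * g(x, y) + g(x - h, y)) / h ** 2


def d22(g, x, y, h=1e-4):
    return (g(x, y + h) - 2 * g(x, y) + g(x, y - h)) / h ** 2


def d12(g, x, y, h=1e-4):
    return (g(x + h, y + h) - g(x + h, y - h)
            - g(x - h, y + h) + g(x - h, y - h)) / (4 * h * h)


def check_assumption_1(g, points, lam=0.5, tol=1e-5):
    """Check items 1-6 of Assumption 1 at the given interior points."""
    for ei, ej in points:
        ok = (
            abs(g(ei, ej) - g(ej, ei)) < tol                          # 1
            and d1(g, ei, ej) > 0 and d2(g, ej, ei) > 0               # 2
            and d11(g, ei, ej) < 0                                    # 3
            and abs(d11(g, ei, ej) - d22(g, ej, ei)) < tol            # 3
            and d12(g, ei, ej) > 0                                    # 4
            and abs(g(ei, 0.0)) < tol and abs(g(0.0, ej)) < tol       # 5
            and abs(g(lam * ei, lam * ej) - lam * g(ei, ej)) < tol    # 6
        )
        if not ok:
            return False
    return True


points = [(0.3, 0.6), (0.5, 0.5), (0.8, 0.2)]
sqrt_ok = check_assumption_1(f, points)
# An additive technology violates item 5, since f(e_i, 0) = e_i > 0.
additive_ok = check_assumption_1(lambda a, b: a + b, points)
```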

Proof of Lemma 1: Static Game

Given Assumption 1, there always exists an equilibrium where both project partners exert zero effort. It therefore remains to show that the equilibrium with strictly positive effort is unique and symmetric, with $e_i = e_j > 0$. We first show symmetry. From the first order conditions we obtain

\[ \frac{f_1(e_i, e_j)}{f_2(e_i, e_j)} = \frac{c'(e_i)}{c'(e_j)}. \tag{18} \]

Suppose, by contradiction, that effort levels are not symmetric, $e_j > e_i$. Due to convexity of the cost functions, the RHS of (18) is smaller than one. Due to concavity and supermodularity of the effort function, we have $f_1(e_i, e_j) > f_2(e_i, e_j)$, which is why the LHS is larger than one, which gives the contradiction. Further, the equilibrium where both workers exert strictly positive effort is unique. It suffices to show that the FOCs (which under symmetry become a function of one variable) have one zero under the condition that effort is strictly positive:

\[ f_1(e, e)\pi(y) = c'(e). \tag{19} \]

Due to our assumption of constant returns to scale, $f_1(e, e)$ is constant in $e$. By our assumption of quadratic costs, the first derivative of the cost function, $c'(e)$, is linear in $e$ and starts in the origin. Hence, the two functions have a unique intersection, implying a unique symmetric equilibrium with strictly positive effort.

54 We know that $e_i \in [0, e_{max}]$. If $\lambda \in [0, 1]$, then $\lambda e_i \le e_{max}$, and for $\lambda > 1$ we impose the additional restriction that $\lambda e_i \le e_{max}$, $\forall i$.

Proof of Proposition 1: The proof follows directly from maximization problem (6), Lemma 1 and the arguments given in the text.

Proof of Proposition 2:

Second Period Effort

We first establish how second period effort is affected by additional information, depending on the state of the world. We know from equation (8) that the second period effort is a function of expected payoff, $\pi(y)$. To stress that a worker receives $n$ signals, we adjust our notation and denote the project value by $\pi(y_n)$. We then establish in Lemma 2 that $\pi(y_n)$ increases in the number of signals in the high state and decreases in the number of signals in the low state. It then follows immediately from equation (8) that a worker with more signals exerts higher effort in the high state and lower effort in the low state (proving claim 2. of Proposition 2). Additionally, Lemma 2 characterizes the effect of vanishing uncertainty on the expected payoff.

Lemma 2 (Information and Expected Project Value). Project value $\pi(y_n)$ satisfies the martingale property: $\pi(y_n) = E[\pi(y_{n+1})|y_n]$. However, given that the state is realized, a worker with more signals holds a more accurate posterior belief about the state of the world and thus about the project value:

\[ v_h > E[\pi(y_{n+1})|\theta_h] > E[\pi(y_n)|\theta_h] \]

\[ v_l < E[\pi(y_{n+1})|\theta_l] < E[\pi(y_n)|\theta_l]. \]

The impact of an additional signal vanishes if uncertainty vanishes, i.e. $E[\pi(y_n)|\theta] = E[\pi(y_{n+1})|\theta]$, if either (i) $v_l \to v_h$, (ii) $p \to 1$, (iii) $q \to 1$ if $\theta = \theta_h$, $q \to 0$ if $\theta = \theta_l$, or (iv) $n^{ext} \to \infty$.

Proof of Lemma 2: We prove this Lemma in three steps. First, we show the claim that $\pi(y)$ has the martingale property. Recall that

\[ \pi(y_n) = \Pr(\theta_h|y_n) v_h + (1 - \Pr(\theta_h|y_n)) v_l. \]

Define $\psi_n \equiv \Pr(\theta_h|y_n)$. We know that the stochastic process $\{\psi_n\}$ is a martingale as $E[\psi_{n+1}|y_n] = E[E[\psi|y_{n+1}]|y_n] = E[\psi|y_n] = \psi_n$, where the second equality follows from the tower property of conditional expectations. Then,

\[ E[\pi(y_{n+1})|y_n] = E[\psi_{n+1} v_h + (1 - \psi_{n+1}) v_l \,|\, y_n] = E[\psi_{n+1} v_h|y_n] + E[(1 - \psi_{n+1}) v_l|y_n] = \psi_n v_h + (1 - \psi_n) v_l = \pi(y_n). \]

Second, we prove the stated properties of $E[\pi(y_n)]$ and $E[\pi(y_n)|\theta]$. Some useful observations:

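1. The number of signals does not affect $E[\pi(y)]$ due to the martingale property of $\pi(y)$: $E[\pi(y_{n+1})] = E[E[\pi(y_{n+1})|y_n]] = E[\pi(y_n)]$.

2. We note that the posterior is given by

\[ \Pr(\theta_h|y) = \frac{\Pr(y|\theta_h)\Pr(\theta_h)}{\Pr(\theta_h)\Pr(y|\theta_h) + \Pr(\theta_l)\Pr(y|\theta_l)} = \frac{q p^y (1-p)^{n-y}}{q p^y (1-p)^{n-y} + (1-q) p^{n-y} (1-p)^y} = \frac{1}{1 + \frac{1-q}{q}\left(\frac{1-p}{p}\right)^{2y-n}}. \tag{20} \]

To simplify notation we define $\tilde{p} \equiv \frac{1-p}{p}$, $\tilde{q} \equiv \frac{1-q}{q}$ and $\hat{y} \equiv 2y - n$. Then, $\psi_n = \Pr(\theta_h|y) = \frac{1}{1 + \tilde{q}\tilde{p}^{\hat{y}}}$.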

3. We note that the expected project value conditional on state is given by:

\[ E[\pi(y_n)|\theta_h] = \sum_{y=0}^{n} \frac{n!}{y!(n-y)!} p^y (1-p)^{n-y} \left( \frac{q p^y (1-p)^{n-y} v_h + (1-q) p^{n-y} (1-p)^y v_l}{q p^y (1-p)^{n-y} + (1-q) p^{n-y} (1-p)^y} \right). \]
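The three observations above can be sketched numerically. The code below implements the posterior (20) and the conditional expected project value, and then checks that the unconditional expectation does not depend on $n$ while the conditional expectations move in opposite directions across states. Function names are illustrative; the parameter values are those of the welfare example ($p = 0.75$, $q = 0.5$, $v_l = 1.5$, $v_h = 1.6$), and $y$ is taken to be the number of high signals among $n$.

```python
from math import comb


def posterior_high(y, n, p, q):
    """Pr(theta_h | y) as in (20), with y high signals out of n."""
    p_tilde = (1 - p) / p
    q_tilde = (1 - q) / q
    return 1.0 / (1.0 + q_tilde * p_tilde ** (2 * y - n))


def project_value(y, n, p, q, v_l, v_h):
    """pi(y_n) = psi_n * v_h + (1 - psi_n) * v_l."""
    psi = posterior_high(y, n, p, q)
    return psi * v_h + (1 - psi) * v_l


def expected_value_given_state(n, p, q, v_l, v_h, high=True):
    """E[pi(y_n) | theta]; a signal is high with prob. p (high state) or 1 - p (low state)."""
    hit = p if high else 1 - p
    return sum(
        comb(n, y) * hit ** y * (1 - hit) ** (n - y)
        * project_value(y, n, p, q, v_l, v_h)
        for y in range(n + 1)
    )


def unconditional_expected_value(n, p, q, v_l, v_h):
    return (q * expected_value_given_state(n, p, q, v_l, v_h, True)
            + (1 - q) * expected_value_given_state(n, p, q, v_l, v_h, False))


P, Q, V_L, V_H = 0.75, 0.5, 1.5, 1.6
given_high = [expected_value_given_state(n, P, Q, V_L, V_H, True) for n in range(9)]
given_low = [expected_value_given_state(n, P, Q, V_L, V_H, False) for n in range(9)]
overall = [unconditional_expected_value(n, P, Q, V_L, V_H) for n in range(9)]
```

In this example `overall` stays at $q v_h + (1-q) v_l = 1.55$ for every $n$, while `given_high` rises towards $v_h$ and `given_low` falls towards $v_l$.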



We are interested in showing that

\[ E[\pi(y_{n+1})|\theta_h] > E[\pi(y_n)|\theta_h] \tag{21} \]

\[ E[\pi(y_{n+1})|\theta_l] < E[\pi(y_n)|\theta_l] \tag{22} \]

We will show that equation (21) holds and leave the proof of equation (22) to the reader. Using the notation $\psi_n = \Pr(\theta_h|y)$, we can rewrite equation (21) as

\[ (v_h - v_l) E[(\psi_{n+1} - \psi_n)|\theta_h] > 0. \]


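As $(v_h - v_l) > 0$ by assumption, it remains to be shown that $E[\psi_{n+1} - \psi_n|\theta_h] > 0$. Given $\theta = \theta_h$ and a signal realization $\hat{y}$, $\psi_{n+1} = \frac{1}{1 + \tilde{q}\tilde{p}^{\hat{y}+1}}$ with probability $p$ and $\psi_{n+1} = \frac{1}{1 + \tilde{q}\tilde{p}^{\hat{y}-1}}$ with probability $(1-p)$. It therefore suffices to show that, for every $\hat{y}$,

\[ \frac{1}{1 + \tilde{q}\tilde{p}^{\hat{y}}} < \frac{p}{1 + \tilde{q}\tilde{p}^{\hat{y}+1}} + \frac{1-p}{1 + \tilde{q}\tilde{p}^{\hat{y}-1}} \]

\[ \Leftrightarrow \quad p\tilde{p}^2 + (1-p) - \tilde{p} < \tilde{q}\tilde{p}^{\hat{y}}\left(p + (1-p)\tilde{p}^2 - \tilde{p}\right), \]

which holds since $p\tilde{p}^2 + (1-p) - \tilde{p} = 0$ and $0 < \tilde{q}\tilde{p}^{\hat{y}}(p + (1-p)\tilde{p}^2 - \tilde{p})$, as $p > \frac{1}{2}$. Thus $E[\psi_n|\theta_h] < E[\psi_{n+1}|\theta_h]$, which concludes the proof.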
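The one-step inequality used in this argument can be checked numerically. The sketch below evaluates the expected change in the posterior after one more signal in the high state, together with the two algebraic facts invoked in the proof. The grid of values for $\hat{y}$, $p$ and $q$ is an arbitrary illustrative choice, and the closed form $(2p-1)^2/p^2$ for the right-hand factor is a simplification worked out here, not taken from the text.

```python
def posterior_from_y_hat(y_hat, p, q):
    """psi = 1 / (1 + q_tilde * p_tilde ** y_hat), with y_hat = 2y - n."""
    return 1.0 / (1.0 + ((1 - q) / q) * ((1 - p) / p) ** y_hat)


def one_step_gain(y_hat, p, q):
    """E[psi_{n+1} | theta_h, y_hat] - psi_n for one additional signal."""
    up = posterior_from_y_hat(y_hat + 1, p, q)    # high signal, prob. p
    down = posterior_from_y_hat(y_hat - 1, p, q)  # low signal, prob. 1 - p
    return p * up + (1 - p) * down - posterior_from_y_hat(y_hat, p, q)


def left_side(p):
    """p * p_tilde^2 + (1 - p) - p_tilde, which should equal zero."""
    p_tilde = (1 - p) / p
    return p * p_tilde ** 2 + (1 - p) - p_tilde


def right_factor(p):
    """p + (1 - p) * p_tilde^2 - p_tilde, which should be positive."""
    p_tilde = (1 - p) / p
    return p + (1 - p) * p_tilde ** 2 - p_tilde


gains = [
    one_step_gain(y_hat, p, q)
    for y_hat in range(-6, 7)
    for p in (0.6, 0.75, 0.9)
    for q in (0.2, 0.5, 0.8)
]
```

Every entry of `gains` is strictly positive, so an additional signal raises the expected posterior of the high state whenever the state is in fact high and $p > \frac{1}{2}$.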

Third, we show our last claim that additional signals do not matter as uncertainty vanishes, which is true in any of the following cases:

(i) For $v_l \to v_h$,

\[ \lim_{v_l \to v_h} E[\pi(y_n)|\theta_h] = \sum_{y=0}^{n} \frac{n!}{y!(n-y)!} p^y (1-p)^{n-y} v_h = (p + 1 - p)^n v_h = v_h, \]

where the second step follows from the binomial formula. The expression is independent of $n$ and therefore additional signals do not matter. An analogous argument holds for $E[\pi(y)|\theta_l]$.

(ii) Assume $p \to 1$. Then,

\[ \begin{aligned} \lim_{p \to 1} E[\pi(y_n)|\theta_h] &= \lim_{p \to 1} \sum_{y=0}^{n} \frac{n!}{y!(n-y)!} p^y (1-p)^{n-y} \left( \frac{q p^y (1-p)^{n-y} v_h + (1-q) p^{n-y} (1-p)^y v_l}{q p^y (1-p)^{n-y} + (1-q) p^{n-y} (1-p)^y} \right) \\ &= \lim_{p \to 1} \frac{n!}{n!(n-n)!} p^n (1-p)^{n-n} \left( \frac{q p^n (1-p)^{n-n} v_h + (1-q) p^{n-n} (1-p)^n v_l}{q p^n (1-p)^{n-n} + (1-q) p^{n-n} (1-p)^n} \right) \\ &= \lim_{p \to 1} p^n \left( \frac{q p^n v_h + (1-q)(1-p)^n v_l}{q p^n + (1-q)(1-p)^n} \right) = v_h, \end{aligned} \]

which is independent of $n$; the argument is analogous when conditioning on $\theta = \theta_l$.

(iii) Assume $q \to 1$. Then,

\[ \lim_{q \to 1} E[\pi(y_n)|\theta_h] = \sum_{y=0}^{n} \frac{n!}{y!(n-y)!} p^y (1-p)^{n-y} v_h = (p + 1 - p)^n v_h = v_h, \]

which is independent of $n$. Similarly for $q \to 0$ and $E[\pi(y)|\theta_l]$.


(iv) Note that $y \sim \text{Binomial}(n, p)$, with mean $np$ and variance $np(1-p)$, if $\theta = \theta_h$, and $y \sim \text{Binomial}(n, 1-p)$, with mean $n(1-p)$ and variance $np(1-p)$, if $\theta = \theta_l$. Then, $\lim_{n \to \infty}(y - (n-y)) = \infty$ if $\theta = \theta_h$ and $\lim_{n \to \infty}(y - (n-y)) = -\infty$ if $\theta = \theta_l$. To see this, note that $y - (n-y) = 2y - n$. By the weak law of large numbers, as $n \to \infty$,

\[ \text{if } \theta = \theta_h: \quad y \xrightarrow{P} np \ \Rightarrow \ \lim_{n \to \infty}(2np - n) = \infty, \]

\[ \text{if } \theta = \theta_l: \quad y \xrightarrow{P} n(1-p) \ \Rightarrow \ \lim_{n \to \infty}(2n(1-p) - n) = -\infty. \]

Then, $\lim_{n \to \infty} \Pr(\theta_h|y) = 1$ if the true state is $\theta = \theta_h$ and $\lim_{n \to \infty} \Pr(\theta_h|y) = 0$ if the true state is $\theta = \theta_l$, as

\[ \lim_{n \to \infty} \Pr(\theta_h|y) = \lim_{n \to \infty} \frac{1}{1 + \frac{1-q}{q}\left(\frac{1-p}{p}\right)^{2y-n}}. \]

We have already shown that $\Pr(\theta_h|y)$ is increasing in $n$ if $\theta = \theta_h$ and decreasing in $n$ if $\theta = \theta_l$. Thus we can apply the Monotone Convergence Theorem, which implies that $\lim_{n \to \infty} E[\Pr(\theta_h|y) v_h] = E[\lim_{n \to \infty} \Pr(\theta_h|y) v_h]$. From this it follows that $\lim_{n \to \infty} E[\pi(y)|\theta_h] = v_h$ and $\lim_{n \to \infty} E[\pi(y)|\theta_l] = v_l$. $\square$

First Period Effort

The first period effort is a function of both $\pi(y_n)$ and $E[V^*(y')]$, see equation (7). Thus, additional signals affect effort through their impact on $\pi(y_n)$ and $E[V^*(y')]$. As we have already established the effect of additional signals on $\pi(y_n)$, we now turn to $E[V^*(y')]$. In Lemma 3 we first show that $V^*(y')$ is a convex function of the second period project value, $\pi(y')$, which is a martingale, see Lemma 2. This establishes that additional signals lead to a higher expected value. As before, we analyze the effect of vanishing uncertainty, now on the expected value.

Lemma 3 (Information and Second Period Expected Value). $V^*(y'_n)$ is a submartingale. Thus, a worker with more signals has a higher second period expected value:

\[ E[V^*(y'_n)] < E[V^*(y'_{n+1})]. \]

The impact of an additional signal vanishes if uncertainty vanishes, i.e. $E[V^*(y'_n)] = E[V^*(y'_{n+1})]$, if either (i) $v_l \to v_h$, (ii) $p \to 1$, (iii) $q \to 1$ if $\theta = \theta_h$, $q \to 0$ if $\theta = \theta_l$, or (iv) $n^{ext} \to \infty$.


Proof of Lemma 3: First, we establish that $V^*(y')$ is a submartingale. We can express $V^*(y')$ as a function of $\pi(y')$ and write

\[ V^*(y') \equiv g(\pi(y')). \tag{23} \]

As $\pi(y')$ is a martingale, $g(\pi(y'))$ is a submartingale if $g$ is a convex function, provided that $E[V^*(y'_n)] < \infty$, which holds as $0 \le E[V^*(y'_n)] < v_h$, $\forall n$. Note that equilibrium effort depends on the expected project payoff through the signals, $e'(y')$. We mostly omit this dependence here in order to keep notation simple and write $e'$. Applying the envelope theorem repeatedly, the first and second derivative of $g$ are given by

\[ \frac{\partial g(\pi(y'))}{\partial \pi(y')} = f_2(e', e')\pi(y')\frac{\partial e'}{\partial \pi(y')} + f(e', e'), \]

\[ \begin{aligned} \frac{\partial^2 g(\pi(y'))}{\partial \pi(y')^2} &= [f_{22}(e', e') + f_{12}(e', e')]\pi(y')\left(\frac{\partial e'}{\partial \pi(y')}\right)^2 + f_2(e', e')\pi(y')\frac{\partial^2 e'}{\partial \pi(y')^2} + f_2(e', e')\frac{\partial e'}{\partial \pi(y')} + (f_1(e', e') + f_2(e', e'))\frac{\partial e'}{\partial \pi(y')} \\ &= f_2(e', e')\pi(y')\frac{\partial^2 e'}{\partial \pi(y')^2} + f_2(e', e')\frac{\partial e'}{\partial \pi(y')} + (f_1(e', e') + f_2(e', e'))\frac{\partial e'}{\partial \pi(y')}. \end{aligned} \]

From the first order condition of the static problem, evaluated at the equilibrium effort, we can compute

\[ \frac{\partial e'}{\partial \pi(y')} = \frac{f_1(e', e')}{\partial^2 c(e')/\partial e'^2} > 0, \qquad \frac{\partial^2 e'}{\partial \pi(y')^2} = \frac{(f_{11}(e', e') + f_{21}(e', e'))\frac{\partial e'}{\partial \pi(y')}}{\partial^2 c(e')/\partial e'^2} = 0. \]

It follows that

\[ \frac{\partial^2 g(\pi(y'))}{\partial \pi(y')^2} = f_2(e', e')\frac{\partial e'}{\partial \pi(y')} + (f_1(e', e') + f_2(e', e'))\frac{\partial e'}{\partial \pi(y')} > 0, \]

which implies that $V^*(y'_n)$ is a submartingale and therefore $E[V^*(y'_n)]$ is increasing in $n$.
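The convexity argument can be illustrated with the functional forms of the welfare example, $f(e_i, e_j) = \sqrt{e_i e_j}$ and $c(e) = \frac{1}{2}e^2$, for which $f(1,1) = 1$, $f_1(1,1) = \frac{1}{2}$ and equilibrium effort is $e' = f_1(1,1)\pi$. The sketch below computes the expected second period value as a binomial-weighted sum over signal counts and shows that it rises with the number of signals. The function names, the range of $n$ and the use of the example's parameter values are illustrative assumptions.

```python
from math import comb

F_11 = 1.0    # f(1, 1) for f = sqrt(e_i * e_j)
F1_11 = 0.5   # f_1(1, 1) for the same f


def equilibrium_effort(pi):
    """Static equilibrium effort e' = f_1(1, 1) * pi under quadratic costs."""
    return F1_11 * pi


def value(pi):
    """V* = f(e', e') * pi - c(e'), using f(e, e) = e * f(1, 1)."""
    e = equilibrium_effort(pi)
    return e * F_11 * pi - 0.5 * e ** 2


def posterior_high(y, n, p, q):
    return 1.0 / (1.0 + ((1 - q) / q) * ((1 - p) / p) ** (2 * y - n))


def expected_value(n, p, q, v_l, v_h):
    """E[V_n*]: sum over y of Pr(y) * V*(pi(y)), unconditional on the state."""
    total = 0.0
    for y in range(n + 1):
        weight = comb(n, y) * (q * p ** y * (1 - p) ** (n - y)
                               + (1 - q) * p ** (n - y) * (1 - p) ** y)
        psi = posterior_high(y, n, p, q)
        total += weight * value(psi * v_h + (1 - psi) * v_l)
    return total


values = [expected_value(n, 0.75, 0.5, 1.5, 1.6) for n in range(9)]
```

Here `value` is quadratic in $\pi$ and therefore convex, so by Jensen's inequality a more informative signal structure raises its expectation, which is what the increasing sequence `values` shows.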

Second, we prove the stated properties of

\[ E[V_n^*] = \sum_{y=0}^{n} \frac{n!}{y!(n-y)!}\left(q p^y (1-p)^{n-y} + (1-q) p^{n-y} (1-p)^y\right)\left[f(e', e')\pi(y) - c(e')\right] \tag{24} \]

as uncertainty vanishes:

(i) Consider $v_l \to v_h$. We are interested in

\[ \lim_{v_l \to v_h} E[V_n^*] = \lim_{v_l \to v_h} \sum_{y=0}^{n} \frac{n!}{y!(n-y)!}\left(q p^y (1-p)^{n-y} + (1-q) p^{n-y} (1-p)^y\right)\left[f(e'(y'), e'(y'))\pi(y') - c(e'(y'))\right], \]

where $e'(y')$ is the equilibrium effort for given $y'$. As the other terms are constant in $v_l$, all that matters is

\[ \begin{aligned} \lim_{v_l \to v_h}\left[f(e'(y'), e'(y'))\pi(y') - c(e'(y'))\right] &= \lim_{v_l \to v_h} f(e'(y'), e'(y')) \lim_{v_l \to v_h}\pi(y') - \lim_{v_l \to v_h} c(e'(y')) \\ &= \lim_{v_l \to v_h} f(e'(y'), e'(y')) v_h - \lim_{v_l \to v_h} c(e'(y')). \end{aligned} \]

Note that $\lim_{\pi(y') \to v_h} e'(y') = e'_{v_h}$, i.e. the effort converges to some constant $e'_{v_h}$ as $\pi(y') \to v_h$, since $e'(y')$ is a linear function of $\pi(y')$ (recall from (5) that $e(y) = f_1(1,1)\pi(y)$). Also, due to constant returns to scale of $f$, $f(e'(y'), e'(y')) = e'(y') f(1,1)$ and thus $\lim_{e'(y') \to e'_{v_h}} f(e'(y'), e'(y')) = e'_{v_h} f(1,1)$, which again is constant in $n$. As $f(e'_{v_h}, e'_{v_h}) = e'_{v_h} f(1,1)$ is continuous, we know that $\lim_{\pi(y') \to v_h} f(e'(y'), e'(y')) = e'_{v_h} f(1,1)$. The argument is similar for $c(\cdot)$. Then, we can write

\[ \lim_{v_l \to v_h}\left[f(e'(y'), e'(y'))\pi(y') - c(e'(y'))\right] = b_{v_l}, \]

where $b_{v_l}$ is constant and thus independent of $n$. Therefore, as $v_l$ converges to $v_h$, the expected second period value converges to a constant and is independent of the number of signals,

\[ \lim_{v_l \to v_h} E[V_n^*] = b_{v_l}. \]

(ii) Consider $p \to 1$ for $\theta \in \{\theta_h, \theta_l\}$. Note that

\[ \lim_{p \to 1}\pi(y) = \begin{cases} v_h & \text{if } n - 2y < 0 \\ q v_h + (1-q) v_l & \text{if } n - 2y = 0 \\ v_l & \text{if } n - 2y > 0 \end{cases} \]

As $\pi(y)$ converges to some constant (and, of course, the same holds for $\pi(y')$), so does $f(e'(y'), e'(y'))\pi(y') - c(e')$. We denote by $V^*(v_h)$ ($V^*(v_l)$) [$V^*(v)$] the limit of $f(e'(y'), e'(y'))\pi(y') - c(e')$ when $\pi(y)$ converges to $v_h$ ($v_l$) [$q v_h + (1-q) v_l$].

Note further that if $n - 2y < 0$, $\lim_{p \to 1}(q p^y (1-p)^{n-y} + (1-q) p^{n-y} (1-p)^y) = \lim_{p \to 1} q p^y (1-p)^{n-y}$. Then we know that

\[ \lim_{p \to 1} q p^y (1-p)^{n-y} = \begin{cases} q & \text{if } y = n \\ 0 & \text{otherwise} \end{cases} \]

If $n - 2y > 0$, $\lim_{p \to 1}(q p^y (1-p)^{n-y} + (1-q) p^{n-y} (1-p)^y) = \lim_{p \to 1}(1-q) p^{n-y} (1-p)^y$. It follows that

\[ \lim_{p \to 1}(1-q) p^{n-y} (1-p)^y = \begin{cases} 1 - q & \text{if } y = 0 \\ 0 & \text{otherwise} \end{cases} \]

Last, if $n - 2y = 0$, $\lim_{p \to 1}(q p^y (1-p)^{n-y} + (1-q) p^{n-y} (1-p)^y) = \lim_{p \to 1} p^y (1-p)^{n-y} = 0$, as $y, n > 0$. From this it then follows that

\[ \lim_{p \to 1} E[V_n^*] = \lim_{p \to 1}\left(\sum_{y=0}^{n} \frac{n!}{y!(n-y)!}\left(q p^y (1-p)^{n-y} + (1-q) p^{n-y} (1-p)^y\right)\left(f(e'(y'), e'(y'))\pi(y') - c(e'(y'))\right)\right) = q V^*(v_h) + (1-q) V^*(v_l), \]

which is independent of $n$.

(iii) Consider $q \to 1$. Notice that

\[ \lim_{q \to 1}\left(q p^y (1-p)^{n-y} + (1-q) p^{n-y} (1-p)^y\right) = p^y (1-p)^{n-y}, \qquad \lim_{q \to 1}\pi(y) = v_h. \]

It follows that

\[ \begin{aligned} \lim_{q \to 1} E[V_n^*] &= \lim_{q \to 1}\left(\sum_{y=0}^{n} \frac{n!}{y!(n-y)!}\left(q p^y (1-p)^{n-y} + (1-q) p^{n-y} (1-p)^y\right)\left[f(e'(y'), e'(y'))\pi(y') - c(e'(y'))\right]\right) \\ &= \lim_{q \to 1}\left(\sum_{y=0}^{n} \frac{n!}{y!(n-y)!} p^y (1-p)^{n-y} V^*(v_h)\right) \\ &= \lim_{q \to 1} V^*(v_h)(p + 1 - p)^n = V^*(v_h), \end{aligned} \]

where the last step follows from the fact that $V^*(v_h)$ is a constant and the binomial theorem. Thus, the limit is a constant and independent of $n$.

Next, consider $q \to 0$:

\[ \lim_{q \to 0}\left(q p^y (1-p)^{n-y} + (1-q) p^{n-y} (1-p)^y\right) = p^{n-y} (1-p)^y, \qquad \lim_{q \to 0}\pi(y) = v_l, \]

and by the same steps as previously it follows that $\lim_{q \to 0} E[V_n^*]$ is constant.

(iv) Consider the case of abundance of information: $n^{ext} \to \infty$. We want to show that

\[ \lim_{n \to \infty} E[V_n^*] = E[V^*]. \]

We know that for each $n$, $E[V_n^*] \le E[V_{n+1}^*]$ as $V_n^*$ is a submartingale and that $E[V_n^*] \le v_h$ for all $n$. By the monotone convergence theorem, we know that a finite limit exists, which we denote by $E[V^*]$. $\square$

We have thus established that $E[V_n^*]$ is increasing in the number of signals. We know from Lemma 2 that $\pi(y_n)$ can be increasing or decreasing in the number of signals, depending on the state of the world. Thus, only if the state in the first period is high, first period effort is unambiguously increasing in the number of signals (proving claim 1. in Proposition 2).

Wages

The effect of information on wages follows immediately from wage functions (9) and (10), conditional on the high state $\theta = \theta_h$ (proving claim 3. in Proposition 2).

Vanishing Uncertainty

Additional information does not affect second period effort if uncertainty is vanishing, see Lemma 2 (i)-(iv). Further, the result that the impact of degree on average first period effort vanishes as uncertainty vanishes is due to Lemma 2 (i)-(iv) and Lemma 3 (i)-(iv). Similarly, under vanishing uncertainty, the impact of a higher degree on wages vanishes since information affects wages through effort.

Proof of Proposition 3

The effect of clustering on expected first period effort follows from our expression of equilibrium effort (7), showing that it is increasing in $sr$. The effect on productivity/wages follows immediately from the wage function (9).


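Proof of Proposition 4: We first show that the first period average effort exhibits increasing differences, that is

\[ E[e(\overline{sr}, \overline{y})] - E[e(\underline{sr}, \overline{y})] \ge E[e(\overline{sr}, \underline{y})] - E[e(\underline{sr}, \underline{y})], \]

where we use the notation $e(sr, y)$ to indicate that effort depends on clustering and degree (i.e. information) and where $\overline{sr} > \underline{sr}$ and $\overline{y} > \underline{y}$. Simplifying and using the expression for equilibrium effort (7) yields

\[ \begin{aligned} & f_1(1,1)\left(E[\pi(\overline{y})] + \beta\overline{sr}E[V^*(\overline{y}')]\right) - f_1(1,1)\left(E[\pi(\overline{y})] + \beta\underline{sr}E[V^*(\overline{y}')]\right) \\ & \quad \ge f_1(1,1)\left(E[\pi(\underline{y})] + \beta\overline{sr}E[V^*(\underline{y}')]\right) - f_1(1,1)\left(E[\pi(\underline{y})] + \beta\underline{sr}E[V^*(\underline{y}')]\right) \\ \Leftrightarrow \quad & E[V^*(\overline{y}')]\left(\overline{sr} - \underline{sr}\right) \ge E[V^*(\underline{y}')]\left(\overline{sr} - \underline{sr}\right) \\ \Leftrightarrow \quad & E[V^*(\overline{y}')] \ge E[V^*(\underline{y}')], \end{aligned} \]

where the last inequality holds as $V^*(y'_n)$ is a submartingale (see Lemma 3).

From (9), the first period average wage is given by:

\[ E[w] = q w(\theta_h) + (1-q) w(\theta_l) = q E[e(y)|\theta_h] f(1,1) v_h + (1-q) E[e(y)|\theta_l] f(1,1) v_l. \]

We then establish that the average wage has increasing differences in $(y, sr)$, that is

\[ E[w(\overline{sr}, \overline{y})] - E[w(\underline{sr}, \overline{y})] \ge E[w(\overline{sr}, \underline{y})] - E[w(\underline{sr}, \underline{y})]. \]

Using the expression for the average wage, this can be expanded to:

\[ \begin{aligned} & q E[e(\overline{sr}, \overline{y})|\theta_h] f(1,1) v_h + (1-q) E[e(\overline{sr}, \overline{y})|\theta_l] f(1,1) v_l - q E[e(\underline{sr}, \overline{y})|\theta_h] f(1,1) v_h - (1-q) E[e(\underline{sr}, \overline{y})|\theta_l] f(1,1) v_l \\ & \quad \ge q E[e(\overline{sr}, \underline{y})|\theta_h] f(1,1) v_h + (1-q) E[e(\overline{sr}, \underline{y})|\theta_l] f(1,1) v_l - q E[e(\underline{sr}, \underline{y})|\theta_h] f(1,1) v_h - (1-q) E[e(\underline{sr}, \underline{y})|\theta_l] f(1,1) v_l. \end{aligned} \]

Simplifying yields:

\[ \begin{aligned} & q E[e(\overline{sr}, \overline{y})|\theta_h] v_h + (1-q) E[e(\overline{sr}, \overline{y})|\theta_l] v_l - q E[e(\underline{sr}, \overline{y})|\theta_h] v_h - (1-q) E[e(\underline{sr}, \overline{y})|\theta_l] v_l \\ & \quad \ge q E[e(\overline{sr}, \underline{y})|\theta_h] v_h + (1-q) E[e(\overline{sr}, \underline{y})|\theta_l] v_l - q E[e(\underline{sr}, \underline{y})|\theta_h] v_h - (1-q) E[e(\underline{sr}, \underline{y})|\theta_l] v_l. \end{aligned} \]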


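Using the (expected) equilibrium effort from (7) gives:

\[ \begin{aligned} & q f_1(1,1)\left(E(\pi(\overline{y}|\theta_h)) + \beta\overline{sr}E[V^*(\overline{y}')]\right) v_h + (1-q) f_1(1,1)\left(E(\pi(\overline{y}|\theta_l)) + \beta\overline{sr}E[V^*(\overline{y}')]\right) v_l \\ & \quad - q f_1(1,1)\left(E(\pi(\overline{y}|\theta_h)) + \beta\underline{sr}E[V^*(\overline{y}')]\right) v_h - (1-q) f_1(1,1)\left(E(\pi(\overline{y}|\theta_l)) + \beta\underline{sr}E[V^*(\overline{y}')]\right) v_l \\ & \ge q f_1(1,1)\left(E(\pi(\underline{y}|\theta_h)) + \beta\overline{sr}E[V^*(\underline{y}')]\right) v_h + (1-q) f_1(1,1)\left(E(\pi(\underline{y}|\theta_l)) + \beta\overline{sr}E[V^*(\underline{y}')]\right) v_l \\ & \quad - q f_1(1,1)\left(E(\pi(\underline{y}|\theta_h)) + \beta\underline{sr}E[V^*(\underline{y}')]\right) v_h - (1-q) f_1(1,1)\left(E(\pi(\underline{y}|\theta_l)) + \beta\underline{sr}E[V^*(\underline{y}')]\right) v_l \\ \Leftrightarrow \quad & \left(\overline{sr} - \underline{sr}\right) E[V^*(\overline{y}')]\left(q v_h + (1-q) v_l\right) \ge \left(\overline{sr} - \underline{sr}\right) E[V^*(\underline{y}')]\left(q v_h + (1-q) v_l\right) \\ \Leftrightarrow \quad & E[V^*(\overline{y}')] \ge E[V^*(\underline{y}')], \end{aligned} \]

which again holds as $V^*(y'_n)$ is a submartingale.

Proof of Proposition 5: Trade-Off Between Information and Peer Pressure

We assume that a D-worker has a higher degree and hence more signals, $n^{int}$, and has clustering $(sr)_D$. In turn, a C-worker has a lower degree and thus a lower number of signals (and therefore $s_D > s_C$) but higher clustering, and therefore $(sr)_C > (sr)_D$.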

1. Comparative Advantage:

We want to show that $\frac{E[w_C]}{E[w_D]}$ (where $w_C$ indicates the first period wage of a C-worker and $w_D$ indicates the first period wage of a D-worker) increases as the environment becomes more certain. First notice that

\[ \frac{E[w_C]}{E[w_D]} = \frac{q w_C(\theta_h) + (1-q) w_C(\theta_l)}{q w_D(\theta_h) + (1-q) w_D(\theta_l)}. \tag{25} \]

Recall that

\[ E[w] = f(1,1) f_1(1,1)\left(q v_h + (1-q) v_l\right)\beta sr E[V^*(y')] + f(1,1) f_1(1,1)\left(q v_h E[\pi(y|\theta_h)] + (1-q) v_l E[\pi(y|\theta_l)]\right), \tag{26} \]

which follows from substituting equation (7) into wage equation (9). To simplify notation we define $k_1 \equiv f(1,1) f_1(1,1)$ and $v \equiv q v_h + (1-q) v_l$. By the law of total expectation it follows that $(1-q) E[\pi(y|\theta_l)] = E[\pi(y)] - q E[\pi(y|\theta_h)]$. Then, equation (26) becomes

\[ E[w] = k_1\left(v\beta sr E[V^*(y')] + q(v_h - v_l) E[\pi(y|\theta_h)] + E[\pi(y)] v_l\right). \tag{27} \]

The wage ratio (25) can then be expressed as

\[ \frac{E[w_C]}{E[w_D]} = \frac{v\beta (sr)_C \left(E[V^*(y')]\right)_C + q(v_h - v_l)\left(E[\pi(y|\theta_h)]\right)_C + \left(E[\pi(y)]\right)_C v_l}{v\beta (sr)_D \left(E[V^*(y')]\right)_D + q(v_h - v_l)\left(E[\pi(y|\theta_h)]\right)_D + \left(E[\pi(y)]\right)_D v_l}. \tag{28} \]

Note that $E[\pi(y)]$ is independent of the number of signals as it is a martingale and thus $(E[\pi(y)])_C = (E[\pi(y)])_D = v$. Further, note that

\[ E[V^*(y')] = C_1 (v_h - v_l)^2 q E[\psi_n|\theta_h] + 2 C_1 v_l (v_h - v_l) q + C_1 v_l^2, \tag{29} \]

\[ E[\pi(y)|\theta_h] = (v_h - v_l) E[\psi_n|\theta_h] + v_l, \tag{30} \]

where $C_1 = f_1(1,1)\left(1 - \frac{1}{2} f_1(1,1)\right)$. $C_1$ is positive as the value of the problem is positive. To obtain the simplified expression for $E[V^*(y')]$ in (29), we used equation (24) and substituted in the equilibrium first period effort (7). We then applied the binomial theorem and used the martingale property. To make the notation more compact and to single out those variables that depend on information, we now introduce the variables $a_i$ and $b_i$, $i \in \{C, D\}$ (which do not depend on information), and write the wage ratio (25) as (where we also use (29) and (30)):

\[ \frac{E[w_C]}{E[w_D]} = \frac{a_C + b_C \left(E[\psi|\theta_h]\right)_C}{a_D + b_D \left(E[\psi|\theta_h]\right)_D}. \tag{31} \]
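The closed form (29) can be cross-checked against the direct sum over signal realizations in (24). The sketch below does this for the example technology $f(e_i, e_j) = \sqrt{e_i e_j}$, for which $f(1,1) = 1$ and $f_1(1,1) = \frac{1}{2}$, so that $V^* = C_1\pi^2$ with $C_1 = \frac{3}{8}$. Function names and the parameter grid are illustrative assumptions.

```python
from math import comb

F1_11 = 0.5                          # f_1(1, 1) for f = sqrt(e_i * e_j)
C1 = F1_11 * (1 - 0.5 * F1_11)       # C_1 = f_1(1,1) * (1 - f_1(1,1) / 2)


def posterior_high(y, n, p, q):
    return 1.0 / (1.0 + ((1 - q) / q) * ((1 - p) / p) ** (2 * y - n))


def expected_posterior_given_high(n, p, q):
    """E[psi_n | theta_h]."""
    return sum(comb(n, y) * p ** y * (1 - p) ** (n - y)
               * posterior_high(y, n, p, q) for y in range(n + 1))


def expected_value_direct(n, p, q, v_l, v_h):
    """E[V*] from the sum in (24), with V* = C_1 * pi(y)^2."""
    total = 0.0
    for y in range(n + 1):
        weight = comb(n, y) * (q * p ** y * (1 - p) ** (n - y)
                               + (1 - q) * p ** (n - y) * (1 - p) ** y)
        psi = posterior_high(y, n, p, q)
        total += weight * C1 * (psi * v_h + (1 - psi) * v_l) ** 2
    return total


def expected_value_closed_form(n, p, q, v_l, v_h):
    """E[V*] from the closed form (29)."""
    d = v_h - v_l
    e_psi = expected_posterior_given_high(n, p, q)
    return C1 * d * d * q * e_psi + 2 * C1 * v_l * d * q + C1 * v_l ** 2


gaps = [
    abs(expected_value_direct(n, p, q, 1.5, 1.6)
        - expected_value_closed_form(n, p, q, 1.5, 1.6))
    for n in range(9) for p in (0.6, 0.75) for q in (0.3, 0.5)
]
```

The two expressions agree because $E[\psi_n] = q$ by the martingale property and $E[\psi_n^2] = q E[\psi_n|\theta_h]$, so information enters $E[V^*(y')]$ only through $E[\psi_n|\theta_h]$.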

For illustration, we now focus on the case where the D-worker has one more signal than the C-worker. Our exercise aims at analyzing (31) when reducing uncertainty, which we here achieve by letting $n^{ext}$ and thus $n$ grow. If (31) is increasing in the number of signals $n$, it must hold that

\[ \frac{a_C + b_C E[\psi_n|\theta_h]}{a_D + b_D E[\psi_{n+1}|\theta_h]} > \frac{a_C + b_C E[\psi_{n-1}|\theta_h]}{a_D + b_D E[\psi_n|\theta_h]} \]

or

\[ \left(a_C b_D + a_D b_C\right) E[\psi_n|\theta_h] + b_C b_D E[\psi_n|\theta_h]^2 > a_C b_D E[\psi_{n+1}|\theta_h] + a_D b_C E[\psi_{n-1}|\theta_h] + b_C b_D E[\psi_{n+1}|\theta_h] E[\psi_{n-1}|\theta_h]. \tag{32} \]

We focus first on showing that $b_C b_D E[\psi_n|\theta_h]^2 > b_C b_D E[\psi_{n+1}|\theta_h] E[\psi_{n-1}|\theta_h]$, or

\[ E[\psi_n|\theta_h]^2 > E[\psi_{n+1}|\theta_h] E[\psi_{n-1}|\theta_h]. \tag{33} \]

Thus, if we establish that $E[\psi_n|\theta_h]$ is log-concave, then inequality (33) follows immediately. As concavity implies log-concavity, it suffices to show that $E[\psi_n|\theta_h]$ is concave.

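Concavity of $E[\psi_n|\theta_h]$: It is helpful to express $\psi_n$ in terms of its log-likelihood ratio (LLR). Without any signals the LLR, denoted by $\lambda_0$, is a function of the prior $q$:

\[ \lambda_0 = \log\left(\frac{q}{1-q}\right). \tag{34} \]

Generally, the LLR is given by

\[ \lambda_{n+1} = \lambda_n + 2\log\left(\frac{p}{1-p}\right)\left(x_n - \frac{1}{2}\right), \]

where we denote by $x_n$ the signal realization of the $n$th observation. Further,

\[ \log\left(\frac{\psi_n}{1 - \psi_n}\right) = \lambda_n \quad \Leftrightarrow \quad \psi_n = \frac{e^{\lambda_n}}{1 + e^{\lambda_n}}. \]

Taking expectations yields

\[ E(\psi_n|\theta_h) = E\left(\frac{e^{\lambda_n}}{1 + e^{\lambda_n}}\,\Big|\,\theta_h\right). \]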
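The log-likelihood-ratio recursion can be sketched as follows. The code updates the LLR one signal at a time, maps it back into a posterior, and compares the result with the direct Bayes formula $\Pr(\theta_h|y) = 1/(1 + \frac{1-q}{q}(\frac{1-p}{p})^{2y-n})$. The function names and the signal sequence are hypothetical; signals are coded as $x = 1$ (high) and $x = 0$ (low).

```python
import math


def llr_prior(q):
    """lambda_0 = log(q / (1 - q))."""
    return math.log(q / (1 - q))


def llr_update(llr, x, p):
    """lambda_{n+1} = lambda_n + 2 * log(p / (1 - p)) * (x - 1/2)."""
    return llr + 2 * math.log(p / (1 - p)) * (x - 0.5)


def posterior_from_llr(llr):
    """psi_n = exp(lambda_n) / (1 + exp(lambda_n))."""
    return math.exp(llr) / (1 + math.exp(llr))


def posterior_bayes(y, n, p, q):
    """Direct posterior with y high signals out of n."""
    return 1.0 / (1.0 + ((1 - q) / q) * ((1 - p) / p) ** (2 * y - n))


p, q = 0.75, 0.5
signals = [1, 1, 0, 1]               # hypothetical signal sequence
llr = llr_prior(q)
for x in signals:
    llr = llr_update(llr, x, p)

psi_sequential = posterior_from_llr(llr)
psi_direct = posterior_bayes(sum(signals), len(signals), p, q)
```

Each high signal shifts the LLR up by $\log\frac{p}{1-p}$ and each low signal shifts it down by the same amount, so only the difference between the numbers of high and low signals matters for the posterior.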



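Then, we take the first and second derivative with respect to $n$, which yields

\[ \frac{\partial E(\psi_n|\theta_h)}{\partial n} = E\left(\frac{\partial \psi_n}{\partial \lambda_n}\frac{\partial \lambda_n}{\partial n}\,\Big|\,\theta_h\right), \]

\[ \frac{\partial^2 E(\psi_n|\theta_h)}{\partial n^2} = E\left(\frac{\partial^2 \psi_n}{\partial \lambda_n^2}\left(\frac{\partial \lambda_n}{\partial n}\right)^2 + \frac{\partial \psi_n}{\partial \lambda_n}\frac{\partial^2 \lambda_n}{\partial n^2}\,\Big|\,\theta_h\right). \]

Note that $\lambda_n$ is a linear function in $n$. To see this, note that with each signal, the LLR either increases or decreases by a constant. Thus, $\frac{\partial^2 \lambda_n}{\partial n^2} = 0$ and

\[ \text{sign}\left(\frac{\partial^2 E(\psi_n|\theta_h)}{\partial n^2}\right) = \text{sign}\left(\frac{\partial^2 E(\psi_n|\theta_h)}{\partial \lambda_n^2}\right). \]

We therefore restrict attention to the derivative with respect to $\lambda_n$:

\[ \frac{\partial E(\psi_n|\theta_h)}{\partial \lambda_n} = E\left(\frac{(1 + e^{\lambda_n}) e^{\lambda_n} - e^{2\lambda_n}}{(1 + e^{\lambda_n})^2}\,\Big|\,\theta_h\right) = E\left(\frac{e^{\lambda_n}}{(1 + e^{\lambda_n})^2}\,\Big|\,\theta_h\right), \]

\[ \frac{\partial^2 E(\psi_n|\theta_h)}{\partial \lambda_n^2} = E\left(\frac{(1 + e^{\lambda_n})^2 e^{\lambda_n} - 2 e^{2\lambda_n}(1 + e^{\lambda_n})}{(1 + e^{\lambda_n})^4}\,\Big|\,\theta_h\right) = E\left(\frac{e^{\lambda_n}(1 - e^{\lambda_n})}{(1 + e^{\lambda_n})^3}\,\Big|\,\theta_h\right). \]

The second derivative is negative (thereby implying that $E(\psi_n|\theta_h)$ is concave) if

\[ 1 - e^{\lambda_n} < 0 \quad \Leftrightarrow \quad 0 < \lambda_n. \]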

This implies that if the LLR is negative, then the expected posterior belief $E(\psi_n|\theta_h)$ is convex; otherwise, it is concave. The LLR is positive if the probability of the high state outweighs the probability of the low state, that is, if sufficiently many signals have been positive. It remains to be shown that, given that the true state is $\theta = \theta_h$, $\lambda_n$ is positive for some $n$ and that once it is positive, the probability of it becoming negative again vanishes.

We first show that $\lambda_n$ becomes positive within a finite number of observations. To see this, define a stopping time $T$ over the set of all possible observations $P$, $T = \inf\{n \ge 0 : \lambda_n^+ \in P\}$, where $\lambda_n^+$ is the sequence for which $\lambda_n > 0$. Then, Williams (1991), p. 101, establishes the following:

Lemma 4. Suppose that $T$ is a stopping time such that for some $N \in \mathbb{N}$ and some $\varepsilon > 0$ we have for every $n \in \mathbb{N}$:

\[ P(T \le n + N \mid F_n) > \varepsilon \quad \text{almost surely.} \tag{35} \]

Then, $E[T] < \infty$.

Note that $F_n$ denotes the filtration with $n$ observations. Inequality (35) is fulfilled as the probability that there are more positive than negative signals given any number of signals is strictly positive, and thus there exists an $\varepsilon$ that is smaller than this probability. This establishes that $\lambda_n$ becomes positive for a finite number of signals.

Next, we want to establish that once $\lambda_n$ is strictly positive, the probability of $\lambda_n$ becoming negative converges to zero as $n$ grows. We know from Hoeffding's Inequality that the number of high signals is concentrated around its mean, with exponentially small tail. Formally, for $t > 0$,

\[ P(np - y \ge t) \le e^{-2t^2/n}. \]

Given that some $\lambda_n$ is positive, we know that the number of high signals must satisfy $y > \frac{n}{2}$ for $n$ even and $y \ge \frac{n+1}{2}$ for $n$ odd. We focus on the case where $n$ is even; the case of $n$ odd follows immediately. The probability of $\lambda_n$ being negative is equivalent to having more than half of the signals indicating the low state. We therefore set $t = np - \frac{n}{2}$, which yields

\[ P\left(np - y \ge np - \frac{n}{2}\right) \le \exp\left(-2n\left(p - \frac{1}{2}\right)^2\right). \tag{36} \]

It is evident that for n sufficiently high, the probability of having more low signals than high signals (which is the probability on the LHS of (36)) approaches zero quickly and thus we have established that λn is positive in finite time and remains positive for sufficiently large n with probability approaching one. Thus for a sufficiently high LLR, E(ψn |θh ) is concave. While we focus here on the effect of an increase in signals n on the LLR, note that the LLR is also affected by q and p, where q is the prior probability of the high state and p is the probability of the signal being high given that the state is high. More precisely, λn is increasing in q and p. Thus, the LLR is influenced by all of our measures of uncertainty. Decreasing uncertainty by increasing q, p or n leads to a higher and, at some point, positive LLR, in which case E(ψn |θh ) is concave.
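To give a sense of how quickly this probability vanishes, the sketch below compares the exact probability that at most half of $n$ signals are high in the high state with the Hoeffding bound for a sum of $n$ independent signals, $\exp(-2t^2/n)$ evaluated at $t = n(p - \frac{1}{2})$. The function names and the grid of $n$ and $p$ are illustrative assumptions.

```python
from math import comb, exp


def prob_at_most_half_high(n, p):
    """Exact P(y <= n/2) for y ~ Binomial(n, p)."""
    return sum(comb(n, y) * p ** y * (1 - p) ** (n - y)
               for y in range(n // 2 + 1))


def hoeffding_bound(n, p):
    """exp(-2 * t^2 / n) with t = n * (p - 1/2), i.e. exp(-2 * n * (p - 1/2)^2)."""
    t = n * (p - 0.5)
    return exp(-2 * t * t / n)


comparison = {
    (n, p): (prob_at_most_half_high(n, p), hoeffding_bound(n, p))
    for n in (2, 10, 50, 200)
    for p in (0.6, 0.75, 0.9)
}
```

For every pair in `comparison` the exact probability lies below the bound, and both shrink geometrically in $n$, more rapidly the further $p$ is from $\frac{1}{2}$.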

Inequality (33) is thus fulfilled, and for (32) to hold it remains to be shown that

\[ \left(a_C b_D + a_D b_C\right) E(\psi_n|\theta_h) \ge a_C b_D E(\psi_{n+1}|\theta_h) + a_D b_C E(\psi_{n-1}|\theta_h) \]

\[ \Leftrightarrow \quad a_D b_C E(\psi_n - \psi_{n-1}|\theta_h) \ge a_C b_D E(\psi_{n+1} - \psi_n|\theta_h) \tag{37} \]

is fulfilled. We know that $E(\psi_n|\theta_h)$ is concave and increasing for $n$ sufficiently high, and thus $E(\psi_n - \psi_{n-1}|\theta_h) > E(\psi_{n+1} - \psi_n|\theta_h)$. Further, we can show that $a_D b_C = a_C b_D$, which is equivalent to

\[ \beta C_1\left((sr)_C - (sr)_D\right) v\left(v - q v_h - (1-q) v_l\right) = 0, \]

which holds as $v = q v_h + (1-q) v_l$. Thus, we have shown that (37) holds for $n$ sufficiently high, which establishes that C-workers have a comparative advantage as uncertainty vanishes. $\square$
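The identity $a_D b_C = a_C b_D$ can be checked numerically once $a_i$ and $b_i$ are written out. The text introduces these variables without stating them, so the expressions below are an assumption: they are obtained by collecting, in the numerator and denominator of the wage ratio, the terms that do not multiply $E[\psi|\theta_h]$ (giving $a_i$) and those that do (giving $b_i$), using $E[V^*(y')]$ and $E[\pi(y)|\theta_h]$ from (29) and (30). The values of $(sr)_C$ and $(sr)_D$ are hypothetical.

```python
def wage_ratio_coefficients(sr, beta, q, v_l, v_h, c1):
    """Assumed a_i and b_i such that E[w_i] is proportional to a_i + b_i * E[psi | theta_h]."""
    v = q * v_h + (1 - q) * v_l
    d = v_h - v_l
    a = v * beta * sr * c1 * (2 * v_l * d * q + v_l ** 2) + q * d * v_l + v * v_l
    b = v * beta * sr * c1 * d * d * q + q * d * d
    return a, b


BETA, Q, V_L, V_H, C1 = 0.9, 0.5, 1.5, 1.6, 0.375
a_c, b_c = wage_ratio_coefficients(0.4, BETA, Q, V_L, V_H, C1)  # hypothetical (sr)_C
a_d, b_d = wage_ratio_coefficients(0.2, BETA, Q, V_L, V_H, C1)  # hypothetical (sr)_D
cross_difference = a_d * b_c - a_c * b_d
```

Under these assumed definitions `cross_difference` is zero up to rounding, whatever the two clustering values, because the difference factors into a term proportional to $v - q v_h - (1-q) v_l$.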

2. Wage Dynamics:

Claim: $w^D(\theta) \ge w^C(\theta) \ \Rightarrow \ E[w'^D] > E[w'^C]$.

From (10), it follows that the second period expected wage across states is defined as

\[ E[w'] = q w'(\theta, \theta'_h) + (1-q) w'(\theta, \theta'_l) = f(1,1) s_i \Pr\nolimits_i(\gamma'_g|\theta)\left(q E[e'(y')|\theta'_h] v_h + (1-q) E[e'(y')|\theta'_l] v_l\right). \]

Recall that $\Pr(\gamma'_g|\theta) \equiv E[f(e(y), e(y)) + (1-r)(1 - f(e(y), e(y)))|\theta] = E[e(y)|\theta] r f(1,1) + 1 - r$. Suppose that in the first period $w^D(\theta) \ge w^C(\theta)$, implying $E[e(y)_D|\theta] \ge E[e(y)_C|\theta]$. Moreover, by assumption, $s_C < s_D$ and $(sr)_C > (sr)_D$. Hence, $[s\Pr(\gamma'_g|\theta)]_D > [s\Pr(\gamma'_g|\theta)]_C$ since

\[ [s\Pr(\gamma'_g|\theta)]_D = (sr)_D\left(E[e(y)_D|\theta] f(1,1) - 1\right) + s_D > [s\Pr(\gamma'_g|\theta)]_C = (sr)_C\left(E[e(y)_C|\theta] f(1,1) - 1\right) + s_C, \]

where the expression in brackets, $E[e(y)|\theta] f(1,1) - 1$, is negative but (weakly) less so for the D-worker. Last, we focus on

\[ q E[e'(y')|\theta'_h] v_h + (1-q) E[e'(y')|\theta'_l] v_l = f_1(1,1)\left(q(v_h - v_l) E[\pi(y')|\theta'_h] + v v_l\right), \]

where we again denote $v \equiv q v_h + (1-q) v_l$ and use the law of total expectation, $(1-q) E[\pi(y|\theta_l)] = E[\pi(y)] - q E[\pi(y|\theta_h)]$. As $E[\pi(y')|\theta'_h]$ is the only variable here that depends on information and since it is increasing in the number of signals, it follows that

\[ q\left(E[e'(y')|\theta'_h]\right)_D v_h + (1-q)\left(E[e'(y')|\theta'_l]\right)_D v_l > q\left(E[e'(y')|\theta'_h]\right)_C v_h + (1-q)\left(E[e'(y')|\theta'_l]\right)_C v_l. \]

Thus, $w^D(\theta) \ge w^C(\theta)$ implies $E[w'^D] > E[w'^C]$, which proves the claim. $\square$


References Abramo, G., C. A. D’Angelo, and G. Murgia (2013). Gender differences in research collaboration. Journal of Informetrics 7(4), 811–822. Adda, J., C. Dustmann, and K. Stevens (2011). The Career Costs of Children. Albanesi, S. and C. Olivetti (2008, November). Gender and Dynamic Agency: Theory and Evidence on the Compensation of Female Top Executives. Boston University - Department of Economics - Working Papers Series WP2006-061, Boston University - Department of Economics. Albanesi, S. and C. Olivetti (2009, January). Production, Market Production and the Gender Wage Gap: Incentives and Expectations. Review of Economic Dynamics 12(1), 80–107. Allcott, H., D. Karlan, M. Möbius, T. Rosenblat, and A. Szeidl (2007). Community Size and Network Closure. The American Economic Review 97(2), 80–85. Ananat, E., S. Fu, and S. L. Ross (2013). Race-specific agglomeration economies: Social distance and the black-white wage gap. Technical report, National Bureau of Economic Research. Aral, S., E. Brynjolfsson, and M. W. Van Alstyne (2012, September). Information, Technology and Information Worker Productivity. Information Systems Research 23(3), 849–867. Arrow, K. and R. Borzekowski (2004). Limited Network Connections and the Distribution of Wages. FEDS Working Paper No. 2004-41. Åslund, O. and O. N. Skans (2010). Will i see you at work? ethnic workplace segregation in sweden, 1985–2002. ILR Review 63(3), 471–493. Azmat, G. and R. Ferrer (2015). Gender gaps in performance: Evidence from young lawyers. Working Papers. Azoulay, P., J. S. Graff Zivin, and J. Wang (2010). Superstar extinction. The Quarterly Journal of Economics 125(2), 549–589. Babcock, L. and S. Laschever (2003). Women Don’t Ask: Negotiation and the Gender Divide. Princeton University Press. Babcock, L., M. P. Recalde, L. Vesterlund, and L. Weingart (2017, March). Gender differences in accepting and receiving requests for tasks with low promotability. American Economic Review 107(3), 714–47. Bahr, A. H. 
and M. Zemon (2000). Collaborative authorship in the journal literature: Perspectives for academic librarians who wish to publish. College & Research Libraries 61(5), 410–419. Bandiera, O., I. Barankay, and I. Rasul (2005). Social preferences and the response to incentives: Evidence from personnel data. The Quarterly Journal of Economics 120(3), 917–962. Bandiera, O., I. Barankay, and I. Rasul (2009). Social connections and incentives in the workplace: Evidence from personnel data. Econometrica 77(4), 1047–1094. Bandiera, O., I. Barankay, and I. Rasul (2010). Social incentives in the workplace. The Review of Economic Studies 77(2), 417–458. Baumeister, R. F. and K. L. Sommer (1997). What do men want? gender differences and two spheres of belongingness: Comment on cross and madson (1997). American Psychological Association. Bayer, P., S. L. Ross, and G. Topa (2008). Place of work and place of residence: Informal hiring networks and labor market outcomes. Journal of Political Economy 116(6), 1150–1196. Beaman, L., N. Keleher, and J. Magruder (2013). Do Job Networks Disadvantage Women? Evidence from a Recruitment Experiment in Malawi. Working Paper. Beaman, L. and J. Magruder (2012). Who gets the job referral? evidence from a social networks experiment. The American Economic Review 102(7), 3574–3593. Becker, G. S. (1974). A theory of social interactions. Journal of political economy 82(6), 1063–1093.

Belle, D. (1989). Children’s Social Networks and Social Supports, Volume 136. John Wiley & Sons Inc. Benenson, J. F. (1990). Gender differences in social networks. The Journal of Early Adolescence 10(4), 472–495. Benenson, J. F. (1993). Greater preference among females than males for dyadic interaction in early childhood. Child Development 64(2), 544–555. Bentolila, S., C. Michelacci, and J. Suarez (2010). Social contacts and occupational choice. Economica 77(305), 20–45. Berardi, N. and P. Seabright (2011). Professional network and career coevolution. Working Paper. Bertrand, M. (2011). New Perspectives on Gender, Volume 4 of Handbook of Labor Economics, Chapter 17, pp. 1543–1590. Elsevier. Bertrand, M., C. Goldin, and L. F. Katz (2010). Dynamics of the Gender Gap for Young Professionals in the Financial and Corporate Sectors. American Economic Journal: Applied Economics, 228–255. Bertrand, M., E. F. Luttmer, and S. Mullainathan (2000). Network effects and welfare cultures. The Quarterly Journal of Economics 115(3), 1019–1055. Bierema, L. L. (2005). Women’s networks: a career development intervention or impediment? Human Resource Development International 8(2), 207–224. Black, S. E. and P. E. Strahan (2001). The division of spoils: rent-sharing and discrimination in a regulated industry. American Economic Review, 814–831. Blau, F. D. and L. M. Kahn (2000). Gender differences in pay. Technical report, National bureau of economic research. Blau, F. D. and L. M. Kahn (2016). The gender wage gap: Extent, trends, and explanations. Technical report, National Bureau of Economic Research. BLS-Reports (2017). Highlights of women’s earnings in 2016. Technical Report 1069, U.S. Bureau of Labor Statistics, http://www.bls.gov/cps/cpswom2012.pdf. Booth, A. (1972). Sex and social participation. American Sociological Review, 183–193. Borjas, G. J. (1992). Ethnic capital and intergenerational mobility. The Quarterly journal of economics 107(1), 123–150. Borjas, G. J. (1995). 
Ethnicity, neighborhoods, and human-capital externalities. The American Economic Review, 365–390. Boschini, A. and A. Sjögren (2007). Is team formation gender neutral? evidence from coauthorship patterns. Journal of Labor Economics 25(2), 325–365. Bozeman, B. and M. Gaughan (2011). How do men and women differ in research collaborations? an analysis of the collaborative motives and strategies of academic researchers. Research policy 40(10), 1393–1402. Brass, D. J. (1984). Being in the right place: A structural analysis of individual influence in an organization. Administrative science quarterly, 518–539. Brass, D. J. (1985). Men’s and women’s networks: A study of interaction patterns and influence in an organization. Academy of Management journal 28(2), 327–343. Brett, J. M. and L. K. Stroh (1997). Jumping ship: Who benefits from an external labor market career strategy? Journal of Applied Psychology 82(3), 331. Bridges, W. P. and W. J. Villemez (1986). Informal hiring and income in the labor market. American sociological review, 574–582. Brown, M., E. Setren, and G. Topa (2016). Do informal referrals lead to better matches? evidence from a firm’s employee referral system. Journal of Labor Economics 34(1), 161–209. Brown, R., N. Gao, E. Lee, and K. Stathopoulos (2012). What are friends for? ceo networks, pay and corporate governance. In Corporate Governance, pp. 287–307. Springer. Burke, R. J., M. G. Rothstein, and J. M. Bristor (1995). Interpersonal networks of managerial and professional women

and men: descriptive characteristics. Women in Management Review 10(1), 21–27. Burt, R. (1992). Structural Holes: The Social Structure of Competition. Harvard University Press. Burt, R. (2011). Structural holes in virtual worlds. unpublished. Burt, R. S. (1997). The contingent value of social capital. Administrative science quarterly, 339–365. Burt, R. S. (1998). The gender of social capital. Rationality and society 10(1), 5–46. Cairns, R. B., M.-C. Leung, L. Buchanan, and B. D. Cairns (1995). Friendships and social networks in childhood and adolescence: Fluidity, reliability, and interrelations. Child development 66(5), 1330–1345. Calvo-Armengol, A. and M. Jackson (2004). Effects of Social Networks on Employment and Inequality. The American Economic Review 94(3), 426–454. Calvó-Armengol, A. and M. Jackson (2007). Networks in Labor Markets: Wage and Employment Dynamics and Inequality. Journal of Economic Theory 132(1), 27–46. Calvó-Armengol, A. and Y. Zenou (2005). Job matching, social network and word-of-mouth communication. Journal of urban economics 57(3), 500–522. Campbell, K. E. (1988). Gender differences in job-related networks. Work and occupations 15(2), 179–200. Campion, P. and W. Shrum (2004). Gender and science in development: Women scientists in ghana, kenya, and india. Science, technology, & human values 29(4), 459–485. Card, D., A. R. Cardoso, and P. Kline (2013). Bargaining and the Gender Wage Gap: A Direct Assessment. Technical report, Institute for the Study of Labor (IZA). Carroll, G. R. and A. C. Teo (1996). On the social networks of managers. Academy of Management journal 39(2), 421–440. Cingano, F. and A. Rosolia (2012). People i know: job search and social networks. Journal of Labor Economics 30(2), 291–332. Coleman, J. (1988a). Free riders and Zealots: The Role of Social Networks. Sociological Theory 6(1), 52–57. Coleman, J. (1988b). Social Capital in the Creation of Human Capital. American journal of sociology, 95–120. Combes, P.-P., L. 
Linnemer, and M. Visser (2008). Publish or peer-rich? the role of skills and networks in hiring economics professors. Labour Economics 15(3), 423–441. Conti, G., A. Galeotti, G. Müller, and S. Pudney (2013). Popularity. Journal of Human Resources 48(4), 1072–1094. Cooper, R., D. V. DeJong, R. Forsythe, and T. Ross (1990). Selection criteria in coordination games: Some experimental results. American Economic Review 80(1), 218–233. Cooper, R., D. V. DeJong, R. Forsythe, and T. W. Ross (1992). Communication in coordination games. The Quarterly Journal of Economics 107(2), 739–771. Croson, R. and U. Gneezy (2009). Gender differences in preferences. Journal of Economic Literature 47(2), 448. Damm, A. P. (2009). Ethnic enclaves and immigrant labor market outcomes: Quasi-experimental evidence. Journal of Labor Economics 27(2), 281–314. Ding, W. W., F. Murray, and T. E. Stuart (2006). Gender differences in patenting in the academic life sciences. Science 313(5787), 665–667. Dixit, A. (2003). Trade expansion and Contract Enforcement. Journal of Political Economy 111(6), 1293–1317. Ductor, L., M. Fafchamps, S. Goyal, and M. J. van der Leij (2014). Social networks and research output. Review of Economics and Statistics 96(5), 936–948. Dustmann, C., A. Glitz, U. Schönberg, and H. Brücker (2015). Referral-based job search networks. The Review of Economic Studies 83(2), 514–546. Easley, D. and J. Kleinberg (2010). Networks, Crowds, and Markets. Cambridge Univ Press.

Eckel, C. and P. Grossman (2008). Men, Women and Risk Aversion: Experimental Evidence. Handbook of experimental economics results 1, 1061–1073. Eder, D. and M. Hallinan (1978). Sex Differences in Children’s Friendships. American Sociological Review, 237–250. Elliott, J. R. (1999). Social isolation and labor market insulation. The Sociological Quarterly 40(2), 199–216. Engelberg, J., P. Gao, and C. A. Parsons (2012). The price of a ceo’s rolodex. The Review of Financial Studies 26(1), 79–114. Equality and H.R.Commission (2009). Financial services inquiry: Sex discrimination and gender pay gap report of the equality and human rights commission. Technical report. Fafchamps, M., M. J. Leij, and S. Goyal (2010). Matching and network effects. Journal of the European Economic Association 8(1), 203–231. Ferriani, S., G. Cattani, and C. Baden-Fuller (2009). The relational antecedents of project-entrepreneurship: Network centrality, team composition and project performance. Research Policy 10(38), 1545–1558. Fischer, C. and S. Oliker (1983). A Research Note on Friendship, Gender, and the Life Cycle. Social Forces 62(1), 124–133. Flap, H. and B. Völker (2001). Goal specific social capital and job satisfaction: Effects of different types of networks on instrumental and social aspects of work. Social networks 23(4), 297–320. Fontaine, F. (2008). Why are similar workers paid differently? the role of social networks. Journal of Economic Dynamics and Control 32(12), 3960–3977. Forret, M. L. and T. W. Dougherty (2004). Networking behaviors and career outcomes: differences for men and women? Journal of Organizational Behavior 25(3), 419–437. Fracassi, C. and G. Tate (2012). External networking and internal firm governance. the Journal of finance 67(1), 153–194. Franzen, A. and D. Hangartner (2006). Social networks and Labour Market Outcomes: The Non-Monetary Benefits of Social Capital. European Sociological Review 22(4), 353–368. Friebel, G., M. Lalanne, B. Richter, P. 
Schwardmann, and P. Seabright (2017). Women form social networks more selectively and less opportunistically than men. Friebel, G. and P. Seabright (2011). Do women have longer conversations? telephone evidence of gendered communication strategies. Journal of Economic Psychology 32(3), 348–356. Gabbay, S. M. and E. W. Zuckerman (1998). Social capital and opportunity in corporate r&d: The contingent effect of contact density on mobility expectations. Social Science Research 27(2), 189 – 217. Galenianos, M. (2013). Learning about match quality and the use of referrals. Review of Economic Dynamics 16(4), 668–690. Garg, K. and S. Kumar (2014). Scientometric profile of indian scientific output in life sciences with a focus on the contributions of women scientists. Scientometrics 98(3), 1771–1783. Gersick, C. J., J. E. Dutton, and J. M. Bartunek (2000). Learning from academia: The importance of relationships in professional life. Academy of Management Journal 43(6), 1026–1044. Gibbons, R. (1998). Incentives in organizations. Technical report, National bureau of economic research. Gibbons, R. and M. Waldman (1999). Careers in organizations: Theory and evidence. Handbook of labor economics 3, 2373–2437. Ginther, D. K. and S. Kahn (2004). Women in economics: moving up or falling off the academic career ladder? The Journal of Economic Perspectives 18(3), 193–214. Glänzel, W. and A. Schubert (2004). Analysing scientific networks through co-authorship. Handbook of quantitative

science and technology research 11, 257–279. Glitz, A. (2015). The role of coworker-based networks in the labour market. DICE Report 13(1), 25. Gneezy, U., M. Niederle, and A. Rustichini (2003). Performance in competitive environments: Gender differences. Quarterly Journal of Economics 118(3), 1049–1074. Gneezy, U. and A. Rustichini (2004). Gender and competition at a young age. American Economic Review, 377–381. Goel, D. and K. Lang (2009). Social ties and the job search of recent immigrants. Technical report, National Bureau of Economic Research. Goldin, C. (2014, April). A Grand Gender Convergence: Its Last Chapter. American Economic Review 104(4), 1091–1119. Goldin, C. and L. Katz (2011). The cost of workplace flexibility for high-powered professionals. The Annals of the American Academy of Political and Social Science 638(45), 45–67. Goldin, C. and C. Rouse (2000). Orchestrating impartiality: The impact of “blind” auditions on female musicians. The American Economic Review 90(4), 715–741. Goyal, S., M. J. Van Der Leij, and J. L. Moraga-González (2006). Economics: An emerging small world. Journal of political economy 114(2), 403–412. Granovetter, M. (1973). The Strength of Weak Ties. American Journal of Sociology, 1360–1380. Granovetter, M. (1995). Getting a job: A study of contacts and careers. University of Chicago Press. Granovetter, M. (2005). The impact of social structure on economic outcomes. The Journal of economic perspectives 19(1), 33–50. Green, G. P., L. M. Tigges, and I. Browne (1995). Social resources, job search, and poverty in atlanta. Research in Community Sociology 5, 161–82. Green, G. P., L. M. Tigges, and D. Diaz (1999). Racial and ethnic differences in job-search strategies in atlanta, boston, and los angeles. Social science quarterly, 263–278. Hallinan, M. T. (1980). Patterns of cliquing among youth. Friendship and social relations in children, 321–342. Halverson Jr, C. F. and M. F. Waldrop (1973). 
The relations of mechanically recorded activity level to varieties of preschool play behavior. Child Development, 678–681. Hamilton, B. H., J. A. Nickerson, and H. Owan (2003). Team incentives and worker heterogeneity: An empirical analysis of the impact of teams on productivity and participation. Journal of political Economy 111(3), 465–497. Hartup, W. W. (1983). Peer relations. Handbook of child psychology: formerly Carmichael’s Manual of child psychology/Paul H. Mussen, editor. Heath, R. (2013). Why do firms hire using referrals? evidence from bangladeshi garment factories. Working Paper. Heider, F. (1946). Attitudes and Cognitive Organization. The Journal of Psychology 21(1), 107–112. Hellerstein, J. K., M. McInerney, and D. Neumark (2011). Neighbors and coworkers: The importance of residential labor market networks. Journal of Labor Economics 29(4), 659–695. Hensvik, L. and O. N. Skans (2016). Social networks, employee selection, and labor market outcomes. Journal of Labor Economics 34(4), 825–867. Holmstrom, B. (1982). Moral Hazard in Teams. The Bell Journal of Economics, 324–340. Hong, W. and Y. Zhao (2016). How social networks affect scientific performance: Evidence from a national survey of chinese scientists. Science, Technology, & Human Values 41(2), 243–273. Horton, J., Y. Millo, and G. Serafeim (2012). Resources or power? implications of social networks on compensation and firm performance. Journal of Business Finance & Accounting 39(3-4), 399–426. Hu, Z., C. Chen, and Z. Liu (2014). How are collaboration and productivity correlated at various career stages of

scientists? Scientometrics 101(2), 1553–1564. Hunt, J., J.-P. Garant, H. Herman, and D. J. Munroe (2012, March). Why Don’t Women Patent? NBER Working Papers 17888, National Bureau of Economic Research, Inc. Hurlbert, J. S. (1991). Social networks, social circles, and job satisfaction. Work and occupations 18(4), 415–430. Hwang, B.-H. and S. Kim (2009). It pays to have friends. Journal of financial economics 93(1), 138–158. Ibarra, H. (1992). Homophily and differential returns: Sex differences in network structure and access in an advertising firm. Administrative science quarterly, 422–447. Ibarra, H. (1993, January). Personal networks of women and minorities in management: A conceptual framework. The Academy of Management Review 18(1), 56–87. Ibarra, H. (1997). Paving an Alternative Route: Gender Differences in Managerial Networks. Social Psychology Quarterly, 91–102. Ichniowski, C., K. Shaw, and G. Prennushi (1997). The effects of human resource management practices on productivity: A study of steel finishing lines. The American Economic Review, 291–313. Ioannides, Y. M. and L. D. Loury (2004). Job Information Networks, Neighborhood Effects, and Inequality. Journal of Economic Literature 42(4), 1056–1093. Jackson, C. K. and H. S. Schneider (2011). Do social connections reduce moral hazard? evidence from the new york city taxi industry. American Economic Journal: Applied Economics 3(3), 244–67. Jackson, M. O., T. Rodriguez-Barraquer, and X. Tan (2012). Social capital and social quilts: Network patterns of favor exchange. The American Economic Review 102(5), 1857–1897. Jackson, M. O., B. W. Rogers, and Y. Zenou (2016). The economic consequences of social network structure. Working Paper. Kandel, E. and E. Lazear (1992). Peer Pressure and Partnerships. Journal of Political Economy, 801–817. Kanter, R. M. (1977). Some effects of proportions on group life: Skewed sex ratios and responses to token women. American journal of Sociology 82(5), 965–990. Karlan, D., M. 
Mobius, T. Rosenblat, and A. Szeidl (2009). Trust and Social Collateral. The Quarterly Journal of Economics 124(3), 1307–1361. Kilduff, M. and W. Tsai (2003). Social networks and organizations. Sage. Kleinbaum, A. M., T. Stuart, and M. Tushman (2008). Communication (and coordination?) in a modern, complex organization. Harvard Business School Boston, MA. Kramarz, F. and D. Thesmar (2013). Social networks in the boardroom. Journal of the European Economic Association 11(4), 780–807. Kretschmer, H. and B. M. Gupta (1998). Collaboration patterns in theoretical population genetics. Scientometrics 43(3), 455–462. Kuhn, P. J. and M.-C. Villeval (2013, August). Are Women More Attracted to Cooperation Than Men? NBER Working Papers 19277, National Bureau of Economic Research, Inc. Kundra, R. and H. Kretschmer (1999). A new model of scientific collaboration part 2. collaboration patterns in indian medicine. Scientometrics 46(3), 519–528. Kürtösi, Z. (2008). Differences in Female and Male Social Networks in a Work Setting. Ph. D. thesis, Sociology Department Budapest. Kuzubas, T. U. and A. Szabo (2015). Job search through weak and strong ties: Theory and evidence from indonesia. Technical report, Working paper. Kyvik, S. and M. Teigen (1996). Child care, research collaboration, and gender differences in scientific productivity.

Science, Technology, & Human Values 21(1), 54–71. Lalanne, M. and P. Seabright (2016). The old boy network: The impact of professional networks on remuneration in top executive jobs. SAFE Working Paper. Laosa, L. M. and J. E. Brophy (1972). Effects of sex and birth order on sex-role development and intelligence among kindergarten children. Developmental Psychology 6(3), 409. Lazear, E. P. and P. Oyer (2007). Personnel economics. Technical report, National Bureau of economic research. Lee, L., C. Howes, and B. Chamberlain (2007). Ethnic heterogeneity of social networks and cross-ethnic friendships of elementary school boys and girls. Merrill-Palmer Quarterly 53(3), 325–346. Lee, S. and B. Bozeman (2005). The impact of research collaboration on scientific productivity. Social studies of science 35(5), 673–702. Leicht, K. T. and J. Marx (1997). The consequences of informal job finding for men and women. Academy of Management Journal 40(4), 967–987. Lever, J. (1976). Sex differences in the games children play. Social problems 23(4), 478–487. Lin, N. (1999). Building a network theory of social capital. Connections 22(1), 28–51. Lin, N., W. M. Ensel, and J. C. Vaughn (1981). Social resources and strength of ties: Structural factors in occupational status attainment. American sociological review, 393–405. Linehan, M. and H. Scullion (2008). The development of female global managers: The role of mentoring and networking. Journal of business ethics 83(1), 29–40. Liu, Y. (2014). Outside options and ceo turnover: The network effect. Journal of Corporate Finance 28, 201–217. Lutter, M. (2012). Anstieg oder ausgleich? die multiplikative wirkung sozialer ungleichheiten auf dem arbeitsmarkt für filmschauspieler. Zeitschrift für Soziologie (41), 435–457. Lutter, M. (2013). Is there a closure penalty? cohesive network structures, diversity, and gender inequalities in career advancement. MPIfG Discussion Paper (13/9). Lyness, K. S. and D. E. Thompson (2000). 
Climbing the corporate ladder: do female and male executives follow the same route? Journal of applied psychology 85(1), 86. Marmaros, D. and B. Sacerdote (2002). Peer and Social Networks in Job Search. European Economic Review 46(4), 870–879. Marsden, P. (1987). Core Discussion Networks of Americans. American Sociological Review, 122–131. Marsden, P. V. and E. H. Gorman (2001). Social networks, job changes, and recruitment. In Sourcebook of labor markets, pp. 467–502. Springer. Mas, A. and E. Moretti (2009). Peers at work. American Economic Review 99(1), 112–45. McCarthy, H. (2004). Girlfriends in high places: How women’s networks are changing the workplace. Demos. McDowell, J. M., L. D. Singell, and M. Stater (2006). Two to tango? gender differences in the decisions to publish and coauthor. Economic inquiry 44(1), 153–168. McDowell, J. M. and J. K. Smith (1992). The effect of gender-sorting on propensity to coauthor: Implications for academic promotion. Economic Inquiry 30(1), 68–82. McGuire, G. M. (2000). Gender, race, ethnicity, and networks: The factors affecting the status of employees’ network members. Work and occupations 27(4), 501–524. Mengel, F. (2015). Gender differences in networking. Working Paper. Metz, I. and P. Tharenou (2001). Women’s career advancement: The relative contribution of human and social capital. Group & Organization Management 26(3), 312–342. Mihaljević-Brandt, H., L. Santamaría, and M. Tullney (2016). The effect of gender in the publication patterns in

mathematics. PloS one 11(10), e0165367. Miller, B. P. and W. Shrum (2012). Isolated in a technologically connected world?: Changes in the core professional ties of female researchers in ghana, kenya, and kerala, india. The Sociological Quarterly 53(2), 143–165. Montgomery, J. D. (1991). Social networks and labor-market outcomes: Toward an economic analysis. The American economic review 81(5), 1408–1418. Montgomery, J. D. (1994). Weak ties, employment, and inequality: An equilibrium analysis. American Journal of Sociology 99(5), 1212–1236. Moody, J. (2004). The structure of a social science collaboration network: Disciplinary cohesion from 1963 to 1999. American sociological review 69(2), 213–238. Moore, G. (1990). Structural determinants of men’s and women’s personal networks. American sociological review, 726–735. Munshi, K. and M. Rosenzweig (2016). Networks and misallocation: Insurance, migration, and the rural-urban wage gap. The American Economic Review 106(1), 46–98. Newman, M. E. (2001). The structure of scientific collaboration networks. Proceedings of the National Academy of Sciences 98(2), 404–409. Nicolaou, N. and S. Birley (2003). Social networks in organizational emergence: The university spinout phenomenon. Management science 49(12), 1702–1725. Niederle, M. and L. Vesterlund (2007). Do women shy away from competition? do men compete too much? The Quarterly Journal of Economics 122(3), 1067–1101. Olivetti, C., E. Patacchini, and Y. Zenou (2015). Mothers, friends and gender identity. Working Paper. Parker, J. G. and J. Seal (1996). Forming, losing, renewing, and replacing friendships: Applying temporal parameters to the assessment of children’s friendship experiences. Child Development 67(5), 2248–2268. Podolny, J. and J. Baron (1997). Resources and Relationships: Social Networks and Mobility in the Workplace. American Sociological Review, 673–693. Polachek, S. W. (1981). 
Occupational Self-Selection: A Human Capital Approach to Sex Differences in Occupational Structure. The Review of Economics and Statistics 63(1), 60–69. Primack, R. B. and V. O’Leary (1993). Cumulative disadvantages in the careers of women ecologists. BioScience 43(3), 158–165. Putnam, R. (2000). Bowling Alone: America’s Declining Social Capital. Simon and Schuster. Ragins, B. R. and E. Sundstrom (1989). Gender and power in organizations: A longitudinal perspective. Psychological bulletin 105(1), 51. Reagans, R. and E. W. Zuckerman (2001). Networks, diversity, and productivity: The social capital of corporate r&d teams. Organization science 12(4), 502–517. Renneboog, L. and Y. Zhao (2011). Us knows us in the uk: On director networks and ceo compensation. Journal of Corporate Finance 17(4), 1132–1157. Rotemberg, J. J. (1994, August). Human Relations in the Workplace. Journal of Political Economy 102(4), 684–717. Sarsons, H. (2015). Gender differences in recognition for group work. Harvard University 3. Saygin, P., A. Weber, and M. Weynandt (2014). Coworkers, networks, and job search outcomes. Working Paper. Schmutte, I. M. (2014). Job referral networks and the determination of earnings in local labor markets. Journal of Labor Economics 33(1), 1–32. Schucan Bird, K. (2011). Do women publish fewer journal articles than men? sex differences in publication productivity in the social sciences. British Journal of Sociology of Education 32(6), 921–937.

Seibert, S. E., M. L. Kraimer, and R. C. Liden (2001). A social capital theory of career success. Academy of management journal 44(2), 219–237. Seidel, M.-D. L., J. T. Polzer, and K. J. Stewart (2000). Friends in high places: The effects of social networks on discrimination in salary negotiations. Administrative Science Quarterly 45(1), 1–24. Shen, H. (2013). Mind the gender gap. Nature 495(7439), 22. Shue, K. (2013). Executive networks and firm policies: Evidence from the random assignment of mba peers. The Review of Financial Studies 26(6), 1401–1442. Simon, C. J. and J. T. Warner (1992). Matchmaker, matchmaker: The effect of old boy networks on job match quality, earnings, and tenure. Journal of labor economics 10(3), 306–330. Singh, V. and S. Vinnicombe (2004). Why so few women directors in top uk boardrooms? evidence and theoretical explanations. Corporate Governance: An International Review 12(4), 479–488. Sparrowe, R. T., R. C. Liden, S. J. Wayne, and M. L. Kraimer (2001). Social networks and the performance of individuals and groups. Academy of management journal 44(2), 316–325. Tomassini, M. and L. Luthi (2007). Empirical analysis of the evolution of a scientific collaboration network. Physica A: Statistical Mechanics and its Applications 385(2), 750–764. Topa, G. (2011). ’labour markets and referrals’, chapter 22 in handbook of social economics, vol. 1: 1193-122, j. Benhabib, A. Bisin and MO Jackson (eds). Van der Leij, M. and S. Goyal (2011). Strong ties in a small world. Review of Network Economics 10(2). Van Emmerik, I. H. (2006). Gender differences in the creation of different types of social capital: A multilevel study. Social networks 28(1), 24–37. Van Huyck, J. B., R. C. Battalio, and R. O. Beil (1990). Tacit coordination games, strategic uncertainty, and coordination failure. American Economic Review 80(1), 234–248. Waldinger, F. (2010). Quality matters: The expulsion of professors and the consequences for phd student outcomes in nazi germany. 
Journal of Political Economy 118(4), 787–831. Wenneras, C. and A. Wold (1997). Nepotism and sexism in peer-review. Nature 387(6631), 341. Williams, D. (1991). Probability with martingales. Cambridge university press. Yakubovich, V. (2005). Weak ties, information, and influence: How workers find jobs in a local russian labor market. American sociological review 70(3), 408–421. Ynalvez, M. A. and W. M. Shrum (2011). Professional networks, scientific collaboration, and publication productivity in resource-constrained research institutions in a developing country. Research Policy 40(2), 204–216. Yoshikane, F. and K. Kageura (2004). Comparative analysis of coauthorship networks of different domains: The growth and change of networks. Scientometrics 60(3), 435–446. Zinovyeva, N. and M. Bagues (2015). The role of connections in academic promotions. American Economic Journal: Applied Economics 7(2), 264–292.
