Community Colleges and Upward Mobility∗

Job Market Paper

Jack Mountjoy†
University of Chicago Department of Economics

November 10, 2017

Abstract

Two-year community colleges enroll nearly half of college entrants in the United States. These local institutions extend higher education to new populations but may also divert college-bound students from 4-year entry, casting ambiguity over their contributions to upward mobility. This paper empirically investigates the consequences of expanding access to 2-year colleges in light of this tradeoff. I combine linked administrative data with a new instrumental variables (IV) approach that nonparametrically identifies causal effects along multiple treatment margins, unlocking a decomposition of the overall effect of 2-year access into separate causal impacts on new 2-year entrants who otherwise would have not enrolled in college versus those diverted from 4-year entry. I implement the approach using instrumental variation in high school students’ proximities to 2-year and 4-year colleges conditional on neighborhood quality, urbanization, and local labor market fixed effects, validating that this variation drives initial enrollment choices while remaining balanced across excluded student ability measures that strongly predict choices and outcomes. The results of this IV approach offer four main conclusions. First, expanding access to 2-year colleges boosts the educational attainment and earnings of new 2-year entrants on net. Second, these net effects shroud opposing impacts along the two enrollment margins: roughly one third of these 2-year entrants are diverted from 4-year entry and earn fewer bachelor’s degrees as a result, while the other two thirds would not have otherwise attended college and reap significant gains in educational attainment and earnings. Third, stratifying by demographics reveals that women drive these results with larger effects along both margins compared to men, while 2-year access significantly boosts the upward earnings mobility of low-income students with little offsetting diversion. Finally, stratifying the effects of 2-year entry across the range of 2-year college proximity suggests that the net gains to marginal students do not diminish with further expansion of 2-year access.

∗ I thank Magne Mogstad, James Heckman, and Michael Greenstone for their guidance and support. Marianne Bertrand, Stephane Bonhomme, John Eric Humphries, Sonia Jaffe, Ezra Karger, Michael Lovenheim, Talla Mountjoy, Casey Mulligan, Azeem Shaikh, Jeff Smith, Alex Torgovitsky, Chris Walters, and seminar participants at the University of Chicago provided valuable feedback. I also thank Rodney Andrews, Greg Branch, Janie Jury, Mark Lu, Sara Muehlenbein, Greg Phelan, John Thompson, and Yu Xue at the UT-Dallas Education Research Center for expert guidance on the administrative data, and Joe Seidel for expertise with the Census geospatial data. I gratefully acknowledge support from the National Academy of Education/Spencer Dissertation Fellowship, the Becker Friedman Institute for Research in Economics, and the University of Chicago Department of Economics. The conclusions of this research do not necessarily reflect the opinions or official position of the Texas Education Agency, the Texas Higher Education Coordinating Board, the Texas Workforce Commission, or the State of Texas.
† Website: home.uchicago.edu/∼jmountjoy. Email: [email protected].

1 Introduction

Since 1980, the earnings gap between college and high school graduates has roughly doubled in the United States.1 Rising demand for skilled labor has outpaced modest growth in the supply of college-educated workers, and this modest growth has been uneven: children from high-income families are more likely to enroll in and complete college than their low-income peers, and this college gradient in family income has steepened since 1980.2 In response to these trends, a recent wave of policies aimed at broadening college access have doubled down on 2-year community colleges, which enroll nearly half of all college entrants and a disproportionate share of low-income students, as key arteries in increasing the flow of young Americans into higher education.3 Five states and several major cities have launched free 2-year college tuition programs since 2014, and at least 10 additional states are considering similar legislation, all in the hope that expanding access to 2-year colleges will help extend the prospects of postsecondary attainment and upward mobility to a broader share of young Americans.4 While expanding access to 2-year colleges may extend enrollment opportunities to new populations, causal evidence on the efficacy of 2-year access in boosting educational attainment and earnings is limited,5 and greater 2-year accessibility may even detrimentally divert college-bound students away from 4-year institutions.6 Concerns stem from low rates of degree completion and upward transfer among 2-year entrants: while 81 percent begin with the intention to earn a bachelor’s degree, only 33 percent actually transfer to a 4-year institution within six years, and just 14 percent complete the bachelor’s degree (Jenkins and Fink 2016). 
In contrast, 60 percent of 4-year entrants complete a bachelor’s degree over the same timeframe, and substantial graduation gaps between 2-year and 4-year entrants remain after adjusting for observable differences in test scores and demographic background (e.g. Reynolds 2012). To the extent that these gaps reflect causal

1 See Autor (2014), among others.
2 See Katz and Murphy (1992), Goldin and Katz (2008), and Acemoglu and Autor (2011) for evidence of steadily rising demand for skilled labor, and Belley and Lochner (2007), Bailey and Dynarski (2011), and Chetty et al. (2014) for the college enrollment gradient in family income.
3 National Center for Education Statistics (2015). In line with the existing literature and following the definition of Cohen et al. (2014), I refer to community colleges, 2-year colleges, and junior colleges interchangeably as accredited non-profit institutions that award associate’s degrees as their highest credential.
4 National Conference of State Legislatures (2016).
5 See Belfield and Bailey (2011, 2017) for reviews, and Denning (2017) for a recent contribution.
6 See Clark (1960), Pincus (1980), and Brint and Karabel (1989) for prominent expositions of this argument, and Rouse (1995) for a pioneering econometric analysis.

impacts of 2-year versus 4-year entry, the potential for diversion casts further ambiguity over the role of 2-year college access in promoting educational attainment and upward mobility. This paper empirically investigates the consequences of expanding access to 2-year colleges. Does expanding 2-year access boost educational attainment and earnings, on net? Are some students diverted from 4-year entry, with detrimental effects? Are these potential losses outweighed by gains among 2-year entrants who otherwise would not have enrolled in any college? To answer these questions, I combine linked administrative data with a new instrumental variables (IV) approach that nonparametrically identifies causal effects along multiple treatment margins, overcoming a methodological challenge in the IV literature: standard IV methods like two-stage least squares (2SLS) do not generally recover causal effects of one alternative versus another when individuals face multiple alternatives and reap heterogeneous effects (Heckman and Urzua 2010; Kirkeboen et al. 2016; Kline and Walters 2016). Applied to the 2-year college setting, this new approach unlocks a decomposition of the overall net effect of 2-year access into two distinct channels, identifying causal impacts on new 2-year entrants who otherwise would have not enrolled in college separately from impacts on students diverted from 4-year entry. The identification approach uses two instruments to isolate the mean potential outcomes of instrument compliers along each treatment margin, then differences these mean potential outcomes to form pairwise treatment effects. 
As in the binary treatment IV case, individual compliers cannot be identified, but their distributional characteristics emerge when studying changes in the composition of each treatment group driven by specific instruments.7 Moving beyond local treatment effects, instruments that take on a range of values enable identification of causal effects for a wider range of compliers, as in the binary treatment IV setting.8 This enables tests for different types of selection behavior, like Roy (1951)-style sorting on comparative advantage, and helps evaluate the external validity of the local estimates, including how marginal returns might evolve as 2-year access further expands and draws deeper into the population of potential entrants. I implement the method on longitudinal administrative data spanning the state of Texas. I

7 This extends the logic of Imbens and Rubin (1997), Abadie (2002), and Carneiro and Lee (2009) to multiple margins of treatment and instrument compliance. Kline and Walters (2016) consider an intermediate case of multiple treatments and one binary instrument, showing identification of complier characteristics along a subset of the relevant treatment margins. This paper shows how two instruments can secure identification along all relevant complier margins and thus yield pairwise treatment effects. Since this method can identify entire marginal distributions of complier potential outcomes, estimating quantile treatment effects, for example, remains an avenue for future work.
8 See, for example, Heckman et al. (2010), Angrist and Fernandez-Val (2013), and Brinch et al. (2017).

individually link the population of Texas public high school students with enrollment and degree completion records at all public and private Texas colleges and universities, then further link these students with quarterly earnings records for all Texas employees from the state unemployment insurance administration. These data are unusual in the U.S. context in their combination of breadth and depth of coverage, spanning the population of the second largest state (comprising 10.5% of all public K-12 students in the U.S.) and providing detailed information on demographics, test scores, college enrollment dynamics, degree completion, and longitudinal earnings. Linking the student-level data with annual geospatial measures of all high school and college locations in Texas, I identify causal effects along the multiple margins of 2-year enrollment with instrumental variation in students’ proximities to 2-year and 4-year colleges. Departing from most papers in the returns-to-schooling canon that employ distance instruments, I control directly for detailed measures of neighborhood urbanization, as well as local labor market fixed effects at the commuting zone level, to help purge the raw proximity measures of factors that may violate independence and exclusion.9 I also do not rely on variation in a student’s distance to any type of college; instead, identification comes from variation in 2-year distance holding 4-year distance fixed, and 4-year distance holding 2-year distance fixed. Encouragingly, I show that these two dimensions of instrumental variation are balanced across excluded student ability measures that strongly predict college choices and outcomes, yielding IV estimates that are robust to the inclusion of these ability measures as controls. The results of this IV approach offer four main conclusions. First, expanding access to 2-year colleges boosts the educational attainment and earnings of new 2-year entrants on net. 
Second, these net effects shroud opposing impacts along the two distinct enrollment margins: roughly one third of the new 2-year entrants are diverted from 4-year entry and earn fewer bachelor’s degrees as a result, empirically confirming concerns over the diversion channel but with smaller magnitudes than OLS regressions would suggest.10 The other two thirds of 2-year entrants would not have otherwise attended college, and they reap significant gains in educational attainment and earnings. Third, stratifying by demographics reveals that women drive these results with larger

9 Papers in the distance instrument tradition include Kane and Rouse (1993), Card (1995), Rouse (1995), Kling (2001), Cameron and Taber (2004), Carneiro and Lee (2009), Carneiro et al. (2011), Eisenhauer et al. (2015), and Nybom (2017).
10 Differences between the IV and OLS estimates could be driven by selection bias, treatment effect heterogeneity, or their combination.

effects along both margins compared to men, while 2-year access significantly boosts the upward earnings mobility of low-income students with little offsetting diversion. Finally, stratifying the effects of 2-year entry across the range of 2-year college proximity suggests that the net gains to marginal students do not diminish with further expansion of 2-year access.11 Taken together, these results contribute clarifying empirical evidence in light of a growing policy movement to expand access to 2-year community colleges. Since the Tennessee Promise program launched in 2014, offering recent high school graduates free tuition at all 2-year colleges in the state, Oregon, Minnesota, Kentucky, and Rhode Island have implemented similar programs, and at least ten more states are considering similar legislation.12 Several major metropolitan governments have launched analogous local programs, including the Chicago STAR Scholarship (launched in 2014), San Francisco’s Free City program (2017), and the Boston Bridge program (2017). The highest-profile policy of all, the Obama administration’s 2015 proposal to make two years of community college free across the country, has not yet advanced at the national level, but it continues to catalyze state and local programs and may become a component of the Democratic National Committee policy platform (Donnis 2017). 
The results of this paper suggest that such broad expansions of 2-year college access may boost the educational and labor market prospects of new students induced into higher education, but more targeted policies that avoid diverting college-bound students from 4-year entry may confer greater net benefits by reducing this unintended consequence.13 The empirical results build on an interdisciplinary literature studying the outcomes of 2-year college students, reviewed by Kane and Rouse (1999) and Belfield and Bailey (2011, 2017).14 Most of this literature relies on selection-on-observables assumptions to interpret OLS and matching results as causal. Belfield and Bailey (2017) discuss a small but growing set of papers that relax

11 If anything, the net returns increase slightly over the observed support of 2-year proximity, implying a reverse-Roy selection pattern of higher gains accruing to those students who are less likely to enroll, but this result is not statistically precise. See Aakvik et al. (2005), Walters (2017), Kline and Walters (2016), and Cornelissen et al. (2017) for examples of reverse-Roy selection patterns into educational programs.
12 National Conference of State Legislatures (2016). New York state’s recent Excelsior Scholarship program offers free tuition at 2-year community colleges as well as state 4-year institutions to students who satisfy specific income and enrollment requirements.
13 Comparing the benefits of additional 2-year enrollments to estimates of their marginal costs, borne both by students and taxpayers, remains an important avenue for future work.
14 The 2-year versus 4-year comparison also contributes to the literature on the effects of college quality; see Hoxby (2009) for a review, as well as Andrews et al. (2016) for recent quantile treatment effect estimation using the same administrative data source as this paper. The greater prevalence of vocational courses in the 2-year sector could alternatively cast the comparison as one between different types of education, e.g. college major; see Altonji et al. (2012) for a review, as well as Hastings et al. (2014) and Kirkeboen et al. (2016) for recent contributions.

this assumption in panel specifications with individual fixed effects, but this approach necessarily focuses on older workers who have accumulated pre-enrollment earnings histories.15 A handful of recent papers exploit natural experiments that directly or indirectly influence 2-year college enrollment, but these papers do not employ additional instruments to identify causal effects of 2-year entry along both the no college margin and the 4-year entry margin.16 In estimating causal impacts of 2-year college entry along both margins of enrollment through the use of multiple instruments, this paper advances the related work of Rouse (1995, 1998) and Miller (2007) by relaxing the assumption of constant causal effects embedded in their multivariate two-stage least squares (2SLS) specifications. As shown by Kirkeboen et al. (2016), multivariate 2SLS estimands generally pool together potential outcome differences across multiple treatment margins and complier groups, making multivariate 2SLS estimates difficult to interpret if effects vary across individuals.17 A rapidly growing literature suggests that constant effects are the exception rather than the rule in educational settings, which motivates the development of this paper’s separate identification approach.18 The identification results more generally contribute to the IV literature on identifying causal effects of multivalued treatments. Heckman and Urzua (2010) discuss the identification challenges inherent in settings with multiple margins of treatment, showing how individuals induced into a specific treatment by instrumental variation can come from multiple alternative states. A small set of papers have developed certain conditions under which pairwise treatment effects can be identified, including parametric restrictions on unobserved heterogeneity (Heckman and Vytlacil 2007; Feller et al. 2016); full-support instruments that permit a multidimensional version of identification at infinity (Heckman et al. 
2008); assumptions of uniform selection behavior across observable stratification groups (Kline and Walters 2016); and observable measures of individual preferences

15 See Jacobson et al. (2005) and Jepsen et al. (2014) for examples of this approach.
16 Denning (2017) studies changes in community college tuition induced by redistricting in Texas, while Zimmerman (2014) and Goodman et al. (2017) exploit cutoffs for admission into the public 4-year sectors of Florida and Georgia, respectively.
17 In Appendix B, I derive and decompose the multivariate 2SLS estimands corresponding to my econometric framework in Section 4.2, showing how they fuse all treatment margins into each coefficient. Kline and Walters (2016) derive a related result in the 2SLS case where a single instrument is interacted with a stratifying covariate as an attempt to generate another dimension of instrumental variation. See also Pinto (2016) for a related discussion in the context of the multiple treatment arms of the Moving to Opportunity experiment.
18 See Moffitt (2008), Carneiro and Lee (2009), Carneiro et al. (2011), Havnes and Mogstad (2015), Kirkeboen et al. (2016), Kline and Walters (2016), Carneiro et al. (2017), Cornelissen et al. (2017), Nybom (2017), and Walters (2017), among others.

over alternatives (Kirkeboen et al. 2016). This paper’s separate identification approach with two instruments adds a new method to the toolkit, complementing recent work by Lee and Salanie (2017) who study identification in multinomial selection models with potentially multiple dimensions of unobserved heterogeneity. Finally, while the institutional focus of this paper lies with higher education in the United States, the methodology developed to identify causal effects of multivalued treatments could be applied to a broader range of settings. The 2-year and 4-year college distance instruments have a general interpretation as prices or cost shifters, suggesting parallels to other unordered choices that depend on initial costs, including migration decisions, occupational choice, hospital admission, K-12 school choice, and firm location decisions. This method also applies to program evaluation in the presence of substitution bias (Heckman et al. 2000), since the task of evaluating a policy or program with readily-available substitutes—here the encouragement of 2-year enrollment in the presence of the 4-year alternative—is aided by the ability to decompose net policy impacts into distinct effects among individuals who would otherwise go untreated (no college) versus those who would have simply obtained the substitute (4-year entry).19 The remainder of the paper proceeds as follows. Section 2 provides institutional background on the American community college. Section 3 describes the linked administrative data and presents descriptive results on initial enrollment choices and outcomes. Section 4 discusses the identification challenges posed by multiple treatment margins and develops the separate identification approach. Section 5 discusses estimation and conducts diagnostics on the instruments. Section 6 presents the empirical results. Section 7 concludes.
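
The decomposition that motivates the two-instrument approach can be written schematically. The notation here is an illustrative assumption, not necessarily the notation used in later sections: D takes values 0 (no college), 2 (2-year entry), or 4 (4-year entry); Y_d is the potential outcome under initial choice d; D^{old} is the choice a student would make before 2-year access expands; and ω is the share of induced 2-year entrants who would otherwise not enroll.

```latex
% Illustrative notation (assumed, not taken from the paper); \text requires amsmath.
% The net effect among students induced into 2-year entry is a share-weighted
% average of the effects along the two enrollment margins.
\begin{align*}
\underbrace{E\left[\,Y_2 - Y_{D^{\text{old}}} \mid \text{induced into 2-year entry}\,\right]}_{\text{net effect}}
  &= \omega \,
     \underbrace{E\left[\,Y_2 - Y_0 \mid 0 \to 2\,\right]}_{\text{new entrants}}
   + (1 - \omega) \,
     \underbrace{E\left[\,Y_2 - Y_4 \mid 4 \to 2\,\right]}_{\text{diverted from 4-year}}
\end{align*}
```

With the shares reported in this Introduction, ω is roughly two thirds, so the net effect can be positive even when the diversion term is negative, as long as two thirds of the gain to new entrants exceeds one third of the loss to diverted students in absolute value.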

2 Institutional Background

Two-year community colleges straddle a complicated space in American higher education. From the emergence of the first “junior colleges” at the dawn of the 20th century, through their explosive midcentury growth and modern stabilization at roughly one thousand campuses across all fifty states, debate over the proper role of this “contradictory college” (Dougherty 1994) has continued apace, centering around three interrelated questions. The oldest, and largely resolved, question from the

19 See Kline and Walters (2016) for a close parallel in evaluating the impacts of Head Start.

primordial period at the turn of the 20th century was whether 4-year universities should spin off their first two years of teaching to these emerging junior colleges, allowing 2-year college faculty to specialize in undergraduate instruction while freeing up university faculty and resources to focus on the “higher” academic pursuits of research and graduate training. This sharp bifurcation into separate junior and senior institutions, advocated by the University of Chicago’s William Rainey Harper, Stanford’s David Starr Jordan, and several other prominent university presidents at the time (Cohen et al. 2014), never materialized on a large scale in the U.S., as the vast majority of colleges and universities that offer bachelor’s degrees have maintained their common model of four continuous undergraduate years.20 The resulting functional overlap between 2-year and 4-year colleges helped fuel debate over a second question: should 2-year colleges continue to prepare students for 4-year transfer through academic coursework, or should they differentiate themselves from 4-year institutions by focusing on terminal vocational training to prepare students for workforce entry? The academic transfer function remained the core mission of 2-year colleges from their inception through the mid-20th century, despite continuous efforts from much of the leadership of the American Association of Junior Colleges to carve out a new niche for 2-year colleges in providing occupational training (Brint and Karabel 1989; Cohen et al. 2014). 
A confluence of events in the 1960s and 1970s finally brought vocational education to the fore, including the federal Vocational Education Act of 1963, billions of dollars of subsequent vocational program funding championed by the Nixon administration, the early 1970s downturn in the wage premium to bachelor’s degrees, several reports from the influential Carnegie Commission on Higher Education advocating more vocational emphasis at 2-year colleges, and the shift in terminology from “junior” to “community” college, shedding the former connotation of subordination to 4-year institutions in favor of responsiveness to community needs, including occupational education suited to local industry demand.21 Though causation likely ran in both directions, these forces coincided with a rise in the share of 2-year college students pursuing vocational instead of academic programs, from less than a third in 1970

20 William Rainey Harper did manage to separate the undergraduate experience at the nascent University of Chicago into a Junior College and a Senior College, and even pioneered the American associate’s degree as an award to students who completed the two-year Junior College curriculum (Brint and Karabel 1989). But while separate 2-year colleges did emerge and grow dramatically over the 20th century, Harper’s own bifurcation of the University of Chicago never resulted in two standalone institutions, to the grumbling of many senior faculty members whom Harper recruited to the new university with assurances that there would be no need to teach lower-division undergraduates (Boyer 2015).
21 Freeman (1976); Brint and Karabel (1989); Cohen et al. (2014).

to more than half in 1977 (Blackstone 1978), and settling at rough parity today (Cohen et al. 2014). The rise of vocational education at 2-year colleges has only intensified debate over a final question, on which this paper focuses: do 2-year colleges boost the upward mobility of individuals who otherwise would not participate in higher education, or do they mainly divert inframarginal college students away from 4-year institutions, perhaps to their own detriment? The early champions of the community college movement focused almost exclusively on the “democratization” effect along the extensive margin, viewing 2-year college accessibility as a cornerstone in building a higher education system that offered equal opportunity to Americans from all backgrounds (Eells 1931; Koos 1944). In more recent decades, concerns over the diversion channel have grown more prominent in the academic literature, with Brint and Karabel (1989) arguing in an influential book that diversion is actually the dominant function of modern 2-year colleges, and others like Grubb (1989) and Dougherty (1994) noting that both of these margins are likely at play as simultaneous features of the “contradictory” community college. With nearly 10 million community college students in the United States annually generating over 50 billion dollars in costs, over 70 percent of which are subsidized by local, state, and federal taxpayers (National Center for Education Statistics 2015), building a clear understanding of democratization, diversion, and their impacts on student outcomes is not simply an academic pursuit, but a necessary input into evaluating the effects of a wide range of higher education policies that influence student enrollment decisions. The recent surge in the promotion and subsidization of 2-year college enrollment discussed in the Introduction, atop a century of controversy over the role of community colleges in American social mobility, motivates the analysis of this paper.

3 Data and Descriptive Results

3.1 Data Sources

My empirical analysis combines several restricted administrative datasets spanning the state of Texas. As the second largest U.S. state by population, land area, and GDP, Texas comprises 8.5 percent of the U.S. population and educates 10.5 percent of U.S. public K-12 students, and the Texan economy would rank 11th largest in the world as a sovereign nation. This large populace supports a comprehensive statewide system of higher education, with 37 public 4-year universities

enrolling 637,000 students, 38 private 4-year colleges and universities enrolling 124,000 students, and 57 public 2-year community college districts enrolling 732,000 students (National Center for Education Statistics 2015). The analysis sample begins with student-level data from the Texas Education Agency (TEA) covering the population of Texas public high school students.22 I link these students to administrative records from the Texas Higher Education Coordinating Board (THECB), capturing all enrollments and degrees at all public and private Texas colleges and universities.23 I further link these students to individual quarterly earnings records from the Texas Workforce Commission (TWC), measuring total earnings at each job each quarter for all Texas employees subject to the state unemployment insurance system.24 I complement these administrative student-level records with several auxiliary school- and neighborhood-level data sources: high school characteristics from the National Center for Education Statistics (NCES) Common Core of Data, college characteristics from the Integrated Postsecondary Education Data System (IPEDS), and neighborhood characteristics from the 2000 decennial Census measured at the tract level.25 One obvious limitation of any administrative data from a particular state is attrition due to outmigration. In my setting, college enrollments and earnings of Texas high school students who leave the state will not be observed. Fortunately, Texas has the lowest outmigration rate of any U.S. state, with 82 percent of all Texas-born individuals remaining in Texas as of 2012 (Aisch et al. 2014) and only 1.7 percent of Texas residents leaving the state each year (White et al. 2016). 
On the college enrollment front, National Student Clearinghouse (NSC) records are available for a subset of my sample—students who graduate from high school in 2008 and 2009—allowing me to study college enrollment patterns inclusive of the small fraction of Texas high school students who do attend college out-of-state.26 On the earnings front, missing earnings values could represent either nonemployment or outmigration; I show below that students with non-missing earnings look very

22 Private high school students, who are not observed in this data, account for less than 5 percent of all Texas high school graduates (National Center for Education Statistics 2015).
23 I do not observe for-profit college enrollments. In the fall of 2004, when the last of my main analysis cohorts begin to enter college, the for-profit share of enrollment at all degree-granting postsecondary institutions is only 5.1 percent (National Center for Education Statistics 2015).
24 Excluded from the state UI system are the self-employed, independent contractors, military personnel, some federal employees, and workers in the informal sector. Stevens (2007) estimates that roughly 90 percent of the civilian labor force is captured in state UI records.
25 Census tracts delineate neighborhoods of roughly 1,200 to 8,000 people, averaging around 4,000.
26 The NSC records cover 90 percent of nationwide college enrollment (Dynarski et al. 2013).

similar to those with missing earnings in terms of observable characteristics, suggesting that the scope for sample selection bias may be limited.27
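
The record linkage described in this subsection can be sketched as a pair of left joins onto the high school student file. The following is a minimal illustration under stated assumptions: the field names (`student_id`, `sector`, `earnings`, and so on), the toy records, and the helper `link_records` are all hypothetical and do not reflect the actual TEA, THECB, or TWC file layouts.

```python
# Hypothetical miniature versions of the three administrative files.
# Field names and values are illustrative assumptions, not the real schemas.
tea_students = [
    {"student_id": 1, "cohort": 2000, "frl": True},
    {"student_id": 2, "cohort": 2000, "frl": False},
    {"student_id": 3, "cohort": 2001, "frl": True},
]
thecb_enrollments = [
    {"student_id": 1, "sector": "2-year"},
    {"student_id": 2, "sector": "4-year"},
]
twc_earnings = [
    {"student_id": 1, "year": 2010, "quarter": 1, "earnings": 6200.0},
    {"student_id": 1, "year": 2010, "quarter": 1, "earnings": 800.0},  # second job
    {"student_id": 2, "year": 2010, "quarter": 1, "earnings": 9100.0},
]


def link_records(students, enrollments, earnings, year, quarter):
    """Left-join enrollment and one quarter of earnings onto the student file."""
    sector_by_id = {}
    for row in enrollments:
        # Keep the first enrollment record encountered for each student.
        sector_by_id.setdefault(row["student_id"], row["sector"])

    earnings_by_id = {}
    for row in earnings:
        if (row["year"], row["quarter"]) == (year, quarter):
            sid = row["student_id"]
            # Sum across all jobs held in the same quarter.
            earnings_by_id[sid] = earnings_by_id.get(sid, 0.0) + row["earnings"]

    linked = []
    for student in students:
        sid = student["student_id"]
        linked.append({
            **student,
            # "none" means no in-state college enrollment was observed.
            "sector": sector_by_id.get(sid, "none"),
            # None means no UI record: nonemployment and outmigration
            # cannot be told apart from this file alone.
            "earnings": earnings_by_id.get(sid),
        })
    return linked


linked = link_records(tea_students, thecb_enrollments, twc_earnings, 2010, 1)
```

Because every student in the base file is retained, students with missing earnings stay in the linked data, which is what makes it possible to compare their observable characteristics with those of students who have non-missing earnings.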

3.2 Variable Definitions

Cohorts.—The main analysis sample consists of five cohorts of Texas public high school students enrolled in 10th grade between 1998 and 2002. I will hereafter refer to their projected high school graduation years of 2000 through 2004.28 2000 is the oldest cohort for which private Texas college enrollments can be observed,29 and 2004 is the last cohort of the Texas Assessment of Academic Skills (TAAS) testing regime, with substantial changes to the testing structure thereafter. I also make separate use of the 2008 and 2009 cohorts in the descriptive results, leveraging their National Student Clearinghouse coverage to show college enrollment patterns inclusive of students who attend out-of-state.

Covariates.—Student-level demographics are measured in the 10th grade TEA enrollment files, including categorical variables for gender, race/ethnicity, and eligibility for free or reduced price lunch, a proxy for economic disadvantage. To obtain a single test score measure for each student, I combine raw 10th grade math and reading scores in a one-factor model separately by cohort, then normalize this factor to within-cohort percentiles. High school-level controls are measured in the NCES Common Core of Data and include the share of students eligible for free/reduced price lunch, the NCES geographic locale code, which measures local urbanization in twelve detailed categories based on Census geospatial data,30 and a county variable, which I group into the 62 Texas commuting zones using the year-2000 mapping provided by the U.S. Department of Agriculture’s Economic Research Service.31 To control for any local influences of the oil and gas industry, I also measure the long-run share of oil and gas employment at the high school level using the NAICS industry codes in the TWC workforce data. Finally, I construct an index of neighborhood quality

27 Andrews et al. (2016) and Dobbie and Fryer (2017) arrive at similar conclusions using different extracts from the same Texas administrative data as this paper.
28 Appendix Table A.1 shows that the proximity instruments have no detectable effect on the probability of graduating from high school.
29 The private Texas college enrollment data begin in Fall 2002, so the 2002 high school graduates are officially the first with complete private college enrollment coverage. Persistence rates at private colleges are quite high, however, so catching the 2001 and 2000 cohorts in their second and third years, respectively, at private colleges allows me to significantly increase the sample size with little measurement error in treatments.
30 These twelve urbanization categories are large city, midsize city, small city, large suburb, midsize suburb, small suburb, fringe town, distant town, remote town, fringe rural, distant rural, and remote rural.
31 https://www.ers.usda.gov/data-products/commuting-zones-and-labor-market-areas.aspx.

similar to the test score measure: I combine the tract-level Census measures of median household income and percent of households under the poverty line with the high school-level percent eligible for free/reduced price lunch into a one-factor model, then normalize this neighborhood factor to within-cohort percentiles. Treatments.—The three mutually exclusive and exhaustive treatments of interest are starting at a 2-year college, starting at a 4-year college, and not enrolling in any college. I define these by taking the first observed postsecondary enrollment, if any, starting the fall semester after projected high school graduation through the subsequent three academic years.32 Instruments.—I measure proximity (in miles) to the nearest 2-year and 4-year colleges by computing ellipsoidal distances between the coordinates of all Texas public high schools (from NCES CCD) and the coordinates of all Texas postsecondary institutions (from IPEDS), then taking minimum distances separately within the 2-year and 4-year sectors.33 For college campuses with missing geospatial data in IPEDS, I manually collected their locations by first checking each college’s institutional profile for standalone branch campuses and location changes over my sample period, then converting those year-specific physical addresses to geocoordinates via Google Maps. Academic outcomes.—I study the effects of 2-year college enrollment on two main academic outcomes. Bachelor’s degree completion is an indicator for appearing in the THECB public and private 4-year degree completion files within ten years of projected high school graduation. Years of completed schooling are defined by an algorithm detailed in the footnote below.34 Earnings.—I measure real quarterly earnings by summing TWC earnings within each personquarter, deflating by the quarterly U.S. 
consumer price index (base year 2010), winsorizing at the 99th percentile, and averaging the non-missing quarters within person over ages 28-30, the latest common ages available for these cohorts.35 To study earnings dynamics, I also construct similar 32 For the small number of students who initially enroll in both sectors simultaneously, I assign them to the sector with greater credit hours. Following Andrews et al. (2014), I ignore summer terms when defining sector of enrollment. 33 In determining minimum distances, I ignore small private campuses that enroll fewer than 400 Texas first-year students and small extension centers that offer very limited courses and student services. 34 Years of completed schooling range from 10 to 17. To complete 10: enroll in 11th grade. To complete 11: enroll in 12th grade. 12: complete high school. 13: enroll in college with 2nd year standing, or complete a certificate, or complete the academic core requirement at a community college. 14: enroll in college with 3rd year standing, or complete an associate’s degree. 15: enroll in college with 4th year standing. 16: complete a bachelor’s degree, or enter a postsecondary program with post-baccalaureate standing, or enroll in graduate school. 17: complete any graduate degree. 35 These ages assume the student was 16 at the end of 10th grade, which is true for roughly 96 percent of students in the sample. The 2004 cohort, the youngest in the sample, is only measured over ages 28-29, since the TWC earnings data are only available through 2015.

11

within-person averages over ages 22-24 and 25-27, as well as an annual panel of mean quarterly earnings at each observed age.
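As a rough illustration of the instrument construction described above, the following Python sketch computes the distance in miles from one high school to the nearest campus in each sector. It substitutes the spherical haversine formula for the ellipsoidal distances used in the paper, so small discrepancies from ellipsoidal values should be expected. The coordinates, the campus list, and the helper names `haversine_miles` and `nearest_miles` are hypothetical placeholders, not values or code from the NCES or IPEDS files.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; spherical approximation (assumption)


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def nearest_miles(school, campuses, sector):
    """Minimum distance from a high school to any campus in the given sector."""
    return min(
        haversine_miles(school[0], school[1], lat, lon)
        for (lat, lon, s) in campuses
        if s == sector
    )


# Hypothetical coordinates, for illustration only.
school = (30.2672, -97.7431)
campuses = [
    (30.2672, -97.7431, "2-year"),  # campus at the same location as the school
    (29.7604, -95.3698, "2-year"),
    (29.7604, -95.3698, "4-year"),
]

z2 = nearest_miles(school, campuses, "2-year")  # analogue of instrument Z2
z4 = nearest_miles(school, campuses, "4-year")  # analogue of instrument Z4
```

Taking the minimum separately within each sector mirrors the description in the text; the paper's exclusion of small campuses and extension centers would be applied by filtering `campuses` before this step.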

3.3 Sample Construction and Summary Statistics

To construct the main analysis sample of 2000-2004 cohorts and the NSC sample of 2008-2009 cohorts, I begin with the population of 10th grade students in each cohort with valid student identifiers, covariates, and high school locations. Table 1 presents summary statistics for these base samples. Column (1) shows that 3.7 percent of students in the 2008-2009 NSC cohorts attend college outside of Texas. To mitigate any bias from outmigration in the 2000-2004 main analysis cohorts for whom out-of-state enrollments are unobserved, column (3) drops the highest ability students with test scores above the 80th percentile. Appendix Figure A.1 shows that out-of-state enrollment in the 2008-2009 NSC cohorts is concentrated among students with test scores above this threshold, and that top-scoring students are also more likely to have missing earnings in the main 2000-2004 cohorts. Appendix Table A.1 shows that after dropping top-scoring students, the proximity instruments have no significant effect on the small remaining share of out-of-state enrollments in the 2008-2009 NSC cohorts. To complete the main analysis sample, column (4) of Table 1 drops the remaining students with no observed quarterly earnings over ages 28-30. Comparing columns (3) and (4) shows that students with non-missing earnings look very similar to the full sample in terms of covariates, though those with non-missing earnings are a bit more likely to have enrolled in college and earned degrees. In Appendix Figure A.2, I project earnings on all covariates and instruments within the earnings sample, predict earnings for those with missing earnings, and plot the two densities of predicted earnings for comparison. The distributions are nearly identical, with a mean difference of just 58 dollars. These results cannot rule out differential attrition based on unobservables, but they offer some assurance that the scope for bias may be limited.


Table 1: Summary Statistics

                                   (1)              (2)              (3)              (4)
                                   NSC cohorts      Main cohorts     Main cohorts     Main cohorts
                                   2008-2009        2000-2004        w/out top        w/out top scoring
                                                                     scoring          quintile or
                                                                     quintile         missing earnings
                                   Mean (SD)        Mean (SD)        Mean (SD)        Mean (SD)

Covariates
  Female                           0.514            0.513            0.516            0.517
  White                            0.456            0.529            0.477            0.464
  Black                            0.142            0.122            0.141            0.145
  Hispanic                         0.367            0.317            0.355            0.367
  Asian                            0.032            0.029            0.024            0.021
  Free/reduced price lunch         0.395            0.311            0.354            0.359
  Test score percentile            50.6             50.7             40.5             40.4
  Neighborhood quality pctile      50.6             50.6             48.7             48.3
  Oil/gas employment share         0.017            0.018            0.018            0.018
  City                             0.364            0.377            0.389            0.386
  Suburb                           0.276            0.256            0.243            0.242
  Town                             0.119            0.134            0.134            0.136
  Rural                            0.242            0.233            0.234            0.235

Treatments
  No college                       0.318            0.388            0.433            0.391
  Start at 2-year college          0.371            0.339            0.366            0.394
  Start at 4-year college          0.274            0.273            0.201            0.215
  Start at 4-year (out of state)   0.037            -                -                -

Instruments
  Miles to 2-year college          8.5 (10.1)       9.7 (11.4)       9.7 (11.5)       9.7 (11.5)
  Miles to 4-year college          19.5 (18.7)      20.5 (19.7)      20.5 (19.9)      20.4 (19.8)

Academic Outcomes
  Bachelor’s degree                -                0.255            0.187            0.207
  Years of schooling               -                13.1 (2.1)       12.8 (2.0)       13.0 (2.0)

Earnings Outcomes
  Mean quarterly earnings          -                8,826 (5,841)    8,168 (5,413)    8,168 (5,413)
  Has quarterly earnings           -                0.764            0.773            1.000

Observations                       454,078          957,752          763,847          590,397

Notes: NSC cohorts are those with National Student Clearinghouse data coverage. The twelve NCES geographic locale categories are grouped into four values to save space. Academic outcomes are measured at age 28. Earnings outcomes are measured over ages 28-30. Native Americans are omitted as an ethnicity category due to small shares.


3.4 Sorting into Initial College Enrollments

Figure 1 describes how initial college enrollment choices vary across observable student characteristics in the 2008-2009 NSC cohorts.36 Across demographic groups and neighborhood quality deciles, the 2-year college enrollment share is remarkably constant around the mean of 37 percent; what differs are the outside option shares of 4-year enrollment and no college, with men, disadvantaged students, underrepresented minorities, and students from poor neighborhoods more likely to forgo college altogether than to enroll in a 4-year institution. The bottom panel of Figure 1 shows that the 2-year enrollment share is hump-shaped across 10th grade test scores with a peak at the 40th percentile, though 2-year enrollment is still quite broadly distributed. 4-year college enrollment and no college enrollment, meanwhile, are strongly monotonic in test scores in opposing directions.

3.5 Enrollment and Earnings Dynamics

Figure 2 describes how the initial sector of enrollment relates to enrollment in subsequent years. The left panel conditions on initial 2-year entrants and shows that the vast majority of them stay in the 2-year sector, if enrolled at all, for the first few years after entry. Enrollment in 4-year colleges among these 2-year entrants then rises and peaks around 20 percent at ages 22-23, with both 2-year and 4-year enrollment slowly trailing off thereafter. The right panel, conditioning on 4-year entrants, tells a similar story of “sticky treatment” in that the vast majority of 4-year entrants stay in the 4-year sector. Roughly 20 percent of initial 4-year entrants enroll in a 2-year college in their early 20s, either as transfers or through dual enrollment.

Figure 3 plots the raw earnings profiles associated with each initial enrollment choice. As expected, 4-year entrants overtake 2-year entrants, and 2-year entrants overtake those who do not enroll in any college, but differences by gender emerge: women experience this overtaking a full two years prior to men, and the raw college premiums for women are larger, especially in proportional terms. These gender differentials persist into the causal results, as shown in Section 6.

36 Appendix Figure A.3 reproduces these results for the 2000-2004 main analysis cohorts. The plots are very similar up through the 80th percentile test score sample cutoff, beyond which the share of students starting 4-year is somewhat understated, and no college overstated, due to unobserved out-of-state enrollments.


Figure 1: Sorting into College Enrollments by Observable Characteristics

[Figure: three panels showing enrollment shares (Start 2-year, Start 4-year, No college) on a 0 to 1 scale. Top panel: by gender (Women, Men), disadvantage status (Not disadvantaged, Disadvantaged), and race/ethnicity (Asian, White, Black, Hispanic). Middle panel: by neighborhood quality percentile (0 to 100). Bottom panel: by test score percentile (0 to 100).]

Notes: 2008-2009 cohorts with National Student Clearinghouse college enrollment coverage; see Appendix Figure A.3 for comparison to the main analysis cohorts for whom out-of-state enrollments are not observed. Disadvantaged is an indicator for free or reduced price lunch eligibility in 10th grade. Neighborhood quality and test score percentiles, defined in Section 3.2, are grouped into 5-unit bins.


Figure 2: Enrollment Dynamics by Sector of Initial Enrollment

[Figure: two panels, "Initial 2-Year Entrants" (left) and "Initial 4-Year Entrants" (right). Horizontal axis: age, 19 to 28. Vertical axis: share of 2-year entrants (left) or share of 4-year entrants (right), 0 to 1. Series: Enrolled in 2-year, Enrolled in 4-year.]

Notes: Enrollment shares at age 19 are not equal to 1 due to the 3-year window in defining the sector of initial enrollment. Subsequent enrollments are not mutually exclusive at a given age; a small fraction of students enroll in both sectors simultaneously.

Figure 3: Earnings Profiles by Sector of Initial Enrollment

[Figure: two panels, "Men" and "Women". Horizontal axis: age, 19 to 33. Vertical axis: mean quarterly earnings, 0 to 15,000. Series: No college, Start 2-year, Start 4-year.]

Notes: Quarterly earnings are measured in real 2010 U.S. dollars and averaged within person over each age. Earnings after age 29 are only available for progressively older cohorts in the analysis sample, as the earnings data end in 2015.


Table 2: Raw and Controlled OLS Regressions

                                 Years of schooling      Bachelor’s degree       Quarterly earnings
                                 Raw        Controlled   Raw        Controlled   Raw        Controlled

Start 2-year vs. no college      1.75       1.48         0.189      0.149        1,418      1,089
                                 (0.01)     (0.01)       (0.002)    (0.001)      (20)       (17)

Start 2-year vs. start 4-year    -1.58      -1.33        -0.392     -0.355       -1,659     -1,400
                                 (0.01)     (0.01)       (0.002)    (0.002)      (30)       (23)

R2                               0.407      0.457        0.286      0.323        0.046      0.169
N                                590,397    590,397      590,397    590,397      590,397    590,397

Notes: Standard errors in parentheses are clustered at the high school campus by cohort level. Academic outcomes are measured at age 28. Quarterly earnings are measured in real 2010 U.S. dollars and averaged within person over ages 28-30.

3.6 Regression Results

Turning to regression specifications that quantify outcome differences across initial enrollment choices, Table 2 presents coefficients from OLS regressions of the following form:

\[
\text{Outcome} = \alpha - \beta_{2\leftarrow 0}\,\mathbf{1}\{\text{No college}\} - \beta_{2\leftarrow 4}\,\mathbf{1}\{\text{Start 4-year}\} + \text{Controls} + \epsilon
\]

Writing the specification in this form, with 2-year entry as the excluded category, immediately delivers a comparison between 2-year entry vs. no college in $\beta_{2\leftarrow 0}$, and a comparison between 2-year entry vs. 4-year entry in $\beta_{2\leftarrow 4}$. The control set includes cohort fixed effects, dummies for each categorical covariate and cubic polynomials in each continuous covariate listed in Table 1, commuting zone fixed effects, and a cubic polynomial in 10th grade test score percentile.

Taken at face value, the OLS results in Table 2 suggest diversion from 4-year to 2-year entry has large negative consequences on educational attainment and earnings: 2-year entrants complete 1.3 fewer years of schooling, are 36 percentage points less likely to complete a bachelor’s degree, and earn $1,400 less per quarter around age 30 relative to observably similar 4-year entrants. But how much do these differences reflect causality versus selection bias? And what share of students are actually on the diversion margin when 2-year college access expands? These questions motivate the instrumental variables method developed in the next section.
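To make the sign convention in this specification concrete, here is a minimal Python sketch of the raw (no controls) version of the regression. Entering the no-college and 4-year dummies with negative signs, with 2-year entry as the excluded category, makes each coefficient equal the 2-year group mean minus the comparison group mean. The toy data and the helper names `solve`, `ols`, and `mean_years` are hypothetical; this is not the paper's estimation code, and it omits the control set and clustered standard errors.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)


# Hypothetical toy data: initial enrollment choice (0, 2, or 4) and years of schooling.
choice = [0, 0, 0, 2, 2, 2, 4, 4, 4]
years = [12.0, 12.0, 13.0, 13.0, 14.0, 15.0, 15.0, 16.0, 16.0]

# Regressors: constant, minus the no-college dummy, minus the 4-year dummy.
X = [[1.0, -float(d == 0), -float(d == 4)] for d in choice]
alpha, beta_2_vs_0, beta_2_vs_4 = ols(X, years)


def mean_years(d):
    """Mean outcome among students whose initial choice is d."""
    vals = [y for c, y in zip(choice, years) if c == d]
    return sum(vals) / len(vals)
```

In this saturated toy case, `alpha` recovers the 2-year mean, `beta_2_vs_0` the 2-year minus no-college gap, and `beta_2_vs_4` the 2-year minus 4-year gap, matching the "Raw" rows' interpretation in Table 2.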

4 Identification

4.1 The Methodological Challenge

Instrumental variables (IV) offer a potential solution to the problem of selection bias in college enrollment choices, since valid instrumental variation can induce otherwise similar students into different choices and thus enable causal comparisons of their subsequent outcomes. For the well-known case of binary treatment, Imbens and Angrist (1994) demonstrate that such comparisons identify a local average treatment effect (LATE) among compliers, those individuals whose choices respond to instrumental variation, under the standard assumptions of independence, exclusion, and monotonicity. Multiple margins of treatment present a challenge within this paradigm: when instruments shift individuals among more than two alternatives, the relevant counterfactuals for individuals induced into a specific alternative may be both multiple and unobserved, which hampers causal comparisons of the consequences of one choice versus another.37

To see this in the setting of community college enrollment, suppose an exogenous binary instrument $Z_2$ induces students into 2-year college entry (indicated by $D_2$) and does not otherwise affect enrollment choices or outcomes. The multiple margins of treatment in this case are the extensive “democratization” margin and the intensive “diversion” margin: some $Z_2$ compliers would have otherwise not attended college, while others would have started at a 4-year institution. Abstracting from covariates, the standard two-stage least squares (2SLS) approach to IV would specify the following outcome and first stage equations:

\[
\begin{aligned}
Y &= \beta_0 + \beta_2 D_2 + \epsilon \\
E[D_2 \mid Z_2] &= \alpha_0 + \alpha_2 Z_2,
\end{aligned}
\]

where $Y$ is a student outcome (e.g. bachelor’s degree attainment or earnings) and $\beta_2$ is the coefficient of interest on 2-year entry. A straightforward argument, provided in Appendix A, shows that the 2SLS estimand $\beta_2$ represents a pooled local average treatment effect (LATE) of 2-year entry on student outcomes, combining the two distinct complier margins into a single weighted average:38

\[
\beta_2 = \underbrace{LATE_2}_{\text{Net effect of 2-year entry}} = \omega \underbrace{LATE_{2\leftarrow 0}}_{\text{Democratization effect}} + (1-\omega) \underbrace{LATE_{2\leftarrow 4}}_{\text{Diversion effect}} \tag{1}
\]

37 See Kirkeboen et al. (2016) for a higher education setting with observable measures of these relevant counterfactuals, thanks to a centralized admissions system that requires applicants to submit rank-ordered lists of program enrollment preferences. The college application and enrollment process is far more decentralized in the United States, usually prohibiting identification of a given student’s next-preferred alternative to a given program.

The weight $\omega$ captures the share of $Z_2$ compliers who are on the 2–0 margin, and this share is identified by the reduction in $\Pr(D_0)$ induced by $Z_2$ as a fraction of the increase in $\Pr(D_2)$. The distinct democratization and diversion effects are not separately identified, however, leaving these likely opposing impacts of 2-year enrollment shrouded behind the identified net effect that pools them together. In many settings, the net effect is a parameter of interest in its own right; here, $LATE_2$ captures the aggregate impact of 2-year enrollment on all students induced by the instrument, which may correspond to policy-relevant variation like closer access to a community college campus, subsidized tuition, etc. Decomposing the net effect into its opposing impacts on students from each margin, however, allows for a more comprehensive assessment of impacts and potential unintended consequences of such policies.

Equation (1) reveals that diverse combinations of democratization and diversion effects could all yield the same net effect, with very different policy implications. To take two illustrative cases, consider that the same positive $LATE_2$ value could be generated from a moderately positive $LATE_{2\leftarrow 0}$ plus a zero $LATE_{2\leftarrow 4}$, or alternatively a large positive $LATE_{2\leftarrow 0}$ plus a large negative $LATE_{2\leftarrow 4}$. The first case features modest gains for democratized students and no loss to diverted students; in light of lower costs at 2-year colleges relative to their 4-year counterparts, this case could potentially justify broad investment in 2-year college access as a cost-effective engine of upward mobility. The second case features large gains for democratized students but large losses for diverted students; this case would demand caution in broadly expanding 2-year access, perhaps in favor of targeted policies towards the types of students likely to be on the extensive margin while encouraging students with 4-year ambitions to start directly in that sector.

With two treatment margins of interest and only one instrument, the preceding 2SLS framework is underidentified, so a natural next step is to consider multivariate 2SLS when a second instrument is available, e.g. $Z_4$:

\[
\begin{aligned}
Y &= \beta_0 + \beta_2 D_2 + \beta_4 D_4 + \epsilon \\
E[D_2 \mid Z_2, Z_4] &= \alpha_0 + \alpha_2 Z_2 + \alpha_4 Z_4 \\
E[D_4 \mid Z_2, Z_4] &= \gamma_0 + \gamma_2 Z_2 + \gamma_4 Z_4
\end{aligned}
\]

Kirkeboen et al. (2016) consider this case and show that even with one instrument per endogenous treatment, the multivariate 2SLS framework does not generally recover causal effects of one treatment versus another for any relevant population. Instead, the 2SLS estimands mix together additional treatment margins and complier subpopulations to yield weighted averages that generally do not correspond to a simple net effect as in Equation (1). In Appendix B, I derive and decompose the multivariate 2SLS estimands corresponding to my econometric framework in Section 4.2, showing how they fuse all treatment margins into each coefficient.39 The remainder of this section develops an alternative separate identification approach that overcomes the limitations of 2SLS in the presence of multiple treatment margins, allowing for a decomposition of the net effect of 2-year college access into its distinct democratization and diversion components.

38 Heckman and Urzua (2010) and Kline and Walters (2016) provide similar derivations, as do Angrist and Imbens (1995) for the case of ordered multivalued treatments.
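The pooling in Equation (1) can be checked numerically. The Python sketch below builds a small hypothetical population of always-takers and compliers, computes the single-instrument Wald (2SLS) estimand for a binary $Z_2$, and compares it with the weighted average of the two margin-specific effects. All shares and potential outcomes are invented for illustration, and the names `groups`, `outcome`, and `mean_given_z` are assumptions of this sketch rather than objects from the paper.

```python
# Each group: (population share, choice if Z2 = 0, choice if Z2 = 1, Y0, Y2, Y4).
# All numbers are hypothetical.
groups = [
    (0.30, 0, 0, 6000.0, 7000.0, 8000.0),  # always no college
    (0.25, 2, 2, 6500.0, 7500.0, 8500.0),  # always 2-year
    (0.25, 4, 4, 7000.0, 8000.0, 9500.0),  # always 4-year
    (0.12, 0, 2, 6000.0, 7200.0, 8000.0),  # 2<-0 compliers (democratization margin)
    (0.08, 4, 2, 7000.0, 7800.0, 8600.0),  # 2<-4 compliers (diversion margin)
]


def outcome(d, y0, y2, y4):
    """Realized outcome given the chosen treatment d."""
    return {0: y0, 2: y2, 4: y4}[d]


def mean_given_z(z, f):
    """Population mean of f(choice, Y0, Y2, Y4) when Z2 = z, with Z2 independent of type."""
    return sum(s * f(d1 if z == 1 else d0, y0, y2, y4) for s, d0, d1, y0, y2, y4 in groups)


reduced_form = mean_given_z(1, outcome) - mean_given_z(0, outcome)
d_pr_d2 = mean_given_z(1, lambda d, *_: float(d == 2)) - mean_given_z(0, lambda d, *_: float(d == 2))
d_pr_d0 = mean_given_z(1, lambda d, *_: float(d == 0)) - mean_given_z(0, lambda d, *_: float(d == 0))

wald = reduced_form / d_pr_d2   # single-instrument 2SLS estimand, beta_2
omega = -d_pr_d0 / d_pr_d2      # share of Z2 compliers on the 2-0 margin

late_2_0 = 7200.0 - 6000.0      # Y2 - Y0 among 2<-0 compliers
late_2_4 = 7800.0 - 8600.0      # Y2 - Y4 among 2<-4 compliers
pooled = omega * late_2_0 + (1 - omega) * late_2_4
```

Here the weight is recovered from observable choice shares alone, while `late_2_0` and `late_2_4` are read off the hypothetical potential outcomes, which is exactly the information a single instrument does not reveal in real data.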

4.2 Setup and Assumptions

To set up the identification arguments, let us begin with notation. The three mutually exclusive and exhaustive discrete treatments are $D = 2$ (start at a 2-year college), $D = 4$ (start at a 4-year college), and $D = 0$ (no college). Define $D_2$, $D_4$, $D_0$ as the binary indicators corresponding to each treatment, noting that $D_2 + D_4 + D_0 = 1$. We are interested in identifying causal effects of these initial choices on outcomes $Y$ using two continuous instruments: $Z_2$ (distance to the nearest 2-year college) and $Z_4$ (distance to the nearest 4-year college).40

39 A similar result applies when interacting a single instrument with covariates in the attempt to generate additional sources of instrumental variation (Kline and Walters 2016). See Pinto (2016) for a related discussion in the context of the Moving to Opportunity experiment.

40 Appendix D discusses identification in the case of discrete instruments.

To simplify notation, suppress the individual index $i$ and implicitly condition on the control set $X$ throughout. Denote potential treatment choice as $D(z_2, z_4) \in \{0, 2, 4\}$: this is the enrollment choice a student would make if exogenously assigned to instrument values $(Z_2, Z_4) = (z_2, z_4)$. Define the binary indicators $D_0(z_2, z_4)$, $D_2(z_2, z_4)$, $D_4(z_2, z_4)$ analogously. The potential outcomes associated with these treatments are $Y_0$, $Y_2$, $Y_4$, indicating the outcome a student would reap if exogenously assigned to treatment $D = d \in \{0, 2, 4\}$. Realized outcomes are thus $Y = Y_0 D_0 + Y_2 D_2 + Y_4 D_4$, and the treatment effects of interest are $Y_2 - Y_0$ (democratization) and $Y_2 - Y_4$ (diversion).

The following assumptions put the necessary structure on these counterfactual objects to secure identification from the observed data $(Y, D, Z_2, Z_4)$. First, I make the standard IV assumptions of independence and exclusion of the instruments:

A1. Independence and Exclusion:
\[
(Z_2, Z_4) \perp\!\!\!\perp \left( Y_0, Y_2, Y_4, \{D(z_2, z_4)\}_{\forall (z_2, z_4)} \right).
\]

Independence requires that the instruments are as good as randomly assigned, conditional on the implicit control set $X$, and exclusion requires that outcomes only depend on instruments via choices. Second, I adapt the standard monotonicity assumption to the case of multiple unordered treatments and instruments:

A2. Partial Unordered Monotonicity:
\[
\begin{aligned}
Z_2)\quad & \Pr\left(D_j(z_2', z_4) \geq D_j(z_2, z_4)\right) = 1 \ \text{ or } \ \Pr\left(D_j(z_2', z_4) \leq D_j(z_2, z_4)\right) = 1 \ \text{ for all } z_2, z_2', z_4, j. \\
Z_4)\quad & \Pr\left(D_j(z_2, z_4') \geq D_j(z_2, z_4)\right) = 1 \ \text{ or } \ \Pr\left(D_j(z_2, z_4') \leq D_j(z_2, z_4)\right) = 1 \ \text{ for all } z_2, z_4, z_4', j.
\end{aligned}
\]

Partial unordered monotonicity extends the binary intuition of “no defiers” to the multivariate setting: a given instrument shift makes each treatment uniformly more or less attractive across individuals.
Heckman and Pinto (2017) introduce unordered monotonicity to the IV literature and consider the case where the condition holds across any two values of a discrete instrument; I add the “partial” modifier to Assumption A2 to reflect that I only consider partial instrumental variation in $Z_2$ holding $Z_4$ fixed and $Z_4$ holding $Z_2$ fixed, relaxing the assumption of unordered monotonicity across instrument shifts in which both $Z_2$ and $Z_4$ change. Mapping this assumption into the college enrollment setting, 2-year proximity makes 2-year entry more attractive at the expense of no college and 4-year entry (inducing 2–0 and 2–4 compliers), while 4-year proximity makes 4-year entry more attractive at the expense of no college and 2-year entry (inducing 0–4 and 2–4 compliers).41

Finally, I introduce a new assumption that draws a connection between $Z_2$ compliers and $Z_4$ compliers:42

A3. Comparable Compliers:
\[
\lim_{z_2' \to z_2} E\left[Y_2 \mid D(z_2', z_4) = 2,\ D(z_2, z_4) = 4\right] = \lim_{z_4' \to z_4} E\left[Y_2 \mid D(z_2, z_4') = 2,\ D(z_2, z_4) = 4\right] \ \text{ for all } z_2, z_4.
\]

This comparable compliers assumption allows me to use information about $Z_4$ compliers to infer otherwise-unidentified characteristics of $Z_2$ compliers. From a common base value of the instruments, a marginal shift in $Z_2$, i.e. $(z_2, z_4) \to (z_2', z_4)$, and a marginal shift in $Z_4$, i.e. $(z_2, z_4) \to (z_2, z_4')$, each induce some compliers along the 2–4 diversion margin. Assumption A3 requires that these two diverted complier groups are not systematically different in terms of their mean potential 2-year outcomes. This is similar to an index sufficiency condition: both instruments marginally tilt the relative cost of 2-year versus 4-year entry, so exactly which instrument caused the change is assumed irrelevant in tracing out the consequences on outcomes. This assumption is not directly testable, but an indirect check involves comparing the means (or entire distributions) of observable characteristics across the two complier groups, which are separately identified in this framework. Table 3 below conducts this check with mean test scores and finds that the two complier groups are statistically indistinguishable.

41 An option value channel could potentially cause 4-year proximity to induce 0–2 compliers, since the future prospect of upward transfer may inspire some non-college individuals into 2-year entry. Empirically, the first stage relationship between 4-year proximity and the probability of not attending college turns out to be quite small; this limits the influence of such an option value channel on the results, since the mass of any 0–2 compliers with respect to 4-year proximity would be bounded by this small first stage share.

42 I implicitly make the necessary continuity and regularity assumptions that ensure the limits in A3 are well-defined.

43 Vytlacil (2002) proves an equivalence result between the standard IV assumptions of independence, exclusion, and monotonicity and a weakly separable selection model for the binary IV case, and Heckman and Pinto (2017) extend this logic to multivariate treatment settings in which unordered monotonicity holds across all pairwise instrument comparisons. In my partial unordered monotonicity setting, the selection model presented here is sufficient but not necessary for Assumptions A1-A3 to hold; generalizing the selection model in a way that achieves equivalence under this weaker monotonicity condition is an avenue for future work.

To interpret these assumptions through the lens of economic behavior, consider that both partial unordered monotonicity (A2) and comparable compliers (A3) are implied by a weakly separable selection model with choice-specific instruments.43 Suppose individuals have latent indirect utilities for each choice given by

\[
\begin{aligned}
I_0 &= 0 \\
I_2 &= U_2 - \mu_2(Z_2) \\
I_4 &= U_4 - \mu_4(Z_4),
\end{aligned}
\]

where the utility of no college is normalized to zero, $U_2$ is unobserved preference heterogeneity for 2-year enrollment, $U_4$ likewise for 4-year enrollment, and the $\mu_j(\cdot)$ functions represent the costs of each alternative. Note that $\mu_2(\cdot)$ and $\mu_4(\cdot)$ need not be the same function, allowing for different disutilities of 2-year and 4-year distance; the key restrictions are that these functions are weakly separable from unobserved heterogeneity and that each instrument is specific to each choice.44 Independence and exclusion (A1) require $(Z_2, Z_4) \perp\!\!\!\perp (U_2, U_4, Y_0, Y_2, Y_4)$, while selection can arise through relationships between unobserved preferences and potential outcomes: $(U_2, U_4)$ and $(Y_0, Y_2, Y_4)$ can depend on each other in unrestricted ways. To complete the model, individuals simply choose the alternative with the highest indirect utility, implying the choice equations

\[
\begin{aligned}
D_0(z_2, z_4) &= \mathbf{1}\left[U_2 < \mu_2(z_2),\ U_4 < \mu_4(z_4)\right] \\
D_2(z_2, z_4) &= \mathbf{1}\left[U_2 > \mu_2(z_2),\ U_4 - U_2 < \mu_4(z_4) - \mu_2(z_2)\right] \\
D_4(z_2, z_4) &= \mathbf{1}\left[U_4 > \mu_4(z_4),\ U_4 - U_2 > \mu_4(z_4) - \mu_2(z_2)\right].
\end{aligned}
\]

Figure 4 visualizes how this selection model generates A2 and A3 as necessary implications, and Appendix C provides the formal proofs. The first panel of Figure 4 shows how the choice equations partition the two-dimensional space of unobserved preference heterogeneity for a given value of the instruments: individuals who choose $D = 0$ have low preference values for both 2-year and 4-year enrollment, while those who choose $D = 2$ or $D = 4$ have higher values of the corresponding $U_j$.
The second and third panels visualize partial unordered monotonicity (A2): a shift in $Z_2$ increases the attractiveness of 2-year enrollment at the expense of no college and 4-year enrollment, while $Z_4$ increases the attractiveness of 4-year enrollment at the expense of no college and 2-year enrollment. Finally, comparable compliers (A3) emerge in the fourth panel by overlaying these two instrument shifts: both $Z_2$ and $Z_4$ induce 2–4 compliers, and these two complier groups have the same $(U_2, U_4)$ values.45 This is a sufficient, but not necessary, condition for equal mean potential 2-year outcomes of these two complier groups. The selection model thus provides a useful illustration of Assumptions A2 and A3 but is slightly stronger than necessary, motivating the following nonparametric results.

44 The index/cost function $\mu_2(\cdot)$ could depend on $Z_4$, and $\mu_4(\cdot)$ on $Z_2$, as long as each function is able to vary while fixing the other; then identification would proceed by first identifying these index functions (e.g. Matzkin 1993), then exploiting partial variation in the function values rather than the instruments directly. See Cameron and Heckman (1998) for an application of this logic in a dynamic discrete choice framework, and Lee and Salanie (2017) for a recent application to multivalued treatments with multiple dimensions of unobserved heterogeneity.

Figure 4: Selection Model Illustration

[Figure: four panels in the $(U_2, U_4)$ plane. Each panel shows the regions $D(z_2, z_4) = 0$, $D(z_2, z_4) = 2$, and $D(z_2, z_4) = 4$, bounded by $\mu_2(z_2)$, $\mu_4(z_4)$, and the line $U_4 - U_2 = \mu_4(z_4) - \mu_2(z_2)$. Shifted boundaries at $\mu_2(z_2')$ and $\mu_4(z_4')$ mark the regions labeled "2 ← 0 compliers" and "0 ← 4 compliers".]

Notes: This figure illustrates how the selection model in Section 4.2 generates Assumption A2 (partial unordered monotonicity) and Assumption A3 (comparable compliers). The top left panel shows how the choice equations, given a particular pair of instrument values, partition the two-dimensional space of unobserved preference heterogeneity. The top right panel illustrates a shift in Z2 ; the bottom left panel illustrates a shift in Z4 ; and the final panel overlays these two shifts to illustrate their comparable compliers.
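The selection model's implication for partial unordered monotonicity can be sketched in a few lines of Python. The code below assigns each preference type on a hypothetical grid of $(U_2, U_4)$ values to its utility-maximizing alternative, lowers the 2-year cost to mimic a shift in $Z_2$, and records which switches occur. The grid, the cost values, and the function name `choose` are assumptions made for this illustration only.

```python
def choose(u2, u4, cost2, cost4):
    """Pick the alternative with the highest indirect utility:
    I0 = 0, I2 = U2 - cost2, I4 = U4 - cost4 (ties go to the earlier alternative)."""
    utilities = {0: 0.0, 2: u2 - cost2, 4: u4 - cost4}
    return max((0, 2, 4), key=lambda d: utilities[d])


# Hypothetical grid of preference types (U2, U4) and hypothetical cost values.
grid = [(-2.0 + 0.1 * i, -2.0 + 0.1 * j) for i in range(41) for j in range(41)]
base = (0.5, 0.5)        # (mu2(z2), mu4(z4)) at the base instrument values
closer_2yr = (0.2, 0.5)  # a shift in Z2 that lowers the cost of 2-year entry

# Collect every (before, after) pair of choices that differs across the shift.
switches = set()
for u2, u4 in grid:
    before = choose(u2, u4, *base)
    after = choose(u2, u4, *closer_2yr)
    if before != after:
        switches.add((before, after))
```

Under these assumptions the only switches are from no college into 2-year entry and from 4-year into 2-year entry, so the shift moves each treatment indicator in one direction for every type, as A2 requires for variation in $Z_2$ with $Z_4$ held fixed.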

4.3 Identification Results

I identify pairwise treatment effects along the 2–0 and 2–4 choice margins by first isolating the mean potential outcomes of compliers along each margin separately, then taking differences of these potential outcomes to form treatment effects. The method begins by decomposing the reduced form with respect to $Z_2$:

\[
\begin{aligned}
E[Y \mid Z_2, Z_4] &= E[Y D_0 + Y D_2 + Y D_4 \mid Z_2, Z_4] \\
\frac{\partial E[Y \mid Z_2, Z_4]}{\partial Z_2} &= \frac{\partial E[Y D_0 \mid Z_2, Z_4]}{\partial Z_2} + \frac{\partial E[Y D_2 \mid Z_2, Z_4]}{\partial Z_2} + \frac{\partial E[Y D_4 \mid Z_2, Z_4]}{\partial Z_2}
\end{aligned}
\]

Since $Y D_0 = Y_0$ when $D_0 = 1$ and $Y D_0 = 0$ otherwise, instrument-induced changes in $E[Y D_0 \mid Z_2, Z_4]$ reveal information about $Y_0$ among compliers switching into or out of $D_0$. By partial unordered monotonicity, changes in $D_0$ with respect to $Z_2$ are driven by 2–0 compliers, so changes in $E[Y D_0 \mid Z_2, Z_4]$ with respect to $Z_2$ reveal information about $Y_0$ for these 2–0 compliers:

\[
\frac{\partial E[Y D_0 \mid Z_2, Z_4]}{\partial Z_2} = E[Y_0 \mid \text{2--0 complier}] \, \frac{\partial E[D_0 \mid Z_2, Z_4]}{\partial Z_2},
\]

where $E[Y_0 \mid \text{2--0 complier}]$ is shorthand for $\lim_{z_2' \to z_2} E[Y_0 \mid D(z_2', z_4) = 2,\ D(z_2, z_4) = 0]$ evaluated at a given instrument point $(Z_2, Z_4) = (z_2, z_4)$. Appendix D provides the proof for this result and the others in this section. Since the derivatives $\frac{\partial E[Y D_0 \mid Z_2, Z_4]}{\partial Z_2}$ and $\frac{\partial E[D_0 \mid Z_2, Z_4]}{\partial Z_2}$ are directly identified from data, their ratio identifies $E[Y_0 \mid \text{2--0 complier}]$ at all points in the instrument support at which the first stage derivative $\frac{\partial E[D_0 \mid Z_2, Z_4]}{\partial Z_2}$ is nonzero.

45 The small triangle in the fourth panel of Figure 4 in which the 2–4 Z2 and Z4 compliers do not overlap vanishes as these discrete shifts converge to marginal shifts.


Likewise, changes in D4 with respect to Z2 are driven by 2–4 compliers, so changes in E[Y D4 | Z2, Z4] with respect to Z2 reveal information about Y4 for these 2–4 compliers:

$$\frac{\partial E[Y D_4 \mid Z_2, Z_4]}{\partial Z_2} = E[Y_4 \mid \text{2–4 complier}] \, \frac{\partial E[D_4 \mid Z_2, Z_4]}{\partial Z_2}.$$

The story is more complicated for E[Y D2 | Z2, Z4], since changes in D2 with respect to Z2 are driven by both 2–0 compliers and 2–4 compliers:

$$\frac{\partial E[Y D_2 \mid Z_2, Z_4]}{\partial Z_2} = E[Y_2 \mid \text{2–0 complier}] \left( -\frac{\partial E[D_0 \mid Z_2, Z_4]}{\partial Z_2} \right) + E[Y_2 \mid \text{2–4 complier}] \left( -\frac{\partial E[D_4 \mid Z_2, Z_4]}{\partial Z_2} \right).$$

Instrumental variation in Z2 alone is therefore insufficient to identify pairwise treatment effects, since the mean Y2 potential outcomes for both complier margins are pooled together. The key to disentangling these margins lies with Z4: since changes in D2 with respect to Z4 are driven only by 2–4 compliers, we have

$$\frac{\partial E[Y D_2 \mid Z_2, Z_4]}{\partial Z_4} = E[Y_2 \mid \text{2–4 complier}] \, \frac{\partial E[D_2 \mid Z_2, Z_4]}{\partial Z_4}.$$

Furthermore, since these comparable compliers have the same mean Y2 as those induced by Z2 (Assumption A3), plugging in the identified E[Y2 | 2–4 complier] from this equation into the pooled expression above disentangles E[Y2 | 2–0 complier] from E[Y2 | 2–4 complier], since every other piece of the pooled expression is identified. This secures all of the mean potential outcomes necessary for forming the pairwise treatment effects of interest:

$$E[Y_2 \mid \text{2–0 complier at } (Z_2, Z_4)] - E[Y_0 \mid \text{2–0 complier at } (Z_2, Z_4)] = E[Y_2 - Y_0 \mid \text{2–0 complier at } (Z_2, Z_4)] = MTE_{2 \leftarrow 0}(Z_2, Z_4)$$

$$E[Y_2 \mid \text{2–4 complier at } (Z_2, Z_4)] - E[Y_4 \mid \text{2–4 complier at } (Z_2, Z_4)] = E[Y_2 - Y_4 \mid \text{2–4 complier at } (Z_2, Z_4)] = MTE_{2 \leftarrow 4}(Z_2, Z_4)$$

These marginal treatment effects (MTEs) are simply the continuous instrument analogues to discrete LATEs. After identifying these MTEs across the empirical support of the instruments, any discrete LATE of interest within the instrument support can be formed by integrating the corresponding MTE over the relevant discrete instrument shift (Heckman and Vytlacil 2005).46 Finally, the marginal net effect of 2-year access is identified as the local instrumental variables (LIV) estimand involving D2 and Z2,47

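$$MTE_2(Z_2, Z_4) = \frac{\partial E[Y \mid Z_2, Z_4] / \partial Z_2}{\partial E[D_2 \mid Z_2, Z_4] / \partial Z_2},$$

and the marginal 2–0 complier share is identified as

$$\omega(Z_2, Z_4) = \frac{-\,\partial E[D_0 \mid Z_2, Z_4] / \partial Z_2}{\partial E[D_2 \mid Z_2, Z_4] / \partial Z_2}.$$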

Bringing these identification results together, we arrive at the decomposition of interest:

$$\underbrace{MTE_2}_{\text{Net effect of 2-year entry}} = \underbrace{\omega \, MTE_{2 \leftarrow 0}}_{\text{Democratization effect}} + \underbrace{(1 - \omega) \, MTE_{2 \leftarrow 4}}_{\text{Diversion effect}}, \qquad (2)$$

where each component is separately identified at each point in the empirical support of the instruments at which the first-stage derivatives are nonzero.

46 The methods of Mogstad et al. (2017) could also be used to explore parameters of interest that require extrapolation beyond the empirical instrument support.

47 Heckman and Vytlacil (1999). Recall the discrete analogue LATE2 from Equation (1).
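The identification steps in this section can be traced through a small simulation. This is only a sketch under stated assumptions, not the paper's estimator: preferences (U2, U4) are independent standard normals, thresholds are μj(zj) = zj, the potential outcomes are made up so that the true marginal 2–0 effect is 1.0 and the true marginal 2–4 effect is −0.5 at the evaluation point (0, 0), and derivatives are approximated by central finite differences. All function and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400_000
u2, u4 = rng.normal(size=n), rng.normal(size=n)

# Made-up potential outcomes (assumption): Y2 - Y0 = 1.0 where U2 = 0,
# and Y2 - Y4 = -0.5 where U4 = U2.
y0 = 1.0 + 0.3 * (u2 + u4)
y2 = y0 + 1.0 + 0.2 * u2
y4 = y0 + 1.5 + 0.2 * u4

def means(z2, z4):
    """Sample analogues of E[W | Z2 = z2, Z4 = z4] under mu_j(z_j) = z_j."""
    d0 = (u2 < z2) & (u4 < z4)
    d4 = (u4 > z4) & (u4 - u2 > z4 - z2)
    d2 = ~(d0 | d4)
    y = np.where(d0, y0, np.where(d4, y4, y2))
    return {"D0": d0.mean(), "D2": d2.mean(), "D4": d4.mean(), "Y": y.mean(),
            "YD0": (y * d0).mean(), "YD2": (y * d2).mean(), "YD4": (y * d4).mean()}

def slopes(wrt, eps=0.05):
    """Central finite differences at (0, 0) with respect to Z2 or Z4."""
    hi = means(eps, 0.0) if wrt == "z2" else means(0.0, eps)
    lo = means(-eps, 0.0) if wrt == "z2" else means(0.0, -eps)
    return {k: (hi[k] - lo[k]) / (2 * eps) for k in hi}

dz2, dz4 = slopes("z2"), slopes("z4")

ey0_20 = dz2["YD0"] / dz2["D0"]            # E[Y0 | 2-0 complier]
ey4_24 = dz2["YD4"] / dz2["D4"]            # E[Y4 | 2-4 complier]
ey2_24 = dz4["YD2"] / dz4["D2"]            # E[Y2 | 2-4 complier], from Z4
# Plug the Z4-based mean into the pooled Z2 expression (uses Assumption A3).
ey2_20 = (dz2["YD2"] + ey2_24 * dz2["D4"]) / (-dz2["D0"])

mte_20 = ey2_20 - ey0_20
mte_24 = ey2_24 - ey4_24
mte_2 = dz2["Y"] / dz2["D2"]
omega = -dz2["D0"] / dz2["D2"]

print(mte_20, mte_24, omega)
print(mte_2, omega * mte_20 + (1 - omega) * mte_24)
```

The last line prints both sides of Equation (2); they agree up to floating-point error because the decomposition is an accounting identity once the complier means are defined this way, while the closeness of `mte_20` and `mte_24` to the simulated truth depends on the selection model generating comparable compliers.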

5 Estimation and Instrument Diagnostics

5.1 Locally Linear Specification

All of the quantities of interest in Equation (2) are composed of ratios of partial derivatives, involving the conditional expectations of {D0 , D2 , D4 , Y, Y D0 , Y D2 , Y D4 } with respect to the instruments (Z2 , Z4 ). These partial derivatives can be consistently estimated as local slopes in two-dimensional local polynomial regressions, evaluated at each point in the empirical support of (Z2 , Z4 ). In many empirical applications, the instruments may only satisfy Assumptions A1-A3 conditional on a control set X. All of the preceding identification and estimation arguments still apply after conditioning on each X = x, but the curse of dimensionality quickly sets in as X becomes high-dimensional, and X may also include continuous variables, as it does in my setting. To reduce this dimensionality problem, I estimate locally linear conditional expectation specifications around (z2 , z4 ) evaluation points in which the variables in the control set X enter additively, with all of the coefficients allowed to vary across different (z2 , z4 ) evaluation points. Formally, for a given variable W ∈ {D0 , D2 , D4 , Y, Y D0 , Y D2 , Y D4 }, the estimated coefficients at each (z2 , z4 ) evaluation point solve a kernel-weighted least squares problem: 



W βˆ0 (z2 , z4 )

    W N βˆ (z2 , z4 ) X  2  K  = argminβ0 ,β2 ,β4 ,βx   ˆW  i=1 β4 (z2 , z4 )    

Z2i − z2 Z4i − z4 , h h

!



Wi − β0 − β2 Z2i − β4 Z4i − Xi0 βx

2

,

βˆxW (z2 , z4 )

where K(·) is a two-dimensional kernel with bandwidth h. This specification occupies a halfway house between a fully nonparametric specification across all dimensions of (Z2, Z4, X) and a partially linear specification that would constrain βx to remain globally constant across all (z2, z4) evaluation points. Forming the potential outcome and treatment effect estimates then proceeds by the analogy principle, plugging in the local slope coefficients $\hat\beta_2^W(z_2, z_4)$ and $\hat\beta_4^W(z_2, z_4)$ in place of the local partial derivatives ∂E[W | Z2 = z2, Z4 = z4]/∂Z2 and ∂E[W | Z2 = z2, Z4 = z4]/∂Z4 involved in each expression, e.g.

$$\hat E[Y_0 \mid \text{2–0 complier at } (z_2, z_4)] = \frac{\hat\beta_2^{Y D_0}(z_2, z_4)}{\hat\beta_2^{D_0}(z_2, z_4)}.$$

These estimators are numerically equivalent to locally linear single-treatment 2SLS specifications using one instrument while controlling for the other instrument and X. Continuing the example above, we would arrive at the same estimate of $\hat E[Y_0 \mid \text{2–0 complier at } (z_2, z_4)]$ through a 2SLS regression of Y D0 on D0 instrumented with Z2, controlling linearly for Z4 and X, in a locally-weighted region around the evaluation point (z2, z4).
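The equivalence between the slope ratio and a locally weighted 2SLS regression can be illustrated on simulated data. The sketch below makes several assumptions that are not taken from the paper: the two-dimensional kernel is a product of univariate Epanechnikov kernels, there is a single control variable, and the data-generating process, evaluation point, and helper names (`epanechnikov_product`, `local_fit`) are all hypothetical.

```python
import numpy as np

def epanechnikov_product(a, b):
    """Product of two univariate Epanechnikov kernels (one assumed 2-D form)."""
    def k(u):
        return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)
    return k(a) * k(b)

def local_fit(w_var, z2, z4, x, point, h):
    """Kernel-weighted least squares of w_var on (1, Z2, Z4, X) around `point`."""
    weights = epanechnikov_product((z2 - point[0]) / h, (z4 - point[1]) / h)
    root = np.sqrt(weights)
    design = np.column_stack([np.ones_like(z2), z2, z4, x])
    coef, *_ = np.linalg.lstsq(design * root[:, None], w_var * root, rcond=None)
    return coef, weights  # coef = (beta_0, beta_2, beta_4, beta_x)

# Simulated data, purely illustrative.
rng = np.random.default_rng(1)
n = 20_000
z2 = rng.uniform(0, 60, n)
z4 = rng.uniform(0, 80, n)
x = rng.normal(size=n)
d0 = (rng.uniform(size=n) < 0.3 + 0.004 * z2 + 0.002 * z4 + 0.05 * x).astype(float)
y = 6000 + 500 * x + rng.normal(scale=1000, size=n)
yd0 = y * d0

point, h = (10.0, 25.0), 40.0
beta_yd0, weights = local_fit(yd0, z2, z4, x, point, h)
beta_d0, _ = local_fit(d0, z2, z4, x, point, h)
ratio = beta_yd0[1] / beta_d0[1]  # ratio of the two local slopes in Z2

# Weighted just-identified 2SLS: Y*D0 on (1, D0, Z4, X), instruments (1, Z2, Z4, X).
instruments = np.column_stack([np.ones(n), z2, z4, x])
regressors = np.column_stack([np.ones(n), d0, z4, x])
zw = instruments.T * weights
iv_coef = np.linalg.solve(zw @ regressors, zw @ yd0)

print(ratio, iv_coef[1])  # the two numbers coincide up to floating-point error
```

The agreement is algebraic rather than statistical: with one excluded instrument and one endogenous regressor, the 2SLS coefficient is the reduced-form slope divided by the first-stage slope, so it holds for any kernel weights.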

5.2 Implementation

In my setting, 2-year and 4-year college proximities may be related to the quality of a student’s neighborhood, the degree of urbanization of the neighborhood, and local labor market opportunities, all of which may directly influence enrollment choices and outcomes through confounding channels. To purge the proximity instruments of these relationships, I include in the control set X a cubic polynomial in the neighborhood quality index described in Section 3.2, indicators for the 12 NCES urbanization locale codes, and indicators for each commuting zone to soak up unobserved characteristics of each local labor market in Texas. I also include basic demographics (cohort, gender, race, and free/reduced price lunch eligibility) and a cubic in the long-run high-school-level oil and gas employment share to account for any local influences of the Texas oil and gas industry. I exclude test scores from this control set to conduct balancing tests and robustness checks below. I use a two-dimensional Epanechnikov (parabolic) kernel with a 40-mile bandwidth to weight the locally linear regressions, and I report the main results evaluated at the mean values of (Z2, Z4).48 The selection results evaluate estimates across the empirical support of Z2 in 5-mile increments, holding Z4 at its mean. Inference is conducted via block bootstrap at each evaluation point with clusters at the high school campus by cohort level.

48 Table 5 below shows that the results are not sensitive to deviations around this bandwidth value.
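A block (cluster) bootstrap of the kind described above resamples whole clusters rather than individual students. The following is a generic sketch, not the paper's code: the statistic is a simple OLS slope, the 50 clusters of 40 observations are simulated stand-ins for campus-by-cohort cells, and the function names are hypothetical.

```python
import numpy as np

def cluster_bootstrap_se(statistic, arrays, clusters, n_boot=200, seed=0):
    """Standard error from resampling whole clusters with replacement."""
    rng = np.random.default_rng(seed)
    ids, inverse = np.unique(clusters, return_inverse=True)
    rows_by_cluster = [np.flatnonzero(inverse == g) for g in range(len(ids))]
    draws = []
    for _ in range(n_boot):
        # Draw as many clusters as in the original sample, with replacement.
        sampled = rng.integers(0, len(ids), size=len(ids))
        rows = np.concatenate([rows_by_cluster[g] for g in sampled])
        draws.append(statistic(*(a[rows] for a in arrays)))
    return float(np.std(draws, ddof=1))

def ols_slope(y, z):
    """Slope from a bivariate least squares fit of y on z."""
    return np.polyfit(z, y, 1)[0]

rng = np.random.default_rng(2)
clusters = np.repeat(np.arange(50), 40)      # hypothetical cluster labels
z = rng.normal(size=clusters.size)
# Outcome with a shared cluster-level shock, so errors are correlated within cluster.
y = 2.0 * z + rng.normal(size=50)[clusters] + rng.normal(size=clusters.size)

se = cluster_bootstrap_se(ols_slope, (y, z), clusters)
print(se)
```

To mimic the setting in the text, `statistic` would be replaced by a function that recomputes the locally weighted slope ratios at a given evaluation point on each resampled dataset.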


Figure 5: Mean Earnings by Test Score Percentile

[Figure: mean quarterly earnings (vertical axis, 6,000 to 14,000) plotted against test score percentile (horizontal axis, 0 to 100).]

Notes: Quarterly earnings are measured in real 2010 U.S. dollars and averaged within person over ages 28-30. Test score percentile construction is described in Section 3.2.

5.3 Instrument Diagnostics

Table 3 presents diagnostic results on the 2-year and 4-year proximity instruments. The first column conducts a balancing test by regressing the excluded 10th grade test score measure on the instruments and controls using the main specification. The results indicate no significant relationship between distances and test scores: the coefficient of −0.56 on Z2, which is measured in 50-mile units, means that moving 50 miles further from a 2-year college is associated with an insignificant decrease of just one half of one percentile in the test score distribution. Likewise across Z4, moving 50 miles further from a 4-year college is associated with an insignificant increase of one quarter (0.273) of one percentile. Since this evidence is only meaningful if test scores are related to choices and outcomes, recall Figure 1 showing strong sorting patterns into initial college enrollment choices across the test score distribution, and see Figure 5 for the strong relationship between test scores and earnings: predicted quarterly earnings around age 30 more than double across the range of the score distribution. The next several columns of Table 3 show the first stage and reduced form estimates. Each pair of sub-columns compares the results across exclusion and inclusion of the test score measure, entering as a cubic polynomial. Given the balance result in the first column, we should expect these estimates to remain robust to test score inclusion, and this result is borne out across all of the first

Table 3: Instrument Diagnostics: Balance, First Stages, Reduced Forms, & Comparable Compliers

| | Test score percentile | Start 2-year | Start 2-year | Start 4-year | Start 4-year |
| --- | --- | --- | --- | --- | --- |
| Z2 (50 miles) | -0.560 (0.620) | -0.216 (0.014) | -0.217 (0.014) | 0.075 (0.013) | 0.078 (0.012) |
| Z4 (50 miles) | 0.273 (0.540) | 0.094 (0.010) | 0.092 (0.010) | -0.126 (0.011) | -0.128 (0.010) |
| R2 | 0.101 | 0.027 | 0.038 | 0.070 | 0.158 |
| N | 556,726 | 556,726 | 556,726 | 556,726 | 556,726 |
| Baseline controls | X | X | X | X | X |
| Test score control | | | X | | X |

| | Years of schooling | Years of schooling | Bachelor’s degree | Bachelor’s degree | Quarterly earnings | Quarterly earnings |
| --- | --- | --- | --- | --- | --- | --- |
| Z2 (50 miles) | -0.201 (0.054) | -0.183 (0.047) | -0.023 (0.010) | -0.020 (0.009) | -151 (103) | -124 (101) |
| Z4 (50 miles) | -0.261 (0.041) | -0.273 (0.035) | -0.057 (0.008) | -0.058 (0.007) | -410 (77) | -427 (72) |
| R2 | 0.093 | 0.223 | 0.075 | 0.151 | 0.105 | 0.142 |
| N | 556,726 | 556,726 | 556,726 | 556,726 | 556,726 | 556,726 |
| Baseline controls | X | X | X | X | X | X |
| Test score control | | X | | X | | X |

| | Mean test score percentile |
| --- | --- |
| 2–4 complier w.r.t. Z2 | 53.4 (2.77) |
| 2–4 complier w.r.t. Z4 | 54.1 (2.79) |

Notes: Locally weighted observations: 556,726. Test score percentiles are measured from 1 to 100, so 0.5 is half of one percentile. Standard errors in parentheses are block bootstrapped at the high school campus by cohort level. All estimates are evaluated at the mean values of the instruments. Academic outcomes are measured at age 28. Quarterly earnings are measured in real 2010 U.S. dollars and averaged within person over ages 28-30.


stages and reduced forms.49 The large gains in R-squared that result from test score inclusion provide further evidence that test scores strongly predict choices and outcomes. Table 3 also shows the statistical precision of the first stage estimates, with (unreported) partial F-statistics on the two instruments exceeding 70 for each treatment. Each first stage sign is in the intuitive direction: 2-year distance decreases 2-year entry but increases 4-year entry, while 4-year distance decreases 4-year entry but increases 2-year entry. The bottom panel of Table 3 conducts a check on the validity of the comparable compliers assumption (A3). The top row reports the estimated mean test score percentile among 2–4 compliers with respect to Z2 , while the bottom row reports the separately-identified mean test score percentile among 2–4 compliers with respect to Z4 . The two complier means are statistically indistinguishable, which lends suggestive evidence towards the comparable compliers assumption of equal mean 2-year potential outcomes between these two groups.
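As a rough arithmetic check on the balance and comparable-compliers evidence, the ratios of the Table 3 estimates to their standard errors can be computed directly. The sketch below uses only numbers reported in Table 3; treating the two complier means as independent when forming the standard error of their difference is an assumption made here for illustration, not something the paper states.

```python
from math import sqrt

# Balance test: coefficients on the instruments in the test score regression
# (estimate, bootstrap standard error), as reported in Table 3.
balance = {"Z2": (-0.560, 0.620), "Z4": (0.273, 0.540)}
t_stats = {name: est / se for name, (est, se) in balance.items()}

# Comparable-compliers check: mean test score percentile of 2-4 compliers.
mean_z2, se_z2 = 53.4, 2.77   # compliers with respect to Z2
mean_z4, se_z4 = 54.1, 2.79   # compliers with respect to Z4
# Assumption for illustration only: the two estimates are treated as independent.
z_diff = (mean_z4 - mean_z2) / sqrt(se_z2**2 + se_z4**2)

print({name: round(t, 2) for name, t in t_stats.items()})
print(round(z_diff, 2))
```

All three ratios are well inside conventional two-sided critical values, in line with the statements that the balance coefficients are insignificant and that the two complier means are statistically indistinguishable.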

6 Results

6.1 Main Results

Table 4 presents the main results. The first column shows the net effect of 2-year entry on each outcome, which pools the two different complier margins into a single weighted average. On net, 2-year entry increases completed schooling by roughly one year and bachelor’s degree attainment by 10.5 percentage points, with a positive but statistically insignificant effect on quarterly earnings around age 30.50 The next four columns of Table 4 decompose the net effects into the two opposing channels of democratization and diversion. Roughly two-thirds of compliers would not have otherwise attended college (democratization), while the other third are diverted from 4-year entry. Diversion causes lower completed schooling and substantially lower bachelor’s degree attainment, with a negative but statistically imprecise effect on earnings. These IV estimates are meaningfully smaller than the OLS estimates in Table 2, however, suggesting that unobserved differences between 2-year and

49 Table 5 below also confirms robustness of the main results to the test score control.

50 Future drafts will explore whether less flexible specifications of the control set or parametric restrictions on unobserved heterogeneity yield similar point estimates with greater statistical precision.

Table 4: Causal Effect Estimates

| | MTE2: Net effect | ω: Democratization share | MTE2←0: Democratization effect | (1 − ω): Diversion share | MTE2←4: Diversion effect |
| --- | --- | --- | --- | --- | --- |
| Years of schooling | 0.931 (0.255) | 0.654 (0.048) | 1.720 (0.201) | 0.346 (0.048) | -0.557 (0.359) |
| Bachelor’s degree | 0.105 (0.048) | '' | 0.257 (0.036) | '' | -0.182 (0.087) |
| Quarterly earnings | 701 (533) | '' | 1,337 (731) | '' | -499 (874) |

The columns follow Equation (2): MTE2 = ω · MTE2←0 + (1 − ω) · MTE2←4.

Notes: Locally weighted observations: 556,726. All estimates are evaluated at the mean values of the instruments. Standard errors in parentheses are block bootstrapped at the high school campus by cohort level. Complier shares are the same across outcomes due to common first stage equations. Academic outcomes are measured at age 28. Quarterly earnings are measured in real 2010 U.S. dollars and averaged within person over ages 28-30.

4-year entrants bias the OLS diversion estimates towards larger negative magnitudes.51 On the democratization margin, 2-year entrants who would not have otherwise attended college experience significant gains in all outcomes. They complete 1.7 more years of schooling, are 26 percentage points more likely to earn a bachelor’s degree, and earn $1,337 more per quarter around age 30 relative to never enrolling, which corresponds to a roughly 20 percent earnings premium. These positive impacts, combined with these compliers’ two-thirds share of all compliers, outweigh the negative impacts on the diversion margin to yield the positive net effects of 2-year entry from the first column of Table 4.
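The way the two margins combine into the net effect can be verified with the point estimates reported in Table 4. The short sketch below plugs the reported democratization share and the two margin-specific effects into the right-hand side of Equation (2); the dictionary layout and the function name `implied_net` are just illustrative choices, and small gaps relative to the reported net effects reflect rounding in the published figures.

```python
# Point estimates as reported in Table 4.
omega = 0.654  # democratization share
table4 = {
    "Years of schooling": {"net": 0.931, "democratization": 1.720, "diversion": -0.557},
    "Bachelor's degree": {"net": 0.105, "democratization": 0.257, "diversion": -0.182},
    "Quarterly earnings": {"net": 701.0, "democratization": 1337.0, "diversion": -499.0},
}

def implied_net(row, share):
    """Right-hand side of Equation (2): share-weighted sum of the two margin effects."""
    return share * row["democratization"] + (1 - share) * row["diversion"]

for outcome, row in table4.items():
    print(outcome, round(implied_net(row, omega), 3), row["net"])
```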

6.2 Robustness across Alternative Specifications

Table 5 conducts several robustness checks to probe the sensitivity of the results to alternative specifications. Column (1) transposes the baseline point estimates and standard errors from Table 4 for comparison. Column (2) adds the excluded test score measure to the control set. Column (3) increases the local regression bandwidth from 40 to 45 miles, while Columns (4) and (5) reduce it to 35 and 30 miles, respectively. None of these alternative specifications cause meaningful changes in the estimates.

51 Differences between the IV and OLS estimates could also be driven by treatment effect heterogeneity. Reweighting the OLS estimates to mimic the observable characteristics of compliers is one potential avenue for probing the relative influences of selection bias and effect heterogeneity.

Table 5: Causal Effect Estimates: Robustness Checks

| | (1) Baseline specification | (2) Add test score control | (3) Bandwidth 45 miles | (4) Bandwidth 35 miles | (5) Bandwidth 30 miles |
| --- | --- | --- | --- | --- | --- |
| Democratization share | .654 (.0476) | .638 | .651 | .649 | .655 |
| Years of schooling: Democratization effect | 1.72 (.201) | 1.69 | 1.72 | 1.69 | 1.67 |
| Years of schooling: Diversion effect | -.557 (.359) | -.647 | -.513 | -.617 | -.639 |
| Bachelor’s degree: Democratization effect | .257 (.036) | .256 | .253 | .252 | .252 |
| Bachelor’s degree: Diversion effect | -.182 (.0875) | -.202 | -.175 | -.192 | -.192 |
| Quarterly earnings: Democratization effect | 1,337 (731) | 1,302 | 1,146 | 1,546 | 1,665 |
| Quarterly earnings: Diversion effect | -499 (874) | -713 | -451 | -553 | -770 |
| Locally weighted observations | 556,726 | 556,726 | 563,476 | 542,279 | 526,714 |

Notes: All estimates are evaluated at the mean values of the instruments. Standard errors in parentheses are block bootstrapped at the high school campus by cohort level. Complier shares are the same across outcomes due to common first stage equations. Academic outcomes are measured at age 28. Quarterly earnings are measured in real 2010 U.S. dollars and averaged within person over ages 28-30.


6.3 Heterogeneity by Gender

Table 6 stratifies the results by gender. Men and women have nearly identical complier shares along each enrollment margin, but the similarities end there: women drive the main results with much larger effects than men for every outcome along every margin. While men experience positive gains in educational attainment along the 2–0 margin, their diversion losses are small and insignificant, and male earnings appear unaffected by 2-year entry along any margin. Women, meanwhile, experience large gains in educational attainment and significant earnings returns to 2-year entry among compliers who otherwise would not have attended college, consistent with the large OLS literature documenting a female premium in the returns to 2-year entry on the extensive margin (Belfield and Bailey 2011, 2017). Diverted women, on the other hand, experience significant losses in educational attainment and a moderately large (though imprecise) decline in earnings relative to their 4-year entry counterfactual.52 To gauge the evolution of male and female earnings effects across the early-career lifecycle, Figure 6 estimates mean quarterly earnings effects separately across the three age windows of 22-24, 25-27, and 28-30 (pooled for greater precision), then plots these estimates to yield age profiles by gender. The middle panel shows that the roughly zero 2–0 effect for men around age 30 from Table 6 is actually preceded by negative returns at earlier ages: men on the margin between 2-year entry and no college who do enroll end up taking their entire 20s to catch up to the earnings of those who do not enroll. Women already begin experiencing positive extensive margin returns in the early 20s and enjoy a steadily increasing return over at least the next decade.53 Concerning the diversion margin, the rightmost panel of Figure 6 shows that the weak earnings effects along the 2–4 margin do not have a detectable trend, offering no early-career evidence of widening gaps between marginal students who begin at 2-year versus 4-year colleges.

6.4 Impacts on Disadvantaged Students and Implications for Upward Mobility

Table 7 limits the sample to disadvantaged students, as measured by eligibility for subsidized meals in high school. Low-income students are a key constituency in policy debates over community

52 Exploration of the causal mechanisms driving these negative diversion effects, including mediation through occupational choice, remains an avenue for future work. See Monaghan and Attewell (2015) for a recent analysis using propensity score matching that suggests the importance of credit loss during 2-year to 4-year transfer.

53 Recall similar gender differences in the raw earnings profiles of Figure 3.

Table 6: Causal Effect Estimates: Women vs. Men

Women

| | MTE2: Net effect | ω: Democratization share | MTE2←0: Democratization effect | (1 − ω): Diversion share | MTE2←4: Diversion effect |
| --- | --- | --- | --- | --- | --- |
| Years of schooling | 1.05 (.265) | .650 (.048) | 2.04 (.239) | .350 (.048) | -.772 (.400) |
| Bachelor’s degree | .124 (.053) | '' | .324 (.042) | '' | -.247 (.098) |
| Quarterly earnings | 1,587 (479) | '' | 2,863 (580) | '' | -779 (955) |

Men

| | MTE2: Net effect | ω: Democratization share | MTE2←0: Democratization effect | (1 − ω): Diversion share | MTE2←4: Diversion effect |
| --- | --- | --- | --- | --- | --- |
| Years of schooling | .785 (.339) | .659 (.063) | 1.30 (.318) | .341 (.063) | -.211 (.514) |
| Bachelor’s degree | .083 (.062) | '' | .166 (.054) | '' | -.078 (.130) |
| Quarterly earnings | -216 (929) | '' | -320 (1,386) | '' | -15 (1,210) |

Notes: Locally weighted observations: 287,979 (women), 268,747 (men). All estimates are evaluated at the mean values of the instruments. Standard errors in parentheses are block bootstrapped at the high school campus by cohort level. Complier shares are the same across outcomes due to common first stage equations. Academic outcomes are measured at age 28. Quarterly earnings are measured in real 2010 U.S. dollars and averaged within person over ages 28-30.


Figure 6: Earnings Effect Profiles by Gender

[Figure: three panels (Net Effect, 2-0 Effect, 2-4 Effect), each plotting quarterly earnings effects (vertical axis, -2,000 to 3,000) against age (horizontal axis, 22 to 30), with separate series for Women and Men.]

Notes: This figure plots marginal treatment effect estimates of 2-year entry on quarterly earnings averaged within three different age windows: 22-24, 25-27, and 28-30. All estimates are evaluated at the mean values of the instruments.

colleges, since they are disproportionately likely to enroll in 2-year rather than 4-year institutions and are likely the most sensitive to policies that reduce 2-year entry costs. The second column of Table 7 offers empirical support for these intuitions: when 2-year access expands, disadvantaged students are overwhelmingly on the 2–0 democratization margin, with just 19 percent on the 2–4 diversion margin. Disadvantaged students “democratized” into higher education along the 2–0 margin experience smaller-than-average gains in educational attainment, but larger-than-average earnings returns to 2-year entry. This suggests that 2-year entry may involve other labor market benefits for disadvantaged students beyond modest increases in formal educational attainment, such as better access to employer networks, short course sequences teaching readily employable skills, and improved job matching. Taken together, these results suggest that boosting the upward earnings mobility of disadvantaged youth need not require large increases in years of formal postsecondary schooling or a narrow focus on bachelor’s degree attainment; simply attracting more disadvantaged students into 2-year colleges appears to confer meaningful earnings benefits through other channels, and identifying these specific channels remains an important avenue for future work.54

54 Since the geographic accessibility of 2-year colleges is a characteristic of a local neighborhood, these results also relate to the active literature studying the relationship between neighborhoods and the upward mobility of low-income youth, e.g. Kling et al. (2007), Chetty et al. (2016), and Chetty and Hendren (2016).

Table 7: Causal Effect Estimates: Disadvantaged Students

| | MTE2: Net effect | ω: Democratization share | MTE2←0: Democratization effect | (1 − ω): Diversion share | MTE2←4: Diversion effect |
| --- | --- | --- | --- | --- | --- |
| Years of schooling | .859 (.262) | .807 (.056) | 1.04 (.220) | .193 (.056) | .101 (.982) |
| Bachelor’s degree | .084 (.046) | '' | .100 (.038) | '' | .017 (.241) |
| Quarterly earnings | 1,373 (687) | '' | 1,730 (769) | '' | -122 (2,816) |

Notes: Locally weighted observations: 197,260. Disadvantaged is an indicator for free or reduced price lunch eligibility in 10th grade. All estimates are evaluated at the mean values of the instruments. Standard errors in parentheses are block bootstrapped at the high school campus by cohort level. Complier shares are the same across outcomes due to common first stage equations. Academic outcomes are measured at age 28. Quarterly earnings are measured in real 2010 U.S. dollars and averaged within person over ages 28-30.

6.5 Selection Patterns

Finally, stratifying the results across the range of 2-year proximity permits exploration of selection patterns and helps probe the external validity of the local IV estimates. Since the separate identification approach directly identifies complier mean potential outcomes along each treatment margin, Figure 7 first plots these mean potential outcome estimates in the top panel, stratifying across the range of 2-year distance while holding 4-year distance fixed at its mean.55 As 2-year distance increases, the mean potential earnings of marginally-induced compliers increase along all choice margins. This implies 2-year entrants are positively selected on potential outcome levels, since those marginal individuals who are induced into 2-year entry at large distances must be those with the greatest unobserved preference for 2-year entry. The ranking of mean potential outcomes across complier groups also yields an intuitive pattern: the 4-year outcome for compliers along the 2–4 diversion margin dominates all others, followed by the 2-year outcome for these 2–4 compliers, which lies almost entirely above the 2-year outcome for compliers along the 2–0 margin, all of which strongly dominate the no-college outcome for 2–0 compliers. This ranking makes sense in light of a mean test score percentile of 54 for compliers along the 2–4 margin compared to 32 for compliers

55 For the binary IV case, Huber and Mellace (2014) and Kitagawa (2015) develop specification tests from necessary restrictions on potential outcome distributions. This logic could be readily extended to the case of multiple treatments and instruments but is not pursued here.

Figure 7: Selection Patterns

[Figure: three rows of panels plotted against miles to nearest 2-year college (horizontal axis, 0 to 35). Top row: complier potential outcome levels (Y0, Y2, Y4) for the Pooled Margins, the 2-0 Margin, and the 2-4 Margin. Middle row: the marginal treatment effect for each of the same three columns. Bottom panel: the 2-0 Complier Share.]

Notes: Each estimate is evaluated at a given 2-year distance value holding 4-year distance fixed at its mean. Quarterly earnings are measured in real 2010 U.S. dollars and averaged within person over ages 28-30. 90 percent confidence intervals are estimated via block bootstrap at the high school campus by cohort level.


along the 2–0 margin. The middle panel of Figure 7 takes the differences of the potential outcomes in the top panel to form the marginal treatment effects of interest. The estimates at the mean 2-year distance of 10 miles correspond to the main results in Table 4, while stratifying across other 2-year proximity values permits exploration of selection-on-gains patterns and external validity. The 90 percent confidence intervals cannot reject that the effects along all margins are flat across the empirical support of 2-year proximity, though the first stage estimates in Table 3 imply that this 35-mile range only spans a 15 percentage point change in the propensity score of 2-year entry—far from full support. If anything, the net effect of 2-year entry on earnings increases slightly as 2-year distance decreases over this range, i.e. as 2-year access further increases. This is not driven by meaningful slopes in the 2–0 or 2–4 marginal treatment effects, but rather by an increasing share (as 2-year distance decreases) of compliers who are on the 2–0 margin, as shown in the bottom panel of Figure 7. The large confidence intervals preclude a precise conclusion, but such a pattern would imply reverse-Roy selection on net gains driven by compositional changes: as 2-year access further expands, the net returns to marginal 2-year entrants increase because a greater share of them are “democratized” into higher education rather than diverted from 4-year college entry.56
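The statement that the 35-mile range spans roughly a 15 percentage point change in the propensity to start at a 2-year college follows from the first-stage coefficient in Table 3. The sketch below reproduces that arithmetic; it assumes, for illustration only, that the coefficient estimated at the instrument means applies linearly across the whole range.

```python
# First-stage coefficient on Z2 for "Start 2-year" from Table 3 (per 50 miles).
first_stage_per_50_miles = -0.216
distance_range_miles = 35  # span of 2-year distance shown in Figure 7

# Linear extrapolation of the local slope across the full range (assumption).
propensity_change = first_stage_per_50_miles * distance_range_miles / 50
print(round(propensity_change, 3))
```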

7 Conclusion

In light of slowing college attainment rates and rising inequality across education levels in the United States, policymakers are increasingly looking to 2-year community colleges as key policy levers in extending higher education to a broader share of American youth. This paper has empirically explored the consequences of expanding access to 2-year colleges, mindful of the potential tradeoff between attracting new students and diverting those already bound for college away from 4-year enrollment. Decomposing the net impacts of 2-year college access into effects along these two distinct enrollment margins presents a methodological challenge, since standard instrumental variables methods are not generally equipped to disentangle such effects. I show how a separate identification approach, guided by the flows of different compliers to different instruments, can secure identification

56 See Aakvik et al. (2005), Walters (2017), Kline and Walters (2016), and Cornelissen et al. (2017) for evidence of reverse-Roy selection patterns into educational programs.

of causal effects along these distinct complier margins. I apply the method using linked administrative data spanning the state of Texas, leveraging instrumental variation in 2-year and 4-year college proximities net of controls for neighborhood quality, neighborhood urbanization, and fixed effects for each local labor market in Texas. I verify that this residual proximity variation is balanced across excluded test scores that strongly predict enrollment choices and outcomes, and I show that the assumption of comparable 2-year and 4-year proximity compliers has empirical support through equal mean test scores across these two complier groups. The empirical results suggest that expanding access to 2-year colleges does boost the aggregate educational attainment and earnings of new 2-year entrants, but decomposing these net effects reveals substantial heterogeneity along several dimensions: students diverted from 4-year entry face lower outcomes, those who would not have otherwise attended college experience large gains, women experience larger effects along both margins compared to men, and disadvantaged students reap large earnings returns to 2-year entry with little offsetting diversion. Taken together, these results suggest that broad expansions of 2-year college access have different implications for the upward mobility of different types of students, leaving open the potential for more targeted policies to achieve greater net impacts with fewer unintended consequences.


Appendix

Figure A.1: Out-of-State Enrollment and Missing Earnings Are Concentrated among Top Scorers

[Figure: two panels plotted against test score percentile (horizontal axis, 0 to 100). Top panel, "Out-of-state college enrollment" (NSC cohorts): share attending college out-of-state (vertical axis, 0 to .2). Bottom panel, "Missing earnings" (main analysis cohorts): share with missing earnings (vertical axis, .2 to .35).]

Notes: The top panel of this figure plots the share of students within each test score percentile (defined in Section 3.2) who enroll in college outside of Texas using the 2008-2009 cohorts with National Student Clearinghouse data coverage. The bottom panel plots the share of students within each test score percentile who have no Texas quarterly earnings records over ages 28-30 using the 2000-2004 main analysis cohorts.


Figure A.2: Predicted Earnings Are Similar for Students with and without Observed Earnings

[Figure omitted. Density plot; vertical axis: density (0 to .0002); horizontal axis: predicted quarterly earnings (0 to 15,000); series: has earnings, missing earnings. Mean difference: $58 (SE 10).]

Notes: This figure plots the distributions of predicted mean quarterly earnings over ages 28-30 for students with and without observed earnings. Earnings are first projected on all covariates and instruments in Table 1 in the sample with valid earnings, then predicted in the full sample and plotted by earnings status.
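The projection-then-prediction step described in these notes can be sketched as follows. This is a minimal illustration on synthetic data, assuming a plain OLS projection; the function name, the three stand-in covariates, and all numeric values are hypothetical and are not taken from the paper or its data.

```python
import numpy as np

def predict_by_earnings_status(X, earnings, has_earnings):
    """Fit OLS on students with observed earnings, then predict for everyone."""
    X1 = np.column_stack([np.ones(len(X)), X])            # add an intercept
    coef, *_ = np.linalg.lstsq(X1[has_earnings], earnings[has_earnings], rcond=None)
    predicted = X1 @ coef                                  # full-sample predictions
    return predicted[has_earnings], predicted[~has_earnings]

# Synthetic stand-in data (hypothetical; not the Texas administrative records).
rng = np.random.default_rng(0)
n = 5_000
X = rng.normal(size=(n, 3))
has_earnings = rng.random(n) > 0.25
earnings = 8_000 + X @ np.array([900.0, 400.0, -200.0]) + rng.normal(0, 2_000, n)
earnings[~has_earnings] = np.nan                           # unobserved for the missing group

pred_obs, pred_miss = predict_by_earnings_status(X, earnings, has_earnings)
mean_gap = pred_obs.mean() - pred_miss.mean()              # analogue of the reported mean difference
```

Because missingness is random in this synthetic example, the gap in mean predicted earnings should be small; in real data the gap summarizes how different the two groups look on observables.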

Table A.1: High School Graduation and Out-of-State College Enrollment

                      Graduate from            Enroll in college
                      high school              out-of-state
Z2 (50 miles)         0.0065                   -0.000006
                      (0.0060)                 (0.002925)
Z4 (50 miles)         0.00047                  -0.0057
                      (0.00420)                (0.0034)
R²                    0.019                    0.017
N                     590,397                  362,013
Sample                Main analysis cohorts    NSC cohorts
Baseline controls     X                        X

Notes: NSC cohorts are those with National Student Clearinghouse college enrollment coverage. Standard errors in parentheses are clustered at the high school campus by cohort level. High school graduation is measured cumulatively through eight years after 10th grade. Out-of-state college enrollment is measured within two years of projected high school graduation due to NSC data availability.


Figure A.3: Sorting into College Enrollments by Observables, 2000-2004 Analysis Cohorts

[Figure omitted. Three panels plot enrollment shares (0 to 1) across no college, start 2-year, and start 4-year: by demographic group (women, men; not disadvantaged, disadvantaged; Asian, White, Black, Hispanic), by neighborhood quality percentile (0 to 100), and by test score percentile (0 to 100).]

Notes: Disadvantaged is an indicator for free or reduced price lunch eligibility in 10th grade. Neighborhood quality and test score percentiles, defined in Section 3.2, are grouped into 5-unit bins.


A Proof of Binary 2SLS Decomposition

This appendix section proves the binary two-stage least squares (2SLS) decomposition in Equation (1) from Section 4.1, showing that 2SLS estimates a weighted average of local average treatment effects along the 2–0 (2-year entry vs. no college) and 2–4 (2-year entry vs. 4-year entry) complier margins.[57] Recall the 2SLS specification:

\begin{align*}
Y &= \beta_0 + \beta_2 D_2 + \epsilon \\
E[D_2 | Z_2] &= \alpha_0 + \alpha_2 Z_2,
\end{align*}

where $Y$ is a student outcome, $D_2$ is an indicator for 2-year college entry, and $Z_2$ is an exogenous and excludable binary instrument that induces students into 2-year entry from the alternative treatments of no college ($D_0$) and 4-year entry ($D_4$). In this system, $\beta_2$ is the familiar Wald (1940) estimand:

\[
\beta_2 = \frac{E[Y | Z_2 = 1] - E[Y | Z_2 = 0]}{E[D_2 | Z_2 = 1] - E[D_2 | Z_2 = 0]}.
\]

Decompose $E[Y | Z_2 = 1]$ in the numerator using the fact that $Y = Y_0 D_0 + Y_2 D_2 + Y_4 D_4$, where $Y_j$ is the potential outcome associated with treatment $j \in \{0, 2, 4\}$:

\begin{align*}
E[Y | Z_2 = 1] &= E[Y_0 D_0 + Y_2 D_2 + Y_4 D_4 | Z_2 = 1] \\
&= E[Y_0 D_0 | Z_2 = 1] + E[Y_2 D_2 | Z_2 = 1] + E[Y_4 D_4 | Z_2 = 1] \\
&= E[Y_0 | D_0 = 1, Z_2 = 1] \Pr(D_0 = 1 | Z_2 = 1) \\
&\quad + E[Y_2 | D_2 = 1, Z_2 = 1] \Pr(D_2 = 1 | Z_2 = 1) \\
&\quad + E[Y_4 | D_4 = 1, Z_2 = 1] \Pr(D_4 = 1 | Z_2 = 1).
\end{align*}

[57] Heckman and Urzua (2010) and Kline and Walters (2016) provide similar derivations, as do Angrist and Imbens (1995) for the case of ordered multivalued treatments.


Letting $D(z_2) \in \{0, 2, 4\}$ denote the potential choice an individual would make if exogenously assigned to $Z_2 = z_2 \in \{0, 1\}$, by instrument exclusion this becomes

\begin{align*}
E[Y | Z_2 = 1] &= E[Y_0 | D(1) = 0] \Pr(D(1) = 0) \\
&\quad + E[Y_2 | D(1) = 2] \Pr(D(1) = 2) \\
&\quad + E[Y_4 | D(1) = 4] \Pr(D(1) = 4).
\end{align*}

The monotonicity assumption that $Z_2$ induces students into $D_2$ from $D_0$ and $D_4$ permits the following five complier groups: $\{D(0) = 0, D(1) = 0\}$, $\{D(0) = 0, D(1) = 2\}$, $\{D(0) = 2, D(1) = 2\}$, $\{D(0) = 4, D(1) = 4\}$, and $\{D(0) = 4, D(1) = 2\}$. Hence we can further decompose:

\begin{align*}
E[Y | Z_2 = 1] &= E[Y_0 | D(0) = 0, D(1) = 0] \Pr(D(0) = 0, D(1) = 0) \\
&\quad + E[Y_2 | D(0) = 0, D(1) = 2] \Pr(D(0) = 0, D(1) = 2) \\
&\quad + E[Y_2 | D(0) = 2, D(1) = 2] \Pr(D(0) = 2, D(1) = 2) \\
&\quad + E[Y_2 | D(0) = 4, D(1) = 2] \Pr(D(0) = 4, D(1) = 2) \\
&\quad + E[Y_4 | D(0) = 4, D(1) = 4] \Pr(D(0) = 4, D(1) = 4).
\end{align*}

By analogous arguments, decompose $E[Y | Z_2 = 0]$ into

\begin{align*}
E[Y | Z_2 = 0] &= E[Y_0 D_0 | Z_2 = 0] + E[Y_2 D_2 | Z_2 = 0] + E[Y_4 D_4 | Z_2 = 0] \\
&= E[Y_0 | D_0 = 1, Z_2 = 0] \Pr(D_0 = 1 | Z_2 = 0) \\
&\quad + E[Y_2 | D_2 = 1, Z_2 = 0] \Pr(D_2 = 1 | Z_2 = 0) \\
&\quad + E[Y_4 | D_4 = 1, Z_2 = 0] \Pr(D_4 = 1 | Z_2 = 0) \\
&= E[Y_0 | D(0) = 0, D(1) = 0] \Pr(D(0) = 0, D(1) = 0) \\
&\quad + E[Y_0 | D(0) = 0, D(1) = 2] \Pr(D(0) = 0, D(1) = 2) \\
&\quad + E[Y_2 | D(0) = 2, D(1) = 2] \Pr(D(0) = 2, D(1) = 2) \\
&\quad + E[Y_4 | D(0) = 4, D(1) = 4] \Pr(D(0) = 4, D(1) = 4) \\
&\quad + E[Y_4 | D(0) = 4, D(1) = 2] \Pr(D(0) = 4, D(1) = 2).
\end{align*}


Subtracting $E[Y | Z_2 = 1] - E[Y | Z_2 = 0]$ eliminates the always-taker and never-taker groups, leaving only the instrument compliers:

\begin{align*}
E[Y | Z_2 = 1] - E[Y | Z_2 = 0] &= E[Y_2 | D(0) = 0, D(1) = 2] \Pr(D(0) = 0, D(1) = 2) \\
&\quad - E[Y_0 | D(0) = 0, D(1) = 2] \Pr(D(0) = 0, D(1) = 2) \\
&\quad + E[Y_2 | D(0) = 4, D(1) = 2] \Pr(D(0) = 4, D(1) = 2) \\
&\quad - E[Y_4 | D(0) = 4, D(1) = 2] \Pr(D(0) = 4, D(1) = 2) \\
&= E[Y_2 - Y_0 | D(0) = 0, D(1) = 2] \Pr(D(0) = 0, D(1) = 2) \\
&\quad + E[Y_2 - Y_4 | D(0) = 4, D(1) = 2] \Pr(D(0) = 4, D(1) = 2).
\end{align*}

Since changes in $D_0$ with respect to $Z_2$ are driven by $\{D(0) = 0, D(1) = 2\}$ compliers, and changes in $D_4$ are driven by $\{D(0) = 4, D(1) = 2\}$ compliers, this becomes

\begin{align*}
E[Y | Z_2 = 1] - E[Y | Z_2 = 0] &= E[Y_2 - Y_0 | D(0) = 0, D(1) = 2] \left( E[D_0 | Z_2 = 0] - E[D_0 | Z_2 = 1] \right) \\
&\quad + E[Y_2 - Y_4 | D(0) = 4, D(1) = 2] \left( E[D_4 | Z_2 = 0] - E[D_4 | Z_2 = 1] \right).
\end{align*}

Plugging this back into the Wald expression yields the result:

\begin{align*}
\beta_2 &= \frac{E[Y | Z_2 = 1] - E[Y | Z_2 = 0]}{E[D_2 | Z_2 = 1] - E[D_2 | Z_2 = 0]} \\
&= \frac{E[Y_2 - Y_0 | D(0) = 0, D(1) = 2] \left( E[D_0 | Z_2 = 0] - E[D_0 | Z_2 = 1] \right)}{E[D_2 | Z_2 = 1] - E[D_2 | Z_2 = 0]} \\
&\quad + \frac{E[Y_2 - Y_4 | D(0) = 4, D(1) = 2] \left( E[D_4 | Z_2 = 0] - E[D_4 | Z_2 = 1] \right)}{E[D_2 | Z_2 = 1] - E[D_2 | Z_2 = 0]} \\
&= \omega E[Y_2 - Y_0 | D(0) = 0, D(1) = 2] + (1 - \omega) E[Y_2 - Y_4 | D(0) = 4, D(1) = 2] \\
&= \omega LATE_{2 \leftarrow 0} + (1 - \omega) LATE_{2 \leftarrow 4},
\end{align*}

where the weights

\[
\omega \equiv -\frac{E[D_0 | Z_2 = 1] - E[D_0 | Z_2 = 0]}{E[D_2 | Z_2 = 1] - E[D_2 | Z_2 = 0]}, \qquad (1 - \omega) = -\frac{E[D_4 | Z_2 = 1] - E[D_4 | Z_2 = 0]}{E[D_2 | Z_2 = 1] - E[D_2 | Z_2 = 0]}
\]

result from the fact that $D_0 + D_2 + D_4 = 1$.
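The decomposition can be checked numerically. The sketch below is an illustration only: the five response-type shares and the potential-outcome distributions are hypothetical values chosen for the example, and both instrument assignments are evaluated for every simulated student so that the identity holds without sampling noise.

```python
import numpy as np

def wald_decomposition(n=100_000, seed=0):
    """Compare the Wald ratio with omega*LATE(2<-0) + (1-omega)*LATE(2<-4)."""
    rng = np.random.default_rng(seed)
    # Response types (D(0), D(1)) permitted by monotonicity; shares are hypothetical.
    types = np.array([[0, 0], [0, 2], [2, 2], [4, 4], [4, 2]])
    shares = [0.30, 0.20, 0.15, 0.25, 0.10]
    t = rng.choice(len(types), size=n, p=shares)
    d_z0, d_z1 = types[t, 0], types[t, 1]

    # Heterogeneous potential outcomes (hypothetical distributions).
    y0 = rng.normal(10.0, 1.0, n)
    y2 = y0 + rng.normal(2.0, 1.0, n)
    y4 = y0 + rng.normal(3.0, 1.0, n)

    def realized(d):
        return np.where(d == 0, y0, np.where(d == 2, y2, y4))

    first_stage = (d_z1 == 2).mean() - (d_z0 == 2).mean()
    wald = (realized(d_z1).mean() - realized(d_z0).mean()) / first_stage
    omega = -((d_z1 == 0).mean() - (d_z0 == 0).mean()) / first_stage

    c20 = (d_z0 == 0) & (d_z1 == 2)          # 2<-0 compliers
    c24 = (d_z0 == 4) & (d_z1 == 2)          # 2<-4 compliers
    late20 = (y2 - y0)[c20].mean()
    late24 = (y2 - y4)[c24].mean()
    return wald, omega, late20, late24

wald, omega, late20, late24 = wald_decomposition()
```

With these inputs, `wald` and `omega * late20 + (1 - omega) * late24` should agree up to floating-point error, and `omega` should be close to the 2←0 share of all compliers.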

B What Does Multivariate 2SLS Estimate?

This appendix section derives and decomposes the multivariate two-stage least squares (2SLS) estimands corresponding to the econometric setting in Section 4.2. To that end, assume A1-A3 (independence & exclusion, partial unordered monotonicity, and comparable compliers), implicitly condition on $X$, and consider a local region around an evaluation point $(Z_2, Z_4)$.[58] For maximum comparability to the parameters of interest in Section 4.2, consider the local multivariate 2SLS specification

\begin{align*}
Y &= \gamma + \beta_0 D_0 + \beta_4 D_4 + \epsilon \\
E[D_0 | Z] &= \alpha_0^0 + \alpha_0^2 Z_2 + \alpha_0^4 Z_4 \\
E[D_4 | Z] &= \alpha_4^0 + \alpha_4^2 Z_2 + \alpha_4^4 Z_4,
\end{align*}

where $Z \equiv (Z_2, Z_4)$. I use $D_0$ as the first endogenous treatment and exclude $D_2 = 1$ as the reference case to correspond to a 2←0 comparison in $-\beta_0$ and a 2←4 comparison in $-\beta_4$. These are the relevant estimands to compare to my $MTE_{2 \leftarrow 0}$ and $MTE_{2 \leftarrow 4}$, respectively. To begin the derivation, consider the reduced form:

\begin{align*}
E[Y | Z] &= \gamma + \beta_0 (\alpha_0^0 + \alpha_0^2 Z_2 + \alpha_0^4 Z_4) + \beta_4 (\alpha_4^0 + \alpha_4^2 Z_2 + \alpha_4^4 Z_4) + E[\epsilon | Z] \\
&= \underbrace{\gamma + \beta_0 \alpha_0^0 + \beta_4 \alpha_4^0}_{\equiv \alpha_y^0} + \underbrace{(\beta_0 \alpha_0^2 + \beta_4 \alpha_4^2)}_{\equiv \alpha_y^2} Z_2 + \underbrace{(\beta_0 \alpha_0^4 + \beta_4 \alpha_4^4)}_{\equiv \alpha_y^4} Z_4 \\
&= \alpha_y^0 + \alpha_y^2 Z_2 + \alpha_y^4 Z_4,
\end{align*}

where $E[\epsilon | Z] = 0$ by A1. Then

\[
\begin{pmatrix} \alpha_y^2 \\ \alpha_y^4 \end{pmatrix} = \begin{pmatrix} \alpha_0^2 & \alpha_4^2 \\ \alpha_0^4 & \alpha_4^4 \end{pmatrix} \times \begin{pmatrix} \beta_0 \\ \beta_4 \end{pmatrix},
\]

[58] See Kirkeboen et al. (2016) for a related derivation involving discrete instruments, a less restrictive monotonicity condition, and no comparable compliers assumption, which yields more complicated estimands due to additional margins of instrument compliance. See also Kline and Walters (2016) for a related derivation involving one binary instrument interacted with a stratifying covariate.


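and we can solve for $\beta_0$ and $\beta_4$ as

\[
\begin{pmatrix} \beta_0 \\ \beta_4 \end{pmatrix} = \begin{pmatrix} \alpha_0^2 & \alpha_4^2 \\ \alpha_0^4 & \alpha_4^4 \end{pmatrix}^{-1} \times \begin{pmatrix} \alpha_y^2 \\ \alpha_y^4 \end{pmatrix} = \frac{1}{\alpha_0^2 \alpha_4^4 - \alpha_4^2 \alpha_0^4} \begin{pmatrix} \alpha_4^4 & -\alpha_4^2 \\ -\alpha_0^4 & \alpha_0^2 \end{pmatrix} \times \begin{pmatrix} \alpha_y^2 \\ \alpha_y^4 \end{pmatrix},
\]

so that

\[
\beta_0 = \frac{\alpha_4^4 \alpha_y^2 - \alpha_4^2 \alpha_y^4}{\alpha_0^2 \alpha_4^4 - \alpha_4^2 \alpha_0^4}, \qquad \beta_4 = \frac{\alpha_0^2 \alpha_y^4 - \alpha_0^4 \alpha_y^2}{\alpha_0^2 \alpha_4^4 - \alpha_4^2 \alpha_0^4}.
\]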

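Using the complier mean potential outcome identification results of Section 4.3 implies

\begin{align*}
\alpha_y^2 = \frac{\partial E[Y | Z]}{\partial Z_2} &= \frac{\partial E[Y D_0 | Z]}{\partial Z_2} + \frac{\partial E[Y D_2 | Z]}{\partial Z_2} + \frac{\partial E[Y D_4 | Z]}{\partial Z_2} \\
&= E[Y_0 | 2\text{--}0] \frac{\partial E[D_0 | Z]}{\partial Z_2} + E[Y_2 | 2\text{--}0] \left( -\frac{\partial E[D_0 | Z]}{\partial Z_2} \right) \\
&\quad + E[Y_2 | 2\text{--}4] \left( -\frac{\partial E[D_4 | Z]}{\partial Z_2} \right) + E[Y_4 | 2\text{--}4] \frac{\partial E[D_4 | Z]}{\partial Z_2} \\
&= -\frac{\partial E[D_0 | Z]}{\partial Z_2} \left( E[Y_2 | 2\text{--}0] - E[Y_0 | 2\text{--}0] \right) - \frac{\partial E[D_4 | Z]}{\partial Z_2} \left( E[Y_2 | 2\text{--}4] - E[Y_4 | 2\text{--}4] \right) \\
&= -\alpha_0^2 MTE_{2 \leftarrow 0} - \alpha_4^2 MTE_{2 \leftarrow 4},
\end{align*}

where the local evaluation point $(Z_2, Z_4)$ is suppressed and $E[Y_0 | 2\text{--}0]$, for example, is shorthand for

\[
E[Y_0 | 2\text{--}0 \text{ complier w.r.t. } Z_2 \text{ at } (Z_2, Z_4)] = \lim_{Z_2' \to Z_2} E[Y_0 | D(Z_2, Z_4) = 0, D(Z_2', Z_4) = 2].
\]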

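Likewise with respect to $Z_4$,

\begin{align*}
\alpha_y^4 = \frac{\partial E[Y | Z]}{\partial Z_4} &= \frac{\partial E[Y D_0 | Z]}{\partial Z_4} + \frac{\partial E[Y D_2 | Z]}{\partial Z_4} + \frac{\partial E[Y D_4 | Z]}{\partial Z_4} \\
&= E[Y_0 | 4\text{--}0] \frac{\partial E[D_0 | Z]}{\partial Z_4} + E[Y_2 | 2\text{--}4 \text{ w.r.t. } Z_4] \frac{\partial E[D_2 | Z]}{\partial Z_4} \\
&\quad + E[Y_4 | 4\text{--}0] \left( -\frac{\partial E[D_0 | Z]}{\partial Z_4} \right) + E[Y_4 | 2\text{--}4 \text{ w.r.t. } Z_4] \left( -\frac{\partial E[D_2 | Z]}{\partial Z_4} \right),
\end{align*}

where $E[Y_0 | 4\text{--}0]$, for example, is shorthand for

\[
E[Y_0 | 4\text{--}0 \text{ complier w.r.t. } Z_4 \text{ at } (Z_2, Z_4)] = \lim_{Z_4' \to Z_4} E[Y_0 | D(Z_2, Z_4) = 0, D(Z_2, Z_4') = 4].
\]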

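By comparable compliers (A3), we can equate $E[Y_2 | 2\text{--}4 \text{ w.r.t. } Z_4] = E[Y_2 | 2\text{--}4 \text{ w.r.t. } Z_2] \equiv E[Y_2 | 2\text{--}4]$. The assumption (A3) in the text is silent about the relationship between $E[Y_4 | 2\text{--}4 \text{ w.r.t. } Z_4]$ and $E[Y_4 | 2\text{--}4 \text{ w.r.t. } Z_2]$, however, since no restrictions are needed on these $Y_4$ potential outcomes to secure identification of the desired treatment effects in the separate identification approach of this paper. To simplify the 2SLS decomposition, however, let us make a slightly stronger comparable compliers assumption and equate these mean $Y_4$ potential outcomes across the $Z_2$ and $Z_4$ complier groups, in addition to $Y_2$. Hence we equate $E[Y_4 | 2\text{--}4 \text{ w.r.t. } Z_4] = E[Y_4 | 2\text{--}4 \text{ w.r.t. } Z_2] \equiv E[Y_4 | 2\text{--}4]$, which simplifies the expression for $\alpha_y^4$ to

\begin{align*}
\alpha_y^4 &= E[Y_0 | 4\text{--}0] \frac{\partial E[D_0 | Z]}{\partial Z_4} + E[Y_2 | 2\text{--}4] \frac{\partial E[D_2 | Z]}{\partial Z_4} + E[Y_4 | 4\text{--}0] \left( -\frac{\partial E[D_0 | Z]}{\partial Z_4} \right) + E[Y_4 | 2\text{--}4] \left( -\frac{\partial E[D_2 | Z]}{\partial Z_4} \right) \\
&= -\frac{\partial E[D_0 | Z]}{\partial Z_4} \left( E[Y_4 | 4\text{--}0] - E[Y_0 | 4\text{--}0] \right) + \frac{\partial E[D_2 | Z]}{\partial Z_4} \left( E[Y_2 | 2\text{--}4] - E[Y_4 | 2\text{--}4] \right) \\
&= -\alpha_0^4 MTE_{4 \leftarrow 0} - (\alpha_0^4 + \alpha_4^4) MTE_{2 \leftarrow 4},
\end{align*}

again suppressing the local evaluation point and using the fact that

\[
\frac{\partial E[D_2 | Z]}{\partial Z_4} = \frac{\partial E[1 - D_0 - D_4 | Z]}{\partial Z_4} = -\frac{\partial E[D_0 | Z]}{\partial Z_4} - \frac{\partial E[D_4 | Z]}{\partial Z_4} = -\alpha_0^4 - \alpha_4^4.
\]

Plugging these results into the expressions for $\beta_0$ and $\beta_4$ yields:

\begin{align*}
\beta_0 &= \frac{\alpha_4^4 \left( -\alpha_0^2 MTE_{2 \leftarrow 0} - \alpha_4^2 MTE_{2 \leftarrow 4} \right) - \alpha_4^2 \left( -\alpha_0^4 MTE_{4 \leftarrow 0} - (\alpha_0^4 + \alpha_4^4) MTE_{2 \leftarrow 4} \right)}{\alpha_0^2 \alpha_4^4 - \alpha_4^2 \alpha_0^4} \\
&= -\frac{\alpha_0^2 \alpha_4^4 MTE_{2 \leftarrow 0} - \alpha_4^2 \alpha_0^4 \left( MTE_{4 \leftarrow 0} + MTE_{2 \leftarrow 4} \right)}{\alpha_0^2 \alpha_4^4 - \alpha_4^2 \alpha_0^4}
\end{align*}

\begin{align*}
\beta_4 &= \frac{\alpha_0^2 \left( -\alpha_0^4 MTE_{4 \leftarrow 0} - (\alpha_0^4 + \alpha_4^4) MTE_{2 \leftarrow 4} \right) - \alpha_0^4 \left( -\alpha_0^2 MTE_{2 \leftarrow 0} - \alpha_4^2 MTE_{2 \leftarrow 4} \right)}{\alpha_0^2 \alpha_4^4 - \alpha_4^2 \alpha_0^4} \\
&= -\frac{\left( \alpha_0^2 \alpha_4^4 - \alpha_4^2 \alpha_0^4 + \alpha_0^2 \alpha_0^4 \right) MTE_{2 \leftarrow 4} + \left( -\alpha_0^2 \alpha_0^4 \right) \left( MTE_{2 \leftarrow 0} - MTE_{4 \leftarrow 0} \right)}{\alpha_0^2 \alpha_4^4 - \alpha_4^2 \alpha_0^4}
\end{align*}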

Finally, defining the weights

\[
\theta_0 \equiv \frac{\alpha_0^2 \alpha_4^4}{\alpha_0^2 \alpha_4^4 - \alpha_4^2 \alpha_0^4}, \qquad \theta_4 \equiv \frac{\alpha_0^2 \alpha_4^4 - \alpha_4^2 \alpha_0^4 + \alpha_0^2 \alpha_0^4}{\alpha_0^2 \alpha_4^4 - \alpha_4^2 \alpha_0^4}
\]

yields the main results:

\begin{align*}
-\beta_0 &= \theta_0 MTE_{2 \leftarrow 0} + (1 - \theta_0) \left( MTE_{4 \leftarrow 0} + MTE_{2 \leftarrow 4} \right) \\
-\beta_4 &= \theta_4 MTE_{2 \leftarrow 4} + (1 - \theta_4) \left( MTE_{2 \leftarrow 0} - MTE_{4 \leftarrow 0} \right).
\end{align*}

Each 2SLS estimand in this setting is thus a weighted average of the effect for compliers along the treatment margin of interest and a biasing term involving effects for compliers along the other two treatment margins. In the special case of constant treatment effects across individuals, note that $MTE_{4 \leftarrow 0} + MTE_{2 \leftarrow 4} = (Y_4 - Y_0) + (Y_2 - Y_4) = Y_2 - Y_0$ and $MTE_{2 \leftarrow 0} - MTE_{4 \leftarrow 0} = (Y_2 - Y_0) - (Y_4 - Y_0) = Y_2 - Y_4$, which confirms that 2SLS identifies the effects of interest in the absence of effect heterogeneity. With heterogeneous effects, however, $MTE_{4 \leftarrow 0} + MTE_{2 \leftarrow 4} \neq MTE_{2 \leftarrow 0}$ and $MTE_{2 \leftarrow 0} - MTE_{4 \leftarrow 0} \neq MTE_{2 \leftarrow 4}$ in general, leading to multivariate 2SLS estimands that generally do not recover a well-defined treatment effect for a relevant complier population.
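A short numeric check of this appendix's algebra may help. The sketch below builds the reduced-form slopes from assumed first-stage coefficients and marginal treatment effects, solves the two-equation system for the 2SLS coefficients, and returns the weights; every numeric input and the function name are hypothetical choices for illustration, not estimates from the paper.

```python
import numpy as np

def multivariate_2sls_estimands(a02, a04, a42, a44, mte20, mte40, mte24):
    """a_jk is the first-stage slope of D_j on Z_k (alpha_j^k in the text)."""
    # Reduced-form slopes implied by the derivation above.
    ay2 = -a02 * mte20 - a42 * mte24
    ay4 = -a04 * mte40 - (a04 + a44) * mte24
    # Solve [ay2, ay4]' = A [b0, b4]' for the 2SLS coefficients.
    A = np.array([[a02, a42], [a04, a44]])
    b0, b4 = np.linalg.solve(A, np.array([ay2, ay4]))
    det = a02 * a44 - a42 * a04
    theta0 = a02 * a44 / det
    theta4 = (det + a02 * a04) / det
    return b0, b4, theta0, theta4

# Hypothetical first-stage slopes.
slopes = dict(a02=0.10, a04=0.04, a42=0.05, a44=-0.12)

# Heterogeneous effects: MTE(4<-0) + MTE(2<-4) differs from MTE(2<-0).
b0, b4, th0, th4 = multivariate_2sls_estimands(**slopes, mte20=1.0, mte40=3.0, mte24=-0.5)

# Constant effects: MTE(2<-0) = MTE(4<-0) + MTE(2<-4).
c0, c4, _, _ = multivariate_2sls_estimands(**slopes, mte20=2.5, mte40=3.0, mte24=-0.5)
```

In the heterogeneous case, `-b0` should equal `th0 * 1.0 + (1 - th0) * (3.0 - 0.5)` rather than the 2←0 effect itself; in the constant-effects case, `-c0` and `-c4` should return 2.5 and -0.5.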


C Selection Model Proofs

This appendix section shows how the selection model of Section 4.2 generates A2 (partial unordered monotonicity) and A3 (comparable compliers) as necessary implications. Recall the choice equations

\begin{align*}
D_0(z_2, z_4) &= 1[U_2 < \mu_2(z_2),\ U_4 < \mu_4(z_4)] \\
D_2(z_2, z_4) &= 1[U_2 > \mu_2(z_2),\ U_4 - U_2 < \mu_4(z_4) - \mu_2(z_2)] \\
D_4(z_2, z_4) &= 1[U_4 > \mu_4(z_4),\ U_4 - U_2 > \mu_4(z_4) - \mu_2(z_2)].
\end{align*}

Letting $f_{X,Y}(u, v)$ denote the joint density of random variables $(X, Y)$, the three choice probabilities are thus

\begin{align*}
\Pr(D_0(z_2, z_4) = 1) &= \int_{-\infty}^{\mu_2(z_2)} \int_{-\infty}^{\mu_4(z_4)} f_{U_2, U_4}(u, v) \, dv \, du \\
\Pr(D_2(z_2, z_4) = 1) &= \int_{-\infty}^{\mu_4(z_4) - \mu_2(z_2)} \int_{\mu_2(z_2)}^{\infty} f_{U_4 - U_2, U_2}(u, v) \, dv \, du \\
\Pr(D_4(z_2, z_4) = 1) &= \int_{\mu_4(z_4) - \mu_2(z_2)}^{\infty} \int_{\mu_4(z_4)}^{\infty} f_{U_4 - U_2, U_4}(u, v) \, dv \, du.
\end{align*}

Consider a marginal change in $Z_2$, corresponding to the top right panel of Figure 4. We have

\begin{align*}
\frac{\partial \Pr(D_2(z_2, z_4) = 1)}{\partial z_2} &= \frac{\partial}{\partial z_2} \int_{-\infty}^{\mu_4(z_4) - \mu_2(z_2)} \int_{\mu_2(z_2)}^{\infty} f_{U_4 - U_2, U_2}(u, v) \, dv \, du \\
&= -\mu_2'(z_2) \int_{-\infty}^{\mu_4(z_4) - \mu_2(z_2)} f_{U_4 - U_2, U_2}(u, \mu_2(z_2)) \, du - \mu_2'(z_2) \int_{\mu_2(z_2)}^{\infty} f_{U_4 - U_2, U_2}(\mu_4(z_4) - \mu_2(z_2), v) \, dv \\
&= -\mu_2'(z_2) \int_{-\infty}^{\mu_4(z_4)} f_{U_4, U_2}(u, \mu_2(z_2)) \, du - \mu_2'(z_2) \int_{\mu_2(z_2)}^{\infty} f_{U_4 - U_2, U_2}(\mu_4(z_4) - \mu_2(z_2), v) \, dv. \tag{3}
\end{align*}

Referring to the partition in the top left panel of Figure 4, the first term of Equation (3) is a line integral tracing out the set of indifferent individuals along the border between D(z2 , z4 ) = 0 and D(z2 , z4 ) = 2, i.e. marginal 2–0 compliers. The second term traces out the set of indifferent individuals along the border between D(z2 , z4 ) = 4 and D(z2 , z4 ) = 2, i.e. marginal 2–4 compliers. Hence, as visualized in the top right panel of Figure 4, a marginal shift in Z2 induces 2–0 and 2–4 compliers and thus generates partial unordered monotonicity (A2) in Z2 .


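Considering a marginal change in $Z_4$, corresponding to the bottom left panel of Figure 4, we have

\begin{align*}
\frac{\partial \Pr(D_4(z_2, z_4) = 1)}{\partial z_4} &= \frac{\partial}{\partial z_4} \int_{\mu_4(z_4) - \mu_2(z_2)}^{\infty} \int_{\mu_4(z_4)}^{\infty} f_{U_4 - U_2, U_4}(u, v) \, dv \, du \\
&= -\mu_4'(z_4) \int_{\mu_4(z_4) - \mu_2(z_2)}^{\infty} f_{U_4 - U_2, U_4}(u, \mu_4(z_4)) \, du - \mu_4'(z_4) \int_{\mu_4(z_4)}^{\infty} f_{U_4 - U_2, U_4}(\mu_4(z_4) - \mu_2(z_2), v) \, dv \\
&= -\mu_4'(z_4) \int_{-\infty}^{\mu_2(z_2)} f_{U_2, U_4}(u, \mu_4(z_4)) \, du - \mu_4'(z_4) \int_{\mu_4(z_4)}^{\infty} f_{U_4 - U_2, U_4}(\mu_4(z_4) - \mu_2(z_2), v) \, dv. \tag{4}
\end{align*}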

Referring again to the partition in Figure 4, the first integral of Equation (4) traces out the set of indifferent individuals along the border between $D(z_2, z_4) = 0$ and $D(z_2, z_4) = 4$, i.e. marginal 0–4 compliers. The second term traces out the set of indifferent individuals along the border between $D(z_2, z_4) = 2$ and $D(z_2, z_4) = 4$, i.e. marginal 2–4 compliers. Hence, as visualized in the bottom left panel of Figure 4, a marginal shift in $Z_4$ induces 0–4 and 2–4 compliers and thus generates partial unordered monotonicity (A2) in $Z_4$.

Comparable compliers (A3) emerge by the fact that the second integral in Equation (3) and the second integral in Equation (4) both trace out the same group of 2–4 compliers along the margin of indifference between $D(z_2, z_4) = 4$ and $D(z_2, z_4) = 2$. A simple change of variables confirms this: with $U_4 - U_2$ fixed at $\mu_4(z_4) - \mu_2(z_2)$ in both integral expressions, we can write $U_2 = U_4 - \mu_4(z_4) + \mu_2(z_2)$, and plugging this transformation into the second integral in Equation (3) yields the second integral in Equation (4). The weights $-\mu_2'(z_2)$ and $-\mu_4'(z_4)$ outside the integrals still differ, but this is only to allow for differential marginal disutility of 2-year versus 4-year distance; the identical integrals confirm that both the $Z_2$ shift and the $Z_4$ shift involve the same group of compliers along the margin of indifference between 2-year and 4-year entry.
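The two implications of the selection model can also be illustrated by simulation. The sketch below is a hedged example, assuming bivariate normal $(U_2, U_4)$ and small discrete threshold shifts in place of marginal ones; the thresholds, correlation, shift size, and the helper name `choose` are hypothetical choices for illustration.

```python
import numpy as np

def choose(u2, u4, mu2, mu4):
    """Return 0, 2, or 4 following the choice equations D0, D2, D4."""
    v2, v4 = u2 - mu2, u4 - mu4              # net values relative to no college
    d = np.zeros(u2.shape, dtype=int)
    d[(v2 > 0) & (v2 > v4)] = 2              # U2 > mu2 and U4 - U2 < mu4 - mu2
    d[(v4 > 0) & (v4 > v2)] = 4              # U4 > mu4 and U4 - U2 > mu4 - mu2
    return d

rng = np.random.default_rng(0)
n = 200_000
u2, u4 = rng.multivariate_normal([0, 0], [[1, 0.5], [0.5, 1]], size=n).T
mu2, mu4, delta = 0.2, 0.5, 0.05             # hypothetical thresholds and shift size

base = choose(u2, u4, mu2, mu4)
z2_shift = choose(u2, u4, mu2 - delta, mu4)  # threshold for 2-year entry falls
z4_shift = choose(u2, u4, mu2, mu4 - delta)  # threshold for 4-year entry falls

# Observed (before, after) choice pairs under each shift.
moves_z2 = set(zip(base.tolist(), z2_shift.tolist()))
moves_z4 = set(zip(base.tolist(), z4_shift.tolist()))

# 2-4 compliers under each shift sit near the same indifference line U4 - U2 = mu4 - mu2.
gap = u4 - u2
mean_gap_z2 = gap[(base == 4) & (z2_shift == 2)].mean()
mean_gap_z4 = gap[(base == 2) & (z4_shift == 4)].mean()
```

Under these assumptions the $Z_2$ shift should produce only 0→2 and 4→2 switchers, the $Z_4$ shift only 0→4 and 2→4 switchers, and both groups of 2–4 compliers should have a mean $U_4 - U_2$ within `delta` of `mu4 - mu2`.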


D Main Identification Proofs

This appendix section proves the main identification results of Section 4.3. From an evaluation point $(Z_2, Z_4) = (z_2, z_4)$, consider a marginal shift $z_2 \to z_2'$ in favor of $D_2$, i.e. $\Pr(D_2 | z_2', z_4) > \Pr(D_2 | z_2, z_4)$; the opposite direction proceeds analogously. By partial unordered monotonicity (A2), this shift in $Z_2$ induces 2←0 and 2←4 compliers. Changes in $D_0$ with respect to this shift therefore must only involve 2←0 compliers, and changes in $D_4$ must only involve 2←4 compliers. Using the shorthand $E[Y D_0 | Z_2 = z_2, Z_4 = z_4] \equiv E[Y D_0 | z_2, z_4]$, we have

\begin{align*}
&E[Y D_0 | z_2, z_4] \\
&= E[Y_0 | D_0 = 1, z_2, z_4] \Pr(D_0 = 1 | z_2, z_4) \\
&= E[Y_0 | D(z_2, z_4) = 0, z_2, z_4] \Pr(D(z_2, z_4) = 0) \\
&= E[Y_0 | D(z_2, z_4) = 0] \Pr(D(z_2, z_4) = 0)
\end{align*}

where the fourth line is due to independence and exclusion (A1). Decomposing the mass of individuals with $D(z_2, z_4) = 0$ into the complier groups allowed by partial unordered monotonicity (A2) yields

\begin{align*}
&E[Y_0 | D(z_2, z_4) = 0] \Pr(D(z_2, z_4) = 0) \\
&= \Big[ E[Y_0 | D(z_2, z_4) = 0, D(z_2', z_4) = 0] \Pr(D(z_2', z_4) = 0 | D(z_2, z_4) = 0) \\
&\quad + E[Y_0 | D(z_2, z_4) = 0, D(z_2', z_4) = 2] \Pr(D(z_2', z_4) = 2 | D(z_2, z_4) = 0) \Big] \Pr(D(z_2, z_4) = 0) \\
&= E[Y_0 | D(z_2, z_4) = 0, D(z_2', z_4) = 0] \Pr(D(z_2, z_4) = 0, D(z_2', z_4) = 0) \\
&\quad + E[Y_0 | D(z_2, z_4) = 0, D(z_2', z_4) = 2] \Pr(D(z_2, z_4) = 0, D(z_2', z_4) = 2).
\end{align*}

Likewise, the permitted complier groups with $D(z_2', z_4) = 0$ yield

\begin{align*}
&E[Y D_0 | z_2', z_4] \\
&= E[Y_0 | D(z_2', z_4) = 0] \Pr(D(z_2', z_4) = 0) \\
&= E[Y_0 | D(z_2, z_4) = 0, D(z_2', z_4) = 0] \Pr(D(z_2, z_4) = 0, D(z_2', z_4) = 0)
\end{align*}


where again the third line is due to A2. Taking the difference of these two expressions gives

\begin{align*}
&E[Y D_0 | z_2, z_4] - E[Y D_0 | z_2', z_4] \\
&= E[Y_0 | D(z_2, z_4) = 0, D(z_2', z_4) = 2] \Pr(D(z_2, z_4) = 0, D(z_2', z_4) = 2) \\
&= E[Y_0 | D(z_2, z_4) = 0, D(z_2', z_4) = 2] \Big[ \Pr(D(z_2, z_4) = 0) - \Pr(D(z_2', z_4) = 0) \Big]
\end{align*}

where the third line is due to A2: the change in the probability of $D_0$ with respect to this instrument shift is driven by 2–0 compliers. Taking limits to convert differences to derivatives,

\begin{align*}
\frac{\partial E[Y D_0 | z_2, z_4]}{\partial Z_2} &= \lim_{z_2' \to z_2} \frac{E[Y D_0 | z_2, z_4] - E[Y D_0 | z_2', z_4]}{z_2 - z_2'} \\
&= \lim_{z_2' \to z_2} E[Y_0 | D(z_2, z_4) = 0, D(z_2', z_4) = 2] \frac{\Pr(D(z_2, z_4) = 0) - \Pr(D(z_2', z_4) = 0)}{z_2 - z_2'} \\
&= E[Y_0 | 2\text{--}0 \text{ complier at } (z_2, z_4)] \frac{\partial \Pr(D(z_2, z_4) = 0)}{\partial Z_2} \\
&= E[Y_0 | 2\text{--}0 \text{ complier at } (z_2, z_4)] \frac{\partial E[D_0 | z_2, z_4]}{\partial Z_2}.
\end{align*}

Rearranging identifies the first potential outcome of interest:

\[
\frac{\partial E[Y D_0 | z_2, z_4] / \partial Z_2}{\partial E[D_0 | z_2, z_4] / \partial Z_2} = E[Y_0 | 2\text{--}0 \text{ complier at } (z_2, z_4)].
\]

We can proceed analogously for $D_4$:

\begin{align*}
E[Y D_4 | z_2, z_4] &= E[Y_4 | D(z_2, z_4) = 4] \Pr(D(z_2, z_4) = 4) \\
&= \Big[ E[Y_4 | D(z_2, z_4) = 4, D(z_2', z_4) = 4] \Pr(D(z_2', z_4) = 4 | D(z_2, z_4) = 4) \\
&\quad + E[Y_4 | D(z_2, z_4) = 4, D(z_2', z_4) = 2] \Pr(D(z_2', z_4) = 2 | D(z_2, z_4) = 4) \Big] \Pr(D(z_2, z_4) = 4) \\
&= E[Y_4 | D(z_2, z_4) = 4, D(z_2', z_4) = 4] \Pr(D(z_2, z_4) = 4, D(z_2', z_4) = 4) \\
&\quad + E[Y_4 | D(z_2, z_4) = 4, D(z_2', z_4) = 2] \Pr(D(z_2, z_4) = 4, D(z_2', z_4) = 2)
\end{align*}


\begin{align*}
E[Y D_4 | z_2', z_4] &= E[Y_4 | D(z_2', z_4) = 4] \Pr(D(z_2', z_4) = 4) \\
&= E[Y_4 | D(z_2, z_4) = 4, D(z_2', z_4) = 4] \Pr(D(z_2, z_4) = 4, D(z_2', z_4) = 4)
\end{align*}

\begin{align*}
E[Y D_4 | z_2, z_4] - E[Y D_4 | z_2', z_4] &= E[Y_4 | D(z_2, z_4) = 4, D(z_2', z_4) = 2] \Pr(D(z_2, z_4) = 4, D(z_2', z_4) = 2) \\
&= E[Y_4 | D(z_2, z_4) = 4, D(z_2', z_4) = 2] \Big[ \Pr(D(z_2, z_4) = 4) - \Pr(D(z_2', z_4) = 4) \Big]
\end{align*}

\begin{align*}
\frac{\partial E[Y D_4 | z_2, z_4]}{\partial Z_2} &= \lim_{z_2' \to z_2} \frac{E[Y D_4 | z_2, z_4] - E[Y D_4 | z_2', z_4]}{z_2 - z_2'} \\
&= \lim_{z_2' \to z_2} E[Y_4 | D(z_2, z_4) = 4, D(z_2', z_4) = 2] \frac{\Pr(D(z_2, z_4) = 4) - \Pr(D(z_2', z_4) = 4)}{z_2 - z_2'} \\
&= E[Y_4 | 2\text{--}4 \text{ complier at } (z_2, z_4)] \frac{\partial E[D_4 | z_2, z_4]}{\partial Z_2}.
\end{align*}

Therefore

\[
\frac{\partial E[Y D_4 | z_2, z_4] / \partial Z_2}{\partial E[D_4 | z_2, z_4] / \partial Z_2} = E[Y_4 | 2\text{--}4 \text{ complier at } (z_2, z_4)].
\]

Turning to $D_2$,

\begin{align*}
E[Y D_2 | z_2, z_4] &= E[Y_2 | D(z_2, z_4) = 2] \Pr(D(z_2, z_4) = 2) \\
&= E[Y_2 | D(z_2, z_4) = 2, D(z_2', z_4) = 2] \Pr(D(z_2, z_4) = 2, D(z_2', z_4) = 2).
\end{align*}

The pooled expression arises from the fact that changes in $D_2$ with respect to $z_2 \to z_2'$ are driven by both 2←0 and 2←4 compliers:

\begin{align*}
E[Y D_2 | z_2', z_4] &= E[Y_2 | D(z_2', z_4) = 2] \Pr(D(z_2', z_4) = 2) \\
&= E[Y_2 | D(z_2, z_4) = 2, D(z_2', z_4) = 2] \Pr(D(z_2, z_4) = 2, D(z_2', z_4) = 2) \\
&\quad + E[Y_2 | D(z_2, z_4) = 0, D(z_2', z_4) = 2] \Pr(D(z_2, z_4) = 0, D(z_2', z_4) = 2) \\
&\quad + E[Y_2 | D(z_2, z_4) = 4, D(z_2', z_4) = 2] \Pr(D(z_2, z_4) = 4, D(z_2', z_4) = 2).
\end{align*}

Hence

\begin{align*}
E[Y D_2 | z_2, z_4] - E[Y D_2 | z_2', z_4] &= -E[Y_2 | D(z_2, z_4) = 0, D(z_2', z_4) = 2] \Pr(D(z_2, z_4) = 0, D(z_2', z_4) = 2) \\
&\quad - E[Y_2 | D(z_2, z_4) = 4, D(z_2', z_4) = 2] \Pr(D(z_2, z_4) = 4, D(z_2', z_4) = 2) \\
&= -E[Y_2 | D(z_2, z_4) = 0, D(z_2', z_4) = 2] \Big[ \Pr(D(z_2, z_4) = 0) - \Pr(D(z_2', z_4) = 0) \Big] \\
&\quad - E[Y_2 | D(z_2, z_4) = 4, D(z_2', z_4) = 2] \Big[ \Pr(D(z_2, z_4) = 4) - \Pr(D(z_2', z_4) = 4) \Big]
\end{align*}

\begin{align*}
\frac{\partial E[Y D_2 | z_2, z_4]}{\partial Z_2} &= \lim_{z_2' \to z_2} \frac{E[Y D_2 | z_2, z_4] - E[Y D_2 | z_2', z_4]}{z_2 - z_2'} \\
&= -\lim_{z_2' \to z_2} \Bigg[ E[Y_2 | D(z_2, z_4) = 0, D(z_2', z_4) = 2] \frac{\Pr(D(z_2, z_4) = 0) - \Pr(D(z_2', z_4) = 0)}{z_2 - z_2'} \\
&\quad + E[Y_2 | D(z_2, z_4) = 4, D(z_2', z_4) = 2] \frac{\Pr(D(z_2, z_4) = 4) - \Pr(D(z_2', z_4) = 4)}{z_2 - z_2'} \Bigg] \\
&= -E[Y_2 | 2\text{--}0 \text{ complier at } (z_2, z_4)] \frac{\partial E[D_0 | z_2, z_4]}{\partial Z_2} - E[Y_2 | 2\text{--}4 \text{ complier at } (z_2, z_4)] \frac{\partial E[D_4 | z_2, z_4]}{\partial Z_2}.
\end{align*}

Finally, we turn to the comparable compliers induced by $Z_4$. From the same evaluation point $(Z_2, Z_4) = (z_2, z_4)$, consider a marginal shift $z_4 \to z_4'$ in favor of $D_2$, i.e. $\Pr(D_2 | z_2, z_4') > \Pr(D_2 | z_2, z_4)$; the opposite direction proceeds analogously. By partial unordered monotonicity (A2), this shift in $Z_4$ induces 2←4 and 0←4 compliers. Changes in $D_2$ with respect to this shift therefore must only involve 2←4 compliers:

\begin{align*}
E[Y D_2 | z_2, z_4] &= E[Y_2 | D(z_2, z_4) = 2] \Pr(D(z_2, z_4) = 2) \\
&= E[Y_2 | D(z_2, z_4) = 2, D(z_2, z_4') = 2] \Pr(D(z_2, z_4) = 2, D(z_2, z_4') = 2)
\end{align*}

\begin{align*}
E[Y D_2 | z_2, z_4'] &= E[Y_2 | D(z_2, z_4') = 2] \Pr(D(z_2, z_4') = 2) \\
&= E[Y_2 | D(z_2, z_4) = 2, D(z_2, z_4') = 2] \Pr(D(z_2, z_4) = 2, D(z_2, z_4') = 2) \\
&\quad + E[Y_2 | D(z_2, z_4) = 4, D(z_2, z_4') = 2] \Pr(D(z_2, z_4) = 4, D(z_2, z_4') = 2)
\end{align*}

\begin{align*}
E[Y D_2 | z_2, z_4] - E[Y D_2 | z_2, z_4'] &= -E[Y_2 | D(z_2, z_4) = 4, D(z_2, z_4') = 2] \Pr(D(z_2, z_4) = 4, D(z_2, z_4') = 2) \\
&= E[Y_2 | D(z_2, z_4) = 4, D(z_2, z_4') = 2] \Big[ \Pr(D(z_2, z_4) = 2) - \Pr(D(z_2, z_4') = 2) \Big]
\end{align*}

\begin{align*}
\frac{\partial E[Y D_2 | z_2, z_4]}{\partial Z_4} &= \lim_{z_4' \to z_4} \frac{E[Y D_2 | z_2, z_4] - E[Y D_2 | z_2, z_4']}{z_4 - z_4'} \\
&= \lim_{z_4' \to z_4} E[Y_2 | D(z_2, z_4) = 4, D(z_2, z_4') = 2] \frac{\Pr(D(z_2, z_4) = 2) - \Pr(D(z_2, z_4') = 2)}{z_4 - z_4'} \\
&= E[Y_2 | 2\text{--}4 \text{ complier at } (z_2, z_4)] \frac{\partial E[D_2 | z_2, z_4]}{\partial Z_4}.
\end{align*}

This mean potential outcome among marginal $Z_4$ compliers is equal to that among marginal $Z_2$ compliers by A3 (comparable compliers). Therefore

\[
\frac{\partial E[Y D_2 | z_2, z_4] / \partial Z_4}{\partial E[D_2 | z_2, z_4] / \partial Z_4} = E[Y_2 | 2\text{--}4 \text{ complier at } (z_2, z_4)],
\]

and plugging this identified potential outcome into the pooled Z2 expression yields all of the mean complier potential outcomes of interest.
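The first identification result can be sketched in a simulation where potential outcomes are known, so that the ratio of observable differences can be compared against the infeasible complier mean. This is an illustration under assumed primitives: the latent-index choice rule follows the selection model of Section 4.2, while the outcome equations, thresholds, and shift size are hypothetical, and a small discrete shift stands in for the marginal one.

```python
import numpy as np

def choose(u2, u4, mu2, mu4):
    """Return 0, 2, or 4 from the latent-index choice equations."""
    v2, v4 = u2 - mu2, u4 - mu4
    d = np.zeros(u2.shape, dtype=int)
    d[(v2 > 0) & (v2 > v4)] = 2
    d[(v4 > 0) & (v4 > v2)] = 4
    return d

rng = np.random.default_rng(1)
n = 200_000
u2, u4 = rng.normal(size=(2, n))
y0 = 1.0 + 0.5 * u2 + rng.normal(0, 1, n)     # hypothetical potential outcomes
y2 = y0 + 1.0 + 0.3 * u2
y4 = y0 + 2.0 + 0.3 * u4

def moments(mu2, mu4):
    """Return E[Y*D0], E[D0], and choices at the given thresholds."""
    d = choose(u2, u4, mu2, mu4)
    y = np.where(d == 0, y0, np.where(d == 2, y2, y4))   # observed outcome
    return (y * (d == 0)).mean(), (d == 0).mean(), d

mu2, mu4, delta = 0.2, 0.5, 0.05
ey_d0_a, p0_a, d_a = moments(mu2, mu4)              # at z2
ey_d0_b, p0_b, d_b = moments(mu2 - delta, mu4)      # at z2', favoring 2-year entry

ratio = (ey_d0_a - ey_d0_b) / (p0_a - p0_b)         # uses observables only
direct = y0[(d_a == 0) & (d_b == 2)].mean()         # infeasible benchmark: mean Y0 of 2-0 compliers
```

Because the same simulated students are evaluated at both instrument values, `ratio` and `direct` should coincide up to floating-point error; with real data the two instrument values are observed on different students and the ratio is estimated with sampling noise.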


Two remarks are in order. First, this identification strategy can recover any distributional feature of complier potential outcomes and exogenous covariates, not simply means, by replacing $Y$ with appropriate indicator functions of $Y$ and/or exogenous covariates, e.g. $1(Y \leq y)$ for CDF recovery. This extends the logic of Imbens and Rubin (1997), Abadie (2002), and Carneiro and Lee (2009) to multiple treatment margins and that of Kline and Walters (2016) to multiple instruments. Second, this method could be applied to discrete instruments. If the comparable compliers assumption (A3) holds across a discrete pair of shifts $(z_2, z_4) \to (z_2', z_4)$ and $(z_2, z_4) \to (z_2, z_4')$, then point identification is secured; if not, information about 2–4 compliers with respect to the $Z_4$ shift may still be informative about 2–4 compliers with respect to $Z_2$, which could lead to a partial identification approach or sensitivity analysis to different assumptions about the degree of similarity between these two complier groups.
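As a worked instance of the first remark, substituting $1(Y \leq y)$ for $Y$ in the $D_0$ result of this appendix would give the distribution of $Y_0$ among marginal 2–0 compliers at each value $y$. The symbol $F_{Y_0 | 2\text{--}0}$ is introduced here only for illustration and does not appear elsewhere in the text.

```latex
\[
F_{Y_0 \mid 2\text{--}0}(y) \equiv \Pr\big(Y_0 \le y \mid 2\text{--}0 \text{ complier at } (z_2, z_4)\big)
= \frac{\partial E[\mathbf{1}(Y \le y)\, D_0 \mid z_2, z_4] / \partial Z_2}{\partial E[D_0 \mid z_2, z_4] / \partial Z_2}.
\]
```

The same substitution should apply to the $D_4$ and $D_2$ expressions, with the pooled $D_2$ expression again requiring the $Z_4$ result to separate the two complier groups.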


References

Aakvik, A., J. J. Heckman, and E. J. Vytlacil (2005): “Estimating Treatment Effects for Discrete Outcomes When Responses to Treatment Vary: An Application to Norwegian Vocational Rehabilitation Programs,” Journal of Econometrics, 125, 15–51.
Abadie, A. (2002): “Bootstrap Tests for Distributional Treatment Effects in Instrumental Variable Models,” Journal of the American Statistical Association, 97, 284–292.
Acemoglu, D. and D. Autor (2011): “Skills, Tasks and Technologies: Implications for Employment and Earnings,” in Handbook of Labor Economics, Elsevier B.V., vol. 4, 1043–1171.
Aisch, G., R. Gebeloff, and K. Quealy (2014): “Where We Came From and Where We Went, State by State,” New York Times, 19 August.
Altonji, J. G., E. Blom, and C. Meghir (2012): “Heterogeneity in Human Capital Investments: High School Curriculum, College Major, and Careers,” Annual Review of Economics, 4, 185–223.
Andrews, R. J., J. Li, and M. F. Lovenheim (2014): “Heterogeneous Paths through College: Detailed Patterns and Relationships with Graduation and Earnings,” Economics of Education Review, 42, 93–108.
——— (2016): “Quantile Treatment Effects of College Quality on Earnings,” Journal of Human Resources, 51, 200–238.
Angrist, J. and I. Fernandez-Val (2013): “ExtrapoLATE-ing: External Validity and Overidentification in the LATE Framework,” in Advances in Economics and Econometrics, ed. by D. Acemoglu, M. Arellano, and E. Dekel, Cambridge: Cambridge University Press, vol. 31.
Angrist, J. D. and G. W. Imbens (1995): “Two-Stage Least Squares Estimation of Average Causal Effects in Models with Variable Treatment Intensity,” Journal of the American Statistical Association, 90, 431–442.
Autor, D. H. (2014): “Skills, Education, and the Rise of Earnings Inequality among the ‘Other 99 Percent’,” Science, 344, 843–851.

Bailey, M. J. and S. M. Dynarski (2011): “Gains and Gaps: Changing Inequality in U.S. College Entry and Completion,” in Inequality in Postsecondary Education, ed. by G. Duncan and R. Murnane, New York: Russell Sage.
Belfield, C. and T. Bailey (2011): “The Benefits of Attending Community College: A Review of the Evidence,” Community College Review, 39, 46–68.
——— (2017): “The Labor Market Returns to Sub-Baccalaureate College: A Review,” CAPSEE Working Paper.
Belley, P. and L. Lochner (2007): “The Changing Role of Family Income and Ability in Determining Educational Achievement,” Journal of Human Capital, 1, 37–89.
Blackstone, B. (1978): “Summary Statistics for Vocational Education Program Year 1978,” Tech. rep., U.S. Department of Health, Education and Welfare, Washington, D.C.
Boyer, J. W. (2015): The University of Chicago: A History, University of Chicago Press.
Brinch, C. N., M. Mogstad, and M. Wiswall (2017): “Beyond LATE with a Discrete Instrument,” Journal of Political Economy, 125, 985–1039.
Brint, S. and J. Karabel (1989): The Diverted Dream: Community Colleges and the Promise of Educational Opportunity in America, 1900-1985, Oxford University Press.
Cameron, S. and J. Heckman (1998): “Life Cycle Schooling and Dynamic Selection Bias: Models and Evidence for Five Cohorts of American Males,” Journal of Political Economy, 106, 262–333.
Cameron, S. and C. Taber (2004): “Estimation of Educational Borrowing Constraints Using Returns to Schooling,” Journal of Political Economy, 112, 132–182.
Card, D. (1995): “Using Geographic Variation in College Proximity to Estimate the Return to Schooling,” in Aspects of Labor Market Behavior: Essays in Honor of John Vanderkamp, ed. by L. N. Christofides, K. E. Grant, and R. Swidinsky, Toronto: University of Toronto Press, 201–222.


Carneiro, P., J. J. Heckman, and E. J. Vytlacil (2011): “Estimating Marginal Returns to Education,” American Economic Review, 101, 2754–2781.
Carneiro, P. and S. Lee (2009): “Estimating Distributions of Potential Outcomes Using Local Instrumental Variables with an Application to Changes in College Enrollment and Wage Inequality,” Journal of Econometrics, 149, 191–208.
Carneiro, P., M. Lokshin, and N. Umapathi (2017): “Average and Marginal Returns to Upper Secondary Schooling in Indonesia,” Journal of Applied Econometrics, 32, 16–36.
Chetty, R. and N. Hendren (2016): “The Impacts of Neighborhoods on Intergenerational Mobility II: County-Level Estimates,” NBER Working Paper No. 23002.
Chetty, R., N. Hendren, and L. F. Katz (2016): “The Effects of Exposure to Better Neighborhoods on Children: New Evidence from the Moving to Opportunity Experiment,” American Economic Review, 106, 855–902.
Chetty, R., N. Hendren, P. Kline, and E. Saez (2014): “Where Is the Land of Opportunity? The Geography of Intergenerational Mobility in the United States,” The Quarterly Journal of Economics, 129, 1553–1623.
Clark, B. R. (1960): “The ‘Cooling-Out’ Function in Higher Education,” American Journal of Sociology, 65, 569–576.
Cohen, A. M., F. B. Brawer, and C. B. Kisker (2014): The American Community College, The Jossey-Bass Higher and Adult Education Series, Wiley.
Cornelissen, T., C. Dustmann, A. Raute, and U. Schönberg (2017): “Who Benefits from Universal Child Care? Estimating Marginal Returns to Early Child Care Attendance,” Journal of Political Economy, forthcoming.
Denning, J. T. (2017): “College on the Cheap: Consequences of Community College Tuition Reductions,” American Economic Journal: Economic Policy, 9, 155–188.
Dobbie, W. S. and R. G. Fryer (2017): “Charter Schools and Labor Market Outcomes,” NBER Working Paper No. 22502.

Donnis, I. (2017): “DNC Chair Tom Perez Touts Raimondo’s College Tuition Plan as a Smart Investment,” Rhode Island Public Radio, 5 April.
Dougherty, K. J. (1994): The Contradictory College: the Conflicting Origins, Impacts, and Futures of the Community College, Albany: State University of New York Press.
Dynarski, S. M., S. W. Hemelt, and J. M. Hyman (2013): “The Missing Manual: Using National Student Clearinghouse Data to Track Postsecondary Outcomes,” Working Paper.
Eells, W. C. (1931): The Junior College, Boston: Houghton Mifflin Company.
Eisenhauer, P., J. J. Heckman, and E. J. Vytlacil (2015): “The Generalized Roy Model and the Cost-Benefit Analysis of Social Programs,” Journal of Political Economy, 123, 413–443.
Feller, A., T. Grindal, L. Miratrix, and L. C. Page (2016): “Compared to What? Variation in the Impacts of Early Childhood Education by Alternative Care Type,” The Annals of Applied Statistics, 10, 1245–1285.
Freeman, R. (1976): The Overeducated American, New York: Academic Press.
Goldin, C. D. and L. F. Katz (2008): The Race Between Education and Technology, Harvard University Press.
Goodman, J., M. Hurwitz, and J. Smith (2017): “Access to Four-Year Public Colleges and Degree Completion,” Journal of Labor Economics, 35, 829–867.
Grubb, N. (1989): “The Effects of Differentiation on Educational Attainment: The Case of Community Colleges,” Review of Higher Education, 12, 349–374.
Hastings, J. S., C. A. Neilson, and S. D. Zimmerman (2014): “Are Some Degrees Worth More than Others? Evidence from College Admission Cutoffs in Chile,” NBER Working Paper No. 19241.
Havnes, T. and M. Mogstad (2015): “Is Universal Child Care Leveling the Playing Field?” Journal of Public Economics, 127, 100–114.


Heckman, J. J., N. Hohmann, J. Smith, and M. Khoo (2000): “Substitution and Dropout Bias in Social Experiments: A Study of an Influential Social Experiment,” The Quarterly Journal of Economics, 115, 651–694.
Heckman, J. J. and R. Pinto (2017): “Unordered Monotonicity,” NBER Working Paper No. 23497.
Heckman, J. J., D. Schmierer, and S. Urzua (2010): “Testing the Correlated Random Coefficient Model,” Journal of Econometrics, 158, 177–203.
Heckman, J. J. and S. Urzua (2010): “Comparing IV with Structural Models: What Simple IV Can and Cannot Identify,” Journal of Econometrics, 156, 27–37.
Heckman, J. J., S. Urzua, and E. J. Vytlacil (2008): “Instrumental Variables in Models with Multiple Outcomes: the General Unordered Case,” Annals of Economics and Statistics, 91/92, 151–174.
Heckman, J. J. and E. J. Vytlacil (1999): “Local Instrumental Variables and Latent Variable Models for Identifying and Bounding Treatment Effects,” Proceedings of the National Academy of Sciences of the United States of America, 96, 4730–4734.
——— (2005): “Structural Equations, Treatment Effects, and Econometric Policy Evaluation,” Econometrica, 73, 669–738.
——— (2007): “Econometric Evaluation of Social Programs, Part II: Using the Marginal Treatment Effect to Organize Alternative Econometric Estimators to Evaluate Social Programs, and to Forecast their Effects in New Environments,” in Handbook of Econometrics, ed. by J. J. Heckman and E. Leamer, Elsevier, vol. 6, 4875–5143.
Hoxby, C. M. (2009): “The Changing Selectivity of American Colleges,” The Journal of Economic Perspectives, 23, 95–118.
Huber, M. and G. Mellace (2014): “Testing Instrument Validity for LATE Identification Based on Inequality Moment Constraints,” The Review of Economics and Statistics, 97, 398–411.

Imbens, G. W. and J. D. Angrist (1994): “Identification and Estimation of Local Average Treatment Effects,” Econometrica, 62, 467–475.
Imbens, G. W. and D. B. Rubin (1997): “Estimating Outcome Distributions for Compliers in Instrumental Variables Models,” Review of Economic Studies, 64, 555–574.
Jacobson, L., R. LaLonde, and D. G. Sullivan (2005): “Estimating the Returns to Community College Schooling for Displaced Workers,” Journal of Econometrics, 125, 271–304.
Jenkins, D. and J. Fink (2016): “Tracking Transfer: New Measures of Institutional and State Effectiveness in Helping Community College Students Attain Bachelor’s Degrees,” Tech. rep., Community College Research Center.
Jepsen, C., K. Troske, and P. Coomes (2014): “The Labor-Market Returns to Community College Degrees, Diplomas, and Certificates,” Journal of Labor Economics, 32, 95–121.
Kane, T. J. and C. E. Rouse (1993): “Labor Market Returns to Two- and Four-Year Colleges: Is a Credit a Credit and Do Degrees Matter?” NBER Working Paper No. 4268.
——— (1999): “The Community College: Educating Students at the Margin between College and Work,” Journal of Economic Perspectives, 13, 63–84.
Katz, L. F. and K. M. Murphy (1992): “Changes in Relative Wages, 1963–1987: Supply and Demand Factors,” The Quarterly Journal of Economics, 107, 35–78.
Kirkeboen, L. J., E. Leuven, and M. Mogstad (2016): “Field of Study, Earnings, and Self-Selection,” The Quarterly Journal of Economics, 131, 1057–1111.
Kitagawa, T. (2015): “A Test for Instrument Validity,” Econometrica, 83, 2043–2063.
Kline, P. and C. R. Walters (2016): “Evaluating Public Programs with Close Substitutes: The Case of Head Start,” The Quarterly Journal of Economics, 131, 1795–1848.
Kling, J. R. (2001): “Interpreting Instrumental Variables Estimates of the Returns to Schooling,” Journal of Business & Economic Statistics, 19, 358–364.
Kling, J. R., J. B. Liebman, and L. F. Katz (2007): “Experimental Analysis of Neighborhood Effects,” Econometrica, 75, 83–119.

Koos, L. V. (1944): “How to Democratize the Junior-College Level,” The School Review, 52, 271–284.
Lee, S. and B. Salanie (2017): “Identifying Effects of Multivalued Treatments,” Working Paper.
Matzkin, R. L. (1993): “Nonparametric Identification and Estimation of Polychotomous Choice Models,” Journal of Econometrics, 58, 137–168.
Miller, D. W. (2007): “Isolating the Causal Impact of Community College Enrollment on Educational Attainment and Labor Market Outcomes in Texas,” SIEPR Discussion Paper No. 06–33.
Moffitt, R. (2008): “Estimating Marginal Treatment Effects in Heterogeneous Populations,” Annales d’Économie et de Statistique, 239.
Mogstad, M., A. Santos, and A. Torgovitsky (2017): “Using Instrumental Variables for Inference about Policy Relevant Treatment Effects,” NBER Working Paper No. 23568.
Monaghan, D. B. and P. Attewell (2015): “The Community College Route to the Bachelor’s Degree,” Educational Evaluation and Policy Analysis, 37, 70–91.
National Center for Education Statistics (2015): “Digest of Education Statistics 2016,” Tech. rep.
National Conference of State Legislatures (2016): “Free Community College,” Tech. rep.
Nybom, M. (2017): “The Distribution of Lifetime Earnings Returns to College,” Journal of Labor Economics, 35, 903–952.
Pincus, F. (1980): “The False Promises of Community Colleges: Class Conflict and Vocational Education,” Harvard Educational Review, 50, 332–361.
Pinto, R. (2016): “Learning from Noncompliance in Social Experiments: The Case of Moving to Opportunity,” Working Paper.
Reynolds, C. L. (2012): “Where to Attend? Estimating the Effects of Beginning College at a Two-Year Institution,” Economics of Education Review, 31, 345–362.

Rouse, C. E. (1995): “Democratization or Diversion? The Effect of Community Colleges on Educational Attainment,” Journal of Business and Economic Statistics, 13, 217–224.
——— (1998): “Do Two-Year Colleges Increase Overall Educational Attainment? Evidence from the States,” Journal of Policy Analysis and Management, 17, 595–620.
Roy, A. D. (1951): “Some Thoughts on the Distribution of Earnings,” Oxford Economic Papers, 3, 135–146.
Stevens, D. W. (2007): “Employment That Is Not Covered by State Unemployment Insurance Laws,” U.S. Census Bureau Technical Paper No. TP–2007–04.
Vytlacil, E. (2002): “Independence, Monotonicity, and Latent Index Models: An Equivalence Result,” Econometrica, 70, 331–341.
Wald, A. (1940): “The Fitting of Straight Lines if Both Variables are Subject to Error,” The Annals of Mathematical Statistics, 11, 284–300.
Walters, C. R. (2017): “The Demand for Effective Charter Schools,” Journal of Political Economy, forthcoming.
White, S., L. B. Potter, H. You, L. Valencia, J. A. Jordan, and B. Pecotte (2016): “Introduction to Texas Domestic Migration,” Tech. rep., Office of the State Demographer.
Zimmerman, S. D. (2014): “The Returns to College Admission for Academically Marginal Students,” Journal of Labor Economics, 32, 711–754.
