Over-Education in Multilingual Economies: Evidence from Catalonia

Maite Blázquez1 and Silvio Rendon2

Abstract

Catalonia's economy is characterized by linguistic diversity and provides a unique opportunity to measure the effect of language proficiency on over-education, that is, whether individuals with deficient language skills, such as non-natives, tend to accept jobs for which they have excessive formal skills. Descriptive evidence suggests the contrary: individuals with better language knowledge are more likely to be over-educated. However, estimating a model that controls for individuals' sociodemographic characteristics reveals the opposite: better language knowledge decreases over-education. This effect, although robust to accounting for endogeneity of language knowledge and significant at the individual level, is mostly non-significant on average.

JEL Classification: J24, J41, I20, J61, J70. Keywords: Over-Education, Language, Immigration, Skill Premium.

1 Universidad Autónoma de Madrid
2 Stony Brook University

ACKNOWLEDGMENTS

We thank participants of the XX Annual Conference of the European Society for Population Economics in Verona, and participants of seminars at U. Pompeu Fabra, U. of Girona, FEDEA, U. Autónoma of Madrid, and U. of Salamanca, as well as Núria Quella and Miguel Sánchez Romero for useful comments. All errors and omissions are only ours. Maite Blázquez acknowledges the financial support provided by Ministerio de Educación y Ciencia (Plan Nacional I+D+I, 2008-2011, ECO2008-04813). The usual disclaimer applies. Corresponding author. Email: [email protected]; Phone: 1 917 528 5119; Fax: 1 631 632 7516.

1. INTRODUCTION

In a competitive labor market a variety of qualifications are required, so that individuals who lack some skills, such as language skills, may compensate for this disadvantage by accepting jobs for which they have excessive formal skills.3 This type of skill mismatch may be particularly evident in countries that have undergone language shifts, that is, in countries where languages that were once widely but only informally used have become official. Catalonia's multilingual economy, with its coexistence of Spanish and Catalan, provides a good opportunity to analyze this issue in detail. In this article we measure the effect of language skills on the quality of job matches in this multilingual economy, an aspect of over-education that has received scant attention. We show that language knowledge reduces over-education, though not at a significant level.

Over-education is a form of labor under-utilization in which workers perform jobs whose educational requirements are below their own attainment. Workers who fail to find appropriate jobs may stop searching and simply accept jobs that do not match their educational accomplishments. In the literature this phenomenon has been explained as compensation for a lack of human capital, such as experience, ability, or on-the-job training (Sicherman 1991, Alba-Ramírez 1993, Groot 1993, 1996, Groot and Maassen van den Brink 1996). Since language proficiency is a specific form of human capital, potentially highly valued in the labor market, over-education may also arise as a consequence of insufficient language skills.

Our main contribution to the literature on over-education is to measure the effect of language proficiency on over-education in a multilingual labor market where language skills potentially play an important role in explaining labor market outcomes. As in other economies characterized by linguistic diversity, public intervention in Catalonia in the form of an active language policy has had direct economic implications. One of the goals of the language policy carried out by the autonomous Catalan government during the eighties and nineties, called Normalization policy, was to create an economy where the official use of Catalan would increase at the expense of Castilian (Spanish), formerly the only official language. This has contributed to increasing the economic value of Catalan knowledge: individuals with more knowledge of Catalan are significantly more likely to be employed (Rendon 2007). Our work goes a step further and analyzes the effect of language knowledge on the probability that an individual finds an appropriate job.

In this article we use two Census years, 1991 and 1996.4 An initial descriptive analysis of Catalonia, a multilingual economy, suggests that individuals with better language knowledge appear more likely to be over-educated, the opposite of what the substitution-of-skills hypothesis predicts. However, once we estimate a model that controls for several sociodemographic attributes, we find that language knowledge diminishes the probability of over-education. This negative effect, although robust to accounting for endogeneity of language knowledge and significant at the individual level, is mostly non-significant on average.

3 In particular, from the point of view of the job-competition theory (Thurow, 1975), the labor market is characterized by a queue of workers competing for jobs, with those at the head of the queue being hired first. A worker's position in the queue is determined by his or her training costs for the firm. In this framework, education proxies training costs, with the highly educated being seen as more able and, therefore, requiring less training. However, other worker characteristics, such as language knowledge, also reduce training costs and can lead to less educated workers being hired first.

The remainder of this paper is organized as follows. The next section gives background to multilingualism and reviews language policy in Catalonia. Section 3 describes the data set and discusses the main descriptive statistics. Section 4 presents the estimation results and Section 5 details the main conclusions of this paper.

2. LANGUAGE SHIFT IN A MULTILINGUAL ECONOMY

The existing evidence supports the so-called "compensation hypothesis," that over-educated workers have less experience, tenure and on-the-job training than correctly allocated workers (Groot and Maassen van den Brink 1996, Groot 1993, 1996, and Sicherman 1991). However, over-education, a particular case of a misallocation of resources, may be particularly severe in multilingual economies, as monolingual agents try to find their competitive edge in more education, which is ultimately not required by the labor market. Catalonia offers an excellent context to analyze over-education in a multilingual economy and, moreover, in an economy that underwent a language shift.

4 These are the last censuses with reliable data on language for Catalonia. The census of 2001 had several inconsistencies and has therefore not been used for the analysis of language issues.

In multilingual economies, like Catalonia, over-education can also be the result of compensating for a lack of "language skills" with higher levels of education. After decades in which Catalan was absent from all public and formal use, at the beginning of the eighties, with the "Normalization policy", this region experienced a progressive shift in the official language from Spanish to Catalan. This language shift affected individuals who arrived in the sixties from the rest of Spain, but also the locals, who, schooled in the period from 1939 to 1975, were not able to read or write in Catalan.5

Most research on language done by economists has focused on language assimilation of immigrants to their host countries, approaching language as a form of human capital valued by the market and crucial to convergence between wages of immigrants and natives. A recent stream of the economic literature studies changes in the language of education and attempts to measure their economic effects. Examples are the language shift from French to Arabic in Morocco, from English to Welsh in Wales, from Russian to Estonian in Estonia, and from English to Spanish in Puerto Rico (see Angrist and Lavy 1997, Grin and Vaillancourt 1998, Sabourin and Bernier 2003, Angrist et al. 2008). However, evidence on the economic effects of this kind of language shift is so far inconclusive. While Angrist and Lavy (1997) find that the language shift in Morocco decreased returns to education, Angrist et al. (2008) show that once education-specific cohort trends were introduced, English instruction had no effect on English-speaking ability among Puerto Rican natives.

5 For a more detailed linguistic history of Catalonia see Laitin (1989). From the forties to the seventies, during Franco's regime, Spanish was declared the only official language in Catalonia (and in the whole of Spain), while Catalan was reserved for private use. This repression, combined with massive immigration of Spanish speakers to Catalonia, helps to explain why an important proportion of Catalans did not, in the recent past, master Catalan, even if it was their native language.

Like other European countries, Spain is characterized by vast language diversity. Altogether, around forty percent of the population of Spain lives in areas with two official languages. While Castilian (Spanish) is official in the totality of the territory, Catalan, Galician, and Basque share co-officiality with Castilian in their own territories.6 A comparison of the importance of these languages in their territories reveals that Galicia, with little immigration, has the highest proportion of speakers of its own language. However, Catalonia exhibits the best evolution of indicators of knowledge, use, and favorable attitudes towards the language. Not only has language knowledge increased over the past twenty years, but so has the proportion of individuals who consider Catalan their main language. Commitment to one's own language, though, is higher in the Basque Country and Navarra: among those who speak Basque, the proportion that also writes it is higher than the equivalent for Catalan in Catalonia and Galician in Galicia (Siguán 1999).

[Table 1 here]

Table 1 compares Catalonia's over-education rates and years of education with those of other regions of Spain. Data come from the Survey of Active Population (EPA) for 1996. Catalonia has a medium level of over-education in the Spanish context. Other multilingual regions have higher levels, such as the Basque Country and the Balearic Islands, or lower levels, such as Galicia and the Valencian Country. However, as we will see in the next section, language does seem to play a role in explaining over-education in Catalonia.

3. DATA

We use two samples of 250,000 randomly selected individuals, extracted from census data for 1991 and 1996,7 provided by the Catalan and Spanish National Statistical Institutes (IDESCAT-INE). These datasets contain information on personal attributes such as gender, age, marital status, schooling, place of residence, place of birth, number of years in Catalonia, occupational status, and knowledge of Catalan. We combine this information with data at the district level, called municipi, to capture the externality effects of residing in areas with high employment rates and/or widespread Catalan knowledge. We restrict the sample to parents and children aged between 16 and 60, born in Spain,8 but not in Catalonia, and participating in the labor force. The sample includes individuals from several types of households: single, divorced or separated individuals living alone, and individuals living in multi-personal households, from which we only consider both parents and their children.9 The final sample contains 47,053 individuals for 1991 and 69,043 individuals for 1996. Appendix A.1 details the sample selection.

6 Catalan is official in Catalonia (6,995,206 inhabitants in 2005), in the Valencian Country (4,692,449 inhabitants), and in the Balearic Islands (983,131). Galician is official in Galicia (2,762,198) and Basque is official in the Basque Country (2,124,846) and in the north of Navarra. These regions represent 39.81% of the total Spanish population (44,108,530). Other languages are Asturian or Bable (with around 600,000 speakers and not official in its territory: Asturias and the north of Castilla-Leon), and Aranés (an Occitan language, official in Vall d'Aran, within Catalonia, with 9,100 inhabitants).

7 Unlike the census of 1991, applied in all of Spain, the census of 1996 was only applied in Catalonia.

Alternative methods have been used in the literature to measure over-education: a self-assessment method whereby survey respondents are asked directly about the minimum education level needed to do their jobs (see Duncan and Hoffman 1981, Sicherman 1991, Sloane et al. 1999); methods that assess the average required education for a particular type of job, generally using job analysis data (see Rumberger 1987); and, finally, a method based on comparing the educational attainment of workers with the mean within their occupation (see Verdugo and Verdugo 1989). Our data set does not provide workers' self-reports on their level of skill utilization, and consequently we use the third method and define an individual as over-educated when he/she has more years of schooling than the mean for his/her occupation plus one standard deviation.10

8 We restrict the analysis to individuals born in the rest of Spain because these are the individuals who are more likely to have made the choice of learning Catalan, whereas for individuals born in Catalonia this choice is less clear. Accordingly, for individuals born in the rest of Spain, we have clear instrumental variables, such as assimilation variables (years since migration) and variables for the region of origin in Spain, which we do not have for individuals born in Catalonia. In 1991 the international migration rate was only 4%, too low to warrant including international immigrants.

9 Among those living in multi-personal households we do not include uncles, aunts, grandparents or other family members.

10 Appendix A.2 describes in greater detail the definition of over-education as well as the rest of the variables.

Knowledge of Catalan, classified into understanding, reading, speaking and writing, is self-reported. Because Catalan is linguistically close to Castilian, respondents may over-report their knowledge of Catalan. To alleviate the possible biases caused by self-reporting and linguistic closeness (Charette and Meng 1994), we classify individuals who claim to either understand, only speak, or only read Catalan as having a basic level of Catalan knowledge; individuals who report reading and speaking are placed in the intermediate level, while those who can write have a superior level of Catalan knowledge.11
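The three-level grouping just described can be illustrated with a minimal sketch. The function name, argument names and the residual "none" label are hypothetical and are not taken from the census files; the text does not say how respondents reporting no ability are coded.

```python
def catalan_level(understands: bool, speaks: bool, reads: bool, writes: bool) -> str:
    """Map self-reported abilities to the basic/intermediate/superior levels."""
    if writes:
        # Anyone who can write Catalan is placed in the superior level.
        return "superior"
    if reads and speaks:
        # Reading and speaking, but not writing: intermediate level.
        return "intermediate"
    if understands or speaks or reads:
        # Understands, only speaks, or only reads: basic level.
        return "basic"
    # Assumption: residual label for no reported ability (not named in the text).
    return "none"
```

Under this coding a respondent who only understands Catalan is "basic", while one who reads, speaks and writes is "superior".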

[Table 2 here]

Descriptive statistics for all variables by gender and census year are presented in Table 2, where we can see that over-education amounts to between 6 percent and 9 percent of the labor force. Gender differences in over-education rates are not systematic: while in 1991 over-education is less likely among females, the opposite is observed in 1996. However, the incidence of over-education decreases over time for both genders. This improvement in skill matching is associated with an increase in the educational attainment of the whole population in this period. We also observe that over-educated workers of both genders tend to be younger than workers on average and that the proportion of over-educated individuals in the service sector is above the average for the overall economy, although it decreases over time. Therefore, skill mismatch mainly affects young workers, especially in the service sector, and tends to diminish over time as workers' schooling increases.12

The descriptive analysis clearly suggests a positive relationship between over-education and Catalan knowledge. The proportion of individuals who read and speak Catalan, and of those who write it, is higher among the over-educated than among all individuals. Notice also that average knowledge of Catalan increases over time and that women are always more proficient in this language than men. Conversely, over-education is higher than the average for those who read and speak Catalan, and even higher for those who write Catalan.

11 In a recent paper, Ortega (2007) analyzes the determinants of speaking Catalan among people living in Catalonia but born in other regions of Spain or outside Spain, referred to as "immigrants". In general terms, the aim of that paper is to provide evidence that economic aspects can be of key importance in determining the language proficiency of immigrants. For the particular case of Catalonia, Ortega shows that the language proficiency of all immigrants increases with the proportion of people born in Catalonia who speak Catalan. Furthermore, the results suggest the existence of some complementarities between Castilian and Catalan, since Catalan knowledge is higher in municipalities with a high share of Castilian-speaking immigrants.

12 Although the subsample used is only for individuals not born in Catalonia, we also report, as a comparison, over-education rates of individuals born in Catalonia. Their rates are higher than those for individuals not born in Catalonia, as the former have more years of education than the latter.

Women represent 30 percent of the sample, but around 55 percent of them work in the service sector. As we will see below, this high female participation in the service sector may be the underlying reason for the over-education gender gap. Interestingly, average marriage rates are higher for men than for women. The percentage of the population directly affected by the Normalization process is higher for women than for men and is growing over time for both genders: from 0.58 percent in 1991 to 1.65 percent in 1996 for men, and from 1.09 percent in 1991 to 2.80 percent in 1996 for women.13 The population of Catalonia is strongly concentrated14: around 80 percent reside in the province of Barcelona, although this percentage is decreasing over time. More than two thirds come from Andalusia. Around one third of individuals in the sample arrived when they were no older than 10 years. These individuals arrived mostly at the end of the sixties and have been in Catalonia for an average of between 23 and 25 years in 1991, and between 27 and 28 years in 1996.

Data at the municipal level confirm the population's increasing Catalan proficiency and presence of individuals born in Catalonia, as well as an increase in the service sector's share in the economy and a decrease of employment rates for both genders.

In sum, the descriptive analysis suggests that knowledge of Catalan increases the probability of over-education, that is, that skill mismatch is mostly a problem of individuals who are proficient in Catalan. In the next section, we will see how this picture changes once we control for several socio-demographic attributes and account for endogeneity of Catalan knowledge.

13 In Appendix A.2 we define an individual as affected by the "Normalization" process if he or she was younger than 12 in 1984. This is the year when education in Catalan was introduced massively in Catalonia as part of Normalization.

14 We refer here to the whole population, including both the immigrants and the people born in Catalonia.

4. ESTIMATION

In this section we proceed to a more in-depth analysis of how over-education in Catalonia is related to Catalan knowledge. For this purpose, we first perform Probit estimations for 1991 and 1996, for males and females separately. As explanatory variables we include personal characteristics such as schooling and its square, age and its square, an interaction term between schooling and age, marital status, and a set of occupational and sectoral variables. Furthermore, we control for local labor market characteristics by including as explanatory factors the employment rate in the municipi, the share of those employed in the service sector in the municipi, and a set of regional dummies. Finally, we repeat these estimations controlling for endogeneity, for the intermediate (reading and speaking) and superior (writing) levels of Catalan knowledge.

4.1 Over-Education by Level of Language Knowledge

Table 3 shows the representative and average discrete effects and Catalan premia obtained from standard Probit models for over-education, presented separately for the intermediate and superior level of Catalan knowledge. Discrete effects are defined as the variation in the probability of over-education produced by a discrete variation in the level of Catalan knowledge. A representative individual's discrete effects are called representative discrete effects, whereas average discrete effects correspond to the average of discrete effects over all individuals.15
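The distinction between the two kinds of discrete effect can be made concrete with a small numerical sketch. All numbers below are hypothetical and are not estimates from Table 3; the coefficient `gamma` on the Catalan dummy and the individual Probit indices are invented purely to show how the two quantities are computed and why they can differ.

```python
import math

def norm_cdf(z: float) -> float:
    """Standard normal CDF, the link function of the Probit model."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def discrete_effect(index: float, gamma: float) -> float:
    """Change in P(over-educated) when the language dummy switches from 0 to 1."""
    return norm_cdf(index + gamma) - norm_cdf(index)

# Hypothetical values, not taken from the paper.
gamma = -0.3                          # assumed coefficient on Catalan knowledge
indices = [-3.0, -2.5, -1.0, -0.5]    # assumed x_i'beta for four individuals

# Representative effect: evaluate once, at the average index.
mean_index = sum(indices) / len(indices)
representative = discrete_effect(mean_index, gamma)

# Average effect: evaluate for each individual, then average.
average = sum(discrete_effect(v, gamma) for v in indices) / len(indices)
```

With these illustrative inputs both effects are negative and the average effect is larger in absolute value than the representative one, because the normal CDF is nonlinear in the index.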

[Table 3 here]

This table shows that representative discrete effects of Catalan knowledge on the individual likelihood of being over-educated are negative and significant for almost all groups, but not very large. Average discrete effects are also found to be negative and with substantially larger absolute values, but, unlike representative discrete effects, they are clearly not significant. For example, for a woman of average socio-demographic characteristics, reading and speaking Catalan in 1996 decreases the probability of her being over-educated by 0.03 percentage points, and this effect is significantly different from zero. However, for all women in 1996, reading and speaking Catalan decreased the probability of being over-educated on average by 2.08 percentage points, and this effect is not significantly different from zero.

15 Representative and average effects differ because of Jensen's inequality: for a nonlinear function $F$, $E[F(x)] \neq F(E[x])$ in general. In turn, average discrete effects have higher standard deviations than representative ones, because the former also include the variation across individuals. That is why average effects are more likely to be statistically non-significant.

Both representative and average effects of language knowledge on over-education are larger for women than for men, and are increasing over time for the former and decreasing for the latter. Interestingly, these effects increase with the level of proficiency in the language for both genders in 1991, but in 1996 the opposite is true, as the effects are smaller for writing than for reading Catalan in that year.

These results suggest the existence of some degree of substitution between language skills and educational attainment: workers with Catalan knowledge are more likely to occupy jobs whose educational requirements better match their educational attainment. Notice, however, that this negative effect of Catalan knowledge on over-education is only significant at the individual level. On average, because of the strong variation across individuals, one cannot reject the hypothesis that this effect is zero.

4.2 Endogeneity of Language Knowledge

The previous estimations constitute important evidence of the effect of Catalan knowledge on over-education. They provide unbiased results if language knowledge is an exogenous variable, i.e. if language is merely an ethnic attribute that signals membership of a given community, as assumed by the early studies on the economics of language (Becker 1957 and Raynauld and Marion 1972; see also Grin 2003). These estimations, however, do not account for the possible endogeneity of language knowledge. The effect of Catalan on over-education estimated by a standard Probit may be biased and ultimately driven by attributes that account for both language knowledge and over-education. To obtain an unbiased estimate of this effect, we estimate the probability of being over-educated conditional on the level of Catalan knowledge for the four subgroups, by gender and Census year, accounting for the selection into knowing Catalan.

We proceed in a two-step estimation, as in Willis and Rosen (1979). In the first-stage estimation, selection into Catalan knowledge is explained by the same variables as in the over-education equation, described above, augmented by the following: the percentage of individuals born in Catalonia and the percentage of individuals who write Catalan in the municipi; a dummy variable indicating whether the individual was affected by the Normalization process; whether the individual arrived in Catalonia before age 10; the number of years since migration; an interaction term between years since migration and whether the individual arrived before age 10; and, finally, whether the individual was born in Andalusia, Valencia and Balearics, or La Franja. As exogenous sources of variation, these variables allow us to identify the recursive bivariate Probit model (Maddala 1983); thus, they are included only in the language selection equation and excluded from the over-education equations. The variables assumed to affect Catalan knowledge, but not directly the probability of over-education of individuals who were not born in Catalonia, are the externality effect of the community of residence on Catalan knowledge; the exposure to a Catalan language environment (captured by years since migration, age at migration, and whether the individual was affected by the change in the language of schooling); and the language predominant in the individual's region of origin, as Catalan is also known in certain regions of Spain.16
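For reference, a recursive bivariate Probit of the kind cited above is commonly written as follows. This is only a sketch in assumed notation: the symbols $C_i$ (Catalan knowledge), $O_i$ (over-education), $x_i$ (common regressors), $z_i$ (the excluded variables listed above), $\delta$, $\beta$, $\gamma$ and $\rho$ are introduced here for illustration and do not appear in the original text.

```latex
\begin{align}
C_i^{*} &= x_i'\delta_1 + z_i'\delta_2 + u_i, & C_i &= \mathbf{1}\,[\,C_i^{*} > 0\,], \\
O_i^{*} &= x_i'\beta + \gamma\, C_i + \varepsilon_i, & O_i &= \mathbf{1}\,[\,O_i^{*} > 0\,], \\
\begin{pmatrix} u_i \\ \varepsilon_i \end{pmatrix}
&\sim N\!\left(
\begin{pmatrix} 0 \\ 0 \end{pmatrix},
\begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}
\right).
\end{align}
```

In this formulation the exclusion of $z_i$ from the second equation provides identification beyond functional form, and $\rho = 0$ corresponds to the case in which language knowledge can be treated as exogenous in a standard Probit.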

[Table 4 here]

Using the estimated parameters of the language selection equation, we proceed to estimate the second stage: the probability of being over-educated conditional on a given Catalan proficiency level, which is reported in Table 4. Individuals with more schooling or who are younger are more likely to be over-educated, though at decreasing rates in both cases. The cross-effect of age and schooling reinforces this effect; individuals with more schooling and who are younger are more likely to be over-educated. Married men are more likely to be over-educated; however, the opposite is true for women: married women are less likely to be over-educated. Similarly, individuals who reside in municipis with higher employment rates are less likely to be over-educated, except for women in 1996, when the opposite is true, though not at a significant level. Finally, for the effect of the importance of services in the municipi on over-education, there is a clear-cut gender difference: whereas men who reside in municipis where services are relatively more important are less likely to be over-educated, women who reside in those municipis are more likely to be over-educated. This last result may arise from the greater importance of services for the female labor force.

16 In Appendix A.3 we discuss the first-stage estimation on Catalan knowledge selection in greater detail.

Notice that unobserved characteristics that increase selection into Catalan knowledge are negatively correlated with unobservables that make over-education more likely. This effect is captured in the correlation coefficient between the Catalan and the over-education equation, which is found to be significant for most subsamples. Both its standard error and the likelihood ratio test reject the hypothesis that Catalan knowledge is an exogenous variable in the estimation of over-education.

[Table 5 here]

We compare these results with those of the simple Probit estimation. Table 5 shows the average discrete effects for standard Probit and bivariate Probit estimations for over-education, by census year, gender and Catalan knowledge. Accounting for endogeneity of Catalan knowledge tends to decrease the absolute value of the average discrete effects for both genders, both years and both levels of language knowledge, except for women who do not write Catalan in 1996. Thus, correcting for endogeneity yields lower effects of language on over-education.

Note also that the effect of language on over-education is generally decreasing over time for men, but increasing for women, so that their effects end up overtaking those of men. While the effects for men are higher in 1991, they are higher for women in 1996. These effects also increase with the level of Catalan proficiency and are higher for individuals who know Catalan than for individuals who do not know it. Returns to language in terms of reducing over-education may decrease over time for men, but they may be increasingly important for women, who are more likely to work in the service sector, where communication skills, and therefore language, are more valued.

[Table 6 here]

Table 6 presents the predicted probabilities of being over-educated by Catalan reading and speaking, and writing skills, based on the previous estimations. As shown in the descriptive statistics, for all groups the actual probability of being over-educated is substantially higher for individuals who know Catalan. However, this comparison does not reveal much, because it is made across groups with different individual attributes. A valid comparison of possible outcomes should be made for the same subgroup of the population, and that is possible after recovering the parameters of the language and the over-education equations. This is contained in the average discrete effects of Catalan knowledge, which are found to be negative, suggesting that Catalan knowledge indeed reduces the likelihood of over-education. For instance, the illusion that Catalan knowledge increases over-education may be a result of naively comparing the probabilities of over-education, say for men who write and for men who do not write Catalan in 1991: 17.33 percent and 8.41 percent, respectively. In contrast, Catalan writing skills decrease over-education by 3.91 percentage points for individuals who write Catalan and by 1.72 percentage points for individuals who do not write Catalan. Interestingly, the difference in these returns by actual language knowledge conforms to the theory of comparative advantage, that is, individuals with higher returns to language knowledge are those who actually know it. However, the standard deviation of these effects is so high that they end up being not significant: one cannot reject the hypothesis that all these effects are zero.
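The contrast between the naive comparison and the estimated effects can be restated as simple arithmetic, using only the four figures quoted above for men and writing skills in 1991. The variable names are illustrative.

```python
# Figures quoted in the text for men, Catalan writing skills, 1991.
p_over_writers = 17.33       # percent over-educated among men who write Catalan
p_over_nonwriters = 8.41     # percent over-educated among men who do not
effect_on_writers = -3.91    # estimated effect, percentage points
effect_on_nonwriters = -1.72 # estimated effect, percentage points

# Naive comparison across groups: difference in raw probabilities.
naive_gap = round(p_over_writers - p_over_nonwriters, 2)

# The naive gap is positive, while both estimated effects are negative.
signs_disagree = naive_gap > 0 and effect_on_writers < 0 and effect_on_nonwriters < 0
```

The raw gap of 8.92 percentage points in favour of non-writers has the opposite sign from both estimated effects, which is the "illusion" the paragraph describes.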

In short, selection does matter; neglecting language selection leads to overestimating the effect of Catalan on over-education. In no case does accounting for selection change our result that the effects of language knowledge on over-education are on average negative but systematically non-significant.

5. CONCLUSIONS

Catalonia provides a good opportunity to assess the effect of language skills on over-education in a multilingual economy. The Catalan language, formerly confined to informal uses, became co-official alongside Castilian (Spanish), as well as the language of instruction, in the early eighties. This change resulted in an important increase in the knowledge and use of Catalan that did not, however, undermine the intensive use of Castilian in most spheres of communication.

At first glance, descriptive evidence suggests that individuals with better language knowledge appear more likely to be over-educated. A more detailed analysis consists of estimating a model that controls for several socio-demographic attributes of workers. In such a model, we find that language knowledge has in fact a negative effect on over-education: individuals who are less fluent in Catalan accept jobs with lower educational requirements. This effect is robust to accounting for endogeneity of language knowledge and significant for a representative individual. However, this negative effect is so heterogeneous across individuals that once we compute average discrete effects of language knowledge on over-education, it becomes mostly non-significant.

Our results have implications for contemporary language policy issues, especially in countries that have undergone a change in the language of instruction, such as Morocco from French to Arabic, Wales from English to Welsh, Estonia from Russian to Estonian, or Puerto Rico from English to Spanish. These economies may not be using their human resources efficiently, as individuals who are not fluent in the current language of communication may be working in jobs for which they are over-qualified. Our findings based on evidence for Catalonia show that language knowledge does reduce skill mismatch.

APPENDIX

A.1. SAMPLE SELECTION

The following table illustrates the importance of the selection criteria in constructing the sample.

Selection criteria, in order of application:
  Total sample
  Only main household members: parents and children
  Only individuals between 16 and 60 years old
  Only Spaniards
  Only if arrival in Catalonia available
  Only individuals in the labor force
  Only if Catalan language variable available
  Only if born outside Catalonia
  Selected sample

1991: 250 000; -17 654; -82 297; -5 740; -47 421; -25; -49 810; 47 053

1996: 250 000; -17 903; -81 770; -4 745; -3 788; -44 809; -27 942; 69 043

Each column starts from the total sample and ends with the selected sample; the intermediate entries are the observations dropped by the criteria that applied in that census year.
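The sample-selection arithmetic can be checked with a short sketch. It assumes that every intermediate entry is an exclusion, including the 1991 entry 17 654, which carries no sign in the table; the names `EXCLUSIONS` and `selected_sample` are illustrative only.

```python
# Exclusions per census year, in the order the appendix lists them.
# Assumption: the unsigned 17 654 in the 1991 column is an exclusion like the rest.
EXCLUSIONS = {
    1991: [17_654, 82_297, 5_740, 47_421, 25, 49_810],
    1996: [17_903, 81_770, 4_745, 3_788, 44_809, 27_942],
}
TOTAL = 250_000
REPORTED_SELECTED = {1991: 47_053, 1996: 69_043}


def selected_sample(total, exclusions):
    """Return the sample size left after subtracting every exclusion."""
    return total - sum(exclusions)


for year, drops in EXCLUSIONS.items():
    remaining = selected_sample(TOTAL, drops)
    # Compare the recomputed size with the size reported in the appendix.
    print(year, remaining, remaining == REPORTED_SELECTED[year])
```

Under that assumption the recomputed sizes coincide with the reported selected samples in both census years.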

A.2. DEFINITION OF THE VARIABLES

The construction of each variable is explained below.

Over-education.-A worker is defined as over-educated if his/her years of schooling are above the mean educational level of the corresponding occupation (three-digit classification) plus one standard deviation. Adequately educated workers are those whose educational level is higher than the mean educational level of the corresponding occupation minus one standard deviation and lower than the mean educational level plus one standard deviation.
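As an illustration of this rule, the following minimal sketch classifies workers within each occupation. It assumes the population standard deviation (`statistics.pstdev`), which the text does not specify, and uses made-up occupation codes and schooling values; workers outside both definitions are labelled "other" because the text defines only the over-educated and adequately educated groups.

```python
from collections import defaultdict
from statistics import mean, pstdev


def classify_education(workers):
    """Label each (occupation_code, years_of_schooling) pair.

    'over'     : schooling above occupation mean + 1 SD
    'adequate' : schooling strictly between mean - 1 SD and mean + 1 SD
    'other'    : everything else (not defined in the text)
    """
    by_occupation = defaultdict(list)
    for occupation, years in workers:
        by_occupation[occupation].append(years)

    # Assumption: population SD; the paper does not say which SD is used.
    stats = {occ: (mean(v), pstdev(v)) for occ, v in by_occupation.items()}

    labels = []
    for occupation, years in workers:
        m, sd = stats[occupation]
        if years > m + sd:
            labels.append("over")
        elif m - sd < years < m + sd:
            labels.append("adequate")
        else:
            labels.append("other")
    return labels


# Hypothetical three-digit occupation codes and schooling years.
workers = [(101, 8), (101, 8), (101, 8), (101, 8), (101, 16),
           (202, 6), (202, 10), (202, 14)]
print(classify_education(workers))
```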

Schooling.-The census reports the maximum level of studies attained by the individual. To each level, we assign the number of years of schooling.

Age.-It is the census year, 1991 or 1996, respectively, minus the year of birth.

Normalization.-If the individual was younger than 12 years old in 1984, this dummy variable takes the value of one and zero otherwise.

Married.-This variable takes the value of one if the respondent reports being currently married; it is zero if the respondent reports being a widow(er), separated, or divorced.

Residence variables.-The census reports the municipi and the Province of residence for each individual. With this information we construct dummies for Lleida, Girona and Tarragona.

YSM (Years since Migration).-The census reports the year of arrival in Catalonia. YSM is the census year minus this year. We also construct a dummy indicating whether the individual arrived at age 9 or younger.
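The Age, Normalization, and YSM definitions above can be sketched as follows. This is a minimal illustration that assumes ages are computed from calendar years only (months are ignored); the function name and dictionary keys are hypothetical.

```python
def derived_variables(census_year, birth_year, arrival_year=None):
    """Build Age, Normalization, YSM and the early-arrival dummy from census years."""
    # Assumption: age in 1984 is approximated by 1984 minus the year of birth.
    age_in_1984 = 1984 - birth_year
    out = {
        "age": census_year - birth_year,
        "normalization": 1 if age_in_1984 < 12 else 0,
    }
    if arrival_year is not None:  # only individuals with a reported arrival year
        out["ysm"] = census_year - arrival_year
        out["arrived_by_9"] = 1 if arrival_year - birth_year <= 9 else 0
    return out


# Hypothetical respondent: 1991 census, born in 1975, arrived in 1980.
print(derived_variables(1991, 1975, 1980))
```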

Municipal variables.-We use the residence variable to assign to each individual the corresponding information of the municipi.

Occupation.-This variable is literally called “occupation, profession or craft” and is specific to the interviewed individual. It is coded according to the Catalan Code of Occupations (CCO-94). We aggregate the original three-digit classification of occupations into 4 groups: (1) agricultural occupations (reference group), (2) industrial occupations, (3) trade, service and professional occupations.

Activity.-This variable provides information on the industrial sector of the firm where the worker performs his or her work. It is coded according to the Catalan Classification of Economic Activities (CCAE-93). This classification is also provided at a three-digit disaggregation, which we group into four categories: (1) agricultural activities (reference group), (2) industrial activities, (3) trade activities, (4) service activities.17

A.3. CATALAN KNOWLEDGE

The estimates of the Catalan knowledge selection equation are reported in Table A.1. The covariates for these estimations are the variables explained above plus squared and interaction terms. All estimations exhibit a fairly good fit. The probability of knowing Catalan is increasing in both schooling and age (except for writing Catalan), but at a decreasing rate.

[Table A.1 here]

Individuals are more likely to know Catalan, especially to read and speak it, if they are single, live outside Barcelona, or live in areas with a higher density of individuals who are employed (particularly in non-service activities), know Catalan, or were born in Catalonia. Being affected by Normalization also increases the probability of knowing Catalan. This effect is weaker in 1996 than in 1991, greater for men than for women, and greater for reading and speaking than for writing skills. Early arrival in Catalonia (before age 10) and greater exposure to the local culture, captured by years since migration, make language assimilation more likely; this effect is stronger for individuals who arrived at a mature age. Individuals coming from Andalusia, particularly women, have a lower probability of knowing Catalan, whereas the probability is higher for individuals born in Catalan-speaking areas, such as Valencia, the Balearics and La Franja.

17 Our exact grouping for the classification of occupations and activities is available upon request.


Table 1. Over-education (in %) and Education (in Years) in Spanish regions

Region                  Over-education   Education
Basque Country                    25.0       7.684
Navarra                           22.7       7.519
Madrid                            21.5       7.739
Aragon                            21.5       7.087
Cantabria                         21.1       7.253
Balearic Islands                  19.6       6.535
Castile-Leon                      19.4       7.018
Asturias                          18.8       7.120
Ceuta and Melilla                 18.8       6.326
Catalonia                         18.8       7.061
La Rioja                          17.0       6.867
Valencian Country                 15.7       6.494
Andalucia                         15.3       5.935
Canary Islands                    15.2       6.238
Murcia                            14.1       6.234
Castile-La Mancha                 13.7       5.882
Galicia                           13.7       6.560
Extremadura                       12.5       5.729

Source: Survey of Economic Activity (EPA) 1996

Table 2: Summary Statistics

Census Year                                          1991            1996
Gender                                            Men    Wom.     Men    Wom.
% Over-educated                                  9.33    8.01    5.85    7.05
% Over-educated: Service Sector                 17.67   12.78   11.56   12.02
% Over-educated born in Catalonia               17.54   13.08   11.40   12.10
Average Years of Schooling of
  All                                            6.51    6.96    7.45    7.97
  Over-educated                                 11.78   12.13   12.84   13.72
  Born in Catalonia                              8.52    8.20    9.11    9.66
  Over-educated and born in Catalonia           12.08   13.03   12.74   13.63
Average Age of
  All                                           42.47   39.61   44.33   41.96
  Over-educated                                 37.47   34.07   41.21   37.88
% Read and Speak Catalan among
  All                                           30.78   39.36   41.74   52.04
  Over-educated                                 43.16   50.96   61.24   69.86
% Write Catalan among
  All                                           10.38   17.41   14.17   24.83
  Over-educated                                 19.26   29.16   30.12   47.95
% Over-education among those who
  Read and Speak Catalan                        13.09   10.37    8.58    9.47
  Write Catalan                                 17.32   13.41   12.43   13.62
% Women                                                 29.06           31.93
% Women working in the Service Sector                   54.95           55.10
% Married                                       84.03   73.10   83.58   72.39
% Normalized                                     0.58    1.09    1.65    2.80
Origin
  % Barcelona                                   82.31   83.18   80.54   79.73
  % born in Andalusia                           63.66   59.31   62.36   56.16
  % arrived age ≤ 9                             28.62   35.39   32.00   37.98
  Years since migration if not born in Cat      24.96   23.88   28.36   27.13
Municipi
  % writes Catalan in Municipi                  37.42   38.55   44.03   44.76
  % Catalan-born in Municipi                    63.90   64.82   65.48   66.06
  % work in Services in Municipi                51.73   52.65   58.09   59.26
  % employed over population in Municipi        36.98   37.28   36.15   36.28

Table 3. Over-Education Language Discrete Effects and Premia by Reading and Speaking, and Writing Skills (standard errors in parentheses)

Catalan Skill                 Reading and Speaking                      Writing
Census Year                  1991              1996              1991              1996
Gender                      Men     Wom.      Men     Wom.      Men     Wom.      Men     Wom.
Representative            -0.09    -0.16    -0.00    -0.03    -0.09    -0.15    -0.00    -0.02
                         (0.02)   (0.05)   (0.00)   (0.01)   (0.02)   (0.05)   (0.00)   (0.01)
Average                   -1.86    -1.09    -0.42    -2.08    -2.24    -1.18    -0.34    -1.68
                         (2.95)   (2.45)   (0.83)   (4.21)   (3.62)   (2.75)   (0.66)   (3.48)
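Table 3 contrasts the effect for a representative individual with the average effect across individuals. The following minimal sketch shows how the two can differ in a probit. It assumes a univariate probit index and purely hypothetical values for the language coefficient (`GAMMA`) and the individual indices (`indices`); none of these numbers are estimates from this paper, and the bivariate structure of the model is ignored.

```python
from math import erf, sqrt


def norm_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def discrete_effect(index, gamma):
    """Change in P(over-educated) when the language dummy goes from 0 to 1."""
    return norm_cdf(index + gamma) - norm_cdf(index)


GAMMA = -0.3                              # hypothetical language coefficient
indices = [-2.5, -2.0, -1.5, -1.0, -0.5]  # hypothetical x'beta, one per individual

# Effect evaluated at the mean index (one notion of a "representative" individual).
representative = discrete_effect(sum(indices) / len(indices), GAMMA)

# Average of the individual effects over the sample.
average = sum(discrete_effect(x, GAMMA) for x in indices) / len(indices)

print(f"representative: {representative:.4f}, average: {average:.4f}")
```

Because the normal CDF is nonlinear, the effect at the mean index generally differs from the mean of the individual effects, and the dispersion of the individual effects also feeds into the standard error of the average.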

Table 4. Over-Education by Reading and Speaking, and Writing Skills (standard errors in parentheses)

Catalan Skill                   Reading and Speaking                      Writing
Census Year                    1991              1996              1991              1996
Gender                        Men     Wom.      Men     Wom.      Men     Wom.      Men     Wom.
Constant                    -6.49    -7.90   -17.85   -14.72    -6.57    -7.98   -17.85   -14.72
                           (0.56)   (1.05)   (1.13)   (0.00)   (0.56)   (1.05)   (1.13)   (0.00)
Schooling ×10⁻⁴              1.15     1.02     2.88     1.51     1.16     1.03     2.88     1.51
                           (0.04)   (0.06)   (0.13)   (0.15)   (0.04)   (0.07)   (0.13)  (0.153)
Schooling² ×10⁻⁶            -3.01    -2.16   -10.20    -4.79    -3.03    -2.21   -10.20    -4.79
                           (0.12)   (0.18)   (0.50)   (0.57)   (0.12)   (0.18)   (0.50)   (0.57)
Age ×10⁻¹                   -0.40     0.02    -0.23    -0.35    -0.39    -0.01    -0.22    -0.35
                           (0.15)   (0.25)   (0.25)   (0.32)   (0.15)   (0.25)   (0.25)   (0.32)
Age² ×10⁻³                   0.43     0.20     0.17     0.16     0.42     0.23     0.15     0.16
                           (0.19)   (0.31)   (0.29)   (0.39)   (0.19)   (0.31)   (0.29)   (0.39)
Age × Schooling ×10⁻⁴       -7.56   -30.42     3.21     8.76    -8.05   -30.19     3.70     8.85
                           (6.31)   (9.34)  (10.95)  (14.85)   (6.32)   (9.33)  (10.96)  (14.85)
Married ×10⁻¹                1.22    -0.56     0.46    -0.50     1.22    -0.51     0.45    -0.47
                           (0.49)   (0.64)   (0.65)   (0.67)   (0.49)   (0.64)   (0.65)   (0.67)
% Mun(a) Employed           -1.82    -1.35    -0.27     1.61    -1.79    -1.36    -0.27     1.60
                           (1.03)   (1.71)   (1.16)   (1.51)   (1.03)   (1.71)   (1.16)   (1.52)
% Mun(a) Services           -0.12     0.49    -0.50     0.23    -0.13     0.49    -0.50     0.23
                           (0.18)   (0.28)   (0.25)   (0.32)   (0.18)   (0.28)   (0.25)   (0.32)
ρ                           -0.13    -0.14    -0.05    -0.19    -0.13    -0.14    -0.05    -0.17
                           (0.02)   (0.04)   (0.03)   (0.04)   (0.03)   (0.04)   (0.04)   (0.04)
LRT (ρ=0)                   28.89    10.37     2.25    18.76    22.22    12.33     1.59    15.17

(a) Percentage in municipi.
Note: All estimations include dummy variables for activities, occupations and province of residence.
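To read the LRT (ρ=0) row, one can compare each statistic with a chi-square critical value. The sketch below assumes that the test imposes a single restriction, so the reference distribution is chi-square with one degree of freedom (5% critical value 3.841), and that the statistics follow the column order of Table 4; both are assumptions rather than statements from the text.

```python
# Assumption: one restriction (rho = 0), hence chi-square with 1 degree of freedom.
CHI2_1DF_5PCT = 3.841

# LRT statistics from Table 4, keyed by (skill, census year, gender).
LRT = {
    ("Reading and Speaking", 1991, "Men"): 28.89,
    ("Reading and Speaking", 1991, "Women"): 10.37,
    ("Reading and Speaking", 1996, "Men"): 2.25,
    ("Reading and Speaking", 1996, "Women"): 18.76,
    ("Writing", 1991, "Men"): 22.22,
    ("Writing", 1991, "Women"): 12.33,
    ("Writing", 1996, "Men"): 1.59,
    ("Writing", 1996, "Women"): 15.17,
}


def rejects_rho_zero(stat, critical=CHI2_1DF_5PCT):
    """True when the LRT statistic exceeds the chosen critical value."""
    return stat > critical


not_rejected = [key for key, stat in LRT.items() if not rejects_rho_zero(stat)]
print("rho = 0 not rejected at 5% for:", not_rejected)
```

Under these assumptions, independence between the language and over-education equations is rejected in every column except men in 1996.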

Table 5: Over-Education Average Discrete Effect (in %) by Catalan skills (standard errors in parentheses)

Census Year                            1991                                1996
Gender                        Men              Women              Men              Women
Knowledge                  Knw.    D.knw     Knw.    D.knw     Knw.    D.knw     Knw.    D.knw
READING AND SPEAKING
Probit                    -2.88    -1.41    -2.00    -0.74    -0.66    -0.26    -3.21    -0.95
                         (3.30)   (2.64)   (3.08)   (1.20)   (0.97)   (0.67)   (4.93)   (2.88)
Biprobit                  -2.84    -1.39    -1.95    -0.70    -0.86    -0.31    -3.45    -0.90
                         (3.36)   (2.62)   (3.27)   (2.65)   (1.73)   (1.02)   (5.89)   (3.26)
WRITING
Probit                    -4.82    -1.94    -3.53    -0.92    -0.78    -0.26    -4.02    -0.98
                         (4.24)   (3.40)   (4.08)   (2.40)   (0.84)   (0.60)   (4.61)   (2.67)
Biprobit                  -3.91    -1.72    -3.01    -0.86    -1.06    -0.38    -3.89    -0.93
                         (3.44)   (2.99)   (3.66)   (2.55)   (1.43)   (0.91)   (4.85)   (3.02)
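A rough way to gauge the significance of these average discrete effects is the ratio of each estimate to its standard error. The sketch below does this for the Biprobit rows of Table 5, assuming a normal approximation with a two-sided 5% critical value of 1.96; this is an illustrative check, not necessarily the test used in the paper.

```python
# Biprobit average discrete effects and standard errors from Table 5,
# in column order: 1991 Men (Knw., D.knw), 1991 Women, 1996 Men, 1996 Women.
BIPROBIT_ADE = {
    "Reading and Speaking": (
        [-2.84, -1.39, -1.95, -0.70, -0.86, -0.31, -3.45, -0.90],
        [3.36, 2.62, 3.27, 2.65, 1.73, 1.02, 5.89, 3.26],
    ),
    "Writing": (
        [-3.91, -1.72, -3.01, -0.86, -1.06, -0.38, -3.89, -0.93],
        [3.44, 2.99, 3.66, 2.55, 1.43, 0.91, 4.85, 3.02],
    ),
}
CRITICAL_5PCT = 1.96  # assumption: normal approximation, two-sided 5% level


def z_ratios(effects, std_errors):
    """Estimate divided by standard error, element by element."""
    return [e / s for e, s in zip(effects, std_errors)]


largest = 0.0
for skill, (effects, std_errors) in BIPROBIT_ADE.items():
    ratios = z_ratios(effects, std_errors)
    largest = max(largest, max(abs(z) for z in ratios))
    print(skill, [round(z, 2) for z in ratios])

print("any |z| above 1.96:", largest > CRITICAL_5PCT)
```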

Table 6: Probability of Over-education and Average Discrete Effects (in percent) by Catalan skills (standard errors in parentheses)

Census Year                            1991                                1996
Gender                        Men              Women              Men              Women
Knowledge                  Knw.    D.knw     Knw.    D.knw     Knw.    D.knw     Knw.    D.knw
READING AND SPEAKING
Actual                    13.09     7.66    10.37     6.48     8.58     3.89     9.47     4.43
Pred: Know                14.13     6.20     7.83     2.47     9.65     3.69     7.22     2.05
                        (21.95)  (15.46)  (19.24)  (12.10)  (18.99)  (13.59)  (16.72)  (13.04)
Pred: DKnow               16.97     7.58     9.78     3.16    10.51     4.00    10.66     2.95
                        (24.53)  (17.58)  (21.70)  (14.45)  (20.00)  (14.35)  (20.84)  (15.69)
Aver. Discr. Effects      -2.84    -1.39    -1.95    -0.70    -0.86    -0.31    -3.45    -0.90
                         (3.36)   (2.62)   (3.27)   (2.65)   (1.73)   (1.02)   (5.89)   (3.26)
WRITING
Actual                    17.33     8.41    13.41     6.87    12.43     4.76    13.62     4.89
Pred: Know                19.79     6.65    12.29     2.63    14.31     4.63    10.70     2.46
                        (24.42)  (15.71)  (20.82)  (10.26)  (21.39)  (13.73)  (17.00)   (9.86)
Pred: DKnow               23.71     8.37    15.30     3.49    15.37     5.01    14.59     3.38
                        (26.87)  (18.09)  (23.60)  (12.31)  (22.23)  (14.46)  (20.98)  (12.61)
Aver. Discr. Effects      -3.91    -1.72    -3.01    -0.86    -1.06    -0.38    -3.89    -0.93
                         (3.44)   (2.99)   (3.66)   (2.55)   (1.43)   (0.91)   (4.85)   (3.02)
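The average discrete effects in Table 6 can be read as the predicted probability of over-education when knowing Catalan minus the predicted probability when not knowing it. The sketch below checks this reading against the reported values; the tolerance of 0.015 percentage points is an assumption meant to absorb rounding in the published figures.

```python
# Values from Table 6, in column order:
# 1991 Men (Knw., D.knw), 1991 Women, 1996 Men, 1996 Women.
TABLE6 = {
    "Reading and Speaking": {
        "pred_know": [14.13, 6.20, 7.83, 2.47, 9.65, 3.69, 7.22, 2.05],
        "pred_dknow": [16.97, 7.58, 9.78, 3.16, 10.51, 4.00, 10.66, 2.95],
        "ade": [-2.84, -1.39, -1.95, -0.70, -0.86, -0.31, -3.45, -0.90],
    },
    "Writing": {
        "pred_know": [19.79, 6.65, 12.29, 2.63, 14.31, 4.63, 10.70, 2.46],
        "pred_dknow": [23.71, 8.37, 15.30, 3.49, 15.37, 5.01, 14.59, 3.38],
        "ade": [-3.91, -1.72, -3.01, -0.86, -1.06, -0.38, -3.89, -0.93],
    },
}
TOLERANCE = 0.015  # assumption: allows for rounding to two decimals


def max_gap(block):
    """Largest absolute gap between (Pred: Know - Pred: DKnow) and the reported effect."""
    diffs = [k - d for k, d in zip(block["pred_know"], block["pred_dknow"])]
    return max(abs(diff - ade) for diff, ade in zip(diffs, block["ade"]))


for skill, block in TABLE6.items():
    print(skill, "consistent:", max_gap(block) <= TOLERANCE)
```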

Table A1: Language Equation (First Stage Estimation) Standard errors in small fonts Language Skills Census Year Gender

Intermediate: Reading and Speaking 1991 1996 Men Wom Men -3.10 0.32

0.34

0.40

0.53

0.39

0.42

0.48

0.58

Schooling ×10⁻² Schooling² ×10⁻⁴ Age ×10⁻¹ Age² ×10⁻³ Age × Schooling ×10⁻⁴ Married ×10⁻¹ Lleida resident ×10⁻¹ Girona resident ×10⁻¹ Tarragona resident ×10⁻¹ % Muna Employed

27.01

35.07

32.94

37.29

32.85

36.98

34.64

38.05

2.15

2.55

3.05

3.96

% Muna Born in Catalonia % Muna Write Catalan Normalizedb Arrived younger than 10 YSMc

-3.54

-3.82

1.75

2.16

2.50

3.48

-67.92

-86.33

-74.23

-91.27

-1.20

-2.68

1996 Men Wom

Constant

% Muna Services

-4.19

Wom

Advanced: Writing 1991 Men Wom

-63.67 -72.15

-0.94

-2.20

-50.44 -73.00

5.22

6.37

8.07

11.23

6.45

7.62

10.88

14.18

-0.17

0.28

-0.29

-0.21

-0.88

-0.35

-1.15

-1.10

0.10

0.10

0.13

0.17

0.12

0.13

0.15

0.19

0.00

-0.04

0.01

0.00

0.11

0.04

0.12

0.10

0.01

0.01

0.02

0.02

0.01

0.01

0.01

0.02

-6.39

-14.28

-12.95

-10.78

-18.70 -18.62

-22.06 -11.73

3.12

3.75

3.88

5.37

3.67

4.33

4.32

5.62

-0.54

-1.75

0.24

-0.68

-2.31

-2.39

-1.58

-1.99

0.28

0.28

0.29

0.33

0.35

0.34

0.35

0.36

2.33

1.64

0.48

0.62

2.01

1.08

0.91

0.06

0.57

0.57

0.62

0.88

0.72

0.72

0.74

0.92

2.30

1.71

1.88

1.38

1.53

1.26

2.43

0.54

0.40

0.41

0.45

0.62

0.50

0.53

0.54

0.69

1.32

0.66

2.20

1.49

0.91

0.16

1.78

1.24

0.36

0.38

0.37

0.53

0.46

0.49

0.44

0.58

-0.21

-1.15

-0.85

0.15

-1.60

-1.24

-2.12

0.68

0.59

0.62

0.54

0.72

0.76

0.79

0.68

0.82

-0.82

-0.65

-0.21

-0.25

-1.10

-1.10

-0.77

-0.62

0.09

0.09

0.10

0.14

0.12

0.12

0.13

0.16

2.40

2.08

2.21

1.71

3.03

2.71

3.07

1.91

0.20

0.21

0.26

0.36

0.25

0.27

0.32

0.40

0.97

1.55

1.50

1.68

-0.32

0.04

-0.28

0.80

0.21

0.22

0.25

0.35

0.27

0.29

0.31

0.39

0.56

0.49

0.35

0.27

0.52

0.59

0.23

0.19

0.11

0.12

0.09

0.11

0.12

0.12

0.09

0.11

-4.27

-4.79

-4.09

-3.92

-1.78

-1.83

-0.92

-1.26

0.22

0.23

0.28

0.38

0.18

0.20

0.20

0.27

0.84

0.17

0.97

0.52

0.84

0.17

0.97

0.52

0.71

0.13

0.71

0.98

0.71

0.13

0.71

0.98

Arrived younger than 10× YSMc Born in Andalusia

0.21

0.29

0.78

0.63

0.21

0.29

0.78

0.63

0.57

0.36

0.02

0.89

0.57

0.36

0.02

0.89

Born in Valencia -Balearics Born in Franja

0.26

0.90

0.38

0.57

0.26

0.90

0.38

0.57

0.73

0.81

0.77

0.36

0.73

0.81

0.77

0.36

0.56

0.63

0.64

0.60

0.56

0.63

0.64

0.60

0.90

0.42

0.95

0.40

0.90

0.42

0.95

0.40

0.95

0.49

0.01

0.43

0.95

0.49

0.01

0.43

0.20

0.17

0.27

0.30

0.20

0.17

0.27

0.30

Pseudo R² 0.18 0.22 0.19 0.22 0.21 0.25 0.24 0.28

a Percentage in municipi; b Affected by Normalization policy; c YSM = Years since migration
