Decomposing the Gender Wage Gap with Sample Selection Adjustment: Evidence from Colombia 1

Alejandro Badel [email protected] Federal Reserve Bank of St. Louis

Ximena Peña [email protected] Universidad de Los Andes

June, 2009

Abstract

Despite a strong convergence in the distribution of labor market characteristics between men and women in Colombia, relative hourly wages between the genders have not converged in the same order of magnitude and a sizeable gender wage gap persists. We employ quantile regression techniques to examine the degree to which differences in the distribution of observable characteristics can explain the gender gap, and rather find that the gap is largely explained by differences in the rewards to human capital characteristics. The remaining gap after controlling for observable factors is unevenly spread across the distribution, primarily affecting women at the top and the bottom of the distribution. We find that self-selection is important, explaining roughly -50% of the gap, implying that able women self-select into work.

Keywords: Gender gap, semiparametric, quantile regression, selection.

JEL classification numbers: C21, J22, J31.

1 The authors are grateful to James Albrecht, Susan Vroman and participants at the Universidad de Los Andes and Rosario Seminars, NIP-Colombia and LACEA Conferences and BIARI 2009 for their comments. The usual disclaimer applies.

Introduction

In the past decades, the labor market indicators and characteristics of women in Colombia have improved relative to those of men. Whereas the participation rate of men has been fairly stable around 75%, participation rates for women almost doubled from 30% in 1976 to nearly 60% in 2006 and are now among the highest in Latin America (Duryea et al., 2001). Between 1984 and 2006 in our sample 2, male participation was around 95% for the whole period, whereas female participation increased from 40% to 70%. The median hours of work remained the same for both groups across time at 48 hours, which is the legal working week for manufacturing in the country. The fraction of women in the primary sector, industry and services rose from 15%, 29% and 43% to 27%, 37% and 51%, respectively. Comparing working men and women, females have substantially improved the characteristics that they bring to the labor market. First, the existing difference in potential experience has receded due to the increase in the participation rate of mothers (Amador et al., work). Average years of potential experience for men and women went from 18.0 and 16.6 to 19.0 and 18.7, respectively. In terms of education, women widened their educational advantage over men; the average years of schooling rose from nearly 8.2 for men and 8.5 for women to 10.3 and 10.7. Furthermore, women reversed the education gap in college attainment and are now more educated than men (see Peña, 2006). Among those working, the ratio of the fraction of men with college education to the fraction of women with college education was 1.96 in 1986; by 2006 it was 0.94. Despite a strong convergence in the distribution of characteristics, relative hourly wages between the genders have not converged in the same order of magnitude and a sizeable gender wage gap persists.
The unconditional gender wage gap, that is, the difference in average wages between men and women, went from 23% to 14% during the same period 3. However, between 1986 and 2006 not only did the labor market characteristics of men and women change, but so did the market returns or ‘prices’ paid for such characteristics between genders. To isolate the effects of these two sources of change we calculated, using a mean wage regression with 1986 data, the average returns to observable characteristics in 1986. The gender wage gap implied by the 1986 returns and the characteristics of men and women in 2006 should be 2%; as mentioned earlier, the observed gap in 2006 was 14%. These facts appear puzzling because the gap did not fall much, although the other measures converged strongly, and therefore the unexplained portion of the gap increased substantially. In this paper we employ quantile regression techniques to study the gender gap. We find that the analysis of means is misleading. Higher moments of the distribution of characteristics go a long way in explaining the gender gap. Men are paid significantly more than women and the raw gap displays a U-shape: women's wages fall further below

2 See the data section for a full description of our sample.

3 The conditional gap, that is, the gender gap after controlling for labor market characteristics, has decreased from 20.5% at the beginning of the period to 11.4% in 2006.


men's at the extremes of the distribution whereas they are closer around the middle of the distribution. We employ the Machado-Mata (MM hereafter) decomposition technique to decompose the gap into a component due to differences in human capital characteristics such as education and age (the composition effect) and a component due to differences in the rewards to these characteristics (the price effect). We examine the degree to which the distribution of male and female characteristics can explain the gender gap in 2006 and find that it is largely explained by the price effect. The remaining gap after controlling for observable factors is unevenly spread across the distribution, primarily affecting women at the top and the bottom of the distribution. The MM technique has been used to decompose wage gaps across the distribution in several developed economies (see for example Albrecht et al., 2003, for Sweden; de la Rica et al., 2007, for Spain). Regarding developing countries, several papers calculate and decompose the gender wage gap along the distribution (see for example Ganguli and Terrell, 2005, for Ukraine; Ñopo, 2006, for Chile; Fernández, 2006, for Colombia). However, these papers do not control for sample selection, which is often an issue in these calculations. Albrecht et al. (2007, AVV in what follows) propose an extension of the MM technique to account for selection following Buchinsky (1998); this paper applies the AVV methodology. Although Colombia has one of the highest female labor participation rates in Latin America, self-selection of women into work is important, explaining roughly -50% of the gap. We find a positive selection effect, that is, able women self-select into work. The analysis of the gender wage gap is important for a developing country because it is a relevant measure of how unequal a society is. Colombia is a highly unequal country; its income inequality, as measured by the Gini index, is among the highest in Latin America.
Thus, gender gap calculations and decompositions are especially interesting. To the best of our knowledge, there are no papers that decompose the selection-corrected gender wage gap along the distribution for developing countries.

Descriptive Statistics and Data

We use the Colombian Household Survey (CHS), a repeated cross-section carried out by the Statistics Department. It collects information on demographic and socioeconomic characteristics such as gender, age, marital status and educational attainment, as well as labor market variables for the population aged 12 or more including occupation, job type, income and sector of employment. We use the June 1986, 1996 and 2006 waves to analyze the evolution of the raw gap and then focus on the last wave to perform the selection correction and decomposition exercises.


Our analysis focuses on the seven main cities, which account for 60% of the urban population; according to 2005 Census data, 78% of Colombians live in urban areas 4. In the seven main cities 93% of men between 25 and 55 years of age work, while only 69% of women do. When we compare Bogotá and the other cities, we find that even though the levels of male participation are comparable, women's participation is significantly higher in Bogotá: 75% vs. 65%. We use only observations with a complete set of covariates and restrict our sample to prime-aged individuals (between 25 and 55 years of age) who report working between 16 and 84 hours per week 5 and earn more than one dollar per day. Table 1 shows the sample selection for 2006, which leaves 15,423 observations, equivalent to nearly 4 million using weights, 47% of which are female. The sample was selected carefully to minimize measurement error in the log hourly wage.

Table 1: Sample Selection, April-June 2006

                                     No. Observations     Weighted     % Men
7 main cities, 12+ years                   46.439        14.200.850     0,44
Ages 25 to 55 years…                       23.915         6.047.089     0,43
who work,                                  16.513         4.302.923     0,51
report 16-84 hours per week                15.563         4.012.872     0,52
and earn more than US$1 per day.           15.423         3.978.580     0,52

In addition to the aforementioned differences in participation rates, men and women also display important differences in hours worked per month. Even though in our sample both have median hours of 208, men work on average 220 hours per month while women work 197 hours. The dependent variable is log hourly wage. The explanatory variables included in the estimations are: age and its square 6, four education groups 7, and dummies for marital status 8 and head of household. The descriptive statistics are summarized in Table 2. First, men earn higher mean hourly wages than women: the average log wage for men is 7.86 and

4 Bogotá accounts for 45% of the population in the 7 main cities but, given the design of the CHS, for only 15% of the sample. Sample weights are used to get representative results. Instead of extending the MM methodology to include sample weights we perform calculations for Bogotá and Elsewhere separately, and then we build the weighted distribution as follows:
a) Let $q_i$ be the percentiles of the log wage distributions for i = {Bogotá, Elsewhere}.
b) Calculate, in the other distribution j, the percentile levels at which $q_i$ lies and call these $P_i$. E.g. $P_{Bog} = F_{Bog}(q_{Else})$.
c) The percentiles $q_{Else}$ correspond to the $\Pr(z = Bog) \cdot P_{Bog} + (1 - \Pr(z = Bog)) \cdot (0.01, 0.02, 0.03, \dots, 0.99)$ percentile levels of the country distribution.
d) Obtain the country percentiles by linear interpolation.

5 The legally defined full time work is 48 hours per week in Colombia.

6 There is no available information in the survey regarding work experience, nor information about the number of births per woman; the latter is only identifiable for the head of household or spouse. Therefore, we use age and its square to proxy for experience instead of a transformation of age and schooling.

7 The education groups are: no completed education, completed primary, completed secondary and completed tertiary.

8 We summarize the marital status information into two categories: ‘together’, including individuals married or cohabiting, which we refer to as married, and ‘alone’, which includes the categories single, divorced, separated and widowed.
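The weighting procedure in footnote 4 amounts to inverting a mixture of the two city-group distributions. The sketch below is a simplified variant, not the authors' code: it evaluates the mixture CDF on the pooled sample values rather than at the Elsewhere percentiles, and the names `ecdf`, `country_percentiles` and `p_bog` are hypothetical.

```python
from bisect import bisect_right

def ecdf(sample_sorted, q):
    """Share of observations in the sorted sample that are <= q."""
    return bisect_right(sample_sorted, q) / len(sample_sorted)

def country_percentiles(bogota, elsewhere, p_bog, levels):
    """Percentiles of the mixture p_bog * F_bog + (1 - p_bog) * F_else.

    bogota, elsewhere : lists of (simulated) log wages for each group
    p_bog             : population share of Bogota (assumed known)
    levels            : percentile levels strictly between 0 and 1
    """
    bog, els = sorted(bogota), sorted(elsewhere)
    grid = sorted(set(bog + els))  # candidate wage values
    cdf = [p_bog * ecdf(bog, q) + (1 - p_bog) * ecdf(els, q) for q in grid]
    out = []
    for level in levels:
        # first grid point whose mixture CDF reaches the requested level
        k = next(i for i, c in enumerate(cdf) if c >= level)
        if k == 0 or cdf[k] == cdf[k - 1]:
            out.append(grid[k])
        else:
            # linear interpolation between neighbouring grid points (step d)
            w = (level - cdf[k - 1]) / (cdf[k] - cdf[k - 1])
            out.append(grid[k - 1] + w * (grid[k] - grid[k - 1]))
    return out
```

Working with the mixture CDF directly avoids carrying observation-level weights through the simulation, which appears to be the motivation given in the footnote.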


7.72 for women. There are sizeable differences between the traditional labor market characteristics of working and non-working women, which is suggestive of non-random selection into work. The distribution of age and schooling is very similar between men and working women. Working men and women have similar average age, whereas non-working women are nearly 2 years older. Working women are the most educated, followed by men and finally non-working women; the education distribution of working women first-order stochastically dominates that of working men, which in turn first-order stochastically dominates that of non-working women. Working men and non-working women display similar proportions of married individuals, 69% and 67% respectively, whereas only 48% of working women report being married. Males are more often head of household than females: 69% of men are head of household, while only 30% of working women and 17% of non-working women are.

Table 2. Descriptive Statistics, Wage Equation

                               Men         Women       Women
                               Working     Working     Not Working
Log Wage                       7,86        7,72
                               (0,76)      (0,82)
Age                            38,33       38,01       39,93
                               (8,57)      (8,34)      (9,17)
Education
  < Primary                    0,07        0,07        0,10
  Primary +                    0,34        0,31        0,39
  Secondary +                  0,41        0,40        0,40
  University                   0,18        0,22        0,10
Married                        0,69        0,48        0,67
Head of Household              0,69        0,30        0,17
Bogotá                         0,43        0,47        0,33
Home Ownership                 0,49        0,52        0,57
# Children 2-6yrs
  2                            0,18        0,13        0,15
  1                            0,03        0,02        0,02
# Children <1yr                0,04        0,02        0,02
Log Non-Earned Income          12,28       11,95       12,33
                               (1,35)      (1,31)      (1,40)
Log Other Family Income        13,43       13,77       13,67
                               (1,18)      (1,12)      (1,00)
No. Obs                        7.055       5.670       8.368

Note: Standard errors in parentheses.

The additional variables included in the selection equation, that determine the decision to work but not the wage, are home ownership, number of children between 2 and 6 years of age, presence of children under 1, personal non-earned income (NEI) and other family income (OFI). The selection equation is calculated only for women given the high participation rate of men. Again, working and non-working women have different sets of characteristics regarding these variables. Home Ownership is a dichotomous variable indicating whether the person owns the house they inhabit. A higher proportion of women not working tend to be home-owners as compared to those who work: 57% vs.


52%, respectively. A smaller percentage of working women has children: twice as many women not working have children under 1 as compared to women working, and there is a slightly higher fraction of non-working women with children between 2 and 6 years of age. NEI is defined as income not related to labor market activities: accrued interest rates, rentals, pensions, remittances and other concepts. A low percentage of women report positive NEI: 19% of women who do not work report positive NEI, while 14% of working women do. Of those who report strictly positive NEI, women not working report higher levels on average than those working. Finally, OFI is defined as the total household income minus the individual's total income. Not surprisingly, since a higher proportion is married, a higher percentage of non-working women report strictly positive levels vis-à-vis working ones, 87% vs. 80%. However, the average OFI for working women is higher than for non-working women. We estimate Quantile Regression (QR) equations where the log hourly wage is regressed on the specified set of covariates. Results are reported in Tables (3) and (4) for women and men, respectively. Included variables have the expected sign. Education has a monotonic effect on wages; since the left out category is Completed Primary, lower levels negatively affect wages whereas higher ones have a positive effect. Age is sometimes significant while Age squared appears not be. Being married or head of household positively affects wages.

Constant Age Age squared Married Head Education
20% 6,62*** (0,95) 0,01 (0,05) 0 (0,00) 0,21*** (0,07) 0,25*** (0,08)

Table 3: Mincer Equation, Women Bogota 40% 60% 80% 20% 7,18*** 7,33*** 7,2*** 5,82*** (0,38) (0,32) (0,43) (0,34) 0 0 0,01 0,04*** (0,02) (0,02) (0,02) (0,02) 0 0 0 0 (0,00) (0,00) (0,00) (0,00) 0,09*** 0,11*** 0,17*** 0,11*** (0,05) (0,03) (0,05) (0,03) 0,11*** 0,12*** 0,18*** 0,19*** (0,05) (0,03) (0,06) (0,04)

Elsewhere 40% 60% 6,64*** 7,13*** (0,19) (0,18) 0,02*** 0 (0,01) (0,01) 0 0 (0,00) (0,00) 0,04*** 0,05*** (0,02) (0,02) 0,09*** 0,1*** (0,02) (0,02)

80% 7,12*** (0,23) 0,01 (0,01) 0 (0,00) 0,12*** (0,02) 0,19*** (0,02)

-0,58*** (0,23) 0,34*** (0,08) 1,18*** (0,08)

-0,19* (0,12) 0,28*** (0,06) 1,14*** (0,10)

-0,32*** (0,04) 0,48*** (0,02) 1,27*** (0,03)

-0,27*** (0,04) 0,45*** (0,02) 1,48*** (0,03)

-0,23*** (0,07) 0,27*** (0,03) 1,35*** (0,05)

-0,19*** (0,06) 0,50*** (0,06) 1,59*** (0,07)

-0,22*** (0,06) 0,55*** (0,04) 1,38*** (0,04)

-0,32*** (0,03) 0,36*** (0,02) 1,29*** (0,03)

*** Significant to 99%, ** Significant to 95%, *Significant to 90%


Constant Age Age squared Married Head

20% 7,09*** (0,60) 0 (0,03) 0 (0,00) 0,09 (0,07) 0,07 (0,08)

Table 4: Mincer Equation, Men Bogota 40% 60% 80% 20% 7,7*** 7,36*** 6,99*** 6,74*** (0,40) (0,35) (0,49) (0,17) -0,02 0 0,03 0,01 (0,02) (0,02) (0,03) (0,01) 0 0 0 0 (0,00) (0,00) (0,00) (0,00) 0,06 0,05 -0,06 0,1*** (0,04) (0,04) (0,06) (0,02) 0,09 0,10*** 0,20*** 0,13*** (0,05) (0,04) (0,06) (0,03)

Education
-0,28** -0,34*** -0,25*** -0,22** -0,28*** (0,16) (0,09) (0,08) (0,13) (0,04) 0,32*** 0,24*** 0,33*** 0,47*** 0,32*** Secondary (0,06) (0,03) (0,04) (0,06) (0,02) 1,01*** 1,25*** 1,52*** 1,69*** 1,00*** College (0,11) (0,09) (0,05) (0,08) (0,03) *** Significant to 99%, ** Significant to 95%, *Significant to 90%

Elsewhere 40% 60% 6,91*** 6,83*** (0,14) (0,13) 0,02 0,03*** (0,00) (0,01) 0 0 (0,00) (0,00) 0,01 0 (0,02) (0,02) 0,12*** 0,16*** (0,01) (0,02)

80% 6,76*** (0,21) 0,04*** (0,01) 0 (0,00) 0,02 (0,03) 0,18*** (0,03)

-0,25*** (0,02) 0,27*** (0,02) 1,12*** (0,02)

-0,23*** (0,03) 0,40*** (0,02) 1,47*** (0,04)

-0,18*** (0,02) 0,29*** (0,01) 1,27*** (0,03)

Methodology

Machado-Mata Technique

The MM technique is a decomposition in the spirit of Oaxaca-Blinder extended to the analysis of full distributions. It uses Quantile Regressions to partition the wage gap into a `price' component (due to differences in the wage coefficients) and a `quantity' component (attributed to differences in labor market characteristics). We use a partial equilibrium assumption, assuming away the effect of changes in aggregate quantities of skills on skill prices, to build a counterfactual distribution, $w^{FM}$. The wage gap, $Q_\theta(w^M) - Q_\theta(w^F)$, is decomposed as

$$Q_\theta(w^M) - Q_\theta(w^F) = \left[ Q_\theta(w^M) - Q_\theta(w^{FM}) \right] + \left[ Q_\theta(w^{FM}) - Q_\theta(w^F) \right].$$

For example, to account for the effect of the differences in the distribution of covariates on the calculated gap, we build a counterfactual: the female wage density that would arise if we could endow women with men's labor market characteristics, but they were still paid like women. Thus, by taking the difference between the male distribution and the proposed counterfactual, we purge the effect of differences in characteristics and calculate the portion of the gap due to differences in returns, or the price effect. In the previous expression, therefore, the first term in square brackets is the price effect while the second is the composition effect. One could also perform a similar decomposition by calculating the alternative counterfactual distribution: the density that would prevail if women retained their own labor market characteristics but were paid like men.

Let us illustrate the first counterfactual distribution with an example. Suppose that for each individual i in either population of females, F, or males, M, we observe the log wage $w_i$ and a vector of covariates $x_i$. Further, assume that for each population j = M, F, the conditional θ-quantile of $w_i^j$, conditional on the set of covariates $x_i^j$, is given by $Q_\theta(w_i^j) = x_i^j \beta_\theta^j$. Then we can define the error term as $e_{\theta i}^j = w_i^j - x_i^j \beta_\theta^j$, where $e_{\theta i}^j$ is a random disturbance that satisfies $Q_\theta(e_{\theta i}^j) = 0$ by construction. The QR model for population F is $w_i^F = x_i^F \beta_\theta^F + e_{\theta i}^F$ and similarly for population M. The counterfactual distribution of wages, conditional on the observable variables x, is fully characterized by the conditional quantile process, that is, it can be described by viewing $Q_\theta(w \mid x)$ as a function of θ, just like ordinary sample quantiles characterize a given marginal distribution. Hence, realizations of $w_i$ given $x_i$ can be interpreted as independent draws from $Q_\theta(w_i \mid x_i)$ where θ is a uniform random variable in (0,1). Therefore, one can simulate a random sample of the (estimated) conditional distribution of wages at a given x: randomly drawing a series of θ's from a uniform distribution on (0,1), applying the probability integral transformation 9 and using the estimated QR coefficients, $\hat\beta_\theta^j$, to get wage estimates. Then, to 'integrate x out' and get a sample from the marginal wage distribution, one can draw a sample of covariates x from an appropriate distribution (Machado and Mata, 2005).

Let us continue the previous example to describe the workings of the procedure. Denote $w^{FM}$ as the counterfactual distribution of female log wages that would prevail if we maintained the returns to observable characteristics of women, but endowed women with the male distribution of labor market characteristics,

$$w_i^{FM} = x_i^M \beta_\theta^F + e_i^{FM} \quad \text{with} \quad Q_\theta(e_i^{FM}) = 0. \qquad (1)$$

To calculate the desired (counterfactual) conditional distribution of w given x, $w^{FM}$, we generate random draws as follows. Pick at random man i with covariates $x_i^M$ from the empirical distribution and quantile $\theta_i$ from the uniform (0,1) distribution. With the consistent estimator $\hat\beta_{\theta_i}^F$, form $\tilde w_i^{FM} \equiv x_i^M \hat\beta_{\theta_i}^F$. This way we can generate an arbitrarily large sample of draws from the conditional distribution of male labor market characteristics but paid as women.

Notice that under the true $\beta_{\theta_i}^F$, we have $w_i^{FM} - \tilde w_i^{FM} = e_{i\theta_i}^{FM}$ and so $Q_{\theta_i}(w_i^{FM}) = Q_{\theta_i}(\tilde w_i^{FM})$. Thus, if $\hat\beta_\theta^F$ is consistent, the empirical distribution of $\tilde w^{FM}$ is a consistent estimator of the empirical distribution of $w^{FM}$ since, as shown in AVV, the quantiles of $\tilde w^{FM}$ converge in probability to the quantiles of $w^{FM}$.
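The drawing procedure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: `beta_female` is a hypothetical user-supplied function standing in for the estimated female QR coefficients at each θ, and the order-statistic `quantile` rule is one simple choice among several.

```python
import random

def mm_counterfactual(x_male, beta_female, n_draws, seed=0):
    """Draw counterfactual log wages: male characteristics, female returns.

    x_male      : list of covariate vectors (first entry 1.0 for the constant)
    beta_female : function theta -> coefficient vector (hypothetical stand-in
                  for the estimated female quantile-regression coefficients)
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        x = rng.choice(x_male)   # a man picked at random from the sample
        theta = rng.random()     # theta drawn uniformly from [0, 1)
        beta = beta_female(theta)
        draws.append(sum(xk * bk for xk, bk in zip(x, beta)))
    return draws

def quantile(sample, theta):
    """Empirical theta-quantile using a simple order-statistic rule."""
    s = sorted(sample)
    return s[min(int(theta * len(s)), len(s) - 1)]

def decompose(w_men, w_women, w_counterfactual, theta):
    """Split the raw gap at quantile theta into price and composition parts."""
    q_m = quantile(w_men, theta)
    q_f = quantile(w_women, theta)
    q_fm = quantile(w_counterfactual, theta)
    return {"price": q_m - q_fm, "composition": q_fm - q_f, "raw": q_m - q_f}
```

For instance, `mm_counterfactual(x_male, lambda t: [7.0 + t, 0.1], 1000)` uses made-up coefficients purely to exercise the function; in an application the coefficients would come from quantile regressions estimated on a grid of θ values.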

9 The probability integral transformation states that if U is a uniform random variable on [0,1], then $F^{-1}(U)$ has distribution function F.


Estimating a quantile regression of Female Wages

A consistent estimator of $\beta_\theta^F$ is vital to build counterfactual distributions. However, given that women self-select into work, the usual problem of sample selection bias applies to its estimation. If, for example, the fraction of women actually participating is higher at the top of the potential wage distribution, observed data under-samples the low potential earners and over-samples the high potential ones. Therefore, we need to correct for selection in a QR framework. We estimate the Mincer equation for working women correcting for selection, where each quantile is given by:

$$Q_\theta(w \mid \cdot) = \mu_\theta + x^F \beta_\theta^F + P_\theta(z\gamma) + e_\theta \quad \text{for } \theta \in (0,1). \qquad (2)$$

$x$ is a vector of labor market characteristics, $z$ is the set of observables that influence the participation decision 10, and $P$ is the probability of participating. The term $P_\theta(z\gamma)$ adjusts for selection at each quantile θ ∈ (0,1). Once $\beta_\theta^F$ has been consistently estimated, the MM procedure is conducted as described above. We follow Buchinsky (1998) to account for selection in a QR framework 11. This procedure shares the spirit of the popular Heckman (1978) two-step selection correction model but differs from Heckman in two important ways. First, quantiles, as opposed to mean regressions, are considered. Second, normality and homoskedasticity in the selection model are not assumed. Therefore, while in Heckman's model the selection bias term takes the usual `inverse Mills ratio' form, in Buchinsky (1998) the form of the selection bias term is unknown. The model is summarized as follows. Let $y_i$ be a participation dummy and G an unknown function of the single index $z_i\gamma$. The probability of participating is given by

$$P(y_i = 1) = G(z_i\gamma) \quad \text{for } i = 1, \dots, N. \qquad (3)$$

We then construct the selection correction term, P(⋅), as a polynomial of the index,

10 An important assumption is that $x_i$ is a subvector of $z_i$, and that $z_i$ includes at least one continuous variable not present in $x_i$. The particular exclusion restrictions in our application are made explicit in the data section.

11 Buchinsky's approach to the semiparametric sample selection model for conditional quantiles has been recently criticized by Melly and Huber (“Sample Selection, Heteroscedasticity, and Quantile Regression”, in progress). The authors claim that Buchinsky's results implicitly require the independence between the error term and the regressors conditional on the selection probability, and it is unclear whether the independence assumption is satisfied in practice. Despite the criticism, there are currently no alternative methods to control for selection bias in a QR framework.


$$P_\theta(z_i\gamma) = \lambda_{\theta,0} + \lambda_{\theta,1}\, r(a + b(z_i\gamma)) + \lambda_{\theta,2}\, r(a + b(z_i\gamma))^2 + \dots + \lambda_{\theta,q}\, r(a + b(z_i\gamma))^q, \qquad (4)$$

where a and b are location and scale parameters, and r(⋅) denotes the inverse Mills ratio, $r(\cdot) = \phi(\cdot)/\Phi(\cdot)$, evaluated at $a + b(z_i\gamma)$. A key point here is that the λ's vary with θ. We separate the location and scale parameters from the index since these are not identified in the semiparametric single-index framework 12. Following Buchinsky, a Hausman specification test is used. We test the null hypothesis of normal errors, given the existence of the single index estimator which is consistent under both null and alternative hypotheses 13. Probit should be used in the first step of the selection correction when errors are normally distributed; the single-index estimator should be used otherwise 14. Last, note that $\mu_\theta$ and $\lambda_{\theta,0}$ are not separately identified in the quantile regression model above. We follow Buchinsky and estimate them by the method proposed by Andrews and Schafgans (1998) of identification at infinity. The intuition is as follows: if we choose a subsample of women with labor market characteristics such that the probability of working given those characteristics is arbitrarily close to 1, we can use this subsample to estimate the intercept in the Mincer equation, $\mu_\theta$, without adjusting for selection. Clearly, the variance-covariance matrix for the MM procedure has to account for the variability of the selection correction. AVV prove asymptotic normality of the MM quantiles in this context, and extend the covariance matrix estimator in Buchinsky (1998) for quantile regression with selection correction to the MM quantiles.
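The series terms in equation (4) are simple to compute once the index has been estimated. The sketch below only builds the regressors and evaluates the polynomial; the function names are hypothetical, and the index values, a, b and the λ's are taken as given rather than estimated.

```python
import math

def inverse_mills(v):
    """r(v) = phi(v) / Phi(v) for the standard normal distribution.

    Phi(v) underflows to zero for very negative v, so this simple version
    is only meant for moderate arguments.
    """
    pdf = math.exp(-0.5 * v * v) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))
    return pdf / cdf

def selection_terms(index, a, b, q):
    """Powers r, r^2, ..., r^q of the inverse Mills ratio at a + b * index."""
    r = inverse_mills(a + b * index)
    return [r ** k for k in range(1, q + 1)]

def selection_correction(index, a, b, lambdas):
    """lambda_0 + lambda_1 r + ... + lambda_q r^q for one quantile theta."""
    q = len(lambdas) - 1
    terms = selection_terms(index, a, b, q)
    return lambdas[0] + sum(l * t for l, t in zip(lambdas[1:], terms))
```

In an application, the output of `selection_terms` would be appended to the wage covariates for working women at each quantile, so that the λ's are estimated jointly with the returns and are free to differ across θ.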

Results

Raw Gap Decompositions, without Selection Correction

The first step is to study the gender gap from raw data, before conditioning on covariates (such as age and education) and before accounting for selection of women into the labor market. The raw gap is the difference between the log wage of a male at a specific

12 To see this, note that for any pair (a,b) and a function $G(a + b(z_i\gamma))$ there is a function $\hat G(z_i\gamma)$ such that $G(a + b(z_i\gamma)) = \hat G(z_i\gamma)$ for all $z_i$. Following Buchinsky, we estimate a and b by running a probit regression of $y_i$ on the semiparametrically estimated index $z_i\gamma$.

13 The Hausman test is performed using Klein and Spady's (1993) estimator. Under the null hypothesis of normally distributed errors, $(d_I - d_p)'(V_I - V_p)^{-1}(d_I - d_p) \sim \chi^2(d_f)$, where for i = {single index, probit}, $d_i$ are the estimates, $V_i$ the covariance matrices and $d_f = \dim(d_i)$. The delta method is used to compute the covariance matrix of the probit estimates.

14 While Buchinsky (1998) and AVV (2007) use the Ichimura single-index estimator, we employ the quasi-maximum likelihood estimator of Klein and Spady (1993). The latter is superior since it achieves the semiparametric efficiency bound of Chamberlain and Cosslett.
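The Hausman statistic in footnote 13 is a quadratic form in the difference between the two sets of estimates. A minimal sketch, assuming the estimates and covariance matrices are already available as plain lists (the helper names are hypothetical and no p-value is computed):

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def hausman_stat(d_i, d_p, v_i, v_p):
    """(d_I - d_p)' (V_I - V_p)^{-1} (d_I - d_p)."""
    diff = [u - v for u, v in zip(d_i, d_p)]
    v_diff = [[x - y for x, y in zip(ri, rp)] for ri, rp in zip(v_i, v_p)]
    z = solve(v_diff, diff)  # (V_I - V_p)^{-1} (d_I - d_p)
    return sum(u * v for u, v in zip(diff, z))
```

The statistic would then be compared with a chi-squared critical value with dim(d) degrees of freedom. In finite samples the matrix difference need not be positive definite, which this sketch does not guard against.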


quantile of their distribution and the log wage of a female at the same quantile of the female distribution. A gap of, say, 0.4 at the i-th percentile means that the log wage of one group is 0.4 log points higher than that of the other at that percentile, loosely read as a 40% wage difference. To characterize the evolution of the gender wage gap along the distribution of wages, Figure 1 displays three observations of the raw gender gap, one decade apart. Several features are worth mentioning.

Figure 1: Raw Gender Gap for 1986, 1996 and 2006

[Figure: vertical axis % Hourly Wage, horizontal axis Percentile]

Note: The solid line with markers, solid and dashed lines correspond to the 1986, 1996 and 2006 raw gaps, respectively.

First, male and female wages are extremely unequal, and men are always paid significantly more than women. Second, the gender gap displays a U-shape, that is, women's wages fall behind men's more at the extremes of the distribution whereas they are closer near the median. De la Rica, Dolado and Llorens (2006) report a similar nonmonotonicity in Spain, due to a composition effect: the gap for high education workers increases along the distribution while that of low education ones decreases. This is not the case in Colombian data, for any of the studied covariates. Rather, the minimum wage may be behind the lower levels of the wage gap in the middle of the distribution. Since the people at the middle of the distribution earn around the minimum, the minimum wage may exert a gender equalizing effect on intermediate earnings jobs. In our sample for the year 2006, the median wage is 1.1 minimum wages, and approximately 40% of workers earn wages less than or equal to the minimum wage. The ‘bite’ of the minimum wage varies along the income distribution. It does not affect the wages of people earning less than the minimum, usually informal, unprotected workers. It is very binding at and around the level of the minimum wage, and it loses its grip as we move along the income distribution towards the high earners (Cunningham, 2007). Hence, even though the minimum indexes the wage distribution, its evolution has little effect on very highly paid workers. Third, even though the gaps for 1986 and 1996 are very similar, the 2006 gap shifted downwards for the middle quintile. When including confidence intervals (not displayed

here) the 1986 and 1996 gaps are not significantly different from each other. The 2006 gap is only significantly lower than the other two between percentiles 40 and 60. Also recall that the minimum wage is most binding in the middle of the income distribution. The fact that the gender gap is so resilient is surprising since Colombia is perceived domestically as an ‘egalitarian’ country regarding gender issues. Legally, equality is ‘guaranteed’. On one hand, the 1991 Constitution, through Articles 13 and 43, mandates the State to promote ‘real and effective equality’ and ‘adopt measures to favor discriminated or marginalized groups’. On the other, the Labor Code 15 states that employers should pay equal wages for equal jobs. Therefore, despite the strong improvement in the labor market characteristics of women, and the legal framework to promote equality, the gender wage gap in Colombia has changed little during the last 20 years. The QR framework allows us to observe the variation across the distribution hidden behind means analysis. In Colombia, the gender gap is higher for women at the top and bottom of the distribution of log-wages. Since the gap widens at the top of the distribution, there is a glass ceiling effect suggesting a barrier to further advancement of women once they have attained a certain level. Albrecht et al. (2003) find that the raw gap in Sweden increases to 40% in 1992 and 1998. We find similar levels at the top of the distribution for Colombia in 1986 and 1996 (and insignificantly lower levels in 2006).
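The raw gap discussed in this section, the difference between male and female log wages at the same quantile of each group's own distribution, can be sketched as follows. The function names and the order-statistic quantile rule are illustrative assumptions; sample weights and confidence intervals are left out.

```python
def quantile(sample, theta):
    """Empirical theta-quantile (0 < theta < 1), simple order-statistic rule."""
    s = sorted(sample)
    return s[min(int(theta * len(s)), len(s) - 1)]

def raw_gap(log_w_men, log_w_women, thetas):
    """Male minus female log wage at each quantile level in thetas."""
    return [quantile(log_w_men, t) - quantile(log_w_women, t) for t in thetas]
```

Calling `raw_gap(men, women, [k / 100 for k in range(1, 100)])` on two lists of log hourly wages gives the kind of percentile-by-percentile profile plotted in the raw gap figures.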

Figure 2: Raw Gender Gap for 2006

[Figure: vertical axis % Hourly Wage, horizontal axis Percentile]

Note: The solid line is the raw gap while the dashed lines are the 95% confidence intervals.

In what follows, we will focus on the year 2006. As mentioned above, the raw gap for 2006 (Figure 2) displays a U-shape. Whereas at low levels of the distribution the gap is around 35% of the log-wage, near the median it is close to zero, and it increases towards

15 Código Sustantivo del Trabajo (art. 143).

the upper tail of the distribution, to a maximum log wage difference of about 30%. Recall that a log-wage gap of 35% is equivalent to a 42% gap in the wage level. Finally, even though the gap increases in the second half of the distribution, the main increase is observed at the richest decile: at the 90th percentile the gap is around 10% and it increases to about 30% at the 99th percentile. However, since wages are more dispersed at either extreme, the results there are measured less precisely. What is behind the Colombian gender wage gap? With equality in mind, if productivity were neutral to gender, two identical workers who differ only by their gender should earn the same. Using the MM technique we can decompose the gap in Figure 2 into a component generated by differences in labor market characteristics and a component due to differences in the returns to these characteristics. We build the counterfactual distribution of men's wages given their characteristics but paid the level of female returns. The difference in the characteristics between men and women is accounted for by taking the difference between the observed male distribution and the proposed counterfactual distribution of `men paid as women'.
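As a check on the conversion between log-wage gaps and level gaps quoted above, a gap of g log points corresponds to a proportional difference of e^g − 1 in wage levels. The symbols W^M and W^F, for male and female wage levels at a given percentile, are introduced here only for illustration, and the values are rounded.

```latex
\[
\frac{W^M}{W^F} - 1 = e^{g} - 1, \qquad g = \ln W^M - \ln W^F ,
\]
\[
e^{0.35} - 1 \approx 0.42, \qquad e^{0.30} - 1 \approx 0.35, \qquad e^{0.10} - 1 \approx 0.11 .
\]
```

For small gaps the two measures nearly coincide, which is why log points are often read directly as percentages; at the tails of the distribution the difference is no longer negligible.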

Figure 3: Difference Between the Men Wage Distribution and the Distribution of Men Paid as Women
[Figure: vertical axis, % Hourly Wage; horizontal axis, Percentile (0 to 0,9)]

Note: The solid line is the raw gap while the dashed lines are the 95% confidence intervals

Figure 3 displays the gender gap that remains after we purge the effect of differences in labor market characteristics. It is statistically equal to the raw gap in Figure 2, except at the extremes. In the bottom 10% of the distribution, the remaining gap after controlling for observables is nearly 25%, 10 percentage points lower than the raw gap in Figure 2. At the top 5% of the distribution, the difference in characteristics explains the steep increase of the raw gap from roughly 15% to 30%. Therefore, only at the extremes of the income distribution do labor market characteristics account for the observed gender wage gap; the price effect accounts for most of the gender gap between the 10th and 95th percentiles. This is in line with results from other studies (see, for example, Albrecht, Van Vuuren and Vroman, forthcoming).

To confirm the previous result, we also calculated the difference between the observed male distribution and the (counterfactual) distribution of women's wages that would have prevailed if women retained their labor market characteristics but were paid for them as men: the ‘women paid as men’ distribution 16. This additional exercise yields the same conclusion: the price effect accounts for most of the raw gender gap.

Wage Gap Decompositions, Controlling for Selection 17

While male participation is very high (approximately universal), the proportion of working women is smaller. In addition, working and non-working women differ in labor market characteristics such as age and schooling. This suggests that selection bias is an issue in this estimation, since women select into the labor force in a non-random way. Therefore, the raw gap is not a good measure of the differences in pay between genders, since we are comparing the universe of men with a selected sample of women. We need to account for the selection bias in the women's distribution, to make the male and female distributions comparable, and then calculate the gender gap. What would the distribution of female wages be if all women worked? The MM procedure described above is used to build this counterfactual distribution. We generate a random sample of female wages using the female coefficients adjusted à la Buchinsky combined with the labor market characteristics of all women (not just those who work). Hence, in what follows ‘accounting for selection’ refers to the use of this potential distribution of female wages.
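The simulation step behind the potential distribution can be sketched as follows. This is a minimal illustration of a Machado-Mata style draw, not the authors' code: the function `simulate_wages`, the coefficient function `toy_female_coefs` and the schooling values are all hypothetical, and the coefficients stand in for selection-adjusted quantile regression estimates that would be obtained in a separate step.

```python
import random
import statistics

def simulate_wages(covariate_rows, coef_at, n_draws=5000, seed=0):
    """Machado-Mata style simulation of a (counterfactual) log-wage sample.

    covariate_rows: list of rows [1.0, x1, x2, ...] for the target population.
    coef_at: function mapping a quantile u in (0, 1) to a coefficient list.
    """
    rng = random.Random(seed)
    wages = []
    for _ in range(n_draws):
        u = rng.random()                  # draw a quantile uniformly on [0, 1)
        x = rng.choice(covariate_rows)    # draw a covariate row with replacement
        beta = coef_at(u)                 # coefficients at that quantile
        wages.append(sum(b * xi for b, xi in zip(beta, x)))
    return wages

# Hypothetical selection-adjusted female coefficients: the intercept rises
# with u and the return to a year of schooling is fixed at 0.08.
# These are NOT estimates from the paper.
def toy_female_coefs(u):
    return [1.0 + 0.5 * u, 0.08]

# Hypothetical years of schooling; working women are the more educated subset.
all_women = [[1.0, s] for s in (5.0, 8.0, 11.0, 11.0, 14.0, 16.0)]
working_women = [[1.0, s] for s in (11.0, 11.0, 14.0, 16.0)]

# Potential distribution: adjusted returns applied to ALL women.
potential = simulate_wages(all_women, toy_female_coefs)
# Same returns applied only to the characteristics of working women.
working_only = simulate_wages(working_women, toy_female_coefs)

print(round(statistics.median(potential), 2),
      round(statistics.median(working_only), 2))
```

In this toy setup the potential distribution lies below the one built from working women's characteristics, because the same returns are applied to a less educated population.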

16 Results available from authors upon request.
17 We are grateful to Albrecht, Van Vuuren and Vroman for making their code available. The standard errors reported in this section were calculated using their codes.

Table 5. Selection Equation

                        Bogota                       Rest
                        Probit       Klein&Spady     Probit       Klein&Spady
Age squared             -1,172***    -1,22***        -1,103***    -1,087***
                        (0,06)       (0,04)          (0,02)       (0,01)
< Primary               -0,01        0,00            -0,02*       -0,02
                        (0,03)       (0,01)          (0,01)       (0,03)
Secondary+              0,00         0,03            0,12***      0,07***
                        (0,03)       (0,03)          (0,02)       (0,02)
University              0,13***      0,17***         0,25***      0,18***
                        (0,05)       (0,02)          (0,03)       (0,02)
Married                 -0,133***    -0,287***       -0,173***    -0,143***
                        (0,05)       (0,05)          (0,02)       (0,02)
Head                    0,22***      0,23***         0,20***      0,12***
                        (0,07)       (0,05)          (0,02)       (0,04)
# Children <1           -0,04        -0,09***        -0,04***     -0,023***
                        (0,03)       (0,03)          (0,01)       (0,01)
# Children <6           -0,07***     -0,04***        -0,02***     -0,01***
                        (0,04)       (0,02)          (0,01)       (0,02)
Home Ownership          -0,08***     -0,13***        -0,04***     -0,03***
                        (0,04)       (0,03)          (0,01)       (0,01)
Non-Earned Income       -0,10***     -0,20***        -0,18***     -0,13***
                        (0,04)       (0,04)          (0,02)       (0,02)
Other Family Income     0,06*        0,09***         -0,02**      0,02
                        (0,03)       (0,02)          (0,01)       (0,03)

Hausman Test            36,163                       35,018
95% Critical Value      19,675                       19,675

Note: All the coefficients are calculated relative to the absolute value of the coefficient of age. Standard errors in parentheses. *** Significant to 99%, ** Significant to 95%, * Significant to 90%

The results of the estimation of the selection equation are presented in Table 5. Hausman test results suggest that for the Colombian data the single index estimator (as opposed to the probit) should be used in the first step of the correction model. After accounting for selection using the single index estimator proposed by Klein and Spady, we calculate the true gender gap (simply ‘gender gap’ in what follows) as the difference between the male wage distribution and the potential distribution of women's wages. Figure 4 shows that the gender gap displays a U-shape, as did the raw gap. However, the level is significantly higher, especially at the upper end of the distribution. The lowest wage gap is around 25% of log wages and it is observed in the middle of the distribution. Whereas the maximum levels recorded by the raw gap were around 35% in the lower end of the distribution and 30% in the upper end, the respective maxima for the gender gap are 50% and 60%. Recall that a log wage gap of 60% is equivalent to a gap in the level of wages of over 80%. Again, the gender gap increases substantially in the top tenth of the distribution, this time rising from 40% to 60%. AVV find that after accounting for selection the gender gap is increasing and it reaches 40% at the top of the distribution. According to our calculations, the glass ceiling in Colombia is steeper, and high earning women are swimming upstream.
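The conversion between a gap in log wages and a gap in wage levels used in the text is exp(g) - 1. The short check below, with a hypothetical helper name `log_gap_to_level`, reproduces the two figures quoted in the paper.

```python
import math

def log_gap_to_level(log_gap):
    """Convert a gap in log wages into a proportional gap in wage levels."""
    return math.exp(log_gap) - 1.0

for g in (0.35, 0.60):
    print(f"log gap {g:.2f} -> level gap {log_gap_to_level(g):.3f}")
```

The two values come out at about 0.419 and 0.822, which matches the 42% and ‘over 80%’ figures cited for log gaps of 35% and 60%.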

Figure 4: Wage Gap after Accounting for Selection or True Wage Gap
[Figure: vertical axis, % Hourly Wage; horizontal axis, Percentile (0 to 0,9)]

Note: The solid line is the raw gap while the dashed lines are the 95% confidence intervals

Clearly, the selection correction is important and sizable. Given that working women are a selected sample of women, the raw gap underestimates the existing gender gap in the country. Note that the gender gap is equivalent to ‘adding up’ the raw gap (Figure 2) and the selection effect (which will be discussed later and is portrayed in Figure 6). Again, we perform a decomposition using the MM technique and accounting for selection. As in the previous section, we build the distribution of men's wages that we would observe if they retained their characteristics but were disguised as women, and hence were paid the selection-adjusted returns of women: the ‘men paid as women’ distribution. That is, we purge the effect of differences in the distribution of observable characteristics between men and women (Figure 5). Note that roughly two-thirds of the wage gap, attributable to the price effect, remains after accounting for differences in characteristics.
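The adding-up relation mentioned above can be stated at each quantile. The symbols are assumptions made here for illustration: $Q_\theta(\cdot)$ is the $\theta$-th quantile of a log-wage distribution, $w^{m}$ and $w^{f}$ are observed male and female log wages, and $w^{f*}$ is the potential distribution of female wages (the one that would be observed if all women worked).

```latex
\[
\underbrace{Q_\theta(w^{m}) - Q_\theta(w^{f*})}_{\text{gender gap (Figure 4)}}
=
\underbrace{Q_\theta(w^{m}) - Q_\theta(w^{f})}_{\text{raw gap (Figure 2)}}
+
\underbrace{Q_\theta(w^{f}) - Q_\theta(w^{f*})}_{\text{selection effect (Figure 6)}}
\]
```

With positive selection the last term is positive, which is why the raw gap understates the gender gap.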

Figure 5: Difference between the Men wage Distribution and the Distribution of Men paid as Women, after accounting for Selection
[Figure: vertical axis, % Hourly Wage; horizontal axis, Percentile (0 to 0,9)]

Note: The solid line is the raw gap while the dashed lines are the 95% confidence intervals

Decomposing the Selection Term

Let us now turn to the direct effect of selection by characterizing the non-randomness of the participation decision of women. Are less able women forced to work because of need or, on the contrary, is there positive selection, so that able women participate more? We already saw that in Colombia working women are younger and more educated than those who do not work. They are also less likely to own the house they live in than non-working women. The selection effect is calculated as the difference between the observed and the potential distribution of women's wages. Thus it is not surprising to find that selection is positive and rather high in our application, around 20%, as shown in Figure 6. This is twice what AVV find the direct effect of selection to be in the Netherlands. Additionally, since we go beyond an analysis of means, we are able to determine that selection is not constant but rather seems to increase along the distribution. This implies that able women, those with better labor market characteristics, are increasingly pulled into the workforce by the high returns. Hence, the raw gap underestimates the true gender gap by roughly 20% (the selection effect), since women who actually work are those who would get the greatest return. The underestimation is especially high at the top of the distribution.

Figure 6: Selection Effect
[Figure: vertical axis, % Hourly Wage; horizontal axis, Percentile (0 to 0,9)]

Note: The solid line is the raw gap while the dashed lines are the 95% confidence intervals

Selection is due both to differences in the labor market characteristics between women who work and those who do not and to unobserved characteristics. The MM methodology allows us to decompose the selection effect into a portion due to observable labor market characteristics and a remainder due to unobservables. In doing so we build another counterfactual distribution: the distribution of women's wages that would have prevailed if prices accounted for selection, but women had the distribution of labor market characteristics of working women (not of all women). The difference between this ‘working women adjusting for selection’ counterfactual and the potential distribution tells us how much of the selection effect can be explained by differences in the distribution of characteristics between women who work and those who do not (Figure 7). Even though the effect of observables is not homogeneous along the distribution, it accounts for roughly one quarter of the selection effect up to the 70th percentile, and in the top 30% it explains about half. However, it is not significant.
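The split of the selection effect can be sketched at a few quantiles as below. This is an illustrative sketch with hypothetical function names and made-up log-wage samples. It takes the observables part as the counterfactual minus the potential distribution, and the remainder as the observed distribution minus the counterfactual, which is how the paper defines the two portions.

```python
def empirical_quantile(sample, tau):
    """Order-statistic estimate of the tau-th quantile of a sample."""
    ordered = sorted(sample)
    index = min(int(tau * len(ordered)), len(ordered) - 1)
    return ordered[index]

def split_selection(observed, working_adjusted, potential, taus):
    """Split the selection effect at each quantile into two parts."""
    rows = []
    for tau in taus:
        q_obs = empirical_quantile(observed, tau)
        q_adj = empirical_quantile(working_adjusted, tau)
        q_pot = empirical_quantile(potential, tau)
        rows.append({
            "tau": tau,
            "selection": q_obs - q_pot,       # observed minus potential
            "observables": q_adj - q_pot,     # workers' vs all women's characteristics
            "unobservables": q_obs - q_adj,   # market vs selection-adjusted returns
        })
    return rows

# Hypothetical log-wage samples, for illustration only.
observed = [2.0 + 0.010 * i for i in range(100)]
working_adjusted = [1.85 + 0.010 * i for i in range(100)]
potential = [1.80 + 0.009 * i for i in range(100)]

for row in split_selection(observed, working_adjusted, potential, [0.1, 0.5, 0.9]):
    print(row)
```

By construction the two parts sum to the selection effect at every quantile, so the split is an accounting identity and carries no additional assumption.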

Figure 7: Selection due to Observables
[Figure: vertical axis, % Hourly Wage; horizontal axis, Percentile (0 to 0,9)]

Note: The solid line is the raw gap while the dashed lines are the 95% confidence intervals

The remainder of the selection effect is then attributed to unobservables. It is calculated as the difference between the actual distribution of female wages and the ‘working women adjusting for selection’ counterfactual distribution. What we do here is hold the distribution of observable characteristics constant (that of working women) and change the returns to characteristics from the ones observed in the market to the selection-adjusted ones. Unobservables account for roughly three quarters of the selection effect up to the 70th percentile, and half in the top 30% of the distribution, and the effect is statistically significant. Therefore, the portion attributable to unobservables is higher than the portion due to observable characteristics, despite the obvious differences in labor market characteristics between working and non-working women. Clearly, adding up the portions due to observables and unobservables yields the selection effect.

Concluding Remarks

The raw gender gap in Colombia has proved very resilient over the past 20 years, despite the strong convergence in men's and women's labor market characteristics and the existence of legal guarantees of gender equality. Men are always paid more than women, and underpaid women are prevalent at the extremes of the distribution. Since this can only be captured in a QR framework, it is not only necessary but also interesting to go beyond an analysis of means for the Colombian case. The size of the selection effect, even in a country such as Colombia with high levels of female participation by Latin American standards, implies that non-random selection is an issue in the calculation of gender gaps and should be taken seriously. Correcting for selection in a QR framework is important, since we find that the selection effect is positive and significant: able women are pulled into the workforce.

The results for Colombia, a developing economy, are similar to those found in developed countries in the sense that the price effect explains the bulk of both the raw and the true gender gaps.


References

Albrecht, James, Anders Bjorklund and Susan Vroman, 2003 "Is there a glass ceiling in Sweden?", Journal of Labor Economics, 21, 145-177.

Albrecht, James, Aico Van Vuuren and Susan Vroman, forthcoming "Counterfactual Distributions with Sample Selection Adjustments: Econometric Theory and an Application to the Netherlands", Labour Economics.

Amador, Diego, Raquel Bernal and Ximena Peña, in progress "Trends in Female Labor Participation in Colombia: Marriage, Children or Education?".

Andrews, Donald and Marcia Schafgans, 1998 "Semiparametric Estimation of the Intercept of a Sample Selection Model", The Review of Economic Studies, Vol. 65, No. 3, Jul., pp. 497-517.

Autor, David, Lawrence Katz and Melissa Kearney, 2005 "Rising Wage Inequality: the Role of Composition and Prices", NBER Working Paper 11628, September.

Buchinsky, Moshe, 1998 "The Dynamics of Changes in the Female Wage Distribution in the USA: a Quantile Regression Approach", Journal of Applied Econometrics, 13, 1-30.

Cunningham, Wendy, 2007 Minimum Wages and Social Policy: Lessons from Developing Countries. World Bank Publications.

De la Rica, Sara, Juan Dolado and Vanesa Llorens, 2005 "Glass Ceiling or Floors?: Gender Wage Gaps by Education in Spain", IZA Discussion Paper No. 1483, January.

Duryea, Suzanne, Olga Jaramillo and Carmen Pagés, 2001 Latin American Labor Markets in the 1990s: Deciphering the Decade. Inter-American Development Bank.

Fernández, Pilar, 2006 "Determinantes del diferencial salarial por género en Colombia, 1997-2003", Desarrollo y Sociedad, #58, septiembre.

Ganguli, Ina and Katherine Terrell, 2005 "Wage Ceilings and Floors: The Gender Gap in Ukraine's Transition", IZA Discussion Paper #1776.

Klein, Roger and Richard Spady, 1993 "An Efficient Semi-Parametric Estimator for Binary Response Models", Econometrica, Vol. 61, No. 2, March, 387-421.

Machado, José and José Mata, 2005 "Counterfactual Decomposition of Changes in Wage Distribution Using Quantile Regression", Journal of Applied Econometrics, 20, 445-65.

Melly, Blaise and Martin Huber, in progress "Sample Selection, Heteroscedasticity, and Quantile Regression".

Ñopo, Hugo, 2006 "The Gender Wage Gap in Chile 1992-2003 from a Matching Comparisons Perspective", Research Department Working paper series # 562, Inter-American Development Bank.

Peña, Ximena, 2006 "Assortative Matching and the Education Gap", Georgetown University Working Paper.
