Spells of eligibility and spells of participation to welfare programs in France
Antoine Terracol∗
February 2004

Abstract

This paper analyses the duration of welfare eligibility and of welfare spells. Due to the under-reporting of households’ income in our dataset, durations of eligibility spells suffer from measurement errors. To handle the problem of a mismeasured dependent variable, we use the Monotone Rank Estimator of Cavanagh and Sherman (1998) to estimate a duration model for eligibility spells. Because of participation costs, not all eligible agents choose to participate in the welfare program, which creates self-selection in the sample of observed participants. We develop two duration models correcting for selectivity to estimate the determinants of the duration of welfare use. Finally, a double-hurdle duration model tackling the issue of under-reporting of program participation is estimated. Results show that both self-selection and under-reporting do occur, and that welfare use spells have previously been under-estimated.

Keywords: welfare eligibility, welfare spells, duration models, measurement errors, selectivity.
JEL codes: I38, I32, C14, C41, C35

1 Introduction

Many studies have focused on the efficiency of the best-known French welfare program, the RMI (Revenu Minimum d’Insertion, Minimum Guaranteed Income). Its redistributive properties, as well as its effects on the labor supply behavior of its participants, have been scrutinized. This paper focuses on the analysis of the spells of eligibility for, and of the spells of participation in, the RMI, using data from the French waves of the European Community Household Panel. We try to take into account some features of our data, as well as the take-up behavior of eligible individuals. A previous study (Terracol, 2002a) has shown that the European Community Household Panel suffers from under-reporting of income, thereby leading to errors in the computation of individuals’ eligibility. Moreover, Terracol (2002a) shows that the take-up rate of the RMI in France is around 65%, implying that the population of participants is a self-selected sub-population of eligible individuals. This study seeks to shed some new light on the issues of eligibility and participation spells by applying appropriate statistical methods to analyze our data.

Section 2 presents the French welfare program analyzed in this paper, and Section 3 gives a brief overview of previous studies on the topic. Our data are presented in Section 4; spells of welfare eligibility are analyzed in Section 5, while Section 6 concentrates on participation spells. Sub-samples and econometric models are presented within the relevant sections, and Section 7 concludes.

∗ TEAM, Université de Paris 1 Panthéon-Sorbonne and CNRS, 106-112 boulevard de l’Hôpital, 75642 Paris cedex 13. [email protected]

2 The RMI program

The RMI (Revenu Minimum d’Insertion, Minimum Guaranteed Income) is one of the best-known French benefits. It is an almost-universal mechanism, created in December 1988, whose aim is to provide a minimum income to any person who, due to age, physical or mental disability, or labour market conditions, is unable to work and falls outside the traditional welfare net. This mechanism complements older benefits that target narrower categories: disabled adults (Allocation Adulte Handicapé, AAH), single parents (Allocation de Parent Isolé, API), the elderly (Minimum Vieillesse), etc. It has been designed from two fundamental principles: the citizenship principle, which states that anyone in a society has the right to lead a dignified life; and the responsibility principle, which defines a set of reciprocal duties between society and the individuals that are part of it. The universal nature of the RMI is reinforced by access to a whole set of social rights aiming to provide welfare recipients with the basis for social and professional reintegration. Welfare recipients must (in principle) sign an “integration contract” with a local caseworker.

Eligibility for the RMI program is based on only two criteria: the recipient must be at least 25 years old, and the household’s income must fall below a certain threshold. The benefit actually paid is then defined as the difference between the income threshold and the household’s mean income during the three previous months. The RMI benefit can be fully combined with labour income for up to three months, and then partially (with two successive reductions) for twelve more months. In any case, the benefit cannot be combined with labour income if the recipient has worked more than 750 hours.

3 Previous studies

A large number of studies have used spell data to study the time pattern of welfare program participation. However, almost all of these studies have focused on welfare participation spells. An exception is Blank and Ruggles (1996), which looks at both AFDC and Food Stamps eligibility and participation spells for women. They show that many eligibility spells are relatively short and end without program participation. Women experiencing those spells, which end with an increase in income or a change in family structure, tend to be older and more educated.


A seminal study is Bane and Ellwood (1986). They study the dynamics of poverty spells, with poverty defined as income falling below a certain threshold. They tabulate spell durations to calculate exit probabilities and duration distributions. Their results show that many poverty spells are quite short, and that the bulk of resources go to a minority of people who have very long stays in poverty. Their methodology has been applied by Stevens (1994) to six more waves of the PSID. She also looks at multiple poverty spells and finds that they are quite common for people having experienced rather long welfare spells in the past.

Spells of welfare program participation have received great attention in the economic literature. One can for example cite O’Neil et al. (1987), Blank (1989), Hoynes and MaCurdy (1994), Hoynes (1996), Duclos et al. (1998), Welch (1998) and Fitzgerald (1994) for the USA and Quebec. In France, RMI spells have been studied by Zoyem (2001), Afsa and Guillemot (1999) or, in a more theoretical way, by Gurgand and Margolis (2002). Afsa (1999) also looks at spells of participation in another French welfare program: the API.¹ All these studies use survival models to analyse their data, but assume the absence of measurement errors in the eligibility calculation, and a perfect take-up of these benefits.

4 Data

For this study, we use the first seven French waves of the European Community Household Panel, covering the years 1993 to 2000. This dataset is particularly well suited to our purpose since it provides detailed information on agents’ income and program participation on a monthly basis. Moreover, the structure of these “income calendars”, covering more than 40 types of income (labour income, unemployment insurance, social transfers etc.), allows us to compute households’² eligibility based on official benefit scales. More precisely, we build the benefit level to which the household is entitled, defined as the difference between the maximum benefit level corresponding to the household’s structure and the mean income of the previous three months.

Because they are eligible for the API program, which provides a higher income than the RMI program, single parents with children below 3 years old are excluded from our sample. Likewise, households in which at least one individual receives the disability benefit (Allocation Adulte Handicapé, AAH), as well as military personnel and retired individuals, are excluded. We thus define spells of eligibility as consecutive months for which our calculated entitlement is positive. Likewise, welfare participation spells are defined as consecutive months for which at least one individual in the household reports program participation.

Because households’ income is likely to be misreported (mainly under-reported), observed eligibility spells will suffer from measurement errors. Our statistical analysis of Section 5 seeks to provide estimates of the determinants of the exit rates out of eligibility that correct for this unknown measurement error.

1 Allocation de Parent Isolé, Single Parent Benefit.
2 More specifically, they are “units” defined in a more restrictive way than “households”: a child over 25 years still living with her parents will be considered as a separate unit. For simplicity’s sake, we will refer to those units as “households” in the remainder of the paper.
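The construction of entitlement and eligibility spells described above can be sketched as follows. This is only an illustration under simplifying assumptions: the function names are hypothetical, the income threshold is a single constant (whereas the official scale depends on household structure), and incomes are a plain monthly list rather than the panel's income calendars.

```python
from typing import List, Tuple


def entitlement(incomes: List[float], threshold: float, month: int) -> float:
    """Threshold minus the mean household income over the three previous months."""
    previous = incomes[month - 3:month]
    return threshold - sum(previous) / 3.0


def eligibility_spells(incomes: List[float], threshold: float) -> List[Tuple[int, int]]:
    """Return (start_month, length) for each run of months with positive entitlement.

    Eligibility can only be assessed from month index 3 onwards, because three
    previous months of income are needed.
    """
    spells = []
    start = None
    for month in range(3, len(incomes)):
        eligible = entitlement(incomes, threshold, month) > 0
        if eligible and start is None:
            start = month                      # a new spell begins
        elif not eligible and start is not None:
            spells.append((start, month - start))  # the spell has ended
            start = None
    if start is not None:                      # spell still ongoing at the end of the window
        spells.append((start, len(incomes) - start))
    return spells
```

For instance, with a hypothetical threshold of 400 and three low-income months in the middle of the series, the household is counted as eligible for four months, because the three-month average keeps entitlement positive for one month after income recovers.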

5 Eligibility spells

5.1 Data

Households’ eligibility for the RMI program was assessed using reported income and official benefit scales. RMI entitlement is defined as the difference between the income threshold corresponding to the household structure at time t, and the mean of the household’s income for the previous 3 months.³ Eligibility spells are then defined as consecutive months for which the calculated household’s income falls below the income threshold of the RMI program. If two participation spells for the same household are separated by less than three months of (reported) non-participation, we have “filled the gap” and imputed participation for those months. We have also censored ongoing spells at 48 months, as well as spells ending at the junction of two data waves. Table 1 presents summary statistics of the eligibility spells according to their censoring status.

Table 1: Eligibility spells
Spell type               N      Mean    Std. Err.
Left censored            196    12.71   10.95
Right censored           266    16.28   13.29
Left & right censored    237    23.33   16.93
Complete                 324     7.23    8.61
All                      1023   14.36   13.95
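The gap-filling and censoring rules stated in the paragraph above can be illustrated with a short sketch. The function names are hypothetical, and the treatment of boundary cases (a gap at the very start or end of the observation window is left untouched; a spell is censored only when it runs beyond 48 months) is an assumption, since the text does not spell these cases out.

```python
def fill_short_gaps(status, max_gap=2):
    """Impute True for runs of False months shorter than three months
    (max_gap=2) that lie between two True months."""
    filled = list(status)
    n = len(filled)
    i = 0
    while i < n:
        if filled[i]:
            i += 1
            continue
        j = i
        while j < n and not filled[j]:   # find the end of the run of False months
            j += 1
        # fill only gaps bounded by True on both sides and short enough
        if i > 0 and j < n and (j - i) <= max_gap:
            for k in range(i, j):
                filled[k] = True
        i = j
    return filled


def censor(duration, completed, cap=48):
    """Right-censor a spell that is still running at `cap` months."""
    if duration > cap:
        return cap, False
    return duration, completed
```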

Only right-censored and complete spells are retained for the empirical analysis. We are thus left with 590 eligibility spells, 45.08% of which are right-censored. Table 2 gives summary statistics for the covariates that will be used in our empirical analysis. Our set of covariates also includes regional dummies and dummies for city size (not included in Table 2). Table 3 shows the cumulative distribution of the number of eligibility spells and of the number of eligibility months. It shows that a majority of eligibility spells are quite short, but that the minority of long spells represents the majority of eligibility months: while 72% of eligibility spells last a year or less, they represent only 31.8% of the total number of eligibility months in our dataset. This feature of eligibility spells is confirmed by Figure 1, which shows the estimated density of our spells according to their censoring status. The Kaplan-Meier survival estimates presented in Figure 2 suggest negative duration dependence for the hazard rate. People entering an eligibility spell will either exit rather quickly, or get caught in a “poverty trap” and experience long poverty spells.
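For readers unfamiliar with the Kaplan-Meier estimator mentioned above, a minimal sketch of the product-limit formula is given below. It is a generic textbook implementation with hypothetical names and made-up durations, not the code or data used in the paper.

```python
def kaplan_meier(durations, completed):
    """Product-limit estimate of the survival function.

    durations: spell lengths in months
    completed: True if the spell ended (failure), False if right-censored
    Returns a list of (t, S(t)) at each distinct failure time.
    """
    survival = 1.0
    curve = []
    failure_times = sorted({d for d, c in zip(durations, completed) if c})
    for t in failure_times:
        # spells still under observation just before t
        at_risk = sum(1 for d in durations if d >= t)
        # spells that end exactly at t
        failures = sum(1 for d, c in zip(durations, completed) if c and d == t)
        survival *= 1.0 - failures / at_risk
        curve.append((t, survival))
    return curve
```

Censored spells contribute to the risk set up to their censoring time but never count as failures, which is why dropping them would bias the survival curve downwards.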

3 Appendix A gives details about our computation of RMI entitlement.

Table 2: Summary statistics
                           Censored spells      Complete spells      All spells
Variable                   Mean     Std. Dev.   Mean     Std. Dev.   Mean     Std. Dev.
Age of adults              39.942   10.941      37.714   10.466      38.719   10.731
No or primary education     0.15     0.358       0.13     0.336       0.139    0.346
High school                 0.267    0.443       0.272    0.445       0.269    0.444
Technical school            0.365    0.482       0.417    0.494       0.393    0.489
University                  0.218    0.414       0.182    0.387       0.198    0.399
Number of adults            1.217    0.409       1.264    0.438       1.243    0.425
Number of children          0.948    1.05        0.629    1.012       0.773    1.04
Local unemployment rate    12.211    2.682      11.905    2.506      12.043    2.589
French                      0.917    0.276       0.929    0.257       0.924    0.266
Disabled                    0.132    0.339       0.216    0.412       0.178    0.383
Self-employed worker        0.034    0.181       0.028    0.165       0.031    0.172
Max. benefit /1000          3.529    1.117       3.197    1.182       3.347    1.16
N                         266                  324                  590

Table 3: Cumulative distribution of spells & months
Spell length    % of spells   % of months
≤ 6 months      47.80         11.50
≤ 12 months     72.03         31.83
≤ 18 months     80.00         41.87
≤ 24 months     88.98         59.17
≤ 30 months     91.19         64.42
≤ 36 months     94.41         73.66
≤ 42 months     95.76         76.39
All spells      100           100
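The two columns of Table 3 are simple cumulative shares. The following sketch, with a hypothetical function name and made-up durations, shows how such a table can be computed from a list of spell lengths.

```python
def cumulative_shares(durations, cutoffs):
    """For each cutoff, return (cutoff, % of spells, % of months) accounted
    for by spells lasting at most `cutoff` months."""
    total_spells = len(durations)
    total_months = sum(durations)
    rows = []
    for cutoff in cutoffs:
        short = [d for d in durations if d <= cutoff]
        rows.append((cutoff,
                     100.0 * len(short) / total_spells,
                     100.0 * sum(short) / total_months))
    return rows
```

With durations of 2, 4, 6, 12 and 36 months, the three shortest spells are 60% of spells but only 20% of eligibility months, the same qualitative pattern as in Table 3.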

The computation of eligibility spells requires high-quality information on households’ income to allow the researcher to determine, on a monthly basis, whether or not an agent is eligible to receive a welfare benefit. However, the vast majority of datasets used for this kind of study are based on self-reported information subject to various accuracy problems. Memory effects can occur, and lead agents to omit some types of income, mostly when filling out retrospective calendars.⁴ Using the same dataset, Terracol (2002a) has shown that income was under-reported, thus leading to an over-estimation of welfare eligibility and, consequently, of the length of eligibility spells. Because the dependent variable is likely to be mismeasured, traditional duration analysis might not be applicable, and a more robust methodology is required.

4 There is also evidence of under-reporting of income in questionnaire surveys in France.

Figure 1: Density of eligibility spells duration (horizontal axis: spell duration; vertical axis: density; series: right censored, complete, all)

5.2 Econometric models

While the use of duration models has grown in the economic literature since the seminal article of Lancaster (1979), followed by the rise of semi-parametric models following Meyer (1990) or Han and Hausman (1990), few studies have tried to cope with mismeasured dependent variables. Lancaster (1990, p. 59) briefly mentions that it could be handled via an unobserved heterogeneity term. Magnac and Visser (1998) propose a transition model with measurement errors, but it requires additional information not always available to researchers. We here use the Monotone Rank Estimator of Cavanagh and Sherman (1998), which can be used to estimate semi-parametric duration models with a mismeasured dependent variable, without specifying any parametric form for the measurement errors. The Monotone Rank Estimator and its application to duration models are described in section 5.2.1 below. For comparison purposes, we have also estimated a semi-parametric proportional hazard duration model allowing for discrete unobserved heterogeneity (see section 5.2.2).

5.2.1 The Monotone Rank Estimator

Cavanagh and Sherman (1998) propose a class of estimators for models such as:

$$y = g(X\beta, \varepsilon)$$

where $g(\cdot)$ is an increasing function. The estimator takes the form:

$$\hat{\beta}_{MRE} = \arg\max_{\beta} \sum_{i=1}^{N} M(y_i)\, R_n(X_i\beta) \qquad (1)$$

where $y$ is the observed dependent variable, $R_n(\cdot)$ is the “rank” function, and $M(\cdot)$ is some monotonic increasing function.
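To make equation (1) concrete, the sketch below evaluates the rank objective with the identity function for M and maximizes it over unit-norm parameter vectors. Several choices here are assumptions made for illustration only: the function names are hypothetical, the example is restricted to two covariates, and a simple grid search over the angle of β is swapped in for a derivative-free optimizer because it is easy to verify by hand. For duration data, y would be minus the spell duration.

```python
import math


def ranks(values):
    """Rank of each value (1 = smallest); ties are broken by position."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result


def mre_objective(beta, X, y):
    """Sum over i of M(y_i) * R_n(X_i beta), with M the identity function."""
    index = [sum(b * x for b, x in zip(beta, row)) for row in X]
    return sum(yi * ri for yi, ri in zip(y, ranks(index)))


def mre_grid_2d(X, y, steps=360):
    """Maximize the objective over beta = (cos a, sin a), so that the
    scale normalization ||beta|| = 1 holds by construction."""
    best_value, best_beta = None, None
    for k in range(steps):
        angle = 2.0 * math.pi * k / steps
        beta = (math.cos(angle), math.sin(angle))
        value = mre_objective(beta, X, y)
        if best_value is None or value > best_value:
            best_value, best_beta = value, beta
    return best_beta
```

Because the objective depends on β only through the ranking of the index, it is a step function of β: no constant term can be identified, and gradient-based optimizers are not usable.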

Figure 2: Kaplan-Meier survival estimates for the eligibility spells (horizontal axis: spell duration in months)

The key condition for the consistency of $\hat{\beta}_{MRE}$ is:

$$E[M(y) \mid X] \text{ is an increasing function of } X\beta$$

Because of the $R_n(\cdot)$ function, one must normalize the parameter vector and exclude the constant term from the model.⁵ Abrevaya and Hausman (1999) show that the Monotone Rank Estimator can be applied to fit survival models under the proportional hazard assumption. Indeed, the proportional hazard assumption states that the hazard rate at time $t$ is:

$$\lambda(t) = \lambda_0(t) \exp(X\beta)$$

where $\lambda_0(t)$ is the baseline hazard rate at time $t$. The log of the integrated hazard is then:

$$\ln(\Lambda(t)) = \ln(\Lambda_0(t)) + X\beta \qquad (2)$$

where $\Lambda_0$ is the baseline integrated hazard. Defining $\varepsilon$ as $-\ln(\Lambda(t))$, one now obtains from (2):

$$-\ln(\Lambda_0(t)) = X\beta + \varepsilon$$

where $\varepsilon$ has an extreme-value distribution.⁶

5 The rank of the index $X_i\beta$ is invariant to a shift in the constant term, and to a (positive) multiplicative factor.
6 The integrated hazard is, regardless of the functional form of $\lambda_0(t)$, a random variable with a standard exponential distribution (see Lancaster, 1990); and minus the logarithm of an exponential random variable has an Extreme-Value Type 1 distribution.

One then gets:

$$-t = -\Lambda_0^{-1}\left(\exp(-X\beta - \varepsilon)\right)$$

Since $\Lambda_0$ is strictly increasing, minus time is an increasing function of $X\beta$, and the Monotone Rank Estimator can thus be applied. Moreover, Abrevaya and Hausman (1999) show that the estimator is also applicable in the presence of random right censoring and of unobserved heterogeneity. Abrevaya and Hausman (1999) also show that the Monotone Rank Estimator stays consistent when the dependent variable $y$ is measured with errors, without the need to specify a particular form for the measurement error. Denoting by $y^*$ the latent dependent variable, and by $y$ the observed dependent variable, the Monotone Rank Estimator stays consistent if:

$$E[M(y) \mid y^*] \text{ is an increasing function of } y^*$$

Since the objective function in equation (1) is not differentiable, standard maximisation techniques cannot be applied. Following Cavanagh and Sherman (1998, p. 36), the estimation is performed using the Nelder-Mead (Nelder and Mead, 1965) “amoeba” algorithm,⁷ which requires only function evaluations, not derivatives.

The parameters of the Monotone Rank Estimator are only identified up to scale (in this application, we normalize the parameter vector such that $\|\hat{\beta}\| = 1$), without a constant term, and do not provide an estimate of the shape of the baseline hazard. The estimator is, however, robust to measurement errors in the recorded durations, which makes it attractive for our problem.

To assess the robustness of the Monotone Rank Estimator to measurement errors, Terracol (2002b)⁸ has performed Monte-Carlo experiments with a variety of measurement errors that are likely to appear in eligibility spells data. The distributions of the Monotone Rank Estimator parameters are compared with those of the Cox model for survival data. Results show that the (normalized) MRE parameters are always unbiased, while the (normalized) Cox parameters can be severely biased, depending on the type of measurement errors considered.

5.2.2 A semi-parametric model with unobserved heterogeneity

To evaluate the impact of measurement errors on the estimated parameters, we wish to compare our Monotone Rank Estimator parameters to those of a more classical duration model. We thus estimate a semi-parametric proportional hazard model with unobserved heterogeneity (Meyer, 1990; Han and Hausman, 1990). We start by specifying a proportional hazard model:

$$\lambda(t) = \exp(X\beta)\, \lambda_0(t)$$

from which we obtain

$$\Pr[T \in [t, t+1] \mid T \geq t] = 1 - \exp[-\exp(X\beta + \gamma_t)]$$

where $\gamma_t = \ln \int_t^{t+1} \lambda_0(u)\, du$.

7 The Nelder-Mead algorithm can be found online in C and Fortran at Numerical Recipes (www.nr.com); Jason Abrevaya proposes Gauss code for the Monotone Rank Estimator at http://gsbwww.uchicago.edu/fac/jason.abrevaya/research/rankest.prg; we performed the estimation using Stata 7.0.
8 Available upon request.

Following Heckman and Singer (1984), we add an unobserved heterogeneity term $\delta$ such that the hazard function for individual $i$ is now written:

$$\lambda_i(t, \delta_i) = 1 - \exp[-\exp(X_i\beta + \gamma_t + \delta_i)]$$

and the survival function is:

$$S(t, \delta_i) = \prod_{s=1}^{t-1} \exp[-\exp(X_i\beta + \gamma_s + \delta_i)]$$

We assume that the heterogeneity term $\delta$ is distributed as a discrete random variable with $K$ points of support and associated probabilities $P_k$. The likelihood for individual $i$ becomes:

$$l_i = \sum_{k=1}^{K} P_k \left[\lambda(t, \delta_k)\, S(t, \delta_k)\right]^{c_i} \left[S(t, \delta_k)\right]^{1-c_i}$$

where $c_i$ is a dummy variable taking the value 1 if the individual fails, and 0 otherwise; and where $\sum_{k=1}^{K} P_k = 1$. The log-likelihood of the data is thus written:

$$LL = \sum_{i=1}^{N} \ln(l_i)$$

This model was estimated using the algorithm described in Baker and Melino (2000).
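The likelihood above can be evaluated directly once the index, the baseline terms and the mass points are given. The following sketch is only meant to show how the pieces fit together; the function names and the data layout are assumptions, and it does not reproduce the estimation algorithm of Baker and Melino (2000).

```python
import math


def hazard(xb, gamma_t, delta):
    """Discrete-time hazard 1 - exp(-exp(X beta + gamma_t + delta))."""
    return 1.0 - math.exp(-math.exp(xb + gamma_t + delta))


def survival(xb, gammas, delta, t):
    """Probability of surviving periods 1, ..., t-1 (gammas[s-1] is gamma_s)."""
    return math.prod(1.0 - hazard(xb, gammas[s - 1], delta) for s in range(1, t))


def individual_likelihood(xb, gammas, t, failed, support, probs):
    """Mixture over the K mass points of the heterogeneity distribution."""
    total = 0.0
    for delta, p in zip(support, probs):
        s = survival(xb, gammas, delta, t)
        # complete spell: survive to t-1, then exit in period t;
        # censored spell: only the survival term enters
        total += p * (hazard(xb, gammas[t - 1], delta) * s if failed else s)
    return total


def log_likelihood(data, gammas, support, probs):
    """data: iterable of (xb, t, failed) with xb the index X_i beta."""
    return sum(math.log(individual_likelihood(xb, gammas, t, failed, support, probs))
               for xb, t, failed in data)
```

In practice the mass points and probabilities are estimated jointly with β and the baseline parameters, with the probabilities constrained to sum to one.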

5.3 Model specification and results

The variables used in this analysis reflect the socio-economic characteristics of households: the mean age of adults, their education level, the number and age of children, nationality, the regional unemployment rate and so on. We also control for the maximum benefit corresponding to the household structure.⁹ Controls for the city size and the geographical area are also included.

The semi-parametric model presented in section 5.2.2 involves the estimation of the baseline hazard, which we have chosen to specify as a quadratic function of time. Two points of support are specified for the distribution of the unobserved heterogeneity term.

The Monotone Rank Estimator has been estimated using the “identity” function for $M(\cdot)$ (see equation (1)). Following Abrevaya and Hausman (1999), we have used 200 bootstrap replications to construct standard errors and confidence intervals.

Table 4 presents our results for the different models. Model (1) is the semi-parametric duration model of section 5.2.2; the second column contains the normalized¹⁰ parameters and standard errors of model (1), to be compared with the Monotone Rank Estimator parameters presented in the last column.

9 It is not the benefit that the household is entitled to, which is defined as the difference between the maximum benefit and the household’s income.
10 The parameter vector is normalized such that $\|\hat{\beta}\| = 1$. To get results that can be compared to those of the Monotone Rank Estimator, only the parameters that are also included in the MRE estimation have been used for the normalization; we have thus excluded the baseline hazard coefficients (time and time squared) and the unobserved heterogeneity parameters.

Table 4: Estimation results

Variable                      Model (1)           Model (1) normalized   MRE
                              Coef. (Std. Err.)   Coef. (Std. Err.)      Coef. (boot. S.E.) [boot. 95% C.I.]

Baseline hazard
Time                          -0.080† (0.042)     -                      -
Time squared                  0.001† (0.001)      -                      -

Socio-economic characteristics
Mean age of adults            -0.051 (0.039)      -0.009 (0.009)         -0.041 (0.007) [-0.054, -0.027]
(Mean age of adults)²         0.050 (0.046)       0.009 (0.010)          0.047 (0.009) [0.030, 0.063]
Number of adults              2.705 (8.997)       0.480 (1.246)          0.405 (0.076) [0.261, 0.549]
(Number of adults)²           0.336 (3.011)       0.059 (0.577)          0.094 (0.019) [0.058, 0.133]
Number of children            2.285∗∗ (0.556)     0.405 (0.304)          0.216 (0.048) [0.149, 0.329]
(Number of children)²         0.116∗ (0.055)      0.020 (0.018)          0.062 (0.016) [0.028, 0.089]
Regional unemployment rate    -0.036 (0.036)      -0.006 (0.008)         -0.005 (0.005) [-0.019, 0.002]
French                        0.127 (0.256)       0.022 (0.048)          -0.112 (0.084) [-0.237, 0.106]
Disabled                      0.180 (0.163)       0.032 (0.038)          0.011 (0.010) [-0.0138, 0.029]
Self-employed worker          -0.599 (0.405)      -0.106 (0.104)         -0.631 (0.321) [-0.832, 0.294]
Max. benefit                  -3.281∗∗ (0.631)    -0.582 (0.434)         -0.491 (0.088) [-0.643, -0.331]

Education (ref: none or primary)
High school                   0.189 (0.244)       0.033 (0.051)          0.118 (0.044) [0.028, 0.201]
Technical school              0.047 (0.237)       0.008 (0.042)          0.072 (0.036) [0.002, 0.143]
University                    0.006 (0.274)       0.001 (0.048)          0.137 (0.050) [0.052, 0.233]

City size (ref: Paris)
Rural area                    -0.973† (0.532)     -0.172 (0.148)         -0.102 (0.047) [-0.194, -0.008]
<= 20000                      -1.146∗ (0.540)     -0.203 (0.169)         -0.125 (0.052) [-0.236, -0.028]
> 20000 & <= 100000           -0.687 (0.519)      -0.122 (0.119)         -0.003 (0.002) [-0.009, -0.001]
> 100000                      -1.248∗ (0.539)     -0.221 (0.177)         -0.138 (0.053) [-0.231, -0.026]

Region (ref: Paris & suburbs)
Parisian area                 0.337 (0.513)       0.059 (0.096)          -0.093 (0.049) [-0.193, 0.005]
North                         0.531 (0.614)       0.094 (0.123)          -0.034 (0.031) [-0.097, 0.046]
East                          0.564 (0.519)       0.100 (0.109)          -0.151 (0.084) [-0.346, -0.004]
West                          0.952† (0.514)      0.168 (0.146)          0.052 (0.047) [-0.023, 0.166]
South-west                    0.797 (0.509)       0.141 (0.130)          -0.020 (0.021) [-0.051, 0.041]
Center-east                   0.879† (0.513)      0.156 (0.138)          -0.037 (0.036) [-0.089, 0.068]
Mediterranean area            0.733 (0.552)       0.130 (0.129)          -0.005 (0.004) [-0.015, 0.001]

Heterogeneity parameters
Group 1: Intercept            6.002 (6.086)       -                      -
Group 1: Probability          0.105 (0.309)       -                      -
Group 2: Intercept            4.174 (6.086)       -                      -
Group 2: Probability          0.895∗∗ (0.309)     -                      -

N                             590                 590                    590
Log-likelihood                -1157.98            -                      -
χ²(27)                        176.272

Significance levels: † : 10%   ∗ : 5%   ∗∗ : 1%

Model (1), which we suspect to be biased due to mismeasured spell durations, gives mostly insignificant results, and some results that are counter-intuitive, such as the very small impact of higher education (technical school and university) compared to that of having a high-school level. The estimated baseline hazard, however, exhibits negative duration dependence, which is consistent with intuition and with the Kaplan-Meier graph of Figure 2.

We now turn to the interpretation of the Monotone Rank Estimator parameters (shown in the last column of Table 4), the model which we believe is best suited to the analysis of our data. The mean age of adults decreases the hazard rate, and this decrease slows down as they get older. It reflects the difficulty older people have in getting out of poverty and/or unemployment traps, perhaps because of the decline in their human capital. On the contrary, the more adults are present in the household, the higher the hazard rate: being a couple increases the chances of someone in the household finding a job. The same goes for the number of children, but obviously not for the same reasons: having children will prompt parents to exit quickly from eligibility.

Our education variables indicate that higher education increases the rate of exit from poverty. However, having attended a technical school seems to have quite a small impact on the exit rate compared to general high school or having attended university (which gives the best chances of exiting eligibility spells). The regional unemployment rate, French nationality, disability and self-employed worker variables are not significant.

All city size dummies are negative and significant, and show that living in Paris greatly increases the chances of exiting the eligibility spell, especially compared with small cities, which might not benefit from a dynamic labor market, or large cities, which might suffer from the attractiveness of the French capital. Living in a medium-sized city, however, seems to reduce the hazard rate only by a small amount. Regional dummies are not significant, except for the dummy for the east of France, which has a negative impact on the hazard rate. Finally, and not surprisingly, the maximum benefit to which the household could be entitled has a strong negative impact on the hazard rate.

6 Participation spells

Contrary to eligibility spells, participation spells have been widely analyzed in the literature. The empirical methodology usually consisted of estimating a duration model on a sample of observed participation spells. These studies implicitly assumed that the sample of observed spells was a random sample of the population targeted by the different welfare programs. However, many studies¹¹ have highlighted the fact that a certain fraction of eligible individuals choose not to participate in welfare programs. In a previous study, Terracol (2002a) has estimated the determinants of take-up in France, using data from the European Community Household Panel. Results have shown that about 35% of eligible units would choose not to participate in the RMI program. The reasons for non take-up appeared to be related to social stigma and household structure. An implication of these findings is that the sample of observed welfare program participants is a self-selected sub-sample of the target population, which calls for a proper empirical methodology to analyse spells of welfare receipt.

6.1 Data

To analyse welfare use spells and the selection into program participation, we use observations for which eligibility has been assessed for at least 3 months, whether or not program participation is observed. We drop left-censored participation spells and observations with missing values. Our final dataset consists of 972 observations, of which 191 are participation spells (ongoing spells at 48 months are censored). Figure 3 shows the Kaplan-Meier survival estimates for those spells. Tables 5 and 6 give summary statistics for the variables used in the participation and selection equations (see Section 6.2).

11 See Ashenfelter (1983), Moffitt (1983), Blank and Ruggles (1996), Anderson and Meyer (1997) and Bollinger and David (2001) for the United States; Blundell et al. (1988), Duclos (1995, 1997) and Bramley et al. (2000) for the United Kingdom; Riphahn (2001) for Germany.


Table 5: Summary statistics, participation equation
Variable                            Mean        Std. Dev.
Mean age of adults                  35.432      10.342
Number of adults                    1.064       0.356
Number of children below 3          0.082       0.308
Number of children between 3 & 6    0.112       0.3
Number of children over 6           0.544       0.930
No or primary education             0.168       0.374
High-school                         0.241       0.429
Technical school                    0.346       0.477
University                          0.246       0.432
Local unemployment rate             12.703      2.721
Entitlement                         1297.449    815.473
French                              0.921       0.27
Disabled                            0.262       0.441
Self-employed                       0.052       0.223
N                                   191

Table 6: Summary statistics, selection equation
Variable                            Mean        Std. Dev.
Mean age of adults                  35.514      13.347
Number of adults                    1.243       0.417
Number of children below 3          0.142       0.427
Number of children between 3 & 6    0.086       0.25
Number of children over 6           0.528       1.058
No or primary education             0.149       0.356
High-school                         0.17        0.376
Technical school                    0.348       0.476
University                          0.333       0.472
Local unemployment rate             12.037      2.522
Entitlement                         1165.124    790.725
French                              0.917       0.277
Disabled                            0.205       0.404
Self-employed                       0.142       0.349
N                                   972

Figure 3: Kaplan-Meier survival estimates for the participation spells (horizontal axis: spell duration in months)

6.2 Econometric models

When analyzing welfare spells, researchers have overlooked the fact that some individuals eligible for welfare receipt will choose not to participate. Non take-up arises because welfare receipt induces costs that can originate from the difficulty of obtaining information on welfare programs, from the complexity of these programs, or from psychological (stigma) effects (Moffitt, 1983). In France, Terracol (2002a) has estimated a rate of non take-up of 35% for the RMI program. This non take-up behavior implies that the population of welfare recipients is a self-selected sub-population of the eligible households, which are the target population of welfare programs. This self-selection calls for an appropriate statistical modelling of the durations of welfare spells that corrects for a possible selection bias along the lines of Heckman (1979).

In Section 6.2.1, we first estimate a semi-parametric proportional hazard duration model with selectivity by adapting a technique used by McCall (1996) and van den Berg et al. (2000). Further, and to account for a possible under-declaration of welfare participation in our dataset, Section 6.2.2 presents a parametric double-hurdle model of welfare durations that fits in the general framework for estimating duration models with selectivity proposed by Prieger (2002).

6.2.1 A semi-parametric duration model with selectivity

We start with a mixed proportional hazard model where the hazard is written as:

$$\lambda(t \mid X, \delta_d) = \lambda_0(t) \exp(X\beta + \delta_d)$$

where $\lambda_0(t)$ is the duration dependence, and where $\delta_d$ represents the unobserved characteristics affecting the hazard rate. As in Section 5.2.2, the hazard rate at time $t$ for an individual $i$ with unobserved characteristics $\delta_{di}$ can be written as:

$$\lambda_i(t \mid X_i, \delta_{di}) = 1 - \exp[-\exp(X_i\beta + \gamma_t + \delta_{di})]$$

and the survival function is:

$$S_i(t \mid X_i, \delta_{di}) = \prod_{s=1}^{t-1} \exp[-\exp(X_i\beta + \gamma_s + \delta_{di})]$$

The selection process is assumed to take the following form: durations are observed if and only if

$$d^* = Z\gamma + \delta_s + \varepsilon > 0$$

where $\varepsilon$ has an EV1 distribution, and where $\delta_s$ represents the unobserved characteristics affecting the selection process. The probability of observing a spell for individual $i$ with unobserved characteristics $\delta_{si}$ is thus:

$$\Pr(d_i^* > 0 \mid Z_i, \delta_{si}) = 1 - \exp[-\exp(Z_i\gamma + \delta_{si})]$$

Let $F(\delta_d, \delta_s)$ be the joint distribution of our unobserved characteristics, and $c_i$ be an indicator variable for complete spells. The likelihood of an observed spell, conditional on $X$ and $Z$ but unconditional on the unobserved characteristics $\delta_d$ and $\delta_s$, can be expressed as:

$$\int_{\delta_d} \int_{\delta_s} \left[\lambda(t \mid X, \delta_d)\, S(t \mid X, \delta_d)\right]^{c_i} \left[S(t \mid X, \delta_d)\right]^{1-c_i} \Pr(d^* > 0 \mid Z, \delta_s)\, dF(\delta_d, \delta_s)$$

The unconditional likelihood of an unobserved spell is simply

$$\int_{\delta_s} \exp[-\exp(Z\gamma + \delta_s)]\, f(\delta_s)\, d\delta_s$$

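where $f(\delta_s)$ is the density function of the unobserved characteristics $\delta_s$. To parameterize our model, we take the joint distribution of $\delta_d$ and $\delta_s$ to be bivariate discrete with two mass points for each term. Let $\delta_{d1}$, $\delta_{d2}$, $\delta_{s1}$ and $\delta_{s2}$ be those points of support; the associated probabilities are defined as follows:

$$\begin{array}{c|cc} & \delta_{d1} & \delta_{d2} \\ \hline \delta_{s1} & P_1 & P_2 \\ \delta_{s2} & P_3 & P_4 \end{array}$$

$$\begin{aligned}
P_1 &= \Pr(\delta_d = \delta_{d1}, \delta_s = \delta_{s1}) \\
P_2 &= \Pr(\delta_d = \delta_{d2}, \delta_s = \delta_{s1}) \\
P_3 &= \Pr(\delta_d = \delta_{d1}, \delta_s = \delta_{s2}) \\
P_4 &= \Pr(\delta_d = \delta_{d2}, \delta_s = \delta_{s2})
\end{aligned}$$

The covariance of the two heterogeneity terms equals

$$\operatorname{cov}(\delta_d, \delta_s) = (P_1 P_4 - P_2 P_3)(\delta_{d1} - \delta_{d2})(\delta_{s1} - \delta_{s2})$$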
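The covariance formula can be checked numerically. The sketch below (function names are hypothetical) compares the closed form with a direct computation of E[δd δs] − E[δd]E[δs] on made-up mass points, and then plugs in the rounded Model B point estimates reported in Table 7. With those rounded values the formula gives about −1.47, close to the −1.479 reported in the text; the small gap is consistent with rounding, since the published probabilities sum to 0.999.

```python
def cov_closed_form(probs, d_pts, s_pts):
    """(P1*P4 - P2*P3) * (d1 - d2) * (s1 - s2), with probs = (P1, P2, P3, P4)."""
    p1, p2, p3, p4 = probs
    return (p1 * p4 - p2 * p3) * (d_pts[0] - d_pts[1]) * (s_pts[0] - s_pts[1])

def cov_direct(probs, d_pts, s_pts):
    """E[d*s] - E[d]*E[s] computed cell by cell from the joint distribution."""
    p1, p2, p3, p4 = probs
    # (delta_d, delta_s, probability), following the definitions of P1..P4
    cells = [(d_pts[0], s_pts[0], p1), (d_pts[1], s_pts[0], p2),
             (d_pts[0], s_pts[1], p3), (d_pts[1], s_pts[1], p4)]
    e_d = sum(p * d for d, s, p in cells)
    e_s = sum(p * s for d, s, p in cells)
    e_ds = sum(p * d * s for d, s, p in cells)
    return e_ds - e_d * e_s

# Hypothetical mass points and probabilities (sum to one), to check the identity.
hyp = ((0.1, 0.5, 0.3, 0.1), (0.4, 4.0), (-3.0, 3.0))

# Rounded Model B estimates from Table 7 (P1 = P4 = 0).
model_b = cov_closed_form((0.0, 0.932, 0.067, 0.0), (0.367, 3.978), (-3.296, 3.221))
```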

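Let $l_i$, $i = 1, 2, 3, 4$, be the likelihood of an observed spell, conditional on $X$, $Z$ and on values of $\delta_d$ and $\delta_s$:

$$\begin{aligned}
l_1 &= \left[\lambda(t \mid X, \delta_{d1})\, S(t \mid X, \delta_{d1})\right]^{c_i} \left[S(t \mid X, \delta_{d1})\right]^{1-c_i} \Pr(d^* > 0 \mid Z, \delta_{s1}) \\
&\;\;\vdots \\
l_4 &= \left[\lambda(t \mid X, \delta_{d2})\, S(t \mid X, \delta_{d2})\right]^{c_i} \left[S(t \mid X, \delta_{d2})\right]^{1-c_i} \Pr(d^* > 0 \mid Z, \delta_{s2})
\end{aligned}$$

Accordingly, let $l_5$ and $l_6$ be the likelihoods for unobserved spells, conditional on $Z$ and values of $\delta_s$:

$$l_5 = \exp\left[-\exp\left(Z\gamma + \delta_{s1}\right)\right], \qquad l_6 = \exp\left[-\exp\left(Z\gamma + \delta_{s2}\right)\right]$$

The final log-likelihood of our data, integrating out the unobserved heterogeneity terms, can thus be written as:

$$LL = \sum_{\text{observed}} \ln\left[\sum_{i=1}^{4} P_i\, l_i\right] + \sum_{\text{unobserved}} \ln\left[(P_1 + P_2)\, l_5 + (P_3 + P_4)\, l_6\right] \tag{3}$$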
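Equation (3) can be sketched in a few lines of code. This is an illustration under stated assumptions, not the estimation code used in the paper: the function names and the data layout (tuples of spell length, completion indicator and linear indices) are hypothetical, and the duration dependence is passed as a list of period-specific terms rather than as the fourth-order polynomial in log-time used in the paper.

```python
import math

def hazard(t, xb, gamma, delta):
    """Discrete-time hazard in period t (gamma[t-1] plays the role of gamma_t)."""
    return 1.0 - math.exp(-math.exp(xb + gamma[t - 1] + delta))

def survival(t, xb, gamma, delta):
    """Product over periods 1..t-1 of exp(-exp(.))."""
    return math.exp(-sum(math.exp(xb + gamma[s - 1] + delta) for s in range(1, t)))

def p_select(zg, delta_s):
    """Pr(d* > 0 | Z, delta_s) = 1 - exp(-exp(Z*gamma + delta_s))."""
    return 1.0 - math.exp(-math.exp(zg + delta_s))

def loglik(observed, unobserved, gamma, d_pts, s_pts, probs):
    """observed: list of (t, c, xb, zg); unobserved: list of zg."""
    p1, p2, p3, p4 = probs
    # (delta_d, delta_s, P_i) for l1..l4, following the definitions of P1..P4
    cells = [(d_pts[0], s_pts[0], p1), (d_pts[1], s_pts[0], p2),
             (d_pts[0], s_pts[1], p3), (d_pts[1], s_pts[1], p4)]
    ll = 0.0
    for t, c, xb, zg in observed:
        mix = 0.0
        for dd, ds, p in cells:
            s_t = survival(t, xb, gamma, dd)
            dur = hazard(t, xb, gamma, dd) * s_t if c else s_t
            mix += p * dur * p_select(zg, ds)
        ll += math.log(mix)
    for zg in unobserved:
        l5 = 1.0 - p_select(zg, s_pts[0])   # exp(-exp(Z*gamma + delta_s1))
        l6 = 1.0 - p_select(zg, s_pts[1])   # exp(-exp(Z*gamma + delta_s2))
        ll += math.log((p1 + p2) * l5 + (p3 + p4) * l6)
    return ll

# Hypothetical data: one complete spell of 3 periods and one unobserved spell.
gamma = [-2.0, -1.5, -1.0]
obs, unobs = [(3, 1, 0.1, 0.3)], [0.3]
ll_degenerate = loglik(obs, unobs, gamma, (0.0, 0.0), (0.0, 0.0), (0.25, 0.25, 0.25, 0.25))
```

When both mass points coincide, as in the last line, the mixture collapses to the likelihood without unobserved heterogeneity, which gives a simple sanity check.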

The model specification we have chosen uses a fourth-order polynomial in log-time for the duration dependence, which allows for sufficient flexibility. The variables affecting the hazard rate are the mean age of adults, dummies for their education level, household composition (number of adults and of children), and the log of the per capita benefit entitlement. Macro-economic conditions are taken into account via the unemployment rate in the administrative district [12]. Because the selection model is better identified with an exclusion restriction, a variable that explains the selection process but not the duration of the participation spell must be included in the selection equation. Such a variable would represent a fixed take-up cost that can prevent program participation but that is neutral with respect to the spell's length. Social stigma appears to be a good candidate for this variable. Based on the model of Besley and Coate (1992), we use the percentage of RMI recipients in the individual's geographical area as a proxy for social stigma. Because we also control for the local unemployment rate in the selection equation, and to avoid any collinearity effects, the variable included is the log of the ratio of the percentage of RMI recipients to the local unemployment rate. A second identification variable included in the selection equation is a dummy indicating whether the household head was previously self-employed. Indeed, Commissariat Général du Plan (2000) notes that self-employed individuals tend to be less informed about the RMI program and might think that only wage earners are eligible. Table 7 presents the results for a regular semi-parametric mixed proportional hazard model such as the one described in Section 5.2.2 (Model A), and for our semi-parametric model correcting for the selection bias (Model B).
The probabilities associated with the unobserved characteristics parameters are referred to as Pr1 and Pr2 for Model A, and as P1 to P4 for Model B. The parameter estimates of P1 and P4 are on the boundary of the parameter space, which implies that the unobserved heterogeneity components δd and δs are perfectly correlated (P1 = P4 = 0) [13].

[12] Source: INSEE

Table 7: Semiparametric models

Variable                                   Model A                    Model B
                                           Coefficient  (Std. Err.)   Coefficient  (Std. Err.)

Duration equation

Time parameters
  ln(time)                                  2.933       (1.877)        2.871†      (1.726)
  ln(time)^2                               -3.338       (2.382)       -3.425†      (2.009)
  ln(time)^3                                1.755†      (1.055)        1.593†      (0.853)
  ln(time)^4                               -0.283†      (0.149)       -0.237∗      (0.119)

Socio-economic characteristics
  Mean age of adults                       -0.066∗      (0.028)        0.041       (0.033)
  Age*education                            -0.007       (0.011)       -0.016       (0.012)
  Number of adults                          0.832†      (0.443)        1.683∗      (0.823)
  Number of children                       -0.703∗∗     (0.248)       -0.510∗∗     (0.185)
  Unemployment rate                        -0.144∗      (0.062)       -0.167∗∗     (0.057)
  ln(Entitlement)                          -0.644∗∗     (0.161)       -1.059∗∗     (0.204)

Education (ref: none or primary)
  High school                               0.221       (0.746)        1.255†      (0.758)
  Technical school                         -0.724       (1.022)        1.811       (1.116)
  University                                0.052       (1.664)        3.626∗      (1.725)

Selection equation

Socio-economic characteristics
  Mean age of adults                        -                         -0.100∗∗     (0.026)
  Age*education                             -                          0.039∗∗     (0.010)
  Number of adults                          -                         -2.223∗∗     (0.814)
  Number of children below 3                -                          0.176       (0.379)
  Number of children between 3 & 6          -                          0.609       (0.462)
  Number of children over 6                 -                          0.237∗      (0.105)
  Unemployment rate                         -                          0.158∗∗     (0.044)
  ln(Entitlement)                           -                          0.803∗∗     (0.171)
  ln(Local RMI recipiency rate /
     Local unemployment rate)               -                          0.425†      (0.254)
  Self-employed                             -                         -1.554∗∗     (0.582)

Education (ref: none or primary)
  High school                               -                         -2.040∗∗     (0.584)
  Technical school                          -                         -4.046∗∗     (0.896)
  University                                -                         -5.989∗∗     (1.162)

Heterogeneity terms
  δd1                                       2.161       (1.805)        0.367       (2.017)
  δd2                                       5.942∗∗     (2.078)        3.978∗      (1.996)
  δs1                                       -                         -3.296       (2.247)
  δs2                                       -                          3.221       (2.650)
  Pr1                                       0.722∗∗     (0.057)        -
  Pr2                                       0.277∗∗     (0.057)        -
  P2                                        -                          0.932∗∗     (0.011)
  P3                                        -                          0.067∗∗     (0.011)

N                                           191                        972
Log-likelihood                             -409.760                   -835.166
χ²(df)                                      40.78                      150.97

Significance levels: † : 10%   ∗ : 5%   ∗∗ : 1%

[13] Such a feature is also found in van den Berg et al. (2000).

As expected, the estimated covariance between δd and δs is negative (cov(δd, δs) = −1.479) and is significant at the 1% level (standard error: 0.526), meaning that the unobserved characteristics increasing the take-up of the benefit are negatively correlated with the unobserved characteristics increasing the probability of exit from welfare. The selection probability depends positively on the number of children (especially those older than 6), on the local unemployment rate, quite obviously on the entitlement level, and on the stigma variable (recall that our variable is negatively correlated with social stigma as defined by Besley and Coate (1992)). The selection probability decreases with the mean education level and with the number of adults in the household. As expected, the fact that the household head was self-employed significantly decreases the take-up probability. These results tend to show that individuals who expect to exit eligibility fairly quickly (i.e. those with a high education level or with a possible second income source) and who face a looser budget constraint (fewer children, for example) tend to give up their right to the benefit, especially if they face some sort of social stigma or are unsure of their rights.

The estimated parameters of Model B's duration equation show that the hazard rate out of welfare increases with the number of adults and with their education level, both variables being related to the household's ability to increase its income by finding a job. Because children tighten the budget constraint, increase the total entitlement and are time-consuming, the number of children decreases the hazard rate. A higher per capita entitlement also reduces the exit probability. A comparison between the coefficients of Models A and B for the duration equation shows that the impact of the number of adults and of their education level is under-estimated when the selection process is not taken into account. Similarly, the impact of the number of children in the household is over-estimated.

Figure 4 shows three estimated survival functions: the solid line corresponds to Model A; the dashed and dotted lines to Model B. The dashed line shows what the survival function would have been had all eligible individuals taken up the benefit. The dotted line represents the estimated survival function conditional on participation (which is the function of interest for policy-makers). It shows that survival probabilities are under-estimated if the selection process is ignored: the estimated median duration is 20 months when selection is ignored, and rises to 27 months when imperfect take-up is taken into account.

Figure 4: Estimated survival functions, semi-parametric model. Survival functions at the mean values of the covariates; vertical axis: survival probability; horizontal axis: time in months; series: Uncorrected survival, Corrected survival, Corrected survival | participation.

6.2.2 Integrating under-reporting of program participation

The relatively high non-take-up rate observed in our data may result not only from actual non-take-up (i.e. agents actually not participating in the welfare program): part of the observed non-take-up can also be caused by the under-reporting of program participation in the questionnaire [14]. We start with a simple log-normal duration model with selectivity, and then add a second selection process accounting for a possible under-reporting of program participation.

[14] It could also arise from measurement errors on the agents' primary income, leading to an over-estimation of their benefit entitlement (some households that the analyst believes are entitled to the benefit might actually not be eligible, having under-reported their income). See Duclos (1995, 1997) and Terracol (2002a).

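The parametric selection model

Consider a log-normal duration model:

$$\ln(t) = X\beta + \varepsilon$$

Durations are observed if

$$P^* = Z\gamma + \nu > 0$$

with $(\varepsilon, \nu) \sim N(0, 0, \sigma^2, 1, \rho)$, $\rho$ being the correlation coefficient between $\varepsilon$ and $\nu$. Thus,

$$\nu = \frac{\rho}{\sigma}\,\varepsilon + \nu' \quad \text{with } \nu' \perp \varepsilon \text{ and } \operatorname{Var}[\nu'] = 1 - \rho^2$$

The likelihood for complete spells $t^*$ is:

$$\begin{aligned}
l_{complete} &= \Pr\left(\ln(t) = \ln(t^*) \mid P^* > 0\right) \cdot \Pr(P^* > 0) \\
&= \Phi\left(\frac{Z\gamma + \frac{\rho}{\sigma}\left(\ln(t^*) - X\beta\right)}{\sqrt{1 - \rho^2}}\right) \cdot \frac{1}{\sigma} \cdot \phi\left(\frac{\ln(t^*) - X\beta}{\sigma}\right)
\end{aligned}$$

The likelihood for censored spells $t^{**}$ is:

$$\begin{aligned}
l_{censored} &= \Pr\left(\ln(t) > \ln(t^{**}) \mid P^* > 0\right) \cdot \Pr(P^* > 0) \\
&= \Phi_2\left(Z\gamma, \frac{X\beta - \ln(t^{**})}{\sigma}, \rho\right)
\end{aligned}$$

The likelihood for unobserved spells is:

$$l_{unobserved} = \Pr(P^* < 0) = \Phi(-Z\gamma)$$

The log-likelihood of our data can thus be written as:

$$\begin{aligned}
LL ={} & \sum_{\text{complete}} \left[\ln \Phi\left(\frac{Z\gamma + \frac{\rho}{\sigma}\left(\ln(t^*) - X\beta\right)}{\sqrt{1 - \rho^2}}\right) + \ln\left(\frac{1}{\sigma} \cdot \phi\left(\frac{\ln(t^*) - X\beta}{\sigma}\right)\right)\right] \\
& + \sum_{\text{censored}} \ln \Phi_2\left(Z\gamma, \frac{X\beta - \ln(t^{**})}{\sigma}, \rho\right) \\
& + \sum_{\text{unobserved}} \ln \Phi(-Z\gamma)
\end{aligned} \tag{4}$$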
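The three likelihood contributions of equation (4) can be sketched as follows. This is only an illustration, not the estimation code used in the paper: the function names and the data layout are hypothetical, and the bivariate normal CDF is approximated with a simple Simpson rule (truncated at −8, an arbitrary choice) so that the example needs nothing beyond the standard library.

```python
import math

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bvn_cdf(a, b, rho, n=2000):
    """Phi_2(a, b, rho), computed as the integral over x from -8 to a of
    pdf(x) * Phi((b - rho*x) / sqrt(1 - rho^2)), by Simpson's rule (n even)."""
    lo = -8.0
    if a <= lo:
        return 0.0
    h = (a - lo) / n
    s = math.sqrt(1.0 - rho * rho)
    f = lambda x: norm_pdf(x) * norm_cdf((b - rho * x) / s)
    total = f(lo) + f(a)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(lo + k * h)
    return total * h / 3.0

def loglik_selection(complete, censored, unobserved, sigma, rho):
    """complete, censored: lists of (ln_t, xb, zg); unobserved: list of zg."""
    ll = 0.0
    for ln_t, xb, zg in complete:
        e = ln_t - xb
        ll += math.log(norm_cdf((zg + rho / sigma * e) / math.sqrt(1.0 - rho ** 2)))
        ll += math.log(norm_pdf(e / sigma) / sigma)
    for ln_t, xb, zg in censored:
        ll += math.log(bvn_cdf(zg, (xb - ln_t) / sigma, rho))
    for zg in unobserved:
        ll += math.log(norm_cdf(-zg))
    return ll

# Hypothetical data: one complete, one censored and one unobserved spell.
ll0 = loglik_selection([(1.0, 0.8, 0.3)], [(2.0, 1.5, -0.2)], [0.1], sigma=1.2, rho=0.0)
```

With ρ = 0 the selection and duration parts separate, so the result can be compared with the sum of a probit log-likelihood and a censored log-normal log-likelihood.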

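The parametric double-hurdle model

To take the potential under-reporting problem into account in our statistical analysis, we now add a second selection process to the previous model, leading to a “double-hurdle log-normal duration model”: agents first decide whether or not to participate in the welfare program and then, provided that they participate, whether or not participation is reported in the questionnaire. The model setup is as follows:

$$\begin{aligned}
\text{Participation equation:} \quad & P^* = Z\gamma + \nu \\
\text{Declaration equation:} \quad & D^* = W\delta + \eta \\
\text{Duration equation:} \quad & \ln(t) = X\beta + \varepsilon
\end{aligned}$$

where $(\nu, \eta, \varepsilon)$ are jointly normally distributed:

$$\begin{pmatrix} \nu \\ \eta \\ \varepsilon \end{pmatrix} \sim N\left(\begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}, \Sigma\right), \qquad \Sigma = \begin{pmatrix} 1 & \rho_{\nu,\eta} & \sigma\rho_{\nu,\varepsilon} \\ \rho_{\nu,\eta} & 1 & \sigma\rho_{\eta,\varepsilon} \\ \sigma\rho_{\nu,\varepsilon} & \sigma\rho_{\eta,\varepsilon} & \sigma^2 \end{pmatrix}$$

$\Sigma$ being the covariance matrix, where $\rho_{a,b}$ is the correlation coefficient between $a$ and $b$, and where $\sigma$ is the standard deviation of $\varepsilon$. Welfare durations $t$ are observed in our database if and only if the agent participates in the welfare program (i.e. if $P^* > 0$) and reports program participation in the questionnaire (i.e. if $D^* > 0$). Our statistical model thus takes the form of a double-hurdle model. As in the selectivity model presented above, the likelihood of the data can be divided into three parts according to whether or not durations are observed, and to the censoring status of the observed durations. The likelihood of an (observed) complete duration is:

$$l_{complete} = \Pr\left[X\beta + \varepsilon = \ln(t),\; W\delta + \eta > 0,\; Z\gamma + \nu > 0\right]$$

According to Bayes' rule, this probability can be written as the product of marginal and conditional probabilities:

$$l_{complete} = \Pr\left[X\beta + \varepsilon = \ln(t)\right] \cdot \Pr\left[W\delta + \eta > 0,\; Z\gamma + \nu > 0 \mid X\beta + \varepsilon = \ln(t)\right]$$

Using the formulae for conditional means and standard deviations given in Greene (2000, p. 87), $l_{complete}$ can be written as:

$$l_{complete} = \frac{1}{\sigma} \cdot \phi\left(\frac{\ln(t) - X\beta}{\sigma}\right) \cdot \Phi_2\left(\frac{Z\gamma + \mu_\nu^*}{\sigma_\nu^*}, \frac{W\delta + \mu_\eta^*}{\sigma_\eta^*}, \rho_{\nu,\eta}^*\right)$$

where:

$$\begin{aligned}
\mu_\nu^* &= \mu_\nu \mid \varepsilon = \ln(t) - X\beta \;=\; \frac{\rho_{\nu,\varepsilon}}{\sigma} \cdot \left(\ln(t) - X\beta\right) \\
\mu_\eta^* &= \mu_\eta \mid \varepsilon = \ln(t) - X\beta \;=\; \frac{\rho_{\eta,\varepsilon}}{\sigma} \cdot \left(\ln(t) - X\beta\right) \\
\sigma_\nu^* &= \sigma_\nu \mid \varepsilon = \ln(t) - X\beta \;=\; \sqrt{1 - \rho_{\nu,\varepsilon}^2} \\
\sigma_\eta^* &= \sigma_\eta \mid \varepsilon = \ln(t) - X\beta \;=\; \sqrt{1 - \rho_{\eta,\varepsilon}^2} \\
\rho_{\nu,\eta}^* &= \rho_{\nu,\eta} \mid \varepsilon = \ln(t) - X\beta \;=\; \frac{\rho_{\nu,\eta} - \rho_{\nu,\varepsilon} \cdot \rho_{\eta,\varepsilon}}{\sigma_\nu^* \cdot \sigma_\eta^*}
\end{aligned}$$

The likelihood of an (observed) censored spell is:

$$\begin{aligned}
l_{censored} &= \Pr\left[X\beta + \varepsilon > \ln(t),\; W\delta + \eta > 0,\; Z\gamma + \nu > 0\right] \\
&= \Phi_3\left(Z\gamma, W\delta, \frac{X\beta - \ln(t)}{\sigma}, \rho_{\nu,\eta}, \rho_{\nu,\varepsilon}, \rho_{\eta,\varepsilon}\right)
\end{aligned} \tag{5}$$

The trivariate normal CDF of equation (5) is computed using the GHK (Geweke-Hajivassiliou-Keane) smooth recursive simulator (see Gouriéroux and Monfort (1995) and Hajivassiliou et al. (1996) for details, and Appendix B for a quick note on the computation of the GHK simulator). The model is thus fitted using Simulated Maximum Likelihood (SML) techniques. Finally, the likelihood of an unobserved spell is:

$$\begin{aligned}
l_{unobserved} &= \Pr\left[Z\gamma + \nu < 0\right] + \Pr\left[Z\gamma + \nu > 0,\; W\delta + \eta < 0\right] \\
&= \Phi(-Z\gamma) + \Phi_2\left(Z\gamma, -W\delta, -\rho_{\nu,\eta}\right)
\end{aligned}$$

The log-likelihood of our data can thus be written as:

$$\begin{aligned}
LL ={} & \sum_{\text{complete}} \ln\left[\frac{1}{\sigma} \cdot \phi\left(\frac{\ln(t) - X\beta}{\sigma}\right) \cdot \Phi_2\left(\frac{Z\gamma + \mu_\nu^*}{\sigma_\nu^*}, \frac{W\delta + \mu_\eta^*}{\sigma_\eta^*}, \rho_{\nu,\eta}^*\right)\right] \\
& + \sum_{\text{censored}} \ln\left[\Phi_3\left(Z\gamma, W\delta, \frac{X\beta - \ln(t)}{\sigma}, \rho_{\nu,\eta}, \rho_{\nu,\varepsilon}, \rho_{\eta,\varepsilon}\right)\right] \\
& + \sum_{\text{unobserved}} \ln\left[\Phi(-Z\gamma) + \Phi_2\left(Z\gamma, -W\delta, -\rho_{\nu,\eta}\right)\right]
\end{aligned} \tag{6}$$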
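The conditional moments entering the complete-spell contribution can be checked against the generic formulas for a partitioned normal vector applied to Σ. The sketch below is an illustration only; the function names are hypothetical and the numerical values are made up (they are not estimates from the paper).

```python
import math

def conditional_moments(e, sigma, rho_vn, rho_ve, rho_ne):
    """Moments of (nu, eta) given eps = e, with e = ln(t) - X*beta,
    written as in the text: means, standard deviations, correlation."""
    mu_v = rho_ve / sigma * e
    mu_n = rho_ne / sigma * e
    sd_v = math.sqrt(1.0 - rho_ve ** 2)
    sd_n = math.sqrt(1.0 - rho_ne ** 2)
    rho_star = (rho_vn - rho_ve * rho_ne) / (sd_v * sd_n)
    return mu_v, mu_n, sd_v, sd_n, rho_star

def partitioned_normal(e, sigma, rho_vn, rho_ve, rho_ne):
    """Same moments from Sigma: mean = S12 / S22 * e, cov = S11 - S12 * S21 / S22."""
    s_ve, s_ne, s_ee = sigma * rho_ve, sigma * rho_ne, sigma ** 2
    mu_v = s_ve / s_ee * e
    mu_n = s_ne / s_ee * e
    var_v = 1.0 - s_ve ** 2 / s_ee
    var_n = 1.0 - s_ne ** 2 / s_ee
    cov_vn = rho_vn - s_ve * s_ne / s_ee
    return mu_v, mu_n, math.sqrt(var_v), math.sqrt(var_n), cov_vn / math.sqrt(var_v * var_n)

# Made-up values: residual e, sigma, rho(nu,eta), rho(nu,eps), rho(eta,eps).
args = (0.4, 1.4, -0.6, 0.6, -0.5)
direct = conditional_moments(*args)
generic = partitioned_normal(*args)
```

Both routes return the same five numbers, which confirms that σ cancels out of the conditional standard deviations and of the conditional correlation.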

Table 8 presents the results from a regular log-normal duration model (column C), estimated on the sub-sample of observed spells, our log-normal survival model correcting for the potential selectivity bias (column D), and our “double-hurdle” log-normal duration model (column E). Model E was estimated using 200 replications for the GHK simulator. As in Section 6.2.1, the duration equation includes traditional socio-economic characteristics such as age, education, household composition, nationality, etc. Macro effects are controlled for via the local unemployment rate and regional dummies. The parametric form of the models allows us to include more covariates in the duration, participation and declaration equations than with the semi-parametric models. We have thus included various regional and city-size dummies, and split the education variable into 3 dummies. The declaration equation of Model E contains the mean education level of adults and two identifying variables: a proxy for interview reliability and the household size. The reliability variable is the proportion of adults in the household who answered the questionnaire themselves (in a two-adult household, the variable equals 0.5 if one adult answered both questionnaires). The household size is included because, in a larger household, it is easier to forget to report benefit receipt.

Table 8: Parametric models

Variable

Mean age of adults

Model C Coefficient

Model D Coefficient

Model E Coefficient

(Std. Err.)

(Std. Err.)

(Std. Err.)

Duration equation Socio-economic characteristics 0.005 -0.021 (0.023)

(0.028)

(0.029)

0.015†

0.021∗

0.020∗

(0.009)

(0.010)

(0.010)

-0.469

-0.773†

-0.378

(0.347)

(0.437)

(0.400)

0.849∗

0.817∗

1.271∗∗

(0.385)

(0.392)

(0.487)

-0.106

-0.059

0.417

(0.366)

(0.377)

(0.494)

0.251†

0.299†

0.743∗

(0.148)

(0.156)

(0.359)

0.113∗

0.152∗

0.141∗

(0.057)

(0.066)

(0.065)

0.415∗∗

0.564∗∗

0.572∗∗

(0.127)

(0.160)

(0.179)

-0.481

-0.461

-0.450

(0.439)

(0.464)

(0.459)

0.234

0.368

0.360

(0.237)

(0.262)

(0.270)

Age*education Number of adults Number of children below 3 Number of children between 3 & 6 Number of children over 6 Local unemployment rate Entitlement French Disabled

High school

Education (ref: none or primary) -0.884 -1.246∗

Parisian area

-1.058†

(0.568)

(0.626)

(0.620)

-1.270

-2.066∗

-1.633†

(0.836)

(0.979)

(0.941)

-2.147†

-3.649∗

-2.898†

(1.265)

(1.565)

(1.498)

Region (ref: Paris & suburbs) 0.801∗ 0.803∗

0.842∗

Technical school University

0.026

(0.387)

(0.406)

(0.399)


0.163

0.251

0.265

(0.513)

(0.535)

(0.531)

0.919†

0.867†

0.919†

(0.477)

(0.497)

(0.487)

North East West South-west Center-east Mediterranean area Intercept

Mean age of adults

0.178

0.200

0.272

(0.419)

(0.437)

(0.435)

0.747†

0.679

0.749†

(0.429)

(0.445)

(0.442)

0.468

0.543

0.556

(0.465)

(0.487)

(0.478)

-1.459

-2.880

-3.379†

(1.490)

(1.778)

(2.034)

-

Number of children below 3

-

Number of children between 3 & 6

-

Number of children over 6

-

Local unemployment rate Local RMI recipiency rate Local unemployment rate

0.649 (0.401)

-

Number of adults

ln

0.613 (0.408)

Participation equation Socio-economic characteristics -0.054∗∗

Age*education



0.601 (0.387)

-



-

-0.050∗∗

(0.014)

(0.016)

0.019∗∗

0.018∗∗

(0.005)

(0.006)

-0.493∗∗

-0.199

(0.167)

(0.271)

0.049

0.584∗

(0.155)

(0.268)

0.252

0.873∗

(0.228)

(0.418)

0.066

0.663∗∗

(0.058)

(0.133)

0.077∗∗

0.085∗∗

(0.024)

(0.029)

0.168

0.182†

(0.119)

(0.105)


-

0.244∗∗

0.316∗∗

(0.061)

(0.079)

Entitlement French

-

Disabled

-

Self employed

High school

-

-

University

City size (ref: Paris) -

≤ 20000

-

> 20000 & ≤ 100000

-

> 100000

-

Intercept

Mean education level

0.179 (0.220)

0.205†

0.339∗

(0.121)

(0.143)

-0.644∗∗

-0.660∗∗

(0.186)

(0.200)

Education (ref: none or primary) -1.057∗∗

Technical school

Rural area

0.113 (0.189)

Declaration equation -

-0.905∗

(0.347)

(0.402)

-1.981∗∗

-1.609∗∗

(0.518)

(0.607)

-2.957∗∗

-2.352∗∗

(0.655)

(0.778)

-0.150

-0.177

(0.199)

(0.214)

0.154

0.192

(0.200)

(0.212)

-0.115

-0.130

(0.208)

(0.232)

-0.184

-0.228

(0.186)

(0.206)

-0.858

-2.173†

(0.978)

(1.148)

-

-0.218 (0.146)

Interview reliability

-

-

1.224∗∗ (0.421)

Household size

-

-

-0.575∗∗ (0.114)


-

-

1.635∗

Intercept

(0.667)

σ

1.149∗∗

1.421∗∗

1.415∗∗

(0.083)

(0.323)

(0.338)

-

0.668∗∗

0.642∗

(0.257)

(0.294)

-

-0.636∗

ρν,ε

-

ρν,η

(0.271)

-

ρη,ε

-

-0.561∗ (0.267)

N Log-likelihood 2 χ(dl) Significance levels :

191 -208.878 42.05 † : 10%

∗ : 5%

972 -637.056 130.51

972 -613.244 132.51

∗∗ : 1%

While the semi-parametric models of Section 6.2.1 are of the proportional hazard form, the parametric models presented in this section are of the accelerated failure time form. Thus, a variable with a positive (negative) coefficient is interpreted as increasing (decreasing) the spell length, whereas a positive coefficient in the models of Section 6.2.1 is interpreted as increasing the hazard rate, thus decreasing the time to exit.

The correlation coefficient between the error terms of the duration and participation equations is positive and statistically significant in both Model D and Model E, implying that, as expected, the unobserved characteristics leading to a greater take-up probability are positively correlated with the unobserved characteristics leading to a longer welfare spell: individuals more likely to participate in the welfare program tend to have longer spell durations. Quite surprisingly, the correlation coefficients between the duration and declaration equations, and between the participation and declaration equations, are both negative and statistically significant. These negative coefficients show that (would-be) long-term welfare recipients have a higher probability of concealing their participation in the RMI program.

Coefficients of the participation equation of Model D are similar to those of the selection equation in Model B [15]. Model E's participation equation shows that the impact of the number of adults is smaller than when potential under-declaration of welfare participation is not taken into account. Conversely, the impact of the age and number of children in the household appears to be under-estimated in Model D. Likewise, the education level seems to have a weaker (but still significant) effect on the participation process when the declaration equation is included in the model. These features can be explained by looking at the estimated coefficients of the declaration equation: a higher education level appears to decrease the declaration probability. Highly educated people participate less frequently in welfare programs, but because they also tend to conceal their participation, the impact of education is over-estimated in Model D. The same line of reasoning can be applied to the household size. The interview reliability variable, one of our identifying variables, has the expected positive sign and is significant at the 1% level. This could be interpreted as welfare recipients hiding their participation from their relatives.

[15] Recall that Model B's selection equation was parameterized using an EV1 error term, while Models D and E use a normally distributed error term; the coefficients cannot be directly compared.

Turning now to the duration equations, the estimated coefficients show that the number and age of children present in the household increase welfare spell duration, and that a higher education level decreases durations. A comparison between the education coefficients of Models C, D and E shows that the impact of education is indeed under-estimated when neither selection nor under-declaration is taken into account, but that the selection-only model yields over-estimated coefficients. This is consistent with highly educated individuals having a tendency to conceal welfare participation. Macro-economic conditions do matter: the local unemployment rate has a positive and significant coefficient. Regional dummies show that individuals living in Paris or its suburbs experience shorter welfare spells than those living in the Parisian area (excluding Paris and suburbs), in the East and in the Center-east, where the labour market is not as dynamic.

Figure 5 (selection-only model) and Figure 7 (double-hurdle model) show the estimated hazard functions, computed at the mean values of the covariates. The solid lines (“Uncorrected hazard”) in both figures represent the estimated hazard rate for Model C, while the dashed lines (“Corrected hazard | participation”) show the estimated hazard rate conditional on the individual participating in the welfare program. Because the correlation between the error terms is positive, the conditional hazard rates are in both cases substantially lower than the one obtained with Model C. Likewise, Figure 6 (selection-only model) and Figure 8 (double-hurdle model) each show three estimated survival functions: the solid line corresponds to Model C; the dashed and dotted lines to Model D (Fig. 6) or Model E (Fig. 8). The dashed line shows what the survival function would have been had the selection process been non-existent (or random). The dotted line shows the estimated survival function conditional on participation (which is the function of interest for policy-makers). Survival probabilities are under-estimated if selection is ignored: the estimated median duration is 20 months when selection is ignored, and rises to 26 months when imperfect take-up is taken into account (Fig. 6), and to 30 months when both imperfect take-up and imperfect reporting are taken into account (Fig. 8).

Figure 5: Estimated hazard rates, selection model. Hazard functions at the mean values of the covariates; vertical axis: hazard; horizontal axis: time in months; series: Uncorrected hazard, Corrected hazard | participation.

Figure 6: Estimated survival functions, selection model. Survival functions at the mean values of the covariates; vertical axis: survival probability; horizontal axis: time in months; series: Uncorrected survival, Corrected survival, Corrected survival | participation.

Figure 7: Estimated hazard functions, double-hurdle model. Hazard functions at the mean values of the covariates; vertical axis: hazard; horizontal axis: time in months; series: Uncorrected hazard, Corrected hazard | participation.

Figure 8: Estimated survival functions, double-hurdle model. Survival functions at the mean values of the covariates; vertical axis: survival probability; horizontal axis: time in months; series: Uncorrected survival, Corrected survival, Corrected survival | participation.

7 Conclusion

This article has analyzed the dynamics of eligibility for, and participation in, the RMI program in France using data from the European Community Household Panel, taking into account several features of welfare-related data which have been overlooked in the literature. The first part of our study focused on welfare eligibility spells, for which measurement errors in the income variable are very likely to lead to severe measurement errors in the eligibility duration variable. The use of the Monotone Rank Estimator has allowed us to overcome such problems.

The second part of our paper analyses the duration of welfare spells, focusing on the issue of imperfect take-up of the RMI benefit. Because we suspect the population of RMI recipients to be a self-selected sub-population of the eligible individuals, we have developed a duration model with selectivity in a semi-parametric framework. Further, because under-reporting of welfare participation might also be a concern, a double-hurdle log-normal duration model was estimated, controlling for both self-selection into, and under-reporting of, welfare program participation. Our results show that a significant selectivity process is indeed at work among eligible individuals, and that under-reporting of welfare participation in questionnaire surveys does occur, both processes leading to an under-estimation of the duration of welfare spells if they are not modelled jointly with the duration process.

References A BREVAYA , J. and H AUSMAN , J. A. (1999), “ Semiparametric Estimation with Mismeasured Dependant Variables: An Application to Duration Models for Unemployment Spells ”, Annales d’Économie et de Statistique, no 55-56. A FSA , C. (1999), “ L’allocation de parent isolé : une prestation sous influence. Une analyse de la durée de perception ”, Économie et Prévision, no 137. A FSA , C. and G UILLEMOT, D. (1999), “ Plus de la moitié des sorties du RMI se font grâce à l’emploi ”, INSEE Première, no 632. A NDERSON , P. M. and M EYER , B. D. (1997), “ Unemployment insurance take-up rates and the after-tax value of benefits ”, Quarterly Journal of Economics, vol. 112 no 3. A SHENFELTER , O. (1983), “ Determining participation in income-tested social programs ”, Journal of the American Statistical Association, vol. 78 no 383. BAKER , M. and M ELINO , A. (2000), “ Duration dependence and nonparametric heterogeneity: A Monte Carlo study ”, Journal of Econometrics, vol. 96: pp. 357–393. BANE , M. J. and E LLWOOD , D. (1986), “ Slipping into and out of poverty: the dynamics of spells ”, Journal of Human Resources, vol. 21 no 1. B ERG , G. J., VAN DER K LAAUW, B. and VAN O URS , J. (2000), “ Punitive Sanctions and the Transition Rate from Welfare to Work ”, Discussion Paper no 2447, CEPR.

VAN DEN

B ESLEY, T. and C OATE , S. (1992), “ Understanding welfare stigma: taxpayer resentment and statistical discrimination ”, Journal of Public Economics, vol. 48. B LANK , R. M. (1989), “ Analyzing the length of welfare spells ”, Journal of Public Economics, vol. 39 no 3. B LANK , R. M. and RUGGLES , P. (1996), “ When Do Women Use Aid to Families with Dependant Children and Food Stamps? The Dynamics of Eligibility versus Participation ”, Journal of Human Resources, vol. 31 no 1. B LUNDELL , R., F RY, V. and WALKER , I. (1988), “ Modelling the take-up of means-tested benefits: the case of housing benefits in the United Kingdom ”, The Economic Journal, vol. 98 no 390, supplement. B OLLINGER , C. R. and DAVID , M. H. (2001), “ Estimation With Response Error and Nonresponse: Food-Stamp Participation in the SIPP ”, Journal of Business and Economic Statistics, vol. 19 no 2. B RAMLEY, G., L ANCASTER , S. and G ORDON , D. (2000), “ Benefit Take-up and the Geography of Poverty in Scotland ”, Regional Studies, vol. 34 no 6.

34

C AVANAGH , C. and S HERMAN , R. P. (1998), “ Rank estimators for monotonic index models ”, Journal of Econometrics, vol. 84. C OMMISSARIAT G ÉNÉRAL DU P LAN (2000), Minima sociaux, revenus d’activité, précarité, La Documentation française, Paris, rapport du groupe présidé par Jean-Michel Belorgey. D UCLOS , J.-Y. (1995), “ Modelling the take-up of state support ”, Journal of Public Economics, vol. 58. D UCLOS , J.-Y. (1997), “ Estimating and testing a model of welfare participation: the case of Supplementary Benefits in Britain ”, Economica, vol. 68. D UCLOS , J.-Y., F ORTIN , B., L ACROIX , G. and ROBERGE , H. (1998), “ The Dynamics of Welfare Participation in Québec ”, Université de Laval, miméo. F ITZGERALD , J. (1994), “ A Hazard Model for Welfare Durations with Unobserved LocationSpecific Effects ”, Discussion Paper no 1046-94, Institute for Research on Poverty. G OURIÉROUX , C. and M ONFORT, A. (1995), Simulation Based Econometric Methods, CORE Lectures Series, Oxford University Press. G REENE , W. H. (2000), Econometric Analysis, Prentice-Hall, 4ème édition. G URGAND , M. and M ARGOLIS , D. N. (2002), “ A Multiple State non-Stationary Model of Welfare Exit ”, CREST, mimeo. H AJIVASSILIOU , V., M C FADDEN , D. and RUUD , P. A. (1996), “ Simulation of multivariate normal rectangle probabilities and their derivative: Theoretical and computational results ”, Journal of Econometrics, no 72. H AN , A. and H AUSMAN , J. A. (1990), “ Flexible parametric estimation of duration and competing risk models ”, Journal of Applied Econometrics, vol. 5. H ECKMAN , J. J. (1979), “ Sample selection bias as a specification error ”, Econometrica, vol. 47 no 1. H ECKMAN , J. J. and S INGER , B. (1984), “ A method for minimizing the impact of distributional assumptions in econometric models for duration data ”, Econometrica, vol. 52 no 2. H OYNES , H. and M AC URDY, T. (1994), “ Has the decline in benefits shortened welfare spells? ”, American Economic Review, vol. 84 no 2. 
H OYNES , H. W. (1996), “ Local Labor Markets and Welfare Spells: Do Demand Conditions Matter? ”, Working Paper no 5643, NBER. L ANCASTER , T. (1979), “ Econometric methods for the duration of unemployment ”, Econometrica, vol. 47 no 4. L ANCASTER , T. (1990), The econometric analysis of transition data, Econometric Society monographs, Cambridge Univerity Press, Cambridge, UK. 35

M AGNAC , T. and V ISSER , M. (1998), “ Transition Models with Measurement Errors ”, Document de travail no 9822, CREST. M C C ALL , B. P. (1996), “ Unemployment Insurance Rules, Joblessness, and Part-Time Work ”, Econometrica, vol. 64 no 3. M EYER , B. D. (1990), “ Unemployment insurance and unemployment spells ”, Econometrica, vol. 58 no 4. M OFFITT, R. A. (1983), “ An economic model of welfare stigma ”, American Economic Review, vol. 73 no 5. N ELDER , J. A. and M EAD , R. (1965), “ A Simplex Method for Function Minimization ”, Computer Journal, vol. 7: p. 308–313. O’N EIL , J. A., BASSI , L. J. and W OLF, D. A. (1987), “ The Duration of Welfare Spells ”, The Review of Economics and Statistics, vol. 69 no 2. P RIEGER , J. E. (2002), “ A flexible parametric selection model for non-normal data with application to health care usage ”, Journal of Applied Econometrics, vol. 17 no 4. R IPHAHN , R. T. (2001), “ Rational poverty or poor rationality? The take-up of social assistance benefits ”, Review of Income and Wealth, vol. 47 no 3. S TEVENS , A. H. (1994), “ The dynamics of poverty spells: updating Bane and Ellwood ”, American Economic Review, vol. 84 no 2. T ERRACOL , A. (2002a), “ Analysing the Take-up of Means-Tested Benefits in France ”, Université de Paris 1, miméo. T ERRACOL , A. (2002b), “ Monotone Rank Estimator, modèles de durée et erreurs de mesure ”, Université de Paris 1, miméo. W ELCH , S. M. (1998), “ Nonparametric estimates of the duration of welfare spells ”, Economics Letters, vol. 60. Z OYEM , J.-P. (2001), “ Contrat d’insertion et sortie du RMI ”, Économie et Statistiques, no 346-347.


Appendix A

RMI eligibility calculation

Households' benefit entitlements were computed on the basis of the official benefit scales as given by Liaisons Sociales. The household type depends on the number of individuals (adults and children younger than 25) in the household. For each household (the basic unit of the European Community Household Panel), we have defined the following entitlement units:

• Parents and children younger than 25.
• Children older than 25 (or 18 for single parents), whose entitlement is calculated as separate units.
• Individuals with no family links to the household head, and the household head's ascendants, who are also treated as separate units.

Tables 9 to 16 show the income thresholds for the RMI (€) according to the year and household structure. An inclusive amount (different every year) is imputed to the household’s income if they have no housing costs.

Table 9: RMI thresholds in 1993
                       Single persons   Couples
No children            343.47           515.21
1 child                515.21           618.25
2 children             618.25           721.28
By additional child    + 137.39

Table 10: RMI thresholds in 1994
                       Single persons   Couples
No children            350.34           525.51
1 child                525.51           630.61
2 children             630.61           735.71
By additional child    + 140.14

Table 11: RMI thresholds in 1995
                       Single persons   Couples
No children            354.44           531.81
1 child                531.81           638.18
2 children             638.18           744.54
By additional child    + 141.82

Table 12: RMI thresholds in 1996
                       Single persons   Couples
No children            361.99           542.98
1 child                542.98           651.58
2 children             651.58           760.18
By additional child    + 144.81

Table 13: RMI thresholds in 1997
                       Single persons   Couples
No children            366.33           549.50
1 child                549.50           659.40
2 children             659.40           769.29
By additional child    + 146.53

Table 14: RMI thresholds in 1998
                       Single persons   Couples
No children            370.36           555.54
1 child                555.54           666.65
2 children             666.65           777.76
By additional child    + 148.14

Table 15: RMI thresholds in 1999
                       Single persons   Couples
No children            381.47           572.21
1 child                572.21           686.65
2 children             686.65           801.09
By additional child    + 152.59

Table 16: RMI thresholds in 2000
                       Single persons   Couples
No children            389.10           583.65
1 child                583.65           700.38
2 children             700.38           817.11
By additional child    + 155.64

The different income types taken into account are:

1. Family benefits
2. Disabled adult benefit
3. Labor income
4. Illness allowance
5. Unemployment benefit
6. Retirement pensions
7. Widowhood benefit
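The threshold lookup described in this appendix can be sketched as follows. This is only an illustration, not the code used for the paper: the names `RMI_THRESHOLDS`, `rmi_threshold` and `is_eligible` are hypothetical, only the 1993 and 2000 scales (Tables 9 and 16) are entered, and the rule "eligible when counted income plus the imputed housing amount is below the threshold" is an assumption about how the comparison is made. The imputed housing amount is not reported in the text, so it is left as a parameter.

```python
# Monthly RMI thresholds in euros, copied from Tables 9 and 16.
# "base" rows are 0, 1 and 2 children; columns are (single persons, couples).
# "extra_child" is the increment for each child beyond the second.
RMI_THRESHOLDS = {
    1993: {
        "base": [(343.47, 515.21), (515.21, 618.25), (618.25, 721.28)],
        "extra_child": 137.39,
    },
    2000: {
        "base": [(389.10, 583.65), (583.65, 700.38), (700.38, 817.11)],
        "extra_child": 155.64,
    },
}


def rmi_threshold(year, couple, n_children):
    """Return the income threshold for one entitlement unit."""
    scale = RMI_THRESHOLDS[year]
    column = 1 if couple else 0
    # The tables list up to two children explicitly.
    threshold = scale["base"][min(n_children, 2)][column]
    # Each further child adds the fixed increment of that year.
    threshold += max(n_children - 2, 0) * scale["extra_child"]
    return round(threshold, 2)


def is_eligible(year, couple, n_children, incomes, imputed_housing=0.0):
    """Hypothetical rule: counted income must be below the threshold.

    `incomes` holds the monthly amounts of the income types listed above;
    `imputed_housing` is the amount added when there are no housing costs
    (its value is not given in the text, so the caller must supply it).
    """
    counted_income = sum(incomes) + imputed_housing
    return counted_income < rmi_threshold(year, couple, n_children)
```

For instance, a couple with four children in 2000 would face a threshold of 817.11 plus two increments of 155.64.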


B

The GHK smooth recursive simulator

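Let us illustrate the GHK simulator in the trivariate case (generalization to higher orders is straightforward). Let

\[
\begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \end{pmatrix}
\sim N\left( \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}, \Sigma \right)
\tag{7}
\]

where $\Sigma$ is a covariance matrix. We wish to evaluate

\[
\Pr(\varepsilon_1 < b_1, \varepsilon_2 < b_2, \varepsilon_3 < b_3)
\tag{8}
\]

Equation (8) can be rewritten as a product of conditional probabilities:

\[
\Pr(\varepsilon_1 < b_1)\, \Pr(\varepsilon_2 < b_2 \mid \varepsilon_1 < b_1)\, \Pr(\varepsilon_3 < b_3 \mid \varepsilon_1 < b_1, \varepsilon_2 < b_2)
\tag{9}
\]

Let $L$ be the lower triangular Cholesky decomposition of $\Sigma$, such that $LL' = \Sigma$:

\[
L = \begin{pmatrix} l_{11} & 0 & 0 \\ l_{21} & l_{22} & 0 \\ l_{31} & l_{32} & l_{33} \end{pmatrix}
\]

We get:

\[
\begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \end{pmatrix}
=
\begin{pmatrix} l_{11} & 0 & 0 \\ l_{21} & l_{22} & 0 \\ l_{31} & l_{32} & l_{33} \end{pmatrix}
\begin{pmatrix} \nu_1 \\ \nu_2 \\ \nu_3 \end{pmatrix}
\tag{10}
\]

where the $\nu_i$ are independent standard normal random variables. By (10), we get:

\[
\begin{aligned}
\varepsilon_1 &= l_{11}\nu_1 \\
\varepsilon_2 &= l_{21}\nu_1 + l_{22}\nu_2 \\
\varepsilon_3 &= l_{31}\nu_1 + l_{32}\nu_2 + l_{33}\nu_3
\end{aligned}
\]

Thus:

\[
\Pr(\varepsilon_1 < b_1) = \Pr\left( \nu_1 < \frac{b_1}{l_{11}} \right)
\tag{11}
\]

\[
\Pr(\varepsilon_2 < b_2 \mid \varepsilon_1 < b_1) = \Pr\left( \nu_2 < \frac{b_2 - l_{21}\nu_1}{l_{22}} \,\middle|\, \nu_1 < \frac{b_1}{l_{11}} \right)
\tag{12}
\]

and

\[
\Pr(\varepsilon_3 < b_3 \mid \varepsilon_1 < b_1, \varepsilon_2 < b_2) = \Pr\left( \nu_3 < \frac{b_3 - l_{31}\nu_1 - l_{32}\nu_2}{l_{33}} \,\middle|\, \nu_1 < \frac{b_1}{l_{11}},\ \nu_2 < \frac{b_2 - l_{21}\nu_1}{l_{22}} \right)
\tag{13}
\]

Since $(\nu_1, \nu_2, \nu_3)$ are independent random variables, equation (8) can be expressed as a product of univariate CDFs, but conditional on unobservables (the $\nu_i$). Suppose now that we draw a random variable $\nu_1^*$ from a truncated standard normal density with upper truncation point $b_1 / l_{11}$, and another one, $\nu_2^*$, from a truncated standard normal density with upper truncation point $(b_2 - l_{21}\nu_1^*) / l_{22}$. These two random variables respect the conditioning events of equations (12) and (13). Equation (9) is then rewritten as:

\[
\Pr\left( \nu_1 < \frac{b_1}{l_{11}} \right)
\Pr\left( \nu_2 < \frac{b_2 - l_{21}\nu_1^*}{l_{22}} \right)
\Pr\left( \nu_3 < \frac{b_3 - l_{31}\nu_1^* - l_{32}\nu_2^*}{l_{33}} \right)
\tag{14}
\]

The GHK simulator of (8) is the arithmetic mean of the probabilities given by (14) for $D$ random draws of $\nu_1^*$ and $\nu_2^*$:

\[
\widetilde{\Pr}_{GHK} = \frac{1}{D} \sum_{d=1}^{D} \left\{
\Phi\left( \frac{b_1}{l_{11}} \right)
\Phi\left[ \frac{b_2 - l_{21}\nu_1^{*d}}{l_{22}} \right]
\Phi\left[ \frac{b_3 - l_{31}\nu_1^{*d} - l_{32}\nu_2^{*d}}{l_{33}} \right]
\right\}
\tag{15}
\]

where $\nu_1^{*d}$ and $\nu_2^{*d}$ are the $d$-th draws of $\nu_1^*$ and $\nu_2^*$, and where $\Phi(\cdot)$ is the univariate standard normal CDF. The simulated probability (15) is then plugged into the likelihood function, and standard maximisation techniques are used.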
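The recursion in equations (11) to (15) can be sketched in a few lines of Python using only the standard library. This is an illustrative sketch, not the code used for the estimations: the function names `cholesky`, `draw_truncated_std_normal` and `ghk_probability` are hypothetical, and the truncated draws are generated by inverting the normal CDF, which is one common choice that the text does not specify.

```python
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def cholesky(sigma):
    """Lower triangular L with L L' = sigma (sigma positive definite)."""
    n = len(sigma)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = (sigma[i][i] - s) ** 0.5
            else:
                L[i][j] = (sigma[i][j] - s) / L[j][j]
    return L


def draw_truncated_std_normal(upper, rng):
    """Draw nu* from a standard normal truncated above at `upper`.

    Assumption: inverse-CDF sampling, nu* = Phi^{-1}(u * Phi(upper)).
    """
    p = rng.random() * _STD_NORMAL.cdf(upper)
    # Keep p strictly inside (0, 1), as required by inv_cdf.
    p = min(max(p, 1e-12), 1.0 - 1e-12)
    return _STD_NORMAL.inv_cdf(p)


def ghk_probability(b, sigma, draws=1000, seed=0):
    """Simulate Pr(eps_1 < b_1, ..., eps_n < b_n) for eps ~ N(0, sigma)."""
    rng = random.Random(seed)
    L = cholesky(sigma)
    n = len(b)
    total = 0.0
    for _ in range(draws):
        nu = []      # truncated draws nu_1*, ..., nu_{n-1}*
        prob = 1.0   # product of univariate CDFs, as in equation (14)
        for i in range(n):
            upper = (b[i] - sum(L[i][k] * nu[k] for k in range(i))) / L[i][i]
            prob *= _STD_NORMAL.cdf(upper)
            if i < n - 1:
                nu.append(draw_truncated_std_normal(upper, rng))
        total += prob
    return total / draws  # arithmetic mean over the D draws, equation (15)
```

With an identity covariance matrix and $b = (0, 0, 0)$, every factor equals $\Phi(0) = 0.5$, so the simulator returns 0.125 whatever the draws. With all three correlations equal to 0.5 and $b = (0, 0, 0)$, the exact orthant probability is 0.25, which gives a simple check on the simulated value.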
