Inference on Inequality from Complex Survey Data


Debopam Bhattacharya, Department of Economics, Princeton University. April 30, 2003

Abstract

We develop a framework for asymptotic inference on Lorenz curves and the Gini coefficient based on data from surveys whose designs involve stratification and clustering. We set up the estimation problem, derive the appropriate asymptotic distribution theory as the number of clusters tends to infinity, and compute asymptotic standard errors that are robust to sample-design effects. We apply our methods to estimate Lorenz curves and the Gini coefficient for per capita expenditure using household consumption data from the complexly designed Indian National Sample Survey. Next, we test for dominance over time in terms of both the Gini and the entire Lorenz curve, using the asymptotic distribution theory obtained above. Erroneous inference that ignores the survey design leads to qualitatively different conclusions in several cases. Our methods are general enough to be used with nearly all large-scale household surveys whose designs involve stratification and clustering.

JEL classification codes: C12, C13, C31, C42.

I am grateful to Professors Angus Deaton and Bo Honore for help, encouragement and support and to Professors Stephen Donald and Joel Horowitz for helpful comments. I have benefitted immensely from discussions with Alessandro Tarozzi. I would also like to thank two anonymous referees and an associate editor of the Journal of Econometrics for helpful comments on a related paper. Financial support from the Wilson Fellowship is gratefully acknowledged. All errors are mine.

All correspondence should be addressed to Debopam Bhattacharya, Department of Economics, Princeton University, Princeton, NJ 08544. Phone: 646-932-2335, Fax: 609-254-6419. E-mail: de-[email protected]

1 Introduction

Computation of economic inequality is important for evaluating the effects of micro- and macro-level economic policies, for studying the relationship between inequality and growth, and for assessing the consequences and determinants of political outcomes. Usually, inequality is computed from household income or expenditure data gathered by random sampling from the population of interest. As a result, such inequality measures are subject to sampling fluctuations and warrant the derivation of a statistical distribution theory. A real-life complication in these derivations is that large-scale cross-sectional household surveys are rarely simple random samples drawn from the whole population. For a variety of reasons, including binding financial and administrative constraints and political emphasis on the study of minorities, survey agencies adopt a multi-stage design involving stratification followed by multiple layers of clustering inside every stratum. Examples include the designs of the World Bank's multi-country Living Standards Measurement Studies (LSMS), the USA's Current Population Survey (CPS) and the cross-section component of the Panel Study of Income Dynamics (PSID), among many others. Ignoring the survey design in the estimation process can lead to inconsistent estimates of the population parameters and almost always produces inconsistent estimates of the standard errors of these estimates. In applied work, standard errors are rarely reported for measures of inequality, so that valid inference becomes impossible. Moreover, the observed movement in inequality through time is usually very small, which reinforces the importance of obtaining correct standard errors for valid inequality comparisons across time.

Historically, effects of sampling design on estimation were analyzed by survey statisticians (cf. Cochran, 1977), who treated both the population and the sample size as finite and based the (exact finite-sample) analysis of estimates on combinatorial methods. This procedure works well only for the estimation of means and is extremely messy for analyzing inequality measures based on quantiles of the distribution, since one has to keep track of the cluster and stratum identities of the ordered observations. A more elegant alternative is to use the asymptotic distribution theory for GMM-based estimators, of which quantiles and quantile-based estimators are a special case. These approximations will be quite precise when one has a large number of observations, as is typical of large household surveys. In order to implement the asymptotic methods for complexly designed samples, one needs to adapt the asymptotic framework of modern cross-section econometrics, which always assumes simple random sampling, to the case of multi-stage survey designs.

In this paper, we first develop a framework for method-of-moments based asymptotic inference which enables one to handle data from complexly designed surveys. We show how to set up the estimation problem, how to derive the appropriate asymptotic distribution theory and, finally, how to compute asymptotic standard errors that are robust to sample-design effects. Our procedure of inference from multi-stage stratified samples involves two distinct stages of correction (relative to simple random samples), one involving the estimation of the parameters and the second involving the computation of standard errors of these estimates. One needs to modify the method of estimation (the first 'level' of correction) to account for the fact that the distribution of the sampled observations generally differs from their distribution in the population as a result of the multi-stage design. This can usually be achieved by suitably weighting the data, where the weights, computed from the latest census, are typically available in the survey data. At the second 'level', one needs to use asymptotic theory for dependent and non-identically distributed observations to derive the asymptotic distribution of the estimates and compute standard errors that are robust to the sample-design effects.

Next, we apply these methods to derive the asymptotic distribution theory for Lorenz curves and the Gini coefficient, the most popular measures of inequality. Based on these distributions, we also develop two distinct tests for inequality change: one based on the Gini and the other based on the entire Lorenz curve (in contrast to comparing Lorenz shares at a finite number of fixed percentiles).

Statistical inference with Lorenz shares and measures of inequality and poverty has a long history in both statistics and econometrics. This literature has treated the estimation of each index of inequality and poverty as a separate problem, without recognizing that they are all special cases of a unified method-of-moments estimation problem. Examples include Gastwirth (1972), Beach and Davidson (1983), Davidson and Duclos (2000) and Zheng (2001). Also, these works have always assumed simple random sampling, except Zheng (2002), who analyzed Lorenz share estimation at a finite number of percentiles (based on the Bahadur representation for quantiles) from complexly designed samples. Our work, in contrast to the above, is generalizable to any estimation problem that is expressible as a MoM problem, including regressions, maximum likelihood inference and nearly all commonly used measures of inequality and poverty. We also develop a distribution theory for the Lorenz process which permits more robust tests of inequality change and, by adapting these distributions to complex sample designs, we make our procedures applicable to data from the majority of real-life household surveys.

The plan of the paper is as follows. The next subsection provides a brief discussion of sampling weights. Section 2 sets up the method-of-moments problem for a generic stratified two-stage sample design and derives the asymptotic distribution theory for the estimate. Section 3 derives a distribution theory for Lorenz curves and the Gini index. Section 4 describes two tests of inequality dominance, based on the Gini coefficient and on the entire Lorenz curve. Section 5 applies the results of Section 4 to test for changes in inequality in India before and after the liberalization reforms of the early 1990s. Section 5.3 describes how our methods can be adapted to survey designs which involve multiple levels of stratification and clustering. Section 6 concludes.

1.1 A brief note on weighting and consistent estimation

Because of the stratified, clustered design, not all households in the population have, in general, an equal probability of being included in the sample. As a result, different sample observations are usually assigned different weights, with the sampling weight of an observation denoting how many units in the population it represents.[1] In general, when the parameter of interest is the census parameter (i.e. the parameter one would get if one performed the same estimation exercise with the entire population, e.g. the population mean), a weighted estimation technique is appropriate. Unweighted estimates will not be consistent for the census parameter.[2] When comparing standard errors for the estimate of the parameter of interest that are and are not corrected for the sample design (the main focus of the paper), we shall focus on the same estimate of the parameter (weighted for inequality measures and unweighted for regressions), as we shall emphasize again in section 2.1.

[1] Unequal weights can arise even when no explicit stratification or clustering is involved; attrition in panel data and differential survey non-response are two other sources.

[2] However, when the data are assumed to be generated by the same model (e.g. a regression model) which holds true for all observations, no matter what stratum and cluster they come from, unweighted estimates will be consistent for the parameters of that model. Weighting (by sample weights) is then no longer necessary and can (e.g. in the case of a linear regression model satisfying the Gauss-Markov assumptions) produce inefficient estimates relative to unweighted estimates (see DuMouchel and Duncan (1983) and Wooldridge (2001)).

2 Sample design and the MoM problem

In this section, we set up the estimation problem with data from a stratified, multistage clustered sample. The sampling design we consider is generic and is as follows. The population is divided into $S$ first-stage strata. Stratum $s$ contains a mass of $H_s$ clusters. A sample of $n_s$ clusters (indexed by $c_s$) is drawn by simple random sampling with replacement from stratum $s$, for each $s$. (Sampling with or without replacement has no effect on our asymptotic analysis, which is based on an increasing number of clusters; we shall henceforth refer only to simple random sampling without mentioning whether it is with or without replacement.) The $c_s$th cluster contains a finite population of $M_{sc_s}$ households. A simple random sample of $k$ households (equal for all strata and clusters and indexed by $h$) is drawn from it. The $h$th household in the $c_s$th cluster in the $s$th stratum has $\nu_{sc_sh}$ members. The joint density of a (per capita) characteristic $Y$ and household size $N$ in the $s$th stratum is denoted by $dF(y, \nu \mid s)$, with $F(a, b \mid s)$ denoting the population proportion of households in stratum $s$ with $Y < a$ and $N < b$. Note that this joint density can differ across strata, so that sampled observations from different strata are independent but in general not identically distributed. Let

$$n = \sum_{s=1}^{S} n_s \quad \text{and} \quad n_s = n a_s \quad \text{with} \quad \sum_{s=1}^{S} a_s = 1.$$

The weight of every member in the $h$th household in the $c_s$th sampled cluster in the $s$th stratum is given by

$$w_{sc_sh} = \frac{M_{sc_s} H_s}{k\, n_s}\, \nu_{sc_sh}$$

and equals the number of individuals in the population represented by this particular individual. All expectations and variances are taken with respect to the sampling distribution, which differs in general from the population distribution due to the non-simple random sampling. We shall let $E_{h \mid c_s, s}(\cdot)$ and $Var_{h \mid c_s, s}(\cdot)$ denote expectation and variance, respectively, taken with respect to the second stage of sampling, conditional on stratum $s$ and cluster $c_s$ (analogously, $E_{c_s \mid s}(\cdot)$ and $Var_{c_s \mid s}(\cdot)$ for the first stage of sampling). When expectations and variances are taken with respect to both stages of sampling, we simply denote them by $E(\cdot \mid s)$ and $V(\cdot \mid s)$. $O_p(1)$ and $o_p(1)$ will denote quantities that are, respectively, (asymptotically) bounded in probability and going to 0 in probability; $\xrightarrow{d}$ and $\xrightarrow{P}$ will denote convergence in distribution and in probability, respectively.
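As a minimal sketch of the weight construction described above, the following Python function computes the weight for one sampled household. It assumes the weight is the product of the inverse selection probability, (households in the cluster × clusters in the stratum) / (households sampled per cluster × clusters sampled in the stratum), and the household size; the function name, argument names and numbers are illustrative, not taken from the survey.

```python
def sampling_weight(M_sc, H_s, k, n_s, nu):
    """Weight for one sampled household (illustrative helper).

    M_sc: number of households in the sampled cluster
    H_s:  number of clusters in the stratum
    k:    households sampled per cluster
    n_s:  clusters sampled in the stratum
    nu:   number of members of the sampled household
    """
    if k <= 0 or n_s <= 0:
        raise ValueError("k and n_s must be positive")
    # Inverse of the two-stage selection probability, scaled by household size.
    return (M_sc * H_s) / (k * n_s) * nu


# Hypothetical numbers: a cluster of 200 households, 1000 clusters in the
# stratum, 10 households sampled per cluster, 120 clusters sampled.
w = sampling_weight(M_sc=200, H_s=1000, k=10, n_s=120, nu=5)
```

With these made-up inputs each sampled household stands for 200 × 1000 / (10 × 120) households, and the factor of 5 converts that into individuals.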

In most real-life surveys, the number of clusters sampled per stratum is much larger than the number of households sampled per cluster (in the Indian NSS, for instance, the numbers are about 120 and 10, respectively). This motivates asymptotic analysis with the number of clusters ($n$) going to infinity while the number of households per cluster stays fixed and finite.[3] Secondly, clusters sampled within a stratum are geographically scattered over a large area, whereas households sampled within a cluster are physically close to each other. This motivates our assumption that cluster-level aggregates are independent across clusters within a stratum but household-level variables are correlated within a cluster.[4]

Suppose we are interested in estimating a parameter $\theta_0$ of dimension $p$ (typically characterizing an individual-level characteristic, e.g. the per person mean consumption in the population), which solves the $p$ population moment conditions (our applications below are all exactly identified systems; for the overidentified case, see Bhattacharya (2003a))

$$0 = \sum_{s=1}^{S} H_s \int m(y, \theta_0)\, \nu \, dF(y, \nu \mid s). \tag{1}$$

[3] Sakata considers asymptotics on the number of strata. His objects of interest are parameters of a superpopulation from which the strata are sampled. So the strata in his analysis are like clusters in our analysis, and corrections of standard errors (of superpopulation parameter estimates) due to fixed stratification are irrelevant.

[4] For smaller strata, cluster-level variables might be correlated and, as in the spatial statistics literature, one needs this dependence to 'disappear' (spatial ergodicity) as the distance between clusters increases, in order for the laws of large numbers to hold as the number of clusters tends to infinity. Information on spatial distances between clusters is rare, if not totally non-existent, in survey data, which makes this approach infeasible. A similar consideration holds for asymptotics on the number of (correlated) households per cluster (which would arise in a design where the number of clusters selected per stratum is much smaller than the number of households selected per cluster; but such designs are rare).

For instance, the population mean $\mu_0$ solves:

$$0 = \sum_{s=1}^{S} H_s \int (y - \mu_0)\, \nu \, dF(y, \nu \mid s) = \sum_{s=1}^{S} H_s\, E_{c_s}\left\{ \sum_{K=1}^{M(s,c_s)} \nu_{sc_sK}\,(y_{sc_sK} - \mu_0) \,\Big|\, s \right\}.$$

For a given stratum $s$, $E_c\left\{\sum_{K=1}^{M(s,c)} \nu_{scK}\, y_{scK} \mid s\right\}$ equals the expectation (over clusters) of total cluster income (added across all population households in that cluster), whereas $E_c\left\{\sum_{K=1}^{M(s,c)} \nu_{scK} \mid s\right\}$ equals the expectation (over clusters) of the total cluster population of individuals. Their ratio, after summing each across strata with weights $H_s$, is therefore the overall population mean.

The MoM estimator $\hat\theta$ of $\theta_0$ is based on the sample analog (corresponding to the multi-stage design) of the moment conditions (1), viz.:

$$\sum_{s=1}^{S} \frac{H_s}{n_s} \sum_{c_s=1}^{n_s} \frac{M(s,c_s)}{k} \sum_{h=1}^{k} \nu_{sc_sh}\, m(y_{sc_sh}, \hat\theta) \simeq 0. \tag{2}$$

For later use, let us define $z_{sc_sh} = (y_{sc_sh}, \nu_{sc_sh})$ and $\tilde m(z_{sc_sh}, \theta) = \nu_{sc_sh}\, m(y_{sc_sh}, \theta)$.
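To make the estimator concrete for the population-mean example, here is a minimal Python sketch. It assumes each sampled household carries a weight `w` (the design factor times household size) and a per capita value `y`, so that the weighted sample moment condition, the sum of `w * (y - mu)` set to zero, is solved by the weighted mean. The function name and the toy numbers are illustrative.

```python
def weighted_mean(y, w):
    """Solve sum_i w_i * (y_i - mu) = 0 for mu (the weighted sample mean)."""
    total_w = sum(w)
    if total_w <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(wi * yi for yi, wi in zip(y, w)) / total_w


# Two hypothetical households with per capita values 10 and 20,
# carrying weights 1 and 3.
y = [10.0, 20.0]
w = [1.0, 3.0]
mu_hat = weighted_mean(y, w)

# The sample moment condition evaluated at the estimate.
moment_at_estimate = sum(wi * (yi - mu_hat) for yi, wi in zip(y, w))
```

The same pattern carries over to other exactly identified moment functions: only the expression inside the sum changes.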

Our assumptions about the sampling process imply:

A0a. For $s, s' = 1, \dots, S$, $z_{sc_sh}$ and $z_{s'c'_{s'}h'}$ are independent unless $s = s'$ and $c_s = c'_{s'}$, for $c_s = 1, \dots, n_s$, $c'_{s'} = 1, \dots, n_{s'}$ and $h, h' = 1, \dots, k$.

A0b. For each $s$, $\{z_{sc_sh}\}_{c_s = 1, \dots, n_s;\; h = 1, \dots, k}$ are identically distributed.[5]

A0c. For $s \neq s'$, $z_s$ and $z_{s'}$ are independent (but not necessarily identically distributed), where $z_s \equiv \{z_{sc_sh}\}_{c_s = 1, \dots, n_s;\; h = 1, \dots, k}$.

[5] Note that A0a-b are not conditioned on the clusters; clearly, conditional on the clusters, $z_{sc_sh}$ and $z_{s'c'_{s'}h'}$ are independent for all $s, s', c_s, c'_{s'}, h, h'$, and $z_{sc_sh}$ and $z_{sc'_sh'}$ may not be identically distributed given $c_s$ and $c'_s$.

The following analysis characterizes the asymptotic distribution of $\hat\theta$. By asymptotic we mean that the number of sampled clusters for every stratum goes to infinity at the same rate, so that the quantities $a_s$ stay fixed. We shall re-index clusters by $i$, with $i$ running from 1 to $n$, where $n$ denotes the total number of clusters in the sample. Associated with every cluster $i$ is the index $s_i$, which denotes the stratum from which $i$ is drawn. Then by definition,

$$\#(i \mid s_i = s) = n_s \quad \text{for each } 1 \le s \le S. \tag{3}$$

Then (2) reduces to

$$\frac{1}{n} \sum_{i=1}^{n} \tilde m_i(\hat\theta) \simeq 0, \quad \text{where} \quad \tilde m_i(\theta) = \sum_{s=1}^{S} \frac{H_s}{a_s}\, 1(s_i = s)\, \frac{M(s_i, i)}{k} \sum_{h=1}^{k} \nu_{s_i i h}\, m(y_{s_i i h}, \theta). \tag{4}$$
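The equivalence between the stratum-by-stratum sum in (2) and the cluster-indexed average in (4) can be checked numerically. The sketch below is only an illustration under assumed data structures: `sample` maps each stratum to a list of sampled clusters, each cluster being a pair of its household count and a list of `(y, nu)` pairs; `H` holds hypothetical cluster counts per stratum; and the moment function is the mean example `y - mu`.

```python
def stratified_moment(sample, H, k, theta, m):
    """Left-hand side of (2): sum over strata, clusters and households."""
    total = 0.0
    for s, clusters in sample.items():
        n_s = len(clusters)
        for M, households in clusters:
            inner = sum(nu * m(y, theta) for y, nu in households)
            total += H[s] / n_s * M / k * inner
    return total


def reindexed_moment(sample, H, k, theta, m):
    """Left-hand side of (4): average over clusters of m_tilde_i(theta)."""
    n = sum(len(clusters) for clusters in sample.values())
    terms = []
    for s, clusters in sample.items():
        a_s = len(clusters) / n          # share of sampled clusters in stratum s
        for M, households in clusters:
            inner = sum(nu * m(y, theta) for y, nu in households)
            terms.append(H[s] / a_s * M / k * inner)
    return sum(terms) / n


# Hypothetical design: two strata, k = 2 households sampled per cluster.
sample = {
    "A": [(100, [(1.0, 2), (3.0, 4)]), (80, [(2.0, 1), (2.0, 3)])],
    "B": [(50, [(5.0, 2), (1.0, 1)])],
}
H = {"A": 10, "B": 20}
mean_moment = lambda y, mu: y - mu

lhs_2 = stratified_moment(sample, H, k=2, theta=2.0, m=mean_moment)
lhs_4 = reindexed_moment(sample, H, k=2, theta=2.0, m=mean_moment)
```

The two quantities agree because $n_s = n a_s$, so dividing by $n_s$ in (2) is the same as dividing by $a_s$ and then averaging over all $n$ clusters.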

Note that the functions $\tilde m_i(\theta)$ are independent (though not identically distributed, owing to stratification) across $i$. This makes the asymptotic analysis of the estimator completely standard via the theory of GMM estimators developed in the econometrics and statistics literature over the last two decades. The first two chapters of the Handbook of Econometrics, volume 4, in particular, give a comprehensive treatment of this theory. The proof of consistency uses a weak law of large numbers for independent, non-identically distributed random variables; the proof of asymptotic normality uses the central limit theorem (Lindeberg-Feller-Lyapunov version) for independent, non-identically distributed variables. After stating the relevant theorems (without proofs, which are standard), we shall derive the expression for the asymptotic variance which takes the sample design into account.

Proposition 1 Under standard regularity conditions,[6]

$$\operatorname*{plim}_{n \to \infty}\, (\hat\theta - \theta_0) = 0$$

and

$$\sqrt{n}\,(\hat\theta - \theta_0) \xrightarrow{d} N(0, V)$$

with

$$V = \Gamma_0^{-1}\, W_0\, \Gamma_0^{-1\prime},$$

where

$$W_0 = \lim_{n \to \infty} W_n = \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} Var\left(\tilde m_i(\theta_0)\right), \qquad \Gamma_0 = \operatorname{plim}\, \frac{1}{n} \sum_{i=1}^{n} \frac{\partial}{\partial \theta'} E\left(\tilde m_i(\theta_0)\right).$$

[6] Essentially, existence of finite second moments of the $Y$-distribution suffices here, since the class of functions we consider is piecewise linear and therefore forms a Euclidean class, whence stochastic equicontinuity follows.

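A consistent estimate of $V$ is given by $\hat V = \hat\Gamma\, W_n\, \hat\Gamma'$, where

$$\hat\Gamma = \left\{ \frac{\partial}{\partial \theta'}\, \frac{1}{n} \sum_{i=1}^{n} \tilde m_i(\theta) \,\Big|_{\theta = \hat\theta} \right\}^{-1}$$

and

$$\begin{aligned}
W_n ={}& \sum_{s=1}^{S} \sum_{c_s=1}^{n_s} \sum_{h=1}^{k} w_{sc_sh}^2\, m(y_{sc_sh}, \hat\theta)\, m(y_{sc_sh}, \hat\theta)' \\
&+ \sum_{s=1}^{S} \sum_{c_s=1}^{n_s} \sum_{h=1}^{k} \sum_{h' \neq h} w_{sc_sh}\, w_{sc_sh'}\, m(y_{sc_sh}, \hat\theta)\, m(y_{sc_sh'}, \hat\theta)' \\
&- \sum_{s=1}^{S} \frac{1}{n_s} \left( \sum_{c_s=1}^{n_s} \sum_{h=1}^{k} w_{sc_sh}\, m(y_{sc_sh}, \hat\theta) \right) \left( \sum_{c_s=1}^{n_s} \sum_{h=1}^{k} w_{sc_sh}\, m(y_{sc_sh}, \hat\theta) \right)'.
\end{aligned} \tag{5}$$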

The first term in (5) is the estimate of the variance that does not take the sample design into account. The second term is the cluster effect and is a function of the covariance between values obtained from the same cluster. If the covariances are positive on average (which is natural and empirically true), this term is positive, so that the estimate of the standard error that ignores clustering underestimates the true standard error. The greater the correlation between observations inside a single cluster and the larger the number of observations ($k$) sampled from each cluster, the larger the degree of underestimation. The third term is the stratum effect. With multiple strata, the expression within $(\cdot)$ is asymptotically non-zero for individual strata (although a weighted average of these expressions across strata is zero), so that its 'square' is a positive definite matrix.[7] Ignoring it therefore overestimates the variance, and the degree of overestimation is larger the more homogeneous the units within a stratum and the more heterogeneous the units across strata. See Bhattacharya (2003) for details.

[7] Intuitively, the variance with a stratified design is the sum of within-stratum variances. Ignoring stratification and estimating the variance as if the data were a simple random sample causes over-estimation by wrongly adding on the between-strata variances.
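The three-term decomposition can be sketched in a few lines of Python for a scalar moment. This is an illustration under assumptions, not the paper's code: each record is a hypothetical tuple `(stratum, cluster, weight, moment)`, where `moment` is the moment function evaluated at the estimate for one sampled household, and the helper returns the naive term, the cluster effect and the stratum effect separately so that their signs can be inspected.

```python
from collections import defaultdict


def design_variance_terms(records):
    """Return (naive, cluster_effect, stratum_effect) for a scalar moment.

    The design-corrected quantity is naive + cluster_effect - stratum_effect.
    """
    cluster_total = defaultdict(float)   # sum of w*m within each (stratum, cluster)
    naive = 0.0                          # sum of (w*m)^2 over all households
    for s, c, w, m in records:
        naive += (w * m) ** 2
        cluster_total[(s, c)] += w * m

    # Cross products within clusters: (cluster total)^2 minus own squares.
    cluster_effect = sum(t * t for t in cluster_total.values()) - naive

    stratum_total = defaultdict(float)
    n_clusters = defaultdict(int)
    for (s, _c), t in cluster_total.items():
        stratum_total[s] += t
        n_clusters[s] += 1
    stratum_effect = sum(stratum_total[s] ** 2 / n_clusters[s] for s in stratum_total)
    return naive, cluster_effect, stratum_effect


# Hypothetical records with unit weights.
records = [
    ("A", 1, 1.0, 1.0), ("A", 1, 1.0, 2.0),
    ("A", 2, 1.0, -3.0), ("A", 2, 1.0, 0.0),
    ("B", 1, 1.0, 2.0), ("B", 2, 1.0, 4.0),
]
naive, cluster_effect, stratum_effect = design_variance_terms(records)
W_n = naive + cluster_effect - stratum_effect
```

In this toy data the within-cluster values in stratum A move together in the first cluster, so the cluster effect is positive, while stratum B has a non-zero stratum total and contributes a positive stratum effect that is subtracted.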

3 Inequality Measurement

We now use the above analysis to derive consistent tests for inequality dominance. We discuss two tests: one for testing overall inequality dominance in terms of the Gini coefficient and the second for testing inequality dominance based on entire Lorenz curves. The Lorenz share corresponding to the poorest fraction $p$ of the population is the fraction of total income accruing to that fraction of the population. Formally, the Lorenz function $\Lambda(\cdot)$ is defined as

$$\Lambda : [0,1] \to [0,1] \quad \text{with} \quad \Lambda(p) = \frac{E_P\left(Y\, 1(Y \le Q(p))\right)}{E_P(Y)}, \qquad p = \int_0^{Q(p)} dF(y), \tag{6}$$

where $Q(p)$ is the $p$th population quantile, $F(\cdot)$ denotes the population distribution function of $Y$ and $E_P(\cdot)$ denotes expectation taken with respect to the population distribution. The Gini coefficient is then defined as

$$\gamma_0 = \int_0^1 \Lambda(p)\, dp.$$

For future use, let us also define

$$\lambda(p) = E_P\left(Y\, 1(Y \le Q(p))\right), \qquad \mu_0 = E_P(Y).$$

If the Lorenz share corresponding to every fraction in population 1 is greater than that in population 2, then population 1 is considered more egalitarian in terms of income distribution. The population Lorenz shares are obviously not known and need to be estimated. These estimated shares can then be compared between samples from two populations (say, populations in a state in two different years) to make statistical inference on the change in equity between these two populations. Note also that $\Lambda(\cdot)$ is cadlag (continuous from the right with limits on the left), monotone non-decreasing and lies between 0 and 1. The same is true for its estimate with probability 1.

Three points are worth mentioning here. One, ideally one would want to estimate and compare the Lorenz share at every possible point in $[0,1]$. An easier objective is to focus on a finite number of quantiles, say the deciles, and conduct (asymptotic) inference on Lorenz shares at these points as the sample size increases. This has been the practice in the econometric literature on Lorenz shares and related measures (e.g. Beach and Davidson (1983), Bishop, Formby and Smith (1991), Davidson and Duclos (2000) and Zheng (2002)). In contrast, here we propose a test for 'curve dominance', to test whether the entire Lorenz curve for one population lies above that for another.

Two, Lorenz curves (the loci of the Lorenz shares) from two different populations can cross. In this case, unambiguous judgement is not possible. If the population Lorenz curves (and therefore the 'asymptotic' sample Lorenz curves) do not cross below, say, 25% but cross at higher fractions, estimating Lorenz shares at the lower fractions will still be useful, since policy-makers are often interested in shares accruing to a subset, say the bottom 25%, of the population. However, if indeed one entire population Lorenz curve lies above another, i.e. there is Lorenz dominance, then one can unambiguously infer a decline in inequality, and all reasonable summary measures of inequality would show smaller inequality in the former case. The converse is not necessarily true.

Three, the Gini coefficient, being a (linear) functional of the Lorenz process, will be asymptotically normal with a variance that can be computed from our analysis of Lorenz processes. Testing for Gini dominance, i.e. testing whether overall inequality has increased between two periods, is then a test for normal means. Clearly, Gini dominance is a necessary condition for Lorenz dominance, is easier to test and can be a hypothesis of interest in itself.

The method-of-moments nature of the Lorenz share estimation problem should be obvious from (8). The influence function for the Lorenz share at a fixed percentile $p$, with $\theta = (Q(p), \lambda(p), \mu)$, is given by

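$$\begin{aligned}
\hat\Lambda(p) - \Lambda(p) &= \frac{1}{n} \sum_{i=1}^{n} \tilde m_i(\hat\theta), \\
\tilde m_i(\hat\theta) &= \sum_{s=1}^{S} \frac{H_s}{a_s}\, 1(s_i = s)\, \frac{M(s_i, i)}{k} \sum_{h=1}^{k} \nu_{s_i i h}\, m(y_{s_i i h}, \hat\theta), \\
m(y_{s_i i h}, \theta) &= \frac{1}{\mu} \left\{ y_{s_i i h}\, 1(y_{s_i i h} \le Q(p)) - \lambda(p) \right\} - \frac{\lambda(p)}{\mu^2}\, (y_{s_i i h} - \mu) + \frac{1}{\mu}\, Q(p) \left( p - 1(y_{s_i i h} \le Q(p)) \right).
\end{aligned} \tag{7}$$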

Note from the influence function in (7) that the density of $Y$ at the quantile does not appear in this expression (as it would in the influence function for the corresponding quantile). Intuitively, this happens because even if the density is low at a quantile, so that the quantile is imprecisely estimated, the mean up to the quantile (which is the generalized Lorenz share, $\lambda(\cdot)$) is measured precisely.

Defining $\varphi_n(p) = \hat\Lambda(p) - \Lambda(p)$, we shall now characterize the behavior of the Lorenz process $\{\varphi_n(p) : p \in [0,1]\}$ as a stochastic process indexed by $p$. The idea is to show that as $n \to \infty$, $\{\sqrt{n}\,\varphi_n(p) : p \in [0,1]\}$ converges weakly to a Gaussian process. This will imply asymptotic normality of the Gini, which is a linear functional of $\{\varphi_n(p) : p \in [0,1]\}$; also, for two independent Lorenz processes $\varphi_{n_1}(p)$ and $\varphi_{n_2}(p)$ corresponding to two independent populations, we will be able to test Lorenz dominance of one over the other.

The derivation of these properties works as follows. We first derive the asymptotic distribution of the sample c.d.f. process $\hat F$ (with population counterpart $F$). This involves verifying the Glivenko-Cantelli and Donsker properties for the empirical process $\hat F$. Then we use continuity and Hadamard differentiability of the maps from the c.d.f. to the quantiles and then from the quantiles to the Lorenz shares to derive the Glivenko-Cantelli and Donsker properties of the sample Lorenz process. Finally, we use Hadamard differentiability of the integral map to derive the asymptotic variance of the Gini via the functional delta method. To that end, define the empirical c.d.f., the population c.d.f. and the population c.d.f. in every stratum $s$ as

$$\hat F(x) = \frac{\sum_{s=1}^{S} \sum_{c_s=1}^{n_s} \sum_{h=1}^{k} W_{sc_sh}\, 1(y_{sc_sh} \le x)}{\sum_{s=1}^{S} \sum_{c_s=1}^{n_s} \sum_{h=1}^{k} W_{sc_sh}},$$

$$F(x) = \frac{E\left\{ \sum_{s=1}^{S} \sum_{c_s=1}^{n_s} \sum_{h=1}^{k} W_{sc_sh}\, 1(y_{sc_sh} \le x) \right\}}{E\left\{ \sum_{s=1}^{S} \sum_{c_s=1}^{n_s} \sum_{h=1}^{k} W_{sc_sh} \right\}},$$

$$F(x \mid s) = \frac{E\left\{ \frac{M_{sc_s}}{k} \sum_{h=1}^{k} \nu_{sc_sh}\, 1(y_{sc_sh} \le x) \,\Big|\, s \right\}}{E\left\{ \frac{M_{sc_s}}{k} \sum_{h=1}^{k} \nu_{sc_sh} \,\Big|\, s \right\}}, \quad s = 1, \dots, S,$$

and, analogously, the $p$th population quantile of $Y$, $Q(p)$, its estimate $\hat Q(p)$, and the estimates $\hat\lambda(p)$, $\hat\mu$ and $\hat\Lambda(p)$ as

$$Q(p) = \inf\{x : F(x) \ge p\},$$

$$\hat Q(p) = \inf_{s, c_s, h} \left\{ y_{sc_sh} : \hat F(y_{sc_sh}) \ge p \right\},$$

$$\hat\lambda(p) = \frac{\sum_{s=1}^{S} \sum_{c_s=1}^{n_s} \sum_{h=1}^{k} W_{sc_sh}\, y_{sc_sh}\, 1\left(y_{sc_sh} \le \hat Q(p)\right)}{\sum_{s=1}^{S} \sum_{c_s=1}^{n_s} \sum_{h=1}^{k} W_{sc_sh}},$$

$$\hat\mu = \frac{\sum_{s=1}^{S} \sum_{c_s=1}^{n_s} \sum_{h=1}^{k} W_{sc_sh}\, y_{sc_sh}}{\sum_{s=1}^{S} \sum_{c_s=1}^{n_s} \sum_{h=1}^{k} W_{sc_sh}},$$

$$\hat\Lambda(p) = \frac{\hat\lambda(p)}{\hat\mu}.$$
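The plug-in estimators above can be illustrated with a short Python sketch. It assumes flat lists of per capita values `y` and weights `w` (one entry per sampled household, with the stratum and cluster indices already flattened away), takes the sample quantile to be the smallest observed value whose weighted empirical c.d.f. reaches `p`, and returns the ratio of weighted income at or below that quantile to total weighted income. The function name is hypothetical.

```python
def lorenz_share(y, w, p):
    """Weighted estimate of the Lorenz share at population fraction p, 0 < p <= 1."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    pairs = sorted(zip(y, w))                      # order households by y
    total_w = sum(wi for _, wi in pairs)
    total_wy = sum(wi * yi for yi, wi in pairs)

    # Sample quantile: smallest observed y whose weighted c.d.f. is >= p.
    cum_w = 0.0
    q_hat = pairs[-1][0]
    for yi, wi in pairs:
        cum_w += wi
        if cum_w / total_w >= p:
            q_hat = yi
            break

    # Weighted income held by households at or below the sample quantile,
    # as a fraction of total weighted income.
    below = sum(wi * yi for yi, wi in pairs if yi <= q_hat)
    return below / total_wy


share_equal_weights = lorenz_share([4.0, 1.0, 3.0, 2.0], [1.0, 1.0, 1.0, 1.0], 0.5)
share_unequal_weights = lorenz_share([1.0, 3.0], [3.0, 1.0], 0.5)
```

In the first call the poorest half of the (equally weighted) households holds 3 of 10 units of income; in the second, the low-income household carries three quarters of the weight, so it alone already covers the bottom half.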

Now we state the main proposition, whose proof is outlined in the appendix.

Proposition 2 (a) If $F(\cdot)$ has compact support and is continuously differentiable with a strictly positive derivative $f$ on its entire support,[8] then

$$\sup_{p \in [0,1]} \left| \hat\Lambda(p) - \Lambda(p) \right| \xrightarrow{P} 0$$

and

$$\sqrt{n}\, (\hat\Lambda - \Lambda) \Rightarrow G,$$

where

$$G(p) = \frac{\mathcal{G}(p)}{\mu_0} - \frac{\lambda(p)}{\mu_0^2}\, \mathcal{G}(1), \qquad \mathcal{G}(p) \overset{L}{=} \int_0^{Q(p)} H(u)\, du,$$

is a Gaussian process with absolutely continuous sample paths.

Proposition 3 (a) Under the same assumptions as Proposition 2,

$$(\hat\gamma - \gamma_0) \xrightarrow{P} 0, \qquad \sqrt{n}\, (\hat\gamma - \gamma_0) \xrightarrow{d} N(0, V_\gamma).$$

Moreover, a consistent estimate of $V_\gamma$ is given by

$$\begin{aligned}
\hat V_\gamma ={}& \sum_{s=1}^{S} \sum_{c_s=1}^{n_s} \sum_{h=1}^{k} w_{sc_sh}^2\, \hat\psi_{sc_sh}^2 \\
&+ \sum_{s=1}^{S} \sum_{c_s=1}^{n_s} \sum_{h=1}^{k} \sum_{h' \neq h} w_{sc_sh}\, w_{sc_sh'}\, \hat\psi_{sc_sh}\, \hat\psi_{sc_sh'} \\
&- \sum_{s=1}^{S} \frac{1}{n_s} \left( \sum_{c_s=1}^{n_s} \sum_{h=1}^{k} w_{sc_sh}\, \hat\psi_{sc_sh} \right)^2,
\end{aligned}$$

where

$$\hat\psi_{sc_sh} = \int_0^1 \tilde\psi_{sc_sh}(p)\, dp,$$

$$\tilde\psi_{sc_sh}(p) = \frac{1}{\hat\mu} \left[ y_{sc_sh}\, 1\left(y_{sc_sh} \le \hat Q(p)\right) - \hat\lambda(p) + \hat Q(p) \left( p - 1\left(y_{sc_sh} \le \hat Q(p)\right) \right) \right] - \frac{\hat\lambda(p)}{\hat\mu^2}\, (y_{sc_sh} - \hat\mu).$$

The proofs of these two propositions are outlined in the appendix.

[8] A sufficient condition for this is that for each $s$, $F(\cdot \mid s)$ has compact support and admits a continuous and strictly positive density on the support.

Remark 1 The assumption that the density is bounded away from 0 on the whole support is too strong, ruling out parametric distributions like the lognormal. In section 5 we provide a subset of analogous results when the density is assumed to be bounded away from 0 on every compact subset of the support. However, at this stage, we do not have the full set of analogous results for this case and leave the determination of those to future research.

Remark 2 An alternative approach to deriving the asymptotic distribution of the Gini is through U-statistics (cf. Bishop et al., 1997). It is not hard to generalize that approach to the case of complex surveys. The GMM-based approach we pursue here is applicable to many other inequality indices, like the Atkinson and generalized entropy classes (including the Theil index), and to measures of poverty like the Foster-Greer-Thorbecke class, which do not admit a U-statistic representation. Our approach also leads naturally to inference on entire sample curves and therefore permits more robust tests of inequality dominance than those based on a single summary statistic.

4 Testing: Theory

4.1 Gini

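The influence functions for the Gini coefficient are given by

$$\sqrt{n}\, (\hat\gamma - \gamma_0) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \psi_i + o_p(1),$$

where

$$\psi_i = \int_0^1 dp \left\{ \sum_{s=1}^{S} \frac{H_s}{a_s}\, 1(s_i = s)\, \frac{M(s_i, i)}{k} \sum_{h=1}^{k} \nu_{s_i i h} \left[ \frac{1}{\mu} \left\{ y_{s_i i h}\, 1(y_{s_i i h} \le Q(p)) - \lambda(p) \right\} + \frac{1}{\mu} \left\{ Q(p) \left( p - 1(y_{s_i i h} \le Q(p)) \right) \right\} - \frac{\lambda(p)}{\mu^2}\, (y_{s_i i h} - \mu) \right] \right\},$$

which are estimated by

$$\hat\psi_i = \frac{1}{N} \sum_{j=1}^{N} \left\{ \sum_{s=1}^{S} \frac{H_s}{a_s}\, 1(s_i = s)\, \frac{M(s_i, i)}{k} \sum_{h=1}^{k} \nu_{s_i i h} \left[ \frac{1}{\hat\mu} \left\{ y_{s_i i h}\, 1(y_{s_i i h} \le y_{(j)}) - \hat\lambda\left(\tfrac{j}{N}\right) \right\} + \frac{1}{\hat\mu}\, y_{(j)} \left( \hat p\left(\tfrac{j}{N}\right) - 1(y_{s_i i h} \le y_{(j)}) \right) - \frac{\hat\lambda\left(\tfrac{j}{N}\right)}{\hat\mu^2}\, (y_{s_i i h} - \hat\mu) \right] \right\}, \tag{8}$$

where $y_{(j)}$ is the $j$th order statistic in the combined sample, $N$ is the total number of observations and

$$\hat\lambda\left(\tfrac{j}{N}\right) = \frac{\sum_{s=1}^{S} \sum_{c_s=1}^{n_s} \sum_{h=1}^{k} W_{sc_sh}\, y_{sc_sh}\, 1\left(y_{sc_sh} \le y_{(j)}\right)}{\sum_{s=1}^{S} \sum_{c_s=1}^{n_s} \sum_{h=1}^{k} W_{sc_sh}}, \qquad \hat p\left(\tfrac{j}{N}\right) = \frac{\sum_{s=1}^{S} \sum_{c_s=1}^{n_s} \sum_{h=1}^{k} W_{sc_sh}\, 1\left(y_{sc_sh} \le y_{(j)}\right)}{\sum_{s=1}^{S} \sum_{c_s=1}^{n_s} \sum_{h=1}^{k} W_{sc_sh}}.$$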

Given two independent populations, indexed by 1 and 2, a test for Gini dominance of 2 over 1, implying that overall inequality is higher in 2, i.e.

$$H_0 : \gamma_1 \le \gamma_2 \quad \text{versus} \quad H_1 : \gamma_1 > \gamma_2,$$

is a standard one-sided test for normal means. Under the null of equality (the 'worst case'), the difference in the sample Ginis is asymptotically normally distributed with mean 0 and variance $V_1 + V_2$, where $V_j$ is the asymptotic variance of $\hat\gamma_j$. Noting the similarity between $\psi_i$ and $\tilde m_i(\cdot)$ in (4), one would get stratum and cluster effects analogous to (5) in the estimate of the variance of the Gini coefficient.
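A one-sided test for normal means of this kind can be sketched as follows. The sketch assumes the two estimates come from independent samples and that `se1` and `se2` are their design-corrected standard errors (the square roots of the estimated variances of the estimates themselves, an assumption about scaling that the text leaves implicit); the function name and the numbers are illustrative.

```python
import math


def one_sided_normal_test(gamma1, se1, gamma2, se2):
    """Test H0: gamma1 <= gamma2 against H1: gamma1 > gamma2.

    Returns the z statistic and the one-sided p-value 1 - Phi(z).
    """
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2)   # independent samples: variances add
    z = (gamma1 - gamma2) / se_diff
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))
    return z, p_value


# Hypothetical estimates and standard errors.
z_stat, p_val = one_sided_normal_test(0.40, 0.03, 0.35, 0.04)
```

Because the standard errors enter only through the sum of their squares, underestimating either one (for instance by ignoring clustering) inflates the z statistic and makes rejection too easy.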

4.2 Lorenz dominance

Lorenz dominance means that the Lorenz curve for population 1 lies above the Lorenz curve for population 2 everywhere. This is the strongest form of inequality dominance and implies dominance in terms of all proper measures of inequality. If $\Lambda_j(p)$ denotes the Lorenz curve for population $j$, for $j = 1, 2$, and $\delta(p) = \Lambda_2(p) - \Lambda_1(p)$, then we can have the following exhaustive cases:

H1: $\delta(p) = 0$ for all $p$

H2: $\delta(p) \le 0$ for all $p$ with $\delta(p) < 0$ for some $p$

H3: $\delta(p) < 0$ for all $p$

H4: $\delta(p) > 0$ for some $p$ and $\delta(p) < 0$ for some $p$

H5: $\delta(p) \ge 0$ for all $p$ with $\delta(p) > 0$ for some $p$

H6: $\delta(p) > 0$ for all $p$.

In applied work, one is usually interested in testing two types of hypotheses: one, whether the Lorenz curve for population 1 is equal to or lies above that for population 2 everywhere, versus the alternative that it does not; and two, whether the curves are identical for the two populations, versus the alternative that there is strict dominance. The first test amounts to testing the composite null $H = H_1 \cup H_2 \cup H_3$ versus $K = H_4 \cup H_5 \cup H_6$. The second test amounts to testing the simple null $H = H_1$ versus $K = H_2 \cup H_3$.

In the samples, the numbers of clusters $n_1$ and $n_2$ will generally be different, and we shall assume that

$$\kappa = \lim_{n_1, n_2 \to \infty} \sqrt{\frac{n_2}{n_1}},$$

where $n_2 / n_1$ is the observed ratio of sample sizes.

We shall base the first test on the statistic

$$\hat U = \sqrt{n_1} \int_0^1 \hat\delta(p)\, 1\left(\hat\delta(p) > 0\right) dp,$$

where $\hat\delta(p) = \hat\Lambda_2(p) - \hat\Lambda_1(p)$ is the estimated difference between the two Lorenz curves, and shall reject the null for large values of $\hat U$, whereas the second test will be based on

$$\hat U' = \sqrt{n_1} \int_0^1 \hat\delta(p)\, 1\left(\hat\delta(p) < 0\right) dp.$$

For the purpose of this paper, we will not be concerned with finding the optimal test and shall content ourselves with a test that is consistent. In assessing the properties of these tests below, we shall focus on the first test only; the second is analogous.
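As a rough illustration of the first statistic, the sketch below approximates the integral of the positive part of the estimated difference between the two Lorenz curves by an average over an equally spaced grid of population fractions, and scales it by the square root of the number of clusters in the first sample. The grid approximation, the function name and the inputs are assumptions made for this example; critical values would still have to be simulated, as discussed in the text.

```python
import math


def dominance_statistic(lorenz1, lorenz2, n1):
    """Grid approximation to sqrt(n1) * integral of max(L2(p) - L1(p), 0) dp.

    lorenz1, lorenz2: estimated Lorenz shares of populations 1 and 2,
    evaluated on the same equally spaced grid of population fractions.
    """
    if len(lorenz1) != len(lorenz2) or not lorenz1:
        raise ValueError("both curves must be evaluated on the same non-empty grid")
    positive_part = [max(l2 - l1, 0.0) for l1, l2 in zip(lorenz1, lorenz2)]
    return math.sqrt(n1) * sum(positive_part) / len(positive_part)


# Hypothetical curves on the grid p = 0.25, 0.5, 0.75, 1.0.
curve1 = [0.1, 0.3, 0.6, 1.0]
curve2 = [0.2, 0.3, 0.5, 1.0]
u_hat = dominance_statistic(curve1, curve2, n1=100)
```

Only the grid points where the second curve lies above the first contribute, so a curve for population 2 that lies weakly below that of population 1 everywhere gives a statistic of exactly zero.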

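Under the most conservative situation, viz. $\Lambda_1(p) = \Lambda_2(p)$ for all $p$, the null distribution of $\hat U$ will depend on the underlying distribution functions $F_1$ and $F_2$. So we have to simulate the null distribution of $\hat U$ based on the data to find the critical values. The following propositions characterize the properties of the test. The first proposition shows that by fixing the critical region such that the size of the test equals $\alpha$ at equality, i.e. for $H = H_1$, we are guaranteed a size of at most $\alpha$ for the composite one-sided null hypothesis $H = H_1 \cup H_2 \cup H_3$.

Proposition 4 Let $z_\alpha$ solve

$$\Pr\left( \sqrt{n_1} \int_0^1 \hat\delta(p)\, 1\left(\hat\delta(p) > 0\right) dp \ge z_\alpha \,\Big|\, \delta(p) = 0 \;\; \forall p \right) = \alpha.$$

Then for all $\delta(\cdot)$ satisfying $\delta(p) \le 0 \;\forall p$ with strict inequality for some $p$, we have

$$\Pr\left( \sqrt{n_1} \int_0^1 \hat\delta(p)\, 1\left(\hat\delta(p) > 0\right) dp \ge z_\alpha \,\Big|\, \delta(p) \right) \le \alpha.$$

Proof. Suppose the true curve $\delta(\cdot)$ satisfies $\delta(p) \le 0$ for all $p$ with strict inequality for some $p$. Then

$$\sqrt{n_1} \int_0^1 \hat\delta(p)\, 1\left(\hat\delta(p) > 0\right) dp \le \sqrt{n_1} \int_0^1 \left\{ \hat\delta(p) - \delta(p) \right\} 1\left(\hat\delta(p) - \delta(p) > 0\right) dp.$$

Therefore,

$$\begin{aligned}
&\Pr\left( \sqrt{n_1} \int_0^1 \hat\delta(p)\, 1\left(\hat\delta(p) > 0\right) dp \ge z_\alpha \,\Big|\, \delta(p) \right) \\
&\quad \le \Pr\left( \sqrt{n_1} \int_0^1 \left\{ \hat\delta(p) - \delta(p) \right\} 1\left(\hat\delta(p) - \delta(p) > 0\right) dp \ge z_\alpha \,\Big|\, \delta(p) \right) \\
&\quad = \alpha.
\end{aligned}$$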

The next lemma, which will be used repeatedly in the paper, shows that the map from the Lorenz shares to the test statistic is continuous.

Lemma 1 Under the same assumptions as in Proposition 2, the map $z : D[0,1] \to \mathbb{R}_+$ defined by $z(\Delta) = \int_0^1 \Delta(p)\, \mathbf{1}(\Delta(p) > 0)\, dp$ is continuous with respect to the sup norm, where $D[0,1]$ is the space of bounded cadlag functions, equipped with the sup norm (the Skorohod space).9

Proof. For any two elements $\Delta_1, \Delta_2 \in D[0,1]$, we have that

$$\begin{aligned}
|z(\Delta_1) - z(\Delta_2)|
&= \left| \int_0^1 \Delta_1(p)\, \mathbf{1}(\Delta_1(p) > 0)\, dp - \int_0^1 \Delta_2(p)\, \mathbf{1}(\Delta_2(p) > 0)\, dp \right| \\
&\le \int_0^1 \left| \Delta_1(p)\, \mathbf{1}(\Delta_1(p) > 0) - \Delta_2(p)\, \mathbf{1}(\Delta_2(p) > 0) \right| dp \\
&= \int_{\{p:\, \Delta_1(p) > 0,\ \Delta_2(p) > 0\}} |\Delta_1(p) - \Delta_2(p)|\, dp
 + \int_{\{p:\, \Delta_1(p) > 0,\ \Delta_2(p) \le 0\}} |\Delta_1(p)|\, dp
 + \int_{\{p:\, \Delta_1(p) \le 0,\ \Delta_2(p) > 0\}} |\Delta_2(p)|\, dp \\
&\le \int_{\{p:\, \Delta_1(p) > 0,\ \Delta_2(p) > 0\}} |\Delta_1(p) - \Delta_2(p)|\, dp
 + \int_{\{p:\, \Delta_1(p) > 0,\ \Delta_2(p) \le 0\}} |\Delta_1(p) - \Delta_2(p)|\, dp
 + \int_{\{p:\, \Delta_1(p) \le 0,\ \Delta_2(p) > 0\}} |\Delta_2(p) - \Delta_1(p)|\, dp \\
&\le \int_0^1 |\Delta_1(p) - \Delta_2(p)|\, dp \\
&\le \sup_{p \in [0,1]} |\Delta_1(p) - \Delta_2(p)|.
\end{aligned}$$

The region where both $\Delta_1(p) \le 0$ and $\Delta_2(p) \le 0$ contributes zero to the integral. This demonstrates that $z(\cdot)$ is Lipschitz and therefore continuous.

9 Note that under the assumptions of Proposition 3, $\Delta(\cdot)$ is continuous but $\hat\Delta(\cdot)$ need not be. In fact, the definition of $\hat\Delta(\cdot)$ shows that it is cadlag. This is why we are considering the space $D[0,1]$ rather than $C[0,1]$.

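Proposition 5 Under the same assumptions as in Proposition 2 and $\Delta(p) = 0$ for all $p$,

$$\hat U \xrightarrow{d} \int_0^1 \tilde L(p)\, \mathbf{1}(\tilde L(p) > 0)\, dp,$$

where

$$\tilde L(p) = \frac{1}{\rho}\, \frac{G_2(p) - \lambda_2(p)\, G_2(1)}{\mu_{20}} - \frac{G_1(p) - \lambda_1(p)\, G_1(1)}{\mu_{10}},$$

$$G_j(p) = -\int_0^p Q'_{0j}(u)\, (H_j \circ Q_{0j})(u)\, du, \quad j = 1, 2,$$

and $G_j(1)$ is the limit of $\sqrt{n_j}\,(\hat\mu_j - \mu_{0j})$, $j = 1, 2$.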

Proof. The proof follows from Lemma 1 and Proposition 2, under the continuous mapping theorem for functionals.

Proposition 6 (Consistency of test): Under the assumptions of Proposition 2,

$$\lim_{n \to \infty} \Pr\left[ \hat U \ge d \mid \Delta(p) \right] = 1$$

for all $\Delta(p)$ such that $\Delta(p) > 0$ for some $p$.

Note that $\Delta(p) > 0$ for some $p$ is the same as the alternative K = H4 ∪ H5 ∪ H6.

Proof. Given that $F_1$ and $F_2$ admit continuous densities, the function $\Delta(p)$ is continuous in $p$. So if $\Delta(p) > 0$ for some $p$, say $p_0$, there exists $\varepsilon > 0$ such that $\Delta(\cdot)$ is strictly positive on $(p_0 - \varepsilon, p_0 + \varepsilon)$. Let us assume that $\Delta(\cdot)$ is non-positive outside this interval, which is the worst case for us (i.e. this is the 'smallest' possible deviation from the null). Now, Lemma 1, together with the continuous mapping theorem for functionals, implies that

$$\int_0^1 \hat\Delta(p)\, \mathbf{1}(\hat\Delta(p) > 0)\, dp \xrightarrow{P} \int_0^1 \Delta(p)\, \mathbf{1}(\Delta(p) > 0)\, dp = \int_{p_0 - \varepsilon}^{p_0 + \varepsilon} \Delta(p)\, dp > 0.$$

So $\hat U = \sqrt{n_1} \int_0^1 \hat\Delta(p)\, \mathbf{1}(\hat\Delta(p) > 0)\, dp \xrightarrow{P} \infty$ as $n_1 \to \infty$.

4.3

The null distribution of $\hat U$

The null distribution of $\hat U$ is non-standard, depends on nuisance parameters (in particular, the variance of $\hat U$ depends on the underlying distribution functions) and cannot be simulated directly because of the complex survey design issues. We therefore resort to the bootstrap, adapted to the sample design under study, to derive the distribution of $\hat U$ under the true data generating process, as follows. Note that due to the lack of pivotalness, we do not expect the bootstrap to give asymptotic refinements over the limiting distribution approximation (which is unknown); the purpose here is to produce consistent tests of hypothesis when we do not know the true asymptotic distribution theory under the null.

For population $j$ (independently for $j = 1, 2$), within every stratum $s$, draw a sample of $n_{js}$ clusters with replacement from the clusters within that stratum in the original sample, where $n_{js}$ is the total number of clusters in stratum $s$ in the original sample. Retain all households from each selected cluster together with their corresponding weights. Compute the statistic $\hat\lambda_j^*(p)$ for each $p$. Compute the statistic

$$U^* = \sqrt{n_1} \int_0^1 \{\hat\Delta^*(p) - \hat\Delta(p)\}\, \mathbf{1}\{\hat\Delta^*(p) - \hat\Delta(p) > 0\}\, dp,$$

where

$$\hat\Delta^*(p) = \hat\lambda_2^*(p) - \hat\lambda_1^*(p), \qquad \hat\Delta(p) = \hat\lambda_2(p) - \hat\lambda_1(p).$$

Perform this operation independently $B$ times, generating the statistics $U_1^*, U_2^*, \ldots, U_B^*$. The distribution of this statistic for a large $B$ is an approximation to the bootstrap distribution of $\hat U$, which in turn is a 'good' approximation (in a sense made clear in the next proposition) to the true asymptotic distribution of $\hat U$, properly centered (i.e. the asymptotic distribution of $\sqrt{n_1} \int_0^1 \{\hat\Delta(p) - \Delta(p)\}\, \mathbf{1}\{\hat\Delta(p) - \Delta(p) > 0\}\, dp$ under the true data generating process). Under the null hypothesis of equality, this equals the limiting distribution of $\sqrt{n_1} \int_0^1 \hat\Delta(p)\, \mathbf{1}\{\hat\Delta(p) > 0\}\, dp$. We reject the hypothesis at level $\alpha$ if the observed value $\hat U$ exceeds the $100(1 - \alpha)$th percentile of the simulated distribution.

The justification for this process follows from the "empirical bootstrap" results (c.f. van der Vaart, 1998, theorems 23.7 and 23.8), given the continuity and Hadamard differentiability of the maps from the cdf's to the quantiles to the Lorenz shares and Lemma 1. Note that resampling the clusters is the appropriate procedure since the asymptotics here is on the number of clusters. Since we have assumed that all the dependence in the data is within clusters, we know the cluster identities of individuals and can therefore "preserve" the population dependence structure in our bootstrap population. As a result, we do not run into the complications cited in

Hall and Horowitz (1996, pages 898-9) in the context of using a block bootstrap technique (for GMM estimation) with dependent data where it is not known a priori which observations are correlated.

In our empirical applications, we shall compare results from the "design adapted" bootstrap to those from the naive bootstrap that simply draws subsamples (of both the Y values and the weights) of size $n_j$ from the $j$th population. The statistics, as for the Gini, will always be computed using the sample weights. The number of bootstrap replications was chosen to be around 1000 (the actual number of replications was decided on a case-by-case basis by the robustness of the p-values across the number of draws). Note that it is infeasible to implement the procedure of Buchinsky and Andrews (1998) to determine the optimal number of bootstrap replications, since the limiting distribution of the test statistic is non-standard.

Proposition 7 (Consistency of the bootstrap): Under the same assumptions as Proposition 2,

$$\sup_{h \in BL_1(\ell^\infty(0,1))} \left| E_M\, h\!\left( \sqrt{n_1} \int_0^1 \Delta^*(p)\, \mathbf{1}(\Delta^*(p) > 0)\, dp \right) - E\, h\!\left( \int_0^1 \tilde L(p)\, \mathbf{1}(\tilde L(p) > 0)\, dp \right) \right| \xrightarrow{P} 0,$$

where

$$\Delta^*(p) = \hat\Delta^*(p) - \hat\Delta(p),$$

$BL_1(\ell^\infty(0,1))$ is the set of uniformly Lipschitz functionals with domain $\ell^\infty(0,1)$, $E_M(\cdot)$ denotes expectation with respect to the bootstrap sampling distribution, conditional on the sample, and $\tilde L(\cdot)$ is the limit of $\sqrt{n_1}\,(\hat\Delta(\cdot) - \Delta(\cdot))$, so that $\int_0^1 \tilde L(p)\, \mathbf{1}(\tilde L(p) > 0)\, dp$ is the limiting distribution of the sequence

$$\sqrt{n_1} \int_0^1 \{\hat\Delta(p) - \Delta(p)\}\, \mathbf{1}\{\hat\Delta(p) - \Delta(p) > 0\}\, dp.$$

Note that under $\Delta(p) = 0$ for all $p$, this is the limiting distribution of $\hat U$ as defined in Proposition 4.

Proof. We consider only one of the two populations. Given our assumptions for Proposition 3 (essentially that our class of functions $\mathbf{1}(y_{scj} \le x)$ is Donsker), it follows that for each $s$, the centered bootstrap estimate $\sqrt{n_s}\,(F^*(\cdot|s) - \hat F(\cdot|s))$ converges in distribution to the same limit as $\sqrt{n_s}\,(\hat F(\cdot|s) - F(\cdot|s))$ with probability approaching 1 (or almost surely for all samples) (c.f. Theorem 3.6.1 in van der Vaart and Wellner, 1996). Or more formally, for each $s$,

$$\sup_{h \in BL_1(\ell^\infty(\mathcal{F}_s))} \left| E_M\, h\!\left( \sqrt{n_s}\,(F^*(\cdot|s) - \hat F(\cdot|s)) \right) - E\, h(H_s) \right| \xrightarrow{P} 0,$$

where $BL_1(\ell^\infty(\mathcal{F}_s))$ denotes the set of uniformly Lipschitz functions on $\ell^\infty(\mathcal{F}_s)$, with

$$\mathcal{F}_s = \{\mathbf{1}(Y_s \le x)\}_{x \in \mathrm{supp}(Y|s)},$$

$E_M$ denotes expectation with respect to the bootstrap sampling distribution, conditional on the sample, and $H_s$ denotes the limit to which $\sqrt{n_s}\,(\hat F(\cdot|s) - F(\cdot|s))$ converges. Note that the Donsker property of the class $\mathcal{F}_s$ implies that we can view $\sqrt{n_s}\,(\hat F(\cdot|s) - F(\cdot|s))$ and $\sqrt{n_s}\,(F^*(\cdot|s) - \hat F(\cdot|s))$ as maps into $\ell^\infty(\mathcal{F}_s)$.

Now, since the overall c.d.f. is a linear functional (weighted average) of the stratum cdf's and therefore differentiable, it follows from the delta method (c.f. Theorem 3.9.11, van der Vaart and Wellner, 1996) that

$$\sup_{h \in BL_1(\ell^\infty(\mathcal{F}_s))} \left| E_M\, h\!\left( \sqrt{n_1}\,(F^*(\cdot) - \hat F(\cdot)) \right) - E\, h(H) \right| \xrightarrow{P} 0,$$

where $H(\cdot)$ denotes the limit to which $\sqrt{n_1}\,(\hat F(\cdot) - F(\cdot))$ converges and $n_1$ is the total number of clusters in population 1. By repeated use of the delta method, given the Hadamard differentiability of the maps from the cdf's to the Lorenz shares, it follows that the bootstrap is consistent for the Lorenz shares as well, i.e.

$$\sup_{h \in BL_1(D[0,1])} \left| E_M\, h\!\left( \sqrt{n_1}\,(\hat\lambda^*(\cdot) - \hat\lambda(\cdot)) \right) - E\, h(L) \right| \xrightarrow{P} 0,$$

where $L(\cdot)$ is the limiting distribution of $\sqrt{n_1}\,(\hat\lambda(\cdot) - \lambda(\cdot))$ and $D[0,1]$ is the space of cadlag functions on $[0,1]$.

Now, combining results for the two populations (using the fact that independence implies joint convergence in distribution), and noting that the sample sizes are of the same order, the continuous mapping theorem and Lemma 1 imply that

$$\sup_{h \in BL_1(\ell^\infty(0,1))} \left| E_M\, h\!\left( \sqrt{n_1} \int_0^1 \Delta^*(p)\, \mathbf{1}(\Delta^*(p) > 0)\, dp \right) - E\, h\!\left( \int_0^1 \tilde L(p)\, \mathbf{1}(\tilde L(p) > 0)\, dp \right) \right| \xrightarrow{P} 0 \qquad (9)$$

where

$$\Delta^*(p) = \hat\Delta^*(p) - \hat\Delta(p), \qquad \hat\Delta^*(p) = \hat\lambda_2^*(p) - \hat\lambda_1^*(p), \qquad \hat\Delta(p) = \hat\lambda_2(p) - \hat\lambda_1(p),$$

which establishes that the bootstrap is consistent for the test statistic $\hat U$. The definition of convergence in distribution in (9) is equivalent to the c.d.f. of the sequence of random variables $\sqrt{n_1} \int_0^1 \Delta^*(p)\, \mathbf{1}(\Delta^*(p) > 0)\, dp$ converging to the c.d.f. of $\int_0^1 \tilde L(p)\, \mathbf{1}(\tilde L(p) > 0)\, dp$.

We suggest a sequential test procedure for dominance, as follows. First consider testing H01: $\lambda_1(p) \ge \lambda_2(p)$ for all $p$ versus H11: $\lambda_1(p) < \lambda_2(p)$ for some $p$. If we reject the null, we reject Lorenz dominance of population 1 over population 2. If we accept the null, based on

$$\hat U = \sqrt{n_1} \int_0^1 \{\hat\lambda_2(p) - \hat\lambda_1(p)\}\, \mathbf{1}\{\hat\lambda_2(p) - \hat\lambda_1(p) > 0\}\, dp,$$

we move on to test H02: $\lambda_1(p) = \lambda_2(p)$ for all $p$ versus H12: $\lambda_1(p) > \lambda_2(p)$ for some $p$. The second test is based on

$$\hat U' = \sqrt{n_1} \int_0^1 \{\hat\lambda_1(p) - \hat\lambda_2(p)\}\, \mathbf{1}\{\hat\lambda_1(p) - \hat\lambda_2(p) > 0\}\, dp.$$

We fix critical values $c$ and $d$ for the two tests corresponding to the levels $\alpha/2$ each. This guarantees that the overall level of the test is $\alpha$. Indeed, the overall probability of type 1 error equals

$$P(\hat U > c \mid H_{01}) + P(\hat U < c,\ \hat U' > d \mid H_{02}) \le P(\hat U > c \mid H_{01}) + P(\hat U' > d \mid H_{02}) \le \alpha/2 + \alpha/2 = \alpha.$$

Consistency of the second test follows from the same argument as that for the first.
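The design-adapted bootstrap of this section can be sketched in code as follows. This is a minimal illustration under stated assumptions, not the implementation used for the results in this paper: the data layout (a dict with arrays `y`, `w`, `stratum`, `cluster`), the function names, the linear interpolation of the Lorenz curve, and the grid approximation of the integral are all hypothetical choices made for the example.

```python
import numpy as np

def lorenz(y, w, grid):
    """Weighted Lorenz shares on grid (linear interpolation, an assumption)."""
    order = np.argsort(y)
    y, w = y[order], w[order]
    pop = np.concatenate([[0.0], np.cumsum(w)]) / w.sum()
    inc = np.concatenate([[0.0], np.cumsum(w * y)]) / np.sum(w * y)
    return np.interp(grid, pop, inc)

def resample_clusters(strata, clusters, rng):
    """Row indices for one draw: within each stratum, sample as many clusters
    as the stratum contains, with replacement, keeping all rows of each cluster."""
    rows = []
    for s in np.unique(strata):
        in_s = strata == s
        ids = np.unique(clusters[in_s])
        for c in rng.choice(ids, size=ids.size, replace=True):
            rows.append(np.flatnonzero(in_s & (clusters == c)))
    return np.concatenate(rows)

def bootstrap_pvalue(pop1, pop2, B=1000, m=200, seed=0):
    """Return (U_hat, p-value) for the first test, resampling each population independently."""
    rng = np.random.default_rng(seed)
    grid = (np.arange(m) + 0.5) / m
    # n1 = number of distinct (stratum, cluster) pairs in population 1.
    n1 = len(set(zip(pop1["stratum"].tolist(), pop1["cluster"].tolist())))
    positive_part = lambda d: np.mean(np.maximum(d, 0.0))
    delta_hat = lorenz(pop2["y"], pop2["w"], grid) - lorenz(pop1["y"], pop1["w"], grid)
    u_hat = np.sqrt(n1) * positive_part(delta_hat)
    u_star = np.empty(B)
    for b in range(B):
        lam = []
        for pop in (pop1, pop2):
            idx = resample_clusters(pop["stratum"], pop["cluster"], rng)
            lam.append(lorenz(pop["y"][idx], pop["w"][idx], grid))
        # Centered bootstrap statistic: positive part of (delta* - delta_hat).
        u_star[b] = np.sqrt(n1) * positive_part((lam[1] - lam[0]) - delta_hat)
    return u_hat, float(np.mean(u_star >= u_hat))
```

The p-value is the share of centered bootstrap statistics at least as large as the observed statistic. The same routine with the roles of the two populations exchanged gives a p-value for the second test.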

5

Relaxing assumptions

As we mentioned in the remark following Propositions 2a and 3a, the assumption that the density is bounded away from 0 uniformly over the support is too strong. In what follows, we state propositions analogous to 2a, 3a, 4a and 5a corresponding to the weaker assumption on the density.

Proposition 8 (2b) If $F(\cdot)$ is continuously differentiable with a derivative $f$ which is uniformly bounded away from 0 on every compact subset of the support of $F$, then

$$\sup_{p \in [0,1]} |\hat\lambda(p) - \lambda(p)| \xrightarrow{P} 0.$$

Proposition 9 (3b) For every $0 < p_0 < p_1 < 1$,

$$\sqrt{n}\,(\hat\lambda - \lambda) \rightsquigarrow L,$$

where the convergence is in $\ell^\infty[p_0, p_1]$ and $L(\cdot)$ is a Gaussian process with absolutely continuous sample paths. Moreover,

$$\sqrt{n}\,(\tilde\gamma - \tilde\gamma_0) \rightsquigarrow N(0, \tilde V),$$

where

$$\tilde\gamma_0 = 1 - 2 \int_{p_0}^{p_1} \lambda(p)\, dp, \qquad \tilde\gamma = 1 - 2 \int_{p_0}^{p_1} \hat\lambda(p)\, dp.$$

Remark 3 Note that even though we can prove uniform consistency on all of $[0,1]$, we can prove weak convergence only on $[p_0, p_1]$ for every $0 < p_0 < p_1 < 1$. (Below, we conjecture that we can extend the result to $\ell^\infty[0,1]$ and suggest four possible ways of proving this result. The actual establishment of this is left to future research.) $\tilde\gamma_0$, in consequence, is an approximation to the true Gini which can be made arbitrarily close to the true Gini by choosing $p_0$ arbitrarily close to 0 and $p_1$ arbitrarily close to 1.

Remark 4 If the conditions of Proposition 2b hold, we can test propositions of the type H1-H6, where the conditions "for all $p \in [0,1]$" and "for some $p \in [0,1]$" are to be replaced by "for all $p \in (0,1)$" and "for some $p \in (0,1)$".

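We now provide a proof of the uniform consistency result and state the conjecture with possible methods of proof.

Proof. Since $\lambda(p)$ is monotone nondecreasing in $p$ and bounded between $\lambda(0) = 0$ and $\lambda(1) = 1$, for every $\varepsilon > 0$ there exists a partition $0 = p_0 < p_1 < \ldots < p_k < p_{k+1} = 1$ such that for all $j = 1, \ldots, k+1$,

$$\lambda(p_j) - \lambda(p_{j-1}) < \varepsilon.$$

Now, for every $p_{j-1} \le p < p_j$ we have

$$\hat\lambda(p) - \lambda(p) \le \hat\lambda(p_j) - \lambda(p_j) + \varepsilon,$$
$$\hat\lambda(p) - \lambda(p) \ge \hat\lambda(p_{j-1}) - \lambda(p_{j-1}) - \varepsilon, \qquad (10)$$

given that $\hat\lambda(\cdot)$ is also monotone nondecreasing and the definition of the $p_j$'s. Now for every fixed $p$, $\hat\lambda(p) \xrightarrow{a.s.} \lambda(p)$, and therefore $\hat\lambda(p_j) \xrightarrow{a.s.} \lambda(p_j)$ uniformly over the finitely many partition points $p_j$. Therefore, (10) implies that

$$\limsup_n\, \sup_{p \in [0,1]} |\hat\lambda(p) - \lambda(p)| < \varepsilon, \quad a.s.$$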

Since $\varepsilon > 0$ is arbitrary, we are done.

Conjecture 1 If $F(\cdot)$ is continuously differentiable with a derivative $f$ which is uniformly bounded away from 0 on every compact subset of the support of $F$, then

$$\sqrt{n}\,(\hat\lambda - \lambda) \rightsquigarrow L,$$

where the convergence is in $\ell^\infty[0,1]$ and $L(\cdot)$ is a Gaussian process with absolutely continuous sample paths. Consequently,

$$(\hat\gamma - \gamma) \xrightarrow{P} 0, \qquad \sqrt{n}\,(\hat\gamma - \gamma) \xrightarrow{d} N(0, V).$$

There are four possible approaches to the proof of this statement. The first is to show that for any $\varepsilon > 0$, $\eta > 0$, there exists a partition of $[0,1]$ into finitely many intervals $T_1, T_2, \ldots, T_k$ such that

$$\limsup_n\, \Pr\left( \sup_i\, \sup_{p, q \in T_i} \left| \sqrt{n}\,(\hat\lambda(p) - \lambda(p)) - \sqrt{n}\,(\hat\lambda(q) - \lambda(q)) \right| > \eta \right) < \varepsilon,$$

using a technique similar to that used in the uniform consistency proof. Then weak convergence follows by e.g. Theorem 18.14 of van der Vaart (1998).

The second is to use a method analogous to the one used to show that L-statistics, i.e. linear combinations of order statistics, are asymptotically normal. Indeed, for a simple random sample and for any given $p$, observe that

$$\hat\lambda(p) = \frac{\frac{1}{n} \sum_{i=1}^{[np]} Y_{n(i)}}{\hat\mu},$$

where $Y_{n(i)}$ is the $i$th order statistic and $[np]$ is the largest integer less than or equal to $np$.

A third possible method would be to utilize the approach of Andrews (1994) for 'MINPIN' estimators or Chen et al (2003)'s approach for semiparametric estimators with nonsmooth criterion functions, as follows. Our estimator for $\lambda(\cdot)$ solves

$$\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \left\{ \hat\mu\, \lambda(p) - Y_i\, \mathbf{1}(\hat F(Y_i) \le p) \right\} = 0,$$

where $\hat\mu$ is the sample mean and $\hat F(\cdot)$ is the empirical c.d.f., which can be viewed as an asymptotically 'well-behaved' nonparametric estimator of the true c.d.f. The criterion function is smooth in $\lambda(\cdot)$ and the nonparametric estimator $\hat F$ is uniformly consistent for the true c.d.f. If one can show that the sequence

$$v_n(p, F) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \left\{ m(\lambda_0(p), F) - E\, m(\lambda_0(p), F) \right\},$$

where

$$m(\lambda(p), F) = \mu\, \lambda(p) - Y_i\, \mathbf{1}(F(Y_i) \le p),$$

$\mu$ is the population mean and $\lambda_0(\cdot)$ is the true $\lambda(\cdot)$, is stochastically equicontinuous uniformly in $p$ (which is not hard to show given that the function $y\, \mathbf{1}(F(y) \le p)$ is of bounded variation with envelope $y$ for all $p$ and we also have that $E|y| < \infty$) and also that

$$\frac{1}{\sqrt{n}} \sum_{i=1}^{n} E\left\{ \mu\, \lambda_0(p) - Y_i\, \mathbf{1}(\hat F(Y_i) \le p) \right\} \rightsquigarrow S$$

for some Gaussian process $S$, then the result would follow.

Finally, a fourth method would be to establish the Hadamard differentiability of the map from the c.d.f.'s to the Lorenz shares directly, without trying to prove that the map from the c.d.f. to the quantiles is Hadamard differentiable as maps into $\ell^\infty[0,1]$. The idea would be to use the mean value theorem for integrals to express

$$\mu\, \lambda(p) = \int_0^p F^{-1}(s)\, ds = p\, F^{-1}(\tilde p_F(p))$$

for some $\tilde p \in (0, p)$, where

$$\tilde p_F(p) = F\left( \frac{1}{p} \int_0^p F^{-1}(s)\, ds \right).$$

Since $\{\tilde p(p) : p \in [0,1]\} \subset (0,1)$, one can try to prove Hadamard differentiability of the map $F \mapsto F^{-1}(\tilde p_F(\cdot))$ as maps into $\ell^\infty(0,1)$ and then use the functional central limit theorem.
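For concreteness, the order-statistic form of the Lorenz share used in the second approach, and the trimmed Gini $\tilde\gamma$ of Proposition 9 (3b), can be computed for a simple random sample as in the sketch below. The function names, the midpoint-rule integration, and the small floating-point tolerance in the integer part are assumptions of this example, not part of the paper's derivation.

```python
import numpy as np

def lorenz_shares(y, p):
    """Sum of the [np] smallest observations divided by the sample total."""
    y_sorted = np.sort(np.asarray(y, dtype=float))
    n = y_sorted.size
    csum = np.concatenate([[0.0], np.cumsum(y_sorted)])
    # [np] is the largest integer <= n * p; the tolerance guards against
    # products such as 100 * 0.29 evaluating to 28.999999999999996.
    k = np.floor(n * np.asarray(p, dtype=float) + 1e-9).astype(int)
    return csum[k] / csum[-1]

def trimmed_gini(y, p0, p1, m=2000):
    """1 - 2 * integral of the Lorenz share over [p0, p1], by the midpoint rule."""
    grid = p0 + (p1 - p0) * (np.arange(m) + 0.5) / m
    return 1.0 - 2.0 * (p1 - p0) * np.mean(lorenz_shares(y, grid))
```

Because the estimated share is a step function in $p$, it is cadlag rather than continuous, which is the reason the theory above works in $D[0,1]$.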


6

Empirical Applications

We now turn attention to the application of the methods developed above to a specific real-world problem. The problem is the estimation of Lorenz shares and the Gini coefficient for per capita monthly expenditure in India. These measurements address a major question facing policy-makers in India in the context of the ongoing political debate concerning large-scale privatization of the Indian economy (see Ahluwalia, 2002, for some major issues in this debate). In what follows, we compute the Gini coefficient for per capita monthly expenditure in India and the four major states in the four regions of the country. We compare the measures obtained from 1987-88 with those from 1993-94, which correspond to the pre-reform and post-reform phases of the Indian economy, respectively. The data come from the complexly designed Indian National Sample Surveys (NSS) and therefore both the estimates of the Ginis and the estimates of their standard errors warrant correction for sample design.10 In the cases where we observe dominance according to the Gini criterion, we compute the test statistics for Lorenz dominance to see if the observed changes in inequality are caused by overall dominance of the Lorenz curves.

6.1

Empirical Results for Gini

This subsection discusses the empirical results obtained by comparing the Gini coefficients for the 43rd and the 50th rounds of the NSS. The 43rd round was conducted in 1987-8 and the 50th round in 1993-4; together they provide the most recent reliable and comparable comprehensive data on household consumption in India. We first document our findings for inequality and then discuss the effects of correcting standard errors on our inference. In addition to reporting numbers for all India, we report the results for the four largest states in the four main regions of India. These four states represent, respectively, the highly industrialized group of states (Maharashtra), the predominantly agricultural and low Human Development Index group of states (Uttar Pradesh), the poor but economically progressive states (Andhra Pradesh) and finally the so-called 'industrially stagnant' states (West Bengal). These four states house more than one-third of India's population.

10 The Indian NSS employs a second level of stratification inside the clusters, based on correlates of income like landholding, in order to guard against the possibility of systematically missing the relatively wealthy households. This makes the design more complicated than the one we have outlined in section 2; section 6.3 in this paper describes how our methods can be applied directly to more complicated designs by redefining the strata and clusters.

In Table 1, we report the estimated Gini coefficient for the 50th round of the NSS, corresponding to 1993-4. Correct estimates as well as naive (unweighted) estimates that do not take into account the survey design are reported. In Table 2, we report the estimates of standard errors for the weighted Gini coefficients. We report both the correct standard errors that take the survey design into account and the naive ones that do not. We also show the contributions of the stratum and cluster effects separately to the overall standard error. Finally, in Table 3, we report the difference in the Gini coefficients and the associated t-statistics (together with the p-values) for testing increasing inequality between 1987-8 and 1993-4. Both the correct t-statistics and the naive ones are reported.

From Table 1, note that the unweighted Gini is always larger than the weighted one. This is because the Indian NSS oversamples rich households in every cluster. Unweighted estimates therefore load the results disproportionately (relative to their population frequency) in favor of richer households, producing an overestimate of the true Gini coefficient.

Having obtained the consistent estimates, we next turn to computing their standard errors in Table 2. We compare two different estimates of the standard error, one taking the design into account and the other not, for the same estimate of the parameter, viz. the weighted consistent estimates of the Gini coefficient. Since the results differ in interesting ways between rural and urban sectors, we report the two sectors separately.
As explained in the footnote to the table, columns 2 and 3 report the standard errors that, respectively, do and do not take the survey design into account, and column 6 reports the % increase in standard errors due to overall design effects as a percentage of the naive standard error estimates. Column 4 shows the % decrease in standard errors as a result of taking only stratification (and not clustering) into account (for instance, 23.51 in row 2, column 4 means that by taking stratification into account our estimate of the standard errors has fallen by 23.51% of the naive standard error for a consistent estimate of the Gini for urban India in 1993-4). In terms of the expression in (5), this corresponds to the standard error one would get if one ignored the second term but included the third. The idea is to look at the separate contributions of the three terms in (5) to the overall standard error. Similarly, column 5 shows the increase in standard errors as a result of taking only clustering (and not stratification) into account.

It should be immediately obvious from Table 2 that in general, cluster effects are larger than stratum effects. They are also much larger in urban areas relative to rural ones. The most likely explanation for this is that due to higher mobility in urban areas (better property markets and no strong attachment to land, unlike the agricultural rural population), the urban population sorts itself more efficiently by income, making urban clusters more homogeneous. In other words, there are poor neighborhoods and rich neighborhoods in cities to a larger extent than there are rich villages and poor villages. These results suggest that for countries with greater degrees of segregation, survey design will have stronger effects on standard errors through larger cluster effects. Strata, being larger in size, are likely to be less homogeneous and therefore will produce relatively smaller stratum effects on estimates of standard errors.

Finally, in Table 3, we report the Gini coefficients corresponding to two successive rounds of the NSS survey, 1987-88 and 1993-94. We report the Ginis, the observed increases in Ginis in 1993-4 relative to 1987-8 and finally, in the last two columns, the naive and the design-corrected t-statistics for testing hypotheses regarding the change in Ginis.11 The purpose of this table is to demonstrate that the relative magnitude of the standard errors becomes critical when testing changes in inequality. It is often the case that sample Ginis actually move very little over long periods of time (e.g. for rural India, an increase of the Gini by merely 1.35 percentage points is statistically significant).
Without knowledge of standard errors, it is very likely for analysts to conclude wrongly that the population shares have not moved at all, when in reality they actually have. At the same time, it is critical to obtain the correct standard errors in order not to overstate (or understate) the statistical significance of the observed changes. Note from Table 3 that testing hypotheses at conventional levels of significance (5% and 1%) would lead us to reverse the direction of inference in some (marked with an asterisk) but not all of the cases. However, whether the direction of inference changes depends on both the actual movement in the sample Gini coefficients and the levels of significance at which we are testing the hypotheses, so that few reversals are not a justification for not correcting the standard errors.

11 We have assumed here that the samples for the two different years are independent, so that the variance of the differences in Lorenz shares is the sum of the variances of the shares for each year. Since clusters are sampled independently in the two years, this assumption is plausible.

Finally, from Table 3, a few interesting trends become apparent. Firstly, rural inequality has declined at the all-India level as well as in Uttar Pradesh and Andhra Pradesh. Secondly, in the industrially developed Maharashtra, neither rural nor urban inequality has changed but overall inequality has increased significantly. This suggests the rural-urban gap has gone up. Finally, in the socially as well as economically progressive Andhra Pradesh, inequality has gone down in all sectors, whereas in West Bengal inequality does not seem to have changed in any respect in any sector over this period.
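The weighted and unweighted point estimates compared in Table 1, and the t-statistic for a change in the Gini under the independence assumption of footnote 11, can be illustrated with the sketch below. It is a hedged example rather than the code behind the tables: the function names are hypothetical, the Gini is computed from the piecewise-linear Lorenz polygon, and the standard errors passed to `gini_change_test` are assumed to come from a separate design-corrected calculation that is not shown.

```python
import numpy as np
from math import erf, sqrt

def weighted_gini(y, w=None):
    """Gini as 1 - 2 * (area under the weighted Lorenz polygon)."""
    y = np.asarray(y, dtype=float)
    w = np.ones_like(y) if w is None else np.asarray(w, dtype=float)
    order = np.argsort(y)
    y, w = y[order], w[order]
    pop = np.concatenate([[0.0], np.cumsum(w)]) / w.sum()
    inc = np.concatenate([[0.0], np.cumsum(w * y)]) / np.sum(w * y)
    # Trapezoid rule over the polygon vertices.
    area = np.sum((pop[1:] - pop[:-1]) * (inc[1:] + inc[:-1]) / 2.0)
    return 1.0 - 2.0 * area

def gini_change_test(g_old, se_old, g_new, se_new):
    """Difference, t-statistic and one-sided p-value for an increase,
    assuming the two samples are independent so that variances add."""
    diff = g_new - g_old
    t = diff / sqrt(se_old ** 2 + se_new ** 2)
    p_one_sided = 0.5 * (1.0 - erf(t / sqrt(2.0)))  # P(Z > t) for standard normal Z
    return diff, t, p_one_sided
```

Calling `weighted_gini(y)` without weights gives the naive estimate, and `weighted_gini(y, w)` gives the weighted one. Which of the two is larger depends on how the sampling weights vary with expenditure.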

6.2

Empirical results for Lorenz dominance

Tables 4 and 5 summarize the results for the Lorenz dominance test. We report the tests for the cases where we have concluded dominance on the basis of the Gini, viz. rural India, the state of Maharashtra, rural Uttar Pradesh and rural, urban and the entire state of Andhra Pradesh. For quick reference, we also report the correct t-statistics for the corresponding Gini-based test of dominance. In Table 4, column 1, we report the t-statistics for the Gini coefficient (which are reproduced from Table 3). In column 2, we report the p-value for the first test and, if this p-value is greater than 2.5%, we report the p-value of the second test in column 3. The first p-value greater than 2.5% and the second less than 2.5% implies acceptance of Lorenz dominance at level 5%; both greater than 2.5% imply acceptance of equality of the two curves; and finally the first less than 2.5% implies rejection of dominance (this corresponds to accepting the hypothesis that the population Lorenz curves cross). In column 4, we report the overall conclusion. All of these computations take into account the sample design.

In Table 5, we report the p-values obtained via the naive bootstrap, which ignores the survey design. These numbers are reported in columns 1A-3B of Table 5. Columns 1A and 1B correspond to ignoring clustering and drawing bootstrap samples from within every stratum (the columns with label A denote the p-value for the first test of dominance; the columns marked B report the p-value of the second test if we cannot reject dominance with the first test); columns 2A and 2B correspond to ignoring stratification and drawing bootstrap samples of clusters; and finally, columns 3A and 3B report the p-values corresponding to ignoring both stratification and clustering by drawing bootstrap samples from the entire population. For comparison, we report the corrected p-values (reproduced from Table 4) in columns 4A and 4B.

The tables suggest that in all cases where we had concluded dominance according to the Gini criterion, except for rural India, we cannot reject equality of the two Lorenz curves. The hypothesis of dominance against no dominance is always accepted, and the hypothesis of equality versus dominance is also always accepted except for rural India. For rural India, we accept the hypothesis of dominance at the first stage and then reject equality at the second stage, implying that the distribution in the 50th round Lorenz dominates that in the 43rd. In Table 5, we also observe the expected effects of ignoring clustering and stratification while performing the bootstrap analysis. Including strata decreases the p-value and including the clusters increases it. Note that in urban Andhra Pradesh, ignoring the survey design would lead us to conclude (at 5%) that the 50th round distribution Lorenz dominates the 43rd round one, while taking the design into account would lead us to conclude equality.
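The decision rule used to read Tables 4 and 5 can be written out as a small function. The sketch below only restates that rule; the function name and the returned labels are hypothetical, and a p-value exactly equal to the threshold is treated as a non-rejection, which is an assumption of the example.

```python
def sequential_lorenz_test(p_first, p_second=None, alpha=0.05):
    """Apply the two-stage rule, with each stage tested at level alpha / 2."""
    half = alpha / 2.0
    # Stage 1: a small p-value rejects dominance (the curves are taken to cross).
    if p_first < half:
        return "reject dominance"
    if p_second is None:
        raise ValueError("second-stage p-value needed when stage 1 does not reject")
    # Stage 2: a small p-value rejects equality in favor of dominance.
    if p_second < half:
        return "dominance"
    return "equality"
```

For example, with `alpha=0.05`, first and second p-values of 0.40 and 0.01 lead to "dominance", while 0.40 and 0.30 lead to "equality".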

6.3

Implementation for other surveys

Data agencies differ in terms of availability of stratum and cluster information in the public-use data files. For almost all developing countries, including the LSMS surveys (which currently cover more than thirty developing countries for multiple years), the stratum and cluster identifiers are available in the public-use micro-data. In some cases, these occur as variables in the data set and are usually termed 'stratum' and 'psu', respectively.12 In other cases, the stratum and cluster identities are contained in the unique household identifier variable, which is constructed by concatenating stratum and cluster identity numbers.13 Ideally, one should consult the sample design document to see what variables the stratification and clustering are based on and identify them in the micro-data before applying our methods. For several US surveys like the PSID and the Health and Retirement Study, the stratum and cluster information are available upon signing a sensitive data agreement for protection of respondents' privacy.

12 The terminologies vary between surveys: e.g. the LSMS survey for Azerbaijan lists strata as raions and clusters by the variable PPID, those for Pakistan are stratum and psu, for Peru it is regtype and cluster, etc.

13 e.g. in the Albanian survey of the LSMS, the first two digits of the hhd id represent the bashki (stratum) and the next two represent the village (cluster).

Different real-life surveys employ different numbers of levels of stratification and clustering. With multiple layers of stratification, only the final, i.e. the finest, level of stratification matters and that is what should be used as the stratifying variable. For example, if the stratification is first by state and then by districts within every state, then each state-district cell constitutes one stratum. For multiple layers of clustering (as in the PSID, the LSMS survey for Peru etc.), taking into account correlations between observations from the primary clusters suffices since this also takes into account correlations between units residing in secondary clusters.14 For instance, if the first stage of clustering is an urban block and the second stage is a household within the selected block (with individuals being the ultimate sampling unit), then taking into account correlations between residents of the same block 'includes' the correlation between individuals in the same household. Thus no changes are warranted in our formulae when there are multiple levels of stratification and clustering; it is enough to set the stratum variable to the 'ultimate' stratifying variable and the cluster variable to the 'primary' level of clustering and then apply our formulae developed above for one stage each of stratification and clustering.

Sampling weights are included in all micro-data files. One needs to use the right weights depending on whether one is performing the analysis at a district level, household level, individual level etc. For instance, for the individual level analysis, household weights should be multiplied by household size.
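The bookkeeping described in this subsection might look as follows in practice. This is only a sketch: the field names (`state`, `district`, `block`, `hh_weight`, `hh_size`) and both function names are hypothetical, and the identifier layout in `split_household_id` follows the Albanian example of footnote 13 and will differ for other surveys.

```python
def build_design_variables(households):
    """Add stratum, cluster and person-level weight to each household record."""
    out = []
    for h in households:
        record = dict(h)
        # Finest level of stratification: one stratum per state-district cell.
        record["stratum"] = (h["state"], h["district"])
        # Primary level of clustering only; secondary clusters are nested inside it.
        record["cluster"] = h["block"]
        # Individual-level analysis: household weight times household size.
        record["person_weight"] = h["hh_weight"] * h["hh_size"]
        out.append(record)
    return out

def split_household_id(hhid):
    """Split an identifier whose first two digits are the stratum
    and whose next two digits are the cluster."""
    s = str(hhid)
    return s[:2], s[2:4]
```

The resulting `stratum` and `cluster` fields are the only design variables the formulae above require, however many sampling stages the survey has.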

7

Conclusion

In this paper, we have adapted the framework of asymptotic GMM-based inference to stratified multi-stage samples and have used this framework to analyze asymptotic distributions for commonly used inequality indices. In particular, we have adapted results from the empirical process literature to complex survey design to characterize the large-sample behavior of Lorenz processes and obtained an asymptotic distribution theory for the Gini coefficient as a by-product. We have then applied our results to test for changes in inequality in four large states of India, based on monthly per capita household expenditure, between 1987-8 and 1993-94. We have employed tests based on the Gini coefficient for which we have obtained a standard distribution theory. When we have concluded rise or fall in inequality, based on the Gini, we have applied the more robust test of Lorenz dominance. The large sample distribution of the latter test is non-standard and depends on nuisance parameters that are not directly estimable. We have therefore used (and justified) a bootstrap-based test of dominance, where the bootstrap has been adapted to the survey design under study. Correction of estimates and standard errors for survey design is seen to have substantial impact in the urban sectors due to better residential sorting of the urban population into richer and poorer neighborhoods.

14 It is almost always the case that the number of primary clusters sampled per stratum is much larger than the number of secondary clusters sampled per primary cluster. Hence our asymptotics with a large number of primary clusters is appropriate.

Our conclusions suggest that rural inequality has mostly declined over this period or stayed unchanged. Urban and inter-sector inequality have changed differently in different states. While it is true that the liberalization reforms had started in India in the early nineties, it would be premature to attribute our observed trends in inequality to them since the latest period of reliable data is 1994. Notwithstanding that, our findings here are consistent with the story that the higher growth rates of the Indian economy in the late eighties have affected only the urban sector of the more industrialized states of India (Maharashtra being one of them and West Bengal and Uttar Pradesh not). They are also consistent with a migration story where the poorest and/or the richest villagers migrated to the cities, leaving village income distribution more equal. The determination of which of these stories (or none of these) is the truth is left to future research.

References

[1] Ahluwalia, M.S. (2002): Economic Reforms in India Since 1991: Has Gradualism Worked?, Journal of Economic Perspectives, Vol. 16, No. 3, pages 67-88.
[2] Anderson, G. (1996): Nonparametric tests of stochastic dominance in income distributions, Econometrica, 64(5), pages 1183-93.
[3] Andrews, D. (1994): Empirical methods in econometrics, in Handbook of Econometrics, vol. 4, (ed.) Engle, R. and McFadden, D. (North-Holland), pages 2248-2294.


[4] Andrews, D. & Buchinsky, M. (2000): A Three-Step Method for Choosing the Number of Bootstrap Repetitions, Econometrica, v. 68, iss. 1, pp. 23-51.
[5] Barrett, G. & Donald, S. (1999): Consistent nonparametric tests for stochastic dominance: a comparison of inference methods, working paper.
[6] Beach, C.M. & Davidson, R. (1983): Distribution-free statistical inference with Lorenz curves and income shares, Review of Economic Studies, 50, 723-34.
[7] Bhattacharya, D. (2003): Asymptotic Inference from Multi-stage Surveys, mimeo, downloadable from www.princeton.edu/~debopamb
[8] Bishop, J., Formby, J. & Smith, W.J. (1991): Lorenz dominance and welfare: Changes in the U.S. distribution of income, 1967-86, Review of Economics and Statistics, 73, 134-39.
[9] Bishop, J.A., Formby, J.P. & Zheng, B. (1997): Statistical Inference and the Sen Index of Poverty, International Economic Review, v. 38, iss. 2, pp. 381-87.
[10] Butler, J.S. (1999): Efficiency Results of MLE and GMM Estimation using Sampling Weights, Journal of Econometrics, 96, Issue 1, pages 25-37.
[11] Cochran, W. (1977): Sampling Techniques, New York, Wiley.
[12] Conley, T.G. (1999): GMM Estimation with Cross Sectional Dependence, Journal of Econometrics, September 1999, 92(1), 1-45.
[13] Csorgo, M. (1983): Quantile processes with statistical applications, SIAM.
[14] Datt, G. and Ravallion, M. (2002): Is India's Economic Growth Leaving the Poor Behind?, Journal of Economic Perspectives, vol. 16, 3, pages 89-108.
[15] Davidson, R. and Duclos, J. (2000): Statistical inference for stochastic dominance and for measurement of poverty and inequality, Econometrica, 68, 1435-64.
[16] Deaton, A. (1997): Analysis of household surveys: a microeconometric approach to development policy (Johns Hopkins Press).


[17] DuMouchel, W.H. and Duncan, G.J. (1983): Using sample survey weights in multiple regression analysis of stratified samples, Journal of the American Statistical Association, 78, 535-43.
[18] Francisco, C. & Fuller, W. (1991): Quantile estimation with a complex survey design, Annals of Statistics, 19, 454-69.
[19] Gastwirth, J. (1972): The Estimation of the Lorenz Curve and Gini Index, Review of Economics and Statistics, 54(3), 306-16.
[20] Hall, P. & Horowitz, J. (1996): Bootstrap Critical Values for Tests Based on Generalized-Method-of-Moments Estimators, Econometrica, July 1996, v. 64, pp. 891-916.
[21] Howes, S. and Lanjouw, J.O. (1998): Making poverty comparisons taking into account survey design, Review of Income and Wealth, March 1998, 99-110.
[22] Kish, L. (1965): Survey Sampling, New York, John Wiley & Sons.
[23] Kloek, T. (1981): OLS estimation in a model where a microvariable is explained by aggregates and contemporaneous disturbances are equicorrelated, Econometrica, 49(1), pages 205-07.
[24] McFadden, D. (1989): Testing for stochastic dominance, in Fomby, Thomas B. and Seo, Tae-Kun (eds.), Studies in the Economics of Uncertainty: In Honor of Josef Hadar, Springer, pages 113-34.
[25] Moulton, B. (1986): Random group effects and the precision of regression estimates, Journal of Econometrics, 32, 385-97.
[26] Murthy, M. (1977): Sampling Theory and Methods, Calcutta, Statistical Publishing Company.
[27] Newey, W. and McFadden, D. (1994): Large sample estimation and hypothesis testing, in Handbook of Econometrics, vol. 4, (ed.) Engle, R. and McFadden, D., pages 2111-2241.
[28] Pakes, A. & Pollard, D. (1989): Simulation and asymptotics of optimization estimators, Econometrica, Vol. 57, pages 1027-1057.

[29] Pepper, J.V. (2002): Robust inferences from random clustered samples: an application using data from the Panel Study of Income Dynamics, Economics Letters, 75, Issue 3, pages 341-345.
[30] Pfeffermann, D. and Nathan, G. (1981): Regression analysis of data from a cluster sample, Journal of the American Statistical Association, vol. 76, no. 375, pages 681-689.
[31] Sakata, S.: Quasi-Maximum Likelihood Estimation with Complex Survey Data (work in progress), University of Michigan.
[32] Vaart, A.W. van der (1998): Asymptotic Statistics, Cambridge University Press.
[33] Vaart, A.W. van der & Wellner, J.A. (1996): Weak Convergence and Empirical Processes: With Applications to Statistics, Springer Verlag.
[34] White, H. (1980): A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity, Econometrica, 48, 817-838.
[35] Wooldridge, J. (1999): Asymptotic properties of weighted M-estimators for variable probability samples, Econometrica, Vol. 67, no. 6, pages 1385-1406.
[36] Wooldridge, J. (2001): Asymptotic properties of weighted M-estimators for standard stratified samples, Econometric Theory, 17, 451-470.
[37] Zheng, B. (2001): Statistical inference for poverty measures with relative poverty lines, Journal of Econometrics, 101(2), pages 337-56.
[38] Zheng, B. (2002): Testing Lorenz curves with non-simple random samples, Econometrica, vol. 70, 3.

8 Appendix

Proof. First observe that $F(\cdot)$ is a proper distribution function with compact support and admits a continuous and positive density over its entire support, given our assumptions about $F(\cdot \mid s)$ for each $s$. Now,

$$\sqrt{n}\left(\hat F(x) - F(x)\right) = \frac{\dfrac{1}{\sqrt n}\sum_{i=1}^{n} m_i(x)}{\sum_{s=1}^{S}\dfrac{H_s}{n_s}\sum_{i=1}^{n_s}\dfrac{M(s_i,i)}{k}\sum_{h=1}^{k}\nu_{s_i i h}}, \tag{11}$$

where

$$m_i(x) = \sum_{s=1}^{S} H_s\,1(s_i = s)\,\frac{M(s_i,i)}{k}\sum_{h=1}^{k}\nu_{s_i i h}\left(1(y_{s_i i h}\le x) - F(x\mid s)\right).$$

The denominator goes in probability to

$$\sum_{s=1}^{S} H_s\,E\!\left(\frac{M_{si}}{k}\sum_{h=1}^{k}\nu_{sih}\,\Big|\, s\right)$$

under a weak law of large numbers. Under finite second moment assumptions on $M_{si}$, $\nu_{sih}$, $H_s$ for each $s, i, j$, and given the piecewise linear nature of the functions $1(y_{sih}\le x)$, it follows from Pakes and Pollard (1989) (see Example 2.11 and Lemmas 2.3 and 2.17) that

$$\sup_x \left|\hat F(x) - F(x)\right| \xrightarrow{P} 0$$

and for every $\varepsilon > 0$, $\eta > 0$, there exists $\delta > 0$ such that

$$\limsup_{n\to\infty}\Pr\left[\sup_{|x-y|<\delta}\left|\frac{1}{\sqrt n}\sum_{i=1}^{n} m_i(x) - \frac{1}{\sqrt n}\sum_{i=1}^{n} m_i(y)\right| > \eta\right] < \varepsilon. \tag{12}$$

Given (11) and (12), it follows that

$$\sqrt n\left(\hat F - F\right) \Rightarrow H, \tag{13}$$

where $H$ is a stochastic process with uniformly continuous (with respect to a pseudo-metric defined on the support of $Y$) sample paths and '$\Rightarrow$' denotes weak convergence (see, for instance, Andrews (1994), page 2251). Define the $p$th population and sample quantiles of $Y$, $Q(p)$ and $\hat Q(p)$, as

$$Q(p) = \inf\{x : F(x) \ge p\}, \qquad \hat Q(p) = \inf_{s,\,c_s,\,h}\left\{y_{s c_s h} : \hat F(y_{s c_s h}) \ge p\right\}.$$
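The weighted empirical distribution function and the sample quantile defined above can be illustrated with a short Python sketch. It is only a minimal illustration under stated assumptions: each observation is assumed to come with a single combined sampling weight `w` (how that weight is built from the design is not modelled here), and the function names `weighted_cdf` and `weighted_quantile` are hypothetical.

```python
import numpy as np

def weighted_cdf(y, w):
    """Return the sorted values and the weighted empirical CDF at those values.

    y: observed outcomes; w: non-negative sampling weights (assumed given).
    """
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    order = np.argsort(y, kind="stable")
    y_sorted, w_sorted = y[order], w[order]
    cdf = np.cumsum(w_sorted) / w_sorted.sum()
    return y_sorted, cdf

def weighted_quantile(y, w, p):
    """Smallest observed value whose weighted empirical CDF is at least p."""
    y_sorted, cdf = weighted_cdf(y, w)
    idx = np.searchsorted(cdf, p, side="left")
    # Guard against p = 1 combined with floating-point rounding of the last CDF value.
    return y_sorted[min(idx, len(y_sorted) - 1)]
```

With equal weights this reduces to the usual sample quantile; unequal weights shift the quantile towards observations that represent more of the population.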

Under the assumption that $F(\cdot)$ has compact support (equal to the union of the supports for each $s$, each of which is assumed to be compact) and is continuously differentiable with a strictly positive derivative $f$, Lemma 21.4 in van der Vaart (1998) shows that the inverse map $Q(\cdot) \equiv F^{-1}$, mapping $F(\cdot)$ to the quantile, as a map from the space of bounded cadlag functions on the support of $Y$ to the space of bounded functions on $(0,1)$, is Hadamard differentiable at $F(\cdot)$ with Hadamard derivative $Q'$ given by

$$Q'(h) = -\frac{h}{f}\circ F^{-1}.$$

Using the functional delta method, therefore (see, for instance, van der Vaart (1998), Theorem 20.8),

$$\sqrt n\left(\hat Q_n - Q\right) = -\frac{\sqrt n\left(\hat F_n(Q) - F(Q)\right)}{f(Q)} + o_p(1).$$

Given that $f(\cdot)$ is strictly positive, it follows that

$$\sup_{p\in[0,1]}\left|\hat Q_n(p) - Q(p)\right| \xrightarrow{P} 0, \qquad \sqrt n\left(\hat Q_n - Q\right) \Rightarrow -\frac{H(Q)}{f(Q)}, \tag{14}$$

where $H(\cdot)$ is defined in (13). Now the generalized Lorenz shares are given by

$$\lambda(p) = \int_0^{Q(p)} y f(y)\,dy = \int_0^{p} Q(z)\,dz,$$

so that

$$\sup_{p\in[0,1]}\left|\hat\lambda(p) - \lambda(p)\right| = \sup_{p\in[0,1]}\left|\int_0^p \hat Q(z)\,dz - \int_0^p Q(z)\,dz\right| \le \sup_{p\in[0,1]}\int_0^p \left|\hat Q(z) - Q(z)\right| dz \le \sup_{p\in[0,1]}\int_0^p \sup_t\left|\hat Q(t) - Q(t)\right| dz = \sup_z\left|\hat Q(z) - Q(z)\right| \xrightarrow{P} 0. \tag{15}$$

Secondly, the map $\psi : \ell^\infty(0,1) \to C[0,1]$ defined as

$$\psi(Q)(p) = \int_0^p Q(t)\,dt$$

is linear and therefore Hadamard differentiable at $Q$ with Hadamard derivative given by

$$\psi'(h)(p) = \int_0^p h(u)\,du.$$
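The representation of the generalized Lorenz share as the integral of the quantile function up to p suggests a direct sample analogue: cumulate weighted outcomes in increasing order. The sketch below is a minimal illustration in Python, assuming a single combined sampling weight per observation; the function name `generalized_lorenz` is hypothetical and not taken from the paper.

```python
import numpy as np

def generalized_lorenz(y, w):
    """Sample generalized Lorenz ordinates at the weighted CDF points.

    Returns (p, gl) where p[j] is the cumulative population share and
    gl[j] is the integral of the step quantile function from 0 to p[j].
    """
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    order = np.argsort(y, kind="stable")
    y, w = y[order], w[order]
    total = w.sum()
    p = np.concatenate(([0.0], np.cumsum(w) / total))
    gl = np.concatenate(([0.0], np.cumsum(w * y) / total))
    return p, gl
```

The last ordinate, at p = 1, equals the weighted mean, which is the quantity that normalizes generalized Lorenz shares into ordinary Lorenz shares.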

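Using the functional delta method, it follows that

$$\sqrt n\left(\hat\lambda - \lambda\right) \Rightarrow G, \qquad \text{where } G(p) = -\int_0^p \frac{H(Q(t))}{f(Q(t))}\,dt.$$

Finally, the estimation error of the Gini is given by

$$\sqrt n\left(\hat\gamma - \gamma\right) = -2\sqrt n\int_0^1\left[\frac{\hat\lambda(p)}{\hat\mu} - \frac{\lambda(p)}{\mu}\right]dp = -2\int_0^1\left[\frac{\sqrt n\left(\hat\lambda(p) - \lambda(p)\right)}{\tilde\mu} - \tilde\lambda(p)\,\frac{\sqrt n\left(\hat\mu - \mu\right)}{\tilde\mu^2}\right]dp, \tag{16}$$

where the '~' denotes intermediate values. Using (15) and that $\hat\mu \xrightarrow{P} \mu$, it follows that

$$\sqrt n\left(\hat\gamma - \gamma\right) = -2\left[\frac{\sqrt n\int_0^1\left(\hat\lambda(p) - \lambda(p)\right)dp}{\mu} - \frac{\int_0^1\lambda(p)\,dp}{\mu^2}\,\sqrt n\left(\hat\mu - \mu\right)\right] + o_p(1).$$

It is trivial that the map

$$\lambda \mapsto \int_0^1 \lambda(p)\,dp,$$

as a map from $\ell^\infty[0,1]$ to $\mathbb{R}$, is continuous, so that by the continuous mapping theorem,

$$\sqrt n\int_0^1\left(\hat\lambda(p) - \lambda(p)\right)dp \Rightarrow \int_0^1 G(p)\,dp.$$

The asymptotic normality follows from the observation that

$$\int_0^1 G(p)\,dp = -\int_0^1\int_0^p \frac{H(Q(t))}{f(Q(t))}\,dt\,dp = -\int_0^1\int_0^{Q(p)} \frac{H(z)}{f(z)}\,f(z)\,dz\,dp \quad \text{(by a change of variables)} \quad = -\int_0^1\left(\int_0^{Q(p)} H(z)\,dz\right)dp$$

is essentially a 'sum' of multivariate normals, since $H$ is a Gaussian process.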
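To make the object of the proof concrete, the following Python sketch computes a weighted Gini coefficient as one minus twice the area under the sample Lorenz curve. This relies on the standard textbook definition of the Gini, which is assumed here rather than quoted from the paper, and on a single combined sampling weight per observation; the function name `weighted_gini` is hypothetical.

```python
import numpy as np

def weighted_gini(y, w):
    """Gini coefficient as 1 - 2 * (area under the weighted Lorenz curve).

    The Lorenz curve is treated as piecewise linear between the weighted
    CDF points and integrated with the trapezoid rule.
    """
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    order = np.argsort(y, kind="stable")
    y, w = y[order], w[order]
    p = np.concatenate(([0.0], np.cumsum(w) / w.sum()))
    # Lorenz ordinates: cumulative outcome share of the poorest fraction p.
    lorenz = np.concatenate(([0.0], np.cumsum(w * y) / np.sum(w * y)))
    area = np.sum(np.diff(p) * (lorenz[1:] + lorenz[:-1]) / 2.0)
    return 1.0 - 2.0 * area
```

Calling the function once with unit weights and once with survey weights gives the kind of unweighted versus weighted comparison reported in Table 1, although the standard errors discussed in the paper require the full design information and are not produced by this sketch.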

8.1 Hadamard differentiability

A map $\phi : D_\phi \mapsto E$, where $D_\phi \subset D$ and $D$, $E$ are normed spaces, is said to be Hadamard differentiable (ref. van der Vaart, 1998, page 296) at $\theta \in D_\phi$ if there exists a continuous linear map $\phi'_\theta : D \mapsto E$ (called the Hadamard derivative) such that

$$\left\|\frac{\phi(\theta + t h_t) - \phi(\theta)}{t} - \phi'_\theta(h)\right\|_E \to 0$$

as $t \to 0$, for every $h_t \to h$. When the map $\phi'_\theta$ exists only on a subset $D_0 \subset D$, $\phi(\cdot)$ is said to be Hadamard differentiable at $\theta$ tangentially to $D_0$.

Hadamard differentiability of the quantile function $Q(\cdot)$ as a map from the space of distribution functions (cadlag functions bounded between 0 and 1) to the space of bounded functions (that contains the quantile functions, in particular) on $(0,1)$ is proved in van der Vaart (1998), Lemma 21.4. The derivative exists tangentially to the space of continuous functions on the support of the c.d.f. $F$ whose quantiles we are interested in. The Hadamard derivative is given by

$$\phi'_F(h)(p) = -\frac{h}{f}\circ F^{-1}(p)$$

with $f$ denoting the density function corresponding to $F$. To show the Hadamard differentiability of the generalized Lorenz shares, consider the map $\psi : \ell^\infty(0,1) \to C[0,1]$ defined as

$$\psi(Q)(p) = \int_0^p Q(t)\,dt.$$

Take a sequence of functions $h_t \to h$ uniformly over its domain. Then

$$\left\|\frac{\psi(Q + t h_t) - \psi(Q)}{t} - \int_0^{\cdot} h(u)\,du\right\|_\infty = \sup_{p\in[0,1]}\left|\frac{\psi(Q + t h_t)(p) - \psi(Q)(p)}{t} - \int_0^p h(u)\,du\right|$$

$$= \sup_{p\in[0,1]}\left|\frac{\int_0^p\{Q(u) + t h_t(u)\}\,du - \int_0^p Q(u)\,du}{t} - \int_0^p h(u)\,du\right| = \sup_{p\in[0,1]}\left|\int_0^p h_t(u)\,du - \int_0^p h(u)\,du\right| \le \sup_{p\in[0,1]}\int_0^p \left|h_t(u) - h(u)\right| du \to 0,$$

so that $\psi$ is Hadamard differentiable tangentially to $C[0,1]$ with Hadamard derivative given by

$$\psi'(h)(p) = \int_0^p h(u)\,du.$$
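The argument above can be checked numerically. The Python sketch below is an illustration only: the quantile function `Q`, the direction `h` and the perturbation `g` are hypothetical choices, the integral is approximated with the trapezoid rule on a grid, and `lorenz_map` is a hypothetical name for the integration map. Because the map is linear, the difference quotient equals the integral of the perturbed direction, so the gap to the derivative shrinks in proportion to t.

```python
import numpy as np

def lorenz_map(q, grid):
    """Cumulative trapezoid integral of q from 0 to each grid point."""
    steps = np.diff(grid) * (q[1:] + q[:-1]) / 2.0
    return np.concatenate(([0.0], np.cumsum(steps)))

grid = np.linspace(0.0, 1.0, 1001)
Q = grid ** 2          # hypothetical quantile function on [0, 1]
h = np.sin(3 * grid)   # hypothetical direction of differentiation
g = np.cos(5 * grid)   # perturbation: h_t = h + t * g converges uniformly to h

def derivative_gap(t):
    """Sup-norm distance between the difference quotient and the derivative."""
    h_t = h + t * g
    quotient = (lorenz_map(Q + t * h_t, grid) - lorenz_map(Q, grid)) / t
    return np.max(np.abs(quotient - lorenz_map(h, grid)))

gaps = [derivative_gap(t) for t in (1e-1, 1e-2, 1e-3)]
```

Each entry of `gaps` is bounded by t times the sup norm of `g`, mirroring the final inequality in the derivation.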

Table 1: Unweighted and weighted estimates of Gini coefficient for monthly per capita household expenditure

| (0) Region | (1) Gini: naive | (2) Gini: wtd |
|---|---|---|
| All India | 0.3856 | 0.3250 |
| Rural | 0.3152 | 0.2856 |
| Urban | 0.3961 | 0.3430 |
| Maharashtra | 0.4255 | 0.3770 |
| Rural | 0.3377 | 0.3070 |
| Urban | 0.3980 | 0.3578 |
| Uttar Pradesh | 0.3507 | 0.3020 |
| Rural | 0.3037 | 0.2807 |
| Urban | 0.3796 | 0.3268 |
| West Bengal | 0.3610 | 0.3080 |
| Rural | 0.2817 | 0.2547 |
| Urban | 0.3598 | 0.3394 |
| Andhra Pradesh | 0.3731 | 0.3120 |
| Rural | 0.3306 | 0.2901 |
| Urban | 0.3809 | 0.3382 |

Notes: Data come from The Indian National Sample Survey, Round 50, corresponding to 1993-1994. Column (0) lists the regions. Columns (1) and (2) report, respectively, the unweighted and weighted Gini coefficients.

Table 2: Design Effects on standard errors: Gini, 1993-4

| (0) Region | (1) Gini | (2) Std errors correct | (3) Std errors naive | (4) Stratum Effect | (5) Cluster Effect | (6) % rise |
|---|---|---|---|---|---|---|
| All India | | | | | | |
| Rural | 0.2856 | 0.0021 | 0.0020 | 12.97 | 18.18 | 8.00 |
| Urban | 0.3430 | 0.0065 | 0.0041 | 23.51 | 41.46 | 22.11 |
| Maharashtra | | | | | | |
| Rural | 0.3070 | 0.0094 | 0.0075 | 13.38 | 33.60 | 23.22 |
| Urban | 0.3578 | 0.0097 | 0.0078 | 5.13 | 27.20 | 24.08 |
| Uttar Pradesh | | | | | | |
| Rural | 0.2807 | 0.0040 | 0.0037 | 9.94 | 13.93 | 5.99 |
| Urban | 0.3268 | 0.0040 | 0.0037 | 11.94 | 33.93 | 24.08 |
| West Bengal | | | | | | |
| Rural | 0.2547 | 0.0151 | 0.0152 | 3.70 | 4.14 | 0.84 |
| Urban | 0.3394 | 0.0105 | 0.0077 | 2.72 | 36.63 | 35.91 |
| Andhra Pradesh | | | | | | |
| Rural | 0.2901 | 0.0066 | 0.0060 | 7.48 | 13.58 | 8.09 |
| Urban | 0.3382 | 0.0081 | 0.0054 | 4.36 | 49.62 | 48.26 |

Notes: Data come from The Indian National Sample Survey, Round 50, corresponding to 1993-1994. Column (0) lists the areas. Column (1) reports weighted estimates of the Gini coefficient for per capita monthly consumption expenditure. Column (2) reports standard errors corrected for the sample design and column (3) reports standard errors computed ignoring the design (details in the text, Section 2.2). Column (4) measures the degree of overestimation of standard errors due to ignoring stratification, as a percentage of the naive standard errors. Column (5) measures the degree of underestimation of standard errors due to ignoring clustering, as a percentage of the naive standard errors. Column (6) measures the overall change in estimated standard errors due to the correction for sample design, as a percentage of the naive standard errors.

Table 3: Tests for Changes in Gini: 1987-88 vs 1993-94

| (0) Region | (1) Gini 1987-8 | (2) Gini 1993-4 | (3) Change in Gini 50-43 | (4) t-ratio correct (p-value) | (5) t-ratio naive (p-value) |
|---|---|---|---|---|---|
| All India | 0.3295 | 0.3250 | -0.0041 | -1.31* (0.097) | -1.72* (0.047) |
| Rural | 0.2992 | 0.2856 | 0.0135 | -2.65 (0.003) | -2.88 (0.001) |
| Urban | 0.3491 | 0.3430 | -0.0061 | -1.06 (0.147) | -1.21 (0.115) |
| Maharashtra | 0.3594 | 0.3770 | 0.0172 | 1.51* (0.067) | 1.88* (0.031) |
| Rural | 0.3120 | 0.3070 | -0.0050 | -0.25 (0.4) | -0.25 (0.4) |
| Urban | 0.3479 | 0.3578 | 0.0099 | 1.01 (0.16) | 1.28 (0.10) |
| Uttar Pradesh | 0.3078 | 0.3020 | -0.0061 | -1.03 (0.147) | -1.30 (0.097) |
| Rural | 0.2908 | 0.2807 | -0.0101 | -1.70 (0.045) | -1.85 (0.032) |
| Urban | 0.3397 | 0.3268 | -0.0129 | -0.97 (0.171) | -1.25 (0.106) |
| West Bengal | 0.3072 | 0.3080 | 0.0007 | 0.06 (0.274) | 0.07 (0.242) |
| Rural | 0.2552 | 0.2547 | -0.0005 | -0.03 (0.490) | -0.03 (0.490) |
| Urban | 0.3465 | 0.3394 | -0.0071 | -1.46* (0.074) | -1.91* (0.03) |
| Andhra Pradesh | 0.3322 | 0.3120 | -0.0200 | -2.52 (0.006) | -3.10 (0.001) |
| Rural | 0.3095 | 0.2901 | -0.0194 | -2.24 (0.012) | -2.44 (0.007) |
| Urban | 0.3758 | 0.3382 | -0.0376 | -2.58 (0.005) | -3.31 (0.0) |

Notes: Data come from The Indian National Sample Survey, round 43 (1987-8) and round 50 (1993-4). Column (0) lists the areas. Columns (1) and (2) report the Gini coefficient for per capita monthly consumption expenditure for 1987-8 and 1993-4, respectively. Column (3) reports changes in observed Ginis. Column (4) reports the t-ratio (p-values in parentheses) for testing change in Gini, obtained using standard errors corrected for the sample design. Column (5) reports the t-ratio (p-values in parentheses) with no correction for design. The asterisk (*) indicates cases where inference is reversed at conventional critical values of 1.64 or 1.96, corresponding to the 95th and 97.5th percentiles of the standard normal distribution.
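The t-ratios and p-values in Table 3 follow the usual normal-approximation recipe, which can be sketched in Python as follows. This is an illustration under assumptions: the two survey rounds are treated as independent samples, the p-value is taken to be one-sided, and the function names are hypothetical. Because the table reports rounded figures and does not list the 1987-8 standard errors, the sketch is not expected to reproduce the published entries exactly.

```python
import math

def normal_cdf(z):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def one_sided_p_value(t):
    """Probability that a standard normal exceeds |t|."""
    return 1.0 - normal_cdf(abs(t))

def t_ratio(gini_old, gini_new, se_old, se_new):
    """t-ratio for a change in Gini, assuming independent survey rounds."""
    return (gini_new - gini_old) / math.sqrt(se_old ** 2 + se_new ** 2)
```

A design-corrected standard error that is larger than the naive one shrinks the t-ratio towards zero, which is how a change that looks significant under naive inference can lose significance after the correction.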

Table 4: Test of Dominance

| (0) Region | (1) Dominance vs non-dominance | (2) Equality vs dominance | (3) Accept hypothesis: 5% | (3) Accept hypothesis: 10% |
|---|---|---|---|---|
| Rural India | 43<=50, P=0.752 | 50=43, P=0.030 | 43<50 | 43<50 |
| Maharashtra | 50<=43, P=0.553 | 50=43, P=0.24 | Equality | Equality |
| Rural Uttar Pradesh | 43<=50, P=0.731 | 43=50, P=0.26 | Equality | Equality |
| Andhra Pradesh | 43<=50, P=0.992 | 43=50, P=0.116 | Equality | Equality |
| Rural | 43<=50, P=0.77 | 43=50, P=0.13 | Equality | Equality |
| Urban | 43<=50, P=0.973 | 43=50, P=0.09 | Equality | 43<50 |

Notes: Data come from The Indian National Sample Survey, Round 50, corresponding to 1993-1994. Column (0) lists the areas. Column (1) lists the p-values for testing dominance versus no dominance; 43<=50 means that the null hypothesis being tested is that the Lorenz curve for the 43rd round lies everywhere below that for the 50th round, versus the alternative that the 43rd round curve lies above the 50th round one at at least one percentile value. Column (2) contains p-values for testing equality versus dominance; 43=50 means that the null hypothesis being tested is that the two curves are identical, versus the alternative that there is strict dominance for some percentile. Column (3) reports the final conclusion, based on levels 5% and 10%.

Table 5: P-values by Bootstrap

| (0) Region | (1A) Strata, no cluster | (1B) Strata, no cluster | (2A) Cluster, no strata | (2B) Cluster, no strata | (3A) No cluster, no strata | (3B) No cluster, no strata | (4A) Cluster and strata | (4B) Cluster and strata |
|---|---|---|---|---|---|---|---|---|
| Rural India | 43<=50, P=0.732 | 50=43, P=0.019 | 43<=50, P=0.765 | 50=43, P=0.035 | 43<=50, P=0.74 | 50=43, P=0.02 | 43<=50, P=0.752 | 50=43, P=0.030 |
| Maharashtra | 50<=43, P=0.54 | 50=43, P=0.22 | 50<=43, P=0.56 | 50=43, P=0.285 | 50<=43, P=0.55 | 50=43, P=0.225 | 50<=43, P=0.553 | 50=43, P=0.24 |
| Rural Uttar Pradesh | 43<=50, P=0.72 | 43=50, P=0.223 | 43<=50, P=0.74 | 43=50, P=0.29 | 43<=50, P=0.727 | 43=50, P=0.245 | 43<=50, P=0.731 | 43=50, P=0.26 |
| Andhra Pradesh | 43<=50, P=0.986 | 43=50, P=0.035 | 43<=50, P=0.995 | 43=50, P=0.154 | 43<=50, P=0.992 | 43=50, P=0.05 | 43<=50, P=0.992 | 43=50, P=0.116 |
| Rural | 43<=50, P=0.66 | 43=50, P=0.12 | 43<=50, P=0.79 | 43=50, P=0.15 | 43<=50, P=0.73 | 43=50, P=0.13 | 43<=50, P=0.77 | 43=50, P=0.13 |
| Urban | 43<=50, P=0.970 | 43=50, P=0.01 | 43<=50, P=0.98 | 43=50, P=0.09 | 43<=50, P=0.970 | 43=50, P=0.025 | 43<=50, P=0.973 | 43=50, P=0.09 |

Notes: Data come from The Indian National Sample Survey, Round 50, corresponding to 1993-1994. Column (0) lists the areas. Columns with suffix A contain p-values for testing dominance against no dominance; 43<=50 means that the null hypothesis being tested is that the Lorenz curve for the 43rd round lies everywhere below that for the 50th round, versus the alternative that the 43rd round curve lies above the 50th round one at at least one percentile value. Columns with suffix B contain p-values for testing equality versus dominance; 43=50 means that the null hypothesis being tested is that the two curves are identical, versus the alternative that there is strict dominance for some percentile.
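The "Cluster and strata" columns of Table 5 correspond to a bootstrap that respects the survey design. One common way to do this, sketched below in Python, is to resample whole primary clusters with replacement separately within each stratum. This is an assumption about the general technique rather than a transcription of the paper's procedure, and the function name `resample_clusters` and its arguments are hypothetical.

```python
import numpy as np

def resample_clusters(strata, clusters, rng):
    """Row indices of one bootstrap draw that keeps the design intact.

    Within each stratum, as many primary clusters as were sampled are
    drawn with replacement, and every row of a drawn cluster is kept.
    Cluster labels only need to be unique within a stratum.
    """
    strata = np.asarray(strata)
    clusters = np.asarray(clusters)
    rows = []
    for s in np.unique(strata):
        in_stratum = np.flatnonzero(strata == s)
        cluster_ids = np.unique(clusters[in_stratum])
        drawn = rng.choice(cluster_ids, size=len(cluster_ids), replace=True)
        for c in drawn:
            rows.append(in_stratum[clusters[in_stratum] == c])
    return np.concatenate(rows)
```

Recomputing the test statistic on many such draws gives a design-consistent bootstrap distribution. Dropping the loop over strata, or resampling individual rows instead of clusters, gives resampling schemes of the kind compared in the other columns of the table.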
