Comparing dispersions between two probability vectors under multinomial sampling

Shifeng Xiong*, Guoying Li

Academy of Mathematics and Systems Sciences, Chinese Academy of Sciences, Beijing 100080, China
Abstract
We consider testing hypotheses that compare dispersions between two parameter vectors of multinomial distributions, in both one-sample and two-sample cases. The comparison criterion is the concept of Schur majorization. A new dispersion index is proposed for testing the hypotheses. The corresponding test for the one-sample problem is an exact test. For the two-sample problem, the bootstrap is used to approximate the null distribution of the test statistic and the p-value. We prove that the bootstrap test is asymptotically correct and consistent. Simulation studies for the bootstrap test are reported and a real-life example is presented.
Key words:
multinomial distribution, dispersion, diversity, bootstrap, hypothesis testing, majorization, ordering
1 Introduction
Comparing dispersions between two probability vectors is an important statistical issue and can be applied in many fields such as ecology, economics and sociology (Dykstra et al., 2002), especially in studies on diversity (Patil and Taillie, 1982). There are many dispersion measures for a probability vector p = (p_1, ..., p_k). The most popular ones are Shannon's entropy, −∑_{i=1}^k p_i log p_i, and Gini's (or Simpson's) index, 1 − ∑_{i=1}^k p_i^2. Gilula and Haberman (1995) pointed out that a reasonable dispersion measure D should be a concave and symmetric real function on Q = {x ∈ R^k : x_i ≥ 0, i = 1, ..., k, and ∑_{i=1}^k x_i = 1} such that D((1, 0, ..., 0)) = 0. Here 'D is symmetric' means D(x_1, ..., x_k) = D(x_{i_1}, ..., x_{i_k}) for all x ∈ Q and all permutations i_1, ..., i_k of 1, ..., k. Denote the set of all functions satisfying these conditions by V. Gilula and Haberman (1995) stated that any function D ∈ V can serve as a dispersion measure. However, using any single dispersion index to compare dispersion may not be suitable, since different indices may

* Tel: +86-10-62651423; fax: +86-10-82568364. E-mail address: [email protected]
give conflicting results (Gove et al., 1994). To avoid ambiguous comparisons, Dykstra et al. (2002) considered comparing dispersion through the concept of majorization.

We first present some basic concepts of majorization theory (see Marshall and Olkin (1979) for details). For each x = (x_1, ..., x_k) ∈ R^k, write x[1] ≥ ··· ≥ x[k] for its nonincreasing rearrangement and write x[·] = (x[1], ..., x[k]). For x = (x_1, ..., x_k) and y = (y_1, ..., y_k) with ∑_{i=1}^k x_i = ∑_{i=1}^k y_i, we denote x ≺ y if

∑_{j=1}^i x[j] ≤ ∑_{j=1}^i y[j],  i = 1, ..., k − 1.
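For concreteness, the partial-sum condition above can be coded directly; the following Python sketch (function name ours, not from the paper) tests whether x ≺ y:

```python
def majorizes(y, x, tol=1e-12):
    """Return True if x ≺ y (x is majorized by y).

    Both vectors must have equal totals; we compare partial sums of the
    nonincreasing rearrangements x[1] >= ... >= x[k] for i = 1, ..., k-1.
    """
    xs = sorted(x, reverse=True)
    ys = sorted(y, reverse=True)
    if abs(sum(xs) - sum(ys)) > tol:
        raise ValueError("x and y must have equal totals")
    px = py = 0.0
    for i in range(len(xs) - 1):
        px += xs[i]
        py += ys[i]
        if px > py + tol:
            return False
    return True

# The uniform vector is majorized by every probability vector:
print(majorizes((0.5, 0.3, 0.2), (1/3, 1/3, 1/3)))  # True
print(majorizes((1/3, 1/3, 1/3), (0.5, 0.3, 0.2)))  # False
```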
When x ≺ y, x is said to be majorized by y (y majorizes x). A symmetric function φ defined on a set A ⊂ R^k is said to be Schur-convex (Schur-concave) on A if

x ≺ y for x, y ∈ A  ⇒  φ(x) ≤ (≥) φ(y).

Dykstra et al. (2002) gave the following definition: probability vector p is more dispersed than probability vector q if p ≺ q and p[·] ≠ q[·]. By use of results on Schur-convex functions, Gilula and Haberman (1995) showed that p is more dispersed than q according to this definition if and only if D(p) ≥ D(q) for all D ∈ V, with strict inequality for some D ∈ V.
Note that '≺' is only a partial ordering: there are many pairs of probability vectors that are not comparable. If p and q are not comparable, then there exist different dispersion measures that lead to different comparison results.

The concept of majorization has also been proposed to compare the diversity of ecological populations (Patil and Taillie, 1979). For a population C with k species, let N_i be the abundance of species i and N = (N_1, ..., N_k). A commonly used diversity measure, called the intrinsic diversity profile, is defined as the plot of the pairs (i, T_i), where

T_i = (1 / (N_1 + ··· + N_k)) ∑_{j=i+1}^k N[j],  i = 1, ..., k − 1.
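The profile (T_1, ..., T_{k−1}) is straightforward to compute from raw abundances; a small Python sketch (function name ours):

```python
def diversity_profile(abundances):
    """Intrinsic diversity profile: T_i = (N[i+1] + ... + N[k]) / (N_1 + ... + N_k)
    for i = 1, ..., k-1, where N[1] >= ... >= N[k] are the sorted abundances."""
    n = sorted(abundances, reverse=True)   # N[1] >= ... >= N[k]
    total = sum(n)
    profile = []
    tail = total
    for i in range(len(n) - 1):
        tail -= n[i]                       # remaining sum N[i+1] + ... + N[k]
        profile.append(tail / total)
    return profile

print(diversity_profile([50, 30, 20]))  # [0.5, 0.2]
```

A population whose profile lies pointwise above another's is intrinsically more diverse.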
According to Gove et al. (1994), C is intrinsically more diverse than another population C′ with k species if and only if the intrinsic diversity profile of C is above that of C′. Let N′_i be the abundance of species i of C′. Denote p_i = N_i / ∑_{j=1}^k N_j, q_i = N′_i / ∑_{j=1}^k N′_j, p = (p_1, ..., p_k) and q = (q_1, ..., q_k). It is clear that

the intrinsic diversity profile of C is above that of C′  ⇔  p ≺ q.

Therefore, comparing diversity through the intrinsic diversity profile is equivalent to comparing dispersion through majorization between two probability vectors.

In this paper we study dispersion comparison between two probability vectors using the concept of majorization. Specifically, the following two testing hypotheses are considered:

H_0 : p ≺ q  ←→  H_1 : H_0 does not hold,  (1.1)

and

H_0′ : p ≻ q  ←→  H_1′ : H_0′ does not hold.  (1.2)
We will study the two testing problems in both one-sample and two-sample cases. In one-sample cases, the probability vector q is known and the sample X_n = (X_{n1}, ..., X_{nk}) is taken from a multinomial distribution, MND_k(n; p), with an unknown parameter p; in two-sample cases, the samples X_m = (X_{m1}, ..., X_{mk}) and Y_n = (Y_{n1}, ..., Y_{nk}) are taken independently from MND_k(m; p) and MND_k(n; q) with unknown p and q, respectively. The emphasis of our discussion is the two-sample problem, which is perhaps more useful in practice.

Gilula and Haberman (1995) pointed out that p ≺ q is equivalent to p[·] being stochastically larger than q[·] as probability vectors. Many papers have studied testing hypotheses concerning stochastic orderings of multinomial parameters (see, e.g., Robertson and Wright, 1981; Silvapulle and Sen, 2004; Cohen et al., 2006; Feng and Wang, 2006). The stochastic ordering of the multinomial parameters p themselves is different from that of the ordered parameters p[·]; the latter is our concern in this paper. Nevertheless, few papers have considered this problem. Xiong and Li (2005) discussed one-sided hypotheses for p[1], which are special cases of (1.1) and (1.2). Dykstra et al. (2002) studied the likelihood ratio test for (1.1) in one-sample cases. Moreover, they pointed out that the likelihood ratio test for the two-sample problem is not a trivial extension of that for the one-sample problem. In this paper, we do not adopt the likelihood ratio technique. In Section 2 we discuss the one-sample problem: an ad hoc test for each of (1.1) and (1.2) is investigated. By the theory of Schur-convex functions, the test can be made exact (it does not rely on asymptotics), and its consistency is also proved. The test for (1.1) in one-sample cases is extended to two-sample cases in Section 3. We use the bootstrap to approximate the null distribution of the test statistic and prove that this approximation and the corresponding bootstrap test are consistent. Section 4 presents simulation results evaluating the performance of the bootstrap test for (1.1) in two-sample cases, and a real data set is analyzed using our method.
2 One-Sample Problem
For all k-dimensional probability vectors p, q, denote

ϕ(p, q) = max{ p[1] − q[1], (p[1] + p[2]) − (q[1] + q[2]), ..., ∑_{i=1}^{k−1} p[i] − ∑_{i=1}^{k−1} q[i] }.

It is clear that ϕ(p, q) ≤ 0 if and only if p ≺ q. Therefore, ϕ can be regarded as a single index for comparing the dispersion of p and q through majorization. In one-sample cases, the empirical forms of ϕ(p, q) and ϕ(q, p),

S_n = ϕ(X_n / n, q)  and  S_n′ = ϕ(q, X_n / n),

can be used as the test statistics for (1.1) and (1.2), respectively, where q is known and X_n ∼ MND_k(n; p).
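Both ϕ and the one-sample statistic S_n reduce to partial sums of sorted vectors; a Python sketch (names ours):

```python
def phi(p, q):
    """phi(p, q): the maximum over i = 1, ..., k-1 of the difference between
    the i-th partial sums of the sorted (nonincreasing) p and q."""
    ps = sorted(p, reverse=True)
    qs = sorted(q, reverse=True)
    best = float("-inf")
    cp = cq = 0.0
    for i in range(len(ps) - 1):
        cp += ps[i]
        cq += qs[i]
        best = max(best, cp - cq)
    return best

# phi(p, q) <= 0 iff p ≺ q:
print(phi((0.25, 0.25, 0.25, 0.25), (0.5, 0.3, 0.2, 0.0)))  # -0.25: p ≺ q
print(phi((0.7, 0.1, 0.1, 0.1), (0.25, 0.25, 0.25, 0.25)))  # 0.45: p not ≺ q

# one-sample statistic S_n = phi(X_n / n, q) from observed counts:
x = (3, 10, 2, 5)
n = sum(x)
S_n = phi([xi / n for xi in x], (0.25, 0.25, 0.25, 0.25))
```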
Theorem 2.1. For all c ∈ R,

sup_{p ≺ q} P_p(S_n ≥ c) = sup_{p[·] = q[·]} P_p(S_n ≥ c),  (2.1)

sup_{p ≻ q} P_p(S_n′ ≥ c) = sup_{p[·] = q[·]} P_p(S_n′ ≥ c).  (2.2)
Proof. By definition, ϕ(·, q) is a Schur-convex function. Then for any c, f(x) = I(ϕ(x, q) ≥ c) is also Schur-convex. By C.4.a of Chapter 11 of Marshall and Olkin (1979), g(p) = E_p f(X_n / n) = P_p(S_n ≥ c) is a Schur-convex function of p. Then (2.1) follows. The equality (2.2) can be proved similarly.
Denote the tests based on S_n and S_n′ for (1.1) and (1.2) by φ_S and φ_S′, respectively. Let x_n be the observation of X_n and s_n = S_n(x_n), s_n′ = S_n′(x_n). Note that the distribution of X_{n[·]}, and hence that of S_n, depends only on p[·]. Then the p-values of φ_S and φ_S′ can be calculated as

p_s = P_q(S_n ≥ s_n),  p_s′ = P_q(S_n′ ≥ s_n′).

When n and k are both small, p_s and p_s′ can be calculated exactly. Otherwise, they can be approximated by the Monte Carlo method.

Theorem 2.2. φ_S and φ_S′ are consistent tests.

Proof. Note that √n (S_n − ϕ(p, q)) has an asymptotic distribution, denoted by F_p, as n → ∞ (Marcheselli, 2000; see also Xiong and Li, 2006a). Therefore, given a significance level α, the asymptotic rejection region of φ_S is {√n S_n ≥ F_q^{−1}(1 − α)}. For all p ∈ H_1, ϕ(p, q) > 0 = ϕ(q, q). Thus,

P_p(√n S_n ≥ F_q^{−1}(1 − α)) = P_p(√n (S_n − ϕ(p, q)) ≥ F_q^{−1}(1 − α) − √n ϕ(p, q)) → 1.
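The Monte Carlo approximation of p_s = P_q(S_n ≥ s_n) only requires simulating from MND_k(n; q). A minimal Python sketch (names ours; a simple categorical sampler stands in for a library multinomial generator):

```python
import random

def phi(p, q):
    """Dispersion index: max over i = 1..k-1 of partial-sum differences."""
    ps, qs = sorted(p, reverse=True), sorted(q, reverse=True)
    cp = cq = 0.0
    diffs = []
    for i in range(len(ps) - 1):
        cp += ps[i]
        cq += qs[i]
        diffs.append(cp - cq)
    return max(diffs)

def mc_pvalue(x_obs, q, reps=10000, seed=0):
    """Monte Carlo approximation of p_s = P_q(S_n >= s_n) for testing p ≺ q."""
    rng = random.Random(seed)
    n = sum(x_obs)
    s_obs = phi([xi / n for xi in x_obs], q)
    k = len(q)
    hits = 0
    for _ in range(reps):
        # draw X ~ MND_k(n; q) via n categorical draws
        counts = [0] * k
        for _ in range(n):
            u, acc = rng.random(), 0.0
            for j, qj in enumerate(q):
                acc += qj
                if u <= acc:
                    counts[j] += 1
                    break
            else:
                counts[-1] += 1
        if phi([c / n for c in counts], q) >= s_obs:
            hits += 1
    return hits / reps

# counts far from uniform vs. uniform q give a very small p-value:
p_val = mc_pvalue((40, 5, 3, 2), (0.25, 0.25, 0.25, 0.25), reps=2000)
print(p_val)  # close to 0 for such extreme data
```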
Remark 2.1. We have pointed out that ϕ may be used as an index for comparing the dispersion of two probability vectors. In fact, ϕ can also be used to quantify the dispersion of a single probability vector. Let π_0 = (1/k, ..., 1/k). For a probability vector p, we may consider ϕ(p, π_0) as a dispersion index. In addition, note that if q = π_0, then (1.1) becomes the following goodness-of-fit hypotheses:

H_{E0} : p = π_0  ←→  H_{E1} : p ≠ π_0.

For this special case, it is perhaps worth comparing φ_S with the usual goodness-of-fit tests, such as the power-divergence test (Read and Cressie, 1988).
3 Two-Sample Problem
This section discusses testing (1.1) in two-sample cases. Let X_m ∼ MND_k(m; p) and Y_n ∼ MND_k(n; q) be independent. Similarly to the one-sample cases, the test statistic can be

T_mn = ϕ(X_m / m, Y_n / n).
Theorem 3.1. For all c ∈ R,

sup_{p ≺ q} P_{(p,q)}(T_mn ≥ c) = sup_{p[·] = q[·]} P_{(p,q)}(T_mn ≥ c).  (3.1)

Proof. We prove a more general result. It is obvious that f(x) = ϕ(x, θ) and f_1(x) = ϕ(θ, x) are Schur-convex and Schur-concave functions, respectively. For any probability vector p_0 with p ≺ p_0 ≺ q,

P_{(p,q)}(T_mn ≥ c) = E_q P_{(p,q)}(T_mn ≥ c | Y_n)
= ∑_y P_p(ϕ(X_m / m, y / n) ≥ c) δ(y)
≤ ∑_y P_{p_0}(ϕ(X_m / m, y / n) ≥ c) δ(y)
= P_{(p_0, q)}(T_mn ≥ c)
= E_{p_0} P_{(p_0, q)}(T_mn ≥ c | X_m)
≤ P_{(p_0, p_0)}(T_mn ≥ c),  (3.2)

where δ(·) is the probability function of Y_n. Then (3.1) follows.
From this theorem, given the observations x_m and y_n, we can calculate the p-value for testing (1.1) as

p = sup_{p ∈ Q} P_{(p,p)}(T_mn ≥ t_mn),  (3.3)

where Q = {x ∈ R^k : x_i ≥ 0, i = 1, ..., k, and ∑_{i=1}^k x_i = 1} and t_mn = T_mn(x_m, y_n).
Since the p-value is very difficult to compute exactly from (3.3), we use the bootstrap method to approximate it. The main idea is to generate bootstrap samples from the distributions of X_m and Y_n, with p and q replaced by their MLEs under the boundary of the null hypothesis. Let p̂ and q̂ be the MLEs of p and q under the constraint p[·] = q[·]. Denote

π̂ = (π̂_1, ..., π̂_k) = (X_{m[·]} + Y_{n[·]}) / (m + n).

We will prove that p̂[·] = q̂[·] = π̂ in Lemma 3.1 below. Our bootstrap test for (1.1) can be constructed as follows. Given X_m and Y_n, generate the bootstrap samples X*_s ∼ MND_k(s; π̂) and Y*_t ∼ MND_k(t; π̂), where s is a number smaller than m and s/t = m/n. Let T*_st be the bootstrap version of T_mn, i.e., T*_st = ϕ(X*_s / s, Y*_t / t), and denote

F*_mn(x) = P(√s T*_st ≤ x | X_m, Y_n),  (3.4)

which is an approximate distribution function of √m T_mn under p[·] = q[·]. Then, given a significance level α, the rejection region is {√m T_mn ≥ F*_mn^{−1}(1 − α)} and the approximate p-value is

p = P(√s T*_st ≥ √m t_mn | x_m, y_n) = 1 − F*_mn(√m t_mn).  (3.5)
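The whole two-sample procedure can be sketched in a few lines; the following Python code (names ours; NumPy assumed for multinomial sampling) pools the ordered counts to form π̂ and uses reduced resample sizes with s/t = m/n, as in the construction above:

```python
import numpy as np

def phi(p, q):
    """max over i = 1..k-1 of partial-sum differences of sorted p and q."""
    ps, qs = np.sort(p)[::-1], np.sort(q)[::-1]
    d = np.cumsum(ps)[:-1] - np.cumsum(qs)[:-1]
    return d.max()

def bootstrap_test(x, y, B=10000, seed=0):
    """Bootstrap p-value for H0: p ≺ q from multinomial counts x, y."""
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x, float), np.asarray(y, float)
    m, n = x.sum(), y.sum()
    t_mn = phi(x / m, y / n)
    # MLE of the common ordered parameter under p[.] = q[.] (Lemma 3.1)
    pi_hat = (np.sort(x)[::-1] + np.sort(y)[::-1]) / (m + n)
    # reduced bootstrap sample sizes: s ~ 4*sqrt(m), with s/t = m/n
    s = max(2, int(4 * np.sqrt(m)))
    t = max(2, int(s * n / m))
    xs = rng.multinomial(s, pi_hat, size=B) / s
    yt = rng.multinomial(t, pi_hat, size=B) / t
    stats = np.array([phi(a, b) for a, b in zip(xs, yt)])
    # compare sqrt(s) T*_st with sqrt(m) T_mn, as in (3.4)-(3.5)
    return np.mean(np.sqrt(s) * stats >= np.sqrt(m) * t_mn)

# hypothetical counts for illustration:
x = [30, 18, 12, 8]
y = [45, 30, 20, 10]
print(bootstrap_test(x, y, B=5000))
```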
Denote this bootstrap test for (1.1) by φ.

Remark 3.1. Note that the bootstrap distribution estimators of X_{m[·]} and Y_{n[·]}, and hence that of T_mn, may be inconsistent if the bootstrap sample sizes are taken as m and n (Xiong and Li, 2006a). This is why we take the bootstrap sample sizes s and t smaller than m and n. Quite a few papers have studied this phenomenon, for example, Shao and Tu (1995) and Bickel et al. (1997). Based on our simulations in Section 4, s = 4√m appears to be a good choice.

In order to show that φ is a reasonable test, we discuss whether it is asymptotically correct (Shao and Tu, 1995, p. 178) and consistent. First we discuss the properties of p̂ and q̂.

Lemma 3.1. The MLEs p̂ and q̂ have the following properties:

(1) p̂[·] = q̂[·] = π̂.

(2) As m, n → ∞ with m/n → λ ∈ (0, +∞),

π̂ → π = (π_1, ..., π_k) = (λ / (1 + λ)) p[·] + (1 / (1 + λ)) q[·]  almost surely,

and √m (π̂_i − π_i) = O_p(1), i = 1, ..., k.

(3) If p ≺ q, then p ≺ π ≺ q.

Proof. See the Appendix.

Now we show that F*_mn defined in (3.4) converges to the asymptotic distribution of √m T_mn under p[·] = q[·]. Some lemmas are needed.

Lemma 3.2. Let p_m be a sequence of random probability vectors with p_m → p almost surely, where p is a probability vector. Suppose that X_m ∼ MND_k(m; p) and, given p_m, the conditional distribution of X*_s is MND_k(s; p_m). Then, as m → ∞ and s → ∞,

√s (X*_s / s − p_m) | p_m  →_{d, a.s.}  √m (X_m / m − p),

where '|G_m →_{d, a.s.}' means that, given the sequence of sub-σ-fields (or random variables) G_m, the left side converges in distribution to the asymptotic distribution of the right side almost surely.

Proof. Use characteristic functions (see Shao and Tu, 1995, p. 76).

Lemma 3.3. Assume that all the conditions in Lemma 3.2 hold. Assume further that √m (p_{mi} − p_i) = O_p(1), i = 1, ..., k. Then, as m → ∞, s → ∞ and s/m → 0,

√s (X*_{s[·]} / s − p_{m[·]}) | p_m  →_{d, P}  √m (X_{m[·]} / m − p[·]),

where '|G_m →_{d, P}' means that, given the sequence of sub-σ-fields (or random variables) G_m, the left side converges in distribution to the asymptotic distribution of the right side in probability.

Proof. See the Appendix.
Let X̃_m ∼ MND_k(m; π) and Ỹ_n ∼ MND_k(n; π) be independent. Given X_m and Y_n, let X*_s ∼ MND_k(s; π̂) and Y*_t ∼ MND_k(t; π̂) be independent. For i = 1, ..., k − 1, denote

U_mi = √m (∑_{j=1}^i X_{m[j]} / m − ∑_{j=1}^i p[j]),  V_ni = √n (∑_{j=1}^i Y_{n[j]} / n − ∑_{j=1}^i q[j]),

Ũ_mi = √m (∑_{j=1}^i X̃_{m[j]} / m − ∑_{j=1}^i π_j),  Ṽ_ni = √n (∑_{j=1}^i Ỹ_{n[j]} / n − ∑_{j=1}^i π_j),

U*_si = √s (∑_{j=1}^i X*_{s[j]} / s − ∑_{j=1}^i π̂_j),  V*_ti = √t (∑_{j=1}^i Y*_{t[j]} / t − ∑_{j=1}^i π̂_j).

Lemma 3.4. As m, n, s, t → ∞ with s/m → 0 and lim s/t = lim m/n = λ ∈ (0, +∞),

√s T*_st | F_mn  →_{d, P}  max_{1≤i≤k−1} { Ũ_mi − √(m/n) Ṽ_ni },  (3.6)

where F_mn is the sub-σ-field generated by (X_m, Y_n).

Proof. See the Appendix.

Denote the asymptotic distribution of the right side of (3.6) by F_π. By the Pólya theorem (see Theorem 3.7 of Xiong and Li (2006b)), we have

Lemma 3.5. sup_x |F*_mn(x) − F_π(x)| → 0 in probability, where F*_mn is given in (3.4).

Note that if p[·] = q[·], then

√m T_mn = max_{1≤i≤k−1} { U_mi − √(m/n) V_ni } =_d max_{1≤i≤k−1} { Ũ_mi − √(m/n) Ṽ_ni },  (3.7)

where '=_d' means that both sides have the same distribution. Therefore the following theorem holds from Lemma 3.4.

Theorem 3.2. If p[·] = q[·], then as m, n, s, t → ∞ with s/m → 0 and lim s/t = lim m/n = λ ∈ (0, +∞), we have

√s T*_st | F_mn  →_{d, P}  √m T_mn.
The following theorem indicates that φ is asymptotically correct and consistent.

Theorem 3.3. Given a nominal level α ∈ (0, 1):

(1) As m, n, s, t → ∞ with s/m → 0 and lim s/t = lim m/n = λ ∈ (0, +∞),

sup_{(p,q) ∈ H_0} lim sup P(√m T_mn ≥ F*_mn^{−1}(1 − α)) = α.  (3.8)

(2) If H_0 does not hold, then as m, n, s, t → ∞ with s/m → 0 and lim s/t = lim m/n = λ ∈ (0, +∞),

lim P(√m T_mn ≥ F*_mn^{−1}(1 − α)) = 1.  (3.9)

Proof. (1) If p[·] = q[·], then it follows from Lemma 3.5 and (3.7) that

lim P(√m T_mn ≥ F*_mn^{−1}(1 − α)) = α.

If p ≺ q with p[·] ≠ q[·], then from (3.2), (3.7) and Lemma 3.1(3), we have

P_{(p,q)}(√m T_mn ≥ F_π^{−1}(1 − α)) ≤ P_{(π,π)}(√m T_mn ≥ F_π^{−1}(1 − α)) → α.

From this result and Lemma 3.5, it follows that

lim sup P(√m T_mn ≥ F*_mn^{−1}(1 − α)) ≤ α.

Then (3.8) is proved.

(2) If H_0 does not hold, note that

√m T_mn = max_{1≤i≤k−1} { U_mi − √(m/n) V_ni + √m (∑_{j=1}^i p[j] − ∑_{j=1}^i q[j]) },

and ∑_{j=1}^i p[j] > ∑_{j=1}^i q[j] for some i, so √m T_mn → ∞. Thus (3.9) follows.
It is worth noting that besides multinomial sampling, there are other sampling plans to estimate a probability vector (Barabesi and Fattorini, 1998; see also Fattorini and Marcheselli, 1999, Marcheselli 2000, and Marcheselli 2003). How to construct tests for (1.1) and (1.2) in such cases is an interesting problem and will be addressed in future research.
4 Simulation results and a real example

In the preceding section we derived theoretical properties of the bootstrap test for (1.1). To investigate its finite-sample behavior, we conducted simulation studies. From Theorem 3.3, we only need to consider the case p = q. The following choices of p are used:

p_1 = (1/6, 1/6, 1/6, 1/6, 1/6, 1/6),
p_2 = (0.2, 0.2, 0.15, 0.15, 0.15, 0.15),
p_3 = (0.3, 0.2, 0.15, 0.15, 0.1, 0.1),
p_4 = (0.4, 0.2, 0.1, 0.1, 0.1, 0.1),
p_5 = (0.5, 0.2, 0.1, 0.1, 0.05, 0.05),
p_6 = (0.75, 0.05, 0.05, 0.05, 0.05, 0.05).

Given the nominal level α = 0.05, M = 1000 simulations are performed for each of four sample-size settings (m, n) (see Table 1). In each simulation, we set s = 4√m and t = sn/m, and then generate 1000 bootstrap observations to construct the rejection region. The frequencies of rejecting H_0 are presented in Table 1. All frequencies are quite close to the nominal level, so we conclude that our bootstrap test φ performs well even with finite samples.
Table 1. Frequencies of rejecting H_0 with α = 0.05

                    p_1     p_2     p_3     p_4     p_5     p_6
m = 30, n = 40      0.03    0.03    0.05    0.08    0.06    0.06
m = n = 50          0.03    0.04    0.08    0.05    0.06    0.06
m = 40, n = 100     0.05    0.04    0.07    0.05    0.05    0.05
m = 150, n = 200    0.04    0.04    0.06    0.06    0.05    0.05
We next illustrate the bootstrap test on the occupational status data previously examined in Dykstra et al. (2002). There are 151 workers in Caucasian Walton County, Florida, taken from the 1885 Florida census manuscripts. The occupational status (1 = professional; 2 = manager, clerical, proprietor; 3 = skilled; 4 = unskilled; 5 = laborer; 6 = farmer) of each worker was recorded, and the observation was X = (15, 14, 6, 11, 78, 27) with sample size m = 151. Comparing with the data of 1870, Y = (4, 6, 18, 11, 95, 75) with sample size n = 209, a social scientist or demographer may be interested in the change in the level of occupational diversity (or dispersion). Thus, we consider testing (1.1) with this data set. First we compute the ordered form π̂ of the MLEs p̂ and q̂ of p and q under p[·] = q[·] using Lemma 3.1:

π̂ = (0.481, 0.283, 0.092, 0.069, 0.047, 0.028).

Let s = [4√m] = 49, t = [4√m · n/m] = 68, and B = 100000. We generate B bootstrap observations from MND_6(s; π̂) and MND_6(t; π̂), respectively. From (3.5), we obtain the p-value 0.175. Therefore there is insufficient evidence to reject p ≺ q. Dykstra et al. (2002) assumed q = Y/n and used the one-sample likelihood ratio test for (1.1) with these data; their p-value, 0.242, is larger than ours.
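The value of π̂ reported above can be checked directly from the counts (a minimal sketch):

```python
x = [15, 14, 6, 11, 78, 27]   # 1885 data, m = 151
y = [4, 6, 18, 11, 95, 75]    # 1870 data, n = 209
m, n = sum(x), sum(y)

# pool the ordered counts: pi_hat = (X_m[.] + Y_n[.]) / (m + n), Lemma 3.1
pooled = [a + b for a, b in zip(sorted(x, reverse=True), sorted(y, reverse=True))]
pi_hat = [round(c / (m + n), 3) for c in pooled]
print(pi_hat)  # [0.481, 0.283, 0.092, 0.069, 0.047, 0.028]
```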
Appendix

Proof of Lemma 3.1. First we solve the optimization problem:

maximize ∑_{i=1}^k X_mi log p_i + ∑_{i=1}^k Y_ni log q_i

subject to {p_1, ..., p_k} = {q_1, ..., q_k}. Let (u_1, ..., u_k) be a permutation of (1, ..., k) with p_i = q_{u_i}, i = 1, ..., k. Then

∑_{i=1}^k X_mi log p_i + ∑_{i=1}^k Y_ni log q_i = ∑_{i=1}^k (X_mi + Y_{n u_i}) log p_i
≤ ∑_{i=1}^k (X_mi + Y_{n u_i}) log((X_mi + Y_{n u_i}) / (m + n)).

Since f(t) = t log(t / (m + n)) is a convex function, g(t_1, ..., t_k) = ∑_{i=1}^k f(t_i) is a Schur-convex function (Marshall and Olkin, 1979). Note that

(X_{m1} + Y_{n u_1}, ..., X_{mk} + Y_{n u_k}) ≺ (X_{m[1]} + Y_{n[1]}, ..., X_{m[k]} + Y_{n[k]}).

Therefore,

∑_{i=1}^k X_mi log p_i + ∑_{i=1}^k Y_ni log q_i ≤ ∑_{i=1}^k (X_{m[i]} + Y_{n[i]}) log(π̂_i).

Then (1) is proved, and (2) and (3) are obvious.
Proof of Lemma 3.3. Denote {p̄_1, ..., p̄_l} = {p_1, ..., p_k} with p̄_1 > ··· > p̄_l, and {r_{i1}, ..., r_{i k_i}} = {j : p_j = p̄_i} with r_{i1} < ··· < r_{i k_i}, for i = 1, ..., l. Write

A_{si} = { {X*_{s[k_1+···+k_{i−1}+1]}, ..., X*_{s[k_1+···+k_i]}} = {X*_{s r_{i1}}, ..., X*_{s r_{i k_i}}} },

B_{mi} = { {X_{m[k_1+···+k_{i−1}+1]}, ..., X_{m[k_1+···+k_i]}} = {X_{m r_{i1}}, ..., X_{m r_{i k_i}}} },

A_s = ∩_{i=1}^l A_{si},  B_m = ∩_{i=1}^l B_{mi}.

From Xiong and Li (2006b), P(A_s | p_m) → 1 in probability and P(B_m) → 1. Therefore, for x ∈ R^k,

P( ∩_{i=1}^k { √s (X*_{s[i]} / s − p_{m[i]}) ≤ x_i } | p_m )

= P( ∩_{i=1}^k { √s (X*_{s[i]} / s − p_{m[i]}) ≤ x_i }, A_s, B_m | p_m ) + o_p(1)

= P( ∩_{i=1}^l ∪_{a_1,...,a_{k_i}} ∪_{b_1,...,b_{k_i}} { √s (X*_{s a_1} / s − p_{m b_1}) ≤ x_1, ..., √s (X*_{s a_{k_i}} / s − p_{m b_{k_i}}) ≤ x_{k_i}, X*_{s a_1} ≥ ··· ≥ X*_{s a_{k_i}}, p_{m b_1} ≥ ··· ≥ p_{m b_{k_i}} } | p_m ) + o_p(1)

(here ∪_{a_1,...,a_{k_i}} and ∪_{b_1,...,b_{k_i}} denote unions over all permutations a_1, ..., a_{k_i} and b_1, ..., b_{k_i} of r_{i1}, ..., r_{i k_i}, respectively)

= P( ∩_{i=1}^l ∪_{a_1,...,a_{k_i}} ∪_{b_1,...,b_{k_i}} { √s (X*_{s a_1} / s − p_{m a_1}) ≤ x_1 + (√s / √m) √m (p_{m b_1} − p_{m a_1}), ..., √s (X*_{s a_{k_i}} / s − p_{m a_{k_i}}) ≤ x_{k_i} + (√s / √m) √m (p_{m b_{k_i}} − p_{m a_{k_i}}), X*_{s a_1} ≥ ··· ≥ X*_{s a_{k_i}}, p_{m b_1} ≥ ··· ≥ p_{m b_{k_i}} } | p_m ) + o_p(1)

= P( ∩_{i=1}^l ∪_{a_1,...,a_{k_i}} { √s (X*_{s a_1} / s − p_{m a_1}) ≤ x_1 + o_p(1), ..., √s (X*_{s a_{k_i}} / s − p_{m a_{k_i}}) ≤ x_{k_i} + o_p(1), X*_{s a_1} ≥ ··· ≥ X*_{s a_{k_i}} } | p_m ) + o_p(1)

= P( ∩_{i=1}^l ∪_{a_1,...,a_{k_i}} { √m (X_{m a_1} / m − p_{a_1}) ≤ x_1, ..., √m (X_{m a_{k_i}} / m − p_{a_{k_i}}) ≤ x_{k_i}, X_{m a_1} ≥ ··· ≥ X_{m a_{k_i}} } ) + o_p(1)

= P( ∩_{i=1}^k { √m (X_{m[i]} / m − p[i]) ≤ x_i } ) + o_p(1).
Proof of Lemma 3.4. By Lemma 3.1, Lemma 3.3 and the continuous mapping theorem, for each i = 1, ..., k − 1,

U*_si | F_mn  →_{d, P}  Ũ_mi,  V*_ti | F_mn  →_{d, P}  Ṽ_ni.

Therefore

√s T*_st = max_{1≤i≤k−1} √s ( (X*_{s[1]} + ··· + X*_{s[i]}) / s − (Y*_{t[1]} + ··· + Y*_{t[i]}) / t )
= max_{1≤i≤k−1} { U*_si − (√s / √t) V*_ti } | F_mn  →_{d, P}  max_{1≤i≤k−1} { Ũ_mi − √(m/n) Ṽ_ni }.
References

Barabesi, L., Fattorini, L., 1998. The use of replicated plot, line and point sampling for estimating species abundance and ecological diversity. Environ. Ecol. Statist. 5, 353-370.

Bickel, P.J., Götze, F., van Zwet, W.R., 1997. Resampling fewer than n observations: gains, losses, and remedies for losses. Statist. Sinica 7, 1-37.

Cohen, A., Kolassa, J., Sackrowitz, H.B., 2006. A new test for stochastic order of k ≥ 3 ordered multinomial populations. Statist. Probab. Lett. 76, 1017-1024.

Dykstra, R.L., Lang, J.B., Oh, M., Robertson, T., 2002. Order restricted inference for hypotheses concerning qualitative dispersion. J. Statist. Plann. Inference 107, 249-265.

Fattorini, L., Marcheselli, M., 1999. Inference on intrinsic diversity profiles of biological populations. Environmetrics 10, 589-599.

Feng, Y., Wang, J., 2006. Likelihood ratio test against simple stochastic ordering among several multinomial populations. J. Statist. Plann. Inference. To appear.

Gilula, Z., Haberman, S.J., 1995. Dispersion of categorical variables and penalty functions: derivation, estimation and comparability. J. Amer. Statist. Assoc. 90, 1447-1452.

Gove, J.H., Patil, G.P., Swindel, B.F., Taillie, C., 1994. Ecological diversity and forest management. In: Patil, G.P., Rao, C.R. (Eds.), Handbook of Statistics 12: Environmental Statistics. North-Holland, Amsterdam, 409-462.

Marcheselli, M., 2000. A generalized delta method with applications to intrinsic diversity profiles. J. Appl. Prob. 37, 504-510.

Marcheselli, M., 2003. Asymptotic results in jackknifing nonsmooth functions of the sample mean vector. Ann. Statist. 31, 1885-1904.

Marshall, A.W., Olkin, I., 1979. Inequalities: Theory of Majorization and Its Applications. Academic Press, New York.

Patil, G.P., Taillie, C., 1979. An overview of diversity. In: Grassle, J.F., Patil, G.P., Smith, W., Taillie, C. (Eds.), Ecological Diversity in Theory and Practice. International Co-operative Publishing House, Fairland, MD.

Patil, G.P., Taillie, C., 1982. Diversity as a concept and its measurement. J. Amer. Statist. Assoc. 77, 548-561.

Read, T.R.C., Cressie, N., 1988. Goodness-of-Fit Statistics for Discrete Multivariate Data. Springer, New York.

Robertson, T., Wright, F.T., 1981. Likelihood ratio tests for and against a stochastic ordering between multinomial populations. Ann. Statist. 9, 1248-1257.

Shao, J., Tu, D., 1995. The Jackknife and Bootstrap. Springer, New York.

Silvapulle, M.J., Sen, P.K., 2004. Constrained Statistical Inference: Inequality, Order, and Shape Restrictions. John Wiley & Sons, New York.

Xiong, S., Li, G., 2005. Testing for the maximum cell probabilities in multinomial distributions. Sci. Sinica Ser. A 48, 972-985.

Xiong, S., Li, G., 2006a. Inference for ordered parameters of multinomial distributions. Submitted. http://xiongshifeng.googlepages.com/06.pdf

Xiong, S., Li, G., 2006b. Some results on convergency of conditional distributions with applications. Submitted. http://xiongshifeng.googlepages.com/06b.pdf