Comparing dispersions between two probability vectors under multinomial sampling

Shifeng Xiong∗, Guoying Li

Academy of Mathematics and Systems Sciences, Chinese Academy of Sciences, Beijing 100080, China

Abstract

We consider testing hypotheses that compare the dispersions of two parameter vectors of multinomial distributions in both one-sample and two-sample cases. The comparison criterion is the concept of Schur majorization. A new dispersion index is proposed for testing the hypotheses. The corresponding test for the one-sample problem is an exact test. For the two-sample problem, the bootstrap is used to approximate the null distribution of the test statistic and the p-value. We prove that the bootstrap test is asymptotically correct and consistent. Simulation studies for the bootstrap test are reported and a real-life example is presented.

Key words: multinomial distribution, dispersion, diversity, bootstrap, hypothesis testing, majorization, ordering

1 Introduction

Comparing dispersions between two probability vectors is an important statistical issue with applications in many fields such as ecology, economics and sociology (Dykstra et al., 2002), especially in studies on diversity (Patil and Taillie, 1982). There are many dispersion measures for a probability vector $p = (p_1, \ldots, p_k)$. The most popular ones are Shannon's entropy, $-\sum_{i=1}^{k} p_i \log p_i$, and Gini's (or Simpson's) index, $1 - \sum_{i=1}^{k} p_i^2$. Gilula and Haberman (1995) pointed out that a reasonable dispersion measure $D$ should be a concave and symmetric real function on $Q = \{x \in R^k : x_i \ge 0,\ i = 1, \ldots, k,\ \text{and}\ \sum_{i=1}^{k} x_i = 1\}$ such that $D((1, 0, \ldots, 0)) = 0$. Here '$D$ is symmetric' means $D(x_1, \ldots, x_k) = D(x_{i_1}, \ldots, x_{i_k})$ for all $x \in Q$ and all permutations $i_1, \ldots, i_k$ of $1, \ldots, k$. Denote the set of all functions satisfying the conditions above by $V$. Then Gilula and Haberman (1995) stated that any function $D \in V$ can be a dispersion measure. However, using any single dispersion index to compare dispersion may not be suitable, since different indices may give conflicting results (Gove et al., 1994). To avoid such ambiguous comparisons, Dykstra et al. (2002) considered comparing dispersion through the concept of majorization.

∗ Tel: +86-10-62651423; fax: +86-10-82568364. E-mail address: [email protected]

We first present some basic concepts of majorization theory (see Marshall and Olkin (1979) for details). For each $x = (x_1, \ldots, x_k) \in R^k$, write $x_{[1]} \ge \cdots \ge x_{[k]}$ for its nonincreasing permutation and write $x_{[\cdot]} = (x_{[1]}, \ldots, x_{[k]})$. For $x = (x_1, \ldots, x_k)$, $y = (y_1, \ldots, y_k)$ with $\sum_{i=1}^{k} x_i = \sum_{i=1}^{k} y_i$, we

denote $x \prec y$ if

$$\sum_{j=1}^{i} x_{[j]} \le \sum_{j=1}^{i} y_{[j]}, \quad i = 1, \ldots, k-1.$$

When $x \prec y$, $x$ is said to be majorized by $y$ ($y$ majorizes $x$). A symmetric function $\phi$ defined on a set $A \subset R^k$ is said to be Schur-convex (Schur-concave) on $A$ if

$$x \prec y \ \text{for}\ x, y \in A \ \Rightarrow\ \phi(x) \le (\ge)\ \phi(y).$$

Dykstra et al. (2002) gave the following definition: probability vector $p$ is more dispersed than probability vector $q$ if $p \prec q$ and $p_{[\cdot]} \ne q_{[\cdot]}$. Using results on Schur-convex functions, Gilula and Haberman (1995) showed that $p$ is more dispersed than $q$ according to the above definition if and only if $D(p) \ge D(q)$ for all $D \in V$, with strict inequality for some $D \in V$.
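The remark that single indices can disagree is easy to check numerically. The following is a minimal Python sketch, not taken from the paper: the helper names and the two vectors `a` and `b` are hypothetical choices made for illustration, and the majorization check assumes both vectors have the same total.

```python
import numpy as np

def shannon_entropy(p):
    """Shannon's entropy -sum p_i log p_i, with 0 log 0 taken as 0."""
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]
    return float(-(nz * np.log(nz)).sum())

def gini_simpson(p):
    """Gini's (Simpson's) index 1 - sum p_i^2."""
    p = np.asarray(p, dtype=float)
    return float(1.0 - (p ** 2).sum())

def is_majorized_by(x, y, tol=1e-12):
    """True if x is majorized by y: the partial sums of the nonincreasingly
    ordered x never exceed those of y (equal totals are assumed)."""
    xs = np.sort(np.asarray(x, dtype=float))[::-1]
    ys = np.sort(np.asarray(y, dtype=float))[::-1]
    return bool(np.all(np.cumsum(xs)[:-1] <= np.cumsum(ys)[:-1] + tol))

# Hypothetical probability vectors chosen for illustration only.
a = [0.5, 0.5, 0.0, 0.0]
b = [0.7, 0.1, 0.1, 0.1]

print("entropy:", shannon_entropy(a), shannon_entropy(b))
print("Gini   :", gini_simpson(a), gini_simpson(b))
print("a ≺ b:", is_majorized_by(a, b), " b ≺ a:", is_majorized_by(b, a))
```

For this pair, entropy ranks `b` as more dispersed while Gini's index ranks `a` as more dispersed, and neither vector majorizes the other, which is the situation the majorization criterion is meant to flag.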

Note that '$\prec$' is only a partial ordering. There are many pairs of probability vectors which are not comparable. If $p$ and $q$ are not comparable, then there exist different dispersion measures which lead to different comparison results.

The concept of majorization was also proposed to compare the diversity of ecological populations (Patil and Taillie, 1979). For a population $C$ with $k$ species, let $N_i$ be the abundance of species $i$ and $N = (N_1, \ldots, N_k)$. A commonly used diversity measure, called the intrinsic diversity profile, is defined as the plot of the pairs $(i, T_i)$, where

$$T_i = \frac{1}{N_1 + \cdots + N_k} \sum_{j=i+1}^{k} N_{[j]}, \quad i = 1, \ldots, k-1.$$

According to Gove et al. (1994), $C$ is intrinsically more diverse than another population $C'$ with $k$ species if and only if the intrinsic diversity profile of $C$ is above that of $C'$. Let $N_i'$ be the abundance of species $i$ of $C'$. Denote $p_i = N_i / \sum_{i=1}^{k} N_i$, $q_i = N_i' / \sum_{i=1}^{k} N_i'$, $p = (p_1, \ldots, p_k)$ and $q = (q_1, \ldots, q_k)$. It is clear that

$$\text{the intrinsic diversity profile of } C \text{ is above that of } C' \iff p \prec q.$$

Therefore, comparing diversity through the intrinsic diversity profile is equivalent to comparing dispersion through majorization between two probability vectors.

In this paper we study dispersion comparison between two probability vectors using the concept of majorization. Specifically, the following two testing problems are considered:

$$H_0 : p \prec q \longleftrightarrow H_1 : H_0 \text{ does not hold} \tag{1.1}$$

and

$$H_0' : p \succ q \longleftrightarrow H_1' : H_0' \text{ does not hold}. \tag{1.2}$$
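The equivalence between the intrinsic diversity profile and majorization can be illustrated with a short Python sketch. The abundance vectors below are hypothetical and the function names are assumptions introduced here, not part of the paper.

```python
import numpy as np

def diversity_profile(abundances):
    """Intrinsic diversity profile T_1, ..., T_{k-1}:
    T_i = (sum of the k - i smallest abundances) / (total abundance)."""
    n_sorted = np.sort(np.asarray(abundances, dtype=float))[::-1]
    total = n_sorted.sum()
    return (total - np.cumsum(n_sorted)[:-1]) / total

def is_majorized_by(p, q, tol=1e-12):
    """True if p is majorized by q (equal totals are assumed)."""
    ps = np.sort(np.asarray(p, dtype=float))[::-1]
    qs = np.sort(np.asarray(q, dtype=float))[::-1]
    return bool(np.all(np.cumsum(ps)[:-1] <= np.cumsum(qs)[:-1] + tol))

# Hypothetical abundances for two populations with k = 4 species.
abund_c = [30, 30, 20, 20]        # population C
abund_c_prime = [70, 10, 10, 10]  # population C'

profile_c = diversity_profile(abund_c)
profile_c_prime = diversity_profile(abund_c_prime)
p = np.array(abund_c) / sum(abund_c)
q = np.array(abund_c_prime) / sum(abund_c_prime)

# "Profile of C above that of C'" should agree with "p is majorized by q".
profile_above = bool(np.all(profile_c >= profile_c_prime - 1e-12))
print(profile_c, profile_c_prime)
print(profile_above, is_majorized_by(p, q))
```

Since $T_i$ equals one minus the $i$-th partial sum of the ordered proportions, the two checks agree by construction; the sketch only makes that identity visible on numbers.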

We will study the two testing problems in both the one-sample and the two-sample cases. In the one-sample case, the probability vector $q$ is known and the sample $X_n = (X_{n1}, \ldots, X_{nk})$ is taken from a multinomial distribution, $MND_k(n; p)$, with an unknown parameter $p$; in the two-sample case, the samples $X_m = (X_{m1}, \ldots, X_{mk})$ and $Y_n = (Y_{n1}, \ldots, Y_{nk})$ are taken independently from $MND_k(m; p)$ and $MND_k(n; q)$ with unknown $p$ and $q$, respectively. The emphasis of our discussion is the two-sample problem, which is perhaps more useful for real problems.

Gilula and Haberman (1995) pointed out that $p \prec q$ is equivalent to $p_{[\cdot]}$ being stochastically larger than $q_{[\cdot]}$ as probability vectors. Many papers have studied testing hypotheses concerning stochastic orderings of multinomial parameters (e.g. see Robertson and Wright, 1981; Silvapulle and Sen, 2004; Cohen et al., 2006; Feng and Wang, 2006). It is clear that the stochastic ordering of the multinomial parameters $p$ themselves is different from that of the ordered multinomial parameters $p_{[\cdot]}$. The latter is our concern in this paper. Nevertheless, few papers have considered this problem. Xiong and Li (2005) discussed one-sided hypotheses for $p_{[1]}$, which are special cases of (1.1) and (1.2). Dykstra et al. (2002) studied the likelihood ratio test for (1.1) in the one-sample case. Moreover, they pointed out that the likelihood ratio test for the two-sample problem is not a trivial extension of that for the one-sample problem.

In this paper, we do not adopt the likelihood ratio technique to study the testing problems. In Section 2 we discuss the one-sample problem. An ad hoc test is investigated for each of (1.1) and (1.2). By use of the theory of Schur-convex functions, the test can be exact (it does not rely on asymptotics). Its consistency is also proved. The test for (1.1) in the one-sample case is extended to the two-sample case in Section 3. We use the bootstrap to approximate the null distribution of the test statistic and prove that this approximation and the corresponding bootstrap test are consistent. Section 4 presents simulation results to evaluate the performance of the bootstrap test for (1.1) in the two-sample case. A real data set is also analyzed using our method.

2 One-Sample Problem

For all $k$-dimensional probability vectors $p$, $q$, denote

$$\varphi(p, q) = \max\Big\{ p_{[1]} - q_{[1]},\ (p_{[1]} + p_{[2]}) - (q_{[1]} + q_{[2]}),\ \ldots,\ \sum_{i=1}^{k-1} p_{[i]} - \sum_{i=1}^{k-1} q_{[i]} \Big\}.$$

It is clear that $\varphi(p, q) \le 0$ if and only if $p \prec q$. Therefore, it can be regarded as a single index for comparing the dispersion of $p$ and $q$ through majorization. In the one-sample case, the empirical forms of $\varphi(p, q)$ and $\varphi(q, p)$,

$$S_n = \varphi\Big(\frac{X_n}{n}, q\Big) \quad \text{and} \quad S_n' = \varphi\Big(q, \frac{X_n}{n}\Big),$$

can be used as the test statistics for (1.1) and (1.2), respectively, where $q$ is known and $X_n \sim MND_k(n; p)$.
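As an illustration of the index $\varphi$ defined above, here is a minimal Python sketch. The function names `phi` and `is_majorized` and the example vectors are assumptions made for illustration; the numerical tolerance is also a choice of this sketch.

```python
import numpy as np

def phi(p, q):
    """Index phi(p, q): the largest difference between the first i
    ordered entries of p and of q, taken over i = 1, ..., k-1."""
    p_sorted = np.sort(np.asarray(p, dtype=float))[::-1]
    q_sorted = np.sort(np.asarray(q, dtype=float))[::-1]
    diffs = np.cumsum(p_sorted)[:-1] - np.cumsum(q_sorted)[:-1]
    return float(diffs.max())

def is_majorized(p, q, tol=1e-12):
    """p is majorized by q exactly when phi(p, q) <= 0 (up to rounding)."""
    return phi(p, q) <= tol

# Hypothetical vectors: the uniform vector is majorized by any other one.
uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.7, 0.1, 0.1, 0.1]

print(phi(uniform, skewed), is_majorized(uniform, skewed))
print(phi(skewed, uniform), is_majorized(skewed, uniform))
```

The statistics $S_n$ and $S_n'$ are obtained by passing the observed relative frequencies as one of the two arguments.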

Theorem 2.1. For all $c \in R$,

$$\sup_{p_{[\cdot]} = q_{[\cdot]}} P_p(S_n > c) = \sup_{p \prec q} P_p(S_n > c), \tag{2.1}$$

$$\sup_{p_{[\cdot]} = q_{[\cdot]}} P_p(S_n' > c) = \sup_{p \succ q} P_p(S_n' > c). \tag{2.2}$$

Proof. By definition, $\varphi(\cdot, q)$ is a Schur-convex function. Then for any $c$, $f(x) = I(\varphi(x, q) > c)$ is also a Schur-convex function. By C.4.a of Chapter 11, Marshall and Olkin (1979), $g(p) = E_p f(X_n / n) = P_p(S_n > c)$ is a Schur-convex function. Then (2.1) follows. The equality (2.2) can be proved similarly.



Denote the tests constructed from $S_n$ and $S_n'$ for (1.1) and (1.2) by $\phi_S$ and $\phi_S'$, respectively. Let $x_n$ be the observation of $X_n$ and $s_n = S_n(x_n)$, $s_n' = S_n'(x_n)$. Note that the distribution of $X_{n[\cdot]}$, and hence that of $S_n$, depends only on $p_{[\cdot]}$. Then the p-values of $\phi_S$ and $\phi_S'$ can be calculated as follows:

$$p_s = P_q(S_n > s_n), \quad p_{s'} = P_q(S_n' > s_n').$$

When $n$ and $k$ are both small, $p_s$ and $p_{s'}$ can be calculated exactly. In other cases, they can be approximated by the Monte Carlo method.

Theorem 2.2. $\phi_S$ and $\phi_S'$ are consistent tests.

Proof. Note that $\sqrt{n}(S_n - \varphi(p, q))$ has an asymptotic distribution, which is denoted by $F_p$, as $n \to \infty$ (Marcheselli, 2000; see also Xiong and Li, 2006a). Therefore, given a significance level $\alpha$, the asymptotic rejection region of $\phi_S$ is $\{\sqrt{n} S_n > F_q^{-1}(1 - \alpha)\}$. For all $p \in H_1$, $\varphi(p, q) > 0 = \varphi(q, q)$. Thus,

$$P_p\big(\sqrt{n} S_n > F_q^{-1}(1 - \alpha)\big) = P_p\big(\sqrt{n}(S_n - \varphi(p, q)) > F_q^{-1}(1 - \alpha) - \sqrt{n}\,\varphi(p, q)\big) \to 1.$$
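The Monte Carlo approximation of the one-sample p-value $p_s$ mentioned above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the function name, the number of replications `n_sim`, the seed, and the example counts are hypothetical. The extracted text does not distinguish a strict from a non-strict inequality in the p-value, so the sketch uses the non-strict form $S_n \ge s_n$, which is the conservative convention for a discrete statistic.

```python
import numpy as np

def one_sample_pvalue(x, q, n_sim=20000, seed=0):
    """Monte Carlo estimate of P_q(S_n >= s_n) for S_n = phi(X_n / n, q).

    x: observed counts; q: known probability vector (assumed to sum to 1).
    Returns the observed statistic s_n and the estimated p-value.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x)
    q = np.asarray(q, dtype=float)
    n = int(x.sum())
    q_cum = np.cumsum(np.sort(q)[::-1])[:-1]

    # Observed statistic s_n.
    x_cum = np.cumsum(np.sort(x)[::-1])[:-1] / n
    s_obs = float((x_cum - q_cum).max())

    # Simulate X_n ~ MND_k(n; q) and recompute the statistic for each draw.
    sims = rng.multinomial(n, q, size=n_sim)
    sims_sorted = np.sort(sims, axis=1)[:, ::-1]
    sims_cum = np.cumsum(sims_sorted, axis=1)[:, :-1] / n
    s_sim = (sims_cum - q_cum).max(axis=1)

    # Small tolerance so that floating-point ties count as ">=".
    return s_obs, float(np.mean(s_sim >= s_obs - 1e-12))

# Hypothetical data: counts concentrated in one cell, q uniform.
s_obs, p_val = one_sample_pvalue([90, 4, 3, 3], [0.25, 0.25, 0.25, 0.25], n_sim=2000)
print(s_obs, p_val)
```

Simulating under $q$ itself is justified by Theorem 2.1, which places the least favourable null configuration at $p_{[\cdot]} = q_{[\cdot]}$.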



Remark 2.1. We have pointed out that $\varphi$ may be used as an index for comparing the dispersion of two probability vectors. In fact, $\varphi$ can also be used to quantify the dispersion of a single probability vector. Let $\pi_0 = (1/k, \ldots, 1/k)$. For a probability vector $p$, we may consider $\varphi(p, \pi_0)$ as a dispersion index. In addition, note that if $q = \pi_0$, then (1.1) becomes the following goodness-of-fit problem:

$$H_{E0} : p = \pi_0 \longleftrightarrow H_{E1} : p \ne \pi_0.$$

For this special case, it is perhaps worth comparing $\phi_S$ with the usual goodness-of-fit tests, such as the power-divergence test (Read and Cressie, 1988).

3 Two-Sample Problem

This section discusses testing (1.1) in the two-sample case. Let $X_m \sim MND_k(m; p)$ and $Y_n \sim MND_k(n; q)$ be independent. As in the one-sample case, the test statistic can be

$$T_{mn} = \varphi\Big(\frac{X_m}{m}, \frac{Y_n}{n}\Big).$$

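Theorem 3.1. For all $c \in R$,

$$\sup_{p_{[\cdot]} = q_{[\cdot]}} P_{(p,q)}(T_{mn} > c) = \sup_{p \prec q} P_{(p,q)}(T_{mn} > c). \tag{3.1}$$

Proof. We prove a more general result. It is obvious that $f(x) = \varphi(x, \theta)$ and $f_1(x) = \varphi(\theta, x)$ are respectively Schur-convex and Schur-concave functions. For every probability vector $p_0$ with $p \prec p_0 \prec q$,

$$\begin{aligned}
P_{(p,q)}(T_{mn} > c) &= E_q P_{(p,q)}(T_{mn} > c \mid Y_n) \\
&= \sum_{y} P_p\Big(\varphi\Big(\frac{X_m}{m}, \frac{y}{n}\Big) > c\Big)\delta(y) \le \sum_{y} P_{p_0}\Big(\varphi\Big(\frac{X_m}{m}, \frac{y}{n}\Big) > c\Big)\delta(y) \\
&= P_{(p_0,q)}(T_{mn} > c) = E_{p_0} P_{(p_0,q)}(T_{mn} > c \mid X_m) \le P_{(p_0,p_0)}(T_{mn} > c),
\end{aligned} \tag{3.2}$$

where $\delta(\cdot)$ is the probability function of $Y_n$. Then (3.1) follows.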



From this theorem, given the observations $x_m$ and $y_n$, we can calculate the p-value for testing (1.1) as

$$p = \sup_{p \in Q} P_{(p,p)}(T_{mn} > t_{mn}), \tag{3.3}$$

where $Q = \{x \in R^k : x_i \ge 0,\ i = 1, \ldots, k,\ \text{and}\ \sum_{i=1}^{k} x_i = 1\}$ and $t_{mn} = T_{mn}(x_m, y_n)$.

Since the p-value is very difficult to compute exactly using (3.3), we use the bootstrap method to approximate it. The main idea is to generate bootstrap samples from the distributions of $X_m$ and $Y_n$, with $p$ and $q$ replaced by their MLEs under the boundary of the null hypothesis. Let $\hat{p}$ and $\hat{q}$ be the MLEs of $p$ and $q$ under $p_{[\cdot]} = q_{[\cdot]}$, respectively. Denote

$$\hat{\pi} = (\hat{\pi}_1, \ldots, \hat{\pi}_k) = \frac{X_{m[\cdot]} + Y_{n[\cdot]}}{m + n}.$$

We will prove that $\hat{p}_{[\cdot]} = \hat{q}_{[\cdot]} = \hat{\pi}$ in Lemma 3.1 below. Our bootstrap test for (1.1) can be constructed as follows.

Given $X_m$ and $Y_n$, generate the bootstrap samples $X_s^* \sim MND_k(s; \hat{\pi})$ and $Y_t^* \sim MND_k(t; \hat{\pi})$, where $s$ is a number smaller than $m$ and $s/t = m/n$. Let $T_{st}^*$ be the bootstrap version of $T_{mn}$, i.e. $T_{st}^* = \varphi(X_s^*/s, Y_t^*/t)$, and denote

$$F_{mn}^*(x) = P\big(\sqrt{s}\,T_{st}^* \le x \mid X_m, Y_n\big), \tag{3.4}$$

which is an approximate distribution function of $\sqrt{m}\,T_{mn}$ under $p_{[\cdot]} = q_{[\cdot]}$. Then, given a significance level $\alpha$, the rejection region is $\{\sqrt{m}\,T_{mn} > F_{mn}^{*-1}(1 - \alpha)\}$ and the approximate p-value is

$$p^* = P\big(\sqrt{s}\,T_{st}^* > \sqrt{m}\,t_{mn} \mid x_m, y_n\big) = 1 - F_{mn}^*(\sqrt{m}\,t_{mn}). \tag{3.5}$$

Denote this bootstrap test for (1.1) by $\phi$.

Remark 3.1. Note that the bootstrap distribution estimators of $X_{m[\cdot]}$ and $Y_{n[\cdot]}$, and hence that of $T_{mn}$, may be inconsistent if the bootstrap sample sizes are taken as $m$ and $n$ (Xiong and Li, 2006a).

This is the reason that we take the bootstrap sample size $s$ ($t$) to be smaller than $m$ ($n$). Quite a few papers have studied this phenomenon, for example, Shao and Tu (1995) and Bickel et al. (1997). From our simulation in Section 4, taking $s = 4\sqrt{m}$ may be a good choice.

In order to show that $\phi$ is a reasonable test, we will discuss whether it is asymptotically correct (Shao and Tu, 1995, p.178) and consistent. First we discuss the properties of $\hat{p}$ and $\hat{q}$.

Lemma 3.1. The MLEs $\hat{p}$ and $\hat{q}$ have the following properties:

(1). $\hat{p}_{[\cdot]} = \hat{q}_{[\cdot]} = \hat{\pi}$.

(2). As $m, n \to \infty$, $m/n \to \lambda \in (0, +\infty)$,

$$\hat{\pi} \to \pi = (\pi_1, \ldots, \pi_k) = \frac{\lambda}{1 + \lambda}\,p_{[\cdot]} + \frac{1}{1 + \lambda}\,q_{[\cdot]} \quad \text{almost surely}$$

and

$$\sqrt{m}(\hat{\pi}_i - \pi_i) = O_p(1), \quad i = 1, \ldots, k.$$

(3). If $p \prec q$, then $p \prec \pi \prec q$.

Proof. See the Appendix.

Now we show that $F_{mn}^*$ defined in (3.4) converges to the asymptotic distribution of $\sqrt{m}\,T_{mn}$

under $p_{[\cdot]} = q_{[\cdot]}$. Some lemmas are needed.

Lemma 3.2. Let $p_m$ be a sequence of random probability vectors with $p_m \to p$ almost surely, where $p$ is a probability vector. Suppose that $X_m \sim MND_k(m; p)$, and given $p_m$, the conditional distribution of $X_s^*$ is $MND_k(s; p_m)$. Then as $m \to \infty$ and $s \to \infty$,

$$\sqrt{s}\Big(\frac{X_s^*}{s} - p_m\Big) \Big|\, p_m \ \xrightarrow[\text{a.s.}]{d}\ \sqrt{m}\Big(\frac{X_m}{m} - p\Big),$$

where '$\mid G_m \xrightarrow[\text{a.s.}]{d}$' means that given the sequence of sub-$\sigma$ fields (or random variables) $G_m$, the left side converges in distribution to the asymptotic distribution of the right side almost surely.

Proof. Using characteristic functions (see Shao and Tu, 1995, p.76).

Lemma 3.3. Assume that all the conditions in Lemma 3.2 hold. Assume further that $\sqrt{m}(p_{mi} - p_i) = O_p(1)$, $i = 1, \ldots, k$. Then as $m \to \infty$, $s \to \infty$, $s/m \to 0$,

$$\sqrt{s}\Big(\frac{X_{s[\cdot]}^*}{s} - p_{m[\cdot]}\Big) \Big|\, p_m \ \xrightarrow[P]{d}\ \sqrt{m}\Big(\frac{X_{m[\cdot]}}{m} - p_{[\cdot]}\Big),$$

where '$\mid G_m \xrightarrow[P]{d}$' means that given the sequence of sub-$\sigma$ fields (or random variables) $G_m$, the left side converges in distribution to the asymptotic distribution of the right side in probability.

Proof. See the Appendix.
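The bootstrap test $\phi$ constructed before Remark 3.1 can be sketched in Python as follows. This is a hedged illustration rather than the authors' code: the function names, the default number of bootstrap replications, and the seed are assumptions. The resample sizes follow the choices $s = [4\sqrt{m}]$ and $t = [4\sqrt{m}\,n/m]$ used in the paper's real-data example, and the p-value compares $\sqrt{s}\,T_{st}^*$ with $\sqrt{m}\,t_{mn}$ so as to agree with $1 - F_{mn}^*(\sqrt{m}\,t_{mn})$.

```python
import math
import numpy as np

def phi_rows(a, b):
    """Row-wise phi(a, b) for arrays of shape (B, k) of probability vectors."""
    a_sorted = np.sort(a, axis=1)[:, ::-1]
    b_sorted = np.sort(b, axis=1)[:, ::-1]
    diffs = np.cumsum(a_sorted, axis=1)[:, :-1] - np.cumsum(b_sorted, axis=1)[:, :-1]
    return diffs.max(axis=1)

def bootstrap_test(x, y, n_boot=10000, seed=0):
    """Bootstrap p-value for H0: p is majorized by q, from counts x and y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    m, n = x.sum(), y.sum()

    # Ordered MLE under the boundary p_[.] = q_[.] (Lemma 3.1 (1)).
    pi_hat = (np.sort(x)[::-1] + np.sort(y)[::-1]) / (m + n)

    # Reduced resample sizes (an "m out of n" type bootstrap).
    s = math.floor(4 * math.sqrt(m))
    t = math.floor(4 * math.sqrt(m) * n / m)

    # Observed statistic t_mn = phi(x / m, y / n).
    t_obs = float(phi_rows(x[None, :] / m, y[None, :] / n)[0])

    rng = np.random.default_rng(seed)
    x_star = rng.multinomial(s, pi_hat, size=n_boot) / s
    y_star = rng.multinomial(t, pi_hat, size=n_boot) / t
    t_star = phi_rows(x_star, y_star)

    # p* = P(sqrt(s) T*_st > sqrt(m) t_mn | data), estimated by resampling.
    p_value = float(np.mean(math.sqrt(s) * t_star > math.sqrt(m) * t_obs))
    return pi_hat, t_obs, p_value

# Hypothetical counts: x is far more concentrated than y, so H0 is violated.
pi_hat, t_obs, p_value = bootstrap_test([95, 2, 2, 1], [25, 25, 25, 25], n_boot=2000)
print(pi_hat, t_obs, p_value)
```

Drawing both bootstrap samples from the same $\hat{\pi}$ mirrors the least favourable null configuration $p_{[\cdot]} = q_{[\cdot]}$ identified in Theorem 3.1.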

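Let $\tilde{X}_m \sim MND_k(m; \pi)$ and $\tilde{Y}_n \sim MND_k(n; \pi)$ be independent. Given $X_m$, $Y_n$, let $X_s^* \sim MND_k(s; \hat{\pi})$ and $Y_t^* \sim MND_k(t; \hat{\pi})$ be independent. For $i = 1, \ldots, k-1$, denote

$$U_{mi} = \sqrt{m}\Big(\frac{\sum_{j=1}^{i} X_{m[j]}}{m} - \sum_{j=1}^{i} p_{[j]}\Big), \quad V_{ni} = \sqrt{n}\Big(\frac{\sum_{j=1}^{i} Y_{n[j]}}{n} - \sum_{j=1}^{i} q_{[j]}\Big),$$

$$\tilde{U}_{mi} = \sqrt{m}\Big(\frac{\sum_{j=1}^{i} \tilde{X}_{m[j]}}{m} - \sum_{j=1}^{i} \pi_j\Big), \quad \tilde{V}_{ni} = \sqrt{n}\Big(\frac{\sum_{j=1}^{i} \tilde{Y}_{n[j]}}{n} - \sum_{j=1}^{i} \pi_j\Big),$$

$$U_{si}^* = \sqrt{s}\Big(\frac{\sum_{j=1}^{i} X_{s[j]}^*}{s} - \sum_{j=1}^{i} \hat{\pi}_j\Big), \quad V_{ti}^* = \sqrt{t}\Big(\frac{\sum_{j=1}^{i} Y_{t[j]}^*}{t} - \sum_{j=1}^{i} \hat{\pi}_j\Big).$$

Lemma 3.4. As $m, n, s, t \to \infty$ with $s/m \to 0$, $\lim s/t = \lim m/n = \lambda \in (0, +\infty)$,

$$\sqrt{s}\,T_{st}^* \,\Big|\, \mathcal{F}_{mn} \ \xrightarrow[P]{d}\ \max_{1 \le i \le k-1}\Big\{\tilde{U}_{mi} - \frac{\sqrt{m}}{\sqrt{n}}\,\tilde{V}_{ni}\Big\}, \tag{3.6}$$

where $\mathcal{F}_{mn}$ is the sub-$\sigma$ field generated by $(X_m, Y_n)$.

Proof. See the Appendix.

Denote the asymptotic distribution of the right side of (3.6) by $F_\pi$. By the Pólya theorem (see Theorem 3.7 of Xiong and Li (2006b)), we have

Lemma 3.5.

$$\sup_{x} |F_{mn}^*(x) - F_\pi(x)| \to 0 \quad \text{in probability},$$

where $F_{mn}^*$ is given in (3.4).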

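Note that if $p_{[\cdot]} = q_{[\cdot]}$, then

$$\sqrt{m}\,T_{mn} = \max_{1 \le i \le k-1}\Big\{U_{mi} - \frac{\sqrt{m}}{\sqrt{n}}\,V_{ni}\Big\} \overset{d}{=} \max_{1 \le i \le k-1}\Big\{\tilde{U}_{mi} - \frac{\sqrt{m}}{\sqrt{n}}\,\tilde{V}_{ni}\Big\}, \tag{3.7}$$

where '$\overset{d}{=}$' means that both sides have the same distribution. Therefore the following theorem follows from Lemma 3.4.

Theorem 3.2. If $p_{[\cdot]} = q_{[\cdot]}$, then as $m, n, s, t \to \infty$ with $s/m \to 0$, $\lim s/t = \lim m/n = \lambda \in (0, +\infty)$, we have

$$\sqrt{s}\,T_{st}^* \,\Big|\, \mathcal{F}_{mn} \ \xrightarrow[P]{d}\ \sqrt{m}\,T_{mn}.$$

The following theorem indicates that $\phi$ is asymptotically correct and consistent.

Theorem 3.3. Given a nominal level $\alpha \in (0, 1)$,

(1). as $m, n, s, t \to \infty$ with $s/m \to 0$, $\lim s/t = \lim m/n = \lambda \in (0, +\infty)$,

$$\sup_{(p,q) \in H_0} \limsup P\big(\sqrt{m}\,T_{mn} > F_{mn}^{*-1}(1 - \alpha)\big) = \alpha. \tag{3.8}$$

(2). If $H_0$ does not hold, then as $m, n, s, t \to \infty$ with $s/m \to 0$, $\lim s/t = \lim m/n = \lambda \in (0, +\infty)$,

$$\lim P\big(\sqrt{m}\,T_{mn} > F_{mn}^{*-1}(1 - \alpha)\big) = 1. \tag{3.9}$$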

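Proof. (1). If $p_{[\cdot]} = q_{[\cdot]}$, then it follows from Lemma 3.5 and (3.7) that

$$\lim P\big(\sqrt{m}\,T_{mn} > F_{mn}^{*-1}(1 - \alpha)\big) = \alpha.$$

If $p \prec q$, then from (3.2), (3.7) and Lemma 3.1 (3), we have

$$P_{(p,q)}\big(\sqrt{m}\,T_{mn} > F_\pi^{-1}(1 - \alpha)\big) \le P_{(\pi,\pi)}\big(\sqrt{m}\,T_{mn} > F_\pi^{-1}(1 - \alpha)\big) \to \alpha.$$

From this result and Lemma 3.5, it follows that

$$\limsup P\big(\sqrt{m}\,T_{mn} > F_{mn}^{*-1}(1 - \alpha)\big) \le \alpha.$$

Then (3.8) is proved.

(2). If $H_0$ does not hold, note that

$$\sqrt{m}\,T_{mn} = \max_{1 \le i \le k-1}\Big\{U_{mi} - \frac{\sqrt{m}}{\sqrt{n}}\,V_{ni} + \sqrt{m}\Big(\sum_{j=1}^{i} p_{[j]} - \sum_{j=1}^{i} q_{[j]}\Big)\Big\},$$

and at least one of the differences $\sum_{j=1}^{i} p_{[j]} - \sum_{j=1}^{i} q_{[j]}$ is positive, so $\sqrt{m}\,T_{mn} \to \infty$. Thus (3.9) follows.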



It is worth noting that besides multinomial sampling, there are other sampling plans to estimate a probability vector (Barabesi and Fattorini, 1998; see also Fattorini and Marcheselli, 1999, Marcheselli 2000, and Marcheselli 2003). How to construct tests for (1.1) and (1.2) in such cases is an interesting problem and will be addressed in future research.

4 Simulation results and a real example

In the preceding section we derived theoretical properties of the bootstrap test for (1.1). To investigate the finite-sample behavior of the proposed procedure, we conducted simulation studies. From Theorem 3.3, we need only consider the case $p = q$. The following values of $p$ were used:

p1 = (1/6, 1/6, 1/6, 1/6, 1/6, 1/6),
p2 = (0.2, 0.2, 0.15, 0.15, 0.15, 0.15),
p3 = (0.3, 0.2, 0.15, 0.15, 0.1, 0.1),
p4 = (0.4, 0.2, 0.1, 0.1, 0.1, 0.1),
p5 = (0.5, 0.2, 0.1, 0.1, 0.05, 0.05),
p6 = (0.75, 0.05, 0.05, 0.05, 0.05, 0.05).

Given the nominal level $\alpha = 0.05$, $M = 1000$ simulations were performed for each of four sample-size pairs $(m, n)$ (see Table 1). In each simulation, we set $s = 4\sqrt{m}$ and $t = sn/m$, then generated 1000 bootstrap observations to construct the rejection region. The frequencies of rejecting $H_0$ are presented in Table 1. From Table 1 we observe that all frequencies are quite close to the nominal level. We conclude that our bootstrap test $\phi$ performs well even in finite-sample cases.

Table 1 Frequencies of rejecting H0 with α = 0.05

                    p1     p2     p3     p4     p5     p6
m = 30, n = 40      0.03   0.03   0.05   0.08   0.06   0.06
m = n = 50          0.03   0.04   0.08   0.05   0.06   0.06
m = 40, n = 100     0.05   0.04   0.07   0.05   0.05   0.05
m = 150, n = 200    0.04   0.04   0.06   0.06   0.05   0.05

We next illustrate the bootstrap test on the occupational status data previously examined in Dykstra et al. (2002). There are 151 Caucasian workers in Walton County, Florida, taken from the 1885 Florida census manuscripts. The occupational status (1=professional; 2=manager, clerical, proprietor; 3=skilled; 4=unskilled; 5=laborer; 6=farmer) of each worker was recorded, and the observation was $X = (15, 14, 6, 11, 78, 27)$ with sample size $m = 151$. Comparing with the data of 1870, $Y = (4, 6, 18, 11, 95, 75)$ with sample size $n = 209$, a social scientist or demographer may be interested in investigating the change in the level of occupational diversity (or dispersion). Thus, we consider testing (1.1) with this data set. First we compute the ordered form $\hat{\pi}$ of the MLEs $\hat{p}$ and $\hat{q}$ of $p$ and $q$ under $p_{[\cdot]} = q_{[\cdot]}$ using Lemma 3.1 as follows:

$$\hat{\pi} = (0.481, 0.283, 0.092, 0.069, 0.047, 0.028).$$

Let $s = [4\sqrt{m}] = 49$, $t = [4\sqrt{m}\,n/m] = 68$, and $B = 100000$. We generate $B$ bootstrap observations from $MND_6(s, \hat{\pi})$ and $MND_6(t, \hat{\pi})$, respectively. From (3.5), we have the p-value $= 0.175$. Therefore there is insufficient evidence to reject $p \prec q$. Dykstra et al. (2002) assumed $q = Y/n$ and used the one-sample likelihood ratio test for (1.1) with the data. Their p-value is 0.242, which is larger than ours.
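The deterministic quantities in this example can be reproduced with a few lines of Python. This is a small check written for illustration, with variable names chosen here; it recomputes $\hat{\pi}$, $s$, $t$ and the observed statistic $t_{mn}$ from the reported counts, and does not attempt to reproduce the bootstrap p-value, which depends on the random resamples.

```python
import math
import numpy as np

x = np.array([15, 14, 6, 11, 78, 27])  # 1885 counts
y = np.array([4, 6, 18, 11, 95, 75])   # 1870 counts
m, n = int(x.sum()), int(y.sum())

# Ordered MLE under p_[.] = q_[.]: pooled ordered counts over m + n.
x_sorted = np.sort(x)[::-1]
y_sorted = np.sort(y)[::-1]
pi_hat = (x_sorted + y_sorted) / (m + n)

# Bootstrap sample sizes; [.] is taken to be the integer part.
s = math.floor(4 * math.sqrt(m))
t = math.floor(4 * math.sqrt(m) * n / m)

# Observed statistic t_mn = phi(x / m, y / n).
t_mn = float((np.cumsum(x_sorted)[:-1] / m - np.cumsum(y_sorted)[:-1] / n).max())

print(m, n, s, t)
print(np.round(pi_hat, 3))
print(round(t_mn, 4))
```

The observed statistic is positive because the largest relative frequency in the 1885 sample (78/151) exceeds that in the 1870 sample (95/209); the bootstrap test assesses whether an excess of this size is compatible with $p \prec q$.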

Appendix

Proof of Lemma 3.1. First we need to solve the optimization problem:

$$\max\ \sum_{i=1}^{k} X_{mi} \log p_i + \sum_{i=1}^{k} Y_{ni} \log q_i$$

subject to $\{p_1, \ldots, p_k\} = \{q_1, \ldots, q_k\}$. Let $(u_1, \ldots, u_k)$ be a permutation of $(1, \ldots, k)$ with $p_i = q_{u_i}$, $i = 1, \ldots, k$. Then

$$\sum_{i=1}^{k} X_{mi} \log p_i + \sum_{i=1}^{k} Y_{ni} \log q_i = \sum_{i=1}^{k} (X_{mi} + Y_{nu_i}) \log p_i \le \sum_{i=1}^{k} (X_{mi} + Y_{nu_i}) \log\Big(\frac{X_{mi} + Y_{nu_i}}{m + n}\Big).$$

Since $f(t) = t \log\big(\frac{t}{m+n}\big)$ is a convex function, $g(t_1, \ldots, t_k) = \sum_{i=1}^{k} f(t_i)$ is a Schur-convex function (Marshall and Olkin, 1979). Note that

$$(X_{m1} + Y_{nu_1}, \ldots, X_{mk} + Y_{nu_k}) \prec (X_{m[1]} + Y_{n[1]}, \ldots, X_{m[k]} + Y_{n[k]}).$$

Therefore,

$$\sum_{i=1}^{k} X_{mi} \log p_i + \sum_{i=1}^{k} Y_{ni} \log q_i \le \sum_{i=1}^{k} (X_{m[i]} + Y_{n[i]}) \log(\hat{\pi}_i).$$

Then (1) is proved, and (2), (3) are obvious.



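Proof of Lemma 3.3. Denote $\{\bar{p}_1, \ldots, \bar{p}_l\} = \{p_1, \ldots, p_k\}$ with $\bar{p}_1 > \cdots > \bar{p}_l$ and $\{r_{i1}, \ldots, r_{ik_i}\} = \{j : p_j = \bar{p}_i\}$ with $r_{i1} < \cdots < r_{ik_i}$ for $i = 1, \ldots, l$. Write

$$A_{si} = \big\{\{X^*_{s[k_1 + \cdots + k_{i-1} + 1]}, \ldots, X^*_{s[k_1 + \cdots + k_i]}\} = \{X^*_{sr_{i1}}, \ldots, X^*_{sr_{ik_i}}\}\big\},$$

$$B_{mi} = \big\{\{X_{m[k_1 + \cdots + k_{i-1} + 1]}, \ldots, X_{m[k_1 + \cdots + k_i]}\} = \{X_{mr_{i1}}, \ldots, X_{mr_{ik_i}}\}\big\},$$

$$A_s = \bigcap_{i=1}^{l} A_{si}, \quad B_m = \bigcap_{i=1}^{l} B_{mi}.$$

From Xiong and Li (2006b), $P(A_s \mid p_m) \to 1$ in probability, and $P(B_m) \to 1$. Therefore for $x \in R^k$,

$$\begin{aligned}
& P\Big(\bigcap_{i=1}^{k} \Big\{\sqrt{s}\Big(\frac{X^*_{s[i]}}{s} - p_{m[i]}\Big) \le x_i\Big\} \,\Big|\, p_m\Big) \\
&= P\Big(\bigcap_{i=1}^{k} \Big\{\sqrt{s}\Big(\frac{X^*_{s[i]}}{s} - p_{m[i]}\Big) \le x_i\Big\},\ A_s,\ B_m \,\Big|\, p_m\Big) + o_p(1) \\
&= P\Big(\bigcap_{i=1}^{l} \bigcup_{a_1, \ldots, a_{k_i}} \bigcup_{b_1, \ldots, b_{k_i}} \Big\{\sqrt{s}\Big(\frac{X^*_{sa_1}}{s} - p_{mb_1}\Big) \le x_1, \ldots, \sqrt{s}\Big(\frac{X^*_{sa_{k_i}}}{s} - p_{mb_{k_i}}\Big) \le x_{k_i}, \\
&\qquad\qquad X^*_{sa_1} \ge \cdots \ge X^*_{sa_{k_i}},\ p_{mb_1} \ge \cdots \ge p_{mb_{k_i}}\Big\} \,\Big|\, p_m\Big) + o_p(1)
\end{aligned}$$

(here $\bigcup_{a_1, \ldots, a_{k_i}}$ and $\bigcup_{b_1, \ldots, b_{k_i}}$ denote unions over all permutations $a_1, \ldots, a_{k_i}$ and $b_1, \ldots, b_{k_i}$ of $r_{i1}, \ldots, r_{ik_i}$, respectively)

$$\begin{aligned}
&= P\Big(\bigcap_{i=1}^{l} \bigcup_{a_1, \ldots, a_{k_i}} \bigcup_{b_1, \ldots, b_{k_i}} \Big\{\sqrt{s}\Big(\frac{X^*_{sa_1}}{s} - p_{ma_1}\Big) \le x_1 + \frac{\sqrt{s}}{\sqrt{m}}\sqrt{m}(p_{mb_1} - p_{ma_1}), \ldots, \\
&\qquad\qquad \sqrt{s}\Big(\frac{X^*_{sa_{k_i}}}{s} - p_{ma_{k_i}}\Big) \le x_{k_i} + \frac{\sqrt{s}}{\sqrt{m}}\sqrt{m}(p_{mb_{k_i}} - p_{ma_{k_i}}), \\
&\qquad\qquad X^*_{sa_1} \ge \cdots \ge X^*_{sa_{k_i}},\ p_{mb_1} \ge \cdots \ge p_{mb_{k_i}}\Big\} \,\Big|\, p_m\Big) + o_p(1) \\
&= P\Big(\bigcap_{i=1}^{l} \bigcup_{a_1, \ldots, a_{k_i}} \Big\{\sqrt{s}\Big(\frac{X^*_{sa_1}}{s} - p_{ma_1}\Big) \le x_1 + o_p(1), \ldots, \sqrt{s}\Big(\frac{X^*_{sa_{k_i}}}{s} - p_{ma_{k_i}}\Big) \le x_{k_i} + o_p(1), \\
&\qquad\qquad X^*_{sa_1} \ge \cdots \ge X^*_{sa_{k_i}}\Big\} \,\Big|\, p_m\Big) + o_p(1) \\
&= P\Big(\bigcap_{i=1}^{l} \bigcup_{a_1, \ldots, a_{k_i}} \Big\{\sqrt{m}\Big(\frac{X_{ma_1}}{m} - p_{a_1}\Big) \le x_1, \ldots, \sqrt{m}\Big(\frac{X_{ma_{k_i}}}{m} - p_{a_{k_i}}\Big) \le x_{k_i},\ X_{ma_1} \ge \cdots \ge X_{ma_{k_i}}\Big\}\Big) + o_p(1) \\
&= P\Big(\bigcap_{i=1}^{k} \Big\{\sqrt{m}\Big(\frac{X_{m[i]}}{m} - p_{[i]}\Big) \le x_i\Big\}\Big) + o_p(1).
\end{aligned}$$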



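Proof of Lemma 3.4. By Lemma 3.1, Lemma 3.3 and the continuous mapping theorem, for each $i = 1, \ldots, k-1$,

$$U_{si}^* \,\Big|\, \mathcal{F}_{mn} \ \xrightarrow[P]{d}\ \tilde{U}_{mi}, \quad V_{ti}^* \,\Big|\, \mathcal{F}_{mn} \ \xrightarrow[P]{d}\ \tilde{V}_{ni}.$$

Therefore

$$\begin{aligned}
\sqrt{s}\,T_{st}^* &= \max_{1 \le i \le k-1} \sqrt{s}\Big(\frac{X^*_{s[1]} + \cdots + X^*_{s[i]}}{s} - \frac{Y^*_{t[1]} + \cdots + Y^*_{t[i]}}{t}\Big) \\
&= \max_{1 \le i \le k-1}\Big\{U_{si}^* - \frac{\sqrt{s}}{\sqrt{t}}\,V_{ti}^*\Big\} \,\Big|\, \mathcal{F}_{mn} \ \xrightarrow[P]{d}\ \max_{1 \le i \le k-1}\Big\{\tilde{U}_{mi} - \frac{\sqrt{m}}{\sqrt{n}}\,\tilde{V}_{ni}\Big\}.
\end{aligned}$$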

References

Barabesi, L., Fattorini, L., 1998. The use of replicated plot, line and point sampling for estimating species abundance and ecological diversity. Environ. Ecol. Statist. 5, 353-370.

Bickel, P.J., Götze, F., van Zwet, W.R., 1997. Resampling fewer than n observations: gains, losses, and remedies for losses. Statist. Sinica 7, 1-37.

Cohen, A., Kolassa, J., Sackrowitz, H.B., 2006. A new test for stochastic order of k > 3 ordered multinomial populations. Statist. Probab. Lett. 76, 1017-1024.

Dykstra, R.L., Lang, J.B., Oh, M., Robertson, T., 2002. Order restricted inference for hypotheses concerning qualitative dispersion. J. Statist. Plann. Inference 107, 249-265.

Fattorini, L., Marcheselli, M., 1999. Inference on intrinsic diversity profiles of biological populations. Environmetrics 10, 589-599.

Feng, Y., Wang, J., 2006. Likelihood ratio test against simple stochastic ordering among several multinomial populations. J. Statist. Plann. Inference. To appear.

Gilula, Z., Haberman, S.J., 1995. Dispersion of categorical variables and penalty functions: derivation, estimation and comparability. J. Amer. Statist. Assoc. 90, 1447-1452.

Gove, J.H., Patil, G.P., Swindel, B.F., Taillie, C., 1994. Ecological diversity and forest management. In: Patil, G.P., Rao, C.R. (Eds.), Handbook of Statistics 12. Environmental Statistics. North-Holland, Amsterdam, pp. 409-462.

Marcheselli, M., 2000. A generalized delta method with applications to intrinsic diversity profiles. J. Appl. Prob. 37, 504-510.

Marcheselli, M., 2003. Asymptotic results in jackknifing nonsmooth functions of the sample mean vector. Ann. Statist. 31, 1885-1904.

Marshall, A.W., Olkin, I., 1979. Inequalities: Theory of Majorization and Its Applications. Academic Press, New York.

Patil, G.P., Taillie, C., 1979. An overview of diversity. In: Grassle, J.F., Patil, G.P., Smith, W., Taillie, C. (Eds.), Ecological Diversity in Theory and Practice. International Co-operative Publishing House, Fairland, MD.

Patil, G.P., Taillie, C., 1982. Diversity as a concept and its measurement. J. Amer. Statist. Assoc. 77, 548-561.

Read, T.R.C., Cressie, N., 1988. Goodness-of-Fit Statistics for Discrete Multivariate Data. Springer, New York.

Robertson, T., Wright, F.T., 1981. Likelihood ratio tests for and against a stochastic ordering between multinomial populations. Ann. Statist. 9, 1248-1257.

Shao, J., Tu, D., 1995. The Jackknife and Bootstrap. Springer, New York.

Silvapulle, M.J., Sen, P.K., 2004. Constrained Statistical Inference: Inequality, Order, and Shape Restrictions. John Wiley & Sons, New York.

Xiong, S., Li, G., 2005. Testing for the maximum cell probabilities in multinomial distributions. Sci. Sinica Ser. A 48, 972-985.

Xiong, S., Li, G., 2006a. Inference for ordered parameters of multinomial distributions. Submitted. http://xiongshifeng.googlepages.com/06.pdf

Xiong, S., Li, G., 2006b. Some results on convergency of conditional distributions with applications. Submitted. http://xiongshifeng.googlepages.com/06b.pdf
