Acta Mathematica Sinica, English Series, Oct. 2008, Vol. 24, No. 10, pp. 1637–1650. Published online: October 5, 2008. DOI: 10.1007/s10114-008-6220-6. http://www.ActaMath.com


Simultaneous Optimality of LSE and ANOVA Estimate in General Mixed Models

Mi Xia WU
College of Applied Sciences, Beijing University of Technology, Beijing 100022, P. R. China
and Division of Epidemiology, Statistics and Prevention Research, National Institute of Child Health and Human Development, NIH/DHHS, 6100 Executive Boulevard, Rockville, MD 20852, USA
E-mail: [email protected]

Song Gui WANG
College of Applied Sciences, Beijing University of Technology, Beijing 100022, P. R. China
E-mail: [email protected]

Kai Fun YU
Division of Epidemiology, Statistics and Prevention Research, National Institute of Child Health and Human Development, NIH/DHHS, 6100 Executive Boulevard, Rockville, MD 20852, USA
E-mail: [email protected]

Abstract  Problems of simultaneous optimal estimation and optimal testing in general mixed models are considered. A necessary and sufficient condition is presented for the least squares estimate of the fixed effects and the analysis of variance (Henderson III's) estimate of the variance components to be uniformly minimum variance unbiased estimates simultaneously. The result can be applied to finding uniformly most powerful unbiased tests and uniformly most accurate unbiased confidence intervals for the parameters of interest, and to establishing the equivalence of several common estimates of the variance components.

Keywords  Linear mixed model, Least squares estimate, Analysis of variance estimate, Minimum variance unbiased estimate, Uniformly most powerful unbiased test

MR(2000) Subject Classification  62F10, 62J10, 62F03

1 Introduction

Linear mixed effects models are an important class of statistical models used in many fields of application, such as the biological and health sciences, economics, computer graphics and mechanical engineering; see Baltagi [1], Davidian and Giltinan [2], and Diggle et al. [3]. The parameters in these models, consisting of fixed effects and variance components, are mainly estimated by the following popular methods: least squares (LS), analysis of variance (ANOVA), maximum likelihood (ML), restricted maximum likelihood (REML) and minimum norm quadratic unbiased (MINQU) estimation. In general, the ML and REML estimates require solving a system of non-linear equations that usually has no explicit solution and must be computed by iterative algorithms; many such algorithms have been developed, including EM iterations and penalized least squares (see Searle et al. [4] and Bates and DebRoy [5]), and the available comparisons in the literature are mainly simulation based. The MINQU estimate depends strongly on the prior values of the variance components. In practice, the least squares estimate (LSE) and the ANOVA estimate are more commonly employed because of their simple forms and their good properties under some models, especially in hypothesis testing. Baltagi [1] also showed, through large-scale simulations, that the ANOVA-type estimate has high efficiency in two-stage estimation of the fixed effects. Regarding the possibility that the ANOVA estimate takes negative values, improved versions of the ANOVA estimate have been proposed (see Kelly and Mathew [6]); moreover, in some fields such as biology and genetics, negative values of variance components are permitted as long as the covariance matrix remains nonnegative definite, see Nelder [7], Smith and Murray [8], and Hazelton and Gurrin [9].

In this paper, we mainly consider the simultaneous optimality of the LSE and the ANOVA estimate in general linear mixed models. Graybill and Hultquist [10] discussed this problem for general random effect models and presented a sufficient condition for the LSE of the fixed effect and the ANOVA estimate of the variance components to be uniformly minimum variance unbiased (UMVU) estimates simultaneously; this is the case for all random effect models with balanced data, see Searle et al. [4]. For general linear mixed models, Rao [11] obtained necessary and sufficient conditions for the LSE of the fixed effects to be the UMVU estimate, and Seely [12] presented a necessary and sufficient condition for the existence of the uniformly minimum variance invariant unbiased (UMVIU) estimator of the variance components. Brown [13] proved that an ANOVA estimate obtained from certain independent quadratic forms in the total sum of squares is UMVIU under some conditions. Khuri et al. [14] studied the optimality of tests constructed from these quadratic forms. However, the optimality of Henderson III's estimate, the most common form of the ANOVA estimate, remains unresolved in general, apart from some results for special models, see Searle et al. [4] and Wu and Wang [15]. This paper gives fairly general results on simultaneous optimal estimation and optimal testing in general mixed models.

The outline of this paper is as follows. In Section 2, we describe the LSE of any estimable function and the ANOVA estimate (Henderson III's estimate) of the variance components under general linear mixed models. In Section 3, necessary and sufficient conditions are obtained for the LSE and the ANOVA estimate to be UMVU estimates simultaneously, and optimal tests and confidence intervals are presented. In the last section, the equivalence of several common estimates of the variance components, namely the ANOVA estimate, the REML estimate and the MINQU estimate, is established.

Received April 27, 2006; Revised May 27, 2007; Accepted May 30, 2008. The work is supported by the Funding Project for Academic Human Resources Development in Institutions of Higher Learning under the Jurisdiction of Beijing Municipality PHR (IHLB), the National Natural Science Foundation of China (NSFC) (10801005 and 10771010), and the Intramural Research Program of the National Institute of Child Health and Human Development, National Institutes of Health.

2 LSE and ANOVA Estimate

Consider a general linear mixed model
  y = Xβ + Σ_{i=1}^{k} U_i ξ_i + e,    (2.1)
where y is an n × 1 vector of observations, X is an n × p known matrix, β is an unknown p × 1 vector of fixed effects, each U_i is an n × q_i known matrix, each ξ_i is a q_i × 1 vector of random effects having a normal distribution with mean vector 0 and covariance matrix σ_i² I_{q_i}, e is an n × 1 error vector having a normal distribution with mean vector 0 and covariance matrix σ_e² I_n, and ξ_1, ..., ξ_k and e are mutually independently distributed. Denote U_{k+1} = I_n, ξ_{k+1} = e and σ_{k+1}² = σ_e². Then the covariance matrix of y is

  V(σ²) = Cov(y) = Σ_{i=1}^{k+1} σ_i² U_i U_i′,    (2.2)
where σ² = (σ_1², ..., σ_k², σ_{k+1}²)′. The parameter space of σ² is defined to be
  Ω = {σ² : σ_i² ≥ 0, i = 1, ..., k, σ_{k+1}² > 0}.
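To make the covariance structure (2.2) concrete, the short NumPy sketch below assembles V(σ²) for a toy balanced one-way layout; the design matrices and variance values are our own choices for illustration and are not taken from the paper.

```python
import numpy as np

# Toy balanced one-way layout (a = 3 groups, b = 4 replicates), k = 1 random factor.
a, b = 3, 4
U1 = np.kron(np.eye(a), np.ones((b, 1)))     # n x q_1 incidence matrix of the random effect
U2 = np.eye(a * b)                           # U_{k+1} = I_n for the error term

def V(sigma2, U_list):
    """Covariance matrix (2.2): V(sigma^2) = sum_i sigma_i^2 U_i U_i'."""
    return sum(s2 * U @ U.T for s2, U in zip(sigma2, U_list))

print(V([2.0, 1.0], [U1, U2])[:6, :6])       # sigma_1^2 = 2, sigma_{k+1}^2 = 1
```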

The LSE of any estimable function c′β under model (2.1) is
  c′β̂ = c′(X′X)⁻X′y = α′P_X y,    (2.3)
where α is defined by c = X′α, P_X = X(X′X)⁻X′, and A⁻ denotes a g-inverse of A.

The Henderson III estimate, the most common form of the ANOVA estimate, is based on the sums of squares obtained by the method of fitting constants. Without loss of generality, we assume that the order of U_1, U_2, ..., U_{k+1} satisfies the condition
  M(U_i) ⊂ M(U_j) implies i < j,    (2.4)
where M(U_i) ⊂ M(U_j) defines the partial ordering ≼ on S_n, the set of all matrices with n rows, and is denoted by U_i ≼ U_j. Here M(A) stands for the column space of a matrix A. Then an ANOVA estimate of σ² is the solution to the following equations:
  a_i(σ²) = y′M_i y,  i = 1, ..., k + 1,    (2.5)
where a_i(σ²) = E(y′M_i y) and
  M_1 = P_(X:U_1) − P_X,   M_i = P_(X:U_1:···:U_i) − P_(X:U_1:···:U_{i−1}),  i = 2, ..., k + 1,    (2.6)
and E(η) denotes the expectation of the random variable η. Clearly, P_(X:U_1:···:U_k:U_{k+1}) = I_n from the fact U_{k+1} = I_n, and
  a_i(σ²) = Σ_{j=i}^{k+1} tr(M_i U_j U_j′) σ_j²,  i = 1, ..., k + 1,    (2.7)
where tr(A) denotes the trace of the matrix A. Let C = (c_ij) be a (k + 1) × (k + 1) upper triangular matrix such that
  c_ij = tr(M_i U_j U_j′) if i ≤ j, and c_ij = 0 otherwise.


Denote a = (a_1, ..., a_{k+1})′. Then (2.7) can be rewritten as
  a = Cσ².    (2.8)
If C is invertible, then the ANOVA estimate of σ² exists and can be written as
  σ̂² = C⁻¹s,    (2.9)
where s = (s_1, ..., s_{k+1})′ with elements s_i = y′M_i y, i = 1, ..., k + 1. Clearly, the ANOVA estimate σ̂² is unbiased.

Remark 2.1  The partial ordering (2.4) required on U_1, U_2, ..., U_{k+1} ensures that each M_i ≠ 0 if the ANOVA estimate of σ² exists. In what follows, we take the matrix C = (c_ij) to be invertible.
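The construction (2.5)–(2.9) translates directly into code. The following NumPy sketch (ours, not the authors' code) builds the fitting-constants projectors M_i, the upper triangular matrix C and the Henderson III estimate σ̂² = C⁻¹s; the helper names anova_estimate and proj are hypothetical, and a Moore–Penrose pseudoinverse stands in for the g-inverses.

```python
import numpy as np

def proj(A):
    """Orthogonal projector onto the column space of A (pinv handles rank
    deficiency, playing the role of a g-inverse)."""
    return A @ np.linalg.pinv(A)

def anova_estimate(y, X, U_list):
    """Henderson III (ANOVA) estimate of (sigma_1^2, ..., sigma_k^2, sigma_e^2)
    for y = X beta + sum_i U_i xi_i + e, following (2.5)-(2.9).  U_list must
    already respect the partial ordering (2.4)."""
    n = len(y)
    U = U_list + [np.eye(n)]                 # append U_{k+1} = I_n for the error
    k1 = len(U)                              # k + 1
    # M_1 = P_(X:U1) - P_X,  M_i = P_(X:U1:...:Ui) - P_(X:U1:...:U_{i-1})
    M, Z, P_prev = [], X, proj(X)
    for Ui in U:
        Z = np.hstack([Z, Ui])
        P_cur = proj(Z)
        M.append(P_cur - P_prev)
        P_prev = P_cur
    # upper triangular C with c_ij = tr(M_i U_j U_j') for i <= j, cf. (2.7)
    C = np.zeros((k1, k1))
    for i in range(k1):
        for j in range(i, k1):
            C[i, j] = np.trace(M[i] @ U[j] @ U[j].T)
    s = np.array([y @ Mi @ y for Mi in M])   # s_i = y' M_i y
    return np.linalg.solve(C, s)             # sigma_hat^2 = C^{-1} s, eq. (2.9)

# toy balanced one-way layout: 6 groups, 5 replicates per group
rng = np.random.default_rng(0)
a, b = 6, 5
X = np.ones((a * b, 1))
U1 = np.kron(np.eye(a), np.ones((b, 1)))
y = X @ np.array([2.0]) + U1 @ rng.normal(0, 1.5, a) + rng.normal(0, 1.0, a * b)
print(anova_estimate(y, X, [U1]))            # roughly (1.5**2, 1.0**2)
```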

3 Necessary and Sufficient Conditions for the Simultaneous Optimality

In order to prove our main results, we need the following lemmas.

Lemma 3.1  For any matrices A and B with the same number of rows, if P_A P_B = P_B P_A, then
(i) P_A P_B is the orthogonal projector onto M(A) ∩ M(B);
(ii) P_(A:B) = P_A + P_B − P_B P_A P_B = P_A + Q_A P_B, where Q_A = I − P_A.
For the proof see Theorem 1 of Baksalary [16].

Lemma 3.2  For any n × n positive definite matrix V and n × p matrix X, the following two statements are equivalent:
(i) P_X and V commute;
(ii) V can be decomposed as
  V = Σ_{i=1}^{m} λ_i P_i + Σ_{i=m+1}^{m_0} λ_i P_i,    (3.1)
where λ_1, ..., λ_m and λ_{m+1}, ..., λ_{m_0} are the distinct non-zero eigenvalues of Q_X V Q_X and P_X V P_X, respectively, and the orthogonal projectors P_1, ..., P_{m_0} satisfy P_i P_j = 0 (i ≠ j), Σ_{i=1}^{m} P_i = Q_X and Σ_{i=m+1}^{m_0} P_i = P_X.

Proof  Obviously, (i) is a direct consequence of (ii); we prove (i) ⇒ (ii). Assume that (i) is true. Then V = V(Q_X + P_X) = Q_X V Q_X + P_X V P_X. Let the spectral decompositions of Q_X V Q_X and P_X V P_X be
  Q_X V Q_X = Σ_{i=1}^{m} λ_i P_i,   P_X V P_X = Σ_{i=m+1}^{m_0} λ_i P_i,
where λ_1, ..., λ_m and λ_{m+1}, ..., λ_{m_0} are the distinct non-zero eigenvalues of Q_X V Q_X and P_X V P_X, respectively. Since V is a positive definite matrix, it is easy to show that M(Q_X V Q_X) = M(Q_X) and M(P_X V P_X) = M(P_X). We have
  Σ_{i=1}^{m} P_i = Q_X,   Σ_{i=m+1}^{m_0} P_i = P_X,   P_i P_j = 0 (i ≠ j),


so (ii) holds. The proof of Lemma 3.2 is completed.

Remark 3.1  Some eigenvalues of Q_X V Q_X and P_X V P_X may coincide, that is, λ_{m+j} = λ_i may hold for some i, j (≤ m) in (3.1).

By (ii) of Lemma 3.2, it is easy to verify that if P_X and V commute, then
  Q_X V⁻¹ Q_X = Σ_{i=1}^{m} (1/λ_i) P_i    (3.2)
and
  |V⁻¹| = Π_{i=1}^{m_0} (1/λ_i)^{r_i} = Π_{i=1}^{m} (1/λ_i)^{r_i} · |H′V⁻¹H|,    (3.3)
where r_i = tr(P_i), i = 1, 2, ..., m_0, r = rk(X), and H is an n × r column orthogonal matrix such that H′H = I_r and P_H = HH′ = P_X.

In the following theorem, we present the necessary and sufficient conditions for c′β̂ and σ̂² to be the UMVU estimates.

Theorem 3.1  Under model (2.1), c′β̂ and σ̂² are the UMVU estimates of the estimable function c′β and of σ², respectively, if and only if Q_X V(σ²) is symmetric and has exactly k + 1 distinct non-zero eigenvalues, which are linearly independent combinations of the σ_i²'s.

Proof  (Sufficiency)  Let λ_1, λ_2, ..., λ_{k+1} be the k + 1 distinct non-zero eigenvalues of Q_X V(σ²). Since they are linearly independent combinations of the σ_i²'s, there exists a (k + 1) × (k + 1) invertible constant matrix A = (a_ij) such that
  λ = Aσ²,    (3.4)

where λ = (λ_1, λ_2, ..., λ_{k+1})′. Using the symmetry of Q_X V(σ²), we have
  Q_X V(σ²) Q_X = Q_X V(σ²) = Σ_{i=1}^{k+1} λ_i P_i,    (3.5)
where the {P_i} are orthogonal projectors satisfying P_i P_j = 0 (i ≠ j) and Σ_{i=1}^{k+1} P_i = Q_X. By combining (3.2) and (3.3) with Q_X V(σ²) = V(σ²) Q_X, the density function of y can be rewritten as
  f(y) = (2π)^{−n/2} |V(σ²)|^{−1/2} exp{ −(y − Xβ)′V⁻¹(σ²)(y − Xβ)/2 }
       = (2π)^{−(n−r)/2} Π_{i=1}^{k+1} (1/λ_i)^{r_i/2} exp{ −Σ_{i=1}^{k+1} y′P_i y/(2λ_i) }
         · (2π)^{−r/2} |H′V⁻¹(σ²)H|^{1/2} exp{ −(H′y − H′Xβ)′ H′V⁻¹(σ²)H (H′y − H′Xβ)/2 },    (3.6)
where H is defined in (3.3) and σ² = A⁻¹λ. Noting that Xβ = H(H′Xβ), any estimable function c′β (= α′Xβ) is determined by the estimable vector H′Xβ. Reparametrizing the regression parameter as γ = H′Xβ, we conclude that H′y and y′P_i y, i = 1, 2, ..., k + 1, form a set of complete and sufficient statistics according to Theorem 15.16 of Arnold [17]. Furthermore, combining (2.2) and (3.4) with (3.5) yields
  Q_X V(σ²) Q_X = Σ_{j=1}^{k+1} σ_j² Q_X U_j U_j′ Q_X = Σ_{i=1}^{k+1} λ_i P_i = Σ_{j=1}^{k+1} σ_j² ( Σ_{i=1}^{k+1} a_ij P_i )


for any σ² ∈ Ω, which implies that each Q_X U_j U_j′ Q_X can be decomposed as
  Q_X U_j U_j′ Q_X = Σ_{i=1}^{k+1} a_ij P_i.    (3.7)
Denote
  b_ij = 0 if a_ij = 0, and b_ij = 1 if a_ij ≠ 0.
Thus the orthogonal projector onto the subspace M(Q_X U_j) is given by
  P_{Q_X U_j} = Σ_{i=1}^{k+1} b_ij P_i,  j = 1, 2, ..., k + 1,    (3.8)
which shows that P_{Q_X U_i} and P_{Q_X U_j} commute (1 ≤ i, j ≤ k + 1), that is,
  P_{Q_X U_i} P_{Q_X U_j} = P_{Q_X U_j} P_{Q_X U_i},  1 ≤ i, j ≤ k + 1.
According to (ii) of Lemma 3.1, we have
  P_{(Q_X U_1 : Q_X U_2)} = Σ_{i=1}^{k+1} (b_i1 + b_i2 − b_i1 b_i2) P_i = Σ_{i=1}^{k+1} max{b_i1, b_i2} P_i.    (3.9)
Clearly, P_{(Q_X U_1 : Q_X U_2)} P_{Q_X U_3} = P_{Q_X U_3} P_{(Q_X U_1 : Q_X U_2)}. Note that each max{b_i1, b_i2} still takes only the value 1 or 0. Similarly, we can deduce
  P_{(Q_X U_1 : Q_X U_2 : ··· : Q_X U_l)} = Σ_{i=1}^{k+1} max_{j ≤ l}{b_ij} P_i,  1 ≤ l ≤ k + 1.    (3.10)
By the matrix identity P_{(A:B)} = P_A + P_{Q_A B}, it is easy to verify
  M_1 = P_{Q_X U_1} = Σ_{i=1}^{k+1} b_i1 P_i,
  M_l = P_{(Q_X U_1 : Q_X U_2 : ··· : Q_X U_l)} − P_{(Q_X U_1 : Q_X U_2 : ··· : Q_X U_{l−1})} = Σ_{i=1}^{k+1} ( max_{j ≤ l}{b_ij} − max_{j ≤ l−1}{b_ij} ) P_i,  2 ≤ l ≤ k + 1.    (3.11)
Noting that
  c′β̂ = c′(X′X)⁻X′y = α′HH′y,
  s_1 = y′M_1 y = Σ_{i=1}^{k+1} b_i1 y′P_i y,
  s_l = y′M_l y = Σ_{i=1}^{k+1} ( max_{j ≤ l}{b_ij} − max_{j ≤ l−1}{b_ij} ) y′P_i y,  2 ≤ l ≤ k + 1,    (3.12)
we conclude that c′β̂ and σ̂² are the UMVU estimates of c′β and σ², respectively.

(Necessity)  For any known (k + 1) × 1 vector f = (f_1, f_2, ..., f_{k+1})′, it is obvious by (2.5) that f′σ̂² belongs to the class of unbiased quadratic invariant estimates of f′σ²,
  D = {y′Gy : G = G′, GX = 0, E(y′Gy) = f′σ²}.
Under the normality assumption on y, the variance of each y′Gy ∈ D is
  Var(y′Gy) = 2 tr(GV(σ²))².


If the ANOVA estimate σ̂² is the UMVU estimate of σ², then f′σ̂² is the UMVU estimate of f′σ². Thus f′σ̂² is in particular the minimum variance estimate within the class D, a subclass of the unbiased estimates of f′σ². Write f′σ̂² = y′G_0 y. Then the matrix G_0 is the solution to the following minimum trace problem:
  min_G { tr(GV(σ²))² : GX = 0, tr(GU_iU_i′) = f_i, i = 1, ..., k + 1 },
which, in analogy with the minimum trace problem for the MINQU estimate, can be shown to be equivalent to the fact that σ̂² is the explicit solution to the equations
  Σ_{j=1}^{k+1} tr(P(σ²)U_iU_i′P(σ²)U_jU_j′) σ_j² = y′P(σ²)U_iU_i′P(σ²)y,  i = 1, 2, ..., k + 1,    (3.13)
where P(σ²) = V⁻¹(σ²) − V⁻¹(σ²)X(X′V⁻¹(σ²)X)⁻X′V⁻¹(σ²) = B(B′V(σ²)B)⁻¹B′. Here B is an n × (n − r) column orthogonal matrix such that B′B = I_{n−r} and BB′ = Q_X. Applying Theorem 5 of Szatrowski [18], we conclude that (3.13) has an explicit solution if and only if B′V(σ²)B has exactly k + 1 distinct non-zero eigenvalues which are linearly independent combinations of the σ_i²'s. Noting that Q_X V(σ²) Q_X and B′V(σ²)B have the same non-zero eigenvalues, and that the symmetry of Q_X V(σ²) is a direct consequence of c′β̂ being the UMVU estimate (see Searle et al. [4]), the necessity is verified. The proof of Theorem 3.1 is completed.

By Lemma 3.2, the two conditions that P_X V(σ²) = V(σ²) P_X and that Q_X V(σ²) Q_X has k + 1 distinct non-zero eigenvalues which are linearly independent combinations of the σ_i²'s are together equivalent to the statement that V(σ²) can be decomposed as
  V(σ²) = Σ_{i=1}^{k+1} λ_i P_i + Σ_{i=k+2}^{m_0} λ_i P_i,    (3.14)
where λ_1, ..., λ_{k+1} are the same as those defined in Theorem 3.1, and the orthogonal projectors P_1, ..., P_{m_0} do not depend on σ² and satisfy P_i P_j = 0 (i ≠ j), Σ_{i=1}^{k+1} P_i = Q_X and Σ_{i=k+2}^{m_0} P_i = P_X. Thus we obtain the following simple and useful corollaries.

Corollary 3.1  c′β̂ and σ̂² are the UMVU estimates of the estimable function c′β and of σ², respectively, if V(σ²) can be decomposed as
  V(σ²) = Σ_{i=1}^{k+1} λ_i P_i + λ_0 P_X,    (3.15)
where the eigenvalues λ_1, λ_2, ..., λ_{k+1} are linearly independent combinations of the σ_i²'s and the orthogonal projectors P_1, ..., P_{k+1} satisfy P_X P_i = 0, P_i P_j = 0 (i ≠ j) and Σ_{i=1}^{k+1} P_i = Q_X.

Corollary 3.2  c′β̂ and σ̂² are the UMVU estimates of the estimable function c′β and of σ², respectively, if V(σ²) can be decomposed as
  V(σ²) = Σ_{i=1}^{k+1} λ_i G_i,    (3.16)
where the eigenvalues λ_1, λ_2, ..., λ_{k+1} are linearly independent combinations of the σ_i²'s and the orthogonal projectors G_1, ..., G_{k+1} satisfy G_i P_X = P_X G_i, G_i G_j = 0 (i ≠ j) and Σ_{i=1}^{k+1} G_i = I_n.
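For a given design, the condition of Theorem 3.1 can be checked numerically. The sketch below is our own diagnostic (evaluated at one randomly drawn point of Ω rather than over the whole parameter space): it tests whether P_X and V(σ²) commute, whether Q_X V(σ²) Q_X has exactly k + 1 distinct non-zero eigenvalues, and whether the coefficient matrix A in λ = Aσ², recovered from the eigenprojectors via (3.7), is invertible. The function name is hypothetical.

```python
import numpy as np

def check_theorem_3_1(X, U_list, tol=1e-8):
    """Heuristic numerical check of the condition of Theorem 3.1 at one
    randomly drawn generic point of Omega.  Returns (commutes, right_count, A):
      commutes    : does P_X V = V P_X hold (i.e. is Q_X V symmetric)?
      right_count : does Q_X V Q_X have exactly k+1 distinct nonzero eigenvalues?
      A           : matrix with lambda = A sigma^2, or None if the
                    decomposition (3.7) fails; det(A) != 0 corresponds to the
                    eigenvalues being linearly independent combinations of sigma^2.
    This is our own sketch, not part of the paper."""
    n = X.shape[0]
    U = U_list + [np.eye(n)]                    # include U_{k+1} = I_n
    k1 = len(U)
    PX = X @ np.linalg.pinv(X)
    QX = np.eye(n) - PX

    rng = np.random.default_rng(1)
    sigma2 = rng.uniform(0.5, 2.0, k1)          # a "generic" point of Omega
    V = sum(s2 * Ui @ Ui.T for s2, Ui in zip(sigma2, U))
    commutes = np.allclose(PX @ V, V @ PX, atol=tol)

    # distinct nonzero eigenvalues of Q_X V Q_X and their eigenprojectors P_i
    w, Q = np.linalg.eigh(QX @ V @ QX)
    nz = w > tol
    w, Q = w[nz], Q[:, nz]
    groups = []                                 # indices sharing one eigenvalue
    for j in range(len(w)):
        if groups and abs(w[j] - w[groups[-1][0]]) < 1e-6 * max(1.0, w[j]):
            groups[-1].append(j)
        else:
            groups.append([j])
    P = [Q[:, g] @ Q[:, g].T for g in groups]   # eigenprojectors
    if len(P) != k1:
        return commutes, False, None

    # coefficients a_ij with Q_X U_j U_j' Q_X = sum_i a_ij P_i, cf. (3.7)
    A = np.empty((k1, k1))
    for j, Uj in enumerate(U):
        W = QX @ Uj @ Uj.T @ QX
        for i, Pi in enumerate(P):
            A[i, j] = np.trace(Pi @ W) / np.trace(Pi)
        if not np.allclose(W, sum(A[i, j] * P[i] for i in range(k1)), atol=1e-6):
            return commutes, True, None         # decomposition (3.7) fails
    return commutes, True, A

# balanced one-way layout: the condition holds and A is invertible
a, b = 3, 4
X = np.ones((a * b, 1))
U1 = np.kron(np.eye(a), np.ones((b, 1)))
ok_lse, ok_eigs, A = check_theorem_3_1(X, [U1])
print(ok_lse, ok_eigs, A is not None and abs(np.linalg.det(A)) > 1e-8)
```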


As a special case of Corollaries 3.1 and 3.2, for random effect models (wherein Xβ = 1_n μ and μ̂ = ȳ = Σ_{i=1}^{n} y_i/n), we obtain Theorem 5 of Graybill and Hultquist [10], that is,

Corollary 3.3  The necessary and sufficient condition for the simultaneous optimality of μ̂ and σ̂² is that W = V(σ²) + nμ² J̄_n has exactly k + 2 distinct eigenvalues, which are linearly independent combinations of the σ_i²'s and μ², where J̄_n = 1_n 1_n′/n and 1_n = (1, ..., 1)′.

Remark 3.2  Under the condition of Corollary 3.2, the spectral decomposition of Q_X V(σ²) Q_X is
  Q_X V(σ²) Q_X = Σ_{i=1}^{k+1} λ_i (Q_X G_i Q_X) = Σ_{i=1}^{k+1} λ_i P_i.

Remark 3.3  From the proof of Theorem 3.1, the symmetry of Q_X V(σ²) is a necessary condition for σ̂² to be the UMVU estimate of σ². That is, if c′β̂ is not UMVU, then σ̂² cannot be UMVU.

Remark 3.4  Combining Corollary 3.2 with Theorem 5.1 of Szatrowski [18], if the MLE of (c′β, σ²) has an explicit form, then σ̂² must be UMVU. However, the converse is not true.

Furthermore, even if c′β̂ is not a UMVU estimate, σ̂² may still be the uniformly minimum variance estimate within the class of unbiased estimates that are invariant under the group {y → y + Xa}, provided Q_X V(σ²) Q_X has exactly k + 1 distinct non-zero eigenvalues which are linearly independent combinations of the σ_i²'s; see Brown [13]. The following theorem shows the uniqueness of the ANOVA estimate σ̂² under the latter condition.

Theorem 3.2  If Q_X V(σ²) Q_X has exactly k + 1 distinct non-zero eigenvalues λ_1, λ_2, ..., λ_{k+1} which are linearly independent combinations of the σ_i²'s, say λ_i = Σ_{j=1}^{k+1} a_ij σ_j², then the ANOVA estimate σ̂² is unique and equals the solution to the equations
  λ_i = Σ_{j=1}^{k+1} a_ij σ_j² = y′P_i y / tr(P_i),  i = 1, 2, ..., k + 1,    (3.17)
where P_i is defined by (3.5).

Proof  Denote A = (a_ij); then λ = Aσ². By the proof of Theorem 3.1, if Q_X V(σ²) Q_X has exactly k + 1 distinct non-zero eigenvalues which are linearly independent combinations of the σ_i²'s, then (3.11) holds. Thus equations (2.5) can be rewritten as
  a_1(σ²) = E(s_1) = Σ_{i=1}^{k+1} b_i1 λ_i tr(P_i) = Σ_{i=1}^{k+1} b_i1 y′P_i y,
  a_l(σ²) = E(s_l) = Σ_{i=1}^{k+1} ( max_{j ≤ l} b_ij − max_{j ≤ l−1} b_ij ) λ_i tr(P_i) = Σ_{i=1}^{k+1} ( max_{j ≤ l} b_ij − max_{j ≤ l−1} b_ij ) y′P_i y,  l = 2, ..., k + 1.    (3.18)
Let D = (d_il) be the (k + 1) × (k + 1) matrix with d_i1 = b_i1 and d_il = max_{j ≤ l} b_ij − max_{j ≤ l−1} b_ij for l ≥ 2. Then the matrix form of (3.18) is
  a = D′ diag(tr(P_1), ..., tr(P_{k+1})) λ = D′ (y′P_1 y, ..., y′P_{k+1} y)′.
Together with (2.7) and (3.4), we obtain D′ = C A⁻¹ diag(tr(P_1), ..., tr(P_{k+1}))⁻¹, which is invertible, and hence equations (3.18) are equivalent to equations (3.17). Thus the ANOVA estimate σ̂² is unique and equals the solution of (3.17). The proof of Theorem 3.2 is completed.

Remark 3.5  Theorem 3.2 implies that if neither U_i ≼ U_j nor U_j ≼ U_i holds, then U_i and U_j may be permuted in (2.4).

To illustrate the results above, we consider the following three simple examples.

Example 3.1  Consider the two-way classification random model
  y_ij = μ + α_i + β_j + e_ij,  i = 1, 2, ..., a,  j = 1, 2, ..., b,    (3.19)
where μ is the overall mean and both α_i and β_j are random effects. Assume that α_i ~ N(0, σ_α²), β_j ~ N(0, σ_β²) and e_ij ~ N(0, σ_e²), and that all the α_i's, β_j's and e_ij's are mutually independent. Denote y = (y_11, ..., y_1b, y_21, ..., y_ab)′. Then Cov(y) is given by
  V(σ²) = σ_α² I_a ⊗ J_b + σ_β² J_a ⊗ I_b + σ_e² I_ab,
and its spectral decomposition is
  V(σ²) = (bσ_α² + aσ_β² + σ_e²) J̄_ab + (bσ_α² + σ_e²) (I_a − J̄_a) ⊗ J̄_b + (aσ_β² + σ_e²) J̄_a ⊗ (I_b − J̄_b) + σ_e² (I_a − J̄_a) ⊗ (I_b − J̄_b),
where J_m = 1_m 1_m′ and J̄_m = J_m/m. By Corollary 3.1, μ̂ = ȳ and the ANOVA estimate of σ² = (σ_α², σ_β², σ_e²)′ are UMVU. Applying (3.17), we can easily obtain the ANOVA estimate σ̂², which is the solution of the following equations:
  bσ_α² + σ_e² = y′[(I_a − J̄_a) ⊗ J̄_b]y / (a − 1),
  aσ_β² + σ_e² = y′[J̄_a ⊗ (I_b − J̄_b)]y / (b − 1),    (3.20)
  σ_e² = y′[(I_a − J̄_a) ⊗ (I_b − J̄_b)]y / ((a − 1)(b − 1)).
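As a numerical illustration of Example 3.1 (our own simulation, with arbitrarily chosen true values), the following sketch forms the three quadratic forms in (3.20) and solves for (σ_α², σ_β², σ_e²).

```python
import numpy as np

# Simulate the balanced two-way random model (3.19) and solve the system (3.20).
rng = np.random.default_rng(7)
a, b = 6, 5
mu = 10.0
sig_a2, sig_b2, sig_e2 = 4.0, 2.25, 1.0          # true variance components
alpha = rng.normal(0.0, np.sqrt(sig_a2), a)
beta  = rng.normal(0.0, np.sqrt(sig_b2), b)
Y = mu + alpha[:, None] + beta[None, :] + rng.normal(0.0, np.sqrt(sig_e2), (a, b))
y = Y.ravel()                                    # ordering (y_11, ..., y_1b, y_21, ...)

Jbar = lambda m: np.full((m, m), 1.0 / m)
Pa = np.kron(np.eye(a) - Jbar(a), Jbar(b))       # eigenprojector, lambda = b*sig_a2 + sig_e2
Pb = np.kron(Jbar(a), np.eye(b) - Jbar(b))       # eigenprojector, lambda = a*sig_b2 + sig_e2
Pe = np.kron(np.eye(a) - Jbar(a), np.eye(b) - Jbar(b))   # eigenprojector, lambda = sig_e2

lam1 = y @ Pa @ y / (a - 1)
lam2 = y @ Pb @ y / (b - 1)
lam3 = y @ Pe @ y / ((a - 1) * (b - 1))

print("sigma_e^2    :", lam3)
print("sigma_alpha^2:", (lam1 - lam3) / b)       # solves b*sig_a2 + sig_e2 = lam1
print("sigma_beta^2 :", (lam2 - lam3) / a)       # solves a*sig_b2 + sig_e2 = lam2
```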

Example 3.2  Consider the two-way classification mixed model
  y_ij = μ + α_i + β_j + e_ij,  i = 1, 2, ..., a,  j = 1, 2, ..., b,
where β_j is a fixed effect and μ, α_i, e_ij satisfy the same assumptions as in (3.19). Thus
  Cov(y) = V(σ²) = (bσ_α² + σ_e²) I_a ⊗ J̄_b + σ_e² I_a ⊗ (I_b − J̄_b),
and X = (1_a ⊗ 1_b : 1_a ⊗ I_b). Denote γ = (μ, β_1, ..., β_b)′, the vector of fixed effects. By Corollary 3.2, the LSE c′γ̂ of any estimable function c′γ and the ANOVA estimate σ̂² = (σ̂_α², σ̂_e²)′ are UMVU, where γ̂ = (ȳ, ȳ_{·1} − ȳ, ..., ȳ_{·b} − ȳ)′ and ȳ_{·j} is the sample mean at the j-th level of β. Using Remark 3.2 and (3.17), we obtain
  σ̂_e² = y′[(I_a − J̄_a) ⊗ (I_b − J̄_b)]y / ((a − 1)(b − 1)),
  σ̂_α² = { y′[(I_a − J̄_a) ⊗ J̄_b]y − (a − 1)σ̂_e² } / (b(a − 1)).
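A similar sketch (again our own simulation, with illustrative true values) recovers σ̂_α² and σ̂_e² in the mixed model of Example 3.2.

```python
import numpy as np

# Balanced two-way mixed model of Example 3.2: beta_j fixed, alpha_i random.
rng = np.random.default_rng(11)
a, b = 6, 5
mu = 10.0
beta = np.linspace(-2.0, 2.0, b)                    # fixed effects (our choice)
sig_a2, sig_e2 = 4.0, 1.0
alpha = rng.normal(0.0, np.sqrt(sig_a2), a)
Y = mu + alpha[:, None] + beta[None, :] + rng.normal(0.0, np.sqrt(sig_e2), (a, b))
y = Y.ravel()

Jbar = lambda m: np.full((m, m), 1.0 / m)
Ma = np.kron(np.eye(a) - Jbar(a), Jbar(b))          # (I_a - Jbar_a) (x) Jbar_b
Me = np.kron(np.eye(a) - Jbar(a), np.eye(b) - Jbar(b))

s_e2 = y @ Me @ y / ((a - 1) * (b - 1))
s_a2 = (y @ Ma @ y - (a - 1) * s_e2) / (b * (a - 1))
print(s_a2, s_e2)                                   # compare with 4.0 and 1.0
```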

Example 3.3  Consider the random regression coefficient model
  y_i = 1_n u_i + X b_i + e_i,  i = 1, 2, ..., k,    (3.21)
where y_i = (y_{i1}, ..., y_{in})′ is the observation vector of the i-th subclass, X is an n × p matrix with rk(X) = p, u_i is a random intercept with distribution N(μ, σ_u²), b_i is a p × 1 vector of random regression coefficients with distribution N(β, σ_b² I_p), e_i ~ N(0, σ_e² I_n), and all the u_i's, b_i's and e_i's are mutually independent. Write u_i = μ + η_i and b_i = β + ξ_i, where η_i ~ N(0, σ_u²) and ξ_i ~ N(0, σ_b² I_p). Let y = (y_1′, ..., y_k′)′, η = (η_1, ..., η_k)′, ξ = (ξ_1′, ..., ξ_k′)′ and e = (e_1′, ..., e_k′)′. Then (3.21) can be rewritten as
  y = 1_{kn} μ + (1_k ⊗ X)β + (I_k ⊗ 1_n)η + (I_k ⊗ X)ξ + e.
The covariance matrix of y is given by
  V(σ²) = σ_u² I_k ⊗ J_n + σ_b² I_k ⊗ (XX′) + σ_e² I_{kn}.
Let X_0 = (1_{kn} : 1_k ⊗ X). Clearly, P_{X_0} V(σ²) = V(σ²) P_{X_0}, so the LSE of c′β is the UMVU estimate. Moreover, if X satisfies the orthogonality condition X′X = aI_p (a > 0) and 1_n′X = 0, then the spectral decomposition of Q_{X_0} V(σ²) Q_{X_0} is given by
  Q_{X_0} V(σ²) Q_{X_0} = (nσ_u² + σ_e²) (I_k − J̄_k) ⊗ J̄_n + (aσ_b² + σ_e²) (I_k − J̄_k) ⊗ P_X + σ_e² I_k ⊗ (I_n − J̄_n − P_X).
By Theorems 3.1 and 3.2, the ANOVA estimate of (σ_e², σ_b², σ_u²) is UMVU and unique, and it is the solution to the following equations:
  nσ_u² + σ_e² = λ_1 = y′[(I_k − J̄_k) ⊗ J̄_n]y / (k − 1),
  aσ_b² + σ_e² = λ_2 = y′[(I_k − J̄_k) ⊗ P_X]y / ((k − 1)p),    (3.22)
  σ_e² = λ_3 = y′[I_k ⊗ (I_n − J̄_n − P_X)]y / (k(n − 1 − p)).

Theorems 3.1 and 3.2 not only extend the results on optimality in Graybill and Hultquist [10] to general mixed models, but also give an explicit formula for the ANOVA estimate of σ², namely the solution to equations (3.17).

Corollary 3.4  If both the LSE c′β̂ and the ANOVA estimate σ̂² are UMVU estimates, then c′β̂, λ̂_1, ..., λ̂_{k+1} are mutually independent, and each λ̂_i is distributed as (λ_i/r_i)·χ²_{r_i}, where λ̂_i = y′P_i y/r_i, r_i = tr(P_i), and χ²_m denotes the chi-square distribution with m degrees of freedom.


Corollary 3.4 is a direct consequence of (3.6). It is very important for constructing exact tests and exact confidence intervals for the parameters based on the LSE and the ANOVA estimate.

Firstly, we consider the hypothesis
  H_0: σ_i² = 0  vs  H_a: σ_i² > 0    (3.23)
under model (2.1). If, among λ_1, ..., λ_{k+1}, there exist two eigenvalues λ_{i1}, λ_{i2} such that λ_{i1} − λ_{i2} = m_i σ_i² for some positive real number m_i, then
  F_i(y) = λ̂_{i1}/λ̂_{i2}    (3.24)
is an exact F-statistic for testing the hypothesis H_0. A convenient example is the test of σ_u² = 0 by the F-statistic λ̂_1/λ̂_3 in Example 3.3. Since the statistics H′y, y′P_1 y, y′P_2 y, ..., y′P_{k+1} y are sufficient and complete, by standard multiparameter exponential family results (Lehmann [19]), the test
  φ(F_i(y)) = 1 if F_i(y) ≥ F_{r_{i1}, r_{i2}}(1 − α), and φ(F_i(y)) = 0 otherwise,
is the uniformly most powerful unbiased (UMPU) test, with power function
  g(λ_{i2}/λ_{i1}) = P_{σ_i²}{ F_i(y) ≥ F_{r_{i1}, r_{i2}}(1 − α) } = P{ F_{r_{i1}, r_{i2}} ≥ (λ_{i2}/λ_{i1}) F_{r_{i1}, r_{i2}}(1 − α) },    (3.25)
where F_{m,n} denotes the F distribution with m and n degrees of freedom. Furthermore, it is easy to verify that the probability that the ANOVA estimate of σ_i² takes a negative value is equal to
  P{ λ̂_{i1} ≤ λ̂_{i2} } = P{ F_{r_{i1}, r_{i2}} ≤ λ_{i2}/λ_{i1} }.    (3.26)

Secondly, we consider tests on fixed parameters of interest. If X can be partitioned as X = (X_1, X_2, ..., X_q) with X_i′X_j = 0 (i ≠ j), then each X_i β_i is an estimable vector and its LSE is
  X_i β̂_i = X_i(X_i′X_i)⁻X_i′y = P_{X_i} y,  i = 1, 2, ..., q,    (3.27)
where β_i is the subvector of the parameter β corresponding to X_i, that is, β = (β_1′, β_2′, ..., β_q′)′. By Corollary 3.4, we can conclude that X_1β̂_1, ..., X_qβ̂_q, λ̂_1, ..., λ̂_{k+1} are mutually independent. Thus there exists a UMPU test for the hypothesis
  H_0: c_i′β_i = d_i  vs  H_a: c_i′β_i ≠ d_i    (3.28)

as long as P_{X_i} V(σ²) = λ_j P_{X_i} and λ_j coincides with one of λ_1, ..., λ_{k+1}, where c_i ∈ M(X_i′). The test statistic is given by
  T(y) = (c_i′β̂_i − d_i) / √( c_i′(X_i′X_i)⁻c_i · λ̂_j ),    (3.29)
which has a t-distribution with r_j degrees of freedom under the null hypothesis c_i′β_i = d_i.


From (3.29), we can directly derive that the exact (1 − α) confidence interval for c_i′β_i,
  ( c_i′β̂_i − t_{r_j}(α/2) √(c_i′(X_i′X_i)⁻c_i · λ̂_j),  c_i′β̂_i + t_{r_j}(α/2) √(c_i′(X_i′X_i)⁻c_i · λ̂_j) ),    (3.30)
is the uniformly most accurate unbiased (UMAU) confidence interval. In Example 3.3, if X′X = aI_p (a > 0) and 1_n′X = 0, then
  Cov(μ̂) = λ_1/(kn),   Cov(β̂) = (λ_2/(ak)) I_p,   Cov(μ̂, β̂) = 0,
and the exact 1 − α confidence intervals for the fixed effects μ and c′β are
  ( μ̂ − t_{k−1}(α/2) √(λ̂_1/(kn)),  μ̂ + t_{k−1}(α/2) √(λ̂_1/(kn)) )
and
  ( c′β̂ − t_{p(k−1)}(α/2) √(c′c λ̂_2/(ak)),  c′β̂ + t_{p(k−1)}(α/2) √(c′c λ̂_2/(ak)) ),
respectively.
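The following sketch (our own, assuming NumPy and SciPy) simulates model (3.21) with an orthogonal design X′X = aI_p, 1_n′X = 0, computes the ANOVA estimates via (3.22), carries out the exact F test of H_0: σ_u² = 0 from (3.24), and forms the exact confidence interval (3.30) for c′β; all numerical choices are illustrative.

```python
import numpy as np
from scipy import stats

# Model (3.21) of Example 3.3 with an orthogonal design: X'X = a I_p, 1'X = 0.
rng = np.random.default_rng(3)
k, n, p, a = 12, 8, 2, 3.0
Z = rng.normal(size=(n, p))
Z -= Z.mean(axis=0)                                 # center so that 1'X = 0
Q, _ = np.linalg.qr(Z)                              # orthonormal columns
X = np.sqrt(a) * Q                                  # now X'X = a I_p

mu, beta = 5.0, np.array([1.0, -2.0])
sig_u2, sig_b2, sig_e2 = 2.0, 1.5, 1.0
u  = rng.normal(mu, np.sqrt(sig_u2), k)             # random intercepts u_i
bi = beta + rng.normal(0.0, np.sqrt(sig_b2), (k, p))  # random coefficients b_i
Y  = u[:, None] + bi @ X.T + rng.normal(0.0, np.sqrt(sig_e2), (k, n))
y  = Y.ravel()                                      # y = (y_1', ..., y_k')'

Jk, Jn = np.full((k, k), 1 / k), np.full((n, n), 1 / n)
PX = X @ X.T / a                                    # projector onto M(X)
P1 = np.kron(np.eye(k) - Jk, Jn)                    # lambda_1 = n*sig_u2 + sig_e2
P2 = np.kron(np.eye(k) - Jk, PX)                    # lambda_2 = a*sig_b2 + sig_e2
P3 = np.kron(np.eye(k), np.eye(n) - Jn - PX)        # lambda_3 = sig_e2
r1, r2, r3 = k - 1, (k - 1) * p, k * (n - 1 - p)
lam1, lam2, lam3 = y @ P1 @ y / r1, y @ P2 @ y / r2, y @ P3 @ y / r3
print("ANOVA estimates (sig_e2, sig_b2, sig_u2):",
      lam3, (lam2 - lam3) / a, (lam1 - lam3) / n)   # solves (3.22)

# Exact F test of H0: sig_u2 = 0, cf. (3.24): lambda_1 - lambda_3 = n*sig_u2.
F = lam1 / lam3
print("F =", F, "reject at 5% level:", F >= stats.f.ppf(0.95, r1, r3))

# Exact 95% confidence interval (3.30) for c'beta.  Because 1'X = 0, the LSE is
# beta_hat = (k X'X)^{-1} (1_k (x) X)' y = X' (sum_i y_i) / (k a).
c = np.array([1.0, 1.0])
beta_hat = X.T @ Y.sum(axis=0) / (k * a)
half = stats.t.ppf(0.975, r2) * np.sqrt(c @ c * lam2 / (a * k))
print("95% CI for c'beta:", (c @ beta_hat - half, c @ beta_hat + half))
```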

4 Equivalent Estimates of ANOVA Estimate

In this section, we prove that the condition of Theorem 3.2 is necessary and sufficient for the equivalence of the ANOVA estimate, the REML estimate and the MINQU estimate.

Theorem 4.1  The condition of Theorem 3.2 is necessary and sufficient for the equivalence of the ANOVA estimate σ̂² and the explicit solution of the REML equations of model (2.1).

Proof  (Sufficiency)  The log-likelihood function based on B′y is
  l(σ²) = log L(σ² | B′y) = −(1/2)(n − rk(X)) log 2π − (1/2) log |B′V(σ²)B| − (1/2) y′B(B′V(σ²)B)⁻¹B′y,    (4.1)
where B is an n × (n − r) column orthogonal matrix such that B′B = I_{n−r} and BB′ = Q_X. Maximizing l(σ²) with respect to σ² = (σ_1², σ_2², ..., σ_{k+1}²)′, we obtain the REML equations
  tr(P(σ²)U_iU_i′) = y′P(σ²)U_iU_i′P(σ²)y,  i = 1, 2, ..., k + 1,    (4.2)
which are an equivalent form of equations (3.13). Noting that
  P(σ²) = V⁻¹(σ²) − V⁻¹(σ²)X(X′V⁻¹(σ²)X)⁻X′V⁻¹(σ²) = Q_X(Q_X V(σ²) Q_X)⁺ Q_X,    (4.3)
under the condition of Theorem 3.2 we have
  P(σ²) = Σ_{i=1}^{k+1} (1/λ_i) Q_X P_i Q_X = Σ_{i=1}^{k+1} (1/λ_i) P_i.
Combining this with (3.7), the REML equations (4.2) reduce to
  Σ_{i=1}^{k+1} (a_ij/λ_i) tr(P_i) = Σ_{i=1}^{k+1} (a_ij/λ_i²) y′P_i y,  j = 1, 2, ..., k + 1,    (4.4)
where the a_ij and λ_i are the same as those defined in the proof of Theorem 3.1. By the invertibility of A = (a_ij) (whose rows a_i = (a_{i1}, a_{i2}, ..., a_{i,k+1}), i = 1, 2, ..., k + 1, are linearly independent), equations (4.4) are equivalent to equations (3.17); that is, the REML equations (4.4) have an explicit solution, which equals the ANOVA estimate σ̂² by Theorem 3.2.

(Necessity)  If the ANOVA estimate σ̂² and the explicit solution to the REML equations are equal, then the explicit solution to the REML equations (3.13) must exist. Thus Q_X V(σ²) Q_X has exactly k + 1 distinct non-zero eigenvalues which are linearly independent combinations of σ², as was shown in the proof of Theorem 3.1. The proof of Theorem 4.1 is completed.

If a prior value σ̇² is available, then the MINQU estimate is another important explicit estimate of σ²; it is the solution to the following equations:
  Σ_{j=1}^{k+1} tr(P(σ̇²)U_iU_i′P(σ̇²)U_jU_j′) σ_j² = y′P(σ̇²)U_iU_i′P(σ̇²)y,  i = 1, 2, ..., k + 1,    (4.5)

where P(σ̇²) has the same form as in (4.3), obtained by replacing σ² in P(σ²) with the prior value σ̇²; see [20]. Thus we have the following theorem.

Theorem 4.2  The condition of Theorem 3.2 is also necessary and sufficient for the equivalence of the ANOVA estimate σ̂² and the MINQU estimate.

Proof  The REML iterative equations (3.13) and the MINQU equations (4.5) have the same form provided the prior value λ̇ is replaced by the current iterative value of λ (see [20]); hence the MINQU estimate being independent of the prior value σ̇² is equivalent to the REML equations (3.13) having an explicit solution. Therefore we only need to prove the sufficiency part of Theorem 4.2. Using the fact
  Σ_{j=1}^{k+1} P(σ̇²)U_iU_i′P(σ̇²)U_jU_j′ σ_j² = P(σ̇²)U_iU_i′P(σ̇²)V(σ²)
together with (3.7) and (4.3), we conclude that if the condition of Theorem 3.2 holds, then equation (4.5) is equivalent to
  Σ_{i=1}^{k+1} a_ij λ_i tr(P_i)/λ̇_i² = Σ_{i=1}^{k+1} a_ij y′P_i y/λ̇_i²,  j = 1, 2, ..., k + 1,    (4.6)
which simplifies to (3.17) by the invertibility of A = (a_ij) and the fact that each λ̇_i > 0. Thus the ANOVA estimate σ̂² and the MINQU estimate are equal. The proof of Theorem 4.2 is completed.

For example, the ANOVA estimate, the solution to the REML equations and the MINQU estimate under model (3.21) are equal if and only if Q_{X_0} V(σ²) Q_{X_0} has exactly three distinct non-zero eigenvalues which are linearly independent combinations of (σ_u², σ_b², σ_e²). It can be shown that the latter is equivalent to X′X = aI_p (a > 0) and 1_n′X = 0 in model (3.21).

Acknowledgments  We are grateful to the anonymous referees and the editor for their detailed suggestions, which have considerably improved the quality of the paper.

References
[1] Baltagi, B. H.: Econometric Analysis of Panel Data, John Wiley, New York, 1995


[2] Davidian, M., Giltinan, D. M.: Nonlinear Models for Repeated Measurement Data, Chapman and Hall, London, 1996
[3] Diggle, P. J., Liang, K. Y., Zeger, S. L.: Analysis of Longitudinal Data, Oxford Science, New York, 2000
[4] Searle, S. R., Casella, G., McCulloch, C. E.: Variance Components, Wiley, New York, 1992
[5] Bates, D. M., DebRoy, S.: Linear mixed models and penalized least squares. J. Multiv. Anal., 91, 1–17 (2004)
[6] Kelly, R. J., Mathew, T.: Improved estimators of variance components with smaller probability of negativity. J. R. Statist. Soc. B, 55, 897–911 (1994)
[7] Nelder, J. A.: The interpretation of negative variance components. Biometrika, 41, 544–548 (1954)
[8] Smith, D. W., Murray, L. W.: An alternative to Eisenhart's model II and mixed model in the case of negative variance estimates. J. Amer. Statist. Assoc., 79, 145–151 (1984)
[9] Hazelton, M. L., Gurrin, L. C.: A note on genetic variance components in mixed models. Genetic Epidemiology, 24, 297–301 (2003)
[10] Graybill, F. A., Hultquist, R. A.: Theorems concerning Eisenhart's model II. Ann. Math. Statist., 32, 261–269 (1961)
[11] Rao, C. R.: Least squares theory using an estimated dispersion matrix and its application to measurement of signals. Proc. Fifth Berkeley Symposium on Math. Statist. & Prob., LeCam, L. M., Neyman, J. (eds.), Univ. California Press, Berkeley, 1, 355–372 (1967)
[12] Seely, J.: Quadratic subspaces and completeness. Ann. Math. Statist., 42, 710–721 (1971)
[13] Brown, K. G.: On analysis of variance in the mixed model. Ann. Statist., 12, 1488–1499 (1984)
[14] Khuri, A. I., Mathew, T., Sinha, B. K.: Statistical Tests for Mixed Linear Models, Wiley, New York, 1998
[15] Wu, M. X., Wang, S. G.: Simultaneously optimal estimation for fixed effects and variance components in the mixed model. Sci. China Ser. A, 47(5), 787–799 (2004)
[16] Baksalary, J. K.: Algebraic characterizations and statistical implications of the commutativity of orthogonal projectors. Proc. Second International Tampere Conference in Statistics, Pukkila, T., Puntanen, S. (eds.), 113–142, 1987
[17] Arnold, S. F.: The Theory of Linear Models and Multivariate Analysis, John Wiley, New York, 1981
[18] Szatrowski, T. H.: Necessary and sufficient conditions for explicit solutions in the multivariate normal estimation problem for patterned means and covariances. Ann. Statist., 8, 802–810 (1980)
[19] Lehmann, E. L.: Testing Statistical Hypotheses, 2nd ed., John Wiley & Sons, New York, 1986
[20] Rao, C. R., Kleffe, J.: Estimation of Variance Components and Applications, Elsevier Science Publishers B. V., New York, 1988
