MathSoft


Research Report No. 75

Combining multiple imputation t, $\chi^2$, and F inferences
Tim Hesterberg
Draft, last revised May 7, 1998

Acknowledgments: This work was supported by NIH 2R44CA65147-02.

MathSoft, Inc., 1700 Westlake Ave. N, Suite 500, Seattle, WA 98109-9891, USA. Tel: (206) 283-8802. FAX: (206) 283-6310

E-mail:

[email protected]

Combining multiple imputation t, $\chi^2$, and F inferences
Tim C. Hesterberg
August 26, 1998

Abstract

We discuss rules for combining inferences from multiple imputations when complete-data inferences would be based on t-distributions rather than normal distributions, or F-distributions rather than $\chi^2$ distributions. Standard errors are obtained based on a distinction between the squared standard error and the actual variance of a t-distribution. Degrees of freedom are based on the coefficient of variation of a squared standard error, and combine the simulation error from using a finite number of imputations and the degrees of freedom in the original problem, adjusted for the estimated loss of information due to missing data. We extend these ideas to situations where complete-data inferences would be based on $\chi^2$- and F-distributions, or are based on p-values or $\chi^2$ (Wald) or F statistics. We conclude with a discussion about appropriate calculations for regression summaries. This is work in progress, and comments are welcomed.

Key Words: Missing data; Multiple imputations; Incomplete data.

1 Introduction

For an introduction to statistical analysis using multiple imputations, see Schafer (1997) (referred to as S97 in the sequel). We use notation from S97, in particular Sections 4.3.2 and 4.3.3 (S432 and S433 in the sequel). S432 is based on Rubin (1987), Chapter 3. S432 provides rules for combining inferences across multiple imputations which are appropriate if complete-data inferences (where no missing data are present) would be based on normal distributions; e.g. confidence intervals in the absence of missing data would be of the form

$$\hat Q \pm z_{\alpha/2}\sqrt{U}$$

where $\hat Q$ is a parameter estimate, $U$ is its variance (or a very accurate estimate), and $z_a$ is the $1 - a$ quantile of a standard normal distribution. We focus on combining inferences when complete-data inferences would instead be based on t-distributions,

$$\hat Q \pm t_{\alpha/2,\nu}\sqrt{U}$$

where $\nu$ is the degrees of freedom and $U$ is an estimate of the variance.

Similarly, S433 provides rules for combining inferences when complete-data inferences would be based on $\chi^2$ distributions; we provide alternative versions of those rules, and extend those rules to situations where complete-data inferences would be based on F-distributions. We begin in Section 2 with a review of rules for normal-based problems, and discuss rules for t-based problems in Section 3. In Section 4 we review rules for $\chi^2$-based problems, and generalize those rules. In Section 5 we extend those rules to F-distributions. Section 6 relates to combining F-statistics, when only F-statistics (not the parameter estimates and covariance matrices that were used to obtain the statistics) are available. In Section 7 we discuss the construction of common summary statistics from a linear model.

2 Combining normal-based inferences

We begin by reviewing the rules from S432 for combining inferences in normal problems. The observed data are $Y_{obs}$, and the missing data are replaced by one of $m$ sets of imputations $Y_{mis}^{(t)}$, $t = 1, \ldots, m$. Let $\hat Q^{(t)} = \hat Q(Y_{obs}, Y_{mis}^{(t)})$ and $U^{(t)} = U(Y_{obs}, Y_{mis}^{(t)})$ be the point and variance estimates using the $t$th set of imputed data, $t = 1, \ldots, m$. The multiple-imputation point estimate for $Q$ is the average of the complete-data point estimates,

$$\bar Q = \frac{1}{m}\sum_{t=1}^{m} \hat Q^{(t)}. \tag{1}$$

The variance estimate associated with $\bar Q$ has two components. The within-imputation variance is the average of the complete-data variance estimates,

$$\bar U = \frac{1}{m}\sum_{t=1}^{m} U^{(t)}. \tag{2}$$

The between-imputation variance is the variance of the complete-data point estimates,

$$B = \frac{1}{m-1}\sum_{t=1}^{m} (\hat Q^{(t)} - \bar Q)^2. \tag{3}$$

The total variance is defined as

$$T_{z1} = \bar U + (1 + m^{-1})B \tag{4}$$

and inferences are based on the approximation

$$(\bar Q - Q)/\sqrt{T_{z1}} \sim t_{\nu_{z1}} \tag{5}$$

where the degrees of freedom are

$$\nu_{z1} = (m - 1)\left[1 + \frac{\bar U}{(1 + m^{-1})B}\right]^2. \tag{6}$$

In the sequel we write $T$ for a generic estimate of total variation, and use subscripts such as $T_{z1}$ to denote specific estimates, where the $z$ indicates an estimate for normal-based problems; $t$, $c$, and $f$ will indicate t, $\chi^2$, and F-based problems, respectively. Similarly, $\nu_T$ will indicate a generic estimate of the final degrees of freedom, and specific estimates are written e.g. $\nu_{z1}$.
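The rules (1) through (6) can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `combine_normal` and its argument names are assumptions made here, not part of the report.

```python
from statistics import mean, variance


def combine_normal(q_hats, u_hats):
    """Combine m complete-data estimates and variances following (1)-(6).

    q_hats: point estimates, one per imputed data set (hypothetical input name)
    u_hats: the matching complete-data variance estimates
    """
    m = len(q_hats)
    if m < 2 or len(u_hats) != m:
        raise ValueError("need m >= 2 estimates with matching variances")
    q_bar = mean(q_hats)               # (1) multiple-imputation point estimate
    u_bar = mean(u_hats)               # (2) within-imputation variance
    b = variance(q_hats)               # (3) between-imputation variance, divisor m - 1
    t_z1 = u_bar + (1 + 1 / m) * b     # (4) total variance
    if b == 0:
        nu_z1 = float("inf")           # no between-imputation variation observed
    else:
        nu_z1 = (m - 1) * (1 + u_bar / ((1 + 1 / m) * b)) ** 2  # (6)
    return q_bar, u_bar, b, t_z1, nu_z1


q_bar, u_bar, b, t_z1, nu_z1 = combine_normal([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

With these toy inputs the between-imputation variance dominates, so the resulting degrees of freedom are small (a little under 4).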

3 Combining Inferences in t-based problems

In this section we discuss how to estimate total variation in t-based problems. We discuss two ways to combine t-based inferences, one based on adding variances, and the other based on adding variance parameters. This distinction does not arise with normal-based problems, because the variance of a normal distribution is its variance parameter. But with t-shaped distributions there is a difference. For example, suppose that $S$ is the standard error for an estimate $\hat Q$, such that the posterior distribution of $(Q - \hat Q)/S$ has a t-distribution with $\nu$ degrees of freedom; then the posterior distribution for $Q$ has variance parameter (squared standard error) $S^2$ but variance $\nu/(\nu - 2)\,S^2$. This distinction may be important when combining inferences.

We begin by extending the notation in S432. Let $U^{(t)}$ be the complete-data squared standard error for the $t$th complete data set, and

$$\tilde U^{(t)} = \frac{\nu}{\nu - 2}\,U^{(t)}$$

the variance of the corresponding scaled t distribution. Similarly, write $T$ and $\tilde T = \nu_T/(\nu_T - 2)\,T$ for the squared standard error and corresponding variance for the combined analysis, where $\nu_T$ is the final degrees of freedom for the inference, discussed below.

Note that from the point of view of a user of software it is most convenient to work with squared standard errors, rather than the (posterior) variances. However, it may be appropriate to work with variances internally within software that combines inferences.

One way to combine t-based inferences works solely with the squared standard errors, and uses (1, 2, 3) from Section 2, yielding the final squared standard error estimate as

$$T_{t1} = \bar U + (1 + m^{-1})B \tag{7}$$

(this is distinct from (4) because $U^{(t)}$ is now a squared standard error rather than a variance). The second way adds variances of posterior distributions. Here

$$\tilde T = \bar{\tilde U} + (1 + m^{-1})B$$

where $\bar{\tilde U} = m^{-1}\sum_t \tilde U^{(t)}$, and the final squared standard error is

$$T_{t2} = \frac{\nu_T - 2}{\nu_T}\,\tilde T = \frac{\nu_T - 2}{\nu_T}\left(\frac{\nu}{\nu - 2}\,\bar U + (1 + m^{-1})B\right). \tag{8}$$

This is undefined if $\nu_T \le 2$ ($\nu_T \le \nu$), and has a small factor on $B$ if $\nu$ is slightly greater than 2.

$T_{t1}$ is the simpler estimate, and is more conservative (i.e. larger) than $T_{t2}$; the more conservative estimate is generally preferred. However, we give here one example in which only $T_{t2}$ gives correct inferences.

Example 1. This is Example 1, page 73, of S97. Suppose that the observed data are a sample of $n_1$ observations from a univariate normal distribution with unknown mean $\mu$ and variance $\psi$, there are $n_0 = n - n_1$ missing observations from the same distribution, and that multiple imputations would be based on data augmentation using the diffuse prior $\pi(\mu, \psi) \propto \psi^{-1}$. Using this prior for a Bayesian analysis based solely on the observed data matches the standard frequentist confidence interval for $\mu$ of the form

$$\bar y_1 \pm t_{\alpha/2,\,n_1 - 1}\; s_1/\sqrt{n_1}$$

where $\bar y_1$ and $s_1$ are the mean and sample standard deviation of the observed data. It would be desirable for multiple-imputation based inference to yield the same interval in the limit as $m \to \infty$ (for finite $m$ there is simulation variability); the interval should not be different because in this univariate example multiple imputations do not add information to that contained in the observed data.

When doing data augmentation with the above prior (conditional on the observed data), the steady-state distribution for $\psi^{(t)}$ is

$$\psi^{(t)} \sim (n_1 - 1)s_1^2/\chi^2_{n_1 - 1}$$

which has expected value

$$E(\psi^{(t)}) = \frac{n_1 - 1}{n_1 - 3}\,s_1^2.$$

Conditional on $\psi^{(t)}$, the steady-state distribution for $\mu^{(t)}$ is

$$\mu^{(t)} \sim N(\bar y_1,\; n_1^{-1}\psi^{(t)})$$

which has variance

$$\mathrm{var}(\mu^{(t)}) = \frac{1}{n_1}E(\psi^{(t)}) = \frac{1}{n_1}\,\frac{n_1 - 1}{n_1 - 3}\,s_1^2.$$

The unconditional variance of $\bar y_{mis}$ is

$$\mathrm{var}(\bar Y_{mis}^{(t)}) = \left(\frac{1}{n_0} + \frac{1}{n_1}\right)\frac{n_1 - 1}{n_1 - 3}\,s_1^2$$

so that

$$\mathrm{var}(\bar Y^{(t)}) = \left(\frac{n_0}{n}\right)^2\left(\frac{1}{n_0} + \frac{1}{n_1}\right)\frac{n_1 - 1}{n_1 - 3}\,s_1^2.$$

The complete-data sample variance has expected value

$$E(S^{2(t)}) = \frac{n_1 - 1}{n_1 - 3}\,\frac{n - 3}{n - 1}\,s_1^2.$$

Let $\hat Q^{(t)} = \bar Y^{(t)}$ and $U^{(t)} = S^{2(t)}/n$. Then the following limits hold as $m \to \infty$:

$$\bar Q \to \bar y_1$$

$$\bar U \to E(S^{2(t)})/n = \frac{1}{n}\,\frac{n_1 - 1}{n_1 - 3}\,\frac{n - 3}{n - 1}\,s_1^2$$

$$B \to \mathrm{var}(\bar Y^{(t)}) = \frac{n_0}{n\,n_1}\,\frac{n_1 - 1}{n_1 - 3}\,s_1^2$$

$$T_{t1} \to \frac{s_1^2}{n_1}\,\frac{n_1 - 1}{n_1 - 3}\left(1 - \frac{2 n_1}{(n - 1)n}\right) \tag{9}$$

Note that $T_{t1}$ does not approach the desired $s_1^2/n_1$. Using the second way of combining inferences yields the same limits for $\bar Q$ and $B$, but

$$\bar{\tilde U} \to \frac{n - 1}{n - 3}\,E(S^{2(t)})/n = \frac{1}{n}\,\frac{n_1 - 1}{n_1 - 3}\,s_1^2$$

$$\tilde T \to \frac{s_1^2}{n}\,\frac{n_1 - 1}{n_1 - 3}\,(1 + n_0/n_1) = \frac{s_1^2}{n_1}\,\frac{n_1 - 1}{n_1 - 3}$$

$$T_{t2} = \frac{\nu_T - 2}{\nu_T}\,\tilde T \to \frac{s_1^2}{n_1} \quad \text{if } \nu_T \to n_1 - 1.$$

Note that $T_{t2}$ has the desired limiting value, if $\nu_T$ approaches the correct limiting degrees of freedom $n_1 - 1$.
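The limits in Example 1 can be checked numerically. The sketch below evaluates the limiting values of the within- and between-imputation components and plugs them into the two estimates of total variation. It takes the final degrees of freedom to be $n_1 - 1$, as the example assumes; the function and argument names are illustrative only.

```python
def example1_limits(n1, n0, s1_sq):
    """Limiting values (m -> infinity) of T_t1 and T_t2 in Example 1.

    n1: number of observed values (must exceed 3), n0: number missing,
    s1_sq: sample variance of the observed values.
    """
    if n1 <= 3:
        raise ValueError("the limits require n1 > 3")
    n = n1 + n0
    c = (n1 - 1) / (n1 - 3)                      # E(psi) / s1^2
    u_bar = c * (n - 3) / (n - 1) * s1_sq / n    # limit of the average squared s.e.
    b = n0 / (n * n1) * c * s1_sq                # limit of between-imputation variance
    t_t1 = u_bar + b                             # (7) with 1 + 1/m -> 1
    nu = n - 1                                   # complete-data degrees of freedom
    nu_t = n1 - 1                                # assumed limiting final degrees of freedom
    t_tilde = nu / (nu - 2) * u_bar + b          # sum of posterior variances
    t_t2 = (nu_t - 2) / nu_t * t_tilde           # (8)
    return t_t1, t_t2, s1_sq / n1


t_t1, t_t2, target = example1_limits(n1=10, n0=5, s1_sq=4.0)
```

For these inputs the target squared standard error is 0.4; the second estimate reproduces it, while the first is larger.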

3.1 Degrees of freedom for combining normal inferences

S97 provides a heuristic Bayesian justification for the procedure in S432 and Section 2, and indicates that the degrees of freedom $\nu_{z1}$ "are obtained by approximately matching the first two moments of the reduced-information posterior to those of a t-distribution." We provide an alternate interpretation, which suggests alternatives to $\nu_{z1}$ for normal-based problems, and which provides a way to combine $\nu$ and $\nu_z$ for t-based problems.

Recall that if $Z$ has a normal distribution with mean 0 and variance $\sigma^2$, and if $T$ is an estimate of $\sigma^2$ which has mean $\sigma^2$, is proportional to a $\chi^2$ variable and is independent of $Z$, then

$$\frac{Z}{\sqrt{T}} \sim t_{\nu_T} \tag{10}$$

where $\nu_T$ is the degrees of freedom for $T$. The reference distribution is t rather than normal because of variation in the denominator, causing the ratio to have a wider distribution. In practice, the t reference distribution is often used when the numerator is only approximately normal, the denominator is only approximately proportional to a $\chi^2$ variable, and the numerator and denominator have small covariance.

What is important for our purposes is that variation in the denominator gives rise to a t-distribution. Furthermore, the relationship between the variance and mean of the denominator determines the degrees of freedom. If $T$ is proportional to a $\chi^2$ variate, then

$$\frac{\mathrm{var}(T)}{E(T)^2} = \frac{2}{\nu_T}. \tag{11}$$

In other words, the (squared) coefficient of variation of $T$ is inversely proportional to the degrees of freedom. Rearranging (11) yields an expression for the degrees of freedom,

$$\nu_T = \frac{2E(T)^2}{\mathrm{var}(T)} \tag{12}$$

which may be used whether or not $T$ has a $\chi^2$ distribution.

In S432, $T_{z1}$ (4) is the squared standard error of $\bar Q$, so the appropriate degrees of freedom depends on $\mathrm{var}(T_{z1})$, which in turn involves the variance of $\bar U$, the variance of $B$, and the covariance of $\bar U$ and $B$. We now make two assumptions:

A1: $\mathrm{var}(\bar U) = 0$ (this implies that the covariance is also zero), and
A2: $B$ is proportional to a $\chi^2$ variate with $m - 1$ degrees of freedom.

Then

$$\mathrm{var}(T_{z1}) = (1 + m^{-1})^2\,\mathrm{var}(B) = (1 + m^{-1})^2\,\frac{2}{m - 1}\,E(B)^2. \tag{13}$$

Substituting into (12), and replacing $E(T_{z1})$ and $E(B)$ with $T_{z1}$ and $B$, respectively, we obtain:

$$\hat\nu_T = \frac{2(\hat E(T_{z1}))^2}{\widehat{\mathrm{var}}(T_{z1})} = \frac{2T_{z1}^2}{(1 + m^{-1})^2\,(2/(m - 1))\,B^2}$$

which simplifies to (6). In other words, assumptions A1 and A2 lead to the degrees of freedom formula (6).

In normal-based problems, it may be possible to improve on (6) by avoiding assumptions A1 and A2. The variances and covariances of $\bar U$ and $B$ may be estimated from multiple imputations; note that $T_{z1}$ can be written as a sample average,

$$T_{z1} = m^{-1}\sum_{t=1}^{m} T^{(t)} \tag{14}$$

where

$$T^{(t)} = U^{(t)} + \frac{m + 1}{m - 1}\,(\hat Q^{(t)} - \bar Q)^2 \tag{15}$$

(the factor $(m + 1)/(m - 1)$ is the one for which the average of the $T^{(t)}$ equals $\bar U + (1 + m^{-1})B$), so that $\mathrm{var}(T_{z1})$ may be estimated by the usual formulas for the variance of a sample average,

$$\widehat{\mathrm{var}}(T_{z1}) = \frac{1}{m(m - 1)}\sum_{t=1}^{m} (T^{(t)} - \bar T)^2 \tag{16}$$

and used in (12) to estimate the final degrees of freedom,

$$\nu_{z2} = (2\bar T^2)\left(\frac{1}{m(m - 1)}\sum_{t=1}^{m} (T^{(t)} - \bar T)^2\right)^{-1}. \tag{17}$$

However, doing this accurately requires that $m$ be relatively large, but in practice it is usually small, say 3 to 5. Furthermore, the improvement over (6) is likely to be relatively small in normal problems, where the presumption is that $U$ is exact so that variability in $\bar U$ should be small.
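A minimal sketch of (14) through (17) in Python, using only the standard library. The function name `nu_z2` and its arguments are illustrative assumptions. The per-imputation terms here use the factor $(m+1)/(m-1)$ on the squared deviation, since that is the factor for which their average reproduces $\bar U + (1 + m^{-1})B$ in (4).

```python
from statistics import mean, variance


def nu_z2(q_hats, u_hats):
    """Estimate total variance and degrees of freedom from per-imputation terms."""
    m = len(q_hats)
    if m < 2 or len(u_hats) != m:
        raise ValueError("need m >= 2 estimates with matching variances")
    q_bar = mean(q_hats)
    # (15): one term per imputation; their average equals T_z1
    t_parts = [u + (m + 1) / (m - 1) * (q - q_bar) ** 2
               for q, u in zip(q_hats, u_hats)]
    t_bar = mean(t_parts)                    # (14)
    var_t = variance(t_parts) / m            # (16): variance of a sample average
    if var_t == 0:
        return t_bar, float("inf")
    return t_bar, 2 * t_bar ** 2 / var_t     # (17), i.e. (12) with plug-in estimates


t_bar, df = nu_z2([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

As the text cautions, with $m$ as small as 3 to 5 the variance estimate in (16) rests on very few terms, so the resulting degrees of freedom are highly variable.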

3.2 Degrees of freedom in t-based problems

In this section we derive formulas for combining $\nu_{z1}$ (or any replacement that avoids assumptions A1 and A2) and $\nu$. In this section $T$ indicates one of $T_{t1}$ or $T_{t2}$.

We begin with a frequentist derivation, decomposing the variance of $T$ by conditioning on the observed data $Y_{obs}$:

$$\mathrm{var}(T) = E(\mathrm{var}(T \mid Y_{obs})) + \mathrm{var}(E(T \mid Y_{obs})) = E(\text{simulation variance}) + \text{other variance} \tag{18}$$

In this derivation the underlying parameter $Q$ is fixed, so there are two sources of variation: one due to random observed data $Y_{obs}$, and the simulation variance in choosing $m$ random sets of imputations $Y_{mis}^{(t)}$.

The first term in (18) involves $\mathrm{var}(T \mid Y_{obs})$, the simulation variation in $T$ due to using a finite number $m$ of random imputations, after conditioning on the observed data. This is the variance that was estimated in (13), implicitly conditioned on $Y_{obs}$. We estimate this term as:

$$\hat E(\text{simulation variance}) = \hat E(\mathrm{var}(T \mid Y_{obs})) = \widehat{\mathrm{var}}(T \mid Y_{obs}) = \frac{2}{\nu_{sim}}(\hat E(T))^2 = \frac{2}{\nu_{sim}}\,T^2 \tag{19}$$

where $\nu_{sim}$ is an estimate of degrees of freedom due to simulation error in estimating $T$; in particular, we may use $\nu_{sim} = \nu_{z1}$.

The second term involves variance due to random $Y_{obs}$. Let $U_1 = E(U^{(t)} \mid Y_{obs})$ and $B_1 = E(B \mid Y_{obs})$. Note that these are functions of $Y_{obs}$, but not of the random imputations. Then

$$\text{other variance} = \mathrm{var}(E(T \mid Y_{obs})) = \mathrm{var}(U_1 + (1 + m^{-1})B_1). \tag{20}$$

In the absence of missing data, $B_1 = 0$ and $U_1 = U$, the complete-data squared standard error for $\hat Q$. In this case, the degrees of freedom $\nu$ from the complete-data problem implies by (11) that $\widehat{\mathrm{var}}(U) = (2/\nu)U^2$.

In the presence of missing data, we begin with a simple estimate for the non-simulation variance of $T$, then propose adjustments. The simple estimate supposes that $(aU_1 + bB_1)$ is proportional to a $\chi^2$ variable with $\nu$ degrees of freedom for any positive $a$ and $b$ (particular $a$ and $b$ correspond to $T_{t1}$ and $T_{t2}$), yielding the estimate

$$\widehat{\mathrm{var}}(E(T \mid Y_{obs})) = (2/\nu)T^2. \tag{21}$$

This choice in combination with (19) leads to a simple estimate for the final degrees of freedom,

$$\nu_{t1} = \frac{2T^2}{\widehat{\mathrm{var}}(T)} = \frac{2T^2}{(2/\nu_{sim})T^2 + (2/\nu)T^2} = \left(\frac{1}{\nu_{sim}} + \frac{1}{\nu}\right)^{-1} \tag{22}$$

Note that this is never greater than $\nu$; this is desirable in statistical software, where the degrees of freedom for an analysis with multiple imputations should be no larger than would obtain with complete data.

However, note that as $m \to \infty$, (22) approaches $\nu$, but the actual degrees of freedom should be smaller because some data are missing. We propose two ways to adjust the relatively simple estimate (22).

The basic idea is to estimate the fraction of nonmissing data as $\bar U/T$, and adjust the original degrees of freedom by this quantity, yielding

$$\nu_{t2} = \left(\frac{1}{\nu_{sim}} + \frac{1}{\nu\,(\bar U/T)}\right)^{-1} \tag{23}$$

A minor variation on the previous adjustment is based on maintaining a distinction between sample size and degrees of freedom. For example, in Example 1, the observed and complete-data sample sizes were $n_1$ and $n$, respectively, while the appropriate degrees of freedom are $n_1 - 1$ and $n - 1$; the degrees of freedom are offset from the sample size by 1. In other problems the offset is different, e.g. $p + 1$ in linear regression with $p$ coefficients and an intercept. If $n$ is known and the appropriate offset is $(n - \nu)$ then the adjusted non-simulation degrees of freedom would be $(\bar U/T)n - (n - \nu)$, yielding

$$\nu_{t3} = \left(\frac{1}{\nu_{sim}} + \frac{1}{(\bar U/T)n - (n - \nu)}\right)^{-1} \tag{24}$$

Both (23) and (24) may be calculated using $T_{t1}$ or $T_{t2}$; using $T_{t2}$ requires solving a system of two equations in two unknowns. The combination of $T_{t2}$ and (24) yields the desired answer as $m \to \infty$ in Example 1.
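The three degrees-of-freedom estimates (22), (23), and (24) share one form: a reciprocal sum of the simulation degrees of freedom and a (possibly adjusted) complete-data degrees of freedom. The sketch below makes that explicit. All function and argument names are illustrative assumptions, and `u_bar` and `t` stand for the average squared standard error and whichever estimate of total variation is being used.

```python
def _reciprocal_sum(nu_sim, nu_other):
    """Combine two degrees-of-freedom values as 1 / (1/a + 1/b)."""
    if nu_sim <= 0 or nu_other <= 0:
        raise ValueError("degrees of freedom must be positive")
    return 1.0 / (1.0 / nu_sim + 1.0 / nu_other)


def nu_t1(nu_sim, nu):
    """(22): no adjustment for missing information."""
    return _reciprocal_sum(nu_sim, nu)


def nu_t2(nu_sim, nu, u_bar, t):
    """(23): scale complete-data df by the estimated nonmissing fraction u_bar / t."""
    return _reciprocal_sum(nu_sim, nu * (u_bar / t))


def nu_t3(nu_sim, nu, n, u_bar, t):
    """(24): scale the sample size n, then subtract the offset n - nu."""
    return _reciprocal_sum(nu_sim, (u_bar / t) * n - (n - nu))


# Toy values: nu_sim = 10, complete-data nu = 20 with n = 21, half the information retained
df1 = nu_t1(10.0, 20.0)
df2 = nu_t2(10.0, 20.0, u_bar=0.5, t=1.0)
df3 = nu_t3(10.0, 20.0, n=21, u_bar=0.5, t=1.0)
```

Passing `float("inf")` for `nu_sim` mimics the limit $m \to \infty$, in which only the adjusted complete-data degrees of freedom remain.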

3.3 Summary for univariate estimates

The single estimate for total variation $T$ in normal-based problems is $T_{z1}$ (4).

Estimates for degrees of freedom $\nu_T$ in normal-based problems are $\nu_{z1}$ and $\nu_{z2}$ (6, 17). As long as $m$ is relatively small, we suggest using $\nu_{z1}$ because $\nu_{z2}$ would be highly variable.

Estimates for total variation $T$ in t-based problems are $T_{t1}$ and $T_{t2}$ (7, 8), both linear combinations of $\bar U$ and $B$. $T_{t1}$ is simplest. $T_{t2}$ is more accurate in Example 1 as $m \to \infty$, but $T_{t1}$ is more conservative, which would generally be preferred in more complicated situations (where some of the assumptions underlying the methods may not hold) or where $m$ is small. Differences between these will be small if the original degrees of freedom are large, or if the fraction of missing information is small. We suggest using $T_{t1}$, which is simpler and more conservative.

Estimates for degrees of freedom $\nu_T$ in t-based problems are $\nu_{t1}$, $\nu_{t2}$, and $\nu_{t3}$ (22, 23, 24). We suggest using $\nu_{t3}$, which should be the most accurate.

Combining estimate $T_{t2}$ for total variation with degrees of freedom ($\nu_{t2}$, $\nu_{t3}$) requires solving a system of two nonlinear equations in two unknowns. We suggest using the approximations obtained by first computing $T_{t1}$, using it to compute degrees of freedom, then using those degrees of freedom in $T_{t2}$.
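The suggested approximation (compute $T_{t1}$, use it to obtain degrees of freedom, then plug those degrees of freedom into $T_{t2}$) can be sketched as follows. This sketch uses the adjustment (23) for the degrees of freedom because it does not need the sample size; swapping in (24) would be a small change. The function name and arguments are assumptions made for illustration.

```python
from statistics import mean, variance


def combine_t_two_step(q_hats, u_hats, nu):
    """Two-step approximation to T_t2 for t-based problems.

    q_hats: complete-data point estimates
    u_hats: complete-data squared standard errors
    nu:     complete-data degrees of freedom (must exceed 2)
    """
    m = len(q_hats)
    if m < 2 or len(u_hats) != m:
        raise ValueError("need m >= 2 estimates with matching squared standard errors")
    q_bar, u_bar, b = mean(q_hats), mean(u_hats), variance(q_hats)
    inflate = 1 + 1 / m
    t_t1 = u_bar + inflate * b                                   # step 1: (7)
    nu_sim = (float("inf") if b == 0
              else (m - 1) * (1 + u_bar / (inflate * b)) ** 2)   # (6) as nu_sim
    nu_t = 1.0 / (1.0 / nu_sim + 1.0 / (nu * u_bar / t_t1))      # step 2: (23) with T = T_t1
    if nu <= 2 or nu_t <= 2:
        raise ValueError("T_t2 is undefined when degrees of freedom are <= 2")
    t_t2 = (nu_t - 2) / nu_t * (nu / (nu - 2) * u_bar + inflate * b)  # step 3: (8)
    return q_bar, t_t1, nu_t, t_t2


q_bar, t_t1, nu_t, t_t2 = combine_t_two_step([1.0, 1.1, 0.9], [0.5, 0.5, 0.5], nu=20)
```

Because the final degrees of freedom never exceed `nu`, the value of `t_t2` returned here never exceeds `t_t1`, which agrees with the description of $T_{t1}$ as the more conservative choice.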

4 Combining inferences for multidimensional estimates, $\chi^2$ situations

We begin by reviewing the rules from S433 for combining inferences in $\chi^2$ problems; these largely parallel the rules in Section 2 from S432.

Let $\hat Q$ be a complete-data point estimate of a $k$-dimensional parameter $Q$, and let $U$ be its covariance matrix (or a very accurate estimate), and assume that $\hat Q$ is approximately distributed as $N(Q, U)$, so that complete-data inferences would be based on

$$(\hat Q - Q)^T U^{-1} (\hat Q - Q) \mathrel{\dot\sim} \chi^2_k$$

The multivariate analogs of (1-4) are

$$\bar Q = \frac{1}{m}\sum_{t=1}^{m} \hat Q^{(t)} \tag{25}$$

$$\bar U = \frac{1}{m}\sum_{t=1}^{m} U^{(t)} \tag{26}$$

$$B = \frac{1}{m - 1}\sum_{t=1}^{m} (\hat Q^{(t)} - \bar Q)(\hat Q^{(t)} - \bar Q)^T \tag{27}$$

$$T_{c1} = \bar U + (1 + m^{-1})B \tag{28}$$

Inferences are based on the test statistic

$$k^{-1}(\bar Q - Q)^T T^{-1} (\bar Q - Q) \mathrel{\dot\sim} F_{k,\nu_T} \tag{29}$$

for some $T$ and associated degrees of freedom $\nu_T$ (the factor $k^{-1}$ matches the complete-data F-statistic of Section 5).

S433 notes that $B$ is a noisy estimate of $\mathrm{var}(\hat Q \mid Y_{obs})$, and does not even have full rank if $m \le k$, and implies that using $T_{c1}$ for $T$ in (29) yields a test statistic which may not be approximately F-distributed. S433 suggests assuming that

A3: $\mathrm{var}(\hat Q \mid Y_{obs}) \propto E(U \mid Y_{obs})$,

and letting

$$T_{c2} = (1 + r_1)\bar U \tag{30}$$

where $r_1 = (1 + m^{-1})\,\mathrm{tr}(B\bar U^{-1})/k$, with degrees of freedom

$$\nu_{c1} = \begin{cases} k^*(1 + k^{-1})(1 + r_1^{-1})^2/2 & \text{if } k^* = k(m - 1) \le 4 \\ 4 + (k^* - 4)\left[1 + (1 - 2/k^*)\,r_1^{-1}\right]^2 & \text{otherwise} \end{cases} \tag{31}$$

S433 indicates that assumption A3 is equivalent to assuming that the fractions of missing information for all components of $Q$ are equal. We believe that the assumption is actually stronger, that it implies that the fractions of missing information for all linear combinations of components of $Q$ are equal.

We suggest the weaker assumption

A4: the correlation matrices corresponding to $\mathrm{var}(\hat Q \mid Y_{obs})$ and $E(U \mid Y_{obs})$ are equal.

Then let

$$B' = \mathrm{diag}(B)^{1/2}\,\mathrm{diag}(\bar U)^{-1/2}\;\bar U\;\mathrm{diag}(\bar U)^{-1/2}\,\mathrm{diag}(B)^{1/2} \tag{32}$$

be the adjusted estimate of $\mathrm{var}(\hat Q \mid Y_{obs})$, where $\mathrm{diag}(M)$ for a square matrix $M$ is the matrix with the same diagonal elements and zero elsewhere, and let

$$T_{c3} = \bar U + (1 + m^{-1})B' \tag{33}$$

be the estimate of total variation. Note that the diagonal elements of this matrix are the same as if total variation were estimated individually for components of the multivariate parameter using (4).

It is easy to construct examples for which even the weaker assumption A4 is violated. For example, if $X$ and $Y$ are jointly Gaussian with largely disjoint sets of missing values and the parameters of interest are the means of the variables, then the off-diagonal element of $\mathrm{var}(\hat Q \mid Y_{obs})$ is really larger than implied by assumption A4.
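The adjusted between-imputation matrix in (32) has a simple elementwise form: entry $(i, j)$ of $\bar U$ is rescaled by $\sqrt{B_{ii}/\bar U_{ii}}\,\sqrt{B_{jj}/\bar U_{jj}}$. The sketch below uses that form to build $B'$ and $T_{c3}$ from nested lists, without a matrix library. The function names are illustrative assumptions, and the inputs are assumed to be symmetric $k \times k$ matrices with positive diagonals.

```python
from math import sqrt


def adjusted_between(b, u_bar):
    """B' of (32): keep the diagonal of B, borrow the correlations of U-bar."""
    k = len(b)
    scale = [sqrt(b[j][j] / u_bar[j][j]) for j in range(k)]
    return [[scale[i] * u_bar[i][j] * scale[j] for j in range(k)]
            for i in range(k)]


def total_variation_c3(b, u_bar, m):
    """T_c3 of (33): U-bar plus the inflated adjusted between-imputation matrix."""
    k = len(b)
    b_adj = adjusted_between(b, u_bar)
    return [[u_bar[i][j] + (1 + 1 / m) * b_adj[i][j] for j in range(k)]
            for i in range(k)]


u_bar = [[4.0, 2.0], [2.0, 9.0]]   # average complete-data covariance (toy values)
b = [[1.0, 0.9], [0.9, 4.0]]       # between-imputation covariance (toy values)
t_c3 = total_variation_c3(b, u_bar, m=4)
```

In this toy case the noisy off-diagonal element 0.9 of `b` is replaced by the value implied by the correlation in `u_bar` (one third), while the diagonal of `t_c3` matches what (4) would give componentwise.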

4.1 Degrees of freedom in multivariate $\chi^2$ situations

Degrees of freedom may also be computed individually for components of the multivariate parameter using (6) or (17); write $\nu_{1j}$ for the value obtained for the $j$th component. These may be combined to obtain the denominator degrees of freedom using the conservative choice of the smallest degrees of freedom,

$$\nu_{c2} = \min_{j = 1, \ldots, k} \nu_{1j} \tag{34}$$

the reciprocal average

$$\nu_{c3} = \left(k^{-1}\sum_{j=1}^{k} \nu_{1j}^{-1}\right)^{-1} \tag{35}$$

or the directionally-weighted reciprocal average

$$\nu_{c4} = \left(\Big(\sum_{j=1}^{k} w_j\Big)^{-1}\sum_{j=1}^{k} w_j\,\nu_{1j}^{-1}\right)^{-1} \tag{36}$$

where $w_j = (T_{jj})^{-1/2}(\bar Q_j - Q_j)$ is proportional to the normalized value of $\bar Q - Q$ in the $j$th direction. We use reciprocal (weighted) averages because of the reciprocal relationship (12) between degrees of freedom and squared coefficient of variation of $T$.
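The three ways of combining componentwise degrees of freedom can be sketched as below. The function names are assumptions made for illustration, and `nu_c4` simply takes the weights as given by the caller, which is assumed to supply positive values.

```python
def nu_c2(nus):
    """(34): the conservative choice, the smallest componentwise df."""
    return min(nus)


def nu_c3(nus):
    """(35): reciprocal (harmonic) average of the componentwise df."""
    return len(nus) / sum(1.0 / nu for nu in nus)


def nu_c4(nus, weights):
    """(36): weighted reciprocal average; weights are supplied by the caller."""
    if len(weights) != len(nus):
        raise ValueError("need one weight per component")
    return sum(weights) / sum(w / nu for w, nu in zip(weights, nus))


component_df = [4.0, 12.0, 6.0]   # toy componentwise degrees of freedom
smallest = nu_c2(component_df)
harmonic = nu_c3(component_df)
weighted = nu_c4(component_df, [1.0, 1.0, 1.0])
```

With equal weights the weighted version reduces to the plain reciprocal average, and both are at least as large as the minimum.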

4.2 Summary for $\chi^2$ situations

Note that assumption A4 is much weaker than A3, and affects only the correlation structure of the total variation estimates. The correlation structure is where the lack of full rank in $B$ would occur, and is also presumably where the greatest noise would occur.

We suggest the use of Assumption A4 and the resulting estimate of total variation $T_{c3}$. The big advantage of this over A3 and $T_{c2}$ is that overall results are consistent with componentwise results.

We suggest using the conservative choice of the smallest degrees of freedom, $\nu_{c2}$; in fact even this choice is not overly conservative, as it is easy to construct examples where the fraction of missing information for a combination of parameters is higher than for any parameter individually. This conservative choice is sensitive to random variation with large $k$ and small $m$, in that a single one of a large number of parameters may have a low individual estimated degrees of freedom.

The choice $\nu_{c4}$ might be the most accurate in the majority of situations, but requires specification of $Q$ in order to compute directional weights, leading to the unusual situation for F statistics that the denominator degrees of freedom depends on the particular null hypothesis value being tested.

5 Combining inferences for multidimensional estimates, F situations

Inferences in F situations would involve the test statistic

$$k^{-1}(\hat Q - Q)^T U^{-1} (\hat Q - Q) \mathrel{\dot\sim} F_{k,\nu}$$

for complete data problems, where $Q$ and $\hat Q$ are as in Section 4 but $U$ is an estimate of the covariance matrix for $\hat Q$ (which is smaller than the covariance matrix $\tilde U$ for the posterior distribution by a factor $(\nu - 2)/\nu$).

Combining multiple-imputation inferences in F situations involves a combination of the ideas from t and $\chi^2$ situations. We summarize here the results that would obtain using the particular methods recommended in earlier sections.

We begin by extending the notation used in $\chi^2$ situations. Let $U^{(t)}$ be the "covariance-error" matrix for the $t$th complete data set, with average $\bar U$, and let $T$ denote the total covariance-error matrix. Let $B'$ be as in (32). The final estimate of total variation based on (7) and (33) is

$$T_{f1} = \bar U + (1 + m^{-1})B' \tag{37}$$

The test statistic is the F-statistic (29). The degrees of freedom for individual components of $T$ may be computed as for t-based problems, and the overall degrees of freedom (the denominator degrees of freedom for the F-statistic) computed using the conservative choice (34).

6 Combining F-statistics

In the previous section we discussed combining F-inferences, when the parameters and covariance estimates used to obtain F-statistics were available. Here we combine inferences based solely on the statistics. This incorporates ideas from previous sections and the material on page 115 in S97, which is based on Li et al. (1991). We begin by reviewing that material, which relates to combining $\chi^2$ statistics.

The complete-data Wald ($\chi^2$) statistics are

$$d_W^{(t)} = (\hat Q^{(t)} - Q_0)^T (U^{(t)})^{-1} (\hat Q^{(t)} - Q_0).$$

The combined statistic is

$$D_2 = \frac{\bar d_W\,k^{-1} - (m + 1)(m - 1)^{-1}\,r_2}{1 + r_2}$$

where

$$\bar d_W = \frac{1}{m}\sum_{t=1}^{m} d_W^{(t)}$$

is the average of the Wald statistics, and

$$r_2 = (1 + m^{-1})\left[\frac{1}{m - 1}\sum_{t=1}^{m}\left(\sqrt{d_W^{(t)}} - \overline{\sqrt{d_W}}\right)^2\right]$$

is an estimate of the average relative increase in variance. The numerator degrees of freedom are $k$ and the denominator degrees of freedom are

$$\nu_2 = k^{-3/m}(m - 1)(1 + r_2^{-1})^2.$$

To extend this methodology to the case where F statistics are available, we note that dividing a $\chi^2$ variate by its degrees of freedom $k$ yields an F variate with $k$ and $\infty$ degrees of freedom. Given F-statistics, we propose to convert them into approximate $\chi^2$ statistics by multiplying by $k$ and applying the above methodology; however the denominator degrees of freedom combine the degrees of freedom $\nu_2$ due to finite $m$ with the original denominator degrees of freedom using (23), except that the "fraction of information not lost to missing data" will be estimated by $(1 - 2/(\nu_2 + 3))/(r_2 + 1)$ instead of $\bar U/T$; the former is based on (4.30) in S432.

In particular, if the individual F-statistics are $d_F^{(t)}$ with $k$ and $\nu_{denom}$ degrees of freedom, let $d_W^{(t)} = d_F^{(t)}\,k$, calculate $D_2$, $\bar d_W$, $r_2$, and $\nu_2$ as above, and let the overall statistic be $D_F = D_2/k$ with numerator degrees of freedom $k$ and denominator degrees of freedom

$$\nu_2' = \left(\frac{1}{\nu_2} + \frac{1}{\nu_{denom}\,\dfrac{1 - 2/(\nu_2 + 3)}{r_2 + 1}}\right)^{-1}$$
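The procedure for combining Wald statistics, and the proposed denominator degrees of freedom when the complete-data statistics carry their own denominator degrees of freedom, can be sketched as follows. Function and argument names are assumptions made here; the sketch covers $D_2$, $r_2$, $\nu_2$, and the combined denominator degrees of freedom only.

```python
from math import sqrt
from statistics import mean, variance


def combine_wald(d_w, k):
    """Combine m complete-data Wald statistics with k numerator df.

    Returns (D2, r2, nu2) as reviewed in the text above.
    """
    m = len(d_w)
    if m < 2:
        raise ValueError("need at least two imputations")
    d_bar = mean(d_w)
    # r2: inflated sample variance of the square roots of the statistics
    r2 = (1 + 1 / m) * variance([sqrt(d) for d in d_w])
    d2 = (d_bar / k - (m + 1) / (m - 1) * r2) / (1 + r2)
    nu2 = (float("inf") if r2 == 0
           else k ** (-3 / m) * (m - 1) * (1 + 1 / r2) ** 2)
    return d2, r2, nu2


def combined_denominator_df(nu2, r2, nu_denom):
    """Combine nu2 with the original denominator df, in the manner of (23)."""
    retained = (1 - 2 / (nu2 + 3)) / (r2 + 1)   # estimated fraction of information kept
    return 1.0 / (1.0 / nu2 + 1.0 / (nu_denom * retained))


d2, r2, nu2 = combine_wald([4.0, 9.0, 16.0], k=2)
df = combined_denominator_df(nu2, r2, nu_denom=30)
```

When the complete-data statistics are F-statistics, the text proposes multiplying each by $k$ before calling a routine like `combine_wald`.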

7 Linear Model Summary Statistics

There are well-known relationships that apply between common summary statistics for a linear model, in particular including those in an analysis of variance (anova) table of the form:

             SS      df       MS      F
    model    SSm     nu_m     MSm     F
    error    SSe     nu_e     MSe
    total    SST     nu_T

as well as $R^2 = SS_m/SS_T$ and residual standard deviation $s = \sqrt{MS_e}$. In this section we discuss the computation of these quantities in a multiple-imputation setting.

Ideally, these summary statistics could be computed in the multiple-imputation context in a way that is consistent with their uses in the non-imputation context for both descriptive summaries and inference. However, this does not appear possible. In particular, the F-statistic has two uses: it is a simple descriptive statistic measuring the quality of the model (it estimates the ratio between the actual reduction in residual sums of squares due to the model and what the reduction would be under the null hypothesis), and it is used in computing a p-value for determining statistical significance. The appropriate values of $F$ for these uses differ in the multiple-imputation context.

We propose to maintain a distinction between inferential and non-inferential statistics. For inferential purposes, only $F$, $\nu_m$ and $\nu_e$ are needed; these were discussed in Section 5, albeit with different notation: $\nu_m = k$, and $\nu_e = \nu_{c2}$.

For computing descriptive statistics, we presume that all quantities in the anova table are available for each complete data set, and are denoted with a superscript $(t)$, e.g. $SS_m^{(t)}$. Then we propose the following definitions:

$$SS_m = \overline{SS_m^{(t)}}, \quad \nu_m = \nu_m^{(t)}, \quad MS_m = \overline{MS_m^{(t)}} = SS_m/\nu_m, \quad F = MS_m/MS_e$$

$$SS_e = \overline{SS_e^{(t)}}, \quad \nu_e = \nu_e^{(t)}, \quad MS_e = \overline{MS_e^{(t)}} = SS_e/\nu_e$$

$$SS_T = \overline{SS_T^{(t)}}, \quad \nu_T = \nu_T^{(t)}$$

The sum of squares and mean squares terms are all simple averages across the imputations. The degree of freedom terms are identical across imputations. Other quantities, including $R^2$ and $s$, are calculated as in the non-imputation setting. The values for $\nu_e$ and $F$ are different than the values used for inference.

Note that these definitions are consistent with the descriptive use of the F ratio described above, the use of $R^2$ as a description of "the fraction of variance explained by the model," and the use of $MS_e = s^2$ as an estimate of the conditional variance of the response given the explanatory variables (assuming homoskedasticity).
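The descriptive (non-inferential) summaries proposed in this section amount to averaging the sums of squares across imputations and then applying the usual complete-data formulas. A minimal sketch follows; the function name and the returned dictionary keys are assumptions made for illustration, and the total sum of squares is taken to be the sum of the model and error sums of squares within each imputation.

```python
from math import sqrt
from statistics import mean


def descriptive_anova(ss_model, ss_error, nu_m, nu_e):
    """Descriptive anova summaries averaged over imputations.

    ss_model, ss_error: per-imputation sums of squares
    nu_m, nu_e: model and error degrees of freedom (identical across imputations)
    """
    if len(ss_model) != len(ss_error) or not ss_model:
        raise ValueError("need matching, non-empty lists of sums of squares")
    ss_m = mean(ss_model)            # average model sum of squares
    ss_e = mean(ss_error)            # average error sum of squares
    ss_t = ss_m + ss_e               # assumes SST = SSm + SSe in each imputation
    ms_m = ss_m / nu_m
    ms_e = ss_e / nu_e
    return {
        "SSm": ss_m, "SSe": ss_e, "SST": ss_t,
        "MSm": ms_m, "MSe": ms_e,
        "F": ms_m / ms_e,            # descriptive F ratio, not the inferential one
        "R2": ss_m / ss_t,
        "s": sqrt(ms_e),
    }


summary = descriptive_anova([30.0, 50.0], [20.0, 20.0], nu_m=2, nu_e=10)
```

The `F` value returned here is the descriptive ratio only; a p-value would use the statistic and degrees of freedom discussed for inference.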

References

Li, H. H., Meng, X. L., Raghunathan, T. E., and Rubin, D. B. (1991). Significance Levels from Repeated p-Values with Multiply-imputed Data. Statistica Sinica, 1:65-92.

Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. John Wiley, New York.

Schafer, J. (1997). Analysis of incomplete multivariate data. Chapman and Hall.
