Alden Gross
16 Aug 2012

Which ICC should we use for our functional composites study?

Analysis goals

Here, I use data provided in Shrout and Fleiss (1979) on ratings for 6 targets from 4 judges. (1) I calculate ICCs using their formulae, along the way testing what icr11.ado does. (2) I then apply the equations to a study we conducted in which 10 expert clinicians were asked to rate cognitive, physical, and independent loadings (3 sets of ratings) for each of 25 IADL/ADL items.

Summary of results

ICC(2,k) to describe agreement of the mean judge rating for items.
ICC(2,1) to describe agreement of judges for a particular individual item’s rating.

Background

We report we are using an ICC(1,1) to describe reliability of a single rater. I will show later that I think that icr11.ado is calculating an ICC(3,1). This is an easy mistake to make: the form of the equations is identical.

ICC(1,1) describes reliability in a study when each of a bunch of items (or subjects) is rated by a unique set of judges (or raters). That is, not all judges necessarily rate each item. This is an aspect of study design, and the ICC(1,1) can be ruled out quickly if your study did not use this design.

ICC(2,1) assumes a random set of judges from a population of judges have each rated all the items. For example, the same judges rate cognitive load for every item in a functional battery. This is our situation in SAGES.

ICC(3,1) is similar, but judges are fixed effects because we have sampled all possible judges (example: all 50 states vote on a constitutional amendment; there are 50 states and we have no need to generalize to a 51st state. So the judges, the states, are fixed effects).

Depending on one’s explicit purpose, ICC(2,1) and ICC(3,1) can be calculated together: the former can be described more as a measure of agreement while the latter measures consistency across judges. ICC(3,1) is usually larger because it does not treat the judges as a random sample from a population of judges, so between-judge variance does not count against it.

In addition to cases 1, 2, 3, we can describe reliability of an individual rater, ICC( ,1), or of the mean rating of judges, ICC( ,k). This is an interpretative issue, but I would think we want to describe the mean rating among a set of judges in SAGES, since we will use the composites and not individual judge ratings later on for Thurstone scaling. Thus, ICC(2,k).

Shrout and Fleiss (1979) provide a dense description of the ICC. They note (pg 423-424), "It is not likely that ICC(2,1) or ICC(3,1) will ever be erroneously used in a case 1 study, since the appropriate mean squares would not be available. The misuse of ICC(1,1) on data from Case 2 or Case 3 studies is more likely. A consequence of this mistake is the underestimation of the true correlation..."
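For reference, the six Shrout and Fleiss (1979) estimators can be written in terms of the ANOVA mean squares. The symbols are shorthand assumed here: BMS is the between-targets mean square, WMS the within-targets mean square, JMS the between-judges mean square, EMS the residual mean square, n the number of targets, and k the number of judges. The formulas match the hand calculations in the Stata log in this document.

```latex
\begin{align*}
\mathrm{ICC}(1,1) &= \frac{BMS - WMS}{BMS + (k-1)\,WMS} &
\mathrm{ICC}(1,k) &= \frac{BMS - WMS}{BMS} \\[4pt]
\mathrm{ICC}(2,1) &= \frac{BMS - EMS}{BMS + (k-1)\,EMS + k\,(JMS - EMS)/n} &
\mathrm{ICC}(2,k) &= \frac{BMS - EMS}{BMS + (JMS - EMS)/n} \\[4pt]
\mathrm{ICC}(3,1) &= \frac{BMS - EMS}{BMS + (k-1)\,EMS} &
\mathrm{ICC}(3,k) &= \frac{BMS - EMS}{BMS}
\end{align*}
```

ICC(1,1) and ICC(3,1) differ only in which error mean square appears (WMS versus EMS), which is why one is easily computed in place of the other.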


Here are ICCs based on data provided in Table 2 of Shrout and Fleiss (1979) on ratings for 6 targets from 4 judges. The correct ICCs are provided in Table 4 of their paper. The calculations agree (we also validated the equations using data from Shrout’s chapter in Psychiatric Epidemiology).

. webuse judges
(Ratings of targets by judges)

. anova rating judge target

                     Number of obs =      24     R-squared     = 0.9095
                     Root MSE      = 1.00968     Adj R-squared = 0.8612

          Source |  Partial SS    df       MS           F     Prob > F
      -----------+----------------------------------------------------
           Model |  153.666667     8  19.2083333      18.84     0.0000
                 |
           judge |  97.4583333     3  32.4861111      31.87     0.0000
          target |  56.2083333     5  11.2416667      11.03     0.0001
                 |
        Residual |  15.2916667    15  1.01944444
      -----------+----------------------------------------------------
           Total |  168.958333    23  7.34601449

. * note: JMS was tough.
. * From SF1979, WMS=ems + (jms-ems)/n.
. * Check out table1, middle column, do algebra
. local bms = (`e(ss_2)' / `e(df_2)')
. local ems = (`e(rss)'/`e(df_r)')
. local jms = (`e(ss_1)'/`e(df_1)')
. local wms = (`e(rss)'/`e(df_r)')          /// 6.26 in SF1979
>     + (`e(ss_1)'/`e(df_1)'                ///
>     - `e(rss)' / `e(df_r)')/(`e(df_2)'+1)
.
. * ICC(1,1). should be 0.17, per Table 4.
. display "ICC(1,1): " _c
ICC(1,1):
. display (`bms' - `wms' )                  ///
>     / (`bms' + `e(df_1)'*(`wms'))
.16574177
.
. * ICC(2,1). should be 0.29, per Table 4.
. display "ICC(2,1): " _c
ICC(2,1):
. display (`bms' - `ems' )                  ///
>     / (`bms'                              ///
>     + `e(df_1)' * `ems'                   ///
>     + (`e(df_1)'+1)*(`jms' - `ems') / (`e(df_2)'+1) )
.28976378
.
. * ICC(3,1). should be 0.71, per Table 4.
. display "ICC(3,1): " _c
ICC(3,1):
. display (`bms' - `ems' )                  ///
>     / ( `bms'                             ///
>     + `e(df_1)' * `ems' )
.71484071
.
. * ICC(1,k). should be 0.44, per Table 4.
. display "ICC(1,k): " _c
ICC(1,k):
. display (`bms' - `wms' ) / `bms'
.44279713
.
. * ICC(2,k). should be 0.62, per Table 4.
. display "ICC(2,k): " _c
ICC(2,k):
. display (`bms' - `ems' )                  ///
>     / (`bms'                              ///
>     + (`jms' - `ems')/(`e(df_2)'+1) )
.62005055
.
. * ICC(3,k). should be 0.91, per Table 4.
. display "ICC(3,k): " _c
ICC(3,k):
. display (`bms' - `ems' ) / (`bms')
.90931554
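The log comment "WMS=ems + (jms-ems)/n" leaves the algebra to the reader. A sketch of that step follows, assuming a balanced design with exactly one rating per judge-target cell, so that the within-target sum of squares is the judge sum of squares plus the residual sum of squares, on n(k-1) degrees of freedom. The numbers are the mean squares from the anova output (n = 6, k = 4).

```latex
\begin{align*}
WMS &= \frac{SS_{\text{judge}} + SS_{\text{residual}}}{n(k-1)}
     = \frac{(k-1)\,JMS + (n-1)(k-1)\,EMS}{n(k-1)} \\[4pt]
    &= \frac{JMS + (n-1)\,EMS}{n}
     = EMS + \frac{JMS - EMS}{n} \\[4pt]
    &= 1.01944 + \frac{32.48611 - 1.01944}{6} \approx 6.264
\end{align*}
```

The same value comes directly from the sums of squares, (97.4583 + 15.2917)/18 = 6.264, which agrees with the "6.26 in SF1979" note in the log.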

So, what is icr11.ado doing? It appears to be calculating ICC(3,1). This is usually larger than ICC(1,1) and likely larger than, but similar in magnitude to, ICC(2,1), since ICC(2,1) carries the additional uncertainty of random raters. This is an easy mistake to make by using EMS from the two-way ANOVA where WMS belongs, because the equations are otherwise the same.

. version 10
. icr11 , rating(rating) rater(judge) case(target) anova
(Using anova)

                     Number of obs =      24     R-squared     = 1.0000
                     Root MSE      =       0     Adj R-squared =

          Source |  Partial SS    df       MS           F     Prob > F
    -------------+----------------------------------------------------
           Model |  168.958333    23  7.34601449
                 |
          target |  56.2083333     5  11.2416667
           judge |  97.4583333     3  32.4861111
    target*judge |  15.2916667    15  1.01944444
                 |
        Residual |           0     0
    -------------+----------------------------------------------------
           Total |  168.958333    23  7.34601449

ICR(1,1) = 0.715

The intraclass correlation for a single rater [ICR(1,1)] describes the reliability of a single randomly selected rater. The result can be interpreted as the percent of the variance of a single rater's ratings that are attributable to systematic differences between cases.
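As a quick check on that reading, the two candidate formulas can be evaluated from the mean squares printed in the icr11 table. This is a minimal sketch meant to be run from a do-file. The scalar names are made up for illustration, and treating the target*judge mean square as EMS is an assumption about what icr11 does internally, not something its output states.

```stata
* Mean squares copied from the icr11 ANOVA table
scalar bms = 11.2416667        // target MS
scalar jms = 32.4861111        // judge MS
scalar ems = 1.01944444        // target*judge MS (assumed to play the role of EMS)
scalar n_t = 6                 // number of targets
scalar k_j = 4                 // number of judges

* Within-target mean square, as in the hand calculation
scalar wms = ems + (jms - ems)/n_t

* The two single-rater formulas differ only in WMS versus EMS
scalar icc11 = (bms - wms)/(bms + (k_j - 1)*wms)
scalar icc31 = (bms - ems)/(bms + (k_j - 1)*ems)

display "ICC(1,1) = " %5.3f icc11
display "ICC(3,1) = " %5.3f icc31
```

The first line displays 0.166 and the second 0.715. Only the second matches the 0.715 that icr11 labels ICR(1,1).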

What do these ICCs look like for the SAGES functional composites study?

. use $derived/fxncomp-208-ratings.dta, clear
. quietly foreach t in 1 2 3 {

Composite type 1
ICC(1,1): .61593358
ICC(2,1): .62128812
ICC(3,1): .7219388
ICC(1,k): .94130478
ICC(2,k): .94254623
ICC(3,k): .96291256

Composite type 2
ICC(1,1): .66668327
ICC(2,1): .66868278
ICC(3,1): .71135592
ICC(1,k): .95238434
ICC(2,k): .95279134
ICC(3,k): .96100565

Composite type 3
ICC(1,1): .68435392
ICC(2,1): .68594985
ICC(3,1): .72247879
ICC(1,k): .95591034
ICC(2,k): .95622109
ICC(3,k): .96300856


Bonus material: Cronbach's alpha is mathematically equivalent to the ICC for the mean of multiple observations with fixed raters/items, ICC(3,k).

. foreach type in 1 2 3 {
  2.     display "Composite type `type'"
  3.     preserve
  4.     keep if type==`type'
  5.     drop stub u lab name
  6.     reshape wide nu, i(item) j(raterid)
  7.     alpha nu*
  8.     restore
  9. }

Composite type 1
(500 observations deleted)
(note: j = 1 2 3 4 5 6 7 8 9 10)

Data                               long   ->   wide
-----------------------------------------------------------------
Number of obs.                      250   ->   25
Number of variables                   4   ->   12
j variable (10 values)          raterid   ->   (dropped)
xij variables:
                                     nu   ->   nu1 nu2 ... nu10
-----------------------------------------------------------------

Test scale = mean(unstandardized items)

Average interitem covariance:     .0679642
Number of items in the scale:           10
Scale reliability coefficient:      0.9629

Composite type 2
(500 observations deleted)
(note: j = 1 2 3 4 5 6 7 8 9 10)

Data                               long   ->   wide
-----------------------------------------------------------------
Number of obs.                      250   ->   25
Number of variables                   4   ->   12
j variable (10 values)          raterid   ->   (dropped)
xij variables:
                                     nu   ->   nu1 nu2 ... nu10
-----------------------------------------------------------------

Test scale = mean(unstandardized items)

Average interitem covariance:     .0748336
Number of items in the scale:           10
Scale reliability coefficient:      0.9610

Composite type 3
(500 observations deleted)
(note: j = 1 2 3 4 5 6 7 8 9 10)

Data                               long   ->   wide
-----------------------------------------------------------------
Number of obs.                      250   ->   25
Number of variables                   4   ->   12
j variable (10 values)          raterid   ->   (dropped)
xij variables:
                                     nu   ->   nu1 nu2 ... nu10
-----------------------------------------------------------------

Test scale = mean(unstandardized items)

Average interitem covariance:     .1143844
Number of items in the scale:           10
Scale reliability coefficient:      0.9628
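The same equivalence can be checked on the public Shrout and Fleiss data rather than the SAGES file. The sketch below assumes the judges dataset contains only the variables rating, target, and judge (any other variable that varies within target would make the reshape fail), and it relies on alpha leaving its estimate in r(alpha).

```stata
webuse judges, clear

* One row per target, one rating column per judge (rating1 ... rating4)
reshape wide rating, i(target) j(judge)

* Cronbach's alpha, treating the four judges as the "items"
alpha rating*
display "alpha = " %6.4f r(alpha)
```

If the equivalence holds as stated, the coefficient should agree with the ICC(3,k) of about 0.909 computed by hand for these data.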


There’s an official Stata ado, icc.ado, that calculates all the ICCs from Shrout and Fleiss and has some other options to confuse you further:

. webuse judges
(Ratings of targets by judges)

. * ICC(1,1 and k )
. icc rating target

Intraclass correlations
One-way random-effects model
Absolute agreement

Random effects: target           Number of targets =         6
                                 Number of raters  =         4

--------------------------------------------------------------
                rating |        ICC       [95% Conf. Interval]
-----------------------+--------------------------------------
            Individual |   .1657418      -.1329323    .7225601
               Average |   .4427971      -.8844422    .9124154
--------------------------------------------------------------
F test that
  ICC=0.00: F(5.0, 18.0) = 1.79               Prob > F = 0.165

Note: ICCs estimate correlations between individual measurements
      and between average measurements made on the same target.

. * ICC(2,1 and k )
. icc rating target judge, absolute

Intraclass correlations
Two-way random-effects model
Absolute agreement

Random effects: target           Number of targets =         6
Random effects: judge            Number of raters  =         4

--------------------------------------------------------------
                rating |        ICC       [95% Conf. Interval]
-----------------------+--------------------------------------
            Individual |   .2897638       .0187865    .7610844
               Average |   .6200505       .0711368     .927232
--------------------------------------------------------------
F test that
  ICC=0.00: F(5.0, 15.0) = 11.03              Prob > F = 0.000

Note: ICCs estimate correlations between individual measurements
      and between average measurements made on the same target.

. * ICC(3,1 and k )
. icc rating target judge, consistency

Intraclass correlations
Two-way random-effects model
Consistency of agreement

Random effects: target           Number of targets =         6
Random effects: judge            Number of raters  =         4

--------------------------------------------------------------
                rating |        ICC       [95% Conf. Interval]
-----------------------+--------------------------------------
            Individual |   .7148407       .3424648    .9458583
               Average |   .9093155       .6756747    .9858917
--------------------------------------------------------------
F test that
  ICC=0.00: F(5.0, 15.0) = 11.03              Prob > F = 0.000
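Shrout and Fleiss's Case 3 treats judges as fixed effects, which in icc corresponds to the mixed option rather than the two-way random-effects model with consistency used in the log. The point estimates should be the same either way, because the consistency estimator uses the same mean squares. A minimal sketch, assuming Stata 12 or later and that icc leaves its estimates in r(icc_i) and r(icc_avg):

```stata
webuse judges, clear

* Two-way mixed-effects model: targets random, judges fixed.
* Consistency of agreement is the default under the mixed option.
icc rating target judge, mixed

display "ICC(3,1) = " %5.3f r(icc_i)
display "ICC(3,k) = " %5.3f r(icc_avg)
```

Which model label to report is a design question (fixed versus random judges), not a numerical one, for these consistency-type coefficients.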
