Evaluating Equating Methods for a Manual and Computerized Card Sorting Task
Christopher Koch
George Fox University
Abstract
Method
[Figure panel: Color Incongruent Raw Times]
Participants
Fifty-three students from an elementary school and 18 residents of a local retirement facility volunteered to participate in the study. All participants had normal or corrected-to-normal visual acuity and normal color vision.
[Figure panel: Color Congruent Raw Times. Axis label: Tablet Percentile. Fitted line: NSCST = 0.760(Tablet) + 31.804, R² = 0.967]
[Figure axis label: log NSCST]
Measure
The Nonverbal Stroop Card Sorting Test (NSCST; Koch & Roid, 2012) was used. This test requires examinees to sort two decks of cards onto a display board. Pairs of color congruent blocks appear on one deck of cards, while pairs of color incongruent blocks appear on the other deck. The NSCST demonstrates good reliability and validity. A tablet version of the task was created in which color congruent or incongruent cards appear on the screen and can be either dragged or tapped into the appropriate color bin, as in the NSCST.
Different versions of the same test are typically equated so that scores can be interpreted similarly over time. Versions that differ in administration format may likewise need to be equated. For instance, the Nonverbal Stroop Card Sorting Task (NSCST; Koch and Roid, 2012) is a manual card sorting task. Although computerized versions of the task simulate the NSCST, their response times differ from those of the manual task (Koch and Hotovec, 2012). Two of the most widely accepted equating methods are linear and equipercentile equating. These two methods are compared for the NSCST using a sample (n = 30) of middle schoolers. Administration order of the NSCST and the computer tablet version of the test was randomized across participants. The two tests were equated based on raw scores and verified using standardized scores. Overall, the equipercentile method produced the best fit. Implications for equating performance measures are discussed.
[Figure: axis labels NSCST Percentile, Tablet Percentile, and log Tablet. Fitted line: NSCST = 2.434(log(Tablet)) - 2.984, R² = 0.965]
Objectives
Cognitive assessment has traditionally been done with paper-based tests. For instance, the digit span forward and backward tasks in the WAIS-IV require the examiner to read a string of numbers that the examinee then repeats. Some tasks, such as the picture completion task in the Leiter-3, require examinees to respond using manipulatives. This is also the case with sorting tasks (e.g., the Wisconsin Card Sorting Task and the Nonverbal Stroop Card Sorting Test).
Procedure
A Latin square was used to create four orders of the tasks (i.e., color congruent cards, color incongruent cards, color congruent tablet, and color incongruent tablet). Participants completed the orders in randomized blocks. Completion times and number of errors were recorded for each task.
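The counterbalancing described above can be illustrated with a cyclic Latin square, in which each task appears exactly once in every serial position. This is a minimal sketch only: the source does not say which Latin square was used, and the function name `latin_square` and the short task labels are hypothetical.

```python
def latin_square(conditions):
    """Return a cyclic Latin square: row i is the condition list rotated left by i."""
    n = len(conditions)
    return [[conditions[(i + j) % n] for j in range(n)] for i in range(n)]


# Hypothetical labels for the four tasks named in the Procedure.
tasks = [
    "congruent cards",
    "incongruent cards",
    "congruent tablet",
    "incongruent tablet",
]

orders = latin_square(tasks)
for number, order in enumerate(orders, start=1):
    # Each printed row is one administration order.
    print(f"Order {number}: " + " -> ".join(order))
```

A cyclic square balances serial position but not which task immediately precedes which; a balanced (Williams) design would be needed to control first-order carryover as well.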
A problem with tests utilizing manipulatives is that they can become cumbersome. Extra pieces need to be carried, re-sorted, stored, etc. Missing pieces make the tests undeliverable. Examinees with manual dexterity difficulties may also find the pieces hard to handle. Computerized versions of tests can eliminate many of these problems and have other benefits. For instance, computer-based tests provide standardized administration across examiners and can lead to immediate scoring. Web-based tests are easy to administer as long as there is internet connectivity. Tablet-based tests are easily portable.
Although computer-based testing can produce similar results to traditional formats for some tasks (e.g., memory tasks), performance-based tasks may yield slightly different scores. Tests for which response times are recorded are especially problematic. Response times with the computer can vary based on the type of mouse used (Plant and Turner, 2009). Similarly, response times can vary between keyboard and touch-screen responses (cf. Beaumont, 1985). This is true even when the device attributes and input requirements overlap (see McLaughlin, Rogers, & Fisk, 2009). These differences present a problem when determining scaled scores. If response times differ between formats, then the scaled scores will differ between formats as well. Therefore, scores from different versions need to be equated.
Discussion
This study was conducted to determine the relationship between a tablet version of the NSCST and the standard card version of the test. The results show that times from the tablet version can be used to accurately predict the standardized values for the NSCST. A linear relationship exists between the tablet and standard versions of the NSCST for the color congruent condition, whereas a log relationship exists between the two versions for the color incongruent condition.
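The contrast between a linear and a log relationship can be checked by fitting a least-squares line to the raw values and to their logarithms and comparing R². The sketch below is illustrative only: the data are synthetic (generated from a power law, not taken from the study), the helper `fit_line` is hypothetical, and base-10 logarithms are an assumption because the source does not state the base.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept; returns (slope, intercept, R^2)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return slope, intercept, 1 - ss_res / ss_tot


# Synthetic, hypothetical completion times in seconds (NOT the study's data).
tablet = [10.0, 20.0, 40.0, 80.0, 160.0]
cards = [2.0 * math.sqrt(t) for t in tablet]  # a power-law relationship

# Fit on the raw scale, then on the log-log scale (base 10 assumed).
lin_slope, lin_intercept, r2_linear = fit_line(tablet, cards)
log_slope, log_intercept, r2_log = fit_line(
    [math.log10(t) for t in tablet],
    [math.log10(c) for c in cards],
)

print(f"linear fit R^2:  {r2_linear:.3f}")
print(f"log-log fit R^2: {r2_log:.3f}")
```

For data like these, the log-log fit recovers the power-law exponent as its slope, which is why a log transform suits a curved relationship better than a straight line on the raw scale.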
References
Beaumont, J. G. (1985). Speed of response using keyboard and screen-based microcomputer response media. International Journal of Man-Machine Studies, 23, 61-70.
Koch, C., & Roid, G. (2012). Manual for the nonverbal Stroop card sorting task. Wood Dale, IL: Stoelting Company.
Results
The NSCST is normed within seven age groups: 3-4, 5-6, 7-9, 10-12, 13-18, 19-59, and 60 or older. Given the distribution of participants, this analysis is limited to the 10-12 age group. Percentile equating tends to be more accurate when the data are less discrete (Livingston, 2004). Because response (or completion) time is a continuous variable, times for both the congruent and incongruent conditions were converted to percentiles and then matched to standardized norms for the NSCST.
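The percentile-matching step described above can be sketched as follows: find the percentile of a tablet time within the tablet distribution, then read off the card-version time at that same percentile. This is a simplified illustration under stated assumptions, not the study's procedure: all function names are hypothetical, the times are made up, and a small reference sample stands in for the published NSCST norms.

```python
from bisect import bisect_right


def percentile_rank(value, sample):
    """Percentile (0-100) of value within sample, by linear interpolation between order statistics."""
    s = sorted(sample)
    if value <= s[0]:
        return 0.0
    if value >= s[-1]:
        return 100.0
    i = bisect_right(s, value) - 1  # s[i] <= value < s[i + 1]
    position = i + (value - s[i]) / (s[i + 1] - s[i])
    return 100.0 * position / (len(s) - 1)


def score_at_percentile(pct, sample):
    """Inverse of percentile_rank: the interpolated value of sample at percentile pct."""
    s = sorted(sample)
    position = pct / 100.0 * (len(s) - 1)
    lower = int(position)
    upper = min(lower + 1, len(s) - 1)
    return s[lower] + (position - lower) * (s[upper] - s[lower])


def equipercentile_equate(tablet_time, tablet_sample, card_sample):
    """Map a tablet time to the card-version time at the same percentile."""
    pct = percentile_rank(tablet_time, tablet_sample)
    return score_at_percentile(pct, card_sample)


# Made-up completion times in seconds (NOT the study's data or norms).
tablet_times = [30, 35, 40, 50, 60]
card_times = [50, 55, 62, 70, 85]

# 40 s is the tablet median, so it maps to the card median (62 s).
print(equipercentile_equate(40, tablet_times, card_times))
```

With real data the reference distribution would be much larger, and operational equipercentile equating usually smooths the distributions before matching them.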
Livingston, S. (2004). Equating test scores (without IRT). ETS Report. Princeton, NJ: Educational Testing Service.
McLaughlin, A. C., Rogers, W., & Fisk, A. D. (2009). Using direct and indirect input devices: Attention demands and age-related differences. ACM Transactions on Computer-Human Interaction, 16, 1-15.
Plant, R. R., & Turner, G. (2009). Millisecond precision psychological research in a world of commodity computers: New hardware, new problems? Behavior Research Methods, 41, 598-614.
Poster presented at the 46th Annual Meeting of the Society for Computers in Psychology