BOLLETTINO DI PSICOLOGIA APPLICATA, 2009, 257, 43-52


EXPERIENCES AND TOOLS

MICHELANGELO VIANELLO, PASQUALE ANSELMI & EGIDIO ROBUSTO

Analysis of evaluative attributes in a race IAT
Analisi di Rasch delle risposte date a un IAT [Rasch analysis of the responses given to an IAT]

Introduction

Although the study of automatic processes dates back to the end of the 1800s (see, for example, the work of Binet, 1896; Carpenter, 1888; and James, 1889), research on psycho-social phenomena has grown exponentially since the end of the 1970s (e.g. Higgins, Rholes & Jones, 1977). This period marked the beginning of a remarkable development of theoretical knowledge on the subject. Yet it took another decade before individual differences in implicit cognition could be measured. Implicit measures were created as an attempt to circumvent the corrective processes that affect explicit measures (e.g. questionnaires) and that are due to social desirability or other cultural determinants. Implicit techniques do not rely on introspection or on any other conscious assessment that individuals can make of the construct under investigation, because the implicit measure is based on individual performance in categorization or recognition tasks.

The attractive prospect of measuring unconscious psychological aspects, and especially of doing so “without asking”, has encouraged the development of a large number of implicit techniques. Two of them stand out: Evaluative Priming (Fazio, Sanbonmatsu, Powell & Kardes, 1986) and the Implicit Association Test (IAT; Greenwald, McGhee & Schwartz, 1998). The latter has recently received special attention: more than 200 articles currently mention the use of this method, more than 8 million IATs have been completed worldwide, and a dozen variants of the procedure have been constructed. The IAT is used in personality psychology (Asendorpf, Banse & Mücke, 2002), clinical psychology (Teachman, Gregg & Woody, 2001), consumer psychology (Maison, Greenwald & Bruin, 2004), health psychology (Wiers, Van Woerden, Smulders & De Jong, 2002), gerontology (Hummert, Garstka, O’Brien, Greenwald & Mellott, 2002), neuropsychology (Cunningham, Raye & Johnson, 2005) and, more surprisingly, orthodontics (Orsini, Huang, Kiyak, Ramsay, Bollen, Anderson & Giddon, 2006).

The Implicit Association Test

The IAT is the only implicit technique that presents all the characteristics of a real psychological test. Its properties are known, some publications serve as a manual (e.g. Lane, Banaji, Nosek & Greenwald, 2007) and others as a normative sample (Nosek, Smyth, Hansen, Devos, Lindner, Ranganath, Smith, Olson, Chugh, Greenwald & Banaji, 2007). The materials used in an IAT can be grouped into four categories, defined by category labels (e.g. Men, Women, Good, Bad) and by stimuli that serve as exemplars of these categories (e.g. faces of men and women, and words with good and bad meanings). In most IATs, the four categories form opposing pairs, sometimes distinguished as target concepts (e.g. Men-Women) and attributes (e.g. Good-Bad). The two dimensions (e.g. “Gender” and “Evaluation”) generally define the nominal categories of interest, which generate the combined identification tasks (Greenwald, Nosek, Banaji & Klauer, 2005). The IAT effect is a relative measure of the association between these two nominal categories: it compares the associative strength of two contrasted pairs (Men with Good, Women with Bad) with the associative strength of the two opposite pairs (Men with Bad, Women with Good). In this example the score has a simple interpretation: it is the automatic preference for men over women.

The choice of the nominal categories and of the stimuli that represent them is therefore a crucial aspect of building an IAT, because it influences the validity of the procedure and defines the nature of the implicit construct that is measured. The chosen stimuli must: a) adequately represent the superordinate category and b) not be weak or peripheral exemplars of that category. Steffens & Plewe (2001) suggest that the stimuli should be easily identifiable as representatives of one and only one category. For example, using an item such as “warm” or “aggressive” in a gender IAT can create confusion about the dimension it belongs to (is it to be categorized on the Male-Female axis or on the Good-Bad axis?). Another important aspect of collecting the stimuli is to ensure that they represent the nominal categories, and not others. For example, if the faces of women and men belong to different races, their categorical membership is


clear, but participants may respond on the basis of an irrelevant category (ethnicity). Nosek, Greenwald & Banaji (2005) observed that the size and reliability of the IAT effect do not depend strictly on the number of items used to represent the construct (as long as K>2). The same authors noted that a difference in length between the evaluative attributes may adversely affect the measure. Nosek, Greenwald & Banaji (2007) suggest that good words must be longer than 5 letters and the bad ones shorter than 10.

Despite its importance, the choice of stimuli for a newly created IAT does not always receive enough attention, nor are data-analysis procedures for checking the chosen material usually described. This study aims to examine whether, and if so how, the recognizability of the evaluative attributes influences the IAT effect. Indeed, it is possible to hypothesize that an imbalance in their recognizability causes an unwanted shift in the IAT effect. The procedure used for scoring the IAT (D; Greenwald, Nosek & Banaji, 2003) prescribes dividing the difference in performance between critical blocks (such as Women/Good and Women/Bad) by the standard deviation of all the responses given by the participant. So if negative stimuli were less recognizable than positive stimuli, the standard deviation would increase and, consequently, all D scores would move closer to zero.

The analyses conducted on the recognizability of the stimuli will also provide information on the different intensity of the four associations involved in the IAT effect (e.g. White/Black or Good/Bad), and will make it possible to identify the evaluative attributes (e.g. beautiful or ugly) that contributed most to the emergence of the IAT effect. The possibility of breaking the IAT effect down into sub-components will then be discussed.
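The attenuation argument can be illustrated with a minimal sketch. It assumes a simplified D, namely the difference between the mean latencies of the two critical blocks divided by the standard deviation of all latencies pooled over both blocks; the complete scoring algorithm of Greenwald et al. (2003) involves further steps that are not reproduced here. The function name `simplified_d` and all latencies are invented for illustration.

```python
from statistics import mean, stdev


def simplified_d(compatible_ms, incompatible_ms):
    """Simplified D: mean latency difference divided by the pooled SD.

    Illustrative assumption, not the full published scoring algorithm.
    """
    pooled_sd = stdev(compatible_ms + incompatible_ms)
    return (mean(incompatible_ms) - mean(compatible_ms)) / pooled_sd


# Hypothetical latencies (ms) of one participant in the two critical blocks.
compatible = [600, 650, 700, 620, 680]
incompatible = [800, 850, 900, 820, 880]
d_balanced = simplified_d(compatible, incompatible)

# Same block means as above, but the incompatible block is more variable,
# as it might be if some of its stimuli were harder to recognize.
noisy_incompatible = [600, 750, 1100, 720, 1080]
d_noisy = simplified_d(compatible, noisy_incompatible)

print(f"D with homogeneous latencies:   {d_balanced:.2f}")
print(f"D with heterogeneous latencies: {d_noisy:.2f}")
```

Because the mean difference is identical in the two cases, the smaller second value is due only to the larger pooled standard deviation, which is the mechanism hypothesized in the text.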

Method

Participants

The study involved 116 undergraduate students of the University of Padua, who participated without reward. Data were collected in sessions with a maximum of 10 participants, seated about 5 meters from each other.

Materials and procedure

Participants completed an IAT on racial prejudice, following the procedure described in Greenwald et al. (2003) (see Figure 1). The task was presented as a lexical decision test. Participants were asked to respond as quickly and accurately as possible. The category labels used were “White”, “Black”, “Good Words” and “Bad Words”. The stimuli consisted of 12 faces, 6 of white people (3 men, 3 women) and 6 of black people (3 men, 3 women), and 20 evaluative attributes, of which 10 had a positive meaning (attractive, awesome, beautiful, best, excellent, good, mellow, nice, pleasant, wonderful) and 10 had a negative meaning (annoying, bad, evil, foul, hateful, horrible, nasty, pesky, sickening, ugly). The positive and negative sets of attributes comprised nearly the same number of letters (82 and 81, respectively), since a difference in length could influence their recognizability. Appendix 1 provides the original and translated words. The stimuli appeared in white on a black background, and had to be assigned to the category labels on the left

Figure 1 Blocks of trials provided in a race IAT

Block  Type        N° of trials  Left labels                          Right labels
1      (practice)  20            Good Words                           Bad Words
2      (practice)  20            Faces of White People                Faces of Black People
3      (test)      20            Faces of White People + Good Words   Faces of Black People + Bad Words
4      (test)      40            Faces of White People + Good Words   Faces of Black People + Bad Words
5      (practice)  40            Faces of Black People                Faces of White People
6      (test)      20            Faces of Black People + Good Words   Faces of White People + Bad Words
7      (test)      40            Faces of Black People + Good Words   Faces of White People + Bad Words

Note. Blocks 3 and 4 were counterbalanced across participants with blocks 6 and 7.


or on the right of the screen using the “E” and “I” keys, respectively. A red cross appeared after a mistake and disappeared only once the response had been corrected. The order of presentation of the critical blocks (compatible and incompatible) was counterbalanced.

Data Analysis: model and strategy

The data were analyzed with a Many-Facet Rasch Model (MFRM) using the software Facets v. 3.63. Being a Rasch model, the MFRM shares its characteristics: stochastic independence, specific objectivity, linearity and a measurement unit (for a discussion of these properties see Andrich, 1988 and Cristante & Mannarini, 2004). A MFRM analysis of an IAT offers further benefits: a) all facets are located on the same latent trait, allowing comparisons between elements; b) the measures are independent of the distributions of participants and items; c) fit statistics provide detailed tests of the adherence of the data to the model and contribute substantially to the interpretation of the results. In formal terms, the Many-Facet model derives from the Simple Logistic Model (SLM; Rasch, 1960/1980), which is the traditional Rasch model and takes the following mathematical form:

$$ P(X_{ni} = x_{ni} \mid \beta_n, \delta_i) = \frac{\exp[x_{ni}(\beta_n - \delta_i)]}{1 + \exp(\beta_n - \delta_i)} \qquad (1) $$

where:
– $x_{ni}$ is 1 if the answer is correct, 0 if the answer is incorrect;
– $\beta_n$ is the ability of subject n;
– $\delta_i$ is the difficulty of item i.

The SLM expresses, according to a logistic distribution, the probability of obtaining a response as a function of the subject’s ability and the item’s difficulty ($\beta_n - \delta_i$). The more able the subject and the easier the item, the greater the likelihood of a correct answer; the less able the subject and the more difficult the item, the greater the likelihood of a wrong answer. Computing the ratio between the probabilities associated with the events “correct answer” and “wrong answer”, and taking the logarithm of this ratio, we obtain:

$$ \ln \frac{P(X_{ni} = 1 \mid \beta_n, \delta_i)}{P(X_{ni} = 0 \mid \beta_n, \delta_i)} = \ln \frac{\exp(\beta_n - \delta_i) / [1 + \exp(\beta_n - \delta_i)]}{1 / [1 + \exp(\beta_n - \delta_i)]} = \beta_n - \delta_i \qquad (2) $$

which represents the SLM in the Many-Facet formulation (Linacre, 1989). This formalization allows new parameters (facets), other than participants’ ability and items’ difficulty, to be entered in the model to account for the likelihood that a particular response is given to an item. Consequently, in a MFRM analysis of the IAT, we can introduce a new parameter ($\gamma_j$) accounting for the association condition (compatible or incompatible) in which the item was presented, and another parameter ($\theta_k$) that controls the response scale (k = 1, 2, 3, 4, 5), obtained by discretizing response times. The model used to analyze the data is then represented by equation (3).

$$ \ln \frac{P_{nijk}}{P_{nij(k-1)}} = \beta_n - \delta_i - \gamma_j - \theta_k \qquad (3) $$

In summary, as far as responses to an IAT are concerned, the model specifies that the probability that person n gives response k rather than k-1 to item i in condition j depends on the additive effects of the person’s speed in the proposed categorization task ($\beta_n$), of the ease of recognition of the item ($\delta_i$), of the difficulty of the condition ($\gamma_j$) and of the difficulty of giving response k rather than response k-1 ($\theta_k$).

Relevant indices in a Many-Facet Rasch model

For a detailed description of the indices and statistics used in Many-Facet Rasch models, we refer the interested reader to Myford & Wolfe (2003). We now provide a brief explanation of their meaning. Fit indices show how closely the data associated with a parameter conform to what the model predicted. They are based on residuals and are to be interpreted as the total amount of misfit (MS) attributable to each element of a facet. Consequently, they are based on the differences between observed and expected responses, calculated for each subject and each item. The mean squares of standardized residuals can be weighted (Infit) or unweighted (Outfit); Outfit indices are more sensitive to outliers. Indices equal or close to one indicate that observed and expected responses correspond closely; values higher than one indicate the presence of unmodeled variance in the data. Fit indices lower than one indicate overpredictability of the data, that is, less variance than the model expected. Generally, values higher than one are more dangerous for the quality of measurement. Several ranges have been proposed to interpret Infit and Outfit statistics (Linacre, 2004; Myford & Wolfe, 2003; Smith, Schumacker & Bush, 1998); among these, we chose one of the most restrictive (.6
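Equation (3) specifies the model only through the log-odds of adjacent response categories. As a minimal sketch, the probabilities of the five categories can be recovered by accumulating those log-odds from the lowest category upward and normalizing. The function name, the parameter values and the convention that `thresholds` holds one value per step between categories (theta_2 to theta_5) are assumptions made for illustration; they are not estimates from this study, and this is not how Facets is implemented.

```python
import math


def category_probabilities(beta, delta, gamma, thresholds):
    """Category probabilities implied by the adjacent-category logit
    ln(P_k / P_(k-1)) = beta - delta - gamma - theta_k.

    thresholds lists theta_k for each step between adjacent categories,
    so the result has len(thresholds) + 1 entries, lowest category first.
    """
    # Log of each unnormalized probability; the lowest category is the reference.
    log_numerators = [0.0]
    for theta in thresholds:
        step = beta - delta - gamma - theta
        log_numerators.append(log_numerators[-1] + step)
    # Subtract the maximum before exponentiating, for numerical stability.
    peak = max(log_numerators)
    weights = [math.exp(value - peak) for value in log_numerators]
    total = sum(weights)
    return [weight / total for weight in weights]


# Hypothetical values in logits, chosen only for illustration.
THRESHOLDS = [-1.0, -0.3, 0.3, 1.0]
probs = category_probabilities(beta=0.5, delta=0.1, gamma=-0.2,
                               thresholds=THRESHOLDS)
for category, p in enumerate(probs, start=1):
    print(f"P(response = {category}) = {p:.3f}")
```

With a single threshold equal to zero and gamma equal to zero, the function reduces to the dichotomous SLM of equation (1).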


ratio between the “true” standard deviation (i.e. the standard deviation corrected for measurement error) and the average standard error of the parameters, thus: G = Adj SD / RMSE. The Separation Reliability (R) shows how reproducibly different the measures are. It is computed as the ratio of “true” variance to “observed” variance for the elements of the facet. As far as persons are concerned, it is equivalent to Cronbach’s alpha or K-R 20. This index is closely linked to the previous one: R = G²/(1 + G²). If R<.5, it is likely that the value of G is completely due to measurement error. Generally, high person and item reliabilities are preferred. The Fixed (all same) chi-square tests the assumption that all elements of a facet have the same logit, given the measurement error (SE).

Results

A three-facet model (participants, conditions, stimuli) for polytomous responses, formally represented in equation (3), was applied to the responses given to the evaluative attributes in the critical blocks. A bias term was introduced to assess the interaction between the facets “stimuli” and “conditions”. In this way, we obtain for each item an estimate of its difference in recognizability between the two conditions. Responses shorter than 300 ms and longer than 10000 ms were excluded from the analyses, and response times were discretized according to quintiles computed on the full data matrix. Therefore, our dependent variable takes the values 5, 4, 3, 2 and 1 to identify, respectively, very short, short, medium, long and very long response times. The analysis provided, for each element of every facet, its location on the latent trait “Response Speed” (in logits), and a series of statistical indices useful for evaluating the facet and its elements. Table 2 shows a graphical representation of how the elements of the three facets are located on the latent trait “Response Speed”, which orders the elements according to the stimuli’s “recognizability”, the participants’ ability and, in the case of conditions (compatible and incompatible blocks), the ease of the task. Estimates of the stimuli and condition parameters were centered around zero. The two conditions (compatible and incompatible blocks) have excellent fit indices (.96
facet I with K observations, df = (N – 1 + K – 1). Once a t value and its degrees of freedom have been obtained, a Cohen’s d can also be computed using the following formula:

$$ d = \frac{2t}{\sqrt{df}} $$

(see Rosenthal & Rosnow, 1991). For example, one can easily observe that the item “beautiful” is significantly more recognizable than “excellent” (t(462) = (.22 – .06) / √(.06² + .06²) = 2, p<.05) but not more than “attractive” (t(462) = 1.88, p = .06). The same procedure can be used to test the difference between the averages of two groups of elements. Since the positive and negative attributes used throughout the IAT have nearly the same number of letters (respectively 81 and 82), we expect them to be equally recognizable. Indeed, the difference in recognizability between all the positive attributes and all the negative attributes is not significant (t(19) = 1.39, p = .18).

Going into further detail, it is possible to break the IAT effect down into the individual components supplied by each evaluative attribute. This is done by analyzing the difference in each attribute’s recognizability between the two conditions, which can also be expressed in terms of a Cohen’s d (see Figure 1). In this study, the items “pleasant” and “excellent” are recognized much more easily in the compatible condition (respectively, d = .30, t(230) = 2.27, p<.05; d = .29, t(230) = 2.18, p<.05). “Pleasant” and “excellent” therefore represent the attributes that contribute most to the emergence of the IAT effect. The item “annoying”, on the contrary, does not contribute to the emergence of the effect, because its difference in recognizability between conditions is far below the average bias of the other stimuli (d = .26, t(230) = –2, p<.05). We

ratio between the standard “true” deviation (i.e. the standard deviation corrected for measurement error) and the average standard error of the parameters, then: G = Adj SD / RMSE. The Separation Reliability (R) shows how reproducibly different the measures are. It is computed as the ratio of “True variance” on “Observed variance” for the elements of the facet. As far as the persons are concerned, it is an equivalent to Cronbach’s alpha or K-R 20. This index is closely linked to the previous one, indeed R = G2/(1 + G2). If R<.5, it is likely that the value of G is completely due to measurement error. Generally, high person and item reliabilities are preferred. The Fixed (all same) chi-square leads the assumption that all elements of facet have the same logit, in relation to measurement error (SE). Results A three-facet model (participants, conditions, stimuli) for polytomous responses, formally represented in equation (3), was applied to the answers given to evaluative attributes in critical blocks. A bias term was introduced for assessing the interaction between the facets “stimuli” and “conditions”. In this way, we will obtain for each item an estimate of its different recognizability in the two conditions. Responses lower than 300 ms. and greater than 10000 ms. have been excluded from the analyses and response times were discretized according to quintiles computed on the full data matrix. Therefore, our dependent variable presents values 5, 4, 3, 2 and 1 to identify respectively very short, short, medium, long or very long response times. The analysis provided, for each element of every facet, its location

46

on the latent trait “Response Speed” (in logits), and a series of statistical indices useful to the facet and its components. Table 2 shows a graphical representation of how the various elements of the three facets are considered on the latent trait “Response Speed”, which locates the elements according to stimuli’s “recognizability”, participants’ ability and ease of the task in the case of conditions (compatible and incompatible blocks). Estimates of stimuli and conditions parameters were centered around zero. The two conditions (compatible and incompatible blocks) have excellent fit indices (.96
facet I with K observations, df = (N – 1 + K – 1). Once a t value and its degrees of freedom have been obtained, a Cohen’s d can also be computed using the following formula:

$$d = \frac{2t}{\sqrt{df}}$$

(see Rosenthal & Rosnow, 1991). For example, one can easily observe that the item “beautiful” is significantly more recognizable than “excellent” ($t(462) = (.22 - .06)/\sqrt{.06^2 + .06^2} = 2$, p < .05) but not more than “attractive” (t(462) = 1.88, p = .06). The same procedure can be used to test the difference between the averages of two groups of elements. Since the positive and negative attributes used in the IAT have almost the same total number of letters (81 and 82, respectively), we expect them to be equally recognizable. Indeed, the difference in recognizability between the positive and the negative attributes is not significant (t(19) = 1.39, p = .18). Going into further detail, it is possible to break the IAT effect down into the individual contributions of each evaluative attribute. This is done by analyzing the difference in each attribute’s recognizability between the two conditions, which can also be expressed in terms of a Cohen’s d (see Figure 1). In this study, the items “pleasant” and “excellent” are recognized much more easily in the compatible condition (respectively, d = .30, t(230) = 2.27, p < .05; d = .29, t(230) = 2.18, p < .05). “Pleasant” and “excellent” are therefore the attributes that contribute most to the emergence of the IAT effect. The item “annoying”, on the contrary, does not contribute to the emergence of the effect, because its difference in recognizability is far below the average bias of the other stimuli (d = .26, t(230) = –2, p < .05). We


Table 1 Evaluative attributes’ recognizability across conditions and in each of them

| Attribute | Across blocks: observed score | Across blocks: δ | Compatible block: observed score | Compatible block: δ | Incompatible block: observed score | Incompatible block: δ | t(230) | Cohen’s d |
|---|---|---|---|---|---|---|---|---|
| Pleasant | 713 | .05 | 436 | .18 | 277 | −.07 | 2.27* | .30 |
| Excellent | 714 | .06 | 436 | .18 | 278 | −.06 | 2.18* | .29 |
| Beautiful | 764 | .22 | 456 | .33 | 308 | .13 | 1.67° | .22 |
| Evil | 685 | −.04 | 412 | .02 | 273 | −.10 | 1.00 | .13 |
| Hateful | 702 | .02 | 419 | .06 | 283 | −.03 | .82 | .11 |
| Good | 727 | .10 | 430 | .14 | 297 | .06 | .72 | .09 |
| Attractive | 719 | .07 | 425 | .10 | 294 | .04 | .54 | .07 |
| Nice | 664 | −.11 | 397 | −.08 | 267 | −.14 | .54 | .07 |
| Awesome | 705 | .03 | 415 | .04 | 290 | .02 | .18 | .02 |
| Wonderful | 704 | .02 | 412 | .02 | 292 | .03 | −.09 | .01 |
| Foul | 639 | −.19 | 378 | −.20 | 261 | −.18 | −.18 | .02 |
| Horrible | 725 | .09 | 421 | .08 | 304 | .11 | −.27 | .04 |
| Best | 740 | .14 | 427 | .12 | 313 | .16 | −.45 | .06 |
| Mellow | 687 | −.03 | 397 | −.08 | 290 | .02 | −.91 | .12 |
| Ugly | 703 | .02 | 405 | −.03 | 298 | .07 | −.91 | .12 |
| Bad | 703 | .02 | 404 | −.03 | 299 | .08 | −1.00 | .13 |
| Sickening | 669 | −.09 | 386 | −.15 | 283 | −.03 | −1.09 | .14 |
| Nasty | 677 | −.06 | 388 | −.14 | 289 | .01 | −1.36 | .18 |
| Pesky | 674 | −.07 | 384 | −.16 | 290 | .02 | −1.64° | .22 |
| Annoying | 618 | −.26 | 351 | −.36 | 267 | −.14 | −2.00* | .26 |

* p<.05; ° p<.10. Note. The standard error of the delta parameters across blocks is equal to .06. The standard error of the delta parameters in the compatible and incompatible blocks is equal to .08. The t values test the hypothesis that the difference between the delta parameters is equal to zero. Cohen’s $d = 2t/\sqrt{df}$ (Rosenthal & Rosnow, 1991).
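As a rough check on Table 1, the t values and Cohen’s d can be recomputed from the δ estimates and the standard errors given in the note. The sketch below is illustrative only: the function name `delta_contrast` is hypothetical, and taking the absolute value of t for d is an assumption made to match the unsigned d values in the table. Because the published δ and standard errors are rounded to two decimals, the results only approximate the tabled values (for “pleasant”, about 2.21 rather than 2.27).

```python
import math


def delta_contrast(delta_a, delta_b, se_a, se_b, df):
    """t test for the difference between two parameter estimates, plus Cohen's d."""
    # Difference between the two estimates divided by its standard error
    t = (delta_a - delta_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    # d = 2t / sqrt(df); absolute value is an assumption (the table reports unsigned d)
    d = 2 * abs(t) / math.sqrt(df)
    return t, d


# Delta estimates in the compatible and incompatible blocks, taken from Table 1
items = {
    "Pleasant": (0.18, -0.07),
    "Wonderful": (0.02, 0.03),
    "Annoying": (-0.36, -0.14),
}
SE_BLOCK = 0.08  # standard error within each block (Table 1 note)
DF = 230         # degrees of freedom reported in Table 1

for name, (compatible, incompatible) in items.items():
    t, d = delta_contrast(compatible, incompatible, SE_BLOCK, SE_BLOCK, DF)
    print(f"{name:10s} t({DF}) = {t:5.2f}, d = {d:.2f}")
```

A positive t indicates that the attribute is recognized more easily in the compatible block, a negative t that it is recognized more easily in the incompatible block.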

can then state that, in our 116 participants, implicit ingroup favoritism is mainly based on characteristics such as pleasantness and excellence, rather than on “annoyance”.

Discussion

Theory and methods in implicit social cognition have grown enormously in recent years. This success is explained by the attractiveness of measuring psychological aspects of which we are poorly aware, and by the possibility of circumventing the

individual’s control over our measures of these aspects. Such an important goal is also very ambitious. Theory about implicit constructs and processes relies heavily on implicit techniques, so that in this area, more than in others, methodological advancements are both extremely complex and important. Even though the IAT has been used in more than 200 papers since 1998, there is still uncertainty or debate about some of its properties. The widest margins for improvement concern the interpretation of the IAT and


of its limitations. In fact, the IAT measures associations between two bipolar categories; hence it can provide, for example, a measure of the strength of the association between Black people and Good words compared to that between Black people and Bad words. The relativity of the IAT measure, according to some authors, is also its greatest limitation. In addition, and this applies to all implicit techniques, the choice of the material with which to build a new test is crucial for the validity of the measure. Nevertheless, the choice of stimuli that


Table 2 Locations of different facets on the latent trait “Response Speed”

[Variable map. Columns: Logits (from 1.20 to −1.15), Participants’ ability, Ease of conditions (Compatible, Incompatible), Stimuli’s recognizability.]

make up a new IAT does not always receive enough attention and, above all, a set of data analysis procedures for a post-hoc check of the chosen material is not yet available. With reference to traditional tests, the importance of these checks is obvious: skipping them would be like changing the questions of a test and then using it as if it were the original test. If we choose the wrong stimuli to build an IAT, we might no longer know what the instrument actually measures. This contribution has provided an examination of the evaluative attributes used to measure implicit prejudice against black people, using an IAT that since 1998 has been presented to more than 3 million people (Nosek et al., 2007). At the same time, we presented a model and a data analysis strategy that are well suited to the purpose. In addition, we obtained information on the nature of the IAT, which is useful for interpreting its meaning. In accordance with Nosek et al. (2005), we stressed the importance of positive and negative attributes being equally recognizable. We observed that positive attributes are more easily recognizable than negative attributes, although, as expected, the difference is not significant. However, since the negative and positive sets of attributes were made up of the same number of letters, we could have expected a smaller difference than the one observed (t(19) = 1.39, p = .18). This result is probably due to the presence of two very frequently used positive words: “beautiful” and “good”. The antonyms of these two stimuli (“ugly” and “bad”), which appeared in the negative set, are in fact about ten times less frequent in spoken and written language (De Mauro, Mancini, Vedovelli & Voghera, 1993).


In terms of the construction of new IATs, this aspect should be considered carefully. Nosek et al. (2005) suggest that the IAT could also be used with only two stimuli in each category. However, if we had used only the attributes “beautiful”, “good”, “nasty” and “annoying”, the difference would have been highly significant (t(462) = 2.7, p < .01). As a consequence, since the D algorithm divides the difference in performance between critical blocks by the standard deviation of all responses, a two-stimuli IAT would have provided rather smaller D scores. The arguments presented in this contribution suggest that the choice of stimuli should perhaps be more careful than suggested by Greenwald et al. (2003), and that recognizability checks should be conducted before interpreting the IAT effect. We have also observed a tendency for the White/Good association to emerge more clearly than the Black/Bad one. Therefore, implicit preference for the in-group seems to be slightly stronger than implicit out-group derogation. The technical peculiarity of the IAT, however, makes it extremely difficult, if not impossible (Nosek et al., 2005), to interpret such effects as independent. Future research on the importance of the different components of implicit constructs such as attitudes and prejudice would benefit not only from analysis procedures that allow these components to be interpreted separately, but also from techniques that are better suited to revealing an independence between those components, such as the Go/No-go Association Task (GNAT, Nosek & Banaji, 2001), the Extrinsic Affective Simon Task (EAST, De Houwer, 2003) or the Affect Misattribution Procedure (AMP, Payne, Cheng, Govorun & Stewart, 2005). If these

results are confirmed by other studies, measures derived from a Race IAT, in our opinion, might not be interpretable as measures of implicit prejudice. Rather, the IAT effects would more likely express implicit ingroup favoritism. Besides the tendency of the White/Good association to be stronger than the Black/Bad association (“primacy of ingroup favoritism over outgroup derogation”), we also observed that our IAT-measured implicit prejudice was largely due to a small set of stimuli (e.g. “beautiful”, “excellent”, but also “good”), and that other stimuli (e.g. “annoying”) were equally associated with whites and blacks. We can then say that, in our sample, implicit prejudice against blacks is based on the semantic contents of pleasantness, excellence and beauty, and not on components such as annoyance. “Good” associations seem to be stronger than “bad” associations. This effect, called “Good Associations Primacy” by Sriram & Greenwald (2009), has also been observed recently by Bar-Anan, Nosek & Vianello (2009) and offers the opportunity to discuss the possibility of separating the IAT effect into its components. For example, one could think of separating the attitude toward white people from that toward black people, obtaining two independent estimates, or even of separating the four individual associations behind the procedure (White/Good, White/Bad, Black/Good, Black/Bad). Yet the authors of the IAT themselves argued that this limit is inherent in the procedure (Nosek et al., 2005), and suggested, where needed, the use of alternative techniques such as the GNAT (Nosek & Banaji, 2001) or the EAST (De Houwer, 2003). However, it would be appropriate, and we direct future research toward this goal, to


quantify the sensitivity of these techniques in measuring independence among associations. The results of the study suggest that the MFRM could be a suitable model. The results obtained by analyzing each single item’s contribution to the overall IAT effect may have important theoretical consequences. If implicit prejudice against blacks in white participants is based mainly on beauty and excellence, it can be assumed that other implicit constructs, such as stereotypes and attitudes, are also based on a sub-sample of stimuli and that this sub-sample varies between participants belonging to different groups. These inter-group differences in the meaning of implicit constructs could be particularly interesting. Drawing on our results, it can be hypothesized that implicit prejudice differs in nature between blacks and whites, but also that gender stereotypes take on different meanings in men and women, that the implicit attitude towards a political ideology (such as a right-wing ideology) differs in meaning between “right-wing” and “left-wing” people, and so forth. In addition, a confirmation of these early results could have important theoretical implications for the definition of the implicit construct under investigation. In this study, for example, we noted that the measure obtained with a Race IAT reflects implicit ingroup favoritism more than prejudice. This observation underscores the value of the MFRM in analyzing implicit associations. Further studies, ideally conducted with the help of these models, could build on this first observation to analyze in detail the semantics of implicit prejudice and of other widely used implicit constructs.


In summary, this study underlines the importance of carrying out a post-hoc check on the stimuli used for implicit measurement, demonstrating that even a careful choice might not always be adequate. Many-Facet Rasch Models proved well suited to the purpose, and we therefore suggest their use in future research on the topic. These models allowed us to provide a detailed analysis of the nature of implicit prejudice in white people, suggesting that it is based on a “positive” sub-sample of all stimuli included in the test.

References

ANDRICH, D. (1988). Rasch models for measurement. Beverly Hills, CA: Sage Publications.
ASENDORPF, J.B., BANSE, R. & MÜCKE, D. (2002). Double dissociation between implicit and explicit personality self-concept: The case of shy behavior. Journal of Personality and Social Psychology, 83 (2), 380-393.
BAR-ANAN, Y., NOSEK, B.A. & VIANELLO, M. (2009). The Sorting Paired Features Task: A measure of association strengths. Experimental Psychology.
BINET, A. (1896). On double consciousness. Chicago: Open Court.
BOND, T.G. & FOX, C.M. (2001). Applying the Rasch model: Fundamental measurement in the human sciences. Mahwah, NJ: Erlbaum.
CARPENTER, W.B. (1888). Principles of mental physiology, with their applications to the training and discipline of the mind and the study of its morbid conditions. New York: Appleton.
CRISTANTE, F. & MANNARINI, S. (2004). Misurare in psicologia. Il modello di Rasch. Roma: Laterza.
CUNNINGHAM, W.A., RAYE, C. & JOHNSON, M.K. (2005). Neural correlates of evaluation associated with promotion and prevention regulatory focus. Cognitive, Affective and Behavioral Neuroscience, 5 (2), 202-211.
DE HOUWER, J. (2003). The extrinsic affective Simon task. Experimental Psychology, 50 (2), 77-85.
DE MAURO, T., MANCINI, F., VEDOVELLI, M. & VOGHERA, M. (1993). Lessico di frequenza dell’italiano parlato. Milano: Etaslibri.
FAZIO, R.H., SANBONMATSU, D.M., POWELL, M.C. & KARDES, F.R. (1986). On the automatic activation of attitudes. Journal of Personality and Social Psychology, 50 (2), 229-238.
GREENWALD, A.G., MCGHEE, D.E. & SCHWARTZ, J.L.K. (1998). Measuring individual differences in implicit cognition: The implicit association test. Journal of Personality and Social Psychology, 74 (6), 1464-1480.
GREENWALD, A.G., NOSEK, B.A. & BANAJI, M.R. (2003). Understanding and using the Implicit Association Test: I. An improved scoring algorithm. Journal of Personality and Social Psychology, 85 (2), 197-216.
GREENWALD, A.G., NOSEK, B.A., BANAJI, M.R. & KLAUER, K.C. (2005). Validity of the salience asymmetry interpretation of the IAT: Comment on Rothermund and Wentura, 2004. Journal of Experimental Psychology: General, 134 (3), 420-425.
HIGGINS, E.T., RHOLES, W.S. & JONES, C.R. (1977). Category accessibility and impression formation. Journal of Experimental Social Psychology, 13, 141-154.
HUMMERT, M.E., GARSTKA, T.A., O’BRIEN, L.T., GREENWALD, A.G. & MELLOTT, D.S. (2002). Using the Implicit Association Test to measure age differences in implicit cognitions. Psychology and Aging, 17 (3), 482-495.
JAMES, W. (1889). Automatic writing. Proceedings of the American Society for Psychical Research, 1, 558-564.


LANE, K.A., BANAJI, M.R., NOSEK, B.A. & GREENWALD, A.G. (2007). Understanding and using the Implicit Association Test: IV: Procedures and validity. In B. Wittenbrink & N. Schwarz (eds.), Implicit measures of attitudes: Procedures and controversies. New York: Guilford Press.
LINACRE, J.M. (1989). Multi-facet Rasch measurement. Chicago: MESA Press.
LINACRE, J.M. (2004). Facets Rasch measurement computer program. Chicago: Winsteps.com.
LINACRE, J.M. & WRIGHT, B.D. (1994). Reasonable mean-square fit values. Rasch Measurement Transactions, 8, 370. Available (April 2007) at: http://rasch.org/rmt/rmt83.htm.
MAISON, D., GREENWALD, A.G. & BRUIN, R. (2004). Predictive validity of the Implicit Association Test in studies of brands, consumer attitudes, and behavior. Journal of Consumer Psychology, 14 (4), 405-415.
MYFORD, C.M. & WOLFE, E.W. (2003). Detecting and measuring rater effects using many-facet Rasch measurement: Part I. Journal of Applied Measurement, 4 (4), 386-422.
NOSEK, B.A. & BANAJI, M.R. (2001). The go/no-go association task. Social Cognition, 19 (6), 625-666.
NOSEK, B.A., GREENWALD, A.G. & BANAJI, M.R. (2005). Understanding and using the Implicit Association Test: II. Methodological Issues. Personality and Social Psychology Bulletin, 31 (2), 166-180.
NOSEK, B.A., GREENWALD, A.G. & BANAJI, M.R. (2007). The Implicit Association Test at age 7: A methodological and conceptual review. In J.A. Bargh (ed.), Automatic processes in social thinking and behavior. New York: Psychology Press.
NOSEK, B.A., SMYTH, F.L., HANSEN, J.J., DEVOS, T., LINDNER, N.M., RANGANATH, K.A., SMITH, C.T., OLSON, K.R., CHUGH, D., GREENWALD, A.G. & BANAJI, M.R. (2007). Pervasiveness and correlates of implicit attitudes and stereotypes. European Review of Social Psychology, 18, 36-88.
ORSINI, M., HUANG, G., KIYAK, H., RAMSAY, D., BOLLEN, A., ANDERSON, N. & GIDDON, D. (2006). Methods to evaluate profile preferences for the anteroposterior position of the mandible. American Journal of Orthodontics and Dentofacial Orthopedics, 130 (3), 283-291.
PAYNE, B.K., CHENG, C.M., GOVORUN, O. & STEWART, B.D. (2005). An inkblot for attitudes: Affect misattribution as implicit measurement. Journal of Personality and Social Psychology, 89 (3), 277-293.
RASCH, G. (1960/1980). Probabilistic models for some intelligence and attainment tests. Danish Institute for Educational Research, Copenhagen, 1960; reprinted, The University of Chicago Press, Chicago, 1980.
ROSENTHAL, R. & ROSNOW, R.L. (1991). Essentials of behavioral research: Methods and data analysis (2nd ed.). New York: McGraw Hill.
SMITH, R.M., SCHUMACKER, R.E. & BUSH, M.J. (1998). Using item mean squares to evaluate fit to the Rasch Model. Journal of Outcome Measurement, 2 (1), 66-78.
SRIRAM, N. & GREENWALD, A.G. (2009). The Brief Implicit Association Test. Unpublished manuscript.
STEFFENS, M.C. & PLEWE, I. (2001). Items’ cross category associations as a confounding factor in the Implicit Association Test. Zeitschrift für Experimentelle Psychologie, 48, 123-134.
TEACHMAN, B.A., GREGG, A.P. & WOODY, S.R. (2001). Implicit associations for fear-relevant stimuli among individuals with snake and spider fears. Journal of Abnormal Psychology, 110 (2), 226-235.
WIERS, R.W., VAN WOERDEN, N., SMULDERS, F.T.Y. & DE JONG, P.J. (2002). Implicit and explicit alcohol-related cognitions in heavy and light drinkers. Journal of Abnormal Psychology, 111 (4), 648-658.

SUMMARY. Introduction: The Implicit Association Test (IAT) is the only implicit technique that presents all the main characteristics of a real psychological test. Its main properties are known, and some publications serve as a manual and others as a normative sample. Yet insufficient attention has been paid to the materials used in an IAT. Methods: In this paper, we show that a Many-Facet Rasch model can adequately address this issue, and that even carefully selected evaluative attributes might not always be adequate, possibly affecting the quality of measurement. Results: Studying IAT-measured implicit prejudice, we found that, in our sample, implicit ingroup favoritism is slightly stronger than implicit outgroup derogation, that the IAT effect is mainly due to a small subset of the stimuli included in the test, and that associations of faces with good words seem to be slightly stronger than associations of faces with bad words. Conclusions: We argue that post-hoc analyses should be routinely conducted before interpreting the IAT effect. Methodological and theoretical implications of these results are discussed.

RIASSUNTO. Introduzione: Il Test d’Associazione Implicita (IAT) è la tecnica di rilevazione delle associazioni automatiche che più delle altre presenta le caratteristiche di un vero e proprio test psicologico. Ne conosciamo infatti le principali proprietà e caratteristiche, alcune pubblicazioni fungono da vero e proprio manuale ed altre da campione normativo. Tuttavia, l’attenzione riposta sulla qualità dei materiali da utilizzare nella tecnica è ancora insufficiente. Metodo: In questo articolo dimostriamo che un modello Many-Facet di Rasch si adatta bene a questo scopo e che anche se gli stimoli valutativi di uno IAT sono stati scelti con cura, possono rivelarsi talvolta inadeguati, minacciando la qualità della misurazione effettuata. Risultati: Studiando il pregiudizio attraverso l’IAT, abbiamo rilevato che nel nostro campione la preferenza implicita per l’ingroup è leggermente più forte del pregiudizio implicito nei confronti dell’outgroup, che l’effetto IAT è dovuto ad un piccolo sottocampione degli stimoli valutativi scelti, e che l’associazione di volti con parole buone sembra essere più forte dell’associazione con parole cattive. Conclusioni: Si conclude suggerendo che i dati ottenuti da un IAT dovrebbero essere abitualmente sottoposti a controlli prima di interpretare un eventuale effetto. Si discutono infine le implicazioni teoriche e metodologiche dei risultati ottenuti.

Keywords: Implicit Association Test, Many-Facet Rasch Models, Stimuli

Michelangelo Vianello, Pasquale Anselmi & Egidio Robusto, Dept. of General Psychology, University of Padova.

APPENDIX

List of evaluative attributes used in the race IAT

Good words: Original (Translation)
Attraente (Attractive)
Bello (Beautiful)
Buono (Good)
Eccellente (Excellent)
Gradevole (Mellow)
Meraviglioso (Wonderful)
Ottimo (Best)
Piacevole (Pleasant)
Simpatico (Nice)
Stupendo (Awesome)

Bad words: Original (Translation)
Antipatico (Nasty)
Brutto (Ugly)
Cattivo (Bad)
Disgustoso (Foul)
Fastidioso (Annoying)
Maligno (Evil)
Molesto (Pesky)
Odioso (Hateful)
Orribile (Horrible)
Rivoltante (Sickening)
