MEMORY, 2011, 19 (8), 853–870

Downloaded by [Colorado State University], [Matthew Rhodes] at 16:17 11 January 2012

Monitoring memory errors: The influence of the veracity of retrieved information on the accuracy of judgements of learning

Matthew G. Rhodes¹ and Sarah K. Tauber²

¹ Colorado State University, Fort Collins, CO, USA
² Kent State University, Kent, OH, USA

The current study examined the degree to which predictions of memory performance made immediately or at a delay are sensitive to confidently held memory illusions. Participants studied unrelated pairs of words and made judgements of learning (JOLs) for each item, either immediately or after a delay. Half of the unrelated pairs (deceptive items; e.g., nurse-dollar) had a semantically related competitor (e.g., doctor) that was easily accessible when given a test cue (e.g., nurse-do_ _ _r) and half had no semantically related competitor (control items; e.g., subject-dollar). Following the study phase, participants were administered a cued recall test. Results from Experiment 1 showed that memory performance was less accurate for deceptive compared with control items. In addition, delaying judgement improved the relative accuracy of JOLs for control items but not for deceptive items. Subsequent experiments explored the degree to which the relative accuracy of delayed JOLs for deceptive items improved as a result of a warning to ensure that retrieved memories were accurate (Experiment 2) and corrective feedback regarding the veracity of information retrieved prior to making a JOL (Experiment 3). In all, these data suggest that delayed JOLs may be largely insensitive to memory errors unless participants are provided with feedback regarding memory accuracy.

Keywords: Memory; Metacognition; Judgements of learning; Delayed judgements of learning; Metamemory.

Numerous prior studies have demonstrated that memory is often subject to striking inaccuracies that are held with high levels of confidence (e.g., Anastasi, Rhodes, & Burns, 2000; Kelley & Sahakyan, 2003; Loftus, Miller, & Burns, 1978; Owens, Bower, & Black, 1979; Roediger & McDermott, 1995; for reviews see Koriat, Goldsmith, & Pansky, 2001; Rhodes & Jacoby, 2007). For example, Kelley and Sahakyan (2003, Experiment 1; see also Kato, 1985; Rhodes & Kelley, 2005) had participants study unrelated pairs of words (e.g., table-cheer; kite-center) and related pairs (e.g., morning-evening) in preparation for a

cued recall test. For some of the unrelated pairs, termed deceptive items, a semantically related competitor was easily accessible when a retrieval cue consisting of the cue word and three letters of the target word was provided (e.g., chair in the case of table-ch_ _r). For other, control items, there was no semantically related competitor easily accessible when given a retrieval cue (e.g., kite-ce_ _ _r). Overall, accuracy was poorer for deceptive compared with control items. As well, participants exhibited high levels of overconfidence in the correctness of their memories for deceptive items. Specifically, Kelley and Sahakyan

Address correspondence to: Matthew G. Rhodes, Department of Psychology, Colorado State University, Fort Collins, CO, 80523-1876, USA. E-mail: [email protected]

We thank David McCabe and John Dunlosky for helpful comments on a previous version of this manuscript, and Andrew Baxley, Christie Miller, Talyn Olguin, Jordan Peters, Jessica Sullenberger, and Amber Witherby for assistance with data collection.

© 2011 Psychology Press, an imprint of the Taylor & Francis Group, an Informa business
http://www.psypress.com/memory
http://dx.doi.org/10.1080/09658211.2011.613841



reported that, whereas participants produced a correct response for 46% of deceptive items, their average confidence in their accuracy was 74%. Conversely, for control items, accuracy (64%) was much greater and more closely corresponded with confidence (67%). In the current study we examined whether participants’ predictions of future memory performance were subject to similar illusions of overconfidence. That is, are predictions of future memory performance sensitive to those conditions that diminish memory accuracy? As well, do methods that typically enhance predictive accuracy, such as delaying judgement, enhance one’s sensitivity to memory illusions?

METACOGNITION: OVERVIEW AND CONCEPTUAL FRAMEWORK

The efficacy of memory predictions has been explored for several decades by researchers studying awareness of our own cognitive processes, termed metacognition (for reviews see Koriat, 2007; Metcalfe, 2000; Nelson, 1996). Memory predictions have frequently been examined by soliciting judgements of learning (JOLs) either immediately after presenting an item or following a delay (e.g., Arbuckle & Cuddy, 1969; Koriat, 1997; Rhodes & Castel, 2008). For example, in a typical experiment participants study memoranda such as paired associates (e.g., DOG-SPOON). Either immediately after the presentation of the pair or following some delay participants make a JOL, cued by the stimulus (e.g., DOG), predicting the likelihood of later remembering the target (SPOON). After JOLs have been made for each item, participants are given a memory test for the studied items, permitting an assessment of the overall correspondence between JOLs and test performance (absolute accuracy) and the degree to which JOLs discriminate between what is and is not remembered (relative accuracy).

The accuracy of JOLs is relevant to multiple aspects of metacognition. In particular, one widely accepted framework (Nelson & Narens, 1990, 1994) for metacognition distinguishes between those processes related to assessing our own learning (monitoring) and the self-regulation of learning based on information gained from monitoring (control). Of key importance, such frameworks posit that monitoring plays a causal role in self-regulated learning (Nelson, 1996;

Nelson & Narens, 1990; but see also Koriat, Ma’ayan, & Nussinson, 2006). For example, Rhodes and Castel (2009) had participants listen to and make JOLs for words presented at a quiet or loud volume. Participants also chose words for restudy, although a restudy opportunity was not actually provided. The results showed that participants regarded loud words as more memorable than quiet words (i.e., loud words elicited higher JOLs than quiet words) despite the fact that there was no influence of volume on recall. In addition, participants more frequently chose to restudy quiet words, which were regarded as less memorable. Such data indicate that monitoring informed restudy choices independently of memory performance, suggesting a causal role for metacognitive processes in self-regulated learning (see also Metcalfe & Finn, 2008). Thus monitoring has a substantial influence on decisions of what to study, impacting subsequent memory performance.

THE POWER OF DELAYING JUDGEMENT

Given the important relationship between monitoring processes and control of learning, it is imperative that monitoring is accurate. One method that appears to consistently improve the relative accuracy of monitoring is delaying JOLs for even a brief period of time (e.g., Dunlosky & Nelson, 1992, 1994, 1997; Koriat & Bjork, 2006; Koriat & Ma’ayan, 2005; Meeter & Nelson, 2003; Nelson & Dunlosky, 1991; Nelson, Narens, & Dunlosky, 2004; Serra, Dunlosky, & Hertzog, 2008; Wahlheim, 2011; for a review see Rhodes & Tauber, 2011). That is, when made after a delay, participants’ JOLs are better at discriminating between items that have a high or low probability of being recalled than when JOLs are made immediately after the presentation of an item. This is typically manifest in higher gamma correlations, a nonparametric measure of association (Nelson, 1984), between delayed JOLs and recall compared with immediate JOLs and recall, a finding termed the delayed JOL effect. For example, in a meta-analytic review of the delayed JOL literature, Rhodes and Tauber (2011) reported that the mean gamma correlation for delayed JOLs was .77 and for immediate JOLs was .42. As well, an analysis of 112 effect sizes (from 4,554 participants) comparing gamma



correlations for delayed and immediate JOLs indicated that delaying JOLs confers a substantial benefit to monitoring (g = .93). Thus the delayed JOL effect has proven highly robust and to date few departures from this pattern have been reported.

The predominant theoretical accounts of the delayed JOL effect (see Rhodes & Tauber, 2011, for a review of theories) suggest that it reflects access to and an interaction with long-term memory (LTM) which provides information used as a basis for judgement. For example, the monitoring dual memories account (Dunlosky & Nelson, 1994; Nelson & Dunlosky, 1991; Nelson et al., 2004) suggests that while immediate JOLs must necessarily rely on information from short-term memory that is currently accessible, delayed JOLs rely on currently accessible information as well as information retrieved from LTM in response to a cue. Such information from LTM will be more diagnostic of future memory performance than information available immediately after studying an item, leading to higher levels of resolution for JOLs made after a delay compared with immediate JOLs.

Spellman and Bjork (1992; see also Kimball & Metcalfe, 2003; Spellman, Bloomfield, & Bjork, 2008) have also theorised that delayed JOL accuracy reflects access to LTM. However, in contrast to the monitoring dual memories hypothesis, their self-fulfilling prophecy account suggests that delaying JOLs is tantamount to retrieval practice. When participants successfully retrieve a target during a delayed JOL they will likely ascribe a higher JOL to that item than to items that were not retrieved. Thus the self-fulfilling prophecy account suggests that the delayed JOL effect occurs not because of any benefit conferred to monitoring processes per se but because successful retrieval increases the chance of subsequent target retrieval and consequently begets an accurate JOL.
Both accounts assume that the delayed JOL effect occurs because participants access information from LTM that is diagnostic of future memory performance. However, it is unclear whether delaying JOLs will increase relative accuracy if the information retrieved from LTM is not veridical. For example, as noted previously, a number of prior studies have demonstrated that memory errors may be held with high levels of confidence (e.g., Anastasi et al., 2000; Kelley & Sahakyan, 2003; Roediger & McDermott, 1995). Are participants more sensitive to the potential


for such errors when judgement is delayed? We suggest that while delayed JOLs likely rely on information retrieved from LTM this will not inevitably lead to increases in relative accuracy. Rather, delayed JOLs reflect attributions regarding the availability and ease of retrieving information from LTM in response to a cue (cf. Jacoby, Kelley, & Dywan, 1989; Kelley & Rhodes, 2002; Koriat, 1997; Koriat & Ma’ayan, 2005). Consequently, delayed JOL accuracy will be a function of the extent to which judgements rely on retrieving accurate information rather than access to LTM per se. Prior work examining delayed JOLs has generally used unrelated paired associates (e.g., Nelson & Dunlosky, 1991; Nelson et al., 2004) that are not likely to induce retrieval of confidently held, inaccurate information (but see Wahlheim, 2011) and thus has not fully tested this possibility.

THE CURRENT STUDY

In the current study we examined the accuracy of memory predictions made immediately or at a delay for information that frequently elicits confidently held memory errors. Participants studied unrelated pairs of words (e.g., table-cheer; kite-center), in addition to related filler items (e.g., morning-evening). For some of the unrelated pairs, termed deceptive items, a semantically related competitor was easily accessible when a retrieval cue consisting of the cue word and three letters of the target word was provided (e.g., chair in the case of table-ch_ _r). For other, control items, there were no semantically related competitors easily accessible when given a retrieval cue (e.g., kite-ce_ _ _r). Prior work has demonstrated that participants frequently report the related (but incorrect) competitor for deceptive items with high levels of confidence (Kelley & Sahakyan, 2003; Rhodes & Kelley, 2005).

JOLs in the experiments reported were solicited immediately for half of the items studied and at a delay for the remaining half of the items studied. In order to directly assess the nature of the information used when making JOLs we adopted the Pre-Judgement Recall and Monitoring (PRAM) procedure introduced by Nelson et al. (2004). Specifically, participants attempted to recall the target item in response to the cue just prior to each JOL, providing an indication of the information used as a basis for the JOL. Our particular interest was in relative accuracy for


delayed compared with immediate JOLs. Given that control items (e.g., kite-center) essentially resembled items used in previous studies of delayed JOLs we expected relative accuracy to be markedly higher for such items when JOLs were made after a delay rather than immediately. In contrast, if participants frequently and confidently retrieve incorrect, related competitors for deceptive items (e.g., nurse-dollar) that are indistinguishable from veridical information, they may not exhibit improvements in relative accuracy for delayed compared with immediate JOLs. To anticipate, participants in Experiment 1 exhibited a delayed JOL effect for control items but not for deceptive items. The remaining experiments examined manipulations that could serve to restore the delayed JOL effect for deceptive items. Thus participants in Experiment 2 were given an explicit warning indicating that deceptive items could induce retrieval of plausible, but incorrect information. Participants in Experiment 3 were given feedback regarding the correctness of a response recalled during the pre-JOL recall stage. The experiments reported therefore provide an examination of boundary conditions of the delayed JOL effect.

EXPERIMENT 1

In Experiment 1 participants studied control items, deceptive items, and related pairs. For half of these items JOLs were solicited immediately and for half of the items JOLs were solicited at a delay. While we anticipated that delaying JOLs would improve relative accuracy for control items, we expected that participants might not be able to distinguish between veridical and inaccurate information when making judgements at a delay for deceptive items. As such, a delayed JOL effect would not be evident for deceptive items.

Method

Participants. A total of 32 Colorado State University psychology students participated for partial course credit. All participants were tested individually.

Materials. The materials consisted of 96 word pairs (see Appendix A), half of which were related filler items (e.g., morning-evening) and

half of which were unrelated word pairs. Study items were taken from Kato (1985) and Kelley and Sahakyan (2003). We also created a number of additional pairs following the procedures outlined by Kato and Kelley and Sahakyan. There were two versions of each unrelated pair. In one version (deceptive items) the pair had a potentially interfering competitor that was semantically related to the cue word and shared the same first two letters and last letter as the target word (e.g., the reptile-snore pair had the interfering competitor, snake, and the test cue reptile-sn_ _e). The other half of the word pairs consisted of unrelated control items that had no such interfering competitors. In order to create control items we randomly re-paired cues and targets. For example, the deceptive pairs table-cheer and reptile-snore were randomly re-paired to create the control item table-snore. Thus, across participants, each target (e.g., snore) was presented equally often with a control (e.g., table) or deceptive (e.g., reptile) cue. The study list consisted of 96 pairs of words, with 24 control items, 24 deceptive items, and 48 related fillers. As well, primacy and recency buffers were included at the beginning and end of the study list. Each buffer consisted of three pairs of words, with one related pair (month-year) and two unrelated pairs (turkey-opera). The test list consisted of all 96 study items. A two-item practice buffer was also included to familiarise the participant with the test procedure. The practice buffer consisted of an unrelated and related pair presented in the primacy and recency portions of the study list.

Procedure. Participants were instructed that they would learn pairs of words such that if given the cue and three letters of the target they would later be able to recall the target. The 96 pairs from the study list were presented in four blocks, each consisting of 24 items, with each pair presented at a 4-second rate.
Within each block participants studied 6 control items, 6 deceptive items, and 12 related items. Half of the items within a block received an immediate JOL and half were designated to receive a delayed JOL. Following Nelson et al. (2004), just prior to each JOL (made either immediately or after a delay), participants were prompted to recall the target when given the cue word and three letters of the target (e.g., reptile-sn_ _e). Immediately following this pre-JOL recall, a JOL was solicited by asking participants to assess the likelihood of later

recalling the target (given the cue and three letters of the target) on a scale from 0% (not likely at all) to 100% (very likely). Participants were encouraged to use the entire range of the scale and were given 4 seconds to report their JOL aloud, which was then entered into the computer by the experimenter. Participants did not proceed to the next pair until the entire 4 seconds had elapsed.

Delayed JOLs were made after the presentation of the second and fourth study blocks. The order of the delayed JOLs was random with the condition that JOLs were first solicited for pairs from the first and then second block of the previous study blocks. For example, after studying and making immediate JOLs for pairs from blocks 1 and 2, participants made delayed JOLs for pairs from block 1 followed by block 2. This ensured that a minimum of 12 pairs intervened between the presentation of a pair in a study block and the delayed JOL for that pair. Each pair appeared equally often in each block and was given an immediate or delayed JOL equally often. The order of presentation within each block was randomised anew for each participant.

The test phase immediately followed the study phase. Participants were given 10 seconds to recall the target given the cue and three letters from the target. Participants said their answer aloud to an experimenter who entered the response. If the response was provided prior to the 10-second deadline, the test trial was terminated (after coding the response) and the participant proceeded to the next item. If a response was not provided within the 10-second deadline, participants were prompted to provide a response and, if one could not be provided, were coded as “passing” on the item (i.e., not providing a response). The test phase took place in blocks that mimicked the structure of the study phase (i.e., participants were first tested on pairs from block 1 of the study phase, next tested on pairs from block 2 of the study phase, etc.).
The order of items within each block was randomised anew for each participant.
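The test cues described in the Materials section (cue word plus the first two letters and last letter of the target) can be illustrated with a short Python sketch. The function names, the hyphen separator, and the same-length requirement for competitors are assumptions made for illustration; they are not taken from the authors' materials or software.

```python
def make_test_cue(cue_word, target, separator="-"):
    """Build a cue such as 'reptile-sn_ _e' from a cue word and its target.

    The first two letters and the last letter of the target are shown;
    every other letter is replaced by an underscore, with spaces between
    consecutive underscores (formatting assumed from the printed examples).
    """
    if len(target) < 4:
        raise ValueError("target needs at least four letters")
    blanks = " ".join("_" * (len(target) - 3))
    return f"{cue_word}{separator}{target[:2]}{blanks}{target[-1]}"


def fits_same_fragment(target, competitor):
    """Check whether a competitor is compatible with the target's fragment.

    Assumption: a competitor fits if it has the same length, the same first
    two letters, and the same last letter as the target.
    """
    return (
        target != competitor
        and len(target) == len(competitor)
        and target[:2] == competitor[:2]
        and target[-1] == competitor[-1]
    )


# Pairs mentioned in the text.
deceptive_cue = make_test_cue("reptile", "snore")
control_cue = make_test_cue("kite", "center")
snake_fits = fits_same_fragment("snore", "snake")
```

Under these assumptions the fragment for reptile-snore is also consistent with the competitor snake, which is what makes the pair deceptive.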

Results

Given the primary importance of resolution data, in Experiment 1 and all subsequent experiments we first discuss data for relative accuracy, followed by analyses of pre-JOL recall, final recall, JOL magnitude, and calibration (i.e., absolute


accuracy). Consistent with prior work (e.g., Kelley & Sahakyan, 2003; Rhodes & Kelley, 2005), data for related filler items were excluded from all analyses reported (see Appendix B for descriptive statistics for related items). The alpha level was set at .05 for all statistical tests reported.

Relative accuracy. We examined relative accuracy by calculating gamma correlations (Nelson, 1984). Gamma is a nonparametric index of association that ranges from -1.0 to +1.0 and quantifies the association between JOLs and memory performance (Gonzalez & Nelson, 1996; Nelson, 1984; but see Benjamin & Diaz, 2008; Masson & Rotello, 2009, for alternative measures). To the degree that subsequently remembered items are given high JOLs and items that are less likely to be remembered are given lower JOLs, gamma will be positive. Likewise, to the degree that subsequently remembered items are given low JOLs and items that are less likely to be remembered are given higher JOLs, gamma will be negative. Gamma correlations were calculated for each participant between immediate JOLs and recall and delayed JOLs and recall by item type (see Figure 1).¹ Analyses of individual correlations showed that gamma correlations for immediate JOLs did not differ from zero for control items, t(31) = 1.69, p = .101, but did reliably differ from zero for deceptive items, t(31) = 2.41. Gamma correlations for delayed JOLs differed reliably from zero for control items,

Figure 1. Mean gamma correlations by JOL Type and Item Type in Experiment 1. Error bars reflect standard errors of the mean. [Bar chart: y-axis, mean gamma correlation (0.00 to 1.00); x-axis, Item Type (control items, deceptive items); separate bars for immediate JOLs and delayed JOLs.]

¹ Several participants either reported invariant JOLs or did not report the target for any item. These participants were excluded from analyses, reflected by variations in degrees of freedom reported for statistical tests in each of the experiments reported.


t(30) = 16.87, and were marginally different from zero for deceptive items, t(29) = 2.01, p = .054. These data were analysed in a 2 (JOL Type: immediate, delayed) × 2 (Item Type: control, deceptive) repeated-measures analysis of variance (ANOVA). Overall, correlations for delayed JOLs (M = .51, SE = .05) exceeded correlations for immediate JOLs (M = .19, SE = .06), F(1, 28) = 17.85, ηp² = .39, and correlations were higher for control items (M = .46, SE = .05) than deceptive items (M = .24, SE = .06), F(1, 28) = 6.72, ηp² = .19. More importantly, these main effects were qualified by a reliable JOL Type × Item Type interaction, F(1, 28) = 11.60, ηp² = .29. Specifically, whereas gamma correlations were reliably higher for delayed compared with immediate JOLs for control items, F(1, 30) = 61.65, ηp² = .67, no difference in gamma correlations was apparent for delayed compared with immediate JOLs for deceptive items, F < 1. Thus the advantage typically observed for delayed relative to immediate JOLs was evident only for control items and not deceptive items.²

What accounts for the fact that the delayed JOL effect did not obtain for deceptive items? Correlations between pre-JOL recall accuracy and the magnitude of delayed JOLs indicate that JOLs were more strongly related to pre-JOL recall accuracy for control items (G = .90, SE = .03) than for deceptive items (G = .26, SE = .11), F(1, 29) = 29.28, ηp² = .50. However, despite this difference in relative accuracy, participants relied equally on JOL magnitude in their decision to volunteer a response (cf. Koriat & Ma’ayan, 2005). For example, for 31.77% of control items judged at a delay and 10.94% of deceptive items judged at a delay, participants were unable to provide any response (i.e., the response was coded as “pass”). Correlations between the decision to volunteer a response and JOLs were at unity (G = 1.00) for both control and deceptive items. Thus, participants relied to an equal degree on the response coming to mind as a basis for JOLs (cf. Koriat & Ma’ayan, 2005). However, given the inaccuracy of information retrieved for deceptive items, this led to a weak relationship between final recall and JOLs, even when judgements were made at a delay.

² The patterns of data obtained for gamma correlations for this and subsequent experiments do not change if one calculates alternative measures of relative accuracy, such as d_a (cf. Masson & Rotello, 2009).
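As a minimal sketch of how a gamma correlation of this kind can be computed for a single participant, the Python function below counts concordant and discordant item pairs and ignores tied pairs, which is the standard Goodman-Kruskal definition. The function name and the example data are hypothetical and are not drawn from the study. Returning None when no untied pairs exist is an assumption; it corresponds to cases such as invariant JOLs, for which gamma is undefined.

```python
from itertools import combinations


def goodman_kruskal_gamma(jols, recalled):
    """Gamma between JOLs and recall outcomes (1 = recalled, 0 = not).

    gamma = (C - D) / (C + D), where C and D are the numbers of concordant
    and discordant item pairs. Pairs tied on either variable are ignored.
    Returns None when every pair is tied, because gamma is then undefined.
    """
    concordant = 0
    discordant = 0
    for (jol_a, rec_a), (jol_b, rec_b) in combinations(zip(jols, recalled), 2):
        direction = (jol_a - jol_b) * (rec_a - rec_b)
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    if concordant + discordant == 0:
        return None
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical participant: every recalled item received a higher JOL
# than every unrecalled item, so all untied pairs are concordant.
example_jols = [80, 60, 40, 20, 90, 10]
example_recall = [1, 1, 0, 0, 1, 0]
example_gamma = goodman_kruskal_gamma(example_jols, example_recall)
```

In an analysis like the one reported here, such a function would be applied separately to each participant's items within each JOL Type by Item Type cell, and the resulting per-participant values would then be entered into the t tests and ANOVA.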

Pre-JOL recall. The mean percentage of targets recalled at the pre-JOL recall stage (see Table 1) was examined in a 2 (Item Type: control, deceptive) × 2 (JOL Type: delayed, immediate) repeated-measures ANOVA. As would be expected, recall was far better when made immediately (M = 98.83, SE = .38) after study than after a delay (M = 35.03, SE = 2.80), F(1, 31) = 504.69, ηp² = .94. Pre-JOL recall was superior for control items (M = 70.96, SE = 1.70) compared with deceptive items (M = 62.89, SE = 1.65), F(1, 31) = 19.82, ηp² = .39, and Item Type interacted with JOL Type, F(1, 31) = 17.87, ηp² = .37. In particular, whereas recall did not differ between control and deceptive items when tested immediately, F(1, 31) = 1.30, p = .263, ηp² = .04, participants correctly recalled reliably more control items than deceptive items when pre-JOL recall occurred after a delay, F(1, 31) = 19.58, ηp² = .39. Thus, participants recalled reliably more targets for control relative to deceptive items, consistent with prior work (e.g., Kelley & Sahakyan, 2003; Rhodes & Kelley, 2005).

Final recall. Final recall performance (see Table 1) was analysed in a 2 (Item Type: control, deceptive) × 2 (JOL Type: delayed, immediate) repeated-measures ANOVA. Overall, the percentage of targets recalled was greater for control (M = 52.34, SE = 2.81) compared with deceptive (M = 34.77, SE = 2.75) items, F(1, 31) = 48.85, ηp² = .61. As well, participants correctly recalled more items given an immediate JOL (M = 51.17, SE = 2.69) than items given a delayed JOL (M = 35.94, SE = 2.73), F(1, 31) = 47.50, ηp² = .61. Item Type did not interact with JOL Type, F < 1.

JOL magnitude. JOLs (see Table 1) were analysed in a 2 (Item Type: control, deceptive) × 2 (JOL Type: delayed, immediate) repeated-measures ANOVA. Overall, participants made reliably higher JOLs for deceptive (M = 57.07, SE = 3.22) compared with control (M = 46.00, SE = 2.84) items, F(1, 31) = 36.99, ηp² = .54. While the magnitude of immediate JOLs (M = 54.13, SE = 3.58) did not reliably exceed that of delayed JOLs (M = 48.93, SE = 2.94), F(1, 31) = 2.91, p = .10, ηp² = .09, there was a reliable Item Type × JOL Type interaction, F(1, 31) = 40.10, ηp² = .56. In particular, JOLs did not differ when elicited immediately, F < 1, but were reliably higher for deceptive items at a delay, F(1, 31) = 42.35, ηp² = .58.


TABLE 1
Mean percentage of correct pre-JOL recall, mean JOLs, and mean recall of targets by JOL type and item type in Experiments 1, 2, and 3

JOL type               Pre-JOL recall   JOL            Final recall   Calibration

Experiment 1
  Immediate JOL
    Control items      99.22 (0.44)     54.08 (3.65)   60.16 (2.89)    -6.08 (3.96)
    Deceptive items    98.44 (0.58)     54.19 (3.59)   42.19 (3.55)    12.01 (4.12)
  Delayed JOL
    Control items      42.71 (3.30)     37.92 (3.11)   44.53 (3.60)    -6.61 (2.84)
    Deceptive items    27.34 (3.29)     59.94 (3.65)   27.34 (2.98)    32.59 (4.18)

Experiment 2
  Immediate JOL
    Control items      97.66 (0.44)     63.22 (3.87)   50.52 (3.78)     9.79 (6.03)
    Deceptive items    95.57 (1.20)     60.31 (4.23)   41.93 (3.62)    21.29 (4.64)
  Delayed JOL
    Control items      42.45 (3.57)     44.39 (3.32)   43.75 (4.11)     0.64 (2.56)
    Deceptive items    36.46 (3.11)     67.25 (2.78)   35.68 (3.25)    31.58 (4.34)

Experiment 3
  Immediate JOL
    Control items      99.22 (0.44)     60.09 (3.21)   59.64 (3.25)     0.45 (3.77)
    Deceptive items    97.40 (0.96)     60.19 (2.64)   54.17 (3.06)     6.02 (3.52)
  Delayed JOL
    Control items      50.78 (3.43)     41.09 (2.77)   54.95 (3.68)   -13.86 (2.45)
    Deceptive items    42.97 (3.27)     37.88 (2.91)   45.83 (3.23)    -7.95 (2.70)

Standard errors are in parentheses.

Why were JOLs reliably higher for deceptive relative to control items when elicited at a delay? One possibility is that JOL magnitude was not sensitive to whether the target was recalled for deceptive items, which induced potent retrieval of plausible but incorrect items. We tested this by examining JOL magnitude as a function of pre-JOL recall accuracy.³ When the target was recalled, JOLs for deceptive items (M = 70.56, SE = 4.24) did not differ from JOLs for control items (M = 64.73, SE = 3.59), F(1, 30) = 1.45, p = .238, ηp² = .05. However, when the target was not recalled, JOLs were reliably higher for deceptive items (M = 54.38, SE = 4.45) than control items (M = 18.03, SE = 2.96), F(1, 30) = 71.57, ηp² = .70. In fact, JOLs for pre-JOL recall of the related competitor (e.g., snake for the pair reptile-snore) for deceptive items (M = 66.85, SE = 4.24) did not differ from JOLs for recall of the target for deceptive items, F < 1. Thus participants’ JOLs did not distinguish between correct and incorrect recall for deceptive items.

³ Due to essentially ceiling levels of accurate recall for immediate JOLs (see Table 1), this analysis was confined in Experiment 1 and the subsequent experiments to delayed JOLs.

Calibration. Calibration (see Table 1) was examined as the signed difference between JOLs and recall. Positive values are indicative of overconfidence (i.e., JOL magnitude exceeded recall performance) and negative values are indicative of underconfidence (i.e., JOL magnitude was less than recall performance). These data were examined in a 2 (Item Type: control, deceptive) × 2 (JOL Type: delayed, immediate) repeated-measures ANOVA. Overall, participants exhibited high levels of overconfidence for deceptive items, relative to the underconfidence evident for control items, F(1, 31) = 96.27, ηp² = .76, and overconfidence was greater for delayed than immediate JOLs, F(1, 31) = 10.60, ηp² = .26. These main effects were qualified by a reliable Item Type × JOL Type interaction, F(1, 31) = 24.12, ηp² = .44. In particular, the level of underconfidence did not differ for control items for delayed compared with immediate JOLs, F < 1. However, for deceptive items, participants exhibited significantly higher levels of overconfidence for delayed compared with immediate JOLs, F(1, 31) = 24.67, ηp² = .44. Thus participants’ delayed JOLs were markedly overconfident for deceptive items.
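The calibration measure defined above (mean JOL minus percentage recalled) can be sketched in a few lines of Python. The function name and the example values are hypothetical and are used only to show the sign convention; they are not data from the experiments.

```python
def calibration(jols, recalled):
    """Signed calibration score for one participant and condition.

    jols: JOLs on the 0-100 scale.
    recalled: 1 if the target was recalled at final test, else 0.
    Positive values indicate overconfidence, negative values underconfidence.
    """
    mean_jol = sum(jols) / len(jols)
    percent_recalled = 100 * sum(recalled) / len(recalled)
    return mean_jol - percent_recalled


# Hypothetical cases illustrating each direction of miscalibration.
overconfident = calibration([70, 60, 80, 50], [1, 0, 0, 0])   # mean JOL 65, recall 25%
underconfident = calibration([30, 40, 20, 50], [1, 1, 0, 1])  # mean JOL 35, recall 75%
```

A score near zero means that average predictions matched average performance, although it says nothing about whether the participant discriminated between recalled and unrecalled items, which is what relative accuracy captures.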


EXPERIMENT 2

Experiment 1 showed that delaying JOLs improved resolution for control items. However, for deceptive items, resolution did not differ for delayed compared with immediate JOLs. This failure to obtain a delayed JOL effect contrasts with prior work indicating significant improvements in relative accuracy when JOLs are delayed (see Rhodes & Tauber, 2011, for a review). The results suggest that this occurred largely because, for deceptive items, participants’ JOLs did not distinguish between recall of the target and instances in which other information, such as the related competitor, was mistakenly recalled.

Such findings raise the question of how the relative accuracy of delayed JOLs might be improved for deceptive items. One possibility is that informing participants of the nature of the deceptive items may serve to shift bases for judgement, perhaps encouraging participants to more closely scrutinise retrieved information prior to assigning a JOL. We examined this in Experiment 2 by providing participants with an explicit warning about the nature of the deceptive items. Prior studies have reported changes in performance when participants become aware (Jacoby & Whitehouse, 1989) or are informed (e.g., Kelley & Jacoby, 1996) about manipulations that induce memorial or metacognitive illusions. However, it is not a foregone conclusion that warning participants about the nature of the deceptive items will improve JOL accuracy. For example, Rhodes and Castel (2008) had participants study words presented in a large or small font. While font size was unrelated to subsequent recall performance, participants provided significantly higher JOLs for large compared with small words. The illusion persisted even when participants were explicitly warned to ignore font size when making a JOL (see also Kelley & Lindsay, 1993; Tauber & Rhodes, 2010).
To summarise, participants in Experiment 2 were specifically warned about the nature of the deceptive items and instructed to keep this in mind throughout the experiment. If such a warning allows participants to effectively monitor the accuracy of potently retrieved but incorrect information for deceptive items, then the delayed JOL effect should obtain for both control and deceptive items. However, deceptive items may induce retrieval of information that is difficult to distinguish from target information even when

forewarned. If that is the case, then the pattern of results should be the same as those of Experiment 1 where no delayed JOL effect was evident for deceptive items.

Method

Participants. A total of 32 Colorado State University psychology students participated for partial course credit. All participants were tested individually.

Materials and procedure. The materials and procedure were identical to those used in Experiment 1, with the exception that participants were provided with a warning, prior to beginning the experiment, about the nature of the deceptive items. In particular, participants were instructed:

As you recall the word pairs, beware that some of the pairs can be misleading. For example, suppose you studied the pair ALARM–CLUCK. You might later be asked to recall given ALARM–CL_ _K. In this case, you might mistakenly believe you saw “CLOCK” rather than what you actually saw, “CLUCK”. Be sure to be as accurate as possible when you are asked to recall what you studied.
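The test cues in the warning (CL_ _K) and in the abstract (do_ _ _r) appear to show the first two letters and the last letter of the target, with one blank per missing letter. The sketch below builds such a fragment and checks whether a competitor fits it. The fragment rule is inferred from those two examples only, and both function names are hypothetical.

```python
def make_cue(target: str) -> str:
    """Build a word fragment: first two letters, spaced blanks, last letter.

    Assumption: this rule is inferred from the examples CL_ _K and do_ _ _r.
    """
    if len(target) < 4:
        raise ValueError("target must have at least four letters")
    blanks = " ".join("_" * (len(target) - 3))
    return target[:2] + blanks + target[-1]


def fits_cue(word: str, target: str) -> bool:
    """True if `word` is consistent with the fragment generated from `target`."""
    return (
        len(word) == len(target)
        and word[:2] == target[:2]
        and word[-1] == target[-1]
    )


print(make_cue("CLUCK"))           # CL_ _K
print(fits_cue("CLOCK", "CLUCK"))  # True: the competitor fits the same fragment
```

On this reading, an item is deceptive when a semantically related competitor (clock after alarm, doctor after nurse) fits the fragment as well as the studied target does.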

Results

Relative accuracy. Gamma correlations (see Figure 2) for immediate JOLs differed reliably from zero for control items, t(31) = 3.37, but not for deceptive items, t(31) = 1.32, p = .195. Gamma correlations for delayed JOLs differed reliably from zero for control items, t(31) = 9.14, and for deceptive items, t(31) = 2.28. These data were analysed in a 2 (JOL Type: immediate, delayed) × 2 (Item Type: control, deceptive) repeated-measures ANOVA. Overall, correlations for delayed JOLs (M = .49, SE = .07) exceeded correlations for immediate JOLs (M = .22, SE = .07), F(1, 31) = 6.85, ηp² = .18. As well, correlations were reliably greater for control items (M = .52, SE = .06) compared with deceptive items (M = .19, SE = .07), F(1, 31) = 12.41, ηp² = .29. Item Type did not reliably interact with JOL Type, F(1, 31) = 2.52, p = .123, ηp² = .08. However, given our a priori interest in the pattern of delayed and immediate JOLs, we conducted follow-up tests on each type of item. These analyses showed that, for control items, gamma correlations for delayed JOLs reliably exceeded those for immediate JOLs, F(1, 31) = 11.49, ηp² = .27. In contrast, for deceptive items, resolution did not differ for immediate compared with delayed JOLs, F < 1. Thus, providing a warning did not improve the relative accuracy of delayed JOLs for deceptive items.

Figure 2. Mean gamma correlations by JOL Type (immediate, delayed) and Item Type (control, deceptive) in Experiment 2. Error bars reflect standard errors of the mean.

As in Experiment 1, we examined correlations between pre-JOL recall accuracy and the magnitude of delayed JOLs. In particular, JOLs were more strongly related to pre-JOL recall accuracy for control items (G = .80, SE = .07) than for deceptive items (G = .49, SE = .10), F(1, 29) = 8.39, ηp² = .21. However, despite this discrepancy in JOL accuracy, participants relied equally on JOL magnitude in their decision to volunteer a response. For example, for 24.28% of control items judged at a delay and 7.03% of deceptive items judged at a delay, participants were unable to provide any response. Correlations between the decision to volunteer a response and JOLs were near unity for both control (G = .98, SE = .12) and deceptive items (G = .95, SE = .21). Thus, participants relied to an equal degree on the response coming to mind as a basis for JOLs.
However, the inaccuracy of this information for deceptive items led to a weak relationship between final recall and JOLs even when judgements were made at a delay.

Pre-JOL recall. The percentage of targets recalled prior to the JOL (see Table 1) was examined in a 2 (Item Type: control, deceptive) × 2 (JOL Type: delayed, immediate) repeated-measures ANOVA. Recall was far better when made immediately after study (M = 96.62, SE = .80) than after a delay (M = 39.45, SE = 2.78), F(1, 31) = 425.81, ηp² = .93, and pre-JOL recall was superior for control items (M = 70.05, SE = 1.89) compared with deceptive items (M = 66.02, SE = 1.65), F(1, 31) = 4.59, ηp² = .13. Item Type did not reliably interact with JOL Type, F(1, 31) = 1.12, p = .297, ηp² = .04.

Final recall. The percentage of targets recalled on the final recall test (see Table 1) was examined in a 2 (Item Type: control, deceptive) × 2 (JOL Type: delayed, immediate) repeated-measures ANOVA. As in Experiment 1, the percentage of targets recalled was greater for control (M = 47.14, SE = 3.25) compared with deceptive (M = 38.80, SE = 2.89) items, F(1, 31) = 7.97, ηp² = .20. In addition, participants correctly recalled more items given an immediate JOL (M = 46.22, SE = 2.91) than those given a delayed JOL (M = 39.71, SE = 2.97), F(1, 31) = 7.87, ηp² = .20. Item Type did not interact with JOL Type, F < 1.

JOL magnitude. JOLs (see Table 1) were analysed in a 2 (Item Type: control, deceptive) × 2 (JOL Type: delayed, immediate) repeated-measures ANOVA. Overall, JOLs were greater for deceptive (M = 65.24, SE = 2.86) than control (M = 52.36, SE = 3.21) items, F(1, 31) = 59.14, ηp² = .65, and did not reliably differ as a function of JOL Type, F(1, 31) = 3.29, p = .079, ηp² = .10. A reliable Item Type × JOL Type interaction was also present, F(1, 31) = 39.95, ηp² = .56. In particular, JOLs were reliably greater for deceptive compared with control items when elicited immediately, F(1, 31) = 5.69, ηp² = .16. JOLs were also reliably greater for deceptive compared with control items when elicited at a delay, F(1, 31) = 57.35, ηp² = .65, but the difference was of a much greater magnitude, as indexed by the effect size, than that evident for immediate JOLs.

As in Experiment 1, we examined JOL magnitude as a function of pre-JOL recall accuracy for delayed JOLs. When the target was recalled, JOLs for deceptive items (M = 80.02, SE = 3.56) were reliably higher than JOLs for correctly recalled control items (M = 72.38, SE = 3.97), F(1, 31) = 4.39, ηp² = .12.
Likewise, when the target was not recalled, JOLs were reliably higher for deceptive items (M = 59.06, SE = 3.52) than control items (M = 22.27, SE = 2.71), F(1, 31) = 135.79, ηp² = .81, but the difference was of a much greater magnitude than that evident for immediate JOLs. Participants’ JOLs for pre-JOL recall of the related competitor (e.g., mistakenly recalling snake after studying reptile–snore) for deceptive items (M = 70.18, SE = 4.24) were reliably lower than JOLs for recall of the target for deceptive items, F(1, 31) = 7.62, ηp² = .20. However, overall, errors in recall for deceptive items continued to be assigned high JOLs, hindering participants’ ability to distinguish between correct and incorrect recall for deceptive items even when warned.

Calibration. As in Experiment 1, calibration was examined as the signed difference between JOLs and recall (see Table 1). A 2 (Item Type: control, deceptive) × 2 (JOL Type: delayed, immediate) repeated-measures ANOVA showed that participants exhibited high levels of overconfidence for deceptive items, relative to the underconfidence evident for control items, F(1, 31) = 56.92, ηp² = .65. However, calibration did not differ between immediate and delayed JOLs, F < 1. A reliable Item Type × JOL Type interaction was also evident, F(1, 31) = 9.52, ηp² = .24. In particular, for control items, overconfidence did not reliably differ for immediate compared with delayed JOLs, F(1, 31) = 2.51, p = .124, ηp² = .08. However, for deceptive items, participants exhibited significantly higher levels of overconfidence for delayed compared with immediate JOLs, F(1, 31) = 4.20, ηp² = .12. Thus, even with a warning, calibration was poorer for deceptive relative to control items and poorer for delayed compared with immediate JOLs for deceptive items.
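Relative accuracy in these analyses is indexed by the Goodman-Kruskal gamma correlation between JOLs and recall. A minimal sketch of that statistic for a single participant follows; the function name and both data sets are hypothetical and serve only to show how confidently held errors pull gamma toward zero.

```python
def goodman_kruskal_gamma(jols, recalled):
    """Gamma = (C - D) / (C + D) over all item pairs, ignoring ties.

    C counts pairs where the item with the higher JOL was also the one
    recalled; D counts pairs ordered the opposite way. Returns None when
    every pair is tied (e.g., all items recalled), as gamma is then undefined.
    """
    concordant = discordant = 0
    n = len(jols)
    for i in range(n):
        for j in range(i + 1, n):
            product = (jols[i] - jols[j]) * (recalled[i] - recalled[j])
            if product > 0:
                concordant += 1
            elif product < 0:
                discordant += 1
    if concordant + discordant == 0:
        return None
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical control-like items: high JOLs go to the recalled targets.
print(goodman_kruskal_gamma([90, 80, 40, 20], [1, 1, 0, 0]))  # 1.0

# Hypothetical deceptive-like items: two errors receive high JOLs.
print(goodman_kruskal_gamma([90, 85, 80, 30], [1, 0, 0, 1]))  # 0.0
```

The second case mirrors the pattern described above: when incorrectly recalled items are assigned high JOLs, the ordering of JOLs carries little information about final recall.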

EXPERIMENT 3

Findings from Experiment 2 indicated that a warning was insufficient to restore the delayed JOL effect for deceptive items. That is, even when warned about the nature of the deceptive items, resolution did not differ for deceptive items given immediate versus delayed JOLs. Such data contrast with the reliable delayed JOL effect evident for control items. We suggest that this occurred because the warning did not appreciably alter participants’ ability to monitor retrieved information during the pre-JOL recall phase. A more effective method of improving resolution for deceptive items given delayed JOLs may be to provide on-line information about the veracity of the information used for judgement. We attempted this in Experiment 3 by providing participants with feedback regarding the correctness of their pre-JOL recall response. If monitoring of deceptive items is generally poor because participants cannot distinguish between correct and incorrect responses, then providing such feedback may serve to improve relative accuracy. For example, data from Experiments 1 and 2 indicate that, for deceptive items, participants frequently conferred high JOLs on related distractors (e.g., snake) that were mistakenly recalled. Making feedback available may permit participants to readily identify such errors and assign JOLs accordingly, potentially restoring the delayed JOL effect for deceptive items.

Method

Participants. A total of 32 Colorado State University psychology students participated for partial course credit. All participants were tested individually.

Materials and procedure. The materials and procedure were identical to those used in Experiment 1, with the exception that participants were provided with feedback immediately after making a pre-JOL recall response. In particular, immediately after each pre-JOL recall response, a screen appeared for 2.5 seconds indicating whether that response was correct or incorrect.⁴ This was immediately followed by a screen prompting participants to make a JOL.
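The order of events on a feedback trial can be sketched as follows. Only the 2.5-second feedback duration and the correct/incorrect feedback that withholds the target come from the procedure above; the screen wording, the exact-match scoring rule, the self-paced timing of the other screens, and all names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

FEEDBACK_SECONDS = 2.5  # feedback duration stated in the procedure


@dataclass
class Screen:
    text: str
    seconds: Optional[float]  # None = self-paced (an assumption)


def feedback_trial(cue: str, target: str, response: str) -> List[Screen]:
    """Return the screens for one pre-JOL recall, feedback, and JOL sequence."""
    # Assumption: a response counts as correct only if it matches the target.
    correct = response.strip().lower() == target.lower()
    feedback = "Correct" if correct else "Incorrect"
    return [
        Screen(f"Recall the second word: {cue}", None),
        # The target itself is never shown, only the verdict.
        Screen(feedback, FEEDBACK_SECONDS),
        Screen("How likely are you to recall this item later? (0-100)", None),
    ]


for screen in feedback_trial("reptile-sn_ _e", "snore", "snake"):
    print(screen)
```

Because the feedback screen reports only the verdict, the sequence adds no further exposure to the studied pair.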

⁴ The correct response was not provided, thus eliminating the possibility that the provision of feedback changed the amount of time participants were exposed to the studied pair.

Results

Relative accuracy. Gamma correlations (see Figure 3) for immediate JOLs reliably differed from zero for control items, t(31) = 2.18, and were marginally different from zero for deceptive items, t(31) = 2.01, p = .053. Gamma correlations for delayed JOLs reliably differed from zero for control items, t(31) = 17.41, and for deceptive items, t(31) = 18.18. These data were analysed in a 2 (Item Type: control, deceptive) × 2 (JOL Type: immediate, delayed) repeated-measures ANOVA. Overall, correlations for delayed JOLs (M = .86, SE = .03) exceeded correlations for immediate JOLs (M = .19, SE = .06), F(1, 31) = 111.40, ηp² = .78. However, correlations did not reliably differ for control compared with deceptive items, F < 1, nor did JOL Type interact with Item Type, F < 1. Specifically, resolution was reliably better for delayed compared with immediate JOLs for control items, F(1, 31) = 55.40, ηp² = .64, and, in contrast to Experiments 1 and 2, was reliably better for delayed JOLs for deceptive items, F(1, 31) = 50.17, ηp² = .62. This benefit for delaying judgement was also apparent at the pre-JOL recall stage. In particular, the gamma correlation between JOLs and recall accuracy at the pre-JOL recall stage was robust for both control (G = .95, SE = .04) and deceptive (G = .98, SE = .01) items and did not reliably differ, F(1, 31) = 1.65, p = .208, ηp² = .05. Thus, feedback permitted participants to effectively distinguish between correct and incorrect responses during pre-JOL recall, leading to high levels of relative accuracy for all items.

Figure 3. Mean gamma correlations by JOL Type (immediate, delayed) and Item Type (control, deceptive) in Experiment 3. Error bars reflect standard errors of the mean.

Pre-JOL recall. The percentage of targets recalled prior to the JOL (see Table 1) was examined in a 2 (Item Type: control, deceptive) × 2 (JOL Type: delayed, immediate) repeated-measures ANOVA. Recall was reliably better when made immediately after study (M = 98.31, SE = .64) than after a delay (M = 46.88, SE = 2.96), F(1, 31) = 329.12, ηp² = .91, and pre-JOL recall was superior for control items (M = 75.00, SE = 1.74) compared with deceptive items (M = 70.18, SE = 1.80), F(1, 31) = 10.68, ηp² = .26. However, Item Type did not reliably interact with JOL Type, F(1, 31) = 3.81, p = .060, ηp² = .11. In particular, recall was superior for control relative to deceptive items when JOLs were made immediately, F(1, 31) = 6.36, ηp² = .17, and at a delay, F(1, 31) = 7.15, ηp² = .19.

Final recall. The percentage of targets recalled on the final recall test (see Table 1) was examined in a 2 (Item Type: control, deceptive) × 2 (JOL Type: delayed, immediate) repeated-measures ANOVA. Recall was greater for control (M = 57.29, SE = 2.90) compared with deceptive (M = 50.00, SE = 2.78) items, F(1, 31) = 7.27, ηp² = .19. As well, participants correctly recalled more items given an immediate JOL (M = 56.90, SE = 2.50) than items given a delayed JOL (M = 50.39, SE = 2.97), F(1, 31) = 8.13, ηp² = .21. Item Type did not interact with JOL Type, F < 1.

JOL magnitude. JOLs (see Table 1) were analysed in a 2 (Item Type: control, deceptive) × 2 (JOL Type: delayed, immediate) repeated-measures ANOVA. Overall, JOLs did not reliably differ for control (M = 50.59, SE = 2.19) compared with deceptive (M = 49.03, SE = 2.27) items, F(1, 31) = 1.48, p = .233, ηp² = .05. However, JOLs were reliably greater for immediate JOLs (M = 60.14, SE = 2.78) relative to delayed JOLs (M = 39.48, SE = 2.60), F(1, 31) = 39.85, ηp² = .56. Item Type did not interact with JOL Type, F(1, 31) = 1.60, p = .215, ηp² = .05. In particular, JOLs for deceptive compared with control items did not differ when elicited immediately, F < 1, and, in contrast with prior experiments, did not differ when elicited at a delay, F(1, 31) = 2.51, p = .124, ηp² = .08. Thus, the provision of feedback resulted in more conservative JOLs for deceptive items, particularly when JOLs were elicited after a delay.

As in Experiment 1, we examined JOL magnitude as a function of pre-JOL recall accuracy for delayed JOLs. When the target was correctly recalled, JOLs for deceptive items (M = 72.22, SE = 3.27) did not differ from JOLs for control items (M = 75.21, SE = 3.39), F(1, 31) = 2.36, p = .135, ηp² = .07. Likewise, when the target was not recalled, JOLs were marginally, but not reliably, greater for deceptive items (M = 12.47, SE = 2.37) than control items (M = 8.51, SE = 2.37), F(1, 31) = 3.92, p = .057, ηp² = .11. In addition, JOLs for pre-JOL recall of the related competitor (e.g., snake) for deceptive items (M = 11.95, SE = 2.55) were reliably lower than JOLs for recall of the target for deceptive items, F(1, 31) = 278.07, ηp² = .90. Thus, when provided with feedback, JOLs for incorrectly recalled deceptive items were low in magnitude and did not differ from errors made for control items.

Calibration. Calibration data (Table 1) were examined in a 2 (Item Type: control, deceptive) × 2 (JOL Type: delayed, immediate) repeated-measures ANOVA. Overall, participants
were underconfident when JOLs were elicited at a delay and somewhat overconfident when JOLs were elicited immediately, leading to reliable differences in calibration, F(1, 31) = 40.99, ηp² = .57. As well, overall, participants were reliably more underconfident for control relative to deceptive items, F(1, 31) = 5.69, ηp² = .16. Item Type did not interact with JOL Type, F < 1. In fact, participants were better calibrated for deceptive items when JOLs were made at a delay than immediately, F(1, 31) = 4.34, ηp² = .12, in contrast to the prior experiments. Thus, with feedback, participants’ delayed JOLs demonstrated superior calibration for deceptive compared with control items.
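The effect sizes that accompany each F ratio can be recovered from the F value and its degrees of freedom using the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies it to the three calibration effects just reported; the function name is hypothetical, and because the reported F values are themselves rounded, a recomputed value can occasionally differ from a published one in the second decimal place.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Calibration effects reported above for Experiment 3, each with df = (1, 31).
for f_value in (40.99, 5.69, 4.34):
    print(f_value, round(partial_eta_squared(f_value, 1, 31), 2))
```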

GENERAL DISCUSSION

Common frameworks of metacognition suggest that monitoring informs control processes (Nelson, 1996; Nelson & Narens, 1990). Therefore, it is important that monitoring accurately reflects learning. The experiments reported examined the extent to which predictions of memory performance were sensitive to a manipulation that impairs memory accuracy. In particular, one method of increasing the accuracy of memory predictions is to make such predictions at a delay (Nelson & Dunlosky, 1991; Rhodes & Tauber, 2011). Does delaying judgement increase the relative accuracy of predictions of memory performance for confidently held but inaccurate information? The results of Experiment 1 showed that this was not the case. That is, while delaying JOLs led to a substantial increase in relative accuracy for control items (e.g., kite–center), there was no evidence in Experiment 1 of a delayed JOL effect for deceptive items (e.g., table–cheer) that readily induce retrieval of related but incorrect information. We suggest that the delayed JOL effect did not occur for deceptive items because participants were unable to distinguish between accurate and inaccurate information. Consequently, monitoring accuracy did not improve for deceptive items even when judgements were made at a delay.

In Experiment 2 we attempted to restore the delayed JOL effect for deceptive items by explicitly warning participants about the nature of those items. Results showed that, even with a warning, resolution did not differ for deceptive items for immediate compared with delayed JOLs. Thus, simply providing information about the general nature of the stimuli did not enhance participants’ ability to distinguish between correct and incorrect information used as a basis for judgement (cf. Rhodes & Castel, 2008; Tauber & Rhodes, 2010). In Experiment 3 participants were given feedback, immediately after the pre-JOL recall response, regarding whether their response was correct. With feedback, participants demonstrated the delayed JOL effect for deceptive items, exhibiting superior resolution for items given delayed compared with immediate JOLs. Thus, providing feedback on the correctness of information coming to mind was sufficient to substantially increase the accuracy of predictions of future memory performance.

Why did feedback improve resolution for deceptive items given delayed JOLs? We consider two possibilities. First, feedback permitted participants to assign appropriate JOLs for intrusions recalled in response to cues for deceptive items. For example, when the related competitor was mistakenly recalled (e.g., snake after studying reptile–snore), participants in Experiment 1 (M = 66.85, SE = 4.24) and Experiment 2 (M = 70.18, SE = 4.24) provided high JOLs for deceptive items that were essentially indistinguishable from JOLs provided when the target was recalled. However, when participants were given feedback on the veracity of a pre-JOL recall response in Experiment 3, the JOLs assigned when the related competitor was recalled were considerably lower (M = 11.95, SE = 2.55) and reliably differed from JOLs given to target responses. Thus, when feedback was provided, participants’ JOLs distinguished between retrieval of the target and memory errors.

Such an idea assumes that the improvement in relative accuracy for deceptive items given delayed JOLs in Experiment 3 solely reflects a benefit to monitoring. However, feedback might alter the information retrieved in response to the cue (cf. Jacoby, Kelley, & McElree, 1999; Jacoby, Shimizu, Daniels, & Rhodes, 2005) or change the manner of encoding the study list (cf. Jacoby, Wahlheim, Rhodes, Daniels, & Rogers, 2010). This should be apparent in fewer instances of participants reporting the related competitor for deceptive items during the pre-JOL recall stage. Indeed, participants were more likely to mistakenly report the related competitor at the pre-JOL stage in Experiment 1 (M = 55.99, SE = 3.49) than in Experiment 2 (M = 49.22, SE = 3.43) or Experiment 3 (M = 42.97, SE = 2.97). A one-way ANOVA on these responses with Experiment as a
between-participants factor confirmed that the probability of mistakenly recalling the related competitor at the pre-JOL stage reliably differed across experiments, F(2, 93) = 4.02, ηp² = .08. Follow-up tests showed that while the probability of recalling the related competitor for deceptive items did not differ between Experiments 1 and 2, t(62) = 1.41, p = .163, d = .35, participants in Experiment 3 mistakenly recalled the related competitor less often than participants in Experiment 1, t(62) = 2.89, d = .72. However, the probability of mistakenly recalling the related competitor did not differ between Experiments 2 and 3, t(62) = 1.40, p = .167, d = .35.⁵ Because resolution was reliably better for deceptive items given delayed JOLs for the feedback condition compared with the warning condition for both pre-JOL recall, t(62) = 5.10, d = 1.28, and final recall, t(62) = 5.20, d = 1.30, it does not appear that changes in encoding can account for the benefit of feedback. Rather, feedback likely ensured that participants reported JOLs that discriminated between accurate and inaccurate information and thus brought about a delayed JOL effect for deceptive items.
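The d values in these between-experiment comparisons are consistent with the common conversion d = t × sqrt(1/n1 + 1/n2) for two independent groups, here with 32 participants per experiment. The text does not state how d was computed, so the sketch below is an assumption that happens to reproduce the reported values; the function name is hypothetical.

```python
import math


def cohens_d_from_t(t_value: float, n1: int, n2: int) -> float:
    """Cohen's d recovered from an independent-samples t and the group sizes."""
    return t_value * math.sqrt(1 / n1 + 1 / n2)


# Comparisons reported above, each with 32 participants per experiment.
for t_value in (1.41, 2.89, 5.20):
    print(t_value, round(cohens_d_from_t(t_value, 32, 32), 2))
```

With equal groups of 32 the multiplier is exactly 0.25, so each d is one quarter of the corresponding t.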

⁵ The pattern of results is the same if the analyses are restricted to the probability of recalling the target for deceptive items.

Perspectives on the delayed JOL effect

The failure to obtain a delayed JOL effect in Experiments 1 and 2 for deceptive items contrasts with numerous prior observations of improvements in resolution when JOLs are delayed (see Rhodes & Tauber, 2011, for a review). The monitoring dual memories hypothesis (e.g., Dunlosky & Nelson, 1994; Nelson & Dunlosky, 1991; Nelson et al., 2004) suggests that when JOLs are made immediately after study participants must rely on information that is currently accessible. Because testing often occurs after a retention interval of at least a few minutes, such information will not be diagnostic of future memory performance. In contrast, when JOLs are made at a delay, participants can use information from LTM to inform JOLs. Likewise, the self-fulfilling prophecy account (Spellman & Bjork, 1992; see also Kimball & Metcalfe, 2003) suggests that delayed JOL accuracy reflects access to LTM. However, rather than enhancing monitoring, delayed JOLs are viewed as an opportunity for retrieval practice. When participants successfully retrieve a target during a delayed JOL they will likely ascribe a higher JOL to that item than to items that were not retrieved or did not receive practice. From this perspective, the delayed JOL effect occurs not because of any benefit conferred to monitoring processes but because successful retrieval increases the chance of subsequent target retrieval and therefore begets an accurate JOL.

Taken together, both theories posit that delayed JOL accuracy accrues from an interaction with and access to LTM. The current study suggests that while delayed JOLs may indeed rely on access to LTM, access to LTM alone is not sufficient to improve delayed JOL accuracy and may be compromised by interference from semantically related information. Thus, delayed JOLs are likely based on attributions holding that what is retrieved from LTM is predictive of accurate recall on a future test (cf. Jacoby et al., 1989; Koriat, 1997). To the extent that delayed JOLs rely on accurate information coming to mind as a basis for judgement, JOLs will be accurate and diagnostic of future memory performance. However, to the extent that participants retrieve information from LTM that is inaccurate and cannot be distinguished from accurate information, delayed JOLs will not be diagnostic of future memory performance.⁶ From this inferential perspective the accuracy of delayed JOLs is not inevitable and is a function of the match between the heuristic employed and the nature of the information that is or is not retrieved.

⁶ We note that eliciting target recall just prior to the JOL provides an independent assessment of the accuracy of the information used in making the JOL and thus avoids any concerns about circularity in the logic of this argument.

This idea is consistent with a variety of other work suggesting that metamemory judgements reflect inferences based on available cues (Koriat, 1997) rather than direct access to the contents of memory (e.g., Hart, 1965). Such inferences will be flawed to the extent that the information or cues available for judgement are not predictive of performance (e.g., Benjamin, Bjork, & Schwartz, 1998; Brewer & Sampaio, 2006; Brewer, Sampaio, & Barlow, 2005; Kelley & Lindsay, 1993; Koriat, 1993, 1995; Koriat & Levy-Sadot, 2001; Rhodes & Castel, 2008, 2009). For example, Brewer et al. (2005) had participants study sentences (e.g., The saw was concealed in the cake) that, when later recalled, were highly likely to lead to substitutions of synonyms that were not studied (e.g., The saw
was hidden in the cake). Results showed that participants often mistakenly recalled substituted synonyms and exhibited essentially the same level of confidence for such errors as items that were correctly recalled. Brewer et al. concluded that participants mistakenly applied a heuristic holding that complete recall was a cue for accuracy and high levels of confidence. This heuristic would be valid when recall was accurate but would lead to inappropriately high levels of confidence for incorrectly recalled items. In a similar manner, delayed JOLs likely rely on a heuristic holding that what does or does not come to mind in response to a cue is predictive of future memory performance (cf. Koriat, 1993, 1995). When this inference is made after retrieving inaccurate information, the delayed JOL effect will not obtain. In the present experiments, only when feedback was provided that permitted participants to distinguish between accurate and inaccurate information retrieved from LTM (i.e., Experiment 3) did delayed JOLs confer a monitoring advantage for deceptive items. Thus, delayed judgements may indeed produce more accurate monitoring than immediate judgements. However, monitoring will only be enhanced if veridical information is used as a basis for that judgement or if inaccurate information can be readily distinguished from veridical information.

Manuscript received 8 October 2010
Manuscript accepted 18 July 2011
First published online 18 October 2011

REFERENCES

Anastasi, J. S., Rhodes, M. G., & Burns, M. C. (2000). Distinguishing between memory illusions and actual memories utilizing phenomenological measurements and explicit warnings. American Journal of Psychology, 113, 1–26. doi: 10.2307/1423458
Arbuckle, T. Y., & Cuddy, L. L. (1969). Discrimination of item strength at time of presentation. Journal of Experimental Psychology, 81, 126–131. doi: 10.1037/h0027455
Benjamin, A. S., Bjork, R. A., & Schwartz, B. L. (1998). The mismeasure of memory: When retrieval fluency is misleading as a metacognitive index. Journal of Experimental Psychology: General, 127, 55–69. doi: 10.1037/0096-3445.127.1.55
Benjamin, A. S., & Diaz, M. (2008). Measure of relative metamnemonic accuracy. In J. Dunlosky & R. A. Bjork (Eds.), Handbook of memory and metamemory (pp. 73–94). New York, NY: Psychology Press.
Brewer, W. F., & Sampaio, C. (2006). Processes leading to confidence and accuracy in sentence recognition: A metamemory approach. Memory, 14, 540–552. doi: 10.1080/09658210600590302
Brewer, W. F., Sampaio, C., & Barlow, M. R. (2005). Confidence and accuracy in the recall of deceptive and nondeceptive sentences. Journal of Memory and Language, 52, 618–627. doi: 10.1016/j.jml.2005.01.017
Dunlosky, J., & Nelson, T. O. (1992). Importance of the kind of cue for judgements of learning (JOL) and the delayed-JOL effect. Memory & Cognition, 20, 374–380. doi: 10.3758/BF03210921
Dunlosky, J., & Nelson, T. O. (1994). Does the sensitivity of judgements of learning (JOLs) to the effects of various study activities depend on when the JOLs occur? Journal of Memory and Language, 33, 545–565. doi: 10.1006/jmla.1994.1026
Dunlosky, J., & Nelson, T. O. (1997). Similarity between the cue for judgements of learning (JOL) and the cue for test is not the primary determinant of JOL accuracy. Journal of Memory and Language, 36, 34–49. doi: 10.1006/jmla.1996.2476
Gonzalez, R., & Nelson, T. O. (1996). Measuring ordinal association in measures that contain tied scores. Psychological Bulletin, 119, 159–165. doi: 10.1037/0033-2909.119.1.159
Hart, J. T. (1965). Memory and the feeling-of-knowing experience. Journal of Educational Psychology, 56, 208–216.
Jacoby, L. L., Kelley, C. M., & Dywan, J. (1989). Memory attributions. In H. L. Roediger & F. I. M. Craik (Eds.), Varieties of memory and consciousness: Essays in honor of Endel Tulving (pp. 391–422). Hillsdale, NJ: Lawrence Erlbaum Associates.
Jacoby, L. L., Kelley, C. M., & McElree, B. D. (1999). The role of cognitive control: Early selection versus late correction. In S. Chaiken & Y. Trope (Eds.), Dual process theories in social psychology (pp. 383–400). New York, NY: The Guilford Press.
Jacoby, L. L., Shimizu, Y., Daniels, K. A., & Rhodes, M. G. (2005). Modes of cognitive control in recognition and source memory: Depth of retrieval. Psychonomic Bulletin & Review, 12, 852–857.
Jacoby, L. L., Wahlheim, C. N., Rhodes, M. G., Daniels, K. A., & Rogers, C. S. (2010). Learning to diminish the effects of proactive interference: Reducing false memory for younger and older adults. Memory & Cognition, 38, 820–829. doi: 10.3758/MC.38.6.820
Jacoby, L. L., & Whitehouse, K. (1989). An illusion of memory: False recognition influenced by unconscious perception. Journal of Experimental Psychology: General, 118, 126–135. doi: 10.1037/0096-3445.118.2.126
Kato, T. (1985). Semantic-memory sources of episodic retrieval failure. Memory & Cognition, 13, 442–452. doi: 10.3758/BF03198457
Kelley, C. M., & Jacoby, L. L. (1996). Adult egocentrism: Subjective experience versus analytic bases for judgement. Journal of Memory and Language, 35, 157–175. doi: 10.1006/jmla.1996.0009
Kelley, C. M., & Lindsay, S. D. (1993). Remembering mistaken for knowing: Ease of retrieval as a basis for confidence in answers to general knowledge questions. Journal of Memory and Language, 32, 1–24. doi: 10.1006/jmla.1993.1001
Kelley, C. M., & Rhodes, M. G. (2002). Making sense and nonsense of experience: Attributions in memory and judgement. In B. Ross (Ed.), The psychology of learning and motivation (pp. 293–320). New York, NY: Academic Press.
Kelley, C. M., & Sahakyan, L. (2003). Memory, monitoring, and control in the attainment of memory accuracy. Journal of Memory and Language, 48, 704–721. doi: 10.1016/S0749-596X(02)00504-1
Kimball, D. R., & Metcalfe, J. (2003). Delaying judgements of learning affects memory, not metamemory. Memory & Cognition, 31, 918–929. doi: 10.3758/BF03196445
Koriat, A. (1993). How do we know that we know? The accessibility model of feelings of knowing. Psychological Review, 100, 609–639. doi: 10.1037/0033-295X.100.4.609
Koriat, A. (1995). Dissociating knowing and feeling of knowing: Further evidence for the accessibility model. Journal of Experimental Psychology: General, 124, 311–333. doi: 10.1037/0096-3445.124.3.311
Koriat, A. (1997). Monitoring one’s own knowledge during study: A cue-utilization approach to judgements of learning. Journal of Experimental Psychology: General, 126, 349–370. doi: 10.1037/0096-3445.126.4.349
Koriat, A. (2007). Metacognition and consciousness. In P. D. Zelazo, M. Moscovitch, & E. Thompson (Eds.), The Cambridge handbook of consciousness (pp. 289–325). New York, NY: Cambridge University Press.
Koriat, A., & Bjork, R. A. (2006). Illusions of competence during study can be remedied by manipulations that enhance learners’ sensitivity to retrieval conditions at test. Memory & Cognition, 34, 959–972. doi: 10.3758/BF03193244
Koriat, A., & Goldsmith, M. (1996). Monitoring and control processes in the strategic regulation of memory accuracy. Psychological Review, 103, 490–517. doi: 10.1037/0033-295X.103.3.490
Koriat, A., Goldsmith, M., & Pansky, A. (2001). Toward a psychology of memory accuracy. Annual Review of Psychology, 51, 481–537. doi: 10.1146/annurev.psych.51.1.481
Koriat, A., & Levy-Sadot, R. (2001). The combined contributions of the cue familiarity and accessibility heuristics to feelings of knowing. Journal of Experimental Psychology: Learning, Memory, and Cognition, 27, 34–53. doi: 10.1037/0278-7393.27.1.34
Koriat, A., & Ma’ayan, H. (2005). The effects of encoding fluency and retrieval fluency on judgements of learning. Journal of Memory and Language, 52, 478–492. doi: 10.1016/j.jml.2005.01.001
Koriat, A., Ma’ayan, H., & Nussinson, R. (2006). The intricate relationships between monitoring and control in metacognition: Lessons for the cause-and-effect relation between subjective experience and behavior. Journal of Experimental Psychology: General, 135, 36–69. doi: 10.1037/0096-3445.135.1.36
Loftus, E. F., Miller, D. G., & Burns, H. J. (1978). Semantic integration of verbal information into visual memory. Journal of Experimental Psychology: Human Learning and Memory, 4, 19–31. doi: 10.1037/0278-7393.4.1.19

Masson, M. E. J., & Rotello, C. M. (2009). Sources of bias in the Goodman-Kruskal gamma coefficient measure of association: Implications for studies of metacognitive processes. Journal of Experimental Psychology: Learning, Memory, and Cognition, 35, 509–527. doi: 10.1037/a0014876
Meeter, M., & Nelson, T. (2003). Multiple study trials and judgements of learning. Acta Psychologica, 113, 123–132. doi: 10.1016/S0001-6918(03)00023-4
Metcalfe, J. (2000). Metamemory: Theory and data. In E. Tulving & F. I. M. Craik (Eds.), The Oxford handbook of memory (pp. 197–211). New York, NY: Oxford University Press.
Metcalfe, J., & Finn, B. (2008). Evidence that judgements of learning are causally related to study choice. Psychonomic Bulletin & Review, 15, 174–179. doi: 10.3758/PBR.15.1.174
Nelson, T. O. (1984). A comparison of current measures of the accuracy of feeling-of-knowing predictions. Psychological Bulletin, 95, 109–133. doi: 10.1037/0033-2909.95.1.109
Nelson, T. O. (1996). Consciousness and metacognition. American Psychologist, 51, 102–116. doi: 10.1037/0003-066X.51.2.102
Nelson, T. O., & Dunlosky, J. (1991). When people’s judgements of learning (JOLs) are extremely accurate at predicting subsequent recall: The ‘delayed-JOL effect’. Psychological Science, 2, 267–270. doi: 10.1111/j.1467-9280.1991.tb00147.x
Nelson, T. O., & Narens, L. (1990). Metamemory: A theoretical framework and new findings. In G. H. Bower (Ed.), The psychology of learning and motivation (Vol. 26, pp. 125–173). New York, NY: Academic Press.
Nelson, T. O., & Narens, L. (1994). Why investigate metacognition? In J. Metcalfe & A. P. Shimamura (Eds.), Metacognition: Knowing about knowing (pp. 1–25). Cambridge, MA: MIT Press.
Nelson, T. O., Narens, L., & Dunlosky, J. (2004). A revised methodology for research on metamemory: Pre-judgement recall and monitoring (PRAM). Psychological Methods, 9, 53–69. doi: 10.1037/1082-989X.9.1.53
Owens, J., Bower, G. H., & Black, J. B. (1979). The “soap opera” effect in story recall. Memory & Cognition, 7, 185–191. doi: 10.3758/BF03197537
Rhodes, M. G., & Castel, A. D. (2008). Memory predictions are influenced by perceptual information: Evidence for metacognitive illusions. Journal of Experimental Psychology: General, 137, 615–625. doi: 10.1037/a0013684
Rhodes, M. G., & Castel, A. D. (2009). Metacognitive illusions for auditory information: Effects on monitoring and control. Psychonomic Bulletin & Review, 16, 550–554. doi: 10.3758/PBR.16.3.550
Rhodes, M. G., & Jacoby, L. L. (2007). Toward analysing cognitive illusions: Past, present, and future. In J. S. Nairne (Ed.), The foundations of remembering: Essays in honor of Henry L. Roediger III (pp. 379–393). New York, NY: Psychology Press.
Rhodes, M. G., & Kelley, C. M. (2005). Executive processes, memory accuracy, and memory monitoring: An aging and individual difference analysis. Journal of Memory and Language, 52, 578–594. doi: 10.1016/j.jml.2005.01.014
Rhodes, M. G., & Tauber, S. K. (2011). The influence of delaying Judgements of Learning (JOLs) on metacognitive accuracy: A meta-analytic review. Psychological Bulletin, 137, 131–148. doi: 10.1037/a0021705
Roediger, H. L., & McDermott, K. B. (1995). Creating false memories: Remembering words not presented in lists. Journal of Experimental Psychology: Learning, Memory, & Cognition, 21, 803–814. doi: 10.1037/0278-7393.21.4.803
Serra, M. J., Dunlosky, J., & Hertzog, C. (2008). Do older adults show less confidence in their monitoring of learning? Experimental Aging Research, 34, 379–391. doi: 10.1080/03610730802271898
Spellman, B. A., & Bjork, R. A. (1992). When predictions create reality: Judgements of learning may alter what they are intended to assess. Psychological Science, 5, 315–316. doi: 10.1111/j.1467-9280.1992.tb00680.x
Spellman, B. A., Bloomfield, A., & Bjork, R. A. (2008). Measuring memory and metamemory: Theoretical and statistical problems with assessing learning (in general) and using gamma (in particular) to do so. In J. Dunlosky & R. A. Bjork (Eds.), Handbook of metamemory and memory (pp. 95–114). New York, NY: Psychology Press.
Tauber, S. K., & Rhodes, M. G. (2010). Does the amount of material to be remembered influence judgements of learning (JOLs)? Memory, 18, 351–362. doi: 10.1080/09658211003662755
Wahlheim, C. N. (2011). Predicting memory performance under conditions of proactive interference: Immediate and delayed judgements of learning. Memory & Cognition, 39, 827–838. doi: 10.3758/s13421-010-0065-9


APPENDIX A

Items used in the experiments reported

| Related items | Control items* | Deceptive items | Related distractor** |
| --- | --- | --- | --- |
| answer-reply | able-house | able-winning | willing |
| basic-simple | after-whirl | after-behave | before |
| bird-feather | agony-mumbled | agony-decent | defeat |
| call-phone | annoy-slump | annoy-bowler | bother |
| car-travel | avoid-grain | avoid-ignite | ignore |
| cat-kitten | back-afford | back-frost | front |
| chocolate-vanilla | bank-spies | bank-moldy | money |
| deep-shallow | beast-frost | beast-annual | animal |
| end-begin | bed-shorts | bed-slump | sleep |
| enter-leave | beetle-winning | beetle-infant | insect |
| fiddle-violin | blossom-middle | blossom-flavor | flower |
| flu-virus | bored-whale | bored-timid | tired |
| fork-knife | clean-behave | clean-diary | dirty |
| fruit-orange | dawn-ignite | dawn-duck | dusk |
| glacier-iceberg | early-flirt | early-lace | late |
| globe-world | finish-flavor | finish-stilt | start |
| half-whole | friend-poodle | friend-enjoy | enemy |
| hill-valley | gold-sale | gold-sister | silver |
| honey-sweet | grass-duck | grass-grain | green |
| ice-water | hour-stilt | hour-middle | minute |
| idea-thought | identical-fight | identical-sale | same |
| inside-outside | idiot-lace | idiot-staged | stupid |
| kill-murder | ketchup-harbor | ketchup-mumbled | mustard |
| king-queen | last-moldy | last-fight | first |
| knee-elbow | leopard-same | leopard-spies | spots |
| knob-handle | lightning-leader | lightning-theater | thunder |
| lack-without | linen-bowler | linen-shorts | sheets |
| lane-street | mail-plate | mail-leader | letter |
| lion-tiger | nail-decent | nail-harbor | hammer |
| many-several | need-timid | need-wart | want |
| mint-candy | nurse-refund | nurse-dollar | doctor |
| morning-night | officer-sharp | officer-poodle | police |
| noisy-quiet | painter-cheer | painter-arrest | artist |
| old-young | pilot-theater | pilot-plate | plane |
| pearl-oyster | pony-infant | pony-house | horse |
| princess-prince | raft-staged | raft-flirt | float |
| risk-danger | remember-coast | remember-fought | forget |
| sea-ocean | reptile-arrest | reptile-snore | snake |
| soldier-military | rescue-sister | rescue-same | save |
| squeak-mouse | scared-annual | scared-afford | afraid |
| study-school | snow-wart | snow-whale | white |
| thread-needle | subject-dollar | subject-master | matter |
| trouble-problem | table-snore | table-cheer | chair |
| trust-loyal | tape-master | tape-refund | record |
| truth-honest | tennis-drawer | tennis-coast | court |
| turtle-shell | truck-fought | truck-drawer | driver |
| wide-narrow | wagon-diary | wagon-whirl | wheel |
| yesterday-today | wool-enjoy | wool-sharp | sheep |

*Control items were created by randomly re-pairing cues and targets from the deceptive items.
**Related distractors are unstudied items that were related to the cue and shared the same first two letters and last letter as the target.
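The construction rule for related distractors stated in the note above can be checked mechanically. The following is a minimal Python sketch, not the authors' procedure for generating materials: the function names `cue_fragment` and `shares_frame` are hypothetical, and the fragment layout (one blank per omitted letter) is an assumption modelled on the do_ _ _r example in the abstract.

```python
def cue_fragment(word: str) -> str:
    """Build a test fragment: first two letters, blanks, last letter.

    Assumption: one underscore per omitted letter, as in 'do___r'.
    """
    return word[:2] + "_" * (len(word) - 3) + word[-1]


def shares_frame(target: str, distractor: str) -> bool:
    """True if both words share the first two letters and the last letter."""
    return target[:2] == distractor[:2] and target[-1] == distractor[-1]


# A few (cue, target, related distractor) triples copied from Appendix A.
deceptive_triples = [
    ("nurse", "dollar", "doctor"),
    ("able", "winning", "willing"),
    ("table", "cheer", "chair"),
]

# Every listed distractor should fit the same letter frame as its target.
all_fit = all(shares_frame(target, lure) for _, target, lure in deceptive_triples)
```

Under these assumptions, `cue_fragment("dollar")` and `cue_fragment("doctor")` yield the same fragment, which is what makes the related distractor a plausible but incorrect completion at test.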


APPENDIX B

Mean percentage of correct pre-JOL recall, mean JOLs, mean recall of targets, mean calibration score, and mean gamma correlation by JOL type and experiment for related items

| JOL type | Pre-JOL recall | JOL | Final recall | Calibration | G |
| --- | --- | --- | --- | --- | --- |
| Experiment 1 | | | | | |
| Immediate JOLs | 100.00 (–) | 80.54 (3.08) | 96.74 (0.82) | -16.20 (3.08) | 0.46 (.15) |
| Delayed JOLs | 92.84 (0.97) | 76.00 (3.15) | 94.27 (0.93) | -18.27 (3.16) | 0.82 (.11) |
| Experiment 2 | | | | | |
| Immediate JOLs | 99.87 (0.13) | 86.86 (2.06) | 94.01 (0.97) | -7.15 (2.33) | 0.22 (.15) |
| Delayed JOLs | 92.58 (0.98) | 82.47 (1.71) | 94.01 (0.78) | -11.54 (1.88) | 0.71 (.13) |
| Experiment 3 | | | | | |
| Immediate JOLs | 99.87 (0.13) | 83.23 (2.31) | 95.05 (0.90) | -11.82 (2.20) | 0.44 (.14) |
| Delayed JOLs | 92.06 (0.96) | 80.75 (2.33) | 94.66 (0.74) | -13.91 (2.29) | 0.92 (.04) |

Standard errors are in parentheses. G = gamma correlation between JOLs and final recall.
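As a rough illustration of the two accuracy measures tabled above, the sketch below computes a calibration score and a Goodman-Kruskal gamma correlation in Python. It assumes that calibration is the mean JOL minus mean final recall (in percentage points) and that gamma is computed over all item pairs with tied pairs excluded. The function names and the toy item-level data are hypothetical and are not taken from the experiments.

```python
from itertools import combinations


def calibration(mean_jol: float, mean_recall: float) -> float:
    """Absolute accuracy: mean JOL minus mean recall (assumed definition).

    Negative values indicate underconfidence.
    """
    return mean_jol - mean_recall


def goodman_kruskal_gamma(jols, recalled):
    """Relative accuracy: (C - D) / (C + D) over all item pairs.

    A pair is concordant (C) when the item with the higher JOL also has the
    higher recall score, discordant (D) when the order is reversed; pairs
    tied on either variable are ignored. Returns None if no pair is untied.
    """
    concordant = discordant = 0
    for (j1, r1), (j2, r2) in combinations(zip(jols, recalled), 2):
        product = (j1 - j2) * (r1 - r2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    if concordant + discordant == 0:
        return None
    return (concordant - discordant) / (concordant + discordant)


# Group means for Experiment 1, immediate JOLs (from the table above).
exp1_immediate = round(calibration(80.54, 96.74), 2)  # -16.2

# Hypothetical single participant: JOLs (0-100) and recall (1 = correct).
toy_jols = [80, 20, 60, 40]
toy_recall = [1, 0, 0, 1]
toy_gamma = goodman_kruskal_gamma(toy_jols, toy_recall)  # 0.5
```

In the toy data three untied pairs are concordant and one is discordant, giving (3 - 1) / (3 + 1) = 0.5. Gamma is undefined when every pair is tied, which is why the sketch returns `None` rather than dividing by zero.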
