Distinct forms of implicit learning that respond differentially to performance errors and sensory prediction errors
Yohsuke R. Miyamoto, Shengxin Wang, Andrew E. Brennan, Maurice A. Smith

Recent studies have investigated the contribution of an explicit strategy alongside implicit learning during visuomotor adaptation (e.g. Mazzoni & Krakauer 2006, Taylor & Ivry 2010, Taylor et al. 2014). A prominent idea is that strategy learning is driven by performance error (the discrepancy between desired and actual motion), whereas implicit learning is driven by sensory-prediction error (the discrepancy between actual motion and an internal model prediction). The key consequence of this idea is that implicit learning would proceed in a manner independent of strategy, because sensory-prediction errors should be unaffected by strategy. The current study challenges this idea. Here, we decomposed implicit and explicit (strategic) adaptation into temporally-stable (TS) and temporally-labile (TL) parts, and found that strategy is purely TS while implicit learning has distinct TS and TL components. Surprisingly, we found that while TS implicit learning may be driven by sensory-prediction error, TL implicit learning is instead driven by performance error.

Subjects adapted short (9 cm), rapid (~300 ms), point-to-point reaching arm movements to a ±30° visuomotor rotation (VMR). The task required subjects to select an aiming direction strategy for each movement before initiating it, by adjusting the position of an aiming marker using a keypad. Having trial-by-trial information about the aiming direction allowed us to decompose the learning on each trial into separate strategic and implicit components (Fig. 1a). Following a 200-trial baseline period, a ±30° VMR was introduced (balanced across subjects). After 150 and 300 training trials in a single target direction, we tested subjects’ aftereffects in blocks of trials with no visual feedback (Fig. 1b).
Because rest breaks of at least 1 min preceded these testing blocks, these blocks probed the TS components of adaptation (decay time constant ~16.5 s for labile VMR adaptation; Hadjiosif & Smith 2013).

Consistent with Taylor et al. 2014, rapid strategic adjustments allowed mean performance to quickly achieve near-zero error during training (Fig. 2a). Mean implicit learning increased more gradually, however, and after its quick initial rise, mean strategic learning gradually decreased in a complementary fashion. Despite a match between the mean adaptation patterns we observed and those previously reported, we discovered a remarkable inter-individual diversity in the level of strategy employed. This diversity in strategy had little effect on performance because strategy and implicit learning were largely complementary and thus strongly anti-correlated across subjects (r=-0.91), so as to maintain a nearly fixed level of combined (strategy+implicit) learning, as shown in Fig. 3j. Inter-individual variability in strategy (s.d.=9.6°) thus dwarfed the variability in combined learning (s.d.=1.5°), as shown in Fig. 2b-c. At asymptote, about 50% of subjects displayed < 5° of aiming strategy, whereas 20% displayed > 20°. To explore this considerable diversity, we divided individuals into low, mid, and high-strategy subgroups (Fig. 2d) based on their asymptotic strategy levels.

We then compared the aftereffects of strategic and implicit learning, after 1-minute rest breaks that would remove TL adaptation, for the low, mid, and high-strategy subgroups. Strategy displayed TS aftereffects that closely matched the overall (TS+TL) level of strategy during late training (Fig. 3a,b), indicating that strategic learning is TS over 1-5 minutes (the onset and offset latencies for the testing blocks) with little contribution from a TL component (Fig. 3c). In contrast, TS implicit learning was largely dissociated from the overall (TS+TL) implicit learning observed during the training period across subjects (Fig. 3d,e; bar graphs in Fig. 3k,l), indicating that implicit learning has distinct TS and TL components. Critically, TS implicit learning in high-strategy subjects was consistently larger than their overall (TL+TS) implicit learning measured during training (p < 0.01, Fig. 3e), revealing a surprisingly negative TL component (p < 0.01, Fig. 3f).

But why would TL learning be consistently negative in high-strategy subjects in the presence of a perturbation that should drive positive learning? We suggest that this could be the case if the TL component were reacting to overactive TS learning. Indeed, for high-strategy subjects, we found that combined (strategy+implicit) learning during the testing blocks overshot the ideal level (p = 0.01, Fig. 3g). Correspondingly, TS combined learning was greater than overall (TL+TS) combined (strategy+implicit) learning during late training (p < 0.01, Fig. 3h) due to negative TL combined learning (p < 0.01, Fig. 3i) arising from the negative TL implicit learning observed in Fig. 3f. These findings suggest that negative TL implicit learning acts to compensate for overactive TS combined learning to prevent overcompensatory performance errors. This would, however, require TL implicit learning to respond to performance errors, and would result in a strong relationship between TL implicit learning and the highly diverse strategy levels that underlie the differences in TS combined (strategy+implicit) learning shown in Fig. 3g. Correspondingly, we find clear evidence for an exceptionally strong relationship between TL implicit learning and strategy level (Fig. 3k), despite little relationship between TS implicit learning and strategy (Fig. 3l), suggesting that the latter may be driven by sensory-prediction error.

In summary, dissection of learning into temporally-stable and temporally-labile components revealed that labile implicit learning (Fig. 3f, k) was crucial for maintaining task performance by bolstering stable learning in low-strategy individuals and counteracting it in high-strategy individuals. Of note, Mazzoni & Krakauer 2006 and Taylor & Ivry 2010 likely missed these effects because their 8-target paradigm ensured large intertrial intervals, leading to a decay of labile learning that would allow overactive stable learning to proceed unchecked.
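
The two decompositions described in the text can be illustrated with a minimal Python sketch. This is not the authors' analysis code. The function names, the sign convention (angles in degrees, positive in the direction that counteracts the rotation), the example values, and the single-exponential decay model are assumptions made for illustration; treating the ~16.5 s figure as an exponential time constant is likewise an assumption. Only the definitions come from the text: implicit learning is hand direction relative to the aiming marker, combined learning is strategy plus implicit, and labile learning is overall learning minus the stable part that survives the rest break.

```python
import math


def decompose_trial(hand_dir_deg, aim_dir_deg, target_dir_deg=0.0):
    """Split one trial's combined learning into strategy and implicit parts.

    strategy = aiming marker direction relative to the target
    implicit = hand direction relative to the aiming marker
    combined = hand direction relative to the target (strategy + implicit)
    """
    strategy = aim_dir_deg - target_dir_deg
    implicit = hand_dir_deg - aim_dir_deg
    combined = hand_dir_deg - target_dir_deg
    return strategy, implicit, combined


def stable_labile(overall_deg, aftereffect_deg):
    """Split overall learning into stable (TS) and labile (TL) parts.

    Assumes the aftereffect measured after the rest break equals the
    stable part, so labile = overall - stable.
    """
    stable = aftereffect_deg
    labile = overall_deg - stable
    return stable, labile


def labile_fraction_remaining(rest_s, tau_s=16.5):
    """Fraction of labile learning left after a rest of rest_s seconds,
    under an ASSUMED single-exponential decay with time constant tau_s."""
    return math.exp(-rest_s / tau_s)


if __name__ == "__main__":
    # Hypothetical trial: aim 20 deg from the target, hand 28 deg from the target.
    strategy, implicit, combined = decompose_trial(28.0, 20.0)
    print(strategy, implicit, combined)

    # Hypothetical high-strategy subject: 8 deg of implicit learning during
    # training, 14 deg of implicit aftereffect after the rest break.
    stable, labile = stable_labile(8.0, 14.0)
    print(stable, labile)  # the labile part comes out negative

    print(labile_fraction_remaining(60.0))
```

With these made-up numbers, a stable aftereffect that exceeds the overall level measured during training yields a negative labile component, which is the pattern reported for the high-strategy subgroup. Under the assumed exponential model, less than 3% of labile learning would remain after a 60 s rest, which is consistent with treating the post-break testing blocks as a probe of stable learning.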
Fig. 1: Experimental paradigm. (a) Prior to each movement, subjects indicated their strategy by positioning the aiming marker used to guide their aiming. Hand direction relative to the aiming marker defines implicit learning. (b) Subjects experienced a baseline period with no rotation, and training periods in which the cursor movement was rotated by ±30°. Gray regions (NVF test) indicate blocks of trials with no visual feedback, in which we tested for subjects’ aftereffects from training after a rest break.

Fig. 2: (a, b) Mean learning curves for implicit learning (red), strategic learning (blue), and overall (combined) learning (green). Error bars in (b) indicate standard deviations across subjects. (c) Distributions of implicit learning, strategy, and overall learning at asymptote, also represented by Gaussian fits. (d) We separated subjects into 3 subgroups based on the level of strategy at asymptote. Light blue represents the low-strategy subgroup mean; darker blue shades represent the mid and high-strategy subgroups.

Fig. 3: (a, d, g) Comparison of learning in the training direction between overall learning (stable + labile) measured during training vs stable learning measured during the testing blocks. Panels a, d, g show strategic, implicit, and combined (strategy+implicit) learning, respectively, for the 3 strategy groups introduced in Fig. 2d. Error bars show 2 SEM. (b, e, h) Temporally-stable learning measured during testing blocks is plotted against overall learning, which consists of both labile and stable components, measured during asymptotic training. (b) plots temporally-stable strategy, (e) plots temporally-stable implicit, and (h) plots temporally-stable combined (strategy+implicit) learning. Each dot is an individual subject, colored by strategy group. Stars indicate group means. (c, f, i) As in (b, e, h), but plotting temporally-labile learning (measured by temporal decay from training to aftereffects in the testing block) rather than stable learning. (j, k, l) Relationships between (j) overall implicit learning (stable + labile), (k) labile implicit learning, or (l) stable implicit learning vs strategy at asymptote. Red bars at right indicate the size of the labile and stable components for each strategy group. Asterisks indicate significant differences from zero at p<0.01.
[Figure graphics not reproduced; the recoverable titles and labels are listed below.]
Fig. 1 panel titles: "Strategy vs implicit decomposition" (Strategy + Implicit = Combined learning; labels: start location, aiming marker, cursor, target, hand direction) and "Perturbation schedule" (cursor rotation in deg vs trial, with NVF test blocks).
Fig. 2: Strategy learning shows large inter-individual diversity but combined learning does not. Panel titles: "Mean learning", "Spread of learning curves (s.d.)", "Asymptotic learning distributions", "Low / mid / high strategy subgroups". Axes: learning (deg) vs trial. n = 48 subjects: low-strategy n = 26, mid-strategy n = 13, high-strategy n = 9.
Fig. 3: Generalization function data reveal overactive temporally stable learning and negative temporally labile learning in high strategy subjects. Panel titles: "Structure of asymptotic learning", "Labile implicit dependence on strategy", "Stable implicit dependence on strategy". Axes: stable or labile learning (deg) vs overall strategy, overall implicit learning, or overall combined learning during training (deg); implicit learning (deg) vs strategy (deg). Panel annotations: "Overshoots ideal level", "Stable greater than overall implicit for HS group", "Stable greater than overall learning for HS group", "Negative labile learning for HS group".