NeuroImage xxx (2011) xxx–xxx
Dynamic decoding of ongoing perception
Marcel A.J. van Gerven a,⁎, Peter Kok a, Floris P. de Lange a, Tom Heskes b
a Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Nijmegen, The Netherlands
b Institute for Computing and Information Sciences, Radboud University Nijmegen, Nijmegen, The Netherlands
Article history: Received 1 December 2010; Revised 3 May 2011; Accepted 6 May 2011; Available online xxxx.
Keywords: Decoding; Visual cortex; Time-series analysis; Linear dynamical system; Hidden Markov model
Abstract

Decoding of perceptual and mental states using multivariate analysis methods has received much recent attention. It relies on selective responses to experimental conditions in single trials, aggregated across voxels. In this study, we show that decoding is also possible when the state of interest changes continuously over time. We show that both the orientation and the rotation direction of a continuously rotating grating can be decoded with high accuracy using linear dynamical systems and hidden Markov models. These findings extend previous decoding results for static gratings and are of importance for the decoding of ongoing changes in mental state.

© 2011 Published by Elsevier Inc.
Introduction
Various studies have shown that the (subjective) contents of perception can be decoded from functional magnetic resonance imaging (fMRI) data using multivariate analysis methods (Norman et al., 2006; Haynes and Rees, 2006). For example, groundbreaking work by Kamitani and Tong (2005) has shown that the orientation of a static grating can be decoded with high accuracy. The possibility of decoding stimulus orientation from the BOLD response has been hypothesized to rely either on biased sampling of cortical orientation columns (Kamitani and Tong, 2005; Swisher et al., 2010; Kriegeskorte et al., 2010) or on patterns of spatial organization at a coarser scale, e.g., due to influences of macroscopic blood vessels or globally unequal representations (Op de Beeck, 2010). Recent results suggest that both fine and coarse patterns of spatial organization might be responsible for successful decoding of stimulus orientation (Shmuel et al., 2010). The possibility of probing internal states from distributed patterns of brain activity offers a new window into the brain and has led to novel insights about neural representations involved in various mental processes (Haxby et al., 2001; Kamitani and Tong, 2005; Haynes and Rees, 2005b; Thirion et al., 2006; Formisano et al., 2008; Miyawaki et al., 2008; Kay et al., 2008; Naselaris et al., 2009; Haynes, 2009;
⁎ Corresponding author at: Donders Institute for Brain, Cognition and Behaviour, Centre for Cognition, Cognitive Artificial Intelligence Department, Radboud University Nijmegen, Montessorilaan 3, 6500 HE, Nijmegen, The Netherlands. Fax: +31 24 3616066. E-mail addresses: [email protected], [email protected] (M.A.J. van Gerven).
Hassabis et al., 2009; Chadwick et al., 2010; Reddy et al., 2010). However, to date, most decoding studies have focused on static decoding of single events in individual trials, even though mental state is an ongoing process which evolves over time. The ability to decode continuously changing stimuli over time from distributed patterns of brain activity would truly enable tracking of how the internal state evolves as it is influenced by endogenous and exogenous factors. Dynamic decoding has many applications, such as tracking the temporal dynamics of bistable perception (Haynes and Rees, 2005b; Brouwer and van Ee, 2007), decoding the memory trace (Hassabis et al., 2009; Chadwick et al., 2010; Fuentemilla et al., 2010) or predicting certain aspects of ongoing (subjective) perception (Miyawaki et al., 2008; Kay et al., 2008; Naselaris et al., 2009). This has been recognized by Haynes and Rees (2005b), who have shown that spontaneous changes in conscious experience during binocular rivalry can be decoded using linear discriminant analysis. In this paper, we demonstrate that time-series analysis methods are well suited for the dynamic decoding of ongoing mental processes. In contrast to Haynes and Rees, we use a generative method that is able to track these ongoing changes through time by modeling the autocorrelations in the signal of interest, thereby achieving better decoding performance. Furthermore, we focus not only on discrete changes in internal state, as is the case with spontaneous changes in bistable perception, but also on continuous changes. To this end, we presented three subjects with a grating whose orientation changed continuously due to variations in rotation direction and speed. We decoded both the rotation angle and the direction of the grating, thereby extending previous work on static decoding of perceived stimulus orientation (Kamitani and Tong, 2005) and rotational motion (Kamitani and Tong, 2006).
Please cite this article as: van Gerven, M.A.J., et al., Dynamic decoding of ongoing perception, NeuroImage (2011), doi:10.1016/j.neuroimage.2011.05.020
Material and methods

Subjects

Three healthy adults with normal or corrected-to-normal vision participated in this study. The study was approved by the local ethics committee and all subjects (authors MG, PK and FL) gave written informed consent.

Stimuli and experimental design

Visual stimuli were generated using MATLAB in conjunction with the Psychophysics Toolbox (Brainard, 1997) and displayed on a rear-projection screen using an LCD projector (EIKI, Rancho Santa Margarita, CA, USA). The experiment consisted of ten 180 s long runs, each preceded by a 15 s long rest period. In each run, a rotating sine-wave annular grating (∼100% contrast, 1.5 cycles per degree, 5° eccentricity, inner annulus 3° in diameter) was presented, as shown in Fig. 1. The grating had a different spatial phase in each run. Within a run, a variable angular velocity of ten to twenty degrees per second was used, with an initial angular velocity of fifteen degrees per second. Angular velocity was modulated by a sine wave that consisted of 3, 5, or 7 cycles per run. In each run, rotation direction changed at constant intervals, amounting to 3, 5, or 7 direction changes per run. The subject viewed the stimulus while maintaining central fixation and counting the number of changes in rotation direction throughout the experiment. The discrete changes in direction and continuous changes in angular velocity make the decoding of rotation angle a non-trivial problem.

In a separate session, a standard retinotopic mapping experiment was conducted to delineate visual areas using a rotating wedge stimulus (Sereno et al., 1995). Subjects viewed a wedge consisting of a flashing checkerboard pattern (3 Hz), first rotating clockwise for 9 cycles and then anticlockwise for another 9 cycles (at a rotation speed of 23.4 s/cycle). In a third session, a functional localization experiment was used to localize area MT+. In order to localize voxels sensitive to linear or rotational motion, the subjects viewed the grating in different conditions: static non-drifting, static drifting, and rotating non-drifting. Each run consisted of a rest block followed by blocks defined by the different conditions. Each block was 15 s long and each run was repeated ten times.

MRI acquisition

Functional images were acquired with a Siemens 3T MRI system using a 32-channel coil for signal reception. For the main experiment, 1319 blood oxygenation level dependent (BOLD) sensitive functional images were acquired using a single-shot gradient EPI sequence, with a repetition time (TR) of 1500 ms, an echo time (TE) of 30 ms and an isotropic voxel size of 2 × 2 × 2 mm, acquired in 26 axial slices in ascending order using a (GRAPPA) acceleration factor of 3. Dummy volumes were acquired at the beginning of each scan in order to account for T1 equilibration effects. A high-resolution anatomical image was acquired using an MP-RAGE sequence (TE/TR = 3.39/2250 ms; 176 sagittal slices, isotropic voxel size of 1 × 1 × 1 mm).

Functional MRI data preprocessing

Functional data were preprocessed within the framework of SPM8 (Statistical Parametric Mapping, http://www.fil.ion.ucl.ac.uk/spm). Dummy scans were removed before preprocessing. Functional brain volumes were realigned to the mean image in order to correct for motion, and the anatomical image was coregistered with the mean of the functional images. Data were high-pass filtered with a filter cutoff of 128 s, represented as percentage signal change and finally centered and scaled to have mean zero and unit variance.

In the present analysis we restricted ourselves to regions of interest in visual cortex. We performed retinotopic mapping to identify the boundaries of retinotopic areas in early visual cortex using well-established methods (Sereno et al., 1995; DeYoe et al., 1996; Engel et al., 1997). Freesurfer (http://surfer.nmr.mgh.harvard.edu) was used to generate inflated representations of the cortical surface from each subject's T1-weighted structural image and to analyze the functional data of the retinotopic mapping session. Fourier-based methods were used to obtain polar angle maps of the cortical surface, on the basis of which the borders of visual areas (dorsal and ventral V1, V2 and V3 in both hemispheres) could be defined for each subject (Sereno et al., 1995). Area MT+ was localized by taking the most significant clusters obtained from a GLM contrast (static drifting + rotating − static conditions) while restricting the clusters to fall within area OC5 of the SPM Anatomy toolbox. This area was delineated using a liberal probability threshold greater than zero in order to ensure the inclusion of motion-sensitive voxels for the decoding analysis. The area was warped into native space by applying the inverse normalization parameters to the mask.
Decoding analysis
In order to dynamically decode both grating orientation and rotation direction from the continuously rotating grating, we made use of time-series analysis methods. Assume that the sequence of states of interest x = (x1, …, xT) forms a Markov chain and is only indirectly observable via measurements y = (y1, …, yT). The joint density for this generative model is then given by
Fig. 1. Subjects were shown a continuously rotating grating (10 degrees visual angle, rotating at 10–20 degrees/s) whose rotation angle was unpredictable due to pseudo-random changes in both angular velocity and rotation direction. Subjects had to maintain fixation and count the number of times a rotation direction change occurred.
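To make the stimulus design concrete, the angle trajectory described in the Methods can be sketched as follows. This is a reconstruction under stated assumptions, not the authors' MATLAB code: the sine phase of the velocity modulation, the sampling rate and the exact placement of direction changes are assumptions for illustration.

```python
import numpy as np

def grating_angle(duration=180.0, dt=1/60, n_vel_cycles=5, n_dir_changes=5):
    """Generate a rotation-angle trajectory for one run (illustrative values)."""
    t = np.arange(0.0, duration, dt)
    # Angular speed oscillates between 10 and 20 deg/s (15 +/- 5 deg/s),
    # modulated by a sine wave with n_vel_cycles cycles per run.
    speed = 15.0 + 5.0 * np.sin(2 * np.pi * n_vel_cycles * t / duration)
    # Rotation direction flips at constant intervals, giving
    # n_dir_changes direction changes per run.
    direction = np.where((t // (duration / (n_dir_changes + 1))) % 2 == 0, 1.0, -1.0)
    # Integrate angular velocity to obtain the angle at each time step.
    angle = np.cumsum(direction * speed * dt)
    return t, angle, direction
```

Because speed and direction are varied independently, the angle at any future time cannot be anticipated from the current angle alone, which is what makes the decoding problem non-trivial.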
$$p(\mathbf{x}, \mathbf{y}) = p(x_1) \prod_{t=2}^{T} p(x_t \mid x_{t-1}) \prod_{t=1}^{T} p(y_t \mid x_t) \qquad (1)$$
where the dependence on the model parameters θ is left implicit. Ultimately, we are interested in computing the most probable sequence of states x⁎ = arg maxxp(x|y) for new observations y. This is achieved by estimating the model parameters θ from data acquired over multiple runs and inferring x⁎ from measurements made in independent runs. Since the state is observed during training we can immediately estimate parameters from data instead of needing to estimate them using an expectation-maximization algorithm (Dempster et al., 1977).
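Since the states are observed during training, the maximum-likelihood parameters of the linear-Gaussian model reduce to closed-form least-squares estimates. The sketch below illustrates this for the LDS case; the function name and the choice to leave the initial covariance at identity (it is not identifiable from a single starting state per run) are assumptions, not details from the paper.

```python
import numpy as np

def fit_lds(x, y):
    """Closed-form ML-style estimates of LDS parameters from fully observed
    states x (T x d) and measurements y (T x p) -- an illustrative sketch."""
    T = x.shape[0]
    # Transition model x_t ~ N(A x_{t-1}, Q), fitted by least squares.
    A_t, _, _, _ = np.linalg.lstsq(x[:-1], x[1:], rcond=None)
    A = A_t.T
    res = x[1:] - x[:-1] @ A.T
    Q = res.T @ res / (T - 1)
    # Observation model y_t ~ N(C x_t, R).
    C_t, _, _, _ = np.linalg.lstsq(x, y, rcond=None)
    C = C_t.T
    res_y = y - x @ C.T
    R = res_y.T @ res_y / T
    # Initial state density; Sigma kept at identity as a placeholder.
    mu, Sigma = x[0], np.eye(x.shape[1])
    return A, Q, C, R, mu, Sigma
```

With unobserved states, as the authors note, these regressions would have to be replaced by expectation-maximization updates.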
Decoding of grating orientation and rotation direction requires inference about a continuous mental state (orientation) or a discrete mental state (direction). In the case of a continuous mental state, we assume that both the state and measurement variables are Gaussian, with the mean being linearly dependent on the conditioning variables. That is, we have

$$p(x_1) = \mathcal{N}(x_1; \mu, \Sigma)$$
$$p(x_t \mid x_{t-1}) = \mathcal{N}(x_t; A x_{t-1}, Q)$$
$$p(y_t \mid x_t) = \mathcal{N}(y_t; C x_t, R)$$
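For the Gaussian model above, the smoothed state estimate E[x_t | y_1..T] is given by the Rauch–Tung–Striebel (Kalman) smoother. The following is a minimal sketch of that standard algorithm, not the authors' implementation; for clarity it returns only the smoothed means and inverts the innovation covariance directly rather than using a numerically safer solve.

```python
import numpy as np

def kalman_smoother(y, A, Q, C, R, mu, Sigma):
    """Rauch-Tung-Striebel smoother: smoothed means E[x_t | y_1..T] (sketch)."""
    T, d = y.shape[0], mu.shape[0]
    m_f = np.zeros((T, d)); P_f = np.zeros((T, d, d))  # filtered moments
    m_p = np.zeros((T, d)); P_p = np.zeros((T, d, d))  # predicted moments
    m, P = mu, Sigma
    for t in range(T):                                 # forward (filter) pass
        if t > 0:
            m, P = A @ m_f[t - 1], A @ P_f[t - 1] @ A.T + Q
        m_p[t], P_p[t] = m, P
        S = C @ P @ C.T + R                            # innovation covariance
        K = P @ C.T @ np.linalg.inv(S)                 # Kalman gain
        m_f[t] = m + K @ (y[t] - C @ m)
        P_f[t] = P - K @ C @ P
    m_s = m_f.copy()
    for t in range(T - 2, -1, -1):                     # backward (smoothing) pass
        J = P_f[t] @ A.T @ np.linalg.inv(P_p[t + 1])
        m_s[t] = m_f[t] + J @ (m_s[t + 1] - m_p[t + 1])
    return m_s
```

The backward pass is what distinguishes the smoother from the filter: every estimate is conditioned on the whole run, matching the offline decoding setting of this study.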
with parameters θ = {μ, Σ, A, Q, C, R}. This model is known as a linear dynamical system (LDS) (Roweis and Ghahramani, 1999), where decoding of x is realized using the Kalman smoother. If, in contrast, the underlying mental state is discrete, such as in the case of rotation direction or spontaneous changes during bistable perception, we obtain the hidden Markov model (HMM) (Roweis and Ghahramani, 1999), where the state variables are multinomial random variables and x⁎ is computed by the Viterbi decoder. For binomial state variables with x_t ∈ {0, 1}, the distributions are defined by

$$p(x_1 = 1) = m$$
$$p(x_t = 1 \mid x_{t-1}) = a_0^{(1 - x_{t-1})} \, a_1^{x_{t-1}}$$
$$p(y_t \mid x_t) = \mathcal{N}(y_t; \mu_0, R_0)^{(1 - x_t)} \, \mathcal{N}(y_t; \mu_1, R_1)^{x_t}$$
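For this two-state HMM with Gaussian emissions, the most probable state sequence is found with the standard Viterbi algorithm. The sketch below is an illustrative log-domain implementation of that textbook algorithm under the parameterization above (a0 and a1 are the probabilities of being in state 1 given the previous state 0 or 1); it is not the authors' MATLAB code.

```python
import numpy as np

def viterbi_binary(y, m, a0, a1, mu, R):
    """Most probable state sequence for the two-state Gaussian HMM (sketch).
    mu, R are lists of per-state emission means and covariances."""
    T = y.shape[0]

    def log_gauss(v, mean, cov):
        d = mean.shape[0]
        diff = v - mean
        _, logdet = np.linalg.slogdet(cov)
        return -0.5 * (d * np.log(2 * np.pi) + logdet
                       + diff @ np.linalg.solve(cov, diff))

    # Log transition matrix: rows index the previous state, columns the next.
    logT = np.log(np.array([[1 - a0, a0], [1 - a1, a1]]))
    delta = np.array([np.log(1 - m), np.log(m)]) \
        + np.array([log_gauss(y[0], mu[k], R[k]) for k in (0, 1)])
    back = np.zeros((T, 2), dtype=int)
    for t in range(1, T):
        emit = np.array([log_gauss(y[t], mu[k], R[k]) for k in (0, 1)])
        scores = delta[:, None] + logT            # (previous, next) scores
        back[t] = np.argmax(scores, axis=0)
        delta = scores[back[t], [0, 1]] + emit
    x = np.zeros(T, dtype=int)                    # backtrace
    x[-1] = int(np.argmax(delta))
    for t in range(T - 1, 0, -1):
        x[t - 1] = back[t, x[t]]
    return x
```

Working in the log domain avoids underflow for the long measurement sequences produced by a 180 s run.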
with parameters θ = {m, a_k, μ_k, R_k} with k ∈ {0, 1}. All algorithms were implemented in MATLAB 7 (Mathworks, Natick, MA, USA). For the HMM, a small diagonal term of 10⁻³I was added to the covariance matrices R_k in order to avoid numerical stability issues. In order to take the hemodynamic lag into account, for both the LDS and the HMM, the measurement y_t associated with a state x_t at a certain point in time was given by the BOLD response within the volume acquired at a six-second delay relative to the stimulus at time t.

In order to estimate the angle α_t, we used a linear dynamical system where the state vector is given by x_t = (sin(α_t), cos(α_t))ᵀ. This transformation is required since angle itself is a periodic variable, which should be decoded using techniques from circular statistics (Fisher, 1996; Mardia and Jupp, 1999). The sine–cosine transformation allowed us to use standard time-series analysis methods. Furthermore, since a grating is mirror symmetric, a full rotation is achieved after a 180° change in angle, such that the average absolute deviation between the real angle and a random angle will be 45°. Decoding performance was estimated by computing the mean absolute deviation (MAD) between the real and predicted angle of the rotating grating. In order to estimate rotation direction, we used a hidden Markov model where the state x_t ∈ {0, 1} denotes clockwise or counter-clockwise rotation. Decoding performance was estimated by computing the classification error, that is, the proportion of volumes that were assigned to the incorrect rotation direction.

In order to improve the efficiency of inference and to prevent overfitting, we restricted the number of voxels that act as observations. We achieved this by ranking individual voxels in terms of the correlation between their BOLD timecourse and the timecourse(s) of interest (sine and cosine of the angle, or rotation direction).
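The sine–cosine encoding and the MAD criterion described above can be sketched as follows. This is a plausible reading of the evaluation, not the authors' code: it assumes angles are expressed in degrees and that deviations are taken on a 180° circle to respect the grating's mirror symmetry, which reproduces the stated 45° chance level.

```python
import numpy as np

def circular_mad(alpha_true, x_pred):
    """Mean absolute deviation between true grating angles (degrees) and
    angles recovered from predicted (sin, cos) state vectors, respecting
    the 180-degree mirror symmetry of the grating (sketch)."""
    # Recover the angle from the predicted (sin, cos) state estimate.
    alpha_hat = np.degrees(np.arctan2(x_pred[:, 0], x_pred[:, 1]))
    # Absolute angular deviation on a 180-degree circle: a 170-degree
    # difference counts as 10 degrees because the grating repeats.
    d = np.abs(alpha_hat - alpha_true) % 180.0
    d = np.minimum(d, 180.0 - d)
    return d.mean()
```

Note that the predicted (sin, cos) vector need not lie on the unit circle; arctan2 projects it back onto an angle regardless of its norm.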
In order to prevent circularity issues (Kriegeskorte et al., 2009), we used an outer and inner cross-validation approach. In the outer cross-validation, individual runs (test data) were decoded by means of probabilistic inference using a model that was estimated from the data in the nine remaining runs (training data). In the inner cross-validation, eight out of nine runs in the training data were used to rank voxels according to their correlation with the signal timecourse(s) of interest. This ranking was used to identify the optimal number of voxels according to decoding performance on the remaining run in the training data. The identified voxels were then used to train the model on all training data, and this model was subsequently used in the outer cross-validation to produce decoding results for the test data. Final decoding performance was estimated by computing the average decoding performance over each of the ten runs. See Fig. 2 for a description of the decoding analysis.

In order to examine the differences in decoding performance between regions of interest, we also analyzed decoding performance per region. In this case, in order to be able to compare decoding performances, a fixed number of voxels was used as the basis for decoding.
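The nested cross-validation scheme can be summarized as pseudo-structure in code. The helper functions `rank_voxels`, `fit` and `score` are placeholders for the correlation-based ranking, model estimation and MAD/classification-error computation; their signatures are assumptions made for this sketch.

```python
import numpy as np

def nested_cv(runs, voxel_counts, rank_voxels, fit, score):
    """Outer/inner cross-validation over runs, as described above (sketch).
    `runs` is a list of per-run datasets; helper functions are assumed."""
    outer_scores = []
    for i, test_run in enumerate(runs):
        train_runs = runs[:i] + runs[i + 1:]
        # Inner CV: the number of voxels is chosen on held-out *training*
        # runs, never on the test run, which avoids circularity.
        inner = np.zeros(len(voxel_counts))
        for j, val_run in enumerate(train_runs):
            fit_runs = train_runs[:j] + train_runs[j + 1:]
            ranking = rank_voxels(fit_runs)
            for k, n in enumerate(voxel_counts):
                model = fit(fit_runs, ranking[:n])
                inner[k] += score(model, val_run, ranking[:n])
        best_n = voxel_counts[int(np.argmin(inner))]
        # Re-rank and refit on all training runs, then score the test run once.
        ranking = rank_voxels(train_runs)
        model = fit(train_runs, ranking[:best_n])
        outer_scores.append(score(model, test_run, ranking[:best_n]))
    return float(np.mean(outer_scores))
```

The key design choice is that both voxel ranking and voxel-count selection happen strictly inside the training partition of each outer fold.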
Results
Decoding of grating orientation
Fig. 3 depicts the orientation decoding results for all three subjects using the outer and inner cross-validation approach described above. Panel 3.A shows the real and predicted cumulative angles (i.e., predicted orientations while taking into account full rotations) for the three subjects across all runs, as compared with a random prediction that is obtained by ignoring the observed BOLD responses. The mean absolute deviations between the angle of the real and predicted grating orientation were 14.56° ± 0.80° SEM, 14.52° ± 0.56° SEM and 15.61° ± 0.71° SEM for subjects MG, PK and FL respectively, which is well below the chance level of 45°. All three subjects showed a weak positive correlation between the absolute deviation and the angular velocity (r = 0.28, r = 0.29 and r = 0.22), which implies that the accuracy of the estimates decreases as the rotation speed of the grating increases. In order to examine how MAD changes as a function of the number of used voxels, we used the inner cross-validation results. That is, the corresponding estimates obtained per fold were averaged to produce the curves of interest for the three subjects, as shown in Panel 3.B. The dashed vertical lines depict the optimal number of voxels for individual runs in the inner cross-validation. A minimal MAD was reached at about 200 included voxels. Finally, Panel 3.C shows the voxels which were used for decoding the angle. The contours delineate dorsal and ventral visual areas V1, V2 and V3 as well as area MT+. The heat maps denote how often each voxel was used for decoding, averaged over cross-validation folds. Selected voxels extend from V1 up to and beyond V3, with dorsal areas being more prevalent than ventral areas, which could be associated with the performance asymmetry between the upper and lower visual field (Levine and McAnany, 2005) or with differences in susceptibility artifacts between dorsal and ventral areas (Winawer et al., 2010).

Fig. 2. Overview of the decoding analysis. BOLD data is realigned, high-pass filtered, standardized and corrected for the hemodynamic lag by aligning the peak of the BOLD response with the moment of stimulation. Both the BOLD data and the signal time courses (orientation and rotation direction) serve as input to a correlation-based feature ranking which ranks the voxels according to their usefulness for decoding. Using an inner cross-validation which iterates over nine out of ten runs, the optimal number of voxels is computed by minimizing the MAD/classification error using the LDS/HMM. After this model selection phase, the parameters of the LDS/HMM are re-estimated based on all nine runs. Decoding performance is subsequently computed using the test data (the remaining run). This procedure is repeated for each of the ten runs and the reported decoding performance is the average over runs.

Fig. 3. Orientation decoding results for the three subjects. Panel A shows the angle of the real, predicted and random orientation accumulated across runs as a function of measured volume. Random decoding results were obtained by ignoring the observed responses and are added for comparison. Upward directions denote clockwise rotation and downward directions denote anti-clockwise rotation. Panel B shows the mean absolute deviation as a function of the number of used voxels, where error bars denote standard error of the mean. Dashed vertical lines denote the number of voxels that are selected per cross-validation fold. Panel C shows the correlations between the signal of interest and those voxels which have been selected in the inner cross-validation to produce the final decoding results.
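The correlation-based feature ranking that selects these voxels can be sketched as follows. The ranking criterion (maximum absolute correlation across the signal timecourses) is an assumption consistent with, but not spelled out by, the text.

```python
import numpy as np

def rank_voxels_by_correlation(bold, signals):
    """Rank voxels by the maximum absolute correlation between their BOLD
    timecourse (T x V) and the signal timecourses of interest (T x S);
    an illustrative sketch of the feature ranking described above."""
    # Standardize both sides so the inner product becomes a correlation.
    b = (bold - bold.mean(0)) / bold.std(0)
    s = (signals - signals.mean(0)) / signals.std(0)
    corr = b.T @ s / bold.shape[0]           # V x S correlation matrix
    strength = np.abs(corr).max(axis=1)      # best correlation per voxel
    return np.argsort(strength)[::-1]        # voxel indices, best first
```

Truncating this ranking at the voxel count chosen by the inner cross-validation yields the observation set for the LDS or HMM.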
Decoding of rotation direction
Fig. 4 depicts the direction decoding results for all three subjects using the inner and outer cross-validation approach described above. Panel 4.A shows the real and predicted rotation directions for the three subjects in the first five runs. Classification errors were 0.37, 0.34 and 0.41 for subjects MG, PK and FL respectively. Subjects showed no or only a very weak positive correlation between the classification error and the angular velocity (r = 0.05, r = 0.12 and r = −0.01). In order to examine how classification error changes as a function of the number of used voxels, we used the inner cross-validation results. Panel 4.B depicts the curves of interest for the three subjects. The dashed vertical lines depict the optimal number of voxels for individual runs in the inner cross-validation. A minimal error was reached at about ten included voxels. Panel 4.C indeed shows that a small number of voxels is selected, mostly outside of early visual areas or area MT+. In general, rotation direction seems much harder to decode than grating orientation. The variability of the number of voxels selected in the inner cross-validation (cf. Panels 3.B and 4.B) also indicates that decoding of direction is less robust than decoding of orientation.
Region of interest analysis
In order to identify how visual areas independently contribute to the decoding of grating orientation and rotation direction, we used the functional localizers that were acquired in independent sessions. In order to allow comparison of the contributions of each visual area, we used the fifty voxels whose BOLD response correlated most strongly with the signal(s) of interest. Fig. 5 shows the decoding results on the flattened cortical surface for each subject. For grating orientation, decoding performance, averaged over dorsal and ventral as well as left and right hemisphere regions for all subjects, decreased when moving downstream from V1 to V3 (MADs of 30° ± 3° SEM, 31° ± 3° SEM and 34° ± 4° SEM for areas V1, V2 and V3, respectively). This matches the decoding results of Kamitani and Tong (2005) and is consistent with the poorer orientation selectivity in higher visual areas, as shown in animal studies (Vanduffel et al., 2002). Area MT+ did not allow decoding of grating orientation (MAD of 43° ± 1° SEM), which is also consistent with previous findings (Kamitani and Tong, 2005). Dorsal areas (MADs of 30° ± 2° SEM, 29° ± 2° SEM and 32° ± 2° SEM for areas V1, V2 and V3, respectively) performed somewhat better than ventral areas (MADs of 30° ± 4° SEM, 32° ± 4° SEM and 35° ± 4° SEM for areas V1, V2 and V3, respectively). For rotation direction, decoding performance per region of interest was quite low. Still, we could observe decreases in performance in the downstream direction when dorsal and ventral as well as left and right hemisphere regions were pooled (classification errors of 0.37, 0.42 and 0.43 for areas V1, V2 and V3, respectively). Area MT+ now also became involved, with an average classification error of 0.41. We also examined whether individual regions showed a particular bias for certain grating orientations. Fig. 6 shows that there is some variability in the decoding of certain orientations in visual area V1, both within and between subjects. However, we have found no clear correspondences with previously reported orientation anisotropies in early visual cortex (Furmanski and Engel, 2000; Mannion et al., 2010).

Fig. 4. Predicted rotation directions for the three subjects. Panel A shows the decoding result for the first five runs. Panel B shows the classification error as a function of the number of used voxels, where error bars denote standard error of the mean. Dashed vertical lines denote the number of voxels that are selected per cross-validation fold. Panel C shows the correlations between the signal of interest and those voxels which have been selected in the inner cross-validation to produce the final decoding results.

Fig. 5. Decoding results separately computed for visual areas V1, V2, V3 and MT+ for the three subjects. Panel A shows angle decoding results, where color codes for decoding performance (45° − MAD). Panel B shows direction decoding results, where color codes for classification accuracy (1 − classification error). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 6. Polar plots for the angle deviation as a function of orientation, where orientations are binned into 30° windows with 20° overlap. Results are shown for all three subjects in ventral and dorsal regions of primary visual area V1 for both the left and right hemisphere.
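The orientation-binned deviation analysis of Fig. 6 (30° windows with 20° overlap, i.e., bin centers every 10°) can be sketched as follows; the exact bin placement is an assumption consistent with the caption.

```python
import numpy as np

def binned_deviation(angles, deviations, width=30.0, step=10.0):
    """Mean absolute deviation per orientation bin, using 30-degree windows
    with 20-degree overlap (bin centers every 10 degrees on the 180-degree
    orientation circle) -- a sketch of the Fig. 6 analysis."""
    centers = np.arange(0.0, 180.0, step)
    means = []
    for c in centers:
        # Distance of each orientation to the bin center on a 180-degree circle.
        d = np.abs((angles - c + 90.0) % 180.0 - 90.0)
        in_bin = d <= width / 2.0
        means.append(deviations[in_bin].mean() if in_bin.any() else np.nan)
    return centers, np.array(means)
```

Because the windows overlap, each orientation contributes to up to three neighboring bins, which smooths the resulting polar profile.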
Static versus dynamic decoding
In order to substantiate our claim that the decoding of ongoing mental state is facilitated by the use of dynamic instead of static decoding methods, we performed a comparative analysis. The static decoder was obtained simply by removing the autocorrelations for the state variable at neighboring timepoints. That is, we forced the matrix A to be zero and the transition probabilities a0 and a1 to be 0.5 during parameter estimation for the LDS and the HMM, respectively. Fig. 7 demonstrates the decoding of mental state using randomly selected voxels in area V1. Panel 7.A shows an example of decoding of the sine and cosine of the rotation angle using a static decoder (blue line) and a dynamic decoder (red line). These results demonstrate that the dynamic decoder effectively performs an adaptive smoothing of the time-series due to the intrinsic autocorrelations. Panel 7.B shows that the dynamic decoder indeed improves on the static
326 327 328 329
335 336 337 338 339 340 341 342 343 344 345 346
A
347 348
Discussion
357
decoder in terms of MAD performance. The effects are even more salient for the decoding of rotation direction. Fig. 7C shows an example of decoding the rotation direction using a static decoder and a dynamic decoder. The static decoder fails completely due to its assumption that neighboring time points are independent. Classification error is also substantially lower for the dynamic decoder (Fig. 7D). Subject PK, who performed best in the previous analyses, now shows the worst decoding performance. This can be explained by the fact that the voxels which allowed decoding before may have been excluded by the random selection of voxels in area V1.

Fig. 7. Comparison between static and dynamic decoding using randomly selected voxels in area V1. Averaged results for twenty repeats using different random selections are shown. Panel A demonstrates the decoding of the sine and cosine components of the rotation angle using a static and a dynamic decoder based on 400 randomly selected voxels. Panel B shows how MAD changes as a function of the number of used voxels. Panel C depicts the decoding of rotation direction using both decoders based on 50 randomly selected voxels. Panel D shows how classification error changes as a function of the number of used voxels. The dynamic decoder outperforms the static decoder in both cases. Error bars in Panels B and D denote standard error of the mean.

Please cite this article as: van Gerven, M.A.J., et al., Dynamic decoding of ongoing perception, NeuroImage (2011), doi:10.1016/j.neuroimage.2011.05.020

In this paper, we have shown that dynamic decoding of ongoing mental state can be achieved using time-series analysis methods. First, it was shown that grating orientation can be decoded continuously by means of a linear dynamical system in which the grating orientation is inferred using a Kalman smoother. Second, it was shown that rotation direction can be decoded using a hidden Markov model in which the rotation direction is inferred using Viterbi decoding. Our comparison between static and dynamic decoding methods shows that the decoding of ongoing internal state from the BOLD response is improved by employing dynamic decoding methods. This paves the way for the decoding of ongoing perception in more ecologically valid settings, such as the viewing of movies (Hasson et al., 2004), as well as the decoding of ongoing mental states that arise from internal processes rather than from external stimulation (Haynes and Rees, 2005a; Haynes, 2009). In the latter case, the state itself cannot be observed, which requires the use of an expectation-maximization algorithm when estimating model parameters. Even though both grating orientation and rotation direction could be decoded from the BOLD response, results indicate that the former is
much easier to decode than the latter. Figs. 3B and 4B show that the decrease in error is both more pronounced and more consistent for grating orientation than for rotation direction. The improved consistency follows from the fact that results are more comparable both between and within subjects. Within subjects, the results for each cross-validation fold are more similar, as evidenced both by the smaller error bars and by the fact that the optimal number of voxels used for decoding is more similar between folds. Improved consistency is also apparent in the voxels selected for decoding, as shown in Figs. 3C and 4C. Clusters of voxels in early visual cortex correlated strongly with grating orientation, whereas the voxels selected to decode rotation direction were more dispersed throughout the measured brain volume.

The fact that orientation was easier to decode than rotational motion is in line with previous findings (Kamitani and Tong, 2005, 2006) and could be due to differences in spatial anisotropies between orientation-selective and direction-selective neurons. Swisher et al. (2010) provide a detailed account of the orientation biases that are present in the BOLD signal at multiple spatial scales.

The region-of-interest analysis for direction decoding, depicted in Fig. 5B, shows that both early visual cortex and area MT+ allowed for the decoding of rotation direction. Here, area MT+ outperformed early visual areas for subjects MG and PK, while the converse was true for subject FL. Note, however, that it remains problematic to directly compare classification accuracy between different areas of the brain. For example, sampling from voxels in area V1, which is estimated to be about three times larger than MT, may in and of itself lead to a higher signal-to-noise ratio at the area level (Serences and Boynton, 2007). Note further that there is a disparity between these results and the earlier decoding analysis in terms of the selected voxels and their associated performance.
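As a point of reference for the linear dynamical system approach used for orientation decoding, the following is a minimal sketch — not the authors' implementation — of Kalman filtering followed by Rauch–Tung–Striebel smoothing. It assumes the model parameters A, C, Q and R have already been estimated (e.g., by regressing voxel time series on the sine and cosine of the grating angle):

```python
import numpy as np

def kalman_smoother(y, A, C, Q, R, mu0, V0):
    """Rauch-Tung-Striebel smoother for the linear dynamical system
    x_t = A x_{t-1} + w_t,  y_t = C x_t + v_t,  w ~ N(0, Q), v ~ N(0, R).

    y : (T, d_obs) observations, e.g., voxel time series.
    Returns the smoothed state means, shape (T, d_state)."""
    T, d = len(y), len(mu0)
    mu_f, V_f = np.zeros((T, d)), np.zeros((T, d, d))  # filtered moments
    mu_p, V_p = np.zeros((T, d)), np.zeros((T, d, d))  # predicted moments
    m, V = mu0, V0
    for t in range(T):
        if t > 0:                                      # predict step
            m, V = A @ m, A @ V @ A.T + Q
        mu_p[t], V_p[t] = m, V
        K = V @ C.T @ np.linalg.inv(C @ V @ C.T + R)   # Kalman gain
        m = m + K @ (y[t] - C @ m)                     # measurement update
        V = V - K @ C @ V
        mu_f[t], V_f[t] = m, V
    mu_s, V_s = mu_f.copy(), V_f.copy()                # backward (RTS) pass
    for t in range(T - 2, -1, -1):
        J = V_f[t] @ A.T @ np.linalg.inv(V_p[t + 1])
        mu_s[t] = mu_f[t] + J @ (mu_s[t + 1] - mu_p[t + 1])
        V_s[t] = V_f[t] + J @ (V_s[t + 1] - V_p[t + 1]) @ J.T
    return mu_s
```

With the hidden state chosen as the sine and cosine of the grating angle, the decoded orientation at each timepoint would be recovered as `np.arctan2(mu_s[t, 0], mu_s[t, 1])`.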
This could indicate that the employed correlation-based feature ranking criterion may be suboptimal for selecting voxels that respond well to rotation direction.

For subject FL, additional information with respect to orientation was present in regions anterior to the retinotopically defined early visual cortical areas. While these areas are likely to be tuned to more complex stimulus properties than orientation, they may still show a response bias that can be picked up by the decoding algorithm. Retinotopic maps are found throughout the dorsal visual stream (Silver and Kastner, 2009), and these maps are known to exhibit unequal sampling of different orientations in particular parts of visual space (Sasaki et al., 2006).

In this paper, we have used linear dynamical systems and hidden Markov models to demonstrate that dynamic decoding of ongoing mental state from BOLD data is feasible. It should be noted, however, that the rate of stimulus change used in this paper was quite slow. A weak positive correlation between decoding error and angular velocity was observed, but it remains an open question how decoding performance is affected by more rapid stimulus changes in the presence of autocorrelations. More sophisticated methods for time-series analysis could further improve decoding performance. One way to improve results is to use more complex generative models that provide a more natural representation of the phenomena we are trying to model. This can be achieved by relaxing the Gaussianity assumptions, by introducing additional hidden variables, or by making other independence assumptions (Barber and Cemgil, 2010). Another way to improve results is to use a more principled approach to voxel selection, for example by means of dimensionality reduction techniques (Schölkopf et al., 1998) or by embedding the selection process directly within the employed inference algorithms, e.g., using automatic relevance determination (Beal, 2003).
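The correlation-based ranking criterion referred to above can be illustrated with a small sketch; the function name and the exact form of the criterion are assumptions, since the implementation is not spelled out here:

```python
import numpy as np

def rank_voxels_by_correlation(bold, target, n_keep):
    """Rank voxels by absolute Pearson correlation with a target series.

    bold   : (T, V) voxel time series
    target : (T,) regressor, e.g., sine or cosine of the grating angle
    Returns indices of the n_keep highest-ranked voxels."""
    b = bold - bold.mean(axis=0)                # center each voxel
    t = target - target.mean()                  # center the regressor
    r = (b * t[:, None]).sum(axis=0) / (
        np.sqrt((b ** 2).sum(axis=0)) * np.sqrt((t ** 2).sum()) + 1e-12)
    return np.argsort(-np.abs(r))[:n_keep]      # strongest first
```

In practice such a ranking would be computed on training folds only, to avoid the circular-analysis problem described by Kriegeskorte et al. (2009).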
Finally, results may be improved by including a forward model of the hemodynamic response (Makni et al., 2008).

Dynamic decoding has several potential benefits compared with static decoding since it gives better insight into the temporal dynamics of ongoing mental state. The possibility to track mental state over time becomes especially important with the advent of real-time fMRI (DeCharms, 2008), where BOLD data are analyzed online, which has applications in brain–computer interfacing and brain-state-dependent stimulation. The aim of this paper is to encourage the use of time-series analysis methods for the decoding of ongoing mental state. Our results on the decoding of grating orientation and rotation direction from the BOLD response show that such methods are warranted whenever decoding is applied within a paradigm that is characterized by intrinsic temporal dynamics.
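To make the advocated time-series machinery concrete, here is a minimal, generic sketch — not the authors' code — of Viterbi decoding for a two-state (left/right) rotation-direction HMM. The per-timepoint log-likelihoods are assumed to come from some class-conditional model of the voxel data, e.g., a Gaussian model fit per direction:

```python
import numpy as np

def viterbi(log_lik, log_trans, log_prior):
    """Most probable state path of an HMM, computed in the log domain.

    log_lik   : (T, K) per-timepoint log-likelihoods under each state
    log_trans : (K, K) log transition matrix, rows index the from-state
    log_prior : (K,)  log initial state probabilities"""
    T, K = log_lik.shape
    delta = np.zeros((T, K))            # best path score ending in each state
    psi = np.zeros((T, K), dtype=int)   # backpointers
    delta[0] = log_prior + log_lik[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_trans   # (from, to)
        psi[t] = np.argmax(scores, axis=0)
        delta[t] = scores[psi[t], np.arange(K)] + log_lik[t]
    path = np.zeros(T, dtype=int)
    path[-1] = np.argmax(delta[-1])
    for t in range(T - 2, -1, -1):      # trace back the best path
        path[t] = psi[t + 1][path[t + 1]]
    return path
```

A "sticky" transition matrix with high self-transition probability is what encodes the temporal smoothness that a static, timepoint-by-timepoint classifier ignores.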
Acknowledgments
The authors gratefully acknowledge the support of the Netherlands Organization for Scientific Research (NWO; Veni grant 451.09.001 and Vici grant 639.023.604) and the BrainGain Smart Mix Programme of the Netherlands Ministry of Economic Affairs and the Netherlands Ministry of Education, Culture and Science. We thank Paul Gaalman for his assistance during fMRI data acquisition.
References
Barber, D., Cemgil, A.T., 2010. Graphical models for time series. IEEE Signal Process. Mag. 27 (6), 18–28.
Beal, M., 2003. Variational algorithms for approximate Bayesian inference. Ph.D. thesis, University College London.
Brainard, D.H., 1997. The psychophysics toolbox. Spat. Vis. 10, 433–436.
Brouwer, G.J., van Ee, R., 2007. Visual cortex allows prediction of perceptual states during ambiguous structure-from-motion. J. Neurosci. 27 (5), 1015–1023.
Chadwick, M.J., Hassabis, D., Weiskopf, N., Maguire, E.A., 2010. Decoding individual episodic memory traces in the human hippocampus. Curr. Biol. 20 (6), 544–547.
DeCharms, R.C., 2008. Applications of real-time fMRI. Nat. Rev. Neurosci. 9, 720–729.
Dempster, A.P., Laird, N.M., Rubin, D.B., 1977. Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. B 39 (1), 1–38.
DeYoe, E.A., Carman, G.J., Bandettini, P., Glickman, S., Wieser, J., Cox, R., Miller, D., Neitz, J., 1996. Mapping striate and extrastriate visual areas in human cerebral cortex. Proc. Natl. Acad. Sci. U.S.A. 93 (6), 2382–2386.
Engel, S.A., Glover, G.H., Wandell, B.A., 1997. Retinotopic organization in human visual cortex and the spatial precision of functional MRI. Cereb. Cortex 7 (2), 181–192.
Fisher, N.I., 1996. Statistical analysis of circular data. Cambridge University Press, Cambridge.
Formisano, E., De Martino, F., Bonte, M., Goebel, R., 2008. "Who" is saying "what"? Brain-based decoding of human voice and speech. Science 322 (5903), 970–973.
Fuentemilla, L., Penny, W.D., Cashdollar, N., Bunzeck, N., Düzel, E., 2010. Theta-coupled periodic replay in working memory. Curr. Biol. 20, 1–7.
Furmanski, C.S., Engel, S.A., 2000. An oblique effect in human primary visual cortex. Nat. Neurosci. 3 (6), 535–536.
Hassabis, D., Chu, C., Rees, G., Weiskopf, N., Molyneux, P.D., Maguire, E.A., 2009. Decoding neuronal ensembles in the human hippocampus. Curr. Biol. 19, 546–554.
Hasson, U., Nir, Y., Levy, I., Fuhrmann, G., Malach, R., 2004. Intersubject synchronization of cortical activity during natural vision. Science 303 (5664), 1634–1640.
Haxby, J., Gobbini, M., Furey, M., Ishai, A., Schouten, J., Pietrini, P., 2001. Distributed and overlapping representations of faces and objects in ventral temporal cortex. Science 293, 2425–2430.
Haynes, J.-D., 2009. Decoding visual consciousness from human brain signals. Trends Cogn. Sci. 13 (5), 194–202.
Haynes, J.-D., Rees, G., 2005a. Predicting the orientation of invisible stimuli from activity in human primary visual cortex. Nat. Neurosci. 8 (5), 686–691.
Haynes, J.-D., Rees, G., 2005b. Predicting the stream of consciousness from activity in human visual cortex. Curr. Biol. 15 (14), 1301–1307.
Haynes, J.-D., Rees, G., 2006. Decoding mental states from brain activity in humans. Nat. Rev. Neurosci. 7, 523–534.
Kamitani, Y., Tong, F., 2005. Decoding the visual and subjective contents of the human brain. Nat. Neurosci. 8 (5), 679–685.
Kamitani, Y., Tong, F., 2006. Decoding seen and attended motion directions from activity in the human visual cortex. Curr. Biol. 16 (11), 1096–1102.
Kay, K., Naselaris, T., Prenger, R.J., Gallant, J.L., 2008. Identifying natural images from human brain activity. Nature 452, 352–355.
Kriegeskorte, N., Simmons, W.K., Bellgowan, P.S.F., Baker, C.I., 2009. Circular analysis in systems neuroscience: the dangers of double dipping. Nat. Neurosci. 12 (5), 535–540.
Kriegeskorte, N., Cusack, R., Bandettini, P., 2010. How does an fMRI voxel sample the neuronal activity pattern: compact-kernel or complex spatiotemporal filter? NeuroImage 49 (3), 1965–1976.
Levine, M.W., McAnany, J.J., 2005. The relative capabilities of the upper and lower visual hemifields. Vis. Res. 45 (21), 2820–2830.
Makni, S., Beckmann, C., Smith, S., Woolrich, M., 2008. Bayesian deconvolution of fMRI data using bilinear dynamical systems. NeuroImage 42 (4), 1381–1396.
Mannion, D.J., McDonald, J.S., Clifford, C.W.G., 2010. Orientation anisotropies in human visual cortex. J. Neurophysiol. 103 (6), 3465–3471.
Mardia, K.V., Jupp, P.E., 1999. Directional statistics, 2nd ed. Wiley, New York.
Miyawaki, Y., Uchida, H., Yamashita, O., Sato, M., Morito, Y., Tanabe, H.C., Sadato, N., Kamitani, Y., 2008. Visual image reconstruction from human brain activity using a combination of multiscale local image decoders. Neuron 60 (5), 915–929.
Naselaris, T., Prenger, R.J., Kay, K.N., Oliver, M., Gallant, J.L., 2009. Bayesian reconstruction of natural images from human brain activity. Neuron 63 (6), 902–915.
Norman, K.A., Polyn, S.M., Detre, G.J., Haxby, J.V., 2006. Beyond mind-reading: multi-voxel pattern analysis of fMRI data. Trends Cogn. Sci. 10 (9), 424–430.
Op de Beeck, H.P., 2010. Against hyperacuity in brain reading: spatial smoothing does not hurt multivariate fMRI analyses? NeuroImage 49 (3), 1943–1948.
Reddy, L., Tsuchiya, N., Serre, T., 2010. Reading the mind's eye: decoding category information during mental imagery. NeuroImage 50 (2), 818–825.
Roweis, S.T., Ghahramani, Z., 1999. A unifying review of linear Gaussian models. Neural Comput. 11 (2), 305–345.
Sasaki, Y., Rajimehr, R., Kim, B.W., Ekstrom, L.B., Vanduffel, W., Tootell, R.B.H., 2006. The radial bias: a different slant on visual orientation sensitivity in human and nonhuman primates. Neuron 51 (5), 661–670.
Schölkopf, B., Smola, A.J., Müller, K.-R., 1998. Nonlinear component analysis as a kernel eigenvalue problem. Neural Comput. 10, 1299–1319.
Serences, J.T., Boynton, G.M., 2007. Feature-based attentional modulations in the absence of direct visual stimulation. Neuron 55 (2), 301–312.
Sereno, M.I., Dale, A.M., Reppas, J.B., Kwong, K.K., Belliveau, J.W., Brady, T.J., Rosen, B.R., Tootell, R.B., 1995. Borders of multiple visual areas in humans revealed by functional magnetic resonance imaging. Science 268 (5212), 889–893.
Shmuel, A., Chaimow, D., Raddatz, G., Ugurbil, K., Yacoub, E., 2010. Mechanisms underlying decoding at 7 T: ocular dominance columns, broad structures, and macroscopic blood vessels in V1 convey information on the stimulated eye. NeuroImage 49 (3), 1957–1964.
Silver, M.A., Kastner, S., 2009. Topographic maps in human frontal and parietal cortex. Trends Cogn. Sci. 13 (11), 488–495.
Swisher, J.D., Gatenby, J.C., Gore, J.C., Wolfe, B.A., Moon, C.-H., Kim, S.-G., Tong, F., 2010. Multiscale pattern analysis of orientation-selective activity in the primary visual cortex. J. Neurosci. 30 (1), 325–330.
Thirion, B., Duchesnay, E., Hubbard, E., Dubois, J., Poline, J., Lebihan, D., Dehaene, S., 2006. Inverse retinotopy: inferring the visual content of images from brain activation patterns. NeuroImage 33 (4), 1104–1116.
Vanduffel, W., Tootell, R.B.H., Schoups, A.A., Orban, G.A., 2002. The organization of orientation selectivity throughout macaque visual cortex. Cereb. Cortex 12 (6), 647–662.
Winawer, J., Horiguchi, H., Sayres, R.A., Amano, K., Wandell, B.A., 2010. Mapping hV4 and ventral occipital cortex: the venous eclipse. J. Vis. 10 (5), 1–22.