NeuroImage xxx (2011) xxx–xxx
Dynamic decoding of ongoing perception
Marcel A.J. van Gerven a,⁎, Peter Kok a, Floris P. de Lange a, Tom Heskes b
a Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Nijmegen, The Netherlands
b Institute for Computing and Information Sciences, Radboud University Nijmegen, Nijmegen, The Netherlands
Article history: Received 1 December 2010; Revised 3 May 2011; Accepted 6 May 2011; Available online xxxx.
Keywords: Decoding; Visual cortex; Time-series analysis; Linear dynamical system; Hidden Markov model
Abstract

Decoding of perceptual and mental states using multivariate analysis methods has received much recent attention. It relies on selective responses to experimental conditions in single trials, aggregated across voxels. In this study, we show that decoding is also possible when the state of interest changes continuously over time. We show that both the orientation and the rotation direction of a continuously rotating grating can be decoded with high accuracy using linear dynamical systems and hidden Markov models. These findings extend previous decoding results for static gratings and are of importance for the decoding of ongoing changes in mental state.

© 2011 Published by Elsevier Inc.
Introduction
Various studies have shown that the (subjective) contents of perception can be decoded from functional magnetic resonance imaging (fMRI) data using multivariate analysis methods (Norman et al., 2006; Haynes and Rees, 2006). For example, groundbreaking work by Kamitani and Tong (2005) has shown that the orientation of a static grating can be decoded with high accuracy. The possibility of decoding stimulus orientation from the BOLD response has been hypothesized to rely either on biased sampling of cortical orientation columns (Kamitani and Tong, 2005; Swisher et al., 2010; Kriegeskorte et al., 2010) or on patterns of spatial organization at a coarser scale, e.g., due to influences of macroscopic blood vessels or globally unequal representations (Op de Beeck, 2010). Recent results suggest that both fine and coarse patterns of spatial organization might be responsible for successful decoding of stimulus orientation (Shmuel et al., 2010). The possibility of probing internal states from distributed patterns of brain activity offers a new window into the brain and has led to novel insights about neural representations involved in various mental processes (Haxby et al., 2001; Kamitani and Tong, 2005; Haynes and Rees, 2005b; Thirion et al., 2006; Formisano et al., 2008; Miyawaki et al., 2008; Kay et al., 2008; Naselaris et al., 2009; Haynes, 2009;
⁎ Corresponding author at: Donders Institute for Brain, Cognition and Behaviour, Centre for Cognition, Cognitive Artificial Intelligence Department, Radboud University Nijmegen, Montessorilaan 3, 6500 HE, Nijmegen, The Netherlands. Fax: +31 24 3616066. E-mail addresses: [email protected], [email protected] (M.A.J. van Gerven).
Hassabis et al., 2009; Chadwick et al., 2010; Reddy et al., 2010). However, to date, most decoding studies have focused on static decoding of single events in individual trials, even though mental state is an ongoing process which evolves over time. The ability to decode continuously changing stimuli over time from distributed patterns of brain activity would truly enable tracking of how the internal state evolves as it is influenced by endogenous and exogenous factors. Dynamic decoding has many applications, such as tracking the temporal dynamics of bistable perception (Haynes and Rees, 2005b; Brouwer and van Ee, 2007), decoding the memory trace (Hassabis et al., 2009; Chadwick et al., 2010; Fuentemilla et al., 2010) or predicting certain aspects of ongoing (subjective) perception (Miyawaki et al., 2008; Kay et al., 2008; Naselaris et al., 2009). This has been recognized by Haynes and Rees (2005b), who have shown that spontaneous changes in conscious experience during binocular rivalry can be decoded using linear discriminant analysis. In this paper, we demonstrate that time-series analysis methods are well suited for the dynamic decoding of ongoing mental processes. In contrast to Haynes and Rees, we use a generative method that is able to track these ongoing changes through time by modeling the autocorrelations in the signal of interest, thereby achieving better decoding performance. Furthermore, we focus not only on discrete changes in internal state, as is the case with spontaneous changes in bistable perception, but also on continuous changes. To this end, we presented three subjects with a grating whose orientation changed continuously due to variations in rotation direction and speed. We decoded both the rotation angle and the direction of the grating, thereby extending previous work on static decoding of perceived stimulus orientation (Kamitani and Tong, 2005) and rotational motion (Kamitani and Tong, 2006).
Please cite this article as: van Gerven, M.A.J., et al., Dynamic decoding of ongoing perception, NeuroImage (2011), doi:10.1016/j.neuroimage.2011.05.020
Material and methods

Subjects

Three healthy adults with normal or corrected-to-normal vision participated in this study. The study was approved by the local ethics committee and all subjects (authors MG, PK and FL) gave written informed consent.

Stimuli and experimental design

Visual stimuli were generated using MATLAB in conjunction with the Psychophysics Toolbox (Brainard, 1997) and displayed on a rear-projection screen using an LCD projector (EIKI, Rancho Santa Margarita, CA, USA). The experiment consisted of ten 180 s long runs, each preceded by a 15 s long rest period. In each run, a rotating sine-wave annular grating (∼100% contrast, 1.5 cycles per degree, 5° eccentricity, inner annulus 3° in diameter) was presented, as shown in Fig. 1. The grating had a different spatial phase in each run. Within a run, a variable angular velocity of ten to twenty degrees per second was used, with an initial angular velocity of fifteen degrees per second. Angular velocity was modulated by a sine wave that consisted of 3, 5, or 7 cycles per run. In each run, rotation direction changed at constant intervals, amounting to 3, 5, or 7 direction changes per run. The subject viewed the stimulus while maintaining central fixation and counting the number of changes in rotation direction throughout the experiment. The discrete changes in direction and continuous changes in angular velocity make the decoding of rotation angle a non-trivial problem.

In a separate session, a standard retinotopic mapping experiment was conducted to delineate visual areas using a rotating wedge stimulus (Sereno et al., 1995). Subjects viewed a wedge consisting of a flashing checkerboard pattern (3 Hz), first rotating clockwise for 9 cycles and then anticlockwise for another 9 cycles (at a rotation speed of 23.4 s/cycle). In a third session, a functional localization experiment was used to localize area MT+. In order to localize voxels sensitive to linear or rotational motion, the subjects viewed the grating in different conditions: static non-drifting, static drifting, and rotating non-drifting. Each run consisted of a rest block followed by blocks defined by the different conditions. Each block was 15 s long and each run was repeated ten times.

MRI acquisition

Functional images were acquired with a Siemens 3T MRI system using a 32-channel coil for signal reception. For the main experiment, 1319 blood oxygenation level dependent (BOLD) sensitive functional images were acquired using a single-shot gradient EPI sequence, with a repetition time (TR) of 1500 ms, an echo time (TE) of 30 ms and an isotropic voxel size of 2 × 2 × 2 mm, acquired in 26 axial slices in ascending order using a (GRAPPA) acceleration factor of 3. Dummy volumes were acquired at the beginning of each scan in order to account for T1 equilibration effects. A high-resolution anatomical image was acquired using an MP-RAGE sequence (TE/TR = 3.39/2250 ms; 176 sagittal slices, isotropic voxel size of 1 × 1 × 1 mm).

Functional MRI data preprocessing

Functional data were preprocessed within the framework of SPM8 (Statistical Parametric Mapping, http://www.fil.ion.ucl.ac.uk/spm). Dummy scans were removed before preprocessing. Functional brain volumes were realigned to the mean image in order to correct for motion, and the anatomical image was coregistered with the mean of the functional images. Data were high-pass filtered with a filter cutoff of 128 s, represented as percentage signal change and finally centered and scaled to have mean zero and unit variance.

In the present analysis we restricted ourselves to regions of interest in visual cortex. We performed retinotopic mapping to identify the boundaries of retinotopic areas in early visual cortex using well-established methods (Sereno et al., 1995; DeYoe et al., 1996; Engel et al., 1997). Freesurfer (http://surfer.nmr.mgh.harvard.edu) was used to generate inflated representations of the cortical surface from each subject's T1-weighted structural image and to analyze the functional data of the retinotopic mapping session. Fourier-based methods were used to obtain polar angle maps of the cortical surface, on the basis of which the borders of visual areas (dorsal and ventral V1, V2 and V3 in both hemispheres) could be defined for each subject (Sereno et al., 1995). Area MT+ was localized by taking the most significant clusters obtained from a GLM contrast (static drifting + rotating − static conditions) while restricting the clusters to fall within area OC5 of the SPM Anatomy toolbox. This area was delineated using a liberal probability threshold greater than zero in order to ensure the inclusion of motion-sensitive voxels for the decoding analysis. The area was warped into native space by applying the inverse normalization parameters to the mask.
Decoding analysis
In order to dynamically decode both grating orientation and rotation direction from the continuously rotating grating, we made use of time-series analysis methods. Assume that the sequence of states of interest x = (x1, …, xT) forms a Markov chain and is only indirectly observable via measurements y = (y1, …, yT). The joint density for this generative model is then given by
Fig. 1. Subjects were shown a continuously rotating grating (10 degrees visual angle, rotating at 10–20 degrees/s) whose rotation angle was unpredictable due to pseudo-random changes in both angular velocity and rotation direction. Subjects had to maintain fixation and count the number of times a rotation direction change occurred.
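To make the stimulus design concrete, the angle trajectory described in the Methods can be sketched as follows. This is a reconstruction under stated assumptions, not the authors' MATLAB code: the sine phase of the velocity modulation, the sampling rate and the exact placement of direction changes are assumptions for illustration.

```python
import numpy as np

def grating_angle(duration=180.0, dt=1/60, n_vel_cycles=5, n_dir_changes=5):
    """Generate a rotation-angle trajectory for one run (illustrative values)."""
    t = np.arange(0.0, duration, dt)
    # Angular speed oscillates between 10 and 20 deg/s (15 +/- 5 deg/s),
    # modulated by a sine wave with n_vel_cycles cycles per run.
    speed = 15.0 + 5.0 * np.sin(2 * np.pi * n_vel_cycles * t / duration)
    # Rotation direction flips at constant intervals, giving
    # n_dir_changes direction changes per run.
    direction = np.where((t // (duration / (n_dir_changes + 1))) % 2 == 0, 1.0, -1.0)
    # Integrate angular velocity to obtain the angle at each time step.
    angle = np.cumsum(direction * speed * dt)
    return t, angle, direction
```

Because speed and direction are varied independently, the angle at any future time cannot be anticipated from the current angle alone, which is what makes the decoding problem non-trivial.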
$$p(\mathbf{x}, \mathbf{y}) = p(x_1) \prod_{t=2}^{T} p(x_t \mid x_{t-1}) \prod_{t=1}^{T} p(y_t \mid x_t) \qquad (1)$$
where the dependence on the model parameters θ is left implicit. Ultimately, we are interested in computing the most probable sequence of states x⁎ = arg maxxp(x|y) for new observations y. This is achieved by estimating the model parameters θ from data acquired over multiple runs and inferring x⁎ from measurements made in independent runs. Since the state is observed during training we can immediately estimate parameters from data instead of needing to estimate them using an expectation-maximization algorithm (Dempster et al., 1977).
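Since the states are observed during training, the maximum-likelihood parameters of the linear-Gaussian model reduce to closed-form least-squares estimates. The sketch below illustrates this for the LDS case; the function name and the choice to leave the initial covariance at identity (it is not identifiable from a single starting state per run) are assumptions, not details from the paper.

```python
import numpy as np

def fit_lds(x, y):
    """Closed-form ML-style estimates of LDS parameters from fully observed
    states x (T x d) and measurements y (T x p) -- an illustrative sketch."""
    T = x.shape[0]
    # Transition model x_t ~ N(A x_{t-1}, Q), fitted by least squares.
    A_t, _, _, _ = np.linalg.lstsq(x[:-1], x[1:], rcond=None)
    A = A_t.T
    res = x[1:] - x[:-1] @ A.T
    Q = res.T @ res / (T - 1)
    # Observation model y_t ~ N(C x_t, R).
    C_t, _, _, _ = np.linalg.lstsq(x, y, rcond=None)
    C = C_t.T
    res_y = y - x @ C.T
    R = res_y.T @ res_y / T
    # Initial state density; Sigma kept at identity as a placeholder.
    mu, Sigma = x[0], np.eye(x.shape[1])
    return A, Q, C, R, mu, Sigma
```

With unobserved states, as the authors note, these regressions would have to be replaced by expectation-maximization updates.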
Decoding of grating orientation and rotation direction requires inference about a continuous mental state (orientation) or a discrete mental state (direction). In the case of a continuous mental state, we assume that both the state and measurement variables are Gaussian, with the mean being linearly dependent on the conditioning variables. That is, we have

$$p(x_1) = \mathcal{N}(x_1; \mu, \Sigma)$$
$$p(x_t \mid x_{t-1}) = \mathcal{N}(x_t; A x_{t-1}, Q)$$
$$p(y_t \mid x_t) = \mathcal{N}(y_t; C x_t, R)$$
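For the Gaussian model above, the smoothed state estimate E[x_t | y_1..T] is given by the Rauch–Tung–Striebel (Kalman) smoother. The following is a minimal sketch of that standard algorithm, not the authors' implementation; for clarity it returns only the smoothed means and inverts the innovation covariance directly rather than using a numerically safer solve.

```python
import numpy as np

def kalman_smoother(y, A, Q, C, R, mu, Sigma):
    """Rauch-Tung-Striebel smoother: smoothed means E[x_t | y_1..T] (sketch)."""
    T, d = y.shape[0], mu.shape[0]
    m_f = np.zeros((T, d)); P_f = np.zeros((T, d, d))  # filtered moments
    m_p = np.zeros((T, d)); P_p = np.zeros((T, d, d))  # predicted moments
    m, P = mu, Sigma
    for t in range(T):                                 # forward (filter) pass
        if t > 0:
            m, P = A @ m_f[t - 1], A @ P_f[t - 1] @ A.T + Q
        m_p[t], P_p[t] = m, P
        S = C @ P @ C.T + R                            # innovation covariance
        K = P @ C.T @ np.linalg.inv(S)                 # Kalman gain
        m_f[t] = m + K @ (y[t] - C @ m)
        P_f[t] = P - K @ C @ P
    m_s = m_f.copy()
    for t in range(T - 2, -1, -1):                     # backward (smoothing) pass
        J = P_f[t] @ A.T @ np.linalg.inv(P_p[t + 1])
        m_s[t] = m_f[t] + J @ (m_s[t + 1] - m_p[t + 1])
    return m_s
```

The backward pass is what distinguishes the smoother from the filter: every estimate is conditioned on the whole run, matching the offline decoding setting of this study.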
with parameters θ = {μ, Σ, A, Q, C, R}. This model is known as a linear dynamical system (LDS) (Roweis and Ghahramani, 1999), where decoding of x is realized using the Kalman smoother. If, in contrast, the underlying mental state is discrete, such as in the case of rotation direction or spontaneous changes during bistable perception, we obtain the hidden Markov model (HMM) (Roweis and Ghahramani, 1999), where the state variables are multinomial random variables and x⁎ is computed by the Viterbi decoder. For binomial state variables with x_t ∈ {0, 1}, the distributions are defined by

$$p(x_1 = 1) = m$$
$$p(x_t = 1 \mid x_{t-1}) = a_0^{(1 - x_{t-1})} \, a_1^{x_{t-1}}$$
$$p(y_t \mid x_t) = \mathcal{N}(y_t; \mu_0, R_0)^{(1 - x_t)} \, \mathcal{N}(y_t; \mu_1, R_1)^{x_t}$$
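For this two-state HMM with Gaussian emissions, the most probable state sequence is found with the standard Viterbi algorithm. The sketch below is an illustrative log-domain implementation of that textbook algorithm under the parameterization above (a0 and a1 are the probabilities of being in state 1 given the previous state 0 or 1); it is not the authors' MATLAB code.

```python
import numpy as np

def viterbi_binary(y, m, a0, a1, mu, R):
    """Most probable state sequence for the two-state Gaussian HMM (sketch).
    mu, R are lists of per-state emission means and covariances."""
    T = y.shape[0]

    def log_gauss(v, mean, cov):
        d = mean.shape[0]
        diff = v - mean
        _, logdet = np.linalg.slogdet(cov)
        return -0.5 * (d * np.log(2 * np.pi) + logdet
                       + diff @ np.linalg.solve(cov, diff))

    # Log transition matrix: rows index the previous state, columns the next.
    logT = np.log(np.array([[1 - a0, a0], [1 - a1, a1]]))
    delta = np.array([np.log(1 - m), np.log(m)]) \
        + np.array([log_gauss(y[0], mu[k], R[k]) for k in (0, 1)])
    back = np.zeros((T, 2), dtype=int)
    for t in range(1, T):
        emit = np.array([log_gauss(y[t], mu[k], R[k]) for k in (0, 1)])
        scores = delta[:, None] + logT            # (previous, next) scores
        back[t] = np.argmax(scores, axis=0)
        delta = scores[back[t], [0, 1]] + emit
    x = np.zeros(T, dtype=int)                    # backtrace
    x[-1] = int(np.argmax(delta))
    for t in range(T - 1, 0, -1):
        x[t - 1] = back[t, x[t]]
    return x
```

Working in the log domain avoids underflow for the long measurement sequences produced by a 180 s run.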
with parameters θ = {m, a_k, μ_k, R_k} with k ∈ {0, 1}. All algorithms were implemented in MATLAB 7 (Mathworks, Natick, MA, USA). For the HMM, a small diagonal term of 10⁻³I was added to the covariance matrices R_k in order to avoid numerical stability issues. In order to take the hemodynamic lag into account, for both the LDS and the HMM, the measurement y_t associated with a state x_t at a certain point in time was given by the BOLD response within the volume acquired at a six-second delay relative to the stimulus at time t.

In order to estimate the angle α_t, we used a linear dynamical system where the state vector is given by x_t = (sin(α_t), cos(α_t))ᵀ. This transformation is required since angle itself is a periodic variable, which should be decoded using techniques from circular statistics (Fisher, 1996; Mardia and Jupp, 1999). The sine–cosine transformation allowed us to use standard time-series analysis methods. Furthermore, since a grating is mirror symmetric, a full rotation is achieved after a 180° change in angle, such that the average absolute deviation between the real angle and a random angle will be 45°. Decoding performance was estimated by computing the mean absolute deviation (MAD) between the real and predicted angle of the rotating grating. In order to estimate rotation direction, we used a hidden Markov model where the state x_t ∈ {0, 1} denotes clockwise or counter-clockwise rotation. Decoding performance was estimated by computing the classification error, that is, the proportion of volumes that were assigned to the incorrect rotation direction.

In order to improve the efficiency of inference and to prevent overfitting, we restricted the number of voxels that act as observations. We achieved this by ranking individual voxels in terms of the correlation between their BOLD timecourse and the timecourse(s) of interest (sine and cosine of the angle, or rotation direction).
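The sine–cosine encoding and the MAD criterion described above can be sketched as follows. This is a plausible reading of the evaluation, not the authors' code: it assumes angles are expressed in degrees and that deviations are taken on a 180° circle to respect the grating's mirror symmetry, which reproduces the stated 45° chance level.

```python
import numpy as np

def circular_mad(alpha_true, x_pred):
    """Mean absolute deviation between true grating angles (degrees) and
    angles recovered from predicted (sin, cos) state vectors, respecting
    the 180-degree mirror symmetry of the grating (sketch)."""
    # Recover the angle from the predicted (sin, cos) state estimate.
    alpha_hat = np.degrees(np.arctan2(x_pred[:, 0], x_pred[:, 1]))
    # Absolute angular deviation on a 180-degree circle: a 170-degree
    # difference counts as 10 degrees because the grating repeats.
    d = np.abs(alpha_hat - alpha_true) % 180.0
    d = np.minimum(d, 180.0 - d)
    return d.mean()
```

Note that the predicted (sin, cos) vector need not lie on the unit circle; arctan2 projects it back onto an angle regardless of its norm.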
In order to prevent circularity issues (Kriegeskorte et al., 2009), we used an outer and inner cross-validation approach. In the outer cross-validation, individual runs (test data) were decoded by means of probabilistic inference using a model that was estimated from the data in the nine remaining runs (training data). In the inner cross-validation, eight out of nine runs in the training data were used to rank voxels according to their correlation with the signal timecourse(s) of interest. This ranking was used to identify the optimal number of voxels according to decoding performance on the remaining run in the training data. The identified voxels were then used to train the model on all training data, and this model was subsequently used in the outer cross-validation to produce decoding results for the test data. Final decoding performance was estimated by computing the average decoding performance over each of the ten runs. See Fig. 2 for a description of the decoding analysis.

In order to examine the differences in decoding performance between regions of interest, we also analyzed decoding performance per region. In this case, in order to be able to compare decoding performances, a fixed number of voxels was used as the basis for decoding.
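The nested cross-validation scheme can be summarized as pseudo-structure in code. The helper functions `rank_voxels`, `fit` and `score` are placeholders for the correlation-based ranking, model estimation and MAD/classification-error computation; their signatures are assumptions made for this sketch.

```python
import numpy as np

def nested_cv(runs, voxel_counts, rank_voxels, fit, score):
    """Outer/inner cross-validation over runs, as described above (sketch).
    `runs` is a list of per-run datasets; helper functions are assumed."""
    outer_scores = []
    for i, test_run in enumerate(runs):
        train_runs = runs[:i] + runs[i + 1:]
        # Inner CV: the number of voxels is chosen on held-out *training*
        # runs, never on the test run, which avoids circularity.
        inner = np.zeros(len(voxel_counts))
        for j, val_run in enumerate(train_runs):
            fit_runs = train_runs[:j] + train_runs[j + 1:]
            ranking = rank_voxels(fit_runs)
            for k, n in enumerate(voxel_counts):
                model = fit(fit_runs, ranking[:n])
                inner[k] += score(model, val_run, ranking[:n])
        best_n = voxel_counts[int(np.argmin(inner))]
        # Re-rank and refit on all training runs, then score the test run once.
        ranking = rank_voxels(train_runs)
        model = fit(train_runs, ranking[:best_n])
        outer_scores.append(score(model, test_run, ranking[:best_n]))
    return float(np.mean(outer_scores))
```

The key design choice is that both voxel ranking and voxel-count selection happen strictly inside the training partition of each outer fold.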
Results
Decoding of grating orientation
Fig. 3 depicts the orientation decoding results for all three subjects using the outer and inner cross-validation approach described above. Panel 3.A shows the real and predicted cumulative angles (i.e., predicted orientations while taking into account full rotations) for the three subjects across all runs, as compared with a random prediction that is obtained by ignoring the observed BOLD responses. The mean absolute deviations between the angle of the real and predicted grating orientation were 14.56° ± 0.80° SEM, 14.52° ± 0.56° SEM and 15.61° ± 0.71° SEM for subjects MG, PK and FL respectively, which is well below the chance level of 45°. All three subjects showed a weak positive correlation between the absolute deviation and the angular velocity (r = 0.28, r = 0.29 and r = 0.22), which implies that the accuracy of the estimates decreases as the rotation speed of the grating increases. In order to examine how MAD changes as a function of the number of used voxels, we used the inner cross-validation results. That is, the corresponding estimates obtained per fold were averaged to produce the curves of interest for the three subjects, as shown in Panel 3.B. The dashed vertical lines depict the optimal number of voxels for individual runs in the inner cross-validation. A minimal MAD was reached at about 200 included voxels. Finally, Panel 3.C shows the voxels which were used for decoding the angle. The contours delineate dorsal and ventral visual areas V1, V2 and V3 as well as area MT+. The heat maps denote how often each voxel was used for decoding, averaged over cross-validation folds. Selected voxels extend from V1 up to and beyond V3, with dorsal areas being more prevalent than ventral areas, which could be associated with the performance asymmetry between the upper and lower visual field (Levine and McAnany, 2005) or with differences in susceptibility artifacts between dorsal and ventral areas (Winawer et al., 2010).

Fig. 2. Overview of the decoding analysis. BOLD data is realigned, high-pass filtered, standardized and corrected for the hemodynamic lag by aligning the peak of the BOLD response with the moment of stimulation. Both the BOLD data and the signal time courses (orientation and rotation direction) serve as input to a correlation-based feature ranking which ranks the voxels according to their usefulness for decoding. Using an inner cross-validation which iterates over nine out of ten runs, the optimal number of voxels is computed by minimizing the MAD/classification error using the LDS/HMM. After this model selection phase, the parameters of the LDS/HMM are re-estimated based on all nine runs. Decoding performance is subsequently computed using the test data (the remaining run). This procedure is repeated for each of the ten runs and the reported decoding performance is the average over runs.

Fig. 3. Orientation decoding results for the three subjects. Panel A shows the angle of the real, predicted and random orientation accumulated across runs as a function of measured volume. Random decoding results were obtained by ignoring the observed responses and are added for comparison. Upward directions denote clockwise rotation and downward directions denote anti-clockwise rotation. Panel B shows the mean absolute deviation as a function of the number of used voxels, where error bars denote standard error of the mean. Dashed vertical lines denote the number of voxels that are selected per cross-validation fold. Panel C shows the correlations between the signal of interest and those voxels which have been selected in the inner cross-validation to produce the final decoding results.
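The correlation-based feature ranking that selects these voxels can be sketched as follows. The ranking criterion (maximum absolute correlation across the signal timecourses) is an assumption consistent with, but not spelled out by, the text.

```python
import numpy as np

def rank_voxels_by_correlation(bold, signals):
    """Rank voxels by the maximum absolute correlation between their BOLD
    timecourse (T x V) and the signal timecourses of interest (T x S);
    an illustrative sketch of the feature ranking described above."""
    # Standardize both sides so the inner product becomes a correlation.
    b = (bold - bold.mean(0)) / bold.std(0)
    s = (signals - signals.mean(0)) / signals.std(0)
    corr = b.T @ s / bold.shape[0]           # V x S correlation matrix
    strength = np.abs(corr).max(axis=1)      # best correlation per voxel
    return np.argsort(strength)[::-1]        # voxel indices, best first
```

Truncating this ranking at the voxel count chosen by the inner cross-validation yields the observation set for the LDS or HMM.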
Decoding of rotation direction
Fig. 4 depicts the direction decoding results for all three subjects using the inner and outer cross-validation approach described above. Panel 4.A shows the real and predicted rotation directions for the three subjects in the first five runs. Classification errors were 0.37, 0.34 and 0.41 for subjects MG, PK and FL respectively. Subjects showed no or only a very weak positive correlation between the classification error and the angular velocity (r = 0.05, r = 0.12 and r = −0.01). In order to examine how classification error changes as a function of the number of used voxels, we used the inner cross-validation results. Panel 4.B depicts the curves of interest for the three subjects. The dashed vertical lines depict the optimal number of voxels for individual runs in the inner cross-validation. A minimal error was reached at about ten included voxels. Panel 4.C indeed shows that a small number of voxels is selected, mostly outside of early visual areas or area MT+. In general, rotation direction seems much harder to decode than grating orientation. The variability of the number of voxels selected in the inner cross-validation (cf. Panels 3.B and 4.B) also indicates that decoding of direction is less robust than decoding of orientation.
Region of interest analysis
In order to identify how visual areas independently contribute to the decoding of grating orientation and rotation direction, we used the functional localizers that were acquired in independent sessions. In order to allow comparison of the contributions of each visual area, we used the fifty voxels whose BOLD response correlated most strongly with the signal(s) of interest. Fig. 5 shows the decoding results on the flattened cortical surface for each subject. For grating orientation, decoding performance, averaged over dorsal and ventral as well as left and right hemisphere regions for all subjects, decreased when moving downstream from V1 to V3 (MADs of 30° ± 3° SEM, 31° ± 3° SEM and 34° ± 4° SEM for areas V1, V2 and V3, respectively). This matches the decoding results of Kamitani and Tong (2005) and is consistent with the poorer orientation selectivity in higher visual areas, as shown in animal studies (Vanduffel et al., 2002). Area MT+ did not allow decoding of grating orientation (MAD of 43° ± 1° SEM), which is also consistent with previous findings (Kamitani and Tong, 2005). Dorsal areas (MADs of 30° ± 2° SEM, 29° ± 2° SEM and 32° ± 2° SEM for areas V1, V2 and V3, respectively) performed somewhat better than ventral areas (MADs of 30° ± 4° SEM, 32° ± 4° SEM and 35° ± 4° SEM for areas V1, V2 and V3, respectively). For rotation direction, decoding performance per region of interest was quite low. Still, we could observe decreases in performance in the downstream direction when dorsal and ventral as well as left and right hemisphere regions were pooled (classification errors of 0.37, 0.42 and 0.43 for areas V1, V2 and V3, respectively). Area MT+ now also became involved, with an average classification error of 0.41. We also examined whether individual regions showed a particular bias for certain grating orientations. Fig. 6 shows that there is some variability in the decoding of certain orientations in visual area V1, both within and between subjects. However, we have found no clear correspondences with previously reported orientation anisotropies in early visual cortex (Furmanski and Engel, 2000; Mannion et al., 2010).

Fig. 4. Predicted rotation directions for the three subjects. Panel A shows the decoding result for the first five runs. Panel B shows the classification error as a function of the number of used voxels, where error bars denote standard error of the mean. Dashed vertical lines denote the number of voxels that are selected per cross-validation fold. Panel C shows the correlations between the signal of interest and those voxels which have been selected in the inner cross-validation to produce the final decoding results.

Fig. 5. Decoding results separately computed for visual areas V1, V2, V3 and MT+ for the three subjects. Panel A shows angle decoding results, where color codes for decoding performance (45° − MAD). Panel B shows direction decoding results, where color codes for classification accuracy (1 − classification error). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 6. Polar plots for the angle deviation as a function of orientation, where orientations are binned into 30° windows with 20° overlap. Results are shown for all three subjects in ventral and dorsal regions of primary visual area V1 for both the left and right hemisphere.
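The orientation-binned deviation analysis of Fig. 6 (30° windows with 20° overlap, i.e., bin centers every 10°) can be sketched as follows; the exact bin placement is an assumption consistent with the caption.

```python
import numpy as np

def binned_deviation(angles, deviations, width=30.0, step=10.0):
    """Mean absolute deviation per orientation bin, using 30-degree windows
    with 20-degree overlap (bin centers every 10 degrees on the 180-degree
    orientation circle) -- a sketch of the Fig. 6 analysis."""
    centers = np.arange(0.0, 180.0, step)
    means = []
    for c in centers:
        # Distance of each orientation to the bin center on a 180-degree circle.
        d = np.abs((angles - c + 90.0) % 180.0 - 90.0)
        in_bin = d <= width / 2.0
        means.append(deviations[in_bin].mean() if in_bin.any() else np.nan)
    return centers, np.array(means)
```

Because the windows overlap, each orientation contributes to up to three neighboring bins, which smooths the resulting polar profile.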
Static versus dynamic decoding
In order to substantiate our claim that the decoding of ongoing mental state is facilitated by the use of dynamic instead of static decoding methods, we performed a comparative analysis. The static decoder was obtained simply by removing the autocorrelations for the state variable at neighboring timepoints. That is, we forced the matrix A to be zero and the transition probabilities a0 and a1 to be 0.5 during parameter estimation for the LDS and the HMM, respectively. Fig. 7 demonstrates the decoding of mental state using randomly selected voxels in area V1. Panel 7.A shows an example of decoding of the sine and cosine of the rotation angle using a static decoder (blue line) and a dynamic decoder (red line). These results demonstrate that the dynamic decoder effectively performs an adaptive smoothing of the time-series due to the intrinsic autocorrelations. Panel 7.B shows that the dynamic decoder indeed improves on the static
326 327 328 329
335 336 337 338 339 340 341 342 343 344 345 346
A
347 348
Discussion
357
decoder in terms of MAD performance. The effects are even more salient for the decoding of rotation direction. Fig. 7C shows an example of decoding the rotation direction using a static decoder and a dynamic decoder. The static decoder fails completely due to its assumption that neighboring time points are independent. Classification error is also substantially lower for the dynamic decoder (Fig. 7D). Subject PK, who performed best in the previous analyses, now shows the worst decoding performance. This can be explained by the fact that the voxels which allowed decoding before may have been excluded by the random selection of voxels in area V1.

Fig. 7. Comparison between static and dynamic decoding using randomly selected voxels in area V1. Averaged results for twenty repeats using different random selections are shown. Panel A demonstrates the decoding of the sine and cosine components of the rotation angle using a static and a dynamic decoder based on 400 randomly selected voxels. Panel B shows how MAD changes as a function of the number of used voxels. Panel C depicts the decoding of rotation direction using both decoders based on 50 randomly selected voxels. Panel D shows how classification error changes as a function of the number of used voxels. The dynamic decoder outperforms the static decoder in both cases. Error bars in Panels B and D denote standard error of the mean.

Please cite this article as: van Gerven, M.A.J., et al., Dynamic decoding of ongoing perception, NeuroImage (2011), doi:10.1016/j.neuroimage.2011.05.020

In this paper, we have shown that dynamic decoding of ongoing mental state can be achieved using time-series analysis methods. First, it was shown that grating orientation can be decoded continuously by means of a linear dynamical system in which the grating orientation is inferred using a Kalman smoother. Second, it was shown that rotation direction can be decoded using a hidden Markov model in which the rotation direction is inferred using Viterbi decoding. Our comparison between static and dynamic decoding methods shows that the decoding of ongoing internal state from the BOLD response is improved by employing dynamic decoding methods. This paves the way for the decoding of ongoing perception in more ecologically valid settings, such as the viewing of movies (Hasson et al., 2004), as well as the decoding of ongoing mental states that arise from internal processes rather than from external stimulation (Haynes and Rees, 2005a; Haynes, 2009). In the latter case, the state itself cannot be observed, which requires the use of an expectation-maximization algorithm when estimating model parameters. Even though both grating orientation and rotation direction could be decoded from the BOLD response, results indicate that the former is
much easier to decode than the latter. Figs. 3B and 4B show that the decrease in error is both more pronounced and more consistent for grating orientation than for rotation direction. The improved consistency follows from the fact that results are more comparable both between and within subjects. Within subjects, the results for each cross-validation fold are more similar, as evidenced both by the smaller error bars and by the fact that the optimal number of voxels used for decoding is more similar between folds. Improved consistency is also apparent in the voxels selected for decoding, as shown in Figs. 3C and 4C. Clusters of voxels in early visual cortex correlated strongly with grating orientation, whereas the voxels selected to decode rotation direction were more dispersed throughout the measured brain volume.

The fact that orientation was easier to decode than rotational motion is in line with previous findings (Kamitani and Tong, 2005, 2006) and could be due to differences in spatial anisotropies between orientation-selective and direction-selective neurons. Swisher et al. (2010) provide a detailed account of the orientation biases that are present in the BOLD signal at multiple spatial scales.

The region-of-interest analysis for direction decoding, depicted in Fig. 5B, shows that both early visual cortex and area MT+ allowed for the decoding of rotation direction. Here, area MT+ outperformed early visual areas for subjects MG and PK, while the converse was true for subject FL. Note, however, that it remains problematic to directly compare classification accuracy between different areas of the brain. For example, sampling from voxels in area V1, which is estimated to be about three times larger than MT, may in and of itself lead to a higher signal-to-noise ratio at the area level (Serences and Boynton, 2007). Note further that there is a disparity between these results and the earlier decoding analysis in terms of the selected voxels and their associated performance.
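As a point of reference for the linear dynamical system approach used for orientation decoding, the following is a minimal sketch — not the authors' implementation — of Kalman filtering followed by Rauch–Tung–Striebel smoothing. It assumes the model parameters A, C, Q and R have already been estimated (e.g., by regressing voxel time series on the sine and cosine of the grating angle):

```python
import numpy as np

def kalman_smoother(y, A, C, Q, R, mu0, V0):
    """Rauch-Tung-Striebel smoother for the linear dynamical system
    x_t = A x_{t-1} + w_t,  y_t = C x_t + v_t,  w ~ N(0, Q), v ~ N(0, R).

    y : (T, d_obs) observations, e.g., voxel time series.
    Returns the smoothed state means, shape (T, d_state)."""
    T, d = len(y), len(mu0)
    mu_f, V_f = np.zeros((T, d)), np.zeros((T, d, d))  # filtered moments
    mu_p, V_p = np.zeros((T, d)), np.zeros((T, d, d))  # predicted moments
    m, V = mu0, V0
    for t in range(T):
        if t > 0:                                      # predict step
            m, V = A @ m, A @ V @ A.T + Q
        mu_p[t], V_p[t] = m, V
        K = V @ C.T @ np.linalg.inv(C @ V @ C.T + R)   # Kalman gain
        m = m + K @ (y[t] - C @ m)                     # measurement update
        V = V - K @ C @ V
        mu_f[t], V_f[t] = m, V
    mu_s, V_s = mu_f.copy(), V_f.copy()                # backward (RTS) pass
    for t in range(T - 2, -1, -1):
        J = V_f[t] @ A.T @ np.linalg.inv(V_p[t + 1])
        mu_s[t] = mu_f[t] + J @ (mu_s[t + 1] - mu_p[t + 1])
        V_s[t] = V_f[t] + J @ (V_s[t + 1] - V_p[t + 1]) @ J.T
    return mu_s
```

With the hidden state chosen as the sine and cosine of the grating angle, the decoded orientation at each timepoint would be recovered as `np.arctan2(mu_s[t, 0], mu_s[t, 1])`.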
This could indicate that the employed correlation-based feature ranking criterion may be suboptimal for selecting voxels that respond well to rotation direction.

For subject FL, additional information with respect to orientation was present in regions anterior to the retinotopically defined early visual cortical areas. While these areas are likely to be tuned to more complex stimulus properties than orientation, they may still show a response bias that can be picked up by the decoding algorithm. Retinotopic maps are found throughout the dorsal visual stream (Silver and Kastner, 2009), and these maps are known to exhibit unequal sampling of different orientations in particular parts of visual space (Sasaki et al., 2006).

In this paper, we have used linear dynamical systems and hidden Markov models to demonstrate that dynamic decoding of ongoing mental state from BOLD data is feasible. It should be noted, however, that the rate of stimulus change used in this paper was quite slow. A weak positive correlation between decoding error and angular velocity was observed, but it remains an open question how decoding performance is affected by more rapid stimulus changes in the presence of autocorrelations. More sophisticated methods for time-series analysis could further improve decoding performance. One way to improve results is to use more complex generative models that provide a more natural representation of the phenomena we are trying to model. This can be achieved by relaxing the Gaussianity assumptions, by introducing additional hidden variables, or by making other independence assumptions (Barber and Cemgil, 2010). Another way to improve results is to use a more principled approach to voxel selection, for example by means of dimensionality reduction techniques (Schölkopf et al., 1998) or by embedding the selection process directly within the employed inference algorithms, e.g., using automatic relevance determination (Beal, 2003).
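The correlation-based ranking criterion referred to above can be illustrated with a small sketch; the function name and the exact form of the criterion are assumptions, since the implementation is not spelled out here:

```python
import numpy as np

def rank_voxels_by_correlation(bold, target, n_keep):
    """Rank voxels by absolute Pearson correlation with a target series.

    bold   : (T, V) voxel time series
    target : (T,) regressor, e.g., sine or cosine of the grating angle
    Returns indices of the n_keep highest-ranked voxels."""
    b = bold - bold.mean(axis=0)                # center each voxel
    t = target - target.mean()                  # center the regressor
    r = (b * t[:, None]).sum(axis=0) / (
        np.sqrt((b ** 2).sum(axis=0)) * np.sqrt((t ** 2).sum()) + 1e-12)
    return np.argsort(-np.abs(r))[:n_keep]      # strongest first
```

In practice such a ranking would be computed on training folds only, to avoid the circular-analysis problem described by Kriegeskorte et al. (2009).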
Finally, results may be improved by including a forward model of the hemodynamic response (Makni et al., 2008).

Dynamic decoding has several potential benefits compared with static decoding since it gives better insight into the temporal dynamics of ongoing mental state. The possibility to track mental state over time becomes especially important with the advent of real-time fMRI (DeCharms, 2008), where BOLD data are analyzed online, which has applications in brain–computer interfacing and brain-state-dependent stimulation. The aim of this paper is to encourage the use of time-series analysis methods for the decoding of ongoing mental state. Our results on the decoding of grating orientation and rotation direction from the BOLD response show that such methods are warranted whenever decoding is applied within a paradigm that is characterized by intrinsic temporal dynamics.
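To make the advocated time-series machinery concrete, here is a minimal, generic sketch — not the authors' code — of Viterbi decoding for a two-state (left/right) rotation-direction HMM. The per-timepoint log-likelihoods are assumed to come from some class-conditional model of the voxel data, e.g., a Gaussian model fit per direction:

```python
import numpy as np

def viterbi(log_lik, log_trans, log_prior):
    """Most probable state path of an HMM, computed in the log domain.

    log_lik   : (T, K) per-timepoint log-likelihoods under each state
    log_trans : (K, K) log transition matrix, rows index the from-state
    log_prior : (K,)  log initial state probabilities"""
    T, K = log_lik.shape
    delta = np.zeros((T, K))            # best path score ending in each state
    psi = np.zeros((T, K), dtype=int)   # backpointers
    delta[0] = log_prior + log_lik[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_trans   # (from, to)
        psi[t] = np.argmax(scores, axis=0)
        delta[t] = scores[psi[t], np.arange(K)] + log_lik[t]
    path = np.zeros(T, dtype=int)
    path[-1] = np.argmax(delta[-1])
    for t in range(T - 2, -1, -1):      # trace back the best path
        path[t] = psi[t + 1][path[t + 1]]
    return path
```

A "sticky" transition matrix with high self-transition probability is what encodes the temporal smoothness that a static, timepoint-by-timepoint classifier ignores.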
Acknowledgments
The authors gratefully acknowledge the support of the Netherlands Organization for Scientific Research (NWO; Veni grant 451.09.001 and Vici grant 639.023.604) and the BrainGain Smart Mix Programme of the Netherlands Ministry of Economic Affairs and the Netherlands Ministry of Education, Culture and Science. We thank Paul Gaalman for his assistance during fMRI data acquisition.
References
Barber, D., Cemgil, A.T., 2010. Graphical models for time series. IEEE Signal Process. Mag. 27 (6), 18–28.
Beal, M., 2003. Variational algorithms for approximate Bayesian inference. Ph.D. thesis, University College London.
Brainard, D.H., 1997. The psychophysics toolbox. Spat. Vis. 10, 433–436.
Brouwer, G.J., van Ee, R., 2007. Visual cortex allows prediction of perceptual states during ambiguous structure-from-motion. J. Neurosci. 27 (5), 1015–1023.
Chadwick, M.J., Hassabis, D., Weiskopf, N., Maguire, E.A., 2010. Decoding individual episodic memory traces in the human hippocampus. Curr. Biol. 20 (6), 544–547.
DeCharms, R.C., 2008. Applications of real-time fMRI. Nat. Rev. Neurosci. 9, 720–729.
Dempster, A.P., Laird, N.M., Rubin, D.B., 1977. Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. B 39 (1), 1–38.
DeYoe, E.A., Carman, G.J., Bandettini, P., Glickman, S., Wieser, J., Cox, R., Miller, D., Neitz, J., 1996. Mapping striate and extrastriate visual areas in human cerebral cortex. Proc. Natl. Acad. Sci. U.S.A. 93 (6), 2382–2386.
Engel, S.A., Glover, G.H., Wandell, B.A., 1997. Retinotopic organization in human visual cortex and the spatial precision of functional MRI. Cereb. Cortex 7 (2), 181–192.
Fisher, N.I., 1996. Statistical analysis of circular data. Cambridge University Press, Cambridge.
Formisano, E., De Martino, F., Bonte, M., Goebel, R., 2008. "Who" is saying "what"? Brain-based decoding of human voice and speech. Science 322 (5903), 970–973.
Fuentemilla, L., Penny, W.D., Cashdollar, N., Bunzeck, N., Düzel, E., 2010. Theta-coupled periodic replay in working memory. Curr. Biol. 20, 1–7.
Furmanski, C.S., Engel, S.A., 2000. An oblique effect in human primary visual cortex. Nat. Neurosci. 3 (6), 535–536.
Hassabis, D., Chu, C., Rees, G., Weiskopf, N., Molyneux, P.D., Maguire, E.A., 2009. Decoding neuronal ensembles in the human hippocampus. Curr. Biol. 19, 546–554.
Hasson, U., Nir, Y., Levy, I., Fuhrmann, G., Malach, R., 2004. Intersubject synchronization of cortical activity during natural vision. Science 303 (5664), 1634–1640.
Haxby, J., Gobbini, M., Furey, M., Ishai, A., Schouten, J., Pietrini, P., 2001. Distributed and overlapping representations of faces and objects in ventral temporal cortex. Science 293, 2425–2430.
Haynes, J.-D., 2009. Decoding visual consciousness from human brain signals. Trends Cogn. Sci. 13 (5), 194–202.
Haynes, J.-D., Rees, G., 2005a. Predicting the orientation of invisible stimuli from activity in human primary visual cortex. Nat. Neurosci. 8 (5), 686–691.
Haynes, J.-D., Rees, G., 2005b. Predicting the stream of consciousness from activity in human visual cortex. Curr. Biol. 15 (14), 1301–1307.
Haynes, J.-D., Rees, G., 2006. Decoding mental states from brain activity in humans. Nat. Rev. Neurosci. 7, 523–534.
Kamitani, Y., Tong, F., 2005. Decoding the visual and subjective contents of the human brain. Nat. Neurosci. 8 (5), 679–685.
Kamitani, Y., Tong, F., 2006. Decoding seen and attended motion directions from activity in the human visual cortex. Curr. Biol. 16 (11), 1096–1102.
Kay, K., Naselaris, T., Prenger, R.J., Gallant, J.L., 2008. Identifying natural images from human brain activity. Nature 452, 352–355.
Kriegeskorte, N., Simmons, W.K., Bellgowan, P.S.F., Baker, C.I., 2009. Circular analysis in systems neuroscience: the dangers of double dipping. Nat. Neurosci. 12 (5), 535–540.
Kriegeskorte, N., Cusack, R., Bandettini, P., 2010. How does an fMRI voxel sample the neuronal activity pattern: compact-kernel or complex spatiotemporal filter? NeuroImage 49 (3), 1965–1976.
Levine, M.W., McAnany, J.J., 2005. The relative capabilities of the upper and lower visual hemifields. Vis. Res. 45 (21), 2820–2830.
Makni, S., Beckmann, C., Smith, S., Woolrich, M., 2008. Bayesian deconvolution of fMRI data using bilinear dynamical systems. NeuroImage 42 (4), 1381–1396.
Mannion, D.J., McDonald, J.S., Clifford, C.W.G., 2010. Orientation anisotropies in human visual cortex. J. Neurophysiol. 103 (6), 3465–3471.
Mardia, K.V., Jupp, P.E., 1999. Directional statistics, 2nd ed. Wiley, New York.
Miyawaki, Y., Uchida, H., Yamashita, O., Sato, M., Morito, Y., Tanabe, H.C., Sadato, N., Kamitani, Y., 2008. Visual image reconstruction from human brain activity using a combination of multiscale local image decoders. Neuron 60 (5), 915–929.
Naselaris, T., Prenger, R.J., Kay, K.N., Oliver, M., Gallant, J.L., 2009. Bayesian reconstruction of natural images from human brain activity. Neuron 63 (6), 902–915.
Norman, K.A., Polyn, S.M., Detre, G.J., Haxby, J.V., 2006. Beyond mind-reading: multi-voxel pattern analysis of fMRI data. Trends Cogn. Sci. 10 (9), 424–430.
Op de Beeck, H.P., 2010. Against hyperacuity in brain reading: spatial smoothing does not hurt multivariate fMRI analyses? NeuroImage 49 (3), 1943–1948.
Reddy, L., Tsuchiya, N., Serre, T., 2010. Reading the mind's eye: decoding category information during mental imagery. NeuroImage 50 (2), 818–825.
Roweis, S.T., Ghahramani, Z., 1999. A unifying review of linear Gaussian models. Neural Comput. 11 (2), 305–345.
Sasaki, Y., Rajimehr, R., Kim, B.W., Ekstrom, L.B., Vanduffel, W., Tootell, R.B.H., 2006. The radial bias: a different slant on visual orientation sensitivity in human and nonhuman primates. Neuron 51 (5), 661–670.
Schölkopf, B., Smola, A.J., Müller, K.-R., 1998. Nonlinear component analysis as a kernel eigenvalue problem. Neural Comput. 10, 1299–1319.
Serences, J.T., Boynton, G.M., 2007. Feature-based attentional modulations in the absence of direct visual stimulation. Neuron 55 (2), 301–312.
Sereno, M.I., Dale, A.M., Reppas, J.B., Kwong, K.K., Belliveau, J.W., Brady, T.J., Rosen, B.R., Tootell, R.B., 1995. Borders of multiple visual areas in humans revealed by functional magnetic resonance imaging. Science 268 (5212), 889–893.
Shmuel, A., Chaimow, D., Raddatz, G., Ugurbil, K., Yacoub, E., 2010. Mechanisms underlying decoding at 7 T: ocular dominance columns, broad structures, and macroscopic blood vessels in V1 convey information on the stimulated eye. NeuroImage 49 (3), 1957–1964.
Silver, M.A., Kastner, S., 2009. Topographic maps in human frontal and parietal cortex. Trends Cogn. Sci. 13 (11), 488–495.
Swisher, J.D., Gatenby, J.C., Gore, J.C., Wolfe, B.A., Moon, C.-H., Kim, S.-G., Tong, F., 2010. Multiscale pattern analysis of orientation-selective activity in the primary visual cortex. J. Neurosci. 30 (1), 325–330.
Thirion, B., Duchesnay, E., Hubbard, E., Dubois, J., Poline, J., Lebihan, D., Dehaene, S., 2006. Inverse retinotopy: inferring the visual content of images from brain activation patterns. NeuroImage 33 (4), 1104–1116.
Vanduffel, W., Tootell, R.B.H., Schoups, A.A., Orban, G.A., 2002. The organization of orientation selectivity throughout macaque visual cortex. Cereb. Cortex 12 (6), 647–662.
Winawer, J., Horiguchi, H., Sayres, R.A., Amano, K., Wandell, B.A., 2010. Mapping hV4 and ventral occipital cortex: the venous eclipse. J. Vis. 10 (5), 1–22.