Tuning properties of the auditory frequency-shift detectors

Laurent Demany a) Laboratoire Mouvement, Adaptation, Cognition (UMR CNRS 5227), Université de Bordeaux, BP 63, 146 Rue Leo Saignat, F-33076 Bordeaux, France

Daniel Pressnitzer Département d’Etudes Cognitives, Laboratoire Psychologie de la Perception (UMR CNRS 8158), Université Paris-Descartes and Ecole Normale Supérieure, 29 Rue d’Ulm, F-75230 Paris Cedex 05, France

Catherine Semal Laboratoire Mouvement, Adaptation, Cognition (UMR CNRS 5227), Université de Bordeaux, BP 63, 146 Rue Leo Saignat, F-33076 Bordeaux, France

(Received 11 December 2008; revised 16 April 2009; accepted 23 June 2009)

Demany and Ramos [(2005). J. Acoust. Soc. Am. 117, 833–841] found that it is possible to hear an upward or downward pitch change between two successive pure tones differing in frequency even when the first tone is informationally masked by other tones, preventing a conscious perception of its pitch. This provides evidence for the existence of automatic frequency-shift detectors (FSDs) in the auditory system. The present study was intended to estimate the magnitude of the frequency shifts optimally detected by the FSDs. Listeners were presented with sound sequences consisting of (1) a 300-ms or 100-ms random “chord” of synchronous pure tones, separated by constant intervals of either 650 cents or 1000 cents; (2) an interstimulus interval (ISI) varying from 100 to 900 ms; (3) a single pure tone at a variable frequency distance (Δ) from a randomly selected component of the chord. The task was to indicate if the final pure tone was higher or lower than the nearest component of the chord. Irrespective of the chord’s properties and of the ISI, performance was best when Δ was equal to about 120 cents (1/10 octave). Therefore, this interval seems to be the frequency shift optimally detected by the FSDs.

© 2009 Acoustical Society of America. [DOI: 10.1121/1.3179675]

PACS number(s): 43.66.Mk, 43.66.Hg [BCM]

I. INTRODUCTION

Auditory scene analysis has two facets: a segregation facet and an integration facet (Bregman, 1990). This is observable, for instance, when the scene is a rapid sequence of pure tones varying in frequency. Instead of being identified as the products of a single acoustic source, the elements of such a scene may be perceived as products of two or more concurrently active sources. This reveals the segregation facet of auditory scene analysis. On the other hand, the tones will also be perceptually linked to each other along the time dimension, into a single stream when no segregation occurs or into several streams otherwise. As a result of this sequential integration, the listener will perceive melodic patterns rather than independent sound events. The mechanisms of auditory scene analysis remain largely unknown but are currently the topic of active research (Moore and Gockel, 2002; Snyder and Alain, 2007; Micheyl et al., 2007a; Micheyl et al., 2007b; Carlyon, 2004; Pressnitzer et al., 2008; Elhilali et al., 2009). For a scene such as the one considered above, it is possible that segregation and integration are governed by different neural processes. However, another possibility is that they have a common basis. van Noorden (1975) proposed a specific hypothesis in line with the latter idea. He suggested that the

a) Author to whom correspondence should be addressed. Electronic mail: [email protected]


J. Acoust. Soc. Am. 126 (3), September 2009

Pages: 1342–1348

auditory system contains automatic “pitch motion detectors” working much like the spatial motion detectors of the visual system. Visual motion percepts can be elicited by discontinuous as well as continuous spatial shifts (Ekroll et al., 2008). The core function of the neural machinery underlying visual motion perception is to bind successive stimuli and to give us an ability to identify them as one and the same physical object (Ullman, 1978; Shepard, 1984). In audition, similarly, pitch motion detectors might serve to bind successive tones into higher-order auditory entities subjectively emanating from a single acoustic source, even though each tone maintains its perceptual identity. Assuming that the strength of the bonds created by the detectors depends on the frequency and temporal relations of the tones, the detectors might participate in scene analysis as both integration tools and segregation tools.

Some evidence of the existence of pitch motion detectors was found in an experiment by Okada and Kashino (2004), where listeners had to judge as ascending or descending the direction of a frequency glide preceded by a repeating pair of discrete tones forming an ascending or descending melodic interval. The results showed that judgments of the glide direction were influenced by the direction of the previous melodic interval. This effect was consistent with the idea that the initial discrete tones adapted neural pitch motion detectors responding to both continuous and discrete frequency changes. Other experiments have shown that subjective judgments on the temporal order of tones can be influenced by previous stimuli consisting of glides (Okada and Kashino, 2003), and that streaming judgments on tone sequences can be influenced by previous sequences (Snyder et al., 2008, 2009). The latter two findings can also be construed as reflecting the adaptation of pitch motion detectors. However, a subjective judgment was used in all of these studies, so it is possible that the adapting stimulus affected the listener’s decision criterion (Wakefield and Viemeister, 1984; Okada and Kashino, 2003).

Demany and Ramos (2005) reported objective psychophysical observations that seem to provide stronger evidence for the existence of automatic pitch motion detectors. They constructed sound sequences in which a random “chord” of five synchronous pure tones, separated by frequency intervals of at least 0.5 octave, was followed after a 500-ms delay by a single pure tone (T). Because the components of the chord were synchronous, they were very difficult to hear out individually. This was objectively verified in an experimental condition where, on each trial, T could be either identical to one component of the chord (selected at random) or positioned halfway in frequency between two components. The task was to indicate if T was present in the chord or absent from it. Performance in this “present/absent” task was quite poor. In another condition, however, T was positioned one semitone above or below (equiprobably) one of the chord’s components (selected at random), and the task was to indicate if T was higher or lower in pitch than the closest chord component. Surprisingly, this “up/down” task was performed much better than the present/absent task. The up/down task was relatively easy because, when T was relatively close in frequency to one component of the chord, the sequence formed by this component and T generally elicited a clear percept of directional pitch change, even though the chord’s component was generally not consciously perceived.
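The trial structure just described can be sketched in a few lines. This is only an illustration under stated assumptions: the text specifies five components spaced by at least 0.5 octave (600 cents) and a one-semitone (100-cent) shift, but the upper bound on the spacing, the range of the lowest component, and all function names below are hypothetical choices made for the example.

```python
import math
import random

SEMITONE_CENTS = 100.0  # 1 semitone = 100 cents = 1/12 octave


def shift_by_cents(freq_hz, cents):
    """Shift a frequency by a signed interval in cents (1200 cents = 1 octave)."""
    return freq_hz * 2.0 ** (cents / 1200.0)


def cents_between(f_from, f_to):
    """Signed interval from f_from to f_to, in cents."""
    return 1200.0 * math.log2(f_to / f_from)


def random_chord(rng, n_tones=5, min_gap=600.0, max_gap=1000.0,
                 lowest_range_hz=(100.0, 300.0)):
    """Random chord with gaps of at least min_gap cents.

    max_gap and lowest_range_hz are assumptions, not values from the text.
    """
    log_lo, log_hi = (math.log2(f) for f in lowest_range_hz)
    freqs = [2.0 ** rng.uniform(log_lo, log_hi)]
    for _ in range(n_tones - 1):
        freqs.append(shift_by_cents(freqs[-1], rng.uniform(min_gap, max_gap)))
    return freqs


def make_trial(rng, task):
    """Return (chord, T frequency, correct answer) for one trial."""
    chord = random_chord(rng)
    if task == "up/down":
        component = rng.choice(chord)
        direction = rng.choice([+1, -1])  # equiprobable up or down
        target = shift_by_cents(component, direction * SEMITONE_CENTS)
        answer = "up" if direction > 0 else "down"
    elif task == "present/absent":
        if rng.random() < 0.5:
            target, answer = rng.choice(chord), "present"
        else:
            i = rng.randrange(len(chord) - 1)
            # Halfway between two adjacent components on a log-frequency scale.
            target = math.sqrt(chord[i] * chord[i + 1])
            answer = "absent"
    else:
        raise ValueError("unknown task: " + task)
    return chord, target, answer
```

Note that "halfway" is taken here on a log-frequency scale (the geometric mean), which matches the later description of the present/absent condition in experiment 1.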
The fact that it is possible to hear a pitch change between two tones without consciously hearing one of them strongly suggests that the auditory system does contain the automatic pitch motion detectors invoked by van Noorden (1975). Demany and Ramos (2005, 2007) preferred to denominate these neural entities as “frequency-shift detectors” (FSDs).

In order to account for their findings, Demany and Ramos (2005) proposed a simple qualitative model, similar to a recent model of visual motion perception [see Ditterich et al. (2003); see also Allik et al. (1989) for related ideas in the auditory domain]. The model first assumes the existence, in the auditory system, of two subsets of FSDs respectively tuned to upward and downward frequency shifts. A second assumption is that, within each subset, the FSDs respond most strongly to small frequency shifts (of the same magnitude for the two subsets). The third and final assumption is that the consciously available information about the FSDs’ activity is only the difference between the response strengths of the two subsets. This model correctly predicted that the up/down task should be easy because up and down trials were expected to activate the two subsets of FSDs in opposite ways. The model also predicted correctly that the present/absent task should be difficult because on both present and absent trials the two subsets of FSDs were expected to be activated with approximately the same strength.
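The three assumptions of this qualitative model can be illustrated with a toy computation. The sketch below is not the authors' implementation: the tuning-curve shape, its peak position (`PEAK_SHIFT_CENTS`), the exponent (`SHAPE`), the regular 650-cent chord spacing, and the function names are all hypothetical choices made only to show why the opponent signal separates up from down trials but not present from absent trials.

```python
import math

PEAK_SHIFT_CENTS = 120.0  # hypothetical peak of the tuning curve
SHAPE = 2.0               # hypothetical sharpness of the tuning curve


def channel_response(shift_cents):
    """Response of one FSD subset to a shift in its preferred direction.

    Zero for shifts <= 0, peak value 1 at PEAK_SHIFT_CENTS, decaying beyond it.
    """
    if shift_cents <= 0.0:
        return 0.0
    x = shift_cents / PEAK_SHIFT_CENTS
    return (x * math.exp(1.0 - x)) ** SHAPE


def opponent_signal(chord_cents, target_cents):
    """Difference between the summed 'up' and 'down' subset responses.

    Every chord component is paired with T; only this difference is assumed
    to be available to the listener.
    """
    up = sum(channel_response(target_cents - c) for c in chord_cents)
    down = sum(channel_response(c - target_cents) for c in chord_cents)
    return up - down


# Five components on a cents axis (regular spacing is an assumption).
CHORD = [0.0, 650.0, 1300.0, 1950.0, 2600.0]
CENTER = CHORD[2]

up_trial = opponent_signal(CHORD, CENTER + 100.0)       # T one semitone above
down_trial = opponent_signal(CHORD, CENTER - 100.0)     # T one semitone below
present_trial = opponent_signal(CHORD, CENTER)          # T equals a component
absent_trial = opponent_signal(CHORD, CENTER + 325.0)   # T halfway between two
```

With these assumed parameters, `up_trial` and `down_trial` have opposite signs and large magnitudes, whereas `present_trial` and `absent_trial` are both close to zero, so the opponent signal alone cannot tell them apart.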

Finally, the model made sense of results obtained in a third condition, named “present/close,” where the T tone could be either identical to one of the chord’s components or one semitone away from one component (in either direction). The present/close task, in which listeners had to discriminate between these two types of trials, was found to be harder than the up/down task but easier than the present/absent task.

The model that we just described is only qualitative. It needs to be made more quantitative. One of its assumptions is that the FSDs respond more strongly to small frequency shifts than to large ones. But by definition, an FSD is not expected to respond strongly to a sequence of tones with identical frequencies. Thus, response strength must be maximal for frequency shifts with a certain magnitude (possibly depending on temporal parameters of the sound sequence). What is this optimal magnitude? We endeavored to estimate it in the two experiments reported here.

II. EXPERIMENT 1

A. Rationale

In this experiment, the up/down task described above was performed again using chords made up of six synchronous pure tones equally spaced on a log-frequency scale. We manipulated two independent variables. One of them was the magnitude of the frequency interval separating the components of the chord presented on a given trial; this interval (I) was equal to either 650 cents (1 cent = 1/100 semitone = 1/1200 octave) or 1000 cents. The second independent variable was the magnitude of the frequency interval Δ separating T (the pure tone following the chord) from the closest component of the chord; Δ was varied from 50 to 250 cents for I = 650 cents and from 50 to 300 cents for I = 1000 cents. Note that for each value of I, the maximum value of Δ was well below I/2, so that the task was always objectively unambiguous: Even for the maximum Δ, a listener who would be able to hear out individually the components of the chord was not expected to make errors due to an incorrect identification of the chord component closest to T. For such a listener, performance should either increase monotonically with Δ or increase up to some Δ value and then stay on a plateau. By contrast, for a listener unable to hear out individually the chord components but possessing automatic FSDs working as specified by the model defined above, performance could be maximal for some Δ value and worse for both smaller and larger Δ values. Logically, the value of Δ maximizing performance (Δopt) should correspond to the Δ value for which up trials and down trials elicit maximally different responses of the two hypothetical subsets of FSDs.

B. Procedure

Each tone had a total duration of 300 ms, including 5-ms raised-cosine amplitude ramps, and a nominal sound pressure level of 65 dB. The chord of six tones presented on each trial was randomly positioned between 125 and 4000 Hz (using a logarithmic frequency scale). It was followed by the T tone after a 500-ms interstimulus interval (ISI). As in the study of Demany and Ramos (2005), the chord was presented 600 ms after a random melody produced at the beginning of the trial. This random melody, serving both as a warning signal and as a pitch eraser [see Demany and Ramos (2005), footnote 1], consisted here of five immediately consecutive tones, with frequencies randomly selected between 125 and 4000 Hz.

There were ten experimental conditions. In nine of them, the listener had to perform the up/down task. On each trial, the T tone was positioned Δ cents above or below (at random) one of the four “inner” components of the chord (a random choice was made among these four components). In five conditions, I was equal to 650 cents and Δ was equal to 50, 100, 150, 200, and 250 cents. In four other conditions, I was equal to 1000 cents and Δ was equal to 50, 100, 200, and 300 cents. In all nine conditions, the listener simply had to vote for up if the frequency shift Δ was positive, and for down if it was negative. In the tenth and last condition, the task to be performed was a present/absent task: The T tone could be, at random, either identical to one of the four inner components of the chord or halfway in (log) frequency between two adjacent components; one had to vote for present in the former case, and for absent in the latter case.

In each experimental session, ten blocks of 50 trials were run: one block in each of the ten conditions. These ten blocks of trials were randomly ordered. Eight sessions were run for each listener, so that overall 400 trials were performed for each condition and listener.

The stimuli were generated at a sampling rate of 44.1 kHz using a 24-bit sound card (Echo Gina). They were presented binaurally, via electrostatic headphones (Stax SR-007) in a double-walled soundproof booth (Gisol, Bordeaux). Listeners gave their responses by means of mouse clicks on two virtual buttons. Responses were not followed by immediate visual feedback, but listeners were allowed to look at their results following each block of trials.

Seven listeners with normal hearing (five men, two women) were tested individually. This group included six students in their twenties and the first author (54 years). Most of these listeners were amateur musicians. Two of them had previously been tested in closely related experiments. The other five listeners had no previous experience with psychoacoustics. They were initially familiarized with the tasks during one or two practice sessions, using at first chords consisting of only three pure tones, very widely spaced in frequency.

FIG. 1. (a) Results of experiment 1 for I = 650 cents. Each listener is represented by a specific symbol shape. Open symbols show how d′ varied as a function of Δ, the magnitude of the frequency shift to be judged as ascending or descending in the up/down task. Thick segments connect the mean values of d′ in that task. Filled symbols represent the data obtained in the present/absent task. (b) Same as (a), but for I = 1000 cents. The present/absent task was not performed with this value of I. (c) Circles represent the same data as those plotted with thick segments in (a), plus the expected data point for Δ = 0 (d′ = 0). The continuous curve is the best-fitting adjustment of Eq. (1) to the data. The Δ value corresponding to the maximum of this function is indicated by a broken arrow. (d) Same as (c), but for the data plotted in (b).

C. Results and discussion

Performance was measured in terms of d′ (Green and Swets, 1974). The results are displayed in Fig. 1(a) for I = 650 cents and in Fig. 1(b) for I = 1000 cents. Each listener is represented in the two panels by a specific symbol shape. The filled symbols in panel (a) represent the results obtained in the present/absent task. It can be seen that performance in this task was generally quite poor; d′ exceeded 1 for only one of the seven listeners, and the average value of d′ was 0.59. This poor performance indicates that the components of the chords were very difficult to perceive individually, at least for I = 650 cents. For I = 1000 cents, listeners informally reported that the chords’ components were not noticeably easier to hear out.

Performance was generally much better in the up/down task, as indicated by the open symbols in panels (a) and (b). For this task, the grand mean of d′ was 1.92 when I was 650 cents and 1.97 when I was 1000 cents. Thus, I had essentially no effect on overall performance. For each value of I, however, performance was markedly dependent on Δ, as confirmed by a repeated-measures analysis of variance [for I = 650 cents: F(4, 24) = 19.8, P < 10⁻⁶; for I = 1000 cents: F(3, 18) = 7.8, P = 0.0015]. It can be seen that in most cases, as Δ increased from 50 cents to its maximal value, d′ initially increased and then decreased. There were only two exceptions to this rule: For one listener, when I was 1000 cents, d′ monotonically (but very slowly) decreased as Δ increased; for the other listener, when I was 1000 cents, d′ was slightly larger for Δ = 300 cents than for Δ = 200 cents.

TABLE I. Estimates of Δopt (in cents) derived from the results of experiment 1. The table also indicates the associated values of r².

               I = 650 cents      I = 1000 cents
               Δopt     r²        Δopt     r²
S1             102      0.968     72       0.958
S2             127      0.985     128      0.974
S3             127      0.970     54       0.999
S4             114      0.987     108      0.955
S5             119      0.997     127      0.993
S6             147      0.956     149      0.968
S7             91       0.997     151      0.955
Means of d′    117      0.990     120      0.990

In order to estimate precisely, for the two types of chord, the value of Δ maximizing d′ in the up/down task (i.e., Δopt), we fitted continuous curves to the individual and mean data, taking into account the fact that d′ should be equal to 0 for Δ = 0. It was found that very good fits could be achieved with the scaled gamma distribution function,

$$d' = a\,\Delta^{b}\exp(-c\Delta). \qquad (1)$$
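For positive a, b, and c, the function in Eq. (1) rises from zero, reaches a single maximum, and then decays, so the location of the maximum follows directly from the fitted parameters. A short worked derivation, under that positivity assumption:

```latex
\begin{align*}
\frac{\mathrm{d}}{\mathrm{d}\Delta}\left[a\,\Delta^{b}e^{-c\Delta}\right]
  &= a\,\Delta^{b-1}e^{-c\Delta}\,(b - c\Delta) = 0
  \quad\Longrightarrow\quad
  \Delta_{\mathrm{opt}} = \frac{b}{c},\\
d'_{\max} &= a\left(\frac{b}{c}\right)^{b} e^{-b}.
\end{align*}
```

Only the ratio b/c determines where the peak lies; the parameter a scales the height of the curve without moving its maximum.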

The three parameters of this function, a, b, and c, were adjusted iteratively using the NCSS statistical software. Table I indicates, for each listener as well as for the mean data [plotted with thick lines in Figs. 1(a) and 1(b)], the resulting estimates of Δopt, as well as the associated r² statistics reflecting the proportion of variance accounted for by the best-fitting functions. In Figs. 1(c) and 1(d), the mean data are replotted as circles, and the functions fitted to them are displayed.
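The fitting step can be sketched as follows. This is not the procedure used in the study, which relied on iterative adjustment in NCSS. The sketch swaps in a simpler technique: taking logarithms of Eq. (1) gives ln d′ = ln a + b ln Δ − cΔ, which is linear in (ln a, b, c) and can be solved by ordinary least squares. That shortcut cannot use the Δ = 0 anchor point and weights the data differently from a direct nonlinear fit. The data below are synthetic, generated from hypothetical parameter values, and are not the measured d′ values.

```python
import math


def gamma_curve(delta, a, b, c):
    """Eq. (1): d' = a * delta**b * exp(-c * delta)."""
    return a * delta ** b * math.exp(-c * delta)


def solve_3x3(matrix, vector):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    n = 3
    aug = [row[:] + [rhs] for row, rhs in zip(matrix, vector)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fit_log_linear(deltas, dprimes):
    """Least-squares fit of ln d' = ln a + b ln(delta) - c delta.

    Requires delta > 0 and d' > 0 for every data point.
    """
    rows = [(1.0, math.log(d), -d) for d in deltas]
    ys = [math.log(y) for y in dprimes]
    normal = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    rhs = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    ln_a, b, c = solve_3x3(normal, rhs)
    return math.exp(ln_a), b, c


# Synthetic, noise-free data from hypothetical parameters (peak at b/c = 120).
TRUE_A, TRUE_B, TRUE_C = 0.02, 1.2, 0.01
deltas = [50.0, 100.0, 150.0, 200.0, 250.0]
dprimes = [gamma_curve(d, TRUE_A, TRUE_B, TRUE_C) for d in deltas]

a, b, c = fit_log_linear(deltas, dprimes)
delta_opt = b / c  # location of the maximum of Eq. (1)
```

On real, noisy data a direct nonlinear least-squares fit of Eq. (1), including the point d′ = 0 at Δ = 0, would be closer to what the text describes; the log-linear version is shown only because it needs no external library.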

It can be seen that, on the basis of the mean data, Δopt was estimated at 117 cents for I = 650 cents and 120 cents for I = 1000 cents. These two figures are almost identical. They are also close to the figures obtained by averaging the individual estimates of Δopt across listeners; the corresponding means are 118 cents for I = 650 cents and 113 cents for I = 1000 cents. Note that there was no correlation between the individual estimates of Δopt for the two values of I (r = 0.07). The variability of these individual estimates for a given value of I may arise mainly from the limited accuracy of our performance measurements and the rather coarse sampling of Δ (especially for I = 1000 cents) rather than from genuine differences between listeners.

For I = 1000 cents, Δopt was found to be smaller than I/8. Thus, it would be very unreasonable to ascribe the decrease in performance for Δ beyond Δopt to errors in the identification of the chord component closest to T. This would, in addition, presuppose unrealistically that the components of the chords could be perceived individually. The decrease in performance beyond Δopt must originate mainly from properties of the auditory system rather than merely from the chords’ characteristics. Such a view is supported by the fact that Δopt did not appear to change when I changed from 650 to 1000 cents. Our interpretation of the results is that the auditory system contains automatic FSDs activated optimally by frequency shifts of about 120 cents, at least when the temporal parameters of the sound sequence are those used in this experiment.

III. EXPERIMENT 2

A. Rationale

Experiment 1 suggests that the FSDs are optimally activated by frequency shifts of about 120 cents, but is this optimal magnitude a constant or does it depend on the temporal characteristics of the sound sequence containing the frequency shift? We carried out experiment 2 to answer that question. This second experiment was a variant of experiment 1 in which we examined again the effect of Δ on performance in the up/down task, but using now shorter tones and a variable ISI between the chord and T.

B. Procedure

The stimuli were the same as those used in the up/down condition of experiment 1, except for the following differences: (1) each tone now had a total duration of 100 ms; (2) I was fixed at 1000 cents; (3) the ISI separating the chord from T could be equal to 100, 250, or 900 ms, thus producing stimulus-onset asynchronies (SOAs) of 200, 350, and 1000 ms; (4) the components of the random melody preceding the chord were always separated by 250-ms ISIs; (5) Δ could be equal to 50, 100, 150, 200, 250, or 300 cents.

In a first stage of the experiment, the ISI separating the chord from T took two possible values: 250 and 900 ms. Each session, in this stage, consisted of 12 blocks of 40 trials, one block for each combination of ISI and Δ; these 12 blocks were randomly ordered, and ten sessions were run for each listener. Then, in the second stage, the ISI was fixed at 100 ms, and each session consisted of six randomly ordered blocks of 40 trials, one block for each Δ; ten sessions were again run for each listener (but sessions were often grouped in pairs, run on the same day). Overall, therefore, each listener performed 400 trials in each of the 18 subconditions (three ISIs × six Δ values).

Four male listeners, including the first author, were tested. All of them had previously taken part in experiment 1.

FIG. 2. Individual (open symbols) and average (thick curves) values of d′ in experiment 2. Each panel represents the data obtained for a given value of the ISI separating the chord from the following T tone.

C. Results and discussion

The results are displayed in Fig. 2. When the ISI was 100 or 250 ms, d′ was clearly a nonmonotonic function of Δ, as in experiment 1. When the ISI was 900 ms, a similar trend could again be discerned, but it was less clear. The mean values of d′ for the three ISIs were very close to each other, differing by less than 0.1. A repeated-measures analysis of variance [Δ × ISI] revealed a significant main effect of Δ [F(5, 15) = 7.4, P = 0.001], but no main effect of the ISI [F(2, 6) < 1], and no significant interaction of the two factors [F(10, 30) = 1.5, P = 0.18].

As in experiment 1, continuous functions defined by Eq. (1) were fitted to the individual and mean data for each ISI, in order to estimate Δopt. It was again assumed, in so doing, that d′ was equal to 0 for Δ = 0. Table II shows the obtained estimates of Δopt, as well as the associated values of r². The grand mean of r² (0.929) was lower than that found in experiment 1 (0.978) but still high enough to consider that, overall, the fits were satisfactory. For each ISI, moreover, the mean of the four individual Δopt estimates was close to the Δopt estimated from the means of d′ across listeners. The functions fitted to these means are shown in Fig. 3, together with the means themselves. For clarity, the values of d′ have been increased by 1 for the 250-ms ISI and by 2 for the 900-ms ISI; this is why the ordinate axis is not numbered. It can be seen that Δopt remained approximately constant when the ISI changed. The mean of the three Δopt values indicated in the figure is 121 cents. This estimate differs by only 1 cent from the one at which we arrived in experiment 1 when I had the same value as in the present experiment, i.e., 1000 cents.

Figure 3 suggests that, as the ISI increased, d′ decayed less and less rapidly when Δ exceeded Δopt. However, the reliability of this trend is uncertain since the analysis of variance reported above did not demonstrate a significant interaction between Δ and ISI.

IV. GENERAL DISCUSSION

We infer from the present study that the automatic FSDs of the human auditory system (Demany and Ramos, 2005, 2007) are optimally sensitive to frequency shifts of about 120 cents, i.e., one-tenth of an octave, at least when the shifts take place in slow or moderately rapid sound sequences. We arrived at this estimate with tones lasting 300 ms (experiment 1) as well as 100 ms (experiment 2) and with four different SOAs: 200, 350, 800, and 1000 ms. Further work would be needed to check that the optimal frequency shift, measured in cents (that is to say, in logarithmic units), is largely independent of frequency. A previous study by Rose and Moore (2000) suggests that departures from this assumption are possible. They investigated auditory stream segregation in rapid tone sequences (ABA-ABA-,…) made up of two pure tones A and B differing in frequency. They found that the minimum frequency interval allowing the tones A and B to be heard in two separate streams was less constant across frequency when expressed in cents than when expressed in ERB units, that is, in terms of the bandwidth of the auditory filters (Glasberg and Moore, 1990; Moore, 2003).

TABLE II. Estimates of Δopt (in cents) derived from the results of experiment 2. The table also indicates the associated values of r².

               ISI = 100 ms       ISI = 250 ms       ISI = 900 ms
               Δopt     r²        Δopt     r²        Δopt     r²
S1             109      0.930     99       0.903     91       0.993
S2             131      0.985     157      0.935     179      0.989
S4             116      0.788     55       0.793     109      0.829
S5             128      0.960     95       0.995     159      0.952
Means of d′    122      0.947     109      0.965     131      0.969

FIG. 3. Best fits of Eq. (1) to the mean data plotted in Fig. 2. These mean data are replotted here as circles, with vertical shifts: For the 250-ms ISI, the values of d′ have been increased by 1; for the 900-ms ISI, the values of d′ have been increased by 2. Arrows indicate the Δ values corresponding to the maxima of the fitted curves; the mean of these three Δ values is 121 cents.

Another goal for future research is to evaluate more precisely the shape of the FSDs’ “tuning curves.” We assumed for simplicity that there are only two opponent channels of FSDs, up and down, with symmetrical properties. Such a hypothesis is sufficient to account for the present and previous data (Demany and Ramos, 2005, 2007) if listeners base their judgments on the difference between the activities of the two channels. Suppose, in addition, that the (signed) frequency shift maximizing the activity of each channel does not activate significantly the other channel. If so, the absolute value of this frequency shift will be equal to the Δ value maximizing performance in the up/down task (i.e., the Δ value that we defined here as Δopt). More complex layouts for FSDs are possible, including, for instance, largely overlapping channels and/or a set of more than two opponent channels, but such additions to the original model of Demany and Ramos (2005) do not seem necessary to explain the available behavioral data.

It would be relevant to replicate experiment 2 using SOAs even shorter than 200 ms. For such SOAs, the FSDs might be maximally activated by frequency shifts smaller than 120 cents, and an even more likely possibility is that they become less sensitive to frequency shifts exceeding the optimal shift. This conjecture stems from the fact that the perception of temporal coherence in rapid melodic sequences of pure tones is limited by a tradeoff between their speed and the size of the melodic intervals: Increasing the speed of a sequence reduces the range of melodic intervals for which the sequence can be perceived as a single coherent melody (van Noorden, 1975; Micheyl et al., 2007b).
As pointed out by van Noorden (1975) (see also Bregman and Achim, 1973), this auditory phenomenon is analogous to the visual phenomenon known as Korte’s third law of apparent motion (Korte, 1915; Lakatos and Shepard, 1997; Ekroll et al., 2008). The link between FSDs and perceptual streaming, however, remains to be clarified. For melodic sequences with relatively short SOAs, frequency selectivity and adaptation are sufficient to predict stream segregation (Bee and Klump, 2005; Micheyl et al., 2005; Pressnitzer et al., 2008). Activity in FSD channels could provide an additional encoding of the sequences, signaling the binding between tones.

Since in any case the FSDs seem to be particularly sensitive to frequency shifts of about 120 cents over a wide temporal range, it is worth considering if such a frequency distance is also perceptually special in other ways. In this regard, we shall first note that 120 cents appears to be, over a wide frequency range, a good estimate of the minimum frequency interval permitting a perceptual segregation of two simultaneous pure tones, in the absence of other tones (Plomp, 1964). However, this might well be a fortuitous coincidence: Making sense of it is not straightforward.

Another coincidence could be more meaningful. In an experiment on pitch memory, Deutsch (1972) required listeners to make pitch comparisons (same/different judgments) on two pure tones separated by a sequence of interfering tones. All but one of the interfering tones were remote in frequency from the initial test tone. The independent variable was the frequency distance separating the remaining interfering tone from the initial test tone: This distance (D) varied from 0 to 200 cents in 33-cent steps. It was found (on both “same” and “different” trials) that listeners’ error rate steadily increased when D was increased from 0 to 133 cents but then steadily decreased when D was further increased to 200 cents. Deutsch and Feroe (1975) confirmed this observation and proposed an interesting explanation of it. They assumed that pitch memory traces are stored in a tonotopically organized neural network where lateral inhibitory interactions take place and where inhibition is maximal for a frequency distance of about 133 cents.
In support of a role of lateral inhibition, Deutsch and Feroe (1975) showed that the deleterious effect of an interfering tone I1 on the memory trace of a test tone 133 cents away can be reduced by the presentation, following I1, of another interfering tone I2, 133 cents away from I1 and 266 cents away from the test tone. A natural interpretation of this finding is that I2 inhibits the trace of I1 and, in doing so, disinhibits the trace of the test tone.

The critical frequency distance identified by Deutsch and Feroe (1975) is not significantly different from the one identified here, given among other things the individual variability of our data (see Tables I and II). Now if at this frequency distance a tone affects in a particularly strong way the internal representation of a previous tone [as shown by Deutsch and Feroe (1975)], then one can hypothesize that the initial tone also affects in a special way the encoding of the subsequent tone. However, this “forward interaction” hypothesis is not sufficient to account for our results in the up/down task; it must be supposed, in addition, that the effect of the initial tone (a chord component, C) on the encoding of the subsequent tone (T) crucially depends on the direction of the frequency shift. One could imagine, for example, that the auditory system’s “normal” response to T is inhibited by C when the frequency shift is negative but enhanced by C when the frequency shift is positive. However, such a scenario would imply that the intensity of T cannot be encoded independently of the relationship between the frequencies of C and T. Generally speaking, as pointed out by Demany and Ramos (2007) and Demany and Semal (2008), the mere existence of a forward interaction between two successive sounds in the auditory system does not immediately account for the perception of a relation between them; the listener must dissociate, in the neural activity concomitant to the presentation of the second sound, what is due to the relation between the two sounds from what could be due to intrinsic properties of the second sound.

Another problem with the forward interaction hypothesis, in the present context, is that frequency shifts between tones can be detected automatically even for ISIs exceeding 1 s (Demany and Ramos, 2005), whereas forward masking (i.e., the effect of one sound on the absolute threshold of a subsequent sound) is only observable for ISIs smaller than 100–200 ms (Moore, 2003). We thus believe that the forward interaction hypothesis is not an adequate explanation of listeners’ success in the up/down task. In other words, it seems unlikely that this task can be performed by considering only the internal representation of the tone following the chord. We argue instead that successful performance rests upon the existence of FSDs that do not participate in the encoding of the tones themselves. The precise mechanism of their action remains to be elucidated.

ACKNOWLEDGMENTS

We thank Makio Kashino, Brian Moore, and an anonymous reviewer for judicious comments on a previous version of this paper.

Allik, J., Dzhafarov, E. N., Houtsma, A. J. M., Ross, J., and Versfeld, N. J. (1989). “Pitch motion with random chord sequences,” Percept. Psychophys. 46, 513–527.
Bee, M. A., and Klump, G. M. (2005). “Auditory stream segregation in the songbird forebrain: Effects of time intervals on responses to interleaved tone sequences,” Brain Behav. Evol. 66, 197–214.
Bregman, A. S. (1990). Auditory Scene Analysis (MIT, Cambridge, MA).
Bregman, A. S., and Achim, A. (1973). “Visual stream segregation,” Percept. Psychophys. 13, 451–454.
Carlyon, R. P. (2004). “How the brain separates sounds,” Trends Cogn. Sci. 8, 465–471.
Demany, L., and Ramos, C. (2005). “On the binding of successive sounds: Perceiving shifts in nonperceived pitches,” J. Acoust. Soc. Am. 117, 833–841.
Demany, L., and Ramos, C. (2007). “A paradoxical aspect of auditory change detection,” in Hearing – From Sensory Processing to Perception, edited by B. Kollmeier, G. Klump, V. Hohmann, U. Langemann, M. Mauermann, S. Uppenkamp, and J. Verhey (Springer, Heidelberg), pp. 313–321.
Demany, L., and Semal, C. (2008). “The role of memory in auditory perception,” in Auditory Perception of Sound Sources, edited by W. A. Yost, A. N. Popper, and R. R. Fay (Springer, New York), pp. 77–113.
Deutsch, D. (1972). “Mapping of interactions in the pitch memory store,” Science 175, 1020–1022.
Deutsch, D., and Feroe, J. (1975). “Disinhibition in pitch memory,” Percept. Psychophys. 17, 320–324.
Ditterich, J., Mazurek, M. E., and Shadlen, M. N. (2003). “Microstimulation of visual cortex affects the speed of perceptual decisions,” Nat. Neurosci. 6, 891–898.
Ekroll, V., Faul, F., and Golz, J. (2008). “Classification of apparent motion percepts based on temporal factors,” J. Vision 8, 1–22.
Elhilali, M., Ma, L., Micheyl, C., Oxenham, A., and Shamma, S. A. (2009). “Temporal coherence in the organization and representation of auditory scenes,” Neuron 61, 317–329.
Glasberg, B. R., and Moore, B. C. J. (1990). “Derivation of auditory filter shapes from notched-noise data,” Hear. Res. 47, 103–138.
Green, D. M., and Swets, J. A. (1974). Signal Detection Theory and Psychophysics (Krieger, New York).
Korte, A. (1915). “Kinematoskopische Untersuchungen [Cinematoscopic investigations],” Z. Psychol. Z. Angew. Psychol. 72, 193–296.
Lakatos, S., and Shepard, R. N. (1997). “Constraints common to apparent motion in visual, tactile, and auditory space,” J. Exp. Psychol. Hum. Percept. Perform. 23, 1050–1060.
Micheyl, C., Carlyon, R. P., Gutschalk, A., Melcher, J. R., Oxenham, A. J., Rauschecker, J. P., Tian, B., and Courtenay Wilson, E. (2007a). “The role of auditory cortex in the formation of auditory streams,” Hear. Res. 229, 116–131.
Micheyl, C., Shamma, S. A., and Oxenham, A. J. (2007b). “Hearing out repeating elements in randomly varying multitone sequences: A case of streaming?,” in Hearing – From Sensory Processing to Perception, edited by B. Kollmeier, G. Klump, V. Hohmann, U. Langemann, M. Mauermann, S. Uppenkamp, and J. Verhey (Springer, Heidelberg), pp. 313–321.
Micheyl, C., Tian, B., Carlyon, R. P., and Rauschecker, J. P. (2005). “Perceptual organization of tone sequences in the auditory cortex of awake macaques,” Neuron 48, 139–148.
Moore, B. C. J. (2003). An Introduction to the Psychology of Hearing (Elsevier, Amsterdam).
Moore, B. C. J., and Gockel, H. (2002). “Factors influencing sequential stream segregation,” Acta Acust. Acust. 88, 320–332.
Okada, M., and Kashino, M. (2003). “The role of spectral change detectors in temporal order judgment of tones,” NeuroReport 14, 261–264.
Okada, M., and Kashino, M. (2004). “The activation of spectral change detectors by a sequence of discrete tones,” Acoust. Sci. & Tech. 25, 293–295.
Plomp, R. (1964). “The ear as a frequency analyzer,” J. Acoust. Soc. Am. 36, 1628–1636.
Pressnitzer, D., Sayles, M., Micheyl, C., and Winter, I. M. (2008). “Perceptual organization of sound begins in the auditory periphery,” Curr. Biol. 18, 1124–1128.
Rose, M. M., and Moore, B. C. J. (2000). “Effects of frequency and level on auditory stream segregation,” J. Acoust. Soc. Am. 108, 1209–1214.
Shepard, R. N. (1984). “Ecological constraints on internal representation: Resonant kinematics of perceiving, imagining, thinking, and dreaming,” Psychol. Rev. 91, 417–447.
Snyder, J. S., and Alain, C. (2007). “Toward a neurophysiological theory of auditory stream segregation,” Psychol. Bull. 133, 780–799.
Snyder, J. S., Carter, O. L., Hannon, E. E., and Alain, C. (2009). “Adaptation reveals multiple levels of representation in auditory stream segregation,” J. Exp. Psychol. Hum. Percept. Perform. In press.
Snyder, J. S., Carter, O. L., Lee, S. K., Hannon, E. E., and Alain, C. (2008). “Effects of context on auditory stream segregation,” J. Exp. Psychol. Hum. Percept. Perform. 34, 1007–1016.
Ullman, S. (1978). “Two dimensionality of the correspondence process in apparent motion,” Perception 7, 683–693.
van Noorden, L. P. A. S. (1975). “Temporal coherence in the perception of tone sequences,” Ph.D. dissertation, Institute for Perception Research, Eindhoven, The Netherlands.
Wakefield, G. H., and Viemeister, N. F. (1984). “Selective adaptation to linear frequency-modulated sweeps: Evidence for direction-specific FM channels?” J. Acoust. Soc. Am. 75, 1588–1592.

J. Acoust. Soc. Am., Vol. 126, No. 3, September 2009
