The perception of frequency peaks and troughs in wide frequency modulations. III. Complex carriers

Laurent Demany and Sylvain Clément

Laboratoire de Psychoacoustique, Université Bordeaux 2, 146 rue Léo Saignat, F-33076 Bordeaux Cedex, France

(Received 5 April 1994; revised 24 April 1995; accepted 5 May 1995)

In widely frequency-modulated (FM) sine tones, local frequency maxima are perceived more accurately than local frequency minima [L. Demany and K. I. McAnally, J. Acoust. Soc. Am. 96, 706–715 (1994); L. Demany and S. Clément, J. Acoust. Soc. Am. 97, 2454–2459 (1995)]. The aim of the present work was to determine if a similar perceptual asymmetry exists for nonsinusoidal FM carriers. Within each stimulus, the logarithm of instantaneous frequency followed one cycle of a (2.5- or 5-Hz) cosine function, starting at phase π in the "peak" condition and phase 0 in the "trough" condition. In each condition, subjects had to detect shifts in the frequency apex occurring at the temporal center of the stimuli. In experiment 1, the FM functions were imposed on complex tones consisting of a series of consecutive harmonics. Some of the stimuli were bandpass filtered in a 1-oct window with fixed edges. The measured thresholds were about four times lower in the peak condition than in the trough condition, which suggests that the asymmetry previously observed for "spectral" pitches also exists for "virtual" pitches. In experiment 2, the FM carriers were Shepard tones. With such carriers, the standard peak and trough stimuli could be made identical at both the apex and the end points. In spite of these local identities, the results were similar to those of experiment 1, which suggests that the perceptual asymmetry is not determined by local differences between the stimuli and is instead a genuine "motion" effect. In experiment 3, FM was imposed on the frequency of an amplitude modulation of high-pass noise; thus the modulated frequency could have only a temporal (and no place) representation in the auditory nerve. Performance was again better in the peak condition than in the trough condition for apexes near 120 Hz. However, a slight trend in the opposite direction was observed for apexes near 70 Hz.
© 1995 Acoustical Society of America. PACS numbers: 43.66.Fe, 43.66.Hg, 43.66.Mk

INTRODUCTION

Two previous papers from our laboratory (Demany and McAnally, 1994; Demany and Clément, 1995; hereafter: "paper 1" and "paper 2") reported an intriguing phenomenon concerning the perception of the local peaks and troughs of wide and continuous frequency modulations (FM). Initially (experiment 1 of paper 1), we modulated a sinusoidal carrier of about 1 kHz by the exponential of complex and periodic functions which were symmetric in amplitude and time. On a log-frequency scale, the local frequency peaks and troughs were exact mirror images of each other. Since the auditory system is supposed to process frequency on a quasilogarithmic scale (at least around 1 kHz), it was expected that the local maxima and minima of the FM functions would be perceived as equally salient auditory events. However, this was not the case: The maxima were much more often heard as auditory events than the minima. In subsequent experiments, simpler FM functions were used (each consisted of a single cosine cycle) and the subjects had to detect small shifts in local frequency extrema. For extrema near 250 or 1000 Hz, shifts in maxima appeared to be more detectable than shifts in minima, which showed again that maxima are "better" perceived than minima. These observations are probably relevant to the perception of speech, at least because our cosine modulations (cf. Fig. 3 of paper 1)

J. Acoust. Soc. Am. 98 (5), Pt. 1, November 1995

closely resembled natural formant transitions (see, for instance, the speech spectrograms displayed by Gupta and Schroeter, 1993).

In the present study, the detectability of shifts in frequency maxima and minima was investigated for complex rather than sinusoidal carriers. In experiment 1, the carriers consisted of harmonic series. They were used to determine if the perceptual asymmetry previously demonstrated in the "spectral pitch" domain also exists in the domain of "virtual" pitch, for virtual pitch sensations similar to those that humans extract from vowels or common musical sounds.1

In experiment 2, each carrier was a Shepard tone: a sum of sinusoids spaced by octave intervals, this sum being bandpass filtered by a fixed spectral window (Shepard, 1964; Risset, 1971). With such carriers, it was possible to dissociate the immediate directions of a frequency movement from the nature of the difference between the end points of the movement. Normally, the stimulus obtained at the terminal point of a frequency rise is "higher" than the stimulus at the starting point. However, this may not be true when the carrier is a Shepard tone: For instance, the "maximum" obtained at the terminal point can be exactly the same stimulus as the starting point. Are shifts in such pseudomaxima more detectable than shifts in comparable pseudominima? This would suggest that the perceptual asymmetry between maxima and minima is crucially dependent on the immediate directions of the frequency movement rather than on the relations between temporally nonadjacent stimulus states.

In experiment 3, finally, the carrier signal was a periodically amplitude-modulated (AM) noise and the FM functions were imposed on the frequency of amplitude modulation. Here, the theoretical value of the carriers lay in the fact that the modulated frequency had no spectral correlate in the stimuli, and thus no place representation in the auditory periphery. The "atonal" pitch sensations induced by the stimuli could be coded only in a temporal manner by the auditory nerve. Such sensations are not completely "amusical" (Burns and Viemeister, 1976, 1981). It was interesting to determine if the local peaks and troughs of tonal and atonal pitches are subject to a similar perceptual asymmetry.

I. EXPERIMENT I: HARMONIC SERIES

A. Method, experiment 1a

Experiment 1a was very similar to experiment 2 of paper 1, except that the carriers used now consisted of harmonic series and that the stimuli were presented in a background of noise. Each carrier was the sum of the first 20 harmonics of some fundamental frequency (F0). These 20 harmonics had the same nominal sound-pressure level (SPL), 60 dB, and were added in cosine phase. The FM waveforms were exactly the same as before: On a log-frequency scale, they consisted of one cycle of a 5-Hz cosine function, starting at phase π in the "peak" condition and at phase 0 in the "trough" condition. In both conditions, the standard apex of F0 (i.e., the F0 extremum reached at the temporal center of the standard stimuli) varied randomly from trial to trial, in a 0.5-oct range which was logarithmically centered on 200 Hz. For the standard stimuli, the total frequency swing was equal to 0.5 oct. Each stimulus had a total duration of 200 ms and was gated on and off with 10-ms cosine-shaped amplitude ramps. On every trial, as in experiment 2 of paper 1, three successive stimuli separated by 500 ms were presented. The first one consisted of the standard stimulus. This standard was presented again in either the second or the third listening interval, at random. Subjects had to identify the position of the other stimulus, called the "target," by pressing one of two keys. The target had the same end points as the standard stimulus, but a shifted apex due to an increase in the frequency swing (cf. Fig. 3 of paper 1). Visual feedback about response accuracy was provided immediately. Thresholds for the detection of shifts in apex were measured with the "weighted up–down" adaptive method described by Kaernbach (1991, see paper 1 for minor details). Each threshold measurement was an estimation of the shift detected with P(C) = 0.75. The stimuli were presented binaurally via a TDH 39 headset.
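The FM functions described above can be illustrated with a short Python sketch. It is not the original DSP implementation: the names `f0_trajectory`, `target_parameters` and `apex_shift_oct` are hypothetical, the amplitude ramps and the trial-to-trial randomization of the apex are ignored, and the sketch assumes that the target apex is shifted (on a log-frequency scale) by exactly the amount by which the swing is enlarged, which is one reading of "same end points, increased swing".

```python
import math


def f0_trajectory(t, apex_hz, swing_oct, peak, fm_hz=5.0):
    """Instantaneous F0 (Hz) at time t (s) for a one-cycle cosine FM on a log2 scale.

    peak=True:  cosine starts at phase pi, so F0 rises to the apex and falls back.
    peak=False: cosine starts at phase 0, so F0 falls to the apex and rises back.
    """
    phase0 = math.pi if peak else 0.0
    c = math.cos(2.0 * math.pi * fm_hz * t + phase0)
    # Offset from the apex in octaves: 0 at the temporal centre,
    # -swing_oct (peak) or +swing_oct (trough) at the end points.
    offset_oct = swing_oct * ((c - 1.0) if peak else (c + 1.0)) / 2.0
    return apex_hz * 2.0 ** offset_oct


def target_parameters(apex_hz, swing_oct, apex_shift_oct, peak):
    """Apex (Hz) and swing (oct) of a target keeping the standard's end points.

    Assumption (hypothetical helper): the apex moves away from the end points
    by apex_shift_oct because the swing is enlarged by the same amount.
    """
    sign = 1.0 if peak else -1.0
    return apex_hz * 2.0 ** (sign * apex_shift_oct), swing_oct + apex_shift_oct


duration_s = 0.2  # one cycle of a 5-Hz cosine
for peak in (True, False):
    std_start = f0_trajectory(0.0, 200.0, 0.5, peak)
    std_apex = f0_trajectory(duration_s / 2.0, 200.0, 0.5, peak)
    tgt_apex, tgt_swing = target_parameters(200.0, 0.5, 0.05, peak)
    tgt_start = f0_trajectory(0.0, tgt_apex, tgt_swing, peak)
    label = "peak" if peak else "trough"
    print(label, round(std_start, 1), round(std_apex, 1), round(tgt_start, 1))
```

With a 200-Hz standard apex, the standard and the target start from the same F0 in each condition, while the target apex lies 0.05 oct beyond the standard apex.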
They were generated in real time, by means of a DSP card (Oros AU22), with a sampling rate of 20 kHz. The test sessions were run in a continuous background of pink noise, produced by a Bruel & Kjaer WB1314 generator; this noise had a spectrum level of 31 dB at 1000 Hz. The experiment consisted of three test sessions for each subject. In each session, ten thresholds were measured for each of the two FM conditions, presented in alternation. Three persons with normal audiograms, including the two authors, served as subjects. Each of them had been previously tested for many hours in several related experiments.

B. Method, experiment 1b

Experiment 1b was identical in every respect to experiment 1a except for a bandpass filtering of the stimuli described above: These stimuli were now heard through a fixed spectral window, produced by a filter (Kemo VBF/8.04) with cutoff frequencies set to 1000 and 2000 Hz. Beyond the cutoff frequencies, the effective attenuation rate (as measured with a spectrum analyzer) was about 75 dB/oct. Combination tones with frequencies below 1000 Hz were probably masked by the pink noise added to the stimuli; this noise was the same as in experiment 1a. Since the maximum value that F0 could take was 336.4 Hz (0.75 oct above 200 Hz) while the lower edge of the spectral window was fixed at 1000 Hz, the lowest harmonic that could fall inside the window was the third one, and the maximum spacing of two adjacent stimulus components was therefore 0.42 oct (the interval between the third and fourth harmonics, log2(4/3)). This maximum spacing was smaller than the frequency swing imposed on the standard stimuli (and the target stimuli, a fortiori). Therefore, a stimulus component which, when the stimulus began, was at one of its spectral edges, could no longer be at this edge when the stimulus apex occurred: Another component always "took its place." This is an important point because it might be assumed that in experiment 1a, subjects based their judgments on the pitch of a spectral edge rather than on virtual pitch; the spectral edges of the stimuli were submitted to the same FM as F0. In experiment 1b, by contrast, the spectral edges were essentially steady. Experiment 1b was actually run before experiment 1a.

C. Results

Figure 1 shows the mean thresholds measured for each subject in each test session of experiment 1a (unfiltered stimuli, left column) and experiment 1b (bandpass filtered stimuli, right column). Averages over subjects are also plotted (bottom panels). Filled and open circles respectively symbolize the peak and the trough conditions. The error bars displayed for the individual data represent within-session standard errors of the means. (Error bars are invisible when the standard error is smaller than the symbol of the mean.) Thresholds did not vary markedly from session to session, and were always much better in the peak condition than in the trough condition: overall, about four times better. As could be expected from previous studies on frequency discrimination of complex tones (e.g., Moore et al., 1984), performance was worse for the bandpass filtered stimuli than for the unfiltered stimuli (by an overall factor of about 2); however, the effect of FM condition was roughly the same for these two kinds of stimuli. Each subject had previously participated in a comparable study with sinusoidal instead of complex carriers. For LD, the effect of FM condition was markedly larger here than in the previous study. However, this was not the case for SC and MS (see paper 2, especially the middle column of its Fig. 2).


FIG. 1. Mean results of experiment 1a (left column) and experiment 1b (right column), for each subject (six upper panels) and subjects being pooled (bottom panels). The vertical segments plotted for the individual data represent standard errors. The ordinate scale is logarithmic.

D. Discussion

What do the present results add to those reported in papers 1 and 2? We have to consider a hypothesis under which the answer is: nothing new. Assume that, instead of listening to the stimuli in a "synthetic" manner, the subjects actually attended to only one of their components. If this had been the case, the "effective" stimuli would have been essentially identical to the stimuli used in our previous studies. In experiment 1b, the spectral edges of the "objective" stimuli were essentially steady; they were not submitted to the same FM as F0; moreover, the frequency interval formed by two adjacent stimulus components was generally small and often small enough to prevent analytic listening (Plomp, 1976, Chap. 1). Nonetheless, it might still be hypothesized that an analytic strategy was used at least in experiment 1a.

There are serious reasons to believe that this hypothesis is not realistic. First, it is well known that a steady and harmonic complex tone is much more easily heard in a synthetic manner (i.e., as a single sound with one virtual pitch) than in an analytic manner (i.e., as a sum of pure tones with different pitches). Recently, this was verified in a series of experiments performed by Moore and Glasberg (1990). They found, for instance, that discrimination of the frequency of a single common partial in two complex tones is markedly worse when the two tones differ in virtual pitch than when they have the same virtual pitch. From this finding, it was concluded that listeners who are requested to attend solely to spectral pitch are in fact unable to ignore virtual pitch.

Second, our complex tones were not steady but frequency modulated and there are reasons to believe that the FM involved reinforced the perceptual fusion of the spectral components. This is suggested by McAdams (1989), Marin and McAdams (1991), and Darwin et al. (1994). In the studies of McAdams (1989) and Marin and McAdams (1991), each stimulus was a mixture of synthetic vowels with different fundamental frequencies. It was found that when the spectral components of each vowel were frequency modulated in a coherent manner (coherent modulations were also used here), the perceptual prominence of the vowels was higher than in the absence of any FM. The benefit of FM for perceptual prominence may be ascribed to a reinforcement of the fusion of each vowel's spectral components. Darwin et al. (1994) supported this interpretation with different stimuli. They showed that the "harmonic sieve" used for the extraction of virtual pitch from a tonal complex (Duifhuis et al., 1982) is more tolerant to deviations from harmonicity when the spectral components are frequency modulated coherently than when there is no FM.

Third, our stimuli were presented in a background of pink noise and there is evidence that background noise tends to promote a synthetic mode of listening to complex tones. One piece of evidence was provided by Moore and Glasberg (1991): They found that discrimination of the F0 of complex tones with nonoverlapping harmonics can be improved by the addition of pink noise to these stimuli. For harmonics at 60 dB SPL (as in the present study) and a standard F0 of 200 Hz, the maximum improvement was obtained for a pink noise with the same intensity as the one used here. Therefore, it is reasonable to admit that the effective stimuli of experiments 1a and 1b were significantly different from those used in our previous studies.
Our previous results demonstrated a perceptual asymmetry between peaks and troughs of spectral pitch. The present results lead us to believe that peaks and troughs of virtual pitch can be similarly asymmetric.

II. EXPERIMENT II: SHEPARD TONES

A. Stimuli, experiment 2a

The stimuli of experiment 2a differed from those of experiment 1b in that: (1) their carriers consisted of Shepard tones instead of consecutive harmonics; (2) the spectral window through which they were heard had different edges; (3) they were not presented in a background of noise. The left part of Fig. 2 shows the spectrogram of the average standard stimulus for the peak condition and the trough condition. These are "average" stimuli in so far as a random transposition of the modulated frequencies occurred from trial to trial, in a 0.5-oct range, as in experiment 1. In both conditions, the carrier signal was a sum of sine tones spaced by octave intervals and the apex frequency of the middle component was, on the average, 1000 Hz. Seven simultaneous sine tones were generated, but this complex was bandpass filtered in a spectral window which was logarithmically centered on 1000 Hz and covered an interval of 3.5 oct. This spectral window was produced by setting to 297 and 3360 Hz the cutoff frequencies of the filter used in experiment 1. Within the window, each spectral component had a nominal SPL of 55 dB; thus, although the input to the filter consisted of seven components, at most five of them were detectable in the output at any given time.

Let us compare the average standard stimuli for the two conditions. For both conditions, as shown in Fig. 2 and already mentioned above, the apex frequency of the middle component was 1000 Hz. Thus there was no difference between the overall momentary stimuli corresponding to the frequency apexes. In addition, however, there was no difference between the momentary stimuli corresponding to the end points: It can be seen in Fig. 2 that the beginning and the end of the "peak" stimuli were identical to the beginning and the end of the "trough" stimuli. Therefore, a theoretical listener who would sample the stimuli only at their apex and end points (but without restriction on the frequency dimension) could not detect any difference between the standard peaks and troughs; clearly, such a listener should have the same detection threshold for shifts in the apex of peaks and troughs.

FIG. 2. Spectrograms of the average standard stimuli used in the peak condition and the trough condition of experiments 2a and 2b. The component frequencies were randomized from trial to trial in a range indicated by the double arrow on the left. Gray areas represent the spectral regions lying beyond the cutoff frequencies of the (fixed) bandpass filter.

B. Stimuli, experiment 2b

In the standard stimuli of experiment 2b, the frequency swing was equal to 1 oct instead of 0.5 oct and the frequency of the cosine modulation function was 2.5 Hz instead of 5 Hz. Thus stimulus duration was doubled but, since the frequency swing was also doubled, the rates of frequency variation were not changed. The right part of Fig. 2 shows the spectrogram of the average standard stimulus for the two FM conditions (note the change in time scale); the actual standard stimuli were again randomized from trial to trial in a 0.5-oct range. In experiment 2a, the standard peak stimuli did not differ from the standard trough stimuli at both the frequency apex and the end points. As shown by Fig. 2, this was also true in experiment 2b. In the latter experiment, however, the standard stimuli had two new notable properties. First, their frequency apex was identical to their end points. Second, the standard peak stimuli did not differ from the standard trough stimuli at times corresponding to 1/4 and 3/4 of their total duration, i.e., 100 and 300 ms after the onset.

C. Procedure and subjects

For each subject, in experiment 2a as well as experiment 2b, four test sessions were run: one familiarization session followed by three formal sessions. Each session consisted of ten threshold measurements for each of the two FM conditions, presented in alternation. Five listeners with normal hearing were tested. Only two of them (LD and HP) had previously participated as subjects in a related experiment. The other three listeners had no previous experience in psychoacoustic tasks and received no preliminary information about the rationale of the study.

D. Results

The results are displayed in Fig. 3, on its left part for experiment 2a (standard FM swing = 0.5 oct) and its right part for experiment 2b (standard FM swing = 1 oct). Generally, performance was again much better in the peak condition than in the trough condition. There was only one exception to this rule: the case of subject BP in experiment 2b. Overall, thresholds were slightly larger in experiment 2b than in experiment 2a. Whereas practice had essentially no effect in experiment 1, here thresholds generally improved from session to session. This difference should probably be ascribed to a weaker initial training of the subjects in the present study.2 However, the ratio of the thresholds obtained in the trough and the peak condition did not markedly change from session to session. On the average, this ratio had a value of 3.6 in experiment 2a and 3.3 in experiment 2b. The average ratio obtained for experiment 2a is close to that obtained in experiment 1.

FIG. 3. Same as Fig. 1, but for experiments 2a (left) and 2b (right).

E. Discussion

The authors, who participated as subjects in the present study, felt that it was impossible to adopt an analytic listening strategy, i.e., to attend to individual components of the stimuli. So, let us assume that here as well as in experiment 1, the stimuli effectively processed by the subjects did not differ from the objective stimuli. This being assumed, what are the new implications of the present results?

In a rapid sequence of discrete tones differing in frequency, the last tone is generally better perceived than the previous ones (see, e.g., Divenyi and Hirsh, 1975; Watson et al., 1975). By analogy, it might be thought that in a stimulus consisting of a continuous frequency motion, the final point is the most salient moment. In support of this view, it has been found that the global pitch sensation evoked by a short and rising frequency glissando is more dependent on the final part of the glissando than on its beginning (Brady et al., 1961; Nábělek et al., 1970; Pinek and Cavé, 1994). Therefore, one might hypothesize that for a frequency motion including an apex in its temporal center, the perception of the apex will be critically dependent on the stimulus state at the final point (especially when the time interval between the apex and the final point is only 100 ms). For sinusoidal carriers, or for complex carriers such as those used in experiment 1, the perceptual asymmetry between the apexes of FM peaks and troughs would then rest on the fact that the final frequency (or F0) of a peak stimulus is lower than its apex frequency, whereas the reverse is true for a trough stimulus.

Clearly, this hypothesis seems to be inconsistent with the present results. The use of Shepard carriers made it possible to produce FM peaks and troughs that were objectively equated both at the apex and the end points. Moreover, the apex of a standard stimulus could be made objectively identical to its end points. In spite of these identities, the apexes of the peaks and troughs were not perceived with the same accuracy. Thus, we conclude that even for relatively brief stimuli, the perception of a frequency apex seems to depend much less on the objective relation between the apex and the end points than on the "immediate" directions of the frequency motion, or in other words on the derivative of this motion.
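The local identities between the Shepard-tone peak and trough stimuli of experiment 2 can be checked numerically. The Python sketch below is only an illustration under simplifying assumptions that are not in the original method: the bandpass filter is idealized as a rectangular window from 297 to 3360 Hz (the real filter had slopes of about 75 dB/oct), only the "average" standard stimuli are considered (middle-component apex at 1000 Hz, no trial-to-trial transposition), and the function names are hypothetical.

```python
import math

WINDOW_HZ = (297.0, 3360.0)  # cutoff frequencies quoted for experiment 2


def middle_component_hz(t, swing_oct, fm_hz, peak, apex_hz=1000.0):
    """Frequency (Hz) of the middle Shepard component at time t (s)."""
    phase0 = math.pi if peak else 0.0
    c = math.cos(2.0 * math.pi * fm_hz * t + phase0)
    # 0 oct at the apex; -swing (peak) or +swing (trough) at the end points.
    offset_oct = swing_oct * ((c - 1.0) if peak else (c + 1.0)) / 2.0
    return apex_hz * 2.0 ** offset_oct


def audible_components(mid_hz, n_components=7):
    """Octave-spaced components falling inside the idealized window (Hz)."""
    half = n_components // 2
    lo, hi = WINDOW_HZ
    freqs = (mid_hz * 2.0 ** k for k in range(-half, half + 1))
    # Rounding absorbs floating-point noise before sets are compared.
    return sorted(round(f, 3) for f in freqs if lo <= f <= hi)


def snapshot(t, swing_oct, fm_hz, peak):
    """Momentary spectrum (list of component frequencies) at time t."""
    return audible_components(middle_component_hz(t, swing_oct, fm_hz, peak))


# Experiment 2a: 0.5-oct swing, 5-Hz cosine, 200-ms stimuli.
ends_equal_2a = snapshot(0.0, 0.5, 5.0, True) == snapshot(0.0, 0.5, 5.0, False)
apex_equal_2a = snapshot(0.1, 0.5, 5.0, True) == snapshot(0.1, 0.5, 5.0, False)

# Experiment 2b: 1-oct swing, 2.5-Hz cosine, 400-ms stimuli.
apex_is_end_2b = snapshot(0.2, 1.0, 2.5, True) == snapshot(0.0, 1.0, 2.5, True)
quarter_equal_2b = snapshot(0.1, 1.0, 2.5, True) == snapshot(0.1, 1.0, 2.5, False)

print(ends_equal_2a, apex_equal_2a, apex_is_end_2b, quarter_equal_2b)
```

Under these assumptions, the four comparisons all hold: the peak and trough standards share the same momentary spectrum at the end points and at the apex, and in experiment 2b the apex spectrum is the same as the end-point spectrum. Any perceptual difference between the two conditions must therefore come from what happens between these instants.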

III. EXPERIMENT III: AM NOISE

A. Stimuli

In experiment 3, we wished to determine if frequency peaks and troughs are still perceived asymmetrically when frequency can be coded in terms of time but not place in the auditory nerve. To this aim, AM noise carriers were employed and FM was imposed on the frequency of AM. Imposing an AM on a white noise does not change the shape of its long-term spectrum; thus the long-term spectrum does not depend on AM frequency (g). In the short-term spectrum, the instantaneous spectral density at any given frequency f_x is correlated with the spectral density at f_x − g and f_x + g. This might provide short-term place clues to the value of g in the auditory nerve. However, as argued by Burns and Viemeister (1981), the corresponding clues can be removed by a restriction of the stimulus spectrum to high frequencies: If, whatever f_x, the auditory periphery is unable to resolve f_x from f_x − g or f_x + g, then short-term place clues will not be available. In the present study, the stimuli were high-pass filtered at 3000 Hz and the highest g value corresponding to a frequency apex was 285 Hz. At the apexes, therefore, the frequency separation of correlated spectral components never exceeded 285 Hz, which is smaller than the equivalent rectangular bandwidth of the peripheral auditory filters with center frequencies above 3000 Hz (Glasberg and Moore, 1990). For one subject, the high-pass filtered AM noises were sometimes complemented by an unmodulated noise which was low-pass filtered at 3000 Hz; this additional noise was intended to mask auditory distortion products at frequencies below 3000 Hz, under the hypothesis that such distortion existed and could play a role.

The instantaneous amplitude of the unfiltered carrier signal, A(t), could be expressed as

$$A(t) = a(t)\,k^{\sin(2\pi g t)}, \qquad (1)$$

where a(t) represents white noise with a rectangular probability distribution and k = 31.623. Thus the AM function was a sinusoid on a logarithmic scale of amplitude (i.e., on a dB scale) and its peak-to-peak depth was equal to 60 dB, since the ratio of maximum to minimum amplitude is k² and 20 log10(k²) = 40 log10(31.623) = 60 dB. On each stimulus presentation, the initial phase of the AM waveform was randomized. The FM functions imposed on g were the same as those used in experiment 2b. Thus stimulus duration was 400 ms and the total frequency swing was 1 oct for the standard stimuli. As in experiments 1 and 2, the standard apex used in a given threshold measurement was randomized from trial to trial in a range of 0.5 oct. This range was logarithmically centered on 70, 120, or 240 Hz. High-pass filtering at 3000 Hz followed FM and was realized by a Kemo VBF/8.04 filter with spectral slopes of about 75 dB/oct. The doubly modulated stimuli were gated on and off with 10-ms cosine-shaped amplitude ramps and binaurally presented at a "long-term" level of 60 dB SPL ("long-term" referring here to an integration over several AM cycles). They were generated by an Oros AU22 DSP card at a sampling rate of 20 kHz. The output of the DSP card was low-pass filtered at 8000 Hz in order to remove aliasing; however, this was probably unnecessary since TDH 39 earphones were again used for stimulus presentation.

The additional unmodulated noise sometimes presented to one subject was produced by a Bruel & Kjaer pink noise generator (WB1314), followed by a Kemo VBF/8.04 filter with a low-pass function and a cutoff frequency set at 3000 Hz. This low-pass noise was presented continuously throughout each test session. Its nominal level in each 1/3-octave band (below 3000 Hz) was equal to the long-term level of the modulated stimuli in the 1/3-oct band for which their long-term level was maximal (i.e., around 5000 Hz).

B. Subjects and procedure

Four persons with normal hearing were tested. Two of them (LD and HP) had previously participated as subjects in several related studies, including experiment 2. The other two listeners (PR and PL) had no previous experience in psychoacoustic tasks; they received no preliminary information about the rationale of the experiment. After a single familiarization session, each subject participated in seven formal sessions during which the unmodulated noise was absent. In three of these formal sessions, the average frequency apex of the standard stimuli was 70 Hz. In another set of three, the average standard apex was 120 Hz. These six sessions were organized in the same manner as those run in experiments 1 and 2; thus, each consisted of ten threshold measurements in both the peak and the trough condition. In the seventh session, the average standard apex was 240 Hz and only ten threshold measurements were made, all in the peak condition (for reasons stated later). For subject LD, the modulated stimuli were complemented with the unmodulated noise in two additional sessions, during which the average standard apex was respectively 120 Hz (ten threshold measurements in each FM condition) and 240 Hz (peak condition only).

FIG. 4. Mean results of experiment 3. Conventions are the same as in Figs. 1 and 3. For subject LD, the unconnected symbols stand for the data collected in the presence of unmodulated low-pass noise.

C. Results

In the four upper panels of Fig. 4, the mean thresholds measured in each subject for the two FM conditions are displayed as a function of the average standard apex. The standard errors of these individual means (thresholds being pooled over sessions) are also plotted. For LD, the results obtained in the presence of the unmodulated low-pass noise are represented by the unconnected symbols. The overall means obtained in the absence of this noise are plotted in the bottom panel.

Thresholds were much larger than in experiment 2b, although the FM functions were the same. This is not surprising, at least because the carrier signals used here were not tonal (see, e.g., Formby, 1985). More interesting are the combined effects of FM condition and the standard apex.

Let us consider first the data for the intermediate range of standard apexes, centered on 120 Hz. For this range, each subject performed significantly better in the peak condition than in the trough condition. The addition of the unmodulated low-pass noise had essentially no effect in subject LD. The median ratio (across subjects, without low-pass noise) of the mean thresholds respectively obtained in the trough and the peak condition was equal to 2.05. This median ratio did not markedly change from the first formal session to the third and last one.

When the average standard apex was 70 Hz, the results were quite different. As shown in Fig. 4, each subject performed less well in the peak condition, but better in the trough condition. For this lowest range, consequently, overall performance was poorer in the peak condition than in the trough condition, by a median factor of 1.31. For each subject, this factor increased from the first session to the third and final one; the corresponding values of its median were, respectively, 1.15 and 1.52.

For the highest range (apexes near 240 Hz), the trough condition was omitted because the average AM rate at the end points of the stimuli would have been 480 Hz, too high for a good encoding of power fluctuations by the auditory system (Viemeister, 1979). The addition of the unmodulated noise had again no effect in subject LD. Interestingly, thresholds were roughly similar to those obtained in the peak condition for apexes one octave below (near 120 Hz), and much better than those obtained in the corresponding trough condition. This implies that the very poor performance obtained for troughs with apexes near 120 Hz cannot be ascribed to the AM rate at the end points of these stimuli.

D. Discussion

The results obtained for apexes near 120 Hz were qualitatively similar to those obtained in experiments 1 and 2, as well as in previous studies using sinusoidal carriers (papers 1 and 2). Near 70 Hz, however, an advantage of the peaks over the troughs was no longer obtained; the mean results actually showed a small trend in the opposite direction. One way to make sense of the latter result is to assume that, in each FM condition, the acoustic information used by the subjects was not narrowly localized in the apexes of the stimuli but included instead a significant portion of their surrounding flanks. On each flank of a peak with an apex at 70 Hz, about 10 AM cycles could be heard (from, or until, the corresponding end point). For a trough with the same apex, this number was doubled since the AM carrier was one octave above; it is plausible that the larger number of AM cycles favored the extraction of perceptual cues from the flanks. Presumably, the number of AM cycles on the flanks of the peaks and troughs was less critical for apexes around 120 Hz.

This hypothesis, however, cannot account for the fact that thresholds in the trough condition markedly increased when the average standard apex varied from 70 to 120 Hz. Indeed, it would predict the opposite outcome. Moreover, this threshold increase could not be predicted from what is known about the differential sensitivity to the rate of AM for stimuli with noise carriers and a steady AM rate: The relative difference limen of AM rate is roughly the same at 70 and 120 Hz (Bilsen and Wieman, 1980; Formby, 1985). Therefore, it is clear that for troughs with an apex near 120 Hz, the perception of the apex was impaired by its acoustic context, i.e., by one or both of the stimulus flanks. The perceptual advantage of peaks with similar apexes might originate from this deleterious effect alone, and not from an additional and hypothetical nonlinearity thanks to which the apex of a peak could be perceived more precisely within its acoustic context than without it.

The strong interaction visible in Fig. 4 may be clarified by relating the results to those of Burns and Viemeister (1976) about the nature of the sensations evoked by AM noise. From experiments in which listeners identified musical intervals formed by stimuli consisting of sinusoidally AM noise, Burns and Viemeister concluded that such stimuli are able to evoke musical pitch sensations. However, their data indicate that the perceptual correlate of AM rate becomes more and more amusical when the AM rate is decreased below about 100 Hz. Using periodic pulse trains rather than AM noise, Guttman and Pruzansky (1962) also found that musical pitch judgments progressively deteriorate below about 100 Hz; these authors state that the lower limit of musical pitch corresponds to a frequency of 60 Hz. Why this should occur remains unclear, but the findings that we just mentioned, by themselves, apparently clarify our own results: It seems that atonal frequency maxima are better perceived than atonal frequency minima if and only if the perceptual correlate of frequency is a sufficiently salient sensation of musical pitch. For the corresponding frequencies (exceeding about 100 Hz), the perceptual asymmetry clearly occurs in the absence of a place representation of frequency at the level of the auditory nerve. Thus the asymmetry possibly represents a purely temporal effect.
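As a side check on the flank hypothesis considered above, the AM cycle counts quoted for apexes at 70 Hz can be reproduced approximately. The following Python sketch is illustrative only and is not part of the original study. It assumes that the logarithm of the AM rate follows the 2.5-Hz cosine function (so that each flank lasts 200 ms) and that the end points lie one octave away from the apex; the function names and the parameters `fm_rate_hz` and `depth_oct` are hypothetical.

```python
import math

def endpoint_rate(apex_hz, kind, depth_oct=1.0):
    """AM rate at the stimulus end points, depth_oct octaves from the apex."""
    # Assumption: end points lie below the apex of a peak, above that of a trough.
    sign = -1.0 if kind == "peak" else 1.0
    return apex_hz * 2.0 ** (sign * depth_oct)

def flank_cycles(apex_hz, kind, fm_rate_hz=2.5, depth_oct=1.0, steps=2000):
    """Approximate number of AM cycles on one flank (end point to apex)."""
    sign = -1.0 if kind == "peak" else 1.0
    flank_dur = 0.5 / fm_rate_hz          # half an FM cycle, in seconds
    dt = flank_dur / steps
    cycles = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt                # midpoint rule
        # Distance from the apex in octaves: depth_oct at t = 0, zero at the apex.
        offset = 0.5 * depth_oct * (1.0 + math.cos(math.pi * t / flank_dur))
        # Integrate the instantaneous AM rate over time to count cycles.
        cycles += apex_hz * 2.0 ** (sign * offset) * dt
    return cycles

peak_70 = flank_cycles(70.0, "peak")
trough_70 = flank_cycles(70.0, "trough")
print(round(peak_70, 1), round(trough_70, 1))
```

Under these assumptions the sketch gives roughly 10 cycles per flank for a peak and roughly 20 for a trough at a 70-Hz apex, in line with the figures quoted above. With a 5-Hz FM function the flanks would be half as long and both counts would be halved, the factor of two between troughs and peaks being unchanged.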
Some support for this view was already provided by a previous study (see paper 2) in which the FM stimuli had sinusoidal carriers and standard apexes varied from 250 to 4000 Hz. The asymmetry observed in this previous study was notably weaker for apexes near 4000 Hz than for apexes near 250 or 1000 Hz; thus the asymmetry was found to disappear at frequencies for which phase locking disappears in the auditory nerve of mammals.

In the auditory periphery, the “instantaneous” virtual pitch of a fluctuating complex and tonal sound is represented by both place and time information. A perceptual asymmetry between pitch peaks and troughs appears to occur for such sounds, as shown by experiments 1 and 2, but also for atonal sounds with a pitch peripherally represented by time information only. Independently of our previous results for sinusoidal carriers, this may suggest that the instantaneous virtual pitch of fluctuating, complex, and tonal sounds is primarily encoded in terms of time rather than place by the auditory nerve. However, as stressed by Burns and Viemeister (1976, 1981), results on the perception of atonal pitch are difficult to extrapolate to the tonal domain because atonal pitch is weaker and less precise than tonal pitch. Recently, Carlyon and Shackleton (1994) provided evidence that the auditory system uses separate mechanisms to measure the F0 of resolved harmonics (i.e., tonal pitches) and unresolved harmonics (i.e., atonal pitches).

Moreover, it should be emphasized that neural information which is purely temporal in the auditory nerve may be recoded into a place representation at a higher level of the auditory system. Indeed, some physiological data show that AM rate has a place representation in the auditory midbrain of cats (Langner and Schreiner, 1988). So, it is possible that the asymmetry observed in the present experiment (as well as other ones) originates from a central place effect. Actually, the perceptual disadvantage of a momentary frequency minimum might be produced more by what follows this minimum than by what precedes it. Thus the asymmetry might rest on neural phenomena taking place beyond the primary auditory cortex, in associative cortical areas where (musical) pitches of any kind are represented in the same manner.

IV. FINAL COMMENTS AND CONCLUSIONS

The present results extend those reported in papers 1 and 2 in showing that frequency maxima also have a perceptual advantage over frequency minima when the FM carriers are not sinusoidal. Although we used complex carriers of various types, it is clear that additional experiments on still other stimuli of the same family would be worth performing. In particular, it would be interesting to modulate resonant frequencies in order to simulate the formant transitions of natural speech, F0 being fixed or modulated independently. However, the stimuli employed here made it possible to tackle rather basic issues concerning the perception of wide FM, and four general conclusions can be drawn from our study.

(1) A pronounced perceptual advantage of frequency maxima over frequency minima is observable when the perceptual correlate of frequency is a tonal pitch probably extracted from a set of spectral components that are at least partly resolvable in the auditory periphery (experiments 1 and 2).

(2) A perceptual asymmetry of the same type exists for atonal pitches, provided that they originate from frequencies exceeding about 100 Hz (experiment 3). Since the asymmetry can be observed while frequency has no place representation in the auditory nerve, it might be a purely temporal effect; however, alternative interpretations are possible.

(3) The end points of FM stimuli temporally centered on a frequency apex are probably not responsible for the perceptual asymmetry between maxima and minima, even for stimuli lasting only 200 ms (experiment 2). The immediate directions of the frequency changes seem to be much more important than their end points. In other words, the asymmetry seems to be a genuine “motion” effect.

(4) At least in the atonal domain, the main source of the asymmetry seems to be a deleterious factor acting on the perception of minima rather than a facilitating factor acting on the perception of maxima.

ACKNOWLEDGMENTS

This work was supported by the Conseil Régional d’Aquitaine. The first author is affiliated with the Centre National de la Recherche Scientifique. Special thanks are due to Alain de Cheveigné, who gave us the idea of doing experiment 2, analyzed the stimuli used in this experiment, and made valuable comments on a previous version of the manuscript. We also thank Brian Moore and an anonymous psychoacoustician for their formal reviews, as well as Edward Burns for discussions about experiment 3.

1 We adopt here a terminology proposed by Terhardt (1974). He suggested the name “spectral pitch” for the pitch of a sine tone (presented in isolation or as one component of a complex tone), and the name “virtual pitch” for the pitch sensation induced by a complex tone as a whole.

2 The present study was actually performed before “experiment 1,” and also before the study reported in paper 2.

Bilsen, F. A., and Wieman, J. L. (1980). “Atonal periodicity sensation for comb-filtered noise signals,” in Psychophysical, Physiological, and Behavioural Studies in Hearing, edited by G. van den Brink and F. A. Bilsen (Delft U.P., Delft, The Netherlands).
Brady, P. T., House, A. S., and Stevens, K. N. (1961). “Perception of sounds characterized by a rapidly changing resonant frequency,” J. Acoust. Soc. Am. 33, 1357–1362.
Burns, E. M., and Viemeister, N. F. (1976). “Nonspectral pitch,” J. Acoust. Soc. Am. 60, 863–869.
Burns, E. M., and Viemeister, N. F. (1981). “Played-again SAM: Further observations on the pitch of amplitude-modulated noise,” J. Acoust. Soc. Am. 70, 1655–1660.
Carlyon, R. P., and Shackleton, T. M. (1994). “Comparing the fundamental frequencies of resolved and unresolved harmonics: Evidence for two pitch mechanisms?” J. Acoust. Soc. Am. 95, 3541–3554.
Darwin, C. J., Ciocca, V., and Sandell, G. J. (1994). “Effects of frequency and amplitude modulation on the pitch of a complex tone with a mistuned harmonic,” J. Acoust. Soc. Am. 95, 2631–2636.
Demany, L., and Clément, S. (1995). “The perception of frequency peaks and troughs in wide frequency modulations. II. Effects of frequency register, stimulus uncertainty, and intensity,” J. Acoust. Soc. Am. 97, 2454–2459.
Demany, L., and McAnally, K. I. (1994). “The perception of frequency peaks and troughs in wide frequency modulations,” J. Acoust. Soc. Am. 96, 706–715.
Divenyi, P. L., and Hirsh, I. J. (1975). “The effect of blanking on the identification of temporal order in three-tone sequences,” Percept. Psychophys. 17, 246–252.
Duifhuis, H., Willems, L. F., and Sluyter, R. J. (1982). “Measurement of pitch in speech: An implementation of Goldstein’s theory of pitch perception,” J. Acoust. Soc. Am. 71, 1568–1580.
Formby, C. (1985). “Differential sensitivity to tonal frequency and to the rate of amplitude modulation of broadband noise by normally hearing listeners,” J. Acoust. Soc. Am. 78, 70–77.
Glasberg, B. R., and Moore, B. C. J. (1990). “Derivation of auditory filter shapes from notched-noise data,” Hear. Res. 47, 103–138.
Gupta, S. K., and Schroeter, J. (1993). “Pitch-synchronous frame-by-frame and segment-based articulatory analysis by synthesis,” J. Acoust. Soc. Am. 94, 2517–2530.
Guttman, N., and Pruzansky, S. (1962). “Lower limits of pitch and musical pitch,” J. Speech Hear. Res. 5, 207–214.
Kaernbach, C. (1991). “Simple adaptive testing with the weighted up–down method,” Percept. Psychophys. 49, 227–229.
Langner, G., and Schreiner, C. E. (1988). “Periodicity coding in the inferior colliculus of the cat. I: Neuronal mechanisms,” J. Neurophysiol. 60, 1799–1822.
Marin, C. M. H., and McAdams, S. (1991). “Segregation of concurrent sounds. II: Effects of spectral envelope tracing, frequency modulation coherence, and frequency modulation width,” J. Acoust. Soc. Am. 89, 341–351.
McAdams, S. (1989). “Concurrent sound segregation. I: Effects of frequency modulation coherence,” J. Acoust. Soc. Am. 86, 2148–2159.
Moore, B. C. J., and Glasberg, B. R. (1990). “Frequency discrimination of complex tones with overlapping and nonoverlapping harmonics,” J. Acoust. Soc. Am. 87, 2163–2177.
Moore, B. C. J., and Glasberg, B. R. (1991). “Effects of signal-to-noise ratio on the frequency discrimination of complex tones with overlapping or nonoverlapping harmonics,” J. Acoust. Soc. Am. 89, 2858–2865.

Moore, B. C. J., Glasberg, B. R., and Shailer, M. J. (1984). “Frequency and intensity difference limens for harmonics within complex tones,” J. Acoust. Soc. Am. 75, 550–561.
Nábělek, I. V., Nábělek, A. K., and Hirsh, I. J. (1970). “Pitch of tone bursts of changing frequency,” J. Acoust. Soc. Am. 48, 536–553.
Pinek, B., and Cavé, C. (1994). “The pitch of glissandi: effects of musical ability, gender, and listening conditions,” Acta Acustica 2, 59–63.
Plomp, R. (1976). Aspects of Tone Sensation (Academic, London).
Risset, J. C. (1971). “Paradoxes de hauteur: le concept de hauteur sonore n’est pas le même pour tout le monde,” in Proceedings of the 7th International Congress on Acoustics (Budapest), Paper 20 S 10.

J. Acoust. Soc. Am., Vol. 98, No. 5, Pt. 1, November 1995

Shepard, R. N. (1964). “Circularity in judgments of relative pitch,” J. Acoust. Soc. Am. 36, 2346–2353.
Terhardt, E. (1974). “Pitch, consonance, and harmony,” J. Acoust. Soc. Am. 55, 1061–1069.
Viemeister, N. F. (1979). “Temporal modulation transfer functions based upon modulation thresholds,” J. Acoust. Soc. Am. 66, 1364–1380.
Watson, C. S., Wroton, H. W., Kelly, W. J., and Benbassat, C. A. (1975). “Factors in the discrimination of tonal patterns. I. Component frequency, temporal position, and silent intervals,” J. Acoust. Soc. Am. 57, 1175–1185.
