Auditory Signal Processing: Physiology, Psychoacoustics, and Models. Pressnitzer, D., de Cheveigné, A., McAdams, S., and Collet, L. (Eds). Springer Verlag, 2004.

A computational algorithm for computing cochlear frequency selectivity: Further studies

Alberto Lopez-Najera1, Ray Meddis2, and Enrique A. Lopez-Poveda1§

1 Universidad de Castilla-La Mancha, {enrique.lopezpoveda, alberto.lopez}@uclm.es
2 University of Essex, [email protected]
§ Corresponding author, presently at Universidad de Salamanca, [email protected]

1 Introduction

Phenomenological filter algorithms that compute nonlinear cochlear frequency selectivity are important for basic and applied hearing research. They constitute a crucial stage for modeling the physiology of processes higher in the auditory system (e.g., Sumner, Lopez-Poveda, O’Mard and Meddis 2002; Zhang, Heinz, Bruce and Carney 2001), and for modeling behavioral data pertaining to auditory frequency selectivity (Lopez-Poveda and Meddis 2001; Irino and Patterson 2001). They may also serve as the basis for developing speech processors for auditory prostheses (Wilson, Brill, Cartee, Cox, Lawson, Schatzer and Wolford 2002). Several algorithms of this sort have been proposed (Goldstein 1990; Irino and Patterson 2001; Zhang et al. 2001; Meddis, O’Mard and Lopez-Poveda 2001). Their success will ultimately depend on their ability to reproduce accurately as many features of the basilar-membrane (BM) response as possible, as well as on their computational speed and conceptual simplicity.

Meddis et al. (2001) presented one such algorithm, termed the dual-resonance nonlinear (DRNL) filter. Preceded by a middle-ear (ME) filter, the DRNL filter reproduced reasonably well the measured response of point sites on the BM to single tones, clicks, and combination tones. However, it failed to model two prominent characteristics of the BM response. First, it did not reproduce the amplitude and phase plateaus observed at high levels for frequencies higher than the characteristic frequency (CF) of the measurement site (Ruggero, Rich, Recio, Narayan and Robles 1997). Instead, the gain of the DRNL filter decreased steeply above CF regardless of level. Second, the filter did not reproduce the long-lasting, multiple-lobe aspect of the BM impulse response (IR) (Recio, Rich, Narayan and Ruggero 1998). Instead, the filter’s IR showed only two lobes, consistent with its dual-resonance structure.
Both failures call into question the validity of the simple, dual-resonance architecture of the DRNL filter as a model of BM responses. In search of an optimum algorithm, the present work investigates the reasons for these failures and introduces improvements to the original DRNL filter to overcome them. It is shown that the second of these two failures is a result of using an unrealistic middle-ear filter. Indeed, longer multiple-lobe IRs can be reproduced with the DRNL filter if empirical stapes IR waveforms are input directly to the filter. It is also shown, however, that a third, all-pass parallel filter path must be incorporated into the DRNL filter to model the high-frequency plateau.

The analysis below focuses on the ability of the improved filter to reproduce many aspects of the BM response to pure tones and clicks for a single chinchilla at a CF ∼ 10 kHz (L113 of Ruggero et al. 1997 and Recio et al. 1998). The reason for this choice is that L113 is perhaps the case for which the most complete set of BM measurements has ever been reported.

2 The improved DRNL filter

The improved DRNL filter consists of the original dual-resonance nonlinear algorithm (described in full in Meddis et al., 2001) supplemented with a third parallel path that acts as a linear, zero-phase, all-pass filter. The output from the improved filter is the sum of the outputs from the original DRNL filter and the third path. The scalar gain of the third-path filter, k, is free to vary above zero, but is usually lower than the gain of the linear path, g, of the original DRNL filter. As a result, the contribution of the third path to the total filter output becomes prominent only at high input levels. Furthermore, its contribution is most evident for frequencies higher than CF, because in this region the output from both the linear and the nonlinear paths of the DRNL filter is highly attenuated (line “no-3rd” in Fig. 1B). This third path allows modeling of the high-frequency amplitude and phase plateaus observed in BM tonal responses (Robles and Ruggero 2001). The idea of using a zero-phase, all-pass filter is based on a suggestion by Robles and Ruggero (p. 1313) that the plateaus “…reflect, more or less directly (…) stapes motion…”
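The three-path sum described above can be sketched in a few lines of Python (the authors worked in Matlab; Python is used here only for illustration). This is a minimal sketch, not the authors' code. It assumes that a zero-phase, all-pass filter with scalar gain k reduces to multiplying the input by k. The function names `improved_drnl` and `toy_drnl` are hypothetical, and `toy_drnl` is a placeholder that stands in for the original linear and nonlinear paths, which are not implemented here.

```python
import numpy as np

def improved_drnl(stapes_velocity, original_drnl, k=1.0):
    """Sum the original DRNL output and a zero-phase, all-pass third path.

    original_drnl is a hypothetical callable standing in for the linear and
    nonlinear paths of the original filter; it is not implemented here.
    """
    x = np.asarray(stapes_velocity, dtype=float)
    # Assumption: a zero-phase, all-pass path with scalar gain k is k * x.
    third_path = k * x
    return original_drnl(x) + third_path

def toy_drnl(x):
    """Placeholder: pretend the original paths are fully attenuated (far above CF)."""
    return np.zeros_like(x)

fs = 62500.0                                  # sampling rate used for tones (Hz)
t = np.arange(625) / fs                       # 10 ms of signal
x = 1e-4 * np.sin(2 * np.pi * 15000.0 * t)    # hypothetical stapes velocity (m/s)
y = improved_drnl(x, toy_drnl, k=1.0)
```

With the original paths fully attenuated, the output simply follows the input scaled by k, which is the behavior the third path is meant to contribute in the plateau region.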

3 Implementation and evaluation

The improved DRNL filter was implemented and evaluated digitally in the time domain. The part corresponding to the original DRNL filter was implemented as described by Lopez-Poveda and Meddis (2001, Appendix). The new zero-phase, all-pass filter was implemented digitally as suggested by Smith (2002).

A middle-ear stage was placed between the stimulus and the input to the filter. For any sound pressure (Pa) waveform, this stage produces stapes velocity (m/s), which is the assumed input to the DRNL filter. Special care was taken to ensure that this ME stage preserves all aspects of the experimental stapes response. For this reason, it was implemented differently when evaluating the filter for tones and for clicks. For pure tones, it was realized as an FIR filter whose coefficients match the empirical IR of the chinchilla stapes. For clicks, however, the measured IR of the stapes (in velocity units) was used directly as the input to the improved DRNL filter. Unfortunately, the stapes IR for chinchilla L113 was not available (Ruggero, pers. comm.). Instead, we used the experimental waveform (Fig. 3) for a different animal (CB063 of Rhode and Recio, 2000). This may explain some of the discrepancies between the experimental and the model responses discussed later.

The response of the improved DRNL filter was examined for stimuli identical to those used in the experiments. Its amplitude and phase responses for pure tones were obtained by fitting a sine wave to the output waveform with a sine-fitting routine. The sampling rate was 6.25×10^4 Hz for tones and 2.5×10^5 Hz for clicks. All filters were implemented and evaluated in Matlab™ 6.5. The code is available from the authors on request.
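The text does not specify the sine-fitting routine. One common approach, shown here as a hedged sketch rather than the authors' method, is a linear least-squares fit at the known stimulus frequency. The function name `fit_sine` and the test signal values are illustrative assumptions.

```python
import numpy as np

def fit_sine(y, fs, freq):
    """Least-squares fit of y[n] ~ A*sin(2*pi*freq*t + phi) + offset.

    The frequency is assumed known (the stimulus frequency), so the problem
    is linear in the sine and cosine coefficients.
    """
    y = np.asarray(y, dtype=float)
    t = np.arange(y.size) / fs
    w = 2.0 * np.pi * freq
    design = np.column_stack([np.sin(w * t), np.cos(w * t), np.ones_like(t)])
    (s, c, offset), *_ = np.linalg.lstsq(design, y, rcond=None)
    # A*sin(wt + phi) = A*cos(phi)*sin(wt) + A*sin(phi)*cos(wt)
    amplitude = np.hypot(s, c)
    phase = np.arctan2(c, s)      # radians
    return amplitude, phase, offset

# Hypothetical output waveform: 10-kHz tone sampled at the rate used for tones.
fs = 62500.0
freq = 10000.0
t = np.arange(625) / fs
y = 2e-3 * np.sin(2 * np.pi * freq * t - 0.7) + 1e-5
amp, phi, off = fit_sine(y, fs, freq)
```

Sensitivity then follows as the fitted output amplitude divided by the input pressure amplitude, and the phase in cycles as the fitted phase difference divided by 2π.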

4 Filter parameters

The parameters for the improved DRNL filter are given in Table I. The same set was used throughout this report. The optimization strategy differed from that used by Meddis et al. (2001) and Lopez-Poveda and Meddis (2001), who fitted only the amplitude of the BM response to pure tones. That procedure is inadequate because it does not constrain every parameter. For example, varying either the order or the bandwidth of the gammatone filters in the nonlinear path may be equally valid for reproducing the tuning of the BM response at low levels, where experimental data are usually scarce. However, the two parameters affect the phase of the DRNL filter differently, as can be easily understood from the analytical description of the filter’s response of Lopez-Poveda (submitted). For this reason, the parameters in Table I were optimized considering both the amplitude and phase aspects of the experimental response simultaneously.

Table I. Parameters of the improved DRNL filter used throughout this report. The notation is the same as in Lopez-Poveda and Meddis (2001). (GT: Gammatone; LP: Lowpass)

Parameter          Linear path       Nonlinear path
GT cascade         5                 3
LP cascade         7                 4
CF (Hz)            CFlin = 9000      CFnl = 10000
BW (Hz)            BWlin = 3500      BWnl = 1800
LP (Hz)            LPlin = 8800      LPnl = 10000
Gain               g = 89            a = 2900
                                     b = 0.04 [(m/s)^(1-c)]
                                     c = 0.25

All-pass path: gain k = 1
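To show what the nonlinear-path gains a, b and c in Table I control, the sketch below evaluates a broken-stick compressive function of the form y = sign(x)·min(a|x|, b|x|^c). This form is an assumption here: it is the nonlinearity we understand the original DRNL filter (Meddis et al. 2001) to use, but this chapter does not restate it. The function names are hypothetical.

```python
import numpy as np

# Nonlinear-path gains from Table I.
A_GAIN, B_GAIN, C_EXP = 2900.0, 0.04, 0.25

def broken_stick(x, a=A_GAIN, b=B_GAIN, c=C_EXP):
    """Assumed broken-stick nonlinearity: linear at low levels, compressive above."""
    x = np.asarray(x, dtype=float)
    mag = np.abs(x)
    return np.sign(x) * np.minimum(a * mag, b * mag ** c)

def compression_threshold(a=A_GAIN, b=B_GAIN, c=C_EXP):
    """Input magnitude at which a*|x| equals b*|x|**c."""
    return (b / a) ** (1.0 / (1.0 - c))

threshold = compression_threshold()           # in the units of the input (m/s)
low = broken_stick(threshold / 10.0)          # linear regime: a * x
high = broken_stick(threshold * 10.0)         # compressive regime: b * x**c
```

Below the threshold the output grows in proportion to the input; above it, a tenfold increase in input raises the output by only a factor of 10^c.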

5 Response to pure tones

Figure 1 compares the response of the improved DRNL filter against the experimental data for identical pure tone stimuli. Overall, the model (Fig. 1B) reproduces the sensitivity data (Fig. 1A) to a good approximation, both qualitatively and quantitatively. The discrepancies are more evident at high levels (90 dB), and may be attributed to using a ME filter in the model that does not correspond to chinchilla L113.

Note that the improved DRNL filter now reproduces the plateau observed at high levels for frequencies higher than 13 kHz. It also reproduces the shift in best frequency (BF) from CF at low levels to around 0.6×CF at high levels (the actual BF at high levels may be influenced by the presence of peaks in the stapes response). A closer look reveals, however, that in the model the shift occurs abruptly from the BF of its nonlinear path to the BF of its linear path, whereas the shift is more gradual in the data. Indeed, Recio et al. (1998, Fig. 12) suggested that the data may be better described if a third highly tuned, nonlinear filter were added between the two resonances already modeled by the DRNL filter (arrow in Fig. 1A).

Fig. 1. Left panels: BM response for cochlea L113 (from Ruggero et al. 1997). (A) Sensitivity (m/s/Pa). (C) Phase (cycles) relative to input pressure (condensation). Right panels: model response. (B) Sensitivity (m/s/Pa). (D) Phase (cycles) relative to stapes phase. All panels are plotted against frequency (Hz). Different symbols (insets) illustrate different signal levels (dB SPL).

The bottom panels in Fig. 1 compare the phase responses. The agreement in shape is reasonable, including the presence of a high-frequency plateau. However, the phase lag in the plateau region is 2.5 cycles larger in the data. The reason may be that the data include the lag introduced by the stapes, which is absent in the model results. Unfortunately, the phase of the stapes response for L113 was not measured (Ruggero, pers. comm.), and the lag introduced by the ME stage in our model (shown in Fig. 2D) is unusually large compared with the values measured in Ruggero’s laboratory (Temchin, Robles and Ruggero 2001).

Fig. 2. Phase relative to 80 dB SPL, plotted against frequency (Hz) for different signal levels (insets). (A) Animal data (L113 from Ruggero et al. 1997). (B) Model response.

Figure 2 reproduces the phase data of Fig. 1 but relative to 80 dB SPL. Therefore, the contribution of the stapes to the BM phase, which is linear, is cancelled out in this representation. The model reproduces the systematic phase lead with level for frequencies lower than CF. It also reproduces the phase lag observed at high levels (≥ 90 dB) across frequencies. However, the model does not reproduce the phase lag at low levels for frequencies higher than CF. This failure could be corrected if different parameters were used (not shown), but then the fit to the sensitivity data in Fig. 1 would deteriorate slightly.
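The re-referencing used for Fig. 2 can be sketched as follows. The phase values are made up purely for illustration and the function name `phase_re_reference` is hypothetical. The point of the sketch is that any phase term common to all levels, such as that of a linear stapes stage, drops out once the 80 dB SPL curve is subtracted.

```python
import numpy as np

def phase_re_reference(phase_by_level, ref_level=80):
    """Express each level's phase-vs-frequency curve relative to the reference level."""
    ref = np.asarray(phase_by_level[ref_level], dtype=float)
    return {level: np.asarray(phase, dtype=float) - ref
            for level, phase in phase_by_level.items()}

# Hypothetical BM phases (cycles) at three frequencies for three levels.
bm_phase = {
    20: np.array([-1.00, -2.10, -3.40]),
    80: np.array([-0.95, -2.00, -3.50]),
    90: np.array([-0.97, -2.05, -3.60]),
}
# Hypothetical level-independent (linear) stapes phase at the same frequencies.
stapes_phase = np.array([-0.20, -0.45, -0.80])

relative = phase_re_reference(bm_phase)
relative_with_stapes = phase_re_reference(
    {level: phase + stapes_phase for level, phase in bm_phase.items()}
)
```

Both dictionaries hold the same relative phases, which is why the data (which include the stapes lag) and the model (which does not) can be compared directly in this representation.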

6 Response to clicks

Figure 3 compares the IR for cochlea L113 (Recio et al. 1998) and the response of the improved DRNL filter when the input is an empirical stapes IR (see Sec. 3). Note that the filter’s response shows more than the two lobes that would be expected from a dual-resonance architecture, and its duration is comparable to that of the experimental IRs. This is a consequence of using a realistic stapes IR as input to the DRNL filter. It was not observed when the middle ear was modeled with a bandpass filter (compare Fig. 3B below with Fig. 7B of Meddis et al., 2001).

As shown by Recio et al. (1998), the BM responses to clicks in Fig. 3A are frequency modulated (Fig. 4A). Recio et al. wrote (their p. 1976) that the “…instantaneous frequency was influenced by level but its time trajectory retained its main features even at the highest level…” and post-mortem. The improved DRNL filter reproduces this behavior to a reasonable approximation (Fig. 4B), but only when a realistic stapes response is used as input. When the ME stage is removed (“no-ME” in Fig. 4), or when it is modeled by a simple Butterworth bandpass filter (not shown), the trajectory of the instantaneous frequency of the model differs considerably from the data. If the ME largely determined the trajectory, this would explain its invariance post-mortem. For the trajectory to settle at approximately the same frequency for all levels, it is also very important that the center frequencies of the gammatone filters in the linear and nonlinear paths of the DRNL filter are relatively close together (compare CFlin and CFnl in Table I). The shift in BF with level (illustrated in Fig. 1) may still be modeled by setting the cut-off frequency of the lowpass filter in the linear path (LPlin) lower than CFlin.
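The instantaneous-frequency trajectories discussed above can be estimated in several ways. The sketch below uses the analytic signal (a discrete Hilbert transform computed with the FFT) and differentiates its unwrapped phase. This is one common method and an assumption here; it is not necessarily the procedure used by Recio et al. (1998) or by the authors. The function names and the test tone are illustrative.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal via the FFT: keep DC, double positive frequencies, zero negatives."""
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0          # Nyquist bin
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)

def instantaneous_frequency(x, fs):
    """Instantaneous frequency (Hz) from the derivative of the unwrapped analytic phase."""
    phase = np.unwrap(np.angle(analytic_signal(x)))
    return np.diff(phase) * fs / (2.0 * np.pi)

# Check on a steady 10-kHz tone at the sampling rate used for clicks.
fs = 250000.0
t = np.arange(1000) / fs               # 4 ms, exactly 40 cycles
tone = np.sin(2 * np.pi * 10000.0 * t)
f_inst = instantaneous_frequency(tone, fs)
```

Applied to a click response, the same function would return a frequency trajectory over time. Estimates are unreliable wherever the response envelope is close to zero, so in practice the trajectory would be restricted to the portion of the waveform with appreciable amplitude.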

Fig. 3. (A) BM velocity click response for case L113 (from Recio et al., 1998). (B) Response of the improved DRNL filter to an empirical stapes IR (bottom waveform). Velocity (µm/s) is plotted against time (ms).

7 Conclusions

The improved DRNL filter reproduces the characteristic high-frequency plateaus. The fits to the data improve when its parameters are optimized considering the amplitude and phase aspects of the response to tones simultaneously. The middle-ear stage contributes significantly to improving the fits, particularly at high levels, and is essential to explain and to model the waveforms and the instantaneous frequency of the BM response to clicks, both in vivo and post mortem.

Acknowledgments

Thanks to Mario Ruggero and Alberto Recio for their suggestions and for providing some of the experimental data. Work supported by FIS PI020343 and G03/203.


Fig. 4. Instantaneous frequency for different levels (insets). (A) Data for cochlea L113 (from Recio et al. 1998). (B) Results for the improved DRNL filter.

References

Goldstein, J.L. (1990) Modeling rapid wave form compression on the basilar membrane as multiple-band-pass-nonlinearity filtering. Hear. Res. 49, 39-60.
Irino, T. and Patterson, R.D. (2001) A compressive gammachirp auditory filter for both physiological and psychophysical data. J. Acoust. Soc. Am. 109, 2008-2022.
Lopez-Poveda, E.A. and Meddis, R. (2001) A human nonlinear cochlear filterbank. J. Acoust. Soc. Am. 110, 3107-3118.
Lopez-Poveda, E.A. (2003) An approximate transfer function for the dual-resonance nonlinear filter model of auditory frequency selectivity. J. Acoust. Soc. Am. 114, 2112-2117.
Meddis, R., O’Mard, L. and Lopez-Poveda, E.A. (2001) A computational algorithm for computing nonlinear auditory frequency selectivity. J. Acoust. Soc. Am. 109, 2852-2861.
Recio, A., Rich, N.C., Narayan, S. and Ruggero, M.A. (1998) Basilar-membrane responses to clicks at the base of the chinchilla cochlea. J. Acoust. Soc. Am. 103, 1972-1989.
Rhode, W.S. and Recio, A. (2000) Study of mechanical motions in the basal region of the chinchilla cochlea. J. Acoust. Soc. Am. 107, 3317-3332.
Robles, L. and Ruggero, M.A. (2001) Mechanics of the mammalian cochlea. Physiol. Rev. 81, 1305-1352.
Ruggero, M.A., Rich, N.C., Recio, A., Narayan, S. and Robles, L. (1997) Basilar-membrane responses to tones at the base of chinchilla cochlea. J. Acoust. Soc. Am. 101, 2151-2163.
Smith, J.O. (2003) Introduction to Digital Filters, Stanford University. Web published at http://www-ccrma.stanford.edu/~jos/filters/.
Sumner, C.J., Lopez-Poveda, E.A., O’Mard, L.P. and Meddis, R. (2002) A revised model of the inner-hair cell and the auditory-nerve complex. J. Acoust. Soc. Am. 111, 2178-2188.
Temchin, A.N., Robles, L. and Ruggero, M.A. (2001) A re-examination of middle-ear transmission in chinchilla. Poster #586. Meeting of the Assoc. Res. Otolaryng.
Wilson, B.S., Brill, S.M., Cartee, L.A., Cox, J.H., Lawson, D.T., Schatzer, R. and Wolford, R.D. (2002) Speech processors for auditory prostheses. Final Report. NIH project N01DC-8-2105.
Zhang, X., Heinz, M.G., Bruce, I.C. and Carney, L.H. (2001) A phenomenological model for the responses of auditory nerve fibers. I. Non-linear tuning with compression and suppression. J. Acoust. Soc. Am. 109, 648-670.
