Stereophonic audio signal decompression switching to monaural ...


US00RE42949E

(19) United States
(12) Reissued Patent
     Tzannes et al.
(10) Patent Number: US RE42,949 E
(45) Date of Reissued Patent: Nov. 22, 2011

(54) STEREOPHONIC AUDIO SIGNAL DECOMPRESSION SWITCHING TO MONAURAL AUDIO SIGNAL

(75) Inventors: Michael A. Tzannes, Lexington, MA (US); Peter N. Heller, Somerville, MA (US); John P. Stautner, The Woodlands, TX (US); William R. Morrell, Santa Cruz, CA (US); Sriram Jayasimha, Bangalore (IN)

(73) Assignee: Hybrid Audio LLC, Tyler, TX (US)

(21) Appl. No.: 11/898,920

(22) Filed: Sep. 14, 2007 (Under 37 CFR 1.47)

Related U.S. Patent Documents

Reissue of:
(64) Patent No.: 6,252,909
     Issued: Jun. 26, 2001
     Appl. No.: 08/804,909
     Filed: Feb. 25, 1997

U.S. Applications:
(60) Continuation of application No. 10/994,925, filed on Nov. 23, 2004, now Pat. No. Re. 40,281, which is a division of application No. 10/603,833, filed on Jun. 26, 2003, now abandoned, which is a continuation-in-part of application No. 08/307,331, filed on Sep. 16, 1994, now Pat. No. 5,606,642, which is a division of application No. 07/948,147, filed on Sep. 21, 1992, now Pat. No. 5,408,580.

(51) Int. Cl.
     G01L 19/02 (2006.01)
     H04H 40/81 (2008.01)
     H04K 1/10 (2006.01)
     H04B 1/38 (2006.01)
     H03D 1/24 (2006.01)

(52) U.S. Cl.: 375/260; 704/205; 704/230; 704/500; 381/11; 375/219; 329/357

(58) Field of Classification Search: 704/200, 704/200.1, 201, 205, 206, 500, 501, 502, 704/503, 504; 708/300, 313, 318; 381/11
     See application file for complete search history.

(56) References Cited

U.S. PATENT DOCUMENTS
3,976,863 A * 8/1976 Engel 708/306
(Continued)

Primary Examiner: Martin Lerner
(74) Attorney, Agent, or Firm: Jason H. Vick; Sheridan Ross P.C.

(57) ABSTRACT

[A communication system for sending a sequence of symbols on a communication link. The system includes a transmitter for placing information indicative of the sequence of symbols on the communication link and a receiver for receiving the information placed on the communication link by the transmitter. The transmitter includes a clock for defining successive frames, each of the frames including M time intervals, where M is an integer greater than 1. A modulator modulates each of M carrier signals with a signal related to the value of one of the symbols thereby generating a modulated carrier signal corresponding to each of the carrier signals. The modulated carriers are combined into a sum signal which is transmitted on the communication link. The carrier signals include first and second carriers, the first carrier having a different bandwidth than the second carrier. In one embodiment, the modulator includes a tree-structured array of filter banks having M leaf nodes, each of the values related to the symbols forming an input to a corresponding one of the leaf nodes. Each of the nodes includes one of the filter banks. Similarly, the receiver can be constructed of a tree-structured array of sub-band filter banks for converting M time-domain samples received on the communication link to M symbol values.] A stereophonic audio signal decompression method that includes decoding, using a decoder, a compressed stereophonic audio signal. A de-quantizer de-quantizes the compressed stereophonic audio signal to generate sets of frequency components for synthesizing left and right audio signals. A controller switches to constructing a single set of frequency components by averaging corresponding frequency components in the left and right audio signals when a computational workload exceeds a capacity of a decompression system and a synthesizer synthesizes a monaural audio time domain signal.

1 Claim, 16 Drawing Sheets
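The switching behavior summarized in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `select_components`, the list-of-floats representation of the de-quantized frequency components, and the scalar `workload` and `capacity` values are all hypothetical. The abstract itself only states that corresponding left and right frequency components are averaged into a single set when the computational workload exceeds the capacity of the decompression system.

```python
from typing import List, Tuple, Union

Components = List[float]


def select_components(
    left: Components,
    right: Components,
    workload: float,
    capacity: float,
) -> Union[Tuple[Components, Components], Tuple[Components]]:
    """Return the component sets to hand to the synthesizer.

    Hypothetical helper: returns (left, right) for stereo synthesis, or a
    one-element tuple holding the averaged set for monaural synthesis.
    """
    if len(left) != len(right):
        raise ValueError("left and right must have the same number of components")
    if workload > capacity:
        # Overloaded: average corresponding components so that only one
        # set has to be synthesized (monaural output).
        mono = [(l + r) / 2.0 for l, r in zip(left, right)]
        return (mono,)
    # Enough capacity: keep both sets for stereophonic synthesis.
    return (left, right)
```

In this sketch the synthesizer would run once on the single averaged set instead of twice on the left and right sets, which is where the reduction in workload would come from.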

[Representative drawing: block diagram showing a decoder, a de-quantizer, a switch/adder, and a controller; legible reference numerals 800, 802, 804, 814.]


[Drawing sheets, US RE42,949 E, Nov. 22, 2011. Sheets 4 to 8, 10 to 12, 15, and 16 of 16 appear in the transcript; only the following labels are legible.]
[FIG. 6 (Sheet 6): band (frequency) versus time.]
[FIG. 7 (Sheet 7): reference numerals 301 and 313.]
[FIG. 8(A) (Sheet 7): audio signal, bandpass filter, filtered audio signal, low-freq., modulation signal.]
[FIG. 8(B) (Sheet 7): audio samples, sample shift register, generate 2M polyphase components, matrix mult. to generate M sub-band components, controller.]
[FIG. 9 (Sheet 8): audio, sub-band analysis filter bank, psycho-acoustic analyzer.]

STEREOPHONIC AUDIO SIGNAL DECOMPRESSION SWITCHING TO MONAURAL AUDIO SIGNAL

Matter enclosed in heavy brackets [ ] appears in the original patent but forms no part of this reissue specification; matter printed in italics indicates the additions made by reissue.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a Continuation of U.S. Reissue application Ser. No. 10/994,925, now Reissue Pat. No. RE 40,281, which is a Division of U.S. Reissue application Ser. No. 10/603,833 filed Jun. 26, 2003, now abandoned, which is a Reissue of U.S. application Ser. No. 08/804,909, filed Feb. 25, 1997, now U.S. Pat. No. 6,252,909, issued Jun. 26, 2001. U.S. application Ser. No. 08/804,909, filed Feb. 25, 1997, is a Continuation-in-Part of U.S. patent application Ser. No. 08/307,331, filed Sep. 16, 1994, Pat. No. 5,606,642, which is a division of U.S. Patent Application Ser. No. 07/948,147, filed Sep. 21, 1992, Pat. No. 5,408,580.

FIELD OF THE INVENTION

The present invention relates to data transmission systems, and more particularly, to an improved multi-carrier transmission system. The present invention further relates to audio compression and decompression systems.

BACKGROUND OF THE INVENTION

While digital audio recordings provide many advantages over analog systems, the data storage requirements for high fidelity recordings are substantial. A high fidelity recording typically requires more than one million bits per second of playback time. The total storage needed for even a short recording is too high for many computer applications. In addition, the digital bit rates inherent in non-compressed high fidelity audio recordings makes the transmission of such audio tracks over limited bandwidth transmission systems difficult. Hence, systems for compressing audio sound tracks to reduce the storage and bandwidth requirements are in great demand.

One class of prior art audio compression systems divide the sound track into a series of segments. Over the time interval represented by each segment, the sound track is analyzed to determine the signal components in each of a plurality of frequency bands. The measured components are then replaced by approximations requiring fewer bits to represent, but which preserve features of the sound track that are important to a human listener. At the receiver, an approximation to the original sound track is generated by reversing the analysis process with the approximations in place of the original signal components.

The analysis and synthesis operations are normally carried out with the aid of perfect, or near perfect, reconstruction filter banks. The systems in question include an analysis filter bank which generates a set of decimated subband outputs from a segment of the sound track. Each decimated subband output represents the signal in a predetermined frequency range. The inverse operation is carried out by a synthesis filter bank which accepts a set of decimated subband outputs and generates therefrom a segment of audio sound track. In practice, the synthesis and analysis filter banks are implemented on digital computers which may be general purpose computers or special computers designed to more efficiently carry out the operations. If the analysis and synthesis operations are carried out with sufficient precision, the segment of audio sound track generated by the synthesis filter bank will match the original segment of audio sound track that was inputted to the analysis filter bank. The differences between the reconstructed audio sound track and the original sound track can be made arbitrarily small. In this case, the specific filter bank characteristics such as the length of the segment analyzed, the number of filters in the filter bank, and the location and shape of filter response characteristics would be of little interest, since any set of filter banks satisfying the perfect, or near-perfect, reconstruction condition would exactly regenerate the audio segment.

Unfortunately, the replacement of the frequency components generated by the analysis filter bank with a quantized approximation thereto results in artifacts that do depend on the detail characteristics of the filter banks. There is no single segment length for which the artifacts in the reconstructed audio track can be minimized. Hence, the length of the segments analyzed in prior art systems is chosen to be a compromise. When the frequency components are replaced by approximations, an error is introduced in each component. An error in a given frequency component produces an acoustical effect which is equivalent to the introduction of a noise signal with frequency characteristics that depend on filter characteristics of the corresponding filter in the filter bank. The noise signal will be present over the entire segment of the reconstructed sound track. Hence, the length of the segments is reflected in the types of artifacts introduced by the approximations. If the segment is short, the artifacts are less noticeable. Hence, short segments are preferred. However, if the segment is too short, there is insufficient spectral resolution to acquire information needed to properly determine the minimum number of bits needed to represent each frequency component. On the other hand, if the segment is too long, temporal resolution of the human auditory system will detect artifacts.

Prior art systems also utilize filter banks in which the frequency bands are uniform in size. Systems with a few (16-32) sub-bands in a 0-22 kHz frequency range are generally called "subband coders" while those with a large number of sub-bands (≥64) are called "transform coders". It is known from psychophysical studies of the human auditory system that there are critical bandwidths which vary with frequency. The information in a critical band may be approximated by a component representing the time averaged signal amplitude in the critical band.

In addition, the ear's sensitivity to a noise source in the presence of a localized frequency component such as a sine tone depends on the relative levels of the signals and on the relation of the noise spectral components to the tone. The errors introduced by approximating the frequency components may be viewed as "noise". The noise becomes significantly less audible if its spectral energy is within one critical bandwidth of the tone. Hence, it is advantageous to use frequency decompositions which approximate the critical band structure of the auditory system.

Systems which utilize uniform frequency bands are poorly suited for systems designed to take advantage of this type of approximation. In principle, each audio segment can be analyzed to generate a large number of uniform frequency bands, and then, several bands at the higher frequencies could be merged to provide a decomposition into critical bands. This approach imposes the same temporal constraints on all frequency bands. That is, the time window over which the low frequency data is generated for each band is the same as the

time window over which each high-frequency band is generated. To provide accuracy in the low frequency ranges, the time window must be very long. This leads to temporal artifacts that become audible at higher frequencies. Hence, systems in which the audio segment is decomposed into uniform sub-bands with adequate low-frequency resolution cannot take full advantage of the critical band properties of the auditory system.

Prior art systems that recognize this limitation have attempted to solve the problem by utilizing analysis and synthesis filter banks based on QMF filter banks that analyze a segment of an audio sound track to generate frequency components in two frequency bands. To obtain a decomposition of the segment into frequency components representing the amplitudes of the signal in critical bands, these two frequency band QMF filters are arranged in a tree-structured configuration. That is, each of the outputs of the first level filter becomes the input to another filter bank at least one of whose two outputs is fed to yet another level, and so on. The leaf nodes of this tree provide an approximation to a critical band analysis of the input audio track. It can be shown that this type of filter bank used different length audio segments to generate the different frequency components. That is, a low frequency component represents the signal amplitude in an audio segment that is much longer than a high-frequency component. Hence, the need to choose a single compromise audio segment length is eliminated.

While tree structured filter banks having many layers may be used to decompose the frequency spectrum into critical bands, such filter banks introduce significant aliasing artifacts that limit their utility. In a multilevel filter bank, the aliasing artifacts are expected to increase exponentially with the number of levels. Hence, filter banks with large numbers of levels are to be avoided. Unfortunately, filter banks based on QMF filters which divide the signal into two bandlimited signals require large numbers of levels.

Prior art audio compression systems are also poorly suited to applications in which the playback of the material is to be carried out on a digital computer. The use of audio for computer applications is increasingly in demand. Audio is being integrated into multimedia applications such as computer based entertainment, training, and demonstration systems. Over the course of the next few years, many new personal computers will be outfitted with audio playback and recording capability. In addition, existing computers will be upgraded for audio with the addition of plug-in peripherals.

Computer based audio and video systems have been limited to the use of costly outboard equipment such as an analog laser disc player for playback of audio and video. This has limited the usefulness and applicability of such systems. With such systems it is necessary to provide a user with a highly specialized playback configuration, and there is no possibility of distributing the media electronically. However, personal computer based systems using compressed audio and video data promise to provide inexpensive playback solutions and allow distribution of program material on digital disks or over a computer network.

Until recently, the use of high quality audio on computer platforms has been limited due to the enormous data rate required for storage and playback. Quality has been compromised in order to store the audio data conveniently on disk. Although some increase in performance and some reduction in bandwidth has been gained using conventional audio compression methods, these improvements have not been sufficient to allow playback of high fidelity recordings on the commonly used computer platforms without the addition of expensive special purpose hardware.

One solution to this problem would be to use lower quality playback on computer platforms that lack the computational resources to decode compressed audio material at high fidelity quality levels. Unfortunately, this solution requires that the audio material be coded at various quality levels. Hence, each audio program would need to be stored in a plurality of formats. Different types of users would then be sent the format suited to their application. The cost and complexity of maintaining such multi-format libraries makes this solution unattractive. In addition, the storage requirements of the multiple formats partially defeats the basic goal of reducing the amount of storage needed to store the audio material.

Furthermore, the above discussion assumes that the computational resources of a particular playback platform are fixed. This assumption is not always true in practice. The computational resources of a computing system are often shared among a plurality of applications that are running in a time-shared environment. Similarly, communication links between the playback platform and shared storage facilities also may be shared. As the playback resources change, the format of the audio material must change in systems utilizing a multi-format compression approach. This problem has not been adequately solved in prior art systems.

In prior art multi-carrier systems, a communication path having a fixed bandwidth is divided into a number of sub-bands having different frequencies. The width of the sub-bands is chosen to be the same for all sub-bands and small enough to allow the distortion in each sub-band to be modeled by a single attenuation and phase shift for the band. If the noise level in each band is known, the volume of data sent in each band may be maximized for any given bit error rate by choosing a symbol set for each channel having the maximum number of symbols consistent with the available signal-to-noise ratio of the channel. By using each sub-band at its maximum capacity, the amount of data that can be transmitted in the communication path for a given error rate is maximized.

For example, consider a system in which one of the sub-channels has a signal-to-noise ratio which allows at least 16 digital levels to be distinguished from one another with an acceptable error rate. In this case, a symbol set having 16 possible signal values is chosen. If the incoming data stream is binary, each consecutive group of 4 bits is used to compute the corresponding symbol value which is then sent on the communication channel in the sub-band in question.

In digitally implemented multi-carrier systems, the actual synthesis of the signal representing the sum of the various modulated carriers is carried out via a mathematical transformation that generates a sequence of numbers that represents the amplitude of the signal as function of time. For example, a sum signal may be generated by applying an inverse Fourier transformation to a data vector generated from the symbols to be transmitted in the next time interval. Similarly, the symbols are recovered at the receiver using the corresponding inverse transformation.

The computational workload inherent in synthesizing and analyzing the multi-carrier signal is related to the number of sub-bands. For example, if Fourier transforms are utilized, the workload is of order N log N where N is the number of sub-bands. Similar relationships exist for other transforms. Hence, it is advantageous to minimize the number of sub-bands.

There are two factors that determine the number of sub-bands in prior art systems. First, the prior art systems utilize a uniform bandwidth. Hence, the number of sub-bands is at least as great as the total bandwidth available for transmission divided by the bandwidth of the smallest sub-band. The size