US00RE42949E
(19) United States
(12) Reissued Patent
     Tzannes et al.
(10) Patent Number: US RE42,949 E
(45) Date of Reissued Patent: Nov. 22, 2011
(54) STEREOPHONIC AUDIO SIGNAL DECOMPRESSION SWITCHING TO MONAURAL AUDIO SIGNAL
(75) Inventors: Michael A. Tzannes, Lexington, MA (US); Peter N. Heller, Somerville, MA (US); John P. Stautner, The Woodlands, TX (US); William R. Morrell, Santa Cruz, CA (US); Sriram Jayasimha, Bangalore (IN)
(73) Assignee: Hybrid Audio LLC, Tyler, TX (US)
(21) Appl. No.: 11/898,920
(22) Filed: Sep. 14, 2007 (Under 37 CFR 1.47)
Related U.S. Patent Documents
Reissue of:
(64) Patent No.: 6,252,909
     Issued: Jun. 26, 2001
     Appl. No.: 08/804,909
     Filed: Feb. 25, 1997
U.S. Applications:
(60) Continuation of application No. 10/994,925, filed on Nov. 23, 2004, now Pat. No. Re. 40,281, which is a division of application No. 10/603,833, filed on Jun. 26, 2003, now abandoned, which is a continuation-in-part of application No. 08/307,331, filed on Sep. 16, 1994, now Pat. No. 5,606,642, which is a division of application No. 07/948,147, filed on Sep. 21, 1992, now Pat. No. 5,408,580.
(51) Int. Cl.
     G01L 19/02 (2006.01)
     H04H 40/81 (2008.01)
     H04K 1/10 (2006.01)
     H04B 1/38 (2006.01)
     H03D 1/24 (2006.01)
(52) U.S. Cl. .... 375/260; 704/205; 704/230; 704/500; 381/11; 375/219; 329/357
(58) Field of Classification Search .... 704/200, 704/200.1, 201, 205, 206, 500, 501, 502, 704/503, 504; 708/300, 313, 318; 381/11
     See application file for complete search history.
Primary Examiner: Martin Lerner
(74) Attorney, Agent, or Firm: Jason H. Vick; Sheridan Ross P.C.
(57) ABSTRACT
[A communication system for sending a sequence of symbols on a communication link. The system includes a transmitter for placing information indicative of the sequence of symbols on the communication link and a receiver for receiving the information placed on the communication link by the transmitter. The transmitter includes a clock for defining successive frames, each of the frames including M time intervals, where M is an integer greater than 1. A modulator modulates each of M carrier signals with a signal related to the value of one of the symbols thereby generating a modulated carrier signal corresponding to each of the carrier signals. The modulated carriers are combined into a sum signal which is transmitted on the communication link. The carrier signals include first and second carriers, the first carrier having a different bandwidth than the second carrier. In one embodiment, the modulator includes a tree-structured array of filter banks having M leaf nodes, each of the values related to the symbols forming an input to a corresponding one of the leaf nodes. Each of the nodes includes one of the filter banks. Similarly, the receiver can be constructed of a tree-structured array of sub-band filter banks for converting M time-domain samples received on the communication link to M symbol values.] A stereophonic audio signal decompression method that includes decoding, using a decoder, a compressed stereophonic audio signal. A de-quantizer de-quantizes the compressed stereophonic audio signal to generate sets of frequency components for synthesizing left and right audio signals. A controller switches to constructing a single set of frequency components by averaging corresponding frequency components in the left and right audio signals when a computational workload exceeds a capacity of a decompression system and a synthesizer synthesizes a monaural audio time domain signal.
1 Claim, 16 Drawing Sheets
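The stereo-to-mono fallback summarized in the abstract can be illustrated with a short sketch. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `select_component_sets`, the scalar `workload` and `capacity` arguments, and the use of NumPy arrays for the de-quantized frequency components are all hypothetical choices made for the example.

```python
import numpy as np

def select_component_sets(left, right, workload, capacity):
    """Return the frequency-component sets that a synthesizer should process.

    left, right: de-quantized frequency components for the two channels
                 (hypothetical representation: equal-length 1-D arrays).
    workload, capacity: hypothetical scalar measures of decoder load.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    if workload > capacity:
        # Overloaded: average corresponding components into one monaural set,
        # so only a single synthesis pass is needed.
        return [(left + right) / 2.0]
    # Enough headroom: keep both sets and synthesize left and right separately.
    return [left, right]
```

Averaging before synthesis is what saves work in this sketch: the synthesis step runs once on the averaged set instead of once per channel.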
[Front-page drawing: block diagram with blocks labeled DECODER, DE-QUANTIZER, SWITCH/ADDER, and CONTROLLER; reference numerals 800, 802, 804, 814]
[Sheet 4 of 16: drawing content not recoverable]
[Sheet 5 of 16: drawing content not recoverable]
[Sheet 6 of 16: FIG. 6, plot with axes BAND (FREQUENCY) and TIME]
[Sheet 7 of 16: FIG. 7; FIG. 8(A), with labels LOW-FREQ, AUDIO SIGNAL, BANDPASS FILTER, FILTERED AUDIO SIGNAL, MODULATION SIGNAL; FIG. 8(B), with labels AUDIO SAMPLES, SAMPLE SHIFT REGISTER, GENERATE 2M POLYPHASE COMPONENTS, MATRIX MULT. TO GENERATE M SUB-BAND COMPONENTS, CONTROLLER]
[Sheet 8 of 16: FIG. 9, with labels PSYCHO-ACOUSTIC ANALYZER, AUDIO, SUB-BAND ANALYSIS FILTER BANK 402]
[Sheet 10 of 16: drawing content not recoverable]
[Sheet 11 of 16: drawing content not recoverable]
[Sheet 12 of 16: drawing content not recoverable]
[Sheet 15 of 16: drawing content not recoverable]
[Sheet 16 of 16: drawing content not recoverable]
STEREOPHONIC AUDIO SIGNAL DECOMPRESSION SWITCHING TO MONAURAL AUDIO SIGNAL
Matter enclosed in heavy brackets [ ] appears in the original patent but forms no part of this reissue specification; matter printed in italics indicates the additions made by reissue.
CROSS-REFERENCE TO RELATED APPLICATIONS
This application is a Continuation of U.S. Reissue application Ser. No. 10/994,925, now Reissue Pat. No. RE 40,281, which is a Division of U.S. Reissue application Ser. No. 10/603,833, filed Jun. 26, 2003, now abandoned, which is a Reissue of U.S. application Ser. No. 08/804,909, filed Feb. 25, 1997, now U.S. Pat. No. 6,252,909, issued Jun. 26, 2001. U.S. application Ser. No. 08/804,909, filed Feb. 25, 1997, is a Continuation-in-Part of U.S. patent application Ser. No. 08/307,331, filed Sep. 16, 1994, Pat. No. 5,606,642, which is a division of U.S. patent application Ser. No. 07/948,147, filed Sep. 21, 1992, Pat. No. 5,408,580.
FIELD OF THE INVENTION
The present invention relates to data transmission systems, and more particularly, to an improved multi-carrier transmission system. The present invention further relates to audio compression and decompression systems.
BACKGROUND OF THE INVENTION
While digital audio recordings provide many advantages over analog systems, the data storage requirements for high fidelity recordings are substantial. A high fidelity recording typically requires more than one million bits per second of playback time. The total storage needed for even a short recording is too high for many computer applications. In addition, the digital bit rates inherent in non-compressed high fidelity audio recordings make the transmission of such audio tracks over limited bandwidth transmission systems difficult. Hence, systems for compressing audio sound tracks to reduce the storage and bandwidth requirements are in great demand.
One class of prior art audio compression systems divides the sound track into a series of segments. Over the time interval represented by each segment, the sound track is analyzed to determine the signal components in each of a plurality of frequency bands. The measured components are then replaced by approximations requiring fewer bits to represent, but which preserve features of the sound track that are important to a human listener. At the receiver, an approximation to the original sound track is generated by reversing the analysis process with the approximations in place of the original signal components.
The analysis and synthesis operations are normally carried out with the aid of perfect, or near perfect, reconstruction filter banks. The systems in question include an analysis filter bank which generates a set of decimated subband outputs from a segment of the sound track. Each decimated subband output represents the signal in a predetermined frequency range. The inverse operation is carried out by a synthesis filter bank which accepts a set of decimated subband outputs and generates therefrom a segment of audio sound track. In practice, the synthesis and analysis filter banks are implemented on digital computers which may be general purpose computers or special computers designed to more efficiently carry out the operations. If the analysis and synthesis operations are carried out with sufficient precision, the segment of audio sound track generated by the synthesis filter bank will match the original segment of audio sound track that was inputted to the analysis filter bank. The differences between the reconstructed audio sound track and the original sound track can be made arbitrarily small. In this case, the specific filter bank characteristics such as the length of the segment analyzed, the number of filters in the filter bank, and the location and shape of filter response characteristics would be of little interest, since any set of filter banks satisfying the perfect, or near-perfect, reconstruction condition would exactly regenerate the audio segment.
Unfortunately, the replacement of the frequency components generated by the analysis filter bank with a quantized approximation thereto results in artifacts that do depend on the detail characteristics of the filter banks. There is no single segment length for which the artifacts in the reconstructed audio track can be minimized. Hence, the length of the segments analyzed in prior art systems is chosen to be a compromise. When the frequency components are replaced by approximations, an error is introduced in each component. An error in a given frequency component produces an acoustical effect which is equivalent to the introduction of a noise signal with frequency characteristics that depend on filter characteristics of the corresponding filter in the filter bank. The noise signal will be present over the entire segment of the reconstructed sound track. Hence, the length of the segments is reflected in the types of artifacts introduced by the approximations. If the segment is short, the artifacts are less noticeable. Hence, short segments are preferred. However, if the segment is too short, there is insufficient spectral resolution to acquire information needed to properly determine the minimum number of bits needed to represent each frequency component. On the other hand, if the segment is too long, temporal resolution of the human auditory system will detect artifacts.
Prior art systems also utilize filter banks in which the frequency bands are uniform in size. Systems with a few (16-32) sub-bands in a 0-22 kHz frequency range are generally called "subband coders" while those with a large number of sub-bands (≥64) are called "transform coders". It is known from psychophysical studies of the human auditory system that there are critical bandwidths which vary with frequency. The information in a critical band may be approximated by a component representing the time averaged signal amplitude in the critical band.
In addition, the ear's sensitivity to a noise source in the presence of a localized frequency component such as a sine tone depends on the relative levels of the signals and on the relation of the noise spectral components to the tone. The errors introduced by approximating the frequency components may be viewed as "noise". The noise becomes significantly less audible if its spectral energy is within one critical bandwidth of the tone. Hence, it is advantageous to use frequency decompositions which approximate the critical band structure of the auditory system.
Systems which utilize uniform frequency bands are poorly suited for systems designed to take advantage of this type of approximation. In principle, each audio segment can be analyzed to generate a large number of uniform frequency bands, and then, several bands at the higher frequencies could be merged to provide a decomposition into critical bands. This approach imposes the same temporal constraints on all frequency bands. That is, the time window over which the low frequency data is generated for each band is the same as the
time window over which each high-frequency band is generated. To provide accuracy in the low frequency ranges, the time window must be very long. This leads to temporal artifacts that become audible at higher frequencies. Hence, systems in which the audio segment is decomposed into uniform sub-bands with adequate low-frequency resolution cannot take full advantage of the critical band properties of the auditory system.
Prior art systems that recognize this limitation have attempted to solve the problem by utilizing analysis and synthesis filter banks based on QMF filter banks that analyze a segment of an audio sound track to generate frequency components in two frequency bands. To obtain a decomposition of the segment into frequency components representing the amplitudes of the signal in critical bands, these two frequency band QMF filters are arranged in a tree-structured configuration. That is, each of the outputs of the first level filter becomes the input to another filter bank at least one of whose two outputs is fed to yet another level, and so on. The leaf nodes of this tree provide an approximation to a critical band analysis of the input audio track. It can be shown that this type of filter bank uses different length audio segments to generate the different frequency components. That is, a low frequency component represents the signal amplitude in an audio segment that is much longer than a high-frequency component. Hence, the need to choose a single compromise audio segment length is eliminated.
While tree structured filter banks having many layers may be used to decompose the frequency spectrum into critical bands, such filter banks introduce significant aliasing artifacts that limit their utility. In a multilevel filter bank, the aliasing artifacts are expected to increase exponentially with the number of levels. Hence, filter banks with large numbers of levels are to be avoided. Unfortunately, filter banks based on QMF filters which divide the signal into two bandlimited signals require large numbers of levels.
Prior art audio compression systems are also poorly suited to applications in which the playback of the material is to be carried out on a digital computer. The use of audio for computer applications is increasingly in demand. Audio is being integrated into multimedia applications such as computer based entertainment, training, and demonstration systems. Over the course of the next few years, many new personal computers will be outfitted with audio playback and recording capability. In addition, existing computers will be upgraded for audio with the addition of plug-in peripherals.
Computer based audio and video systems have been limited to the use of costly outboard equipment such as an analog laser disc player for playback of audio and video. This has limited the usefulness and applicability of such systems. With such systems it is necessary to provide a user with a highly specialized playback configuration, and there is no possibility of distributing the media electronically. However, personal computer based systems using compressed audio and video data promise to provide inexpensive playback solutions and allow distribution of program material on digital disks or over a computer network.
Until recently, the use of high quality audio on computer platforms has been limited due to the enormous data rate required for storage and playback. Quality has been compromised in order to store the audio data conveniently on disk. Although some increase in performance and some reduction in bandwidth has been gained using conventional audio compression methods, these improvements have not been sufficient to allow playback of high fidelity recordings on the commonly used computer platforms without the addition of expensive special purpose hardware.
One solution to this problem would be to use lower quality playback on computer platforms that lack the computational resources to decode compressed audio material at high fidelity quality levels. Unfortunately, this solution requires that the audio material be coded at various quality levels. Hence, each audio program would need to be stored in a plurality of formats. Different types of users would then be sent the format suited to their application. The cost and complexity of maintaining such multi-format libraries make this solution unattractive. In addition, the storage requirements of the multiple formats partially defeat the basic goal of reducing the amount of storage needed to store the audio material.
Furthermore, the above discussion assumes that the computational resources of a particular playback platform are fixed. This assumption is not always true in practice. The computational resources of a computing system are often shared among a plurality of applications that are running in a time-shared environment. Similarly, communication links between the playback platform and shared storage facilities also may be shared. As the playback resources change, the format of the audio material must change in systems utilizing a multi-format compression approach. This problem has not been adequately solved in prior art systems.
In prior art multi-carrier systems, a communication path having a fixed bandwidth is divided into a number of sub-bands having different frequencies. The width of the sub-bands is chosen to be the same for all sub-bands and small enough to allow the distortion in each sub-band to be modeled by a single attenuation and phase shift for the band. If the noise level in each band is known, the volume of data sent in each band may be maximized for any given bit error rate by choosing a symbol set for each channel having the maximum number of symbols consistent with the available signal-to-noise ratio of the channel. By using each sub-band at its maximum capacity, the amount of data that can be transmitted in the communication path for a given error rate is maximized.
For example, consider a system in which one of the sub-channels has a signal-to-noise ratio which allows at least 16 digital levels to be distinguished from one another with an acceptable error rate. In this case, a symbol set having 16 possible signal values is chosen. If the incoming data stream is binary, each consecutive group of 4 bits is used to compute the corresponding symbol value which is then sent on the communication channel in the sub-band in question.
In digitally implemented multi-carrier systems, the actual synthesis of the signal representing the sum of the various modulated carriers is carried out via a mathematical transformation that generates a sequence of numbers that represents
the amplitude of the signal as a function of time. For example, a sum signal may be generated by applying an inverse Fourier transformation to a data vector generated from the symbols to be transmitted in the next time interval. Similarly, the symbols are recovered at the receiver using the corresponding inverse transformation.
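The 16-level example and the inverse-Fourier synthesis described above can be sketched as follows. This is an illustrative sketch only, under several assumptions that are not in the source: every sub-band carries a 16-value symbol, the 16 values are placed on a 4x4 grid of complex amplitudes, the link is noiseless, and NumPy's `ifft`/`fft` stand in for the transmitter and receiver transformations. All function names are hypothetical.

```python
import numpy as np

BITS_PER_SYMBOL = 4  # 16 distinguishable levels carry 4 bits per symbol

def bits_to_symbols(bits):
    """Group a binary stream into consecutive 4-bit integers in 0..15."""
    groups = np.asarray(bits, dtype=int).reshape(-1, BITS_PER_SYMBOL)
    weights = 1 << np.arange(BITS_PER_SYMBOL - 1, -1, -1)  # [8, 4, 2, 1]
    return groups @ weights

def symbols_to_amplitudes(symbols):
    """Map 0..15 onto a 4x4 grid of complex amplitudes (assumed mapping)."""
    levels = np.array([-3.0, -1.0, 1.0, 3.0])
    return levels[symbols >> 2] + 1j * levels[symbols & 3]

def amplitudes_to_symbols(points):
    """Invert symbols_to_amplitudes by rounding to the nearest grid level."""
    def nearest(values):
        return np.clip(np.rint((values + 3.0) / 2.0), 0, 3).astype(int)
    return (nearest(points.real) << 2) | nearest(points.imag)

def transmit(symbols):
    """Synthesize one frame of the sum signal: one sub-band per symbol."""
    return np.fft.ifft(symbols_to_amplitudes(symbols))

def receive(samples):
    """Recover the symbols by undoing the transmitter's transformation."""
    return amplitudes_to_symbols(np.fft.fft(samples))
```

For instance, `receive(transmit(bits_to_symbols(bits)))` returns the same symbol values that `bits_to_symbols(bits)` produced, because the forward transform exactly undoes the inverse transform when no channel distortion is modeled.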
The computational workload inherent in synthesizing and analyzing the multi-carrier signal is related to the number of sub-bands. For example, if Fourier transforms are utilized, the workload is of order N log N, where N is the number of sub-bands. Similar relationships exist for other transforms. Hence, it is advantageous to minimize the number of sub-bands.
There are two factors that determine the number of sub-bands in prior art systems. First, the prior art systems utilize a uniform bandwidth. Hence, the number of sub-bands is at least as great as the total bandwidth available for transmission divided by the bandwidth of the smallest sub-band. The size