Abstract




George Saon, Daniel Povey and Geoffrey Zweig

N-best degree    Lattice link density
2                29.4
5                451.0
10               1709.7

Test set    Speaker-adapted decoding    LM rescoring + consensus
RT03        17.4%                       16.1%
DEV04       14.5%                       13.0%
RT04        16.4%                       15.2%

[Figure: WER (%) versus real-time factor (CPU time / audio time) for three likelihood computation methods: on-demand, hierarchical decoupled and hierarchical on-demand.]

Decoding graph statistics:

                      SI       SA
Phonetic context      2        3
Number of leaves      7.9K     21.5K
Number of words       32.9K    32.9K
Number of n-grams     3.9M     4.2M
Number of states      18.5M    26.7M
Number of arcs        44.5M    68.7M

Search statistics:

                           SI         SA
Word error rate            28.7%      19.0%
Search errors              2.2%       0.3%
Run-time factor            0.14xRT    0.55xRT
Likelihood/search ratio    60/40      55/45
Avg. Gaussians/frame       7.5K       43.5K
Max. states/frame          5.0K       15.0K


Experimental setup (1xRT system)

– EARS 2004 evaluation submission in the one times real-time (1xRT) category.
– Two-pass decoding scheme with three adaptation passes in between (VTLN, FMLLR, MLLR).



IBM T.J. Watson Research Center
Phone: (914)-945-2985, email: saon@watson.ibm.com

Viterbi search speed-ups

– Graph memory layout: graph stored as a linear array of arcs sorted by origin state
– Successor look-up table: maps static to dynamic state indices
– Running beam pruning: pruning based on current maximum score estimate
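The three speed-ups can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the decoder's actual code: the names `build_arc_array`, `expand_frame` and `frame_scores`, the `(origin, destination, cost)` arc tuples, the `first_arc` offset array and the use of a dictionary as the successor look-up table are all hypothetical choices made for illustration.

```python
import math


def build_arc_array(arcs, num_states):
    """Sort arcs by origin state and record where each state's arcs begin.

    arcs: iterable of (origin, destination, cost) tuples (assumed format).
    Returns the sorted arc list and an offset array first_arc such that the
    arcs leaving state s are arcs[first_arc[s]:first_arc[s + 1]].
    """
    arcs = sorted(arcs, key=lambda a: a[0])
    first_arc = [0] * (num_states + 1)
    for origin, _, _ in arcs:
        first_arc[origin + 1] += 1
    for s in range(num_states):
        first_arc[s + 1] += first_arc[s]
    return arcs, first_arc


def expand_frame(arcs, first_arc, active, frame_scores, beam):
    """Advance the search by one frame.

    active: list of (static_state, log_score) pairs.
    frame_scores: function mapping a destination state to its acoustic
    log-score for this frame (a stand-in for the likelihood computation).
    Returns the new active list after beam pruning.
    """
    successor = {}            # static state index -> index in the new active list
    new_states, new_scores = [], []
    running_max = -math.inf   # current estimate of the best score in this frame
    for state, score in active:
        for i in range(first_arc[state], first_arc[state + 1]):
            _, dest, cost = arcs[i]
            cand = score - cost + frame_scores(dest)
            if cand < running_max - beam:
                continue      # pruned against the running maximum
            j = successor.get(dest)
            if j is None:
                successor[dest] = len(new_states)
                new_states.append(dest)
                new_scores.append(cand)
            elif cand > new_scores[j]:
                new_scores[j] = cand
            running_max = max(running_max, cand)
    # The running maximum is exact only now, so prune once more at the end.
    return [(s, sc) for s, sc in zip(new_states, new_scores)
            if sc >= running_max - beam]
```

Because the running maximum only grows during a frame, a candidate pruned early would also fail the final threshold, so pruning against the estimate saves work without changing which states survive the final pass.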

Lattice generation


Keep track of the N-best distinct word sequences arriving at every state
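As a rough illustration of keeping the N-best distinct word sequences at a state, the sketch below merges the hypotheses already stored at a state with newly arriving ones. The function name `merge_nbest` and the `(score, words)` representation are assumptions made for this example; the poster does not specify how word sequences are stored or compared.

```python
def merge_nbest(existing, incoming, n):
    """Keep the n best distinct word sequences arriving at one state.

    Each hypothesis is a (score, words) pair, where words is a tuple of word
    labels and a higher score is better. If the same word sequence arrives
    more than once, only its best score is kept.
    """
    best = {}
    for score, words in list(existing) + list(incoming):
        if words not in best or score > best[words]:
            best[words] = score
    # Rank by descending score; ties are broken by the word sequence itself.
    ranked = sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(score, words) for words, score in ranked[:n]]
```

With N = 1 this reduces to ordinary Viterbi recombination; larger N keeps more competing word sequences alive at each state.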

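[Figure: lattice generation example over time, showing tokens and word traces for word sequences such as A CAT, A DOG, ONE CAT and THE CAT.]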

N-best degree Lattice link density

Speaker-adapted decoding LM rescoring + consensus

Likelihood computation

– Hierarchical decoupled
– On-demand
– Hierarchical on-demand


Anatomy of an extremely fast LVCSR decoder

We report in detail the decoding strategy that we used for the past two DARPA Rich Transcription evaluations (RT’03 and RT’04), which is based on finite state automata (FSA). We discuss the format of the static decoding graphs, the particulars of our Viterbi implementation, the lattice generation and the likelihood evaluation. Experimental results are given on the EARS database (English conversational telephone speech), with emphasis on our faster-than-real-time system.

Static decoding graphs

The decoding graphs are acceptors (instead of transducers). Arcs in the graph have three different types of labels:

– leaf labels (context-dependent output distributions),
– word labels and
– epsilon labels (e.g. due to LM back-off states).

There are two different types of states:

– emitting states, for which all incoming arcs are labeled by the same leaf, and
– null states, which have incoming arcs labeled by words or epsilon.

[Figure: example decoding graph showing emitting states and null states; arc labels include AW, AO, D, G, K, AE, T, EY and JH.]
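The arc label types and state types of the static decoding graphs can be captured in a small data structure. The following Python sketch is illustrative only: the names `LabelType`, `Arc` and `classify_states` are hypothetical, and leaf labels are shown as plain strings rather than context-dependent output distributions.

```python
from dataclasses import dataclass
from enum import Enum


class LabelType(Enum):
    LEAF = "leaf"          # context-dependent output distribution
    WORD = "word"
    EPSILON = "epsilon"    # e.g. due to LM back-off states


@dataclass(frozen=True)
class Arc:
    origin: int
    dest: int
    label_type: LabelType
    label: str = ""


def classify_states(arcs):
    """Classify each state with incoming arcs as "emitting" or "null".

    A state is emitting if all incoming arcs carry the same leaf label, and
    null if its incoming arcs carry only word or epsilon labels. Any other
    combination raises ValueError. States without incoming arcs (such as a
    start state) are not classified.
    """
    incoming = {}
    for arc in arcs:
        incoming.setdefault(arc.dest, []).append(arc)
    kinds = {}
    for state, arcs_in in incoming.items():
        types = {a.label_type for a in arcs_in}
        if types == {LabelType.LEAF}:
            leaves = {a.label for a in arcs_in}
            if len(leaves) != 1:
                raise ValueError(f"state {state}: incoming leaf labels differ")
            kinds[state] = "emitting"
        elif LabelType.LEAF not in types:
            kinds[state] = "null"
        else:
            raise ValueError(f"state {state}: mixes leaf and non-leaf arcs")
    return kinds
```

Since every arc entering an emitting state shares one leaf, the leaf can be treated as a property of the state, which is one reason a single acceptor label per arc is enough.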


