Erratum to Incremental Syntactic Language Models for Phrase-based Translation

Lane Schwartz
Air Force Research Laboratory
Wright-Patterson AFB, OH USA
[email protected]

Chris Callison-Burch
Johns Hopkins University
Baltimore, MD USA
[email protected]

William Schuler
Ohio State University
Columbus, OH USA
[email protected]

Stephen Wu
Mayo Clinic
Rochester, MN USA
[email protected]

Abstract

Schwartz et al. (2011) presented a novel technique for incorporating syntactic knowledge into phrase-based machine translation through incremental syntactic parsing, and presented empirical results on a constrained Urdu-English translation task. The work contained an error in the description of the experimental setup, which was discovered subsequent to publication. After correcting the error, no improvement in BLEU score is seen over the baseline when the syntactic language model is used on the constrained Urdu-English translation task. The error does not affect the originally reported perplexity results.

Opinions, interpretations, conclusions, and recommendations are those of the authors and are not necessarily endorsed by the sponsors or the United States Air Force. Material presented here was cleared for public release with Case Number 88ABW-2010-6489 on 10 Dec 2010 and with Case Number 88ABW-2012-0302 on 19 Jan 2012.

1 Error

Schwartz et al. (2011) presented a novel technique for incorporating syntactic knowledge into phrase-based machine translation through incremental syntactic parsing. That work contained an error in the description of the experimental setup, which was discovered subsequent to publication. The penultimate sentence of Section 6 stated that during MERT (Och, 2003), “we tuned the parameters using a constrained dev set (only sentences with 1-20 words).”

While this was the intended experimental configuration, a re-examination of the experiment subsequent to publication revealed that for the condition where the HHMM syntactic language model was used in addition to the n-gram language model (HHMM + n-gram), tuning was actually performed using a constrained dev set of sentences with 1-40 words. As a result of this error, the BLEU scores reported in Figure 9 do not represent directly comparable experimental conditions, since the dev set used for tuning differed (sentences with 1-20 words for n-gram only versus sentences with 1-40 words for HHMM + n-gram). Because the results are not comparable, the claims of statistically significant improvements to translation quality are not justified. In order to provide comparable results, we re-ran the n-gram only configuration, tuning with a constrained dev set of 1-40 words to match the configuration actually used for HHMM + n-gram. The corrections are listed below.
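As a point of reference, the "constrained dev set" at issue is simply the development bitext filtered by sentence length before MERT is run. The following is a minimal sketch of such a filter; the file names, and the assumption that length is measured on the source side, are illustrative and do not describe the actual experimental scripts.

# Filter a parallel dev set by sentence length prior to MERT.
# File names are hypothetical; the 1-20 and 1-40 bounds are the two
# tuning conditions discussed in this erratum.
def filter_parallel(src_in, tgt_in, src_out, tgt_out, min_len, max_len):
    """Keep sentence pairs whose source side has min_len..max_len tokens."""
    with open(src_in) as fs, open(tgt_in) as ft, \
         open(src_out, "w") as gs, open(tgt_out, "w") as gt:
        for src, tgt in zip(fs, ft):
            if min_len <= len(src.split()) <= max_len:
                gs.write(src)
                gt.write(tgt)

if __name__ == "__main__":
    # The condition described in the paper (sentences with 1-20 words) ...
    filter_parallel("dev.ur", "dev.en", "dev.1-20.ur", "dev.1-20.en", 1, 20)
    # ... versus the condition actually used when tuning HHMM + n-gram.
    filter_parallel("dev.ur", "dev.en", "dev.1-40.ur", "dev.1-40.en", 1, 40)

Because the 1-40 word set contains longer sentences, MERT can arrive at noticeably different feature weights, which is why the two tuning conditions are not directly comparable.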

2 List of Corrections

• Abstract, final sentence:

We present empirical results on a constrained Urdu-English translation task that demonstrate a significant BLEU score improvement and a large decrease in perplexity.

should become

We present empirical results on a constrained Urdu-English translation task that demonstrate a large decrease in perplexity but no significant improvement to BLEU score.

• Section 1, final sentence:

Integration with Moses (§5) along with empirical results for perplexity and significant translation score improvement on a constrained Urdu-English task (§6)

should become

Integration with Moses (§5) along with empirical results for perplexity and translation scores on a constrained Urdu-English task (§6)

• Section 7, sentence 5:

The translation quality significantly improved on a constrained task, and the perplexity improvements suggest that interpolating between n-gram and syntactic LMs may hold promise on larger data sets.

should become

While translation quality did not significantly improve on a constrained task, the perplexity improvements suggest that interpolating between n-gram and syntactic LMs may hold promise on larger data sets.

• Section 6, final two sentences:

Due to this slowdown, we tuned the parameters using a constrained dev set (only sentences with 1-20 words), and tested using a constrained devtest set (only sentences with 1-20 words). Figure 9 shows a statistically significant improvement to the BLEU score when using the HHMM and the n-gram LMs together on this reduced test set.

should become

Due to this slowdown, we tuned the parameters using a constrained dev set (only sentences with 1-40 words), and tested using a constrained devtest set (only sentences with 1-20 words). Figure 9 shows no statistically significant improvement to the BLEU score when using the HHMM and the n-gram LMs together on this reduced test set.

• Figure 9:

Moses LM(s)      BLEU
n-gram only      18.78
HHMM + n-gram    19.78

should become

Moses LM(s)      BLEU
n-gram only      21.43
HHMM + n-gram    21.72
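The corrected Section 7 sentence above refers to interpolating between n-gram and syntactic LMs. For readers unfamiliar with the idea, the following is a minimal sketch of linear interpolation between two language models, together with the perplexity measure that the erratum leaves unchanged; the mixture weight lam and the callables p_ngram and p_hhmm are hypothetical placeholders, not artifacts of the original experiments.

import math

def interpolated_logprob(w, history, p_ngram, p_hhmm, lam=0.5):
    """Linear interpolation of two LMs:
    P(w | h) = lam * P_ngram(w | h) + (1 - lam) * P_hhmm(w | h).
    p_ngram and p_hhmm are callables returning probabilities in (0, 1]."""
    return math.log(lam * p_ngram(w, history)
                    + (1.0 - lam) * p_hhmm(w, history))

def perplexity(corpus, logprob_fn):
    """Perplexity: exp of the average negative log-probability per token."""
    total, count = 0.0, 0
    for sent in corpus:
        history = []
        for w in sent:
            total += logprob_fn(w, tuple(history))
            history.append(w)
            count += 1
    return math.exp(-total / count)

In practice logprob_fn could be built with functools.partial(interpolated_logprob, p_ngram=..., p_hhmm=...), and lam would be chosen to minimize perplexity on held-out data.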

3 Conclusion

The description of the experimental setup in Schwartz et al. (2011) contained an error that was discovered subsequent to publication. The description stated that MERT was performed on a constrained dev set of sentences with 1-20 words. In fact, one of the experimental conditions (HHMM + n-gram) was instead tuned on a constrained dev set of sentences with 1-40 words. This error has been corrected; after correction, no statistically significant improvement to translation quality is seen in terms of BLEU score. The error does not affect the originally reported perplexity results.
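The conclusion turns on whether a BLEU difference is statistically significant. The erratum does not state which significance test was used, but paired bootstrap resampling over the test set is a common choice in machine translation evaluation. The sketch below operates on precomputed per-sentence scores, which is a simplification: BLEU significance tests usually recompute corpus-level BLEU on each resample.

import random

def paired_bootstrap(scores_a, scores_b, n_samples=1000, seed=0):
    """Paired bootstrap test on per-sentence quality scores.
    Returns the fraction of resampled test sets on which system B
    outscores system A; a fraction near 1.0 suggests B's advantage is
    unlikely to be an artifact of the particular test-set sample."""
    assert len(scores_a) == len(scores_b)
    rng = random.Random(seed)
    n = len(scores_a)
    wins = 0
    for _ in range(n_samples):
        idx = [rng.randrange(n) for _ in range(n)]
        if sum(scores_b[i] for i in idx) > sum(scores_a[i] for i in idx):
            wins += 1
    return wins / n_samples

Under such a test, a small corpus-level gap like the one in the corrected Figure 9 can easily fail to reach significance, consistent with the corrected conclusion.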

References

Franz Och. 2003. Minimum error rate training in statistical machine translation. In Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics, pages 160–167, Sapporo, Japan, July.

Lane Schwartz, Chris Callison-Burch, William Schuler, and Stephen Wu. 2011. Incremental syntactic language models for phrase-based translation. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pages 620–631, Portland, Oregon, June.
