Using Syntactic Information for Improving why-Question Answering
Suzan Verberne, Lou Boves, Nelleke Oostdijk, Peter-Arno Coppen
Radboud University Nijmegen

Presenter: Sai Qian

Structure
• Introduction & Related work
• Paragraph retrieval for why-QA
• Answer re-ranking
• Discussion
• Future directions

Introduction & Related work
• 5% of all questions asked to QA systems are why-questions
• Difference from factoid questions
  – The answer cannot be stated in a single phrase
  – Paragraph retrieval instead of named entity retrieval
• Improving the QA system
  – Better retrieval technique (search engine)
  – Better ranking system
• Does syntactic knowledge about the question and the answer help?

Introduction & Related work
• A substantial amount of work has gone into improving QA systems by adding syntactic information
  – Tiedemann, 2005
  – Quarteroni et al., 2007
  – Higashinaka and Isozaki, 2008
• Syntactic information gives a small but significant improvement on top of the traditional bag-of-words approach

Paragraph retrieval for why-QA
• Baseline system
  – Wumpus Search Engine
  – Question analysis
    • Remove stop words
    • Remove punctuation
    • Remains: set of question content words
  – Ranking: QAP algorithm (passage scoring algorithm)
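
The question analysis step above can be sketched in a few lines of Python. This is only an illustration: the slides do not say which stop-word list the baseline used, so `STOP_WORDS` below is a small hypothetical list, and `question_content_words` is a name chosen for this example.

```python
import string

# Hypothetical, deliberately tiny stop-word list (an assumption for this
# sketch; the slides do not specify the list used by the baseline system).
STOP_WORDS = {
    "why", "do", "does", "did", "is", "are", "was", "were",
    "the", "a", "an", "of", "on", "in", "than", "to",
}


def question_content_words(question: str) -> list[str]:
    """Lowercase the question, strip punctuation, and drop stop words."""
    no_punct = question.lower().translate(
        str.maketrans("", "", string.punctuation)
    )
    return [tok for tok in no_punct.split() if tok not in STOP_WORDS]


print(question_content_words("Why do women live longer than men on average?"))
# ['women', 'live', 'longer', 'men', 'average']
```

The resulting content words would then form the query sent to the search engine.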

Paragraph retrieval for why-QA
• Evaluation
  – Manual assessment
  – Metrics: Success@10, Success@150, Mean Reciprocal Rank@150
  – Answer & document retrieval
• Result
• Improvement
  – Retrieval
  – Ranking
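
For readers unfamiliar with the evaluation measures named on this slide, the sketch below computes Success@n and Mean Reciprocal Rank@n using their standard definitions. The relevance judgments in `results` are invented toy data, not results from the paper, and the function names are chosen for this example.

```python
def reciprocal_rank(relevance: list[bool], n: int) -> float:
    """Return 1/rank of the first relevant answer in the top n, else 0.0."""
    for rank, is_relevant in enumerate(relevance[:n], start=1):
        if is_relevant:
            return 1.0 / rank
    return 0.0


def success_at(results: list[list[bool]], n: int) -> float:
    """Fraction of questions with at least one relevant answer in the top n."""
    return sum(any(r[:n]) for r in results) / len(results)


def mrr_at(results: list[list[bool]], n: int) -> float:
    """Mean of the reciprocal ranks over all questions, cut off at rank n."""
    return sum(reciprocal_rank(r, n) for r in results) / len(results)


# Toy judgments (invented): one ranked list of relevance flags per question.
results = [
    [False, True, False],          # first relevant answer at rank 2
    [True, False, False],          # rank 1
    [False, False, False],         # no relevant answer retrieved
    [False, False, False, True],   # rank 4
]

print(success_at(results, 3))   # 0.5
print(success_at(results, 10))  # 0.75
print(mrr_at(results, 10))      # 0.4375
```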

Answer re-ranking
• QAP algorithm (baseline system)
  – Term overlap between query and passage
  – Passage length
  – Total corpus frequency for each term
• Example
  – Why do people sneeze?
  – Why do women live longer than men on average?
  – Why are mountain tops cold?
• The aim: find the syntactic information that discloses a relation between the question and its answer

Answer re-ranking
• Re-ranking system
  – Idea: term overlap
  – Term: a subset of question terms
  – Feature: a set of question items and a set of answer items
  – Proportion:
• Defined features: 32 in total
  – F1: head; F2: modifier; F3: noun phrase;
  – F4: subject; F6: main verb; F10: direct object;
  – …
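
The formula for the overlap proportion did not survive in these slides, so the sketch below does not reproduce it. It shows one plausible symmetric overlap measure as an assumption: the number of question items found among the answer items, plus the number of answer items found among the question items, divided by the total number of items. The function name and the example items are hypothetical.

```python
def overlap_proportion(question_items: list[str], answer_items: list[str]) -> float:
    """Symmetric overlap between two item lists (an assumed definition).

    (question items found in the answer + answer items found in the question)
    divided by (number of question items + number of answer items).
    """
    total = len(question_items) + len(answer_items)
    if total == 0:
        return 0.0
    q_set, a_set = set(question_items), set(answer_items)
    q_in_a = sum(1 for item in question_items if item in a_set)
    a_in_q = sum(1 for item in answer_items if item in q_set)
    return (q_in_a + a_in_q) / total


# Hypothetical "main verb" feature: lemmatized main verbs of the question
# versus the main verbs found in a candidate answer passage.
print(overlap_proportion(["live"], ["live", "die", "be"]))  # 0.5
```

Under this assumption each of the 32 features would yield one such value per question-answer pair.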

Answer re-ranking
• Feature extraction
  – Parser
    • Pelican Parser: more detailed
    • EP4IR Dependency Parser: more robust
  – Lemmatization
    • “sailors of the old”
    • Applied only to verbs
• Re-ranking
  – Scoring: 0-10 for each feature
  – Feature selection: genetic algorithm (optimize MRR)
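
A minimal sketch of the re-ranking step, assuming that the 0-10 value on the slide acts as a per-feature weight and that a candidate's score is the weighted sum of its feature values. The feature names, weights, and candidate values below are invented for illustration. The genetic algorithm itself is not implemented here; it would search over weight settings and keep those that maximize MRR.

```python
def rerank(candidates: list[tuple[str, dict[str, float]]],
           weights: dict[str, int]) -> list[str]:
    """Sort candidate answers by the weighted sum of their feature values."""
    def score(features: dict[str, float]) -> float:
        # Features without a weight contribute nothing to the score.
        return sum(weights.get(name, 0) * value
                   for name, value in features.items())

    ranked = sorted(candidates, key=lambda cand: score(cand[1]), reverse=True)
    return [answer_id for answer_id, _ in ranked]


# Hypothetical weights in the 0-10 range and invented feature values.
weights = {"qap": 5, "main_verb": 3, "cue_word": 2}
candidates = [
    ("p1", {"qap": 0.9, "main_verb": 0.0, "cue_word": 0.0}),  # score 4.5
    ("p2", {"qap": 0.6, "main_verb": 0.5, "cue_word": 1.0}),  # score 6.5
    ("p3", {"qap": 0.7, "main_verb": 0.0, "cue_word": 1.0}),  # score 5.5
]

print(rerank(candidates, weights))  # ['p2', 'p3', 'p1']
```

In this toy case the passage with the highest baseline score (p1) drops to the bottom once the other features are weighted in.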

Answer re-ranking
• Result
• Features that substantially contribute to the ranking score

Discussion
• Error analysis
  – No effect: 35/93
    • 25/35 no relevant answer
    • 10/35 RR=1
  – Improve: 40/93
  – Deteriorate: 18/93
  – 11 drop out of the top 10, 22 enter the top 10

Discussion
• Examples of deteriorated QA pairs
  – Why do neutral atoms have the same number of protons as electrons? (answer in “Oxidation number”)
  – Why do flies walk on food? (answer in “Insect Habitat”)
  – Why is Wisconsin called the Badger State? (answer in “Wisconsin”)
• Reason
  – No lexical overlap between the question focus and the document title
  – Feature 28 & Feature 13

Discussion
• Feature selection analysis
  – QAP: baseline system
  – Cue words: because, since, therefore, in order to, due to, …
  – Main verbs: lemmatization leads to more matches
  – Question focus & document title
• Parser comparison
  – Only EP4IR is applied to the answer documents
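
The cue-word feature from the feature selection analysis above can be illustrated with a simple phrase matcher. The cue phrases are the ones listed on the slide; the function name and the example passages are invented for this sketch, and the real system may detect cue words differently.

```python
import re

# Cue phrases as listed on the slide (the slide's list is open-ended).
CUE_PHRASES = ["because", "since", "therefore", "in order to", "due to"]

# Match whole words or phrases only, case-insensitively.
_CUE_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(p) for p in CUE_PHRASES) + r")\b",
    re.IGNORECASE,
)


def cue_phrases_in(passage: str) -> list[str]:
    """Return the cue phrases found in a passage, in order of appearance."""
    return [m.group(0).lower() for m in _CUE_PATTERN.finditer(passage)]


# Invented toy passages.
print(cue_phrases_in("Mountain tops are cold due to lower air pressure."))
# ['due to']
print(cue_phrases_in("Sincerely yours"))
# []
```

The word-boundary anchors keep "since" from matching inside "Sincerely".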

Future directions
• Improving retrieval
• Collecting a larger data collection: improve feature selection
• Investigating extra information for why-questions other than syntactic description
• Improving the EP4IR parser in constituent extraction

The End

Thank you very much!
