Global Inference for Bridging Anaphora Resolution
Yufang Hou (1), Katja Markert (2), Michael Strube (1)
(1) Heidelberg Institute of Theoretical Studies gGmbH, Heidelberg, Germany
(2) School of Computing, University of Leeds, UK
Bridging Examples on OntoNotes
Bridging Example: If Mr. McDonough’s plans get executed, as much as possible of the Polish center will be made from aluminum, steel and glass recycled from Warsaw’s abundant rubble. A 20-story mesh spire will stand atop 50 stories of commercial space. The windows will open. The carpets won’t be glued down and walls will be coated with non-toxic finishes.
Part-of relation: the anaphors "The windows", "The carpets" and "walls" all bridge to the antecedent "the Polish center".
Bridging Examples on OntoNotes
Bridging Example: Still employees do occasionally try to smuggle out a gem or two. One man wrapped several diamonds in the knot of his tie. Another poked a hole in the heel of his shoe. None made it past the body searches . . .
Set/member relation: the anaphors "One man", "Another" and "None" all bridge to the antecedent "employees".
Bridging Examples on OntoNotes
Bridging Example: He farms 12 acres in Grabowiec, two miles from the Soviet border in one of Poland’s poorest places. Now he is mounting the steps of a stucco building in a nearby village, on a visit to the Communist administrator, the "naczelnik". "Many people in Poland hope this government will break down", says Mr. Niciporuk, who belongs to the local council and to Rural Solidarity.
Other general relations; mentions involved: "a nearby village", "Grabowiec", "the Communist administrator", "the local council".
Outline
1 Task
2 Related Work
3 IS and Bridging Annotation on Top of OntoNotes
4 Models for Antecedent Selection
5 Conclusions
Task: Bridging Resolution
Bridging resolution on a document consists of two subtasks:
• Bridging anaphora recognition: find the bridging anaphors (here: "The windows", "The carpets", "walls")
• Antecedent selection: find the antecedent of each bridging anaphor (here: "the Polish center")
Related Work: Corpora
Reliably annotated corpora available for IS and/or bridging:
reference | corpus | language | size | bridging ante.
Nissim et al. (2004) | Switchboard (dialogue) | English | 147 documents, 70,000 mentions | No
Riester et al. (2010) | radio news transcripts | German | 1,420 sentences, 6,668 mentions | Yes
Markert et al. (2012) | OntoNotes (written text) | English | 50 documents, 11,000 mentions | Yes
Other corpus:
• GNOME (Poesio (2004)): no IS annotation, but bridging antecedents for NPs are annotated
Related Work: Computational Models
No system does full bridging resolution.
For bridging anaphora recognition, available systems solve the problem as part of fine-grained IS classification:
system | corpus | method | results
Rahman and Ng (2012) | Switchboard | SVMmulticlass on top of rule-based prediction | quite high results for 4 med/bridging subcategories
Cahill and Riester (2012) | radio news transcripts | CRF (morphological/syntactic/GermaNet features) | no results reported for the bridging subcategory
Markert et al. (2012) | OntoNotes | collective classification, parent-of relation | low results for the med/bridging subcategory
Related Work: Computational Models
No system does full bridging resolution.
For antecedent selection, available systems are restricted to specific relations or anaphora types:
• Poesio et al. (2004): GNOME; focuses on mereological bridging anaphora (part-of relation). The accuracy is quite high (92.5), but testing is simplified by balancing the training and test data, which does not reflect real text conditions.
• Gardent et al. (2005): DEDE (French corpus); resolves definite bridging anaphora only. The accuracy is 23.6.
Annotation Scheme
• Old
• Mediated: World Knowledge, Syntactic, Comparative, Bridging, Aggregation, Function
• New
Corpus Overview
• Consists of 50 texts from the WSJ portion of the OntoNotes corpus
• IS as well as antecedents for bridging and comparative anaphora are annotated
• Inter-annotator agreement:
annotator pair | overall κ (coarse) | overall κ (fine)
A-B | 77.3 | 80.1
A-C | 75.2 | 77.7
B-C | 74.7 | 77.3
• For Med/Bridging, κ is over 70 for two expert annotators; agreement on selecting the antecedent for Med/Bridging was about 80%
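The κ values above are chance-corrected agreement scores between annotator pairs. The sketch below assumes Cohen's κ for two annotators (the slides do not say which κ variant was used) and uses invented toy labels, not corpus data. The slides appear to report κ multiplied by 100.

```python
from collections import Counter
from typing import List


def cohen_kappa(labels_a: List[str], labels_b: List[str]) -> float:
    """Cohen's kappa for two annotators who labelled the same mentions."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equally long, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of mentions with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if both annotators labelled independently
    # according to their own label distributions.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one single identical label
    return (observed - expected) / (1.0 - expected)


# Toy IS labels for six mentions (invented for illustration only).
ann_a = ["old", "old", "mediated", "new", "new", "mediated"]
ann_b = ["old", "old", "mediated", "new", "mediated", "mediated"]
kappa = cohen_kappa(ann_a, ann_b)
```

With these toy labels the observed agreement is 5/6 and the expected agreement is 1/3, so κ is 0.75.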
IS Distribution on Gold Standard Corpus
Texts: 50
Mentions: 10,980
• Old: 3,237 (Coref 3,143; Generic_deictic_pr 94)
• Mediated: 3,708 (World knowledge 924; Syntactic 1,592; Aggregate 211; Func 65; Comparative 253; Bridging 663)
• New: 4,035
Bridging Anaphors and Antecedents Distribution
Bridging anaphors: 663
Bridging antecedents: 683
• Bare 43%, Definite 38.5%, Indefinite 15.4%, Other 3% (automatically computed w.r.t. article)
• Event 2.3%, Set 6.6%, Part-attribute-of 13.6%, Other 77.6%
• NP 94.4%, Non_NP 5.6%
(computed based on gold annotation)
Bridging Anaphor-Antecedent Distance Distribution
• 71% of NP antecedents occur in the same sentence as the anaphor or up to 2 sentences before it
• Far-away antecedents are common when the antecedent is the global focus of a document
Pairwise Mention-Entity Model
(Figure: anaphor m linked to its set of antecedent candidate entities Em = {e0, e1, e2, e3})
• Instance (m, e)
• Binary classification problem
• For each anaphor m, a post-processing step is needed to select the antecedent among the candidates
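The pairwise setup can be sketched as follows. Each anaphor m is paired with every candidate entity e, each pair is scored, and a post-processing step keeps the highest-scoring candidate (a best-first strategy). The scoring function and the toy scores are hypothetical stand-ins for a trained binary classifier, not the authors' model.

```python
from typing import Callable, List, Optional, Tuple

Instance = Tuple[str, str]  # (anaphor m, antecedent candidate entity e)


def make_instances(anaphor: str, candidates: List[str]) -> List[Instance]:
    """One classification instance (m, e) per candidate entity."""
    return [(anaphor, entity) for entity in candidates]


def select_antecedent(
    anaphor: str,
    candidates: List[str],
    score: Callable[[Instance], float],
) -> Optional[str]:
    """Best-first post-processing: keep the highest-scoring candidate."""
    instances = make_instances(anaphor, candidates)
    if not instances:
        return None  # no accessible candidate, so the anaphor stays unresolved
    return max(instances, key=score)[1]


# Hypothetical classifier scores, invented for illustration.
toy_scores = {
    ("The windows", "the Polish center"): 0.8,
    ("The windows", "Warsaw"): 0.3,
    ("The windows", "Mr. McDonough"): -0.5,
}


def toy_score(instance: Instance) -> float:
    return toy_scores.get(instance, 0.0)


chosen = select_antecedent(
    "The windows",
    ["Mr. McDonough", "Warsaw", "the Polish center"],
    toy_score,
)
```

Each anaphor is resolved on its own here, so nothing links the decisions for related anaphors.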
Pairwise Model Features: Poesio et al.’s Feature Set
group | feature | value
lexical | Google distance (ofPattern) | numeric
lexical | WordNet distance | numeric
salience | utterance distance | numeric
salience | local first mention | boolean
salience | global first mention | boolean
Pairwise Model Features: Other Features
Extend Poesio’s feature set to capture more diverse relations between anaphor and antecedent.
group | feature | value
semantic | preposition pattern | numeric
semantic | verb pattern | numeric
semantic | WordNet partOf | boolean
semantic | semantic class | enumeration
salience | document span | numeric
syntactic | isSameHead | boolean
syntactic | isWordOverlap | boolean
syntactic | isCoArgument | boolean
Problems for Pairwise Mention-Entity Model
(Figure: anaphor m linked to its set of antecedent candidate entities Em = {e0, e1, e2, e3}, using Poesio’s feature set and other local features)
Problems:
• Data imbalance -> a fixed window size is used to restrict the set of candidates
• Antecedents beyond the window size cannot be accessed
• A decision is made for each instance separately, ignoring relations between instances
Global Model Based on MLNs (Markov Logic Networks)
Bridging Example: If Mr. McDonough’s plans get executed, as much as possible of the Polish center will be made from aluminum, steel and glass recycled from Warsaw’s abundant rubble. A 20-story mesh spire will stand atop 50 stories of commercial space. The windows will open. The carpets won’t be glued down and walls will be coated with non-toxic finishes.
• Related or similar anaphors are likely to share the same antecedent
• Globally salient entities are likely to be the antecedent of multiple anaphors
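Global Model Based on MLNs
(Figure: anaphor m with antecedent candidate entity set Em = {e1, e2} and anaphor n with antecedent candidate entity set En = {e3, e4, e5, e6}; isBridging? links connect each anaphor to its candidates, and a hasSameAnte? link connects m and n)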
Global Model Based on MLNs
MLNs: a powerful representation template for global log-linear models. 3 types of global constraints are explored.
(Figure: anaphors m and n with antecedent candidate entity sets Em = {e1, e2} and En = {e5, e6}, plus a predicted global antecedent e0; labels shown: organization, rolePerson, isBridging?, hasSameAnte?)
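For reference, the standard Markov Logic Network formulation defines a log-linear distribution over possible worlds. The symbols below follow the generic textbook notation and are not taken from the slides: w_i is the weight of formula i, n_i(x) is the number of true groundings of formula i in world x, and Z is the normalization constant.

```latex
P(X = x) = \frac{1}{Z} \exp\Big( \sum_{i} w_i \, n_i(x) \Big),
\qquad
Z = \sum_{x'} \exp\Big( \sum_{i} w_i \, n_i(x') \Big)
```

Under this reading, local and global formulas each contribute weighted counts to the same sum, so inference scores all isBridging and hasSameAnte decisions in a document jointly.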
Pairwise Model vs. Global Model Based on MLNs
Local mention-entity pairwise model (anaphor m; candidates e0, e1, e2, e3):
• Poesio’s feature set and other local features
• Low antecedent accessibility
• Decision is made separately
Global model based on MLNs (anaphors m and n; candidates e1, e2, e3, e4, e5, e6):
• Local formulas and global formulas
• High antecedent accessibility
• Decisions are made jointly
Experimental Setup
• Experiments are performed on our gold standard corpus via 10-fold cross-validation on documents
• Gold mentions and gold coreference information are used
• OntoNotes annotation layers are used for feature extraction
• Mention-entity pairwise model: SVM-Light, best-first strategy to select the antecedent for each anaphor
• Global model: thebeast is used to learn weights for the formulas and to perform inference
• Accuracy is reported with regard to all anaphors
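A minimal sketch of the evaluation protocol described above. It assumes that folds are formed over whole documents and that accuracy counts every gold anaphor, so an unresolved anaphor counts as an error. The document ids, the seed and the function names are hypothetical.

```python
import random
from typing import Dict, List, Optional, Tuple


def document_folds(
    doc_ids: List[str], k: int = 10, seed: int = 0
) -> List[Tuple[List[str], List[str]]]:
    """Split documents (not mentions) into k train/test partitions."""
    if k < 2 or k > len(doc_ids):
        raise ValueError("k must be between 2 and the number of documents")
    shuffled = sorted(doc_ids)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    splits = []
    for i, test_docs in enumerate(folds):
        train_docs = [d for j, fold in enumerate(folds) if j != i for d in fold]
        splits.append((train_docs, test_docs))
    return splits


def accuracy(predicted: Dict[str, Optional[str]], gold: Dict[str, str]) -> float:
    """Percentage of all gold anaphors resolved to their gold antecedent."""
    if not gold:
        raise ValueError("no gold anaphors")
    correct = sum(1 for a, ante in gold.items() if predicted.get(a) == ante)
    return 100.0 * correct / len(gold)


docs = [f"doc_{i:02d}" for i in range(50)]  # hypothetical document ids
splits = document_folds(docs)
```

Splitting by document keeps all anaphors of one text in the same fold, which matters for a model that resolves the anaphors of a document jointly.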
Experimental Results
model | ante. candidates | features | acc
2 sent. + NB | 2 sent. | Poesio’s feature set | 18.85
5 sent. + NB | 5 sent. | Poesio’s feature set | 18.40
pairwise model I | 2 sent. | Poesio’s salience + prep. pattern | 29.11
pairwise model II | 2 sent. | Poesio’s salience + all other local | 33.94
pairwise model III | 2 sent. + special | Poesio’s salience + all other local | 36.35
MLN model I | 2 sent. + special | MLNs local formulas | 35.60
MLN model II | Union of (2 sent. + special) | MLNs (local+global) formulas | 41.32
special: a more advanced antecedent candidate selection strategy
• For each anaphor, add the top k salient entities, measured through the length of the coreference chains, as additional antecedent candidates (k is set to 10%).
• For possible set anaphors, singular antecedent candidates are filtered out.
• For possible same-type anaphors (anaphors premodified by relation indicators such as "following" or "previous"), only antecedent candidates of the same semantic class are kept.
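The "special" candidate selection can be sketched as below. This is an illustration under stated assumptions, not the authors' code: "k is set to 10%" is read as the top 10% of entities seen so far (at least one), salience is the full coreference chain length, and the `Entity` type, its fields and the toy document are all hypothetical. The same-type filter on semantic class is left out for brevity.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Entity:
    name: str
    mention_sentences: Tuple[int, ...]  # sentence index of each mention (non-empty)
    is_plural: bool = False

    @property
    def chain_length(self) -> int:
        return len(self.mention_sentences)


def candidate_pool(
    anaphor_sentence: int,
    entities: List[Entity],
    window: int = 2,
    top_fraction: float = 0.10,
    is_set_anaphor: bool = False,
) -> List[Entity]:
    # Only entities already introduced at or before the anaphor's sentence.
    seen = [e for e in entities if min(e.mention_sentences) <= anaphor_sentence]
    # Local candidates: a mention in the same or the previous `window` sentences.
    in_window = [
        e for e in seen
        if any(0 <= anaphor_sentence - s <= window for s in e.mention_sentences)
    ]
    # Globally salient candidates: longest coreference chains.
    k = max(1, round(top_fraction * len(seen))) if top_fraction > 0 and seen else 0
    salient = sorted(seen, key=lambda e: e.chain_length, reverse=True)[:k]
    pool = in_window + [e for e in salient if e not in in_window]
    if is_set_anaphor:
        # Set anaphors: drop singular candidates.
        pool = [e for e in pool if e.is_plural]
    return pool


# Toy document; names and sentence indices are invented for illustration.
toy_entities = [
    Entity("the building", (0, 1, 2)),
    Entity("the plans", (0,), is_plural=True),
    Entity("a spire", (5,)),
    Entity("the stories", (5,), is_plural=True),
]
pool = candidate_pool(6, toy_entities)
window_only = candidate_pool(6, toy_entities, top_fraction=0.0)
set_pool = candidate_pool(6, toy_entities, is_set_anaphor=True)
```

In the toy document, "the building" lies outside the 2-sentence window but enters the pool as the most salient entity. That is the kind of long-distance antecedent the window alone cannot reach.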
Experimental Results
model | ante. candidates | NP antecedent accessibility | acc
2 sent. + NB | 2 sent. | 71% | 18.85
5 sent. + NB | 5 sent. | 84% | 18.40
pairwise model I | 2 sent. | 71% | 29.11
pairwise model II | 2 sent. | 71% | 33.94
pairwise model III | 2 sent. + special | 77% | 36.35
MLN model I | 2 sent. + special | 77% | 35.60
MLN model II | Union of (2 sent. + special) | 91% | 41.32
MLN model II: an elegant framework to express complex discourse relations
• Local formulas capture semantic relations for bridging pairs without being influenced by noisy antecedent candidates
• Global formulas resolve several bridging anaphors together to a globally salient antecedent beyond the local window
• Selecting the antecedent from the pool of antecedent candidates improves accessibility to the NP antecedents
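Antecedent accessibility can be read as the share of anaphors whose gold antecedent appears among their candidates. The sketch below computes it for per-anaphor candidate sets and for a pooled set. Treating "Union" as the union of all anaphors' candidate sets within a document is an assumption, and the toy data is invented.

```python
from typing import Dict, Set


def accessibility(candidate_sets: Dict[str, Set[str]], gold: Dict[str, str]) -> float:
    """Percentage of anaphors whose gold antecedent is among their candidates."""
    if not gold:
        raise ValueError("no gold anaphors")
    reachable = sum(
        1 for anaphor, ante in gold.items()
        if ante in candidate_sets.get(anaphor, set())
    )
    return 100.0 * reachable / len(gold)


def pool_candidates(candidate_sets: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Give every anaphor the union of all anaphors' candidate sets."""
    pool: Set[str] = set().union(*candidate_sets.values())
    return {anaphor: set(pool) for anaphor in candidate_sets}


# Toy document with two anaphors m and n (invented for illustration).
local_sets = {"m": {"e1", "e2"}, "n": {"e3", "e4"}}
gold_antecedents = {"m": "e1", "n": "e2"}

local_access = accessibility(local_sets, gold_antecedents)
pooled_access = accessibility(pool_candidates(local_sets), gold_antecedents)
```

In the toy case the antecedent of n is only reachable through the candidates of m, so pooling raises accessibility from 50% to 100%.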
Error Analysis
1. Anaphors with long-distance antecedents are harder to resolve.
pair sentence distance | MLN II | pairwise III
0 | 48.57 | 45.14
1 | 34.62 | 35
2 | 47.78 | 43.33
≥3 | 35.44 | 16.46
(numbers represent percentage of correctly resolved anaphors)
Error Analysis
2. In MLN II, global salience and links between similar anaphors helped to improve the performance on sibling anaphors; "non-sibling" anaphors are harder to resolve.
anaphor type | MLN II
sibling anaphors | 54.00
non-sibling anaphors | 24.00
(numbers represent percentage of correctly resolved anaphors)
Error Analysis
3. The semantic knowledge we have is still insufficient. Cases that use very context-specific bridging relations are especially hard to handle.
Context-specific bridging relation example: He farms 12 acres in Grabowiec, two miles from the Soviet border in one of Poland’s poorest places. Now he is mounting the steps of a stucco building in a nearby village, on a visit to the Communist administrator, the "naczelnik". "Many people in Poland hope this government will break down", says Mr. Niciporuk, who belongs to the local council and to Rural Solidarity.
Context-specific bridging relation example: It’s a California crime saga worthy of an Erle Stanley Gardner title: The Case of the Purloined Palm Trees. Edward Carlson awoke one morning last month to find eight holes in his front yard where his prized miniature palms, called cycads, once stood. Days later, the thieves returned and dug out more, this time adding insult to injury.
Context-specific bridging relation example: The irony in this novel is that neither man represents a "safe" middle-class haven: Nora’s decision is between emotional excitement and emotional security, with no firm economic base anywhere. ... The humor of the story owes much to the fact that no hearts are likely to bleed for the plight of health-food eaters. But readers may well feel the pangs of recognition.
Error Analysis
4. Non-NP antecedents are not considered in our current model.
Context-specific bridging relation example: But despite more than two years of research showing AZT can relieve dementia and other symptoms in children, the drug still lacks federal approval for use in the youngest patients. As a result, many youngsters have been unable to obtain the drug and, for the few exceptions, insurance carriers won’t cover its cost of $6,400 a year.
Conclusions
• We provide a reasonably sized and reliably annotated English corpus for bridging resolution.
• The corpus covers a diverse set of relations between anaphor and antecedent as well as anaphor/antecedent types.
• We develop semantic, syntactic and salience features based on linguistic intuition.
• Inspired by the observations that salient entities are preferred as antecedents and sibling anaphors are likely to share the same antecedent, we implement a global model for antecedent selection within the framework of Markov Logic Networks.
Suggestions Are Welcome!
Thanks:
• Angela Fahrni
• Sebastian Martschat
• LGFG Foundation
• Klaus Tschira Foundation (KTF)
The corpus can be downloaded from: http://www.h-its.org/english/research/nlp/download/index.php