Prediction of Thematic Rank for Structured Semantic Role Labeling
Weiwei Sun (孙薇), Zhifang Sui and Meng Wang
†Institute of Computational Linguistics, EECS, Peking University
‡Key Laboratory of Computational Linguistics, Ministry of Education, China
Overview
- Thematic hierarchy theory argues that there exists a language-independent rank of possible semantic roles.
- A thematic hierarchy establishes priority among arguments with respect to their syntactic realization.
- The thematic rank between arguments can be accurately identified using syntactic clues.
- Strong dependencies among arguments motivate assigning semantic roles globally.
- To import structural information, a re-ranking technique is employed to incorporate thematic rank information into local classification results.
- Experimental results show that prediction of thematic rank can help semantic role classification.
Linguistic Basis
Subject selection rule of Fillmore's Case Grammar: If there is an A [=Agent], it becomes the subject; otherwise, if there is an I [=Instrument], it becomes the subject; otherwise, the subject is the O [=Object, i.e., Patient/Theme].
- John broke the window with a hammer.
- A hammer broke the window.
- The window broke.
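Fillmore's subject selection rule is an ordered preference list, so it can be sketched as a few lines of code. This is only an illustration of the rule above; the role names are the poster's A/I/O cases spelled out, not part of any implementation described here.

```python
def select_subject(roles):
    """Fillmore's rule: Agent > Instrument > Object picks the subject."""
    for preferred in ("Agent", "Instrument", "Object"):
        if preferred in roles:
            return preferred
    return None

# "John broke the window with a hammer." -- an Agent is present
print(select_subject({"Agent", "Instrument", "Object"}))  # Agent
# "A hammer broke the window." -- no Agent, so the Instrument wins
print(select_subject({"Instrument", "Object"}))           # Instrument
# "The window broke." -- only the Object remains
print(select_subject({"Object"}))                         # Object
```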
Linguistic Basis (cont'd)
- A thematic hierarchy is a language-independent rank of possible semantic roles, which establishes prominence relations among arguments.
- Thematic hierarchy theory argues that the thematic ranks of semantic roles affect their syntactic realization.
- Thematic hierarchies can help to construct the mapping from semantics to syntax.
Problems in Modeling a Thematic Hierarchy of PropBank Roles
1. Predicates of PropBank do not share the same list of semantic roles.
   - There are six semantic role types in the label set, tagged Arg0-Arg5, but the six labels have no consistent meaning: Arg3 for rise.01 is Location, whereas Arg3 for order.02 is Source.
2. Although there is general agreement that the Agent should be the highest-ranking role in a hierarchy, there is no consensus in the theoretical discussion over the ranking of the remaining roles.
   - For example, the Patient occupies the second-highest position in some linguistic theories but the lowest in others.
Ranking Arguments in PropBank
- We take the proto-role theory into account to rank PropBank roles. There are three key points in our solution:
  1. The rank of Arg0 is the highest: Arg0 ≻ Argi (i > 0).
  2. The rank of Arg1 is either the second highest or the lowest: Arg1 ≻ Argi (i > 1) vs. Argi ≻ Arg1.
  3. We do not rank the other arguments; i.e., we assume an equivalence relation among the remaining roles.
- Two sets of roles closely correspond to numbered arguments: (1) referenced arguments (R-A*) and (2) continuation arguments (C-A*).
- To adapt the relation to cover these two kinds of arguments, the equivalence relation is divided into six sub-categories.
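The three key points above define a partial order over numbered arguments. A minimal sketch of that order as a comparator (the function name and the `arg1_high` flag are illustrative, not from the paper's implementation; the flag switches between the two Arg1 variants the poster tests):

```python
def compare(a, b, arg1_high=True):
    """Return '>', '<', '=', or '~' for numbered arguments like 'Arg0'.

    Arg0 outranks everything; Arg1 is either second highest (arg1_high=True)
    or lowest (arg1_high=False); Arg2-Arg5 are treated as equivalent.
    """
    i, j = int(a[3:]), int(b[3:])
    if i == j:
        return "="
    for top in ((0, 1) if arg1_high else (0,)):
        if i == top:
            return ">"
        if j == top:
            return "<"
    if not arg1_high:
        # Arg1 ranked lowest: every other numbered argument outranks it.
        if i == 1:
            return "<"
        if j == 1:
            return ">"
    return "~"
```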
Ranking Arguments in PropBank (cont’d)
- Arg0 is generally the argument exhibiting features of a prototypical Agent; Arg1 is a prototypical Patient.
- The Agent is almost without exception the highest role in proposed hierarchies. As the proto-Agent, Arg0 therefore ranks higher than the other numbered arguments.
- Both ranking options for Arg1 are tested on PropBank data.
- A majority of thematic hierarchies assume an equivalence relation among Source, Goal, Locative, etc. Roles of this kind are usually labeled Arg2 through Arg5.
Ranking Arguments in PropBank (cont’d)
Figure: The Hasse diagrams of hierarchies.
Thematic Rank Prediction
- By assigning different labels to different rank relations, we formulate the prediction of the thematic rank between two arguments as a multi-class classification task.
- Let A denote the set of arguments and R the set of rank relations. Given a score function S_TH : A × A × R → ℝ, the relation is recognized in argmax flavor:

  r̂ = r*(ai, aj) = argmax_{r ∈ R} S_TH(ai, aj, r)

- The score function is a softmax over relation labels:

  S_TH(ai, aj, r) = exp{ψ(ai, aj, r) · w} / Σ_{r' ∈ R} exp{ψ(ai, aj, r') · w}

  where ψ is the feature map and w is the parameter vector to learn.
Label list:
1. ≻: the first argument is higher than the second argument.
2. ≺: the first argument is lower than the second argument.
3. AR: the second argument is the referenced argument of the first.
4. RA: the first argument is the referenced argument of the second.
5. AC: the second argument is the continuation argument of the first.
6. CA: the first argument is the continuation argument of the second.
7. =: the two arguments are labeled with the same role label.
8. ∼: the two arguments are equivalent, but not of the same type.
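The softmax scorer and the argmax decision above can be sketched directly. The feature map ψ and the weights are toy stand-ins here (the real system learns w from PropBank), and ASCII labels replace ≻/≺/∼:

```python
import math

RELATIONS = [">", "<", "AR", "RA", "AC", "CA", "=", "~"]

def score(features, w):
    """S_TH(ai, aj, r): softmax over rank-relation labels.

    features: list of feature names extracted from the argument pair.
    w: dict mapping (feature, relation) to a learned weight.
    """
    logits = {r: sum(w.get((f, r), 0.0) for f in features) for r in RELATIONS}
    z = sum(math.exp(v) for v in logits.values())
    return {r: math.exp(v) / z for r, v in logits.items()}

def predict(features, w):
    """r_hat = argmax_r S_TH(ai, aj, r)."""
    probs = score(features, w)
    return max(probs, key=probs.get)

# Toy weights: one feature strongly indicating "first argument is higher".
w = {("first_arg_is_subject", ">"): 2.0}
print(predict(["first_arg_is_subject"], w))  # >
```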
SRL Re-ranking with Thematic Rank Information
Structured Semantic Role Labeling
- Arguments in one predicate-argument structure are highly correlated. Toutanova et al. (2005) empirically show that global information is important for SRL and that structured solutions outperform local semantic role classifiers.
- The local semantic classifier can produce a list of labeling results, scored as

  S(a, s) = Π_i S_l(ai, si)

- Our re-ranking step picks one assignment from this list according to the predicted rank relations, under two re-ranking policies: (1) hard constraint re-ranking and (2) soft constraint re-ranking.
Hard Constraint Re-ranking
- Strictly in accordance with the rank relations: if the thematic rank prediction shows that the rank of argument ai is higher than that of aj, then the role assignment [ai = Patient and aj = Agent] is eliminated.

Hard constraint re-ranking:

  S(a, s) = Π_i S_l(ai, si) × Π_{i,j: i<j} I(r*(ai, aj), r(si, sj))

- r* : A × A → R predicts the rank relation of two arguments;
- r : S × S → R gives the thematic rank relation of two semantic roles, e.g., r(Agent, Patient) = "≻";
- I : R × R → {0, 1} indicates whether the two relations agree.
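A minimal sketch of hard-constraint re-ranking, assuming small toy inputs (the `predicted_rel` and `role_rel` callables stand in for r* and r; they are illustrative, not the paper's code). Each candidate assignment's local score is zeroed whenever a pairwise role relation contradicts the predicted rank relation, which is exactly the indicator factor I(·,·):

```python
from itertools import product

def rerank_hard(local_scores, predicted_rel, role_rel):
    """Pick the best role assignment under hard rank constraints.

    local_scores: one {role: probability} dict per argument.
    predicted_rel(i, j): predicted rank relation r*(ai, aj) for pair i < j.
    role_rel(r1, r2): rank relation r(si, sj) implied by two role labels.
    """
    best, best_score = None, -1.0
    for assignment in product(*[list(d) for d in local_scores]):
        s = 1.0
        for i, d in enumerate(local_scores):
            s *= d[assignment[i]]            # Π_i S_l(ai, si)
        for i in range(len(assignment)):
            for j in range(i + 1, len(assignment)):
                if predicted_rel(i, j) != role_rel(assignment[i], assignment[j]):
                    s = 0.0                  # indicator I(...) = 0
        if s > best_score:
            best, best_score = assignment, s
    return best

def role_rel(r1, r2):
    """Toy r: Arg0 outranks everything; identical labels are '='."""
    if r1 == r2:
        return "="
    if r1 == "Arg0":
        return ">"
    if r2 == "Arg0":
        return "<"
    return "~"

# Two arguments; locally Arg0+Arg1 scores best, but the rank model
# predicts relation "~", which only Arg1+Arg2 satisfies.
local = [{"Arg0": 0.79, "Arg1": 0.14}, {"Arg1": 0.82, "Arg2": 0.12}]
print(rerank_hard(local, lambda i, j: "~", role_rel))  # ('Arg1', 'Arg2')
```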
Soft Constraint Re-ranking
- The predicted confidence scores of the relations are added as factors to the score function of the semantic role assignment.

Soft constraint re-ranking:

  S(a, s) = Π_i S_l(ai, si) × Π_{i,j: i<j} S_TH(ai, aj, r(si, sj))

- As there are not many arguments in one argument structure (at most six on the test corpus), the output space is small and a simple exhaustive search is efficient.
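The soft policy can be sketched the same way: the hard indicator is replaced by the predicted confidence S_TH of each pairwise relation. The helper names and toy probabilities below are illustrative (the numbers mirror the worked example later in the poster), not the paper's implementation:

```python
from itertools import product

def rerank_soft(local_scores, rel_prob, role_rel):
    """Pick the best assignment, weighting by rank-relation confidences.

    local_scores: one {role: probability} dict per argument.
    rel_prob(i, j, r): predicted confidence S_TH(ai, aj, r) for pair i < j.
    role_rel(r1, r2): rank relation implied by two role labels.
    """
    best, best_score = None, -1.0
    for assignment in product(*[list(d) for d in local_scores]):
        s = 1.0
        for i, d in enumerate(local_scores):
            s *= d[assignment[i]]            # Π_i S_l(ai, si)
        for i in range(len(assignment)):
            for j in range(i + 1, len(assignment)):
                # soft factor S_TH(ai, aj, r(si, sj))
                s *= rel_prob(i, j, role_rel(assignment[i], assignment[j]))
        if s > best_score:
            best, best_score = assignment, s
    return best

def role_rel(r1, r2):
    if r1 == r2:
        return "="
    if r1 == "Arg0":
        return ">"
    if r2 == "Arg0":
        return "<"
    return "~"

# The rank model is nearly certain the pair stands in relation "~":
rel_prob = lambda i, j, r: 0.9998 if r == "~" else 0.0002
local = [{"Arg0": 0.7897, "Arg1": 0.1425}, {"Arg1": 0.8230, "Arg2": 0.1193}]
print(rerank_soft(local, rel_prob, role_rel))  # ('Arg1', 'Arg2')
```

With only a handful of arguments per predicate, this brute-force product over candidate roles stays cheap, matching the poster's note that exhaustive search suffices.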
How to Rank Arguments?
- The table below shows the performance of thematic rank prediction and of structured semantic role classification under different thematic hierarchy definitions (A: Arg0 highest; P↑/P↓: Arg1 ranked second highest / lowest).

                    Baseline   A        A & P↑   A & P↓
  Rank Prediction   –          94.65%   95.62%   94.09%
  SRL (S)           94.77%     95.44%   95.07%   95.13%
  SRL (G)           –          96.89%   96.39%   97.22%

Table: SRC performance based on different thematic hierarchy definitions.
Performance of Semantic Role Classification

  Parses     Baseline   Hard     Soft     Gold
  Gold       95.14%     95.71%   96.07%   97.63%
  Charniak   94.12%     94.74%   95.44%   97.32%

Table: Overall semantic role classification accuracy.
Performance of SRC (cont'd)

  Role     Local   Hard    Soft
  Arg0     96.10   96.47   97.07
  Arg1     95.26   96.09   96.58
  Arg2     90.09   90.63   91.56
  Arg3     83.80   83.03   84.43
  Arg4     90.20   87.62   87.20
  Arg5     80.00   72.73   83.33
  R-Arg0   93.19   95.56   96.89
  R-Arg1   83.39   89.38   90.63
  R-Arg2   56.00   62.07   66.67
  C-Arg0   12.50   56.00   66.67
  C-Arg1   80.59   81.12   84.85
  C-Arg2   N/A     18.18   18.18

Table: F-measures of SRC based on Charniak parsing.
Analysis
- Structural information modifies local classification results.
- Using individual features only, a local classifier may falsely label roles one by one; structural information can correct some of these mistakes.

An Example
  Some "circuit breakers" installed after the October 1987 crash failed their first test.

  Assignment     Arg0 + Arg1        Arg1 + Arg2
  Score(Local)   78.97% × 82.30%    14.25% × 11.93%
  Score(Rank)    ≻: 0.02%           ∼: 99.98%
- The baseline system falsely assigns the roles as Arg0 + Arg1. Taking into account the thematic rank prediction, in which the relation "∼" receives an extremely high probability, our system returns Arg1 + Arg2 as the SRL result.
Conclusion & Future Work
- We borrow the thematic hierarchy idea from linguists.
- We use relation information to represent structural constraints.
- Open question: how to define a reasonable hierarchy for a given semantic role list?
Game Over