Discovering Commonsense Entailment Rules Implicit in Sentences
Jonathan Gordon & Lenhart Schubert, University of Rochester
Introduction
Knowledge for Inference We want to reason about ordinary human situations: forward from facts & backward from goals. This requires diverse kinds of knowledge.
What’s Been Acquired & What Hasn’t Information & knowledge extraction work has found: base facts (Barack Obama is president), e.g., Banko et al. 2007; generalizations (A president may make a speech), e.g., Schubert 2002. But what if we’re told ‘Sally crashed her car’? Common sense: The driver in a car crash might be injured. This is rarely stated directly, but people do say it when the expectation is disconfirmed: ‘Sally crashed her car into a tree but she wasn’t hurt.’
‘Crashes cause injuries’ The commonsense rules we want include expectations about results & lexical entailments. Previous work along these lines isn’t suitable for inference. Limitations: binary relations; just synonymy/similarity relations; not enough information about the types of arguments. E.g., Lin & Pantel 2001, Girju 2003, Chklovski & Pantel 2004 (VerbOcean), Pekar 2006, Schoenmackers et al. 2010
Method
Finding Detailed Commonsense Rules Exploit presuppositional discourse patterns (such as ones involving ‘but’, ‘yet’, and ‘hoping to’) and abstract the matched material into general rules. Steps Find trees (based on lexico-syntactic patterns). Filter. Preprocess & abstract. Rewrite as rules (based on semantic patterns).
Finding Appropriate Parse Trees First, use TGrep2 to find parse trees matching hand-authored lexico-syntactic patterns centered around cue words like ‘hoping to’ or ‘but didn’t’. Then filter out parse trees unlikely to produce reasonable results, e.g., ones containing parentheses or quoted utterances.
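The matching-and-filtering step can be sketched as follows. This is a simplification: the actual system runs TGrep2 patterns over parse trees, whereas this toy version matches cue phrases on raw sentence strings; the particular cue list and filter conditions here are illustrative, not the paper’s full inventory.

```python
import re

# A few cue phrases standing in for the hand-authored lexico-syntactic
# patterns; the real system matches TGrep2 patterns over parse trees.
CUE_PATTERNS = [
    re.compile(r"\bhoping to\b"),
    re.compile(r"\bbut (?:did not|didn't)\b"),
    re.compile(r"\byet\b"),
]

def is_candidate(sentence: str) -> bool:
    """Keep a sentence if it contains a cue phrase and passes the filters
    (no parentheses or quoted utterances, which rarely yield clean rules)."""
    if any(ch in sentence for ch in '()"'):
        return False
    return any(p.search(sentence) for p in CUE_PATTERNS)
```

A sentence like ‘They chose another way home, hoping to escape the attackers.’ passes, while a quoted utterance is filtered out before any rule extraction is attempted.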
Preprocess & Abstract Trees Then preprocess, top-down: Remove (usually) extraneous constituents. Rewrite constituents to simplify & abstract. ‘The next day he and another Bengali boy who lives near by [sic] chose another way home, hoping to escape the attackers.’ People chose another way home, hoping to escape the attackers.
Preprocessing Rules Remove INTJ, some PPs. Turn long expressions into keywords like ‘a proposition’. Abstract named entities. Reorder some sentences to be easier to process: ‘Fourteen inches from the floor it’s supposed to be.’ It’s supposed to be fourteen inches from the floor.
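A minimal sketch of the abstraction step, under strong simplifying assumptions: the real system rewrites parse-tree constituents (and handles interjections, PPs, and reordering), while this toy version only maps pronouns and a sample named entity to generic class terms via a hypothetical lookup table.

```python
# Toy abstraction table (hypothetical entries); the actual preprocessing
# operates on parse trees and uses named-entity recognition to decide
# which class term ("a person", "a male", ...) a constituent becomes.
CLASS_OF = {
    "he": "a male",
    "she": "a female",
    "they": "people",
    "sally": "a female",
}

def abstract_tokens(tokens):
    """Replace pronouns and named entities with generic class terms,
    leaving all other tokens unchanged."""
    return [CLASS_OF.get(t.lower(), t) for t in tokens]
```

For example, the tokens of ‘She was poor’ abstract to ‘a female was poor’, which is the form the rule-rewriting stage then works with.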
Rewrite as Rules Rewrite trees as conditional expressions (if–then rules) based on the semantic patterns they match.
Disconfirmed Expectations Sentences where ‘but’ or ‘yet’ is used to indicate that the expected inference doesn’t hold: add or remove ‘not’ so that the expectation is confirmed. ‘The ship weighed anchor and ran out her big guns, but did not fire a shot.’ If a ship weighs anchor and runs out her big guns, then it may fire a shot. ‘She was poor but proud.’ If a female is poor, then she may not be proud.
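The polarity flip can be sketched in a few lines. This toy version handles only the string pattern ‘P, but did not Q’ (dropping ‘not’ so the stated expectation becomes the conclusion) and its positive counterpart; the real system works over parse trees and also adjusts person, number, and tense.

```python
def disconfirmed_rule(sentence: str) -> str:
    """Toy rewrite of 'P, but [not] Q' into an if-then rule whose
    conclusion has the opposite polarity of Q, so the disconfirmed
    expectation is what the rule predicts."""
    antecedent, consequent = sentence.rstrip(".").split(", but ")
    if consequent.startswith("did not "):
        # Remove the negation: the expectation ("may Q") is confirmed.
        consequent = "it may " + consequent[len("did not "):]
    else:
        # Add a negation: "P but Q" suggests Q is unexpected given P.
        consequent = "it may not be that " + consequent
    return f"If {antecedent[0].lower() + antecedent[1:]}, then {consequent}."
```

Applied to the ship example above, this yields an ‘If …, then it may fire a shot’ rule, mirroring the slide’s output modulo the argument abstraction done elsewhere in the pipeline.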
Contrasting Good & Bad Identify this pattern by consulting a lexicon of sentiment annotations (SentiWordNet). ‘He is very clever but eccentric.’ If a male is very clever, then he may be eccentric. Clever is positive, eccentric is negative.
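The pattern test itself reduces to a polarity check. In this sketch a tiny hand-made lexicon with hypothetical scores stands in for SentiWordNet; the pattern fires only when ‘X but Y’ joins adjectives of opposite sentiment.

```python
# Toy polarity lexicon standing in for SentiWordNet scores
# (the values here are illustrative, not SentiWordNet's).
POLARITY = {"clever": 1, "proud": 1, "eccentric": -1, "poor": -1}

def contrasting(adj1: str, adj2: str) -> bool:
    """True when the two adjectives have opposite sentiment polarity,
    i.e. when the good/bad contrast pattern should fire."""
    return POLARITY.get(adj1, 0) * POLARITY.get(adj2, 0) < 0
```

So ‘clever but eccentric’ triggers the pattern (positive vs. negative), while ‘clever but proud’ would not.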
Expected Outcomes Abstract sentences giving a participant’s intent into general rules. ‘He stood before her in the doorway, evidently expecting to be invited in.’ If a male stands before a female in a doorway, then he may expect to be invited in.
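As a sketch, the ‘hoping to’ / ‘expecting to’ rewrite splits the sentence at the cue and wraps the two halves in an if-then frame. This toy version keeps the original tense and uses the placeholder phrase ‘the agent’; the real system abstracts the participants on the parse tree and fixes person and tense.

```python
def intent_rule(sentence: str) -> str:
    """Toy 'expected outcome' rewrite: 'P, hoping to Q' becomes
    'If P, then the agent may hope to Q'."""
    action, goal = sentence.rstrip(".").split(", hoping to ")
    action = action[0].lower() + action[1:]
    return f"If {action}, then the agent may hope to {goal}."
```

On the abstracted sentence from the preprocessing slide, this produces ‘If people chose another way home, then the agent may hope to escape the attackers.’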
Evaluation
Evaluating Inferences vs Evaluating Rules Evaluating rule acquisition is tricky. COPA (Roemmele et al. 2011) may be a good option. We’d like to evaluate the results of inference with the rules, but that needs to wait until logical forms are automatically created. Here we judged the rules themselves.
Evaluation Conditions Development corpora: Brown corpus (hand-parsed), BNC (machine-parsed). Evaluation corpus: personal stories from weblogs (Gordon & Swanson). Sampled 100 of the if–then rules produced and rated them on a scale of 1–5, with 1 being best. Criteria: generality, plausibility, usefulness.
Evaluation: Dialogue Test For a rule, e.g., If attacks are brief, then they may not be intense, imagine a dialogue with a computer agent: ‘The attacks (on Baghdad) were brief.’ ‘So I suppose they weren’t intense, were they?’ If this is a reasonable follow-up, then the rule is probably good.
Results Average ratings: 1.84 (judge 1) and 2.45 (judge 2); lower ratings are better. [Bar chart: counts of how many rules were assigned each rating (1–5) by the two judges.]
Good Results If a pain is great, then it may not be manageable. If a person texts a male, then he-or-she may get a reply. If a male looks around, then he may hope to see someone. If a person doesn’t like some particular store, then he-or-she may not keep going to it.
Conclusion & Future Work
Goal & Results Enabling an inference system to reason about common situations and activities requires many types of general world knowledge and lexical knowledge. Rules extracted by our method often describe complex consequences or reasons, and subtle relations among adjectival attributes; these appear quite different from the kinds of rules targeted in previous work.
Our Approach We’ve suggested an initial approach to acquiring these rules: Look at interesting discourse patterns and rewrite them as conditional expressions based on semantic patterns.
Other Approaches: Why not ML/bootstrapping? In other work, these techniques are successful when: Aimed at finding fixed types of relationships (e.g., hyponymy) between pairs of lexical items; The relationship between the lexical items is hinted at sufficiently often by their co-occurrence in certain local lexico-syntactic patterns, or by their occurrences in similar sentential environments (distributional similarity)
Other Approaches: Why not ML/bootstrapping? But, in this work: We’re looking for a broad range of (more or less strong) consequence relationships. The relationships are between entire clauses, not just lexical items. We’re not likely to find multiple occurrences of the same pair of clauses in a variety of syntactic configurations, all indicating a consequence relation.
Future Work Want to produce a logical representation for the rules, so we can do inference in Epilog (the reasoner for Episodic Logic). How: logically interpret the rules shown, or interpret sentences and then form rules. Also: more advanced filtering to improve quality.
Questions?