TextInfer ⋅ July 30, 2011

Representing and resolving ambiguities in ontology-based question answering

Christina Unger & Philipp Cimiano
Semantic Computing Group, CITEC, Bielefeld University


Outline

▸ Ambiguities in ontology-based interpretation
▸ Representing and resolving ambiguities
  ▸ Enumeration
  ▸ Underspecification
  ▸ Ontological reasoning
▸ Results and conclusion


Ambiguities in ontology-based interpretation


Ambiguities

Ambiguities comprise all cases in which natural language expressions have more than one meaning, due to:

▸ structural properties (e.g. modifier/PP attachment sites)
  ▸ Put the box on the table by the window in the kitchen.
  ▸ We saw old elves and wizards.
▸ alternative lexical meanings
  ▸ bank


Ontology-based interpretation

Goal: Map natural language expressions (e.g. questions) to formal representations aligned to the ontology (e.g. SPARQL queries).

▸ Which city has the most inhabitants?
▸ PREFIX geo:
  SELECT ?c WHERE {
    ?c rdf:type geo:city .
    ?c geo:population ?p .
  }
  ORDER BY DESC(?p) LIMIT 1


Ontology-based interpretation

The meaning of a lexical expression is the ontology concept that this expression verbalizes.

▸ Which city has the most inhabitants?
  ▸ city → geo:city
  ▸ has inhabitants → geo:population


Ontology-based interpretation

Main challenge: Map natural language expressions to corresponding ontology concepts. This mapping need not be one-to-one.

▸ different expressions can refer to the same ontology concept
  ▸ flows through → geo:flowsThrough
  ▸ traverses → geo:flowsThrough
▸ one expression can refer to different ontology concepts
  ▸ New York → geo:new_york, geo:new_york_city
  ▸ has → geo:flowsThrough, geo:inState
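A many-to-many mapping of this kind could be held in a simple lexicon. The following is only a minimal sketch: the names `LEXICON`, `concepts_for` and `is_ambiguous` are hypothetical and not taken from the presented system; the entries are the examples listed above.

```python
# Hypothetical lexicon: each expression maps to the set of ontology
# concepts it may verbalize (entries taken from the examples above).
LEXICON = {
    "flows through": {"geo:flowsThrough"},
    "traverses": {"geo:flowsThrough"},
    "New York": {"geo:new_york", "geo:new_york_city"},
    "has": {"geo:flowsThrough", "geo:inState"},
}


def concepts_for(expression):
    """Return the candidate ontology concepts for an expression."""
    return LEXICON.get(expression, set())


def is_ambiguous(expression):
    """An expression is ambiguous if it has more than one candidate."""
    return len(concepts_for(expression)) > 1


print(is_ambiguous("traverses"))  # False
print(is_ambiguous("New York"))   # True
```

Two expressions sharing one concept is harmless for interpretation; only the second case, one expression with several concepts, produces an ambiguity.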


Ambiguities in ontology-based interpretation

Ambiguities in the context of ontology-based interpretation comprise all cases in which a natural language expression cannot be mapped uniquely to an ontology concept.


Non-overlapping alternatives

▸ What is the area of New York?
  ▸ (New York state) → geo:new_york
  ▸ (New York city) → geo:new_york_city
▸ What is the biggest city?
  ▸ SELECT ?c WHERE {
      ?c a geo:city .
      ?c geo:population ?n .
    }
    ORDER BY DESC(?n) LIMIT 1
  ▸ SELECT ?c WHERE {
      ?c a geo:city .
      ?c geo:area ?n .
    }
    ORDER BY DESC(?n) LIMIT 1


Overlapping alternatives

▸ Give me all films starring Jeff Bridges.
  ▸ only including leading roles
  ▸ also including supporting roles
▸ Which cities have more than two million inhabitants?




Context-dependency

▸ Which state has the most rivers?
  ▸ SELECT ?s (COUNT(?x) AS ?n) WHERE {
      ?s a geo:state .
      ?x a geo:river .
      ?x geo:flowsThrough ?s .
    }
    GROUP BY ?s ORDER BY DESC(?n) LIMIT 1
▸ Which state has the most cities?
  ▸ SELECT ?s (COUNT(?x) AS ?n) WHERE {
      ?s a geo:state .
      ?x a geo:city .
      ?x geo:inState ?s .
    }
    GROUP BY ?s ORDER BY DESC(?n) LIMIT 1

Due to sortal restrictions, only one of the alternatives is admissible.


Ambiguities are pervasive.

▸ QALD-1 training questions for DBpedia
  ▸ At least 16 % contain expressions that do not have a unique ontological correspondent.
▸ 880 user questions for GeoBase
  ▸ 1278 occurrences of light expressions: is/are, has/have, with, in, of
  ▸ 151 occurrences of context-dependent expressions: big, small, major


Representing and resolving ambiguities


Representing ambiguities

▸ Enumeration: constructing a different semantic representation and query for every meaning alternative
▸ Underspecification: constructing only one underspecified representation that subsumes all different interpretations

Enumeration

Constructing a different meaning representation for every possible interpretation

Enumeration

▸ How big is New York?
  ▸ SELECT ?a WHERE { geo:new_york_city geo:area ?a . }
  ▸ SELECT ?p WHERE { geo:new_york_city geo:population ?p . }
  ▸ SELECT ?a WHERE { geo:new_york geo:area ?a . }
  ▸ SELECT ?p WHERE { geo:new_york geo:population ?p . }
▸ two lexical entries for big, one referring to geo:area and one referring to geo:population
▸ two lexical entries for New York, one referring to geo:new_york and one referring to geo:new_york_city
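The four queries are the Cartesian product of the two entries for big and the two entries for New York. A minimal sketch of that enumeration step follows; `ENTRIES` and `enumerate_queries` are hypothetical names, and a single result variable `?v` is used instead of `?a` and `?p` for brevity.

```python
from itertools import product

# Hypothetical lexical entries, mirroring the example above.
ENTRIES = {
    "big": ["geo:area", "geo:population"],
    "New York": ["geo:new_york_city", "geo:new_york"],
}


def enumerate_queries(entity_expr, property_expr):
    """Build one SPARQL query string per combination of lexical entries."""
    queries = []
    for entity, prop in product(ENTRIES[entity_expr], ENTRIES[property_expr]):
        queries.append(f"SELECT ?v WHERE {{ {entity} {prop} ?v . }}")
    return queries


queries = enumerate_queries("New York", "big")
for q in queries:
    print(q)
```

The number of queries is the product of the number of entries per ambiguous expression, so it grows quickly as more ambiguous expressions occur in one question.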


Enumeration

▸ Which state has the most rivers?
▸ Which state has the most cities?
▸ two lexical entries for has, one referring to geo:flowsThrough and one referring to geo:inState
▸ Problem: geo:flowsThrough is only possible if the relevant argument is a river, geo:inState only if the relevant argument is a city.
▸ Solution: In order not to derive inconsistent interpretations, we need to capture sortal restrictions.


Adding sortal restrictions

Lexical entries are enriched with sortal restrictions vˆclass.

▸ First entry for has
  syntax: [S DP1↓ [VP [V has] DP2↓]]
  semantics: geo:flowsThrough (y, x)
  arguments: (DP1, x), (DP2, y)
  sortal restriction: yˆgeo:river
▸ Second entry for has
  syntax: [S DP1↓ [VP [V has] DP2↓]]
  semantics: geo:inState (y, x)
  arguments: (DP1, x), (DP2, y)
  sortal restriction: yˆgeo:city


Adding sortal restrictions

When translating semantic representations into a formal query, sortal restrictions are added as a condition.

▸ Which state has the most rivers?
  ▸ SELECT ?x (COUNT(?y) AS ?c) WHERE {
      ?x a geo:state .
      ?y a geo:river .
      ?y geo:flowsThrough ?x .
      ?y a geo:river .
    }
    GROUP BY ?x ORDER BY DESC(?c) LIMIT 1
  ▸ SELECT ?x (COUNT(?y) AS ?c) WHERE {
      ?x a geo:state .
      ?y a geo:river .
      ?y geo:inState ?x .
      ?y a geo:city .
    }
    GROUP BY ?x ORDER BY DESC(?c) LIMIT 1

The second query requires ?y to be both a river and a city, so it is expected to return no result.
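How a sortal restriction ends up as one more condition in the query can be sketched as follows. This is an illustration under assumptions, not the system's actual translation step: `build_query` is a hypothetical helper, and the generated query uses SPARQL 1.1 aggregate syntax.

```python
def build_query(type_conditions, relation, sortal_restriction):
    """Assemble a counting query; the sortal restriction of the lexical
    entry becomes one more triple pattern in the WHERE clause.

    type_conditions: list of (variable, class) pairs from the question
    relation: (subject_var, property, object_var) from the entry for 'has'
    sortal_restriction: (variable, class) from the same lexical entry
    """
    patterns = [f"{var} a {cls} ." for var, cls in type_conditions]
    subj, prop, obj = relation
    patterns.append(f"{subj} {prop} {obj} .")
    var, cls = sortal_restriction
    patterns.append(f"{var} a {cls} .")
    body = " ".join(patterns)
    return (f"SELECT ?x (COUNT(?y) AS ?c) WHERE {{ {body} }} "
            "GROUP BY ?x ORDER BY DESC(?c) LIMIT 1")


# "Which state has the most rivers?" with both entries for 'has'.
question_types = [("?x", "geo:state"), ("?y", "geo:river")]
q_river = build_query(question_types, ("?y", "geo:flowsThrough", "?x"),
                      ("?y", "geo:river"))
q_city = build_query(question_types, ("?y", "geo:inState", "?x"),
                     ("?y", "geo:city"))
print(q_river)
print(q_city)
```

In `q_city` the variable `?y` is typed as both river and city, so the conflict is only detected when the query is evaluated and comes back empty.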


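Example: major

▸ First entry for major
  syntax: [N [ADJ major] N↓]
  semantics: geo:population (x, p), p > 150 000
  arguments: (N, x)
  sortal restriction: xˆgeo:city
▸ Second entry for major
  syntax: [N [ADJ major] N↓]
  semantics: geo:population (x, p), p > 10 000 000
  arguments: (N, x)
  sortal restriction: xˆgeo:state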


Example: major

▸ Give me all major cities.
  ▸ SELECT ?x WHERE {
      ?x a geo:city .
      ?x geo:population ?p .
      ?x a geo:city .
      FILTER ( ?p > 150000 )
    }
  ▸ SELECT ?x WHERE {
      ?x a geo:city .
      ?x geo:population ?p .
      ?x a geo:state .
      FILTER ( ?p > 10000000 )
    }


Problems with enumeration

The enumeration strategy relies on a conflict that automatically filters out unwanted interpretations by constructing a query that returns no result.

▸ Extensionality: There is no way to distinguish between queries that return no result due to an inconsistency introduced by a sortal restriction and queries that return no result because there is none (e.g. Which states border Hawaii?).
▸ Number of constructed interpretations: User questions easily lead to at least 20 or 30 different possible interpretations.

Underspecification

Constructing one underspecified representation that subsumes all different interpretations

Adding metavariables

Semantic representations are enriched with metavariables and metavariable specifications that list all possible instantiations of a metavariable given certain sortal restrictions.


Example: biggest

syntax: [N [ADJ biggest] N↓]
referents: y
conditions: P (x, y), max(y)
arguments: (N, x)

P → geo:area (x = geo:city ⊔ geo:state)
  ∣ geo:population (x = geo:city ⊔ geo:state)
  ∣ geo:height (x = geo:mountain)


Semantically light expressions

For semantically light expressions, we assume the most underspecified representation: a metavariable without a metavariable specification.


Example: has

syntax: [S DP1↓ [VP [V has] DP2↓]]
semantics: P (y, x)
arguments: (DP1, x), (DP2, y)


Example: final underspecified representation

▸ Which state has the biggest city?
▸ referents: x, y, z
  conditions: geo:state(y), geo:city(x), P (x, y), Q (x, z), max(z)

  Q → geo:area (x = geo:city ⊔ geo:state)
    ∣ geo:population (x = geo:city ⊔ geo:state)
    ∣ geo:height (x = geo:mountain)


The interpretation process thus constructs one underspecified representation that has to be specified in order to yield a specific query that can be evaluated w.r.t. a knowledge base.

Ontological reasoning

Integrating a reasoner in order to resolve metavariables and thereby construct exactly those interpretations that are possible and consistent

Resolving Q

referents: x, y, z
conditions: geo:state(y), geo:city(x), P (x, y), Q (x, z), max(z)

Q → geo:area (x = geo:city ⊔ geo:state)
  ∣ geo:population (x = geo:city ⊔ geo:state)
  ∣ geo:height (x = geo:mountain)

Check satisfiability of the intersection of x's type information with the sortal restrictions:

▸ geo:city ⊓ (geo:city ⊔ geo:state): true
▸ geo:city ⊓ geo:mountain: false
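The satisfiability check can be illustrated with a deliberately simplified model. The sketch below assumes that the atomic classes are pairwise disjoint, so that a union of classes can be modelled as a Python set and the intersection is satisfiable exactly when the sets overlap. A real reasoner works on the ontology's axioms instead; `Q_SPEC` and `resolve` are hypothetical names.

```python
# Assumption: atomic classes are pairwise disjoint, so a union of classes
# is modelled as a set and "satisfiable intersection" means "sets overlap".
Q_SPEC = {
    "geo:area": {"geo:city", "geo:state"},
    "geo:population": {"geo:city", "geo:state"},
    "geo:height": {"geo:mountain"},
}


def resolve(spec, known_types):
    """Keep the instantiations whose sortal restriction is compatible
    with the type information already known for the variable."""
    return [prop for prop, allowed in spec.items() if allowed & known_types]


# x is known to be a geo:city in "Which state has the biggest city?"
print(resolve(Q_SPEC, {"geo:city"}))  # ['geo:area', 'geo:population']
```

Unlike the enumeration strategy, the inconsistent alternative geo:height is discarded here before any query is built.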


Resolving Q

referents: x, y, z
conditions: geo:state(y), geo:city(x), P (x, y), Q (x, z), max(z)

Q → geo:area (x = geo:city ⊔ geo:state)
  ∣ geo:population (x = geo:city ⊔ geo:state)


Resolving Q

▸ referents: x, y, z
  conditions: geo:state(y), geo:city(x), P (x, y), geo:area (x, z), max(z)
▸ referents: x, y, z
  conditions: geo:state(y), geo:city(x), P (x, y), geo:population (x, z), max(z)
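Replacing a metavariable by each of its remaining instantiations yields one representation per alternative. A minimal sketch of that substitution step follows, with conditions encoded as tuples; the encoding and the names `CONDITIONS` and `instantiate` are assumptions made for illustration, and max(z) is left out for brevity.

```python
from itertools import product

# Conditions of the underspecified representation; P and Q are metavariables.
CONDITIONS = [
    ("geo:state", "y"),
    ("geo:city", "x"),
    ("P", "x", "y"),
    ("Q", "x", "z"),
]


def instantiate(conditions, candidates):
    """Yield one condition list per combination of metavariable
    instantiations. Metavariables without candidates are left as they are."""
    names = sorted(candidates)
    for choice in product(*(candidates[name] for name in names)):
        binding = dict(zip(names, choice))
        yield [(binding.get(c[0], c[0]),) + tuple(c[1:]) for c in conditions]


specified = list(instantiate(CONDITIONS, {"Q": ["geo:area", "geo:population"]}))
for representation in specified:
    print(representation)
```

P stays unresolved in both results because no candidates were supplied for it.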


Example: Resolving P

▸ referents: x, y, z
  conditions: geo:state(y), geo:city(x), P (x, y), geo:area (x, z), max(z)
▸ referents: x, y, z
  conditions: geo:state(y), geo:city(x), P (x, y), geo:population (x, z), max(z)

The ontology is searched for a relation that admits geo:city (or a superclass) as domain and geo:state (or a superclass) as range.

▸ geo:inState


Example: Resolving P

▸ referents: x, y, z
  conditions: geo:state(y), geo:city(x), geo:inState (x, y), geo:area (x, z), max(z)
▸ referents: x, y, z
  conditions: geo:state(y), geo:city(x), geo:inState (x, y), geo:population (x, z), max(z)
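The search for a relation with a matching domain and range can be sketched over a toy ontology. Everything in the data below is an assumption made for illustration: the domain and range declarations, the property geo:borders and the superclass geo:place are invented and need not match the actual GeoBase ontology.

```python
# Hypothetical toy ontology: property -> (domain, range).
PROPERTIES = {
    "geo:inState": ("geo:city", "geo:state"),
    "geo:flowsThrough": ("geo:river", "geo:state"),
    "geo:borders": ("geo:state", "geo:state"),
}
# Hypothetical subclass relation: class -> direct superclass.
SUPERCLASS = {"geo:city": "geo:place", "geo:state": "geo:place"}


def ancestors(cls):
    """Return cls together with all of its superclasses."""
    result = {cls}
    while cls in SUPERCLASS:
        cls = SUPERCLASS[cls]
        result.add(cls)
    return result


def candidate_relations(subject_type, object_type):
    """Properties whose domain and range admit the given argument types."""
    return [
        prop
        for prop, (domain, rng) in PROPERTIES.items()
        if domain in ancestors(subject_type) and rng in ancestors(object_type)
    ]


print(candidate_relations("geo:city", "geo:state"))  # ['geo:inState']
```

With x typed as geo:city and y as geo:state, only geo:inState fits in this toy data, which matches the instantiation of P shown above.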


Results and conclusion


Results

Of the 880 GeoBase user questions, Pythia can handle 624.

                   Enumeration   Reasoning
Total # queries    3180          2100
Avg. # queries     5.1           3.4 (-44%)
Max. # queries     96            24 (-75%)

Distribution (improved results)

[Figure: bar chart; bar values and axis labels not recoverable]

Conclusion

Ambiguous expressions are everywhere. But disambiguation strategies

▸ make it possible to discard interpretations that are not consistent in the context and therefore should be filtered out
▸ significantly reduce the number of constructed queries: the average number of queries per question is reduced by 44%, and the maximum number of queries per question even by 75%
