Embedding Probabilistic Logic for Machine Reading
Sebastian Riedel (University College London)

Overview
Machine Reading & Reasoning …
… with Probabilistic Logics and Embeddings
Challenges
  Injecting Explanations
  Extracting Explanations

Machine Reading

“Who works in London and is interested in NLP?”

interest(x,NLP), worksFor(x,y), in(y,London) [Kwiatkowski et al., 2013]

Relational DB: topic(Seb,NLP), worksFor(Seb,UCL), in(UCL,London)

Narrow domain-specific schema [Mintz et al., 2009]

Statistical NLP: Semantics, Coreference, Syntax

“Sebastian Riedel works in the area of NLP and is now Lecturer at UCL”

Machine Reading [Riedel et al., 2013]

“Who works in London and is interested in NLP?”

interest(x,NLP), worksFor(x,y), in(y,London)

Relational DB: works-in-area-of(Seb,NLP), lecturer-at(Seb,UCL), in(UCL,London)

Wide universal schema

Statistical NLP: Semantics, Coreference, Syntax

“Sebastian Riedel works in the area of NLP and is now Lecturer at UCL”

Semantics as Reasoning [Riedel et al., 2013]

“Who works in London and is interested in NLP?”

interest(x,NLP), worksFor(x,y), in(y,London)

works-in-area-of(Seb,NLP), lecturer-at(Seb,UCL), in(UCL,London)

Statistical Relational Learner and Reasoner
  worksFor(x,y): faculty-at(x,y)
  faculty-at(x,y): lecturer-at(x,y)
  interest(x,y): works-in-area-of(x,y) [0.9]

Wide universal schema

Statistical NLP: Coreference, Syntax

“Sebastian Riedel works in the area of NLP and is now Lecturer at UCL”

Benefit: Transitive Reasoning

“Who works in London and is interested in NLP?”

interest(x,NLP), worksFor(x,y), in(y,London)

works-in-area-of(Seb,NLP), lecturer-at(Seb,UCL), in(UCL,London)

Statistical Relational Learner and Reasoner
  worksFor(x,y): faculty-at(x,y)
  faculty-at(x,y): lecturer-at(x,y)
  interest(x,y): works-in-area-of(x,y) [0.9]

Wide universal schema

Statistical NLP: Coreference, Syntax

“Sebastian Riedel works in the area of NLP and is now Lecturer at UCL”
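
The chain lecturer-at → faculty-at → worksFor can be made concrete with a small forward-chaining sketch. The facts, rules and the 0.9 weight are taken from the slide; everything else is an assumption made for illustration: rules without a stated weight get weight 1.0, scores are combined by multiplication, and the function names `forward_chain` and `answer` are hypothetical. This is not how a Markov Logic engine or the speaker's reasoner computes probabilities.

```python
# Facts from the slide, each with an assumed confidence of 1.0.
facts = {
    ("works-in-area-of", "Seb", "NLP"): 1.0,
    ("lecturer-at", "Seb", "UCL"): 1.0,
    ("in", "UCL", "London"): 1.0,
}

# (head, body, weight): head(x,y) follows from body(x,y).
# Only the 0.9 is on the slide; the 1.0 weights are assumptions.
rules = [
    ("worksFor", "faculty-at", 1.0),
    ("faculty-at", "lecturer-at", 1.0),
    ("interest", "works-in-area-of", 0.9),
]


def forward_chain(facts, rules):
    """Apply the rules repeatedly until no fact gets a higher score."""
    derived = dict(facts)
    changed = True
    while changed:
        changed = False
        for head, body, weight in rules:
            for (rel, x, y), score in list(derived.items()):
                if rel != body:
                    continue
                new_score = score * weight
                if new_score > derived.get((head, x, y), 0.0):
                    derived[(head, x, y)] = new_score
                    changed = True
    return derived


def answer(derived):
    """Evaluate interest(x,NLP), worksFor(x,y), in(y,London)."""
    results = {}
    for (rel, x, topic), s1 in derived.items():
        if rel != "interest" or topic != "NLP":
            continue
        for (rel2, x2, y), s2 in derived.items():
            if rel2 != "worksFor" or x2 != x:
                continue
            s3 = derived.get(("in", y, "London"), 0.0)
            if s3 > 0.0:
                results[x] = max(results.get(x, 0.0), s1 * s2 * s3)
    return results


derived = forward_chain(facts, rules)
answers = answer(derived)
print(answers)
```

Note that worksFor(Seb,UCL) is never stated in the text; it only becomes available after two rule applications, which is the transitive step the slide title refers to.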

Benefit: More Coverage

“Who is faculty in London and interested in NLP?”

interest(x,NLP), worksFor(x,y), in(y,London)

works-in-area-of(Seb,NLP), lecturer-at(Seb,UCL), in(UCL,London)

Statistical Relational Learner and Reasoner
  worksFor(x,y): faculty-at(x,y)
  faculty-at(x,y): lecturer-at(x,y)
  interest(x,y): works-in-area-of(x,y) [0.9]

Wide universal schema

Statistical NLP: Coreference, Syntax

“Sebastian Riedel works in the area of NLP and is now Lecturer at UCL”

Benefit: Code Reuse

“Who lives in London and is interested in NLP?”

interest(x,NLP), worksFor(x,y), in(y,London)

works-in-area-of(Seb,NLP), lecturer-at(Seb,UCL), in(UCL,London)

Statistical Relational Learner and Reasoner [Lao et al., 2011]
  worksFor(x,y): faculty-at(x,y)
  interest(x,y): works-in-area-of(x,y) [0.9]
  livesIn(x,z): worksFor(x,y), locatedIn(y,z) [0.6]

Wide universal schema

Statistical NLP: Coreference, Syntax

“Sebastian Riedel works in the area of NLP and is now Lecturer at UCL”

Reasoner and Learner

Statistical Relational Learner and Reasoner: ?

Probabilistic Logics

Use (weighted) logics to define graphical models

[Figure: graphical model over lecturer-at, prof-at and works-for]

Examples
Markov Logic [Richardson and Domingos, 2006]
Bayesian Logic Programs [Kersting, 2007]

Probabilistic Logics

Use (weighted) logics to define graphical models

[Figure: graphical model over lecturer-at, prof-at and works-for]

Problems
Inference
Rule Learning

Matrix Factorization

Think of database as a matrix or tensor

[Figure: matrix over the relations lecturer-at, prof-at and works-for, with observed facts marked 1]

Matrix Factorization

Embed entity (pairs) in low dimensional vector spaces

[Figure: the same matrix over lecturer-at, prof-at and works-for, with unknown embedding values (??) added for the entity pairs]

Matrix Factorization

Embed relations in low dimensional vector spaces

[Figure: the same matrix, with unknown embedding values (?) added for the relations lecturer-at, prof-at and works-for]

Matrix Factorization

Find a matrix-matrix product that approximates observed DB

[Figure: the observed matrix next to the product of the entity-pair embeddings (??) and the embeddings (?) of lecturer-at, prof-at and works-for]

Matrix Factorization

Or a non-linear function of this product

[Figure: the observed matrix approximated by a sigmoid applied to the product of the embeddings]
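
Written out, this model could look as follows. The symbols are assumptions chosen here, not notation from the slides: $\mathbf{a}_{(e_1,e_2)}$ is the embedding of an entity pair, $\mathbf{v}_r$ the embedding of a relation, and $\sigma$ the sigmoid.

```latex
p\bigl(r(e_1, e_2)\bigr) = \sigma\bigl(\mathbf{v}_r^{\top} \mathbf{a}_{(e_1,e_2)}\bigr),
\qquad
\sigma(z) = \frac{1}{1 + e^{-z}}
```

Stacking all $\mathbf{a}_{(e_1,e_2)}$ as rows of a matrix $A$ and all $\mathbf{v}_r$ as rows of $V$, the whole database is approximated by $\sigma(A V^{\top})$ applied cell by cell.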

Matrix Factorization

Low rank forces some 0 cells to become non-zero => prediction

[Figure: sigmoid of the low-rank product; cells that were empty in the observed matrix now hold values such as .9]

[Nickel, Bordes, …]
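
A minimal NumPy sketch of this idea follows. It is an illustration under stated assumptions, not the model from the talk: the toy matrix `Y`, the rank, the learning rate, the L2 penalty and the plain full-batch gradient descent are all made up here, and the names `factorize` and `loss` are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def loss(A, V, Y, l2):
    """Negative log-likelihood of the 0/1 matrix Y plus an L2 penalty."""
    P = sigmoid(A @ V.T)
    eps = 1e-12
    nll = -np.sum(Y * np.log(P + eps) + (1 - Y) * np.log(1 - P + eps))
    return nll + l2 * (np.sum(A ** 2) + np.sum(V ** 2))


def factorize(Y, rank=1, steps=500, lr=0.1, l2=0.01, seed=0):
    """Fit sigmoid(A @ V.T) to Y by full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    A = 0.1 * rng.standard_normal((Y.shape[0], rank))  # entity-pair embeddings
    V = 0.1 * rng.standard_normal((Y.shape[1], rank))  # relation embeddings
    for _ in range(steps):
        err = sigmoid(A @ V.T) - Y          # gradient of the NLL w.r.t. the logits
        grad_A = err @ V + 2 * l2 * A
        grad_V = err.T @ A + 2 * l2 * V
        A -= lr * grad_A
        V -= lr * grad_V
    return A, V


# Hypothetical toy database: rows are entity pairs,
# columns are lecturer-at, prof-at, works-for.
Y = np.array([
    [1.0, 0.0, 1.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0],   # works-for is not observed for this pair
])

A, V = factorize(Y)
P = sigmoid(A @ V.T)
print("predicted works-for for the last pair:", P[3, 2])
```

The last row shares lecturer-at with the first two rows, so a low-rank fit has little room to give it a very different works-for score; how high the predicted value actually comes out depends on the rank, the regularisation and the training schedule.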

Results for Relation Extraction [Riedel et al. 2013, NAACL]

[Figure: averaged 11-point precision/recall curves for the systems SU12, N, F, NF and NFE; x-axis: Recall, y-axis: Precision]
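
For readers unfamiliar with the measure, 11-point interpolation can be sketched as below. This is the textbook information-retrieval definition (precision interpolated at recall levels 0.0, 0.1, …, 1.0); how exactly the curves on the slide were computed and averaged is not stated, so treat the function, whose name is hypothetical, as an assumption.

```python
def eleven_point_average_precision(ranked_labels):
    """ranked_labels: booleans for a ranked list of predictions, best first.

    Returns the mean of the interpolated precision at recall 0.0, 0.1, ..., 1.0.
    """
    total_relevant = sum(bool(label) for label in ranked_labels)
    if total_relevant == 0:
        return 0.0

    # (recall, precision) after each position in the ranking.
    points = []
    hits = 0
    for k, is_correct in enumerate(ranked_labels, start=1):
        hits += bool(is_correct)
        points.append((hits / total_relevant, hits / k))

    # Interpolated precision at a recall level is the best precision
    # reached at that recall or any higher one.
    levels = [i / 10 for i in range(11)]
    interpolated = [
        max((p for r, p in points if r >= level), default=0.0)
        for level in levels
    ]
    return sum(interpolated) / len(levels)


score = eleven_point_average_precision([True, False, True, False])
print(score)
```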

Challenge 1: Injecting Symbolic Rules

“lecturers are employees!”

[Figure 1: Injecting Logic into Matrix Factorization. Labels: Facts, First-order Formulae, KB, |P|, sigmoid. The formulae and the caption are cut off in the source:
∀x, y : #2-unit-of-#1(x, y) ⇒ organi… Example: “Boeing and the Sikorsky Aircraf…
∀x, y : #1-city-in-#2(x, y) ⇒ locati… Example: “With 900,000 people, San Jose#1…
Caption: G… entity-pairs P and predicates/relations R, matrix factori… embeddings that approximate the observed matrix. In… entities and relations to learn the embeddings such that th…]

Challenge 1: Injecting Symbolic Rules

“a liquid turns into a solid when its temperature is lowered below its freezing point”

[Figure: the rule is fed into the sigmoid factorization model; how to do so is marked “?”]

Some Experiments

“Zero-shot” learning
Given: a lot of relational data, but none for worksFor
Goal: given a few rules for worksFor, learn to predict worksFor

Results (MAP over several relations)
Only rules: 0.23
Apply rules after factorization: 0.34
Apply rules before factorization: 0.43
Incorporate rules into training objective: 0.52

[Rocktaeschel et al. 2014, SP14]
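
One way to picture "incorporate rules into training objective" is to turn a rule such as worksFor(x,y): lecturer-at(x,y) into a differentiable penalty over the embeddings. The sketch below assumes a product-style soft logic, where the truth of "body implies head" is 1 − p(body)·(1 − p(head)); the slide does not give the formula, so this choice, the toy vectors and the names `implication_truth` and `rule_loss` are assumptions rather than the method of the cited work.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def fact_prob(rel_vec, pair_vec):
    """Probability of a fact under the sigmoid factorization model."""
    return sigmoid(dot(rel_vec, pair_vec))


def implication_truth(p_body, p_head):
    """Soft truth of body => head, read as not(body and not head)."""
    return 1.0 - p_body * (1.0 - p_head)


def rule_loss(rel_body, rel_head, pair_vecs):
    """Negative log soft truth of the rule, summed over entity pairs.

    Adding this term to the factorization loss pushes the embeddings
    towards satisfying the rule for every pair, observed or not.
    """
    return -sum(
        math.log(implication_truth(fact_prob(rel_body, p), fact_prob(rel_head, p)))
        for p in pair_vecs
    )


# Hypothetical 2-dimensional embeddings.
pairs = [[2.0, 0.0], [1.0, 1.0]]
lecturer_at = [1.0, 0.0]
works_for_aligned = [1.0, 0.0]    # agrees with lecturer-at
works_for_opposed = [-1.0, 0.0]   # contradicts lecturer-at

print(rule_loss(lecturer_at, works_for_aligned, pairs))
print(rule_loss(lecturer_at, works_for_opposed, pairs))
```

The loss is lower when the head relation's embedding points the same way as the body's, which is the behaviour a rule term in the objective is meant to encourage.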

Challenge 2: Extracting Explanations

“lecturers are employees!”

[Figure 1: Injecting Logic into Matrix Factorization, repeated from the Challenge 1 slide; formulae and caption are cut off in the source]

Challenge 2: Extracting Explanations

“I returned Sebastian because we know he is a lecturer at UCL, which is in London, so he most likely lives in London …”

[Thrun 1995, NIPS, Craven 1996, NIPS]

[Figure 1: Injecting Logic into Matrix Factorization, repeated from the Challenge 1 slide; formulae and caption are cut off in the source]

Summary

Do semantics in a probabilistic relational reasoner
Reasoner: matrix/tensor factorization (or other LV models)
Challenges:
  inject explanations
  extract explanations

Do this for: deeper downstream tasks such as question answering, fact checking, machine comprehension

We are hiring (thanks to the Paul G. Allen Foundation)

Thanks


NIPS Learning Semantics 2014
