AAAI 2012

Exact Lifted Inference with Distinct Soft Evidence on Every Object

Hung Hai Bui, Tuyen N. Huynh, Rodrigo de Salvo Braz
Artificial Intelligence Center, SRI International, Menlo Park, CA, USA

July 26, 2012


Outline

1 Outline

2 Distinct Soft Evidence is Problematic

3 LIDE (Lifted Inference with Distinct Evidence)

4 Experiments


Lifted Inference and the Problematic Soft Evidence

• The main idea of lifted inference is to exploit the symmetry of probabilistic models. This leads to algorithms that can be very efficient on models that have high tree-width but are symmetric.
• Soft evidence at the level of every object destroys the model's symmetry.
  • Everyone has a different weight, cholesterol level, etc.

[Figure: two panels, "Symmetric" and "Symmetry destroyed".]

• Aim: lifted inference with distinct soft evidence on every object.


Distinct Soft Evidence on a Unary Predicate

• The simplest form of distinct soft evidence: evidence on every grounding of a single unary predicate.
• Consider an MLN M that consists of:
  • An MLN M0 with a unary predicate q.
  • A set of soft evidence of the form w_i : q(i) for every object i.

M0:
  1.4 : ¬Smokes(x)
  2.3 : ¬Cancer(x)
  4.6 : ¬Friends(x, y)
  1.5 : Smokes(x) ⇒ Cancer(x)
  1.1 : Smokes(x) ∧ Friends(x, y) ⇒ Smokes(y)

Evidence:
  w_1 : Cancer(P_1)
  w_2 : Cancer(P_2)
  ...
  w_1000 : Cancer(P_1000)

(tree-width = 1000)


LIDE (Lifted Inference with Distinct Evidence)

• Most lifted inference methods applied to M would completely shatter the model, thus reverting to ground inference.
• LIDE's approach:
  1. Perform lifted inference on M0 only.
  2. Use special operations to absorb the soft evidence.
• Instead of exploiting symmetry of the model, we exploit symmetry of the partition function.

Symmetric Function

Definition. An n-variable function $F(t_1, \ldots, t_n)$ is symmetric if, for every permutation $\pi$, permuting the variables of $F$ by $\pi$ does not change the output value, that is, $F(t_1, \ldots, t_n) = F(t_{\pi(1)}, \ldots, t_{\pi(n)})$.

• $F$ depends only on the histogram of its arguments.
• If $t_i \in \{0, 1\}$, the set $\{c_k\}$, $k = 0, \ldots, n$, where $c_k = F(t)$ for any $t$ such that $\|t\|_1 = k$, is termed the counting representation of the symmetric function $F$.
• An exchangeable distribution is a symmetric function, so it admits a counting representation.
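The counting representation can be illustrated with a small sketch. The helper names `counting_representation` and `is_symmetric` and the function `toy` are hypothetical and introduced only for illustration; the symmetry check enumerates all $2^n$ binary vectors, so it is only meant for tiny $n$.

```python
from itertools import product


def counting_representation(F, n):
    """Return [c_0, ..., c_n], where c_k = F(t) for a binary t with k ones.

    Assumes F is symmetric, so any t with k ones gives the same value;
    here the representative t has its k ones first.
    """
    return [F(tuple([1] * k + [0] * (n - k))) for k in range(n + 1)]


def is_symmetric(F, n):
    """Brute-force check that F(t) depends only on the number of ones in t."""
    c = counting_representation(F, n)
    return all(F(t) == c[sum(t)] for t in product((0, 1), repeat=n))


def toy(t):
    # Hypothetical symmetric function: depends only on the count of ones.
    return (sum(t) + 1) ** 2


c = counting_representation(toy, 3)
print(c)                                  # n + 1 numbers instead of 2**n
print(is_symmetric(toy, 3))               # symmetric
print(is_symmetric(lambda t: t[0], 3))    # depends on position, not symmetric
```

The point of the representation is the size reduction: a symmetric function of $n$ binary arguments is fully described by $n + 1$ values.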

Exchangeability of Groundings of a Unary Predicate

Theorem. Let $D^* = \{d_1, \ldots, d_n\}$ be the set of individuals that do not appear as constants in the MLN $M_0$, and let $q$ be a unary predicate in $M_0$. Let $P_0(\cdot) = \Pr(q(d_1), \ldots, q(d_n) \mid M_0)$. Then the random vector $(q(d_1), \ldots, q(d_n))$ is exchangeable under $P_0$.

• The proof is in the paper.
• This seems trivial: $d_1, \ldots, d_n$ do not appear in $M_0$, so they are "indistinguishable". But beware: "indistinguishable" does not necessarily imply exchangeable. Groundings of an n-ary predicate are in general NOT exchangeable when n > 1.


LIDE as a Wrapper

Step 1: Apply any applicable lifted inference technique on $M_0$ to compute the counting representation $\{c_k\}$ of $P_0(\cdot)$.
  • One natural method is counting elimination.

Step 2: Absorb the soft evidence.
  • This is equivalent to computing the posterior of a set of exchangeable binary random variables

$$P(q_1, \ldots, q_n) = \frac{1}{Z} P_0(q_1, \ldots, q_n) \prod_{i=1}^{n} \phi_i(q_i)$$

    where $q_i = q(d_i)$.
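The slides do not spell out how a soft-evidence weight $w_i$ becomes the factor $\phi_i$. A minimal worked form, assuming the usual log-linear MLN semantics in which a ground unit clause $w_i : q(d_i)$ multiplies the unnormalized probability by $e^{w_i}$ when it is satisfied, would be:

```latex
\phi_i(q_i) = \exp(w_i \, q_i),
\qquad \phi_i(0) = 1,
\qquad \phi_i(1) = e^{w_i},
\qquad \frac{\phi_i(1)}{\phi_i(0)} = e^{w_i}.
```

Under this assumption, each object's evidence enters the posterior only through the single positive number $e^{w_i}$.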

Posterior of Exchangeable Binary RVs

$$\Pr(q_1, \ldots, q_n) = \frac{1}{Z} P_0(q_1, \ldots, q_n) \prod_{i=1}^{n} \phi_i(q_i)$$

We discuss three related problems:
• Computing the MAP configuration $q$ under the marginal $\Pr(q)$ (a.k.a. the marginal-MAP problem).
• Computing the partition function $Z$.
• Computing the marginal $\Pr(q_i)$ for each individual $d_i$.

MAP Inference

Let $\alpha_i = \frac{\phi_i(1)}{\phi_i(0)}$ and $\Phi = \prod_{i=1}^{n} \phi_i(0)$. Then

$$P(q) = \frac{\Phi}{Z} P_0(q_1, \ldots, q_n) \prod_{i=1}^{n} \alpha_i^{q_i}$$

$$\max_q P(q) = \frac{\Phi}{Z} \max_k \, c_k \max_{q : \|q\|_1 = k} \prod_{i=1}^{n} \alpha_i^{q_i}$$

• Observation: the second maximization simply picks the $k$ largest elements of $\alpha$.
• By sorting the vector $\alpha$, the MAP problem can be solved in $O(n \log n)$ given $\{c_k\}$ as input.
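The sorting argument can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `map_configuration` and the example numbers are hypothetical, the returned score omits the constant factor $\Phi / Z$, and plain floating-point products are used, so a log-space variant would be needed for large $n$.

```python
def map_configuration(c, alpha):
    """Maximize c_k * prod_i alpha_i**q_i over binary vectors q.

    c     : counting representation [c_0, ..., c_n] of P0
    alpha : ratios alpha_i = phi_i(1) / phi_i(0), all positive
    Returns (q, score), where score is the unnormalized maximum.
    """
    n = len(alpha)
    # Sort indices by decreasing alpha: for a fixed k, the best q switches
    # on exactly the k largest alphas.
    order = sorted(range(n), key=lambda i: alpha[i], reverse=True)

    best_k, best_score = 0, c[0]      # k = 0: the empty product equals 1
    prefix = 1.0
    for k in range(1, n + 1):
        prefix *= alpha[order[k - 1]]  # product of the k largest alphas
        score = c[k] * prefix
        if score > best_score:
            best_k, best_score = k, score

    q = [0] * n
    for i in order[:best_k]:
        q[i] = 1
    return q, best_score


# Hypothetical example with n = 3; c satisfies c0 + 3*c1 + 3*c2 + c3 = 1.
q, score = map_configuration([0.1, 0.2, 0.05, 0.15], [0.2, 6.0, 3.0])
print(q, score)
```

After the sort, the loop over $k$ is linear, so the sort dominates the cost, in line with the $O(n \log n)$ bound stated above.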

Partition Function Z

$$Z(\alpha_1, \ldots, \alpha_n) = \Phi \sum_{q_1 \ldots q_n} P_0(q_1, \ldots, q_n) \prod_{i=1}^{n} \alpha_i^{q_i}$$

• Observation: $Z$ is a polynomial in $\alpha$. More importantly, $Z$ is a symmetric polynomial.
• According to the fundamental theorem of symmetric polynomials, it can be expressed in terms of a small number of building units called elementary symmetric polynomials.

$$Z(\alpha) = \Phi \sum_{k=0}^{n} c_k \, e_k(\alpha)$$
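As a sanity check of the formula, the case $n = 2$ can be expanded by hand. The derivation uses only exchangeability of $P_0$, which gives $P_0(1,0) = P_0(0,1) = c_1$:

```latex
\begin{aligned}
Z(\alpha_1, \alpha_2)
  &= \Phi \left[ P_0(0,0) + P_0(1,0)\,\alpha_1 + P_0(0,1)\,\alpha_2
                 + P_0(1,1)\,\alpha_1 \alpha_2 \right] \\
  &= \Phi \left[ c_0 + c_1 (\alpha_1 + \alpha_2) + c_2\,\alpha_1 \alpha_2 \right] \\
  &= \Phi \left[ c_0\, e_0(\alpha) + c_1\, e_1(\alpha) + c_2\, e_2(\alpha) \right],
\end{aligned}
```

where $e_0(\alpha) = 1$, $e_1(\alpha) = \alpha_1 + \alpha_2$ and $e_2(\alpha) = \alpha_1 \alpha_2$.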

Elementary Symmetric Polynomials

• $e_k(\alpha)$ is the k-th order elementary symmetric polynomial in $\alpha$, the sum of all products of $k$ distinct elements of $\alpha$:

$$e_k(\alpha) = \sum_{1 \le i_1 < \cdots < i_k \le n} \alpha_{i_1} \cdots \alpha_{i_k}$$

• This is a sum of $\binom{n}{k}$ terms, so naive evaluation is a bad idea.
• Newton's identity: let $p_k(\alpha_1, \ldots, \alpha_n) = \sum_{i=1}^{n} \alpha_i^k$ be the k-th power sum. Then

$$e_k(\alpha) = \frac{1}{k} \sum_{i=1}^{k} (-1)^{i-1} e_{k-i}(\alpha) \, p_i(\alpha)$$

• This yields a recursive method to compute all $e_k(\alpha)$ in $O(n^2)$.
• Thus, $Z$ can be computed in $O(n^2)$ given $\{c_k\}$.
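The recursion can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' code: the names `elementary_symmetric` and `partition_function` are hypothetical, and exact rational arithmetic (`fractions.Fraction`) is an added choice, because the alternating signs in Newton's identity can cause cancellation in floating point.

```python
from fractions import Fraction


def elementary_symmetric(alpha):
    """Return [e_0(alpha), ..., e_n(alpha)] using Newton's identity."""
    n = len(alpha)
    a = [Fraction(x) for x in alpha]

    # Power sums p_1, ..., p_n, built incrementally in O(n^2) operations.
    p = [Fraction(0)] * (n + 1)
    powers = [Fraction(1)] * n
    for k in range(1, n + 1):
        powers = [pw * x for pw, x in zip(powers, a)]   # alpha_i ** k
        p[k] = sum(powers)

    # e_k = (1/k) * sum_{i=1..k} (-1)^(i-1) * e_{k-i} * p_i, with e_0 = 1.
    e = [Fraction(1)] + [Fraction(0)] * n
    for k in range(1, n + 1):
        s = sum((-1) ** (i - 1) * e[k - i] * p[i] for i in range(1, k + 1))
        e[k] = s / k
    return e


def partition_function(c, alpha, Phi=1):
    """Z(alpha) = Phi * sum_k c_k * e_k(alpha)."""
    e = elementary_symmetric(alpha)
    return Phi * sum(Fraction(ck) * ek for ck, ek in zip(c, e))


e = elementary_symmetric([1, 2, 3])
# Hypothetical P0: three independent fair coins, so c_k = 1/8 for every k.
Z = partition_function([Fraction(1, 8)] * 4, [1, 2, 3])
print(e, Z)
```

For the independent uniform $P_0$ used in the example, $Z$ must equal $\prod_i (1 + \alpha_i) / 2^n$, which gives a quick correctness check.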

Marginal on Each Individual

• As usual, the marginals $\Pr(q_i)$ can be computed in a way similar to the computation of the normalization term $Z$, as the following result shows.
• Let $\alpha^{(i)}$ be the vector such that $\alpha^{(i)}_i = 0$ and $\alpha^{(i)}_j = \alpha_j$ for every $j \ne i$. Then

$$\Pr(q_i = 0) = \frac{Z(\alpha^{(i)})}{Z(\alpha)} = \frac{\sum_{k=0}^{n} c_k \, e_k(\alpha^{(i)})}{\sum_{k=0}^{n} c_k \, e_k(\alpha)}$$
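The ratio formula translates directly into code. The sketch below is a hedged illustration with hypothetical names (`esp`, `marginals_q0`). Two choices are assumptions of the sketch rather than anything stated in the slides: the elementary symmetric polynomials are computed by expanding $\prod_i (1 + \alpha_i x)$ instead of using Newton's identity, and each $Z(\alpha^{(i)})$ is recomputed from scratch, which costs $O(n^3)$ overall. The factor $\Phi$ cancels in the ratio, so it is omitted.

```python
def esp(alpha):
    """Return [e_0, ..., e_n] as the coefficients of prod_i (1 + alpha_i * x).

    This product recurrence is swapped in for Newton's identity; it also
    costs O(n^2) and uses only additions and multiplications.
    """
    e = [1.0] + [0.0] * len(alpha)
    for j, a in enumerate(alpha, start=1):
        # Go downwards so e[k - 1] is still the value before adding a.
        for k in range(j, 0, -1):
            e[k] += a * e[k - 1]
    return e


def marginals_q0(c, alpha):
    """Return [Pr(q_i = 0) for each i], given the counting representation c."""
    z = sum(ck * ek for ck, ek in zip(c, esp(alpha)))
    result = []
    for i in range(len(alpha)):
        alpha_i = list(alpha)
        alpha_i[i] = 0.0                      # the vector alpha^(i)
        z_i = sum(ck * ek for ck, ek in zip(c, esp(alpha_i)))
        result.append(z_i / z)
    return result


# Hypothetical P0: three independent fair coins, so c_k = 1/8 for every k.
probs = marginals_q0([0.125] * 4, [1.0, 2.0, 3.0])
print(probs)
```

With an independent uniform $P_0$ the posterior factorizes, so the result should match $1 / (1 + \alpha_i)$ for each $i$, which makes the example easy to verify by hand.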

Experimental Setup

• Friends and Smokes domain.
• Task: compute the marginal probability of having cancer for each person, given the cancer test readings of the whole population.
• Individual soft evidence is sampled uniformly from [0, 2]. Thus, lifted BP reduces to ground BP.
• Two versions of the "Friends and Smokes" MLN:
  • Original Friends-and-Smokes: encourages Smokes(x) and Smokes(y) to be the same if Friends(x, y) is unknown.
    • Attractive potential between Smokes(x) and Smokes(y).
  • Friends-and-Smokes-Neg: −1.1 : Smokes(x) ∧ Friends(x, y) ⇒ Smokes(y).
    • Repulsive potential between Smokes(x) and Smokes(y).
    • Difficult test case for BP.

Running Time on "Friends & Smokes"

[Figure: running time (seconds) versus number of persons (10 to 1500) for LIDE and BP.]

Note:
• A slightly modified C-FOVE is used for lifted inference without evidence.
• C-FOVE time dominates evidence-absorbing time.
• Junction tree ran out of memory for N = 30.

Running Time on "Friends & Smokes-Neg"

[Figure: running time (seconds) versus number of persons (10 to 100) for LIDE and BP with damping (damping = 0.1).]

Note: BP did not converge when N ≥ 200.

Evidence Strength vs Probability

[Figure: scatter plot of Prob(Cancer) (0 to 0.45) against evidence strength w (0 to 2).]

Note:
• This is a scatter plot, not a function.
• The distribution of Pr(Cancer) spreads, so quantization will lose accuracy.

Conclusion and Future Direction

• We propose a new strategy for handling distinct soft evidence:
  • Perform lifted inference (e.g. C-FOVE) without the distinct soft evidence.
  • Absorb the soft evidence by exploiting the symmetry of the partition polynomial Z.
• Future directions:
  • Soft evidence on multiple (L) unary predicates.
    • Polynomial in domain size N, but super-exponential in L.
    • Need to depart from exact inference and derive efficient approximations.
  • Soft evidence on one binary predicate.
    • Intractable in general.
    • Are there efficient approximations that can be derived from this approach?
