A Joint Topic and Perspective Model for Ideological Discourse

Wei-Hao Lin, Eric Xing, and Alex Hauptmann
Language Technologies Institute, School of Computer Science, Carnegie Mellon University

European Conference on Machine Learning, Antwerp, Belgium, September 2008

On abortion issue

Barack Obama: “Abortion should be legally available in accordance with Roe v. Wade.”
Source: 1998 IL State Legislative National Political Awareness Test, Jul 2, 1998

John McCain: “I have stated time after time that Roe v. Wade was a bad decision, that I support the rights of the unborn.”
Source: Meet the Press: 2007 “Meet the Candidates” series, May 13, 2007

Echo Chamber

Percentage of Internet users in the United States who seek news sources that challenge their views: 20%

Too many newspapers, too little time

Credit: http://www.flickr.com/photos/china-beijing-photowall/1537705755/

Automatic detection of biased articles

Warning: Strongly pro-life bias!

Want to read some pro-choice articles?

Goal

Develop statistical models for ideological discourse

• Automatically identify an article’s viewpoint
• Justify models’ decisions on perspectives
• Raise awareness of individual news sources’ biases
• Facilitate mutual understanding between people holding different beliefs

Outline

• Goal: Model ideological discourse
• Joint Topic and Perspective Model
  • Emphatic patterns in word frequency
  • Model specification
  • Approximate inference using variational methods
• Evaluation
• Conclusions

Emphatic patterns in word frequency

[Figure: word frequencies under the Israeli view and the Palestinian view; successive builds highlight a topical factor and an ideological factor]

Encode emphatic patterns into β structure

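Joint Topic and Perspective Model (jTP)

[Figure: plate diagram with nodes π, P_d, W_{d,n}, β_v, τ, φ_v, hyperparameters µ_τ, Σ_τ, µ_φ, Σ_φ, and plates N_d, D, V]

Document view:
\[ P_d \sim \mathrm{Bernoulli}(\pi), \quad d = 1, \dots, D \]

Word:
\[ W_{d,n} \mid P_d = v \sim \mathrm{Multinomial}(\beta_v), \quad n = 1, \dots, N_d \]
\[ \beta_v^w = \frac{\exp(\tau^w \times \phi_v^w)}{\sum_{w'} \exp(\tau^{w'} \times \phi_v^{w'})}, \quad v = 1, \dots, V \]

Topical weight:
\[ \tau \sim \mathcal{N}(\mu_\tau, \Sigma_\tau) \]

Ideological weight:
\[ \phi_v \sim \mathcal{N}(\mu_\phi, \Sigma_\phi) \]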
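The generative process in the model specification can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the toy vocabulary, the weight values, and the function names (`word_distribution`, `generate_corpus`) are hypothetical, and τ and φ_v are passed in as fixed values rather than drawn from their Gaussian priors.

```python
import math
import random

def softmax(scores):
    """Normalize exponentiated scores into a probability vector."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def word_distribution(tau, phi_v):
    """beta_v^w is proportional to exp(tau^w * phi_v^w)."""
    return softmax([t * p for t, p in zip(tau, phi_v)])

def generate_corpus(tau, phi, pi, doc_lengths, rng):
    """Sample (view, words) pairs: P_d ~ Bernoulli(pi), W_dn ~ Multinomial(beta_v)."""
    betas = [word_distribution(tau, phi_v) for phi_v in phi]
    vocab = list(range(len(tau)))
    corpus = []
    for n_d in doc_lengths:
        view = 1 if rng.random() < pi else 0          # document view P_d
        words = rng.choices(vocab, weights=betas[view], k=n_d)
        corpus.append((view, words))
    return corpus

# Hypothetical toy setting: 3-word vocabulary, two views.
tau = [1.0, 0.5, 0.2]                      # topical weights
phi = [[1.0, 2.0, 0.5], [1.0, 0.5, 2.0]]   # ideological weights per view
corpus = generate_corpus(tau, phi, pi=0.5, doc_lengths=[5, 8],
                         rng=random.Random(0))
```

In this toy setting the two views share the same topical weights but emphasize different words, which is the kind of pattern the β structure is meant to encode.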


Two Difficulties in jTP

1. Computationally intractable inference on weights

\[ P(\tau, \{\phi_v\} \mid \{W_{d,n}\}, \{P_d\}; \Theta) \propto \mathcal{N}(\tau \mid \mu_\tau, \Sigma_\tau) \prod_v \mathcal{N}(\phi_v \mid \mu_\phi, \Sigma_\phi) \prod_d \mathrm{Bernoulli}(P_d \mid \pi) \prod_n \mathrm{Multinomial}(W_{d,n} \mid P_d, \beta) \]

Approximate inference using variational methods:

\[ P(\tau, \{\phi_v\} \mid \{P_d\}, \{W_{d,n}\}; \Theta) \approx q_\tau(\tau) \prod_v q_{\phi_v}(\phi_v) \]

Cope with non-conjugate logistic-normal distributions using Laplace approximation

2. Under-constrained model parameters

→ Fix corner points
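One way to see why the parameters are under-constrained follows directly from the definition of β_v: only the element-wise product of τ and φ_v enters the word distribution, so rescaling τ by a constant and φ_v by its inverse leaves β_v unchanged. The sketch below checks this numerically. The slides do not say which invariance the authors had in mind, so treat this as an illustrative assumption; the values and the helper name `beta` are hypothetical.

```python
import math

def beta(tau, phi_v):
    """Word distribution with beta^w proportional to exp(tau^w * phi_v^w)."""
    scores = [t * p for t, p in zip(tau, phi_v)]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

tau = [1.0, 0.5, 0.2]
phi_v = [1.0, 2.0, 0.5]

# Rescale the two weight vectors in opposite directions.
c = 4.0
tau_scaled = [c * t for t in tau]
phi_scaled = [p / c for p in phi_v]

# The likelihood cannot tell the two parameter settings apart.
same = all(
    math.isclose(a, b)
    for a, b in zip(beta(tau, phi_v), beta(tau_scaled, phi_scaled))
)
```

Pinning some of the weights to fixed values removes this kind of freedom, which is consistent with the remedy named on the slide.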

Generalized Mean Fields inference

[Figure: plate diagram of the jTP model]

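Generalized Mean Fields inference
Variational E step

\begin{align*}
q_\tau(\tau) &= P(\tau \mid \{W_{d,n}\}, \{P_d\}, \{\langle\phi_v\rangle\}; \Theta) \\
&\propto \mathcal{N}(\tau \mid \mu_\tau, \Sigma_\tau)\, \mathrm{Multinomial}(\{W_{d,n}\} \mid \{P_d\}, \tau, \{\langle\phi_v\rangle\}) \\
&\approx \mathcal{N}(\tau \mid \mu^*, \Sigma^*)
\end{align*}

\[ \Sigma^* = \Big( \Sigma_\tau^{-1} + \sum_v n_v^T 1 \, \langle\phi_v\rangle \downarrow H(\hat\tau \bullet \langle\phi_v\rangle) \rightarrow \langle\phi_v\rangle \Big)^{-1} \]

\[ \mu^* = \Sigma^* \Big( \Sigma_\tau^{-1} \mu_\tau + \sum_v n_v \bullet \langle\phi_v\rangle - \sum_v n_v^T 1 \, \nabla C(\hat\tau \bullet \langle\phi_v\rangle) \bullet \langle\phi_v\rangle + \sum_v n_v^T 1 \, \langle\phi_v\rangle \bullet \big( H(\hat\tau \bullet \langle\phi_v\rangle)(\hat\tau \bullet \langle\phi_v\rangle) \big) \Big) \]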
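The update above depends on a gradient ∇C and a Hessian H. The slides do not define C, so the sketch below assumes it is the log-partition function of the multinomial in its natural parameterization, C(η) = log Σ_w exp(η_w). Under that assumption the gradient is the softmax of η and the Hessian is diag(p) − p pᵀ. The function names are hypothetical, and the finite-difference helper is only there to sanity-check the analytic gradient.

```python
import math

def log_partition(eta):
    """C(eta) = log sum_w exp(eta_w), computed stably."""
    m = max(eta)
    return m + math.log(sum(math.exp(e - m) for e in eta))

def grad_log_partition(eta):
    """Gradient of C: the softmax probabilities p_w = exp(eta_w - C(eta))."""
    c = log_partition(eta)
    return [math.exp(e - c) for e in eta]

def hess_log_partition(eta):
    """Hessian of C: diag(p) - p p^T."""
    p = grad_log_partition(eta)
    n = len(p)
    return [[(p[i] if i == j else 0.0) - p[i] * p[j] for j in range(n)]
            for i in range(n)]

def finite_diff_grad(eta, h=1e-6):
    """Central-difference estimate of the gradient, for checking only."""
    grad = []
    for i in range(len(eta)):
        up = list(eta)
        up[i] += h
        down = list(eta)
        down[i] -= h
        grad.append((log_partition(up) - log_partition(down)) / (2 * h))
    return grad

# Hypothetical natural parameters, e.g. an element-wise product tau * phi_v.
eta = [1.0, 1.0, 0.1]
analytic = grad_log_partition(eta)
numeric = finite_diff_grad(eta)
hessian = hess_log_partition(eta)
```

Each row of the Hessian sums to zero because the probabilities sum to one, which is a quick check that the matrix was built correctly.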

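Generalized Mean Fields inference
Variational E step

\begin{align*}
q_{\phi_v}(\phi_v) &= P(\phi_v \mid \{W_{d,n}\}, \{P_d\}, \langle\tau\rangle; \Theta) \\
&\propto \mathcal{N}(\phi_v \mid \mu_\phi, \Sigma_\phi)\, \mathrm{Multinomial}(\{W_{d,n}\} \mid \{P_d\}, \{\phi_v\}, \langle\tau\rangle) \\
&\approx \mathcal{N}(\phi_v \mid \mu^\dagger, \Sigma^\dagger)
\end{align*}

\[ \Sigma^\dagger = \Big( \Sigma_\phi^{-1} + n_v^T 1 \, \langle\tau\rangle \downarrow H(\langle\tau\rangle \bullet \hat\phi_v) \rightarrow \langle\tau\rangle \Big)^{-1} \]

\[ \mu^\dagger = \Sigma^\dagger \Big( \Sigma_\phi^{-1} \mu_\phi + n_v \bullet \langle\tau\rangle - n_v^T 1 \, \nabla C(\langle\tau\rangle \bullet \hat\phi_v) \bullet \langle\tau\rangle + \cdots \Big) \]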
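Both E-step updates replace a non-conjugate posterior with a Gaussian centered at its mode. The one-dimensional sketch below shows that idea on a simpler, stand-in problem: a Gaussian prior on a logit x combined with a binomial likelihood (k successes in n trials). It is not the jTP update itself; the function names and the numbers are hypothetical, and Newton's method is used here as one convenient way to find the mode.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def log_posterior_grad(x, k, n, prior_mean, prior_var):
    """Derivative of log N(x | prior_mean, prior_var) + k*x - n*log(1 + e^x)."""
    return -(x - prior_mean) / prior_var + k - n * sigmoid(x)

def log_posterior_hess(x, n, prior_var):
    """Second derivative of the same log posterior (always negative)."""
    s = sigmoid(x)
    return -1.0 / prior_var - n * s * (1.0 - s)

def laplace_approximation(k, n, prior_mean, prior_var, iters=50):
    """Return (mean, variance) of the Gaussian fitted at the posterior mode."""
    x = prior_mean
    for _ in range(iters):
        # Newton step toward the mode of the concave log posterior.
        x -= (log_posterior_grad(x, k, n, prior_mean, prior_var)
              / log_posterior_hess(x, n, prior_var))
    # Laplace variance is the negative inverse curvature at the mode.
    variance = -1.0 / log_posterior_hess(x, n, prior_var)
    return x, variance

# Hypothetical data: 7 successes in 10 trials, standard normal prior.
mode, variance = laplace_approximation(k=7, n=10, prior_mean=0.0, prior_var=1.0)
```

The fitted mean lies between the prior mean and the maximum-likelihood logit, and the fitted variance is smaller than the prior variance because the data add curvature.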

Generalized Mean Fields inference
Variational M step

[Figure: plate diagram of the jTP model]

Outline

• Goal: Model ideological discourse
• Joint Topic and Perspective Model
• Evaluation
  • Synthetic and real data
  • Uncovered topical and ideological weights
  • Predict unseen data
• Conclusions

Experimental Data

• Synthetic data for verifying the inference algorithm
• Two real data sets
  • Editorials published on http://bitterlemons.org
    • Israeli vs. Palestinian
    • 594 documents (302 vs. 292), 462,308 words
  • 2000-04 US presidential debate speech transcripts
    • Democratic vs. Republican
    • 1232 documents (214 vs. 235), 122,056 words

Synthetic Data

[Figure: maximal absolute difference (y-axis, 0.0 to 0.4) versus number of training examples (x-axis, 0 to 1000); a second panel is labeled w1, w2, w3]

Reduce perplexity

[Figure: two panels, Israeli vs. Palestinian and Democratic vs. Republican]
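Perplexity on unseen documents can be computed as the exponentiated negative average log-likelihood per word. The sketch below uses that standard definition with a single fixed word distribution; the slides do not show how held-out likelihood was evaluated under jTP, so the function `perplexity` and the toy numbers are illustrative assumptions only.

```python
import math

def perplexity(docs, beta):
    """exp(-(total log-likelihood) / (total word count)) under word distribution beta."""
    log_lik = sum(math.log(beta[w]) for doc in docs for w in doc)
    n_words = sum(len(doc) for doc in docs)
    return math.exp(-log_lik / n_words)

# Hypothetical held-out document over a 4-word vocabulary (word ids 0..3).
held_out = [[0, 0, 0, 1]]

# A uniform distribution has perplexity equal to the vocabulary size.
uniform = [0.25, 0.25, 0.25, 0.25]
ppl_uniform = perplexity(held_out, uniform)

# A distribution that puts more mass on the frequent word predicts better.
peaked = [0.7, 0.1, 0.1, 0.1]
ppl_peaked = perplexity(held_out, peaked)
```

Lower perplexity means the model assigns higher probability to the unseen words, which is the sense in which a model can be said to predict real data better.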

Encode emphatic patterns into β structure

Conclusions

• New challenge: identifying ideological perspectives
• Emphatic patterns in ideological discourse
• New model: Joint topic and perspective model
  • Inference algorithms using variational methods
  • Predict real data better than view-ignorant models
• Automatic discovery of topical and ideological weights of words
• Emphatic patterns found in user-generated tags, visual content, etc.
