Conditional Marginalization Simulation Example

Conditional Marginalization for Exponential Random Graph Models

Tom A.B. Snijders

University of Oxford, University of Groningen

April 2010

© Tom A.B. Snijders

Conditional Marginalization ERGMs

Exponential Random Graph Model (ERGM) defined by

\[
P_\theta\{Y = y\} = \exp\bigl(\theta' u(y) - \psi(\theta)\bigr), \qquad y \in \mathcal{Y}(N),
\]

where $\mathcal{Y}(N)$ is the set of all graphs on a node set $N$ (Frank & Strauss 1986; Pattison & Wasserman 1996; Snijders, Pattison, Robins & Wasserman 2006).

The induced subgraph on a subset of nodes $N_1 \subset N$ does not in general have an ERG distribution (Frank & Strauss 1986).

... Is this serious ... ?
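To make the definition concrete, here is a minimal sketch that evaluates the ERGM probabilities by brute-force enumeration on a very small node set. The choice of statistics (edges and triangles), the parameter values and all function names are assumptions made for illustration; they are not taken from the slides.

```python
import itertools
import math


def dyads(n):
    """All unordered node pairs (i, j) with i < j."""
    return list(itertools.combinations(range(n), 2))


def stats(edges, n):
    """Illustrative statistic u(y) = (number of edges, number of triangles)."""
    e = set(edges)
    n_tri = sum(
        1
        for a, b, c in itertools.combinations(range(n), 3)
        if (a, b) in e and (a, c) in e and (b, c) in e
    )
    return (len(e), n_tri)


def ergm_distribution(theta, n):
    """Return ({graph: probability}, psi) over all graphs on n nodes."""
    pairs = dyads(n)
    weights = {}
    for k in range(len(pairs) + 1):
        for edges in itertools.combinations(pairs, k):
            u = stats(edges, n)
            weights[edges] = math.exp(sum(t * s for t, s in zip(theta, u)))
    psi = math.log(sum(weights.values()))  # normalizing constant psi(theta)
    return {g: w / math.exp(psi) for g, w in weights.items()}, psi


theta = (-1.0, 0.5)  # hypothetical values (edges, triangles)
probs, psi = ergm_distribution(theta, n=4)
print(len(probs), sum(probs.values()))
```

Enumeration is feasible only for a handful of nodes (the number of graphs is 2 to the power n(n-1)/2), which is why simulation methods are used in practice.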

In ‘regular’ statistical models with i.i.d. observations, omitting a random subset of the data is only a loss of information, not an assault on the model’s validity: the smaller data set is still an i.i.d. sample from the same distribution – the model marginalizes straightforwardly. Therefore, the same type of statistical analysis is still applicable.

For ERGMs this is not the case (trivial specifications excepted), which may be regarded as a quirk of the model. Taking nodes out of the graph is an amputation.

But what can we say about marginal distributions of subgraphs?

When does the ERGM marginalize?

When does the induced subgraph on a subset N1 ⊂ N still have an ERG distribution?

Under the condition that the graph on N1 is disconnected from the rest.

Theorem. If $N = N_1 \cup N_2$, then under the condition that there are no connections between $N_1$ and $N_2$, the induced graphs $Y_1|N_1$ and $Y_2|N_2$ are mutually independent, both having the ERG distribution

\[
P_\theta\{Y_h|N_h = y_h\} = \exp\bigl(\theta' u(\tilde{y}_h) - \psi_h(\theta)\bigr),
\]

where $\tilde{y}_h$ is the graph on $N$ obtained from $y$ by deleting all edges outside $N_h$.

This generalizes to multiple disconnected subgraphs, and to more restrictive conditions (implying disconnection).
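The theorem can be checked numerically on a toy case. The sketch below assumes five nodes split into N1 = {0, 1, 2} and N2 = {3, 4}, an edge / two-star / triangle specification, and arbitrary parameter values; none of these choices come from the slides. It conditions on the absence of edges between N1 and N2 and compares the joint conditional distribution with the product of the two component ERGMs.

```python
import itertools
import math

N = range(5)
N1, N2 = (0, 1, 2), (3, 4)          # hypothetical partition of the node set
PAIRS = list(itertools.combinations(N, 2))
THETA = (-0.5, 0.2, 0.8)            # hypothetical (edges, two-stars, triangles)


def u(edges):
    """Edge, two-star and triangle counts of a graph on the full node set."""
    e = set(edges)
    deg = {i: 0 for i in N}
    for a, b in e:
        deg[a] += 1
        deg[b] += 1
    two_stars = sum(d * (d - 1) // 2 for d in deg.values())
    triangles = sum(
        1
        for a, b, c in itertools.combinations(N, 3)
        if (a, b) in e and (a, c) in e and (b, c) in e
    )
    return (len(e), two_stars, triangles)


def weight(edges):
    """Unnormalized ERGM weight exp(theta' u(y))."""
    return math.exp(sum(t * s for t, s in zip(THETA, u(edges))))


def within(nodes):
    """Dyads with both end points inside the given node subset."""
    return [p for p in PAIRS if p[0] in nodes and p[1] in nodes]


def subsets(pairs):
    """All edge sets that can be formed from the given dyads."""
    for k in range(len(pairs) + 1):
        yield from itertools.combinations(pairs, k)


# Joint distribution of (y1, y2), conditional on no edges between N1 and N2.
joint = {
    (y1, y2): weight(y1 + y2)
    for y1 in subsets(within(N1))
    for y2 in subsets(within(N2))
}
total = sum(joint.values())
joint = {k: v / total for k, v in joint.items()}


def component_ergm(nodes):
    """ERGM with the same theta, restricted to edges inside one component."""
    w = {y: weight(y) for y in subsets(within(nodes))}
    z = sum(w.values())
    return {y: v / z for y, v in w.items()}


p1, p2 = component_ergm(N1), component_ergm(N2)
max_gap = max(abs(joint[y1, y2] - p1[y1] * p2[y2]) for (y1, y2) in joint)
print(max_gap)
```

The factorization works here because each of the three statistics is a sum of contributions from the separate components once no edges run between them.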

But there is a condition: exclude action at a distance.

Definition. The function $u(y)$ is component separable if, for any partition $N = \bigcup_{h=1}^{H} N_h$ with $N_h \cap N_k = \emptyset$ for $h \neq k$, and any graph $y$ that has no edges between $N_h$ and $N_k$ for any $h \neq k$, $u(y)$ can be written as

\[
u(y) = \sum_{h=1}^{H} u(\tilde{y}_h) + u_d ,
\]

where $\tilde{y}_h$ is the graph on $N$ obtained from $y$ by deleting all edges outside $N_h$.

For ERGMs, component separability is equivalent to disconnected induced subgraphs being independent.

Examples of ERGMs that are not component separable:

1.
\[
u(y) = \sqrt{\sum_{ij} y_{ij}}
\]
Nonlinear functions of subgraph counts.

2.
\[
u(y) = \frac{1}{8} \sum_{i,j,h,k \,:\, \{i,j\} \cap \{h,k\} = \emptyset} y_{ij}\, y_{hk} ,
\]
the count of subgraphs on four points, composed of only two disconnected edges.

[Figure: four nodes labelled i, j, h, k.]
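The two counterexamples can be contrasted with the plain edge count in a small numerical check. The slide does not define the extra term u_d; the sketch below assumes it may not depend on the edges of y, so that a statistic fails the definition when the remainder u(y) minus the sum of the u(ỹ_h) changes from one graph to another on the same partition. The graphs, the partition and the function names are hypothetical.

```python
import itertools
import math


def edge_count(edges):
    """Number of edges: a linear subgraph count."""
    return float(len(edges))


def sqrt_edge_count(edges):
    """Square root of the edge count: a nonlinear function of a count."""
    return math.sqrt(len(edges))


def disjoint_edge_pairs(edges):
    """Number of unordered pairs of edges that share no node."""
    return float(
        sum(1 for e, f in itertools.combinations(edges, 2) if not set(e) & set(f))
    )


def remainder(u, edges, parts):
    """u(y) minus the sum of u over the graphs restricted to each part."""
    restricted = [[e for e in edges if e[0] in p and e[1] in p] for p in parts]
    return u(edges) - sum(u(r) for r in restricted)


# Two graphs on the same partition, neither with edges between the parts.
parts = [{0, 1, 2}, {3, 4, 5}]
graph_a = [(0, 1), (0, 2), (1, 2), (3, 4), (4, 5)]
graph_b = [(0, 1), (3, 4)]

for name, u in [
    ("edges", edge_count),
    ("sqrt(edges)", sqrt_edge_count),
    ("disjoint edge pairs", disjoint_edge_pairs),
]:
    print(name, remainder(u, graph_a, parts), remainder(u, graph_b, parts))
```

For the edge count the remainder is zero for both graphs, whereas for the other two statistics it depends on what happens in both components at once, which is the "action at a distance" the slide warns about.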

An aside ...

In specifications of ERGMs, use only statistics that are linear combinations of subgraph counts (with weights depending on covariates) unless you are willing to permit action at a distance.

Theorem. For a component separable ERGM with sufficient statistic $u(y)$, and $N_1, \ldots, N_H$ a partition of the node set, let $A_0$ be the event that the subsets $N_h$ are all mutually disconnected, and $A_h$ an event referring only to the induced graph $Y|N_h$. Then conditional on $A_0 \cap A_1 \cap \ldots \cap A_H$, the subgraphs $Y|N_h$ for $h = 1, \ldots, H$ are independent, and

\[
P_\theta\{Y|N_h = y_h\} =
\begin{cases}
\exp\bigl(\theta' u(\tilde{y}_h) - \psi_h(\theta)\bigr) & \text{if } y_h \text{ satisfies } A_h, \\
0 & \text{otherwise.}
\end{cases}
\]

The theorem (in its extended form) has a number of corollaries. Suppose that Y follows an ERGM.

1. The graph on a subset of nodes, disconnected from the rest, follows again an ERGM with the "same" specification.

2. A connected graph Y1 on a subset of nodes, disconnected from the rest, follows again an ERGM with the "same" specification, under the additional condition of Y1 being connected.

3. Suppose we begin a snowball sample with node set B. The saturated snowball generated by B follows an ERGM with the same specification, under the additional assumption that all nodes are reachable from B. (Cf. Doreian and Woodard, 1994, for network delineation by a snowball sample.)

4. The graph without its isolates again follows an ERGM with the same specification, under the additional restriction that there are no isolates.

5. The small components:

(Add Health Study, Moody et al.)

The small connected components again follow the same ERGM distribution, under the condition of being small connected components, and here only the low-order subgraph counts play a role.

For example, consider the disconnected triads.

Corollary. Let $N_0$ be the number of isolated 3-node connected subgraphs; these are isolated two-paths or isolated triangles. In the original ERGM denote the coefficient of the number of edges by $\theta_E$, the coefficient of the number of two-stars by $\theta_{S2}$, and the coefficient of the number of triangles by $\theta_T$. Then, conditional on $N_0$, the number of isolated triangles has a binomial distribution with binomial denominator $N_0$ and probability parameter

\[
\frac{\exp(\theta_E + 2\theta_{S2} + \theta_T)}{1 + \exp(\theta_E + 2\theta_{S2} + \theta_T)} .
\]
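The linear combination in the exponent can be traced to the difference in subgraph counts between a triangle and a two-path on the same three nodes. The short calculation below only accounts for that exponent; it assumes that no other statistic in the model distinguishes the two configurations, and it does not re-derive the denominator stated in the corollary.

```latex
% Counts (edges, two-stars, triangles) on three nodes:
%   two-path:  (2, 1, 0)
%   triangle:  (3, 3, 1)
\[
u(\text{triangle}) - u(\text{two-path}) = (3, 3, 1) - (2, 1, 0) = (1, 2, 1),
\]
\[
\theta' \bigl( u(\text{triangle}) - u(\text{two-path}) \bigr)
  = \theta_E + 2\,\theta_{S2} + \theta_T .
\]
```

Closing a two-path into a triangle therefore adds one edge, two two-stars and one triangle, which is where the coefficients 1, 2 and 1 come from.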

How this works in practice was studied in a simulation study.

Simulation design: non-directed network with n = 100 nodes.

Parameters:
edges              –3.5
alt. k-stars        0.2
alt. 2-paths       –0.4
alt. k-triangles    1.5

This yields average degrees ∼ 2.1.
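As a rough orientation on what these numbers mean, the reported average degree can be converted into an expected edge count and density, and compared with a pure Bernoulli graph that uses only the edge parameter. The comparison model is an assumption introduced here for illustration; it is not part of the simulation design.

```python
import math

n = 100            # number of nodes (from the simulation design)
avg_degree = 2.1   # reported average degree
theta_edges = -3.5 # edge parameter (from the simulation design)

# Each edge contributes to two degrees.
expected_edges = n * avg_degree / 2
# Density: expected edges divided by the number of dyads n(n-1)/2.
density = avg_degree / (n - 1)

# Hypothetical comparison: a Bernoulli graph with only the edge parameter.
bernoulli_tie_prob = 1.0 / (1.0 + math.exp(-theta_edges))
bernoulli_avg_degree = (n - 1) * bernoulli_tie_prob

print(expected_edges, density, bernoulli_avg_degree)
```

The edge-only model would give an average degree of about 2.9, so under this reading the remaining three terms jointly lower the density to the reported level of about 2.1.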

200 replications. For each replication, one network was generated; the parameters were estimated for this network, and for the largest connected component (‘saturated snowball’), if larger than 50, under the condition of connectedness. Extra requirements for Metropolis-Hastings steps.

Expectations:

1. Parameter estimates approximately unbiased in either case.

2. Larger standard errors for saturated snowball.

3. Larger standard errors for smaller saturated snowballs.

4. Type-I error rates approximately correct.
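The "extra requirements for Metropolis-Hastings steps" mentioned above concern sampling under the condition of connectedness. The slides give no algorithmic detail, so the following is only a sketch of one way to do it: a symmetric edge-toggle proposal that is rejected outright whenever it would disconnect the graph. The edge and triangle statistics, the parameter values and the function names are assumptions made for illustration, and this is not the estimation software used in the study.

```python
import itertools
import math
import random


def is_connected(n, edges):
    """Depth-first search from node 0; True if all n nodes are reached."""
    adj = {i: set() for i in range(n)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, stack = {0}, [0]
    while stack:
        v = stack.pop()
        for w in adj[v] - seen:
            seen.add(w)
            stack.append(w)
    return len(seen) == n


def stats(n, edges):
    """Illustrative statistic: (number of edges, number of triangles)."""
    tri = sum(
        1
        for a, b, c in itertools.combinations(range(n), 3)
        if (a, b) in edges and (a, c) in edges and (b, c) in edges
    )
    return (len(edges), tri)


def mh_connected(n, theta, start, steps, rng):
    """Metropolis-Hastings on the set of connected graphs on n nodes."""
    edges = set(start)
    if not is_connected(n, edges):
        raise ValueError("the starting graph must be connected")
    pairs = list(itertools.combinations(range(n), 2))
    current = stats(n, edges)
    for _ in range(steps):
        pair = rng.choice(pairs)
        proposal = edges ^ {pair}          # toggle one dyad
        if not is_connected(n, proposal):
            continue                       # extra condition: stay where we are
        new = stats(n, proposal)
        log_ratio = sum(t * (b - a) for t, a, b in zip(theta, current, new))
        if rng.random() < math.exp(min(0.0, log_ratio)):
            edges, current = proposal, new
    return edges


path = [(i, i + 1) for i in range(5)]      # connected starting graph
sample = mh_connected(6, (-1.0, 0.5), path, 2000, random.Random(1))
print(is_connected(6, sample), len(sample))
```

Because the toggle proposal is symmetric, rejecting every proposal that leaves the set of connected graphs keeps the chain targeting the ERGM restricted to that set; recomputing the statistics from scratch at each step is affordable only for small graphs.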

Of the 400 estimations, 28 had t-ratios for convergence > 0.15. These were discarded.

[Figure: distribution of sizes of giant components.]

Parameter estimates: degree; with fitted means.

[Figure annotations: edge parameter = –3.5; downward bias.]

Parameter estimates: alternating k-stars; with fitted means.

[Figure annotation: alt. k-star parameter = 0.2.]

Parameter estimates: alternating two-paths; with fitted means.

[Figure annotation: alt. two-paths parameter = –0.4.]

Parameter estimates: alternating k-triangles; with fitted means.

[Figure annotations: alt. k-triangle parameter = 1.5; upward bias for snowballs.]

Mean absolute errors: degree; with fitted means.

Mean absolute errors: alternating k-stars; with fitted means.

Mean absolute errors: alternating two-paths; with fitted means.

Mean absolute errors: alternating k-triangles; with fitted means.

Conclusions from simulations

Estimates close to unbiased for total and snowball data; slight biases for degree and alternating k-triangle parameters.

Standard errors slightly higher for snowballs, but mainly for degree and alternating k-triangle parameters.

Standard errors for snowballs not clearly dependent on the size of the largest component.

Empirical type-I error rates for tests of true hypotheses ranged from 0.02 to 0.04.

Therefore, this small simulation study is supportive of the theoretical results.

Conclusion

Support for theoretical consistency of the ERGM.

Marginalization & independence for mutually disconnected subgraphs is intuitive, once you think of it.

To avoid action at a distance: restrict statistics to subgraph counts (with attribute weights).

Network delineation by saturated snowball sample OK.

Extra conditions must be imposed in Metropolis-Hastings steps for drawing from ERGMs.

Possibilities for homogeneity testing in ERGMs: are parameters the same in different components?
