Supplemental materials for "Decomposed Normalized Maximum Likelihood Codelength Criterion for Selecting Hierarchical Latent Variable Models"

Tianyi Wu∗, Sugawara Shinya† and Kenji Yamanishi‡

1 Proofs

1.1 Proof Sketch of Theorem 3.3

We begin by deriving $L_{\mathrm{NML}}(x^n \mid z^n; M)$. The maximum of the likelihood function $P(x^n \mid z^n; \hat{\Phi}(x^n, z^n), M)$ can be written as

$$P(x^n \mid z^n; \hat{\Phi}(x^n, z^n), M) = \prod_{k=1}^{K} \prod_{v=1}^{V} \left( \frac{n_{kv}}{n_k} \right)^{n_{kv}}.$$
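The product formula above is easiest to evaluate in log form. The following is a minimal sketch, assuming that the counts $n_{kv}$ are given as a K x V table and that $n_k$ is the corresponding row sum; the function name `max_log_likelihood` and the example counts are illustrative and do not come from the paper.

```python
import math


def max_log_likelihood(counts):
    """Return log prod_k prod_v (n_kv / n_k)^{n_kv} for a K x V count table.

    Assumption: n_k is the sum of row k of the table.
    """
    total = 0.0
    for row in counts:
        n_k = sum(row)
        for n_kv in row:
            if n_kv > 0:  # a zero count contributes (.)^0 = 1, i.e. 0 in log form
                total += n_kv * math.log(n_kv / n_k)
    return total


# Hypothetical counts for K = 2 components and V = 2 symbols.
example_counts = [[3, 1], [0, 4]]
print(max_log_likelihood(example_counts))
```

Natural logarithms are used here; switching to base 2 only rescales the result by a constant factor.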

Following a computation similar to that for NB, we obtain the first two terms in the main equation of Theorem 3.3. Next, we consider $L_{\mathrm{NML}}(z^n; M)$. Because each document in LDA has its own mixture of topics, $P(z^n; \Theta)$ can be decomposed into $\prod_d P(z_d^n; \theta_d)$, where $z_d^n$ is the part of $z^n$ allocated to document $d$. Under this decomposition, $P(z_d^n; \theta_d)$ for each $d$ is a finite mixture model. The NML codelength is therefore $\sum_d L_{\mathrm{NML}}(z_d^n; M)$, which gives the last two terms in the main equation of Theorem 3.3.

∗ Corresponding author. University of Tokyo, 7-3-1 Hongo, Bunkyo, Tokyo 113-0033, Japan. Email: tianyi [email protected]
† University of Tokyo, 7-3-1 Hongo, Bunkyo, Tokyo 113-0033, Japan. Email: [email protected]
‡ University of Tokyo, 7-3-1 Hongo, Bunkyo, Tokyo 113-0033, Japan. Email: [email protected]


1.2 Proof Sketch for Theorem 3.4

We begin by deriving the expression for $L_{\mathrm{NML}}(x^n \mid z^n; M)$. When the latent variables $z$ are given, the conditional distribution $P(x \mid z; \theta)$ is the same as that of SBM. The conditional maximum likelihood $P(x^n \mid z^n; \hat{\eta})$ is therefore the same as in Theorem 3.2, and the resulting codelength is

$$\sum_{k_1} \sum_{k_2} \left( n_{k_1 k_2} \log n_{k_1 k_2} - n^{1}_{k_1 k_2} \log n^{1}_{k_1 k_2} - n^{0}_{k_1 k_2} \log n^{0}_{k_1 k_2} \right) + \sum_{k_1} \sum_{k_2} \log C_{\mathrm{MN}}(n_{k_1 k_2}, 2).$$
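The normalizer $C_{\mathrm{MN}}$ in the expression above can be computed in time linear in $n$ using the recurrence of Kontkanen and Myllymäki [4], $C_{\mathrm{MN}}(n, K+2) = C_{\mathrm{MN}}(n, K+1) + (n/K)\, C_{\mathrm{MN}}(n, K)$. The sketch below is one possible implementation, not the authors' code. It assumes natural logarithms and assumes that each block total is the sum of its link and non-link counts; the names `multinomial_nml_normalizer`, `sbm_data_codelength` and the dictionary layout of `block_counts` are hypothetical.

```python
import math


def _xlogy(x, y):
    """x * log(y) with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def multinomial_nml_normalizer(n, K):
    """C_MN(n, K): sum of maximized likelihoods over all length-n sequences on K symbols."""
    if n == 0 or K == 1:
        return 1.0
    # K = 2: direct sum over the count h of the first symbol, evaluated in log space.
    c_two = 0.0
    for h in range(n + 1):
        log_binom = math.lgamma(n + 1) - math.lgamma(h + 1) - math.lgamma(n - h + 1)
        c_two += math.exp(log_binom + _xlogy(h, h / n) + _xlogy(n - h, (n - h) / n))
    c_prev, c_curr = 1.0, c_two  # C_MN(n, 1), C_MN(n, 2)
    # Recurrence from [4]: C_MN(n, k + 2) = C_MN(n, k + 1) + (n / k) * C_MN(n, k).
    for k in range(1, K - 1):
        c_prev, c_curr = c_curr, c_curr + (n / k) * c_prev
    return c_curr


def sbm_data_codelength(block_counts):
    """Sum the block-wise terms of the expression above.

    block_counts maps (k1, k2) -> (n1, n0), the numbers of links and
    non-links in that block (assumed layout; the block total is n1 + n0).
    """
    total = 0.0
    for n1, n0 in block_counts.values():
        n = n1 + n0
        total += _xlogy(n, n) - _xlogy(n1, n1) - _xlogy(n0, n0)
        total += math.log(multinomial_nml_normalizer(n, 2))
    return total


# One block with one link and one non-link: 2 log 2 + log 2.5 = log 10.
print(sbm_data_codelength({(0, 0): (1, 1)}))
```

A quick sanity check on the recurrence is that $C_{\mathrm{MN}}(1, K) = K$, since each of the $K$ possible single observations has maximized likelihood 1.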

Next, we consider $L_{\mathrm{NML}}(z^n; M)$. MMSBM is a variant of LDA: documents in LDA correspond to vertices, and the words in document $d$ correspond to the links and non-links that begin from vertex $i$. We can therefore substitute $n_i$ for $n_d$ in $L_{\mathrm{NML}}(z^n; M)$ for LDA. Using the result of Theorem 3.3, we obtain $L_{\mathrm{NML}}(z^n; M)$ for MMSBM as follows:

$$\sum_{i} \sum_{k} n_{ik} \left( \log n_i - \log n_{ik} \right) + \sum_{i} \log C_{\mathrm{MN}}(n_i, K).$$

2 Detailed Designs of Experiments

2.1 Experiment using the NB Model

For the experiments using the NB model, we generated multiple datasets with different hyper-parameters so that the results do not depend on a single setting. NB has hyper-parameters $\alpha$, $\beta$ and $M$, where $\pi \sim \mathrm{Dir}(\alpha)$, $\phi_k \sim \mathrm{Dir}(\beta)$ and $M = (M_1, \ldots, M_D)$. We generated eight datasets for each combination of hyper-parameters from the following candidates: $\alpha \in \{2, 3, 4, 5\}$, $\beta \in \{0.05, 0.15, 0.3, 0.5\}$, $M \in \{(16, 16, 16), (4, 4, 4, 4), (12, 12, 12, 12), (8, 8, 8, 8, 8), (6, 6, 6, 6, 6), (4, 4, 4, 4, 4, 4)\}$ and $n \in \{10, 37, 138, 268, 1000\}$. As a result, we generated $4 \times 4 \times 6 \times 5 \times 8 = 3840$ simulation datasets in total.
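The size of this design can be checked by enumerating the grid. The sketch below reads the first candidate for the feature-cardinality vector as (16, 16, 16); the variable names are illustrative only.

```python
from itertools import product

# Candidate values as listed in the text above.
alphas = [2, 3, 4, 5]
betas = [0.05, 0.15, 0.3, 0.5]
cardinalities = [
    (16, 16, 16),
    (4, 4, 4, 4),
    (12, 12, 12, 12),
    (8, 8, 8, 8, 8),
    (6, 6, 6, 6, 6),
    (4, 4, 4, 4, 4, 4),
]
sample_sizes = [10, 37, 138, 268, 1000]
datasets_per_setting = 8

# Every combination of hyper-parameters and sample size is one setting.
settings = list(product(alphas, betas, cardinalities, sample_sizes))
n_datasets = len(settings) * datasets_per_setting
print(len(settings), n_datasets)
```

The same enumeration applies to any full-factorial design of this kind by swapping in the corresponding candidate lists.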

2.2 Experiment using SBM

The hyper-parameters for SBM are $\alpha$, $\beta$ and $\rho$, where $\pi \sim \mathrm{Dir}(\alpha)$ and $\eta_{k_1 k_2} \sim \mathrm{Ber}(\beta)$. We generated eight synthetic datasets for each combination of hyper-parameters from the following candidates: $\alpha \in \{1, 4\}$, $\beta \in \{0.1, 0.3, 0.6, 1, 3\}$, $\rho \in \{1.0, 0.75\}$ and $n \in \{7, 12, 19, 29, 45, 70, 108, 167, 258, 400\}$. As a result, we obtained $2 \times 5 \times 2 \times 10 \times 8 = 1600$ datasets in total. The best model was selected from candidates with 1 to 10 latent components.


2.3 Experiment using LDA

The hyper-parameters in LDA are $\alpha$, $\beta$ and $V$, where $\theta_d \sim \mathrm{Dir}(\alpha)$, $\phi_k \sim \mathrm{Dir}(\beta)$ and $V$ is the number of unique words. We generated eight synthetic datasets for each combination of hyper-parameters from the following candidates: $\alpha \in \{0.1, 0.2, 0.25, 0.3, 0.35, 0.4\}$, $\beta \in \{0.1, 0.2, 0.25, 0.3, 0.35, 0.4\}$, $V \in \{200, 400, 600\}$ and $n \in \{5, 7, 12, 19, 30, 48, 76, 120, 190, 300\}$. As a result, we obtained $6 \times 6 \times 3 \times 10 \times 8 = 8640$ datasets in total. The best model was selected from candidates with 1 to 10 latent components.
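For readers who want to reproduce data of this kind, the standard LDA generative process with symmetric Dirichlet priors can be sketched as follows. This is not the authors' generator: the number of documents `n_docs`, the document length `doc_length`, the number of topics `K` and the seed are hypothetical parameters that the text above does not specify, and the code does not assume any particular meaning for $n$.

```python
import random


def sample_dirichlet(rng, concentration, dim):
    """Draw from a symmetric Dirichlet by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(dim)]
    total = sum(draws)
    return [d / total for d in draws]


def sample_categorical(rng, probs):
    """Draw one index with the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_lda_corpus(n_docs, doc_length, K, V, alpha, beta, seed=0):
    """Return a list of documents, each a list of (topic, word) index pairs."""
    rng = random.Random(seed)
    # phi_k ~ Dir(beta): one word distribution per topic.
    phi = [sample_dirichlet(rng, beta, V) for _ in range(K)]
    corpus = []
    for _ in range(n_docs):
        # theta_d ~ Dir(alpha): topic mixture of this document.
        theta = sample_dirichlet(rng, alpha, K)
        doc = []
        for _ in range(doc_length):
            z = sample_categorical(rng, theta)   # latent topic
            w = sample_categorical(rng, phi[z])  # observed word
            doc.append((z, w))
        corpus.append(doc)
    return corpus


# Hypothetical usage with one of the listed (alpha, beta, V) settings.
corpus = generate_lda_corpus(n_docs=5, doc_length=20, K=3, V=200, alpha=0.1, beta=0.1, seed=1)
print(len(corpus), len(corpus[0]))
```

Keeping the latent topic next to each word makes it straightforward to tabulate the counts that enter the codelength formulas.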

2.4 Real data experiments

In preprocessing the two datasets, following previous studies, we omitted terms whose term-frequency inverse document frequency (TFIDF) score was lower than 0.1 [3] and kept only terms that appeared in five or more documents [2]. Since a single label was assigned to each document, all words in the document shared this label. For the 20 newsgroups data, the categories for each dataset are listed in Table 1.

Table 1: 20 Newsgroups: Categories for 5 datasets

Labels  Categories
2       atheism, space
3       atheism, graphics, baseball
4       atheism, graphics, baseball, space
5       graphics, baseball, space, christian, guns
6       graphics, forsale, baseball, space, christian, guns
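The two vocabulary filters described at the start of this section can be sketched as below. The text does not say which TFIDF variant was used, so this sketch makes an explicit assumption: a term's score is its maximum over documents of (relative term frequency) x log(N / document frequency). The function name `filter_vocabulary` and its parameters are hypothetical.

```python
import math
from collections import Counter


def filter_vocabulary(docs, min_doc_freq=5, min_tfidf=0.1):
    """Drop terms that occur in too few documents or have a low TFIDF score.

    docs is a list of token lists. The TFIDF variant is an assumption:
    max over documents of (count / document length) * log(N / doc_freq).
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))  # count each term once per document

    best_score = {}
    for doc in docs:
        for term, count in Counter(doc).items():
            score = (count / len(doc)) * math.log(n_docs / doc_freq[term])
            best_score[term] = max(best_score.get(term, 0.0), score)

    keep = {
        term
        for term in doc_freq
        if doc_freq[term] >= min_doc_freq and best_score[term] >= min_tfidf
    }
    return [[term for term in doc if term in keep] for doc in docs]


# Hypothetical toy corpus: "common" is in every document (score 0),
# "single" is in only three documents, "topic" passes both filters.
toy_docs = [["common", "topic", "topic", "topic"]] * 5 + [["common", "single"]] * 3
print(filter_vocabulary(toy_docs))
```

With a different TFIDF definition the retained vocabulary can change, so the thresholds here should be read together with the stated assumption.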

References

[1] E. M. Airoldi, D. M. Blei, S. E. Fienberg, and E. P. Xing. Mixed membership stochastic blockmodels. Journal of Machine Learning Research, 9:1981–2014, 2008.
[2] D. M. Blei and J. D. Lafferty. Topic models. Text Mining: Classification, Clustering, and Applications, 10(71):34, 2009.
[3] T. L. Griffiths and M. Steyvers. Finding scientific topics. Proceedings of the National Academy of Sciences, 101:5228–5235, 2004.
[4] P. Kontkanen and P. Myllymäki. A linear-time algorithm for computing the multinomial stochastic complexity. Information Processing Letters, 103(6):227–233, 2007.
