Useful Derivations for i-Vector Based Approach to Data Clustering in Speech Recognition

Yu Zhang

December 16, 2011

Contents

1 Background
  1.1 Traditional maximum likelihood eigen-decomposition
  1.2 What's new in Kenny's paper?

2 The i-vector based approach to data clustering
  2.1 Data model
  2.2 Frontend i-vector extraction
      2.2.1 The i-vector solution
      2.2.2 Smoothed data model
      2.2.3 Solving i-vector
  2.3 Hyperparameter estimation
      2.3.1 Updating T
      2.3.2 Updating R
      2.3.3 Summarizing the EM algorithm

3 Implementation
  3.1 Training T and R
  3.2 Extracting ŵ(i)
  3.3 Computational complexity

4 Clustering of i-vectors using LBG algorithm

A The Matrix Codebook for this memo
  A.1 Trace
  A.2 Variance of quadratic form
  A.3 Others

B From a graphical model view
  B.1 Probabilistic PCA
      B.1.1 Maximum likelihood PCA


1 Background

1.1 Traditional maximum likelihood eigen-decomposition

Let M(s) denote the mean supervector for a speaker s, formulated as:

    M(s) = M_0 + T w,    (1)

and the likelihood function is written as:

    ∏_s max_w P_HMM(Y_s | M_0 + T w, R).    (2)

The optimization proceeds by iterating the following two steps:

1. For each training speaker s, use the current estimates of T and R to find the w which maximizes the HMM likelihood given the speaker's training data Y_s:

    w(s) = argmax_w P_HMM(Y_s | M_0 + T w, R)    (3)

2. Update T and R by maximizing:

    ∏_s P_HMM(Y_s | M_0 + T w(s), R).    (4)

1.2 What's new in Kenny's paper?

In Kenny's paper [3], w(i) is treated as a random vector with a standard normal distribution, so the estimation of T and R becomes maximizing the marginal likelihood function

    ∏_s ∫ P(Y_s | M_0 + T w, R) N(w | 0, I) dw,    (5)

where the product extends over all speakers in the training set. Calculating the posterior distribution of w(i) in the E-step, rather than the maximum likelihood estimate, is the key to avoiding the degeneracy problem that arises when Eq. (2) is used to estimate the eigenvoices in situations where speaker-dependent training is not feasible (e.g., when training data is sparse) or the number of eigenvoices is large compared with the number of training speakers. The algorithm can briefly be broken down into the following steps:

1. For each training speaker s, use the current alignment of the speaker's training data and the current estimates of T and R to carry out MAP speaker adaptation. Use the speaker-adapted model to realign the speaker's training data.

2. The E-step: for each speaker s, calculate the posterior distribution of w(i) using the current alignment of the speaker's training data, the current estimates of T and R, and the prior N(w | 0, I).

3. The M-step: update T and R by a linear regression in which the w(i)'s play the role of the explanatory variables.

2 The i-vector based approach to data clustering

In this memo we consider using the i-vector based approach for more generic data clustering problems in speech recognition. Possible applications include training data clustering for multiple acoustic model training [7] and acoustic sniffing in IVN-based training [5]. Hereafter, we therefore change the subscript s, representing the speaker, to the subscript i, representing each individual acoustic unit to be clustered. The granularity of an acoustic unit can be a single utterance, several utterances (e.g., one conversation side in Switchboard), or even more utterances from a pre-defined source (speaker, environment, channel, etc.).

2.1 Data model

Suppose we are given a set of training data denoted as Y = {Y^i | i = 1, 2, ..., I}, where Y^i = (y_1^i, y_2^i, ..., y_{T_i}^i) is a sequence of D-dimensional feature vectors extracted from the i-th acoustic unit (e.g., utterance). From Y, a Gaussian Mixture Model (GMM) can be trained using a Maximum Likelihood (ML) criterion to serve as a Universal Background Model (UBM). Let's use Ω = {c_k, m_k, R_k | k = 1, ..., K} to denote the set of UBM-GMM parameters, where the c_k's are the mixture component weights, and m_k and R_k are the D-dimensional mean and the D × D diagonal covariance matrix of the k-th mixture component. Given the data model Ω, the probability of each acoustic unit Y^i can be written as:

    p(Y^i | Ω) = ∏_{t=1}^{T_i} ∑_{k=1}^{K} c_k N(y_t^i; m_k, R_k).    (6)

Hereafter we denote by Ω^(0) the initial model parameters of the UBM.
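For concreteness, a minimal numpy sketch of evaluating Eq. (6) for one acoustic unit is given below; the function name and array layout are illustrative assumptions of this memo's notation, not part of any particular toolkit.

    import numpy as np

    def log_gmm_likelihood(Y_i, c, m, R):
        """log p(Y^i | Omega) as in Eq. (6).

        Y_i : (T_i, D) feature vectors of the i-th acoustic unit
        c   : (K,)     mixture weights
        m   : (K, D)   component means
        R   : (K, D)   diagonals of the (diagonal) component covariances
        """
        D = Y_i.shape[1]
        diff = Y_i[:, None, :] - m[None, :, :]                       # (T_i, K, D)
        log_norm = -0.5 * (D * np.log(2.0 * np.pi) + np.sum(np.log(R), axis=1))
        log_gauss = log_norm[None, :] - 0.5 * np.sum(diff ** 2 / R[None, :, :], axis=2)
        # per-frame log of the mixture sum; the product over frames becomes a sum of logs
        frame_ll = np.logaddexp.reduce(np.log(c)[None, :] + log_gauss, axis=1)
        return frame_ll.sum()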

2.2 Frontend i-vector extraction

2.2.1 The i-vector solution

Let M_0 denote the (D·K)-dimensional supervector obtained by concatenating the m_k's. Given an utterance Y^i, let's use another (D·K)-dimensional random supervector M(i) to characterize it independently of its linguistic content. In the i-vector based approach, M(i) is related to M_0 as:

    M(i) = M_0 + T w(i),    (7)

where T is a fixed but unknown (D·K) × F rectangular matrix of low rank (i.e., F ≪ D·K), and w(i) is an F-dimensional random vector having a prior distribution N(·; 0, I). A graphical model representation is shown in Fig. 1. In [2], T is called the total variability matrix and w(i) the i-vector. They are also comparable to the loading matrix and the factors in factor analysis.


Figure 1: A graphical model representation of i-vector approach.

Given Y^i, M(i), and the R_k's, the i-vector is the MAP solution of the following problem:

    ŵ(i) = argmax_{w(i)} p(w(i) | Y^i)
         = argmax_{w(i)} p(Y^i | w(i)) p(w(i)),    (8)

where

    p(Y^i | w(i)) = ∏_{t=1}^{T_i} ∑_{k=1}^{K} c_k N(y_t^i; M_k(i), R_k),    (9)

in which M_k(i) is the k-th D-dimensional subvector of M(i).

2.2.2 Smoothed data model

Eq. (9) is intractable because of the summation term, so a Viterbi approximation was imposed in [3]:

    p(Y^i | w(i)) ≈ ∏_{t=1}^{T_i} max_k N(y_t^i; M_k(i), R_k).    (10)

One can also define a "smoothed" version of p(Y^i | w(i)):

    p(Y^i | w(i)) ≈ ∏_{t=1}^{T_i} ∏_{k=1}^{K} N(y_t^i; M_k(i), R_k)^{p(k | y_t^i, Ω^(0))},    (11)

where

    p(k | y_t^i, Ω^(0)) = c_k N(y_t^i; m_k, R_k) / ∑_{l=1}^{K} c_l N(y_t^i; m_l, R_l)    (12)

is the Baum-Welch occupancy probability of the k-th component given y_t^i. Note that the product terms in Eq. (11) are not strictly probability densities because they do not integrate to 1. Using Eq. (11) will lead to the same i-vector solution as in [2], although in that paper the form of p(Y^i | w(i)) was never explicitly given.

2.2.3 Solving i-vector

Given the smoothed data model in Eq. (11), the closed-form solution of the problem in Eq. (8) gives the i-vector extraction formula as follows:

    ŵ(i) = l^{-1}(i) T^⊤ R^{-1} Γ_y(i),    (13)
    l(i) = I + T^⊤ Γ(i) R^{-1} T,    (14)

where R is the (D·K) × (D·K) block-diagonal matrix with R_k as its k-th block component; Γ(i) is a (D·K) × (D·K) block-diagonal matrix with γ_k(i) I_{D×D} as its k-th block component; and Γ_y(i) is a (D·K)-dimensional supervector with Γ_{y,k}(i) as its k-th D-dimensional subvector. The statistics γ_k(i) and Γ_{y,k}(i) are calculated as follows:

    γ_k(i) = ∑_{t=1}^{T_i} p(k | y_t^i, Ω^(0)),
    Γ_{y,k}(i) = ∑_{t=1}^{T_i} p(k | y_t^i, Ω^(0)) (y_t^i − m_k).    (15)

Proof: First,

    log p(Y^i | w(i)) = ∑_{t=1}^{T_i} ∑_{k=1}^{K} p(k | y_t^i, Ω^(0)) log N(y_t^i; M_k(i), R_k)
                      = ∑_{t=1}^{T_i} ∑_{k=1}^{K} p(k | y_t^i, Ω^(0)) { log [1 / ((2π)^{D/2} |R_k|^{1/2})]
                            − (1/2) (y_t^i − m_k − T_k w(i))^⊤ R_k^{-1} (y_t^i − m_k − T_k w(i)) }
                      = ∑_{t=1}^{T_i} ∑_{k=1}^{K} p(k | y_t^i, Ω^(0)) { log [1 / ((2π)^{D/2} |R_k|^{1/2})]
                            − (1/2) (y_t^i − m_k)^⊤ R_k^{-1} (y_t^i − m_k)
                            + w^⊤(i) T_k^⊤ R_k^{-1} (y_t^i − m_k) − (1/2) w^⊤(i) T_k^⊤ R_k^{-1} T_k w(i) }.    (16)

Only the last two terms are related to w(i), and we can define:

    H(i) = ∑_{t=1}^{T_i} ∑_{k=1}^{K} p(k | y_t^i, Ω^(0)) { w^⊤(i) T_k^⊤ R_k^{-1} (y_t^i − m_k) − (1/2) w^⊤(i) T_k^⊤ R_k^{-1} T_k w(i) }
         = w^⊤(i) T^⊤ R^{-1} Γ_y(i) − (1/2) w^⊤(i) T^⊤ Γ(i) R^{-1} T w(i).    (17)

So

    p(w(i) | Y^i) ∝ p(Y^i | w(i)) p(w(i))
                  ∝ exp{ w^⊤(i) T^⊤ R^{-1} Γ_y(i) − (1/2) w^⊤(i) T^⊤ Γ(i) R^{-1} T w(i) } · exp{ −(1/2) w^⊤(i) w(i) }
                  = exp{ w^⊤(i) T^⊤ R^{-1} Γ_y(i) − (1/2) w^⊤(i) [ T^⊤ Γ(i) R^{-1} T + I ] w(i) }
                  = exp{ w^⊤(i) T^⊤ R^{-1} Γ_y(i) − (1/2) w^⊤(i) l(i) w(i) }
                  ∝ exp{ −(1/2) [ w(i) − l^{-1}(i) T^⊤ R^{-1} Γ_y(i) ]^⊤ l(i) [ w(i) − l^{-1}(i) T^⊤ R^{-1} Γ_y(i) ] }.    (18)

Therefore, the "posterior" distribution of w(i) is Gaussian with mean l^{-1}(i) T^⊤ R^{-1} Γ_y(i) and covariance l^{-1}(i). □
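For concreteness, a minimal numpy sketch of the statistics of Eq. (15) and the i-vector of Eqs. (13)-(14) is given below. It assumes the posteriors of Eq. (12) are supplied, and stores T as a (K, D, F) array so that T[k] is the k-th D × F sub-block; the function names are illustrative only.

    import numpy as np

    def baum_welch_stats(Y_i, post, m):
        """gamma_k(i) and Gamma_{y,k}(i) of Eq. (15).

        Y_i : (T_i, D) frames; post : (T_i, K) posteriors of Eq. (12); m : (K, D) UBM means
        """
        gamma = post.sum(axis=0)                              # (K,)
        Gamma_y = post.T @ Y_i - gamma[:, None] * m           # (K, D)
        return gamma, Gamma_y

    def extract_ivector(gamma, Gamma_y, T, R_diag):
        """Posterior mean of w(i) per Eqs. (13)-(14).

        T : (K, D, F) sub-blocks of the total variability matrix; R_diag : (K, D) covariance diagonals
        """
        K, D, F = T.shape
        l = np.eye(F)                                         # will become l(i) of Eq. (14)
        rhs = np.zeros(F)                                     # will become T' R^{-1} Gamma_y(i)
        for k in range(K):
            Tk_over_R = T[k] / R_diag[k][:, None]             # R_k^{-1} T_k, shape (D, F)
            l += gamma[k] * (T[k].T @ Tk_over_R)
            rhs += Tk_over_R.T @ Gamma_y[k]
        return np.linalg.solve(l, rhs)                        # l^{-1}(i) T' R^{-1} Gamma_y(i), Eq. (13)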

2.3 Hyperparameter estimation

Given the training data Y, the hyperparameters T and R can be estimated by maximizing the following log-likelihood function:

    F(T, R) = log ∏_{i=1}^{I} ∫ p_{T,R}(Y^i, w(i)) dw(i)
            = ∑_{i=1}^{I} log ∫ p_{T,R}(Y^i, w(i)) dw(i)
            = ∑_{i=1}^{I} log ∫ [ p_{T,R}(Y^i, w(i)) / p_{T^(0),R^(0)}(Y^i, w(i)) ] p_{T^(0),R^(0)}(Y^i, w(i)) dw(i)
            = ∑_{i=1}^{I} log ∫ [ p_{T,R}(Y^i, w(i)) / p_{T^(0),R^(0)}(Y^i, w(i)) ] p_{T^(0),R^(0)}(w(i) | Y^i) dw(i) + C_1,    (19)

where T^(0) and R^(0) are the hyperparameters from the last iteration, and C_1 is a constant independent of T and R (we will use C_j's hereafter for such constants). By Jensen's inequality, we have:

    F(T, R) ≥ ∑_{i=1}^{I} ∫ log [ p_{T,R}(Y^i, w(i)) / p_{T^(0),R^(0)}(Y^i, w(i)) ] p_{T^(0),R^(0)}(w(i) | Y^i) dw(i)
            = ∑_{i=1}^{I} ∫ log p_{T,R}(Y^i, w(i)) p_{T^(0),R^(0)}(w(i) | Y^i) dw(i) + C_2
            = ∑_{i=1}^{I} ∫ log p_{T,R}(Y^i | w(i)) p_{T^(0),R^(0)}(w(i) | Y^i) dw(i) + C_3.    (20)

So an auxiliary function A can be built as:

    A(T, R) = ∑_{i=1}^{I} ∫ log p_{T,R}(Y^i | w(i)) p_{T^(0),R^(0)}(w(i) | Y^i) dw(i)
            = ∑_{i=1}^{I} E[ log p_{T,R}(Y^i | w(i)) ],    (21)

and the log-likelihood function is

    F(T, R) = ∑_i { G(i) − (1/2) log |l(i)| + (1/2) E[w^⊤(i)] T^⊤ R^{-1} Γ_y(i) }.    (22)

2.3.1 Updating T

Referring to Eqs. (16) and (17), only the H function is involved in updating T. So, using the properties in Appendix A, we have

    A = ∑_i E[ w^⊤(i) T^⊤ R^{-1} Γ_y(i) − (1/2) w^⊤(i) T^⊤ Γ(i) R^{-1} T w(i) ] + C_4
      = ∑_i { E[w^⊤(i)] T^⊤ R^{-1} Γ_y(i) − (1/2) tr( T^⊤ Γ(i) R^{-1} T E[w(i) w^⊤(i)] ) } + C_4
      = ∑_i tr( R^{-1} [ Γ_y(i) E[w^⊤(i)] − (1/2) Γ(i) T E[w(i) w^⊤(i)] ] T^⊤ ) + C_4,    (23)

and

    ∂A/∂T = ∑_i R^{-1} [ Γ_y(i) E[w^⊤(i)] − Γ(i) T E[w(i) w^⊤(i)] ].    (24)

Setting Eq. (24) to 0 and making use of the diagonal structure of Γ(i), T can be solved row by row according to:

    T_m ∑_i Γ^m(i) E[w(i) w^⊤(i)] = ∑_i Γ_y^m(i) E[w^⊤(i)],    (25)

in which T_m, Γ^m and Γ_y^m are the m-th rows of T, Γ and Γ_y, respectively.
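In code, the row-by-row solve can be organized per mixture component, since all D rows belonging to component k share the same coefficient matrix A_k = ∑_i γ_k(i) E[w(i) w^⊤(i)]. A minimal sketch, assuming the statistics have already been accumulated into arrays A and C (illustrative names, using the (K, D, F) sub-block layout of T from the earlier sketch):

    import numpy as np

    def update_T(A, C):
        """Row-by-row solve of Eq. (25) in a (K, D, F) sub-block layout.

        A : (K, F, F)  A_k = sum_i gamma_k(i) E[w(i) w(i)']
        C : (K, D, F)  C_k = sum_i Gamma_{y,k}(i) E[w(i)']
        """
        # every row m inside block k satisfies T_m A_k = C_m, i.e. A_k' T_k' = C_k'
        return np.stack([np.linalg.solve(A[k].T, C[k].T).T for k in range(A.shape[0])])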

2.3.2 Updating R

Go back to Eq. (16) and set

    G(i) = ∑_{t=1}^{T_i} ∑_{k=1}^{K} p(k | y_t^i, Ω^(0)) { log [1 / ((2π)^{D/2} |R_k|^{1/2})] − (1/2) (y_t^i − m_k)^⊤ R_k^{-1} (y_t^i − m_k) }
         = (1/2) ∑_{k=1}^{K} { γ_k(i) log |R_k^{-1}| − tr( R_k^{-1} Γ_{yy^⊤,k}(i) ) } + C_5,    (26)

in which

    Γ_{yy^⊤,k}(i) = ∑_{t=1}^{T_i} p(k | y_t^i, Ω^(0)) (y_t^i − m_k)(y_t^i − m_k)^⊤.    (27)

It is not difficult to see that:

    A = ∑_i { G(i) + E[H(i)] }.    (28)

Because

    ∂G(i)/∂R^{-1} = (1/2) ∑_{k=1}^{K} { γ_k(i) R − Γ_{yy^⊤,k}(i) },    (29)

and

    ∂E[H(i)]/∂R^{-1} = ∂ tr( R^{-1} [ Γ_y(i) E[w^⊤(i)] − (1/2) Γ(i) T E[w(i) w^⊤(i)] ] T^⊤ ) / ∂R^{-1}.    (30)

Note that, using Eq. (25),

    ∑_i ∂E[H(i)]/∂R^{-1} = ∂ tr( R^{-1} ∑_i [ Γ_y(i) E[w^⊤(i)] − (1/2) Γ(i) T E[w(i) w^⊤(i)] ] T^⊤ ) / ∂R^{-1}
                         = (1/2) ∑_i [ Γ_y(i) E[w^⊤(i)] T^⊤ + T E[w(i)] Γ_y(i)^⊤ ].    (31)

Setting ∂A/∂R^{-1} to 0, we have

    R_k = (1 / ∑_i γ_k(i)) ( ∑_i Γ_{yy^⊤,k}(i) − M_k ),    (32)

where M_k denotes the k-th diagonal block of the matrix

    (1/2) ∑_i [ Γ_y(i) E[w^⊤(i)] T^⊤ + T E[w(i)] Γ_y(i)^⊤ ].    (33)
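A corresponding sketch of the R_k update in Eqs. (32)-(33), again with illustrative accumulator names; the diagonal blocks of Eq. (33) are formed directly from the per-component sub-blocks:

    import numpy as np

    def update_R(gamma_sum, Gamma_yy_sum, GyEw_sum, T):
        """R_k update of Eqs. (32)-(33) in a (K, D, F) sub-block layout.

        gamma_sum    : (K,)       sum_i gamma_k(i)
        Gamma_yy_sum : (K, D, D)  sum_i Gamma_{yy',k}(i)
        GyEw_sum     : (K, D, F)  sum_i Gamma_{y,k}(i) E[w(i)']
        T            : (K, D, F)  current total variability matrix
        """
        R_new = np.empty_like(Gamma_yy_sum)
        for k in range(gamma_sum.shape[0]):
            # k-th diagonal block of the matrix in Eq. (33)
            M_k = 0.5 * (GyEw_sum[k] @ T[k].T + T[k] @ GyEw_sum[k].T)
            R_new[k] = (Gamma_yy_sum[k] - M_k) / gamma_sum[k]
        return R_new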

2.3.3 Summarizing the EM algorithm

• E-Step:

    E[w(i)] = l^{-1}(i) T^⊤ R^{-1} Γ_y(i)
    E[w(i) w^⊤(i)] = E[w(i)] E[w^⊤(i)] + l^{-1}(i)    (34)

• M-Step:

    T_m ∑_i Γ^m(i) E[w(i) w^⊤(i)] = ∑_i Γ_y^m(i) E[w^⊤(i)]
    R_k = (1 / ∑_i γ_k(i)) ( ∑_i Γ_{yy^⊤,k}(i) − M_k )    (35)
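In code, the E-step moments of Eq. (34) follow directly from the posterior precision l(i) and the vector T^⊤ R^{-1} Γ_y(i) (formed as in the extraction sketch of Section 2.2.3); a minimal sketch:

    import numpy as np

    def posterior_moments(l, rhs):
        """E-step moments of Eq. (34), given l(i) and the vector T' R^{-1} Gamma_y(i)
        (the `l` and `rhs` of the extraction sketch in Section 2.2.3)."""
        l_inv = np.linalg.inv(l)
        E_w = l_inv @ rhs                                  # E[w(i)]
        E_ww = np.outer(E_w, E_w) + l_inv                  # E[w(i) w(i)']
        return E_w, E_ww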

3 Implementation

3.1 Training T and R

1. Train a GMM-UBM using all the training data.

2. Initialize R by borrowing the R_k's from the UBM, and T randomly as:

    T_{m,f} ∈ [−α R_{m,m}, α R_{m,m}],  ∀m = 1, ..., DK; f = 1, ..., F,    (36)

where T_{m,f} is the element in the m-th row and f-th column of T (and similarly for R_{m,m}); α is typically set to 0.1 (a minimal initialization sketch is given after this list).

3. For each acoustic unit i, extract the Baum-Welch statistics γ_k(i), Γ_{y,k}(i) and Γ_{yy^⊤,k}(i) using the UBM as in Eqs. (15) and (27).

4. The E-Step: calculate the posterior expectations E[w(i)] and E[w(i) w^⊤(i)] using the statistics and the current estimates of T and R with Eq. (34).

5. Repeat Steps 3 and 4 for all i's. Accumulate the statistics needed for solving Eq. (35).

6. The M-Step: update T and R using Eq. (35).

7. If training has converged, stop; otherwise go back to Step 4.
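A minimal sketch of the random initialization in Step 2 / Eq. (36); the function name is illustrative, and each element is drawn uniformly from the stated interval.

    import numpy as np

    def init_T(R_diag, F, alpha=0.1, rng=None):
        """Random initialization of T per Eq. (36).

        R_diag : (K, D) covariance diagonals borrowed from the UBM
        """
        rng = np.random.default_rng() if rng is None else rng
        K, D = R_diag.shape
        r = R_diag.reshape(K * D, 1)                       # R_{m,m}, m = 1..DK
        return rng.uniform(-alpha * r, alpha * r, size=(K * D, F))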

3.2 Extracting ŵ(i)

Extracting ŵ(i) is straightforward by using the above Steps 3 and 4 and setting:

    ŵ(i) = E[w(i)].    (37)

The procedure is the same for both training and testing acoustic units.

3.3 Computational complexity

In i-vector extraction, the main computational cost is the calculation of l(i) in Eq. (14). The right-hand side of the equation can be implemented step by step as:

    R^{-1} → Γ(i) R^{-1} → T^⊤ Γ(i) R^{-1} → T^⊤ Γ(i) R^{-1} T,    (38)

so the complexity is O(I × DK × F × F). In our implementation, the property that Γ(i) is a block-diagonal matrix is used to make the computation more efficient:

1. Compute all K blocks of T^⊤ R^{-1} T:

    H_k = T_k^⊤ R_k^{-1} T_k,  ∀k = 1, ..., K,    (39)

where T_k is the k-th D × F sub-block of T.

2. For each acoustic unit i,

    l(i) = I + ∑_k γ_k(i) H_k.    (40)

By doing so, the complexity of evaluating Eq. (14) is dominated by O(I × K × F × F + DK × F × F).
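A minimal sketch of this two-step scheme, using the (K, D, F) sub-block layout assumed in the earlier sketches:

    import numpy as np

    def precompute_H(T, R_diag):
        """Step 1: H_k = T_k' R_k^{-1} T_k for all k, Eq. (39).  T : (K, D, F), R_diag : (K, D)."""
        return np.stack([T[k].T @ (T[k] / R_diag[k][:, None]) for k in range(T.shape[0])])

    def precision_l(gamma, H):
        """Step 2: l(i) = I + sum_k gamma_k(i) H_k, Eq. (40).  gamma : (K,), H : (K, F, F)."""
        F = H.shape[-1]
        return np.eye(F) + np.tensordot(gamma, H, axes=1)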

4 Clustering of i-vectors using LBG algorithm

As described above, given the training corpus, an i-vector can be extracted from each acoustic unit. Given the training set of i-vectors, we use a hierarchical divisive clustering algorithm, namely the LBG algorithm [4], to cluster them into multiple clusters. To measure the similarity between two i-vectors ŵ(i) and ŵ(j), both the Euclidean distance and the cosine similarity measure can be used. Using the Euclidean distance in clustering is straightforward; however, when the cosine similarity is used, the calculation of the clustering centroid needs to be revised. Given the cosine similarity defined as:

    sim(ŵ(i), ŵ(j)) = ŵ^⊤(i) ŵ(j) / ( ||ŵ(i)|| ||ŵ(j)|| ),    (41)

it can be proved that the corresponding clustering centroid c_ŵ of a cluster consisting of n i-vectors ŵ(1), ŵ(2), ..., ŵ(n) can be calculated as:

    c_ŵ = argmax_c ∑_{i=1}^{n} sim(ŵ(i), c)
        = { ( ∑_{i=1}^{n} ŵ(i)/||ŵ(i)|| ) / || ∑_{i=1}^{n} ŵ(i)/||ŵ(i)|| ||,   if ∑_{i=1}^{n} ŵ(i)/||ŵ(i)|| ≠ 0
          { 0,                                                                  otherwise.    (42)

After the convergence of the LBG clustering algorithm, we obtain E clusters of i-vectors with their centroids denoted as c_1^ŵ, c_2^ŵ, ..., c_E^ŵ, respectively. We use c_0^ŵ to denote the centroid of all the training i-vectors.
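A minimal sketch of the centroid rule of Eq. (42); the function name is illustrative, and wiring it into the usual LBG split-and-refine loop is omitted here.

    import numpy as np

    def cosine_centroid(W):
        """Centroid of Eq. (42) for a cluster of i-vectors stacked as rows of W, shape (n, F)."""
        unit = W / np.linalg.norm(W, axis=1, keepdims=True)   # w(i) / ||w(i)||
        s = unit.sum(axis=0)
        norm = np.linalg.norm(s)
        return s / norm if norm > 0 else s                    # zero vector if the sum vanishes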

References

[1] F. Beaufays, V. Vanhoucke, and B. Strope, "Unsupervised discovery and training of maximally dissimilar cluster models," Proc. Interspeech 2010, pp. 66-69.

[2] N. Dehak, P. J. Kenny, R. Dehak, P. Dumouchel, and P. Ouellet, "Front-end factor analysis for speaker verification," IEEE Trans. on Audio, Speech and Language Processing, Vol. 19, No. 4, pp. 788-798, 2011.

[3] P. Kenny, G. Boulianne, and P. Dumouchel, "Eigenvoice modeling with sparse training data," IEEE Trans. on Speech and Audio Processing, Vol. 13, No. 3, pp. 345-354, 2005.

[4] Y. Linde, A. Buzo, and R. M. Gray, "An algorithm for vector quantizer design," IEEE Trans. on Communications, Vol. COM-28, pp. 84-95, 1980.

[5] J. Xu, Y. Zhang, Z.-J. Yan, and Q. Huo, "An i-vector based approach to acoustic sniffing for irrelevant variability normalization based acoustic model training and speech recognition," Proc. Interspeech 2011.

[6] S. Young, et al., The HTK Book (for HTK version 3.4), 2006.

[7] Y. Zhang, J. Xu, Z.-J. Yan, and Q. Huo, "An i-vector based approach to training data clustering for improved speech recognition," Proc. Interspeech 2011.

A The Matrix Codebook for this memo

A.1 Trace

    tr(AB) = tr(BA)    (43)
    ∂ tr(X) = tr(∂X)    (44)
    ∂/∂X tr(A X^⊤) = A    (45)
    ∂/∂X tr(A X B X^⊤) = A X B + A^⊤ X B^⊤    (46)
    ∂/∂X tr(A X^{-1} B) = −X^{-⊤} A^⊤ B^⊤ X^{-⊤}    (47)
    ∂/∂X ln |det(X)| = (X^{-1})^⊤    (48)

A.2 Variance of quadratic form

Assume that x ∼ N(x | µ, R) and A is positive semi-definite. Then

    E(x x^⊤) = R + µ µ^⊤,    (49)
    E(x^⊤ A x) = tr(A R) + µ^⊤ A µ.    (50)

Proof: Since A is positive semi-definite, A = C^⊤ C, and

    E(x^⊤ A x) = E( ∑_i (C x)_i^2 ) = E( tr(C x x^⊤ C^⊤) ) = tr( C (R + µ µ^⊤) C^⊤ ) = tr( A E(x x^⊤) ).

Note that

    tr(A µ µ^⊤) = ∑_i ∑_j a_ij µ_i µ_j = ∑_i (A µ)_i µ_i = µ^⊤ A µ.    (51)

Letting µ = E(w), R = E(w w^⊤) − E(w) E(w)^⊤ and A = T^⊤ Γ(i) R^{-1} T, we get Eq. (23).

A.3 Others

Proposition:

    R Γ(i) = Γ(i) R,    (52)

where

    R = diag{R_1, ..., R_C},    (53)
    Γ(i) = diag{γ_1 I, ..., γ_C I}.    (54)

Proof:

    R Γ(i) = diag{R_1, ..., R_C} diag{γ_1 I, ..., γ_C I} = diag{γ_1 I R_1, ..., γ_C I R_C} = Γ(i) R.    (55)

In Mark J. F. Gales's paper, w is assumed to be only a model parameter and the solution is found through the "alternating variables" method (there is no closed-form maximum likelihood solution for w, T, and R).

B From a graphical model view

B.1 Probabilistic PCA

Probabilistic PCA is a simple example of the linear-Gaussian framework, in which all of the marginal and conditional distributions are Gaussian. We can formulate probabilistic PCA by first introducing an explicit latent variable w corresponding to the principal-component subspace. Next we define a Gaussian prior distribution p(w) over the latent variable, together with a Gaussian conditional distribution p(x | w) for the observed variable x conditioned on the value of the latent variable. Specifically, the prior distribution over w is given by a zero-mean unit-covariance Gaussian

    p(w) = N(w | 0, I),    (56)

and the conditional distribution of the observed variable x is

    p(x | w) = N(x | T w + µ, σ² I).    (57)

We can view the probabilistic PCA model from a generative viewpoint, in which a sampled value of the observed variable is obtained by first choosing a value for the latent variable and then sampling the observed variable conditioned on this latent value:

    x = T w + µ + ε,    (58)

where w is a Gaussian latent variable and ε is a zero-mean Gaussian-distributed noise variable with covariance σ² I. The generative process is illustrated in Figure 2.
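As an illustration, ancestral sampling from this generative model per Eqs. (56)-(58); the dimensions and parameter values below are arbitrary.

    import numpy as np

    rng = np.random.default_rng(0)
    D, F = 5, 2
    T = rng.standard_normal((D, F))
    mu = np.zeros(D)
    sigma = 0.1
    w = rng.standard_normal(F)                       # w ~ N(0, I), Eq. (56)
    x = T @ w + mu + sigma * rng.standard_normal(D)  # x = T w + mu + eps, Eq. (58)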

Figure 2: Probabilistic PCA.

B.1.1 Maximum likelihood PCA
