Pattern Recognition Letters 27 (2006) 838–842 www.elsevier.com/locate/patrec

Dynamical Gaussian mixture model for tracking elliptical living objects

Guanglei Xiong, Chao Feng, Liang Ji *

Department of Automation, National Laboratory for Information Science and Technology, Tsinghua University, Haidian District, Beijing 100084, China

Received 30 May 2005; received in revised form 19 October 2005
Available online 7 February 2006
Communicated by Dr. H. Sako

Abstract

In this letter, we present a novel dynamical Gaussian mixture model (DGMM) for tracking elliptical living objects in video frames. The parameters, which encode object position and shape, are estimated with a traditional Gaussian mixture model (GMM) for the first frame. Instead of simply using the parameters of one frame as the initial state for the next, DGMM employs a Kalman filter to predict these parameters and initialize the evolution of the GMM for all subsequent frames. Because the Kalman filter estimate is usually much closer to the true state than the previous frame's parameters, the computational load of tracking is reduced significantly. In addition, "split", "merge" and "delete" operations are integrated into DGMM to account for object splitting, merging and vanishing. This scheme suppresses the chance of getting stuck at a local maximum caused by a mismatch between the number of objects in the model and the actual number of objects. Results from experiments on both simulated and real-world data demonstrate the effectiveness of our algorithm. DGMM has been successfully applied to track the early development of urchin embryos, in which motility and cleavage events occur.

© 2005 Elsevier B.V. All rights reserved.

Keywords: Gaussian mixture model; EM algorithm; Kalman filtering; Tracking

1. Introduction

Gaussian mixture model (GMM) (Dempster et al., 1977) plays an important role in clustering, density estimation and pattern recognition due to its analytical tractability, asymptotic properties, and ease of implementation. Most applications assume the underlying probability density of the sample data to be fixed, and iterations are performed to obtain proper model parameters according to some criterion. We found, in preliminary experiments, that a converged GMM re-converges quickly if the samples change only slightly. This implies that the re-convergence property is potentially useful for tracking problems. When

* Corresponding author. Tel.: +86 106 278 5787; fax: +86 106 279 5871.
E-mail address: [email protected] (L. Ji).

0167-8655/$ - see front matter © 2005 Elsevier B.V. All rights reserved. doi:10.1016/j.patrec.2005.11.015

dealing with tracking, however, samples differ from frame to frame, so some parameter updating technique must be added to the basic GMM.

Several authors have proposed adaptive expectation–maximization (EM) algorithms for GMMs that include parameter updating schemes. There are two approaches to adaptation. One is parameter based, in which parameter updating functions are derived (Oliver et al., 1997; McKenna et al., 1999; Rares and Reinders, 2000). The other is non-parametric, incorporating a Markov chain Monte Carlo (MCMC) based particle filter (Vermaak et al., 2003). However, these models are not designed for tracking. The method we propose here is parameter based and well suited to tracking clusters of elliptical objects. First, the parameters, which encode object position and shape, are estimated by a traditional GMM to initialize the


model. Instead of simply using the parameters of one frame as the initial state of the next, DGMM employs a Kalman filter to predict these parameters and initialize the evolution of the GMM for subsequent frames. In addition, "split", "merge" and "delete" operations are integrated into DGMM to account for object splitting, merging and vanishing. The advantage of our algorithm is twofold. First, owing to the "split", "merge" and "delete" operations, the chance of getting stuck at a local maximum, caused by a mismatch between the number of objects in the model and the actual number of objects, is greatly reduced. Second, because the Kalman filter estimate is usually close to the true state, the number of iterations per frame decreases dramatically; our algorithm is therefore faster than the naive approach.

The remainder of the letter is organized as follows. In the next section we briefly review the GMM approach and present the three operations we adopt. Section 3 describes the motion parameterization and its estimation by Kalman filtering. The overall algorithm is summarized in Section 4. Synthetic and real-world experimental results are shown in Section 5. Section 6 concludes the paper and discusses future work.

2. Gaussian mixture model

2.1. Basic Gaussian mixture model

A random variable x is said to follow a Gaussian mixture model if its probability density function can be written as

p(x; \Theta) = \sum_{m=1}^{k} \alpha_m \, p(x; \theta_m),

where α_1, α_2, …, α_k are the mixture weights, subject to α_m > 0 and \sum_{m=1}^{k} \alpha_m = 1. Each component density p(x; θ_m) is a normal distribution

p(x; \theta_m) = \frac{1}{(2\pi)^{d/2} \det(\Sigma_m)^{1/2}} \exp\left( -\frac{1}{2} (x - \mu_m)^T \Sigma_m^{-1} (x - \mu_m) \right),

where θ_m = {μ_m, Σ_m} collects the mth component's center and covariance. The complete set of parameters defining the mixture is Θ = {α_1, …, α_k, μ_1, …, μ_k, Σ_1, …, Σ_k}. Given a set of n independent and identically distributed samples X = {x_1, x_2, …, x_n}, the log-likelihood of a k-component mixture is

\log p(X; \Theta) = \log \prod_{i=1}^{n} p(x_i; \Theta) = \sum_{i=1}^{n} \log \sum_{m=1}^{k} \alpha_m \, p(x_i; \theta_m).

Because the maximum likelihood estimate of Θ cannot be obtained analytically, the EM algorithm is widely used to estimate the parameters of a GMM. It is an iterative algorithm that alternates between the following E-step and M-step:

• E-step. The posterior probability of sample x_i at the tth step is computed as

p(l \mid x_i; \Theta^{(t)}) = \frac{\alpha_l^{(t)} p(x_i; \theta_l^{(t)})}{\sum_{m=1}^{k} \alpha_m^{(t)} p(x_i; \theta_m^{(t)})}.

• M-step. The parameters at the (t+1)th step, Θ^(t+1), are updated by

\alpha_m^{(t+1)} = \frac{1}{n} \sum_{i=1}^{n} p(m \mid x_i; \Theta^{(t)}),

\mu_m^{(t+1)} = \frac{\sum_{i=1}^{n} x_i \, p(m \mid x_i; \Theta^{(t)})}{\sum_{i=1}^{n} p(m \mid x_i; \Theta^{(t)})},

\Sigma_m^{(t+1)} = \frac{\sum_{i=1}^{n} p(m \mid x_i; \Theta^{(t)}) \, (x_i - \mu_m^{(t+1)})(x_i - \mu_m^{(t+1)})^T}{\sum_{i=1}^{n} p(m \mid x_i; \Theta^{(t)})}.

2.2. "Split", "merge" and "delete" operations

To account for object splitting, merging and vanishing, and to prevent the model from converging to a local maximum caused by a mismatch between the number of objects in the model and the actual number of objects, we include three operations and one model selection criterion. Among the criteria already proposed, such as the Bayesian inference criterion (BIC), minimum description length (MDL), Laplace-empirical criterion (LEC), integrated classification likelihood (ICL) and minimum message length (MML), Figueiredo and Jain (2002) showed experimentally that MML performs best. We therefore use the MML criterion, which can be expressed as

\hat{\Theta} = \arg\max_{\Theta} L(\Theta, X),

where

L(\Theta, X) = \log p(X; \Theta) - \frac{N}{2} \sum_{m=1}^{k} \log \frac{n \alpha_m}{12} - \frac{k}{2} \log \frac{n}{12} - \frac{k(N+1)}{2},

and N is the dimensionality of θ_m. The model that maximizes L(Θ, X) is chosen to describe the status of the objects in the current frame.

As in (Ueda et al., 2000), the local Kullback divergence is used to measure the distance between the local data density f_m(x) and the model density p_m(x) of the mth component:

J(m; \Theta) = \sum_{i=1}^{n} f_m(x_i; \Theta) \log \frac{f_m(x_i; \Theta)}{p_m(x_i; \theta_m)},

where

f_m(x; \Theta) = \frac{p(m \mid x; \Theta)}{\sum_{i=1}^{n} p(m \mid x_i; \Theta)}.

We repeatedly check whether the following three operations can increase L(Θ, X):
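As an illustration, one EM iteration (the E-step and M-step above) can be sketched in a few lines of NumPy. This is our own minimal sketch, not the authors' implementation; the function and variable names are ours.

```python
import numpy as np

def em_step(X, weights, means, covs):
    """One EM iteration for a k-component Gaussian mixture.
    X: (n, d) samples; weights: (k,); means: (k, d); covs: (k, d, d)."""
    n, d = X.shape
    k = len(weights)
    # E-step: posterior responsibility p(m | x_i) for every sample/component.
    resp = np.empty((n, k))
    for m in range(k):
        diff = X - means[m]
        inv = np.linalg.inv(covs[m])
        norm = 1.0 / ((2 * np.pi) ** (d / 2) * np.sqrt(np.linalg.det(covs[m])))
        resp[:, m] = weights[m] * norm * np.exp(-0.5 * np.sum(diff @ inv * diff, axis=1))
    resp /= resp.sum(axis=1, keepdims=True)
    # M-step: re-estimate weights, means and covariances from responsibilities.
    nk = resp.sum(axis=0)
    new_weights = nk / n
    new_means = (resp.T @ X) / nk[:, None]
    new_covs = np.empty_like(covs)
    for m in range(k):
        diff = X - new_means[m]
        new_covs[m] = (resp[:, m, None] * diff).T @ diff / nk[m]
    return new_weights, new_means, new_covs
```

Iterating `em_step` until the log-likelihood stabilizes reproduces the basic GMM fit used to initialize DGMM on the first frame.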


• "Split": The component (denoted by superscript "(s)") that maximizes J(m; Θ) is a candidate for splitting. The initial parameters of the two new components (denoted by subscripts 1 and 2) are set as follows:

\alpha_1 = \alpha_2 = \alpha^{(s)}/2,
\mu_1 = \mu^{(s)} + m^{(s)}/2, \quad \mu_2 = \mu^{(s)} - m^{(s)}/2,
\Sigma_1 = \Sigma_2 = r^{(s)} I,

where m^(s) is the normalized major axis of Σ^(s), r^(s) is the smallest singular value of Σ^(s), and I is the identity matrix. The idea behind the formula is simple: we split in the direction perpendicular to the principal axis.

• "Merge": The pair of components (subscripts 1 and 2) whose composition (superscript "(m)") minimizes J(m; Θ) is a candidate for merging. The parameters of the composed component are set as follows (Zhang et al., 2004):

\alpha^{(m)} = \alpha_1 + \alpha_2,
\mu^{(m)} = (\alpha_1 \mu_1 + \alpha_2 \mu_2) / \alpha^{(m)},
\Sigma^{(m)} = \left\{ \alpha_1 \left[ \Sigma_1 + (\mu_1 - \mu^{(m)})(\mu_1 - \mu^{(m)})^T \right] + \alpha_2 \left[ \Sigma_2 + (\mu_2 - \mu^{(m)})(\mu_2 - \mu^{(m)})^T \right] \right\} / \alpha^{(m)}.

• "Delete": The component (superscript "(d)") is deleted if its weight is too small, i.e., α^(d) < N/n.

If any operation increases L(Θ, X), the model is updated; otherwise the previous model is restored.
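As a concrete illustration of the "merge" rule, a moment-matching merge of two Gaussian components might be sketched as follows. This is a sketch with our own function names; the "split" and "delete" rules follow the same pattern.

```python
import numpy as np

def merge_components(a1, mu1, S1, a2, mu2, S2):
    """Merge two Gaussian components into one that preserves the pair's
    total weight, mean, and second moment (cf. Zhang et al., 2004)."""
    a = a1 + a2                         # combined weight alpha^(m)
    mu = (a1 * mu1 + a2 * mu2) / a      # weight-averaged center mu^(m)
    d1 = (mu1 - mu)[:, None]
    d2 = (mu2 - mu)[:, None]
    # Each covariance picks up the spread between its old center and the new one.
    S = (a1 * (S1 + d1 @ d1.T) + a2 * (S2 + d2 @ d2.T)) / a
    return a, mu, S
```

Merging two coincident components leaves the covariance unchanged, while merging well-separated ones inflates the covariance along the line joining their centers, exactly as the formula above dictates.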

3. Motion estimation by Kalman filtering

A simple way to update the parameters is to use the model of the previous frame as the initial model for the next frame. When the objects move or reshape rapidly, however, this has two disadvantages: the number of iterations for a single frame is often large, and the model sometimes gets stuck at a local maximum and fails to describe the objects well. We therefore use Kalman filtering to predict the next state of the objects, and the prediction is used as the initial parameters for the next frame. Because the Kalman filter estimate is usually close to the true state, the computational load of tracking is reduced significantly.

The Kalman filter is a well-known tool for stochastic estimation from noisy measurements. It is a set of recursive equations implementing a predictor–corrector type estimator that is optimal in the sense of minimizing the estimated error covariance. Owing to its simplicity, effectiveness, and optimality properties, the Kalman filter has been applied successfully in a wide range of areas such as aerospace, navigation, radar, and target tracking. Here we employ it to predict the initial parameters for the next frame from the previous one.
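The one-step-ahead prediction scheme described above might be sketched as follows. This is a generic linear-Gaussian sketch under the standard state-space model x(k+1) = Φx(k) + u(k), y(k) = Hx(k) + v(k); the function name is our own.

```python
import numpy as np

def kalman_predict_step(x_pred, P_pred, y, Phi, H, Q, R):
    """One step of the one-step-ahead Kalman predictor: given the prior
    prediction x_hat(k|k-1), its error covariance P(k|k-1), and the new
    measurement y(k), return x_hat(k+1|k) and P(k+1|k)."""
    G = H @ P_pred @ H.T + R                  # innovation covariance G(k)
    K = P_pred @ H.T                          # unnormalized gain K(k)
    Ginv = np.linalg.inv(G)
    innov = y - H @ x_pred                    # measurement residual
    x_next = Phi @ x_pred + Phi @ K @ Ginv @ innov
    P_next = Phi @ P_pred @ Phi.T + Q - Phi @ K @ Ginv @ K.T @ Phi.T
    return x_next, P_next
```

Initializing with P(0|-1) = I and x_hat(0|-1) = 0 and looping this step over incoming measurements reproduces the recursion used to seed each GMM frame.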

Assume the states satisfy a linear time-invariant model:

x(k+1) = \Phi x(k) + u(k),
y(k) = H x(k) + v(k),

where x(k) and y(k) denote the state vector and the measurement vector at time k, respectively; u(k) and v(k) are zero-mean Gaussian white noise with covariances Q and R, respectively; Φ is the matrix relating x(k) to x(k+1); and H is the matrix relating states to measurements. The one-step predictor form of the Kalman filter can then be described as follows (Kailath, 1981):

1. Input the initial error covariance matrix P(0|-1) = P_0, usually set to the identity.
2. Input the initial state \hat{x}(0|-1) = 0.
3. Compute the gains G(k) and K(k):

G(k) = H P(k|k-1) H^T + R,
K(k) = P(k|k-1) H^T.

4. Compute the predicted state vector \hat{x}(k+1|k):

\hat{x}(k+1|k) = \Phi \hat{x}(k|k-1) + \Phi K(k) G(k)^{-1} \left( y(k) - H \hat{x}(k|k-1) \right).

5. Update the error covariance matrix P(k+1|k):

P(k+1|k) = \Phi P(k|k-1) \Phi^T + Q - \Phi K(k) G(k)^{-1} K(k)^T \Phi^T.

6. Return to Step 3.

For elliptical objects, the parameters θ_m = {μ_m, Σ_m} encode all the information about an object: μ_m gives its position and Σ_m its shape and orientation. These are the state variables estimated by the Kalman filter. We assume the objects do not interact with each other, so each component is estimated independently. The vector μ_m is directly suitable for Kalman filtering, but in the two-dimensional (2D) case the matrix Σ_m is parameterized as

\Sigma = \begin{bmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{bmatrix} \begin{bmatrix} r_x & 0 \\ 0 & r_y \end{bmatrix} \begin{bmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{bmatrix}^T,

where θ gives the orientation and [r_x, r_y]^T the shape. Combining the parameters from μ_m and Σ_m, the component state to be estimated is [μ_x, μ_y, θ, r_x, r_y]^T.

4. Dynamical Gaussian mixture model

We summarize the DGMM model in Fig. 1.

State estimate: the initial state of each component for the next frame is estimated by Kalman filtering.

State update: the renewed states in the GMM are fed back to the Kalman filter to update its current values. If any splitting or merging occurs in the GMM, newly generated components notify the Kalman filter to initialize their states, whereas vanished components notify it to delete them.

Fig. 1. Flow chart of DGMM.

5. Experimental results

To evaluate the performance of our algorithm we show results on both simulated and real-world datasets.

5.1. Simulated examples

In the simulated example, 100 frames of 2000 2D samples are generated with parameters:

\alpha_1 = 0.2, \quad \alpha_2 = 0.3, \quad \alpha_3 = 0.2, \quad \alpha_4 = 0.3,
\mu_1 = [0.2, 0.4]^T, \quad \mu_2 = [0.9 + v_2 t, 0.6]^T,
\mu_3 = [0.8, 0.4]^T, \quad \mu_4 = [0.5, 0.1 + v_4 t]^T,
\Sigma_1 = \Sigma_2 = \Sigma_3 = \Sigma_4 = \mathrm{diag}[0.001, 0.001],
t = 1, 2, \ldots, 100, \quad v_2 = 0.01, \quad v_4 = 0.02.

Tracking results are shown in Table 1. The average mean square error (MSE) between the tracking results and the known true parameters measures the accuracy of tracking; it is computed for the weight, center, and covariance. The computational load is measured by the average number of iterations over all frames. The table shows that the Kalman-filtering-based DGMM saves considerable computation time while maintaining tracking accuracy.

Table 1
Comparison between tracking with and without Kalman filtering

Type                       | Average MSE                          | Average number
                           | Weight  | Center  | Covariance       | of iterations
With Kalman filtering      | 0.0021  | 0.0032  | 0.0011           | 4
Without Kalman filtering   | 0.0023  | 0.0047  | 0.0010           | 8
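The 2D covariance parameterization of Section 3 can be round-tripped as a sanity check. The following is a sketch with our own helper names, recovering (θ, r_x, r_y) via an eigendecomposition; the reflection-style rotation matrix matches the factorization given in Section 3.

```python
import numpy as np

def params_to_cov(theta, rx, ry):
    """Build a 2x2 covariance from orientation theta and axis scales rx, ry."""
    V = np.array([[np.cos(theta),  np.sin(theta)],
                  [np.sin(theta), -np.cos(theta)]])   # orthogonal (reflection)
    return V @ np.diag([rx, ry]) @ V.T

def cov_to_params(S):
    """Recover (theta, rx, ry) from a symmetric 2x2 covariance.
    theta is the orientation of the major-axis eigenvector."""
    vals, vecs = np.linalg.eigh(S)     # eigenvalues in ascending order
    v = vecs[:, 1]                     # eigenvector of the larger eigenvalue
    theta = np.arctan2(v[1], v[0])
    return theta, vals[1], vals[0]
```

Since θ is only defined up to the eigenvector's sign, the recovered angle may differ from the original by π, but rebuilding the covariance from the recovered triple returns the same matrix, which is all the Kalman state requires.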

Fig. 2. Simulated results demonstrating the "merge" and "split" operations of DGMM (× represents samples on objects, circles indicate tracking results): (a) frame 31: objects merging, (b) frame 36: objects merged, (c) frame 41: object splitting and (d) frame 44: object split.


Fig. 3. Real data of tracking urchin embryo sequence.

Next, some parameters are adjusted to demonstrate the tracking of merging and splitting objects, namely μ_4 = [0.5, -0.1 + v_4 t]^T. The tracking result is shown in Fig. 2.

5.2. Real data

We apply our algorithm to track the early development (motility and cleavage) of purple urchin embryos (Fig. 3). The embryos are mounted in filtered seawater and filmed under a microscope. The cells are elliptical and fit our model well. A simple global threshold is applied to every frame to extract sample points representing the objects; this operation is computationally very cheap, and the threshold level can be chosen from a wide range without altering the outcome of DGMM. DGMM is then employed to find each cell's current location, shape and orientation. Frames 65 and 81 are shown: Cell 2 in frame 65 splits to produce Cells 2-a and 2-b in frame 81, Cell 3 in frame 65 splits to produce Cells 3-a and 3-b in frame 81, and Cell 1 in frame 81 is splitting. These experiments show that the proposed DGMM algorithm is effective for tracking elliptical living objects. On an ordinary PC, the frame rate is about 15 frames per second without merging and splitting and 8–10 with them, when each frame contains about 10,000 sample points.
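The sample-extraction step, a global threshold on each frame whose foreground pixel coordinates become GMM samples, might look like the following sketch. It assumes dark cells on a light background; the function name and threshold handling are our own.

```python
import numpy as np

def extract_samples(frame, threshold):
    """Return an (n, 2) array of (x, y) coordinates of pixels darker than
    the threshold, to be used as sample points for DGMM."""
    ys, xs = np.nonzero(frame < threshold)   # foreground: cells darker than background
    return np.column_stack([xs, ys]).astype(float)
```

Because DGMM only needs a rough point cloud per object, the exact threshold value is uncritical, consistent with the observation above that it can vary over a wide range.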

DGMM to account for object splitting, merging and vanishing, and to suppress the chance of getting stuck at a local maximum caused by a mismatch between the number of objects in the model and the actual number of objects. Both simulated and real-world experiments show its effectiveness. In future work we will extend DGMM to the 3D case and analyze the theoretical aspects of the model.

Acknowledgements

The authors wish to thank Dr. George Dassow for providing the video microscopic sequences and for fruitful discussions. We also thank the two anonymous reviewers for their helpful comments. This work is supported by the Chinese Ministry of Science and Technology under contract 001CB510307.

References

Dempster, A.P., Laird, N.M., Rubin, D.B., 1977. Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Statist. Soc. Ser. B 39, 1–38.
Figueiredo, M.A.T., Jain, A.K., 2002. Unsupervised learning of finite mixture models. IEEE Trans. Pattern Anal. Machine Intell. 24 (3), 381–396.
Kailath, T., 1981. Lectures on Wiener and Kalman Filtering. Springer, New York.
McKenna, S., Gong, S., Raja, Y., 1999. Tracking colour objects using adaptive mixture models. Image Vision Comput. 17, 225–231.
Oliver, N., Pentland, A., Bérard, F., 1997. LAFTER: lips and face real time tracker with facial expression recognition. In: CVPR '97.
Rares, A., Reinders, M.J.T., 2000. Object tracking by adaptive modeling. In: ICIP 2000.
Ueda, N., Nakano, R., Ghahramani, Z., Hinton, G., 2000. SMEM algorithm for mixture models. Neural Comput. 12, 2109–2128.
Vermaak, J., Doucet, A., Pérez, P., 2003. Maintaining multi-modality through mixture tracking. In: ICCV 2003.
Zhang, B.B., Zhang, C.S., Yi, X., 2004. Competitive EM algorithm for finite mixture models. Pattern Recognit. 37, 131–144.
