
Dynamic Dissimilarity Measure for Support-Based Clustering

Daewon Lee and Jaewook Lee

Abstract—Clustering methods utilizing support estimates of a data distribution have recently attracted much attention because of their ability to generate cluster boundaries of arbitrary shape and to deal with outliers efficiently. In this paper, we propose a novel dissimilarity measure based on a dynamical system associated with support estimating functions. Theoretical foundations of the proposed measure are developed and applied to construct a clustering method that can effectively partition the whole data space. Simulation results demonstrate that clustering based on the proposed dissimilarity measure is robust to the choice of kernel parameters and able to control the number of clusters efficiently.

Index Terms—Clustering, kernel methods, dynamical systems, equilibrium vector, support.

D. Lee is with the School of Industrial Engineering, University of Ulsan, Ulsan 680-749, Korea. E-mail: [email protected].
J. Lee is with the Department of Industrial and Management Engineering, Pohang University of Science and Technology, Pohang, Kyungbuk 790-784, Korea. E-mail: [email protected].
Manuscript received 5 Apr. 2008; revised 29 Dec. 2008; accepted 12 Apr. 2009; published online 3 June 2009. Recommended for acceptance by B.C. Ooi. Digital Object Identifier no. 10.1109/TKDE.2009.140.

1 INTRODUCTION

Recently, many researchers have successfully applied clustering methods based on the estimated support of a data distribution to solve some difficult and diverse unsupervised learning problems [1], [2], [3], [4], [5]. These methods, inspired by kernel machines such as kernel-based clustering [5], [6], [7] and support vector clustering [1], consist, in general, of two main stages: estimating a support function [8], [9] and clustering the data points based on geometric structures of the estimated support function. The latter clustering stage is highly computation-intensive even in middle-scale problems and often shows poor clustering performance. Several researchers have therefore developed various techniques to reduce its computational complexity for real applications, including approximated graph techniques [10], a spectral graph partitioning strategy [11], an ensemble strategy [12], chunking strategies [3], a pseudohierarchical technique [13], and equilibrium-based techniques [4], [14], [15], [16].

Despite their advantages over other clustering methods, the existing support-based clustering algorithms have some drawbacks. First, out-of-sample points outside of the generated cluster boundaries cannot directly be assigned a cluster label. Second, the clustering results are very sensitive to the choice of the kernel parameters used for a support estimate, since the boundaries can fluctuate strongly under small changes of the kernel parameters [13]. Finally, it is difficult to control the number of clusters when these algorithms are applied to clustering problems with a priori information on the number of clusters: to obtain K clusters (which is related to finding corresponding kernel parameters), for example, they require a computationally intensive parameter tuning process that involves repeated calls of the support estimating step and the cluster labeling step.

To overcome these intrinsic handicaps, in this paper, we propose a novel dissimilarity measure that can be applied to support-based clustering. Starting from a support function that estimates the support of a data distribution, we build its associated dynamic process to partition the whole data space into so-called basin cells of equilibrium vectors and to construct a weighted graph consisting of the equilibrium vectors. The constructed graph then defines a novel dissimilarity measure among equilibrium vectors with which we can perform inductive clustering, that is, assign cluster labels to out-of-sample points as well as in-sample points. Unlike the traditional SVC, which focuses on the support vectors located on the cluster boundaries, the proposed dissimilarity measure focuses on the equilibrium vectors located inside the generated clusters and can be applied to any kernel-based support or density estimating function that reveals the clusters of a data distribution well. Finally, we perform experiments to show that clustering based on the proposed dissimilarity measure is robust to the choice of kernel parameters and is able to generate a user-specified number of clusters without the parameter tuning process.

2 DYNAMIC DISSIMILARITY MEASURE

2.1 Support of a Data Distribution

A support function (or quantile function) is roughly defined as a positive scalar function $f : \mathbb{R}^n \to \mathbb{R}$ whose level set

$$L_f(r) = \{x \in \mathbb{R}^n : f(x) \le r\} \qquad (1)$$

characterizes the support (or a quantile) of the data distribution. See Fig. 1a. Popular support functions that are trained from data and are shown to characterize the support (or quantile) of a class-conditional distribution include:

- A support function generated by the support vector domain description (SVDD) method in [1], [8], [17],

$$f(x) = 1 - 2\sum_{j \in J} \beta_j e^{-q\|x - x_j\|^2} + \sum_{i,j \in J} \beta_i \beta_j e^{-q\|x_i - x_j\|^2}, \qquad (2)$$

where $x_j$ and $\beta_j$ for $j \in J$ are the support vectors and their corresponding coefficients constructed by optimizing the SVDD model, and $q$ denotes the parameter of the Gaussian kernel.

- A Gaussian process support function generated by the Gaussian process clustering in [15],

$$f(x) = k(x)^T C^{-1} k(x), \qquad (3)$$

where $C$ is a positive-definite covariance matrix with elements $C_{ij} = C(x_i, x_j; \Theta)$ and $k(x) = (C(x, x_1), \ldots, C(x, x_N))^T$. One commonly used covariance function is

$$C(x_i, x_j; \Theta) = v_0 \exp\left\{-\frac{1}{2}\sum_{m=1}^{n} l_m \left(x_i^m - x_j^m\right)^2\right\} + v_1 + \delta_{ij} v_2,$$

where the set of hyperparameters, $\Theta = \{v_0, v_1, v_2, l_1, \ldots, l_n\}$, can be determined by maximizing the marginal likelihood [15].

Traditional clustering methods based on a support function assign each connected component $C_i$ of the level set $L_f(r)$ to a separate cluster. Hence, they are not inductive, since the set of clusters $C_i$, $i = 1, \ldots, m$ does not cover the whole input space (e.g., the region outside $L_f(r)$).
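For concreteness, the SVDD support function in (2) is straightforward to evaluate once the SVDD model has been optimized. The following is a minimal sketch (not from the paper), assuming hypothetical arrays `sv` (support vectors $x_j$), `beta` (coefficients $\beta_j$), and the Gaussian kernel parameter `q` obtained from a QP solver:

```python
import numpy as np

def svdd_support(x, sv, beta, q):
    """Evaluate the SVDD support function f(x) in (2)."""
    k_x = np.exp(-q * np.sum((sv - x) ** 2, axis=1))   # e^{-q ||x - x_j||^2} for all j
    gram = np.exp(-q * np.sum((sv[:, None, :] - sv[None, :, :]) ** 2, axis=2))
    # the double sum over (i, j) is constant in x and could be precomputed
    return 1.0 - 2.0 * (beta @ k_x) + beta @ gram @ beta
```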

2.2 Graph Associated with a Support

With the aid of the dynamical system approach, we can construct a weighted graph that preserves and simplifies the cluster structures of $L_f(r)$ [14]. To be specific, we build the following dynamical system associated with $f$:

$$\frac{dx}{dt} = F(x) := -\nabla f(x). \qquad (4)$$


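To illustrate the dynamic process (4), a point can be driven to its stable equilibrium vector by simple gradient descent on $f$. For the SVDD function (2), $\nabla f(x) = 4q\sum_{j \in J}\beta_j e^{-q\|x - x_j\|^2}(x - x_j)$. The sketch below is an Euler discretization with an illustrative step size and tolerance (not values from the paper), reusing the hypothetical `sv`, `beta`, and `q` from the previous snippet:

```python
import numpy as np

def svdd_grad(x, sv, beta, q):
    """Gradient of the SVDD support function f in (2)."""
    w = beta * np.exp(-q * np.sum((sv - x) ** 2, axis=1))  # beta_j e^{-q ||x - x_j||^2}
    return 4.0 * q * (w.sum() * x - w @ sv)

def find_sev(x0, sv, beta, q, eta=0.01, tol=1e-8, max_iter=10000):
    """Follow dx/dt = -grad f(x) from x0 until an equilibrium is reached."""
    x = x0.astype(float).copy()
    for _ in range(max_iter):
        g = svdd_grad(x, sv, beta, q)
        if np.linalg.norm(g) < tol:   # equilibrium vector reached
            break
        x -= eta * g                  # Euler step along the negative gradient
    return x                          # the SEV whose basin cell contains x0
```

Points whose descent trajectories end at the same SEV belong to the same basin cell, which is how cluster labels can be assigned to both in-sample and out-of-sample points.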


Fig. 1. Illustration of the graph associated with a support. (a) The solid lines represent the cluster boundaries described by a level set $L_f(r)$ of the support function $f$ in (2) constructed from the original data set. (b) Partitioning of the sample space into the basin cells $A(s_i)$ of the stable equilibrium vectors (SEVs) $s_i$ via the dynamic process in (4). The transition equilibrium vectors (TEVs) are denoted by '+'. (c) Illustration of the weighted graph $G_r$ with three connected components, consisting of 10 vertices (the SEVs) and 8 edges, denoted by solid lines. The cluster boundaries of the clusters in (a) are represented by the dotted lines, which show that the graph $G_r$ preserves the topological structure of the clusters in $L_f(r)$.

The existence of a unique solution (or trajectory) $x(\cdot) : \mathbb{R} \to \mathbb{R}^n$ of (4) starting from each initial point $x(0) = x_0$ is guaranteed [18]. A point $\bar{x}$ is called an equilibrium vector of (4) if $F(\bar{x}) = 0$, and an equilibrium vector $s$ is called a stable equilibrium vector (SEV) if all the eigenvalues of the Jacobian of $F$ at $s$ have negative real parts. The basin cell of an SEV $s$ is the set of points whose trajectories converge to $s$, i.e.,

$$A(s) := \left\{x_0 \in \mathbb{R}^n : \lim_{t \to \infty} x(t) = s\right\}. \qquad (5)$$

The boundary of the basin cell defines the basin cell boundary, denoted by $\partial A(s)$. One distinguished feature of system (4) is that we can partition the whole data space into several separate basin cells of the SEVs under some mild conditions [4], [14], [21], i.e.,

$$\mathbb{R}^n = \bigcup_{i=1}^{M} \overline{A(s_i)}, \qquad (6)$$

where $\{s_i : i = 1, \ldots, M\}$ is the set of the SEVs of system (4). (See Fig. 1b.) Since each data point converges almost surely to one of the SEVs when system (4) is applied, we can easily identify the basin cell to which a data point belongs by its corresponding SEV.

An SEV $s_a$ is said to be adjacent to another SEV $s_b$ if there exists an index-one saddle equilibrium vector $d \in \overline{A(s_a)} \cap \overline{A(s_b)}$. Such an index-one saddle equilibrium vector $d$ is called a transition equilibrium vector (TEV) between $s_a$ and $s_b$ [14]. (See Fig. 1c.) From a practical viewpoint, the notions of adjacent SEVs and TEVs enable us to build a weighted graph $G_r = (V_r, E_r)$ describing the connections between the SEVs with the following elements:

1. The vertices $V_r$ of $G_r$ consist of the SEVs $s_i$ in $V$ with $f(s_i) < r$.
2. The edges $E_r$ of $G_r$ are defined as follows: $(s_i, s_j) \in E_r$ with the edge weight distance $d_E(s_i, s_j) = f(d)$ if there is a TEV $d$ between $s_i$ and $s_j$ with $d_E(s_i, s_j) < r$.

It can then be shown [14] that two SEVs, $s_i$ and $s_j$, are in the same connected component of the graph $G_r$ if, and only if, $s_i$ and $s_j$ are in the same cluster of the level set $L_f(r)$; that is, each connected component of $G_r$ corresponds to a cluster of $L_f(r)$. This result enables us to build a simplified graph $G_r$ that preserves the topological structure of the level set $L_f(r)$ (see Fig. 1c).

2.3 Dynamic Dissimilarity Measure

The basic elements of the constructed graph $G_r$ are the SEVs $s_i$, whose basin cells $A(s_i)$ can be identified as sets consisting of similar objects with respect to system (4). Now, to define a dissimilarity measure on all pairs of the SEVs, we first present the next theorem, which serves as a theoretical basis to extend the distance between two adjacent SEVs to the distance between any pair of SEVs by guaranteeing the existence of a connected graph $G_r$ for some large $r$.

Theorem 1. For a given support function $f$ defined by (2), assume that for any $x_0 \in \mathbb{R}^n$ the trajectory of (4) starting from $x_0$ converges to one of its equilibrium vectors. Then there exists a $\rho > 0$ such that the graph $G_r$ is connected for all $r \ge \rho$.

Proof. Choose $M > \max_{j \in J} \|x_j\|$. Let $V(x) = \frac{1}{2}\|x\|^2$ and $D_M := \{x : \|x\| \le M\}$. Then, for any $x_0$ with $\|x_0\| \ge M$, the trajectory $x(t)$ of (4) starting from $x(0) = x_0$ always satisfies

$$\frac{\partial}{\partial t} V(x(t))\bigg|_{t=0} = x_0^T\left(-\nabla f(x_0)\right) = -4q\left(\sum_{j \in J} \beta_j e^{-q\|x_0 - x_j\|^2}\right)\left(\|x_0\|^2 - \sum_{j \in J} \lambda_j(x_0)\, x_0^T x_j\right),$$

where

$$\lambda_j(x_0) = \frac{\beta_j \exp\left(-q\|x_0 - x_j\|^2\right)}{\sum_{j \in J} \beta_j \exp\left(-q\|x_0 - x_j\|^2\right)}.$$

Since $\sum_{j \in J} \lambda_j(x_0) = 1$ and $\|x_0\| \ge M > \max_j \|x_j\|$,

$$-\|x_0\|^2 + \sum_{j \in J} \lambda_j(x_0)\, x_0^T x_j \le -\|x_0\|^2 + \|x_0\| \max_j \|x_j\| < 0,$$

so that $\frac{\partial}{\partial t} V(x(t))|_{t=0} < 0$. This implies that $\|x(t)\|$ is always strictly decreasing when process (4) is applied outside $D_M$ and, in particular, that all the equilibrium vectors of (4) are inside the bounded and connected set $D_M$. Since all the trajectories of (4) converge to one of its equilibrium vectors [4], the trajectory of (4) starting from any $x_0$ with $\|x_0\| \ge M$ always enters the region $D_M$.

Next, we show that there exists a $\rho > 0$ such that $D_M \subseteq L_f(\rho)$, where $L_f(\rho) = \{x \in \mathbb{R}^n : f(x) \le \rho\}$. For any $x_0 \in D_M$, we have $\|x_0 - x_j\| \le 2M$ for all $j \in J$, and since $\sum_{j \in J} \beta_j = 1$ in the SVDD model,

$$f(x_0) = C - 2\sum_{j \in J} \beta_j e^{-q\|x_0 - x_j\|^2} \le C - 2e^{-4qM^2},$$

where $C = 1 + \sum_{i,j \in J} \beta_i \beta_j e^{-q\|x_i - x_j\|^2}$ and $q$ is a preset parameter of the Gaussian kernel in (2). By choosing $\rho > C - e^{-4qM^2}$, we have $D_M \subseteq L_f(\rho)$.



Fig. 2. Illustration of the main step in Algorithm 1 after the pre-step of Algorithm 1 in Fig. 1. The number of clusters at step $l$ is $M - l$, where $M = 11$ is the number of SEVs. The total number of edges is $e = 12$. In each panel, thin solid lines represent the constructed graph $G_r$ for varying $r$, and thick solid lines represent the cluster boundaries generated by $G_r$. At each step $l$, the next least weight edge does not always join two clusters; e.g., at step $l = 6$, the edge joining $s_4$ and $s_7$ is the next least weight edge, but it no longer creates a new cluster since $s_4$ and $s_7$ are already in the same cluster. In this algorithm, the next least weight edge is selected to join two clusters only if it does not create a cycle when added to the set of already selected edges.

Finally, let $r > \rho$. By the invariance property of $L_f(r)$ (i.e., if a point is on a connected component of $L_f(r)$, then its entire positive trajectory lies on the same component), for any point $x_0 \in L_f(r) \setminus D_M$, the trajectory starting at $x(0) = x_0$ must first hit $\partial D_M$ and then enter the region $D_M$, since all the equilibrium vectors are inside $D_M$ and all the trajectories converge to one of the equilibrium vectors. This implies that the set $D_M$ is a strong deformation retract of the level set $L_f(r)$. Also, from the uniqueness of the trajectories [18], the boundary $\partial D_M$ is homeomorphic to $\partial L_f(r)$, which implies that $L_f(r)$ is connected by the connectedness of $D_M$. Since the graph $G_r$ and $L_f(r)$ have the same number of connected components [14], the graph $G_r$ is connected. ∎

This theorem motivates us to define a dissimilarity measure on a connected graph $G = G_r$ for $r \ge \rho$ as follows:

Definition 1 (Dissimilarity measure). Let a connected graph $G = (V, E)$ be given. For a pair of SEVs, $s_i$ and $s_j$, in $V$, we define the distance $d_G(s_i, s_j)$ as

$$d_G(s_i, s_j) = \min\left\{ d_E(s_i, s_j),\; \max_{k=1,\ldots,h} d_E(s_{i_{k-1}}, s_{i_k}) \right\},$$

where the minimum is also taken over all path sequences (with no cycle) $s_i = s_{i_0}, s_{i_1}, \ldots, s_{i_{h-1}}, s_{i_h} = s_j$ such that $(s_{i_{k-1}}, s_{i_k}) \in E$ for each $k = 1, \ldots, h$; this endows the graph $G$ with a dissimilarity measure. (Here, we assume $d_E(s_i, s_j) = \infty$ if $(s_i, s_j) \notin E$.) Geometrically, the distance $d_G(\cdot, \cdot)$ is the smallest barrier, in terms of the support function value at the TEVs crossed, that must be overcome to escape from one SEV and move to the other along a path connecting the two SEVs.
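Since Definition 1 is a minimax path distance, it can be computed for all pairs of SEVs with a Floyd-Warshall-style recursion over the (min, max) semiring. A minimal sketch (not from the paper), assuming a hypothetical symmetric matrix `w` with `w[i, j] = f(d)` for the TEV `d` between adjacent SEVs and `np.inf` for non-adjacent pairs:

```python
import numpy as np

def dynamic_dissimilarity(w):
    """All-pairs d_G of Definition 1 from the edge-weight matrix w."""
    d = w.copy()
    np.fill_diagonal(d, 0.0)        # an SEV is at zero distance from itself
    m = d.shape[0]
    for k in range(m):              # relax through intermediate SEV s_k
        for i in range(m):
            for j in range(m):
                # the cost of a path through s_k is its largest edge weight
                d[i, j] = min(d[i, j], max(d[i, k], d[k, j]))
    return d
```

By Theorem 1, on a connected graph $G_r$ with $r \ge \rho$ all entries of the result are finite; thresholding the entries at a level $r$ recovers the connected components of $G_r$, and hence the clusters of $L_f(r)$.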

3 CLUSTERING BASED ON A DYNAMIC DISSIMILARITY MEASURE

Generally speaking, a support function (e.g., $f$ in (2)) is often very sensitive to the choice of kernel parameters, and so is the clustering structure described by $L_f(r)$. For example, if clusters overlap in some region, it can be difficult or even impossible to find a kernel parameter that separates them. Moreover, to control the number of clusters, we have to change the kernel parameters (and hence the support function $f$) by trial and error, as in [1], where each alteration of the kernel parameters entails repeated calls of a quadratic programming solver and a cluster labeling algorithm, which is computationally intensive.

The derived dynamic dissimilarity measure on the graph $G$ can help us overcome these drawbacks when it is applied to clustering. Specifically, with an input $K \ge 1$ denoting the number of clusters, we begin with every SEV representing a singleton cluster. Denote these clusters $C_1 = \{s_1\}, \ldots, C_M = \{s_M\}$. At each step, the closest two clusters (i.e., two separate clusters containing


Fig. 3. Clusters generated by Algorithm 1 with different kernel parameters, $q = 15$ for (a) and $q = 50$ for (b), given the number of clusters $K = 3$. Despite the considerable change of the kernel parameter $q$, the proposed method generates very similar cluster boundaries, represented by thick solid lines, with the same cluster labeling, which shows the robustness of the proposed method to the kernel parameter $q$.

two adjacent SEVs with the least edge weight distance) are merged into a single cluster, producing one less cluster at the next higher level. This process terminates when we obtain $K$ clusters, starting from $M$ clusters. This procedure, which employs a modified version of Kruskal's algorithm [22] for the minimum cost spanning tree, is detailed in Algorithm 1 below (see Fig. 2):

Algorithm 1. (Clustering based on a dynamic dissimilarity measure)

(Pre-Step:)
1: Given a support function $f$ and its associated weighted graph $G = (V, E)$, where $s_i$, $i = 1, \ldots, M$ is the set of SEVs and $d_k$, $k = 1, \ldots, e$ is the set of TEVs (cf. Algorithm 1 in [14])

(Main-Step:)
A.0. // Initialization //
1: Given a number of clusters $K$
2: Rearrange the index $k = 1, \ldots, e$ in such a way that $f(d_1) < f(d_2) < \cdots < f(d_e)$
3: Start with initial clusters $C_1 = \{s_1\}, \ldots, C_M = \{s_M\}$. In this initial step, the distance between two clusters is defined as
$$d(C_i, C_j) = \begin{cases} d_E(s_i, s_j) = f(d_k) & \text{if } (s_i, s_j) \in E, \\ \infty & \text{otherwise.} \end{cases}$$

A.1. // Single-linkage amalgamating //
1: Set $l = 1$ and $k = 1$
2: while $l \le M - K$ do
3:   Find the SEVs $s_i, s_j$ with edge weight $d_E(s_i, s_j) = f(d_k)$
4:   if $s_i, s_j$ are not in the same cluster then
5:     $C_{M+l} = C_a \cup C_b$, where $s_i \in C_a$ and $s_j \in C_b$
6:     $d(C_{M+l}, C_u) = \min\{d(C_a, C_u), d(C_b, C_u)\}$ for all remaining clusters $C_u$
7:     Add cluster $C_{M+l}$ as a new cluster and remove clusters $C_a$ and $C_b$
8:     Set $l = l + 1$, $k = k + 1$
9:   else
10:    $k = k + 1$
11:  end if
12: end while

TABLE 1
Experimental Results on Benchmark Data Sets
($RI_{adj}$ denotes the adjusted Rand index.)

Fig. 4. Comparison of the proposed method (b) with traditional SVC algorithms (a), applied to the iris and crab data sets, which have overlaps between clusters.


Fig. 5. Image segmentation results. From the same support function, three different segmentation results are generated.

This algorithm possesses a monotonicity property: the dissimilarity between merged clusters is monotonically increasing with the level of the merger. Thus, a binary tree, called a dendrogram, can be plotted so that the height of each node is proportional to the value of the intergroup dissimilarity between its two daughters. This property makes the method much less sensitive to the choice of kernel parameters than traditional support-based clustering algorithms such as the SVC in [1] (see Fig. 3). Also, Algorithm 1 enables us to control the number of clusters by manipulating the constructed graph without changing the kernel parameters (see Fig. 2).

The most time-consuming step in Algorithm 1 is locating the SEVs and TEVs used to construct the weighted graph $G = (V, E)$. If we let $m$ (usually on the order of 5-20) be the average number of iterations for locating the SEVs from the data points via the steepest descent process, then the time complexities of finding all the SEVs and all the TEVs of system (4) are $O(Nm)$ and $O(M^2 d)$, respectively, where $M$ is the number of SEVs and $d$ is the average cost of computing a TEV between two SEVs [14].
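The amalgamation loop in A.1 is essentially Kruskal's algorithm with an early stop. A minimal sketch (not from the paper), using a union-find structure for the cluster bookkeeping and assuming a hypothetical list `edges` of `(weight, i, j)` triples with `weight = f(d)` for the TEV `d` joining SEVs `i` and `j`:

```python
def amalgamate(n_sevs, edges, n_clusters):
    """Merge singleton SEV clusters until n_clusters remain (Algorithm 1, A.1)."""
    parent = list(range(n_sevs))

    def find(i):                            # root of the cluster containing SEV i
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    merges = n_sevs - n_clusters            # l runs from 1 to M - K
    for weight, i, j in sorted(edges):      # ascending f(d_1) < f(d_2) < ...
        if merges == 0:
            break
        ri, rj = find(i), find(j)
        if ri != rj:                        # the edge joins two distinct clusters
            parent[ri] = rj
            merges -= 1
    return [find(i) for i in range(n_sevs)] # cluster label (root SEV) per SEV
```

Each data point then inherits the cluster label of the SEV to which it converges under (4), which is what makes the resulting clustering inductive for out-of-sample points as well.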

4 EXPERIMENTAL RESULTS

To demonstrate the performance of the clustering algorithm based on the proposed dynamic dissimilarity measure empirically, we applied it to well-known benchmark classification data sets (downloadable at http://sites.google.com/site/daewonweb/file/artificialData.zip) and compared it with other state-of-the-art kernel clustering methods: the kernel clustering method of Camastra and Verri [2] (K-SVC) and the spectral clustering method of [23] (Spectral); see Table 1. In the proposed method, we used the support function in (2), and the Gaussian kernel parameter $q$ was randomly chosen for all data sets, without parameter tuning, in order to check the robustness to the kernel parameter. As a measure of clustering accuracy, we used the adjusted Rand index ($RI_{adj}$), a similarity measure between two data partitions [24]; here, we calculated $RI_{adj}$ between the predicted cluster labels and the true class labels. The $RI_{adj}$ is at most 1, with 1 indicating that the two partitions are exactly the same and values near 0 indicating chance-level agreement between them. Table 1 shows that the proposed method attains the largest $RI_{adj}$ values on all of the data sets, reaching the value 1 on all but iris and sunflower, thus outperforming the other methods for all of the data sets, where the cluster number $K$ is a priori known.

In addition, we applied it to some well-known overlapped clustering problems, iris and crab. Fig. 4 shows the clustering result of the proposed method (2nd and 4th panels) compared with the original SVC of [1] (1st and 3rd panels), for which the parameter set $(C, q)$ was selected after some trial and error. To split the overlapped clusters, the original SVC must introduce many bounded support vectors (BSVs), which may render even in-sample data points unlabeled. In contrast, the proposed method not only successfully separates these clusters without allowing BSVs, but also assigns a cluster label to both in-sample and out-of-sample data points. The result illustrates how well the proposed method can split overlapped clusters.

Fig. 5 shows the result of the proposed method applied to image segmentation problems to check its scalability to large data sets. The proposed method easily generates three different segmentation results (with varying numbers of clusters) from the same support function without repeated calls of the support estimating step and the cluster labeling step. Hence, the image-pixel clustering process of finding a suitable parameter for a specific number of clusters becomes far less computationally intensive.
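For reference, the adjusted Rand index reported in Table 1 can be computed from the contingency table of the two partitions, following Hubert and Arabie [24]. A minimal sketch (not from the paper), assuming two integer label arrays of equal length:

```python
import numpy as np

def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two flat partitions [24]."""
    t = np.asarray(labels_true)
    p = np.asarray(labels_pred)
    # contingency table: n[i, j] = number of points in class i and cluster j
    n = np.array([[np.sum((t == c) & (p == k)) for k in np.unique(p)]
                  for c in np.unique(t)], dtype=float)
    comb2 = lambda x: x * (x - 1) / 2.0            # "choose 2", elementwise
    sum_ij = comb2(n).sum()                        # pair agreements within cells
    sum_i = comb2(n.sum(axis=1)).sum()             # class marginals
    sum_j = comb2(n.sum(axis=0)).sum()             # cluster marginals
    expected = sum_i * sum_j / comb2(n.sum())      # chance-level value
    max_index = 0.5 * (sum_i + sum_j)
    return (sum_ij - expected) / (max_index - expected)
```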


5 CONCLUSIONS

In this paper, we have proposed a dynamic dissimilarity measure for support-based clustering. Through simulations, clustering based on the derived dynamic dissimilarity measure is shown to be less sensitive to the choice of kernel parameters and able to control the number of clusters efficiently. It also works successfully on various challenging clustering problems. The proposed measure can be derived, with minor modifications, from any support or density estimating function. Applications of the proposed measure to other practical problems remain to be investigated.



ACKNOWLEDGMENTS

This work was supported partially by the Korea Research Foundation under Grant number KRF-2008-314-D00483 and partially by KOSEF under Grant number R01-2007-00020792-0. The work of the first author (Daewon Lee) was partially supported by the Korea Research Foundation under Grant number KRF-2008-357-D00231.

REFERENCES

[1] A. Ben-Hur, D. Horn, H.T. Siegelmann, and V. Vapnik, "Support Vector Clustering," J. Machine Learning Research, vol. 2, pp. 125-137, 2001.
[2] F. Camastra and A. Verri, "A Novel Kernel Method for Clustering," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 27, no. 5, pp. 801-805, May 2005.
[3] T. Ban and S. Abe, "Spatially Chunking Support Vector Clustering Algorithm," Proc. Int'l Joint Conf. Neural Networks, pp. 414-418, 2004.
[4] J. Lee and D. Lee, "An Improved Cluster Labeling Method for Support Vector Clustering," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 27, no. 3, pp. 461-464, Mar. 2005.
[5] M. Girolami, "Mercer Kernel-Based Clustering in Feature Space," IEEE Trans. Neural Networks, vol. 13, no. 3, pp. 780-784, May 2002.
[6] S. Chen and D. Zhang, "Robust Image Segmentation Using FCM with Spatial Constraints Based on New Kernel-Induced Distance Metric," IEEE Trans. Systems, Man, and Cybernetics—Part B, vol. 34, no. 4, pp. 1907-1916, Aug. 2004.
[7] D. Zhang and S. Chen, "A Novel Kernelised Fuzzy C-Means Algorithm with Application in Medical Image Segmentation," Artificial Intelligence in Medicine, vol. 32, no. 1, pp. 37-50, 2004.
[8] D.M.J. Tax and R.P.W. Duin, "Support Vector Domain Description," Pattern Recognition Letters, vol. 20, pp. 1191-1199, 1999.
[9] B. Schölkopf, J. Platt, J. Shawe-Taylor, A. Smola, and R. Williamson, "Estimating the Support of a High-Dimensional Distribution," Neural Computation, vol. 13, no. 7, pp. 1443-1471, 2001.
[10] J. Yang, V. Estivill-Castro, and S.K. Chalup, "Support Vector Clustering through Proximity Graph Modelling," Proc. Ninth Int'l Conf. Neural Information Processing (ICONIP '02), pp. 898-903, 2002.
[11] J. Park, X. Ji, H. Zha, and R. Kasturi, "Support Vector Clustering Combined with Spectral Graph Partitioning," Proc. 17th Int'l Conf. Pattern Recognition (ICPR '04), pp. 581-584, 2004.
[12] W.J. Puma-Villanueva, G.B. Bezerra, C.A.M. Lima, and F.J.V. Zuben, "Improving Support Vector Clustering with Ensembles," Proc. Int'l Joint Conf. Neural Networks, 2005.
[13] M.S. Hansen, K. Sjöstrand, H. Ólafsdóttir, H.B. Larsson, M.B. Stegmann, and R. Larsen, "Robust Pseudohierarchical Support Vector Clustering," Proc. Scandinavian Conf. Image Analysis (SCIA '07), pp. 808-817, 2007.
[14] J. Lee and D. Lee, "Dynamic Characterization of Cluster Structures for Robust and Inductive Support Vector Clustering," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 28, no. 11, pp. 1869-1874, Nov. 2006.
[15] H.-C. Kim and J. Lee, "Clustering Based on Gaussian Processes," Neural Computation, vol. 19, no. 11, pp. 3088-3107, 2007.
[16] D. Lee and J. Lee, "Equilibrium-Based Support Vector Machine for Semisupervised Classification," IEEE Trans. Neural Networks, vol. 18, no. 2, pp. 578-583, Mar. 2007.
[17] D. Lee and J. Lee, "Domain Described Support Vector Classifier for Multi-Classification Problems," Pattern Recognition, vol. 40, pp. 41-51, 2007.
[18] J. Guckenheimer and P. Holmes, Nonlinear Oscillations, Dynamical Systems, and Bifurcations of Vector Fields. Springer, 1986.
[19] H.K. Khalil, Nonlinear Systems. Macmillan, 1992.
[20] J. Lee, "An Optimization-Driven Framework for the Computation of the Controlling UEP in Transient Stability Analysis," IEEE Trans. Automatic Control, vol. 49, no. 1, pp. 115-119, Jan. 2004.
[21] J. Lee, "A Novel Three-Phase Trajectory Informed Search Methodology for Global Optimization," J. Global Optimization, vol. 38, no. 1, pp. 61-77, 2007.
[22] J.B. Kruskal, "On the Shortest Spanning Subtree of a Graph and the Traveling Salesman Problem," Proc. Am. Math. Soc., vol. 7, no. 1, pp. 48-50, 1956.
[23] A.Y. Ng, M.I. Jordan, and Y. Weiss, "On Spectral Clustering: Analysis and an Algorithm," Advances in Neural Information Processing Systems, pp. 849-856, MIT Press, 2001.
[24] L. Hubert and P. Arabie, "Comparing Partitions," J. Classification, vol. 2, pp. 193-218, 1985.

