A Novel Three-Phase Algorithm for RBF Neural Network Center Selection

Dae-Won Lee and Jaewook Lee
Department of Industrial Engineering, Pohang University of Science and Technology, Pohang, Kyungbuk 790-784, Korea
{woosuhan,jaewookl}@postech.ac.kr

Abstract. In this paper, we propose a new method for selecting RBF centers. The strength of our method is that it determines the number and the locations of the RBF centers automatically, without any a priori assumption about the number of centers. The proposed method consists of three phases. The first phase partitions the input patterns into several subsets according to their output labels. In the second and third phases, the number and the locations of the RBF centers are determined using a bisection algorithm and weighted mean centering, respectively. The second and third phases are repeated iteratively until the goal error is reached. The proposed method is applied to several benchmark data sets. The numerical results show that our method is robust and efficient for determining the number and the locations of centers.

1 Introduction

The radial basis function network (RBFN), due to the simplicity of its single-hidden-layer structure and its universal approximation property, has been widely used for nonlinear function approximation and pattern classification. Training an RBFN consists of selecting the centers of the hidden neurons and estimating the weights that connect the hidden and the output layers. Once the centers have been fixed, the network weights can be estimated directly using the least-squares algorithm. The generalized radial basis function network (GRBFN) involves searching for a suboptimal solution in a lower-dimensional space that approximates the interpolation solution, where the approximate solution F*(x) can be expressed as

F^*(x) = \sum_{i=1}^{m} w_i \, \phi(\|x - t_i\|)    (1)

where the set of RBF centers {t_i | i = 1, ..., m} is to be determined. One of the key issues in the design of an RBFN is how to determine the number and the locations of the RBF centers. In recent research, a variety of ways of determining the locations of centers have been proposed. Previously reported approaches to RBF center selection include random selection from the input patterns [1] and selection of centers based on clustering algorithms [2], [3]. These methods have some drawbacks: they cannot determine the number and the locations of centers at the same time, and they do not use the information in the supervised data. In this paper, to overcome such drawbacks, the proposed method consists of three basic ingredients. Firstly, the input patterns are partitioned into several subsets according to their output labels, and the RBF centers are determined separately for each subset in order to reflect the information of the output class label. These centers are then combined to form a larger superset to be used in the final RBFN. Secondly, the proposed algorithm determines the number of centers using a bisection algorithm. Finally, the proposed algorithm determines the optimal locations of the centers by employing weighted mean centering. The organization of this paper is as follows: in Section 2 the new method for RBF center selection is explained, Section 3 presents an algorithm for the proposed method, and Section 4 presents experimental results on benchmark problems, followed by conclusions in Section 5.
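Since Eq. (1) is linear in the weights once the centers are fixed, the weight estimation reduces to an ordinary least-squares problem. The following is a minimal sketch in Python/NumPy, assuming a Gaussian basis function of width sigma (a common choice; the paper does not fix phi at this point):

```python
import numpy as np

def design_matrix(X, centers, sigma=1.0):
    # Phi[i, j] = phi(||x_i - t_j||) for a Gaussian basis function.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def fit_weights(X, d, centers, sigma=1.0):
    # With centers fixed, minimize ||Phi w - d||^2 by least squares.
    Phi = design_matrix(X, centers, sigma)
    w, *_ = np.linalg.lstsq(Phi, d, rcond=None)
    return w

def predict(X, centers, w, sigma=1.0):
    # F*(x) = sum_i w_i * phi(||x - t_i||), Eq. (1), for each row of X.
    return design_matrix(X, centers, sigma) @ w
```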

2 The Proposed Method

2.1 Phase I: Partitioning Supervised Data

The proposed method is basically similar to center selection based on clustering, which is the most widely used approach. However, this kind of approach treats the input patterns as unsupervised data during the center selection step even though supervised data are available. When the output class label is not considered, some centers are often found to be located on the boundary between several classes, in which case these centers fail to serve as good feature extractors in the GRBFN. In Phase I, we partition the training data into C disjoint subsets {D_i}_{i=1}^{C} according to their classes, to reflect the distribution of the input patterns for each output class. Then, for each subset, cluster centers are found using the clustering algorithm explained in Sections 2.2 and 2.3. Finally, we use the C disjoint sets of cluster centers as the GRBFN centers.
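A minimal sketch of Phase I, assuming the patterns are rows of an array X with integer class labels y (the names are illustrative):

```python
import numpy as np

def partition_by_class(X, y):
    # Phase I: split the supervised data into C disjoint subsets D_i,
    # one per output label, so that centers can be selected per class.
    return {label: X[y == label] for label in np.unique(y)}

# Example usage: subsets[c] holds the input patterns of class c.
# subsets = partition_by_class(X, y)
```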

2.2 Phase II: Determining the Number of Centers by Bisection Algorithm

Phase II estimates the proper number of centers automatically. The idea originates from Cover's theorem on the separability of patterns [4]: the more hidden neurons a GRBFN has, the more likely the input patterns are to be linearly separable. This can be interpreted as saying that the training error rate, viewed as a function of the number of centers, is an approximately monotonically decreasing curve. Based on this observation, we employ the bisection algorithm shown in Fig. 1 to find the number of centers. It approximately estimates the minimum number of centers that reaches the goal error rate, halving the search space (1 to N) at each step. One advantage of this algorithm is that it finds the number of centers in comparatively little time, because it has only O(log_2 N) complexity.


Fig. 1. Flow diagram for center selection (Phases II and III), where Error(λ) is the misclassification rate with λ centers
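The following sketch illustrates the bisection search of Fig. 1. Here the callback `error(lam)` stands for training a GRBFN with lam centers and returning its misclassification rate; the callback and the stopping convention are our assumptions, not the authors' exact control flow:

```python
def bisect_num_centers(error, n_max, goal):
    # Find (approximately) the smallest lam in 1..n_max with
    # error(lam) <= goal, assuming error is roughly non-increasing in lam.
    lo, hi = 1, n_max
    best = n_max
    while lo <= hi:                 # O(log2 N) iterations
        lam = (lo + hi) // 2        # halve the search space each step
        if error(lam) <= goal:
            best = lam              # feasible: try fewer centers
            hi = lam - 1
        else:
            lo = lam + 1            # infeasible: need more centers
    return best
```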

2.3 Phase III: Determining the Locations of Centers Using Weighted Mean Centering

Determining the locations of the RBF centers is based on optimizing the following objective function with respect to (t, w):

E = \frac{1}{2} \sum_{i=1}^{N} \Big( d_i - \sum_{j=1}^{k_i} w_j \, \phi(\|x_i - t_j\|) \Big)^2    (2)

From the necessary optimality condition, the positions of the centers are given by

\frac{\partial E}{\partial t_j} = \frac{w_j}{\sigma^2} \sum_{i=1}^{N} (d - \Phi w)_i \, \Phi_{ij} \, (x_i - t_j) = 0    (3)

t_j = \frac{\sum_{i=1}^{N} (d - \Phi w)_i \, \Phi_{ij} \, x_i}{\sum_{i=1}^{N} (d - \Phi w)_i \, \Phi_{ij}} = \sum_{i=1}^{N} \lambda_j(x_i, t, w) \, x_i, \quad \forall j = 1, \ldots, k_i    (4)

where λ_j(x_i, t, w) = (d − Φw)_i Φ_{ij} / Σ_{i'=1}^{N} (d − Φw)_{i'} Φ_{i'j}, so that Σ_{i=1}^{N} λ_j(x_i, t, w) = 1 for all j = 1, ..., k_i; Φ = [φ(x_i, t_j)]_{i=1,...,N, j=1,...,k_i}; d = [d_1, d_2, ..., d_N]^T; and w = [w_1, w_2, ..., w_{k_i}]^T. This shows that the estimate for t_j is merely a weighted average of the samples. However, due to the high nonlinearity of E with respect to the centers t_j, it is very difficult to compute


t_j directly. Instead, we employ a so-called weighted mean centering scheme, which consists of two steps. In the first step, the center position t_j of Eq. (4) is approximated as a simple average of the x_i that have a higher absolute value of the numerator of λ_j(x_i, t, w). This process is repeated until no change is made. (See Section 3 for more details.) In the case of a non-convex data set, the centers obtained in the first step are often found to lie in the regions of other classes, even though the data sets have been partitioned according to their classes. To avoid this problem, in the second step, we move each RBF center to the nearest point within the partitioned subset D_i.
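A batch-style sketch of the two-step scheme under the notation above. The fraction of samples treated as having a "higher absolute value" is an illustrative parameter (`top_frac`) of our own; Section 3 realizes the first step instead through an online competitive update:

```python
import numpy as np

def weighted_mean_centering(X, d, Phi, w, centers, top_frac=0.2):
    r = d - Phi @ w                          # residual (d - Phi w)
    k = max(1, int(top_frac * len(X)))
    new_centers = np.empty_like(centers)
    for j in range(centers.shape[0]):
        score = np.abs(r * Phi[:, j])        # |numerator of lambda_j(x_i, t, w)|
        top = np.argsort(score)[-k:]         # samples that dominate Eq. (4)
        mean = X[top].mean(axis=0)           # step 1: simple average
        # Step 2: snap the center to the nearest point of the subset D_i,
        # so it cannot end up inside another class's region.
        nearest = np.argmin(((X - mean) ** 2).sum(axis=1))
        new_centers[j] = X[nearest]
    return new_centers
```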

3 Algorithm

The center selection algorithm for the proposed method is as follows.

% Phase I: Partitioning supervised data
1. Separate the N training data into C disjoint subsets D_i containing n_i elements each, according to class:
   {(x_i, d_i)}_{i=1}^{N} → D_1 ∪ D_2 ∪ ... ∪ D_C,
   D_i = {(x_1, i), (x_2, i), ..., (x_{n_i}, i)} for i = 1, ..., C.
% Selecting k_i centers for each subset D_i
2. for i = 1 to C do
   % Phase II: bisection algorithm
   2.1. Determine the number of centers k_i using the bisection algorithm explained in Fig. 1; initially, k_i = (1 + n_i)/2.
   % Phase III: weighted mean centering
   2.2. Determine the locations of the k_i centers using weighted mean centering.
        2.2.1. Choose random values from the input space for the initial centers T_i = {t_j}_{j=1}^{k_i}, where t_j is the jth cluster center of subset D_i.
        % Adjust the centers
        2.2.2. for l = 1 to n_i do
               2.2.2.1. Let I_i(x_l) denote the index of the best-matching center for the input vector x_l ∈ D_i:
                        I_i(x_l) = arg max_j (d − Φw)_l Φ_{lj},  j = 1, 2, ..., k_i.
               2.2.2.2. Adjust the centers using the update rule
                        t_j ← t_j + η (x_l − t_j),  j = I_i(x_l),    (5)
                        where η is a learning step size.
        2.2.3. Continue the center-adjustment procedure (step 2.2.2) until no changes are observed in the centers {t_j}_{j=1}^{k_i}.
        2.2.4. Move each center to the nearest point of the partitioned subset D_i to determine the final locations of the centers:
               t_j = the nearest point x ∈ D_i, for j = 1, 2, ..., k_i.
   2.3. Repeat steps 2.1 and 2.2 until the goal error is reached.
% Complete the RBFN training
3. Combine the centers of the C disjoint subsets {D_i}_{i=1}^{C} and construct the generalized RBF network using the pseudo-inverse:
   T = {t_j}_{j=1}^{K} ← T_1 ∪ T_2 ∪ ... ∪ T_C,
   w = (Φ^T Φ + λΦ_0)^{−1} Φ^T d,
   where Φ = [φ(x_i, t_j)]_{i=1,...,N, j=1,...,K}, Φ_0 = [φ(x_i, t_j)]_{i,j=1,...,K}, and K = Σ_{i=1}^{C} k_i.
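A minimal sketch of the center-adjustment loop (steps 2.2.2 and 2.2.3), assuming a Gaussian basis of width sigma and weights w held fixed during the sweeps; both assumptions are ours, since the paper leaves them to the Phase II/III context:

```python
import numpy as np

def adjust_centers(X, d, centers, w, sigma=1.0, eta=0.05,
                   max_sweeps=100, tol=1e-6):
    for _ in range(max_sweeps):
        # Phi[l, j] = phi(||x_l - t_j||) with a Gaussian basis (assumed).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        Phi = np.exp(-d2 / (2.0 * sigma ** 2))
        r = d - Phi @ w                               # residual (d - Phi w)
        old = centers.copy()
        for l in range(len(X)):
            j = int(np.argmax(r[l] * Phi[l]))         # I_i(x_l), step 2.2.2.1
            centers[j] += eta * (X[l] - centers[j])   # update rule (5)
        if np.abs(centers - old).max() < tol:         # step 2.2.3: no change
            break
    return centers
```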

Table 1. Benchmark data description

Data set    Input dimension   Number of classes   Number of patterns
2-Spirals   2                 2                   388 [194, 194]
Sonar       60                2                   104 [55, 49]
Heart       13                2                   180 [98, 82]
Vowel       10                11                  528 [48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48]

Bracketed numbers are the number of patterns for each class.
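Step 3 of the algorithm admits a direct closed-form solve; a sketch, where lam stands for the regularization parameter λ in the formula above:

```python
import numpy as np

def solve_output_weights(Phi, Phi0, d, lam=1e-3):
    # Step 3: w = (Phi^T Phi + lam * Phi_0)^(-1) Phi^T d,
    # the regularized pseudo-inverse over the combined center set.
    return np.linalg.solve(Phi.T @ Phi + lam * Phi0, Phi.T @ d)
```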

4 Simulation Results

The algorithm described in the previous section has been simulated on four benchmark data sets (2-spirals, sonar, heart, vowel). A description of the benchmark data sets is given in Table 1. The performance of the proposed method is compared with two widely used center selection methods: in Table 2, KM is k-means-based center selection without partitioning, and RS is random selection from the training data without partitioning. For these two methods, the number of centers is determined by increasing the centers one by one until the goal error is achieved. For the comparison we adopted three criteria: the number of centers, the misclassification rate, and the computing time. The simulation results are shown in Table 2. The results show that the proposed method achieves better accuracy with slightly fewer RBFN centers while significantly reducing the computing time.

Table 2. Simulation results on four benchmark problems

             KM                     RS                     Proposed
Data set     m    E      T          m    E      T          m                                    E      T
2-Spiral     84   0.069  5650       88   0.080  5925       80 [35, 45]                          0.064  297
Sonar        37   0.096  1570       37   0.096  1112       34 [17, 17]                          0.096  21
Heart        129  0.97   11876      131  0.99   9723       129 [68, 61]                         0.093  125
Vowel        66   0.099  3658       64   0.099  2465       59 [2, 5, 5, 2, 8, 7, 3, 2, 9, 2, 14]  0.099  188

m is the number of centers, and bracketed numbers are the number of centers for each class. E is the misclassification rate and T is the computing time to construct the GRBFN.

5 Concluding Remarks

In this study, we have presented a novel three-phase algorithm for RBF center selection. The proposed method has several advantages. Firstly, it determines the number and the locations of the centers automatically, without any assumption about the number of centers. Secondly, it selects good feature extractors by using the information of the output class label. Finally, it is robust to data sets with non-convex distributions. Experimental results show that the proposed method is competitive with previously reported approaches to RBF center selection. Other methods for improving the efficiency of RBF center selection, such as homotopy methods [5], [6], can also be investigated.

Acknowledgement. This work was supported by the Korea Research Foundation under grant number KRF-2003-041-D00608.

References

1. Mao, K.Z.: RBF Neural Network Center Selection Based on Fisher Ratio Class Separability Measure. IEEE Trans. Neural Networks, Vol. 13(5) (2002) 1211-1217
2. Gomm, J.B., Yu, D.L.: Selecting Radial Basis Function Network Centers with Recursive Orthogonal Least Squares Training. IEEE Trans. Neural Networks, Vol. 11(2) (2000) 306-314
3. Haykin, S.: Neural Networks: A Comprehensive Foundation. Prentice Hall, New York (1999)
4. Cover, T.M.: Geometrical and Statistical Properties of Systems of Linear Inequalities with Applications in Pattern Recognition. IEEE Trans. Electronic Computers, Vol. EC-14 (1965) 326-334
5. Lee, J., Chiang, H.-D.: Constructive Homotopy Methods for Finding All or Multiple DC Operating Points of Nonlinear Circuits and Systems. IEEE Trans. Circuits and Systems I, Vol. 48(1) (2001) 35-50
6. Lee, J., Chiang, H.-D.: A Singular Fixed-Point Homotopy Method to Locate the Closest Unstable Equilibrium Point for Transient Stability Region Estimate. IEEE Trans. Circuits and Systems II, Vol. 51(4) (2004) 185-189
