A Novel Three-Phase Algorithm for RBF Neural Network Center Selection

Dae-Won Lee and Jaewook Lee
Department of Industrial Engineering, Pohang University of Science and Technology, Pohang, Kyungbuk 790-784, Korea
{woosuhan,jaewookl}@postech.ac.kr

Abstract. In this paper, we propose a new method for selecting RBF centers. The strength of our method is that it determines the number and the locations of the RBF centers automatically, without any a priori assumption about the number of centers. The proposed method consists of three phases. The first phase partitions the input patterns into several subsets according to their output labels. In the second and third phases, the number and the locations of the RBF centers are determined using a bisection algorithm and weighted mean centering, respectively. The second and third phases are repeated iteratively until the goal error is reached. The proposed method is applied to several benchmark data sets. The numerical results show that our method is robust and efficient for determining the number and the locations of centers.

1 Introduction

The radial basis function network (RBFN), due to the simplicity of its single-hidden-layer structure and its universal approximation property, has been widely used for nonlinear function approximation and pattern classification. Training an RBFN consists of selecting the centers of the hidden neurons and estimating the weights that connect the hidden and the output layers. Once the centers have been fixed, the network weights can be estimated directly using the least-squares algorithm. The generalized radial basis function network (GRBFN) involves searching for a suboptimal solution in a lower-dimensional space that approximates the interpolation solution, where the approximate solution F*(x) can be expressed as

F^*(x) = \sum_{i=1}^{m} w_i \, \phi(\|x - t_i\|)    (1)

where the set of RBF centers {t_i | i = 1, ..., m} is to be determined. One of the key issues in the design of an RBFN is how to determine the number and the locations of the RBF centers. In recent research, a variety of ways of determining the locations of centers have been proposed. Previously reported approaches to RBF center selection include random selection from the input patterns [1] and selection of centers based on clustering algorithms [2], [3]. These methods have some drawbacks: they cannot determine the number and the locations of centers at the same time, and they do not use the information in the supervised data. In this paper, to overcome such drawbacks, the proposed method consists of three basic ingredients. Firstly, the input patterns are partitioned into several subsets according to their output labels, and the RBF centers are determined separately for each subset in order to reflect the information of the output class label. These centers are then combined to form a larger superset to be used in the final RBFN. Secondly, the proposed algorithm determines the number of centers using a bisection algorithm. Finally, the proposed algorithm determines the optimal locations of the centers by employing weighted mean centering. The organization of this paper is as follows: in Section 2 the new method for RBF center selection is explained, Section 3 presents an algorithm for the proposed method, and Section 4 presents experimental results on benchmark problems, followed by conclusions in Section 5.
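Since Eq. (1) is linear in the weights once the centers are fixed, the weight estimation reduces to an ordinary least-squares problem. The following is a minimal sketch in Python/NumPy, assuming a Gaussian basis function of width sigma (a common choice; the paper does not fix phi at this point):

```python
import numpy as np

def design_matrix(X, centers, sigma=1.0):
    # Phi[i, j] = phi(||x_i - t_j||) for a Gaussian basis function.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def fit_weights(X, d, centers, sigma=1.0):
    # With centers fixed, minimize ||Phi w - d||^2 by least squares.
    Phi = design_matrix(X, centers, sigma)
    w, *_ = np.linalg.lstsq(Phi, d, rcond=None)
    return w

def predict(X, centers, w, sigma=1.0):
    # F*(x) = sum_i w_i * phi(||x - t_i||), Eq. (1), for each row of X.
    return design_matrix(X, centers, sigma) @ w
```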

2 The Proposed Method

2.1 Phase I: Partitioning Supervised Data

The proposed method is basically similar to center selection based on clustering, which is the most widely used approach. However, this kind of approach treats the input patterns as unsupervised data during the center selection step even though supervised data are available. When the output class label is not considered, some centers are often found to be located on the boundary between several classes, in which case these centers fail to serve as good feature extractors in the GRBFN. In Phase I, we partition the training data into C disjoint subsets {D_i}_{i=1}^{C} according to their classes, to reflect the distribution of the input patterns for each output class. Then, for each subset, cluster centers are found using the clustering algorithm explained in Sections 2.2 and 2.3. Finally, we use the C disjoint sets of cluster centers as the GRBFN centers.
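A minimal sketch of Phase I, assuming the patterns are rows of an array X with integer class labels y (the names are illustrative):

```python
import numpy as np

def partition_by_class(X, y):
    # Phase I: split the supervised data into C disjoint subsets D_i,
    # one per output label, so that centers can be selected per class.
    return {label: X[y == label] for label in np.unique(y)}

# Example usage: subsets[c] holds the input patterns of class c.
# subsets = partition_by_class(X, y)
```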

2.2 Phase II: Determining the Number of Centers by Bisection Algorithm

Phase II estimates the proper number of centers automatically. The idea originates from Cover's theorem on the separability of patterns [4]: the more hidden neurons a GRBFN has, the more likely the input patterns are to be linearly separable. This can be interpreted as saying that the training error rate, viewed as a function of the number of centers, is an approximately monotonically decreasing curve. Based on this observation, we employ the bisection algorithm shown in Fig. 1 to find the number of centers. It approximately estimates the minimum number of centers that reaches the goal error rate, halving the search space (1 to N) at each step. One advantage of this algorithm is that it finds the number of centers in comparatively little time, because it has only O(log_2 N) complexity.


Fig. 1. Flow diagram for center selection (Phases II and III), where Error(λ) is the misclassification rate with λ centers
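The following sketch illustrates the bisection search of Fig. 1. Here the callback `error(lam)` stands for training a GRBFN with lam centers and returning its misclassification rate; the callback and the stopping convention are our assumptions, not the authors' exact control flow:

```python
def bisect_num_centers(error, n_max, goal):
    # Find (approximately) the smallest lam in 1..n_max with
    # error(lam) <= goal, assuming error is roughly non-increasing in lam.
    lo, hi = 1, n_max
    best = n_max
    while lo <= hi:                 # O(log2 N) iterations
        lam = (lo + hi) // 2        # halve the search space each step
        if error(lam) <= goal:
            best = lam              # feasible: try fewer centers
            hi = lam - 1
        else:
            lo = lam + 1            # infeasible: need more centers
    return best
```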

2.3 Phase III: Determining the Locations of Centers Using Weighted Mean Centering

Determining the locations of the RBF centers is based on optimizing the following objective function with respect to (t, w):

E = \frac{1}{2} \sum_{i=1}^{N} \Big( d_i - \sum_{j=1}^{k_i} w_j \, \phi(\|x_i - t_j\|) \Big)^2    (2)

From the necessary optimality condition, the positions of the centers are given by

\frac{\partial E}{\partial t_j} = \frac{w_j}{\sigma^2} \sum_{i=1}^{N} (d - \Phi w)_i \, \Phi_{ij} \, (x_i - t_j) = 0    (3)

t_j = \frac{\sum_{i=1}^{N} (d - \Phi w)_i \, \Phi_{ij} \, x_i}{\sum_{i=1}^{N} (d - \Phi w)_i \, \Phi_{ij}} = \sum_{i=1}^{N} \lambda_j(x_i, t, w) \, x_i, \quad \forall j = 1, \ldots, k_i    (4)

where λ_j(x_i, t, w) = (d − Φw)_i Φ_{ij} / Σ_{i'=1}^{N} (d − Φw)_{i'} Φ_{i'j}, so that Σ_{i=1}^{N} λ_j(x_i, t, w) = 1 for all j = 1, ..., k_i; Φ = [φ(x_i, t_j)]_{i=1,...,N, j=1,...,k_i}; d = [d_1, d_2, ..., d_N]^T; and w = [w_1, w_2, ..., w_{k_i}]^T. This shows that the estimate for t_j is merely a weighted average of the samples. However, due to the high nonlinearity of E with respect to the centers t_j, it is very difficult to compute


t_j directly. Instead, we employ a so-called weighted mean centering scheme, which consists of two steps. In the first step, the center position t_j of Eq. (4) is approximated as a simple average of the x_i that have a higher absolute value of the numerator of λ_j(x_i, t, w). This process is repeated until no change is made. (See Section 3 for more details.) In the case of a non-convex data set, the centers obtained in the first step are often found to lie in the regions of other classes, even though the data sets have been partitioned according to their classes. To avoid this problem, in the second step, we move each RBF center to the nearest point within the partitioned subset D_i.
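A batch-style sketch of the two-step scheme under the notation above. The fraction of samples treated as having a "higher absolute value" is an illustrative parameter (`top_frac`) of our own; Section 3 realizes the first step instead through an online competitive update:

```python
import numpy as np

def weighted_mean_centering(X, d, Phi, w, centers, top_frac=0.2):
    r = d - Phi @ w                          # residual (d - Phi w)
    k = max(1, int(top_frac * len(X)))
    new_centers = np.empty_like(centers)
    for j in range(centers.shape[0]):
        score = np.abs(r * Phi[:, j])        # |numerator of lambda_j(x_i, t, w)|
        top = np.argsort(score)[-k:]         # samples that dominate Eq. (4)
        mean = X[top].mean(axis=0)           # step 1: simple average
        # Step 2: snap the center to the nearest point of the subset D_i,
        # so it cannot end up inside another class's region.
        nearest = np.argmin(((X - mean) ** 2).sum(axis=1))
        new_centers[j] = X[nearest]
    return new_centers
```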

3 Algorithm

The center selection algorithm for the proposed method is as follows.

% Phase I: Partitioning supervised data
1. Separate the N training data into C disjoint subsets D_i containing n_i elements each, according to class:
   {(x_i, d_i)}_{i=1}^{N} → D_1 ∪ D_2 ∪ ... ∪ D_C,
   D_i = {(x_1, i), (x_2, i), ..., (x_{n_i}, i)} for i = 1, ..., C.
% Selecting k_i centers for each subset D_i
2. for i = 1 to C do
   % Phase II: bisection algorithm
   2.1. Determine the number of centers k_i using the bisection algorithm explained in Fig. 1; initially, k_i = (1 + n_i)/2.
   % Phase III: weighted mean centering
   2.2. Determine the locations of the k_i centers using weighted mean centering.
        2.2.1. Choose random values from the input space for the initial centers T_i = {t_j}_{j=1}^{k_i}, where t_j is the jth cluster center of subset D_i.
        % Adjust the centers
        2.2.2. for l = 1 to n_i do
               2.2.2.1. Let I_i(x_l) denote the index of the best-matching center for the input vector x_l ∈ D_i:
                        I_i(x_l) = arg max_j (d − Φw)_l Φ_{lj},  j = 1, 2, ..., k_i.
               2.2.2.2. Adjust the centers using the update rule
                        t_j ← t_j + η (x_l − t_j),  j = I_i(x_l),    (5)
                        where η is a learning step size.
        2.2.3. Continue the center-adjustment procedure (step 2.2.2) until no changes are observed in the centers {t_j}_{j=1}^{k_i}.
        2.2.4. Move each center to the nearest point of the partitioned subset D_i to determine the final locations of the centers:
               t_j = the nearest point x ∈ D_i, for j = 1, 2, ..., k_i.
   2.3. Repeat steps 2.1 and 2.2 until the goal error is reached.
% Complete the RBFN training
3. Combine the centers of the C disjoint subsets {D_i}_{i=1}^{C} and construct the generalized RBF network using the pseudo-inverse:
   T = {t_j}_{j=1}^{K} ← T_1 ∪ T_2 ∪ ... ∪ T_C,
   w = (Φ^T Φ + λΦ_0)^{−1} Φ^T d,
   where Φ = [φ(x_i, t_j)]_{i=1,...,N, j=1,...,K}, Φ_0 = [φ(x_i, t_j)]_{i,j=1,...,K}, and K = Σ_{i=1}^{C} k_i.
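A minimal sketch of the center-adjustment loop (steps 2.2.2 and 2.2.3), assuming a Gaussian basis of width sigma and weights w held fixed during the sweeps; both assumptions are ours, since the paper leaves them to the Phase II/III context:

```python
import numpy as np

def adjust_centers(X, d, centers, w, sigma=1.0, eta=0.05,
                   max_sweeps=100, tol=1e-6):
    for _ in range(max_sweeps):
        # Phi[l, j] = phi(||x_l - t_j||) with a Gaussian basis (assumed).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        Phi = np.exp(-d2 / (2.0 * sigma ** 2))
        r = d - Phi @ w                               # residual (d - Phi w)
        old = centers.copy()
        for l in range(len(X)):
            j = int(np.argmax(r[l] * Phi[l]))         # I_i(x_l), step 2.2.2.1
            centers[j] += eta * (X[l] - centers[j])   # update rule (5)
        if np.abs(centers - old).max() < tol:         # step 2.2.3: no change
            break
    return centers
```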

Table 1. Benchmark data description

Data set    Input dimension   Number of classes   Number of patterns
2-Spirals   2                 2                   388 [194, 194]
Sonar       60                2                   104 [55, 49]
Heart       13                2                   180 [98, 82]
Vowel       10                11                  528 [48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48]

Bracketed numbers are the number of patterns for each class.
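Step 3 of the algorithm admits a direct closed-form solve; a sketch, where lam stands for the regularization parameter λ in the formula above:

```python
import numpy as np

def solve_output_weights(Phi, Phi0, d, lam=1e-3):
    # Step 3: w = (Phi^T Phi + lam * Phi_0)^(-1) Phi^T d,
    # the regularized pseudo-inverse over the combined center set.
    return np.linalg.solve(Phi.T @ Phi + lam * Phi0, Phi.T @ d)
```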

4 Simulation Results

The algorithm described in the previous section has been simulated on four benchmark data sets (2-spirals, sonar, heart, vowel). A description of the benchmark data sets is given in Table 1. The performance of the proposed method is compared with two widely used center selection methods: in Table 2, KM is k-means-based center selection without partitioning, and RS is random selection from the training data without partitioning. For these two methods, the number of centers is determined by increasing the centers one by one until the goal error is achieved. For the comparison we adopted three criteria: the number of centers, the misclassification rate, and the computing time. The simulation results are shown in Table 2. The results show that the proposed method achieves better accuracy with slightly fewer RBFN centers while significantly reducing the computing time.

Table 2. Simulation results on four benchmark problems

             KM                     RS                     Proposed
Data set     m    E      T          m    E      T          m                                    E      T
2-Spiral     84   0.069  5650       88   0.080  5925       80 [35, 45]                          0.064  297
Sonar        37   0.096  1570       37   0.096  1112       34 [17, 17]                          0.096  21
Heart        129  0.97   11876      131  0.99   9723       129 [68, 61]                         0.093  125
Vowel        66   0.099  3658       64   0.099  2465       59 [2, 5, 5, 2, 8, 7, 3, 2, 9, 2, 14]  0.099  188

m is the number of centers, and bracketed numbers are the number of centers for each class. E is the misclassification rate and T is the computing time to construct the GRBFN.

5 Concluding Remarks

In this study, we have presented a novel three-phase algorithm for RBF center selection. The proposed method has several advantages. Firstly, it determines the number and the locations of the centers automatically, without any assumption about the number of centers. Secondly, it selects good feature extractors by using the information of the output class label. Finally, it is robust to data sets with non-convex distributions. Experimental results show that the proposed method is competitive with previously reported approaches to RBF center selection. Other methods for improving the efficiency of RBF center selection, such as homotopy methods [5], [6], can also be investigated.

Acknowledgement. This work was supported by the Korea Research Foundation under grant number KRF-2003-041-D00608.

References

1. Mao, K.Z.: RBF Neural Network Center Selection Based on Fisher Ratio Class Separability Measure. IEEE Trans. Neural Networks, Vol. 13(5) (2002) 1211-1217
2. Gomm, J.B., Yu, D.L.: Selecting Radial Basis Function Network Centers with Recursive Orthogonal Least Squares Training. IEEE Trans. Neural Networks, Vol. 11(2) (2000) 306-314
3. Haykin, S.: Neural Networks: A Comprehensive Foundation. Prentice Hall, New York (1999)
4. Cover, T.M.: Geometrical and Statistical Properties of Systems of Linear Inequalities with Applications in Pattern Recognition. IEEE Trans. Electronic Computers, Vol. EC-14 (1965) 326-334
5. Lee, J., Chiang, H.-D.: Constructive Homotopy Methods for Finding All or Multiple DC Operating Points of Nonlinear Circuits and Systems. IEEE Trans. Circuits and Systems I, Vol. 48(1) (2001) 35-50
6. Lee, J., Chiang, H.-D.: A Singular Fixed-Point Homotopy Method to Locate the Closest Unstable Equilibrium Point for Transient Stability Region Estimate. IEEE Trans. Circuits and Systems II, Vol. 51(4) (2004) 185-189
