LETTER

Communicated by Andries P. Engelbrecht

Computation of Madalines’ Sensitivity to Input and Weight Perturbations

Yingfeng Wang
[email protected]
Department of Computer Science, Hohai University, Nanjing, China

Xiaoqin Zeng
[email protected]
Department of Computer Science, Hohai University, Nanjing, China

Daniel So Yeung
[email protected]
Department of Computing, Hong Kong Polytechnic University, Kowloon, Hong Kong

Zhihang Peng
[email protected]
Department of Mathematics, Hohai University, Nanjing, China

The sensitivity of a neural network’s output to its input and weight perturbations is an important measure for evaluating the network’s performance. In this letter, we propose an approach to quantify the sensitivity of Madalines. The sensitivity is defined as the probability of output deviation due to input and weight perturbations with respect to all input patterns. Based on the structural characteristics of Madalines, a bottom-up strategy is followed, along which the sensitivity of single neurons, that is, Adalines, is considered first and then the sensitivity of the entire Madaline network. By means of probability theory, an analytical formula is derived for the calculation of Adalines’ sensitivity, and an algorithm is designed for the computation of Madalines’ sensitivity. Computer simulations are run to verify the effectiveness of the formula and algorithm. The simulation results are in good agreement with the theoretical results.

Neural Computation 18, 2854–2877 (2006)
© 2006 Massachusetts Institute of Technology

1 Introduction

Generally, an artificial neural network aims at realizing a mapping between its input and output by establishing a set of connection weights. Therefore, the sensitivity of a neural network’s output to input and weight perturbations is a fundamental issue with both theoretical and practical value in neural network research. It is obvious that a properly quantified sensitivity could be a useful measure for evaluating neurons’


relevance and performance, such as fault tolerance and generalization ability. For fault tolerance, there exist a number of theoretical studies of the response of neural networks to weight imprecision and input noise. For example, Stevenson, Winter, and Widrow (1990) studied the sensitivity of Adalines to weight errors with the aim of addressing hardware imprecision, and Choi and Choi (1992) established a sensitivity measure for a specific input with white noise perturbation. Both applied a kind of sensitivity as a measure to solve fault tolerance problems in their studies. Recently we applied the sensitivity of perceptrons to the pruning of multilayer perceptron (MLP) networks and obtained good results (Zeng & Yeung, 2006). In that study, we employed the sensitivity of perceptrons to establish a relative measure for evaluating a neuron’s importance in a given MLP. With the measure, we can easily locate the least important neuron in the MLP and remove it without, or with the least, effect on the MLP’s performance. It is hoped that the architecture pruning of Madalines will likewise benefit from the study of Madalines’ sensitivity. In the literature, a number of studies (Stevenson et al., 1990; Choi & Choi, 1992; Fu & Chen, 1993; Zurada, Malinowski, & Cloete, 1994; Alippi, Piuri, & Sami, 1995; Piché, 1995; Oh & Lee, 1995; Zurada, Malinowski, & Usui, 1997; Cheng & Yeung, 1999; Engelbrecht, 2001a, 2001b; Yeung & Sun, 2002; Zeng & Yeung, 2001, 2003, 2006) on the sensitivity of neural networks have emerged. They vary in their target networks and approaches. This letter focuses on the study of Madalines’ sensitivity and proposes a novel method to compute it. Stevenson et al. (1990) first systematically investigated the sensitivity of Madalines to weight errors. They made use of hyperspheres as a mathematical model to theoretically analyze the sensitivity of Madalines.
The surface of a hypersphere with radius n^{1/2} is used to express the input space of an Adaline with n-dimensional input. Geometrically, the surface of the hypersphere that represents all inputs of an Adaline can be divided by a hyperplane, which is determined by the Adaline’s weights and passes through the origin, into two hemi-hyperspheres that correspond to the bipolar outputs of the Adaline. Based on such a geometrical model, they defined sensitivity as the probability of erroneous output of Madalines and derived approximate expressions as functions of the percentage error in inputs and weights, under the assumption that the input and weight perturbations are small and the number of Adalines per layer is sufficiently large. Later, Alippi et al. (1995) generalized the hypersphere model by considering the sensitivity of Adalines with multiple-step activation functions. Unfortunately, since the discrete inputs of an Adaline generally do not span the whole hypersphere surface, their results may show large deviations when the input dimension of Adalines is not sufficiently large. Piché (1995) employed a statistical rather than a geometrical argument to analyze the effects of weight errors in Madalines. He made the assumption that the inputs and weights as well as their errors are all independently and identically distributed (i.i.d.) with


mean zero. Based on such a stochastic model and under the condition that both input and weight errors are small enough, Piché (1995) derived an analytical expression for the sensitivity as the ratio of the variance of the output error to the variance of the output. His method, however, is suitable only for an ensemble of neural networks rather than an individual one, because the assumptions made by the stochastic model are too strong. In our study, different from the methods noted above, we propose a new method to derive a formula for calculating the sensitivity of Adalines and then design an algorithm to compute the sensitivity of Madalines neuron by neuron and layer by layer. Our method offers certain advantages over the others. For example, it does not demand that the weight perturbation be very small, as required by Stevenson et al. (1990) and Piché (1995); its results are more accurate than those of Stevenson et al. (1990), and it does not require the weight perturbation ratio to be the same for all Adalines in a Madaline, as Stevenson et al. (1990) do; and the probability used in our method is more direct and exact than the variance used in Piché (1995), because assumptions on the means of the inputs, the weights, and their perturbations must be made before the variance can be considered.

The rest of this letter is arranged as follows. The architecture and notation of Madalines as well as Adalines are briefly described in section 2. The definition and the formula derivation for Adalines’ sensitivity are given in section 3, and section 4, in a parallel way, gives the definition and the computation of the sensitivity of Madalines. Simulation results that support the theoretical results are presented in section 5. Section 6 concludes the letter.

2 The Madaline Model

A Madaline is a discrete feedforward multilayered neural network; it consists of a set of Adalines that work together to maintain an input-output mapping.

2.1 Architecture.
An Adaline is the basic building block of a Madaline. With n bipolar inputs and one bipolar output, a single Adaline is capable of performing certain logic functions. Each input element takes on a bipolar value of either +1 or −1 and is associated with an adjustable floating-point weight. The sum of the weighted input elements is computed, producing a linear output, which is then fed to an activation function to yield the output of the Adaline. The commonly used activation function is the symmetrical hard-limit function

f(x) = \begin{cases} 1, & x \ge 0 \\ -1, & x < 0. \end{cases} \qquad (2.1)
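The letter itself contains no code; as a minimal illustration, the Adaline just described can be sketched in Python (the function name `adaline` is ours, not from the paper):

```python
import numpy as np

def adaline(x, w):
    """Forward pass of a single Adaline: weighted sum of bipolar inputs
    followed by the symmetrical hard-limit activation of equation 2.1."""
    x = np.asarray(x, dtype=float)   # bipolar input vector, elements +1 or -1
    w = np.asarray(w, dtype=float)   # adjustable floating-point weights
    linear = float(np.dot(x, w))     # linear output
    return 1 if linear >= 0 else -1  # hard-limit activation

# A 3-input Adaline: 0.5 - 0.2 - 0.1 = 0.2 >= 0, so the output is +1.
print(adaline([1, -1, 1], [0.5, 0.2, -0.1]))  # 1
```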


A Madaline is a layered network of Adalines. Links exist only between Adalines of two adjacent layers; there is no link between Adalines in the same layer or in any two nonadjacent layers. All Adalines in a layer are fully linked to all the Adalines in the immediately preceding layer and to all the Adalines in the immediately succeeding layer. At each layer except the input layer, the inputs of each neuron are the outputs of the Adalines in the previous layer.

2.2 Notation. A Madaline in general can have L layers, and each layer l (1 \le l \le L) has n^l (n^l \ge 1) neurons. The form n^0 - n^1 - \cdots - n^L is used to represent a Madaline with a given structural configuration, in which each n^l (0 \le l \le L) not only stands for a layer, counted from left to right and including the input layer, but also indicates the number of Adalines in that layer. n^0 is an exception: it refers to the dimension of the input vectors. n^L refers to the output layer. Since the number of Adalines in layer l-1 is equal to the output dimension of that layer, which in turn is equal to the input dimension of layer l, the input dimension of layer l is n^{l-1}. For Adaline i (1 \le i \le n^l) in layer l, the input vector is X^l = (x_1^l, \ldots, x_{n^{l-1}}^l)^T, the weight vector is W_i^l = (w_{i1}^l, \ldots, w_{in^{l-1}}^l)^T, and the output is y_i^l = f(X^l W_i^l). For each layer l, all Adalines in that layer have the same input vector X^l. The weight set of the layer is W^l = \{W_1^l, \ldots, W_{n^l}^l\}, and the output vector of the layer is Y^l = (y_1^l, \ldots, y_{n^l}^l)^T. For an entire Madaline, the input vector is X^1 or Y^0, the weight is W = W^1 \cup \cdots \cup W^L, and the output is Y^L. Let \Delta X^l = (\Delta x_1^l, \ldots, \Delta x_{n^{l-1}}^l)^T and \Delta W_i^l = (\Delta w_{i1}^l, \ldots, \Delta w_{in^{l-1}}^l)^T be the perturbations of the input vector X^l and the weight vector W_i^l, and let \bar X^l = (\bar x_1^l, \ldots, \bar x_{n^{l-1}}^l)^T and \bar W_i^l = (\bar w_{i1}^l, \ldots, \bar w_{in^{l-1}}^l)^T be the corresponding perturbed input and weight vectors, respectively.
3 The Sensitivity of Adalines

A perturbation in the inputs of an Adaline may alter its output, and a perturbation in the weights may alter the Adaline’s input-output mapping and thus its output. In this section, the effect of those perturbations on the output of an Adaline is studied. Since the output deviation due to input and weight perturbations at an individual input pattern shows the Adaline’s behavior only at that pattern and is usually not suitable for evaluating the Adaline’s performance, the average output deviation over all input patterns is adopted in this letter as the sensitivity measure, which shows whether the Adaline is sensitive to input and weight perturbations.

Definition 1. The sensitivity of Adaline i in layer l is defined as the probability of deviated output of the Adaline due to its input and weight perturbations with


respect to all input patterns, which is expressed as

s_i^l = \frac{N_{err}}{N_{inp}}, \qquad (3.1)

where N_{inp} is the number of all input patterns and N_{err} is the number of output deviations over all the input patterns.

Since the perturbation of an Adaline’s input element can result only in \bar x_j = x_j or \bar x_j = -x_j, it is obvious that an affected product in \sum_{j=1}^{n} x_j w_j can be expressed as \bar x_j w_j = (-x_j) w_j = x_j (-w_j). This means that an effective perturbation of x_j is equivalent to a change of the sign of w_j. In this way, an input perturbation can easily be converted into a weight perturbation. Without loss of generality, only weight perturbation is considered in the following discussion of the sensitivity computation. For simplicity of expression, the superscript and subscript that mark an Adaline’s layer and its order in the layer are omitted, since the Adaline’s position in the network is of no interest in this section.

According to equation 2.1, whether there is an output deviation due to weight perturbation at input X_q (1 \le q \le N_{inp}) depends entirely on the signs of X_q W and X_q \bar W. An output deviation occurs if and only if X_q W and X_q \bar W have opposite signs (X_q W \ge 0 and X_q \bar W < 0, or X_q W < 0 and X_q \bar W \ge 0). This inspires us to consider the sensitivity as the probability of XW and X\bar W having different signs over all input patterns:

s = P(f(XW) \ne f(X\bar W)) = 1 - P(f(XW) = f(X\bar W))
  = 1 - \left( P((XW \ge 0) \cap (X\bar W \ge 0)) + P((XW < 0) \cap (X\bar W < 0)) \right)
  \approx 1 - \left( \int_0^{+\infty}\int_0^{+\infty} f(x, y)\,dx\,dy + \int_{-\infty}^{0}\int_{-\infty}^{0} f(x, y)\,dx\,dy \right), \qquad (3.2)
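The sign-flip equivalence used above, \bar x_j w_j = x_j(-w_j), can be checked exhaustively for a small Adaline. The following Python sketch (function name ours) verifies that perturbing input element j changes the weighted sum exactly as flipping the sign of weight j does, on every bipolar pattern:

```python
import itertools
import numpy as np

def flip_equivalence_holds(w, j):
    """Check, over all 2^n bipolar inputs, that flipping the sign of input
    element j gives the same linear output as flipping the sign of weight j."""
    w = np.asarray(w, dtype=float)
    n = len(w)
    for x in itertools.product((1.0, -1.0), repeat=n):
        x = np.asarray(x)
        x_flip = x.copy(); x_flip[j] *= -1.0   # perturb input element j
        w_flip = w.copy(); w_flip[j] *= -1.0   # equivalent weight perturbation
        if abs(np.dot(x_flip, w) - np.dot(x, w_flip)) > 1e-12:
            return False
    return True

print(flip_equivalence_holds([1.2, -0.7, 0.4, 2.1], 1))  # True
```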

where f(x, y) is the joint probability density function of XW and X\bar W. In order to derive a computable expression for the sensitivity, we first derive the distributions of XW and X\bar W, then their joint distribution, and finally the probability obtained by using the joint distribution.

Assume that all the n-dimensional inputs are uniformly distributed and that weight elements can be any real number but zero. Let \xi_i = x_i w_i, so that N_{inp} is equal to 2^n and \xi_1, \xi_2, \ldots, \xi_n are independent of one another. Since the probabilities of x_i being 1 or -1 are the same, the expectation is

E(\xi_i) = 0, \qquad (3.3)


and the variance is

D(\xi_i) = \frac{1}{2}(w_i - 0)^2 + \frac{1}{2}(-w_i - 0)^2 = w_i^2. \qquad (3.4)

Equations 3.3 and 3.4 show that \xi_j has finite mathematical expectation and variance. Since \xi_j = w_j or \xi_j = -w_j, the distribution function of \xi_j is

F_j(x) = P\{\xi_j \le x\} = \begin{cases} 1, & x \ge |w_j| \\ \frac{1}{2}, & -|w_j| \le x < |w_j| \\ 0, & x < -|w_j|. \end{cases} \qquad (3.5)

Because F_j(x) is a three-step constant function, the following Lindeberg condition is satisfied:

\forall \tau > 0, \quad \lim_{n \to \infty} \frac{1}{B_n^2} \sum_{j=1}^{n} \int_{|x - u_j| \ge \tau B_n} (x - u_j)^2\, dF_j(x) = 0, \qquad (3.6)

where B_n^2 = \sum_{i=1}^{n} D(\xi_i) and u_j = E(\xi_j). Hence, in terms of Lindeberg’s central limit theorem, it can be proved that \sum_{i=1}^{n} x_i w_i converges in distribution to a random variable with normal distribution N(0, \sum_{i=1}^{n} w_i^2). Similarly, \sum_{i=1}^{n} x_i \bar w_i converges in distribution to a random variable with normal distribution N(0, \sum_{i=1}^{n} \bar w_i^2).

With the aim of deriving the joint probability density function of \sum_{i=1}^{n} x_i w_i and \sum_{i=1}^{n} x_i \bar w_i, it is required to obtain their covariance and correlation coefficient. The covariance can be derived as follows:

\operatorname{Cov}\left( \sum_{i=1}^{n} x_i w_i, \sum_{i=1}^{n} x_i \bar w_i \right)
= E\left[ \left( \sum_{i=1}^{n} x_i w_i - E\sum_{i=1}^{n} x_i w_i \right) \left( \sum_{i=1}^{n} x_i \bar w_i - E\sum_{i=1}^{n} x_i \bar w_i \right) \right]
= E\left[ \left( \sum_{i=1}^{n} x_i w_i \right) \left( \sum_{i=1}^{n} x_i \bar w_i \right) \right]
= E\left[ \sum_{i=1}^{n} \sum_{j=1}^{n} x_i x_j w_i \bar w_j \right]
= \sum_{i=1}^{n} \sum_{j=1}^{n} E(x_i x_j)\, w_i \bar w_j. \qquad (3.7)


If i = j, it is clear that E(x_i x_j) = E(x_i^2) = 1. For i \ne j, because x_i and x_j are independent and each takes 1 or -1 with equal probability, x_i x_j also takes 1 or -1 with equal probability, so E(x_i x_j) = 0. Based on the above analysis, we obtain \sum_{i=1}^{n} \sum_{j=1}^{n} E(x_i x_j)\, w_i \bar w_j = \sum_{i=1}^{n} w_i \bar w_i, that is,

\operatorname{Cov}\left( \sum_{i=1}^{n} x_i w_i, \sum_{i=1}^{n} x_i \bar w_i \right) = \sum_{i=1}^{n} w_i \bar w_i. \qquad (3.8)

From equation 3.8, the correlation coefficient of \sum_{i=1}^{n} x_i w_i and \sum_{i=1}^{n} x_i \bar w_i, denoted as \rho, can be written as

\rho = \frac{\operatorname{Cov}\left( \sum_{i=1}^{n} x_i w_i, \sum_{i=1}^{n} x_i \bar w_i \right)}{\sqrt{D\left( \sum_{i=1}^{n} x_i w_i \right)} \sqrt{D\left( \sum_{i=1}^{n} x_i \bar w_i \right)}} = \frac{\sum_{i=1}^{n} w_i \bar w_i}{\sqrt{\sum_{i=1}^{n} w_i^2} \sqrt{\sum_{i=1}^{n} \bar w_i^2}} = \frac{\sum_{i=1}^{n} w_i \bar w_i}{B_n \bar B_n}, \qquad (3.9)

where B_n = \sqrt{\sum_{i=1}^{n} w_i^2} and \bar B_n = \sqrt{\sum_{i=1}^{n} \bar w_i^2}. The joint distribution function of \sum_{i=1}^{n} x_i w_i and \sum_{i=1}^{n} x_i \bar w_i can therefore be approximately expressed as the following bivariate normal integral:

F(x, y) = P\left( \sum_{i=1}^{n} x_i w_i \le x, \sum_{i=1}^{n} x_i \bar w_i \le y \right)
\approx \int_{-\infty}^{x} \int_{-\infty}^{y} \frac{1}{2\pi B_n \bar B_n \sqrt{1 - \rho^2}} \exp\left( \frac{-1}{2(1 - \rho^2)} \left( \frac{u^2}{B_n^2} - 2\rho \frac{uv}{B_n \bar B_n} + \frac{v^2}{\bar B_n^2} \right) \right) du\, dv, \qquad (3.10)

and the corresponding joint probability density function is

f(x, y) \approx \frac{1}{2\pi B_n \bar B_n \sqrt{1 - \rho^2}} \exp\left( \frac{-1}{2(1 - \rho^2)} \left( \frac{x^2}{B_n^2} - 2\rho \frac{xy}{B_n \bar B_n} + \frac{y^2}{\bar B_n^2} \right) \right). \qquad (3.11)

Now, based on the above discussion, we can calculate the integral in equation 3.2. The following transformation needs to be introduced in advance; its derivation can be found in Gong and Zhao (1996):

\Phi(-\beta_1, -\beta_2, \rho) = \int_{-\infty}^{-\beta_1} \int_{-\infty}^{-\beta_2} \frac{1}{2\pi \sqrt{1 - \rho^2}} \exp\left( -\frac{1}{2} \cdot \frac{x_1^2 + x_2^2 - 2\rho x_1 x_2}{1 - \rho^2} \right) dx_1\, dx_2
= \Phi(-\beta_1)\Phi(-\beta_2) + \frac{1}{2\pi} \int_0^{\rho} \frac{1}{\sqrt{1 - t^2}} \exp\left( -\frac{1}{2} \cdot \frac{\beta_1^2 + \beta_2^2 - 2\beta_1 \beta_2 t}{1 - t^2} \right) dt, \qquad (3.12)

where \Phi(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{1}{2} u^2 \right) du.

By means of the symmetry property \int_0^{+\infty} \int_0^{+\infty} f(x, y)\,dx\,dy = \int_{-\infty}^{0} \int_{-\infty}^{0} f(x, y)\,dx\,dy and equations 3.11 and 3.12, the probability (i.e., the sensitivity) can finally be derived as

s \approx 1 - \left( \int_0^{+\infty} \int_0^{+\infty} f(x, y)\,dx\,dy + \int_{-\infty}^{0} \int_{-\infty}^{0} f(x, y)\,dx\,dy \right)
= 1 - 2 \int_{-\infty}^{0} \int_{-\infty}^{0} f(x, y)\,dx\,dy
= 1 - 2 \int_{-\infty}^{0} \int_{-\infty}^{0} \frac{1}{2\pi B_n \bar B_n \sqrt{1 - \rho^2}} \exp\left( \frac{-1}{2(1 - \rho^2)} \left( \frac{x^2}{B_n^2} - 2\rho \frac{xy}{B_n \bar B_n} + \frac{y^2}{\bar B_n^2} \right) \right) dx\, dy
= 1 - 2 \int_{-\infty}^{0} \int_{-\infty}^{0} \frac{1}{2\pi \sqrt{1 - \rho^2}} \exp\left( \frac{-1}{2(1 - \rho^2)} \left( x^2 - 2\rho xy + y^2 \right) \right) dx\, dy
= 1 - 2\Phi(0, 0, \rho) = 1 - 2\left( \frac{1}{2} \cdot \frac{1}{2} + \frac{1}{2\pi} \int_0^{\rho} \frac{1}{\sqrt{1 - t^2}}\, dt \right)
= 1 - 2\left( \frac{1}{4} + \frac{\arcsin \rho}{2\pi} \right) = \frac{1}{2} - \frac{\arcsin \rho}{\pi}, \qquad (3.13)

where the substitutions x/B_n \to x and y/\bar B_n \to y are applied in the fourth line.

It is worth noting that the computational complexity of our approach, which is O(n), is much less than that of the simulation approach, which is O(n\,2^n).
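Equation 3.13 can be compared against the exhaustive O(n 2^n) simulation directly. The sketch below (Python, our function names) computes both for a random 12-input Adaline; since the formula rests on a central limit argument, the two values agree only approximately at small n:

```python
import itertools
import math
import numpy as np

def adaline_sensitivity(w, w_pert):
    """Analytical sensitivity s = 1/2 - arcsin(rho)/pi (eqs. 3.9, 3.13)."""
    w, wp = np.asarray(w, float), np.asarray(w_pert, float)
    rho = np.dot(w, wp) / (np.linalg.norm(w) * np.linalg.norm(wp))
    return 0.5 - math.asin(float(np.clip(rho, -1.0, 1.0))) / math.pi

def exact_sensitivity(w, w_pert):
    """Fraction of all 2^n bipolar inputs whose output deviates (def. 1)."""
    n = len(w)
    err = sum((np.dot(x, w) >= 0) != (np.dot(x, w_pert) >= 0)
              for x in itertools.product((1.0, -1.0), repeat=n))
    return err / 2 ** n

rng = np.random.default_rng(0)
w = rng.uniform(1, 10, 12) * rng.choice([-1, 1], 12)  # weights away from zero
w_pert = w + 0.3 * rng.standard_normal(12)            # perturbed weights
s_analytical = adaline_sensitivity(w, w_pert)
s_exact = exact_sensitivity(w, w_pert)
print(s_analytical, s_exact)  # close, but not identical, at n = 12
```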


Table 1: Associated Parameters and Experimental Results for the Three Adalines.

Adaline    n    Δw    p (Simulations)   s (Ours)    s′ (Stevenson)
Neuron 1   15   0.1   0.004089          0.003786    0.003896
                0.2   0.007751          0.007551    0.007791
                0.3   0.011047          0.011293    0.011687
                0.4   0.014648          0.015012    0.015582
                0.5   0.017883          0.018707    0.019478
                0.6   0.020813          0.022377    0.023374
                0.7   0.024231          0.026021    0.027269
                0.8   0.027710          0.029639    0.031165
                0.9   0.031067          0.033230    0.035060
                1.0   0.033936          0.036794    0.038956
Neuron 2   20   0.1   0.004318          0.004365    0.004365
                0.2   0.008636          0.008730    0.008730
                0.3   0.012972          0.013092    0.013096
                0.4   0.017553          0.017450    0.017461
                0.5   0.021755          0.021803    0.021826
                0.6   0.026026          0.026148    0.026191
                0.7   0.030306          0.030484    0.030557
                0.8   0.034578          0.034810    0.034922
                0.9   0.038784          0.039124    0.039287
                1.0   0.042980          0.043424    0.043652
Neuron 3   30   0.1   0.005878          0.005909    0.006317
                0.2   0.011672          0.011736    0.012635
                0.3   0.017380          0.017480    0.018952
                0.4   0.022999          0.023137    0.025270
                0.5   0.028528          0.028707    0.031587
                0.6   0.033968          0.034188    0.037905
                0.7   0.039315          0.039580    0.044222
                0.8   0.044576          0.044881    0.050539
                0.9   0.049728          0.050091    0.056857
                1.0   0.054772          0.055209    0.063174

4 The Sensitivity of Madalines

Our final goal is to determine the sensitivity of Madalines to input and weight perturbations. From a global point of view, the sensitivity of a Madaline reflects its output deviation, or more exactly its output layer’s output deviation, due to the first layer’s input perturbation and the network’s weight perturbation. Because the inputs of each neuron in a layer are the outputs of all neurons in its immediately preceding layer, perturbations that occur in the first layer are propagated through all internal layers and influence the output layer’s output. Based on the structural characteristics of Madalines and the sensitivity of Adalines discussed in section 3, we define the sensitivity of a layer and the sensitivity of an entire Madaline as follows:


[Figure 1 here: sensitivity or probability (0 to 0.08) versus weight perturbation (0.1 to 1.0) for Neurons 1 to 3, with curves for the simulation, our, and Stevenson’s results.]

Figure 1: Experimental results of the three Adalines with weight perturbations.

Definition 2. The sensitivity of layer l (1 \le l \le L) is a vector in which each element is the sensitivity of the corresponding Adaline in the layer due to its input and weight perturbations, which is expressed as

S^l = (s_1^l, s_2^l, \ldots, s_{n^l}^l)^T. \qquad (4.1)

Definition 3. The sensitivity of a Madaline is the sensitivity of its output layer, that is,

S = S^L = (s_1^L, s_2^L, \ldots, s_{n^L}^L)^T. \qquad (4.2)

There are two sources of perturbation for a Madaline: (1) the weight perturbation of all layers and (2) the input perturbation of the input layer, which is expressed as S^0 = (s_1^0, s_2^0, \ldots, s_{n^0}^0)^T, each element indicating the perturbation probability of the corresponding input element. This makes the notation for the input layer consistent with that of the succeeding hidden and output layers. Definition 3 is actually recursive: S^L implicitly depends on S^{L-1}, which in turn depends on S^{L-2}, and so on, until the recursion finally reaches the initial S^0. This suggests that the sensitivity of a Madaline can be


Table 2: Experimental Results for the Three Madalines.

Madaline   Architecture    Δw    p (Simulations)   s (Ours)
Net 1      20−15−1         0.1   0.019060          0.017774
                           0.2   0.036343          0.034296
                           0.3   0.050889          0.049696
                           0.4   0.065135          0.064093
                           0.5   0.078338          0.077591
                           0.6   0.089858          0.090285
                           0.7   0.099978          0.102256
                           0.8   0.108974          0.113580
                           0.9   0.117135          0.124321
                           1.0   0.127218          0.134535
Net 2      15−20−10−1      0.1   0.042542          0.042172
                           0.2   0.070831          0.076665
                           0.3   0.096497          0.105433
                           0.4   0.123444          0.129852
                           0.5   0.142426          0.150915
                           0.6   0.165527          0.169345
                           0.7   0.182373          0.185682
                           0.8   0.194580          0.200332
                           0.9   0.204102          0.213605
                           1.0   0.213837          0.225741
Net 3      25−20−15−10−1   0.1   0.068878          0.088355
                           0.2   0.126714          0.144633
                           0.3   0.170748          0.183842
                           0.4   0.211107          0.213013
                           0.5   0.238871          0.235816
                           0.6   0.266118          0.254339
                           0.7   0.292421          0.269850
                           0.8   0.314146          0.283160
                           0.9   0.328447          0.294815
                           1.0   0.344706          0.305189

obtained by calculating the sensitivity of each Adaline in the Madaline from the first layer to the last, with the partial order that the Adalines in a preceding layer are calculated before those in the succeeding layer, while the Adalines within the same layer may be calculated in any order. In section 3, we pointed out that the input perturbation of an Adaline can be treated as a kind of weight perturbation, and we provided a way to compute the sensitivity of an Adaline under weight perturbation. Hence, what remains to be done is to merge the input perturbation into the weight perturbation and then use equations 3.9 and 3.13 to compute the sensitivity of every Adaline in a Madaline. In section 3, the input perturbation is expressed as \Delta X^l (1 \le l \le L), but here it is given as S^{l-1} (1 \le l \le L), the probabilities of input perturbation. Because of the different forms of expression, a transformation from S^{l-1} to \Delta X^l is needed for each layer before computing the sensitivity of the Adalines in that layer. For layer l with S^{l-1}, since the perturbation


[Figure 2 here: sensitivity or probability (0 to 0.45) versus weight perturbation (0.1 to 1.0) for Nets 1 to 3, with simulation and theoretical curves.]

Figure 2: Experimental results for the three Madalines with weight perturbations.

probability of each input element may be different, the perturbed input vector may in general realize many combinations of perturbed input elements for a given number k (0 \le k \le n^{l-1}) of perturbed elements. This makes the transformation quite complex. For simplicity, in the following derivations, s_1^{l-1}, s_2^{l-1}, \ldots, s_{n^{l-1}}^{l-1} are approximated by their mean:

\bar s^{\,l-1} = \frac{1}{n^{l-1}} \sum_{i=1}^{n^{l-1}} s_i^{l-1}. \qquad (4.3)

Thus, the probability that exactly k elements are perturbed in the input of layer l can be approximated as

p_k^l \approx C_{n^{l-1}}^{k} \left( 1 - \bar s^{\,l-1} \right)^{n^{l-1}-k} \left( \bar s^{\,l-1} \right)^{k}. \qquad (4.4)
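Equation 4.4 is simply a binomial probability with per-element perturbation probability \bar s^{l-1}; a small Python sketch (function name ours):

```python
import math

def p_k(n_prev, k, s_bar):
    """Eq. 4.4: probability that exactly k of the n^{l-1} input elements of
    layer l are perturbed, each independently with probability s_bar."""
    return math.comb(n_prev, k) * (1 - s_bar) ** (n_prev - k) * s_bar ** k

# Being binomial, the probabilities over k = 0..n^{l-1} sum to 1.
probs = [p_k(20, k, 0.05) for k in range(21)]
print(sum(probs))
```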

In order to compute the corresponding correlation coefficient for an Adaline with both input and weight perturbations, equation 3.9 needs to be modified to merge the input perturbation into the weight perturbation.


Let \rho_{ik}^l denote the correlation coefficient of the ith Adaline in layer l with k input elements perturbed, and let \tilde W_i^l = (\tilde w_{i1}^l, \ldots, \tilde w_{in^{l-1}}^l)^T represent the perturbed weight after the merge. Noticing that \tilde w_{ij}^l = \bar w_{ij}^l when the jth input element is not perturbed and \tilde w_{ij}^l = -\bar w_{ij}^l when it is perturbed, equation 3.9 can be modified as follows:

\rho_{ik}^l = \frac{\sum_{j=1}^{n^{l-1}} w_{ij}^l \tilde w_{ij}^l}{\sqrt{\sum_{j=1}^{n^{l-1}} (w_{ij}^l)^2} \sqrt{\sum_{j=1}^{n^{l-1}} (\tilde w_{ij}^l)^2}}
= \frac{\sum_{n^{l-1}-k} w_{ij}^l \bar w_{ij}^l - \sum_{k} w_{ij}^l \bar w_{ij}^l}{\sqrt{\sum_{j=1}^{n^{l-1}} (w_{ij}^l)^2} \sqrt{\sum_{j=1}^{n^{l-1}} (\bar w_{ij}^l)^2}}
= \frac{\sum_{j=1}^{n^{l-1}} w_{ij}^l \bar w_{ij}^l - 2\sum_{k} w_{ij}^l \bar w_{ij}^l}{\sqrt{\sum_{j=1}^{n^{l-1}} (w_{ij}^l)^2} \sqrt{\sum_{j=1}^{n^{l-1}} (\bar w_{ij}^l)^2}}, \qquad (4.5)

where \sum_k runs over the k perturbed input elements, \sum_{n^{l-1}-k} runs over the unperturbed ones, and (\tilde w_{ij}^l)^2 = (\bar w_{ij}^l)^2 is used in the denominator.

Since different combinations of perturbed input elements may yield different values of \rho_{ik}^l for a given k, we again use the average correlation coefficient, denoted as \bar\rho_{ik}^l, to approximate \rho_{ik}^l. When k = 0, it is always true that \bar\rho_{i0}^l = \rho_{i0}^l. Different combinations of perturbed input elements differ only in the term \sum_k w_{ij}^l \bar w_{ij}^l of equation 4.5. Under a given k and m, if the mth input element is perturbed, the other n^{l-1} - 1 input elements must contain k - 1 perturbed elements; that is, among all C_{n^{l-1}}^{k} combinations of perturbed input elements, there are C_{n^{l-1}-1}^{k-1} combinations in which the mth input element is perturbed. Consequently, if all C_{n^{l-1}}^{k} different terms \sum_k w_{ij}^l \bar w_{ij}^l are added, each product w_{im}^l \bar w_{im}^l (1 \le m \le n^{l-1}) is added exactly C_{n^{l-1}-1}^{k-1} times. Because C_{n^{l-1}-1}^{k-1} / C_{n^{l-1}}^{k} = k / n^{l-1} and \bar\rho_{i0}^l = \rho_{i0}^l, the average correlation coefficient is

\bar\rho_{ik}^l = \frac{C_{n^{l-1}}^{k} \sum_{j=1}^{n^{l-1}} w_{ij}^l \bar w_{ij}^l - 2 C_{n^{l-1}-1}^{k-1} \sum_{j=1}^{n^{l-1}} w_{ij}^l \bar w_{ij}^l}{C_{n^{l-1}}^{k} \sqrt{\sum_{j=1}^{n^{l-1}} (w_{ij}^l)^2} \sqrt{\sum_{j=1}^{n^{l-1}} (\bar w_{ij}^l)^2}}
= \frac{\left( 1 - 2\frac{k}{n^{l-1}} \right) \sum_{j=1}^{n^{l-1}} w_{ij}^l \bar w_{ij}^l}{\sqrt{\sum_{j=1}^{n^{l-1}} (w_{ij}^l)^2} \sqrt{\sum_{j=1}^{n^{l-1}} (\bar w_{ij}^l)^2}} = \rho_{i0}^l \left( 1 - 2\frac{k}{n^{l-1}} \right). \qquad (4.6)
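Because the averaged quantity in equation 4.6 is linear in the flipped products, the closed form \rho_{i0}^l (1 - 2k/n^{l-1}) matches the exhaustive average over all C_{n^{l-1}}^{k} combinations exactly. A Python check for one Adaline (function names are ours):

```python
import itertools
import numpy as np

def rho_exact_average(w, w_bar, k):
    """Average of eq. 4.5 over every combination of k perturbed inputs."""
    n = len(w)
    denom = np.linalg.norm(w) * np.linalg.norm(w_bar)  # flips preserve norms
    vals = []
    for flipped in itertools.combinations(range(n), k):
        w_tilde = w_bar.copy()
        w_tilde[list(flipped)] *= -1.0   # merge the input flips into weights
        vals.append(np.dot(w, w_tilde) / denom)
    return float(np.mean(vals))

def rho_closed_form(w, w_bar, k):
    """Eq. 4.6: rho_bar = rho_0 * (1 - 2k/n)."""
    n = len(w)
    rho0 = np.dot(w, w_bar) / (np.linalg.norm(w) * np.linalg.norm(w_bar))
    return rho0 * (1 - 2 * k / n)

rng = np.random.default_rng(1)
w = rng.uniform(-5, 5, 8)
w_bar = w + 0.2 * rng.standard_normal(8)  # perturbed weights
print(rho_exact_average(w, w_bar, 3), rho_closed_form(w, w_bar, 3))
```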


Table 3: Experimental Results for the Three Madalines.

Madaline   Architecture    δW     p (Simulation)   s (Ours)    s′ (Stevenson)
Net 1      20−15−1         0.01   0.011400         0.010720    0.036058
                           0.02   0.021933         0.020977    0.051192
                           0.03   0.032858         0.030801    0.062939
                           0.04   0.040961         0.040219    0.072955
                           0.05   0.050033         0.049258    0.081876
                           0.06   0.058933         0.057943    0.090028
                           0.07   0.066330         0.066295    0.097606
                           0.08   0.074381         0.074336    0.104733
                           0.09   0.081882         0.082086    0.111496
                           0.10   0.089083         0.089563    0.117957
Net 2      15−20−10−1      0.01   0.024780         0.025913    0.120930
                           0.02   0.049133         0.048813    0.144180
                           0.03   0.065125         0.069195    0.159999
                           0.04   0.081543         0.087462    0.172422
                           0.05   0.097473         0.103938    0.182856
                           0.06   0.110779         0.118889    0.191969
                           0.07   0.128845         0.132533    0.200137
                           0.08   0.138306         0.145053    0.207593
                           0.09   0.149780         0.156599    0.214495
                           0.10   0.161316         0.167296    0.220951
Net 3      25−20−15−10−1   0.01   0.045918         0.059424    0.221407
                           0.02   0.081263         0.103402    0.241815
                           0.03   0.119598         0.137284    0.254826
                           0.04   0.148934         0.164259    0.264655
                           0.05   0.176012         0.186324    0.272694
                           0.06   0.198072         0.204786    0.279583
                           0.07   0.222837         0.220531    0.285672
                           0.08   0.239430         0.234178    0.291175
                           0.09   0.258338         0.246173    0.296230
                           0.10   0.271175         0.256846    0.300934

From equations 4.4 and 4.6, the sensitivity s_i^l can be expressed as

s_i^l \approx \sum_{k=0}^{n^{l-1}} p_k^l \left( \frac{1}{2} - \frac{\arcsin \bar\rho_{ik}^l}{\pi} \right). \qquad (4.7)

In summary, an algorithm for the computation of the sensitivity of a Madaline can be given as follows:

MADALINE_SENS(W, ΔW, S^0)
    for layer l from 1 to L:
        for Adaline i from 1 to n^l:
            calculate s_i^l using equations 4.3, 4.4, 4.6, and 4.7
    return S^L = (s_1^L, s_2^L, \ldots, s_{n^L}^L)^T, the sensitivity of the Madaline
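The algorithm above can be sketched in Python as follows; this is our illustrative reading of MADALINE_SENS under the stated approximations (equations 4.3, 4.4, 4.6, and 4.7), not the authors’ code:

```python
import math
import numpy as np

def adaline_sens(w, w_pert, s_bar):
    """Sensitivity of one Adaline given its weight perturbation and the mean
    perturbation probability s_bar of its inputs (eqs. 4.4, 4.6, 4.7)."""
    n = len(w)
    rho0 = float(np.clip(np.dot(w, w_pert)
                         / (np.linalg.norm(w) * np.linalg.norm(w_pert)),
                         -1.0, 1.0))
    s = 0.0
    for k in range(n + 1):
        p_k = math.comb(n, k) * (1 - s_bar) ** (n - k) * s_bar ** k  # eq. 4.4
        rho_k = rho0 * (1 - 2 * k / n)                               # eq. 4.6
        s += p_k * (0.5 - math.asin(rho_k) / math.pi)                # eq. 4.7
    return s

def madaline_sens(W, W_pert, s0):
    """Layer-by-layer sensitivity of a Madaline. W and W_pert are lists of
    per-layer weight matrices of shape (n_l, n_{l-1}); s0 gives the input
    perturbation probabilities. Returns the output layer's sensitivities."""
    s_prev = np.asarray(s0, dtype=float)
    for Wl, Wl_pert in zip(W, W_pert):
        s_bar = float(s_prev.mean())                                 # eq. 4.3
        s_prev = np.array([adaline_sens(w, wp, s_bar)
                           for w, wp in zip(Wl, Wl_pert)])
    return s_prev

# Sanity check on a toy 3-2-1 Madaline: with no input or weight
# perturbation, every sensitivity is (numerically) zero.
W = [np.array([[1.0, 2.0, -3.0], [2.0, -1.0, 1.0]]), np.array([[1.5, -1.0]])]
print(madaline_sens(W, W, [0.0, 0.0, 0.0]))
```

Flipping the sign of a single first-layer weight then yields a nonzero output-layer sensitivity, as the first layer’s deviation probability is propagated forward through equation 4.3.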

[Figure 3 here: sensitivity or probability (0 to 0.4) versus weight perturbation ratio (0.01 to 0.1) for Nets 1 to 3, with curves for the simulation, our, and Stevenson’s results.]

Figure 3: Experimental results for the three Madalines with weight perturbation ratios.

By analysis, it can be shown that the computational complexity of the algorithm is O\left( \sum_{l=1}^{L} (n^{l-1})^2 n^l \right), whereas that of the simulation approach is O\left( 2^{n^0} \sum_{l=1}^{L} n^{l-1} n^l \right). Obviously, our approach is much more efficient.

5 Experimental Verification

To verify the effectiveness of the derived formula and the given algorithm, a number of experiments were conducted. Again, the bottom-up strategy was followed, in which Adalines are considered first.

5.1 Verification for Adalines. The sensitivity results for three Adalines with input dimensions of 15, 20, and 30 were separately computed using equation 3.13, in which the elements of W were randomly selected from [−10, −1] ∪ [1, 10], and the elements of ΔW were all identical and ranged from 0.1 to 1 with an increment of 0.1. The randomly obtained weight values of the three Adalines are given in Table 4 in the appendix. Computer simulations were run to compute the actual probability of output deviation over all 2^n possible input patterns for the three Adalines with the same parameters as in the sensitivity computation. All of these results are


given in Table 1 and illustrated in Figure 1. The theoretical results s and the simulation results p in Table 1 and Figure 1 match well, which verifies the correctness of our approach. Further, the corresponding theoretical sensitivities s′ based on Stevenson’s approach (Stevenson et al., 1990) for the three Adalines were also computed; they are given in Table 1 and drawn in Figure 1. A comparison of the simulation results, our results, and Stevenson’s results demonstrates that our approach is more accurate than that of Stevenson et al.

5.2 Verification for Madalines. The experiments on Madalines were carried out for two purposes: to verify the effectiveness of our algorithm and to compare our sensitivity results with those of Stevenson et al. (1990) in order to demonstrate that ours are more accurate. In our experiments, three Madalines were implemented. They have architectures of 20−15−1, 15−20−10−1, and 25−20−15−10−1, respectively, and have weight elements randomly selected from [−10, −1] ∪ [1, 10]. All the weight values of the three Madalines thus obtained are given in Table 5 in the appendix. To address the first objective, the sensitivity results for the three Madalines were computed by MADALINE_SENS(W, ΔW, S^0) with the elements of ΔW being identical and ranging from 0.1 to 1.0 with an increment of 0.1 and S^0 being zero. Besides, the actual probabilities of output deviation for the three Madalines with the same parameters as in the sensitivity computation were obtained by running computer simulations. Both the theoretical results s and the simulation results p are given in Table 2 and drawn in Figure 2. The data and the graphs in Table 2 and Figure 2 show that the computed sensitivity and the simulated probability match well.
To fulfill the second purpose, since Stevenson’s approach (Stevenson et al., 1990) requires all Adalines in a Madaline to have the same weight perturbation ratio, our sensitivity results, Stevenson’s sensitivity results, and the simulated probabilities for the three Madalines were all computed with the same weight perturbation ratio, ranging from 0.01 to 0.1. The results are given in Table 3 and drawn in Figure 3. By comparing the corresponding data in Table 3 and graphs in Figure 3, we can conclude that our approach is more accurate.

6 Conclusion

In this article, a quantified measure of the sensitivity of Madalines to input and weight perturbations is put forward. A formula and an algorithm are given for the computation of the sensitivity. Experimental verifications demonstrate that the theoretical results obtained from the formula and the algorithm match the simulation results well, even when the input dimension of each layer is not very large. The sensitivity measure is expected to be useful as a relative criterion to evaluate the networks’ performance


so that some crucial issues of Madalines, such as the improvement of error tolerance and generalization ability and the simplification of the network architecture, will benefit from this measure. In our future work, we intend to apply the sensitivity measure to evaluate the relevance of each Adaline as well as each input attribute in a given Madaline and then, based on the evaluation, to trim the Madaline to an appropriate architecture with higher performance and lower cost. Further, we will investigate how to apply our method to other kinds of neural networks.

Appendix

This appendix presents the parameter values of the Adalines and Madalines used in the experiments.

Table 4: Parameter Values of the Three Adalines Used in the Experiment.

Adaline    Dimension   Weight
Neuron 1   15          {8.586, −9.271, 8.843, −8.257, −6.671, 2.871, 9.700, 7.314, 8.243, −8.486, −6.343, 9.900, −7.171, 9.743, 8.400}
Neuron 2   20          {9.967, −2.367, −6.650, −7.350, 8.600, −9.517, −8.283, 8.817, 6.133, −9.200, 2.567, 8.633, 5.867, −4.783, −8.267, 5.783, −7.750, 8.300, 4.533, −6.117}
Neuron 3   30          {−7.1500, 2.9130, 8.5531, −6.6591, −2.2040, 2.8642, 6.4648, 6.6690, 4.3343, 6.1763, 5.0628, 1.3951, −1.2447, 3.8142, 1.1158, −4.4557, −7.1480, 1.8356, 1.3180, 6.5116, 6.4769, 1.1418, −1.1472, 2.7107, 6.2823, 1.5182, −4.3081, −6.6831, 7.4587, 7.2340}

Table 5: Parameter Values of the Three Madalines Used in the Experiment.

Madaline   Architecture     Weight
Net 1      20−15−1

Layer 1: {−4.357, −1.014, 3.514, −6.400, −4.343, −1.500, 7.829, 7.143, 9.043, −7.700, 3.571, −5.214, 8.771, −7.529, −3.357, 4.557, −9.557, −9.129, −8.100, −8.029}, {8.971, 4.271, −3.557, 8.386, 4.529, −1.771, 1.929, 7.729, 7.543, 6.157, −9.871, −3.814, −2.857, 8.400, 4.629, −4.143, 1.329, −1.086, 4.500, −2.643}, {−6.200, −4.143, −3.371, −5.414, 9.671, −4.029, 2.014, 1.500, −3.186, −4.200, 2.129, −9.371, 6.057, 5.286, −6.071, 7.900, −6.814, 4.914, 8.471, 1.071}, {−4.186, 5.643, 6.957, −3.543, 9.214, −3.600, −5.129, −9.071, 2.829, 8.271, −4.700, 6.500, −1.600, −8.014, 2.729, 9.543, −7.500, −3.543, 6.343, −1.114},

Table 5: Continued.

Net 2      15−20−10−1

Weight {−2.186, −6.529, −8.043, −8.729, −6.571, −3.329, 1.886, 7.429, −3.386, 4.443, 3.714, 1.286, −6.843, 9.029, 6.643, −2.157, 3.800, −3.700, −4.200, 8.614}, {−2.971, 4.800, −7.557, −7.214, 7.300, 3.229, −3.014, 5.343, 8.614, −7.743, 7.871, 5.286, −5.257, 9.114, −3.286, 6.657, 8.129, 7.686, 9.457, −1.086}, {6.243, −4.557, −1.514, −5.314, 7.500, 1.629, 8.143, −5.186, 9.343, 9.300, −7.143, 2.243, 1.457, −1.086, 9.786, 3.543, −5.086, 2.414, −7.871, 8.986}, {4.414, −8.229, 8.286, −1.757, 5.557, −1.971, −4.429, 8.329, −1.900, −8.229, −2.829, 1.371, 6.114, −9.214, −2.771, −7.100, 3.257, −4.986, −3.971, 2.343}, {−6.829, 6.729, −3.114, 4.043, −1.643, −2.957, 8.857, 1.400, −1.271, −5.557, −1.671, −9.457, 2.671, −4.300, −4.971, 8.714, 2.586, −8.357, −6.814, −8.686}, {−6.157, −1.486, −8.429, 4.943, −1.200, 9.429, 6.057, 4.400, 1.514, −1.243, 4.586, −7.057, −2.900, −7.886, 8.386, 9.386, −1.600, −8.029, −3.314, −7.929}, {−9.143, −1.757, −2.371, 1.214, 2.786, −3.457, 5.029, −1.343, −4.843, 7.514, 1.814, −8.671, −3.657, 9.586, 2.271, 8.686, 7.629, 4.457, −5.143, −2.986}, {−1.743, −9.329, 7.586, 8.100, 1.586, 3.114, 9.557, −5.229, 3.086, 8.343, −9.986, 6.343, −4.214, 3.057, −4.457, 9.229, −1.129, −2.900, −4.886, −1.014}, {−6.900, −4.843, −4.343, 5.786, 7.471, −7.329, 5.057, −8.871, 4.657, −2.929, 6.943, 7.586, −8.386, −3.571, −9.486, 4.486, −4.371, 3.271, −7.786, −2.129}, {6.471, 3.043, 7.300, −10.000, −1.186, 4.143, 5.314, −9.943, 7.629, −7.643, −4.571, −1.957, −3.371, −1.386, 4.071, −9.414, 1.586, −7.971, 5.043, −8.371}, {2.329, 6.714, 5.400, 6.143, 8.714, −1.929, 4.457, 1.771, −4.857, −9.286, −6.400, −2.400, 5.171, 3.086, −5.629, −5.743, 1.686, −7.243, 3.629, −3.643}. Layer 2: {4.686, −4.757, 9.714, −5.086, −7.814, 8.957, 6.129, −7.914, −5.786, 1.243, 8.514, 1.800, −2.171, 2.571, −5.343}. 
Layer 1: {2.057, 6.643, −6.400, −7.757, 1.643, −1.071, 4.586, 7.057, −3.986, 4.243, −2.943, −7.986, −9.814, −5.943, 1.086}, {9.500, 1.486, −3.071, 9.857, −3.629, 2.657, −3.457, −6.771, −1.343, 8.000, 9.771, 9.386, 1.843, −3.714, −7.971}, {7.729, 4.029, −3.886, −5.243, −3.586, −2.929, 2.971, −7.271, 1.800, 2.914, 4.671, 1.586, 7.157, 9.929, 6.514}, {−8.529, 3.729, −6.914, 4.186, 5.186, 1.171, −2.100, 2.686, −7.700, −9.400, −4.786, −8.814, 5.043, 7.829, 4.971}, {−6.843, −8.800, −7.914, −2.257, −5.157, 7.700, −5.557, 7.614, 8.871, −2.200, −4.729, −3.029, 9.686, 7.800, −7.214}, {−4.371, −5.186, 9.386, 2.771, −5.229, −6.614, 6.857, −5.886, 1.657, 3.957, −4.329, 3.443, 5.943, −3.143, 1.257},

Table 5: Continued.

Weight {−6.371, 8.929, −7.829, 8.971, 4.857, 4.071, −9.114, −9.457, −6.243, 9.929, 4.300, −3.071, −4.714, 2.700, −2.943}, {2.029, −4.457, 2.900, −4.400, 3.400, 3.014, −9.786, −4.529, −1.643, −4.814, 5.586, −9.843, 7.743, 3.414, 1.257}, {2.586, 2.971, 4.686, −3.029, −7.529, −2.586, −6.200, −4.471, −1.857, 8.571, −7.143, 1.614, −9.057, 7.729, −1.700}, {9.529, −6.014, −6.286, 7.486, 4.600, 8.514, 3.557, −1.257, −1.429, −4.286, −5.743, −3.200, 1.057, 7.257, 2.386}, {−5.914, −2.486, −1.029, −8.943, 4.757, 1.914, −5.900, 8.800, 4.300, 2.114, −4.543, 2.071, 2.371, 2.043, −4.200}, {2.129, −4.700, 2.371, 6.529, −4.629, 8.686, 5.600, −7.614, −3.400, 2.771, −9.443, 2.800, 2.314, 5.571, 1.443}, {1.100, 9.943, −6.743, 4.286, 9.871, 5.229, −7.700, −5.700, −1.614, 6.114, 6.343, 9.543, −1.071, 5.857, −1.071}, {−1.871, 6.943, 8.886, 6.843, 5.743, 2.157, −8.100, −6.700, −5.614, −8.986, 9.986, 8.057, −3.229, 2.271, 4.371}, {−5.757, 7.314, −3.686, 1.571, 6.643, 7.029, −9.957, −4.314, −2.557, 2.500, 9.900, −8.757, 5.757, −3.214, 9.743}, {−6.757, 1.171, 8.686, 9.671, −4.471, 3.914, −6.243, 9.343, 4.743, 7.729, 8.700, 7.500, 1.671, 7.400, 1.243}, {4.871, 3.329, −6.329, 7.257, 6.429, −3.129, −9.557, 4.629, 1.757, −4.671, 2.629, −9.743, −9.986, 4.400, −3.343}, {9.100, 6.314, −5.729, 6.286, 9.686, −5.643, 5.843, 9.757, 8.357, 1.629, −9.829, 6.500, 8.343, 5.829, 6.386}, {2.014, 5.657, 2.429, −5.600, 2.957, −6.871, −3.386, 2.114, −5.257, −2.143, 4.186, 9.500, −5.686, −5.086, 5.586}, {−3.386, 9.114, −7.229, −9.986, −6.357, −8.757, 9.143, −7.657, −5.614, 1.343, −7.143, 7.871, 5.000, −4.000, 2.600}. 
Layer 2: {−5.443, −2.986, −4.800, −8.629, 9.814, −7.214, 6.686, 6.571, 1.729, −4.214, 3.543, 9.429, −3.857, −1.700, −1.329, −8.743, 9.029, 3.214, 9.771, 6.286}, {−3.971, 5.429, 4.429, 1.043, 4.243, 3.129, −2.471, −7.800, 8.371, −3.729, −7.743, 9.386, −2.000, −1.971, 7.557, −7.629, −3.143, 3.114, 4.486, 8.614}, {−3.814, −5.329, −2.686, 8.929, 1.729, 6.286, 7.257, −2.771, −9.186, 6.357, 3.943, −6.871, 8.457, −7.171, −6.071, −6.414, −6.571, −8.243, 2.771, 5.786}, {−4.400, −3.114, −2.129, −4.700, −8.671, 1.386, −2.157, −8.914, 2.757, −6.214, −5.043, 2.443, 1.486, −9.914, 8.743, −8.229, 5.814, −4.771, 9.286, 6.200}, {−7.614, 2.571, 6.600, −5.986, −6.571, −3.886, 6.043, 4.386, 2.529, −7.600, 1.571, 7.686, −6.671, 1.200, 9.286, 1.471, −3.229, −3.286, −5.786, 3.786}, {6.071, −9.143, 5.557, 5.357, 5.171, −7.443, 9.457, 4.286, 6.414, −7.529, 5.700, 7.900, 5.271, 8.129, −6.300, −7.500, −8.557, −8.886, 2.686, 7.000}, {5.886, −1.643, 1.129, −8.686, −8.614, 2.386, 4.143, −5.914,

Table 5: Continued.

Net 3      25−20−15−10−1

Weight −6.814, −8.814, 1.957, 1.286, 3.271, 1.814, −3.629, −6.971, −7.286, 2.057, −3.771, 5.457}, {9.400, 6.943, −5.129, −7.171, 9.086, 8.943, 4.143, 8.643, −8.986, −2.914, 8.857, −4.486, −5.629, −7.429, 7.614, 7.186, −7.929, 4.971, 3.157, 5.886}, {7.043, 9.629, 5.843, −9.000, −3.214, −2.086, −2.871, −2.086, 2.114, −1.671, 8.614, −7.014, −6.986, −6.886, 1.900, 8.086, −2.414, −5.171, −8.043, −2.771}, {−7.000, 5.057, −7.371, 2.786, −9.343, 1.543, −9.500, −1.886, 9.429, −2.400, 8.129, −3.829, −4.586, 6.529, −4.800, 1.114, 7.457, 5.729, −5.657, 9.029}. Layer 3: {6.586, −9.400, −1.786, −7.457, −1.157, −3.257, 1.243, 9.771, −5.900, −1.014}. Layer 1: {−2.400, 1.871, −5.400, −2.000, −6.143, −7.386, 8.471, 9.929, −2.714, 6.371, 9.443, 4.786, 7.600, 8.686, −8.757, 9.843, −4.400, 3.543, −7.957, −3.886, 3.757, 7.771, −6.100, 4.000, −3.129}, {2.614, −4.043, 5.371, −2.857, −2.543, −1.214, 3.329, 9.557, −8.500, 6.643, 8.529, −3.600, 8.329, −3.243, 1.671, 1.371, 3.214, 6.429, 9.529, 6.114, 9.129, 8.714, 5.071, 4.329, −6.986}, {−3.729, 6.400, 5.871, 8.900, −8.371, −9.414, 7.429, 7.214, −3.986, −8.329, −4.757, 6.700, −3.500, 8.743, 6.529, 6.757, 4.943, 1.814, 1.600, −7.386, 3.929, 7.043, 3.971, −1.086, −9.343}, {−3.729, −1.557, −3.200, −1.414, 5.714, −8.514, 8.143, −6.729, 8.243, 4.400, −2.500, −6.000, −7.814, 4.771, −3.729, −2.757, 6.471, −1.986, −3.300, 8.157, 7.743, 7.586, −2.871, 1.814, −8.843}, {9.729, 3.286, 9.429, 7.014, −1.186, 9.300, 2.357, 2.743, −7.800, −7.986, −1.486, −4.600, −3.429, 5.400, −3.600, 8.671, −8.814, 7.571, −1.943, 8.271, 1.114, 8.471, 2.743, −6.814, −1.557}, {6.029, 1.186, 4.743, 8.414, −9.829, −6.557, −4.571, 5.829, 4.400, 6.086, 1.686, −1.371, −8.200, 3.700, 2.071, 5.486, 3.371, −9.957, 3.300, −6.386, −6.500, 8.286, −6.086, −3.543, −2.871}, {8.614, −2.314, 3.257, 8.571, −4.443, 2.229, 6.000, −4.729, 8.557, 5.171, 1.643, −2.300, 7.614, −7.657, 5.657, −5.871, 8.443, −9.557, 8.186, 2.257, 6.229, −7.929, −6.943, 9.557, −2.086}, {−8.686, −8.586, −4.829, 
−6.157, −6.614, 7.057, 4.386, −8.014, 6.214, −5.514, −4.757, −6.171, −2.571, 2.700, 5.214, −6.629, −4.057, −7.857, 9.600, −8.329, −1.300, 3.286, 4.886, −1.357, −2.200}, {7.429, 2.771, 7.200, −5.786, −9.600, 5.243, −1.129, −9.086, −4.657, 3.457, 8.900, −4.486, 2.714, 9.686,

Table 5: Continued.

Weight 6.314, −7.000, −3.071, 1.686, −4.071, −6.757, 9.386, −1.229, 3.771, 8.543, 7.686}, {1.943, 4.614, −3.271, 7.743, −7.286, 8.657, 9.186, −5.571, 9.071, 9.471, 8.629, 5.386, −8.543, −6.900, −9.814, 9.200, −2.471, −8.543, −8.014, 1.529, −8.386, −8.886, 5.986, 7.143, 4.829}, {9.443, −2.114, 5.214, −1.243, 9.943, −9.843, 6.371, 2.714, 7.000, 7.500, 6.329, −7.529, 9.829, 7.400, 6.271, 9.600, 3.729, −5.143, 2.886, 5.343, 6.400, −1.243, 9.886, −3.657, 9.000}, {−6.614, −3.743, −3.386, −4.271, 8.471, −5.714, −2.029, 5.671, 6.114, 2.286, −5.586, −6.386, −7.429, 5.471, −7.800, 8.986, −8.671, 1.329, 6.671, 5.329, −1.857, −9.400, 2.529, 3.914, −2.586}, {3.914, 9.943, −9.729, −8.900, −7.657, 5.400, −7.600, −6.300, 2.371, 5.371, −5.114, 8.943, −9.486, 8.371, 3.557, 8.229, 9.129, 7.014, 1.786, 9.057, −3.886, 4.871, 6.600, −6.329, 9.343}, {8.071, 1.600, −7.600, −4.457, −5.800, −8.071, 2.000, 6.600, 4.500, 5.871, −3.071, 8.700, −2.014, 2.314, 7.814, 1.886, 4.814, 9.200, −5.143, −1.971, 7.686, 7.714, 1.343, −7.000, 7.786}, {3.629, −8.143, −9.171, 1.100, 1.314, 1.000, 7.871, 3.529, −8.800, −3.814, 5.771, 9.143, −2.886, 9.557, 5.329, −5.386, −4.257, −6.786, 5.171, 7.743, 1.271, 9.986, −2.457, −4.157, 4.871}, {−3.371, 2.200, 3.414, −9.629, −1.971, −4.457, 2.357, −2.543, −3.671, −7.871, −6.386, −1.786, −5.029, 6.786, 8.100, −4.071, 2.829, −2.986, −3.743, −5.529, 9.943, 9.257, 3.786, −5.629, −6.043}, {4.900, −9.500, −1.057, −2.743, −7.300, −6.400, −6.629, 6.257, −5.343, −8.800, 9.357, 1.629, −4.486, 4.200, −8.357, 1.143, 3.543, −8.886, −4.014, 8.429, −7.257, −8.986, 3.429, −1.571, 9.086}, {2.057, 4.657, −9.743, 6.271, −2.400, −2.257, −4.057, 7.743, −2.186, −2.871, −9.800, 1.143, 1.229, −3.586, −7.471, −6.129, 3.000, 8.100, 4.857, 7.814, 1.271, −6.171, −2.229, 1.186, 8.543}, {−7.086, 7.329, −8.943, −8.543, 2.900, 2.629, −9.143, −4.529, −8.014, −1.329, −3.471, 4.000, −8.600, −2.300, −9.929, −6.157, −3.286, 6.014, −8.400, −9.886, −1.200, −8.471, 8.729, −5.314, −4.257}, {9.571, 2.229, 
9.157, 3.429, −8.800, −2.357, −7.029, −4.886, 2.600, 1.571, −1.129, −6.914, 3.757, −8.486, 7.800, −7.686, 8.543, 3.400, −9.186, 4.529, −1.571, 4.743, −9.414, 8.714, 5.543}. Layer 2: {6.471, 6.900, −2.929, −2.743, 1.086, 9.943, 1.371, 4.529, 4.000, −9.543, 8.900, 7.543, −8.300, −2.143, −6.657, 9.457, −1.186, −2.886, −9.014, 4.571},

Table 5: Continued.

Weight {2.100, −1.214, 1.700, 6.829, 6.671, 5.014, −8.343, 2.529, 4.843, −5.300, 3.557, 9.129, −9.286, 1.243, 2.286, 8.800, −3.543, −7.843, −9.686, 7.871}, {−5.429, −2.814, 7.057, 8.786, −5.029, 6.000, −2.000, 3.357, −3.757, 7.357, 9.743, 5.957, −2.100, 2.414, 9.857, 2.429, −3.186, −7.643, −9.829, −6.743}, {1.229, −8.243, −4.100, 2.857, −7.071, −1.771, −5.486, −7.057, 1.843, 9.243, −6.129, −1.614, −7.914, −1.400, −9.914, −9.814, 9.657, 7.900, −3.429, −2.157}, {−3.343, 9.286, −1.229, 2.086, 7.414, −9.357, −2.514, −6.857, −7.586, 5.343, 9.543, 5.014, −4.743, 2.529, −5.457, 8.014, −5.529, 8.014, −6.086, −6.386}, {−5.771, 1.443, 1.829, 4.129, −8.029, −6.357, −7.814, 1.643, −2.986, 5.157, 1.257, −6.757, 4.500, 7.543, 3.486, −3.943, −1.214, −4.643, −9.700, −1.029}, {9.414, −2.957, 8.986, 6.414, 5.729, −7.900, 6.057, −5.514, −1.657, 7.600, 7.171, −1.414, 3.414, 3.957, 9.829, −9.514, −7.429, −4.143, −1.986, −5.271}, {5.257, −8.086, −9.443, 9.557, −1.329, 1.086, −8.514, 4.114, −8.129, 2.829, 2.257, 4.900, 6.886, 5.471, −1.900, −4.200, 4.300, 7.700, 6.800, −1.843}, {−9.714, −5.743, −9.314, 5.114, −9.343, −9.514, −8.114, −3.429, −6.186, −2.743, 6.257, −5.229, −4.271, 3.214, 2.657, −5.771, 9.814, 3.714, −9.557, −1.486}, {1.214, −9.786, −2.714, −3.900, 5.200, −5.043, 2.629, 1.457, 8.171, 7.714, 7.329, 2.343, −6.643, 4.243, −4.843, −9.943, 6.214, −8.857, −9.914, 7.200}, {−3.886, 6.871, 1.286, 8.586, 1.029, 4.343, −1.500, 4.429, 1.357, 7.943, −6.057, 8.929, 2.500, −3.171, −2.443, −6.900, 5.357, −5.457, −9.686, 8.700}, {−4.843, 8.743, 8.343, −9.986, −4.543, 4.357, 3.429, 4.100, −4.729, −8.857, −3.200, 8.843, 3.486, 5.714, 2.271, 8.429, 4.771, 1.414, −1.357, −2.971}, {−1.900, 2.871, 9.986, 2.743, −5.229, −2.843, −8.514, −4.029, −10.000, 6.057, 9.029, −2.986, 6.543, 4.286, 2.629, −2.186, 5.829, 1.743, 4.543, −2.700}, {−9.557, −8.143, −1.743, −2.757, −4.971, −5.914, −1.929, −5.443, −2.371, −6.829, 9.886, 4.200, −6.357, −9.957, −5.357, −4.971, 2.014, 2.043, −2.771, −5.257}, {4.557, 
5.771, −7.057, −8.857, −1.357, 2.443, −9.014, 7.771, −9.100, −3.571, 3.343, −6.571, 1.343, −8.500, −9.229, −3.686, −1.271, −9.514, −6.200, −5.329} Layer 3: {7.500, −3.514, −9.586, 3.800, 3.229, −9.129, −9.243, −2.614, 6.143, 2.286, −1.100, 6.443, 7.129, −3.671, −2.314}, {7.800, −9.529, −1.029, −8.629, −7.443, −6.471, −3.371, 2.143, 4.886, 1.400, 6.414, −3.814, −6.100, −6.543, −6.386}, {−4.500, 5.743, 1.514, −2.014, −6.114, 6.771, 6.500, 5.271, −2.157, −3.700, −9.543, −1.629, 3.900, 6.971, −8.900},

Table 5: Continued.

Weight {−9.200, −1.057, 6.086, −3.014, 9.429, 4.500, −5.114, 6.171, 9.214, 7.929, −9.957, −4.729, −3.114, 2.357, 2.871}, {4.343, 9.886, −5.843, 1.814, 8.043, 9.771, −4.329, −3.571, 2.543, −1.029, 9.843, 5.043, −5.871, 3.486, −2.957}, {8.086, −9.643, 9.871, −5.186, −8.000, 8.071, 6.614, 4.929, 1.257, 9.814, 6.929, 9.371, −6.786, 8.529, 7.086}, {7.743, 5.471, −9.871, −8.843, −8.186, −4.700, 9.843, −2.700, 2.714, 9.271, −3.043, 2.686, −2.614, 8.086, 6.914}, {−3.843, −1.529, −7.586, 6.186, 1.829, −5.871, 8.343, 7.343, 2.986, 9.229, 9.457, −9.957, 8.371, 1.900, 1.129}, {−8.100, 1.871, 2.229, −6.486, 9.800, −3.029, −2.614, 9.800, −2.643, −2.443, −3.129, −7.657, −6.914, 2.614, −1.300}, {−9.543, 6.300, −7.829, 5.457, 1.114, 9.371, −3.629, 8.429, −8.814, −7.443, −5.857, −5.186, 2.114, 2.457, −2.486} Layer 4: {−1.243, 5.629, 3.000, 8.029, −9.400, −8.571, −4.743, −4.857, −3.343, −8.971}

Acknowledgments

We express our gratitude to the reviewers for their helpful suggestions, which significantly improved this article. This work was supported by the National Natural Science Foundation of China under grant 60571048 and the Provincial Natural Science Foundation of Jiangsu, China, under grant BK2004114.

References

Alippi, C., Piuri, V., & Sami, M. G. (1995). Sensitivity to errors in artificial neural networks: A behavioral approach. IEEE Transactions on Circuits and Systems–I: Fundamental Theory and Applications, 42(6), 358–361.

Cheng, A. Y., & Yeung, D. S. (1999). Sensitivity analysis of neocognitron. IEEE Transactions on Systems, Man, and Cybernetics—Part C: Applications and Reviews, 29(2), 238–249.

Choi, J. Y., & Choi, C. H. (1992). Sensitivity analysis of multilayer perceptron with differentiable activation functions. IEEE Transactions on Neural Networks, 3(1), 101–107.

Engelbrecht, A. P. (2001a). Sensitivity analysis for selective learning by feedforward neural networks. Fundamenta Informaticae, 45(1), 295–328.

Engelbrecht, A. P. (2001b). A new pruning heuristic based on variance analysis of sensitivity information. IEEE Transactions on Neural Networks, 12(6), 1386–1399.

Fu, L., & Chen, T. (1993). Sensitivity analysis for input vector in multilayer feedforward neural networks. In Proceedings of the IEEE International Conference on Neural Networks (Vol. 1, pp. 215–218). Piscataway, NJ: IEEE.


Gong, J., & Zhao, G. (1996). An approximate algorithm for bivariate normal integral. Computational Structural Mechanics and Applications, 13(4), 494–497.

Oh, S. H., & Lee, Y. (1995). Sensitivity analysis of a single hidden-layer neural network with threshold function. IEEE Transactions on Neural Networks, 6(4), 1005–1007.

Piché, S. W. (1995). The selection of weight accuracies for madalines. IEEE Transactions on Neural Networks, 6(2), 432–445.

Stevenson, M., Winter, R., & Widrow, B. (1990). Sensitivity of feedforward neural networks to weight errors. IEEE Transactions on Neural Networks, 1(1), 71–80.

Yeung, D. S., & Sun, X. (2002). Using function approximation to analyze the sensitivity of the MLP with antisymmetric squashing activation function. IEEE Transactions on Neural Networks, 13(1), 34–44.

Zeng, X., & Yeung, D. S. (2001). Sensitivity analysis of multilayer perceptron to input and weight perturbations. IEEE Transactions on Neural Networks, 12(6), 1358–1366.

Zeng, X., & Yeung, D. S. (2003). A quantified sensitivity measure for multilayer perceptron to input perturbation. Neural Computation, 15(1), 183–212.

Zeng, X., & Yeung, D. S. (2006). Hidden neuron pruning of multilayer perceptrons using a quantified sensitivity measure. Neurocomputing, 69, 825–827.

Zurada, J. M., Malinowski, A., & Cloete, I. (1994). Sensitivity analysis for minimization of input data dimension for feedforward neural network. In Proceedings of the IEEE International Symposium on Circuits and Systems (pp. 447–450). Piscataway, NJ: IEEE.

Zurada, J. M., Malinowski, A., & Usui, S. (1997). Perturbation method for deleting redundant inputs of perceptron networks. Neurocomputing, 14(2), 177–193.

Received January 5, 2005; accepted March 29, 2006.

Six promising rice varieties viz., CO 43, CO 47, CO 48, CO 49, ADT 43 and Improved White Ponni were treated ... the maturity stage data for plant height and total.