WCICA 2006 Paper


Proceedings of the 6th World Congress on Control and Automation, June 21 - 23, 2006, Dalian, China

Combining Least Squares Support Vector Machines and Wavelet Transform to Predict Gas Emission Amount

Cunliang Jia, Haishan Wu

College of Information and Electronic Engineering, China University of Mining & Technology, Xuzhou, Jiangsu, China, 221008

[email protected], [email protected]

Abstract- To improve the prediction accuracy of gas emission amount, a novel model based on least squares support vector machines (LS-SVM) and wavelet transform (WT) is presented. First, the historical series is decomposed by wavelet transform into an approximate part and several detail parts. Each part is then predicted by a separate LS-SVM predictor, and the reconstruction of the predicted series is used as the final prediction result. The selections of embedding dimension and decomposition level are discussed. The results show that this model has better generalization ability and higher accuracy than the autoregressive and pure LS-SVM models it is compared with.

Index Terms– Least squares support vector machines, wavelet transform, gas emission prediction

Ⅰ. INTRODUCTION

Prediction of the expected gas emission amount from the work area of a mine is needed to facilitate ventilation planning and the assessment of methane drainage requirements. Accurate prediction of gas emission amount is crucial to ensure the safety of the workers and the production of coal. Great attention has been paid to this problem, and many models have been constructed. Among them, linear regression methods such as the autoregressive (AR) model [1] have been used in practice. Meanwhile, with the development of machine learning theory, nonlinear methods have also been applied to time series prediction. Of the nonlinear models, neural networks are very popular [2].

However, these models have disadvantages. Linear models are inadequate for nonstationary time series, which are affected by several random factors and are therefore hard to predict. Models based on neural networks are prone to overfitting because they adopt the empirical risk minimization (ERM) principle. Moreover, they need a large quantity of training samples, and their learning speed is comparatively slow.

Support vector machines (SVM), proposed by Vapnik [3] in 1995, are based on statistical learning theory (SLT). SVM adopts the structural risk minimization (SRM) principle instead of the ERM principle, and obtains a globally optimal solution by solving a quadratic programming problem. The kernel method efficiently avoids the curse of dimensionality.

1-4244-0332-4/06/$20.00 ©2006 IEEE

Least squares support vector machines (LS-SVM) are a variant of SVM that uses different constraints from the standard SVM. LS-SVM has been applied in many fields, such as time series prediction [4]. Wavelet transform (WT), which gives a good local representation of a signal in both the time domain and the frequency domain, has also been successfully applied in fields like data analysis and signal processing. It has also been proposed for time series prediction in combination with other models such as neural networks [5].

In this paper, we propose a model for gas emission amount prediction that combines LS-SVM and WT, which we call the WT-LSSVM model. A simulation experiment is carried out to validate the applicability of the model. This paper is organized as follows: Section Ⅱ reviews the basic principles of LS-SVM and WT. In Section Ⅲ, the prediction model based on LS-SVM and WT is constructed. In Section Ⅳ, the simulation experiment is carried out and the selections of embedding dimension and decomposition level are discussed. Finally, conclusions are drawn in Section Ⅴ.

Ⅱ. BACKGROUND

A. Least Squares Support Vector Machines

Suppose we have the independent and identically distributed data $\{x_1, y_1\}, \ldots, \{x_l, y_l\}$, where each $x_i \in R^n$ is an input sample with a corresponding target value $y_i \in R$ for $i = 1, \ldots, l$, and $l$ is the size of the training data. The estimating function takes the form:

$$f(x) = (w \cdot \Phi(x)) + b \tag{1}$$

where $\Phi(x)$ denotes the nonlinear mapping from the input space into a high-dimensional feature space. This leads to the optimization problem for standard SVM:

$$\text{Minimize} \quad \frac{1}{2} w^T w + \gamma \sum_{i=1}^{l} \xi_i \tag{2}$$

Subject to

$$\begin{cases} y_i [w^T \Phi(x_i) + b] \ge 1 - \xi_i \\ \xi_i \ge 0, \quad i = 1, \ldots, l \end{cases} \tag{3}$$

where $\xi_i$ is a slack variable and $\gamma$ is a positive real constant which determines the penalty on estimation errors. For LS-SVM, (2) and (3) are modified as follows:

$$\text{Minimize} \quad \frac{1}{2} w^T w + \frac{\gamma}{2} \sum_{i=1}^{l} \xi_i^2 \tag{4}$$

Subject to the equality constraints:

$$y_i [w^T \Phi(x_i) + b] = 1 - \xi_i, \quad i = 1, \ldots, l \tag{5}$$

By constructing the Lagrange function and applying the KKT conditions, the following equations are obtained:

$$\begin{cases} w = \sum_{i=1}^{l} \alpha_i y_i \Phi(x_i) \\ \sum_{i=1}^{l} \alpha_i y_i = 0 \\ \alpha_i = \gamma \xi_i \\ y_i [w^T \Phi(x_i) + b] - 1 + \xi_i = 0 \end{cases} \tag{6}$$

Then we define:

$$\begin{cases} Z = [\Phi(x_1)^T y_1; \ldots; \Phi(x_l)^T y_l] \\ Y = [y_1; \ldots; y_l] \\ \vec{1} = [1; \ldots; 1] \\ \xi = [\xi_1; \ldots; \xi_l] \\ \alpha = [\alpha_1; \ldots; \alpha_l] \end{cases} \tag{7}$$

After substituting (7) into (6) and eliminating $w$ and $\xi$, we obtain:

$$\begin{bmatrix} 0 & Y^T \\ Y & ZZ^T + \gamma^{-1} I \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ \vec{1} \end{bmatrix} \tag{8}$$

By defining $\Omega = ZZ^T$ and applying Mercer's condition [6] within the matrix $\Omega$, each element of the matrix takes the form:

$$\Omega_{i,j} = y_i y_j \Phi(x_i)^T \Phi(x_j) = y_i y_j K(x_i, x_j) \tag{9}$$

where $K(x_i, x_j)$ is defined as the kernel function. The value of the kernel equals the inner product of the images $\Phi(x_i)$ and $\Phi(x_j)$ of the two vectors $x_i$ and $x_j$ in the feature space, that is, $K(x_i, x_j) = \Phi(x_i) \cdot \Phi(x_j)$. Any symmetric function satisfying Mercer's condition can be used as a kernel function. Typical examples are the polynomial kernel and the RBF kernel.

Polynomial: $K(x_i, x_j) = (\gamma (x_i \cdot x_j) + r)^d, \quad \gamma > 0$;

RBF: $K(x_i, x_j) = \exp\left(-\dfrac{\|x_i - x_j\|^2}{2\sigma^2}\right)$

The resulting LS-SVM model for regression can be expressed as follows:

$$f(x) = \sum_{i=1}^{l} (\alpha_i - \alpha_i^*) K(x_i, x) + b \tag{10}$$

B. Wavelet Transform

Suppose the function $\psi(t) \in L^2(R)$ and its Fourier transform $\hat{\psi}(\omega)$ satisfies the condition:

$$\int_R \frac{|\hat{\psi}(\omega)|^2}{|\omega|} \, d\omega < \infty \tag{11}$$

Then $\psi(t)$ can be called a mother wavelet. By dilations and translations of the mother wavelet, the following family of wavelet functions is obtained:

$$\psi_{a,d}(t) = \frac{1}{\sqrt{|a|}} \, \psi\left(\frac{t - d}{a}\right) \quad (a \ne 0, \; d \in R) \tag{12}$$

where $a$ is the dilation factor and $d$ is the translation factor. Letting $a = 2^j$ and $d = k 2^j$, the discrete wavelet transform (DWT) is realized:

$$\psi_{j,k}(t) = 2^{-j/2} \, \psi(2^{-j} t - k) \tag{13}$$

where $k$ is the shift parameter and $j$ is the resolution level. The larger the value of $j$, the lower the frequency. According to (13), the reconstruction expression of $f(t)$ can be presented as follows:

$$f(t) = \sum_k c_{j,k} \varphi_{j,k}(t) + \sum_{i=1}^{j} \sum_k d_{i,k} \psi_{i,k}(t) = a_j(t) + \sum_{i=1}^{j} d_i(t) \tag{14}$$

where $\varphi$ is the scaling function, and $a_j$ and $d_i$ are the approximate and detail parts of the original signal, respectively.

Ⅲ. PREDICTION MODEL

The prediction model based on WT and LS-SVM can be realized according to the following stages:

A. Decomposition of the Time Series

Given the time series of gas emission amount $\{Q(1), \ldots, Q(l)\}$, it is decomposed by the wavelet at level $j$, whose selection is discussed in the next section. The approximate part $a_j$ and the detail parts $d_i$ $(i = 1, \ldots, j)$ are then obtained:

$$Q(t) = a_j + \sum_{i=1}^{j} d_i \tag{15}$$
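The additive structure of (15) can be illustrated with a small sketch. It is not the decomposition used in the experiment: for brevity it substitutes the Haar wavelet, whose approximation at level i is simply the mean over blocks of 2^i samples, and it assumes the series length is a multiple of 2^j. The function names and the sample readings are hypothetical.

```python
def block_means(series, width):
    """Replace each run of `width` samples by its mean (Haar approximation)."""
    out = []
    for start in range(0, len(series), width):
        block = series[start:start + width]
        out.extend([sum(block) / len(block)] * len(block))
    return out


def haar_decompose(series, level):
    """Return (a_j, [d_1, ..., d_j]) such that series = a_j + d_1 + ... + d_j."""
    if len(series) % (2 ** level) != 0:
        raise ValueError("series length must be a multiple of 2**level")
    approx = list(series)                      # a_0 is the series itself
    details = []
    for i in range(1, level + 1):
        coarser = block_means(series, 2 ** i)  # a_i
        # d_i = a_{i-1} - a_i, so the sum of all parts telescopes back to a_0
        details.append([f - c for f, c in zip(approx, coarser)])
        approx = coarser
    return approx, details


# Hypothetical hourly readings (not data from the paper).
q = [0.82, 0.85, 0.91, 0.88, 0.95, 1.02, 0.97, 0.90]
a3, (d1, d2, d3) = haar_decompose(q, level=3)
rebuilt = [a + x + y + z for a, x, y, z in zip(a3, d1, d2, d3)]
```

Every part has the same length as the original series, so each one can be treated as a time series in its own right, which is what the next stage relies on.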

B. Prediction Model Based on LS-SVM

Suppose the current time is $t$. The amount of gas emission $Q(t)$ can be predicted from the historical data $Q(t-1), Q(t-2), \ldots, Q(t-p)$. The prediction function can then be expressed as:

$$Q(t) = \Phi[Q(t-1), \ldots, Q(t-p)] \tag{16}$$

where $p$ is referred to as the embedding dimension, whose selection is also discussed in the next section. According to the above subsection, the prediction function can be modified as follows:

$$a_j(t) = \Phi[a_j(t-1), \ldots, a_j(t-p)] \tag{17}$$

$$d_i(t) = \Phi[d_i(t-1), \ldots, d_i(t-p)], \quad i = 1, \ldots, j \tag{18}$$

We construct a multi-input and single-output LS-SVM predictor for each part. According to (17) and (18), taking $a_j$ for example, the input vectors and output vectors of the LS-SVM predictor are obtained as shown in Table Ⅰ.

TABLE Ⅰ STRUCTURE OF INPUT VECTORS AND OUTPUT VECTORS

| input vectors | output vectors |
|---|---|
| $a_j(1), \ldots, a_j(p-1), a_j(p)$ | $a_j(p+1)$ |
| ⋮ | ⋮ |
| $a_j(l-p-1), \ldots, a_j(l-3), a_j(l-2)$ | $a_j(l-1)$ |
| $a_j(l-p), \ldots, a_j(l-2), a_j(l-1)$ | $a_j(l)$ |

C. Reconstruction of the Predicted Value

Using the LS-SVM predictors, the predicted values of the approximate part and the detail parts of the future gas emission series are obtained. Let $\hat{a}_j$ and $\hat{d}_i$ $(i = 1, \ldots, j)$ represent the predicted values of the approximate part and the detail parts, respectively. The reconstruction of the parts is used as the final predicted result:

$$\hat{Q}(t) = \hat{a}_j + \sum_{i=1}^{j} \hat{d}_i \tag{19}$$

where $Q(t)$ and $\hat{Q}(t)$ are the real and predicted values of the gas emission amount, respectively. Figure 1 shows the structure of the prediction model:

[Fig.1 Structure of the prediction model: the inputs $Q(t-p), \ldots, Q(t-2), Q(t-1)$ pass through the wavelet transform to give $a_j, d_j, \ldots, d_1$; each part is fed to its own LS-SVM predictor to give $\hat{a}_j, \hat{d}_j, \ldots, \hat{d}_1$; reconstruction yields $\hat{Q}(t)$.]

Ⅳ. SIMULATION EXPERIMENT

To test the efficiency of our prediction model, we use the hourly gas emission amounts of four days to forecast those of the next day.

A. Experiment Procedure and Results

The Db3 wavelet is selected as the wavelet function and the decomposition level is set to 3. The original series and the decomposed series are shown in Fig.2.

[Fig.2 The original series (at the top) and decomposed series]

In Fig.2, the trend parts, periodic parts, and random parts of the original series are clearly visible. Each decomposed series is used to predict its values for the next day with an LS-SVM predictor. In this section, we use the software LS-SVMlab [7], which includes an implementation for solving (8). The RBF kernel is chosen as the kernel function, and the embedding dimension is set to 6. The parameters of the four LS-SVM predictors are shown in TABLE Ⅱ:

TABLE Ⅱ PARAMETERS OF LS-SVM PREDICTORS

| LS-SVM predictor of each part | $\gamma$ | $\sigma^2$ |
|---|---|---|
| approximate parts | 1250 | 20 |
| detail parts in level 3 | 1250 | 20 |
| detail parts in level 2 | 100 | 20 |
| detail parts in level 1 | 350 | 190 |

Using the LS-SVM predictors, the predicted values of the approximate and detail parts of the gas emission amount of the fifth day are obtained as shown in Fig.3.
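The per-part predictor can be sketched as follows. This is a minimal illustration, not the LS-SVMlab code used in the experiment: it builds the input and output vectors in the layout of Table Ⅰ and fits the standard LS-SVM function-estimation system [[0, 1ᵀ], [1, K + I/γ]] [b; α] = [0; y] from the LS-SVM literature, which is the regression counterpart of the system in (8). The function names and the synthetic series are hypothetical; γ = 1250, σ² = 20 and p = 6 are the values reported above for the approximate part.

```python
import math


def embed(series, p):
    """Build (inputs, targets): the p preceding values predict the next one."""
    inputs = [series[t - p:t] for t in range(p, len(series))]
    targets = [series[t] for t in range(p, len(series))]
    return inputs, targets


def rbf(u, v, sigma2):
    """RBF kernel exp(-||u - v||^2 / (2 * sigma2))."""
    dist2 = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-dist2 / (2.0 * sigma2))


def solve(matrix, rhs):
    """Solve a dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def lssvm_fit(inputs, targets, gamma, sigma2):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]; return (b, alpha)."""
    n = len(inputs)
    system = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf(inputs[i], inputs[j], sigma2) for j in range(n)]
        row[i + 1] += 1.0 / gamma          # ridge term on the diagonal of K
        system.append(row)
    solution = solve(system, [0.0] + list(targets))
    return solution[0], solution[1:]


def lssvm_predict(x, inputs, alpha, b, sigma2):
    """Evaluate f(x) = sum_i alpha_i K(x_i, x) + b."""
    return sum(a * rbf(xi, x, sigma2) for a, xi in zip(alpha, inputs)) + b


# Hypothetical smooth series standing in for one wavelet component.
series = [1.0 + 0.1 * math.sin(0.3 * t) for t in range(40)]
X, y = embed(series, p=6)
b, alpha = lssvm_fit(X, y, gamma=1250.0, sigma2=20.0)
next_value = lssvm_predict(series[-6:], X, alpha, b, sigma2=20.0)
```

In the full model one such predictor would be fitted to each of the four parts with its own γ and σ², and the four one-step predictions would be summed as in (19).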

[Fig.3 Predicted values of each part of the original series for the fifth day. In each panel, the solid line and the dotted line represent the actual value and the predicted value, respectively.]

According to (19), the reconstruction of each part is used as the final predicted result. The result is shown in Fig.4:

[Fig.4 The final predicted result: actual value and predicted value against hour]

B. Performance and Comparison

To make a comparison with the autoregressive model in [1] and the pure LS-SVM model, three indices are used to evaluate the performance of the prediction model:

$$E = \left| \frac{\operatorname{mean}(Q) - \operatorname{mean}(\hat{Q})}{\operatorname{mean}(Q)} \right|; \qquad Co = \frac{\operatorname{cov}(Q, \hat{Q})}{\sqrt{D(Q) \cdot D(\hat{Q})}}; \qquad F = 0.6(1 - E) + 0.4\,Co$$

where $D(\cdot)$ denotes the variance. The values of $E$ and $Co$ are used to measure the error and the correlation between the real value and the predicted value, respectively. The value of $F$, which is the comprehensive index to evaluate the model, shows the precision of the model. The comparison result is shown in Table Ⅲ.

TABLE Ⅲ COMPARISON OF WT-LSSVM MODEL, AR MODEL AND LS-SVM MODEL

| Indices | WT-LSSVM model | LS-SVM model | AR model |
|---|---|---|---|
| E | 0.00474 | 0.0093 | 0.012 |
| Co | 0.9486 | 0.5309 | 0.3360 |
| F | 0.9766 | 0.8068 | 0.7270 |

As expected, all three indices of our method are better than those of the two other models: $E$ is lower, and $Co$ and $F$ are higher. The WT-LSSVM model is therefore successful in predicting the gas emission amount.

C. Discussion of Parameter Selection

The values of the embedding dimension and the decomposition level are difficult to select. In our experiment, we vary the embedding dimension from one to twelve and the decomposition level from one to three. Two indices are used to validate the efficiency of the model with the selected values: $F$ and MAPE (Mean Absolute Percentage Error).

$$\text{MAPE} = \frac{100}{l} \sum_{i=1}^{l} \left| Q(i) - \hat{Q}(i) \right|, \quad l = 24$$

The result is shown in TABLE Ⅳ and Fig.5:

TABLE Ⅳ PERFORMANCE WHEN DECOMPOSITION LEVEL VARIES

| Decomposition Level | F | MAPE(%) |
|---|---|---|
| 1 | 0.9345 | 4.274 |
| 2 | 0.9705 | 3.029 |
| 3 | 0.9766 | 2.436 |

[Fig.5 Performance when embedding dimension varies: F and MAPE(%) against embedding dimension]

From the table and figure above, we note that when $p$ is more than five or the decomposition level is more than two, there is only a tiny improvement in performance. The reason is that the gas emission amounts of the preceding five hours are sufficient to predict the value of the next hour; when $p$ is more than five, the additional information is redundant. Meanwhile, at resolution level two the random parts are already clearly separated, and too large a resolution level may lead to error propagation. In this paper, we select resolution level three because the periodic parts, the trend parts and the random parts are then illustrated clearly. The result shown in TABLE Ⅳ supports this selection.
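The evaluation indices of this section can be computed with a few lines of code. The sketch below is illustrative only: the function names are hypothetical, D(·) is read as the variance, and the MAPE function follows the expression as printed in Section Ⅳ.C, whereas the conventional MAPE also divides each term by Q(i).

```python
import math


def mean(values):
    return sum(values) / len(values)


def relative_mean_error(actual, predicted):
    """E = |(mean(Q) - mean(Q_hat)) / mean(Q)|."""
    return abs((mean(actual) - mean(predicted)) / mean(actual))


def correlation(actual, predicted):
    """Co = cov(Q, Q_hat) / sqrt(D(Q) * D(Q_hat)); the 1/n factors cancel."""
    ma, mp = mean(actual), mean(predicted)
    cov = sum((a - ma) * (p - mp) for a, p in zip(actual, predicted))
    var_a = sum((a - ma) ** 2 for a in actual)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


def composite_index(e, co):
    """F = 0.6 (1 - E) + 0.4 Co."""
    return 0.6 * (1.0 - e) + 0.4 * co


def mape_as_printed(actual, predicted):
    """100 / l * sum |Q(i) - Q_hat(i)|, with no division by Q(i)."""
    l = len(actual)
    return 100.0 / l * sum(abs(a - p) for a, p in zip(actual, predicted))


# Check against Table III: E = 0.00474 and Co = 0.9486 give F = 0.9766.
f_wt_lssvm = round(composite_index(0.00474, 0.9486), 4)
```

The same arithmetic reproduces the F value of the pure LS-SVM column (0.8068) from its E and Co entries, which confirms how the rows of Table Ⅲ relate to one another.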

Ⅴ. CONCLUSIONS

In this paper, we combine wavelet transform and least squares support vector machines to predict the time series of gas emission amount. The final results show that this model has better generalization ability and higher accuracy, which means that our method is applicable to the prediction of gas emission time series. However, additional research is necessary to further explore the model combining WT and LS-SVM. In our experiment, we only select RBF as the kernel function. Other kernel functions, such as the wavelet kernel proposed by Zhang et al. [8], may also be promising. In addition, the prediction error mainly results from the random part of the original series. Other data preparation methods [9] may enhance the accuracy.

REFERENCES

[1] Zhi-fang Xu, "Research on Integrated System of Gas Real-time Detecting Information Based on Intranet," Ph.D. thesis, China University of Mining & Technology, 2001. (in Chinese)

[2] Zhi-yi Yang, Ya-xuan Xiong, Qian-lin Zhang, "Research on the prediction of gas emission in working face Based on neural network," Coal Engineering, no. 10, pp. 73-75, 2004. (in Chinese)

[3] V. Vapnik, "The Nature of Statistical Learning Theory," New York: Springer-Verlag, 1995.

[4] Van Gestel, T., et al., "Financial time series prediction using least squares support vector machines within the evidence framework," IEEE Trans. Neural Networks, vol. 12, no. 4, pp. 809-821, 2001.

[5] Bai-ling Zhang, et al., "Multi-resolution Forecasting for Futures Trading Using Wavelet Decompositions," IEEE Trans. Neural Networks, vol. 12, no. 4, pp. 765-774, 2001.

[6] Shevade, S. K., et al., "Improvements to SMO algorithm for SVM regression," IEEE Trans. Neural Networks, vol. 11, no. 5, pp. 1188-1193, 2000.

[7] LS-SVMlab [Online]. Available: http://www.esat.kuleuven.ac.be/sista/lssvmlab

[8] Li Zhang, Wei-da Zhou, Licheng Jiao, "Wavelet Support Vector Machines," IEEE Trans. System. Man and Cybernetics, vol. 34, no. 1, pp. 34-39, 2004.

[9] Bo-juen Chen, Ming-wei Chang, Chih-jen Lin, "Load Forecasting Using Support Vector Machines: A Study on EUNITE Competition 2001," IEEE Trans. Power System, vol. 19, no. 4, pp. 1821-1830, 2004.
