Time Series ARIMA Models Example Ani Katchova
© 2013 by Ani Katchova. All rights reserved.
We have time series data on ppi (producer price index). Data are quarterly from 1960 to 2002. Summary statistics: mean(ppi) = 64, mean(Δppi) = 0.464.
Graphs in Stata
[Figure: two panels. Original variable (ppi): producer price index (axis 20 to 120) against time in quarters (1960q1 to 2000q1). Differenced variable (Δppi): producer price index, D (axis -4 to 4) against time in quarters (1960q1 to 2000q1).]
Original variable does not look stationary. Differenced variable looks stationary (although the variance increases).
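The differencing step can be illustrated with a minimal sketch in Python. The toy series below is simulated (a random walk with drift), not the ppi data, and the helper names `first_difference` and `mean` are made up for this example.

```python
import random


def first_difference(y):
    """Return [y[1]-y[0], y[2]-y[1], ...]; the first observation is lost."""
    return [b - a for a, b in zip(y, y[1:])]


def mean(values):
    return sum(values) / len(values)


# Hypothetical toy series (NOT the ppi data): a random walk with drift 0.5.
random.seed(0)
toy = [0.0]
for _ in range(199):
    toy.append(toy[-1] + 0.5 + random.gauss(0.0, 1.0))

d_toy = first_difference(toy)

print(len(toy), len(d_toy))      # the differenced series is one observation shorter
print(mean(toy), mean(d_toy))    # level mean vs. mean change per period
```

The mean of the differenced series is the average change per period, which is how mean(Δppi) should be read.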
Graphs in R of the original and differenced variable
Graphs in SAS of the original and differenced variable
Dickey-Fuller Test
                     Original variable            With trend                        Differenced variable
Dependent variable   D.y or Δyt                   D.y or Δyt                        D2.y or ΔΔyt
Const                0.5036*                      0.5861*                           0.2067*
L1.y or yt-1         -0.0006 (-0.26)              -0.0084 (-0.793)                  .
LD.y or Δyt-1        .                            .                                 -0.4452* (-6.86)
Trend or t           .                            0.0050                            .
Test stat            -0.26                        -0.793                            -6.86
p-value              0.398 (Stata), 0.927 (SAS)   0.99 (SAS), 0.96 (Stata & R)      <0.001
Conclusion           Variable is not stationary   Variable is not trend stationary  Differenced variable is stationary
(. = term not included in the regression)
The Dickey-Fuller test shows that the original variable is not stationary but the differenced variable is stationary, so we need to use first differences (d=1) in the ARIMA models.
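The regression behind the basic test (no trend, no lagged differences) can be sketched as follows: regress Δyt on a constant and yt-1 and take the t statistic on yt-1. This is an illustration on simulated data, not the Stata, R, or SAS routine; the function name `dickey_fuller_stat` is hypothetical, and the statistic has to be compared with Dickey-Fuller critical values rather than the usual t table.

```python
import math
import random


def dickey_fuller_stat(y):
    """OLS of dy[t] = const + slope * y[t-1] + e[t].

    Returns (const, slope, t_stat), where t_stat is the t statistic on y[t-1].
    """
    dy = [b - a for a, b in zip(y, y[1:])]
    x = y[:-1]                       # lagged level y[t-1]
    n = len(dy)
    x_bar = sum(x) / n
    dy_bar = sum(dy) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (di - dy_bar) for xi, di in zip(x, dy))
    slope = sxy / sxx
    const = dy_bar - slope * x_bar
    rss = sum((di - const - slope * xi) ** 2 for xi, di in zip(x, dy))
    se = math.sqrt(rss / (n - 2) / sxx)   # standard error of the slope
    return const, slope, slope / se


# Hypothetical simulated data (NOT the ppi series): a random walk with drift.
random.seed(1)
walk = [0.0]
for _ in range(499):
    walk.append(walk[-1] + 0.5 + random.gauss(0.0, 1.0))
d_walk = [b - a for a, b in zip(walk, walk[1:])]

const_w, slope_w, t_walk = dickey_fuller_stat(walk)
const_d, slope_d, t_diff = dickey_fuller_stat(d_walk)
print("level:      slope on y[t-1] =", slope_w, " t =", t_walk)
print("difference: slope on y[t-1] =", slope_d, " t =", t_diff)
```

For the simulated level series the coefficient on yt-1 should be close to zero, while for its difference the statistic should be strongly negative, which mirrors the pattern in the table.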
Correlograms, Autocorrelation function (ACF), and partial autocorrelation function (PACF)
Original variable (ppi)
LAG   ACF      PACF
1     0.990    0.999
2     0.978    -0.555
3     0.966    -0.069
4     0.952    -0.209
5     0.937    0.023
6     0.923    0.125
7     0.908    -0.153
8     0.894    0.114
9     0.880    0.210
10    0.866    0.049
Differenced variable (Δppi)
LAG   ACF      PACF
1     0.553    0.555
2     0.336    0.066
3     0.319    0.203
4     0.217    -0.031
5     0.086    -0.130
6     0.153    0.149
7     0.082    -0.118
8     -0.078   -0.213
9     -0.080   -0.051
10    0.023    0.166
ACF and PACF of original variable
[Figure: two panels. Autocorrelations of ppi against lag (0 to 40), with Bartlett's formula for MA(q) 95% confidence bands. Partial autocorrelations of ppi against lag (0 to 40), with 95% confidence bands [se = 1/sqrt(n)].]
For the original variable, the ACF decays slowly (indicating non-stationarity) and the PACF cuts off at lag 1 or 2.
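The sample ACF and the white-noise band se = 1/sqrt(n) can be computed directly. The sketch below uses a deterministic linear trend as a stand-in for a non-stationary series (it is not the ppi data); `sample_acf` and `white_noise_band` are hypothetical helper names.

```python
import math


def sample_acf(y, nlags):
    """Sample autocorrelations r_1..r_nlags, using the overall mean and variance."""
    n = len(y)
    y_bar = sum(y) / n
    dev = [v - y_bar for v in y]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(1, nlags + 1)
    ]


def white_noise_band(n, z=1.96):
    """Half-width of the approximate 95% band under white noise: z / sqrt(n)."""
    return z / math.sqrt(n)


# Hypothetical non-stationary example: a pure linear trend with 200 points.
trend = [float(t) for t in range(200)]
acf_trend = sample_acf(trend, 10)
band = white_noise_band(len(trend))

for lag, r in enumerate(acf_trend, start=1):
    flag = "outside band" if abs(r) > band else "inside band"
    print(lag, round(r, 3), flag)
```

A trending series gives autocorrelations that start near 1 and fall very slowly, the same slow decay seen in the ACF of the original variable.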
ACF and PACF of the differenced variable
[Figure: two panels. Autocorrelations of D.ppi against lag (0 to 40), with Bartlett's formula for MA(q) 95% confidence bands. Partial autocorrelations of D.ppi against lag (0 to 40), with 95% confidence bands [se = 1/sqrt(n)].]
For the differenced variable, the ACF tails off and the PACF cuts off after lag 1, which suggests trying AR(1).
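One way to see the link between the two columns of a correlogram is the Durbin-Levinson recursion, which turns autocorrelations into partial autocorrelations. The sketch below applies it to the first five ACF values of the differenced variable from the correlogram table. It is an illustration only: `pacf_from_acf` is a hypothetical helper, and its output need not reproduce the tabulated PACF, since software may estimate the PACF by a different method (for example, by regression).

```python
def pacf_from_acf(rho):
    """Durbin-Levinson recursion; rho[k-1] is the lag-k autocorrelation."""
    pacf = []
    phi_prev = []                      # AR coefficients of the order k-1 fit
    for k in range(1, len(rho) + 1):
        if k == 1:
            phi_kk = rho[0]
            phi = [phi_kk]
        else:
            num = rho[k - 1] - sum(phi_prev[j] * rho[k - 2 - j] for j in range(k - 1))
            den = 1.0 - sum(phi_prev[j] * rho[j] for j in range(k - 1))
            phi_kk = num / den
            phi = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)]
            phi.append(phi_kk)
        pacf.append(phi_kk)
        phi_prev = phi
    return pacf


# ACF of the differenced variable, lags 1 to 5, copied from the correlogram table.
acf_dppi = [0.553, 0.336, 0.319, 0.217, 0.086]
for lag, value in enumerate(pacf_from_acf(acf_dppi), start=1):
    print(lag, round(value, 3))

# Theoretical AR(1) check: an ACF of 0.5**k gives a PACF that is zero after lag 1.
print(pacf_from_acf([0.5 ** k for k in range(1, 6)]))
```

The AR(1) check at the end shows the textbook pattern used for identification: a geometrically decaying ACF together with a PACF that cuts off after lag 1.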
ARIMA Models
         ARIMA    ARIMA    ARIMA    ARIMA    ARIMA    ARIMA    ARIMA    ARIMA    ARIMA
         (1,0,0)  (2,0,0)  (0,0,1)  (1,0,1)  (1,1,0)  (0,1,1)  (1,1,1)  (1,1,3)  (2,1,3)
Const    64.37    64.18    64.69*   64.67*   0.46*    0.47*    0.43*    0.43*    0.44*
L1.ar    0.999*   1.64*    .        0.99*    0.55*    .        0.72*    0.73*    1.51*
L2.ar    .        -0.64*   .        .        .        .        .        .        -0.71*
L1.ma    .        .        1.00*    0.53*    .        0.48*    -0.25*   -0.24    -1.05*
L2.ma    .        .        .        .        .        .        .        -0.10    0.21
L3.ma    .        .        .        .        .        .        .        0.12     0.32*
AIC      502      424      1401     441      393      405      393      392      390
BIC      511      426      1420     543      402      414      406      411      412
(. = term not included in the model)
* These are the Stata results. R gives very similar coefficients. In the SAS output, the MA components have the opposite signs from those reported in this table, and some coefficients have different magnitudes.
We know that the variable is not stationary, so we need to use the differenced variable, ARIMA(p,1,q). Here we also include models with the original variable, ARIMA(p,0,q); in those models the coefficient on the lagged dependent variable is close to 1, indicating non-stationarity. To select a model, look at the significance of the coefficients and at low AIC or BIC values. Usually a few models perform similarly, so it is up to the researcher to try several and decide which one to use. The recommendation is to go with the simplest model. ARIMA(1,1,1) is a good choice based on its low AIC and BIC. ARIMA(2,1,3) is also a good choice based on the significance of the lags.
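The comparison by information criteria can be sketched in a few lines. The AIC and BIC values below are copied from the ARIMA table (Stata results); the dictionary layout and variable names are choices made for this example only, and only the differenced models (d = 1) are ranked.

```python
# (p, d, q) -> (AIC, BIC), copied from the ARIMA table.
models = {
    (1, 0, 0): (502, 511),
    (2, 0, 0): (424, 426),
    (0, 0, 1): (1401, 1420),
    (1, 0, 1): (441, 543),
    (1, 1, 0): (393, 402),
    (0, 1, 1): (405, 414),
    (1, 1, 1): (393, 406),
    (1, 1, 3): (392, 411),
    (2, 1, 3): (390, 412),
}

# Keep only the models estimated on the differenced variable.
candidates = {order: ic for order, ic in models.items() if order[1] == 1}

best_aic = min(candidates, key=lambda order: candidates[order][0])
best_bic = min(candidates, key=lambda order: candidates[order][1])

print("lowest AIC:", best_aic, candidates[best_aic][0])
print("lowest BIC:", best_bic, candidates[best_bic][1])

# Differences from the best value show how close the competing models are.
for order, (aic, bic) in sorted(candidates.items()):
    print(order, aic - candidates[best_aic][0], bic - candidates[best_bic][1])
```

The two criteria need not agree, and several models sit within a few points of the minimum, which is why the choice also rests on coefficient significance and on keeping the model simple.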
Forecasting in R
[Figure: forecast of the original variable after ARIMA(1,0,1)]
[Figure: forecast of the differenced variable after ARIMA(1,1,1)]
The dependent variable is forecast together with its confidence interval.
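How a point forecast and its interval are built can be sketched for the simplest case, an AR(1) on the differenced variable. This is a simplification of the models used for the graphs (it has no MA term and ignores parameter uncertainty). The function name `ar1_forecast` is hypothetical; the values mu = 0.46 and phi = 0.55 are taken from the ARIMA(1,1,0) column on the assumption that Const is the mean of the differenced series, while `sigma` and `last_value` are made-up numbers.

```python
import math


def ar1_forecast(last_value, mu, phi, sigma, steps, z=1.96):
    """Forecasts for x[t] - mu = phi * (x[t-1] - mu) + e[t], sd(e) = sigma.

    Returns a list of (point, lower, upper) for horizons 1..steps.
    """
    out = []
    var = 0.0
    for h in range(1, steps + 1):
        point = mu + phi ** h * (last_value - mu)
        var += sigma ** 2 * phi ** (2 * (h - 1))   # h-step forecast error variance
        half = z * math.sqrt(var)
        out.append((point, point - half, point + half))
    return out


# mu and phi from the ARIMA(1,1,0) column; sigma and last_value are hypothetical.
forecasts = ar1_forecast(last_value=2.0, mu=0.46, phi=0.55, sigma=1.0, steps=20)
for h, (point, lower, upper) in enumerate(forecasts[:4], start=1):
    print(h, round(point, 3), round(lower, 3), round(upper, 3))
```

The point forecast moves back toward the mean as the horizon grows, and the interval widens toward a limit set by the variance of the process.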