Supplement to “Generalized Least Squares Model Averaging”
Qingfeng Liu∗, Ryo Okui† and Arihiro Yoshimura‡
December 21, 2014
1 Additional Monte Carlo studies
This supplement presents the results of additional Monte Carlo studies. We consider designs in which the model used to estimate the variances is misspecified and/or the number of variables that affect the variances is large. The simulation designs are almost the same as the one used in the main text, except for the true variance structures and the models used for variance estimation.
1.1 Design 1
We first examine the effect of misspecification. The data generating process is the same as the one used in the main paper. However, for the parametric version of FGLSMA, we use the following specification to estimate the variances: $\sigma_i^2 = \beta_0 + \beta_1 x_{i3}^4$. For the semiparametric FGLSMA method, we estimate the variances using $x_{i3}$ and $x_{i4}$. Note that this variance model is misspecified since the true variances depend on $x_{i2}^4$, but the estimated variance model does not include it. WALS is implemented using the same model as that for the parametric version of FGLSMA. Note that the variance model is not estimated in the other four alternative methods (HRC$_p$, JMA, MMA, LASSO).

The simulation results for the methods that estimate the variances by MLE are presented in Figures 1 and 2, and the results for the semiparametric methods are given in Figures 3 and 4. The results indicate that the parametric version of the FGLSMA estimator is robust to misspecification: misspecification of the variance model does not cause a loss in performance. However, the semiparametric FGLSMA estimator performs considerably worse than the alternative methods.

∗ Otaru University of Commerce. Email: [email protected]
† Corresponding author. Institute of Economic Research, Kyoto University, Yoshida-Hommachi, Sakyo, Kyoto, Kyoto, 606-8501, Japan. Tel: +81-75-753-7191. Fax: +81-75-753-7118. Email: [email protected]
‡ Kyoto University. Email: [email protected]
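The Design 1 setup, in which the fitted variance model uses $x_{i3}^4$ while the true variances depend on $x_{i2}^4$, can be illustrated with a toy two-step feasible GLS fit. This is only a sketch under several assumptions that are not taken from the paper: a single-regressor mean model rather than model averaging, standard normal regressors and errors, a hypothetical true variance $0.01 + x_{i2}^4$, and a least-squares regression of squared residuals in place of the MLE used by the authors. The function names `wls_line` and `two_step_fgls` are illustrative.

```python
import random


def wls_line(x, y, w=None):
    """Weighted least-squares fit of y = a + b*x; equal weights if w is None."""
    if w is None:
        w = [1.0] * len(x)
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def two_step_fgls(x, y, z, floor=1e-3):
    """Fit sigma_i^2 = b0 + b1*z_i to squared OLS residuals, then reweight.

    Assumption: least squares on squared residuals stands in for MLE here.
    """
    a_ols, b_ols = wls_line(x, y)
    resid2 = [(yi - a_ols - b_ols * xi) ** 2 for xi, yi in zip(x, y)]
    b0, b1 = wls_line(z, resid2)
    # Fitted variances can be non-positive, so clip them at a small floor.
    sigma2_hat = [max(b0 + b1 * zi, floor) for zi in z]
    return wls_line(x, y, [1.0 / s for s in sigma2_hat])


random.seed(0)
n = 150
x2 = [random.gauss(0.0, 1.0) for _ in range(n)]  # assumption: N(0, 1) regressors
x3 = [random.gauss(0.0, 1.0) for _ in range(n)]
# Hypothetical true variance that depends on x_{i2}^4 (exact form assumed).
sigma2 = [0.01 + v ** 4 for v in x2]
y = [1.0 + 2.0 * v + random.gauss(0.0, s ** 0.5) for v, s in zip(x2, sigma2)]
# Misspecified variance model: uses x_{i3}^4 and omits x_{i2}^4.
z_wrong = [v ** 4 for v in x3]
a_hat, b_hat = two_step_fgls(x2, y, z_wrong)
```

Replacing `z_wrong` with the fourth powers of `x2` gives the correctly specified counterpart, so the two fits can be compared over repeated draws.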
1.2 Design 2
Next, we consider the situation in which the variances depend on many variables. We generate the data using the following variance structure: $\sigma_i^2 = 0.01 + \beta_1 x_{i2}^4 + \beta_2 x_{i3}^4 + \beta_3 x_{i4}^4 + \beta_4 x_{i5}^4 + \beta_5 x_{i6}^4$, where $\beta_i = i^{-2} / \sum_{j=1}^{5} j^{-2}$ for $i = 1, \dots, 5$. We consider the correctly specified case. For the parametric version of FGLSMA, we use $\sigma_i^2 = b_0 + b_1 x_{i2}^4 + b_2 x_{i3}^4 + b_3 x_{i4}^4 + b_4 x_{i5}^4 + b_5 x_{i6}^4$ as the variance model and estimate $b_0$ to $b_5$ to obtain variance estimates; for the semiparametric FGLSMA method, we use $x_{i2}$ to $x_{i6}$ to estimate the variances. WALS is implemented using the same model as that for the parametric version of FGLSMA, and the variance model is not estimated in the other four alternative methods (HRC$_p$, JMA, MMA, LASSO).

The results are summarized in Figures 5-8. We find that the parametric version of FGLSMA is indeed robust to the number of variables and performs better than the alternative methods. However, the semiparametric FGLSMA method may not work well, in particular when it is based on CΩF(W) and $R^2$ is small. Nonetheless, the performance of the semiparametric FGLSMA estimator based on CIFn(W) may be acceptable. These results indicate that the parametric version of FGLSMA is recommended even if the number of variables that affect the variances is large, whereas the semiparametric FGLSMA estimator may be used, but with care, when the number of variables is large.
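The Design 2 variance structure can be sketched in a few lines of Python. The weights and the intercept 0.01 follow the formula stated for this design; the distribution of the regressors and of the errors (standard normal and conditionally normal, respectively) is an assumption made only for illustration, and the function names are hypothetical.

```python
import random


def variance_weights(m=5):
    """Return beta_i = i^-2 / sum_{j=1}^m j^-2 for i = 1, ..., m."""
    raw = [i ** -2 for i in range(1, m + 1)]
    total = sum(raw)
    return [r / total for r in raw]


def design2_variances(x_rows, intercept=0.01):
    """Each row holds (x_i2, ..., x_i6); returns sigma_i^2 for every row."""
    beta = variance_weights(len(x_rows[0]))
    return [
        intercept + sum(b * x ** 4 for b, x in zip(beta, row))
        for row in x_rows
    ]


random.seed(0)
n = 150
# Assumption: regressors drawn i.i.d. N(0, 1); the main text fixes the real design.
x_rows = [[random.gauss(0.0, 1.0) for _ in range(5)] for _ in range(n)]
sigma2 = design2_variances(x_rows)
# Assumption: heteroskedastic normal errors with the variances above.
errors = [random.gauss(0.0, s ** 0.5) for s in sigma2]
```

The weights sum to one and decline quickly, so $x_{i2}^4$ carries most of the weight and $x_{i6}^4$ the least.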
1.3 Design 3
Lastly, we consider the setting in which the variances depend on many variables and the model for variance estimation is misspecified. The data generating process is the same as that of Design 2. However, the models used to estimate the variances are misspecified. For the parametric version, we use the following model: $\sigma_i^2 = \beta_0 + \beta_1 x_{i2}^4 + \beta_2 x_{i3}^4 + \beta_3 x_{i4}^4 + \beta_4 x_{i7}^4 + \beta_5 x_{i8}^4$. For the semiparametric version, we use $(x_{i2}, x_{i3}, x_{i4}, x_{i7}, x_{i8})$. WALS is implemented using the same model as that for the parametric version of FGLSMA. Note that the variance model is not estimated in the other four alternative methods (HRC$_p$, JMA, MMA, LASSO).

The results are summarized in Figures 9-12. The results are similar to those from Design 2. The parametric version of FGLSMA works well. However, the performance of the semiparametric FGLSMA method deteriorates when it is based on CΩF(W) and $R^2$ is small.
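To make the form of the misspecification in Design 3 concrete, the following sketch compares the regressor indices in the true variance structure with those in the fitted model and computes how much of the Design 2 weight $\beta_i$ falls on the omitted variables. The variable names are illustrative, and reading the weight share as a rough measure of the severity of the misspecification is an interpretation added here, not a claim from the paper.

```python
from fractions import Fraction

# Regressor indices j of x_{ij}: true variance structure vs. fitted model.
true_vars = [2, 3, 4, 5, 6]
fitted_vars = [2, 3, 4, 7, 8]

# Design 2 weights beta_i = i^-2 / sum_j j^-2, kept exact with fractions.
raw = [Fraction(1, i * i) for i in range(1, 6)]
total = sum(raw)
beta = {j: r / total for j, r in zip(true_vars, raw)}

# Variables that matter but are left out, and variables included needlessly.
omitted = [j for j in true_vars if j not in fitted_vars]
irrelevant = [j for j in fitted_vars if j not in true_vars]

# Share of the total weight carried by the omitted variables.
omitted_share = sum(beta[j] for j in omitted)
```

Here `omitted` is `[5, 6]`, `irrelevant` is `[7, 8]`, and `omitted_share` equals 369/5269, about 7% of the total weight, since the omitted variables are those with the two smallest coefficients.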
1.4 Summary of the results
The results of these additional Monte Carlo studies indicate that the parametric version of FGLSMA is robust to misspecification and the presence of many variables that affect the variances, but the semiparametric FGLSMA estimator is not. These results provide practical recommendations. When we suspect that the variances depend on many variables, the parametric version of FGLSMA should be used even when the variance structure is unknown. The parametric version of FGLSMA can be employed even when there is no clue about which variables affect the variances. When we are certain that a small number of variables affect the variances but the variance structure is unknown, the semiparametric FGLSMA estimator may be considered. Of course, when the variance structure is known, the parametric version of FGLSMA is recommended.
Figure 1: Performance of the MLE-type FGLSMA estimator based on CIFn(W) and alternative estimators in Design 1 with n = 150. Panels: (a) K=9, (b) K=21, (c) K=29, (d) K=41. Horizontal axis: population $R^2$; vertical axis: MSE ratio. Legend: FGLS, HRC$_p$, JMA, WALS, MMA, LASSO.
Figure 2: Performance of the MLE-type FGLSMA estimator based on CΩF(W) and alternative estimators in Design 1 with n = 150. Panels: (a) K=9, (b) K=21, (c) K=29, (d) K=41. Horizontal axis: population $R^2$; vertical axis: WMSE ratio. Legend: FGLS, HRC$_p$, JMA, WALS, MMA, LASSO.
Figure 3: Performance of the semiparametric FGLSMA estimator based on CIFn(W) and alternative estimators in Design 1 with n = 150. Panels: (a) K=9, (b) K=21, (c) K=29, (d) K=41. Horizontal axis: population $R^2$; vertical axis: MSE ratio. Legend: FGLS, HRC$_p$, JMA, WALS, MMA, LASSO.
Figure 4: Performance of the semiparametric FGLSMA estimator based on CΩF(W) and alternative estimators in Design 1 with n = 150. Panels: (a) K=9, (b) K=21, (c) K=29, (d) K=41. Horizontal axis: population $R^2$; vertical axis: WMSE ratio. Legend: FGLS, HRC$_p$, JMA, WALS, MMA, LASSO.
Figure 5: Performance of the MLE-type FGLSMA estimator based on CIFn(W) and alternative estimators in Design 2 with n = 150. Panels: (a) K=9, (b) K=21, (c) K=29, (d) K=41. Horizontal axis: population $R^2$; vertical axis: MSE ratio. Legend: FGLS, HRC$_p$, JMA, WALS, MMA, LASSO.
Figure 6: Performance of the MLE-type FGLSMA estimator based on CΩF(W) and alternative estimators in Design 2 with n = 150. Panels: (a) K=9, (b) K=21, (c) K=29, (d) K=41. Horizontal axis: population $R^2$; vertical axis: WMSE ratio. Legend: FGLS, HRC$_p$, JMA, WALS, MMA, LASSO.
Figure 7: Performance of the semiparametric FGLSMA estimator based on CIFn(W) and alternative estimators in Design 2 with n = 150. Panels: (a) K=9, (b) K=21, (c) K=29, (d) K=41. Horizontal axis: population $R^2$; vertical axis: MSE ratio. Legend: FGLS, HRC$_p$, JMA, WALS, MMA, LASSO.
Figure 8: Performance of the semiparametric FGLSMA estimator based on CΩF(W) and alternative estimators in Design 2 with n = 150. Panels: (a) K=9, (b) K=21, (c) K=29, (d) K=41. Horizontal axis: population $R^2$; vertical axis: WMSE ratio. Legend: FGLS, HRC$_p$, JMA, WALS, MMA, LASSO.
Figure 9: Performance of the MLE-type FGLSMA estimator based on CIFn(W) and alternative estimators in Design 3 with n = 150. Panels: (a) K=9, (b) K=21, (c) K=29, (d) K=41. Horizontal axis: population $R^2$; vertical axis: MSE ratio. Legend: FGLS, HRC$_p$, JMA, WALS, MMA, LASSO.
Figure 10: Performance of the MLE-type FGLSMA estimator based on CΩF(W) and alternative estimators in Design 3 with n = 150. Panels: (a) K=9, (b) K=21, (c) K=29, (d) K=41. Horizontal axis: population $R^2$; vertical axis: WMSE ratio. Legend: FGLS, HRC$_p$, JMA, WALS, MMA, LASSO.
Figure 11: Performance of the semiparametric FGLSMA estimator based on CIFn(W) and alternative estimators in Design 3 with n = 150. Panels: (a) K=9, (b) K=21, (c) K=29, (d) K=41. Horizontal axis: population $R^2$; vertical axis: MSE ratio. Legend: FGLS, HRC$_p$, JMA, WALS, MMA, LASSO.
Figure 12: Performance of the semiparametric FGLSMA estimator based on CΩF(W) and alternative estimators in Design 3 with n = 150. Panels: (a) K=9, (b) K=21, (c) K=29, (d) K=41. Horizontal axis: population $R^2$; vertical axis: WMSE ratio. Legend: FGLS, HRC$_p$, JMA, WALS, MMA, LASSO.