Supplementary Material for the Paper “Estimating the Aspect Layout of Object Categories”
Yu Xiang and Silvio Savarese
Department of Computer Science and Electrical Engineering
University of Michigan at Ann Arbor, Ann Arbor, MI 48109, USA
{yuxiang, silvio}@eecs.umich.edu
We present detailed experimental results in this supplementary material for our paper “Estimating the Aspect Layout of Object Categories”.
1. 3DObject Dataset
Fig. 1 shows the viewpoint confusion matrices of the eight categories in the 3DObject dataset obtained by our Aspect Layout Model (ALM). Viewpoint accuracy is computed over all true positive detections. To show how viewpoint estimation relates to detection, we report viewpoint accuracy as a function of recall. Fig. 2 plots the accuracy-recall curves for the eight categories in the 3DObject dataset, comparing our full model with our root model and DPM [1]. The area under the accuracy-recall curve serves as a quantitative measure of viewpoint estimation. Our full model achieves the best overall performance among the three models. Detailed detection results on the 3DObject dataset are presented in Table 1. Some aspect layout estimation results for the eight categories obtained by our full model are shown in Figs. 7-14.
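The accuracy-recall measure described above can be sketched as follows. This is a minimal illustration, not the evaluation code used for the paper: the function names, the input layout (one score, one true-positive flag, and one viewpoint-correct flag per detection), and the trapezoidal integration that extends the first accuracy value back to recall 0 are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def accuracy_recall_curve(
    scores: Sequence[float],
    is_true_positive: Sequence[bool],
    viewpoint_correct: Sequence[bool],
    num_ground_truth: int,
) -> Tuple[List[float], List[float]]:
    """Return (recalls, accuracies), one point per true positive detection.

    Detections are visited in order of decreasing score. Viewpoint accuracy
    at each point is the fraction of true positives seen so far whose
    estimated viewpoint is correct.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    recalls: List[float] = []
    accuracies: List[float] = []
    tp = 0
    correct = 0
    for i in order:
        if not is_true_positive[i]:
            continue  # false positives change neither recall nor accuracy
        tp += 1
        correct += int(viewpoint_correct[i])
        recalls.append(tp / num_ground_truth)
        accuracies.append(correct / tp)
    return recalls, accuracies


def area_under_curve(recalls: Sequence[float], accuracies: Sequence[float]) -> float:
    """Trapezoidal area under the curve (assumed integration rule)."""
    if not recalls:
        return 0.0
    xs = [0.0] + list(recalls)
    ys = [accuracies[0]] + list(accuracies)
    return sum(
        (xs[k + 1] - xs[k]) * (ys[k + 1] + ys[k]) / 2.0
        for k in range(len(recalls))
    )
```

Because the curve only advances when a true positive is found, a detector with low recall yields a small area even if its viewpoint estimates are accurate, which is why the measure couples detection and viewpoint estimation.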
[Figure 1: eight viewpoint confusion-matrix panels (Bicycle, Car, Cellphone, Iron, Mouse, Shoe, Stapler, Toaster). Rows: Ground Truth; columns: Estimated Viewpoint; both over the eight views front, right-front, right, right-back, back, left-back, left, left-front. Reported average viewpoint accuracies, in extraction order (panel assignment not preserved): 93.4%, 84.6%, 66.5%, 87.0%, 72.8%, 65.2%, 85.0%, 91.4%.]
Figure 1. Viewpoint confusion matrices of the eight categories in the 3DObject dataset.

Table 1. Average precision (%) on the 3DObject dataset and the ImageNet dataset.

Category     DPM [1]   ALM Root   ALM Full
3DObject
Bicycle      95.1      93.5       93.0
Car          98.2      99.5       98.4
Cellphone    73.1      77.4       79.2
Iron         83.1      75.8       80.7
Mouse        64.0      48.8       50.7
Shoe         95.7      85.6       84.2
Stapler      65.0      73.4       70.5
Toaster      96.7      96.5       97.4
Mean         83.9      81.3       81.8
ImageNet
Bed          94.0      83.5       89.4
Chair        95.4      78.4       89.3
Sofa         97.6      93.7       92.8
Table        95.1      81.2       90.1
Mean         95.5      84.2       90.4
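The Mean rows of Table 1 can be checked directly from the per-category values. The short sketch below recomputes them; the dictionary names and the helper `column_means` are illustrative only, and the numbers are copied from Table 1.

```python
# Per-category average precision from Table 1: (DPM, ALM Root, ALM Full).
AP_3DOBJECT = {
    "Bicycle": (95.1, 93.5, 93.0),
    "Car": (98.2, 99.5, 98.4),
    "Cellphone": (73.1, 77.4, 79.2),
    "Iron": (83.1, 75.8, 80.7),
    "Mouse": (64.0, 48.8, 50.7),
    "Shoe": (95.7, 85.6, 84.2),
    "Stapler": (65.0, 73.4, 70.5),
    "Toaster": (96.7, 96.5, 97.4),
}

AP_IMAGENET = {
    "Bed": (94.0, 83.5, 89.4),
    "Chair": (95.4, 78.4, 89.3),
    "Sofa": (97.6, 93.7, 92.8),
    "Table": (95.1, 81.2, 90.1),
}


def column_means(table):
    """Mean of each method column over all categories, rounded to one decimal."""
    columns = zip(*table.values())
    return tuple(round(sum(col) / len(col), 1) for col in columns)


print(column_means(AP_3DOBJECT))  # means for DPM, ALM Root, ALM Full
print(column_means(AP_IMAGENET))
```

The recomputed means agree with the Mean rows reported in Table 1 for both datasets.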
[Figure 2: eight accuracy-recall panels (Bicycle, Car, Cellphone, Iron, Mouse, Shoe, Stapler, Toaster). x-axis: Recall; y-axis: Viewpoint Accuracy. Each panel compares the Full Model, the Root Model, and DPM, with the area under each curve given in the legend. Legend entries (Full Model / Root Model / DPM), in extraction order (panel assignment not preserved): 0.95 / 0.97 / 0.94; 0.98 / 0.93 / 0.91; 0.60 / 0.52 / 0.48; 0.90 / 0.91 / 0.84; 0.75 / 0.71 / 0.62; 0.68 / 0.66 / 0.63; 0.84 / 0.83 / 0.90; 0.86 / 0.84 / 0.72.]
Figure 2. Viewpoint accuracy-recall curves for the eight categories in the 3DObject dataset.

[Figure 3: (a) 4 x 4 viewpoint confusion matrix over the views Frontal, Left, Rear, Right; rows: Ground Truth; columns: Estimated Viewpoint; average viewpoint accuracy 85.9%. (b) Precision-recall curve; x-axis: Recall; y-axis: Precision; average precision 48.7%.]
Figure 3. (a) Viewpoint confusion matrix of ALM on the VOC2006 Car dataset. (b) Precision-recall curve of ALM on the VOC2006 Car dataset.
2. VOC2006 Car Dataset
We show the viewpoint confusion matrix and the precision-recall curve of ALM on the VOC2006 Car dataset in Fig. 3.
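As a reminder of how a precision-recall curve is summarized by average precision, the sketch below computes the area under the interpolated precision envelope. This supplement does not restate which AP variant was used, so the all-point interpolation here is an assumption, as are the function name and input layout.

```python
from typing import Sequence


def average_precision(
    scores: Sequence[float],
    is_true_positive: Sequence[bool],
    num_ground_truth: int,
) -> float:
    """Area under the precision-recall curve with a non-increasing envelope."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = 0
    fp = 0
    recalls = []
    precisions = []
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_ground_truth)
        precisions.append(tp / (tp + fp))

    # Replace each precision by the maximum precision at any higher recall,
    # which removes the sawtooth pattern of the raw curve.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])

    # Sum rectangle areas over the recall steps.
    ap = 0.0
    prev_recall = 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap
```

With four detections ranked true, false, true, false and two ground-truth objects, the envelope gives precision 1 up to recall 0.5 and 2/3 up to recall 1, so the function returns 5/6.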
3. EPFL Car Dataset
The histograms of azimuth errors (in degrees) of ALM and DPM on the EPFL Car dataset are shown in Fig. 4(a), from which it is clear that ALM estimates viewpoint more accurately than DPM on this dataset. The viewpoint confusion matrix of ALM on the EPFL Car dataset is shown in Fig. 4(b).
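A histogram of azimuth errors like the one in Fig. 4(a) can be built from paired predicted and ground-truth azimuths. The sketch below is illustrative only: the text does not say whether errors are wrapped to the shorter arc, so the `wrap` flag, the function names, and the default 22.5-degree bin width (360 degrees divided by the 16 views of Fig. 4(b)) are assumptions.

```python
from collections import Counter
from typing import Dict, Sequence


def azimuth_error(pred_deg: float, gt_deg: float, wrap: bool = True) -> float:
    """Absolute azimuth difference in degrees.

    With wrap=True the error is measured along the shorter arc, so it lies
    in [0, 180]; with wrap=False it is the absolute difference modulo 360.
    """
    diff = abs(pred_deg - gt_deg) % 360.0
    if wrap:
        diff = min(diff, 360.0 - diff)
    return diff


def error_histogram(
    preds: Sequence[float],
    gts: Sequence[float],
    bin_width: float = 22.5,
    wrap: bool = True,
) -> Dict[int, int]:
    """Map bin index -> number of bounding boxes whose error falls in that bin."""
    counts: Counter = Counter()
    for p, g in zip(preds, gts):
        counts[int(azimuth_error(p, g, wrap) // bin_width)] += 1
    return dict(sorted(counts.items()))
```

For example, a prediction of 350 degrees against a ground truth of 10 degrees has a wrapped error of 20 degrees but an unwrapped error of 340 degrees, so the choice of convention changes the right tail of the histogram considerably.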
4. ImageNet Dataset
We show ALM’s viewpoint confusion matrices for 3 views of the four categories in the ImageNet dataset in Fig. 5, and the viewpoint confusion matrices for 7 views in Fig. 6. Detailed detection results on the ImageNet dataset are also presented in Table 1. Some aspect layout estimation results for the four categories are shown in Figs. 15-18.
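The confusion matrices referred to above can be assembled from discretized viewpoints as in the following sketch. It is a hedged illustration rather than the paper's evaluation code: the helper names, the nearest-view discretization, and the definition of accuracy as the fraction of correctly labeled true positives are assumptions.

```python
from typing import List, Sequence


def nearest_view(azimuth_deg: float, views_deg: Sequence[float]) -> int:
    """Index of the discrete view closest to the azimuth on the circle."""
    def circular_distance(a: float, b: float) -> float:
        d = abs(a - b) % 360.0
        return min(d, 360.0 - d)

    return min(
        range(len(views_deg)),
        key=lambda k: circular_distance(azimuth_deg, views_deg[k]),
    )


def confusion_matrix(
    gt_views: Sequence[int], est_views: Sequence[int], num_views: int
) -> List[List[float]]:
    """Row-normalized matrix: rows are ground truth, columns are estimates."""
    counts = [[0] * num_views for _ in range(num_views)]
    for g, e in zip(gt_views, est_views):
        counts[g][e] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def viewpoint_accuracy(gt_views: Sequence[int], est_views: Sequence[int]) -> float:
    """Fraction of true positive detections with the correct view label."""
    pairs = list(zip(gt_views, est_views))
    if not pairs:
        return 0.0
    return sum(g == e for g, e in pairs) / len(pairs)
```

With the seven azimuths of Fig. 6, `nearest_view(350, [0, 15, 30, 45, 315, 330, 345])` selects the 345-degree view, since the circular distance to 345 (5 degrees) is smaller than the distance to 0 (10 degrees).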
[Figure 4: (a) Histogram of azimuth errors for ALM and DPM; x-axis: Azimuth Error in Degree; y-axis: Number of Bounding Boxes. (b) 16 x 16 viewpoint confusion matrix of ALM over views 1-16; rows: Ground Truth; columns: Estimated Viewpoint; average viewpoint accuracy 64.8%.]
Figure 4. (a) Histograms of azimuth errors in degrees of ALM and DPM on the EPFL Car dataset. (b) Viewpoint confusion matrix of 16 views of ALM on the EPFL Car dataset.

[Figure 5: four 3 x 3 viewpoint confusion-matrix panels (Bed, Chair, Sofa, Table) over the views front, right-front, left-front; rows: Ground Truth; columns: Estimated Viewpoint. Reported average viewpoint accuracies, in extraction order (panel assignment not preserved): 87.7%, 76.0%, 92.4%, 90.0%.]
Figure 5. Viewpoint confusion matrices of 3 views of ALM on the four categories in the ImageNet dataset.

[Figure 6: four 7 x 7 viewpoint confusion-matrix panels (Bed, Chair, Sofa, Table) over the azimuths 0, 15, 30, 45, 315, 330, 345 degrees; rows: Ground Truth; columns: Estimated Viewpoint. Reported average viewpoint accuracies, in extraction order (panel assignment not preserved): 73.1%, 65.0%, 52.6%, 62.7%.]
Figure 6. Viewpoint confusion matrices of 7 views of ALM on the four categories in the ImageNet dataset.
References
[1] P. F. Felzenszwalb, R. B. Girshick, D. McAllester, and D. Ramanan. Object detection with discriminatively trained part-based models. TPAMI, 2010.
Prediction: a=135, e=30 Ground Truth: a=135, e=30
Prediction: a=180, e=0 Ground Truth: a=180, e=0
Prediction: a=90, e=15 Ground Truth: a=90, e=15
Prediction: a=45, e=15 Ground Truth: a=45, e=30
Prediction: a=225, e=30 Ground Truth: a=225, e=30
Prediction: a=270, e=15 Ground Truth: a=270, e=0
Prediction: a=0, e=30 Ground Truth: a=0, e=0
Prediction: a=315, e=30 Ground Truth: a=315, e=30
Figure 7. Aspect layout estimation results on the Bicycle category in the 3DObject dataset.
Prediction: a=150, e=15 Ground Truth: a=150, e=15
Prediction: a=180, e=0 Ground Truth: a=180, e=15
Prediction: a=90, e=15 Ground Truth: a=90, e=15
Prediction: a=30, e=0 Ground Truth: a=45, e=15
Prediction: a=210, e=15 Ground Truth: a=210, e=15
Prediction: a=270, e=0 Ground Truth: a=270, e=0
Prediction: a=0, e=0 Ground Truth: a=0, e=0
Prediction: a=330, e=15 Ground Truth: a=330, e=15
Figure 8. Aspect layout estimation results on the Car category in the 3DObject dataset.
Prediction: a=135, e=0 Ground Truth: a=135, e=0
Prediction: a=180, e=45 Ground Truth: a=180, e=45
Prediction: a=90, e=45 Ground Truth: a=90, e=45
Prediction: a=45, e=45 Ground Truth: a=45, e=45
Prediction: a=225, e=90 Ground Truth: a=225, e=90
Prediction: a=270, e=0 Ground Truth: a=270, e=0
Prediction: a=0, e=45 Ground Truth: a=0, e=45
Prediction: a=300, e=90 Ground Truth: a=300, e=45
Figure 9. Aspect layout estimation results on the Cellphone category in the 3DObject dataset.
Prediction: a=150, e=60 Ground Truth: a=150, e=60
Prediction: a=180, e=60 Ground Truth: a=180, e=60
Prediction: a=255, e=30 Ground Truth: a=270, e=30
Prediction: a=90, e=60 Ground Truth: a=90, e=60
Prediction: a=60, e=60 Ground Truth: a=60, e=60
Prediction: a=210, e=30 Ground Truth: a=225, e=60
Prediction: a=0, e=60 Ground Truth: a=0, e=60
Prediction: a=330, e=30 Ground Truth: a=315, e=60
Figure 10. Aspect layout estimation results on the Iron category in the 3DObject dataset.
Prediction: a=135, e=15 Ground Truth: a=120, e=45
Prediction: a=180, e=90 Ground Truth: a=180, e=90
Prediction: a=90, e=15 Ground Truth: a=90, e=15
Prediction: a=60, e=45 Ground Truth: a=45, e=15
Prediction: a=225, e=90 Ground Truth: a=225, e=90
Prediction: a=270, e=45 Ground Truth: a=270, e=45
Prediction: a=0, e=45 Ground Truth: a=0, e=90
Prediction: a=300, e=45 Ground Truth: a=300, e=45
Figure 11. Aspect layout estimation results on the Mouse category in the 3DObject dataset.
Prediction: a=150, e=45 Ground Truth: a=135, e=90
Prediction: a=180, e=0 Ground Truth: a=180, e=0
Prediction: a=285, e=0 Ground Truth: a=270, e=0
Prediction: a=75, e=0 Ground Truth: a=90, e=0
Prediction: a=45, e=90 Ground Truth: a=45, e=90
Prediction: a=240, e=45 Ground Truth: a=240, e=45
Prediction: a=0, e=90 Ground Truth: a=0, e=90
Prediction: a=300, e=0 Ground Truth: a=315, e=0
Figure 12. Aspect layout estimation results on the Shoe category in the 3DObject dataset.
Prediction: a=120, e=30 Ground Truth: a=135, e=30
Prediction: a=180, e=30 Ground Truth: a=180, e=30
Prediction: a=90, e=30 Ground Truth: a=90, e=0
Prediction: a=30, e=0 Ground Truth: a=45, e=0
Prediction: a=225, e=60 Ground Truth: a=225, e=60
Prediction: a=270, e=60 Ground Truth: a=270, e=60
Prediction: a=0, e=0 Ground Truth: a=0, e=0
Prediction: a=300, e=30 Ground Truth: a=315, e=30
Figure 13. Aspect layout estimation results on the Stapler category in the 3DObject dataset.
Prediction: a=150, e=0 Ground Truth: a=135, e=0
Prediction: a=180, e=45 Ground Truth: a=180, e=45
Prediction: a=90, e=45 Ground Truth: a=90, e=22.5
Prediction: a=45, e=22.5 Ground Truth: a=45, e=22.5
Prediction: a=225, e=45 Ground Truth: a=225, e=45
Prediction: a=270, e=45 Ground Truth: a=270, e=45
Prediction: a=0, e=45 Ground Truth: a=0, e=22.5
Prediction: a=300, e=45 Ground Truth: a=315, e=45
Figure 14. Aspect layout estimation results on the Toaster category in the 3DObject dataset.
Prediction: a=15, e=15 Ground Truth: a=15, e=15
Prediction: a=0, e=15 Ground Truth: a=0, e=30
Prediction: a=300, e=15 Ground Truth: a=300, e=15
Prediction: a=30, e=15 Ground Truth: a=30, e=15
Prediction: a=0, e=15 Ground Truth: a=0, e=15
Prediction: a=315, e=15 Ground Truth: a=315, e=15
Prediction: a=30, e=15 Ground Truth: a=30, e=15
Prediction: a=0, e=30 Ground Truth: a=0, e=30
Prediction: a=315, e=15 Ground Truth: a=315, e=15
Figure 15. Aspect layout estimation results on the Bed category in the ImageNet dataset.
Prediction: a=45, e=30 Ground Truth: a=60, e=30
Prediction: a=45, e=30 Ground Truth: a=45, e=15
Prediction: a=330, e=30; a=30, e=30 Ground Truth: a=330, e=30; a=30, e=30
Prediction: a=0, e=30 Ground Truth: a=0, e=30
Prediction: a=0, e=15 Ground Truth: a=0, e=15
Prediction: a=0, e=30; a=15, e=30 Ground Truth: a=0, e=30; a=0, e=50
Prediction: a=330, e=15 Ground Truth: a=330, e=30
Prediction: a=315, e=15 Ground Truth: a=330, e=15
Prediction: a=300, e=15; a=300, e=15 Ground Truth: a=300, e=15; a=300, e=15
Figure 16. Aspect layout estimation results on the Chair category in the ImageNet dataset.
Prediction: a=30, e=15 Ground Truth: a=30, e=15
Prediction: a=0, e=30 Ground Truth: a=0, e=30
Prediction: a=315, e=30 Ground Truth: a=315, e=30
Prediction: a=45, e=15 Ground Truth: a=45, e=15
Prediction: a=0, e=15 Ground Truth: a=0, e=15
Prediction: a=315, e=15 Ground Truth: a=315, e=15
Prediction: a=45, e=15 Ground Truth: a=45, e=15
Prediction: a=345, e=15; a=60, e=30 Ground Truth: a=345, e=15; a=60, e=15
Prediction: a=330, e=15; a=30, e=15 Ground Truth: a=315, e=15; a=30, e=15
Figure 17. Aspect layout estimation results on the Sofa category in the ImageNet dataset.
Prediction: a=60, e=15 Ground Truth: a=60, e=15
Prediction: a=0, e=15 Ground Truth: a=0, e=15
Prediction: a=45, e=15 Ground Truth: a=45, e=15
Prediction: a=0, e=30 Ground Truth: a=0, e=30
Prediction: a=60, e=15 Ground Truth: a=60, e=15
Prediction: a=0, e=45 Ground Truth: a=0, e=30
Prediction: a=315, e=15 Ground Truth: a=315, e=15
Prediction: a=315, e=15 Ground Truth: a=330, e=15
Prediction: a=330, e=30 Ground Truth: a=330, e=30
Figure 18. Aspect layout estimation results on the Table category in the ImageNet dataset.