Multi-instance Object Segmentation with Occlusion Handling

Yi-Ting Chen, Xiaokai Liu, Ming-Hsuan Yang

IEEE 2015 Conference on Computer Vision and Pattern Recognition

page: http://faculty.ucmerced.edu/mhyang/code


•  For each superpixel covered by a categorized proposal, we record the corresponding category and score, and form a list $\{s_{sp}^{c_j}\}_{sp \in I}^{c_j \in C}$
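As a rough illustration of the bookkeeping described above, the sketch below collects, for every superpixel, the scores of the categorized proposals that cover it and turns each per-class list into a likelihood map. The function name `likelihood_maps`, the coverage threshold `min_cover`, and the use of the mean to aggregate each list are assumptions made for illustration; the poster does not state how a superpixel's list is aggregated.

```python
import numpy as np

def likelihood_maps(superpixels, proposals, num_classes, min_cover=0.5):
    """Build one likelihood map per class from categorized proposals.

    superpixels: (H, W) int array of superpixel ids (0..K-1).
    proposals:   list of (mask, class_id, score); mask is an (H, W) bool array.
    Assumptions: a superpixel counts as covered when at least `min_cover`
    of its pixels lie inside the proposal mask, and each per-class score
    list is aggregated by its mean.
    """
    num_sp = int(superpixels.max()) + 1
    sp_size = np.bincount(superpixels.ravel(), minlength=num_sp)
    # scores[sp][c] is the list of scores recorded for class c on superpixel sp.
    scores = [[[] for _ in range(num_classes)] for _ in range(num_sp)]
    for mask, class_id, score in proposals:
        inside = np.bincount(superpixels[mask], minlength=num_sp)
        covered = np.flatnonzero((sp_size > 0) & (inside >= min_cover * sp_size))
        for sp in covered:
            scores[sp][class_id].append(score)
    per_sp = np.zeros((num_sp, num_classes))
    for sp in range(num_sp):
        for c in range(num_classes):
            if scores[sp][c]:
                per_sp[sp, c] = np.mean(scores[sp][c])
    # Paint each superpixel's aggregated score back onto the pixel grid.
    return per_sp[superpixels].transpose(2, 0, 1)  # shape (C, H, W)

# Toy example: two superpixels (left and right halves), two classes.
sp = np.array([[0, 0, 1, 1],
               [0, 0, 1, 1]])
left = np.array([[True, True, False, False]] * 2)
full = np.ones((2, 4), dtype=bool)
maps = likelihood_maps(sp, [(left, 0, 0.8), (full, 0, 0.4), (full, 1, 0.6)],
                       num_classes=2)
```

In the toy example the left superpixel is covered by two class-0 proposals, so its class-0 value is the mean of their scores, while the right superpixel only sees the lower-scoring one.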



[Figure: pipeline diagram. Labels: categorized object hypotheses; occluding regions; SDS CNN; feature extraction (Box CNN, Region CNN); cropped images; foreground images; class-specific likelihood map; shape predictions; inferred masks; shape prior; Grabcut with occlusion handling; output]

•  The shape prior is defined as the weighted mean of the inferred masks in the cluster cls(n):

$$S_n = \frac{1}{|cls(n)|} \sum_{M_m \in cls(n)} s_{h_k}^{c_j} \cdot s_{h_k}^{CM} \cdot M_m$$

•  $s_{h_k}^{c_j}$ is the classification score of the proposal $h_k$ for the class $c_j$
•  $s_{h_k}^{CM}$ is the chamfer matching score between the contour of an exemplar and the contour of a proposal
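The weighted mean of inferred masks can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `shape_prior` and the example values are hypothetical, each inferred mask is assumed to arrive together with the classification score and chamfer matching score of its proposal, and the sum is normalized by the cluster size |cls(n)|.

```python
import numpy as np

def shape_prior(masks, cls_scores, cm_scores):
    """S_n = (1 / |cls(n)|) * sum_m s_cls[m] * s_cm[m] * M_m.

    masks:      (m, H, W) array of inferred binary masks in one cluster.
    cls_scores: (m,) classification scores of the associated proposals.
    cm_scores:  (m,) chamfer matching scores of the associated proposals.
    """
    masks = np.asarray(masks, dtype=float)
    weights = np.asarray(cls_scores, dtype=float) * np.asarray(cm_scores, dtype=float)
    if masks.ndim != 3 or weights.shape != (masks.shape[0],):
        raise ValueError("expected m masks and m scores of each kind")
    # Weight every mask, sum over the cluster, divide by the cluster size.
    return np.tensordot(weights, masks, axes=1) / masks.shape[0]

# Hypothetical cluster of two 2x2 inferred masks.
masks = np.array([[[1, 1], [0, 0]],
                  [[1, 0], [1, 0]]])
prior = shape_prior(masks, cls_scores=[0.8, 0.5], cm_scores=[0.5, 0.4])

# Thresholding the prior yields a binary seed region; the value 0.15 is an
# arbitrary choice for this example, not a value taken from the poster.
seed = prior > 0.15
```

Pixels supported by several high-scoring masks receive the largest prior values, so thresholding keeps the region on which the cluster agrees.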

Motivation and Objective

•  Occlusion cannot be handled by bottom-up segmentation
•  An exemplar-based shape prediction and occlusion regularization are introduced

[Figure: (a) Input; (b) Ground truth segmentation; (c) MCG result; (d) Our result]

Overall Framework

[Figure labels: input; SDS CNN object proposals and foreground masks; object hypotheses; feature vectors; class-specific classifiers]

Exemplar-based Shape Prediction

[Figure labels: exemplars; exemplar templates; corner detector; matched points; Chamfer matching]

Class-specific Likelihood Map

[Figure: (a) Input; (b) Superpixel; (c) Person; (d) Horse]

Grabcut with Occlusion Handling

•  Grabcut initialization: segmentation proposals + thresholded shape priors
•  Energy function: $E = E_{appearance} + E_{occlusion} + E_{likelihood} + E_{smoothness}$
   §  $E_{appearance}$ models foreground and background appearances by using GMMs
   §  $E_{occlusion}$ is the occlusion regularization
   §  $E_{likelihood}$ is based on the class-specific likelihood map
   §  $E_{smoothness}$ is the same as in Grabcut
•  Classification scores of Figure (b) and (c) determine the energy assignment to the occluding region

[Figure: (a) Occluding region; (b) Segmentation w/o subtracting occluding region; (c) Segmentation w/ subtracting occluding region]

Qualitative Results – Detection

[Figure: example detection results]

Qualitative Results – Segmentation

[Figure: example segmentation results]

Quantitative Results – Detection

Table 1: Per-class results of the joint detection and segmentation task using the $AP^r$ metric over 20 classes at 0.5 IoU on the PASCAL VOC 2012 segmentation validation set. All numbers are %.

Class    SDS    Ours
aero     58.8   63.6
bike      0.5    0.3
bird     60.1   61.5
boat     34.4   43.9
bottle   29.5   33.8
bus      60.6   67.3
car      40.0   46.9
cat      73.6   74.4
chair     6.5    8.6
cow      52.4   52.3
dtable   31.7   31.3
dog      62.0   63.5
horse    49.1   48.8
mbike    45.6   47.9
person   47.9   48.3
plant    22.6   26.3
sheep    43.5   40.1
sofa     26.9   33.5
train    66.2   66.7
TV       66.1   67.8
avg      43.8   46.3

Table 2: Results of the joint detection and segmentation task using the $AP^r$ metric at different IoU thresholds on the PASCAL VOC 2012 segmentation validation set. The top two rows show $AP^r$ results using all validation images. The bottom two rows show $AP^r$ using the images with occlusions between instances.

Method   # of images   IoU 0.5   IoU 0.6   IoU 0.7   IoU 0.8   IoU 0.9
SDS      1449          43.8      34.5      21.3       8.7       0.9
Ours     1449          46.3      38.2      27.0      13.5       2.6
SDS       309          27.2      19.6      12.5       5.7       1.0
Ours      309          38.4      28.0      19.0      10.1       2.1
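The IoU thresholds in Table 2 refer to the overlap between a predicted instance mask and a ground-truth mask. The sketch below illustrates only that overlap test, not the full $AP^r$ computation, which additionally ranks detections by score and summarizes precision over recall. The function names, the toy masks, and the use of a strict inequality are assumptions made for illustration.

```python
import numpy as np

def mask_iou(pred, gt):
    """Intersection over union of two boolean masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 0.0  # convention chosen here for two empty masks
    return float(np.logical_and(pred, gt).sum() / union)

def is_correct(pred, gt, threshold):
    """Assumed matching rule: the mask IoU must exceed the threshold."""
    return mask_iou(pred, gt) > threshold

# Toy masks on a 4x6 grid: each covers 20 pixels and they share 16.
gt = np.zeros((4, 6), dtype=bool)
gt[:, :5] = True
pred = np.zeros((4, 6), dtype=bool)
pred[:, 1:] = True

iou = mask_iou(pred, gt)  # 16 shared pixels / 24 pixels in the union
matches = {t: is_correct(pred, gt, t) for t in (0.5, 0.6, 0.7, 0.8, 0.9)}
```

The same predicted mask counts as correct at the 0.5 and 0.6 thresholds but not at 0.7 and above, which is why the reported numbers fall as the threshold rises.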

Discussion

•  When exemplar-based shape prediction is disabled, the detection performance drops from 46.3% to 39.3%; when occlusion regularization is disabled, it drops from 46.3% to 46%.

•  A better estimate of object shape helps detection significantly.
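The size of the two drops can be made explicit with one line of arithmetic each. The snippet only restates the numbers quoted above; the variable names are illustrative.

```python
full_model = 46.3           # all components enabled (%)
no_shape_prediction = 39.3  # exemplar-based shape prediction disabled (%)
no_occlusion_reg = 46.0     # occlusion regularization disabled (%), quoted as 46%

# Round to one decimal to avoid floating-point noise in the differences.
drop_shape = round(full_model - no_shape_prediction, 1)
drop_occlusion = round(full_model - no_occlusion_reg, 1)

print(f"without shape prediction: -{drop_shape} points")
print(f"without occlusion regularization: -{drop_occlusion} points")
```

Removing shape prediction costs far more than removing occlusion regularization, which is the basis for the statement that a better shape estimate helps detection.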

References

1.  P. Arbelaez, J. Pont-Tuset, J. T. Barron, F. Marques, and J. Malik, “Multiscale combinatorial grouping,” in CVPR, 2014.
2.  B. Hariharan, P. Arbelaez, R. Girshick, and J. Malik, “Simultaneous detection and segmentation,” in ECCV, 2014.
3.  C. Rother, V. Kolmogorov, and A. Blake, “Grabcut: Interactive foreground extraction using iterated graph cuts,” in SIGGRAPH, 2004.
