July 17th, 2013
Learning Auxiliary Dictionaries for Undersampled Face Recognition Chia‐Po Wei and Yu‐Chiang Frank Wang Research Center for IT Innovation, Academia Sinica, Taipei, Taiwan
IEEE International Conference on Multimedia & Expo 2013
Outline • Introduction Undersampled Face Recognition • Related Work Sparse Representation based Classification (SRC) & ESRC Dictionary Learning for Sparse Representation • Learning of Auxiliary Dictionary • Experiments • Conclusions
2
Why Face Recognition? Computational Forensics
Access Control
Mass Surveillance
People Identification/Search
3
Challenges • Robust Face Recognition Face images with possible illumination and expression variations Face images might be corrupted due to occlusion or disguise. A sufficient amount of training data is required for handling such variations.
• Undersampled Face Recognition Practically, only one or very few images per subject are available for training. It would be very difficult to handle both intra‐ and inter‐class variations.
4
Approaches to Undersampled FR Problems • AGL (Su et al., CVPR’10): Adaptive generic learning • ESRC (Deng et al., PAMI’12): Extended sparse representation‐based classification • PCRC (Zhu et al., ECCV’12): Multi‐scale patch‐based collaborative representation approach • Remarks AGL and PCRC did not consider corrupted face images. ESRC directly applied external data for modeling intra‐class variations. How to properly utilize external data remains a question…
5
Outline • Introduction Undersampled Face Recognition • Related Work Sparse Representation based Classification (SRC) & ESRC Dictionary Learning for Sparse Representation • Learning of Auxiliary Dictionary • Experiments • Conclusions
6
Sparse Representation based Classification (SRC)
• Consider an input image as a sparse linear combination of training images:
  y = [D1, D2, …, DC] x + ε = D x + ε
• Given a test image y and training data D = [D1, D2, …, DC] of C classes, SRC (Wright et al., PAMI’09) derives the sparse coefficient x of y as:
  min_x ||y − D x||_2^2 + λ ||x||_1
• The identity of y will be given by
  identity(y) = arg min_k ||y − D δ_k(x)||_2^2
  where δ_k(x) keeps only the coefficients associated with class k, i.e., classification is based on the class‐wise minimum reconstruction error.
• However, SRC requires a large amount of training data as D.
7
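A minimal sketch of the SRC pipeline above, using scikit-learn's Lasso as a stand-in for the l1-minimization step; the function name, λ value, and solver choice are illustrative assumptions, not the authors' implementation:

```python
import numpy as np
from sklearn.linear_model import Lasso

def src_classify(y, D, labels, lam=0.01):
    """Classify y by class-wise reconstruction error of its sparse code.

    D: (d, n) matrix of column-stacked training images, labels[j] is the
    class of column j. Lasso is a proxy for the l1-minimization in SRC.
    """
    x = Lasso(alpha=lam, fit_intercept=False, max_iter=5000).fit(D, y).coef_
    residuals = {}
    for k in np.unique(labels):
        xk = np.where(labels == k, x, 0.0)        # delta_k(x): keep class-k coefficients
        residuals[k] = np.linalg.norm(y - D @ xk)  # class-wise reconstruction error
    return min(residuals, key=residuals.get)
```

Note that Lasso rescales the data-fit term by 1/(2n), so its `alpha` is not numerically identical to λ above; the classification rule is unaffected.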
Extended SRC
• To improve SRC, Deng et al. (PAMI’12) proposed extended SRC (ESRC):
  min_x ||y − [D, A] x||_2^2 + λ ||x||_1, where x = [x_d; x_a]
• Intra‐class variant dictionary A: contains images collected from an external dataset (subjects not of interest).
• ESRC performs classification by
  identity(y) = arg min_k ||y − [D, A] [δ_k(x_d); x_a]||_2^2
• ESRC utilizes external data to model intra‐class variations.
  Performance might degrade due to noisy or redundant images in the external data.
  The size of the external dataset can be very large (and thus a large A). 8
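The ESRC rule above differs from SRC only in the stacked dictionary [D, A] and in keeping the auxiliary part x_a in every class-wise residual. A hedged sketch, again with Lasso as a stand-in l1 solver and illustrative names:

```python
import numpy as np
from sklearn.linear_model import Lasso

def esrc_classify(y, D, A, labels, lam=0.01):
    """ESRC-style classification: code y over the stacked dictionary [D, A].

    D: gallery of the subjects of interest (labels[j] = class of column j);
    A: intra-class variant dictionary built from external subjects.
    """
    B = np.hstack([D, A])
    x = Lasso(alpha=lam, fit_intercept=False, max_iter=5000).fit(B, y).coef_
    xd, xa = x[:D.shape[1]], x[D.shape[1]:]
    best_k, best_r = None, np.inf
    for k in np.unique(labels):
        xk = np.where(labels == k, xd, 0.0)       # delta_k(x_d)
        r = np.linalg.norm(y - D @ xk - A @ xa)   # A explains the variation part
        if r < best_r:
            best_k, best_r = k, r
    return best_k
```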
Outline • Introduction Undersampled Face Recognition • Related Work Sparse Representation‐based Classification (SRC) & ESRC Dictionary Learning for Sparse Representation • Learning Auxiliary Dictionaries • Experimental Results Extended Yale B AR • Conclusions 9
Dictionary Learning (DL) for Sparse Representation
• Unsupervised Dictionary Learning: Data Representation
  MOD (Engan et al., ICASSP’99)
  K‐SVD (Aharon et al., TSP’06)
• Supervised Dictionary Learning: Data Separation
  Supervised Dictionary Learning (Mairal et al., NIPS’09)
  Discriminative K‐SVD (Zhang and Li, CVPR’10)
  Label Consistent K‐SVD (Jiang et al., CVPR’11)
  Fisher Discrimination Dictionary Learning (Yang et al., ICCV’11)
• Remarks The above approaches might not generalize well with limited training data (i.e., for undersampled recognition problems). For undersampled FR, is an integration of ESRC and DL possible? 10
Outline • Introduction Undersampled Face Recognition • Related Work Sparse Representation based Classification (SRC) & ESRC Dictionary Learning for Sparse Representation • Learning of Auxiliary Dictionary • Experiments • Conclusions
11
Our Proposed Method
[Figure: query y represented over the gallery D and the learned auxiliary dictionary A via sparse coefficient x]
• Sparse coding:
  min_x ||y − [D, A] x||_2^2 + λ ||x||_1, where x = [x_d; x_a]
• Classification:
  identity(y) = arg min_k ||y − [D, A] [δ_k(x_d); x_a]||_2^2
12
Our Contributions
• Learning an auxiliary dictionary A that models intra‐class variations using external data
• Advantages:
  Only one or a few training images per subject class are required.
  The auxiliary dictionary A is derived from external data (i.e., subjects NOT of interest).
  The learned A is compact: preferable for storage and computation.
  The method can deal with corrupted face images in both training and testing.
13
Problem Formulation
• Our Proposed Algorithm (for learning A):
  min_{A, X} Σ_{i=1}^N ( ||y_i^e − [D^e, A] x_i||_2^2 + λ ||x_i||_1 + γ ||y_i^e − D^e δ_k(x_i^d) − A x_i^a||_2^2 )
  (reconstruction error + sparsity constraint + class‐wise reconstruction error for class k)
• Notation:
  Y^e = [y_1^e, y_2^e, …, y_N^e]: probe set of external data
  D^e: gallery set of external data
  A: auxiliary dictionary to be learned
  x_i = [x_i^d; x_i^a]: sparse coefficient of y_i^e
• Remarks: the auxiliary dictionary A and the sparse coefficients X need to be learned during training using external data. 14
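To make the three-term objective above concrete, it can be evaluated for a candidate (A, X) as below; all argument names, and the λ, γ hyperparameter names, are illustrative assumptions:

```python
import numpy as np

def auxiliary_objective(Ye, De, A, X, gal_labels, probe_labels, lam=0.1, gamma=1.0):
    """Value of the training objective for learning A (a sketch).

    Ye: (d, N) probe set of external data; De: gallery set of external data
    (gal_labels[j] = class of column j); X = [Xd; Xa]: sparse codes, one
    column per probe; probe_labels[i] = class of probe i.
    """
    nd = De.shape[1]
    total = 0.0
    for i in range(Ye.shape[1]):
        y, x = Ye[:, i], X[:, i]
        xd, xa = x[:nd], x[nd:]
        recon = np.linalg.norm(y - De @ xd - A @ xa) ** 2       # ||y - [De, A]x||^2
        xk = np.where(gal_labels == probe_labels[i], xd, 0.0)   # delta_k(x_d)
        classwise = np.linalg.norm(y - De @ xk - A @ xa) ** 2   # class-wise term
        total += recon + lam * np.abs(x).sum() + gamma * classwise
    return total
```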
The Training Stage
• Optimization via iterating between sparse coding and dictionary updates
• Sparse Coding for Updating X: given a fixed A, we rewrite our formulation as
  min_X Σ_{i=1}^N ||ŷ_i − Â x_i||_2^2 + λ ||x_i||_1,
  where ŷ_i = [y_i^e; γ′ y_i^e], Â = [D^e, A; γ′ δ_k(D^e), γ′ A], γ′ = γ^{1/2}, and δ_k(D^e) = [0, 0, …, D_k^e, …, 0] keeps only the gallery columns of class k.
• The above is a standard L1‐minimization problem. It can be solved via techniques like Homotopy, Iterative Shrinkage‐Thresholding, or the Augmented Lagrange Multiplier (ALM) method.
15
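Since the slide mentions Iterative Shrinkage-Thresholding as one admissible solver, here is a minimal generic ISTA sketch for the resulting L1 problem; it is a textbook implementation under assumed names, not the authors' solver:

```python
import numpy as np

def ista(B, y, lam=0.1, n_iter=500):
    """Minimal ISTA for min_x 0.5 * ||y - B x||_2^2 + lam * ||x||_1."""
    L = np.linalg.norm(B, 2) ** 2              # Lipschitz constant: sigma_max(B)^2
    x = np.zeros(B.shape[1])
    for _ in range(n_iter):
        g = B.T @ (B @ x - y)                  # gradient of the smooth term
        z = x - g / L                          # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-thresholding
    return x
```

Applied to the training stage, B would be the stacked dictionary Â and y the stacked observation ŷ_i for each external probe.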
The Training Stage (cont’d)
• Dictionary Update for A: we fix X and rewrite the objective function as
  min_A Σ_{i=1}^N ( ||y_i^e − D^e x_i^d − A x_i^a||_2^2 + γ ||y_i^e − D^e δ_k(x_i^d) − A x_i^a||_2^2 )
• Setting the partial derivatives with respect to the columns of A equal to zero, the analytical solution can be derived as
  A = V U^{-1},
  where U = (1 + γ) Σ_{i=1}^N x_i^a (x_i^a)^T and
        V = Σ_{i=1}^N [ (y_i^e − D^e x_i^d) + γ (y_i^e − D^e δ_k(x_i^d)) ] (x_i^a)^T.
16
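The closed-form update on this slide translates directly into code; the function and argument names are assumptions, and a real implementation would regularize U against singularity:

```python
import numpy as np

def update_A(Ye, De, Xd, Xa, Xk, gamma=1.0):
    """Closed-form dictionary update A = V U^{-1} with the codes X fixed.

    Ye: (d, N) external probes; De: external gallery; Xd, Xa: stacked codes
    over De and A; Xk: class-selected codes delta_k(x_d), one column per probe.
    """
    R1 = Ye - De @ Xd            # residuals of the reconstruction term
    R2 = Ye - De @ Xk            # residuals of the class-wise term
    V = (R1 + gamma * R2) @ Xa.T
    U = (1.0 + gamma) * (Xa @ Xa.T)
    return V @ np.linalg.inv(U)  # assumes U is invertible; add ridge otherwise
```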
The Testing Stage • Inputs: Query y Undersampled training data D (collected from the subjects of interest) Auxiliary dictionary A (learned from external data)
• We solve
  min_x ||y − [D, A] x||_2^2 + λ ||x||_1, where x = [x_d; x_a]
• Once x is obtained, recognition is achieved by
  identity(y) = arg min_k ||y − [D, A] [δ_k(x_d); x_a]||_2^2
17
Remarks
• Recall that our proposed formulation solves:
  min_{A, X} Σ_{i=1}^N ( ||y_i^e − [D^e, A] x_i||_2^2 + λ ||x_i||_1 + γ ||y_i^e − D^e δ_k(x_i^d) − A x_i^a||_2^2 )
  (reconstruction error + sparsity constraint + class‐wise reconstruction error for class k)
• Comparisons with SRC‐based Approaches:

  Method | Corrupted Training Data | Undersampled Training/Gallery Set | Dictionary Learning
  SRC    |           X             |                X                  |         X
  LR     |           O             |                X                  |         X
  ESRC   |           O             |                O                  |         X
  Ours   |           O             |                O                  |         O
[SRC] Wright et al., Sparse representation‐based classification, PAMI’09 [LR] Chen et al., Low‐rank matrix recovery with structural incoherence, CVPR’12 [ESRC] Deng et al., Extended SRC, PAMI’12 18
Outline • Introduction Undersampled Face Recognition • Related Work Sparse Representation based Classification (SRC) & ESRC Dictionary Learning for Sparse Representation • Learning of Auxiliary Dictionary • Experiments • Conclusions
19
Extended Yale B • 38 subjects available, each with about 64 images • External data for training the auxiliary dictionary: 6 subjects (disjoint from the 32 subjects of interest), 64 images each
• Gallery set: 32 subjects (of interest), each has 3 images considered available
• Test set: 32 subjects, each has about 61 images for evaluation Training Images
Test Images
20
Comparisons with ESRC using Different Features
ESRC randomly selects elements from external data as A. 21
Performance Comparisons with ESRC, K‐SVD, SRC, and NN LBP Features
Pixel‐based Features
For K‐SVD, ESRC, and our approach, the number of external subjects is set as 2, while NN and SRC do not utilize any external data. 22
AR Database • External Data (for training the auxiliary dictionary) 20 subjects (disjoint from the 80 subjects of interest), each has 13 images
• Gallery Set 80 subjects; only 1 neutral image per subject is available.
• Test Set 80 subjects, each has about 12 images Single Gallery Image
Test Images
23
Performance Comparisons with Different Auxiliary Dictionary Sizes
Note that PCRC and SRC do not utilize external data for recognition, while AGL was not designed for handling occluded face images. 24
Performance Comparisons with Different Feature Dimensions LBP Features
Pixel‐based Features
For K‐SVD, ESRC, and our approach, the number of external subjects is set as 2, while NN and SRC do not utilize any external data. 25
Outline • Introduction Undersampled Face Recognition • Related Work Sparse Representation based Classification (SRC) & ESRC Dictionary Learning for Sparse Representation • Learning of Auxiliary Dictionary • Experiments • Conclusions
26
Conclusions • We propose to learn an auxiliary dictionary from external data for undersampled face recognition. • The auxiliary dictionary is derived for handling intra‐class variations, and is more compact and representative than the one used by ESRC and related approaches. • Together with the ESRC‐based formulation, our proposed method achieved improved recognition results over state‐of‐the‐art SRC‐based approaches.
27
Thank You!
28