Introduction
Systems Description
Experiments
Conclusions and Future Work
Bi-Modal Authentication in Mobile Environments Using Session Variability Modelling
Petr Motlicek1, Laurent El Shafey1,2, Roy Wallace1, Christopher McCool1 and Sébastien Marcel1
1 Idiap Research Institute, Switzerland
2 École Polytechnique Fédérale de Lausanne, Switzerland
Tsukuba, ICPR’2012, November 13th
Bi-Modal Authentication in Mobile Environments Using Session Variability Modelling, Motlicek et al., ICPR’2012
Outline
1. Introduction
2. Systems Description
3. Experiments
4. Conclusions and Future Work
Smartphones and privacy (1/2)
From mobile phones to smartphones
Privacy is becoming an increasing concern:
- Contacts
- Pictures
- E-mails
- Web / social media (Facebook, Twitter, etc.)
- Security systems activation/deactivation
- ...
Smartphones and privacy (2/2)
Mobile phone security
- Data encryption
- Authentication
Mobile phone authentication
- 4-digit passcode (de facto method)
- Biometric
Related Work
Task
- Bi-modal (face and speech) authentication in mobile environments
Related work
1. Use of small in-house databases [Kim et al., 2010], [Qian et al., 2010] and [Rao et al., 2010]
2. ICPR competition on MOBIO Phase I (subset of MOBIO) [Marcel et al., 2010]
3. Bi-modal evaluation with the hardware constraints of a Nokia N900 phone using simplistic algorithms [McCool et al., 2012]
Contributions
- Use of session variability modelling techniques in both modalities
- Comparison of session variability modelling techniques for speaker authentication in mobile environments
- Evaluation performed on the largest bi-modal mobile authentication database available (the MOBIO database)
- Achieves the most accurate results on the complete MOBIO authentication protocols
- Relies on the open-source library Bob: http://www.idiap.ch/software/bob
System Overview
- Speech: Energy-based Voice Activity Detection, MFCC features -> Session Variability Modelling
- Face: Face Normalization, Block-based features -> Session Variability Modelling
- Score Fusion (combines the speech and face scores)
Feature Extraction
Speech modality feature vectors
- Energy-based Voice Activity Detection
- MFCC (+energy): 25 ms frames, 10 ms overlap, 24-band filter bank -> 20 coefficients
- + Deltas, + Double Deltas
Face modality feature vectors
- Face normalization: 80x64 pixels
- Block decomposition: 12x12 pixels, 11 pixels overlap
- DCT: 44 coefficients
From each image/speech utterance, a set of feature vectors $X = \{x_1, x_2, \ldots, x_K\}$ is obtained.
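As a rough illustration of the face front-end figures on this slide, the number K of block feature vectors per image can be worked out from the image size, block size and overlap. This is a sketch under assumptions: the function name `count_blocks` is hypothetical, the blocks are assumed to be taken at every valid position without padding, and reading 80 as the height is a guess (the count is the same either way).

```python
def count_blocks(height, width, block, overlap):
    """Number of overlapping square blocks that fit in a height x width image.

    Assumes no padding: a block is kept only if it lies fully inside the image.
    """
    step = block - overlap  # 12-pixel blocks with 11 pixels overlap move 1 pixel at a time
    rows = (height - block) // step + 1
    cols = (width - block) // step + 1
    return rows * cols


# 80x64 normalized face, 12x12 blocks, 11 pixels overlap (values from the slide)
K = count_blocks(80, 64, block=12, overlap=11)
print(K, "blocks, each described by 44 DCT coefficients")
```

Under these assumptions each face image yields a few thousand 44-dimensional vectors, which is why the face modality can be modelled with the same GMM machinery as the speech frames.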
10/26
Introduction
Systems Description
Experiments
Conclusions and Future Work
Baseline: Gaussian Mixture Model (GMM) (1/2)
Generic model (training)
- Train a Universal Background Model (UBM)
- Expectation-Maximisation
- Point $m$ in the GMM mean super-vector space
Client-specific model (enrolment)
- Adapt the client model $m_i$ using the UBM $m$ as a prior
- MAP adaptation (mean-only)
- Mean super-vector space: $m_i = m + d_i$, where $d_i$ is the client-specific offset
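The enrolment step can be sketched with the textbook mean-only MAP update for a diagonal-covariance GMM. This is an illustrative sketch, not the authors' implementation: the function names, the toy one-dimensional UBM and the relevance factor of 4 are all assumptions (the slides do not give a relevance factor).

```python
import math


def responsibilities(x, weights, means, variances):
    """Posterior probability of each UBM component given one feature vector."""
    logs = []
    for w, mean, var in zip(weights, means, variances):
        ll = math.log(w)
        for xi, mu, v in zip(x, mean, var):
            ll -= 0.5 * (math.log(2 * math.pi * v) + (xi - mu) ** 2 / v)
        logs.append(ll)
    top = max(logs)  # subtract the maximum for numerical stability
    unnorm = [math.exp(l - top) for l in logs]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def map_adapt_means(X, weights, means, variances, relevance=4.0):
    """Mean-only MAP adaptation; returns the client means m_i (so d_i = m_i - m).

    `relevance` is an assumed hyper-parameter, not a value from the slides.
    """
    C, D = len(means), len(means[0])
    n = [0.0] * C                      # zeroth-order statistics
    f = [[0.0] * D for _ in range(C)]  # first-order statistics
    for x in X:
        gamma = responsibilities(x, weights, means, variances)
        for c in range(C):
            n[c] += gamma[c]
            for d in range(D):
                f[c][d] += gamma[c] * x[d]
    # (f + r * m) / (n + r) equals alpha * E[x] + (1 - alpha) * m with alpha = n / (n + r)
    return [[(f[c][d] + relevance * means[c][d]) / (n[c] + relevance)
             for d in range(D)] for c in range(C)]


# Toy 1-D UBM with two components and three enrolment vectors (made-up numbers)
ubm_w, ubm_m, ubm_v = [0.5, 0.5], [[-1.0], [1.0]], [[1.0], [1.0]]
client_means = map_adapt_means([[1.5], [2.0], [2.5]], ubm_w, ubm_m, ubm_v)
print(client_means)
```

With little enrolment data the adapted means stay close to the UBM, which is the role of the prior in the MAP update.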
Baseline: Gaussian Mixture Model (GMM) (2/2)
Score
- Given a test sample, extract a set of feature vectors $X_t$
- Compute the average log-likelihood ratio score between the client model $m_i$ and the UBM $m$:
\[
\mathrm{score}_{X_t, m_i} = \sum_j \log \frac{p(x_t^j \mid m_i)}{p(x_t^j \mid m)}
\]
- In practice, an approximation known as linear scoring is used
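The log-likelihood ratio score can be sketched directly for a small diagonal-covariance GMM. This is a sketch under assumptions: the function names and toy numbers are hypothetical, the client model is assumed to share the UBM weights and variances (as with mean-only MAP adaptation), and the sum over frames is divided by the number of frames to obtain the average the slide text mentions. It computes the exact ratio, not the linear-scoring approximation.

```python
import math


def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of one feature vector under a diagonal-covariance GMM."""
    logs = []
    for w, mean, var in zip(weights, means, variances):
        ll = math.log(w)
        for xi, mu, v in zip(x, mean, var):
            ll -= 0.5 * (math.log(2 * math.pi * v) + (xi - mu) ** 2 / v)
        logs.append(ll)
    top = max(logs)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(l - top) for l in logs))


def llr_score(X_t, client_means, ubm_means, weights, variances):
    """Average log-likelihood ratio between the client model and the UBM."""
    total = sum(
        gmm_log_likelihood(x, weights, client_means, variances)
        - gmm_log_likelihood(x, weights, ubm_means, variances)
        for x in X_t
    )
    return total / len(X_t)


# Toy 1-D example (made-up numbers): the client differs from the UBM in one mean
w, v = [0.5, 0.5], [[1.0], [1.0]]
ubm, client = [[-1.0], [1.0]], [[-1.0], [1.5]]
print(llr_score([[1.6], [1.4]], client, ubm, w, v))  # probe close to the client model
print(llr_score([[0.9]], client, ubm, w, v))         # probe better explained by the UBM
```

A positive score means the client model explains the probe better than the UBM; the decision threshold is set separately on development data.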
Session Variability Modelling (1/5)
GMM limitation
- All of $d_i$ is considered to be client-specific information, but it probably contains noise as well
Session variability modelling
- Introduce a term to describe the variations between samples of the same client
- Previously, $m_i = m + d_i$
- Now, the observations of the j-th sample of client i, $X_{i,j}$, are assumed to be drawn from a distribution with mean $\mu_{ij}$, where $\mu_{ij} = m + u_{ij} + d_i$
Session Variability Modelling (2/5)
[Figure: GMM mean supervector space, showing observations of Alice, observations of Bob, the direction of session variation and the direction of identity variation]
Session Variability Modelling (3/5)
Session variability modelling
- The observations of the j-th sample of client i, $X_{i,j}$, are assumed to be drawn from a distribution with mean $\mu_{ij}$, where $\mu_{ij} = m + u_{ij} + d_i$
Inter-Session Variability Modelling (ISV)
- Constrain session variations to a linear subspace $U$: $\mu_{ij} = m + U x_{ij} + d_i$
Joint Factor Analysis (JFA)
- Constrain identity variations to a linear subspace $V$: $\mu_{ij} = m + U x_{ij} + V y_i + \hat{d}_i$
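To make the two decompositions concrete, the sketch below assembles the session-dependent mean super-vector for ISV and for JFA from toy values. Everything here is illustrative: the helper names, the three-dimensional super-vector and the rank-1 subspaces are assumptions chosen only so the arithmetic is easy to follow.

```python
def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]


def isv_mean(m, U, x_ij, d_i):
    """ISV: mu_ij = m + U x_ij + d_i."""
    return [a + b + c for a, b, c in zip(m, matvec(U, x_ij), d_i)]


def jfa_mean(m, U, x_ij, V, y_i, d_hat_i):
    """JFA: mu_ij = m + U x_ij + V y_i + d_hat_i."""
    return [a + b + c + d
            for a, b, c, d in zip(m, matvec(U, x_ij), matvec(V, y_i), d_hat_i)]


# Toy values (made up): session variation along axis 0, identity variation along axis 1
m = [0.0, 0.0, 0.0]
U = [[1.0], [0.0], [0.0]]
V = [[0.0], [1.0], [0.0]]
d_i = [0.0, 0.5, 0.2]

mu_a = isv_mean(m, U, [0.3], d_i)   # client i, first session
mu_b = isv_mean(m, U, [-0.4], d_i)  # client i, second session
print(mu_a, mu_b)

# JFA splits the same client offset into a subspace part V y_i and a residual d_hat_i
mu_jfa = jfa_mean(m, U, [0.3], V, [0.5], [0.0, 0.0, 0.2])
print(mu_jfa)
```

In this toy setting the two sessions of the same client differ only along the direction spanned by U, which is the component the models aim to discount when comparing identities.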
Session Variability Modelling (4/5)
[Figure: GMM mean supervector space, showing observations of Alice, observations of Bob, the direction of session variation U and the direction of identity variation V]
Session Variability Modelling (5/5)
Model usage
1. Training: estimate the low-dimensional subspaces $U$ and $V$
2. Enrolment: estimate the identity latent variable $y_i$
3. Test time: estimate the session latent variable $x_{ij}$
Parameter estimation
- Factor Analysis-like models
- $U$ and $V$ subspaces learnt with an EM algorithm
  - E-step: estimate the latent variables $x_{ij}$ and $y_i$
  - M-step: maximisation using a Maximum-Likelihood criterion
Latent variable estimation ($x_{ij}$ and $y_i$)
- Maximum a posteriori (MAP) to jointly estimate them
Score Fusion
Linear fusion
- $s_{fused} = w_{face} s_{face} + w_{speech} s_{speech} + w_{bias}$
Sum rule
- $w_{face} = w_{speech} = 1$, $w_{bias} = 0$
Linear logistic regression
- Weights $w_{face}$, $w_{speech}$ and $w_{bias}$ learnt using logistic regression
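Both fusion variants can be sketched in a few lines. This is a minimal illustration, not the authors' training procedure: the function names, the batch gradient-descent optimiser, its learning rate and epoch count, and the toy development scores are all assumptions.

```python
import math


def fuse(s_face, s_speech, w_face=1.0, w_speech=1.0, w_bias=0.0):
    """Linear fusion; the default weights give the sum rule."""
    return w_face * s_face + w_speech * s_speech + w_bias


def train_llr(scores, labels, lr=0.1, epochs=2000):
    """Fit (w_face, w_speech, w_bias) by batch gradient descent on the logistic loss.

    scores: list of (s_face, s_speech) pairs; labels: 1 for genuine, 0 for impostor.
    """
    w = [0.0, 0.0, 0.0]
    n = len(scores)
    for _ in range(epochs):
        grad = [0.0, 0.0, 0.0]
        for (sf, ss), y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(-fuse(sf, ss, *w)))  # predicted P(genuine)
            err = p - y
            grad[0] += err * sf
            grad[1] += err * ss
            grad[2] += err
        w = [wi - lr * g / n for wi, g in zip(w, grad)]
    return w


# Made-up development scores: four genuine and four impostor trials
dev_scores = [(2.0, 1.5), (1.0, 2.5), (1.5, 0.5), (-0.2, 1.0),
              (-1.0, -0.5), (-2.0, 0.3), (0.2, -1.5), (0.5, -0.2)]
dev_labels = [1, 1, 1, 1, 0, 0, 0, 0]

w_face, w_speech, w_bias = train_llr(dev_scores, dev_labels)
print("sum rule:", fuse(1.0, 2.0))
print("learnt weights:", w_face, w_speech, w_bias)
```

The learnt weights let the more reliable modality count for more, whereas the sum rule treats the face and speech scores equally.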
MOBIO database (http://www.idiap.ch/dataset/mobio)
Features
- Bi-modal (face and speech) database (Phase I + Phase II)
- Publicly available and free
- 3 sets for training, development and test
- About 50 clients in each set, with 192 videos for each client
- Training data: 9,600 images and audio samples of roughly 20 seconds
- Rigorous gender-dependent protocols (more males than females)
Bi-modal Identification on the MOBIO database (1/2)

Modality  System                            Male Dev  Male Test  Female Dev  Female Test
Face      [McCool et al., 2012]             21.6      24.1       20.9        28.2
Face      GMM                                9.2      10.5       10.7        20.4
Face      ISV                                3.6       7.5        6.7        12.2
Face      JFA                                4.0       7.3        7.7        13.2
Speech    [McCool et al., 2012]             18.0      18.2       15.1        17.7
Speech    GMM                               12.6      15.8       20.0        22.6
Speech    ISV                                8.2       8.9       11.9        15.3
Speech    JFA                               15.5      14.7       23.1        19.4
Fusion    [McCool et al., 2012]             10.9      11.9       10.5        13.3
Fusion    ISV (sum rule)                     2.1       3.3        3.8        11.0
Fusion    ISV (linear logistic regression)   1.2       2.6        2.3         9.7

Table: Recognition error rates in % (equal error rate (EER) on the Dev set, half total error rate (HTER) on the Test set)
Bi-modal Identification on the MOBIO database (2/2)
[Figure: two DET plots with axes FAR (%) and FRR (%), each showing curves for Face, Speech and Fusion]
Figure: Male (left) and Female (right) Test DET for the ISV system
Conclusions and Future Work
Conclusions
- Presented a state-of-the-art bimodal authentication system robust to challenging mobile environments
- Use of session variability modelling techniques
- Experiments on the MOBIO database demonstrated relative improvements of at least 30% for the fused system
- Use of the open-source library Bob: http://www.idiap.ch/software/bob
Future Work
- Gender-dependent training
- Additional training data
- Optimisation of session variability modelling algorithms to run on mobile hardware
Bibliography
[Kim et al., 2010]: Person authentication using face, teeth and voice modalities for mobile device security, IEEE Trans. Consum. Electron., 2010.
[Qian et al., 2010]: Biometric authentication system on mobile personal devices, IEEE Trans. Instrum. Meas., 2010.
[Rao et al., 2010]: Robust speaker recognition on mobile devices, Proceedings of Intl. Conf. Signal Processing and Communications, 2010.
[Marcel et al., 2010]: On the results of the first mobile biometry (MOBIO) face and speaker verification evaluation, Proceedings of Intl. Conf. on Pattern Recognition contests, 2010.
[McCool et al., 2012]: Bi-modal person recognition on a mobile phone: using mobile phone data, Proceedings of IEEE ICME Workshop on Hot Topics in Mobile Multimedia, 2012.
[Anjos et al., 2012]: Bob: a free signal processing and machine learning toolbox for researchers, Proceedings of ACM Multimedia, 2012.
Thank You!
Idiap Research Institute
École Polytechnique Fédérale de Lausanne
[email protected]
Appendix
Performance Evaluation
- FAR: False Alarm Rate
- FRR: False Rejection Rate
- Equal Error Rate (EER): point where FAR = FRR
- Half Total Error Rate (HTER): $\mathrm{HTER} = \frac{\mathrm{FAR} + \mathrm{FRR}}{2}$
Reported performances
1. On the development set, compute the threshold which gives the EER (FAR = FRR = HTER)
2. On the test set, apply this threshold and report the HTER
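The two-step reporting procedure can be sketched as follows. The function names and toy scores are hypothetical, and two conventions are assumed rather than taken from the slides: a trial is accepted when its score is at or above the threshold, and the EER threshold is searched only among the observed development scores.

```python
def far_frr(impostor, genuine, threshold):
    """FAR and FRR at a threshold (assumed convention: accept when score >= threshold)."""
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def eer_threshold(impostor, genuine):
    """Observed score at which |FAR - FRR| is smallest."""
    def gap(t):
        far, frr = far_frr(impostor, genuine, t)
        return abs(far - frr)
    return min(sorted(set(impostor) | set(genuine)), key=gap)


def hter(impostor, genuine, threshold):
    """Half total error rate: (FAR + FRR) / 2."""
    far, frr = far_frr(impostor, genuine, threshold)
    return (far + frr) / 2


# Made-up scores for illustration
dev_imp, dev_gen = [-2.0, -1.0, -0.5, 0.4], [-0.1, 1.0, 1.5, 2.0]
test_imp, test_gen = [-1.5, -0.2, 0.1, 0.3], [0.2, 0.9, 1.2, 2.5]

t = eer_threshold(dev_imp, dev_gen)         # step 1: threshold fixed on the development set
print("Dev EER:", hter(dev_imp, dev_gen, t))
print("Test HTER:", hter(test_imp, test_gen, t))  # step 2: same threshold on the test set
```

Fixing the threshold on the development set keeps the test figure unbiased, since no test score is used to tune the decision point.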
ISV vs JFA
Inter-session variability model (ISV)
- $\mu_{ij} = m + U x_{ij} + D z_i$
- $U$ is a sub-space of directions of session variation, and $x_{ij}$ is zero mean, unit standard deviation.
Joint factor analysis (JFA)
- $\mu_{ij} = m + V y_i + U x_{ij} + \hat{D} z_i$
- $V$ is a sub-space of directions of client variation, $y_i$ is zero mean, unit standard deviation, $\hat{D}$ is a diagonal matrix that is learnt from the training data, and $z_i$ is zero mean, unit standard deviation.
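Issues to Resolve for JFA
- Estimating the latent variables is done using a Gauss-Seidel loop.
- The parameters are estimated separately using fixed versions of the others.
- The order of the training is important: V, U, then D.
\begin{equation}
\begin{aligned}
\bar{\lambda}_i &= \arg\max_{\lambda_i} \; p(\lambda_i \mid O_{i,1}, O_{i,2}, \ldots, O_{i,J_i}) \\
&= \arg\max_{\lambda_i} \; p(O_{i,1}, O_{i,2}, \ldots, O_{i,J_i} \mid \lambda_i)\, p(\lambda_i) \\
&= \arg\max_{\lambda_i} \; p(z_i)\, p(y_i) \prod_{j=1}^{J_i} p(O_{i,j} \mid x_{ij}, y_i, z_i)\, p(x_{ij})
\end{aligned}
\tag{1}
\end{equation}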
JFA vs PLDA
- JFA takes point estimates of $y_i$, $x_{ij}$ and $z_i$. PLDA would integrate out the uncertainty of these variables.
- JFA is applied to a GMM/HMM framework. PLDA is applied to feature vectors.
\begin{equation}
\mu_{ij} = m + V y_i + U x_{ij} + \hat{D} z_i
\tag{2}
\end{equation}
\begin{equation}
o_{ij} = \mu + V y_i + U x_{ij} + \epsilon_{ij}
\tag{3}
\end{equation}