Video-Based Face Detection Using Dynamic Template Matching

A Thesis Submitted to the Council of the Faculty of Science and Science Education, School of Science, University of Sulaimani, in Partial Fulfillment of the Requirements for the Degree of Master of Science in Computer Science
By
Yusra Ahmed Salih Higher Diploma in Computer Science, 2008 Supervised By
Dr. Aree Ali Mohammed Lecturer
January 2012
Rêbendan 2711
Dedication I dedicate this work to: My lovely family My dear parents My loving sisters and brothers My faithful friends, the companions of the long road … With my love Yusra A. Salih
Acknowledgements Praise be to Allah for His grace in allowing me to finish this work. First of all, I would like to express my deepest gratitude and appreciation to my supervisor, Dr. Aree Ali Mohammed, for his valuable advice, academic guidance, and patience throughout my thesis preparation at the University of Sulaimani. I would like to thank the Dean of the Computer Science Institute, Dr. Soran A. Mohammed, and the Head of the Network Department, Mr. Harith Raad. I would like to thank the Dean of the Faculty of Science and Education Sciences, University of Sulaimani, and the Head of the Computer Science Department, Dr. Kamaran Hama Ali. I would like to express my sincere gratitude to Prof. Dr. Astrid Laubenheimer for her invaluable guidance, encouragement, and important feedback throughout this thesis. A great deal of thanks goes to Dr. Joachim Lembach, Director of the International Office, Mr. Hans Wünstel, and Mr. Rebwar Omer for their kindness and support during my stay in Germany. I wish to express my gratitude to the members of the LFM Laboratory and the students of the Information Technology Department, Karlsruhe University of Applied Sciences, Germany, for participating in our test videos and face database. It is impossible to remember everyone, and I apologize to those I have inadvertently left out. Lastly, thank you all!
Abstract Video-based human face processing techniques, including face detection, tracking, and recognition, have attracted a great deal of research interest because of their value in applications such as video surveillance, structuring, indexing, retrieval, and summarization. To improve the results of the Viola-Jones face detection method on video, different algorithmic strategies are designed and implemented. The first improvement minimizes type I false positive alarms using a manual thresholding algorithm. To reduce missed detections (false negatives), template-matching-based face tracking is used. Finally, a dynamic thresholding algorithm is applied to minimize type II false positive alarms. The detection time of the hybrid detection and tracking phases is optimized by embedding and implementing a region of interest approach. In this thesis, a hybrid face detection and tracking scheme based on dynamic template matching is proposed. At first, detection is performed in each frame, and then the tracker is applied to the detected faces in subsequent frames. The test results of the proposed face detection method on two different video scenarios indicate that the hybrid face detector and tracker performs much better than the Viola-Jones detector in terms of detection rate and detection time. For optimal parameters, the detection rate of the proposed hybrid method is 97.2% and the average detection time is 151.217 ms. The disadvantage of the proposed FDM is that it is not applicable to video scenes involving fast person movement or varying lighting environments, and it does not handle multiple facial positions.
Table of Contents

Chapter One : General Introduction
1.1 Introduction
1.2 Face Detection
1.3 Object Tracking
1.4 Face Recognition
1.5 Literature Survey
1.6 Aim of the Thesis
1.7 Research Limitation
1.8 Thesis Layout

Chapter Two : Face Detection, Tracking and Recognition Concepts
2.1 Introduction
2.2 Still Image Face Detection
2.3 Face Detection Methods
2.3.1 Appearance-Based Methods
2.4 The Viola-Jones Face Detector Method
2.4.1 Integral Image
2.4.2 The Modified AdaBoost Algorithm
2.4.3 The Cascaded Classifier
2.4.4 Implementation of the Viola-Jones Detector using OpenCV
2.5 Face Tracking
2.5.1 Basic Mathematical Framework
2.5.2 Face Tracking Methods
2.5.2.1 Template Matching Approach
2.6 Face Recognition
2.6.1 Face Recognition Techniques
2.6.1.1 Principal Component Analysis (PCA)
2.6.1.2 Mathematical Theory
2.6.2 Face Recognition Applications

Chapter Three : Design and Implementation of the Proposed Face Detection Method
3.1 Introduction
3.2 Face Detector Scheme
3.3 Face Tracking Scheme
3.3.1 False Positive Reduction
3.3.1.1 Manual Threshold Algorithm
3.3.1.2 Dynamic Threshold Algorithm
3.3.2 False Negative Reduction
3.4 Face Recognition Scheme

Chapter Four : Test Evaluation
4.1 Introduction
4.2 Video Scenarios Description
4.3 Test Results
4.3.1 Effect of Involved Parameters (Scenario 1)
4.3.1.1 Window Size
4.3.1.2 Neighbor Threshold
4.3.1.3 Scale Factor
4.3.1.4 Result Discussion
4.3.2 Effect of Involved Parameters (Scenario 2)
4.3.2.1 Window Size
4.3.2.2 Neighbor Threshold
4.3.2.3 Scale Factor
4.3.2.4 Scenario 2 Results Evaluation
4.4 Improved Detection Rates (Scenario 1)
4.4.1 Reduce False Positive Type I Using Manual Thresholding
4.4.2 Reduce False Negative Using Template Matching
4.4.3 Reduce False Positive Type II Using Dynamic Thresholding
4.4.4 Improvement of Detection Time
4.4.4.1 Face Detection Time
4.4.4.2 Face Tracking Time
4.5 Face Recognition

Chapter Five : Conclusions and Suggestions for Future Work
5.1 Conclusions
5.2 Suggestions for Future Work

References
List of Figures
1.1 Video based face recognition system
2.1 General scheme for face detection methods
2.2 The integral image
2.3 Sum calculation
2.4 The different types of features
2.5 The classifier cascade
2.6 Steps of the Haar training process
2.7 Classification of face recognition methods
2.8 a) An example of dominant Eigenfaces; b) An average face
2.9 Calculating Eigenfaces
2.10 Face verification procedure using Mt Eigenfaces
3.1 General block diagram of the proposed FDM
3.2 Face detection scheme
3.3 Face detection process
3.4 Face tracking process
3.5 False positive alarms
3.6 False positive alarm between two successive frames
3.7 Window's face width values after running the face detector
3.8 Average window's face width
3.9 Manual threshold process
3.10 Unsolved types of false positive alarms
3.11 Euclidian distance measure
3.12 Dynamic threshold algorithm
3.13 Template matching based tracking
3.14 The framework of face recognition process
4.1 Video scenario 1
4.2 Video scenario 2
4.3 Neighbor threshold setting to zero
4.4 DR versus WS
4.5 DR versus NT
4.6 DR versus SF
4.7 Training phase in face recognition system
4.8 Test phase in face recognition system
List of Tables
4.1 Video scenarios description
4.2 Effect of involved parameters
4.3 Effect of window size on detection rate and time (Scale factor = 1.2 and Neighbor threshold = 1)
4.4 Effect of window size on detection rate and time (Scale factor = 2.2 and Neighbor threshold = 2)
4.5 Effect of window size on detection rate and time (Scale factor = 3.2 and Neighbor threshold = 3)
4.6 Effect of neighbor threshold on detection rate and time (Scale factor = 1.2 and Window size = 20×20)
4.7 Effect of neighbor threshold on detection rate and time (Scale factor = 2.2 and Window size = 30×30)
4.8 Effect of neighbor threshold on detection rate and time (Scale factor = 3.2 and Window size = 40×40)
4.9 Effect of scale factor on detection rate and time (Neighbor threshold = 1 and Window size = 20×20)
4.10 Effect of scale factor on detection rate and time (Neighbor threshold = 2 and Window size = 30×30)
4.11 Effect of scale factor on detection rate and time (Neighbor threshold = 2 and Window size = 40×40)
4.12 Effect of window size on detection rate and time (Scale factor = 1.2 and Neighbor threshold = 1)
4.13 Effect of window size on detection rate and time (Scale factor = 2.2 and Neighbor threshold = 2)
4.14 Effect of window size on detection rate and time (Scale factor = 3.2 and Neighbor threshold = 3)
4.15 Effect of neighbor threshold on detection rate and time (Scale factor = 1.2 and Window size = 20×20)
4.16 Effect of neighbor threshold on detection rate and time (Scale factor = 2.2 and Window size = 30×30)
4.17 Effect of neighbor threshold on detection rate and time (Scale factor = 3.2 and Window size = 40×40)
4.18 Effect of scale factor on detection rate and time (Neighbor threshold = 1 and Window size = 20×20)
4.19 Effect of scale factor on detection rate and time (Neighbor threshold = 2 and Window size = 30×30)
4.20 Effect of scale factor on detection rate and time (Neighbor threshold = 3 and Window size = 40×40)
4.21 False positive type I reduction
4.22 False negative reduction
4.23 False positive type II reduction
4.24 Detection time with and without ROI
4.25 Tracking time with and without ROI
List of Abbreviations
ATM
Automated Teller Machine
CCTV
Closed Circuit Television
DNA
Deoxyribonucleic Acid
DR
Detection Rate
DT
Detection Time
EGM
Elastic Graph Matching
FDM
Face Detection Method
FN
False Negative
FP
False Positive
GDT
Generalized Distance Transform
HMM
Hidden Markov Model
ICA
Independent Component Analysis
LDA
Linear Discriminant Analysis
NT
Neighbor Threshold
OM
Orientation Map
OpenCV
Open Source Computer Vision
PCA
Principal Component Analysis
PIN
Personal Identification Number
ROI
Region of Interest
SAD
Sum Absolute Difference
SDM
Square Difference Measure
SF
Scale Factor
SIM
Subscriber Identification Module
SSD
Sum Squared Difference
SVM
Support Vector Machine
TP
True Positive
WS
Window Size
XML
Extensible Markup Language
Chapter 1
General Introduction
Chapter One General Introduction 1.1 Introduction The use of different biometric systems is becoming commonplace in our society. To determine a person's identity, several characteristics can be analyzed: for instance, physical features such as fingerprints, iris, retina, face, hand geometry, hand veins, and DNA, and behavioral features such as gait or signature. Face recognition has received significant attention during the last two decades, and many researchers study its various aspects. There are at least two reasons for this: the first is a wide range of commercial and security applications, and the second is the availability of feasible computer technology to develop and implement applications that demand strong computational power. Today, automatic recognition of human faces is a field that gathers researchers from many disciplines, such as image processing, pattern recognition, computer vision, graphics, and psychology [Mar, 09]. Human face research mainly includes human face recognition and human face detection. Human face recognition is defined as identifying or verifying a person from a digital still image or video image, while human face detection is defined as determining the locations and sizes of human faces in images.
In order to perform face recognition, one first needs to build a database containing many human face images. To verify an image against the database, the system needs to detect the human face within the image and then analyze the similarity among all the images in the database. Up until now, no one has developed a fully mature human face
recognition system with high accuracy and speed. In any human face processing system, however, the first step is to detect human faces, so face detection became an independent research topic in its own right because of the requirements of face recognition systems [GU, 08]. Figure (1.1) shows the whole process of a video-based face recognition system.
Fig (1.1) Video based face recognition system.
1.2 Face Detection Face detection systems identify faces in images and video sequences using computers. An ideal face detection system should be able to identify and locate all faces regardless of their position, scale, orientation, lighting
conditions, expressions, and so on. Due to the large intra-class variations in facial appearance, face detection has been a challenging problem in the field of computer vision [Jor, 06]. Face detection is the first stage of a face recognition system. A great deal of research has been done in this area, but most of it is efficient and effective only for still images, so it cannot be applied directly to video sequences. In video scenes, a human face can take on unlimited orientations and positions, so its detection poses a variety of challenges to researchers. Generally, there are three main approaches to video-based face detection. The first is frame-based detection; in this approach, many traditional still-image methods can be applied, such as the statistical modeling method [Mog, 97], the neural network-based method [Row, 98], the boosting-based method [Vio, 01], and color-based face detection [Hsu, 02]. However, the main drawback of this approach is that it ignores the temporal information provided by the video sequence. The second approach integrates detection and tracking: the face is detected in the first frame and then tracked through the whole sequence. Since detection and tracking are independent, and information from only one source is used at a time, loss of information is unavoidable. Finally, instead of detecting in each frame independently, the temporal approach exploits temporal relationships between frames to detect multiple human faces in a video sequence. In general, such a method consists of two phases, namely detection and prediction-by-update tracking. This helps to stabilize detection and makes it less sensitive to thresholds compared to the other two detection categories [Wan, 09].
1.3 Object Tracking Object tracking is an important task within the field of computer vision. The proliferation of high-powered computers, the availability of high-quality and inexpensive video cameras, and the increasing need for automated video analysis have generated a great deal of interest in object tracking algorithms. There are three key steps in video analysis: detection of interesting moving objects, tracking of such objects from frame to frame, and analysis of object tracks to recognize their behavior. In its simplest form, tracking can be defined as the problem of estimating the trajectory of an object in the image plane as it moves around a scene. Additionally, depending on the tracking domain, a tracker can also provide object-centric information, such as the orientation, area, or shape of an object. Object tracking approaches can be separated into three main groups [Yil, 06]:
1. Point Tracking. An object is represented by a number of points, and correspondences between these points over consecutive frames are tracked. The points are combined into a model of the object, and the correspondences are evaluated under a number of constraints, such as motion models. An example of point tracking is the Kalman filter [Bro, 86].
2. Kernel Tracking. The kernel refers to the object's shape and appearance; for example, the kernel can be a rectangular template or an elliptical shape with an associated histogram. This group of object tracking methods computes the motion of an object in order to track it from one frame to the next. The appearance models can be separated into template-based and density-based models. Popular examples of this method are Template Tracking [Hag, 96] and Mean-shift Tracking [Com, 03].
3. Silhouette Tracking. The object is tracked via estimation of the object region in each frame. This can be done by shape matching or contour
evolution. Recent approaches in this group are [Yil, 04] for contour-evolution-based methods and [Kang, 04] for shape matching.
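To make the kernel tracking idea concrete, the sketch below tracks a rectangular template between frames with an exhaustive Sum of Absolute Differences (SAD) search in a small neighborhood of the previous position. This is an illustrative NumPy sketch of the general technique, not the thesis implementation; the function name and parameters are chosen here for illustration.

```python
import numpy as np

def sad_track(frame, template, search_origin, search_radius):
    """Locate `template` in `frame` by exhaustive Sum of Absolute
    Differences (SAD) search around `search_origin` = (row, col).
    Returns the best top-left position and its SAD score."""
    th, tw = template.shape
    r0, c0 = search_origin
    best_score, best_pos = np.inf, (r0, c0)
    for r in range(max(0, r0 - search_radius),
                   min(frame.shape[0] - th, r0 + search_radius) + 1):
        for c in range(max(0, c0 - search_radius),
                       min(frame.shape[1] - tw, c0 + search_radius) + 1):
            patch = frame[r:r + th, c:c + tw].astype(np.int32)
            score = np.abs(patch - template.astype(np.int32)).sum()
            if score < best_score:
                best_score, best_pos = score, (r, c)
    return best_pos, best_score
```

Restricting the search to a radius around the previous position is what makes tracking cheaper than re-detecting over the whole frame.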
1.4 Face Recognition Face recognition is the ability to establish a person's identity based on facial characteristics. Automated face recognition requires various techniques from different research fields, including computer vision, image processing, pattern recognition, and machine learning. In a typical face recognition system, face images from a number of subjects are enrolled into the system as gallery data, and the face image of a test subject (probe image) is matched to the gallery data using a one-to-one or one-to-many scheme. One-to-one and one-to-many matching are called verification and identification, respectively. Face recognition has a wide range of applications, including law enforcement, civil applications, and surveillance systems. Most facial recognition algorithms demonstrate promising results on still images, including Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), and Elastic Graph Matching (EGM). Compared with still images, video can provide more information, such as temporal information. Therefore, video-based face recognition has gained more attention recently [Par, 09].
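The PCA-based recognition mentioned above (the eigenface approach, detailed later in Section 2.6) can be sketched briefly: gallery faces are projected into a low-dimensional subspace, and a probe is identified as the nearest gallery face in that subspace. This is a minimal NumPy illustration of the general idea, with hypothetical function names; it is not the implementation used in the thesis.

```python
import numpy as np

def train_eigenfaces(faces, k):
    """faces: (n_images, n_pixels) matrix of flattened gallery faces.
    Returns the mean face and the top-k eigenfaces (principal axes)."""
    mean = faces.mean(axis=0)
    centered = faces - mean
    # SVD of the centered data yields the principal components directly
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k]          # each row of vt[:k] is one eigenface

def project(face, mean, eigenfaces):
    """Weights of a face in the eigenface subspace."""
    return eigenfaces @ (face - mean)

def identify(probe, gallery_weights, mean, eigenfaces):
    """One-to-many matching: index of the nearest gallery face."""
    w = project(probe, mean, eigenfaces)
    dists = np.linalg.norm(gallery_weights - w, axis=1)
    return int(np.argmin(dists))
```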
1.5 Literature Survey A number of studies on face detection and recognition have been published within the last ten years. Some of the relevant published works are listed and annotated below:
1. P. Viola and M. Jones [Vio, 01] presented a machine learning approach for visual object detection which is capable of processing images extremely rapidly while achieving high detection rates. In the domain of face detection, the system yields detection rates comparable to the best previous systems. Used in real-time applications, the detector runs at 15 frames per second without resorting to image differencing or skin color detection.
2. Z. Jin et al. [Jin, 05] proposed a face detection algorithm which combines template matching with skin color information, segmenting eye-pair candidates via a linear transformation and making a face/non-face decision. Experiments showed that the proposed algorithm was effective and efficient in detecting frontal faces of different races and under different lighting conditions; however, it fails to detect faces with large pose changes and very small faces, owing to failures in segmenting the eye-pairs.
3. H. Lee and D. Kim [Lee, 07] proposed a robust face tracking method based on the condensation algorithm and importance sampling. They presented two separate trackers which used skin color and facial shape information as the measured cues, respectively. They also proposed an adaptive color model based on the condensation algorithm to handle illumination changes during face tracking. Experiments showed good robustness even when other skin-colored objects or other faces appeared in the image, and even with a cluttered background or changing illumination. Compared with other face tracking methods, which use only skin color as the measurement cue, it tracks better and more robustly.
4. E. Ben-Israel [Isra, 07] experimented with basic computer vision algorithms related to tracking and used them to implement a simple human tracker based on the OpenCV mean-shift tracker. Advances from this
work can be done in two major aspects: (1) automatic generation of the human template for mask generation, and (2) improvement of the tracking phase by choosing a smaller tracking region, possibly based on the segmentation data from the initialization phase.
5. K. Nallaperumal et al. [Nal, 08] proposed a new face detection technique which used a mixed Gaussian color model, adaptive thresholding, and template matching. The proposed face detection algorithm involves three stages: applying the mixed Gaussian color model, the adaptive threshold algorithm, and template matching. Experimental results showed that the algorithm was quite practical and faster in comparison to techniques such as neural networks.
6. H. Ryu et al. [Ryu, 08] proposed a face tracking approach based on template matching for a video indexing application. First, the face template is represented by two projection histograms of the face region, and matching methods are used to determine the candidate face region. Next, the facial features are extracted from the candidate face region along with a dissimilarity measure. The template used to determine the face region in the next frame is dynamically updated using the refined face region. Thus, the proposed method can adapt to scale changes with less computational cost.
7. Y. W. Wu and X. Y. Ai [Wu, 08] proposed a method to improve the performance of face detection in color images. This improvement is achieved by integrating the AdaBoost learning algorithm with skin color information. The complete system was tested on a variety of color images and compared with other relevant methods. Experimental results show that the
proposed system leads to competitive results and improves detection performance substantially.
8. N. D. Thanh et al. [Tha, 09] proposed a novel weighted template matching method employing a generalized distance transform (GDT) and an orientation map (OM). Based on this matching method, a two-stage human detection method consisting of template matching and Bayesian verification was developed. Experimental results showed that the proposed method can effectively reduce the false positive and false negative detection rates.
9. D. Chen et al. [Che, 10] proposed an approach using a skin-color model to integrate the AdaBoost algorithm, and then a subsidiary discriminant function to improve the detection process. They used the theory of Haar-like features, the integral image, and the AdaBoost algorithm proposed by Paul Viola, and then researched its improvement. Experimental results showed that the improved algorithm can detect faces in real time with a higher success rate and a lower false alarm rate.
10. Q. Zhao and H. Cai [Zha, 10] first used background-image differencing and the Kalman filter to track and extract the region of the human body, and then used the AdaBoost algorithm to detect the human face within that region. Finally, an improved Hidden Markov Model, named the Pseudo-two-dimensional Hidden Markov Model, was used for face image feature extraction and recognition. Experiments demonstrated that the method could extract the moving human body in the video and then detect and recognize the face effectively, and also showed that recognition was impacted by the lighting when the video was captured. In addition, the
speed of human motion is one of the influencing factors: movement that is too fast or too slow can lead to failure in extracting the body contour.
11. K. Ramirez et al. [Ram, 11] proposed a face recognition and verification algorithm based on histogram equalization to standardize the face illumination, thereby reducing the variations prior to feature extraction, using the image phase spectrum and principal component analysis, which allow the reduction of the data dimension without much information loss. Evaluation results showed the desirable features of the proposed scheme, reaching a recognition rate over 97% and a verification error lower than 0.003%.
1.6 Aim of the Thesis The aim of this research is to develop and implement an efficient real-time video-based face detection method. First, a face is detected in each frame using a machine learning algorithm, and then the face is tracked using dynamic template matching. The moving objects (faces) within the video scenes must be detected and tracked with minimal false positive and false negative alarms. This minimization leads to an improved face detection rate.
1.7 Research Limitation Although video-based face detection is a difficult problem in general, it can be addressed by imposing some limiting conditions, as follows: 1) The user is cooperative for identification: the user walks up to the camera and stands facing it, as required for creating the image database for single-face recognition.
2) Only limited variations in tilt and in-depth rotation from a frontal face are considered.
1.8 Thesis Layout In addition to the current chapter, this thesis comprises four further chapters: Chapter two, “Face Detection, Tracking and Recognition Concepts”, presents the theoretical background of face recognition and face detection. Moreover, object tracking, the Haar-like feature method, the Principal Component Analysis algorithm, and the dynamic template matching algorithm are detailed. Chapter three, “Design and Implementation of the Proposed Face Detection Method”, describes the design steps of face detection in video through block diagrams and flowcharts. The designed system is then applied in a real-time application. Chapter four, “Test Evaluation”, presents the test results reflecting the accuracy of the proposed face detection method. The test material consisted of two videos produced with invariant illumination and limited tilt and in-depth rotation from a frontal face. Chapter five, “Conclusions and Suggestions for Future Work”, is devoted to a presentation of some of the conclusions derived from the test results. The chapter also contains some suggestions for future work.
Chapter Two Face Detection, Tracking and Recognition Concepts 2.1 Introduction Face sequence detection can be divided into two main phases: face detection and face tracking. Most related studies address only one of the two topics, or combine them building on previous research. Face detection is usually performed on still images, and many tracking algorithms are presented with manually given initial object locations and focus on following the track of a single object. Thus, combining these two approaches is not as straightforward as it might seem, and it has some special demands that need to be taken into account when designing a full sequence detection algorithm [Sar, 10]. This chapter describes the methods and theory used to carry out the implementation of our video-based face recognition system in three parts. The first part describes the methods used to implement the Viola-Jones detector. The second part describes the principle of the template matching method, and the third part concerns the mathematical theory behind principal component analysis for face recognition.
2.2 Still Image Face Detection Given an arbitrary image, the aim of face detection is to determine whether there are any faces in the image and, if present, return the image location and extent of each
face [Yang, 02]. Face detection has a number of applications: it can be part of a face recognition system, a surveillance system, or a video-based human-computer interface. Efficient face detection at frame rate is an impressive goal; it is analogous to face tracking that requires no knowledge of previous frames. Fast face detection has an apparent application to practical face tracking in the sense that it can be used to initialize tracking [Sil, 05]. Given a single image, the goal of face detection is to identify all image regions which contain a face, regardless of position, orientation, and lighting conditions. Such a problem is challenging because faces are non-rigid and have a high degree of variability in scale, location, orientation (upright, rotated), and pose (frontal, profile). Facial expression, occlusion, and lighting conditions also change the overall appearance of faces. Two important characteristics of a trained face detector are its detection and error rates. The detection rate is defined as the ratio between the number of faces correctly detected and the number of faces determined by a human. In general, two types of errors can occur: false negatives, in which faces are missed, resulting in a low detection rate; and false positives, in which an image region is declared to be a face but is not. The detection and false positive rates are normally related, since one can often tune the parameters of the detection system to increase the detection rate while also increasing the number of false detections [Jor, 06].
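These definitions can be made concrete with a small scoring routine. The sketch below uses hypothetical function names, and the intersection-over-union matching criterion is an assumption of this illustration (the text defines only the rate itself): a detection counts as a true positive when it sufficiently overlaps a ground-truth face.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x, y, w, h)."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    iw = max(0, min(ax2, bx2) - max(a[0], b[0]))
    ih = max(0, min(ay2, by2) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union else 0.0

def detection_rate(detections, ground_truth, thresh=0.5):
    """DR = faces correctly detected / faces present.  A detection is a
    true positive if it overlaps a ground-truth face with IoU >= thresh;
    detections matching no face are counted as false positives."""
    tp = sum(1 for g in ground_truth
             if any(iou(d, g) >= thresh for d in detections))
    fp = sum(1 for d in detections
             if all(iou(d, g) < thresh for g in ground_truth))
    return tp / len(ground_truth), fp
```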
2.3 Face Detection Methods The existing techniques to detect faces from a single intensity or color image are divided into four major categories:
1. Knowledge-based methods.
2. Feature invariant approaches (facial features, texture, skin color, and multiple features).
3. Template matching methods (predefined face templates and deformable templates).
4. Appearance-based methods (Eigenface, distribution-based, Neural Network, Support Vector Machine (SVM), and Hidden Markov Model (HMM)).
Figure (2.1) summarizes these methods.
Fig (2.1) General scheme for face detection methods.
2.3.1 Appearance-Based Methods The “templates” in appearance-based methods are learned from example images. These methods rely on techniques from statistical analysis and machine learning to find the relevant characteristics of face and non-face images. The learned characteristics take the form of distribution models that are subsequently used for face detection [Sil, 05].
A popular method for appearance-based face detection is to use Haar-like features and a trained classifier. The idea is to scan a sub-window capable of detecting faces across a given input image. The standard image processing approach would be to rescale the input image to different sizes and then run a fixed-size detector over these images. Viola and Jones instead devised a scale-invariant detector that requires the same number of calculations whatever the window size.
2.4 The Viola-Jones Face Detector Method
This is a widely used algorithm with a strong mathematical base. Viola and Jones were the first to develop the Haar cascade object detector [Vio, 01], which was later improved by Rainer Lienhart [Lie, 02]. The idea is to first train a classifier with a number of sample views of an object. The next sections elaborate on this detector.
2.4.1 Integral Image
The first step of the Viola-Jones face detection algorithm is to turn the input image into an integral image. This is done by making each pixel equal to the sum of all pixels above and to the left of the concerned pixel. This is demonstrated in Figure (2.2).
Fig (2.2) The integral image
This allows for the calculation of the sum of all pixels inside any given rectangle using only four values. These values are the pixels in the integral image that coincide with the corners of the rectangle in the input image. This is demonstrated in Figure (2.3). Sum of grey rectangle = D - (B + C) + A
Fig (2.3) Sum calculation.
Since both rectangles B and C include rectangle A, the sum of A has to be added back to the calculation. It has now been demonstrated how the sum of pixels within rectangles of arbitrary size can be calculated in constant time.
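The integral image and the four-corner rectangle sum can be sketched in a few lines of Python (a minimal illustration for this chapter, not the thesis implementation; the 3x3 image is made up):

```python
def integral_image(img):
    """ii[y][x] = sum of img[0..y-1][0..x-1], padded with a zero row/column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the rectangle with top-left (x, y), width w, height h,
    using only the four corner values: D - (B + C) + A."""
    A = ii[y][x]
    B = ii[y][x + w]
    C = ii[y + h][x]
    D = ii[y + h][x + w]
    return D - (B + C) + A

img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
ii = integral_image(img)
print(rect_sum(ii, 0, 0, 3, 3))   # 45, the sum of all pixels
print(rect_sum(ii, 1, 1, 2, 2))   # 5 + 6 + 8 + 9 = 28
```

Whatever the rectangle size, the sum costs exactly four look-ups, which is what makes the feature evaluation below so cheap.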
The Viola-Jones face detector analyzes a given sub-window using features consisting of two or more rectangles. The different types of features are shown in Figure (2.4).
Fig (2.4) The different types of features, Type 1 through Type 5 [Ara, 08].
Each feature results in a single value which is calculated by subtracting the sum of the white rectangle(s) from the sum of the black rectangle(s). Viola and Jones empirically found that a detector with a base resolution of 24x24 pixels gives satisfactory results. When allowing for all possible sizes and positions of the features in Figure (2.4), a total of approximately 160,000 different features can be constructed. Thus, the number of possible features vastly outnumbers the 576 pixels contained in the detector at base resolution. These features may seem overly simple for such an advanced task as face detection, but what the features lack in complexity they most certainly make up for in computational efficiency. One could understand the features as the computer's way of perceiving an input image, the hope being that some features will yield large values when on top of a face. Of course, operations could also be carried out directly on the raw pixels, but the variation due to different pose and individual characteristics would be expected to hamper this approach. The goal is now to smartly construct a mesh of features capable of detecting faces, and this is the topic of the next section.
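As an illustration of how a single feature value is obtained, the following sketch evaluates a hypothetical Type 1 two-rectangle feature by direct summation (in the real detector each rectangle sum would be four integral-image look-ups; the patch values are made up):

```python
def region_sum(img, x, y, w, h):
    """Naive rectangle sum; stands in for the integral-image look-up."""
    return sum(img[y + j][x + i] for j in range(h) for i in range(w))

def two_rect_feature(img, x, y, w, h):
    """Hypothetical Type 1 edge feature: a white rectangle on the left and a
    black one on the right, each w//2 wide. Value = white sum - black sum."""
    half = w // 2
    white = region_sum(img, x, y, half, h)
    black = region_sum(img, x + half, y, half, h)
    return white - black

# A bright-left / dark-right patch yields a large positive response.
patch = [[9, 9, 1, 1],
         [9, 9, 1, 1]]
print(two_rect_feature(patch, 0, 0, 4, 2))  # 36 - 4 = 32
```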
2.4.2 The Modified AdaBoost Algorithm As stated above, approximately 160,000 feature values can be calculated within a detector at base resolution. Among all these features some few are expected to give almost consistently high values when on top of a face. In order to find these features Viola-Jones use a modified version of the AdaBoost algorithm developed by Freund and Schapire [Bis, 06]. AdaBoost is a machine learning boosting algorithm capable of constructing a strong classifier through a weighted combination of weak classifiers. (A weak classifier classifies correctly in only a little bit more than half the cases.) To match this terminology to the presented theory, each feature is considered to be a potential weak classifier. A weak classifier is mathematically described as:
$$h(x, f, p, \theta) = \begin{cases} 1 & \text{if } p f(x) < p \theta \\ 0 & \text{otherwise} \end{cases} \qquad (2.1)$$

where $x$ is a 24x24 pixel sub-window, $f$ is the applied feature, $p$ the polarity and $\theta$ the threshold that decides whether $x$ should be classified as a positive (a face) or a negative (a non-face). Since only a small number of the possible 160,000 features are expected to be potential weak classifiers, the AdaBoost algorithm is modified to select only the best features. Viola-Jones' modified AdaBoost algorithm is presented in pseudo code [Vio, 04]:

Given example images $(x_1, y_1), \ldots, (x_m, y_m)$ where $y_i = 0, 1$ for negative and positive examples respectively. Initialize weights $w_{1,i} = \frac{1}{2m}, \frac{1}{2l}$ for $y_i = 0, 1$ respectively, where $m$ and $l$ are the numbers of negative and positive examples respectively.

For $t = 1, \ldots, T$:
1. Normalize the weights, $w_{t,i} \leftarrow w_{t,i} / \sum_{j=1}^{n} w_{t,j}$, so that $w_t$ is a probability distribution.
2. For each feature $j$, train a classifier $h_j$ which is restricted to using a single feature. The error is evaluated with respect to $w_t$: $\epsilon_j = \sum_i w_i \left| h_j(x_i) - y_i \right|$.
3. Choose the classifier $h_t$ with the lowest error $\epsilon_t$.
4. Update the weights: $w_{t+1,i} = w_{t,i} \beta_t^{1 - e_i}$, where $e_i = 0$ if example $x_i$ is classified correctly, $e_i = 1$ otherwise, and $\beta_t = \frac{\epsilon_t}{1 - \epsilon_t}$.

The final strong classifier is:

$$h(x) = \begin{cases} 1 & \text{if } \sum_{t=1}^{T} \alpha_t h_t(x) \geq \frac{1}{2} \sum_{t=1}^{T} \alpha_t \\ 0 & \text{otherwise} \end{cases}$$

where $\alpha_t = \log \frac{1}{\beta_t}$.

An important part of the modified AdaBoost algorithm is the determination of the best feature, polarity and threshold. There seems to be no smart solution to this problem and Viola-Jones suggest a simple brute force method.
This means that the determination of each new weak classifier
involves evaluating each feature on all the training examples in order to find the best performing feature. This is expected to be the most time consuming part of the training procedure. The best performing feature is chosen based on the weighted error it produces. This weighted error is a function of the weights belonging to the training examples. As seen in part 4 of the above algorithm the weight of a correctly classified example is decreased and the
weight of a misclassified example is kept constant. As a result it is more 'expensive' for the second feature (in the final classifier) to misclassify an example also misclassified by the first feature, than an example classified correctly. An alternative interpretation is that the second feature is forced to focus harder on the examples misclassified by the first. The point being that the weights are a vital part of the mechanics of the AdaBoost algorithm. With the integral image, the computationally efficient features and the modified AdaBoost algorithm in place, it seems like the face detector is ready for implementation [Jen, 08].
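The pseudo code and the weight-update mechanics above can be made concrete with a toy AdaBoost run on scalar feature values (a sketch only; the feature values, labels and number of rounds are fabricated and this is not the trained face classifier):

```python
import math

def train_stump(xs, ys, w):
    """Brute-force the best threshold and polarity for one scalar feature,
    mirroring steps 2-3 of the pseudo code."""
    best = None
    for theta in xs:
        for p in (1, -1):
            preds = [1 if p * x < p * theta else 0 for x in xs]
            err = sum(wi for wi, pr, y in zip(w, preds, ys) if pr != y)
            if best is None or err < best[0]:
                best = (err, theta, p)
    return best

def adaboost(xs, ys, T):
    m, l = ys.count(0), ys.count(1)
    w = [1 / (2 * m) if y == 0 else 1 / (2 * l) for y in ys]
    stumps = []
    for _ in range(T):
        s = sum(w); w = [wi / s for wi in w]        # step 1: normalize
        err, theta, p = train_stump(xs, ys, w)      # steps 2-3
        err = max(err, 1e-10)                       # avoid division by zero
        beta = err / (1 - err)
        alpha = math.log(1 / beta)
        preds = [1 if p * x < p * theta else 0 for x in xs]
        w = [wi * (beta if pr == y else 1.0)        # step 4: update weights
             for wi, pr, y in zip(w, preds, ys)]
        stumps.append((alpha, theta, p))
    return stumps

def strong_classify(stumps, x):
    score = sum(a for a, theta, p in stumps if p * x < p * theta)
    return 1 if score >= 0.5 * sum(a for a, _, _ in stumps) else 0

xs = [1.0, 2.0, 3.0, 8.0, 9.0, 10.0]   # toy feature values
ys = [1, 1, 1, 0, 0, 0]                # "faces" have small values here
clf = adaboost(xs, ys, T=3)
print([strong_classify(clf, x) for x in xs])  # [1, 1, 1, 0, 0, 0]
```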
2.4.3 The Cascaded Classifier The basic principle of the Viola-Jones face detection algorithm is to scan the detector many times through the same image, each time with a new size. Even if an image contains one or more faces, it is obvious that an excessively large number of the evaluated sub-windows would still be negatives (non-faces). This realization leads to a different formulation of the problem: instead of finding faces, the algorithm should discard non-faces. The thought behind this statement is that it is faster to discard a non-face than to find a face. With this in mind, a detector consisting of only one (strong) classifier suddenly seems inefficient, since the evaluation time is constant no matter the input. Hence the need for a cascaded classifier arises. The cascaded classifier is composed of stages, each containing a strong classifier. The job of each stage is to determine whether a given sub-window is definitely not a face or maybe a face. When a sub-window is classified as a non-face by a given stage it is immediately discarded. Conversely a sub-window classified as a
maybe-face is passed on to the next stage in the cascade. It follows that the more stages a given sub-window passes, the higher the chance the subwindow actually contains a face. The concept is illustrated in Figure (2.5).
Fig (2.5) The classifier cascade
In a single stage classifier one would normally accept false negatives in order to reduce the false positive rate. However, for the first stages in the staged classifier false positives are not considered to be a problem since the succeeding stages are expected to sort them out. Therefore Viola-Jones prescribe the acceptance of many false positives in the initial stages. Consequently the amount of false negatives in the final staged classifier is expected to be very small. Viola-Jones also refers to the cascaded classifier as an attention cascade. This name implies that more attention (computing power) is directed towards the regions of the image suspected to contain faces. It follows that when
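The reject-early behaviour of the cascade can be sketched as follows (the stage tests and thresholds are purely illustrative stand-ins for trained strong classifiers):

```python
def cascade_classify(window, stages):
    """Each stage is a (stage_score, threshold) pair. A sub-window is
    discarded the moment any stage scores it below its threshold; only
    windows passing every stage are reported as possible faces."""
    for stage_score, threshold in stages:
        if stage_score(window) < threshold:
            return False        # definitely not a face: stop early
    return True                 # passed all stages: maybe a face

# Hypothetical stages: a very cheap early test, a slightly stronger later one.
stages = [
    (lambda w: sum(w) / len(w), 2.0),   # stage 1: mean intensity check
    (lambda w: max(w) - min(w), 3.0),   # stage 2: contrast check
]
print(cascade_classify([5, 5, 1, 5], stages))  # passes both stages -> True
print(cascade_classify([1, 1, 1, 1], stages))  # rejected by stage 1 -> False
```

Because most sub-windows fail an early, cheap stage, the expensive later stages run on only a small fraction of the image.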
training a given stage, say n, the negative examples should of course be the false positives generated by stage n-1. The majority of thoughts presented in the 'Methods' sections are taken from the original Viola-Jones paper [Vio, 04]. 2.4.4 Implementation of the Viola-Jones Detector using OpenCV OpenCV is an open source computer vision library originally developed by Intel [Web, 01]. As discussed in the previous section, the "Boosted Cascade of Simple Features" object detection algorithm introduced by Viola and Jones is built into OpenCV and is utilized in this thesis. The steps shown in Figure (2.6) were followed to train the Haar classifier.
Fig (2.6) Steps of the Haar training process.
The first step is to select positive/negative samples out of the training image database. Negative samples are taken from arbitrary images, to avoid camera variation when testing is performed on different databases. These images do not contain face representations. Negative samples are listed in a background description file, where each line contains the filename (relative to the directory of the description file) of a negative sample image. This file was created once manually and was used across databases for face detection. Positive samples were selected from different reflections, illuminations and backgrounds of different people, to mimic real-world scenes. First and foremost, the face was localized to include the eyes, eyebrows, nose and mouth, and the number of object instances, their locations, and the face width and height were determined. The face was localized so that the rectangle is close to the object border.
2.5 Face Tracking Face tracking is a crucial part of most face processing systems. It requires accurate target (i.e. face) detection and motion estimation when an individual is moving. Generally, this process is required to facilitate the face region localization and segmentation necessary prior to face recognition. Accurate face tracking is a challenging task since many factors can cause the tracking algorithm to fail. Some of the major challenges encountered by face tracking systems are robustness to pose changes, lighting variations, and facial deformations due to changes of expression and face occlusion. These factors might cause the algorithm to lose track of the subject’s face and drift (i.e. lose face detection for initialization) [Li, 08].
2.5.1 Basic Mathematical Framework Here we provide an overview of the basic mathematical framework that explains the process by which most trackers work. Let $p \in R^P$ denote a parameter vector that is the desired output of the tracker. It could be the 2D location of the face in the image, the 3D pose of the face, or a more complex set of quantities that also include lighting and deformation parameters. We define a synthesis function $f: R^2 \times R^P \rightarrow R^2$ that can take an image pixel $v \in R^2$ at time $t-1$ and transform it to $f(v, p)$ at time $t$. For a 2D tracker, this function $f$ could be a transformation between two images at two consecutive time instants. For a 3D model-based tracker, this can be considered as a rendering function of the object at pose $p$ in the camera frame to the pixel coordinates $v$ in the image plane. Given an input image $I(v)$, we want to align the synthesized image with it so as to obtain:

$$\hat{p} = \arg\min_p g\left( f(v, p) - I(v) \right) \qquad (2.2)$$

where $\hat{p}$ denotes the estimated parameter vector for this input image $I(v)$. The essence of this approach is the well-known Lucas-Kanade tracking, an efficient and accurate implementation of which has been proposed using the inverse compositional approach [Bak, 04]. Depending on the choice of $v$ and $p$, the method is applicable to the overall face image, a collection of discrete features, or a 3D face model. The cost function $g$ is often implemented as an $L_2$ norm, i.e., the sum of the squares of the errors over the entire region of interest. However, other distance metrics may be used. Thus a face tracker is often implemented as a least-squares optimization problem.
Let us consider the problem of estimating the change $\Delta p_t = \hat{m}_t$ in the parameter vector between two consecutive frames $I_t(v)$ and $I_{t-1}(v)$ as:

$$\hat{m}_t = \arg\min_m \sum_v \left[ f(v, \hat{p}_{t-1} + m) - I_t(v) \right]^2 \qquad (2.3)$$

and

$$\hat{p}_t = \hat{p}_{t-1} + \hat{m}_t \qquad (2.4)$$

The optimization of the above equation can be achieved by assuming that a current estimate of $m$ is known and iteratively solving for increments $\Delta m$ such that

$$\sum_v \left[ f(v, \hat{p}_{t-1} + m + \Delta m) - I_t(v) \right]^2 \qquad (2.5)$$

is minimized.
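As a concrete, deliberately simplified instance of equation (2.3), a pure-translation tracker can search by brute force for the shift m that minimizes the squared difference between frames (a one-dimensional toy; real trackers solve this iteratively as described above):

```python
def estimate_shift(prev, curr, max_shift):
    """Find the integer shift m minimizing the average of
    (prev[v - m] - curr[v])^2 over the overlapping samples, i.e. a
    brute-force 1-D version of equation (2.3)."""
    best_m, best_cost = 0, float("inf")
    n = len(curr)
    for m in range(-max_shift, max_shift + 1):
        cost, count = 0, 0
        for v in range(n):
            if 0 <= v - m < n:
                cost += (prev[v - m] - curr[v]) ** 2
                count += 1
        cost /= count  # average, so shorter overlaps are not unfairly cheap
        if cost < best_cost:
            best_m, best_cost = m, cost
    return best_m

prev = [0, 0, 9, 9, 9, 0, 0, 0]   # a "face" blob at positions 2-4
curr = [0, 0, 0, 0, 9, 9, 9, 0]   # the same blob moved right by 2
print(estimate_shift(prev, curr, max_shift=3))   # 2
```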
2.5.2 Face Tracking Methods
Various methods have been proposed to overcome face tracking challenges. According to [Wu, 04], face tracking methods can be classified into three main groups: low-level feature approaches, template matching approaches and statistical inference approaches. The low-level feature approaches make use of low-level face knowledge, such as skin color, background knowledge (background subtraction or rectangular features) or motion information, to track faces. The template matching approaches involve tracking contours with snakes, 3D face model matching, shape and face matching, and wavelet network matching. The third tracking category, statistical inference approaches, includes Kalman filtering techniques for unimodal Gaussian representations and Monte Carlo approaches for non-Gaussian nonlinear target tracking [Cor, 07].
2.5.2.1 Template Matching Approach
Template matching is a technique used in digital image processing for comparing portions of images against each other, by sliding a patch (the template) over the input image using different methods. Once the patch has been tested at all possible locations on a pixel-by-pixel basis, a matrix is created containing a numerical index of how well the patch matches at each location [Cas, 09]. Template matching can be subdivided into two approaches: feature-based and template-based matching. The feature-based approach uses features of the search and template images, such as edges or corners, as the primary match-measuring metrics to find the best matching location of the template in the source image. The template-based, or global, approach uses the entire template, generally with a sum-comparing metric (SAD, SSD, cross-correlation, etc.) that determines the best location by testing all, or a sample of, the viable test locations within the search image. A. Template-Based Approach
For templates without strong features, or when the bulk of the template image constitutes the matching image, a template-based approach may be effective. As mentioned above, since template-based matching may require sampling a large number of points, the number of sampling points can be reduced in several ways: by reducing the resolution of the search and template images by the same factor and performing the operation on the resulting downsized images (image pyramids), by providing a search window of data points within the search image so that the template does not have to be evaluated at every viable data point, or by a combination of both.
B. Template Matching Measurements
The "matching error" between the patch and any given location inside the image where it is being searched can be computed using different methods. This section gives a brief description of each of them. In the following mathematical expressions, $I$ denotes the input image, $T$ the template and $R$ the result.

1. Square difference matching. These methods match the squared difference, which means that a perfect match would be 0 and bad matches would lead to large values.

$$R_{sq\_diff}(x, y) = \sum_{x', y'} \left[ T(x', y') - I(x + x', y + y') \right]^2 \qquad (2.6)$$

2. Correlation matching. These methods multiplicatively match the template against the image, which means that a perfect match would be the largest.

$$R_{ccorr}(x, y) = \sum_{x', y'} \left[ T(x', y') \cdot I(x + x', y + y') \right]^2 \qquad (2.7)$$

3. Correlation coefficient matching. These methods match a template relative to its mean against the image relative to its mean. The best match would be 1 and the worst one would be -1. A value of 0 means that there is no correlation [Cas, 09].

$$R_{ccoeff}(x, y) = \sum_{x', y'} \left[ T'(x', y') \cdot I'(x + x', y + y') \right]^2 \qquad (2.8)$$
where

$$T'(x', y') = T(x', y') - \frac{1}{w \cdot h} \sum_{x'', y''} T(x'', y'') \qquad (2.9)$$

$$I'(x + x', y + y') = I(x + x', y + y') - \frac{1}{w \cdot h} \sum_{x'', y''} I(x + x'', y + y'') \qquad (2.10)$$
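Equations (2.6) to (2.10) can be written out directly; the following pure-Python sketch evaluates the measures for one template position (a minimal illustration following the squared forms given above):

```python
def region(I, x, y, w, h):
    """Extract the w x h patch of I with top-left corner (x, y)."""
    return [[I[y + j][x + i] for i in range(w)] for j in range(h)]

def mean(M):
    return sum(sum(r) for r in M) / (len(M) * len(M[0]))

def sq_diff(T, patch):
    # Equation (2.6): 0 for a perfect match, large for bad matches.
    return sum((t - p) ** 2 for tr, pr in zip(T, patch) for t, p in zip(tr, pr))

def ccorr(T, patch):
    # Equation (2.7): largest for a strong match.
    return sum((t * p) ** 2 for tr, pr in zip(T, patch) for t, p in zip(tr, pr))

def ccoeff(T, patch):
    # Equation (2.8), using the mean-removed T' and I' of (2.9)-(2.10).
    mt, mp = mean(T), mean(patch)
    return sum(((t - mt) * (p - mp)) ** 2
               for tr, pr in zip(T, patch) for t, p in zip(tr, pr))

T = [[1, 2], [3, 4]]
I = [[9, 1, 2],
     [9, 3, 4]]
patch = region(I, 1, 0, 2, 2)                 # the exact template inside I
print(sq_diff(T, patch))                      # 0: perfect match
print(sq_diff(T, region(I, 0, 0, 2, 2)))      # 102: mismatch elsewhere
```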
2.6 Face Recognition Face recognition has received considerable interest as a widely accepted biometric because of the ease of collecting samples of a person, with or without the subject's intention. Face recognition refers to an automated or semi-automated process of matching facial images. It constitutes a wide group of technologies which all work with faces but use different scanning techniques. Most common by far is 2D face recognition, which is easier and less expensive compared to the other approaches [Kim, 01].
2.6.1 Face Recognition Techniques
All available face recognition techniques can be classified into four categories based on the way they represent the face:
1. Appearance-based, which uses holistic texture features.
2. Model-based, which employs the shape and texture of the face, along with 3D depth information.
3. Template-based face recognition.
4. Techniques using neural networks.
Figure (2.7) summarizes these types:
Fig (2.7) Classification of face recognition methods.
2.6.1.1 Principal Component Analysis (PCA)
PCA is a way of identifying patterns in data and expressing the data in such a way as to highlight their similarities and differences. The purpose of PCA is to reduce the large dimensionality of the data space (observed variables) to a smaller intrinsic dimensionality of feature space (independent variables), which is needed to describe the data economically [Sar, 05].
2.6.1.2 Mathematical Theory
Principal Component Analysis (PCA) has been widely adopted to capture the face space in a low-dimensional feature space: the eigenspace. PCA is a classical technique for multivariate analysis. Let a face image I(x, y) be a two-dimensional N by N array of intensity values, or a vector of dimension $N^2$. An image of size 100 by 100 can then be represented by a vector of dimension 10,000 or, equivalently, a point in a 10,000-dimensional space. An ensemble of images, then, maps to a collection of points in this huge space. Images of faces, being similar in overall configuration, will not be randomly distributed in this huge space and can thus be described by a relatively low-dimensional subspace. The main idea of principal component analysis is to find the vectors which best account for the distribution of face images within the entire image space. These vectors define the subspace of face images, which we call "face space". Here, a description of how to perform PCA in the context of face detection is given.
Consider a data set $X = \{x_1, x_2, x_3, \ldots, x_M\}$ of $N^2$-dimensional vectors. This data set might for example be a set of $M$ face images. The mean, $\mu$, and the covariance matrix, $\Sigma$, of the data are given by:

$$\mu = \frac{1}{M} \sum_{m=1}^{M} x_m \qquad (2.11)$$

$$\Sigma = \frac{1}{M} \sum_{m=1}^{M} (x_m - \mu)(x_m - \mu)^T \qquad (2.12)$$
where $\Sigma$ is an $N^2 \times N^2$ symmetric matrix. This matrix characterizes the scatter of the data set. A non-zero vector $u_k$ for which

$$\Sigma u_k = \lambda_k u_k \qquad (2.13)$$

is an eigenvector of the covariance matrix, with corresponding eigenvalue $\lambda_k$. If $\lambda_1, \lambda_2, \ldots, \lambda_K$ are the $K$ largest, distinct eigenvalues, then the matrix $U = [u_1, u_2, u_3, \ldots, u_K]$ represents the $K$ dominant eigenvectors. These eigenvectors are mutually orthogonal and span a K-dimensional subspace called the principal subspace. Figure (2.8) is an example of eigenvectors. If $U$ is the matrix of $K$ dominant eigenvectors,
Fig (2.8) a) An example of dominant Eigenfaces. b) An average face
an $N^2$-dimensional input vector $x$ can be linearly transformed into a $K$-dimensional vector $\omega$ by:

$$\omega = U^T (x - \mu) \qquad (2.14)$$

After applying the linear transformation $U^T$, the set of transformed vectors $\{\omega_1, \omega_2, \omega_3, \ldots, \omega_M\}$ has scatter $U^T \Sigma U$. PCA chooses $U$ so as to maximize the determinant, $|U^T \Sigma U|$, of this scatter matrix. In other words, PCA retains most of the variance. An original vector $x$ can be approximately reconstructed from its transformed vector $\omega$ as:

$$\tilde{x} = \mu + \sum_{k=1}^{K} \omega_k u_k \qquad (2.15)$$

In fact, PCA enables the training data to be reconstructed in a way that minimizes the squared reconstruction error, $\epsilon_{total}$, over the data set, where

$$\epsilon_{total} = \frac{1}{2} \sum_{m=1}^{M} \left\| x_m - \tilde{x}_m \right\|^2 \qquad (2.16)$$

Each reconstruction error, $\| x_m - \tilde{x}_m \|$, indicates how well the image patch is fitted to the face space. This distance from face space is used as a measure of "faceness": a large reconstruction error means that the image patch appears to be a non-face [Kim, 01]. Both calculating eigenfaces and face verification using eigenfaces are represented in Figures (2.9) and (2.10) respectively.
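A compact numerical sketch of equations (2.11) to (2.16) using NumPy, with synthetic vectors standing in for face images (the data and dimensions are fabricated for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic "images": M vectors that truly lie in a K-dimensional subspace.
M, D, K = 20, 16, 2
basis = rng.normal(size=(K, D))
X = rng.normal(size=(M, K)) @ basis + 5.0           # data set {x_1..x_M}

mu = X.mean(axis=0)                                 # equation (2.11)
C = (X - mu).T @ (X - mu) / M                       # equation (2.12)
vals, vecs = np.linalg.eigh(C)                      # eigenpairs, eq. (2.13)
U = vecs[:, np.argsort(vals)[::-1][:K]]             # K dominant eigenvectors

omega = (X - mu) @ U                                # projection, eq. (2.14)
X_rec = mu + omega @ U.T                            # reconstruction, eq. (2.15)
err = 0.5 * np.sum((X - X_rec) ** 2)                # total error, eq. (2.16)
print(round(float(err), 6))   # ~0: the data really is K-dimensional

# A vector far from the "face space" reconstructs poorly: low "faceness".
outlier = rng.normal(size=D) * 50
o_rec = mu + ((outlier - mu) @ U) @ U.T
print(np.linalg.norm(outlier - o_rec) > np.linalg.norm(X[0] - X_rec[0]))
```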
Fig (2.9) Calculating Eigenfaces.
Fig (2.10) Face verification procedure using Eigenfaces.
2.6.2 Face Recognition Applications
Every day we face new technologies that prompt us to enter a PIN code or password, for example for money transactions on the internet, to get cash from an ATM, or even to use our cell phone SIM card, plus a dozen others to access the internet and so on. Therefore, the need for reliable methods of biometric personal identification is obvious. In fact, there are such reliable methods, like fingerprint analysis and retinal or iris scans; however, these methods rely on the cooperation of the participant. Face recognition systems, on the other hand, can perform person identification without the cooperation or knowledge of the participant, which is advantageous in some applications such as surveillance, suspect tracking and investigation [Sez, 05]. Typical applications of face recognition systems can be listed in four main categories: a. Entertainment: Video Games, Virtual Reality, Training Programs, Human-Robot Interaction, Human-Computer Interaction. b. Smart Cards: Driver's Licenses, Entitlement Programs, Immigration, National ID, Passports, Voter Registration, Welfare Fraud. c. Information Security: TV Parental Control, Personal Device Logon, Desktop Logon, Application Security, Database Security, File Encryption, Intranet Security, Internet Access, Medical Records, Secure Trading Terminals. d. Law Enforcement and Surveillance: Advanced Video Surveillance, CCTV Control, Portal Control, Post-Event Analysis, Shoplifting, Suspect Tracking and Investigation.
Chapter 3
Design and Implementation of the Proposed FDM
Chapter Three Design and Implementation of the Proposed Face Detection Method 3.1 Introduction This work aims to improve the outcomes of the face detection method developed by Viola-Jones in the context of video-based applications. The whole recognition system consists of three main modules. The first module is started by loading a video file, which is captured by a webcam, and then extracting it into frames. These frames are subjected to the Viola-Jones face detector to find the single faces in the video.
The second and most important module of the work is the face tracking phase. The new contribution appears in two steps. Firstly, the detected face in the current frame is used to initialize the tracker in the next frame. This process is continued for the remaining frames in the video. This contribution eliminates the false negatives (missed detections) that were one of the drawbacks of the Haar-like feature detector on video. Secondly, to reduce the false positive alarms, a dynamic thresholding algorithm is applied to each detected face before initializing the tracker in the next frame. Finally, the tracked faces are recognized using an image-based face recognition method (PCA). The detection rate is highly increased when the hybrid detector and tracker method is applied. This work is tested on a special video scenario where one or more persons are moving in front of the webcam, for example at an ATM in a bank or in a building security control. In the next sections, the proposed hybrid FDM is described in detail using flowchart diagrams and pseudo code algorithms. The process steps
are also presented for each part (face detection, face tracking and face recognition). Figure (3.1) shows the general diagram of the proposed FDM. The system is implemented in Visual C++ 2010 with the OpenCV 2.2 library as the development tool.
Fig (3.1) General block diagram of the proposed FDM.
3.2 Face Detector Scheme The method used to detect a single frontal face in a video is taken from the OpenCV 2.2 library (face detector) developed by Viola-Jones. This method was essentially developed for still-image face detection; in this research the Viola-Jones scheme is implemented for a video application. The face detector steps can be summarized as follows:
Step 1: Load video. Load a video file which is compatible with the OpenCV library and obtained by real-time capturing. A color conversion algorithm is applied to each frame after extracting it from the video. This conversion changes the color space from color to gray, which is more sensitive to the human vision system.
Step 2: Initialize detector. The Haar cascade classifier is loaded, and the data stored in an XML file is used to decide how to classify each image location.
Step 3: Running the detector. The face detector examines each image location and classifies it as "Face" or "Not Face." Classification assumes a fixed scale for the face, say 50x50 pixels. Since faces in an image might be smaller or larger than this, the classifier runs over the image several times to search for faces across a range of scales.
Step 4: Detector results. A specific function in OpenCV is used to display the detected faces on the window at different resolutions. This is due to the varying distance of the objects (faces) from the webcam. Therefore the faces are displayed with different sizes, such as the 24x24 default value of the classifier, 26x26, 31x31, 44x44 and so on. Figure (3.2) illustrates the face detection steps.
Fig (3.2) Face detection scheme
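The multi-scale scan of Step 3 can be illustrated with a dummy classifier standing in for the Haar cascade (the scanning strides, scale step and "classifier" here are assumptions for illustration, not OpenCV's internals):

```python
def scan_image(img_w, img_h, classify, base=24, scale_step=1.25):
    """Slide a base x base detector over the image at a range of scales,
    as in Step 3. `classify(x, y, size)` is a stand-in for the cascade."""
    detections = []
    size = base
    while size <= min(img_w, img_h):
        step = max(1, size // 4)               # stride grows with the window
        for y in range(0, img_h - size + 1, step):
            for x in range(0, img_w - size + 1, step):
                if classify(x, y, size):
                    detections.append((x, y, size, size))
        size = int(size * scale_step)          # rescale and scan again
    return detections

# Dummy classifier: "a face" sits near (24, 24) with side roughly 30 pixels.
def fake_classifier(x, y, size):
    return abs(x - 24) < 4 and abs(y - 24) < 4 and abs(size - 30) < 4

hits = scan_image(120, 90, fake_classifier)
print(hits)   # windows near (24, 24) at the matching scale
```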
In Figure (3.3) the flowchart of the face detection is shown including the above mentioned steps.
Fig (3.3) Face detection process.
3.3 Face Tracking Scheme The face detection scheme used in this work does not give satisfactory results in terms of detection rate. This drawback encouraged us to develop a hybrid detection method that combines the Viola-Jones detector with a template-based tracking algorithm. The reasons behind using these algorithms are:
1. Obtaining the template of the face in the face detector part.
2. Reusing this template to initialize a tracker.
3. The template matching algorithm is very efficient and simple to implement.
4. It is robust against scale variation and lighting conditions.
5. It is fast because a region of interest (ROI) is used to reduce the search area within the frame.
The detected face obtained from the current frame is used to initialize the tracker. Due to the slow change of the face position between two successive frames, the template position of the face is first found in the current frame and an ROI is applied to the next frame in order to reduce the search area. The ROI is selected according to the properties of the face template (top-left location, width, height). The size of the ROI is chosen to be twice the size of the face template, on the condition that the ROI window does not exceed the frame size. The ROI size depends on the size of the template in the current frame (i.e., it grows and shrinks dynamically). The face tracking process follows the face detector so that the objects (faces) in the next frames are easily found, avoiding missed detections. Each tracked face template can be considered as a reference for initializing the tracker itself in the next frames in case the detector fails to detect (false negative alarm).
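The ROI selection described above, twice the template size and clamped to the frame, can be sketched as (coordinates are illustrative):

```python
def roi_for_template(x, y, w, h, frame_w, frame_h):
    """Centre a search window of twice the template size on the template
    at (x, y, w, h), clamping it so it never exceeds the frame."""
    rw, rh = 2 * w, 2 * h
    rx = max(0, x - w // 2)
    ry = max(0, y - h // 2)
    rx = min(rx, max(0, frame_w - rw))   # pull back inside the frame
    ry = min(ry, max(0, frame_h - rh))
    rw = min(rw, frame_w)
    rh = min(rh, frame_h)
    return rx, ry, rw, rh

print(roi_for_template(100, 80, 60, 60, 320, 240))  # (70, 50, 120, 120)
print(roi_for_template(0, 0, 60, 60, 320, 240))     # clamped at the corner
```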
Figure (3.4) depicts the face tracking process between two successive frames.
Fig (3.4) Face tracking process.
3.3.1 False Positive Reduction In image-based face detection a false positive is a detected object that is not a face, while in the video context it is also a detected face that represents the face of another person. The scenarios used to run the face detector and tracker aim to track the nearest person walking toward the webcam. This situation leads to both types of false positive alarms. Therefore, the main objective of using tracking is to reduce the false positive alarms and obtain a high detection rate. In Figure (3.5) the occurrence of false positive alarms is shown.
Fig (3.5) False positive alarms.
3.3.1.1 Manual Threshold Algorithm
After running the face detector, the possibility of false positive occurrences between frames, as shown in Figure (3.6), can be reduced before running the tracker, as described in the following steps.
Fig (3.6) False positive alarm between two successive frames.
Step 1: Set the standard average window size for a face to be recognized equal to (120 x 90) pixels. A further assumption is to develop a new scheme depending on the width of the window; the best value, found by trial and error, was equal to 110. Step 2: Initialize two arrays to store the width of the face window, as shown in Figure (3.7), and the average width of the face window in each frame. Step 3: Calculate the average width. Step 4: Accumulate the average width of the face window.
Step 5: Store the data results (width and average width) in a text file.
Fig (3.7) Window's face width values after running the face detector.
Step 6: Plot a curve of the average width against the frame number, as illustrated in Figure (3.8). Step 7: Select the threshold from the curve as the value nearest to the standard average width of the face window (here, 110 pixels). Step 8: Discard each value greater than 110 and consider it a non-recognized face. This selection is an important step for initializing the face tracker.
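Steps 1 to 8 amount to a simple filter on the detected window widths; the sketch below uses made-up widths, with the 110-pixel threshold taken from Step 7:

```python
THRESHOLD = 110   # selected from the average-width curve (Step 7)

def filter_detections(widths, threshold=THRESHOLD):
    """Step 8: keep a detection only if its window width does not exceed
    the threshold; wider windows are treated as false positives."""
    return [w for w in widths if w <= threshold]

widths = [92, 104, 150, 108, 230, 96]      # per-frame face-window widths
avg = sum(widths) / len(widths)            # Steps 3-4: average width
print(round(avg, 1))                       # 130.0
print(filter_detections(widths))           # [92, 104, 108, 96]
```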
Fig (3.8) Average window's face width
The flowchart of the threshold algorithm before tracking is shown in Figure (3.9).
Fig (3.9) Manual threshold process.
3.3.1.2 Dynamic Threshold Algorithm
In the manual thresholding process some false positive alarms are not totally removed, as shown in Figure (3.10). To solve these types of alarms, a dynamic threshold algorithm is developed and implemented together with the face tracker.
Fig (3.10) Unsolved types of false positive alarms.
The algorithm steps are described below:
Step 1: Get a frame.
Step 2: Run the face detector.
Step 3: Apply manual thresholding as described in section 3.3.1.1.
Step 4: Determine the center of the detected face.
Step 5: Get the next frame.
Step 6: Run the face tracker.
Step 7: Determine the center of the tracked face.
Step 8: Determine the Euclidean distance between the centers of the detected and tracked faces. The graph is shown in Figure (3.11).
Fig (3.11) Euclidean distance measure.
Step 9: Accumulate five of the obtained distance values and find their median.
Step 10: Normalize the distance by dividing it by the face width.
Step 11: Apply the threshold: if the normalized distance is greater than ten times the median value, the last tracked face is kept by the tracker; otherwise the last detected face is tracked.
Step 12: Visualize the face.
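Steps 8-11 can be sketched as below, assuming faces are summarized by their (x, y) centers; the function name, argument layout and history handling are our assumptions, not the thesis code.

```python
# Sketch of the dynamic threshold decision: compare the Euclidean distance
# between detected and tracked face centers against 10x the median of the
# last five distances (normalized by face width), as in Steps 8-11.
import math
import statistics

def choose_face(detected_center, tracked_center, face_width, recent_distances):
    """Return ('tracked' | 'detected', updated distance history)."""
    d = math.dist(detected_center, tracked_center)  # Step 8
    history = (recent_distances + [d])[-5:]         # Step 9: keep 5 values
    median = statistics.median(history)
    normalized = d / face_width                     # Step 10
    # Step 11: a large normalized jump keeps the tracker's face,
    # otherwise the last detected face is tracked.
    choice = 'tracked' if normalized > 10 * median else 'detected'
    return choice, history
```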
The flowchart of the dynamic threshold algorithm that involves all the above steps is shown in Figure (3.12).
Fig (3.12) Dynamic threshold algorithm.
3.3.2 False Negative Reduction
The second contribution is the reduction of false negative alarms, i.e., faces left undetected between frames. Template matching based on the Square Difference Matching (SDM) metric is used to implement the face tracker. The strength of the proposed tracker is that it is efficient and simple. Moreover, it is scale, pose and rotation independent, so in theory it keeps tracking the face even when the face is not in a frontal position. Initially, a face detector based on Viola-Jones object detection is executed to detect the face in the first frame of the sequence to be tracked. This is repeated until at least two successfully detected faces are found. Each time, the last face detected in the current frame is used to track the face in the next frame; this process updates the template position, which is used to initialize the tracker. The steps of the algorithm are as follows:
Step 1: Allocate memory for a template of the same size as indicated in the face detection part.
Step 2: Save the detected face in the template located on the next frame.
Step 3: Allocate memory for the tracking result.
Step 4: Find the optimal match by sliding the template over the next frame using SDM.
Step 5: Normalize the obtained results to values between 0 and 1, separating a perfect match (0) from a non-match (1).
Step 6: Visualize the perfect match on the next frame, taking the location of the minimum value as the best match.
Figure (3.13) shows the flowchart of the template matching based tracking method.
Fig (3.13) Template matching based tracking.
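The SDM matching of Steps 4-6 can be sketched as follows. Plain nested lists stand in for grayscale images; the thesis itself uses OpenCV's square-difference template matching, so this pure-Python version is only an illustration.

```python
# Sketch of Square Difference Matching: slide the template over the frame,
# score each position by the sum of squared differences, normalize scores to
# [0, 1], and take the minimum (0 = perfect match) as the best location.

def ssd_match(frame, template):
    fh, fw = len(frame), len(frame[0])
    th, tw = len(template), len(template[0])
    scores = {}
    for y in range(fh - th + 1):
        for x in range(fw - tw + 1):
            scores[(x, y)] = sum(
                (frame[y + i][x + j] - template[i][j]) ** 2
                for i in range(th) for j in range(tw))
    worst = max(scores.values()) or 1
    normalized = {pos: s / worst for pos, s in scores.items()}  # Step 5
    best = min(normalized, key=normalized.get)                  # Step 6
    return best, normalized

frame = [[0, 0, 0, 0],
         [0, 5, 6, 0],
         [0, 7, 8, 0],
         [0, 0, 0, 0]]
template = [[5, 6],
            [7, 8]]
best, norms = ssd_match(frame, template)   # best match at (x=1, y=1)
```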
3.4 Face Recognition Scheme
Face recognition comes after performing both the face detection and tracking processes. Some preprocessing operations (resizing the face window and histogram equalization) must be applied to the tracked faces before starting the recognition process. The framework of the recognition process is presented in Figure (3.14).
Fig (3.14) The framework of face recognition process.
In this part a special OpenCV library developed by Robin Hewitt [Hew, 07] is used. The obtained tracked faces are subjected to the PCA algorithm in order to get the eigenfaces.
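As a rough illustration of what the PCA step produces (the thesis relies on Hewitt's OpenCV eigenface code, not on this sketch), the first eigenface is the dominant eigenvector of the covariance of the mean-centered face vectors, which can be found by power iteration:

```python
# Sketch: faces flattened to vectors are mean-centered; the dominant
# eigenvector of their covariance matrix (the first eigenface) is found
# by power iteration. Toy data only; real faces would be long vectors.

def first_eigenface(faces, iters=200):
    n, d = len(faces), len(faces[0])
    mean = [sum(f[j] for f in faces) / n for j in range(d)]
    centered = [[f[j] - mean[j] for j in range(d)] for f in faces]
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(d)]
           for i in range(d)]                     # d x d covariance
    v = [1.0] * d                                 # power iteration
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return mean, v

# Toy "faces": almost all variation lies along the first coordinate,
# so the first eigenface is close to (1, 0, 0) up to sign.
faces = [[1.0, 0.0, 0.0], [3.0, 0.1, 0.0], [5.0, -0.1, 0.0]]
mean, eigenface = first_eigenface(faces)
```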
Chapter Four
Test Evaluation
4.1 Introduction
In this research work, the FDM designed and implemented in the previous chapter is tested using special video scenarios. The proposed FDM combines the face detector and tracker processes, which improves the detection rate and detection time. The performance of the proposed FDM was evaluated by tuning some of the involved parameters, such as scale factor, neighbor threshold and window size; these parameters affect the system accuracy (detection, tracking and recognition). The efficiency of the proposed FDM was evaluated using the detection rate and detection time measures. Microsoft Visual Studio 2010 (VC++) with the OpenCV 2.2 library was used as the development tool to build the required programs. The programs were tested on Windows 7 Enterprise (Intel(R) Core(TM) CPU, 2.77 GHz processor and 4 GB RAM).
4.2 Video Scenarios Description
The experiments were carried out on 5 persons. While one or more persons moved randomly in the background, each person walked up to a camera and stood facing it. Two different scenarios (taken from the Department of Information Technology / Faculty of Engineering and IT / Karlsruhe University of Applied Sciences / Germany) were used to evaluate the performance of the proposed FDM; the tested videos are illustrated in Figures (4.1) and (4.2)
respectively. In this work, the first video scenario was captured in a natural environment without any special lighting, using a Logitech webcam.
Fig (4.1) Video scenario 1
Fig (4.2) Video scenario 2.
The other scenario was captured in an environment (with fast person movement) in which the lighting conditions affect the accuracy of the Viola-Jones detector. Table (4.1) describes the properties of video scenarios 1 and 2.

Table (4.1) Video scenarios description
Video No. | Video Description
Scenario 1 | Length=00:01:07, Size=4.71 MB, Frame width=640, Frame height=480, Frame rate=15 frames/second, Type: DivX Video, Color system: RGB
Scenario 2 | Length=00:00:35, Size=3.87 MB, Frame width=640, Frame height=360, Frame rate=30 frames/second
4.3 Test Results
The implemented methods (Viola-Jones, manual thresholding and dynamic thresholding) were tested on the video scenarios, applied to 1000 frames of scenario 1 and 900 frames of scenario 2, in a real-time environment. First, the parameters that play an important role in the face detection process were tuned to obtain the optimal values. This optimization increases the detection rate and fixes the parameters for further tests. In the next sections the results of each method are presented for the two video scenarios.
4.3.1 Effect of Involved Parameters (Scenario 1)
Several parameters (window size, neighbor threshold, scale factor) affect the Viola-Jones face detector, as presented in the following subsections. Table (4.2) summarizes which parameters affect the performance of the Viola-Jones face detector. Table (4.2) Effect of involved parameters.
Parameters | Affects detection rate | Affects detection time
Window size | No | Yes
Neighbor threshold | Yes | No
Scale factor | Yes | Yes
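The measures reported in the tables of this chapter can be expressed as below. This is our illustrative sketch, assuming DR is the percentage of true positives over the number of processed frames (1000 for scenario 1, 900 for scenario 2) and DT is an average per-frame time in milliseconds.

```python
# Sketch of the evaluation measures: detection rate (DR, percent of frames
# with a true positive) and average detection time (DT) per frame.

def detection_rate(tp, total_frames):
    return round(100.0 * tp / total_frames, 1)

def average_detection_time(total_time_ms, total_frames):
    return total_time_ms / total_frames

dr = detection_rate(913, 1000)   # best Viola-Jones run in table (4.6)
```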
4.3.1.1 Window Size
This is the size of the smallest face to search for. The optimal selection of the window size minimizes the number of false face detections. In table (4.3) the
results of detection rate and time for three different window sizes are shown with fixed scale factor and neighbor threshold. Tables (4.4) and (4.5) present the same effect but with different fixed parameters.

Table (4.3) Effect of window size on detection rate and time (Scale factor = 1.2 and Neighbor threshold = 1)
Window size | Detected as faces | TP | FP | FN | DR | DT/ms
20×20 | 983 | 652 | 331 | 17 | 65.2 | 237.569
30×30 | 969 | 683 | 286 | 31 | 68.3 | 210.652
40×40 | 957 | 669 | 288 | 43 | 66.9 | 190.820
Table (4.4) Effect of window size on detection rate and time (Scale factor = 2.2 and Neighbor threshold = 2)
Window size | Detected as faces | TP | FP | FN | DR | DT/ms
20×20 | 890 | 800 | 90 | 110 | 80.0 | 162.680
30×30 | 889 | 806 | 83 | 111 | 80.6 | 121.202
40×40 | 889 | 793 | 96 | 111 | 79.3 | 146.712
Table (4.5) Effect of window size on detection rate and time (Scale factor = 3.2 and Neighbor threshold = 3)
Window size | Detected as faces | TP | FP | FN | DR | DT/ms
20×20 | 531 | 466 | 66 | 469 | 46.6 | 263.383
30×30 | 551 | 505 | 46 | 449 | 50.5 | 114.119
40×40 | 551 | 505 | 46 | 449 | 50.5 | 123.702
4.3.1.2 Neighbor Threshold
This is the way to derive an average rectangle from the raw detections (i.e., to replace a group of rectangles with one rectangle). If this value is set to 0, all raw detections are displayed, as illustrated in Figure (4.3).
Fig (4.3) Neighbor threshold setting to zero
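A much-simplified sketch of this grouping idea is given below; the real grouping is performed inside OpenCV's detector, and the center-distance rule and tolerance used here are our assumptions.

```python
# Sketch: raw detections (x, y, w, h) whose centers lie close together are
# grouped; a group surviving the neighbor threshold is replaced by the
# average rectangle of its members.

def group_detections(rects, min_neighbors, tol=20):
    groups = []
    for r in rects:
        cx, cy = r[0] + r[2] / 2, r[1] + r[3] / 2
        for g in groups:
            gx, gy = g[0][0] + g[0][2] / 2, g[0][1] + g[0][3] / 2
            if abs(cx - gx) <= tol and abs(cy - gy) <= tol:
                g.append(r)
                break
        else:
            groups.append([r])
    return [tuple(sum(r[i] for r in g) // len(g) for i in range(4))
            for g in groups if len(g) >= min_neighbors]

raw = [(100, 100, 50, 50), (104, 98, 52, 50), (300, 300, 40, 40)]
faces = group_detections(raw, min_neighbors=2)   # lone box is dropped
```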
In table (4.6) the results of detection rate and time for three different neighbor thresholds are shown with fixed window size and scale factor. Tables (4.7) and (4.8) present the same effect but with different fixed parameters.

Table (4.6) Effect of neighbor threshold on detection rate and time (Scale factor = 1.2 and Window size = 20×20)
Neighbor threshold | Detected as faces | TP | FP | FN | DR | DT/ms
1 | 983 | 652 | 331 | 17 | 65.2 | 237.569
2 | 969 | 882 | 147 | 31 | 88.2 | 236.696
3 | 964 | 913 | 51 | 36 | 91.3 | 243.090
Table (4.7) Effect of neighbor threshold on detection rate and time (Scale factor = 2.2 and Window size = 30×30)
Neighbor threshold | Detected as faces | TP | FP | FN | DR | DT/ms
1 | 926 | 843 | 83 | 83 | 84.3 | 159.627
2 | 889 | 806 | 83 | 111 | 80.6 | 121.202
3 | 843 | 763 | 80 | 157 | 76.3 | 143.880
Table (4.8) Effect of neighbor threshold on detection rate and time (Scale factor = 3.2 and Window size = 40×40)
Neighbor threshold | Detected as faces | TP | FP | FN | DR | DT/ms
1 | 788 | 693 | 95 | 212 | 69.3 | 132.577
2 | 659 | 594 | 65 | 341 | 59.4 | 126.316
3 | 551 | 505 | 46 | 449 | 50.5 | 123.702
4.3.1.3 Scale Factor
When this parameter is set higher, the detector runs faster because the face scale is increased more in each pass. In table (4.9) the results of detection rate and time for three different scale factors are shown with fixed window size and neighbor threshold. Tables (4.10) and (4.11) present the same effect but with different fixed parameters.

Table (4.9) Effect of scale factor on detection rate and time (Neighbor threshold = 1 and Window size = 20×20)
Scale factor | Detected as faces | TP | FP | FN | DR | DT/ms
1.2 | 983 | 652 | 331 | 17 | 65.2 | 237.569
2.2 | 937 | 842 | 95 | 63 | 84.2 | 165.117
3.2 | 804 | 691 | 113 | 196 | 69.1 | 198.424
Table (4.10) Effect of scale factor on detection rate and time (Neighbor threshold = 2 and Window size = 30×30)
Scale factor | Detected as faces | TP | FP | FN | DR | DT/ms
1.2 | 959 | 814 | 145 | 41 | 81.4 | 216.164
2.2 | 889 | 806 | 83 | 111 | 80.6 | 121.202
3.2 | 659 | 584 | 75 | 341 | 58.4 | 125.870
Table (4.11) Effect of scale factor on detection rate and time (Neighbor threshold = 3 and Window size = 40×40)
Scale factor | Detected as faces | TP | FP | FN | DR | DT/ms
1.2 | 946 | 898 | 48 | 54 | 89.8 | 227.373
2.2 | 843 | 743 | 100 | 157 | 74.3 | 134.554
3.2 | 551 | 505 | 46 | 449 | 50.5 | 123.207
4.3.1.4 Results Discussion
The results obtained from the Viola-Jones detector while tuning the parameters, presented in the tables above, can be summarized as follows:
1. The window size does not affect the detection rate, while the detection time increases as the window size becomes smaller.
2. The detection rate is directly proportional to the neighbor threshold when the minimum window size and scale factor are chosen, see table (4.6); the detection time, however, is inversely proportional to the neighbor threshold.
3. A high detection rate is obtained when the minimum scale factor is used, provided that the neighbor threshold is high, see table (4.11).
According to the test results, the Viola-Jones detector still suffers from the problem of miss detection (false negative and positive alarms). To
overcome this problem, the optimal parameters from the above tables are used to reduce the false alarms in the next sections. In Figures (4.4 - 4.6) the detection rate versus the involved parameters of the face detector is shown.
Fig (4.4) DR versus WS
Fig (4.5) DR versus NT
Fig (4.6) DR versus SF
4.3.2 Effect of the Involved Parameters (Scenario 2)
This scenario differs from scenario 1 in the following conditions:
1. Lighting condition (i.e., no natural environment): increases the false positives of type I.
2. The frame rate is high (30 f/s): increases the false negative ratio.
3. Fast movement of the person, and movement in different directions, which increases the false negative alarms.
4. A large distance between the webcam and the walking person, which affects the aspect ratio of the face window size. For that reason, the manual thresholding cannot be applied to the whole video frames.
The same parameters as in section 4.3.1 were tuned to obtain the optimal values. After conducting the test on 900 frames, we found that the best parameters remain the same (i.e., WS=20×20, SF=1.2 and NT=3).
4.3.2.1 Window Size
In table (4.12) the results of detection rate and time for three different window sizes are shown with fixed scale factor and neighbor threshold. Tables (4.13) and (4.14) present the same effect but with different fixed parameters.

Table (4.12) Effect of window size on detection rate and time (Scale factor = 1.2 and Neighbor threshold = 1)
Window size | Detected as faces | TP | FP | FN | DR | DT/ms
20×20 | 814 | 663 | 151 | 86 | 73.66 | 347.284
30×30 | 803 | 644 | 159 | 97 | 71.53 | 300.661
40×40 | 563 | 423 | 140 | 337 | 47.00 | 275.951
Table (4.13) Effect of window size on detection rate and time (Scale factor = 2.2 and Neighbor threshold = 2)
Window size | Detected as faces | TP | FP | FN | DR | DT/ms
20×20 | 216 | 207 | 9 | 684 | 23.00 | 281.897
30×30 | 216 | 207 | 9 | 684 | 23.00 | 109.933
40×40 | 216 | 207 | 9 | 684 | 23.00 | 160.659
Table (4.14) Effect of window size on detection rate and time (Scale factor = 3.2 and Neighbor threshold = 3)
Window size | Detected as faces | TP | FP | FN | DR | DT/ms
20×20 | 165 | 165 | 0 | 735 | 18.22 | 80.183
30×30 | 165 | 165 | 0 | 735 | 18.22 | 59.534
40×40 | 165 | 165 | 0 | 735 | 18.22 | 59.620
4.3.2.2 Neighbor Threshold
In table (4.15) the results of detection rate and time for three different neighbor thresholds are shown with fixed window size and scale factor. Tables (4.16) and (4.17) present the same effect but with different fixed parameters.

Table (4.15) Effect of neighbor threshold on detection rate and time (Scale factor = 1.2 and Window size = 20×20)
Neighbor threshold | Detected as faces | TP | FP | FN | DR | DT/ms
1 | 814 | 663 | 151 | 86 | 73.66 | 347.284
2 | 802 | 748 | 54 | 98 | 83.11 | 581.982
3 | 773 | 754 | 19 | 127 | 85.88 | 558.817
Table (4.16) Effect of neighbor threshold on detection rate and time (Scale factor = 2.2 and Window size = 30×30)
Neighbor threshold | Detected as faces | TP | FP | FN | DR | DT/ms
1 | 717 | 668 | 49 | 183 | 74.22 | 164.422
2 | 216 | 207 | 9 | 684 | 23.11 | 281.897
3 | 184 | 179 | 5 | 716 | 19.88 | 160.813
Table (4.17) Effect of neighbor threshold on detection rate and time (Scale factor = 3.2 and Window size = 40×40)
Neighbor threshold | Detected as faces | TP | FP | FN | DR | DT/ms
1 | 198 | 177 | 21 | 702 | 19.66 | 79.200
2 | 179 | 175 | 4 | 721 | 19.44 | 78.373
3 | 165 | 165 | 0 | 735 | 18.22 | 80.183
4.3.2.3 Scale Factor
In table (4.18) the results of detection rate and time for three different scale factors are shown with fixed window size and neighbor threshold. Tables (4.19) and (4.20) present the same effect but with different fixed parameters.

Table (4.18) Effect of scale factor on detection rate and time (Neighbor threshold = 1 and Window size = 20×20)
Scale factor | Detected as faces | TP | FP | FN | DR | DT/ms
1.2 | 814 | 663 | 151 | 86 | 73.66 | 347.284
2.2 | 303 | 281 | 22 | 597 | 31.22 | 240.086
3.2 | 198 | 179 | 19 | 702 | 19.88 | 159.980
Table (4.19) Effect of scale factor on detection rate and time (Neighbor threshold = 2 and Window size = 30×30)
Scale factor | Detected as faces | TP | FP | FN | DR | DT/ms
1.2 | 712 | 658 | 54 | 188 | 73.11 | 431.02
2.2 | 216 | 207 | 9 | 684 | 23.00 | 281.893
3.2 | 179 | 175 | 4 | 721 | 19.44 | 78.619
Table (4.20) Effect of scale factor on detection rate and time (Neighbor threshold = 3 and Window size = 40×40)
Scale factor | Detected as faces | TP | FP | FN | DR | DT/ms
1.2 | 411 | 393 | 18 | 489 | 43.66 | 236.933
2.2 | 184 | 179 | 5 | 716 | 19.88 | 155.619
3.2 | 165 | 165 | 0 | 735 | 18.22 | 80.183
4.3.2.4 Scenario 2 Results Evaluation
Based on the obtained results, the occurrence of false negative alarms remained high because of the video properties described in section (4.3.2). In this scenario the negative alarms typically occurred between frames separated by a large time interval (frame #20, frame #40, and so on). The face detector therefore failed for this scenario, and its detection rate could not be accepted. The tracker needs to be initialized by the detector from the previous frames (updating the template position depends on the detected faces), so the template matching based face tracker also failed for this scenario. Therefore, the proposed FDM is applied only to scenario 1.
4.4 Improved Detection Rates (Scenario 1)
The face tracker, which follows the Viola-Jones face detector,
mainly depends on the face detected in the previous frame of the video. That is, if false positive alarms of type I (see section 3.3.1.1) are fed directly to the tracker, the tracker is subject to error, which strongly affects the performance of the proposed FDM. Therefore, the FP alarms of type I, occurring before tracking, are reduced by applying manual thresholding. The type II false positive alarms, which occur after tracking (template matching), are minimized using the dynamic thresholding algorithm (Euclidean distance). Template matching is also used to minimize the false negative alarms.
4.4.1 Reduce False Positive Type I Using Manual Thresholding
The optimal parameters of the face detector are WS=20×20, SF=1.2 and NT=3. The results of the manual thresholding algorithm tested on video scenario 1 with the optimal parameters are shown in table (4.21). Table (4.21) False positive type I reduction
Methods | Detected as faces | TP | FP | FN | DR
Viola-Jones | 964 | 913 | 51 | 36 | 91.3
Manual thresholding | 914 | 898 | 6 | 86 | 89.8
4.4.2 Reduce False Negative Using Template Matching
The undetected faces (FN) can be reduced by tracking faces between the current and previous frames. The tracker uses the template matching method; the results after tracking are shown in table (4.22).

Table (4.22) False negative reduction
Methods | Detected as faces | TP | FP | FN | DR
Viola-Jones | 964 | 913 | 51 | 36 | 91.3
Manual thresholding | 914 | 898 | 6 | 86 | 89.8
Template matching | 976 | 966 | 10 | 24 | 96.6

4.4.3 Reduce False Positive Type II Using Dynamic Thresholding
As indicated in table (4.22), false negative alarms still exist because the Viola-Jones face detector with the optimal parameters starts detecting only from frame number 23. On the other hand, the false positive alarms of type II are reduced from 10 to 4 by the dynamic thresholding algorithm, as presented in table (4.23). This minimization increases the recognition performance on the tracked faces. Table (4.23) False positive type II reduction
Methods | Detected as faces | TP | FP | FN | DR
Viola-Jones | 964 | 913 | 51 | 36 | 91.3
Manual thresholding | 914 | 898 | 6 | 86 | 89.8
Template matching | 976 | 966 | 10 | 24 | 96.6
Dynamic thresholding | 976 | 972 | 4 | 24 | 97.2
4.4.4 Improvement of Detection Time
The Viola-Jones face detector with the optimal parameters obtained in section (4.3.1.2) is very slow because of the small window size (20×20) and scale factor (1.2). This is due to the trade-off between improving detection rate and detection time. For scenario 1, the detection time is calculated for the face detector and tracker with and without the region of interest (ROI) algorithm.
4.4.4.1 Face Detection Time
The Viola-Jones detector is tested on 1000 frames to determine the detection time using the ROI method. In table (4.24) the comparison between the detection time with and without ROI is shown.
Table (4.24) Detection time with and without ROI
Viola-Jones | Detected as faces | TP | FP | FN | DR | DT/ms
Without ROI | 964 | 913 | 51 | 36 | 91.3 | 243.090
With ROI | 964 | 913 | 51 | 36 | 91.3 | 78.807
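The ROI speed-up can be sketched as follows: the detector (or tracker) is run only inside a window around the face found in the previous frame instead of over the whole 640×480 frame. The margin value and function name are our assumptions, not the thesis code.

```python
# Sketch: compute a clamped region of interest around the previous face
# rectangle (x, y, w, h); searching only this region reduces detection time.

def roi_around(face, frame_w, frame_h, margin=40):
    x, y, w, h = face
    rx, ry = max(0, x - margin), max(0, y - margin)
    rw = min(frame_w, x + w + margin) - rx
    rh = min(frame_h, y + h + margin) - ry
    return rx, ry, rw, rh

roi = roi_around((300, 200, 120, 90), 640, 480)   # (260, 160, 200, 170)
```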
4.4.4.2 Face Tracking Time
The template matching based tracker is tested on 1000 frames to determine the tracking time using the ROI method. In table (4.25) the comparison between the tracking time with and without ROI is shown. Table (4.25) Tracking time with and without ROI
Template matching | Detected as faces | TP | FP | FN | DR | DT/ms
Without ROI | 976 | 966 | 10 | 24 | 96.6 | 222.023
With ROI | 976 | 966 | 10 | 24 | 96.6 | 72.410
From tables (4.24) and (4.25), the total time of the proposed FDM is: Total time = detection time + tracking time = 78.807 + 72.410 = 151.217 ms.
4.5 Face Recognition
The face recognition process is carried out in two phases (training and testing). The training faces are obtained after applying the Viola-Jones face detector. To get the eigenfaces for each person, the PCA algorithm is applied, and the results are stored in an XML file. In this work the training images are taken from 10 different persons, with 15 faces for each one. The test phase is the recognition of an unidentified person's face. The FDM is used to detect the face, and PCA is applied to the test face to find its eigenface. The matching algorithm, which depends on the Euclidean distance measure, is performed between the eigenvalues of the training and testing eigen
faces. Figures (4.7) and (4.8) present both training and testing phases for face recognition.
Fig (4.7) Training phase in face recognition system.
Fig (4.8) Test phase in face recognition system.
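The matching step can be sketched as a nearest-neighbor search over PCA coefficient vectors using the Euclidean distance; the person names and coefficient values below are illustrative, not taken from the thesis face database.

```python
# Sketch: each face is represented by its PCA projection coefficients; a test
# face is assigned to the training person whose coefficients are nearest in
# Euclidean distance.
import math

def recognize(test_coeffs, training):
    """training maps person -> coefficient vector; returns closest person."""
    return min(training, key=lambda person: math.dist(training[person],
                                                      test_coeffs))

training = {"person_a": [0.9, 0.1], "person_b": [-0.7, 0.6]}
who = recognize([0.8, 0.2], training)   # -> "person_a"
```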
Chapter Five
Conclusions and Suggestions for Future Work
5.1 Conclusions
In this research, a hybrid method (face detection and tracking) is developed for a face recognition system. The face detector uses the Viola-Jones algorithm, while the face tracker is based on the dynamic template matching algorithm. The proposed FDM aims at an effective reduction of the false positive and negative alarms which occur in video frames. It is tested on two video scenarios. Based on the test results presented in the previous chapter, the following conclusions about the performance of the FDM can be drawn:
1. The test results of Viola-Jones video based detection with the best suitable parameters showed a high number of false alarms.
2. The Viola-Jones detector is improved in terms of detection rate and time using the proposed FDM:
a. DR = 91.3% improved to DR = 97.2%.
b. DT = 465.113 ms improved to DT = 151.217 ms (using the ROI algorithm).
3. The manual threshold algorithm reduced the false positive alarms of type I from 56 objects (face and non-face) to 6 objects.
4. Face tracking based on the template matching algorithm reduced the false negative alarms from 86 (undetected faces) to 24.
5. Face tracking based on the dynamic template matching algorithm reduced the false positive alarms of type II from 10 objects (face and non-face) to 4 objects.
6. The proposed FDM was robust against a complex background when two persons were moving towards the webcam; that is, the frontal face was uniquely detected.
7. The main advantages of the proposed FDM are:
a. Reduction of false positive and negative alarms.
b. Reduction of time complexity by using ROI.
c. The proposed FDM can be considered a simple detector for frontal face recognition.
8. The drawback of the proposed FDM is the difficulty of handling scenarios that involve illumination changes, fast object movement and pose variation (scenario 2).
5.2 Suggestions for Future Work
The following suggestions can be taken into consideration for future research work:
1. Extending the proposed FDM to detect and track multiple faces instead of a single face.
2. The FDM can also be improved in terms of recognition rate.
3. Using motion estimation techniques to overcome the problem of fast object movement; this would make the proposed FDM more robust against different video scenarios.
4. Using algorithms other than template matching for face tracking (Kalman filter, optical flow, etc.).
5. Combining eye detection with the Viola-Jones face detector to reduce false positive alarms; in this case the tracker might be more robust to pose variation.
6. For video scenario 2, a preprocessing step based on histogram equalization can be added to address the illumination problem.
7. The proposed FDM can be adapted to different applications by using other biometric techniques.
References
[Ara, 08] S. R. Arachchige, "Face Recognition in Low Resolution Video Sequences using Super Resolution", Kate Gleason College of Engineering, Department of Computer Engineering, Master's Thesis, August 2008.
[Bak, 04] S. Baker, and I. Matthews, "Lucas-Kanade 20 Years On: A Unifying Framework", International Journal of Computer Vision, Vol. 56, pp. 221-255, 2004.
[Bis, 06] C. M. Bishop, "Pattern Recognition and Machine Learning", ISBN-10: 0387310738, 1st Edition, Springer, 2006.
[Bro, 86] T. J. Broida, and R. Chellappa, "Estimation of Object Motion Parameters from Noisy Images", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 8, No. 1, pp. 90-99, 1986.
[Cas, 09] D. Casas, "Real-Time Face Tracking Methods", Master's Thesis, University Autónoma de Barcelona, 2009.
[Che, 10] D. Chen, J. Wang, and Y. Zhou, "Face Detection Method Research and Implementation Based on AdaBoost", Intelligent Information Processing
and Trusted Computing (IPTC), ISBN: 978-1-4244-8148-4, pp.643-646, 2010.
[Com, 03] D. Comaniciu, V. Ramesh, and P. Meer, "Kernel-based object tracking", IEEE Transaction on Pattern Analysis and Machine Intelligence 25, pp.564– 575, 2003.
[Cor, 07] P. Corcoran, M. C. Jonita, and J. Bacivarov, "Next Generation Face Tracking Technology Using AAM Techniques", Signals, Circuits and System, ISSCS (2007), ISBN: 1-4244-0969-1, pp. 1-4, 2007.
[GU, 08] Q. GU, "Finding and Segmenting Human Faces", Uppsala University, Master Thesis, February 2008. [Hag, 96] G. D. Hager, and P. N. Belhumeur," Real-time tracking of image regions with changes in geometry and illumination", IEEE Conference on Computer Vision and Pattern Recognition, pp. 403, USA, 1996.
[Hew, 07] R. Hewitt, “Seeing with opencv, part 5: Implementing eigenface”. Servo Magazine, pp.50, May 2007.
[Hsu, 02] R.L. Hsu, M. Abdel-Mottaleb and A.K. Jain, "Face Detection in Color Images", IEEE Transaction on Pattern Analysis and Machine Intelligence, Vol. 24, No. 5, pp. 696-706, May 2002. [Isra, 07] E. Ben-Israel," Tracking of Humans Using Masked Histograms and Mean Shift", Introductory Project in Computer Vision, March 2007. [Jen, 08] O. H. Jensen "Implementing the Viola-Jones Face Detection Algorithm", Master Thesis in Informatics and Mathematical Modeling, Technical University of Denmark, 2008. [Jin, 05] Z. Jin, Z. Lou, J. Yang, and Q. Sun," Face Detection Using Template Matching and Skin Color Information", International Conference on Intelligent Computing, PP.23-26, China, August 2005. [Jor, 06] A. Jorgensen, "AdaBoost and Histograms for Fast Face Detection", Master Thesis of Computer Science, Stockholm, Sweden 2006. [Kang, 04] J. Kang, I. Cohen, and G. Medioni, " Object Reacquisition using Invariant Appearance Model", Proceedings of 17th International Conference on Pattern Recognition, Vol.4, pp. 759-762, USA, 2004.
[Kim, 01] J. Kim, "Face Localization for Face Recognition in Video", Department of Electrical and Electronic Engineering, Yonsei University, Master's Thesis, 2001.
[Lee, 07] H. Lee, and D. Kim, "Robust face tracking by integration of two separate trackers: Skin Color and facial shape", Pattern Recognition, Vol.40, No. 11, pp: 3225 – 3235, 2007.
[Li, 08] Y. Li, H. Ai, T.Yamashita, S. Lao, and M. Kawade, ," Tracking in Low Frame Rate Video: A Cascade Particle Filter with Discriminative Observers of Different Life Spans", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.30, No.10, 2008.
[Lie, 02] R. Lienhart, and J. Maydt, “An extended Set of Haar-like Features for Rapid Object Detection”, In Proceedings of International Conference on Image Processing , vol. 1, No. 9, pp. 900-903, 2002.
[Mar, 09] J. G. Martil, "Face Recognition in Controlled Environments using Multiple Images", Master's Thesis, March 2009.
[Mog, 97] B. Moghaddam and A. Pentland, "Probabilistic Visual Learning for Object Representation", IEEE Transaction on Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 696-710, July 1997.
[Nal, 08] K. Nallaperumal, R. Subban, R. K. Selvakumar, A. L. Fred, C. N. KennadyBabu, S.S. Vinsley, and C. Seldev, "Human Face Detection in Color Images using Mixed Gaussian Color Models", International Journal of Imaging Science and Engineering (IJISE),GA,USA,ISSN:19349955,Vol.2, No.1, January 2008.
[Par, 09] U. Park, " Face Recognition: face in video, age invariance, and facial marks", A Dissertation of Doctor of Philosophy in Computer Science, 2009.
[Ram, 11] K. Ramirez, D. Cruz, and H. Perez, "Face Recognition and Verification using Histogram Equalization", ISSN: 1792-4863, ISBN: 978-960-474231-8, 2011.
[Row, 98] H.A. Rowley, S. Baluja, and T. Kanade, "Neural Network-Based Face Detection", IEEE Transaction on Pattern Analysis and Machine Intelligence, Vol.20, No. 1, pp. 23-38, 1998.
[Ryu, 08] H. Ryu, M. Kim, V. Dinh, S. Chun, and S. Sull," Robust Face Tracking Based on Region Correspondence and its Application for Person Based Indexing System " ,International Journal of Innovative Computing, Information and Control ICIC ,ISSN 1349-4198,Vol.4, No. 11, November 2008.
[Sar, 10] J. Sarvanko, "Face Sequence Detection for a Web-Based Annotation Application" University of Oulu, Department of Electrical and Information Engineering, Master’s Thesis, 62 p, 2010.
[Sil, 05] S. Silva, "Remote Surveillance and Face Tracking with Mobile Phones (Smart Eyes)", University of the Western Cape, Department of Computer Science. Master’s Thesis, 106 p, May 2005,
[Sez, 05] O. G. Sezer, "Super Resolution Techniques for Face Recognition from Video", Sabanci University, Master's Thesis, Spring 2005.
[Tha, 09] N. D.Thanh, W. Li, and P. Ogunbona," A Novel Template Matching Method for Human Detection," In Proceedings of 16th IEEE International Conference on Image Processing (ICIP), ISSN: 1522-4880, pp.2549-2552, Cairo, 2009.
[Vio, 01] P. Viola, and M. Jones, "Rapid Object Detection Using a Boosted Cascade of Simple Features", In Proceedings of International Conference on Computer Vision and Pattern Recognition, pp. 511-518, 2001.
[Vio, 04] P. Viola, and M. Jones, "Robust Real-Time Face Detection", International Journal of Computer Vision 57(2), 2004.
[Wan, 09] H. Wang, Y. Wang, And Y. Cao, "Video-based Face Recognition: A Survey ", World Academy of Science, Engineering and Technology, Vol. 60, pp. 239, 2009.
[Wu, 04] H. Wu, and J. S. Zelek,"The Extension of Statistical Face Detection to Face Tracking", In Proceedings of the 1st IEEE Canadian Conference on Computer and Robot Vision, DOI:10.1109/CCCRV.1301415 , pp. 10-17, Washington, DC, USA, 2004.
[Wu, 08] Y. W. Wu, and X. Y. Ai, "An improvement of face detection using AdaBoost with color information", ISECS International Colloquium on Computing, Communication, Control, and Management, 10.1109/CCCM.366, pp. 317-321, 2008.
[Yang, 02] M. H. Yang, D. J. Kriegman, and N. Ahuja, “Detecting Faces in Images: A Survey”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 24, No. 1, January 2002.
[Yil, 04] A.Yilmaz, X. Li, and M. Shah," Contour-based object tracking with occlusion handling in video acquired using mobile cameras", IEEE Transaction on Pattern Analysis and Machine Intelligence, Vol. 26, No.11, pp.1531-1536, 2004.
[Yil, 06] A. Yilmaz, O. Javed, M. Shah, " Object tracking: A survey", ACM Computing Surveys, Vol.38, No. 4, Article 13, 2006.
[Zha, 10] Q. Zhao, and H. Cai, "The Research and Implementation of Face Detection and Recognition Based on Video Sequences", In Proceedings of 2nd IEEE International Conference on Future Computer and Communication, DOI: 10.1109/ICFCC 5497778, pp.318-321, Wuhan, 2010.
[Web, 01] Open Source Computer Vision Library, http://www.intel.com/technology/computing/opencv/.
Abstract (translated from Arabic)
Human face processing techniques based on video, including face detection, tracking, and recognition, have attracted considerable research interest because of their value in various applications such as video surveillance, structuring, indexing, retrieval, and summarization. To improve the results of the Viola-Jones face detection method in video, different algorithmic strategies are designed and implemented. The first improvement minimizes type I false positive alarms using a manual thresholding algorithm. To reduce missed face detections (false negatives), face tracking based on template matching is used. Finally, a dynamic thresholding algorithm is applied to minimize type II false positive alarms. The detection time of the hybrid detection and tracking phases is optimized by embedding and implementing a region-of-interest approach. In this thesis, a hybrid face detection and tracking scheme based on dynamic template matching is proposed: detection is first performed in each frame, and the tracker is then applied to the detected faces in subsequent frames. The test results of the proposed face detection method (FDM) on two different video scenarios indicate that the hybrid face detector and tracker performs much better than the Viola-Jones detector in terms of detection rate and detection time. For optimal parameters, the detection rate of the proposed hybrid method is 97.2% and the average detection time is 151.217 ms. The disadvantage of the proposed FDM is that it is not applicable to video scenes involving fast person movement and varying lighting environments, nor to multiple facial positions.
Title page (translated from Arabic)
Face Detection from Video Using Dynamic Template Matching
Yusra Ahmed Salih, Higher Diploma in Computer Science, 2008
Supervised by: Dr. Aree Ali Mohammed
January 2012
1433 (AH)
Abstract (translated from Kurdish)
Techniques for processing the human face in video, including detection, tracking, and recognition, have become a focus of much research because of their usefulness in various applications such as video surveillance, structuring, indexing, retrieval, and summarization. To improve the detection rate of the Viola-Jones face detection method in video, different algorithmic strategies are designed and implemented. The first improvement reduces type I false positive alarms by applying a manual thresholding level. On the other hand, to reduce missed detections of faces (false negatives), face tracking based on template matching is employed. Finally, a dynamic (live) thresholding level is used to reduce type II false positive alarms. The detection time of the combined detection and tracking phases is improved by implementing a region-of-interest approach. In this work, a hybrid face detection and tracking system based on dynamic template matching is proposed: at first, faces are detected in each frame, and the tracker is then applied to the detected faces in the following frames. The experimental results of the proposed face detection method (FDM) on two video scenarios show that the hybrid detector and tracker is much better than the Viola-Jones detector in terms of detection rate and detection time. For the best parameters, the detection rate of the proposed hybrid method is 97.2% and the average detection time is 151.217 ms. The shortcoming of the proposed FDM is that it is not applicable to different video scenes involving fast person movement and bright lighting environments, and it is also not used for multiple facial positions.
Title page (translated from Kurdish)
Face Detection via Video Using Dynamic Template Matching
A thesis submitted to the Faculty of Science and Science Education, School of Science, University of Sulaimani, in partial fulfillment of the requirements for the degree of Master of Science in Computer Science
By
Yusra Ahmed Salih, Higher Diploma in Computer Science, 2008
Supervised by
Dr. Aree Ali Mohammed
Rêbendan 2711 / January 2012