Human Appearance Matching Across Multiple Non-overlapping Cameras

Yinghao Cai, Kaiqi Huang and Tieniu Tan
National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences
{yhcai, kqhuang, tnt}@nlpr.ia.ac.cn

Abstract

In this paper, we present a new solution to the problem of appearance matching across multiple non-overlapping cameras. The objects of interest, pedestrians, are represented by a set of region signatures centered at points sampled from edges. Frame-to-frame appearance matching is formulated as finding corresponding points in two images by minimizing a cost function over the space of correspondences. The correspondence problem is solved under an integer optimization framework in which the cost function is determined by the similarity of region signatures as well as geometric constraints between points. Experimental results demonstrate the effectiveness of the proposed method.

1 Introduction

More and more cameras are being deployed in surveillance to monitor activities over an extended area. One problem associated with a multi-camera system is to automatically analyze and fuse the information gathered from multiple cameras so that human intervention is minimized. The prerequisite for information fusion is to establish correspondences between observations across cameras. Establishing correspondences across multiple non-overlapping cameras is more challenging than single-camera tracking since no spatial continuity can be exploited.

This paper addresses the problem of matching moving objects across multiple non-overlapping cameras. We assume that the problem of single-camera tracking is solved. The objective is to establish correspondences between video sequences as shown in Figure 1. We rely on appearance information, more specifically color cues, to identify moving objects across multiple non-overlapping cameras.

Figure 1. Example image sequences.

Appearance-based object matching must deal with several challenges such as variations in illumination conditions, poses and camera parameters. In [8], moving objects are represented by their major color spectrum histogram, while colors that rarely appear are discarded. The major color spectrum histogram in [8] does not contain any spatial information about color, which is important in discriminating one object from another. Kang et al. [5] incorporate spatial color information into the representation by partitioning the blob into a polar representation around its centroid. While this method takes the localization of color components into consideration, the coordinates of the centroid may be inaccurate because of imperfect segmentation.

Recently, there has been growing interest in feature-based object recognition methods [1, 3, 7]. Numerous methods have been put forward based on interest point detectors and associated descriptors. However, SIFT-like [3] feature detectors produce only a small number of interest points, which cannot be extracted reliably in low-resolution images such as those shown in Figure 1.

In this paper, instead of applying interest operators to specify points of interest as in [2, 3], the objects of interest, pedestrians, are represented by a set of region signatures centered at points sampled from edges. Inspired by [1], our method is based on the assumption that corresponding points on two edge maps of the same person under disjoint views should have similar region signatures. The problem of frame-to-frame matching across cameras is then formulated as a correspondence problem between a model image and a query image: how to establish the correspondence of points on two edge maps based on region signatures. The similarity of region signatures and the geometric constraints between points are encoded in a cost function defined over the space of correspondences under the integer optimization framework. The corresponding points are then used to compute a similarity measure between the model image and the query image.

The paper is organized as follows. Section 2 presents our correspondence problem. In Section 3, a sequence-to-sequence strategy is proposed to further improve the performance of frame-to-frame matching. Experimental results and conclusions are given in Section 4 and Section 5 respectively.

2 The Correspondence Problem

We now consider the correspondence problem between feature points $\{p_i\}$ uniformly sampled from edges in the model image $P$ and feature points $\{q_j\}$ in the query image $Q$. Besides having good localization properties, regions around points on the edges indicate the presence of a "Multicolored Neighborhood" [7], where a rich description of color content is included. Two kinds of constraints are exploited to solve the correspondence problem: (1) corresponding points on two edge maps should have similar region signatures; (2) pairwise geometric relationships between corresponding points on two edge maps should be preserved [1]. Different from [1], we define the pairwise geometric constraints between points as the spatial configuration of a reference point and candidate points on the edge.

The similarity of region signatures and the geometric constraints between points are encoded in a cost function defined over the space of correspondences under the integer optimization framework. The cost of assigning $q_j$ to $p_i$ is defined as:

$$\mathrm{cost}(p_i, q_j) = \omega_m C_{match}(p_i, q_j) + \omega_g C_{geometric}(p_i, q_j) \tag{1}$$

where $\omega_m$ and $\omega_g$ are the weights for match quality and geometric constraints respectively. We use $x_{i,j}$ to represent an assignment from $q_j$ to $p_i$. The correspondence problem is then formulated as:

$$x^{*} = \arg\min_{x} \sum_{i}\sum_{j} x_{i,j}\,\mathrm{cost}(p_i, q_j) \quad \text{subject to} \quad \sum_{j} x_{i,j} \le 1,\quad \sum_{i} x_{i,j} \le 1,\quad x_{i,j} \in \{0, 1\} \tag{2}$$

The constraint $\sum_{j} x_{i,j} \le 1$ ($\sum_{i} x_{i,j} \le 1$) expresses that a point $p_i$ ($q_j$) on the edge map of the model (query) image may have no counterpart in the query (model) image. Frame-to-frame matching is thus formulated as finding corresponding points in two images by minimizing the cost function defined over the space of correspondences. We solve the correspondence problem under the integer optimization framework of [6]. The match quality $C_{match}$ and the geometric constraint $C_{geometric}$ are computed in Sections 2.1 and 2.2, respectively.
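The formulation in Equations 1 and 2 can be illustrated with a minimal sketch. It is not the integer optimization method of [6]: the exhaustive search below is a stand-in that is only practical for a handful of points. The `unmatched_penalty` parameter is a hypothetical addition, introduced because the all-zero assignment would otherwise trivially minimize a non-negative cost under the inequality constraints. The default weights are the values reported in Section 4.

```python
from functools import lru_cache


def combined_cost(c_match, c_geometric, w_m=0.4, w_g=0.6):
    """Equation 1: weighted sum of match cost and geometric cost."""
    return w_m * c_match + w_g * c_geometric


def solve_assignment(cost, unmatched_penalty):
    """Exhaustively minimize the objective of Equation 2 for a small problem.

    cost[i][j] is cost(p_i, q_j). Each p_i is matched to at most one q_j and
    each q_j to at most one p_i. Leaving p_i unmatched costs
    `unmatched_penalty` (an assumption of this sketch, not part of Equation 2).
    Returns (total_cost, pairs) with pairs a tuple of (i, j) index pairs.
    """
    n_model = len(cost)
    n_query = len(cost[0]) if cost else 0

    @lru_cache(maxsize=None)
    def best(i, used):
        # `used` is a bitmask of the query points already assigned.
        if i == n_model:
            return 0.0, ()
        # Option 1: leave p_i without a counterpart.
        total, pairs = best(i + 1, used)
        best_total, best_pairs = total + unmatched_penalty, pairs
        # Option 2: assign p_i to any query point that is still free.
        for j in range(n_query):
            if used & (1 << j):
                continue
            total, pairs = best(i + 1, used | (1 << j))
            total += cost[i][j]
            if total < best_total:
                best_total, best_pairs = total, ((i, j),) + pairs
        return best_total, best_pairs

    return best(0, 0)
```

With three model points and two query points, for example, the third model point is left unmatched whenever both of its candidate costs exceed the penalty, which mirrors the role of the inequality constraints in Equation 2.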

Figure 2. Representing moving objects by region signatures centered at points on the edges: (a) original image, (b) result of the Canny edge detection algorithm, (c) region signatures on the edges.

2.1 Dominant Color Representation and Matching

As mentioned above, we represent moving objects by a set of region signatures centered at points on the edges (Figure 2). In this section, we address two problems: how to characterize the appearance of each region, and how to calculate the match quality between two regions.

Regions of size $w \times w$ are selected around edge pixels as shown in Figure 2(c). By employing the concept of color distance [8], we represent each region by its dominant colors and the frequencies with which these colors occur in the region. The computation of the dominant color representation is summarized in Algorithm 1, where colors within a distance threshold $\alpha_1$ are regarded as a single color. The distance between two colors $C_1$ and $C_2$ is defined according to [8]. Similar to [8], the colors in each region are then sorted in descending order of frequency. Thus, the $i$-th region, centered at point $p_i$ of model image $P$, is represented by its first $k$ dominant colors along with their frequencies: $R_p^i = \{(C_1, W_1), ..., (C_k, W_k)\}$.

The similarity measure between two regions $R_p^i$ and $R_q^j$ is defined as:

$$Sim(R_p^i, R_q^j) = \min\left(P(R_p^i \mid R_q^j),\; P(R_q^j \mid R_p^i)\right) \tag{3}$$

where $P(R_q^j \mid R_p^i)$ is the probability of observing the dominant color representation of $R_q^j$ in $R_p^i$, defined as:

$$P(R_q^j \mid R_p^i) = \frac{\sum_{n=1}^{M_p^i} \min\left\{ W_{p,n}^i,\; \sum_{m=1}^{M_q^j} \delta(C_{p,n}^i, C_{q,m}^j)\, W_{q,m}^j \right\}}{|N_p^i|} \tag{4}$$

Here $|N_p^i|$ is the number of pixels in the $i$-th region of model image $P$, $M_p^i$ and $M_q^j$ are the numbers of dominant colors in each region, and $W_{p,n}^i$ is the frequency of the $n$-th dominant color in the $i$-th region of model image $P$. $\delta(C_{p,n}^i, C_{q,m}^j)$ equals 1 if the two dominant colors are close enough and 0 otherwise. $P(R_p^i \mid R_q^j)$ is defined similarly. Finally, the similarity measure between two regions is transformed into a cost representation, which is the first term on the right-hand side of Equation 1.

Algorithm 1 Computation of Dominant Color Representation
  M ← 0                                        ; number of dominant colors in the region; I(x) is the RGB value at pixel x
  for each pixel x in the region do
    if dist(I(x), C_i) ≤ α1 for some dominant color C_i then
      C_i ← (1 − 1/W_i) C_i + (1/W_i) I(x)     ; update the dominant color
      W_i ← W_i + 1                            ; update the frequency of this dominant color
    else
      M ← M + 1
      C_M ← I(x)                               ; assign to a new dominant color
      W_M ← 1
    end if
  end for

2.2 Geometric Constraints

In this section, we define the pairwise geometric constraints between points as the spatial configuration of a reference point and candidate points on the edge. We choose the top of the head as the reference point since it can be detected easily and is relatively stable compared with the centroid under imperfect segmentation. As shown in Figure 3, the vector from the head point $H_1$ to $p_i$ should be consistent with the vector from the head point $H_2$ to $q_j$ if $p_i$ and $q_j$ are corresponding points on the two edge maps. Two measures are defined between the head point and a candidate point: $D(H_1, p_i) = \|H_1 - p_i\|_2$ and $\theta(H_1, p_i) = \tan^{-1}(H_1 - p_i)$; $D(H_2, q_j)$ and $\theta(H_2, q_j)$ are defined similarly. The distance between the head point and the candidate point is normalized by the height of the silhouette. The geometric constraint $C_{geometric}$ is built from the differences in these two measures and is given in Equation 5.

Figure 3. The vector from H1 to pi should be consistent with the vector from H2 to qj if pi and qj are two corresponding points on two edge maps under two views.
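Returning to Section 2.1, the dominant color representation of Algorithm 1 and the region similarity of Equations 3 and 4 can be sketched as follows. Several details are assumptions made for illustration. The function `color_distance` uses a normalized Euclidean distance as a stand-in for the color distance of [8], whose exact form is not given here. The running mean is updated after incrementing the count, which is one reading of the update step in Algorithm 1. The indicator $\delta$ is taken to be 1 when the color distance is at most the same threshold `alpha`.

```python
import math


def color_distance(c1, c2):
    """Normalized Euclidean distance between two RGB triples (assumed form)."""
    den = math.hypot(*c1) + math.hypot(*c2)
    return math.dist(c1, c2) / den if den > 0 else 0.0


def dominant_colors(pixels, alpha=0.01, k=3):
    """Cluster the pixels of one region into (color, frequency) pairs.

    Returns the k most frequent dominant colors in descending frequency.
    """
    colors, weights = [], []
    for px in pixels:
        for idx, c in enumerate(colors):
            if color_distance(px, c) <= alpha:
                # Absorb the pixel into an existing dominant color
                # by updating its running mean and its count.
                weights[idx] += 1
                w = weights[idx]
                colors[idx] = tuple((1 - 1 / w) * a + b / w for a, b in zip(c, px))
                break
        else:
            # No dominant color is close enough: start a new one.
            colors.append(tuple(float(v) for v in px))
            weights.append(1)
    ranked = sorted(zip(colors, weights), key=lambda cw: cw[1], reverse=True)
    return ranked[:k]


def conditional_prob(region_a, region_b, n_pixels_a, alpha=0.01):
    """Equation 4 with region_a in the role of R_p and region_b of R_q."""
    total = 0.0
    for c_a, w_a in region_a:
        matched = sum(w_b for c_b, w_b in region_b
                      if color_distance(c_a, c_b) <= alpha)
        total += min(w_a, matched)
    return total / n_pixels_a


def similarity(region_p, region_q, n_p, n_q, alpha=0.01):
    """Equation 3: the smaller of the two conditional probabilities."""
    return min(conditional_prob(region_q, region_p, n_q, alpha),
               conditional_prob(region_p, region_q, n_p, alpha))
```

For a 5 × 5 region, `pixels` would hold 25 RGB triples and `n_p` would be 25. How the similarity is turned into the cost $C_{match}$ is not specified in the text, so that step is left out of the sketch.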

3 Sequence-to-Sequence Matching

In frame-to-frame matching, for each point in the model image we find the best matching point in the query image by region signature matching and geometric constraints. The frame-to-frame similarity measure between model image $P$ and query image $Q$ is computed as the mean similarity of these best correspondences:

$$P(Q \mid P) = \frac{1}{K} \sum_{k=1}^{K} Sim(R_p^k, R_q^k) \tag{6}$$

where $K$ is the number of corresponding points on the two edge maps. Generally, the more points matched, the more similar the compared images. However, matching the appearance of objects from a single image may introduce uncertainty because of imperfect segmentation and pose changes. In this section, we employ a sequence-based matching method to further improve the performance of frame-to-frame matching. We match each image in the model sequence to each image in the query sequence, as illustrated in Figure 4. The score of the best matching pair is chosen as the similarity score between the two sequences [4].

Figure 4. A sequence-to-sequence strategy.
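The frame-level score of Equation 6 and the best-pair rule for sequences can be sketched in a few lines. The function names and the `frame_score` callable are hypothetical, and returning zero when no correspondences are found is an assumption of the sketch rather than something stated in the text.

```python
def frame_similarity(pair_similarities):
    """Equation 6: mean region similarity over the K matched point pairs."""
    if not pair_similarities:
        return 0.0  # assumption: no correspondences means zero similarity
    return sum(pair_similarities) / len(pair_similarities)


def sequence_similarity(model_seq, query_seq, frame_score):
    """Score two sequences by their best matching pair of frames.

    frame_score(model_frame, query_frame) returns a frame-to-frame
    similarity. Both sequences are expected to be non-empty; max() raises
    ValueError otherwise.
    """
    return max(frame_score(m, q) for m in model_seq for q in query_seq)
```

Every model frame is compared with every query frame, so the cost grows with the product of the two sequence lengths.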

The geometric constraint of Section 2.2 is defined as:

$$C_{geometric}(p_i, q_j) = \Delta D_{i,j} + \Delta\theta_{i,j} \tag{5}$$

where

$$\Delta D_{i,j} = |D(H_1, p_i) - D(H_2, q_j)|, \qquad \Delta\theta_{i,j} = |\theta(H_1, p_i) - \theta(H_2, q_j)|$$

Both $\Delta D_{i,j}$ and $\Delta\theta_{i,j}$ are transformed into a common domain so that they can be summed. The optimal assignment of Equation 2 can be found efficiently under the integer optimization framework [6].
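A minimal sketch of Equation 5 follows. The text does not say how the two terms are brought into a common domain, so the choices here are assumptions: the distance is divided by the silhouette height as described in Section 2.2, and the angular difference is wrapped into $[0, \pi]$ and divided by $\pi$. The angle is measured with `atan2` on the vector from the head point to the candidate point; using the opposite direction shifts both angles by $\pi$ and leaves their difference unchanged.

```python
import math


def head_measures(head, point, height):
    """Return (normalized distance, angle) of `point` relative to `head`."""
    dx, dy = point[0] - head[0], point[1] - head[1]
    distance = math.hypot(dx, dy) / height  # normalized by silhouette height
    angle = math.atan2(dy, dx)
    return distance, angle


def geometric_cost(head1, p, height1, head2, q, height2):
    """Equation 5: distance difference plus scaled angular difference."""
    d1, a1 = head_measures(head1, p, height1)
    d2, a2 = head_measures(head2, q, height2)
    delta_d = abs(d1 - d2)
    # Wrap the angular difference into [0, pi], then scale to [0, 1]
    # (the scaling is an assumption of this sketch).
    diff = abs(a1 - a2) % (2 * math.pi)
    delta_theta = min(diff, 2 * math.pi - diff) / math.pi
    return delta_d + delta_theta
```

Two points that lie at the same relative position below the head in silhouettes of different sizes receive a cost of zero under this sketch.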

4 Experimental Results and Analysis

The experimental setup consists of two outdoor cameras with non-overlapping fields of view. The layout is shown in Figure 5(a). Some sample images are shown in Figure 6.

Figure 5. Experimental setup: (a) the layout of the camera system, (b) views from two widely separated cameras.

We evaluate the effectiveness of the proposed method on a dataset of 42 people. In computing the dominant color representation of each region, the color distance parameter α1 in Algorithm 1 is set to 0.01. Regions of size 5 × 5 are selected around edge points. Most regions contain 2 to 3 dominant colors. In Equation 1, the weights for match quality and geometric constraints are set to 0.4 and 0.6 respectively.

Figure 7 shows the rank matching performance of frame-to-frame matching, sequence-to-sequence matching and a bounding box method [2]. Rank-i (i = 1, ..., 10) performance is the rate at which the correct person is in the top i of the retrieved list. In frame-to-frame matching, frames are selected randomly from the sequence. The bounding box method computes a single signature from the foreground pixels in the bounding box of each moving object; it serves as a baseline algorithm for comparison in [2]. Different people with similar appearances bring uncertainty into the system, which can explain the rank-one accuracy of 65% in frame-to-frame matching. Figure 8 shows corresponding points found on two images of the same person under disjoint cameras. Corresponding points are marked with the same color in the image pairs in Figure 8(a-b) and (c-d).
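The rank-i measure described above can be computed as in the following sketch. The data layout is hypothetical: each query is given as a dictionary from candidate identity to similarity score, together with its true identity, and the true identity is assumed to be present among the candidates.

```python
def rank_accuracy(score_lists, true_ids, max_rank=10):
    """Return [rank-1, ..., rank-max_rank] accuracies.

    score_lists[n] maps candidate identity -> similarity for query n, and
    true_ids[n] is the correct identity for that query.
    """
    hits = [0] * max_rank
    for scores, true_id in zip(score_lists, true_ids):
        # Candidates sorted from most to least similar.
        ranked = sorted(scores, key=scores.get, reverse=True)
        position = ranked.index(true_id)  # 0-based rank of the correct person
        # A correct match at position r counts for every rank >= r + 1.
        for r in range(position, max_rank):
            hits[r] += 1
    n = len(true_ids)
    return [h / n for h in hits]
```

The resulting list is non-decreasing in the rank, which is the shape of the curves plotted in Figure 7.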

Figure 6. Each column contains the same person under two disjoint views.

Figure 7. Rank matching performance. Horizontal axis: performance of ranked matches (ranks 1 to 10); vertical axis: accuracy (0 to 1); curves: frame-to-frame, bounding box method, sequence-to-sequence.

Figure 8. Corresponding points are marked with the same color in image pairs (a-b) and (c-d).

5 Conclusions

In this paper, we have proposed a solution to the problem of appearance matching across multiple non-overlapping cameras by establishing the correspondence of points sampled from edges. Experimental results demonstrate the effectiveness of the proposed method. Future work will focus on evaluating the proposed method on larger datasets.

Acknowledgement

This work is funded by research grants from the National Basic Research Program of China (2004CB318110), the National Science Foundation (60605014, 60332010, 60335010 and 2004DFA06900), and the CASIA Innovation Fund for Young Scientists. The authors also thank the anonymous reviewers for their valuable comments.

References

[1] A. C. Berg, T. L. Berg, and J. Malik. Shape matching and object recognition using low distortion correspondences. CVPR, pages 26–33, 2005.
[2] N. Gheissari, T. B. Sebastian, and R. Hartley. Person re-identification using spatiotemporal appearance. CVPR, pages 1528–1535, 2006.
[3] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.
[4] Y. Guo, S. Hsu, H. S. Sawhney, R. Kumar, and Y. Shan. Robust object matching for persistent tracking with heterogeneous features. PAMI, 29(5):824–839, 2007.
[5] J. Kang, I. Cohen, and G. Medioni. Continuous tracking within and across camera streams. CVPR, pages 267–272, 2003.
[6] J. Maciel and J. P. Costeira. A global solution to sparse correspondence problems. PAMI, 25(2):187–199, 2003.
[7] S. K. Naik and C. A. Murthy. Distinct multicolored region descriptors for object recognition. PAMI, 29(7):1291–1296, 2007.
[8] M. Piccardi and E. D. Cheng. Multi-frame moving object track matching based on an incremental major color spectrum histogram matching algorithm. CVPR, pages 19–27, 2005.
