Robust Estimation of Edge Density in Blurred Images Jae-Yeong Lee and Wonpil Yu Electronics and Telecommunications Research Institute (ETRI) Daejeon 305-700, Korea {jylee, ywp}@etri.re.kr
Abstract— Edges are an important cue for object detection in computer vision. In this paper, we present a filtering method that speeds up object detection by using edge density as a prefiltering measure. Specifically, the paper focuses on two problems: deriving a scale-invariant edge density measure and extracting edges robustly from blurred images. For scale invariance, the edge density is normalized by the square root of the target area; experimental results confirm the validity of the suggested density measure. The second problem, edge extraction in blurred images, is addressed by extracting edge pixels from scaled-down, histogram-equalized images, which gives more reliable extraction results. Experimental results on a large set of pedestrian images captured under various conditions, including daylight, rain, motion blur, and night, are presented and analyzed quantitatively. Keywords— Edge density estimation, edge extraction, object detection, search space reduction.
1. Introduction

Sliding window-based approaches [1], [2], [3] for object detection scan the input image with windows of various sizes and test, for each window, whether the underlying image region contains the target object. The sliding window method is very effective in that a target object can be found regardless of its size and location in the input image. However, the computational burden of the huge search space makes real-time implementation challenging. A common technique for speeding up the sliding window method is to reduce the search space by filtering out unpromising windows at an early stage with simple tests such as patch variance [4], edge [5], [6], color [5], [7], or geometric constraints [8], [9]. Recently Joung et al. [5], [6] proposed using the number of edge pixels as a simple feature for filtering out background, and applied the method to speed up a pedestrian detection task. Their edge filtering method is especially effective in environments containing flat backgrounds such as ground, road, and wall. Edges, or gradients in general, are a powerful and reliable feature for object detection and tracking in computer vision, as they are robust to illumination change, rotation, and scale. The current state-of-the-art texture-less detection algorithm Line-Mod [3], for example, compares gradient orientations computed at a set of predefined edge points of an

This work was supported by the R&D program of the Korea Ministry of Knowledge and Economy (MKE) and the Korea Evaluation Institute of Industrial Technology (KEIT). (The Development of Low-cost Autonomous Navigation Systems for a Robot Vehicle in Urban Environment, 10035354)
Fig. 1. Two sample blurred images (first column) and the edge extraction results at the original scale (second column) and at half scale (third column). The half-scale edge extraction results are shown at the original scale for visualization purposes.
object to locate the target object. The recent VTD [10] algorithm also uses an edge template as one of the basic object models for visual object tracking. In this paper, we present a filtering method based on edge density for speeding up object detection. Our method shares its basic idea with [5] in that edge density is used to filter out texture-less backgrounds. However, we focus on two unaddressed problems: deriving a scale-invariant edge density measure and extracting edges robustly from blurred images. First, we present a normalized edge density measure that is scale invariant. The second problem is to extract edges reliably from blurred images. Images are often blurred by rapid camera motion (motion blur) or low illumination, and blurring degrades the performance of edge extraction significantly. To address this problem, we propose a new edge extraction method that extracts blurred edges as robustly as normal edges. The paper is organized as follows. In the next section, we describe our edge density measure and show its scale invariance experimentally. In Sect. 3 the process of edge extraction is described in detail with a focus on blurred edges. The adjustable parameters of our method are discussed with experimental results in Sect. 4. In Sect. 5 we apply our method to a pedestrian detection task and give experimental results with quantitative analysis. The conclusion is given in Sect. 6.

2. Edge Density Measure

2.1 Edge Extraction

Generally, pedestrian images have more vertical edges than horizontal ones. To maximize discrimination from
Fig. 2. Edge extraction. (a) Input image I. (b) Histogram-equalized image H. (c) Edge image obtained by thresholding H ∗ G_x^+ with T1 = 150. (d) Edge image obtained by thresholding H ∗ G_x^- + H ∗ G_x^+ with T1 = 150.

Fig. 3. Twelve sample pedestrian image patches.
background, we therefore consider only vertical edges, under the assumption that the background has equal numbers of vertical and horizontal edges. We first enhance the contrast of the input image by histogram equalization before extracting vertical edges. Histogram equalization normalizes contrast across images and makes edge extraction more reliable. The edge magnitude is computed by convolving the histogram-equalized image with Sobel operators as follows:
\[
G_x^{+} = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix},
\qquad
G_x^{-} = \begin{bmatrix} 1 & 0 & -1 \\ 2 & 0 & -2 \\ 1 & 0 & -1 \end{bmatrix}
\tag{1}
\]

\[
E = H * G_x^{-} + H * G_x^{+}
\tag{2}
\]

Fig. 4. Graph of edge count according to the change of window scale.
In Eq. (1), G_x^+ serves to emphasize pixels whose intensity increases from left to right, while G_x^- emphasizes pixels whose intensity decreases from left to right. Edge pixels are then defined as the image pixels whose edge magnitude is larger than a predefined threshold, which we denote by T1. Figure 2 shows an example of edge extraction in a driving environment. We can observe that the lane and ground areas contain few edges and are thus easily excluded at an early stage of detection by rejecting windows whose edge density is lower than a predefined threshold. The reason why we use both G_x^- and G_x^+ instead of a single filter is illustrated in Fig. 2(c) and (d). Applying the Sobel operator in both directions (+ and −) roughly doubles the edge density of pedestrians while only slightly increasing that of the background. We therefore expect maximal discrimination efficiency of the edge density with the bidirectional Sobel operator.

2.2 Edge Density Measure

We can measure the edge density of a window as the number of edge pixels it contains, as described in [5]. However, the raw edge count works only for objects of a fixed scale and is not adequate for our purpose, since the size of pedestrians in images varies largely with the distance from the camera.
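To make the extraction step concrete, here is a minimal NumPy sketch (the names hist_equalize, correlate3, and edge_mask are ours, not from the paper; OpenCV's equalizeHist and filter2D would serve equally well). We assume each directional response is rectified, i.e. negative values are clipped to zero as in saturating 8-bit filtering, so that summing the two rectified responses realizes the bidirectional magnitude E of Eq. (2).

```python
import numpy as np

# Sobel kernels of Eq. (1): G_x^+ responds to intensity increasing from
# left to right, G_x^- (= -G_x^+) to intensity decreasing.
GX_POS = np.array([[-1, 0, 1],
                   [-2, 0, 2],
                   [-1, 0, 1]], dtype=np.int64)

def hist_equalize(img):
    """Histogram equalization of an 8-bit grayscale image."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum().astype(np.float64)
    cdf_min = cdf[cdf > 0][0]
    scale = 255.0 / max(cdf[-1] - cdf_min, 1.0)
    lut = np.clip(np.round((cdf - cdf_min) * scale), 0, 255).astype(np.uint8)
    return lut[img]

def correlate3(img, kernel):
    """'Valid' 2-D correlation of an image with a 3x3 kernel."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2), dtype=np.int64)
    f = img.astype(np.int64)
    for i in range(3):
        for j in range(3):
            out += kernel[i, j] * f[i:i + h - 2, j:j + w - 2]
    return out

def edge_mask(img, t1=150):
    """Edge pixels by Eq. (2): the rectified responses of G_x^+ and G_x^-
    are summed and thresholded by T1."""
    r = correlate3(hist_equalize(img), GX_POS)
    e = np.maximum(r, 0) + np.maximum(-r, 0)  # rectified H*G_x^+ and H*G_x^-
    return e > t1

# Example: a vertical step produces edge responses only at the step.
step = np.zeros((10, 10), dtype=np.uint8)
step[:, 5:] = 200
mask = edge_mask(step)
assert mask[:, 3].all() and not mask[:, 0].any()
```

On real images the rectified bidirectional sum behaves like the absolute horizontal gradient, which is consistent with the roughly doubled pedestrian edge density observed in Fig. 2(d).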
We define a scale-invariant measure of edge density as

\[
D = \frac{N}{\sqrt{S}},
\tag{3}
\]

where N is the number of edge pixels and S is the area of the window. It is worth noting that Eq. (3) uses the square root of S rather than S itself as the denominator. In other words, Eq. (3) assumes that the number of edge pixels in a window grows in proportion to the window scale (i.e., its side length) rather than its area. To justify Eq. (3) we performed an experiment on real pedestrian samples. We first collected 12 pedestrian image patches (see Fig. 3) and applied histogram equalization. The patches were then scaled down gradually by a factor of 1.05, producing a so-called image pyramid. Next, we extracted edge pixels from each scaled patch and measured the edge density using Eq. (3). Figure 4 shows the relationship between the edge count and the window scale for the 12 pedestrian patches. In the figure, we can observe that the edge count is linearly proportional to the window scale for most pedestrian patches except two (red lines). The two exceptions correspond to the third and seventh patches of Fig. 3, where the edges are significantly blurred. The experimental result shown in Fig. 4 suggests two points. First, the number of edge pixels increases in proportion to window scale rather than window area, justifying Eq. (3). Second, edge extraction can be very unreliable in blurred images. In the next section
Fig. 5. Sample images of each dataset category. (top-left) Normal. (top-right) Rain. (bottom-left) Blur. (bottom-right) Night.
we will describe how to extract edges and measure edge densities robustly in blurred images.
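Before turning to blurred edges, the window-wise evaluation of Eq. (3) can be sketched in code. This is a minimal NumPy sketch under our own naming (edge_integral and edge_density are not names from the paper); the summed-area table is the integral image of Viola and Jones [1], which the full method uses for fast per-window edge counts. The synthetic check mirrors the pyramid experiment: a full-height vertical contour contributes edge pixels in proportion to window height, so D is unchanged across scale.

```python
import numpy as np

def edge_integral(mask):
    """Zero-padded summed-area table of a boolean edge mask, so any
    window's edge count N costs four table lookups."""
    ii = mask.astype(np.int64).cumsum(axis=0).cumsum(axis=1)
    return np.pad(ii, ((1, 0), (1, 0)))

def edge_density(ii, x, y, w, h):
    """Eq. (3): D = N / sqrt(S) for the w x h window at top-left (x, y)."""
    n = ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]
    return n / np.sqrt(w * h)

# Scale-invariance check: one full-height vertical edge at two scales.
full = np.zeros((100, 50), dtype=bool)
full[:, 25] = True
half = np.zeros((50, 25), dtype=bool)
half[:, 12] = True

d_full = edge_density(edge_integral(full), 0, 0, 50, 100)
d_half = edge_density(edge_integral(half), 0, 0, 25, 50)
assert abs(d_full - d_half) < 1e-9  # both equal sqrt(2)
```

During sliding-window detection, a window would be rejected whenever edge_density(...) falls below the density threshold T2.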
Fig. 6. A graph of the best (smallest) missing rate with respect to the change of the edge magnitude threshold T1 under fixed filtering rates of 0.2, 0.3, and 0.4.
3. Extraction of Blurred Edges

Images can be blurred by rapid camera motion (motion blur) or low illumination, and blurring significantly degrades the performance of edge extraction, as illustrated in Fig. 4 (the third and seventh cases). The lower of the two red lines in Fig. 4 shows that the edge count actually increases as the window size decreases. This unexpected result is explained as follows. Edge filters are usually small (3 × 3), and such small filters are adequate for detecting rapid intensity changes but not gradual ones. Edge pixels missed at the original scale are therefore detected in the scaled-down image, where the intensity change becomes sharper at the lower resolution. This observation motivated us to use a scaled-down image for edge extraction. Figure 1 shows two examples of edge extraction in blurred images. The pedestrians (marked by red circles in the input images) do not appear in the edge images of the second column, which were obtained at the original scale. However, the edge images extracted at half scale show the pedestrians clearly. In summary, the proposed edge filtering works as follows. First, the input image is contrast-enhanced by histogram equalization. The histogram-equalized image is then scaled down to a predefined scale. Next, edge pixels are extracted by thresholding the edge magnitude of Eqs. (1) and (2) with a predefined threshold T1. The counts of extracted edge pixels can be stored in a 2D array for fast lookup using the integral image technique proposed by Viola and Jones [1]. Finally, during sliding-window detection, we measure the edge density of each window using Eq. (3) and reject the window if its density is lower than a predefined threshold T2.

4. Parameter Determination

As described previously, our method has three adjustable parameters: the scale-down factor s, the threshold on edge magnitude (T1), and the threshold on edge density (T2). The best parameters should maximize the filtering rate of background windows while preserving the detection rate of
Fig. 7. A graph of the best (smallest) missing rate with respect to the change of image scale under fixed filtering rates of 0.2, 0.3, and 0.4.
target object. As the optimal parameter values for robust edge extraction vary depending on the image resolution, camera, degree of blurring, object size, etc., we estimated the parameters experimentally. The experimental dataset consists of 1,239 outdoor images, each containing at least one pedestrian. The images were gathered with a vehicle-mounted black-box camera (F500HD) and labeled with one of four categories (normal, rain, blur, night) according to the environmental condition at acquisition. Figure 5 shows four selected images, one from each dataset category. The image resolution is 640 × 360. We varied the image scale s from 0.1 to 1.0 with a step size of 0.1 and the threshold T1 from 100 to 200 with a step size of 10, giving 10 × 11 = 110 combinations in total. For each parameter combination we then measured the filtering rate (γ) and the missing rate (λ) while varying the edge density threshold T2, where the filtering rate is the rate of true rejections of background and the missing rate is the rate of false rejections of pedestrians. Figure 6 shows the best (smallest) missing rate with respect to the change of the edge magnitude threshold T1. We measured the missing
Table 2. Comparison of performance of pedestrian detection with/without the proposed edge filtering. Recall and precision are given as average (normal rain blur night).

                  HOG detector                   Our approach
  runtime         11.5 fps                       14.5 fps
  filtering rate  0%                             36.7%
  recall          0.665 (0.74 0.60 0.47 0.50)    0.653 (0.73 0.60 0.44 0.45)
  precision       0.793 (0.83 0.84 0.73 0.72)    0.799 (0.83 0.84 0.69 0.65)
Fig. 8. A graph of missing rate with respect to filter rate for four selected parameter settings (s = 0.3, T1 = 140; s = 0.5, T1 = 170; s = 1.0, T1 = 140; s = 1.0, T1 = 170).

Table 1. Number of test images and pedestrians contained for each image category.

                normal   rain   blur   night   total
  images           820    170    136     113   1,239
  pedestrians    1,552    270    272     200   2,294
rates with varying image scales and chose the best (smallest) missing rate at three points with filtering rates of 0.2, 0.3, and 0.4 for each value of T1. The graph in Fig. 6 shows that small values of T1 are preferable for minimizing the missing rate and large values for maximizing the filtering rate, although the performance of edge filtering is not sensitive to T1. The performance of edge filtering according to the change of image scale is shown in Fig. 7. The graph shows that the best performance is obtained around half scale, suggesting the use of a scaled-down image with a scale factor of 0.3 ∼ 0.5. Figure 8 shows the missing rate versus the filtering rate for four selected parameter settings. We can observe that the performance is greatly enhanced by using a scaled-down image with s = 0.3 ∼ 0.5 and is not sensitive to the selection of T1.

5. Experiment

In the experiment we apply the presented filtering method to a pedestrian detection task and compare the performance with and without edge filtering. We used the OpenCV [11] implementation of the HOG pedestrian detector [2]. The image dataset described in Sect. 4 was used as the test dataset. As described previously, it consists of 1,239 pedestrian images, each labeled with one of four categories, and contains 2,294 pedestrians in total. Table 1 shows the number of test images and the number of pedestrians for each category. We first applied the original HOG detector on our test dataset without edge filtering and measured runtime, filtering rate, recall, and precision, where the filtering rate is the rate of windows rejected by edge filtering, the recall is the ratio of detected pedestrians among all pedestrians in the test dataset,
and the precision is the ratio of true positives among all detections. We count a detection as a true positive only when its overlap with the ground truth exceeds 50 percent, where the overlap is defined as the ratio of intersection to union. Next, we repeated the test procedure with the HOG detector modified to apply our edge filtering method. The parameters of the HOG detector [11] were set to hitThreshold = 0 and groupThreshold = 2. Table 2 summarizes the experimental results. We can observe that our approach reduces the search space of pedestrian detection by 36.7 percent (the filtering rate), which speeds up detection from 11.5 fps to 14.5 fps on average. The recall decreases slightly (by 1.2 %) and the precision remains nearly the same.

6. Conclusion

In this paper we presented a scale-invariant edge density measure for speeding up sliding-window object detectors. We also proposed a new edge extraction method that extracts blurred edges as robustly as normal edges. Experimental results on an extensive set of pedestrian images confirm the effectiveness of the suggested method. We expect the suggested edge extraction method to be applicable to various vision applications based on edge detection.

References
[1] P. Viola and M. Jones, "Rapid object detection using a boosted cascade of simple features," in CVPR, 2001.
[2] N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection," in CVPR, CA, USA, 2005.
[3] S. Hinterstoisser, C. Cagniart, S. Ilic, P. Sturm, N. Navab, P. Fua, and V. Lepetit, "Gradient response maps for real-time detection of texture-less objects," TPAMI, 34(5), May 2012.
[4] Z. Kalal, K. Mikolajczyk, and J. Matas, "Tracking-learning-detection," TPAMI, 34(7), July 2012.
[5] J. H. Joung, M. S. Ryoo, S. Choi, J. Lee, and W. Yu, "Speed up method for sliding windows," in Proc. International Conference on Ubiquitous Robots and Ambient Intelligence (URAI), 2010.
[6] J. H. Joung, M. S. Ryoo, S. Choi, W. Yu, and H. Chae, "Background-aware pedestrian/vehicle detection system for driving environments," in IEEE Conference on Intelligent Transportation Systems (ITSC), Washington, D.C., October 2011.
[7] D. Lee and S. G. Lee, "Vision-based finger action recognition by angle detection and contour analysis," ETRI Journal, vol. 33, no. 3, June 2011.
[8] P. Sudowe and B. Leibe, "Efficient use of geometric constraints for sliding-window object detection in video," in ICVS, 2011.
[9] R. Benenson, M. Mathias, R. Timofte, and L. Van Gool, "Pedestrian detection at 100 frames per second," in CVPR, 2012.
[10] J. Kwon and K. M. Lee, "Visual tracking decomposition," in CVPR, 2010.
[11] http://opencv.willowgarage.com/wiki/