LOCAL PATTERNS CONSTRAINED IMAGE HISTOGRAMS FOR IMAGE RETRIEVAL

Zhi-Gang Fan, Jilin Li, Bo Wu, and Yadong Wu
Advanced R&D Center of Sharp Electronics (Shanghai) Corporation
1387 Zhangdong Road, Pudong, Shanghai 201203, China
[email protected]

ABSTRACT

In this paper, we present local patterns constrained image histograms (LPCIH) for efficient image retrieval. By combining local texture patterns with a global image histogram, LPCIH provides an effective feature representation built on a flexible image segmentation process. This representation is robust and invariant to several image transforms, such as rotation, scaling and damage. The LPCIH method handles several difficult retrieval tasks, such as rotated and damaged gray image retrieval, for which many traditional image retrieval methods are unsuitable, which makes it valuable for many real-world applications. Experimental results show that the LPCIH method is consistently efficient and effective, and that it offers advantages over state-of-the-art image retrieval methods.

Index Terms— Image Retrieval, Pattern Matching, Image Texture Analysis, Feature Extraction, LPCIH

1. INTRODUCTION

Content-based image retrieval has been an active research field in recent years. Interest in this field has been spurred by the need to efficiently manage and search large volumes of multimedia information [1], mostly due to the exponential growth of the World Wide Web (WWW). The images available on the WWW form the broadest collection of images, and managing a collection of that size is very difficult, so research on efficient retrieval methods is essential in this age of information explosion. Image retrieval is performed on feature representations that are extracted from the image during an analysis phase, so image feature descriptors are central to image retrieval. The most common categories of descriptors are based on color, texture and shape, and an efficient image retrieval system must be built on efficient image feature descriptors. Image retrieval methods also depend on the properties of the images being analyzed. These methods are usually distinct for different image domains, and gradually

978-1-4244-1764-3/08/$25.00 ©2008 IEEE


change when the focus moves from a narrow to a broad image domain. A narrow image domain has limited and predictable variability in all relevant aspects of its appearance. For example, face image retrieval has its own specialized algorithms, such as boosting, which are difficult to apply to general object retrieval. A broad image domain has unlimited and unpredictable variability in the content of its images, so image analysis and feature representation are very difficult there. In this paper, our focus is on image retrieval methods suitable for broad image domains. Low-level visual features such as color and texture are especially useful for image retrieval. The popular color features include global color histograms [2], color correlograms [3], color moments [4] and the MPEG-7 color descriptors [5]. The MPEG-7 texture descriptors [5] are based on Gabor filtering. The edge histogram [6], a block-based descriptor included in the MPEG-7 standard, and local binary patterns [7] have been used as texture descriptors. There are other usable features such as MR-SAR [8], Wold features [9] and the well-known Tamura approach [10]. CVPIC [11] and Border-Interior classification [12] are retrieval methods based on local textures. With SIFT and local descriptors [13], retrieval systems [14] at Google and global probabilistic models [15] have been developed. In several difficult situations, such as rotated and damaged gray image retrieval, the traditional features mentioned above (except SIFT) are ineffective because they are not rotation-invariant. The SIFT-based image retrieval systems [14][15] are rotation-invariant, but they are too slow for real-time computation in some environments, such as on personal computers. To solve these problems, we propose local patterns constrained image histograms (LPCIH).
LPCIH combines local texture patterns with a global image histogram to form a powerful image feature representation. The texture patterns are encoded locally, and the global image histogram is divided into several parts according to the local patterns, so both local and global information are captured by LPCIH. This feature representation is robust and invariant to several image transforms, such as rotation, scaling and damage. LPCIH is a compact and efficient image retrieval method suitable for broad image domains.

ICIP 2008

2. LOCAL PATTERNS CONSTRAINED IMAGE HISTOGRAMS

Local patterns constrained image histograms (LPCIH) are obtained through three processing steps: (1) image quantization; (2) local pattern analysis; (3) histogram construction.

At the image quantization step, color images are first converted to gray images. Then the classical Octree quantization algorithm is used to quantize the gray intensities to 32 scales. The node index at the top level of the Octree uses the most significant bits of the gray components, the next lower level uses the next bit of significance, and so on. If many more than the desired number of gray intensities are entered into the Octree, its size can be continually reduced by seeking out a bottom-level node and averaging its bit data up into a leaf node. Once sampling is complete, exploring all routes in the tree down to the leaf nodes, noting the bits along the way, yields the required number of gray intensities.

Second, after quantization, we analyze the local patterns of the image pixels with a texture operator similar to local binary patterns [7]. Our texture operator extends local binary patterns, since we have modified their basic form by defining an additional flat pattern. As illustrated in Figure 1, the operator assigns a label (pattern code) to every pixel of an image by comparing eight points in the 5×5-neighborhood of each pixel with the center pixel value (a 3×3-neighborhood is often used instead for computational efficiency).

Fig. 1. The local pattern operator uses the information of the 5×5-neighborhood of each image pixel.

If the eight neighbors are all equal to the center pixel value, we define the pattern as a flat pattern. Besides the flat pattern, we additionally use the patterns of LBP^{u2}_{8,2} [7]; any pattern that is not flat is a pattern of LBP^{u2}_{8,2}. LBP^{u2}_{8,2} defines the local neighborhood as a set of sampling points evenly spaced on a circle centered at the pixel to be labelled, with bilinear interpolation used when a sampling point does not fall at the center of an image pixel. The patterns of LBP^{u2}_{8,2} are produced by thresholding the 5×5-neighborhood of each pixel with the center value and treating the result as 8 binary numbers (0 and 1). The final pattern code P is produced by multiplying the 8 binary numbers B_i by weights given by powers of two and adding the results in a circular way, as shown in equation (1). As a result, every image pixel is assigned a label P of local pattern code. In LBP^{u2}_{8,2}, a local pattern is called uniform if it contains at most two bitwise transitions from 0 to 1 or vice versa when its 8 binary numbers are ordered in a circular way. All non-uniform patterns are labelled with one and the same label of pattern code in our texture operator.


P = Σ_{i=0}^{7} B_i · 2^i,   B_i ∈ {0, 1}        (1)
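Equation (1) together with the flat-pattern rule above can be sketched as follows. This is an illustrative sketch, not the paper's implementation: the `FLAT` label value and the ≥ thresholding convention are assumptions.

```python
FLAT = 256  # any label outside the 0..255 range of equation (1)

def pattern_code(center, neighbors):
    """Label one pixel from its 8 sampled neighbours.

    A neighbourhood whose eight samples all equal the centre gets the
    separate flat label; otherwise equation (1) builds the code
    P = sum_i B_i * 2**i with B_i = 1 when the sample is >= centre
    (thresholding convention assumed here).
    """
    if all(n == center for n in neighbors):
        return FLAT
    bits = [1 if n >= center else 0 for n in neighbors]  # the B_i
    return sum(b << i for i, b in enumerate(bits))
```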

For real-world applications, we make the local patterns rotation-invariant by merging them. This merging is performed by a circular shift operation. For LBP^{u2}_{8,2}, the 8 binary numbers of the thresholded neighborhood can be mapped onto an 8-bit word in clockwise or counterclockwise order around the center pixel. This 8-bit word is circularly shifted until it meets the following condition: the longest sub-sequence of 0s lies at the left end and a sub-sequence of 1s lies at the right end. As a result, different rotated versions of a local pattern are merged into the same pattern. After this merging, the local patterns are rotation-invariant because the edge orientation information has been removed while the texture information is retained. There are 11 rotation-invariant local patterns in total: one flat pattern, 9 uniform patterns and one merged non-uniform pattern. Each rotation-invariant local pattern has its own label (pattern code). When an image is processed by this local pattern operator, every pixel is assigned a label of pattern code. Our texture operator produces local patterns that are invariant to any monotonic transformation of the gray scale, and these patterns are quick to compute. This operator is a key component of the LPCIH retrieval algorithm. Finally, the histogram construction step consists of a flexible image segmentation process and a combination of gray scale histograms. In this step, we separate the input image into 11 sub-images so that the pixels of each sub-image have the same label of pattern code; this is a flexible, automatic image segmentation process. Then we construct a gray scale histogram for each sub-image, producing 11 gray scale histograms in total, one per sub-image.
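The merging step above can be sketched as below. Taking the minimum over all circular shifts is our equivalent reformulation of "longest run of 0s at the left end, 1s at the right end" for uniform patterns; the helper names are illustrative.

```python
def rotate_right(code, bits=8):
    # one circular shift of an 8-bit pattern word
    return ((code >> 1) | ((code & 1) << (bits - 1))) & ((1 << bits) - 1)

def canonical(code, bits=8):
    # Merge all rotations of a pattern by keeping the smallest one; for
    # uniform patterns this places the run of 1s at the low (right) end
    # and the longest run of 0s at the high (left) end.
    best = code
    for _ in range(bits - 1):
        code = rotate_right(code, bits)
        best = min(best, code)
    return best

def transitions(code, bits=8):
    # circular 0/1 transitions; uniform patterns have at most two
    return bin(code ^ rotate_right(code, bits)).count("1")
```

With these helpers, a uniform pattern with k ones always canonicalizes to 2^k − 1, giving the 9 uniform labels (k = 0..8); all non-uniform patterns share one label and the flat pattern supplies the eleventh.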
The resulting 11 gray scale histograms are combined simply by linking them end to end in a pre-defined sequence order, producing a single large histogram. This final histogram is called LPCIH, and it is the feature vector extracted for the input image. This feature representation differs from the spatially enhanced histogram (SEH) [7], which is used to encode the appearance of human face images. The SEH directly extracts local texture information from input images and is not rotation-invariant, whereas LPCIH combines local texture information with the global image histogram by fusing multiple sub-image histograms produced according to the local texture patterns of the image pixels. LPCIH is rotation-invariant and suitable for image retrieval because its flexible image segmentation process is more robust than the fixed grid processing of SEH. Because of its simplicity, LPCIH is also computationally efficient compared with the SIFT-based methods [14][15], which are far more complicated.
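The segmentation-and-concatenation step might be sketched like this, assuming per-pixel labels 0..10 and 32 quantized gray scales; the function name and the final normalization are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

N_LABELS, N_GRAY = 11, 32  # 11 pattern labels, 32 gray scales

def lpcih(gray, labels):
    """Build the 11 * 32 = 352-bin LPCIH feature.

    gray:   quantized image, values in 0..31
    labels: per-pixel rotation-invariant pattern labels in 0..10,
            same shape as `gray`
    """
    feat = np.zeros(N_LABELS * N_GRAY)
    for lab in range(N_LABELS):
        vals = gray[labels == lab]  # the sub-image for this pattern
        hist, _ = np.histogram(vals, bins=N_GRAY, range=(0, N_GRAY))
        feat[lab * N_GRAY:(lab + 1) * N_GRAY] = hist
    total = feat.sum()
    return feat / total if total else feat  # normalization: assumption
```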

[Figure 2: four plots comparing LPCIH, BIC, CVPIC and GIH (and their L1-distance variants L1 LPCIH, L1 BIC, L1 CVPIC, L1 GIH) as Precision vs. Recall and θ vs. Recall curves; x-axis: Recall, y-axes: Precision and θ.]

Fig. 2. The Precision vs. Recall and θ vs. Recall curves use the SCD or the L1 distance.

3. LPCIH IMAGE RETRIEVAL

After LPCIH feature extraction, the dissimilarities between the input image and the registered images must be computed for the retrieval task. There are many possible dissimilarity measures for image matching and retrieval. The chi-square statistic (χ²) can be used as an image histogram dissimilarity measure because of its good performance. The image histogram distance using the chi-square statistic is given in equation (2):

χ²(S, M) = Σ_i (S_i − M_i)² / (S_i + M_i)        (2)

Here S and M are the feature histograms to be compared, and the subscript i indexes the corresponding bin. This chi-square histogram distance can be simplified into the following form:

D(S, M) = Σ_i |S_i − M_i| / (S_i + M_i)        (3)

This simplification preserves the original ranking order of the χ² results and reduces the computational burden at the same time. We use D(S, M) (called SCD) as the image histogram distance metric for our retrieval method because it is more computationally efficient than, and as accurate as, other image histogram distances such as the L1 distance, the log-likelihood distance and histogram intersection.
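The two distances of equations (2) and (3) can be written down directly; a minimal sketch follows, in which the guard for empty bins (where S_i + M_i = 0) is our addition to avoid division by zero.

```python
import numpy as np

def chi_square(s, m):
    # equation (2): sum_i (S_i - M_i)^2 / (S_i + M_i)
    s, m = np.asarray(s, float), np.asarray(m, float)
    d = s + m
    ok = d > 0  # skip empty bins (our addition)
    return float((((s - m) ** 2)[ok] / d[ok]).sum())

def scd(s, m):
    # equation (3): sum_i |S_i - M_i| / (S_i + M_i)
    s, m = np.asarray(s, float), np.asarray(m, float)
    d = s + m
    ok = d > 0
    return float((np.abs(s - m)[ok] / d[ok]).sum())
```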

In our experiments, we adopted 9 different measures of retrieval effectiveness: two graphical measures (Precision vs. Recall and θ vs. Recall) and 7 single-value measures (p(10), r(10), p(20), r(20), p(30), r(30), and 11P-Precision). The θ vs. Recall curve is a variation of the Precision vs. Recall curve. θ is defined as the average of the precision values measured whenever a relevant image is retrieved. The main difference between θ and precision is that θ is cumulative: its computation considers not only the precision at a specific recall level but also the precision at all previous recall levels. This cumulative computation is more consistent with the ranking imposed by image retrieval methods. The measures p(10) and r(10) are the precision and recall after 10 images have been retrieved; p(20), r(20), p(30) and r(30) have analogous meanings. The 11P-Precision is computed by averaging the precisions taken at eleven predefined recall levels: 0%, 10%, ..., 90%, 100%. In Figure 2, we compare the LPCIH method with three existing image retrieval methods through Precision vs. Recall and θ vs. Recall curves using the SCD and L1 distances; for example, "GIH" denotes the GIH method using the SCD distance and "L1 GIH" the GIH method using the L1 distance. In Table 1, we evaluate the effectiveness of the LPCIH method with the 7 single-value measures. According to the results shown in Figure 2 and Table 1, the LPCIH method outperforms the other three methods.
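The single-value measures above can be sketched as follows. The interpolated form of 11P-Precision (taking the maximum precision at or beyond each recall level) is an assumption about the exact variant used; the function names are illustrative.

```python
def precision_recall_at(relevance, k, n_relevant):
    """relevance: booleans for the ranked results, best first.
    Returns (p(k), r(k)) after k images are retrieved."""
    hits = sum(relevance[:k])
    return hits / k, hits / n_relevant

def eleven_point_precision(relevance, n_relevant):
    # interpolated precision averaged at recall 0.0, 0.1, ..., 1.0
    pr, hits = [], 0
    for i, rel in enumerate(relevance, 1):
        hits += rel
        pr.append((hits / n_relevant, hits / i))  # (recall, precision)
    levels = [i / 10 for i in range(11)]
    return sum(max((p for r, p in pr if r >= lv), default=0.0)
               for lv in levels) / 11
```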

4. EXPERIMENTS

In our experiments, we adopted query-by-example as the way to submit queries to the retrieval system, and a 3×3-neighborhood was used. To evaluate the effectiveness of the proposed LPCIH method, we compared it with three other methods: the global image histogram (GIH), Border-Interior classification (BIC) [12], and the Block Edge Histograms of CVPIC [11] (called CVPIC in our experiments). For image distance measures, the SCD distance was compared with the L1 distance. We used the UCID database [16], which contains 1338 images (gray scaled in our experiments) together with a ground truth.
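Under the query-by-example protocol, retrieval reduces to ranking the registered images by their distance to the query feature. A minimal sketch, with illustrative function names and the equation (3) distance (plus our guard for empty bins):

```python
import numpy as np

def scd(s, m):
    # equation (3) distance, with a guard for empty bins
    s, m = np.asarray(s, float), np.asarray(m, float)
    d = s + m
    ok = d > 0
    return float((np.abs(s - m)[ok] / d[ok]).sum())

def query_by_example(query_feat, db_feats):
    # rank registered images by increasing SCD distance to the query
    dists = [scd(query_feat, f) for f in db_feats]
    return sorted(range(len(db_feats)), key=dists.__getitem__)
```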


Fig. 3. An example result of LPCIH retrieval in the experiment on rotated and damaged gray image retrieval.

Figure 3 shows an example result of LPCIH retrieval from our additional experiment on rotated and damaged gray image retrieval. It can be seen that the LPCIH method is effective

Table 1. Single-value effectiveness results for comparing image retrieval methods.

Methods    p(10)  r(10)  p(20)  r(20)  p(30)  r(30)  11P-Precision
LPCIH      0.147  0.529  0.105  0.462  0.079  0.439  0.215
L1 LPCIH   0.144  0.456  0.089  0.423  0.071  0.393  0.191
BIC        0.139  0.430  0.092  0.388  0.070  0.362  0.176
L1 BIC     0.102  0.355  0.067  0.332  0.051  0.318  0.122
CVPIC      0.065  0.178  0.053  0.166  0.044  0.159  0.064
L1 CVPIC   0.047  0.121  0.044  0.124  0.042  0.116  0.064
GIH        0.086  0.295  0.061  0.243  0.049  0.237  0.100
L1 GIH     0.082  0.302  0.052  0.295  0.045  0.254  0.096

for the task of rotated and damaged gray image retrieval.

[7] Timo Ahonen, Abdenour Hadid, and Matti Pietikainen, “Face description with local binary patterns: Application to face recognition,” IEEE Transactions on PAMI, vol. 28, pp. 2037–2041, 2006.

5. CONCLUSIONS

This paper presented the LPCIH method, an efficient retrieval method for broad image domains. By combining local texture patterns with a global image histogram, LPCIH achieves a flexible image segmentation process and an effective feature representation. In our experiments, the LPCIH method was consistently more efficient and more effective than the three other image retrieval methods. In particular, LPCIH handles rotated and damaged gray image retrieval efficiently, and it can be applied in real-time online applications for which SIFT-based image retrieval systems [14][15] are impractical.

[8] J. Mao and A. Jain, “Texture classification and segmentation using multiresolution simultaneous autoregressive models,” Pattern Recognition, vol. 25, pp. 173–188, 1992.

[9] J.M. Francos, A.Z. Meiri, and B. Porat, “On a Wold-like decomposition of 2-d discrete random fields,” in Proceedings of ICASSP, 1990, pp. 2695–2698.

6. REFERENCES

[10] H. Tamura, S. Mori, and T. Yamawaki, “Textural features corresponding to visual perception,” IEEE Transactions on Systems, Man, and Cybernetics, vol. 8, pp. 460–473, 1978.

[1] M.S. Lew, N. Sebe, C. Djeraba, and R. Jain, “Content-based multimedia information retrieval: State of the art and challenges,” ACM Transactions on Multimedia Computing, vol. 2, pp. 1–19, 2006.

[11] G. Schaefer, S. Lieutaud, and G. Qiu, “CVPIC image retrieval based on block colour co-occurrence matrix and pattern histogram,” in Proceedings of ICIP, 2004, pp. 413–416.

[2] M. Swain and D. Ballard, “Color indexing,” in Proceedings of ICCV. IEEE, 1990, pp. 11–32.

[12] R.O. Stehling, M.A. Nascimento, and A.X. Falcao, “A compact and efficient image retrieval approach based on border/interior pixel classification,” in Proceedings of CIKM, 2002, pp. 102–109.

[3] J. Huang, S.R. Kumar, M. Mitra, W.-J. Zhu, and R. Zabih, “Image indexing using color correlograms,” in Proceedings of IEEE CVPR, 1997, pp. 762–768. [4] M. Stricker and M. Orengo, “Similarity of color images,” in Proceedings of SPIE Conference on Storage and Retrieval for Image and Video Databases, 1995, pp. 381–392. [5] B.S. Manjunath, J.-R. Ohm, V. Vasudevan, and A. Yamada, “Color and texture descriptors,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 11, pp. 703–715, 2001. [6] S.J. Park, D.K. Park, and C.S. Won, “Core experiments on mpeg-7 edge histogram descriptor,” ISO/IEC JTC1/SC29/WG11-MPEG2000/M5984, 2000.


[13] Krystian Mikolajczyk and Cordelia Schmid, “A performance evaluation of local descriptors,” IEEE Transactions on PAMI, vol. 27, no. 10, pp. 1615–1630, 2005.

[14] Y. Jing and S. Baluja, “PageRank for product image search,” in Proceedings of WWW-2008, 2008, pp. 307–315.

[15] R. Fergus, L. Fei-Fei, P. Perona, and A. Zisserman, “Learning object categories from Google’s image search,” in Proceedings of ICCV, 2005.

[16] G. Schaefer and M. Stich, “UCID - an uncompressed colour image database,” in Proceedings of SPIE Storage and Retrieval Methods and Applications for Multimedia, 2004, pp. 472–480.
