(CLD) and Edge Histogram Descriptor (EHD)


Institute for Visualization and Interactive Systems
Intelligent Systems Group
Stuttgart University
Universitätsstraße 38
D – 70569 Stuttgart

Bachelor Thesis Proposal

Visualizing the Joint Representation of MPEG-7 Color Layout Descriptor (CLD) and Edge Histogram Descriptor (EHD) for CBIR Systems

Laila H. Shoukry

Major:

Digital Media Engineering & Technology (DMET)

Examiner:

Prof. Dr. Gunther Heidemann

Supervisor:

Dipl.-Inf. Sebastian Klenk

Contents

1 Background
2 Motivation And Related Work
3 CBIR Systems Using MPEG-7 Descriptors
3.1 PicSOM[14]
3.2 VizIR[18]
3.3 Eric7[9]
3.4 Caliph & Emir[17]
3.5 Mirror[24]
3.6 CBIR - Histogram[15]
3.7 CIREC[2]
3.8 Fedora[4]
4 The Color Layout Descriptor (CLD)[12]
5 The Edge Histogram Descriptor (EHD)[5]
6 Combining EHD and CLD
7 Implementation

1 Background

MPEG-7[1] has been an international standard since September 2001. It specifies metadata for describing multimedia content. The part of the standard relevant to our project defines the visual descriptors, which are structures for describing multimedia data; their exact extraction methods are not standardized. Figure 1 shows an overview of the MPEG-7 visual descriptors that are suitable for still images. Since 2001, much research has been conducted on using these standardized visual descriptors in Computer Vision, especially in Content-Based Image Retrieval (CBIR) systems (see [6, 23, 7, 13, 12, 3, 5]). “[..]Research did not only proceed along the lines of improved search algorithms, but also toward creating new features and similarity measures based on color, texture, and shape. One of the recent interesting additions to the set of features are from the MPEG-7 standard.” [16]

Figure 1: MPEG-7 Visual Descriptors For Low-level Features

2 Motivation And Related Work

In an effort to bridge the semantic gap in CBIR systems, visualizing features as artificial images was proposed in [10], where the authors stated that “the visualization of low level features is an open question of research”. This year, Johannes Imo, a VIS student at Stuttgart University, is finalizing his Diplom thesis entitled Interactive Feature Visualization for Image Retrieval. In [11], the main idea is “to make CBIR systems more ’transparent’ by visualization of the employed features.” His software visualizes the color histogram as a color feature and the co-occurrence matrix, from which most texture features are derived.

As his experimental results revealed the importance of such visualization, and now that more and more CBIR systems are using the MPEG-7 descriptors (see section 3), the next step would be to find a suitable visualization for the MPEG-7 features. We have also chosen one color feature and one texture feature: Color Layout and Edge Histogram, respectively. Both are very powerful features for CBIR systems, especially for sketch-based image retrieval. Further, combining color and texture features in CBIR systems leads to more accurate retrieval results. In both the Color Layout Descriptor (CLD) and the Edge Histogram Descriptor (EHD), image features such as the color and edge distribution can be localized in separate 4x4 sub-images (see the sections on the CLD and the EHD).

In [22], the visual features of each sub-image were characterized by the representative colors of the CLD as well as the edge histogram of the EHD at that sub-image, using weighting factors. The main idea was to “enable more content-oriented retrieval even without example images for the query”. The Query-By-Layout project allowed the user to directly input information about edges and colors using sliding bars for each sub-image separately. As a result, it created a visualization of the edge popularity of the EHD (indicated by percentages) and the luminance of the CLD (using grey levels) in all sub-images; see Figures 2 and 3. However, as most CBIR systems depend on query images and as most users of such CBIR systems are non-professionals, we hope to develop a more user-friendly visualization of a joint representation of EHD and CLD, one which an ordinary user can understand.

It is also interesting to note that in [8] other MPEG-7 descriptors were combined into one representation, namely the Scalable Color Descriptor (SCD) for color and the Homogeneous Texture Descriptor (HTD) for texture. However, no visualization of the joint representation was proposed. Also, in [21] all MPEG-7 visual descriptors were “fused” into one descriptor, but again with no visualization attempt.

Figure 2: Query By Layout (QBL) GUI Board (1)

Figure 3: Query By Layout (QBL) GUI Board (2)

3 CBIR Systems Using MPEG-7 Descriptors

In the field of Content-Based Image Retrieval, several systems have been developed which rely on the MPEG-7 visual descriptors, mainly due to the interoperability of the standard. An overview of such systems, ordered by their development date, follows.

3.1 PicSOM[14]

PicSOM (Picture Self-Organizing Map) is a neural-network-based image retrieval system which was developed in 2001. It uses self-organizing maps to retrieve relevant images from the database and uses the MPEG-7 visual descriptors for the image description space. The following subset of MPEG-7 visual descriptors was used:

• Scalable Color
• Dominant Color
• Color Structure
• Color Layout
• Edge Histogram
• Region-Based Shape

3.2 VizIR[18]

VizIR, a framework for Visual Information Retrieval, is a project which started in March 2003. It aims at implementing the MPEG-7 visual descriptors for content-based retrieval of images and video (CBVR).

3.3 Eric7[9]

Eric7 is a software tool developed in 2004 which allows automatic MPEG-7/XML encoding of up to 15 color, texture and shape descriptors.

3.4 Caliph & Emir[17]

Caliph (Common and Lightweight PHoto Annotation) and Emir (Experimental Metadata-based Image Retrieval) are two programs developed in 2005 as parts of the same project. Among the features they have in common is the ability to extract the following MPEG-7 features from images provided by users:

• Dominant Color
• Scalable Color
• Color Layout
• Edge Histogram

In addition, they visualize the Color Layout feature. For the other three features, they just present the raw data to the user (as a vector or matrix, respectively); see Figures 4 and 5.

Figure 4: Snapshot of MPEG-7 descriptors extracted by Caliph from an input image

Figure 5: Snapshot of a retrieval result by Emir

3.5 Mirror[24]

Mirror (MPEG-7 Image Retrieval Refinement based On Relevance feedback) was developed in 2005 to evaluate MPEG-7 visual descriptors in retrieval algorithms. The user can choose one of the following MPEG-7 descriptors for the image search:

• Dominant Color
• Scalable Color
• Color Structure
• Color Layout
• Homogeneous Texture
• Edge Histogram

3.6 CBIR - Histogram[15]

CBIR-Histogram is a CBIR program developed in 2006 as a student project. It uses the Color Structure Descriptor of the MPEG-7 standard.

3.7 CIREC[2]

CIREC (Cluster Correlogram Image Retrieval and Categorization), developed in 2007, uses MPEG-7 low-level features to represent image partitions obtained by clustering. It uses the following MPEG-7 color and texture descriptors:

• Scalable Color
• Color Structure
• Color Layout
• Edge Histogram

3.8 Fedora[4]

At the last Fedora Commons conference in April 2008, the Fedora team also announced the integration of Content-Based Image Retrieval built on the MPEG-7 standard into their system.

4 The Color Layout Descriptor (CLD)[12]

The CLD represents the spatial distribution of colors in an image. The extraction process of the CLD consists of the following four stages (see Figure 6):

• The image array is partitioned into an 8x8 grid of 64 blocks.
• A representative color is selected for each block and expressed in the YCbCr color space.
• Each of the three components (Y, Cb and Cr) is transformed by an 8x8 DCT (Discrete Cosine Transform).
• The resulting three sets of DCT coefficients are zigzag-scanned, and the first few coefficients of each are nonlinearly quantized to form the descriptor.

Figure 6: The extraction process of the CLD [19]

The CLD is thus a very compact representation of the color layout and allows for very fast searches in databases.
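The four stages can be sketched in Java roughly as follows. This is a minimal illustration under stated assumptions, not the normative MPEG-7 extraction: the class and method names are hypothetical, the block average is used as the representative color, the RGB to YCbCr conversion uses JFIF-style full-range coefficients, the DCT uses orthonormal scaling, and the final nonlinear quantization is omitted because its tables are not given here. The numbers of kept coefficients (6 for Y, 3 each for Cb and Cr) are likewise only an example choice.

```java
import java.util.Arrays;

/** Illustrative sketch of the CLD stages; names and scaling are assumptions. */
final class CldSketch {
    static final int GRID = 8; // 8x8 grid of blocks

    /** Stages 1-2: average each grid cell, expressed in YCbCr. Returns [channel][row][col]. */
    static double[][][] representativeColors(int[][] rgb) {
        int h = rgb.length, w = rgb[0].length;
        if (h < GRID || w < GRID) throw new IllegalArgumentException("image must be at least 8x8");
        double[][][] ycc = new double[3][GRID][GRID];
        int[][] count = new int[GRID][GRID];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int by = y * GRID / h, bx = x * GRID / w;
                int p = rgb[y][x]; // packed 0xRRGGBB
                int r = (p >> 16) & 0xFF, g = (p >> 8) & 0xFF, b = p & 0xFF;
                // Assumed JFIF-style full-range RGB -> YCbCr conversion.
                ycc[0][by][bx] += 0.299 * r + 0.587 * g + 0.114 * b;
                ycc[1][by][bx] += 128 - 0.168736 * r - 0.331264 * g + 0.5 * b;
                ycc[2][by][bx] += 128 + 0.5 * r - 0.418688 * g - 0.081312 * b;
                count[by][bx]++;
            }
        }
        for (int c = 0; c < 3; c++)
            for (int by = 0; by < GRID; by++)
                for (int bx = 0; bx < GRID; bx++)
                    ycc[c][by][bx] /= count[by][bx];
        return ycc;
    }

    /** Stage 3: 8x8 DCT-II with orthonormal scaling (an assumption). */
    static double[][] dct(double[][] in) {
        double[][] out = new double[GRID][GRID];
        for (int v = 0; v < GRID; v++) {
            for (int u = 0; u < GRID; u++) {
                double s = 0;
                for (int y = 0; y < GRID; y++)
                    for (int x = 0; x < GRID; x++)
                        s += in[y][x]
                           * Math.cos((2 * x + 1) * u * Math.PI / (2 * GRID))
                           * Math.cos((2 * y + 1) * v * Math.PI / (2 * GRID));
                double cu = (u == 0) ? Math.sqrt(1.0 / GRID) : Math.sqrt(2.0 / GRID);
                double cv = (v == 0) ? Math.sqrt(1.0 / GRID) : Math.sqrt(2.0 / GRID);
                out[v][u] = cu * cv * s;
            }
        }
        return out;
    }

    /** Stage 4 (without quantization): first n coefficients in zigzag order. */
    static double[] zigzagPrefix(double[][] c, int n) {
        double[] out = new double[n];
        int k = 0;
        for (int s = 0; s <= 2 * (GRID - 1) && k < n; s++) { // s = row + col
            int lo = Math.max(0, s - (GRID - 1)), hi = Math.min(s, GRID - 1);
            for (int i = 0; i <= hi - lo && k < n; i++) {
                int row = (s % 2 == 0) ? hi - i : lo + i; // direction alternates per diagonal
                out[k++] = c[row][s - row];
            }
        }
        return out;
    }

    /** Returns the unquantized Y, Cb and Cr coefficient prefixes. */
    static double[][] extract(int[][] rgb, int nY, int nC) {
        double[][][] ycc = representativeColors(rgb);
        return new double[][] {
            zigzagPrefix(dct(ycc[0]), nY),
            zigzagPrefix(dct(ycc[1]), nC),
            zigzagPrefix(dct(ycc[2]), nC)
        };
    }

    public static void main(String[] args) {
        int[][] img = new int[32][32];
        for (int y = 0; y < 32; y++)
            for (int x = 0; x < 32; x++)
                img[y][x] = (x < 16) ? 0xFF0000 : 0x0000FF; // left half red, right half blue
        System.out.println(Arrays.toString(extract(img, 6, 3)[0]));
    }
}
```

Because only a handful of low-frequency coefficients per channel are kept, the resulting vector stays small regardless of the image resolution, which is what makes the descriptor compact.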

5 The Edge Histogram Descriptor (EHD)[5]

The EHD represents the spatial distribution of edges in an image. The extraction process of the EHD consists of the following stages:

• The image array is divided into 4x4 sub-images.
• Each sub-image is further partitioned into non-overlapping square image-blocks whose size depends on the resolution of the input image.
• Each image-block is classified as one of five edge types (vertical, horizontal, 45° diagonal, 135° diagonal or non-directional; see Figure 7) or as containing no edge.
• A 5-bin edge histogram is then obtained for each sub-image, giving 16 x 5 = 80 bins in total (see Figure 8).
• Each bin value is normalized by the total number of image-blocks in the sub-image.
• The normalized bin values are nonlinearly quantized.
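The stages above can be sketched in Java as shown below. This is a hedged illustration, not the normative procedure: the class and method names are hypothetical, each image-block is reduced to the mean grey levels of its four quadrants, and the five 2x2 filter masks and the idea of an edge-strength threshold are the ones commonly described for the EHD and should be treated as assumptions here. The block size and threshold are left as parameters, and the final nonlinear quantization is omitted.

```java
import java.util.Arrays;

/** Illustrative sketch of EHD extraction; filter masks and names are assumptions. */
final class EhdSketch {
    static final int SUB = 4;   // 4x4 sub-images
    static final int BINS = 5;  // five edge types per sub-image
    static final double R2 = Math.sqrt(2.0);

    // Masks applied to the quadrant means (top-left, top-right, bottom-left, bottom-right).
    static final double[][] FILTERS = {
        { 1, -1,  1, -1 },   // vertical edge
        { 1,  1, -1, -1 },   // horizontal edge
        { R2, 0,  0, -R2 },  // 45 degree diagonal edge
        { 0, R2, -R2, 0 },   // 135 degree diagonal edge
        { 2, -2, -2,  2 }    // non-directional edge
    };

    /** Returns the edge type 0..4 of one image-block, or -1 for "no edge". Size must be even. */
    static int classifyBlock(double[][] grey, int y0, int x0, int size, double threshold) {
        int half = size / 2;
        double[] a = new double[4];
        for (int y = 0; y < size; y++)
            for (int x = 0; x < size; x++)
                a[(y / half) * 2 + (x / half)] += grey[y0 + y][x0 + x];
        for (int k = 0; k < 4; k++) a[k] /= (double) (half * half);

        int best = -1;
        double bestMag = threshold; // responses at or below the threshold count as no edge
        for (int t = 0; t < BINS; t++) {
            double m = 0;
            for (int k = 0; k < 4; k++) m += a[k] * FILTERS[t][k];
            m = Math.abs(m);
            if (m > bestMag) { bestMag = m; best = t; }
        }
        return best;
    }

    /** Returns the 80 normalized (unquantized) bins, sub-images in row-major order. */
    static double[] histogram(double[][] grey, int blockSize, double threshold) {
        int subH = grey.length / SUB, subW = grey[0].length / SUB;
        double[] hist = new double[SUB * SUB * BINS];
        for (int sy = 0; sy < SUB; sy++) {
            for (int sx = 0; sx < SUB; sx++) {
                int base = (sy * SUB + sx) * BINS;
                int blocks = 0;
                for (int y = sy * subH; y + blockSize <= (sy + 1) * subH; y += blockSize) {
                    for (int x = sx * subW; x + blockSize <= (sx + 1) * subW; x += blockSize) {
                        int t = classifyBlock(grey, y, x, blockSize, threshold);
                        if (t >= 0) hist[base + t]++;
                        blocks++; // no-edge blocks still count toward the normalization
                    }
                }
                if (blocks > 0)
                    for (int b = 0; b < BINS; b++) hist[base + b] /= blocks;
            }
        }
        return hist;
    }

    public static void main(String[] args) {
        double[][] stripes = new double[16][16];
        for (int y = 0; y < 16; y++)
            for (int x = 0; x < 16; x++)
                stripes[y][x] = (x % 2 == 0) ? 0 : 255; // vertical stripes
        System.out.println(Arrays.toString(histogram(stripes, 2, 11.0)));
    }
}
```

Since no-edge blocks are counted in the normalization but have no bin of their own, the five bins of a sub-image sum to less than one whenever it contains smooth regions.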

Figure 7: Five types of edges [5]

Figure 8: 1-D array of 80 bins of EHD[5]

6 Combining EHD and CLD

“[...]Since the EHD and the IDCT coefficients of the CLD are based on 4x4 and 8x8 grids, respectively, there exists a spatial one-to-one correspondence between the sub-image of the EHD and 2x2 IDCT values of the CLD. That is, as shown in Figure 18, the adjacent 2x2 IDCT values correspond to one sub-image of the EHD. So, the visual features of each subimage can be characterized by the 2x2 representative colors of the IDCT of the CLD as well as the edge histogram of the EHD at that sub-image.[...]” [22]

For further details, please refer to [22].
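The spatial correspondence described in the quotation amounts to a simple index mapping, sketched below in Java. The class and method names are hypothetical, and the sketch assumes that the 16 EHD sub-images are stored in row-major order with five bins each and that the CLD has already been turned back into an 8x8 grid of values (for one channel) by an inverse DCT.

```java
import java.util.Arrays;

/** Illustrative sketch of the EHD/CLD correspondence; names and layout are assumptions. */
final class JointLayoutSketch {
    static final int EHD_GRID = 4; // 4x4 EHD sub-images
    static final int BINS = 5;     // bins per sub-image

    /** Returns the {row, col} positions of the 2x2 CLD grid cells covered by one EHD sub-image. */
    static int[][] cldCellsFor(int subImage) {
        int sy = subImage / EHD_GRID, sx = subImage % EHD_GRID;
        return new int[][] {
            { 2 * sy,     2 * sx }, { 2 * sy,     2 * sx + 1 },
            { 2 * sy + 1, 2 * sx }, { 2 * sy + 1, 2 * sx + 1 }
        };
    }

    /** Joint feature of one sub-image: its 5 EHD bins followed by its 2x2 CLD values. */
    static double[] jointFeature(double[] ehd80, double[][] cldGrid8x8, int subImage) {
        double[] out = new double[BINS + 4];
        System.arraycopy(ehd80, subImage * BINS, out, 0, BINS);
        int[][] cells = cldCellsFor(subImage);
        for (int i = 0; i < cells.length; i++)
            out[BINS + i] = cldGrid8x8[cells[i][0]][cells[i][1]];
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.deepToString(cldCellsFor(5)));
    }
}
```

A visualization could iterate over the 16 sub-images and draw, for each, the four CLD values as a 2x2 color patch with the five edge bins overlaid; how exactly to render them is the open design question of this proposal.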

Figure 9: Combining the EHD and the IDCT values of the CLD[22]

7 Implementation

We plan to use the Java programming language for both the engine and the graphical user interface (GUI). One of the available image processing libraries for Java will be used.

List of Figures

1 MPEG-7 Visual Descriptors For Low-level Features
2 Query By Layout (QBL) GUI Board (1)
3 Query By Layout (QBL) GUI Board (2)
4 Snapshot of MPEG-7 descriptors extracted by Caliph from an input image
5 Snapshot of a retrieval result by Emir
6 The extraction process of the CLD [19]
7 Five types of edges [5]
8 1-D array of 80 bins of EHD[5]
9 Combining the EHD and the IDCT values of the CLD[22]

References

[1] MPEG-7: Overview of MPEG-7 description tools, part 2. IEEE MultiMedia, 09(3):83–93, 2002.

[2] A. Abdullah and M.A. Wiering. CIREC: Cluster correlogram image retrieval and categorization using MPEG-7 descriptors. Computational Intelligence in Image and Signal Processing, 2007. CIISP 2007. IEEE Symposium on, pages 431–437, 1-5 April 2007. http://ieeexplore.ieee.org/iel5/4221378/4221379/04221458.pdf?tp=&isnumber=&arnumber=4221458.

[3] Ka Man Wong. Content based image retrieval using MPEG-7 dominant color descriptor. 2004. http://dspace.cityu.edu.hk/bitstream/2031/4479/1/fulltext.html.

[4] Pierre-Yves Burgi. Content-based image retrieval integrated into Fedora. 2008. http://pubs.or08.ecs.soton.ac.uk/113/.

[5] Chee Sun Won, Dong Kwon Park, and Soo-Jun Park. Efficient use of MPEG-7 edge histogram descriptor. ETRI Journal, 24(2), 2002. http://etrij.etri.re.kr/Cyber/servlet/GetFile?fileid=SPF-1041924741673.

[6] Ritendra Datta, Jia Li, and James Z. Wang. Content-based image retrieval: approaches and trends of the new age. pages 253–262, 2005. http://www.infolab.stanford.edu/~wangz/project/imsearch/review/ACM05/datta.pdf.

[7] D. Djordjevic and E. Izquierdo. An object- and user-driven system for semantic-based image annotation and retrieval. Circuits and Systems for Video Technology, IEEE Transactions on, 17(3):313–323, March 2007. http://ieeexplore.ieee.org/iel5/76/4118229/04118236.pdf?tp=&arnumber=4118236&isnumber=4118229.

[8] Ramprasath Dorairaj and K.R. Namuduri. Compact combination of MPEG-7 color and texture descriptors for image retrieval. Signals, Systems and Computers, 2004. Conference Record of the Thirty-Eighth Asilomar Conference on, 1:387–391, 7-10 Nov. 2004. http://ieeexplore.ieee.org/iel5/9626/30419/01399159.pdf?tp=&isnumber=&arnumber=1399159.

[9] L. Gagnon, S. Foucher, and V. Gouaillier. Eric7: an experimental tool for content-based image encoding and retrieval under the MPEG-7 standard. pages 1–6, 2004. http://portal.acm.org/ft_gateway.cfm?id=984782&type=pdf&coll=ACM&dl=ACM&CFID=64078275&CFTOKEN=62340691.

[10] Gunther Heidemann and Sebastian Klenk. Visual analytics for image retrieval. In Ralf Mikut and Markus Reischl, editors, Proceedings 17. Workshop Computational Intelligence. Universitätsverlag Karlsruhe, 2007.

[11] Johannes Imo, Sebastian Klenk, and Gunther Heidemann. Interactive feature visualization for image retrieval. 2008.

[12] Eiji Kasutani and Akio Yamada. The MPEG-7 color layout descriptor: A compact image feature description for high-speed image/video segment retrieval. Image Processing, 2001. Proceedings. 2001 International Conference on, 2001. http://ieeexplore.ieee.org/iel5/7594/20726/00959135.pdf?tp=&isnumber=&arnumber=959135.

[13] T. Kaczmarzyk and W. Pedrycz. Content-based image retrieval: an application of MPEG-7 standard and fuzzy c-means. Fuzzy Information Processing Society, 2006. NAFIPS 2006. Annual meeting of the North American, pages 172–177, 3-6 June 2006. http://ieeexplore.ieee.org/iel5/4216755/4095341/04216796.pdf?tp=&arnumber=4216796&isnumber=4095341.

[14] J. Laaksonen, M. Koskela, and E. Oja. PicSOM: self-organizing image retrieval with MPEG-7 content descriptors. Neural Networks, IEEE Transactions on, 13(4):841–853, Jul 2002. http://ieeexplore.ieee.org/iel5/72/21990/01021885.pdf?tp=&arnumber=1021885&isnumber=21990.

[15] Joachim Leicht. Content based image retrieval mittels Farbfeatures. 2006. http://www.home.joachimleicht.de/content/document/studienarbeit/Studienarbeit-Ausarbeitung.pdf.

[16] Michael S. Lew, Nicu Sebe, Chabane Djeraba, and Ramesh Jain. Content-based multimedia information retrieval: State of the art and challenges. ACM Trans. Multimedia Comput. Commun. Appl., 2:1–19, 2006. http://www.liacs.nl/home/mlew/mir.survey16b.pdf.

[17] Mathias Lux. Caliph and Emir: Semantic annotation and retrieval of digital photos, 2003. http://www.semanticmetadata.net/features.

[18] Institute of Software Technology and Interactive Systems. VizIR, a framework for visual information retrieval. 2003. http://cbvr.ims.tuwien.ac.at/.

[19] Philippe Salembier and Thomas Sikora. Introduction to MPEG-7: Multimedia Content Description Interface. John Wiley & Sons, Inc., New York, NY, USA, 2002.

[20] Wenbin Shao. Automatic annotation of digital photos. pages 31–51, 2007. http://www.library.uow.edu.au/adt-NWU/uploads/approved/adt-NWU20080403.120857/public/02Whole.pdf.

[21] Evaggelos Spyrou, Hervé Le Borgne, Theofilos Mailis, Eddie Cooke, Yannis Avrithis, and Noel O'Connor. Fusing MPEG-7 visual descriptors for image classification. Artificial Neural Networks: Formal Models and Their Applications - ICANN 2005, pages 847–852, 2005. http://www.acemedia.org/aceMedia/files/document/wp7/2005/icann05-iti.pdf.

[22] Sung Min Kim, Soo Jun Park, and Chee Sun Won. Image retrieval via query-by-layout using MPEG-7 visual descriptors. ETRI Journal, 29, 2007.

[23] Shankar Vembu, Malte Kiesel, Michael Sintek, and Stephan Baumann. Towards bridging the semantic gap in multimedia annotation and retrieval. 2006. http://www.image.ntua.gr/swamm2006/resources/paper18.pdf.

[24] Ka-Man Wong, Kwok-Wai Cheung, and Lai-Man Po. Mirror: an interactive content based image retrieval system. Circuits and Systems, 2005. ISCAS 2005. IEEE International Symposium on, 2:1541–1544, 23-26 May 2005. http://ieeexplore.ieee.org/iel5/9898/31469/01464894.pdf?tp=&arnumber=1464894&isnumber=31469.

[25] Q. Zhang and E. Izquierdo. A multi-feature optimization approach to object-based image classification. CIVR, 5th International Conference on Image and Video Retrieval, 2006. http://www.acemedia.org/aceMedia/files/document/civr06-qmul.pdf.
