IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL.XX, NO. XX, XXXX, TIP-02358-2006.R2


Objective Distortion Measure for Binary Text Image Based on Edge Line Segment Similarity

Jun Cheng and Alex C. Kot, Fellow, IEEE

Abstract—This paper proposes a new approach to measure the distortion introduced by changing individual edge pixels in binary text images. The approach considers not only how many pixels are changed but also where the pixels are changed and how the flipping affects the overall shape formed by the edge line. Similarities between the edge line segments in the original and distorted image are compared to measure the distortion. Subjective testing shows that the new distortion measure correlates well with human visual perception.

Index Terms—Binary text image, distortion measure.

I. INTRODUCTION

Distortion measure is an important topic in image processing, including data hiding. An important application of the distortion measure in data hiding is to evaluate the performance of data hiding algorithms as well as to provide insights into data hiding. Distortion measures can be categorized into subjective and objective measurements [1]. As discussed in [2], a subjective distortion measure quantifies the dissatisfaction of the viewer in observing the distorted image in place of the original. A common way to evaluate this dissatisfaction is through subjective testing, in which observers view a series of distorted images and rate them based on the visibility of the artifacts. The results of subjective testing depend on various factors such as the observer's background. Subjective measurement is important for image quality evaluation since images are ultimately viewed by human beings. However, subjective testing is inconvenient, time consuming and expensive. An objective distortion measure, such as the mean-square error (MSE), peak signal-to-noise ratio (PSNR) or signal-to-noise ratio (SNR), expresses the distortion between the original and the distorted image mathematically; however, it may not reflect the observer's visual perception of the distortion. An objective distortion measure that accurately reflects subjective ratings would be quite useful in designing data hiding algorithms. Several authors have discussed the gaps between subjective and objective measurement and proposed solutions for multi-level images [2], [3], [4]. Little work has been done on objective measurement for binary text images.

Manuscript received April 13, 2006; revised January 31, 2007. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Zhigang (Zeke) Fan. Jun Cheng and Alex C. Kot are with the School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 639798 (e-mail: [email protected]; [email protected]). Digital Object Identifier

In recent years, data hiding in binary text images has received much attention, and a series of data hiding schemes for binary images that flip individually selected edge pixels have been proposed [5]-[10]. These schemes are called pixel flipping techniques. One of the important issues in pixel flipping is how to select the flippable pixels so as to minimize distortion. Comparing the distortion introduced by different schemes is also important. However, performance evaluation is difficult due to the lack of an objective distortion measure that reflects subjective test results; a good objective distortion measure is therefore needed. The distance reciprocal distortion measure (DRDM) proposed in [11] uses a weight matrix to calculate the distortion caused by flipping pixels. It shows better correlation with human perception than PSNR, and it works well when salt-and-pepper noise is involved: for example, flipping a non-edge pixel usually causes a larger visual distortion than flipping an edge pixel. However, most good data hiding techniques for binary images flip edge pixels only. If connectivity is also taken into consideration, the measured distortion would be more accurate. For example, flipping the center pixel of either pattern in Fig. 1 gives the same distortion score according to DRDM, but the visual distortions caused by the two flips are quite different. The change in smoothness and connectivity measure (CSCM) [12], which uses a modified weight matrix, gives reasonable distortion scores; however, it has limitations for patterns such as sharp corners and pixels along a straight line. In this paper, we propose a new objective distortion measure based on edge line segments to evaluate the distortion introduced by flipping individually selected edge pixels in binary text images. The proposed method is introduced in Section II, followed by experimental results and a conclusion.

Fig. 1. Two 3 × 3 pixel patterns with the same DRDM distortion score [11].

II. PROPOSED METHOD

We define an edge line as the common sharing 'line' between two neighboring pixels whose values differ. An edge pixel refers to a pixel (either white or black) on the edge. For example, the edge lines in Fig. 2(a) are the black lines shown in Fig. 2(b). When an edge line changes its direction by 90 degrees, it forms a sharp 'corner'. There are eight different types of corners, shown in Fig. 3, which make up all possible corners in a 2 × 2 pixel block.


Fig. 2. An illustration of edges: (a) enlarged 'a'; (b) edge lines of 'a'.

Fig. 3. Eight different types of corners.

Fig. 4. Two types of crosses.

Fig. 5. (a) Enlarged 's'; (b) edge line segments associated with center pixel A; (c) edge line segments associated with center pixel C; (d) edge line segments associated with center pixel B.

For a 2 × 2 block, if the two pixels in each row and each column have different colors, as shown in Fig. 4, the four pixels form a cross at the center point of the block. Corners and crosses divide the edge line into edge line segments: each edge line segment starts from one corner or cross and ends at another, with no corner or cross in between, and these corners or crosses define the two ends of the segment. An edge line segment is associated with a pixel if flipping that pixel changes the segment. From our observation, the edge line segments associated with an individually flipped pixel play an important role in determining the visual distortion caused by the flip. For example, in Fig. 5(a), flipping pixel A causes larger distortion than flipping pixel B, and flipping pixel B causes larger distortion than flipping pixel C. Table I shows the numbers of associated edge line segments before and after flipping these pixels, together with the corresponding subjective distortion.

TABLE I
NUMBERS OF EDGE LINE SEGMENTS ASSOCIATED WITH PIXELS A, B AND C, AND SUBJECTIVE DISTORTION

Pixel  Before flipping    After flipping             Changes  Subjective distortion
A      One (1)            Five (1', 2', 3', 4', 5')  Four     High
B      Four (5, 6, 7, 8)  Two (9', 10')              Two      Medium
C      Three (2, 3, 4)    Three (6', 7', 8')         Zero     Low
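To make the corner and cross definitions concrete, the following minimal sketch (our illustration, not code from the paper) classifies a 2 × 2 block at its center point; pixel values are assumed to be 0 or 1:

```python
def classify_2x2(tl, tr, bl, br):
    """Classify a 2x2 pixel block (values 0 or 1) at its center point.

    'corner': exactly one pixel differs from the other three, so the
              edge line turns 90 degrees (8 such patterns, as in Fig. 3).
    'cross':  diagonal pixels match while 4-connected neighbors differ,
              i.e. a checkerboard (2 such patterns, as in Fig. 4).
    """
    s = tl + tr + bl + br
    if s in (1, 3):
        return "corner"
    if s == 2 and tl == br:  # implies tr == bl and tl != tr
        return "cross"
    return "none"  # uniform block, or a straight edge through the block

print(classify_2x2(1, 0, 0, 0))  # corner
print(classify_2x2(1, 0, 0, 1))  # cross
print(classify_2x2(1, 1, 0, 0))  # straight edge -> none
```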

The length of each edge line segment also plays an important role in the distortion measure. In Fig. 6, flipping D, E or F does not change the number of edge line segments, yet from human visual perception, flipping D appears more distorting than flipping E or F. In this case, the number of edge line segments remains the same but the lengths of some of the segments change.

Fig. 6. Enlarged 't'.

We use the edge line segment similarity to calculate the distortion between two edge line segments before and after flipping. For two edge line segments whose lengths are $l_1$ and $l_2$, we define their edge line segment similarity as

$$\mathrm{ELSS} = \frac{\min(l_1, l_2)}{\max(l_1, l_2)} \tag{1}$$

A larger ELSS corresponds to less distortion of the edge line segment pair, and we use $1 - \mathrm{ELSS}$ as the distortion for a given pair of edge line segments.

A. Identifying and Mapping of Edge Line Segments

Flipping an edge pixel may change the lengths of existing edge line segments, create new edge line segments, remove existing edge line segments, split an existing edge line segment into two, or merge two edge line segments into one. Identifying and mapping the associated edge line segments is needed in order to measure the distortion properly.
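As a minimal sketch (our illustration, not from the paper), the pairwise distortion $1 - \mathrm{ELSS}$ can be computed as follows; the handling of two zero-length segments is our assumption, since Eq. (1) is undefined there, while the paper's one-pixel-to-zero-length mapping (described below) falls out naturally:

```python
def elss(l1: int, l2: int) -> float:
    """Edge line segment similarity of Eq. (1) for segment lengths l1, l2."""
    if max(l1, l2) == 0:
        return 1.0  # two empty segments: identical, no distortion (assumption)
    return min(l1, l2) / max(l1, l2)

def pair_distortion(l1: int, l2: int) -> float:
    """Distortion 1 - ELSS for a mapped pair of edge line segments."""
    return 1.0 - elss(l1, l2)

# Example: a one-pixel segment removed by flipping (mapped to length 0)
# contributes 1 - 0/1 = 1.0; a segment shortened from 4 to 3 contributes 0.25.
print(pair_distortion(1, 0))  # 1.0
print(pair_distortion(4, 3))  # 0.25
```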

To identify and map the edge line segments before and after flipping a pixel, we consider the 3 × 3 pixel block centered at this pixel. Denote the center pixel of the block as $p_c$ and its neighboring pixels as $p_i$, $i = 0, 1, 2, \dots, 7$; we use the same symbols for the corresponding pixel values. Each pixel has four boundary lines, and two neighboring pixels share a common boundary line. Label the one-pixel-long common sharing line segment between $p_i$ and $p_{i+1}$ as $b_i$, $i = 0, 1, 2, \dots, 7$ (where $p_8$ refers to $p_0$), and the one-pixel-long common sharing line segment between $p_i$ and $p_c$ as $c_i$, $i = 0, 2, 4, 6$. Fig. 7 shows the pixels and the labels. A common sharing line is a part of an edge line segment if the values of the two pixels associated with it are different. For later use, we define $p_i \equiv p_{i-8}$, $b_i \equiv b_{i-8}$ and $c_i \equiv c_{i-8}$ for $i \geq 8$.


Fig. 7. Pixels and lines in a 3 × 3 block.

It can be seen that an edge line segment associated with the pixel $p_c$ contains at least one of the twelve common sharing lines. The four line segments $c_0$, $c_2$, $c_4$ and $c_6$ change status from edge line to non-edge line, or vice versa, when $p_c$ is flipped, and the corresponding edge line segments change accordingly. In what follows, we give the conditions associated with the changes in the edge line segments.

For $i = 0, 2, 4, 6$, the line segment $b_i$ is a part of an edge line segment associated with $p_c$ if and only if $p_i \neq p_{i+1}$ and $p_{i+1} = p_{i+2}$ [e.g., Fig. 5(c) when flipping pixel C]. For $i = 1, 3, 5, 7$, the line segment $b_i$ is a part of an edge line segment associated with $p_c$ if and only if $p_i \neq p_{i+1}$ and $p_{i-1} = p_i$ [e.g., Fig. 5(b) when flipping pixel A]. We define the value of $b_i$ as follows: $b_i = 1$ if $b_i$ is a part of an edge line segment associated with $p_c$, and $b_i = 0$ otherwise.

To see why the conditions hold for even $i$, suppose $b_i$ is a part of an edge line segment associated with $p_c$. If $p_i = p_{i+1}$, then $b_i$ is not a part of any edge line segment, which conflicts with the assumption. If $p_i \neq p_{i+1}$ but $p_{i+1} \neq p_{i+2}$, the four pixels $p_i$, $p_{i+1}$, $p_{i+2}$ and $p_c$ form either a corner or a cross, which is always an end of the edge line segment containing $b_i$; flipping $p_c$ then does not change this segment, which implies the segment is not associated with $p_c$. Thus we must have $p_i \neq p_{i+1}$ and $p_{i+1} = p_{i+2}$. A similar argument applies to the odd lines $b_i$, $i = 1, 3, 5, 7$.

These properties can be further demonstrated using Fig. 8. If $p_i \neq p_{i+1}$ and $p_{i+1} = p_{i+2}$, the common sharing line segment $c_{i+2}$ between $p_c$ and $p_{i+2}$ is added to or removed from the edge line segment containing $b_i$ when $p_c$ is flipped. In Fig. 8(a), $b_i$ is a part of an edge line segment; after flipping $p_c$, the new pattern shown in Fig. 8(b) is created and $c_{i+2}$ becomes a part of the edge line segment containing $b_i$. The edge line segment containing $b_i$ is therefore changed, i.e., $b_i$ is a part of an edge line segment associated with $p_c$.

Fig. 8. Relationship between $b_i$ and the associated edge line segment. The pixels marked 'grid' are don't-care pixels.

In the edge line segment mapping process, for every line segment $b_i$ with $b_i = 1$, $i = 0, 1, 2, \dots, 7$, we map the edge line segment containing $b_i$ in the original image to the edge line segment containing $b_i$ in the distorted image. The length of the edge line segment containing $b_i$ is computed by locating the two ends of the segment.

In Fig. 9(a), $c_i$ is a one-pixel-long edge line segment. After flipping $p_c$, $c_i$ is no longer an edge line segment, as shown in Fig. 9(b). In order to compute the edge line segment similarity based on length, we map this one-pixel-long edge line segment to a zero-length edge line segment after flipping $p_c$. Similarly, by flipping $p_c$ in Fig. 9(b), we map the zero-length edge line segment before flipping to the one-pixel-long edge line segment after flipping. The same rule applies to all similar patterns in which a one-pixel-long edge line segment is removed or created.

Fig. 9. Examples of removing or creating a one-pixel-long edge line segment. Flipping $p_c$ in (a) removes the segment; flipping $p_c$ in (b) creates one.

Consider the special case in Fig. 10, where flipping $p_c$ creates a new one-pixel-long edge line segment at $c_{i+4}$ by shifting the one-pixel-long edge line segment at $c_i$ by one pixel length. Shifting a one-pixel-long edge line segment from $c_i$ to $c_{i+4}$ causes minimal distortion that can be ignored under human visual perception [e.g., Fig. 5(c) when flipping pixel C, and Fig. 6 when flipping pixel D, E, or F], although the change actually involves creating one edge line segment and removing another. Such a scenario occurs if and only if $p_{i+1} = p_{i+2} = p_{i+3}$, $p_{i+5} = p_{i+6} = p_{i+7}$, $p_i \neq p_c$ and $p_i \neq p_{i+4}$. This implies that the two ends of the edge line segment at $c_i$ before flipping have the same corner types as the corresponding two ends of the edge line segment at $c_{i+4}$ after flipping.

Fig. 10. Shifted edge line segments.
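Before turning to the per-pixel score, the even/odd conditions above can be made concrete with a minimal sketch (our illustration, not code from the paper) that computes the values $b_i$ from the eight neighbor values; the clockwise circular ordering of $p_0, \dots, p_7$ is an assumption:

```python
def associated_edge_flags(neighbors):
    """Compute the flag values b_0..b_7 for a 3x3 block.

    neighbors: list of the eight neighbor values p_0..p_7 (0 or 1),
    indexed in a fixed circular order around the center pixel
    (assumed clockwise; indices wrap so that p_8 refers to p_0).
    """
    flags = []
    for i in range(8):
        p_i = neighbors[i]
        p_i1 = neighbors[(i + 1) % 8]
        p_i2 = neighbors[(i + 2) % 8]
        if i % 2 == 0:
            # even i: associated iff p_i != p_{i+1} and p_{i+1} == p_{i+2}
            ok = p_i != p_i1 and p_i1 == p_i2
        else:
            # odd i: associated iff p_i != p_{i+1} and p_{i-1} == p_i
            ok = p_i != p_i1 and neighbors[i - 1] == p_i
        flags.append(1 if ok else 0)
    return flags
```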

We calculate the distortion score for each flipped pixel by comparing the associated edge line segments in the original image with those in the distorted image. Our procedure for identifying and mapping the edge line segments associated with each flipped pixel is summarized below:

(i) Set $b_i$ equal to 1, for $i = 0, 1, 2, \dots, 7$, if $b_i$ is a part of an edge line segment associated with $p_c$; otherwise set it to 0.

(ii) For each $b_i$ with $b_i = 1$, locate the two ends of the edge line segment containing $b_i$, map the segments, and compute the lengths $l_1$ and $l_2$ of the edge line segment in the original and distorted images [e.g., edge line segments 2, 4, 5, 8 are mapped to 6', 8', 9', 10' in Fig. 5(c)(d), respectively; edge line segments 1' and 5' are both mapped to 1 in Fig. 5(b)].

(iii) Map all one-pixel-long edge line segments removed (or created) by flipping $p_c$ to zero-length edge line segments, giving $l_1 = 1$ and $l_2 = 0$ (or $l_1 = 0$ and $l_2 = 1$) [e.g., the created edge line segments 2', 3', 4', 7' in Fig. 5(b)(c) and the removed edge line segments 3, 6, 7 in Fig. 5(c)(d) are mapped to zero-length segments].

(iv) Discard all mappings for one-pixel-long edge line segments created or removed by shifting a one-pixel-long edge line segment by one pixel; the two ends of the one-pixel-long edge line segment $c_i$ before flipping must have the same corner types as the two ends of the new edge line segment $c_{i+4}$ [e.g., 3 and 7' in Fig. 5(c)].

B. Total Distortion Measure

When the edge line segments before and after flipping a pixel have been mapped properly, the distortion for each pair of edge line segments can be computed. The edge line distortion score $ELD_k$ for the $k$th flipped pixel is given by

$$ELD_k = \sum_i \left( 1 - \frac{\min(l_{1i}^{k}, l_{2i}^{k})}{\max(l_{1i}^{k}, l_{2i}^{k})} \right) \tag{2}$$

where $l_{1i}^{k}$ and $l_{2i}^{k}$ are the lengths of the $i$th pair of mapped edge line segments associated with the $k$th flipped pixel, with the shifted edge line segments excluded. When a total of $N$ pixels are flipped in a given image, the overall distortion is

$$ELD_{total} = \sum_{k=1}^{N} \sum_i \left( 1 - \frac{\min(l_{1i}^{k}, l_{2i}^{k})}{\max(l_{1i}^{k}, l_{2i}^{k})} \right) \tag{3}$$
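Under the same assumptions as the earlier sketches, Eqs. (2) and (3) reduce to a simple aggregation over the mapped length pairs; `pairs_per_pixel` below is a hypothetical input holding, for each flipped pixel, the list of mapped length pairs surviving steps (i)-(iv):

```python
def eld_for_pixel(pairs):
    """Eq. (2): pairs is a list of (l1, l2) mapped segment lengths for one
    flipped pixel, with shifted segments already excluded in step (iv)."""
    total = 0.0
    for l1, l2 in pairs:
        total += 1.0 - (min(l1, l2) / max(l1, l2) if max(l1, l2) else 1.0)
    return total

def eld_total(pairs_per_pixel):
    """Eq. (3): sum of per-pixel scores over all N flipped pixels."""
    return sum(eld_for_pixel(pairs) for pairs in pairs_per_pixel)

# Example: two flipped pixels; the first shortens a segment from 4 to 3 and
# removes a one-pixel segment (mapped to length 0), the second changes nothing.
print(eld_total([[(4, 3), (1, 0)], [(5, 5)]]))  # 0.25 + 1.0 + 0.0 = 1.25
```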

III. EXPERIMENTAL RESULTS

Experiments similar to the setup in [11] were carried out to test the performance of the proposed distortion measure using the two binary images shown in Fig. 11.

Fig. 11. The two original test images: (a) English text (207 × 94 pixels); (b) Chinese text (243 × 87 pixels).

The subjective tests were carried out using these two original document images. For each original image, we generated a number of random test images with different visual distortion by adding noise to the original image. As we were evaluating the distortion caused by flipping edge pixels, we only added noise along the edges. A total of 200 and 400 edge pixels were flipped in the original English text image and the original Chinese text image, respectively. A series of distorted images was generated by flipping different edge pixels. All the distorted images generated from the same document image were divided into four groups based on their $ELD_{total}$ values, with group 1 having the lowest $ELD_{total}$ and group 4 the highest. Four images, one from each of the four groups, were chosen randomly to form one set of English text images and one set of Chinese text images, as shown in Fig. 12 and Fig. 13. In both cases, the subjective assessment was done by 60 subjects. Each subject was given the original image and four sets of test images, printed on 80 gsm paper using an HP LaserJet 5000 printer. The subject was asked to rank the quality of the four images in each set according to the distortions that he or she perceived under normal viewing conditions. There are four ranking scores (1, 2, 3 and 4), with score 1 for the least distortion and 4 for the most distortion perceived.

The ranking scores are analyzed and compared with the ranking according to the distortion scores computed by the proposed measure. The results are shown in Table II. The PSNR is the same for every image generated from the same document, since the same number of pixels is flipped in each. The scores by DRDM, CSCM and the proposed method are all normalized by $N$ and shown in Table II, where a smaller score indicates less distortion. It can be seen that the proposed method correlates well with the subjective ranking by visual perception, while PSNR and DRDM do not. CSCM gives the same ranking as the proposed method, but it has limitations for some patterns [12]: flipping a pixel along a straight line is normally considered a large distortion, yet CSCM assigns it a low distortion score. Another limitation of CSCM, as well as DRDM, is that it cannot tell the difference between the pixels shown in Fig. 6, while the proposed method can. The proposed method uses the 3 × 3 block centered at each flipped pixel to identify the edge line segments associated with that pixel. It may need to extend beyond the 3 × 3 block to calculate the lengths of these associated edge line segments; the size of the extended area is determined automatically by the edge line segments themselves. Thus, the size of the area that determines the distortion score is flexible.


Fig. 14. Distribution of ranking scores in each group: (a) English text image; (b) Chinese text image. In each panel, the horizontal axis is the ranking score (1, 2, 3, 4) and the vertical axis is the number of occurrences.

Fig. 12. One set of images based on the English text image: (a) PSNR = 19.9, DRDM/N = 0.53, CSCM/N = 0.21, ELD_total/N = 0.70; (b) PSNR = 19.9, DRDM/N = 0.53, CSCM/N = 0.34, ELD_total/N = 1.68; (c) PSNR = 19.9, DRDM/N = 0.53, CSCM/N = 0.39, ELD_total/N = 2.52; (d) PSNR = 19.9, DRDM/N = 0.53, CSCM/N = 0.50, ELD_total/N = 3.50.

Fig. 13. One set of images based on the Chinese text image: (a) PSNR = 17.2, DRDM/N = 0.60, CSCM/N = 0.37, ELD_total/N = 0.70; (b) PSNR = 17.2, DRDM/N = 0.60, CSCM/N = 0.43, ELD_total/N = 1.59; (c) PSNR = 17.2, DRDM/N = 0.60, CSCM/N = 0.49, ELD_total/N = 2.75; (d) PSNR = 17.2, DRDM/N = 0.60, CSCM/N = 0.53, ELD_total/N = 3.55.

TABLE II
THE SUBJECTIVE TESTING RESULTS

          Group  PSNR  DRDM/N  CSCM/N  ELD_total/N  Subjective ranking
English   1      19.9  0.52    0.21    0.75         1.15
          2      19.9  0.54    0.33    1.63         2.03
          3      19.9  0.53    0.38    2.52         3.05
          4      19.9  0.52    0.49    3.50         3.77
Chinese   1      17.2  0.61    0.37    0.70         1.08
          2      17.2  0.59    0.43    1.54         2.06
          3      17.2  0.60    0.47    2.75         3.03
          4      17.2  0.57    0.51    3.55         3.83
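As a consistency check (our note, not from the paper), the PSNR values in Table II follow directly from the image sizes and flip counts, assuming unit peak signal and MSE equal to the fraction of flipped pixels:

```python
import math

def binary_psnr(width, height, flipped):
    """PSNR for a binary image with `flipped` changed pixels,
    assuming peak signal 1 and MSE = flipped / (width * height)."""
    return 10.0 * math.log10(width * height / flipped)

print(round(binary_psnr(207, 94, 200), 1))  # 19.9 (English set, 200 flips)
print(round(binary_psnr(243, 87, 400), 1))  # 17.2 (Chinese set, 400 flips)
```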

The distribution of the subjective ranking scores for each group is shown in Fig. 14. In this figure, the abscissa represents the four ranking scores (1, 2, 3 and 4), and the ordinate shows the number of occurrences of the corresponding ranking scores given by the two groups of 60 human subjects. Since each of the 60 subjects is given four sets of test images, there are 240 scores in total for each group. From this figure, we can see that the quality of each image is distinctively different.

IV. CONCLUSIONS

In this paper, we propose a new objective distortion measure for binary text images based on edge line segment similarity. The measure is built on how the overall shape of the edges is affected by flipping some of the pixels. Compared with DRDM, which works well when flipping non-edge pixels is involved, our approach emphasizes the importance of connectivity and is a good distortion measure for flipping edge pixels. The new measure considers an area of flexible size, unlike the fixed 3 × 3 or 5 × 5 square blocks often used in CSCM and DRDM. The distortion score of the proposed method is calculated from the similarity between the edge line segments in the original and distorted images. Our experimental results show strong correlation with subjective assessment. Besides using the measure to evaluate the performance of the pixel flipping algorithms in [5]-[10], we can use it to select the most flippable pixels when hiding information, provided that the selected pixels are at least one pixel apart from each other. The measure is suitable for evaluating distortion from individually flipped pixels only; it does not work when two 8-connected neighboring pixels are flipped simultaneously, or for images, such as halftone images, where edge lines are uncommon.

REFERENCES

[1] Y. Q. Shi and H. Sun, Image and Video Compression for Multimedia Engineering: Fundamentals, Algorithms, and Standards. Boca Raton, FL: CRC Press, 1999.
[2] S. A. Karunasekera and N. G. Kingsbury, "A distortion measure for blocking artifacts in images based on human visual sensitivity," IEEE Transactions on Image Processing, vol. 4, no. 6, pp. 713-724, Jun. 1995.
[3] N. Damera-Venkata, T. D. Kite, W. S. Geisler, B. L. Evans, and A. C. Bovik, "Image quality assessment based on a degradation model," IEEE Transactions on Image Processing, vol. 9, no. 4, pp. 636-650, Apr. 2000.
[4] S. Matsumoto and B. Liu, "Analytical fidelity measures in the characterization of halftone processes," Journal of the Optical Society of America, vol. 70, pp. 1248-1254, Oct. 1980.
[5] Q. G. Mei, E. K. Wong, and N. D. Memon, "Data hiding in binary text documents," in Proc. SPIE Security and Watermarking of Multimedia Contents III, vol. 4314, pp. 369-375, Aug. 2001.
[6] M. Wu and B. Liu, "Data hiding in binary image for authentication and annotation," IEEE Transactions on Multimedia, vol. 6, no. 4, pp. 528-538, Aug. 2004.
[7] H. Yang and A. C. Kot, "Data hiding for text document image authentication by connectivity-preserving," in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing, vol. II, pp. 505-508, Mar. 2005.
[8] G. Pan, Y. J. Wu, and Z. H. Wu, "A novel data hiding method for two-color images," Lecture Notes in Computer Science, vol. 2229, pp. 261-270, Oct. 2001.
[9] G. Pan, Z. H. Wu, and Y. Pan, "A data hiding method for few-color images," in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing, pp. 3469-3472, May 2002.
[10] M. Wu, J. Fridrich, M. Goljan, and H. Gou, "Handling uneven embedding capacity in binary images: a revisit," in Proc. SPIE Conf. Security, Steganography, and Watermarking of Multimedia Contents, vol. 5681, pp. 194-205, Jan. 2005.
[11] H. Lu, A. C. Kot, and Y. Q. Shi, "Distance-reciprocal distortion measure for binary document images," IEEE Signal Processing Letters, vol. 11, no. 2, pp. 228-231, Feb. 2004.
[12] J. Cheng and A. C. Kot, "Objective distortion measure for binary images," in Proc. IEEE TENCON, pp. 355-358, Nov. 2004.
