
Interactive Image Inpainting Using DCT Based Exemplar Matching

Tsz-Ho Kwok and Charlie C.L. Wang

Department of Mechanical and Automation Engineering
The Chinese University of Hong Kong
[email protected]

Abstract. We present a novel exemplar-based image inpainting algorithm that achieves an interactive response while generating results of good quality. We modify the exemplar-based method by using the Discrete Cosine Transformation (DCT) in the exemplar matching strategy. We decompose exemplars by DCT and evaluate the matching score with fewer coefficients, which is unprecedented in image inpainting. Using fewer coefficients is important because the efficiency of Approximate Nearest Neighbor (ANN) search drops significantly in high dimensions. We have also developed a local gradient-based filling algorithm to complete the image blocks with unknown pixels, so that ANN search can be adopted to speed up the matching while preserving the continuity of the image. Experimental results demonstrate the advantage of the proposed method.

1 Introduction

Image inpainting, also known as image completion, is the process of completing the missing parts of an image. It is now widely used in applications such as retouching a photo that contains unwanted objects or repairing damage on a valuable picture. More specifically, given an input image I with a missing or unknown region Ω, the task is to propagate structure and texture information from the known region I \ Ω into Ω. Many techniques have been developed in the literature, which can be roughly classified into Partial Differential Equation (PDE) based, exemplar-based and statistical-based methods. Some of these algorithms work well with small regions but fail in large and highly textured regions, whereas others that work well in large regions take a long time to complete an image. The exemplar-based method [1] showed a great improvement in speed, reducing the computation time from hours to tens of seconds. However, in our implementation and tests, it still cannot reach an interactive speed (i.e., a few seconds or even less than a second) when running on a consumer-level PC. Therefore, a new algorithm is presented in this paper to speed up exemplar-based image inpainting.

Corresponding author.

G. Bebis et al. (Eds.): ISVC 2009, Part II, LNCS 5876, pp. 709–718, 2009. © Springer-Verlag Berlin Heidelberg 2009

PDE-based methods [2,3] diffuse the known pixels into the missing regions, smoothly propagating image information from the surrounding areas along the isophote directions. The results are sharp and without too many color artifacts. However, the major defect of these algorithms is that they only work well with small missing regions: ghost effects are produced in large regions, and unrealistic blurred results are generated when processing highly textured regions. In addition, they take a long time to complete an image. The Photoshop Healing Brush [4] is a variant of the PDE-based method that runs at an interactive speed. However, it inherits the major drawback of PDE-based inpainting: it cannot process highly textured or large regions successfully. Figure 1 shows such an example.

Fig. 1. Limitation of Photoshop Healing Brush: (left) the given image with the target region in green, (right) the result generated by Photoshop

The statistical-based method [5] uses information from the rest of the whole image and adopts a statistical learning strategy. First, it builds an exponential family distribution based on the histograms of local features over images, recording the frequency of occurrence of gradient magnitudes and angles. Then, it employs this image-specific distribution to retouch the missing regions by finding the most probable completion given the boundary and the distribution. Finally, loopy belief propagation is used to achieve optimal results. However, the algorithm also fails on highly textured photographs and takes a long time to compute the results.

Exemplar-based methods combine texture synthesis and inpainting. The first approach [1] performs the synthesis through a best-first greedy strategy that depends on the priority assigned to each patch on the filling front. This algorithm works well in large missing regions and textured regions. Our proposed algorithm fills the missing region in a similar way but with a more efficient matching technique. Several variants were proposed thereafter. In [6], image completion was posed as a discrete global optimization problem and solved by priority-BP (BP stands for belief propagation). Priority-BP was introduced to avoid visually inconsistent results; however, it took longer to compute than [1] and needed user guidance. The structure propagation approach in [7] filled the regions along user-specified curves so that important structures can be recovered. The filling order of patches was determined by dynamic programming, which is also very time-consuming on large images.

Retouching an image using other resources from a database or the Internet is a new strategy that was researched starting from [8]. Recent approaches include the use of large-displacement views in completion [9] and an image database in re-coloring [10]. However, with a large number of candidate patches, the processing time becomes even longer. In this paper, we focus on the speed-up problem of inpainting, which has not been discussed in these works.

1.1 Exemplar-Based Image Inpainting

To better explain our algorithm, the procedure of the exemplar-based algorithm [1] is briefly reviewed here. After extracting the manually selected initial front $\partial\Omega^0$, the algorithm repeats the following steps until all pixels in $\Omega$ have been filled.

– Identify the filling front $\partial\Omega^t$, and exit if $\partial\Omega^t = \emptyset$.
– Compute (or update) the priorities for every pixel $p$ on the filling front $\partial\Omega^t$ by $P(p) = C(p)D(p)$, with $C(p)$ being the confidence term and $D(p)$ being the data term. They are defined as

$$C(p) = \frac{\sum_{q \in \Psi_p \cap (I \setminus \Omega)} C(q)}{|\Psi_p|}, \qquad D(p) = \frac{|\nabla I_p^{\perp} \cdot n_p|}{\alpha} \tag{1}$$

  where $|\Psi_p|$ is the area of the image block $\Psi_p$ centered at $p$, $\alpha$ is the normalization factor (e.g., $\alpha = 255$ for a typical grey-level image), $n_p$ is the unit normal vector that is orthogonal to the front $\partial\Omega$ at $p$, and $\nabla I_p^{\perp}$ is the isophote at $p$. During initialization, $C(p)$ is set to $C(p) = 0$ ($\forall p \in \Omega$) and $C(p) = 1$ ($\forall p \in I \setminus \Omega$). Therefore, a pixel that is surrounded by more confident (known) pixels and is more likely to let the isophote flow in will have a higher priority.
– Find the patch $\Psi_{\hat{p}}$ which has the maximum priority.
– Find the exemplar patch $\Psi_{\hat{q}}$ in the filled region that minimizes the Sum of Squared Differences (SSD) between $\Psi_{\hat{p}}$ and $\Psi_{\hat{q}}$ (i.e., $d(\Psi_{\hat{p}}, \Psi_{\hat{q}})$), defined on those pixels that are already filled.
– Copy image data from $\Psi_{\hat{q}}$ to $\Psi_{\hat{p}}$ ($\forall p \in \Psi_{\hat{p}} \cap \Omega$).
– Update $C(p) = C(\hat{p})$ ($\forall p \in \Psi_{\hat{p}} \cap \Omega$).
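The priority computation of Eq. (1) can be sketched as follows. This is a minimal illustration in Python, not the implementation of [1] or of this paper. The function names, the boolean `known` mask, the row-major list-of-lists layout, and the convention that confidence starts at 1 on known pixels and 0 inside Ω (as in [1]) are assumptions made for the example. Patch pixels that fall outside the image are simply skipped, while the denominator stays the full patch area.

```python
def confidence_term(conf, known, p, half):
    """C(p): sum of C(q) over the known pixels of the patch centred at p,
    divided by the full patch area |Psi_p| = (2*half + 1)**2."""
    py, px = p
    h, w = len(conf), len(conf[0])
    total = 0.0
    for y in range(py - half, py + half + 1):
        for x in range(px - half, px + half + 1):
            if 0 <= y < h and 0 <= x < w and known[y][x]:
                total += conf[y][x]
    return total / float((2 * half + 1) ** 2)


def data_term(isophote, normal, alpha=255.0):
    """D(p) = |isophote . normal| / alpha, both given as (x, y) vectors."""
    dot = isophote[0] * normal[0] + isophote[1] * normal[1]
    return abs(dot) / alpha


def priority(conf, known, p, half, isophote, normal, alpha=255.0):
    """P(p) = C(p) * D(p) for one pixel p on the filling front."""
    return confidence_term(conf, known, p, half) * data_term(isophote, normal, alpha)
```

For instance, a 3 × 3 patch on the front with six known pixels of confidence 1 has C(p) = 6/9, and an isophote of strength 51 that crosses the front head-on (parallel to the normal) gives D(p) = 51/255 = 0.2.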

By default, a window size of 9 × 9 pixels is used for $\Psi_p$; in practice, the user is required to set it slightly larger than the largest distinguishable texture element. $d(\Psi_{\hat{p}}, \Psi_{\hat{q}})$ is evaluated in the CIE Lab color space because of its property of perceptual uniformity. In the routine of exemplar-based image inpainting, the most time-consuming step is finding the best matched patch $\Psi_{\hat{q}} = \arg\min_{\Psi_q \in I \setminus \Omega} d(\Psi_{\hat{p}}, \Psi_q)$, which is where our approach contributes to the speed-up of the algorithm.

1.2 Compression Technique

There are quite a number of methods that can use fewer coefficients to represent an image block (an exemplar in our approach), such as the Haar wavelet [11], FFT [12] and PCA [13]. However, for fast decomposition methods (e.g., the Haar wavelet), some important details are lost during coefficient selection, and the structure is destroyed. Besides, selecting coefficients after PCA is not very flexible. In image processing, DCT is more efficient than FFT. Therefore, we choose DCT as our mathematical tool to select coefficients in exemplar matching. The DCT formula can be partially pre-computed, which further speeds up the decomposition. Moreover, the locations of DCT coefficients have a very clear physical meaning: the top-left corner coefficients correspond to the low-frequency components, which are very important to visual perception.

1.3 Our Contribution

We develop two enhancements for exemplar-based image inpainting.

– Firstly, for the evaluation of the matching score, we decompose exemplars into the frequency domain using DCT and determine the best matched patches more efficiently, using fewer coefficients with the help of ANN search.
– Secondly, for DCT on a patch with unknown pixels, the pixel values are determined by a local gradient-based filling, which better preserves the continuity of image information.

2 Using DCT in Exemplar-Based Image Inpainting

Using heuristic search to determine the best matched patch that minimizes the SSD in the above algorithm can hardly reach the speed reported in [1], although we have already employed a max-heap data structure to obtain the patch $\Psi_{\hat{p}}$ with the maximum priority. Therefore, we tried the efficient Approximate Nearest Neighbor (ANN) library [14], which is implemented with a KD-tree, to search for the best matched patch $\Psi_{\hat{q}}$ for a given patch $\Psi_{\hat{p}}$ at the filling front ∂Ω. Every image block is treated as a high-dimensional point in the feature space, and the squared Euclidean distance in this feature space is equal to the SSD. However, the set of unknown pixels in $\Psi_{\hat{p}}$ changes from one iteration to the next, so the dimension cannot be fixed. Although no description of this is given in [1], we filled the unknown pixels with the average color, which fixes the dimension of the KD-tree. Surprisingly, the same results are obtained in similar lengths of time; we tested our implementation in this way on Figs. 9, 13 and 15 in [1].
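The fixed-dimension feature vector described above can be sketched as follows. This is an illustrative example only: the function names and the representation of a patch as a flat list of color tuples with `None` for unknown pixels are assumptions, and the brute-force search stands in for the KD-tree query of the ANN library [14], whose API is not reproduced here.

```python
def patch_to_vector(patch):
    """Flatten a patch (list of colour tuples, None = unknown pixel) into a
    fixed-length vector, filling unknown pixels with the mean known colour."""
    known = [px for px in patch if px is not None]
    if not known:
        raise ValueError("patch has no known pixels")
    channels = len(known[0])
    mean = tuple(sum(px[c] for px in known) / len(known) for c in range(channels))
    vec = []
    for px in patch:
        vec.extend(mean if px is None else px)
    return vec


def ssd(u, v):
    """Sum of squared differences, i.e. the squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def nearest_exemplar(query, exemplars):
    """Index of the exemplar vector with the smallest SSD to the query
    (exhaustive search; a KD-tree would replace this loop)."""
    return min(range(len(exemplars)), key=lambda i: ssd(query, exemplars[i]))
```

With a 9 × 9 patch and three color channels, `patch_to_vector` always returns 9 × 9 × 3 = 243 components, whatever the number of unknown pixels.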


Fig. 2. Example I: (top-left) the given image (206 × 308 pixels), (top-right) original [1] (11.97s), (bottom-left) DCT (0.328s), and (bottom-right) DCT+GB (0.344s)


Fig. 3. Example II: (top-left) the given image (628 × 316 pixels), (top-right) original [1] (44.36s), (bottom-left) DCT (1.406s), and (bottom-right) DCT+GB (1.344s)

Fig. 4. Example III: (top-left) the given image (700 × 438 pixels), (top-right) original [1] (76.34s), (bottom-left) DCT (1.922s), and (bottom-right) DCT+GB (1.875s)

Further study shows that when the dimension of the KD-tree increases from 20 to more than 200 (e.g., a 9 × 9 pixel window is a point in a space of dimension 9 × 9 × 3 = 243), the performance of the KD-tree drops significantly. Thus, to improve the efficiency, we need a method to approximate the image block using fewer coefficients.


Fig. 5. Example IV: (top-left) the given image (438 × 279 pixels), (top-right) original [1] (48.60s), (bottom-left) DCT (0.985s), and (bottom-right) DCT+GB (1.062s)

Fig. 6. Example V: (top-left) the given image (700 × 438 pixels), (top-right) original [1] (124.5s), (bottom-left) DCT (2.468s), and (bottom-right) DCT+GB (2.469s)


Fig. 7. Example VI: (left) the given image (362 × 482 pixels), (middle-left) original [1] (6.359s), (middle-right) DCT (0.406s), and (right) DCT+GB (0.437s)

The Discrete Cosine Transformation (DCT) [15] provides such an ability. As investigated in the JPEG standard for still images, an image block can be well reconstructed even if many high-frequency components of the resulting DCT array are ignored. Specifically, using only about 10% of the DCT coefficients for matching already gives acceptable results in our tests. The two indices of a DCT coefficient correspond to the vertical and horizontal frequencies that it contributes to. Therefore, to make the matching on DCT coefficients symmetric, we keep symmetric DCT coefficients as the feature components of an image block. The DCT coefficients of image blocks fully occupied by known pixels can be pre-computed and stored in the KD-tree as exemplars for ANN search before the matching steps, which speeds up the computation.
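The decomposition and coefficient selection can be sketched as follows for a single channel. This is an illustration under stated assumptions rather than the authors' implementation: it uses the orthonormal DCT-II with a cosine table computed once per block size, and it interprets "symmetric" selection as keeping every coefficient whose indices satisfy u + v < k, a set that is unchanged when the vertical and horizontal indices are swapped. The function names and the cutoff parameter `k` are hypothetical.

```python
import math


def dct_basis(n):
    """Pre-computed orthonormal DCT-II table:
    basis[u][x] = a(u) * cos((2x + 1) * u * pi / (2n))."""
    basis = []
    for u in range(n):
        scale = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
        basis.append([scale * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                      for x in range(n)])
    return basis


def dct2(block, basis):
    """Separable 2-D DCT of an n x n block; the result is indexed as
    coeffs[u][v] with u the vertical and v the horizontal frequency."""
    n = len(block)
    # Transform along each row first, then along each column.
    rows = [[sum(basis[v][x] * block[y][x] for x in range(n)) for v in range(n)]
            for y in range(n)]
    return [[sum(basis[u][y] * rows[y][v] for y in range(n)) for v in range(n)]
            for u in range(n)]


def low_frequency_features(coeffs, k):
    """Keep the top-left triangle u + v < k, which is symmetric in (u, v)."""
    n = len(coeffs)
    return [coeffs[u][v] for u in range(n) for v in range(n) if u + v < k]
```

For a 9 × 9 block, k = 4 keeps 10 of the 81 coefficients per channel, which is close to the 10% figure quoted above. Because the orthonormal DCT preserves Euclidean distances, the SSD computed on the kept coefficients is a lower bound of the SSD on the full blocks.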

3 Improvement by Gradient-Based Filling

When DCT coefficients are computed on image blocks with unknown pixels, filling the unknown pixels with the average color of the known pixels yields DCT coefficients that poorly reflect the texture or structural information in the block. For example, for a block with a progressive color change from left to right, if the missing region Ω is located at the right part, filling Ω with the average color is not a good approximation. For a smooth image, the gradient at each pixel is approximately equal to zero. Based on this observation, we developed a gradient-based filling method to determine the unknown pixels before computing the DCT. In detail, for each unknown pixel p, setting the discrete gradient at this pixel to zero leads to linear equations relating p to its left/right and top/bottom neighbors. Therefore, for n unknown pixels, we have m linear equations with m > n, which is an over-determined linear system. The optimal values of the unknown pixels can then be computed by the least-squares solution that minimizes the norm of the gradients at the unknown pixels. This gives better inpainting results.
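The zero-gradient fill can be sketched as follows for a single channel. The paper does not state how the equations are assembled or solved, so this example rests on assumptions: it writes one equation x_p − x_q = 0 for every pair of 4-connected neighbors in the block that involves an unknown pixel, and it solves the least-squares problem by Gauss-Seidel iteration on the normal equations, which amounts to repeatedly replacing each unknown pixel by the mean of its in-block neighbors. The function name and the use of `None` for unknown pixels are hypothetical.

```python
def fill_zero_gradient(block, max_iters=1000, tol=1e-9):
    """Fill unknown pixels (None) of a small block so that the squared
    differences between neighbouring pixels are minimised in the
    least-squares sense. Returns a new block of floats."""
    h, w = len(block), len(block[0])
    unknown = [(y, x) for y in range(h) for x in range(w) if block[y][x] is None]
    known = [v for row in block for v in row if v is not None]
    if not known:
        raise ValueError("block has no known pixels")
    mean = sum(known) / len(known)
    # Start from the average-colour fill, then relax towards zero gradient.
    out = [[mean if v is None else float(v) for v in row] for row in block]
    for _ in range(max_iters):
        change = 0.0
        for y, x in unknown:
            nbrs = [out[ny][nx]
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                    if 0 <= ny < h and 0 <= nx < w]
            new = sum(nbrs) / len(nbrs)  # normal equation for this pixel
            change = max(change, abs(new - out[y][x]))
            out[y][x] = new
        if change < tol:
            break
    return out
```

On the row `[10, 20, 30, None, None]`, which mimics the left-to-right color change discussed above, this fill extends the last known value (30) into the missing part, whereas the average-color fill would insert 20 and create an artificial step.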


Fig. 8. Example XI (top-row) and X (bottom-row): (left) the given images, (middle-left) results of [1], (middle-right) DCT results, and (right) DCT+GB results

4 Experimental Results and Discussion

We have implemented the proposed algorithm in Visual C++ and tested it on various examples on a PC with an Intel Core2 6600 CPU at 2.40 GHz and 2.0 GB of RAM. Overall, the proposed algorithm can recover photographs more effectively and efficiently than the greedy exemplar-based approach [1]. Figures 2-8 show the results, where 'original' stands for the method of [1] but with ANN search (filling unknown pixels with the average color), 'DCT' for using only the Discrete Cosine Transformation (filling unknown pixels with the average color), and 'DCT+GB' for the DCT-based method with unknown pixels filled by zero-gradient optimization (i.e., the method in Section 3). The major limitation of the DCT-based method is that it does not work very well on sharp features, as the high-frequency components are neglected in the search for best-matched blocks (e.g., the top row of Fig. 8). Improving this will be our work in the near future.

References

1. Criminisi, A., Pérez, P., Toyama, K.: Region filling and object removal by exemplar-based image inpainting. IEEE Transactions on Image Processing 13, 1200–1212 (2004)
2. Bertalmio, M., Sapiro, G., Caselles, V., Ballester, C.: Image inpainting. In: ACM SIGGRAPH 2000, pp. 417–424. ACM, New York (2000)
3. Bertalmio, M., Bertozzi, A.L., Sapiro, G.: Navier-Stokes, fluid dynamics, and image and video inpainting. In: IEEE CVPR 2001, vol. I, pp. 355–362. IEEE, Los Alamitos (2001)
4. Georgiev, T.: Photoshop healing brush: a tool for seamless cloning. In: Workshop on Applications of Computer Vision (ECCV 2004), pp. 1–8 (2004)
5. Levin, A., Zomet, A., Weiss, Y.: Learning how to inpaint from global image statistics. In: IEEE ICCV 2003, pp. 305–312. IEEE, Los Alamitos (2003)
6. Komodakis, N., Tziritas, G.: Image completion using global optimization. In: IEEE CVPR 2006, vol. I, pp. 442–452. IEEE, Los Alamitos (2006)
7. Sun, J., Yuan, L., Jia, J., Shum, H.-Y.: Image completion with structure propagation. ACM Transactions on Graphics (SIGGRAPH 2005) 24, 861–868 (2005)
8. Hays, J., Efros, A.: Scene completion using millions of photographs. ACM Transactions on Graphics (SIGGRAPH 2007) 26, Article 4 (2007)
9. Liu, C., Guo, Y., Pan, L., Peng, Q., Zhang, F.: Image completion based on views of large displacement. The Visual Computer 23, 833–841 (2007)
10. Liu, X., Wan, L., Qu, Y., Wong, T., Lin, S., Leung, C., Heng, P.: Intrinsic colorization. ACM Transactions on Graphics (SIGGRAPH Asia 2008) 27, Article 152 (2008)
11. Bede, B., Nobuhara, H., Schwab, E.D.: Multichannel image decomposition by using pseudo-linear Haar wavelets. In: Image Processing, 2007. ICIP 2007, vol. 6, pp. VI-17–VI-20 (2007)
12. Kumar, S., Biswas, M., Belongie, S.J., Nguyen, T.Q.: Spatio-temporal texture synthesis and image inpainting for video applications. In: Image Processing, 2005. ICIP 2005, vol. 2, pp. II-85–88 (2005)
13. Lefebvre, S., Hoppe, H.: Parallel controllable texture synthesis. ACM Trans. Graph. 24, 777–786 (2005)
14. Mount, D., Arya, S.: ANN: A Library for Approximate Nearest Neighbor Searching (2006), http://www.cs.umd.edu/~mount/ANN/
15. Li, Z.N., Drew, M.: Fundamentals of Multimedia. Pearson Prentice Hall, London (2004)