CONTENT ASPECT RATIO PRESERVING MESH-BASED IMAGE RESIZING
Kazu Mishiba (1), Masaaki Ikehara (2) and Takeshi Yoshitome (1)
(1) Department of Electrical and Electronic Engineering, Tottori University, Tottori, Japan
(2) Department of Electronics and Electrical Engineering, Keio University, Kanagawa, Japan
ABSTRACT
We propose a novel method for content-aware image resizing based on grid transformation. Our method moves each vertical and horizontal grid line in the direction perpendicular to itself. Keeping important regions unchanged is a strong constraint for image resizing, and satisfying it can produce significant distortion in unimportant regions. Our method instead allows important regions to change while suppressing changes in their aspect ratios. In addition, we introduce lower thresholds on the scaling rates of grid meshes and a boundary discarding process. Experimental results demonstrate that the proposed method resizes images with less distortion than other resizing methods.

Index Terms: content-aware image resizing, warping method, grid-based resizing, aspect ratio

1. INTRODUCTION
With the growing diversity of display sizes and aspect ratios, image resizing plays an important role in adapting images to each display. Traditional resizing techniques such as cropping and uniform scaling resize images at low computational cost. Cropping, however, can discard important content such as human faces and foreground objects, and scaling distorts important content when the aspect ratio changes. Many content-aware image resizing methods have recently been proposed to overcome these limitations [1].

Seam carving [2, 3, 4] is one approach to content-aware image resizing. Seam carving methods change the size of an image by gracefully carving out or inserting paths of pixels. Warping methods [5, 6] are another approach. They place a mesh onto an image and then deform the mesh by computing a new geometry for it.

Many content-aware image resizing methods focus only on preventing important content from changing, because keeping important content unchanged reduces the risk of producing visually implausible images.
However, if the target image width is smaller than the width of the important content in the original image, these methods cannot prevent the important content from changing. Instead of forcing the size of important content to remain unchanged, the optimized scale-and-stretch approach [7], which is categorized as a warping method, determines an optimal scaling rate for each local region of a mesh. This method distributes the distortion over image regions with homogeneous content.

In this paper, we propose a new resizing method based on grid transformation. Like other warping methods, our method places a grid onto an image and then deforms the grid for resizing. While many warping methods move the mesh vertices in arbitrary directions, our method moves each vertical and horizontal grid line in the direction perpendicular to itself (see Fig. 1). In other words, our method changes the distances between the grid lines.

Fig. 1. Our method moves each vertical and horizontal grid line in the direction perpendicular to itself. Panels: initial grid, transformed grid.

The optimal grid transformation is found by solving an energy minimization problem. To obtain favorable resizing results, we introduce two energies, which are detailed in Section 2.2. The energy for preserving the aspect ratios of objects is the central idea of our method. It helps to keep the aspect ratios of main objects unchanged, which leads to plausible resizing results.

To obtain more satisfactory results, our method also constrains the scaling rates of the meshes. In conventional warping methods, the lower thresholds of the scaling rates are either not set or set to 0. This causes distortion when unimportant regions collapse toward zero width. To avoid such distortion, our method uses lower thresholds. However, lower thresholds cause another kind of distortion. They keep the scaling rates of unimportant regions above a certain level, which leaves less area for important regions and therefore distorts them. To suppress this distortion, our method discards boundary regions whose scaling rates are small, which gives more area to important regions.

978-1-4673-2533-2/12/$26.00 ©2012 IEEE
ICIP 2012
2. GRID-BASED IMAGE RESIZING
Our method places a grid of vertical and horizontal lines onto an original image of size $m \times n$. We call the region between neighboring grid lines of the same direction a grid section $\Omega_i$, and the distance between these lines the section width $l_i$. Here $\sum_{i \in I_H} l_i = m$ and $\sum_{i \in I_V} l_i = n$, where $I_H$ and $I_V$ are the sets of all section indices in the horizontal and vertical directions, respectively. To resize an image of $m \times n$ pixels to an arbitrary size of $m' \times n'$ pixels, our method changes the section widths $l_i$ to $l'_i$ satisfying $\sum_{i \in I_H} l'_i = m'$, $\sum_{i \in I_V} l'_i = n'$, and $d_i \le l'_i \le u_i$, where $d_i$ and $u_i$ are thresholds that suppress distortion due to extreme shrinking and enlargement. The thresholds are discussed in Section 2.3. Our goal is to find the optimal widths $l'_i$ of the grid sections.

Our method attempts to satisfy the following two conditions as far as possible. First, important regions are kept unchanged. This condition is required by many content-aware image resizing methods. Second, the aspect ratios of main objects are kept unchanged. To find the optimal transformation under these conditions, we formulate the task as an energy minimization problem.

2.1. Image Importance and Main Object
Unlike homogeneous scaling methods, many content-aware resizing methods scale each region inhomogeneously at a rate that depends on its importance. Avidan and Shamir [2] use the $L_1$-norm of the grayscale intensity gradient as image importance. Visual saliency, which indicates perceptual quality, is also useful as image importance. Achanta's method [8] calculates the Euclidean distance in Lab color space between each pixel vector of a Gaussian-filtered image and the average vector of the input image. It outputs full-resolution saliency maps with well-defined boundaries of salient objects. We use Achanta's saliency calculation method to obtain the pixel importance $S$. Fig. 2 (b) is an example of a saliency map obtained with this method.
Brightness is proportional to pixel importance, which is normalized to the range from 0 to 1.

Favorable results require suppressing distortion in especially important regions, including main objects. Keeping important regions unchanged is a strong constraint for image resizing, and satisfying it can produce significant distortion in unimportant regions. Some change of the important regions is therefore inevitable in a resizing process. To suppress the distortion caused by this change, our method keeps the aspect ratios of objects unchanged. Preserving the aspect ratio of an object reduces the artificial appearance caused by changing its size. This requires extracting the important objects. We use Achanta's method [8] to extract the important objects whose aspect ratios are to be preserved. To detect salient regions, this method performs mean-shift segmentation in Lab color space and then applies an adaptive threshold that depends on the image saliency. Fig. 2 (c) is a detection result.
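As a rough illustration of the pixel importance $S$ described above, the sketch below follows the frequency-tuned idea of [8]: blur each Lab channel, then take the Euclidean distance to the mean Lab vector. It is a minimal pure-Python sketch and not the authors' implementation. It assumes the image has already been converted to Lab, and the 5-tap binomial kernel, the replicated borders, the scaling by the maximum value, and the names `binomial_blur` and `saliency_map` are illustrative assumptions.

```python
import math


def binomial_blur(channel):
    """Blur a 2-D list with the separable kernel [1, 4, 6, 4, 1] / 16.

    Borders are handled by replicating the nearest pixel (an assumption).
    """
    kernel = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]
    height, width = len(channel), len(channel[0])

    def clamp(index, last):
        return max(0, min(index, last))

    # Horizontal pass, then vertical pass.
    rows = [[sum(kernel[t] * row[clamp(x + t - 2, width - 1)] for t in range(5))
             for x in range(width)] for row in channel]
    return [[sum(kernel[t] * rows[clamp(y + t - 2, height - 1)][x] for t in range(5))
             for x in range(width)] for y in range(height)]


def saliency_map(lab):
    """lab is a list of three 2-D lists (L, a, b channels of equal size).

    Returns the distance between each blurred pixel and the mean Lab vector,
    scaled so that the largest value is 1.
    """
    height, width = len(lab[0]), len(lab[0][0])
    means = [sum(map(sum, channel)) / (height * width) for channel in lab]
    blurred = [binomial_blur(channel) for channel in lab]
    raw = [[math.sqrt(sum((blurred[c][y][x] - means[c]) ** 2 for c in range(3)))
            for x in range(width)] for y in range(height)]
    peak = max(map(max, raw))
    return [[value / peak if peak > 0 else 0.0 for value in row] for row in raw]


# Hypothetical 5 x 5 image: dark everywhere except one bright centre pixel.
lab_image = [[[0.0] * 5 for _ in range(5)] for _ in range(3)]
lab_image[0][2][2] = 100.0
importance = saliency_map(lab_image)
print(importance[2][2])
```

In this toy image the bright centre pixel differs most from the image mean, so it receives the largest importance. A real pipeline would use a proper RGB-to-Lab conversion and an image library for the filtering.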
Fig. 2. Extraction of main objects. (a) Original image. (b) Saliency map of the original image. (c) Initially detected important regions. (d) Extracted main objects.

As shown in Fig. 2 (c), the detected regions contain unimportant regions. We therefore eliminate the unimportant regions and keep the main objects. Main objects are unlikely to lie in regions that have low importance or are small. Let us define $\psi_i = \sum_{\Psi_i} S$ as the total importance in the initially detected region $\Psi_i$. We eliminate regions of relatively low importance that satisfy $\psi_i / \sum_j \psi_j < T$, where $T$ is a threshold parameter and $j$ runs over all initially detected regions. In this paper, we set $T$ to 0.05. Fig. 2 (d) shows the result of the elimination. As the figure shows, the elimination process extracts the main objects from the initially detected regions.

2.2. Energy Definition
The optimal transformation of the grid is calculated by solving an energy minimization problem. To obtain favorable resizing results, we use two energies: one for keeping important regions unchanged and one for keeping the aspect ratios of objects unchanged. Both are defined below.

First, we define the energy for keeping important regions unchanged. A resizing result is ideal when all important regions are left untouched. However, this condition is hard to satisfy when the target image width is smaller than the width of the important regions. In addition, distortion in unimportant regions cannot be ignored completely. To address these problems, each grid section is scaled depending on its importance. Let us denote the importance of grid section $\Omega_i$ by $\omega_i = \sum_{\Omega_i} S$. We define the energy for keeping important regions unchanged as

$$E_C = \sum_{i \in I} \omega_i \left( 1 - \frac{l'_i}{l_i} \right)^2 \tag{1}$$

where $I$ is the set of all grid sections on the grid. A change of the scaling rate in a more important region produces a larger energy.

Fig. 3. After solving the minimization problem, we eliminate boundary sections with small scaling rates (indicated by the gray area) and then solve the minimization problem again.

Next, we define the energy for keeping the aspect ratios of objects unchanged. The region whose aspect ratio we want to keep is an object region $\Psi_j$, which contains main objects. Let us denote the index set of the grid sections intersecting object region $\Psi_j$ by $I_{\Psi_j}$. To keep the aspect ratio of the object, a grid section $\Omega_i$ with $i \in I_{\Psi_j}$ must have the same scaling rate as the other grid sections in $I_{\Psi_j}$. If a grid section $\Omega_k$ with $k \in I_{\Psi_j}$ also intersects another object region $\Psi_l$, then $\Omega_i$ must also have the same scaling rate as the grid sections in $I_{\Psi_l}$. Let us denote the set of all grid sections whose scaling rates must equal the scaling rate of $\Omega_i$ by $\hat{I}_i$, where $i \in \hat{I}_i$. We define the energy for keeping the aspect ratios of objects unchanged as

$$E_A = \sum_{i \in \tilde{I}} l_i \left( R(\hat{I}_i) - \frac{l'_i}{l_i} \right)^2 \tag{2}$$

where $\tilde{I}$ is the set of all grid sections intersecting an object region and $R(\cdot)$ is a function that calculates the average scaling rate:

$$R(\hat{I}_i) = \frac{1}{|\hat{I}_i|} \sum_{k \in \hat{I}_i} \frac{l'_k}{l_k} \tag{3}$$
where $|\hat{I}_i|$ is the number of elements of $\hat{I}_i$. As Eq. (2) shows, a larger deviation from the average scaling rate produces a larger energy.

2.3. Total Energy Minimization
To calculate the optimal deformation, we minimize the weighted sum of the two energies:

$$E = E_C + \lambda E_A \tag{4}$$

subject to $\sum_{i \in I_H} l'_i = m'$, $\sum_{i \in I_V} l'_i = n'$, and $d_i \le l'_i \le u_i$. Here, $\lambda$ is a weight factor, and $d_i$ and $u_i$ are the lower and upper thresholds that suppress distortion due to extreme shrinking and enlargement, respectively. We use the following thresholds:

$$d_i = \tau^{-1} l_i \min\{s_v, s_h\}, \tag{5}$$

$$u_i = \tau \, l_i \max\{s_v, s_h\}, \tag{6}$$
where $s_h = m'/m$ and $s_v = n'/n$ are the horizontal and vertical scales of the target image, respectively, and $\tau$ is a parameter that adjusts the thresholds. We solve the energy minimization problem of Eq. (4) with an active set method.

In conventional warping methods, the lower thresholds of the scaling rates of meshes are either not set or set to 0. This causes distortion when unimportant regions collapse toward zero width. The lower threshold defined above avoids such distortion. However, the lower threshold causes another kind of distortion. It keeps the scaling rates of unimportant regions above a given level, which leaves less area for important regions and therefore distorts them. To suppress this distortion, we use an additional process. If the optimal scaling rate of a section in contact with an image boundary is small, the section probably consists mostly of unimportant regions. Our method discards such sections and then solves the energy minimization problem again (see Fig. 3). This discarding process assigns more width to important regions and thus reduces their distortion. The boundary grid sections satisfying $l'_i = d_i$ are eliminated, and the scaling rates of the eliminated sections are set to 0. This process acts like an adaptive cropping of the resized image.

3. EXPERIMENTAL RESULTS
To evaluate our method, we implemented it and tested it on a variety of images. In our experiments, we found that $\lambda = 100$ and $\tau = 1.25$ produce sufficiently good results. The grid lines of the initial grid are spaced 10 pixels apart.

Fig. 4. Test images (1)-(5).

To compare our method with Rubinstein's seam carving method [3] and Wang's warping method [7], we halve the widths of the images shown in Fig. 4. Fig. 5 shows the resizing results. As shown in Fig. 5 (1)-(3), the conventional methods distort the main objects, which are human figures.
Our proposed method changes the size of the human figures but produces plausible results, because it extracts the main objects and resizes the images while suppressing changes in the aspect ratios of these objects. Fig. 4 (4) and (5) contain no eye-catching objects, so object extraction cannot work effectively for such images. Despite this, our method resizes these images with less distortion than the other resizing methods. Fig. 6 compares Wang's method and our method under extreme resizing. Even under these severe conditions, our method produces a plausible result.
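To make the thresholds of Eqs. (5)-(6) and the boundary discarding step concrete with the reported value $\tau = 1.25$, the following one-dimensional sketch may help. It is an illustration under stated assumptions, not the authors' implementation: the solver minimizes only $E_C$ of Eq. (1) and uses bisection on a Lagrange multiplier instead of the active set method applied to Eq. (4), the function `energy_a` only evaluates Eq. (2), only the outermost section on each side is tested for discarding, and all function names and the importance values are hypothetical.

```python
def thresholds(widths, s_h, s_v, tau=1.25):
    """Eqs. (5)-(6): lower and upper bounds on each new section width."""
    lower = [w * min(s_v, s_h) / tau for w in widths]
    upper = [w * max(s_v, s_h) * tau for w in widths]
    return lower, upper


def energy_c(widths, new_widths, omega):
    """Eq. (1): importance-weighted squared deviation of scaling rates from 1."""
    return sum(o * (1.0 - n / w) ** 2
               for w, n, o in zip(widths, new_widths, omega))


def energy_a(widths, new_widths, groups):
    """Eq. (2): groups maps each object section i to its index list (hat I_i)."""
    total = 0.0
    for i, members in groups.items():
        mean_rate = sum(new_widths[k] / widths[k] for k in members) / len(members)
        total += widths[i] * (mean_rate - new_widths[i] / widths[i]) ** 2
    return total


def solve_ec_only(widths, omega, lower, upper, target, iters=200):
    """Minimize Eq. (1) alone under the sum and box constraints (not Eq. (4))."""
    eps = 1e-9  # assumption: tiny floor so zero-importance sections stay defined
    data = [(w, max(o, eps), lo, hi)
            for w, o, lo, hi in zip(widths, omega, lower, upper)]

    def widths_for(nu):
        # Stationarity of the Lagrangian gives l_i - nu * l_i^2 / omega_i, then clip.
        return [min(max(w - nu * w * w / o, lo), hi) for w, o, lo, hi in data]

    # |nu| <= bound is enough to push every section onto one of its bounds.
    bound = max(o * max(w - lo, hi - w) / (w * w) for w, o, lo, hi in data)
    a, b = -bound, bound
    for _ in range(iters):  # the total width is non-increasing in nu
        mid = 0.5 * (a + b)
        if sum(widths_for(mid)) > target:
            a = mid
        else:
            b = mid
    return widths_for(0.5 * (a + b))


def resize_with_discard(widths, omega, s_h, s_v, target, tau=1.25):
    """Solve once, drop outermost sections stuck at d_i, then solve again."""
    lower, upper = thresholds(widths, s_h, s_v, tau)
    first = solve_ec_only(widths, omega, lower, upper, target)
    last = len(widths) - 1
    # Assumption: only the outermost section on each side is tested, once.
    dropped = {e for e in (0, last) if abs(first[e] - lower[e]) < 1e-9}
    keep = [i for i in range(len(widths)) if i not in dropped]
    if sum(upper[i] for i in keep) < target:
        return first, first  # discarding would make the target unreachable
    second = solve_ec_only([widths[i] for i in keep], [omega[i] for i in keep],
                           [lower[i] for i in keep], [upper[i] for i in keep],
                           target)
    result = [0.0] * len(widths)  # discarded sections get scaling rate 0
    for i, value in zip(keep, second):
        result[i] = value
    return first, result


# Six 10-pixel sections, halved horizontally (s_h = 0.5, s_v = 1).
widths = [10.0] * 6
omega = [0.0, 0.5, 9.0, 9.0, 0.5, 0.0]  # hypothetical section importances
first, final = resize_with_discard(widths, omega, s_h=0.5, s_v=1.0, target=30.0)
print([round(v, 2) for v in first])
print([round(v, 2) for v in final])
```

In this toy case the lower threshold of 4 pixels pins the four less important sections in the first pass, so the two important sections shrink to about 7 pixels each. After the two boundary sections are discarded, the important sections keep roughly 9.7 of their 10 pixels, which mirrors the adaptive cropping effect described in Section 2.3.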
4. CONCLUSIONS
We proposed a novel method for content-aware image resizing based on grid transformation. Our method moves each vertical and horizontal grid line in the direction perpendicular to itself. To obtain favorable resizing results, we introduced an energy for preserving the aspect ratios of objects. In addition, we introduced lower thresholds on the scaling rates of grid meshes and a boundary discarding process. Experimental results demonstrate that the proposed method resizes images with less distortion than other resizing methods.
Fig. 5. Comparison of our method with other methods on test images (1)-(5). (a) Rubinstein [3]. (b) Wang [7]. (c) Our method.

Fig. 6. The result of extreme resizing. Left: original image. Upper right: Wang's method. Lower right: our method.

Acknowledgment
We thank the following Flickr (http://www.flickr.com/) users for Creative Commons imagery: gwenael.piaser (two little tourists), Let Ideas Compete (windows), melisslissliss (snow mountain), Paul Schultz (girls on the grass), Wouter Kiel (street). This work has been supported by the grant of Tottori University Electronic Display Research Center (TEDREC).

5. REFERENCES
[1] D. Vaquero, M. Turk, K. Pulli, M. Tico, and N. Gelfand, "A survey of image retargeting techniques," in Proc. SPIE Applications of Digital Image Processing XXXIII, 2010, vol. 7798.
[2] S. Avidan and A. Shamir, "Seam carving for content-aware image resizing," ACM Trans. Graph., vol. 26, no. 3, pp. 10, 2007.
[3] M. Rubinstein, A. Shamir, and S. Avidan, "Improved seam carving for video retargeting," ACM Trans. Graph., vol. 27, no. 3, pp. 1–9, 2008.
[4] M. Rubinstein, A. Shamir, and S. Avidan, "Multi-operator media retargeting," ACM Trans. Graph., vol. 28, no. 3, pp. 1–11, 2009.
[5] R. Gal, O. Sorkine, and D. Cohen-Or, "Feature-aware texturing," in Proc. Eurographics Symposium on Rendering, 2006, pp. 297–303.
[6] L. Wolf, M. Guttmann, and D. Cohen-Or, "Non-homogeneous content-driven video-retargeting," in Proc. IEEE Int. Conf. Computer Vision, 2007.
[7] Y.-S. Wang, C.-L. Tai, O. Sorkine, and T.-Y. Lee, "Optimized scale-and-stretch for image resizing," ACM Trans. Graph., vol. 27, no. 5, pp. 118:1–118:8, 2008.
[8] R. Achanta, S. Hemami, F. Estrada, and S. Susstrunk, "Frequency-tuned salient region detection," in Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2009.