Lewis D. Griffin

Vol. 16, No. 12 / December 1999 / J. Opt. Soc. Am. A

2825

Partitive mixing of images: a tool for investigating pictorial perception

Lewis D. Griffin

Department of Optometry and Vision Sciences, School of Life and Health Sciences, Aston University, Birmingham B4 7ET, England

Received January 19, 1999; accepted July 15, 1999; revised manuscript received August 6, 1999

In certain cases, images of different scenes can be mixed to produce an image of a novel scene. For example, an image of a pink sphere can be additively mixed from suitable images of a red and a white sphere. Three ways in which scenes can differ are considered: in the spectral composition of the illuminant and in the spectral and the geometric reflectance of scene objects. Sufficient conditions are given for mixing to produce images that correspond to possible scenes. Examples illustrate ways that mixtures can be used as stimuli in psychophysical experiments concerned with pictorial perception. © 1999 Optical Society of America [S0740-3232(99)01512-4]

OCIS codes: 330.1720, 330.5020, 330.5510.

1. INTRODUCTION
A method of generating visual stimuli by mixing other stimuli will be described. The method is an extension and generalization of established colorimetric techniques. The earliest report of producing visual stimuli by mixing is Ptolemy's (ca. A.D. 150) description of spinning a nonuniformly painted disk.1 Mixing became a productive experimental method when Forbes2 improved the technique by using adjustable sectors of differently colored papers. A 100-point scale on the perimeter of the disk allowed the size of each sector to be read off. Maxwell3 adopted this technique for his own researches, using it to conclude that, for example, "... the neutral colors of a color-blind person may be produced by combining 6 degrees of ultramarine with 94 of vermilion, or 60 of emerald-green with 40 of ultramarine." Later, Maxwell replaced the spinning disk with a light box, within which colors were produced by superimposing light beams. Maxwell4 also developed another mixing method—the superimposition of projected images—to make the first color photograph. The three component images were of the same scene: a tartan ribbon. The images differed because for each exposure a different-colored filter was interposed between camera and scene. Similarly, the mixing method described here uses photographs of a scene that differs in some way between exposures, but a wider range of changes is considered. With the new mixing method, as with the color photograph, the (basis) images are additively mixed to produce novel images; but, as with the spinning disk, the mixture is partitive; that is to say, the sum of the weights of the components is constant. As will be shown, for certain types of scene change any partitive mixture of the images will correspond to a possible scene. The scene changes considered here are to the illuminant, to the spectral reflectance (color) of objects, or to the geometric reflectance (e.g., matte versus gloss).
The technique can be used to generate images of many more scenes than can conveniently be photographed. These images can be used as stimuli in psychophysical experiments, and responses can be mapped across the affine structure induced on the stimulus space by partitive mixing.5 This is the method by which Maxwell6 determined the shape of the spectral locus in mixture space. The emphasis of this paper is on describing the method and showing the range of its applicability, not on drawing conclusions from psychophysical data acquired with it. The paper is organized as follows: First, the methods used in acquiring and mixing images are described. Second, the accuracy with which mixed images simulate real images is assessed. Next, the results of three types of scene variation are presented. Finally, two demonstrations are given of the use of mixed images as stimuli in psychophysical experiments that explore pictorial perception.

2. METHODS
A. Partitive Mixtures
In abstract terms, a partitive mixture (p mixture) is a weighted sum of basis items, with weights summing to unity.7 The operationalization of weighting and summing depends on the nature of the basis items. For example, if the basis items are images projected on a screen, then weighting can be achieved by using neutral-density filters to attenuate each image suitably, and summation by superimposing the projected images. Alternatively, if the basis items are digital images, then weighting and summing may be performed on the numerical image values, pixel by pixel and independently on each color channel. Although the optical method of p mixing has been successfully implemented,8 it has several problems (alignment of slides, many processes to gamma correct, etc.).
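The digital route, weighting and summing image values pixel by pixel under the unit-sum constraint, can be sketched in a few lines. This is an illustrative helper, not code from the paper; `p_mix` and the toy 2×2 images are invented for the example.

```python
# Sketch of digital partitive mixing: a weighted sum of basis images,
# pixel by pixel, with the weights constrained to sum to unity.

def p_mix(images, weights, tol=1e-9):
    """Partitive mixture of same-sized gray-scale images (lists of rows)."""
    if abs(sum(weights) - 1.0) > tol:
        raise ValueError("weights of a partitive mixture must sum to unity")
    rows, cols = len(images[0]), len(images[0][0])
    return [[sum(w * img[r][c] for w, img in zip(weights, images))
             for c in range(cols)]
            for r in range(rows)]

# Two toy 2x2 "basis images"; the equal-weight p mixture averages them.
I1 = [[0.0, 0.2], [0.4, 0.6]]
I2 = [[1.0, 0.8], [0.6, 0.4]]
M = p_mix([I1, I2], [0.5, 0.5])
```

For color images the same operation would simply be repeated independently on each channel, as the text describes.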


Digital p mixing has been found to be easier and will be assumed in the remainder of the paper, except where explicitly stated otherwise.

B. Barycentric Coordinates
A space of p mixtures can be indexed by barycentric coordinates. For example, if the basis images are I₁, …, Iₙ, then the coordinates ⟨w₁, …, wₙ⟩ (where Σᵢ wᵢ = 1) index the p mixture w₁I₁ + … + wₙIₙ. Barycentric coordinates have also been used to specify face images obtained by warping from the basis set.9 Such coordinates endow the space with an affine structure.10 This may be visualized by embedding the affine space in a Euclidean space. For example, if there are three basis images they may be placed at the corners of an equilateral triangle as shown in Fig. 1 (top). If the locations of the three basis images are i₁, i₂, i₃, then the location of the p mixture ⟨w₁, w₂, w₃⟩ is w₁i₁ + w₂i₂ + w₃i₃. Notice that it requires three images to span a two-dimensional space; in general, spanning an n-dimensional space requires n + 1 images. The reason for the mismatch between the number of coordinates and
the dimension of the space spanned is that the unit-sum constraint makes the coordinates interdependent.

C. Superpartitive Mixtures
Referring to Fig. 1 (top), if the weights in a mixture are constrained to be nonnegative, then only mixtures on the perimeter and in the interior of the triangle can be formed. These will be termed partitive (p). If one or more weights are negative, then mixtures that correspond to points outside the triangle are formed. These will be termed superpartitive (sp). Sp mixtures were of importance to Maxwell6 since every point that he mapped on the spectral locus was an sp mixture of his basis stimuli. However, with neither of his mixing methods was he able to subtract components. Instead, he inferred negative coefficients from matches. For example, from the match x + y = z he would infer that x = z − y. Inferred matching can be used with image mixing, but it is also possible to subtract components and so generate sp mixtures. Doing this is easy when one is working in the digital domain but is also possible when one is mixing projected images.8 To subtract a projected image, a color negative (i.e., complementary) image is projected. The weight of this component is set by the use of a neutral-density filter, as with a positive component. Since subtraction of a basis image is achieved by addition of a complementary image, sp projected mixtures have a pedestal of constant luminance across them.

Fig. 1. Top, Euclidean embedding of the mixture space of three basis images. One can determine the weighting of the basis images needed to produce the mixture image at any given point by considering what masses (with unit sum) placed at the locations of the basis images would have a center of gravity at the point. Bottom left, noise variance level for different mixtures; bottom right, maximum magnitude of misregistration artifacts for different mixtures.

D. Choosing Basis Images
The choice of basis images determines the span of mixture images that can be produced. In theory, given a desired span (of dimension, say, d), any set of d + 1 images from the span, none of which can be mixed from the rest, could serve as a basis. However, in practice, some bases are better than others, because of noise and misregistration artifacts in mixture images, as follows. The noise in mixed images is an amplification or diminishment of noise present in the basis images. If the noise in basis images is of variance k, the noise variance in mixture images is given by v(⟨w₁, …, wₙ⟩) = k Σᵢ wᵢ². Mixing can also introduce artifacts if the basis images are not perfectly aligned. To derive a formula for the possible magnitude of such artifacts, consider the worst case: {0, 1}-valued basis images that are identical except for misregistration. If there were no misregistration, all mixtures would be {0, 1} valued. So the degree of misregistration artifact associated with each pixel in a mixture of misregistered basis images can be measured as the smaller of the artifact's deviations from the values 0 and 1. The smaller deviation is used, rather than the larger, so that the artifact measure goes to zero together with the degree of misregistration. Suppose that for some pixel, a subset S of the basis images has the value 1, with the remainder having the value 0. The value of the pixel in the mixture image will be Σ_{i∈S} wᵢ. From this line of reasoning, the maximum magnitude of misregistration artifacts can be shown to be

m(⟨w₁, …, wₙ⟩) = max_{S ⊆ {1, …, n}} min( |Σ_{i∈S} wᵢ| , |1 − Σ_{i∈S} wᵢ| ).

The functions v and m are shown as density plots in Fig. 1 (bottom). As the figure shows, both functions are monotonically increasing for increasingly sp mixtures. Therefore, to avoid producing mixture images with either of these problems, one should select a basis whose p span includes as much as possible of that part of the entire span that is relevant to the experiment in hand.

E. Acquiring Basis Images
For the results presented here, basis images were acquired with an Olympus C-400L digital color still camera, a domestic-consumer-quality item. The images, acquired at 8 bits/channel and at a resolution of 640 × 480, were automatically (and unavoidably) JPEG compressed within the camera before being downloaded to a PC. Two problems in acquiring images were discovered: interacquisition jitter and response saturation. The camera exhibited a jitter of up to 2 pixels between acquisitions, despite being firmly mounted and operated electronically. Even subpixel misalignments between basis images can give rise to significant artifacts in sp mixtures. The misalignment problem was addressed by taking ten images of each version of the scene and selecting the best-aligned set. Jitter was evaluated by use of registration marks that were present in the scenes for this purpose. This procedure reduced misalignment but did not eliminate it. Camera response saturation occurred not only at specular highlights but also when a brightly lit and intensely colored object was present. For example, a well-illuminated red object caused the camera to saturate in the red channel even in bright areas free of highlights, producing uniform areas where there should have been shading. This problem was addressed by ensuring that any object that was to be changed between scenes was not too intensely lit compared with the rest of the scene. This procedure successfully dealt with the problem of saturation caused by bright colors. Saturation at highlights still occurred, but it could be avoided by using a camera with variable exposure.11

F. Linearizing Basis Images
For clarity, four types of image should be distinguished. First are ethereal images, which are irradiance distributions on the camera CCD.12 Second are raw images, which are digital images output by the camera. Third are basis images, which are also digital images. Fourth are displayed images, which are physical stimuli displayed on a monitor. For subsequent mixing, it is necessary that basis images be linearly dependent on the ethereal image. Hence it was necessary to compensate for camera nonlinearities in order to produce basis images from raw images. The primary nonlinearity was the camera's translation of CCD voltages into raw image values. By imaging a gray-level series of known reflectivity and least-squares fitting the data, this nonlinearity was estimated to be

P(L) = 4.44L for L ≤ 0.0298, and P(L) = 1.27(L^0.324 − 0.217) otherwise,

where P and L are pixel and luminance values, respectively, scaled into the interval [0, 1]. This relationship was determined for the green channel but was found also to describe adequately the red and blue channel nonlinearities. Basis images were computed from raw camera images by applying the inverse of this function.

The second nonlinearity was due to the camera's automatically, and again unavoidably, setting the scene white point (i.e., the channel gains) according to the image content. This was addressed by including a dark and a light reference in each scene. The image RGB values of the references (d_{r,i}, d_{g,i}, d_{b,i})ᵀ, (l_{r,i}, l_{g,i}, l_{b,i})ᵀ, and their means across the basis set (d̄_r, d̄_g, d̄_b)ᵀ, (l̄_r, l̄_g, l̄_b)ᵀ, were used to determine an affine transformation

(rᵢ, gᵢ, bᵢ)ᵀ ↦ ( [d̄_r(l_{r,i} − rᵢ) + l̄_r(rᵢ − d_{r,i})]/(l_{r,i} − d_{r,i}),
                  [d̄_g(l_{g,i} − gᵢ) + l̄_g(gᵢ − d_{g,i})]/(l_{g,i} − d_{g,i}),
                  [d̄_b(l_{b,i} − bᵢ) + l̄_b(bᵢ − d_{b,i})]/(l_{b,i} − d_{b,i}) )ᵀ

of the RGB values of the set of basis images such that they all agreed for these two objects. This affine transformation was calculated and applied after correcting for the voltage–pixel-value nonlinearity. Finally, two nonlinearities were not corrected for. The first was saturation at highlights; the effect of this nonlinearity will be apparent in results given below. The second was the JPEG compression of the raw images; its effects were less noticeable than those of saturation and misregistration.
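The quantities introduced in Subsections 2.D and 2.F, the mixture noise variance v, the worst-case misregistration artifact m, and the fitted camera response P with its inverse, can be sketched as follows. The function names are invented for illustration; only the formulas come from the text.

```python
from itertools import combinations

def camera_response(L):
    """Fitted pixel value P as a function of luminance L (both in [0, 1]).
    The two fitted branches are only approximately continuous at the knee."""
    return 4.44 * L if L <= 0.0298 else 1.27 * (L ** 0.324 - 0.217)

def linearize(P):
    """Approximate inverse of camera_response: luminance from a raw value."""
    knee = 4.44 * 0.0298
    return P / 4.44 if P <= knee else (P / 1.27 + 0.217) ** (1 / 0.324)

def noise_variance(weights, k=1.0):
    """v(<w1,...,wn>) = k * sum(w_i^2): noise variance of a mixture image."""
    return k * sum(w * w for w in weights)

def max_misregistration_artifact(weights):
    """m(<w1,...,wn>) = max over subsets S of min(|sum_S w|, |1 - sum_S w|)."""
    n = len(weights)
    best = 0.0
    for r in range(n + 1):
        for S in combinations(range(n), r):
            s = sum(weights[i] for i in S)
            best = max(best, min(abs(s), abs(1 - s)))
    return best
```

For a pure basis image such as ⟨1, 0⟩ the artifact measure m is zero, while it grows for increasingly sp weight vectors, consistent with the density plots of Fig. 1 (bottom).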

G. Displaying Images
Ideally, the displayed version of a basis image would be metameric with its associated ethereal image. However, if, as here, the camera absorption spectra are not linear combinations of the human cone response functions, then this may not be possible. This is because there are pairs of ethereal images that are metameric to such a camera but not metameric to humans.13,14 Instead, we settle for an affine increasing relationship (D) between digital image values and displayed luminances. Since digital image values are linearly related to the ethereal image, the displayed image will be affinely related to the ethereal image of the original scene. Note that, since D is required only to be affine rather than linear, it is not necessary that D(0) = 0; all that is required is the weaker condition Σᵢ wᵢ = 1 ⇒ D(Σᵢ wᵢpᵢ) = Σᵢ wᵢD(pᵢ). When one is dealing with color images, the same function D is used for each color channel.

The function D is established for a set of images. By choosing D for a set, rather than individually for each image, one ensures that (i) the relationship between the displayed image and the ethereal image is constant across the set and (ii) the affine structure that exists on ethereal images, and is mirrored by the mixture structure of digital images, is replicated in the space of displayed images. Given that D is affine and increasing, there remain two degrees of freedom. These are settled by ensuring that the minimum (maximum) pixel value in the entire set of images maps to minimum (maximum) screen luminance. Assuming that the basis images all have values in the interval [0, 1], then p mixtures will also have values in [0, 1]. Hence there is a D function that is suitable for displaying the entire set of p mixtures (and, typically, some sp mixtures). Sp mixtures, however, can have a wider range of values, e.g., [−0.1, 1.2]. Although the D function that is determined for a set containing such an image will allow all images in the set to be displayed consistently, some mixtures (in particular, p ones) will be displayed without utilizing the lower or upper parts of the display's dynamic range.
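The construction of D, an affine increasing map chosen once for a whole image set so that the set's extreme values hit the screen limits, might be sketched as below. `make_display_function` is a hypothetical name, and the unit-interval output stands in for screen luminance.

```python
def make_display_function(images):
    """Affine increasing map D from image values to display luminance in
    [0, 1], chosen once for the whole set so that the minimum (maximum)
    value over the set maps to minimum (maximum) screen luminance."""
    lo = min(min(min(row) for row in img) for img in images)
    hi = max(max(max(row) for row in img) for img in images)
    return lambda p: (p - lo) / (hi - lo)

# A set containing an sp mixture with values outside [0, 1]:
D = make_display_function([[[-0.1, 1.2]], [[0.0, 1.0]]])

# Because D is affine, it commutes with unit-sum (partitive) mixing:
w, p1, p2 = 0.25, 0.4, 0.8
lhs = D(w * p1 + (1 - w) * p2)
rhs = w * D(p1) + (1 - w) * D(p2)
```

Note that the affine property holds only because the weights sum to one; for a general linear combination the offset term D(0) would spoil the equality.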

Fig. 2. Results of an evaluation of the accuracy with which mixed images simulate real images. Bottom half, blurred versions of the basis images D and L were used to generate mixtures that matched the blurred targets B, G, and W. At the top left are details showing regions of a target and a mixture image that can easily be seen to be different if mixing is done with unblurred images (frames in the lower part of the figure show the locations of these regions of interest). The graph shows how the mismatch between mixed and target images decreases with blurring.


3. ACCURACY OF MIXED IMAGES
Given the limitations of the camera as described in Section 2, it is hardly to be expected that mixed images will exactly equal the images that they are intended to simulate. However, in this section their accuracy is assessed as a benchmark for future experiments that will use more-sophisticated cameras. To test the method, a set of monochrome images of different spheres was analyzed. The spheres were snooker balls of different lightness but similar glossiness, photographed in color and then, after linearization, converted to gray scale by use of the mapping gray = 0.35 × red + 0.50 × green + 0.15 × blue. The color originals are shown in Fig. 4 below. This procedure produced a sequence of achromatic images of spheres of different diffusive albedo, running from black (B) through dark (D), gray (G), light (L), to white (W). The images of the D and L balls were selected as the basis set, and from them mixtures simulating the other three were produced. The mixture weights were chosen so as to minimize the root-mean-squared (RMS) difference between mixtures and their targets, evaluated over the spheres and their immediate surrounds. The RMS differences between the mixtures and their targets were on average 2.0% of the range of image values. The optimal mixtures determined for B, G, and W were ⟨1.09, −0.09⟩, ⟨0.34, 0.66⟩, and ⟨−0.24, 1.24⟩, respectively, where the first coordinate is the weight on the D image and the second is the weight on the L image.

Visual comparison of the mixture and target images revealed noticeable localized differences at the highlights and rim of the sphere (Fig. 2, top left). Because of their spatial structure, these differences were attributed to JPEG compression and misregistration. To determine how much of the RMS difference was due to these high-frequency artifacts, mixing was repeated with blurred basis and target images. Blurring was achieved by numerical convolution with two-dimensional Gaussian kernels; increasing degrees of blurring were achieved by using kernels of increasing standard deviation. The RMS differences for blurs of standard deviation up to 4 pixels are shown in graphic form in Fig. 2 (top right); the ball diameter is 115 pixels. The graph shows that, for all three targets, a blur of standard deviation 2 pixels reduced the RMS difference to approximately 60% of its unblurred value, i.e., to an average 1.2% of the range of image values. For increasing degrees of blurring beyond this point, the reduction in RMS difference was more gradual. The bottom part of Fig. 2 shows the blurred basis images, targets, and mixture images. The RMS difference of 1.2% is just reliably detectable when the images are displayed at full contrast on a monitor; in print reproduction the differences are more difficult to see. Spatial analysis of the images shows that the regions of difference are distributed widely across each sphere.

In conclusion, with the camera and methods described in this paper, it is possible to mix moderately sp mixtures that will be within 2% of the targets. Half of the error is due to misregistration and compression and can be removed by blurring. The remaining 1.2% of error is plausibly explained by (i) inadequate correction for camera nonlinearities and (ii) subtle differences in the geometric reflectance of the different spheres.
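For two basis images with weights constrained to sum to one, the RMS-minimizing weights have a closed form: with w on the first image and 1 − w on the second, least squares gives w = Σ(D − L)(T − L) / Σ(D − L)². The sketch below assumes flattened gray-scale images; `best_two_basis_mixture` is a hypothetical helper, not the paper's code.

```python
def best_two_basis_mixture(D, L, T):
    """Weights <w, 1-w> on basis images D and L minimizing the RMS
    difference to target image T (images given as flat lists of floats)."""
    num = sum((d - l) * (t - l) for d, l, t in zip(D, L, T))
    den = sum((d - l) ** 2 for d, l in zip(D, L))
    w = num / den
    return w, 1.0 - w

# If the target is itself an exact p mixture, the weights are recovered:
D = [0.2, 0.3, 0.4, 0.5]
L = [0.7, 0.8, 0.9, 1.0]
T = [0.34 * d + 0.66 * l for d, l in zip(D, L)]
```

The unit-sum constraint reduces the two-weight fit to a one-dimensional problem, which is why no general least-squares solver is needed here.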

4. ALTERING DIFFERENT ASPECTS OF THE SCENE
A. Varying Illuminant Color
Figure 3 (top row) shows basis images of a scene photographed through different-colored filters. Identical images could have been obtained by filtering the illuminant; so, for simplicity, in the remainder of the paper the images will be treated as if they are unfiltered photographs of the scene lit by illuminants of differing color but identical geometry. The scenes, shown in the basis images of Fig. 3, can be regarded as particular elements of an affine space of possible scenes; i.e., they are specific elements of the space of scenes lit by different p mixtures of the three illuminants.15 Given that all the sources of scene illumination are pairwise incoherent, the ethereal image measured by the camera will be affine with respect to such p mixing of illuminants.16 To express this in symbols, let S_I denote the scene as lit by illuminant I, and let E denote the process that gives rise to the ethereal image; then E(S_{Σ wᵢIᵢ}) = Σᵢ wᵢ E(S_{Iᵢ}). If the production of basis images (which is linear with respect to the ethereal image, as already described) is denoted by B, it may be concluded that Σᵢ wᵢ B[E(S_{Iᵢ})] = B[E(S_{Σ wᵢIᵢ})]; i.e., a p mixture of basis images is equal to the (basis) image that would have been acquired had the scene been photographed under the corresponding p mixture of illuminants.

In contrast, not all sp mixtures correspond to possible scenes. That some sp mixtures correspond to possible scenes is trivial [consider that a has coordinates ⟨2, −1⟩ when (a + b)/2 and b are the basis]. That some sp mixtures do not correspond to scenes is shown by the fact that some sp mixtures have negative pixel values (which would have to be caused by an unphysical negative intensity in the corresponding ethereal image).

Each of the mixture images in the bottom row of Fig. 3 is displayed with a D function selected for it alone. Figure 3(a) shows the p mixture ⟨0.28, 0.25, 0.47⟩, displayed with the default D function that is suitable for all p mixtures. This mixture was selected to correspond to an achromatic illuminant. Figure 3(b) is the sp mixture ⟨−0.11, 0.59, 0.52⟩, which shows the scene as if it were illuminated by an orange light. The range of values of this image was [−0.02, 1]. As there are negative image values, it does not correspond to the scene lit by a possible illuminant. However, to the author, it looks no less realistic than the basis images. In contrast, the sp mixture ⟨0.33, −0.54, 1.21⟩ of Fig. 3(c), which had a range of [−0.15, 1.18], looks unrealistic (at least to the author when it is displayed on a monitor). The sense in which this is meant is that I cannot see it as a realistic image of the scene shown in the other images. This is not due to the pedestal used in the D function for this image—if the other images are displayed with the same D function, the pedestal does not prevent my seeing them as realistic. This issue is explored further, and more objectively, in Subsection 5.A.
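The affine dependence of the image on p mixtures of illuminants can be illustrated with a toy image-formation model: per pixel and per band, the image is the product of surface reflectance and illuminant power. The scene and illuminant values below are invented for the example; this is an idealized sketch, not the paper's imaging pipeline.

```python
def render(reflectance, illuminant):
    """Idealized image formation: per-pixel, per-band product of surface
    reflectance and illuminant power (no interreflection, no noise)."""
    return [[r * e for r, e in zip(pix, illuminant)] for pix in reflectance]

scene = [[0.9, 0.2, 0.1], [0.2, 0.8, 0.3]]   # two pixels, RGB reflectances
I1 = [1.0, 0.6, 0.3]                          # reddish illuminant
I2 = [0.3, 0.6, 1.0]                          # bluish illuminant
w = 0.4
mixed_illum = [w * a + (1 - w) * b for a, b in zip(I1, I2)]

# Image under the mixed illuminant ...
img_mix = render(scene, mixed_illum)
# ... equals the p mixture of the images under each illuminant:
mix_img = [[w * a + (1 - w) * b for a, b in zip(p1, p2)]
           for p1, p2 in zip(render(scene, I1), render(scene, I2))]
```

The equality is just distributivity, r(wI₁ + (1 − w)I₂) = w rI₁ + (1 − w) rI₂, which is the pixel-level content of the affine relation in the text.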


Fig. 3. Results of mixing images acquired through filters of different colors (equivalent to using differently colored illuminants). Top row: basis images. Bottom row: (a) is a p mixture ⟨0.28, 0.25, 0.47⟩; (b) and (c) are sp, ⟨−0.11, 0.59, 0.52⟩ and ⟨0.33, −0.54, 1.21⟩, respectively.

B. Varying an Object's Color
The color properties of an object with a homogeneous and isotropic surface are described by the object's spectral reflectance function (SRF), which gives the ratio of reflected to incident intensity at each visible wavelength. This subsection considers the mixing of images between whose acquisitions an object's spectral reflectance has changed. An important difference between altering illumination and altering reflectance is that, although the ethereal image is an affine function of the scene illumination, it is not necessarily an affine function of a scene object's reflectance.17 This failure to be affinely related occurs if a significant number of rays encounter the object twice before entering the camera.18–20 Such rays give rise to a quadratic term in the expression that relates the object's reflectance to the ethereal image. Rays that encounter the object three times before entering the camera give rise to a cubic term, and so on. So, if the color of a scene object is to be varied, the object should be convex and distant from objects of high albedo. If these conditions are met, then a linear dependence will hold between object reflectance and the ethereal image; and, by the same argument as in Subsection 4.A, p mixtures will correspond to scenes of an object with a possible SRF. In particular, a mixture ⟨w₁, …, wₙ⟩ of images of objects with SRFs Rᵢ will correspond to an object with SRF Σᵢ wᵢRᵢ. Again, as before, (i) an sp mixture will not correspond to a possible object if the mixture image has negative values, but (ii) if it is everywhere positive it may or may not. This is akin to cone response rates: (i) negative rates are impossible, but (ii) not all triples of positive response rates are possible, because of the overlap of cone spectral sensitivities.
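The way interreflection spoils linearity can be seen in a single-pixel toy model in which a two-bounce term contributes quadratically in the object's reflectance. The coefficients here are invented for illustration only.

```python
def image_value(R, direct=1.0, bounce=0.5):
    """Toy single-pixel model: the direct term is linear in the object's
    reflectance R, while a two-bounce interreflection term is quadratic."""
    return direct * R + bounce * R * R

R1, R2, w = 0.2, 0.8, 0.5

# Mixture of the two images versus image of the mixed reflectance:
mix_of_images = w * image_value(R1) + (1 - w) * image_value(R2)
image_of_mix = image_value(w * R1 + (1 - w) * R2)
gap = mix_of_images - image_of_mix  # nonzero whenever bounce != 0
```

With `bounce = 0` (the convex, isolated-object case the text recommends) the gap vanishes identically, which is the condition under which SRF mixing is valid.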
Figure 4 shows results of varying the color of an object in a scene that was constructed to minimize the number of rays that encounter the object twice before entering the camera. The top row shows basis images of black, cream-white, red, and yellow spheres; four were used because together they span the three-dimensional space of object color. The bottom row shows, on the left, an image of a blue sphere and, on the right, an sp mixture of the four basis images. The mixture, which has coordinates ⟨0.97, 0.65, 0.25, −0.87⟩, is the one that gives the closest approximation, in the least-squares sense, to the true picture of the blue sphere (on the left). The RMS difference (averaged across the ball and its immediate surround and across all three color channels) between the mixture and the original is 3.7% of the range of image values when it is evaluated for unblurred images (as shown in the figure). For a blur of σ = 2 the RMS difference drops to 2.6%. That these differences are greater than those found with the monochrome spheres is possibly due to the greater degree of superpartitiveness of the mixture.

Figure 5 shows the effect of arranging the scene so that many rays encounter the object multiple times. The four basis images (not shown) were of the same spheres as in Fig. 4. The image on the left is a true picture of a blue sphere, whereas the image on the right is a mixture (same weights as for Fig. 4). Note that there are visible differences not only on the left side of the sphere but also on the white object next to it. The RMS difference is 5.2% for unblurred images (compared with 3.7% above), dropping to 3.6% (compared with 2.6% above) for a blur of 2 pixels standard deviation.

C. Varying an Object's Geometric Reflectance Properties
Just as images of objects of differing spectral reflectance may be mixed, so may images of objects of differing geometric reflectance properties (matte, glossy, etc.). These properties are fully described by the bidirectional reflectance distribution function21,22 (BRDF), which specifies the ratio of reflected radiance to incident irradiance for each pair of directions. Assuming the same condition as before (paucity of rays encountering the object twice), a p mixture ⟨w₁, …, wₙ⟩ of objects with homogeneous BRDFs Bᵢ will correspond to an object with a homogeneous BRDF Σᵢ wᵢBᵢ. Again, an sp mixture will not correspond to a possible object if the mixture has any negative values, but if it is everywhere positive it may or may not.23

Figure 6 shows the results of varying the BRDF of a scene object (same precautions as for Fig. 4). The objects were three identically sized spheres coated with paints of similar diffusive albedo but differing finish. As in Section 3 and Fig. 2, the spheres were photographed in color and then converted to gray scale. The three basis images so produced are shown in the top row. The bottom row shows mixture images with individually determined D functions. Figure 6(a) is the p mixture ⟨0.52, 0.37, 0.11⟩, chosen to give the appearance of an object made of pewter. Figures 6(b) and 6(c) are sp mixtures. Figure 6(b) has coordinates ⟨0, −1, 2⟩; note how subtraction of the matte image has created highlights brighter than the white star reference. Figure 6(c) has coordinates ⟨−0.5, 1, 0.5⟩.
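That shading is linear in the BRDF, so that mixing rendered images matches rendering with the mixed BRDF, can be sketched with a toy diffuse-plus-gloss model. The shading function and all parameter values are invented for the example and are not taken from the paper.

```python
def shade(cos_i, cos_s, kd, ks, shininess=20):
    """Toy shading for one surface patch: diffuse term kd*cos_i plus a
    Phong-like gloss term ks*cos_s**shininess. Linear in (kd, ks)."""
    return kd * cos_i + ks * (cos_s ** shininess)

# Matte and glossy "basis" renderings of the same patch:
cos_i, cos_s, w = 0.7, 0.95, 0.6
matte = shade(cos_i, cos_s, kd=1.0, ks=0.0)
gloss = shade(cos_i, cos_s, kd=0.4, ks=0.8)

# p mixture of the rendered values ...
mix_images = w * matte + (1 - w) * gloss
# ... equals rendering with the p-mixed BRDF parameters:
mix_brdf = shade(cos_i, cos_s,
                 kd=w * 1.0 + (1 - w) * 0.4,
                 ks=w * 0.0 + (1 - w) * 0.8)
```

A weight of −1 on the matte component, as in Fig. 6(b), pushes the gloss coefficient above its basis values, which is why the subtracted mixture shows exaggerated highlights.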

Fig. 4. Results of mixing images of objects of different colors. Top row: basis images. Bottom row: left, a genuine image; right, the sp mixture ⟨0.97, 0.65, 0.25, −0.87⟩ that most closely approximates it.

Fig. 5. Left, a genuine image; right, an sp mixture with the same weights as in Fig. 4. Note the differences between the two images on the left side of the sphere and on the white cup next to it. These differences are due to the failure of the technique in scenes with significant interreflection involving the changing object.


Fig. 6. Results of mixing images of objects with different geometric reflectance properties. Top row: basis images. Bottom row: (a) is a p mixture ⟨0.52, 0.37, 0.11⟩; (b) and (c) are sp, ⟨0, −1, 2⟩ and ⟨−0.5, 1, 0.5⟩, respectively.

5. USE IN PSYCHOPHYSICAL EXPERIMENTS
In the experiments reported in Subsections 5.A and 5.B, no attempt was made to conceal from the subjects the fact that the stimuli were displayed images rather than real scenes. Hence the results are relevant to the perception of pictured scenes and objects, which, given the ubiquity of pictures in everyday life, is an important area of inquiry. The comparison of pictorial and ordinary perception is an active research area.24,25 Perceptual competences both impoverished25 and enhanced26 relative to ordinary perception have been shown. Agreement in the limiting case of images metameric with scenes has also been demonstrated.27

The phrase "pictorial perception" inherits the vagueness of "picture." To reduce this vagueness, details must be given of the relationship between scenes and the pictures of them. The purpose of stating this relationship is not so that results can be reinterpreted as being about ordinary perception but so that the type of picture is specified. That the type of picture is significant is shown by considering that photographic negatives, line drawings, etc., could all be called pictures. The scene–picture relationship for the stimuli used here is affine and increasing, with approximate colorimetric fidelity.

Mixing images can be used in (at least) two classes of psychophysical experiment. One class identifies categorical boundaries in the affine space of mixtures. For example, one could map the dissection of the space of surface colors into the categories of black, brown, blue, etc. A special case of a categorical boundary, which occurs for all types of scene variation, is the boundary between realistic and unrealistic images. A second class of experiment uses matches made by a subject to establish a mapping between two spaces of stimuli. Matching may be based on global indiscriminability or on equality of a stated perceptual dimension and may be made between spaces of equal or unequal dimension. In the balance of this section, examples of experiments of these two classes are presented. These pilot experiments are presented to demonstrate the methods used, not to support rigorous conclusions about their subject matter. Both experiments employ images of the colored spheres used in earlier sections. The Munsell coordinates of these spheres, as determined by the author's making color matches between pictured spheres and Munsell chips, were judged to be red (7.5R 4/11), green (5BG 4/4), blue (5PB 3/8), black (N 1), and cream white (7.5Y 8/6).

A. Mapping Categorical Boundaries
An experiment was conducted to explore the distinction between realistic and unrealistic sp mixtures noted in previous sections. The stimuli were mixtures of the black, cream-white, and red balls of Fig. 4. Three subjects (including the author) took part. In initial trials it was discovered that the task had to be specified carefully if the judgments of different subjects were to be in rough agreement. Originally, the judgment was phrased as follows: "Assuming the geometry of the scene and illuminants is as it appears in the basis images, does the image look to be a realistic or unrealistic image of a sphere of ordinary (i.e., nonluminous and nonfluorescent) material?" Some subjects found it difficult to make this judgment, and, moreover, they would frequently classify p mixtures, even pure unmixed basis images, as unrealistic.


After discussion with the subjects, the task was changed to ''Assuming [the same conditions as above], can you see the image as28 a realistic image of a sphere of ordinary material [same clarifications as above]?'' This task was reported to be easier and produced consistent results across individuals. The subjects were also instructed to try to disregard any obvious artifacts (for example, at highlights) when making their judgments. The judgments were made on a set of 441 mixture images. The range of pixel values in the set was [−0.58, 1.88], and all images were displayed with the same D function based on this range. As can be seen from Fig. 7, the D function significantly reduces the contrast of the images. Further experiments are needed to determine the effect of this contrast reduction on judgments of realism; however, the aim here is to introduce the method rather than to draw conclusions about visual perception. To assist the subjects in ignoring the luminance-pedestal component of the D function, the images were surrounded with a 1-cm achromatic border whose luminance was the D function of 0 (i.e., of black in the basis images). The nonluminous outer rim of the cathode-ray tube was shielded by a white-card surround.
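The D (display) function itself is defined earlier in the paper and is not reproduced in this excerpt. As a hedged sketch of the idea only: since the paper states that displayed luminance is affinely related to image coordinates, an affine map taking the quoted mixture-value range [−0.58, 1.88] onto a displayable range [0, 1] necessarily compresses contrast and raises the display value of 0, producing the luminance pedestal mentioned above. The factory name `make_display_map` and the exact affine form are assumptions.

```python
def make_display_map(lo, hi):
    """Affine map taking mixture pixel values in [lo, hi] onto [0, 1].

    Values below 0 receive a positive display value (a luminance
    "pedestal"), and contrast is reduced whenever [lo, hi] is wider
    than [0, 1].
    """
    span = hi - lo

    def D(v):
        return (v - lo) / span

    return D

# Range quoted for the realism experiment: pixel values in [-0.58, 1.88].
D = make_display_map(-0.58, 1.88)
pedestal = D(0.0)  # display value of black in the basis images (~0.236)
```

This is the value matched by the achromatic border in the experiment, which is why it helps subjects discount the pedestal.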


The images were presented in random order, with each subject viewing the entire set three times on separate days. Viewing took place in a dim room, with the subjects 60 cm from a 25-cm-wide image on the monitor. The subjects made their judgments by pressing buttons, without time limit or feedback. After each judgment, a randomly selected basis image was displayed for 5 s before the next mixture image; this was done to assist the subjects in maintaining a fixed idea of the illumination and geometry of the displayed scene.

Figure 7 shows the results of the experiment. The three small density plots show how often each subject was able to see the mixture as realistic, from zero of three (black) to three of three (white). The larger density plot is the average across all subjects. The red contour is the 50% threshold. The letters show the locations of selected mixtures shown in the other panels. As can be seen from these selected mixtures, some nonrealistic spheres were too light (e.g., l), and others were too dark (e.g., j).

The demonstration of this subsection shows that image mixing is a suitable tool for investigating whether perception of unnatural colors agrees with the physics-based theory of the body color solid30 or whether the visual system uses an approximation. Previous methods of investigating this question,31,32 although they had the advantage of using real (though not particularly natural) rather than pictured objects, were restricted in that only the boundary between ordinary and luminous appearance could be investigated.

Fig. 7. Results of an experiment mapping the limit of realism that occurs in the space of mixtures of a black, a cream-white, and a red sphere. The small density maps (top left) show, for three subjects, the proportion of trials in which different mixtures were judged possible to see as realistic, from white (100%) to black (0%). The larger map is the average across subjects. The red contour shows the 50% threshold. Individual mixtures, displayed with the D function used in the experiment, correspond to the locations marked a–l.

Fig. 8. Results of an experiment on lightness perception. Subjects judged whether a mixture ⟨u1, u2⟩ of a cream-white and a black sphere was lighter or darker than a mixture ⟨w1, w2, w3⟩ of a red, a green, and a blue sphere. The stimuli ⟨u1, u2 | w1, w2, w3⟩ on which the judgments were based were created from the four basis images on the top row by forming the mixture ⟨1 − u1 − w2, u1 − w3, w2, w3⟩. Lightness matches were determined by fitting psychometric functions to the judgment data. These matches are shown (bottom left) as contours of equal cream-white sphere content. An empirically determined lightness match ⟨0.2, 0.8 | 1/3, 1/3, 1/3⟩ is shown at the bottom right.

B. Establishing Maps Between Spaces
Figure 8 shows the results of an experiment to establish, by matching, a mapping from a high- to a low-dimensional space. The experiment again used colored spheres. The elements of the high-dimensional space were p mixtures of a red, a green, and a blue sphere; those of the low-dimensional space were p mixtures of a black and a cream-white sphere. The mapping was based on matching lightness. These matches were found by fitting psychometric functions (cumulative Gaussians) to 20 darker/lighter judgments for each of 15 red/green/blue p mixtures compared with adaptively chosen black/cream-white p mixtures. Trials involving the different RGB p mixtures were randomly interleaved. The D function for p mixtures was used, but in other respects display and viewing were as in the previous experiment.

The results of the experiment are shown as contours of equal cream-white sphere content. Visually, the contour plot suggests that the lightness matches found were very close to being affine (i.e., the contours are straight and evenly spaced). This was confirmed by least-squares fitting a plane to the data; the RMS error of the fit was 1.2%. The best fit for ⟨u1, u2⟩ matching ⟨w1, w2, w3⟩ in lightness was u1 = 0.37w1 + 0.15w2 + 0.08w3. An example of this fit is shown in the large photograph of Fig. 8, which shows the best-fit lightness match between ⟨1/3, 1/3, 1/3⟩ (sphere on right) and ⟨0.2, 0.8⟩ (sphere on left). The luminance of displayed images is affinely related to the images' coordinates. Thus, since a plane was successfully fitted to the lightness matches, the experiment fails to demonstrate the lightness–luminance discrepancy previously reported,33,34 though this may be due to the relatively small size of the gamut used.

The experiment of Fig. 8 could have been performed with the subject viewing two images simultaneously, one a p mixture of the RGB spheres and the other a p mixture of the black and cream-white spheres. However, it is possible, as here, for the two stimuli to exist within the same pictorial space. The basis images used to achieve this are shown along the top of Fig. 8. To produce the mixture ⟨u1, u2 | w1, w2, w3⟩, the mixture ⟨1 − u1 − w2, u1 − w3, w2, w3⟩ was formed. For example, the mixture in the large photograph, ⟨0.2, 0.8 | 1/3, 1/3, 1/3⟩, was formed as ⟨0.46, −0.13, 0.33, 0.33⟩. Although sp mixtures were used as stimuli, all had pixel values in the range [0, 1]; hence it was possible to use the p-mixture D function for display. This method of having multiple distinct mixtures in an image generalizes: if the image is to have mixtures of dimensionality d1 ,..., dn, then 1 + Σ di basis images are needed. The trick, however, has limitations, since producing an image whose mixtures are all individually partitive often requires sp mixing, with the attendant problems of noise and misregistration.
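The bookkeeping behind this trick can be checked directly. A minimal sketch (the function name is an assumption) forms the four weights ⟨1 − u1 − w2, u1 − w3, w2, w3⟩ and reproduces the worked example above; note that the weights always sum to 1, so the combined mixture remains (signed) partitive.

```python
def combined_weights(u, w):
    """Weights over the four basis images that realize the combined
    stimulus <u1, u2 | w1, w2, w3> as the single mixture
    <1 - u1 - w2, u1 - w3, w2, w3>.

    The four weights sum to (1 - w2 - w3) + w2 + w3 = 1, so the result
    is a valid sp mixture regardless of u and w.
    """
    u1, u2 = u
    w1, w2, w3 = w
    return (1 - u1 - w2, u1 - w3, w2, w3)

# The example from the text: <0.2, 0.8 | 1/3, 1/3, 1/3>.
c = combined_weights((0.2, 0.8), (1 / 3, 1 / 3, 1 / 3))
# Agrees with the quoted <0.46, -0.13, 0.33, 0.33> to the two-figure
# precision used in the paper.
```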

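The cumulative-Gaussian fitting used for the lightness matches of Subsection 5.B can be sketched as follows. The paper does not give its fitting procedure, so this illustration uses a simple grid-search maximum-likelihood fit on synthetic data; all names and the data-generation details are assumptions. The fitted mean mu is the 50% point of the psychometric function, i.e., the match.

```python
import math
import random

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def fit_psychometric(levels, responses):
    """Grid-search maximum-likelihood fit of P("lighter") =
    Phi((x - mu) / sigma) to binary judgments; returns (mu, sigma)."""
    lo, hi = min(levels), max(levels)
    best = None
    for i in range(101):
        mu = lo + (hi - lo) * i / 100
        for j in range(1, 51):
            sigma = (hi - lo) * j / 50
            ll = 0.0
            for x, r in zip(levels, responses):
                p = min(max(norm_cdf((x - mu) / sigma), 1e-9), 1 - 1e-9)
                ll += math.log(p) if r else math.log(1 - p)
            if best is None or ll > best[0]:
                best = (ll, mu, sigma)
    return best[1], best[2]

# Synthetic darker/lighter judgments around a true match point of 0.4.
random.seed(1)
levels, responses = [], []
for x in [0.1 + 0.1 * k for k in range(7)]:  # 7 comparison levels
    for _ in range(40):                       # 40 judgments per level
        levels.append(x)
        responses.append(random.random() < norm_cdf((x - 0.4) / 0.08))
mu, sigma = fit_psychometric(levels, responses)
```

In the experiment proper, the comparison levels were chosen adaptively rather than on a fixed grid, but the fitting principle is the same.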

6. CONCLUDING REMARKS

A method for generating synthetic images by digitally mixing a set of basis images has been presented. Sufficient conditions for mixed images to correspond to images of an actual scene have been given. An empirical evaluation confirmed (to a degree limited by the camera used) this hypothesized correspondence; more-rigorous testing would be possible with a superior camera. Examples showed how the method can be used to generate images of scenes with (i) varying illuminant color, (ii) varying object color, and (iii) varying object geometric reflectance. The use of the method to generate stimuli for psychophysical experiments was demonstrated. Although the stimuli so generated may be more natural than those used in many experiments, they are pictorial. Further research is needed to quantify the effect of varying how the images are displayed and to determine how well the results generalize to ordinary (nonpictorial) perception.

The author can be reached at the address on the title page or by phone, 44-121-359-3611; fax, 44-121-333-4220; or e-mail, [email protected].

REFERENCES AND NOTES
1. N. J. Wade, A Natural History of Vision (MIT Press, Cambridge, Mass., 1998).
2. J. D. Forbes, ''Hints towards a classification of colours,'' Philos. Mag. 34 (1849).
3. J. C. Maxwell, ''Theory of the perception of colours,'' Trans. R. Scott. Soc. Arts 4, 394–400 (1856).
4. J. C. Maxwell, ''The diagram of colours,'' Trans. R. Soc. Edinburgh 21, 275–298 (1857).
5. L. D. Griffin, ''Production of psychophysical stimuli by partitive mixing of images,'' Perception 27 (suppl.), 171–172 (1998).
6. J. C. Maxwell, ''Theory of compound colours, and the relations of the colours on the spectrum,'' Proc. R. Soc. London 10, 404–409 (1860).
7. J. J. Koenderink, ''Color atlas theory,'' J. Opt. Soc. Am. A 4, 1314–1321 (1987).
8. P. Turner, ''Building a colour image mixing system,'' B. Optom. dissertation (Aston University, Birmingham, England, 1998).
9. S. Lee, G. Wolberg, and S. Y. Shin, ''Polymorph: morphing among multiple images,'' IEEE Comput. Graph. Appl. 18, 58–71 (1998).
10. Informally, an affine space is a vector space lacking a point singled out as the origin; a familiar example is the space of equally luminous colors depicted in the CIE diagram. Formally, it is a space of points with an associated vector space and two permissible operations: (i) one may form the difference of two points, which is a vector in the associated vector space, and (ii) one may add any vector to any point to form a new point. However, unlike in a vector space, one cannot multiply a point by a scalar nor add together two points.
11. P. E. Debevec and J. Malik, ''Recovering high dynamic range radiance maps from photographs,'' in Proceedings of SIGGRAPH '97 (Addison-Wesley, Reading, Mass., 1997), pp. 369–378.
12. V. Ronchi, ''Resolving power of calculated and reflected images,'' J. Opt. Soc. Am. 51, 458–460 (1961).
13. J. J. Gordon and R. A. Holub, ''On the use of linear transformations for scanner calibration,'' Color Res. Appl. 18, 218–219 (1993).
14. B. A. Wandell, ''Color rendering of color camera data,'' Color Res. Appl. 11, S30–S33 (1986).
15. S. A. Shafer, ''Describing light mixtures through linear algebra,'' J. Opt. Soc. Am. 72, 299–300 (1982).
16. A. J. den Dekker and A. van den Bos, ''Resolution: a survey,'' J. Opt. Soc. Am. A 14, 547–557 (1997).
17. Z. Yamauti, ''The light flux distribution of a system of inter-reflecting surfaces,'' J. Opt. Soc. Am. 13, 561–571 (1926).
18. J. J. Koenderink and A. J. van Doorn, ''Geometrical modes as a method to treat diffuse inter-reflections in radiometry,'' J. Opt. Soc. Am. 73, 843–850 (1983).
19. C. M. Goral, K. E. Torrance, D. P. Greenberg, and B. Battaile, ''Modelling the interaction of light between diffuse surfaces,'' Comput. Graph. 18, 212–222 (1984).
20. B. V. Funt, M. S. Drew, and J. Ho, ''Color constancy from mutual reflection,'' Int. J. Comput. Vision 6, 5–24 (1991).
21. F. E. Nicodemus, J. C. Richmond, J. J. Hsia, I. W. Ginsberg, and T. Limperis, ''Geometrical considerations and nomenclature for reflectance,'' NBS Monograph 160 (National Bureau of Standards, Washington, D.C., 1977).
22. J. J. Koenderink and A. J. van Doorn, ''Phenomenological description of bidirectional surface reflection,'' J. Opt. Soc. Am. A 15, 2903–2912 (1998).
23. If the basis images were out of focus (and so blurred), for example, then an sp mixture that had zero-valued pixels could not correspond to any object.
24. A. J. van Doorn, ''Effects of changing context on shape perception,'' Perception 27, S117 (1998).
25. D. H. Brainard, M. D. Rutherford, and J. M. Kraft, ''Colour constancy compared: experiments with real images and color monitors,'' Invest. Ophthalmol. Visual Sci. 38, S2206 (1997).
26. A. Hurlbert, ''Illusions and reality checking on the small screen,'' Perception 27, 633–636 (1998).
27. R. L. Savoy, ''Colour constancy with reflected and emitted light,'' Perception 22 (suppl.), 61 (1993).
28. The psychological distinctness of ''it looks X'' and ''it can be seen as X'' was noted by Wittgenstein (Ref. 29). An example of the difference concerns the three-dimensionality of line drawings. Suppose that one is examining a straightforward sketch of a cube (which one recognizes as such) and is asked ''Does it look three-dimensional?'' One could reasonably answer either yes or no. In contrast, ''Can you see it as three-dimensional?'' could reasonably only be answered yes. ''Reasonably'' here should be understood as ''speaking the same language.''
29. L. Wittgenstein, Philosophical Investigations (Blackwell, Oxford, UK, 1953).
30. E. Schrödinger, ''Theorie der Pigmente von größter Leuchtkraft,'' Ann. Phys. 62, 603–622 (1920).
31. R. M. Evans, ''Fluorescence and gray content of surface colors,'' J. Opt. Soc. Am. 49, 1049–1059 (1959).
32. J. M. Spiegle and D. H. Brainard, ''Luminosity thresholds: effect of test chromaticity and ambient illumination,'' J. Opt. Soc. Am. A 13, 436–451 (1996).
33. G. Wyszecki, ''Correlate for lightness in terms of CIE chromaticity co-ordinates and luminous reflectance,'' J. Opt. Soc. Am. 57, 254–257 (1967).
34. M. D. Fairchild and E. Pirrotta, ''Predicting the lightness of chromatic object colours using CIELAB,'' Color Res. Appl. 16, 385–393 (1991).
