Natural image profiles are most likely to be step edges

Lewis D Griffin, King's College, London, UK, [email protected]

M Lillholm & M Nielsen, IT University, Copenhagen, Denmark, {grumse,malte}@it-c.dk

Abstract

We introduce Geometric Texton Theory (GTT), a theory of categorical visual feature classification that arises through consideration of the metamerism that affects families of co-localised linear receptive-field operators. A refinement of GTT that uses maximum likelihood (ML) to resolve this metamerism is presented. We describe a method for discovering the ML element of a metamery class by analysing a database of natural images. We apply the method to the simplest case – the ML element of a canonical metamery class defined by co-registering the location and orientation of profiles from images, and affinely scaling their intensities so that they have identical responses to 1-D, zeroth- and first-order, derivative-of-Gaussian operators. We find that a step edge is the ML profile. This result is consistent with our proposed theory of feature classification.

Keywords: Natural Image Statistics; Feature Analysis; Textons; Receptive Fields.

1. Introduction

In this paper, through computations using a database of natural images, we show that the most likely local form of a 1-D profile from a natural image is a step edge rather than, for example, a uniform slope. This finding is a contribution to the growing body of results that characterize the statistical regularities of natural images (Bell & Sejnowski, 1997, Field, 1999, Hyvarinen & Hoyer, 2001, Lee, Mumford & Huang, 2001, Olshausen & Field, 1996, Pedersen, 2003, Ruderman, 1997, van Hateren & van der Schaaf, 1998, Zetzsche & Rohrbein, 2001), but our particular motivation for this research is its relevance to Geometric Texton Theory (GTT), an approach to feature analysis pioneered by Koenderink (Koenderink, 1993, Koenderink & van Doorn, 1992b, 1996). We review GTT and a refinement of it in the remainder of the introduction.

1.1. Feature Analysis

Reviewing 'feature analysis' in vision science (Marr, 1982), David Marr identified its birth as Barlow's 1950s study of the frog retina (Barlow, 1953). Barlow argued that assemblies of visual neurons with overlapping receptive fields (RFs) could detect spatially localized features of the retinal image. The promise of this approach, Marr wrote, was a simple scheme where "you looked at the image, detected features on it, and used the features to classify and hence recognize what you were looking at." What this feature analysis idea amounts to is that at an early stage of visual processing there is a transition from quantitative measures of image structure (e.g. filter response magnitudes) to qualitative categorical descriptors (e.g. 'edge') (Griffin, 1995). Although the evidence is sketchy that this is so in biological visual systems (Hikosaka, 1997, Kobatake & Tanaka, 1994, Logothetis, Pauls & Poggio, 1995, Nakamura, Matsumoto, Mikami & Kubota, 1994, Sillito, Grieve, Jones, Cudeiro & Davis, 1995, Zhou, Friedman & von der Heydt, 2000), such a processing step is common in engineered vision systems (e.g. edge (Canny, 1986, Griffin, Colchester & Robinson, 1992, Marr, 1982) and corner detection (Lindeberg, 1998), surface classification (Koenderink & van Doorn, 1993), etc.).

In this paper we continue an approach pioneered by Koenderink (Koenderink, 1993, Koenderink & van Doorn, 1992b, 1996) that, through consideration of metamerism in low level spatial vision, has the potential to forge a principled quantitative-to-qualitative link; we call this approach Geometric Texton Theory (GTT). Metamerism is the phenomenon that occurs when sensory measurements fail to completely determine the stimulus (Richards, 1979). GTT starts with the observation that early spatial vision is affected by metamerism just as are other perceptual competences such as colour vision (Wyszecki, 1982), motion perception (Horn & Schunk, 1981), shape-from-shading (Belhumeur, Kriegman & Yuille, 1999) and structure-from-motion (van Veen & Werkhoven, 1996). The central hypothesis of GTT is that feature categories arise as a consequence of the manner in which the visual system copes with metamerism in spatial vision.

In the remainder of this introduction we will further describe GTT and our proposed refinement of it: that feature categories arise from a Maximum Likelihood (ML) approach to coping with low-level visual metamerism. To evaluate our refinement we need to calculate ML explanations and see whether they do give rise to effective feature categories. In the body of the paper we present a computational method by which ML explanations (given visual measurements) can be discovered by processing databases of natural images. We apply our method to the simplest possible case of metamerism – 1st order measurements of 1-D profiles – and show that after scaling so that profiles agree in these measurements, a step edge is the most likely form. This finding of a simple form for the ML explanation of a case of metamerism is consistent with our extended version of GTT.

1.2. V1 Measurements

The response of a V1 simple cell is generally modelled as the taking of an inner product between the retinal irradiance and a receptive field (RF) weighting function (Hubel & Wiesel, 1968). Two models of RF weighting functions are in widespread use: Gabor Functions (Daugman, 1985, Jones & Palmer, 1987) and Derivatives of Gaussians (DtGs) (Koenderink & van Doorn, 1990, 1992a, Romeny, Florack, Koenderink & Viergever, 1991, R.A. Young, 1987, R. A. Young, Lesperance & Meyer, 2001). Numerically the two families of functions are very similar, but DtGs have two advantages. First, an argument has been given that derives DtGs as the unique solution to constraints that should be satisfied by an idealized uncommitted visual system (Koenderink & van Doorn, 1992a, Weickert, Ishikawa & Imiya, 1997). Second, the properties and inter-relationships of a family of DtGs are simple and appealing to the theorist (Debnath, 1964, Koenderink & van Doorn, 1987, 1997, Martens, 1997). The nice formal properties of DtGs are unsurprising once it is appreciated that this approach is a generalisation of classical differential geometrical analysis to a system with operators of larger than infinitesimal size (Griffin, 1997b). A 1-D Gaussian kernel can be written as $G_\sigma(x) = (2\pi\sigma^2)^{-\frac{1}{2}}\, e^{-\frac{x^2}{2\sigma^2}}$; and, because of the separability of the gaussian, a 2-D kernel can be written as $G_\sigma(x, y) = G_\sigma(x)\, G_\sigma(y)$. In both cases the parameter σ specifies the width or scale of the kernel. DtGs are spatial derivatives of the Gaussian kernel. Figure 1 (right) shows a family of DtGs up to 4th order. Taking the term from differential geometry (Majthay, 1985), the space of possible responses of a family of DtGs up to some order is known as the jet space.
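As an illustration of these measurements (our own sketch, not the authors' code), the snippet below samples 0th- and 1st-order 1-D DtG kernels in a 64-sample window, taking the scale σ = √48 ≈ 7 quoted in the figure captions, and forms the inner-product 'jet' measurements of a profile. All function names are ours.

```python
import numpy as np

def dtg_filters(sigma=np.sqrt(48.0), length=64):
    """Sample 0th- and 1st-order 1-D derivative-of-Gaussian kernels of scale
    sigma on a window of `length` samples centred on zero."""
    x = np.arange(length) - (length - 1) / 2.0
    g0 = np.exp(-x**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)
    g1 = -x / sigma**2 * g0          # first spatial derivative of the Gaussian
    return g0, g1

def measure(profile, filters):
    """1st-order 'jet' of a 1-D profile: inner products with each DtG filter."""
    return np.array([np.dot(f, profile) for f in filters])

# Example: measure a synthetic (increasing) step-edge profile.
g0, g1 = dtg_filters()
step = np.where(np.arange(64) < 32, 0.0, 1.0)
print(measure(step, (g0, g1)))   # 0th-order ~ 0.5, 1st-order negative
```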

[Figure 1 graphic: a 4th-order family of DtG operators, their common inner-product responses, and nine patches drawn from the resulting Proximal Metamery Class.]

Figure 1 – Shows patches that are metameric with respect to the 4th order family of DtG operators shown, i.e. they all produce the responses shown. The cross symbol denotes the inner product operation. The operators are of scale σ = √48 ≈ 7 and are shown in 64×64 windows, which is the same size as the patches. Each number on the lower right results from applying an operator to a patch by forming an inner product, thus the measurements relate primarily to the central region of the patches. The elements of the metamery class are patches of the retinal image, thus we identify the class as proximal. Because the retinal image is frequency band-limited by the ocular optics, the dimensionality of the metamery class is large ( 64² − 15 = 4081 ) but finite. The majority of its members have high frequency structure and extreme grey values; nine patches with relatively narrow ranges of values are shown on top. (a) is an actual patch from a natural image, (b) is the metamery class element with the smallest variance, (c) the smallest range of values, (d) the smallest integral of the gradient squared and (e) the smallest maximum gradient. The other four foreground patches have power spectra typical of natural images.

In this paper we will consider 1-D versions of the DtGs applied to image profiles rather than patches. This is physiologically unrealistic but computationally convenient. The operators that we will employ are shown in figure 2 (left).

[Figure 2 graphic: left, the 0th- and 1st-order DtG filters together with the apodization function; right, norm-minimizing elements of the canonical metamery class labelled L2, L∞, L1-grad, L2-grad and L∞-grad.]

Figure 2 – On the left are shown the 0th and 1st order DtG filters that we are particularly concerned with in this paper. They are of scale σ ≈ 7 shown in a window 64 units long. On the right are shown some norm-minimizing elements within the canonical metamery class defined by the two filters. The metamery class consists of those 1-D functions that measure 0 and -1 respectively with the 0th and 1st order filters. In both panels, the apodization function (aperture) of scale σ√2 ≈ 10 that relates to the DtGs (see section 2.2) is also shown. Vertical lines that delimit the bulk of the aperture are drawn. These are used in subsequent figures to indicate the extent of the aperture without cluttering figures by redrawing it.

1.3. V1 Metamerism – Local, Global, Proximal and Distal

The concept of metamerism can be refined in a couple of ways. Firstly, since metamerism is relative to a set of measurement operators, we can distinguish between local metamerism that relates to a single family of co-localised operators, and global metamerism that relates to a totality of such operators (Lillholm, Nielsen & Griffin, 2003). Secondly, metamerism can be graded as occurring on a continuum from the proximal to the more distal (Fechner, 1860/1966). For example, in colour vision metamerism of the spectral composition of the light falling on a retinal location is relatively proximal, whereas metamerism of the spectral reflectance function of a surface is relatively distal.

We note that blurring of the retinal image by the ocular optics is relevant to metamerism in spatial vision. We identify metamerism of the blurred retinal image as proximal and distinguish it from a distal metamerism that is concerned with an idealized visual image unblurred by ocular optics. We observe that since blurring makes the retinal image approximately band-limited (and thus determinable from a finite density of samples), global proximal metamerism in spatial vision effectively does not occur. However, in this paper we are concerned with a metamerism that is more distal and more local and so definitely does occur. The metamerism is more local because we consider the responses only of a single family of co-localised operators, and more distal because we consider an idealized visual image, perfectly in focus and unblurred by the optics of the eye.

Local proximal metamerism is illustrated in figure 1, where we show a selection of retinal image patches that all produce the same responses to a 4th order family of DtG operators. Note that these are only a few examples from the very large metamery class of patches that could account for the operator responses shown. Figure 3 shows the more complicated arrangement underlying local distal metamerism where the point-spread function (PSF) of the eye (or camera in the experiments in this paper) must be considered.

[Figure 3 graphic: image formation (convolution with the point-spread function) followed by measurement by a co-localised family of V1 neurons; all elements of the Distal Metamery Class give the same neuronal responses.]

Figure 3 – Illustrates the type of distal local metamery class that is investigated in this paper. As in figure 1, the cross denotes the inner product operation, while the cross-within-a-circle denotes convolution. The elements of the metamery class are patches of the idealised perfectly-focussed and unblurred visual image. Thus the class is of infinite dimensionality. Even though class members can have energy at arbitrarily high frequencies, the class still has many elements similar to those that can be found in the proximal metamery class of figure 1.

1.4 Canonical vs. Particular Metamery Classes

The representation of the world computed by the visual system seems often to be factored into causally independent components (for example in colour vision there is the separation of perceived object and illuminant colours). In spatial vision, the raw outputs of V1 neurons completely fail to make any such factorisation. The most obvious example is the entanglement of what is seen and where it is seen – translation of the retinal image by a small eye movement profoundly changes neuronal outputs and yet only 'where' has changed, with little or no change in 'what'. A second example is rotation of the retinal image, which again leads to changes in neuronal responses even at the centre of rotation (since individual V1 receptive fields are anisotropic). Other components that human vision may try to disentangle and factor out are less clear-cut, but may include viewing distance, surface slant (Griffin, 1997a), illuminant level and veiling haze.

We can extend the notion of metamerism by taking into account such aspects of the world that should be factored out. Take translation as a first example. Rather than considering the retinal images that are locally metameric to a particular family of co-localized operators, we can consider an idealized canonical set of operators not located at any particular retinal location but rather inhabiting their own canonical visual space. Image patches are locally metameric with respect to this canonical family if, once brought into the visual space of the operators, they measure the same. Similarly, we can factor out rotation by bringing image patches into the space of the measurement operators in such a way that (for example) their y-derivatives vanish. And so on for other aspects of the world that we may want to factor out.

Metamery classes so defined we call canonical, to distinguish them from particular metamery classes that are specified by a particular set of operators, at a particular location, measuring particular values. Being canonical is a question of degree depending on what aspects are factored out. In this paper we consider a metamery class that is canonical in that we factor out position, orientation, illuminant level and veiling haze. Since we define the metamery class by 0th and 1st order measurements, our canonical transformation is powerful enough to bring all stimuli into the metamery class. Figure 2 (right) shows some key elements in this class.

1.5 Geometric Texton Theory

It makes sense to ask, at Marr's level of computational theory, what strategy the visual system uses to cope with local spatial metamerism. Possible answers include:

(i) by ignoring it, i.e. by using the numbers that define a metamery class as a symbol standing for the class but never representing or reasoning about individual class members (cf. colour vision);

(ii) by using the class definition as a code that defines the class and allows generation of class members and testing of membership, but remaining uncommitted as to which class element is the true stimulus (cf. 'multiple visual worlds' (Koenderink, 2001));

(iii) by 'sticking its neck out' (Koenderink, 2001) and selecting a particular representative (icon) of a metamery class and attaching the icon's qualities to the full metamery class.

We hypothesize that in low-level spatial vision at some point the visual system will have to use (iii), and moreover that it uses a strategy where the icon selected from a class is simple and representative, so that the qualities that thus accrue to the metamery class are simple and relatively uncommitted. We further hypothesize that a partitioning of the jet space into equivalence classes within which the icons are qualitatively identical will produce effective feature classes. We call this approach Geometric Texton Theory (GTT).

No-one has yet identified a strategy that successfully selects simple, representative icons. One approach that, following Koenderink, we have previously considered (Tagliati & Griffin, 2001) is to choose the patch in a metamery class with the smallest range of intensities (i.e. the L∞ norm minimizer (Kreyszig, 1989)). The performance of this approach is promising when applied to metamery classes defined by families of filters up to 2nd order, for it can be shown (Tagliati & Griffin, 2001) that icons so selected are always binary-valued patches with a transition locus between the two image values which is a conic curve. This induces the appealing decomposition of possible 2nd order structure into the feature categories: uniform, edge (line), corner (parabola), neck (hyperbola) and blob (ellipse). However, when we apply the L∞ approach to orders higher than 2 we obtain icons that fail to be representative (see figure 1c) of structure in natural images.

1.6 Selecting Icons

We have evaluated (Griffin, 2002) several candidate rules for picking good icons from metamery classes. All of them have been based on minimizing some measure of structural complexity, phrased as a norm (L2, L1 or L∞) of either the intensity or the intensity gradient magnitude¹. The norm-minimizers of the 1-D 1st order canonical metamery class that we are concerned with here are shown in figure 2 (right). In figure 4 we show examples of how these forms can occur in natural images. In appendix 6.1 we prove that: the L2 minimizer, that is to say the element of the metamery class with the smallest intensity variance, is a 1st DtG of the same scale as the defining filters; the L∞ minimizer, which is the element with the smallest intensity range, and the L1-grad minimizer, which has the smallest total variation, are identical – a step edge; and the L2-grad minimizer, which is the element with the smallest RMS gradient magnitude, is a cumulative Gaussian of the same scale as the defining filters.
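The closed forms just listed (proved in appendix 6.1) can be checked numerically. The sketch below is our own illustration, not the authors' code: it samples the L2 minimizer (−4σ³√π G′σ), the step edge that minimizes the L∞ and L1-grad norms, and the L2-grad-minimizing cumulative Gaussian, and verifies that each measures approximately 0 and −1 with the 0th- and 1st-order filters. The step height σ√(π/2) is our own working, obtained directly from the two constraints.

```python
import numpy as np
from scipy.stats import norm   # used only for the Gaussian pdf/cdf

sigma, length = np.sqrt(48.0), 64
x = np.arange(length) - (length - 1) / 2.0
g0 = norm.pdf(x, scale=sigma)                  # 0th-order DtG
g1 = -x / sigma**2 * g0                        # 1st-order DtG

def jet(p):
    """0th- and 1st-order measurements (discrete inner products)."""
    return np.dot(g0, p), np.dot(g1, p)

# Norm-minimizers of the canonical class (appendix 6.1):
l2_min     = -4 * sigma**3 * np.sqrt(np.pi) * g1                   # scaled 1st DtG
step_edge  = np.sqrt(np.pi / 2) * sigma * np.sign(x)               # L-infinity / L1-grad
l2grad_min = np.sqrt(np.pi) * sigma * (2 * norm.cdf(x, scale=sigma) - 1)  # cumulative Gaussian

for name, p in [("L2", l2_min), ("step", step_edge), ("L2-grad", l2grad_min)]:
    # each pair should be close to (0, -1), up to window truncation
    print(name, np.round(jet(p), 3))
```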

[Figure 4 graphic: natural-image profiles exemplifying the L∞/L1-grad, L2, L2-grad and L∞-grad norm-minimizing forms.]

Figure 4 – An enormous variety of profiles can be found in natural images, including (as shown here) the norm-minimizing forms shown in figure 2.

¹ The Lr norm of an image $I$ is defined to be $\left( \int_{\vec{x} \in \mathbb{R}^2} \left| I(\vec{x}) - \mu_r \right|^r \right)^{1/r}$, where $\mu_r$ is the value which results in the lowest value of the norm; this term is not present in the normal mathematical definition of these norms. The Lr-grad norms are defined to be $\left( \int_{\vec{x} \in \mathbb{R}^2} \left\| \nabla I(\vec{x}) \right\|^r \right)^{1/r}$. Infinity norms are defined by taking the limit $r \to \infty$.

As with many arguments that use measures of simplicity it is difficult convincingly to justify the use of one measure rather than another. As an alternative, we have proposed (Griffin, 2002):

• to select, as icon, the Maximum Likelihood (ML) member of a metamery class.

ML has been suggested as a unifying principle for many operations of the visual system and has been found on occasions to account for psychophysical data better than any other model (Ernst & Banks, 2002, Mansfield & Legge, 1996) – see also (Pizlo, 2001) for an extensive review of related approaches. It is also a widely used approach in machine vision (Kanatani, 1998, Mardia, Goodall & Walder, 1996, Sebe & Lew, 2000, Tsai, Zhang & Willsky, 2001, Zhang & Tomasi, 2001), where it is often linked to minimum description length principles (Rissanen, 1978). We also note a recent (Freeman, Jones & Pasztor, 2002) machine vision super-resolution method that has points in common with our ML approach to selecting icons from metamery classes. To test GTT and our ML refinement of it, we will in the following compute and analyse the ML icon of the canonical metamery class described in 1.4. We work with this canonical metamery class because it is computationally the easiest to deal with, not because it is a short cut to results about more particular metamery classes. In fact in sections 2.3 and 3.2 we will explicitly show that it is not. We will conclude the paper by discussing how the ML refinement of GTT stands in light of our empirical findings.

2. Method

In this section we describe and apply a method that we have developed to calculate the ML icon of a metamery class. We will consider metamery classes specified by responses of the 0th and 1st order 1-D DtG operators shown in figure 2 when applied to 1-D profiles from natural images. In particular we shall consider the canonical 1st order metamery class, which we construct by affinely scaling the luminance of each profile so that it measures 0 and -1 when measured with the 0th and 1st order operators respectively. This allows us to bring all profiles into the metamery class, which is canonical in the sense that the following are factored out: position, orientation, luminance level and veiling glare.

2.1. Preparing Profiles

We use a database of 1220 images from the van Hateren natural image database (http://hlab.phys.rug.nl/archive.html) of linear, 1536×768, images of woods, open landscapes and urban areas (van Hateren & Ruderman, 1998). We use a subset of the 4000 images in the .iml series. To form our subset we eliminated images that (i) have areas of saturation at the high end of the dynamic range, (ii) have areas of zero-valued pixels, (iii) are motion blurred, and/or (iv) have more than 25% of the image out of focus. We estimated the width of the PSF of the images by fitting 2-D error functions to high contrast straight edges such as those caused by objects silhouetted against the sky. This gave an estimate of σ_psf = 0.8 pixel units, though this is certainly a lower bound as it was taken from the sharpest edges that we could find.

The van Hateren images are encoded as 16-bit, but examination of their histograms shows that they have an effective precision of only 8 bits. We were concerned that this low level of quantization could produce artefacts in our results, in particular that step edges might arise in areas such as cloudless sky. To prevent this, we converted the image values to floating point and randomly perturbed them within each quantization bin. Finally, each image was normalized by dividing all values by the mean value. This is irrelevant for investigations of the canonical metamery class but relevant for the investigation of particular metamery classes in sections 2.3 and 3.2.

We extracted 2500 1-D profiles from each image for a total of 3.05×10⁶ profiles. Profiles are extracted at random location and orientation (all real-valued) using bilinear interpolation. Each profile is 64 samples in length. From this point on the profiles were treated as all existing in the same 1-D space, so position and orientation had been factored out. Profiles were then measured by forming the inner product with the 0th and 1st order 1-D DtGs of scale σ ≈ 7 (figure 2, left). All profiles then underwent a two-step normalization process to factor out luminance level and veiling glare. First, each profile that had a positive 1st derivative measurement had its 64 samples reversed in order. In the second step, each profile is individually affinely scaled ( I′ = mI + c, m > 0 ) so that it measures 0 and -1 respectively with the 0th and 1st order DtGs. Note that the first step guarantees that the normalization factor m is always positive so the intensity axis is never inverted to bring profiles into canonical form. In a very small number of cases (0.002%) the 1st derivative value before normalization is so small that the profiles cannot be brought into canonical form in a numerically stable way – these profiles are discarded.
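A minimal sketch of this two-step normalization (our own code, not the authors'); g0 and g1 stand for the sampled 0th- and 1st-order DtG filters, as in the earlier sketch, and the tolerance eps is our choice.

```python
import numpy as np

def to_canonical(profile, g0, g1, eps=1e-6):
    """Bring a 64-sample profile into the canonical 1st-order metamery class.

    Returns the scaled profile, or None if the 1st-derivative response is too
    small for a numerically stable scaling (such profiles are discarded)."""
    p = np.asarray(profile, dtype=float)
    d1 = np.dot(g1, p)
    if d1 > 0:                          # step 1: reverse so the slope measure is negative
        p = p[::-1]
        d1 = np.dot(g1, p)
    if abs(d1) < eps:                   # cannot be scaled stably -- discard
        return None
    m = -1.0 / d1                       # step 2: affine scaling I' = m*I + c, m > 0,
    c = -m * np.dot(g0, p) / g0.sum()   # chosen so the measurements become (0, -1)
    return m * p + c
```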

For comparison with natural image profiles we also construct two sets of 3.05×10⁶ synthetic profiles. One set – the Gaussian set – has each of its 64 values drawn independently from a normal distribution. The other set – Brownian profiles (Mandelbrot & van Ness, 1968) – is generated by, for each profile, setting the value at one end with a normally distributed random variable (mean=0, s.d.=100) then setting in turn each of the remaining 63 samples to be equal to the previous sample plus an independently generated normally distributed offset (mean=0, s.d.=1). Both the Gaussian and Brownian sets are scaled in the same manner as the natural profiles to bring them into the canonical metamery class. Figure 5 shows examples of the three types of profile before and after they have been scaled. We note that figure 5 (top) also gives an indication of what the class of all profiles, co-registered into the same space but unscaled, is like. The density over the space of profiles associated with this class will be extremely skewed, and it is clear that its maximum likelihood member will be a profile of constant low value.
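Generation of the two synthetic sets can be sketched as follows (our code, not the authors'); the Gaussian set's standard deviation is immaterial because of the subsequent scaling, so unit s.d. is assumed here.

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_profile(length=64):
    """Each of the samples drawn independently from a normal distribution."""
    return rng.normal(0.0, 1.0, size=length)

def brownian_profile(length=64):
    """Start at N(0, 100), then cumulate independent N(0, 1) increments."""
    start = rng.normal(0.0, 100.0)
    steps = rng.normal(0.0, 1.0, size=length - 1)
    return start + np.concatenate(([0.0], np.cumsum(steps)))
```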

[Figure 5 graphic: a grid of example profiles; columns: natural, gaussian, brownian; rows: raw, scaled, scaled (zoomed on vertical axis).]

Figure 5 – Shows examples of the 1-D profiles that were processed. Nine profiles of each class are shown. The top row shows the profiles as they are after extraction from images (natural) or random generation (Gaussian and Brownian). The middle row shows the same profiles after they have been scaled (and possibly reversed) to bring them into the canonical metamery class. The third row also shows the scaled profiles but with a zoomed view of the vertical axis.

2.2 Computing the ML Profile

Probability density functions over a space (and thus ML elements of that space) are concepts that are only meaningful relative to a metric on that space. The space under consideration here is the space of 1-D profiles. The obvious metric to use is the Euclidean L2 metric $d(p,q)^2 = \int_x \left( p(x) - q(x) \right)^2$. The problem with this obvious choice is its dependency on the spatial cut-off of our representation of the profiles, i.e. we might get a different ML profile if we used a representation that extended out twice as far from the DtG filters' centre. The solution to this dependency is to use an apodized L2 metric, which cares less about a difference between two profiles the further away from the centre of the DtG operators that the difference occurs. The apodisation function (aperture) that we use is a gaussian of scale √2 times that of the DtG filters (see figure 2), thus $d(p,q)^2 = \int_x G_{\sigma\sqrt{2}}(x) \left( p(x) - q(x) \right)^2$. This choice of aperture and its size follow from noting that application of DtGs is computation of terms of the Hermite Transform of a function. The Hermite Transform is analogous to the Fourier Transform but within a Gaussian aperture rather than along the full real line (Debnath, 1995, Koenderink & van Doorn, 1997, Makram-Ebeid & Mory, 2003).

For each class of profiles (natural, Gaussian and Brownian), we assume that there exists an unknown probability density over the metricized space of profiles that encodes how often particular profiles will result from our profile extraction or generation processes. Each set of scaled profiles forms a cloud of points in the space, the density of each cloud approximately reflecting the corresponding unknown probability density. Rather than using the point clouds to estimate the entire density we seek only to estimate the location of the mode of the density as that will correspond to the ML profile in the canonical metamery class. For this purpose we use a multi-dimensional mode estimation algorithm that we have developed (Griffin & Lillholm, 2003). The method is a variant of kernel density-estimation methods (Parzen, 1962), which can be understood as a sophisticated form of histogram construction that uses fuzzy Gaussian bins rather than hard-edged bins and allows the bins to be centred at any location and to be of any width rather than spaced regularly on a grid. Note that the use of gaussian bins for density estimation is independent of the use of gaussian derivative filters as models of receptive fields, though the reasons why the gaussian is a good choice are similar in the two cases.

Unlike standard kernel methods, we do not compute an explicit estimate of the density; indeed this would be difficult to represent as we are working in such a high dimensional space. Instead we vary the location (a 64-D point) and width (a 1-D positive parameter) of a single density-estimation kernel (i.e. a fuzzy histogram bin), calculating the number of profiles 'seen' by the kernel (cf. within a histogram bin) as we go. The density seen by the kernel is given by the number of profiles seen, divided by the kernel volume. We vary the kernel position and width, seeking out the location in the 64-D space where the kernel-estimated density is maximal. This density maximization cannot be controlled by simple gradient ascent as that causes the procedure to 'rush' to very narrow kernels that inevitably see a high sample density, as their volume is so small. We prevent this with two measures. Firstly we carefully control the rate at which kernel width is reduced during the search, as is done in graduated non-convexity methods (Blake & Zisserman, 1987). Secondly, we reduce the number of profiles seen by a kernel in a manner that particularly penalizes small numbers of seen profiles. This means that the highest estimated densities are no longer associated with the finest kernels. Put informally, our method searches for the fuzzy histogram bin with the density that is reliably higher than other bins. We call this method Pessimistic Scale Space Tracking (PSST). Further details are given in (Griffin & Lillholm, 2003).
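To make the two ingredients concrete, here is a deliberately simplified sketch (ours, not the PSST algorithm of Griffin & Lillholm, 2003): the apodized L2 distance, and a single fixed-width Gaussian density-estimation kernel whose centre is repeatedly moved to the weighted mean of the profiles it 'sees'. The real method additionally anneals the kernel width and penalizes kernels that see few profiles; kernel_width and the iteration count here are arbitrary choices of ours.

```python
import numpy as np

def apodized_dist2(p, q, sigma=np.sqrt(48.0), length=64):
    """Squared apodized L2 distance: differences are weighted by a Gaussian
    aperture of scale sigma*sqrt(2) centred on the filter centre."""
    x = np.arange(length) - (length - 1) / 2.0
    w = np.exp(-x**2 / (2 * (sigma * np.sqrt(2))**2))
    w /= np.sqrt(2 * np.pi) * sigma * np.sqrt(2)
    return np.sum(w * (p - q)**2)

def crude_mode(profiles, kernel_width, n_iter=50):
    """Very crude stand-in for PSST: recentre a single Gaussian kernel on the
    weighted mean of an (N, 64) array of canonical profiles, repeatedly."""
    centre = profiles.mean(axis=0)
    for _ in range(n_iter):
        d2 = np.array([apodized_dist2(centre, p) for p in profiles])
        w = np.exp(-d2 / (2 * kernel_width**2))
        centre = (w[:, None] * profiles).sum(axis=0) / w.sum()
    return centre
```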

2.3 Canonical vs. Particular Metamery Classes

Our concern in this paper is not with particular metamery classes but with the canonical metamery class obtained by bringing profiles into the same space and by scaling their intensities so that they agree in 0th and 1st order structure. There are several reasons for this preference, not least because studying particular metamery classes presents difficulties in that very many profiles would have to be examined in order to find sufficient numbers to populate a given particular metamery class. Also, any result about this canonical metamery class has a certain generality as it is about the totality of profiles rather than a subset. That is not to say that studying the canonical metamery class is a short-cut to studying particular metamery classes. Indeed the ML forms within particular metamery classes need not be the same as the ML form within the canonical.

To clarify this distinction between particular and canonical metamery classes, we have computed ML forms for profiles brought into canonical form but taken only from a subset of the totality of profiles. The subsets we have considered are defined with respect to the parameters – position, orientation, 0th and 1st order structure – that we factor out in the canonical transformation. Thus for position, we have considered two subsets, profiles just from the top (or bottom) halves of images. For orientation, we have considered three subsets, profiles with orientations near horizontal, near vertical or near diagonal. For 0th and 1st order structure we have taken four subsets: dark and flattish, light and flattish, dark and steep, and light and steep. In allocating profiles to these subsets we consider their 0th and 1st order structure before intensity scaling and compare these values to the medians of these dimensions. In all cases, ML computations were performed on the same number of profiles as in the main experiment.

We note that performing ML computations on such subsets of profiles is not the same as ML computations on profiles in particular metamery classes. It is however a step in this direction, since if the ML profile for a subset is different from the canonical form it indicates that there is variation amongst the ML forms of particular metamery classes. If there is no difference it does not prove that the ML form for all particular metamery classes is the same; this could only be proved by checking a wide range of particular metamery classes.
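A sketch (ours) of the subset allocation described above, under the assumption that 'dark vs. light' refers to the raw 0th-order measurement and 'flattish vs. steep' to the magnitude of the raw 1st-order measurement, each split at its median over the whole sample:

```python
import numpy as np

def zeroth_first_subsets(d0, d1):
    """Split profiles into dark/light x flattish/steep quadrants, given arrays
    of raw 0th-order (d0) and 1st-order (d1) measurements before scaling."""
    dark = d0 < np.median(d0)
    flat = np.abs(d1) < np.median(np.abs(d1))   # 'flattish' = small slope magnitude (our reading)
    return {"dark & flat":  dark & flat,  "light & flat":  ~dark & flat,
            "dark & steep": dark & ~flat, "light & steep": ~dark & ~flat}
```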

2.4 Control Computations

We performed several control computations to assess the robustness of our main computation of the ML natural image profile. We assess robustness relative to filter scale, dataset of images used, log-transformation of the image values, and varying sampling and interpolation scheme used.

The first experiment aimed to assess whether the ML profile depended on the scale of the filters used. We carefully created versions of the van Hateren images with reduced resolution and increased inner scale. We did this by first perturbing the images to remove quantization effects as described in section 2.1. Next we blurred the images with a gaussian kernel of width √3·σ_psf ≈ 1.4, which effectively doubles the width of the PSF. Next we subsampled the image by a factor of two in each dimension. Then we quantized the subsampled image, and finally we perturbed it again. We found that it was necessary to take all these steps as otherwise the resulting images were statistically distinguishable from the original set. We created a third set of images by carrying out this procedure on the reduced scale and resolution set. We then calculated ML profiles for each of these two new sets of images in exactly the same way as before. If the original ML profile was for a scale and sample spacing of σ ≈ 7, ∆ = 1, then these new ML profiles are for σ ≈ 14, ∆ = 2 and σ ≈ 28, ∆ = 4. The ratio between the PSF width and the pixel spacing was constant.

'Natural image' is an imprecise term so it is possible that our results reflect a particular interpretation of the term used in the preparation of the van Hateren images. As a control against this we have also used the BT dataset (ftp://ftp.vislist.com/IMAGERY/BT_scenes/). The BT image database consists of 98 images of various outdoor scenes. The images are 512×512 and 8-bit RGB. We converted these to grey-scale using the linear transformation grey = 0.30 × red + 0.59 × green + 0.11 × blue.

We have also wondered if our results are dependent on our use of raw image values rather than log-transformed values. Logged values have been recommended on theoretical grounds (Koenderink & van Doorn, 2002) and are used in the majority of analyses of natural image statistics. We have computed the ML profile for the log-transformed van Hateren and BT data.

Our final control experiment was aimed at establishing whether our results are dependent on details of sampling and interpolation used in our computations. We used the once sub-sampled set of images used in the scale comparison above but applied filters of scale σ ≈ 7, ∆ = 2 to them. We have also used nearest neighbour interpolation rather than bilinear when extracting profiles from images. Bilinear interpolation has a tendency to spuriously dampen high frequencies; nearest neighbour interpolation has the opposite tendency. Together the behaviour of these two interpolation schemes spans the artefactual effects of reasonable methods for interpolation (Thévenaz, Blu & Unser, 2000). We have used both types of interpolation for σ ≈ 7, ∆ = 1 and σ ≈ 7, ∆ = 2 filters.
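One plausible implementation of the resolution-reduction step is sketched below (our code, not the authors'); scipy's gaussian_filter is assumed for the blur, and a quantization bin width of one grey level is assumed for the re-quantization.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def dither(img, rng, step=1.0):
    """Random perturbation within each quantization bin of width `step`."""
    return img + rng.uniform(-0.5 * step, 0.5 * step, size=img.shape)

def reduce_resolution(img, sigma_psf=0.8, rng=None):
    """Halve resolution while doubling the effective PSF width."""
    if rng is None:
        rng = np.random.default_rng()
    img = dither(img, rng)                                     # remove quantization artefacts
    img = gaussian_filter(img, sigma=np.sqrt(3) * sigma_psf)   # sqrt(3)*sigma_psf doubles the PSF width
    img = img[::2, ::2]                                        # subsample by two in each dimension
    img = np.round(img)                                        # requantize (bin width of 1 assumed) ...
    return dither(img, rng)                                    # ... and perturb again
```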

3. Results

3.1 The ML profile

The main result of the paper is computation of the ML profile within the canonical 1-D 1st order metamery class. For this we have applied the PSST method described in section 2.2 to the sets of profiles whose preparation was described in section 2.1. To allow calculation of error bars, each computation was repeated three times on freshly extracted or generated profiles. The results are shown in figure 6.

[Figure 6 graphic: panels labelled natural, gaussian and brownian.]

Figure 6 – Shows in black the ML members of the canonical 1st order metamery class for three classes of profile; the error bars are 95% confidence intervals. The grey curves are norm-minimizers shown for comparison – refer to figure 2 to identify these.

In appendix 6.2 we prove that: • for Gaussian random profiles, the ML profile is the L2-minimizer (i.e. a 1st DtG of the same scale as the filters defining the class), and • for Brownian random profiles, the ML profile is the L2-grad minimizer (i.e. a cumulative Gaussian of the same scale as the filters defining the class).

Figure 6 shows that, within the tolerance of the confidence limits, these are the ML profiles that we have estimated through PSST computation. The fact that these two ML profiles are different from each other shows that the method gives results that are a function of the class of profiles used as input rather than just the filters used to specify the metamery class. That the ML profiles agree with theoretical predictions in these two cases, where we know the ground truth, gives us confidence in the accuracy of the method when applied to natural image profiles, for which no ground truth is available.

Our result for natural images is that within the central region of the apodization window the ML profile is a blurred step edge (figure 6, left). It is well fit within the window by a cumulative gaussian of scale σ_edge = 1.7 pixel units.

3.2 Canonical vs. Particular Metamery Classes

Figure 7 shows the ML forms for subsets of profiles selected according to position within the image or orientation. Confidence intervals have not been shown for clarity but are of a similar extent to those in figure 6. Within the limit of the confidence intervals, the ML profiles for position and orientation subsets have the same slightly blurred step edge form as for all profiles. Again we note that this does not prove that all particular (with respect to position and orientation) metamery classes have the step edge for their ML form, but the result is consistent with that possibility.

[Figure 7 graphic: (a) varying position; (b) varying orientation.]

Figure 7 – Shows the ML form for subsets of natural image profiles. (a) Shows the ML form using profiles either taken from the top half, the bottom half or anywhere in the image. (b) Shows the ML form for profiles extracted at orientations near horizontal, near vertical, near the intermediate 45º diagonals, or at any orientation.

In contrast to figure 7, figure 8 shows that the ML form of profiles is not a step edge for all particular (with respect to 1st- and 0th-order filter responses) metamery classes. What the figure shows is that for two of the four subsets – the light and steep, and the dark and steep – the ML profile is the same blurred step edge that we got for the full metamery class; but for the two flattish subsets the form is different. For dark and flattish profiles the ML form is close to the L2-grad minimizer. For light and flattish profiles the ML form is the L∞-grad minimizer. The figure also shows comparable results for the gaussian and brownian profiles. For the gaussian profiles the ML profile is different for each subset and none of them is the same as the L2-minimizing form of the canonical ML profile (see figure 6). In contrast, for the brownian profiles all four ML profiles have the same L2-grad minimizing form as the canonical ML profile.

[Figure 8 graphic: rows for natural, gaussian and brownian profiles; columns labelled dark & flat, light & flat, dark & steep, light & steep.]

Figure 8 – Shows ML profiles (with 95% confidence intervals) for subsets of the natural, gaussian and brownian profiles. The subsets are defined by the zeroth- and first-order measurements of the profiles before they are scaled into canonical form. The labels along the bottom indicate the subset used in that column. For the dark and flattish subsets of natural image profiles the L2-grad minimizer is shown for comparison; for the light and flattish subset the L∞-grad minimizer is shown.

In figure 9 we show natural image profiles, and the image contexts from which they came, that are the best examples of profiles that have the various ML forms (selected from 3.05×10⁶ candidates). The physical cause of the profiles is different for each subset. The dark and flat example is caused by a depth discontinuity plus an associated band of self-shadowing; the light and flat by a low frequency undulation in an overcast sky; the dark and steep (step edge) by the border of a patch of water that is reflecting the sky; and the light and steep (step edge) by a depth discontinuity.

[Figure 9 graphic: columns labelled dark & flat, light & flat, dark & steep, light & steep.]

Figure 9 – Along the top row are shown example profiles from the subsets of natural image profiles that match closely the ML profile for that subset (cf. figure 8). Like figure 8, the labels along the bottom indicate the subset used. The middle row shows the image, and location within it, from which each profile comes. The bottom row shows a zoomed view of the context of each profile.

3.3 Control Computations

In figure 10 we present results of control computations designed to assess the stability of the result that the ML natural image profile is a cumulative gaussian of scale σ_edge = 1.7. The figure shows that none of our three control computations demonstrated any significant variation of the ML profile with the changes to the computation that were assessed – filter scale, image database used, log-transformation, interpolation and sampling. We note that the finding of scale invariance agrees with previous findings concerning natural images (Field, 1987, Kretzmer, 1952, Ruderman, 1997, Ziegaus & Lang, 1998). Because of scale invariance, it is meaningful to express the edge blur of the ML profile as a fraction of the scale of the filters: it is 25%.

[Figure 10 graphic: (a) varying filter scale; (b) varying datasets and/or logging; (c) varying interpolation and/or sample spacing.]

Figure 10 – Illustrates the stability of the maximum likelihood natural image profile to variations in the details of its computation. (a) the ML profiles as computed using filters of scale σ ≈ 7, 14, 28 . (b) the ML profiles as computed with and without log-transformation of the image values before profile extraction and/or using an alternative to the van Hateren database of natural images. (c) the ML profiles as computed using nearest neighbour or bilinear interpolation for extraction of profiles from images, and/or sub-sampling the image by a factor of two while holding the filter scale constant. Note that slightly sharper steps are obtained using nearest neighbour interpolation (cf figure 12).

3.4 The distal ML profile

There is a mismatch between the width of the PSF (0.8 pixels) and the blur of the ML profile (1.7 pixels). So clearly the ML profile does not arise from normal sections through perfect step edges blurred only by the PSF. The mismatch would be resolved if ML profiles were due to sections through edges at more oblique angles ( σ_edge ≈ σ_psf / cos 62° ) or if more than just the PSF blurred the edges. Both of these explanations apply to the profiles in figure 11, which shows the three profiles, and their image context, that are most similar (out of 3.05×10⁶, in a non-windowed L2 sense) to our computed ML profile. The profiles cross the image edge at angles of 40°, 70° and 10° from the edge normal. These different crossing angles lead to similarly blurred profiles because the blur of the image edge is different in the three cases. In the first case the blur is due to the PSF but also due to the nature of the physical edge that caused it. In the second case the blur is due just to the PSF but is amplified by an oblique crossing angle. In the third case the crossing angle is near normal but the PSF is locally broader than 0.8 pixels, probably due to limited depth of focus.

It is initially puzzling that the ML profile is equal to that obtained by sectioning a PSF-blurred step edge at an oblique angle rather than normally. Indeed it is easy to construct an informal argument that if the images are populated only by such blurred step edges then the ML profile will be a normal section. We believe that the presence of some intrinsically blurred edges (first example in figure 11) and of some poor focus regions (third example in figure 11) pulls the ML profile away from the normal section of a PSF-blurred edge to a more oblique section.

Figure 11 – On the top row are the three profiles, out of 3.05×10⁶, that are closest (in an L2 sense) to the maximum likelihood form shown in figure 6. The middle row shows the image, and location within it, from which each profile comes. The bottom row shows a zoomed view of the context of each profile.

Our analysis so far has been concerned with a proximal metamery class whose elements are the raw images (Griffin, 1999) of the camera. As discussed in section 1.3 we can also conceive of a more distal metamery class the elements of which are idealized, unblurred-by-focusing-optics, images. A visual system may be more concerned with such distal objects as they have more-about-the-world in them than blurred proximal images. We would like to compute the ML profile of the distal canonical metamery class but we cannot get samples of it (and might have trouble representing them even if we could). Instead we will make and test a hypothesis about it. Our hypothesis is that the ML profile of the distal canonical metamery class is an unblurred step edge.

To test this we have performed the following experiment. We have added small amounts of blur ( σ_inc = 0.6 to 2.2 ) to the van Hateren images to artificially increase the width of their PSF. For each degree of blur we have calculated the ML profile and measured its edge blur. We plot the results of this as edge blur against PSF width in figure 12. Other results plotted in the figure show that sampling and interpolation play a negligible role in determining the degree of blur of the ML edge. The figure shows that edge blur increases with PSF width and our results are consistent with them going to zero together. Thus we accept our hypothesis that distal natural image profiles are most likely to be step edges.
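One way the edge blur of an ML profile might be measured is sketched below (our code, not the authors'): a scaled cumulative Gaussian is least-squares fitted to the central part of the profile and its σ read off. scipy.optimize.curve_fit is assumed, and the fit window is our own choice.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import norm

def edge_blur(profile, fit_half_width=16):
    """Fit a + b*Phi((x - x0)/s) to the central part of a 64-sample ML profile
    and return the fitted edge scale s (in pixel units)."""
    profile = np.asarray(profile, dtype=float)
    n = len(profile)
    x = np.arange(n) - (n - 1) / 2.0
    keep = np.abs(x) <= fit_half_width
    model = lambda x, a, b, x0, s: a + b * norm.cdf(x, loc=x0, scale=s)
    p0 = [profile.min(), np.ptp(profile), 0.0, 2.0]     # crude initial guess
    (a, b, x0, s), _ = curve_fit(model, x[keep], profile[keep], p0=p0)
    return abs(s)
```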

[Figure 12 graphic: σ_edge / σ_filter (vertical axis, ticks at 0.3 and 0.6) plotted against σ_psf / σ_filter (horizontal axis, ticks at 0.1, 0.2 and 0.3).]

Figure 12 – Shows the blur of the ML profile (vertical) plotted against the blur of the point-spread function (horizontal). The leftmost filled square symbol corresponds to the ML profile shown in figure 6. Confidence intervals, where plotted, are 95%. All ML profiles were calculated using the default filters of scale σ ≈ 7. Profiles were extracted from images using the default bilinear interpolation for square symbols, and using nearest neighbour interpolation for triangle symbols. The filters were sampled using the default spacing of ∆ = 1 for filled symbols, and the sub-sampled spacing of ∆ = 2 for unfilled symbols. From leftwards extrapolation of the trend it is plausible that the two types of blur go to zero together.

5. Conclusions

We described a method of computing the ML profile from a large number of samples. We validated our method with randomly generated profiles for which we were able to compute the correct result. Applying the method to natural images showed that the ML profile of the canonical 1st order 1-D metamery class is a blurred step edge. We have demonstrated that the blur is due to the PSF and that, if unblurred images could be obtained, the ML form would be an unblurred step edge. We have shown that this result is robust with respect to details of sampling and interpolation, and that it holds over at least two octaves of scale and over two independently constructed databases of natural images.

To clarify the distinction between the canonical metamery class, which all profiles can be transformed into, and particular metamery classes defined by specific positions, orientations or filter responses, we carried out ML computations on subsets of the population of profiles. We found a difference between the canonical and the particular in relation to the steepness of profiles. The ML form for steep profiles is the blurred step edge, while for flattish profiles the form is smoother and more complex. Our results on the canonical and the particular are not contradictory and may be phrased thus: if all that one knows of a randomly selected natural image profile is that it has non-zero first-order structure (true for 99.998% of profiles) then its ML form is a step edge; if one knows something about the magnitude of the first-order structure, that may alter what the ML form is.

Our hypothesis about feature detection – that ML-selected icons will lead to a simple classification of qualitative structure – is supported by our results. The icon we have been led to (a step edge) for the canonical metamery class that we have considered (1-D, 1st order) did turn out to have a very simple qualitative structure. If the ML form had turned out to be more complex, the theory would already be looking less attractive. Further results in 2-D and for higher orders are needed to test the hypothesis sufficiently to convince.

6. Appendix

6.1 Form of norm-minimizers within the canonical metamery class

We prove the form of the norm-minimizers shown in figure 2. In each case the problem is to find $p : \mathbb{R} \to \mathbb{R}$ such that: (i) $\langle G_\sigma, p \rangle = 0$, (ii) $\langle G'_\sigma, p \rangle = -1$, plus a third constraint specific to the norm. The angle bracket notation denotes the inner product operation between functions.

6.1.1 The L2 minimizer is a 1st DtG

The additional constraint is (iii) $\int p^2$ is minimized. Using a Lagrange multiplier approach, one gets that the variation of $\int p^2$ (which is $2p$) should be a weighted sum of $G_\sigma$ and $G'_\sigma$. Given that $\langle G_\sigma, G'_\sigma \rangle = 0$, constraints (i) and (ii) are sufficient to determine that $p = -4\sigma^3 \sqrt{\pi}\, G'_\sigma$.
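For completeness (our own working, using a standard Gaussian integral), the coefficient follows from the squared norm of $G'_\sigma$:

$$\|G'_\sigma\|^2 = \int_{-\infty}^{\infty} \frac{x^2}{\sigma^4}\, G_\sigma(x)^2 \, dx = \frac{1}{4\sqrt{\pi}\,\sigma^3}, \qquad \langle G'_\sigma,\; \lambda\, G'_\sigma \rangle = -1 \;\Rightarrow\; \lambda = -4\sqrt{\pi}\,\sigma^3 .$$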

6.1.2 The L∞ minimizer is a Step Edge

The proof of this has been given elsewhere (Tagliati & Griffin, 2001) but comes originally from an argument as to the form of optimal spectral reflectance functions (Schrödinger, 1920).

6.1.3 The L2-grad minimizer is a Cumulative Gaussian

The additional constraint is (iii) $\int (p')^2$ is minimized. Using a Lagrange multiplier approach, one gets that the variation of $\int (p')^2$ (which is $-2p''$) should be a weighted sum of $G_\sigma$ and $G'_\sigma$. Thus $p$ must be of the form $A + B\, G_\sigma^{(-1)} + C\, G_\sigma^{(-2)}$, so that $p' = B\, G_\sigma + C\, G_\sigma^{(-1)}$. But the quantity we seek to minimize will only be finite if $C = 0$, so from (i) and (ii), $p = \sigma\sqrt{\pi}\left( 2\, G_\sigma^{(-1)} - 1 \right)$.
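For completeness (our own working), the constants follow from $\|G_\sigma\|^2 = 1/(2\sigma\sqrt{\pi})$, $\int G_\sigma = 1$ and $\langle G_\sigma, G_\sigma^{(-1)} \rangle = \tfrac{1}{2}$:

$$\langle G'_\sigma, p \rangle = -\langle G_\sigma, p' \rangle = -\frac{B}{2\sigma\sqrt{\pi}} = -1 \;\Rightarrow\; B = 2\sigma\sqrt{\pi}; \qquad \langle G_\sigma, p \rangle = A + \tfrac{B}{2} = 0 \;\Rightarrow\; A = -\sigma\sqrt{\pi}.$$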

6.1.4 L1-grad minimizer

In 1-D, the L∞ norm of a function is always less than or equal to its L1-grad norm. The step edge, which is the L∞ minimizer (section 6.1.2), has equal L∞ and L1-grad norms. Therefore by a reductio ad absurdum the step edge is also the L1-grad minimizer.

6.1.5 L∞-grad minimizer

We are unable to provide a proof for our claim that the minimizer is a linear slope.

6.2 Maximum likelihood form of randomly generated profiles

6.2.1 Gaussian Profiles

The probability of generating a Gaussian profile $p : \mathbb{R} \to \mathbb{R}$ is proportional to $\prod_{x \in \mathbb{R}} e^{-k\, p(x)^2} = e^{-k \|p\|^2}$. If $q : \mathbb{R} \to \mathbb{R}$ is a profile in the canonical metamery class then it could have been generated (before being scaled into canonical form) as any of $A + Bq$. So the probability of selecting $q$ from the canonical metamery class is proportional to $\|q\|^{-1} \iint_{A,B \in \mathbb{R}^2} e^{-k \|A + Bq\|^2}$, which is inversely proportional to $\|q\|$. So the ML profile within the canonical metamery class is the variance minimizer, which (from section 6.1.1) is the appropriately scaled 1st DtG of the same scale as the filters defining the metamery class.

6.2.2 Brownian Profiles

The probability of generating a Brownian profile $p : \mathbb{R} \to \mathbb{R}$ is proportional to $\prod_{x \in \mathbb{R}} e^{-k\, (p'(x))^2} = e^{-k \|p'\|^2}$. If $q : \mathbb{R} \to \mathbb{R}$ is a profile in the canonical metamery class then it could have been generated (before being scaled into canonical form) as any of $A + Bq$. So the probability of selecting $q$ from the canonical metamery class is proportional to $\|q\|^{-1} \iint_{A,B \in \mathbb{R}^2} e^{-k \|(A + Bq)'\|^2} = \|q\|^{-1} \iint_{A,B \in \mathbb{R}^2} e^{-k B^2 \|q'\|^2}$, which is inversely proportional to $\|q'\|$. So the ML profile within the canonical metamery class is the L2-grad minimizer, which (from section 6.1.3) is the cumulative Gaussian of the same scale as the filters defining the metamery class.

7. References

Barlow, H.B. (1953). Summation and inhibition in the frog's retina. Journal of Physiology (London), 119, 69-88.
Belhumeur, P., Kriegman, D., & Yuille, A.L. (1999). The bas-relief ambiguity. International Journal of Computer Vision, 35, 33-44.
Bell, A.J., & Sejnowski, T.J. (1997). The "independent components" of natural scenes are edge filters. Vision Research, 37 (23), 3327-3338.
Blake, A., & Zisserman, A. (1987). Visual Reconstruction. MIT Press.
Canny, J. (1986). A computational approach to edge detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 8 (6), 679-698.
Daugman, J.G. (1985). Uncertainty relations for resolution in space, spatial frequency, and orientation optimized by two-dimensional visual cortical filters. Journal of the Optical Society of America A, 2, 1160-1169.
Debnath, L. (1964). On Hermite Transforms. Mathematicki Vesnik, 1 (16), 285-292.
Debnath, L. (1995). Integral Transforms and their Applications. CRC Press.
Ernst, M.O., & Banks, M.S. (2002). Humans integrate visual and haptic information in a statistically optimal fashion. Nature, 415 (6870), 429-433.
Fechner, G. (1860/1966). Elements of psychophysics. New York: Holt, Rinehart & Winston.
Field, D.J. (1987). Relations between the Statistics of Natural Images and the Response Properties of Cortical-Cells. Journal of the Optical Society of America A - Optics Image Science and Vision, 4 (12), 2379-2394.
Field, D.J. (1999). Wavelets, vision and the statistics of natural scenes. Philosophical Transactions of the Royal Society of London Series A - Mathematical Physical and Engineering Sciences, 357 (1760), 2527-2542.
Freeman, W.T., Jones, T.R., & Pasztor, E.C. (2002). Example-based super-resolution. IEEE Computer Graphics and Applications, 22 (2), 56-65.
Griffin, L.D. (1995). Descriptions of Image Structure. London: PhD thesis, University of London.
Griffin, L.D. (1997a). Critical Points in Affine Scale Space. In: S. Sporring, M.F. Nielsen, L.M.J. Florack, & P. Johansen (Eds.), Gaussian Scale-Space Theory (pp. 165-180).
Griffin, L.D. (1997b). Scale-imprecision space. Image and Vision Computing, 15 (5), 369-398.
Griffin, L.D. (1999). Partitive mixing of images: a tool for investigating pictorial perception. Journal of the Optical Society of America A - Optics Image Science and Vision, 16 (12), 2825-2835.
Griffin, L.D. (2002). Local image structure, metamerism, norms, and natural image statistics. Perception, 31 (3), 377-377.
Griffin, L.D., Colchester, A.C.F., & Robinson, G.P. (1992). Scale and Segmentation of Gray-Level Images Using Maximum Gradient Paths. Image and Vision Computing, 10 (6), 389-402.
Griffin, L.D., & Lillholm, M. (2003). Mode Estimation by Pessimistic Scale Space Tracking. Scale Space '03. Isle of Skye, UK: Springer.
Hikosaka, K. (1997). Responsiveness of neurons in the posterior inferotemporal cortex to visual patterns in the macaque monkey. Behavioural Brain Research, 89 (1-2), 275-283.
Horn, B.K.P., & Schunk, B.G. (1981). Determining optical flow. Artificial Intelligence, 17, 185-203.

Hubel, D.H., & Wiesel, T.N. (1968). Receptive fields and functional architecture of monkey striate cortex. Journal of Physiology, 195, 215-243.
Hyvarinen, A., & Hoyer, P.O. (2001). A two-layer sparse coding model learns simple and complex cell receptive fields and topography from natural images. Vision Research, 41 (18), 2413-2423.
Jones, J.P., & Palmer, L.A. (1987). The Two-Dimensional Spatial Structure of Simple Receptive Fields in Cat Striate Cortex. Journal of Neurophysiology, 58 (6), 1187-1211.
Kanatani, K. (1998). Geometric information criterion for model selection. International Journal of Computer Vision, 26 (3), 171-189.
Kobatake, E., & Tanaka, K. (1994). Neuronal Selectivities to Complex Object Features in the Ventral Visual Pathway of the Macaque Cerebral-Cortex. Journal of Neurophysiology, 71 (3), 856-867.
Koenderink, J.J. (1993). What is a feature? Journal of Intelligent Systems, 3 (1), 49-82.
Koenderink, J.J. (2001). Multiple visual worlds (editorial). Perception, 30, 1-7.
Koenderink, J.J., & van Doorn, A.J. (1987). Representation of Local Geometry in the Visual-System. Biological Cybernetics, 55 (6), 367-375.
Koenderink, J.J., & van Doorn, A.J. (1990). Receptive-Field Families. Biological Cybernetics, 63 (4), 291-297.
Koenderink, J.J., & van Doorn, A.J. (1992a). Generic Neighborhood Operators. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14 (6), 597-605.
Koenderink, J.J., & van Doorn, A.J. (1992b). Receptive Field Assembly Specificity. Journal of Visual Communication and Image Representation, 3 (1), 1-12.
Koenderink, J.J., & van Doorn, A.J. (1993). Illuminance Critical-Points on Generic Smooth Surfaces. Journal of the Optical Society of America A - Optics Image Science and Vision, 10 (5), 844-854.
Koenderink, J.J., & van Doorn, A.J. (1996). Metamerism in complete sets of image operators. In: Adv. Image Understan. '96 (pp. 113-129).
Koenderink, J.J., & van Doorn, A.J. (1997). Local Image Operators and Iconic Structure. In: G. Sommer, & J.J. Koenderink (Eds.), Algebraic Frames for the Perception-Action Cycle, 1315 (pp. 66-93): Springer.
Koenderink, J.J., & van Doorn, A.J. (2002). Image processing done right. ECCV 2002, 2350 (pp. 158-172). Copenhagen: Springer.
Kretzmer, E. (1952). Statistics of television signals. Bell System Technical Journal, 31, 751-763.
Kreyszig, E. (1989). Introductory Functional Analysis with Applications. Wiley Classics Library. Wiley.
Lee, A.B., Mumford, D., & Huang, J. (2001). Occlusion models for natural images: A statistical study of a scale-invariant dead leaves model. International Journal of Computer Vision, 41 (1-2), 35-59.
Lillholm, M., Nielsen, M., & Griffin, L.D. (2003). Feature-based Image Analysis. International Journal of Computer Vision, 52 (2), 73-95.
Lindeberg, T. (1998). Feature detection with automatic scale selection. International Journal of Computer Vision, 30 (2), 79-116.
Logothetis, N.K., Pauls, J., & Poggio, T. (1995). Shape Representation in the Inferior Temporal Cortex of Monkeys. Current Biology, 5 (5), 552-563.
Majthay, A. (1985). Foundations of Catastrophe Theory. London: Pitman Publishing Ltd.

Makram-Ebeid, S., & Mory, B. (2003). Scale-space image analysis based on Hermite polynomials theory. In: L.D. Griffin, & M. Lillholm (Eds.), Proc. Conf. on Scale Space Methods in Computer Vision, 2695 (pp. 57-71): Springer.
Mandelbrot, B., & van Ness, J. (1968). Fractional Brownian motions, fractional noises and applications. SIAM Review, 10 (4), 422-437.
Mansfield, J.S., & Legge, G.E. (1996). The binocular computation of visual direction. Vision Research, 36 (1), 27-41.
Mardia, K.V., Goodall, C., & Walder, A. (1996). Distributions of projective invariants and model-based machine vision. Advances in Applied Probability, 28 (3), 641-661.
Marr, D. (1982). Vision. New York: W. H. Freeman & Co.
Martens, J.B. (1997). Local orientation analysis in images by means of the Hermite transform. IEEE Transactions on Image Processing, 6 (8), 1103-1116.
Nakamura, K., Matsumoto, K., Mikami, A., & Kubota, K. (1994). Visual response properties of single neurons in the temporal pole of behaving monkeys. Journal of Neurophysiology, 71 (3), 1206-1221.
Olshausen, B.A., & Field, D.J. (1996). Natural image statistics and efficient coding. Network: Computation in Neural Systems, 7 (2), 333-339.
Parzen, E. (1962). On estimation of a probability density function and mode. Annals of Mathematical Statistics, 33, 520-531.
Pedersen, K.S. (2003). Statistics of Natural Image Geometry. Copenhagen: Department of Computer Science, University of Copenhagen.
Pizlo, Z. (2001). Perception viewed as an inverse problem. Vision Research, 41 (24), 3145-3161.
Richards, W. (1979). Quantifying sensory channels: generalizing colorimetry to orientation and texture, touch, and tones. Sensory Processes, 3 (3), 207-229.
Rissanen, J. (1978). Modeling by shortest data description. Automatica, 14, 465-471.
Romeny, B.M.T., Florack, L.M.J., Koenderink, J.J., & Viergever, M.A. (1991). Scale space: its natural operators and differential invariants. In: LNCS, 511 (pp. 239-255).
Ruderman, D.L. (1997). Origins of scaling in natural images. Vision Research, 37 (23), 3385-3398.
Schrödinger, E. (1920). Theorie der Pigmente von grösster Leuchtkraft. Annalen der Physik, 62, 603-622.
Sebe, N., & Lew, M.S. (2000). Maximum likelihood stereo matching. In: ICPR '00 (pp. 900-903).
Sillito, A.M., Grieve, K.L., Jones, H.E., Cudeiro, J., & Davis, J. (1995). Visual cortical mechanisms detecting focal orientation discontinuities. Nature, 378 (6556), 492-496.
Tagliati, E., & Griffin, L.D. (2001). Features in scale space: progress on the 2D 2nd order jet. In: M. Kerckhove (Ed.), LNCS, 2106 (pp. 51-62): Springer.
Thévenaz, P., Blu, T., & Unser, M. (2000). Image interpolation and resampling. In: I.N. Bankman (Ed.), Handbook of Medical Imaging, Processing and Analysis (pp. 393-420). San Diego, CA: Academic Press.
Tsai, A., Zhang, J., & Willsky, A.S. (2001). Expectation-maximization algorithms for image processing using multiscale models and mean-field theory, with applications to laser radar range profiling and segmentation. Optical Engineering, 40 (7), 1287-1301.

van Hateren, J.H., & Ruderman, D.L. (1998). Independent component analysis of natural image sequences yields spatio-temporal filters similar to simple cells in primary visual cortex. Proceedings of the Royal Society of London Series B: Biological Sciences, 265 (1412), 2315-2320.
van Hateren, J.H., & van der Schaaf, A. (1998). Independent component filters of natural images compared with simple cells in primary visual cortex. Proceedings of the Royal Society of London Series B: Biological Sciences, 265 (1394), 359-366.
van Veen, H., & Werkhoven, P. (1996). Metamerisms in structure-from-motion perception. Vision Research, 36 (14), 2197-2210.
Weickert, J., Ishikawa, S., & Imiya, A. (1997). On the history of Gaussian scale-space axiomatics. In: Gaussian Scale-Space Theory, 8 (pp. 45-59).
Wyszecki, G. (1982). Color Science. New York: Wiley.
Young, R.A. (1987). The Gaussian derivative model for spatial vision: I. Retinal mechanisms. Spatial Vision, 2, 273-293.
Young, R.A., Lesperance, R.M., & Meyer, W.W. (2001). The Gaussian derivative model for spatial-temporal vision: I. Cortical model. Spatial Vision, 14 (3-4), 261-319.
Zetzsche, C., & Rohrbein, F. (2001). Nonlinear and extra-classical receptive field properties and the statistics of natural scenes. Network: Computation in Neural Systems, 12 (3), 331-350.
Zhang, T., & Tomasi, C. (2001). On the consistency of instantaneous rigid motion estimation. International Journal of Computer Vision, 46 (1), 51-79.
Zhou, H., Friedman, H.S., & von der Heydt, R. (2000). Coding of border ownership in monkey visual cortex. Journal of Neuroscience, 20 (17), 6594-6611.
Ziegaus, C., & Lang, E.W. (1998). Statistical invariances in artificial, natural, and urban images. Zeitschrift für Naturforschung Section A: A Journal of Physical Sciences, 53 (12), 1009-1021.
