EE 396: Lecture 5
Ganesh Sundaramoorthi
Feb. 22, 2011

In this lecture, we learn to solve the Euler-Lagrange equations of our denoising model PDE

    u(x) − α∆u(x) = I(x),   x ∈ int(Ω),
    ∂u/∂n(x) = 0,           x ∈ ∂Ω,                                         (1)

numerically, since in the last lecture we understood that the equation has a solution. Note that the solution will be a global minimizer of our energy

    E(u) = − log p(u|I) = ∫_Ω (I(x) − u(x))² dx + α ∫_Ω |∇u(x)|² dx,        (2)

since the functional is convex.

0.1 Gradient Descent

0.1.1 Steepest Descent in R^n

A generic algorithm for minimizing a smooth energy function is gradient descent, also called steepest descent. For a function f : R^n → R, the algorithm works by making a guess x_guess ∈ R^n for the minimum and then correcting the guess by moving slightly in the negative gradient direction (recall from your calculus class that the gradient direction is the direction that increases f the fastest):

    x_next = x_guess − ∆t ∇f(x_guess),

where ∆t > 0 is small. Since the negative gradient direction decreases the function in the steepest possible way, x_next should have a lower value than x_guess provided ∆t is small enough. One then repeats the process on x_next and hopes that the procedure converges. That is,

    x_0 = x_guess                                                           (3)
    x_{k+1} = x_k − ∆t ∇f(x_k).                                             (4)
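As a concrete sketch, the iteration (3)-(4) takes only a few lines. The quadratic test function, step size, and iteration count below are arbitrary choices for illustration, not part of the lecture.

```python
import numpy as np

def gradient_descent(grad_f, x_guess, dt=0.1, num_iters=200):
    """Steepest descent: x_{k+1} = x_k - dt * grad_f(x_k), per (3)-(4)."""
    x = np.asarray(x_guess, dtype=float)
    for _ in range(num_iters):
        x = x - dt * grad_f(x)
    return x

# Example: f(x) = |x - c|^2 / 2 has gradient x - c and unique minimizer c.
c = np.array([1.0, -2.0])
x_min = gradient_descent(lambda x: x - c, x_guess=np.zeros(2))
```

For a convex f such as this one, the iterates contract toward the global minimizer for any sufficiently small ∆t.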

Even if the algorithm converges, it may become trapped in a local minimum, for example when f is not convex (if f is convex, then the algorithm will reach a global minimum). Therefore, the initial guess is quite important; ideally it should be chosen as close as possible to the global minimum, or as close as possible to the desired local minimum.

0.1.2 Gradient of Functionals

We now generalize this gradient descent procedure to our denoising model. But what is meant by the gradient of such a functional? It turns out that this is analogous to the multivariable case. Recall from previous lectures that the directional derivative of E is

    dE(u) · v = d/dt E(u + tv)|_{t=0} = ∫_Ω v(x) (u(x) − I(x) − α∆u(x)) dx  (5)

for v ∈ V, the space of permissible perturbations, and u ∈ U. From this expression of the directional derivative, we can choose the perturbation v of u so that the directional derivative in the direction v is as negative as possible. A sure way to ensure dE(u) · v < 0 is to choose v(x) = −(u(x) − I(x) − α∆u(x)); in this case,

    dE(u) · v = − ∫_Ω |u(x) − I(x) − α∆u(x)|² dx ≤ 0.                       (6)


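To make (5) concrete, here is a small numerical check on a discrete analogue: on a 1-D periodic grid, with the customary 1/2 factors on the discrete energy so that its gradient is exactly u − I − α∆u, the directional derivative computed from the formula matches a difference quotient of E. The periodic boundary choice and all names are illustrative assumptions, not the lecture's Neumann setting.

```python
import numpy as np

rng = np.random.default_rng(1)
n, alpha = 64, 2.0
I = rng.standard_normal(n)   # "observed image" on a 1-D periodic grid
u = rng.standard_normal(n)   # current estimate
v = rng.standard_normal(n)   # an arbitrary perturbation

def D(w):
    """Forward difference on a periodic grid with spacing 1."""
    return np.roll(w, -1) - w

def E(w):
    """Discrete energy 0.5*sum((I - w)^2) + 0.5*alpha*sum(|Dw|^2)."""
    return 0.5 * np.sum((I - w) ** 2) + 0.5 * alpha * np.sum(D(w) ** 2)

lap_u = np.roll(u, -1) - 2 * u + np.roll(u, 1)        # periodic Laplacian
dE_formula = np.sum(v * (u - I - alpha * lap_u))      # right-hand side of (5)
t = 1e-4
dE_numeric = (E(u + t * v) - E(u - t * v)) / (2 * t)  # d/dt E(u + t v) at t = 0
```

Summation by parts on the periodic grid plays the role of the integration by parts used to derive (5).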

But is this the steepest descent direction? To answer this question, we must first note that V is an inner product space¹ equipped with the inner product

    ⟨v1, v2⟩_{L2} = ∫_Ω v1(x) v2(x) dx   for v1, v2 ∈ V.                    (8)

In an inner product space, we automatically get for free the Cauchy-Schwarz inequality |⟨v1, v2⟩| ≤ ‖v1‖‖v2‖ for the norm induced by the inner product, with equality if and only if v1 = a v2 for some a ∈ R. Applying this to (5), we find that

    |dE(u) · v| ≤ ‖v‖_{L2} ‖u − I − α∆u‖_{L2}                               (9)

or

    |dE(u) · v| / ‖v‖_{L2} ≤ ‖u − I − α∆u‖_{L2},                            (10)

where ‖·‖_{L2} is the norm induced by the L2 inner product. Therefore, we see that the magnitude of the quotient above is maximized (equality in the above expression) when

    v = −(u − I − α∆u),                                                     (11)

up to a positive scalar multiple. We write

    v = −(u − I − α∆u) = −∇E(u);                                            (12)

thus, this justifies calling −∇E(u) the negative gradient direction, and it motivates the formal definition:

Definition 1. Let E : U → R be a functional. Then the gradient of E at u (u ∈ U), denoted ∇E(u) ∈ V, is defined as the perturbation that satisfies

    dE(u) · v = ∫_Ω v(x) ∇E(u)(x) dx = ⟨v, ∇E(u)⟩_{L2}.                     (13)

¹ Recall that ⟨·, ·⟩ is a (real) inner product on a vector space V if the following properties hold:

1. Linearity: ⟨a1 v1 + a2 v2, v⟩ = a1 ⟨v1, v⟩ + a2 ⟨v2, v⟩ for all v1, v2, v ∈ V and a1, a2 ∈ R.
2. Positive-definiteness: ⟨v, v⟩ ≥ 0 for all v ∈ V, and ⟨v, v⟩ = 0 if and only if v = 0.
3. Symmetry: ⟨v1, v2⟩ = ⟨v2, v1⟩.

Note that a norm can be defined from the inner product: ‖v‖ = √⟨v, v⟩.

0.1.3 Gradient Descent PDE

Now the steepest descent algorithm for our functional is

    u^0 = u_guess                                                           (14)
    u^{k+1} = u^k − ∆t ∇E(u^k),                                             (15)

where k represents the iteration number. In continuous time, the gradient descent is the PDE

    u(0, ·) = u_guess,
    u_t(t, ·) = −∇E(u(t, ·)),

and (15) is a discretization where we have used the forward Euler difference approximation:

    u_t(t, ·) = (u(t + ∆t, ·) − u(t, ·))/∆t + O(∆t).                        (16)

For our denoising model, this becomes

    u_t = I − u + α∆u,                                                      (17)

where α > 0.² We assume that we sample the function u on a regular grid with spacing ∆x in the x direction and ∆y in the y direction. We now discretize the PDE.³ Using central differences⁴, we find that

    u(t, x + ∆x, y) = u(t, x, y) + ∂u/∂x (t, x, y) ∆x + (1/2) ∂²u/∂x² (t, x, y) (∆x)² + O((∆x)³)    (18)
    u(t, x − ∆x, y) = u(t, x, y) − ∂u/∂x (t, x, y) ∆x + (1/2) ∂²u/∂x² (t, x, y) (∆x)² + O((∆x)³),   (19)

where O((∆x)³) denotes terms of cubic and higher order in ∆x. Adding the above two equations (and arguing similarly in y), we find that

    ∂²u/∂x² (t, x, y) = [u(t, x + ∆x, y) − 2u(t, x, y) + u(t, x − ∆x, y)] / (∆x)² + O(∆x)           (20)
    ∂²u/∂y² (t, x, y) = [u(t, x, y + ∆y) − 2u(t, x, y) + u(t, x, y − ∆y)] / (∆y)² + O(∆y).          (21)

Setting u^k(x, y) = u(k∆t, x, y), we find that our PDE can be written

    u^{k+1}(x, y) = u^k(x, y) + ∆t [ I(x, y) − u^k(x, y)
        + α (u^k(x + ∆x, y) − 2u^k(x, y) + u^k(x − ∆x, y)) / (∆x)²
        + α (u^k(x, y + ∆y) − 2u^k(x, y) + u^k(x, y − ∆y)) / (∆y)² ]
        + O(∆x∆t) + O(∆y∆t);                                                (22)

thus, we have discretized our PDE. The question remains how to choose ∆t to ensure that the numerical scheme is stable and converges.

² The equation u_t = α∆u is called the heat equation; for Ω = R^n, one can show that u(t, ·) is simply the initial guess u^0 convolved with a Gaussian of standard deviation proportional to √t. Therefore, we can understand that our denoising model will be a combination of smoothing and closeness to the original image I.

³ There is a subtle art in choosing the discretization scheme for a PDE; ultimately, the appropriate numerical scheme requires a full understanding of the behavior of the PDE.

⁴ Generally, central differences are preferred over forward or backward differences because they induce the least error. However, for certain PDEs, choosing central differences can lead to an unstable scheme. Again, the differencing scheme depends on the behavior of the PDE.
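The update (22) translates directly into array operations. The sketch below takes ∆x = ∆y = 1 and implements the Neumann boundary condition ∂u/∂n = 0 by replicating edge values, one common discrete choice; the function names are illustrative.

```python
import numpy as np

def laplacian(u):
    """5-point Laplacian with replicated (Neumann) boundaries, dx = dy = 1."""
    up = np.pad(u, 1, mode="edge")   # edge replication => zero normal derivative
    return (up[2:, 1:-1] + up[:-2, 1:-1] +
            up[1:-1, 2:] + up[1:-1, :-2] - 4.0 * u)

def denoise_step(u, I, alpha, dt):
    """One explicit step of (22): u + dt*(I - u + alpha * Laplacian(u))."""
    return u + dt * (I - u + alpha * laplacian(u))
```

Note that a constant image with u = I is a fixed point, since then I − u = 0 and the Laplacian of a constant vanishes.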


0.1.4 Von Neumann Analysis

We apply Von Neumann analysis to study the stability of the discretization of the PDE; this analysis typically applies to linear PDEs. We compute the Fourier transform in the spatial variable (x, y) of our discretization scheme:

    U^{k+1}(ω1, ω2) = U^k(ω1, ω2) [ 1 − ∆t + α∆t ( (e^{i∆xω1} − 2 + e^{−i∆xω1})/(∆x)² + (e^{i∆yω2} − 2 + e^{−i∆yω2})/(∆y)² ) ] + ∆t I(ω1, ω2),    (23)

where we have noted that the Fourier transform of u^k(x + ∆x, y) is U^k(ω1, ω2) e^{i∆xω1}, U^k denotes the Fourier transform of u^k, and i is the imaginary unit. We define H(ω1, ω2) to be the expression in the brackets above; then

    H(ω1, ω2) = 1 − ∆t − 4α∆t [ sin²(∆xω1/2)/(∆x)² + sin²(∆yω2/2)/(∆y)² ].  (24)

Therefore, (23) becomes

    U^{k+1}(ω1, ω2) = H(ω1, ω2) U^k(ω1, ω2) + ∆t I(ω1, ω2).                 (25)

Writing the above as an explicit formula, we have that

    U^k(ω1, ω2) = (H(ω1, ω2))^k U^0(ω1, ω2) + ∆t I(ω1, ω2) Σ_{i=0}^{k−1} (H(ω1, ω2))^i,    (26)

where (·)^i denotes the ith power. We see that as k → ∞, we must have

    |H(ω1, ω2)| < 1 for all ω1, ω2;                                         (27)

that is,

    1 − ∆t − 4α∆t [ sin²(∆xω1/2)/(∆x)² + sin²(∆yω2/2)/(∆y)² ] < 1          (28)

and

    1 − ∆t − 4α∆t [ sin²(∆xω1/2)/(∆x)² + sin²(∆yω2/2)/(∆y)² ] > −1.        (29)



The first yields that ∆t > 0 (noting that α > 0), and the second yields that

    ∆t < 2 / ( 1 + 4α [ sin²(∆xω1/2)/(∆x)² + sin²(∆yω2/2)/(∆y)² ] )   for all ω1, ω2,    (30)

which means that ∆t must be less than the minimum of the right-hand side over ω1, ω2, or

    ∆t < 2 / ( 1 + 4α [ 1/(∆x)² + 1/(∆y)² ] ).                              (31)

In the case that ∆x = ∆y = 1, we have that

    ∆t < 2 / (1 + 8α).                                                      (32)
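As a sanity check on the analysis (not a proof), one can evaluate |H(ω1, ω2)| from (24) on a grid of frequencies and confirm that the bound (32) separates |H| < 1 from |H| > 1. Here ∆x = ∆y = 1, and all names are illustrative.

```python
import numpy as np

def max_abs_H(dt, alpha):
    """Maximum of |H(w1, w2)| from (24) over sampled frequencies, dx = dy = 1."""
    w = np.linspace(-np.pi, np.pi, 201)   # grid containing the extremes 0, +/-pi
    w1, w2 = np.meshgrid(w, w)
    H = 1.0 - dt - 4.0 * alpha * dt * (np.sin(w1 / 2) ** 2 + np.sin(w2 / 2) ** 2)
    return np.abs(H).max()

alpha = 10.0
dt_bound = 2.0 / (1.0 + 8.0 * alpha)      # the bound (32)
below = max_abs_H(0.9 * dt_bound, alpha)  # step below the bound: amplification < 1
above = max_abs_H(1.1 * dt_bound, alpha)  # step above the bound: amplification > 1
```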

0.1.5 Choosing α

Recall that the original energy from our model is

    E(u) = 1/(2σ_n²) ∫_Ω (I(x) − u(x))² dx + 1/(2σ_p²) ∫_Ω |∇u(x)|² dx,     (33)

where σ_n is the standard deviation of the noise and σ_p is the standard deviation of the derivatives of the true, noise-free image. We scaled the energy by 2σ_n² (which does not affect the minimizing u). Then α was defined as

    α = σ_n²/σ_p² = 1/SNR,                                                  (34)

where SNR can be thought of as a signal-to-noise ratio. We see that α should be chosen larger as the noise increases. In practice, we do not know these parameters, but you may experiment with estimating them from the noisy image. Notice that as α becomes larger (as the image becomes noisier), ∆t becomes smaller and thus the algorithm becomes slower.

0.1.6 Algorithm and Denoising Results

A key point is the initialization. Theoretically, we have shown that the energy is convex and has only global minimizers, and thus the initialization should be irrelevant, since the scheme can only converge to a global minimum. In practice, initialization is still important. For these experiments we choose u^0 = I. Also important is the convergence criterion, which we take to be

    max_{x∈Ω} |u^k(x) − I(x) − α∆u^k(x)| < ε,                               (35)

where ε > 0 is our allowed error. Fig. 1 shows an original image that is then corrupted by iid Gaussian noise. Fig. 2 shows the results of the algorithm constructed.
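Putting the pieces together, one possible implementation of the full scheme, with u^0 = I and the stopping criterion (35), is sketched below, under the same ∆x = ∆y = 1 and replicated-boundary assumptions as before; the step size is set just under the bound (32), and the test image is synthetic.

```python
import numpy as np

def laplacian(u):
    """5-point Laplacian with replicated (Neumann) boundaries, dx = dy = 1."""
    up = np.pad(u, 1, mode="edge")
    return (up[2:, 1:-1] + up[:-2, 1:-1] +
            up[1:-1, 2:] + up[1:-1, :-2] - 4.0 * u)

def denoise(I, alpha, eps=1e-3, max_iters=20000):
    """Gradient descent on E, initialized at u0 = I, stopped by criterion (35)."""
    dt = 1.9 / (1.0 + 8.0 * alpha)   # safely below the stability bound (32)
    u = I.copy()
    for _ in range(max_iters):
        residual = I - u + alpha * laplacian(u)   # equals -grad E(u) pointwise
        if np.abs(residual).max() < eps:          # stopping criterion (35)
            break
        u = u + dt * residual
    return u

# Denoise a noisy constant image; the output should be much flatter.
rng = np.random.default_rng(0)
noisy = 0.5 + 0.1 * rng.standard_normal((32, 32))
clean = denoise(noisy, alpha=10.0)
```

Since the energy is convex, the loop converges to (approximately) the unique minimizer regardless of initialization; starting at u^0 = I simply gives a good warm start.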


Figure 1: Left: an image of the UCLA campus; right: the image corrupted by synthetically generated noise (iid Gaussian, mean zero, with σ² = 0.05).


Figure 2: Left to right, top to bottom: denoising results for the noisy image in Fig. 1 with α = 1, 5, 10, 17, 25, 50, respectively.

