EE 396: Lecture 2
Ganesh Sundaramoorthi
Feb. 12, 2011
Last time we constructed, through the Bayesian paradigm and MAP estimation, the following energy to be minimized for the problem of image denoising:
\[
E(u) = -\log p(u \mid I) = \int_\Omega (I(x) - u(x))^2 \, dx + \alpha \int_\Omega |\nabla u(x)|^2 \, dx, \tag{1}
\]
where I : Ω → R is the observed or measured image, and α > 0. The energy E is a functional: it is defined on the class of functions U, which represents the class of denoised (true) images, and maps to the real numbers, i.e., E : U → R. We call such a function that acts on functions a functional. We are going to learn how to minimize such functionals. As we shall see, minimizing such functionals is analogous to minimizing functions of the form f : Rⁿ → R, which you have learned to do in calculus.
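To make the functional concrete, here is a minimal numerical sketch (my own illustration, not part of the lecture) that evaluates a discretized version of the energy (1) on a pixel grid, approximating the integrals by sums over pixels and the gradient by finite differences:

```python
import numpy as np

def denoising_energy(u, I, alpha):
    """Discrete approximation of E(u) = ∫(I-u)^2 dx + α ∫|∇u|^2 dx.

    u, I : 2-D arrays (candidate denoised image and observed image).
    The integrals become sums over pixels, and ∇u is approximated
    with finite differences via np.gradient.
    """
    ux, uy = np.gradient(u)                      # finite-difference gradient of u
    data_term = np.sum((I - u) ** 2)             # fidelity to the observation
    smooth_term = alpha * np.sum(ux**2 + uy**2)  # smoothness regularizer
    return data_term + smooth_term

# Example: taking u = I makes the data term vanish,
# so only the smoothness term of the noisy image remains.
rng = np.random.default_rng(0)
I = rng.standard_normal((32, 32))
print(denoising_energy(I, I, alpha=0.5))
```

A smoother candidate u trades a larger data term for a smaller regularizer; the minimizer balances the two, with α controlling the trade-off.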
0.1 Variational Calculus

0.1.1 A Class of Functionals
We are going to consider a general class of functionals to minimize; the energy (1) falls into this class. We consider the class of functionals, E : U → R, defined by
\[
E(u) = \int_\Omega F(u(x), \nabla u(x), x) \, dx, \tag{2}
\]
where F : R × R² × R² → R. Note that the general form (2) recovers the denoising energy (1) when we choose F to be
\[
F(w, y, z) = (I(z) - w)^2 + \alpha |y|^2. \tag{3}
\]

0.1.2 Assumptions on the Class of Functions, U
For simplicity in the calculations that we are going to do, we will assume additional conditions on the class U; in particular,
\[
U = \Big\{ u \in C^2(\Omega, \mathbb{R}) : \int_\Omega |\nabla u(x)|^2 \, dx < \infty \Big\}. \tag{4}
\]
Above, u ∈ C²(Ω, R) means that all the second partial derivatives, i.e.,
\[
u_{x_1 x_1}(x), \quad u_{x_1 x_2}(x), \quad u_{x_2 x_1}(x), \quad u_{x_2 x_2}(x), \tag{5}
\]
exist and are continuous for every x ∈ Ω. Again, the C² condition is not necessary, but it will make our computations easier.
In addition, we must assume some conditions on the class U at the boundary of Ω, denoted ∂Ω; typically, two types of conditions are employed in these problems. The first is called the Dirichlet condition,
\[
u(x) = g(x), \quad x \in \partial\Omega, \tag{6}
\]
where g : ∂Ω → R is pre-specified. The second is called the Neumann condition,
\[
\frac{\partial u}{\partial n}(x) = g(x), \quad x \in \partial\Omega, \tag{7}
\]
where the derivative above is the directional derivative in the direction of the normal to the boundary. In many cases, we choose g = 0. For simplicity, in this lecture we assume
\[
\frac{\partial u}{\partial n}(x) = 0, \quad x \in \partial\Omega. \tag{8}
\]
Therefore, the class of functions we consider is
\[
U = \Big\{ u \in C^2(\Omega, \mathbb{R}) : \int_\Omega |\nabla u(x)|^2 \, dx < \infty, \ \frac{\partial u}{\partial n}(x) = 0 \text{ for } x \in \partial\Omega \Big\}. \tag{9}
\]

0.1.3 Critical Points of Functionals
In the case of functions of the form f : Rⁿ → R, we locate maxima/minima by computing the gradient of f (provided that f is smooth) and solving ∇f(x) = 0 for x. These points are known as critical points. The condition ∇f(x) = 0 is equivalent to the condition
\[
\frac{d}{dt} f(x + tv) \Big|_{t=0} = 0, \quad \forall v \in \mathbb{R}^n, \tag{10}
\]
that is, the directional derivative in all directions is zero (provided f is smooth). The vectors v above are known as perturbations.

The case of functionals E is analogous. We first calculate the permissible perturbations, which we denote V, of the space U; that is, we ask which properties of v ensure that, for small t, u + tv ∈ U for all u ∈ U. As you may verify, the condition is simply v ∈ U. Note that the permissible perturbations of a function space do not always coincide with the function space itself. For example, if we replaced the boundary condition above with ∂u/∂n = g for g ≠ 0, the permissible perturbations would differ from the function space. We are ready to define critical points for functionals:

Definition 1. Let E : U → R be a functional and V be the space of permissible perturbations of U. A function u ∈ U is a critical point of E if
\[
\frac{d}{dt} E(u + tv) \Big|_{t=0} = 0, \quad \forall v \in V. \tag{11}
\]
We denote the left-hand side by dE(u) · v, the directional ("Gâteaux") derivative of E at the function u in the direction v.
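The directional derivative in Definition 1 can be checked numerically. The sketch below (my own illustration, using the discretized denoising energy as an assumed stand-in for E) approximates d/dt E(u + tv) at t = 0 by a central difference and compares it with the directional derivative computed analytically; since the discrete energy is quadratic in u, the two agree up to floating-point error:

```python
import numpy as np

def energy(u, I, alpha):
    # Discrete analogue of (1): squared residual plus
    # alpha times the squared finite-difference gradient.
    ux, uy = np.gradient(u)
    return np.sum((I - u) ** 2) + alpha * np.sum(ux**2 + uy**2)

def gateaux(u, v, I, alpha, t=1e-4):
    # Directional (Gâteaux) derivative d/dt E(u + t v) at t = 0,
    # approximated by a central difference in t.
    return (energy(u + t * v, I, alpha) - energy(u - t * v, I, alpha)) / (2 * t)

rng = np.random.default_rng(1)
I = rng.standard_normal((16, 16))
u = rng.standard_normal((16, 16))
v = rng.standard_normal((16, 16))   # an arbitrary perturbation
alpha = 0.5

# Analytic directional derivative of the discrete quadratic energy:
# dE(u)·v = Σ 2(u - I) v + 2α Σ ∇u · ∇v.
ux, uy = np.gradient(u)
vx, vy = np.gradient(v)
analytic = np.sum(2 * (u - I) * v) + 2 * alpha * np.sum(ux * vx + uy * vy)
print(np.isclose(gateaux(u, v, I, alpha), analytic))
```

At a critical point, this quantity would vanish for every perturbation v; for a generic u it does not, which is exactly why we solve dE(u) · v = 0 for u.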
0.1.4 Euler-Lagrange Equation
Let us now compute the critical points of E defined in (2). To do so, we compute the Gâteaux derivative of E:
\[
dE(u) \cdot v = \frac{d}{dt} E(u + tv) \Big|_{t=0} = \frac{d}{dt} \int_\Omega F(u(x) + t v(x), \nabla u(x) + t \nabla v(x), x) \, dx \Big|_{t=0}. \tag{12}
\]
Now we want to bring the derivative inside the integral, which can only be done in some cases. We assume that it can be done; for those who are interested, the conditions needed on F to exchange the derivative and the integral are derived using the Lebesgue Dominated Convergence Theorem, which we do not discuss here¹. Therefore,
\begin{align}
dE(u) \cdot v &= \int_\Omega \frac{d}{dt} F(u(x) + t v(x), \nabla u(x) + t \nabla v(x), x) \Big|_{t=0} \, dx \tag{13} \\
&= \int_\Omega \Big[ F_w(u(x) + t v(x), \nabla u(x) + t \nabla v(x), x) \, v(x) \tag{14} \\
&\qquad + F_y(u(x) + t v(x), \nabla u(x) + t \nabla v(x), x) \cdot \nabla v(x) \Big] \Big|_{t=0} \, dx \tag{15} \\
&= \int_\Omega F_w(u(x), \nabla u(x), x) \, v(x) + F_y(u(x), \nabla u(x), x) \cdot \nabla v(x) \, dx. \tag{16}
\end{align}
Recall now the multivariable version of the integration by parts formula (see [2] for more details)²:
\[
\int_\Omega U(x) \cdot \nabla v(x) \, dx + \int_\Omega \mathrm{div}(U(x)) \, v(x) \, dx = \int_{\partial\Omega} U(x) \cdot N(x) \, v(x) \, dS(x), \tag{17}
\]
where N denotes the outward normal to Ω, U : Rⁿ → Rⁿ, v : Rⁿ → R, dS(x) denotes the surface area element on ∂Ω, and
\[
\mathrm{div}\, f(x) := \sum_{i=1}^{n} \frac{\partial f_i}{\partial x_i}(x). \tag{18}
\]
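As a sanity check on the integration by parts formula, here is a small numerical sketch (my own illustration, not from the lecture) of its one-dimensional analogue, ∫₀¹ U v′ dx + ∫₀¹ U′ v dx = U(1)v(1) − U(0)v(0), where the boundary integral degenerates to endpoint evaluations:

```python
import numpy as np

def trap(f, x):
    # Composite trapezoid rule (written out for portability).
    return np.sum((f[1:] + f[:-1]) * np.diff(x)) / 2.0

# 1-D analogue of (17) on Ω = [0, 1] with arbitrary smooth U and v.
x = np.linspace(0.0, 1.0, 20001)
U = np.sin(2 * np.pi * x) + x        # a scalar "field" in 1-D
v = np.cos(np.pi * x)                # a test function

Up = np.gradient(U, x)               # U'
vp = np.gradient(v, x)               # v'

lhs = trap(U * vp, x) + trap(Up * v, x)
rhs = U[-1] * v[-1] - U[0] * v[0]    # boundary term (the "surface integral")
print(abs(lhs - rhs) < 1e-3)
```

Note that if v vanishes at both endpoints, the right-hand side is zero, which is precisely how the boundary term is killed in the derivation that follows.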
Applying integration by parts, we find
\begin{align}
dE(u) \cdot v &= \int_\Omega \big[ F_w(u(x), \nabla u(x), x) - \mathrm{div}(F_y(u(x), \nabla u(x), x)) \big] v(x) \, dx \tag{19} \\
&\qquad + \int_{\partial\Omega} F_y(u(x), \nabla u(x), x) \cdot N(x) \, v(x) \, dS(x). \tag{20}
\end{align}
Now if we only consider v ∈ V such that v(x) = 0 for x ∈ ∂Ω, then the boundary term is zero, and
\[
dE(u) \cdot v = \int_\Omega \big[ F_w(u(x), \nabla u(x), x) - \mathrm{div}(F_y(u(x), \nabla u(x), x)) \big] v(x) \, dx. \tag{21}
\]
Now if we want to solve for critical points of the functional E, we must solve for the u ∈ U such that dE(u) · v = 0 for all v ∈ V satisfying v(x) = 0 for x ∈ ∂Ω. We use the following theorem:

¹ The condition is that the partials of F are bounded by a function that is absolutely integrable.
² The formula is equivalent to ∫_Ω div(U(x) v(x)) dx = ∫_{∂Ω} U(x) · N(x) v(x) dS(x), which is simply the Divergence Theorem.
Theorem 1. Let f ∈ C²(Ω, R) and suppose
\[
\int_\Omega f(x) v(x) \, dx = 0 \tag{22}
\]
for all v ∈ C²(Ω, R) such that v(x) = 0 and ∂v/∂n(x) = 0 for x ∈ ∂Ω. Then f(x) = 0 for all x ∈ Ω.

We do not prove this theorem. For those interested, the idea is to choose v to be a bump function that peaks and is narrowly concentrated at a point x₀ ∈ int(Ω), and dies down to zero at ∂Ω. For this choice of v, using (22), we find that f(x₀) = 0. But then you can choose v to peak at another point x₁, and then find f(x₁) = 0, and so on. This gives f(x) = 0 for all x ∈ Ω.

Applying the previous theorem, we find that if dE(u) · v = 0 for all v ∈ C²(Ω, R) such that v(x) = 0 and ∂v/∂n(x) = 0 for x ∈ ∂Ω, then u ∈ U must satisfy
\[
F_w(u(x), \nabla u(x), x) - \mathrm{div}(F_y(u(x), \nabla u(x), x)) = 0. \tag{23}
\]
The equation above is called the Euler-Lagrange equation for E. Therefore, we attempt to solve the equations
\[
\begin{cases}
F_w(u(x), \nabla u(x), x) - \mathrm{div}(F_y(u(x), \nabla u(x), x)) = 0 & x \in \mathrm{int}(\Omega) \\
\dfrac{\partial u}{\partial n}(x) = 0 & x \in \partial\Omega
\end{cases} \tag{24}
\]
If a solution (or solutions) exists, then we have found a critical point of the functional E.

Let us compute the above equations for the special case of (3). Then
\[
F_w(w, y, z) = 2(w - I(z)), \qquad F_y(w, y, z) = 2\alpha y, \tag{25}
\]
and then
\[
F_w(u(x), \nabla u(x), x) = 2(u(x) - I(x)), \qquad F_y(u(x), \nabla u(x), x) = 2\alpha \nabla u(x), \tag{26}
\]
and finally
\[
\mathrm{div}(F_y(u(x), \nabla u(x), x)) = \mathrm{div}(2\alpha \nabla u(x)) = 2\alpha \Delta u(x), \tag{27}
\]
where
\[
\Delta u(x) = \sum_{i=1}^{n} \frac{\partial^2 u}{\partial x_i^2}(x) \tag{28}
\]
is the Laplacian of u. Let us now write down the Euler-Lagrange equations:
\[
\begin{cases}
\alpha \Delta u(x) = u(x) - I(x) & x \in \mathrm{int}(\Omega) \\
\dfrac{\partial u}{\partial n}(x) = 0 & x \in \partial\Omega
\end{cases} \tag{29}
\]
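Although numerical methods are the subject of a later lecture, a minimal sketch of one way to solve (29) may be helpful here. This is my own illustration, assuming a standard 5-point finite-difference Laplacian and a simple Jacobi fixed-point iteration, with the Neumann condition imposed by replicating edge pixels:

```python
import numpy as np

def denoise(I, alpha, iters=500):
    """Solve u - alpha*Laplacian(u) = I with Neumann BC by Jacobi iteration.

    Discretizing (29) with a 5-point Laplacian gives, at each pixel,
    (1 + 4*alpha) u_ij = I_ij + alpha * (sum of the 4 neighbors);
    the zero-normal-derivative condition is imposed by edge replication.
    """
    u = I.copy()
    for _ in range(iters):
        p = np.pad(u, 1, mode='edge')  # replicate edges: du/dn = 0
        nbrs = p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
        u = (I + alpha * nbrs) / (1 + 4 * alpha)
    return u

# A constant image is a fixed point: Delta(u) = 0 and u = I solve (29).
I = np.full((16, 16), 3.0)
u = denoise(I, alpha=1.0, iters=50)
print(np.allclose(u, 3.0))  # True
```

Larger α weights the Laplacian term more heavily, yielding a smoother u; as α → 0 the update reduces to u = I, i.e., no denoising.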
In the subsequent lecture, we discuss whether a solution of this equation exists and, if so, numerical methods to solve the equation. Indeed, we will find that there is a solution. If we know a solution exists, we only know that the solution is a critical point of E. We would like to know whether that critical point is a global minimizer, i.e., whether the u that solves the Euler-Lagrange equation achieves the lowest value of E. In general, this is a very difficult question, but in the next section we introduce conditions on F that guarantee that the solution of the Euler-Lagrange equation is a global minimum. Let us recall for clarity the definitions of global and local minima:

Definition 2.
• A function u₀ ∈ U is a global minimum of E if for all u ∈ U, we have E(u₀) ≤ E(u).
• A function u₀ ∈ U is a local minimum of E if there exists ε > 0 such that for all u ∈ U satisfying ‖u − u₀‖ < ε, we have E(u₀) ≤ E(u).

Note that ‖·‖ denotes a norm chosen on U, and the definition of local minimum depends on this choice of norm. For more details on the Euler-Lagrange equation and variational calculus, see [3].
References

[1] Stephen Boyd and Lieven Vandenberghe. Convex Optimization. Cambridge University Press, 2004.

[2] Lawrence C. Evans. Partial Differential Equations, volume 19 of Graduate Studies in Mathematics. American Mathematical Society, 1997.

[3] John L. Troutman. Variational Calculus and Optimal Control: Optimization with Elementary Convexity. Springer, 1996.