0.1 Review

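[Figure 1 diagram: object(s) in the physical world, with points $(X_i, Y_i, Z_i) \in \mathbb{R}^3$ and irradiance $R(X_i, Y_i, Z_i)$, project through an affine camera onto the imaging plane $\Omega$, giving $u(X_i, Y_i) := R(X_i, Y_i, Z_i)$ for $(X_i, Y_i) \in \Omega$. The observed image is $I = u + \eta$, with noise $\eta(x) \sim N(0, \sigma)$ iid for $x \in \Omega$ and prior $p(u) \propto \exp\left(-\frac{1}{2\sigma_p^2} \int_\Omega |\nabla u(x)|^2 \, dx\right)$.]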

Figure 1: Model of image formation for constructing our denoising algorithm.

Figure 1 shows our model for image formation. Note that, in general, the denoising problem is ill-posed and cannot be solved without knowledge of how the image is formed, how the noise is generated, and a characterization of the possible true images. We use a simplified model of the image formation process in which we assume the following:

• Visible points on the objects project to the imaging plane by the simple projection $(X, Y, Z) \to (X, Y) \in \Omega$. This is true when the object(s) are very far from the imaging plane, and it is a special case of what is known as an affine camera. For more information on the image formation process (which we will see again in more detail when we study image registration), see [2].
• The irradiance $R$, that is, the light incident on the surface, is directly recorded on the imaging plane: $u(X, Y) = R(X, Y, Z)$.
• The error in the CCD device and any other noise or nuisances in the physical world are modeled by additive noise on $u$, and the image we observe is $I = u + \eta$.
• The noise $\eta$ is modeled by iid Gaussian random variables.
• The prior probability on the space of possible true images $u$ is proportional to the exponential of the negative integrated squared norm of the gradient of $u$.

Our goal is to recover $u$ from $I$ using the model above. We note that to design better denoising algorithms, we must improve the model above! Using the above assumptions, the Bayesian approach, and MAP estimation, we derived the following energy to be minimized:

\[
E(u) = -\log p(u|I) = \int_\Omega (I(x) - u(x))^2 \, dx + \alpha \int_\Omega |\nabla u(x)|^2 \, dx, \tag{1}
\]

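where $E : U \to \mathbb{R}$ and

\[
U = \left\{ u \in C^2(\Omega, \mathbb{R}) : \int_\Omega |\nabla u(x)|^2 \, dx < \infty, \ \frac{\partial u}{\partial n}(x) = 0 \text{ for } x \in \partial\Omega \right\}. \tag{2}
\]

We computed the conditions for critical points of the following generalized functional:

\[
E(u) = \int_\Omega F(u(x), \nabla u(x), x) \, dx, \tag{3}
\]

and these are called the Euler-Lagrange equations:

\[
\begin{cases}
F_w(u(x), \nabla u(x), x) - \operatorname{div}\left(F_y(u(x), \nabla u(x), x)\right) = 0 & \text{for } x \in \operatorname{int}(\Omega) \\
\dfrac{\partial u}{\partial n}(x) = 0 & \text{for } x \in \partial\Omega
\end{cases} \tag{4}
\]

Here $F_w$ and $F_y$ denote the partial derivatives of $F(w, y, z)$ with respect to its first and second arguments.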

Note that these conditions are necessary for a local min/max of E, but are not sufficient. Next, we introduce conditions on F that will tell us when the solution to the Euler-Lagrange equations is a global minimizer of E.
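To make the energy (1) concrete, the following is a minimal sketch of how it might be evaluated on a pixel grid. The discretization is an assumption and not part of the notes: forward differences between neighbouring pixels stand in for the gradient, sums times the cell area `h**2` stand in for the integrals, and the name `denoising_energy` and the spacing `h` are chosen here only for illustration.

```python
import numpy as np

def denoising_energy(u, I, alpha, h=1.0):
    """Discrete analogue of E(u) = int (I - u)^2 dx + alpha * int |grad u|^2 dx.

    Assumption: u and I are 2D arrays sampled on a grid with spacing h.
    """
    # Forward differences between neighbouring pixels approximate the gradient.
    # No difference is taken across the image border, which matches a
    # zero-flux (Neumann) boundary condition.
    ux = np.diff(u, axis=1) / h
    uy = np.diff(u, axis=0) / h
    fidelity = np.sum((I - u) ** 2) * h**2
    smoothness = (np.sum(ux**2) + np.sum(uy**2)) * h**2
    return fidelity + alpha * smoothness

# Synthetic example: a square on a dark background plus Gaussian noise.
rng = np.random.default_rng(0)
truth = np.zeros((32, 32))
truth[8:24, 8:24] = 1.0
I = truth + 0.1 * rng.standard_normal(truth.shape)

# Choosing u = I makes the fidelity term vanish but pays for the noisy gradient.
print("E(I)     =", denoising_energy(I, I, alpha=1.0))
print("E(truth) =", denoising_energy(truth, I, alpha=1.0))
```

Comparing the two printed values for different `alpha` shows how the parameter trades fidelity to the observed image against smoothness.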

0.2 Convexity

We introduce a very important property of functionals, convexity, which guarantees that the functional does not have local minima that are not global minima. In addition, convexity leads to many efficient algorithms for minimizing the functional.

Definition 1. A class of functions $\mathcal{F}$ is a convex space if it is a subset of a vector space and for all $t \in [0, 1]$ and $u_1, u_2 \in \mathcal{F}$, we have that $(1 - t)u_1 + tu_2 \in \mathcal{F}$.

Remark 1. Note that $U$ defined in (2) is a convex space. Let's verify. Let $u_1, u_2 \in U$. The second partials of $u_1$ and $u_2$ exist and are continuous by definition, and therefore the function $(1 - t)u_1 + tu_2$ has continuous second partials. Also,

\[
\int_\Omega |\nabla((1 - t)u_1(x) + tu_2(x))|^2 \, dx = (1 - t)^2 \int_\Omega |\nabla u_1(x)|^2 \, dx + t^2 \int_\Omega |\nabla u_2(x)|^2 \, dx + 2t(1 - t) \int_\Omega \nabla u_1(x) \cdot \nabla u_2(x) \, dx. \tag{5}
\]

Recall the inequality $ab \le (a^2 + b^2)/2$ for $a, b \in \mathbb{R}$ (which is derived from $(a - b)^2 \ge 0$). Applying this to each component of the gradients, we find

\[
\int_\Omega \nabla u_1(x) \cdot \nabla u_2(x) \, dx \le \frac{1}{2} \int_\Omega |\nabla u_1(x)|^2 \, dx + \frac{1}{2} \int_\Omega |\nabla u_2(x)|^2 \, dx. \tag{6}
\]

Therefore,

\[
\int_\Omega |\nabla((1 - t)u_1(x) + tu_2(x))|^2 \, dx \le \left((1 - t)^2 + t(1 - t)\right) \int_\Omega |\nabla u_1(x)|^2 \, dx + \left(t^2 + t(1 - t)\right) \int_\Omega |\nabla u_2(x)|^2 \, dx < \infty. \tag{7}
\]

Since the boundary condition $\partial u / \partial n = 0$ is linear in $u$, it also holds for $(1 - t)u_1 + tu_2$. Therefore, $(1 - t)u_1 + tu_2 \in U$, and so $U$ is a convex space.

We are now ready to define a convex functional.

Definition 2. A functional $E : U \to \mathbb{R}$ is a convex functional if $U$ is a convex space and if for all $t \in [0, 1]$ and $u_1, u_2 \in U$ we have that

\[
E((1 - t)u_1 + tu_2) \le (1 - t)E(u_1) + tE(u_2). \tag{8}
\]
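Inequality (8) can be probed numerically for the denoising energy. The sketch below is an illustration under assumptions, not a proof: it uses a 1D discretization with unit grid spacing, and the helper names `energy` and `convexity_gap` are introduced here only for this example. It evaluates the gap between the right-hand and left-hand sides of (8) for random pairs of signals.

```python
import numpy as np

def energy(u, I, alpha):
    # 1D discrete analogue of the denoising energy (unit grid spacing assumed).
    return np.sum((I - u) ** 2) + alpha * np.sum(np.diff(u) ** 2)

def convexity_gap(u1, u2, I, alpha, t):
    """Right-hand side of (8) minus left-hand side; nonnegative if (8) holds."""
    mix = (1 - t) * u1 + t * u2
    rhs = (1 - t) * energy(u1, I, alpha) + t * energy(u2, I, alpha)
    return rhs - energy(mix, I, alpha)

rng = np.random.default_rng(0)
I = rng.standard_normal(50)

# A fresh random pair (u1, u2) is drawn for each value of t in [0, 1].
gaps = [
    convexity_gap(rng.standard_normal(50), rng.standard_normal(50), I, 0.5, t)
    for t in np.linspace(0.0, 1.0, 11)
]

# A small negative tolerance allows for floating-point round-off at t = 0 and t = 1.
print("inequality (8) holds on all samples:", min(gaps) >= -1e-9)
```

A sampled check like this can only fail to find a counterexample; it does not replace the argument given in the text.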

It turns out that testing for convexity of $E$ can be done by just testing convexity of $F$ when $E$ is of the form (3).

Proposition 1. The functional $E$ defined in (3) is convex when $F$ is convex with respect to its first two arguments, i.e., $F(w, y, z)$ is convex in $(w, y)$ for all $z$.

Proof. Convexity of $F$ in $(w, y)$ means that for all $x$,

\[
F((1 - t)u_1(x) + tu_2(x), \nabla((1 - t)u_1(x) + tu_2(x)), x) \le (1 - t)F(u_1(x), \nabla u_1(x), x) + tF(u_2(x), \nabla u_2(x), x). \tag{9}
\]

Integrating the previous inequality over $\Omega$ yields that $E$ is convex.

There are equivalent notions of convexity when the function satisfies additional smoothness assumptions:

Proposition 2. Let $f \in C^2(\mathbb{R}^n, \mathbb{R})$. Then $f$ is convex if and only if either of the following equivalent conditions holds:
1. $f(x) - f(y) \ge \nabla f(y) \cdot (x - y)$ for all $x, y \in \mathbb{R}^n$;
2. the Hessian matrix $Hf(x)$, with entries $(Hf(x))_{ij} = f_{x_i x_j}(x)$, is positive semidefinite for all $x$, that is, $y^T Hf(x) y \ge 0$ for all $y$, where $T$ denotes transpose.

The above is typically convenient for checking whether $F$ is convex, but it only works for smooth functions! We will see an example of a non-smooth $F$ in a few lectures. Let us now check whether $F$ defined for our original denoising model,

\[
F(w, y, z) = (I(z) - w)^2 + \alpha|y|^2, \tag{10}
\]

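leads to a convex functional $E$. Since $F$ is smooth, we compute the second partials with respect to $w, y$:

\[
F_w = 2(w - I(z)), \quad F_{ww} = 2, \quad F_y = 2\alpha y, \quad F_{yy} = 2\alpha \, \mathrm{Id}, \quad F_{yw} = 0, \tag{11}
\]

where $\mathrm{Id}$ is the $2 \times 2$ identity matrix. Therefore,

\[
H_{(w,y)}F(w, y, z) = \begin{pmatrix} 2 & 0 \\ 0 & 2\alpha \, \mathrm{Id} \end{pmatrix}, \tag{12}
\]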

which is positive semidefinite since $\alpha > 0$. Moreover, the Hessian is strictly positive definite, that is, $(w_0, y_0)^T HF(w, y, z)(w_0, y_0) > 0$ for $(w_0, y_0) \ne 0$. So $E$ defined for our denoising model is strongly convex.

Convexity implies that $E$ has only global minima if there are minima at all. Indeed, we have the following result:

Proposition 3. Let $E : U \to \mathbb{R}$ be convex, and suppose that there exists $u \in U$ such that $dE(u) \cdot v = 0$ for all $v \in V$. Then $u$ is a global minimizer of $E$.

Proof. Let $w \in U$, and note that $w - u \in V$. By convexity, for $t \in (0, 1]$ we have

\[
E(u + t(w - u)) \le (1 - t)E(u) + tE(w); \tag{13}
\]

rearranging, we have that

\[
\frac{E(u + t(w - u)) - E(u)}{t} \le E(w) - E(u), \tag{14}
\]

and taking the limit as $t \to 0$, we find

\[
dE(u) \cdot (w - u) \le E(w) - E(u), \tag{15}
\]

but the left-hand side is zero, and so $E(u) \le E(w)$, and this is true for any $w$. Therefore, $u$ is a global minimizer.

Now back to our denoising problem: we know that $E$ is convex and we have found the equation

\[
\begin{cases}
\alpha \Delta u(x) = u(x) - I(x) & x \in \operatorname{int}(\Omega) \\
\dfrac{\partial u}{\partial n}(x) = 0 & x \in \partial\Omega
\end{cases} \tag{16}
\]

that must be solved in order to find a critical point of E. By Proposition 3, if a critical point exists, then it must be a global minimizer. Thus, we simply need to show that (16) has a solution, and it will then automatically be a global minimizer. We will do this next lecture. For more details on convexity of functions, see [1], and for convexity of functionals, see [3].
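As an illustration of the link between (16) and global minimality, the sketch below solves a 1D finite-difference analogue of (16) and checks that randomly perturbed candidates never achieve lower energy. Everything about the discretization is an assumption made for this example: unit grid spacing, the forward-difference matrix `D`, and the helper names `energy` and `solve_denoise_1d`. In this discrete setting a solution exists because the linear system is symmetric positive definite; this does not settle existence for the continuous problem.

```python
import numpy as np

def energy(u, I, alpha):
    # 1D discrete analogue of the denoising energy (unit grid spacing assumed).
    return np.sum((I - u) ** 2) + alpha * np.sum(np.diff(u) ** 2)

def solve_denoise_1d(I, alpha):
    """Solve the discrete analogue of alpha * Laplacian(u) = u - I, zero-flux ends."""
    n = I.size
    D = np.diff(np.eye(n), axis=0)   # (n-1) x n forward-difference matrix, D @ u == np.diff(u)
    L = D.T @ D                      # discrete negative Laplacian with Neumann ends
    # Setting the gradient 2(u - I) + 2*alpha*L@u of the discrete energy to zero
    # gives the linear system (Id + alpha*L) u = I.
    return np.linalg.solve(np.eye(n) + alpha * L, I)

rng = np.random.default_rng(0)
clean = np.where(np.arange(100) < 50, 0.0, 1.0)   # a step signal
I = clean + 0.2 * rng.standard_normal(100)
alpha = 5.0
u = solve_denoise_1d(I, alpha)

# Compare the critical point against randomly perturbed candidates.
trials = [energy(u + 0.1 * rng.standard_normal(100), I, alpha) for _ in range(100)]
print("critical point has the lowest energy:", energy(u, I, alpha) <= min(trials))
```

Larger values of `alpha` give smoother solutions, and a constant input is returned unchanged because the discrete Laplacian annihilates constants.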

References

[1] Stephen Boyd and Lieven Vandenberghe. Convex Optimization. Cambridge University Press, 2004.
[2] Y. Ma, S. Soatto, J. Kosecka, and S. Sastry. An Invitation to 3-D Vision, volume 6. Springer, 2004.
[3] John L. Troutman. Variational Calculus and Optimal Control: Optimization with Elementary Convexity. Springer, 1996.
