EE 396: Lecture 9

Ganesh Sundaramoorthi

March 12, 2011

0.1 Dictionaries

The work [4] has looked at the statistics of image patches in natural images (which they define to be images of the visual environment that do not show artifacts of civilization, e.g., scenes of rocks, mountains, ocean, etc., but not televisions, buildings, cars, etc.), and they have shown that each patch in such images is a linear combination of a small number of patches chosen from a "dictionary" of "atomic" patches. Indeed, they find that this dictionary consists of patches that resemble Gabor filters, which are similar to the receptive fields in the visual cortex of the human brain. Following this observation, we assume that patches of images of natural scenes (whether or not they contain artifacts of civilization) are formed as a linear combination of a small number of atomic patches chosen from a dictionary of these atoms; we will not assume that the dictionary is composed of Gabor filters¹. We will incorporate this information to create a new prior on $u \in \mathcal{U}$, the class of true images.

Let us assume that $\Omega$, the domain of the image, is rectangular and is the disjoint union of $N_p = N_{p_1} \times N_{p_2}$ identical patches, $\Omega_p = [0, L_{p_1}] \times [0, L_{p_2}]$ (then $\Omega = [0, L_{p_1} N_{p_1}] \times [0, L_{p_2} N_{p_2}]$). Let also $u_i : \Omega_p \to \mathbb{R}$ for $i = 1, \ldots, N_p$ denote the $i$th patch of $u : \Omega \to \mathbb{R}$:
\[
u(x) = u_i(\mathrm{mod}(x_1, L_{p_1}), \mathrm{mod}(x_2, L_{p_2})) \quad \text{if } x = (x_1, x_2) \text{ is in the } i\text{th patch}, \tag{1}
\]
where mod denotes the modulus. Let us denote by $d_i : \Omega_p \to \mathbb{R}$ the $i$th dictionary atom, where $i = 1, \ldots, N_d$ (there are $N_d$ dictionary elements). We set $D(x) = (d_1(x), \ldots, d_{N_d}(x))$ (that is, $D : \Omega_p \to \mathbb{R}^{1 \times N_d}$ is a row vector).

1. We assume that each patch of $u$ is modeled as
\[
u_i(x) = \sum_{j=1}^{N_d} \alpha_{ij} d_j(x) + \eta_i(x) = D(x)\alpha_i + \eta_i(x) \tag{2}
\]

where $\eta_i : \Omega_p \to \mathbb{R}$ models noise, and $\alpha_i \in \mathbb{R}^{N_d}$.

2. Most of the components of $\alpha_i$ are zero; indeed, we assume that
\[
p(\alpha_i) \propto \exp\left(-\mu_i \|\alpha_i\|_{\ell_0}\right) \tag{3}
\]

¹Note that this follows an idea similar to that of our last lecture on image self-similarity: the image is composed of repeated atoms.


where $\|\cdot\|_{\ell_0}$ denotes the $\ell_0$ norm, that is, the number of non-zero elements of $\alpha_i$:
\[
\|\alpha_i\|_{\ell_0} = \sum_{j=1}^{N_d} \left(1 - \delta(\alpha_{ij})\right) \tag{4}
\]

where $\delta(x) = 1$ if $x = 0$ and $\delta(x) = 0$ if $x \neq 0$. Further, $\alpha_i$ and $\alpha_j$ are independent for $i \neq j$ (if $i$ and $j$ are nearby patches they may in fact be correlated; however, we make the assumption that they are not).

3. We assume that $\eta_i(x) \sim \mathcal{N}(0, \sigma_p^2)$ and is iid in both $x \in \Omega_p$ and $i = 1, \ldots, N_p$.

4. Both $\eta_i$ and $\alpha_j$ are independent for each $i$ and $j$.

Using the above assumptions, we calculate $p(u)$, our prior probability:
\begin{align}
p(u) &= p(u_1, \ldots, u_{N_p}) \tag{5} \\
&= \prod_{i=1}^{N_p} p(u_i) \quad \text{(the stochastic components of $u_i$, namely $\alpha_i$ and $\eta_i$, are independent across $i$)} \tag{6} \\
&= \prod_{i=1}^{N_p} p(\alpha_i, \eta_i) \quad \text{(the only stochastic components of $u_i$ are $\alpha_i$, $\eta_i$)} \tag{7} \\
&= \prod_{i=1}^{N_p} p(\alpha_i)\, p(\eta_i = u_i - D\alpha_i) = \prod_{i=1}^{N_p} \exp\left(-\mu_i \|\alpha_i\|_{\ell_0}\right) \exp\left(-\frac{1}{2\sigma_p^2} \int_{\Omega_p} (u_i(x) - D(x)\alpha_i)^2 \, dx\right) \tag{8} \\
&= \exp\left(-\sum_{i=1}^{N_p} \left[\mu_i \|\alpha_i\|_{\ell_0} + \frac{1}{2\sigma_p^2} \int_{\Omega_p} (u_i(x) - D(x)\alpha_i)^2 \, dx\right]\right). \tag{9}
\end{align}
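To make the patch model concrete, here is a minimal numerical sketch (our own illustration, not part of the lecture) of the generative model (2) and the prior energy $-\log p(u)$ from (9), up to constants. A random dictionary stands in for $D$, the sizes are hypothetical, and the integrals over $\Omega_p$ become sums over pixels.

```python
import numpy as np

# Illustrative sketch: discretize the generative model (2) and evaluate the
# prior energy -log p(u) from (9). All names and sizes are our own choices.
rng = np.random.default_rng(0)
Lp1, Lp2, Nd, Np = 8, 8, 16, 100          # patch size, # atoms, # patches
D = rng.standard_normal((Lp1 * Lp2, Nd))  # columns = vectorized atoms d_j
D /= np.linalg.norm(D, axis=0)            # normalize atoms, ||d_j|| = 1

# Sparse codes: each alpha_i has only a few non-zero entries (assumption 2).
alpha = np.zeros((Nd, Np))
for i in range(Np):
    support = rng.choice(Nd, size=3, replace=False)
    alpha[support, i] = rng.standard_normal(3)

sigma_p, mu = 0.05, 1.0
eta = sigma_p * rng.standard_normal((Lp1 * Lp2, Np))  # noise (assumption 3)
U = D @ alpha + eta                                   # u_i = D alpha_i + eta_i

def prior_energy(U, D, alpha, mu, sigma_p):
    """-log p(u) from (9), up to an additive constant: sparsity + residual."""
    l0 = np.count_nonzero(alpha, axis=0).sum()        # sum_i ||alpha_i||_0
    resid = ((U - D @ alpha) ** 2).sum()              # sum_i |u_i - D alpha_i|^2
    return mu * l0 + resid / (2 * sigma_p ** 2)

print(prior_energy(U, D, alpha, mu, sigma_p))
```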

0.2 Energy to be Optimized

Now, using our additive noise model with Gaussian noise, we find that
\[
E(u) = -\log p(u|I) = \lambda \int_\Omega (I(x) - u(x))^2 \, dx + \sum_{i=1}^{N_p} \mu_i \|\alpha_i\|_{\ell_0} + \sum_{i=1}^{N_p} \int_{\Omega_p} (u_i(x) - D(x)\alpha_i)^2 \, dx \tag{10}
\]

(after an appropriate scaling by a constant, and $\lambda > 0$). This is the model considered by [2]. Note that if we had the $\alpha_i$ and the dictionary were known, we could minimize in $u$ to obtain a MAP estimate for $u$, which would then be our denoised image; however, we do not know the $\alpha_i$! Indeed, we must optimize $E$ in both $u$ and $\alpha_i$.

Now, the previous reasoning assumes that we have a fixed pre-defined dictionary. What do we choose for the dictionary? Following the reasoning of [4], we could choose a Gabor dictionary. However, the brilliant idea of [2] is to choose a dictionary tailored to the image itself, and to suppose that the dictionary can be determined from the noisy image itself. That is, they propose to also solve for the dictionary elements as part of the optimization of $E$². Therefore, the energy to be optimized is

\[
E(u, \{\alpha_i\}_{i=1}^{N_p}, \{d_i\}_{i=1}^{N_d}) = \lambda \int_\Omega (I(x) - u(x))^2 \, dx + \sum_{i=1}^{N_p} \mu_i \|\alpha_i\|_{\ell_0} + \sum_{i=1}^{N_p} \int_{\Omega_p} (u_i(x) - D(x)\alpha_i)^2 \, dx. \tag{11}
\]
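As a sanity check of the discrete form of (11), the following sketch (again our own, under our own discretization) tiles the image domain into non-overlapping patches and evaluates the three terms; `patchify`, `energy`, and the scalar `mu` (in place of the per-patch $\mu_i$) are our choices, not notation from [2].

```python
import numpy as np

def patchify(img, Lp):
    """Return an (Lp*Lp, Np) matrix whose columns are the vectorized patches u_i,
    tiling the domain Omega by non-overlapping Lp x Lp patches (disjoint union)."""
    H, W = img.shape
    cols = [img[r:r + Lp, c:c + Lp].ravel()
            for r in range(0, H, Lp) for c in range(0, W, Lp)]
    return np.stack(cols, axis=1)

def energy(I, u, D, alpha, lam, mu, Lp):
    """Discrete version of E(u, {alpha_i}, {d_i}) in (11)."""
    data = lam * ((I - u) ** 2).sum()                 # lambda * int (I - u)^2
    l0 = mu * np.count_nonzero(alpha, axis=0).sum()   # sum_i mu_i ||alpha_i||_0
    U = patchify(u, Lp)                               # columns are the u_i
    fit = ((U - D @ alpha) ** 2).sum()                # sum_i int (u_i - D alpha_i)^2
    return data + l0 + fit
```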

0.3 Optimizing the Energy

0.3.1 Updating u

We may define
\[
\tilde{D}(x) = D(\mathrm{mod}(x_1, L_{p_1}), \mathrm{mod}(x_2, L_{p_2})), \quad x \in \Omega, \tag{12}
\]
which is simply the dictionary extended to all of $\Omega$. We shall also define $\chi_i : \Omega \to \{0, 1\}$ as
\[
\chi_i(x) = \begin{cases} 1 & x \text{ is in the } i\text{th patch of } \Omega \\ 0 & x \text{ is not in the } i\text{th patch of } \Omega \end{cases}. \tag{13}
\]

We may then write the energy as
\[
E(u, \{\alpha_i\}_{i=1}^{N_p}, \{d_i\}_{i=1}^{N_d}) = \lambda \int_\Omega (I(x) - u(x))^2 \, dx + \sum_{i=1}^{N_p} \mu_i \|\alpha_i\|_{\ell_0} + \int_\Omega \Big(u(x) - \sum_{i=1}^{N_p} \tilde{D}(x)\alpha_i \chi_i(x)\Big)^2 dx. \tag{14}
\]

We are now going to devise an iterative update scheme where we update each of $u$, $\alpha$, $D$, assuming that we are given some estimate of the others. Let us first assume that we have an estimate of $\alpha$ and $D$, and we now derive a scheme to estimate $u$. To do so, we compute the Euler-Lagrange equations with respect to $u$; only the first and last terms of the energy depend on $u$. Thus, the Euler-Lagrange equations are:
\[
\lambda(u(x) - I(x)) + u(x) - \sum_{i=1}^{N_p} \tilde{D}(x)\alpha_i \chi_i(x) = 0, \tag{15}
\]
which yields:
\[
u(x) = \frac{\lambda}{1+\lambda} I(x) + \frac{1}{1+\lambda} \sum_{i=1}^{N_p} \tilde{D}(x)\alpha_i \chi_i(x), \tag{16}
\]

that is, $u$ is a weighted average of the (noisy) image and the current dictionary approximation of the noisy image.

²Note that one must be very careful, because if we solve for $u$, $\alpha_i$, $D$ jointly, then a trivial solution would be to choose $u_i$ to be the $i$th patch of $I$, choose $\alpha_{ii} = 1$ and $\alpha_{ij} = 0$ for $j \neq i$, and the dictionary $D$ to be the patches of $I$, in which case our energy is minimized; however, our image is not denoised! The trick is to have a prior on the number of dictionary elements; therefore, $N_d$ will have to be chosen small compared to the number of patches $N_p$, and this will result in a non-trivial solution.
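A minimal sketch of the update (16), under the same discretization as above: the tiled dictionary approximation $\tilde{D}\alpha$ is rebuilt into an image and blended pointwise with $I$. The `unpatchify` helper is our own (the inverse of the `patchify` helper in the earlier sketch).

```python
import numpy as np

def unpatchify(U, shape, Lp):
    """Inverse of patchify: place the columns of U back into an image of `shape`."""
    H, W = shape
    img = np.empty((H, W))
    k = 0
    for r in range(0, H, Lp):
        for c in range(0, W, Lp):
            img[r:r + Lp, c:c + Lp] = U[:, k].reshape(Lp, Lp)
            k += 1
    return img

def update_u(I, D, alpha, lam, Lp):
    """Closed-form minimizer (16): pointwise blend of I and the patch model."""
    approx = unpatchify(D @ alpha, I.shape, Lp)   # sum_i D~(x) alpha_i chi_i(x)
    return (lam * I + approx) / (1.0 + lam)
```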


0.3.2 Updating $\alpha_i$

We now assume that we have an estimate of $u$, $D$, and we update $\alpha_i$. Note that only the last two terms of the energy depend on $\alpha_i$, and therefore we minimize
\[
\arg\min_{\{\alpha_i\}_{i=1}^{N_p}} \sum_{i=1}^{N_p} \left( \mu_i \|\alpha_i\|_{\ell_0} + \int_{\Omega_p} (u_i(x) - D(x)\alpha_i)^2 \, dx \right) = \sum_{i=1}^{N_p} \arg\min_{\alpha_i} \left( \mu_i \|\alpha_i\|_{\ell_0} + \int_{\Omega_p} (u_i(x) - D(x)\alpha_i)^2 \, dx \right); \tag{17}
\]

that is, the problem splits into $N_p$ independent problems. The problem above is known as least squares with a sparsity penalty. It has been of much recent interest in many applications, including compressive sensing. There are many numerical methods for solving this problem, but we will just look at one, the matching pursuit algorithm [3], perhaps one of the simplest³. The matching pursuit algorithm is as follows. We assume that $\|d_i\|_{L^2} = 1$. Let $R^0 = u_i$ (the residual) and $u_i^0 = 0$ (the approximation of $u_i$ using dictionary elements). Then the matching pursuit algorithm works by iterating the following steps:

1. $i_k = \arg\max_{i=1,\ldots,N_d} \left| \langle R^k, d_i \rangle_{L^2} \right|$

2. $u_i^k = u_i^{k-1} + \langle R^k, d_{i_k} \rangle_{L^2}\, d_{i_k} = \sum_{l=1}^{k} \langle R^l, d_{i_l} \rangle_{L^2}\, d_{i_l}$

3. $R^{k+1} = R^k - \langle R^k, d_{i_k} \rangle_{L^2}\, d_{i_k}$

4. Note that $\alpha_{ij}^k = \sum_{l=1}^{k} \delta_{i_l, j} \langle R^l, d_{i_l} \rangle_{L^2}$ (the $k$th approximation of $\alpha_{ij}$).

The algorithm stops when the desired level of sparsity is reached (the maximum number of non-zero entries of $\alpha_i^k$ is reached) or when some other convergence criterion is met. A modified version of this algorithm, known as orthogonal matching pursuit [5], has better convergence properties. A sketch of matching pursuit is given below.
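The following is our own illustrative implementation of steps 1-4 for a single vectorized patch, assuming unit-norm columns of $D$; it is a sketch, not code from [3]. It runs a fixed number of greedy steps, which upper-bounds the number of non-zero coefficients.

```python
import numpy as np

def matching_pursuit(ui, D, max_steps=3, tol=1e-8):
    """Greedy matching pursuit sketch for one patch: u_i ~ D @ alpha_i.
    D has unit-norm columns (||d_j|| = 1); stops after `max_steps` greedy
    steps or when the residual correlation is negligible."""
    Nd = D.shape[1]
    alpha = np.zeros(Nd)
    R = ui.copy()                         # R^0 = u_i
    for _ in range(max_steps):
        corr = D.T @ R                    # <R^k, d_j> for all atoms
        j = np.argmax(np.abs(corr))       # step 1: best-matching atom
        if abs(corr[j]) < tol:
            break
        alpha[j] += corr[j]               # step 4: accumulate coefficient
        R = R - corr[j] * D[:, j]         # step 3: update the residual
    return alpha
```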

0.3.3 Updating the dictionary

We are going to define $U(x) = (u_1(x), \ldots, u_{N_p}(x))$, where $U : \Omega_p \to \mathbb{R}^{1 \times N_p}$. Further, we let $\alpha = (\alpha_1, \ldots, \alpha_{N_p}) \in \mathbb{R}^{N_d \times N_p}$. Then we see that
\[
|U(x) - D(x)\alpha|^2 = \sum_{i=1}^{N_p} (u_i(x) - D(x)\alpha_i)^2, \quad x \in \Omega_p. \tag{18}
\]

Therefore,
\[
\|U - D\alpha\|_{L^2}^2 = \sum_{i=1}^{N_p} \int_{\Omega_p} (u_i(x) - D(x)\alpha_i)^2 \, dx, \tag{19}
\]
where recall that
\[
\|U\|_{L^2}^2 = \int_{\Omega_p} |U(x)|^2 \, dx. \tag{20}
\]

³Note that if the $d_i$ were orthonormal, then the problem would be trivial, because then we would choose the components of $\alpha_i$ corresponding to the few largest values of $|\langle d_j, u_i \rangle_{L^2}|$ and set the rest of the components of $\alpha_i$ to zero. However, the dictionary will not be orthonormal and may in fact be redundant.


Therefore, to minimize the energy of interest over $D$, while having an estimate of $\alpha$, $u$, we are left with minimizing
\[
\arg\min_D \|U - D\alpha\|_{L^2}^2. \tag{21}
\]

To do this, we are going to write
\[
\|U - D\alpha\|_{L^2}^2 = \Big\|U - \sum_{i=1}^{N_d} d_i \alpha^i\Big\|_{L^2}^2 = \Big\|U - \sum_{i=1, i\neq k}^{N_d} d_i \alpha^i - d_k \alpha^k\Big\|_{L^2}^2 = \|E_k - d_k \alpha^k\|_{L^2}^2 \tag{22}
\]

where $\alpha^i$ is the $i$th row of the matrix $\alpha$, and
\[
E_k(x) = U(x) - \sum_{i=1, i\neq k}^{N_d} d_i(x)\, \alpha^i, \quad x \in \Omega_p. \tag{23}
\]

We are now going to assume that we have estimates of $d_i$ for $i \neq k$, and we are going to minimize $\|E_k - d_k \alpha^k\|_{L^2}^2$ over $d_k$.

0.3.4 SVD

To do this, we recall the singular value decomposition (SVD) of a matrix. Let $A \in \mathbb{R}^{m \times n}$; then we can decompose the matrix into the form
\[
A = W \Sigma V^T = \sum_i \sigma_i w_i v_i^T \tag{24}
\]
where $\Sigma \in \mathbb{R}^{m \times n}$ is diagonal, i.e.,
\[
\Sigma_{ij} = \begin{cases} \sigma_i & i = j \\ 0 & i \neq j \end{cases}, \tag{25}
\]
and $W = (w_1, \ldots, w_m) \in \mathbb{R}^{m \times m}$ and $V = (v_1, \ldots, v_n) \in \mathbb{R}^{n \times n}$ are orthogonal, i.e., $W^T W = \mathrm{Id}_{m \times m}$ and $V^T V = \mathrm{Id}_{n \times n}$. The $\sigma_i$ are called the singular values of $A$. Further,
\[
\|A\|_F^2 := \sum_{i,j} A_{i,j}^2 = \sum_i \sigma_i^2. \tag{26}
\]

Note that $\{z_{ij} = w_i v_j^T\}_{i,j}$ form an orthonormal basis with respect to the inner product
\[
\langle z_1, z_2 \rangle = \mathrm{tr}(z_1^T z_2); \tag{27}
\]
indeed,
\[
\mathrm{tr}\big((w_{i_1} v_{j_1}^T)^T w_{i_2} v_{j_2}^T\big) = \mathrm{tr}\big(v_{j_1} w_{i_1}^T w_{i_2} v_{j_2}^T\big) = \delta_{i_1 i_2} \mathrm{tr}\big(v_{j_1} v_{j_2}^T\big) = \delta_{i_1 i_2}\, v_{j_1}^T v_{j_2} = \delta_{i_1 i_2} \delta_{j_1 j_2}; \tag{28}
\]
thus, $\{w_i v_j^T\}_{i,j}$ form an orthonormal basis.
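A quick numerical check of (24) and (26) using numpy's SVD (purely illustrative; the matrix is arbitrary):

```python
import numpy as np

A = np.random.default_rng(1).standard_normal((6, 4))
W, s, Vt = np.linalg.svd(A)                       # A = W diag(s) V^T

# Rebuild A as a sum of rank-one terms sigma_i w_i v_i^T, as in (24).
A_rebuilt = sum(s[i] * np.outer(W[:, i], Vt[i]) for i in range(len(s)))
assert np.allclose(A, A_rebuilt)

# Frobenius norm identity (26): ||A||_F^2 = sum_i sigma_i^2.
assert np.isclose((A ** 2).sum(), (s ** 2).sum())
```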


0.3.5 Continuing with Dictionary Optimization

We decompose $E_k$ using the singular value decomposition:
\[
E_k(x) = \sum_i \sigma_i w_i(x) v_i^T; \tag{29}
\]

then the problem is to minimize
\[
\Big\| \sum_i \sigma_i w_i v_i^T - d_k \alpha^k \Big\|_{L^2}^2 \tag{30}
\]
over $\alpha^k$, $d_k$. Now suppose that we were to solve for the product $d_k \alpha^k$ instead of just $d_k$. Either $d_k \alpha^k \in \mathrm{span}\{w_i v_i^T\}$ or $d_k \alpha^k$ is in the orthogonal complement of $\mathrm{span}\{w_i v_i^T\}$ (note that $d_k \alpha^k$ cannot have components in both $\mathrm{span}\{w_i v_i^T\}$ and its complement, since then $d_k \alpha^k$ would have rank greater than 1, which cannot be the case). Suppose the former case; then $d_k \alpha^k = \gamma w_j v_j^T$, and then

\[
\Big\| \sum_i \sigma_i w_i v_i^T - d_k \alpha^k \Big\|_{L^2}^2 = \sum_{i \neq j} \sigma_i^2 + (\sigma_j - \gamma)^2; \tag{31}
\]

we see that the sum is minimized when $\gamma = \sigma_j$, where $\sigma_j$ is the largest singular value; in this case, $d_k \alpha^k = \sigma_j w_j v_j^T$. If $d_k \alpha^k$ is instead in the orthogonal complement of $\mathrm{span}\{w_i v_i^T\}$, then

\[
\Big\| \sum_i \sigma_i w_i v_i^T - d_k \alpha^k \Big\|_{L^2}^2 = \sum_i \sigma_i^2 + \|d_k \alpha^k\|_{L^2}^2, \tag{32}
\]

which would have a higher value than if we chose $d_k \alpha^k = \sigma_j w_j v_j^T$ with $\sigma_j$ being the largest singular value. Therefore, we would like to choose $d_k = w_j$ and $\alpha^k = \sigma_j v_j^T$, with $\sigma_j$ being the largest singular value of $E_k$. However, $v_j^T$ is unlikely to be sparse, and thus this choice would not keep $\alpha$ sparse, as it should be. However, there is a quick fix to this: we optimize (22) over only those components of $\alpha^k$ that are non-zero in the current estimate of $\alpha^k$. To this end,
\[
\arg\min_{d_k,\, \alpha_i^k \neq 0} \|E_k - d_k \alpha^k\|_{L^2}^2 = \arg\min_{d_k,\, (\alpha^k)^R} \|E_k^R - d_k (\alpha^k)^R\|_{L^2}^2 \tag{33}
\]
where $(\alpha^k)^R$ is the row vector that contains all non-zero entries of $\alpha^k$, and similarly, $E_k^R$ is the submatrix of $E_k$ with the columns removed that correspond to the columns where $\alpha^k$ is zero. Now, using the reasoning in the previous paragraph, we choose $d_k = w_j^R$ and $(\alpha^k)^R = \sigma_j^R (v_j^R)^T$, where $E_k^R = \sum_i \sigma_i^R w_i^R (v_i^R)^T$ is the singular value decomposition of $E_k^R$ and $\sigma_j^R$ is the largest singular value. Thus, we have updated $d_k$ and $\alpha^k$, but we still need to update the other dictionary elements $d_j$. To do this, we simply formulate $E_j$ and solve for $d_j$ and $\alpha^j$ using the previous estimates of the other dictionary elements and the updated value of $d_k$. We repeat the procedure until all dictionary elements are updated. This algorithm is known as K-SVD [1].
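The following sketch is our own reading of the restricted rank-one update (33), not the reference implementation of [1]; sweeping $k$ over all atoms gives one dictionary-update pass. The matrices follow the discretization used in the earlier sketches.

```python
import numpy as np

def ksvd_atom_update(U, D, alpha, k):
    """Sketch of the K-SVD update of atom k per (33): restrict to the patches
    that currently use atom k, form the restricted residual E_k^R, and replace
    (d_k, alpha^k) with its best rank-one approximation from the SVD.
    U: (pixels, Np) patch matrix; D: (pixels, Nd); alpha: (Nd, Np)."""
    used = np.nonzero(alpha[k])[0]            # patches where alpha^k is nonzero
    if used.size == 0:
        return                                # atom unused; leave it as-is
    # E_k^R = U_R - sum_{i != k} d_i alpha^i_R  (add atom k's contribution back).
    Ek_R = U[:, used] - D @ alpha[:, used] + np.outer(D[:, k], alpha[k, used])
    W, s, Vt = np.linalg.svd(Ek_R, full_matrices=False)
    D[:, k] = W[:, 0]                         # d_k = leading left singular vector
    alpha[k, used] = s[0] * Vt[0]             # (alpha^k)^R = sigma_1 v_1^T
```

Because the update only touches the columns in `used`, the support of $\alpha^k$ (and hence its sparsity) is preserved, exactly as the argument above requires.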

References

[1] M. Aharon, M. Elad, and A. Bruckstein. K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation. IEEE Transactions on Signal Processing, 54(11):4311–4322, 2006.

[2] M. Elad and M. Aharon. Image denoising via learned dictionaries and sparse representation. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2006.

[3] S. G. Mallat and Z. Zhang. Matching pursuits with time-frequency dictionaries. IEEE Transactions on Signal Processing, 41(12):3397–3415, 1993.

[4] B. A. Olshausen and D. J. Field. Vision and the coding of natural images. American Scientist, 88:238, 2000.

[5] Y. C. Pati, R. Rezaiifar, and P. S. Krishnaprasad. Orthogonal matching pursuit: Recursive function approximation with applications to wavelet decomposition. In Proceedings of the 27th Asilomar Conference on Signals, Systems and Computers, pages 40–44. IEEE, 1993.

