LECTURE 5: MATRIX DIAGONALIZATION, SINGULAR VALUE DECOMPOSITION AND PRINCIPAL COMPONENT ANALYSIS • We wish to diagonalize a square NxN matrix A: Ax=lx. Here x is the eigenvector and l the eigenvalue of A • We could solve det|A-lI|=0 and expand it terms of a polynomial of N-th order in l • There is a shift symmetry: (A+tI)x=(l+t)x that does not change eigenvectors, but changes eigenvalues

Common matrix types • Symmetric: A=AT or aij=aji • Hermitian (self-adjoint): A=A+ or aij=aji*. Important concept because eigenvalues are real (quantum mechanics!) • Orthogonal: transpose equals inverse AAT=ATA=I • Unitary: Hermitian conjugate equals inverse AA+=I • Normal: commutes with Hermitian conjugate AA+=A+A. Important because its eigenvectors are complete and orthogonal • For real matrices Hermitian is symmetric and unitary is orthogonal, both of which are normal. • A normal non-symmetric matrix is hard to diagonalize • Symmetric matrices are easier to diagonalize

Matrix forms and similarity transform • We can define the matrix of right eigenvectors as XR and the matrix of eigenvalues as D, which is diagonal with li on the diagonal: AXR=XRD • We can also define left eigenvectors XLA=DXL • XLXRD=DXLXR implies XLXR=I (with appropriate normalization) and so A=XRDXL=XRDXR-1 and D=XLAXR=XR-1AXR • For a symmetric matrix A=XRDXRT • a transformation Z-1AZ leaves eigensystem unchanged: det| Z-1AZ-lI|= det| Z-1AZ-lZ-1IZ|= det| Z-1| det| A-lI| det|Z|= det| A-lI| • Many of the methods consist of a series of such transforms • For real symmetric matrices Z-1=ZT

QR decomposition

We can show (by induction) that qiqj=dij Note that this is closely related to Gram-Schmidt orthogonalization

Gram-Schmidt orthogonalization and QR/RQ decomposition • The RQ decomposition transforms a matrix A into the product of an upper triangular matrix R (also known as right-triangular) and an orthogonal matrix Q. The only difference from QR decomposition is the order of these matrices. • QR decomposition is Gram–Schmidt orthogonalization of columns of A, started from the first column. • RQ decomposition is Gram–Schmidt orthogonalization of rows of A, started from the last row.

QR decomposition

• We have obtained a decomposition into an orthogonal matrix Q and upper diagonal matrix R

QR diagonalization for symmetric A • We can repeat these QR steps

General packaged solutions • QR works better if the matrix A is first transformed into tridiagonal form (Hauseholder or Givens method): standard method for symmetric A • Diagonalization methods: there are many different methods depending on what matrix we have: A can be real banded symmetric, real symmetric, complex Hermitian, real non-symmetric… • Depending of what we need: just the eigenvectors or both eigenvectors and eigenvalues • Look at numpy.linalg and choose depending on what your needs are: eigh, eigvalsh…

Generalized problems • Ax=lBx can be handled as (B-1A)x=lx • If A and B symmetric B-1 is not, but we can use Cholesky decomposition B=LLT and by multiplying with L-1 we get C(LTx)=lLTx where C=L-1A(L-1)T is a symmetric matrix. This diagonalization gives the same eigenvalues with eigenvectors given by LTx. • Nonlinear problems: (Al2+Bl+C)x=0 can be solved by solving 2Nx2N problem

“Shortcomings” of diagonalization • eigenvectors are not orthogonal for a nonsymmetric NxN matrix • There may not be enough eigenvectors, ie they may not be complete: A is “defective” • It cannot be defined as eigenvalue problem Ax=lx unless a is NxN • All these “defects” can be cured by SVD • None of these “defects” apply to a symmetric NxN matrix: SVD and diagonalization are the same in that case if A is semi-positive definite

Singular Value Decomposition (SVD) • Is a decomposition of a general NxM matrix • Define rank r of matrix A as the maximum number of linearly independent column vectors in the matrix or the maximum number of linearly independent row vectors in the matrix (both definitions are equivalent) • Another way: define range as the subspace of b reached by Ax=b. Rank is its dimension. • Define two orthonormal bases, with vectors u in RM and v in RN. In matrix form U is MxM and V is NxN • u1 , . . . , ur is an orthonormal basis for the column space • ur+1 ,...,um is an orthonormal basis for the left null space N (defined as ATx=0) • v1, . . . , vr is an orthonormal basis for the row space • Vr+1,...,vN is an orthonormal basis for the null space N(Ax=0). • These bases “diagonalize” A: Av1=s1u1, Av2=s2u2… • Also called Karhunen-Loeve (KL) Transform

SVD continued • Add null space, which is already orthogonal to first r v’s and u’s to go from reduced to full SVD

• Now the last matrix Σ is MxN with first r on the diagonal si, and the rest 0. U and V are now square matrices (MxM and NxN) hence AV=UΣ or A=UΣVT=u1s1v1+…ursrvr • Rank order s1>s2>…>sr

SVD in pictures

• SVD as a sequence of rotation-stretch-rotation A=UΣVT

• AV=UΣ

• Av1=s1u1

Example • If A = xyT (rank 1) with unit vectors x and y, what is the SVD of A?

Example • If A = xyT (rank 1) with unit vectors x and y, what is the SVD of A? • Solution The reduced SVD is exactly xyT, with rank r = 1. It has u1 = x and v1 = y and σ1 = 1. For the full SVD, complete u1 = x to an orthonormal basis of u’s, and complete v1 = y to an orthonormal basis of v’s. No new σ’s, only σ1 = 1.

How to construct u’s and v’s? • • • •

v’s will be orthonormal eigenvectors of ATA: ATA=(UΣVT)TUΣVT=VΣTUTUΣVT=VΣTΣVT We see that s2=l of ATA. u’s can easily be computed from Avi=siui. They are orthogonal:

• Show that u’s are eigenvectors of AAT

How to construct u’s and v’s? • • • •

v’s will be orthonormal eigenvectors of ATA: ATA=(UΣVT)TUΣVT=VΣTUTUΣVT=VΣTΣVT We see that s2=l of ATA. u’s can easily be computed from Avi=siui. They are orthogonal:

• Show that u’s are eigenvectors of AAT • Soln: AAT=UΣVT(UΣVT)T=UΣVTVΣTUT=UΣΣTUT

Linear algebra with SVD • Linear algebra Ax=b, assume A is NxN • Let’s do matrix inversion: A-1=VΣ-1UT • This can be solved unless some si=0: matrix is singular, i.e. there is non-zero null space • More generally we can define condition number s1/sN: if this is large the matrix is ill-conditioned • SVD will diagnose the condition situation of the matrix • If we want to solve homogeneous equation Ax=b for b=0 we use the null space of U/V, corresponding to si=0

Solving singular matrix problems • Ax=b can be solved if b is in the range of A. If so we have an infinite number of solutions, since we can add null space solutions to it in any linear combination • We can put additional constraint, such as the norm ||x|| is minimized (i.e. pick the solution with the shortest length |x|2). Then we can set the null space solutions to 0. This is achieved by setting si-1=0 wherever si=0 (null space) and solve x=VΣ-1UTb • If b is not in range we will not have an exact solution. But we can still use x=VΣ-1UTb as the solution which minimizes r=|Ax-b|

Proofs that x=A-1b =VΣ-1UTb

• b is in range: suppose x’ is in null space, and Σ-1 is the modified inverse of Σ: |x+x’|=|VΣ-1UTb+x’|= |V(Σ-1UTb+VTx’|= |Σ-1UTb+VTx’|>|x| unless x’=0 because the two are orthogonal (x’ is in null space, x is not). • b is not in range: we add x’ to x, Ax-b is modified by b’=Ax’ (in range), show r=|Ax-b| is minimized: |Ax-b+b’|=|(UΣVT) VΣ-1UTb-b+b’|= |(UΣΣ-1UT-1)b+b’|=|U[(ΣΣ-1-1)UTb+Utb’]| = |(ΣΣ-1-1)UTb+Utb’|> |(ΣΣ-1-1)Utb| unless b’=0. • In practice we want to fix ill-conditioned matrices with SVD: throw away all singular values whose value is too small to be meaningful (machine precision roundoff errors etc). So if si
Non-square matrices • If there are fewer equations N than variables M the system is underdetermined and we have M-N dimensional family of solutions. SVD provides the family of solutions by providing si

lecture 5: matrix diagonalization, singular value ... - GitHub

We can decorrelate the variables using spectral principal axis rotation (diagonalization) α=XT. L. DX. L. • One way to do so is to use eigenvectors, columns of X. L corresponding to the non-zero eigenvalues λ i of precision matrix, to define new variables as the linear combinations of original variables: u=X. L a. • the variables ...

11MB Sizes 0 Downloads 268 Views

Recommend Documents

Accelerated Singular Value Thresholding for Matrix ...
Aug 16, 2012 - tem [13, 14, 22] and image/video analysis [17, 11]. Since the completion of ...... Sdpt3 – a matlab software package for semidefinite quadratic.

Led matrix PCB - GitHub
Alarm Clock. TITLE. Led matrix PCB. REV. PART #. CLK-PC-01. DOCUMENT #. UNITS. INCHES. SIZE. B. DATE 2/8/2015. CLK-DWG-01. BENOIT FRIGON.

TWO DIMENSIONAL SINGULAR VALUE ...
applications such as mobile video calls and wireless .... Figure 2 depicts the system diagram of the proposed .... The GI file has been interleaved and a lost GI.

Lecture 1 - GitHub
Jan 9, 2018 - We will put special emphasis on learning to use certain tools common to companies which actually do data ... Class time will consist of a combination of lecture, discussion, questions and answers, and problem solving, .... After this da

Transcriptomics Lecture - GitHub
Jan 17, 2018 - Transcriptomics Lecture Overview. • Overview of RNA-Seq. • Transcript reconstruc囉n methods. • Trinity de novo assembly. • Transcriptome quality assessment. (coffee break). • Expression quan懿a囉n. • Differen鶯l express

Lectures / Lecture 5
Mar 22, 2010 - application files to a SSD, but leaving all their multimedia files on a HDD,. Dan and David were able to ... logic, but it will still be larger in size than the file storing Germany's flag. Whereas Germany's flag can ... in web design,

lecture 15: fourier methods - GitHub
LECTURE 15: FOURIER METHODS. • We discussed different bases for regression in lecture. 13: polynomial, rational, spline/gaussian… • One of the most important basis expansions is ... dome partial differential equations. (PDE) into ordinary diffe

On Sketching Matrix Norms and the Top Singular Vector
Sketching is an algorithmic tool for handling big data. A ... to [11] for graph applications for p = 0, to differential ... linear algebra applications have this form.

lecture 12: distributional approximations - GitHub
We have data X, parameters θ and latent variables Z (which often are of the ... Suppose we want to determine MLE/MAP of p(X|θ) or p(θ|X) over q: .... We limit q(θ) to tractable distributions. • Entropies are hard to compute except for tractable

lecture 4: linear algebra - GitHub
Inverse and determinant. • AX=I and solve with LU (use inv in linalg). • det A=L00. L11. L22 … (note that Uii. =1) times number of row permutations. • Better to compute ln detA=lnL00. +lnL11. +…

Value category cheatsheet - GitHub
Any function call returning a non-reference value type, including pointers, yields a prvalue. ... A non-static data member of an lvalue is also an lvalue. int &&a{ 77 }; ...

return x + value - GitHub
movl %esi, (%rdi) retq. __ZN4Plus6plusmeEi: ... static code and data auto plus = Plus(1); ... without garbage collection using object = std::map;.

Chapter 5 Density matrix formalism
In chap 2 we formulated quantum mechanics for isolated systems. In practice systems interect with their environnement and we need a description that takes this ...

Improving Speaker Identification via Singular Value ...
two different approaches have been proposed using Singular. Value Decomposition (SVD) based Feature Transformer (FT) for improving accuracies especially for lower ordered speaker models. The results show significant improvements over baseline and hav

lecture 16: ordinary differential equations - GitHub
Bulirsch-Stoer method. • Uses Richardson's extrapolation again (we used it for. Romberg integration): we estimate the error as a function of interval size h, then we try to extrapolate it to h=0. • As in Romberg we need to have the error to be in

Old Dominion University Lecture 2 - GitHub
Old Dominion University. Department of ... Our Hello World! [user@host ~]$ python .... maxnum = num print("The biggest number is: {}".format(maxnum)) ...

C2M - Team 101 lecture handout.pdf - GitHub
Create good decision criteria in advance of having to make difficult decision with imperfect information. Talk to your. GSIs about their experience re: making ...

LECTURE 8: OPTIMIZATION in HIGHER DIMENSIONS - GitHub
We want to descend down a function J(a) (if minimizing) using iterative sequence of steps at a t . For this we need to choose a direction p t and move in that direction:J(a t. +ηp t. ) • A few options: fix η. • line search: vary η ... loss (co

lecture 2: intro to statistics - GitHub
Continuous Variables. - Cumulative probability function. PDF has dimensions of x-1. Expectation value. Moments. Characteristic function generates moments: .... from realized sample, parameters are unknown and described probabilistically. Parameters a

lecture 10: advanced bayesian concepts - GitHub
some probability distribution function (PDF) of perfect data x, but what we measure is d, a noisy version of x, and noise is ... We can also try to marginalize over x analytically: convolve true PDF with noise PDF and do this for each ... We would li

Lab 5: strataG - GitHub
Let's take Wang (2016)'s advice into account. • To change settings to ... individuals using the software STRUCTURE: a simulation study. Molecular Ecology.

Lecture 5-Circuit-Packet Switching.pdf
Sign in. Loading… Whoops! There was a problem loading more pages. Retrying... Whoops! There was a problem previewing this document. Retrying.