SWILA NOTES
7.5. Normal operators and the spectral theorem for normal operators.

In the previous section, we learned that self-adjoint operators are diagonalizable. More precisely, every self-adjoint operator has an orthonormal basis of eigenvectors. A natural question to ask is whether we can characterize those linear operators which have an orthonormal basis of eigenvectors. And is it even normal for a linear operator to be diagonalizable? Well, I'm not sure about that question, but the linear operators which have an orthonormal basis of eigenvectors are exactly the normal ones. We will typically assume $V$ is a finite-dimensional complex inner product space in this section.

Definition 7.5.1. A linear operator $T : V \to V$ is called normal if $T^*T = TT^*$.

Remark 7.5.2. Note that self-adjoint and skew-adjoint operators are normal. Also, orthogonal transformations are normal.

We aim to show that an operator is normal if and only if it has an orthonormal basis of eigenvectors. We can easily prove one direction.

Proposition 7.5.3. Suppose a linear operator $T : V \to V$ has an orthonormal basis of eigenvectors. Then $T$ is normal.

Proof. The main ideas for this proof are that diagonal matrices commute and Proposition 7.1.6, which stated that one can pull adjoints out of matrix representations when the basis is orthonormal. Let $B := \{e_1, \dots, e_n\}$ be an orthonormal basis of eigenvectors for $T$. Then $[T]_B$ is a diagonal matrix (with its $k$th diagonal entry being the eigenvalue corresponding to $e_k$). Clearly, its adjoint $[T]_B^*$ is also diagonal. As diagonal matrices commute, $[T]_B [T]_B^* = [T]_B^* [T]_B$. By Proposition 7.1.6, $[T]_B^* = [T^*]_B$, and hence
\[
[TT^*]_B = [T]_B [T^*]_B = [T^*]_B [T]_B = [T^*T]_B.
\]
As $TT^*$ agrees with $T^*T$ on a basis of $V$, we may conclude $T$ is normal.
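As a numerical sanity check of Proposition 7.5.3 (a sketch, not part of the original notes; the matrices below are arbitrary choices), we can build an operator from an orthonormal eigenbasis and verify that it commutes with its adjoint:

```python
import numpy as np

# Build T = U D U* from an orthonormal basis of eigenvectors (the columns of
# a unitary U) and a diagonal D, then verify T T* = T* T.
rng = np.random.default_rng(0)

# A random unitary U via QR of a complex Gaussian matrix.
Z = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
U, _ = np.linalg.qr(Z)

D = np.diag([1 + 2j, -3.0, 0.5j])     # arbitrary complex eigenvalues
T = U @ D @ U.conj().T                # T has an orthonormal eigenbasis

lhs = T @ T.conj().T
rhs = T.conj().T @ T
print(np.allclose(lhs, rhs))          # True: T is normal
```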
Example 7.5.4. This example is taken from Petersen's notes (Example 83, page 213). There exist linear operators that are not normal, yet have bases of eigenvectors. For example, the matrix
\[
A := \begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix}
\]
(which corresponds to a linear operator on $\mathbb{R}^2$ by left multiplication) is not self-adjoint/symmetric. In fact, one can check that $A$ is not even normal. However, $\mathbb{R}^2$ has a basis of eigenvectors $\{(1,0), (1,1)\}$ with corresponding eigenvalues $1, 2$. The key is that this basis is not orthogonal.

Random Thought 7.5.5. Consider the outside temperature as a function of time. Coming from California, I used to believe the temperature function was continuous, possibly even differentiable. Now in Illinois, I'm not even sure it's measurable.

Proposition 7.5.6. (Characterization of normal operators) Let $T : V \to V$ be a linear operator. Then the following are equivalent:
BY DEREK JUNG, ADAPTED FROM NOTES BY UCLA PROF. PETER PETERSEN
(1) $T$ is normal.
(2) $TT^* = T^*T$.
(3) $||Tx|| = ||T^*x||$ for all $x \in V$.
(4) $BC = CB$, where $B = \frac{1}{2}(T + T^*)$ and $C = \frac{1}{2}(T - T^*)$.
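Before the proof, a numerical companion (a sketch, not from the notes): the matrix of Example 7.5.4 fails conditions (2) and (3), even though it is diagonalizable.

```python
import numpy as np

# Check conditions (2) and (3) of Proposition 7.5.6 for the real matrix of
# Example 7.5.4; over R the adjoint is the transpose.
A = np.array([[1.0, 1.0],
              [0.0, 2.0]])

# Condition (2): does A A^t equal A^t A?  No -- A is not normal.
print(np.allclose(A @ A.T, A.T @ A))                   # False

# Condition (3): ||A x|| = ||A^t x||?  Fails already for x = (0, 1).
x = np.array([0.0, 1.0])
print(np.linalg.norm(A @ x), np.linalg.norm(A.T @ x))  # the two norms differ

# Yet A is diagonalizable: (1,0) and (1,1) are eigenvectors for 1 and 2,
# but they are not orthogonal.
v1, v2 = np.array([1.0, 0.0]), np.array([1.0, 1.0])
assert np.allclose(A @ v1, v1) and np.allclose(A @ v2, 2 * v2)
print(v1 @ v2)                                         # 1.0, nonzero
```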
Proof. (1) ⇔ (2) by definition.

(2) ⇔ (4): This follows from the computations
\[
BC = \frac{1}{4}\left(T^2 - TT^* + T^*T - (T^*)^2\right),
\qquad
CB = \frac{1}{4}\left(T^2 - T^*T + TT^* - (T^*)^2\right).
\]
Note that we need to be careful about order in this computation, as operators don't commute in general.

(2) ⇒ (3): Observe
\[
||Tx||^2 = \langle Tx, Tx \rangle = \langle T^*Tx, x \rangle = \langle TT^*x, x \rangle = \langle T^*x, T^*x \rangle = ||T^*x||^2.
\]

(3) ⇒ (2): We have
\[
0 = \langle Tx, Tx \rangle - \langle T^*x, T^*x \rangle = \langle (T^*T - TT^*)x, x \rangle \quad \text{for all } x \in V.
\]
Note the operator $T^*T - TT^*$ is self-adjoint. It follows from Proposition 7.2.1 that $TT^* = T^*T$.

We leave the following as an exercise to the reader (just follow your nose).

Lemma 7.5.7. Let $T : V \to V$ be a linear operator. If $M \subset V$ is a subspace invariant under both $T$ and $T^*$, then $M^\perp$ is also invariant under both $T$ and $T^*$. In particular, $(T|_{M^\perp})^* = T^*|_{M^\perp}$.

Random Thought 7.5.8. Little known fact: The popular children's song "For he's a jolly good fellow" actually was originally written about NSF Fellows.

We now prove the spectral theorem for normal operators on complex inner product spaces.

Theorem 7.5.9. (The spectral theorem for normal operators) Let $T : V \to V$ be an operator on a complex inner product space. Then $T$ is normal if and only if $T$ has an orthonormal basis of eigenvectors.

Proof. We proved that the existence of an orthonormal basis of eigenvectors implies normality in Proposition 7.5.3. Conversely, suppose $T$ is normal. As in the proof of the spectral theorem for self-adjoint operators (Theorem 7.4.4), we aim to show that $T$ has an eigenvector and that the orthogonal complement of this eigenvector is $T$-invariant. Define the self-adjoint operators $B = \frac{1}{2}(T + T^*)$ and $C = \frac{1}{2i}(T - T^*)$ on $V$. Observe that $T = B + iC$. By Theorem 7.4.4, there exists a real eigenvalue $\lambda$ of $B$, which implies $\ker(B - \lambda 1_V) \neq 0$. Since $B \cdot iC = iC \cdot B$ (see Proposition 7.5.6), $B \circ C = C \circ B$. If $z \in \ker(B - \lambda 1_V)$,
\[
(B - \lambda 1_V)(C(z)) = B \circ C(z) - \lambda C(z) = C \circ (B - \lambda 1_V)(z) = 0.
\]
This shows the subspace $\ker(B - \lambda 1_V)$ is $C$-invariant. Since $C$ is self-adjoint, and hence $C|_{\ker(B - \lambda 1_V)}$ is as well, we may find $0 \neq x \in \ker(B - \lambda 1_V)$ and $\mu \in \mathbb{R}$ such that $C(x) = \mu x$. This means
\[
T(x) = B(x) + iC(x) = (\lambda + i\mu)x.
\]
In addition,
\[
T^*(x) = B(x) - iC(x) = (\lambda - i\mu)x.
\]
This shows $\mathrm{span}\{x\}$ is invariant under both $T$ and $T^*$. By Lemma 7.5.7, this implies $M := (\mathrm{span}\{x\})^\perp$ is also invariant under both $T$ and $T^*$, with $(T|_M)^* = T^*|_M$. This implies $T|_M : M \to M$ is also normal, and we can apply the inductive argument on the dimension of $V$ as in Theorem 7.4.4.

Recall that unitary operators $T : V \to V$, where $V$ is a complex inner product space, are characterized as satisfying $TT^* = 1_V = T^*T$. The spectral theorem for normal operators gives us a spectral theorem for unitary operators.

Theorem 7.5.10. (Spectral theorem for unitary operators) Let $T$ be a unitary operator on a finite-dimensional complex inner product space $V$. Then there exists an orthonormal basis $\{v_1, \dots, v_n\}$ of $V$ such that $T(v_1) = e^{i\theta_1} v_1, \dots, T(v_n) = e^{i\theta_n} v_n$, where $\theta_1, \dots, \theta_n \in \mathbb{R}$.

Proof. Note that every unitary operator is normal. As $||T(x)|| = ||x||$ for all $x \in V$ (see Theorem 7.2.5), it follows that every eigenvalue $\lambda$ of $T$ satisfies $|\lambda| = 1$. The theorem follows from the polar decomposition of complex numbers.

We can also extend Theorem 7.4.11 to normal operators with the same proof.

Theorem 7.5.11. Let $T : V \to V$ be a normal operator and $\lambda_1, \dots, \lambda_k$ the distinct eigenvalues of $T$. Then
\[
1_V = \mathrm{proj}_{\ker(T - \lambda_1 1_V)} + \cdots + \mathrm{proj}_{\ker(T - \lambda_k 1_V)}
\]
and
\[
T = \lambda_1 \, \mathrm{proj}_{\ker(T - \lambda_1 1_V)} + \cdots + \lambda_k \, \mathrm{proj}_{\ker(T - \lambda_k 1_V)}.
\]

7.6. Schur's theorem.

We have shown that self-adjoint operators and normal operators are diagonalizable with orthonormal bases of eigenvectors. One may be wondering what we can say about general linear operators. We will show in this section that linear operators on finite-dimensional complex inner product spaces have an upper triangular matrix representation.
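Theorem 7.5.10 can also be checked numerically; here is a minimal sketch (the rotation matrix is an arbitrary choice, not from the notes, and over the reals the adjoint is the transpose):

```python
import numpy as np

# A unitary operator has eigenvalues of modulus 1 (of the form e^{i theta}),
# and -- the eigenvalues here being distinct -- an orthonormal eigenbasis.
U = np.array([[0.0, -1.0],
              [1.0,  0.0]])                  # rotation by pi/2; U U^t = I

evals, evecs = np.linalg.eig(U)              # eigenvalues are i and -i

print(np.allclose(np.abs(evals), 1.0))       # True: |lambda| = 1
print(np.allclose(evecs.conj().T @ evecs, np.eye(2)))  # True: orthonormal
```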
Throughout this section, $V$ will be a finite-dimensional complex inner product space.

Example 7.6.1. The matrix
\[
A := \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}
\]
is a typical example of a matrix that is not diagonalizable. However, $A$ is upper triangular.

Definition 7.6.2. Let $e_1, \dots, e_n$ be a basis for $V$. We define the linear transformation $(\, e_1 \; \cdots \; e_n \,) : \mathbb{F}^n \to V$ by
\[
(\, e_1 \; \cdots \; e_n \,) \begin{pmatrix} a_1 \\ \vdots \\ a_n \end{pmatrix} = a_1 e_1 + \cdots + a_n e_n.
\]

We will show that every linear operator has an upper triangular matrix representation.
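Concretely, when $V = \mathbb{F}^n$, the map of Definition 7.6.2 is just the matrix whose columns are the basis vectors (a small illustration; the basis below is an arbitrary choice, not from the notes):

```python
import numpy as np

# The map (e1 ... en): F^n -> V, realized as the matrix whose columns are
# the basis vectors.  Example orthonormal basis of R^2.
e1 = np.array([1.0, 1.0]) / np.sqrt(2)
e2 = np.array([1.0, -1.0]) / np.sqrt(2)
E = np.column_stack([e1, e2])            # the transformation (e1 e2)

a = np.array([3.0, -2.0])                # coordinates (a1, a2)
v = E @ a                                # a1*e1 + a2*e2
print(np.allclose(v, 3 * e1 - 2 * e2))   # True

# For an ORTHONORMAL basis the adjoint E^* recovers the coordinates:
print(np.allclose(E.T @ v, a))           # True
```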
Theorem 7.6.3. (Schur's theorem) Let $T : V \to V$ be a linear operator. Then there exists an orthonormal basis $B = \{e_1, \dots, e_n\}$ of $V$ such that the matrix representation $[T]_B$ is upper triangular. Equivalently,
\[
T = (\, e_1 \; \cdots \; e_n \,)
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
0 & a_{22} & \cdots & a_{2n} \\
\vdots & & \ddots & \vdots \\
0 & \cdots & 0 & a_{nn}
\end{pmatrix}
(\, e_1 \; \cdots \; e_n \,)^*.
\]
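The proof of the theorem is constructive, and the construction can be sketched in code (a toy recursion, with `np.linalg.eig` standing in for "choose an eigenvector of $T^*$"; illustrative only, not a production Schur routine):

```python
import numpy as np

def schur_upper(A):
    # Toy Schur triangularization: pick a unit eigenvector x of A*, use an
    # orthonormal basis listing {x}^perp first and x last (so that {x}^perp
    # is A-invariant), then recurse on the restriction to {x}^perp.
    A = A.astype(complex)
    n = A.shape[0]
    if n == 1:
        return np.eye(1, dtype=complex), A
    _, vecs = np.linalg.eig(A.conj().T)
    x = vecs[:, [0]]                                 # unit eigenvector of A*
    Q, _ = np.linalg.qr(np.hstack([x, np.eye(n)]))   # first column spans x
    Q = np.roll(Q, -1, axis=1)                       # move x to the last slot
    B = Q.conj().T @ A @ Q                           # last row is (0,...,0,*)
    Q1, T1 = schur_upper(B[:n-1, :n-1])              # recurse on the block
    Q2 = np.eye(n, dtype=complex)
    Q2[:n-1, :n-1] = Q1
    return Q @ Q2, Q2.conj().T @ B @ Q2

A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [1.0, 1.0, 4.0]])
Q, T = schur_upper(A)
print(np.allclose(Q @ T @ Q.conj().T, A))        # A = Q T Q*
print(np.allclose(np.tril(T, -1), 0))            # T is upper triangular
print(np.allclose(Q.conj().T @ Q, np.eye(3)))    # Q is unitary
```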
Proof. Observe that we need to find an orthonormal basis $e_1, \dots, e_n$ of $V$ such that
\[
(\, T(e_1) \; \cdots \; T(e_n) \,) = (\, e_1 \; \cdots \; e_n \,)
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
0 & a_{22} & \cdots & a_{2n} \\
\vdots & & \ddots & \vdots \\
0 & \cdots & 0 & a_{nn}
\end{pmatrix}.
\]
This is equivalent to finding an orthonormal basis $e_1, \dots, e_n$ of $V$ and constructing an increasing sequence of $T$-invariant subspaces
\[
\{0\} \subset V_1 \subset V_2 \subset \cdots \subset V_{n-1} \subset V,
\]
where $V_k = \mathrm{span}\{e_1, \dots, e_k\}$. This proof will be similar to that of the spectral theorems. Consider the linear operator $T^* : V \to V$. Choose an eigenvector $x$ of $T^*$ with corresponding eigenvalue $\lambda$ such that $||x|| = 1$. (Existence is guaranteed since $\mathbb{F} = \mathbb{C}$.) Define $V_{n-1} = \{x\}^\perp = \{v \in V : \langle x, v \rangle = 0\}$. For all $v \in V_{n-1}$,
\[
\langle T(v), x \rangle = \langle v, T^*x \rangle = \bar{\lambda} \langle v, x \rangle = 0.
\]
This proves that $V_{n-1}$ is $T$-invariant and has dimension $\dim(V) - 1$. Setting $n = \dim(V)$, by induction on dimension $T|_{V_{n-1}}$ is upper triangulizable: there is an orthonormal basis $B = \{e_1, \dots, e_{n-1}\}$ of $V_{n-1}$ such that $[T|_{V_{n-1}}]_B$ is upper triangular. If we set $\tilde{B} = B \cup \{x\}$, it is easy to see $[T]_{\tilde{B}}$ is upper triangular.

7.7. Singular value decomposition.

In this section, $\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$. Given an orthonormal basis $\{e_1, \dots, e_n\}$ of a vector space $V$, recall (see Definition 7.6.2) that we define
\[
(\, e_1 \; \cdots \; e_n \,) : \mathbb{F}^n \to V
\quad \text{by} \quad
(\, e_1 \; \cdots \; e_n \,) \begin{pmatrix} a_1 \\ \vdots \\ a_n \end{pmatrix} = a_1 e_1 + \cdots + a_n e_n.
\]
Note the adjoint $(\, e_1 \; \cdots \; e_n \,)^* : V \to \mathbb{F}^n$ of this transformation is defined by
\[
(\, e_1 \; \cdots \; e_n \,)^* (a_1 e_1 + \cdots + a_n e_n) = \begin{pmatrix} a_1 \\ \vdots \\ a_n \end{pmatrix}.
\]

Theorem 7.7.1. (The Singular Value Decomposition) Let $T : V \to W$ be a linear map between finite-dimensional inner product spaces. Then there is an orthonormal basis $\{e_1, \dots, e_m\}$ of $V$ such that $\langle T(e_i), T(e_j) \rangle = 0$ for $i \neq j$. Moreover, we can find orthonormal bases
$B = \{e_1, \dots, e_m\}$ of $V$ and $C = \{f_1, \dots, f_n\}$ of $W$ and nonnegative real numbers $\sigma_1, \dots, \sigma_k$, $k \leq m$, so that
\[
T(e_1) = \sigma_1 f_1, \; \dots, \; T(e_k) = \sigma_k f_k, \qquad T(e_{k+1}) = \cdots = T(e_m) = 0.
\]
In other words,
\[
T = (\, f_1 \; \cdots \; f_n \,) [T]_{C,B} (\, e_1 \; \cdots \; e_m \,)^*
= (\, f_1 \; \cdots \; f_n \,)
\begin{pmatrix}
\sigma_1 & & & & \\
& \ddots & & & \\
& & \sigma_k & & \\
& & & 0 & \\
& & & & \ddots
\end{pmatrix}
(\, e_1 \; \cdots \; e_m \,)^*.
\]
Proof. We can apply the spectral theorem to the self-adjoint operator $T^*T : V \to V$ to find an orthonormal basis $\{e_1, \dots, e_m\}$ of $V$ such that $T^*T(e_i) = \lambda_i e_i$ for some $\lambda_i$. Note each $\lambda_i$ is real and nonnegative, since $\lambda_i = \langle T^*T(e_i), e_i \rangle = ||T(e_i)||^2 \geq 0$. Then
\[
\langle T(e_i), T(e_j) \rangle = \langle T^*T(e_i), e_j \rangle = \lambda_i \langle e_i, e_j \rangle =
\begin{cases}
\lambda_i, & i = j, \\
0, & i \neq j.
\end{cases}
\]
Reordering if necessary, we may assume $\lambda_1, \dots, \lambda_k > 0$ and $\lambda_l = 0$ for $l > k$. Define
\[
f_i = \frac{T(e_i)}{||T(e_i)||}, \quad i = 1, \dots, k.
\]
Then extend $\{f_1, \dots, f_k\}$ to an orthonormal basis $\{f_1, \dots, f_k, f_{k+1}, \dots, f_n\}$ for $W$ (possibly using the Gram-Schmidt process). Setting $\sigma_i = ||T(e_i)|| = \sqrt{\lambda_i}$, we have $T(e_i) = \sigma_i f_i$ for all $i$. The theorem follows.

Remark 7.7.2. Some sources may use $TT^*$ in the proof. Their result will be a decomposition of $T^*$ as opposed to one of $T$ as we have here.

This immediately gives us the singular value decomposition of matrices. We call a (possibly non-square) matrix $D = (d_{ij})$ diagonal if $d_{ij} = 0$ whenever $i \neq j$.

Corollary 7.7.3. (The Singular Value Decomposition for matrices) Let $A$ be a real (complex) $m \times n$ matrix. Then there is an orthogonal (unitary) $m \times m$ matrix $U$, an orthogonal (unitary) $n \times n$ matrix $V$, and a diagonal $m \times n$ matrix $D$ with nonnegative entries such that $A = UDV^*$. Equivalently, there exists an orthonormal basis $\{e_1, \dots, e_n\}$ of $\mathbb{F}^n$, an orthonormal basis $\{f_1, \dots, f_m\}$ of $\mathbb{F}^m$, and nonnegative real numbers $\sigma_1, \dots, \sigma_k$, $k \leq n$, such that
\[
A(e_1) = \sigma_1 f_1, \; \dots, \; A(e_k) = \sigma_k f_k, \qquad A(e_{k+1}) = \cdots = A(e_n) = 0.
\]

Random Thought 7.7.4. Did you know that people sell plastic nets to hit tennis balls with? I just think it's a racket.
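In NumPy, the matrix version of the theorem is available directly (an illustration with an arbitrary matrix; `np.linalg.svd` returns $V^*$ as its third output):

```python
import numpy as np

# Corollary 7.7.3 in code: np.linalg.svd factors A = U D V* with orthogonal
# U, V and a rectangular "diagonal" D of nonnegative singular values.
A = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [0.0, 2.0]])

U, S, Vh = np.linalg.svd(A, full_matrices=True)  # U: 3x3, S: (2,), Vh: 2x2

D = np.zeros_like(A)
D[:2, :2] = np.diag(S)                           # diagonal 3x2 matrix

print(np.allclose(U @ D @ Vh, A))                # True: A = U D V*
print(np.all(S >= 0))                            # True: sigma_i >= 0
print(np.allclose(U.T @ U, np.eye(3)))           # True: U orthogonal
```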
Example 7.7.5. Let
\[
A = \begin{pmatrix} -1 & 0 \\ 1 & -1 \\ 0 & 1 \end{pmatrix} : \mathbb{R}^2 \to \mathbb{R}^3.
\]
We will find the singular value decomposition of $A$ by following the proof of Theorem 7.7.1. We can calculate
\[
A^t A = \begin{pmatrix} -1 & 1 & 0 \\ 0 & -1 & 1 \end{pmatrix}
\begin{pmatrix} -1 & 0 \\ 1 & -1 \\ 0 & 1 \end{pmatrix}
= \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}.
\]
We have
\[
\det(A^t A - t \, \mathrm{Id}) = \det \begin{pmatrix} 2 - t & -1 \\ -1 & 2 - t \end{pmatrix} = (2 - t)^2 - 1 = (t - 3)(t - 1).
\]
This shows the eigenvalues of $A^t A$ are $3$ and $1$. An eigenvector of $A^t A$ associated with the eigenvalue $3$ is $e_1 = (1/\sqrt{2}, -1/\sqrt{2})$. An eigenvector of $A^t A$ associated with the eigenvalue $1$ is $e_2 = (1/\sqrt{2}, 1/\sqrt{2})$. We have
\[
f_1 = \frac{A(e_1)}{||A(e_1)||} = \frac{1}{\sqrt{3}} \begin{pmatrix} -1/\sqrt{2} \\ \sqrt{2} \\ -1/\sqrt{2} \end{pmatrix}
= \begin{pmatrix} -1/\sqrt{6} \\ 2/\sqrt{6} \\ -1/\sqrt{6} \end{pmatrix}
\]
and
\[
f_2 = \frac{A(e_2)}{||A(e_2)||} = \begin{pmatrix} -1/\sqrt{2} \\ 0 \\ 1/\sqrt{2} \end{pmatrix}.
\]
Letting
\[
f_3 := \frac{1}{\sqrt{3}} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix},
\]
$\{f_1, f_2, f_3\}$ is an orthonormal basis for $\mathbb{R}^3$. We may conclude that
\[
A = \begin{pmatrix} -1/\sqrt{6} & -1/\sqrt{2} & 1/\sqrt{3} \\ 2/\sqrt{6} & 0 & 1/\sqrt{3} \\ -1/\sqrt{6} & 1/\sqrt{2} & 1/\sqrt{3} \end{pmatrix}
\begin{pmatrix} \sqrt{3} & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix}
\begin{pmatrix} 1/\sqrt{2} & -1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{pmatrix}.
\]
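The computation in Example 7.7.5 can be verified numerically (a sketch, not part of the original notes):

```python
import numpy as np

# Rebuild the SVD of Example 7.7.5 and confirm A = U D V^t, with singular
# values sqrt(3) and 1 (the square roots of the eigenvalues 3 and 1 of A^t A).
A = np.array([[-1.0,  0.0],
              [ 1.0, -1.0],
              [ 0.0,  1.0]])

e1 = np.array([1.0, -1.0]) / np.sqrt(2)
e2 = np.array([1.0,  1.0]) / np.sqrt(2)
f1 = A @ e1 / np.linalg.norm(A @ e1)
f2 = A @ e2 / np.linalg.norm(A @ e2)
f3 = np.array([1.0, 1.0, 1.0]) / np.sqrt(3)

U = np.column_stack([f1, f2, f3])
D = np.array([[np.sqrt(3.0), 0.0],
              [0.0,          1.0],
              [0.0,          0.0]])
V = np.column_stack([e1, e2])

print(np.allclose(U @ D @ V.T, A))   # True: the decomposition is correct
print(np.allclose(np.linalg.svd(A, compute_uv=False), [np.sqrt(3.0), 1.0]))  # True
```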