Are all matrices diagonalisable?

Kapil H. Paranjape

I have frequently used the above question to gauge the solidity of the mathematical training of another person. The incorrect responses vary between:

An unhesitating ``Yes''.
An unhesitating ``No'' with no counter-example forthcoming.
A ``Yes'' after some thought.
``Yes, over the field of complex numbers''.

Linear Algebra forms the heart of ``modern'' mathematics. It is thus disappointing to see the number of students confused about such an important issue. This note is an attempt to show the ramifications of this question and provide a detailed answer.

To begin with a matrix A is said to be ``diagonalisable'' if it is a square matrix and there is an invertible matrix B such that B^-1AB is a diagonal matrix. So we need restrict ourselves only to square matrices for the purposes of our discussion. Now it is clear that if {e₁,..., e_n} is the standard basis for the vector space on which A acts, then {Be₁,..., Be_n} is another basis, which has the property that each of the basis vectors is an eigenvector; moreover, the diagonal entries of B^-1AB are just the eigenvalues of A. Thus an equivalent definition of diagonalisability is be that there be a basis consisting of eigenvectors.

If A is a symmetric matrix then one can show that there is an orthogonal basis of eigenvectors and thus we obtain the diagonalisability of symmetric matrices. One can argue similarly for some other classes of matrices. This is perhaps what people who give the third answer are thinking of. However, there is an important class of matrices of which none except the zero matrix is diagonalisable (see below).

Now it is clear that a matrix like $\left(\vphantom{ \begin{matrix} 0 & 1\\ -1& 0 \end{matrix}}\right.$ $\begin{matrix} 0 & 1\\ -1& 0 \end{matrix}$ $\left.\vphantom{ \begin{matrix} 0 & 1\\ -1& 0 \end{matrix}}\right)$ cannot have eigenvalues over the field of real numbers; hence it isn't diagonalisable over the real numbers. However, it is diagonalisable over the field of complex numbers with eigenvectors e₁± $\sqrt{-1}$ ^.e₂. More generally, one could argue as follows. From the second definition and the usual definition of eigenvectors we know that we should be looking at the roots of the Characteristic Polynomial P_A(t) = det(A - tI). Now if we are working over the field of complex numbers we have all roots of P_A(t). Thus we have eigenvectors corresponding to every eigenvalue. This is perhaps the reasoning of people who give the fourth answer. However, the problem is that if an eigenvalue occurs with multiplicity greater than one it is not clear that we can find two or more (linearly independent) eigenvectors.

Of course the correct answer to the main question is ``No'' because of the phenomenon of nilpotent matrices. A matrix such as $\left(\vphantom{ \begin{matrix} 0 & 1\\ 0& 0 \end{matrix}}\right.$ $\begin{matrix} 0 & 1\\ 0& 0 \end{matrix}$ $\left.\vphantom{ \begin{matrix} 0 & 1\\ 0& 0 \end{matrix}}\right)$ has 0 as its only eigenvalue but it is not the zero matrix and thus it cannot be diagonalisable. It is clear that if N is nilpotent matrix (i. e. N^k = 0 for some k) then it is diagonalisable if and only N = 0. In fact, the role of nilpotent matrices is made more precise by the ``Jordan decomposition theorem'' which says that any matrix A over the real numbers (or more generally over a perfect field) can be written as A = S + N where S and N commute, N is nilpotent and S is diagonalisable over the complex numbers (respectively over an algebraic extension of the ground field).

Now we can approach the question from a different viewpoint and ask for conditions which ensure that the nilpotent matrix N is zero. One way to do this is to ensure that P_A(t) has distinct roots. In this case we clearly have a basis of eigenvectors. Let f (t) = tⁿ + c₁t^{n - 1} + ... + c_n be any polynomial. There is a (universal) polynomial D_n(C₁,..., C_n) called the discriminant with the property that D_n(c₁,..., c_n) = 0 if and only if f (t) has multiple roots (e. g. for n = 2 we have D₂ = C₁² - 4C₂). Now

the coefficients of the characteristic polynomial of a matrix A are polynomials in the matrix entries of A.

Thus, treating the entries of the matrix as variables we get a (universal) polynomial $\Delta_{n}^{}$ in the matrix entries of A with the property that the characteristic polynomial of A has multiple roots if and only if $\Delta_{n}^{}$ vanishes at A. It is not difficult to show that $\Delta_{n}^{}$ is not the identically zero polynomial; moreover, the zero set of a non-zero polynomial is always a ``thin set'' in the sense of measure theory. In other words, if we bound the matrix entries of A and choose them randomly (but uniformly) within these bounds then with probability 1 we will find a diagonalisable matrix. Thus my own answer to the question posed above is two-fold:

Every matrix is not diagonalisable. Take for example non-zero nilpotent matrices. The Jordan decomposition tells us how close a given matrix can come to diagonalisability.
If we choose our matrix ``randomly'' (in a uniform distribution) from within a bounded region, then it will turn out to be diagonalisable over the complex numbers with probability 1 (in other words ``always'').

Kapil Hari Paranjape 2002-11-21