15. Matrix Factorizations
Dated: 18-04-2025
Matrix Factorization
A factorization of a matrix as a product of two or more matrices is called a matrix factorization.
Uses of Matrix Factorization
\(LU\) Factorization or Decomposition
It is a decomposition of a matrix into two factors, a lower triangular matrix and an upper triangular matrix, and is used to solve systems of linear equations or to compute determinants.
Assume \(A_{m \times n}\) can be row reduced to echelon form without row interchanges. Then \(A\) can be written in the form \(A_{m \times n} = L_{m \times m} U_{m \times n}\), where \(L_{m \times m}\) is lower triangular with \(1\)'s on the diagonal and \(U_{m \times n}\) is an echelon form of \(A\).
\[
A =
\begin{bmatrix}
1 & 0 & 0 & 0\\
\star & 1 & 0 & 0\\
\star & \star & 1 & 0\\
\star & \star & \star & 1
\end{bmatrix}
\begin{bmatrix}
\bullet & \star & \star & \star & \star \\
0 & \bullet & \star & \star & \star \\
0 & 0 & 0 & \bullet & \star \\
0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]
In general, not every matrix \(A\) has an \(LU\) decomposition, nor is an \(LU\) decomposition necessarily unique.
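For example (a standard illustration, not part of the original notes), a matrix that needs a row interchange has no such factorization: requiring
\[
\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}
=
\begin{bmatrix} 1 & 0 \\ \ell & 1 \end{bmatrix}
\begin{bmatrix} u_{11} & u_{12} \\ 0 & u_{22} \end{bmatrix}
\]
would force \(u_{11} = 0\) and \(\ell \, u_{11} = 1\) simultaneously, which is impossible.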
\(LU\) Factorization Algorithm
Suppose \(A\) can be reduced to an echelon form \(U\) without row interchanges. Then \(U\) can be produced from \(A\) by a sequence of row operations, each of which corresponds to a lower triangular elementary matrix \(E_i\):
\[E_p \cdots E_1 A = U\]
\[\implies A = (E_p \cdots E_1)^{-1}U\]
Let \(L = (E_p \cdots E_1)^{-1}\), which is again lower triangular.
\[\therefore A = LU\]
Procedure
Step 1
Reduce \(A\) to \(U\) without row interchanges, keeping track of the multipliers used to introduce the leading \(1\)'s and the multipliers used to introduce the \(0\)'s below them.
Step 2
In each position along the main diagonal of \(L\), place the reciprocal of the multiplier that introduced the leading \(1\) in that position in \(U\).
Step 3
In each position below the main diagonal of \(L\), place the negative of the multiplier used to introduce the \(0\) in that position in \(U\).
Step 4
Form the decomposition \(A = LU\).
Note that with this procedure the leading \(1\)'s appear in \(U\) and the pivots appear on the diagonal of \(L\); this is an equally valid normalization of the factorization described above.
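As a rough illustration of Steps 1-4, here is a minimal Python/NumPy sketch of my own (assuming a square matrix whose pivots are all nonzero, so no row interchanges are needed; `lu_no_pivot` is not a library function):

```python
import numpy as np

def lu_no_pivot(A):
    """Sketch of the procedure above for a square matrix with nonzero
    pivots. Returns L (lower triangular, pivots on its diagonal) and
    U (echelon form with leading 1's) such that A = L @ U."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    U = A.copy()
    L = np.eye(n)
    for k in range(n):
        # Step 2: the diagonal entry of L is the reciprocal of the scaling
        # multiplier, i.e. the pivot value before the row is scaled.
        pivot = U[k, k]
        L[k, k] = pivot
        U[k, :] = U[k, :] / pivot            # introduce the leading 1 in U
        for i in range(k + 1, n):
            # Step 3: the entry below the diagonal of L is the negative of
            # the multiplier used to introduce the 0 (that multiplier is
            # -U[i, k], so L gets +U[i, k]).
            L[i, k] = U[i, k]
            U[i, :] -= U[i, k] * U[k, :]      # introduce the 0 in U
    return L, U

# Check against the 3x3 system solved later in these notes:
A = np.array([[2, 6, 2], [-3, -8, 0], [4, 9, 2]])
L, U = lu_no_pivot(A)
print(np.allclose(L @ U, A))                  # True
```

Running this on the \(3 \times 3\) matrix from the later example reproduces the same \(L\) and \(U\) found by hand there.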
Example
Find the \(LU\) decomposition of
\[
A = \begin{bmatrix}
2 & -4 & -2 & 3 \\
6 & -9 & -5 & 8 \\
2 & -7 & -3 & 9 \\
4 & -2 & -2 & -1 \\
-6 & 3 & 3 & 4
\end{bmatrix}
\]
\[
L =
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 \\
* & 1 & 0 & 0 & 0 \\
* & * & 1 & 0 & 0 \\
* & * & * & 1 & 0 \\
* & * & * & * & 1
\end{bmatrix}
\]
Here \(*\) represents an unknown entry.
\[
\begin{align}
R_1 = &R_1 \times \frac 1 2\\
&\downarrow
\end{align}
\]
\[
A =
\begin{bmatrix}
1 & -2 & -1 & \frac{3}{2} \\
6 & -9 & -5 & 8 \\
2 & -7 & -3 & 9 \\
4 & -2 & -2 & -1 \\
-6 & 3 & 3 & 4
\end{bmatrix}
\]
\[
L =
\begin{bmatrix}
2 & 0 & 0 & 0 & 0 \\
* & 1 & 0 & 0 & 0 \\
* & * & 1 & 0 & 0 \\
* & * & * & 1 & 0 \\
* & * & * & * & 1
\end{bmatrix}
\]
\[
\begin{align}
R_2 = R_2 &- 6 \times R_1\\
&\downarrow \\
R_3 = R_3 &- 2 \times R_1\\
&\downarrow \\
R_4 = R_4 &- 4 \times R_1\\
&\downarrow \\
R_5 = R_5 &+ 6 \times R_1\\
&\downarrow \\
\end{align}
\]
\[
A =
\begin{bmatrix}
1 & -2 & -1 & \frac{3}{2} \\
0 & 3 & 1 & -1 \\
0 & -3 & -1 & 6 \\
0 & 6 & 2 & -7 \\
0 & -9 & -3 & 13
\end{bmatrix}
\]
\[
L =
\begin{bmatrix}
2 & 0 & 0 & 0 & 0 \\
6 & 1 & 0 & 0 & 0 \\
2 & * & 1 & 0 & 0 \\
4 & * & * & 1 & 0 \\
-6 & * & * & * & 1
\end{bmatrix}
\]
\[
\begin{align}
R_2 = &R_2 \times \frac 1 3\\
&\downarrow
\end{align}
\]
\[
A =
\begin{bmatrix}
1 & -2 & -1 & \frac{3}{2} \\
0 & 1 & \frac{1}{3} & -\frac{1}{3} \\
0 & -3 & -1 & 6 \\
0 & 6 & 2 & -7 \\
0 & -9 & -3 & 13
\end{bmatrix}
\]
\[
L =
\begin{bmatrix}
2 & 0 & 0 & 0 & 0 \\
6 & 3 & 0 & 0 & 0 \\
2 & * & 1 & 0 & 0 \\
4 & * & * & 1 & 0 \\
-6 & * & * & * & 1
\end{bmatrix}
\]
\[
\begin{align}
R_3 = R_3 &+ 3 \times R_2\\
&\downarrow\\
R_4 = R_4 &- 6 \times R_2\\
&\downarrow\\
R_5 = R_5 &+ 9 \times R_2\\
&\downarrow\\
\end{align}
\]
\[
A =
\begin{bmatrix}
1 & -2 & -1 & \frac{3}{2} \\
0 & 1 & \frac{1}{3} & -\frac{1}{3} \\
0 & 0 & 0 & 5 \\
0 & 0 & 0 & -5 \\
0 & 0 & 0 & 10
\end{bmatrix}
\]
\[
L =
\begin{bmatrix}
2 & 0 & 0 & 0 & 0 \\
6 & 3 & 0 & 0 & 0 \\
2 & -3 & 1 & 0 & 0 \\
4 & 6 & * & 1 & 0 \\
-6 & -9 & * & * & 1
\end{bmatrix}
\]
\[
\begin{align}
R_4 = &R_4 \times - \frac 1 5\\
&\downarrow
\end{align}
\]
\[
A =
\begin{bmatrix}
1 & -2 & -1 & \frac{3}{2} \\
0 & 1 & \frac{1}{3} & -\frac{1}{3} \\
0 & 0 & 0 & 5 \\
0 & 0 & 0 & 1 \\
0 & 0 & 0 & 10
\end{bmatrix}
\]
\[
L =
\begin{bmatrix}
2 & 0 & 0 & 0 & 0 \\
6 & 3 & 0 & 0 & 0 \\
2 & -3 & 1 & 0 & 0 \\
4 & 6 & 0 & -5 & 0 \\
-6 & -9 & 0 & * & 1
\end{bmatrix}
\]
\[
\begin{align}
R_5 = R_5 &- 10 \times R_4\\
&\downarrow
\end{align}
\]
\[
U =
\begin{bmatrix}
1 & -2 & -1 & \frac{3}{2} \\
0 & 1 & \frac{1}{3} & -\frac{1}{3} \\
0 & 0 & 0 & 5 \\
0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0
\end{bmatrix}
\]
\[
L = \begin{bmatrix}
2 & 0 & 0 & 0 & 0 \\
6 & 3 & 0 & 0 & 0 \\
2 & -3 & 1 & 0 & 0 \\
4 & 6 & 0 & -5 & 0 \\
-6 & -9 & 0 & 10 & 1
\end{bmatrix}
\]
\[
A = LU = \begin{bmatrix}
2 & 0 & 0 & 0 & 0 \\
6 & 3 & 0 & 0 & 0 \\
2 & -3 & 1 & 0 & 0 \\
4 & 6 & 0 & -5 & 0 \\
-6 & -9 & 0 & 10 & 1
\end{bmatrix}
\begin{bmatrix}
1 & -2 & -1 & \frac{3}{2} \\
0 & 1 & \frac{1}{3} & -\frac{1}{3} \\
0 & 0 & 0 & 5 \\
0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0
\end{bmatrix}
\]
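As a quick sanity check (my own verification, not part of the original worked example), NumPy confirms that this \(L\) and \(U\) multiply back to \(A\):

```python
import numpy as np

# L and U as obtained above; fractions entered as floats.
L = np.array([[ 2,  0, 0,  0, 0],
              [ 6,  3, 0,  0, 0],
              [ 2, -3, 1,  0, 0],
              [ 4,  6, 0, -5, 0],
              [-6, -9, 0, 10, 1]], dtype=float)
U = np.array([[1, -2,  -1,  3/2],
              [0,  1, 1/3, -1/3],
              [0,  0,   0,    5],
              [0,  0,   0,    1],
              [0,  0,   0,    0]])
A = np.array([[ 2, -4, -2,  3],
              [ 6, -9, -5,  8],
              [ 2, -7, -3,  9],
              [ 4, -2, -2, -1],
              [-6,  3,  3,  4]], dtype=float)
print(np.allclose(L @ U, A))   # True
```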
Matrix Inversion by \(LU\) Decomposition
Let \(A\) be an invertible \(n \times n\) matrix and write
\[A^{-1} =
\begin{bmatrix}
x_1 & x_2 & \cdots & x_n
\end{bmatrix}
\]
and similarly
\[I =
\begin{bmatrix}
e_1 & e_2 & \cdots & e_n
\end{bmatrix}
\]
where \(x_i\) and \(e_i\) are the columns of \(A^{-1}\) and \(I\) respectively.
\[\because A A^{-1} = I\]
\[
A
\begin{bmatrix}
x_1 & x_2 & \cdots & x_n
\end{bmatrix}
=
\begin{bmatrix}
e_1 & e_2 & \cdots & e_n
\end{bmatrix}
\]
\[
\implies
\begin{bmatrix}
Ax_1 & Ax_2 & \cdots & Ax_n
\end{bmatrix}
=
\begin{bmatrix}
e_1 & e_2 & \cdots & e_n
\end{bmatrix}
\]
\[
\implies
Ax_1 = e_1, \quad Ax_2 = e_2, \quad \ldots, \quad Ax_n = e_n
\]
Each of these \(n\) systems can be solved with the same \(LU\) factorization of \(A\), producing the columns of \(A^{-1}\) one at a time.
Solving a Linear System by \(LU\) Factorization
Procedure
Rewrite \(A \vec x = \vec b\) as
\[LU \vec x = \vec b \quad \because A = LU\]
\[L(U \vec x) = \vec b\]
Then let \(U \vec x = \vec y\), so that
\[L \vec y = \vec b\]
Solve this system for \(\vec y\) by forward substitution, then solve \(U \vec x = \vec y\) for \(\vec x\) by back substitution.
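The two triangular solves can be written directly; here is a small sketch (the helper names are my own, and no pivoting is assumed) of what is done by hand in the example below:

```python
import numpy as np

def forward_substitution(L, b):
    """Solve L y = b for lower triangular L, working from the top row down."""
    n = len(b)
    y = np.zeros(n)
    for i in range(n):
        y[i] = (b[i] - L[i, :i] @ y[:i]) / L[i, i]
    return y

def back_substitution(U, y):
    """Solve U x = y for upper triangular U, working from the bottom row up."""
    n = len(y)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x
```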
Example
Solve the following system using \(LU\) decomposition:
\[
\begin{align}
2x_1 + 6x_2 + 2x_3 &= 2\\
-3x_1 - 8x_2 &= 2\\
4x_1 + 9x_2 + 2x_3 &= 3
\end{align}
\]
Solution
\[
\begin{bmatrix}
2 & 6 & 2 \\
-3 & -8 & 0 \\
4 & 9 & 2
\end{bmatrix}
\begin{bmatrix}
x_1 \\
x_2 \\
x_3
\end{bmatrix}
=
\begin{bmatrix}
2 \\
2 \\
3
\end{bmatrix}
\]
\[
A = \begin{bmatrix}
2 & 6 & 2 \\
-3 & -8 & 0 \\
4 & 9 & 2
\end{bmatrix}
\]
\[
L = \begin{bmatrix}
1 & 0 & 0 \\
* & 1 & 0 \\
* & * & 1
\end{bmatrix}
\]
\[
\begin{align}
R_1 = &R_1 \times \frac 1 2\\
&\downarrow
\end{align}
\]
\[
A =
\begin{bmatrix}
1 & 3 & 1 \\
-3 & -8 & 0 \\
4 & 9 & 2
\end{bmatrix}
\]
\[
L =
\begin{bmatrix}
2 & 0 & 0 \\
* & 1 & 0 \\
* & * & 1
\end{bmatrix}
\]
\[
\begin{align}
R_2 = R_2 &+ 3R_1\\
&\downarrow\\
R_3 = R_3 &- 4R_1\\
&\downarrow\\
\end{align}
\]
\[
A =
\begin{bmatrix}
1 & 3 & 1 \\
0 & 1 & 3 \\
0 & -3 & -2
\end{bmatrix}
\]
\[
L =
\begin{bmatrix}
2 & 0 & 0 \\
-3 & 1 & 0 \\
4 & * & 1
\end{bmatrix}
\]
\[
\begin{align}
R_3 = R_3 &+ 3R_2\\
&\downarrow
\end{align}
\]
\[
A =
\begin{bmatrix}
1 & 3 & 1 \\
0 & 1 & 3 \\
0 & 0 & 7
\end{bmatrix}
\]
\[
L =
\begin{bmatrix}
2 & 0 & 0 \\
-3 & 1 & 0 \\
4 & -3 & 1
\end{bmatrix}
\]
\[
\begin{align}
R_3 = & \frac 1 7 R_3 \\
&\downarrow
\end{align}
\]
\[
A =
\begin{bmatrix}
1 & 3 & 1 \\
0 & 1 & 3 \\
0 & 0 & 1
\end{bmatrix}
\]
\[
L =
\begin{bmatrix}
2 & 0 & 0 \\
-3 & 1 & 0 \\
4 & -3 & 7
\end{bmatrix}
\]
\[\because A = LU\]
\[
\begin{bmatrix}
2 & 6 & 2 \\
-3 & -8 & 0 \\
4 & 9 & 2
\end{bmatrix}
=
\begin{bmatrix}
2 & 0 & 0 \\
-3 & 1 & 0 \\
4 & -3 & 7
\end{bmatrix}
\begin{bmatrix}
1 & 3 & 1 \\
0 & 1 & 3 \\
0 & 0 & 1
\end{bmatrix}
\]
\[\because LU \vec x = \vec b\]
\[
\begin{bmatrix}
2 & 0 & 0 \\
-3 & 1 & 0 \\
4 & -3 & 7
\end{bmatrix}
\begin{bmatrix}
1 & 3 & 1 \\
0 & 1 & 3 \\
0 & 0 & 1
\end{bmatrix}
\begin{bmatrix}
x_1 \\
x_2 \\
x_3
\end{bmatrix}
=
\begin{bmatrix}
2 \\
2 \\
3
\end{bmatrix}
\]
First we set \(U \vec x = \vec y\):
\[
\begin{bmatrix}
1 & 3 & 1 \\
0 & 1 & 3 \\
0 & 0 & 1
\end{bmatrix}
\begin{bmatrix}
x_1 \\
x_2 \\
x_3
\end{bmatrix}
=
\begin{bmatrix}
y_1 \\
y_2 \\
y_3
\end{bmatrix}
\]
Then we will solve \(L \vec y = \vec b\)
\[
\begin{bmatrix}
2 & 0 & 0 \\
-3 & 1 & 0 \\
4 & -3 & 7
\end{bmatrix}
\begin{bmatrix}
y_1 \\
y_2 \\
y_3
\end{bmatrix}
=
\begin{bmatrix}
2 \\
2 \\
3
\end{bmatrix}
\]
\[
\Big\downarrow
\]
\[
\begin{aligned}
2y_1 &&= 2 \\
-3y_1 &+ y_2 &= 2 \\
4y_1 &- 3y_2 + 7y_3 &= 3
\end{aligned}
\]
Forward Substitution
We solve the equations from the top down instead of from the bottom up; this procedure is called forward substitution.
\[y_1 = 1, \quad y_2 = 5, \quad y_3 = 2\]
Substituting these values into \(U \vec x = \vec y\) and solving from the bottom up (back substitution):
\[
\begin{bmatrix}
1 & 3 & 1 \\
0 & 1 & 3 \\
0 & 0 & 1
\end{bmatrix}
\begin{bmatrix}
x_1 \\
x_2 \\
x_3
\end{bmatrix}
=
\begin{bmatrix}
1 \\
5 \\
2
\end{bmatrix}
\]
\[
\Big\downarrow
\]
\[
\begin{aligned}
x_1 + 3x_2 + x_3 &= 1 \\
x_2 + 3x_3 &= 5 \\
x_3 &= 2
\end{aligned}
\]
\[x_1 = 2, x_2 = -1, x_3 = 2\]
Numerical Notes
The following estimates apply for \(A_{n \times n}\) with \(n \ge 30\).
- Computing the \(LU\) factorization of \(A\) takes about \(\frac 2 3 n^3\) flops, about the same as row reducing \(\begin{bmatrix} A & b \end{bmatrix}\), whereas computing \(A^{-1}\) requires about \(2 n^3\) flops.
- Solving \(L \vec y = \vec b\) and \(U \vec x = \vec y\) requires about \(2n^2\) flops, because an \(n \times n\) triangular system can be solved in about \(n^2\) flops.
- Computing \(A^{-1} \vec b\) also requires about \(2n^2\) flops, but the result may not be as accurate as the one obtained from the \(LU\) factorization (because of round-off errors).
- If \(A\) is sparse (with mostly \(0\) entries), then \(L\) and \(U\) may be sparse too, whereas \(A^{-1}\) is likely to be dense; solving \(A \vec x = \vec b\) with the \(LU\) factorization is then much faster than using \(A^{-1}\).
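A brief SciPy illustration of the reuse point (a sketch of my own; the flop counts above come from the notes, not from timing this code): factor once, then each new right-hand side costs only the two cheap triangular solves.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[2, 6, 2], [-3, -8, 0], [4, 9, 2]], dtype=float)
b = np.array([2, 2, 3], dtype=float)

lu, piv = lu_factor(A)               # ~(2/3) n^3 flops, done once
x = lu_solve((lu, piv), b)           # ~2 n^2 flops per right-hand side
print(x)                             # [ 2. -1.  2.], matching the example above

# Any further right-hand side reuses the same factorization:
x2 = lu_solve((lu, piv), np.array([1.0, 0.0, 0.0]))
```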