LU decomposition#
Commonly we will have to repeatedly solve \(Ax = b\) for multiple right-hand sides \(b_i\). Gaussian Elimination for each \(b_i\) would be grossly inefficient. If you knew all the \(b_i\) in advance, you could handle them in parallel by forming the augmented matrix:
\([A|b_1 \ b_2 \ b_3 \ ...]\)
but this is seldom the case.
It is much more efficient to decompose the matrix \(A\) into a form that is easier to solve.
There are other reasons to do this, such as special matrix types and distributed computing, which we will discuss later.
We have actually already seen this efficiency boost with back-substitution. The equation \(U x = b\) solves in \(O(n^2)\).
Any square matrix can be decomposed (possibly after reordering its rows, as we will see with pivoting below),
\(A = LU\)
where:
\(L\) is a lower triangular matrix
\(U\) is an upper triangular matrix
Now, the linear system becomes:
\(L U x = b\)
Now let \(y = Ux\), such that
\(L y = b\) (forward substitution)
\(U x = y\) (back substitution)
both of which solve in \(O(n^2)\).
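As a sketch of those two triangular solves using SciPy’s solve_triangular (the \(L\), \(U\), and \(b\) here are taken from the worked example that follows):
import numpy as np
from scipy.linalg import solve_triangular

# Factors and right-hand side from the worked example below
L = np.array([[1, 0, 0], [-0.5, 1, 0], [2, -0.8, 1]])
U = np.array([[4, 3, -5], [0, -2.5, 2.5], [0, 0, 12]])
b = np.array([2, 5, -3])

y = solve_triangular(L, b, lower=True)  # forward substitution: L y = b, O(n^2)
x = solve_triangular(U, y)              # back substitution:    U x = y, O(n^2)
print(x, np.allclose(L @ U @ x, b))     # x solves A x = (L U) x = b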
NOTE: \(L\) and \(U\) are generally not unique: for any invertible diagonal matrix \(D\), \(A = LU = (LD)(D^{-1}U)\) is an equally valid factorization. Fixing 1’s along the diagonal of \(L\) (as we do below) is one common convention.
Example: Return to the previous example:
\begin{align} 4x_1 + 3x_2 - 5x_3 &= 2 \\ -2x_1 - 4x_2 + 5x_3 &= 5 \\ 8x_1 + 8x_2 &= -3 \end{align}
Through Gaussian Elimination, we found
\(U = \begin{bmatrix} 4 & 3 & -5 \\ 0 & -2.5 & 2.5 \\ 0 & 0 & 12 \end{bmatrix}\)
by clearing the first column: subtract the first row multiplied by \(-0.5\) from the second row, and multiplied by \(2\) from the third. The second column was then cleared by subtracting the second row multiplied by \(-0.8\) from the third. These coefficients turn out to be the elements of the \(L\) matrix (with 1’s along the diagonal)!
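Here is a minimal sketch of that procedure as code, a no-pivot LU routine (the helper name lu_nopivot is ours), assuming every pivot encountered is nonzero:
import numpy as np

def lu_nopivot(A):
    """LU without pivoting: the elimination multipliers fill L below the diagonal."""
    n = A.shape[0]
    U = A.astype(float).copy()
    L = np.eye(n)
    for k in range(n - 1):
        for i in range(k + 1, n):
            L[i, k] = U[i, k] / U[k, k]   # elimination multiplier
            U[i, :] -= L[i, k] * U[k, :]  # clear the entry below the pivot
    return L, U

L, U = lu_nopivot(np.array([[4, 3, -5],
                            [-2, -4, 5],
                            [8, 8, 0]]))
print(L)  # the multipliers -0.5, 2, and -0.8 appear below the diagonal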
Let’s verify:
# Verify the hand-computed factors by multiplying them back together
import numpy as np

# Define the matrix A
A = np.array([[4, 3, -5],
              [-2, -4, 5],
              [8, 8, 0]])
print("The original matrix A is:\n", A, "\n")

# Hand-computed factors from the Gaussian Elimination above
L = np.array([[1, 0, 0],
              [-0.5, 1, 0],
              [2, -0.8, 1]])
U = np.array([[4, 3, -5],
              [0, -2.5, 2.5],
              [0, 0, 12]])
print("The reconstructed matrix is:\n", L @ U)
The original matrix A is:
[[ 4 3 -5]
[-2 -4 5]
[ 8 8 0]]
The reconstructed matrix is:
[[ 4. 3. -5.]
[-2. -4. 5.]
[ 8. 8. 0.]]
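To see the non-uniqueness noted earlier numerically, we can rescale the factors. A minimal sketch, continuing with np, L, and U from the cell above (the diagonal entries of \(D\) are arbitrary):
# Rescale the factors: A = LU = (L D)(D^{-1} U)
D = np.diag([2.0, 5.0, 0.5])          # any invertible diagonal works
L2 = L @ D                            # still lower triangular
U2 = np.diag(1 / np.diag(D)) @ U      # still upper triangular
print(np.allclose(L2 @ U2, L @ U))    # True: a different valid L, U pair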
Let’s check SciPy’s decomposition!
# Calculate the LU decomposition
from scipy.linalg import lu

P, L, U = lu(A)
print("Permutation Matrix (P):\n", P)
print("Lower Triangular Matrix (L):\n", L)
print("Upper Triangular Matrix (U):\n", U)
print("\nMultiply L and U:\n", L @ U, "\nwhich is correct but pivoted!")
print("\nMultiply PLU:\n", P @ L @ U, "\nwhich is the original matrix!")
Permutation Matrix (P):
[[0. 0. 1.]
[0. 1. 0.]
[1. 0. 0.]]
Lower Triangular Matrix (L):
[[ 1. 0. 0. ]
[-0.25 1. 0. ]
[ 0.5 0.5 1. ]]
Upper Triangular Matrix (U):
[[ 8. 8. 0. ]
[ 0. -2. 5. ]
[ 0. 0. -7.5]]
Multiply L and U:
[[ 8. 8. 0.]
[-2. -4. 5.]
[ 4. 3. -5.]]
which is correct but pivoted!
Multiply PLU:
[[ 4. 3. -5.]
[-2. -4. 5.]
[ 8. 8. 0.]]
which is the original matrix!
NB: \(P\) in the above is the permutation matrix that, when multiplied by \(LU\), recovers the original matrix (\(A = PLU\)). It is not the pivoting operation that is done internally, although that matrix is easily obtained: a permutation matrix satisfies \(P^{-1} = P^T\), so the internally applied pivoting matrix is simply \(P^T\), with \(P^T A = LU\).
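When the goal is exactly the reuse that motivated this section (one factorization, many right-hand sides), SciPy’s lu_factor and lu_solve store the factors compactly. A minimal sketch, continuing with np and A from above (the second right-hand side is arbitrary):
from scipy.linalg import lu_factor, lu_solve

# Factor A once (O(n^3)); every subsequent solve is only O(n^2)
lu_piv = lu_factor(A)   # compact LU storage plus pivot indices
for b in [np.array([2, 5, -3]), np.array([1, 0, 0])]:
    x = lu_solve(lu_piv, b)
    print("x =", x, " residual =", np.linalg.norm(A @ x - b))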
Dr. Mike’s Tips!#
Direct solvers are your ‘black box’ for most of your needs.
They are the most robust option for ill-conditioned systems.
They scale terribly (both in system size and in parallelization).
If you use them, start with a small system and work upwards.
Generally speaking, you won’t see a speedup from parallelization until you get to a large number of nodes.
Warning: Some implementations (e.g. numpy) are sophisticated enough to handle singular as well as non-singular matrices (be careful with the answer!).
Sparse matrices are your saving grace! Do your best to protect their sparsity (hence store the LU factors, not the inverse!).
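As a sketch of that last tip: SciPy’s splu computes a sparse LU factorization (via SuperLU) whose factors can be reused for many right-hand sides. The matrix size and density below are arbitrary illustration values:
import numpy as np
from scipy.sparse import identity, random as sparse_random
from scipy.sparse.linalg import splu

# Build an illustrative sparse, well-conditioned matrix (values are arbitrary)
n = 2000
A = (sparse_random(n, n, density=1e-3) + 10 * identity(n)).tocsc()  # splu wants CSC

lu = splu(A)            # factor once; the factors stay sparse
for _ in range(3):      # reuse for many right-hand sides
    b = np.random.rand(n)
    x = lu.solve(b)
    print(np.linalg.norm(A @ x - b))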