
Solving linear systems with Gauss elimination#

Gauss elimination is the algorithmic form of the most familiar and intuitive solution technique. Let's reexamine our analytical (symbolic) solution to the previous problem and then turn it into a numerical algorithm.

Question: You are organizing a fundraising event and need to buy chairs and tables. Chairs cost $20 each and tables cost $50 each. You have a budget of $700 and need a total of 20 pieces of furniture (chairs and tables combined). How many chairs and tables should you buy?

Symbolic manipulation#

Let \(c\) and \(t\) be the number of chairs and tables respectively.

The budget and pieces equations are,

(1) \(20 c + 50 t = 700\)

(2) \( c+t = 20\)

Solve (2) for \(c\):

\(c = 20-t\)

Substitute into (1):

\(20 [20-t] + 50t = 700\)

\(400 + 30 t = 700\)

\(t = 10\)

and substitute into (2) or (1) to find \(c\).

This works because our first step carries the unknown symbol \(t\).
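As a quick check, sympy (which we will use again later in this notebook) can solve the same pair of equations symbolically. A minimal sketch:

from sympy import symbols, Eq, solve

c, t = symbols('c t')

# Budget and pieces equations from above
eq1 = Eq(20*c + 50*t, 700)
eq2 = Eq(c + t, 20)

print(solve([eq1, eq2], [c, t]))  # {c: 10, t: 10}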

Numerical approach#

Let's repeat this without carrying symbols.

(1) \(20 c + 50 t = 700\)

(2) \( c+t = 20\)

Multiply (2) by \(20\):

\(20 c + 20 t = 400\)

and subtract from (1):

\begin{eqnarray*} 20 c + 50 t &=& 700 \\ -20 c - 20 t &=& -400 \\ \hline 30 t &=& 300 \end{eqnarray*}

Thus we have simplified the last line until it reaches a trivial solution for \(t\). Now it is a matter of substitution to solve for \(c\).

NB: the linear system has been changed (without changing the answer) to:

\(\begin{pmatrix} 20 & 50 \\ 0 & 30 \end{pmatrix} \begin{pmatrix} c \\ t \end{pmatrix} = \begin{pmatrix} 700 \\ 300 \end{pmatrix}\)

In particular, \(A\) is now upper triangular (aka row echelon form, for a square matrix), from which the answer is easily obtained through backward substitution.

Upper triangular matrices#

An upper triangular matrix, \(U\), is a matrix whose elements below the diagonal are zero:

\[\begin{split}\begin{bmatrix} u_{1,1} & u_{1,2} & u_{1,3} & u_{1,4}\\ 0 & u_{2,2}' & u_{2,3}' & u_{2,4}'\\ 0 & 0 & u_{3,3}' & u_{3,4}' \\ 0 & 0 & 0 & u_{4,4}' \end{bmatrix}\end{split}\]

This is useful because, in equation form,

\(U x = b'\)

turns into

\begin{eqnarray*} \begin{array}{rcrcrcrcl} u_{1,1} x_1 &+& u_{1,2} x_2 &+& u_{1,3} x_3 &+& u_{1,4} x_4 &=& b_1', \\ && u_{2,2}' x_2 &+& u_{2,3}' x_3 &+& u_{2,4}' x_4 &=& b_2', \\ && && u_{3,3}' x_3 &+& u_{3,4}' x_4 &=& b_3', \\ && && && u_{4,4}' x_4 &=& b_4'. \end{array} \end{eqnarray*}

which can be solved in \(O(n^2)\) time!
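To make this concrete, here is a minimal numpy sketch of back-substitution (the function name back_substitution is our own, not a library routine), applied to the triangular system derived above:

import numpy as np

def back_substitution(U, b):
    # Solve U x = b for upper triangular U, working from the last row up
    n = len(b)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        # subtract the already-known terms, then divide by the diagonal
        x[i] = (b[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]
    return x

U = np.array([[20., 50.],
              [0., 30.]])
bp = np.array([700., 300.])
print(back_substitution(U, bp))  # [10. 10.]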

Gauss Elimination#

Gauss Elimination proceeds in two phases. The first is elimination which transforms

\(A x = b\)

into

\(U x = b'\)

where \(U\) is an upper triangular matrix and \(b'\) is a modified vector of constants. The second phase is back-substitution, which is trivial with triangular matrices.

Gauss elimination algorithm#

We first take

\(Ax = b\)

and build an Augmented matrix:

\(Ab = [ A | b ]\)

We are allowed the following operations which affect \(|A|\) but not the solution to the problem:

| Operation | Effect on \(|A|\) |
|-----|-----|
| Exchange 2 rows | Flips sign |
| Multiply a row by \(\beta\) | Multiplied by \(\beta\) |
| Subtract one row from another | Unchanged |

Steps from our previous example#

import numpy as np
# Define A and b

A = np.array([[20, 50],
              [1, 1]])

b = np.array([[700], [20]])

#Form the augmented matrix A|b
Ab = np.hstack([A, b])

#The coefficient matrix can be separated by slicing Ab
def print_update():
  print("Augmented matrix is \n", Ab)
  print("Determinant of A: ", np.linalg.det(Ab[:,:-1])),
  print("Norm of A", np.linalg.norm(Ab[:,:-1], 'fro')),
  print("Condition number of A:", np.linalg.cond(Ab[:, :-1]))
  print("\n")

print("Step 0: Show the augmented matrix and the determinant of A")
print_update()

print("Step 1: Multiply the second row by 20")
Ab[1, :] = Ab[1, :] * 20
print_update()

print("Step 2: Subtract the second row from the first")
Ab[1, :] = Ab[1, :] - Ab[0, :]
print_update()
Step 0: Show the augmented matrix and the determinant of A
Augmented matrix is 
 [[ 20  50 700]
 [  1   1  20]]
Determinant of A:  -29.99999999999999
Norm of A 53.87021440462252
Condition number of A: 96.72299453018881


Step 1: Multiply the second row by 20
Augmented matrix is 
 [[ 20  50 700]
 [ 20  20 400]]
Determinant of A:  -600.0
Norm of A 60.8276253029822
Condition number of A: 6.000000000000001


Step 2: Subtract the first row from the second
Augmented matrix is 
 [[  20   50  700]
 [   0  -30 -300]]
Determinant of A:  -600.0
Norm of A 61.644140029689765
Condition number of A: 6.171292729553326

Note: The determinant, norm and condition number are all changing despite the answer remaining the same. This is the basis of preconditioners!

Algorithm#

Let’s describe an algorithm for Gaussian elimination, as a sequence of passes row by row through the matrix.

On each pass we choose a pivot row, which is used to eliminate elements in the other equations: multiply the pivot row by a suitable multiplier and subtract it from each of them.

Start from the top and only pass downwards, so we end up with an upper triangular matrix.
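In code, the elimination phase of this scheme looks like the following numpy sketch (the name forward_elimination is ours; there is no pivoting yet, so a zero on the diagonal will fail, as discussed in the pivoting section below):

import numpy as np

def forward_elimination(Ab):
    # Reduce an augmented matrix [A | b] to upper triangular form
    Ab = Ab.astype(float)             # work on a float copy
    n = Ab.shape[0]
    for k in range(n - 1):            # pivot row k
        for i in range(k + 1, n):     # rows below the pivot
            m = Ab[i, k] / Ab[k, k]   # multiplier m_{i,k}
            Ab[i, :] -= m * Ab[k, :]
    return Ab

# Chairs-and-tables system: second row becomes [0, -1.5, -15], i.e. -1.5 t = -15
print(forward_elimination(np.array([[20., 50., 700.],
                                    [1., 1., 20.]])))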

Example#

Use Gauss Elimination to solve the following equations.

\begin{eqnarray*} 4x_1 + 3x_2 - 5x_3 &=& 2 \\ -2x_1 - 4x_2 + 5x_3 &=& 5 \\ 8x_1 + 8x_2 &=& -3 \end{eqnarray*}

Step 1: Turn these equations to matrix form \(Ax=y\).

\[\begin{split} \begin{bmatrix} 4 & 3 & -5\\ -2 & -4 & 5\\ 8 & 8 & 0\\ \end{bmatrix}\left[\begin{array}{c} x_1 \\x_2 \\x_3 \end{array}\right] = \left[\begin{array}{c} 2 \\5 \\-3\end{array}\right]\end{split}\]

Step 2: Get the augmented matrix [A, y]

\[\begin{split} [A, y] = \begin{bmatrix} 4 & 3 & -5 & 2\\ -2 & -4 & 5 & 5\\ 8 & 8 & 0 & -3\\ \end{bmatrix}\end{split}\]

Step 3: Choose the first equation as the pivot equation and turn the 2nd row's first element to 0. To do this, multiply the 1st row (the pivot equation) by the multiplier \(m_{2,1} = -2/4 = -0.5\) and subtract it from the 2nd row. We get

\[\begin{split} \begin{bmatrix} 4 & 3 & -5 & 2\\ 0 & -2.5 & 2.5 & 6\\ 8 & 8 & 0 & -3\\ \end{bmatrix}\end{split}\]

Step 4: Turn the 3rd row's first element to 0. Similarly, multiply the 1st row by the multiplier \(m_{3,1} = 8/4 = 2\) and subtract it from the 3rd row. We get

\[\begin{split} \begin{bmatrix} 4 & 3 & -5 & 2\\ 0 & -2.5 & 2.5 & 6\\ 0 & 2 & 10 & -7\\ \end{bmatrix}\end{split}\]

Step 5: Turn the 3rd row's 2nd element to 0. Multiply the 2nd row by the multiplier \(m_{3,2} = 2/(-2.5) = -0.8\) and subtract it from the 3rd row. We get

\[\begin{split} \begin{bmatrix} 4 & 3 & -5 & 2\\ 0 & -2.5 & 2.5 & 6\\ 0 & 0 & 12 & -2.2\\ \end{bmatrix}\end{split}\]

Elimination is now complete, since \(A\) is upper triangular.

Proceed with substitution:

Step 6: From the last row, \(x_3 = -2.2/12 = -0.183\).

Step 7: Insert \(x_3\) into the 2nd equation to get \(x_2 = -2.583\).

Step 8: Insert \(x_2\) and \(x_3\) into the first equation to get \(x_1 = 2.208\).
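We can verify the hand computation numerically:

import numpy as np

A = np.array([[4., 3., -5.],
              [-2., -4., 5.],
              [8., 8., 0.]])
y = np.array([2., 5., -3.])

print(np.linalg.solve(A, y))  # approx. [ 2.208 -2.583 -0.183]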

Complexity#

Considering the two phases of Gauss elimination:

Elimination: ~\(n^3/3\) operations, therefore, \(O(n^3)\)

Back substitution: ~ \(n^2/2\) operations, therefore \(O(n^2)\)
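A rough way to see the \(n^3/3\) growth is to count the multiply/subtract operations performed by the elimination loops. This is a sketch with our own approximate counting (one multiplier plus a row update per eliminated row, ignoring the augmented column):

def elimination_op_count(n):
    ops = 0
    for k in range(n - 1):          # pivot rows
        for i in range(k + 1, n):   # rows below the pivot
            ops += 1 + (n - k)      # one multiplier + the row update
    return ops

for n in [10, 100, 1000]:
    print(n, elimination_op_count(n), round(n**3 / 3))

The ratio of the two counts approaches 1 as \(n\) grows.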

Gauss-Jordan Elimination#

An obvious extension is to conduct each pass both upwards and downwards (G.E. is just down). While doing this we also normalize the rows to get reduced row echelon form (abbreviated: rref):

\[\begin{split} \begin{bmatrix} 1 & 0 & 0 & 2.208\\ 0 & 1 & 0 & -2.583\\ 0 & 0 & 1 & -0.183\\ \end{bmatrix}\end{split}\]

Since \(I^{-1} = I\), the answer is just the right-hand vector.

Complexity#

Gauss-Jordan Elimination eliminates the need for back-substitution, so one might think it more efficient than Gauss Elimination. Unfortunately, the elimination phase takes ~\(n^3/2\) operations, whereas GE takes ~\(n^3/3 + n^2/2\); therefore Gauss Elimination is preferred.

Package implementation#

Gaussian Elimination is fundamental but has largely been surpassed by matrix decomposition (we will see the connection soon). Therefore, you will be hard pressed to find a standard implementation in numerical packages (numpy / scipy).

It is still useful in symbolic manipulation, and indeed you will find it in sympy under echelon form and rref:

from sympy import Matrix, pprint

# Define the coefficient matrix and the right-hand side vector
A = Matrix([[1, 2, 3],
            [4, 5, 6],
            [7, 8, 9]])
b = Matrix([2, 5, -3])

# Combine the coefficient matrix and the right-hand side vector into an augmented matrix
Ab = A.row_join(Matrix(b))

# Calculate the reduced row echelon form (rref) of the augmented matrix
Ab_rref = Ab.rref()

# Print the result
print("Augmented matrix in rref:")
pprint(Ab_rref[0])

# Calculate the echelon form of the augmented matrix
Ab_echelon = Ab.echelon_form()

# Print the result
print("\nAugmented matrix in echelon form:")
pprint(Ab_echelon)
Augmented matrix in rref:
⎡1  0  -1  0⎤
⎢           ⎥
⎢0  1  2   0⎥
⎢           ⎥
⎣0  0  0   1⎦

Augmented matrix in echelon form:
⎡1  2   3   2 ⎤
⎢             ⎥
⎢0  -3  -6  -3⎥
⎢             ⎥
⎣0  0   0   33⎦
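Note the bottom row of the rref, \([\,0\;\;0\;\;0\;\;1\,]\): it asserts \(0 = 1\), so this particular system is inconsistent (the \(A\) chosen here is singular), which the rref makes immediately visible.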

Pivoting and diagonal dominance#

The order of the equations, and therefore of the rows in the matrix, is obviously arbitrary, but a poor ordering may break Gauss Elimination.

E.g.: a system with augmented matrix

\[\begin{split} \begin{bmatrix} 2 & -1 & 0 & 1\\ -1 & 2 & -1 & 0\\ 0 & -1 & 1 & 0\\ \end{bmatrix}\end{split}\]

will work with our GE scheme, but the same system reordered,

\[\begin{split} \begin{bmatrix} 0 & -1 & 1 & 0\\ -1 & 2 & -1 & 0\\ 2 & -1 & 0 & 1\\ \end{bmatrix}\end{split}\]

will fail due to the \(0\) pivot in the first row. The remedy is to swap rows 3 and 1 to match the first example.

Permutation matrix#

Swapping rows in matrix algebra is achieved via a permutation matrix (a row swap of the identity matrix).

To swap rows 3 and 1, we would use:

\[\begin{split} P = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \\ \end{bmatrix}\end{split}\]
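For example, applying \(P\) from the left swaps rows 3 and 1 of the reordered system above (a quick numpy check):

import numpy as np

P = np.array([[0, 0, 1],
              [0, 1, 0],
              [1, 0, 0]])

A = np.array([[0, -1, 1],
              [-1, 2, -1],
              [2, -1, 0]])

print(P @ A)  # rows 1 and 3 exchanged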

Diagonal dominance#

More generally, if the diagonal element is small we will encounter roundoff error:

\[\begin{split} \begin{bmatrix} \epsilon & -1 & 1 & 0\\ -1 & 2 & -1 & 0\\ 2 & -1 & 0 & 1\\ \end{bmatrix}\end{split}\]

after the first pass leads to:

\[\begin{split} \begin{bmatrix} \epsilon & -1 & 1 & 0\\ 0 & 2 - 1/\epsilon & -1 + 1/\epsilon & 0\\ 0 & -1 + 2/\epsilon & -2/\epsilon & 1\\ \end{bmatrix}\end{split}\]

from which we immediately see the danger as \(\epsilon \rightarrow 0\)!
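A tiny \(2 \times 2\) illustration (our own sketch, not part of the original example): with \(\epsilon = 10^{-20}\), carrying out the elimination arithmetic naively loses the answer entirely, while swapping the rows first recovers it.

# Solve [[eps, 1], [1, 1]] x = [1, 2]; the exact answer is x ~ [1, 1]
eps = 1e-20

# Naive elimination with eps as the pivot
x2 = (2 - 1/eps) / (1 - 1/eps)   # roundoff: 2 - 1e20 == 1 - 1e20 in floats
x1 = (1 - x2) / eps              # catastrophic: gives 0 instead of ~1
print("naive:  ", x1, x2)

# Swap the rows first so the large element is the pivot
x2 = (1 - 2*eps) / (1 - eps)
x1 = 2 - x2
print("pivoted:", x1, x2)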

Thus, we don’t want the diagonal to have \(0\)s or small numbers. This motivates the definition of diagonal dominance.

A matrix is said to be diagonally dominant if, in each row, the diagonal element is at least as large in absolute value as the sum of the absolute values of the other elements in that row:

\(|A_{ii}| \ge \sum_{j=1,j \ne i}^{n} |A_{ij}|\)

Example: Which is diagonally dominant?

a) \(\begin{bmatrix} -2 & 4 & -1 \\ -1 & -1 & 3 \\ 4 & -2 & 1 \\ \end{bmatrix}\)

b) \(\begin{bmatrix} 4 & -2 & 1 \\ -2 & 4 & -1 \\ -1 & -1 & 3 \\ \end{bmatrix}\)
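A short numpy check of the definition (the helper is_diagonally_dominant is our own):

import numpy as np

def is_diagonally_dominant(A):
    # Check |A_ii| >= sum_{j != i} |A_ij| for every row
    D = np.abs(np.diag(A))
    S = np.abs(A).sum(axis=1) - D   # off-diagonal row sums
    return bool(np.all(D >= S))

Ma = np.array([[-2, 4, -1], [-1, -1, 3], [4, -2, 1]])
Mb = np.array([[4, -2, 1], [-2, 4, -1], [-1, -1, 3]])
print(is_diagonally_dominant(Ma), is_diagonally_dominant(Mb))  # False True

So (b), which is just (a) with its rows reordered, is the diagonally dominant one.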

Importance of diagonal dominance#

It can be shown that if a matrix is diagonally dominant, it will not benefit from pivoting!

Diagonal dominance contributes generally to numerical stability and accuracy by minimizing roundoff error propagation. This result extends beyond Gauss elimination, and will also be a criterion for matrix factorization methods and for the efficient convergence of iterative methods.