Appendix F — Gaussian Elimination Method
The Gaussian Elimination Method is an algorithm that transforms the linear system \(A\boldsymbol{x}=\boldsymbol{b}\) where \(A \in \mathbb{C}^{N \times N}\) and \(\boldsymbol{b} \in \mathbb{C}^{N}\) into an equivalent upper triangular system \(U\boldsymbol{x}=\boldsymbol{g}\) after \(N-1\) steps, where \(U \in \mathbb{C}^{N \times N}\) is an upper triangular matrix and \(\boldsymbol{g} \in \mathbb{C}^{N}\). This uses Elementary Row Operations (swapping rows, multiplying a row by a constant, adding two rows), after which point, the system \(U\boldsymbol{x}=\boldsymbol{g}\) can solved by the backward substitution. Note that this method is possible when the elementary row operations are performed on both \(A\) and \(\boldsymbol{b}\) simultaneously, so if rows \(i\) and \(j\) are swapped in \(A\), the rows \(i\) and \(j\) must also be swapped in \(\boldsymbol{b}\), simialry for the other operations.
The Gaussian elimination method can be performed as follows (the superscripts in brackets will be the step number):
The algorithm will be explained and an example will be done in parallel to explain the steps with the matrix system \(A\boldsymbol{x}=\boldsymbol{b}\) where \[A=\begin{pmatrix} 2 & -1 & 1 \\ -1 & 1 & 2 \\ 1 & 2 & -1 \end{pmatrix} \quad \text{and} \quad \boldsymbol{b}=\begin{pmatrix} 1 \\ 1 \\ 2 \end{pmatrix}.\]
- Establish the starting matrix: If \(a_{11} \neq 0\), then set \(A^{(1)}=A\) and \(\boldsymbol{b}^{(1)}=\boldsymbol{b}\) as \[A^{(1)}=\begin{pmatrix} a_{11}^{(1)} & a_{12}^{(1)} & \dots & a_{1j}^{(1)} & \dots & a_{1N}^{(1)} \\ a_{21}^{(1)} & a_{22}^{(1)} & \dots & a_{2j}^{(1)} & \dots & a_{2N}^{(1)} \\ \vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\ a_{j1}^{(1)} & a_{j2}^{(1)} & \dots & a_{jj}^{(1)} & \dots & a_{jN}^{(1)} \\ \vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\ a_{N1}^{(1)} & a_{N2}^{(1)} & \dots & a_{Nj}^{(1)} & \dots & a_{NN}^{(1)} \end{pmatrix} \in \mathbb{R}^{N \times N} \quad \text{where} \quad a_{11}^{(1)} \neq 0\] \[\text{and} \quad \boldsymbol{b}^{(1)}=\begin{pmatrix} b_1^{(1)} \\ b_2^{(1)} \\ \vdots \\ b_j^{(1)} \\ \vdots \\ b_N^{(1)} \end{pmatrix}.\] If \(a_{11} = 0\), then swap the first row with any other row whose first term is not zero and the result will be the starting matrix \(A^{(1)}\).
\[A^{(1)}=A=\begin{pmatrix} 2 & -1 & 1 \\ -1 & 1 & 2 \\ 1 & 2 & -1 \end{pmatrix} \quad \text{and} \quad \boldsymbol{b}^{(1)}=\boldsymbol{b}=\begin{pmatrix} 1 \\ 1 \\ 2 \end{pmatrix}.\]
- Form the multiplier vector: The desired outcome is to have the matrix \(A\) be upper triangular, i.e. all the terms below the diagonal should be 0. To achieve this, introduce a vector \(\boldsymbol{m}_1\) of multipliers, whose \({i}^{\mathrm{th}}\) entry is given by \[ m_{i1}=\frac{a_{i1}^{(1)}}{a_{11}^{(1)}} \quad \text{for all} \quad i=1,2,\dots,N,\] hence the reason why the assumption \(a_{11}^{(1)} \neq 0\) must be imposed. Essentially, the vector \(\boldsymbol{m}_1\) is the first column of \(A\) divided the the first element of \(A\).
\[\boldsymbol{m}_1=\frac{1}{a_{11}^{(1)}}\begin{pmatrix} a_{11}^{(1)} \\ a_{21}^{(1)} \\ a_{31}^{(1)} \end{pmatrix}=\begin{pmatrix} 1 \\ -\frac{1}{2} \\ \frac{1}{2} \end{pmatrix}\]
- Elimination terms in the first column: For \(j=2,3,\dots,N\), multiply row 1 by \(-m_{j1}\) and add it to row \(j\) to give the new row \(j\): \[\begin{pmatrix} a_{11}^{(1)} & a_{12}^{(1)} & \dots & a_{1j}^{(1)} & \dots & a_{1N}^{(1)} \\ a_{21}^{(1)}-m_{21}a_{11}^{(1)} & a_{22}^{(1)}-m_{21}a_{12}^{(1)} & \dots & a_{2j}^{(1)}-m_{21}a_{1j}^{(1)} & \dots & a_{2N}^{(1)}-m_{21}a_{1N}^{(1)} \\ \vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\ a_{j1}^{(1)}-m_{j1}a_{11}^{(1)} & a_{j2}^{(1)}-m_{j1}a_{12}^{(1)} & \dots & a_{jj}^{(1)}-m_{j1}a_{1j}^{(1)} & \dots & a_{jN}^{(1)}-m_{j1}a_{1N}^{(1)} \\ \vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\ a_{N1}^{(1)}-m_{N1}a_{11}^{(1)} & a_{N2}^{(1)}-m_{N1}a_{12}^{(1)} & \dots & a_{Nj}^{(1)}-m_{N1}a_{1j}^{(1)} & \dots & a_{NN}^{(1)}-m_{N1}a_{1N}^{(1)} \end{pmatrix}.\]
\[\begin{pmatrix} 2 & -1 & 1 \\ -1 & 1 & 2 \\ 1 & 2 & -1 \end{pmatrix}\] \[\xrightarrow[r_2 \to -\left( -\frac{1}{2} \right)r_1+r_2]{} \begin{pmatrix} 2 & -1 & 1 \\ -\left( -\frac{1}{2} \right)(2)-1 & -\left( -\frac{1}{2} \right)(-1)+1 & -\left( -\frac{1}{2} \right)(1)+2 \\ 1 & 2 & -1 \end{pmatrix}\] \[=\begin{pmatrix} 2 & -1 & 1 \\ 0 & \frac{1}{2} & \frac{5}{2} \\ 1 & 2 & -1 \end{pmatrix}\] \[\xrightarrow[r_3 \to -\left( \frac{1}{2} \right)r_1+r_3]{} \begin{pmatrix} 2 & -1 & 1 \\ 0 & \frac{1}{2} & \frac{5}{2} \\ -\left( \frac{1}{2} \right)(2)+1 & -\left( \frac{1}{2} \right)(-1)+2 & -\left( \frac{1}{2} \right)(1)-1 \end{pmatrix}\] \[=\begin{pmatrix} 2 & -1 & 1 \\ 0 & \frac{1}{2} & \frac{5}{2} \\ 0 & \frac{5}{2} & -\frac{3}{2} \end{pmatrix}\]
Notice that by the definition of \(m_{j1}\), the first element in every row must be equal to 0, therefore, this set of operation makes all the terms in the first column equal to 0 except the first. Define this new matrix as the second term in the iteration: \[\begin{multline*} \begin{pmatrix} a_{11}^{(1)} & a_{12}^{(1)} & \dots & a_{1j}^{(1)} & \dots & a_{1n}^{(1)} \\ 0 & a_{22}^{(1)}-m_{21}a_{12}^{(1)} & \dots & a_{2j}^{(1)}-m_{21}a_{1j}^{(1)} & \dots & a_{2n}^{(1)}-m_{21}a_{1n}^{(1)} \\ \vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\ 0 & a_{j2}^{(1)}-m_{j1}a_{12}^{(1)} & \dots & a_{jj}^{(1)}-m_{j1}a_{1j}^{(1)} & \dots & a_{jn}^{(1)}-m_{j1}a_{1n}^{(1)} \\ \vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\ 0 & a_{n2}^{(1)}-m_{n1}a_{12}^{(1)} & \dots & a_{nj}^{(1)}-m_{n1}a_{1j}^{(1)} & \dots & a_{nn}^{(1)}-m_{n1}a_{1n}^{(1)} \end{pmatrix} \\ \implies \quad \begin{pmatrix} a_{11}^{(2)} & a_{12}^{(2)} & \dots & a_{1j}^{(2)} & \dots & a_{1n}^{(2)} \\ a_{21}^{(2)} & a_{22}^{(2)} & \dots & a_{2j}^{(2)} & \dots & a_{2n}^{(2)} \\ \vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\ a_{j1}^{(2)} & a_{j2}^{(2)} & \dots & a_{jj}^{(2)} & \dots & a_{jn}^{(2)} \\ \vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\ a_{n1}^{(2)} & a_{n2}^{(2)} & \dots & a_{nj}^{(2)} & \dots & a_{nn}^{(2)} \end{pmatrix}=A^{(2)} \end{multline*}\] where for all \(i,j=2,3,\dots,N\) \[a_{11}^{(2)}=a_{11}^{(1)} \quad \text{;} \quad a_{1i}^{(2)}=a_{1i}^{(1)} \quad \text{;} \quad a_{i1}^{(2)}=0 \quad \text{;} \quad a_{ij}^{(2)}=a_{ij}^{(1)}-m_{i1}a_{1j}^{(1)}\]
\[A^{(2)}=\begin{pmatrix} 2 & -1 & 1 \\ 0 & \frac{1}{2} & \frac{5}{2} \\ 0 & \frac{5}{2} & -\frac{3}{2} \end{pmatrix}\]
- Modification of the right hand side: The vector \(\boldsymbol{b}\) has to also undergo the same operations as \(A\), i.e. for \(j=2,\dots,N\), let row \(j\) of \(\boldsymbol{b}^{(1)}\) be row 1 multiplied by \(-m_{j1}\) plus row \(j\) and the final vector is the vector \(\boldsymbol{b}^{(2)}\).
\[\boldsymbol{b}^{(1)}=\begin{pmatrix} 1 \\ 1 \\ 2 \end{pmatrix} \xrightarrow[\begin{matrix} r_2 \to -\left( -\frac{1}{2} \right)r_1+r_2 \\ r_3 \to -\left( \frac{1}{2} \right)r_1+r_3 \end{matrix}]{} \begin{pmatrix} 1 \\ -\left( -\frac{1}{2} \right)(1)+1 \\ -\left( \frac{1}{2} \right)(1)+2 \end{pmatrix}=\begin{pmatrix} 1 \\ \frac{3}{2} \\ \frac{3}{2} \end{pmatrix}=\boldsymbol{b}^{(2)}.\]
- Matrix representation of elimination: This whole procedure can be written as \(A^{(2)}=M^{(1)} A^{(1)}\) and \(\boldsymbol{b}^{(2)}=M^{(1)}\boldsymbol{b}^{(1)}\) where \[M^{(1)}=\begin{pmatrix} 1 & 0 & 0 & \dots & 0 \\ -m_{21} & 1 & 0 & \dots & 0 \\ -m_{31} & 0 & 1 & \dots & 0\\ \vdots & \vdots & \vdots & \ddots & \vdots \\ -m_{n1} & 0 & 0 & \dots & 1 \end{pmatrix}.\]
\[M^{(1)}=\begin{pmatrix} 1 & 0 & 0 \\ \frac{1}{2} & 1 & 0 \\ -\frac{1}{2} & 0 & 1 \end{pmatrix}\] To check: \[M^{(1)}A^{(1)}=\begin{pmatrix} 1 & 0 & 0 \\ \frac{1}{2} & 1 & 0 \\ -\frac{1}{2} & 0 & 1 \end{pmatrix}\begin{pmatrix} 2 & -1 & 1 \\ -1 & 1 & 2 \\ 1 & 2 & -1 \end{pmatrix}=\begin{pmatrix} 2 & -1 & 1 \\ 0 & \frac{1}{2} & \frac{5}{2} \\ 0 & \frac{5}{2} & -\frac{3}{2} \end{pmatrix}=A^{(2)}\] \[M^{(1)}\boldsymbol{b}^{(1)}=\begin{pmatrix} 1 & 0 & 0 \\ \frac{1}{2} & 1 & 0 \\ -\frac{1}{2} & 0 & 1 \end{pmatrix}\begin{pmatrix} 1 \\ 1 \\ 2 \end{pmatrix}=\begin{pmatrix} 1 \\ \frac{3}{2} \\ \frac{3}{2} \end{pmatrix}=\boldsymbol{b}^{(2)}\]
- Repeat for other columns: The process must now be repeated for the rest of the rows, specifically, those that have non-zero pivot points, i.e. the first point in a row that is non-zero. This process can be done more simply by generating the \(M\) matrices in the same way as before without going through the starting steps. This process should be reapeated until the last row is reached.
The matrix \(M^{(2)}\) can be generated in the same way as \(M^{(1)}\), so \[M^{(2)}=\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -5 & 1 \end{pmatrix}.\] To check: \[M^{(2)}A^{(2)}=\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -5 & 1 \end{pmatrix}\begin{pmatrix} 2 & -1 & 1 \\ 0 & \frac{1}{2} & \frac{5}{2} \\ 0 & \frac{5}{2} & -\frac{3}{2} \end{pmatrix}=\begin{pmatrix} 2 & -1 & 1 \\ 0 & \frac{1}{2} & \frac{5}{2} \\ 0 & 0 & -14 \end{pmatrix}=A^{(3)}\] \[M^{(2)}\boldsymbol{b}^{(2)}=\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -5 & 1 \end{pmatrix}\begin{pmatrix} 1 \\ \frac{3}{2} \\ \frac{3}{2} \end{pmatrix}=\begin{pmatrix} 1 \\ \frac{3}{2} \\ -6 \end{pmatrix}=\boldsymbol{b}^{(3)}\]
- Solve using backwards substitution: After repeating for all other columns (a total of \(N-1\) times), the final matrix \(A^{(N)}\) will be an upper triangular matrix with non-zero terms on the diagonal and the system can then be solved by backwards substitution.
\[A^{(1)}\boldsymbol{x}=\boldsymbol{b}^{(1)} \quad \implies \quad A^{(2)}\boldsymbol{x}=\boldsymbol{b}^{(2)} \quad \implies \quad A^{(3)}\boldsymbol{x}=\boldsymbol{b}^{(3)}\] \[\implies \quad\begin{pmatrix} 2 & -1 & 1 \\ 0 & \frac{1}{2} & \frac{5}{2} \\ 0 & 0 & -14 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}=\begin{pmatrix} 1 \\ \frac{3}{2} \\ -6 \end{pmatrix} \quad \implies \quad\begin{matrix} 2x_1-x_2+x_3=1 \\ \frac{1}{2}x_2+\frac{5}{2}x_3=\frac{3}{2} \\ -14x_3=-6 \end{matrix}\] \[\implies \quad\boldsymbol{x}=\frac{1}{7}\begin{pmatrix} 5 \\ 6 \\ 3 \end{pmatrix}.\]
The total number of operations in every step is given in the table below (the “steps” here refer to the matrix manipulation step and not exactly to the step numbers of the algorithm):
| Step | Multiplications | Additions | Divisions |
|---|---|---|---|
| 1 | \((N-1)^2\) | \((N-1)^2\) | \(N-1\) |
| 2 | \((N-2)^2\) | \((N-2)^2\) | \(N-2\) |
| 3 | \((N-3)^2\) | \((N-3)^2\) | \(N-3\) |
| \(\vdots\) | \(\vdots\) | \(\vdots\) | \(\vdots\) |
| \(N-2\) | \(4\) | \(4\) | \(2\) |
| \(N-1\) | \(1\) | \(1\) | \(1\) |
This means that the total number of multiplications is \[1+4+\dots+(N-3)^2+(N-2)^2+(N-1)^2=\sum_{n=1}^{N-1}{n^2}=\frac{N(N-1)(2N-1)}{6},\] similarly for the additions. Whereas the total number of divisions is \[1+2+\dots+(N-3)+(N-2)+(N-1)=\sum_{n=1}^{N-1}{n}=\frac{N(N-1)}{2}.\] Therefore the total number of operations is \[\frac{N(N-1)(2N-1)}{6}+\frac{N(N-1)(2N-1)}{6}+\frac{N(N-1)}{2}=\frac{2}{3}N^3-\frac{1}{2}N^2-\frac{1}{6}N.\] This means that for large \(N\), the Gaussian elimination algorithm requires \(\mathcal{O}\left(\frac{2}{3}N^3\right)\) operations when \(A\) is a non-sparse matrix. This procedure is computationally expensive even for moderate sized matrices, this also assumes that the pivot points are non-zero, or more specifically, that the matrix has non-zero determinant. As an illustration of this computational complexity, if \(N=10^6\) (which not atypical), then for a computer with the computing power of 1 Gigaflops per second, an \(N \times N\) system will need 21 years to find a solution. A lot of more modern computational techniques are based on attempting to reduce this computational complexity, either by eliminating terms in some suitable way or chnaging the matrix in a more pallatable form.
Overall, every step of this process can be represented by a matrix transformation \(M^{(n)}\). This means that in order to convert the matrix \(A\) into an upper triangular matrix \(U\), the matrix transformations \(M^{(1)}, M^{(2)}, \dots, M^{(N-1)}\) have to be applied reverse order as \[U=M^{(N-1)} M^{(N-2)} \dots M^{(1)} A.\] This can be written as \[U=MA \quad \text{where} \quad M=M^{(N-1)} M^{(N-2)} \dots M^{(1)}. \tag{F.1}\]
Notice that every matrix \(M^{(n)}\) is lower triangular and this fact will be used later on in ?sec-LU.