Linear Algebra#

Scipy Modules#

  • Scipy has two main linear algebra modules: scipy.linalg and scipy.sparse.linalg

    • scipy.sparse.linalg has specialized algorithms for sparse matrices

  • Dense matrix example

\[\begin{split} D = \begin{pmatrix} 121&52&661&23&205&34&12&12\\ 20&16&0&64&0&123&153&135\\ 77&5&42&2&10&623&23&97\\ 314&431&817&11&23&23&23&199\\ 29&541&428&9&112&124&734&46\\ 325&0&450&88&12&534&345&125\\ 129&324&76&95&342&23&45&632\\ 13&20&12&122&431&1&123&978 \end{pmatrix} \end{split}\]
  • Majority matrix elements are non-zero

  • Sparse matrix example

\[\begin{split} S = \begin{pmatrix} 61&0&0&0&0&0&0&0\\ 0&112&0&0&0&0&0&0\\ 0&0&10&0&0&65&0&45\\ 0&0&0&12&0&0&0&0\\ 0&0&0&0&153&234&0&0\\ 0&0&0&0&0&21&0&0\\ 0&230&0&0&0&0&0&0\\ 0&0&0&0&0&0&0&97 \end{pmatrix} \end{split}\]
  • Only 11 out of 64 matrix elements are non-zero

  • Both scipy.linalg and scipy.sparse.linalg are interfaces to two standard libraries

    • BLAS - basic linear algebra subprograms

      • Contains optimized routines for vector addition, dot products, matrix multiplication etc.

    • LAPACK - linear algebra package

      • Implements optimized routines for matrix decompositions/factorizations, solving linear systems, eigenvalue problems, and linear least squares

  • Functions provided by scipy.linalg

    • Basic functions – det, inv, pinv, …

    • Linear system solvers - solve, solve_banded, solve_triangular, lstsq, \(\cdots\)

    • Eigenproblem solvers – eig, eigvals, eig_banded, eigvals_banded, \(\cdots\)

    • Matrix decompositions – lu, qr, cholesky, schur, \(\cdots\)

Pseudoinverse of a Matrix#

  • For a non-singular square matrix \(A\), the inverse \(A^{-1}\) satisfies:

\[ A^{-1} A = A A^{-1} = I \]
  • Moore-Penrose pseudo-inverse \(A^+\) of a matrix \(A\) satisfies:

\[\begin{split} AA^+ A = A \\ A^+ A A^+ = A^+ \\ (A A^+)^T = A A^+ \\ (A^+ A)^T = A^+ A \end{split}\]
  • \(A^+\) exists for any matrix - including singular and non-square matrices

  • For matrices with linearly independent rows (right inverse):

\[ A^+ = A^T (A A^T)^{-1} \qquad AA^+ = I \]
  • For matrices with linearly independent columns (left inverse):

\[ A^+ = (A^T A)^{-1} A^T \qquad A^+ A = I \]
  • For non-singular square matrices, this simplifies to \(A^+ = A^{-1}\) in both cases

  • The pseudo-inverse’s dimensions are \(n \times m\) if the matrix \(A\) has dimensions \(m \times n\)

  • pinv from scipys linalg module can be used to compute the pseudo-inverse

import numpy as np
from scipy import linalg

A = np.array([[ 1, 4, 3, 2],
              [ 3, 5, 2, 1],
              [-1, 0, 6, 0]])

print(repr(linalg.pinv(A))) # Right inverse
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
Cell In[1], line 1
----> 1 import numpy as np
      2 from scipy import linalg
      4 A = np.array([[ 1, 4, 3, 2],
      5               [ 3, 5, 2, 1],
      6               [-1, 0, 6, 0]])

ModuleNotFoundError: No module named 'numpy'
B = np.array([[1, 3], [2, 6], [4, 7]])
print(repr(linalg.pinv(B))) # Left inverse
array([[-0.28, -0.56,  0.6 ],
       [ 0.16,  0.32, -0.2 ]])

Linear System of Equations#

  • A linear system of equations are a set of \(m\) linear equations in \(n\) variables \(x_1, x_2, \cdots, x_n\)

\[\begin{split} \begin{align*} a_{11} x_1\ +&\ a_{12} x_2\ +\ \cdots\ +\ a_{1n} x_n\ =\ b_1 \\ a_{21} x_1\ +&\ a_{22} x_2\ +\ \cdots\ +\ a_{2n} x_n\ =\ b_2 \\ &\ \vdots \ \ \ \ \ \ \vdots \\ a_{n1} x_1\ +&\ a_{n2} x_2\ +\ \cdots\ +\ a_{nn} x_n\ =\ b_n \end{align*} \end{split}\]
  • In matrix form:

\[ A x = b \]
\[\begin{split} A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} \qquad x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \qquad b = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix} \end{split}\]
  • To determine \(x\), we can use the inverse if \(m=n\) and all rows are linearly independent:

\[ x = A^{-1} b\]
  • This is computationally inefficent

  • More efficient algorithms

    • Gauss elimination

    • Gauss-Jordan elimination

    • LU decomposition

    • Cholesky decomposition

  • Scipy uses LU decomposition by default

  • Cholesky decomposition is used if the matrix \(A\) is symmetric and positive-definite

  • If the number of equations is not the same as the number of unknowns, i.e., \(m \ne n\), we can use the lstsq function

  • The least squares algorithm computes the vector \(x\) that minimizes the least squares error \(\lvert b - Ax \rvert\)

\[ \lvert b - Ax \rvert^2 = \sum\limits_{i=1}^m (b_i - (Ax)_i)^2 \]
  • It can be shown that the \(x\) that minimizes \(\lvert b - Ax \rvert\) is given by:

\[ x = A^+ b \]
  • Here \(A^+\) is the pseudo-inverse of \(A\)

A = np.array([[ 1, 4, 3, 2],
              [ 3, 5, 2, 1],
              [-1, 0, 6, 0]])
b = np.array([2, 1, -2])
res = linalg.lstsq(A, b)
res
(array([-0.4593519 ,  0.45110858, -0.40989198,  0.94229676]),
 array([], dtype=float64),
 3,
 array([8.63395251, 5.46582073, 1.25684835]))
x = res[0]
(A @ x - b).round(4)
array([-0., -0.,  0.])
A = np.array([[1, 3],
              [2, 6],
              [4, 7]])
b = np.array([1, 2, 3])
res = linalg.lstsq(A, b)
res
(array([0.4, 0.2]),
 6.943647297459356e-32,
 2,
 array([10.67251474,  1.04758252]))
x = res[0]
(A @ x - b).round(4)
array([-0., -0., -0.])
  • Consider some data collected from experiments

data

  • We want to fit this data to a quadratic curve (hypothesis) of the form \(a_1 + a_2 x + a_3 x^2\)

  • We can obtain the unknown coefficients \(a_1\), \(a_2\) and \(a_3\) by solving the linear system formed by the observed data

\[\begin{split} \begin{align*} &a_1\ +\ a_2 x_1\ +\ a_3 x_1^2\ =\ y_1 \\ &a_1\ +\ a_2 x_2\ +\ a_3 x_2^2\ =\ y_2 \\ \quad & \vdots \qquad \vdots \qquad \vdots & \\ &a_1\ +\ a_2 x_7\ +\ a_3 x_7^2\ =\ y_n \end{align*} \end{split}\]
x = np.array([1, 2.5, 3.5, 4, 5, 7, 8.5])
y = np.array([0.3, 1.1, 1.5, 2.0, 3.2, 6.6, 8.6])
M = x.reshape(x.size, 1)**[0, 1, 2]
M
array([[ 1.  ,  1.  ,  1.  ],
       [ 1.  ,  2.5 ,  6.25],
       [ 1.  ,  3.5 , 12.25],
       [ 1.  ,  4.  , 16.  ],
       [ 1.  ,  5.  , 25.  ],
       [ 1.  ,  7.  , 49.  ],
       [ 1.  ,  8.5 , 72.25]])
res = linalg.lstsq(M, y)
res
(array([0.0578403 , 0.07701453, 0.11262261]),
 0.3954624919193902,
 3,
 array([94.14628205,  4.12516865,  0.63091939]))
linalg.pinv(M) @ y
array([0.0578403 , 0.07701453, 0.11262261])
%matplotlib widget
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot(x, y, 'ob', label='data')
a = res[0]
xx = np.linspace(0, 9, 101)
yy = a[0] + a[1]*xx + a[2]*xx**2
ax.plot(xx, yy, '-r', label='least squares fit, $y = a_1 + a_2x + a_3x^2$')
ax.set_xlabel('x', fontsize=20)
ax.set_ylabel('y', fontsize=20)
ax.legend(fontsize=14)
ax.tick_params(labelsize=16)
fig.set_size_inches(8, 8)

Solving Linear Systems#

  • Numerical approaches to solving linear system of equations

    • Direct solvers

      • Use methods based on eliminations that lead to matrix decomposition or factorization

      • Matrix decompositions

        • LU decomposition

        • Cholesky decomposition

        • Eigen decomposition

        • QR decomposition

        • \(\cdots\)

    • Indirect solvers

      • Start with a guess and iteratively improve the guess

      • Methods

        • Jacobi

        • Gauss-Seidel

        • Sparse indirect solvers - Conjugate Gradient (CG), BIConjugate Gradient (BiCG), Generalized Minimum RESidual (GMRES), …

Direct Solvers#

  • The solution of linear system \(Ax=b\) is straightforward if the coefficient matrix \(A\) has a special form (diagonal, lower or upper triangular forms)

  • Forward substitution algorithm

    • Applicable for a lower triangular linear systems \(Lx = b\) where \(L\) is a lower triangular matrix

\[\begin{split} L = \begin{bmatrix} l_{11} & 0 & \cdots & 0 \\ l_{21} & l_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & 0 \\ l_{n1} & l_{n2} & \cdots & l_{nn} \end{bmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix} \implies \begin{align*} x_1 &= \frac{b_1}{l_11} \\ x_2 &= \frac{b_2 - l_{21} x_1}{l_{22}} \\ &\vdots \\ x_n &= \frac{b_n - \sum\limits_{i=1}^{n-1} l_{ni}x_i}{l_{nn}} \end{align*} \end{split}\]
  • If any of the diagonal terms are zero, the system is singular \(\rightarrow\) use the least squares method

  • Backward substitution algorithm

    • Applicable for an upper triangular linear systems \(Ux = b\) where \(U\) is an upper triangular matrix

\[\begin{split} U = \begin{bmatrix} u_{11} & u_{12} & \cdots & u_{1n} \\ 0 & u_{22} & \cdots & u_{2n} \\ \vdots & \vdots & \ddots & 0 \\ 0 & 0 & \cdots & u_{nn} \end{bmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix} \implies \begin{align*} x_n &= \frac{b_n}{u_{nn}} \\ x_{n-1} &= \frac{b_{n-1} - u_{(n-1)n} x_n}{u_{(n-1)(n-1)}} \\ &\vdots \\ x_1 &= \frac{b_1 - \sum\limits_{i=2}^{n} u_{1i}x_i}{u_{11}} \end{align*} \end{split}\]
  • The algorithm works bottom up, solving for \(x_n\) first and \(x_1\) in the final step

  • If any of the diagonal terms are zero, the system is singular \(\rightarrow\) use the least squares method

  • Gaussian elimination or row reduction

    • Algorithm for converting a given matrix to an equivalent upper triangular form

    • Uses 3 kinds of row operations sequentially

      • Swap the position of rows i and j

      • Scale row i by a non-zero value \(\alpha\)

      • Add a row j scaled by a value of \(\alpha\) to row i

  • Gauss-Jordan elimination

    • Extends the Gaussian elimination algorithm to convert a matrix to an equivalent diagonal form

    • Uses the same row operations

    • Can be used to determine the inverse of a matrix

Gaussian Elimination#

  • Example:

\[\begin{split} \begin{array}{ccc} A & x & b \\ \begin{bmatrix} -1 & 2 & 1 \\ 5 & 4 & 2 \\ 1 & 1 & 1 \end{bmatrix} & \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} & = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \end{array} \end{split}\]
  • The augmented matrix

Gauss Elim1

  • From the 1st column, choose a pivot row

    • The row with the max absolute value is used

    • Note: if all the values in the column are zero, then \(A\) is singular and no solution exists

  • Swap the pivot row with the first row

Gauss Elim2

  • Scale the pivot row (1st row now) so that the value in the 1st column is 1

Gauss Elim3

  • Add the pivot row to the rows below so that the values in the 1st column are zero

Gauss Elim4

  • Repeat the procedure with column 2 while ignoring row 1

Gauss Elim5

  • Repeat the procedure with column 2 while ignoring row 1

Gauss Elim6

  • Repeat the procedure with column 3 while ignoring rows 1 and 2

Gauss Elim7

  • The matrix is now in upper triangle form and can be solved using backward substitution algorithm

%matplotlib widget

import numpy as np
import matplotlib.pyplot as plt
import ipywidgets
from IPython.display import display

def viz_matrices(M1, M2, text=''):
    # Check input
    if not (isinstance(M1, np.ndarray) and isinstance(M2, np.ndarray)):
        raise TypeError("numpy arrays expected")
    
    # Num rows
    n = M1.shape[0]

    # Width of each cell
    w = 2

    # Create empty figure with three axes
    plt.ioff() # Will not show the figure unless explicitly asked to do so
    #plt.clf() # Clear the current figure
    fig, axs = plt.subplots(1, 3, figsize=(8, 2), gridspec_kw={'width_ratios':[2, 1, 2]})

    # Decorate each axes
    M = [M1, None, M2]
    txtAxes = []
    fs = 6
    for p, ax in enumerate(axs):
        ax.axis('off')

        if p == 1:
            ax.quiver([0], [0], [1], [0], scale_units='x', scale=1, headwidth=12, headlength=10)
            txt = ax.text(0.5, 0.02, text, ha='center', fontsize=8)
            txtAxes.append(txt)
            ax.set_xlim(0, 1.2)
            continue

        # Add grid lines and data from M as text
        for i in range(n):
            ax.plot([0, w*(n+1)], [w*i, w*i], '-k')
            ax.plot([w*i, w*i], [0, w*n], '-k')
            for j in range(n+1):
                txt = ax.text(w*(j + 0.5), w*(n - i - 0.5), f'{M[p][i, j]:6.2f}', ha='center', fontsize=fs)
                txtAxes.append(txt)

        ax.plot([0, w*(n+1)], [w*n, w*n], '-k')
        ax.plot([w*n, w*n], [0, w*n], '--b')
        ax.plot([w*(n+1), w*(n+1)], [0, w*n], '-k')

    #fig.set_size_inches(12, 3)

    return (fig, txtAxes)

def gauss_elim(A, b):
    if not (isinstance(A, np.ndarray) and isinstance(b, np.ndarray)):
        raise TypeError("numpy array expected")

    n = A.shape[0]
    if n != b.shape[0]:
        raise Exception("Incompatible dimensions")

    # Row operation functions
    def scale_row(M, i, a):
        M[i] *= a

    def add_row(M, i, j, a):
        M[i] += a*M[j]

    def swap_row(M, i, j):
        M[[i, j]] = M[[j, i]]

    # Reshape b to a column vector
    b = b.reshape(n, 1)

    # Augmented matrix
    B = np.hstack((A, b))
    print(B, '\n')

    #ax.table(cellText = B, loc='center')
    
    # Loop over the rows
    for i in range(n):
        # Determine the pivot location
        j = np.argmax(np.abs(B[:, i][i::])) + i

        # Swap rows
        swap_row(B, i, j)

        # Scale row
        scale_row(B, i, 1/B[i,i])

        # Pivoting
        for p in range(i+1, n):
            add_row(B, p, i, -B[p, i])
        print(B, '\n')

    return B

def gauss_elim_interactive(A, b):
    if not (isinstance(A, np.ndarray) and isinstance(b, np.ndarray)):
        raise TypeError("numpy array expected")

    n = A.shape[0]
    if n != b.shape[0]:
        raise Exception("Incompatible dimensions")

    # Row operation functions
    def scale_row(M, i, a):
        M[i] *= a

    def add_row(M, i, j, a):
        M[i] += a*M[j]

    def swap_row(M, i, j):
        M[[i, j]] = M[[j, i]]

    # Reshape b to a column vector
    b = b.reshape(n, 1)

    # Augmented matrix
    B = np.hstack((A, b))

    # Visualization stuff
    ## Create a figure
    fig, txtAxs = viz_matrices(B, B, "Init")

    ## Create button
    button = ipywidgets.Button(description='Next',
                               disabled=False,
                               button_style='', # 'success', 'info', 'warning', 'danger' or ''
                               tooltip='Next',
                               icon='arrow-right') # (FontAwesome names without the `fa-` prefix)
    output = ipywidgets.Output()

    ## Button callback
    global i, opCount
    i = 0 # Keep track of the iterations
    opCount = 0 # Keep track of the row operations
    def update_viz_matrices(button):
        # Needed in order to modify i, opCount
        # Not required if just accessing them
        global i, opCount

        with output:
            Bprev = np.copy(B)
            text = ''
            
            # Do nothing if reached the last row
            if i >= n:
                button.description = "Algorithm ended!"
                button.disabled = True
                return

            if opCount%3 == 0:
                # Update the augmented matrix
                # Determine the pivot location
                j = np.argmax(np.abs(B[:, i][i::])) + i

                if i != j:
                    # Swap rows
                    swap_row(B, i, j)
                    
                    text = f'Swap rows {i+1} and {j+1}'
                else:
                    text = f'No swapping'

            if opCount%3 == 1:
                text = f'Scale row {i+1} by {B[i,i]:4.2f}'
                
                # Scale row
                scale_row(B, i, 1/B[i,i])

            if opCount%3 == 2:
                # Pivoting
                for p in range(i+1, n):
                    add_row(B, p, i, -B[p, i])

                if i+1 < n:
                    text = f'Pivot rows {i+2} to {n}'
                    
                else:
                    text = 'Algorithm ended!'

            for c, ax in enumerate(txtAxs):
                if c < n*(n+1):
                    ax.set_text(f'{Bprev[c//(n+1), c%(n+1)]:4.2f}')
                elif c > n*(n+1):
                    ax.set_text(f'{B[(c-1)//(n+1) - n, (c-1)%(n+1)]:4.2f}')
                    if B[(c-1)//(n+1) - n, (c-1)%(n+1)] != Bprev[(c-1)//(n+1) - n, (c-1)%(n+1)]:
                        ax.set_c('r')
                    else:
                        ax.set_c('k')
                else:
                    ax.set_text(text)
            fig.canvas.draw()
            fig.canvas.flush_events()
            opCount += 1
        i = opCount//3

    ## Add callback to the button
    button.on_click(update_viz_matrices)

    ## Box with the figure
    display(ipywidgets.VBox([button, output, fig.canvas]))
A = np.array([[1, 2, 3, 7, 10],
              [13, 4, 5, 12, 15],
              [7, 8, 3, 0, 18],
              [12, 8, 9, 1, 5],
              [10, 1, 9, 8, 7]], dtype=float)

b = np.array([2, 2, 2, 2, 2], dtype=float)

gauss_elim_interactive(A, b)

Matrix Decomposition#

  • Gaussian elimination is restricted to a specific RHS vector \(b\)

  • This is disadvantageous when it is required to solve a linear system with the same \(A\) but different RHS vectors

  • A more efficient approach is decomposition or factorization of a matrix that can be reused for different \(b\)

  • Matrices once decomposed will simplify the solution procedure for the linear system

  • Scipy’s matrix decomposition functions:

    • lu, ldl, svd, cholesky, polar, qr, qz, schur, eig, \(\cdots\)

LU Decomposition#

  • LU decomposition is based on the matrix form of the Gaussian elimination algorithm

    • Decomposes the matrix into a product of a lower triangular and an upper triangular matrices, thus the name LU

\[ A = LU \]
\[\begin{split} \begin{pmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ l_{21} & 1 & 0 & 0 \\ l_{31} & l_{32} & 1 & 0 \\ l_{41} & l_{42} & l_{43} & 1 \end{pmatrix} \begin{pmatrix} u_{11} & u_{12} & u_{13} & u_{14} \\ 0 & u_{22} & u_{23} & u_{24} \\ 0 & 0 & u_{33} & u_{34} \\ 0 & 0 & 0 & u_{44} \end{pmatrix} \end{split}\]
  • Some matrices do not possess an LU decomposition

    • However, a decomposition exists if the rows are permuted

    • The decomposition in this case looks like \(A = PLU\) where \(P\) is the permutation matrix

  • Solving a linear system given the LU decomposition

\[ Ax = LUx = b \]
  • Denoting \(Ux=y\), \(Ax=Ly=b \rightarrow\) solve using forward substitution algorithm

  • Once \(y\) is known, solve \(Ux=y\) using the backward substitution algorithm for \(x\)

from scipy import linalg

A = np.array([[9, 3, 4],
              [4, 3, 4],
              [1, 1, 1]], dtype=float)

linalg.lu(A)
(array([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]]),
 array([[1.        , 0.        , 0.        ],
        [0.44444444, 1.        , 0.        ],
        [0.11111111, 0.4       , 1.        ]]),
 array([[ 9.        ,  3.        ,  4.        ],
        [ 0.        ,  1.66666667,  2.22222222],
        [ 0.        ,  0.        , -0.33333333]]))
  • Output is a tuple of ndarrays, i.e., (P, L, U)

Cholesky Decomposition#

  • Cholesky decomposition is a special case of LU decomposition

    • Applicable to symmetric positive-definite matrices

\[\begin{split} A = A^T \qquad \text{Symmetry} \\ x^T A x > 0 \qquad x\ \text{is an arbitrary vector} \end{split}\]
\[\begin{split} A = LL^T = U^T U \\ U = L^T \end{split}\]

Eigen Decomposition#

  • Eigenvectors \(v\) of a matrix \(A\) are vectors that satisfy (\(\lambda\) are eigenvalues):

\[ A v = \lambda v \]
  • If the eigenvectors are linearly independent, then \(A\) can be factorized as:

\[ A = QDQ^{-1} \]
  • \(Q\) is a matrix formed by joining the eigenvectors (each column of \(Q\) is an eigenvector)

  • \(D\) is a diagonal matrix formed by the corresponding eigenvalues

  • For a symmetric positive-definite matrix, \(Q^{−1} = Q^T\), i.e., \(Q\) is an orthogonal matrix

A = np.array([[9, 3, 4],
              [4, 3, 4],
              [1, 1, 1]], dtype=float)

la, v = linalg.eig(A)
print("Eigenvalues:  ", la.real)
for i in range(3):
    print(f"Eigenvector {i}:", v[:, i])
Eigenvalues:   [11.275622    1.95159452 -0.22721652]
Eigenvector 0: [0.86622734 0.48211224 0.13121732]
Eigenvector 1: [ 0.51375558 -0.80249012 -0.30342182]
Eigenvector 2: [-0.03309737 -0.76141963  0.64741387]

Gauss-Seidel & Jacobi Methods#

  • Iterative procedures that split the coefficient matrix \(A\) into three parts

\[A = L + D + U\]
  • The lower triangular part \(L\), the diagonal part \(D\) and the upper triangular part \(U\)

\[\begin{split} A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} \end{split}\]
\[\begin{split} L = \begin{bmatrix} 0 & 0 & \cdots & 0 \\ a_{21} & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & 0 \end{bmatrix} \qquad D = \begin{bmatrix} a_{11} & 0 & \cdots & 0 \\ 0 & a_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_{nn} \end{bmatrix} \qquad U = \begin{bmatrix} 0 & a_{12} & \cdots & a_{1n} \\ 0 & 0 & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 \end{bmatrix} \end{split}\]

Gauss-Seidel Method#

  • The linear system is rewritten as:

\[ Ax=b \quad \rightarrow \quad (L+D)x = b-Ux \]
  • Assuming the values of \(x=x^{(0)}\) on the RHS and using the forward substitution algorithm, an iterative update scheme is used

\[ (L+D) x^{(k+1)} = b - U x^{(k)} \]
  • Here \(x^{(k)}\) is the vector containing the solution at \(k^th\) iteration

Jacobi Method#

  • The linear system is rewritten as:

\[ Ax=b \quad \rightarrow \quad Dx = b-(L+U)x \]
  • Starting with an initial guess of \(x=x^{(0)}\), the iterative update scheme is:

\[ D x^{(k+1)} = b - (L+U) x^{(k)} \]
  • Gauss-Seidel method converges if \(A\) is symmetric positive-definite

  • If \(A\) is not symmetric, a sufficient but not necessary condition for convergence is that \(A\) is strictly diagonally dominant

\[ \lvert a_{ii} \rvert > \sum\limits_{i \ne j,j=1}^{n} \lvert a \rvert_{ij} \]
  • This sufficiency condition is applicable to both the methods