This post is going to explain one of the basic building blocks for solving “Linear Algebra Equations”. Consider a set of equations:
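\[
\begin{aligned}
a_{00}\,x_0 + a_{01}\,x_1 + \cdots + a_{0,M-1}\,x_{M-1} &= b_0 \\
a_{10}\,x_0 + a_{11}\,x_1 + \cdots + a_{1,M-1}\,x_{M-1} &= b_1 \\
&\;\vdots \\
a_{N-1,0}\,x_0 + a_{N-1,1}\,x_1 + \cdots + a_{N-1,M-1}\,x_{M-1} &= b_{N-1}
\end{aligned}
\]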

This is a system of M unknowns \(x_0, x_1, \cdots, x_{M-1}\) and N equations. Each variable can be thought of as a degree of freedom and each equation as a constraint. Think of a three-variable situation, like the position of a person in 3-D coordinates. Without any constraints, he has three degrees of freedom, in the x, y and z directions. If we are given three equations describing his position (each equation in x, y and z represents a plane in 3-D), we can pinpoint his coordinates in 3-D space.

[Figure: three planes intersecting at a single point in 3-D space]

Validation

  • If \(M > N\), the number of unknowns is greater than the number of equations; the system is said to be underdetermined and has infinitely many solutions. The solution space can be restricted using techniques such as Compressed Sensing.
  • If \(M < N\), the number of equations is greater than the number of unknowns, and the system is said to be overdetermined. Here the general approach is to find the best-fit solution (i.e. the one for which the R.M.S. error over all equations is a minimum).
  • If \(M = N\), the system has a unique solution provided the following caveats are satisfied:

    • No row should be a linear combination of the other rows; such a dependency leads to row degeneracy.
    • If all the equations contain a certain variable in exactly the same linear combination, the system is afflicted by column degeneracy.
  • Both these degeneracies effectively result in the removal of a constraint, and the system thus becomes indeterminate (see the example below).
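For instance, in the system

\[
\begin{aligned}
x_0 + x_1 &= 2 \\
2x_0 + 2x_1 &= 4
\end{aligned}
\]

the second equation is just twice the first, so what looks like two constraints is effectively only one, and the system has infinitely many solutions.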

Pivoting

In order to obtain more accurate results and reduce round-off errors, a technique called Pivoting is used: dividing by a small pivot amplifies round-off error, so the rows (and possibly columns) are reordered to bring a large element into the pivot position. Pivoting is done while converting a matrix to its row echelon form.

What is row echelon form?

A matrix is said to be in row echelon form if:

  • All nonzero rows are above the zero rows.
  • The first nonzero number in a row from the left, called the leading coefficient or Pivot, should be strictly to the right of the leading coefficient of the row above it.
  • All entries in a column below the leading coefficient must be zero.

Here is an example of a matrix in row echelon form:
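\[
\begin{pmatrix}
2 & 1 & -1 & 3 \\
0 & 4 & 2 & 1 \\
0 & 0 & 5 & -2 \\
0 & 0 & 0 & 0
\end{pmatrix}
\]

The leading coefficients 2, 4 and 5 each lie strictly to the right of the one in the row above, and the single zero row sits at the bottom.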

Pivoting can be done in two ways:

Partial Pivoting

In this scheme the algorithm selects, from the current column, the element with the largest absolute value and shuffles the rows in such a way that it lies along the diagonal.
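A minimal sketch of that selection step, assuming the matrix is stored as a vector of rows (the names Matrix and partialPivot are mine, not the book's):

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Partial pivoting for elimination step k: scan column k from row k
// downwards for the entry of largest absolute value and swap that row
// into position k, so the pivot ends up on the diagonal.
void partialPivot(Matrix& a, std::size_t k) {
    std::size_t best = k;
    for (std::size_t r = k + 1; r < a.size(); ++r)
        if (std::fabs(a[r][k]) > std::fabs(a[best][k]))
            best = r;
    if (best != k)
        std::swap(a[k], a[best]); // swap whole rows; B must be swapped too
}
```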

Complete Pivoting

The algorithm scans the whole matrix for the largest element and shuffles both columns and rows to place the pivot along the diagonal \(a_{ii}\).
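A sketch of the corresponding search, under the same assumptions as above; the caller performs the actual row and column swaps:

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Complete pivoting for elimination step k on a square matrix: scan the
// remaining submatrix a[k..n-1][k..n-1] for the entry of largest
// magnitude and return its (row, column) position.
std::pair<std::size_t, std::size_t> findFullPivot(const Matrix& a, std::size_t k) {
    std::size_t pr = k, pc = k;
    double big = 0.0;
    for (std::size_t r = k; r < a.size(); ++r)
        for (std::size_t c = k; c < a[r].size(); ++c)
            if (std::fabs(a[r][c]) > big) {
                big = std::fabs(a[r][c]);
                pr = r;
                pc = c;
            }
    return {pr, pc}; // caller swaps rows k<->pr and columns k<->pc
}
```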

The Algorithm

We will be using an example matrix to illustrate this algorithm (which is given in the textbook Numerical Recipes in C++):

The equations we aim to solve are:

The algorithm takes two inputs: the coefficient matrix A and the solution vector B (the right-hand side). The inverse of the matrix is returned in A and the variable vector is returned in B.

Step 1: Finding the Pivot Element

In the first step the algorithm iterates through the matrix and finds the largest element; in the first iteration of our example this pivot element lies in the last row. It comes out to be five and is in the first column, so there is no need for a column swap; its row only needs to be swapped with the first row. This swap is recorded in two book-keeping arrays storing the actual position of the pivot, so that the result can be restored later.

The next time the algorithm searches for a Pivot element, it excludes \(R_1\) and \(C_1\) from the search (a code sketch of this exclusion follows the figures below).

[Figure: the matrix before the first pivot swap]

[Figure: the matrix after the first pivot swap]
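One way to implement this exclusion is a flag array that marks each row and column once it has supplied a pivot. A minimal sketch, with names of my own choosing in the spirit of the book-keeping the text describes:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Find the next pivot anywhere in a square matrix, skipping every row and
// column that has already supplied a pivot (flagged in `used`).
void nextPivot(const Matrix& a, std::vector<bool>& used,
               std::size_t& irow, std::size_t& icol) {
    double big = 0.0;
    for (std::size_t r = 0; r < a.size(); ++r) {
        if (used[r]) continue;           // row r already holds a pivot
        for (std::size_t c = 0; c < a.size(); ++c) {
            if (used[c]) continue;       // column c has already been reduced
            if (std::fabs(a[r][c]) > big) {
                big = std::fabs(a[r][c]);
                irow = r;
                icol = c;
            }
        }
    }
    used[icol] = true;                   // exclude this pivot from later searches
}
```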

Step 2: Normalizing the row

Before we go through this step, we need to understand why the method actually works. Using our transformations we are basically converting the matrix into the identity matrix I. Therefore,
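\[
A \cdot X = B \quad\Longrightarrow\quad I \cdot X = B_1^\prime
\]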

where \(B_1^\prime\) is the transformed solution vector.

As we are using the equation \(A \cdot Y = I\) to determine the inverse of the matrix, we store the result back in A.
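Solving that equation for the unknown matrix \(Y\),

\[
A \cdot Y = I \quad\Longrightarrow\quad Y = A^{-1}
\]

so the very row operations that reduce A to I simultaneously build up \(A^{-1}\), which is why the inverse can be accumulated in place of A.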

This step can be further subdivided into two sub-steps:

  • Normalize the pivot row by dividing it through by the Pivot element, so now our matrix equation looks like:

    The solution vector also gets transformed as:

  • The next step is to reduce each element below the Pivot to zero by subtracting the right amount of the first (pivot) row:

and similar transforms are performed on the solution vector.
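Both sub-steps together look roughly like this, for a pivot already swapped to position \(a_{kk}\) (a sketch with assumed names; note that Gauss-Jordan also clears the entries above the pivot, so the loop covers every row other than the pivot row):

```cpp
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Step 2 for a pivot already swapped to a[k][k]: divide the pivot row
// through by the pivot, then subtract the right multiple of it from every
// other row so that column k becomes a column of the identity matrix.
void normalizeAndEliminate(Matrix& a, std::vector<double>& b, std::size_t k) {
    const double pivinv = 1.0 / a[k][k];       // assumes a non-zero pivot
    for (std::size_t c = 0; c < a[k].size(); ++c)
        a[k][c] *= pivinv;
    b[k] *= pivinv;

    for (std::size_t r = 0; r < a.size(); ++r) {
        if (r == k) continue;
        const double factor = a[r][k];         // the "right amount" of the pivot row
        for (std::size_t c = 0; c < a[r].size(); ++c)
            a[r][c] -= factor * a[k][c];
        b[r] -= factor * b[k];                 // same transform on the solution vector
    }
}
```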

We will discuss certain parts of the second iteration, as they are slightly different from the first:

[Figure: the matrix at the start of the second iteration]

Now, while iterating for the second column, the largest element found is at \(R_4, C_4\).

Here there is no need for swapping, as the pivot is found along the diagonal itself. At the end we have done pivoting for all columns and have reduced our matrix, but we need to account for the shuffling that we have done. Let us say that our book-keeping arrays are:

[Figure: the book-keeping arrays recording the row and column of each pivot]

Let us take the first case:

As the row and column numbers were not the same, there was an initial swap that needs to be restored. So, we swap \(C_4\) with \(C_0\). A row operation on the input matrix appears as a column operation in its inverse (which explains the shuffling of columns instead of rows).
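A sketch of that final clean-up, assuming indxr[k] and indxc[k] are the book-keeping arrays holding the row and column of the k-th pivot (array names chosen here for illustration):

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Undo the pivot shuffles at the end. The k-th pivot swapped row indxr[k]
// with row indxc[k]; on the inverse this must be undone as a *column*
// swap, and the swaps are unwound in reverse order.
void unscrambleColumns(Matrix& a, const std::vector<std::size_t>& indxr,
                       const std::vector<std::size_t>& indxc) {
    for (std::size_t i = indxr.size(); i-- > 0;) {
        if (indxr[i] == indxc[i]) continue;    // pivot was already on the diagonal
        for (std::size_t r = 0; r < a.size(); ++r)
            std::swap(a[r][indxr[i]], a[r][indxc[i]]);
    }
}
```

Unwinding in reverse order matters: each swap was applied on top of the previous ones, so they must be undone last-in, first-out.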