Householder transformation
Basically

Alston Scott Householder was a real genius I think. The Householder transformation he invented is a master piece of mathematics  :-)
The Householder transformation is an orthogonal transformation to eliminate more than one element of a given column in a matrix. It builds a transformation matrix

 U = I – 2 * ω * ωT

With I as the unit matrix and ω as a direction vector whit the length of 1. It’s quite a cool approach to solve a matrix equation and a very smart one if it comes to the method of the least squares to approximate some data points.

Most of the descriptions say what the transformation matrix does is basically mirroring a vector by the hyperplane which is orthogonal to ω and running through the origin. In my opinion this can be misleading a bit. At least me it was misguiding in the beginning :-) The vector is not mirrored on the surface of the hyperplane as a light beam would be mirrored on the mirror of a bathroom. The hyperplane is just a plane and it is actually the axis for the mirroring. That means the vector is flapped to the other side of this hyperplane. That’s an important fact in my opinion.

The direction of this plane is chosen that way, that all the elements of the resulting vector will be 0 except the first one.

Therefore for a first step we have a look at a vector s which is orthogonal to ω. That means it is lying flat on the hyperplane. This we multiply by the Householder matrix and get:

s’ = Us = (I – 2ωωT) * s = s - 2ω (ωTs) = s

Remark that ωT * s is 0 because of the orthogonality of ω and s. So the transformation does not change anything on this vector.

For the second step we have a look at a vector z which is parallel to ω. It’s standing upright on the hyperplane. So it can be substituted by a constant multiplied by ω: z = cω

 z’ = Uz = cUω = c(I – 2ωωT)ω = c(ω - 2 ω(ωTω)) = - cω = -z

With I * ω = ω and ωTω = 1. The transformation switches the direction of this vector.


Now any vector x can be regarded as a sum of s and z: X = s + z and the transformation of this vector x is:

x’ = Ux = U(s + z) = s – z

The s part of this vector, which is lying flat on the hyperplane, is not affected and the z part which is upright on the hyperplane switches the sign.


Graphically that looks like this: :-)


Householder


Whit the blue rectangle marking a part of the hyperplane.
(I didn’t mark the coordinate origin. It’s where ω, x and x’ cross)



Calculating the parameters

Now the interesting part is how to get to the parameters of ω. Therefore we start with the first row of the main matrix in the matrix equation

M * x = y

This row we regard as a vector m1. This vector shall be transformed into a vector γ * e1 in which only the first element is not 0. It consists of the unit vector e1 multiplied by a constant γ. So the equation for the first transformation matrix is

U1 * m1 = γ * e1

and as e1 has the length of 1, γ must be the length of m1:

Householder


But as a root always has 2 solutions, γ could be positive as well as negative. So it is taken as positive if m11 is positive and negative if m11 is negative. This should be to avoid extinction (o.k. :-))

That was the first step. For the next step it might be helpful to have a look onto the graphical situation in a 2 dimensional room:

Householder


To rotate the vector x to x’ we would basically need to find the location of the hyperplane (the blue line here) to mirror x. In the 2 dimensional room that would be the angle bisector between x and x’. In the 3 dimensional room it’s the same and in a more dimensional root too. Even if we can not imagine that anymore. But finally we are not interested in the hyperplane but in ω which occurs in the Householder matrix. The vector ω is orthogonal to the hyperplane and its direction is the same as x – x’ and as x’ should be γ * e1 and x is m1, the direction of ω will be m1 - γ * e1 and as γ can be negative as well, we can say m1 + γ * e1 for the direction as well.

Now there is some substitution: m1 + γ * e1 = h and we need to calculate the correction of the length for ω as m1 + γ * e1 or h has the same direction, but not the same length jet, we need to fix that. Now there is an interesting trick: The length of h is:

Householder


and this is the same as

Householder


the transposed matrix of h multiplied by h.


Now this written with m1 + γ * e1

Householder


and expanded

Householder


In the special case here we can say

Householder


and get

Householder


and

Householder


and so

Householder


and with this we get for ω

Householder


as vector and the elements

Householder


For k = 1 ... N


Now we can put this all into an algorithm to calculate the elements of U = I – 2 * ω * ωT For a 5 * 5 matrix that looks like:

 
Householder



This matrix multiplied by the matrix M eliminates all the elements in the first row except the element m11:

 
Householder



To eliminate the elements m32, m42 and m52 in the second row now, we have the do the same again using the matrix m22 to m55.

 
Householder



And so on :-)


This I put into a C# function;
 


public void RunHouseholder(int col)
{
     int i, j;
     double lamda = 0;
     double beta;
     double[] w = new double[samples];
     double[,] I = new double[samples, samples];
 
     // calc parameters
     for (i = col; i < samples; i++)
     {
         lamda = lamda + m[i, col] * m[i, col];
     }
     lamda = Math.Sqrt(lamda);
     if (m[col, col] < 0)
         lamda = -lamda;
     beta = Math.Sqrt(2 * lamda * (lamda + m[col, col]));
     w[col] = (lamda + m[col, col]) / beta;
     for (i = col + 1; i < samples; i++)
         w[i] = m[i, col] / beta;
 
     //calc matrix parameters
     for (i = 0; i < samples; i++) // unit matrix
     {
         for (j = 0; j < samples; j++)
         {
              I[i, j] = 0.0;
         }
         I[i, i] = 1.0;
     }
     for (i = 0; i < samples; i++)
     {
         if (i >= col)
         {
              for (j = 0; j < samples; j++)
              {
                   if (j >= col)
                        r[i, j] = I[i, j] - 2 * w[i] * w[j];
                   else
                        r[i, j] = I[i, j];
              }
         }
         else
         {
              for (j = 0; j < samples; j++)
                   r[i, j] = I[i, j];
         }
     }
     MultMatrix(r);
}




With



public void MultMatrix(double[,] m1)
{
     int i, j, k;
     double[,] al = new double[size, size];
     double[] yl = new double[size];
     for (i = 0; i < size; i++)
     {
         yl[i] = 0.0;
         for (j = 0; j < size; j++)
         {
              yl[i] = yl[i] + (y[j] * m1[i, j]);
              al[i, j] = 0.0;
              for (k = 0; k < size; k++)
              {
                   al[i, j] = al[i, j] + (m[k, j] * m1[i, k]);
              }
         }
     }
     for (i = 0; i < size; i++)
     {
         y[i] = yl[i];
         for (j = 0; j < size; j++)
         {
              m[i, j] = al[i, j];
         }
     }
}





This function needs to be called N-1 times (MaxOrder stands for N here). Then the main matrix C is converted to an upper triangle matrix.



for (i = 0; i < MaxOrder-1; i++)
     RunHouseholder(i);





And the matrix equation C’’x = y can be solved be a backward substitution.



public double[] Solve()
{
     int k, l;
     double[] x = new double[maxOrder];
     for (k = maxOrder - 1; k >= 0; k--)
     {
         for (l = maxOrder - 1; l > k; l--)
         {
              y[k] = y[k] - x[l] * m[k, l];
         }
         if (m[k, k] != 0)
              x[k] = y[k] / m[k, k];
         else
              x[k] = 0;
     }
     return x;
}





A beautiful algorithm. Isn’t it :-)



C# Demo Project Householder TransformationCardano
  • Matrix_Householder.zip


  • Java Demo Project Householder TransformationCardano
  • MatrixHouseholder.zip