## QR Decomposition with the Gram-Schmidt Algorithm

QR decomposition is another technique for decomposing a matrix into a form that is easier to work with in further applications. The QR decomposition technique decomposes a square or rectangular matrix, which we will denote as $A$, into two components, $Q$, and $R$.
$$A = QR$$

Where $Q$ is an orthogonal matrix, and $R$ is an upper triangular matrix. Recall an orthogonal matrix is a square matrix with orthonormal row and column vectors such that $Q^T Q = I$, where $I$ is the identity matrix. The term orthonormal implies the vectors are of unit length and are perpendicular (orthogonal) to each other.

QR decomposition is often used in linear least squares estimation and is, in fact, the method used by R in its lm() function. Signal processing and MIMO systems also employ QR decomposition. There are several methods for performing QR decomposition, including the Gram-Schmidt process, Householder reflections, and Givens rotations. This post is concerned with the Gram-Schmidt process.

## The Gram-Schmidt Process

The Gram-Schmidt process is used to find an orthogonal basis from a non-orthogonal basis. An orthogonal basis has many properties that are desirable for further computations and expansions. As noted previously, an orthogonal matrix has row and column vectors of unit length:

$$||a_n|| = \sqrt{a_n \cdot a_n} = \sqrt{a_n^T a_n} = 1$$

Where $a_n$ is a linearly independent column vector of a matrix. The vectors are also perpendicular in an orthogonal basis. The Gram-Schmidt process works by finding an orthogonal projection $q_n$ for each column vector $a_n$ and then subtracting its projections onto the previous projections $(q_j)$. The resulting vector is then divided by the length of that vector to produce a unit vector.

Consider a matrix $A$ with $n$ column vectors such that:

$$A = \left[ a_1 | a_2 | \cdots | a_n \right]$$

The Gram-Schmidt process proceeds by finding the orthogonal projection of the first column vector $a_1$.

$$v_1 = a_1, \qquad e_1 = \frac{v_1}{||v_1||}$$

Because $a_1$ is the first column vector, there is no preceeding projections to subtract. The second column $a_2$ is subtracted by the previous projection on the column vector:

$$v_2 = a_2 - proj_{v_1} (a_2) = a_2 - (a_2 \cdot e_1) e_1, \qquad e_2 = \frac{v_2}{||v_2||}$$

This process continues up to the $n$ column vectors, where each incremental step $k + 1$ is computed as:

$$v_{k+1} = a_{k+1} - (a_{k+1} \cdot e_{1}) e_1 - \cdots - (a_{k+1} \cdot e_k) e_k, \qquad e_{k+1} = \frac{u_{k+1}}{||u_{k+1}||}$$

The $|| \cdot ||$ is the $L_2$ norm which is defined as:

$$\sqrt{\sum^m_{j=1} v_k^2}$$ The projection can also be defined by:

Thus the matrix $A$ can be factorized into the $QR$ matrix as the following:

$$A = \left[a_1 | a_2 | \cdots | a_n \right] = \left[e_1 | e_2 | \cdots | e_n \right] \begin{bmatrix}a_1 \cdot e_1 & a_2 \cdot e_1 & \cdots & a_n \cdot e_1 \\ 0 & a_2 \cdot e_2 & \cdots & a_n \cdot e_2 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & a_n \cdot e_n\end{bmatrix} = QR$$

## Gram-Schmidt Process Example

Consider the matrix $A$:

$$\begin{bmatrix} 2 & - 2 & 18 \\ 2 & 1 & 0 \\ 1 & 2 & 0 \end{bmatrix}$$

We would like to orthogonalize this matrix using the Gram-Schmidt process. The resulting orthogonalized vector is also equivalent to $Q$ in the $QR$ decomposition.

The Gram-Schmidt process on the matrix $A$ proceeds as follows:

$$v_1 = a_1 = \begin{bmatrix}2 \\ 2 \\ 1\end{bmatrix} \qquad e_1 = \frac{v_1}{||v_1||} = \frac{\begin{bmatrix}2 \\ 2 \\ 1\end{bmatrix}}{\sqrt{\sum{\begin{bmatrix}2 \\ 2 \\ 1\end{bmatrix}^2}}}$$ $$e_1 = \begin{bmatrix} \frac{2}{3} \\ \frac{2}{3} \\ \frac{1}{3} \end{bmatrix}$$
$$v_2 = a_2 - (a_2 \cdot e_1) e_1 = \begin{bmatrix}-2 \\ 1 \\ 2\end{bmatrix} - \left(\begin{bmatrix}-2 \\ 1 \\ 2\end{bmatrix}, \begin{bmatrix} \frac{2}{3} \\ \frac{2}{3} \\ \frac{1}{3} \end{bmatrix}\right)\begin{bmatrix} \frac{2}{3} \\ \frac{2}{3} \\ \frac{1}{3} \end{bmatrix}$$
$$v_2 = \begin{bmatrix}-2 \\ 1 \\ 2\end{bmatrix} \qquad e_2 = \frac{v_2}{||v_2||} = \frac{\begin{bmatrix}-2 \\ 1 \\ 2\end{bmatrix}}{\sqrt{\sum{\begin{bmatrix}-2 \\ 1 \\ 2\end{bmatrix}^2}}}$$
$$e_2 = \begin{bmatrix} -\frac{2}{3} \\ \frac{1}{3} \\ \frac{2}{3} \end{bmatrix}$$
$$v_3 = a_3 - (a_3 \cdot e_1) e_1 - (a_3 \cdot e_2) e_2$$
$$v_3 = \begin{bmatrix}18 \\ 0 \\ 0\end{bmatrix} - \left(\begin{bmatrix}18 \\ 0 \\ 0\end{bmatrix}, \begin{bmatrix} \frac{2}{3} \\ \frac{2}{3} \\ \frac{1}{3} \end{bmatrix}\right)\begin{bmatrix} \frac{2}{3} \\ \frac{2}{3} \\ \frac{1}{3} \end{bmatrix} - \left(\begin{bmatrix}18 \\ 0 \\ 0\end{bmatrix}, \begin{bmatrix} -\frac{2}{3} \\ \frac{1}{3} \\ \frac{2}{3} \end{bmatrix} \right)\begin{bmatrix} -\frac{2}{3} \\ \frac{1}{3} \\ \frac{2}{3} \end{bmatrix}$$
$$v_3 = \begin{bmatrix}2 \\ - 4 \\ 4 \end{bmatrix} \qquad e_3 = \frac{v_3}{||v_3||} = \frac{\begin{bmatrix}2 \\ -4 \\ 4\end{bmatrix}}{\sqrt{\sum{\begin{bmatrix}2 \\ -4 \\ 4\end{bmatrix}^2}}}$$
$$e_3 = \begin{bmatrix} \frac{1}{3} \\ -\frac{2}{3} \\ \frac{2}{3} \end{bmatrix}$$

Thus, the orthogonalized matrix resulting from the Gram-Schmidt process is:

$$\begin{bmatrix} \frac{2}{3} & -\frac{2}{3} & \frac{1}{3} \\ \frac{2}{3} & \frac{1}{3} & -\frac{2}{3} \\ \frac{1}{3} & \frac{1}{3} & \frac{2}{3} \end{bmatrix}$$

The component $R$ of the QR decomposition can also be found from the calculations made in the Gram-Schmidt process as defined above.

$$R = \begin{bmatrix}a_1 \cdot e_1 & a_2 \cdot e_1 & \cdots & a_n \cdot e_1 \\ 0 & a_2 \cdot e_2 & \cdots & a_n \cdot e_2 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & a_n \cdot e_n \end{bmatrix} = \begin{bmatrix} \begin{bmatrix} 2 \\ 2 \\ 1 \end{bmatrix} \cdot \begin{bmatrix} \frac{2}{3} \\ \frac{2}{3} \\ \frac{1}{3} \end{bmatrix} & \begin{bmatrix} -2 \\ 1 \\ 2 \end{bmatrix} \cdot \begin{bmatrix} \frac{2}{3} \\ \frac{2}{3} \\ \frac{1}{3} \end{bmatrix} & \begin{bmatrix} 18 \\ 0 \\ 0 \end{bmatrix} \cdot \begin{bmatrix} \frac{2}{3} \\ \frac{2}{3} \\ \frac{1}{3} \end{bmatrix} \\ 0 & \begin{bmatrix} -2 \\ 1 \\ 2 \end{bmatrix} \cdot \begin{bmatrix} -\frac{2}{3} \\ \frac{1}{3} \\ \frac{2}{3} \end{bmatrix} & \begin{bmatrix} 18 \\ 0 \\ 0 \end{bmatrix} \cdot \begin{bmatrix} -\frac{2}{3} \\ \frac{1}{3} \\ \frac{2}{3} \end{bmatrix} \\ 0 & 0 & \begin{bmatrix} 18 \\ 0 \\ 0 \end{bmatrix} \cdot \begin{bmatrix} \frac{1}{3} \\ -\frac{2}{3} \\ \frac{2}{3} \end{bmatrix}\end{bmatrix}$$
$$R = \begin{bmatrix} 3 & 0 & 12 \\ 0 & 3 & -12 \\ 0 & 0 & 6 \end{bmatrix}$$

## The Gram-Schmidt Algorithm in R

We use the same matrix $A$ to verify our results above.

A <- rbind(c(2,-2,18),c(2,1,0),c(1,2,0))
A
##      [,1] [,2] [,3]
## [1,]    2   -2   18
## [2,]    2    1    0
## [3,]    1    2    0

The following function is an implementation of the Gram-Schmidt algorithm using the modified version of the algorithm. A good comparison of the classical and modified versions of the algorithm can be found here. The Modified Gram-Schmidt algorithm was used above due to its improved numerical stability, which results in more orthogonal columns over the Classical algorithm.

gramschmidt <- function(x) {
x <- as.matrix(x)
# Get the number of rows and columns of the matrix
n <- ncol(x)
m <- nrow(x)
# Initialize the Q and R matrices
q <- matrix(0, m, n)
r <- matrix(0, n, n)
for (j in 1:n) {
v = x[,j] # Step 1 of the Gram-Schmidt process v1 = a1
# Skip the first column
if (j > 1) {
for (i in 1:(j-1)) {
r[i,j] <- t(q[,i]) %*% x[,j] # Find the inner product (noted to be q^T a earlier)
# Subtract the projection from v which causes v to become perpendicular to all columns of Q
v <- v - r[i,j] * q[,i]
}
}
# Find the L2 norm of the jth diagonal of R
r[j,j] <- sqrt(sum(v^2))
# The orthogonalized result is found and stored in the ith column of Q.
q[,j] <- v / r[j,j]
}
# Collect the Q and R matrices into a list and return
qrcomp <- list('Q'=q, 'R'=r)
return(qrcomp)
}

Perform the Gram-Schmidt orthogonalization process on the matrix $A$ using our function.

gramschmidt(A)
## $Q ## [,1] [,2] [,3] ## [1,] 0.6666667 -0.6666667 0.3333333 ## [2,] 0.6666667 0.3333333 -0.6666667 ## [3,] 0.3333333 0.6666667 0.6666667 ## ##$R
##      [,1] [,2] [,3]
## [1,]    3    0   12
## [2,]    0    3  -12
## [3,]    0    0    6

The results of our function match those of our manual calculations!

The qr() function in R also performs the Gram-Schmidt process. The qr() function does not output the $Q$ and $R$ matrices, which must be found by calling qr.Q() and qr.R(), respectively, on the qr object.

The qr() function performs QR decomposition using Householder reflections, but we can use it to verify our results in this case. Householder reflections is the more common approach to performing QR decomposition as it is more numerically stable. The Gram-Schmidt process can result in a non-orthogonal $Q$ matrix due to rounding errors. The qr() does not output the $Q$ and $R$ matrices, which must be found by calling qr.Q() and qr.R(), respectively, on the qr object.

A.qr <- qr(A)
A.qr.out <- list('Q'=qr.Q(A.qr), 'R'=qr.R(A.qr))
A.qr.out
## $Q ## [,1] [,2] [,3] ## [1,] -0.6666667 0.6666667 0.3333333 ## [2,] -0.6666667 -0.3333333 -0.6666667 ## [3,] -0.3333333 -0.6666667 0.6666667 ## ##$R
##      [,1] [,2] [,3]
## [1,]   -3    0  -12
## [2,]    0   -3   12
## [3,]    0    0    6

Thus the qr() function in R matches our function and manual calculations as well.

## References

http://www.calpoly.edu/~jborzell/Courses/Year%2005-06/Spring%202006/304Gram_Schmidt_Exercises.pdf

http://cavern.uark.edu/~arnold/4353/CGSMGS.pdf

https://www.math.ucdavis.edu/~linear/old/notes21.pdf

http://www.math.ucla.edu/~yanovsky/Teaching/Math151B/handouts/GramSchmidt.pdf

• #### QR Decomposition with the Gram-Schmidt Algorithm – Mubashir Qasim

March 23, 2017 at 8:18 pm Reply

[…] article was first published on R – Aaron Schlegel, and kindly contributed to […]

• #### QR Decomposition with the Gram-Schmidt Algorithm | A bunch of data

March 24, 2017 at 2:58 am Reply

[…] article was first published on R – Aaron Schlegel, and kindly contributed to […]

• #### Mariusz K

March 24, 2017 at 3:58 am Reply

Hello!!!
What about the numerical stability of this approach?

• #### M Edward/Ed Borasky (@znmeb)

March 24, 2017 at 4:01 pm Reply

It’s been a long time since I dug into this, but my recollection is that Gram-Schmidt is pedagogically simpler but computationally problematic. For dense in-core problems, Givens is more widely used and for out of core / sparse problems, variants of Arnoldi / Lanczos / Conjugate gradient are more widely used.

• #### Aaron Schlegel

March 25, 2017 at 10:25 am Reply

Hi Edward,

Thank you for your comment! I agree with your statement regarding the Gram-Schmidt procedure. I’m currently working on a Householder reflection function and it is proving to be more difficult than Gram-Schmidt! I believe Householder reflections are also more common in QR decomposition approaches as it is ‘more orthogonal’ compared to MGS and costs the same computationally. Rutishauser proposed a method of MGS that introduces reorthogonalization that is supposedly as computationally stable as some of the methods you mentioned. There’s a great paper here by Gander that discusses this.

• #### Thomas Lumley

April 13, 2017 at 6:58 pm Reply

Hi Aaron,

The qr() function in R does not use the Gram-Schmidt method.

As the help page for qr() says, by default it uses the LINPACK dqrdc routine, modified to pivot only when necessary; dqrdc uses Householder reflections. The source code file is here: https://svn.r-project.org/R/trunk/src/appl/dqrdc2.f.

Alternatively, there is an option to use the LAPACK routine DGEQP3. This also uses Householder reflections.

• #### Aaron Schlegel

April 14, 2017 at 9:40 am Reply

Hi Thomas,

Thank you so much for pointing that out! You are completely right, I meant to say the qr() function performs QR decomposition with Householder reflections rather than Gram-Schmidt, but the results can be verified with the function in this case regardless. I’ve updated the post the clear that up.

• #### Yuanchin

December 20, 2017 at 7:42 am Reply

Hi Aaron
I hope you can help me understand a concept in R language. I am just a starter for the R programming and making efforts to learn it.
My question is in your gramschmidt function at line 14:
(r[i,j] <- t(q[,i]) %*% x[,j] # Find the inner product (noted to be q^T a earlier)

Since both q and r are just defined as zero matrix in the function (line 7 & 8), how does it work to be used here? why shouldn't the dot product become zero too.?

If you can email me or just send me website links relating to this question would be greatly appreciated.
I am a biologist, I hope my question can even make sense to you.. Thank you.

• #### Aaron Schlegel

December 22, 2017 at 8:44 am Reply

Hi Yuanchin,

Thank you so much for your comment! You’re totally right that the dot product of the zero matrices should be 0; however, on the first iteration of the loop that begins on line 9, the calculation of the dot product is skipped as j would still be equal to 1. Thus the function proceeds straight to finding the L2 norm on the first iteration and doesn’t calculate the dot product until the next go arounds when the matrices Q and R are no longer 0.

Please let me know if this helps! Although it’s not related to the coding of the algorithm specifically, I found this resource: http://www.math.ucla.edu/~yanovsky/Teaching/Math151B/handouts/GramSchmidt.pdf, to be very helpful when I was writing the post.

P.S. Looking back at these posts I realize the syntax highlighting doesn’t make the code very clear. I am in the process of changing the plugin I use for syntax highlighting so these should look more like R scripts very shortly. I apologize if this made it difficult to read.

• #### Yuanchin

December 23, 2017 at 2:42 am Reply

Hi Aaron
Thank you so much for explaining all the detail and including the math link! They are really helpful.
One of my friends suggests me using print function to trace the changes of Q and R, now I can follow and understand the corresponding steps in the Gram-Schmidt process.
Thanks again for your sharing and thoughtful responses !! Merry Christmas!!

• #### Aaron Schlegel

December 23, 2017 at 9:56 am Reply

Hi Yuanchin,

I am so glad I was able to help! Happy holidays to you as well. ^_^

• #### Abby Smith

April 17, 2018 at 10:16 am Reply

Hi Aaron,

I have been trying to find the R code for the Householder triangularization algorithm which is similar to the Gram-Schmidt one. Could you guide me to where I can find it?

Thanks