# Singular Value Decomposition

This post is a reading note of the first chapter of the book Data-Driven Science and Engineering.

### Basic Information

Definition. The SVD is a unique matrix decomposition that exists for every complex-valued matrix $$X \in \mathbb{C}^{n \times m}$$:

$$X = U \Sigma V^{*}$$

where $$U \in \mathbb{C}^{n \times n}$$ and $$V \in \mathbb{C}^{m \times m}$$ are unitary matrices with orthonormal columns and $$\Sigma \in \mathbb{R}^{n \times m}$$ is a matrix with real, nonnegative entires on the dianogal and zeros off the diagonal.

The form given in the definition is a full SVD, there are other two representations of SVD:

1. Economy SVD
2. Truncated SVD

Their relationships are illustrated in the figure below.

We can show that

$$\begin{eqnarray} XX^{*}U & = & U \begin{pmatrix} \hat{\Sigma}^{2} & 0 \\ 0 & 0 \end{pmatrix} \\ X^{*}XV & = & V \hat{\Sigma}^{2} \end{eqnarray}$$

And note that $$X X^{*}$$ is the covariance matrix of $$X$$ (a row represents a sequence of measurements of a specific attribute). It follows that $$U$$ captures the correlation in the rows of $$X$$.

### SVD Application

There are two major use case of SVD mentioned in this chapter:

1. Approxumation
2. Feature extraction/projection

#### Approximation

Truncated SVD can be used to approximate a given matrix $$X$$ by a low-rank matrix. In the original matrix $$X$$, there are $$n \times m$$ items. In the truncated SVD, we have

$$X \approx \tilde{U} \tilde{\Sigma} \tilde{V}^*$$

where, we have $$n \times r$$ in $$\tilde{U}$$, $$r \times r$$ in $$\tilde{\Sigma}$$ and $$r \times m$$ in $$\tilde{V}$$.

According to Eckart–Young–Mirsky theorem, this is also the optimal rank-r approximation to X in a least-squares sense.

#### Feature Extraction/Projection

As we mentioned earlier, matrix $$U$$ captures the correlation in the rows of $$X$$. Moreover, each column in $$X$$ represents a measurement of different attributes and the column vectors in $$U$$ has the same size as the column vectors in $$X$$. Therefore, the column space of $$U$$ can be used to characterize the column space of $$X$$ and a sub-space of $$col(U)$$ captures a subset of characteristics of $$X$$.

Suppose we have a new measurement $$x$$ and we want to extract main features from it. In order to do that, we need to calcuate the projection of $$x$$ to the sub space of $$col(U)$$. This can be done using the following formular:

$$x_{projected} = \tilde{U} \tilde{U}^* x$$

----- END -----

Welcome to join reddit self-learning community.

Want some fun stuff?