Everything should be made as simple as possible, but not simpler. (Albert Einstein)

Sunday, October 4, 2015

Singular Value Decomposition and Dimensionality Reduction, Using R and Cat Image for Illustration Purposes

By Soesilo Wijono

SVD (singular value decomposition) is an important method in data science, especially data mining. It can be used, for example, for dimensionality reduction in a recommender system.
Imagine an online store such as Amazon, with millions of items and millions of users. A recommender algorithm running on the raw data would operate on a millions-by-millions matrix, which is computationally very expensive.
The theory behind dimensionality reduction is covered everywhere, so we won't repeat it here. Just remember the basic equation:
X = U A V.T
where, for an n x d matrix X (full SVD):
U has dimension n x n,
V has dimension d x d,
A is a diagonal matrix of singular values with dimension n x d.
(.T represents the matrix transpose. Note that R's svd() returns the compact form, in which U and V each have min(n, d) columns.)
We want to reduce the dimension of the X matrix.
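As a quick sanity check before touching the image, here is a minimal sketch on a small random matrix showing that svd() returns the three factors and that multiplying them back recovers X up to floating-point error:

```r
# Sanity check on a small random matrix (not the image yet).
set.seed(42)
X = matrix(rnorm(4 * 3), nrow=4, ncol=3)
s = svd(X)                             # R returns the compact SVD
dim(s$u)                               # 4 x 3
dim(s$v)                               # 3 x 3
X_rec = s$u %*% diag(s$d) %*% t(s$v)   # reassemble U A V.T
max(abs(X - X_rec))                    # ~1e-15, i.e. exact up to rounding
```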
This post illustrates the method with a PNG cat image, to help understand it visually using raw image data. In the real world the image can be replaced by any data matrix, e.g. the items x users matrix of a recommender system.

require(ripa)     # provides imagematrix()
require(png)      # provides readPNG()
options(warn=-1)  # suppress warnings (e.g. about pixel values outside [0, 1])
img = readPNG('tcat.png')
img = imagematrix(img, type="rgb")
plot(img) 

This is the X matrix, img. Its dimensions are:
dim(img) 
## [1] 150 300   3
Let’s perform the SVD. Here we have a 3D array, since the image is in RGB format: a red, a green, and a blue channel. We decompose each channel separately.
red   = img[, , 1]
green = img[, , 2]
blue  = img[, , 3]
s_red = svd(red) 
s_green = svd(green) 
s_blue = svd(blue) 
The dimensions of each result are:
dim(s_red$u)
## [1] 150 150
dim(s_red$v)
## [1] 300 150
length(s_red$d)
## [1] 150
There are 150 singular values.
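How many of those 150 values actually matter? A common rule of thumb keeps enough singular values to retain most of the "energy" (the sum of squared singular values). Here is a sketch using a random matrix of the same 150 x 300 shape as a stand-in for a channel; a real image would typically concentrate its energy in far fewer values:

```r
# Stand-in for an image channel: a random 150 x 300 matrix.
set.seed(1)
M = matrix(runif(150 * 300), nrow=150)
s = svd(M)
# Cumulative fraction of energy retained by the leading k singular values:
energy = cumsum(s$d^2) / sum(s$d^2)
which(energy >= 0.90)[1]   # smallest k that retains 90% of the energy
```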
Declare a helper function that keeps only the first d singular values (and the corresponding columns of u and v):
reduce = function(s, d) {
  # Keep the first d left and right singular vectors,
  u = as.matrix(s$u[, 1:d])
  v = as.matrix(s$v[, 1:d])
  # and the top-left d x d block of the diagonal matrix A.
  A = as.matrix(diag(s$d)[1:d, 1:d])
  return(list(u=u, v=v, A=A))
}
Reduce each channel from rank 150 down to 30, i.e. 20% of the singular values:
newdim = 30
new_R = reduce(s_red, newdim) 
new_G = reduce(s_green, newdim) 
new_B = reduce(s_blue, newdim) 
The dimensions of the reduced matrices are:
dim(new_R$u)
## [1] 150  30
dim(new_R$v)
## [1] 300  30
dim(new_R$A)
## [1] 30 30
Reconstruct the “dimensionality reduced” image (the reconstruction can slightly overshoot the [0, 1] pixel range):
newimg = array(0, dim=c(nrow(img), ncol(img), 3)) 

new_R = new_R$u %*% new_R$A %*% t(new_R$v) 
new_G = new_G$u %*% new_G$A %*% t(new_G$v) 
new_B = new_B$u %*% new_B$A %*% t(new_B$v) 

newimg[1:nrow(img), 1:ncol(img), 1] = new_R
newimg[1:nrow(img), 1:ncol(img), 2] = new_G
newimg[1:nrow(img), 1:ncol(img), 3] = new_B

newimg = imagematrix(newimg, type="rgb")
plot(newimg)

Reduce even further, from 150 down to 20, which is around 13%:
newdim = 20 
new_R = reduce(s_red, newdim) 
new_G = reduce(s_green, newdim) 
new_B = reduce(s_blue, newdim) 
The dimensions of the reduced matrices are:
dim(new_R$u)
## [1] 150  20
dim(new_R$v)
## [1] 300  20
dim(new_R$A)
## [1] 20 20
Reconstruct the “dimensionality reduced” image:
newimg = array(0, dim=c(nrow(img), ncol(img), 3)) 

new_R = new_R$u %*% new_R$A %*% t(new_R$v) 
new_G = new_G$u %*% new_G$A %*% t(new_G$v) 
new_B = new_B$u %*% new_B$A %*% t(new_B$v) 

newimg[1:nrow(img), 1:ncol(img), 1] = new_R
newimg[1:nrow(img), 1:ncol(img), 2] = new_G
newimg[1:nrow(img), 1:ncol(img), 3] = new_B

newimg = imagematrix(newimg, type="rgb")
plot(newimg)

Conclusion:

With only 13% - 20% of the original dimensions we can reproduce a good approximation of the data. Dimensionality reduction also removes noise and linearly dependent components, by keeping only the largest singular values.
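The saving is easy to quantify: storing a rank-k truncation of one n x d channel needs k(n + d + 1) numbers (the k columns of U and V plus the k singular values) instead of nd. For the 150 x 300 channels above:

```r
# Storage cost of the rank-k factors versus the full channel.
n = 150; d = 300
full = n * d                   # 45000 entries in the original channel
k30  = 30 * (n + d + 1)        # 13530 entries for the rank-30 factors
k20  = 20 * (n + d + 1)        #  9020 entries for the rank-20 factors
c(k30 / full, k20 / full)      # roughly 0.30 and 0.20
```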

Example with user - movie ratings:

This is a small example from Stanford’s Mining of Massive Datasets (MMDS) course:
# user - movie ratings matrix: 
# columns represent movies, rows represent users 
xc = c(1, 1, 1, 0, 0,
       3, 3, 3, 0, 0,
       4, 4, 4, 0, 0,
       5, 5, 5, 0, 0,
       0, 2, 0, 4, 4,
       0, 0, 0, 5, 5,
       0, 1, 0, 2, 2)
x = matrix(xc, nrow=7, ncol=5, byrow=TRUE)
# SVD 
s = svd(x)
Left singular vectors,
s$u 
##             [,1]        [,2]        [,3]          [,4]          [,5]
## [1,] -0.13759913 -0.02361145 -0.01080847  3.054209e-01  4.467149e-01
## [2,] -0.41279738 -0.07083435 -0.03242542  8.345818e-01 -4.642400e-02
## [3,] -0.55039650 -0.09444581 -0.04323389 -1.701251e-01 -7.266700e-01
## [4,] -0.68799563 -0.11805726 -0.05404236 -4.257332e-01  5.198474e-01
## [5,] -0.15277509  0.59110096  0.65365084 -1.420305e-17 -4.556360e-17
## [6,] -0.07221651  0.73131186 -0.67820922  3.553473e-17 -1.169854e-16
## [7,] -0.07638754  0.29555048  0.32682542 -7.101524e-18 -6.441516e-17
Right singular vectors,
s$v 
##             [,1]        [,2]       [,3]          [,4]          [,5]
## [1,] -0.56225841 -0.12664138 -0.4096675 -7.071068e-01  0.000000e+00
## [2,] -0.59285990  0.02877058  0.8047915 -4.081479e-16  5.384491e-17
## [3,] -0.56225841 -0.12664138 -0.4096675  7.071068e-01 -5.384491e-17
## [4,] -0.09013354  0.69537622 -0.0912571 -4.130810e-17  7.071068e-01
## [5,] -0.09013354  0.69537622 -0.0912571 -4.130810e-17 -7.071068e-01
The singular values, arranged as a diagonal matrix (note the last two are numerically zero, so x has rank 3):
diag(s$d)
##          [,1]     [,2]    [,3]         [,4]         [,5]
## [1,] 12.48101 0.000000 0.00000 0.000000e+00 0.000000e+00
## [2,]  0.00000 9.508614 0.00000 0.000000e+00 0.000000e+00
## [3,]  0.00000 0.000000 1.34556 0.000000e+00 0.000000e+00
## [4,]  0.00000 0.000000 0.00000 2.885057e-16 0.000000e+00
## [5,]  0.00000 0.000000 0.00000 0.000000e+00 7.362016e-33
Reduce the rank from 5 down to 2:
new_s = reduce(s, 2)
new_s$u 
##             [,1]        [,2]
## [1,] -0.13759913 -0.02361145
## [2,] -0.41279738 -0.07083435
## [3,] -0.55039650 -0.09444581
## [4,] -0.68799563 -0.11805726
## [5,] -0.15277509  0.59110096
## [6,] -0.07221651  0.73131186
## [7,] -0.07638754  0.29555048
new_s$v 
##             [,1]        [,2]
## [1,] -0.56225841 -0.12664138
## [2,] -0.59285990  0.02877058
## [3,] -0.56225841 -0.12664138
## [4,] -0.09013354  0.69537622
## [5,] -0.09013354  0.69537622
new_s$A 
##          [,1]     [,2]
## [1,] 12.48101 0.000000
## [2,]  0.00000 9.508614
Reconstruction,
new_x = new_s$u %*% new_s$A %*% t(new_s$v) 
Compare x and new_x element by element:
x
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1    1    1    0    0
## [2,]    3    3    3    0    0
## [3,]    4    4    4    0    0
## [4,]    5    5    5    0    0
## [5,]    0    2    0    4    4
## [6,]    0    0    0    5    5
## [7,]    0    1    0    2    2
new_x
##            [,1]      [,2]       [,3]         [,4]         [,5]
## [1,]  0.9940420 1.0117044  0.9940420 -0.001327193 -0.001327193
## [2,]  2.9821261 3.0351133  2.9821261 -0.003981578 -0.003981578
## [3,]  3.9761681 4.0468178  3.9761681 -0.005308770 -0.005308770
## [4,]  4.9702101 5.0585222  4.9702101 -0.006635963 -0.006635963
## [5,]  0.3603133 1.2921647  0.3603133  4.080263014  4.080263014
## [6,] -0.3738507 0.7344294 -0.3738507  4.916721417  4.916721417
## [7,]  0.1801567 0.6460824  0.1801567  2.040131507  2.040131507
Frobenius (Euclidean) norm of the reconstruction error:
norm(x-new_x, type="F") 
## [1] 1.34556
Note that the error equals the largest discarded singular value: by the Eckart-Young theorem, the Frobenius error of the best rank-k approximation is the square root of the sum of the squared discarded singular values, and here the fourth and fifth values are numerically zero.
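The reduced factors are also what a recommender actually uses. Here is a sketch, following the MMDS course's "concept space" idea, for a hypothetical new user who only rated the first movie (the variable names are illustrative, not from the course material):

```r
# Rebuild the ratings matrix and take its top-2 right singular vectors.
xc = c(1, 1, 1, 0, 0,
       3, 3, 3, 0, 0,
       4, 4, 4, 0, 0,
       5, 5, 5, 0, 0,
       0, 2, 0, 4, 4,
       0, 0, 0, 5, 5,
       0, 1, 0, 2, 2)
x = matrix(xc, nrow=7, ncol=5, byrow=TRUE)
s = svd(x)
V2 = s$v[, 1:2]          # movie "concepts" (here: sci-fi vs romance)
# Hypothetical new user: rated only movie 1, with 5 stars.
q = c(5, 0, 0, 0, 0)
concept = q %*% V2       # the user's position in 2-D concept space
pred = concept %*% t(V2) # mapped back to movie space
round(pred, 2)           # movies 1-3 score high, movies 4-5 near zero
```

Even though the user rated only one movie, the projection through the concept space predicts a strong affinity for the other two movies in the same group.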