Singular Value Decomposition and Dimensionality Reduction, Using R and Cat Image for Illustration Purposes, by Soesilo Wijono,
SVD (singular value decomposition) is an important method used in data science, especially data mining. It can be used, e.g., in dimensionality reduction for recommender system.Imagine online store, e.g. Amazon, to have million of items, and million of users. In order to perform algorithm for the recommender system, matrix to be used would have million by million dimension. Which is very expensive computation.
Theory for dimensionality reduction is everywhere, so we won’t repeat it again in here. Just remember the basic equation:
X = U A V.TV matrix has dimension of d x d.
A matrix is diagonal matrix with dimension of n x d.
(T represents matrix transpose.)
We want to reduce the dimension of X matrix.
This is an illustration of the method by using a PNG cat image. To help understanding the method visually. In which we’ll use image raw data. In real world, the image data can be replaced by any data, e.g. items x users matrix used in an recommender system, etc.
require(ripa)
require(png)
options(warn=-1)img = readPNG('tcat.png')
img = imagematrix(img, type="rgb")
plot(img) This is the X matrix,
img. Dimension of the matrix is,dim(img) ## [1] 150 300   3red = img[1:nrow(img), 1:ncol(img), 1]
green = img[1:nrow(img), 1:ncol(img), 2]
blue = img[1:nrow(img), 1:ncol(img), 3]
s_red = svd(red) 
s_green = svd(green) 
s_blue = svd(blue) dim(s_red$u)## [1] 150 150dim(s_red$v)## [1] 300 150length(s_red$d)## [1] 150Declare helper function to reduce the singular values,
reduce = function(s, d) {
  u = as.matrix(s$u[, 1:d])
  v = as.matrix(s$v[, 1:d])
  # Create the diagonal matrix A,  
  A = as.matrix(diag(s$d)[1:d, 1:d])
  return(list(u=u, v=v, A=A))
}newdim = 30
new_R = reduce(s_red, newdim) 
new_G = reduce(s_green, newdim) 
new_B = reduce(s_blue, newdim) dim(new_R$u)## [1] 150  30dim(new_R$v)## [1] 300  30dim(new_R$A)## [1] 30 30newimg = array(0, dim=c(nrow(img), ncol(img), 3)) 
new_R = new_R$u %*% new_R$A %*% t(new_R$v) 
new_G = new_G$u %*% new_G$A %*% t(new_G$v) 
new_B = new_B$u %*% new_B$A %*% t(new_B$v) 
newimg[1:nrow(img), 1:ncol(img), 1] = new_R
newimg[1:nrow(img), 1:ncol(img), 2] = new_G
newimg[1:nrow(img), 1:ncol(img), 3] = new_B
newimg = imagematrix(newimg, type="rgb")
plot(newimg)Reduce even more, from 150 down to 20, that is around 13%:
newdim = 20 
new_R = reduce(s_red, newdim) 
new_G = reduce(s_green, newdim) 
new_B = reduce(s_blue, newdim) dim(new_R$u)## [1] 150  20dim(new_R$v)## [1] 300  20dim(new_R$A)## [1] 20 20newimg = array(0, dim=c(nrow(img), ncol(img), 3)) 
new_R = new_R$u %*% new_R$A %*% t(new_R$v) 
new_G = new_G$u %*% new_G$A %*% t(new_G$v) 
new_B = new_B$u %*% new_B$A %*% t(new_B$v) 
newimg[1:nrow(img), 1:ncol(img), 1] = new_R
newimg[1:nrow(img), 1:ncol(img), 2] = new_G
newimg[1:nrow(img), 1:ncol(img), 3] = new_B
newimg = imagematrix(newimg, type="rgb")
plot(newimg)Conclusion:
We see that with only 13% - 20% of original dimension we are able to reproduce approximation of the data. Dimensionality reduction also removes noises and linear dependent elements by choosing only the most important singular values.Example with user - movie ratings:
This is a small example provided by the Stanford’s course MMDS,# user - movie ratings matrix: 
# columns represents movies, rows represents users 
xc = c(1, 1, 1, 0, 0,
       3, 3, 3, 0, 0,
       4, 4, 4, 0, 0,
       5, 5, 5, 0, 0,
       0, 2, 0, 4, 4,
       0, 0, 0, 5, 5,
       0, 1, 0, 2, 2)
x = matrix(xc, nrow=7, ncol=5, byrow=T)
# SVD 
s = svd(x)s$u ##             [,1]        [,2]        [,3]          [,4]          [,5]
## [1,] -0.13759913 -0.02361145 -0.01080847  3.054209e-01  4.467149e-01
## [2,] -0.41279738 -0.07083435 -0.03242542  8.345818e-01 -4.642400e-02
## [3,] -0.55039650 -0.09444581 -0.04323389 -1.701251e-01 -7.266700e-01
## [4,] -0.68799563 -0.11805726 -0.05404236 -4.257332e-01  5.198474e-01
## [5,] -0.15277509  0.59110096  0.65365084 -1.420305e-17 -4.556360e-17
## [6,] -0.07221651  0.73131186 -0.67820922  3.553473e-17 -1.169854e-16
## [7,] -0.07638754  0.29555048  0.32682542 -7.101524e-18 -6.441516e-17s$v ##             [,1]        [,2]       [,3]          [,4]          [,5]
## [1,] -0.56225841 -0.12664138 -0.4096675 -7.071068e-01  0.000000e+00
## [2,] -0.59285990  0.02877058  0.8047915 -4.081479e-16  5.384491e-17
## [3,] -0.56225841 -0.12664138 -0.4096675  7.071068e-01 -5.384491e-17
## [4,] -0.09013354  0.69537622 -0.0912571 -4.130810e-17  7.071068e-01
## [5,] -0.09013354  0.69537622 -0.0912571 -4.130810e-17 -7.071068e-01diag(s$d)##          [,1]     [,2]    [,3]         [,4]         [,5]
## [1,] 12.48101 0.000000 0.00000 0.000000e+00 0.000000e+00
## [2,]  0.00000 9.508614 0.00000 0.000000e+00 0.000000e+00
## [3,]  0.00000 0.000000 1.34556 0.000000e+00 0.000000e+00
## [4,]  0.00000 0.000000 0.00000 2.885057e-16 0.000000e+00
## [5,]  0.00000 0.000000 0.00000 0.000000e+00 7.362016e-33new_s = reduce(s, 2)
new_s$u ##             [,1]        [,2]
## [1,] -0.13759913 -0.02361145
## [2,] -0.41279738 -0.07083435
## [3,] -0.55039650 -0.09444581
## [4,] -0.68799563 -0.11805726
## [5,] -0.15277509  0.59110096
## [6,] -0.07221651  0.73131186
## [7,] -0.07638754  0.29555048new_s$v ##             [,1]        [,2]
## [1,] -0.56225841 -0.12664138
## [2,] -0.59285990  0.02877058
## [3,] -0.56225841 -0.12664138
## [4,] -0.09013354  0.69537622
## [5,] -0.09013354  0.69537622new_s$A ##          [,1]     [,2]
## [1,] 12.48101 0.000000
## [2,]  0.00000 9.508614new_x = new_s$u %*% new_s$A %*% t(new_s$v) x and new_x, element by element:x##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1    1    1    0    0
## [2,]    3    3    3    0    0
## [3,]    4    4    4    0    0
## [4,]    5    5    5    0    0
## [5,]    0    2    0    4    4
## [6,]    0    0    0    5    5
## [7,]    0    1    0    2    2new_x##            [,1]      [,2]       [,3]         [,4]         [,5]
## [1,]  0.9940420 1.0117044  0.9940420 -0.001327193 -0.001327193
## [2,]  2.9821261 3.0351133  2.9821261 -0.003981578 -0.003981578
## [3,]  3.9761681 4.0468178  3.9761681 -0.005308770 -0.005308770
## [4,]  4.9702101 5.0585222  4.9702101 -0.006635963 -0.006635963
## [5,]  0.3603133 1.2921647  0.3603133  4.080263014  4.080263014
## [6,] -0.3738507 0.7344294 -0.3738507  4.916721417  4.916721417
## [7,]  0.1801567 0.6460824  0.1801567  2.040131507  2.040131507norm(x-new_x, type="F") ## [1] 1.34556References:
-Stanford course’s Mining Massive Datasets, by Jure Leskovec, Anand Rajaraman, Jeff Ullman-Data Mining Algorithms In R/Dimensionality Reduction/Singular Value Decomposition wikibooks.
 
 
