Item Based Collaborative Filtering Recommender Systems in R

Datetime: 2016-08-23 01:13:16          Topic: R Program

This post continues my series on implementing recommendation engines. In my previous post about recommender systems in R, I explained how to implement user-based collaborative filtering. In this post, I explain a basic implementation of item-based collaborative filtering recommender systems in R.

Intuition:

Item based Collaborative Filtering:

Unlike the user-based collaborative filtering discussed previously, item-based collaborative filtering considers the set of items the user has already rated and computes their similarity to the target item. Once similar items are found, the rating for the new item is predicted as a weighted average of the user's ratings on those similar items, with the similarity scores as weights.

Let's understand this with an example. Consider a dataset of users' movie ratings, with one row per user and one column per movie. Let us build an algorithm to recommend movies to the user CHAN.

Implementing an item-based recommender system, like user-based collaborative filtering, requires two steps:

  • Calculating item similarities
  • Predicting the target item's rating for the target user

Step 1: Calculating item similarity:

This is a critical step: we calculate the similarity between co-rated items, using either cosine similarity or Pearson similarity. The output of this step is an item-to-item similarity matrix.

Code snippet:

# step 1: item-similarity calculation; co-rated items are considered and the
# similarity between two items is calculated using cosine similarity
library(lsa)
ratings = read.csv("Rating Matrix.csv")
x = ratings[,2:7]       # columns 2 to 7 hold the movie ratings
x[is.na(x)] = 0         # missing ratings are treated as 0
item_sim = cosine(as.matrix(x))
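
To make the similarity step concrete, here is a minimal base-R sketch of cosine similarity between the columns of a rating matrix, which is what cosine() from lsa is documented to compute for a matrix input. The toy matrix and the helper name cosine_cols are hypothetical and only for illustration. Because missing ratings are replaced by 0, as in the snippet above, every user contributes to each item vector, not only the users who rated both items.

```r
# Hypothetical toy ratings: rows are users, columns are items.
# 0 stands for "not rated", mirroring the x[is.na(x)] = 0 step.
toy = matrix(c(5, 4, 0,
               3, 0, 4,
               0, 2, 5,
               4, 5, 1),
             nrow = 4, byrow = TRUE,
             dimnames = list(paste0("u", 1:4), c("A", "B", "C")))

# Cosine similarity between every pair of columns:
# dot products divided by the product of the column norms.
cosine_cols = function(m) {
  dots  = crossprod(m)        # t(m) %*% m: item-by-item dot products
  norms = sqrt(diag(dots))    # Euclidean norm of each column
  dots / outer(norms, norms)
}

toy_sim = cosine_cols(toy)
print(round(toy_sim, 3))
```

The result is a symmetric item-by-item matrix with 1 on the diagonal, the same shape as item_sim.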

Step 2: Predicting the target item's rating for the target user, CHAN:

In this most important step, we predict ratings for the items the user has not rated, using the ratings he has given to other items and the similarity values calculated in the previous step. First we select the item to be predicted, in our case INCEPTION. We predict the rating for INCEPTION by calculating the weighted sum of CHAN's ratings of the movies similar to INCEPTION: for each movie CHAN has rated, we multiply its similarity score with respect to INCEPTION by the corresponding rating, and sum these products over all the rated movies. This sum is then divided by the sum of the similarity scores of the rated movies with respect to INCEPTION.
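
The weighted average described above can be sketched in a few lines of R. The similarity scores and ratings here are hypothetical numbers chosen for easy arithmetic; they are not taken from the article's dataset.

```r
# Hypothetical similarities between the target item and three rated items
sim_to_target = c(A = 0.9, B = 0.5, C = 0.1)
# Hypothetical ratings the user gave to those three items
user_ratings  = c(A = 4.5, B = 4.0, C = 1.0)

# Weighted average: sum(similarity * rating) / sum(similarity)
predicted = sum(sim_to_target * user_ratings) / sum(sim_to_target)
print(predicted)   # (4.05 + 2 + 0.1) / 1.5 = 4.1
```

Items that are more similar to the target pull the prediction towards their own rating, which is why the result sits close to the ratings of A and B rather than C.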

Recommending top N items:

Once ratings have been predicted for all the non-rated movies, we recommend the top N movies to CHAN.

Code for item-based collaborative filtering in R:

# data input
ratings = read.csv("~Rating Matrix.csv")

# step 1: item-similarity calculation
# co-rated items are considered and the similarity between two items
# is calculated using cosine similarity
library(lsa)
x = ratings[,2:7]
x[is.na(x)] = 0
item_sim = cosine(as.matrix(x))

# Recommending items for CHAN: since three movies are not rated,
# as a first step we have to predict a rating value for each movie.
# In CHAN's case we first have to predict values for Titanic, Inception, Matrix.
rec_itm_for_user = function(userno)
{
  # split the movies into those rated and those not rated by the user
  userRatings = ratings[userno,]
  non_rated_movies = list()
  rated_movies = list()
  for(i in 2:ncol(userRatings)){
    if(is.na(userRatings[,i])) {
      non_rated_movies = c(non_rated_movies,colnames(userRatings)[i])
    } else {
      rated_movies = c(rated_movies,colnames(userRatings)[i])
    }
  }
  non_rated_movies = unlist(non_rated_movies)
  rated_movies = unlist(rated_movies)

  # predict each non-rated movie as a similarity-weighted average of the user's ratings
  non_rated_pred_score = list()
  for(j in seq_along(non_rated_movies)){
    # similarities between the target movie and every movie
    df = item_sim[which(rownames(item_sim)==non_rated_movies[j]),]
    # denominator: sum of the similarities to the movies the user has rated
    temp_sum = 0
    for(i in seq_along(rated_movies)){
      temp_sum = temp_sum + df[which(names(df)==rated_movies[i])]
    }
    # numerator: similarity * rating, summed over the rated movies (NAs are dropped)
    weight_mat = df*ratings[userno,2:7]
    non_rated_pred_score = c(non_rated_pred_score,rowSums(weight_mat,na.rm=T)/temp_sum)
  }
  pred_rat_mat = as.data.frame(non_rated_pred_score)
  names(pred_rat_mat) = non_rated_movies

  # fill the predictions into the function's local copy of the user's row
  for(k in seq_len(ncol(pred_rat_mat))){
    ratings[userno,][which(names(ratings[userno,]) == names(pred_rat_mat)[k])] = pred_rat_mat[1,k]
  }
  return(ratings[userno,])
}

> rec_itm_for_user(7)
  Users  Titanic Batman Inception SuperMan Spiderman   matrix
7  CHAN 3.085298    4.5  2.940811        4         1 3.170034

Calling the function above gives predicted ratings for the movies CHAN had not previously rated: Titanic, Inception and Matrix. Now we can sort the predictions and recommend the top items.
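
The final sorting step might look like the sketch below. The three predicted values are copied from the output of rec_itm_for_user(7) shown above; the helper name top_n is hypothetical.

```r
# Predicted ratings for CHAN, copied from the function output
predicted = c(Titanic = 3.085298, Inception = 2.940811, matrix = 3.170034)

# Hypothetical helper: names of the n highest-scoring items
top_n = function(scores, n) {
  ranked = sort(scores, decreasing = TRUE)
  names(ranked)[seq_len(min(n, length(ranked)))]
}

print(top_n(predicted, 2))   # "matrix" "Titanic"
```

Capping n at the number of available predictions keeps the helper from returning NA names when fewer than N movies are unrated.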

That is all about collaborative filtering in R. In my upcoming posts I will talk about content-based recommender systems in R.




