
I am working on creating an item-based collaborative recommendation engine. The data set available has a size of:

Number of users: approx. 300,000

Number of items: 525

The recommenderlab package in R requires a user-item rating matrix. I have a molten data table with columns: User_Code, Item_Code, Ratings

From this dataset I have to create a user-item rating matrix using the "acast" function in R. But given the size of the data, I get the error:

Error: Unable to allocate a vector of 250GB.
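For reference, the cast step described above presumably looks something like the sketch below (ratings_df is a hypothetical name for the molten table; the column names come from the question):

    # Hypothetical reconstruction of the failing step: acast() builds a dense
    # users x items matrix from the molten User_Code/Item_Code/Ratings table.
    library(reshape2)

    rating_matrix <- acast(ratings_df,
                           User_Code ~ Item_Code,
                           value.var = "Ratings")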

Is there a workaround for this step, or is increasing RAM the only option?

tushaR

1 Answer


Try the following:

1. Select only the users that have actual item ratings (i.e. they have rated/ranked some items), so that you operate only on genuinely useful data.
2. If the dataset from step 1 is still too large, select a random N users (e.g. 10,000 or 20,000) from it, together with their ratings.
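A minimal sketch of both steps, assuming the molten ratings sit in a data frame called ratings_df with the columns named in the question:

    # Step 1: keep only the rows that carry an actual rating.
    rated <- ratings_df[!is.na(ratings_df$Ratings), ]

    # Step 2: if that is still too large, sample N users and keep only their rows.
    set.seed(1)                                    # reproducible sample
    n_users       <- 20000                         # choose what fits in memory
    sampled_users <- sample(unique(rated$User_Code), n_users)
    rated_sample  <- rated[rated$User_Code %in% sampled_users, ]

    # Cast the reduced data as before.
    library(reshape2)
    rating_matrix <- acast(rated_sample,
                           User_Code ~ Item_Code,
                           value.var = "Ratings")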

  • Are there any criteria I can follow to sample the customers? All of the customers have rated some item or other, but very few must have rated similar items. – tushaR Jan 27 '17 at 12:07
  • Select a reasonable number (10,000–20,000) of customers with the highest counts of rated items. Generally you do not need to process all of your customers to get a working recommendation engine. – Артем Кустиков Jan 30 '17 at 11:11
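One way to express the last comment's suggestion, again with a hypothetical ratings_df: keep the N users who have rated the most items instead of a random sample.

    # Keep the N most active users (highest count of rated items).
    n_users      <- 20000
    rating_count <- table(ratings_df$User_Code)
    top_users    <- names(sort(rating_count, decreasing = TRUE))[seq_len(n_users)]
    ratings_top  <- ratings_df[ratings_df$User_Code %in% top_users, ]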