How to acquire or generate test data for a recommender system

Question

I'm currently researching recommender systems and would like to know how other researchers acquire or generate test data to evaluate a system's performance.

score 8 · Answer 1 · answered Nov 02 '12 at 19:48

When I was working with recommender systems, I had the exact same problem. I liked the GroupLens dataset the most:

http://grouplens.org/node/12

There you can download ratings that users gave to movies.

I also described on my blog some other datasets I found while researching:

http://girlincomputerscience.blogspot.com.br/2010/12/datasets.html

Hope it helps!

score 7 · Answer 2 · answered Mar 12 '12 at 13:46

I don't know what field you're evaluating, but if it's movie recommendations, you could use the MovieLens data from GroupLens to start out with. (It seems like their site is temporarily down, but I'm sure it will be back up soon.)

They have three sets of data (100,000 ratings, 1 million, and 10 million), and these seem to be more or less the standard that everyone starts out with.
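As a rough illustration of how such a ratings file might be turned into test data, here is a minimal Python sketch. It assumes the tab-separated `user id, item id, rating, timestamp` layout used by the `u.data` file in the 100K set. The function names `parse_ratings` and `holdout_split`, the sample rows, and the hold-out fraction are illustrative choices of mine, not anything prescribed by GroupLens.

```python
import random
from collections import defaultdict


def parse_ratings(lines, sep="\t"):
    """Parse 'user<sep>item<sep>rating<sep>timestamp' lines into tuples."""
    ratings = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        user, item, rating, timestamp = line.split(sep)
        ratings.append((int(user), int(item), float(rating), int(timestamp)))
    return ratings


def holdout_split(ratings, test_fraction=0.2, seed=0):
    """Hold out a random fraction of each user's ratings as test data."""
    rng = random.Random(seed)
    by_user = defaultdict(list)
    for row in ratings:
        by_user[row[0]].append(row)

    train, test = [], []
    for user in sorted(by_user):
        rows = by_user[user][:]
        rng.shuffle(rows)
        # Always keep at least one rating per user in the training set.
        n_test = min(int(len(rows) * test_fraction), len(rows) - 1)
        test.extend(rows[:n_test])
        train.extend(rows[n_test:])
    return train, test


# Made-up sample rows in the assumed layout (not real MovieLens data).
sample = [
    "1\t10\t5\t881250949",
    "1\t11\t3\t881250950",
    "1\t12\t4\t881250951",
    "1\t13\t2\t881250952",
    "1\t14\t1\t881250953",
    "2\t10\t4\t881250954",
    "2\t11\t4\t881250955",
    "2\t12\t5\t881250956",
    "2\t13\t1\t881250957",
    "2\t14\t3\t881250958",
    "3\t10\t2\t881250959",
]

ratings = parse_ratings(sample)
train, test = holdout_split(ratings, test_fraction=0.4)
```

With a real file you would pass the open file object instead of `sample`, for example `parse_ratings(open("u.data"))`. The larger sets appear to use `::` as the separator in `ratings.dat`, in which case `sep="::"` should work, but check the README that ships with the download. A per-user random hold-out is only one option; splitting by timestamp is another common choice.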

Eyal


Awesome! Thanks for the info. What if people were looking for a data set that was item based rather than rating based? E.g., collaborative filtering vs. content filtering/item filtering/information retrieval. – Ullr Mar 16 '12 at 16:16
What do you mean? The GroupLens set can be used for collaborative filtering, too. – Eyal Apr 17 '12 at 13:41
