I'm currently researching recommender systems. How do other researchers acquire or generate test data to evaluate a system's performance?
2 Answers
8
When I was working with recommender systems I had exactly the same problem. The dataset I found most useful was GroupLens, which lets you download ratings given by users to movies.
Also, I described in my blog some datasets I found while researching:
http://girlincomputerscience.blogspot.com.br/2010/12/datasets.html
Hope it helps!

Renata Ghisloti
7
I don't know what domain you're evaluating, but if it's movie recommendations, you could start with the MovieLens data from GroupLens. (Their site seems to be temporarily down, but I'm sure it will be back up soon.)
They have three datasets, with 100,000, 1 million, and 10 million ratings (preferences), and they seem to be more or less the standard that everyone starts out with.
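As a rough illustration of how such a ratings file might be turned into test data, here is a minimal Python sketch. It assumes the tab-separated layout of the 100,000-rating set (user id, item id, rating, timestamp per line). The function names `parse_ratings`, `split_ratings`, and `global_mean_rmse` are made up for this example, the sample rows are invented rather than real MovieLens data, and the global-mean predictor is only a trivial baseline, not a recommender.

```python
import math
import random


def parse_ratings(lines):
    """Parse tab-separated 'user, item, rating, timestamp' lines into tuples."""
    ratings = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        user, item, rating, _timestamp = line.split("\t")
        ratings.append((int(user), int(item), float(rating)))
    return ratings


def split_ratings(ratings, test_fraction=0.2, seed=0):
    """Shuffle with a fixed seed and hold out a fraction of ratings for testing."""
    shuffled = list(ratings)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def global_mean_rmse(train, test):
    """RMSE on the test set when every rating is predicted as the training mean."""
    mean = sum(r for _, _, r in train) / len(train)
    squared_error = sum((r - mean) ** 2 for _, _, r in test)
    return math.sqrt(squared_error / len(test))


# Invented sample rows in the assumed layout (NOT real MovieLens data).
sample = [
    "1\t10\t4\t881250949",
    "1\t11\t3\t881250950",
    "2\t10\t5\t881250951",
    "2\t12\t2\t881250952",
    "3\t11\t4\t881250953",
    "3\t12\t1\t881250954",
    "4\t10\t3\t881250955",
    "4\t11\t5\t881250956",
    "5\t12\t4\t881250957",
    "5\t10\t2\t881250958",
]

ratings = parse_ratings(sample)
train, test = split_ratings(ratings)
print(len(train), len(test), global_mean_rmse(train, test))
```

With a downloaded file, the same parser could be fed an open file handle instead of the sample list (the file name depends on which set you download). Any real recommender should beat the baseline error on the held-out ratings.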

Eyal
Awesome! Thanks for the info. What if people were looking for a dataset that was item-based rather than rating-based? E.g., collaborative filtering vs. content filtering / item filtering / information retrieval. – Ullr Mar 16 '12 at 16:16
What do you mean? The GroupLens set can be used for collaborative filtering, too. – Eyal Apr 17 '12 at 13:41