I have a movie database that I need to populate with data to make the application easier to test and develop. There are tables for movie ratings and user accounts; the users rate the movies.
I've started writing a script to populate the database with fake, generic data, but I don't know how to randomize the ratings. For each movie I select a random number of users (100, 500, 1000, whatever), and for each of those users I pick a random rating from 1 through 10. But these ratings always end up with the same average, around 5.5 (the midpoint of 1 through 10), which means the distribution of ratings for any given movie is basically the same. This is not "realistic" at all: every movie with ratings generated like this will have the same average, no matter which users rate it or how many of them there are.
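To show what I mean, the approach boils down to something like the sketch below. This is not my actual script: the use of Python and the name `uniform_ratings` are assumptions made only for illustration.

```python
import random

def uniform_ratings(num_users, rng=random):
    """Return one rating per user, each value 1..10 equally likely."""
    return [rng.randint(1, 10) for _ in range(num_users)]

if __name__ == "__main__":
    # Whatever the number of users, the average drifts towards 5.5.
    for num_users in (100, 500, 1000):
        ratings = uniform_ratings(num_users)
        print(num_users, round(sum(ratings) / len(ratings), 2))
```

Because every score is equally likely, the average settles near the midpoint of the scale as the number of users grows, and the histogram comes out flat for every movie.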
I want movie A to have an average of 7, movie B an average of 5, movie C an average of 8, and so on. But I don't just want the average to differ from movie to movie. It would be nice to produce ratings like this (for a specific number of users): http://www.imdb.com/title/tt1046173/ratings or this: http://www.imdb.com/title/tt0486640/ratings
You know, something random that could produce two different variations like those above. I hit refresh and get the first graph, hit refresh again and get the second, hit it again and get something different or similar: something "random" and "realistic".
I'm also going to display graphs like these in my app, so it would look nice to have different distributions. But I have no idea how to accomplish this randomly with a simple script that generates everything.
How can I solve this? Or is it too much work to be worth it?
Maybe something simpler would do: pick a point between 1 and 10, then generate a normal distribution of ratings with its peak at that point. That would work for me.
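To make that last idea concrete, here is a rough sketch of what I have in mind, assuming Python. The names `peaked_ratings` and `random_movie_ratings`, the `spread` parameter, and the 0.8 to 2.5 range are all made up for illustration. It is not a true normal distribution, just bell-shaped weights over the ten possible scores.

```python
import math
import random

def peaked_ratings(num_users, peak, spread=1.5, rng=random):
    """Draw ratings 1..10 whose most likely value is `peak`.

    Each score is weighted by a bell curve centred on `peak`;
    `spread` (an illustrative parameter) controls how wide it is.
    """
    scores = list(range(1, 11))
    weights = [math.exp(-((s - peak) ** 2) / (2 * spread ** 2)) for s in scores]
    return rng.choices(scores, weights=weights, k=num_users)

def random_movie_ratings(num_users, rng=random):
    """Pick a random peak and spread per movie, then draw the ratings."""
    peak = rng.randint(1, 10)
    spread = rng.uniform(0.8, 2.5)  # arbitrary range, chosen only for variety
    return peak, peaked_ratings(num_users, peak, spread, rng)

if __name__ == "__main__":
    peak, ratings = random_movie_ratings(1000)
    print("peak:", peak, "average:", round(sum(ratings) / len(ratings), 2))
```

Since the scores are cut off at 1 and 10, the average will not match the peak exactly when the peak sits near either end of the scale, but each run should give a differently shaped histogram.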