Users of my application (it's a game actually) answer questions to get points. Questions are supplied by other users. Due to volume, I cannot check everything myself, so I decided to crowd-source the filtering process to the users (players). The rules are simple:
- each user is shown a question to rate as good/bad/unsure
- when a question is rated 5 times as "bad", it is removed from the pool
- when a question is rated 5 times as "good", it is removed from the pool and flagged to be played by other players who have not seen it
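To make the rules concrete, here is a rough sketch of how I picture the bookkeeping. The names (`Question`, `rate`, `can_play`, `THRESHOLD`) are made up for illustration and are not my actual code. I'm also assuming that an "unsure" rating still counts as having seen the question.

```python
from dataclasses import dataclass, field

THRESHOLD = 5  # ratings of one kind needed to settle a question
VERDICTS = ("good", "bad", "unsure")

@dataclass
class Question:
    qid: int
    good: int = 0
    bad: int = 0
    status: str = "pending"  # "pending", "playable" or "removed"
    seen_by: set = field(default_factory=set)

def rate(question, user_id, verdict):
    """Record one rating and return the question's new status."""
    if verdict not in VERDICTS:
        raise ValueError("verdict must be good, bad or unsure")
    if question.status != "pending":
        raise ValueError("question is no longer in the rating pool")
    if user_id in question.seen_by:
        raise ValueError("user has already rated this question")
    # Assumption: any rating, including "unsure", means the user saw it.
    question.seen_by.add(user_id)
    if verdict == "good":
        question.good += 1
    elif verdict == "bad":
        question.bad += 1
    if question.bad >= THRESHOLD:
        question.status = "removed"
    elif question.good >= THRESHOLD:
        question.status = "playable"
    return question.status

def can_play(question, user_id):
    """A user may play only approved questions they have not seen."""
    return question.status == "playable" and user_id not in question.seen_by
```

With this, a question approved by users 0 to 4 is playable by everyone except those five raters (plus anyone who rated it "unsure" along the way).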
If everyone could see everything, this would be easy. However, later in the game phase, users shouldn't get questions they have already seen. This means that each user should rate only some of the questions, and the ones they have not seen are exactly the ones they can play (answer) later in the game.
The total number of questions is much larger than the number of players, new questions are added daily, and new players join all the time, so I cannot just distribute the questions in advance.
I'm looking for some algorithm that would maximize the number of rated playable (i.e. unseen) questions for all players.
I tried to google, but I'm not even sure which terms to put in the search box, and using stuff like "distribution", "voting", "collaborative filtering" gives very interesting but unusable results.
The ratio of good to bad questions is 1:3, i.e. 25% of questions are rated good. The number of already submitted, unrated questions is over 10000. The number of active users with the privilege to vote is around 150.
I'm currently considering splitting the question pool and the user base into 2 parts. One part of the user base would check the questions for the other part and vice versa. Splitting the questions is easy (odd vs. even, for example). However, I'm still not sure how to divide the user base. I thought about using the odd/even position in the "top question checkers" list, but the positions on that list change daily as new questions are checked.
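For illustration, this is roughly what the two-way split would look like. It assumes every user has a numeric ID that never changes (say, a registration ID), which is only an assumption for this sketch and exactly the part I have not settled. The function names are hypothetical.

```python
def group_of(entity_id):
    """Assign a user or a question to group 0 or 1 by ID parity."""
    return entity_id % 2

def may_rate(user_id, question_id):
    # Users rate only questions from the opposite group.
    return group_of(user_id) != group_of(question_id)

def may_play(user_id, question_id):
    # Users play only questions from their own group, which they never rated.
    return group_of(user_id) == group_of(question_id)
```

Under this scheme no user can both rate and play the same question, but each user can play at most the approved questions in their own half of the pool.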
Update: I just asked a follow-up to this question - I need to periodically remove a fixed number of questions from the pool.