3

I have an application (More likely a quiz app) where i have saved all my 1000 quizzes in MySQL database, I want to retrieve a random question from this table when a user request one, I can easily do it using the RAND() function in MySQL.. my problem is , I don't want to give the same question two or more times to a user, how can i keep a record of retrieved questions? Do I have to create tables for each and every users? won't that increase the load time?? please help me, any help would be a big favor ..

-regards

  • [Don't use RAND() to fetch a random row](http://stackoverflow.com/questions/4644600/order-by-rand-gets-trouble-for-large-posts). [Use something faster instead](http://stackoverflow.com/questions/211329/quick-selection-of-a-random-row-from-a-large-table-in-mysql). – DCoder Dec 23 '12 at 10:48

3 Answers3

3

If you want it for a short time, use the user's $_SESSION for that.

If you need the long term ( say tomorrow, not to ask the same questions) - you'll have to create additional table for usersToQuestions, where you'll store the user id and the questions the user had been already asked.

Retrieving a question in both cases would require a simple IN condition:

SELECT * FROM questions 
WHERE id not IN  ('implode(",", $_SESSION["asked"])')

SELECT * FROM questions 
WHERE id not IN  ( 
    SELECT question_id FROM questions2users WHERE userid = 123
)
Alex
  • 11,479
  • 6
  • 28
  • 50
  • Thanks a lot for the fast reply man, I got the idea, but I'm little confused because I need it to be long term, Let's say I have created a table usersToQuestions, do i have to create separate columns for each and every questions? that means 1001 columns?? –  Dec 23 '12 at 10:45
  • Nope. You create two columns `userid` and `questionid`. For each questions you give to the user - insert a row in this table. – Alex Dec 23 '12 at 10:46
  • Oh.. this is cool..:) but still.. how can I check wether a question has already served to a user or not, if there are too many userId's recorded in several rows.. :/ –  Dec 23 '12 at 10:51
  • Oh.. I'm a real newbie here man, please, any code and help would be much appreciated.. –  Dec 23 '12 at 10:56
  • 1
    Read out loudly the second query. It pretty much should make sense in simple English. It will check automatically the question wasn't used by the user yet. Generally, to check if user 777 had seen question 555 or not you'd do: `select count(*) from questions2users where user = 777 and question = 555` and then check if the return value is 0 or 1 with `mysql_result(mysql_query(), 0, 0) == 1 )` – Alex Dec 23 '12 at 10:57
3

my problem is , I don't want to give the same question two or more times to a user, how can i keep a record of retrieved questions? Do I have to create tables for each and every users? won't that increase the load time?

Yes, but possibly not so much.

You keep a single extra table with userId, questionId and insert there the questions already asked to the various users.

When you ask question 123 to user 456, you run a single INSERT

INSERT INTO askedQuestions (userId, questionId) VALUES (456, 123);

Then you extract questions from questions with a LEFT JOIN

SELECT questions.* FROM questions
    LEFT JOIN askedQuestions ON (questions.id = askedQuestions.questionId AND askedQuestions.userId = {$_SESSION['userId']} )
    WHERE askedQuestions.userId IS NULL
    ORDER BY RAND() LIMIT 1;

if you keep askedQuestions indexed on (userId, questionId), joining will be very efficient.

Notes on RAND()

Selecting on a table like this should not done with ORDER BY RAND(), which will retrieve all the rows in the table before outputting one of them. Normally you would choose a questionId at random, and select the question with that questionId, and that would be waaaay faster. But here, you have no guarantee that the question has not been already asked to that user, and the faster query might fail.

When most questions are still free to ask, you can use

WHERE questions.questionId IN ( RAND(N), RAND(N), RAND(N), ... )
  AND askedQuestions.userId IS NULL LIMIT 1

where N is the number of questions. Chances are that at least one of the random numbers you extract will still be free. The IN will decrease performances, and you will have to strike a balance with the number of RANDs. When questions are almost all asked, chances of a match decrease, and your query might return nothing even with many RANDs (also because RANDs will start yielding duplicate IDs, in what is known as the Birthday Paradox).

One way to achieve the best of both worlds could be to fix a maximum number of attempts, say, three (or better still, based on the number of questions left over).

For X times you generate (in PHP) a set of Y random ids betweeen 1 and 1000, and try to retrieve (userId, questionId) from askedQuestions. The table is thin and indexed, so this is really fast. If you fail, then the extracted questionId is random and free, and you can run

SELECT * FROM questions WHERE id = {$tuple['questionId']};

which is also very fast. If you succeed X times, i.e., for X times, all Y random questionIds are registered as being already asked, then you run the full query. Most users will be served almost instantly (two very quick queries), and only a few really dedicated users will require more processing. You might want to set some kind of alerting to warn you of users running out of questions.

LSerni
  • 55,617
  • 10
  • 65
  • 107
0

One solution is to add an ID column in the question table and when you serve it to a user you check that ID with the list of questions that you served the user.

You can use in memory data structure like List to keep track of the questions that are served to a particular user. This way, you only need array of Lists instead of tables to get the job done.

Abraham
  • 603
  • 7
  • 19