
I have a table of 400 questions and a table of users (potentially 1 million), each of whom belongs to a user group. Every user must answer all the questions. The users are grouped together for purposes of analysing them against one another, and the groups can be of any size.

How should I record each user's answers in a way that I can analyse their answers against their group?

If I simply duplicate the questions for every user, this becomes a big, unwieldy table: the 'answers' table would be ~400,000 rows with only 1,000 users. If I give each question an ID, then maybe something cleverer can be done?

(I'm not sure if this is the correct Stack forum for such a theoretical question.)

Fearghal
  • No, this is not the correct Stack forum to ask these kinds of questions. Also, as much as I would like to help you, I see no effort in your post, or at least no attempt, to solve the problem yourself. I don't think Stack is meant to work this way. – Craftein Feb 07 '14 at 14:11
  • Thanks. It is not so much a question to solve as a question of best practice that I do not know the answer to. – Fearghal Feb 07 '14 at 15:45

1 Answer


I would have separate tables for Users, Groups, Questions and Answers.

You may want to consider putting a GroupID in the Answers table as well: besides helping to enforce referential integrity, it lets you join directly from an answer to its group, saving you a join and helping indexing.

Typically you can model the answers table in two ways: either one answer per row, or a very wide table with a column per question per user. Personally, as you're talking about 400 questions, I would not even consider the second option - the first also gives you more flexibility should you decide to change the questions.
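
A minimal sketch of the one-row-per-answer layout, assuming MySQL/InnoDB (table and column names are illustrative rather than taken from the question; the Groups table is called UserGroups here only because GROUPS is a reserved word in newer MySQL versions):

    CREATE TABLE UserGroups (
        GroupID INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
        Name    VARCHAR(100) NOT NULL
    );

    CREATE TABLE Users (
        UserID  INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
        GroupID INT UNSIGNED NOT NULL,
        FOREIGN KEY (GroupID) REFERENCES UserGroups (GroupID)
    );

    CREATE TABLE Questions (
        QuestionID   SMALLINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
        QuestionText VARCHAR(500) NOT NULL
    );

    -- One row per (user, question). GroupID is a denormalised copy of the
    -- user's group, as suggested above, to avoid an extra join when
    -- analysing answers by group.
    CREATE TABLE Answers (
        UserID     INT UNSIGNED NOT NULL,
        QuestionID SMALLINT UNSIGNED NOT NULL,
        GroupID    INT UNSIGNED NOT NULL,
        Answer     TINYINT NOT NULL,  -- e.g. a 1-5 rating; use VARCHAR/TEXT for free-text answers
        PRIMARY KEY (UserID, QuestionID),
        FOREIGN KEY (UserID)     REFERENCES Users (UserID),
        FOREIGN KEY (QuestionID) REFERENCES Questions (QuestionID),
        FOREIGN KEY (GroupID)    REFERENCES UserGroups (GroupID)
    );

With 400 questions and 1 million users this table would eventually hold around 400 million rows, and the composite primary key on (UserID, QuestionID) doubles as the natural lookup path for a single user's answers.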

You may want to look at using a reporting database to ensure that your performance is acceptable.

For the volumes of answers you're looking at, you'll need to make sure your database is properly indexed.
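
For example, against the sketch above, a secondary index plus a typical per-group analysis query might look like the following (again purely illustrative, and assuming numeric answers; on a reporting copy of the database this is the kind of query you would run there rather than against the live system):

    -- Secondary index to support filtering and grouping answers by group and question
    CREATE INDEX idx_answers_group_question ON Answers (GroupID, QuestionID);

    -- Compare one user's answers with the average answer of their group,
    -- question by question (UserID 42 is a made-up example)
    SELECT a.QuestionID,
           a.Answer      AS user_answer,
           AVG(g.Answer) AS group_average
    FROM   Answers a
    JOIN   Answers g
           ON  g.GroupID    = a.GroupID
           AND g.QuestionID = a.QuestionID
    WHERE  a.UserID = 42
    GROUP  BY a.QuestionID, a.Answer;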

Liath
  • What would the columns in the answers table be? – Fearghal Feb 07 '14 at 13:51
  • I thought of the 'column per question per user' solution, but you are right, it is dodgy. Your first solution is the same as the possible solution in my initial question: duplicate the questions for every user (except using the question ID) and add an answer column? Don't forget it's the same set of questions for every user. Users are grouped, but every user answers all the questions; they don't answer as a group. – Fearghal Feb 07 '14 at 13:55
  • The one-row-per-answer 'answers' table would be very large even with only 1,000 users. I am using MySQL - is 4 million rows for 10,000 users acceptable? – Fearghal Feb 07 '14 at 14:03
  • If the alternative is a 400-column table, yes, it is acceptable! A well-tuned SQL database shouldn't have too much trouble with that sort of row count. However, if you're planning on doing big number crunching then a reporting database may not be a bad idea. – Liath Feb 07 '14 at 14:09
  • A reporting database - what is this? I am using MySQL. You are probably right. I have never 'indexed' a table before; I've heard it makes querying faster. How do I do this? Is it simply having an index column and making it the primary index? – Fearghal Feb 07 '14 at 14:12
  • @Fearghal - just to throw it out there, does https://www.surveymonkey.com/ do what you need? – Liath Feb 07 '14 at 14:17
  • Hey Liath, thanks for all your responses; you have been a super help. The example I gave you is not the start and end of the coding requirement, so an off-the-shelf solution such as SurveyMonkey will not help. – Fearghal Feb 07 '14 at 15:46
  • @Fearghal I assumed that - just thought I'd try and save you some time! – Liath Feb 07 '14 at 15:48
  • Also, I think I'll break the answers table into multiple answers tables and have a pointer from each user in the user table to where their answers can be found. – Fearghal Feb 07 '14 at 15:49
  • Personally I don't like that idea if it's just to reduce row numbers... you'll get into conditional joins. I wouldn't be too scared by the row count. I'm always nervous when I try to optimise a schema before there's a problem – Liath Feb 07 '14 at 15:55
  • Really? Yeah, OK, I'll keep it simple and deal with the real issues as they arise. Thanks. – Fearghal Feb 07 '14 at 16:59