44

I'm working on a fairly simple survey system right now. The database schema is going to be simple: a Survey table, in a one-to-many relation with Question table, which is in a one-to-many relation with the Answer table and with the PossibleAnswers table.

Recently the customer realised she wants the ability to show certain questions only to people who gave one particular answer to some previous question (eg. Do you buy cigarettes? would be followed by What's your favourite cigarette brand?, there's no point of asking the second question to a non-smoker).

Now I started to wonder what would be the best way to implement this conditional questions in terms of my database schema? If question A has 2 possible answers: A and B, and question B should only appear to a user if the answer was A?

Edit: What I'm looking for is a way to store those information about requirements in a database. The handling of the data will be probably done on application side, as my SQL skills suck ;)

Michael Durrant
  • 93,410
  • 97
  • 333
  • 497
kender
  • 85,663
  • 26
  • 103
  • 145

4 Answers4

133

Survey Database Design

Last Update: 5/3/2015
Diagram and SQL files now available at https://github.com/durrantm/survey

enter image description here

If you use this (top) answer or any element, please add feedback on improvements !!!

This is a real classic, done by thousands. They always seems 'fairly simple' to start with but to be good it's actually pretty complex. To do this in Rails I would use the model shown in the attached diagram. I'm sure it seems way over complicated for some, but once you've built a few of these, over the years, you realize that most of the design decisions are very classic patterns, best addressed by a dynamic flexible data structure at the outset.
More details below:

Table details for key tables

answers

The answers table is critical as it captures the actual responses by users. You'll notice that answers links to question_options, not questions. This is intentional.

input_types

input_types are the types of questions. Each question can only be of 1 type, e.g. all radio dials, all text field(s), etc. Use additional questions for when there are (say) 5 radio-dials and 1 check box for an "include?" option or some such combination. Label the two questions in the users view as one but internally have two questions, one for the radio-dials, one for the check box. The checkbox will have a group of 1 in this case.

option_groups

option_groups and option_choices let you build 'common' groups. One example, in a real estate application there might be the question 'How old is the property?'. The answers might be desired in the ranges: 1-5 6-10 10-25 25-100 100+

Then, for example, if there is a question about the adjoining property age, then the survey will want to 'reuse' the above ranges, so that same option_group and options get used.

units_of_measure

units_of_measure is as it sounds. Whether it's inches, cups, pixels, bricks or whatever, you can define it once here.

FYI: Although generic in nature, one can create an application on top of this, and this schema is well-suited to the Ruby On Rails framework with conventions such as "id" for the primary key for each table. Also the relationships are all simple one_to_many's with no many_to_many or has_many throughs needed. I would probably add has_many :throughs and/or :delegates though to get things like survey_name from an individual answer easily without.multiple.chaining.

Community
  • 1
  • 1
Michael Durrant
  • 93,410
  • 97
  • 333
  • 497
  • 3
    wish I could upvote more than once :) – Wil Dec 14 '11 at 09:44
  • 1
    How do you handle conditional questions in this schema? I only see required flags, no flow control... perhaps I am missing something. – Georges Feb 13 '12 at 17:00
  • 1
    Great question, George! hmmm, like "ask question 3 if answer 2 is option c"? I assume you mean dependent on another questions answer then? For instance only ask about roof rack if vehicle is suv, or only ask for spouse name if user age > 18 ? So base it on the answer to another question? So what do you (or anyone) think is the right structure for that? Have two fields in the question table that are 'depends_on_answer_id' and 'depends_on_answer_value' ? – Michael Durrant Feb 18 '12 at 02:19
  • Sorry to resurrect this old question but why exactly the question_options table is needed? I don't understand the advantage of it over putting a FK to option_choices directly in the answer table. In this case, answer could directly be linked to question. It would also prevent multiple answers for the same question. – Marc Demierre May 24 '12 at 13:22
  • It decouples option_choices from question_options and allows for reuse of those options for different questions. – Michael Durrant Jun 17 '12 at 01:05
  • Does anyone have an answer for Georges' question then?? – Paranoid Android Sep 28 '12 at 17:01
  • @MichaelDurrant: Impressive schema! I'm trying to wrap my head around how the referential integrity works. For example, how does it prevent Survey_Section_ID in Questions from being changed to the Section ID of a section that belongs to a different organization? If this somehow happened, I think existing entries in the Answers table would be out of integrity with entries in user_survey_sections. – poke Oct 23 '12 at 15:47
  • In what software did you create this? – Lithu T.V Apr 18 '13 at 13:02
  • With mySQl Workbench which comes from the mySQL folks themselves and works great and is free. – Michael Durrant Apr 18 '13 at 17:53
  • If you were to allow for multiple instances of the same survey to be taken and stored by a single user, how would you tie answers to those instances? Add a fkey to `answers` -> `user_survey_sections(ID)`? – Ragamffn May 21 '13 at 23:22
  • 1
    @MichaelDurrant - Excellent design. Also, can you please tell me how to incorporate matrix survey kind of questions into your existing design? I mean questions like Row/Column combination (i.e. column contains options like "Strongly Agree | Agree | Disagree | Strongly Disagree" with multiple row questions. – user972255 Aug 02 '13 at 15:47
  • 1
    Its a great schema and I plan to use it, but I too cannot see the reason for questions_options. Why not just have a question_id FK and an options_choices_id FK in the answers table? Each user can then have multiple answers to any question. You can still reuse the choices across any number of questions. What am I missing? – fezfox Oct 27 '13 at 01:42
  • 1
    Try to iaagine questions where each answer has options to it, e.g. A question has 5 options, e.g. Q) Where do you shop / how often do you show there? Answer: a) Corner Store / Convenience Store 1. Once a day 2. Once a week 3. Once a month. b) Supermarket 1. Once a week 2. Once a month 3. Twice a month c) Warehouse Club 1. Once a month 2. Once every other month 3. Four times a year 4. Three times a year 5. Twice times a year 6. Once a year – Michael Durrant Oct 27 '13 at 02:37
  • 2
    @MichaelDurrant How would you use it for just a text response though? What option_choice would you give a question like "Describe your work environment" – datatest Dec 06 '13 at 14:27
  • 1
    'free text' option and then store the actual answer in the `answer` table in the `answer_text` field. – Michael Durrant Dec 06 '13 at 23:21
  • 2
    @MichaelDurrant, I'm currently analysing your design. A few questions keep ringing in my mind though that prevent me from completely understanding the schema. I whould be really helped if you could answer them: 1) is table question_options filled on answering or on creation of the question? 2) Why is option_group_id in questions optional? when every answer through question_options and option_choices needs an option group. Let's see if i can answer my other questions if those two are cleared. – Bas Goossen Jul 16 '14 at 13:30
  • 1
    Awesome work, Michael Durrant! Can you please look at the comment from user972255... I have the same question. How would you incorporate matrix questions into this schema? please reply... much appreciated! – Sunny Sharma Nov 20 '14 at 06:47
  • 1
    @BasGoossen – I think 1) On creation of the question, and 2) There are a couple typos in the schema image (of which that may be one…). For example, Michael's answer to fezfox's question seems to require an additional varchar field in the question_options table; notice the FK question_id is a varchar, it should be an int – that suggests to me he meant to add another field. However, it would be nice to have Michael comment definitively. – gangreen Nov 20 '14 at 17:15
  • 1
    @Sunny and user972255, you might be able to do Matrix questions by structuring a question similar to Michael’s answer to fezfox. See my comment above about a possible missing field in the question_options table though. – gangreen Nov 20 '14 at 17:17
  • 1
    can you put this somewhere as sql to download? – MR.GEWA Dec 02 '14 at 16:08
  • For the matrix questions - sorry it took me 18 months to reply... ;) well, one option might be to treat them as simply "sub-questions" and put an additional attribute "parent_id" on the questions table in order to give that "extra dimension" to the question. If the primary id stays unique the answers table may not need it, you'll just need to be sure to group/sort by parent_id when you're counting/displaying them together. – Michael Durrant May 03 '15 at 01:46
  • btw, in reviewing my design today along with tehvan's answer (+1) I would also consider adding a 'question_required_yn' boolean field to the questions table as both answer and questions should have that option. The only gotcha would be question not required but answer required. In real life I might add a constraint to only allow YY, YN and NN but not NY for Question|Answer required'ness'. – Michael Durrant May 03 '15 at 01:50
  • **Version 1.5 Added !!!** Include columns in questions for parent_id ( allows sub-questions, e.g. matrix) plus question_required_yn field. – Michael Durrant May 03 '15 at 11:04
  • **Create database script for mysql DOWNLOAD NOW AVAILABLE** for y'all at https://gist.github.com/anonymous/b9f5a6d73feb3e85fde4 – Michael Durrant May 03 '15 at 11:09
  • 1
    **Files are also now available on github** at https://github.com/durrantm/survey – Michael Durrant May 03 '15 at 11:21
  • 2
    **V1.6 Added.** Fixed the foreign key that should be int not varchar. Added 3 'dependent question/answer' fields to actually answer the original question... Updated diagram and github including create script. Added semantic versioning through directories. – Michael Durrant May 05 '15 at 11:08
  • 2
    v1.6 gist at https://gist.github.com/durrantm/1e618164fd4acf91e372 – Michael Durrant May 05 '15 at 11:14
  • The github repository includes the design source code `.mwb` file. It is a mysql workbench file. Mysql workbench is a (really) free application that lets you do such designs and make create and update scripts and much more. https://www.mysql.com/products/workbench/ – Michael Durrant May 05 '15 at 11:37
  • You could think of implementing a valuelabel table, so you integrate the codeplan into the databasesystem. e.g. for question 1: 1=like, 2=equal, 3=unlike. This is quite relevant for reporting purposes. – ruedi Jun 17 '15 at 06:42
  • Someone has been able to build an MVC project based in this DB Schema? – Patrick Jul 20 '15 at 16:15
  • Looking at the answer table what if there is a question for the text instead of option. How this will relate into the diagram - "question option id" ? Cause it's the question is not the option ?!?! Am I missing something here? – dcpartners Apr 06 '16 at 01:15
  • 1
    @MichaelDurrant In the `answers` table, what is the advantage of having `answer_numeric`, `answer_text`, and `answer_yn` instead of one "catch-all" `answer_text`? The application will have to do more work to figure out which field the value of the answer is in. Is there some benefit that makes the trade-off worth it? – John Apr 23 '16 at 17:34
  • 2
    I have not been able to wrap my head around `question_options` , `option_groups` and `Option_choices` tables. Please do you have a sample data for each of this tables that can help for better understanding of how to used them. – Emeka Mbah Jun 20 '16 at 13:11
  • @Michael Durrant as `answers` table refer to `question_options` and have column for hold a real answer likes `answer _text`, `answer_numeric`, `answer_yn` it means it copy values from `option_choices.option_choice_name` when needed right? – buncis Mar 31 '23 at 11:05
13

You could also think about complex rules, and have a string based condition field in your Questions table, accepting/parsing any of these:

  • A(1)=3
  • ( (A(1)=3) and (A(2)=4) )
  • A(3)>2
  • (A(3)=1) and (A(17)!=2) and C(1)

Where A(x)=y means "Answer of question x is y" and C(x) means the condition of question x (default is true)...

The questions have an order field, and you would go through them one-by one, skipping questions where the condition is FALSE.

This should allow surveys of any complexity you want, your GUI could automatically create these in "Simple mode" and allow for and "Advanced mode" where a user can enter the equations directly.

Osama Al-Maadeed
  • 5,654
  • 5
  • 28
  • 48
9

one way is to add a table 'question requirements' with fields:

  • question_id (link to the "which brand?" question)
  • required_question_id (link to the "do you smoke?" question)
  • required_answer_id (link to the "yes" answer)

In the application you check this table before you pose a certain question. With a seperate table, it's easy adding required answers (adding another row for the "sometimes" answer etc...)

tehvan
  • 10,189
  • 5
  • 27
  • 31
5

Personally, in this case, I would use the structure you described and use the database as a dumb storage mechanism. I'm fan of putting these complex and dependend constraints into the application layer.

I think the only way to enforce these constraints without building new tables for every question with foreign keys to others, is to use the T-SQL stuff or other vendor specific mechanisms to build database triggers to enforce these constraints.

At an application level you got so much more possibilities and it is easier to port, so I would prefer that option.

I hope this will help you in finding a strategy for your app.

TomHastjarjanto
  • 5,386
  • 1
  • 29
  • 41
  • my option would require only one table in total, not new tables for every question – tehvan Feb 12 '09 at 11:39
  • yes, I plan to put the logic into the application, I'm not a big fan of creating stored procedure for every little bit of logic (mostly because I suck at it). I'm just looking for a good way to store the requirements. – kender Feb 12 '09 at 11:41
  • tehvan, I think for more complex requirements that solution does lack some critical information. This field is actually about "Expert systems" which is much more suited for this. But now that I think of it, my solution won't be the best either, because it is not really dynamic. – TomHastjarjanto Feb 12 '09 at 11:46