
I have been analysing survey results from both Qualtrics and Google Forms with pandas.

Some of the questions are of the format:


What do you like about cake? (Select as many as you need to.)

  • it's delicious
  • icing
  • bright colours
  • everything

Both systems produce a column that looks like:

| cake    | ramen |
|---------|-------|
| 1, 3, 4 | love  |
| 1       | hate  |
| 3, 4    | love  |

and so on. Both systems generate bar charts of the responses automatically, but they are hard to work with.

I've done it in the past by breaking them into extra columns, or just processing everything on the fly and building a temporary dataframe for a specific graph.

Is there a more elegant way to handle columns like this? In particular, I want to draw stacked bar charts of cake feelings, broken down by how respondents feel about ramen (for example).

cs95
Ben
  • Edited the title because your question pertains to multiple values in a column; this is more searchable on Google than "multi-answer question responses". – cs95 Jan 29 '20 at 05:20
  • Qualtrics has an option to retrieve the data in separate columns/fields, one for each choice option. – T. Gibbons Jan 29 '20 at 14:20

1 Answer


Most solutions to similar problems require creating a new dataframe. For example, see "Pandas column of lists, create a row for each list element".
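As a rough sketch of that new-dataframe approach, assuming the survey export stores the answers as comma-separated strings like the table in the question, the column can be split and exploded into one row per selected option. The data and the names `long_df` and `counts` are made up for illustration:

```python
import pandas as pd

# Hypothetical data mirroring the table in the question
df = pd.DataFrame({
    "cake": ["1, 3, 4", "1", "3, 4"],
    "ramen": ["love", "hate", "love"],
})

# Split each string into a list, then give every list element its own row
long_df = (
    df.assign(cake=df["cake"].str.split(","))
      .explode("cake")
      .reset_index(drop=True)
)
# The pieces are still strings with stray spaces; clean them and convert to int
long_df["cake"] = long_df["cake"].str.strip().astype(int)

# Count each cake option per ramen answer (rows: cake option, columns: ramen answer)
counts = pd.crosstab(long_df["cake"], long_df["ramen"])
```

With matplotlib installed, `counts.plot(kind="bar", stacked=True)` should then draw cake options as bars stacked by ramen answer.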

If you don't want to do that, just unpack the lists. A helper function is needed to deal with uneven list depth, i.e. cells that hold a single value instead of a list:

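# wrap a bare value in a list so every cell can be iterated the same way
tolist = lambda a: a if isinstance(a, list) else [a]
# flatten the whole column into a single list of answers
[a for b in df['cake'].values for a in tolist(b)]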

[1, 3, 4, 2, 3, 4]
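Once flattened, the answers can be tallied without building a second dataframe. A minimal sketch, assuming a column that mixes lists and bare values; the data is invented so that the flattened list matches the output above, and `to_list`, `flat` and `tally` are illustrative names:

```python
from collections import Counter

import pandas as pd

# Hypothetical column mixing lists and a bare scalar (the "uneven depth" case)
df = pd.DataFrame({"cake": [[1, 3, 4], 2, [3, 4]]})

def to_list(value):
    """Wrap a bare value in a list; leave lists unchanged."""
    return value if isinstance(value, list) else [value]

# Flatten the column, then count how often each option was selected
flat = [item for cell in df["cake"] for item in to_list(cell)]
tally = Counter(flat)
```

`pd.Series(tally).sort_index()` turns the tally into a Series that can be plotted directly.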

Poe Dator