1

I have survey data in a pandas dataframe where each row is a complete survey response. It currently looks like this:

name Question 1 Question 2 Question 3
John agree disagree neither
Sally disagree neither agree

I want to flip it so the questions are in each row, and the results are tabulated across the rows, like this:

Question disagree neither agree
Question 1 1 0 1
Question 2 1 1 0
Question 3 0 1 1

The responses follow a strict 5-point likert scale, so they are all consistent.

I tried to loop through a list of columns with questions and summarize using value_counts(), but I was unable to transpose it into a new df.

dipps4
  • 11
  • 1

2 Answers2

2

Try pd.crosstab:

df = df[["Question 1", "Question 2", "Question 3"]].stack().droplevel(level=0)
print(pd.crosstab(df.index, df).rename_axis(index="Question", columns="").reset_index())

Prints:

     Question  agree  disagree  neither
0  Question 1      1         1        0
1  Question 2      0         1        1
2  Question 3      1         0        1
Andrej Kesely
  • 168,389
  • 15
  • 48
  • 91
1

You can use melt and pivot_table:

out = (df.filter(like='Question').melt(var_name='Question', value_name='Answer')
         .assign(Value=1).pivot_table(index='Question', columns='Answer',
                                      values='Value', aggfunc=max, fill_value=0))
print(out)

# Output
Answer      agree  disagree  neither
Question                            
Question 1      1         1        0
Question 2      0         1        1
Question 3      1         0        1

Another alternative with index manipulation:

out = (df.filter(like='Question').melt(var_name='Question', value_name='Answer')
         .assign(Value=1).set_index(['Question', 'Answer'])['Value']
         .unstack('Answer').fillna(0).astype(int))
Corralien
  • 109,409
  • 8
  • 28
  • 52