1

I have survey data that was exported as a Vertical dataframe. Meaning for everytime a person responded to 3 questions in the survey, their row would duplicate 3 times, except the content of the question and their answer. I am trying to transpose/pivot this data so that all 3 questions are displayed in a unique column so that their responses are also displayed in each column instead of another row, alongside their details like ID, Full Name, Location, etc...

Here is what it looks like currently:

    ID      Full Name   Location   Question                               Multiple Choice Question Answers 
   12345   John Smith     UK    1. It was easy to report my sickness.             Agree
   12345   John Smith     UK    2. I felt ready to return from Quarantine.        Neutral
   12345   John Smith     UK    3. I am satisfied with the adjustments made.      Disagree
    ..          ...          ...                   ...         ...   
   67891  Jane Smith      UK    1. It was easy to report my sickness.             Agree
   67891  Jane Smith      UK    2. I felt ready to return from Quarantine.        Agree
   67891  Jane Smith      UK    3. I am satisfied with the adjustments made.      Agree      

and this is how I want it:

    ID      Full Name   Location  1. It was easy to report my sickness. 2. I was satisfied with the support I received.  3. I felt ready to return from Quarantine.
   12345   John Smith     UK         Agree                                  Neutral                                          Disagree
   67891   Jane Smith     UK         Agree                                  Agree                                            Disagree

Currently I'm trying to use this code to get my desired output but I can only get the IDs and Full Names to isolate without duplicating and the other columns just show up as individual rows.

column_indices1 = [2,3,4]
df5 = df4.pivot_table(index = ['ID', 'Full Name'], columns = df4.iloc[:, column_indices1], \
                      values = 'Multiple Choice Question Answer', \
                      fill_value = 0)
Scythor
  • 99
  • 6

1 Answers1

1

Concept

In this scenario, we should consider using:

pivot(): Pivot without aggregation that can handle non-numeric data.







Practice

Prepare data

data = {'ID':[12345,12345,12345,67891,67891,67891],
        'Full Name':['John Smith','John Smith','John Smith','Jane Smith','Jane Smith','Jane Smith'],
        'Location':['UK','UK','UK','UK','UK','UK'],
        'Question':['Q1','Q2','Q3','Q1','Q2','Q3'],
        'Answers':['Agree','Neutral','Disagree','Agree','Agree','Agree']}
df = pd.DataFrame(data=data)
df

Output
ID Full Name Location Question Answers
0 12345 John Smith UK Q1 Agree
1 12345 John Smith UK Q2 Neutral
2 12345 John Smith UK Q3 Disagree
3 67891 Jane Smith UK Q1 Agree
4 67891 Jane Smith UK Q2 Agree
5 67891 Jane Smith UK Q3 Agree




Use pivot()

questionnaire = df.pivot(index=['ID','Full Name','Location'], columns='Question', values='Answers')
questionnaire

Output enter image description here





adding reset_index() and rename_axis() to get the format you want

questionnaire = questionnaire.reset_index().rename_axis(None, axis=1)
questionnaire

Output enter image description here

Baron Legendre
  • 2,053
  • 3
  • 5
  • 22
  • Hi, I got it to pivot like in the first step. I added a few more columns to the 'Index =' section as well as an additional value column for Free text answers. But when I try to use the second line you gave me, for 'reset_index() and rename_axis(), I get this error: **TypeError: Must pass list-like as 'names'.** – Scythor Jun 30 '22 at 14:08