Pivot - Transpose Vertical Data with repeated rows into Horizontal Data with one row per ID

Question

I have survey data that was exported as a Vertical dataframe. Meaning for everytime a person responded to 3 questions in the survey, their row would duplicate 3 times, except the content of the question and their answer. I am trying to transpose/pivot this data so that all 3 questions are displayed in a unique column so that their responses are also displayed in each column instead of another row, alongside their details like ID, Full Name, Location, etc...

Here is what it looks like currently:

    ID      Full Name   Location   Question                               Multiple Choice Question Answers 
   12345   John Smith     UK    1. It was easy to report my sickness.             Agree
   12345   John Smith     UK    2. I felt ready to return from Quarantine.        Neutral
   12345   John Smith     UK    3. I am satisfied with the adjustments made.      Disagree
    ..          ...          ...                   ...         ...   
   67891  Jane Smith      UK    1. It was easy to report my sickness.             Agree
   67891  Jane Smith      UK    2. I felt ready to return from Quarantine.        Agree
   67891  Jane Smith      UK    3. I am satisfied with the adjustments made.      Agree

and this is how I want it:

    ID      Full Name   Location  1. It was easy to report my sickness. 2. I was satisfied with the support I received.  3. I felt ready to return from Quarantine.
   12345   John Smith     UK         Agree                                  Neutral                                          Disagree
   67891   Jane Smith     UK         Agree                                  Agree                                            Disagree

Currently I'm trying to use this code to get my desired output but I can only get the IDs and Full Names to isolate without duplicating and the other columns just show up as individual rows.

column_indices1 = [2,3,4]
df5 = df4.pivot_table(index = ['ID', 'Full Name'], columns = df4.iloc[:, column_indices1], \
                      values = 'Multiple Choice Question Answer', \
                      fill_value = 0)

Baron Legendre · Accepted Answer · 2022-06-29T15:46:10.297

Concept

In this scenario, we should consider using:

pivot(): Pivot without aggregation that can handle non-numeric data.

Practice

Prepare data

data = {'ID':[12345,12345,12345,67891,67891,67891],
        'Full Name':['John Smith','John Smith','John Smith','Jane Smith','Jane Smith','Jane Smith'],
        'Location':['UK','UK','UK','UK','UK','UK'],
        'Question':['Q1','Q2','Q3','Q1','Q2','Q3'],
        'Answers':['Agree','Neutral','Disagree','Agree','Agree','Agree']}
df = pd.DataFrame(data=data)
df

Output

	ID	Full Name	Location	Question	Answers
0	12345	John Smith	UK	Q1	Agree
1	12345	John Smith	UK	Q2	Neutral
2	12345	John Smith	UK	Q3	Disagree
3	67891	Jane Smith	UK	Q1	Agree
4	67891	Jane Smith	UK	Q2	Agree
5	67891	Jane Smith	UK	Q3	Agree

Use pivot()

questionnaire = df.pivot(index=['ID','Full Name','Location'], columns='Question', values='Answers')
questionnaire

Output

adding reset_index() and rename_axis() to get the format you want

questionnaire = questionnaire.reset_index().rename_axis(None, axis=1)
questionnaire

Output

Hi, I got it to pivot like in the first step. I added a few more columns to the 'Index =' section as well as an additional value column for Free text answers. But when I try to use the second line you gave me, for 'reset_index() and rename_axis(), I get this error: **TypeError: Must pass list-like as 'names'.** — Scythor, Jun 30 '22 at 14:08

Pivot - Transpose Vertical Data with repeated rows into Horizontal Data with one row per ID

1 Answers1

Concept

Practice