0

I would like to transpose all the rows for each ID into one row.

import pandas as pd
from functools import reduce

data1 = {'ID': [101, 101, 101, 102,102,103,103,103,103],
         'Name': ['John','John','John','Ray','Ray','Tracy','Tracy','Tracy','Tracy'],
         'Course': ['Math101', 'Eng101','His101','Math101','Sci101','Eng101','His101','Math101','Sci101'],
         'marks': [7, 5, 5, 6,5,8,6,4,9],
         'Nt': ['Pass', 'fail','fail','Pass','fail','Pass','Pass','fail','Pass']}
df = pd.DataFrame(data1)

df

enter image description here

my code

df1 = df.assign(key=df.groupby('ID').cumcount() + 1)

df2 = pd.crosstab(df1["ID"], df1["key"], df1["Course"], aggfunc='first').add_prefix("Course_").reset_index().rename_axis(None, axis=1)
df3 = pd.crosstab(df1["ID"], df1["key"], df1["marks"], aggfunc='first').add_prefix("marks_").reset_index().rename_axis(None, axis=1)
df4 = pd.crosstab(df1["ID"], df1["key"], df1["Nt"], aggfunc='first').add_prefix("Nt_").reset_index().rename_axis(None, axis=1)

data_frames =  [ df2, df3,df4]

df_merged = reduce(lambda  left,right: pd.merge(left,right,on=['ID'], how='outer'), data_frames)
df_merged 

I get this table enter image description here

i rearrange the column names to get the table in the format I want

report = df_merged[['ID','Course_1','marks_1','Nt_1','Course_2','marks_2','Nt_2','Course_3','marks_3','Nt_3','Course_4','marks_4','Nt_4']]
report

I want a similar table but I do want to name also here. I was not able to get the nameenter image description here

Its doable for 4, 5 columns but my dataset has millions of records, 5 columns and each record might have 15-25 rows - so transposed (into 25 rows * 5 columns = )125 columns ( like course_1 to course_125, marks1 to marks_125 )

Is there any easier solution to do this ? I would not want to create ~125 dfs and then rearrange the columns

Aj ml
  • 13
  • 1
  • 3
  • 1
    You want a pivot with flattening of the MultiIndex, check questions #10 and #11 in the duplicate – mozway Sep 08 '22 at 12:50

0 Answers0