0

I have a dataframe like the following:

import pandas as pd
df = pd.DataFrame([['1', 'Consultant'], ['1', 'Lead Developer'], ['1', 'Architect'],
                   ['2', 'Developer'],
                   ['3', 'Developer'], ['3', 'Data Engineer']], columns=['ID', 'Role'])
      ID            Role
    0  1      Consultant
    1  1  Lead Developer
    2  1       Architect
    3  2       Developer
    4  3       Developer
    5  3       Architect

I want this end result:

  ID       Role1           Role2      Role3
0  1  Consultant  Lead Developer  Architect
1  2   Developer            None       None
2  3   Developer       Architect       None

The number of Role columns should be equal with the maximum amount of Roles an ID has in the df (in this case ID 1 has 3 roles, that is the max column number). All other records should contain None (or np.NaN) if they have less roles. I tried pivoting:

pivoted = df.pivot(index='ID', columns=[1,2,3], values='Role')

But I do not know how can I let pandas know that it should generate the column names the way I want.

Looz
  • 377
  • 2
  • 14

0 Answers0