I have a dataframe like the following:
import pandas as pd
df = pd.DataFrame([['1', 'Consultant'], ['1', 'Lead Developer'], ['1', 'Architect'],
['2', 'Developer'],
['3', 'Developer'], ['3', 'Data Engineer']], columns=['ID', 'Role'])
ID Role
0 1 Consultant
1 1 Lead Developer
2 1 Architect
3 2 Developer
4 3 Developer
5 3 Architect
I want this end result:
ID Role1 Role2 Role3
0 1 Consultant Lead Developer Architect
1 2 Developer None None
2 3 Developer Architect None
The number of Role columns should be equal with the maximum amount of Roles an ID has in the df (in this case ID 1 has 3 roles, that is the max column number). All other records should contain None (or np.NaN) if they have less roles. I tried pivoting:
pivoted = df.pivot(index='ID', columns=[1,2,3], values='Role')
But I do not know how can I let pandas know that it should generate the column names the way I want.