For example, I have:

Column A    Column B    Column C
   A_1        B_1         0
   A_1        B_2         1
   A_2        B_3         3
   A_2        B_5         2

I would like to get this:

       B_1   B_2  B_3  B_5
A_1     0     1   nan  nan
A_2    nan   nan   3    2

My idea is to get the unique values of Column A and Column B, build a new DataFrame from them, and fill in the blanks with two nested for loops. Is there a better way to do this in pandas? My method takes too long on a large DataFrame.

Stanley Gan

1 Answer

Option 1

df.set_index(['Column A','Column B'])['Column C'].unstack()

Output:

Column B  B_1  B_2  B_3  B_5
Column A                    
A_1       0.0  1.0  NaN  NaN
A_2       NaN  NaN  3.0  2.0
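
To reproduce Option 1 end to end, here is a minimal sketch that rebuilds the sample data from the question and reshapes it. The variable names `df` and `wide` are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Column A": ["A_1", "A_1", "A_2", "A_2"],
    "Column B": ["B_1", "B_2", "B_3", "B_5"],
    "Column C": [0, 1, 3, 2],
})

# Move both key columns into the index, then pivot the inner
# level ("Column B") out into the columns. Missing pairs become NaN.
wide = df.set_index(["Column A", "Column B"])["Column C"].unstack()
print(wide)
```

Because the missing combinations are filled with NaN, the integer values are shown as floats in the result.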

Option 2

pd.crosstab(df['Column A'],df['Column B'],df['Column C'],aggfunc='sum')

Option 3

df.pivot_table(values='Column C', index='Column A', columns='Column B', aggfunc='sum')
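
As a small sketch of what the 'sum' argument does: assuming the data contained a repeated (Column A, Column B) pair, pivot_table would combine the duplicates instead of failing. The extra (A_2, B_5, 10) row below is made up purely for illustration and is not part of the question's data.

```python
import pandas as pd

# Question's sample data plus one hypothetical duplicate pair (A_2, B_5).
df = pd.DataFrame({
    "Column A": ["A_1", "A_1", "A_2", "A_2", "A_2"],
    "Column B": ["B_1", "B_2", "B_3", "B_5", "B_5"],
    "Column C": [0, 1, 3, 2, 10],
})

# Duplicate (A_2, B_5) rows are aggregated with sum; pairs that
# never occur in the data are left as NaN.
wide = df.pivot_table(values="Column C", index="Column A",
                      columns="Column B", aggfunc="sum")
print(wide)
```

With data like the original example, where each pair occurs only once, the aggregation has nothing to combine and the result matches Option 1.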

Option 4

df.pivot(index='Column A', columns='Column B', values='Column C')
Scott Boston