0

I have a df like this

sale_id brand   Qty
1       Toyota  1
1       Toyota  2
2       Honda   1
2       Toyota  1
3       Lexus   3

Is there a function to convert it to:

sale_id Toyota Honda Lexus
1       3      0     0
2       1      1     0
3       0      0     1

So basically I´m creating columns equal to the sum of all quantities bought per car type by sale_id.

Scott Boston
  • 147,308
  • 15
  • 139
  • 187
EGM8686
  • 1,492
  • 1
  • 11
  • 22

3 Answers3

3

Try with

s = df.groupby(['sale_id','brand']).Qty.sum().unstack(fill_value=0)
Out[223]: 
brand    Honda  Lexus  Toyota
sale_id                      
1            0      0       3
2            1      0       1
3            0      3       0
BENY
  • 317,841
  • 20
  • 164
  • 234
3

Or pd.crosstab with aggregation function of sum:

pd.crosstab(df['sale_id'], df['brand'], df['Qty'], aggfunc='sum').fillna(0)

Output:

brand    Honda  Lexus  Toyota
sale_id                      
1          0.0    0.0     3.0
2          1.0    0.0     1.0
3          0.0    3.0     0.0
Scott Boston
  • 147,308
  • 15
  • 139
  • 187
2

We can use DataFrame.pivot_table

new_df = df.pivot_table(index='sale_id',
                        columns='brand',
                        values='Qty', 
                        fill_value=0,
                        aggfunc='sum')
print(new_df)

brand    Honda  Lexus  Toyota
sale_id                      
1            0      0       3
2            1      0       1
3            0      3       0
ansev
  • 30,322
  • 5
  • 17
  • 31