Pandas transformation to groupby join rows

Question

I have dataframe like this

   customer_id  product_id  sum
0      1            A        20
1      2            A        10
2      2            C        30
3      1            B        40

I want to transform it into a sparse dataframe where I will have customer data as rows and columns will be decoded products in order. Something like this:

   customer_id  Product_A_Sum  Product_B_Sum  Product_C_Sum
 0     1            20            40             0
 1     2            10             0             30

I have a brute force solution - something like this:

df_new= pd.DataFrame()
df_new['customer_id'] = df.customer_id.unique()
for product in range(len(list_products)):
    temp = df.groupby(['customer_id', product])['sum'].sum().reset_index().rename(columns = {'sum':('sum'+product)})
    df_new = df_new.merge(temp[['customer_id', 'sum'+product]], how='left', on = 'customer_id').fillna(0)

This code works but my list of products is large so it will not scale at all. Are there any pandas tricks that will allow me to do this easier?

Thank you in advance!

score 2 · Accepted Answer · answered Aug 04 '22 at 15:21

try this,

df = df.pivot_table(index='customer_id', columns='product_id', values='sum', fill_value=0)
df.columns = [f"Product_sum_{x}" for x in df.columns]
df = df.reset_index()

O/P:

    customer_id  Product_sum_A  Product_sum_B  Product_sum_C
0            1             20             40              0
1            2             10              0             30

score 1 · Answer 2 · answered Aug 04 '22 at 15:19

1

You should use pivot:

df.pivot(index='customer_id', columns='product_id', values='sum')

answered Aug 04 '22 at 15:19

gtomer

5,643
1
10
21

Pandas transformation to groupby join rows

2 Answers2