I have dataframe like this
customer_id product_id sum
0 1 A 20
1 2 A 10
2 2 C 30
3 1 B 40
I want to transform it into a sparse dataframe where I will have customer data as rows and columns will be decoded products in order. Something like this:
customer_id Product_A_Sum Product_B_Sum Product_C_Sum
0 1 20 40 0
1 2 10 0 30
I have a brute force solution - something like this:
df_new= pd.DataFrame()
df_new['customer_id'] = df.customer_id.unique()
for product in range(len(list_products)):
temp = df.groupby(['customer_id', product])['sum'].sum().reset_index().rename(columns = {'sum':('sum'+product)})
df_new = df_new.merge(temp[['customer_id', 'sum'+product]], how='left', on = 'customer_id').fillna(0)
This code works but my list of products is large so it will not scale at all. Are there any pandas tricks that will allow me to do this easier?
Thank you in advance!