Python pandas: calculate share of total after groupby

Question

I want to group a dataset like the one below by postal code and calculate the share of completed orders that each shipping method has per postal code. I've read in a CSV file and tried the code below, but I realized I need a MultiIndex for that, and since I have a lot of different postal codes I'm not sure how to go about it.

postalcode	shipping_method	completed_orders
12345	post1	1
12345	post2	3
12345	post3	2
11123	post1	1
11123	post2	2

import numpy as np
import pandas as pd

shipping_data = pd.read_csv("shipping_per_postalcode.csv")

shareof = lambda x: x/x.sum()
result = shipping_data['amount_users_completed'].groupby(level=['postalcode', 'shipping_option']).transform(sumto)
print(result)

score 1 · Answer 1 · answered May 03 '21 at 14:27

Like this?

# Divide each row by the total for its postalcode; transform('sum') keeps the original row index
result = df['completed_orders'] / df.groupby(['postalcode'])['completed_orders'].transform('sum')

# Out[43]:
# 0    0.166667
# 1    0.500000
# 2    0.333333
# 3    0.333333
# 4    0.666667
# Name: completed_orders, dtype: float64
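
For a self-contained check, here is a minimal sketch that rebuilds the sample rows from the question in memory (instead of reading shipping_per_postalcode.csv) and stores the result as a new column. The column name `share` and the variable `totals` are assumptions made for this example, not names from the question.

```python
import pandas as pd

# Sample rows copied from the question; built in memory so no CSV file is needed.
df = pd.DataFrame({
    "postalcode": [12345, 12345, 12345, 11123, 11123],
    "shipping_method": ["post1", "post2", "post3", "post1", "post2"],
    "completed_orders": [1, 3, 2, 1, 2],
})

# transform("sum") returns the per-postalcode total aligned to every original row,
# so the division below is element-wise and no MultiIndex is required.
totals = df.groupby("postalcode")["completed_orders"].transform("sum")

# "share" is a hypothetical column name chosen for this example.
df["share"] = df["completed_orders"] / totals

print(df)
```

Because the totals are aligned to the original rows, the shares within each postalcode add up to 1.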

answered May 03 '21 at 14:27

Andreas


score 1 · Accepted Answer · answered May 03 '21 at 14:47

You may need an additional groupby to get the percentage contribution:

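# Total completed orders per (postalcode, shipping_method) pair
df_agg = df_1.groupby(['postalcode', 'shipping_method'])['completed_orders'].sum()

# Percentage within each postalcode (index level 0).
# group_keys=False keeps the two-level index; newer pandas versions would otherwise prepend the group key.
df_agg.groupby(level=0, group_keys=False).apply(lambda x: 100 * x / x.sum())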

Source: Pandas percentage of total with groupby
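
A rough end-to-end sketch of this two-step approach, using `transform` to broadcast the per-postalcode totals and then flattening the result back into an ordinary table with a named column. The sample rows come from the question; the names `totals`, `share_pct` and `out` are made up for illustration.

```python
import pandas as pd

# Sample rows copied from the question.
df_1 = pd.DataFrame({
    "postalcode": [12345, 12345, 12345, 11123, 11123],
    "shipping_method": ["post1", "post2", "post3", "post1", "post2"],
    "completed_orders": [1, 3, 2, 1, 2],
})

# Step 1: sum per (postalcode, shipping_method); the result is a Series
# indexed by a two-level MultiIndex.
df_agg = df_1.groupby(["postalcode", "shipping_method"])["completed_orders"].sum()

# Step 2: total per postalcode, aligned to df_agg's index.
totals = df_agg.groupby(level="postalcode").transform("sum")

# Name the resulting Series explicitly, then turn the index levels into columns.
# "share_pct" is a hypothetical name chosen for this example.
out = (100 * df_agg / totals).rename("share_pct").reset_index()

print(out)
```

Renaming before `reset_index()` gives the percentage column an explicit header instead of reusing `completed_orders`.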

answered May 03 '21 at 14:47

Praveenrajan27


Wow, I really complicated it for myself. Thanks a lot! – Sevgi Camuz May 03 '21 at 15:38
you're welcome @SevgiCamuz ! Please upvote the answer if you found it useful :) – Praveenrajan27 May 04 '21 at 06:17
Def! :) Do you know why the column name for the share of completed orders disappears? – Sevgi Camuz May 04 '21 at 07:45
