I'm fairly new to pandas and I am struggling to get the appropriate mean in a multi-index series. The multi-index series currently looks like this;
idx = pd.MultiIndex.from_tuples([('foo', 111), ('foo', 222),
('bar', 111), ('bar', 222), ('bar', 333),
('baz', 111),
('qux', 111), ('qux', 222)],
names=['ID', 'Account Number'])
df = pd.DataFrame(index=idx, data={'Service 1': 18, 'Service 2': 22, 'Total cost': 40})
df = pd.concat([df], keys=['Cost'], axis=1)
Cost
Service 1 Service 2 Total cost
ID Account Number
foo 111 18 22 40
222 18 22 40
bar 111 18 22 40
222 18 22 40
333 18 22 40
baz 111 18 22 40
qux 111 18 22 40
222 18 22 40
The tables that all the data is being pulled in from applies cost to service 1 & 2 at Account number level but what it really needs to do is apply the cost at ID level and split the cost across account numbers, so what it should look like is;
Cost
Service 1 Service 2 Total cost
ID Account Number
foo 111 9.0 11.000000 20.000000
222 9.0 11.000000 20.000000
bar 111 6.0 7.333333 13.333333
222 6.0 7.333333 13.333333
333 6.0 7.333333 13.333333
baz 111 18.0 22.000000 40.000000
qux 111 9.0 11.000000 20.000000
222 9.0 11.000000 20.000000
I've tried df.groupby(['ID']).transform('mean')
but this gives me the original figures obviously and I'm not sure how to get where I need to.
Feel like I've been round the twist with this, so any help would be appreciated.