Groupby with weight

Question

Given the following dataframe:

import pandas as pd
d=pd.DataFrame({'Age':[18,20,20,56,56],'Race':['A','A','A','B','B'],'Response':[3,2,5,6,2],'Weight':[0.5,0.5,0.5,1.2,1.2]})
d
    Age     Race    Response    Weight
0   18      A       3           0.5
1   20      A       2           0.5
2   20      A       5           0.5
3   56      B       6           1.2
4   56      B       2           1.2

I know that I can apply a group-by to get the count by age and race like this:

d.groupby(['Age','Race'])['Response'].count()
Age  Race
18   A       1
20   A       2
56   B       2
Name: Response, dtype: int64

But I'd like to use the "Weight" column to weight the cases such that the first 3 rows will count as 0.5 instead of 1 each and the last two will count as 1.2. So, if grouping by age and race, I should have the following:

Age  Race
18   A       0.5
20   A       1
56   B       2.4
Name: Response, dtype: int64

This is similar to using the "Weight Cases" option in SPSS. I know it's possible in R and I've seen a promising library in Python (though the current build is failing) here.

And PySal (not sure if it's applicable here)

...but I'm wondering if it can just be done somehow in the group-by.

score 3 · Accepted Answer · answered Jul 17 '17 at 00:25

3

If I understand correctly, you're just looking for .sum() with the weights.

d.groupby(['Age', 'Race']).Weight.sum()

## Age  Race
## 18   A       0.5
## 20   A       1.0
## 56   B       2.4
## Name: Weight, dtype: float64

answered Jul 17 '17 at 00:25

miradulo

28,857
6
80
93

score 2 · Answer 2 · answered Jul 17 '17 at 06:08

2

Not exactly obvious... but a fun way to do it with pd.factorize and np.bincount

f, u = pd.factorize(
    [tuple(row) for row in d[['Age', 'Race']].values.tolist()]
)

pd.Series(
    np.bincount(f, d.Weight.values),
    pd.MultiIndex.from_tuples(u, names=['Age', 'Race'])
)

Age  Race
18   A       0.5
20   A       1.0
56   B       2.4
dtype: float64

answered Jul 17 '17 at 06:08

piRSquared

285,575
57
475
624

1

"Not exactly obvious" might be a _slight_ understatement ;) but this is neat. – miradulo Jul 17 '17 at 06:14

Groupby with weight

2 Answers2

Linked