I am trying to calculate weighted sum using two columns in a python dataframe.
Dataframe structure:
unique_id weight value
1 0.061042375 20.16094523
1 0.3064548 19.50932003
1 0.008310739 18.76469039
1 0.624192086 21.25
2 0.061042375 20.23776924
2 0.3064548 19.63366165
2 0.008310739 18.76299395
2 0.624192086 21.25
.......
Output I desired is:
Weighted sum for each unique_id = sum((weight) * (value))
Example: Weighted sum for unique_id 1 = ( (0.061042375 * 20.16094523) + (0.3064548 * 19.50932003) + (0.008310739 * 18.76469039) + (0.624192086 * 21.25) )
I checked out this answer (Calculate weighted average using a pandas/dataframe) but could not figure out the correct way of applying it to my specific scenario.
This is what I am doing based on the above answer:
#Assume temp_weighted_sum_dataframe is the dataframe stated above
grouped_data = temp_weighted_sum_dataframe.groupby('unique_id') #I think this groups data based on unique_id values
weighted_sum_output = (grouped_data.weight * grouped_data.value).transform("sum") #This should allow me to multiple weight and value for every record within each group and sum it up to one value for that group.
# On above line I am getting the error > TypeError: unsupported operand type(s) for *: 'SeriesGroupBy' and 'SeriesGroupBy'
Any help is appreciated, thanks