I have the following dataframe from a dataset containing 0.3 million rows:
CustomerID Revenue
0 17850.0 15.30
1 17850.0 11.10
2 13047.0 17.85
3 13047.0 17.85
4 17850.0 20.34
5 13047.0 12.60
6 13047.0 12.60
7 13047.0 31.80
8 17850.0 20.34
9 17850.0 15.30
10 13047.0 9.90
11 13047.0 30.00
12 13047.0 31.80
13 12583.0 40.80
14 12583.0 39.60
15 13047.0 14.85
16 13047.0 14.85
17 12583.0 15.60
18 12583.0 45.00
19 12583.0 70.80
The CustomerID values repeat in batches. For example the CustomerID value of 17850 contained in the first 2 rows may appear again later at some some point in the datset. I am trying to group subsets of rows by the same Customer ID and sum up the revenue for that bunch. The dataframe transformation that i want to do should look like this:
CustomerID TotalRevenue
0 17850.0 26.40
1 13047.0 35.70
2 17850.0 20.34
3 13047.0 57.0
4 17850.0 35.64
5 13047.0 71.7
6 12583.0 80.4
7 13047.0 29.7
8 12583.0 131.4
The problem is if i use the groupby method it groups all the rows with the same CustomerID value. So this way it groups together all the 17850 CustomerID values in the whole dataframe and not just as a bunch of first 2 rows and then subsequent bunches of other CustomerID values.
Will really appreciate some help how to do this using Pandas. Thanks