I have a dataset with multiple entries for the same year, metric, and country. I want to create a dataframe where rows with matching values for year, metric, and country are summed into a single row.
There are potentially hundreds of groups of rows that match on the first three columns, so the solution must scale well.
Initial data:
Year  Metric  Country    Q1   Q2   Q3   Q4
2016  2.1.1   Australia  23   166  146  17
2016  2.1.1   Australia  0    24   26   0
2014  3.1.1   Haiti      0    0    0    0
2015  2.1.1   Mexico     442  37   16   58
2013  3.1.4   Jamaica
2015  2.1.1   Mexico     165  140  209  309
I have tried several iterations of groupby and boolean indexing based on this question: "How do I sum values in a column that match a given condition using pandas?"
Desired output:
Year  Metric  Country    Q1   Q2   Q3   Q4
2016  2.1.1   Australia  23   190  172  17
2014  3.1.1   Haiti      0    0    0    0
2015  2.1.1   Mexico     607  177  225  367
2013  3.1.4   Jamaica
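
For reference, the transformation described above could be sketched roughly as follows. This is only a minimal illustration, not a confirmed solution: it assumes the data is already in a pandas DataFrame, that Metric is stored as a string, and that the blank Jamaica cells are missing values (NaN). The names df, keys, quarters, and result are placeholders chosen for this example.

```python
import numpy as np
import pandas as pd

# Sample data from the question; the blank Jamaica cells are assumed to be NaN.
df = pd.DataFrame(
    {
        "Year": [2016, 2016, 2014, 2015, 2013, 2015],
        "Metric": ["2.1.1", "2.1.1", "3.1.1", "2.1.1", "3.1.4", "2.1.1"],
        "Country": ["Australia", "Australia", "Haiti", "Mexico", "Jamaica", "Mexico"],
        "Q1": [23, 0, 0, 442, np.nan, 165],
        "Q2": [166, 24, 0, 37, np.nan, 140],
        "Q3": [146, 26, 0, 16, np.nan, 209],
        "Q4": [17, 0, 0, 58, np.nan, 309],
    }
)

keys = ["Year", "Metric", "Country"]
quarters = ["Q1", "Q2", "Q3", "Q4"]

# Group on the three key columns and sum the quarter columns.
# sort=False keeps groups in order of first appearance;
# min_count=1 leaves an all-missing group as NaN instead of turning it into 0.
result = df.groupby(keys, as_index=False, sort=False)[quarters].sum(min_count=1)

print(result)
```

Because the grouping is a single vectorized pass, this approach should handle hundreds of duplicate key combinations without any explicit loop. If the blank cells should count as zero instead, dropping min_count=1 would give 0 for the Jamaica row.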