I would like to sum multiple values to one in python.
See the picture below of my data. I want to sum all the values of AGE
for each year for each country.
Instead of having this:
country TIME AGE Value
A 2017 20-60 200
A 2017 60-80 100
A 2016 20-60 200
A 2016 60-80 200
B 2017 20-60 300
B 2017 60-80 300
B 2016 20-60 400
B 2016 60-80 400
I would like to have this:
country TIME Value
A 2017 300
A 2016 400
B 2017 600
B 2016 800
The types of data:
df4types
AGE object
Value object
dtype: object
The data has a multi index by country and TIME
.
If have tried this:
df=df.groupby(by=["TIME","GEO"])['Value'].sum()
and this:
df=df.groupby(by=["TIME","GEO"]).sum()['Value']
Both "worked" but result in an enormous value. Like it doesn't sum but paste the numbers behind each other. I have tried to change the variable type to numeric by using:
by df.Value.astype(float) & df.Value.astype(int)
Unfortunately this didn't solve the problem. Does someone have an idea how to sum the values by group and time correctly? I have also uploaded a picture of the real dataset.