I'm using groupby with sum and I am getting the wrong output.

This is my dataframe:

Although the Medal column only contains values of either 0 or 1, I am getting this output after executing the following code:

test = oly_new.groupby(['Country', 'Year'])['Medal'].sum()

Ben Ng
    please read [how-to-make-good-reproducible-pandas-examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) – Brown Bear Apr 03 '19 at 08:49

1 Answer

Your Medal column holds strings (dtype str), not numbers. Convert it to int first and then sum:

oly_new['Medal'] = oly_new['Medal'].astype(int)
test = oly_new.groupby(['Country', 'Year'])['Medal'].sum()

When the column dtype is str, sum just concatenates the strings in each group instead of adding them up as numbers.
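
To see the difference for yourself, here is a minimal sketch. The frame below is a made-up stand-in for oly_new (the real data was only posted as an image), and the Medal column is forced to object dtype so it holds Python strings, which is an assumption about how the original data was loaded:

```python
import pandas as pd

# Hypothetical stand-in for oly_new; the values are invented for illustration.
oly_new = pd.DataFrame({
    "Country": ["USA", "USA", "USA", "CHN"],
    "Year": [2008, 2008, 2008, 2008],
    # Strings stored as Python objects, as assumed for the asker's column.
    "Medal": pd.Series(["1", "0", "1", "1"], dtype=object),
})

# Inspect the dtype first: object here means the values are not numeric.
print(oly_new["Medal"].dtype)

# Summing strings concatenates them within each group.
str_total = oly_new.groupby(["Country", "Year"])["Medal"].sum()
print(str_total)

# After converting to int, sum adds the values numerically.
oly_new["Medal"] = oly_new["Medal"].astype(int)
int_total = oly_new.groupby(["Country", "Year"])["Medal"].sum()
print(int_total)
```

With this sample data the USA/2008 group comes out as the string "101" before the conversion and as the integer 2 after it. Checking oly_new['Medal'].dtype on your own frame is a quick way to confirm whether this is what is happening.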

EdChum
  • Also, for future reference, posting images is frowned upon. You should post raw data, code to recreate your df, the desired output, and your existing code. Also remember to accept my answer, there will be an empty tick under the down arrow at the top left of my answer – EdChum Apr 03 '19 at 08:59