This is the data I have been given:

[Screenshot: input dataframe]

I want to sum the needed supplies for each location, but the groupby function does not seem to work. For example: `df.groupby(['location']).sum()`

However, this gives the following, which is not the desired output.

[Screenshot: output dataframe]

  • What is your expected result? `groupby.sum` is working as expected based on your screenshot. – It_is_Chris Apr 13 '22 at 20:03
  • The expected result should be along the lines of: location 6 needs 1 + 3 = 4 supplies and location 22 needs 3+8 = 11 supplies. – RobinHood Apr 13 '22 at 20:07
  • You are only showing the first 10 rows in your screenshot but your frame is larger. There are other locations that are being grouped and summed. – It_is_Chris Apr 13 '22 at 20:08
  • Indeed, this is a subset of a larger problem. I could not find a solution for the whole dataframe, so I figured that if there is a solution for the first 10 rows it should also work for (in my case) 200 rows total. But I still don't get why the sum is incorrect on this subset. – RobinHood Apr 13 '22 at 20:13
  • You are doing `df.groupby(...)`, and `df` is not a slice of your frame but the whole frame (all 200 rows). If you only want the first ten rows, then you need to perform the groupby operation on only those rows: `df.head(10).groupby('location').sum()` – It_is_Chris Apr 13 '22 at 20:17
  • The issue is that you have strings. Thus `'1' + '3' = '13'` and `'3' + '8' = '38'`. Convert to integers, then your logic will work. – mozway Apr 13 '22 at 20:23
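
A rough sketch of the string issue described in the last comment. The real data is only visible in the screenshots, so the column name `supplies` and the four sample rows below are assumptions made up for illustration; only `location` and the totals (4 and 11) come from the thread. The column is forced to `object` dtype here to mimic numbers that were read in as text.

```python
import pandas as pd

# Hypothetical sample: "supplies" and these rows are assumptions,
# chosen to match the totals mentioned in the comments.
df = pd.DataFrame({
    "location": [6, 6, 22, 22],
    "supplies": pd.Series(["1", "3", "3", "8"], dtype=object),  # text, not numbers
})

# Summing strings concatenates them instead of adding them.
as_strings = df.groupby("location")["supplies"].sum()

# Convert the column to a numeric dtype first, then group and sum.
df["supplies"] = pd.to_numeric(df["supplies"])
as_numbers = df.groupby("location")["supplies"].sum()

print(as_strings)
print(as_numbers)
```

Checking `df.dtypes` before grouping is a quick way to spot this. If some cells might not parse as numbers, `pd.to_numeric(df["supplies"], errors="coerce")` turns them into `NaN` instead of raising.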
