
I am trying to get the sum of two numbers using groupby and transform in pandas, but it is giving me a garbage value. Can someone guide me on how to solve this? My data looks like this:

SKU     Fees
45241   6.91
45241   6.91
55732   119.05
55732   137.98

I have tried using this code:

df['total_fees'] = df.groupby(['sku'])['Fees'].transform('sum')

What I am getting is this:

SKU     Fees     total_fees 
45241   6.91     6.91.6.91
45241   6.91     6.91.6.91
55732   119.05   119.05.137.98
55732   137.98   119.05.137.98
Lily
  • It would seem that the Fees column is a string not a number. – Henry Ecker Nov 16 '21 at 15:37
  • How to change a string to a number? – Lily Nov 16 '21 at 15:39
  • `df['Fees'] = df['Fees'].astype(int)` but this is giving me an error: `ValueError: invalid literal for int() with base 10: '6.91' ` – Lily Nov 16 '21 at 15:41
  • `int` is for integers; you need `float` (6.48 is not an integer) – Vincent Nov 16 '21 at 15:42
  • `df['total_fees'] = pd.to_numeric(df['Fees']).groupby(df['sku']).transform('sum')` like [this answer](https://stackoverflow.com/a/43745402/15497888) by [piRSquared](https://stackoverflow.com/users/2336654/pirsquared) – Henry Ecker Nov 16 '21 at 15:48
  • Or convert first: `df['total_fees'] = df['Fees'].astype(float).groupby(df['sku']).transform('sum')` or with a lambda: `df['total_fees'] = df.groupby(df['sku'])['Fees'].transform(lambda s: s.astype(float).sum())` – Henry Ecker Nov 16 '21 at 15:50

1 Answer

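# Convert Fees from strings to floats so that 'sum' adds numbers instead of joining text
df['Fees'] = df['Fees'].astype(float)

# Computes the sum per sku (one row per group)
df.groupby(['sku'])['Fees'].sum()

# Computes the same sum, but 'transform' repeats the group total on every original row
df.groupby(['sku'])['Fees'].transform('sum')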
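
The steps above can be put together as a minimal end-to-end sketch. It assumes the Fees column was loaded as strings. The sample frame is rebuilt from the data in the question, and `pd.to_numeric` (suggested in the comments) is used here in place of `astype(float)`:

```python
import pandas as pd

# Sample data rebuilt from the question; Fees is deliberately stored as
# strings (an assumption about how the original frame was loaded)
df = pd.DataFrame({
    "sku": [45241, 45241, 55732, 55732],
    "Fees": ["6.91", "6.91", "119.05", "137.98"],
})

# Convert the text column to floats before aggregating
df["Fees"] = pd.to_numeric(df["Fees"])

# transform('sum') returns one value per original row, aligned to df's index,
# so it can be assigned straight back as a new column
df["total_fees"] = df.groupby("sku")["Fees"].transform("sum")

print(df)
```

If some entries cannot be parsed as numbers, `pd.to_numeric(df["Fees"], errors="coerce")` turns them into NaN instead of raising an error.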
Vincent