I have an Excel spreadsheet with key columns (k1, k2) and amount columns (a1 through a12).

I need to group by k1 and k2 and, in the resulting dataframe, sum the amount columns and save the total to a new column. Here is what I have tried so far:

import numpy as np
import pandas as pd
# raw string so the backslashes in the Windows path are not treated as escapes
df = pd.read_excel(r'C:\Users\pb\Desktop\py test\Bal.xlsx')
df1 = df.groupby(['k1', 'k2'])
# sum a1 thru a12 (also tried df['suma'] = df['a1'] + df['a2'])

df1['suma']=df1.apply(lambda x: x['a1'] + x['a2']) 

Here is the error I am getting

TypeError                                 Traceback (most recent call last)
<ipython-input-14-242ac0584a79> in <module>()
      3 df1=df.groupby(['k1', 'k2'])
      4 #sum a1 thru a12
----> 5 df1['sum']=df1.apply(lambda x: x['a1'] + x['a2'])

TypeError: 'DataFrameGroupBy' object does not support item assignment

Is there a way to sum the columns after the group by?

Thanks in advance

b p
  • Looks possible if you add a small sample dataset by referring to [this](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) and an expected df too – anky Mar 29 '19 at 15:24
  • I am sorry about that. I tried attaching the Excel file because the dataset is huge and has many columns, but I couldn't attach it. I will create a df and post it here. Thank you – b p Mar 29 '19 at 15:29

1 Answer

When you create a groupby you aren't creating a new dataframe; you get a DataFrameGroupBy object, which only becomes a dataframe once an aggregation or some other function is applied to it. That is why assigning a column to it fails. You could instead start before the groupby by adding a column to the original dataframe that already sums the first two amount columns, and then do the groupby with a sum.

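# add the amount columns row by row on the plain dataframe (not on the groupby object)
df['suma'] = df['a1'] + df['a2']
# group on the keys and total the new column; as_index=False keeps k1 and k2 as columns
df1 = df.groupby(['k1', 'k2'], as_index=False).agg({'suma': 'sum'})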
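
To extend the same idea to every amount column, here is a minimal sketch. It assumes the amount columns are numeric and named a1, a2, and so on. The small dataframe is made-up sample data standing in for the spreadsheet, and only a1 through a3 are used to keep it short; with the real file the range would run to 13. The names `keys`, `amount_cols` and `df2` are just illustrative.

```python
import pandas as pd

# Hypothetical sample data standing in for the Excel file
df = pd.DataFrame({
    "k1": ["x", "x", "y"],
    "k2": ["p", "p", "q"],
    "a1": [1, 2, 3],
    "a2": [10, 20, 30],
    "a3": [100, 200, 300],
})

keys = ["k1", "k2"]
# Amount column names; use range(1, 13) for a1 through a12
amount_cols = [f"a{i}" for i in range(1, 4)]

# Option 1: row-wise total first, then group and sum that one column
df["suma"] = df[amount_cols].sum(axis=1)
df1 = df.groupby(keys, as_index=False).agg({"suma": "sum"})

# Option 2: group and sum each amount column, then total across the columns
df2 = df.groupby(keys, as_index=False)[amount_cols].sum()
df2["suma"] = df2[amount_cols].sum(axis=1)

print(df1)
print(df2)
```

Both options should give the same suma per (k1, k2) pair. The second one also keeps the per-column group totals, which may be useful if you need them alongside the overall total.
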
D.Sanders