1

I am trying to group my data to years and sum the spending according to the year they belong to.

Here's a sample data:

date:            spend_amt:

2/1/2014          10000

2/5/2014           98

1/2/2015          5834.2

7/8/2017          561236

9/3/2017          568

28/1/2016         989895.3

My current code

def yearlySpending(self):

    dfspendingYearly = pd.DataFrame()
    dfspendingYearly = self.dfGov.groupby(["date"])['spend_amt'].agg('sum')
    dfspendingYearly.groupby(dfspendingYearly["date"].dt.year)['spend_amt'].agg(['sum'])

I got an error, 'KeyError: 'date''

Desired output

date:       spend_amt:
2014        10098
2015        5834.2
2016        989895.3
2017        561804
justalazyguy
  • 49
  • 1
  • 5

2 Answers2

0

convert your datestamp column to date time and group the dataframe by year

possible of duplicate grouping by year

df["date:"] = pd.to_datetime(df['date:'])
df.groupby(df['date:'].dt.year).sum().reset_index()

Out:

    date:   spend_amt:
0   2014    10098.0
1   2015    5834.2
2   2016    989895.3
3   2017    561804.0
Naga kiran
  • 4,528
  • 1
  • 17
  • 31
  • in your code, the date column is mentioned as string , you have to convert it to datestamp if you want to access the year individually, otherwise grouping operation con be performed on individual string. – Naga kiran Sep 30 '18 at 08:46
  • The reason for key error is you have assigned grouped series value to "dfspendingYearly" , so the date name is index of that series, so you cant access as column (either you can try of to_frame followed by reset index ) then you can access @justalazyguy – Naga kiran Sep 30 '18 at 08:48
0

Your error means there is no column date, I guess there is index called date:

df.index = pd.to_datetime(df.index)
dfspendingYearly = df.groupby(df.index.year).sum().reset_index()
print (dfspendingYearly)
   date  spend_amt
0  2014    10098.0
1  2015     5834.2
2  2016   989895.3
3  2017   561804.0
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252