I'm trying to sum 6 months of data separately for each team name so that I can fit a linear regression model. I've used the following code, but it sums everything in my dataset together rather than separately for each 'Area Team Name', which is specified in the code.

 pivoted_df = pd.pivot_table(sample(), index="Month", values=["Actual Cost", "Area Team Name"], aggfunc=np.sum)
 print(pivoted_df)

Which gives me a result of...

             Actual Cost
 Month                  
 AUGUST     2.156021e+07
 DECEMBER   3.282076e+07
 JULY       3.421666e+07
 NOVEMBER   3.370295e+07
 OCTOBER    3.466268e+07
 SEPTEMBER  3.371625e+07 
 
Henry Ecker
  • Welcome to SO, please read [tour] and [mre] and in this case also: [how-to-make-good-reproducible-pandas-examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) – Andreas Jul 19 '21 at 14:41
  • Check out the documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.pivot_table.html . You may need to add a column parameter with `Area Team Name`? – Henrik Bo Jul 19 '21 at 14:53

1 Answer

Since 'Area Team Name' is listed in values, pivot_table treats it as data to aggregate and tries to include it in the sum. I imagine the team names aren't numbers, so summing them doesn't succeed, and the data is only grouped by Month.

If you want the team names as rows, you'll need to add 'Area Team Name' to index (although you may want to just do a groupby() at that point). If you want them as columns (so the rows would be the months as you have now, with one extra column per team name), you'll need to add the parameter columns='Area Team Name'.
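As a rough sketch of both options, here is a small example. The DataFrame `df`, the team names and the cost figures are made up for illustration; only the column names 'Month', 'Area Team Name' and 'Actual Cost' come from the question, and your real data may need adjusting.

```python
import pandas as pd

# Hypothetical sample data; only the column names are taken from the question.
df = pd.DataFrame({
    "Month": ["JULY", "JULY", "AUGUST", "AUGUST", "JULY"],
    "Area Team Name": ["North", "South", "North", "South", "North"],
    "Actual Cost": [100.0, 200.0, 150.0, 250.0, 50.0],
})

# Option 1: team names as rows (team and month together form the index).
by_rows = pd.pivot_table(
    df, index=["Area Team Name", "Month"], values="Actual Cost", aggfunc="sum"
)

# Option 2: months as rows, one column per team.
by_cols = pd.pivot_table(
    df, index="Month", columns="Area Team Name", values="Actual Cost", aggfunc="sum"
)

# The groupby() alternative to option 1; returns a Series instead of a DataFrame.
grouped = df.groupby(["Area Team Name", "Month"])["Actual Cost"].sum()

print(by_rows)
print(by_cols)
print(grouped)
```

In all three cases 'Area Team Name' is used as a grouping key rather than as a value, which is what lets the costs be summed per team.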

scotscotmcc