1

I have the following code which is working as desired with one exception

df['FPYear'] = df['First_Purchase_Date'].dt.year

Table2 = df.loc[df.Date.between('2014-01-01','2019-11-30')].groupby(df['FPYear'])[['New Customer', 'Existing Customer', 'revenue']].sum() #with date filters for table
# Table2 = df.loc[df.Date.between('2018-11-22','2018-11-30') & (df['Region'] == 'USA')].groupby(df['FPYear'])[['New Customer', 'Existing Customer', 'revenue']].sum() #with date filters for table
Table2['TotalCusts'] = Table2['New Customer'] + Table2['Existing Customer']
Table2['Cohort Size'] = Table['New Customer']

Table2['Repeat Rate'] = Table2['Existing Customer']/Table2['TotalCusts']
Table2['NewCust Rate'] = Table2['New Customer']/Table2['TotalCusts']
Table2['PCT of Total Yr'] = Table2['TotalCusts']/Table['New Customer']

Table2.loc['Total'] = Table2.sum(axis = 0) #this code totals all columns.  the below calcs totals for some and average for others
cols = ["Repeat Rate", "NewCust Rate"]
diff_cols = Table2.columns.difference(cols)
Table2.loc['Total'] = Table2[diff_cols].sum().append(Table2[cols].mean())

print(Table2)

For the last line of the code,

Table2.loc['Total'] = Table2[diff_cols].sum().append(Table2[cols].mean())

rather than have it take the mean() of all other columns I'd rather add in customer functions (simple col x/ col y) but after trying a few different things I have been unable to do so.

wolfblitza
  • 477
  • 1
  • 5
  • 16
  • Can you please clarify your question? What exactly do you want to do? – mccandar Dec 04 '19 at 07:17
  • Have you looked at `.apply()`? See, e.g., https://stackoverflow.com/questions/13331698/how-to-apply-a-function-to-two-columns-of-pandas-dataframe – Evan Dec 04 '19 at 07:36

1 Answers1

0

IIUC, just append a direct division between 2 columns. Assume col x and col y belongs to Table2

Table2.loc['Total'] = Table2[diff_cols].sum().append(Table2['col x'] / Table2['col y'])
Andy L.
  • 24,909
  • 4
  • 17
  • 29