2

I have a DataFrame:

a = {'Price': [10, 15, 20, 25, 30], 'Total': [10000, 12000, 15000, 14000, 10000],
 'Previous Quarter': [0, 10000, 12000, 15000, 14000]}
a = pd.DataFrame(a)
print (a)

With this raw data, i have added a number of additional columns including a weighted average price (WAP)

a['Change'] = a['Total'] - a['Previous Quarter']
a['Amount'] = a['Price']*a['Change']
a['Cum Sum Amount'] = np.cumsum(a['Amount'])
a['WAP'] = a['Cum Sum Amount'] / a['Total']

This is fine, however as the total starts to decrease this brings down the weighted average price.

my question is, if Total decreases how would i get WAP to reflect the row above? For instance in row 3, Total is 1000, which is lower than in row 2. This brings WAP down from 12.6 to 11.78, but i would like it to say 12.6 instead of 11.78.

I have tried looping through a['Total'] < 0 then a['WAP'] = 0 but this impacts the whole column.

Ultimately i am looking for a WAP column which reads: 10, 10.83, 12.6, 12.6, 12.6

demonplus
  • 5,613
  • 12
  • 49
  • 68
ben121
  • 879
  • 1
  • 11
  • 23
  • 1
    I have about a week experience in Python, but wouldn't just `a['WAP'] = np.maximum.accumulate(a['Cum Sum Amount'] / a['Total'])` work for you? – David Arenburg Sep 05 '16 at 17:36

2 Answers2

3

You could use cummax:

a['WAP'] = (a['Cum Sum Amount'] / a['Total']).cummax()

print (a['WAP'])

0    10.000000
1    10.833333
2    12.666667
3    12.666667
4    12.666667
Name: WAP, dtype: float64
Nickil Maveli
  • 29,155
  • 8
  • 82
  • 85
1

As a total Python beginner, here are two options I could think of

Either

a['WAP'] = np.maximum.accumulate(a['Cum Sum Amount'] / a['Total'])

Or after you've already created WAP you could modify only the subset using the diff method (thanks to @ayhan for the loc which will modify a in place)

a.loc[a['WAP'].diff() < 0, 'WAP'] = max(a['WAP'])
David Arenburg
  • 91,361
  • 17
  • 137
  • 196
  • You can change the second one to `a.loc[a['WAP'].diff() < 0, 'WAP'] = max(a['WAP'])` to avoid the warning. – ayhan Sep 05 '16 at 17:56
  • 1
    @ayhan thanks, I was just about to while reading this http://stackoverflow.com/questions/17071871/select-rows-from-a-dataframe-based-on-values-in-a-column-in-pandas – David Arenburg Sep 05 '16 at 17:57