I am trying to get the mean of the 30 most recent points in column a for each type of product specified in column b, given a date column c.

So the calculation of the average will be based on the most recent 30 points of each particular Product as opposed to the overall most recent data points of the whole DataFrame.

df:

Product            Value      Date
POL Mumbai         22.5       2015-6-26
STOLCO Finesse     55.5       2015-7-1
MPLR  Pure         85.0       2015-8-1
Stefan
pedramoh

1 Answer

In general terms, you could group your DataFrame (assumed to be called df) by its column 'b' like so:

products = df.groupby('b')

then iterate through each product group as follows:

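from pandas import DataFrame, Series

mean = {}
for product, data in products:
    # newest rows first by date column 'c', keep 30, then average column 'a'
    mean[product] = data.sort_values('c', ascending=False).head(30)['a'].mean()
print(DataFrame(list(mean.items()), columns=['Product', 'Mean']))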

or

print(Series(mean))

See here for details on the error you encountered.
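
The same idea can also be written without an explicit loop. The following is only a minimal sketch under some assumptions: it uses the column names from the question's table (Product, Value, Date) instead of 'a', 'b', 'c', the sample rows and the helper name recent_mean are made up for illustration, and the dates are parsed with pd.to_datetime so that sorting is chronological rather than alphabetical.

```python
import pandas as pd

# Hypothetical sample data; column names follow the table in the question.
df = pd.DataFrame({
    "Product": ["POL Mumbai", "POL Mumbai", "POL Mumbai",
                "STOLCO Finesse", "STOLCO Finesse"],
    "Value": [22.5, 30.0, 40.0, 55.5, 60.5],
    "Date": ["2015-06-26", "2015-07-01", "2015-08-01",
             "2015-07-01", "2015-08-01"],
})

# Parse the strings into real datetimes so sorting is chronological.
df["Date"] = pd.to_datetime(df["Date"])


def recent_mean(frame, n=30):
    """Mean of Value over the n most recent rows of each Product."""
    newest_first = frame.sort_values("Date", ascending=False)
    # GroupBy.head(n) keeps the first n rows of every group in the sorted order.
    recent = newest_first.groupby("Product").head(n)
    return recent.groupby("Product")["Value"].mean()


# n=2 only because the sample is tiny; the question asks for n=30.
result = recent_mean(df, n=2)
print(result)
```

A product with fewer than n rows is simply averaged over all the rows it has, which matches the behaviour of head(30) in the loop above.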

Community
Stefan