I'm using pandas 0.13.0 and I'm trying to compute a sliding average based on the values of the index.
The index values are not evenly spaced. The index is sorted, with increasing, unique values.
import pandas as pd
import quantities as pq
f = {
'A': [ 0.0, 0.1, 0.2, 0.5, 1.0, 1.4, 1.5] * pq.m,
'B': [10.0, 11.0, 12.0, 15.0, 20.0, 30.0, 50.0] * pq.kPa
}
df = pd.DataFrame(f)
df.set_index(df['A'], inplace=True)
The DataFrame gives:
in: print df
out:
           A         B
A
0.00  0.00 m  10.0 kPa
0.10  0.10 m  11.0 kPa
0.20  0.20 m  12.0 kPa
0.50  0.50 m  15.0 kPa
1.00  1.00 m  20.0 kPa
1.40  1.40 m  30.0 kPa
1.50  1.50 m  50.0 kPa
Now, for each index value x, I would like to compute the average of column B over the rows whose index lies between x and x+c (inclusive), where c is a user-defined window width.
For the sake of this example, c = 0.40.
The averaging process would give:
           A         B         C
A
0.00  0.00 m  10.0 kPa  11.0 kPa  = (10.0 + 11.0 + 12.0) / 3
0.10  0.10 m  11.0 kPa  12.7 kPa  = (11.0 + 12.0 + 15.0) / 3
0.20  0.20 m  12.0 kPa  13.5 kPa  = (12.0 + 15.0) / 2
0.50  0.50 m  15.0 kPa  15.0 kPa  = (15.0) / 1
1.00  1.00 m  20.0 kPa  25.0 kPa  = (20.0 + 30.0) / 2
1.40  1.40 m  30.0 kPa  40.0 kPa  = (30.0 + 50.0) / 2
1.50  1.50 m  50.0 kPa  50.0 kPa  = (50.0) / 1
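To make the intended computation concrete, here is a minimal sketch of it. It assumes plain float values with the units stripped, and a sorted, unique, numeric index. The function name `forward_window_mean` and its `tol` parameter (a small slack for float comparison at the window edge) are made up for illustration and are not part of pandas.

```python
import numpy as np
import pandas as pd

def forward_window_mean(series, c, tol=1e-9):
    """Mean of `series` over index values in [x, x + c] for each index value x.

    Hypothetical helper: assumes a sorted, unique, numeric index and
    float values (no units attached).
    """
    idx = series.index.values.astype(float)
    vals = series.values.astype(float)
    # For every x, the position one past the last index value <= x + c.
    stops = np.searchsorted(idx, idx + c + tol, side="right")
    # Each window starts at its own row.
    starts = np.arange(len(idx))
    # Prefix sums turn every window sum into a single subtraction.
    csum = np.concatenate(([0.0], np.cumsum(vals)))
    means = (csum[stops] - csum[starts]) / (stops - starts)
    return pd.Series(means, index=series.index)

# Column B from the example, as plain floats indexed by A.
b = pd.Series([10.0, 11.0, 12.0, 15.0, 20.0, 30.0, 50.0],
              index=[0.0, 0.1, 0.2, 0.5, 1.0, 1.4, 1.5])
c_means = forward_window_mean(b, 0.40)
```

On the example data this gives 11.0, 12.67, 13.5, 15.0, 25.0, 40.0 and 50.0, matching column C in the table above.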
Note that because the index values are not evenly spaced, x+c will sometimes not be present in the index. That is OK for now, though I will definitely add a way to estimate the value at x+c from the values just before and just after it, so that I get a more accurate average.
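One possible way to get the value at x+c is linear interpolation between its two neighbours. The sketch below assumes plain float arrays and uses `np.interp`, which clamps to the last value when x+c lies past the end of the index. How the interpolated value should then enter the average is left open here.

```python
import numpy as np

# Index A and column B from the example, as plain floats (units stripped).
idx = np.array([0.0, 0.1, 0.2, 0.5, 1.0, 1.4, 1.5])
vals = np.array([10.0, 11.0, 12.0, 15.0, 20.0, 30.0, 50.0])
c = 0.40

# Linearly interpolated B at every x + c. Beyond the last index value,
# np.interp returns the last B value instead of extrapolating.
b_at_edge = np.interp(idx + c, idx, vals)
```

For x = 0.0, for instance, x+c = 0.4 falls between 0.2 (12.0) and 0.5 (15.0), so the interpolated value is 14.0.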
I tried Zelazny7's solution from "pandas rolling computation with window based on values instead of counts",
but I can't make it work for my case, where the window is defined on the index.
I also looked at "Pandas Rolling Computations on Sliding Windows (Unevenly spaced)",
but I don't understand how to apply it to my case.
Any idea how to solve this problem with an efficient pandas approach (using apply, map or rolling)?
Thanks.