For example, I have the following CSV data (in practice there is more than one group in column G):
G,T,x,y
g,1,3,4
g,2,4,5
g,3,6,1
g,4,7,2
g,5,8,3
g,6,9,8
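To make this reproducible without a file on disk, here is a minimal sketch that loads the same sample inline. The name `csv_text` is introduced here for illustration only; the resulting `dat` is meant to match the frame read from `test.csv` further down.

```python
import io

import pandas as pd

# The sample rows from above, kept inline so no test.csv is needed.
csv_text = """G,T,x,y
g,1,3,4
g,2,4,5
g,3,6,1
g,4,7,2
g,5,8,3
g,6,9,8
"""

# Index by group (G) and time (T), the same layout used in the session below.
dat = pd.read_csv(io.StringIO(csv_text)).set_index(['G', 'T'])
print(dat)
```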
I want to calculate the exponentially weighted correlation coefficient between x and y within each group. The result I expect is:
G T namedWhatever
g 1 NaN
g 2 1.000000
g 3 -0.867510
g 4 -0.792758
g 5 -0.510885
g 6 0.413379
which, for a single group, can actually be calculated by:
In [5]: dat.loc['g'].ewm(halflife=3).corr().loc[:, 'x', 'y']
Out[5]:
T
1 NaN
2 1.000000
3 -0.867510
4 -0.792758
5 -0.510885
6 0.413379
Name: y, dtype: float64
What I have tried without luck:
In [3]: dat = pd.read_csv('test.csv').set_index(['G', 'T'])
In [4]: dat.groupby(level='G').transform(lambda x: x.ewm(halflife=3).corr())
Out[4]:
       x    y
G T
g 1  NaN  NaN
  2  1.0  1.0
  3  1.0  1.0
  4  1.0  1.0
  5  1.0  1.0
  6  1.0  1.0
What's the right way to do this? I am using pandas 0.19.2 with Python 3.6.
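The 1.0 values are presumably because `transform` hands the lambda one column at a time, so each column ends up correlated with itself. As a point of comparison, here is a minimal sketch of one possible workaround: loop over the groups, correlate x with y directly using the Series form of `ewm(...).corr(other)`, and concatenate the pieces. The helper `ewm_corr_xy`, the result name `ewm_corr`, and the second group `h` (same x, negated y) are made up for illustration. This is not necessarily the idiomatic way, and it is only a sketch under those assumptions.

```python
import pandas as pd

# Group 'g' is the sample data from the question. Group 'h' is a made-up
# second group (same x, negated y) used only to show the multi-group case.
x = [3, 4, 6, 7, 8, 9]
y = [4, 5, 1, 2, 3, 8]
dat = pd.DataFrame({
    'G': ['g'] * 6 + ['h'] * 6,
    'T': list(range(1, 7)) * 2,
    'x': x + x,
    'y': y + [-v for v in y],
}).set_index(['G', 'T'])


def ewm_corr_xy(group, halflife=3):
    """Exponentially weighted correlation of x with y for one group."""
    # Series-to-Series form: avoids building the full pairwise matrix.
    return group['x'].ewm(halflife=halflife).corr(group['y'])


# Each group keeps its (G, T) index, so a plain concat restores the layout.
result = pd.concat(
    [ewm_corr_xy(group) for _, group in dat.groupby(level='G')]
).rename('ewm_corr')
print(result)
```

For group `g` this should reproduce the single-group values shown earlier, and for the made-up group `h` the same values with the sign flipped.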