I have a dataframe (df), and my goal is to add a new column ("grad") holding the gradient between points that share the same index value.
First, I couldn't find an easy way to do this with pandas alone, so for now I use numpy together with pandas. I have written a function that computes the gradient for each row within a group. It works, but it is not pretty and feels a bit wonky.
Second, I want to add the resulting pandas series of numpy arrays to df, but I don't know how. I tried stacking the arrays into a single flat series of 9 values (grouped_2), but when I use concat I get the following message: "ValueError: Shape of passed values is (48, 3), indices imply (16, 3)". Based on a previous question, I think the duplicate index values are the problem, but I can't modify the index of my first df.
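import numpy as np
import pandas as pd

df = pd.DataFrame(index=[1, 1, 1, 1, 1, 2, 2, 2, 2],
                  data={'value': [1, 5, 8, 10, 12, 1, 2, 8, 2], 'diff_day': [-1, 0, 2, 3, 4, -2, -1, 0, 10]})

def grad(gr):
    # gradient of 'value' with respect to 'diff_day' within one group
    val = gr['value']
    dif = gr['diff_day']
    return np.gradient(val, dif)

# one numpy array of gradients per index value
grouped_1 = df.groupby(level=0).apply(grad)
# flatten the per-group arrays into a single series with a fresh integer index
grouped_2 = pd.DataFrame(grouped_1.values.tolist(), index=grouped_1.index).stack().reset_index(drop=True)
result = pd.concat([df, grouped_2], axis=1)  # this call raises the ValueError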
My expectation was the following dataframe:
pd.DataFrame(index = [1,1,1,1,1,2,2,2,2],
data={'value': [1,5,8,10,12,1,2,8,2], 'diff_day':[-1,0,2,3,4,-2,-1,0,10], 'grad':[4,3.16,1.83,2,2,1,3.5,5.4,-0.6]} )
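
For illustration, here is a minimal sketch of one way the expected column might be built without concat. It is not necessarily the best approach. It assumes every index value has at least two rows (np.gradient needs that) and that the rows within each index value are already in the intended order. The names labels, values, days and grad_values are only illustrative. The gradients are written into a plain numpy array by position, so the duplicate index labels never take part in any alignment.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    index=[1, 1, 1, 1, 1, 2, 2, 2, 2],
    data={
        "value": [1, 5, 8, 10, 12, 1, 2, 8, 2],
        "diff_day": [-1, 0, 2, 3, 4, -2, -1, 0, 10],
    },
)

# Work on plain numpy arrays so that everything is positional.
labels = df.index.to_numpy()
values = df["value"].to_numpy(dtype=float)
days = df["diff_day"].to_numpy(dtype=float)

grad_values = np.empty(len(df), dtype=float)
for label in pd.unique(labels):
    mask = labels == label  # rows belonging to this index value
    # gradient of value with respect to diff_day inside this group
    grad_values[mask] = np.gradient(values[mask], days[mask])

result = df.copy()
# Assigning a numpy array is positional, so the duplicate index is left untouched.
result["grad"] = grad_values
print(result)
```

On the sample data this should reproduce the expected "grad" values up to rounding, and the original index is kept as it is.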