
I have a dataframe (df). My goal is to add a new column ("grad") that holds the gradient between points, computed separately for each group of rows that share the same index value.

First, I didn't find an easy way to do this with pandas alone, so for now I use numpy together with pandas. I have written a function that computes the gradient for each row within a group. It works, but it is not pretty and feels a bit wonky.

Second, I want to add the resulting pandas Series of numpy arrays to the df, but I don't know how. I tried stacking them into a single Series of 9 values (grouped_2), but when I use concat I get the following message: "ValueError: Shape of passed values is (48, 3), indices imply (16, 3)". According to a previous question, I think the duplicate index values are the problem, but I can't modify the index of my first df.

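import numpy as np
import pandas as pd

df = pd.DataFrame(index = [1,1,1,1,1,2,2,2,2], 
                  data={'value': [1,5,8,10,12,1,2,8,2], 'diff_day':[-1,0,2,3,4,-2,-1,0,10]} )

def grad(gr):
    # gradient of 'value' with respect to the (unevenly spaced) 'diff_day' coordinates
    val = gr['value']
    dif = gr['diff_day']
    return np.gradient(val, dif)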


grouped_1 = df.groupby(level=0).apply(grad)
grouped_2 = pd.DataFrame(grouped_1.values.tolist(), index=grouped_1.index).stack().reset_index(drop=True)
result = pd.concat([df, grouped_2], axis=1)

My expectation was the following dataframe:

pd.DataFrame(index = [1,1,1,1,1,2,2,2,2], 
                      data={'value': [1,5,8,10,12,1,2,8,2], 'diff_day':[-1,0,2,3,4,-2,-1,0,10], 'grad':[4,3.16,1.83,2,2,1,3.5,5.4,-0.6]} )

1 Answer


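Here's a simple way. Note that it treats the whole column as one series and ignores the groups, so the rows on either side of a group boundary will not match the per-group values you expect: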

df['grad'] = np.gradient(df['value'], df['diff_day'])
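If the gradient has to be computed separately for each index value, one possible sketch is to loop over the unique index values and fill a result array by position. The names `values`, `days`, `grad_values` and `mask` are just illustrative, and this assumes every group has at least two rows, which `np.gradient` needs:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    index=[1, 1, 1, 1, 1, 2, 2, 2, 2],
    data={
        "value": [1, 5, 8, 10, 12, 1, 2, 8, 2],
        "diff_day": [-1, 0, 2, 3, 4, -2, -1, 0, 10],
    },
)

# Work on plain numpy arrays so the duplicate index never takes part in alignment.
values = df["value"].to_numpy(dtype=float)
days = df["diff_day"].to_numpy(dtype=float)

grad_values = np.empty(len(df))
for key in df.index.unique():
    mask = df.index == key  # boolean array selecting the rows of this group
    # Gradient of value with respect to diff_day, within this group only.
    grad_values[mask] = np.gradient(values[mask], days[mask])

# Assigning a numpy array is positional, so the original index is kept as-is.
df["grad"] = grad_values
```

Because the assignment is positional, this should also work when the rows of a group are not contiguous.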

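To make your solution work, reset the index on both objects so that concat aligns them by position instead of by the duplicated labels (this discards the original index):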

result = pd.concat([df.reset_index(drop=True), grouped_2.reset_index(drop=True)], axis=1)
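If the original index has to be kept, a possible variant is to skip concat and attach the per-group arrays as a plain numpy array. This is only a sketch: `flat` is an illustrative name, and it assumes the rows of each group are contiguous and the groups appear in sorted index order, as they do in the example df.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    index=[1, 1, 1, 1, 1, 2, 2, 2, 2],
    data={
        "value": [1, 5, 8, 10, 12, 1, 2, 8, 2],
        "diff_day": [-1, 0, 2, 3, 4, -2, -1, 0, 10],
    },
)

def grad(gr):
    # Gradient of 'value' with respect to 'diff_day' for one group.
    return np.gradient(gr["value"].to_numpy(), gr["diff_day"].to_numpy())

# One numpy array per index value, as in the question.
grouped_1 = df.groupby(level=0).apply(grad)

# Join the per-group arrays end to end; the order follows the sorted group keys.
flat = np.concatenate(grouped_1.tolist())

# A numpy array is assigned by position, so the duplicate index is not an issue.
result = df.assign(grad=flat)
```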
YOLO