Good Morning,

I have the following dataframe:

print(df)

a    b    
1    6   
1    4   
4    5
4    2
...

And I would like to get:

print(final_df)

    a    b   c 
    1    6   2
    1    4   2
    4    5   3
    4    2   3
    ...

I tried using:

df["c"] = df.groupby("a")["b"].transform(np.diff)

It works on a small test set with two rows, but whenever I try to run it on the whole dataset, it returns:

ValueError: Wrong number of items passed 0, placement implies 1

How can I create final_df?

Alessandro Ceccarelli
  • I'm not sure what you're asking; your example works fine for me (though `c` is negative). Are you suggesting that this fails with more than 2 rows? Please give the full traceback. – roganjosh Jun 16 '18 at 07:49
  • Exactly; when I run it on the full 580-lines dataset, it returns that error – Alessandro Ceccarelli Jun 16 '18 at 07:55
  • NumPy's `diff` will not work for you here. How about: `df["c"] = df.groupby("a")["b"].diff().abs().fillna(0)` – Fourier Jun 16 '18 at 07:55
  • Ok, but can you please give the full traceback? 580 lines is not a huge amount to eye-ball on your side; are there any values in those columns that might be causing this error? – roganjosh Jun 16 '18 at 07:57

1 Answer

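I think this explains your problem: diff vs np.diff. `np.diff` returns one element fewer than its input, while the pandas `diff` keeps the original length and puts NaN in the first row of each group.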

And this might give you what you want:

df["c"] = df.groupby("a")["b"].diff().abs().fillna(0)
Fourier
  • It does not act like transform, since it does not produce the same value for every row of a group; nonetheless, I managed to solve my issue with your piece of code. Thanks – Alessandro Ceccarelli Jun 16 '18 at 08:07
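
For later readers, here is a minimal sketch of the difference discussed above. It is not part of the original answer. The sample frame, including the single-row group a = 7, and the names short, per_row and spread are assumptions made for illustration. The full 580-row dataset was never posted, so a group that does not have exactly two rows is only a plausible cause of the error, not a confirmed one.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the four rows from the question plus one single-row group.
df = pd.DataFrame({"a": [1, 1, 4, 4, 7], "b": [6, 4, 5, 2, 9]})

# np.diff returns one element fewer than its input, so a one-row group
# yields an empty array that transform cannot place back into the column.
short = [len(np.diff(np.array([6, 4]))), len(np.diff(np.array([9])))]

# The pandas groupby diff keeps the original length; the first row of each
# group is NaN, which the answer's code then fills with 0.
per_row = df.groupby("a")["b"].diff().abs().fillna(0)

# Assumption: to get the same value on every row of a group, as in final_df,
# broadcast max - min, which equals the absolute difference for two-row groups.
spread = df.groupby("a")["b"].transform(lambda s: s.max() - s.min())

print(df.assign(per_row=per_row, c=spread))
```

For groups with more than two rows, max minus min is the range of b rather than a pairwise difference, so this may or may not match what is wanted on the real data.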