
I am working with pandas and have a DataFrame that looks something like this:

     LOCATION   TIME   Value       
0         AUS   2000   33.595673       
1         AUS   2001   57.862362
2         AUS   2002   58.588608
3          UK   2000   61.7
4          UK   2001   63.243232
5          UK   2002   66.235122

I want to add another column that lists the differences between consecutive rows in the Value column, but I want it to restart whenever LOCATION changes. So, in the example above, it should restart between rows 2 and 3.

Saad Shahid
  • What's your expected output? – ALollz Jul 29 '20 at 21:27
  • Does this answer your question? [Adding a column thats result of difference in consecutive rows in pandas](https://stackoverflow.com/questions/23142967/adding-a-column-thats-result-of-difference-in-consecutive-rows-in-pandas) – Trenton McKinney Jul 29 '20 at 21:29

2 Answers

# Difference between consecutive rows of Value, computed separately for each LOCATION
df['valuedif'] = df.groupby('LOCATION')['Value'].diff()
print(df)



  LOCATION  TIME      Value   valuedif
0      AUS  2000  33.595673        NaN
1      AUS  2001  57.862362  24.266689
2      AUS  2002  58.588608   0.726246
3       UK  2000  61.700000        NaN
4       UK  2001  63.243232   1.543232
5       UK  2002  66.235122   2.991890
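
To make the per-group restart explicit, here is a minimal sketch that should give the same column by subtracting the previous row's Value within each LOCATION. The frame is rebuilt from the question's sample data, and `prev_value` is only an illustrative name, not something from the original post.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "LOCATION": ["AUS", "AUS", "AUS", "UK", "UK", "UK"],
    "TIME": [2000, 2001, 2002, 2000, 2001, 2002],
    "Value": [33.595673, 57.862362, 58.588608, 61.7, 63.243232, 66.235122],
})

# Previous Value within the same LOCATION (NaN on the first row of each group)
prev_value = df.groupby("LOCATION")["Value"].shift()

# Subtracting it gives the row-to-row difference, restarting at each new LOCATION
df["valuedif"] = df["Value"] - prev_value
print(df)
```

Because the shift never crosses a LOCATION boundary, the first row of every group has no previous value and ends up as NaN.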
wwnde

If I understand correctly what you're looking for, the solution is:

df.groupby("LOCATION").diff()

The output is:

   TIME      Value
0   NaN        NaN
1   1.0  24.266689
2   1.0   0.726246
3   NaN        NaN
4   1.0   1.543232
5   1.0   2.991890
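
Note that this returns a new frame holding the differences for every non-grouping column rather than adding a column to `df`. One possible way to attach the result to the original rows, sketched here with the question's sample data and an assumed `_diff` suffix for the new column names:

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "LOCATION": ["AUS", "AUS", "AUS", "UK", "UK", "UK"],
    "TIME": [2000, 2001, 2002, 2000, 2001, 2002],
    "Value": [33.595673, 57.862362, 58.588608, 61.7, 63.243232, 66.235122],
})

# Per-LOCATION differences for TIME and Value, renamed so they do not
# collide with the original columns
diffs = df.groupby("LOCATION").diff().add_suffix("_diff")

# The result keeps the original row index, so a plain index join lines it up
out = df.join(diffs)
print(out)
```

If only the Value difference is wanted, selecting `diffs["Value_diff"]` before joining avoids carrying the TIME difference along.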
Roy2012