Another attempt to calculate an 'elapsedSeconds' column based upon a condition applied to the previous 'payArea' value.
With this dataframe:
> 0 location_day timestamp payArea Name ratePay elapsedSeconds
> 1 2019-12-10 2019-12-10 12:00:01 Area1 Amy 12.25 365
> 2 2019-12-10 2019-12-10 12:00:06 Area1 Amy 12.25 5
> 3 2019-12-10 2019-12-10 12:00:16 Area1 Amy 12.25 10
> 4 2019-12-17 2019-12-17 15:00:16 Area2 Amy 12.25 10800
> 5 2019-12-17 2019-12-17 15:00:36 Area2 Amy 8.75 20
> 6 2019-12-17 2019-12-17 15:00:40 Area2 Amy 12.25 4
> 7 2019-12-17 2019-12-17 15:00:50 Area2 Amy 8.75 10
> 8 2019-12-17 2019-12-17 15:01:10 Area1 Amy 12.25 380
> 9 2019-12-17 2019-12-17 15:01:45 Area1 Amy 12.25 35
> 10 2019-12-17 2019-12-17 15:02:01 Area1 Amy 12.25 16
Desired logic to apply: IF 'payArea' is equal to the previous 'payArea' value THEN calculate difference between timestamp in seconds, else if 'payArea' is not equal to previous 'payArea' value then 'elapsedSeconds' = 0.
This should basically "reset" the elapsedSeconds = 0 at any point the condition is NOT met.
Desired output:
> 0 location_day timestamp payArea Name ratePay elapsedSeconds
> 1 2019-12-10 2019-12-10 12:00:01 Area1 Amy 12.25 0
> 2 2019-12-10 2019-12-10 12:00:06 Area1 Amy 12.25 5
> 3 2019-12-10 2019-12-10 12:00:16 Area1 Amy 12.25 10
> 4 2019-12-17 2019-12-17 15:00:16 Area2 Amy 12.25 0
> 5 2019-12-17 2019-12-17 15:00:36 Area2 Amy 8.75 20
> 6 2019-12-17 2019-12-17 15:00:40 Area2 Amy 12.25 4
> 7 2019-12-17 2019-12-17 15:00:50 Area2 Amy 8.75 10
> 8 2019-12-17 2019-12-17 15:01:10 Area1 Amy 12.25 0
> 9 2019-12-17 2019-12-17 15:01:45 Area1 Amy 12.25 35
> 10 2019-12-17 2019-12-17 15:02:01 Area1 Amy 12.25 16
I was previously calculating the elapsedSeconds with this,
#first get the elapsedSeconds as datetime
df['elapsedSeconds'] = df.sort('timestamp').groupby(['name','payArea'])['timestamp'].diff()/1000
#now convert the diff value as float type
df['elapsedSeconds'] = ((df['elapsedSeconds'] / np.timedelta64(1, 's')) *1000).astype(float)