I'm working with 3 columns ('time', 'SEC', 'DeviceName') in a pandas DataFrame. I'm using the following code to calculate the differences between rows in the 'time' column and assign to the 'SEC' column:

df['SEC'] = df['time'].diff().dt.total_seconds()

The 'DeviceName' column can have several different devices, so I need to modify this to only perform the calculation if the device name matches the previous row, otherwise assign a 0 to 'SEC'.

For example:

time                    SEC       DeviceName
4/18/2023 2:43:00                 Applied_AA-12
4/18/2023 3:13:00       1800      Applied_AA-12  # calculate because the device name matches the previous row
4/18/2023 3:35:53       0         Applied_AA-14  # don't calculate because the device name doesn't match the previous row
4/18/2023 3:36:03       10        Applied_AA-14  # calculate because the device name matches the previous row
Luis
  • Can you make a [minimal-reproducible-example](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) and show the *matching expected output* ? – Timeless May 17 '23 at 15:59

1 Answer

You can use GroupBy.diff:

df["SEC"] = df.groupby("DeviceName")["time"].diff().dt.total_seconds().fillna(0)

df.at[0, "SEC"] = np.nan  # optional: leaves the very first row empty, as in the question's example (requires `import numpy as np`)

Output:

print(df)

                 time     DeviceName     SEC
0 2023-04-18 02:43:00  Applied_AA-12     NaN
1 2023-04-18 03:13:00  Applied_AA-12 1800.00
2 2023-04-18 03:35:53  Applied_AA-14    0.00
3 2023-04-18 03:36:03  Applied_AA-14   10.00
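
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The sample frame is rebuilt from the question's four rows, and the `SEC_prev_row` column is a hypothetical name added only for illustration. It also shows a variant that, as an assumption about what the question literally asks for, compares each row only with the row immediately before it rather than with the previous row of the same device.

```python
import pandas as pd

# Sample data rebuilt from the question's example rows
df = pd.DataFrame(
    {
        "time": pd.to_datetime(
            [
                "2023-04-18 02:43:00",
                "2023-04-18 03:13:00",
                "2023-04-18 03:35:53",
                "2023-04-18 03:36:03",
            ]
        ),
        "DeviceName": [
            "Applied_AA-12",
            "Applied_AA-12",
            "Applied_AA-14",
            "Applied_AA-14",
        ],
    }
)

# Per-device difference: the first row of each device has no
# predecessor, so diff() yields NaN there, which fillna(0) turns into 0.
df["SEC"] = (
    df.groupby("DeviceName")["time"].diff().dt.total_seconds().fillna(0)
)

# Variant (illustrative): only keep the difference when the device name
# equals the one on the immediately preceding row; otherwise assign 0.
same_as_prev = df["DeviceName"].eq(df["DeviceName"].shift())
df["SEC_prev_row"] = df["time"].diff().dt.total_seconds().where(same_as_prev, 0)

print(df)
```

On this sample both columns agree. They can differ when rows of the same device are not contiguous: the groupby version measures the gap to that device's previous row wherever it sits, while the shift-based version assigns 0 whenever the row directly above belongs to another device.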
Timeless
  • I edited my question because I needed to add an extra element to the calculation. – Luis May 18 '23 at 16:22
  • You should instead open a new question (*as you often do*), because the edit you made turns your question into a moving target and my answer no longer makes sense to the reader. – Timeless May 18 '23 at 16:37