I am working on a problem that involves data cleaning and calculation, as described below.
I have created a sample dataset here for a single unit, A.
The data is sorted by the timestamp column within each unit. There are other columns as well.
I need to get one row each time event_log_value_desc changes value. When the same value of event_log_value_desc repeats across consecutive rows, only the row with its first occurrence should be returned. As a result, event_log_value_desc should alternate between OFF and ON for each unit.
The program should return the following: