I have a dataset in a pyspark job that looks a bit like this:
frame_id  direction_change
1         False
2         False
3         False
4         True
5         False
I want to add a "track" counter to each row so that all the frames between direction changes have the same value. For example, the output I want looks like this:
frame_id  direction_change  track
1         False             1
2         False             1
3         False             1
4         True              2
5         False             2
I have been able to do this in Pandas with a cumulative sum over the boolean column (the + 1 just makes the counter start at 1, to match the output above):
frames['track'] = frames['direction_change'].cumsum() + 1
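For reference, here is a minimal self-contained version of that pandas approach (the sample data just mirrors the table above; my real DataFrame is larger):

```python
import pandas as pd

# Reproduce the input table from the question.
frames = pd.DataFrame({
    "frame_id": [1, 2, 3, 4, 5],
    "direction_change": [False, False, False, True, False],
})

# cumsum() treats False as 0 and True as 1, so each direction change
# bumps the running total; + 1 makes the first segment track 1.
frames["track"] = frames["direction_change"].cumsum() + 1

print(frames["track"].tolist())  # → [1, 1, 1, 2, 2]
```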
But I can't find an equivalent way to do it with Spark DataFrames. Any help would be much appreciated.