I have a Spark DataFrame like this:
    timestamp            userId
    2016-07-26 12:05:00  a
    2016-07-26 12:05:01  b
    2016-07-26 12:05:02  c
    2016-07-26 12:05:03  d
    2016-07-26 12:05:04  e
    2016-07-26 12:05:05  f
I want to put rows whose timestamps are within 5 seconds of each other into the same group, like this:
    timestamp            userId  group
    2016-07-26 12:05:00  a       1
    2016-07-26 12:05:01  b       1
    2016-07-26 12:05:02  c       1
    2016-07-26 12:05:03  d       1
    2016-07-26 12:05:04  e       1
    2016-07-26 12:05:05  f       2
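To pin down the rule I mean, here is a minimal sketch in plain Python (not Spark). It assumes fixed 5-second windows anchored at the earliest timestamp, which is only my reading of the example above. `assign_groups` and `window_seconds` are illustrative names, not part of any Spark API.

```python
from datetime import datetime


def assign_groups(timestamps, window_seconds=5):
    """Return a 1-based group number for each timestamp string.

    Assumption: groups are fixed windows of `window_seconds`,
    counted from the earliest timestamp in the input.
    """
    parsed = [datetime.strptime(t, "%Y-%m-%d %H:%M:%S") for t in timestamps]
    start = min(parsed)  # anchor the first window at the earliest row
    return [
        int((t - start).total_seconds() // window_seconds) + 1
        for t in parsed
    ]


rows = [
    ("2016-07-26 12:05:00", "a"),
    ("2016-07-26 12:05:01", "b"),
    ("2016-07-26 12:05:02", "c"),
    ("2016-07-26 12:05:03", "d"),
    ("2016-07-26 12:05:04", "e"),
    ("2016-07-26 12:05:05", "f"),
]

groups = assign_groups([ts for ts, _ in rows])
for (ts, user_id), group in zip(rows, groups):
    print(ts, user_id, group)
```

The sketch only illustrates the bucketing arithmetic on a local list; what I need is the same result computed on the Spark DataFrame itself.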
Is there a way to do this without converting the Spark DataFrame into an R data.frame?