
Consider a monotonically increasing integer sequence such as:

x <- c(0, 3, 5, 8, 10, 16, 18, 35, 36)

I would like to group these values by their distance from the first value of the current group. A value within 4 of that first value belongs to the same group. The first value more than 4 away starts a new group, and the distance is then measured from that new starting value (that is, it "resets").

#    x desired_group
# 1  0             0
# 2  3             0
# 3  5             1
# 4  8             1
# 5 10             2
# 6 16             3
# 7 18             3
# 8 35             4
# 9 36             4

{0, 3} go together because both are within 4 of 0. Once we reach 5, the grouping resets and 5 starts a new group. In particular, floor(x / 4) will not work, because its bins have fixed boundaries and do not "reset" at the start of each group.
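
The reset rule described above can be sketched as a plain loop. This is only an illustration under one reading of the question: each group is anchored at its first value, and a new group starts as soon as a value is more than `gap` away from that anchor. The function name `group_with_reset` and the parameter `gap` are hypothetical and do not come from the original post.

```r
# Hypothetical helper (not from the original post): assign 0-based group ids,
# starting a new group whenever a value is more than `gap` away from the
# first value (the "anchor") of the current group.
# Assumes `x` is sorted in increasing order.
group_with_reset <- function(x, gap = 4) {
  groups <- integer(length(x))
  if (length(x) == 0) return(groups)

  anchor <- x[1]  # first value of the current group
  id <- 0L        # current group id

  for (i in seq_along(x)) {
    if (x[i] - anchor > gap) {
      anchor <- x[i]  # reset: this value starts a new group
      id <- id + 1L
    }
    groups[i] <- id
  }
  groups
}

x <- c(0, 3, 5, 8, 10, 16, 18, 35, 36)
group_with_reset(x)
#[1] 0 0 1 1 2 3 3 4 4
```

Because the anchor moves with the data rather than sitting on fixed multiples of the gap, the same loop should carry over to other thresholds by changing `gap`.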

JasonAizkalns
  • Could you test `v1 <- x %/% 5; match(v1, unique(v1)) - 1 #[1] 0 0 1 1 2 3 3 4 4` – akrun Nov 19 '16 at 13:29
  • What if you have `x <- c(0, 3, 5, 8, 10, 16, 18, 35, 36, 789, 22)` – akrun Nov 19 '16 at 13:30
  • still works...no? I get `[1] 0 0 1 1 2 3 3 4 4 5 6` – Sotos Nov 19 '16 at 13:31
  • If your sequence always starts at zero and increases monotonically, through integers, you should probably say that. Currently, the answers seem to be lacking in generality by relying on it... – Frank Nov 19 '16 at 13:31
  • Regardless, I edited, the sequence is monotonically increasing – JasonAizkalns Nov 19 '16 at 13:31
  • @Sotos sorry, my comment was directed to the OP (didn't test your code) – akrun Nov 19 '16 at 13:32
  • Oh ok. No problem @akrun – Sotos Nov 19 '16 at 13:33
  • @Sotos I think I like your solution a bit better as it's a touch easier to implement elsewhere (e.g. SQL). – JasonAizkalns Nov 19 '16 at 13:42
  • @JasonAizkalns found it! I didn't post it as an answer because I knew I had seen it somewhere else...[here you go](http://stackoverflow.com/questions/37809094/create-group-names-for-consecutive-values/37809368#37809368)... Turns out I also have the `rleid` answer on that post – Sotos Nov 19 '16 at 13:47
  • @Sotos Not sure how the dupe link would answer the OP's question. – akrun Dec 01 '16 at 17:57
  • @akrun, agree -- Moderators, feel free to close and I will attempt to ask again and be more clear. – JasonAizkalns Dec 01 '16 at 17:58
  • @akrun Attempting a clarifying question [here](http://stackoverflow.com/q/40917388/2572423) – JasonAizkalns Dec 01 '16 at 18:04
  • @akrun I duped the Q after the OP had commented that the answer I wrote in comments (deleted it on account of the dupe) worked for him. The dupe clearly shows the function needed for the current Q, with a slight alteration to account for a diff of 4. – Sotos Dec 01 '16 at 19:33
  • @Sotos If that is the case, the OP wouldn't have posted another question to get this straight – akrun Dec 01 '16 at 19:35
  • @akrun the dupe happened on Nov. 19th. The new question happened an hour ago. I am not an oracle. – Sotos Dec 01 '16 at 19:37
  • @Sotos I meant [this](http://stackoverflow.com/questions/40917388/sessionize-a-sequence-of-numbers-into-groups-that-reset-once-a-cumulative-thresh). Yes, the dupe might have happened some time back. What I am saying is that if it had not been closed, the OP might have got another answer on this post – akrun Dec 01 '16 at 19:38
  • Exactly...an hour ago. If your well-received answer (which is great, and one vote is mine) here did indeed address the question correctly (until yesterday's comment from the OP on it), then the dupe was spot on. Anyway, I think it's a clarification error on the OP's side – Sotos Dec 01 '16 at 19:41

1 Answer


We can try with

v1 <- x %/% 5                # integer-divide into fixed-width bins of 5
match(v1, unique(v1)) - 1    # renumber the occupied bins consecutively from 0
#[1] 0 0 1 1 2 3 3 4 4
akrun
  • This fails when trying to extend the idea to groups of 240 for the following: `x <- c(0, 4779, 4796, 4816, 24939, 26436, 26476, 45062, 242405, 242423, 242433, 242458)` -- using `v1 <- x %/% 241`. Look at: `x[12] - x[9]`, and then similarly, trying to change to groups of 60, `x[4] - x[2]` becomes problematic. – JasonAizkalns Nov 30 '16 at 17:28
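
The failure reported in the comment above can be reproduced with a short sketch. It applies the answer's fixed-width binning with a divisor of 241 to the commenter's data. The variable name `fixed_bins` is introduced here for illustration only.

```r
x <- c(0, 4779, 4796, 4816, 24939, 26436, 26476, 45062,
       242405, 242423, 242433, 242458)

# Fixed-width binning as in the answer, widened to 241
v1 <- x %/% 241
fixed_bins <- match(v1, unique(v1)) - 1
fixed_bins
#[1] 0 1 1 1 2 3 3 4 5 5 5 6

# x[9] and x[12] are only 53 apart, well within 240 ...
x[12] - x[9]
#[1] 53

# ... yet a bin boundary (241 * 1006 = 242446) falls between them,
# so they land in different groups.
fixed_bins[9] == fixed_bins[12]
#[1] FALSE
```

The bins sit on fixed multiples of 241 regardless of where a group begins, which is why values that are close together can still be split.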