Create Group from Sequences

Question

How can I create a "group" vector that identifies runs of identical values in another vector?

From this

x <- c(0,1,0,0,1,0,1)

I want to create this

outcome <- c(1,2,3,3,4,5,6)

[1] 0 1 0 0 1 0 1
[1] 1 2 3 3 4 5 6

So, whenever a new run of identical values starts, there is a new group number (the label could also be something other than a number).

I know of ways to get there, but they are all hideous. The best I can come up with is

library(dplyr)  # lag()
library(tidyr)  # replace_na()
comparison <- x != lag(x)
cumsum(replace_na(comparison, TRUE))

but like I said - hideous. There must be a better way and I hope someone knows it.

Possible duplicate: [*How to create a consecutive index based on a grouping variable in a dataframe*](https://stackoverflow.com/q/6112803/2204410) — Jaap, Feb 16 '20 at 18:41
@Jaap I do not get it. Why do you close the question? The "duplicate" you linked does **not** answer this question here. Please, read more carefully before you close a question. — Georgery, Feb 17 '20 at 08:45

score 4 · Accepted Answer · answered Feb 16 '20 at 17:13

We can use rleid from data.table

library(data.table)
rleid(x)
#[1] 1 2 3 3 4 5 6

Or in base R with rle

with(rle(x), rep(seq_along(values), lengths))
#[1] 1 2 3 3 4 5 6
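
To see why the rle one-liner works, it may help to look at the intermediate pieces. The following is only a sketch that unpacks the same call step by step; the names runs and group are made up for illustration.

```r
x <- c(0, 1, 0, 0, 1, 0, 1)

# rle() compresses x into runs: the value of each run and how long it lasts
runs <- rle(x)
runs$values   # 0 1 0 1 0 1
runs$lengths  # 1 1 2 1 1 1

# Number the runs 1..k, then repeat each run number by its run length
group <- rep(seq_along(runs$values), runs$lengths)
group
#[1] 1 2 3 3 4 5 6
```

Because the run numbers come from seq_along, the result is an integer vector with one entry per element of x.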

Or, using an approach similar to the OP's

1 + cumsum(x != dplyr::lag(x, default = dplyr::first(x)))
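
The same compare-with-previous idea can probably be written without dplyr as well. This is a minimal base R sketch, assuming x is non-empty and contains no NA values; the names changed and group are illustrative only.

```r
x <- c(0, 1, 0, 0, 1, 0, 1)

# TRUE wherever an element differs from the one before it
changed <- x[-1] != x[-length(x)]

# The first element always opens group 1; each change opens a new group
group <- cumsum(c(TRUE, changed))
group
#[1] 1 2 3 3 4 5 6
```

Prepending TRUE plays the role of the NA replacement in the question: the first element has no predecessor, so it is simply treated as the start of a new group.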

score 2 · Answer 2 · answered Feb 16 '20 at 17:42

If x is always only 0s and 1s, another option is

cumsum(c(1, (x[-1] + head(x, -1)) %% 2))

[1] 1 2 3 3 4 5 6
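
The trick relies on parity: for 0/1 data, the sum of two neighbours is odd exactly when they differ. The sketch below spells that out and then shows one possible generalisation to arbitrary numeric vectors using diff; the vector y and the names neighbour_sum, group_x and group_y are hypothetical and not part of the original answer.

```r
x <- c(0, 1, 0, 0, 1, 0, 1)

# Sum of each element and its predecessor: 1 means "different", 0 or 2 means "same"
neighbour_sum <- x[-1] + head(x, -1)
group_x <- cumsum(c(1, neighbour_sum %% 2))
group_x
#[1] 1 2 3 3 4 5 6

# Assumed generalisation: for any numeric vector, a non-zero difference marks a change
y <- c(5, 5, 2, 2, 2, 9)
group_y <- cumsum(c(1, diff(y) != 0))
group_y
#[1] 1 1 2 2 2 3
```

The diff variant only makes sense for numeric input without NA values; for characters or factors, one of the rle-based approaches is likely the safer choice.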

Andrew Gustar

score 0 · Answer 3 · answered Feb 16 '20 at 18:26

A tidyverse version that flags each change with a condition, replaces the leading NA, and takes the cumulative sum:

library(tidyverse)

if_else(x == lag(x), 0, 1) %>%
  replace_na(1) %>% 
  cumsum()

[1] 1 2 3 3 4 5 6

nycrefugee
