Create unique identifier based on conditions - identifying sequence

Question

I have a variable x. I would like to create a variable z that holds a unique identifier (here a letter, but it could be a number) according to these conditions:

1. a new unique identifier if the previous observation is zero and the current one is one,
2. the same identifier as the previous observation if both the current and previous values are one,
3. NA if the current observation is zero,
4. (same as 1.) a new unique identifier if the previous observation is zero and the current one is one, and so on:
```
x    z 
0    NA
1    A
1    A
1    A
0    NA
1    B
1    B
0    NA
```

Does anyone have an idea how to do this?

Almost duplicate of https://stackoverflow.com/questions/37809094/create-group-names-for-consecutive-values Try: `z <- cumsum(c(1, diff(x) != 0)); z[ x == 0 ] <- NA; z` — zx8754, Oct 23 '17 at 21:07

score 1 · Answer 1 · answered Oct 23 '17 at 20:51

```
library(data.table)
x = c(0, 1, 1, 1, 0, 1, 1, 0)

ifelse(x == 0, NA, rleid(x))
# [1] NA  2  2  2 NA  4  4 NA
```

You can relabel them if you'd like, with factor for example. This assumes that your input is always 0 or 1.
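As a rough illustration of the relabelling step, the sketch below starts from the ids printed above (typed in by hand so it runs without data.table) and maps them to letters via `factor`. The variable names `ids`, `codes` and `z` are only illustrative, and `LETTERS` covers at most 26 runs.

```r
# Ids as printed above for x = c(0, 1, 1, 1, 0, 1, 1, 0)
ids <- c(NA, 2L, 2L, 2L, NA, 4L, 4L, NA)

# factor() gives the distinct non-NA ids the codes 1, 2, ... in sorted order;
# NA entries stay NA
codes <- as.integer(factor(ids))

# Index the built-in LETTERS vector; an NA index yields NA
z <- LETTERS[codes]
z
```

With more than 26 runs, `LETTERS[codes]` would return NA for the extra runs, so the integer codes themselves may be the safer label in that case.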

Gregor Thomas

Exactly what I was looking for! Thank you very much. – Sarah Oct 23 '17 at 21:05

score 1 · Answer 2 · answered Oct 23 '17 at 20:59

```
x = c(0, 1, 1, 1, 0, 1, 1, 0)
replace(cumsum(x == 0), x == 0, NA)
#[1] NA  1  1  1 NA  2  2 NA
```
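Because `cumsum(x == 0)` counts zeros, the ids stay unique but can skip numbers when several zeros occur in a row. If consecutive numbering matters, one possible base R sketch is to count the 0-to-1 transitions instead. It assumes the input contains only 0 and 1, and `run_id` is a hypothetical helper name, not a function from any package.

```r
# run_id() is a hypothetical helper, assuming x contains only 0 and 1
run_id <- function(x) {
  prev  <- c(0, head(x, -1))    # previous observation; treat "before the start" as 0
  start <- x == 1 & prev == 0   # TRUE where a new run of ones begins
  id <- cumsum(start)           # run counter: 1 for the first run, 2 for the second, ...
  id[x == 0] <- NA              # zeros get no identifier
  id
}

run_id(c(0, 1, 1, 1, 0, 1, 1, 0))
run_id(c(0, 0, 1, 1, 0, 0, 1))
```

For the second input the zero-counting version would give ids 2 and 4, while this sketch gives 1 and 2.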

d.b
