I'm quite new to R, and while I have done some data wrangling with it, I am completely at a loss as to how to tackle this problem. Searching Google and SO hasn't gotten me anywhere so far. If this is a duplicate, I apologize; please point me to the right solution.
I have a data frame df with two columns, id and seq, like so:
set.seed(12)
id <- rep(1:2, 10)
seq <- sample(1:4, 20, replace = TRUE)
df <- data.frame(id, seq)
df <- df[order(df$id), ]  # sort by id so each id's rows are contiguous
   id seq
1   1   1
3   1   4
5   1   1
7   1   1
9   1   1
11  1   2
13  1   2
15  1   2
17  1   2
19  1   3
2   2   4
4   2   2
6   2   1
8   2   3
10  2   1
12  2   4
14  2   2
16  2   2
18  2   3
20  2   1
I need to count, for each element in the seq column, the number of unequal elements between it and the previous occurrence of the same value, e.g. how many elements lie between one 1 and the next 1, or between one 3 and the next 3. The first occurrence of a value should be NA, because there is no earlier occurrence to count from. If the previous element is identical, the result should simply be 0, as there is no unequal element in between (e.g. 1 followed by 1). The results should be written to a new column, e.g. delay.
One catch is that the count has to start over whenever a new id begins in the id column (here: 1 and 2).
This is what I would love to have as output:
   id seq delay
1   1   1    NA
3   1   4    NA
5   1   1     1
7   1   1     0
9   1   1     0
11  1   2    NA
13  1   2     0
15  1   2     0
17  1   2     0
19  1   3    NA
2   2   4    NA
4   2   2    NA
6   2   1    NA
8   2   3    NA
10  2   1     1
12  2   4     4
14  2   2     4
16  2   2     0
18  2   3     4
20  2   1     4
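To make the rule concrete, here is a minimal base R sketch of one way it might be computed; it is an illustration under stated assumptions, not necessarily the best approach. The helper name gap_since_last is hypothetical, and the seq values are typed in from the table above instead of being regenerated, since sample() can return different values for the same seed depending on the R version.

```r
# Values copied from the example table (assumption: these are the intended data).
# Row names here are 1..20 rather than the 1, 3, 5, ... shown above.
df <- data.frame(
  id  = rep(1:2, each = 10),
  seq = c(1, 4, 1, 1, 1, 2, 2, 2, 2, 3,
          4, 2, 1, 3, 1, 4, 2, 2, 3, 1)
)

# Hypothetical helper: for each element, the number of elements strictly
# between it and the previous occurrence of the same value (NA if none).
gap_since_last <- function(x) {
  pos <- seq_along(x)
  # Within each distinct value, diff() of the positions is the distance to the
  # previous occurrence; subtracting 1 leaves the count of elements in between.
  ave(pos, x, FUN = function(p) c(NA_integer_, diff(p) - 1L))
}

# Apply the helper separately per id so the count restarts with each new id.
df$delay <- ave(df$seq, df$id, FUN = gap_since_last)
df
```

The nested ave() calls do the grouping twice: the outer one splits the rows by id, and the inner one splits the positions by value, so each value only ever looks back at its own earlier occurrences within the same id.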
I really hope someone can help me figure this out and allow me to learn more about this.