I have a df with an ID column and another column coding 1 or 0 depending on which response the person made. I'd like to create a third, new column that contains a counter that gets reset to 1 and starts counting again each time a person changes the response. Also, the counter should be reset whenever the ID changes. Here is a reproducible df & an example of how the output should look like:
df <- data.frame(ID = c(rep("abc546", 7), rep("azg467", 7),
rep("c7hg6", 7)),
response=c(1,1,1,0,0,1,0,0,1,0,0,0,1,0,1,0,0,0,1,1,0))
df
Desired output:
ID response counter
1 abc546 1 1
2 abc546 1 2
3 abc546 1 3
4 abc546 0 1
5 abc546 0 2
6 abc546 1 1
7 abc546 0 1
8 azg467 0 1
9 azg467 1 1
10 azg467 0 1
11 azg467 0 2
12 azg467 0 3
13 azg467 1 1
14 azg467 0 1
15 c7hg6 1 1
16 c7hg6 0 1
17 c7hg6 0 2
18 c7hg6 0 3
19 c7hg6 1 1
20 c7hg6 1 2
21 c7hg6 0 1
I tried adapting a function from another question I posted a while back ( Counting unequal elements in-between equal elements in R df column ), that also required consideration and resetting the counter based on a persons's id, but this considered a different rule and so far I failed to adapt it in a way that would provide the correct output (attempt 1). Attempt 1 will reset the counter back to 1 when the response changes, but on the next trial, the counter will continue with the overall count.
I also read several SO postings such as this one: Numbering rows within groups in a data frame and tried several things based on these posts (e.g. attempt 2 see below, which resets the counter and overall gives the output I want but does not consider the ID variable and hence does not reset the counter once a new ID starts). I also tried to combine attempt 2 with a groupby function and other things but I never get the output I need.
Attempt 1:
my.function <- function(x) {
ret <- rep(1, length(x)) # start counter at 1
for (i in 2:length(x)) {
for (j in (i-1):1) {
if (x[j] == x[i]) {
ret[i] = i
}
else {
ret[i] = 0
}
break
}
}
return(ret)
}
df %>%group_by(ID) %>%mutate(counter = my.function(response))
Attempt 2:
sequence(rle(as.character(df$response))$lengths)
I hope you can help with this problem, as I'm out of ideas of what else to try. In case I should have overlooked an SO entry that solves my problem, it would be great if you could point me there. Many thanks!!!