Is there a way to find the indices of change of factors in a column with R? For example:
x <- c("aaa", "aaa", "aaa", "bbb", "bbb", "ccc", "ddd")
would return 3, 5, 6
You could try to compare shifted vectors, e.g.
which(x[-1] != x[-length(x)])
## [1] 3 5 6
This works on both character vectors and factors.
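As a quick check of that claim, here is a minimal sketch that runs the same comparison on the character vector and on a factor built from it. The names f, idx_chr and idx_fct are only illustrative and not part of the original answer.

```r
x <- c("aaa", "aaa", "aaa", "bbb", "bbb", "ccc", "ddd")
f <- factor(x)  # same data stored as a factor

# Compare every element with its successor; TRUE marks a change.
idx_chr <- which(x[-1] != x[-length(x)])
idx_fct <- which(f[-1] != f[-length(f)])

idx_chr
## [1] 3 5 6
idx_fct
## [1] 3 5 6
```

Both subsets of f keep the full set of levels, which is why the factor comparison is allowed here.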
which(!!diff(as.numeric(x)))
[1] 3 5 6
This assumes that x really is a factor. Factors are stored internally as integer codes, so diff returns a non-zero value at every change (not necessarily 1, since the codes follow the order of the levels). The double negation !! then coerces zeroes to FALSE and all other numbers to TRUE, and which locates the TRUE values, i.e. the non-zero differences.
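To make the intermediate steps visible, here is a small sketch that spells them out. The names codes, steps, idx, g and y are illustrative, and the factor(y) conversion for character input is my own addition rather than part of the answer.

```r
x <- factor(c("aaa", "aaa", "aaa", "bbb", "bbb", "ccc", "ddd"))

codes <- as.integer(x)   # internal codes: 1 1 1 2 2 3 4
steps <- diff(codes)     # differences:    0 0 1 0 1 1
idx <- which(steps != 0) # same result as which(!!steps)
idx
## [1] 3 5 6

# The difference is not always 1: levels are sorted, so codes can jump.
g <- factor(c("a", "c", "c", "b"))
diff(as.integer(g))
## [1]  2  0 -1

# A plain character vector has no codes, so convert it to a factor first.
y <- c("aaa", "aaa", "aaa", "bbb", "bbb", "ccc", "ddd")
idx_y <- which(diff(as.integer(factor(y))) != 0)
```

Calling as.numeric directly on a character vector like y would give NA values with a warning, which is why the factor assumption matters.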
rle can be used for this:
head(cumsum(rle(x)$lengths), -1)
[1] 3 5 6
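A step-by-step sketch of why this works, using illustrative names r and ends: rle gives the length of each run, the cumulative sum gives the index where each run ends, and the last run end is dropped because no change follows it.

```r
x <- c("aaa", "aaa", "aaa", "bbb", "bbb", "ccc", "ddd")

r <- rle(x)
r$lengths                 # run lengths
## [1] 3 2 1 1
r$values                  # value of each run
## [1] "aaa" "bbb" "ccc" "ddd"

ends <- cumsum(r$lengths) # index of the last element of each run
ends
## [1] 3 5 6 7

head(ends, -1)            # drop the final run end (end of the vector)
## [1] 3 5 6
```

As far as I know rle does not accept a factor directly, so for a factor column something like rle(as.character(x)) would be needed.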
With the dplyr::lag function:
library(dplyr)
which(x != lag(x)) - 1
# [1] 3 5 6