I need to write a search function in R that finds the start and end positions of certain elements in a large dataset.
My sample dataset looks like this:
C1 C2 Index
aa J 1
aa J 2
aa J 3
ab O 4
aa O 5
aa J 6
aa J 7
aa J 8
aa J 9
aa K 10
ac K 11
aa J 12
aa J 13
I want to write a search function like search("aa", "J"), where "aa" is a value from the C1 column and "J" is a value from the C2 column.
The function should first subset the dataset to the rows where C1 is "aa", then find each run of consecutive "J" values in C2 and report its start and end positions within that subset (not the original Index values).
The result should be a matrix with one row per run, like below:
     [,1] [,2]
[1,]    1    3
[2,]    5    8
[3,]   10   11
Thank you very much.
I tried to modify the provided code, but there is an error. Could you please take a look?
get_inds <- function(test, C1, C2) {
  test <- subset(test, test$C1 == C1)
  inds <- rle(test$C1 == C1 & test$C2 == C2)
  end = cumsum(inds$lengths)
  start = c(1, head(end, -1) + 1)
  data.frame(start, end)[inds$values, ]
}
get_inds(test, 'aa', 'J')
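One likely cause: subset() evaluates its condition inside the data frame, so the C1 in test$C1 == C1 resolves to the column C1 rather than to the function argument. The comparison is then always TRUE and no rows are dropped, so the runs are computed on the full data. Below is a minimal sketch of one possible fix, not a definitive implementation. It assumes the data frame is called test with character columns C1 and C2 as in the sample. The argument names data, c1 and c2 are my own choice, picked so they cannot clash with the column names.

```r
# Sample data from the question, rebuilt by hand (assumed to be a data frame named `test`).
test <- data.frame(
  C1 = c("aa", "aa", "aa", "ab", "aa", "aa", "aa", "aa", "aa", "aa", "ac", "aa", "aa"),
  C2 = c("J", "J", "J", "O", "O", "J", "J", "J", "J", "K", "K", "J", "J"),
  Index = 1:13,
  stringsAsFactors = FALSE
)

get_inds <- function(data, c1, c2) {
  # Bracket indexing compares the column against the argument value,
  # avoiding the name clash that occurs inside subset().
  sub <- data[data$C1 == c1, , drop = FALSE]

  # Run-length encode the match on C2 within the subset.
  runs <- rle(sub$C2 == c2)

  # End of each run is the cumulative length; start follows from the run length.
  end <- cumsum(runs$lengths)
  start <- end - runs$lengths + 1L

  # Keep only the runs where the match is TRUE; return a two-column matrix.
  cbind(start, end)[runs$values, , drop = FALSE]
}

get_inds(test, "aa", "J")
```

The positions returned are relative to the subset, which is what the expected output in the question uses. The columns of the returned matrix are named start and end rather than [,1] and [,2]; wrapping the result in unname() would drop those names if needed.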