I'm looking for R code that subsets a data frame a, keeping only the rows whose values in one column match patterns held in another vector k.
For example, consider:
x <- c("a", "b", "c")
y <- 1:3
z <- c("foo", "bar", "null")
a <- data.frame(x, y, z)
a
# x y z
#1 a 1 foo
#2 b 2 bar
#3 c 3 null
Suppose that I have a character vector k that I want to use to subset a, defined as
k <- c("b", "c")
If I use grepl with apply and sapply, I can get the rows that match k, which is what I want:
a[as.logical(apply(sapply(k, grepl, a$x), 1, sum)),]
# x y z
#2 b 2 bar
#3 c 3 null
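To make the one-liner easier to follow, here is the same computation broken into steps. The names m, keep and res_steps are introduced only for illustration, and rowSums stands in for apply(..., 1, sum); this is a sketch of what the code does, not a different method.

```r
x <- c("a", "b", "c")
y <- 1:3
z <- c("foo", "bar", "null")
a <- data.frame(x, y, z)
k <- c("b", "c")

# One logical column per pattern in k, one row per element of a$x
# (assumes a has more than one row, so sapply returns a matrix)
m <- sapply(k, grepl, a$x)

# TRUE for every row where at least one pattern matched
keep <- rowSums(m) > 0

# Same rows as the one-liner above
res_steps <- a[keep, ]
```

Each element of k triggers a separate pass of grepl over the whole column, so the work grows with length(k) times nrow(a).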
This code, however, is really slow when scaled up to large datasets. Is there a faster and simpler way of doing this?
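Two directions that might be worth trying, sketched under assumptions I have not benchmarked. If the entries of k are meant to equal the values of a$x exactly, %in% avoids regular expressions altogether. If they really are patterns, they could be collapsed into a single alternation so that grepl runs once over the column. The names res_exact, res_regex and pattern are hypothetical, and the second variant assumes k is non-empty and that its entries are valid regular expressions.

```r
x <- c("a", "b", "c")
y <- 1:3
z <- c("foo", "bar", "null")
a <- data.frame(x, y, z)
k <- c("b", "c")

# Variant 1: exact matching, no regular expressions involved
res_exact <- a[a$x %in% k, ]

# Variant 2: partial/pattern matching with one combined regex, "b|c"
pattern <- paste(k, collapse = "|")
res_regex <- a[grepl(pattern, a$x), ]
```

On this small example both variants return rows 2 and 3, but they are not interchangeable in general: %in% would not match a value such as "abc", while the regex version would.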
Thanks,
Rafael
EDIT: I tried my best to find an answer to this question on Stack Overflow. Since I could not find one, I can assure you that the wording used in this post is unique and therefore a contribution to the forum.