Get the indices for which any element of k matches a pattern in x[i] in R

Question

I'm looking for R code that subsets a data frame a to the rows whose values in a$x match any of the patterns in another vector k.

For example, consider

x <- c("a", "b", "c")
y <- 1:3
z <- c("foo", "bar", "null")
a <- data.frame(x, y, z)
a
#  x y    z
#1 a 1  foo
#2 b 2  bar
#3 c 3 null

Suppose that I have a vector k that I want to use to subset a, where k is defined as

k <- c("b", "c")

If I use grepl with apply and sapply, I can get the rows that match k, which is what I want.

a[as.logical(apply(sapply(k, grepl, a$x), 1, sum)),]

  x y    z
2 b 2  bar
3 c 3 null
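For reference, the one-liner above can be unpacked into its intermediate steps. This is only an illustrative breakdown on the sample data. The names m, keep and res_steps are introduced here and are not part of the original code, and rowSums is swapped in for apply(..., 1, sum) as an equivalent way of counting matches per row.

```r
x <- c("a", "b", "c")
y <- 1:3
z <- c("foo", "bar", "null")
a <- data.frame(x, y, z)
k <- c("b", "c")

# One column per pattern in k, one row per element of a$x:
# m[i, j] is TRUE when pattern k[j] is found in a$x[i].
m <- sapply(k, grepl, a$x)

# A row is kept when at least one pattern matched it.
keep <- rowSums(m) > 0

res_steps <- a[keep, ]
```

Each element of k triggers a separate grepl pass over a$x, so the work grows with length(k) times nrow(a), which is likely a large part of why this gets slow on big inputs.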

This code, however, is REALLY slow when scaled up to large datasets. Is there a faster and simpler way of doing this?

Thanks,

Rafael

EDIT: I tried my best to find the answer to this question on Stack Overflow. Since I could not find it, I can assure you that the wording used in this post is unique and is therefore a contribution to the forum.

Please see my edit regarding duplicates. – user2171927 Aug 07 '17 at 21:07

score 4 · Accepted Answer · answered Aug 07 '17 at 16:38

A simple way in base R is to use %in%:

a[ a$x %in% k , ]
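A quick check of this on the sample data from the question. Note that %in% tests for exact equality rather than regular-expression matches, so it returns the same rows as the grepl version only when the entries of k are meant as whole values, as they are here. If genuine patterns are needed, one possible alternative (not part of this answer) is to join them with | into a single regular expression. That sketch assumes the entries of k contain no regex metacharacters that would need escaping. The names res_in, pattern and res_re are made up for illustration.

```r
x <- c("a", "b", "c")
y <- 1:3
z <- c("foo", "bar", "null")
a <- data.frame(x, y, z)
k <- c("b", "c")

# Exact matching: keep rows whose x is equal to some element of k.
res_in <- a[a$x %in% k, ]

# Pattern matching in a single grepl call: "b|c" matches any x
# that contains "b" or "c" (assumes k holds plain, unescaped text).
pattern <- paste(k, collapse = "|")
res_re <- a[grepl(pattern, a$x), ]
```

On this sample data both approaches keep rows 2 and 3. They would differ if a$x held longer strings such as "abc", which grepl would match but %in% would not.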

David Heckmann

Thanks!! Definitely elegant and simple. – user2171927 Aug 07 '17 at 16:57
