I’m using filter on my dataset to select certain values from a column:


%>%
  filter(col1 %in% c("value1", "value2"))

However, I don’t understand how to filter a column by a pattern without writing out every value. For example, along with "value1" and "value2", I also want all values that start with "value3" ("value33", "value34", ...). Can I add grepl to that vector?

3 Answers

You can use regular expressions to do that:

library(dplyr)
library(stringr)
df %>%
   filter(str_detect(col1, '^value[1-3]'))
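
To answer the "can I add grepl to that vector" part directly: grepl cannot go inside the c(...) vector, but its result can be combined with the %in% test using |. The sketch below assumes a made-up data frame with a col1 column (the values are hypothetical, chosen to mirror the question). Note that a pattern such as '^value[1-3]' would also keep a value like "value11", whereas this version keeps "value1" and "value2" only as exact matches.

```r
library(dplyr)

# Hypothetical sample data; col1 is the column name used in the question
df <- data.frame(
  col1 = c("value1", "value2", "value33", "value34", "value11", "other"),
  stringsAsFactors = FALSE
)

# Keep exact matches for value1/value2, plus anything starting with value3
result <- df %>%
  filter(col1 %in% c("value1", "value2") | grepl("^value3", col1))

result$col1
```

Here "value11" and "other" are dropped, because neither is in the exact-match vector and neither starts with "value3".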
Adam B.

If you want to use another tidyverse package to help, you can use str_starts from stringr to find strings that start with a certain value:

dd %>% filter(stringr::str_starts(col1, "value"))
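
If several prefixes have to be matched at once, one possible approach is to collapse them into a single anchored regular expression and pass that to str_detect. This is only a sketch: the data frame contents and the prefixes vector are hypothetical, and it assumes the prefixes contain no regex metacharacters (otherwise they would need escaping first).

```r
library(dplyr)
library(stringr)

# Hypothetical sample data and prefixes, chosen only for illustration
dd <- data.frame(
  col1 = c("value1", "int_value", "string_value", "other"),
  stringsAsFactors = FALSE
)
prefixes <- c("value", "int_", "string_")

# Build one anchored regex: ^(value|int_|string_)
pattern <- paste0("^(", paste(prefixes, collapse = "|"), ")")

result <- dd %>% filter(str_detect(col1, pattern))

result$col1
```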
MrFlick
  • But what if I have several patterns? –  Feb 04 '20 at 22:12
  • It's unclear what you mean by that. It would be nice to include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. Include both rows that you want to match and those you don't so we can make sure the solution works. – MrFlick Feb 04 '20 at 23:31
  • I mean it this way: with col1 %in% c("value1", "value2", "int_value", "string_value") I can select several kinds of values at once, but with your method only one. –  Feb 05 '20 at 06:49

Here are a few options in base R:

Using grepl:

subset(df, grepl('^value', b))

#  a        b
#1 1   value3
#3 3 value123
#4 4  value12

A similar option uses grep, which returns the indices of the matching rows:

df[grep('^value', df$b),]

However, a faster option would be to use startsWith, which compares fixed prefixes instead of evaluating a regular expression:

subset(df, startsWith(b, "value"))

All of these select the rows where column b starts with "value".

data

df <- data.frame(a = 1:5, b = c('value3', 'abcd', 'value123', 'value12', 'def'), 
                 stringsAsFactors = FALSE)
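
With the sample data above, a quick way to confirm that the three base R approaches agree is to compare the rows they keep. This is just a sanity-check sketch; the intermediate names (by_grepl, by_grep, by_startsWith) are made up for illustration.

```r
# Sample data copied from the answer
df <- data.frame(a = 1:5,
                 b = c("value3", "abcd", "value123", "value12", "def"),
                 stringsAsFactors = FALSE)

by_grepl      <- subset(df, grepl("^value", b))      # logical vector from a regex
by_grep       <- df[grep("^value", df$b), ]          # integer indices from a regex
by_startsWith <- subset(df, startsWith(b, "value"))  # fixed-prefix comparison

# Compare the values of column a kept by each approach
stopifnot(identical(by_grepl$a, by_grep$a),
          identical(by_grepl$a, by_startsWith$a))

by_grepl$a
```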
Ronak Shah