
I have been experimenting with grep to identify a pattern in a character vector with the aim of 'fishing out' the name of a file from the character vector. For example, for a small and simplified case, let's say:

vec <- c("Fast.file1", "Fast.file2", "Med.file3", "Medium.file4", "Slow.file5")

I can search for the files with the pattern "Fast" by simply writing:

> Fast_files <- vec[grep("Fast", vec)]
> Fast_files
[1] "Fast.file1" "Fast.file2"

But let's say I have a vector of patterns, whose length can vary depending on user input via a file read. I would like to feed the pattern vector to the search so that each element of the pattern can be cross-checked against vec, and I want to return all compatible hits. For example,

checkAgainst <- c("Fast", "Medium", "Med")

If I try to use grep with checkAgainst as a pattern, I get:

> find_files <- grep(checkAgainst, vec)
Warning message:
In grep(checkAgainst, vec) :
  argument 'pattern' has length > 1 and only the first element will be used
> find_files
[1] 1 2

So it appears that grep can't take a vector of patterns: it uses only the first element (i.e. "Fast") and ignores the rest.

My wish is to have an outcome where find_files includes "Fast.file1", "Fast.file2", "Med.file3" and "Medium.file4".

I could write a for-loop to get around this, but I was wondering whether R provides a more succinct and elegant solution.

Thank you for your consideration.

Maziar.

MrFlick
please help

1 Answer

You could join the patterns into a single regex alternation with paste(..., collapse="|"), which here yields "Fast|Medium|Med", and then grep for that:

vec <- c("Fast.file1", "Fast.file2", "Med.file3", "Medium.file4", "Slow.file5")
checkAgainst <- c("Fast", "Medium", "Med")
regex <- paste(checkAgainst, collapse="|")
find_files <- vec[grep(regex, vec)]
find_files

[1] "Fast.file1"   "Fast.file2"   "Med.file3"    "Medium.file4"
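
Since the patterns come from user input, they might contain characters that are special in a regex (for example "." or "+"), and pasting them into an alternation would then change their meaning. One possible way around that, sketched below under the assumption that each pattern should be matched as a literal substring, is to run grepl with fixed = TRUE once per pattern and OR the results together. The names keep and find_files are only illustrative.

```r
vec <- c("Fast.file1", "Fast.file2", "Med.file3", "Medium.file4", "Slow.file5")
checkAgainst <- c("Fast", "Medium", "Med")

# One logical vector per pattern; fixed = TRUE treats each pattern as a
# literal string, so regex metacharacters in user input are not interpreted.
per_pattern <- lapply(checkAgainst, grepl, x = vec, fixed = TRUE)

# Element-wise OR across patterns: TRUE where at least one pattern matched.
keep <- Reduce(`|`, per_pattern)

find_files <- vec[keep]
find_files
```

With the example data this should select the same four elements as the alternation approach. Note that this sketch assumes checkAgainst has at least one element; an empty pattern vector would need to be handled separately.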
Tim Biegeleisen