a <- "quick brown fox"
b <- "quick brown dog"

I want to check whether both "quick" AND "fox" occur in each of the strings a and b.

That is, applying the answer to this question to each string:

a - should return TRUE

b - should return FALSE

nrussell
akilat90
  • To use one pattern, try `grepl("quick.*fox|fox.*quick", x)` – Pierre L Jan 31 '16 at 17:56
  • What did you try? Why did it not work? – Heroka Jan 31 '16 at 18:09
  • @Heroka In the original problem, a bunch of values like a, b, ... were the colnames of a `data.frame`, which is an extension of this question. I tried `grepl("brown", colnames(data.frame)) && grepl("fox", colnames(data.frame))`, which is fundamentally wrong because `&&` does not combine the two logical vectors element by element. I still couldn't figure out how to apply this to a colnames(data.frame) object though. – akilat90 Jan 31 '16 at 18:36
  • @akilat90 You should remove the `&&` and use a single `&` – akrun Jan 31 '16 at 18:39
  • @akrun - just read this [link](http://stackoverflow.com/questions/6558921/r-boolean-operators-and) Thanks. Btw @PierreLafortune your approach worked for me to apply this to a whole vector. I used `c <- c(a,b)` and then `grepl("quick.*fox", c)` and it returned `TRUE FALSE` . Your method of using `grepl("quick.*fox|fox.*quick", x)` yields the same result. I think I'm missing something about the wildcard search. Can you shed some light on this? Thanks – akilat90 Jan 31 '16 at 18:48
  • @akilat90 The pattern `quick.*fox|fox.*quick` matches all strings where `quick` comes before `fox` or `fox` comes before `quick`. If you only have cases where `quick` comes before `fox`, the single `quick.*fox` should work. Again, as I mentioned in my solution, using `\\b` to mark the word boundaries makes the match specific and avoids surprises. – akrun Jan 31 '16 at 18:51
  • Thanks a lot @akrun. I'm a bit slow as I'm kind of new to this context. – akilat90 Jan 31 '16 at 19:07

1 Answer


We can combine two grepl calls with & (the safer option)

grepl('quick', a) & grepl('fox', a)
#[1] TRUE

grepl('quick', b) & grepl('fox', b)
#[1] FALSE
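
Because grepl and & are both vectorized, the same idea extends to a whole character vector, such as the column names mentioned in the comments. The following is only a minimal sketch: the helper name `contains_all`, the data frame `df`, and its column names are made up for illustration, and `fixed = TRUE` is an assumption that the search terms are literal substrings rather than regular expressions.

```r
# Hypothetical helper: TRUE for each element of x that contains every pattern.
contains_all <- function(x, patterns) {
  # One logical vector per pattern (literal substring match, not a regex).
  hits <- lapply(patterns, function(p) grepl(p, x, fixed = TRUE))
  # Combine the vectors element by element with `&`.
  Reduce(`&`, hits)
}

x <- c("quick brown fox", "quick brown dog")
contains_all(x, c("quick", "fox"))
#[1]  TRUE FALSE

# The same check on the column names of a made-up data frame.
df <- data.frame(quick_fox = 1, quick_dog = 2, lazy_fox = 3)
names(df)[contains_all(names(df), c("quick", "fox"))]
#[1] "quick_fox"
```

Using Reduce keeps the helper open to more than two search terms without writing out each grepl call by hand.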

Or, if we know the position of 'fox' beforehand, i.e. that it comes after 'quick' (as in the example), we can use a single regex: 'quick' followed by zero or more characters followed by 'fox'. We can also flank each word with a word boundary (\\b) to avoid surprises, i.e. to avoid matching words like quickness or foxy.

grepl('.*\\bquick\\b.*\\bfox\\b', a)
#[1] TRUE
grepl('.*\\bquick\\b.*\\bfox\\b', b)
#[1] FALSE

As I mentioned earlier, this will give FALSE for foxy

grepl('.*\\bquick\\b.*\\bfox\\b', 'quick brown foxy')
#[1] FALSE

If the order of the two words can vary, match either order with an alternation (|):

grepl('.*\\bquick\\b.*\\bfox\\b|.*\\bfox\\b.*\\bquick\\b', b)
#[1] FALSE

grepl('.*\\bquick\\b.*\\bfox\\b|.*\\bfox\\b.*\\bquick\\b', a)
#[1] TRUE
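
As a possible alternative to spelling out both orders, lookahead assertions can express "contains both words, in any order" in a single pattern. This is a different technique from the alternation above, shown only as a sketch: it relies on `perl = TRUE`, since lookaheads are not supported by R's default regex engine, and the variable name `both_words` and the extra test strings are illustrative.

```r
# Each (?=...) lookahead checks for one whole word without consuming input,
# so the order in which the words appear does not matter.
both_words <- '^(?=.*\\bquick\\b)(?=.*\\bfox\\b)'

x <- c("quick brown fox", "quick brown dog", "fox so quick", "quick brown foxy")

# perl = TRUE switches grepl to PCRE, which supports lookaheads.
grepl(both_words, x, perl = TRUE)
#[1]  TRUE FALSE  TRUE FALSE
```

Adding a third required word would only need one more lookahead, whereas the alternation approach needs every ordering listed explicitly.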
akrun