I have two elements:

id1 <- "dog"
id2 <- "cat"

I want to extract any combination of these elements (dogcat or catdog) from a vector:

L <- c("gdoaaa","gdobbb","gfoaaa","ghobbb","catdog")
L

I tried:

L[grep(paste(id1,id2,sep="")),L]
L[grep(paste(id2,id1,sep="")),L]

but this gives an error.

I would be grateful for your help in correcting the above.

oguz ismail
adam.888
  • `L[grep(paste(id1,id2,sep=""),L)]` and `L[grep(paste(id2,id1,sep=""),L)]` – HubertL Mar 08 '16 at 23:42
  • A simple non-regex solution could be `grepl(id1, L) & grepl(id2, L)`. You can add `fixed = TRUE` to both if efficiency is important. – David Arenburg Mar 08 '16 at 23:44
  • I don't understand it, but apparently `grepl("(dog(cat)?)", L)` works courtesy of http://stackoverflow.com/questions/1177081/mulitple-words-in-any-order-using-regex – thelatemail Mar 09 '16 at 00:32
  • That pattern matches `dogcat` but also `dog`, as in: `grepl("(dog(cat)?)", "dog")`. – effel Mar 09 '16 at 01:09

1 Answer

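The error comes from misplaced parentheses: `grep()` is closed before `L` is passed to it, so `L` ends up as a second index instead of the vector to search. These minor variations on your code will work: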

L[grep(paste(id1,id2,sep=""), L)]
# character(0)
L[grep(paste(id2,id1,sep=""), L)]
# [1] "catdog"
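
To see why the original call fails, it may help to run both forms side by side. This is only a minimal sketch reusing `id1`, `id2` and `L` from the question; `msg` and `hits` are illustrative names, not part of the original code.

```r
id1 <- "dog"
id2 <- "cat"
L <- c("gdoaaa", "gdobbb", "gfoaaa", "ghobbb", "catdog")

# Original form: grep() is closed right after the pattern, so it never
# receives the vector to search, and L is left as a second index to `[`.
msg <- tryCatch(
  L[grep(paste(id1, id2, sep = "")), L],
  error = function(e) conditionMessage(e)
)
msg  # the error message raised by the malformed call

# Corrected form: L is the second argument of grep().
hits <- L[grep(paste(id2, id1, sep = ""), L)]
hits
# [1] "catdog"
```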

Alternatively this is a regex one-liner:

L[grep(paste0(id2, id1, "|", id1, id2), L)]
# [1] "catdog"

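That, and some patterns in the comments, will also match longer strings such as dogcatt. To avoid this, anchor the pattern with ^ and $, and group the alternatives in parentheses so that both anchors apply to each one:

x <- c("dogcat", "foo", "catdog", "ddogcatt")
# Without the group, "^catdog|dogcat$" means "starts with catdog" OR "ends with dogcat"
x[grep(paste0("^(", id2, id1, "|", id1, id2, ")$"), x)]
# [1] "dogcat" "catdog"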
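
One caveat when combining `^`, `$` and `|`: alternation binds more loosely than the anchors, so an ungrouped pattern anchors each alternative on one side only. The sketch below contrasts the two forms. The extra element "catdogs" and the names `loose`, `strict` and `targets` are assumptions added for illustration. If only whole-string matches are wanted, `%in%` avoids regular expressions altogether.

```r
id1 <- "dog"
id2 <- "cat"
x <- c("dogcat", "foo", "catdog", "ddogcatt", "catdogs")

# Ungrouped: "^catdog|dogcat$" = starts with "catdog" OR ends with "dogcat"
loose <- paste0("^", id2, id1, "|", id1, id2, "$")
loose_hits <- x[grepl(loose, x)]
loose_hits
# [1] "dogcat"  "catdog"  "catdogs"

# Grouped: "^(catdog|dogcat)$" = the whole string is one of the two
strict <- paste0("^(", id2, id1, "|", id1, id2, ")$")
strict_hits <- x[grepl(strict, x)]
strict_hits
# [1] "dogcat" "catdog"

# Non-regex alternative for exact matches
targets <- c(paste0(id1, id2), paste0(id2, id1))
exact_hits <- x[x %in% targets]
exact_hits
# [1] "dogcat" "catdog"
```

The `%in%` form also sidesteps escaping issues if `id1` or `id2` ever contain regex metacharacters such as `.` or `(`.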
effel