I'm trying to find all matches of the pattern "8ab|ab8" in the string "8ab8". I tried the R command gregexpr("8ab|ab8","8ab8"), hoping to get the starting positions c(1,2).

Unfortunately, it seems that once the first alternative is matched, that portion of the string is "consumed" and the second alternative can no longer match.

For example, once "8ab" is matched, only "8" is left of "8ab8", and "ab8" cannot be found in "8". I believe this is the cause because gregexpr("8ab|ab8","8ab ab8") works fine and returns the starting positions c(1,5).
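A minimal sketch reproducing the two cases above, using only base R. The variable names overlapping and separated are illustrative and not part of the original post.

```r
# Case 1: the two alternatives overlap in "8ab8"; only the first match is reported
overlapping <- as.integer(gregexpr("8ab|ab8", "8ab8")[[1]])

# Case 2: the alternatives do not overlap in "8ab ab8"; both are reported
separated <- as.integer(gregexpr("8ab|ab8", "8ab ab8")[[1]])

print(overlapping)  # 1
print(separated)    # 1 5
```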

The question is, how do I match the same pattern multiple times in the first case?

Mazdak
user22119
    Does this help? http://stackoverflow.com/questions/25800042/overlapping-matches-in-r – rawr Mar 28 '15 at 20:10

1 Answer

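Use Perl-compatible regular expressions by setting perl=TRUE (see ?regex for details). Wrapping each alternative in a lookahead, (?=...), makes the match zero-width, so no characters are consumed and overlapping positions are all reported.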

 gregexpr("(?=8ab)|(?=ab8)", "8ab8", perl = TRUE)
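As a rough sketch of how the result might be used: because lookahead matches have length 0, regmatches() would return empty strings, so the matched text has to be cut out by position. The names x, m, starts and hits are illustrative, and the fixed width of 3 is an assumption that holds only because both alternatives here are three characters long.

```r
x <- "8ab8"

# Zero-width lookahead: a match is reported at every position where
# either "8ab" or "ab8" begins, even if the occurrences overlap
m <- gregexpr("(?=8ab|ab8)", x, perl = TRUE)[[1]]

# Drop the attributes (match.length etc.) to keep only the start positions
starts <- as.integer(m)

# Assumption: every alternative is exactly 3 characters wide
hits <- substring(x, starts, starts + 2L)

print(starts)  # 1 2
print(hits)    # "8ab" "ab8"
```

For alternatives of differing lengths, a fixed width would not work and the end positions would have to be determined some other way.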
jeborsel