0

I'm trying to match a word forwards and backwards in a string but it isn't catching all matches. For example, searching for the word "AB" in the string "AAABAAABAAA", I create and use the regex /AB|BA/, but it only matches the two "AB" substrings, and ignores the "BA" substrings.

I'm using RegexKitLite on the iPhone, but I think this is a more general regex problem (I see the same behavior in online regex testers). Nevertheless, here's the code I'm using to enumerate the matches:

[@"AAABAAABAAA" enumerateStringsMatchedByRegex:@"AB|BA" usingBlock:
 ^(NSInteger captureCount,
   NSString * const capturedStrings[captureCount],
   const NSRange capturedRanges[captureCount],
   volatile BOOL * const stop) { 
     NSLog(@"%@", capturedStrings[0]);
 }];

Output:

AB
AB
Hilton Campbell
  • possible duplicate of [How to check that a string is a palindrome using regular expressions?](http://stackoverflow.com/questions/233243/how-to-check-that-a-string-is-a-palindrome-using-regular-expressions) – Dave DeLong Jun 06 '11 at 16:45
  • No, I'm not actually looking for palindromes. If one happens to show up in the text I'm searching, though, the regex should still work. – Hilton Campbell Jun 06 '11 at 16:47
  • Ahhhhh... Yeah, I still think that's impossible. Why not just reverse the string yourself and try matching against your regex again? – Dave DeLong Jun 06 '11 at 16:50
  • Complicates the code and performs slightly worse, if a regex can do it. But if it can't, yeah, I'll do exactly what you suggest. – Hilton Campbell Jun 06 '11 at 16:54

3 Answers

1

I don't know which online tester you tried, but http://www.regextester.com/ (for example) will not use the same character in more than one match. In this case, the AB in ABA is matched first, so its B has already been consumed and is not available for a BA match. I'm only guessing that RegexKitLite behaves the same way.

Even if you leave out the mirrored variant, the original search string may overlap with itself. For example, if you search for ABCA|ACBA in ABCABCACBACBA, you'll get only two of the four matches, and the result is the same whichever direction you search in.

It should be possible to find the matches incrementally, but perhaps not with RegexKitLite.
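As a rough illustration of what "incrementally" could mean here, the following is a minimal sketch in Python rather than RegexKitLite (an assumption made only so the example is easy to run). The helper name `find_overlapping` is hypothetical. It searches for the next match, records it, and then resumes one character past the start of that match instead of past its end.

```python
import re

def find_overlapping(pattern, text):
    """Return (start, matched_text) pairs, including overlapping matches.

    Hypothetical helper for illustration; not part of any library.
    """
    regex = re.compile(pattern)
    results = []
    pos = 0
    while pos <= len(text):
        m = regex.search(text, pos)
        if m is None:
            break
        results.append((m.start(), m.group(0)))
        # Resume one character past the START of the match, not its end,
        # so characters inside this match can take part in the next one.
        pos = m.start() + 1
    return results

print(find_overlapping("AB|BA", "AAABAAABAAA"))
# [(2, 'AB'), (3, 'BA'), (6, 'AB'), (7, 'BA')]
print(find_overlapping("ABCA|ACBA", "ABCABCACBACBA"))
# [(0, 'ABCA'), (3, 'ABCA'), (6, 'ACBA'), (9, 'ACBA')]
```

The cost is one search call per match, each starting from a new offset, which is probably acceptable for short strings.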

Kirk Kelsey
1

I would say that's not possible in one pass. The regex engine matches the given pattern and "eats" the matched characters. So if you search for AB|BA in ABA, the first match found is AB, and the engine then continues searching from the third character, the final A.

So it is not possible to find overlapping matches with a single regex that relies on the | operator alone.
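The "eating" behaviour can be seen directly, along with one workaround that is sometimes used: wrapping the alternation in a zero-width lookahead with a capture group, so that nothing is consumed. This is a sketch in Python's `re` module, not RegexKitLite. RegexKitLite is built on ICU, whose regex syntax also documents lookahead, but whether this works through `enumerateStringsMatchedByRegex:` is an untested assumption.

```python
import re

text = "AAABAAABAAA"

# Plain alternation: each match consumes its characters, so the B in
# "ABA" is used up by "AB" and can never start a "BA" match.
consuming = [m.group(0) for m in re.finditer(r"AB|BA", text)]
print(consuming)    # ['AB', 'AB']

# Lookahead variant: the overall match is empty (zero-width), and
# capture group 1 records what would have matched at that position.
overlapping = [(m.start(), m.group(1))
               for m in re.finditer(r"(?=(AB|BA))", text)]
print(overlapping)  # [(2, 'AB'), (3, 'BA'), (6, 'AB'), (7, 'BA')]
```

Note that with the lookahead form the matched text is in capture group 1, not group 0, which is always the empty string.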

stema
  • Is there another way to do it, or will I need to implement my own loop over the string, looking for the first match and incrementing one character past each match? – Hilton Campbell Jun 06 '11 at 19:38
0

I'm not sure how you'd accomplish exactly what I think you're asking for without reversing the string and testing twice.

However, I suppose it depends on what you're after exactly. If you're simply trying to determine if the pattern occurs in the string backwards or forwards, and not so much how it occurs, then you could do something like this:

ABA?|BAB?

The ? makes the last character optional on each side of the |. In the case of AAABAAABAAA, it'll find ABA twice. In the case of AB it'll find AB, and in the case of BA it'll find BA.
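A quick way to check those three cases, sketched in Python (the helper name `matches` is only for illustration):

```python
import re

# Optional trailing character on each side of the alternation.
PATTERN = re.compile(r"ABA?|BAB?")

def matches(text):
    """Return every non-overlapping match of PATTERN in text."""
    return PATTERN.findall(text)

for sample in ("AAABAAABAAA", "AB", "BA"):
    print(sample, matches(sample))
# AAABAAABAAA ['ABA', 'ABA']
# AB ['AB']
# BA ['BA']
```

This tells you that the word occurs in one direction or the other, but a match such as ABA does not say which, and it is reported once, not as separate AB and BA hits.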

Here it is with test cases... http://regexhero.net/tester/?id=a387ae0a-1707-4d9e-856b-ebe2176679bb

Steve Wortham