
I am studying regular expression groups and have a simple question about them. Say I have a basic regular expression in Java such as:

Pattern pattern = Pattern.compile("[0-9]{16}");

And I have a matcher:

Matcher matcher = pattern.matcher("111111111111111122");

while (matcher.find()) {
    System.out.println(matcher.group());
}

When I loop, I want the following to be printed:

1111111111111111

1111111111111112

1111111111111122

I want to get every 16-digit substring, including overlapping ones. But only this is printed:

1111111111111111

Can I solve this by modifying only the regex pattern?

Thorux
  • Wrong, `matcher.find()` does not necessarily need the pattern to match the entire input string to return a result. The real problem here is that after calling `matcher.find()`, the position from which the next call (i.e. the next loop iteration) starts matching is set to the end of the previous match. That is, the remaining string after matching `1111111111111111` is `22`, which of course doesn't match. Pretty sure that can't be fixed by simply modifying the pattern though. – B-Schmidt Nov 04 '19 at 14:50
  • You want *overlapping* matches. This is not supported in general by most "off-the-shelf" regex engines. Using lookaheads can be a workaround in some cases but will not work in the general case. – Giacomo Alzetta Nov 04 '19 at 14:54
  • @GiacomoAlzetta Overlapping *matches* are not, but overlapping *captures* are, if you have the capture inside a *zero-width positive lookaround*. See [answer](https://stackoverflow.com/a/58695869/5221149). – Andreas Nov 04 '19 at 14:57
  • @Andreas ? That answer doesn't work **for the general case**, for example it fails if the original regex contains lookbehinds. If wrapping a regex in `(?=` and `)` would provide a full implementation of an engine finding all overlapping matches the regex engines wouldn't have the issue in the first place since they could just add a flag and auto-wrap the regexes to obtain all overlapping matches. This approach works fine only with simple regexes. – Giacomo Alzetta Nov 04 '19 at 16:09
  • @GiacomoAlzetta I never claimed it would work for all regexes, just the regex in the question. I'm answering the question, not some big abstract thing. Besides, it does work with lookbehinds. Let input be `"12345678901234567890"` and let's modify the regex to exclude matches preceded by an even digit, using a *negative lookbehind*: `"(?=((?<![02468])[0-9]{16}))"`. Result is as expected. Or use a *positive lookbehind*, that works too. – Andreas Nov 04 '19 at 16:17
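
The resume-position behaviour described in the comments can also be worked around in code rather than in the pattern. The following is only a sketch of that alternative, not what the question asked for: it keeps the original `[0-9]{16}` pattern and restarts the search one character after the start of each match using `Matcher.find(int)`. The class name `OverlapByRestart` and the helper `findOverlapping` are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OverlapByRestart {
    // Hypothetical helper: returns one match per start position, so
    // overlapping matches are included.
    public static List<String> findOverlapping(Pattern pattern, String input) {
        List<String> results = new ArrayList<>();
        Matcher matcher = pattern.matcher(input);
        int from = 0;
        // find(int) resets the matcher and searches from the given index.
        // The bounds check avoids an IndexOutOfBoundsException past the end.
        while (from <= input.length() && matcher.find(from)) {
            results.add(matcher.group());
            // Resume one character after the START of this match,
            // not at its end, so the next match may overlap it.
            from = matcher.start() + 1;
        }
        return results;
    }

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile("[0-9]{16}");
        for (String s : findOverlapping(pattern, "111111111111111122")) {
            System.out.println(s);
        }
    }
}
```

Note that this yields at most one match per start position (the one the engine prefers there), so it is not a general "all possible matches" enumerator either.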

1 Answer

To get the result you want, change your code to:

Pattern pattern = Pattern.compile("(?=([0-9]{16}))");
Matcher matcher = pattern.matcher("111111111111111122");
while (matcher.find()) {
    System.out.println(matcher.group(1));
}

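Notice the call to `group(1)`, not `group()`, which is the same as `group(0)`. Because the lookahead is zero-width, the overall match is always the empty string here; the 16 digits are captured by group 1.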

Output

1111111111111111
1111111111111112
1111111111111122
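
To see why `group(1)` is needed, here is a small self-contained sketch that records, for each match, its start index, the overall match `group()` and the captured `group(1)`. The class name `LookaheadGroups` and the helper `describe` are hypothetical names used only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookaheadGroups {
    // Hypothetical helper: one line per match, formatted as
    // "<start>:[<group 0>]:<group 1>".
    public static String describe(String input) {
        StringBuilder sb = new StringBuilder();
        Matcher m = Pattern.compile("(?=([0-9]{16}))").matcher(input);
        while (m.find()) {
            sb.append(m.start())
              .append(":[").append(m.group()).append("]:")
              .append(m.group(1))
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints:
        // 0:[]:1111111111111111
        // 1:[]:1111111111111112
        // 2:[]:1111111111111122
        System.out.print(describe("111111111111111122"));
    }
}
```

The empty brackets show that the lookahead itself consumes nothing. After an empty match, `find()` moves on by one character, which is what lets the captures overlap.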
Andreas