How do I get all the indices (including overlapping ones) in a string where a pattern matches? I have this proof-of-concept code:

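import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        String input = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa";
        Pattern pattern = Pattern.compile("aaa");
        Matcher matcher = pattern.matcher(input);
        List<Integer> all = new ArrayList<>();
        // find() resumes after the end of the previous match, so matches never overlap
        while (matcher.find()) {
            all.add(matcher.start());
        }
        System.out.println(all);
    }
}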

Output:

[0, 3, 6, 9, 12, 15, 18, 21, 24, 27]

It does not consider overlapping matches. The full list of matching indices should be:

[0, 1, 2, 3, 4, ..., 27]

I know this is easy to do with KMP, but can we do it using Pattern and Matcher?

impossible

1 Answer


You can change your regex so that the entire expression is inside a lookahead, i.e. change "aaa" to "(?=aaa)". The matcher will then report a match at every position where the pattern fits, including positions whose matches would overlap. Strictly speaking the matches do not overlap, because a lookahead consumes no characters and each actual match is empty. You can still use capturing groups inside the lookahead to retrieve the matched text. As a more complex example:

String input = "abab1ab2ab3bcaab4ab5ab6";
Pattern pattern = Pattern.compile("(?=((?:ab.){2}))");
Matcher matcher = pattern.matcher(input);
while (matcher.find()) {
    System.out.println(matcher.start() + " " + matcher.group(1));
}

Starting indices and groups are:

2 ab1ab2
5 ab2ab3
14 ab4ab5
17 ab5ab6
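
Applied to the pattern from the question, the same idea can be sketched as below. The class name OverlappingMatches and the helper overlappingStarts are hypothetical names made up for this illustration, and a shorter input is used so the result is easy to check by hand.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OverlappingMatches {

    // Hypothetical helper: wraps the given regex in a lookahead and
    // collects the start index of every (empty) match.
    public static List<Integer> overlappingStarts(String regex, String input) {
        Matcher matcher = Pattern.compile("(?=" + regex + ")").matcher(input);
        List<Integer> starts = new ArrayList<>();
        while (matcher.find()) {
            // The match itself is empty; start() is where the lookahead succeeded.
            starts.add(matcher.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        // "aaa" fits at indices 0, 1 and 2 of a five-character run of 'a'
        System.out.println(overlappingStarts("aaa", "aaaaa")); // prints [0, 1, 2]
    }
}
```

With the 30-character input from the question, this should give every index from 0 through 27.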
tobias_k