116

Let's say I have a string which contains this:

HelloxxxHelloxxxHello

I compile a pattern to look for 'Hello'

Pattern pattern = Pattern.compile("Hello");
Matcher matcher = pattern.matcher("HelloxxxHelloxxxHello");

It should find three matches. How can I get a count of how many matches there were?

I've tried various loops and using the matcher.groupCount() but it didn't work.

Michael
  • 41,989
  • 11
  • 82
  • 128
Tony
  • 3,587
  • 8
  • 44
  • 77

5 Answers5

200

matcher.find() does not find all matches, only the next match.

Solution for Java 9+

long matches = matcher.results().count();

Solution for Java 8 and older

You'll have to do the following. (Starting from Java 9, there is a nicer solution)

int count = 0;
while (matcher.find())
    count++;

Btw, matcher.groupCount() is something completely different.

Complete example:

import java.util.regex.*;

class Test {
    public static void main(String[] args) {
        String hello = "HelloxxxHelloxxxHello";
        Pattern pattern = Pattern.compile("Hello");
        Matcher matcher = pattern.matcher(hello);

        int count = 0;
        while (matcher.find())
            count++;

        System.out.println(count);    // prints 3
    }
}

Handling overlapping matches

When counting matches of aa in aaaa the above snippet will give you 2.

aaaa
aa
  aa

To get 3 matches, i.e. this behavior:

aaaa
aa
 aa
  aa

You have to search for a match at index <start of last match> + 1 as follows:

String hello = "aaaa";
Pattern pattern = Pattern.compile("aa");
Matcher matcher = pattern.matcher(hello);

int count = 0;
int i = 0;
while (matcher.find(i)) {
    count++;
    i = matcher.start() + 1;
}

System.out.println(count);    // prints 3
aioobe
  • 413,195
  • 112
  • 811
  • 826
  • Counting number of matches that occur within the string. The java.util.regex.Matcher.region(int start, int end) method sets the limits of this matcher's region. The region is the part of the input sequence that will be searched to find a match. Invoking this method resets the matcher, and then sets the region to start at the index specified by the start parameter and end at the index specified by the end parameter. Try this. `while(matcher.find()){ matcher.region(matcher.end()-1, str.length()); count++; }` – Mukesh Kumar Gupta Nov 17 '17 at 11:27
17

This should work for matches that might overlap:

public static void main(String[] args) {
    String input = "aaaaaaaa";
    String regex = "aa";
    Pattern pattern = Pattern.compile(regex);
    Matcher matcher = pattern.matcher(input);
    int from = 0;
    int count = 0;
    while(matcher.find(from)) {
        count++;
        from = matcher.start() + 1;
    }
    System.out.println(count);
}
Andy Thomas
  • 84,978
  • 11
  • 107
  • 151
Mary-Anne Wolf
  • 171
  • 1
  • 2
8

From Java 9, you can use the stream provided by Matcher.results()

long matches = matcher.results().count();
Michael
  • 41,989
  • 11
  • 82
  • 128
4

If you want to use Java 8 streams and are allergic to while loops, you could try this:

public static int countPattern(String references, Pattern referencePattern) {
    Matcher matcher = referencePattern.matcher(references);
    return Stream.iterate(0, i -> i + 1)
            .filter(i -> !matcher.find())
            .findFirst()
            .get();
}

Disclaimer: this only works for disjoint matches.

Example:

public static void main(String[] args) throws ParseException {
    Pattern referencePattern = Pattern.compile("PASSENGER:\\d+");
    System.out.println(countPattern("[ \"PASSENGER:1\", \"PASSENGER:2\", \"AIR:1\", \"AIR:2\", \"FOP:2\" ]", referencePattern));
    System.out.println(countPattern("[ \"AIR:1\", \"AIR:2\", \"FOP:2\" ]", referencePattern));
    System.out.println(countPattern("[ \"AIR:1\", \"AIR:2\", \"FOP:2\", \"PASSENGER:1\" ]", referencePattern));
    System.out.println(countPattern("[  ]", referencePattern));
}

This prints out:

2
0
1
0

This is a solution for disjoint matches with streams:

public static int countPattern(String references, Pattern referencePattern) {
    return StreamSupport.stream(Spliterators.spliteratorUnknownSize(
            new Iterator<Integer>() {
                Matcher matcher = referencePattern.matcher(references);
                int from = 0;

                @Override
                public boolean hasNext() {
                    return matcher.find(from);
                }

                @Override
                public Integer next() {
                    from = matcher.start() + 1;
                    return 1;
                }
            },
            Spliterator.IMMUTABLE), false).reduce(0, (a, c) -> a + c);
}
gil.fernandes
  • 12,978
  • 5
  • 63
  • 76
1

Use the below code to find the count of number of matches that the regex finds in your input

        Pattern p = Pattern.compile(regex, Pattern.MULTILINE | Pattern.DOTALL);// "regex" here indicates your predefined regex.
        Matcher m = p.matcher(pattern); // "pattern" indicates your string to match the pattern against with
        boolean b = m.matches();
        if(b)
        count++;
        while (m.find())
        count++;

This is a generalized code not specific one though, tailor it to suit your need

Please feel free to correct me if there is any mistake.

Pradeep
  • 9,667
  • 13
  • 27
  • 34
Say_Am
  • 11
  • 4