
I'm a bit confused by this example. I don't understand what the pattern String means. Also, what does find() do? I am learning this from TutorialsPoint.

Please, can anyone help me understand it?

Code:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMatches {
    public static void main(String[] args) {

        // String to be scanned to find the pattern.
        String line = "This order was placed for QT3000! OK?";
        String pattern = "(.*)(\\d+)(.*)";

        // Create a Pattern object
        Pattern r = Pattern.compile(pattern);

        // Now create matcher object.
        Matcher m = r.matcher(line);
        if (m.find()) {
            System.out.println("Found value: " + m.group(0));
            System.out.println("Found value: " + m.group(1));
            System.out.println("Found value: " + m.group(2));
        } else {
            System.out.println("NO MATCH");
        }
    }
}

Output:

Found value: This order was placed for QT3000! OK?
Found value: This order was placed for QT300
Found value: 0

Cœur
Vinoth Vino
  • Take a look at [Pattern](http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html). Btw, your question has nothing to do with design-patterns. – almightyGOSU Jul 08 '15 at 01:48
  • Take a look at https://docs.oracle.com/javase/tutorial/essential/regex/quant.html (especially part about greediness) and https://docs.oracle.com/javase/tutorial/essential/regex/groups.html. – Pshemo Jul 08 '15 at 01:49
  • If you want it to match the whole number, you can change the regex to `"(.*?)(\\d+)(.*)"` – Erwin Bolwidt Jul 08 '15 at 01:53
  • I don't understand how the string line is matched against the "(.*)(\\d+)(.*)" pattern. How does this output come out like this? Can anyone briefly explain this program? – Vinoth Vino Jul 08 '15 at 02:10
  • Thanks to all :) – Vinoth Vino Jul 08 '15 at 02:41

2 Answers


The pattern has 3 capture groups:

  • (.*) Means zero or more of any character, except line terminators by default (Capture group 1)
  • (\\d+) Means one or more digits (Capture group 2)
  • (.*) Means zero or more of any character, except line terminators by default (Capture group 3)

When find() is called, it "Attempts to find the next subsequence of the input sequence that matches the pattern." (Matcher.find())
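As a rough illustration of "the next subsequence", the sketch below calls find() in a loop so that each call resumes where the previous match ended. The class name FindDemo and the helper findAll are made up for this example, and the simpler patterns \\d+ and [A-Z]+ are used instead of the pattern from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindDemo {
    // Hypothetical helper: collects every substring of input that matches regex, in order.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        // Each call to find() resumes scanning where the previous match ended.
        while (m.find()) {
            matches.add(m.group()); // group() with no argument is the same as group(0)
        }
        return matches;
    }

    public static void main(String[] args) {
        String line = "This order was placed for QT3000! OK?";
        System.out.println(findAll("\\d+", line));   // prints [3000]
        System.out.println(findAll("[A-Z]+", line)); // prints [T, QT, OK]
    }
}
```

The pattern in the question starts and ends with .*, so its first match already covers the whole line and a second find() call would return false.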

When you call these lines:

System.out.println("Found value: " + m.group(0));
System.out.println("Found value: " + m.group(1));
System.out.println("Found value: " + m.group(2));
  • m.group(0) Refers to the entire text matched by the regex (here, the whole String)
  • m.group(1) Refers to capture group 1 (zero or more of any character)
  • m.group(2) Refers to capture group 2 (one or more digits)
  • You're currently not outputting capture group 3 (zero or more of any character)

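You'll see that m.group(1) returned This order was placed for QT300 and only the last zero was left for m.group(2). That is because .* is greedy: group 1 first consumes the whole line, then backtracks one character at a time until the rest of the pattern can match. Capture group 2 must have at least 1 digit, so the backtracking stops as soon as a single digit (the final 0) is available for it.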

If you were to add capture group 3 (m.group(3)) to the output it would display the remaining string after the last zero of m.group(2).

In other words:

System.out.println("Found value: " + m.group(0));
System.out.println("Found value: " + m.group(1));
System.out.println("Found value: " + m.group(2));
System.out.println("Found value: " + m.group(3));

Would display:

Found value: This order was placed for QT3000! OK?
Found value: This order was placed for QT300
Found value: 0
Found value: ! OK?
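To see why group 2 ends up with only the final 0, it may help to compare the original pattern with the reluctant variant (.*?) suggested by Erwin Bolwidt in the comments on the question. This is only a small sketch; the class name GreedyVsReluctant and the helper digits are invented for the comparison.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyVsReluctant {
    // Hypothetical helper: returns the text captured by group 2, or null if nothing matches.
    static String digits(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        String line = "This order was placed for QT3000! OK?";

        // Greedy .* takes as much as it can and gives back only one digit.
        System.out.println(digits("(.*)(\\d+)(.*)", line));  // prints 0

        // Reluctant .*? takes as little as it can, so \d+ gets the whole number.
        System.out.println(digits("(.*?)(\\d+)(.*)", line)); // prints 3000
    }
}
```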

Hope this helps!

Shar1er80

find() looks for the next subsequence of the input sequence that matches the pattern. It returns true if, and only if, a subsequence of the input sequence matches this matcher's pattern.

Your regex has 3 groups: (.*) is group 1, (\\d+) is group 2, and (.*) is group 3.

Group 1 matches zero or more of any character except a line break.

Group 2 matches one or more digit characters (0-9).

Group 3 is the same as group 1.

So when you call:

m.group(0) returns the entire matched text (here, the entire String).

m.group(1) returns what group 1 captured.

m.group(2) returns what group 2 captured.

The parameter passed to group() is the index of the group in the regex, counted by opening parentheses from left to right.
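The group indexes can also be walked in a loop, since Matcher.groupCount() reports how many capture groups the pattern has (group 0 is not included in that count). The following is a minimal sketch; the class name GroupIndexDemo and the helper groups are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupIndexDemo {
    // Hypothetical helper: returns group 0..groupCount() of the first match,
    // or an empty array if the pattern is not found.
    static String[] groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return new String[0];
        }
        String[] result = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            result[i] = m.group(i); // index 0 is the whole match
        }
        return result;
    }

    public static void main(String[] args) {
        String line = "This order was placed for QT3000! OK?";
        String[] found = groups("(.*)(\\d+)(.*)", line);
        for (int i = 0; i < found.length; i++) {
            System.out.println("group(" + i + ") = " + found[i]);
        }
    }
}
```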

codeaholicguy
  • Can you explain this program to me? @codeaholicguy – Vinoth Vino Jul 08 '15 at 02:11
  • @Vino as you can see, your regex has 3 groups: (.*) is group 1, (\\d+) is group 2, and (.*) is group 3. Group 1 matches zero or more of any character except a line break. Group 2 matches one or more digit characters (0-9). Group 3 is the same as group 1. – codeaholicguy Jul 08 '15 at 02:13
  • Thanks @codeaholicguy. So the parentheses () define a group, right? OK, then what does (.*) mean? Does * mean "all", like * in a package import? And what does (\\d+) mean? It matches digits, equivalent to [0-9], right? But what is \\d+? Can you explain? – Vinoth Vino Jul 08 '15 at 02:23
  • \\d+: as I said, + matches one or more of the preceding token, and \d is any digit character (0-9) – codeaholicguy Jul 08 '15 at 02:26
  • Why do we write it as \\d? What does the double backslash mean? And .* too? – Vinoth Vino Jul 08 '15 at 02:30
  • '\\d' is needed because in a Java string literal the backslash is itself the escape character, so you must write '\\' to get a single backslash into the regex; see http://stackoverflow.com/questions/1367322/what-are-all-the-escape-characters-in-java for more information about escape characters. '?' is the optional quantifier: it matches zero or one of the preceding token. '.' matches any character except a line break. '*' matches zero or more of the preceding token. Take a look at http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html for more pattern information. – codeaholicguy Jul 08 '15 at 02:36
  • Thanks a lot :) Now I understand the backslashes... I had studied them under escape sequences @code – Vinoth Vino Jul 08 '15 at 02:41
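The double backslash discussed in these comments can be checked directly: the Java compiler turns the literal "\\d+" into the three characters \d+, and that three-character text is what the regex engine receives. This is a small sketch; the class name BackslashDemo and the helper isAllDigits are invented for the example.

```java
public class BackslashDemo {
    // Hypothetical helper: true if s consists of one or more digits and nothing else.
    static boolean isAllDigits(String s) {
        // String.matches requires the whole string to match, unlike Matcher.find().
        return s.matches("\\d+");
    }

    public static void main(String[] args) {
        String regex = "\\d+";
        System.out.println(regex);                 // prints \d+
        System.out.println(regex.length());        // prints 3
        System.out.println(isAllDigits("3000"));   // prints true
        System.out.println(isAllDigits("QT3000")); // prints false
    }
}
```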