7

I was brushing up on my regular expressions in Java when I ran a simple test:

Pattern.matches("q", "Iraq"); //false
"Iraq".matches("q"); //false

But in JavaScript

/q/.test("Iraq"); //true
"Iraq".match("q"); //["q"] (which is truthy)

What is going on here? And can I make my Java regex pattern "q" behave the same as in JavaScript?

Pshemo
jermel
  • 2
    Please don't compare methods from two different languages merely based on their names. Rather go, and look into the respective documentation. – Rohit Jain Feb 19 '14 at 14:38
  • 2
    I figured out that it matches the entire string, but Java did not have any documentation besides "For a more precise description of the behavior of regular expression constructs, please see Mastering Regular Expressions, 3rd Edition, Jeffrey E. F. Friedl, O'Reilly and Associates, 2006." – jermel Feb 19 '14 at 14:42
  • The sentence you quote is preceded by no less than 9 screenfuls of detailed elaboration of the Java regex syntax and semantics. – Marko Topolnik Feb 19 '14 at 14:49
  • And, of course, that's just the class documentation, where the documentation for the particular method you call points to this: [`Attempts to match the entire region against the pattern.`](http://docs.oracle.com/javase/7/docs/api/java/util/regex/Matcher.html#matches()) – Marko Topolnik Feb 19 '14 at 14:50
  • Ok i found it not on Pattern, but on (obviously) Matcher's page - The matches method attempts to match the entire input sequence against the pattern... – jermel Feb 19 '14 at 14:51
  • And on the `Pattern` page you should have found this: `behaves in exactly the same way as the expression Pattern.compile(regex).matcher(input).matches()`, which makes it indeed obvious. – Marko Topolnik Feb 19 '14 at 14:53
  • Yea the point is i read all (9 screens) of the documentation for the class with the method I was using, but couldn't find it. But now I did (on the docs for another resource) thanks which is why i asked on SO – jermel Feb 19 '14 at 14:57

3 Answers

6

In JavaScript, match returns the substrings that match the given regex. In Java, matches checks whether the entire string matches the regex.

If you want to find substrings that match a regex, use the Pattern and Matcher classes, like this:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(yourData);
while (m.find()) {
    m.group(); // returns the current match in each iteration
    // you can also read other groups by index (the regex must define that group)
    m.group(2);
    // or by name, for groups declared as (?<groupName>...)
    m.group("groupName");
}
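
As a rough, self-contained sketch of the loop above: the class name `FindAllDemo`, the helper names, and the sample pattern `(?<letter>[a-z])(\d)` are made up for illustration and are not from the question. Only `Pattern.compile`, `Matcher.find`, and `Matcher.group` are the actual `java.util.regex` calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindAllDemo {
    // Hypothetical helper: collects every substring of input matched by regex, in order.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group()); // group() is group(0): the whole current match
        }
        return matches;
    }

    // Hypothetical helper: reads the named group "letter" from the first match,
    // or returns null when the pattern is not found anywhere in input.
    static String firstLetterGroup(String input) {
        Matcher m = Pattern.compile("(?<letter>[a-z])(\\d)").matcher(input);
        return m.find() ? m.group("letter") : null;
    }

    public static void main(String[] args) {
        System.out.println(findAll("q", "Iraq"));      // [q]
        System.out.println(findAll("a", "banana"));    // [a, a, a]
        System.out.println(firstLetterGroup("x1 y2")); // x
    }
}
```

Note that `group(int)` and `group(String)` throw if the regex does not define the requested group, so they should only be called for groups the pattern actually declares.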
Pshemo
5

This is because in Java, Pattern#matches and String#matches require the pattern to match the complete input string, not just a part of it.

JavaScript's String#match, on the other hand, can match part of the input, as you are seeing in your examples.
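
To get the JavaScript-style behaviour in Java, one option is a small wrapper around `Matcher.find()`. The sketch below is only an illustration; the class `PartialMatchDemo` and the method name `test` (chosen to mirror JavaScript's `RegExp.prototype.test`) are assumptions, not part of any Java API.

```java
import java.util.regex.Pattern;

class PartialMatchDemo {
    // Hypothetical helper: true if regex matches anywhere inside input,
    // which is roughly what /regex/.test(input) does in JavaScript.
    static boolean test(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(test("q", "Iraq"));            // true: "q" occurs in the input
        System.out.println("Iraq".matches("q"));          // false: the whole string is not "q"
        System.out.println(Pattern.matches("q", "q"));    // true: the whole string is "q"
    }
}
```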

anubhava
  • 1
    And if I understand it correctly, it's just a different engine - period. They work differently between Java, Ruby, JavaScript, Python, .NET, and so on - right? – Mike Perrenoud Feb 19 '14 at 14:38
  • 1
    Yes that is true. Java's `matches` name for this method causes lot of confusion also. – anubhava Feb 19 '14 at 14:39
  • Is there a *switch* in Java though that can keep it from being prefixed and suffixed with the start and end characters? – Mike Perrenoud Feb 19 '14 at 14:40
  • And Javascript `match` method returns an array containing all the matched groups for that regex pattern, starting with group 0. – Rohit Jain Feb 19 '14 at 14:41
  • 2
    btw.: Python also has `re.match()` and `re.search()`. The first one _match_es only at the beginning of the string (it does not require the match to reach the end), and the second one _search_es through the string for a match anywhere – Ronny Lindner Feb 19 '14 at 14:41
  • 3
    In Java it needs to be done like this: `Pattern.matches(".*?q.*", "Iraq");` OR better to use the `Matcher#find()` method, which is there for partial matches. – anubhava Feb 19 '14 at 14:42
4

In JavaScript, String.match looks for a partial match. In Java, Pattern.matches returns true only if the whole input string is matched by the given pattern. In your example, that is equivalent to saying that "Iraq" should match ^q$, which it obviously doesn't.

Here it is from Java's Matcher Javadoc (note that Pattern.matches internally creates a Matcher then calls matches on it):

public boolean matches()

Attempts to match the entire region against the pattern. If the match succeeds then more information can be obtained via the start, end, and group methods.

Returns: true if, and only if, the entire region sequence matches this matcher's pattern

If you want to test for only a part of the string, add .*? at the beginning of the regex and .* at the end, such as Pattern.matches(".*?q.*", "Iraq").

Note that .*q.* would also work, but using the reluctant quantifier in front might significantly improve performance if the input string is very long. See this answer for an explanation of the difference between reluctant and greedy quantifiers and their effects on backtracking.
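
One caveat with the wrapping approach, shown here as a minimal sketch: by default `.` does not match line terminators, so the wrapped pattern fails on multi-line input unless DOTALL mode is enabled with the `(?s)` flag, whereas `Matcher.find()` is unaffected. The class and method names below are invented for the example.

```java
import java.util.regex.Pattern;

class WrappedPatternDemo {
    // Wrapped pattern as suggested above: whole input must match .*?q.*
    static boolean wrapped(String input) {
        return Pattern.matches(".*?q.*", input);
    }

    // Same pattern with the (?s) embedded flag (DOTALL), so '.' also matches '\n'.
    static boolean wrappedDotAll(String input) {
        return Pattern.matches("(?s).*?q.*", input);
    }

    // Plain partial search with find(), no wrapping needed.
    static boolean found(String input) {
        return Pattern.compile("q").matcher(input).find();
    }

    public static void main(String[] args) {
        String multiLine = "Iraq\nIran";
        System.out.println(wrapped("Iraq"));          // true
        System.out.println(wrapped(multiLine));       // false: '.' stops at the newline
        System.out.println(wrappedDotAll(multiLine)); // true
        System.out.println(found(multiLine));         // true
    }
}
```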

James