4

I tried searching but could not find anything that made sense to me! I am a noob at regex :)

Trying to see if a particular word "some_text" exists in another string.

String s = "This is a test() function";
String s2 = "This is a test    () function";

Given the two strings above, I can find a match with the following pattern in a regex tool:

[^\w]test[ ]*[(]

But I am unable to get a positive match in Java using

System.out.println(s.matches("[^\\w]test[ ]*[(]"));

I have tried with double \ and even four \\ as escape characters but nothing really works.

The requirement is that the word is either preceded by a space or is the first word of a line, and is followed by an opening bracket "(", optionally after some spaces, so that "test ()", "test()" and "test    ()" all get a positive match.

Using Java 1.8

Cheers, Faisal.

Faisal

5 Answers

5

The point you are missing is that Java's matches() only succeeds if the entire string matches, as if a ^ were placed at the start of the regex and a $ at the end. So your expression is effectively treated as:

^[^\w]test[ ]*[(]$

which is never going to match your input.
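To make the difference concrete, here is a minimal sketch that runs the pattern from the question both as a whole-string match and as a substring search. The class and method names (MatchesVsFind, wholeMatch, partialMatch) are made up for illustration.

```java
import java.util.regex.Pattern;

class MatchesVsFind {
    // The pattern from the question, written as a Java string literal.
    static final String QUESTION_REGEX = "[^\\w]test[ ]*[(]";

    // Whole-string match: gives the same result as s.matches(QUESTION_REGEX).
    static boolean wholeMatch(String s) {
        return Pattern.compile(QUESTION_REGEX).matcher(s).matches();
    }

    // Substring search: succeeds if the pattern occurs anywhere in s.
    static boolean partialMatch(String s) {
        return Pattern.compile(QUESTION_REGEX).matcher(s).find();
    }

    public static void main(String[] args) {
        String s = "This is a test() function";
        System.out.println(wholeMatch(s));        // false: the pattern does not cover the whole string
        System.out.println(partialMatch(s));      // true: " test(" occurs inside the string
        System.out.println(wholeMatch(" test(")); // true: here the pattern covers the whole string
    }
}
```

So the pattern itself is fine for a substring search; it only fails because matches() insists on consuming the entire input.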

Going from your requirement description, I suggest reworking your regex to something like this (assuming by "particular word" you meant test):

(?:.*)(?<=\s)(test(?:\s+)?\()(?:.*)


Explanation:

^                 Start of input - implied by matches()
(?:.*)            Non-capturing group - match anything before the word, but don't capture it
(?<=\s)           Positive lookbehind - match only if the word is preceded by whitespace, without consuming it
(                 Capturing group $1
  test(?:\s+)?    Match the word test and any following whitespace, if present
  \(              Match the opening bracket
)
(?:.*)            Non-capturing group - match the rest of the string, but don't capture it
$                 End of input - implied by matches()

Code sample:

public class Main {
    public static void main(String[] args) {
        String s = "This is a test() function";
        String s2 = "This is a test    () function";
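        // Same regex as above, with backslashes doubled for the Java string literal.
        System.out.println(s.matches("(?:.*)(?<=\\s)(test(?:\\s+)?\\()(?:.*)"));  // true
        System.out.println(s2.matches("(?:.*)(?<=\\s)(test(?:\\s+)?\\()(?:.*)")); // true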
    }
}
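String.matches() only returns a boolean, so the capturing group described above is not accessible through it. As a rough sketch, assuming you want the captured text, the same regex can be run through a Matcher. The class and method names (CaptureCall, capturedCall) are hypothetical. The last input also shows that a string beginning with test( is not matched, because the lookbehind needs a whitespace character on the left.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CaptureCall {
    // Same regex as above; group 1 is "test", optional whitespace, and "(".
    private static final Pattern CALL =
            Pattern.compile("(?:.*)(?<=\\s)(test(?:\\s+)?\\()(?:.*)");

    // Returns the text of group 1, or null if the whole string does not match.
    static String capturedCall(String s) {
        Matcher m = CALL.matcher(s);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(capturedCall("This is a test() function"));     // test(
        System.out.println(capturedCall("This is a test    () function")); // test    (
        System.out.println(capturedCall("test() is the first word"));      // null
    }
}
```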
vs97
  • Great explanation! makes perfect sense now! Thanks vs97 :) – Faisal Feb 29 '20 at 14:34
  • I wish I could accept two answers! But this was simpler as I don't need to go through two more class objects to get what I wanted. Cheers! – Faisal Feb 29 '20 at 14:45
2

The matches() method tells whether or not the whole string matches the given regular expression. Since that's not the case here, it returns false.

If you are just interested in whether your lookup value exists within the string, I found the following useful:


import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Main {
  public static void main(String[] args) {
    String s = "This is a test    () function";
    Pattern p = Pattern.compile("\\btest *\\(");
    Matcher m = p.matcher(s);
    if (m.find())
      System.out.println("Found a match");
    else
      System.out.println("Did not find a match");
  }
}

I went with the following pattern: \\btest *\\(

  • \\b - Match word-boundary (will also catch if first word).
  • test - Literally match your lookup-value.
  • " *" - A space followed by *: zero or more literal spaces.
  • \\( - Escaped open parenthesis to match literally.
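As a quick check of the breakdown above, this sketch runs the same pattern with find() over a few inputs, including one where test is the first word and one where it is only the tail of a longer word. The class and method names (WordBoundaryCheck, containsCall) and the extra sample strings are made up for illustration.

```java
import java.util.regex.Pattern;

class WordBoundaryCheck {
    // Word boundary, the literal word, optional spaces, then a literal "(".
    private static final Pattern CALL = Pattern.compile("\\btest *\\(");

    // True if the pattern occurs anywhere in s.
    static boolean containsCall(String s) {
        return CALL.matcher(s).find();
    }

    public static void main(String[] args) {
        String[] inputs = {
            "This is a test() function",     // true
            "This is a test    () function", // true
            "test () at the start",          // true: \b also holds at the start of the string
            "This is a contest() function",  // false: no word boundary between "n" and "t"
            "This is a test function"        // false: no opening parenthesis
        };
        for (String input : inputs) {
            System.out.println(containsCall(input) + " <- " + input);
        }
    }
}
```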


JvdV
  • Tried: System.out.println(s.matches("\\btest\\b\\s*\\(")); but still returns false :( – Faisal Feb 29 '20 at 11:59
  • 1
    @Faisal, I'm no Java expert but pieced something together that worked for me. It may be helpful. – JvdV Feb 29 '20 at 12:16
  • 2
    @JvdV Not my DV but you could make a note why the code of the OP does not work. – The fourth bird Feb 29 '20 at 12:21
  • 2
    @Thefourthbird, thanks for the feedback. I have edited a little. – JvdV Feb 29 '20 at 12:46
  • Thank you, guys! You are the best!!! Indeed, "matches()" behaves as if a "^" were added at the beginning and matches the string as a whole, which of course does not work for me. find() works great, just tested it – Faisal Feb 29 '20 at 14:26
  • 1
    I wish I could accept this as the answer as well but I ended up accepting @vs97's solution as it made more sense to me that I don't have to add two more classes to my code! Thank you once again for this quick solution, this made my day much better and I learned quite a few new things about the crazy regex :) – Faisal Feb 29 '20 at 14:47
2

I believe this should be enough (String has no find() method, so go through java.util.regex.Pattern):

Pattern.compile("\\btest\\s*\\(").matcher(s).find()
Óscar López
2

Try this: "\btest\b(?= *\()" (the opening parenthesis inside the lookahead must be escaped).

And don't use "matches", use "find". matches() tries to match the whole string.

https://regex101.com/r/xaPCyp/1
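A small sketch of how this lookahead variant behaves with find(), assuming the opening parenthesis inside the lookahead is meant to be escaped. Because the lookahead only checks what follows, the matched text is just the word itself. The class and method names (LookaheadCheck, matchedText) are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaheadCheck {
    // The word "test", followed (but not consumed) by optional spaces and "(".
    private static final Pattern WORD_BEFORE_PAREN =
            Pattern.compile("\\btest\\b(?= *\\()");

    // Returns the matched text, or null when the pattern is not found.
    static String matchedText(String s) {
        Matcher m = WORD_BEFORE_PAREN.matcher(s);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(matchedText("This is a test    () function")); // test
        System.out.println(matchedText("This is a test function"));       // null
    }
}
```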

SG Tech Edge
1

The .matches method has to match the whole string, whereas your pattern only matches part of it.

In the pattern that you tried, the negated character class [^\\w] could also match more than a whitespace boundary as it matches any char except a word character. It could for example also match a ( or a newline.

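As per the comments, a string that starts with test() should also match, but both [^\w] and (?<=\s) require a character to be present on the left of test.

Instead you could make use of (?<!\S) to assert a whitespace boundary on the left. It succeeds at the start of the string as well as after a whitespace character.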

.*(?<!\S)test\h*\(.*

Explanation

  • .* Match 0+ times any char except a newline
  • (?<!\S) Assert a whitespace boundary on the left
  • test\h* Match test and 0+ horizontal whitespace chars
  • \( Match a ( char
  • .* Match 0+ times any char except a newline


In Java

System.out.println(s.matches(".*(?<!\\S)test\\h*\\(.*"));
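
The difference between the two lookbehinds can be sketched as follows, using find() so that the surrounding .* is not needed. The class and method names (WhitespaceBoundary, withPositiveLookbehind, withNegativeLookbehind) and the sample strings are made up for illustration.

```java
import java.util.regex.Pattern;

class WhitespaceBoundary {
    // (?<=\s) requires a whitespace character immediately to the left.
    private static final Pattern NEEDS_SPACE = Pattern.compile("(?<=\\s)test\\h*\\(");
    // (?<!\S) only requires that there is no non-whitespace character to the left.
    private static final Pattern SPACE_OR_START = Pattern.compile("(?<!\\S)test\\h*\\(");

    static boolean withPositiveLookbehind(String s) {
        return NEEDS_SPACE.matcher(s).find();
    }

    static boolean withNegativeLookbehind(String s) {
        return SPACE_OR_START.matcher(s).find();
    }

    public static void main(String[] args) {
        String[] inputs = {
            "test() function",            // positive: false, negative: true
            "This is a test\t() function", // both true, \h also matches a tab
            "obj.test() call"             // both false, "." is not whitespace
        };
        for (String input : inputs) {
            System.out.println(withPositiveLookbehind(input) + " / "
                    + withNegativeLookbehind(input) + " <- " + input);
        }
    }
}
```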
The fourth bird