1

I was solving this question from Hackerrank

https://www.hackerrank.com/contests/find-google/challenges/find-google/problem

and came up with this pattern "^[gG][o0O()\[\]{}][o0O()\[\]{}][gG][lLI][eE3]" But this is giving wrong answer for test case g()()GI3. Can anyone tell me the error? Also tell me if there is more efficient expression for this.

    import java.util.regex.*;
    import java.io.*;
    import java.util.*;
    class Main {

public static void main (String[] args) {
    Scanner s = new Scanner(System.in);
    String str = s.next();
    Pattern pattern = Pattern.compile("^[gG][o0O()\\[\\]{}][o0O()\\[\\]{}][gG][lLI][eE3]",Pattern.CASE_INSENSITIVE);
    Matcher matcher = pattern.matcher(str);
    if(matcher.matches())
    System.out.println("YES");
    else System.out.println("NO");
}}
  • `[...]` character class can match only *one* symbol from specified range, so `[()]` doesn't represent `()` *sequence* but single `(` OR `)`. Same for `{}` (which probably should be `<>`). – Pshemo Dec 31 '18 at 12:37

3 Answers3

4

The problem with the current regex is that [], () and <> are put inside a character class that matches a single char inside it, but (), <> and [] char sequences consist of 2 chars.

You need to use a grouping construct with an alternation operator here to match os.

You may use this pattern with Pattern.matches():

Pattern pattern = Pattern.compile("[gG](?:[oO0]|\(\)|\[]|<>){2}[gG][LlI][eE3]");

See the regex demo

Details

  • [gG] - g or G
  • (?:[oO0]|\(\)|\[]|<>){2} - two occurrences of
    • [o0] - o, O or 0
    • | - or
    • \(\) - a () substring
    • | - or
    • \[] - a [] substring
    • | - or
    • <> - a <> substring
  • [gG] - g or G
  • [LlI] - l, L or I
  • [eE3] - e, E or 3.
Wiktor Stribiżew
  • 607,720
  • 39
  • 448
  • 563
  • it is giving error in java prog.java:10: error: illegal escape character. Also why have you used ?: – abhineet agarwal Dec 31 '18 at 12:42
  • @abhineetagarwal Use ``\\`` instead of ``\``. – Pshemo Dec 31 '18 at 12:43
  • why are ?: used. Also why do we need \\ before ( and ) but not before [ and ] in "[gG](?:[oO0]|\\(\\)|\\[]|<>){2}[gG][LlI][eE3]" – abhineet agarwal Dec 31 '18 at 12:46
  • @abhineetagarwal `?:` is non capturing group. '\\' is used to escape character which have special meaning it regex – Code Maniac Dec 31 '18 at 12:47
  • @abhineetagarwal about `(?:...)` read [What is a non-capturing group? What does (?:) do?](https://stackoverflow.com/q/3512471). – Pshemo Dec 31 '18 at 12:47
  • Oh. I understand. But do we need delimiter before ( and ) and not for [and ] – abhineet agarwal Dec 31 '18 at 12:54
  • Thank you @Pshemo. I changed the pattern to "[gG](?:[oO0]|\\(\\)|<>|\\[]){2}[gG][lLI][eE3]" . But in hackerrank, the test case "google " is failing as my code is giving answer as true. But the expected output is false. Does an extra space at the end of the test case "google " affect the answer? – abhineet agarwal Dec 31 '18 at 13:07
  • @abhineetagarwal "why do we need \\ before ( and ) but not before [ and ]" but we do need \\ before `[`. We don't need it before `]`. Rationale is that if `]` doesn't have unescaped `[` and unclosed before it, regex engine cant match see `]` as ending of character class. In case of `)` similar rationale could be used but it would be more error-prone for programmers who write regex since groups `(..)` - especially nested - are used more often than `[]` so letting unescaped `)` would make regex harder to read. Also `)` is part of more regex mechanisms like `(?=..)`. – Pshemo Dec 31 '18 at 13:08
  • "the test case "google" is failing as my code is giving answer as true. But the expected output is false" either you are misreading something or it is bug in question. If we take a look at first example of input/expected output at https://www.hackerrank.com/contests/find-google/challenges/find-google/problem we see there `Sample Input 0 google` `Sample Output 0 True`. OR maybe it is case sensitive and it expects `True` not `true`. – Pshemo Dec 31 '18 at 13:12
  • does an extra space at the end of the input "google " affects the output? – abhineet agarwal Dec 31 '18 at 13:13
  • @abhineetagarwal Yes, space is also a character so regex treats is as such. Try to apply regex on result of `string.trim()` if you don't want regex to handle spaces at start or end. – Pshemo Dec 31 '18 at 13:17
  • Yup. The whitespace was giving the error. I just applied a condition to compare trimmed and original string. Thanks @Pshemo. – abhineet agarwal Dec 31 '18 at 13:23
  • 1
    @abhineetagarwal If you want to allow any amount of leading/trailing whitespace chars in your strings, add `\s*` at the start and end: `"\\s*[gG](?:[oO0]|\(\)|\[]|<>){2}[gG][LlI][eE3]\\s*"`. Non-capturing group is preferred when the text the group matches is not used later, as in this case. – Wiktor Stribiżew Dec 31 '18 at 13:43
1

You can use this

You do not need to add both uppercase and lowercase of a letter when you use case insensitive flag.

g(?:o|0|<>|\[]){2}g[li][e3]
  • g - Matches g or G ( as case insensitive flag is on )
  • (?:o|0|<>|\[]) - Matches o or 0 or <> or [].
  • [li] - Matches L or l or I or i.
  • [e3] - Matches e or E or 3

Demo

Code Maniac
  • 37,143
  • 5
  • 39
  • 60
  • `(o|0|<>|\[])(o|0|<>|\[])` can be rewritten as `(o|0|<>|\[]){2}` which IMO better shows intend on this regex (not to mention that it would be easier to maintain when you realize that you also need to handle `()`). – Pshemo Dec 31 '18 at 12:37
0

I propose you this version of the regex string along with the code:

String string = bufferedReader.readLine();
String regex = "^[gG]([o0O]|\\(\\)|\\[\\]|\\{\\}|\\<\\>){2}[gG][lLI][eE3]";        
String result = Pattern.matches(regex, string) ? "True" : "False";
System.out.println(result);

Since it's a "one shot" search you can use the static method "matches" of the Pattern class. That is, desume the result directly from a one-time match against your regex.

The regex is mostly the same as the one you devised, but some points are worth noting.

It is dangerous (if not erroneous) using case insensitivity when your regex tries to pick up letters of differente case. If you want to do the matching the classic way, avoid using the flag of case insensitivity. (in this particular case it would match a "google" written with a lower case "i" in place of a "L", which would lead to false positives).

Since there is only one way to express the "o", it is better to group its definition in a single sub-expression and then use the quantifier " {2} " to say that you exactly want two instances of that subexpression to match.

You want to find two occurrences of EITHER

  • either a lower/upper case "o"
  • a zero
  • a pair of normal/square/curly/angular parentheses

Last but not least, if you are looking for simple, square, curly or angular braces, you must escape them because those are special characters in a regexp. Moreover, you must escape them the java way, using a double blackslash.

Here is the full code:

    public static void main(String[] args) throws IOException {
       BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(System.in));

       String string = bufferedReader.readLine();
       String regex = "^[gG]([o0O]|\\(\\)|\\[\\]|\\{\\}|\\<\\>){2}[gG][lLI][eE3]";        
       String result = Pattern.matches(regex, string) ? "True" : "False";
       System.out.println(result);

       bufferedReader.close();
}

It prints "True" when fed with the input "g()()GI3"

S3lvatico
  • 262
  • 3
  • 9