
I want to find a string (say x) that satisfies two conditions:

  1. matches the pattern \b(x)\b
  2. does not match the pattern ".*?(x).*?(?<!\\)"

In other words, I am looking for an occurrence of x that is a complete word (condition 1) and is not inside double quotes (condition 2).

  • " x /" m": not acceptable.
  • " x \" " + x + " except": only the second x is acceptable.

What Java code will find x?

Toby Speight
Mary
  • http://stackoverflow.com/questions/2667727/regular-expression-to-match-text-outside-quotes-etc, http://stackoverflow.com/questions/632475/regex-to-pick-commas-outside-of-quotes, http://stackoverflow.com/questions/6462578/alternative-to-regex-match-all-instances-not-inside-quotes (JS) - The best way is to either remove all double-quoted substrings (say with `"[^"]*"`) and then all the `x` you find are those you need or match the double-quoted strings and match and capture `x` into Group 1 - it will be your result. – Wiktor Stribiżew May 05 '16 at 07:25
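The second option in the comment above (match the double-quoted strings, and capture x into Group 1 only outside them) could look roughly like the following in Java. This is only a sketch: the class name `OutsideQuotesByAlternation` and the method `find` are made up for illustration, and it assumes the quotes in the input are balanced and that a backslash escapes the character after it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class OutsideQuotesByAlternation {

    // Returns the start index of every whole-word occurrence of `word`
    // that is not inside a double-quoted string.
    static List<Integer> find(String text, String word) {
        // Alternative 1 consumes a complete quoted string, treating a backslash
        // plus the following character as one unit, so \" does not end the string.
        // Alternative 2 captures the word; it is only tried at positions that
        // alternative 1 has not already consumed.
        Pattern p = Pattern.compile(
                "\"(?:\\\\.|[^\"\\\\])*\"|\\b(" + Pattern.quote(word) + ")\\b");
        Matcher m = p.matcher(text);
        List<Integer> starts = new ArrayList<>();
        while (m.find()) {
            if (m.group(1) != null) {       // group 1 is set only for matches outside quotes
                starts.add(m.start(1));
            }
        }
        return starts;
    }

    public static void main(String[] args) {
        String text = "basdf + \" asdf \\\" b\" + b + \"wer \\\"\"";
        System.out.println(find(text, "b"));   // prints [23]
    }
}
```

If a quote is left unterminated, the first alternative fails at that quote and words after it may be reported, so this is only suitable when the quoting is well formed.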

1 Answer

The first condition is straightforward. To check the second condition, count the unescaped double quotes that precede the match. If that count is even, the match found by the first condition lies outside any quoted string and is valid.

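import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindOutsideQuotes {
    public static void main(String[] args) {
        String text = "basdf + \" asdf \\\" b\" + b + \"wer \\\"\"";
        String toCapture = "b";
        // Pattern.quote makes toCapture match literally even if it contains regex metacharacters
        Pattern pattern1 = Pattern.compile("\\b" + Pattern.quote(toCapture) + "\\b");
        Pattern pattern2 = Pattern.compile("(?<!\\\\)\"");  // a double quote not preceded by a backslash
        Matcher m1 = pattern1.matcher(text);
        while (m1.find()) {                                  // each whole-word match (first condition fulfilled)
            int start = m1.start();
            Matcher m2 = pattern2.matcher(text);
            int count = 0;
            while (m2.find() && m2.start() < start) {        // count the unescaped double quotes before the match
                count++;
            }
            if (count % 2 == 0) {                            // even count: the match is outside any quoted string
                char[] tcar = new char[text.length()];
                Arrays.fill(tcar, '-');
                tcar[start] = '^';                           // mark the position of the match
                System.out.println(start);
                System.out.println(text);
                System.out.println(new String(tcar));
            }
        }
    }
}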

Output:

23
basdf + " asdf \" b" + b + "wer \""
-----------------------^-----------
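One caveat with the lookbehind `(?<!\\)"`: it treats every quote that follows a backslash as escaped, so a literal that ends in an escaped backslash, such as `"a\\"`, is miscounted. The quotes are also recounted from the start of the text for every match. A possible variant of the same idea, sketched below, scans the text once and records which positions are inside quotes. The names `OutsideQuotesScanner`, `quotedPositions` and `find` are invented for this example, and it assumes a non-empty search word and that a backslash always escapes the next character.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class OutsideQuotesScanner {

    // inside[i] is true when position i is part of a double-quoted string
    // (the delimiting quotes themselves count as inside).
    static boolean[] quotedPositions(String text) {
        boolean[] inside = new boolean[text.length()];
        boolean inQuotes = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\\' && i + 1 < text.length()) {
                // Assumption: a backslash escapes the next character, so skip both.
                inside[i] = inQuotes;
                inside[i + 1] = inQuotes;
                i++;
            } else if (c == '"') {
                inside[i] = true;
                inQuotes = !inQuotes;       // an unescaped quote opens or closes a string
            } else {
                inside[i] = inQuotes;
            }
        }
        return inside;
    }

    // Start indices of whole-word occurrences of `word` (assumed non-empty) outside quotes.
    static List<Integer> find(String text, String word) {
        boolean[] inside = quotedPositions(text);
        Matcher m = Pattern.compile("\\b" + Pattern.quote(word) + "\\b").matcher(text);
        List<Integer> starts = new ArrayList<>();
        while (m.find()) {
            if (!inside[m.start()]) {
                starts.add(m.start());
            }
        }
        return starts;
    }

    public static void main(String[] args) {
        System.out.println(find("basdf + \" asdf \\\" b\" + b + \"wer \\\"\"", "b")); // prints [23]
        System.out.println(find("\"a\\\\\" + b", "b"));                               // prints [8]
    }
}
```

The second call uses the text `"a\\" + b`. The lookbehind version would count only one quote before that `b` and reject it, while the scan sees the closing quote and accepts it.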
afzalex
  • Thanks, your answer appears correct. Another solution is to first store all of the quoted statements in an array and replace them with #, then look for the first pattern and, after matching, restore those statements in the line. – Mary May 07 '16 at 04:08
  • I don't advise replacing the quotes with hashes when it can be done in one step. I also checked the code against your inputs. – afzalex May 07 '16 at 04:54