This works just fine for a normal string literal such as "hello":

"([^"]*)"

But I also want my regex to match literals such as "hell\"o". This is what I have been able to come up with, but it doesn't work:

("(?=(\\")*)[^"]*")

Here I have tried to look ahead for \".

  • See [this link](http://stackoverflow.com/questions/17043454/using-regexes-how-to-efficiently-match-strings-between-double-quotes-with-embed). It allows you to do what you want and even more. – fge Apr 05 '14 at 15:05

3 Answers

How about

Pattern.compile("\"((\\\\\"|[^\"])*)\"")
                         ^^ - to match " literal
                     ^^^^   - to match \ literal
                     ^^^^^^ - will match \" literal

or

Pattern.compile("\"((?:\\\\\"|[^\"])*)\"")

if you don't want to add more capturing groups.

This regex accepts either \" or any character other than " between the quotation marks.


Demo:

    String input = "ab \"cd\" ef \"gh \\\"ij\"";
    Matcher m = Pattern.compile("\"((?:\\\\\"|[^\"])*)\"").matcher(input);
    while (m.find())
        System.out.println(m.group(1));

Output:

cd
gh \"ij
Pshemo
  • This will work, but it will "fail slow". See the link I posted above, which is "fail fast" ;) – fge Apr 05 '14 at 15:31
  • @fge The idea you describe is very nice, but I had trouble creating a regex that would also match `"ab\c\"d"`, where, as you can see, ``\`` appears not only before `"` but also before other characters. – Pshemo Apr 05 '14 at 15:49
  • Simple modification: just replace `special` with `\\.` ;) – fge Apr 05 '14 at 15:52
  • I thought about that earlier. In most cases it is a good solution, because an even number of ``\`` characters shouldn't be treated as escaping the following quote, so for `"ab\\"c"` the correct match would be ``ab\\``. But if the OP for some reason doesn't want this "functionality", a simple regex like the one I posted will do the job. – Pshemo Apr 05 '14 at 16:06
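
The difference discussed in these comments can be sketched in Java. This is only an illustration under assumptions: the class name `EscapeAwareQuotes`, the helper `contents`, and the exact "normal* (special normal*)*" pattern are not taken from the thread or the linked post; the pattern is one plausible reading of the suggestion to use `\\.` as the special part.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
class EscapeAwareQuotes {
    // Simple pattern from the answer: \" or any non-quote character.
    static final Pattern SIMPLE =
            Pattern.compile("\"((?:\\\\\"|[^\"])*)\"");

    // Assumed escape-aware form: normal* (special normal*)*
    // normal  = [^"\\]  (anything but a quote or a backslash)
    // special = \\.     (a backslash followed by any one character)
    static final Pattern ESCAPE_AWARE =
            Pattern.compile("\"([^\"\\\\]*(?:\\\\.[^\"\\\\]*)*)\"");

    // Collects group 1 (the text between the quotes) of every match.
    static List<String> contents(Pattern p, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = p.matcher(input);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "\"ab\\\\\"c\"";                    // raw text: "ab\\"c"
        System.out.println(contents(SIMPLE, input));       // [ab\\"c]
        System.out.println(contents(ESCAPE_AWARE, input)); // [ab\\]
    }
}
```

On `"ab\c\"d"` both patterns capture `ab\c\"d`; they differ only when a backslash is itself escaped directly before a quote, as in the `main` example.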

Use this method:

"((?:[^"\\\\]*|\\\\.)*)"

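[^"\\\\]* no longer matches \ either, while the other alternative, \\\\., matches any escaped character (a backslash followed by any single character). The backslashes are doubled here as they would be in a Java string literal; there, the quotes also need to be written as \".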
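
As a rough sketch of how this pattern might be used from Java (the class name `EscapedCharDemo` and the method `extract` are made up for illustration, and the quotes are escaped for the string literal):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper around the pattern above; names are illustrative only.
class EscapedCharDemo {
    // Regex seen by the engine: "((?:[^"\\]*|\\.)*)"
    static final Pattern QUOTED =
            Pattern.compile("\"((?:[^\"\\\\]*|\\\\.)*)\"");

    // Returns the text between each pair of quotes, escapes left as written.
    static List<String> extract(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = QUOTED.matcher(input);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        // raw text: ab "cd" ef "gh \"ij"
        for (String s : extract("ab \"cd\" ef \"gh \\\"ij\"")) {
            System.out.println(s); // prints cd, then gh \"ij
        }
    }
}
```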

Jerry

Try with this one:

Pattern pattern = Pattern.compile("\"((?:\\\\\"|[^\"])*)\"");

\\\\\" to match \" (the regex needs \\ for a literal backslash, which is written \\\\ in a Java string), or

[^\"] to match anything but "

Sabuj Hassan