1

My file contains lines such as:

"This is a string." = "This is a string's content."
" Another \" example \"" = " New example."
"My string
can have several lines." = "My string can have several lines."

I need to extract these substrings:

This is a string.
This is a string's content.
 Another \" example \"
 New example.
My string
can have several lines.
My string can have several lines.

Here's my code:

String regex = "\".*?\"\\s*?=\\s*?\".*?\"";
Pattern pattern = Pattern.compile(regex,Pattern.DOTALL);
Matcher matcher = pattern.matcher(file);

At the moment, I can get the left and right parts of each "=" pair. But when a substring contains an escaped quote ( \" ), my regex doesn't do the right job.

Can anyone help me write the correct regex, please? I tried \"^[\\"] instead of \", but it didn't work.

Thanks in advance.

3 Answers

3
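import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// subjectString holds the text to search (for example, the file contents).
List<String> matchList = new ArrayList<String>();
Pattern regex = Pattern.compile(
    "\"          # Match a quote\n" +
    "(           # Capture in group number 1:\n" +
    " (?:        # Match either...\n" +
    "  \\\\.     # an escaped character\n" +
    " |          # or\n" +
    "  [^\"\\\\] # any character except quotes or backslashes\n" +
    " )*         # Repeat as needed\n" +
    ")           # End of capturing group\n" +
    "\"          # Match a quote",
    Pattern.COMMENTS);
Matcher regexMatcher = regex.matcher(subjectString);
// Each match is one quoted string; group 1 is its content without the quotes.
while (regexMatcher.find()) {
    matchList.add(regexMatcher.group(1));
}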
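
In case it helps, here is a minimal sketch of how the same "escaped character or non-quote, non-backslash character" idea could be used to pull out the left and right side of each "=" in one pass, as the question asks. The class and method names (QuotedPairs, extractPairs) are made up for illustration, and adding Pattern.DOTALL is my own assumption so that a backslash followed by a line break is also treated as an escape.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedPairs {
    // One quoted string: a quote, then any run of escaped characters or
    // characters that are neither quote nor backslash, then the closing quote.
    private static final String QUOTED = "\"((?:\\\\.|[^\"\\\\])*)\"";

    // Two quoted strings separated by "=" with optional whitespace around it.
    // DOTALL (an assumption, see above) lets "\\." cover backslash + newline.
    private static final Pattern PAIR =
        Pattern.compile(QUOTED + "\\s*=\\s*" + QUOTED, Pattern.DOTALL);

    /** Returns a {left, right} array for every pair found in the input. */
    static List<String[]> extractPairs(String input) {
        List<String[]> pairs = new ArrayList<>();
        Matcher m = PAIR.matcher(input);
        while (m.find()) {
            pairs.add(new String[] { m.group(1), m.group(2) });
        }
        return pairs;
    }

    public static void main(String[] args) {
        // The three sample lines from the question, as one Java string.
        String file =
            "\"This is a string.\" = \"This is a string's content.\"\n"
          + "\" Another \\\" example \\\"\" = \" New example.\"\n"
          + "\"My string\ncan have several lines.\" = \"My string can have several lines.\"\n";
        for (String[] pair : extractPairs(file)) {
            System.out.println("[" + pair[0] + "] => [" + pair[1] + "]");
        }
    }
}
```

The escaped quotes stay in the extracted text as \" ; if the unescaped form is needed, that would be a separate replacement step afterwards.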
Tim Pietzcker
0

Sorry, I'm somewhere I can't test this, but does

\".*?(?:[^\\\\]\")\\s*=\\s*\".*?(?:[^\\\\]\")

work?

I just replaced each closing \" with (?:[^\\\\]\") so that it no longer matches when the character before it is a backslash. (Inside a Java string literal, a literal backslash in the regex has to be written as \\\\.)
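
Since this answer was posted untested, here is a small sketch for checking it. It assumes the pattern is pasted into a Java string literal, where the backslash inside the character class has to be written with four backslashes, and that Pattern.DOTALL is kept from the question's code. The class and method names (PrecedingCharCheck, matchesWhole, findsAny) are hypothetical.

```java
import java.util.regex.Pattern;

class PrecedingCharCheck {
    // A closing quote only counts if the character before it is not a backslash.
    static final Pattern PAIR = Pattern.compile(
        "\".*?(?:[^\\\\]\")\\s*=\\s*\".*?(?:[^\\\\]\")", Pattern.DOTALL);

    /** True if the whole input is exactly one key = value pair. */
    static boolean matchesWhole(String input) {
        return PAIR.matcher(input).matches();
    }

    /** True if a pair is found anywhere in the input. */
    static boolean findsAny(String input) {
        return PAIR.matcher(input).find();
    }

    public static void main(String[] args) {
        // The escaped-quote line from the question.
        String escaped = "\" Another \\\" example \\\"\" = \" New example.\"";
        System.out.println(matchesWhole(escaped));

        // An empty key: the pattern needs at least one character before
        // each closing quote, so this input is not found.
        System.out.println(findsAny("\"\" = \"x\""));
    }
}
```

One consequence of requiring a character before the closing quote is that empty strings ("") are not matched, so this approach only fits if the file never contains them.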

ronalchn
Mitja
  • Actually, Stack Overflow stripped some backslashes from my last line, but you can see them correctly in the code line (I hope). – Mitja Sep 12 '12 at 09:40
-1
/"([^"\\]*(?:\\.[^"\\]*)*)"/

Source. Also see this previous question.
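
The regex above is written in /.../ literal form, so here is a rough sketch of what it might look like in Java, where every backslash is doubled and every quote escaped inside the string literal. The class and method names (UnrolledQuotes, extract) are only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnrolledQuotes {
    // Same regex as above without the /.../ delimiters: a run of plain
    // characters, then any number of (escape + run of plain characters).
    static final Pattern QUOTED =
        Pattern.compile("\"([^\"\\\\]*(?:\\\\.[^\"\\\\]*)*)\"");

    /** Returns the content of every quoted string, without the outer quotes. */
    static List<String> extract(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = QUOTED.matcher(input);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        // The escaped-quote line from the question.
        String line = "\" Another \\\" example \\\"\" = \" New example.\"";
        for (String s : extract(line)) {
            System.out.println("[" + s + "]");
        }
    }
}
```

This returns the strings one after another rather than as pairs, so the left and right sides of each "=" would be consecutive entries in the list.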

Community
mogelbrod