Possible Duplicate:
How to match a quoted string with escaped quotes in it?
I'm building a parser and I need a method that matches a string. The string starts and ends with a ". Everything up to the next " that is not escaped should be matched. Escaped means that there's an odd number of backslashes before it (e.g. \" or \\\").
Some examples; the part before => is the input and the other part is what the method should extract:
"Hello World" => "Hello World"
"Hello" World => "Hello"
"Hello \" World" => "Hello \" World"
"Hello \\" World => "Hello \\"
I guess in most programming languages the backslashes need to be escaped to get an actual backslash in the string. That means one would need two backslashes in the source code to get one real backslash inside the string. The above examples ignore this and show the raw characters.
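To make the escaping point concrete for Ruby, here is a small sketch that writes example 4 three ways. The variable names are just for illustration. The single-quoted heredoc is the only form where the source text looks exactly like the raw input:

```ruby
# Example 4 from the list above, written three ways.
escaped = "\"Hello \\\\\" World"  # double-quoted: \" is a quote, \\ is one backslash
single  = '"Hello \\\\" World'     # single-quoted: \\ still collapses to one backslash
raw     = <<~'EOS'.chomp           # single-quoted heredoc: no escape processing at all
  "Hello \\" World
EOS

puts raw                # the raw input, two real backslashes
puts escaped == single  # both literals build the same string
puts single == raw
```

All three variables hold the same string, so any of the forms can be used when testing the examples.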
I came up with this regular expression (I'm using Ruby):
/
"
(?:
(?:\\{2})* # an even amount of backslashes
\\ # followed by a single backslash: odd amount of backslashes
"
|
[^"]
)*
"
/x
However, it doesn't work correctly with the third example string, or with any string that has a backslash to escape a ". I noticed that when I remove the * in the third-to-last line, escaping the " works, but then it doesn't work correctly with example 4.
I spent a long time trying to fix this regex, but I couldn't figure out how to. I know the question might be a little overwhelming, so tell me if you need more information!
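For reference, one common way to express the rule above is to consume a backslash together with whatever character follows it, and to let the plain-character class exclude backslashes (unlike [^"], which also accepts them). The following is only a sketch under that assumption, not a verified fix of the regex above; the names QUOTED and extract_quoted are hypothetical:

```ruby
# Sketch: a quote, then any run of "plain" characters or backslash-escaped
# characters, then the closing quote. Consuming backslashes in pairs
# (backslash + next character) means a quote is only skipped when it is
# preceded by an odd number of backslashes.
QUOTED = /
  "
  (?:
    [^"\\]   # any character except a quote or a backslash
    |
    \\.      # a backslash followed by any single character
  )*
  "
/mx          # m: let . match a newline too; x: allow whitespace and comments

# Hypothetical helper: returns the first quoted string in input, or nil.
def extract_quoted(input)
  input[QUOTED]
end

# A single-quoted heredoc keeps every backslash literal, so these lines
# are exactly the four example inputs.
examples = <<~'EOS'.lines.map(&:chomp)
  "Hello World"
  "Hello" World
  "Hello \" World"
  "Hello \\" World
EOS

examples.each { |s| puts "#{s} => #{extract_quoted(s)}" }
```

For the four example inputs this prints the extractions listed at the top of the question.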