I'm a regex noob attempting to match either the contents or the entirety of a quoted segment of text without breaking on escaped quotation marks.
Put another way, I need a regex that, between two quotation marks, will match all characters that are not quotation marks and also any quotation mark that has an odd number of consecutive backslashes preceding it. It has to be an odd number of backslashes because a pair of backslashes escapes to a single backslash.
I've successfully created a regex that does this, but it relies on look-behind. This project is in C++, and the regex implementation in standard C++ does not support look-behind, so I can't use that regex.
Here is the regex with look-behind that I came up with: "(((?<!\\)(\\\\)*\\"|[^"])*)"
The following text should produce 8 matches:
"Woah. Look. A tab."
"This \\\\\\\\\\\\\" is all one string"
"This \"\"\"\" is\" also\"\\ \' one\"\\\" string."
"These \\""are separate strings"
"The cat said,\"Yo.\""
"
\"Shouldn't it work on multiple lines?\" he asked rhetorically.
\"Of course it should.\"
"
"If you don't have exactly 8 matches, then you've failed."
Here's a picture of my (probably naive) look-behind version for the visual people among you (You know who you are):
And here's a link to this example: https://regex101.com/r/uOxqWl/1
If this is impossible to do without look-behind, please let me know. Also, if there is a well-regarded C++ regex library that supports look-behind, please let me know (it doesn't have to be ECMAScript, though I would slightly prefer that).
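
For reference, here is a minimal sketch of one look-behind-free formulation, assuming std::regex with its default ECMAScript grammar. Instead of looking back to count backslashes, it consumes the text left to right: each step takes either one ordinary character or a backslash together with whatever character follows it. A quotation mark preceded by an odd number of backslashes is then always swallowed as part of an escape pair. The helper name find_quoted is hypothetical, not something from my project or from any library, and I haven't checked how this behaves on very long strings.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (name is an assumption for illustration): returns the
// contents of every double-quoted segment in `text`, escapes left untouched.
std::vector<std::string> find_quoted(const std::string& text) {
    // "            opening quotation mark
    // [^"\\]       any character that is neither a quote nor a backslash
    // \\[\s\S]     a backslash plus whatever follows it (newlines included)
    // "            closing quotation mark
    static const std::regex quoted(R"re("((?:[^"\\]|\\[\s\S])*)")re");

    std::vector<std::string> contents;
    for (std::sregex_iterator it(text.begin(), text.end(), quoted), end;
         it != end; ++it) {
        contents.push_back((*it)[1].str());  // group 1 = text between the quotes
    }
    return contents;
}
```

Capture group 1 holds the contents and the whole match holds the entire quoted segment, so either can be used. Run over the sample text above, this approach should yield the expected 8 matches, with "These \\" and "are separate strings" coming out as two separate strings.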