I'm trying to understand the following regex from Stroustrup's C++ 4th Ed., page 178.

"('(?:[^\\\\']|\\\\.)*'|\"(?:[^\\\\\"]|\\\\.)*\")|"

I believe it's not a raw string literal, so each special character needs a backslash in front of it. I've tried entering this at www.regex101.com, but I'm unable to find what it matches. I think I may not be separating the string-literal escaping backslashes from the ones that belong to the regex.

Does someone with more experience here have an example of what this matches and what the raw expression should be?

UPDATE: Because this is not a raw string literal, I removed the extra backslashes and came up with this string:

"('(?:[^\\']|\\.)*'|"(?:[^\\"]|\\.)*")|"

Unfortunately, I still can't figure out what it matches.

Thanks

notaorb
  • You have a typo in it. `∗` is `\u2217`. Replace it with `*`. Also, `ˆ` is a typo, replace it with `^`. Once you do that you will see it is meant to remove `"` outside of single and double quoted string literals if used in `regex_replace`. Yeah, half the backslashes, 4 becomes 2 in regex101.com input regex field. – Wiktor Stribiżew Jun 05 '20 at 08:32
  • @WiktorStribiżew I replaced the bad characters and slashes. I updated the post with that string, however I tried it on regex101.com and still can't figure out what it matches. Do you know if I created the correct raw string? – notaorb Jun 05 '20 at 15:37
  • There is no such a thing as a raw string. There are raw string literals. See https://regex101.com/r/zxhG3a/1 – Wiktor Stribiżew Jun 05 '20 at 15:40
  • @WiktorStribiżew re: non capturing group "(?:[^\\"]|\\.)*", how were you able to match \"? I thought that would be excluded from capture? – notaorb Jun 05 '20 at 16:40
  • `\\.` matches any escape sequence. – Wiktor Stribiżew Jun 05 '20 at 17:18

1 Answer

I believe it's not a raw string literal

Correct.

You can simply print the string to see the un-escaped version that you can paste into regex101.com, which can explain what it matches.
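A minimal sketch of the printing approach, assuming the literal is copied exactly as it appears in the question. The names `pattern_text` and `print_pattern` are only illustrative:

```cpp
#include <iostream>
#include <string>

// The ordinary (non-raw) string literal from the question, copied verbatim.
const std::string pattern_text = "('(?:[^\\\\']|\\\\.)*'|\"(?:[^\\\\\"]|\\\\.)*\")|";

// Writes the characters the regex engine actually receives,
// i.e. the text after the compiler has processed the escapes.
void print_pattern(std::ostream& out) {
    out << pattern_text << '\n';
}
```

Calling `print_pattern(std::cout)` should print `('(?:[^\\']|\\.)*'|"(?:[^\\"]|\\.)*")|`. That is the text to paste into regex101.com, and it is also what would go between `R"(` and `)"` if you rewrote the pattern as a raw string literal.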

These steps should work manually for this pattern:

  • Remove " from the beginning and end
  • Replace \" with "
  • Replace \\ with \. In other words, each remaining run of \ becomes half as long (after the previous step, every run has even length).
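The manual steps above can be sketched in code as a sanity check. This is not a general C++ unescaper: it assumes, as for this pattern, that the only escapes present are `\"` and `\\`, and the helper names `replace_all` and `unescape_manually` are made up for illustration:

```cpp
#include <string>

// Replaces every non-overlapping occurrence of `from` with `to`, left to right.
std::string replace_all(std::string text, const std::string& from, const std::string& to) {
    std::size_t pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos) {
        text.replace(pos, from.size(), to);
        pos += to.size();  // continue scanning after the replacement
    }
    return text;
}

// Applies the three manual steps to the source text of a string literal.
// Only valid when the literal contains no escapes other than \" and \\.
std::string unescape_manually(std::string source) {
    // Step 1: remove the surrounding double quotes.
    if (source.size() >= 2 && source.front() == '"' && source.back() == '"') {
        source = source.substr(1, source.size() - 2);
    }
    // Step 2: replace \" with ".
    source = replace_all(source, "\\\"", "\"");
    // Step 3: replace \\ with \ (halves each remaining run of backslashes).
    source = replace_all(source, "\\\\", "\\");
    return source;
}
```

Passing in the literal's source text (for example inside a raw string literal such as `R"src(...)src"`) should give the same characters the compiler produces for the original literal, so the two can be compared with `==`.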

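Because the top-level alternation has an empty right-hand alternative, it looks like this can match at any position of any input string: where a single- or double-quoted literal starts (backslash escapes inside it are allowed), it matches that literal, and everywhere else it matches the empty string.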
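To see this with `std::regex`, here is a small sketch that collects only the non-empty matches. It assumes the default ECMAScript grammar and writes the pattern as a raw string literal; `quoted_literals` is an illustrative name, not something from the book:

```cpp
#include <regex>
#include <string>
#include <vector>

// Returns the non-empty matches of the pattern in `input`.
// Empty matches (from the empty right-hand alternative) are skipped.
std::vector<std::string> quoted_literals(const std::string& input) {
    // Raw string literal form of the pattern from the question.
    static const std::regex pattern{R"re(('(?:[^\\']|\\.)*'|"(?:[^\\"]|\\.)*")|)re"};

    std::vector<std::string> found;
    for (std::sregex_iterator it{input.begin(), input.end(), pattern}, end; it != end; ++it) {
        if (it->length() > 0) {
            found.push_back(it->str());
        }
    }
    return found;
}
```

For the input `a 'b\'c' "d" e`, the non-empty matches should be `'b\'c'` and `"d"`. The `\\.` alternative is what lets the escaped `\'` stay inside the first literal instead of ending it.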

eerorika