
I am using regex to strip quotes from a String value. These String values can contain escaped quotes but also escaped backslash characters.

I want to remove only non-escaped quotes and keep escaped ones. The difficulty is the case where escaped backslash characters precede a non-escaped quote.

I want results like the following:

"value"         ->  value
'value'         ->  value
"\"value\""     ->  \"value\"   <-- contains escaped quotes
"value\"        ->  value\"
"value\\"       ->  value\\     <-- contains escaped backslash before non-escaped quote
"""val"ue\\\""" ->  value\\\"

The following regex almost works for me, except that it also strips the backslashes when an even number of them precedes a quote. I only want to strip the double and single quote characters themselves.

(?<!\\\\)(?:\\\\{2})*[\"']
  • Why should `"""val"ue\\\"""` result in `value\\\"`? What about the `"` in the middle of `value`? – Pshemo Jul 03 '18 at 12:01
  • I was thinking about something like `(["'])(?:\\.|.)*?\1` to match entire quote (`\.` is used to match *any escaped character*) but above example makes it harder. – Pshemo Jul 03 '18 at 12:03
  • Hello, please see both marked questions. They will definitely help you to port. Otherwise edit your question to reflect differences. – revo Jul 03 '18 at 12:20

1 Answer


The problem occurs because the pattern matches (and therefore consumes) those backslashes, so they are removed together with the quote. To keep them, capture the backslashes and put them back with the $1 placeholder in the replacement:

s.replaceAll("((?<!\\\\)(?:\\\\{2})*)[\"']", "$1")

See the regex demo.

The ((?<!\\\\)(?:\\\\{2})*) part is now wrapped in a capturing group (...), so the replacement pattern can refer to the text it captured as $1.
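As a rough check, the replacement can be run against the sample inputs from the question. This is only a sketch: the class name QuoteStripper and the method stripUnescapedQuotes are made up for illustration, and the inputs are the question's examples written as Java string literals.

```java
import java.util.regex.Pattern;

class QuoteStripper {
    // Group 1: an even-length run of backslashes (possibly empty) that is not
    // itself preceded by a backslash. It is followed by the quote to strip.
    private static final Pattern UNESCAPED_QUOTE =
            Pattern.compile("((?<!\\\\)(?:\\\\{2})*)[\"']");

    // Hypothetical helper: removes non-escaped quotes, keeps the backslashes.
    static String stripUnescapedQuotes(String s) {
        // "$1" writes the captured backslashes back; only the quote is dropped.
        return UNESCAPED_QUOTE.matcher(s).replaceAll("$1");
    }

    public static void main(String[] args) {
        String[] samples = {
            "\"value\"",                    // "value"
            "'value'",                      // 'value'
            "\"\\\"value\\\"\"",            // "\"value\""
            "\"value\\\"",                  // "value\"
            "\"value\\\\\"",                // "value\\"
            "\"\"\"val\"ue\\\\\\\"\"\""     // """val"ue\\\"""
        };
        for (String sample : samples) {
            System.out.println(sample + "  ->  " + stripUnescapedQuotes(sample));
        }
    }
}
```

A quote preceded by an odd number of backslashes is left alone, because the lookbehind rejects any match position that directly follows a backslash.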

Wiktor Stribiżew