3

Basically, what I need is a regular expression that selects only double quotes that are not enclosed in single quotes. (The goal is to quickly refactor double-quoted strings into single-quoted ones without breaking any nested strings.)

Example (same as here):

"foo" => 'foo'
'foo' => 'foo'
abc "foo" => abc 'foo'
foo "bar", "baz" => foo 'bar', 'baz'
abc 'foo "bar" baz' => abc 'foo "bar" baz'

While searching for this question, I found out how to do this in PCRE, but I wasn't able to figure out how to convert the (*SKIP)(*F) into a usable JavaScript regex.

My own JavaScript attempt is: /(?:('.*["].*')|")/g (live demo).

The first alternative, /'.*["].*'/, does a good job of matching what I eventually want to exclude ('foo "bar" baz'), but I'm unsure how to tell the expression to exclude it once it has matched.

I've tried playing around with the negative lookahead (?!...), with no success.

If anyone has an idea for a better regex or an alternative solution to the problem, I'd appreciate it.

EDIT:

As additional information, the regular expressions are used in the search-and-replace functions of WebStorm/PHPStorm to refactor source code.

Hanna
  • This would be a heck of a lot easier when done with the mentioned `(*SKIP)(*FAIL)` mechanism. – Jan Sep 22 '16 at 00:50
  • Yes it would! Unfortunately it's not valid regex in that program. – Hanna Sep 22 '16 at 00:52
  • Can't you run a script over the code in question? This comes down to about five lines, really. If this is not an option, have a look at http://www.rexegg.com/regex-best-trick.html and scroll down to the `JS` section - but even this requires full JavaScript functionality, which `PHPStorm` might not have (not tested). – Jan Sep 22 '16 at 00:53
  • Jan, actually that's a good thought. Do you have a recommendation on that (software/script)? – Hanna Sep 22 '16 at 00:56

2 Answers

4

You can use this regex:

"(?=(?:[^']*'[^']*')*[^']*$)

It matches any double quote that sits outside single quotes (working sample).

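The trick is to look ahead from each double quote to the end of the line and consume the remaining single quotes in pairs. The match is accepted only if an even number of single quotes follows; an odd number means the double quote is inside a single-quoted string, so it is rejected.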
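To see the lookahead at work outside the IDE, here is a minimal JavaScript sketch that runs the regex over the examples from the question. The helper name `toSingleQuotes` and the choice to split the input and process one line at a time are assumptions made for illustration; the per-line handling keeps `$` and the quote counting confined to a single line.

```javascript
// Regex from the answer above; only the g flag is added so that every
// double quote on a line is replaced, not just the first one.
const outsideSingleQuotes = /"(?=(?:[^']*'[^']*')*[^']*$)/g;

// Hypothetical helper: handles the text line by line so the lookahead
// only ever counts single quotes on the current line.
function toSingleQuotes(text) {
  return text
    .split("\n")
    .map((line) => line.replace(outsideSingleQuotes, "'"))
    .join("\n");
}

const samples = [
  '"foo"',
  "'foo'",
  'abc "foo"',
  'foo "bar", "baz"',
  `abc 'foo "bar" baz'`,
];

for (const line of samples) {
  console.log(`${line} => ${toSingleQuotes(line)}`);
}
```

Because the approach only counts single quotes, a stray apostrophe later on the same line (as in `it's`) shifts the parity and changes which double quotes are matched.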

Leonardo Xavier
  • This is really great, Koala -- and thanks for the explanation. Would this also be easy to expand to also exclude double-quotes that have single-quotes inside them as well? – Hanna Sep 22 '16 at 15:54
  • Yes, use the regex `"([^'\n]+)"(?=(?:[^']*'[^']*')*[^']*$)`. It now matches the whole quoted string and captures its content, so you will need to replace with `'$1'`. – Leonardo Xavier Sep 22 '16 at 18:51
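
A rough JavaScript sketch of the capture-and-replace idea from the comment above. It is a variant, not the comment's exact pattern: the character class here also excludes `"` and uses `*` instead of `+`, an assumption made so that two strings on one line are not swallowed by a single greedy match and so that `""` is handled. The helper name `requoteSimpleStrings` is hypothetical.

```javascript
// Variant of the pattern from the comment: the captured content may contain
// neither quote character nor a newline, and the same lookahead still
// requires an even number of single quotes after the closing double quote.
const simpleDoubleQuoted = /"([^'"\n]*)"(?=(?:[^']*'[^']*')*[^']*$)/g;

// Hypothetical helper: processes one line at a time, like the IDE search.
function requoteSimpleStrings(text) {
  return text
    .split("\n")
    .map((line) => line.replace(simpleDoubleQuoted, "'$1'"))
    .join("\n");
}

console.log(requoteSimpleStrings('foo "bar", "baz"'));
console.log(requoteSimpleStrings(`abc 'foo "bar" baz'`));
// A double-quoted string containing an apostrophe is left untouched.
console.log(requoteSimpleStrings(`say "it's"`));
```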
2

In addition to the comment:

<?php

$data = <<<DATA
"foo"
'foo'
abc "foo"
foo "bar", "baz"
abc 'foo "bar" baz'
DATA;

$regex = "~
        '[^']*'(*SKIP)(*FAIL) # match everything between single quotes and fail 
        |                     # or
        \"([^\"]*)\"          # match double quotes
        ~x";

$data = preg_replace($regex, "'$1'", $data);

echo $data;
?>

See a demo on ideone.com.
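
For comparison, JavaScript has no `(*SKIP)(*FAIL)`, but the same "match what you want to skip, then ignore it" idea can be sketched with an alternation and a replacement callback, along the lines of the rexegg article linked in the comments. This is a sketch that assumes the code is run as a script (for example with Node.js) rather than through the IDE's search-and-replace; the function name `requote` is hypothetical.

```javascript
// First branch: a complete single-quoted string (to be kept unchanged).
// Second branch: a double-quoted string whose content lands in group 1.
const quoted = /'[^']*'|"([^"]*)"/g;

// Hypothetical helper: rewrites only the matches of the second branch.
function requote(text) {
  return text.replace(quoted, (match, inner) =>
    // `inner` is undefined when the single-quoted branch matched.
    inner === undefined ? match : `'${inner}'`
  );
}

const data = [
  '"foo"',
  "'foo'",
  'abc "foo"',
  'foo "bar", "baz"',
  `abc 'foo "bar" baz'`,
].join("\n");

console.log(requote(data));
```

Since the single-quoted branch consumes the whole string, any double quotes inside it are never seen by the second branch, which is the role `(*SKIP)(*FAIL)` plays in the PHP version.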

Jan