
I want to run PHP's preg_replace_callback against all single- or double-quoted strings. I'm using the code from https://codereview.stackexchange.com/a/217356, which also handles backslash-escaped single/double quotes.

const PATTERN = <<<'PATTERN'
~(?|(")(?:[^"\\]|\\(?s).)*"|(')(?:[^'\\]|\\(?s).)*'|(#|//).*|(/\*)(?s).*?\*/|(<!--)(?s).*?-->)~
PATTERN;

$result = preg_replace_callback(PATTERN, function ($m) {
    return $m[1] . "XXXX" . $m[1];
}, $test);

This runs into a problem when the scanned text contains JavaScript .replace() calls whose first argument is a regex literal with a quote in it, e.g.

x=y.replace(/'/g, '"');

... where the pattern treats '/g, ' as a string, and the remaining "');....... as the start of the following string.

To work around this, I'd like to do the callback as usual, except when the quotes are inside the first argument of .replace(), since those break the quote pairing.

I.e. do the standard callbacks, but in abc.replace(/\'/, "XXXX"); I want to change only the XXXX part and ignore the \' inside the regex literal.
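The failure can be reproduced with a minimal sketch like the one below. The pattern and callback are the ones from the question; the function name `maskNaive` and the `$js` variable are hypothetical names introduced only for this illustration.

```php
<?php
// Pattern copied from the question.
const PATTERN = <<<'PATTERN'
~(?|(")(?:[^"\\]|\\(?s).)*"|(')(?:[^'\\]|\\(?s).)*'|(#|//).*|(/\*)(?s).*?\*/|(<!--)(?s).*?-->)~
PATTERN;

// Hypothetical helper: replaces the body of every match with XXXX,
// keeping the opening delimiter captured in group 1.
function maskNaive(string $src): ?string
{
    return preg_replace_callback(PATTERN, function ($m) {
        return $m[1] . "XXXX" . $m[1];
    }, $src);
}

// The JavaScript line from the question, stored verbatim via nowdoc.
$js = <<<'JS'
x=y.replace(/'/g, '"');
JS;

// The quote inside the regex literal is taken as the start of a string,
// so '/g, ' is replaced and the real second argument is left untouched.
echo maskNaive($js), PHP_EOL; // prints: x=y.replace(/'XXXX'"');
```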

How can I do this?

See https://onlinephp.io/c/5df12 (updated: https://onlinephp.io/c/8a697, edited to correct a missing slash) for a running example showing some successes (in green) and some failures (in red).

Note, the XXXX is a placeholder for some more work later.

Also note that I have looked at "Javascript regex to match a regex", but that question is about matching regexes, whereas I want to exclude them. Plugging its regex pattern into my code does not work, so it should not be considered a valid answer.

user1432181

1 Answer


You can use the backtracking control verbs (*SKIP)(*F) to skip whatever the part before them has matched. For example, to skip a regex literal in the first argument:

\(\s*/.*?/\w*\h*,(*SKIP)(*F)|(?|(")[^"\\]*(?:\\.[^"\\]*)*"|(')[^'\\]*(?:\\.[^'\\]*)*')

See this demo at regex101 or your updated php demo


The pattern on the skipped side is very simple; you might want to improve it further.
I also used a slightly more efficient pattern to match the quoted parts, explained here.
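As a rough sketch of how the pattern above could be plugged into the callback from the question: the `~` delimiters, the constant name `SKIP_PATTERN` and the function name `maskSkippingRegexArg` are assumptions made for this example, not part of the original code.

```php
<?php
// The answer's pattern wrapped in ~ delimiters (delimiter choice is an assumption).
// Left side: "(" + regex literal + flags + "," is consumed, then (*SKIP)(*F)
// discards it and resumes scanning after the comma.
// Right side: quoted strings, with the quote character captured in group 1.
const SKIP_PATTERN = <<<'PATTERN'
~\(\s*/.*?/\w*\h*,(*SKIP)(*F)|(?|(")[^"\\]*(?:\\.[^"\\]*)*"|(')[^'\\]*(?:\\.[^'\\]*)*')~
PATTERN;

// Hypothetical helper mirroring the question's callback.
function maskSkippingRegexArg(string $src): ?string
{
    return preg_replace_callback(SKIP_PATTERN, function ($m) {
        return $m[1] . "XXXX" . $m[1];
    }, $src);
}

// The JavaScript line from the question, stored verbatim via nowdoc.
$js = <<<'JS'
x=y.replace(/'/g, '"');
JS;

echo maskSkippingRegexArg($js), PHP_EOL; // prints: x=y.replace(/'/g, 'XXXX');
```

Because the skipped side only looks for `(`, a `/.../flags` literal and a comma, it does not check that the call is actually `.replace`, and a regex literal containing an escaped `/` would likely need a stricter sub-pattern.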

bobble bubble
  • Please use 3v4l.org for your demo. That phpsandbox has a horrible UX on mobile. – mickmackusa Nov 25 '22 at 23:13
  • @mickmackusa Thanks for comment! It's the updated demo from OP, I like 3v4l.org too but my favorite is tio.run because it feels lightweight and supports a whole lot of languages. – bobble bubble Nov 25 '22 at 23:18
  • As an addition ... I've spotted that I need to exclude strings that are followed by a `:`, e.g. `"propNm":xxxx`. Is there a simple adjustment to the regex pattern to take that into account too? – user1432181 Nov 26 '22 at 16:40
  • @user1432181 do you mean to exclude the quotes if the latter quote is followed by a colon [like this demo?](https://regex101.com/r/4YvQpQ/1) (I added a *lookahead* to skip if a colon appears after) – bobble bubble Nov 26 '22 at 17:47
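
One possible way to exclude strings followed by a colon is sketched below. It is not taken from the linked demo: instead of the lookahead mentioned in the comment, it reuses the (*SKIP)(*F) idea with an extra alternative, so the engine does not retry from the key's closing quote. The names `KEY_SKIP_PATTERN` and `maskSkippingKeys` are hypothetical.

```php
<?php
// Assumed extension of the answer's pattern:
//   1) "(" + regex literal + ","      -> skipped, as in the answer
//   2) quoted string followed by ":"  -> skipped (added here as an assumption)
//   3) any other quoted string        -> matched, quote captured in group 1
const KEY_SKIP_PATTERN = <<<'PATTERN'
~\(\s*/.*?/\w*\h*,(*SKIP)(*F)|(?:"[^"\\]*(?:\\.[^"\\]*)*"|'[^'\\]*(?:\\.[^'\\]*)*')\h*:(*SKIP)(*F)|(?|(")[^"\\]*(?:\\.[^"\\]*)*"|(')[^'\\]*(?:\\.[^'\\]*)*')~
PATTERN;

// Hypothetical helper mirroring the question's callback.
function maskSkippingKeys(string $src): ?string
{
    return preg_replace_callback(KEY_SKIP_PATTERN, function ($m) {
        return $m[1] . "XXXX" . $m[1];
    }, $src);
}

$js = <<<'JS'
var o = {"propNm": "value", other: 'x'};
JS;

echo maskSkippingKeys($js), PHP_EOL; // prints: var o = {"propNm": "XXXX", other: 'XXXX'};
```

Note that this sketch would also skip a string that precedes the colon of a ternary expression, so it may need tightening for real input.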