0

I am trying to match curly quotes that are inside shortcodes and replace them with normal quotes but leave the ones outside.

Here is some example content:

“foobar” [tagname id=“1035” linked=“true”] “another” [tagname id=“1”]

It should output the following:

“foobar” [tagname id="1035" linked="true"] “another” [tagname id="1"]

It can be PCRE or JavaScript regex. Any suggestions are appreciated.

DarkBee
voldomazta

2 Answers

2

For replacements on substrings that match some pattern, it is often more efficient and convenient to use a callback if one is available. For example, with PHP and preg_replace_callback:

$res = preg_replace_callback('~\[[^\]\[]*\]~', function($m) {
  return str_replace(['“','”'], '"', $m[0]);
}, $str);

This pattern matches an opening square bracket, followed by any number of characters that are not square brackets, followed by a closing square bracket. The callback function then replaces the curly quotes within each match.

Here is a PHP demo at tio.run. This can easily be translated to JS using replace with a callback function (demo).

let res = str.replace(/\[[^\]\[]*\]/g, m => { return m.replace(/[“”]/g,'"'); });

Without a callback, PCRE/PHP can also use the \G anchor to continue where the previous match ended. This chains matches to an opening square bracket (without checking for a closing one).

$res = preg_replace('~(?:\G(?!^)|\[)[^“”\]\[]*\K[“”]~u', '"', $str);

See this demo at regex101 or another PHP demo at tio.run

(?!^) prevents \G from matching at the start of the string (its default behavior). \K resets the beginning of the reported match.


For completeness, another method is to use a lookahead at each quote to check whether there is a closing ] ahead without any other square brackets in between: [“”](?=[^\]\[]*\])
This does not check for an opening [ and works in all regex flavors that support lookaheads.
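
A minimal sketch of the lookahead variant in JavaScript, run against the sample from the question. The function name `straightenByLookahead` and the variable `lookaheadSample` are only illustrative:

```javascript
// Replace a curly quote only when a closing ] follows with no other
// square bracket in between. This does NOT verify that an opening [ exists.
function straightenByLookahead(str) {
  return str.replace(/[“”](?=[^\]\[]*\])/g, '"');
}

const lookaheadSample =
  '“foobar” [tagname id=“1035” linked=“true”] “another” [tagname id=“1”]';

console.log(straightenByLookahead(lookaheadSample));
// “foobar” [tagname id="1035" linked="true"] “another” [tagname id="1"]
```

Because the opening bracket is never checked, a stray closing bracket in the text would also trigger the replacement: `straightenByLookahead('“a” b]')` returns `"a" b]`.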

bobble bubble
  • I have already tried this solution also but was wondering if there was a one-pass solution. – voldomazta Oct 04 '22 at 21:33
  • 2
    @voldomazta The other idea is good! However to avoid something [like this](https://regex101.com/r/mwmYcW/1), I think you better replace the lookahead `(?=.*\])` with `(?=[^\]\[]*\])` [like that](https://regex101.com/r/5hLZWl/1). If you read my full answer, there are two one-pass solutions however the other answer is more concise targeting the `=` (performance). – bobble bubble Oct 05 '22 at 08:08
  • 1
    @bobblebubble Fixed my solution. Thanks for pointing out. Always feel free to ping my bugs! – nice_dev Oct 06 '22 at 08:16
0

Since this is a little tricky, here is my contribution.

We can:

  • match strings of the form =“some_chars”

  • Since you have the additional constraint of matching only inside square brackets, use a positive lookahead (?=...) to match the above only if it is followed by a closing square bracket with no opening square bracket in between. (Since the string is uniformly formed, there will always be a matching opening square bracket, so we don't check for it.)

Snippet:

<?php

$str = "outside=“bar” “foobar” [tagname id=“1035” linked=“true”] “another” [tagname id=“1”]";

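// The u modifier makes PCRE treat pattern and subject as UTF-8, so “ and ” count as single characters.
echo preg_replace('/(\=“([^”]*)”)(?=[^[]*\])/u', '="${2}"', $str);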

Online Demo
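
Since the question also allows JavaScript, the same idea could be sketched there roughly as follows. This is a translation made for illustration, not part of the original snippet, and the names `straightenAttributeQuotes` and `attributeSample` are hypothetical:

```javascript
// Match =“value” only when a ] can be reached before the next [,
// then rewrite the pair of curly quotes as straight quotes.
function straightenAttributeQuotes(str) {
  return str.replace(/=“([^”]*)”(?=[^[]*\])/g, '="$1"');
}

const attributeSample =
  'outside=“bar” “foobar” [tagname id=“1035” linked=“true”] “another” [tagname id=“1”]';

console.log(straightenAttributeQuotes(attributeSample));
// outside=“bar” “foobar” [tagname id="1035" linked="true"] “another” [tagname id="1"]
```

Note that `outside=“bar”` is left alone because an opening bracket appears before any closing one.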

nice_dev