
I am trying to parse templates where tokens are delimited by @ on both sides.

Example input:

Hello, @name@! Please contact admin@example.com, dear @name@!

Desired output:

Hello, Peter! Please contact admin@example.com, dear Peter!

Naive attempt to find matches and replace:

$content = 'Hello, @name@! Please contact admin@example.com, dear @name@!';

echo preg_replace_callback(
    '/(@.*@)/U',
    function ($token) {
        // the callback receives an array of matches; [0] is the full match
        if ('@name@' == $token[0])  //replace recognized tokens with values
            return 'Peter';

        return $token[0];  //ignore the rest
    },
    $content
);

This regex doesn't cope with stray @ characters: it matches the first @name@, then @example.com, dear @, and fails to match the second @name@ because its opening @ has already been consumed by the previous match. The output is:

Hello, Peter! Please contact admin@example.com, dear @name@!
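
To see why the second @name@ is skipped, it can help to list what the pattern actually consumes. A minimal sketch using preg_match_all on the same input (the variable name $matches is only illustrative):

```php
<?php
$content = 'Hello, @name@! Please contact admin@example.com, dear @name@!';

// Collect every full match of the ungreedy pattern from the attempt above.
preg_match_all('/(@.*@)/U', $content, $matches);

print_r($matches[0]);
// Two matches: '@name@' and '@example.com, dear @'.
// The '@' before the second "name" is already part of the second match,
// so nothing is left to open a third one.
```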

To avoid consuming the @, I tried using lookarounds:

$content = 'Hello, @name@! Please contact admin@example.com, dear @name@!';

echo preg_replace_callback(
    '/(?<=@)(.*)(?=@)/U',
    function ($token) {
        // the callback receives an array of matches; [0] is the full match
        if ('name' == $token[0])  //replace recognized tokens with values
            return 'Peter';

        return $token[0];  //ignore the rest
    },
    $content
);

This correctly matches every substring that's included between a pair of @s, but it doesn't allow me to replace the delimiters themselves. The output is:

Hello, @Peter@! Please contact admin@example.com, dear @Peter@!
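
The lookaround version hands the callback every stretch of text between two consecutive @s, not only the tokens. A small sketch, again with preg_match_all, lists those pieces (the variable name $matches is only illustrative):

```php
<?php
$content = 'Hello, @name@! Please contact admin@example.com, dear @name@!';

// The lookarounds assert the '@' on each side without consuming it,
// so the same '@' can close one piece and open the next.
preg_match_all('/(?<=@)(.*)(?=@)/U', $content, $matches);

print_r($matches[0]);
// Four pieces: 'name', '! Please contact admin', 'example.com, dear ', 'name'.
```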

How can I pass the callback whatever lies between a pair of @s and, when it is a recognized token, replace it together with the surrounding @s?

The tokens will not include newlines or @.

Another example

This is a bit artificial, but it shows what I would like to do, since the current suggestions rely on word boundaries.

For input

Dog@Cat@Donkey@Zebra

I would like the callback to receive Cat, to check whether @Cat@ should be replaced with the token value, and then receive Donkey, to check whether @Donkey@ should be replaced.

Džuris
  • If you know the variable name, wouldn't it be easier to do string replacements of `@name@` instead of looking for any `@...@`? – Devon Bessemer Aug 22 '18 at 00:03
  • Instead of the overly broad `.*`, match `\w+`, and use lookbehinds to assert there are no letters before it. – mario Aug 22 '18 at 00:15
  • @Devon in the actual code I run a database query to find if the token has been defined. – Džuris Aug 22 '18 at 05:28
  • @Nick that seems a lot simpler case as the start and end delimiters are different. In my case the pain is that `@` can either start a token, end it or just be an `@` in the text. And I would like the callback to check stuff between each pair of consecutive `@`s within a single line. – Džuris Aug 22 '18 at 06:04
  • Do you have non-whitespace characters between each pair of `@`s? – revo Aug 22 '18 at 06:22
  • In your second example, would you replace both Cat and Donkey? – Nick Aug 22 '18 at 06:25
  • Can you have your tokens compared to values inside `@`, without them? – Wiktor Stribiżew Aug 22 '18 at 07:42
  • @Nick assume that at most one of those will match an actual token. Otherwise it's a user error and the behaviour is undefined. A valid result could be DogPeterDonkey@Zebra or Dog@CatPeterZebra, or the input could remain unchanged, depending on whether either token is defined – Džuris Aug 22 '18 at 07:59
  • @Džuris Please check https://ideone.com/AYTjmk and if it is what you need, I will post. – Wiktor Stribiżew Aug 22 '18 at 08:05
  • Well, it might be also https://ideone.com/Lm19Gc, if you really want to match any chars between `@`. – Wiktor Stribiżew Aug 22 '18 at 08:55

2 Answers


I suggest using: /@\b([^@]+)\b@/

Capture group 0 holds:  @name@
Capture group 1 holds:  name
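
As a rough illustration of how this pattern behaves with preg_replace_callback, here is a sketch. The helper name replace_with_boundaries and the $tokens array are assumptions made for the example, not part of the question's code:

```php
<?php
// Hypothetical helper: replaces @token@ using the word-boundary pattern.
function replace_with_boundaries(array $tokens, string $content): string {
    return preg_replace_callback(
        '/@\b([^@]+)\b@/',
        function (array $m) use ($tokens) {
            // $m[0] is the full match (e.g. '@name@'), $m[1] the bare token.
            return array_key_exists($m[1], $tokens) ? $tokens[$m[1]] : $m[0];
        },
        $content
    );
}

echo replace_with_boundaries(
    ['name' => 'Peter'],
    'Hello, @name@! Please contact admin@example.com, dear @name@!'
), "\n";
// Hello, Peter! Please contact admin@example.com, dear Peter!

echo replace_with_boundaries(['Donkey' => 'Goldfish'], 'Dog@Cat@Donkey@Zebra'), "\n";
// Dog@Cat@Donkey@Zebra (unchanged: '@Cat@' is consumed first,
// so 'Donkey' is never offered to the callback)
```

The first input works because the space before the second @name@ prevents a word boundary there. The second input shows the limit: a match still consumes its closing @ even when the callback leaves the text unchanged.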
Andie2302
  • This is decent, but I would like to avoid mandatory word boundaries if possible. The callback should just receive anything that's between any consecutive pair of `@`s. – Džuris Aug 22 '18 at 05:56

Because the delimiters can overlap (one @ may close one candidate and open the next), I'm not sure this can be done with regexes. However, here is a recursive function that does the job. It doesn't care what the token looks like (i.e. it doesn't have to be alphanumeric), as long as it occurs between @ symbols:

function replace_tokens($tokens, $string) {
    $parts = explode('@', $string, 3);
    if (count($parts) < 3) {
        // none or only one '@' so can't be any tokens to replace
        return implode('@', $parts);
    }
    elseif (array_key_exists($parts[1], $tokens)) {
        // matching token, replace
        return $parts[0] . $tokens[$parts[1]] . replace_tokens($tokens, $parts[2]);
    }
    else {
        // not a matching token, try further along...
        // need to replace the `@` symbols that were removed by explode
        return $parts[0] . '@' . $parts[1] . replace_tokens($tokens, '@' . $parts[2]);
    }
}

$tokens = array('name' => 'John', 'Cat' => 'Goldfish', 'xy zw' => '45');
echo replace_tokens($tokens, "Hello, @name@! Please contact admin@example.com, dear @name@!") . "\n";
echo replace_tokens($tokens, "Dog@Cat@Donkey@Zebra") . "\n";
echo replace_tokens($tokens, "auhdg@xy zw@axy@Cat@") . "\n";
$tokens = array('Donkey' => 'Goldfish');
echo replace_tokens($tokens, "Dog@Cat@Donkey@Zebra") . "\n";

Output:

Hello, John! Please contact admin@example.com, dear John!
DogGoldfishDonkey@Zebra
auhdg45axyGoldfish
Dog@CatGoldfishZebra
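
For long inputs, the same idea can be written without recursion. The following is only a sketch under the same assumptions (tokens contain no @); the name replace_tokens_iter is made up for this example:

```php
<?php
// Iterative sketch: split on every '@' once, then walk the pieces.
function replace_tokens_iter(array $tokens, string $string): string {
    $parts = explode('@', $string);
    $n = count($parts);
    $out = $parts[0];   // text before the first '@'
    $i = 1;
    while ($i < $n) {
        // $parts[$i] is a candidate only if another '@' follows it,
        // i.e. it is not the last piece.
        if ($i < $n - 1 && array_key_exists($parts[$i], $tokens)) {
            // Token found: both delimiters are consumed, so the next piece
            // is plain text and is appended as-is.
            $out .= $tokens[$parts[$i]] . $parts[$i + 1];
            $i += 2;
        } else {
            // Not a token: restore the '@' that explode() removed and let
            // the next piece be tried as a candidate.
            $out .= '@' . $parts[$i];
            $i++;
        }
    }
    return $out;
}

echo replace_tokens_iter(['Donkey' => 'Goldfish'], 'Dog@Cat@Donkey@Zebra') . "\n";
// Dog@CatGoldfishZebra
```

Replacing the array lookup with a database check, as mentioned in the comments on the question, would only change the condition inside the loop.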
Nick