
I am using tags to replace text before displaying output in a browser, similar to WordPress shortcodes.

Example string: `Hi, this is a block of text {{block:welcome}} and this is a system variable {{variable:system_version}}`

I have functions that replace these blocks. A `foreach` or `while` loop would be the simplest way to deal with it, but replacing one `{{...}}` may introduce another, so I opted for recursion until no more tags are found. Usually one level of recursion is enough, but I have had one scenario that needed two. Calling the function three times would probably work, but it sounds "wrong".
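For illustration, the "repeat until nothing is left" idea can also be written as a loop with a hard cap instead of open-ended recursion. This is only a sketch: `resolveTag()`, `expandTags()` and the sample store are hypothetical names, and the cap of 10 passes is an arbitrary assumption.

```php
<?php

// Hypothetical resolver: stands in for whatever produces the replacement content.
function resolveTag(string $name): string
{
    $store = [
        'block:welcome'           => 'Welcome to version {{variable:system_version}}',
        'variable:system_version' => '2019.1',
    ];

    // Unknown tags resolve to an empty string in this sketch.
    return $store[$name] ?? '';
}

// Replace tags pass by pass until the text stops changing or the cap is reached.
function expandTags(string $text, int $maxPasses = 10): string
{
    for ($pass = 0; $pass < $maxPasses; $pass++) {
        $next = preg_replace_callback(
            '/\{\{([^{}]+)\}\}/',
            function (array $m): string {
                return resolveTag($m[1]);
            },
            $text
        );

        // Stop on a regex error or when a pass changed nothing.
        if ($next === null || $next === $text) {
            break;
        }
        $text = $next;
    }

    return $text;
}

echo expandTags('Hi {{block:welcome}}, running {{variable:system_version}}.');
// Hi Welcome to version 2019.1, running 2019.1.
```

The cap means a replacement that keeps reintroducing tags ends after a fixed number of passes rather than exhausting memory.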

This is where the problem occurs: I do NOT want to replace them when they appear in:

1) A page whose URL contains a particular string (for example, an admin page)
2) Any form element such as `<input>` or `<textarea>`.

I need help excluding case 2 above by means of a regex.

My regex currently looks like this: `^\{\{((?!keep).)*$` (I realize it may still be wrong or need modification; it does not quite work yet).

If the item contains "keep", e.g. `{{block:welcome:keep}}`, it should not be replaced. But when I leave it in place, the recursion never stops, because I keep finding items to replace, and I either run out of memory or get maximum nesting level errors.
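One way to picture the termination problem: if the pattern that decides "is there anything left to replace?" also matches the kept tags, the answer is always yes. Below is a sketch of a pattern that skips tags ending in `:keep`, using a negative lookahead. The names `TAG_PATTERN` and `hasReplaceableTags()` are made up for this example, and it assumes `:keep` is always the last segment of the tag.

```php
<?php

// Matches {{name}} unless the name ends in ":keep" (assumed marker position).
const TAG_PATTERN = '/\{\{(?![^{}]*:keep\}\})([^{}]+)\}\}/';

// True only if the text still contains a tag that should be replaced.
function hasReplaceableTags(string $text): bool
{
    return preg_match(TAG_PATTERN, $text) === 1;
}

var_dump(hasReplaceableTags('{{block:welcome}}'));      // bool(true)
var_dump(hasReplaceableTags('{{block:welcome:keep}}')); // bool(false)
```

Using the same pattern in the replacement call and in the stop condition means kept tags are never seen as pending work, so the recursion can end.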

The reason I want to do this is that I do not want the content replaced on an ADMIN page, or when you are editing form content.

Is anyone willing to give it a crack? I am using PHP, if that matters.

Thanks!

EDIT 1

Since @Pablo's answer was given to me in chat, I decided to edit my question to reflect why his answer was marked as the correct one.

My regex now looks like this: `/(?:<(?:textarea|select)[\s\S]*?>[\s\S]*?)?({{variable:(.*?)}})[\s\S]*?(?:<\/(?:textarea|select)>)?|(?:<(?:input)[\s\S]*?)?{{variable:(.*?)}}(?:[\s\S]*?>)?/im`

I then check whether the match contains an `<input>`, `<select>` or `<textarea>`. If it does, I temporarily replace the `{{` with something else, do my replacement, and when done change the "something else" back to `{{`, as Pablo suggested. My regex is thanks to the answer on this question: Text replacement: PHP/regex.
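The mask-and-restore idea described above could look roughly like the following. This is a simplified sketch, not the exact code from the chat: `MASK`, `maskFormTags()`, `unmaskFormTags()` and `renderOutsideForms()` are hypothetical names, the sentinel is assumed never to occur in real content, and the pattern assumes attribute values do not contain a literal `>`.

```php
<?php

// Hypothetical sentinel; assumed never to appear in real content.
const MASK = "\x01\x01";

// Hide "{{" inside <textarea>/<select> elements and <input> tags.
function maskFormTags(string $html): string
{
    $pattern = '~<(textarea|select)\b[^>]*>.*?</\1\s*>|<input\b[^>]*>~is';

    $masked = preg_replace_callback($pattern, function (array $m): string {
        return str_replace('{{', MASK, $m[0]);
    }, $html);

    // Fall back to the input if the regex engine reports an error.
    return $masked ?? $html;
}

// Put the original "{{" back once the replacements are done.
function unmaskFormTags(string $html): string
{
    return str_replace(MASK, '{{', $html);
}

// $replace is whatever routine expands the tags (a hypothetical callable).
function renderOutsideForms(string $html, callable $replace): string
{
    return unmaskFormTags($replace(maskFormTags($html)));
}

$html = 'Version {{variable:system_version}} '
      . '<input value="{{variable:system_version}}"> '
      . '<textarea>{{variable:system_version}}</textarea>';

echo renderOutsideForms($html, function (string $s): string {
    return str_replace('{{variable:system_version}}', '2019.1', $s);
});
// Version 2019.1 <input value="{{variable:system_version}}"> <textarea>{{variable:system_version}}</textarea>
```

Because the form elements are masked before any replacement runs, the replacement routine itself does not need to know about HTML at all.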

If the above edit does not belong, feel free to remove.

Kobus Myburgh
  • You can test your regex on sites like [RegExr](https://regexr.com). It seems to work, but you should probably leave the `^` and the `$`, if you're not checking at exactly this: `{{block:welcome:`, since `{{block:welcome:keep}}` would not match, since it has some non-matching string (`keep}}`) at the end. – Minding Jul 21 '19 at 21:08
  • If my comment does not help, you should probably add your code to the answer, to help us understand how you use the regex. – Minding Jul 21 '19 at 21:12
  • Your question is not doing a great job of providing an [mcve]. Please improve this question by providing a complete html input string, some replacement data, and your exact desired output. Because this question needs to parse html, this page needs to receive an answer that uses an html parser. It is unclear from your post if recursion is _actually_ necessary. – mickmackusa Jul 28 '19 at 22:19

1 Answer

Instead of looking for the perfect regex, I suggest looking into `preg_replace_callback()`. It should allow you to use a simpler regex while having more control over the search and replace algorithm for your templating engine. Consider the following example:

  1. `resolvePlaceholder()` generates the replacement content.
  2. `interpolate()` parses a template string. It supports nested parsing up to 5 levels deep.
  3. Tags starting with `!` are resolved once and are not parsed recursively.

```php
<?php

// Stand-in for the part of the system that generates replacement content.
function resolvePlaceholder($name)
{
    $store = [
        'user:first'              => 'John',
        'user:last'               => 'Doe',
        'user:full_name'          => '{{user:first}} {{user:last}}',
        'block:welcome'           => 'Welcome {{user:full_name}}',
        'variable:system_version' => '2019.1',
        'nest-test'               => '{{level1}}',
        'level1'                  => '{{level2}}',
        'level2'                  => '{{level3}}',
        'level3'                  => '{{level4}}',
        'level4'                  => '{{level5}}',
        'level5'                  => 'Nesting Limit Test Failed',
        'user-template'           => 'This is a user template with {{weird-placeholder}} that will not be replaced in edit mode {{user:first}}',
    ];

    // Unknown placeholders resolve to an empty string.
    return $store[$name] ?? '';
}

function interpolate($text, $level = 1)
{
    // Limit interpolation recursion
    if ($level > 5) {
        return $text;
    }

    // Replace placeholders
    return preg_replace_callback('/{{([^}]*)}}/', function ($match) use ($level) {
        list($tag, $name) = $match;
        // Do not replace tags containing :keep
        // (compare with false, because strpos() can return 0 for a match at the start)
        if (strpos($name, ':keep') !== false) {
            // Remove :keep?
            return $tag;
        }

        // Tags starting with ! are resolved once, without recursive parsing
        if (strpos($name, '!') === 0) {
            return resolvePlaceholder(ltrim($name, '!'));
        }

        return interpolate(resolvePlaceholder($name), $level + 1);
    }, $text);
}

$sample = 'Hi, this is a block of text {{block:welcome}} and this is a system variable {{variable:system_version}}. ' .
    'This is a placeholder {{variable:web_url:keep}}. Nest value test {{nest-test}}. User Template: {{!user-template}}';

echo interpolate($sample);
// Hi, this is a block of text Welcome John Doe and this is a system variable 2019.1. This is a placeholder {{variable:web_url:keep}}. Nest value test {{level5}}. User Template: This is a user template with {{weird-placeholder}} that will not be replaced in edit mode {{user:first}}
```
Pablo
  • Hi Pablo, thank you for your answer. Your answer does give me something to think about, but lots of my content will be user provided. Furthermore, I still need to figure out how to not replace text between input/select/textarea tags. – Kobus Myburgh Jul 21 '19 at 22:11
  • @KobusMyburgh `resolvePlaceholder()` is supposed to be a placeholder for the part of your system responsible for generating the content for the placeholders. The expectation is not to have all possible values in a simple function like in the example provided :). Even the `interpolate()` function won't be as simple as in the example. You will probably want to have a lot more logic. What would be the use case of having a template with variables that shouldn't be replaced? I suggest creating a specific syntax for these cases, like using `:keep`, or a prefix like `{{input:text-input-placholder}}` – Pablo Jul 21 '19 at 22:33
  • You will have a difficult time trying to parse HTML markup with RegEx. – Pablo Jul 21 '19 at 22:38
  • The condition will be when you are on an edit form where the user is providing the content, then saves. It must be saved in the database as entered by the user. – Kobus Myburgh Jul 21 '19 at 22:46
  • @KobusMyburgh As suggested, you just need to mark the tags with a flag that tells the parser algorithm to interpolate the string only once. You have full control over the template engine syntax. Ex: '{{!user-input}}'. The flag here would be `!`. If `!` is present, then only return `resolvePlaceholder($name)` without feeding it to the interpolation system. Let me know if it doesn't make sense so I can update the answer example. – Pablo Jul 21 '19 at 23:04
  • Sorry, Pablo, it does not, because the user will enter it without the `!`. The algorithm needs to check whether it is inside a form element. I have made a different question for that purpose, perhaps you can check that one out? https://stackoverflow.com/questions/57137532/text-replacement-php-regex. – Kobus Myburgh Jul 21 '19 at 23:08
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/196789/discussion-between-pablo-and-kobus-myburgh). – Pablo Jul 21 '19 at 23:10