
I have been able to match all text within a string that looks like the below:

[!--$element--]

using the below regex:

\[!--(.*?)]

However, now I want to match everything EXCEPT the above string, but can't seem to get it to work. I've attempted to use the not ^ expression and the look-ahead ?! expression, but can't seem to get it right.

For example,

I want

This is an [!--$element--]

to turn into:

[!--$element--]

I am using PHP's preg_replace() to remove all text that isn't an [!--$element--].

Any idea how I can do so?

zx81
neuquen
  • Do you mean you want to test, i.e. `[!--$element--]` should result in `false`, or not match, i.e. `one[!--$element--]two` should result in `["one", "two"]`? – Ry- Jul 29 '14 at 22:17
  • Which language are you using? This would be better accomplished with a positive match. – Ry- Jul 29 '14 at 22:19

4 Answers


Capture Group, (*SKIP)(*F) or Split

To match this, we have several options.

Option 1: With capture groups

This is a longer way to do it, but it will work in all engines.

\[!--[^-]*--\]|((?:(?!\[!--[^-]*--\]).)+)

The strings you want are in Group 1. In the Regex Demo, look at the captures in the right panel.
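As a rough sketch of how Option 1 might be used from PHP: the sample string and the variable names below are assumptions for illustration, not part of the original answer. Group 1 comes back as an empty string whenever the left branch matched an element, so those entries are filtered out.

```php
<?php
// Hypothetical sample input, chosen only to show the behaviour.
$text = 'one[!--$a--]two[!--$b--]three';

// Left branch consumes whole elements; right branch captures all other text in Group 1.
$pattern = '~\[!--[^-]*--\]|((?:(?!\[!--[^-]*--\]).)+)~';

preg_match_all($pattern, $text, $matches);

// Drop the empty Group 1 entries produced by matches of the left branch.
$outside = array_values(array_filter($matches[1], function ($s) {
    return $s !== '';
}));

print_r($outside); // one, two, three
```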

Option 2: (*SKIP)(*F)

This will only work in Perl and PCRE (and therefore PHP). It matches the strings directly.

\[!--[^-]*--\](*SKIP)(*F)|(?:(?!\[!--[^-]*--\]).)+

In the Regex Demo, see the matched strings.
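A minimal sketch of Option 2 with preg_match_all, assuming the same kind of sample input (the string and variable names are illustrative only). Here no filtering is needed, because the elements never appear among the matches.

```php
<?php
// Hypothetical sample input, chosen only to show the behaviour.
$text = 'one[!--$a--]two[!--$b--]three';

// Elements are matched, then rejected with (*SKIP)(*F); everything else is matched directly.
$pattern = '~\[!--[^-]*--\](*SKIP)(*F)|(?:(?!\[!--[^-]*--\]).)+~';

preg_match_all($pattern, $text, $matches);

print_r($matches[0]); // one, two, three
```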

Option 3: Split

Match All and Split are Two Sides of the Same Coin, so we can use preg_split instead of preg_match_all. By splitting the string, we remove the pieces we want to exclude.

$element_regex = '~\[!--[^-]*--\]~';
$theText = preg_split($element_regex, $yourstring);
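One detail worth sketching: when the string starts or ends with an element, preg_split returns empty pieces unless the PREG_SPLIT_NO_EMPTY flag is passed. The sample strings below are assumptions for illustration.

```php
<?php
$element_regex = '~\[!--[^-]*--\]~';

// Hypothetical input that begins with an element.
$withEmpty = preg_split($element_regex, '[!--$a--]one');
print_r($withEmpty); // an empty string, then "one"

// Limit -1 means "no limit"; the flag drops empty pieces.
$yourstring = 'one[!--$a--]two[!--$b--]three';
$theText = preg_split($element_regex, $yourstring, -1, PREG_SPLIT_NO_EMPTY);
print_r($theText); // one, two, three
```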

Explanation

This problem is a classic case of the technique explained in this question to "regex-match a pattern, excluding..."

Option 1. The left side of the alternation | matches complete [!--$element--] tags. We ignore these matches. The right side matches and captures the other text in Group 1, and we know it is the right text because it was not matched by the expression on the left.

Option 2. The left side of the alternation | matches complete [!--$element--] tags, then deliberately fails, after which the engine skips to the position in the string just past the tag. The right side matches the text you want, and we know it is the right text because it was not matched by the expression on the left.
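Since the question asks about preg_replace, here is a hedged sketch of how the (*SKIP)(*F) pattern could be used to delete everything that is not an element. The sample sentence and variable names are assumptions for illustration.

```php
<?php
// Single quotes keep $element literal instead of interpolating a variable.
$input = 'This is an [!--$element--]. Here is some other text';

// Elements are skipped; every other run of text is matched and replaced with nothing.
$pattern = '~\[!--[^-]*--\](*SKIP)(*F)|(?:(?!\[!--[^-]*--\]).)+~';

$result = preg_replace($pattern, '', $input);

echo $result, "\n"; // [!--$element--]
```

Note that . does not match newlines by default, so multi-line input would likely need the s modifier added after the closing delimiter.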

zx81

Instead of replacing the text you don’t want, just grab the text you do want:

preg_match_all('/\[!--(.*?)]/', $input, $matches);
$result = implode('', $matches[0]);

Demo

Ry-
  • I gave zx81 the check because he/she technically answered the question correctly. However, if I could give two checks I would because I will be using this instead. Thanks! – neuquen Jul 29 '14 at 22:38

Note that I didn't use \[!--(.*?)] because it also matches strings that differ from [!--$element--], such as [!--$element].

<?php
$text = 'This is an [!--$element--]. Here is some other text';
// Lazy .*? stops at the first closing --], so several elements in one string match separately
preg_match_all('#(\[!--.*?--\])#', $text, $out);
print_r($out);
?>

And the output is

Array
(
    [0] => Array
        (
            [0] => [!--$element--]
        )

    [1] => Array
        (
            [0] => [!--$element--]
        )

)

Here is the online Demo

Jan Moritz

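Use preg_replace with this pattern (Demo), which keeps each element and drops the text around it. The pattern and replacement are single-quoted so that PHP does not treat the backreference as an escape sequence:

preg_replace('/(.*?)(\[!--.*?--\]|$)/s', '$2', $input_lines);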
Zombiesplat