I have a string and I would like to know the first position of a pattern, but a match should count only if it is not enclosed in brackets.

Example String: "This is a (first) test with the first hit"

I want to know the position of the second "first", which is 32. To match it, the (first) must be ignored, because it's enclosed in brackets.

Unfortunately, I have to ignore not only round brackets ( ), but also square brackets [ ] and curly braces { }.

I tried this:

preg_match(
  '/^(.*?)(first)/',
  "This is a (first) test with the first hit",
  $matches
);
// The length of the text captured before the match is the position of the match.
$result = strlen( $matches[1] );

This runs, but the result is the position of the first occurrence (11), which is the one inside the brackets.

So I need to change the .*?.

I tried to replace it with .(?:\(.*?\))*? in the hope that all characters inside the brackets would be ignored, but this does not skip the bracketed part.

And I can't use a negative lookbehind and lookahead such as '/(?<!\()first(?!\))/', since I have three different bracket types, and the opening and closing bracket have to be of the same type.

1 Answer

You can match all 3 formats that you don't want using a group with an alternation, and use (*SKIP)(*FAIL) to discard those matches. Then match first between word boundaries \b

(?:\(first\)|\[first]|{first})(*SKIP)(*FAIL)|\bfirst\b

Example code

$strings = [
    "This is a (first) test with the first hit",
    "This is a (first] test with the first hit"
];

foreach ($strings as $str) {
    preg_match(
        '/(?:\(first\)|\[first]|{first})(*SKIP)(*FAIL)|\bfirst\b/',
        $str,
        $matches,
        PREG_OFFSET_CAPTURE);
    print_r($matches);
}

Output

Array
(
    [0] => Array
        (
            [0] => first
            [1] => 32
        )

)
Array
(
    [0] => Array
        (
            [0] => first
            [1] => 11
        )

)
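
The pattern above only skips a bracketed group that contains exactly the word first. If the word can appear anywhere inside a bracketed group, such as (my first), the same (*SKIP)(*FAIL) idea can be applied to whole bracketed groups. The following is a minimal sketch, not part of the original answer: the function name findOutsideBrackets is hypothetical, and it assumes brackets are not nested and that every opening bracket is closed by a bracket of the same type.

```php
<?php
// Hypothetical helper (not from the answer above).
// Returns the byte offset of the first occurrence of $word that is not
// inside (...), [...] or {...}, or null if there is no such occurrence.
// Assumption: brackets are not nested.
function findOutsideBrackets(string $str, string $word): ?int
{
    // Escape the word so it is matched literally inside the pattern.
    $w = preg_quote($word, '/');

    // Left side: consume a complete bracketed group, then discard it
    // with (*SKIP)(*FAIL). Right side: the word between word boundaries.
    $pattern = '/(?:\([^()]*\)|\[[^\[\]]*\]|\{[^{}]*\})(*SKIP)(*FAIL)|\b' . $w . '\b/';

    if (preg_match($pattern, $str, $m, PREG_OFFSET_CAPTURE) === 1) {
        return $m[0][1];
    }
    return null;
}

var_dump(findOutsideBrackets("This is a (first) test with the first hit", "first"));
var_dump(findOutsideBrackets("(my first) first", "first"));
var_dump(findOutsideBrackets("[first] {first}", "first"));
```

As in the output above, a group with mismatched brackets such as (first] is not treated as enclosed, so a word inside it still matches.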

The fourth bird

This answer does not work with UTF-8 strings. A solution for this issue is available [here](https://stackoverflow.com/questions/1725227/preg-match-and-utf-8-in-php#1725329). – May 06 '20 at 08:05
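
The UTF-8 issue arises because PREG_OFFSET_CAPTURE reports offsets in bytes, not characters, so the position is too large whenever multibyte characters precede the match. One possible workaround, sketched here, is to count the characters in the part of the string before the byte offset. The helper name byteToCharOffset is hypothetical, and the sketch assumes the string is valid UTF-8 and that the mbstring extension is available.

```php
<?php
// Hypothetical helper: converts a byte offset (as returned with
// PREG_OFFSET_CAPTURE) into a character offset.
// Assumptions: $str is valid UTF-8 and mbstring is installed.
function byteToCharOffset(string $str, int $byteOffset): int
{
    // Cut the string at the byte offset and count its characters.
    return mb_strlen(substr($str, 0, $byteOffset), 'UTF-8');
}

// "\u{00DC}" is U+00DC (capital U with diaeresis), two bytes in UTF-8.
$str = "\u{00DC}ber (first) the first hit";

preg_match(
    '/(?:\(first\)|\[first]|{first})(*SKIP)(*FAIL)|\bfirst\b/u',
    $str,
    $m,
    PREG_OFFSET_CAPTURE
);

$bytePos = $m[0][1];                        // offset in bytes
$charPos = byteToCharOffset($str, $bytePos); // offset in characters

var_dump($bytePos, $charPos);
```

Here the byte offset is 18 and the character offset is 17, because the single two-byte character before the match shifts the byte offset by one.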