Regex

preg_match_all('@(\b' . preg_quote($needle,'@') . '\b)@is', $haystack, $matches);

Haystack

(Job: NAS-Inkrementell) Operation succeeded.

Needle

Operation succeeded --> works

Operation succeeded. --> does not work

I did some tests:

Alternative Haystack

(Job: NAS-Inkrementell) Op.eration succeeded.

Alternative Needle

Op.eration succeeded --> works

So the needle is escaped correctly, as dump(preg_quote($needle,'@')); shows: Operation succeeded\.

What would be the correct way to include a trailing dot?

For quick reference: http://www.phpliveregex.com/p/hzl

Thanks

EDIT

I also need to distinguish between My-NAS and My-NAS-2 in the haystack when the needle is My-NAS, so I need the trailing \b as well.

EDIT2

ideone.com/Suzz2E I want it to be 1 1 0 1 instead of 1 1 1 1, so MyNAS should not be found if there is only MyNAS-2 in the haystack.

PrimuS
  • Because of the word boundary. Either remove the trailing `\b`, or replace it with `(?!\S)` or `(?!\w)` (that depends on what you consider to be a "word") – Wiktor Stribiżew Oct 19 '16 at 08:16
  • See my edit, in that case `(?!\w)` does not work either... – PrimuS Oct 19 '16 at 08:25
  • *I need the trailing `\b` as well* - no if you want to match a word ending in `.`. Use `preg_match_all('@(?<!\w)' . preg_quote($needle,'@') . '(?!\w)@is', $haystack, $matches);` to get rid of the word boundaries that are context dependent. – Wiktor Stribiżew Oct 19 '16 at 08:26
  • I can't test it right now, but I want that if needle is `MyNAS` and the haystack contains `MyNAS2` that it does NOT match? It works for the trailing `.` though. – PrimuS Oct 19 '16 at 08:33
  • See https://ideone.com/gj6v0m, needles are found alright. `echo preg_match_all('@(?<!\w)' . preg_quote('MyNAS','@') . '(?!\w)@is', "MyNAS2", $matches);` returns `0` (no match). – Wiktor Stribiżew Oct 19 '16 at 08:33
  • Well, it does not work for the second case, both are found: https://ideone.com/Suzz2E – PrimuS Oct 19 '16 at 08:36
  • `MyNAS-2` is found in `(Job: NAS-Inkrementell) MyNAS-2.` because `M` is not preceded by a word char (=letter/digit/underscore), and `2` is not followed by a word char. **What do you consider a word**? Anything between whitespaces/start/end of string? Then use `(?<!\S)` and `(?!\S)` instead of `(?<!\w)` / `(?!\w)` – Wiktor Stribiżew Oct 19 '16 at 08:53
  • Last try before I give up: https://ideone.com/Suzz2E I want it to be `1 1 0 1` instead of `1 1 1 1`, so `MyNAS` should not be found if there is only `MyNAS-2` in the haystack. Thank you for your effort so far! – PrimuS Oct 19 '16 at 09:02
  • So, you need to further customize your word boundaries. Add `-` to the `\w`: `(?<![\w-])` and `(?![\w-])`. See https://ideone.com/KEhnpZ – Wiktor Stribiżew Oct 19 '16 at 09:09
  • Perfect... I don't understand it fully right now, but it works. The idea behind is that a user can copy anything he wants to search for and my script makes a sufficient regex of that. Do you maybe want to create a "full" answer so that I can accept and credit? – PrimuS Oct 19 '16 at 09:14

1 Answer

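The `.` is not a word character, so the `\.\b` pattern only matches when a word char directly follows the `.`. In your haystack the dot is followed by the end of the string, so there is no word boundary there. You need to implement custom word boundaries.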
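To make the boundary behaviour concrete, here is a small sketch using the haystack from the question. The variable names and the third test string (`Operation succeeded.x`) are made up for illustration only.

```php
<?php
// Haystack from the question: the trailing dot is followed by the end of the string.
$haystack = "(Job: NAS-Inkrementell) Operation succeeded.";

// \b after "d": "d" is a word char and "." is not, so a boundary exists here.
$withoutDot = preg_match('@\bOperation succeeded\b@i', $haystack);

// \b after "\.": "." is a non-word char followed by end of string, so no boundary.
$withDot = preg_match('@\bOperation succeeded\.\b@i', $haystack);

// Same pattern, but a word char ("x") now follows the dot, so \b is satisfied.
$dotThenWord = preg_match('@\bOperation succeeded\.\b@i', "Operation succeeded.x");

echo "$withoutDot $withDot $dotThenWord\n"; // prints "1 0 1"
```

So `\b` is not "end of the needle"; it only asserts that a word char and a non-word char meet at that position.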

Since you need to match strings that are not preceded with a word char or a hyphen and that are not followed with a word char or hyphen, you may use

preg_match_all('@(?<![\w-])' . preg_quote($needle,'@') . '(?![\w-])@is', $haystack, $matches);

See the regex demo.

See the PHP demo:

$haystack1 = "(Job: NAS-Inkrementell) Operation succeeded. MyNAS-2 reports duty.";

// Followed by ".", which is neither a word char nor a hyphen: prints 1
$needle1 = "Operation succeeded";
echo preg_match_all('@(?<![\w-])' . preg_quote($needle1,'@') . '(?![\w-])@is', $haystack1, $matches1) ."\n";

// The trailing dot is followed by a space: prints 1
$needle2 = "Operation succeeded.";
echo preg_match_all('@(?<![\w-])' . preg_quote($needle2,'@') . '(?![\w-])@is', $haystack1, $matches2) ."\n";

// Only "MyNAS-2" is in the haystack; "MyNAS" is followed by "-": prints 0
$needle3 = "MyNAS";
echo preg_match_all('@(?<![\w-])' . preg_quote($needle3,'@') . '(?![\w-])@is', $haystack1, $matches3) ."\n";

// The full name is surrounded by spaces: prints 1
$needle4 = "MyNAS-2";
echo preg_match_all('@(?<![\w-])' . preg_quote($needle4,'@') . '(?![\w-])@is', $haystack1, $matches4) ."\n";
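Since the goal is to build the regex from arbitrary user input, the pattern could be wrapped in a small function. The following is only a sketch: `countWholeMatches` and its `$extraWordChars` parameter are hypothetical names, not part of PHP, and it assumes PHP 7 or later for the type declarations.

```php
<?php
// Hypothetical helper: counts occurrences of $needle in $haystack that are not
// directly preceded or followed by a word char or any char in $extraWordChars.
function countWholeMatches(string $needle, string $haystack, string $extraWordChars = '-'): int
{
    // Escape the extra chars so they are safe inside the character class.
    $class = '[\w' . preg_quote($extraWordChars, '@') . ']';
    $pattern = '@(?<!' . $class . ')' . preg_quote($needle, '@') . '(?!' . $class . ')@is';

    // preg_match_all() returns false on error; the cast turns that into 0.
    return (int) preg_match_all($pattern, $haystack);
}

$haystack = "(Job: NAS-Inkrementell) Operation succeeded. MyNAS-2 reports duty.";
$counts = [];
foreach (["Operation succeeded", "Operation succeeded.", "MyNAS", "MyNAS-2"] as $needle) {
    $counts[] = countWholeMatches($needle, $haystack);
}
echo implode(' ', $counts), "\n"; // prints "1 1 0 1"
```

Keeping the "extra word chars" as a parameter makes it easy to adjust what counts as part of a word (for example adding `.` or `_`) without touching the pattern itself.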
Wiktor Stribiżew
  • [You can read a bit more on word boundaries here](http://stackoverflow.com/a/1324784/3832970). And [here](http://www.regular-expressions.info/wordboundaries.html). – Wiktor Stribiżew Oct 19 '16 at 09:19
  • Would you mind looking at this again: https://ideone.com/Suzz2E I don't know why the 5th and the 6th case get ignored? – PrimuS Oct 19 '16 at 13:42
  • 5) `SERVER-2012` is followed with `-`, thus, ignored. 6) the same reason. The negative lookahead `(?![\w-])` fails the match if there is a `-` after the search word. – Wiktor Stribiżew Oct 19 '16 at 13:45
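The effect described in the last comment can be reproduced with a short sketch. The haystack below is invented for illustration (the actual strings behind the ideone link are not shown here), and `$pattern` is just a local helper closure.

```php
<?php
// Hypothetical haystack: the real one from the ideone link is not reproduced here.
$haystack = "Job SERVER-2012-R2 finished.";

// Builds the custom-boundary pattern from the answer for a given needle.
$pattern = function (string $needle): string {
    return '@(?<![\w-])' . preg_quote($needle, '@') . '(?![\w-])@is';
};

// "SERVER-2012" is directly followed by "-", so the lookahead rejects it.
$partial = preg_match_all($pattern('SERVER-2012'), $haystack);

// The full name is followed by a space, so it matches once.
$full = preg_match_all($pattern('SERVER-2012-R2'), $haystack);

echo "$partial $full\n"; // prints "0 1"
```

If such partial names should count as matches, the hyphen would have to be dropped from the lookahead again, which brings back the `MyNAS` / `MyNAS-2` ambiguity from the question.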