
I have this:

if (preg_match("/\b".preg_quote($kw_to_search_for)."\b/i", $search_strings[$i])) {
    // found
}

This works so far, but if I have special characters in the variable $kw_to_search_for, then this fails.

For instance:
$kw_to_search_for = 'hello' WORKS.
$kw_to_search_for = 'HallÄ' FAILS.

How can I solve this, and what is causing it?

Thanks

2 Answers


Try using the u modifier to enable UTF-8 support:

u (PCRE8)
This modifier turns on additional functionality of PCRE that is incompatible with Perl. Pattern strings are treated as UTF-8. This modifier is available from PHP 4.1.0 or greater on Unix and from PHP 4.2.3 on win32. UTF-8 validity of the pattern is checked since PHP 4.3.5.

http://ch.php.net/manual/en/reference.pcre.pattern.modifiers.php
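As for what is causing it, a likely explanation (assuming both strings are UTF-8 encoded) is that without u the pattern is matched byte by byte. The two bytes that make up Ä are then not word characters, so \b finds no boundary between Ä and a following space or the end of the string. A minimal sketch comparing both modes; the sample subject string is made up for illustration:

```php
<?php
// Assumption: both strings are UTF-8. "\u{C4}" is Ä, written as an escape
// (PHP 7+) so the example does not depend on how this file is saved.
$kw_to_search_for = "Hall\u{C4}";
$subject = "say Hall\u{C4} there";   // hypothetical sample text

$quoted = preg_quote($kw_to_search_for, '/');

// Byte mode: the bytes of Ä are not word characters, so there is no
// word boundary between Ä and the space that follows it.
$without_u = preg_match('/\b' . $quoted . '\b/i', $subject);

// UTF-8 mode: Ä is treated as a single letter, so \b can match after it.
$with_u = preg_match('/\b' . $quoted . '\b/iu', $subject);

var_dump($without_u, $with_u);
```

With the default "C" locale, the first call should report no match and the second a match. A plain ASCII keyword such as 'hello' behaves the same in both modes, which is why only the keyword with the special character fails.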

ThiefMaster

I suspect that your problem is to do with multibyte characters and the encoding.

From the question "multi-byte function to replace preg_match_all?":

Have you taken a look into mb_ereg?

Additionally, you can pass a UTF-8 encoded string into preg_match using the u modifier, which might be the kind of multibyte support you need. The other option is to encode into UTF-8 and then encode the results back.

In this case, the u modifier would be added like this:

if (preg_match("/\b".preg_quote($kw_to_search_for, '/')."\b/iu", $search_strings[$i])) {
    // found
}
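One thing to watch for with u: if the subject is not valid UTF-8 (for example, it is still ISO-8859-1), preg_match returns false rather than 0, which is easy to mistake for "not found". A rough sketch of a wrapper that makes this visible; the function name contains_keyword is hypothetical and not part of PHP or the original code:

```php
<?php
// Hypothetical helper (not from the original posts): whole-word,
// case-insensitive keyword search that assumes UTF-8 input.
function contains_keyword(string $keyword, string $subject): bool
{
    // Pass the delimiter so a "/" inside the keyword is escaped too.
    $pattern = '/\b' . preg_quote($keyword, '/') . '\b/iu';

    $result = preg_match($pattern, $subject);
    if ($result === false) {
        // With u, invalid UTF-8 input makes preg_match() fail
        // instead of simply reporting "no match".
        throw new InvalidArgumentException(
            'preg_match failed with error code ' . preg_last_error()
        );
    }
    return $result === 1;
}

// "\u{E4}" is ä and "\u{C4}" is Ä (PHP 7+ escapes).
var_dump(contains_keyword("hall\u{E4}", "say Hall\u{C4} there"));
```

If the data arrives in another encoding, it could be converted first, for instance with mb_convert_encoding($text, 'UTF-8', 'ISO-8859-1'), assuming the source encoding is actually known.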
Ben Swinburne