
The code below

preg_match('~\b(rain|dry|certain|clear)\b~i',$string);

works like a charm, but when I'm searching for words with accented characters it doesn't work. Can somebody help me?

user367864

1 Answer


Well, technically a, á, and à are all different characters to the interpreter. They are encoded differently, and there is no built-in way to know which of them represent a "similar" character (in some languages, accented characters are entirely different letters). So you need to include every variant you want to match. Also, if you need the actual offset of a match within the string, you might run into difficulties, because for UTF-8 strings the offset is given in bytes, not characters.

See this SO question for an example of how to include all versions of a character.
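As a rough illustration of "include all variants", here is a minimal sketch. The helper name accentVariantPattern and its variant table are hypothetical (they are not from the linked question), the table is deliberately incomplete, and the file is assumed to be saved as UTF-8. It also swaps \b for lookarounds on \pL and \pN, so that accented letters count as part of a word, and adds the u modifier so the pattern and subject are treated as UTF-8.

```php
<?php
// Hypothetical helper: expands each base letter of every word into a
// character class that lists the accented variants we want to accept.
function accentVariantPattern(array $words): string
{
    // Assumed, incomplete variant table; extend it for your language.
    $variants = [
        'a' => '[aáàâäã]',
        'e' => '[eéèêë]',
        'i' => '[iíìîï]',
        'o' => '[oóòôöõ]',
        'u' => '[uúùûü]',
        'c' => '[cç]',
        'n' => '[nñ]',
    ];

    $alternatives = [];
    foreach ($words as $word) {
        // strtr() replaces each key once and never rescans its own output,
        // so the inserted character classes are not expanded again.
        $alternatives[] = strtr(preg_quote($word, '~'), $variants);
    }

    // The lookarounds replace \b: no letter, digit or underscore may
    // touch the match on either side.
    return '~(?<![\pL\pN_])(' . implode('|', $alternatives) . ')(?![\pL\pN_])~iu';
}

$pattern = accentVariantPattern(['rain', 'dry', 'certain', 'clear']);

var_dump(preg_match($pattern, 'It will ráin today')); // int(1)
var_dump(preg_match($pattern, 'rough terrain'));      // int(0)
```

Generating the classes from a table keeps the word list readable; writing the classes by hand directly in the pattern works just as well for a handful of words.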

And this bug report in case you encounter the problem with the wrong offsets.
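To see the byte-versus-character difference in practice, here is a small sketch. It assumes the file is saved as UTF-8 and that the mbstring extension is available for mb_strlen; the sample string is made up for illustration.

```php
<?php
// Sample UTF-8 string: each "é" occupies two bytes.
$string = 'événement clair';

// PREG_OFFSET_CAPTURE returns [matched text, offset] pairs;
// the offset is counted in bytes, even with the u modifier.
preg_match('~clair~u', $string, $matches, PREG_OFFSET_CAPTURE);
$byteOffset = $matches[0][1];

// Convert to a character offset by measuring the prefix in characters.
$charOffset = mb_strlen(substr($string, 0, $byteOffset), 'UTF-8');

var_dump($byteOffset); // int(12)
var_dump($charOffset); // int(10)
```

The byte offset is the right one to pass to substr(), while the character offset is the one to use with the mb_* string functions.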

Martin Ender