The two strings compare as not equal because they are actually different sequences of bytes. To compare them, you need to normalize them to a common form first.
The best way to do that is to use the Transliterator class, part of the intl
extension on PHP 5.4+.
Test code:
<?php
$transliterator = Transliterator::createFromRules(':: NFD; :: [:Nonspacing Mark:] Remove; :: NFC;', Transliterator::FORWARD);
$test = ['abcd', 'èe', '€', 'àòùìéëü', 'àòùìéëü', 'tiësto'];
foreach($test as $e) {
    $normalized = $transliterator->transliterate($e);
    echo $e . ' --> ' . $normalized . "\n";
}
?>
Result:
abcd --> abcd
èe --> ee
€ --> €
àòùìéëü --> aouieeu
àòùìéëü --> aouieeu
tiësto --> tiesto
(taken from my answer here: mySQL - matching latin (english) form input to utf8 (non-English) data )
This replaces characters according to the tables of the ICU library, which are comprehensive and well-tested. The rule set first decomposes the string (NFD), then removes the combining marks, then recomposes what is left (NFC), so it handles every possible way to represent a character like ñ. For example, ñ can be stored as a single precomposed code point (U+00F1, two bytes in UTF-8) or as the letter n followed by the combining tilde (U+0303).
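To illustrate why the original comparison fails, here is a minimal sketch using the Normalizer class (also part of the intl extension). The byte escapes are my own way of spelling out the two encodings of ñ explicitly; the variable names are just for illustration.

```php
<?php
// Requires the intl extension.
// Two encodings of the same visible character "ñ", written as raw UTF-8 bytes.
$precomposed = "\xC3\xB1";   // U+00F1 LATIN SMALL LETTER N WITH TILDE
$decomposed  = "n\xCC\x83";  // U+006E "n" followed by U+0303 COMBINING TILDE

// Byte-wise comparison: the strings differ.
var_dump($precomposed === $decomposed); // bool(false)

// Bring both to the same normalization form (NFC) before comparing.
$a = Normalizer::normalize($precomposed, Normalizer::FORM_C);
$b = Normalizer::normalize($decomposed, Normalizer::FORM_C);
var_dump($a === $b); // bool(true)
```

Note that this only makes the two representations of ñ equal to each other. It does not strip the accent, which is what the Transliterator rules above do.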
Unlike soundex(), which is also very resource-intensive, this approach does not compare sounds, so it is more accurate.
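Applied to the original problem of comparing two strings, the same rules could be wrapped in a small helper. This is only a sketch: `sameIgnoringAccents` is a hypothetical name of mine, not part of intl, and it assumes the source file is saved as UTF-8.

```php
<?php
// Hypothetical helper (not part of intl): compares two strings after
// stripping combining marks with the same rules as in the test code above.
function sameIgnoringAccents($a, $b)
{
    // Create the transliterator once and reuse it across calls.
    static $transliterator = null;
    if ($transliterator === null) {
        $transliterator = Transliterator::createFromRules(
            ':: NFD; :: [:Nonspacing Mark:] Remove; :: NFC;',
            Transliterator::FORWARD
        );
    }

    // transliterate() returns false on failure; real code may want to check for that.
    return $transliterator->transliterate($a) === $transliterator->transliterate($b);
}

var_dump(sameIgnoringAccents('tiësto', 'tiesto')); // bool(true)
var_dump(sameIgnoringAccents('tiësto', 'tiesta')); // bool(false)
```

The comparison is still case-sensitive. If you also need to ignore case, lowercase both results (for example with mb_strtolower()) before comparing.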