14

I have, for example, two strings:

$a = "joao";
$b = "joão";

if ( strtoupper($a) == strtoupper($b)) {
    echo $b;
}

I want the comparison to be true despite the accent. However, I need it to ignore the accent rather than replace it, because I need it to echo "joão" and not "joao".

All the answers I've seen replace "ã" with "a" instead of making the comparison true. I've been reading about normalization, but I can't make it work either. Any ideas? Thank you.

John Conde
Penny

3 Answers

27

Just convert the accented characters to their non-accented counterparts and then compare the strings. The function in my answer will remove the accents for you.

function removeAccents($string) {
    // "ã" -> "&atilde;" -> "a"
    $plain = preg_replace('~&([a-z]{1,2})(acute|cedil|circ|grave|lig|orn|ring|slash|th|tilde|uml);~i', '$1', htmlentities($string, ENT_QUOTES, 'UTF-8'));
    // Other non-alphanumeric runs become "-"; the result is lowercased.
    return strtolower(trim(preg_replace('~[^0-9a-z]+~i', '-', $plain), ' '));
}

$a = "joaoaaeeA";
$b = "joãoâàéèÀ";

var_dump(removeAccents($a) === removeAccents($b));

Output:

bool(true)


John Conde
  • demo link broken, does it work for all accents characters ? like â à é è ? – Xsmael Dec 20 '18 at 16:13
  • Also Capital letters with accents ? – Xsmael Dec 20 '18 at 16:14
  • Updated the answer to use more sample characters – John Conde Apr 18 '19 at 15:53
  • But why the strtolower()? – chuckedw May 28 '19 at 17:24
  • Upvoted, great answer! But, unfortunately, this solution fails on the following case: `Ōsugi Sakae`. =\ – HoldOffHunger Dec 07 '21 at 18:01
  • @HoldOffHunger Thank you. I tested the string you provided and it seemed to work for me: https://3v4l.org/o5T4B – John Conde Dec 07 '21 at 18:03
  • Oh, whoa, that is interesting! Your linked code works! (yey!) Good --> `return preg_replace('/[\x{0300}-\x{036f}]/u',"",normalizer_normalize($str,Normalizer::FORM_D));` But this is diff from your answer, which, when I test, returns "-sugi". Take a look: https://ideone.com/CE7qcl – HoldOffHunger Dec 07 '21 at 18:15
  • Weird. The code is identical but gives different results. Works locally for me, too, so I am guessing that site has some kind of configuration that is different than what the other site and I have. – John Conde Dec 07 '21 at 18:28
5

I would like to share an elegant solution that avoids htmlentities and doesn't require manually listing every character replacement. It is a PHP translation of the answers to this post.

function removeAccents($str) {
    return preg_replace('/[\x{0300}-\x{036f}]/u',"",normalizer_normalize($str,Normalizer::FORM_D));
}

$a = "joaoaaeeA";
$b = "joãoâàéèÀ";

var_dump(removeAccents($a) === removeAccents($b));

Output:

bool(true)
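
To tie this back to the question (compare while ignoring accents, but echo the original string), here is a minimal sketch built on the same normalization idea. The helper name `equalsIgnoringAccents` is hypothetical and not part of the answer above. It assumes the intl extension (for `normalizer_normalize`) and the mbstring extension (for `mb_strtoupper`) are available.

```php
<?php
// Hypothetical helper (not from the answer): true when the two strings
// differ only by accents and/or letter case.
// Assumes the intl and mbstring extensions are enabled.
function equalsIgnoringAccents(string $a, string $b): bool
{
    $fold = function (string $s): string {
        // Decompose each character (e.g. "ã" becomes "a" + U+0303),
        // then drop the combining marks in U+0300..U+036F.
        $decomposed = normalizer_normalize($s, Normalizer::FORM_D);
        $stripped = preg_replace('/[\x{0300}-\x{036f}]/u', '', $decomposed);
        // Uppercase so the comparison is also case-insensitive.
        return mb_strtoupper($stripped, 'UTF-8');
    };

    return $fold($a) === $fold($b);
}

$a = "joao";
$b = "joão";

if (equalsIgnoringAccents($a, $b)) {
    echo $b; // $b itself is never modified, so the accent is kept
}
```

Only temporary copies are folded for the comparison, so the accented original can still be echoed. Like the function above, this covers only accents that decompose into combining marks in the U+0300 to U+036F range.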
Francois Kneib
-2

It's not a plain PHP solution, but it works very well for this situation. Run this query in MySQL:

SELECT 'joão' = 'joao'

So if you have access to MySQL, you can use it from PHP.

thiago marini
  • And FYI: this is true only depending on the collation of the table, see https://stackoverflow.com/questions/4813620/how-to-remove-accents-in-mysql/17045193 – Romain 'Maz' BILLOIR Sep 01 '21 at 13:57