3

When I use the following script, I get double characters. Why?

$clean_lastname = "Dür";
$clean_lastname = preg_replace("/[ùúûü]/", "u", $clean_lastname);
echo $clean_lastname;

Output: Duur

I want it to be Dur.

I am still doing something wrong... How do I pass one value of an array at a time to the preg function?

$clean_lastname = "Boerée";
$l = 0;
$pattern = array('[ÀÁÂÃÄÅ]','[Ç]','[ÈÉÊË]','[ÌÍÎÏ]','[Ñ]','[ÒÓÔÕÖØ]','[Ý]','[ß]','[àáâãäå]','[ç]','[èéêë]','[ìíîï]','[ñ]','[òóôõöø]','[ùúûü]','[ýÿ]');
$replace = array(A,C,E,I,N,O,Y,S,a,c,e,i,n,o,u,y);

foreach ($pattern as $wierdchar)
{
    $clean_lastname = preg_replace('/$wierdchar/u', '$replace[$l]', $clean_lastname);
    $l++;
}

//$clean_lastname = preg_replace('/[èéêë]/u', 'e', $clean_lastname);

//$clean_lastname = strtr($clean_lastname, "ùúûü","uuuu");
echo $clean_lastname;
Peter Mortensen
Thijs

4 Answers

2
$clean_lastname = str_replace(array('ù', 'ú', 'û', 'ü', 'Ù', 'Ú', 'Û', 'Ü'), array('u', 'u', 'u', 'u', 'U', 'U', 'U', 'U'), $clean_lastname);

// OR to solve your initial issue:

$clean_lastname = preg_replace('/[ùúûü]/u', 'u', $clean_lastname);
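
The array-based loop from the question can be handled in the same spirit. In that loop, the single-quoted '/$wierdchar/u' and '$replace[$l]' are never interpolated, and the replacement letters are unquoted. A minimal sketch follows, assuming the source file is saved as UTF-8; the function name strip_accents is hypothetical, and the arrays simply mirror the ones in the question:

```php
<?php
// Hypothetical helper; the character classes and replacements mirror the question's arrays.
// Assumes this file and the input string are both UTF-8 encoded.
function strip_accents(string $text): string
{
    $classes = array('[ÀÁÂÃÄÅ]', '[Ç]', '[ÈÉÊË]', '[ÌÍÎÏ]', '[Ñ]', '[ÒÓÔÕÖØ]', '[Ý]', '[ß]',
                     '[àáâãäå]', '[ç]', '[èéêë]', '[ìíîï]', '[ñ]', '[òóôõöø]', '[ùúûü]', '[ýÿ]');
    // Replacements must be quoted strings, matched to $classes by position.
    $replacements = array('A', 'C', 'E', 'I', 'N', 'O', 'Y', 'S',
                          'a', 'c', 'e', 'i', 'n', 'o', 'u', 'y');

    // Build each pattern by concatenation, adding delimiters and the u modifier.
    $patterns = array_map(function ($class) {
        return '/' . $class . '/u';
    }, $classes);

    // preg_replace() accepts arrays and applies the pairs in order.
    $result = preg_replace($patterns, $replacements, $text);

    // preg_replace() returns null on error (for example, invalid UTF-8 input).
    return $result === null ? $text : $result;
}

echo strip_accents("Boerée"), "\n"; // Boeree
```

Passing both arrays to a single preg_replace() call removes the need for the manual $l counter.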
Paul Norman
  • I solved it! I need preg_replace with the u modifier. Thank you so much, I was really lost! – Thijs Dec 16 '10 at 18:00
2

The only situation in which I can imagine this happening is when your two strings (the input string and the pattern) have different character encodings, or both are UTF-8 but you didn’t specify it properly.

In the latter case, "Dür" is equivalent to "D\xC3\xBCr" (ü is encoded as the two-byte sequence 0xC3 0xBC) and the pattern "/[ùúûü]/" is equivalent to "/[\xC3\xB9\xC3\xBA\xC3\xBB\xC3\xBC]/". Without the u modifier, each byte is treated as a single character, so the character class matches any of the bytes 0xC3, 0xB9, 0xBA, 0xBB and 0xBC. Both bytes of ü match individually and each one is replaced by "u", which yields the following result:

echo preg_replace("/[\xC3\xB9\xC3\xBA\xC3\xBB\xC3\xBC]/", "u", "D\xC3\xBCr");  // Duur

So when working with UTF-8, make sure to set the u modifier so that the pattern and input string are treated as UTF-8 encoded:

"/[ùúûü]/u"

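To see the difference side by side, here is a small sketch that runs the same character class with and without the u modifier. The bytes are written as explicit escapes so the result does not depend on how the file is saved; the variable names are illustrative only:

```php
<?php
// "Dür" and the characters ùúûü, spelled out as UTF-8 bytes.
$name  = "D\xC3\xBCr";
$chars = "\xC3\xB9\xC3\xBA\xC3\xBB\xC3\xBC";

// Without u: the class is a set of single bytes, so both bytes of ü are replaced.
$bytewise = preg_replace("/[" . $chars . "]/", "u", $name);

// With u: pattern and subject are decoded as UTF-8, so ü is one character.
$utf8 = preg_replace("/[" . $chars . "]/u", "u", $name);

echo $bytewise, "\n"; // Duur
echo $utf8, "\n";     // Dur
```
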
Edit: Now that you have clarified your intentions and seem to be implementing some kind of transliteration, you should take a look at iconv and its ability to transliterate:

iconv("UTF-8", "US-ASCII//TRANSLIT", $str)
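
A hedged sketch of how that call might be wrapped. The function name to_ascii is hypothetical, and the exact output of //TRANSLIT depends on the iconv implementation and the current locale, so no expected output is shown for accented input:

```php
<?php
// Hypothetical wrapper around the iconv() call above.
// Assumes the iconv extension is available and $str is UTF-8 encoded.
function to_ascii(string $str): string
{
    // @ suppresses the notice that iconv() raises for characters it cannot convert.
    $converted = @iconv("UTF-8", "US-ASCII//TRANSLIT", $str);

    // iconv() returns false on failure; fall back to the unchanged input.
    return $converted === false ? $str : $converted;
}

echo to_ascii("D\xC3\xBCr"), "\n";
```

Because results vary between platforms, it is worth checking the output on the target system before relying on it.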

Gumbo
  • I am getting there... So if I am right, I should type proper UTF-8 encoded characters in the strtr function, and then it will work? – Thijs Dec 16 '10 at 17:38
  • @Thijs: No, you actually just need to use the *u* modifier in your pattern so that the pattern and subject string are treated as UTF-8 encoded. And no, `strtr` is a single-byte string function but UTF-8 is a multi-byte string encoding. You shouldn’t use single-byte string functions with multi-byte string encodings. – Gumbo Dec 16 '10 at 17:46
  • How do I do that? I am really getting lost here. I have the exact same characters in my PHP program and it still does not work... :( – Thijs Dec 16 '10 at 17:49
  • @Thijs: Are you sure you’re using UTF-8? – Gumbo Dec 16 '10 at 17:53
1
<?php
    // Accented characters and their plain replacements, matched by position.
    $accented = array("ù", "ú", "û", "ü");
    $plain = array("u", "u", "u", "u");
    $clean_lastname = "Dür";
    echo str_replace($accented, $plain, $clean_lastname);
?>
Peter Mortensen
Pradeep Singh
1

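Stick with your original strtr. Note that the three-argument form translates byte by byte, so it only works as written when the file and the string use a single-byte encoding such as ISO-8859-1: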

$clean_lastname = "Dür Dùr Dúr Dûr";
$clean_lastname = strtr($clean_lastname, "ùúûü", "uuuu");
echo $clean_lastname;
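
If the strings are UTF-8, the two-argument form of strtr with a replacement array is an alternative, because it replaces whole substrings rather than single bytes. A minimal sketch, assuming the file and the input use the same encoding; the $map name is illustrative:

```php
<?php
// Each key is replaced as a whole substring, so multi-byte characters stay intact.
$map = array('ù' => 'u', 'ú' => 'u', 'û' => 'u', 'ü' => 'u');

$clean_lastname = "Dür Dùr Dúr Dûr";
$clean_lastname = strtr($clean_lastname, $map);

echo $clean_lastname, "\n"; // Dur Dur Dur Dur
```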
ajreal