
Possible Duplicate:
PHP: Replace umlauts with closest 7-bit ASCII equivalent in an UTF-8 string

I want to replace characters that carry diacritics with their plain counterparts. For example, from "guľôčka" I want to get "gulocka".

Is there a native function for this?

I was looking for a list of all diacritic characters worldwide to use with str_replace, but I can't find one.

Thanks a lot.

Draex_

1 Answer

You can do this with iconv, which is available in PHP, by requesting an encoding conversion with transliteration. (This works for many different scripts.) If you only need basic European characters, make the target Latin-1, or even ASCII.

From the manual page:

iconv("UTF-8", "ISO-8859-1//TRANSLIT", $text)
Kerrek SB
  • Thank you for your answer. This code returns "gulôcka"; it doesn't replace ô. As targets I tried Latin-1, ASCII and ISO-8859-1. – Draex_ Jul 01 '11 at 19:57
  • Can you print the result in raw hex? Is the accent a separate byte? If so, you can run a regex over the result and strip non-alphanumerics. – Kerrek SB Jul 01 '11 at 20:03
  • `$text = 'guľôčka'; $iso = iconv("UTF-8", "ISO-8859-1//TRANSLIT", $text); var_dump(strlen($iso)); // int(7)` So I think it's in 1 byte. I don't want to strip, I need to replace. – Draex_ Jul 03 '11 at 07:02
  • @Peter: Don't target ISO-8859-1, but a genuine 7-bit ASCII encoding instead! – Kerrek SB Jul 03 '11 at 10:02
  • @Peter: There you go - now just run a regex over this to sieve out the non-alphabetical characters! – Kerrek SB Jul 05 '11 at 19:38
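
The approach the comments converge on (target 7-bit ASCII with transliteration, then filter the result with a regex) could be sketched roughly as follows. This is only a sketch under assumptions: the helper name `to_ascii_letters` is hypothetical, the source file is assumed to be saved as UTF-8, and what `//TRANSLIT` produces for a given character depends on the iconv implementation and locale on your system, so no particular output is promised here.

```php
<?php
// Hypothetical helper (not a PHP built-in): transliterate UTF-8 text to
// 7-bit ASCII, then keep only letters, digits and spaces.
function to_ascii_letters(string $text): ?string
{
    // Request ASCII with transliteration. The result varies between iconv
    // implementations and locales; "@" silences the notice PHP raises
    // when the conversion cannot be performed.
    $ascii = @iconv('UTF-8', 'ASCII//TRANSLIT', $text);
    if ($ascii === false) {
        return null; // conversion failed or //TRANSLIT is unsupported
    }

    // Some implementations emit the accent as a separate ASCII character,
    // so sieve out everything that is not a letter, digit or space.
    return preg_replace('/[^A-Za-z0-9 ]/', '', $ascii);
}

$result = to_ascii_letters('guľôčka');
var_dump($result);

// Raw hex view of the result, handy for checking for stray bytes.
if ($result !== null) {
    var_dump(bin2hex($result));
}
```

Note that the regex also removes any placeholder (such as "?") that iconv substitutes for a character it cannot transliterate, so on such systems a letter may be dropped rather than replaced; inspecting the hex dump before filtering shows which case applies.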