
I work with Vietnamese text, and I am having trouble finding the exact accented phrase that corresponds to an unaccented keyword. I have one string and one keyword:

$mystring = "từ khóa a,từ khóa b, từ khóa c";
$mykeyword = "tu khoa b";

How can I use $mykeyword to get từ khóa b from $mystring?

Thank you!

Will
Giangimgs
    You can try this answer here http://stackoverflow.com/questions/1008802/converting-symbols-accent-letters-to-english-alphabet – Gautam Jose Jul 12 '15 at 09:40

1 Answer

What you want is, I believe, a form of Unicode normalization: transliterating the accented text to plain ASCII so that it can be compared with your keyword.

This post explains some of the foundations. Try this:

php > $mystring = "từ khóa a,từ khóa b, từ khóa c";
php > $mykeyword = "tu khoa b";
php > var_dump(transliterator_transliterate('Any-Latin; Latin-ASCII; [\u0080-\u7fff] remove', $mystring));
string(30) "tu khoa a,tu khoa b, tu khoa c"
php >

Now, you can use the normal string manipulation functions to see if $mykeyword is contained within $mystring. Note that characters which don't have ASCII translations will be removed.

Note that for this to work, you need the PHP intl extension installed (often a package called php5-intl, or php-intl on newer distributions). See here.
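To get the original accented phrase back rather than just a yes/no answer, one option is to transliterate each item separately and return the untouched original when its ASCII form matches the keyword. The following is only a minimal sketch: the function name find_accented_match() is hypothetical, it assumes the items are comma-separated as in the question, and it assumes PHP 7.1+ with the intl extension loaded.

```php
<?php
// Hypothetical helper (not part of PHP or intl): returns the original,
// accented item whose ASCII form equals the ASCII form of the keyword.
function find_accented_match(string $haystack, string $keyword): ?string
{
    // Same transliterator rule as in the example above; requires ext-intl.
    $rule = 'Any-Latin; Latin-ASCII; [\u0080-\u7fff] remove';

    // Normalize the keyword once: ASCII, lower case, no surrounding spaces.
    $needle = strtolower(trim(transliterator_transliterate($rule, $keyword)));

    // Assumption: items are separated by commas, as in $mystring.
    foreach (explode(',', $haystack) as $item) {
        $item = trim($item);
        $ascii = strtolower(transliterator_transliterate($rule, $item));
        if ($ascii === $needle) {
            return $item; // original spelling, accents intact
        }
    }

    return null; // no item matched the keyword
}

$mystring = "từ khóa a,từ khóa b, từ khóa c";
$mykeyword = "tu khoa b";
var_dump(find_accented_match($mystring, $mykeyword));
```

Comparing item by item, instead of searching the whole transliterated string, avoids having to map byte offsets in the ASCII copy back to the multi-byte original.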

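You can also use the Normalizer class (also provided by the intl extension) and preg_replace() to strip accents. This removes only combining marks, so letters with no decomposition, such as the Vietnamese đ, are left unchanged: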

php > var_dump(preg_replace('/\p{Mn}/u', '', Normalizer::normalize($mystring, Normalizer::FORM_KD)));
string(30) "tu khoa a,tu khoa b, tu khoa c"
php >

Yet another way is to use iconv():

php > var_dump(preg_replace('/[^a-zA-Z0-9 -]+/', '', iconv('UTF-8', 'ASCII//TRANSLIT//IGNORE', $mystring)));
string(25) "t khoa at khoa b t khoa c"

However, as you can see, the ừ did not transliterate properly: it was dropped, turning từ into t.

Community
Will
  • Why am I seeing _"**Fatal error**: Call to undefined function transliterator_transliterate() in **C:\xampp\htdocs\...\index.php** on line **5**"_ when testing? – 5ervant - techintel.github.io Jul 12 '15 at 10:19
  • Because you need the `intl` module :) See [here](https://stackoverflow.com/questions/23431788/how-to-install-intl-php-extension-with-wamp-server) for how to enable it on WAMP. – Will Jul 12 '15 at 10:23
  • Do you think that module is already installed on most shared web hosts? – 5ervant - techintel.github.io Jul 12 '15 at 10:27
  • Most likely. It's part of the standard PHP distribution. If it's not, for some reason, you should be able to get them to install it. I added another way to do it without `intl` that should work on any PHP 5.3.0+ installation. – Will Jul 12 '15 at 10:34
  • How did you install PHP? This is probably as simple as installing `php5-intl`, or at worst, running `pecl install intl`. – Will Jan 24 '16 at 05:06
  • Ok follow instructions [here](https://stackoverflow.com/questions/1451468/intl-extension-installing-php-intl-dll)! :) – Will Jan 24 '16 at 05:17
  • Add the `iconv` method as well: `preg_replace('/[^A-Za-z0-9]/', '', iconv('UTF-8', 'ASCII//TRANSLIT//IGNORE', $content));` http://ideone.com/jOw5Cu – Will B. May 31 '16 at 03:53