Yii2 strips all non-ASCII (UTF-8) characters when it generates a slug, but I want to keep UTF-8 characters as part of the slug.

How can I implement a UTF-8 slug in Yii2? The Yii2 documentation for SluggableBehavior says that you can use generateSlug(), but how?

Can someone shed some light on the subject? Thanks.

Alireza

2 Answers

I think you have to create a subclass MySluggableBehaviour of SluggableBehavior (note the framework's spelling, without the "u") and a subclass MyInflector of Inflector.

Override generateSlug() in MySluggableBehaviour so that it uses MyInflector instead of Inflector:

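use yii\behaviors\SluggableBehavior;

class MySluggableBehaviour extends SluggableBehavior
{
  // Same as the parent implementation, but delegates to MyInflector
  protected function generateSlug($slugParts)
  {
    return MyInflector::slug(implode('-', $slugParts));
  }
}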

MyInflector can then change the resulting slug by changing the transliteration rule, for instance:

use yii\helpers\Inflector;

class MyInflector extends Inflector
{
  // Overrides the default transliteration rule used by Inflector
  public static $transliterator = self::TRANSLITERATE_STRICT;
}

Or rewrite MyInflector::slug() according to your needs.

Source code of the classes to extend: https://github.com/yiisoft/yii2/blob/master/framework/behaviors/SluggableBehavior.php and https://github.com/yiisoft/yii2/blob/master/framework/helpers/BaseInflector.php
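For completeness, here is a rough sketch of how the custom behavior might be attached to a model. The `Post` model and the `title` attribute are made up for illustration; `attribute` and `slugAttribute` are the standard SluggableBehavior options. As far as I can tell, generateSlug() only exists from Yii 2.0.7 onwards, so on an older version the override would never be called.

```php
<?php
use yii\db\ActiveRecord;

// Hypothetical model; "Post" and the "title" attribute are assumptions for illustration.
class Post extends ActiveRecord
{
    public function behaviors()
    {
        return [
            [
                'class' => MySluggableBehaviour::class,
                'attribute' => 'title',     // source attribute for the slug
                'slugAttribute' => 'slug',  // attribute the slug is written to (this is the default name)
            ],
        ];
    }
}
```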

Mat

Note: all code here is adapted from the Yii GitHub repository: https://github.com/yiisoft/yii2/blob/master/framework/behaviors/SluggableBehavior.php and https://github.com/yiisoft/yii2/blob/master/framework/helpers/BaseInflector.php

If all you want to do is increase the transliteration strictness, then you simply have to do what Mat says.

If you want the actual characters to appear in the slug, however (for Chinese or Japanese, say), you may want to rewrite the Inflector::slug function and use your own regex to filter.

Also, trim() only strips ASCII whitespace by default, so you'll probably want to use preg_replace with the /u modifier for trimming as well.

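use yii\helpers\Inflector;

class MyInflector extends Inflector
{
  public static function slug($string, $replacement = '-', $lowercase = true, $unicode_filter = '')
  {
      // Remove everything except ASCII letters, digits, separators, and the ranges in $unicode_filter.
      // The escaped hyphen keeps "-" so that joined slug parts stay separated.
      $string = preg_replace('/[^a-zA-Z0-9\x{2014}\x{2013}=\s\-' . $unicode_filter . ']+/u', '', $string);
      // Collapse runs of separators (=, whitespace, dashes) into the replacement
      $string = preg_replace('/[=\s\x{2014}\x{2013}\-]+/u', $replacement, $string);

      // Multi-byte trim: whitespace is already converted, so strip the replacement from both ends too
      $edge = '(?:\s|' . preg_quote($replacement, '/') . ')+';
      $string = preg_replace('/^' . $edge . '|' . $edge . '$/u', '', $string);

      // Need to use multi-byte strtolower
      return $lowercase ? mb_strtolower($string, 'UTF-8') : $string;
  }
}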

Now you can filter using any Unicode set you want on top of the normal set, but there is no transliteration at all here. If you want transliteration support, you'll need to transliterate the characters that are not covered by the Unicode filter separately in a loop and then piece the string back together. It's messier that way, but doable.
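A rough sketch of that loop, written as a standalone function so it can be tried outside Yii. The function name and the stand-in transliterator are made up for illustration; in a real project you would pass whatever transliteration callable you actually use (possibly Inflector::transliterate, if it is public in your Yii version).

```php
<?php
/**
 * Hypothetical helper: keeps characters matching $unicodeFilter as they are
 * and passes every other run of characters through $transliterate.
 */
function transliterateOutsideFilter(string $string, string $unicodeFilter, callable $transliterate): string
{
    if ($unicodeFilter === '') {
        // Nothing to protect, so transliterate the whole string
        return $transliterate($string);
    }

    $class = '[' . $unicodeFilter . ']+';
    // Split into alternating runs: characters inside the filter vs. everything else
    $parts = preg_split('/(' . $class . ')/u', $string, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

    $result = '';
    foreach ($parts as $part) {
        $isKept = preg_match('/^' . $class . '$/u', $part) === 1;
        $result .= $isKept ? $part : $transliterate($part);
    }

    return $result;
}

// Stand-in transliterator for the demo only; it knows just two characters
$demoTransliterate = function (string $s): string {
    return strtr($s, ['é' => 'e', 'ü' => 'u']);
};

echo transliterateOutsideFilter('café 東京 über', '\x{3000}-\x{9faf}', $demoTransliterate), "\n";
// prints "cafe 東京 uber"
```

The result can then go through the slug filter as usual, since everything outside the protected range is plain ASCII by that point (assuming the transliterator handles all of your input).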

Example:

If you were working with Japanese characters, you would use: [\x{3000}-\x{9faf}] (https://stackoverflow.com/a/30200250/3238924)

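use yii\behaviors\SluggableBehavior;

class MySluggableBehaviour extends SluggableBehavior
{
  protected function generateSlug($slugParts)
  {
    // The fourth argument adds the Japanese range to the set of characters that are kept
    return MyInflector::slug(implode('-', $slugParts), '-', true, '\x{3000}-\x{9faf}');
  }
}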
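
If you want to see what the filter does without wiring up Yii, here is a minimal standalone sketch of the same idea. The function name is made up; unlike a class override it takes the filter as its second argument, and it keeps ASCII hyphens the way Yii's own Inflector::slug() does.

```php
<?php
// Hypothetical standalone version of the slug filter, for experimenting outside Yii
function unicodeSlug(string $string, string $unicodeFilter = '', string $replacement = '-'): string
{
    // Keep ASCII letters, digits, separators, and whatever ranges the caller adds
    $string = preg_replace('/[^a-zA-Z0-9=\s\x{2014}\x{2013}\-' . $unicodeFilter . ']+/u', '', $string);
    // Collapse runs of separators into the replacement
    $string = preg_replace('/[=\s\x{2014}\x{2013}\-]+/u', $replacement, $string);
    // Strip the replacement from both ends
    $string = trim($string, $replacement);

    return mb_strtolower($string, 'UTF-8');
}

echo unicodeSlug('Hello 東京 World!', '\x{3000}-\x{9faf}'), "\n"; // prints "hello-東京-world"
echo unicodeSlug('Hello 東京 World!'), "\n";                      // prints "hello-world"
```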
Architect Nate
  • Persian (Arabic) characters are removed from the slug. What should I do to prevent that? – Alireza Feb 05 '16 at 08:15
  • From what I can tell from http://unicode.org/charts/PDF/U0600.pdf, it would be \x{0600}-\x{06ff}; however, there are other ranges at http://unicode.org/charts/ – Architect Nate Feb 05 '16 at 08:17
  • So to cover all possible Arabic symbols you would probably pass in this: '\x{0600}-\x{06ff}\x{0750}-\x{077f}\x{08a0}-\x{08ff}\x{fb50}-\x{fdff}\x{fe70}-\x{feff}'. You'll need to make sure that you can put the sets next to each other like that; I think you can, though. Also feel free to exclude any sets you don't need: on the site I posted you can see there are 4 sets under Arabic plus the Arabic set itself, and I included all of them here. – Architect Nate Feb 05 '16 at 08:23
  • Also note that cuneiform is its own set as well, if you need that for Old Persian. – Architect Nate Feb 05 '16 at 08:24
  • I had to use the `getValue()` function rather than `generateSlug()`. `generateSlug()` is never called; I put a `die` in it to check. – Alireza Feb 05 '16 at 14:49
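
On the question in the comments of whether several ranges can sit next to each other in one character class: they can, as long as every bound is written as \x{...}. Here is a small standalone sketch to check it. The function name and the sample text are made up for illustration.

```php
<?php
// Arabic-script ranges listed in the comments, each bound written as \x{...}
$arabicRanges = '\x{0600}-\x{06ff}\x{0750}-\x{077f}\x{08a0}-\x{08ff}'
    . '\x{fb50}-\x{fdff}\x{fe70}-\x{feff}';

// Hypothetical helper mirroring the slug filter, with the ranges appended to the allowed set
function persianSlug(string $string, string $ranges): string
{
    // Drop everything except ASCII letters, digits, separators, and the given ranges
    $string = preg_replace('/[^a-zA-Z0-9=\s\-' . $ranges . ']+/u', '', $string);
    // Collapse separators into a single hyphen
    $string = preg_replace('/[=\s\-]+/u', '-', $string);

    return mb_strtolower(trim($string, '-'), 'UTF-8');
}

// The punctuation is dropped; the Persian words and "hello" are kept and joined by hyphens
echo persianSlug('سلام دنیا! Hello?', $arabicRanges), "\n";
```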