I have this function, which does not work quite right in PHP 5.2.0. It trims a string to a desired length:

function neat_trim($str, $n, $delim='...')
{
    $len = strlen($str);

    if ($len > $n)
    {
        preg_match('/(.{' . $n . '}.*?)\b/', $str, $matches);
        return rtrim($matches[1]) . $delim;
    }
    return $str;
}

When I call it like this:

$multibyte_string = "Portion of Chicken for 1 person<br>一人份鸡肉";

echo neat_trim($multibyte_string,42) . "</br>";

it produces:

Portion of Chicken for 1 person
一人�...

Unfortunately, it doesn't work at all on PHP 5.4.29, where it produces:

...

I've tried this and this, but neither worked. Please help.

Seto
  • If this is *utf-8*: 1.) Instead of `$len = strlen($str);` use `mb_strlen($str, "utf-8");`. The length is **40** characters, not **50** bytes ([mbstring extension](http://php.net/manual/en/book.mbstring.php) needed). 2.) For Unicode input, use the `u` [flag](http://php.net/manual/en/reference.pcre.pattern.modifiers.php), and you probably also want the `s` flag in your regex so that the dot matches newlines too: `'/(.{' . $n . '}.*?)\b/us'` – Jonny 5 Aug 31 '15 at 11:40
  • Thanks @Jonny, your comment really helped me. I'm new to handling multibyte characters in PHP. I posted my working code. – Seto Sep 01 '15 at 07:04
  • Answers go in the answer box below, not in the question. – Ignacio Vazquez-Abrams Sep 01 '15 at 07:10
  • I don't think "一人�..." is a particularly positive outcome. I wouldn't call this "it works". – deceze Sep 01 '15 at 07:26
  • @deceze OK, it doesn't work quite right. – Seto Sep 01 '15 at 08:26
  • possible duplicate of [Multibyte trim in PHP?](http://stackoverflow.com/questions/10066647/multibyte-trim-in-php) – greut Sep 01 '15 at 08:27
  • @Seto I would replace the tags in `$multibyte_string` with a space before processing. The word boundary `\b` can and will break HTML tags. [See here, with a few modifications](http://pastebin.com/kmMiHF5g). – Jonny 5 Sep 01 '15 at 17:10

1 Answer

Working code based on @Jonny's comment (thanks again):

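function neat_trim($str, $n, $delim='...')
{
    // Count characters rather than bytes when the string is UTF-8.
    $len = mb_detect_encoding($str) == "UTF-8" ? mb_strlen($str, "UTF-8") : strlen($str);
    if ($len > $n)
    {
        // u: treat pattern and subject as UTF-8; s: let the dot match newlines.
        // Only use $matches when the pattern actually matched.
        if (preg_match('/(.{' . $n . '}.*?)\b/us', $str, $matches))
        {
            return rtrim($matches[1]) . $delim;
        }
    }
    return $str;
}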
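
For comparison, here is a minimal sketch that also follows the suggestion from the comments to replace tags with a space before trimming, so the cut cannot land inside `<br>`. The name `neat_trim_mb`, the tag-stripping pattern, and the `mb_substr` fallback are my own assumptions for illustration, not part of the code above, and the sketch assumes the mbstring extension is loaded and the input is UTF-8.

```php
<?php
// Hypothetical variant of neat_trim(); the name and the extra steps are
// assumptions for illustration only.
function neat_trim_mb($str, $n, $delim = '...')
{
    // Replace tags with a space so the cut cannot land inside "<br>".
    $text = trim(preg_replace('/<[^>]*>/', ' ', $str));

    // Compare character count, not byte count, against the limit.
    if (mb_strlen($text, 'UTF-8') <= $n) {
        return $text;
    }

    // Take $n characters, then extend lazily to the next word boundary.
    if (preg_match('/^(.{' . (int) $n . '}.*?)\b/us', $text, $matches)) {
        return rtrim($matches[1]) . $delim;
    }

    // Fallback when the pattern does not match, for example when no word
    // boundary follows the first $n characters or the input is not valid UTF-8.
    return mb_substr($text, 0, $n, 'UTF-8') . $delim;
}

$s = "Portion of Chicken for 1 person<br>一人份鸡肉";
echo strlen($s), "\n";              // 50 (bytes)
echo mb_strlen($s, 'UTF-8'), "\n";  // 40 (characters)
echo neat_trim_mb($s, 20), "\n";    // Portion of Chicken for...
```

With a limit of 42 the tag-free text is only 37 characters long, so the sketch returns it unchanged. That matches the 40-versus-50 point from the comments: the original string only looked too long because `strlen` counts bytes.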
Seto