
I know the mb_ functions are for dealing with UTF-8 characters, but they still won't solve my problem.

So I have this string:

óóóóóóóóóóóóóóóóóóóóóóóóóóóóóóó

mb_substr ($oooo, 0,17, 'UTF-8');

óóóóóóóóóóóóóóóóó&oac

so the last character gets damaged.

John Smith
  • Please check (and post) which **bytes** your string contains. It appears that at least some of those "ó" are actually the result of the entity escape `&oacute;`, not the Unicode code point U+00F3 (in *any* encoding). –  Jan 22 '14 at 15:17
  • Hint: PHP does not render HTML. – Álvaro González Jan 22 '14 at 15:18
  • I'm not a PHP dev, I'd have to search for the right way myself. To at least check whether it's an entity escape, you can check the source of the page containing your string. Of course, you might also go straight to the source - where does the string come from? –  Jan 22 '14 at 15:28
  • From the database, but it might be encoded some other way... – John Smith Jan 22 '14 at 15:41

1 Answer


Your string is not actually:

$str = 'óóóóóóóóóóóóóóóóóóóóóóóóóóóóóóó';

Your string is actually:

$str = '&oacute;&oacute;&oacute;&oacute;&oacute;&oacute;&oacute;&oacute;...';

When looked at in the browser, the browser will of course render "ó", but that's of no interest to PHP.
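To see the difference from PHP's side, it can help to look at the bytes and character counts directly. The following is a minimal sketch, assuming the mbstring extension is loaded; the variable names and the three-character sample strings are made up for illustration and are not the asker's data.

```php
<?php
// Hypothetical sample: the text stored as HTML entities (plain ASCII).
$entities = str_repeat('&oacute;', 3);
// The same text as real UTF-8: U+00F3 is the byte pair C3 B3.
$utf8 = str_repeat("\xC3\xB3", 3);

// Dump the first 8 bytes: these are the ASCII bytes of "&oacute;".
echo bin2hex(substr($entities, 0, 8)), "\n"; // 266f61637574653b

// Each entity counts as 8 characters, each real "ó" as 1.
echo mb_strlen($entities, 'UTF-8'), "\n"; // 24
echo mb_strlen($utf8, 'UTF-8'), "\n";     // 3

// Cutting by character count lands in the middle of an entity.
echo mb_substr($entities, 0, 17, 'UTF-8'), "\n"; // &oacute;&oacute;&
```

If `bin2hex()` on your own string shows `26 6f 61 ...` rather than `c3 b3`, the data holds entities, and `mb_substr()` is counting the letters of each entity.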

The best solution is to store your content as actual UTF-8 encoded characters ("óóóóóóóóóóóóóó"), then use your code as is. To make this work on your current string, you need to decode the HTML entities first:

$str = '&oacute;&oacute;&oacute;&oacute;&oacute;&oacute;&oacute;&oacute;...';
$str = html_entity_decode($str, ENT_COMPAT, 'UTF-8');
echo mb_substr($str, 0, 17, 'UTF-8');

You'll then of course need to take care of the output encoding, since you're now outputting actual UTF-8 which the browser needs to understand. See UTF-8 all the way through.
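As a rough sketch of the output side, assuming the page is served as HTML and nothing has been sent to the browser yet, declaring the charset and escaping with `htmlspecialchars()` might look like this. The short sample string and the variable names are hypothetical.

```php
<?php
// Decode entities to real UTF-8 first, then truncate by character.
$str = html_entity_decode('&oacute;&oacute;&oacute;', ENT_COMPAT, 'UTF-8');
$short = mb_substr($str, 0, 2, 'UTF-8');

// Tell the browser the body is UTF-8; this only works before any output.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}

// htmlspecialchars() escapes only & < > " and ', so "ó" stays real UTF-8.
echo htmlspecialchars($short, ENT_QUOTES, 'UTF-8');
```

`htmlspecialchars()` is used here rather than `htmlentities()` because the latter would turn "ó" back into `&oacute;`.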

deceze