15

I have a situation where I am passing a string to a function. I want to convert &nbsp; to " " (a blank space) before passing it to the function. Does html_entity_decode do that?

If not, how can I do it?

I am aware of str_replace, but is there any other way?

Salman A
Abhishek Sanghvi

4 Answers

43

Quote from html_entity_decode() manual:

You might wonder why trim(html_entity_decode('&nbsp;')); doesn't reduce the string to an empty string, that's because the '&nbsp;' entity is not ASCII code 32 (which is stripped by trim()) but ASCII code 160 (0xa0) in the default ISO 8859-1 characterset.

You can use str_replace() to replace character 160 (0xA0) with a regular space:

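<?php
// Decode explicitly as ISO-8859-1 so &nbsp; becomes the single byte 0xA0.
// (Since PHP 5.4 the default charset is UTF-8, where it is 0xC2 0xA0.)
$a = html_entity_decode('>&nbsp;<', ENT_QUOTES, 'ISO-8859-1');
echo 'before ' . $a . PHP_EOL;
// Replace the non-breaking space with a regular space (ASCII 32).
$a = str_replace("\xA0", ' ', $a);
echo ' after ' . $a . PHP_EOL;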
Salman A
  • 22
    If you are working with UTF-8 encoded strings you should replace \xC2\xA0: $a = html_entity_decode('>&nbsp;<', ENT_QUOTES, 'UTF-8'); echo 'before ' . $a . PHP_EOL; $a = str_replace("\xC2\xA0", ' ', $a); echo ' after ' . $a . PHP_EOL; – chugadie Oct 11 '13 at 11:52
  • I've been struggling a lot with the data I retrieve from a `contenteditable` element; all `rtrim` and `preg_replace` attempts failed. I also tried filtering it with JavaScript before sending it with `$.ajax()`, which failed too. So now I do `str_replace("&nbsp;", ' ', $value)` and then `preg_replace('/\s+$/','',$value)`. It works, though it is not too elegant. If someone has suggestions, please tell me. – Matt Oct 20 '15 at 10:01
5

html_entity_decode does convert &nbsp; to a space, just not a "simple" one (ASCII 32), but a non-breaking space (ASCII 160) (as this is the definition of &nbsp;).

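If you need to convert to ASCII 32, you still need a str_replace(), or, depending on your situation, a preg_replace('/\s+/u', ' ', $string) to convert all kinds of whitespace to simple spaces (on UTF-8 input, the u modifier lets \s match Unicode whitespace such as the non-breaking space).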
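A minimal sketch of the regex approach, assuming the input is UTF-8. The helper name normalize_spaces() is made up for this example and is not part of PHP. The character class lists U+00A0 explicitly, so the result does not depend on how \s is interpreted.

```php
<?php
// Hypothetical helper: decode entities, then collapse every whitespace run
// (including non-breaking spaces) into a single ASCII space.
// Assumes $html is UTF-8 encoded.
function normalize_spaces(string $html): string
{
    // Decode &nbsp; and other entities into UTF-8 characters.
    $text = html_entity_decode($html, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // The u modifier makes the pattern and subject UTF-8 aware;
    // \x{00A0} names the non-breaking space explicitly.
    $result = preg_replace('/[\s\x{00A0}]+/u', ' ', $text);

    // preg_replace() returns null on error (e.g. invalid UTF-8);
    // fall back to the decoded text in that case.
    return $result ?? $text;
}

var_dump(normalize_spaces("foo&nbsp;&nbsp; bar") === 'foo bar'); // bool(true)
```

If runs of spaces should be preserved one-for-one, dropping the + quantifier from the pattern would replace each whitespace character individually instead of collapsing them.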

Aurimas
5

YES

See PHP manual http://php.net/manual/en/function.html-entity-decode.php.

Carefully read the Notes; maybe that's the issue you are facing:

You might wonder why trim(html_entity_decode('&nbsp;')); doesn't reduce the string to an empty string, that's because the '&nbsp;' entity is not ASCII code 32 (which is stripped by trim()) but ASCII code 160 (0xa0) in the default ISO 8859-1 characterset.
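The effect described in that note can be reproduced with a short sketch, here assuming UTF-8 rather than ISO 8859-1. The helper name trim_nbsp() is hypothetical, introduced only for illustration.

```php
<?php
// Decoding with an explicit UTF-8 charset yields the two bytes 0xC2 0xA0.
$decoded = html_entity_decode('&nbsp;', ENT_QUOTES, 'UTF-8');

var_dump(bin2hex($decoded));      // string(4) "c2a0"
var_dump(trim($decoded) === '');  // bool(false): trim() leaves the NBSP alone

// Hypothetical helper: strip ordinary whitespace and non-breaking spaces
// from both ends of a UTF-8 string.
function trim_nbsp(string $s): string
{
    $result = preg_replace('/^[\s\x{00A0}]+|[\s\x{00A0}]+$/u', '', $s);
    return $result ?? $s; // null means the input was not valid UTF-8
}

var_dump(trim_nbsp($decoded) === ''); // bool(true)
```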

Konrad Dzwinel
Frederic Bazin
2

Not sure if it is a viable solution for most cases, but I used trim(strip_tags(html_entity_decode(htmlspecialchars_decode($html), ENT_QUOTES, 'UTF-8'))); in my most recent application. Initially, adding htmlspecialchars_decode() was the only thing that would actually strip the &nbsp; entities.

Tyler Christian