11

Should I use htmlentities with strip_tags?

I am currently using strip_tags when adding to the database, and I am thinking about removing htmlentities on output; I want to avoid unnecessary processing while generating HTML on the server.

Is it safe to use only strip_tags without allowed tags?

stealthyninja
Somebody
  • @Beck I deleted my answer because on closer look, I'm not sure whether `strip_tags()` without allowed tags is actually unsafe. I started a specific question about that [here](http://stackoverflow.com/questions/5788527/is-strip-tags-vulnerable-to-scripting-attacks). – Pekka Apr 26 '11 at 09:37
  • Ok mate :) I think I'll have to find a proper caching tool for generated pages while you are deciding whether it's safe or not. :P – Somebody Apr 26 '11 at 09:50
  • Yeah! :) For reference, the HTML purifier link from my question: http://htmlpurifier.org/ but why do you want to use `strip_tags()` over `htmlspecialchars()` in the first place, any special reason? – Pekka Apr 26 '11 at 09:51
  • strip_tags removes only complete tags as i noticed. So the rest is left untouched. And i wanted to keep data the same as user typed it in the first place, but safe for the rest end users and my website templates. – Somebody Apr 26 '11 at 09:55
  • Plus if it's surely safe, then htmlentities is not required at the final output and it's used pretty widely in my templates. Without caching it's eating server resources. – Somebody Apr 26 '11 at 10:01
  • 1
    @Beck but the amount of server resources consumed by doing a `htmlentities()` on output is minuscule. I can't imagine it making anything noticeably slower. `htmlentities()` would keep things *exactly* as the user typed them in, including `<` and `>`, would that not be the vastly superior approach? – Pekka Apr 26 '11 at 10:06
  • The thing is that I actually didn't use htmlentities earlier, because I was really lame at that time :D And now I need to decide whether I really need it or not, because I'm not sure if it's safe. – Somebody Apr 26 '11 at 10:14
  • @Beck I'd say use it - from what you say, it will probably serve your users better than strip_tags(). – Pekka Apr 26 '11 at 10:17
  • It seems the only unwanted feature strip_tags has is that it really strips some part of the text after , < test4 awefawe> , < test6 awefawef > – Somebody Apr 26 '11 at 10:21
  • Ok then i'll have to overview all my templates :) – Somebody Apr 26 '11 at 10:21

4 Answers

15

First: Apply each kind of escaping only at the point where it is needed. When you insert something into a database, escape it only for the database, i.e. apply mysql_real_escape_string (or PDO->quote, or whatever your database layer provides). Don't apply any escaping for the output yet: no strip_tags or similar. You may want to use the data stored in the database someplace else, where HTML escaping isn't necessary and would only make the text ugly.

Second: You should not use strip_tags. It removes the tags altogether, so the user doesn't get the same output as they typed in. Instead, use htmlspecialchars. It shows the user the same text they entered, but makes it harmless.
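As a rough illustration of the two points above, here is a minimal sketch. The variable `$storedComment` and the helper functions `renderHtml` and `renderPlainText` are hypothetical names, and the string stands in for a value read back from the database, where it was stored raw (escaped only for the database layer). The idea is that HTML escaping happens only in the HTML output path:

```php
<?php
// Hypothetical value as it would come back from the database:
// stored exactly as the user typed it, with no HTML escaping applied.
$storedComment = 'Tom & Jerry say: use <b> for bold';

// HTML context: escape at output time so the browser shows the text literally.
function renderHtml(string $text): string
{
    return '<p>' . htmlspecialchars($text, ENT_QUOTES, 'UTF-8') . '</p>';
}

// Plain-text context (e.g. a text email): no HTML escaping is needed here,
// which is why the stored copy should stay unescaped.
function renderPlainText(string $text): string
{
    return 'Comment: ' . $text . "\n";
}

echo renderHtml($storedComment), PHP_EOL;
echo renderPlainText($storedComment);
```

The HTML version prints the entities (`&amp;`, `&lt;b&gt;`), which the browser renders as the original characters, while the plain-text version keeps the string untouched.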

NikiC
11

strip_tags will remove all HTML tags:

"<b>foo</b><i>bar</i>" --> "foobar"

htmlentities will encode characters that are special in HTML:

"a & b" --> "a &amp; b"
"<b>foo</b>" --> "&lt;b&gt;foo&lt;/b&gt;"

If you use htmlentities, then when you output the string to the browser, the user should see the text as they entered it, not as HTML:

echo htmlentities("<b>foo</b>");

Visually results in: <b>foo</b>

echo strip_tags("<b>foo</b>");

Results in: foo
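To connect this comparison to the question actually asked (is strip_tags alone enough for output?), here is a small sketch under one assumption: the value is echoed inside an HTML attribute. The input string is a made-up example that contains no tags at all, so strip_tags has nothing to remove, while htmlentities with `ENT_QUOTES` encodes the quotes:

```php
<?php
// Hypothetical user input: no tags, only double quotes.
$input = '" onmouseover="alert(1)';

// strip_tags only removes tags, so this string passes through unchanged.
$stripped = strip_tags($input);

// htmlentities with ENT_QUOTES encodes the double quotes as &quot;.
$escaped = htmlentities($input, ENT_QUOTES, 'UTF-8');

// The first line lets the input close the attribute and add its own;
// the second keeps the whole input inside the value attribute.
echo '<input value="' . $stripped . '">', PHP_EOL;
echo '<input value="' . $escaped . '">', PHP_EOL;
```

So even when no tags survive, output that only went through strip_tags may still need escaping, depending on where in the page it is placed.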

nickf
  • Mate i know what they are doing. I'm asking is strip_tags enough to output data without using htmlentities? – Somebody Apr 26 '11 at 09:27
  • Guess i must start to think about caching generated pages. Can someone suggest good caching tool for such task? :) – Somebody Apr 26 '11 at 09:36
  • @Beck: well, what's your desired result? Do you want there to be no HTML? Are you only concerned about SQL injection? – nickf Apr 26 '11 at 09:51
3

I wouldn't use htmlentities as this will allow you to insert the string, as is, into the database. This is no good for account details or forums.

Use mysql_real_escape_string for inserting data into the database, and strip_tags for receiving data from the database and echoing out to the screen.

SherylHohman
matty
0

Try this one and see the differences:

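 <?php

  // Use the first command-line argument as input, or a placeholder if none is given.
  $d = isset($argv[1]) ? $argv[1] : "empty argv[1]" . PHP_EOL;
  // Encode first, then strip: the tags are already entities, so nothing is stripped.
  echo strip_tags(htmlentities($d)) . PHP_EOL;
  // Strip first, then encode: the tags are removed before anything is encoded.
  echo htmlentities(strip_tags($d)) . PHP_EOL;

 ?>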

Open up cmd or your terminal and type something like the following:

  php your_script.php "<br>foo</br>"

This should give you what you want, and safely!

hafidh