3

Okay, so there are all these different string-escaping functions, such as htmlentities(), mysql_real_escape_string(), and addslashes().

But which should I use in what situation?
Resources and opinions please :)

Joshwaa
  • read answers carefully. most people just have no idea of what they're talking about. most upvoted answer is full of factual errors. – Your Common Sense Apr 24 '11 at 14:51
  • 1
    Not related to string escaping, but to preventing SQL injections: using [parametrized database queries](http://stackoverflow.com/questions/60174/best-way-to-stop-sql-injection-in-php) is almost always better and more safe than escaping. – Marcel Korpel Apr 24 '11 at 15:19
  • 2
    @Joshwaa: Summary of output conversion functions from comments: htmlspecialchars() is preferred. htmlentities() is fine *most* of the time, but is not necessary and in fact can cause problems in XML documents, can be abused as a fix for possible encoding issues, and very minor issue: adds page weight due to extra characters. Go with `htmlspecialchars()`. The important thing is that you use this on your *output*, not before storing in a database (creating false sense of security, for one thing). If I've missed something please point it out, there's a lot of useful comments here that are buried. – Wesley Murch Apr 24 '11 at 16:28
  • @Marcel Korpel: You can use the ENT_QUOTES flag to encode quotes with htmlentities(). OK, I'm done now, really! :) – Wesley Murch Apr 24 '11 at 17:00
  • 1
    @Marcel it does. double quotes by default and single with optional parameter – Your Common Sense Apr 24 '11 at 17:01
  • Whoops, both of you are right. Deleted remark – Marcel Korpel Apr 24 '11 at 17:12

4 Answers

9
  • addslashes() / stripslashes() go back to a rather bad idea called 'Magic Quotes', which has since been deprecated. Magic Quotes automatically escaped special characters in incoming data, and you could then use addslashes() and stripslashes() to add or remove the slashes by hand. One of the problems was that you were never quite sure whether the data currently had slashes or not, and thus you ended up putting unescaped data into SQL, or had extra slashes on your web page.
  • htmlentities() is often used to display HTML source on a page. If you write <b>Something</b> to an HTML page, you will just see Something (i.e. the original text in bold); you won't see the bold tags around it. Using htmlentities('<b>Something</b>') converts the code to &lt;b&gt;Something&lt;/b&gt;, so in the browser you see the angle brackets.
  • mysql_real_escape_string() is useful for defending against MySQL injection attacks - it escapes unsafe characters in strings. It does not escape anything in other data types, and so those need to be dealt with separately. It also does not encode % and _, which are used as wildcards in some queries.
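
The HTML case above can be sketched as follows. This is a minimal illustration, not part of the original answer: the variable names are made up, and it uses htmlspecialchars() alongside htmlentities() only to show where the two differ. It assumes the source file and the page are both UTF-8.

```php
<?php
// Untrusted input that contains markup (hypothetical sample value).
$input = '<b>Something</b>';

// Escape at output time so the browser shows the tags literally.
// ENT_QUOTES also encodes single quotes; the charset is stated explicitly.
$safe = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');
echo $safe, PHP_EOL; // &lt;b&gt;Something&lt;/b&gt;

// The two functions differ only for characters that have named entities.
$word       = 'café';
$special    = htmlspecialchars($word, ENT_QUOTES, 'UTF-8'); // left unchanged
$entityForm = htmlentities($word, ENT_QUOTES, 'UTF-8');     // é becomes &eacute;
echo $special, ' / ', $entityForm, PHP_EOL;
```

Both calls neutralise the markup characters; htmlentities() additionally rewrites characters such as é, which is unnecessary when the page is served in a suitable character set.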

In summary:

  • If you're encoding to write to a HTML page, use htmlentities()
  • If you're encoding a string to write to a database, use mysql_real_escape_string()
  • Never use addslashes()
Dan Blows
  • 4
    you are wrong at mysql_real_escape_string() description – Your Common Sense Apr 24 '11 at 14:11
  • 1
    In what sense? I know it's a simplification of the documentation, but the exact description is: "Escapes special characters in the unescaped_string, taking into account the current character set of the connection so that it is safe to place it in a mysql_query()." – Dan Blows Apr 24 '11 at 14:14
  • 3
    Two people think Blowski's mysql_real_escape_string() description is wrong, zero have explained why. That's no good... – Owen Apr 24 '11 at 14:20
  • 3
    reread manual page again – Your Common Sense Apr 24 '11 at 14:20
  • 1
    omg, a zillion people cannot imagine a simple scenario like `$id = ";drop table users"; $id = mysql_real_escape_string($id); $sql = "SELECT * FROM table WHERE id=$id"` – Your Common Sense Apr 24 '11 at 14:23
  • @Col.Shrapnel - are you suggesting that as developers we should use one and only one means of preventing SQL injection? In the case of your example, I would expect to see `if(is_numeric($_POST['id'])) { $id = $_POST['id']; }` before the query. – Dan Blows Apr 24 '11 at 14:26
  • 2
    o, rly? add this code to mine and see the result please – Your Common Sense Apr 24 '11 at 14:28
  • 3
    Ah, that's more convincing, should've posted that to begin with rather than being all cryptic. :p – Owen Apr 24 '11 at 14:28
  • 2
    @Col.Shrapnel - the question is about 'When do I use `mysql_real_escape_string()` vs `htmlentities()`?' not 'How do I prevent SQL injection?'. – Dan Blows Apr 24 '11 at 14:31
  • 8
    @thasc it will never help. because everyone here never asks for (nor giving out) understanding but a recipe. But a recipe without understanding will ALWAYS fail you. – Your Common Sense Apr 24 '11 at 14:31
  • LOL now you're trying to defend yourself with the literal question. That's silly. Anyway, watch your answer then. – Your Common Sense Apr 24 '11 at 14:34
  • 16
    Folks, let's turn the heat down here a few hundred degrees. (the flags, they cometh, because this looks like a boxing match more than a constructive conversation) – Tim Post Apr 24 '11 at 14:35
  • 3
    @Blowski - I'm quite frankly _astonished_ to see a complete absence of links to little bobby tables. I was beginning to think XKCD satisfied Godwin's law, with SQL injection being the catalyst. – Tim Post Apr 24 '11 at 14:41
  • 4
    @TimPost - Just for you... [obligatory XKCD reference](http://xkcd.com/327/) – Dan Blows Apr 24 '11 at 14:43
  • 1
    Please note that `mysql_real_escape_string()` does not escape `%` or `_` which are wildcards in certain queries, e.g. `WHERE x LIKE y`. – Core Xii Apr 24 '11 at 15:11
  • The addslashes() part still makes no sense and leads readers astray. addslashes() does not act automatically, so you can always tell whether your data is escaped or not. What is the point of this statement? The mysql_real_escape_string() part has improved but is still way unclear. Literally speaking, it escapes the same characters in all data types. And it is useful not to "defend" against whatever attacks but, in the first place, for just escaping special characters in *harmless* strings. It should be used all the time, on any string going into the query, by the same rule by which you add quotes. It's a syntax issue, not a security one. – Your Common Sense Apr 24 '11 at 18:33
  • @Joshwaa Are you happy that you understand the difference now? – Dan Blows Apr 28 '11 at 23:02
3

which should I use in what situation?

  • htmlentities(). never use it, but htmlspecialchars(). For printing untrusted user input into browser.
  • mysql_real_escape_string is mysql database specific function. here is a comprehensive guide I wrote exactly on topic where to use it and where not and what else you need to know on mysql database security
  • addslashes(). it depends. most of time you just don't need it at all
Your Common Sense
  • 1
    Why is this?: `htmlentities(). never use it, but htmlspecialchars()` We are talking about converting *output* right? – Wesley Murch Apr 24 '11 at 14:31
  • manual pages for both functions are available – Your Common Sense Apr 24 '11 at 14:35
  • 1
    It does not say "Never use htmlentities()" in the manual. – Wesley Murch Apr 24 '11 at 14:37
  • yeah. but one can read and make conclusions. – Your Common Sense Apr 24 '11 at 14:40
  • 2
    Maybe that could be important to OP since he seems to be confused. My conclusion is that they are different functions designed for different purposes, but to simply say "htmlentities() should not be used" with no explanation as to why I'm sure is not helpful to OP. – Wesley Murch Apr 24 '11 at 14:42
  • oh really? what are these purposes? – Your Common Sense Apr 24 '11 at 14:43
  • From the manual: `[htmlentities()] is identical to htmlspecialchars() in all ways, except with htmlentities(), all characters which have HTML character entity equivalents are translated into these entities. ` I fail to see what the harm is, please don't take it personally, I'm interested in hearing your reasoning. – Wesley Murch Apr 24 '11 at 14:46
  • 1
    @Madmartigan: it's simply not needed: just use an appropriate character set (in most cases: UTF-8) and ensure the outputted source uses that; `htmlentities` only adds extra bandwidth and doesn't make your page more safe. – Marcel Korpel Apr 24 '11 at 15:11
  • @Marcel Korpe: I think it's fair to say that "adds extra bandwidth" is a pretty far stretch to validate the advice "htmlentities(). never use it". Sure it may not be necessary, but when the context is **security**, as it is in this question, I think this is highly irrelevant, and not clearly explained. "Doesn't make your page more safe" is not the same as "Makes your page less safe [than htmlentities()]" – Wesley Murch Apr 24 '11 at 15:19
  • (@Col., [talk of the town again](http://meta.stackexchange.com/questions/88660/what-does-one-do-with-angry-users)... Cheers!) – Arjan Apr 24 '11 at 15:20
  • 3
    @Madmartigan but as @Marcel says, there is no point in using `htmlentities()`. It doesn't make anything less safe, but it has no advantages, either, and it breaks things when in an XML context (XML doesn't know HTML's entities). `htmlspecialchars()` is the right thing to use. Plus, htmlentities() is often wrongly recommended as a cure against encoding issues instead of fixing the actual problem, which is why I'm in favour of discouraging it. – Pekka Apr 24 '11 at 15:30
  • @Mad: though I generally agree with you about ‘never use it’ (perhaps “it's never needed” would be a better formulation), I think one should think outside of the context, too; e.g., if someone asks “What does `mysql_real_escape_string` do?”, one should really explain that parametrized database queries are more safe and (often) easier to implement. – Marcel Korpel Apr 24 '11 at 15:33
  • @Pekka: I still fail to see the harm in using it, even given your examples. If it is a band-aid for encoding issues, it still doesn't contribute to the problem. I guess I just expected a straight answer from anyone who feels so strongly about it, and so far all I've heard is "more bandwidth", "not needed", and "people use as a bad fix for encoding". Are those really the only reasons to say **never use it**? – Wesley Murch Apr 24 '11 at 15:36
  • @Mad as @Marcel says, the better way to put it is probably "it is almost never needed these days, because working with HTML entities has become discouraged". – Pekka Apr 24 '11 at 15:40
  • 1
    @Pekka, @Marcel Korpe: Thanks for the input. I think we're on the same page here, but just swept up by all the drama in this post. You guys had me a little nervous at first by saying that htmlentities() VS htmlspecialchars() is security related, because I use htmlentities() quite a bit. I guess it's an old habit, and I can reconsider its usefulness. – Wesley Murch Apr 24 '11 at 15:43
  • @Mad: Indeed. I once learned to use `strip_tags` (from a book I now consider full of security holes), but that's a ridiculously pompous way to achieve security and in the worst case a source for confusion to the user. – Marcel Korpel Apr 24 '11 at 16:50
  • (@Mad, just in case you think you can notify multiple users: see also [How do comment @replies work](http://meta.stackexchange.com/questions/43019/how-do-comment-replies-work/43020#43020).) – Arjan Apr 24 '11 at 22:16
1

when you insert data to a mysql database use this:

mysql_real_escape_string()

when you're going to display content a user gave you:

htmlentities()

If your database doesn't have its own escaping function in PHP, you could use addslashes(), but it's not recommended when something specific and better is available (mysql_real_escape_string()).

see this for more info:

Htmlentities vs addslashes vs mysqli_real_escape_string

P.S. You should use mysqli_real_escape_string(), not mysql_real_escape_string().

EDIT:

to really prevent attacks, this is good reading material : http://www.php.net/manual/en/security.database.sql-injection.php...

You should also look into prepared statements: http://www.php.net/manual/en/mysqli.prepare.php
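
A prepared statement can be sketched like this. It is only an illustration under stated assumptions: it uses PDO with an in-memory SQLite database as a stand-in for MySQL so that it runs without a server (this requires the pdo_sqlite extension), and the `users` table, its rows, and the variable names are hypothetical. With MySQL the same pattern applies through mysqli or PDO's MySQL driver.

```php
<?php
// Assumption: pdo_sqlite is available; SQLite stands in for MySQL here.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical table and row, for illustration only.
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');
$pdo->exec("INSERT INTO users (id, name) VALUES (1, 'O''Reilly')");

// Hostile input that would break an unquoted "WHERE id=$id" query.
$id = '1; DROP TABLE users';

// The value travels separately from the SQL text, so it cannot alter the query.
$stmt = $pdo->prepare('SELECT name FROM users WHERE id = ?');
$stmt->execute([$id]);
$rows = $stmt->fetchAll(PDO::FETCH_COLUMN);

// A string containing a quote needs no manual escaping either.
$stmt = $pdo->prepare('SELECT id FROM users WHERE name = ?');
$stmt->execute(["O'Reilly"]);
$found = (int) $stmt->fetchColumn();

// The table is still there after the hostile lookup.
$count = (int) $pdo->query('SELECT COUNT(*) FROM users')->fetchColumn();
echo "found id: $found, rows in table: $count", PHP_EOL;
```

Because the parameters are bound rather than concatenated, the quoting question never arises for the values, whether they are meant as numbers or as strings.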

a lot of info is also available here on stack overflow.

fingerman
0

It's all a variation on the same theme:

$bar = "O'Reilly";
"foo = '$bar'";  // foo = 'O'Reilly' -> invalid syntax

Blindly concatenating strings together may lead to syntax violations if the strings are supposed to follow a special syntax. At best this is an annoyance, at worst a security problem. Escaping values prevents these problems. Generic example:

"foo = '" . escape($bar) . "'";  // foo = 'O\'Reilly'

All the different functions are escaping values properly for different syntaxes:

  • htmlentities for escaping output for HTML.
  • mysql_real_escape_string for escaping values for SQL queries.
  • addslashes… not really good for anything, don't use.
  • json_encode for encoding/escaping/converting values for JavaScript format.
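
As a rough sketch of the list above, here is the same value escaped for two of those target syntaxes. The variable names are illustrative, htmlspecialchars() is used for the HTML case, and the JSON_HEX_* flags are an added precaution for embedding the value inside an HTML page rather than something the answer prescribes. The SQL case is left out because mysql_real_escape_string() needs a live database connection.

```php
<?php
$bar = "O'Reilly";

// HTML context: encode the characters that are special in markup.
$html = htmlspecialchars($bar, ENT_QUOTES, 'UTF-8');

// JavaScript context: json_encode() produces a quoted string literal.
// The JSON_HEX_* flags additionally hex-encode <, >, &, ' and ".
$js = json_encode($bar, JSON_HEX_TAG | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_HEX_AMP);

echo "<p>$html</p>", PHP_EOL;                     // <p>O&#039;Reilly</p>
echo "<script>var name = $js;</script>", PHP_EOL; // var name = "O\u0027Reilly";
```

Each function targets exactly one syntax, so the right choice depends on where the value is about to be written, not on where it came from.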

deceze