6

This has been asked before but I need 100% clarity on this issue as it's very important for me to get it right.

The situation: A message system on a website. The user enters a message into a text box, submits the form, and the message is inserted into the database. This data can then be read back from the database and displayed within <span> tags to another user.

What security procedures do I need to take to prevent this data from being malicious? I already use mysql_real_escape_string to stop any injection, and strip_tags seems useful, but I have heard lots of other names mentioned. What do I need to use to protect this data, considering it is only displayed in <span> tags?

Thank you.

James
  • possible duplicate of [PHP: How to totally prevent XSS attacks?](http://stackoverflow.com/questions/5934063/php-how-to-totally-prevent-xss-attacks) – Pekka Sep 02 '11 at 16:14
  • Just to clarify, I'm marking this as a duplicate not because it's a bad question (it's not!) - everyone should be this diligent – Pekka Sep 02 '11 at 16:15
  • The bottom line here is to always sanitize any user provided data no matter how insignificant. – Jared Sep 02 '11 at 16:18
  • I understand that the question has been asked many times but I didn't feel 100% sure about the topic even after reading 100+ questions on it. Having my own question and specifically stating what I wanted to do/achieve is what I wanted to feel satisfied. Thank you for doing the right thing though :) – James Sep 02 '11 at 16:19
  • Just understand how it works. If you allow them to put any type of html tag in something that outputs to the page, then they can slip javascript into your page and cause havoc. htmlspecialchars() and strip_tags() will both work to remove html from the text, and prevent xss. – dqhendricks Sep 02 '11 at 16:27
  • @dqhendricks - I understand the principle of XSS but I don't fully understand the tools at my disposal and how to use them in the most efficient way. This seems to be a lot clearer now that people are responding and I can ask very precise things :) – James Sep 02 '11 at 16:29

3 Answers

3

Use htmlspecialchars when outputting on an HTML page. It will display the data exactly as the user entered it (so users can write something like <3 in their messages without the rest of the message being stripped).
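As a minimal sketch of the above: the helper name `escape_html` and the sample message are made up for illustration, and the page is assumed to be served as UTF-8.

```php
<?php
// Hypothetical helper (not from the answer): wraps htmlspecialchars()
// with explicit flags and an explicit charset.
function escape_html(string $text): string
{
    // ENT_QUOTES encodes both single and double quotes;
    // 'UTF-8' is an assumption about the page encoding.
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// Sample message as it might come back from the database.
$message = 'I <3 PHP <script>alert(1)</script>';

// The markup characters are encoded, so the browser shows them as text.
echo '<span>' . escape_html($message) . '</span>';
// prints: <span>I &lt;3 PHP &lt;script&gt;alert(1)&lt;/script&gt;</span>
```

The escaping is applied at output time, so the message can stay in the database exactly as the user typed it.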

knittl
  • and this alone will protect against all types of XSS considering I am only placing the content inside `<span>` tags and not inside html objects? I'm sure I read elsewhere that people said this was old and useless? – James Sep 02 '11 at 16:20
  • @James: `htmlentities` is a better alternative if available on your system, but they will both work just fine. – Matthew Scharley Sep 02 '11 at 16:24
  • James, this is totally safe when you output text inside `<span>`. It's not old and useless, it's recommended. [`htmlspecialchars`](http://php.net/htmlspecialchars) for output and [prepared statements](http://php.net/mysqli_prepare)/[`mysqli_real_escape_string`](http://php.net/mysqli_real_escape_string) for database queries – knittl Sep 02 '11 at 16:26
  • Matthew, htmlentities will encode too many chars for my liking, but that's probably just personal taste – knittl Sep 02 '11 at 16:27
3

The misconception is that you want to escape the input, which is wrong. You have to escape the output (and the database is also an output).

This means that when the form is submitted, you use mysql_real_escape_string() to send (output) the data to the database, and you use htmlspecialchars() to output the content on the screen. The same principle applies to regular expressions, where you'd use preg_quote(), and so on.

No matter where data is coming from, you have to escape it in the context of where you are sending it to.

So for preventing XSS attacks, you must use htmlspecialchars() / htmlentities(). mysql_real_escape_string has nothing to do with XSS (but you still have to use it when you are sending data to the database).
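To illustrate escaping per destination, here is a rough sketch with a made-up input string. It covers only the HTML and regular-expression contexts; the database context is left out because it needs a live connection.

```php
<?php
// Made-up sample input containing both markup and a regex metacharacter.
$userInput = 'a.b <b>bold</b>';

// HTML context: encode markup characters before echoing into the page.
// 'UTF-8' is an assumption about the page encoding.
$forHtml = htmlspecialchars($userInput, ENT_QUOTES, 'UTF-8');

// Regular-expression context: quote metacharacters so the input is
// matched literally. The second argument names the pattern delimiter.
$pattern = '/' . preg_quote($userInput, '/') . '/';

// The quoted pattern matches the literal text only; the "." no longer
// acts as a wildcard.
var_dump(preg_match($pattern, 'a.b <b>bold</b>')); // int(1)
var_dump(preg_match($pattern, 'axb <b>bold</b>')); // int(0)
```

The same raw value is escaped differently for each destination, which is why escaping once at input time cannot cover every case.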

netcoder
Maxim Krizhanovsky
  • I knew the mysql_real_escape_string was a little irrelevant to this topic but I mentioned it to avoid anyone saying "you have to do that too!" :) When you put a slash between the htmlspecialchars and the htmlentities, are you saying use both or use one or the other? which is better? also, what's the preg_quote() for? I'm not 100% sure exactly what you mean when you say when you're using it within a regular expression (even though I googled the definition! :D) – James Sep 02 '11 at 16:25
  • James, he mentions `preg_quote` in a sense that »use the right tool for the right job« = »use the proper and recommended function for escaping input for certain domains/contexts (htmlspecialchars for html, mysqli_real_escape_string for mysql dbs, prepared statements for practically any db, preg_quote for perl-compatible regular expressions, escapeshellarg/escapeshellcmd for shell commands, etc.)« – knittl Sep 02 '11 at 16:29
  • +1 for the distinction between input and output. If it's not supposed to be HTML, don't validate/filter it as such (using `strip_tags()`). – Tim Lytle Sep 02 '11 at 17:00
  • Also, htmlspecialchars only works between tags. It does not work for, say, escaping data used in a JavaScript event handler like `onmouseover="confirm('Do you really want to delete _user data_')"`. In this case, even if the _user data_ is HTML-encoded, an attacker can break out of the JavaScript string and run arbitrary JavaScript. – Erlend Sep 20 '11 at 12:50
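The event-handler problem from the last comment can be sketched as follows. The payload and the `showMessage` function are hypothetical, `html_entity_decode()` stands in for the decoding a browser applies to attribute values, and the `json_encode()` step is one possible mitigation assumed here, not something proposed in the thread.

```php
<?php
// Hypothetical attacker-supplied value.
$userData = "'); alert(document.cookie); //";

// HTML-encoding alone: the quote becomes &#039; in the page source...
$encoded = htmlspecialchars($userData, ENT_QUOTES, 'UTF-8');

// ...but the browser decodes attribute values before running them as
// JavaScript. html_entity_decode() simulates that step here.
$seenByJs = html_entity_decode($encoded, ENT_QUOTES, 'UTF-8');
var_dump($seenByJs === $userData); // bool(true): the quote is back

// One possible approach (an assumption): build a JavaScript string
// literal first, then HTML-encode the whole attribute value.
$jsLiteral = json_encode(
    $userData,
    JSON_HEX_TAG | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_HEX_AMP
);
// showMessage is a hypothetical JavaScript function on the page.
$attribute = htmlspecialchars('showMessage(' . $jsLiteral . ')', ENT_QUOTES, 'UTF-8');

echo '<span onmouseover="' . $attribute . '">hover</span>';
```

Keeping user data out of event handlers altogether, and only ever placing it between tags as in the question, avoids this context entirely.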
0

Please check the OWASP XSS Prevention Cheat Sheet. It explains how to avoid XSS in different contexts. htmlentities should do the job when the data is placed between tags.

Erlend