
I need to prevent XSS attacks when echoing variables in PHP.

For example, assume I have two values from my database: a username and an email address.

$username
$email

I want to prevent XSS attacks when I use these variables in my HTML.

I tried something like this using htmlspecialchars():

<h5>Editing User <?php echo '<strong>'.htmlspecialchars($username, ENT_QUOTES, 'UTF-8').'</strong> (<strong>'.htmlspecialchars($email, ENT_QUOTES, 'UTF-8').'</strong>)'; ?></h5>

This is the HTML rendered by the PHP above:

<h5>Editing User <strong>test_user</strong> (<strong>example@gmail.com</strong>)</h5>

Can somebody tell me whether this is the correct approach? If not, what is the correct way?

I hope somebody can help me out. Thank you.

SeinopSys
TNK
  • If XSS got into your database and now you need to prevent your script from echoing it, then perhaps you should be focusing on *preventing* the malicious data from getting inside your tables in the first place. – SeinopSys Jan 17 '15 at 15:59
  • @DJDavid98, I am using prepared statements when inserting data into my tables. – TNK Jan 17 '15 at 16:05
  • You could still test the e-mail and other values with a regular expression, e.g. `preg_match('/^[\w.+-]+@[\w-]+(?:\.[\w-]+)+$/', $email)`, and not allow the insertion of the data on failure. – SeinopSys Jan 17 '15 at 16:09
  • @DJDavid98, Thanks for your suggestion. How about `preg_replace()`, like this: `$username = preg_replace("/[^a-zA-Z0-9_\-]+/", "", $username);`? – TNK Jan 17 '15 at 16:39
  • As far as security alone is concerned, I'd disagree with DJDavid98: This is the right place to prevent XSS attacks. Data is only malicious if it is actually run as executable code. However, it might be a good idea to make sure the data in the database is all valid email addresses anyway. I don't know whether this alone would be enough to prevent XSS. But I'd probably escape it here too anyway. Even if it can't be made executable, there may conceivably be characters that wouldn't display correctly. – David Knipe Jan 18 '15 at 12:07

2 Answers

1

First of all, the correct way to escape output is htmlentities, not htmlspecialchars.
Escape ALL output that comes from variables, the database, or user input.
For values echoed into the HTML body, this is pretty much all you have to do to prevent XSS attacks.
You may also consider using strip_tags where it's appropriate.

Here you go:

<h5>
    Editing User <b><?=htmlentities($username)?></b> 
    (<b><?=htmlentities($email)?></b>)
</h5>
Oleg Dubas
  • Thanks for your answer. But it would be better if you could show me how to write secure code using my example above. – TNK Jan 17 '15 at 15:49
  • @TNK, added the example – Oleg Dubas Jan 17 '15 at 15:57
  • Thanks again. Does this prevent XSS attacks? @acontell mentioned that `htmlspecialchars`, as I am using it, is fine. I am still confused about this issue. – TNK Jan 17 '15 at 16:08
  • @TNK `htmlentities` substitutes more characters than `htmlspecialchars`. This is unnecessary, makes the PHP script less efficient and the resulting HTML code less readable. Please, check this [question](http://stackoverflow.com/questions/46483/htmlentities-vs-htmlspecialchars) – acontell Jan 17 '15 at 16:53
  • It says htmlspecialchars() is much more straightforward, doesn't it? – TNK Jan 17 '15 at 16:59
  • You should ALWAYS use `htmlentities`, not `htmlspecialchars`, for the output. `htmlspecialchars` will NOT protect you from some exploits and types of attacks. The "less efficient" point acontell mentioned is bullsh-t. It will take maybe 1/100,000,000 of a second longer, so what? For example, check this question: http://stackoverflow.com/questions/3623236/htmlspecialchars-vs-htmlentities-when-concerned-with-xss – Oleg Dubas Jan 17 '15 at 23:32
1

HTML entity encoding is okay for untrusted data that you put in the body of the HTML document, such as inside a <div> tag. It even sort of works for untrusted data that goes into attributes, particularly if you're religious about using quotes around your attributes. But HTML entity encoding doesn't work if you're putting untrusted data inside a <script> tag anywhere, or an event handler attribute like onmouseover, or inside CSS, or in a URL. So even if you use an HTML entity encoding method everywhere, you are still most likely vulnerable to XSS. You MUST use the escape syntax for the part of the HTML document you're putting untrusted data into. That's what the rules below are all about.

More info in OWASP.
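
To make the point about contexts concrete, here is a minimal sketch of escaping the same value for three different places in a page. The variable names and the /profile URL are assumptions made up for illustration, and this only covers three of the contexts OWASP lists:

```php
<?php
// Hypothetical untrusted value; all names here are illustrative.
$name = "Jim\"><script>alert('hi')</script>";

// HTML body or a quoted attribute: encode HTML-significant characters.
$forHtml = htmlspecialchars($name, ENT_QUOTES, 'UTF-8');

// Inside a <script> block: emit a JavaScript string literal instead.
// The JSON_HEX_* flags replace <, >, &, ' and " with \uXXXX escapes.
$forJs = json_encode($name, JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT);

// URL query parameter: percent-encode the value, then HTML-escape the
// finished URL like any other attribute value. The path is made up.
$href = htmlspecialchars('/profile?name=' . rawurlencode($name), ENT_QUOTES, 'UTF-8');

echo '<p>', $forHtml, '</p>', "\n";
echo '<script>var userName = ', $forJs, ';</script>', "\n";
echo '<a href="', $href, '">profile</a>', "\n";
```

The same input needs a different encoder in each place, which is why escaping everything with one HTML function is not enough outside the body and quoted attributes.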

The correct way to use htmlspecialchars is something like this:

echo htmlspecialchars($string, ENT_QUOTES, 'UTF-8');

Also, keep in mind that a user could send a username like "Jim onclick=alert('hi')"

If you don't wrap the value attribute in quotes, you'd get something like:

<input type="text" name="username" value=Jim onclick=alert('hi')>

ALWAYS use quotes around attributes. Even if they aren't user-inputted, it's a good habit to get into.

<input type="text" name="username" value="<?php echo htmlspecialchars($_POST['username'], ENT_QUOTES, 'UTF-8'); ?>">
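
As a rough sketch of why the quotes matter even when the value is escaped, the snippet below builds both variants from the same hostile username. The variable names are assumptions for illustration only:

```php
<?php
// Hostile input taken from the example above.
$username = "Jim onclick=alert('hi')";
$safe = htmlspecialchars($username, ENT_QUOTES, 'UTF-8');

// Unquoted: the space ends the value, so onclick becomes its own attribute,
// and the browser decodes the escaped quotes inside it.
$unquoted = '<input type="text" name="username" value=' . $safe . '>';

// Quoted: the whole escaped string stays inside the value attribute.
$quoted = '<input type="text" name="username" value="' . $safe . '">';

echo $unquoted, "\n", $quoted, "\n";
```

Escaping does not remove the space, so only the quoted form keeps the input confined to the value.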

With these things in mind, you should be covered in most cases. However, if you want to be really thorough, do read the OWASP document I mentioned before; it's really helpful.

UPDATE

There seems to be some controversy about htmlspecialchars vs htmlentities. I'm going to sum up a few things I've been reading, and you can choose whichever of the two you prefer:

UTF-7 problem

Both htmlspecialchars and htmlentities are susceptible to the infamous UTF-7 problem. Neither of them supports this encoding. As you can read in some of the comments of the SO posts provided at the bottom of the post:

If your page/browser is vulnerable to the UTF-7 issue, htmlentities isn't going to help you any more than htmlspecialchars will. Both of them will interpret the UTF-7 encodings of < and > as just "safe" ASCII chars and pass them through.

Solution: Don't use UTF-7, and make sure that escaping is done using the same character encoding the document is served in, to avoid disappearing quotes. Declare in the header of your webpage the same encoding you pass to htmlspecialchars (UTF-8, for instance):

header('Content-Type: text/html; charset=utf-8');

From PHP 5.4 onward, htmlspecialchars defaults to UTF-8 if you don't specify the third parameter (from PHP 5.6 it follows the default_charset ini setting, which itself defaults to UTF-8), so you should be safe even if you forget to specify it.
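
One way to keep the two encodings in sync is to name the charset once and reuse it. This is only a sketch: the constant PAGE_CHARSET and the helper e() are names I made up, not part of PHP:

```php
<?php
// Hypothetical constant: the single place where the page charset is named.
const PAGE_CHARSET = 'UTF-8';

// Tell the browser which charset to use (only possible before any output).
if (!headers_sent()) {
    header('Content-Type: text/html; charset=' . PAGE_CHARSET);
}

// Hypothetical helper: escape with the same charset the page is served in.
function e(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES, PAGE_CHARSET);
}

echo '<p>', e('<b>Tom & "Jerry"</b>'), '</p>', "\n";
```

Passing the encoding explicitly also avoids depending on which default your PHP version uses.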

Check this interesting article talking about the topic (and some more useful info about XSS). LINK

htmlentities() vs. htmlspecialchars()

htmlspecialchars

  • Use it when there is no need to encode every character that has an HTML entity equivalent. It sends less code to the client, and that isn't a matter to be taken lightly: less code sent, faster web pages. The output is also more readable than the one produced by htmlentities.
  • Sometimes you're writing XML data, and you can't use HTML named entities in an XML file.

htmlentities

  • Use it when there is a need to encode all characters, for example if your pages use encodings such as ASCII or Latin-1 instead of UTF-8.
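
To see the difference in practice, here is a small sketch that runs the same text through both functions. The sample strings and variable names are just illustrations:

```php
<?php
// "café <b>" written with an escape so the file encoding does not matter.
$text = "caf\u{e9} <b>";

// Only the HTML-significant characters are replaced; é is left alone.
$special = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

// Every character with a named entity is replaced, including é.
$entities = htmlentities($text, ENT_QUOTES, 'UTF-8');

// For XML output, ENT_XML1 restricts the result to XML's own entities.
$xml = htmlspecialchars("it's", ENT_QUOTES | ENT_XML1, 'UTF-8');

echo $special, "\n";  // café &lt;b&gt;
echo $entities, "\n"; // caf&eacute; &lt;b&gt;
echo $xml, "\n";      // it&apos;s
```

Both results are equally safe in the HTML body of a UTF-8 page; the htmlentities one is just longer.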

Check the documentation I provided and these SO questions:

htmlentities() vs. htmlspecialchars()

htmlspecialchars vs htmlentities when concerned with XSS

and choose the one that suits you best.

Community
acontell
  • Thanks for your answer. Can you show me using an example? – TNK Jan 17 '15 at 15:45
  • @TNK In the "XSS Prevention Rules Summary" part of the document I provided, there's a summary of how to safely render untrusted data in a variety of different contexts. Just apply them. For the body part, `htmlspecialchars` as you're using is fine. – acontell Jan 17 '15 at 15:50
  • @TNK It's up to you to use `htmlentities` or `htmlspecialchars` to output in the body of your page. There isn't much difference between them as you can read in the link (I'd say most people use `htmlspecialchars`). It's more important though to take into consideration things mentioned in the document I provided, such as outputting in input values – acontell Jan 17 '15 at 20:15
  • @TNK I've updated my answer with more examples. Hope it helps. – acontell Jan 17 '15 at 20:28
  • You should ALWAYS use `htmlentities`, not `htmlspecialchars`, for the output. `htmlspecialchars` will NOT protect you from some exploits and types of attacks. The "less efficient" point acontell mentioned is bullsh-t. It will take maybe 1/100,000,000 of a second longer, so what? For example, check this question: http://stackoverflow.com/questions/3623236/htmlspecialchars-vs-htmlentities-when-concerned-with-xss – Oleg Dubas Jan 17 '15 at 23:32
  • @acontell, Excellent update, thank you. The only thing I am still concerned about is convenience, as `htmlentities($s)` is much shorter than `htmlspecialchars($s, ENT_QUOTES)` :) But this can be ignored or solved in several ways. – Oleg Dubas Jan 18 '15 at 16:54