
I'm trying to understand XSS attacks. I learnt that I should use htmlspecialchars() whenever I output something to the browser that came from user input. The code below works fine.

What I don't understand is whether I need to use htmlspecialchars() here when echoing $enrollmentno.

<?php
$enrollmentno = (int)$_POST['enrollmentno'];

echo "<div style='border-radius:45px; border-width: 2px; border-style: dashed; border-color: black;'><center><h4><b>$enrollmentno</b></h4></center></div>";

$clink = "http://xyz/$enrollmentno/2013";

echo "<iframe src='$clink' width='1500' height='900' frameBorder='0'></iframe>";
?>

If I do something like

$safe = "<div style='border-radius:45px; border-width: 2px; border-style: dashed; border-color: black;'><center><h4><b>$enrollmentno</b></h4></center></div>";

echo htmlspecialchars($safe, ENT_QUOTES);

It doesn't render as HTML; the markup itself is displayed as text.

I'm not sure if I have to use HTMLPurifier here. Does HTMLPurifier retain the HTML formatting while preventing XSS?

Update

echo "<div style='border-radius:45px; border-width: 2px; border-style: dashed; border-color: black;'><center><h4><b>".htmlspecialchars ($enrollmentno)."</b></h4></center></div>";

Does the trick!

Nick_inaw
  • Why don't you just sanitize the value in `$enrollmentno`? – Rahil Wazir Dec 12 '13 at 21:16
  • Thanks. Would this do? `echo "".htmlspecialchars ($enrollmentno)."";` – Nick_inaw Dec 12 '13 at 21:19
  • Of course! It will do the job. – Rahil Wazir Dec 12 '13 at 21:21
  • I found this [link](http://stackoverflow.com/questions/17207750/how-to-output-html-but-prevent-xss-attacks) where HTMLPurifier is recommended. Why is that so? I'm confused, as my question is quite similar. – Nick_inaw Dec 12 '13 at 21:30
  • It's specifically written for `html/xss filtering`, and yes, it is more powerful than PHP's regular functions. But PHP also provides a wide variety of filtering `functions` and `constants`; you can use them without any third-party script. – Rahil Wazir Dec 12 '13 at 21:38
  • Thanks. I don't think there is a need to use HTMLPurifier in my case, right? – Nick_inaw Dec 12 '13 at 21:44
  • It really depends on your application. In my opinion, use it if you're having difficulty filtering a large amount of data, or any data. – Rahil Wazir Dec 12 '13 at 21:50

1 Answer


Any time you use arbitrary data in the context of HTML, you should be using htmlspecialchars(). The reason for this is that it prevents your text content from being treated as HTML, which could potentially be malicious if coming from outside users. It also ensures you are generating valid HTML that browsers can handle consistently.

Suppose I want the text "8 > 3" to appear in HTML. To do this, my HTML code would be 8 &gt; 3. The > is encoded as &gt; so that it isn't misinterpreted as part of a tag.

Now, suppose I am making a web page about how to write HTML. I want the user to see the following:

<p>This is how to make a paragraph</p>

If I want <p> and </p> to be shown as text rather than interpreted as an actual paragraph, I need to encode them:

&lt;p&gt;This is how to make a paragraph&lt;/p&gt;

htmlspecialchars() does that. It allows you to insert arbitrary text into an HTML context in a safe way.
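As a minimal sketch of the two cases above, the snippet below runs both strings through htmlspecialchars(). The explicit ENT_QUOTES and 'UTF-8' arguments are my own choice here, so the result does not depend on the defaults of the PHP version in use.

```php
<?php
// Text that should be shown literally, not interpreted as markup.
$comparison = '8 > 3';
$tutorial   = '<p>This is how to make a paragraph</p>';

// Encode the characters that are special in HTML (&, <, >, and both quote types).
$encodedComparison = htmlspecialchars($comparison, ENT_QUOTES, 'UTF-8');
$encodedTutorial   = htmlspecialchars($tutorial, ENT_QUOTES, 'UTF-8');

echo $encodedComparison, "\n"; // 8 &gt; 3
echo $encodedTutorial, "\n";   // &lt;p&gt;This is how to make a paragraph&lt;/p&gt;
```

The browser turns those entities back into the visible characters, so the reader sees the original text and no tag is ever created.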

Now, in your second example:

$safe = "<div style='border-radius:45px; border-width: 2px; border-style: dashed; border-color: black;'><center><h4><b>$enrollmentno</b></h4></center></div>";
echo htmlspecialchars($safe, ENT_QUOTES);

This does exactly what you asked it to do. You gave it some text, and it encoded that. If you wanted it as HTML, you should have just echoed it.
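One way to picture the difference is to encode only the value and leave your own markup alone. The sketch below uses a hypothetical helper, renderEnrollmentBox(), and a shortened version of the markup; neither is from the original code.

```php
<?php
// Hypothetical helper: the surrounding markup is trusted (we wrote it),
// so only the interpolated value is passed through htmlspecialchars().
function renderEnrollmentBox($enrollmentNo): string
{
    $value = htmlspecialchars((string) $enrollmentNo, ENT_QUOTES, 'UTF-8');
    return '<h4><b>' . $value . '</b></h4>';
}

echo renderEnrollmentBox(12345), "\n";
// <h4><b>12345</b></h4>

echo renderEnrollmentBox('<script>alert(1)</script>'), "\n";
// <h4><b>&lt;script&gt;alert(1)&lt;/script&gt;</b></h4>
```

In the question the value is cast with (int) first, so it can only ever be an integer and encoding it changes nothing. The call is still harmless, and the habit pays off as soon as the value is a string.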

Now, if you need to display HTML as HTML and it comes from an untrusted source (i.e. not you), then you need tools like HTMLPurifier. You do not need this if you trust the source. Running all your output through htmlspecialchars() doesn't magically make things safe. You only need it when inserting arbitrary text data. Here's a good use case:

echo '<h1>Product Review from ', htmlspecialchars($username), '</h1>';
echo htmlspecialchars($reviewText);

In this case, both the username and review text can contain whatever that user typed in, and they will be encoded correctly for use in HTML.
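To see what that encoding buys you, here is a small sketch with a hostile review. The sample $username and $reviewText values are made up for illustration, and the output is collected into a string only so it is easy to inspect.

```php
<?php
// Made-up input, as if typed into a form by an untrusted user.
$username   = 'Mallory <admin>';
$reviewText = "<script>alert('xss')</script>";

// Same pattern as above: trusted markup stays as-is, user text is encoded.
$html  = '<h1>Product Review from ' . htmlspecialchars($username, ENT_QUOTES, 'UTF-8') . '</h1>';
$html .= htmlspecialchars($reviewText, ENT_QUOTES, 'UTF-8');

echo $html, "\n";
// <h1>Product Review from Mallory &lt;admin&gt;</h1>&lt;script&gt;alert(&#039;xss&#039;)&lt;/script&gt;
```

The script tag arrives in the browser as plain text, so it is displayed rather than executed.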

Brad
  • Perfect! I think I solved my own question, but your explanation made it really clear. I cannot imagine a scenario where a user would need to provide HTML and someone would then need to use HTMLPurifier. Could you give me a simple example of an application where HTMLPurifier is used? – Nick_inaw Dec 12 '13 at 21:57
  • @Nick_inaw Suppose you accepted HTML-formatted comments on a message board. That would be the most common use case I think, but again not common. HTMLPurifier is certainly over-used by people who think they are getting security, when instead they are breaking their applications and being less secure. It should only be used when you actually want to filter HTML. – Brad Dec 12 '13 at 22:22
  • I totally get it now. Thanks. – Nick_inaw Dec 12 '13 at 22:34