I am trying to filter the user's input for malicious code to prevent XSS attacks. When the user submits the input, it goes through the following checks. The input is contained in the $post variable.

$post = htmlspecialchars($post);
$id = $_SESSION['loggedIn'];
$sql = "UPDATE playerstats SET message='$post' WHERE id = $id";
$db->query($sql);
header("location: ?page=message");

Yes, I know I am not using prepared statements, but I wrote this code just for testing purposes. It works. In the database I see

&lt;script&gt;top.location.href = &quot;?page=message&quot;;&lt;/script&gt;

The message page then shows the post that was just inserted, but I don't see the effect of htmlspecialchars there. It changed the post when it was saved to the database, but when I display it on the message page, I again see

<script>top.location.href = "?page=message";</script>

Any idea why this is happening? Is the htmlspecialchars function only meant to be used for output?

M. Eriksson
  • Not entirely sure I understand the problem. If you're actually seeing those characters on the HTML page - which you should - then `htmlspecialchars` has done its job. If you hadn't used it, you wouldn't have *seen* anything; the script would simply have executed. (In this case, by redirecting you.) – Robin Zigmond Mar 20 '19 at 09:54
  • https://i.postimg.cc/65KH1hp3/241234.jpg This is what I see. I think I should not see the code this way? The message page in the picture displays the post that was saved to the database. In the database htmlspecialchars did its job; my question is why I again see the `<`, `"` and `>` special characters in the browser. Sorry, I am not good at explaining. – Black Sun Phoenix Entertainmen Mar 20 '19 at 09:59
  • You should avoid escaping data before you store it in your database. You should save it "as is", which isn't an issue if you use parameterized [Prepared Statements](http://php.net/manual/en/mysqli.quickstart.prepared-statements.php). Escaping should be done before you use the data. If you're outputting it on an HTML page, use htmlspecialchars() or htmlentities(). If you're going to use it for something else, you might need to escape it differently. – M. Eriksson Mar 20 '19 at 10:01

3 Answers

The point of htmlspecialchars is to replace HTML special characters with HTML entities (the "ampersand codes" such as &lt;), which make the browser show the character without interpreting it as HTML. This is highly effective against XSS attacks (but not SQL injection).

That means if you put the string <script>maliciousCode</script> through htmlspecialchars and echo it into the page, the user will see the actual string. If you do not put it through htmlspecialchars, the browser will treat it as a <script> tag and execute maliciousCode.
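As a minimal sketch of that behaviour (the `$input` and `$safe` variable names are only illustrative, and passing `ENT_QUOTES` and `'UTF-8'` explicitly is an assumption about the page's setup), the encoded string contains no literal angle brackets any more:

```php
<?php
// Hypothetical user input containing a script tag.
$input = '<script>maliciousCode</script>';

// Encode the HTML special characters for output.
// ENT_QUOTES additionally encodes single quotes.
$safe = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

// The page source now contains entities instead of a real <script> tag.
echo $safe, PHP_EOL;
// prints: &lt;script&gt;maliciousCode&lt;/script&gt;
```

In "View Source" you would see the entities, while the rendered page shows the original text as plain characters.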

This in itself does not prevent SQL Injection! It is only used to sanitize strings which you want to show to the user.

To prevent SQL injection, use prepared statements. (I discourage you from using other forms of escaping such as mysqli_real_escape_string, because using prepared statements makes it completely impossible.)

JensV
  • _"because using prepared statements makes it completely impossible"_ - Not entirely true. There are [edge cases](https://stackoverflow.com/a/12202218/2453432) where you still can be vulnerable so the statement _"completely impossible"_ is a bit too absolute. – M. Eriksson Mar 20 '19 at 10:07
  • I think you're mixing up two different security issues. You're right that `htmlspecialchars` does nothing whatsoever to prevent SQL Injection. For that you use prepared statements, as you say. `htmlspecialchars` is there to prevent cross-site scripting (XSS) attacks. – Robin Zigmond Mar 20 '19 at 10:45
  • Well yes, in the end `htmlspecialchars` will also prevent XSS, but that is not its only function. You also use it to show normal user input text containing special HTML characters. Preventing XSS is more like implied functionality from my point of view, but I'll add it to the answer as well. – JensV Mar 20 '19 at 10:50

htmlspecialchars encodes HTML special characters; for example, <tag> is replaced with &lt;tag&gt;. When you load that string back from the database and display it, the browser renders &lt;tag&gt; as the text <tag>, but does not treat it as HTML markup. In the page source, you will still see the encoded &lt;tag&gt;.

If you want the string to be interpreted as normal HTML on the page, you have to use htmlspecialchars_decode to convert it back after you have loaded the content from the database. Keep in mind that any script contained in the content will then run again.

$loaded_from_db = htmlspecialchars_decode($value);
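A small round-trip sketch may make the relationship between the two functions clearer. The sample string and variable names are made up for illustration, and the same `ENT_QUOTES` flag is assumed on both sides so that quotes are handled consistently:

```php
<?php
// Hypothetical content with markup, an ampersand and quotes.
$original = '<b>bold</b> & "quoted"';

// Encode: special characters become entities.
$encoded = htmlspecialchars($original, ENT_QUOTES, 'UTF-8');

// Decode: entities are turned back into the original characters.
$decoded = htmlspecialchars_decode($encoded, ENT_QUOTES);

echo $encoded, PHP_EOL; // &lt;b&gt;bold&lt;/b&gt; &amp; &quot;quoted&quot;
echo $decoded, PHP_EOL; // <b>bold</b> & "quoted"
```

Because decoding restores real markup, it only makes sense for content you trust to be rendered as HTML.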

If you want to escape your input for security reasons, to protect against SQL injection, you could use mysqli_real_escape_string instead.

But using prepared statements would be the best choice, because you define exactly what you expect as parameters for your statements and the values provided cannot mess it up. It's also the recommended approach even if it's just for testing purpose and it's not that hard to implement.

So your example with mysqli and prepared statements would be:

$stmt = $mysqli->prepare("UPDATE playerstats SET message=? WHERE id = ?");
$stmt->bind_param("si", $post, $id);
$stmt->execute();

Note that I didn't include any error handling.
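To tie this together with the advice from the comments on the question (store the data as is, escape it when it is output), here is a self-contained sketch. It is not the asker's actual setup: an in-memory SQLite database accessed through PDO stands in for the MySQL server, and the table contents and variable names are assumptions made for the example.

```php
<?php
// Assumption: in-memory SQLite via PDO replaces the real MySQL database.
$pdo = new PDO('sqlite::memory:');
// Make PDO throw exceptions so failures are not silently ignored.
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE playerstats (id INTEGER PRIMARY KEY, message TEXT)');
$pdo->exec("INSERT INTO playerstats (id, message) VALUES (1, '')");

// Hypothetical submitted post and logged-in user id.
$post = '<script>top.location.href = "?page=message";</script>';
$id = 1;

// Store the raw input; the placeholders keep it out of the SQL text.
$stmt = $pdo->prepare('UPDATE playerstats SET message = ? WHERE id = ?');
$stmt->execute([$post, $id]);

// Later, on the message page: load the raw value back ...
$stmt = $pdo->prepare('SELECT message FROM playerstats WHERE id = ?');
$stmt->execute([$id]);
$stored = $stmt->fetchColumn();

// ... and escape it only at the moment it is written into HTML.
$html = htmlspecialchars($stored, ENT_QUOTES, 'UTF-8');
echo $html, PHP_EOL;
```

With this split, the database keeps the text exactly as the user typed it, and each output context can apply the escaping it needs.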

Jack O'Neill
  • Don't recommend `mysqli_real_escape_string()`. Prepared statements are the way to go. Also, `mysqli_real_escape_string()` only works with mysqli, not with PDO. Both support prepared statements though. – M. Eriksson Mar 20 '19 at 10:12
  • Thanks. The thing that bothered me was that I was seeing the code NOT encoded in the browser output. – Black Sun Phoenix Entertainmen Mar 20 '19 at 10:23
  • That's because `&lt;tag&gt;` still gets displayed as `<tag>`, but just not interpreted as an HTML tag. – Jack O'Neill Mar 20 '19 at 10:25
  • @BlackSunPhoenixEntertainmen - The point of encoding the data is to be able to output the string as it was written without the browser interpreting it as HTML. So if you check the page source, you will see the encoded string while the browser will output the "real" characters in the view. – M. Eriksson Mar 20 '19 at 10:26
  • Yeah, I just realized that when I replaced the encoded code with the decoded code. The browser started redirecting me to the same page endlessly. Thank you for your help! – Black Sun Phoenix Entertainmen Mar 20 '19 at 10:30

Yes, this is a correct usage of htmlspecialchars:

$post = htmlspecialchars($post);

Here is another example:

<?php
$str = "This is some <b>bold</b> text.";
echo htmlspecialchars($str);
?>

The HTML output of the code above will be (View Source):

<!DOCTYPE html>
<html>
<body>
This is some &lt;b&gt;bold&lt;/b&gt; text.
</body>
</html> 

The browser output of the code above will be:

This is some <b>bold</b> text.

Summary:

The htmlspecialchars() function converts some predefined characters to HTML entities.

The predefined characters are:

& (ampersand) becomes &amp;
" (double quote) becomes &quot;
' (single quote) becomes &#039; (only when the ENT_QUOTES flag is set, which is the default since PHP 8.1)
< (less than) becomes &lt;
> (greater than) becomes &gt;
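The list above can be checked with a short loop. This is only a sketch: the `$chars` and `$entities` names are illustrative, and `ENT_QUOTES | ENT_HTML401` is passed explicitly so that the single quote is converted regardless of the PHP version's default flags.

```php
<?php
// The five characters that htmlspecialchars() can convert.
$chars = ['&', '"', "'", '<', '>'];

$entities = [];
foreach ($chars as $char) {
    // ENT_QUOTES converts both quote styles; ENT_HTML401 selects &#039; for '.
    $entities[$char] = htmlspecialchars($char, ENT_QUOTES | ENT_HTML401, 'UTF-8');
    echo $char, ' => ', $entities[$char], PHP_EOL;
}
```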
khaled saleh