0

What do you consinder the best way to sanitize array below? I was thinking adding htmlentites before each $row or perhaps using the method below.

<?php 
$result = $conn->query("SELECT formula.id, tokens, direction, graph, module FROM formula INNER JOIN syntics ON formula.moduleid = syntics.id"); 

while ($row = $result->fetch_array())
filter_var_array($row, FILTER_SANITIZE_SPECIAL_CHARS); // OK?

echo                    "<td>". 
     $row['tokens']    ."<td>". 
     $row['direction']  ."<td>". 
     $row['graph']     ."<td>".
     $row['module ']   ."<td>". 
     "<a href='upong.php?soya=" . $row['id'] . "'>Specific type</a>" . "</tr>";
?>
Global
  • 103
  • 3
  • 13
  • It matters what your data is. Are the first 4 fields saved as text or html? Can I assume id is an integer? – Phil Oct 09 '16 at 20:42
  • The first 4 fields are saved as html. You're correct, the id is integer. – Global Oct 09 '16 at 20:46
  • 1
    your outputting from the db, what do you think you need to "Sanitize"? –  Oct 09 '16 at 20:49
  • Please can you give more information about why you are trying to sanitize? What are you trying to prevent? – Phil Oct 09 '16 at 20:50
  • I thought was a risk of XSS if not preventing characters, the line between DB and the actual output. If hacker perhaps could change the out from DB to its own string/script somehow. When user click on "Specific type" it gets directed to another page. The receiving id is then sanitized before query. – Global Oct 09 '16 at 20:59
  • That's why it's better to store text rather than html. Unless you can trust the source, you shouldn't be saving html because there are so many ways to hide dangerous javascript code. Have you thought of using [Markdown?](https://en.wikipedia.org/wiki/Markdown) – Phil Oct 09 '16 at 21:21
  • I'm still a rookie and haven't discovered it yet. Do you have any good link to read more about text - html? Apologize, the input from user is text! There's so many different opinions of what to do and not. Would it actually become a risk using my code as it is? The id variable is sanitized when other script receiving it, still feels risk somehow.. – Global Oct 09 '16 at 22:19

3 Answers3

3

I feel that you could benefit from some general information on the subject of sanitization and escaping.

  • To understand how to write secure PHP code, you need to prevent against XSS.
  • To prevent against XSS, you need to implement proper output escaping and may need to implement data sanitization.
  • To implement data sanitization and output escaping you need to understand why it's a problem and how to do it properly.

Sanitization

Sanitization should be done before saving the data to the database. It makes sure that things which shouldn't be saved to the database are not. It is also good to do it again after you read data out of your database incase you miss something and your database now contains something harmful. Generally if you are just storing text, you might want to allow any text to be saved and in this scenario sanitization is not really necessary. But it sounds like you are storing html...

If you are storing html you probably plan to output it to browsers at some point and you don't want it to contain harmful scripts for your users to execute. Sanitizing html to remove harmful javascript is actually very very difficult due to the many ways you can insert javascript. Whole PHP libraries (e.g. wp_kses_*) have been written dedicated to this and it's not enough to just remove all < script> tags as some SO answers suggest. Plus you will need to keep your html sanitization code up to date to prevent against the newest attacks. All in all, it's a very high risk/maintenance solution. If you want to go this route, there are some solutions here.

Usually you will want to give your users the ability to format their text with a subset of what html offers (e.g. bold, italics, underline and maybe some colours) and a better approach is to use a more lightweight language such as Markdown or BBCode

Also you should consider saving your fields as text only and handle the styling completely in your application.

Output Escaping

This is the step right before outputting the data. When you are piecing together HTML for output in PHP, you need to convert anything which isn't html yet in to safe html. If you use a templating language this is handled for you automatically. In my opinion it is the most misunderstood concept of PHP developers today, and unfortunately it is one of the most important. I won't go in to it here but I highly recommend this further reading.

Important Update

This code is NOT data sanitization, it is output escaping.

filter_var_array($row, FILTER_SANITIZE_SPECIAL_CHARS);

I can see now that confusingly, the word "Filter" has such a generic meaning in this answer and can arguably refer to both sanitization and escaping. I have removed it from my answer to help clear up any confusion.

Your example - Sanitization

I wont go as far to say never store html in a database field, but it is a lot harder this way. You need to decide what is expected and valid. If you update your question with more details on the specific data, it will become clear what these restrictions should be.

Your example - Output escaping

If your variables already contain well formed HTML fragment strings then you can safely append your variables using the "." (string concatenation operator) inside an open and close tag. What you have put in your question code is correct. However, I prefer to use direct output with short tags as it makes the code more readable and there is no real need to put everything in to a PHP string anyway.

<td><?= $row['tokens'] ?></td>
<td><?= $row['direction'] ?></td>
<td><?= $row['graph'] ?></td>
<td><?= $row['module'] ?></td>

Note: As explained above, by outputting html you are asking clients to trust, parse and display it. If these variables do in fact contain invalid or bad HTML, then it is a problem with your sanitization.

  • Output escaping should/does NOT clean your data.
  • Sanitization should/does NOT escape your data.

They are simply two different concepts working together.

Since your id should be an integer from your database you can cast it like this to make sure that it is.

<a href='upong.php?soya=<?= (int)$row['id'] ?>Specific type</a>

If the value is not castable to an integer (because something unexpected happened which you didn't account for) you end up with a 0 in your url which normally isn't that harmful.

Community
  • 1
  • 1
Phil
  • 1,996
  • 1
  • 19
  • 26
  • Thanks a lot for your clarification. It is indeed very interesting subject and I will continue reading the links. The data is saved from echo => $_POST => filter_input => query => db. Then output db => query => echo. So I should use htmlentites/specialchar for each row to minimize XSS, as in htmlentities($row['tokens']) htmlentities($row['direction'] etc? – Global Oct 09 '16 at 22:50
  • It depends what the data is and in what context you are outputting. From your comments you say that the 4 fields are html rather than text and since you are inserting them between "" tags, your context is html also. So in this situation you don't need to escape anything (i lied above). In fact, escaping them like that would display the html as plain text so you can see the tags. This is why when storing html in your database you should make sure it is sanitized and sanitize it again before output. – Phil Oct 13 '16 at 00:11
  • I'm quite lost now.. The data is stored from user input as POST, then sanitized and insert query to db. The output is declared in my first post. And if understand you correctly, I need to sanitize whole array right before echo? In what way would it be most consistent to type it? – Global Oct 14 '16 at 15:27
  • Please update your question to include some of your data. It matters what the data is – Phil Oct 14 '16 at 17:47
0

i believe you must use custom filter for sanitize your array. please read php document in url

hamid_reza hobab
  • 925
  • 9
  • 21
-1

If you are worried about an XSS attack then you really ought to be taking care of the problem before the input is inserted into your database by using htmlentities() on it. Never trust user input.

useyourillusiontoo
  • 1,287
  • 1
  • 10
  • 24
  • I strongly recommend to **never use htmlentities** on data **before** it is persisted. There is a difference between sanitizing and escaping and it's not good to assume the output format will always be html. – Phil Oct 09 '16 at 21:30