I've got a <textarea> whose value is sent off to the server and stored in a database. This value is then later rendered on different pages in HTML.

What do I need to do to sanitize this? Just remove the HTML tags? (It's already SQL-injection safe because I'm using a stored procedure and parameters.)

Does anyone have a sanitize routine?

icktoofay
  • Normally your server-side language has plenty of sanitizers in its standard library. Do not try to write your own! What language are you using? – Martin Olsen Apr 30 '11 at 01:09
  • It depends on the context you render it in. If you render it inside quotes, say as the 'title' attribute of a link element, it needs to be escaped differently than if you render it directly in the body. Can you clarify where you are rendering it? Also, as Martin says, your framework will have functions to do the escaping for a given scenario. For example, .NET has HtmlAttributeEncode, etc. – Noon Silk Apr 30 '11 at 04:32
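
To illustrate the point about rendering context, here is a minimal Python sketch (the asker's language is not stated, so Python and the helper names `in_body` and `in_title_attribute` are assumptions made only for illustration). It uses the standard library's `html.escape`, which escapes quotes only when `quote=True`:

```python
import html


def in_body(raw_text: str) -> str:
    """Encode text for placement between tags (hypothetical helper).

    Inside an element body only &, < and > need escaping, so quotes
    can be left alone (quote=False).
    """
    return "<p>" + html.escape(raw_text, quote=False) + "</p>"


def in_title_attribute(raw_text: str) -> str:
    """Encode text for a double-quoted attribute value (hypothetical helper).

    Here quotes must be escaped as well, otherwise the value could
    close the attribute and inject new ones.
    """
    return '<a href="#" title="' + html.escape(raw_text, quote=True) + '">link</a>'


print(in_body('Tom & "Jerry" <3'))
print(in_title_attribute('" onmouseover="alert(1)'))
```

Other contexts (URLs, inline JavaScript, CSS) have their own escaping rules that this sketch does not cover.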

2 Answers

Do not sanitize input. Instead, encode it when you output it. This is easy to enforce with the .NET 4 encoding syntax (<%: "" %>) or through code review.

Data should be stored in its native format. The native format of human-readable text is plain text, not some encoded version of it. Encoded text is hard to manipulate (say, highlighting words or doing replacements).
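
As a rough sketch of why raw storage helps, the Python function below (the name `highlight` and the choice of Python are assumptions for illustration, not part of the answer) searches the stored plain text and encodes each piece only at output time. Searching already-encoded text instead could, for example, match the letters inside an entity such as `&lt;`.

```python
import html
import re


def highlight(raw_text: str, word: str) -> str:
    """Wrap each occurrence of `word` in <mark>, HTML-encoding everything.

    The search runs on the raw stored text; encoding happens only
    while the output is assembled.
    """
    if not word:
        # Nothing to highlight; just encode the text.
        return html.escape(raw_text)

    pattern = re.compile(re.escape(word), re.IGNORECASE)
    pieces = []
    pos = 0
    for match in pattern.finditer(raw_text):
        # Encode the untouched text before the match.
        pieces.append(html.escape(raw_text[pos:match.start()]))
        # Encode the matched text too, then wrap it in trusted markup.
        pieces.append("<mark>" + html.escape(match.group(0)) + "</mark>")
        pos = match.end()
    pieces.append(html.escape(raw_text[pos:]))
    return "".join(pieces)


print(highlight("a < b and B > a", "b"))
# a &lt; <mark>b</mark> and <mark>B</mark> &gt; a
```

The only unescaped markup in the result is the `<mark>` tags the code itself adds; everything that came from the user is encoded.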

Not encoding text in the database even saves a little storage space.

Sanitizing input is hard anyway. It is very hard to do anything more than encode everything. Blacklisting HTML tags is a sure way to miss something, so don't do it.

usr
  • +1 because I think you just recommended that the asker does not HTML sanitize their data before storing into the database. – bitlather Jun 29 '13 at 13:53
  • This will save you storage, since you store the actual input. And when you output it, you make it slightly larger to be safely shown in the html, as shown here: http://stackoverflow.com/a/2794366/985511 – xrDDDD Sep 16 '13 at 21:51

Either remove the tags completely, or replace special characters such as < and > with their HTML entities (&lt; and &gt;). Whatever server-side language you're using probably already has a function to do this. PHP's htmlspecialchars or strip_tags will do the trick, for example.
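
For comparison, here is a minimal Python sketch of the two options (Python is an assumption, since the answer names only PHP; `strip_tags` below is a hypothetical helper built on the standard `html.parser` module, not PHP's function of the same name):

```python
import html
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collects only the text content and drops every tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags(text: str) -> str:
    """Option 1: remove the tags, keep the text between them."""
    parser = _TextOnly()
    parser.feed(text)
    parser.close()  # flush any buffered text
    return "".join(parser.parts)


def escape_tags(text: str) -> str:
    """Option 2: keep everything, but encode the special characters."""
    return html.escape(text)


sample = "<b>bold</b> text"
print(strip_tags(sample))   # bold text
print(escape_tags(sample))  # &lt;b&gt;bold&lt;/b&gt; text
```

Stripped text can still contain characters such as & or a stray <, so it is probably safest to encode it on output as well.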

mpen