If the data can be output to a user, you must prevent a potentially malicious user from including HTML tags in it. For example, if I could include a script tag in that post, that could be very dangerous for any user reading my post. To prevent this, you want to use htmlentities:
$clean_data = htmlentities($_POST['data']);
That way, my <script> tag will be translated to &lt;script&gt;, so it won't harm users when displayed by their browsers.
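Here's a minimal sketch of what that translation looks like. The helper name escape_for_html and the sample input are mine, for illustration only. I pass the flags and charset explicitly because the default flags changed in PHP 8.1. Note that with ENT_COMPAT single quotes are left alone, which matters for the next part.

```php
<?php
// Hypothetical helper (not part of PHP): escape user data before echoing it into HTML.
function escape_for_html(string $data, int $flags = ENT_QUOTES): string
{
    // Explicit flags and charset, so the result does not depend on the PHP version.
    return htmlentities($data, $flags, 'UTF-8');
}

// Sample input standing in for $_POST['data'].
$raw = "<script>alert('hi')</script>";

// ENT_QUOTES converts both double and single quotes.
echo escape_for_html($raw), "\n";
// prints &lt;script&gt;alert(&#039;hi&#039;)&lt;/script&gt;

// ENT_COMPAT leaves single quotes untouched.
echo escape_for_html($raw, ENT_COMPAT), "\n";
// prints &lt;script&gt;alert('hi')&lt;/script&gt;
```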
Now, if you want to save my post in your database, you must beware of SQL injection. Let's say you're saving my post with this query (yes, you shouldn't use the mysql_* functions: they were deprecated in PHP 5.5 and removed in PHP 7, but this is just to explain the idea):
mysql_query("INSERT INTO posts(data) values('$clean_data');", $db);
Sounds good? Well, if I'm nasty, I'll try to submit this post:
test'); DELETE FROM posts; SELECT * FROM posts WHERE data = '
What MySQL receives is then:
INSERT INTO posts(data) values('test'); DELETE FROM posts; SELECT * FROM posts WHERE data = '');
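To see how the query ends up in that state, here's a small sketch that only builds the string, with no database involved. The function name build_insert_query is hypothetical. It just does the same interpolation as the query above.

```php
<?php
// Hypothetical helper: builds the query by pasting the data straight into the SQL text.
// This is the vulnerable pattern, shown only to make the problem visible.
function build_insert_query(string $data): string
{
    return "INSERT INTO posts(data) values('$data');";
}

// The malicious post from above.
$post = "test'); DELETE FROM posts; SELECT * FROM posts WHERE data = '";

echo build_insert_query($post), "\n";
// prints INSERT INTO posts(data) values('test'); DELETE FROM posts; SELECT * FROM posts WHERE data = '');
```

The first quote in my post closes the string early, and everything after it is read as SQL rather than as data.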
Ouch. So you must keep the quotes in the user's post from ending your SQL string early; more precisely, you should escape them. How to do that depends on the library you are using, but with the obsolete one I used, it would be written:
$really_clean_data = mysql_real_escape_string($clean_data, $db);
mysql_query("INSERT INTO posts(data) values('$really_clean_data');", $db);
So, with the above malicious post, MySQL would now receive
INSERT INTO posts(data) values('test\'); DELETE FROM posts; SELECT * FROM posts WHERE data = \'');
Now, to MySQL, the whole test\'); DELETE FROM posts; SELECT * FROM posts WHERE data = \'
part is a single string value, so only the INSERT runs and my post is stored as plain text, which is what you want to happen.
Basically, that's all you need in almost all cases. Just remember that when you feed user data to an interpreter (it can be a web browser, an SQL engine or many other things), you should escape that data for that particular interpreter, and the best way to do it is to use the dedicated functions that come with the library you are using.
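As an example of such dedicated functions in a current library, here's a sketch using PDO prepared statements instead of the mysql_* calls above. This is a different technique from manual escaping: the query text and the data are sent separately, so the quotes never get a chance to end the string. The in-memory SQLite database (which needs the pdo_sqlite extension) and the save_post helper are my assumptions, chosen only so the sketch runs without a MySQL server.

```php
<?php
// In-memory SQLite database, used here only so the example is self-contained.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE posts (data TEXT)');

// Hypothetical helper: the SQL contains a placeholder, the data is bound separately.
function save_post(PDO $pdo, string $data): void
{
    $stmt = $pdo->prepare('INSERT INTO posts(data) VALUES (:data)');
    $stmt->execute([':data' => $data]);
}

save_post($pdo, 'hello');
// The malicious post from above is stored as ordinary text.
save_post($pdo, "test'); DELETE FROM posts; SELECT * FROM posts WHERE data = '");

// Both rows are still there: nothing was deleted.
$count = (int) $pdo->query('SELECT COUNT(*) FROM posts')->fetchColumn();
$stored = $pdo->query('SELECT data FROM posts')->fetchAll(PDO::FETCH_COLUMN);
echo $count, "\n";
// prints 2
```

With MySQL you would only change the connection string passed to new PDO; the prepare and execute calls stay the same.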