Right now my application is secured against SQL injection and XSS, and I will add CSRF protection before deployment. I'm not filtering POST requests, though, and I don't use any GET requests.

I was reading this question to see if I missed any security detail, and one answer says to use:

$username = filter_input(INPUT_POST, 'username', FILTER_SANITIZE_STRING);

The problem is that my data comes in different languages: English, French, Arabic, Russian, and I don't know what else. I have no idea what the people doing the data entry will want to put in the database; I only know that there will be data in many different languages.

Input filtering and sanitizing won't work with non-English characters, right?

DarkBee
Lynob
  • `$var = "éèàùßsomething";$username=filter_var($var, FILTER_SANITIZE_STRING);` outputs: `"éèàùßsomething"`. – Syscall Feb 01 '18 at 10:12
  • The overall concept of generic contextless "sanitisation" makes me shiver. PHP combines great features with historic crap and I think this is closer to the second case. – Álvaro González Feb 01 '18 at 10:13
  • @ÁlvaroGonzález you don't think it's a good idea to filter post and sanitize it? I'm just asking, the only reason I'm doing it is because they said so in the other post. If it doesn't make any difference then I won't waste my time on it – Lynob Feb 01 '18 at 11:36
  • @Lynob It's not a good idea to do it in "batch" mode, without specific context in mind. For instance, `FILTER_SANITIZE_STRING` appears to remove HTML tags. That's a great way to e.g. damage a password, where HTML is the last thing to care about. In your case, it makes sense to restrict the characters used in user names but that's something that PHP can't do automatically for you because it doesn't know your requirements. – Álvaro González Feb 01 '18 at 11:43
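
To make the last comment concrete, here is a minimal sketch of a context-specific check for user names that still accepts any language. The function name `isValidUsername`, the allowed punctuation and the 3 to 30 character limit are all assumptions for illustration, not requirements from the question.

```php
<?php
// Hypothetical rule: letters, combining marks and digits from any script,
// plus space, dot, underscore and hyphen; 3 to 30 characters in total.
// Adjust the character class and the length to your own requirements.
function isValidUsername(string $username): bool
{
    // The /u modifier makes PCRE treat pattern and subject as UTF-8, so the
    // length is counted in code points and \p{L} covers Arabic, Cyrillic, etc.
    // \A and \z anchor the whole string (unlike $, \z rejects a trailing newline).
    // preg_match() returns false on malformed UTF-8, which also fails the check.
    return preg_match('/\A[\p{L}\p{M}\p{N} ._-]{3,30}\z/u', $username) === 1;
}
```

With this rule, `isValidUsername('Жанна')` and `isValidUsername('محمد')` pass, while `isValidUsername('<script>')` fails. Nothing is silently rewritten: the value is either accepted as typed or rejected.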

1 Answer


According to W3Schools

The FILTER_SANITIZE_STRING filter strips or encodes unwanted characters.

This filter removes data that is potentially harmful for your application. It is used to strip tags and remove or encode unwanted characters.

Non-English characters are left untouched by default; they are only encoded or stripped if you pass `FILTER_FLAG_ENCODE_HIGH` or `FILTER_FLAG_STRIP_HIGH`. You should make sure you use UTF-8 in PHP and MySQL.
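
As a rough sketch of the UTF-8 advice: rather than relying on `FILTER_SANITIZE_STRING` (which is deprecated as of PHP 8.1), you can check that incoming text is well-formed UTF-8 and escape it only when you print it into HTML. The helper names `isValidUtf8` and `escapeHtml` are made up for this example.

```php
<?php
// Returns true when $value is well-formed UTF-8.
function isValidUtf8(string $value): bool
{
    // An empty pattern with the /u modifier matches any valid UTF-8 string;
    // preg_match() returns false when the subject is malformed.
    return preg_match('//u', $value) === 1;
}

// Escape for an HTML context at output time, leaving the stored data as typed.
function escapeHtml(string $value): string
{
    // ENT_QUOTES also escapes single quotes; ENT_SUBSTITUTE replaces invalid
    // byte sequences instead of returning an empty string.
    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}
```

Here `escapeHtml('<b>Привет</b>')` gives `&lt;b&gt;Привет&lt;/b&gt;`: the markup is neutralised and the Cyrillic text is unchanged. On the MySQL side, the usual counterpart is a `utf8mb4` connection charset, for example `charset=utf8mb4` in a PDO DSN.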

Kasia Gogolek
  • How can a sequence of bytes harm my application? PHP doesn't have the faintest idea of what I plan to do with the string. It's not as if text can contain corrosive substances or something... – Álvaro González Feb 01 '18 at 10:15
  • I agree, a better way to sanitize your data is to know what you're expecting and limit based on that. Generic filtering can give you a false sense of security, but it can be bypassed anyway half the time when someone has malicious intent – Kasia Gogolek Feb 01 '18 at 10:20
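
One possible shape for "know what you're expecting and limit based on that" is a validator per field using PHP's `FILTER_VALIDATE_*` filters. The form fields (`email`, `age`, `comment`), the limits and the function name `validateSignup` are hypothetical; this is only a sketch under those assumptions.

```php
<?php
// Validate a hypothetical form; returns the names of the fields that failed.
function validateSignup(array $post): array
{
    $errors = [];

    // An e-mail address has a known shape, so validate it as one.
    if (filter_var($post['email'] ?? '', FILTER_VALIDATE_EMAIL) === false) {
        $errors[] = 'email';
    }

    // A number has a known range (13 to 120 is an assumed limit).
    $age = filter_var($post['age'] ?? '', FILTER_VALIDATE_INT, [
        'options' => ['min_range' => 13, 'max_range' => 120],
    ]);
    if ($age === false) {
        $errors[] = 'age';
    }

    // Free text in any language: only require valid UTF-8 and bound the
    // length (1 to 500 code points); escape it later, when displaying it.
    $comment = $post['comment'] ?? '';
    if (!is_string($comment) || preg_match('/\A.{1,500}\z/su', $comment) !== 1) {
        $errors[] = 'comment';
    }

    return $errors;
}
```

In a request handler this would be called as `validateSignup($_POST)`. An empty array means every field matched its own rule, and Arabic or Russian text in `comment` is accepted as is.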