-1

I need a Regex for to use with preg_replace php function in the search form input to use in SQL full text search in a MySQL multilingual utf8 database. I have considered using php filter_var with FILTER_SANITIZE_STRING, but I ended up with preg_replace:

I want these features:

  1. keep spaces and only one if more in a row (serial spaces)
  2. keep double quotes and only one if more in a row(so that I could use it in phrase in IN BOOLEAN MODE)
  3. keep - & + & '~' and only one if more in a row
  4. as I want it to be multi lingual it should consider Unicode (utf8) letters too
  5. I do not have/need accents to be considered.

This is what I have done:

$q = addslashes($q);
$q = preg_replace('/[^\w\d\s\s+\p{L}]/u', "", $q);

But the output does not satisfy me with like with quotes(") and minus (-). How can I write a safe query string to use in my search box?

Are there any better practises than using preg_replace?

halfer
  • 19,824
  • 17
  • 99
  • 186
Peyman Mohamadpour
  • 17,954
  • 24
  • 89
  • 100
  • 2
    If you are doing this to prevent sql injections then you're doing it wrong. Use [prepared statements](http://stackoverflow.com/questions/60174/how-to-prevent-sql-injection-in-php/60496#60496) instead. See the [difference](http://www.youtube.com/watch?v=nLinqtCfhKY). You may use `preg_match` to detect such invalid queries and return a funny message to the client. – HamZa Jul 07 '13 at 23:20
  • Thanks for your comment. there is no injection. I am just using a `SELECT`, not an `INSERT`. but it should be a clear query to perform a better search – Peyman Mohamadpour Jul 07 '13 at 23:34
  • 2
    @Arash Yes, and a simple `'; DROP TABLE abcd; --` will still work. It doesn't matter what command you are using, the attacker can still delete stuff as with SQL injection one can usually execute a completely different command. – Manishearth Jul 07 '13 at 23:39
  • 2
    @Arash it seems you are unaware of what "sql injections" means. I highly recommend you to read about this from [OWASP](https://www.owasp.org/index.php/SQL_Injection) and maybe the [wiki page](http://en.wikipedia.org/wiki/SQL_injection). Also if you are using `mysql` note that it's deprecated and you can't perform prepared statements with it. You *should* use `mysqli` or `PDO`. – HamZa Jul 07 '13 at 23:42
  • @Manishearth Thanks for that, but would you please answer my main question? – Peyman Mohamadpour Jul 07 '13 at 23:44
  • @HamZa, Thanks for the information. I will take a look at it. but anyway I am using mysqli – Peyman Mohamadpour Jul 07 '13 at 23:44
  • 1
    @Arash Using mysqli isn't enough, you have to use it _with prepared statements_. Otherwise it's just as bad. – Manishearth Jul 07 '13 at 23:53

1 Answers1

1

You have to do 2 preg_replace.

1- Replace invalid characters by nothing:

$q = preg_replace('/[^\p{L}\d\s~+"-]+/', '', $q);

2- Replace multiple char like spaces, ~, +, ", - by only one:

$q = preg_replace('/([\s~+"-])\1+/', "$1", $q);
Toto
  • 89,455
  • 62
  • 89
  • 125