I have a few questions regarding preg_match in PHP.

if(preg_match('#[^0-9 -&()+@._A-Za-z]#', $input)){
    $errors .= 'Sorry, username can\'t contain those characters.<br>';
}

This is my preg_match. I am kind of new to this. I have read that it's better to use # at the beginning and end of the pattern rather than /, but I don't know why. Does anyone know what is up with that?

My main problem is that this preg_match actually lets strings with % (percent signs) through, and it shouldn't. Why, and how do I stop that?

Another question: is this preg_match code good? It works fine (except for the % part), but can it fail?

Thank you :)

chris85
Morsus
  • `#` is used instead of `/` because `/` appears in URLs and would need to be escaped every time it occurs in the pattern. I prefer `~` because it rarely appears in the data I'm parsing. These characters are called delimiters, see http://php.net/manual/en/regexp.reference.delimiters.php – chris85 Jul 27 '15 at 20:56
  • What is that even for? I mean first and last character (#,~,/)? – Morsus Jul 27 '15 at 20:58
  • They are delimiters; they tell the regex engine where the expression starts and ends. `When using the PCRE functions, it is required that the pattern is enclosed by delimiters` See the link at the end of my first comment. – chris85 Jul 27 '15 at 21:01
  • So in my case it does not even matter which ones I use? – Morsus Jul 27 '15 at 21:15
  • Not currently. If you were to add `#` to the character class, though, you would need to escape it as `\#` or change the delimiters to a character the pattern does not use. – chris85 Jul 27 '15 at 21:16
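The points made in these comments can be checked with a short sketch. It assumes the corrected character class (hyphen moved to the front) and made-up sample inputs; none of the variable names come from the original post.

```php
<?php
// The same negated character class wrapped in three different delimiters.
$class = '[^-0-9 &()+@._A-Za-z]';
$input = 'bad%name';

$results = [];
foreach (['#', '~', '/'] as $delimiter) {
    // preg_match returns 1 on a match and 0 on no match.
    $results[$delimiter] = preg_match($delimiter . $class . $delimiter, $input);
}
var_dump($results); // the choice of delimiter does not change the result

// Allowing a literal '#' inside the class: escape it when '#' is the delimiter...
$escaped = preg_match('#[^-0-9 &()+@._A-Za-z\#]#', 'user#1');
// ...or switch to a delimiter that does not occur in the pattern.
$switched = preg_match('~[^-0-9 &()+@._A-Za-z#]~', 'user#1');
var_dump($escaped, $switched);
```

Both variants treat `#` as an allowed character, so neither reports a disallowed character in `user#1`.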

1 Answer

this preg_match actually let strings with "%" (percent signs) through and it shouldn't. Why?

That is due to the unescaped hyphen in the middle of your character class:

'#[^0-9 -&()+@._A-Za-z]#'
--------^

Here - acts as a range operator from space (32) to & (38), so the class includes every character in between, including % (37). Because the class is negated, those characters are treated as allowed.
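A small sketch to make the range visible, using only built-in functions (`ord`, `chr`, `preg_match`); the sample usernames are made up for illustration.

```php
<?php
// List every character covered by the accidental range ' -&' (space to ampersand).
$covered = '';
for ($code = ord(' '); $code <= ord('&'); $code++) {
    $covered .= chr($code);
}
echo $covered, PHP_EOL; // the seven characters with codes 32 through 38

// The original class treats '%' as allowed, so the check does not flag it.
$original = '#[^0-9 -&()+@._A-Za-z]#';
$percentFlagged = preg_match($original, 'user%name');

// A literal hyphen, on the other hand, is NOT in the class and gets flagged.
$hyphenFlagged = preg_match($original, 'user-name');
var_dump($percentFlagged, $hyphenFlagged);
```

A side effect of the range is that the hyphen itself is no longer one of the allowed characters, which is probably not what was intended either.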

Move the hyphen to the start of the class, where it is treated as a literal character:

'#[^-0-9 &()+@._A-Za-z]#'

Or, since \w already covers letters, digits and the underscore:

'#[^-\w &()+@.]#'
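As a quick sanity check, the two corrected patterns can be run side by side. This is only a sketch with made-up sample inputs; it assumes ASCII input and no `u` modifier, since `\w` can match more than `[0-9A-Za-z_]` under other settings.

```php
<?php
$explicit  = '#[^-0-9 &()+@._A-Za-z]#';
$shorthand = '#[^-\w &()+@.]#';

// 1 means "a disallowed character was found", 0 means the input is clean.
$samples = ['user_name', 'user-name', 'a&b (c) +d@e.f', 'user%name', 'user#1', 'user!'];
$table = [];
foreach ($samples as $sample) {
    $table[$sample] = [preg_match($explicit, $sample), preg_match($shorthand, $sample)];
    printf("%-16s explicit=%d shorthand=%d\n", $sample, $table[$sample][0], $table[$sample][1]);
}
```

For these inputs both patterns agree, and `%` is now reported as a disallowed character.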

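Note that without anchors this character class matches only one character. That is enough for your check, because a single disallowed character anywhere in the input makes preg_match return 1. Do not anchor the negated class as-is: it would then match only strings made up entirely of disallowed characters. If you would rather validate the whole string, anchor the positive class and report an error when preg_match does not match:

'#^[-\w &()+@.]+$#'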
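A minimal sketch of whole-string validation follows. The helper name `isValidUsername` is hypothetical, and the `D` modifier is an addition not present in the patterns above; it keeps `$` from matching before a trailing newline. The last two calls show how an anchored negated class behaves, for comparison.

```php
<?php
// Hypothetical helper (the name is illustrative, not from the original post).
function isValidUsername(string $input): bool
{
    // Positive class anchored at both ends: every character must be allowed,
    // and the + quantifier rejects the empty string.
    return preg_match('#^[-\w &()+@.]+$#D', $input) === 1;
}

foreach (['user_name-1', 'user%name', ''] as $candidate) {
    var_dump(isValidUsername($candidate));
}

// For comparison: anchoring the NEGATED class matches only strings made up
// entirely of disallowed characters, so a mixed string is not flagged.
$mixed = preg_match('#^[^-\w &()+@.]+$#', 'user%name');
$allBad = preg_match('#^[^-\w &()+@.]+$#', '%%%');
var_dump($mixed, $allBad);
```

In the original `if`, this would be used as `if (!isValidUsername($input)) { ... }`, which also rejects an empty username.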
anubhava