13

I want to write a regex in PHP that matches only English letters, spaces, numbers, and all special characters.

Based on the question "Regex any ascii character", I tried this:

preg_match("/[\x00-\x7F]+/", $str);

but it throws a warning:

No ending delimiter '/' found 

So, how do I write this regex in PHP?

The alternative would be something like [a-z\d\s] plus listing every special character one by one, but isn't there a simpler way?

Thanks

dav
  • 1
    @VedantTerkar: No, but it might be a good idea to use single quotes since in double-quoted strings, `\xnn` character escapes will be interpreted. I don't know PHP, but maybe it's getting thrown off by an ASCII NUL character inside a string? – Tim Pietzcker Jul 23 '14 at 06:22
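The single-quote suggestion in the comment above can be sketched as follows. In a double-quoted PHP string, `\x00` is turned into a real NUL byte before the pattern ever reaches the regex engine, which is a plausible cause of the warning on the PHP version used in the question; in a single-quoted string the escape is left alone and PCRE interprets it. The helper name `isAscii` is hypothetical, and the `\A`/`\z` anchors are an addition so that the whole string is validated rather than a substring.

```php
<?php
// Double quotes: PHP itself converts the escape, so the string holds one NUL byte.
var_dump(strlen("\x00"));   // int(1)
// Single quotes: the four characters \ x 0 0 are kept and handed to PCRE.
var_dump(strlen('\x00'));   // int(4)

// Hypothetical helper: true when $str is non-empty and every byte is 0x00-0x7F.
function isAscii(string $str): bool
{
    // \A and \z anchor to the absolute start and end of the subject.
    return preg_match('/\A[\x00-\x7F]+\z/', $str) === 1;
}

var_dump(isAscii("hello world 123 %#*"));   // bool(true)
var_dump(isAscii("hello 单 456"));           // bool(false)
```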

2 Answers

31

There are a number of intricate solutions, but I would suggest this beautifully simple regex:

^[ -~]+$

It allows all printable characters in the ASCII table.

In PHP:

$regex = '%^[ -~]+$%';
if (preg_match($regex, $yourstring, $m)) {
    // $m[0] holds the whole matched string
    $thematch = $m[0];
} else {
    // no match...
}

Explanation

  • The ^ anchor asserts that we are at the beginning of the string
  • The character class [ -~] matches all characters between the space and the tilde (all the printable characters in the ASCII table)
  • The + quantifier means "match one or more of those"
  • The $ anchor asserts that we are at the end of the string
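One detail worth checking in practice: by default, PCRE's `$` also matches just before a single trailing newline, so a string such as "abc\n" passes the pattern above. A minimal sketch of a stricter check, assuming you want to reject that case, adds the `D` (PCRE_DOLLAR_ENDONLY) modifier. The helper name `isPrintableAscii` is hypothetical and not part of the original answer.

```php
<?php
// Hypothetical helper: true when $str is non-empty and consists only of
// printable ASCII characters (space through tilde).
function isPrintableAscii(string $str): bool
{
    // The D modifier makes $ match only at the very end of the subject.
    return preg_match('%^[ -~]+$%D', $str) === 1;
}

var_dump(preg_match('%^[ -~]+$%', "abc\n"));   // int(1): $ matches before the final newline
var_dump(isPrintableAscii("abc\n"));           // bool(false)
var_dump(isPrintableAscii("abc 123 ~!"));      // bool(true)
var_dump(isPrintableAscii("адъ"));             // bool(false)
```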
zx81
  • Added PHP code. I built this a while ago and have been using it; it works great, and I highly recommend it. :) – zx81 Jul 23 '14 at 06:24
  • Didn't mean to imply you needed that code... Sounds like you're a pro. :) It's there more for a newbie who might read it a year from now, always nice to complete the answer. Btw when I built that regex some time ago I realized there are cool variations, for instance `[!-~]` for all printable chars minus the space. – zx81 Jul 23 '14 at 06:28
  • come on bro !! :) no prob, I just commented, without any implications or smth. Sure, u r right, I am completely with u. That is cool stuff, tks for the additional info :) – dav Jul 23 '14 at 06:33
  • I tried it as soon as I found out it worked; a warning popped up saying u can accept in 6 mins, lemme try again ;) – dav Jul 23 '14 at 06:35
  • +1 for the use of `[ -~]`, not very widely used range that is. – anubhava Jul 23 '14 at 06:56
  • You're very much welcome @zx81 Glad to see another of the "good guys" on SO, *cheers!* – Funk Forty Niner Jul 23 '14 at 21:09
  • Great suggestion! There's a problem with the PHP code, though. You should either escape the `~` in the range or use another delimiter. – Matthias Samsel Mar 14 '16 at 16:51
  • Works great. When I used it with a textarea, newlines caused problems, though, so I had to add whitespace (\s) to the regex: /^[ -~\s]+$/ – tylik Feb 05 '19 at 14:57
3

PHP's regex engine (PCRE) comes with a set of POSIX character classes you can reference inside a bracket expression, [:ascii:] being one of them:

$test = "hello world 123 %#* 单 456";
preg_match_all("/[[:ascii:]]+/",$test,$matches);
print_r($matches);
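Note that `preg_match_all` with an unanchored pattern extracts the ASCII runs from a string; it does not tell you whether the whole string is ASCII. A minimal sketch of both uses, with the anchored variant added as an assumption about what the question is after (the variable name `$isAllAscii` is illustrative):

```php
<?php
$test = "hello world 123 %#* 单 456";

// Extraction: collect each run of ASCII bytes.
preg_match_all('/[[:ascii:]]+/', $test, $matches);
print_r($matches[0]);   // two runs: "hello world 123 %#* " and " 456"

// Validation: anchored, so the entire string must be ASCII.
$isAllAscii = preg_match('/\A[[:ascii:]]+\z/', $test) === 1;
var_dump($isAllAscii);  // bool(false)
```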
FuzzyTree
  • Thanks for the answer, but this does not work: e.g. try it with the string `asdfas==адъфасдфас==asdfasd`, which contains Russian chars. @zx81's solution works fine. tks – dav Jul 23 '14 at 06:29
  • @dav it seems to work for me `[0] => asdfas== [1] => ==asdfasd`, what is your expected result? http://ideone.com/giYFW1 – FuzzyTree Jul 23 '14 at 06:30
  • this is interesting, in my localhost I tried exactly this code, just copy pasted `$test = 'asdfas==адъфасдфас==asdfasd';; preg_match_all("/[[:ascii:]]+/",$test,$matches); print_r($matches);die;` and I get this result `Array ( [0] => Array ( [0] => asdfas==адъфасдфас==asdfasd ) )`, but in http://phpfiddle.org/ it outputs this `Array ( [0] => Array ( [0] => asdfas== [1] => ==asdfasd ) )` , did u try in fiddle or in localhost ? it looks like there is encoding problem in that fiddle. tks – dav Jul 23 '14 at 06:42
  • @dav i tried it on localhost and got the same results as your and my fiddles – FuzzyTree Jul 23 '14 at 06:49