1

I want to allow letters, numbers, and spaces (3 spaces maximum). How can I do this with a regular expression?

I will use the regular expression in PHP.

if(eregi("...HERE...", $_POST['txt_username']))
{
   //do something
}
user311509

4 Answers

4

How about this?

/^([^\W_]*\s){0,3}[^\W_]*$/

No catastrophic backtracking, since the ([^\W_]*\s) groups are clearly delimited.

Edit: Adopting tchrist's Unicode-friendly version: /^([\pN\pL\pM]*\s){0,3}[\pN\pL\pM]*$/
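
A minimal sketch of how both patterns might be exercised in PHP. The helper name `is_valid_username` and the sample inputs are made up for illustration, and the `u` modifier on the second pattern is an assumption added here: PCRE in PHP generally needs it to treat the subject as UTF-8.

```php
<?php
// Hypothetical helper (not from the answer): true when $name fully matches $pattern.
function is_valid_username(string $pattern, string $name): bool
{
    return preg_match($pattern, $name) === 1;
}

// ASCII pattern from the answer: up to 3 "word + whitespace" groups, then a final word.
$ascii = '/^([^\W_]*\s){0,3}[^\W_]*$/';

// Unicode variant from the answer; the trailing "u" modifier is an assumption.
$unicode = '/^([\pN\pL\pM]*\s){0,3}[\pN\pL\pM]*$/u';

var_dump(is_valid_username($ascii, 'john smith 42')); // 2 spaces: true
var_dump(is_valid_username($ascii, 'a b c d'));       // 3 spaces: true
var_dump(is_valid_username($ascii, 'a b c d e'));     // 4 spaces: false
var_dump(is_valid_username($unicode, 'maček 42'));    // non-ASCII letter: true
```

Note that `\s` also matches tabs and newlines, and that both patterns accept the empty string, so a separate length check may be worth adding if that matters.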

Justin Morgan - On strike
3

You could use:

if(preg_match('@^(\w*\s){0,3}\w*$@', $_POST['txt_username'])) {
    // do something
}

See it in action on: rubular.com


Note: \w includes the underscore (_). If you don't want it, you can use this instead:

if(preg_match('@^([^\W_]*\s){0,3}[^\W_]*$@', $_POST['txt_username'])) {
    // do something
}
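
To illustrate the underscore difference, here is a small sketch. The variable names and sample inputs are invented for the example; the two patterns are the ones shown above.

```php
<?php
// Both patterns are taken from the answer; the "@" delimiters are kept as written.
$with_underscore    = '@^(\w*\s){0,3}\w*$@';
$without_underscore = '@^([^\W_]*\s){0,3}[^\W_]*$@';

// Made-up sample inputs: one with an underscore, one with a space.
foreach (['john_smith', 'john smith'] as $name) {
    printf(
        "%-12s \\w: %s  [^\\W_]: %s\n",
        $name,
        preg_match($with_underscore, $name) ? 'match' : 'no match',
        preg_match($without_underscore, $name) ? 'match' : 'no match'
    );
}
```

Only the first pattern should accept `john_smith`; both should accept `john smith`.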

EDIT: Since the OP decided to accept my answer, I added Justin's improvements.

NullUserException
  • 1
    Good answer but I would specify that this includes an extra character (`_`), just in case. – Nicole Aug 25 '11 at 04:16
  • 1
    This will lead to catastrophic backtracking if the input has fewer than 3 spaces. The engine will not be able to decide where one `([^\W_]\s?)` group ends and another begins, so it will keep trying different permutations until it uses up a ton of system resources. A reasonably long input (not very long at all) will cause the system to hang. – Justin Morgan - On strike Aug 25 '11 at 04:29
  • @NullUserException - Actually yes. Try adding 4 spaces, or an invalid character at the end. If there are no spaces it's an easy match: the first `\w*` can match everything, and the rest is optional. It gets much slower when there isn't an obvious match, which is usually the case with catastrophic backtracking. – Kobi Aug 25 '11 at 04:33
  • I don’t know what all that `\W` monkeybusiness is. Letters are `\pL` and non-letters are `\PL`, numbers are `\pN` and non-numbers are `\PN`, and although spaces should be `\s` or `\h`, they are probably something like `[\s\xA0\x85\pZ]`. – tchrist Aug 25 '11 at 04:57
  • 2
    It is just plain silly and complicated and obfuscatory (and wrong, actually) to write `[^\W_]` when what you mean is simply `[\pL\pN]`, although you really ought to throw in `\pM` there too for reasons that you will hate. The 7 Unicode general category charclasses (`\pL`, `\pN`, `\pM`, `\pS`, `\pP`, `\pZ`, and `\pC`, plus their `\P` inverses) should be in every regex programmer’s arsenal. They are what you *should* turn to whenever you want specific types of character. And they are only one character longer than `\w` and such, which should work but, due to PHP being stupid, don’t. – tchrist Aug 25 '11 at 05:12
  • 2
    Moreover, the 7 charclasses work whether you have ASCII or Unicode, whereas due to PHP and PCRE conspiring to break Unicode regexes, the things like `\w` and `\s` and `\b` will not. Which do you prefer: something that always works no matter the data, or something that breaks or even worse has security holes depending on what data you feed it? Until they fix how the PCRE that PHP is linked against is compiled and optioned, you have to use the 7 classes, not the shortcuts, if you care about reliability, and write once and run anywhere. Otherwise, you lose. Make the 7 charclasses automatic grabs. – tchrist Aug 25 '11 at 05:17
  • @tchrist Noted. There was a time when I thought Unicode was the greatest thing since sliced bread, but I changed my mind after working with it. And apparently my team at work spent years converting code to make it work with Unicode... And the result is parts of the software now run significantly slower. – NullUserException Aug 25 '11 at 05:28
  • @Kobi - Thanks for that. I'm usually okay at spotting catastrophic backtracking, but I haven't got the hang of predicting how to trigger it. – Justin Morgan - On strike Aug 25 '11 at 17:02
1
  • If you don't want consecutive spaces or spaces at the edges, you can try:

    preg_match("#^\w+(?: \w+){0,3}$#", "123 4 5", $matches);
    if($matches)
       print_r(":-)");
    
  • If you don't care about consecutive spaces, a similar option is ^\w*(?: \w*){0,3}$

  • Or, a more modern approach, with a lookahead (which is good for adding more constraints):
    ^(?![^ ]*(?: [^ ]*){4})[\w ]*$

Either way, note that \w includes underscores; you may want to replace it with something more suitable, for example [a-zA-Z\d] or the Unicode-aware [\p{L}\p{N}].
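
The differences between the three patterns in the list above can be seen with a short sketch like the following. The `check` helper, the array keys, and the sample inputs are all hypothetical names chosen for the example.

```php
<?php
// The three patterns from the list above, wrapped in "/" delimiters.
$patterns = [
    'strict'    => '/^\w+(?: \w+){0,3}$/',             // no leading, trailing, or doubled spaces
    'loose'     => '/^\w*(?: \w*){0,3}$/',             // spaces may sit anywhere
    'lookahead' => '/^(?![^ ]*(?: [^ ]*){4})[\w ]*$/', // fails once a 4th space is found
];

// Hypothetical helper: runs one named pattern against $input.
function check(array $patterns, string $key, string $input): bool
{
    return preg_match($patterns[$key], $input) === 1;
}

// Made-up inputs: plain, doubled space, leading space, four spaces.
$inputs = ['123 4 5', 'a  b', ' a', 'a b c d e'];
foreach ($inputs as $input) {
    foreach (array_keys($patterns) as $key) {
        $result = check($patterns, $key, $input) ? 'match' : 'no match';
        printf("%-9s %-11s %s\n", $key, "'$input'", $result);
    }
}
```

Only the strict pattern should reject the doubled and leading spaces, while all three should reject the input with four spaces.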

Kobi
  • Hmm... Haven't been downvoted in a while now. Any comment on why? I'm always happy to improve my answers. – Kobi Aug 25 '11 at 04:58
  • 1
    In a [Unicode regex engine](http://unicode.org/reports/tr18/#Compatibility_Properties), `\w` is `[\p{Alphabetic=True}\p{GC=Decimal_Number}\p{GC=Letter_Number}\p{GC=Mark}\p{GC=Connector_Punctuation}]`. However, getting PHP to work that way is a real pain, because you have to build PCRE the right way, and nobody ever does. Probably the best you can do is `[\pL\pM\p{Nd}\p{Nl}\p{Pc}]`, even though you should never have to write all that just to mean `\w`. – tchrist Aug 25 '11 at 05:04
  • Well, I think I was pretty close `:)`. You got me on spaces though. Thanks! – Kobi Aug 25 '11 at 05:10
  • @tchrist A lil' grumpy today, huh? – NullUserException Aug 25 '11 at 05:10
  • @Null - Try this one, it's an excellent read: [Why does modern Perl avoid UTF-8 by default?](http://stackoverflow.com/questions/6162484/why-does-modern-perl-avoid-utf-8-by-default/6163129#6163129) – Kobi Aug 25 '11 at 05:11
0

I would do something like this:

if( count(explode(' ', $_POST['txt_username'])) <= 4 && preg_match('/^[\w ]+$/', $_POST['txt_username']) ){
  // do something
}

It's possible you could handle this all with a regular expression, but the solution would be overly complex. This should achieve the same result.


Example:

$names = array(
    "hello world",         //valid
    "hello wo rl d",       //valid
    "h e l l o w o r l d", //invalid
    "hello123@@"           //invalid
);

function name_is_valid($name){
    // explode() on 3 spaces yields 4 pieces, so 4 is the upper bound
    return count(explode(' ', $name)) <= 4 && preg_match('/^[\w ]+$/', $name);
}

foreach($names as $n){
    echo sprintf("%s is %s\n", $n, name_is_valid($n)?"valid":"invalid");
}

/*
hello world is valid
hello wo rl d is valid
h e l l o w o r l d is invalid
hello123@@ is invalid
*/

see it here on tehplayground.com
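
As a variation on the same idea, the space count could also be read directly with `substr_count` instead of splitting the string. This is a sketch, not the answer's code: the function name `name_has_at_most_three_spaces` is hypothetical, and the character class is narrowed to ASCII letters and digits so that underscores are rejected.

```php
<?php
// Hypothetical variant of name_is_valid(): count spaces directly, then check the characters.
function name_has_at_most_three_spaces(string $name): bool
{
    // substr_count() returns how many times ' ' occurs in $name.
    if (substr_count($name, ' ') > 3) {
        return false;
    }
    // Letters, digits, and spaces only; the D modifier keeps "$" from
    // matching before a trailing newline.
    return preg_match('/^[a-zA-Z0-9 ]+$/D', $name) === 1;
}

var_dump(name_has_at_most_three_spaces('hello wo rl d')); // 3 spaces: true
var_dump(name_has_at_most_three_spaces('a b c d e'));     // 4 spaces: false
```

Counting spaces directly avoids the off-by-one that is easy to make when comparing against the number of pieces returned by `explode`.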

maček