I need a regular expression that allows only a to z and 0 to 9. I came across the function below on this site, but it lets a few symbols through (#, . and -). How should it be written if it has to allow only a to z (both upper and lower case) and 0 to 9? I'm reluctant to edit it since I know nothing about regular expressions.

Also, is this regular expression a good way to check for a to z and 0 to 9, or can it be improved?

function isValid($str) {
    return !preg_match('/[^A-Za-z0-9.#\\-$]/', $str);
}

Thanks

Jocelyn
Norman

3 Answers

The following seems to be what you need in this case:

function isValid($str) {
    return !preg_match('/[^A-Za-z0-9]/', $str);
}

The […] regex construct is called a character class. Something like [aeiou] matches any one of the vowels.

The [^…] is a negated character class, so [^aeiou] matches one of anything but the vowels (which includes consonants, digits, symbols, etc).

The -, depending on where/how it appears in a character class definition, is a range definition, so 0-9 is the same as 0123456789.
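
As a quick illustration of these three constructs, here is a small sketch using preg_match directly. The sample strings are made up for the example, and the last case shows one situation (a hyphen placed last in the class) where - is taken literally rather than as a range.

```php
<?php
// [aeiou] matches any single vowel.
var_dump(preg_match('/[aeiou]/', 'sky'));   // int(0): no vowel present
var_dump(preg_match('/[aeiou]/', 'sea'));   // int(1)

// [^aeiou] matches any single character that is not a vowel.
var_dump(preg_match('/[^aeiou]/', 'aei'));  // int(0): only vowels present
var_dump(preg_match('/[^aeiou]/', 'a1'));   // int(1): '1' is not a vowel

// 0-9 is a range, equivalent to listing every digit.
var_dump(preg_match('/^[0-9]$/', '7') === preg_match('/^[0123456789]$/', '7')); // bool(true)

// A hyphen placed last in the class is a literal '-', not a range.
var_dump(preg_match('/[a-]/', '-'));        // int(1)
```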

Thus, the regex [^A-Za-z0-9] actually matches a character that's neither a letter nor a digit. This is why the result of preg_match is negated with !.

That is, the logic of the above method uses double negation:

isValid = it's not the case that
              there's something other than a letter or a digit
                  anywhere in the string
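
The double negation can be seen with a few sample inputs. This is only a sketch: the name isValidNegated is hypothetical, and it compares against 0 explicitly so that a preg_match error (which returns false) is not mistaken for a valid string.

```php
<?php
// Hypothetical helper name; mirrors the negated-class approach described above.
function isValidNegated(string $str): bool
{
    // preg_match returns 1 if an offending character is found, 0 if none is.
    return preg_match('/[^A-Za-z0-9]/', $str) === 0;
}

var_dump(isValidNegated('abc123'));   // bool(true)
var_dump(isValidNegated('abc-123'));  // bool(false): '-' is neither a letter nor a digit
var_dump(isValidNegated(''));         // bool(true): nothing offending was found
```

Note that the empty string passes, because there is no character in it to object to.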

You can alternatively get rid of the double negation and use something like this:

function isValid($str) {
    return preg_match('/^[A-Za-z0-9]*$/', $str);
}

Now there's no negation. The ^ and $ are the beginning-of-string and end-of-string anchors, and * is the zero-or-more repetition metacharacter. Now the logic is simply:

isValid = the entire string from beginning to end
              is a sequence of letters and digits
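
One caveat worth knowing with the anchored form: in PHP's PCRE, $ also matches just before a trailing newline unless the D modifier is used, so the two approaches are not strictly identical. The sketch below compares $ with the stricter \A and \z anchors; both function names are hypothetical.

```php
<?php
// Hypothetical helper names, for illustration only.
function isValidDollar(string $str): bool
{
    return preg_match('/^[A-Za-z0-9]*$/', $str) === 1;
}

function isValidStrict(string $str): bool
{
    // \A and \z anchor to the absolute start and end of the subject.
    return preg_match('/\A[A-Za-z0-9]*\z/', $str) === 1;
}

var_dump(isValidDollar("abc123"));    // bool(true)
var_dump(isValidDollar("abc123\n"));  // bool(true): $ also matches before a final newline
var_dump(isValidStrict("abc123\n"));  // bool(false)
var_dump(isValidStrict("abc 123"));   // bool(false): the space is not allowed
```

If a trailing newline must be rejected, \z (or the D modifier) is probably the safer choice.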

Non-regex alternative

Some languages have standard functions/idiomatic ways to validate that a string consists of only alphanumeric characters (among other possible string "types").

In PHP, for example, you can use ctype_alnum.

bool ctype_alnum ( string $text )

Checks if all of the characters in the provided string, text, are alphanumeric.

API links

  • PHP Ctype Functions - list of entire family of ctype functions
    • ctype_alpha, digit, lower, upper, space, etc
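
A minimal sketch of ctype_alnum in use, assuming the ctype extension is available (it is enabled by default) and that the input is already a string. The wrapper name isAlnum is hypothetical.

```php
<?php
// Hypothetical wrapper; ctype_alnum expects a string argument.
function isAlnum(string $str): bool
{
    return ctype_alnum($str);
}

var_dump(isAlnum('abc123'));   // bool(true)
var_dump(isAlnum('ABC'));      // bool(true)
var_dump(isAlnum('abc-123'));  // bool(false)
var_dump(isAlnum(''));         // bool(false): an empty string is rejected
```

Unlike the regex versions that use *, ctype_alnum returns false for an empty string, which may or may not be what you want.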
polygenelubricants
  • Thanks for the correction and also for pointing me to ctype_alnum. I never knew php had a function like that. I think i'll use that from now on. – Norman Jul 31 '10 at 17:02
  • +1 You are an expert at regex, would you please check the regex here in my answer: http://stackoverflow.com/questions/3381331/jquery-convert-br-and-br-and-p-and-such-to-new-line/3381470#3381470 – Sarfraz Aug 01 '10 at 10:46
Whilst I have nothing against regular expressions, with such a simple pattern you should probably consider using

if (ctype_alnum($input)) {
    // $input consists only of letters and digits
}

http://uk3.php.net/manual/en/function.ctype-alnum.php

Peter O'Callaghan
  • Just discovered this function here :-) It makes things real easy. Will be using this from now on. – Norman Jul 31 '10 at 17:05
You can match z and 0-9 with [Zz0-9] and you can match a-z and 0-9 with [a-z0-9]. If you want both upper and lower case then you would use [A-Za-z0-9].

See regular expression character classes for more on this.

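Further, the !preg_match() isn't really necessary. Instead you could use a positive match on what you want, such as return preg_match('/^[A-Za-z0-9]+$/', $str); (note that + requires at least one character, so an empty string is rejected). The one you have uses a negated character class, which matches any character not listed within the brackets; negating the result of preg_match then rejects any string that contains such a character. I may be misunderstanding your purpose, though.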
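
To make the difference between + and * concrete, here is a small sketch. The function name isValidNonEmpty is hypothetical, and the === 1 comparison is added so the function returns a real boolean.

```php
<?php
// Hypothetical helper using the positive match with +.
function isValidNonEmpty(string $str): bool
{
    return preg_match('/^[A-Za-z0-9]+$/', $str) === 1;
}

var_dump(isValidNonEmpty('abc123'));  // bool(true)
var_dump(isValidNonEmpty('abc#123')); // bool(false)
var_dump(isValidNonEmpty(''));        // bool(false): + needs at least one character

// With * instead of +, the empty string is accepted.
var_dump(preg_match('/^[A-Za-z0-9]*$/', '') === 1); // bool(true)
```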

eldarerathis