In my ASP.NET page, I have an input box that has to have the following validation on it:

Must be alphanumeric, with at least one letter (i.e. can't be ALL numbers).

informatik01
mrblah

6 Answers

69
^\d*[a-zA-Z][a-zA-Z0-9]*$

Basically this means:

  • Zero or more ASCII digits;
  • One alphabetic ASCII character;
  • Zero or more alphanumeric ASCII characters.

Try a few tests and you'll see that it passes any ASCII alphanumeric string that contains at least one letter.

The key is the \d* at the front; without it, the regex is much more awkward to write.
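
Those tests can be run with a short script. The sketch below uses Python's `re` module rather than the .NET engine, which is an assumption on my part; `re.ASCII` is passed so that `\d` is limited to 0-9 as described above, and the function name `is_valid` is hypothetical.

```python
import re

# Pattern from the answer; re.ASCII restricts \d to the digits 0-9.
PATTERN = re.compile(r"^\d*[a-zA-Z][a-zA-Z0-9]*$", re.ASCII)


def is_valid(text: str) -> bool:
    """Return True for ASCII alphanumeric text with at least one letter."""
    # fullmatch also rejects a trailing newline, which a bare $ would allow.
    return PATTERN.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["abc", "123abc", "12a34", "123", "", "ab-1"]:
        print(repr(sample), is_valid(sample))
```

The first three samples should be accepted and the last three rejected.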

tchrist
cletus
  • "One alphanumeric character" should read "One alphabetic character" or similar: that part of the regex does not include digits. –  Jun 27 '09 at 02:40
  • @John: not only clever, but efficient! The \d* avoids potential O(N**2) backtracking ... I think. – Stephen C Jul 29 '09 at 05:09
  • Great solution, cletus. You can make the regex a little shorter if you use the case-insensitive flag, like so: /^\d*[a-z][a-z0-9]*$/i – pr1001 Aug 08 '09 at 19:46
  • How are you people able to understand it? Please suggest how I can learn it :( – rolling stone Mar 26 '15 at 07:15
  • I found this page helpful for understanding regex: https://www.ntu.edu.sg/home/ehchua/programming/howto/Regexe.html. To test a regex, use a free online tester: https://www.freeformatter.com/regex-tester.html – Andres R Apr 24 '20 at 16:28
25

Most answers to this question are correct, but there's an alternative that (in some cases) offers more flexibility if you want to change the rules later on:

^(?=.*[a-zA-Z].*)([a-zA-Z0-9]+)$

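This will match any sequence of alphanumeric characters, but only if the look-ahead (?=.*[a-zA-Z].*) also succeeds, i.e. the string contains at least one letter somewhere. The look-ahead checks that condition without consuming any characters, so the main group still has to match the whole string. It's a little-known trick in regular expressions that allows you to handle some very difficult validation problems.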

For example, say you need to add another constraint: the string should be between 6 and 12 characters long. The obvious solutions posted here wouldn't work, but using the look-ahead trick, the regex simply becomes:

^(?=.*[a-zA-Z].*)([a-zA-Z0-9]{6,12})$
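
As a rough check of the length-limited pattern, here is a minimal sketch in Python's `re` module (an assumption; the question targets .NET, whose look-ahead syntax is the same for this pattern). The function name `is_valid_length_limited` is hypothetical.

```python
import re

# The look-ahead asserts that a letter exists somewhere in the string;
# the capturing group then enforces the character set and the 6-12 length.
PATTERN = re.compile(r"^(?=.*[a-zA-Z].*)([a-zA-Z0-9]{6,12})$")


def is_valid_length_limited(text: str) -> bool:
    """Return True for 6-12 ASCII alphanumerics with at least one letter."""
    return PATTERN.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["abc123", "12345a", "123456", "abc12", "a" * 13]:
        print(repr(sample), is_valid_length_limited(sample))
```

Because the two rules live in separate parts of the pattern, either one can be changed without rewriting the other.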
Philippe Leybaert
  • What purpose does the second .* have? – Peter Boughton Jun 28 '09 at 19:40
  • None :-) It can be safely omitted. – Philippe Leybaert Jun 28 '09 at 19:58
  • I know that this is a very old post but this saved me a lot of time and trouble today. NOTE: in the second pattern, the second * IS required if you want it to correctly enforce. Thanks again. – David L Feb 08 '13 at 21:39
  • Note that the second set of parentheses is unneeded for both expressions. In any case, for the second regex above, I salute you most excellent sir! – ErikE Jun 19 '14 at 00:24
  • That's sick. Thanks, fixed my problem ;-) – Gordon Thompson Sep 15 '16 at 17:08
  • Thanks, this helped me when I had a regex consisting of multiple optional blocks but I still wanted to ensure there was at least one digit being tested. e.g. `^\$?(\d|,\d{3})*(\.\d{1,2})?$` > `^(?=.*[0-9].*)\$?(\d|,\d{3})*(\.\d{1,2})?$` – Tama Jul 26 '19 at 22:06
  • Thanks. For a stricter version of the alphanumeric regex, I added a lookahead at the beginning of your regex: (?=.*[0-9].*). Now a string of letters only must also contain at least one number, and vice versa, to be a valid alphanumeric word. – Abhinav Saxena Sep 21 '20 at 13:03
5
^[\p{L}\p{N}]*\p{L}[\p{L}\p{N}]*$

Explanation:

  • [\p{L}\p{N}]* matches zero or more Unicode letters or numbers
  • \p{L} matches one letter
  • [\p{L}\p{N}]* matches zero or more Unicode letters or numbers
  • ^ and $ anchor the string, ensuring the regex matches the entire string. You may be able to omit these, depending on which regex matching function you call.

Result: you can have any alphanumeric string except there's got to be a letter in there somewhere.

\p{L} is similar to [A-Za-z] except it will include all letters from all alphabets, with or without accents and diacritical marks. It is much more inclusive, using a larger set of Unicode characters. If you don't want that flexibility substitute [A-Za-z]. A similar remark applies to \p{N} which could be replaced by [0-9] if you want to keep it simple. See the MSDN page on character classes for more information.

The less fancy non-Unicode version would be

^[A-Za-z0-9]*[A-Za-z][A-Za-z0-9]*$
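
Python's built-in `re` module has no `\p{L}` or `\p{N}`, so the sketch below swaps in a different technique: it checks Unicode general categories with `unicodedata.category` to approximate what the Unicode pattern accepts. The function name is hypothetical, and the Unicode tables in Python and .NET may differ by version, so treat this as an illustration rather than an exact equivalent.

```python
import unicodedata


def is_unicode_alnum_with_letter(text: str) -> bool:
    """Approximate the Unicode pattern: only letters/numbers, at least one letter."""
    # General categories starting with "L" are letters, "N" are numbers.
    categories = [unicodedata.category(ch) for ch in text]
    only_letters_and_numbers = all(cat[0] in "LN" for cat in categories)
    has_letter = any(cat[0] == "L" for cat in categories)
    return only_letters_and_numbers and has_letter


if __name__ == "__main__":
    for sample in ["été2009", "a①", "①②", "123", "a_b", ""]:
        print(repr(sample), is_unicode_alnum_with_letter(sample))
```

Note that "a①" is accepted, because ① falls in a number category.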
John Kugelman
  • \p{N} would match Unicode numbers, such as ① ②. This may come as a surprise for some. – J-16 SDiZ Jun 27 '09 at 02:40
  • Turning the first part, [\p{L}\p{N}]*, into \p{N}* simplifies the explanation and prevents some back tracking. –  Jun 27 '09 at 02:43
2
^[0-9]*[A-Za-z][0-9A-Za-z]*$

is the regex that will do what you're after. The ^ and $ anchor the match to the start and end of the string, so no other characters can sneak in. You could replace the [0-9A-Za-z] block with \w (which also admits the underscore), but I prefer the more verbose form because it's easier to extend with other characters if you want.

Add a regular expression validator to your ASP.NET page as per the tutorial on MSDN: http://msdn.microsoft.com/en-us/library/ms998267.aspx.

Dexter
  • this could contain all numbers, no? – akf Jun 27 '09 at 02:31
  • missed the extra bit about needing at least one character - i've edited to force at least one character – Dexter Jun 27 '09 at 02:34
  • If you're not capturing, you don't need () except for grouping, and [] makes a fine group. A-z is more than just letters. You probably meant [0-9A-Za-z]*[A-Za-z][0-9A-Za-z]* –  Jun 27 '09 at 02:38
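
The remark that A-z is more than just letters can be checked quickly. This is a minimal sketch in Python (an assumption; ASCII ranges behave the same way in .NET) that lists the characters the range A-z accepts beyond A-Za-z. The variable name `extras` is hypothetical.

```python
import re

# Walk the ASCII codes from "A" to "z" and keep the characters that
# [A-z] accepts but [A-Za-z] does not.
extras = [
    chr(code)
    for code in range(ord("A"), ord("z") + 1)
    if re.fullmatch(r"[A-z]", chr(code))
    and not re.fullmatch(r"[A-Za-z]", chr(code))
]

print(extras)  # ['[', '\\', ']', '^', '_', '`']
```

These six characters sit between "Z" and "a" in ASCII, which is why the range picks them up.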
1
^\w*[\p{L}]\w*$

This one's not that hard. The regular expression reads: match a line that starts with any number of word characters (letters, digits, and connector punctuation such as the underscore, which you might not want), contains one letter (that's the [\p{L}] part in the middle), and ends with any number of word characters again.

If you want to exclude punctuation, you'll need a heftier expression:

^[\p{L}\p{N}]*[\p{L}][\p{L}\p{N}]*$

And if you don't care about Unicode you can use a boring expression:

^[A-Za-z0-9]*[A-Za-z][A-Za-z0-9]*$
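
To see why \w may be too permissive, here is a small comparison sketched in Python (an assumption; the answer targets .NET). Python's `re` has no \p{L}, so [A-Za-z] stands in for the letter class in the first pattern; the names `LOOSE` and `STRICT` are hypothetical.

```python
import re

# \w-based pattern, with [A-Za-z] standing in for \p{L}.
LOOSE = re.compile(r"^\w*[A-Za-z]\w*$")
# The explicit ASCII pattern from the answer.
STRICT = re.compile(r"^[A-Za-z0-9]*[A-Za-z][A-Za-z0-9]*$")

if __name__ == "__main__":
    for sample in ["user1", "user_1", "123"]:
        loose_ok = LOOSE.fullmatch(sample) is not None
        strict_ok = STRICT.fullmatch(sample) is not None
        print(repr(sample), "loose:", loose_ok, "strict:", strict_ok)
```

Only the \w-based pattern accepts "user_1", since the underscore counts as a word character.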
Welbog
0
^[0-9]*[a-zA-Z][a-zA-Z0-9]*$

Can be

  • any number followed by a single letter,
  • an alphanumeric expression that starts with a letter, or
  • an alphanumeric expression that starts with a number, followed by a letter, and ends with an alphanumeric subexpression
eKek0