Regex for alphanumeric, but at least one letter

Question

In my ASP.NET page, I have an input box that has to have the following validation on it:

Must be alphanumeric, with at least one letter (i.e. can't be ALL numbers).

score 69 · Accepted Answer · edited Apr 04 '15 at 15:51

^\d*[a-zA-Z][a-zA-Z0-9]*$

Basically this means:

Zero or more ASCII digits;
One alphabetic ASCII character;
Zero or more alphanumeric ASCII characters.

Try a few tests and you'll see this matches any ASCII alphanumeric string that contains at least one letter.

The key to this is the \d* at the front. Without it the regex gets much more awkward to write.
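
As a rough way to "try a few tests", here is a minimal Python sketch of the pattern above. The helper name is_valid and the sample strings are made up for illustration. It assumes ASCII-only input is wanted, so it passes re.ASCII to keep \d limited to 0-9, and it uses fullmatch instead of ^ and $ because Python's $ also matches just before a trailing newline.

```python
import re

# Pattern from the answer, without the anchors (fullmatch anchors both ends).
# re.ASCII restricts \d to 0-9; without it Python 3 also accepts other Unicode digits.
PATTERN = re.compile(r"\d*[a-zA-Z][a-zA-Z0-9]*", re.ASCII)


def is_valid(value):
    """Return True if value is ASCII alphanumeric and contains at least one letter."""
    return PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    # Illustrative inputs only; swap in whatever your form needs to accept.
    for sample in ["abc", "123abc", "12a34", "123", "", "ab-12"]:
        print(repr(sample), is_valid(sample))
```

The same pattern string should carry over to an ASP.NET validator, but the flags and anchoring behaviour shown here are specific to Python.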

edited Apr 04 '15 at 15:51

tchrist

answered Jun 27 '09 at 02:33

cletus

"One alphanumeric character" should read "One alphabetic character" or similar: that part of the regex does not include digits. – Jun 27 '09 at 02:40
@John: not only clever, but efficient! The \d* avoids potential O(N**2) backtracking ... I think. – Stephen C Jul 29 '09 at 05:09
Great solution, cletus. You can make the regex a little shorter if you use the case insensitive flag like so: /^\d*[a-z][a-z0-9]*$/i – pr1001 Aug 08 '09 at 19:46
How are you people able to understand it? Please suggest how I can learn it :( – rolling stone Mar 26 '15 at 07:15
I found this page to help understand Regex https://www.ntu.edu.sg/home/ehchua/programming/howto/Regexe.html to test the regex use a free online tester https://www.freeformatter.com/regex-tester.html – Andres R Apr 24 '20 at 16:28

score 25 · Answer 2 · answered Jun 28 '09 at 19:35

Most answers to this question are correct, but there's an alternative, that (in some cases) offers more flexibility if you want to change the rules later on:

^(?=.*[a-zA-Z].*)([a-zA-Z0-9]+)$

This will match any sequence of alphanumeric characters, but only if the look-ahead (?=.*[a-zA-Z].*) also succeeds, meaning the sequence contains at least one letter. A look-ahead checks a condition without consuming any characters. It's a little-known trick in regular expressions that allows you to handle some very difficult validation problems.

For example, say you need to add another constraint: the string should be between 6 and 12 characters long. The obvious solutions posted here wouldn't work, but using the look-ahead trick, the regex simply becomes:

^(?=.*[a-zA-Z].*)([a-zA-Z0-9]{6,12})$
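
A small Python sketch of the look-ahead version with the length constraint, using the pattern exactly as written above. The function name is_valid_with_length and the test strings are assumptions for illustration. fullmatch is used so that a trailing newline is not silently accepted by $.

```python
import re

# Look-ahead asserts "at least one letter somewhere";
# the main group then enforces "6 to 12 alphanumeric characters".
LETTER_AND_LENGTH = re.compile(r"^(?=.*[a-zA-Z].*)([a-zA-Z0-9]{6,12})$")


def is_valid_with_length(value):
    """Return True for 6-12 ASCII alphanumerics that include at least one letter."""
    return LETTER_AND_LENGTH.fullmatch(value) is not None


if __name__ == "__main__":
    # Illustrative inputs only.
    for sample in ["abc123", "12345a", "123456", "abc12", "a" + "1" * 12]:
        print(repr(sample), is_valid_with_length(sample))
```

Because the two rules live in separate parts of the pattern, changing the length bounds only touches the {6,12} quantifier and leaves the "at least one letter" check alone.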

Philippe Leybaert

What purpose does the second .* have? – Peter Boughton Jun 28 '09 at 19:40
None :-) It can be safely omitted. – Philippe Leybaert Jun 28 '09 at 19:58
I know that this is a very old post but this saved me a lot of time and trouble today. NOTE: in the second pattern, the second * IS required if you want it to correctly enforce. Thanks again. – David L Feb 08 '13 at 21:39
Note that the second set of parentheses is unneeded for both expressions. In any case, for the second regex above, I salute you most excellent sir! – ErikE Jun 19 '14 at 00:24
That's sick. Thanks, fixed my problem ;-) – Gordon Thompson Sep 15 '16 at 17:08
Thanks, this helped me when I had a regex consisting of multiple optional blocks but I still wanted to ensure there was at least one digit being tested. e.g. `^\$?(\d|,\d{3})*(\.\d{1,2})?$` > `^(?=.*[0-9].*)\$?(\d|,\d{3})*(\.\d{1,2})?$` – Tama Jul 26 '19 at 22:06
Thanks. For a stricter version of the alphanumeric regex, I added a lookahead at the beginning of your regex: (?=.*[0-9].*). Now a sequence of letters only is rejected: there has to be at least one number, and vice versa, for it to be a valid alphanumeric word. – Abhinav Saxena Sep 21 '20 at 13:03

John Kugelman · Answer 3 · 2009-06-27T02:41:34.163

^[\p{L}\p{N}]*\p{L}[\p{L}\p{N}]*$

Explanation:

[\p{L}\p{N}]* matches zero or more Unicode letters or numbers
\p{L} matches one letter
[\p{L}\p{N}]* matches zero or more Unicode letters or numbers
^ and $ anchor the string, ensuring the regex matches the entire string. You may be able to omit these, depending on which regex matching function you call.

Result: you can have any alphanumeric string except there's got to be a letter in there somewhere.

\p{L} is similar to [A-Za-z] except it will include all letters from all alphabets, with or without accents and diacritical marks. It is much more inclusive, using a larger set of Unicode characters. If you don't want that flexibility substitute [A-Za-z]. A similar remark applies to \p{N} which could be replaced by [0-9] if you want to keep it simple. See the MSDN page on character classes for more information.

The less fancy non-Unicode version would be

^[A-Za-z0-9]*[A-Za-z][A-Za-z0-9]*$
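
For readers outside .NET: Python's standard re module has no \p{L} or \p{N} classes (the third-party regex package does). The sketch below is therefore not the regex itself but a swapped-in equivalent that applies the same rule with unicodedata.category, treating general categories starting with "L" as letters and "N" as numbers. All function names are hypothetical.

```python
import unicodedata


def is_letter(ch):
    """True if ch is in a Unicode letter category (Lu, Ll, Lt, Lm, Lo)."""
    return unicodedata.category(ch).startswith("L")


def is_number(ch):
    """True if ch is in a Unicode number category (Nd, Nl, No)."""
    return unicodedata.category(ch).startswith("N")


def is_unicode_alnum_with_letter(value):
    """Every character is a letter or number, and at least one is a letter."""
    only_alnum = all(is_letter(ch) or is_number(ch) for ch in value)
    has_letter = any(is_letter(ch) for ch in value)
    return only_alnum and has_letter


if __name__ == "__main__":
    # "\u00e9" is an accented e; "\u2460" is the circled digit one.
    for sample in ["\u00e9t\u00e9", "\u2460a", "\u2460\u2461", "123", "a_b"]:
        print(repr(sample), is_unicode_alnum_with_letter(sample))
```

Circled digits count as numbers here, so a string made only of them is rejected for lacking a letter but is not rejected for containing non-alphanumeric characters.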

\p{N} would match Unicode numbers, such as ① ② . This may come as surprise for some. — J-16 SDiZ, Jun 27 '09 at 02:40
Turning the first part, [\p{L}\p{N}]*, into \p{N}* simplifies the explanation and prevents some back tracking. — , Jun 27 '09 at 02:43

Dexter · Answer 4 · 2009-06-27T02:46:23.733

score 2

^[0-9]*[A-Za-z][0-9A-Za-z]*$

is the regex that will do what you're after. The ^ and $ match the start and end of the string to prevent other characters. You could replace the [0-9A-Za-z] block with \w (which also matches the underscore), but I prefer the more verbose form because it's easier to extend with other characters if you want.

Add a regular expression validator to your ASP.NET page as per the tutorial on MSDN: http://msdn.microsoft.com/en-us/library/ms998267.aspx.

edited Jun 27 '09 at 02:46

answered Jun 27 '09 at 02:29

Dexter

this could contain all numbers, no? – akf Jun 27 '09 at 02:31
missed the extra bit about needing at least one character - i've edited to force at least one character – Dexter Jun 27 '09 at 02:34
If you're not capturing, you don't need () except for grouping, and [] makes a fine group. A-z is more than just letters. You probably meant [0-9A-Za-z]*[A-Za-z][0-9A-Za-z]* – Jun 27 '09 at 02:38
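
To illustrate the remark that A-z is more than just letters, this Python sketch lists the ASCII characters that [A-z] accepts but [A-Za-z] does not. The helper ascii_matches is made up for this example.

```python
import re


def ascii_matches(char_class):
    """Return the ASCII characters accepted by the given character class."""
    pattern = re.compile(char_class)
    return [chr(code) for code in range(128) if pattern.fullmatch(chr(code))]


# Characters that slip in between 'Z' and 'a' in the ASCII table.
extra = sorted(set(ascii_matches("[A-z]")) - set(ascii_matches("[A-Za-z]")))

if __name__ == "__main__":
    print(extra)
```

The extras are the six punctuation characters that sit between the upper-case and lower-case letters in ASCII, which is why the two explicit ranges are the safer spelling.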

Welbog · Answer 5 · 2009-06-27T02:37:07.730

score 1

^\w*[\p{L}]\w*$

This one's not that hard. The regular expression reads: match a line starting with any number of word characters (letters, digits, and the underscore, which you might not want), followed by one letter (that's the [\p{L}] part in the middle), followed by any number of word characters again.

If you want to exclude punctuation, you'll need a heftier expression:

^[\p{L}\p{N}]*[\p{L}][\p{L}\p{N}]*$

And if you don't care about Unicode you can use a boring expression:

^[A-Za-z0-9]*[A-Za-z][A-Za-z0-9]*$
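
A quick Python comparison of the \w-based form against the explicit ASCII form. Since Python's re module lacks \p{L}, the sketch substitutes [A-Za-z] for the letter in the middle, which is an assumption made only to keep the example runnable. The names WORD_BASED, EXPLICIT and accepts are hypothetical.

```python
import re

# Stand-in for ^\w*[\p{L}]\w*$ with an ASCII letter in the middle.
WORD_BASED = re.compile(r"\w*[A-Za-z]\w*")
# The "boring" ASCII-only expression from the answer, minus the anchors.
EXPLICIT = re.compile(r"[A-Za-z0-9]*[A-Za-z][A-Za-z0-9]*")


def accepts(pattern, value):
    """True if the pattern matches the whole value."""
    return pattern.fullmatch(value) is not None


if __name__ == "__main__":
    # Illustrative inputs only.
    for sample in ["user1", "user_1", "user-1", "123"]:
        print(repr(sample), accepts(WORD_BASED, sample), accepts(EXPLICIT, sample))
```

The only difference between the two on these inputs is the underscore, which \w lets through and the explicit class does not.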

edited Jun 27 '09 at 02:37

answered Jun 27 '09 at 02:31

Welbog

@R. Pate: Thanks. It wouldn't be code if it didn't have bugs. – Welbog Jun 27 '09 at 02:35

score 0 · Answer 6 · answered Jun 27 '09 at 02:52

^[0-9]*[a-zA-Z][a-zA-Z0-9]*$

It can match:

a run of digits ending with a letter,
or an alphanumeric expression starting with a letter,
or an alphanumeric expression starting with a number, followed by a letter and ending with an alphanumeric subexpression

eKek0

Isn't this the same as the top answer? Just with \d replaced with [0-9]. – Evan Fosmark Jun 27 '09 at 18:29
Well, yes. I didn't see that at the time of posting – eKek0 Jun 28 '09 at 00:46
