Regex not beginning with number

Question

How do I create a regex that matches all alphanumerics without a number at the beginning?

Right now I have "^[0-9][a-zA-Z0-9_]"

For example, 1ab would not match, ab1 would match, 1_bc would not match, bc_1 would match.

Do you mean starting from the start of the line? – Tommy Andersen Oct 27 '14 at 20:32

abarnert · Accepted Answer · 2014-10-27T20:42:00.093

There are three things wrong with what you've written.

First, to negate a character class, you put the ^ inside the brackets, not before them. ^[0-9] means "any digit, at the start of the string"; [^0-9] means "anything except a digit".
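A minimal Python sketch of that difference (Python is an assumption here, since the question does not name a language; the sample strings are made up for illustration):

```python
import re

# Caret OUTSIDE the brackets anchors to the start of the string:
# "a digit at the start of the string".
starts_with_digit = re.compile(r"^[0-9]")

# Caret INSIDE the brackets negates the class:
# "any single character that is not a digit".
non_digit = re.compile(r"[^0-9]")

print(bool(starts_with_digit.search("1ab")))  # True
print(bool(starts_with_digit.search("ab1")))  # False
print(non_digit.findall("1_b"))               # ['_', 'b']
```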

Second, [^0-9] will match anything that isn't a digit, not just letters and underscores. You really want to say that the first character "is not a digit, but is a digit, letter, or underscore", right? While it isn't impossible to say that, it's a lot easier to just merge that into "is a letter or underscore".

Third, you forgot to repeat the last character set. As-is, you're matching exactly two characters, so b1 will work, but b12 will not.

So:

[a-zA-Z_][a-zA-Z0-9_]*

Debuggex Demo

In other words: one letter or underscore, followed by zero or more letters, digits, or underscores.
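As a rough check, here is a Python sketch that runs this pattern against the strings from the question. It assumes the whole string must match, so it uses re.fullmatch; the helper name is_valid is hypothetical.

```python
import re

# The pattern from this answer: one letter or underscore,
# then zero or more letters, digits, or underscores.
identifier = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")

def is_valid(s):
    """Return True if the entire string matches the pattern."""
    return identifier.fullmatch(s) is not None

for s in ["1ab", "ab1", "1_bc", "bc_1", "b12"]:
    print(s, is_valid(s))
```

Under these assumptions, 1ab and 1_bc are rejected, while ab1, bc_1, and b12 are accepted.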

I'm not entirely sure this is what you actually want, at least if the regex is your whole parser. For example, in foo-bar, do you want the bar to get matched? If so, in 123spam, do you want the spam to get matched? But it's what you were trying to write.
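To make that caveat concrete, here is a small Python sketch of what the unanchored pattern finds inside longer strings. The \b variant is only one possible way to tighten it, not something the question asked for.

```python
import re

pattern = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")

# Without anchors, the pattern matches anywhere inside the input.
print(pattern.findall("foo-bar"))  # ['foo', 'bar']
print(pattern.findall("123spam"))  # ['spam']

# Requiring a word boundary before the first character
# stops the match from starting in the middle of 123spam.
bounded = re.compile(r"\b[a-zA-Z_][a-zA-Z0-9_]*")
print(bounded.findall("foo-bar"))  # ['foo', 'bar']
print(bounded.findall("123spam"))  # []
```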

@abarnert thanks for this answer (and +1 for the Debuggex Demo). This is exactly what I needed. – Apollo Oct 27 '14 at 20:38

score 6 · Answer 2 · answered Oct 27 '14 at 20:33

This should do it:

^[^0-9][a-zA-Z0-9_]+$

Explanation:

^: Matches the beginning of the line
[^0-9]: Matches any one character that is not a digit
[a-zA-Z0-9_]+: Matches one or more letters, digits, or underscores
$: Matches the end of the line
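A quick Python sketch of how this pattern behaves on a few inputs (the sample strings are illustrative). Note that [^0-9] accepts any non-digit, including punctuation, and that + requires at least one more character after the first.

```python
import re

pattern = re.compile(r"^[^0-9][a-zA-Z0-9_]+$")

for s in ["ab1", "1ab", "-foo", "a"]:
    print(repr(s), bool(pattern.search(s)))
```

Here ab1 and -foo match, while 1ab and the single character a do not.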

answered Oct 27 '14 at 20:33

Linuxios

I'm pretty sure this isn't what he wants. After all, `-foo` doesn't have a number at the beginning, so it will match your expression, but I don't think it's what he's looking for. – abarnert Oct 27 '14 at 20:38
Well, it would have been better with a more complete set of test input; I'm _guessing_ he doesn't want `-foo` based on the way he phrased his description, but it would be better to _know_ that… – abarnert Oct 27 '14 at 20:41
@abarnert: Reading the question again, I'm pretty sure you're right. +1'd your answer. – Linuxios Oct 27 '14 at 20:43

score 2 · Answer 3 · edited Jul 01 '21 at 17:22

You can use \D for any non-digit character.

/^\D[a-zA-Z0-9_]+$/ should work.
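For comparison, a hedged Python sketch of the same pattern. The surrounding slashes are regex-literal delimiters in other languages and are dropped here, and the sample strings are illustrative.

```python
import re

pattern = re.compile(r"^\D[a-zA-Z0-9_]+$")

# \D matches ANY non-digit, so a leading hyphen or space is accepted too.
for s in ["bc_1", "1_bc", "-foo", " ab"]:
    print(repr(s), bool(pattern.search(s)))
```

With this pattern bc_1, -foo, and " ab" match, while 1_bc does not.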

edited Jul 01 '21 at 17:22

Sven Eberth

answered Jul 01 '21 at 16:50

Mrunmayee Kulkarni

score 0 · Answer 4 · edited Apr 26 '16 at 21:49

You can use this: ^[A-Za-z_][A-Za-z0-9_]*$
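A short Python sketch of how the anchored pattern behaves (the sample strings are illustrative). One detail worth knowing if you use Python: $ also matches just before a trailing newline, so re.fullmatch may be a safer choice when that matters.

```python
import re

anchored = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

print(bool(anchored.search("ab1")))      # True
print(bool(anchored.search("1ab")))      # False
print(bool(anchored.search("foo-bar")))  # False
print(bool(anchored.search("_")))        # True, since * allows zero further characters

# In Python, $ matches before a final newline; fullmatch does not allow it.
print(bool(anchored.search("ab1\n")))     # True
print(bool(anchored.fullmatch("ab1\n")))  # False
```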

edited Apr 26 '16 at 21:49

FliegendeWurst

answered Oct 27 '14 at 20:33

Mazdak

score 0 · Answer 5 · answered Oct 27 '14 at 20:38

Another suggestion, try this:

\b([a-zA-Z][^\s]*)

You can use this code to iterate over the results:

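import re

subject = "1ab ab1 1_bc bc_1"  # example input, assumed for illustration
reobj = re.compile(r"\b([a-zA-Z][^\s]*)")
for match in reobj.finditer(subject):
    start = match.start()  # index where the match begins
    end = match.end()      # index just past the end of the match
    text = match.group()   # the matched text
    print(start, end, text)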

answered Oct 27 '14 at 20:38

Tommy Andersen

score 0 · Answer 6 · answered Oct 27 '14 at 21:47

You can use this regex:

^[a-z]\w+$

The idea of the regex is:

^[a-z]   -> Has to start with a letter
\w+$     -> Followed by one or more word characters (\w is the shortcut for [A-Za-z0-9_])

Bear in mind that this relies on the regex flags i (case-insensitive) and m (multiline).

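The Python code you can use is:

import re

# re.MULTILINE makes ^ and $ match at each line; re.IGNORECASE lets [a-z] match uppercase too
p = re.compile(r'^[a-z]\w+$', re.MULTILINE | re.IGNORECASE)
test_str = "would match\nab1\nbc_1\n\nwould not match\n1_bc\n1ab"

print(p.findall(test_str))  # ['ab1', 'bc_1']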

score 0 · Answer 7 · answered Apr 23 '19 at 19:10

This is the right answer:

^(?!^[0-9].*$).*

It matches the whole line if the line does not start with a digit.

Another pattern that does a similar job:

^[^0-9]+.*
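A small Python sketch comparing the two patterns (the variable names and sample strings are illustrative assumptions). Both accept any line that does not start with a digit, including ones that start with punctuation. They differ on the empty string, which only the lookahead version accepts.

```python
import re

lookahead = re.compile(r"^(?!^[0-9].*$).*")
negated = re.compile(r"^[^0-9]+.*")

for s in ["ab1", "1ab", "-foo", ""]:
    print(repr(s), bool(lookahead.match(s)), bool(negated.match(s)))
```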

answered Apr 23 '19 at 19:10

Zen Of Kursat
