-1

I am trying to create a validation function for usernames. It must be:

  • Min six characters / Max 16 characters
  • Only letters, numbers and, at the most, one hyphen
  • It must start with a letter and must not end with a hyphen

I came up with this regular expression:

^([A-Za-z]+-?[A-Za-z0-9]{4,14}[A-Za-z0-9])$

It does everything except enforce the maximum number of characters. What am I missing?

You may wonder why I am using 4 to 14: it is because the first and last characters are already defined by the other conditions, so I need the rest to be at least 4 (to make it to six) and at most 14 characters (to make it to 16).

This is a simple validation function in Python:

# Username validation
import re

def validate(username):
    regex = re.compile('^([A-Za-z]+-?[A-Za-z0-9]{4,14}[A-Za-z0-9])$', re.I)
    match = regex.match(str(username))
    return bool(match)

print(validate("PeterParker")) #Valid username
print(validate("Peter-Parker")) #Valid username
print(validate("Peter-P-arker")) #Invalid username
print(validate("Peter")) #Invalid username
print(validate("PeterParker-")) #Invalid username
print(validate("-PeterParker")) #Invalid username
print(validate("1PeterParker")) #Invalid username
print(validate("Peter Parker")) #Invalid username
print(validate("PeterParkerSpiderMan")) #Invalid username

Thanks very much for any suggestion.

Wilmar

4 Answers

3

Try (?i)^(?=[a-z].{5,15}$)[a-z0-9]*-?[a-z0-9]+$

You just need to validate the length in a lookahead assertion.
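A minimal sketch of how this pattern behaves in Python, run against the sample usernames from the question. The lookahead `(?=[a-z].{5,15}$)` checks, without consuming anything, that the string starts with a letter followed by 5 to 15 more characters, so the total length is 6 to 16; the rest of the pattern then checks the shape. The names `PATTERN` and `samples` are only illustrative.

```python
import re

# Pattern copied from this answer; (?i) makes it case-insensitive.
PATTERN = re.compile(r'(?i)^(?=[a-z].{5,15}$)[a-z0-9]*-?[a-z0-9]+$')

def validate(username):
    """Return True if the username matches the pattern."""
    return bool(PATTERN.match(str(username)))

# Sample usernames from the question, with the results expected there.
samples = {
    "PeterParker": True,
    "Peter-Parker": True,
    "Peter-P-arker": False,         # two hyphens
    "Peter": False,                 # too short
    "PeterParker-": False,          # ends with a hyphen
    "-PeterParker": False,          # starts with a hyphen
    "1PeterParker": False,          # starts with a digit
    "Peter Parker": False,          # contains a space
    "PeterParkerSpiderMan": False,  # 20 characters, too long
}

for name, expected in samples.items():
    print(name, validate(name), "(expected {})".format(expected))
```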

demo

  • 1
    Thanks Edward. I must confess I have no clue why this regex works but it seems to. I'll try to understand it. Thank you! – Wilmar May 30 '20 at 17:05
  • See the demo: the regex is formatted there, and regex101 gives a simple explanation of the syntax. –  May 30 '20 at 17:06
  • Any particular reason you did not write `^(?=.{6,16}$)[a-z][a-z0-9]*-?[a-z0-9]+$`? Imo that reads better ("the string must contain 6-16 characters,..."). – Cary Swoveland May 30 '20 at 18:48
  • Because `(?=[a-z]...` will fail sooner, so that is the best spot for it. –  May 30 '20 at 18:52
  • In that case `^[a-z](?=.{5,15}$)...` would be even better, but I don't buy either. Readability is king. :-) – Cary Swoveland May 30 '20 at 19:01
0

Please use this website to try out your regex:

https://regex101.com/r/RfD4YR/11

As you can see there, the leading [A-Za-z]+ can absorb most of the input, because + is greedy and takes as many characters as it can, so the total length is never capped. This should work properly:

^([A-Za-z][A-Za-z0-9-]{4,14}[A-Za-z0-9])$
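As a rough illustration of the point about +, the sketch below compares the pattern from the question with the one above. The variable names `original` and `bounded` are made up for this example. It also shows the limitation raised in the comments: the bounded pattern caps the length but still accepts more than one hyphen.

```python
import re

# Pattern from the question: the leading [A-Za-z]+ has no upper bound.
original = re.compile(r'^([A-Za-z]+-?[A-Za-z0-9]{4,14}[A-Za-z0-9])$')
# Pattern from this answer: exactly one leading letter, so 6 to 16 characters overall.
bounded = re.compile(r'^([A-Za-z][A-Za-z0-9-]{4,14}[A-Za-z0-9])$')

long_name = "PeterParkerSpiderMan"  # 20 characters

print(bool(original.match(long_name)))       # True: + absorbs the extra letters
print(bool(bounded.match(long_name)))        # False: the length is now capped at 16
print(bool(bounded.match("Peter-P-arker")))  # True: two hyphens still slip through
```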
  • problem with length https://regex101.com/r/4XPRSY/1 –  May 30 '20 at 17:15
  • Hey, There is no problem, why do you keep using the + ? Look at my regex and look at what you have pasted https://regex101.com/r/RfD4YR/12 – Bogdan Ratiu May 30 '20 at 17:24
  • ahhhh the + of course! That was it. Thanks! – Wilmar May 30 '20 at 17:25
  • How does it handle the length that was asked about? –  May 30 '20 at 17:28
  • I have downvoted your answer because it permits no hyphens and permits up to 14 single quotes, of which none are permitted. That can be verified at the link given in your comment. For example, `"P''''''''''''''0"` matches the regex. I will remove my downvote if you correct your answer. – Cary Swoveland May 30 '20 at 18:36
  • You are right, I just tried to keep it as close as possible to the original version. – Bogdan Ratiu May 30 '20 at 19:17
  • Cary, I have edited my post, but you are right, it still matches the hyphen multiple times. In order to correct that I would need to use Edward's answer which is much better overall – Bogdan Ratiu May 30 '20 at 20:24
0
^(?=.{6,16}$)[A-Za-z][A-Za-z0-9]*-?[A-Za-z0-9]+$
alani
  • Is it different from mine? –  May 30 '20 at 17:17
  • @Edward I was just looking at yours now. I think that they are equivalent (aside from the upper case letters), but you are enforcing the rule that the first character must be a letter as part of the lookahead assertion, while I am enforcing it as part of the rest of the regexp. – alani May 30 '20 at 17:19
  • @Edward I've just seen what your `(?i)` does (case insensitivity). Yes, that does make it tidier than having `A-Za-z` repeatedly. But I won't edit my regexp to bring it closer to yours, because the variety of options may be informative. – alani May 30 '20 at 17:23
  • A small FYI: `[A-Za-z]` is always faster than `(?i)[a-z]`. –  May 30 '20 at 17:31
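The equivalence discussed in these comments can be spot-checked with a short sketch like the one below. It only compares the two patterns on the ASCII sample usernames from the question, so it is a sanity check under that assumption, not a proof. The names `edward`, `alani`, `samples`, and `results` are illustrative.

```python
import re

# The two patterns from the answers, copied verbatim.
edward = re.compile(r'(?i)^(?=[a-z].{5,15}$)[a-z0-9]*-?[a-z0-9]+$')
alani = re.compile(r'^(?=.{6,16}$)[A-Za-z][A-Za-z0-9]*-?[A-Za-z0-9]+$')

# Sample usernames from the question.
samples = [
    "PeterParker", "Peter-Parker", "Peter-P-arker", "Peter",
    "PeterParker-", "-PeterParker", "1PeterParker",
    "Peter Parker", "PeterParkerSpiderMan",
]

# Record what each pattern says about every sample.
results = [(name, bool(edward.match(name)), bool(alani.match(name)))
           for name in samples]

for name, e, a in results:
    print("{:22} edward={} alani={}".format(name, e, a))
```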
0

I used for loops to build every combination of hyphen position and length, which is pretty brute force. I'd say a length check in code is much cleaner. If you want to see the expression, you can just print it before compiling.

# Username validation
import re

def validate(username):
    print(username)
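    hyphen = []
    # One alternative per middle length (4 to 14) and per hyphen position
    for length in range(4, 14 + 1):
        for i in range(length):
            hyphen.append("[A-Za-z0-9]{{{}}}-[A-Za-z0-9]{{{}}}".format(i, length - i - 1))
    # Middle part: either no hyphen at all, or one of the alternatives built above
    expression = '^([A-Za-z]([A-Za-z0-9]{{4,14}}|{})[A-Za-z0-9])$'.format("|".join(hyphen))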
    regex = re.compile(expression, re.I)
    match = regex.match(str(username))
    return bool(match)

print(validate("PeterParker")) #Valid username
print(validate("Peter-Parker")) #Valid username
print(validate("Peter-P-arker")) #Invalid username
print(validate("Peter")) #Invalid username
print(validate("PeterParker-")) #Invalid username
print(validate("-PeterParker")) #Invalid username
print(validate("1PeterParker")) #Invalid username
print(validate("Peter Parker")) #Invalid username
print(validate("PeterParkerSpiderMan")) #Invalid username

OUTPUT

PeterParker
True
Peter-Parker
True
Peter-P-arker
False
Peter
False
PeterParker-
False
-PeterParker
False
1PeterParker
False
Peter Parker
False
PeterParkerSpiderMan
False
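For comparison, here is a minimal sketch of the "length check in code" idea mentioned at the top of this answer. The function name `validate_with_length_check` and the `SHAPE` pattern are hypothetical and not part of the code above; the regex only checks the shape, and the length is tested with `len`.

```python
import re

# Shape only: a leading letter, then letters/digits, then optionally one hyphen
# followed by at least one letter or digit (so the name cannot end with a hyphen).
SHAPE = re.compile(r'[A-Za-z][A-Za-z0-9]*(?:-[A-Za-z0-9]+)?')

def validate_with_length_check(username):
    """Check the length in plain Python, then the shape with a regex."""
    name = str(username)
    if not 6 <= len(name) <= 16:
        return False
    # fullmatch requires the whole string to fit the shape
    return SHAPE.fullmatch(name) is not None

print(validate_with_length_check("Peter-Parker"))          # True
print(validate_with_length_check("PeterParker-"))          # False
print(validate_with_length_check("PeterParkerSpiderMan"))  # False
```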
Albin Paul