The code is great, but the pattern lacks a bit of functionality. For e-mail addresses, it misses the dash - and the underscore _. Luckily, you can just match \w, which for ASCII text is the same as specifying [a-zA-Z0-9_] (in Python 3 it also matches non-ASCII letters and digits unless you pass re.ASCII). It still misses the dash though, so your approach is good but too short. Anyway, there are a few further requirements that an address should meet:
- It must start with an alphabetic character.
- While theoretically the address could consist of a single character before the @ sign and only two after it, or be almost infinitely long, both are highly unlikely.
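The character-class remarks above can be checked with a quick sketch. The helper name only_allowed_chars is made up for illustration, and the sketch assumes ASCII-only matching via re.ASCII:

```python
import re

# Hypothetical helper: \w plus an explicit dash.
# re.ASCII restricts \w to [a-zA-Z0-9_].
LOCAL_CHARS = re.compile(r'[\w\-]+', re.ASCII)

def only_allowed_chars(text):
    """Return True if text consists solely of letters, digits, '_' or '-'."""
    return LOCAL_CHARS.fullmatch(text) is not None

print(only_allowed_chars('john_doe-42'))       # underscore and dash are accepted
print(only_allowed_chars('john.doe'))          # the dot is not in the class
print(re.fullmatch(r'\w+', 'a-b', re.ASCII))   # \w on its own rejects the dash
```

Without re.ASCII, \w would also accept accented letters and other Unicode word characters, which may or may not be what you want.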
I suggest the pattern
r'[a-zA-Z][a-zA-Z0-9_\-]{0,42}@[a-zA-Z]{2,42}\.((com)|(edu)|(net))\b'
Limiting the number of characters with '{m,n}' lets you ensure that you won't have an overflow error when storing the address. Also, addresses shorter than 'a@bc.st' simply don't exist, as at least two characters are required.
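To illustrate how the {m,n} bounds behave, here is a minimal sketch with the same shape as the suggested pattern but much smaller, made-up limits (3 and 4 instead of 42) and a single leading letter, so that the total length really is capped:

```python
import re

# Hypothetical small limits so the effect is easy to see:
# one leading letter, up to 3 further characters, 2 to 4 domain letters.
bounded = re.compile(r'[a-z][a-z0-9_\-]{0,3}@[a-z]{2,4}\.com')

for candidate in ['a@bc.com', 'abcd@bc.com', 'abcde@bc.com', 'a@b.com']:
    # fullmatch() requires the whole string to fit the pattern
    print(candidate, bool(bounded.fullmatch(candidate)))
```

The first two candidates fit the bounds; 'abcde@bc.com' has one character too many before the @ sign, and 'a@b.com' has one too few after it.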
Lastly, the or-operator | has the lowest precedence of all regex operators, so without parentheses it would split the entire pattern into alternatives. You therefore need to group the mail extensions:
((com)|(edu)|(net))
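To see why the grouping matters, here is a small sketch comparing an ungrouped and a grouped alternation. The patterns are deliberately simplified stand-ins, not the full pattern above:

```python
import re

# Without parentheses the alternatives are: "<name>@<domain>.com", "edu", "net".
ungrouped = re.compile(r'[a-z]+@[a-z]+\.com|edu|net')
# With parentheses only the extension varies.
grouped = re.compile(r'[a-z]+@[a-z]+\.(com|edu|net)')

print(bool(ungrouped.fullmatch('edu')))          # True: the bare word is accepted
print(bool(ungrouped.fullmatch('me@site.edu')))  # False: only .com has a local part
print(bool(grouped.fullmatch('edu')))            # False
print(bool(grouped.fullmatch('me@site.edu')))    # True
```

The extra inner parentheses around each extension, as in ((com)|(edu)|(net)), are harmless but not required for this to work.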
import re

# Raw string, so \b reaches the regex engine as a word boundary (not a backspace).
pattern = r'[a-zA-Z][a-zA-Z0-9_\-]{0,42}@[a-zA-Z]{2,42}\.((com)|(edu)|(net))\b'

while True:
    user_input = input('Enter Your Email:')
    # re.match() only succeeds if the pattern matches from the start of the string.
    match = re.match(pattern, user_input)
    print(match)
    if match:
        print('Valid Email:' + user_input)
        break
    else:
        print('Invalid Email:' + user_input)
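I think it is better to use re.match(), as it matches the string right from the start. Usually you don't want 1abc@def.comm to end up as a valid address (which it would, because re.search() would find the valid string abc@def.com inside it). By the same reasoning, you should add a \b to the end of the pattern. Write the pattern as a raw string (r'...') so that Python passes \b to the regex engine as a word boundary instead of turning it into a backspace character.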
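The difference can be tried out directly. This sketch uses a raw-string variant of the suggested pattern (single leading letter, \b without a quantifier); those tweaks are assumptions of the sketch. re.fullmatch() is shown as one further option that the answer above does not use:

```python
import re

pattern = r'[a-zA-Z][a-zA-Z0-9_\-]{0,42}@[a-zA-Z]{2,42}\.(com|edu|net)\b'

# search() may start anywhere, so it skips the leading '1'.
print(bool(re.search(pattern, '1abc@def.com')))      # True
# match() must start at position 0, where '1' is not a letter.
print(bool(re.match(pattern, '1abc@def.com')))       # False
# \b fails between the two m's, so the trailing 'm' is not ignored.
print(bool(re.match(pattern, 'abc@def.comm')))       # False
print(bool(re.match(pattern, 'abc@def.com')))        # True
# match() does not anchor the end: non-word trailing text still passes.
print(bool(re.match(pattern, 'abc@def.com/x')))      # True
# fullmatch() requires the entire string to match.
print(bool(re.fullmatch(pattern, 'abc@def.com/x')))  # False
```

If the whole input is supposed to be an address and nothing else, re.fullmatch() is probably the stricter choice.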