0

I need help validating an email address against the variable 'pattern', looping for as long as the input does not match the pattern. I'm required to use re.search. I've tried a couple of things over the last hour and this is where I'm stuck.

import re
pattern = '[a-zA-Z0-9]' + '[a-zA-Z0-9]'+'@[a-zA-Z]'+'(.com/.edu/.net)'
user_input = input('Enter Your Email:')
while user_input is not pattern:
    if (re.search(pattern,user_input)):
        print(re.seach(pattern,user_input))
        print('Valid Email:'+ user_input)
    else:
        print(re.search(pattern,user_input))
        print('Invalid Email:'+ user_input)
        user_input = input('Enter Your Email:')
  • 1
    You should spend some time reading through a regular expression tutorial. The way you've written your pattern, it only matches a string that contains two letters/digits, an `@`, a single letter, and then anything matching the pattern `.com/.edu/.net` (where `.` means "any character"). So for example this matches: `aa@a.com/.edu/.net`, as does `aa@axcom/bedu/cnet`. – larsks May 05 '22 at 19:38
  • https://regex101.com/ is a good resource for testing out regular expression patterns. – larsks May 05 '22 at 19:39
  • You may actually be a lot closer to the regex than you think. The '+' token means one or more of the previous pattern. So you really want your + signs _inside_ the string. Also, the variation token (I just made that up) is the pipe character '|' not '/'. – Mark May 05 '22 at 19:41
  • I should also note that your while condition `user_input is not pattern` is really very far from what you want. You want to stop the while loop when the user_input can be matched by the pattern. In my answer below, I show how to use `break` to end a loop when a condition matches. – Mark May 05 '22 at 19:53
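
The points raised in these comments can be checked directly. The following is only an illustrative sketch, not part of the original post; the variable names `original` and `fixed` are made up here, and `fixed` is one possible correction that follows the hints about `+` and `|`.

```python
import re

# The pattern from the question, written out as a single string.
original = '[a-zA-Z0-9][a-zA-Z0-9]@[a-zA-Z](.com/.edu/.net)'

# '.' matches any character and '/' is a literal slash, so both odd
# strings are accepted while an ordinary address is not.
print(bool(re.search(original, 'aa@a.com/.edu/.net')))   # True
print(bool(re.search(original, 'aa@axcom/bedu/cnet')))   # True
print(bool(re.search(original, 'mark@so.com')))          # False

# '+' inside the string means "one or more", '|' separates the
# alternatives, and '\.' is a literal dot (raw string keeps the backslash).
fixed = r'[a-zA-Z0-9]+@[a-zA-Z]+(\.com|\.edu|\.net)'
print(bool(re.search(fixed, 'mark@so.com')))             # True

# 'is not' compares object identity, not regex matching, so the
# question's loop condition is true whatever the user typed.
user_input = 'mark@so.com'
print(user_input is not original)                        # True
```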

2 Answers

1

The code is close, but the pattern lacks a bit of functionality. For e-mail addresses, it misses the dash - and the underscore _. You can match \w instead, which covers [a-zA-Z0-9_] (in Python 3 it also matches non-ASCII letters and digits unless you pass re.ASCII). It still misses the dash, though, so your approach is good but too short. There are also a few further things that an address should meet.

  1. It must start with an alphabetic character.
  2. While theoretically the address could consist of a single character before the @ sign and only two after it, or be almost infinitely long, both are highly unlikely.

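I suggest the pattern r'[a-zA-Z][a-zA-Z0-9_\-]{0,42}@[a-zA-Z]{2,42}\.((com)|(edu)|(net))\b'

Write it as a raw string (the r prefix). In a normal string literal Python turns '\b' into a backspace character before the regex engine ever sees it, and a quantifier such as ? must not follow \b.

Limiting the number of characters with '{m,n}' bounds the length of the address, so it fits whatever storage limit you have. Addresses shorter than 'a@bc.st' practically don't exist, as at least two characters are required after the @ sign.

Lastly, the or-operator | has the lowest precedence in a regex, so without parentheses it would split the whole pattern into alternatives. You therefore need to group the mail extensions: ((com)|(edu)|(net)). The inner parentheses are optional; (com|edu|net) matches the same strings.

import re
# Raw string so that \-, \. and \b reach the regex engine unchanged.
pattern = r'[a-zA-Z][a-zA-Z0-9_\-]{0,42}@[a-zA-Z]{2,42}\.((com)|(edu)|(net))\b'
while True:
    user_input = input('Enter Your Email:')
    match = re.match(pattern, user_input)  # anchored at the start of the input
    print(match)  # the match object, or None
    if match:
        print('Valid Email:' + user_input)
        break
    print('Invalid Email:' + user_input)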

I think it is better to use re.match(), as it matches the string right from the start. Usually you don't want 1abc@def.comm to count as a valid address (re.search() would find the valid-looking string abc@def.com inside it). By the same argument, you should add a \b to the end of the pattern.
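
The difference between the matching functions can be seen in a short experiment. This is a sketch under assumptions: it uses a variant of the suggested pattern written as a raw string with a plain `\b` at the end, and the test addresses are made up. `re.fullmatch()` is not mentioned in the answer; it is shown only for comparison.

```python
import re

# Raw string, so '\b' reaches the regex engine as a word boundary;
# in a normal string literal '\b' would be a backspace character.
pattern = r'[a-zA-Z][a-zA-Z0-9_\-]{0,42}@[a-zA-Z]{2,42}\.((com)|(edu)|(net))\b'

# re.search() scans for a match anywhere, so it skips the leading '1'.
print(re.search(pattern, '1abc@def.com') is not None)    # True
# re.match() only tries position 0, so the same input is rejected.
print(re.match(pattern, '1abc@def.com') is not None)     # False
# The trailing \b rejects '.comm': there is no boundary between 'm' and 'm'.
print(re.match(pattern, 'abc@def.comm') is not None)     # False
# re.match() still tolerates trailing text after a boundary ...
print(re.match(pattern, 'abc@def.com.x') is not None)    # True
# ... whereas re.fullmatch() requires the whole string to match.
print(re.fullmatch(pattern, 'abc@def.com.x') is not None)  # False
```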

max
0

I made a slight modification to your pattern and to your code.

import re
# Raw string so the escaped dots reach the regex engine unchanged.
pattern = r'[a-zA-Z0-9]+@[a-zA-Z]+(\.com|\.edu|\.net)'
while True:
    user_input = input('Enter Your Email:')
    match = re.search(pattern, user_input)
    print(match)  # the match object, or None
    if match:
        print('Valid Email:' + user_input)
        break
    print('Invalid Email:' + user_input)

Here's an example run:

Enter Your Email:fred
None
Invalid Email:fred
Enter Your Email:mark@so.com
<re.Match object; span=(0, 11), match='mark@so.com'>
Valid Email:mark@so.com
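
The same loop can be replayed without typing by passing the answers in. This is a minimal sketch, assuming a hypothetical helper `ask_for_email` whose `read` parameter stands in for `input`; neither name is from the original answer.

```python
import re

PATTERN = r'[a-zA-Z0-9]+@[a-zA-Z]+(\.com|\.edu|\.net)'

def ask_for_email(read=input):
    """Keep prompting until re.search finds PATTERN; return that input."""
    while True:
        user_input = read('Enter Your Email:')
        if re.search(PATTERN, user_input):
            print('Valid Email:' + user_input)
            return user_input
        print('Invalid Email:' + user_input)

# Replay the example run with canned answers instead of keyboard input.
answers = iter(['fred', 'mark@so.com'])
result = ask_for_email(lambda prompt: next(answers))
print(result)   # mark@so.com
```

Called with no argument, `ask_for_email()` falls back to the built-in `input` and behaves like the loop above.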
Mark
  • 1
    ...although note that in practice this isn't a great pattern, because it's not going to match something like `bob@regex101.com` or `alice@big.org.uk`, etc. See https://stackoverflow.com/questions/201323/how-can-i-validate-an-email-address-using-a-regular-expression for some detailed discussion on the topic of address validation with regular expressions. – larsks May 05 '22 at 19:46
  • 1
    Good point. I considered finding a good resource to point him to, since I recognized that even the corrected pattern was insufficient. I'm glad you did. I prioritized the obvious errors in the pattern. – Mark May 05 '22 at 19:51
  • Shouldn't there be an escape character before the `.` and grouping before the OR-operator?: `((\.com)|(\.edu)|(\.net))` – max May 06 '22 at 17:02
  • I certainly agree with escaping the `.`, but I haven't found the need to group before the OR-operator. – Mark May 08 '22 at 14:15
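
The grouping question from the last two comments can be settled by experiment. The sketch below is illustrative only; the names `a`, `b`, `c` and the sample strings are made up for this comparison.

```python
import re

# One group around the alternatives, each carrying its own escaped dot.
a = r'[a-zA-Z0-9]+@[a-zA-Z]+(\.com|\.edu|\.net)'
# Every alternative additionally wrapped in its own group.
b = r'[a-zA-Z0-9]+@[a-zA-Z]+((\.com)|(\.edu)|(\.net))'
# No group at all: '|' now separates three whole patterns.
c = r'[a-zA-Z0-9]+@[a-zA-Z]+\.com|\.edu|\.net'

samples = ['mark@so.com', 'mark@so.edu', 'fred', '.net', 'mark@so.org']
for s in samples:
    print(s, bool(re.search(a, s)), bool(re.search(b, s)), bool(re.search(c, s)))
# mark@so.com True True True
# mark@so.edu True True True
# fred False False False
# .net False False True
# mark@so.org False False False
```

Patterns `a` and `b` agree on every sample, so the inner groups change only what is captured, not what matches. The outer group is the one that matters: without it, `c` accepts a bare `.net`.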