
Before I jump into my question, let me preface it with this: I had a strict set of requirements to follow with regard to email address validation. I attempted to dispute some of them, but was overruled.

Anyway, among the requirements were the following:

  • No consecutive periods
  • No special characters in the first position
  • No periods directly before or after the @
  • Allow the following characters: +!#$%&*/=?`{|}~'_-.

My attempt satisfies the requirements, with one snag: the regex I am using for the local part incorrectly requires a minimum of 3 characters. Here is my attempt:

(^(?!.*\\.{2})([a-zA-Z0-9{1}]+[a-zA-Z0-9\\._\\-\\+!#$%&*/=?`{|}~']+[a-zA-Z0-9{1}])+@([a-zA-Z0-9{1}]+[a-zA-Z0-9\\-]+[a-zA-Z0-9{1}]+\\.)+([a-zA-Z0-9\\-]{2}|net|com|gov|mil|org|edu|int|NET|COM|GOV|MIL|ORG|EDU|INT)$)|^$

I understand why this is happening; I just don't know how to get around it. Any assistance would be greatly appreciated.

Edited: After much discussion, it turns out that my issues were not specific to the local part of the email address. The domain part is also suffering from the same issues.

Thanks, Seb

Seb

2 Answers


For the local part (the part before @), this is the regex fragment that satisfies all conditions above:

^[a-zA-Z0-9][a-zA-Z0-9+!#$%&*/=?`{|}~'_-]*(?:\.[a-zA-Z0-9+!#$%&*/=?`{|}~'_-]+)*

Breakdown:

^                                 # Beginning of the string
[a-zA-Z0-9]                       # First character is not special
[a-zA-Z0-9+!#$%&*/=?`{|}~'_-]*    # 0 or more alphanumeric and special characters, except .
(?:                               # Group, repeated 0 or more times
  \.                              # A literal .
  [a-zA-Z0-9+!#$%&*/=?`{|}~'_-]+  # 1 or more alphanumeric and special characters, except .
)*

The "No consecutive periods" and "No periods directly before or after the @" conditions are enforced by the fact that . can only appear between 2 non-dot characters, as seen in the regex above.
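
To make this concrete, here is a minimal sketch that tries the local-part pattern on a few inputs. It assumes Python's `re` module (the question does not name a language), and the names `ATOM`, `LOCAL` and `is_valid_local` are made up for illustration.

```python
import re

# Allowed non-dot characters, copied from the pattern above.
ATOM = r"[a-zA-Z0-9+!#$%&*/=?`{|}~'_-]"

# First character alphanumeric, then non-dot characters,
# then zero or more ".xxx" groups so a dot always sits between non-dot characters.
LOCAL = re.compile(r"[a-zA-Z0-9]" + ATOM + r"*(?:\." + ATOM + r"+)*")


def is_valid_local(local: str) -> bool:
    """Return True if the whole string matches the local-part pattern."""
    return LOCAL.fullmatch(local) is not None


for sample in ["a", "ab_", "john.doe", "a..b", ".ab", "ab.", "_ab"]:
    print(sample, is_valid_local(sample))
# a True, ab_ True, john.doe True, a..b False, .ab False, ab. False, _ab False
```

`fullmatch` is used here instead of the `^` anchor so that the fragment can be tested on its own, without the `@` and domain that would normally follow it.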

I don't have full knowledge of the email specification, so even though the pattern satisfies the conditions in the question, I can't guarantee that it accepts exactly the addresses that are valid according to the spec.


The domain part has the same problem: the {1} sits inside the character class, where it is not a quantifier.
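
A quick way to see what `{1}` does inside a character class is to test the class by itself. This is a small sketch in Python (an assumption, since the question does not name a language; Java treats these characters the same way inside `[...]`), and `matches_one` is a hypothetical helper name.

```python
import re

# Inside [...], the characters {, 1 and } are ordinary literals, not a quantifier.
CLASS_WITH_BRACES = re.compile(r"[a-zA-Z0-9{1}]")


def matches_one(text: str) -> bool:
    """Return True if the whole string is matched by the single character class."""
    return CLASS_WITH_BRACES.fullmatch(text) is not None


print(matches_one("{"))   # True: the brace is now an accepted character
print(matches_one("}"))   # True
print(matches_one("ab"))  # False: the class still matches exactly one character
```

So the braces add nothing useful, and they also let `{` and `}` through in positions where only letters and digits were intended.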

I took the liberty of applying the hostname restriction that labels must not start or end with -.

[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*(?:\.[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*)*

If you want to enforce TLD:

[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*(?:\.[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*)*\.(?i:[a-z0-9]{2}|net|com|gov|mil|org|edu|int)

Note that I made the TLD case-insensitive by putting the i flag on the non-capturing group, (?i:...).
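
Putting the pieces together might look like the sketch below. It assumes Python 3.6 or later, where `(?i:...)` is supported by the `re` module. The names `ATOM`, `LOCAL`, `LABEL`, `TLD`, `EMAIL` and `is_valid_email` are illustrative, and the empty-string alternative (`|^$`) from the question is left out.

```python
import re

ATOM = r"[a-zA-Z0-9+!#$%&*/=?`{|}~'_-]"
LOCAL = r"[a-zA-Z0-9]" + ATOM + r"*(?:\." + ATOM + r"+)*"

# One hostname label: alphanumeric runs joined by single hyphens,
# so a label never starts or ends with "-".
LABEL = r"[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*"

# Any two alphanumeric characters, or one of the listed TLDs, case-insensitively.
TLD = r"(?i:[a-z0-9]{2}|net|com|gov|mil|org|edu|int)"

EMAIL = re.compile(LOCAL + "@" + LABEL + r"(?:\." + LABEL + r")*\." + TLD)


def is_valid_email(address: str) -> bool:
    """Return True if the whole address matches the combined pattern."""
    return EMAIL.fullmatch(address) is not None


for sample in ["a@x.ca", "ab_@something.com", "a@x.COM", "a.@x.com", "a@-x.com", "a@x.info"]:
    print(sample, is_valid_email(sample))
# a@x.ca True, ab_@something.com True, a@x.COM True,
# a.@x.com False, a@-x.com False, a@x.info False
```

`a@x.info` is rejected only because the TLD alternation accepts two-character endings or the seven listed names; that comes from the stated requirements, not from the structure of the pattern.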

nhahtdh
  • Based on the edit to the original question, I think `(?=[^@]{3})` should be removed, but everything else looks correct. – ajb Aug 27 '13 at 16:10
  • @ajb: The question is confusing. Is the "minimum 3 characters" one of the conditions? – nhahtdh Aug 27 '13 at 16:13
  • @nhahtdh No, it is not one of the conditions. The minimum of 3 characters occurred due to my incorrect regex. The minimum should be 1 character. – Seb Aug 27 '13 at 16:16
  • @Seb: I removed the assertion. It should satisfy your conditions. – nhahtdh Aug 27 '13 at 16:17
  • @Seb: See the fix for the domain part. You should be able to put everything together. – nhahtdh Aug 27 '13 at 16:55
  • @nhahtdh I'm having issues with the extension of the email address. I've tried many different things but none seem to work. I need to specify a minimum length of 2 characters. Here is what I have: (^[a-zA-Z0-9][a-zA-Z0-9+!#$%&*/=?`{|}~'_-]*(\\.[a-zA-Z0-9+!#$%&*/=?`{|}~'_-]+)*)@([a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*(?:\\.[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*)*)+([a-zA-Z0-9\\-]{2}|net|com|gov|mil|org|edu|int|NET|COM|GOV|MIL|ORG|EDU|INT$)|^$ – Seb Aug 28 '13 at 12:53
  • @Seb: Not sure what you are talking about. The whole regex already forces a minimum length of 2. – nhahtdh Aug 28 '13 at 13:01
  • @nhahtdh I mean specifically in the extension section. All of my testing won't allow an extension less than 3 characters. So for example a@x.ca, is not valid – Seb Aug 28 '13 at 13:09
  • @nhahtdh Early morning brain cramp... Sorry, nothing to see here :) – Seb Aug 28 '13 at 13:37

Could you please try this (just slight modifications to your code):

(^(?!.*\\.{2})([a-zA-Z0-9][a-zA-Z0-9\\._\\-\\+!#$%&*/=?`{|}~']+[a-zA-Z0-9])+@([a-zA-Z0-9]+[a-zA-Z0-9\\-]+[a-zA-Z0-9]\\.)+([a-zA-Z0-9\\-]{2}|net|com|gov|mil|org|edu|int|NET|COM|GOV|MIL|ORG|EDU|INT)$)|^$

(The test addresses provided so far work. They all don't match.)

oddparity
  • Thanks, but this code still requires a minimum of 3 characters. – Seb Aug 27 '13 at 15:57
  • This will reject `ab_@something.com`, which should be valid. – nhahtdh Aug 27 '13 at 15:59
  • How about trying ab@xyz.com, or a@xyz.com, or a@x.com – Seb Aug 27 '13 at 15:59
  • @Seb The three-character minimum is there, because the regex has a character class that matches one character, followed by a class that must match at least one character, followed by another class that must match at least one character. That makes three. The problem is that this regex doesn't allow a special character right before the `@`. – ajb Aug 27 '13 at 16:00
  • @ajb I understand that. However, that was what my initial query was about. I don't know how to satisfy the requirement and allow 1 or greater characters. – Seb Aug 27 '13 at 16:05
  • @Seb OK, I see your edit to the original question now. That changes everything. – ajb Aug 27 '13 at 16:07
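
The three-character minimum described in the comments above can be reproduced with a reduced version of the local part. This is a sketch in Python (an assumed language), using the shape "edge class, middle class with +, edge class" taken from this answer's pattern. `EDGE`, `MIDDLE`, `three_class` and `accepts` are hypothetical names.

```python
import re

EDGE = r"[a-zA-Z0-9]"
MIDDLE = r"[a-zA-Z0-9._\-+!#$%&*/=?`{|}~']"

# Three pieces in a row that each consume at least one character.
three_class = re.compile(EDGE + MIDDLE + "+" + EDGE)


def accepts(local: str) -> bool:
    """Return True if the whole string matches the three-class skeleton."""
    return three_class.fullmatch(local) is not None


for sample in ["a", "ab", "abc", "ab_"]:
    print(sample, accepts(sample))
# a False, ab False, abc True, ab_ False
```

The last case shows the second issue raised in the comments: because the final class is alphanumeric only, a special character directly before the `@` is rejected.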