0

I found this regular expression:

        "^(([a-zA-Z0-9_\\-\\.]+)@([a-zA-Z0-9_\\-\\.]+)\\.([a-zA-Z]{2,5}){1,25})+([;.](([a-zA-Z0-9_\\-\\.]+)@([a-zA-Z0-9_\\-\\.]+)\\.([a-zA-Z]{2,5}){1,25})+)*$"

which validates a list of email addresses like: address1@gmail.com;adresse2@gmail.com

But I need to tweak it so that it validates this structure instead:

                   address1@gmail.com;adresse2@gmail.com;

and also a single email address with this structure:

                 address1@gmail.com;

I also want to be able to validate email addresses containing a + sign,

for example validating:

address1@gmail;adress2@gmail.com;addres+3@gmail.com; 

as a valid list of emails.

Thank you for your help.

jose
  • As a warning, your regex also validates `address1@gmail.com.adresse2@gmail.com`. I wouldn't rely on regex to validate emails - valid email addresses are more complex than you are probably aware of (see [this horror story](http://www.ex-parrot.com/pdw/Mail-RFC822-Address.html) for an example) – Phylogenesis Nov 20 '14 at 14:33
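The behaviour described in this comment can be checked with a short sketch. It is written in Python purely for illustration (the question does not say which language is used), and the names `ADDRESS` and `ORIGINAL` are made up here. The pattern is the one from the question, with the string-literal escaping (`\\`) reduced to plain regex escaping (`\`).

```python
import re

# One address as written in the question's pattern (local part, "@", domain, ".", TLD).
ADDRESS = r"([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5}){1,25}"

# The question's full pattern: one or more addresses, then optional groups
# that start with ';' or '.' followed by more addresses.
ORIGINAL = re.compile(r"^(" + ADDRESS + r")+([;.](" + ADDRESS + r")+)*$")


def matches(text: str) -> bool:
    """Return True if the question's pattern accepts the whole string."""
    return ORIGINAL.match(text) is not None


print(matches("address1@gmail.com;adresse2@gmail.com"))  # the intended input
print(matches("address1@gmail.com.adresse2@gmail.com"))  # '.' used as a separator
print(matches("address1@gmail.com;"))                    # trailing semicolon
```

The first two calls report a match and the third does not: the `[;.]` separator class and the dots allowed in the local part let the dot-joined string through, while a trailing `;` must be followed by another address.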

4 Answers

1

Do not overuse regular expressions.

It's not worth spending a lot of time and effort writing something inefficient and hard to analyze.

If you know the list is semicolon separated, I would suggest the following pseudocode:

A<-split email-list with ';'
valid<-True
foreach email in A
    if email != "" and email doesn't match [\w\-\.+]+@([\w+\-]+\.)+[a-zA-Z]{2,5}
        valid<-False
    end
end
return valid
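A runnable version of this pseudocode might look like the sketch below. Python is an assumption here (the question does not name a language), and `EMAIL_RE` and `is_valid_email_list` are names invented for the example. `fullmatch` is used so that the pattern has to cover the whole item rather than just part of it.

```python
import re

# Single-address pattern from this answer; fullmatch() anchors it to the whole item.
EMAIL_RE = re.compile(r"[\w\-\.+]+@([\w+\-]+\.)+[a-zA-Z]{2,5}")


def is_valid_email_list(email_list: str) -> bool:
    """Return True if every non-empty ';'-separated item matches EMAIL_RE."""
    for email in email_list.split(";"):
        # Empty items (e.g. after a trailing ';') are skipped, as in the pseudocode.
        if email != "" and EMAIL_RE.fullmatch(email) is None:
            return False
    return True


print(is_valid_email_list("address1@gmail.com;adresse2@gmail.com;"))
print(is_valid_email_list("addres+3@gmail.com;"))
print(is_valid_email_list("address1@gmail;adress2@gmail.com;"))
```

The first two calls return `True` and the last returns `False`, because `address1@gmail` has no dot-separated ending. Like the pseudocode, this sketch also accepts an empty string or a string of bare semicolons, so add a check for that if it matters.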

The regular expression [\w\-\.+]+@([\w+\-]+\.)+[a-zA-Z]{2,5} validates one email address. It uses Perl-compatible syntax.

It allows letters, digits, and the characters _ - . + before the @; it allows domain labels made of letters, digits, _, + and -; and it requires the address to end with 2 to 5 letters.

The regex you provided matches the domain name with ([a-zA-Z0-9_\\-\\.]+)\\.([a-zA-Z]{2,5}){1,25})+, which is odd. I don't think you should do it this way.

The reason I say you are abusing regex is that, even though your problem can be solved with a regex:

  1. The time needed to design and debug a regex grows faster than linearly as it gets longer.

  2. A long regex can take more than linear time to execute.

  3. It's hard for other people to understand what you are trying to do, which makes it hard for them to modify your code if it doesn't do what you want.

So, please, never try to solve a problem like this using pure regex. It's not a programming language.

Jason Hu
1

This regex will match email IDs based on your criteria.

(?![\W_])((?:([\w+-]{2,})\.?){1,})(?<![\W_])@(?![\W_])(?=[\w.-]{5,})(?=.+\..+)(?1)(?<![\W_])

As for the semicolon-separated email IDs, it is best to split the string on the semicolon and match each item individually to avoid confusion.

You can take a look at the matches here.
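The `(?1)` in this pattern is a subroutine call that re-runs capture group 1, which PCRE-style engines support but Python's built-in `re` module does not. As a rough way to try the pattern there, the sketch below expands `(?1)` by hand. The names `RUN`, `PATTERN` and `looks_like_email` are invented for the example, and backtracking details may differ from a PCRE engine.

```python
import re

# Body of capture group 1: chunks of 2+ word, '+' or '-' characters,
# each optionally followed by a dot.
RUN = r"(?:([\w+-]{2,})\.?){1,}"

# The answer's pattern with (?1) replaced by a second copy of group 1's body.
PATTERN = re.compile(
    r"(?![\W_])(" + RUN + r")(?<![\W_])@"
    r"(?![\W_])(?=[\w.-]{5,})(?=.+\..+)(?:" + RUN + r")(?<![\W_])"
)


def looks_like_email(text: str) -> bool:
    """Return True if the expanded pattern covers the whole string."""
    return PATTERN.fullmatch(text) is not None


print(looks_like_email("addres+3@gmail.com"))  # plus sign in the local part
print(looks_like_email("address1@gmail"))      # no dot after the @
print(looks_like_email("_bad@gmail.com"))      # starts with an underscore
```

Only the first call returns `True`. Note that `{2,}` also rejects a single-character local part such as `a@gmail.com`.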

Kannan Mohan
0

Just split the whole string on the ; character and match each element against the following regex. It also takes care of the plus sign.

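using System.Text.RegularExpressions;

// Verbatim string (@"...") so that \b and \. reach the regex engine unchanged.
string pattern = @"\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b";

foreach (string email in emailString.Split(';'))
{
    // IgnoreCase lets the A-Z classes accept lowercase addresses too.
    if (Regex.IsMatch(email, pattern, RegexOptions.IgnoreCase))
    {
        // do stuff
    }
}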
Piyush Parashar
0

As others have said, first split on ;, then validate the addresses individually. If you are going to use a regex to validate email, use one that has been at least vaguely tested against both good and bad examples, such as those on fightingforalostcause.net, listed in this project, or discussed in this definitive question, of which this question is effectively a duplicate.
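Testing a candidate regex against good and bad examples can be as simple as the sketch below. It is in Python for illustration only; the sample addresses are made up for this example and are not taken from the lists mentioned above, and `CANDIDATE` is just the single-address pattern from another answer, standing in for whichever regex you want to evaluate.

```python
import re

# Hypothetical candidate: the single-address pattern from another answer.
CANDIDATE = re.compile(r"[\w\-\.+]+@([\w+\-]+\.)+[a-zA-Z]{2,5}")

# Illustrative samples only; a real test should use a much larger, curated list.
SHOULD_PASS = ["address1@gmail.com", "addres+3@gmail.com", "first.last@example.co.uk"]
SHOULD_FAIL = ["address1@gmail", "@gmail.com", "address1@", "two@@gmail.com"]


def failures(pattern, should_pass, should_fail):
    """Return the samples that the pattern classifies incorrectly."""
    wrong = [s for s in should_pass if pattern.fullmatch(s) is None]
    wrong += [s for s in should_fail if pattern.fullmatch(s) is not None]
    return wrong


print(failures(CANDIDATE, SHOULD_PASS, SHOULD_FAIL))
print(failures(CANDIDATE, ["user@example.museum"], []))
```

The first call prints an empty list. The second reports `user@example.museum`, because the `{2,5}` limit on the final label rejects a six-letter ending; this is the kind of gap a larger set of examples tends to expose.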

Community
Synchro