
I have a list of email IDs from which I need to select, using a regex, only those whose domain name is not ruba.com. For example, given ads@gmail.com, dgh@rubd.com and ert@ruba.com, the regular expression should select the first two. What regular expression would solve this problem?

I have tried two expressions:

[a-zA-Z0-9_.+-]+@[^(ruba)]+.[a-zA-Z0-9-.]+

and

[a-zA-Z0-9_.+-]+@[^r][^u][^b][^a]+.[a-zA-Z0-9-.]+

Neither of them does what I need.

Deba
  • Why do you want to use regex here? Why not just parse the email, and then check the domain against a blacklist? (That second parse can use a regex, but it'll be a really simple one.) – abarnert Mar 17 '18 at 22:04
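
The approach suggested in the comment above, splitting off the domain and checking it against a blacklist, could be sketched in Python roughly as follows. The names `BLOCKED_DOMAINS` and `is_allowed` are illustrative assumptions, not part of the original post, and the sketch assumes every entry contains an @.

```python
# Hypothetical blacklist; only ruba.com is mentioned in the question.
BLOCKED_DOMAINS = {"ruba.com"}


def is_allowed(address: str) -> bool:
    """Return True when the address's domain is not blacklisted."""
    # Split on the last "@": everything after it is treated as the domain.
    _, _, domain = address.rpartition("@")
    # Domain names are case-insensitive, so compare in lowercase.
    return domain.lower() not in BLOCKED_DOMAINS


addresses = ["ads@gmail.com", "dgh@rubd.com", "ert@ruba.com"]
selected = [a for a in addresses if is_allowed(a)]
print(selected)  # ['ads@gmail.com', 'dgh@rubd.com']
```

No regex is needed here at all, which is the point the comment makes.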

2 Answers


You could use a negative lookahead to ensure that you do not match the domain ruba.com.

The negative lookahead (?!ruba\.com$) makes the match fail when the text right after the @ is exactly ruba.com. Also, because email addresses typically contain more than word characters (such as hyphens and periods), you would be better off using [\w\.\-] rather than just \w.

^[\w\.\-]+@(?!ruba\.com$)[\w\.\-]+\.(?:com|net|org|edu)$
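
As a rough check of the negative-lookahead idea, here is a minimal Python sketch run against the three addresses from the question. It assumes the lookahead should target the exact domain ruba.com (the one the question wants excluded), so the pattern below uses `(?!ruba\.com$)`; the variable names are illustrative.

```python
import re

# Negative lookahead placed right after the "@": the match fails only when
# the remainder of the string is exactly "ruba.com".
PATTERN = re.compile(r"^[\w\.\-]+@(?!ruba\.com$)[\w\.\-]+\.(?:com|net|org|edu)$")

addresses = ["ads@gmail.com", "dgh@rubd.com", "ert@ruba.com"]
selected = [a for a in addresses if PATTERN.match(a)]
print(selected)  # ['ads@gmail.com', 'dgh@rubd.com']
```

Because the lookahead ends with `$`, a similar-looking domain such as rubab.com is still accepted. The list of top-level domains is the one given in the answer and would need extending for other addresses.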


K.Dᴀᴠɪs

I assume that by email ID you mean the part before the @ symbol, since otherwise it would be a full email address.

.+(?=@)(?!@ruba\.com)
  • . the dot is a special symbol for regex engines that matches any single character (except a newline, by default)
  • + also known as Kleene plus, says you want to match one or more instances of the preceding symbol, in our case .; basically you are saying "give me every char"
  • (?=@) is a positive lookahead, i.e. a zero-width assertion that makes sure that what follows is @; I'm using it to take the cursor to the position of @ and "stop" matching there, otherwise .+ would run on to the end of the line
  • (?!@ruba\.com) is a negative lookahead, i.e. a zero-width assertion that makes sure that what follows is not (!) @ruba\.com; I'm escaping the dot so that it is not confused with the match-anything symbol described above
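
To see how the pieces above behave together, here is a small Python sketch that applies the pattern to the three addresses from the question and collects the part before the @. The names `ID_PATTERN` and `ids` are illustrative assumptions.

```python
import re

# .+ grabs characters, (?=@) requires the next character to be "@",
# and (?!@ruba\.com) rejects the match when "@ruba.com" follows.
ID_PATTERN = re.compile(r".+(?=@)(?!@ruba\.com)")

addresses = ["ads@gmail.com", "dgh@rubd.com", "ert@ruba.com"]

ids = []
for address in addresses:
    match = ID_PATTERN.match(address)
    if match:
        # Lookaheads consume nothing, so only the local part is captured.
        ids.append(match.group(0))

print(ids)  # ['ads', 'dgh']
```

Note that the negative lookahead has no end anchor, so an address such as x@ruba.com.au would be rejected as well.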


Francesco B.
  • The problem is that `.+` will match all characters, even invalid characters that do not belong in an email. – K.Dᴀᴠɪs Mar 17 '18 at 22:06
  • Yes, valid point. But he didn't say those addresses needed to be validated first, since he said "I have a list of email ids...", so I didn't feel compelled to satisfy a requirement that wasn't stated. Also, the email specification can be a little tricky; I'm not sure that just \w\.\- would do, because [other chars are admitted](https://stackoverflow.com/questions/2049502/what-characters-are-allowed-in-an-email-address). But in the end, my solution won't work if by "ID" he means a full address. – Francesco B. Mar 17 '18 at 22:11