
Can someone help me, please? I am not very familiar with regex and I am trying to validate email addresses. The regex and code that I have are:

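import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TestRegExEmail {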

public static void main(final String[] args) {

    // List of valid email addresses
    List<String> validValues = new ArrayList<>();
    validValues.add("wiliam@hotmail.com");
    validValues.add("wiliam.ferraciolli@hotmail.com");
    validValues.add("wiliam@hotmail.co.uk");
    validValues.add("wiliam.ferraciolli@hotmail.co.uk");
    validValues.add("wiliam_ferraciolli@hotmail.co.uk");
    validValues.add("wiliam'ferraciolli@hotmail.co.uk");
    validValues.add("wiliam334-1@mydomain.co.uk.me");

    // List of invalid email addresses
    List<String> invalidValues = new ArrayList<>();
    invalidValues.add("wiliam.ferraciolli@hotmail.com.dodge.too.many");
    invalidValues.add("hwiliam@hotmail.com.otherdomain.uk.dodge");
    invalidValues.add("wiliam.ferraciolli@hotmail.com.com.com.com");
    invalidValues.add("wiliam.hotmail.com");
    invalidValues.add("wiliam..ferraciolli@hotmail.com");
    invalidValues.add("wiliam%ferraciolli.@hotmail.com");
    invalidValues.add("wiliam$ferraciolli.@hotmail.com");
    invalidValues.add("wiliam/ferraciolli.@hotmail.com");

    // Pattern
    String regex = "^[_A-Za-z0-9-\\+]+(\\.[_A-Za-z0-9-]+)*@[A-Za-z0-9-]+(\\.[A-Za-z0-9]+)*(\\.[A-Za-z]{2,})$";

    Pattern pattern = Pattern.compile(regex);

    // print valid emails
    for (String s : validValues) {
        Matcher matcher = pattern.matcher(s);
        System.out.println(s + " // " + matcher.matches());
    }

    System.out.println();
    // print invalid emails
    for (String s : invalidValues) {
        Matcher matcher = pattern.matcher(s);
        System.out.println(s + " // " + matcher.matches());
    }

}

}

This regEx works fine but fails on emails with apostrophes. The other issue is that it would be ideal to allow only 3 dots after the @ symbol. Any suggestions would be appreciated.

Regards

Wil Ferraciolli

1 Answer


This regEx works fine but fails on emails with apostrophes.

Followed @Am_I_Helpful comment and found a good solution. "[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?"

Exactly, you need to include all the allowed characters in the character class:

[a-z0-9!#$%&'*+/=?^_`{|}~-]+
       \_________________/
     non-alphanumerics allowed

I'd also anchor your pattern to the beginning and end of the string with ^ and $, as you had in your previous version.
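
Here is a minimal sketch of where the anchors make a difference, assuming the pattern might later be used with find() instead of matches(). In Java, matches() already requires the whole input to match, so ^ and $ only change the result for find(). The class name AnchorDemo and its helper methods are made up for illustration, and the patterns are cut down to the local-part character class only.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original code.
class AnchorDemo {
    // Same character class as above, once without and once with anchors.
    static final Pattern UNANCHORED = Pattern.compile("[a-z0-9!#$%&'*+/=?^_`{|}~-]+");
    static final Pattern ANCHORED = Pattern.compile("^[a-z0-9!#$%&'*+/=?^_`{|}~-]+$");

    static boolean findUnanchored(String s) { return UNANCHORED.matcher(s).find(); }
    static boolean findAnchored(String s) { return ANCHORED.matcher(s).find(); }
    static boolean matchesUnanchored(String s) { return UNANCHORED.matcher(s).matches(); }

    public static void main(String[] args) {
        String input = "wil iam"; // contains a space, which the class does not allow
        System.out.println(findUnanchored(input));    // true: find() is satisfied by the substring "wil"
        System.out.println(findAnchored(input));      // false: the anchors force the whole string to match
        System.out.println(matchesUnanchored(input)); // false: matches() always checks the whole string
    }
}
```

So with matches(), as in your test code, the anchors are redundant but harmless. They protect you if the same pattern string ends up being used with find() or in another tool.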


The other issue is that it would be ideal to allow only 3 dots after the @ symbol.

This non-capturing group from your regex is repeated once for every dot after the @:

@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?
            \                      \_/ /
             \           literal dots /
              \______________________/
             repeated once for each dot

But you're using the + quantifier, which repeats the group at least once and as many times as it can match, so any number of dots is accepted. Instead, limit the repetition with the {1,3} quantifier, which allows one to three dots.

@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\\.){1,3}[a-z0-9](?:[a-z0-9-]*[a-z0-9])?
                                       ^^^^^
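
The effect of the bounded quantifier can be checked in isolation. This is only a sketch: the class DomainDotLimit and its dotsAllowed helper are hypothetical names, and the pattern covers just the part after the @.

```java
import java.util.regex.Pattern;

// Hypothetical helper: validates only the domain part (everything after the @).
class DomainDotLimit {
    // One label: starts and ends with an alphanumeric, hyphens allowed in between.
    private static final String LABEL = "[a-z0-9](?:[a-z0-9-]*[a-z0-9])?";

    // {1,3} allows one to three "label." groups, so one to three dots in total.
    private static final Pattern DOMAIN =
            Pattern.compile("(?:" + LABEL + "\\.){1,3}" + LABEL);

    static boolean dotsAllowed(String domain) {
        return DOMAIN.matcher(domain).matches();
    }

    public static void main(String[] args) {
        String[] domains = {
            "hotmail",                 // no dot: rejected
            "hotmail.com",             // 1 dot: accepted
            "mydomain.co.uk.me",       // 3 dots: accepted
            "hotmail.com.com.com.com"  // 4 dots: rejected
        };
        for (String d : domains) {
            System.out.println(d + " // " + dotsAllowed(d));
        }
    }
}
```

Since a label cannot contain a dot, the engine has no way to backtrack around the limit, so the fourth dot always makes the match fail.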

Regex

"^[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\\.){1,3}[a-z0-9](?:[a-z0-9-]*[a-z0-9])?$"

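
To try the full pattern against some of the addresses from your question, something like the following should do. Treat it as a sketch: EmailRegexCheck and isValid are hypothetical names, and the CASE_INSENSITIVE flag is my own assumption, added because the pattern lists only a-z while your earlier version also accepted A-Z.

```java
import java.util.regex.Pattern;

// Hypothetical wrapper class around the final pattern.
class EmailRegexCheck {
    private static final String REGEX =
            "^[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*"
          + "@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\\.){1,3}[a-z0-9](?:[a-z0-9-]*[a-z0-9])?$";

    // Assumption: case-insensitive matching, because the pattern itself is lowercase-only.
    private static final Pattern PATTERN = Pattern.compile(REGEX, Pattern.CASE_INSENSITIVE);

    static boolean isValid(String email) {
        return PATTERN.matcher(email).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "wiliam'ferraciolli@hotmail.co.uk",           // apostrophe: accepted
            "wiliam334-1@mydomain.co.uk.me",              // 3 dots after the @: accepted
            "wiliam.ferraciolli@hotmail.com.com.com.com", // 4 dots after the @: rejected
            "wiliam..ferraciolli@hotmail.com",            // consecutive dots: rejected
            "wiliam%ferraciolli.@hotmail.com"             // dot right before the @: rejected
        };
        for (String s : samples) {
            System.out.println(s + " // " + isValid(s));
        }
    }
}
```

Note that %, $ and / are themselves in the allowed character class. The three samples in your invalid list that contain them are rejected because the local part ends with a dot, not because of those characters.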

Mariano