13

I have been researching the best way to check whether a String is a valid email address. I have narrowed it down to two options: javax.mail.internet.InternetAddress, or Apache Commons EmailValidator, which internally uses a complicated regex parser.

Is there any advantage to picking one over the other in terms of correctness, or are both just fine? I do know that InternetAddress doesn't handle non-ASCII characters well in some cases.

kuriouscoder
  • I would use Apache Commons since I don't see anything wrong with the regex validator. I don't know of a better way to validate an email address besides using a regular expression. Do you? – Icarus Aug 24 '11 at 03:18
  • take a look at http://download.oracle.com/javaee/5/api/javax/mail/internet/InternetAddress.html – kuriouscoder Aug 24 '11 at 03:26
  • Thanks for the link. How do you know that the library is not internally using a regex to validate an email address? And if it isn't, is that really better than using a regex? The documentation for validate() doesn't say how it performs the validation; it just says that it "checks many rules but not all pertaining to RFC 822" – Icarus Aug 24 '11 at 03:37
  • My question is very SIMPLE -- is there any pitfall of choosing one over the other? – kuriouscoder Aug 24 '11 at 03:43
  • There aren't any pitfalls since I'm sure both accomplish their task (validating an email address) appropriately. – Icarus Aug 24 '11 at 03:53

3 Answers

31

You can use EmailValidator from the Apache Commons Validator library for that:

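// Note: since Commons Validator 1.4 this class is deprecated in favor of
// org.apache.commons.validator.routines.EmailValidator.
import org.apache.commons.validator.EmailValidator;
...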

EmailValidator validator = EmailValidator.getInstance();
if (validator.isValid(email)) {
   // is valid, do something
} else {
   // is invalid, do something
}

The isValid method checks whether a field contains a valid e-mail address.

According to the question "What is the best Java email address validation method?", this is the best way to validate an email address in Java.

Boris
2

For something as well established as the email address format, the difference between the two approaches is minuscule. Then again, fifty years ago people never saw the need to use four digits to encode years, so...

The only 'pitfall' of using the regex from Apache Commons is that its email validation isn't part of the "Java standard". To what extent that affects you as a developer depends on how paranoid you are.

On the other hand, the standard Java implementation might be less efficient: you have to construct an InternetAddress and then validate it. Looking at JavaMail's source code, I found this comment:

/**
 * Check that the address is a valid "mailbox" per RFC822.
 * (We also allow simple names.)
 *
 * XXX - much more to check
 * XXX - doesn't handle domain-literals properly (but no one uses them)
 */

(The XXX markers seem to be notes or "to do" items.)
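Because the InternetAddress route reports an invalid address by throwing AddressException rather than returning a boolean, it is usually wrapped in a small helper. The following is only a sketch of that wrapping pattern: `ThrowingCheck` and `toPredicate` are hypothetical names, and a stand-in check is used so the block compiles without JavaMail on the classpath. The commented line shows where the real call would go.

```java
import java.util.function.Predicate;

public class ThrowingValidatorAdapter {

    /** Hypothetical functional interface: a check that throws when the input is invalid. */
    @FunctionalInterface
    public interface ThrowingCheck<T> {
        void check(T value) throws Exception;
    }

    /** Turns a throwing check into a boolean predicate: no exception means "valid". */
    public static <T> Predicate<T> toPredicate(ThrowingCheck<T> check) {
        return value -> {
            try {
                check.check(value);
                return true;
            } catch (Exception e) {
                // Any exception is treated as "invalid" in this sketch.
                return false;
            }
        };
    }

    public static void main(String[] args) {
        // With JavaMail available, the check would be:
        //   (String s) -> new javax.mail.internet.InternetAddress(s).validate()
        // Stand-in (assumption, NOT real email validation): exactly one '@', not at index 0.
        Predicate<String> isValid = toPredicate((String s) -> {
            if (s == null || s.indexOf('@') <= 0 || s.indexOf('@') != s.lastIndexOf('@')) {
                throw new IllegalArgumentException("not a plausible address: " + s);
            }
        });
        System.out.println(isValid.test("test@testy.com"));
        System.out.println(isValid.test("no-at-sign"));
    }
}
```

The same predicate type can also hold `EmailValidator.getInstance()::isValid`, which makes it easy to swap one validator for the other behind a single interface.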

Isaac
1

I've just tested it, and apparently InternetAddress performs substantially better than EmailValidator:

package com.avaya.oss.server.errors;

import javax.mail.internet.AddressException;
import javax.mail.internet.InternetAddress;

import org.apache.commons.validator.EmailValidator;

public class TestValidationTypes {

    static String email = "test@testy.com";
    static int maxItr = 10000;

    public static void main(String[] args) throws AddressException {

        long start = System.currentTimeMillis();
        for (int i = 0; i < maxItr; i++) {
            EmailValidator.getInstance().isValid(email);
        }
        System.out.println("EmailValidator duration: " + (System.currentTimeMillis() - start));

        start = System.currentTimeMillis();
        for (int i = 0; i < maxItr; i++) {
            InternetAddress internetAddress = new InternetAddress(email);
            internetAddress.validate();
        }
        System.out.println("InternetAddress duration: " + (System.currentTimeMillis() - start));

    }

}

Output:

EmailValidator duration: 1195

InternetAddress duration: 67

In this run, EmailValidator took about 18 times longer (1195 ms vs. 67 ms for 10,000 iterations).
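A single timed loop like the one above also measures class loading and JIT warm-up, so the ratio should be read with some caution. Below is a hedged sketch of a slightly fairer harness using only the JDK. `timeNanos` and the stand-in regex predicate are hypothetical; to compare the two libraries you would pass `EmailValidator.getInstance()::isValid` and an InternetAddress-based predicate instead. For serious measurements a dedicated benchmarking tool is the usual choice.

```java
import java.util.function.Predicate;
import java.util.regex.Pattern;

public class ValidatorTiming {

    /**
     * Calls the validator `warmup` times untimed, then returns the elapsed
     * nanoseconds for `iterations` timed calls.
     */
    public static long timeNanos(Predicate<String> validator, String input,
                                 int warmup, int iterations) {
        int sink = 0; // consume results so the calls are not trivially dead code
        for (int i = 0; i < warmup; i++) {
            if (validator.test(input)) sink++;
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            if (validator.test(input)) sink++;
        }
        long elapsed = System.nanoTime() - start;
        if (sink < 0) System.out.println(sink); // never true; only keeps sink in use
        return elapsed;
    }

    public static void main(String[] args) {
        // Stand-in validator (assumption): a deliberately simplistic regex,
        // not the one used by Commons Validator.
        Pattern simple = Pattern.compile("^[^@\\s]+@[^@\\s]+$");
        Predicate<String> standIn = s -> simple.matcher(s).matches();

        long nanos = timeNanos(standIn, "test@testy.com", 10_000, 10_000);
        System.out.println("stand-in validator: " + nanos / 1_000_000.0 + " ms");
    }
}
```

Timing each validator in a separate run, and with both valid and invalid inputs, gives a more representative picture than a single address.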

shemerk
  • `user@localhost` is considered valid. `bla@bla` is considered valid. `someone@[10.10.1.5]` is also considered valid. I think you would want to **treat them as invalid**, so I think taking 20 times longer is **worth** it. – Saravanabalagi Ramachandran Sep 08 '15 at 17:59
  • For anyone stumbling on this years later, [JMail](https://github.com/RohanNagar/jmail) is faster and more correct than both of these options. Plus, it is customizable so you can treat addresses with domain literals (like user@localhost) as invalid. – Rohan Jun 01 '21 at 23:19
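The objection raised in the comments above (addresses such as `user@localhost`, `bla@bla`, or `someone@[10.10.1.5]` passing validation) can also be handled with a stricter check layered on top of whichever validator is chosen. The sketch below uses only the JDK. `hasDottedDomain` is a hypothetical helper, and it is meant to run after a real validator has accepted the address, not to replace one.

```java
public class DomainPostCheck {

    /**
     * Returns true if the part after the last '@' looks like a dotted host
     * name: it is not a bracketed domain literal, contains at least one '.',
     * and does not start or end with a '.'.
     */
    public static boolean hasDottedDomain(String address) {
        if (address == null) return false;
        int at = address.lastIndexOf('@');
        if (at < 0 || at == address.length() - 1) return false; // no '@' or empty domain
        String domain = address.substring(at + 1);
        if (domain.startsWith("[")) return false; // domain literal, e.g. [10.10.1.5]
        int dot = domain.indexOf('.');
        return dot > 0 && !domain.endsWith(".");
    }

    public static void main(String[] args) {
        String[] samples = {
            "test@testy.com", "user@localhost", "bla@bla", "someone@[10.10.1.5]"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + hasDottedDomain(s));
        }
    }
}
```

Whether rejecting such addresses is desirable depends on the application: an intranet tool may legitimately need `user@localhost`, while a public sign-up form probably does not.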