8

I don't think that this question has been asked before... I certainly cannot find something with this requirement.

Background

There is an API that returns IDs of people. In general the ID should be treated as case sensitive... but if the ID is actually their email address... and you are talking to a less-than-stellar implementation of this API that returns a mixed-case version of their email address, there is plenty of fun to be had...

So you are talking to one implementation... it gives you back URL-like things as the ID, e.g.

  • http://foo.bar.com/blahblahblah

You could next be talking to another implementation... that gives you back some non-obvious ID, e.g.

  • as€jlhdésdj678hjghas7t7qhjdhg£

You could be talking to a nice implementation which gives you back a nice lowercase email address:

  • bob.mcspam@acme.org

Or you could be talking to the less-than-stellar implementation that returns the exactly equivalent ID:

  • bob.mcspam@ACME.org

RFC 2821 states that only the local part of the mailbox is case sensitive, but that exploiting the case sensitivity will cause a raft of interoperability issues...

What I want to do is identify the strings that are emails and force the domain to lowercase. Identifying the URI-like strings is easier, as the scheme is either http or https and I just need to lowercase the domain name, which is a lot easier to parse.
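A minimal sketch of the URI case described above, assuming `java.net.URI` is an acceptable parser for these IDs. The class and method names (`UriHostLowercaser`, `lowerCaseHost`) are hypothetical and not part of the API in question. It returns the input unchanged whenever anything looks unusual, so it errs on the side of false negatives.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

final class UriHostLowercaser {

    /** Lowercases only the host of an http/https URI; returns every other input unchanged. */
    public static String lowerCaseHost(String id) {
        final URI uri;
        try {
            uri = new URI(id);
        } catch (URISyntaxException e) {
            return id; // not parseable as a URI: leave untouched
        }
        String scheme = uri.getScheme();
        String host = uri.getHost();
        if (scheme == null || host == null) {
            return id; // relative reference or no server-based authority
        }
        if (!scheme.equalsIgnoreCase("http") && !scheme.equalsIgnoreCase("https")) {
            return id;
        }
        // The host starts after "scheme://" and after "userinfo@" if present.
        int hostStart = scheme.length() + 3;
        String userInfo = uri.getRawUserInfo();
        if (userInfo != null) {
            hostStart += userInfo.length() + 1;
        }
        // Safety check: bail out if the computed offset does not point at the host.
        if (!id.regionMatches(hostStart, host, 0, host.length())) {
            return id;
        }
        // Locale.ROOT keeps the result independent of the JVM's default locale.
        return id.substring(0, hostStart)
                + host.toLowerCase(Locale.ROOT)
                + id.substring(hostStart + host.length());
    }

    public static void main(String[] args) {
        System.out.println(lowerCaseHost("http://Foo.BAR.com/blahblahblah"));
    }
}
```

Only the host is rewritten in place; the path, query and any user info keep their original case.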

Question

If given a string provided by an external service, is there a test I can use that will determine if the string is an email address so I can force the domain name to lower case?

It is acceptable for a small % of email addresses to be missed and not get the domain name lowercased. (False negatives allowed)

It is not acceptable to force part of a string to lowercase if it is not the domain part of an email address. (False positives not allowed)

Update

Note that this question is subtly different from this and this as in the context of those two questions you already know that the string is supposed to be an email address.

In the context of this question we do not know if the string is an email address or something else... which makes this question different.

Stephen Connolly
  • apart from checking that the domain exists and has an email server in its DNS entry, why cant you use a regexp to check for syntactically-legal email addresses? plenty of those flying around. – radai Aug 27 '13 at 11:18
  • possible duplicate of [Verify email in Java](http://stackoverflow.com/questions/153716/verify-email-in-java) – Bernhard Barker Aug 27 '13 at 11:19
  • 3
    And then there's also [What is the best Java email address validation method?](http://stackoverflow.com/questions/624581/what-is-the-best-java-email-address-validation-method) – Bernhard Barker Aug 27 '13 at 11:20
  • @radai Well I don't want to invoke a DNS query on the code path where this code gets evaluated as that would introduce issues... specifically the server that this code is running on may not be able to validate the domain name that is in the returned ID. So checking DNS entries is out – Stephen Connolly Aug 27 '13 at 11:20
  • the dns-verification was an extra step, but apart from regexp i dont really see any other way out of this – radai Aug 27 '13 at 11:24
  • The specification for email addresses specified by RFC2821 is insane. But back in the day, things were different - RFC821 had to account for, and basically wrap, all manner of proprietary email addresses. Otherwise getting buy-in from the players of the day would have been impossible. That being said, email addresses are really overly flexible. The crazier (but valid) attributes are rarely used. Learn more about the brain-popping complexity of the task here: http://en.wikipedia.org/wiki/Email – Tony Ennis Aug 27 '13 at 12:04
  • @radai the way out this is to code to RFC2821. Or, the OP could decide that he'll only recognize as valid a subset of email address formats. A common-sense definition would probably be 99% accurate. – Tony Ennis Aug 27 '13 at 12:14

5 Answers

12

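- Try the code below; it may be helpful to you.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EmailCheck {

    public static void main(String[] args) {
        String email = "vivek.mitra@gmail.com";
        // Simple pattern: local part, '@', domain, then a 2-4 letter top-level domain.
        Pattern pattern = Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,4}");
        Matcher mat = pattern.matcher(email);

        // matches() requires the whole string to fit the pattern.
        if (mat.matches()) {
            System.out.println("Valid email address");
        } else {
            System.out.println("Not a valid email address");
        }
    }

}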

- Also take a look at this site, which shows a deeper validation using regular expressions.

Kumar Vivek Mitra
  • 1
    Note that `"ping @ pong!"@[1.2.3.4]` is a valid email address which your code will not correctly identify... now it is also a valid email address which does not need lowercasing of the bit after the `@` but that is a different issue. Using regex to identify email addresses is an anti-pattern – Stephen Connolly Aug 27 '13 at 12:08
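To make the limitation in that comment concrete, here is a small sketch that runs the pattern from the answer above against a few inputs. The class and method names (`RegexLimits`, `looksLikeEmail`) are hypothetical. The pattern rejects the quoted local part mentioned in the comment, and it also rejects any address whose top-level domain is longer than four letters.

```java
import java.util.regex.Pattern;

final class RegexLimits {

    // The pattern from the answer above, unchanged.
    static final Pattern SIMPLE =
            Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,4}");

    /** True only if the whole string fits the simple pattern. */
    public static boolean looksLikeEmail(String s) {
        return SIMPLE.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "bob.mcspam@ACME.org",          // plain address
            "\"ping @ pong!\"@[1.2.3.4]",   // quoted local part, IP-literal domain
            "bob@example.museum"            // six-letter top-level domain
        };
        for (String s : samples) {
            System.out.println(s + " -> " + looksLikeEmail(s));
        }
    }
}
```

For the question as asked, both rejections are false negatives, which are allowed; the real risk with a loose pattern is the opposite case, where a non-email ID happens to match.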
7

You can use the following to verify an email:

// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
String email = "nbjvkj@kn.com";
Pattern p = Pattern.compile(".+@.+\\.[a-z]+");
Matcher m = p.matcher(email);
boolean matchFound = m.matches();
if (matchFound) {
    //your work here
}
Fahim Parkar
Shiv
3

My suggestion is to use:

// value is a String
org.apache.commons.validator.routines.EmailValidator.getInstance().isValid(value)

https://commons.apache.org/proper/commons-validator/apidocs/org/apache/commons/validator/routines/EmailValidator.html

Pasha
  • 3
    Handy class, but the documentation gives this disclaimer: ```"This implementation is not guaranteed to catch all possible errors in an email address. "``` – KayO May 29 '20 at 15:01
2

Thanks to @Dukeling

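// Requires JavaMail on the classpath:
// import javax.mail.internet.AddressException;
// import javax.mail.internet.InternetAddress;
private static String toLowerCaseIfEmail(String string) {
    try {
        // Strict parsing: throws if the string is not a valid address.
        new InternetAddress(string, true);
    } catch (AddressException e) {
        return string;
    }
    // A trailing ']' means an address literal such as [1.2.3.4]; nothing to lowercase.
    if (string.trim().endsWith("]")) {
        return string;
    }
    int lastAt = string.lastIndexOf('@');
    if (lastAt == -1) {
        return string;
    }
    // Lowercase only the part from the last '@' onwards.
    return string.substring(0,lastAt)+string.substring(lastAt).toLowerCase();
}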

should, from what I can tell, do the required thing.

Update

Since the previous version ignored the possibility of (comment) syntax after the last @... which, let's face it, if we see it we should just bail out fast and return the string unmodified:

private static String toLowerCaseIfEmail(String string) {
    try {
        // Strict parsing: throws if the string is not a valid address.
        new InternetAddress(string, true);
    } catch (AddressException e) {
        return string;
    }
    int lastAt = string.lastIndexOf('@');
    // Bail out if there is no '@', or if an address literal "[...]"
    // or a comment "(...)" appears after the last '@'.
    if (lastAt == -1
        || string.lastIndexOf(']') > lastAt
        || string.lastIndexOf(')') > lastAt) {
        return string;
    }
    return string.substring(0,lastAt)+string.substring(lastAt).toLowerCase();
}
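The lowercasing step in the method above can be exercised without JavaMail if the validity check is passed in as a parameter. The following is only a sketch under that assumption: `DomainLowercaser`, `lowerCaseDomainIf` and the `Predicate<String>` argument are hypothetical names, and in the answer's version the predicate corresponds to `new InternetAddress(string, true)` not throwing. It also uses `Locale.ROOT`, since `toLowerCase()` without a locale depends on the JVM's default locale.

```java
import java.util.Locale;
import java.util.function.Predicate;

final class DomainLowercaser {

    /**
     * Lowercases everything from the last '@' onwards, but only if the
     * supplied check accepts the string and no ']' or ')' follows the '@'.
     */
    public static String lowerCaseDomainIf(Predicate<String> isEmail, String id) {
        if (!isEmail.test(id)) {
            return id; // not recognised as an email: leave untouched
        }
        int lastAt = id.lastIndexOf('@');
        if (lastAt == -1
                || id.lastIndexOf(']') > lastAt    // address literal, e.g. [1.2.3.4]
                || id.lastIndexOf(')') > lastAt) { // comment after the '@'
            return id;
        }
        return id.substring(0, lastAt) + id.substring(lastAt).toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // Stand-in check for demonstration only; a real caller would plug in a strict parser.
        Predicate<String> containsAt = s -> s.contains("@");
        System.out.println(lowerCaseDomainIf(containsAt, "bob.mcspam@ACME.org"));
    }
}
```

The stand-in predicate in `main` is far too permissive for production use; it is there only so the lowercasing logic can be run in isolation.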
Stephen Connolly
  • 1
    is this really answer?? – Fahim Parkar Aug 27 '13 at 11:41
  • 1
    Yes because the spec of emails in RFC 2821 is *a lot* stranger than you think and the best way using standard Java APIs is to let a spec compliant parser parse the request and bail for the routing path addresses (hence the `endsWith("]")`) – Stephen Connolly Aug 27 '13 at 11:47
  • I'm betting @StephenConnolly has sent a LOT of email. He knows the pain. – Tony Ennis Aug 27 '13 at 12:18
  • Of course my attempt doesn't handle comments in the email address... `john."M@c"."Smith!"(coolguy)@(thefantastic)[1.2.3.4](onlythebest)` is also a valid email address... but it *could* probably be ignored safely if instead of bailing with `endsWith("[")` the check is that `indexOf(']') < lastAt` since that will indicate an email address that doesn't *need* lowercasing as `[]` is only for IP addresses – Stephen Connolly Aug 27 '13 at 13:47
0
        Pattern pattern = Pattern.compile("^[A-Za-z0-9._]{1,16}+@{1}+[a-z]{1,7}\\.[a-z]{1,3}$");
        Matcher mail = pattern.matcher(your_mail);

        if (mail.find()) {
            System.out.println("True");
        } else {
            System.out.println("False");
        }