0

I have an email address validation regex, which I use in my code like this:

public class Test {

  public static void main(String[] args) {
    String lineIwant = "myname@asl.ramco-group.com";
    String emailreg = "^[_A-Za-z0-9-]+(\\.[_A-Za-z0-9-]+)*@[A-Za-z0-9]+(\\.[A-Za-z0-9]+)*(\\.[A-Za-z]{2,})$";
    // String.matches() only returns true if the whole string matches the pattern.
    boolean b = lineIwant.matches(emailreg);

    if (b) {
      System.out.println("Address is Valid");
    } else {
      System.out.println("Address is Invalid");
    }
  }
}

For the email address in the example, the Boolean is false even though this is a valid customer email address.

I suspect the cause is the hyphen between ramco and group, because when I remove it the Boolean is true.

How can I change my regex to accommodate such an email address?

Duncan Jones
Stanley Mungai
  • 3
    Don't re-invent the wheel - use something like [`EmailValidator`](http://commons.apache.org/proper/commons-validator/javadocs/api-1.4.0/). – Duncan Jones Apr 30 '13 at 08:26
  • 1
    For your information, I had already looked at that question, and that regex does not validate this specific email ID either. – Stanley Mungai Apr 30 '13 at 08:28
  • One of the approaches is [here](http://www.mkyong.com/regular-expressions/how-to-validate-email-address-with-regular-expression/). – skuntsel Apr 30 '13 at 08:28
  • @DuncanJones: Do you mind writing an answer? – nhahtdh Apr 30 '13 at 08:33
  • I didn't agree with the duplicate, so I've reopened. I don't understand the down-votes either, since you've supplied good example code and described your goals clearly. Up-vote from me. – Duncan Jones Nov 14 '14 at 08:18

4 Answers

16

Your regex does not allow a hyphen (-) after the @ sign, so

String emailreg = "^[_A-Za-z0-9-]+(\\.[_A-Za-z0-9-]+)*@[A-Za-z0-9-]+(\\.[A-Za-z0-9-]+)*(\\.[A-Za-z]{2,})$";

would "fix" this specific problem. But email addresses are much more complicated than that. Validating them using a regex is not a good idea. Check out @DuncanJones' comment.
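To see the effect of the change, here is a minimal sketch that runs the pattern from the question and the modified pattern against the address from the question. The class name `HyphenDomainCheck` and its `matches` helper are made up for this illustration; only the two regex strings come from this thread.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; only the two regex strings come from the thread.
class HyphenDomainCheck {

    // Pattern from the question: no hyphen allowed in the domain labels.
    static final Pattern ORIGINAL = Pattern.compile(
        "^[_A-Za-z0-9-]+(\\.[_A-Za-z0-9-]+)*@[A-Za-z0-9]+(\\.[A-Za-z0-9]+)*(\\.[A-Za-z]{2,})$");

    // Modified pattern: hyphen added to both domain character classes.
    static final Pattern WITH_HYPHEN = Pattern.compile(
        "^[_A-Za-z0-9-]+(\\.[_A-Za-z0-9-]+)*@[A-Za-z0-9-]+(\\.[A-Za-z0-9-]+)*(\\.[A-Za-z]{2,})$");

    // True only if the whole address matches the pattern.
    static boolean matches(Pattern pattern, String address) {
        return pattern.matcher(address).matches();
    }

    public static void main(String[] args) {
        String address = "myname@asl.ramco-group.com";
        System.out.println("original:    " + matches(ORIGINAL, address));
        System.out.println("with hyphen: " + matches(WITH_HYPHEN, address));
    }
}
```

The original pattern rejects the address and the modified one accepts it. Note that the modified pattern also accepts domain labels that begin or end with a hyphen, such as a@-bad-.com, which hostname rules do not permit. That is one small example of why the regex only "fixes" this case.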

Tim Pietzcker
16

I would recommend you don't try to solve this problem yourself. Instead, rely on a well-tested solution such as EmailValidator from commons-validator.

For example:

EmailValidator.getInstance().isValid(emailAddressString);
Duncan Jones
  • 2
    EmailValidator also gives you the option of excluding local email addresses so that the top level domain must be present, which was a requirement for me. – leojh Jan 30 '14 at 19:29
  • 2
    EmailValidator seems deprecated; it doesn't recognize the latest TLDs. – Antares Nov 13 '14 at 23:56
  • 1
    @Antares Can you give an example? – Duncan Jones Nov 14 '14 at 08:16
  • 1
    I was using EmailValidator 1.4, and it was considering "abcd@domain.email" as an invalid email address. – Antares Nov 14 '14 at 13:04
  • 1
    @Antares Looks like this is known about, see [VALIDATOR-305](https://issues.apache.org/jira/browse/VALIDATOR-305). So I guess you are stuck until a new release is made. – Duncan Jones Nov 14 '14 at 13:15
3

There are many regex validation strings that you can use, like here.

But is it really necessary to have an optimal solution? Don't you only need a suboptimal solution that covers 99.9999% of the addresses in use?

Read this article.

MarekM
2

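Add \\- to the character classes in the part of the regex string after the @. Both of them need it, because in the example address the hyphen appears in the second domain label (ramco-group), not the first.

The \\ (a single backslash once the Java string literal is parsed) is an escape telling the regex engine to treat the dash literally, rather than in its usual role of showing the difference between two values. So like this...

^[_A-Za-z0-9-]+(\\.[_A-Za-z0-9-]+)*@[A-Za-z0-9\\-]+(\\.[A-Za-z0-9\\-]+)*(\\.[A-Za-z]{2,})$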

Update

Tim pointed out in the comments that the escape is not necessary!

And a quick tip of my own: you might want to use \w in place of [A-Za-z0-9_] so you don't have to keep writing that over and over. And finally, get familiar with this site. Once you start using regex, it's a great help.
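Both points in this update can be checked with a short sketch. It assumes Java's default regex flags, under which `\w` is `[a-zA-Z_0-9]`. The class name `CharClassCheck`, its field names, and the sample strings are invented for this illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class CharClassCheck {

    // A hyphen at the end of a character class is literal, with or without the escape.
    static final Pattern ESCAPED = Pattern.compile("[A-Za-z0-9\\-]+");
    static final Pattern UNESCAPED = Pattern.compile("[A-Za-z0-9-]+");

    // \w already covers letters, digits and underscore, so only the hyphen is added.
    static final Pattern LONG_FORM = Pattern.compile("[_A-Za-z0-9-]+");
    static final Pattern SHORT_FORM = Pattern.compile("[\\w-]+");

    public static void main(String[] args) {
        String[] samples = {"ramco-group", "my_name", "abc123", "no spaces", "a.b"};
        for (String s : samples) {
            System.out.println(s
                + " escaped=" + ESCAPED.matcher(s).matches()
                + " unescaped=" + UNESCAPED.matcher(s).matches()
                + " long=" + LONG_FORM.matcher(s).matches()
                + " short=" + SHORT_FORM.matcher(s).matches());
        }
    }
}
```

For every sample, the escaped and unescaped classes agree with each other, and so do the long and short forms. The `\w` shorthand only applies where the underscore is wanted: the domain classes in the regex above leave it out, so `\w` would widen them.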

gmustudent
  • 1
    Not quite. First, the dash is used to denote a *range* between two characters, not a difference (but perhaps that's what you meant). Second, if the dash is the last character (or the first) in a character class, it doesn't need to be escaped. – Tim Pietzcker Apr 30 '13 at 08:30
  • Learned something new today, thanks Tim I had no idea. I always play it safe and add my escapes but that's a great tip thanks. – gmustudent Apr 30 '13 at 08:32
  • Note: `\w` is `A-Za-z0-9_`, so `\w` is a superset of `\d`. – nhahtdh Apr 30 '13 at 08:35
  • 1
    `\w` is great and all, since it shortens the regex, but the thing is that people keep forgetting that `_` and digits are included. And its meaning, although usually consistent between languages, might receive an upgrade to match Unicode characters, as in .NET regular expressions. I don't object to its use, though. – nhahtdh Apr 30 '13 at 08:39
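For Java specifically, the Unicode concern in the last comment can be checked directly. As far as I know, `\w` is ASCII-only by default in Java and becomes Unicode-aware only when the pattern is compiled with `Pattern.UNICODE_CHARACTER_CLASS` (available since Java 7). The class name `WordClassCheck` and the sample strings below are made up for this illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the sample strings are made up for illustration.
class WordClassCheck {

    // Default flags: the word class is [a-zA-Z_0-9] only.
    static final Pattern ASCII_WORD = Pattern.compile("\\w+");

    // With this flag, the word class also matches letters and digits outside ASCII.
    static final Pattern UNICODE_WORD =
        Pattern.compile("\\w+", Pattern.UNICODE_CHARACTER_CLASS);

    public static void main(String[] args) {
        // The second sample spells "muller" with a u-umlaut, written as an escape
        // so the source file encoding does not matter.
        String[] samples = {"name_1", "m\u00fcller"};
        for (String s : samples) {
            System.out.println(s
                + " ascii=" + ASCII_WORD.matcher(s).matches()
                + " unicode=" + UNICODE_WORD.matcher(s).matches());
        }
    }
}
```

With default flags the accented sample is rejected, and with the flag it is accepted. So in Java, `\w` inside an email regex stays ASCII-only unless the flag is set explicitly.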