2

I need to check if a string (local part of email address) has nothing except:

  • letters (a-zA-Z)
  • numbers (0-9)
  • underscores
  • at most one dot (.)

How do I do it using Java regex?

Example: a_1_b.c and a_b_1 should be accepted, but 1.a_b.2 and 1_a*3 should be rejected.

Amit Dalal
  • 1
    Are you asking about email addresses, or about your own format? Because `foo+bar` is a valid email inbox name (as in `foo+bar@example.com`) but would be rejected by your validation – Gareth Apr 26 '11 at 09:00
  • Possible duplicate of http://stackoverflow.com/questions/3274701/using-regular-expression-for-validating-data-is-correct-or-not if not try http://www.google.co.uk/search?q=email+address+correctness+using+regex 651K results. – Peter Lawrey Apr 26 '11 at 09:00
  • it's about my own format. Rejecting `foo+bar@example.com` should be okay. – Amit Dalal Apr 26 '11 at 09:02
  • 2
    can I suggest you hit the 'edit' button and give the question a more useful title then? :) – Gareth Apr 26 '11 at 09:03
  • Not an exact duplicate, but pretty relevant: http://stackoverflow.com/questions/156430/regexp-recognition-of-email-address-hard – Joachim Sauer Apr 26 '11 at 09:04
  • And since we're at it: [an "alphabet" is a **Set of letters**](http://en.wikipedia.org/wiki/Alphabet). "a", "b", "c" are letters, not "alphabets". – Joachim Sauer Apr 26 '11 at 10:40

5 Answers

8

If you want to verify email correctness you might want to just rely on the JavaMail API to do it for you. Then you don't need to worry about encoding the details of the RFC 822 specification into a regex. Not to mention if you're dealing with email addresses you likely want an easy way to send them, and the library has that too. You could verify that an email address is valid with simply:

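// Requires the JavaMail API on the classpath.
import javax.mail.internet.AddressException;
import javax.mail.internet.InternetAddress;

try {
    // The constructor parses the string and throws if it is malformed.
    new InternetAddress(email).getAddress();
} catch (AddressException e) {
    // it's not valid
}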
WhiteFang34
  • +1 for not reinventing the wheel. It will allow some that OP appears to want to not allow (although, he may be looking for a simple case because he doesn't want to cover all the cases manually. Hard to tell.) – corsiKa Apr 26 '11 at 09:08
  • @glowcoder: good point, and it would actually allow `Foo Bar `. One quick fix for that case would be to check that `email.equals(new InternetAddress(email).getAddress())` to verify it's not of that form. – WhiteFang34 Apr 26 '11 at 09:17
2

You should find more information here.

bastianwegge
  • 1
    While partially addressing the user's issue, simply linking another page is kind of weaksauce. :-( Not gonna -1, just sayin'... – corsiKa Apr 26 '11 at 09:07
  • Yeah sure, but to read it and post it as ones own idea is better? This is like using google for someone else and he even wrote "it worked for me", so what's the point? – bastianwegge Apr 26 '11 at 09:27
  • Don't post it as your own - always cite your sources. I would bring out the relevant information and put it in the post, personally. Provide it as a source and as a resource for further reading. – corsiKa Apr 26 '11 at 09:31
  • But that's kind of unfair to the person who wrote the article, isn't it? I respect your comment and that you might see it as "weaksauce", but that's just your subjective **opinion** and yours only. – bastianwegge Apr 26 '11 at 09:34
  • It isn't just my opinion. It's widespread throughout the community here. Consider http://meta.stackexchange.com/questions/29909/is-it-ok-to-answer-questions-with-just-a-link and the answers it links. The short answer is "no." So like I said, I'm not going to downvote it, because I do think it's useful, just that it could be more so. There isn't ~much~ value added, considering (from answering questions like this myself) that particular site is usually the first result in Google. – corsiKa Apr 26 '11 at 09:43
1

The regex [\w]*\.[\w]+|[\w]+ should work, I guess.
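A minimal sketch of how this regex could be tried in Java against the examples from the question. The class and method names (`WordCharRegexDemo`, `isValid`) are made up for illustration. In Java, `\w` matches `[a-zA-Z_0-9]` by default, and the backslashes must be doubled inside a string literal.

```java
class WordCharRegexDemo {
    // The regex from this answer as a Java string literal (backslashes doubled).
    static final String REGEX = "[\\w]*\\.[\\w]+|[\\w]+";

    // Hypothetical helper: String.matches only succeeds if the WHOLE input matches.
    static boolean isValid(String localPart) {
        return localPart.matches(REGEX);
    }

    public static void main(String[] args) {
        String[] samples = {"a_1_b.c", "a_b_1", "1.a_b.2", "1_a*3", ".abc", "abc."};
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```

Note that the first alternative uses `*` before the dot, so an input with a leading dot such as `.abc` is accepted, while a trailing dot such as `abc.` is not. Whether that is acceptable depends on the format you want.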

Matt Ellen
Amit Dalal
0

To check your own format, this should work:

Pattern p = Pattern.compile("[a-zA-Z0-9_]+[\\.]{0,1}[a-zA-Z0-9_]*");
  • a_1_b.c should be okay
  • a_1b also
  • 1.a_b.2 and 1_a*3 should get rejected

Test it on http://www.regexplanet.com/simple/index.html
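If you would rather check it locally, here is a small sketch using the pattern above. The class and method names (`LocalPartPatternDemo`, `isValid`) are only illustrative. It uses `Matcher.matches()`, which requires the entire input to match; `find()` would also succeed on `1_a*3` because it only looks for a matching substring.

```java
import java.util.regex.Pattern;

class LocalPartPatternDemo {
    // Same pattern as above, compiled once and reused.
    static final Pattern LOCAL_PART =
            Pattern.compile("[a-zA-Z0-9_]+[\\.]{0,1}[a-zA-Z0-9_]*");

    // Hypothetical helper: true only if the whole string fits the pattern.
    static boolean isValid(String s) {
        return LOCAL_PART.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"a_1_b.c", "a_1b", "1.a_b.2", "1_a*3", "abc.", ".abc"};
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```

With this pattern a trailing dot (`abc.`) is accepted, because the part after the dot may be empty, while a leading dot (`.abc`) is rejected.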

Vincent Mimoun-Prat
  • the dot might or might not be there. – Amit Dalal Apr 26 '11 at 09:09
  • by question, better use `*` instead of `+` (and a '?' for the dot) - the proposed pattern requests at least 2 characters separated by a dot – user85421 Apr 26 '11 at 09:20
  • 1
    You'll get a more readable regex by using the CASE_INSENSITIVE flag and `\\.?` instead of `[\\.]{0,1}` (inside a character class the dot needs no escaping anyway): `Pattern.compile("[a-z0-9_]+\\.?[a-z0-9_]*", Pattern.CASE_INSENSITIVE)` – bw_üezi Apr 27 '11 at 09:53
0

Assuming there also has to be exactly one '@', and it must come before (but not immediately before) the single '.', you could do

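private static final String charset = "[a-zA-Z0-9_]"; // separate this out for future fixes
// "\\." is a literal dot; a single backslash would not compile in a Java string
private static final String regex = charset + "+@" + charset + "+\\." + charset + "+";

boolean validEmail(String email) {
    // String.matches requires the whole string to match the regex
    return email.matches(regex);
}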
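For the local-part-only format from the question (no `@`, at most one dot), the same `charset` idea could be adapted roughly as follows. This is a sketch, not a definitive answer: the class and method names are made up, and it assumes that a dot, when present, must have at least one allowed character on each side, which the question does not state.

```java
class OptionalDotCheck {
    // Allowed characters, kept separate so a change is made in one place.
    private static final String CHARSET = "[a-zA-Z0-9_]";

    // One or more allowed characters, optionally followed by a single dot
    // and one or more allowed characters (assumption: no leading/trailing dot).
    private static final String LOCAL_PART_REGEX =
            CHARSET + "+(\\." + CHARSET + "+)?";

    // Hypothetical helper for the question's own format.
    static boolean validLocalPart(String s) {
        return s.matches(LOCAL_PART_REGEX);
    }

    public static void main(String[] args) {
        String[] samples = {"a_1_b.c", "a_b_1", "1.a_b.2", "1_a*3"};
        for (String s : samples) {
            System.out.println(s + " -> " + validLocalPart(s));
        }
    }
}
```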
corsiKa
  • @amdalal in that case, you're looking at the same thing with the front chopped off. `regex = charset + "+\\." + charset + "+";` I left `charset` as a separate variable so if you make a change to it (adding dashes for instance) you only have to change it in one place. – corsiKa Apr 26 '11 at 09:22
  • no doubt leaving `charset` separate is a good idea, but the `.` might not be there in the string. i was looking for at most one dot. – Amit Dalal Apr 26 '11 at 09:43
  • So you are okay with the dot NOT being there? – corsiKa Apr 26 '11 at 09:45