I am looking to improve my regex skills for use in my Ruby programming.
I have come up with the matcher below for emails.
Can it be improved? Will it work for all email addresses?
Is the mailto: bit ok?

/(mailto:)*\w+@\w+.[A-z]+.[A-z]{2,4}/

It matches addresses like

bob@test.com
bob_smith@test.com
bob_smith@prefix.test.com
abc@xyz.co.uk
mailto:fred@test.com
junky
  • It won't work for special or non-latin characters, such as nøx@nåx.dk – Nix Feb 28 '12 at 16:18
  • You do realize that `[mailto:]*` means "zero or more characters from the set {`:`, `a`, `i`, `l`, `m`, `o`, `t`}"? – ruakh Feb 28 '12 at 16:19
  • Good point ruakh, that's why I posted this. Do you know how it can be improved? – junky Feb 28 '12 at 16:21
  • possible duplicate of [How to use a regular expression to validate an email addresses?](http://stackoverflow.com/questions/201323/how-to-use-a-regular-expression-to-validate-an-email-addresses) – ruakh Feb 28 '12 at 16:21
  • it own't match filters, `bob+SO@test.com` – Evan Davis Feb 28 '12 at 16:22
  • How about `(mailto:)*\w+@\w+.[A-z]+.[A-z]{2,4}` ? – junky Feb 28 '12 at 16:22
  • What is your intent? To see if an email only matches your pattern? Or, is it to see if it is a valid address? The second is a much more difficult question, because an email address might match your pattern and still not be real. The simplest answer is to try sending a message to the address, asking the human at the other end to respond. Doing that answers your question too, along with other questions you'll probably want answered in subsequent steps. – the Tin Man Feb 28 '12 at 16:25
  • Mathletics, I don't get that, sorry. I don't want to allow a +. Please expand on your comment for clarity "it own't match filters" means something to you but nothing to me. Thanks! – junky Feb 28 '12 at 16:26
  • Hi Tin Man - sorry to repeat my first line, but "I am looking to improve my regex skills for use in my Ruby programming." – junky Feb 28 '12 at 16:27
  • Actually you **do** want to allow a `+` as that's perfectly valid in an email address; you're also missing `bob.smith@pan-cakes.com`. And `\w+` isn't what you want for a host name, find and study the appropriate RFCs or better, practice your regex skills on something less complicated. – mu is too short Feb 28 '12 at 16:33
  • Could the folks being negative find some other questions to comment on please. I am looking for help with regexp's not opinions about why I am using them, if I am using them, if it's appropriate, etc. Those are great questions but not the question I am asking. I am looking for help with the immediate task at hand, not the philosophy or reasoning behind it. – junky Feb 28 '12 at 16:46
  • Hi mu, I didn't know that a + was valid. Thank You. – junky Feb 28 '12 at 16:47
  • junky: if you're talking about the RFC "spec" for email addresses, then ANY character is valid _so long as it is escaped properly_ - thus, `"The@night;train.2%katmandu\ *(A#story!)@some.server.gift-horse.justified.BigBuilding.museum` is a valid email address according to the RFC (as far as I know) - most servers have more restrictive rules, though. – Code Jockey Feb 28 '12 at 20:31
  • As for "filters", you can learn more about how an email address can be structured, including "tags" (or filters) on [Wikipedia](http://en.wikipedia.org/wiki/Email_address#Address_tags) – Code Jockey Feb 28 '12 at 20:33
  • Finally, the best site I know of for learning regex in general is [Regular-Expressions.info](http://www.regular-expressions.info) – Code Jockey Feb 28 '12 at 20:34
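
The points raised in the comments above can be checked directly in Ruby. This is only a small sketch: `ORIGINAL` is the pattern from the question, and the other names and sample strings are made up for illustration.

```ruby
# The pattern from the question, unchanged.
ORIGINAL = /(mailto:)*\w+@\w+.[A-z]+.[A-z]{2,4}/

# 1. An unescaped "." matches any character, not only a literal dot.
dot_is_wildcard = ORIGINAL.match?("bob@testXcomYorg")        # true

# 2. [A-z] covers ASCII 65..122, which also includes [ \ ] ^ _ and `.
caret_in_a_to_z  = /[A-z]/.match?("^")                       # true
caret_in_letters = /[A-Za-z]/.match?("^")                    # false

# 3. [mailto:]* is a character class (any mix of those letters),
#    while (?:mailto:)? is the literal prefix, zero or one time.
class_matches_tomato = /\A[mailto:]*\z/.match?("tomato:")    # true
group_matches_tomato = /\A(?:mailto:)?\z/.match?("tomato:")  # false

# 4. Without anchors the pattern is free to match only part of the input,
#    so an address with a "+" tag still "matches" from the tag onwards.
tail = ORIGINAL.match("bob+SO@test.com")[0]                  # "SO@test.com"
```

Escaping the dots (`\.`), writing `[A-Za-z]`, and anchoring with `\A` and `\z` address points 1, 2 and 4 respectively.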

1 Answer

Short answer: no. Not all email addresses can be checked by a regex. There's a thread here on SO that explains this much better than I could if I attempted it. I think the only way to check whether an address is really an email address is to contact the mail server and enquire whether the user account exists.

Please have a read here: https://stackoverflow.com/a/1373724/81520
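
For regex practice rather than real validation, a tightened version of the pattern from the question might look something like the sketch below. `SIMPLE_EMAIL` and `plausible_email?` are hypothetical names, and the pattern is deliberately loose: it makes no attempt to follow the RFCs, rejects quoted local parts and non-ASCII addresses such as `nøx@nåx.dk`, and says nothing about whether the mailbox exists.

```ruby
# Hypothetical, deliberately simple pattern; NOT an RFC-compliant validator.
#   \A ... \z        anchor to the whole string (in Ruby, ^ and $ match per line)
#   (?:mailto:)?     optional literal prefix, at most once
#   [\w.+-]+         local part: word characters plus ".", "+" and "-"
#   [a-z\d-]+        first host label (hyphens allowed, e.g. pan-cakes)
#   (?:\.[a-z\d-]+)+ one or more further labels, each after a literal dot
SIMPLE_EMAIL = /\A(?:mailto:)?[\w.+-]+@[a-z\d-]+(?:\.[a-z\d-]+)+\z/i

# Returns true when the text merely looks like an address.
def plausible_email?(text)
  SIMPLE_EMAIL.match?(text)
end

accepted = %w[
  bob@test.com
  bob_smith@prefix.test.com
  abc@xyz.co.uk
  mailto:fred@test.com
  bob+SO@test.com
  bob.smith@pan-cakes.com
].all? { |address| plausible_email?(address) }

rejected = ["bob@test", "bob@testXcom", "junk bob@test.com", "bob@test..com"]
           .none? { |address| plausible_email?(address) }
```

Both `accepted` and `rejected` should come out `true` here. A string that passes is only plausible; as noted above, confirming that an address is real still means sending mail to it.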

Peter Perháč