6

I am coding a site in PHP and am currently working on the Contact Us page. What is the best way to validate an email address?

  1. By sending a validation link to their email?
  2. Regex
  3. Any other method?

Could you also tell me why, and point me to a guide for achieving it? I don't want someone to write the code for me, because that's no fun and I won't learn anything; I'm just after some guidance on the techniques behind either of the methods above.

I am also going to use these methods to implement a subscribe button on my webpage. Is this the best way to do that? Are there any other methods I should consider?

RSM
  • 1
    Possible duplicate: http://stackoverflow.com/questions/201323/what-is-the-best-regular-expression-for-validating-email-addresses – Anax Jul 15 '10 at 07:56
  • 2
    Regex won't validate an email, it will only validate that the user input looks like an email. (adasdsa@dadsd.com will validate) If you really need to validate, you have to send a validation email. – ahmetunal Jul 15 '10 at 07:57

10 Answers

16

I usually go through these steps

  1. Regex
  2. Send an activation code to the email

If the first step fails, the address never reaches the second step. If sending the email fails because the address doesn't exist, I delete the account or handle it some other way.

--edit

3 - If for some reason the activation email doesn't get sent, the address isn't deleted straight away. It stays unapproved for 7 days (or however long you configure), and resending is attempted every 2-3 hours. If there is still no success after those days, the address is deleted.

4 - If the email was sent successfully but the address was never activated, it stays unapproved, but it can be activated at any time by generating a new activation code.
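The activation-code part of these steps could be sketched roughly as below. This is only an illustration, not the answerer's code: the helper names, the constant, and the idea of expiring the code itself after the 7-day window are assumptions, and storage and mail sending are left out.

```php
<?php
// Hypothetical helpers for illustration; database and mail code are omitted.

// Assumption: the activation code expires after the same 7-day window.
const ACTIVATION_TTL_SECONDS = 7 * 24 * 60 * 60;

/**
 * Create a fresh activation code. Only the hash is meant to be stored;
 * the plain token is what goes into the activation email.
 */
function createActivation(int $now): array
{
    $token = bin2hex(random_bytes(16)); // 32 hexadecimal characters
    return [
        'token'      => $token,
        'token_hash' => hash('sha256', $token),
        'expires_at' => $now + ACTIVATION_TTL_SECONDS,
    ];
}

/** Check a submitted token against the stored hash and expiry time. */
function isActivationValid(string $submitted, string $storedHash, int $expiresAt, int $now): bool
{
    if ($now > $expiresAt) {
        return false; // expired: generate a new code instead
    }
    // hash_equals() compares in constant time
    return hash_equals($storedHash, hash('sha256', $submitted));
}
```

Calling createActivation() again for the same account is one way to implement the "generate a new activation code" step.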

Flakron Bytyqi
  • I would also like to add that, for user convenience, assign the non-validated user some sort of 'pre-approved' status so that when mail delivery is slow the user can still make use of your services. And also, don't be too strict in your regex, just make sure that it 'kinda sorta looks like an email address', you don't want too many false negatives. – Dennis Haarbrink Jul 15 '10 at 08:01
  • @Dennis: I don't think that entering a plausible but potentially false email address is sufficient reason to give the user any more access. – Steven Sudit Aug 01 '10 at 00:54
  • Please don't use a regex to validate an email address. It isn't possible to write a regex that matches the spec from RFC 2822 exactly. You will end up with both false positives and false negatives. False negatives are a big problem because they stop valid email addresses from getting through. Jeffrey Friedl developed an email-matching regex in "Mastering Regular Expressions". It was something like 7000 characters long, and matched 98% of valid address formats. It's better to just use a library that uses this. – bluesmoon Aug 01 '10 at 12:15
11

I think the best is a combination of 3. and 1.

In an initial phase you check the e-mail address syntactically (to catch typos):

filter_var($email, FILTER_VALIDATE_EMAIL)

In a second phase you send an e-mail containing a confirmation link (to catch both errors and deliberately wrong information).
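A minimal sketch of the first phase, assuming the address arrives as a raw form value. The function name precheckEmail is made up for illustration. A pass only means the input looks like an address; the confirmation e-mail is still what proves it.

```php
<?php
/**
 * Return the trimmed address if it passes PHP's built-in syntax check,
 * or null otherwise.
 */
function precheckEmail(string $input): ?string
{
    $email  = trim($input);
    // filter_var() returns the filtered value on success and false on failure
    $result = filter_var($email, FILTER_VALIDATE_EMAIL);
    return $result === false ? null : $result;
}
```

Returning null rather than false keeps the "no usable address" case easy to tell apart from any string result.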

Artefacto
4

The best way to do it is to send an email with a validation link in it. If you don't want activation emails, at the very least validate the email address. The best email validation function is the RFC-compliant email address validator by Dominic Sayers.

Simply include the php file in your project and use it like this:

if (is_email($email, $checkDNS, $diagnose)) //$checkDNS and $diagnose are false by default
    echo 'Email valid';
else
    echo 'Email invalid';
  • If $checkDNS is set to true, it will also check that the domain exists. If the domain doesn't exist, the function returns false even if the address is syntactically valid.
  • If $diagnose is set to true, the function returns a code instead of a boolean; the code tells you why the address is invalid (or is 0 if it is valid).
AlexV
3

That depends on whether or not the user actually wants to receive a response.

If the user asks a question, they'll want a response and will probably give a valid e-mail address. In this case, I'd use a very loose regex check to catch typos or a missing address. (Something like .+@.+.)
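As a rough sketch, a loose check of that kind might look like this in PHP. The function name is made up for illustration, and the pattern is simply the one quoted above with anchors added.

```php
<?php
/**
 * Very loose check: something, an "@", then something.
 * It only catches empty or obviously mistyped input.
 */
function looksLikeEmail(string $input): bool
{
    return preg_match('/^.+@.+$/', trim($input)) === 1;
}
```

It accepts a local address such as user@localhost on purpose, so it should produce few false negatives.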

If the user does not want to be contacted, but you want to know their address, you'll need to work with a validation link. There is no other way to ensure that the e-mail address is valid and belongs to the user.

Jens
3

The only way to really know whether an email address is valid is to send an email to it. If you really have to, use one of these. Technically, there don't even have to be any periods after the @ for local domains. All that's necessary is that a domain follows the @.

Community
Jacob Greenleaf
2

A regex is not really suitable for determining the validity of email address syntax, and the FILTER_VALIDATE_EMAIL option for the filter_var function is rather unreliable too. I use the EmailAddressValidator Class to test email address syntax.

I have put together a few examples of incorrect results returned by filter_var (PHP Version 5.3.2-1ubuntu4.2). There are probably more. Some are admittedly a little extreme, but still worth noting:

RFC 1035 2.3.1. Preferred name syntax
https://www.rfc-editor.org/rfc/rfc1035
Summarised as: a domain consists of labels separated by dot separators (not necessarily true for local domains though).

echo filter_var('name@example', FILTER_VALIDATE_EMAIL);
// name@example

RFC 1035 2.3.1. Preferred name syntax
The labels must follow the rules for ARPANET host names. They must start with a letter, end with a letter or digit, and have as interior characters only letters, digits, and hyphen.

echo filter_var('name@1example.com', FILTER_VALIDATE_EMAIL);
// name@1example.com

RFC 2822 3.2.5. Quoted strings
https://www.rfc-editor.org/rfc/rfc2822#section-3.2.5
This is valid (although it is rejected by many mail servers):

echo filter_var('name"quoted"string@example', FILTER_VALIDATE_EMAIL);
// FALSE

RFC 5321 4.5.3.1.1. Local-part
https://www.rfc-editor.org/rfc/rfc5321#section-4.5.3.1.1
The maximum total length of a user name or other local-part is 64 octets.
Test with 70 characters:

echo filter_var('AbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghij@example.com', FILTER_VALIDATE_EMAIL);
// AbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghij@example.com

RFC 5321 4.5.3.1.2. Domain
https://www.rfc-editor.org/rfc/rfc5321#section-4.5.3.1.2
The maximum total length of a domain name or number is 255 octets.
Test with 260 characters:

echo filter_var('name@AbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghij.com', FILTER_VALIDATE_EMAIL);
// name@AbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghij.com
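The two length limits quoted above can be checked separately from filter_var. A minimal sketch follows; the function and constant names are assumptions, and it only covers the length rules, not the rest of the syntax.

```php
<?php
const MAX_LOCAL_PART_OCTETS = 64;  // RFC 5321 4.5.3.1.1
const MAX_DOMAIN_OCTETS     = 255; // RFC 5321 4.5.3.1.2

/** Check only the RFC 5321 length limits of an address. */
function withinRfc5321Lengths(string $email): bool
{
    $at = strrpos($email, '@'); // the domain follows the last "@"
    if ($at === false) {
        return false;
    }
    $local  = substr($email, 0, $at);
    $domain = substr($email, $at + 1);

    // strlen() counts bytes, which matches the limits being stated in octets
    return strlen($local) >= 1 && strlen($local) <= MAX_LOCAL_PART_OCTETS
        && strlen($domain) >= 1 && strlen($domain) <= MAX_DOMAIN_OCTETS;
}
```

Run on the 70-character local part and the 260-character domain label from the tests above, this check returns false for both.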

Have a look at Validate an E-Mail Address with PHP, the Right Way for more information.

Mark Amery
Mike
  • 1
    Doug Lovell's article from Linux Journal is factually wrong in several cases. He repeats the original mistakes in RFC 3696 that have now been corrected in the errata. Unfortunately Linux Journal have not seen fit to correct this misleading article and it still gets cited as an authority. – Dominic Sayers Mar 15 '11 at 11:19
2

Before sending off a validation email you could also use checkdnsrr() to verify that the domain exists and does have MX records set up. This will detect emails that use bogus domains (like user@idontexist.com).

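// Note: $field and $msg are accepted but not used in this snippet.
function validateEmail($email, $field, $msg = '')
{
    // Step 1: syntax check with PHP's built-in filter
    if (!filter_var($email, FILTER_VALIDATE_EMAIL))
    {
        return false;
    }
    // Step 2: look up MX records for the part after the "@"
    list($user, $domain) = explode('@', $email);
    if (function_exists('checkdnsrr') && !checkdnsrr($domain, 'MX'))
    {
        return false;
    }
    return true;
}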

We need to use function_exists() to verify checkdnsrr() is available to us because it was not available on Windows before PHP 5.3.
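A variation worth considering is sketched below. RFC 5321 lets a sending server fall back to the domain's address records when no MX record exists, so this version also accepts A or AAAA records. The function name and the $lookup parameter are assumptions added for illustration. $lookup exists only so the logic can be tried without real DNS traffic; by default it calls checkdnsrr().

```php
<?php
/**
 * Check whether a domain could plausibly receive mail: an MX record,
 * or failing that an A or AAAA record.
 *
 * $lookup is a hypothetical injection point with the same
 * (hostname, type) signature as checkdnsrr().
 */
function domainMayAcceptMail(string $domain, ?callable $lookup = null): bool
{
    $lookup = $lookup ?? 'checkdnsrr';
    foreach (['MX', 'A', 'AAAA'] as $type) {
        if ($lookup($domain, $type)) {
            return true;
        }
    }
    return false;
}
```

Like the MX-only check, this is a hint rather than proof: a domain that resolves may still reject mail for that user.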

John Conde
  • 1
    Why do you think that MX records are required for email to be delivered? "If no MX records were present, the server falls back to A, that is to say, it makes a request for the A record of the same domain." – sanmai Jul 29 '10 at 03:51
  • @sanmai While that may be true in theory you rarely, if ever, see that happen in practice. Plus when it comes to validating email address, with the exception of sending an email to the address and awaiting a response, no automated process is going to be perfect. This method included. But if bad email addresses being provided is a problem this will help to mitigate that. – John Conde Jul 29 '10 at 13:45
  • @john-conde I've seen this happen at least a couple of times. Imagine explaining to a manager why your valuable client can't register using his working (they checked) email address. – sanmai Jul 30 '10 at 09:20
  • I've seen the same thing sanmai has. It's not common, but it's real. Ultimately, checking for an MX record doesn't buy you much, anyhow, so I wouldn't bother. – Steven Sudit Aug 01 '10 at 00:55
  • Hmmm. Perhaps then this would best be used as one of several factors in determining an email address's probability of being legit? Used only in conjunction with other tests. – John Conde Aug 01 '10 at 05:27
1

Depends upon your objective. If you must have a valid and active email, then you must send an email that requires verification of receipt. In this case, there is no need for regex validations except as a convenience to your user.

But if your desire is to help the user avoid typos while minimizing user annoyance, validate with regex.

kingjeffrey
0

Some good answers here, and I agree with the chosen one except for the regex bit. As other people have pointed out it's difficult if not impossible to find a regex that fully implements all the quirks of RFC 5321.

You are welcome to use my free PHP function is_email() to validate addresses. It's available here.

It will ensure that an address is fully RFC 5321 compliant. It can optionally also check whether the domain actually exists.

You shouldn't rely on a validator to tell you whether a user's email address actually exists: some ISPs give out non-compliant addresses to their users, particularly in countries which don't use the Latin alphabet. More in my essay about email validation here: http://isemail.info/about.

Dominic Sayers
-1
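function checkEmail($email) {
  // Narrow pattern: plain letters, digits, dots, underscores and hyphens only
  if (preg_match("/^([a-zA-Z0-9])+([a-zA-Z0-9\._-])*@([a-zA-Z0-9_-])+([a-zA-Z0-9\._-]+)+$/", $email)) {
    // split() was removed in PHP 7; explode() does the same job here
    list($username, $domain) = explode('@', $email);
    if (!checkdnsrr($domain, 'MX')) {
      return false;
    }
    return true;
  }
  return false;
}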
Sachin R
  • 1
    Your regex rejects many valid email addresses. For example *@example.com, "Hello world"@example.com, someone@[127.0.0.1], someone@[2001:1234:1234::1.2.3.4] – jcoder Jul 15 '10 at 08:16
  • While I'd let you off with the ipv6 mismatch, the rest of the regex is just too poor to consider. – symcbean Jul 16 '10 at 14:49