This must have a canonical answer but I cannot find it... The question "Using a regular expression to validate an email address" has answers which show that regex is really not the best way to validate emails, yet searching online keeps turning up lots and lots of regex-based answers.

That question is about PHP and an answer references a handy class MailAddress. C# has something very similar but what about plain old C++? Is there a boost/C++11 utility to take all the pain away? Or something in WinAPI/MFC, even?

Mr. Boy
  • To be honest, don't try too hard. Check it contains exactly 1 `@`, and then try to send an email to it. You'll have to do the latter anyway, and it's the only way to prove the address exists, even if it is semantically valid. – BoBTFish Jan 15 '16 at 11:41
  • Fair point @BoBTFish but in our implementation email requests go into a queue so the user won't get immediate feedback if the send failed - therefore I'd prefer to be _reasonably_ strict. – Mr. Boy Jan 15 '16 at 11:53
  • So, what rule do you wish to use? – David Heffernan Jan 15 '16 at 12:06
  • The RFC allows some addresses that you may never even consider could be valid. – Jonathan Potter Jan 15 '16 at 12:12
  • @DavidHeffernan if there is nothing ready-made then I suppose "a.b.c@xyz.def" is _probably_ about as far as we'd need. I don't expect we'd have any wacky edge cases, even supporting '+' is probably not a requirement. – Mr. Boy Jan 15 '16 at 12:13
  • Here is more fun reading on the subject: http://girders.org/blog/2013/01/31/dont-rfc-validate-email-addresses/ – Vlad Feinstein Jan 15 '16 at 14:41
  • In the end it all depends on your comfort level between false-positives and false-negatives. Do you, for example, accept the principle that `"It is better that ten guilty persons escape than that one innocent suffer"`? :) – Vlad Feinstein Jan 15 '16 at 19:34
  • @BoBTFish: Although it's used in the *vast* majority of current email addresses, even the `@` isn't strictly required. – Jerry Coffin Jan 18 '16 at 17:56

1 Answer

I had to write my own solution because the g++ version I have installed doesn't support std::regex (the application crashes), and I don't want to upgrade the compiler for a single e-mail validation. Since this application will probably never need any other regex, I wrote a function that does the job. You can easily adjust the allowed characters for each part of the e-mail address (before the @, after the @, and after the '.') depending on your needs. It took 20 minutes to write and was far easier than messing with compiler and environment settings just for one function call.

Here you go, have fun:

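#include <algorithm> // std::transform
#include <cctype>    // std::tolower
#include <cstdint>   // uint16_t
#include <string>

bool emailAddressIsValid(std::string _email)
{
    bool retVal = false;

    //Convert to lowercase (go through unsigned char: passing a negative char to tolower is undefined behaviour)
    std::transform(_email.begin(), _email.end(), _email.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });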

    //Edit these strings to change which characters are accepted in each section. Remember to update the matching length in the for-loops below.

    const char* validCharsName = "abcdefghijklmnopqrstuvwxyz0123456789.%+_-"; //length = 41, change in loop
    const char* validCharsDomain = "abcdefghijklmnopqrstuvwxyz0123456789.-"; //length = 38, change in loop
    const char* validCharsTld = "abcdefghijklmnopqrstuvwxyz"; //length = 26, change in loop

    bool invalidCharacterFound = false;
    bool atFound = false;
    bool dotAfterAtFound = false;
    uint16_t letterCountBeforeAt = 0;
    uint16_t letterCountAfterAt = 0;
    uint16_t letterCountAfterDot = 0;

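    //size_t index: a uint16_t would wrap around and never terminate on inputs longer than 65535 characters
    for (std::size_t i = 0; i < _email.length(); i++) {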
        char currentLetter = _email[i];

        //Found first @? Let's mark that and continue
        if (atFound == false && dotAfterAtFound == false && currentLetter == '@') {
            atFound = true;
            continue;
        }

        //Found first '.' after @? Let's mark that and continue
        if (atFound == true && dotAfterAtFound == false && currentLetter == '.') {
            dotAfterAtFound = true;
            continue;
        }

        //Count characters before @ (must be > 0)
        if (atFound == false && dotAfterAtFound == false) {
            letterCountBeforeAt++;
        }

        //Count characters after @ (must be > 0)
        if (atFound == true && dotAfterAtFound == false) {
            letterCountAfterAt++;
        }

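        //Count characters after the first '.' (dot) after @ (.tld, must be between 2 and 6 characters).
        //Note: everything after that first dot is treated as the TLD, so addresses with subdomains (user@mail.example.com) are rejected.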
        if (atFound == true && dotAfterAtFound == true) {
            letterCountAfterDot++;
        }

        //Validate characters, before '@'
        if (atFound == false && dotAfterAtFound == false) {
            bool isValidCharacter = false;
            for (uint16_t j = 0; j < 41; j++) {
                if (validCharsName[j] == currentLetter) {
                    isValidCharacter = true;
                    break;
                }
            }
            if (isValidCharacter == false) {
                invalidCharacterFound = true;
                break;
            }
        }

        //Validate characters, after '@', before '.' (dot)
        if (atFound == true && dotAfterAtFound == false) {
            bool isValidCharacter = false;
            for (uint16_t k = 0; k < 38; k++) {
                if (validCharsDomain[k] == currentLetter) {
                    isValidCharacter = true;
                    break;
                }
            }
            if (isValidCharacter == false) {
                invalidCharacterFound = true;
                break;
            }
        }

        //After '.' (dot), and after '@' (.tld)
        if (atFound == true && dotAfterAtFound == true) {
            bool isValidCharacter = false;
            for (uint16_t m = 0; m < 26; m++) {
                if (validCharsTld[m] == currentLetter) {
                    isValidCharacter = true;
                    break;
                }
            }
            if (isValidCharacter == false) {
                invalidCharacterFound = true;
                break;
            }
        }

        //Break the loop to speed things up if one character was invalid
        if (invalidCharacterFound == true) {
            break;
        }
    }

    //Compare collected information and finalize validation. If all matches: retVal -> true!
    if (atFound == true && dotAfterAtFound == true && invalidCharacterFound == false && letterCountBeforeAt >= 1 && letterCountAfterAt >= 1 && letterCountAfterDot >= 2 && letterCountAfterDot <= 6) {
        retVal = true;
    }

    return retVal;
}
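
The hard-coded lengths (41, 38, 26) have to be kept in sync with the strings by hand. Below is a minimal sketch of the same rules written with `std::string::find` and `find_first_not_of`, so the lengths never need counting. The names `emailLooksValid` and `nonEmptyAndOnly` are made up for this sketch, it assumes a compiler with C++11 lambdas, and it keeps the same simple rules as above (first `@`, first `.` after it, 2 to 6 letters after that):

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: true if `part` is non-empty and contains only characters from `allowed`.
static bool nonEmptyAndOnly(const std::string& part, const char* allowed)
{
    return !part.empty() && part.find_first_not_of(allowed) == std::string::npos;
}

// Hypothetical variant of emailAddressIsValid using the same character sets.
bool emailLooksValid(std::string email)
{
    // Lowercase the input; unsigned char avoids undefined behaviour in tolower.
    std::transform(email.begin(), email.end(), email.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    // Split at the first '@' and at the first '.' that follows it.
    const std::string::size_type at = email.find('@');
    if (at == std::string::npos) return false;
    const std::string::size_type dot = email.find('.', at + 1);
    if (dot == std::string::npos) return false;

    const std::string name   = email.substr(0, at);
    const std::string domain = email.substr(at + 1, dot - at - 1);
    const std::string tld    = email.substr(dot + 1);

    return nonEmptyAndOnly(name,   "abcdefghijklmnopqrstuvwxyz0123456789.%+_-")
        && nonEmptyAndOnly(domain, "abcdefghijklmnopqrstuvwxyz0123456789.-")
        && nonEmptyAndOnly(tld,    "abcdefghijklmnopqrstuvwxyz")
        && tld.size() >= 2 && tld.size() <= 6;
}
```

With these rules `emailLooksValid("a.b.c@xyz.def")` is accepted, while an address with a subdomain such as `user@mail.example.com` is rejected, because everything after the first dot has to consist of letters only.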
Steini