0

I would like to create a custom validation formula in Microsoft Excel (.xlsx) to check for the correct format of email addresses in bulk. I have over 50,000 emails and I want to flag incorrect email addresses. For example, emails without .com, emails without “@” character, emails with multiple “@” characters, emails with leading space, and emails with trailing space.

After several hours of searching online I found a few useful formulas but they don’t work well with existing data - it appears to work best for new email entries.

=ISNUMBER(MATCH("*@*.?*",A1,0))
=AND(FIND("@",A1),FIND(".",A1),ISERROR(FIND(" ",A1)))

I also found this formula from another thread on SO: It checks for commas.

=AND(ISNUMBER(FIND("@",A2)),ISNUMBER(FIND(".",A2)),ISERROR(OR(FIND(" ",A2),FIND(",",A2))))

Any help or suggestions would be greatly appreciated. Thanks so much!

Oskar_U
  • 472
  • 4
  • 13
  • 1
    I've tried all these formula's and they work as expected with a mocked up dataset. Can you specify how they don't work with existing data, or what other issues you have with them? – Plutian Sep 05 '19 at 10:06
  • 1
    When you write "*for example, emails without …*", that implies the list not all inclusive. So your first task is to lay out an INCLUSIVE statement. Then it is just a matter of writing a formula that includes ALL of your requirements. And, unless you have a limited set of requirements, it is not a trivial task. And the only real way to validate an email address that someone provides to you is to verify you can send emails to it. – Ron Rosenfeld Sep 05 '19 at 10:20
  • Thanks so much for your response. I have to copy and paste the formula into a new column and it returns true or false. I'm skeptical because I only have about 100 false returns from over 50,000 emails. They were manually entered so I'm expecting more false records. Additionally, =AND(FIND("@",A1),FIND(".",A1),ISERROR(FIND(" ",A1))) gives me the following error in the data validation tool "the formula currently evaluates to an error. Do you want to continue?". Any suggestions? Thanks everyone for looking into this. – javaSurfer456 Sep 06 '19 at 00:43
  • You can ignore that message. It just means that the cell in which you are entering the validation doesn't pass. (Because the formula is returning `#VALUE` instead of `FALSE`. But your DV should work anyway. However, DV is not going to flag already entered invalid emails. It is designed to trigger when you try to enter something in the cell. You may want to consider conditional formatting for the Flag. – Ron Rosenfeld Sep 06 '19 at 01:26
  • Most likely, if you have many fewer invalid email addresses than you should, your formula is not screening sufficiently. – Ron Rosenfeld Sep 06 '19 at 01:30
  • You have to find out all the possible invalid scenarios before writing the formula. Check this [article](https://stackoverflow.com/questions/2049502/what-characters-are-allowed-in-an-email-address) out for some ideas. I suspect you are trying to use data validation but again you need to know all the scenario before defining the formula. – Terry W Sep 06 '19 at 03:39

0 Answers0