
I have a very large data file, and I need to find the lines that contain incorrectly formatted e-mail addresses. Please help me, thank you!

1: How many records have incorrect email addresses (lines with an @ in them but formatted incorrectly)? An email address has a user-id and a domain name; domain names can consist of letters, numbers, periods, and dashes. An email address should have a top-level domain (something.top-level-domain). Top-level domains are of the form com, org, edu, etc.

I know how to find the email addresses: grep -E "\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+.[A-Za-z]{2,6}\b" HW1_Data.txt. But if I use grep -E -v "\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+.[A-Za-z]{2,6}\b" HW1_Data.txt, I just get everything except the email addresses, so I don't know how to proceed.

James

1 Answer


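First run grep @ HW1_Data.txt to get all the lines that contain an @ and so could hold an email address. Then filter out the lines that contain a correctly formatted address; what remains are the lines with an @ but no valid address. The command is: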

grep @ HW1_Data.txt | grep -E -v "\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}\b"
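Since the question asks how many such records there are, here is a minimal sketch of the same two-step filter with a count added. It assumes a grep that supports \b in extended regular expressions (as GNU grep does), escapes the dot before the top-level domain so it only matches a literal period, and uses a small made-up sample file in place of HW1_Data.txt; the variable names are only for illustration.

```shell
# Hypothetical sample data standing in for HW1_Data.txt.
sample=$(mktemp)
cat > "$sample" <<'EOF'
alice@example.com
bob@localhost
carol@@example
no email on this line
dave.smith@mail.example.org
EOF

# Same pattern as above, with the dot before the top-level domain escaped.
pattern='\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}\b'

# Step 1 keeps lines containing an @; step 2 drops lines holding a valid address.
bad_lines=$(grep @ "$sample" | grep -E -v "$pattern")

# -c prints the number of remaining lines instead of the lines themselves.
bad_count=$(grep @ "$sample" | grep -E -v -c "$pattern")

printf '%s\n' "$bad_lines"
echo "incorrectly formatted: $bad_count"
rm -f "$sample"
```

On this sample the lines that survive both steps are bob@localhost and carol@@example, so the count is 2. The line without any @ is never counted, because the first grep already removed it.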

Krassi Em
This is what I did, but this command just returns everything except the email addresses! I need to get the incorrectly formatted emails. – James Sep 23 '17 at 04:02