So I've been given the task of going through our mailing list and "cleaning" the emails which are slightly incorrect (trailing symbols, random symbols accidentally in the middle of the email, syntactically invalid, etc.) and to delete the entries which are totally bogus. The problem is I'm finding some of the emails in the database have already been syntactically corrected but the original entries still exist. There are over 2000 emails which my query is returning as incorrect (it seems past cleanings encountered upwards of 10,000 bad emails!) and going through them one at a time is no good. Does anyone have any suggestions for how to go about this?
Here's the query I used to return the bad emails:
SELECT id,email
FROM table
WHERE NOT
(
CHARINDEX(' ',LTRIM(RTRIM([email]))) = 0
AND LEFT(LTRIM([email]),1) <> '@'
AND RIGHT(RTRIM([email]),1) <> '.'
AND CHARINDEX('.',[Email],CHARINDEX('@',[email])) - CHARINDEX('@',[email]) > 1
AND LEN(LTRIM(RTRIM([email]))) - LEN(REPLACE(LTRIM(RTRIM([email])),'@','')) = 1
AND CHARINDEX('.',REVERSE(LTRIM(RTRIM([email])))) >= 3
AND (CHARINDEX('.@',[email]) = 0 AND CHARINDEX('..',[email]) = 0)
)
or id in (select id from table where email like '%[+;(,!]%')