All , i have the below query in my applicaton which i use to check for valid and invalid email addresses. The input string can contain multiple emails addresses saperated by a semi colon ';'. The regular expression covers most of the valid emails untill someone pointed me towards the wiki link (https://en.wikipedia.org/wiki/Email_address ) where the list of valid emails are showing as invalid in my query . Bascially from my observation from the list what i do see is that any set of characters enclosed in double quotes ' " ' is considered a valid email address which includes characters like @ and ; also (copied the list below). Also , DNS name (.com or any other) are also not mandatory as per wiki.
Can someone please help me come up with a regular expression or any other approach to include all valid cases.
WITH EMAIL_ADDRESS AS
(SELECT NVL(TRIM(REGEXP_SUBSTR('a@b.com;@n.com', '[^;]+', 1, LEVEL)), ' ') EMAIL_ID
FROM DUAL
CONNECT BY LEVEL <= REGEXP_COUNT(RTRIM('a@b.com;@n.com', ';'), ';') + 1)
SELECT EMAIL_ID,
(CASE
WHEN REGEXP_INSTR(LOWER(EMAIL_ID),
'^[[:alnum:]'']+((\&|\.|\+|\-|_)[[:alnum:]'']+)?+@[[:alnum:]-]+((\.[a-z]{2,}){1,4})$') = 1 THEN
'Y'
ELSE
'N'
END) IS_VALID_EMAIL
FROM EMAIL_ADDRESS;
EMAIL_ID I
-------------- -
a@b.com Y
@n.com N