I am trying to extract email addresses from an HTML file. I only need addresses whose top-level domain (TLD) has a single level. For example, in the list below, the addresses with a two-level TLD (test@domain.co.au and test.abc@domain.ac.nz) are invalid in this case:
- test@domain12.com
- test123@domain-12.com
- test@domain.co.au
- test.abc@domain.ac.nz
- test@abc.co
- example@testdomain.net
- sample@organization.org
I am using the following regex. It works fine when the input contains only an email address, but as soon as any text follows the address, it no longer matches.
(?=<\s|^)\b[a-zA-Z0-9.-]+@[a-zA-Z0-9-]+.[a-zA-Z]{2,6}$(?=\s|$|.+)
Success case:
- test@domain12.com
- example@testdomain.net
- sample@organization.org
Failure case:
- test@domain12.com random text after email address
- example@testdomain.net random text after email address
- sample@organization.org random text after email address
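The two cases above can be reproduced with a short script. This is only a minimal sketch: it assumes Python's `re` module (the question does not say which regex engine is in use), and `matches_current` is a hypothetical helper name. The pattern is copied unchanged from above. The `$` anchor only matches at the end of the input, which appears to be why any trailing text makes the match fail.

```python
import re

# Pattern copied verbatim from the question; using Python's `re` is an assumption.
CURRENT_PATTERN = re.compile(
    r"(?=<\s|^)\b[a-zA-Z0-9.-]+@[a-zA-Z0-9-]+.[a-zA-Z]{2,6}$(?=\s|$|.+)"
)

def matches_current(line):
    """Return True if the current pattern finds a match anywhere in `line`."""
    return CURRENT_PATTERN.search(line) is not None

samples = [
    "test@domain12.com",
    "test@domain12.com random text after email address",
]
for line in samples:
    # The `$` in the pattern requires the address to end the input,
    # so the second sample is not matched.
    print(f"{line!r}: {matches_current(line)}")
```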
Any help in this scenario will be really appreciated.
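For reference, here is one possible direction, offered as a rough sketch rather than a definitive answer. It again assumes Python's `re` module, and the names `EMAIL_SINGLE_TLD` and `find_emails` are hypothetical. The idea is to drop the `$` anchor, escape the dot before the TLD, and use a negative lookahead to reject a match when another domain label follows (as in `.co.au`). The character classes are kept from the pattern in the question and are not a full implementation of the email address grammar.

```python
import re

# Hypothetical pattern: same character classes as in the question, but
# - a negative lookbehind keeps the match from starting mid-word,
# - the dot before the TLD is escaped,
# - a negative lookahead rejects a second domain level such as ".au".
EMAIL_SINGLE_TLD = re.compile(
    r"(?<![\w.@-])"
    r"[a-zA-Z0-9.-]+@[a-zA-Z0-9-]+\.[a-zA-Z]{2,6}"
    r"(?![a-zA-Z0-9-]|\.[a-zA-Z0-9])"
)

def find_emails(text):
    """Return all addresses in `text` whose domain has exactly one dot."""
    return EMAIL_SINGLE_TLD.findall(text)

if __name__ == "__main__":
    lines = [
        "test@domain12.com random text after email address",
        "test@domain.co.au",
        "Contact test123@domain-12.com or test.abc@domain.ac.nz.",
    ]
    for line in lines:
        print(find_emails(line))
```

Because the lookahead only inspects what follows the TLD, ordinary text or a sentence-ending period after the address does not prevent a match.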