The regex I'm using can't capture the entire email address when it sits inside an HTML tag. It drops the final (top-level) domain.

My regex pattern looks like this:

(?<!mailto:)(?<=^|[^A-Za-z0-9_\-\.+@])[_a-zA-Z0-9-]+(\.[_a-zA-Z0-9-]+)*@[a-zA-Z0-9-]+(\.[a-zA-Z0-9-]+)*(\.[a-zA-Z]{2,})(?!\<\/a\>)

[Image: the regex tested in an online regex tester]

The image above shows my regex in an online regex tester. It is also the output I get when I extract the email and print it on my website: instead of matching "testing.user@dom.longdomain.se", it only matches "testing.user@dom.longdomain". When I leave out the HTML tag, it matches the full address.

Any idea what my regex is missing, or am I looking at it incorrectly?

Fresh Java
  • `(?!\<\/a\>)` means *fail if immediately followed with `</a>`*. – Wiktor Stribiżew Mar 31 '20 at 13:20
  • @WiktorStribiżew Yes I believe you're correct. So to my understanding it removes the final domain if there is a closing tag. Thank you. – Fresh Java Mar 31 '20 at 13:23
  • Does this answer your question? [Java regex email](https://stackoverflow.com/questions/8204680/java-regex-email) – andrewJames Mar 31 '20 at 13:28
  • May I ask why you didn't search further on this site for similar questions? The terms "java regex email" brought up a large number of questions. And I didn't even bother with Google. – NomadMaker Mar 31 '20 at 13:31

1 Answer

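I think I found the cause after running the pattern through a regex explainer. The negative lookahead at the end of the pattern rejects any match that is immediately followed by the closing `</a>` tag, so the engine backtracks to a shorter match that leaves off the final domain: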

(?!\<\/a\>)

I removed it and it seems to work fine. I will test further to make sure it works correctly.
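To illustrate the behaviour, here is a minimal Java sketch that runs the pattern with and without the trailing lookahead. The class name `EmailLookaheadDemo`, the helper `firstMatch`, and the sample input `<a>...</a>` are assumptions made for the example; the exact HTML from the screenshot is not shown in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmailLookaheadDemo {
    // The pattern from the question, minus the trailing (?!\<\/a\>) lookahead.
    static final String BASE =
        "(?<!mailto:)(?<=^|[^A-Za-z0-9_\\-\\.+@])"
        + "[_a-zA-Z0-9-]+(\\.[_a-zA-Z0-9-]+)*"
        + "@[a-zA-Z0-9-]+(\\.[a-zA-Z0-9-]+)*(\\.[a-zA-Z]{2,})";

    // Original pattern: a match may not be immediately followed by </a>.
    static final Pattern WITH_LOOKAHEAD = Pattern.compile(BASE + "(?!\\<\\/a\\>)");

    // Pattern after removing the lookahead, as described in the answer.
    static final Pattern WITHOUT_LOOKAHEAD = Pattern.compile(BASE);

    // Hypothetical helper: returns the first match in the input, or null if none.
    static String firstMatch(Pattern pattern, String input) {
        Matcher m = pattern.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Assumed sample input; the real markup may differ.
        String html = "<a>testing.user@dom.longdomain.se</a>";
        String plain = "testing.user@dom.longdomain.se";

        // The full address is followed by </a>, so the lookahead rejects it and
        // the engine backtracks until ".longdomain" is taken as the last group.
        System.out.println(firstMatch(WITH_LOOKAHEAD, html));    // prints: testing.user@dom.longdomain

        // No closing tag follows, so the full address passes the lookahead.
        System.out.println(firstMatch(WITH_LOOKAHEAD, plain));   // prints: testing.user@dom.longdomain.se

        // Without the lookahead the full address is matched inside the tag too.
        System.out.println(firstMatch(WITHOUT_LOOKAHEAD, html)); // prints: testing.user@dom.longdomain.se
    }
}
```

Note that removing the lookahead also means addresses directly followed by `</a>` are now matched. If the lookahead was there to skip emails that are already link text, that case would need to be handled some other way.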

Fresh Java