
I am trying to write a regex to extract email addresses from a WP directory using WebHarvy (.NET).

The emails could be in multiple formats, using dots and underscores, so I tried the following expressions:

(\w+|\w+(\W|\.)\w+)@\w+.\w+
\w.+|\w+\S\w+@\w+\.\w+

Both seem to work in the Regexstorm tester, but when I use them in WebHarvy, they extract only the part preceding the @.

Please advise

Wiktor Stribiżew
blackystrat

1 Answer


The problem is that WebHarvy returns the capturing group value. Since you wrapped the user part with a capturing group ((\w+|\w+(\W|\.)\w+)), it returns only that part.
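To see the difference between the whole match and the first capturing group, here is a minimal sketch in Python. Python's re module stands in for the .NET engine (the constructs used here behave the same way in both), and the helper first_group_or_match is a hypothetical stand-in for the extraction behavior described above, not WebHarvy's actual code.

```python
import re

# First pattern from the question, unchanged.
question_pattern = re.compile(r"(\w+|\w+(\W|\.)\w+)@\w+.\w+")

def first_group_or_match(pattern, text):
    """Hypothetical helper: return group 1 if the pattern defines a
    capturing group, otherwise the whole match (assumed tool behavior)."""
    m = pattern.search(text)
    if m is None:
        return None
    return m.group(1) if pattern.groups >= 1 else m.group(0)

sample = "john.doe@example.com"

# The regex itself matches the full address...
whole = question_pattern.search(sample).group(0)

# ...but group 1 only wraps the part before the @.
captured = first_group_or_match(question_pattern, sample)

print(whole)     # john.doe@example.com
print(captured)  # john.doe
```

This would explain why a tester that highlights the whole match looks fine while a tool that reads group 1 shows only the user part.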

You may fix your regex by wrapping the whole pattern in a single capturing group and using a non-capturing group ((?:...)) inside it:

(\w+(?:\W+\w+)*@\w+\.\w+)

or use a more generic

([^\s<>'"]+@[^\s<>'"]+\.[^\s<>'"]+)

The [^\s<>'"]+ will match one or more characters other than whitespace, <, >, ' and ". The @ and \. match a literal @ and a literal . respectively.
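A quick way to check both patterns is to confirm that group 1 now holds the full address. The sketch below again uses Python's re module as a stand-in for .NET; the sample addresses and variable names are made up for illustration.

```python
import re

# Both patterns from this answer; the outer parentheses are group 1.
fixed_pattern = re.compile(r"(\w+(?:\W+\w+)*@\w+\.\w+)")
generic_pattern = re.compile(r"""([^\s<>'"]+@[^\s<>'"]+\.[^\s<>'"]+)""")

# Isolated addresses: \W also matches whitespace, so the first pattern
# can absorb preceding words when an address sits inside running text.
samples = ["jane.doe@example.com", "first_last@example.org"]
fixed_hits = [fixed_pattern.search(s).group(1) for s in samples]

# The generic pattern stops at whitespace, so it can scan a sentence.
text = "Write to jane.doe@example.com or first_last@example.org"
generic_hits = generic_pattern.findall(text)

print(fixed_hits)    # ['jane.doe@example.com', 'first_last@example.org']
print(generic_hits)  # ['jane.doe@example.com', 'first_last@example.org']
```

The generic pattern does not exclude punctuation such as commas, so an address directly followed by a comma would include it in the match; trimming afterwards may be needed in that case.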

Wiktor Stribiżew