
In my case, the incoming content may contain an email address in any of the formats below.

1. From : dr-dscsda-ds-info@asdf.com
2. From:sadaedf@sdc.sdfds
3. From: sdcasdc <information@dsf.net>
4. From: "adads" <adsasd@adsc.cd>
5. From: <customercare@ddsaf.com>

I need to find whichever of the above patterns occurs and extract the email address from it.

The regex I tried:

From\s*:\s*((["|']?)[\w|\s]+\2)?\s*(\<?)([\w]([\w\-\.\+\'\/]*)@([\w\-\.]*)(\.[a-zA-Z]{2,22}(\.[a-zA-Z]{2}){0,2}))(\>?)

All of the formats match my regex. For every format except 1 and 2, I can extract the email address from the match using group(4).

In cases 1 and 2, however, group(4) does not give me the full email address.

Please help me alter the regex so that it handles all of my cases.

Roshan
  • You’ve got so many capturing groups… use `(?:`…`)` more often. Use `(`…`)` only when needed, e.g. to extract the email itself. It’s getting quite confusing otherwise. – Sebastian Simon Feb 20 '17 at 07:05
  • [`From\s*:\s*.*?([^"\n<>]+@[^"\n<>]+\.[^"\n<>]+)`](https://regex101.com/r/L4TdiL/1) appears to work just fine. – Sebastian Simon Feb 20 '17 at 07:10
  • Possible duplicate of [What is the correct regular expression for an email address?](http://stackoverflow.com/questions/450696/what-is-the-correct-regular-expression-for-an-email-address) – tripleee Feb 20 '17 at 07:54
  • @Xufox "Appears to work" until you add more test cases, yes. This is a common FAQ and the standard answer is "don't do that; or at least understand the problem well enough to understand what compromises you need to make". Not using a regex is usually the sane way forward. – tripleee Feb 20 '17 at 07:55

1 Answer

Your regex allows an optional unquoted name before the actual email address. When no name is present, that part still consumes the leading letters of the address, because the regex cannot know whether they belong to the email address or to an unquoted name.
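To make that concrete, here is a minimal sketch that runs the regex from the question and prints group 4. The thread does not say which language is in use, so Python's `re` module is an assumption here, and `group4` is a hypothetical helper name.

```python
import re

# The regex from the question, unchanged. Python is an assumption;
# the question does not name a language.
ORIGINAL = re.compile(
    r"""From\s*:\s*((["|']?)[\w|\s]+\2)?\s*(\<?)([\w]([\w\-\.\+\'\/]*)@([\w\-\.]*)(\.[a-zA-Z]{2,22}(\.[a-zA-Z]{2}){0,2}))(\>?)"""
)

def group4(line):
    """Return group 4 of the first match, or None if nothing matches."""
    match = ORIGINAL.search(line)
    return match.group(4) if match else None

# The optional "name" part grabs the leading word characters of the address,
# so group 4 only keeps what is left over.
print(group4("From : dr-dscsda-ds-info@asdf.com"))    # r-dscsda-ds-info@asdf.com
print(group4("From:sadaedf@sdc.sdfds"))               # f@sdc.sdfds
# With a real name in front, the name part has something else to consume.
print(group4("From: sdcasdc <information@dsf.net>"))  # information@dsf.net
```

The name part is greedy, so it gives back only the single character that group 4 needs in order to start matching.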

I think your regex may be a bit too complex; it can be simplified. This regex appears to work just fine for those cases:

/From\s*:\s*.*?([^"'<>\s]+@[^"<>\s]+\.[^"'<>\s]+)/gi

RegEx101 Demo.

The first capturing group gives you the email address.

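It matches runs of characters before and after an @ symbol that contain no whitespace or delimiters such as quotes and angle brackets. After the @, a dot followed by at least one more character has to appear, which stands in for the TLD.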

This can’t filter out all invalid emails or match all valid emails, but it’s good enough for most cases.
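As a quick check of the simplified pattern, the sketch below applies it to the five formats from the question. Python is an assumption (the thread names no language), the `/gi` flags are expressed as `re.IGNORECASE` plus `findall`, and `extract_from_addresses` is a hypothetical helper name.

```python
import re

# The simplified pattern from this answer; group 1 is the email address.
FROM_EMAIL = re.compile(
    r"""From\s*:\s*.*?([^"'<>\s]+@[^"<>\s]+\.[^"'<>\s]+)""",
    re.IGNORECASE,
)

def extract_from_addresses(text):
    """Return the group-1 capture of every 'From:' match, in order."""
    return FROM_EMAIL.findall(text)

samples = [
    'From : dr-dscsda-ds-info@asdf.com',
    'From:sadaedf@sdc.sdfds',
    'From: sdcasdc <information@dsf.net>',
    'From: "adads" <adsasd@adsc.cd>',
    'From: <customercare@ddsaf.com>',
]

for line in samples:
    # Each sample holds exactly one header, so the list has one entry.
    print(extract_from_addresses(line))
```

Because `.` does not match a newline by default, the same function can be run over a block of text that contains several `From:` lines.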

Sebastian Simon