I have the following function:

public static string ReturnEmailAddresses(string input)
    {

        string regex1 = @"\[url=";
        string regex2 = @"mailto:([^\?]*)";
        string regex3 = @".*?";
        string regex4 = @"\[\/url\]";

        Regex r = new Regex(regex1 + regex2 + regex3 + regex4, RegexOptions.IgnoreCase | RegexOptions.Multiline);
        MatchCollection m = r.Matches(input);
        if (m.Count > 0)
        {
            StringBuilder sb = new StringBuilder();
            int i = 0;
            foreach (var match in m)
            {
                if (i > 0)
                    sb.Append(Environment.NewLine);
                string shtml = match.ToString();
                var innerString = shtml.Substring(shtml.IndexOf("]") + 1, shtml.IndexOf("[/url]") - shtml.IndexOf("]") - 1);
                sb.Append(innerString); //just titles                    
                i++;
            }

            return sb.ToString();
        }

        return string.Empty;
    }

As you can see, I define a URL in the "markdown" format:

[url = http://sample.com]sample.com[/url]

In the same way, emails are written in that format too:

[url=mailto:service@paypal.com.au]service@paypal.com.au[/url]

However, when I pass in a multiline string containing multiple email addresses, it returns only the first one. I would like it to return every match, but I cannot get that working.

For example

[url=mailto:service@paypal.com.au]service@paypal.com.au[/url] /r/n a whole bunch of text here /r/n more stuff here [url=mailto:anotheremail@paypal.com.au]anotheremail@paypal.com.au[/url]

This returns only the first email address above.

user1112324
  • The "Multiline" Regex option is for when you want to use `^` and `$` to match the beginning and end of a line rather than the beginning and end of the whole string. If you aren't using those tokens, that option is meaningless. – Abion47 Jan 30 '17 at 00:24
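To illustrate the comment above, here is a minimal sketch of what RegexOptions.Multiline changes. The class and method names (MultilineDemo, CountLineStarts) are hypothetical and exist only for this demonstration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MultilineDemo
{
    // Hypothetical helper: counts matches of a word anchored with ^.
    // Without Multiline, ^ matches only at the start of the whole string.
    // With Multiline, ^ also matches at the start of every line.
    public static int CountLineStarts(string input, RegexOptions options)
    {
        return Regex.Matches(input, @"^\w+", options).Count;
    }
}
```

For the input "one\ntwo\nthree", this should count one match with RegexOptions.None and three with RegexOptions.Multiline. A pattern with no ^ or $ in it, like the one in the question, behaves the same either way.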

2 Answers

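The mailto:([^\?]*) part of your pattern matches far too much. A negated character class also matches line breaks, and * is greedy, so with no ? in the input it runs to the end of the string and then backtracks only as far as the last [/url]. The result is a single match spanning from the first tag to the last one, and your Substring call then extracts only the first address from it. Add the closing bracket ] to the excluded characters so that this portion stops at the end of the "mailto" section instead of overflowing into the text within the "url" tags: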

\[url=mailto:([^\?\]]*).*?\[\/url\]

See this link for an example: https://regex101.com/r/zcgeW8/1
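As a rough sketch of how the corrected pattern could be used in the original function, the version below reads the address from capture group 1 instead of slicing the match with Substring. That is a deliberate change from the question's code, which returned the text between the tags; the two are identical in the sample input. The class name EmailExtractor and the field name MailtoTag are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class EmailExtractor
{
    // Corrected pattern: "]" is now excluded, so the capture stops at the
    // end of the mailto: target instead of running to the end of the input.
    private static readonly Regex MailtoTag = new Regex(
        @"\[url=mailto:([^\?\]]*).*?\[\/url\]",
        RegexOptions.IgnoreCase);

    public static string ReturnEmailAddresses(string input)
    {
        var addresses = new List<string>();
        foreach (Match match in MailtoTag.Matches(input))
        {
            // Group 1 holds the text between "mailto:" and the first "?" or "]".
            addresses.Add(match.Groups[1].Value);
        }

        // Joining an empty list yields an empty string, as in the original.
        return string.Join(Environment.NewLine, addresses);
    }
}
```

RegexOptions.Multiline is left out here because the pattern uses neither ^ nor $.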

Abion47

You can extract the desired result with a positive lookahead and a positive lookbehind. See http://www.rexegg.com/regex-lookarounds.html

Try this regex: (?<=\[url=mailto:).*?(?=\])

The above regex captures both email addresses from the sample string:

[url=mailto:service@paypal.com.au]service@paypal.com.au[/url] /r/n a whole bunch of text here /r/n more stuff here [url=mailto:anotheremail@paypal.com.au]anotheremail@paypal.com.au[/url]

Result:

service@paypal.com.au
anotheremail@paypal.com.au
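A minimal sketch of applying the lookaround pattern in C# might look like the following. The class and method names (LookaroundExample, ExtractAddresses) are hypothetical; the pattern is the one given above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class LookaroundExample
{
    // Hypothetical helper: returns every mailto: target found in the input.
    public static string[] ExtractAddresses(string input)
    {
        // The lookbehind and lookahead assert the surrounding tag text
        // without consuming it, so each match value is the address alone.
        return Regex.Matches(input, @"(?<=\[url=mailto:).*?(?=\])", RegexOptions.IgnoreCase)
            .Cast<Match>()
            .Select(m => m.Value)
            .ToArray();
    }
}
```

Note that this pattern keeps everything up to the closing bracket, so a link such as mailto:a@b.com?subject=Hi would come back with its query string attached.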
Saleem