I have a regex that works well and turns URLs in a string into clickable links.
string regex = @"((www\.|(http|https|ftp|news|file)+\:\/\/)[_.a-z0-9-]+\.[a-z0-9\/_:@=.+?,##%&~-]*[^.|\'|\# |!|\(|?|,| |>|<|;|\)])";
Now, how can I tell it to ignore already clickable links and images?
It should leave strings like the following untouched:
<a href="http://www.someaddress.com">Some Text</a>
<img src="http://www.someaddress.com/someimage.jpg" />
Example:
The website www.google.com, once again <a href="http://www.google.com">www.google.com</a>, the logo <img src="http://www.google.com/images/logo.gif" />
Desired result:
The website <a href="http://www.google.com">www.google.com</a>, once again <a href="http://www.google.com">www.google.com</a>, the logo <img src="http://www.google.com/images/logo.gif" />
Full HTML parser code:
// Matches bare URLs that start with "www." or with a scheme such as "http://".
string regex = @"((www\.|(http|https|ftp|news|file)+\:\/\/)[_.a-z0-9-]+\.[a-z0-9\/_:@=.+?,##%&~-]*[^.|\'|\# |!|\(|?|,| |>|<|;|\)])";
Regex r = new Regex(regex, RegexOptions.IgnoreCase);
// Wrap every match in an anchor, then add the missing scheme to hrefs that start with "www".
text = r.Replace(text, "<a href=\"$1\" title=\"Click to open in a new window or tab\" target=\"_blank\" rel=\"nofollow\">$1</a>")
    .Replace("href=\"www", "href=\"http://www");
return text;
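One direction that might work, as a minimal sketch rather than a tested solution: put the things to skip in front of the URL pattern as alternatives, and let a `MatchEvaluator` decide what to do with each match. A whole `<a>...</a>` element or any other tag (such as `<img ... />`) is returned unchanged, and only a bare URL is wrapped. The names `LinkHelper`, `Linkify`, and `Pattern` are made up for illustration, and the sketch assumes reasonably well-formed markup without nested anchors.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LinkHelper
{
    // Alternatives are tried left to right at each position:
    //   1. a complete <a ...>...</a> element (skipped, including its link text)
    //   2. any other single tag, e.g. <img ... /> (skipped, including its attributes)
    //   3. a bare URL, captured in the named group "url" (same pattern as above)
    private static readonly Regex Pattern = new Regex(
        @"<a\b[^>]*>.*?</a>|<[^>]+>|" +
        @"(?<url>(www\.|(http|https|ftp|news|file)+\:\/\/)[_.a-z0-9-]+\.[a-z0-9\/_:@=.+?,##%&~-]*[^.|\'|\# |!|\(|?|,| |>|<|;|\)])",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    public static string Linkify(string text)
    {
        return Pattern.Replace(text, m =>
        {
            // Existing anchors and other tags did not fill the "url" group: leave them as they are.
            if (!m.Groups["url"].Success)
            {
                return m.Value;
            }

            string url = m.Groups["url"].Value;

            // Add a scheme for URLs that start with "www." (replaces the string Replace call).
            string href = url.StartsWith("www.", StringComparison.OrdinalIgnoreCase)
                ? "http://" + url
                : url;

            return "<a href=\"" + href + "\" title=\"Click to open in a new window or tab\" target=\"_blank\" rel=\"nofollow\">" + url + "</a>";
        });
    }
}
```

Calling `LinkHelper.Linkify(text)` on the example string should wrap only the first `www.google.com` and leave the existing anchor and the image tag alone. This is still regex-based, so malformed HTML or URLs inside comments and scripts may not be handled correctly.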