I have a block of user text in which I need to find all the web addresses and turn them into hyperlinks. For example, in the following block I need to replace www.google.com with <a href="www.google.com">www.google.com</a> and www.yahoo.com with <a href="www.yahoo.com">www.yahoo.com</a>.

Lorem ipsum dolor sit www.google.com amet, consectetuer adipiscing elit, www.yahoo.com sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna aliquam erat volutpat. Ut wisi enim ad minim veniam, quis nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip

Do I have to split the string, match each word against a regular expression, and replace it whenever a match is found? I think there is a better approach, but I am unable to figure it out.

Thanks for the help.

Devang.

DevM
  • How good does it need to be? For example, do you want it to match domains like 'google.com', or can you assume that links will always start with 'www'? – cbp Jul 25 '11 at 03:49
  • http://stackoverflow.com/questions/37684/how-to-replace-plain-urls-with-links That looks like a good solution – rkw Jul 25 '11 at 04:52
  • @cbp - The regex needs to accommodate the various combinations of www and http/https followed by the address, and if there is more than one URL in the block it should replace all of them. Finally, the URL could be followed by special characters such as a comma, full stop, or question mark. – DevM Jul 28 '11 at 05:59

2 Answers

Regex.Replace will replace multiple occurrences of sub-strings that match a given pattern, so there is no need to split the string first.

The hard part is deciding what you want to match as a URL. For example, if you want to match any string that is compatible with RFC 3987, then your pattern is going to get quite complicated.

If your embedded URLs don't include the "http://" part, then it may be difficult to identify them, so the pattern you choose will depend on your input text.
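As a rough illustration of the points above, here is a minimal sketch that uses the Regex.Replace overload taking a MatchEvaluator. It assumes a URL is anything that starts with http://, https:// or www. and runs to the next whitespace, which is far looser than RFC 3987. The names urlPattern and Linkify are made up for this example. Trailing punctuation such as a comma or full stop is moved outside the link, and http:// is prepended to the href when the text has no scheme, since browsers treat a scheme-less href as a relative link. The surrounding text is assumed to be HTML-safe already.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Assumption: a "URL" starts with http://, https:// or www. and ends at
// whitespace, a quote, or an angle bracket. Adjust this to your input text.
var urlPattern = new Regex(@"\b(?:https?://|www\.)[^\s<>""]+", RegexOptions.IgnoreCase);

// Hypothetical helper: replaces every match in one pass, no string splitting.
string Linkify(string text) => urlPattern.Replace(text, m =>
{
    // Keep sentence punctuation that follows the URL outside of the link.
    string url = m.Value.TrimEnd('.', ',', ';', ':', '!', '?', ')');
    string trailing = m.Value.Substring(url.Length);

    // Add a scheme when the text only had "www.", so the link is absolute.
    string href = url.StartsWith("www.", StringComparison.OrdinalIgnoreCase)
        ? "http://" + url
        : url;

    // Encode characters such as & before placing the URL into HTML.
    return $"<a href=\"{WebUtility.HtmlEncode(href)}\">{WebUtility.HtmlEncode(url)}</a>{trailing}";
});

Console.WriteLine(Linkify("See www.google.com, or https://www.yahoo.com/news."));
// See <a href="http://www.google.com">www.google.com</a>, or <a href="https://www.yahoo.com/news">https://www.yahoo.com/news</a>.
```

Bare domains such as google.com are deliberately not matched here. Supporting them would need a stricter pattern, for example one that checks for known top-level domains, to avoid linking ordinary words that happen to contain a dot.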

Ergwun
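using System;
using System.Text.RegularExpressions;

string s = "Lorem ipsum dolor sit www.google.com amet, consectetuer adipiscing elit, www.yahoo.com sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna aliquam erat volutpat. Ut wisi enim ad minim veniam, quis nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip";

// Wrap every token that starts with "www." (optionally preceded by http:// or https://) in an anchor tag.
// $1 is the whole match; \S+ runs to the next whitespace, so any trailing punctuation ends up inside the link.
string newS = Regex.Replace(s, @"((https?://)?www\.\S+)", "<a href=\"$1\">$1</a>");

Console.WriteLine(newS);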
Petar Ivanov