How can I convert a URL in plain text into an HTML link using Html Agility Pack and C#?
For example: "www.stackoverflow.com is a very cool site."
Output:
"<a href="www.stackoverflow.com">www.stackoverflow.com</a> is a very cool site."
Thanks @user1778606 for your answer. I got this working, though it still uses a bit of regex. It works much better and is safer: it never creates hyperlinks inside existing hyperlinks or inside href attributes.
// Parse the input string as HTML
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(inputString);
// Matches a URL that starts with a scheme (e.g. http://), with "www." or with a "user@" prefix,
// optionally followed by a path, a query string and a fragment.
// Group 1 captures the whole URL and is reused in the replacement string.
const string pattern = @"((([A-Za-z]{3,9}:(?:\/\/)?)(?:[-;:&=\+\$,\w]+@)?[A-Za-z0-9.-]+|(?:www\.|[-;:&=\+\$,\w]+@)[A-Za-z0-9.-]+)((?:\/[\+~%\/.\w-]*)?\??(?:[-\+=&;%@.\w]*)#?(?:[.\!\/\w]*))?)";
// Select every text node that is not inside an anchor element
var nodes = doc.DocumentNode.SelectNodes("//text()[not(ancestor::a)]") ?? new HtmlAgilityPack.HtmlNodeCollection(null);
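// Build the regex once instead of on every iteration
var regex = new Regex(pattern, RegexOptions.IgnoreCase);
foreach (var node in nodes)
{
    // Wrap each match in an anchor, then give bare "www" links an explicit scheme
    node.InnerHtml = regex.Replace(node.InnerHtml, "<a href=\"$1\">$1</a>").Replace("href=\"www", "href=\"http://www");
}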
return doc.DocumentNode.OuterHtml;
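To see why existing links are left alone, here is a minimal sketch that only lists the text nodes the XPath expression selects. The sample input and the class name `TextNodeDemo` are made up for illustration, and it assumes the HtmlAgilityPack NuGet package is referenced.

```csharp
using System;
using HtmlAgilityPack;

public static class TextNodeDemo
{
    public static void Main()
    {
        // Hypothetical sample input: one bare URL and one URL that is already linked.
        const string html =
            "<p>Visit www.example.com or " +
            "<a href=\"http://www.stackoverflow.com\">www.stackoverflow.com</a></p>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Same XPath as in the answer: text nodes that have no <a> ancestor.
        // SelectNodes returns null when nothing matches, so guard against that.
        var nodes = doc.DocumentNode.SelectNodes("//text()[not(ancestor::a)]");
        if (nodes == null)
        {
            Console.WriteLine("No text nodes outside anchors.");
            return;
        }

        foreach (HtmlNode node in nodes)
        {
            // Brackets make leading and trailing whitespace visible.
            Console.WriteLine("[" + node.InnerText + "]");
        }
    }
}
```

With this input, only the text before the anchor should be listed. The text inside the `<a>` element is skipped, so the regex never gets a chance to wrap it a second time.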
I'm pretty sure it's possible, although I haven't attempted it.
Here's how to replace a fixed string in a document with links:
Find keyword in text when keyword match certain conditions - C#
Here's how to write a regex for URLs.
Put those together and it should be possible.
Pseudocode:
select all text nodes
for each node
    get the inner text
    find urls in the text (use regex?)
    for each url found
        replace the text of the url with a string literal link tag (a href = etc ...)
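The last two steps (find the URLs, replace each one with a link tag) can be sketched on a single string with nothing but `System.Text.RegularExpressions`. This is only a rough sketch: the `UrlLinker` class, the `Linkify` method and the simplified pattern are assumptions for illustration, and the pattern covers far fewer cases than a full URL regex. It also assumes the text is already HTML-encoded, as the content of a text node would be.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UrlLinker
{
    // Simplified pattern (an assumption): an http(s):// or www. prefix,
    // then characters that are not whitespace, <, > or ",
    // ending on a character that is not trailing punctuation.
    private static readonly Regex UrlPattern = new Regex(
        @"\b(?:https?://|www\.)[^\s<>""]*[^\s<>"".,;:!?)]",
        RegexOptions.IgnoreCase);

    public static string Linkify(string text)
    {
        return UrlPattern.Replace(text, match =>
        {
            string url = match.Value;
            // Browsers treat a bare "www..." href as a relative path, so add a scheme.
            string href = url.StartsWith("www.", StringComparison.OrdinalIgnoreCase)
                ? "http://" + url
                : url;
            return "<a href=\"" + href + "\">" + url + "</a>";
        });
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(UrlLinker.Linkify("www.stackoverflow.com is a very cool site."));
        // prints: <a href="http://www.stackoverflow.com">www.stackoverflow.com</a> is a very cool site.
    }
}
```

Using a match evaluator instead of a replacement string keeps the scheme fix-up next to the match, so no second pass over the generated markup is needed. To cover the first steps of the pseudocode, `Linkify` would be called on each text node that is not inside an anchor.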