get some links from an HTML string

Question

I have some strings with content like this:

<a href="http://example.com/2014/06/22/new-idea-about-life.zip">One</a>
<a href="http://example.com/2014/06/22/new-idea-about-life-rar.rar">Two</a>

I need this output:

http://example.com/2014/06/22/new-idea-about-life.zip
http://example.com/2014/06/22/new-idea-about-life-rar.rar

Take a look at HTML Agility Pack. It is a library that makes working with HTML strings or files easier. It supports LINQ to Objects, among other things, and lets you extract attributes from tags, which is what you need to do here. – Umair Jan 28 '17 at 14:57

score 0 · Answer 1 · answered Jan 28 '17 at 15:25

HTML Agility Pack is a good library to parse HTML in C#.

An example of extracting the URLs:

using System.Collections.Generic;
using HtmlAgilityPack;

var html = "<a href=\"http://reallife.com/2014/06/22/new-idea-about-life.zip\">New idea about life (zip) (25MB)</a><a href=\"http://reallife.com/2014/06/22/new-idea-about-life-rar.rar\">New idea about life (rar) (23MB)</a>";
var htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml(html);
var links = new List<string>();
// SelectNodes returns null when no node matches, so guard before iterating
var anchors = htmlDoc.DocumentNode.SelectNodes("//a[@href]");
if (anchors != null)
{
    foreach (var link in anchors)
    {
        links.Add(link.GetAttributeValue("href", string.Empty));
    }
}
// do something with the links inside the links-List
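
The same idea can be wrapped in a small reusable method. This is only a sketch: the names LinkExtractor and ExtractHrefs are hypothetical and not part of HTML Agility Pack, and it assumes the HtmlAgilityPack NuGet package is installed and referenced. It uses LINQ instead of the explicit loop and returns an empty list when the HTML contains no matching anchors.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class LinkExtractor
{
    // Hypothetical helper (not part of HTML Agility Pack):
    // returns the href value of every <a> element that has one.
    public static List<string> ExtractHrefs(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null, not an empty collection, when nothing matches.
        var anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
        {
            return new List<string>();
        }

        return anchors
            .Select(a => a.GetAttributeValue("href", string.Empty))
            .Where(href => !string.IsNullOrWhiteSpace(href))
            .ToList();
    }
}
```

Calling LinkExtractor.ExtractHrefs(html) with the two-anchor string from the question should give a list holding the two URLs in document order.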

Ralf Bönning

I wrote this but I got these errors: http://uupload.ir/files/ksa2_untitled.png – j kobe Jan 28 '17 at 15:54
Do you know how to fix it? – j kobe Jan 28 '17 at 16:08
You need to install the library via NuGet. Have you done this? If yes, then you need to reference it in your project. – Umair Jan 29 '17 at 00:13
