0

I'm using the HTML Agility Pack (HAP) library to parse HTML: http://html-agility-pack.net

I basically just want to retrieve the src value from all the img tags.

I've tried several things, but I can't seem to get it to work!

wp78de
raklos

2 Answers

3

Modified from the examples page:

HtmlDocument doc = new HtmlDocument();
doc.Load("file.htm"); // or whatever HTML file you have
// SelectNodes returns null (not an empty collection) when nothing matches
HtmlNodeCollection imgs = doc.DocumentNode.SelectNodes("//img[@src]");
if (imgs == null)
   return;
foreach (HtmlNode img in imgs)
{
   // The XPath already requires a src attribute; this check is purely defensive
   if (img.Attributes["src"] == null)
      continue;
   HtmlAttribute src = img.Attributes["src"];
   // Do something with src.Value
}
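
As an alternative to the XPath query above, here is a minimal sketch that uses LINQ over `Descendants("img")` and reads the attribute with `GetAttributeValue`, so no null checks are needed. The class name `ImgSrcExample`, the helper `GetImageSources`, and the sample markup are hypothetical and only for illustration; it assumes a reasonably recent HTML Agility Pack version.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

class ImgSrcExample
{
    // Hypothetical helper: returns the src value of every <img> that has a non-empty src.
    static List<string> GetImageSources(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // parse from a string instead of a file

        return doc.DocumentNode
                  .Descendants("img")                              // yields an empty sequence when there are no matches
                  .Select(img => img.GetAttributeValue("src", "")) // "" is the default when src is missing
                  .Where(src => src.Length > 0)
                  .ToList();
    }

    static void Main()
    {
        // Sample markup made up for this example
        string html = "<p><img src='a.png'><img alt='no source'><img src='b.jpg'></p>";
        foreach (string src in GetImageSources(html))
            Console.WriteLine(src);
    }
}
```

Whether this reads better than the XPath version is mostly a matter of taste; the main practical difference is that an empty result is an empty sequence rather than null.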
NickAldwin
  • @alexn Thanks, I guess that's what I get when I copy and paste too quickly :) – NickAldwin Jan 06 '11 at 17:26
  • I tried this earlier but it didn't work: cannot apply indexing with [] to an expression of type 'HtmlAgilityPack.HtmlNode' – raklos Jan 06 '11 at 17:34
  • @raklos added an edit which should fix your problem (from http://stackoverflow.com/questions/1517804/htmlagilitypack-example-for-changing-links-doesnt-work-how-do-i-accomplish-thi ) – NickAldwin Jan 06 '11 at 17:41
  • @raklos Does the newly edited code work for you? Or are you still having problems? – NickAldwin Jan 06 '11 at 17:59
0

Did you try something like this?

HtmlNodeCollection images = doc.DocumentNode.SelectNodes("//img[@src]");
Remy