I am using HtmlAgilityPack in a C# WinForms application (Visual Studio) to extract movie information from a web page (as pictured). However, I cannot get the movie link (as shown). Please help me find a way to get the link highlighted in the picture.


HtmlWeb htmlWeb = new HtmlWeb()
{
    AutoDetectEncoding = false,
    OverrideEncoding = Encoding.UTF8  
};
htmlWeb.UserAgent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.110 Safari/537.36";
HtmlAgilityPack.HtmlDocument document = htmlWeb.Load("http://woohay.com/xem-phim/anh-2018-11458");
String link_film = document.DocumentNode.SelectSingleNode("//div[@class='jw-media jw-reset']/video").Attributes["src"].Value;

Movie_module.FrmVLC frmVLC = new Movie_module.FrmVLC(link_film);
frmVLC.StartPosition = FormStartPosition.CenterScreen;
frmVLC.btn_down.Visible = true;
frmVLC.Show();
dferenc
pro trum

1 Answer

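HtmlAgilityPack can't extract dynamically generated DOM content: it only parses the HTML the server returns and does not execute JavaScript, so elements that a script adds afterwards (presumably including the `<video>` tag your XPath targets) never appear in the document, and `SelectSingleNode` returns `null`. I had the same issue when trying something similar.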

I ended up using Selenium, which drives a real browser and can therefore traverse the dynamically generated DOM. It's also possible to feed the page source obtained from Selenium into HtmlAgilityPack; it's not entirely straightforward, but it can be done.
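As a rough illustration of that combination, here is a minimal sketch rather than a definitive implementation. It assumes the `Selenium.WebDriver`, `Selenium.Support` and `HtmlAgilityPack` NuGet packages are installed, that a ChromeDriver matching your Chrome version is available, and that the rendered `<video>` element really carries a `src` attribute, as the XPath in the question expects. The class and method names (`RenderedPage`, `GetVideoSrc`, `ExtractVideoSrc`) and the 15-second timeout are made up for this example.

```csharp
using System;
using HtmlAgilityPack;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.Support.UI;

// Hypothetical helper class, not part of either library.
static class RenderedPage
{
    // Parses HTML that has already been rendered by a browser and returns
    // the src of the first <video> element that has one, or null if none.
    public static string ExtractVideoSrc(string html)
    {
        // Fully qualified to avoid a clash with System.Windows.Forms.HtmlDocument.
        var document = new HtmlAgilityPack.HtmlDocument();
        document.LoadHtml(html);

        HtmlNode video = document.DocumentNode.SelectSingleNode("//video[@src]");
        return video == null ? null : video.GetAttributeValue("src", string.Empty);
    }

    // Loads the page in headless Chrome so its scripts run, waits for a
    // <video src="..."> to show up, then hands the DOM to HtmlAgilityPack.
    public static string GetVideoSrc(string url)
    {
        var options = new ChromeOptions();
        options.AddArgument("--headless");

        using (IWebDriver driver = new ChromeDriver(options))
        {
            driver.Navigate().GoToUrl(url);

            // Assumed timeout; throws WebDriverTimeoutException if no
            // matching element appears within 15 seconds.
            var wait = new WebDriverWait(driver, TimeSpan.FromSeconds(15));
            wait.Until(d => d.FindElements(By.CssSelector("video[src]")).Count > 0);

            // PageSource is the browser's current DOM serialized as HTML.
            return ExtractVideoSrc(driver.PageSource);
        }
    }
}
```

With something like this in place, the `link_film` assignment in the question could become `string link_film = RenderedPage.GetVideoSrc("http://woohay.com/xem-phim/anh-2018-11458");`, followed by a null check before passing it to `FrmVLC`. If the player only exposes a `blob:` URL in `src`, this approach will not yield a downloadable link and the underlying network request would have to be inspected instead.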

Ojingo
  • See also https://html-agility-pack.net/knowledge-base/41983000/how-to-get-dynamically-loaded-content-using-htmlagilitypack – TrueWill Dec 03 '18 at 20:37
  • `Selenium` with the `PhantomJS` driver is usually good enough. `PhantomJS` is incredibly tiny, but has a working JavaScript engine, so the dynamic DOM gets loaded (usually) as expected. If the website is quirky, you can always load a headless `Chrome` driver. That's a bit of a fatter dependency, but will always get the job done. – Ojingo Dec 03 '18 at 20:59
  • I tried using Selenium, and it helped me achieve what I wanted. Thank you so much. – pro trum Dec 05 '18 at 01:19