3

I had some working test code in a console app that I am moving over to a Windows Store app. The problem is that I copied over the HtmlAgilityPack code from my console app and now it doesn't work. I do have HtmlAgilityPack added as a reference.

Some of the HtmlAgilityPack code does work. What is not working is "using (var client = new WebClient())", which fails with the error "The type or namespace name 'WebClient' could not be found (are you missing a using directive or an assembly reference?)"

The next part that does not work is "foreach (HtmlNode link in doc.DocumentNode.SelectNodes("//a[@href]"))". It fails at the SelectNodes call with the error "'HtmlAgilityPack.HtmlNode' does not contain a definition for 'SelectNodes' and no extension method 'SelectNodes' accepting a first argument of type 'HtmlAgilityPack.HtmlNode' could be found (are you missing a using directive or an assembly reference?)"

I know that Html Agility Pack relies on .NET for its XPath implementation, and that WinRT doesn't support XPath. My question is: how would I accomplish the same thing as the code below with something that will run in a Windows Store app?

The code below does the following: it downloads the HTML page from http://www.dubstep.net/track/5436 and loops through it looking for href attributes. Once it finds an href equal to "#", it takes the href just before it and passes it as a Uri to start.

I have verified that the code below does work in a console application.

        using (var client = new WebClient())
        {
            // Download the HTML
            string html = client.DownloadString("http://www.dubstep.net/track/5436");

            // Now feed it to HTML Agility Pack:
            HtmlDocument doc = new HtmlDocument();
            doc.LoadHtml(html);
            int i = 0;
            // Now you could query the DOM. For example you could extract
            // all href attributes from all anchors:
            List<string> list = new List<string>();
            foreach (HtmlNode link in doc.DocumentNode.SelectNodes("//a[@href]"))
            {
                HtmlAttribute href = link.Attributes["href"];
                if (href != null)
                {
                    list.Add(href.Value);
                    i++;
                    if (href.Value == "#")
                    {
                        int t = i - 2;
                        Uri test = new Uri(list[t]);
                        start(test);
                    }
                }
            }
        }

    public static void start(Uri t)
    {
        Uri remoteUri = new Uri("http://soundcloud.com/dubstep/spag-heddy-the-master-vip/download");
        string fileName1 = "t", myStringWebResource = null;

        // Create a new WebClient instance.
        using (WebClient myWebClient = new WebClient())
        {
            myWebClient.DownloadFileCompleted += DownloadCompleted;
            myWebClient.DownloadProgressChanged += myWebClient_DownloadProgressChanged;
            myWebClient.DownloadFileAsync(t, "file.mp3");
        }
    }
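
The link-selection step described above (take the href that comes just before a "#" href) does not depend on WebClient or XPath, so it can be sketched on its own. This is only a minimal illustration: `LinkPicker`, `HrefBeforeHash`, and the example URLs are hypothetical names, and unlike the loop above it stops at the first "#" and returns null instead of throwing when nothing precedes it.

```csharp
using System;
using System.Collections.Generic;

public static class LinkPicker
{
    // Hypothetical helper: returns the href that appears immediately before
    // the first "#" href, or null when there is no "#" or nothing precedes it.
    public static string HrefBeforeHash(IEnumerable<string> hrefs)
    {
        string previous = null;
        foreach (string href in hrefs)
        {
            if (href == "#")
            {
                return previous;
            }
            previous = href;
        }
        return null;
    }
}

public static class Program
{
    public static void Main()
    {
        // Made-up hrefs standing in for the ones collected from the page.
        var hrefs = new List<string>
        {
            "http://example.com/",
            "http://example.com/track/download",
            "#",
            "http://example.com/about"
        };

        // Prints: http://example.com/track/download
        Console.WriteLine(LinkPicker.HrefBeforeHash(hrefs));
    }
}
```

Keeping this step separate means only the page loading and the file download need WinRT-specific replacements.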
hurnhu

2 Answers

3

You can try replacing WebClient with HtmlWeb and using HtmlAgilityPack's LINQ API instead of XPath to make it work in Windows Store apps:

//use HAP's HtmlWeb instead of WebClient
var htmlweb = new HtmlWeb();
// load HtmlDocument from web URL
HtmlDocument doc = htmlweb.Load("http://www.dubstep.net/track/5436");


int i = 0;
List<string> list = new List<string>();

//use LINQ API to select all `<a>` having `href` attribute
var links = doc.DocumentNode
               .DescendantsAndSelf("a")
               .Where(o => o.GetAttributeValue("href", null) != null);
foreach (HtmlNode link in links)
{
    HtmlAttribute href = link.Attributes["href"];
    if (href != null)
    {
        list.Add(href.Value);
        i++;
        if (href.Value == "#")
        {
            int t = i - 2;
            Uri test = new Uri(list[t]);
            start(test);
        }
    }
}
har07
  • 1
    Hmm, it seems like I'm getting the error "'HtmlAgilityPack.HtmlWeb' does not contain a definition for 'Load' and no extension method 'Load' accepting a first argument of type 'HtmlAgilityPack.HtmlWeb' could be found (are you missing a using directive or an assembly reference?)" with the above code. Thanks a lot for the help, though! I've been looking through some LINQ code, but was not making too much sense of it. – hurnhu Apr 21 '14 at 04:45
  • 1
    Ah, I missed that part; I haven't really tried HAP in WinRT either. Try replacing `htmlweb.Load(...)` with `await htmlweb.LoadFromWebAsync(...)` – har07 Apr 21 '14 at 04:49
  • This line, "HtmlDocument doc = await htmlweb.LoadFromWebAsync("http://www.dubstep.net/track/5436");", comes back with the error "The 'await' operator can only be used within an async method. Consider marking this method with the 'async' modifier and changing its return type to 'Task'." From what I've looked through, though, this should work. – hurnhu Apr 21 '14 at 04:53
  • 1
    Mark the method that contains the code above as `async`, for example: `private async void MyFunction()`. For further reference: http://stackoverflow.com/questions/11836325/await-operator-can-only-be-used-within-an-async-method – har07 Apr 21 '14 at 06:12
  • Thanks! That got that part to work! Now my next problem is: how would I download the file with HtmlWeb? I added the start method to show how I currently download it. – hurnhu Apr 21 '14 at 15:42
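
Putting the comments above together, here is a rough sketch of how the whole flow might look in a Windows Store app. It is not something from the thread: `TrackDownloader`, `DownloadTrackAsync`, and the `fileName` parameter are hypothetical names, and using `HttpClient` plus `Windows.Storage.FileIO` for the download is an assumption about one possible replacement for `WebClient.DownloadFileAsync`, not something HtmlWeb does itself. It also assumes the hrefs on the page are absolute URLs, since `new Uri(string)` throws on relative ones.

```csharp
using System;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;
using HtmlAgilityPack;
using Windows.Storage;

public static class TrackDownloader
{
    // Hypothetical helper: loads the page, finds the href just before the
    // first "#" href, and saves that resource into the app's local folder.
    public static async Task DownloadTrackAsync(string pageUrl, string fileName)
    {
        // LoadFromWebAsync is the WinRT replacement for Load suggested above.
        var web = new HtmlWeb();
        HtmlDocument doc = await web.LoadFromWebAsync(pageUrl);

        // LINQ instead of XPath: collect every href found on an <a> element.
        var hrefs = doc.DocumentNode
                       .Descendants("a")
                       .Select(a => a.GetAttributeValue("href", null))
                       .Where(h => h != null)
                       .ToList();

        // Nothing to do if there is no "#" link or nothing comes before it.
        int hashIndex = hrefs.IndexOf("#");
        if (hashIndex < 1)
        {
            return;
        }

        // Assumption: the href is an absolute URL.
        var target = new Uri(hrefs[hashIndex - 1]);

        using (var client = new HttpClient())
        {
            // Reads the whole file into memory, which is fine for a sketch
            // but may not suit very large downloads.
            byte[] bytes = await client.GetByteArrayAsync(target);

            StorageFile file = await ApplicationData.Current.LocalFolder
                .CreateFileAsync(fileName, CreationCollisionOption.ReplaceExisting);
            await FileIO.WriteBytesAsync(file, bytes);
        }
    }
}
```

A call such as `await TrackDownloader.DownloadTrackAsync("http://www.dubstep.net/track/5436", "file.mp3");` would have to come from a method that is itself marked `async`. This sketch has no progress reporting, unlike the `DownloadProgressChanged` handler in the question.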
1

For XPath, you can use the link below to find an XPath implementation (with source code) for Windows Phone. The code is easily transferable to WinRT.

Note: Using LINQ is generally far superior to using XPath. There is one case where that is not true: if your XPath expressions are coming from a server. In that case, you can use a solution such as this one.

http://socialebola.wordpress.com/2011/07/06/xpath-support-for-the-html-agility-pack-on-windows-phone/

Shahar Prish