
I am in the same situation as the person who asked this question: I need to get some data from a website and save it as a string.

My problem is that the website I need to get the data from requires the user to be logged in to view it.

My plan was to have the user open the website in the WebBrowser control, log in, navigate to the right page, and then click a button that automatically saves the data.

I want to use a method similar to the one in the top answer to the question I linked at the start:

string data = doc.DocumentNode.SelectNodes("//*[@id=\"main\"]/div[3]/div/div[2]/div[1]/div[1]/div/div/div[2]/a/span[1]")[0].InnerText;

I tried doing things like this:

string data = webBrowser1.DocumentNode.SelectNodes("//*[@id=\"main\"]/div[3]/div/div[2]/div[1]/div[1]/div/div/div[2]/a/span[1]")[0].InnerText;

But you can't call webBrowser1.DocumentNode.SelectNodes, because the WebBrowser control has no DocumentNode property.

I also saw that the answer to the other question uses HtmlAgilityPack. I downloaded it, but I have no idea what to do with it.

I'm not the best with C#, so please don't post overly complicated answers, or at least try to make them understandable.

Thanks in advance :)

Mldx
  • You need to do a POST to the login page, read the response to get cookie information, and include the cookie/login information with any further GET requests. Have you tried: http://stackoverflow.com/questions/24845573/using-c-sharp-httpclient-to-login-on-a-website-and-scrape-information-from-anoth – Shannon Holsinger Sep 17 '16 at 16:22
  • This sounds way too complicated... – Mldx Sep 17 '16 at 16:28
  • It's not - it's really only two methods once you get the login routine worked out. Depending on what you want to do with the information, there are several higher-level solutions out there. I remember using WebReplay a while back - don't know if they're still around. – Shannon Holsinger Sep 17 '16 at 16:35
  • I found this code on the login page. Should I use it for anything? I think I can, because it is the form where you enter your username and password, and it uses the POST method. – Mldx Sep 17 '16 at 16:47
  • [The code](http://mldx.dk/img/loginpage.PNG) 1 = the box for the username, and 2 = the box for the password. – Mldx Sep 17 '16 at 16:48

1 Answer


Here is an example of HtmlAgilityPack usage:

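// Requires the HtmlAgilityPack NuGet package.
public string GetData(string htmlContent)
{
      HtmlAgilityPack.HtmlDocument htmlDoc = new HtmlAgilityPack.HtmlDocument();
      htmlDoc.OptionFixNestedTags = true;
      htmlDoc.LoadHtml(htmlContent);

      // SelectSingleNode returns the first match, or null when nothing matches the XPath,
      // so check the result before reading InnerText.
      HtmlAgilityPack.HtmlNode node = htmlDoc.DocumentNode.SelectSingleNode("//*[@id=\"main\"]/div[3]/div/div[2]/div[1]/div[1]/div/div/div[2]/a/span[1]");
      if (node != null && !string.IsNullOrEmpty(node.InnerText))
          return node.InnerText;

      // No matching element, or the element was empty.
      return null;
}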
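
To connect this to the WebBrowser control from the question, the HTML of the page the user is currently viewing can be handed to a method like GetData. The following is only a sketch under assumptions: the form, the control names (webBrowser1, saveButton), and the URL are placeholders I made up, and it assumes a Windows Forms project with the HtmlAgilityPack NuGet package installed.

```csharp
using System;
using System.Windows.Forms;

// Hypothetical form: the class name, control names, and URL are placeholders.
public class ScraperForm : Form
{
    private readonly WebBrowser webBrowser1 = new WebBrowser { Dock = DockStyle.Fill };
    private readonly Button saveButton = new Button { Text = "Save data", Dock = DockStyle.Bottom };

    public ScraperForm()
    {
        Controls.Add(webBrowser1);
        Controls.Add(saveButton);
        saveButton.Click += SaveButton_Click;

        // Placeholder address; the user logs in by hand from here.
        webBrowser1.Navigate("https://example.com/login");
    }

    private void SaveButton_Click(object sender, EventArgs e)
    {
        // DocumentText is the HTML of the page loaded in the control.
        string html = webBrowser1.DocumentText;
        string data = GetData(html);
        MessageBox.Show(data ?? "No matching element was found.");
    }

    // Same approach as the answer's method: parse the HTML and read one node by XPath.
    private static string GetData(string htmlContent)
    {
        // Fully qualified, because System.Windows.Forms also has an HtmlDocument type.
        var htmlDoc = new HtmlAgilityPack.HtmlDocument();
        htmlDoc.LoadHtml(htmlContent);
        HtmlAgilityPack.HtmlNode node = htmlDoc.DocumentNode.SelectSingleNode(
            "//*[@id=\"main\"]/div[3]/div/div[2]/div[1]/div[1]/div/div/div[2]/a/span[1]");
        return node == null ? null : node.InnerText;
    }

    [STAThread]
    public static void Main()
    {
        Application.EnableVisualStyles();
        Application.Run(new ScraperForm());
    }
}
```

The button should only be clicked once the target page has finished loading; otherwise the XPath may not match anything and the message box reports that no element was found.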

Edit: If you want to automate actions in a browser, I would suggest using Selenium instead of the regular WebBrowser control. You can download it from http://www.seleniumhq.org/ or install it through NuGet. This question is a good introduction to using it: How do I use Selenium in C#?
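
As a rough illustration of the Selenium route, a login followed by a page fetch might look like the sketch below. Everything specific in it is an assumption: the class and method names are hypothetical, the field names "username" and "password" are guesses that must be replaced with the name attributes from the real login form, and it assumes the Selenium.WebDriver NuGet package with a working Chrome driver.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

// Hypothetical helper: class, method, and form field names are placeholders.
public static class LoginScraper
{
    public static string FetchPageAfterLogin(string loginUrl, string dataUrl, string user, string password)
    {
        // Disposing the driver closes the browser window when we are done.
        using (IWebDriver driver = new ChromeDriver())
        {
            driver.Navigate().GoToUrl(loginUrl);

            // Assumed field names; check the login form's HTML for the real ones.
            driver.FindElement(By.Name("username")).SendKeys(user);
            IWebElement passwordBox = driver.FindElement(By.Name("password"));
            passwordBox.SendKeys(password);

            // Submits the form that contains the password box.
            passwordBox.Submit();

            // The session cookie stays in this browser instance, so the protected page can be opened next.
            driver.Navigate().GoToUrl(dataUrl);

            // HTML of the current page, ready to pass to GetData.
            return driver.PageSource;
        }
    }
}
```

The returned string can be passed to GetData in the same way as any other HTML. On slow sites the login may need an explicit wait before the second navigation, which this sketch leaves out.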

r.mirzojonov
  • Thanks, this worked to get the data. It just took me a minute or two to figure out how to install HtmlAgilityPack. For people who read this in the future: it can be found under Project > Manage NuGet Packages in Visual Studio. But what do I do if I want the program to automatically enter the username and password and press Login? When the program starts, the WebBrowser navigates to the login page; it should then log in automatically and afterwards use the code you just wrote to get the data. Again, thanks for helping me get this working! :) – Mldx Sep 18 '16 at 20:29
  • The edit you made answered my question. Thank you :) – Mldx Sep 19 '16 at 08:55