1

So!

For a fansite I run, I also run a website scraper (/XML reader) that reads information from a secure web location of a game. It works perfectly as it is now, but I want to make it better and, mainly, faster.

The first problem I faced was how to maintain a session that can handle a ton of requests (like 1 to 10 every 30 seconds) while staying logged in. A normal HttpWebRequest didn't really work, because the login is secured with a token that must be submitted together with my login information. My solution was as follows: I placed a WebBrowser control on a form, and when the login page has loaded (the DocumentCompleted event) I fill the login information into the document and simply submit the form.
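The login step described above can be sketched roughly like this. The URL, the form/field names (`loginForm`, `username`, `password`) and the credentials are placeholders; the real page will use its own ids:

```csharp
// Sketch of automating a login through a WebBrowser control.
// All element ids, the form name and the URL below are assumptions.
using System;
using System.Windows.Forms;

public class LoginForm : Form
{
    private readonly WebBrowser browser = new WebBrowser();

    public LoginForm()
    {
        Controls.Add(browser);
        browser.DocumentCompleted += OnDocumentCompleted;
        browser.Navigate("https://example.com/login"); // placeholder URL
    }

    private void OnDocumentCompleted(object sender,
                                     WebBrowserDocumentCompletedEventArgs e)
    {
        // Iframes raise DocumentCompleted too; only act when the
        // top-level document is the one that finished loading.
        if (e.Url != browser.Url) return;

        var doc = browser.Document;
        doc.GetElementById("username")?.SetAttribute("value", "myUser");
        doc.GetElementById("password")?.SetAttribute("value", "myPass");

        // The hidden token field is already part of the page, so
        // submitting the form sends it along automatically.
        doc.Forms["loginForm"]?.InvokeMember("submit");
    }
}
```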

Now I can access all the secure pages I want, BUT not with an HttpWebRequest placed inside the code. However, when I placed multiple WebBrowser controls on the same form, all of them could access the secure part of the site. So I placed 6 of them to do -kind of- parallel requests (for XML and HTML) to quickly access information in my account.

This actually works like a charm: you see 7 browsers nicely browse away while I analyse the DOM document. Naturally, though, this creates a lot of overhead, since I don't need the images and all the Flash etc. to load (or the iframes, which cause very annoying multiple DocumentCompleted events). So I want to log in once and then be able to make requests from code with HttpWebRequest, using the session/cookie information of the WebBrowser (or log in some other way).

So how do I do this? Is this even possible, or should I approach it completely differently?

(PS: I write everything in C#)

ovanwijk
  • 159
  • 7
  • use http://www.visualwebripper.com/ – Mahmoud Darwish Apr 01 '14 at 13:09
  • well, there is a whole lot more logic happening than just ripping. The application also connects with my website database in order to basically let other users from my website look into my account and, through that program, automatically interact with it. So just ripping ain't enough. – ovanwijk Apr 01 '14 at 13:19
  • visual web ripper has an API with it so you can integrate it with your application – Mahmoud Darwish Apr 01 '14 at 13:19
  • although I don't know if it will meet all your requirements; however, it seems what you are doing is more of a browser than ripping. Maybe write your own custom web request? – Mahmoud Darwish Apr 01 '14 at 13:23

2 Answers

1

You can show the first WebBrowser, log in and, after the submit, get the cookies from it and attach them to your HttpWebRequests.

Having only the one WebBrowser, shown just for the first login, should improve your performance a lot! Just pay attention to browser validation / async content loading.
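A minimal sketch of this idea, passing the browser's cookie string on to an HttpWebRequest (the URL and User-Agent string are placeholders; note that `WebBrowser.Document.Cookie` only exposes non-HttpOnly cookies, so if the session cookie is marked HttpOnly this approach won't see it):

```csharp
// Sketch: reuse the WebBrowser's cookies on an HttpWebRequest.
// Caveat: Document.Cookie omits HttpOnly cookies.
using System;
using System.IO;
using System.Net;

public static class SessionReuse
{
    public static string Fetch(string url, string cookieHeader)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);

        // Pass the browser's "name=value; name2=value2" string verbatim
        // (works as long as no CookieContainer is assigned).
        request.Headers[HttpRequestHeader.Cookie] = cookieHeader;

        // Some servers also check the User-Agent; mimic the browser's.
        request.UserAgent =
            "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1)";

        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}

// Usage (inside the form, after login has completed):
// string xml = SessionReuse.Fetch("https://example.com/secure.xml",
//                                 webBrowser1.Document.Cookie);
```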

AlbertoA
  • 93
  • 6
  • This may not work for many reasons. The session is not only identified by cookies, there are other unique [fields](http://en.wikipedia.org/wiki/List_of_HTTP_header_fields#Field_values). – noseratio Apr 01 '14 at 22:07
0

You can't use HttpWebRequest to share the same session with WebBrowser. You'd need to use an API based on UrlMon or WinInet, which is what WebBrowser uses behind the scenes.

I listed some of the options here: https://stackoverflow.com/a/22686805/1768303.

Perhaps the XMLHTTPRequest COM object would be the most feasible one.
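As a rough sketch of that option: `MSXML2.XMLHTTP` goes through WinInet, so it shares the WebBrowser's session cookies (unlike `MSXML2.ServerXMLHTTP`, which uses WinHTTP and doesn't). Late binding via `dynamic` avoids needing a COM reference; the ProgID version is an assumption about what's installed:

```csharp
// Sketch: issue a GET through the MSXML2 XMLHTTP COM object, which
// uses WinInet and therefore sees the WebBrowser's session cookies.
using System;

public static class ComRequest
{
    public static string Get(string url)
    {
        // Assumes MSXML 6.0 is installed (standard on modern Windows).
        Type t = Type.GetTypeFromProgID("MSXML2.XMLHTTP.6.0");
        dynamic xhr = Activator.CreateInstance(t);

        xhr.open("GET", url, false); // synchronous, for simplicity
        xhr.send();
        return (string)xhr.responseText;
    }
}
```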

noseratio
  • 59,932
  • 34
  • 208
  • 486
  • Actually I used this: http://ycouriel.blogspot.nl/2010/07/webbrowser-and-httpwebrequest-cookies.html and it works like a charm! – ovanwijk Apr 02 '14 at 11:12
  • @ovanwijk, awesome if that works for you, but it totally depends on that particular server, which is apparently ignoring other unique headers, like `User-Agent`. It certainly wouldn't work with a server that uses `Authorization`. – noseratio Apr 02 '14 at 11:38
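The approach from the blog post linked in the comments above (reading the WebBrowser's cookies out of WinInet) can be sketched like this; the `INTERNET_COOKIE_HTTPONLY` flag requires IE8 or later and lets you retrieve HttpOnly cookies that `Document.Cookie` hides:

```csharp
// Sketch: P/Invoke WinInet's InternetGetCookieEx to read the cookies
// the WebBrowser stored for a URL, including HttpOnly ones.
using System;
using System.Runtime.InteropServices;
using System.Text;

public static class WinInetCookies
{
    private const int INTERNET_COOKIE_HTTPONLY = 0x2000;

    [DllImport("wininet.dll", CharSet = CharSet.Auto, SetLastError = true)]
    private static extern bool InternetGetCookieEx(
        string url, string cookieName, StringBuilder cookieData,
        ref int size, int flags, IntPtr reserved);

    // Returns the "name=value; name2=value2" cookie string for a URL,
    // or null if no cookies are stored for it.
    public static string GetCookieHeader(string url)
    {
        int size = 4096;
        var sb = new StringBuilder(size);
        if (!InternetGetCookieEx(url, null, sb, ref size,
                                 INTERNET_COOKIE_HTTPONLY, IntPtr.Zero))
            return null;
        return sb.ToString();
    }
}
```

The returned string can then be assigned to an HttpWebRequest's `Cookie` header directly.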