I need a lightweight, fast way to download just the HTML content of a page, so that I can then read its meta tags. This is my current code:

HttpWebRequest request = WebRequest.Create(resoruce_url) as HttpWebRequest;
request.UserAgent = Request.UserAgent;

try
{
    using (WebResponse response = request.GetResponse())
    {
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            var objectText = reader.ReadToEnd();
            Response.Write(objectText);
        }
    }
}
catch (Exception e) { Response.Write(e.Message); }

The problem is that this request doesn't execute JavaScript, so for pages that render some of their controls with script, I only get the <noscript> HTML.

How can I do this? I can't do it client-side because the requested pages are not on the same domain, so the only way is server-side.

Some people suggest using WebBrowser, but as far as I know it is a kind of "browser emulator" that requires a lot of resources. I just need a lightweight solution. Any suggestions?

markzzz

2 Answers

How about WebClient?

It is very simple to use. See: http://www.hanselman.com/blog/HTTPPOSTsAndHTTPGETsWithWebClientAndCAndFakingAPostBack.aspx
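
A minimal sketch of what that could look like, assuming the goal is only to read the meta tags from the HTML the server returns. The names `MetaFetcher`, `DownloadHtml`, and `ExtractMetaTags` are hypothetical, and the regex-based extraction is a naive illustration rather than a real HTML parser. Note that WebClient, like HttpWebRequest, does not execute JavaScript, so this only helps when the meta tags are already present in the raw HTML.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class MetaFetcher
{
    // Matches a whole <meta ...> tag (naive: assumes no '>' inside attribute values).
    private static readonly Regex MetaTag =
        new Regex(@"<meta\b[^>]*>", RegexOptions.IgnoreCase);

    // Matches quoted attributes such as name="description" or content='...'.
    private static readonly Regex Attribute =
        new Regex(@"([\w:-]+)\s*=\s*(?:""([^""]*)""|'([^']*)')");

    // Downloads the raw HTML of a page. No JavaScript is executed.
    public static string DownloadHtml(string url, string userAgent)
    {
        using (var client = new WebClient())
        {
            client.Headers[HttpRequestHeader.UserAgent] = userAgent;
            return client.DownloadString(url);
        }
    }

    // Collects <meta name|property="..." content="..."> pairs from an HTML string.
    public static Dictionary<string, string> ExtractMetaTags(string html)
    {
        var result = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (Match tag in MetaTag.Matches(html))
        {
            string key = null;
            string content = null;
            foreach (Match attr in Attribute.Matches(tag.Value))
            {
                string attrName = attr.Groups[1].Value.ToLowerInvariant();
                string attrValue = attr.Groups[2].Success
                    ? attr.Groups[2].Value
                    : attr.Groups[3].Value;
                if (attrName == "name" || attrName == "property") key = attrValue;
                else if (attrName == "content") content = attrValue;
            }
            if (key != null && content != null) result[key] = content;
        }
        return result;
    }
}
```

Usage would be along the lines of `MetaFetcher.ExtractMetaTags(MetaFetcher.DownloadHtml(resoruce_url, Request.UserAgent))`, wrapped in the same try/catch as in the question.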

geevee
  • To set the User-Agent, see: http://stackoverflow.com/questions/11841540/setting-the-user-agent-header-for-a-webclient-request – geevee Oct 14 '13 at 07:54

Have a look at PhantomJS, which is a scriptable and portable "headless" WebKit-based browser. I'm not sure, though, whether it uses fewer resources than IE's WebBrowser control.

noseratio