
I want to save a complete ASP web page to a local drive as an .htm file from its URL, but I have not succeeded.

Code

public StreamReader Fn_DownloadWebPageComplete(string link_Pagesource)
{
    //--------- Download Complete ------------------
    // using (WebClient client = new WebClient()) // WebClient class inherits IDisposable
    // {
    //     //client
    //     HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create(link_Pagesource);
    //
    //     webRequest.AllowAutoRedirect = true;
    //     var client1 = (System.Net.HttpWebRequest)System.Net.WebRequest.Create(link_Pagesource);
    //     client1.CookieContainer = new System.Net.CookieContainer();
    //
    //     client.DownloadFile(link_Pagesource, @"D:\S1.htm");
    // }

    //--------- Download Page Source ------------------
    HttpWebRequest URL_pageSource = (HttpWebRequest)WebRequest.Create("https://www.digikala.com");

    URL_pageSource.Timeout = 360000;
    //URL_pageSource.Timeout = 1000000;
    URL_pageSource.ReadWriteTimeout = 360000;
    //URL_pageSource.ReadWriteTimeout = 1000000;
    URL_pageSource.AllowAutoRedirect = true;
    URL_pageSource.MaximumAutomaticRedirections = 300;

    using (WebResponse MyResponse_PageSource = URL_pageSource.GetResponse())
    {
        str_PageSource = new StreamReader(MyResponse_PageSource.GetResponseStream(), System.Text.Encoding.UTF8);
        pagesource1 = str_PageSource.ReadToEnd();
        success = true;
    }

Error:

Too many automatic redirections were attempted.

I tried the code above, but it was not successful.

This code works for many URLs, but not for this one.


3 Answers


Here is one way to do it:

string url = "https://www.digikala.com/";
string result; // declared outside the using blocks so it can be used afterwards

using (HttpClient client = new HttpClient())
using (HttpResponseMessage response = await client.GetAsync(url))
using (HttpContent content = response.Content)
{
    // Read the response body as a string (the page's HTML).
    result = await content.ReadAsStringAsync();
}

The `result` variable will then contain the page's HTML, which you can save to a file like this:

System.IO.File.WriteAllText("path/filename.html", result);

Note: you need to import this namespace:

using System.Net.Http;

Update: if you are using an older version of Visual Studio, see this answer for using WebClient and WebRequest for the same purpose, but updating Visual Studio is actually the better solution.
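
For reference, the download and save steps can be combined into one small program. This is only a minimal sketch, assuming C# 7.1 or later (for `async Main`); the names `PageSaver`, `DownloadHtmlAsync`, `SaveHtml`, and the output path `page.html` are illustrative and not part of the original answer. It saves only the HTML the server returns, so anything a page loads later through JavaScript will not be in the file.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

public static class PageSaver
{
    // A single HttpClient instance is reused for all requests.
    private static readonly HttpClient Client = new HttpClient();

    // Downloads the raw HTML returned for the URL (no JavaScript is executed).
    public static async Task<string> DownloadHtmlAsync(string url)
    {
        using (HttpResponseMessage response = await Client.GetAsync(url))
        {
            // Throws HttpRequestException if the status code is not a success code.
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }

    // Writes the HTML text to disk as UTF-8.
    public static void SaveHtml(string path, string html)
    {
        File.WriteAllText(path, html, Encoding.UTF8);
    }

    public static async Task Main()
    {
        string html = await DownloadHtmlAsync("https://www.digikala.com/");
        SaveHtml("page.html", html); // hypothetical output path
        Console.WriteLine("Saved " + html.Length + " characters.");
    }
}
```

Splitting the download from the file write keeps each step easy to swap out, for example to write to a different folder or to add request headers before calling `GetAsync`.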

  • Thanks, I must update to VS2013, install System.Net.Http, and test. – RedArmy Jan 21 '17 at 10:49
  • @RedArmy Glad to help, and please consider accepting this answer if it solved your problem. – Hakan Fıstık Jan 21 '17 at 10:50
  • @RedArmy I updated my answer to solve your problem without updating Visual Studio and without installing any `dll` (although this is not recommended) – Hakan Fıstık Jan 21 '17 at 10:55
  • While using the `.Result` does work, it converts the async call into a blocking sync call. It is better to `await` the async method call (no need to use `.Result` then) to benefit from the async nature. – Hans Kesting Jan 21 '17 at 11:01
  • Dear @HakamFostok, I tried WebClient without updating, but it was not successful. After updating VS, your solution worked, but the saved page does not show the product list. – RedArmy Jan 21 '17 at 11:02
  • in url : `https://www.digikala.com/search/category-motherboard/#%21/category-computer-devices/category-computer-parts/category-electronic-devices/category-motherboard/category-motherboard#%21category-computer-devicescategory-computer-partscategory-electronic-devicescategory-motherboard/category-motherboard#%21category-computer-devicescategory-computer-partscategory-electronic-devicescategory-motherboardcategory-motherboard#%21category-computer-devicescategory-computer-partscategory-electronic-devicescategory-motherboard/` – RedArmy Jan 21 '17 at 11:02
  • @HansKesting you are correct, but this depends on the user's requirements. Anyway, I am addressing this issue in my other, expanded answer, which I linked. You can consider that expanded answer to handle most `HttpClient` cases. – Hakan Fıstık Jan 21 '17 at 11:04
  • The site uses AJAX to show the products, so I want to save the complete page with C#, but I have not succeeded. – RedArmy Jan 21 '17 at 11:05
  • Good solution for saving the source code, dear @HakamFostok, but this site uses AJAX or similar to show the products; please see the URL. – RedArmy Jan 21 '17 at 11:08
  • This solution seemed to be working until digikala generated the HTML dynamically using JS and now there is no such thing as price or other useful data in result. Is there an update for this? – Arya Apr 27 '23 at 10:50
  • @Arya you are correct. This solution gets the HTML of the page and does not execute the JavaScript that comes with the page. This answer does not and should not cover that; you will need to take more steps and consult other questions to do that. – Hakan Fıstık Apr 27 '23 at 14:10
3
using (WebClient client = new WebClient())
{
    client.DownloadFile("https://www.digikala.com", @"C:\localfile.html");
}
2
using (WebClient client = new WebClient())
{
    string htmlCode = client.DownloadString("https://www.digikala.com");
}