21

I can download this by hand in IE.

http://scholar.google.com/scholar.ris?q=info:j8ymU9rzMsEJ:scholar.google.com/&output=citation&hl=zh-CN&as_sdt=2000&oe=GB&ct=citation&cd=0

But using the following code:

WebClient client = new WebClient();
client.DownloadFile(address, filename);

throws an exception: 403 Forbidden

What's wrong? How can I do that?

Another URL with the same problem:

http://scholar.google.com/scholar.ris?q=info:sskrpr5jlLwJ:scholar.google.com/&output=citation&hl=zh-CN&as_sdt=2000&oe=GB&ct=citation&cd=1

Tasos K.
Begtostudy

8 Answers

95

Just add a simple line before you make your download:

string url = ... 
string fileName = ...

WebClient wb = new WebClient();
wb.Headers.Add("User-Agent: Other");   //that is the simple line!
wb.DownloadFile(url, fileName);

That's it.

Borg8
  • 1
    +1 - I had suspected this, but only as a theory. Incredible that they block requests with no user agent. Pretty smart when you think about it. – Mathias Lykkegaard Lorenzen Aug 09 '13 at 08:36
  • 3
    My .NET WebClient suddenly started getting a 403 Forbidden error from an https site that used to work with the following `user-agent` setting: `Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)`. Replacing that with `Other` fixed the problem. Can someone tell me why though? – ebhh2001 Nov 19 '17 at 16:12
  • Thanks for this. I made a simple SteamDB fetcher to get game banner images and appid information for games, and it had always worked even without that "User-Agent" line. Recently, I guess, the site was changed to block requests without a user agent and I kept getting a 403, but this resolved my problem. – user1931470 Jul 10 '19 at 04:09
11

A 403 may also be caused by TLS issues. To verify, check the body of the WebException.Response object.

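try
{
    client.DownloadFile(address, filename);
}
catch (WebException ex)
{
    if (ex.Response != null)
    {
        // Read the response body; it often says why the server refused the request.
        using (var dataStream = ex.Response.GetResponseStream())
        using (var reader = new StreamReader(dataStream))
        {
            var details = reader.ReadToEnd();
        }
    }
}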

If it is a TLS issue, try adding this to your code to force TLS 1.2.

For .NET 4:

ServicePointManager.SecurityProtocol = (SecurityProtocolType)3072;

For .NET 4.5 or later:

ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12;
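
As a variation on the line above, here is a minimal sketch that turns TLS 1.2 on without switching off whatever protocols are already enabled. It assumes .NET 4.5 or later; the class and method names (TlsSetup, EnableTls12) are made up for illustration. On newer .NET versions ServicePointManager is marked obsolete, so treat this as applying to .NET Framework code that uses WebClient.

```csharp
using System.Net;

static class TlsSetup
{
    // Hypothetical helper: adds TLS 1.2 to the enabled protocols
    // instead of replacing the whole set.
    public static void EnableTls12()
    {
        ServicePointManager.SecurityProtocol |= SecurityProtocolType.Tls12;
    }
}
```

Call TlsSetup.EnableTls12() once, before the first request is made.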

Developer Wonk
  • After more investigation and trial and error, I found that the URL was not the problem: if I reversed the order of the two requests, the first one worked and the other failed. I thought it might be server throttling, so I tried System.Threading.Thread.Sleep between requests, but that did not help. Eventually I moved the header additions inside the loop, clearing the headers and re-adding them on each iteration, and everything worked as expected. I am not sure why this caused the issue, and only on a couple of web sites, but it appears to be a solution. – user3502865 Apr 26 '18 at 17:43
  • This "ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12;" worked for me. – Nanu Jun 21 '18 at 20:02
8

I had this problem trying to download an image from a SharePoint site URL. In my case, setting the user-agent header to Other or to blank wasn't enough; I had to set it as follows instead:

client.Headers.Add("user-agent", "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:25.0) Gecko/20100101 Firefox/25.0");

That solution came from this answer.

tomRedox
1

You need to set the appropriate HTTP headers before calling your DownloadFile method.

WebClient webClient = new WebClient();
webClient.Headers.Add("???", "???");
webClient.Headers.Add("???", "???");
webClient.Headers.Add("???", "???");
webClient.DownloadFile(address, filename);

Finding the correct values to put in place of these question marks can be tricky. You will need Fiddler, or some other program or browser extension, to reveal which HTTP headers your browser sends to Google, and then replicate the same request in your program.

lubos hasko
  • http://scholar.google.com/scholar.ris?q=info:sskrpr5jlLwJ:scholar.google.com/&output=citation&hl=zh-CN&as_sdt=2000&oe=GB&ct=citation&cd=1 I used Fiddler, but it shows client, cookies, and transport headers. Which of them should be used? – Begtostudy Jul 17 '10 at 16:31
1

This is what happened to me:

I was trying to download a (public) .xls file via the DownloadFile method. The same file downloaded without trouble from every browser.

After trying all the answers here without luck, I finally opened the Stack and noticed something odd (see the screenshot).

The file downloaded over http in the browser, but the DownloadFile method returned a 403 error for the same URL.

Finally, I just changed the URL from http://something to https://something and it worked fine.
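
The scheme change described above can also be done in code rather than by editing the string by hand. This is only a sketch: the class and method names (SchemeHelper, ToHttps) are made up for illustration, and it assumes the server offers the same path over https, which was true in this case but is not guaranteed in general.

```csharp
using System;

static class SchemeHelper
{
    // Hypothetical helper: returns the same address with the https scheme.
    // Addresses that are not plain http are returned unchanged.
    public static Uri ToHttps(Uri address)
    {
        if (address.Scheme != Uri.UriSchemeHttp)
        {
            return address;
        }

        var builder = new UriBuilder(address)
        {
            Scheme = Uri.UriSchemeHttps,
            // -1 means "use the scheme's default port"; otherwise the
            // result would keep port 80 from the http address.
            Port = address.IsDefaultPort ? -1 : address.Port
        };
        return builder.Uri;
    }
}
```

The result can then be passed to DownloadFile in place of the original address.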

Hope this helps!

Screenshot

1JD
0

I get a 403 in IE too, so I guess you need to be logged in to retrieve the resource. Your browser may have the credentials cached, but your app isn't designed to log you in. Or are you logged in to Google in your browser? Try logging out and see if you still have access.

philiphobgen
  • http://scholar.google.com/scholar.ris?q=info:sskrpr5jlLwJ:scholar.google.com/&output=citation&hl=zh-CN&as_sdt=2000&oe=GB&ct=citation&cd=1 But I still get System.Net.WebException: The remote server returned an error: (403) Forbidden. – Begtostudy Jul 17 '10 at 15:56
  • I'd take a look at this project, http://desktopgooglereader.codeplex.com/, where it looks like they've solved this problem, including recent changes by Google. – philiphobgen Jul 17 '10 at 16:26
0

The key to solving this for me was to make the request once from code and a second time in the browser, log both requests with Fiddler, and make sure the headers match.

I ended up having to add headers for:

  • Accept
  • Accept-Encoding
  • Accept-Language
  • User-Agent
  • Upgrade-Insecure-Requests

I hope this helps people in the future.
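
A minimal sketch of setting such headers on a WebClient follows. The class and method names (BrowserLikeClient, Create) are made up for illustration, and every header value is a placeholder assumption: substitute the values Fiddler shows for your own browser. Accept-Encoding is left out on purpose, because WebClient does not decompress gzip responses on its own and the downloaded file could end up compressed.

```csharp
using System.Net;

static class BrowserLikeClient
{
    // Hypothetical helper: builds a WebClient whose headers imitate a browser.
    // All values are placeholders; copy the real ones from a captured request.
    public static WebClient Create()
    {
        var client = new WebClient();
        client.Headers.Add(HttpRequestHeader.Accept, "text/html,application/xhtml+xml,*/*;q=0.8");
        client.Headers.Add(HttpRequestHeader.AcceptLanguage, "en-US,en;q=0.5");
        client.Headers.Add(HttpRequestHeader.UserAgent,
            "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:25.0) Gecko/20100101 Firefox/25.0");
        client.Headers.Add("Upgrade-Insecure-Requests", "1");
        return client;
    }
}
```

Creating a fresh client for each download, for example `using (var client = BrowserLikeClient.Create()) { client.DownloadFile(address, filename); }`, also sidesteps the situation one commenter in this thread reported, where headers had to be re-added between requests.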

JMK
0

I ran into the same issue trying to download a file from an Amazon S3 URL. I blogged about it here: http://blog.cdeutsch.com/2010/11/net-webclient-403-forbidden-error.html

The final solution I used was found here: GETting a URL with an url-encoded slash

cdeutsch