I'm using SharpBITS to download a file from Amazon S3.

> // Create a new download job.
> BitsJob job = this._bitsManager.CreateJob(jobName, JobType.Download);
> // Add the file to the job.
> job.AddFile(downloadFile.RemoteUrl, downloadFile.LocalDestination);
> // Start the transfer (BITS creates jobs in a suspended state).
> job.Resume();

It works for files that do not need authentication. However, as soon as I add the authentication query string to the Amazon S3 file request, the server responds with HTTP status 403 (Forbidden). The same URL works fine in a browser.

Here is the HTTP request sent by the BITS service:

HEAD /mybucket/6a66aeba-0acf-11df-aff6-7d44dc82f95a-000001/5809b987-0f65-11df-9942-f2c504c2c389/v10/summary.doc?AWSAccessKeyId=AAAAZ5SQ76RPQQAAAAA&Expires=1265489615&Signature=VboaRsOCMWWO7VparK3Z0SWE%2FiQ%3D HTTP/1.1
Accept: */*
Accept-Encoding: identity
User-Agent: Microsoft BITS/7.5
Connection: Keep-Alive
Host: s3.amazonaws.com

The only difference from the request a web browser sends is the request method: Firefox makes a GET request, while BITS makes a HEAD request. Are there any known issues with Amazon S3 HEAD requests and query string authentication?

Regards, Blaz

  • It would be helpful to see exactly what the HTTP request that SharpBits generates looks like. You might be able to get that out using the debugger. – Adam Crossland Feb 06 '10 at 17:31
  • I think the problem may be the HEAD request; perhaps S3 does not handle it properly. BITS uses the Range header. – Blaz Lipuscek Feb 06 '10 at 18:10
  • The fact that these are in the comments makes them nearly unintelligible. Why don't you edit your question, include the headers there, and format them as a code block? – Adam Crossland Feb 06 '10 at 18:11
  • That's an excellent line of inquiry, user*. – Adam Crossland Feb 06 '10 at 18:11
  • Yes, HEAD is the problem. The Signature parameter is a hash that also covers the HTTP method. When I changed the signature generator to use HEAD, it worked. However, there is another major problem: the next request BITS sends after the HEAD is a GET, and now I'm stuck with the signature problem again :). Unfortunately, I cannot pass different URLs for the HEAD and GET requests. The only solution I see is a proxy? – Blaz Lipuscek Feb 06 '10 at 19:16
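
The method dependence described in the last comment can be illustrated with a minimal sketch. It assumes the legacy S3 query-string scheme (Signature Version 2, which matches the AWSAccessKeyId, Expires and Signature parameters in the request above) and a request with no Content-MD5, Content-Type or x-amz- headers. The class name S3QuerySigner, the resource path and the secret key are hypothetical and exist only for illustration; they are not part of SharpBITS or the AWS SDK.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper for illustration only.
static class S3QuerySigner
{
    // Builds the Signature Version 2 string-to-sign for a pre-signed URL:
    //   METHOD \n Content-MD5 \n Content-Type \n Expires \n Resource
    // Content-MD5 and Content-Type are assumed to be empty here.
    // Returns the Base64-encoded HMAC-SHA1 of that string.
    public static string Sign(string method, string resource, long expires, string secretKey)
    {
        string stringToSign = method + "\n" + "\n" + "\n" + expires + "\n" + resource;
        using (var hmac = new HMACSHA1(Encoding.UTF8.GetBytes(secretKey)))
        {
            byte[] hash = hmac.ComputeHash(Encoding.UTF8.GetBytes(stringToSign));
            return Convert.ToBase64String(hash);
        }
    }
}

static class Program
{
    static void Main()
    {
        const string resource = "/mybucket/summary.doc"; // hypothetical object path
        const string secret = "example-secret-key";      // hypothetical secret key
        const long expires = 1265489615;                 // Expires value from the request above

        string head = S3QuerySigner.Sign("HEAD", resource, expires, secret);
        string get = S3QuerySigner.Sign("GET", resource, expires, secret);

        // The signature must be URL-encoded before it goes into the query string.
        Console.WriteLine("HEAD signature: " + Uri.EscapeDataString(head));
        Console.WriteLine("GET signature:  " + Uri.EscapeDataString(get));
        Console.WriteLine("Signatures match: " + (head == get));
    }
}
```

Because the method is the first field of the signed string, a URL signed for GET cannot validate for the HEAD request that BITS sends first, and a URL signed for HEAD cannot validate for the GET that follows.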

2 Answers

3

You are probably right that a proxy is the only way around this. BITS uses the HEAD request to get a content length and decide whether or not it wants to chunk the file download. It then does the GET request to actually retrieve the file - sometimes as a whole if the file is small enough, otherwise with range headers.

If you can use a proxy or some other trick to give it any kind of response to the HEAD request, it should get unstuck. Even if the HEAD request is faked with a fictitious content length, BITS will move on to a GET. You may see duplicate GET requests in a case like this, because if the first GET request returns a content length longer than the original HEAD request, BITS may decide "oh crap, I better chunk this after all."

Given that, I'm kind of surprised it's not smart enough to recover from a 403 error on the HEAD request and still move on to the GET. What is the actual behaviour of the job? Have you tried watching it with bitsadmin /monitor? If the job is sitting in a transient error state, it may stay there for around 20 minutes and then ultimately recover.

2

Before beginning a download, BITS sends an HTTP HEAD request to the server in order to figure out the remote file's size, timestamp, etc. This is especially important for BranchCache-based BITS transfers and is the reason why server-side HTTP HEAD support is listed as an HTTP requirement for BITS downloads.

That being said, BITS bypasses the HTTP HEAD request phase, issuing an HTTP GET request right away, if either of the following conditions is true:

  1. The BITS job is configured with the BITS_JOB_PROPERTY_DYNAMIC_CONTENT flag.
  2. BranchCache is disabled AND the BITS job contains a single file.

Workaround (1) is the most appropriate, since it doesn't affect other BITS transfers in the system.

For workaround (2), BranchCache can be disabled through BITS' DisableBranchCache group policy. After making any Group Policy change, run "gpupdate" from an elevated command prompt; otherwise it will take ~90 minutes for the change to take effect.