26

I'm developing an HTTP client/server framework and am looking for the correct way to handle partial uploads (the counterpart of partial downloads, which use the GET method with a Range header).

However, HTTP PUT is not intended to be resumed, and the PATCH method, as far as I know, doesn't accept a Range header.

Is there any way to handle this within the HTTP standard (without using extension headers and the like)?

E_net4
  • See @btimby's answer in [Difference between Content-Range and Range headers?](http://stackoverflow.com/questions/716680/difference-between-content-range-and-range-headers). – CodeCaster Jan 07 '14 at 10:40
  • Thanks for your comment. I have seen the question about the difference and its answers. But it is not clear for partial PUT, because some RFCs say that a Content-Range header with PUT is not acceptable. As for the PATCH method, I haven't seen any information about using Content-Range with it. – Андрей Москвичёв Jan 07 '14 at 11:47
  • The spec doesn't forbid it, but you'll have to consult your server's manual on whether it implements it or not. You might have to write custom code or configuration depending on your server software and version. – CodeCaster Jan 07 '14 at 11:53
  • I'm writing an HTTP client and server from scratch. Of course, I can use some non-standard extension, but if there is a standard way, it's always better to use it. – Андрей Москвичёв Jan 07 '14 at 12:56
  • Then explain what you are trying to do. If you want your client to support it, you'll have to _know_ somehow the server implements it. Is your actual question _"How to detect an HTTP server supports partial uploads using the `Content-Range` header"_? If you want your server to support it, just implement it. – CodeCaster Jan 07 '14 at 13:02
  • Thanks for your comment. "How to detect ..." is actually another question. Initially, my question was because I need to know the standard method for partial uploads, one that is accepted by all major HTTP server implementations. – Андрей Москвичёв Jan 07 '14 at 13:45

4 Answers

15

I think there is no standard for partial uploads:

  • Content-Range inside requests is not explicitly forbidden in RFC 2616 (HTTP), but the wording refers to it as a response header that is used in the response to a range request
  • while you could use the PATCH method to update an existing resource (e.g. to add more bytes), it would not be the same as a partial upload, because the incomplete resource would be available the whole time

If you look at the protocols of Dropbox, Google Drive etc., they all roll their own protocol to transfer a single file in multiple chunks. What you need for resumable uploads is

  • a way to address an incomplete upload. Normal URLs address a complete, not a partial resource, and I know of no standard for partial resources.
  • a way to find out the current state of the upload, maybe also checksums of the parts to be sure that the local file did not change. This could be provided by the WebDAV PROPFIND method (once you are able to address the incomplete resource :)
  • a way to upload a chunk. Here one could maybe use PATCH with a Content-Range header. mod_dav seems to allow PUT with a Content-Range header (see http://www.gossamer-threads.com/lists/apache/users/432346)
  • a way to publish the resource once it is complete, or a way to define up front what complete means (e.g. size of the resource, checksum...)
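
The four requirements above can be sketched as server-side state for a single upload. This is only an illustration under my own assumptions, not any standard or any vendor's protocol: the `UploadSession` class, its method names, and the choice of SHA-256 are all hypothetical.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class UploadSession:
    """Hypothetical server-side state for one incomplete upload."""
    expected_size: int      # declared up front: what "complete" means
    expected_sha256: str    # checksum the finished file must match
    data: bytearray = field(default_factory=bytearray)
    published: bool = False

    def offset(self) -> int:
        """Current state of the upload: number of bytes received so far."""
        return len(self.data)

    def append(self, start: int, chunk: bytes) -> int:
        """Accept a chunk only if it continues exactly where the upload stopped."""
        if self.published:
            raise ValueError("upload already published")
        if start != len(self.data):
            raise ValueError(f"expected offset {len(self.data)}, got {start}")
        if start + len(chunk) > self.expected_size:
            raise ValueError("chunk exceeds declared size")
        self.data += chunk
        return len(self.data)

    def publish(self) -> bytes:
        """Expose the resource only once size and checksum both match."""
        if len(self.data) != self.expected_size:
            raise ValueError("upload incomplete")
        if hashlib.sha256(self.data).hexdigest() != self.expected_sha256:
            raise ValueError("checksum mismatch")
        self.published = True
        return bytes(self.data)


payload = b"hello, resumable world"
session = UploadSession(len(payload), hashlib.sha256(payload).hexdigest())
session.append(0, payload[:10])       # first chunk, then the connection drops
resume_from = session.offset()        # client asks where to resume
session.append(resume_from, payload[resume_from:])
result = session.publish()
```

Keeping the session separate from the final resource is what avoids exposing the incomplete file; how the session itself is addressed over HTTP is still up to the protocol you design.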
Eponymous
Steffen Ullrich
  • The HTTP spec now unfortunately explicitly forbids use of the Content-Range header in PUT requests; that would have been the simplest and cleanest solution of all. – Piranna Jul 28 '17 at 12:10
  • @Piranna Agreed, the feature seems to have been rejected chiefly on fears -- that existing proxies (compliant at best with RFC 2616) would be interfering with the request-response chain. As if the choice couldn't have been left up to HTTP clients, at least. As a result, we're left with, quoting the answer, "it would not be the same as a partial upload, because all the time the incomplete resource would be available" problem which PATCH doesn't seem to specifically address. Does a file being uploaded through a sequence of PATCH requests, have a valid representation between these? – Armen Michaeli Sep 14 '21 at 11:33
  • A PATCH request can specify multiple diff fragments, so if all of them have been correctly applied in a single PATCH request, the patched resource should have a new valid state; if not, then the PATCH request was not well formed on the client side in the first place. – Piranna Sep 15 '21 at 12:26
4

As noted in some of the comments, newer versions of the HTTP specification have clarified this somewhat. Per Section 4.3.4 of RFC 7231:

An origin server that allows PUT on a given target resource MUST send a 400 (Bad Request) response to a PUT request that contains a Content-Range header field (Section 4.2 of [RFC7233]), since the payload is likely to be partial content that has been mistakenly PUT as a full representation. Partial content updates are possible by targeting a separately identified resource with state that overlaps a portion of the larger resource, or by using a different method that has been specifically defined for partial updates (for example, the PATCH method defined in [RFC5789]).

Unfortunately, the discussion of the range headers which occurs in RFC 7233 focuses more or less entirely on GET requests, and RFC 5789 defines almost nothing about PATCH except that it is specifically not required to transmit the entire content (but is allowed to), nor is it required to be idempotent (but is allowed to be).

The bright side is that because PATCH is so loosely defined, it does accommodate the approach given in an answer to a related question (https://stackoverflow.com/a/6711496/7467189): just change "PUT" to "PATCH". While there is no requirement that a server interpret a PATCH request with a Content-Range header this way, it is certainly a valid interpretation, just not one that can be relied upon from arbitrary servers or clients. But in cases such as the original question, where control of both ends is available, it is at least an obvious approach and does not violate the current standards.
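
Since control of both ends is assumed, here is a minimal sketch of how a server might apply this: reject PUT with Content-Range as RFC 7231 requires, and interpret PATCH with Content-Range as "write these bytes at this offset". The function names, the plain `dict` of headers with canonical names, and the 204/409/405 status choices are my own assumptions, not something the RFCs prescribe.

```python
import re
from typing import Optional, Tuple

_CONTENT_RANGE = re.compile(r"^bytes (\d+)-(\d+)/(\d+|\*)$")


def parse_content_range(value: str) -> Optional[Tuple[int, int, Optional[int]]]:
    """Parse 'bytes first-last/total' (total may be '*'); None if malformed."""
    m = _CONTENT_RANGE.match(value.strip())
    if not m:
        return None
    first, last = int(m.group(1)), int(m.group(2))
    total = None if m.group(3) == "*" else int(m.group(3))
    if first > last or (total is not None and last >= total):
        return None
    return first, last, total


def handle_upload(method: str, headers: dict, body: bytes, stored: bytearray) -> int:
    """Return an HTTP status code and update `stored` in place."""
    content_range = headers.get("Content-Range")
    if method == "PUT":
        if content_range is not None:
            return 400                  # required by RFC 7231, Section 4.3.4
        stored[:] = body                # full replacement
        return 204
    if method == "PATCH":
        parsed = parse_content_range(content_range) if content_range else None
        if parsed is None:
            return 400
        first, last, _total = parsed
        if last - first + 1 != len(body):
            return 400                  # range and payload length disagree
        if first > len(stored):
            return 409                  # assumption: refuse to leave a gap
        stored[first:last + 1] = body   # overwrite or extend at the offset
        return 204
    return 405


stored = bytearray()
put_status = handle_upload("PUT", {"Content-Range": "bytes 0-4/10"}, b"hello", stored)
first_status = handle_upload("PATCH", {"Content-Range": "bytes 0-4/10"}, b"hello", stored)
second_status = handle_upload("PATCH", {"Content-Range": "bytes 5-9/10"}, b"world", stored)
```

A client talking to an arbitrary server cannot rely on this behaviour; it only works because both sides agree on it in advance.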

One additional consideration is that the Content-Type should express whatever is being transmitted, rather than the content type of the entity as a whole (the RFC gives some examples in this regard). For content that is being "patched" in arbitrary chunks this would imply application/octet-stream, although there are scenarios where the client and server might be more content-aware and opt to send patches as entities which have a more specific definition (e.g. single pages of a multi-page image format).

Joel Aelwyn
4

PATCH would be a logical method to choose for resumable uploads: it expects a media type that indicates how to change the target resource. Though not specifically defined as a format for patching, multipart/byteranges specifies a byte range and the contents of that range, making it suitably well defined for PATCH payloads.

Example:

PATCH /document HTTP/1.1
Content-Type: multipart/byteranges; boundary=THIS_STRING_SEPARATES

--THIS_STRING_SEPARATES
Content-Type: text/plain
Content-Range: bytes 10-21/22

1234567890
--THIS_STRING_SEPARATES--

This example uploads twelve bytes at a ten-byte offset. THIS_STRING_SEPARATES is an arbitrary, user-picked delimiter and should be randomly generated. Some headers are omitted for brevity; each line is terminated with ␍␊.
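
For illustration, a request like this could be assembled as follows. This is a rough sketch, not a reference implementation: the `build_byteranges_patch` helper, the `example.invalid` host, and the default part type are assumptions of mine.

```python
import secrets


def build_byteranges_patch(path: str, chunk: bytes, offset: int, total: int,
                           part_type: str = "application/octet-stream") -> bytes:
    """Build a raw HTTP/1.1 PATCH request carrying one multipart/byteranges part."""
    if not chunk:
        raise ValueError("chunk must not be empty")
    boundary = secrets.token_hex(16)    # randomly generated delimiter
    last = offset + len(chunk) - 1      # Content-Range positions are inclusive
    body = b"".join([
        f"--{boundary}\r\n".encode("ascii"),
        f"Content-Type: {part_type}\r\n".encode("ascii"),
        f"Content-Range: bytes {offset}-{last}/{total}\r\n".encode("ascii"),
        b"\r\n",
        chunk,
        # The CRLF before the closing delimiter belongs to the delimiter,
        # not to the chunk.
        f"\r\n--{boundary}--\r\n".encode("ascii"),
    ])
    head = (
        f"PATCH {path} HTTP/1.1\r\n"
        "Host: example.invalid\r\n"     # placeholder host
        f"Content-Type: multipart/byteranges; boundary={boundary}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    ).encode("ascii")
    return head + body


# Twelve bytes written at offset 10 of a 22-byte document.
request = build_byteranges_patch("/document", b"123456789012", 10, 22)
```

The resulting bytes can be written to a socket as they are; the server still has to agree to interpret the payload this way.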

awwright
  • That's an interesting idea. How about writing this down in an Internet Draft? – Julian Reschke Jun 18 '19 at 05:45
  • @JulianReschke Working on it! It would have to answer a few more questions, like how to do resumable POST (e.g. what if I don't want to name the document that's being uploaded), and how the server should respond to requests on resources that are only partially uploaded. – awwright Jun 18 '19 at 09:04
-2

Use the Range xxxx-yyyy header or the Range xxxx- header with PUT to update part of a file. It is supported by Apache.

Don't be confused by the statement in RFC 7231 that Content-Range cannot be used. That is intended to prevent badness from clients taking headers received from the server and using PUT to send them back to the server. This caution is not relevant to the question of partial PUTs.

Bruce
  • Do you have a reference to back up the claim about the **intention** of the statement in RFC 7231? – Ajoy Bhatia Jun 21 '18 at 20:45
  • No, it was just my interpretation of the RFCs as I was studying to make it work. Actually after that I had problems getting Apache to do it correctly, sorry. – Bruce Jun 23 '18 at 06:41