How to send larger amounts of data to a stateless RESTful retrieval service

Question

This seems quite a basic question so apologies if this has been asked before; please point me in the direction of any useful resources.

So I have a RESTful service to retrieve some data. However, the RESTful service requires a certain amount of data in order to do the retrieval. This data could be roughly summed up as "user context" data - information about the user (whether stored by the calling application or previously retrieved from another application) which the service needs to use to action the retrieval.

Since REST works semantically, the correct verb (HTTP method) to retrieve something is GET. Most example GET requests I have seen use only small amounts of data, passed in the URL. However, once we get into the realm of services which require larger amounts of data to make the retrieval, it seems wrong to put all that information in the URL. Not only that, but certain components enforce limits on URL length (a commonly cited practical ceiling is around 2,000 characters, though it varies by browser, proxy and server).
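To get a feel for how quickly a "user context" outgrows the URL, a rough sketch along these lines can help. The context fields, the `https://api.example.com/documents` endpoint and the 2,000-character budget are all made up for illustration; the real limit depends on the clients, proxies and servers involved.

```python
from urllib.parse import urlencode

# Assumed budget, for illustration only; actual limits vary by component.
URL_BUDGET = 2000


def build_get_url(base_url: str, context: dict) -> str:
    """Serialize the context as a query string appended to base_url."""
    return f"{base_url}?{urlencode(context, doseq=True)}"


def fits_in_url(base_url: str, context: dict, budget: int = URL_BUDGET) -> bool:
    """Return True if the fully encoded URL stays within the budget."""
    return len(build_get_url(base_url, context)) <= budget


# Hypothetical contexts: a tiny one, and one with a long list of roles.
small = {"user": "alice", "locale": "en-GB"}
large = {"user": "alice", "roles": [f"role-{i}" for i in range(500)]}
```

With these made-up values, `small` fits comfortably in a GET URL while `large` does not, which is the situation the question is about.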

Seemingly the options available are:

1. Use POST to send the data in the request body. However, this is not semantic since we are not asking the service to update anything, only retrieve.
2. Put the larger portion of information (in my case, the "user context") into an HTTP header. However, this "feels wrong" as headers should be used for headers, not data.
3. Make multiple requests to send the data in multiple URLs. However, this seems to break the stateless goal, as the service would have to maintain some kind of state to tie the requests together.
4. Write the data to a database and then pass the service a key to retrieve the data from there. However, this would result in the request not being self-contained and also introduce performance bottlenecks.

Is there another option? What's the best practice here?

I would think about context reusability. If the context is not reusable, then use POST and solve caching somehow. Otherwise better to store the context at least partially on the server as a separate resource linked to the API consumer. — inf3rno, Sep 17 '22 at 06:07

score 2 · Answer 1 · edited May 23 '17 at 12:25

Although the HTTP specification itself sets no limit, HTTP headers, like the request URL, are limited in length in practice by the servers, proxies and clients that handle them.

You could adjust or remove server limits, but then you'd make it much harder to remain compatible with 3rd party systems (such as HTTP caches), and there's no guarantee that arbitrary HTTP clients would support surpassing these de facto limits either.

The only compatible way to send arbitrarily large amounts of data to the server is in the request body. Of the standard HTTP methods, POST, PUT and PATCH are the ones defined to carry a body, and of those, PUT and PATCH (which replace or modify the target resource) are semantically incompatible with what you're trying to accomplish.

Unless you can guarantee that all necessary information will always fit in the URL or the request headers (taking into account the aforementioned limits), you should design your REST API to express this need via a POST request.

It's a misconception that the POST verb is only used for modifying or creating something on the server. The other common verbs carry stricter guarantees: GET is safe (it must not change server state), and GET, PUT and DELETE are idempotent (repeating the request has the same effect as making it once). POST promises neither, which is why it is the natural fit for creating or modifying things, but in general POST is a catch-all for anything the other verbs don't do well.

POST can legitimately be used as a retrieval or processing verb (often answered with a redirect to the result) when the limitations of GET cannot otherwise be worked around. POST responses can also be made to behave more like GET responses by setting caching-related response headers such as Cache-Control and Expires, although many caches do not store POST responses in practice.
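As a rough illustration of POST used purely for retrieval, here is a minimal sketch using only Python's standard library. The `/search` path, the JSON shape of the context, the placeholder result and the `max-age=60` value are all assumptions made up for the example, and whether an intermediary actually reuses a POST response is up to that cache.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class SearchHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly Content-Length bytes of JSON "user context" from the body.
        length = int(self.headers.get("Content-Length", 0))
        context = json.loads(self.rfile.read(length) or b"{}")
        # Placeholder retrieval: a real service would look up data here.
        payload = json.dumps({"user": context.get("user"), "items": []}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        # Advertise that the response may be reused briefly (assumed policy).
        self.send_header("Cache-Control", "private, max-age=60")
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):
        pass  # keep the sketch quiet


def start_server():
    """Start the sketch server on a free local port and return it."""
    server = HTTPServer(("127.0.0.1", 0), SearchHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def post_search(port: int, context: dict):
    """POST the context in the body; return (Cache-Control value, parsed JSON)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("POST", "/search", body=json.dumps(context).encode(),
                     headers={"Content-Type": "application/json"})
        resp = conn.getresponse()
        return resp.getheader("Cache-Control"), json.loads(resp.read())
    finally:
        conn.close()
```

Nothing is created or modified on the server here; the body is only input to the lookup, so the context can be as large as the server is willing to read.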

Wish I could upvote this more than once. I've struggled with blindly using HTTP verbs only for (my assumption of) their stated purpose. I've been doing what I saw as abusing the POST verb for years and finally feel somewhat justified in my treatment of POST. I love you POST, don't ever change who you are. — Mike Devenney, Feb 12 '19 at 00:45

score 2 · Accepted Answer · answered Sep 17 '22 at 04:40

If your goal is to perform a complex read operation that needs a lot of input data, and you want to use an appropriate HTTP method, these are probably your two best options:

1. Do a POST request to create a 'query resource', and then do a separate GET request to receive the result of that query.
2. Use the HTTP QUERY method, which is specifically designed for this purpose. QUERY is basically a POST-like method, but it's intended for safe, idempotent read operations and supports a body. It's relatively new (an IETF draft at the time of writing), but most HTTP clients and servers accept any HTTP method, known or unknown.

Option 1 effectively gives you a URL that you can reuse, link to and do GET operations on, which can be especially helpful in hypermedia applications. The drawbacks are that the server needs to remember more, it takes two HTTP requests, and you might need some kind of cleanup. That makes option 2 much easier to implement.
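The server-side bookkeeping for the 'query resource' option could look roughly like the sketch below. It's an in-memory stand-in with no HTTP layer; the `QueryStore` class, the `/queries/<id>` path layout, the five-minute time-to-live and the placeholder result are all hypothetical choices for illustration.

```python
import time
import uuid


class QueryStore:
    """In-memory stand-in for server-side 'query resources' (hypothetical)."""

    def __init__(self, ttl_seconds: float = 300.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._queries = {}  # query id -> (expiry time, stored context)

    def create(self, context: dict) -> str:
        """What POST /queries would do: store the context, return the new path."""
        query_id = uuid.uuid4().hex
        self._queries[query_id] = (self._clock() + self._ttl, dict(context))
        return f"/queries/{query_id}"

    def fetch(self, path: str):
        """What GET /queries/<id> would do: return a result, or None if gone."""
        self.purge_expired()
        entry = self._queries.get(path.rsplit("/", 1)[-1])
        if entry is None:
            return None
        _, context = entry
        # Placeholder retrieval: a real service would query its data here.
        return {"user": context.get("user"), "items": []}

    def purge_expired(self) -> None:
        """Drop query resources whose time-to-live has elapsed (the cleanup)."""
        now = self._clock()
        self._queries = {k: v for k, v in self._queries.items() if v[0] > now}
```

The path returned by `create` can be fetched repeatedly with a plain GET until it expires, which is where the extra server state and the cleanup step come from.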

thanks for mentioning the HTTP QUERY method, I hadn't heard of that. looking forward to that being implemented in more places (e.g. java HttpClient) — Adam Burley, Sep 21 '22 at 12:29
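On client and server support: as a rough sketch, Python's standard library will already send and receive a QUERY request without knowing anything about the method. The `/documents` path, the JSON context and the placeholder result are assumptions for the example, and intermediaries between client and server may still treat an unfamiliar method differently.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class QueryHandler(BaseHTTPRequestHandler):
    # http.server dispatches on "do_" + the request method, so a method it
    # has never heard of only needs a handler with the matching name.
    def do_QUERY(self):
        length = int(self.headers.get("Content-Length", 0))
        context = json.loads(self.rfile.read(length) or b"{}")
        # Placeholder retrieval: a real service would look up data here.
        payload = json.dumps({"user": context.get("user"), "items": []}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):
        pass  # keep the sketch quiet


def start_query_server():
    """Start the sketch server on a free local port and return it."""
    server = HTTPServer(("127.0.0.1", 0), QueryHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def send_query(port: int, context: dict):
    """Send the context in the body of a QUERY request; return (status, JSON)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        # http.client passes the method name through as given.
        conn.request("QUERY", "/documents", body=json.dumps(context).encode(),
                     headers={"Content-Type": "application/json"})
        resp = conn.getresponse()
        return resp.status, json.loads(resp.read())
    finally:
        conn.close()
```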

score 0 · Answer 3 · answered Sep 17 '22 at 02:38

The problem with REST is that it does not assume any intermediate state, so you cannot split a request and send it in parts. However, you can stream your parameters as a file in the request body. The request can then be of essentially unlimited size: as the file is streamed, it is gradually written to disk on the server. Once the upload finishes, the server can read the file it just saved in small chunks.
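A minimal sketch of that idea, assuming the server hands you a readable stream for the request body and a known Content-Length. The function names and the 64 KiB chunk size are made up for illustration.

```python
import tempfile

CHUNK_SIZE = 64 * 1024  # assumed chunk size; tune for your workload


def spool_body(stream, content_length: int, chunk_size: int = CHUNK_SIZE):
    """Copy exactly content_length bytes from stream into a temp file, in chunks."""
    spool = tempfile.TemporaryFile()
    remaining = content_length
    while remaining > 0:
        chunk = stream.read(min(chunk_size, remaining))
        if not chunk:
            break  # the sender stopped early
        spool.write(chunk)
        remaining -= len(chunk)
    spool.seek(0)  # rewind so the caller can read from the start
    return spool


def iter_chunks(spool, chunk_size: int = CHUNK_SIZE):
    """Yield the spooled body back in small chunks for processing."""
    while True:
        chunk = spool.read(chunk_size)
        if not chunk:
            return
        yield chunk
```

This way the whole body never has to sit in memory at once, neither while it arrives nor while it is processed.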

score 0 · Answer 4 · answered Dec 19 '22 at 21:51

The Heavy-HTTP project is specifically designed to handle this problem. (I'm one of the authors of the project)

Pasindu
