99

I have a set of resources whose representations are lazily created. The computation to construct these representations can take anywhere from a few milliseconds to a few hours, depending on server load, the specific resource, and the phase of the moon.

The first GET request received for the resource starts the computation on the server. If the computation completes within a few seconds, the computed representation is returned. Otherwise, a 202 "Accepted" status code is returned, and the client must poll the resource until the final representation is available.
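To make the flow above concrete, here is a minimal Python sketch of the server-side decision. Everything named here (`LazyResourceStore`, `compute`, the two-second grace period) is a hypothetical stand-in rather than part of the real service. It also keeps pending work in an in-memory dict and is not thread-safe, which is a simplification: the real system persists requests once the grace period has passed.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


class LazyResourceStore:
    """Hypothetical sketch: answer 200 if ready within a grace period, else 202."""

    def __init__(self, compute, grace_seconds=2.0):
        self._compute = compute        # assumed callable: resource_id -> representation
        self._grace = grace_seconds    # assumed value for "a few seconds"
        self._pool = ThreadPoolExecutor(max_workers=4)
        self._futures = {}             # in-memory only; not safe for concurrent callers

    def get(self, resource_id):
        """Return (status, body) for a GET on the given resource."""
        future = self._futures.get(resource_id)
        if future is None:
            # First GET for this resource: start the computation.
            future = self._pool.submit(self._compute, resource_id)
            self._futures[resource_id] = future
        try:
            # Wait briefly; a finished computation returns immediately.
            return 200, future.result(timeout=self._grace)
        except FutureTimeout:
            # Still computing: tell the client to poll the same URI later.
            return 202, None
```

Repeating the same GET never starts a second computation, which is the sense in which the request stays safe and idempotent from the client's point of view.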

The reason for this behavior is the following: If a result is available within a few seconds, it needs to be retrieved as soon as possible; otherwise, when it becomes available is not important.

Due to limited memory and the sheer volume of requests, neither NIO nor long polling is an option (i.e., I can't keep nearly enough connections open, nor can I even fit all of the requests in memory; once "a few seconds" have passed, I persist the excess requests). Likewise, client limitations mean they cannot handle a completion callback instead. Finally, note that I'm not interested in creating a "factory" resource that one POSTs to, as the extra roundtrips mean we fail the piecewise realtime constraint more often than is desired (moreover, it's extra complexity; also, this is a resource that would benefit from caching).

I imagine there is some controversy over returning a 202 "Accepted" status code in response to a GET request, seeing as I've never seen it in practice, and its most intuitive use is in response to unsafe methods, but I've never found anything specifically discouraging it. Moreover, am I not preserving both safety and idempotency?

So, what do folks think about this approach?

EDIT: I should mention this is for a so-called business web API--not for browsers.

user359996
  • 4
    I personally think it's a good one; it is _exactly_ the definition of a `202`. That it is seldom used in practice is IMHO more because few web developers care about proper status codes, as they're more used to browser / user-agent interaction, in which case a `202` gives them no visible clue (give them a `200` and they're happy...). – Wrikken Nov 04 '10 at 18:26
  • 2
    @user359996, just use `200`. `202` is what it's *supposed* to be, but in practice people don't expect `202`. – Pacerier Oct 08 '15 at 10:49
  • 1
    it needs an ETA for a 200 to be useful in practice though. – Rob Mar 02 '17 at 18:30
  • Note that the POST method is cacheable so that is not a valid argument for excluding it (cf. RFC 7231, [§ 4.2.3](https://datatracker.ietf.org/doc/html/rfc7231#section-4.2.3)). – Géry Ogam Jun 23 '21 at 13:33

4 Answers

73

If it's for a well-defined and -documented API, 202 sounds exactly right for what's happening.

If it's for the public Internet, I would be too worried about client compatibility. I've seen so many `if (status == 200)` checks hard-coded... In that case, I would return a 200.

Also, the RFC gives no indication that using 202 for a GET request is wrong, while it makes clear distinctions in other code descriptions (e.g. 200). Its description of 202 says only:

> The request has been accepted for processing, but the processing has not been completed.
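For a documented API, the client side can treat 202 as "not ready yet" instead of hard-coding a check for 200. A minimal Python sketch, assuming a hypothetical `fetch` callable that performs the GET and returns a `(status, body)` pair; the interval and attempt limit are arbitrary illustrative values:

```python
import time


def poll_until_ready(fetch, interval=1.0, max_attempts=10, sleep=time.sleep):
    """Call fetch() until it returns 200; treat 202 as 'still computing'.

    `fetch` is a hypothetical callable returning (status, body).
    """
    for attempt in range(max_attempts):
        status, body = fetch()
        if status == 200:
            return body
        if status != 202:
            # Anything else is outside the documented contract.
            raise RuntimeError(f"unexpected status {status}")
        if attempt < max_attempts - 1:
            sleep(interval)  # wait before polling the same URI again
    raise TimeoutError("representation not ready after polling")
```

Passing `sleep` as a parameter is only a convenience so the loop can be exercised without real delays.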

Pekka
22

We did this for a recent application: a client (a custom application, not a browser) POSTed a query, and the server returned 202 with a URI to the "job" being posted. The client would then use that URI to poll for the result. This seems to fit nicely with what was being done.

In any case, the most important thing is to document how your service/API works and what a 202 response means.
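A rough in-memory Python sketch of that job pattern, for illustration only. The `JobQueue` class, the `/jobs/<id>` URI layout, and the use of a `Location` header to carry the job URI are all assumptions made here, not details of the application described above:

```python
import itertools


class JobQueue:
    """Hypothetical sketch: POST creates a job, GET on the job URI polls it."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._jobs = {}

    def post_query(self, query):
        """Accept a query and return (status, headers, body) pointing at the job."""
        job_id = next(self._ids)
        self._jobs[job_id] = {"query": query, "done": False, "result": None}
        return 202, {"Location": f"/jobs/{job_id}"}, None

    def complete(self, job_id, result):
        """Called by a (not shown) worker when the job finishes."""
        self._jobs[job_id].update(done=True, result=result)

    def get_job(self, uri):
        """Poll a job URI: 404 if unknown, 202 while running, 200 with the result."""
        job = self._jobs.get(int(uri.rsplit("/", 1)[1]))
        if job is None:
            return 404, {}, None
        if not job["done"]:
            return 202, {}, None
        return 200, {}, job["result"]
```

The client stores the URI from the first response and polls it until it stops getting 202.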

nos
  • +1 Thanks for your comment. Good point about documentation. But please note the clarifying edits to my question (look for "factory"). – user359996 Nov 04 '10 at 18:36
  • Well, you can omit that URI in the response if you just want to poll the same URI as you initially requested. (Just document how this should work :-) ) – nos Nov 04 '10 at 18:39
  • Good idea, but remember I want caching, so no POST. Moreover, the URI specifies the resource, not a method. I'm taking a RESTful, rather than RPC approach (sorry, another unspecified constraint--my bad). – user359996 Nov 04 '10 at 19:26
  • To be precise, by "RESTful", I actually mean "resource-oriented", which is technically a bit more than is specified by the REST constraints. – user359996 Nov 04 '10 at 19:36
  • This is also backed by "1.10 How to Use POST for Asynchronous Tasks" by the book "[RESTful Web Services Cookbook](https://www.oreilly.com/library/view/restful-web-services/9780596809140/)" by [Subbu Allamraju](https://twitter.com/sallamar) – koppor Dec 11 '20 at 12:03
12

From what I can recall - GET is supposed to return a resource without modifying the server. Maybe activity will be logged or what have you, but the request should be rerunnable with the same result.

POST on the other hand is a request to change the state of something on the server. Insert a record, delete a record, run a job, something like that. 202 would be appropriate for a POST that returned but isn't finished, but not really a GET request.

It's all very puritan and not well practiced in the wild, so you're probably safe by returning 202. GET should return 200. POST can return 200 if it finished or 202 if it's not done.

http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html

Dlongnecker
  • 4
    Very good thinking but I'm not sure whether it applies here: From what the OP says, this seems to be a proper GET request (in that it doesn't change anything on the server), it just takes longer to compute and in that case, is to be fetched at another time. Maybe the OP can give an authoritative comment. It's for an API so it's fine to be "puritan" for the sake of a clean interface – Pekka Nov 04 '10 at 18:43
  • Oh, touche pekka. You're right, GET is the way to go. And I don't think the HTTP spec really took into account GETs that aren't ready. So he could go either way – Dlongnecker Nov 04 '10 at 18:52
  • 8
    (Now-irrelevant) authoritative comment: Yes, I view this as idempotent. The *resource* is neither being modified nor created; rather, its *representation* has not yet been computed. – user359996 Nov 04 '10 at 19:01
  • Oh snap, nice lingo. Ya, I stick by my comment that you're in uncharted territory. I'd say 202 makes the most sense, but 200 is probably safer. – Dlongnecker Nov 04 '10 at 19:38
  • Oh, and it's all wrong, because a GET request shouldn't take that long to return the desired result (at least, according to the spec). – Dlongnecker Nov 04 '10 at 19:38
  • 1
    Where does it say that? Also, if I return 200, the client should expect a representation has been returned, but it hasn't. – user359996 Nov 04 '10 at 19:43
  • 1
    I take it back. 202 doesn't correspond to only GET or POST, it seems. Just the mindset I was in when I looked at the protocol made me think 202 only existed for GET requests. 202 should be fine for your purposes. – Dlongnecker Nov 04 '10 at 21:19
0

For a resource that is supposed to represent an entity clearly identified by an ID (as opposed to a "factory" resource, as described in the question), I recommend staying with the GET method. When the representation is not available because of lazy creation or any other temporary condition, use the 503 Service Unavailable response code, which is more appropriate and was actually designed for situations like this one.

Reasoning for this can be found in the RFCs for HTTP itself (see the description of the 503 response code), as well as in numerous other resources.

Please compare with HTTP status code for temporarily unavailable pages. Although that question is about a different use case, it actually relates to the exact same feature of HTTP.
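If you go the 503 route, HTTP also defines a `Retry-After` header that a server may send with it, as either a number of seconds or an HTTP date, which gives the client a hint about when to poll again. A minimal Python sketch of how a client might interpret it; the function name `retry_after_seconds` and the 5-second fallback are assumptions for illustration:

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(header_value, now=None, default=5.0):
    """Turn a Retry-After header value into a delay in seconds.

    Falls back to `default` (an arbitrary choice) if the header is
    missing or cannot be parsed.
    """
    if header_value is None:
        return default
    value = header_value.strip()
    if value.isdigit():
        # Delay given directly in seconds, e.g. "120".
        return float(value)
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # Never return a negative delay for dates already in the past.
    return max(0.0, (when - now).total_seconds())
```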

Hermes