
I am making an HTTP web API that's mainly fed by a database. Simplified, the db contains userobjects.
These objects have a last_online (when the user was online) and last_checked (the last time I checked the userobject).

Checking the userobject can take from 3 to 30 seconds. When last_checked is less than 10 minutes ago, everything's okay: the API call returns 200 and the userobject.

But I want to reprocess the userobject when the data is more than 10 minutes stale. Obviously I cannot have my API call sit there and wait.

What is the right approach to HTTP APIs that (sometimes) need to return data from long running processes?

Raedwald
Gerben Jacobs
  • What sort of API are you talking about? A web API? A library in some specific language? If so, which language? – Jon Skeet Mar 26 '14 at 07:08
  • Possible duplicate of http://stackoverflow.com/questions/9794696/best-http-status-code-in-rest-api-for-not-ready-yet-try-again-later – Raedwald Mar 26 '14 at 20:01

2 Answers


My first proposal would be to have the server update the user object every X minutes as a background process. I don't see any reason to place the burden of keeping server data up-to-date on the client. Responses to the GET call would include an Expires header. The client could then cache the response for a fixed amount of time, saving you server hits until the data gets refreshed.
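As a rough illustration of the Expires idea, here is a minimal Python sketch. It assumes the background job runs every 10 minutes (matching the staleness threshold in the question) and that last_checked is stored as a timezone-aware UTC datetime; the names `REFRESH_INTERVAL` and `expires_header` are made up for this example.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

# Assumption: the background process re-checks each user object every 10 minutes.
REFRESH_INTERVAL = timedelta(minutes=10)


def expires_header(last_checked):
    """Build an HTTP-date for the Expires header from a UTC last_checked time."""
    # usegmt=True emits the "GMT" suffix that HTTP dates require;
    # it only accepts datetimes whose tzinfo is timezone.utc.
    return format_datetime(last_checked + REFRESH_INTERVAL, usegmt=True)


if __name__ == "__main__":
    checked = datetime(2014, 3, 26, 12, 0, tzinfo=timezone.utc)
    print("Expires:", expires_header(checked))
```

A client or intermediate cache that honours this header would not hit the server again until the next scheduled refresh is due.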

If you must make the refresh client-driven, you want your GET to return a 202 Accepted, which indicates a valid request that the API is working on but has not completed. The entity returned from your GET request should provide a timestamp for when the client should check back to get the updated data. Once the data has been refreshed, the GET will return a 200 OK with the refreshed data. This is the approach I recommend.

GET /userObject
<- 202 Accepted
{ "checkAt": <timestamp> }

GET /userObject
<- 200 OK
{ "userName": "Bob", ... }
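On the server side, the decision between 200 and 202 could look something like the sketch below. It is framework-agnostic and only illustrative: `get_user_object`, `start_refresh`, and the 30-second estimate are assumptions (the estimate is the worst case mentioned in the question), and the function returns a plain `(status, body)` tuple rather than a real HTTP response.

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(minutes=10)      # staleness threshold from the question
ESTIMATED_CHECK = timedelta(seconds=30)  # assumed worst-case duration of a check


def get_user_object(user, now, start_refresh):
    """Return (status, body) for GET /userObject.

    `user` is a dict with a timezone-aware `last_checked` datetime.
    `start_refresh` is a hypothetical callable that queues the slow check
    and returns immediately; the actual work happens elsewhere.
    """
    if now - user["last_checked"] < STALE_AFTER:
        return 200, {"userName": user["userName"]}
    start_refresh(user)
    check_at = now + ESTIMATED_CHECK
    return 202, {"checkAt": check_at.isoformat()}


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    stale = {"userName": "Bob", "last_checked": now - timedelta(minutes=15)}
    print(get_user_object(stale, now, start_refresh=lambda u: None))
```

A real implementation would also need to remember that a refresh is already in flight, so that repeated GETs during the 3 to 30 second window don't queue the same check several times.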

You could also consider using the Retry-After header in your response, but the HTTP specifications only define its meaning for 503 Service Unavailable, 429 Too Many Requests, and the various 3xx (Redirection) responses. You definitely aren't describing a 503 or a 429, and it doesn't sound like redirection is correct either.

If you do want to go the redirection route, you'd return a 302 Found, specifying the temporary URI in the Location header and the delay time in the Retry-After header.

A fourth approach would be to use a POST and the Post-Redirect-Get pattern. You could POST to your userObject URI and have it return the 302 Found with the Retry-After header.

I really don't think that options three or four buy you anything that the second option doesn't, and I think the second is the clearest. Three implies that your resource currently lives in a different location when it doesn't. Four transforms what is fundamentally a GET request (give me the user object) into a POST (refresh the user object, but only if you need to).

If you do decide to follow @JonSkeet's suggestion, you probably want a separate resource, something like /userObjects and /userObjectRequests. The client would always POST to /userObjectRequests. If the userObject was valid on the back end, that POST would return a 302 to /userObjects. If it wasn't valid, the POST would return an entity with an id and an estimated completion time. The client could call GET on /userObjectRequests/{id}, and they'd either get a 302 to the userObject (if it's ready) or a 200 with the id and a new estimated completion time.
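The two-resource flow could be sketched roughly as follows, using an in-memory dict in place of real storage. Several details here are assumptions: the `/userObjects/{user_id}` form of the Location URI, the 202 status on the initial POST (the text above only says it returns an entity), the `estimatedSeconds` field name, and the function names themselves.

```python
import itertools

_next_id = itertools.count(1)
pending_requests = {}  # request id -> {"user_id": ..., "done": bool}


def post_user_object_request(user_id, is_fresh, eta_seconds=30):
    """Handle POST /userObjectRequests; returns (status, headers, body)."""
    if is_fresh:
        # Data is still valid: send the client straight to the user object.
        return 302, {"Location": f"/userObjects/{user_id}"}, None
    request_id = next(_next_id)
    pending_requests[request_id] = {"user_id": user_id, "done": False}
    # Assumption: 202 for the newly created, not-yet-finished request.
    return 202, {}, {"id": request_id, "estimatedSeconds": eta_seconds}


def get_user_object_request(request_id, eta_seconds=30):
    """Handle GET /userObjectRequests/{id}; returns (status, headers, body)."""
    job = pending_requests[request_id]
    if job["done"]:
        return 302, {"Location": f"/userObjects/{job['user_id']}"}, None
    return 200, {}, {"id": request_id, "estimatedSeconds": eta_seconds}


if __name__ == "__main__":
    print(post_user_object_request("bob", is_fresh=False))
    print(get_user_object_request(1))
```

Whatever worker performs the slow check would flip `done` to true when it finishes, after which the next GET on the request redirects to the user object.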

Community
Eric Stein

One fairly "old-school" way of handling this would be to return a continuation token - basically a job ID saying, "Check this periodically; sooner or later it'll come back with a result." Given that even 30 seconds is quite a long time, you might want to give back a continuation token even in the normal "checking" situation.
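From the client's point of view, polling with a continuation token might look like this minimal sketch. The `fetch` callable is hypothetical: it stands in for whatever HTTP call the client makes with the token, and is assumed to return a `(done, body)` pair. The interval and attempt limit are arbitrary illustrative values.

```python
import time


def poll_for_result(fetch, token, interval=2.0, max_attempts=20, sleep=time.sleep):
    """Call fetch(token) until it reports completion, then return the body.

    `fetch` is a hypothetical callable returning (done, body);
    `sleep` is injectable so the loop can be exercised without real waiting.
    """
    for _ in range(max_attempts):
        done, body = fetch(token)
        if done:
            return body
        sleep(interval)
    raise TimeoutError(f"job {token!r} not finished after {max_attempts} attempts")


if __name__ == "__main__":
    # Fake server: not ready twice, then ready.
    replies = iter([(False, None), (False, None), (True, {"userName": "Bob"})])
    print(poll_for_result(lambda t: next(replies), "job-1", interval=0.1))
```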

More modern alternatives would be web sockets or a hanging GET (long polling)... it really depends on what your client use cases are.

Jon Skeet
  • What would be the HTTP code for returning a token and acknowledging the request? – Gerben Jacobs Mar 26 '14 at 08:52
  • @GerbenJacobs: Well I'd assume you'd be returning the data as JSON - I'd just use 200, as it *is* a successful response in itself. It's possible that there's a more dedicated response code, but I'm not aware of it. – Jon Skeet Mar 26 '14 at 08:57
  • I'd return the job-id token in a `202` code ("Accepted") – Ron Klein Apr 29 '15 at 11:06