What is the correct client reaction to an HTTP 429 when the client is multi-threaded?

Question

The HTTP status code 429 tells the client making the request to back off and retry the request after a period specified in the response's Retry-After header.

In a single-threaded client, it is obvious that the thread getting the 429 should wait as told and then retry. But the RFC explicitly states that

this specification does not define how the origin server identifies the user, nor how it counts requests.

Consequently, in a multi-threaded client, the conservative approach would stop all threads from sending requests until the Retry-After point in time. But:

Many threads may already be past the point where they can note the information from the one rejected thread and will send at least one more request.
Global synchronization between the threads can be a pain to implement and get right.
If the setup runs not only several threads but several clients, potentially on different machines, backing off all of them on one 429 becomes non-trivial.
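
Within a single process, the global hold-back described above could be sketched as one shared deadline guarded by a lock. This is only an illustration in Python; `BackoffGate`, `report_429` and `wait_until_open` are made-up names, and the sketch does nothing about requests already in flight or about clients on other machines.

```python
import threading
import time


class BackoffGate:
    """Shared 'do not send before' deadline for all threads in one process."""

    def __init__(self, clock=time.monotonic, sleep=time.sleep):
        # clock and sleep are injectable so the gate can be tested without waiting.
        self._lock = threading.Lock()
        self._resume_at = 0.0
        self._clock = clock
        self._sleep = sleep

    def report_429(self, retry_after_seconds):
        """Record a 429 seen by any thread; the latest deadline wins."""
        with self._lock:
            candidate = self._clock() + retry_after_seconds
            self._resume_at = max(self._resume_at, candidate)

    def wait_until_open(self):
        """Block the calling thread until the shared deadline has passed."""
        while True:
            with self._lock:
                remaining = self._resume_at - self._clock()
            if remaining <= 0:
                return
            # Sleep outside the lock so other threads can still report 429s.
            self._sleep(remaining)
```

Each worker thread would call `wait_until_open()` before sending a request and `report_429(delay)` whenever it receives a 429.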

Does anyone have specific data from the field on how the servers of cloud providers actually handle this? Will they get aggravated immediately if I don't globally hold back all threads? Microsoft's advice is:

Wait the number of seconds specified in the Retry-After field.

Retry the request.

If the request fails again with a 429 error code, you are still being throttled. Continue to use the recommended Retry-After delay and retry the request until it succeeds.

It repeatedly says 'the request', not 'any requests' or 'all requests', but this is a legalistic reading that I am not confident about.

To make sure this is not an opinion question, let me phrase it in as fact-based a way as possible:

Are there more detailed specifications for cloud APIs (Microsoft, Google, Facebook, Twitter) than the example above that allow me to make an informed decision on whether global back-off is necessary, or whether it suffices to back off only the specific request that got the 429?

One hit: https://learn.microsoft.com/en-us/dynamics365/customer-engagement/developer/api-limits says: "All requests will return these error responses until the volume of API requests falls below the limit." This might be construed as not requiring global back-off, because each request gets its own back-off answer. - Harald, Nov 26 '18 at 08:06

tgkprog · Answer 1 · 2022-10-06T13:35:47.963

Server operators know that it is tough to synchronize threads, or to expect programmers to do so. So I doubt there is a penalty unless the server gets an ocean of requests that do not back off at all after a 429.

Each thread should wait, and each one will, once it has been told individually by its own 429 response.
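
A minimal sketch of that per-thread reaction, in Python. `send` is a hypothetical callable returning `(status, headers, body)`; the 1-second fallback delay and the limit of 5 attempts are assumptions for illustration, not values from any provider.

```python
import time
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, default=1.0, now=None):
    """Convert a Retry-After header value into a delay in seconds.

    The header holds either a number of seconds or an HTTP date.
    `default` is an assumed fallback for a missing or unparsable value.
    """
    if value is None:
        return default
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def call_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it returns something other than 429."""
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts:
            # Only the thread that received the 429 waits here.
            sleep(retry_after_seconds(headers.get("Retry-After")))
    raise RuntimeError("still throttled after %d attempts" % max_attempts)
```

Every worker thread would wrap its own requests in `call_with_retry`, so no state is shared between threads.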

A good client knows what its allowed rate is and stays within it. One way to implement this is a sleepFor value that sets the pause between requests. The exact production value can be arrived at by trial and error; the actual pause is the sleep time minus the duration of the previous request.

So suppose one request ends and it took x milliseconds. If the sleep time is 0 or less, move on to the next request immediately. If it is 1 or more, compute sleepFor - x: if the result is less than 1, go to the next request immediately; otherwise sleep for that many milliseconds and then move on to the next request.
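
The pacing rule above can be sketched in Python as follows. The function names and the injectable `clock` and `sleep` parameters are illustrative assumptions; `requests` is simply an iterable of callables that each perform one request.

```python
import time


def pause_after_request(sleep_for_ms, elapsed_ms):
    """Milliseconds to pause after a request that took elapsed_ms.

    Returns 0 when the remainder is below 1 ms, which also covers a
    sleep_for_ms of 0 or less.
    """
    remaining = sleep_for_ms - elapsed_ms
    return remaining if remaining >= 1 else 0


def run_paced(requests, sleep_for_ms, clock=time.monotonic, sleep=time.sleep):
    """Run each request, then pause so that starts are about sleep_for_ms apart."""
    for request in requests:
        started = clock()
        request()
        elapsed_ms = (clock() - started) * 1000
        pause_ms = pause_after_request(sleep_for_ms, elapsed_ms)
        if pause_ms:
            sleep(pause_ms / 1000)
```

With a sleep time of 200 ms, a request that took 50 ms is followed by a 150 ms pause, and a request that took 250 ms is followed by no pause at all.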

Another way is to record a timeCountStarted timestamp at request 1 and count requests over a period of 5 minutes or so. After every request, check whether the request count of the current thread is more than the figure allowed for that period. If it is, the current thread sleeps until the 5 minutes are up and starts a new count before moving on to the next request. The 5 here can be made configurable as timePeriod. If, after a request, the count is not more than the set figure but the time elapsed since timeCountStarted is more than 5 minutes, set timeCountStarted to the current time and the count of requests to 0.
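
A rough Python sketch of this per-thread counting scheme. `WindowLimiter` and its parameter names are made up for illustration. Two choices here are assumptions rather than part of the description above: the sketch sleeps as soon as the budget is used up (instead of after it has been exceeded), and it starts the clock when the limiter is created, so it should be created right before the first request.

```python
import time


class WindowLimiter:
    """Per-thread request counter over a fixed time period."""

    def __init__(self, max_requests, time_period=300.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_requests = max_requests
        self.time_period = time_period  # seconds; 300 = 5 minutes
        self._clock = clock
        self._sleep = sleep
        self.time_count_started = clock()
        self.count = 0

    def after_request(self):
        """Call once after every request made by this thread."""
        self.count += 1
        elapsed = self._clock() - self.time_count_started
        if elapsed >= self.time_period:
            # Period is over and the budget was not exhausted: start a new one.
            self._reset()
        elif self.count >= self.max_requests:
            # Budget used up: sleep until the period ends, then start a new one.
            self._sleep(self.time_period - elapsed)
            self._reset()

    def _reset(self):
        self.time_count_started = self._clock()
        self.count = 0
```

Each thread would own one limiter and call `after_request()` after every call to the API.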

What we do is keep these configuration values in a database but cache them at run time.

We also have a page to invalidate the caches, so we can update the database from an admin page and then invalidate the caches, and the clients pick up the new values on the fly. This helps to tune the values so that we stay within the API limits and still get enough jobs done.
