The HTTP status code 429 tells the client making the request to back off and retry the request after a period specified in the response's Retry-After header.
In a single-threaded client, it is obvious that the thread getting the 429 should wait as told and then retry. But the RFC that defines 429 (RFC 6585) explicitly states that
"this specification does not define how the origin server identifies the user, nor how it counts requests."
Consequently, in a multi-threaded client, the conservative approach would stop all threads from sending requests until the Retry-After point in time. But:
- Many threads may already be past the point where they can note the information from the one rejected thread and will send at least one more request.
- The global synchronization between the threads can be a pain to implement and get right.
- If the setup runs not only several threads but several clients, potentially on different machines, backing off all of them on one 429 becomes non-trivial.
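To make the conservative approach concrete, here is a minimal Python sketch of a gate shared by all threads of one process. The class name `BackoffGate` and its methods are hypothetical, not taken from any library, and the injectable `clock` and `sleep` parameters are only there to make it testable.

```python
import threading
import time


class BackoffGate:
    """Shared gate: a 429 seen by one thread delays every thread's next request."""

    def __init__(self, clock=time.monotonic, sleep=time.sleep):
        self._lock = threading.Lock()
        self._resume_at = 0.0  # point in time (per `clock`) before which nobody sends
        self._clock = clock
        self._sleep = sleep

    def note_429(self, retry_after_seconds):
        """Record a Retry-After delay; never shorten an existing deadline."""
        with self._lock:
            candidate = self._clock() + retry_after_seconds
            self._resume_at = max(self._resume_at, candidate)

    def wait_until_clear(self):
        """Block the calling thread until the shared deadline has passed."""
        while True:
            with self._lock:
                remaining = self._resume_at - self._clock()
            if remaining <= 0:
                return
            # Sleep outside the lock so other threads can still report a 429.
            self._sleep(remaining)
```

Each worker would call `wait_until_clear()` right before sending and `note_429(...)` on a 429 response. This only covers threads inside one process, and a thread that has already passed `wait_until_clear()` will still send its request, which is exactly the first problem listed above.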
Does anyone have specific data from the field on how the servers of cloud providers actually handle this? Will they immediately get aggravated if I don't globally hold back all threads? Microsoft's advice is:
- Wait the number of seconds specified in the Retry-After field.
- Retry the request.
- If the request fails again with a 429 error code, you are still being throttled. Continue to use the recommended Retry-After delay and retry the request until it succeeds.
It twice says 'the request', not 'any requests' or 'all requests', but this is a legalistic reading that I am not confident about.
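Read literally, the three steps above amount to a per-request retry loop, roughly like the following Python sketch. The names `parse_retry_after` and `send_with_retry` are hypothetical, `send` stands in for whatever actually issues the request, and the `max_attempts` cap and the 1-second fallback are my own assumptions (the advice itself says to retry until the request succeeds). Only the delay-in-seconds form of Retry-After is handled; the HTTP-date form falls back to the default.

```python
import time


def parse_retry_after(value, default=1.0):
    """Return the delay in seconds from a Retry-After value in delay-seconds form.

    Missing values and the HTTP-date form are not parsed; they yield `default`.
    """
    try:
        return max(0.0, float(value))
    except (TypeError, ValueError):
        return default


def send_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call `send()` until it returns a status other than 429.

    `send` is a hypothetical callable returning (status_code, headers_dict).
    Only the calling thread waits; other threads are not affected.
    """
    for attempt in range(max_attempts):
        status, headers = send()
        if status != 429:
            return status
        # Still throttled: wait as instructed before retrying this request.
        if attempt < max_attempts - 1:
            sleep(parse_retry_after(headers.get("Retry-After")))
    return 429  # gave up after max_attempts (an assumption, see above)
```

Whether this per-request loop is enough, or whether every other thread has to pause as well, is the part I cannot derive from the wording.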
To be sure this is not an opinion question, let me phrase it as fact-based as possible:
Are there more detailed specifications for cloud APIs (Microsoft, Google, Facebook, Twitter) than the example above that allow me to make an informed decision on whether global back-off is necessary, or whether it suffices to back off only the specific request that got the 429?