I want to implement the retry pattern in PHP (Guzzle) to decide whether a failed request should be sent again and, if so, whether to wait before resending it. NOTE: this is communication between internal services. Each service runs in a scaling group behind a load balancer, so we assume the target URL exists but may be unavailable for some reason. All servers are NGINX.

Are there any best practices on whether to retry, and whether to use a delay?

As far as I know, status 503 means the server is overloaded, so in that case a small delay could give a new instance time to start and help distribute the load?

What should I do on a 502/504 error? Also retry with some delay?

What about a 500 error? In my understanding, 500 is returned when something is wrong with the server or the application logic in general, so a retry is not needed?

What about 400? The same action as for 500?

What about 404? There can be two types of 404: one where the endpoint really does not exist (which I do not think is possible for communication between internal services), and another where the requested resource is not found (for example, a user not found by credentials). I think we do not need to retry on a 404.

I use 422 for domain or validation errors, but maybe the server can return it in some other case? If it is only triggered by my own code, I can assume no retry is needed.

What about other status codes? There are also NGINX-specific codes.

I know I will probably need a specific retry strategy per URI, but I believe there are some common, reusable rules.

Bogdan Dubyk
  • It's all good and well deferring to the HTTP status code to determine whether or not to retry a request. This does however assume that those who developed the application you're trying to communicate with have used the correct status code in each given situation. I would be inclined to potentially just retry every request a set number of times at exponentially increasing intervals. E.g. 1 minute, 2 minutes, 4 minutes, etc. – fubar Jan 15 '20 at 03:14
  • "just retry every request a set number of times at exponentially increasing intervals": not sure it's a good idea to always retry. With microservices, each microservice can communicate with other microservices, and that is not always possible asynchronously. During a single user request there can be a number of HTTP calls between microservices (even if you try to keep them isolated), and in that case the response time for the end user can get pretty long – Bogdan Dubyk Jan 15 '20 at 03:34
  • I agree that it should be determined by the status code, and you can have internal contracts/rules between microservice developers on which statuses to use. Still, it's not always clear when to use which status, and there are also a number of statuses that are not related to the application itself, where it's not clear what to do – Bogdan Dubyk Jan 15 '20 at 03:39
  • The exponential increase doesn't need to be in minutes. It could be 1 second, 2 seconds, 4 seconds, etc. But if you're working in an auto scaling environment, it'll take longer than that to spin up another server instance. With regard to async requests, you could consider implementing a queue based system to register and handle requests asynchronously when an immediate response isn't required. – fubar Jan 15 '20 at 04:10
  • Well, in most cases a few seconds is too much; the usual requirement is that the end user gets a response in less than 2 seconds :) There is no problem with async, we are using queues. The main question here is the best practices on when to retry, and with what kind of retry, for a synchronous HTTP request. I have read a lot: some say never retry 4xx responses and only retry 5xx, others say never retry 500 and always retry 503, 504, 408, 409. So I'm confused :) – Bogdan Dubyk Jan 15 '20 at 04:17
  • Does this answer your question? [Which HTTP errors should never trigger an automatic retry?](https://stackoverflow.com/questions/47680711/which-http-errors-should-never-trigger-an-automatic-retry) – Michael Freidgeim Mar 03 '23 at 12:27

1 Answer

I ended up with the following list:

  • 400 Bad Request - no RETRY
  • 401 Unauthorized - no RETRY
  • 402 Payment Required - no RETRY
  • 403 Forbidden - no RETRY
  • 404 Not Found - as I said before, I assume there are different kinds of 404: a resource not found (like a user in the DB), a page not found because the URL is really wrong, and not found because of a balancing issue. When a resource is not found we are going to send some custom data with the response, and in that case no RETRY; in the other cases we are going to RETRY
  • 405 Method Not Allowed - no RETRY
  • 406 Not Acceptable - no RETRY
  • 407 Proxy Authentication Required - no RETRY
  • 408 Request Timeout - RETRY
  • 409 Conflict - RETRY
  • 410 Gone - no RETRY
  • 411 Length Required - no RETRY
  • 412 Precondition Failed - no RETRY
  • 413 Payload Too Large - no RETRY
  • 414 URI Too Long - no RETRY
  • 415 Unsupported Media Type - no RETRY
  • 416 Range Not Satisfiable - no RETRY
  • 417 Expectation Failed - no RETRY
  • 421 Misdirected Request - no RETRY
  • 422 Unprocessable Entity - no RETRY
  • 423 Locked - RETRY if a locking time is specified and it is not too long
  • 424 Failed Dependency - no RETRY
  • 426 Upgrade Required - no RETRY
  • 428 Precondition Required - no RETRY
  • 429 Too Many Requests - probably RETRY
  • 431 Request Header Fields Too Large - no RETRY
  • 451 Unavailable For Legal Reasons - no RETRY

So most of the 4xx client errors should not be retried.
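
As a rough illustration, the 4xx rules above could be written as one small PHP function. This is only a sketch under a few assumptions that are not part of the list itself: the `x-resource-not-found` header is a hypothetical stand-in for the "custom data" that marks an application-level 404, the lock or throttle time is assumed to arrive as a `Retry-After` header in whole seconds, and the 2 second limit is an arbitrary example value.

```php
<?php
declare(strict_types=1);

/**
 * Decide whether a 4xx response is worth retrying, following the list above.
 *
 * $headers maps lower-cased header names to their first value.
 * "x-resource-not-found" is a hypothetical marker header, not a standard one.
 */
function shouldRetryClientError(int $status, array $headers = [], int $maxWaitSeconds = 2): bool
{
    // Assumption: the wait time is sent as Retry-After in whole seconds.
    $retryAfter = $headers['retry-after'] ?? null;
    $hasWait = $retryAfter !== null && preg_match('/^\d+$/', (string) $retryAfter) === 1;
    $waitIsShort = $hasWait && (int) $retryAfter <= $maxWaitSeconds;

    switch ($status) {
        case 408: // Request Timeout
        case 409: // Conflict
            return true;

        case 404:
            // Marker present: the application says the resource does not exist, no retry.
            // Marker absent: treated as a routing or balancing problem, retry.
            return !isset($headers['x-resource-not-found']);

        case 423: // Locked: retry only if a lock time is given and it is short
            return $waitIsShort;

        case 429: // Too Many Requests: retry unless the server asks for a long wait
            return !$hasWait || $waitIsShort;

        default:  // every other 4xx code in the list
            return false;
    }
}
```

For example, `shouldRetryClientError(404)` returns `true`, while `shouldRetryClientError(404, ['x-resource-not-found' => '1'])` returns `false`.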

For the 5xx server errors:

  • 500 Internal Server Error - no RETRY, in most cases it is an uncaught application error, so we should not retry it
  • 501 Not Implemented - no RETRY
  • 502 Bad Gateway - RETRY
  • 503 Service Unavailable - RETRY
  • 504 Gateway Timeout - RETRY
  • 505 HTTP Version Not Supported - no RETRY
  • 506 Variant Also Negotiates - no RETRY
  • 507 Insufficient Storage - no RETRY
  • 508 Loop Detected - no RETRY
  • 510 Not Extended - no RETRY
  • 511 Network Authentication Required - no RETRY
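
One possible way to wire the 5xx rules into Guzzle is its built-in `Middleware::retry()`, which takes a decider callable and a delay callable that returns milliseconds. The sketch below makes several assumptions: the helper names (`isRetryableServerError`, `backoffDelayMs`, `buildRetryingClient`), the limit of 3 retries, and the 100 ms base delay with a 1 second cap are illustrative choices rather than anything prescribed here. Retrying on `ConnectException` is also an addition on my side, for the case where no response arrives at all.

```php
<?php
declare(strict_types=1);

use GuzzleHttp\Client;
use GuzzleHttp\Exception\ConnectException;
use GuzzleHttp\HandlerStack;
use GuzzleHttp\Middleware;
use Psr\Http\Message\RequestInterface;
use Psr\Http\Message\ResponseInterface;

const RETRYABLE_SERVER_ERRORS = [502, 503, 504];

/** True only for the 5xx codes marked RETRY in the list above. */
function isRetryableServerError(int $status): bool
{
    return in_array($status, RETRYABLE_SERVER_ERRORS, true);
}

/** Exponential backoff in milliseconds: 100, 200, 400, ... capped at $capMs. */
function backoffDelayMs(int $attempt, int $baseMs = 100, int $capMs = 1000): int
{
    return min($capMs, $baseMs * (2 ** max(0, $attempt - 1)));
}

/** Build a Guzzle client that retries 502/503/504 and connection failures. */
function buildRetryingClient(int $maxRetries = 3): Client
{
    // Guzzle passes the number of retries made so far, starting at 0.
    $decider = function (
        int $retries,
        RequestInterface $request,
        ?ResponseInterface $response = null,
        $exception = null
    ) use ($maxRetries): bool {
        if ($retries >= $maxRetries) {
            return false;
        }
        if ($exception instanceof ConnectException) {
            return true; // no response at all, e.g. the instance is down
        }
        return $response !== null && isRetryableServerError($response->getStatusCode());
    };

    // Guzzle calls the delay function with the retry number, starting at 1.
    $delay = function (int $retries): int {
        return backoffDelayMs($retries);
    };

    $stack = HandlerStack::create();
    $stack->push(Middleware::retry($decider, $delay));

    return new Client(['handler' => $stack]);
}
```

The short millisecond delays are chosen with a sub-2-second response budget in mind. For non-idempotent requests it is probably safer to lower the retry limit or disable retries entirely.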

This list goes into the base retry strategy but, as I said, each request should be handled individually, so most requests will have their own strategy that overrides the handling of some codes and uses different retry timing.
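
A minimal sketch of what "base strategy plus per-request overrides" might look like is shown below. The class names, the retry limits, the delays, and the "create order" example are all hypothetical. The conditional codes (404, 423) are left out of the rule table to keep the example short.

```php
<?php
declare(strict_types=1);

/** Base strategy: the unconditional RETRY codes from the lists above plus default timing. */
class RetryStrategy
{
    /** @var array<int, bool> status code => retry? */
    protected $rules = [
        408 => true, 409 => true, 429 => true,
        502 => true, 503 => true, 504 => true,
    ];
    protected $maxRetries = 3;
    protected $baseDelayMs = 100;

    /** $attempt is the number of retries already made (0 for the first failure). */
    public function shouldRetry(int $status, int $attempt): bool
    {
        return $attempt < $this->maxRetries && ($this->rules[$status] ?? false);
    }

    /** Delay before the next try: base, 2x base, 4x base, ... */
    public function delayMs(int $attempt): int
    {
        return $this->baseDelayMs * (2 ** max(0, $attempt));
    }
}

/** Hypothetical override for a non-idempotent "create order" request. */
class CreateOrderRetryStrategy extends RetryStrategy
{
    protected $maxRetries = 1;
    protected $baseDelayMs = 50;

    public function __construct()
    {
        // A repeated POST after 409 or 504 could create a duplicate, so switch them off.
        $this->rules[409] = false;
        $this->rules[504] = false;
    }
}
```

With this shape, `(new RetryStrategy())->shouldRetry(504, 0)` is `true`, while `(new CreateOrderRetryStrategy())->shouldRetry(504, 0)` is `false`.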

Bogdan Dubyk