I have a gRPC client proxy implementation that I wrapped with a Resilience4j CircuitBreaker to handle failure scenarios.
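
For context, the wrapping follows roughly this pattern (a minimal sketch, not the exact code from the repo; the breaker name, stub, and message types are placeholders):

```java
import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.circuitbreaker.CircuitBreakerRegistry;
import java.util.function.Supplier;

// "grpcProxy" is a placeholder breaker name; echoStub and request stand in for
// the generated gRPC blocking stub and request message.
CircuitBreaker circuitBreaker =
        CircuitBreakerRegistry.ofDefaults().circuitBreaker("grpcProxy");

Supplier<EchoResponse> decorated =
        CircuitBreaker.decorateSupplier(circuitBreaker, () -> echoStub.echo(request));

EchoResponse response = decorated.get();   // throws CallNotPermittedException while the breaker is open
```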

The circuit breaker opens as expected, but it never recovers. When it goes back to half-open and I hit the gRPC endpoint, I get the same error as when it first broke.

2022-09-18 19:34:11.258 DEBUG [,,] 1 --- [-worker-ELG-6-4] io.grpc.netty.NettyClientHandler         : [id: 0x13ba49c9, L:/10.0.9.172:53308 - R:10.0.9.2/10.0.9.2:50000] OUTBOUND HEADERS: streamId=4835 headers=GrpcHttp2OutboundHeaders[:authority: 10.0.9.2:50000, :path: /grpc.reflection.v1alpha.ServerReflection/ServerReflectionInfo, :method: POST, :scheme: http, content-type: application/grpc, te: trailers, user-agent: grpc-java-netty/1.48.1, x-b3-traceid: 632772b3bbb53b9c47dddb9e00629897, x-b3-spanid: 5ad57d4d348297cb, x-b3-parentspanid: 4caf63e791f1a5ce, x-b3-sampled: 1, grpc-accept-encoding: gzip, grpc-timeout: 998924u] streamDependency=0 weight=16 exclusive=false padding=0 endStream=false
2022-09-18 19:34:11.259 DEBUG [,,] 1 --- [-worker-ELG-6-4] io.grpc.netty.NettyClientHandler         : [id: 0x13ba49c9, L:/10.0.9.172:53308 - R:10.0.9.2/10.0.9.2:50000] OUTBOUND DATA: streamId=4835 padding=0 endStream=true length=8 bytes=00000000033a012a
2022-09-18 19:34:11.260 DEBUG [,,] 1 --- [-worker-ELG-6-4] io.grpc.netty.NettyClientHandler         : [id: 0x13ba49c9, L:/10.0.9.172:53308 - R:10.0.9.2/10.0.9.2:50000] INBOUND PING: ack=false bytes=1234
2022-09-18 19:34:11.261 DEBUG [,,] 1 --- [-worker-ELG-6-4] io.grpc.netty.NettyClientHandler         : [id: 0x13ba49c9, L:/10.0.9.172:53308 - R:10.0.9.2/10.0.9.2:50000] OUTBOUND PING: ack=true bytes=1234
2022-09-18 19:34:11.261 DEBUG [,,] 1 --- [-worker-ELG-6-4] io.grpc.netty.NettyClientHandler         : [id: 0x13ba49c9, L:/10.0.9.172:53308 - R:10.0.9.2/10.0.9.2:50000] INBOUND HEADERS: streamId=4835 headers=GrpcHttp2ResponseHeaders[:status: 200, content-type: application/grpc, grpc-encoding: identity, grpc-accept-encoding: gzip] padding=0 endStream=false
2022-09-18 19:34:11.262 DEBUG [,,] 1 --- [-worker-ELG-6-4] io.grpc.netty.NettyClientHandler         : [id: 0x13ba49c9, L:/10.0.9.172:53308 - R:10.0.9.2/10.0.9.2:50000] INBOUND DATA: streamId=4835 padding=0 endStream=false length=96 bytes=000000005b12033a012a32540a2a0a28677270632e7265666c656374696f6e2e7631616c7068612e5365727665725265666c656374696f6e0a260a246e65742e...
2022-09-18 19:34:11.262 DEBUG [,,] 1 --- [-worker-ELG-6-4] io.grpc.netty.NettyClientHandler         : [id: 0x13ba49c9, L:/10.0.9.172:53308 - R:10.0.9.2/10.0.9.2:50000] OUTBOUND PING: ack=false bytes=1234
2022-09-18 19:34:11.262 DEBUG [,,] 1 --- [-worker-ELG-6-4] io.grpc.netty.NettyClientHandler         : [id: 0x13ba49c9, L:/10.0.9.172:53308 - R:10.0.9.2/10.0.9.2:50000] INBOUND HEADERS: streamId=4835 headers=GrpcHttp2ResponseHeaders[grpc-status: 0] padding=0 endStream=true
2022-09-18 19:34:11.263 DEBUG [,,] 1 --- [-worker-ELG-6-4] io.grpc.netty.NettyClientHandler         : [id: 0x13ba49c9, L:/10.0.9.172:53308 - R:10.0.9.2/10.0.9.2:50000] INBOUND PING: ack=true bytes=1234
2022-09-18 19:34:11.263 DEBUG [,,] 1 --- [-worker-ELG-6-4] io.grpc.netty.NettyClientHandler         : Window: 1048576
2022-09-18 19:34:16.257 ERROR [,632772b3bbb53b9c47dddb9e00629897,47dddb9e00629897] 1 --- [or-http-epoll-5] request                                  : POST http://localhost:28082/grpc/Echo/echo 503 5005.583ms
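
For reference on the half-open transition: in Resilience4j the time spent open and the number of trial calls allowed in half-open come from the breaker configuration, roughly like this (illustrative values, not the ones from the repo):

```java
import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.circuitbreaker.CircuitBreakerConfig;
import java.time.Duration;

CircuitBreakerConfig config = CircuitBreakerConfig.custom()
        .failureRateThreshold(50)                         // open once half the calls fail
        .waitDurationInOpenState(Duration.ofSeconds(10))  // time spent open before half-open
        .permittedNumberOfCallsInHalfOpenState(3)         // trial calls allowed in half-open
        .build();

CircuitBreaker circuitBreaker = CircuitBreaker.of("grpcProxy", config);
```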

Restarting the client makes it work again, so I know the server is still up, but I am trying to avoid restarting the client.

I have already added enableRetry, a deadline, and keepalives, but none of them seem to make the channel reconfigure itself.
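
The kind of channel configuration I mean is roughly this (a sketch with a hard-coded address and a placeholder generated stub, not the repo's actual wiring):

```java
import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;
import java.util.concurrent.TimeUnit;

// Address is a placeholder; the real target comes from the gateway's routing.
ManagedChannel channel = ManagedChannelBuilder.forAddress("10.0.9.2", 50000)
        .usePlaintext()
        .enableRetry()                        // transparent retries on the channel
        .keepAliveTime(30, TimeUnit.SECONDS)  // HTTP/2 keepalive pings
        .keepAliveWithoutCalls(true)
        .build();

// The deadline is applied per call on the stub, not on the channel itself.
EchoGrpc.EchoBlockingStub stub = EchoGrpc.newBlockingStub(channel)
        .withDeadlineAfter(5, TimeUnit.SECONDS);
```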

At least the HTTP service recovery still works; the gRPC proxying service does not.

The weird part is that the channel is still receiving pings, according to the logs.

In case you need the source, it is at https://github.com/trajano/spring-cloud-demo/tree/rework

Note that the only way I can reproduce this problem is with an Artillery script that generates enough load to trip the circuit breaker.

UPDATE: I also tried using a direct executor on the client; still no luck.
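
(By that I mean the channel builder's directExecutor() option; a minimal sketch with the same placeholder address as above:)

```java
import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;

// directExecutor() runs gRPC callbacks on the transport (Netty event loop) thread
// instead of handing them off to a separate application executor.
ManagedChannel channel = ManagedChannelBuilder.forAddress("10.0.9.2", 50000)
        .usePlaintext()
        .directExecutor()
        .build();
```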

UPDATE: Removing Resilience4j altogether basically kills the gateway server under load.
