18

I'm occasionally getting cancellation errors when calling gRPC methods.

Here's my client-side code (Using grpc-java 1.22.0 library):

public class MyClient {
    private static final Logger logger = LoggerFactory.getLogger(MyClient.class);
    private ManagedChannel channel; 
    private FooGrpc.FooStub fooStub;

    private final StreamObserver<Empty> responseObserver = new StreamObserver<>() {
        @Override
        public void onNext(Empty value) {
        }

        @Override
        public void onError(Throwable t) {
            logger.error("Error: ", t);
        }

        @Override
        public void onCompleted() {
        }
    };

    public MyClient() {
        this.channel = NettyChannelBuilder
            .forAddress(host, port)
            .sslContext(GrpcSslContexts.forClient().trustManager(certStream).build())
            .build();
        var pool = Executors.newCachedThreadPool(
                new ThreadFactoryBuilder().setNameFormat("foo-pool-%d").build());
        this.fooStub = FooGrpc.newStub(channel)
                .withExecutor(pool);
    }

    public void callFoo() {
        fooStub.withDeadlineAfter(500L, TimeUnit.MILLISECONDS)
                .myMethod(whatever, responseObserver);
    }
}

When I invoke callFoo() method, it usually works. Client sends a message and server receives it without problem.

But this call occasionally gives me an error:

io.grpc.StatusRuntimeException: CANCELLED: io.grpc.Context was cancelled without error
        at io.grpc.Status.asRuntimeException(Status.java:533) ~[handler-0.0.1-SNAPSHOT.jar:?]
        at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:442) [handler-0.0.1-SNAPSHOT.jar:?]
        at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39) [handler-0.0.1-SNAPSHOT.jar:?]
        at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23) [handler-0.0.1-SNAPSHOT.jar:?]
        at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40) [handler-0.0.1-SNAPSHOT.jar:?]
        at io.grpc.internal.CensusStatsModule$StatsClientInterceptor$1$1.onClose(CensusStatsModule.java:700) [handler-0.0.1-SNAPSHOT.jar:?]
        at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39) [handler-0.0.1-SNAPSHOT.jar:?]
        at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23) [handler-0.0.1-SNAPSHOT.jar:?]
        at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40) [handler-0.0.1-SNAPSHOT.jar:?]
        at io.grpc.internal.CensusTracingModule$TracingClientInterceptor$1$1.onClose(CensusTracingModule.java:399) [handler-0.0.1-SNAPSHOT.jar:?]
        at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:507) [handler-0.0.1-SNAPSHOT.jar:?]
        at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:66) [handler-0.0.1-SNAPSHOT.jar:?]
        at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:627) [handler-0.0.1-SNAPSHOT.jar:?]
        at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$700(ClientCallImpl.java:515) [handler-0.0.1-SNAPSHOT.jar:?]
        at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:686) [handler-0.0.1-SNAPSHOT.jar:?]
        at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:675) [handler-0.0.1-SNAPSHOT.jar:?]
        at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) [handler-0.0.1-SNAPSHOT.jar:?]
        at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123) [handler-0.0.1-SNAPSHOT.jar:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
        at java.lang.Thread.run(Thread.java:834) [?:?]

The weird thing is that even though the call gives an error at client side, the server does receive the request, mostly. But sometimes the server misses it.

It is not even DEADLINE_EXCEEDED exception, it just throws CANCELLED: io.grpc.Context was cancelled without error. No other description is provided, So I cannot figure out why this is happening.

To summarize:

  1. gRPC call from client randomly gives CANCELLED error.
  2. When the error happens, the server sometimes gets the call, but sometimes not.
hylowaker
  • 966
  • 2
  • 9
  • 24
  • Is the client class used to service incoming RPCs? As in, is the process using the client also a server? – Eric Anderson Jul 19 '19 at 15:31
  • Yes. Actually the "server" first calls RPC on "client". After the "client" receives incoming RPC, it invokes callFoo() method to call RPC on the "server". This happens in a same thread. Is this a problematic way? Should I invoke 'callFoo()' in a separate thread? – hylowaker Jul 19 '19 at 17:22
  • 1
    Possible duplicate of [gRPC future callback got exception:StatusRuntimeException: CANCELLED](https://stackoverflow.com/questions/43511228/grpc-future-callback-got-exceptionstatusruntimeexception-cancelled) – Eric Anderson Jul 19 '19 at 22:04

1 Answers1

31

grpc-java supports automatic deadline and cancellation propagation. When an inbound RPC causes outbound RPCs, those outbound RPCs inherit the inbound RPC's deadline. Also, if the inbound RPC is cancelled the outbound RPCs will be cancelled.

This is implemented via io.grpc.Context. If you do an outbound RPC that you want to live longer than the inbound RPC, you should use Context.fork().

public void myRpcMethod(Request req, StreamObserver<Response> observer) {
  // ctx has all the values as the current context, but
  // won't be cancelled
  Context ctx = Context.current().fork();
  // Set ctx as the current context within the Runnable
  ctx.run(() -> {
    // Can start asynchronous work here that will not
    // be cancelled when myRpcMethod returns
  });
  observer.onNext(generateReply());
  observer.onCompleted();
}
Eric Anderson
  • 24,057
  • 5
  • 55
  • 76
  • Lets say 1 does a RPC to 2 and 2 does a RPC to 3, now if 1 is doing a futureStub call to 2 and 1 does not wait for the response and completes itself, how will the call from 2 to 3 be affected ? In my case 2 is sometimes able to do a RPC to 3 and sometimes it cannot. The question here is - Is there a cancellation propagation happening, from 1, after it completes, such that 2 receives the cancellation request before it can send a RPC to 3 ? Is there a race condition arising here ? – Tausif mohammad Jan 14 '20 at 06:02
  • The future stub observes the context as well, so yes, if 1 is calling 2 because it received an RPC then when 1 completes that RPC the call to 2 will be cancelled. – Eric Anderson Jan 14 '20 at 17:17
  • Question 1-> what do you mean by "when 1 completes" ? when i do a future call, i add callbacks, so won't those callbacks keep the context alive? Question 2-> Who calls the cancellation on the context after the request is done ? Is cancellation called after every request completion? – Tausif mohammad Jan 21 '20 at 17:03
  • 1
    gRPC creates a cancellable context for each inbound RPC. When that RPC completes (basically, when the server responds with a onCompleted()/onError()) grpc will cancel that context. Callbacks on a future call do not impact the context lifetime. – Eric Anderson Jan 21 '20 at 20:21
  • Thanks @Eric Anderson, Your explanation helped a lot :) – Tausif mohammad Jan 23 '20 at 06:45