
I have written a Python GRPC client that can connect to a number of GRPC Golang services. I have been able to make this work like so:

from alphausblue.connection.conn import grpc_client_connection
from alphausblue.iam.v1.iam_pb2 import WhoAmIRequest
from alphausblue.iam.v1.iam_pb2_grpc import IamStub

async def main():
    conn = grpc_client_connection(svc = "blue")
    stub = IamStub(conn)
    resp = await stub.WhoAmI(WhoAmIRequest())
    print(resp)

where the service name is blue. However, if I were to try to connect to a different service to request data like this:

from alphausblue.connection.conn import grpc_client_connection
from alphausblue.cost.v1.cost_pb2 import GetAccountRequest
from alphausblue.cost.v1.cost_pb2_grpc import CostStub

async def main():
    conn = grpc_client_connection(svc = "cost")
    stub = CostStub(conn)
    account = await stub.GetAccount(GetAccountRequest(vendor = 'aws', id = '731058950257'))
    print(account)

I get an UNIMPLEMENTED response. This would make sense if the service didn't exist but it does and my Golang client connects to it just fine. Moreover, when I check the server's logs I can clearly see that the request reached the server. Doing some more research, I discovered this code on my server:

type service struct {
    UserInfo *blueinterceptors.UserData

    cost.UnimplementedCostServer
}

func (s *service) GetAccount(ctx context.Context, in *cost.GetAccountRequest) (*api.Account, error) {
    switch in.Vendor {
    case "aws":
        // Do stuff
    default:
        return nil, status.Errorf(codes.Unimplemented, "not implemented")
    }
}

What this tells me is that the function is being called but the deserialized payload is missing the vendor field. However, when debugging I can see this line:

src/core/lib/security/transport/secure_endpoint.cc:296] WRITE 0000018E2C62FB80: 00 00 00 13 0a 03 61 77 73 12 0c 37 33 31 30 35 38 39 35 30 32 35 37 '......aws..731058950257'

So, the data is being sent over GRPC to the server but is being deserialized into an object with missing fields. What, then, is the cause of this?

Update

I looked at the definitions for GetAccountRequest for both the Python and Golang clients as suggested by @blackgreen.

Golang client code:

// Request message for the Cost.GetAccount rpc.
type GetAccountRequest struct {
    state         protoimpl.MessageState
    sizeCache     protoimpl.SizeCache
    unknownFields protoimpl.UnknownFields

    Vendor string `protobuf:"bytes,1,opt,name=vendor,proto3" json:"vendor,omitempty"`
    Id string `protobuf:"bytes,2,opt,name=id,proto3" json:"id,omitempty"`
}

Python client code:

_GETACCOUNTREQUEST = _descriptor.Descriptor(
  name='GetAccountRequest',
  full_name='blueapi.cost.v1.GetAccountRequest',
  filename=None,
  file=DESCRIPTOR,
  containing_type=None,
  create_key=_descriptor._internal_create_key,
  fields=[
    _descriptor.FieldDescriptor(
      name='vendor', full_name='blueapi.cost.v1.GetAccountRequest.vendor', index=0,
      number=1, type=9, cpp_type=9, label=1,
      has_default_value=False, default_value=b"".decode('utf-8'),
      message_type=None, enum_type=None, containing_type=None,
      is_extension=False, extension_scope=None,
      serialized_options=None, file=DESCRIPTOR,  create_key=_descriptor._internal_create_key),
    _descriptor.FieldDescriptor(
      name='id', full_name='blueapi.cost.v1.GetAccountRequest.id', index=1,
      number=2, type=9, cpp_type=9, label=1,
      has_default_value=False, default_value=b"".decode('utf-8'),
      message_type=None, enum_type=None, containing_type=None,
      is_extension=False, extension_scope=None,
      serialized_options=None, file=DESCRIPTOR,  create_key=_descriptor._internal_create_key),
  ],
  extensions=[
  ],
  nested_types=[],
  enum_types=[
  ],
  serialized_options=None,
  is_extendable=False,
  syntax='proto3',
  extension_ranges=[],
  oneofs=[
  ],
  serialized_start=946,
  serialized_end=993,
)

It's clear that the fields are in the same order here so I don't think that's the issue unless GRPC is using index rather than number.

Woody1193

1 Answer


First, make sure that the string you are setting in the vendor field in Python doesn't contain invisible characters. If it doesn't, there is probably a version mismatch between the generated code imported by the Python client and the one imported by the Go server.

In particular, if both versions do have a vendor field but the Go code somehow fails to "recognize" it, the cause could be a mismatch in the field tag number.

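The byte payload you show (hex 0a 03 61 77 73 12 0c 37 33 31 30 35 38 39 35 30 32 35 37, base64 CgNhd3MSDDczMTA1ODk1MDI1Nw==) is indeed sending the string aws in the proto field with tag number 1. The first byte, 0a, is the field key: it encodes field number 1 with wire type 2 (length-delimited). The next byte, 03, is the length of the string aws that follows.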
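To make the reading of those bytes reproducible, here is a minimal sketch of a decoder in plain Python. It is not the protobuf library's parser: the helper names read_varint and decode_length_delimited are made up for this illustration, and it only handles length-delimited fields (wire type 2), which is all this payload contains.

```python
from typing import Dict, Tuple


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: last byte of the varint
            return result, pos
        shift += 7


def decode_length_delimited(buf: bytes) -> Dict[int, bytes]:
    """Map field number -> raw bytes; supports wire type 2 fields only."""
    fields = {}
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        number, wire_type = key >> 3, key & 0x07
        if wire_type != 2:
            raise ValueError(f"unsupported wire type {wire_type}")
        length, pos = read_varint(buf, pos)
        fields[number] = buf[pos:pos + length]
        pos += length
    return fields


# Message bytes from the question's log line, without the leading length prefix.
payload = bytes.fromhex("0a 03 61 77 73 12 0c 37 33 31 30 35 38 39 35 30 32 35 37")
decoded = decode_length_delimited(payload)
print(decoded)  # {1: b'aws', 2: b'731058950257'}
```

The keys of the resulting dict are field numbers, not positions in the message, which is why the tag numbers in the generated code on both sides matter.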

So let's consider a contrived example. The proto schema of GetAccountRequest used in the Python client may be:

message GetAccountRequest {
    string version = 1; // field tag number 1
}

And the one used in the Go server may be:

message GetAccountRequest {
    string version = 3; // field tag number 3
}

In this case you will see the message on the wire having the version field, but upon deserializing it against a Go struct with a different tag number, it will end up empty. The client sends it with tag 1 and the server expects it with tag 3.
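A rough simulation of that mismatch, again in plain Python rather than with the real protobuf runtime. The functions encode_string_field and read_string_field are hypothetical helpers written for this sketch, and they assume field numbers below 16 and values shorter than 128 bytes so that the key and the length each fit in one byte.

```python
def encode_string_field(number: int, value: str) -> bytes:
    """Encode one string field; sketch limited to single-byte key and length."""
    data = value.encode("utf-8")
    if number >= 16 or len(data) >= 128:
        raise ValueError("sketch handles single-byte key and length only")
    key = (number << 3) | 2  # wire type 2 = length-delimited
    return bytes([key, len(data)]) + data


def read_string_field(buf: bytes, wanted: int) -> str:
    """Return the string stored under field number `wanted`, or "" if absent."""
    pos = 0
    while pos < len(buf):
        number, length = buf[pos] >> 3, buf[pos + 1]
        value = buf[pos + 2:pos + 2 + length]
        if number == wanted:
            return value.decode("utf-8")
        pos += 2 + length  # field number the reader does not want: skip it
    return ""  # proto3 strings default to empty when the field is not found


wire = encode_string_field(1, "aws")     # client schema: version = 1
print(read_string_field(wire, 1))        # reader with the same schema
print(repr(read_string_field(wire, 3)))  # reader with version = 3
```

The first read prints aws and the second prints an empty string, even though the bytes on the wire are identical in both cases.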

You can validate this hypothesis by inspecting the Go GetAccountRequest struct. It should look like the following:

type GetAccountRequest struct {
    // unexported fields

    // other fields, in order
    Version string `protobuf:"bytes,3,opt,name=version,proto3" json:"version,omitempty"`
}

The part between backticks ` is the struct tag, where 3 is the tag number.
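If you want to pull the number out of such a tag mechanically, a small sketch like the following may help. The function parse_protobuf_tag is a made-up helper, and it assumes the comma-separated layout seen in the generated code above (wire encoding, field number, label, then name=... and further options).

```python
import re


def parse_protobuf_tag(struct_tag: str) -> dict:
    """Extract wire encoding, field number, label and name from a Go struct tag."""
    match = re.search(r'protobuf:"([^"]*)"', struct_tag)
    if match is None:
        raise ValueError("no protobuf key in struct tag")
    parts = match.group(1).split(",")
    info = {"wire_type": parts[0], "number": int(parts[1]), "label": parts[2]}
    for part in parts[3:]:
        if part.startswith("name="):
            info["name"] = part[len("name="):]
    return info


tag = 'protobuf:"bytes,3,opt,name=version,proto3" json:"version,omitempty"'
print(parse_protobuf_tag(tag))
```

Running the same function over the tags of the real Go struct gives the numbers to compare against the client side.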

I don't know what gRPC codegen in Python looks like, but you should be able to compare the tag numbers somehow. If they do differ, make sure to update either the client or server code with generated structs that have the same tag numbers.
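One possible way to list the tag numbers on the Python side is through the message descriptor that the protobuf runtime attaches to every generated class. The sketch below assumes the protobuf package is installed and uses the well-known Timestamp message as a stand-in, since the generated alphausblue module is not available here. The helper field_numbers is a name invented for this example.

```python
# Stand-in message; in the asker's project this would be the generated request class.
from google.protobuf.timestamp_pb2 import Timestamp


def field_numbers(message_cls) -> dict:
    """Map each field name of a generated message class to its tag number."""
    return {field.name: field.number for field in message_cls.DESCRIPTOR.fields}


print(field_numbers(Timestamp))  # {'seconds': 1, 'nanos': 2}
```

Passing GetAccountRequest from alphausblue.cost.v1.cost_pb2 instead of Timestamp should print the numbers to check against the Go struct tags.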

blackgreen
  • Thanks; that was a very helpful answer and something I legitimately did not check for. But, unfortunately, the field tags are the same so I don't think that's the issue here. – Woody1193 Oct 12 '21 at 01:17
  • @Woody1193 I see. Indeed the tag *numbers* are the same. Other things that come to mind are: presence of interceptors that may be altering the request; the running server is not up to date with the sources you are looking at; the *registered* server implementation is not the one with the method you are inspecting. Actually... – blackgreen Oct 12 '21 at 06:30
  • ...actually, the Go RPC handler that you show in your question has a strange signature, in that it returns just `error`, whereas usually handlers are generated with two return types, like `(*pb.Resp, error)`. Newer versions of the Go proto library introduce mandatory forward compatibility, which may cause unexpected issues, I've recently stumbled into this myself. See [this](https://stackoverflow.com/questions/65079032/grpc-with-mustembedunimplemented-method/69480218#69480218) for details. To verify, you may check the proto schema, and how the Go server is initialized – blackgreen Oct 12 '21 at 06:32
  • I don't think it's an interceptors problem because the service has no interceptors that were not on other services I tried that worked properly. I don't think it's an issue of the service being up-to-date either as it is updated via the deployment file. Finally, I'm sure I'm hitting the right service because I can see the request in the logs. – Woody1193 Oct 21 '21 at 02:03
  • I looked at the `UnimplementedCostServer` definition and the signatures for `GetAccount` match. Also, I think I made a typo in the question because the function does have a return. – Woody1193 Oct 21 '21 at 02:11
  • Quick update: I got it to work locally but it does not work with the production server. Maybe there's an issue with the load balancer... – Woody1193 Oct 21 '21 at 04:05