
I have a gRPC client and a gRPC server that work and communicate when I run them with Python on my laptop; however, whenever I upload the code to balena (Linux cloud), I get an error:

verStatus failed: <_InactiveRpcError of RPC that terminated with:
        status = StatusCode.UNAVAILABLE
        details = "failed to connect to all addresses"
        debug_error_string = "{"created":"@1661787080.151660372","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3260,"referenced_errors":[{"created":"@1661787080.151659062","description":"failed to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":167,"grpc_status":14}]}"

Why would it work on the laptop but not in the containers? To be clear: the client and the server run in different containers. The client code:

import grpc
from time import sleep
from proto import nvidia_pb2
from proto import nvidia_pb2_grpc

GRPC_PORT = '50071'
socket = '127.0.0.1:{0}'.format(GRPC_PORT)
my_timeout_in_seconds = 10

class GrpcClientNvidia():
    def __init__(self):
        try:
            self.channel = grpc.insecure_channel(socket, options=(('grpc.enable_http_proxy', 0),))
            
            self.stub = nvidia_pb2_grpc.NvidiaStub(self.channel)
        except grpc.FutureTimeoutError:
            print("ERROR")
    
    def getNvidiaStatus(self):
        try:
            request = nvidia_pb2.Empty()
            # Block until the channel is actually connected, or give up after the timeout.
            grpc.channel_ready_future(self.channel).result(timeout=my_timeout_in_seconds)
            res = self.stub.NvidiaDriverStatus(request, timeout=my_timeout_in_seconds)
            return res.status.value
        except (grpc.FutureTimeoutError, grpc.RpcError):
            return False

and the main of the client:

def runNvidiaDriverCheck():
    try: 
        print('Running SW Nvidia driver checks...')
        client = GrpcClientNvidia()
        result = client.getNvidiaStatus()
        return result
    except Exception:
        print("Status Nvidia driver check: FAIL")
    return False

def main():
    print(runNvidiaDriverCheck())

if __name__ == "__main__":
    main()
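
As a debugging aid, here is a small variant of the status call that logs the gRPC status code and details instead of silently returning False; it assumes the same generated proto modules as above, and get_nvidia_status is just an illustrative name:

import logging

import grpc
from proto import nvidia_pb2

logger = logging.getLogger(__name__)

def get_nvidia_status(stub, timeout_s=10):
    """Same RPC as GrpcClientNvidia.getNvidiaStatus, but the failure reason
    (e.g. StatusCode.UNAVAILABLE / "failed to connect to all addresses")
    is logged instead of being swallowed."""
    try:
        response = stub.NvidiaDriverStatus(nvidia_pb2.Empty(), timeout=timeout_s)
        return response.status.value
    except grpc.RpcError as e:
        logger.error('NvidiaDriverStatus failed: %s / %s', e.code(), e.details())
        return False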

and a server:

from concurrent import futures
import logging

import grpc
import sys
sys.path.append("proto")
import proto.nvidia_pb2_grpc
from servicer import NvidiaServicer

logger = logging.getLogger(__name__)



GRPC_PORT = '50071'
socket = "127.0.0.1:{0}".format(GRPC_PORT)

def server():
    logger.info('Setting up gRPC server')
    grpc_server = grpc.server(futures.ThreadPoolExecutor(max_workers=20))
    proto.nvidia_pb2_grpc.add_NvidiaServicer_to_server(
        NvidiaServicer(), grpc_server
    )

    logger.info(f'Starting server at {socket}')
    grpc_server.add_insecure_port(socket)
    return grpc_server

the servicer:

from proto import nvidia_pb2
from proto import nvidia_pb2_grpc
import logging
from driver_status import checkDriverStatus
logger = logging.getLogger()

class NvidiaServicer(nvidia_pb2_grpc.NvidiaServicer):
    def NvidiaDriverStatus(self, request, context):
        print('gRPC server got request to check driver status')
        response = nvidia_pb2.DriverStatus()
        result = checkDriverStatus()

        response.status.value = result
        print(f"response.value:{response.status.value}")
        return response

the server's main:

def main():
    logger.info('Setting up server')
    grpc_server = server()
    grpc_server.start()
    logger.info('server running...')
    grpc_server.wait_for_termination()

if __name__ == '__main__':
    main()
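
One way to narrow this down is to probe the port from inside the server container itself; a sketch (the address matches the socket defined above, and loopback_probe is just an illustrative helper):

import grpc

def loopback_probe(listen_addr='127.0.0.1:50071', timeout_s=5):
    # Run this inside the *server* container after grpc_server.start():
    # it only checks that something is accepting connections on that port.
    channel = grpc.insecure_channel(listen_addr)
    try:
        grpc.channel_ready_future(channel).result(timeout=timeout_s)
        return True
    except grpc.FutureTimeoutError:
        return False
    finally:
        channel.close()

If this probe succeeds inside the server container while the client container still gets UNAVAILABLE, the server code is fine and the problem is how the two containers reach each other, which is what the comments below get at.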
  • You cannot use `127.0.0.1` (localhost) to connect 2 containers. localhost traffic does not use the network. You may be able to use `0.0.0.0` instead. This address generally means all network interfaces and it will permit you to publish (!) the server's port (e.g. `50071`) to e.g. your host's network. You will then also need either (!) to provide a routable address to the client **or** publish the client (!) on the host's network (if client and server are on the same host) so that the client can send traffic to the server – DazWilkin Aug 29 '22 at 23:37
  • I just tested it on my laptop: I changed both the server and the client to `0.0.0.0`, but whenever the client uses `0.0.0.0` it fails to connect; however, `localhost:50071` on the client side works while the server is on `0.0.0.0`. Does that change in containers? – Yuki1112 Aug 30 '22 at 06:43
  • Yes, apologies. The client will need a specific address, but the server can (usually) bind to all interfaces (`0.0.0.0`). It's difficult to explain networking in Stack Overflow comments. Generally, on a single host without containers, you can use `localhost`. On a single host with containers, **if** you bind the containers to the host's network, you can also use `localhost`. Otherwise, servers can usually bind to `0.0.0.0`, but clients will need the container's DNS name or IP address. – DazWilkin Aug 30 '22 at 15:16
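
Putting the comments together for the container case: the server binds to all interfaces and the client dials the server container by name instead of 127.0.0.1. A minimal sketch; the service name nvidia-server and the NVIDIA_GRPC_TARGET variable are placeholders for whatever the balena/docker-compose setup actually defines:

import os

import grpc
from proto import nvidia_pb2, nvidia_pb2_grpc

# Placeholder target: the server's service name from docker-compose.yml /
# the balena fleet configuration, plus the gRPC port. The matching change on
# the server side is grpc_server.add_insecure_port('0.0.0.0:50071').
GRPC_TARGET = os.environ.get('NVIDIA_GRPC_TARGET', 'nvidia-server:50071')

def check_driver(timeout_s=10):
    channel = grpc.insecure_channel(GRPC_TARGET)
    try:
        # Fail fast if the server container is not reachable at this address.
        grpc.channel_ready_future(channel).result(timeout=timeout_s)
        stub = nvidia_pb2_grpc.NvidiaStub(channel)
        response = stub.NvidiaDriverStatus(nvidia_pb2.Empty(), timeout=timeout_s)
        return response.status.value
    except (grpc.FutureTimeoutError, grpc.RpcError):
        return False
    finally:
        channel.close()

On a docker-compose-style setup (which multi-container balena apps use), services on the same network can typically reach each other by service name without publishing ports.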
