We are using the Kubernetes Python client (4.0.0) in combination with Google Kubernetes Engine (master and node pools run Kubernetes 1.8.4) to periodically schedule workloads on Kubernetes. A simplified version of the script we use to create the pod, attach to its logs and report the end status of the pod looks as follows:
import logging

from kubernetes import client, config

# pod_specs_dict, pod_name and args are defined earlier in the full script
config.load_kube_config(persist_config=False)
v1 = client.CoreV1Api()

# Create the pod and stream its logs until the container terminates
v1.create_namespaced_pod(body=pod_specs_dict, namespace=args.namespace)
logging_response = v1.read_namespaced_pod_log(
    name=pod_name,
    namespace=args.namespace,
    follow=True,
    _preload_content=False
)
for line in logging_response:
    line = line.rstrip()
    logging.info(line)

# Report the final phase of the pod
status_response = v1.read_namespaced_pod_status(pod_name, namespace=args.namespace)
print("Pod ended in status: {}".format(status_response.status.phase))
Everything works fine, however we are experiencing some authentication issues. Authentication happens through the default gcp auth-provider, for which I obtained the initial access token by running gcloud container clusters get-credentials manually on the scheduler. At seemingly random times, some API calls get a 401 response from the API server. My guess is that this happens whenever the access token has expired and the script tries to obtain a new one. However, multiple scripts run concurrently on the scheduler, so a new access token is obtained multiple times and only one of them stays valid. I tried several ways to fix the issue (using persist_config=True, retrying 401s after reloading the config, ...) without any success. As I am not completely aware of how GCP authentication and the Kubernetes Python client configuration work (and the docs for both are rather scarce), I am a bit left in the dark.
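Concretely, the 401 retry I tried looked roughly like the sketch below (the helper name and the idea of rebuilding the client object are mine); it did not make the problem go away:

from kubernetes import client, config
from kubernetes.client.rest import ApiException

def call_with_reauth(method_name, *args, **kwargs):
    """Call a CoreV1Api method by name, reloading the kube config and
    rebuilding the client once if the API server answers with a 401."""
    try:
        return getattr(client.CoreV1Api(), method_name)(*args, **kwargs)
    except ApiException as e:
        if e.status != 401:
            raise
        # Reload the config in the hope that the gcp auth-provider refreshes the token
        config.load_kube_config(persist_config=True)
        return getattr(client.CoreV1Api(), method_name)(*args, **kwargs)

# e.g. call_with_reauth("read_namespaced_pod_status", pod_name, namespace=args.namespace)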
Should we use another authentication method instead of the gcp auth-provider? Is this a bug in the Kubernetes Python client? Should we use multiple config files?
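For reference, the kind of alternative I have in mind would bypass the gcp auth-provider and hand the client a Bearer token fetched with google-auth from the service account available on the scheduler. A rough sketch, where the cluster endpoint and CA certificate path are placeholders and each script refreshes its own token:

import google.auth
import google.auth.transport.requests
from kubernetes import client

def build_api(api_host, ca_cert_path):
    # Fetch a fresh OAuth2 access token from the application default credentials
    # (e.g. a service-account key pointed to by GOOGLE_APPLICATION_CREDENTIALS).
    creds, _ = google.auth.default(
        scopes=["https://www.googleapis.com/auth/cloud-platform"]
    )
    creds.refresh(google.auth.transport.requests.Request())

    conf = client.Configuration()
    conf.host = api_host              # e.g. "https://<cluster-endpoint>"
    conf.ssl_ca_cert = ca_cert_path   # path to the cluster CA certificate
    conf.api_key["authorization"] = creds.token
    conf.api_key_prefix["authorization"] = "Bearer"
    return client.CoreV1Api(client.ApiClient(conf))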