
I am creating a deployment in GKE with the following (standard) deployment manifest:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      component: api
  template:
    metadata:
      labels:
        component: api
    spec:
      containers:
      - name: api
        image: eu.gcr.io/xxxx-xxx/api:latest
        imagePullPolicy: Always
        resources:
          requests:
            memory: "320Mi"
            cpu: "100m"
          limits:
            memory: "450Mi"
            cpu: "150m"
        ports:
        - containerPort: 5010

However, for some reason GKE complains about a permission issue. The containers are in the Container Registry of the same project and are private, but as far as I am aware, GKE should have access to them within the same GCP project. The GKE cluster is VPC-native (if that might make a difference), as that is the only difference I can think of compared to a cluster I used to run with the same containers and installers.

Events:
  Type     Reason     Age                    From                                                     Message
  ----     ------     ----                   ----                                                     -------
  Normal   Scheduled  34m                    default-scheduler                                        Successfully assigned default/api-deployment-f68977b84-fmhdx to gke-gke-dev-cluster-default-pool-6c6bb127-nw61
  Normal   Pulling    32m (x4 over 33m)      kubelet, gke-gke-dev-cluster-default-pool-6c6bb127-nw61  pulling image "eu.gcr.io/xxxx-xxx/api:latest"
  Warning  Failed     32m (x4 over 33m)      kubelet, gke-gke-dev-cluster-default-pool-6c6bb127-nw61  Failed to pull image "eu.gcr.io/xxxx-xxx/api:latest": rpc error: code = Unknown desc = Error response from daemon: pull access denied for eu.gcr.io/xxxx-xxx/api, repository does not exist or may require 'docker login'
  Warning  Failed     32m (x4 over 33m)      kubelet, gke-gke-dev-cluster-default-pool-6c6bb127-nw61  Error: ErrImagePull
  Normal   BackOff    32m (x6 over 33m)      kubelet, gke-gke-dev-cluster-default-pool-6c6bb127-nw61  Back-off pulling image "eu.gcr.io/xxxx-xxx/api:latest"
  Warning  Failed     3m59s (x131 over 33m)  kubelet, gke-gke-dev-cluster-default-pool-6c6bb127-nw61  Error: ImagePullBackOff

Do I need to add imagePullSecrets as well for GKE clusters pulling from the Google Container Registry, or might there be another problem?
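(For reference, if a pull secret really were needed, it could be created from a service-account key and referenced in the Deployment; `gcr-pull` and `key.json` below are hypothetical names, and the literal username `_json_key` is what GCR expects for key-based logins.)

```shell
# Usually unnecessary for same-project GCR, but for completeness:
# create a docker-registry pull secret from a service-account key file.
kubectl create secret docker-registry gcr-pull \
  --docker-server=eu.gcr.io \
  --docker-username=_json_key \
  --docker-password="$(cat key.json)" \
  --docker-email=any@example.com

# Then reference it under spec.template.spec in the Deployment:
#   imagePullSecrets:
#   - name: gcr-pull
```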

The GKE cluster was created using Terraform with the following gke.tf:

resource "google_container_cluster" "primary" {
  name = "gke-${terraform.workspace}-cluster"
  zone = "${var.region}-b"

  additional_zones = [
    "${var.region}-c",
    "${var.region}-d",
  ]

  # minimum kubernetes version for master
  min_master_version = "${var.min_master_version}"
  # version for the nodes. Should equal min_master_version on create
  node_version       = "${var.node_version}"
  initial_node_count = "${var.gke_num_nodes[terraform.workspace]}"
  network            = "${var.vpc_name}"
  subnetwork         = "${var.subnet_name}"

  addons_config {

    http_load_balancing {
      disabled = false  # this is the default
    }

    horizontal_pod_autoscaling {
      disabled = false
    }

    kubernetes_dashboard {
      disabled = false
    }
  }

  # vpc-native network
  ip_allocation_policy {
#    use_ip_aliases = true
  }

  master_auth {
    username = "${var.gke_master_user}"
    password = "${var.gke_master_pass}"
  }

  node_config {
    oauth_scopes = [
      "https://www.googleapis.com/auth/compute",
      "https://www.googleapis.com/auth/devstorage.read_only",
      "https://www.googleapis.com/auth/logging.write",
      "https://www.googleapis.com/auth/monitoring",
    ]

    labels = {
      env = "${var.gke_label[terraform.workspace]}"
    }

    disk_size_gb = 10
    machine_type = "${var.gke_node_machine_type}"
    tags         = ["gke-node"]
  }
}

Running gcloud container clusters describe [CLUSTER] gives:

nodePools:
- config:
    diskSizeGb: 10
    diskType: pd-standard
    imageType: COS
    labels:
      env: dev
    machineType: n1-standard-1
    metadata:
      disable-legacy-endpoints: 'true'
    oauthScopes:
    - https://www.googleapis.com/auth/monitoring
    - https://www.googleapis.com/auth/devstorage.read_only
    - https://www.googleapis.com/auth/logging.write
    - https://www.googleapis.com/auth/compute
    serviceAccount: default

so devstorage.read_only seems to be there.

Mike

3 Answers


Are your GKE cluster node pools configured with the https://www.googleapis.com/auth/devstorage.read_only OAuth scope?

To check, you can run gcloud container clusters describe [CLUSTER NAME]: scopes are listed under the oauthScopes property. Or check your node pool details in the GCP dashboard:

[screenshot: GKE node pool OAuth scopes]

Storage should be enabled.
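A quick way to check this from the CLI (the cluster name and zone below are placeholders; substitute your own):

```shell
# List the OAuth scopes of the cluster's node pools and look for the
# devstorage scope, which is required to pull from GCR's backing bucket.
gcloud container clusters describe gke-dev-cluster \
  --zone europe-west1-b \
  --format="flattened(nodePools[].config.oauthScopes)" \
  | grep devstorage
```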

Aleksi
  • Thanks Aleksi, I have added my output to the question. Storage seems enabled to me. I also have the same screenshot as you have. – Mike Aug 12 '19 at 15:13
  • Ok, so we can rule that out. Would you mind attaching your terraform configuration? One of [these answers](https://stackoverflow.com/a/52572227/1763012) might be relevant to your case also. – Aleksi Aug 12 '19 at 15:22
  • I have added the .tf file as well. I came across the thread you referred, but could not resolve it yet. – Mike Aug 12 '19 at 15:27
  • Hmm.. one more thing you might want to check is that the service account used by GKE nodes (most likely `Compute Engine default service account`) has correct permissions. – Aleksi Aug 12 '19 at 16:49
  • I am not sure how to see which permissions it has? I can see it for the IAM accounts, but not the default service account – Mike Aug 12 '19 at 22:38
  • Try `gcloud projects get-iam-policy [PROJECT ID]`; that shows each user and service account along with their roles. – Aleksi Aug 13 '19 at 04:45
  • It replies `etag: ACAB` – Mike Aug 14 '19 at 20:58
  • Oh? It should reply with a list of bindings. You can also check service account roles via the web UI, at `https://console.cloud.google.com/iam-admin/iam?project=[PROJECT ID]`. – Aleksi Aug 15 '19 at 04:19
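To narrow the policy output down to a single service account, the same command can be filtered; the project id and service-account email below are placeholders:

```shell
# Show only the roles bound to the Compute Engine default service account
# (<project-number>-compute@developer.gserviceaccount.com).
gcloud projects get-iam-policy my-project \
  --flatten="bindings[].members" \
  --filter="bindings.members:123456789012-compute@developer.gserviceaccount.com" \
  --format="value(bindings.role)"
```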

In order to use GCR, the nodes need to run with service accounts and OAuth scopes that allow reading from Cloud Storage. There is guidance on this topic here, for example: https://cloud.google.com/kubernetes-engine/docs/how-to/access-scopes#service_account
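If the scope is present but the service account lacks a storage role, granting read access might look like this; the project id is a placeholder, and roles/storage.objectViewer is the minimal read role for the Cloud Storage bucket backing GCR:

```shell
PROJECT_ID=my-project

# The Compute Engine default service account is
# <project-number>-compute@developer.gserviceaccount.com.
SA="$(gcloud projects describe "$PROJECT_ID" --format='value(projectNumber)')-compute@developer.gserviceaccount.com"

# Grant read-only access to Cloud Storage objects (and thus GCR images).
gcloud projects add-iam-policy-binding "$PROJECT_ID" \
  --member "serviceAccount:$SA" \
  --role roles/storage.objectViewer
```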

mensi
  • the project is owned by a service account with Storage Admin permissions – Mike Aug 11 '19 at 13:18
  • The project owner's permissions do not guarantee that the cluster has correct permissions; we need to set cluster/node pool OAuth scopes separately. – Aleksi Aug 12 '19 at 13:40

In addition to Aleksi's comment, and based on this documentation [1], you can also retrieve the IAM policy of an individual service account with:

gcloud iam service-accounts get-iam-policy [SERVICE_ACCOUNT]
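For example, with the Compute Engine default service account (the project number is a placeholder):

```shell
# Retrieve the IAM policy attached to the service account itself.
gcloud iam service-accounts get-iam-policy \
  123456789012-compute@developer.gserviceaccount.com \
  --format=json
```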
Bruno