Problem

Using the Terraform google_service_account and google_project_iam_binding resources to attach roles/editor removed the Google APIs Service Agent and the Compute Engine default service account from the IAM principals. As a result, GKE clusters can no longer be created or deleted, although the Compute Engine default service account still appears under IAM Service Accounts.

To be precise, the account disappears from the IAM principals (which is what I mean by "deleted"). The Compute Engine default service account is left broken and can no longer manage Compute Engine resources, including GKE clusters and nodes.

Question

I believe this is a Terraform bug, but please help me understand whether I am missing something that would prevent the problem.

Please also advise whether there is a way to restore the Compute Engine default service account to the IAM principals with the Editor role.

Environment

$ terraform version
Terraform v1.0.4
on linux_amd64
+ provider registry.terraform.io/hashicorp/google v4.6.0

.terraform.lock.hcl

# This file is maintained automatically by "terraform init".
# Manual edits may be lost in future updates.

provider "registry.terraform.io/hashicorp/google" {
  version = "4.6.0"
  hashes = [
    "h1:QbO4yjDrnoSpiYKSHrICNL1ZuWsl5J2rVRFj2kNg7xA=",
    "zh:005a28a2c79f6b29680b0f57260c69c85d8a992688007b6e5645149bd379951f",
    "zh:2604d825de72cf99b4899d7880837adeb19d371f48e419666e32c4c3cf6a72e9",
    "zh:290da4eb18e44469480cf299bebce89f54e4d301f856cdffe2837b498878c7ec",
    "zh:3e5ba1a55d38fa17533a18fc14a612e781ded76c6309734d3dc0a937be27eec1",
    "zh:4a85de3cdb33c092d8ccfced3d7302934de0dd4f72bbcebd79d45afe0a0b6f85",
    "zh:5fb1a79800833ae922aaba594a8b2bc83be1d254052e12e0ce8330ca0d8933d9",
    "zh:679b9f50c6fe0476e74d37935f7598d46d6e9612f75b26a8ef1ca3c13144d06a",
    "zh:893216e32378839668c51ef135af1676cd887d63e2edb6625cf9adad7bfa346f",
    "zh:ad8f2fd19adbe4c10281ba9b3c8d5100877a9c541d3580bbbe9357714aa77619",
    "zh:bff5d6fd15e98c12ee9ed98b0338761dc4a9ba671a37834926daeabf73c71783",
    "zh:debdf15fbed8d63e397cd004bf65586bd2b93ce04e47ca51a7c70c1fe9168b87",
  ]
}

Reproduction Steps

Tested twice in different GCP projects and the issue was reproduced in the same manner.

Start

Start with a GCP project in which the Compute Engine API is not enabled, so there is no Compute Engine default service account yet.

Enable Compute Engine API.

Compute Engine default service account gets created and appears both in IAM Principals and IAM Service Accounts.

Terraform apply

Apply the terraform script to create a service account with IAM bindings.

variable "PROJECT_ID" {
  type        = string
  description = "GCP Project ID"
  default     = "test-tf-sa"
}

variable "REGION" {
  type        = string
  description = "GCP Region"
  default     = "us-central1"
}


variable "roles_to_grant_to_service_account" {
  description = "IAM roles to grant to the service account"
  type        = list(string)
  default = [
    "roles/editor",
    "roles/iam.serviceAccountAdmin",
    "roles/resourcemanager.projectIamAdmin"
  ]
}

provider "google" {
  project = var.PROJECT_ID
  region  = var.REGION
}
resource "google_service_account" "terraform" {
  account_id   = "terraform"
  display_name = "terraform service account"
}

resource "google_project_iam_binding" "terraform" {
  project = var.PROJECT_ID

  #--------------------------------------------------------------------------------
  # Grant the service account to have the roles
  #--------------------------------------------------------------------------------
  members = [
    "serviceAccount:${google_service_account.terraform.email}"
  ]
  for_each = toset(var.roles_to_grant_to_service_account)
  role     = each.value
}

$ terraform apply --auto-approve

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # google_project_iam_binding.terraform["roles/editor"] will be created
  + resource "google_project_iam_binding" "terraform" {
      + etag    = (known after apply)
      + id      = (known after apply)
      + members = (known after apply)
      + project = "test-tf-sa"
      + role    = "roles/editor"
    }

  # google_project_iam_binding.terraform["roles/iam.serviceAccountAdmin"] will be created
  + resource "google_project_iam_binding" "terraform" {
      + etag    = (known after apply)
      + id      = (known after apply)
      + members = (known after apply)
      + project = "test-tf-sa"
      + role    = "roles/iam.serviceAccountAdmin"
    }

  # google_project_iam_binding.terraform["roles/resourcemanager.projectIamAdmin"] will be created
  + resource "google_project_iam_binding" "terraform" {
      + etag    = (known after apply)
      + id      = (known after apply)
      + members = (known after apply)
      + project = "test-tf-sa"
      + role    = "roles/resourcemanager.projectIamAdmin"
    }

  # google_service_account.terraform will be created
  + resource "google_service_account" "terraform" {
      + account_id   = "terraform"
      + disabled     = false
      + display_name = "terraform service account"
      + email        = (known after apply)
      + id           = (known after apply)
      + name         = (known after apply)
      + project      = (known after apply)
      + unique_id    = (known after apply)
    }

Plan: 4 to add, 0 to change, 0 to destroy.
google_service_account.terraform: Creating...
google_service_account.terraform: Creation complete after 2s [id=projects/test-tf-sa/serviceAccounts/terraform@test-tf-sa.iam.gserviceaccount.com]
google_project_iam_binding.terraform["roles/resourcemanager.projectIamAdmin"]: Creating...
google_project_iam_binding.terraform["roles/editor"]: Creating...
google_project_iam_binding.terraform["roles/iam.serviceAccountAdmin"]: Creating...
google_project_iam_binding.terraform["roles/iam.serviceAccountAdmin"]: Creation complete after 9s [id=test-tf-sa/roles/iam.serviceAccountAdmin]
google_project_iam_binding.terraform["roles/editor"]: Creation complete after 9s [id=test-tf-sa/roles/editor]
google_project_iam_binding.terraform["roles/resourcemanager.projectIamAdmin"]: Still creating... [10s elapsed]
google_project_iam_binding.terraform["roles/resourcemanager.projectIamAdmin"]: Creation complete after 10s [id=test-tf-sa/roles/resourcemanager.projectIamAdmin]

Apply complete! Resources: 4 added, 0 changed, 0 destroyed.

Terraform has deleted the Compute Engine default service account from the IAM principals

Immediately after terraform apply, check the IAM principals: the Compute Engine default service account is no longer listed in the IAM principals view.

As suggested by @JohnHanley, I clicked Include Google-provided role grants to unhide Google-managed service accounts. The original Compute Engine default service account 1079157603081-compute@developer.gserviceaccount.com is gone from the IAM principals view.

The gcloud projects get-iam-policy command does not show the Compute Engine default service account 1079157603081-compute@developer.gserviceaccount.com.

$ GCP_PROJECT_ID=test-tf-sa
$ gcloud projects get-iam-policy $GCP_PROJECT_ID
bindings:
- members:
  - serviceAccount:service-1079157603081@compute-system.iam.gserviceaccount.com
  role: roles/compute.admin
- members:
  - serviceAccount:service-1079157603081@compute-system.iam.gserviceaccount.com
  role: roles/compute.instanceAdmin
- members:
  - serviceAccount:service-1079157603081@compute-system.iam.gserviceaccount.com
  role: roles/compute.serviceAgent
- members:
  - serviceAccount:service-1079157603081@container-engine-robot.iam.gserviceaccount.com
  role: roles/container.serviceAgent
- members:
  - serviceAccount:service-1079157603081@containerregistry.iam.gserviceaccount.com
  role: roles/containerregistry.ServiceAgent
- members:
  - serviceAccount:service-1079157603081@compute-system.iam.gserviceaccount.com
  role: roles/editor
- members:
  - user:****@gmail.com
  role: roles/owner
- members:
  - serviceAccount:service-1079157603081@gcp-sa-pubsub.iam.gserviceaccount.com
  role: roles/pubsub.serviceAgent
etag: BwXVf2S5fCQ=
version: 1

The service account, however, still remains in the IAM Service Accounts menu.
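
The two console views answer different questions: the Service Accounts page lists accounts that exist, while the IAM page lists role bindings. A minimal sketch of checking both from the CLI, assuming the project ID and account email from this question; the helper names sa_exists and sa_roles are made up for illustration and are not part of gcloud:

```shell
# Values taken from the question; adjust for your own project.
GCP_PROJECT_ID="test-tf-sa"
SA_EMAIL="1079157603081-compute@developer.gserviceaccount.com"

# Does the account itself still exist? (hypothetical helper name)
sa_exists() {
  gcloud iam service-accounts describe "${SA_EMAIL}" \
    --project="${GCP_PROJECT_ID}"
}

# Which project-level roles, if any, are bound to it? (hypothetical helper name)
sa_roles() {
  gcloud projects get-iam-policy "${GCP_PROJECT_ID}" \
    --flatten="bindings[].members" \
    --filter="bindings.members:${SA_EMAIL}" \
    --format="value(bindings.role)"
}
```

If sa_exists succeeds while sa_roles prints nothing, the account exists but holds no role on the project, which would match what the console shows here.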

Create GKE

Enable the Kubernetes Engine API and create a GKE cluster. At this point, the missing Compute Engine default service account did not hinder cluster creation, possibly because of eventual consistency.

terraform destroy

Run terraform destroy.

$ terraform destroy --auto-approve
google_service_account.terraform: Refreshing state... [id=projects/test-tf-sa/serviceAccounts/terraform@test-tf-sa.iam.gserviceaccount.com]
google_project_iam_binding.terraform["roles/editor"]: Refreshing state... [id=test-tf-sa/roles/editor]
google_project_iam_binding.terraform["roles/iam.serviceAccountAdmin"]: Refreshing state... [id=test-tf-sa/roles/iam.serviceAccountAdmin]
google_project_iam_binding.terraform["roles/resourcemanager.projectIamAdmin"]: Refreshing state... [id=test-tf-sa/roles/resourcemanager.projectIamAdmin]

Note: Objects have changed outside of Terraform

Terraform detected the following changes made outside of Terraform since the last "terraform apply":

  # google_project_iam_binding.terraform["roles/editor"] has been changed
  ~ resource "google_project_iam_binding" "terraform" {
      ~ etag    = "BwXVe+z+aCU=" -> "BwXVfBieTDw="
        id      = "test-tf-sa/roles/editor"
      ~ members = [
          + "serviceAccount:1079157603081@cloudservices.gserviceaccount.com",
            # (1 unchanged element hidden)
        ]
        # (2 unchanged attributes hidden)
    }
  # google_project_iam_binding.terraform["roles/iam.serviceAccountAdmin"] has been changed
  ~ resource "google_project_iam_binding" "terraform" {
      ~ etag    = "BwXVe+z+aCU=" -> "BwXVfBieTDw="
        id      = "test-tf-sa/roles/iam.serviceAccountAdmin"
        # (3 unchanged attributes hidden)
    }
  # google_project_iam_binding.terraform["roles/resourcemanager.projectIamAdmin"] has been changed
  ~ resource "google_project_iam_binding" "terraform" {
      ~ etag    = "BwXVe+z+aCU=" -> "BwXVfBieTDw="
        id      = "test-tf-sa/roles/resourcemanager.projectIamAdmin"
        # (3 unchanged attributes hidden)
    }

Unless you have made equivalent changes to your configuration, or ignored the relevant attributes using ignore_changes, the following plan may include actions to
undo or respond to these changes.

────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  - destroy

Terraform will perform the following actions:

  # google_project_iam_binding.terraform["roles/editor"] will be destroyed
  - resource "google_project_iam_binding" "terraform" {
      - etag    = "BwXVfBieTDw=" -> null
      - id      = "test-tf-sa/roles/editor" -> null
      - members = [
          - "serviceAccount:1079157603081@cloudservices.gserviceaccount.com",
          - "serviceAccount:terraform@test-tf-sa.iam.gserviceaccount.com",
        ] -> null
      - project = "test-tf-sa" -> null
      - role    = "roles/editor" -> null
    }

  # google_project_iam_binding.terraform["roles/iam.serviceAccountAdmin"] will be destroyed
  - resource "google_project_iam_binding" "terraform" {
      - etag    = "BwXVfBieTDw=" -> null
      - id      = "test-tf-sa/roles/iam.serviceAccountAdmin" -> null
      - members = [
          - "serviceAccount:terraform@test-tf-sa.iam.gserviceaccount.com",
        ] -> null
      - project = "test-tf-sa" -> null
      - role    = "roles/iam.serviceAccountAdmin" -> null
    }

  # google_project_iam_binding.terraform["roles/resourcemanager.projectIamAdmin"] will be destroyed
  - resource "google_project_iam_binding" "terraform" {
      - etag    = "BwXVfBieTDw=" -> null
      - id      = "test-tf-sa/roles/resourcemanager.projectIamAdmin" -> null
      - members = [
          - "serviceAccount:terraform@test-tf-sa.iam.gserviceaccount.com",
        ] -> null
      - project = "test-tf-sa" -> null
      - role    = "roles/resourcemanager.projectIamAdmin" -> null
    }

  # google_service_account.terraform will be destroyed
  - resource "google_service_account" "terraform" {
      - account_id   = "terraform" -> null
      - disabled     = false -> null
      - display_name = "terraform service account" -> null
      - email        = "terraform@test-tf-sa.iam.gserviceaccount.com" -> null
      - id           = "projects/test-tf-sa/serviceAccounts/terraform@test-tf-sa.iam.gserviceaccount.com" -> null
      - name         = "projects/test-tf-sa/serviceAccounts/terraform@test-tf-sa.iam.gserviceaccount.com" -> null
      - project      = "test-tf-sa" -> null
      - unique_id    = "107173424725895843752" -> null
    }

Plan: 0 to add, 0 to change, 4 to destroy.
google_project_iam_binding.terraform["roles/resourcemanager.projectIamAdmin"]: Destroying... [id=test-tf-sa/roles/resourcemanager.projectIamAdmin]
google_project_iam_binding.terraform["roles/editor"]: Destroying... [id=test-tf-sa/roles/editor]
google_project_iam_binding.terraform["roles/iam.serviceAccountAdmin"]: Destroying... [id=test-tf-sa/roles/iam.serviceAccountAdmin]
google_project_iam_binding.terraform["roles/resourcemanager.projectIamAdmin"]: Destruction complete after 10s
google_project_iam_binding.terraform["roles/iam.serviceAccountAdmin"]: Destruction complete after 10s
google_project_iam_binding.terraform["roles/editor"]: Still destroying... [id=test-tf-sa/roles/editor, 10s elapsed]
google_project_iam_binding.terraform["roles/editor"]: Destruction complete after 11s
google_service_account.terraform: Destroying... [id=projects/test-tf-sa/serviceAccounts/terraform@test-tf-sa.iam.gserviceaccount.com]
google_service_account.terraform: Destruction complete after 1s

Destroy complete! Resources: 4 destroyed.

Problems

Cannot delete GKE

The impact of removing the Compute Engine default service account from the IAM principals starts to show here.

The GKE cluster cannot be deleted; the attempt fails with this error:

Google Compute Engine: Required 'compute.instanceGroups.update' permission for 'projects/1079157603081/zones/us-central1-c/instanceGroups/gke-cluster-1-default-pool-b54fa6be-grp'.

$ gcloud container clusters delete cluster-1 --zone=us-central1-c
The following clusters will be deleted.
 - [cluster-1] in [us-central1-c]

Do you want to continue (Y/n)?  Y

Deleting cluster cluster-1...done.                                                                                                                                  
ERROR: (gcloud.container.clusters.delete) Some requests did not succeed:
 - args: ['Operation [<Operation\n clusterConditions: [<StatusCondition\n canonicalCode: CanonicalCodeValueValuesEnum(PERMISSION_DENIED, 7)\n message: "Google Compute Engine: Required \'compute.instanceGroups.update\' permission for \'projects/1079157603081/zones/us-central1-c/instanceGroups/gke-cluster-1-default-pool-b54fa6be-grp\'.">]\n detail: "Google Compute Engine: Required \'compute.instanceGroups.update\' permission for \'projects/1079157603081/zones/us-central1-c/instanceGroups/gke-cluster-1-default-pool-b54fa6be-grp\'."\n endTime: \'2022-01-14T00:20:54.190004708Z\'\n error: <Status\n code: 7\n details: []\n message: "Google Compute Engine: Required \'compute.instanceGroups.update\' permission for \'projects/1079157603081/zones/us-central1-c/instanceGroups/gke-cluster-1-default-pool-b54fa6be-grp\'.">\n name: \'operation-1642119632548-20038ec5\'\n nodepoolConditions: []\n operationType: OperationTypeValueValuesEnum(DELETE_CLUSTER, 2)\n selfLink: \'https://container.googleapis.com/v1/projects/1079157603081/zones/us-central1-c/operations/operation-1642119632548-20038ec5\'\n startTime: \'2022-01-14T00:20:32.548792723Z\'\n status: StatusValueValuesEnum(DONE, 3)\n statusMessage: "Google Compute Engine: Required \'compute.instanceGroups.update\' permission for \'projects/1079157603081/zones/us-central1-c/instanceGroups/gke-cluster-1-default-pool-b54fa6be-grp\'."\n targetLink: \'https://container.googleapis.com/v1/projects/1079157603081/zones/us-central1-c/clusters/cluster-1\'\n zone: \'us-central1-c\'>] finished with error: Google Compute Engine: Required \'compute.instanceGroups.update\' permission for \'projects/1079157603081/zones/us-central1-c/instanceGroups/gke-cluster-1-default-pool-b54fa6be-grp\'.']
   exit_code: 1

Cannot create GKE

Try to create another GKE cluster.

A GKE cluster can no longer be created. This is the original issue, "GCP GKE - Google Compute Engine: Not all instances running in IGM", that I encountered and that led to this troubleshooting.

cluster-2
Google Compute Engine: Not all instances running in IGM after 18.798524988s. Expected 3, running 0, transitioning 3. Current errors: [PERMISSIONS_ERROR]: Instance 'gke-cluster-2-default-pool-36522bb7-0vkl' creation failed: Required 'compute.instances.create' permission for 'projects/1079157603081/zones/us-central1-c/instances/gke-cluster-2-default-pool-36522bb7-0vkl' (when acting as '1079157603081@cloudservices.gserviceaccount.com'); [PERMISSIONS_ERROR]: Instance 'gke-cluster-2-default-pool-36522bb7-0vkl' creation failed: Required 'compute.disks.create' permission for 'projects/1079157603081/zones/us-central1-c/disks/gke-cluster-2-default-pool-36522bb7-0vkl' (when acting as '1079157603081@cloudservices.gserviceaccount.com'); [PERMISSIONS_ERROR]: Instance 'gke-cluster-2-default-pool-36522bb7-0vkl' creation failed: Required 'compute.disks.setLabels' permission for 'projects/1079157603081/zones/us-central1-c/disks/gke-cluster-2-default-pool-36522bb7-0vkl' (when acting as '1079157603081@cloudservices.gserviceaccount.com'); [PERMISSIONS_ERROR]: Instance 'gke-cluster-2-default-pool-36522bb7-0vkl' creation failed: Required 'compute.subnetworks.use' permission for 'projects/1079157603081/regions/us-central1/subnetworks/default' (when acting as '1079157603081@cloudservices.gserviceaccount.com'); [PERMISSIONS_ERROR]: Instance 'gke-cluster-2-default-pool-36522bb7-0vkl' creation failed: Required 'compute.subnetworks.useExternalIp' permission for 'projects/1079157603081/regions/us-central1/subnetworks/default' (when acting as '1079157603081@cloudservices.gserviceaccount.com') (truncated).


Attempts to fix

I tried the following measures, without success.

Reassign roles/Editor to the service account

GCP_PROJECT_ID=test-tf-sa
GCP_SVC_ACC="serviceAccount:1079157603081-compute@developer.gserviceaccount.com"

gcloud projects add-iam-policy-binding ${GCP_PROJECT_ID} \
    --member=serviceAccount:${GCP_SVC_ACC} \
    --role=roles/Editor
-----
ERROR: Policy modification failed. For a binding with condition, run "gcloud alpha iam policies lint-condition" to identify issues in condition.
ERROR: (gcloud.projects.add-iam-policy-binding) INVALID_ARGUMENT: Role roles/Editor is not supported for this resource.
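
A likely cause of this error, offered as a hedged reading rather than a confirmed diagnosis: role IDs are case-sensitive, so the predefined Editor role is spelled roles/editor. In addition, GCP_SVC_ACC already carries the serviceAccount: prefix, so the --member flag above expands to serviceAccount:serviceAccount:.... A corrected sketch, reusing the project ID and account email from this question; the function name grant_editor is made up for illustration:

```shell
# Values taken from the question; adjust for your own project.
GCP_PROJECT_ID="test-tf-sa"
# Bare email only: the "serviceAccount:" prefix is added exactly once, below.
GCP_SVC_ACC_EMAIL="1079157603081-compute@developer.gserviceaccount.com"
MEMBER="serviceAccount:${GCP_SVC_ACC_EMAIL}"
ROLE="roles/editor"   # lowercase: role IDs are case-sensitive

# grant_editor is a hypothetical helper name, not part of gcloud.
grant_editor() {
  gcloud projects add-iam-policy-binding "${GCP_PROJECT_ID}" \
    --member="${MEMBER}" \
    --role="${ROLE}"
}
```

Calling grant_editor should add the binding, but as the later attempts in this question show, restoring this binding alone did not resolve the GKE errors.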

Apply undelete service account

$ gcloud beta iam service-accounts undelete 109558708367309276392
restoredAccount:
  email: 1079157603081-compute@developer.gserviceaccount.com
  etag: MDEwMjE5MjA=
  name: projects/test-tf-sa/serviceAccounts/1079157603081-compute@developer.gserviceaccount.com
  oauth2ClientId: '109558708367309276392'
  projectId: test-tf-sa
  uniqueId: '109558708367309276392'

Neither attempt brought the Compute Engine default service account back to the IAM principals.

Disable Compute Engine API

I tried to disable the Compute Engine API, but it cannot be disabled because the GKE nodes cannot be deleted.

Manually add back the service account

I manually added the Compute Engine default service account 1079157603081-compute@developer.gserviceaccount.com in IAM with the Editor role. It now appears in the gcloud projects get-iam-policy output, but the GKE cluster still cannot be deleted.

$ gcloud projects get-iam-policy $GCP_PROJECT_ID
bindings:
...
- members:
  - serviceAccount:1079157603081-compute@developer.gserviceaccount.com           <-----
  - serviceAccount:service-1079157603081@compute-system.iam.gserviceaccount.com
  role: roles/editor
...
etag: BwXVf9cVnaU=
version: 1

$ gcloud container clusters delete cluster-1 --zone=us-central1-c
The following clusters will be deleted.
 - [cluster-1] in [us-central1-c]

Do you want to continue (Y/n)?  Y

Deleting cluster cluster-1...done.                                                                                                                                  
ERROR: (gcloud.container.clusters.delete) Some requests did not succeed:
 - args: ['Operation [<Operation\n clusterConditions: [<StatusCondition\n canonicalCode: CanonicalCodeValueValuesEnum(PERMISSION_DENIED, 7)\n 
 message: "Google Compute Engine: Required \'compute.instanceGroups.update\' permission for 
 \'projects/1079157603081/zones/us-central1-c/instanceGroups/gke-cluster-1-default-pool-b54fa6be-grp\'.">]\n 
 detail: "Google Compute Engine: Required \'compute.instanceGroups.update\' permission for 
 \'projects/1079157603081/zones/us-central1-c/instanceGroups/gke-cluster-1-default-pool-b54fa6be-grp\'."\n 
 endTime: \'2022-01-14T00:33:38.746564953Z\'\n error: <Status\n code: 7\n details: []\n 
 message: "Google Compute Engine: Required \'compute.instanceGroups.update\' permission for 
 \'projects/1079157603081/zones/us-central1-c/instanceGroups/gke-cluster-1-default-pool-b54fa6be-grp\'.">\n 
 name: \'operation-1642120382096-034b0eb7\'\n nodepoolConditions: []
 \n operationType: OperationTypeValueValuesEnum(DELETE_CLUSTER, 2)\n 
 selfLink: \'https://container.googleapis.com/v1/projects/1079157603081/zones/us-central1-c/operations/operation-1642120382096-034b0eb7\'\n 
 startTime: \'2022-01-14T00:33:02.096736326Z\'\n status: StatusValueValuesEnum(DONE, 3)\n 
 statusMessage: "Google Compute Engine: Required \'compute.instanceGroups.update\' permission for 
 \'projects/1079157603081/zones/us-central1-c/instanceGroups/gke-cluster-1-default-pool-b54fa6be-grp\'."\n 
 targetLink: \'https://container.googleapis.com/v1/projects/1079157603081/zones/us-central1-c/clusters/cluster-1\'\n 
 zone: \'us-central1-c\'>] finished with error: Google Compute Engine: Required \'compute.instanceGroups.update\' permission for 
 \'projects/1079157603081/zones/us-central1-c/instanceGroups/gke-cluster-1-default-pool-b54fa6be-grp\'.']
   exit_code: 1

Another service account for GKE

I created another service account with the compute.admin role and used it to create and delete GKE clusters. However, once the Compute Engine default service account is broken, the "GCP GKE - Google Compute Engine: Not all instances running in IGM" issue keeps occurring.


Goal to achieve

Bring the Compute Engine default service account back into the IAM principals, as it appeared right after the Compute Engine API was enabled, and be able to manage Compute Engine instances and GKE nodes again.

mon
  • I do not believe the service account is deleted. You have a different problem. 1) In your screenshot after **Terraform has deleted the Compute Engine default service account**, click the little button **Include Google-provided role grants** to unhide Google-managed service accounts. 2) Use the CLI and verify that the service account is missing rather than hidden **gcloud projects get-iam-policy** 3) If this is actually a bug, you are at the wrong forum. Go to **Google Issue Tracker** https://developers.google.com/issue-tracker and post the problem there. – John Hanley Jan 13 '22 at 23:24
  • @JohnHanley, you are right, it should have been "deleted from the IAM principals" console view. It still remains in the IAM Service Accounts console view, but it can no longer be used to manage Compute Engine now that roles/Editor is gone. Tried to reassign the role with gcloud projects add-iam-policy-binding but got ERROR: Policy modification failed. gcloud beta iam service-accounts undelete did not bring it back into IAM principals. Any suggestion? – mon Jan 13 '22 at 23:32
  • Since the service account is **not** deleted, you cannot **undelete** it. To see the service account, click the button I mentioned. If it does not appear after clicking the button, use the CLI and confirm which roles are assigned to the service account. Service accounts with no roles do not appear under IAM but do appear under Service Accounts. Note: this is a recent change within the last year with the GUI. – John Hanley Jan 13 '22 at 23:44
  • If the service account has no roles assigned to it within the project, you can go to **IAM** and add the service account with required roles. It will then appear again. – John Hanley Jan 13 '22 at 23:45
  • Thanks @JohnHanley. Include Google-provided role grants showed hidden accounts, but the original Compute Engine default account 1079157603081-compute@developer.gserviceaccount.com does not exist in IAM principals, nor any account with name "Compute Engine default service account". gcloud projects get-iam-policy command does not show the Compute Engine default service account 1079157603081-compute@developer.gserviceaccount.com, either. If there is other suggestion to bring the Compute Engine default service account 1079157603081-compute@developer.gserviceaccount.com back, please advise. – mon Jan 14 '22 at 00:16
  • Go to IAM, add a new member using the email address of the service account. Attach your desired roles. – John Hanley Jan 14 '22 at 00:21
  • @JohnHanley, thanks for the suggestion. Added the Compute Engine default service account email in IAM principals, and it appears in the gcloud projects get-iam-policy output, but no luck. Still cannot manage GKE nodes, cannot delete, etc. Tried to open an issue in the Google issue tracker under IAM but got "You do not have permission to create issues in this component." ... thanks for the valuable suggestions though. Learned a lot. – mon Jan 14 '22 at 00:44
  • @mon, I still believe it's a clash between the Editor role defined in `roles_to_grant_to_service_account` and the `google_project_iam_binding` role. Can you re-run the scenario without the editor role and see if the service account is still being deleted? The `binding` resource is authoritative, which means it will delete any binding that is NOT explicitly specified in the terraform configuration. – Totem Jan 15 '22 at 01:58
  • Thanks so much @Totem, as you pointed out, it only happens with "roles/Editor". I should have read the document word by word and used google_project_iam_member, I suppose. – mon Jan 15 '22 at 03:37

2 Answers

Related issues

I wish I had read these before running into this issue and becoming yet another one to bite the dust.

Usability improvements for *_iam_policy and *_iam_binding resources #8354

Description I'm sure you know by now there is a decent amount of care required when using the *_iam_policy and *_iam_binding versions of IAM resources. There are a number of "be careful!" and "note" warnings in the resources that outline some of the potential pitfalls, but there are hidden dangers as well. For example, using the google_project_iam_policy resource may inadvertently remove Google's service agents' (https://cloud.google.com/iam/docs/service-agents) IAM roles from the project. Or, the dangers of using google_storage_bucket_iam_policy and google_storage_bucket_iam_binding, which may remove the default IAM roles granted to projectViewers:, projectEditors:, and projectOwners: of the containing project.

The largest issue I encounter with people running into the above situations is that the initial terraform plan does not show that anything is being removed. While the documentation for google_project_iam_policy notes that it's best to terraform import the resource beforehand, this is in fact applicable to all *_iam_policy and *_iam_binding resources. Unfortunately this is tedious, potentially forgotten, and not something that you can abstract away in a Terraform module.
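
The import step mentioned in the issue can be sketched as follows. This assumes the resource address and project ID from the question above, and the space-delimited "project role" import ID that the provider documentation describes for google_project_iam_binding. After the import, terraform plan should list the members it is about to remove instead of showing a plain create. The helper name import_existing_binding is hypothetical.

```shell
# Values taken from the question; adjust for your own project.
PROJECT_ID="test-tf-sa"
ROLE="roles/editor"
# Address of one for_each instance; single quotes keep the inner double quotes intact.
ADDRESS='google_project_iam_binding.terraform["roles/editor"]'
# Import ID assumed to be "<project> <role>", separated by a single space.
IMPORT_ID="${PROJECT_ID} ${ROLE}"

# Hypothetical helper: bring the existing binding into state, then review the plan.
import_existing_binding() {
  terraform import "${ADDRESS}" "${IMPORT_ID}"
  terraform plan
}
```

Each role in roles_to_grant_to_service_account would need its own import, which is part of why the issue calls this approach tedious.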

Cause

As @Totem pointed out:

The google_project_iam_binding resource is authoritative, which means it will delete any binding that is NOT explicitly specified in the terraform configuration.

Authoritative for a given role. Updates the IAM policy to grant a role to a list of members. Other roles within the IAM policy for the project are preserved.

It is hard to get a clear idea from this wording of what Terraform does with google_project_iam_binding, but as GCP has identified, google_project_iam_binding removed from the "roles/editor" role every account that was not listed in the members attribute.

Still, I believe this is a terraform defect.

According to the Google APIs Service Agent documentation, these are essential service accounts that GCP manages internally. Terraform should not remove such GCP-managed internal service accounts from the IAM policy, because doing so brings the GCP project down. I doubt there is any use case in which we need this to happen.

Some Google Cloud services need access to your resources so that they can act on your behalf. For example, when you use Cloud Run to run a container, the service needs access to any Pub/Sub topics that can trigger the container.

To meet this need, Google creates and manages service accounts for many Google Cloud services. These service accounts are known as Google-managed service accounts. You might see Google-managed service accounts in your project's IAM policy, in audit logs, or on the IAM page in the Cloud Console.

Google-managed service accounts are not listed in the Service accounts page in the Cloud Console.

Google APIs Service Agent. Your project is likely to contain a service account named the Google APIs Service Agent, with an email address that uses the following format: project-number@cloudservices.gserviceaccount.com

This service account runs internal Google processes on your behalf. It is automatically granted the Editor role (roles/editor) on the project.

Solution

Use google_project_iam_member.

#--------------------------------------------------------------------------------
# Service Account Roles
# Need roles/resourcemanager.projectIamAdmin to be able to execute this.
#--------------------------------------------------------------------------------
# resource "google_project_iam_binding" "terraform" {
#   project = var.PROJECT_ID
#
#   #--------------------------------------------------------------------------------
#   # Grant the service account to have the roles
#   #--------------------------------------------------------------------------------
#   members = [
#     "serviceAccount:${google_service_account.terraform.email}"
#   ]
#   for_each = toset(var.roles_to_grant_to_service_account)
#   role     = each.value
# }

#--------------------------------------------------------------------------------
# Service Account Roles
# Need roles/resourcemanager.projectIamAdmin to be able to execute this.
#--------------------------------------------------------------------------------
resource "google_project_iam_member" "terraform" {
  project = var.PROJECT_ID

  #--------------------------------------------------------------------------------
  # Grant the service account to have the roles
  #--------------------------------------------------------------------------------
  member   = "serviceAccount:${google_service_account.terraform.email}"
  for_each = toset(var.roles_to_grant_to_service_account)
  role     = each.value
}

Fix

If the GCP internal service accounts have already been removed from the IAM principals by google_project_iam_binding, they can be restored as follows.

According to GCP:

To fix this issue you can add the service agent in the IAM page using the Add option at the top. The principal will be "${PROJECT_ID}@cloudservices.gserviceaccount.com" and add the editor role.

As per the error message, add '1079157603081@cloudservices.gserviceaccount.com' in IAM.

'compute.subnetworks.useExternalIp' permission for 'projects/1079157603081/regions/us-central1/subnetworks/default' (when acting as '1079157603081@cloudservices.gserviceaccount.com') (truncated).
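
The same fix can be sketched from the CLI instead of the console, assuming the project ID and project number shown in this thread. Note that the service agent address uses the numeric project number, as in the error message, not the project ID string. restore_service_agent is a hypothetical helper name, not a gcloud command.

```shell
# Values taken from this thread; adjust for your own project.
GCP_PROJECT_ID="test-tf-sa"
GCP_PROJECT_NUMBER="1079157603081"
# Google APIs Service Agent: <project-number>@cloudservices.gserviceaccount.com
SERVICE_AGENT="serviceAccount:${GCP_PROJECT_NUMBER}@cloudservices.gserviceaccount.com"

# Hypothetical helper: re-grant roles/editor to the Google APIs Service Agent.
restore_service_agent() {
  gcloud projects add-iam-policy-binding "${GCP_PROJECT_ID}" \
    --member="${SERVICE_AGENT}" \
    --role="roles/editor"
}
```

Because add-iam-policy-binding adds a single member to the role, it should leave the other members of roles/editor in place.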

The Google APIs Service Agent is restored in the view.

Create GKE.

Conclusion

I would never use them, as I doubt there is any use case in which we need to strip a role from the other accounts that hold it.

  • google_project_iam_member
  • google_service_account_iam_binding
mon

You can restore the service accounts using the “gcloud beta iam service-accounts undelete” command.

If you accidentally delete a service account, you can try to undelete the service account instead of creating a new service account.

Please review this link if you need more info. You may notice that in order to restore a deleted account you may need the 21 digit unique ID. If you do not have this ID for the account, you could try this command :

gcloud logging read --freshness=30d --format='table(timestamp,resource.labels.email_id,resource.labels.project_id,resource.labels.unique_id)' 'protoPayload.methodName="google.iam.admin.v1.DeleteServiceAccount" resource.type="service_account" logName:"cloudaudit.googleapis.com%2Factivity"'

or this command:

gcloud logging read --freshness=30d 'protoPayload.methodName="google.iam.admin.v1.DeleteServiceAccount"' | grep -E 'email_id|unique_id'

Leo
  • Thanks for the suggestion, unfortunately it did not work. I ran gcloud beta iam service-accounts undelete 109558708367309276392, but it did not bring the account back to IAM principals. I should have been more accurate. It still remains as a service account, as I can see in the IAM Service Accounts view, but it is no longer in the IAM principals view. When the Compute Engine API is enabled, it appears in IAM principals as well as IAM Service Accounts, but it disappeared from IAM principals once Terraform was executed. – mon Jan 13 '22 at 23:35