
Scenario: running Dataflow jobs in project A over a Shared VPC, using a region and subnetwork from host project B.

The service account has the following roles on both project A and project B:

Compute Admin
Compute Network User
Dataflow Admin
Cloud Dataflow Service Agent
Editor
Storage Admin
Serverless VPC Access Admin 

But I still get this error:

Workflow failed. Causes: Error: Message: Required 'compute.subnetworks.get' permission for 'projects/<host project>/regions/us-east1/subnetworks/<subnetwork name>' HTTP Code: 403
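For reference, the subnet-level IAM policy on the host project can be inspected like this (the project, region, and subnet names below are placeholders, not the real ones):

    # Show which principals hold roles on the shared subnetwork in the host project
    gcloud compute networks subnets get-iam-policy SUBNET_NAME \
        --project=HOST_PROJECT_ID \
        --region=us-east1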

What am I missing here, or what other permission does this service account need? Thanks for looking into this.

Krish

2 Answers


I just had the exact same problem using a Shared VPC network and subnet with a Dataflow job; I had missed granting the network permission to the default Dataflow service account. The two steps below worked just fine.

There are two service accounts involved in Cloud Dataflow (in the project where the Dataflow job runs):

 1. Default Cloud Dataflow service account: service-<PROJECT NUMBER>@dataflow-service-producer-prod.iam.gserviceaccount.com
 2. Custom controller service account: myserviceaccount@<PROJECT ID>.iam.gserviceaccount.com

Step 1: Grant both service accounts the Compute Network User role via IAM on the network HOST project. Additionally, you may grant the custom controller service account you created any other roles required to run the Dataflow job.
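For example, the Step 1 grants could look roughly like this with gcloud (a sketch; HOST_PROJECT_ID, PROJECT_NUMBER, PROJECT_ID, and the controller account name are placeholders):

    # Default Dataflow service account -> Compute Network User on the host project
    gcloud projects add-iam-policy-binding HOST_PROJECT_ID \
        --member="serviceAccount:service-PROJECT_NUMBER@dataflow-service-producer-prod.iam.gserviceaccount.com" \
        --role="roles/compute.networkUser"

    # Custom controller service account -> Compute Network User on the host project
    gcloud projects add-iam-policy-binding HOST_PROJECT_ID \
        --member="serviceAccount:myserviceaccount@PROJECT_ID.iam.gserviceaccount.com" \
        --role="roles/compute.networkUser"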

Step 2: Pass the network parameters to the job in the format below (via the Web UI or on the command line); a command-line sketch follows the list.

 1. network : projects/<HOST PROJECT ID>/global/networks/<VPC NETWORK NAME>
 2. subnetwork : https://www.googleapis.com/compute/v1/projects/<HOST PROJECT ID>/regions/us-central1/subnetworks/<SUBNET NAME>
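For the command-line case, launching a job from a template with those parameters looks roughly like this (a sketch; the job name, template location, and project/network/subnet names are placeholders):

    gcloud dataflow jobs run my-shared-vpc-job \
        --gcs-location=gs://MY_TEMPLATE_BUCKET/templates/MY_TEMPLATE \
        --region=us-central1 \
        --network=projects/HOST_PROJECT_ID/global/networks/VPC_NETWORK_NAME \
        --subnetwork=https://www.googleapis.com/compute/v1/projects/HOST_PROJECT_ID/regions/us-central1/subnetworks/SUBNET_NAME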

More details:

cloud_dataflow_service_account

controller_service_account

specifying-networks#shared

Jijo John

Okay, I have figured out the issue. There are a few things to keep in mind here; a rough gcloud sketch of the grants follows the list.

  1. The service account used to submit the Dataflow job from Airflow or any other scheduling tool needs the following roles on both the service project and the host project: Compute Network User, Dataflow Admin, Cloud Dataflow Service Agent, and Editor.

  2. Then there are two other service accounts that need permissions. The service account with the suffix compute@developer.gserviceaccount.com needs the Storage Object Viewer role on host project B.

  3. Also, the Dataflow service account from project A, with the suffix dataflow-service-producer-prod.iam.gserviceaccount.com, needs the Storage Object Viewer role on the host project.
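The grants in points 2 and 3 could look roughly like this (a sketch; the project IDs and numbers are placeholders):

    # Compute Engine default (worker) service account from project A -> host project B
    gcloud projects add-iam-policy-binding HOST_PROJECT_B_ID \
        --member="serviceAccount:PROJECT_A_NUMBER-compute@developer.gserviceaccount.com" \
        --role="roles/storage.objectViewer"

    # Dataflow service account from project A -> host project
    gcloud projects add-iam-policy-binding HOST_PROJECT_B_ID \
        --member="serviceAccount:service-PROJECT_A_NUMBER@dataflow-service-producer-prod.iam.gserviceaccount.com" \
        --role="roles/storage.objectViewer"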

Taking care of these things solved my problem.

Krish