I have designed a simple pipeline to read a CSV file from Cloud Storage and write to a BigQuery Table. While running the pipeline, the operation stops abruptly without any error message in logs. Have already required Firewall rules. Please suggest how to approach this.
My guess is this could be a quota issue related to the Dataproc cluster. When a pipeline is run in Cloud Data Fusion, the default profile spins up a Dataproc cluster with some number of worker nodes, and this defaulted to 10 for some time. That might be the source of this failure, but more information is needed to confirm whether this is the case.
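To illustrate why 10 default workers can exceed the regional CPU quota of a newer project, here is a rough back-of-envelope sketch. The machine sizes and quota value below are assumptions for illustration, not the actual profile defaults; read the real values from your compute profile and the project's Quotas page.

```python
def required_vcpus(master_count, master_vcpus, worker_count, worker_vcpus):
    """Total vCPUs the ephemeral Dataproc cluster will request."""
    return master_count * master_vcpus + worker_count * worker_vcpus

# Assumed sizes: 1 master + 10 workers at 4 vCPUs each (check your profile).
demand = required_vcpus(master_count=1, master_vcpus=4,
                        worker_count=10, worker_vcpus=4)

regional_cpu_quota = 24  # example: a modest default quota on a new project

print(f"cluster needs {demand} vCPUs, quota is {regional_cpu_quota}")
if demand > regional_cpu_quota:
    print("cluster creation will fail or hang -- reduce worker count "
          "or request a quota increase")
```

With the assumed sizes, 10 workers ask for 44 vCPUs, which is why dropping to 3 workers (16 vCPUs) is a quick way to test the quota hypothesis.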
@Safiyur, a couple of questions:
- Could you tell me when this instance was spun up?
- To verify, could you customize the profile to use 3 worker nodes before running the pipeline? This is to check whether this is even the cause of the failure (Configure -> Compute Config -> Customize).
- Could you also check the master service logs (Cloud Data Fusion -> System Admin) to see if any errors are thrown related to running the pipeline?
- Could you also provide which firewall rules you added? (Are they based on the docs?)
Note:
> While running the pipeline, the operation stops abruptly without any error message in logs.
From this I assume it's the pipeline logs you are talking about.
