Questions tagged [cdap]

CDAP exposes developer APIs (Application Programming Interfaces) for creating applications and accessing core CDAP services. CDAP defines and implements a diverse collection of services that support applications and data on existing Hadoop infrastructure such as HBase, HDFS, YARN, MapReduce, Hive, and Spark.

138 questions
4
votes
1 answer

GCP Data Fusion "No discoverable found" error

I'm trying to use GCP Data Fusion Basic Edition with the Private IP option, but when I try to create a pipeline every action gives me this error: No discoverable found for request POST…
4
votes
2 answers

Permissions Issue with Google Cloud Data Fusion

I'm following the instructions in the Cloud Data Fusion sample tutorial and everything seems to work fine, until I try to run the pipeline right at the end. Cloud Data Fusion Service API permissions are set for the Google managed Service account as…
Helvick
3
votes
1 answer

Data Fusion - Issue with HTTP POST plugin

I am trying to make an HTTP call using Data Fusion. Source: GCS (CSV file). Sink: HTTP POST. The API expects the file as part of the HTTP request. When this is executed, I get the below error in the API logs. Required request part 'file' is not…
3
votes
1 answer

Using a multi-character delimiter in Cloud Data Fusion

I am trying to read a CSV file in Cloud Data Fusion. The CSV file uses a multi-character delimiter (i.e. ~^~). When I try to parse the column using a custom delimiter, the tool only considers the first character and splits the file accordingly. I end…
Trishit Ghosh
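
For reference, the splitting behaviour the asker expects can be sketched outside Data Fusion in plain Python. This is only an illustration of splitting on the whole delimiter string rather than its first character; `split_record` is a hypothetical helper name, the sample record is made up, and nothing here is a Data Fusion or Wrangler API.

```python
from typing import List


def split_record(line: str, delimiter: str = "~^~") -> List[str]:
    """Split one record on the full delimiter string (hypothetical helper)."""
    # str.split treats the whole argument as the separator,
    # so a lone "~" inside a field does not cause a split.
    return line.split(delimiter)


if __name__ == "__main__":
    # Made-up sample record using the ~^~ delimiter from the question.
    print(split_record("alice~^~42~^~NYC"))  # ['alice', '42', 'NYC']
```

A field that happens to contain a single `~` stays intact, which is the difference from splitting on the first character only.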
3
votes
2 answers

How to edit an already published Cloud Data Fusion Pipeline

I have deployed a data pipeline in Google Cloud Data Fusion but it does not work as expected. Is there a way to edit an already deployed data pipeline in Cloud Data Fusion or must it be deleted and rebuilt from scratch and deployed again?
Terence Keys
3
votes
1 answer

Cloud Data Fusion storage.buckets.list permission issue

I just installed Cloud Data Fusion, and get this error when I try to explore the “Cloud Storage Default” bucket. How do I fix this? cloud-datafusion-management-sa@xxxxxxxxxxxx-tp.iam.gserviceaccount.com does not have storage.buckets.list access to…
James
3
votes
3 answers

How to debug CDAP sandbox with IntelliJ on Mac

I am trying to debug CDAP code and plugin code. I have tried several options to run the CDAP sandbox: https://docs.cask.co/cdap/5.1.0-SNAPSHOT/en/developer-manual/getting-started/sandbox/docker.html The sandbox runs and the stdout logs say port 5005…
Rubber Duck
2
votes
1 answer

How to trigger a CDAP pipeline using Airflow operators?

I have an on-premises CDAP Data Fusion instance with multiple namespaces. How can I trigger the pipeline using Airflow operators? I have tried exploring the available Airflow operators and this page, but they were not very helpful…
2
votes
1 answer

Terraform Data Fusion instance changed causes ERROR to occur during plan

Consider the scenario where I have a Data Fusion instance at version 6.4.1 and I wish to redeploy it as version 6.5.0 via Terraform (this is just an example, but the problem applies to any update to the Data Fusion instance). In Terraform, this implies…
FVCC
2
votes
1 answer

GCP - CDAP - Dataproc cluster gets stuck in running state

We have a Data Fusion pipeline which is triggered by a Cloud Composer DAG. This pipeline provisions an ephemeral Dataproc cluster which, in an ideal scenario, terminates after finishing its tasks. In our case, sometimes, not always, this…
Robert
2
votes
2 answers

Pipeline on Oracle CDAP to BigQuery Multitables

I am building a pipeline in CDAP, where I connect to an Oracle database, read a table, and then connect this data to the BigQuery Multitables component. Individually, both components were validated by the CDAP tool itself, when I…
user11825409
2
votes
3 answers

Pipeline Dependencies in Data Fusion

I have three pipelines in Data Fusion, say A, B, and C. I want pipeline C to be triggered after both pipelines A and B have completed. Pipeline triggers put the dependency on one pipeline only. Can this be implemented in Data…
2
votes
1 answer

Google Data Fusion: "Looping" over input data to then execute multiple Restful API calls per input row

I have the following challenge I would like to solve preferably in Google Data Fusion: I have one web service that returns about 30-50 elements describing an invoice in a JSON payload like this: { "invoice-services": [ { "serviceId":…
JensU
2
votes
4 answers

Macros in Data Fusion using Argument Setter

I want to make the Data Fusion pipeline reusable by supplying parameter values through Argument Setter. As suggested in many other answers, I have tried implementing this using the reusable pipeline example given in the Google guide. I was not able to pass…
2
votes
3 answers

Convert to date in Cloud Data Fusion

How do we convert a string to a date in Cloud Data Fusion? I have a column with a value such as 20191120 (format yyyyMMdd) that I want to load into a table in BigQuery as a date. The table column data type is also date. What I have tried so far is that…
Trishit Ghosh
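
As a point of comparison, the conversion described in the question (the string 20191120 in yyyyMMdd format to a date) can be sketched in plain Python. This is a minimal illustration of the parsing step only, not a Data Fusion or Wrangler directive; `parse_yyyymmdd` is a hypothetical helper name, and `%Y%m%d` is Python's equivalent of the yyyyMMdd pattern.

```python
from datetime import date, datetime


def parse_yyyymmdd(value: str) -> date:
    """Parse a string such as '20191120' into a date (hypothetical helper)."""
    # strptime raises ValueError if the string does not match the pattern.
    return datetime.strptime(value, "%Y%m%d").date()


if __name__ == "__main__":
    # The sample value comes from the question text.
    print(parse_yyyymmdd("20191120").isoformat())  # 2019-11-20
```

The ISO form `2019-11-20` is the usual textual representation of a date value; whether a given sink accepts it depends on that sink's configuration.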