
I am trying to stream Pub/Sub messages into a BigQuery table with a matching schema. I want to use the GCP-provided PubSubToBigQuery template to do this, but I am unable to set it up successfully.

Here is what I have tried so far:

  1. I created a GCE instance with permissions to write to Google Cloud Storage (useful link)

  2. Cloned the GCP template source: git clone https://github.com/GoogleCloudPlatform/DataflowTemplates

  3. As specified here, I ran: mvn compile exec:java -Dexec.mainClass=com.google.cloud.teleport.templates.PubSubToBigQuery...

  4. The process created all the jar files in the /staging bucket. It was supposed to write the template file to the /templates bucket, but did not.

What am I missing here?

CMR

1 Answer


I executed this command in the project root:

#!/bin/bash
PROJECT_ID=XXX
BUCKET_NAME=XXX
PIPELINE_FOLDER=gs://YYY/dataflow/pipelines/pubsub-to-bigquery

# Set the runner
RUNNER=DataflowRunner

# Build the template
mvn compile exec:java \
-Dexec.mainClass=com.google.cloud.teleport.templates.PubSubToBigQuery \
-Dexec.cleanupDaemonThreads=false \
-Dexec.args=" \
--project=${PROJECT_ID} \
--stagingLocation=${PIPELINE_FOLDER}/staging \
--tempLocation=${PIPELINE_FOLDER}/temp \
--templateLocation=${PIPELINE_FOLDER}/template \
--runner=${RUNNER}"

And it successfully generated a template file:

$ gsutil ls -lh gs://YYY/dataflow/pipelines/pubsub-to-bigquery/template
228.33 KiB  2019-01-14T05:54:01Z  gs://YYY/dataflow/pipelines/pubsub-to-bigquery/template
TOTAL: 1 objects, 233805 bytes (228.33 KiB)
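Once the template file exists in GCS, a streaming job can be launched from it with gcloud. A minimal sketch, assuming the standard inputTopic and outputTableSpec parameters of the Google-provided Pub/Sub-to-BigQuery template; the topic, dataset, and table names below are placeholders:

```shell
#!/bin/bash
# Placeholder values -- substitute your own project, topic, dataset, and table.
PROJECT_ID=XXX
TEMPLATE_PATH=gs://YYY/dataflow/pipelines/pubsub-to-bigquery/template

# Launch a Dataflow job from the generated template file.
gcloud dataflow jobs run pubsub-to-bq-job \
  --project="${PROJECT_ID}" \
  --region=us-central1 \
  --gcs-location="${TEMPLATE_PATH}" \
  --parameters \
inputTopic=projects/${PROJECT_ID}/topics/my-topic,\
outputTableSpec=${PROJECT_ID}:my_dataset.my_table
```

The job should then appear in the Dataflow console and start streaming messages from the topic into the table.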

Could you please paste some logs on your side?

Zhou Yunqing