Questions tagged [google-datastream]

Use this tag for Google Datastream issues or questions: setup, running streams, viewing logs, handling errors, recovery, and working with the Datastream public API.

50 questions
5 votes · 0 answers

Datastream from Cloud SQL (PostgreSQL) to BigQuery: BIGQUERY_TOO_MANY_PRIMARY_KEYS / BIGQUERY_UNSUPPORTED_TYPE_FOR_PRIMARY_KEY

I just started to experiment with the new Datastream from Cloud SQL (PostgreSQL) to BigQuery and I am getting many errors when starting the stream, mostly BIGQUERY_UNSUPPORTED_TYPE_FOR_PRIMARY_KEY and BIGQUERY_TOO_MANY_PRIMARY_KEYS. Some of the table…
4 votes · 1 answer

Workaround for Enum data types for PostgreSQL Google Cloud Datastream

I was surprised to find that Cloud Datastream does not support enum data types in the source when replicating from PostgreSQL: "Datastream doesn't support replication of columns of the enumerated (ENUM) data type." As we have quite a few fields created…
3 votes · 1 answer

How to understand Google Cloud Datastream UNSUPPORTED_EVENTS_DISCARDED

Is there a way to get more detail about unsupported events in Google Cloud Datastream? I am running a stream from MySQL and have a few UNSUPPORTED_EVENTS_DISCARDED events, and I would like to understand what these events are. In the Logs Explorer detail…
2 votes · 1 answer

How can I find available IP ranges in a Google Cloud VPC?

I have to specify a subnet in order to allow Google's Datastream to connect with a source database but every single subnet I specify gives me the error: Error: Error creating PrivateConnection: googleapi: Error 400: The IP range specified…
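Datastream's private connectivity requires a CIDR block (a /29) that does not overlap any subnet already allocated in the VPC, which is what the error above is complaining about. A minimal sketch of finding candidate ranges with the standard-library `ipaddress` module — the parent range and the "used" subnets below are hypothetical; the real ones would come from `gcloud compute networks subnets list`:

```python
import ipaddress

def find_free_ranges(parent, used, prefix=29, limit=3):
    """Yield up to `limit` subnets of `parent` (with the given prefix
    length) that do not overlap any CIDR in `used`."""
    parent_net = ipaddress.ip_network(parent)
    used_nets = [ipaddress.ip_network(u) for u in used]
    found = []
    for candidate in parent_net.subnets(new_prefix=prefix):
        if not any(candidate.overlaps(u) for u in used_nets):
            found.append(str(candidate))
            if len(found) >= limit:
                break
    return found

# Hypothetical existing subnets in the VPC:
used = ["10.0.0.0/24", "10.0.1.0/26"]
print(find_free_ranges("10.0.0.0/16", used))
# → ['10.0.1.64/29', '10.0.1.72/29', '10.0.1.80/29']
```

Any of the returned /29 blocks can then be tried when creating the private connection.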
2 votes · 1 answer

BigQuery removes fields with Postgres Array during datastream ingestion

I have this table named student_classes:

| id | name    | class_ids |
| -- | ------- | --------- |
| 1  | Rebecca | {1,2,3}   |
| 2  | Roy     | {1,3,4}   |
| 3  | Ted     | {2,4,5}   |

name is type text / string; class_ids is type…
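Datastream does not replicate PostgreSQL array columns, so a common workaround is to cast the array to text (or JSON) on the source side and parse it downstream. A minimal sketch of parsing a flat Postgres integer-array literal like the `class_ids` values above — nested arrays and quoted string elements are deliberately out of scope:

```python
def parse_pg_int_array(literal):
    """Parse a simple Postgres integer-array literal such as '{1,2,3}'
    into a Python list. Handles the empty array '{}'; does not handle
    nested arrays or quoted string elements."""
    inner = literal.strip().strip("{}")
    return [int(x) for x in inner.split(",")] if inner else []

print(parse_pg_int_array("{1,2,3}"))  # → [1, 2, 3]
```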
2 votes · 1 answer

Use Terraform on Google Cloud SQL Postgres to create a Replication Slot

Overall I'm trying to create a Datastream Connection to a Postgres database in Cloud SQL. As I'm trying to configure it all through Terraform, I'm stuck on how I should create a Replication Slot. This guide explains how to do it through the Postgres…
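Terraform's Google provider has no resource for a Postgres logical replication slot, so the slot (and publication) generally have to be created with plain SQL — e.g. from a provisioner or a small bootstrap script. A sketch of the statements Datastream's documented Postgres setup requires, built in Python for convenience; the slot and publication names are placeholders:

```python
def datastream_setup_sql(slot="datastream_slot",
                         publication="datastream_pub",
                         tables="ALL TABLES"):
    """Build the SQL statements Datastream's Postgres source needs:
    a publication, and a logical replication slot using pgoutput."""
    return [
        f"CREATE PUBLICATION {publication} FOR {tables};",
        f"SELECT pg_create_logical_replication_slot('{slot}', 'pgoutput');",
    ]

for stmt in datastream_setup_sql():
    print(stmt)
```

Running these once against the Cloud SQL instance (with any Postgres driver, or `gcloud sql connect`) leaves Terraform to manage only the Datastream resources themselves.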
2 votes · 1 answer

Datastream configuration not working for Cloud SQL PostgreSQL as source and BigQuery as destination

I wanted to quickly try Datastream from Cloud SQL PostgreSQL to BigQuery. I created a test Cloud SQL instance where I use the postgres user, which is already a cloudsqlsuperuser. But the wizard provided by Datastream is not helping: part of the…
1 vote · 0 answers

Change data capture (CDC) in GCP using Datastream, Postgres and Cloud Storage

I want to set up CDC from Postgres to Cloud Storage using Datastream through Terraform. I am referring to the Terraform docs, but the example given there doesn't work out of the box. The following is what I have built based on the docs: provider "google" { …
1 vote · 0 answers

How do you avoid scanning the whole BigQuery table when querying a table created by Datastream?

I have just created a stream from PostgreSQL to BigQuery using Datastream and was pretty pleased with the results. For each table I altered the DDL after initial streaming to add daily partitions on our created_at fields, assuming everything would work…
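Tables that Datastream creates in BigQuery are not partitioned, and altering them after the fact does not retroactively partition the data, so queries keep scanning the full table. One common workaround is to materialise a partitioned copy with DDL and query that instead. A sketch building the BigQuery DDL string — the dataset/table names and the `created_at` column are hypothetical:

```python
def partitioned_copy_ddl(src, dest, ts_col="created_at"):
    """BigQuery DDL that materialises a day-partitioned copy of a
    Datastream-created table. src/dest are `dataset.table` names."""
    return (
        f"CREATE TABLE `{dest}` "
        f"PARTITION BY DATE({ts_col}) "
        f"AS SELECT * FROM `{src}`;"
    )

print(partitioned_copy_ddl("mydataset.orders", "mydataset.orders_partitioned"))
```

The trade-off is that the copy must be refreshed (or the stream pointed at a pre-created partitioned table) rather than staying live automatically.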
1 vote · 0 answers

Using GCP Datastream for MySQL-to-GCS CDC with Python: I can create the stream but cannot run it using datastream_v1alpha1

stream = datastream_v1alpha1.Stream( display_name='MySQL to gcs Stream', source_config = source_config, destination_config =…
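In both the v1alpha1 and v1 APIs a stream is created in a not-started state; my understanding is that setting its `state` to `RUNNING` (either on create, or later via a patch with an update mask of `state`) is what actually runs it — verify against the API reference. A sketch building the stream resource as a plain dict mirroring the REST JSON shape; the field names follow the question's v1alpha1 snake_case/camelCase conventions and the connection-profile paths are placeholders:

```python
def stream_payload(display_name, source_profile, dest_profile, run_now=True):
    """Build a Datastream stream resource as a plain dict, mirroring
    the REST API's JSON shape. Setting state to RUNNING on create (or
    in a later patch with updateMask=state) starts the stream."""
    body = {
        "displayName": display_name,
        "sourceConfig": {
            "sourceConnectionProfileName": source_profile,  # assumed field name
            "mysqlSourceConfig": {},
        },
        "destinationConfig": {
            "destinationConnectionProfileName": dest_profile,  # assumed field name
            "gcsDestinationConfig": {"avroFileFormat": {}},
        },
        "backfillAll": {},
    }
    if run_now:
        body["state"] = "RUNNING"
    return body

payload = stream_payload(
    "MySQL to GCS Stream",
    "projects/p/locations/us-central1/connectionProfiles/mysql-cp",  # placeholder
    "projects/p/locations/us-central1/connectionProfiles/gcs-cp",    # placeholder
)
```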
1 vote · 1 answer

Error in BigQuery project history tab while submitting google datastream job

I'm submitting a Datastream job that reads from Aurora PostgreSQL and writes to a BigQuery dataset and table. The Datastream job seems to complete and ingests data into the BigQuery dataset correctly, but I am getting the below error in the PROJECT HISTORY…
1 vote · 1 answer

Preventing Google Cloud Platform's Datastream from deleting destination table rows when the source rows are deleted

PROBLEM: While setting up a CDC pipeline using Datastream in Google Cloud Platform, a delete query fired on the source table is reflected on the destination table as well, which we need to prevent. SOLUTION NEEDED: How…
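When Datastream writes to Cloud Storage, each change record carries source metadata including a change type, so deletes can simply be dropped before loading downstream (for a BigQuery destination, append-only mode serves a similar purpose). A sketch of filtering delete events from parsed change records — the `source_metadata.change_type` field name reflects Datastream's JSON output but should be treated as an assumption and checked against real records:

```python
def drop_deletes(records):
    """Filter out DELETE change events from a batch of Datastream
    change records (dicts parsed from its JSON output)."""
    return [r for r in records
            if r.get("source_metadata", {}).get("change_type") != "DELETE"]

batch = [
    {"payload": {"id": 1}, "source_metadata": {"change_type": "INSERT"}},
    {"payload": {"id": 1}, "source_metadata": {"change_type": "DELETE"}},
]
print(drop_deletes(batch))  # only the INSERT record survives
```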
1 vote · 1 answer

Cloud data fusion Permission denied due to datastream.connectionProfiles.discover

I am trying to create a Cloud Data Fusion replication job from Oracle to BigQuery and am receiving the below error. Failed to connect to the database due to below error : io.grpc.StatusRuntimeException: PERMISSION_DENIED:…
1 vote · 1 answer

Google Cloud DataStream failing with reason code: BIGQUERY_UNSUPPORTED_PRIMARY_KEY_CHANGE

I am getting an error when trying to partition the destination table in BigQuery while working with Datastream. Steps to reproduce: start Datastream from Cloud SQL (MySQL) to BigQuery; once the stream completed all tables in BigQuery,…
1 vote · 0 answers

Datastream stream failed permanently : failed to read from the PostgreSQL replication slot because it is already being used by a different process

On the Log Explorer this is the log entry where it failed. { "textPayload": "2022-10-17 14:43:12.896 UTC [219890]: [1-1] db=xxx,user=datatstream_test ERROR: replication slot \"datastream_replication_slot_test\" is active for PID 219872", …
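The log above means another backend (often a previous stream run or another consumer) still holds the slot; the `pg_replication_slots` view shows the holding PID, which can then be terminated so Datastream can reacquire the slot. A sketch building the two diagnostic statements, with the slot name from the log as the default placeholder:

```python
def slot_diagnostics_sql(slot="datastream_replication_slot_test"):
    """SQL to see which backend holds a replication slot, and to
    terminate that backend so Datastream can reacquire the slot."""
    return {
        "who_holds_it": (
            f"SELECT active, active_pid FROM pg_replication_slots "
            f"WHERE slot_name = '{slot}';"
        ),
        "terminate_holder": (
            f"SELECT pg_terminate_backend(active_pid) FROM pg_replication_slots "
            f"WHERE slot_name = '{slot}' AND active;"
        ),
    }

for name, sql in slot_diagnostics_sql().items():
    print(name, "->", sql)
```

If the holder is a stale process, terminating it is usually enough; if two consumers genuinely need the data, give each its own slot instead.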