1

We are trying to replicate data from AlloyDB to Bigquery using Datastream. We Get "An unknown error occurred. Please try again. If the error persists, contact Google support."

In the Datastream console --> objects list, we see all source tables with Object Status "Failed" and Backfill status "Completed". In Bigquery we see only a subset of the tables (not all the "Completed" objects were synced).

In the Logs Explorer I can see this error on BQ: enter image description here

I also see this error: error: { code: 11 message: "Unsupported primary key column either does not exist or is a pseudocolumn at [1:401]" } The column referred in the error is of type enum.

The desired situation is having all the AlloyDB tables replicated into Bigquery.

The error message is not very informative... What does it mean? What would be the best way to go about troubleshooting this?

N.N.
  • 3,094
  • 21
  • 41
  • Hi @N.N. , Can you check the logs in [Cloud Logging](https://cloud.google.com/logging/docs/api/platform-logs) and errors in [Error Reporting](https://cloud.google.com/error-reporting/docs/viewing-errors) for more error details? Let me know if that helps. – Shipra Sarkar Sep 24 '22 at 06:32
  • Thank @ShipraSarkar ! I have edited the question and added the error i found on Logs Explorer. – N.N. Sep 25 '22 at 06:09
  • Added another finding from the log... Could it be related to source field type being enum? – N.N. Sep 28 '22 at 12:04

3 Answers3

1

We're actively working on making these error messages be more informative, and improvements are continuously being rolled out as we identify more edge cases. Assuming you followed all the steps in the documentation, then you may need to open a ticket with support for further investigation. If a support ticket isn't an option, you can still report the issue using the public issue tracker

  • Here is the issue on tracker: https://issuetracker.google.com/issues/249688077. Thanks a lot for your support – N.N. Sep 29 '22 at 07:36
1

I just had this same issue but connecting to a PostgreSQL in AWS RDS:

Beginning with Postgres 10, passwords are encrypted using SCRAM-SHA-256 in PostgreSQL. Google DataStream still expects MD5 password encryption, or it will generate an "unknown error" in the logs and fail the backfills.

You'll need to update your postgresql.conf (or RDS Cluster Parameter Group if you're using AWS like me):

password_encryption = 'MD5'

Restart the database and make sure the parameter has changed with:

SHOW password_encryption;

Reset the password of your users:

ALTER USER "{username}" with password '{password}';

More info from the PostgreSQL docs: https://www.postgresql.org/docs/current/auth-password.html

Juan Salas
  • 11
  • 1
0

@N.N.

The error message you posted is "error: { code: 11 message: "Unsupported primary key column either does not exist or is a pseudocolumn at [1:401]" } " You also stated that "The column referred in the error is of type enum."

As per Google DataStream documentation there is a known limitation: "Datastream doesn't support columns of the enumerated (ENUM) data types." So it looks like this field is "ignored" by the DataStream.

But since this field is likely a primary key, DataStream is unable to create an object (table) with no primary key in it because there is yet another known limitation for DataStream: "Tables without primary keys must have a REPLICA IDENTITY".

See below link for more details on unsupported data types. https://cloud.google.com/datastream/docs/sources-postgresql#postgresqlknownlimitations

Possible solution: I think to resolve the issue you need to have a “replica identity” configured in order to be able to replicate UPDATE and DELETE operations on that one table. See below link for details on how to do it: https://www.postgresql.org/docs/current/sql-altertable.html#SQL-ALTERTABLE-REPLICA-IDENTITY