1

What would be the best way to enforce a unique constraint on a table that is created in BigQuery? I am creating the table by a DDL "create" statement and then I insert the contents.

Ismail
  • 1,068
  • 1
  • 6
  • 11
denim
  • 427
  • 4
  • 16

1 Answers1

2

BigQuery is not an operational database, instead it's an analytical database.

BigQuery doesn't support constraints, primary keys etc.
It's engineered for BigData and petabyte scale.

Pentium10
  • 204,586
  • 122
  • 423
  • 502
  • Agree, but I still have the practical problem of making sure a staging table that I import does not have duplicate rows. – denim Dec 04 '20 at 20:58
  • 2
    Engineer to accept duplicate rows. On top of the table you create a VIEW that reads always the most recent row of an entity. That's how these engines work. https://stackoverflow.com/questions/25269410/google-big-query-sql-get-most-recent-column-value – Pentium10 Dec 04 '20 at 21:12
  • so if you're compiling data from legacy upstream sources which contain duplication (without timestamps), you need to enforce uniqueness through whatever logic you deem best in the create/insert queries you use (i.e. if inserting additional records from secondary sources/criteria, only append if entity does not already exist)? – James Jun 20 '22 at 15:49
  • and, by "engineer to accept duplicate rows ... (read) the most recent row of an entity" means add some feature such as a timestamp that lets us choose the latest addition? Any reason why we would want to retain older rows if all we are ever going to require is the newest? Seems like keeping older entries is an unnecessary complication and a waste of space (and any compute resources running queries on a table bigger than it needs to be). – James Jun 20 '22 at 16:04
  • constraints are now supported but they are limited. E.G. : As of writing this, enforcing the constraints isn't supported. https://cloud.google.com/bigquery/docs/information-schema-table-constraints – AXMIM May 10 '23 at 18:56