Background
I would like to update the schema of a table in BigQuery to match the schema of another table that contains a superset of the original columns. I would like to do it through the BigQuery Python client.
Problem
In practice I want to add some columns containing NULL
to an already existing BigQuery table at an arbitrary position that is not necessarily the beginning or the end.
I know how to append new columns at the end of a table, following this snippet, but I would like to add columns in an arbitrary position. Moreover I would like to do it through a schema update, without having to query the entire table.
Being that the schema is actually a list of SchemaField
objects, I thought that substituting the append
method with the insert
method would have sufficed. But this snippet does not do what I'd like:
from google.cloud import bigquery
client = bigquery.Client()
dataset_id = 'my_dataset'
table_id = 'my_table'
table_ref = client.dataset(dataset_id).table(table_id)
table = client.get_table(table_ref) # API request
original_schema = table.schema
new_schema = original_schema[:] # creates a copy of the schema
# insert new_col at position 2, instead of appending
new_schema.insert(2, bigquery.SchemaField('new_col', 'STRING'))
table.schema = new_schema
table = client.update_table(table, ['schema']) # API request
This code results in the schema being updated exactly as if the method called was append
, i.e. new_col
gets placed at the end of the schema.
Question
Do you know if it's possible to modify the schema of a BigQuery table so that the new (NULL
) columns are inserted at an arbitrary position?