BigQuery - Update schema adding a NULL column in arbitrary position from Python client

Question

Background

I would like to update the schema of a table in BigQuery to match the schema of another table that contains a superset of the original columns. I would like to do it through the BigQuery Python client.

Problem

In practice I want to add some columns containing NULL to an already existing BigQuery table at an arbitrary position that is not necessarily the beginning or the end.

I know how to append new columns at the end of a table, following this snippet, but I would like to add columns in an arbitrary position. Moreover I would like to do it through a schema update, without having to query the entire table.

Being that the schema is actually a list of SchemaField objects, I thought that substituting the append method with the insert method would have sufficed. But this snippet does not do what I'd like:

from google.cloud import bigquery
client = bigquery.Client()
dataset_id = 'my_dataset'
table_id = 'my_table'

table_ref = client.dataset(dataset_id).table(table_id)
table = client.get_table(table_ref)  # API request

original_schema = table.schema

new_schema = original_schema[:]  # creates a copy of the schema
# insert new_col at position 2, instead of appending
new_schema.insert(2, bigquery.SchemaField('new_col', 'STRING')) 

table.schema = new_schema
table = client.update_table(table, ['schema'])  # API request

This code results in the schema being updated exactly as if the method called was append, i.e. new_col gets placed at the end of the schema.

Question

Do you know if it's possible to modify the schema of a BigQuery table so that the new (NULL) columns are inserted at an arbitrary position?

score 0 · Answer 1 · answered Jan 24 '19 at 14:20

as per the answer in this question I would suggest that to copy half table, add the required column and then add join the of the old table would be a possibility, really more expensive than add just a new column at the end, but still a possibility.

score 0 · Answer 2 · answered Jan 28 '19 at 10:10

as explained on this post, such functionality doesn't exist in any SQL server, as column order is irrelevant. what could be done, is to append the new column, and then e.g. scramble the columns order and recreate the table with random column order. What is your business need for this?

BigQuery - Update schema adding a NULL column in arbitrary position from Python client

2 Answers2