If I already have a schema file, for example schema.json, how can I load it to create the table or job schema using the google-cloud-python API?
3 Answers
You can try this solution:
import json
from google.cloud import bigquery

# Read the bq-style schema file and turn each column entry into a SchemaField.
bigquerySchema = []
with open('schema.json') as f:
    bigqueryColumns = json.load(f)
for col in bigqueryColumns:
    bigquerySchema.append(bigquery.SchemaField(col['name'], col['type']))

# Create the table with the parsed schema.
bigqueryClient = bigquery.Client()
tableRef = "myproject.mydataset.mytable"
table = bigquery.Table(tableRef, schema=bigquerySchema)
table = bigqueryClient.create_table(table)
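For reference, the loop above expects schema.json in the bq-style format: a JSON array of column objects. A minimal, hypothetical example:

[
  {"name": "id", "type": "INTEGER", "mode": "REQUIRED"},
  {"name": "username", "type": "STRING", "mode": "NULLABLE"}
]

Note that the loop passes only name and type, so any "mode" in the file is ignored and every field defaults to NULLABLE; pass col.get('mode', 'NULLABLE') as a third argument to SchemaField if you need REQUIRED or REPEATED columns.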

This won't work for schemas with nested RECORD type fields, unfortunately. – okhobb Nov 28 '21 at 17:26
https://stackoverflow.com/questions/67458605/json-schema-file-will-not-execute-in-bigquery-python-api/70146001 – okhobb Nov 28 '21 at 17:50
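As the comments note, the flat loop above drops nested RECORD fields. A minimal sketch that handles them, assuming a reasonably recent google-cloud-bigquery release (the table ID is the same hypothetical one as above):

from google.cloud import bigquery

bigqueryClient = bigquery.Client()

# schema_from_json parses the bq-style schema file, including nested
# RECORD fields and column modes.
bigquerySchema = bigqueryClient.schema_from_json('schema.json')

# Building the list by hand works too: bigquery.SchemaField.from_api_repr(col)
# recurses into nested "fields", so the loop above could use it instead of
# SchemaField(col['name'], col['type']).

table = bigquery.Table("myproject.mydataset.mytable", schema=bigquerySchema)
table = bigqueryClient.create_table(table)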
I don't think this is currently possible, which is why I tend to use the bq CLI when I want to load complicated JSON files with many different columns.
Something like this:
bq load --source_format=NEWLINE_DELIMITED_JSON \
  [PROJECT_ID]:[DATASET].[TABLE] gs://[BUCKET]/[FILENAME].json \
  [PATH TO SCHEMA FOLDER]/schema.json
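That said, a rough Python equivalent of the bq invocation above is a load job that reads its schema from the same file. A sketch, assuming a newer google-cloud-bigquery that exposes schema_from_json, with hypothetical bucket and table names:

from google.cloud import bigquery

client = bigquery.Client()

# Hypothetical IDs for illustration.
table_id = "myproject.mydataset.mytable"
uri = "gs://mybucket/myfile.json"

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
    schema=client.schema_from_json("schema.json"),
)

load_job = client.load_table_from_uri(uri, table_id, job_config=job_config)
load_job.result()  # block until the load job completes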

In case anyone finds this question 3 years later: this can now be done in Cloud Shell, found here: https://console.cloud.google.com/cloudshell/
If you are not comfortable using the command line for uploading files, you can click on the editor icon and upload via drag and drop.
Google Cloud Platform documentation on loading JSON data from the command line, including with a schema file, can be found here: https://cloud.google.com/bigquery/docs/loading-data-cloud-storage-json#bigquery_load_table_gcs_json-cli
